Outside of the context of your demo, suppose that you have events a, b, c
and d.  Event a is the one we are centered on and is relatively rare.
Event b is not so rare, but has weak correlation with a.  Event c is as
rare as a, but correlates strongly with it.  Event d is quite common, but
has no correlation with a.

The 2x2 matrices that you would get would look something like this.  In
each of these, a and NOT a are in rows while other and NOT other are in
columns.

versus b, llrRoot = 8.03

               b     NOT b
    a         10        10
    NOT a   1000     99000



versus c, llrRoot = 11.5

               c     NOT c
    a         10        10
    NOT a     30     99970



versus d, llrRoot = 0

               d     NOT d
    a         10        10
    NOT a  50000     50000

Note that what we are holding constant here is the prevalence of a (20
occurrences) and how those occurrences split between the other symbol and
NOT the other symbol (10 and 10).  What is being varied is the distribution
of the other symbol in the "NOT a" case.
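
For anyone who wants to check these numbers, here is a minimal sketch of
the calculation in Python.  It assumes llrRoot is the square root of the
usual G^2 log-likelihood ratio, written in the entropy form, with the sign
made negative when k11 is below what independence would predict.  The
function names are mine, and the sign convention is my assumption about how
the signed root is defined.  Note that k11 is 10 in all three tables, so
plain cooccurrence cannot tell them apart.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    # G^2 statistic for a 2x2 table; rows are (a, NOT a),
    # columns are (other, NOT other).
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def llr_root(k11, k12, k21, k22):
    # Signed square root of the LLR.  Assumes both rows are non-empty.
    root = math.sqrt(llr(k11, k12, k21, k22))
    if k11 / (k11 + k12) < k21 / (k21 + k22):
        root = -root
    return root


if __name__ == "__main__":
    tables = {
        "b": (10, 10, 1000, 99000),
        "c": (10, 10, 30, 99970),
        "d": (10, 10, 50000, 50000),
    }
    for name, counts in tables.items():
        print("versus %s, llrRoot = %.2f" % (name, llr_root(*counts)))
```

The printed values should come out close to the 8.03, 11.5 and 0 quoted
above, up to rounding.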




On Sun, Aug 18, 2013 at 10:50 AM, B Lyon <[email protected]> wrote:

> Thanks folks for taking a look.
>
> I haven't sat down to try it yet, but wondering how hard it is to construct
> (realizable and realistic) k11, k12, k21, k22 values for three binary
> sequences X, Y, Z where (X,Y) and (Y,Z) have same co-occurrence, but you
> can tweak k12 and k21 so that the LLR values are extremely different in
> both directions.  I assume that k22 doesn't matter much in practice since
> things are sparse and k22 is huge.  Well, obviously, I guess you could
> simply switch the k12/k21 values between the two sequence pairs to flip the
> order at will... which is information that co-occurrence of course does not
> "know about".
>
>
> On Sat, Aug 17, 2013 at 10:30 PM, Ted Dunning <[email protected]>
> wrote:
>
> > This is nice.  As you say, k11 is the only part that is used in
> > cooccurrence and it doesn't weight by prevalence, either.
> >
> > With an analysis of this size it is hard to demonstrate much difference,
> > because it is hard to show interesting values of LLR without absurdly
> > strong coordination between items.
> >
> >
> > On Fri, Aug 16, 2013 at 8:21 PM, B Lyon <[email protected]> wrote:
> >
> > > As part of trying to get a better grip on recommenders, I have started a
> > > simple interactive visualization that begins with the raw data of
> > > user-item interactions and goes all the way to being able to twiddle the
> > > interactions in a test user vector to see the impact on recommended
> > > items.  This is for the simple "user interacted with an item" case rather
> > > than numerical preferences for items.  The goal is to show the
> > > intermediate pieces and how they fit together via popup text on
> > > mouseovers and dynamic highlighting of the related pieces.  I am of
> > > course interested in feedback as I keep tweaking on it - not sure I got
> > > all the terminology quite right yet, for example, and might have missed
> > > some other things I need to know about.  Note that this material is
> > > covered in Chapter 6.2 in MIA in the discussion on distributed
> > > recommenders.
> > >
> > > It's on googledrive here (very much a work-in-progress):
> > >
> > > https://googledrive.com/host/0B2GQktu-wcTiWHRwZFJacjlqODA/
> > >
> > > (apologies to small resolution screens)
> > >
> > > This is based only on the co-occurrence matrix, rather than including the
> > > other similarity measures, although in working through this, it seems
> > > that the other ones can just be interpreted as having alternative
> > > definitions of what "*" means in matrix multiplication of A^T*A, where A
> > > is the user-item matrix... and as an aside, to me this raises the
> > > interesting question of [purely hypothetical?] situations where LLR and
> > > co-occurrence are at odds with each other in making recommendations, as
> > > co-occurrence seems to be just using the "k11" term that is part of the
> > > LLR calculation.
> > >
> > > My goal (at the moment at least) is to eventually continue this for the
> > > solr-recommender project that started a few weeks ago, where we have the
> > > additional cross-matrix, as well as a kind of regrouping of pieces for
> > > solr.
> > >
> > >
> > > --
> > > BF Lyon
> > > http://www.nowherenearithaca.com
> > >
> >
>
>
>
> --
> BF Lyon
> http://www.nowherenearithaca.com
>
