Which is why LLR would be really nice in the two-action cross-similarity case. The 
cross-correlation sparsification via cooccurrence is probably pretty weak, no?


On Aug 18, 2013, at 11:53 AM, Ted Dunning <[email protected]> wrote:

Outside of the context of your demo, suppose that you have events a, b, c
and d.  Event a is the one we are centered on and is relatively rare.
Event b is not so rare, but has weak correlation with a.  Event c is as
rare as a, but correlates strongly with it.  Event d is quite common, but
has no correlation with a.

The 2x2 matrices that you would get would look something like this.  In
each of these, a and NOT a are in rows while other and NOT other are in
columns.

versus b, llrRoot = 8.03

              b    NOT b
  a          10       10
  NOT a    1000    99000



versus c, llrRoot = 11.5

              c    NOT c
  a          10       10
  NOT a      30    99970



versus d, llrRoot = 0

              d    NOT d
  a          10       10
  NOT a   50000    50000

Note that what we are holding constant here is the prevalence of a (20
times) and the distribution of a under the conditions of the other symbol.
What is being varied is the distribution of the other symbol in the "NOT
a" case.
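
For anyone who wants to check these numbers, here is a minimal sketch in
Python. It assumes llrRoot is the square root of the usual G^2
log-likelihood ratio statistic for a 2x2 table, given a negative sign when a
co-occurs with the other event less often than NOT a does. The function
names llr and llr_root and the sign convention are assumptions made for
illustration, not a copy of any particular library's code.

```python
import math


def llr(k11, k12, k21, k22):
    """G^2 log-likelihood ratio statistic for a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    row_a, row_not_a = k11 + k12, k21 + k22
    col_o, col_not_o = k11 + k21, k12 + k22
    cells = (
        (k11, row_a, col_o),
        (k12, row_a, col_not_o),
        (k21, row_not_a, col_o),
        (k22, row_not_a, col_not_o),
    )
    total = 0.0
    for k, r, c in cells:
        if k > 0:  # an empty cell contributes nothing (k * log k -> 0)
            # observed count over the count expected under independence
            total += k * math.log((k * n) / (r * c))
    return 2.0 * total


def llr_root(k11, k12, k21, k22):
    """Signed square root of the LLR (sign convention is an assumption)."""
    root = math.sqrt(max(0.0, llr(k11, k12, k21, k22)))
    # Negative when P(other | a) < P(other | NOT a); cross-multiplied so
    # that an empty row cannot cause a division by zero.
    if k11 * (k21 + k22) < k21 * (k11 + k12):
        root = -root
    return root


# The three tables from this message: (k11, k12, k21, k22)
tables = {
    "b": (10, 10, 1000, 99000),
    "c": (10, 10, 30, 99970),
    "d": (10, 10, 50000, 50000),
}

for name, counts in tables.items():
    print(f"versus {name}: k11 = {counts[0]}, llrRoot = {llr_root(*counts):.2f}")
```

The printed values should land close to the 8.03, 11.5 and 0 quoted above.
Note that k11 is 10 in all three tables, so plain cooccurrence cannot tell
b, c and d apart, while the LLR score separates them.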




On Sun, Aug 18, 2013 at 10:50 AM, B Lyon <[email protected]> wrote:

> Thanks folks for taking a look.
> 
> I haven't sat down to try it yet, but wondering how hard it is to construct
> (realizable and realistic) k11, k12, k21, k22 values for three binary
> sequences X, Y, Z where (X,Y) and (Y,Z) have same co-occurrence, but you
> can tweak k12 and k21 so that the LLR values are extremely different in
> both directions.  I assume that k22 doesn't matter much in practice since
> things are sparse and k22 is huge.  Well, obviously, I guess you could
> simply switch the k12/k21 values between the two sequence pairs to flip the
> order at will... which is information that co-occurrence of course does not
> "know about".
> 
> 
> On Sat, Aug 17, 2013 at 10:30 PM, Ted Dunning <[email protected]>
> wrote:
> 
>> This is nice.  As you say, k11 is the only part that is used in
>> cooccurrence and it doesn't weight by prevalence, either.
>> 
>> At this size, the analysis can hardly demonstrate much difference, because
>> it is hard to show interesting values of LLR without absurdly strong
>> coordination between items.
>> 
>> 
>> On Fri, Aug 16, 2013 at 8:21 PM, B Lyon <[email protected]> wrote:
>> 
>>> As part of trying to get a better grip on recommenders, I have started a
>>> simple interactive visualization that begins with the raw data of user-item
>>> interactions and goes all the way to being able to twiddle the interactions
>>> in a test user vector to see the impact on recommended items.  This is for
>>> simple "user interacted with an item" case rather than numerical
>>> preferences for items.  The goal is to show the intermediate pieces and how
>>> they fit together via popup text on mouseovers and dynamic highlighting of
>>> the related pieces.  I am of course interested in feedback as I keep
>>> tweaking on it - not sure I got all the terminology quite right yet, for
>>> example, and might have missed some other things I need to know about.
>>> Note that this material is covered in Chapter 6.2 in MIA in the discussion
>>> on distributed recommenders.
>>> 
>>> It's on googledrive here (very much a work-in-progress):
>>> 
>>> https://googledrive.com/host/0B2GQktu-wcTiWHRwZFJacjlqODA/
>>> 
>>> (apologies to small resolution screens)
>>> 
>>> This is based only on the co-occurrence matrix, rather than including the
>>> other similarity measures, although in working through this, it seems that
>>> the other ones can just be interpreted as having alternative definitions of
>>> what "*" means in the matrix multiplication A^T*A, where A is the user-item
>>> matrix... and as an aside, to me this raises the interesting question of
>>> [purely hypothetical?] situations where LLR and co-occurrence are at odds
>>> with each other in making recommendations, as co-occurrence seems to be
>>> just using the "k11" term that is part of the LLR calculation.
>>> 
>>> My goal (at the moment at least) is to eventually continue this for the
>>> solr-recommender project that started a few weeks ago, where we have the
>>> additional cross-matrix, as well as a kind of regrouping of pieces for
>>> solr.
>>> 
>>> 
>>> --
>>> BF Lyon
>>> http://www.nowherenearithaca.com
>>> 
>> 
> 
> 
> 
> --
> BF Lyon
> http://www.nowherenearithaca.com
> 
