Very cool. There is an Excel spreadsheet in the solr-recommender project, at src/test/resources/Recommender\ Math.xlsx, that shows an example of the cross-recommender with data that trivially simulates purchases (B) and views (A), where everything purchased has been viewed but not the other way around.
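To make that setup concrete, here is a minimal sketch with made-up data (these are not the spreadsheet's values). It assumes raw co-occurrence counts with no LLR weighting, and it assumes recommendations are scored as [B'B]h_p and [B'A]h_v, where h_p and h_v are one user's purchase and view history vectors. The helper names are hypothetical and are not taken from solr-recommender.

```python
# Toy data, invented for illustration: rows are users, columns are items 0..3.
A = [  # views
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
]
B = [  # purchases; every purchase is also a view, but not the reverse
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def transpose_times(P, Q):
    """Return P'Q for two user-by-item 0/1 matrices stored as lists of rows."""
    users = range(len(P))
    return [[sum(P[u][i] * Q[u][j] for u in users) for j in range(len(Q[0]))]
            for i in range(len(P[0]))]


def mat_vec(M, h):
    """Multiply an item-by-item matrix by a history vector."""
    return [sum(m * x for m, x in zip(row, h)) for row in M]


def recommend(scores, purchased):
    """Items with a positive score that the user has not bought, best first."""
    candidates = [i for i, s in enumerate(scores) if s > 0 and not purchased[i]]
    return sorted(candidates, key=lambda i: -scores[i])


# A hypothetical new user who bought item 0 and viewed items 0 and 1.
h_p = [1, 0, 0, 0]
h_v = [1, 1, 0, 0]

BtB = transpose_times(B, B)  # purchase/purchase co-occurrence
BtA = transpose_times(B, A)  # purchase/view cross co-occurrence

from_purchases = recommend(mat_vec(BtB, h_p), h_p)
from_views = recommend(mat_vec(BtA, h_v), h_p)

print("B'B:", BtB)
print("B'A:", BtA)
print("from purchases:", from_purchases)
print("from views:", from_views)
```

In this toy data each user bought exactly one item, so B'B has no off-diagonal entries and the purchase history alone yields nothing new, while the view history still reaches other items through B'A.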
The data shows cases where no recommendation could be made from [B'B] but one can be made from [B'A], which illustrates why the cross-recommender might be of use. It's not as flashy, but you can change values and get the results recalculated.

On Aug 18, 2013, at 10:50 AM, B Lyon <[email protected]> wrote:

Thanks folks for taking a look. I haven't sat down to try it yet, but I am wondering how hard it is to construct (realizable and realistic) k11, k12, k21, k22 values for three binary sequences X, Y, Z where (X,Y) and (Y,Z) have the same co-occurrence, but you can tweak k12 and k21 so that the LLR values are extremely different in both directions. I assume that k22 doesn't matter much in practice since things are sparse and k22 is huge. Well, obviously, I guess you could simply switch the k12/k21 values between the two sequence pairs to flip the order at will... which is information that co-occurrence of course does not "know about".

On Sat, Aug 17, 2013 at 10:30 PM, Ted Dunning <[email protected]> wrote:

> This is nice. As you say, k11 is the only part that is used in cooccurrence and it doesn't weight by prevalence, either.
>
> With an analysis of this size, it is hard to demonstrate much difference, because it is hard to show interesting values of LLR without absurdly strong coordination between items.
>
> On Fri, Aug 16, 2013 at 8:21 PM, B Lyon <[email protected]> wrote:
>
>> As part of trying to get a better grip on recommenders, I have started a simple interactive visualization that begins with the raw data of user-item interactions and goes all the way to being able to twiddle the interactions in a test user vector to see the impact on recommended items. This is for the simple "user interacted with an item" case rather than numerical preferences for items. The goal is to show the intermediate pieces and how they fit together via popup text on mouseovers and dynamic highlighting of the related pieces.
>> I am of course interested in feedback as I keep tweaking on it - not sure I got all the terminology quite right yet, for example, and might have missed some other things I need to know about. Note that this material is covered in Chapter 6.2 in MIA in the discussion on distributed recommenders.
>>
>> It's on Google Drive here (very much a work-in-progress):
>>
>> https://googledrive.com/host/0B2GQktu-wcTiWHRwZFJacjlqODA/
>>
>> (apologies to small resolution screens)
>>
>> This is based only on the co-occurrence matrix, rather than including the other similarity measures, although in working through this, it seems that the other ones can just be interpreted as having alternative definitions of what "*" means in matrix multiplication of A^T*A, where A is the user-item matrix... and as an aside, to me this begs the interesting question of [purely hypothetical?] situations where LLR and co-occurrence are at odds with each other in making recommendations, as co-occurrence seems to be just using the "k11" term that is part of the LLR calculation.
>>
>> My goal (at the moment at least) is to eventually continue this for the solr-recommender project that started a few weeks ago, where we have the additional cross-matrix, as well as a kind of regrouping of pieces for solr.
>>
>> --
>> BF Lyon
>> http://www.nowherenearithaca.com
>>
>

--
BF Lyon
http://www.nowherenearithaca.com
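The k11/k12/k21/k22 question raised in the thread can be tried numerically. The sketch below computes the log-likelihood ratio (the G² statistic) for a 2x2 contingency table using the usual unnormalized-entropy formulation. The function names and the counts are made up for illustration; the two tables share k11 = 10 and a total of 100,000 users and differ only in how often each item occurs without the other.

```python
from math import log


def x_log_x(x):
    """x * ln(x), with the convention that 0 * ln(0) is 0."""
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    """Unnormalized entropy: N*ln(N) - sum(k*ln(k)), where N = sum(counts)."""
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 table of co-occurrence counts.

    k11: both items, k12: first item only, k21: second item only, k22: neither.
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


# Same co-occurrence count (k11 = 10), same number of users (100,000).
rare_pair = llr(10, 5, 5, 99980)          # both items rare, mostly seen together
common_pair = llr(10, 1000, 1000, 97990)  # both items common, overlap near chance

print("k11=10, rare items:  ", rare_pair)
print("k11=10, common items:", common_pair)
```

Plain co-occurrence scores both pairs as 10, while the LLR is large for the first table and close to zero for the second, where about 10 co-occurrences are expected by chance. Note that this statistic does not change when k12 and k21 are swapped within a single table, so any difference has to come from the sizes of those counts relative to k11.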
