My plan was NOT to use Lucene to start with, though I see the benefits. This is 
because I want to experiment with weighting: idf, no weighting, and a non-log 
idf. I also want to experiment with temporal decay of recommendability and 
maybe blend in item-similarity-based results in certain cases. Getting the raw 
recs is important to the experiment, so I'd like to use Mahout cf/taste if 
possible.
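
To make the weighting variants concrete, here is a minimal sketch in plain 
Python. It is not Mahout code. The function names, the scheme labels, and the 
exponential half-life form of the decay are all my own assumptions for 
illustration; the experiments might well end up using different formulas.

```python
import math

def item_weights(item_counts, n_users, scheme="idf"):
    """Weight each item by how many users touched it.

    item_counts maps item -> number of users who interacted with it.
    The scheme labels ("none", "idf", "linear_idf") are hypothetical.
    """
    weights = {}
    for item, df in item_counts.items():
        if scheme == "none":
            weights[item] = 1.0                      # no weighting
        elif scheme == "idf":
            weights[item] = math.log(n_users / df)   # usual log idf
        elif scheme == "linear_idf":
            weights[item] = n_users / df             # idf without the log
        else:
            raise ValueError("unknown scheme: " + scheme)
    return weights

def decay(age_days, half_life_days=30.0):
    """Assumed exponential decay: the factor halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

counts = {"rare_item": 1, "popular_item": 10}
for scheme in ("none", "idf", "linear_idf"):
    print(scheme, item_weights(counts, 10, scheme))
```

Multiplying a raw rec score by decay(age) is one simple way to make older 
items less recommendable; the 30-day half-life is only a placeholder.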

My discussion therefore assumed use of the entire Mahout cf/taste framework, 
even for retrieving recs. In that light, B'B h_p is just another way of stating 
the usual process: train on users and the items they purchased, then get recs 
for users. So that part of the recs is covered. Since this also supports 
item-similarity-based queries (no user in the query), I'm covered.

As to the B'A h_v part, isn't that just replacing B'B, the self-join matrix 
that cf/taste would calculate, with the result of B'A? To use cf/taste you would 
ingest the user and items-viewed data to create h_v for all users, but instead 
of allowing cf/taste to calculate B'B you would replace it with B'A. Then at 
query time taste would take a user and return purchase recs by calculating B'A 
h_v. Isn't this correct? I hope someone comments on this because it is the 
route I plan to explore.
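
To spell out the algebra, here is a toy sketch in plain Python. It is not the 
cf/taste API. It assumes A is a users-by-items matrix of views, B is a 
users-by-items matrix of purchases over the same users, and both hold raw 0/1 
values with no weighting or sparsification. The helper names are made up.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Rows are users, columns are items (toy data).
A = [[1, 1, 0],   # views
     [0, 1, 1],
     [1, 0, 1]]
B = [[1, 0, 0],   # purchases
     [0, 1, 0],
     [1, 0, 1]]

BtB = matmul(transpose(B), B)   # purchase => purchase (the self-join)
BtA = matmul(transpose(B), A)   # view => purchase (the cross term)

h_p = [0, 0, 1]   # this user purchased item 2
h_v = [1, 0, 0]   # this user viewed item 0

purchase_recs = matvec(BtB, h_p)   # B'B h_p
view_recs = matvec(BtA, h_v)       # B'A h_v

print(purchase_recs)   # [1, 0, 1]
print(view_recs)       # [2, 0, 1]
```

Both results are score vectors over purchasable items, so swapping B'A in for 
B'B changes which history drives the scores, not what comes back.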

The downside is that without Lucene I would have two (or more) sets of recs to 
blend. I can make Lucene return the raw recs fields, but I'm not sure how to 
handle similarity-based queries with Lucene and don't really want to tackle 
that just yet (keep it simple?).

Also, in using cf/taste I don't need to create rows of the combined matrix. I 
can treat B'A and B'B as independent recommenders, which means I can tune them 
independently. I still have questions about how to generate the view values of 
A, but that's another discussion.

Then the question is how to blend B'A h_v with B'B h_p?

The range of both of these will be identical.  Each row of [B'A | B'B] 
corresponds to a document.  One field (the view=>purchase indicators) contains 
a row of B'A and another field (the purchase=>purchase indicators) will contain 
a row of B'B.

The query will ultimately contain two fields corresponding to recent views and 
recent purchases.  The search engine will combine the scores from these 
intelligently without any effort on your part.  You can tune how this works, 
but I haven't ever found that very useful.
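
For the non-search-engine route, the simplest blend would be a weighted sum of 
the two score vectors. The sketch below is plain Python and is NOT how Lucene 
combines field scores. The function names and the weights w_view and 
w_purchase are hypothetical tuning knobs, and the inputs are assumed to be 
score lists over the same item ordering.

```python
def blend(view_scores, purchase_scores, w_view=1.0, w_purchase=1.0):
    """Weighted sum of B'A h_v and B'B h_p scores, item by item."""
    return [w_view * v + w_purchase * p
            for v, p in zip(view_scores, purchase_scores)]

def top_n(scores, n):
    """Indices of the n highest-scoring items, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n]

view_scores = [2, 0, 1]       # toy stand-in for B'A h_v
purchase_scores = [1, 0, 1]   # toy stand-in for B'B h_p

combined = blend(view_scores, purchase_scores)
print(combined)            # [3.0, 0.0, 2.0]
print(top_n(combined, 2))  # [0, 2]
```

Since the two raw score sets need not be on the same scale, some 
normalization before summing is probably needed in practice.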
