On Feb 10, 2013, at 2:39pm, Johannes Schulte wrote:

> Hi,
> 
> I am currently implementing a system of the same kind: LLR-sparsified
> "term" cooccurrence vectors in Lucene (since hardly a day goes by without
> seeing Ted praise this).
> There are not only views and purchases, but also search terms, facets and a
> lot more textual information to be included in the cooccurrence matrix (as
> "input").
> That's why I went with the feature hashing framework in Mahout. This gives
> small (disk/memory) user profiles and allows reusing the vectors for click
> prediction and/or clustering. The main difference is that there are only two
> fields in Lucene, each with a lot of terms (numbers) representing the features.
> Two fields, because I think predicting views (besides purchases) might in
> some cases be better than predicting nothing.
> I don't think it should make a big difference in scoring, because in the
> vector space model used by most engines it's just, well, a vector space, and
> I don't know whether the field norms make sense after stripping values from
> the term vectors with the LLR threshold.
> 
> @Ted
>> It is handy to simply use the binary values of the sparsified versions of
>> these and let the search engine handle the weighting of different
>> components at query time.
> 
> Do you really want to omit the cooccurrence counts, which would become the
> term frequencies? How would the engine then weight different inputs against
> each other?
> 
> And,
> if anyone knows a
> 1. smarter way to index the cooccurrence counts in Lucene than a
> TokenStream that emits a word k times for a cooccurrence count of k

I haven't been following this discussion, but in general using payloads is a 
way of providing additional information about a term that can be used for 
scoring.
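
For what it's worth, a minimal sketch of that idea (against the Lucene 4.x API; the class and field names here are mine, purely for illustration) would be a TokenStream that emits each co-occurring item once and carries the count in a payload, rather than repeating the token k times:

import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.util.BytesRef;

// Emits each co-occurring item id exactly once and stores its cooccurrence
// count as a float payload on the token, instead of repeating the token
// count times.
public final class CooccurrenceTokenStream extends TokenStream {

  private final Iterator<Map.Entry<String, Integer>> entries;
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);

  public CooccurrenceTokenStream(Map<String, Integer> cooccurrenceCounts) {
    this.entries = cooccurrenceCounts.entrySet().iterator();
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!entries.hasNext()) {
      return false;
    }
    clearAttributes();
    Map.Entry<String, Integer> e = entries.next();
    termAtt.append(e.getKey());
    payloadAtt.setPayload(new BytesRef(PayloadHelper.encodeFloat(e.getValue().floatValue())));
    return true;
  }
}

At query time you would then need something along the lines of a PayloadTermQuery together with a Similarity whose scorePayload() decodes the float, so the counts take part in scoring without inflating term frequencies.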

> 2. way to avoid treating the (hashed) vector column indices as terms, and
> reuse them directly instead? It's a bit weird hashing to an int and then
> having the Lucene term dictionary treat them as strings, mapping to another int.

Is there a performance/size issue here?

Also I'm assuming you're using the solr.TrieIntField field type (not the 
string-ified value).
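
If this is raw Lucene rather than Solr, the analogous choice would look roughly like the sketch below (Lucene 4.x; field names invented for illustration): a string-ified term versus a trie-encoded numeric field, the Lucene-level counterpart of solr.TrieIntField.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.IntField;
import org.apache.lucene.document.StringField;

public class HashedFeatureFields {
  // Index one hashed feature id two ways: as a string-ified term and as a
  // trie-encoded numeric field.
  public static Document asDocument(int hashedFeature) {
    Document doc = new Document();
    doc.add(new StringField("features_str", Integer.toString(hashedFeature), Field.Store.NO));
    doc.add(new IntField("features_int", hashedFeature, Field.Store.NO));
    return doc;
  }
}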

-- Ken



> On Sun, Feb 10, 2013 at 6:36 PM, Ted Dunning <[email protected]> wrote:
> 
>> Actually treating the different interactions separately can lead to very
>> good recommendations.  The only issue is that the interactions are no
>> longer dyadic.
>> 
>> If you think about it, having two different kinds of interactions is like
>> adjoining interaction matrices for the two different kinds of interaction.
>> Suppose that you have user x views in matrix A and you have user x
>> purchases in matrix B.  The complete interaction matrix of user x (views +
>> purchases) is [A | B].
>> 
>> When you compute cooccurrence in this matrix, you get
>> 
>>                           [ A' ]             [ A' A  A' B ]
>>      [A | B]' [A | B]  =  [    ]  [A | B]  =  [            ]
>>                           [ B' ]             [ B' A  B' B ]
>> 
>> This matrix is (view + purchase) x (view + purchase).  But we don't care
>> about predicting views so we only really need a matrix that is purchase x
>> (view
>> + purchase).  This is just the bottom part of the matrix above, or [ B'A |
>> B'B ].  When you produce purchase recommendations r_p by multiplying by a
>> mixed view and purchase history vector h which has a view part h_v and a
>> purchase part h_p, you get
>> 
>>      r_p = [ B' A  B' B ] h = B'A h_v + B'B h_p
>> 
>> That is a prediction of purchases based on past views and past purchases.
>> 
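For reference, once B'A and B'B are materialized, that last line is just two sparse matrix-vector products. A minimal sketch with Mahout's in-memory math API (the class, method, and argument names are mine, not anything in Mahout):

import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.Vector;

public class PurchasePredictor {
  // r_p = B'A h_v + B'B h_p: cross cooccurrence times the view history,
  // plus purchase cooccurrence times the purchase history.
  public static Vector recommendPurchases(Matrix crossCooccurrence,     // B'A
                                          Matrix purchaseCooccurrence,  // B'B
                                          Vector viewHistory,           // h_v
                                          Vector purchaseHistory) {     // h_p
    return crossCooccurrence.times(viewHistory)
        .plus(purchaseCooccurrence.times(purchaseHistory));
  }
}
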
>> Note that this general form applies both to decomposition techniques such
>> as SVD, ALS and LLL, and to sparsification techniques such as LLR
>> sparsification.  All that changes is the mechanics of how you do the
>> multiplications.  Weighting of components works the same as well.
>> 
>> What is very different here is that we have a component of cross
>> recommendation.  That is the B'A in the formula above.  This is very
>> different from a normal recommendation: it cannot be computed with the
>> simple self-join that we normally have in the Mahout cooccurrence
>> computation, and it is also very different from the decompositions that we
>> normally do.  It isn't hard to adapt the Mahout computations, however.
>> 
>> When implementing a recommendation using a search engine as the base, these
>> same techniques also work extremely well in my experience.  What happens is
>> that for each item that you would like to recommend, you would have one
>> field that has components of B'A and one field that has components of B'B.
>> It is handy to simply use the binary values of the sparsified versions of
>> these and let the search engine handle the weighting of different
>> components at query time.  Having these components separated into different
>> fields in the search index seems to help quite a lot, which makes a fair
>> bit of sense.
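
A rough query-side sketch of that setup, with plain Lucene 4.x and field names I've invented ("purchased_together" holding the B'B column, "viewed_together" holding the B'A column), might look like:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Build the recommendation query from a user's history: item ids from the
// purchase history query the B'B field, item ids from the view history query
// the B'A field, with per-field boosts left to tuning.
public class RecommendationQueryBuilder {
  public static Query build(Iterable<String> purchasedItemIds,
                            Iterable<String> viewedItemIds,
                            float purchaseBoost, float viewBoost) {
    BooleanQuery query = new BooleanQuery();
    for (String itemId : purchasedItemIds) {
      TermQuery tq = new TermQuery(new Term("purchased_together", itemId));
      tq.setBoost(purchaseBoost);
      query.add(tq, BooleanClause.Occur.SHOULD);
    }
    for (String itemId : viewedItemIds) {
      TermQuery tq = new TermQuery(new Term("viewed_together", itemId));
      tq.setBoost(viewBoost);
      query.add(tq, BooleanClause.Occur.SHOULD);
    }
    return query;
  }
}

With binary field values, the per-clause boosts (plus the engine's own weighting) then handle the component weighting at query time, as Ted describes above.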
>> 
>> On Sun, Feb 10, 2013 at 9:55 AM, Sean Owen <[email protected]> wrote:
>>> 
>>> I think you'd have to hack the code to not exclude previously-seen items,
>>> or at least, not of the type you wish to consider. Yes you would also
>>> have to hack it to add rather than replace existing values. Or for test
>>> purposes, just do the adding yourself before inputting the data.
>>> 
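(One way to do the adding yourself before inputting the data: collapse the weighted events per user/item pair up front. A toy sketch, with weights that are purely illustrative:)

import java.util.HashMap;
import java.util.Map;

// Sums multiple weighted events (view, add-to-cart, purchase, ...) for the
// same user/item pair into one preference value before the data is handed
// to the recommender.
public class PreferenceAggregator {

  private final Map<Long, Map<Long, Float>> prefs = new HashMap<Long, Map<Long, Float>>();

  public void addEvent(long userId, long itemId, float weight) {
    Map<Long, Float> userPrefs = prefs.get(userId);
    if (userPrefs == null) {
      userPrefs = new HashMap<Long, Float>();
      prefs.put(userId, userPrefs);
    }
    Float current = userPrefs.get(itemId);
    userPrefs.put(itemId, current == null ? weight : current + weight);
  }

  // e.g. addEvent(user, item, 1.0f) for a view and addEvent(user, item, 5.0f)
  // for a purchase, then build the data model from aggregated().
  public Map<Long, Map<Long, Float>> aggregated() {
    return prefs;
  }
}
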
>>> My hunch is that it will hurt non-trivially to treat different
>>> interaction types as different items. You probably want to predict that
>>> someone who viewed a product over and over is likely to buy it, but this
>>> would only weakly tend to occur if the bought-item is not the same thing
>>> as the viewed-item. You'd learn they go together but not as strongly as
>>> ought to be obvious from the fact that they're the same. Still,
>>> interesting thought.
>>> 
>>> There ought to be some 'signal' in this data, just a question of how much
>>> vs noise. A purchase means much more than a page view of course; it's not
>>> as subject to noise. Finding a means to use that info is probably going
>>> to help.
>>> 
>>> 
>>> 
>>> 
>>> On Sat, Feb 9, 2013 at 7:50 PM, Pat Ferrel <[email protected]> wrote:
>>> 
>>>> I'd like to experiment with using several types of implicit preference
>>>> values with recommenders. I have purchases as an implicit pref of high
>>>> strength. I'd like to see if add-to-cart, view-product-details,
>>>> impressions-seen, etc. can increase offline precision in holdout tests.
>>>> These less-than-obvious implicit prefs will get a much lower value than
>>>> purchase, and I'll experiment with different mixes. The problem is that
>>>> some of these prefs will indicate that the user, for whom I'm getting
>>>> recs, has expressed a preference.
>>>> 
>>>> Using these implicit prefs seems reasonable for finding similarity of
>>>> taste between users, but it presents several problems. 1) How to encode
>>>> the prefs: each impression-seen will increase the strength of a user's
>>>> preference for an item, but the recommender framework replaces the
>>>> preference value for items preferred more than once, doesn't it? 2) AFAIK
>>>> the current recommender framework will return recs only for items that
>>>> the user in question has expressed no preference for. If I use something
>>>> like view-product-details or impressions-seen, I will be removing
>>>> anything the user has seen from the recs, which is not what I want in
>>>> this experiment.
>>>> 
>>>> Has anyone tried something like this? I'm not convinced that these other
>>>> implicit preferences will add anything to the recommender, just trying to
>>>> find out.
>> 

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




