This data was for a mobile shopping app. Other answers below.

> On May 21, 2013, at 5:42 PM, Ted Dunning <[email protected]> wrote:
> 
> Inline
> 
> 
> On Tue, May 21, 2013 at 8:59 AM, Pat Ferrel <[email protected]> wrote:
> 
>> In the interest of getting some empirical data out about various
>> architectures:
>> 
>> On Mon, May 20, 2013 at 9:46 AM, Pat Ferrel <[email protected]> wrote:
>> 
>>>> ...
>>>> You use the user history vector as a query?
>>> 
>>> The most recent suffix of the history vector.  How much is used varies by
>>> the purpose.
>> 
>> We did some experiments with this using a year+ of e-com data. We measured
>> the precision using different amounts of the history vector in 3 month
>> increments. The precision increased throughout the year. At about 9 months
>> the effects of what appears to be item/product/catalog/new model churn
>> begin to become significant and so precision started to level off. We did
>> *not* apply a filter to recs so that items not in the current catalog were
>> not filtered before precision was measured. We'd expect this to improve
>> results using older data.
>> 
> 
> This is a time filter.  How many transactions did this turn out to be?  I
> typically recommend truncating based on transactions rather than time.
> 
> My own experience was music and video recommendations.  Long history
> definitely did not help much there.
> 

This is what I've heard before; recommending music and media has its own set 
of characteristics. We were at the point of looking at history on segments of 
the catalog (music vs food etc.) to run the same analysis per segment. I 
suspect we would have found what you are saying. 

It would be a bit of processing to save only so many transactions for each 
user, though certainly not impossible. Logs come in by time increment, so we 
ingested new ones periodically but never counted each user's transactions.
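To be concrete about what truncating by transactions would take, here is a hypothetical Python sketch (the cutoff of 500 and the log-line shape are assumptions, not our actual pipeline):

```python
from collections import defaultdict

# Hypothetical sketch: keep only the most recent N transactions per user
# while streaming roughly time-ordered log lines of (user_id, item_id, ts).
MAX_TXNS = 500  # cutoff is an assumption; would need tuning per catalog

def truncate_histories(log_lines, max_txns=MAX_TXNS):
    histories = defaultdict(list)
    for user_id, item_id, ts in log_lines:
        history = histories[user_id]
        history.append((ts, item_id))
        if len(history) > max_txns:
            # logs arrive in time order, so drop the oldest entry
            history.pop(0)
    return histories
```

The point is just that the state is per-user, so it has to be carried across log increments rather than recomputed from each batch.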

In any case the item-similarity recs were quite a bit more predictive of 
purchases than those based on user-specific history, which changes the 
requirements a bit. We always measured precision on both history-based and 
similarity-based recs though, and a blend of the two got the best score. 
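The blend itself was just a weighted linear combination of the two score maps, along these lines (a hypothetical sketch, not the production code; the weight is something you'd sweep over in offline precision tests):

```python
def blend_scores(history_scores, similarity_scores, w=0.5):
    """Linearly blend two recommendation score maps.

    w weights the similarity-based scores; (1 - w) weights the
    history-based ones. Returns item ids ranked best-first.
    """
    items = set(history_scores) | set(similarity_scores)
    blended = {
        item: (1 - w) * history_scores.get(item, 0.0)
              + w * similarity_scores.get(item, 0.0)
        for item in items
    }
    return sorted(blended, key=blended.get, reverse=True)
```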

I don't have access to the data now--I sure wish we had some to share so these 
issues could be investigated and compared. 

> 
>>> 
>>> 20 recs is not sufficient.  Typically you need 300 for any given context
>>> and you need to recompute those very frequently.  If you use geo-specific
>>> recommendations, you may need thousands of recommendations to have enough
>>> geo-dispersion.  The search engine approach can handle all of that on the
>>> fly.
>>> 
>>> Also, the cached recs are user x (20-300) non-zeros.  The sparsified
>>> item-item cooccurrence matrix is item x 50.  Moreover, search engines are
>>> very good at compression.  If users >> items, then item x 50 is much
>>> smaller, especially after high quality compression (6:1 is a common
>>> compression ratio).
>>> 
>> 
>> The end application designed by the ecom customer required fewer than 10
>> recs for any given context, so 20 gave us room for runtime context-type
>> boosting.
>> 
> 
> And how do you generate the next page of results?
> 

It was a mobile app, so it did not have a next page of results; that was an 
app design decision we had no control over. For what it's worth, Amazon on 
their web site uses only 100 in a horizontally scrolling strip, and we had 
much less space to fill.

But regarding saving more recs--most of my experience was in storing the entire 
recommendation matrix for a slightly different purpose. I was working on 
building the cross-recommender, which (as you know) is an ensemble of models 
where you have to learn weights for each part of the model. To do a linear 
combination without knowing the weights ahead of time, you need virtually all 
recs for each query. I never got to finish that, so I've reproduced the code 
but now, as I said, lack the data.

From all the research I did into the predictive power of various actions, the 
cross-recommender seemed to hold the most promise for cleaning one action using 
another, more predictive action. If I can prove this out, then a whole range of 
multi-action chains present themselves. At the very least we have the framework 
for creating and learning the weights for small ensembles.
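Learning those weights could be as simple as a least-squares fit of the component scores against held-out outcomes. A hypothetical sketch (this assumes you have stored scores from every component model for every (user, item) query, which is exactly why the full rec matrices are needed):

```python
import numpy as np

def learn_weights(component_scores, outcomes):
    """Fit linear blend weights for an ensemble of recommenders.

    component_scores: (n_queries, n_components) matrix of scores from
    each component model. outcomes: (n_queries,) held-out signal,
    e.g. 1.0 = purchased, 0.0 = not. Returns one weight per component.
    """
    X = np.asarray(component_scores, dtype=float)
    y = np.asarray(outcomes, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w
```

In practice you'd want a held-out time window for the outcomes and probably a non-negativity constraint on the weights, but this is the basic shape of the problem.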
