Yes, your only issue there, which I think you touched on, is that you have
to put your current cart (which hasn't been purchased) into the model in
order to get an answer out of a recommender. I think we've talked about the
recommend-to-anonymous function in the context of another system; that is
exactly what you need here.
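
For reference, here's roughly what that looks like with the non-distributed
Taste API. A minimal sketch, assuming 'model' is a DataModel already built
from (cartID, itemID) preferences; the item IDs here are made up:

  import java.util.List;
  import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
  import org.apache.mahout.cf.taste.impl.model.PlusAnonymousUserDataModel;
  import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
  import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
  import org.apache.mahout.cf.taste.model.DataModel;
  import org.apache.mahout.cf.taste.model.PreferenceArray;
  import org.apache.mahout.cf.taste.recommender.RecommendedItem;

  // Wrap the trained model so one temporary "anonymous" user can be added.
  PlusAnonymousUserDataModel plusModel = new PlusAnonymousUserDataModel(model);
  GenericItemBasedRecommender recommender = new GenericItemBasedRecommender(
      plusModel, new TanimotoCoefficientSimilarity(plusModel));

  long[] cartItemIDs = {101L, 102L, 103L}; // hypothetical current cart
  PreferenceArray tempPrefs = new GenericUserPreferenceArray(cartItemIDs.length);
  for (int i = 0; i < cartItemIDs.length; i++) {
    tempPrefs.setUserID(i, PlusAnonymousUserDataModel.TEMP_USER_ID);
    tempPrefs.setItemID(i, cartItemIDs[i]);
    tempPrefs.setValue(i, 1.0f); // "is in the cart"
  }

  // Temporarily fold the unsaved cart in, recommend, then remove it again.
  plusModel.setTempPrefs(tempPrefs);
  List<RecommendedItem> recs =
      recommender.recommend(PlusAnonymousUserDataModel.TEMP_USER_ID, 10);
  plusModel.clearTempPrefs();

Nothing is folded into the trained data permanently, which also avoids the
partial-cart pollution you mentioned.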

Yes, all you have to do then is reproduce the recommender computation. But
I understand that you were hoping to avoid rewriting it. It's really just a
loop though, so not much work to reproduce.
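
If you do reproduce it, the loop is essentially this. A sketch using plain
java.util collections; 'similarity' (a Mahout ItemSimilarity),
'allItemIDs', and 'cartItemIDs' are assumed to exist already:

  // Score every candidate item by its total similarity to the carted items.
  Map<Long, Double> scores = new HashMap<Long, Double>();
  Set<Long> inCart = new HashSet<Long>();
  for (long id : cartItemIDs) {
    inCart.add(id);
  }
  for (long candidate : allItemIDs) {
    if (inCart.contains(candidate)) {
      continue; // don't recommend what's already in the cart
    }
    double score = 0.0;
    for (long carted : cartItemIDs) {
      double sim = similarity.itemSimilarity(candidate, carted);
      if (!Double.isNaN(sim)) { // NaN means "no data for this pair"
        score += sim;
      }
    }
    scores.put(candidate, score);
  }
  // ...then sort by score descending and keep the top n.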

100K items x a few items in a cart is a few hundred thousand similarity
computations. That isn't trivial, but it's not going to take seconds
either, I think. Yes, this gets much faster if you can precompute item-item
similarity, though computing all NxN pairs is going to take a long time
when N = 100,000. So yes, something like clustering is the nice way to
scale that: the clusters greatly limit the number of candidates to
consider, because you can round every cross-cluster similarity to 0.
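
If you do precompute, Mahout can consume the result directly:
GenericItemSimilarity wraps a collection of precomputed item-item pairs,
and any pair you never supply (e.g. across clusters) just comes back as "no
similarity". A sketch; 'withinClusterPairs' and its fields are
hypothetical, produced by whatever clustering you run upstream:

  // Feed only within-cluster similarities to the recommender; omitted
  // cross-cluster pairs are effectively zero.
  List<GenericItemSimilarity.ItemItemSimilarity> pairs =
      new ArrayList<GenericItemSimilarity.ItemItemSimilarity>();
  for (ItemPair p : withinClusterPairs) { // hypothetical precomputed pairs
    pairs.add(new GenericItemSimilarity.ItemItemSimilarity(p.itemA, p.itemB, p.sim));
  }
  ItemSimilarity similarity = new GenericItemSimilarity(pairs);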

By this point... I imagine it's no harder to whip up a frequent itemset
implementation, or crib one and adapt it. This is in Mahout; that's
probably the right tool for the job.
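
(Mahout's version is the parallel FP-Growth job under
org.apache.mahout.fpm.pfpgrowth. The single-machine idea is small enough to
sketch by hand, though; this toy pair counter is not Mahout code, and
'purchasedCarts' is assumed to hold the item IDs of each purchased cart:

  // Count how often each pair of items appears together in a purchased cart.
  Map<Long, Map<Long, Integer>> pairCounts =
      new HashMap<Long, Map<Long, Integer>>();
  for (long[] cart : purchasedCarts) {
    for (long a : cart) {
      for (long b : cart) {
        if (a == b) {
          continue;
        }
        Map<Long, Integer> row = pairCounts.get(a);
        if (row == null) {
          row = new HashMap<Long, Integer>();
          pairCounts.put(a, row);
        }
        Integer c = row.get(b);
        row.put(b, c == null ? 1 : c + 1);
      }
    }
  }
  // "Which item most often co-occurs with this cart?" is then just summing
  // pairCounts.get(cartedItem).get(candidate) over the carted items.

That only covers pairs, not larger itemsets, but it's often most of the
value.)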



On Thu, Feb 14, 2013 at 8:19 PM, Pat Ferrel <[email protected]> wrote:

> I'm creating a matrix of cart IDs by item IDs, so carts x items-in-cart.
> The 'preference' then is (cartID, itemID). This will create the correct
> matrix, I think.
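>
> (Concretely, that's just a file where the "user" column holds the cart
> ID; the file name here is made up:
>
>   // Sketch: each line of carts.csv is "cartID,itemID" with no preference
>   // value, which Mahout's FileDataModel treats as boolean-style data.
>   DataModel model = new FileDataModel(new File("carts.csv"));
>
> FileDataModel is org.apache.mahout.cf.taste.impl.model.file.FileDataModel.)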
>
> For any cart ID I would get a ranked list of recommended items calculated
> from other carts. This seems like what is needed in a shopping-cart
> recommender. So doing this should give a "recommend to this collection of
> items", right?
>
> The only issue is finding the best cart to get the recs from. I would be
> doing a pairwise similarity comparison of N carts to the current cart
> contents, and the result would have to come back in a very short amount of
> time, on the order of the time to get recs for 3M users and 100K items.
>
> Not sure what N is yet, but the # of items is the same as in the purchase
> matrix. So finding the best cart to get recs for will be N similarity
> comparisons, worst case. Each cart is likely to have only a few items in
> it, and I imagine this speeds up the similarity calc.
>
> I guess I'll try it as described and optimize for speed if the precision
> is good compared to the apriori algo.
>
> On Feb 14, 2013, at 10:57 AM, Sean Owen <[email protected]> wrote:
>
> I don't think it's necessarily slow; this is how item-based recommenders
> work. The only thing stopping you from using Mahout directly is that I
> don't think there's an easy way to say "recommend to this collection of
> items". But that's what is happening inside when you recommend for a user.
>
> You can just roll your own version of it. Yes, you are computing similarity
> for k carted items by all N items, but is N so large? Hundreds of thousands
> of products? This is still likely pretty fast even if the similarity is
> over millions of carts. Some smart precomputation and caching go a long way
> too.
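>
> (The cheapest form of that caching in Mahout is just a wrapper; a sketch,
> assuming 'model' is your cart-by-item DataModel:
>
>   // Memoize pairwise similarities so repeated carts don't recompute them.
>   ItemSimilarity similarity =
>       new CachingItemSimilarity(new LogLikelihoodSimilarity(model), model);
>
> Both classes are under org.apache.mahout.cf.taste.impl.similarity.)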
>
>
> On Thu, Feb 14, 2013 at 7:10 PM, Pat Ferrel <[email protected]> wrote:
>
> > Yes, one time-tested way to do this is the "Apriori" algorithm, which
> > looks at frequent item sets and creates association rules.
> >
> > I was looking for a shortcut using a recommender, which would be super
> > easy to try. The rule builder is a little harder to implement but we can
> > also test precision on that and compare the two.
> >
> > The recommender method below should be reasonable AFAICT except for the
> > method(s) of retrieving recs, which seem likely to be slow.
> >
> > On Feb 14, 2013, at 9:45 AM, Sean Owen <[email protected]> wrote:
> >
> > This sounds like a job for frequent item set mining, which is kind of a
> > special case of the ideas you've mentioned here. Given N items in a cart,
> > which next item most frequently occurs in a purchased cart?
> >
> >
> > On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel <[email protected]>
> > wrote:
> >
> >> I thought you might say that but we don't have the add-to-cart action.
> >> We have to calculate cart purchases by matching cart IDs or session IDs.
> >> So we only have cart purchases with items.
> >>
> >> If we had the add-to-cart and the purchase we could use your
> >> cross-action method for getting recs by training only on those two
> >> actions.
> >>
> >> Still, without the add-to-cart, the method below should work, right? The
> >> main problem is finding a similar cart in the training set quickly. Are
> >> there other problems?
> >>
> >> On Feb 14, 2013, at 9:19 AM, Ted Dunning <[email protected]> wrote:
> >>
> >> I think that this is an excellent use case for cross recommendation from
> >> cart contents (items) to cart purchases (items). The cross aspect is that
> >> the recommendation is from two different kinds of actions, not two kinds
> >> of things. The first action is insertion into a cart and the second is
> >> purchase of an item.
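> >>
> >> (In other words, count co-occurrence across the two action histories. A
> >> toy sketch; 'Session', 'sessions', 'cooccur', and 'increment' are all
> >> hypothetical:
> >>
> >>   // cooccur[bought][carted]++ whenever an item was purchased in a
> >>   // session where another item had been put in the cart.
> >>   for (Session s : sessions) {
> >>     for (long carted : s.cartedItems) {
> >>       for (long bought : s.purchasedItems) {
> >>         increment(cooccur, bought, carted);
> >>       }
> >>     }
> >>   }
> >>   // Score a candidate for a new cart by summing cooccur[candidate][i]
> >>   // over the items i currently in the cart.)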
> >>
> >> On Thu, Feb 14, 2013 at 9:53 AM, Pat Ferrel <[email protected]>
> >> wrote:
> >>
> >>> There are several methods for recommending things given a shopping
> >>> cart's contents. At the risk of using the same tool for every problem, I
> >>> was thinking about using a recommender here.
> >>>
> >>> I'd do something like train on shopping cart purchases, so row =
> >>> cartID, column = itemID. Given cart contents I could find the most
> >>> similar cart in the training set by using a similarity measure, then get
> >>> recs for this closest matched cart.
> >>>
> >>> The search for similar carts may be slow if I have to check pairwise
> >>> similarity, so I could cluster, find the best cluster, then search it
> >>> for the best cart. I could create a decision tree on all trained carts
> >>> and walk as far as I can down the tree to find the cart with the most
> >>> cooccurrences. Are there other cooccurrence-based methods in Mahout?
> >>> With the ID of the cart I can then get recs from the training set. I
> >>> could also fold the new cart contents into the training set and ask for
> >>> recs based on it (this seems like it would take a long time to compute).
> >>> This last would also pollute the trained matrix with partial carts over
> >>> time.
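> >>>
> >>> (By "similarity measure" between carts I mean e.g. Tanimoto/Jaccard
> >>> over the two carts' item-ID sets; a sketch:
> >>>
> >>>   // Tanimoto (Jaccard) similarity between two carts' item sets:
> >>>   // |intersection| / |union|, in [0, 1].
> >>>   static double cartSimilarity(Set<Long> a, Set<Long> b) {
> >>>     int intersection = 0;
> >>>     for (Long item : a) {
> >>>       if (b.contains(item)) {
> >>>         intersection++;
> >>>       }
> >>>     }
> >>>     int union = a.size() + b.size() - intersection;
> >>>     return union == 0 ? 0.0 : (double) intersection / union;
> >>>   }
> >>>
> >>> The brute-force search is then just N of these per incoming cart.)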
> >>>
> >>> This seems like another place where Lucene might help, but are there
> >>> other Mahout methods to look at before I dive into Lucene?