> On Aug 21, 2014, at 1:22 AM, Serega Sheypak <[email protected]> wrote:
>
>>> What you are doing is best practice for showing similar “views”. The
>>> technique for using multiple actions will be covered in a series of blog
>>> posts and may be put on the Mahout site eventually.
>
> Great, thanks!
>
>>> People look at 100 things and buy 1, as you say. The question is: Do you
>>> want people to buy something or just browse your site?
>
> No objections to your point, I understand it. It should work for a pretty
> big e-commerce site, right? A small e-commerce site sells 100-200 items per
> day and has a wide range of items.
Ah, then using Ted’s metrics, views _is_ probably your best bet. You can
probably still personalize view recommendations. Since you are already using
itemsimilarity, it can be a second step that builds on the first.

>>> Filter out any items not in the catalog from your recommendations.
>
> We have that at the data preparation stage. We recalculate item similarity
> each day, sliding back over 60 days and excluding non-available items
> during preparation.
>
> Thank you! We did reach good results; the business guys are satisfied :)
>
>
> 2014-08-20 20:28 GMT+04:00 Pat Ferrel <[email protected]>:
>
>>> On Aug 19, 2014, at 11:26 PM, Serega Sheypak <[email protected]> wrote:
>>>
>>> Hi!
>>> 1. There was a bug in the UI; I've checked the raw recommendations. "Water
>>> heating device" has a low score. So the first 30 recommended items really
>>> fit the iPhone, the next are not so good. Anyway the result is good,
>>> thank you very much.
>>> 2. I've inspected "sessions" of users; there really are people who viewed
>>> the iPhone and the heating device: 10 people in the last month.
>>> 3. I will calculate a relative measurement; I didn't calculate what % of
>>> these people there are compared to the others and how they influence the
>>> score result.
>>
>> That’s great. The Spark version sorts the result by weights, but I think
>> the mapreduce version doesn't.
>>
>>> *You wrote:*
>>> Then once you have that working you can add more actions but only with
>>> cross-cooccurrence; adding by weighting *will not work with this type of
>>> recommender*. Which recommender can work with weights for actions?
>>
>> What you are doing is best practice for showing similar “views”. The
>> technique for using multiple actions will be covered in a series of blog
>> posts and may be put on the Mahout site eventually. It requires
>> spark-itemsimilarity. For now I’d strongly suggest you look at training on
>> purchase data alone - see the comments below.
>>
>>> *About building recommendations using sales.*
>>> Sales are less than 1% of item views. You will recommend only stuff
>>> people buy.
>>
>> The point is not the volume of data but the quality of data. I once
>> measured how predictive of purchases the views were and found them a
>> rather poor predictor. People look at 100 things and buy 1, as you say.
>> The question is: Do you want people to buy something or just browse your
>> site?
>>
>> On the other hand you would need to see how good your coverage of
>> purchases is. Do you have enough items purchased by several people (Ted’s
>> questions below will guide you)? If there is good coverage then you _do
>> not_ restrict the range by using only purchase data. You increase the
>> quality.
>>
>>> If you recommend what people see you significantly widen the range of
>>> possible buy actions. People always buy case "XXX" with the iPhone. You
>>> would never recommend them case "YYY". If people view "XXX" and "YYY"
>>> it's reasonable to recommend "YYY". Maybe "YYY" is more expensive, which
>>> is why people prefer the cheaper "XXX". What's wrong with this
>>> assumption?
>>
>> Nothing at all. Remember that your goal is to cause a purchase, but using
>> views requires some “scrubbing” of views. You want, in effect,
>> views-that-lead-to-purchases. In a cooccurrence recommender this can be
>> done with cross-cooccurrence and I’ll describe that elsewhere; it’s too
>> long to describe in an email but pretty easy to use.
>>
>> I’d wager that if you restrict to purchases your sales will go up over
>> recommending views. But that is without looking at your data.
>> If you need more data, try increasing the sliding time window to add more
>> purchases. This will eventually start including things that are no longer
>> in your catalog, so it will have diminishing returns, but 60 days seems
>> like a short time period. Filter out any items not in the catalog from
>> your recommendations.
>>
>> You want recency to matter; this is good intuition. The in-catalog filter
>> is one simple way, and there are others when you get to personalization.
>>
>>> *About our obsessive desire to add weights for actions.*
>>> We would like to self-tune our recommendations. If a user clicks one of
>>> our recommendations, it's a signal for us that the items are related, so
>>> next time this link should have a higher score. What are the approaches
>>> to doing this?
>>
>> Yes, you do want the things that lead to purchases to go into the training
>> data. This is good intuition. But you don’t do it with weights; you train
>> on new purchases, regardless of whether they came from random views,
>> rec-views, or … You don’t care whether a rec was clicked on; you care if a
>> purchase was made, and you don’t care what part of the UI caused it. UI
>> analysis is very, very important but doesn’t help the recommender; it
>> guides UI decisions. So measuring clicks is good but shouldn’t be used to
>> change recs.
>>
>> One way to increase the value of your recs is to add a little randomness
>> to their ordering. If you have 10 things to recommend, get 20 from
>> itemsimilarity and apply a normally distributed random weighting, then
>> re-sort and show the top 10. This will move some things up in order and
>> show them where, without the re-ordering, they would never be shown. The
>> technique allows you to expose more items to possible purchase and
>> therefore affect the ordering the next time you train. The actual
>> algorithm takes more space to describe, but the idea is a lot like a
>> multi-armed bandit where the best bandit eventually gets all trials. In
>> this case the best rec leads to a purchase, gets into the new training
>> data, and so will be shown more often the next time.
>>
>> Another thing you can do is create a “shopping cart” recommender. This
>> looks at items purchased together (an item-set). It is a strong indicator
>> of relatedness.
>>
>> Suggestions:
>> 1) Personalize: this is likely to make the most difference since you will
>> be showing different things to different people. The “Practical Machine
>> Learning” book is short and easy to read; it describes this.
>> 2) Move to training on purchase data; wait for cross-cooccurrence to add
>> in view data. Do this if you have good coverage (Ted’s questions below
>> relate to this).
>> 3) Increase the training period if needed to get good catalog coverage.
>> 4) Consider dithering your recs to expose more items to purchase and
>> therefore self-tune by increasing the quality of your training data.
>>
>>>
>>> 2014-08-20 7:18 GMT+04:00 Ted Dunning <[email protected]>:
>>>
>>>> On Tue, Aug 19, 2014 at 12:53 AM, Serega Sheypak
>>>> <[email protected]> wrote:
>>>>
>>>>> What could be the reason for recommending a "Water heat device" for the
>>>>> iPhone? The iPhone is one of the most popular items. There should be a
>>>>> lot of people viewing the iPhone along with a "Water heat device"?
>>>>
>>>> What are the numbers?
>>>>
>>>> How many people got each item? How many people total? How many people
>>>> got both?
>>>>
>>>> What about the same for the iPhone-related items?
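
For reference, a minimal sketch of the data-preparation step discussed above:
keep a sliding training window and drop items that are no longer sellable. It
is plain Scala, independent of Mahout, and the names (Interaction,
trainingSet) plus the 60-day default are illustrative assumptions, not
anything from the project.

import java.time.{Duration, Instant}

object TrainingWindow {

  // One logged interaction: who acted on which item, and when (illustrative shape).
  case class Interaction(userId: String, itemId: String, timestamp: Instant)

  // Keep interactions from the last `days` days whose item is still available.
  // Widening `days` adds more purchases, at the cost of pulling in items that
  // may have since left the catalog - hence the availability check.
  def trainingSet(log: Seq[Interaction],
                  available: Set[String],
                  days: Long = 60,
                  now: Instant = Instant.now()): Seq[Interaction] = {
    val cutoff = now.minus(Duration.ofDays(days))
    log.filter(i => i.timestamp.isAfter(cutoff) && available.contains(i.itemId))
  }
}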
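
And a minimal sketch of the dithering step Pat describes: fetch more
candidates than you show, drop anything not in the current catalog, apply a
normally distributed random weighting, re-sort, and keep the top N. Again
plain Scala with illustrative names (Rec, dither, sigma); multiplying each
score by exp of a normal draw is just one reasonable way to realize the
random weighting, not the algorithm Pat has in mind.

import scala.util.Random

object Dither {

  // A recommendation as it might come back from itemsimilarity (illustrative shape).
  case class Rec(itemId: String, score: Double)

  // Filter to the catalog, jitter each score with a log-normal factor,
  // re-sort by the jittered score, and keep the top `show` items. Fetch
  // roughly twice as many candidates as you show (20 to show 10) so
  // lower-ranked items get some exposure.
  def dither(candidates: Seq[Rec],
             catalog: Set[String],
             show: Int,
             sigma: Double = 0.3,
             rnd: Random = new Random()): Seq[Rec] = {
    candidates
      .filter(r => catalog.contains(r.itemId))
      .map(r => r -> r.score * math.exp(sigma * rnd.nextGaussian()))
      .sortBy { case (_, jittered) => -jittered }
      .take(show)
      .map { case (rec, _) => rec }
  }

  // Usage: dither(top20FromItemSimilarity, currentCatalogIds, show = 10)
}

A larger sigma shuffles the order more aggressively, exposing more items to a
possible purchase at the cost of showing the best-scored recs less often.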
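
Finally, a sketch of the raw counting behind a “shopping cart” (item-set)
recommender mentioned above: how often two items land in the same order. The
names are illustrative; in practice counts like these would feed a
significance test such as LLR rather than being used directly as scores.

object CartPairs {

  // How often two items appear in the same order. `orderLines` is (orderId, itemId);
  // the result maps an unordered item pair to the number of carts containing both.
  def pairCounts(orderLines: Seq[(String, String)]): Map[(String, String), Int] = {
    orderLines
      .groupBy { case (orderId, _) => orderId }         // gather the lines of each cart
      .values
      .flatMap { lines =>
        val items = lines.map { case (_, itemId) => itemId }.distinct.sorted
        for {
          i <- items.indices
          j <- (i + 1) until items.size
        } yield (items(i), items(j))                    // every unordered pair in the cart
      }
      .groupBy(identity)
      .map { case (pair, hits) => pair -> hits.size }
  }
}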
