No. Go for this more recent (and much shorter) one:
http://www.mapr.com/practical-machine-learning

And if you like it, leave a review on Amazon:
http://www.amazon.com/Practical-Machine-Learning-Innovations-Recommendation-ebook/dp/B00JRHVNT4

On Thu, Aug 21, 2014 at 11:31 PM, Serega Sheypak wrote:

> Ok, I got it. Is it Ted's book?
>
> http://www.amazon.com/Mahout-Action-Sean-Owen/dp/1935182684/ref=la_B00EHXC1NK_1_1?s=books&ie=UTF8&qid=1408689021&sr=1-1
>
> I've read this one:
>
> http://www.amazon.com/Apache-Mahout-Cookbook-Piero-Giacomelli-ebook/dp/B00HJR6R86/ref=sr_1_2?s=books&ie=UTF8&qid=1408689063&sr=1-2&keywords=mahout
>
> No satisfaction at all
>
> 2014-08-21 20:32 GMT+04:00 Pat Ferrel:
>
> > Sorry that wasn’t clear.
> >
> > Given your purchase volume, you may not have very good coverage training
> > on purchases only. So using views may be your best bet. Ted’s metrics were:
> > “How many people got each item? How many people total? How many people
> > got both?” This is how you tell what action has enough data to be useful.
> > In your case that may be views.
> >
> > The other point was about doing personalization. Since you have
> > itemsimilarity working well you can add personalization with a search
> > engine using methods described in Ted’s book. This requires that you
> > capture user history (views in this case) and use that as a query on the
> > itemsimilarity data. If you know enough of the current user’s recent
> > history this will allow you to show “people with the same taste as you also
> > looked at these items”.
> >
> > Currently you are not personalizing, you are showing the same “similar
> > items” to every user. That is fine but personalization may improve things
> > further.
> >
> >
> > On Aug 21, 2014, at 8:08 AM, Serega Sheypak wrote:
> >
> > Excuse me looks like I've missed important point
> > "Ah, then using Ted’s metrics views _is_ probably your best bet."
> > You are talking about "personal recommendations" serving from search
> > engine? The idea was to get active visitor view history and give him
> > "similar" view histories from search engine in runtime?
> >
> >
> > 2014-08-21 18:50 GMT+04:00 Pat Ferrel:
> >
> > >>
> > >> On Aug 21, 2014, at 1:22 AM, Serega Sheypak wrote:
> > >>
> > >>>> What you are doing is best practice for showing similar “views”. The
> > >>>> technique for using multiple actions will be covered in a series of blog
> > >>>> posts and may be put on the Mahout site eventually
> > >> Great thanks!
> > >>
> > >>>> People look at 100 things and buy 1, as you say. The question is: Do you
> > >>>> want people to buy something or just browse your site?
> > >> No objections for your point. I understand it. It should work for pretty
> > >> big ecom, right? Small ecom sell 100-200 items per day and have wide range
> > >> of items.
> > >
> > > Ah, then using Ted’s metrics views _is_ probably your best bet. You can
> > > probably still personalize view recommendations. Since you are already
> > > using itemsimilarity it can be a second step that builds on the first.
> > >
> > >>
> > >>>> Filter out any items not in the catalog from your recommendations.
> > >> We have it on data preparation stage. We recalculate item similarity each
> > >> day sliding back for 60 days excluding non-available items on preparation
> > >> stage.
> > >>
> > >> Thank you! We did reach good results, business guys got satisfaction :)
> > >>
> > >>
> > >> 2014-08-20 20:28 GMT+04:00 Pat Ferrel:
> > >>
> > >>>>
> > >>>> On Aug 19, 2014, at 11:26 PM, Serega Sheypak wrote:
> > >>>>
> > >>>> Hi!
> > >>>> 1. There was a bug in UI, I've checked raw recommendations. "water heating
> > >>>> device" has low score. So first 30 recommended items really fits iPhone,
> > >>>> next are not so good.
> > >>>> Anyway result is good, thank you very much.
> > >>>> 2. I've inspected "sessions" of users, really there are people who viewed
> > >>>> iphone and heating device. 10 people for last month.
> > >>>> 3. I will calculate relative measurement, I didn't calc what is % of these
> > >>>> people comparing to others and how they influence the score result.
> > >>>>
> > >>>
> > >>> That’s great. The Spark version sorts the result by weights, but I think
> > >>> the mapreduce version doesn't
> > >>>
> > >>>> *You wrote:*
> > >>>> Then once you have that working you can add more actions but only with
> > >>>> cross-cooccurrence, adding by weighting* will not work with this type of
> > >>>> recommender*, which recommender can work with weights for actions?
> > >>>
> > >>> What you are doing is best practice for showing similar “views”. The
> > >>> technique for using multiple actions will be covered in a series of blog
> > >>> posts and may be put on the Mahout site eventually. It requires
> > >>> spark-itemsimilarity. For now I’d strongly suggest you look at training on
> > >>> purchase data alone - see the comments below.
> > >>>
> > >>>>
> > >>>> *About building recommendations using sales.*
> > >>>> Sales are less than 1% from item views. You will recommend only stuff
> > >>>> people buy.
> > >>>
> > >>> The point is not volume of data but quality of data. I once measured how
> > >>> predictive of purchases the views were and found them a rather poor
> > >>> predictor. People look at 100 things and buy 1, as you say. The question
> > >>> is: Do you want people to buy something or just browse your site?
> > >>>
> > >>> On the other hand you would need to see how good your coverage is of
> > >>> purchases. Do you have enough items purchased by several people (Ted’s
> > >>> questions below will guide you)? If there is good coverage then you _do
> > >>> not_ restrict the range by using only purchase data.
> > >>> You increase the quality.
> > >>>
> > >>>> If you recommend what people see you significantly widen range
> > >>>> of possible buy actions. People always buy case "XXX" with iphone. You
> > >>>> would never recommend them to buy case "YYY". If people watch "XXX" and
> > >>>> "YYY" it's reasonable to recommend "YYY". Maybe "YYY" is more expensive,
> > >>>> that is why people prefer cheaper "XXX". What's wrong with this assumption?
> > >>>
> > >>> Nothing at all. Remember that your goal is to cause a purchase but using
> > >>> views requires some “scrubbing” of views. You want, in effect,
> > >>> views-that-lead-to-purchases. In a cooccurrence recommender this can be
> > >>> done with cross-cooccurrence and I’ll describe that elsewhere, it’s too
> > >>> long for an email to describe but pretty easy to use.
> > >>>
> > >>> I’d wager that if you restrict to purchases your sales will go up over
> > >>> recommending views. But that is without looking at your data. If you need
> > >>> more data try increasing the sliding time window to add more purchases. This
> > >>> will eventually start including things that are no longer in your catalog
> > >>> so will have diminishing returns but 60 days seems like a short time period.
> > >>> Filter out any items not in the catalog from your recommendations.
> > >>>
> > >>> You want recency to matter, this is good intuition. The in-catalog filter
> > >>> is one simple way, and there are others when you get to personalization.
> > >>>
> > >>>>
> > >>>> *About our obsessive desire to add weights for actions.*
> > >>>> We would like to self-tune our recommendations. If user clicks our
> > >>>> recommendation it's a signal for us that items are related. So next time
> > >>>> this link should have higher score. What are the approaches to do it?
> > >>>>
> > >>>
> > >>> Yes, you do want the things that lead to purchases to go into the training
> > >>> data. This is good intuition. But you don’t do it with weights; you train on
> > >>> new purchases, regardless of whether they came from random views,
> > >>> rec-views, or … You don’t care whether a rec was clicked on; you care if a
> > >>> purchase was made and you don’t care what part of the UI caused it. UI
> > >>> analysis is very very important but doesn’t help the recommender, it guides
> > >>> UI decisions. So measuring clicks is good but shouldn’t be used to change
> > >>> recs.
> > >>>
> > >>> One way to increase the value of your recs is to add a little randomness
> > >>> to their ordering. If you have 10 things to recommend get 20 from
> > >>> itemsimilarity and apply a normally distributed random weighting, then
> > >>> re-sort and show the top 10. This will move some things up in order and
> > >>> show them where without the re-ordering they would never be shown. The
> > >>> technique allows you to expose more items to possible purchase and
> > >>> therefore affect the ordering the next time you train. The actual algorithm
> > >>> takes more space to describe but the idea is a lot like a multi-armed
> > >>> bandit where the best bandit eventually gets all trials. In this case the
> > >>> best rec leads to a purchase and gets into the new training data and so
> > >>> will be shown more often the next time.
> > >>>
> > >>> Another thing you can do is create a “shopping cart” recommender. This
> > >>> looks at items purchased together - an item-set. It is a strong indicator of
> > >>> relatedness.
> > >>>
> > >>> Suggestions:
> > >>> 1) personalize: this is likely to make the most difference since you will
> > >>> be showing different things to different people. The “Practical Machine
> > >>> Learning” book is short and easy to read - it describes this.
> > >>> 2) move to purchase data training, wait for cross-cooccurrence to add in
> > >>> view data. Do this if you have good coverage (Ted’s questions below relate
> > >>> to this).
> > >>> 3) increase the training period if needed to get good catalog coverage
> > >>> 4) consider dithering your recs to expose more items to purchase and
> > >>> therefore self-tune by increasing the quality of your training data.
> > >>>
> > >>>>
> > >>>> 2014-08-20 7:18 GMT+04:00 Ted Dunning:
> > >>>>
> > >>>>> On Tue, Aug 19, 2014 at 12:53 AM, Serega Sheypak wrote:
> > >>>>>
> > >>>>>> What could be a reason for recommending "Water heat device" to iPhone?
> > >>>>>> iPhone is one of the most popular items. There should be a lot of people
> > >>>>>> viewing iPhone with "Water heat device"?
> > >>>>>>
> > >>>>>
> > >>>>> What are the numbers?
> > >>>>>
> > >>>>> How many people got each item? How many people total? How many people got
> > >>>>> both?
> > >>>>>
> > >>>>> What about the same for the iPhone related items?
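
For reference, the three numbers asked for in the thread (how many people got each item, how many people in total, how many got both) could be pulled from an event log roughly like this. This is only a sketch: the function name `cooccurrence_counts`, the `(user_id, item_id)` log format, and the sample data are assumptions made for illustration, not anything Mahout provides.

```python
from collections import defaultdict


def cooccurrence_counts(events, item_a, item_b):
    """Count users per item, users with both items, and users overall.

    `events` is an iterable of (user_id, item_id) pairs for ONE action
    type (e.g. only views, or only purchases). Hypothetical log format.
    """
    users_by_item = defaultdict(set)
    all_users = set()
    for user, item in events:
        users_by_item[item].add(user)
        all_users.add(user)

    users_a = users_by_item.get(item_a, set())
    users_b = users_by_item.get(item_b, set())
    return {
        "users_with_a": len(users_a),
        "users_with_b": len(users_b),
        "users_with_both": len(users_a & users_b),
        "users_total": len(all_users),
    }


# Made-up view log, for illustration only.
views = [
    ("u1", "iphone"), ("u1", "water_heater"),
    ("u2", "iphone"),
    ("u3", "iphone"), ("u3", "case"),
    ("u4", "case"),
]
print(cooccurrence_counts(views, "iphone", "water_heater"))
```

Running the same function once on the view log and once on the purchase log gives a quick feel for which action has enough overlapping users to train on.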

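
The dithering idea from the quoted message (fetch about 20 candidates from itemsimilarity, apply a normally distributed random weighting, re-sort, show the top 10) might look roughly like the sketch below. The thread says the actual algorithm takes more space to describe, so treat this as a simplified illustration only: adding Gaussian noise directly to the similarity score, the `sigma` value, and the `dither` function name are all assumptions.

```python
import random


def dither(candidates, show=10, sigma=0.5, rng=None):
    """Re-rank scored candidates with Gaussian noise and keep the top `show`.

    `candidates` is a list of (item_id, score) pairs, e.g. the top 20
    similar items. Adding noise straight to the score is an assumption;
    `sigma` controls how much the order gets shuffled (0 = no change).
    """
    rng = rng or random.Random()
    noisy = [(score + rng.gauss(0.0, sigma), item) for item, score in candidates]
    # Sort on the noisy score only, highest first.
    noisy.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in noisy[:show]]


# Hypothetical candidate list: 20 items with decreasing similarity scores.
candidates = [(f"item{i}", 1.0 - i * 0.04) for i in range(20)]
print(dither(candidates, show=10, sigma=0.5))
```

A larger `sigma` exposes more of the lower-ranked items at the cost of showing weaker recommendations more often, so it would need tuning against real purchase data.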