Re: Evaluation of different recommendation algorithms for 12.000 user data set

Ted Dunning Mon, 21 Nov 2011 07:21:42 -0800

Your product is subject to seasonality constraints (which teas are likely
to sell right now) and repeat buying.  I would separate out the
recommendation of repeat buys from the recommendation of new items.
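
A minimal sketch of that split, in plain Java rather than any Mahout API.
It assumes you already have a ranked candidate list per user and the set
of items that user bought before; the class name, method name and item
IDs are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class RepeatVersusNew {

    // Splits a ranked candidate list into items the user already bought
    // (key true: repeat buys) and items not yet bought (key false: new
    // items). The ranking order is preserved within each list.
    static Map<Boolean, List<Long>> splitByPurchaseHistory(
            List<Long> rankedCandidates, Set<Long> alreadyBought) {
        return rankedCandidates.stream()
                .collect(Collectors.partitioningBy(alreadyBought::contains));
    }

    public static void main(String[] args) {
        // Hypothetical item IDs, best-ranked first.
        List<Long> ranked = Arrays.asList(11L, 42L, 7L, 99L);
        Set<Long> bought = new HashSet<>(Arrays.asList(42L, 99L));

        Map<Boolean, List<Long>> split = splitByPurchaseHistory(ranked, bought);
        System.out.println("Repeat buys: " + split.get(true));   // [42, 99]
        System.out.println("New items:   " + split.get(false));  // [11, 7]
    }
}
```

Each list can then get its own slot in the newsletter, and the repeat-buy
list can be ordered by something other than similarity (time since last
purchase, for instance).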


You may also find that item-item links on your web site are helpful.  These
are easy to get using this system.
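
To illustrate what an item-item link is, here is a rough sketch that just
counts co-purchases in plain Java. This is a deliberately simpler stand-in
and not how Mahout computes similarity (log-likelihood, Pearson and so on
would replace the raw count); all names in it are hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class ItemLinks {

    // For every ordered pair (a, b) with a != b, counts how many users
    // bought both items. Each set holds the item IDs of one user.
    static Map<Long, Map<Long, Integer>> coPurchaseCounts(
            Collection<Set<Long>> itemsPerUser) {
        Map<Long, Map<Long, Integer>> counts = new HashMap<>();
        for (Set<Long> items : itemsPerUser) {
            for (long a : items) {
                for (long b : items) {
                    if (a != b) {
                        counts.computeIfAbsent(a, k -> new HashMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    // Returns up to n items most often bought together with the given
    // item, highest count first; ties are broken by ascending item ID.
    static List<Long> topLinks(
            Map<Long, Map<Long, Integer>> counts, long item, int n) {
        Map<Long, Integer> linked =
                counts.getOrDefault(item, Collections.<Long, Integer>emptyMap());
        return linked.entrySet().stream()
                .sorted(Map.Entry.<Long, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<Long, Integer>comparingByKey()))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

The output of `topLinks` is what you would render as "customers who bought
this also bought" links on a product page. In Mahout itself the item-based
recommender should give you the equivalent directly from its similarity
measure, without the quadratic loop above.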

On Mon, Nov 21, 2011 at 11:46 AM, Manuel Blechschmidt <
[email protected]> wrote:

> Hello Sean,
>
> On 21.11.2011, at 12:16, Sean Owen wrote:
>
> > Yes, because you have fewer items, an item-item-similarity-based
> > algorithm probably runs much faster.
>
> Thanks for your blazing fast feedback.
>
> >
> > I would not necessarily use the raw number of kg as a preference. It's
> > not really true that someone who buys 10kg of an item likes it 10x more
> > than an item he buys 1kg of. Maybe the second spice is much more
> > valuable? I would at least try taking the logarithm of the weight, but
> > I think this is very noisy as a proxy for "preference". It creates
> > illogical leaps -- because one user bought 85kg of X, and Y is
> > "similar" to X, this would conclude that you're somewhat likely to buy
> > 85kg of Y too. I would probably not use weight at all this way.
>
> Thanks for these suggestions. I will consider integrating a logarithmic
> weight into the recommender. At the moment I am more concerned with
> getting the user feedback component working. From some manual tests I can
> already tell that the recommendations for some users make sense.
>
> Based on my own profile I can tell that when I buy more of a certain
> product then I also like the product more.
>
> I am also thinking about some seasonal tweaking. Tea is a very seasonal
> product: during winter and Christmas, other flavors are sold than in
> summer.
> http://diuf.unifr.ch/main/is/sites/diuf.unifr.ch.main.is/files/documents/publications/WS07-08-011.pdf
>
> >
> > It is therefore not surprising that log-likelihood works well, since it
> > actually ignores this value.
> >
> > (You mentioned RMSE but your evaluation metric is
> > average-absolute-difference -- L1, not L2).
>
> You are right, RMSE (root-mean-squared error) is wrong. I think it is MAE
> (mean absolute error).
>
> >
> > This is quite a small data set so you should have no performance issues.
> > Your evaluations, which run over all users in the data set, are taking
> > mere seconds. I am sure you could get away with much less
> > memory/processing if you like.
>
> This is good enough by far. The more important part is sending the
> newsletter. I have to generate about 10.000 emails, which is more of a
> headache than the recommender.
>
> /Manuel
>
> >
> >
> > On Mon, Nov 21, 2011 at 11:06 AM, Manuel Blechschmidt <
> > [email protected]> wrote:
> >
> >> Hello Mahout Team, hello users,
> >> a friend and I are currently evaluating recommendation techniques for
> >> personalizing a newsletter for a company selling tea, spices and some
> >> other products. Mahout is such a great product, which saves me hours
> >> of time and millions in costs. Because I want to give something back,
> >> I am writing this small case study to the mailing list.
> >>
> >> I am conducting an offline test of which recommender is the most
> >> accurate. Further, I am interested in runtime behavior such as memory
> >> consumption and running time.
> >>
> >> The data contains implicit feedback. The preference of a user is the
> >> amount in grams that he bought of a certain product (453 g ~ 1 pound).
> >> If a certain product does not have this data it is replaced with 50. So
> >> basically I want Mahout to predict how much of a certain product a user
> >> is going to buy next. This is also helpful for demand planning. I am
> >> currently not using any time data because I did not find a recommender
> >> which uses this data.
> >>
> >> Users: 12858
> >> Items: 5467
> >> 121304 preferences
> >> MaxPreference: 85850.0 (Meaning that there is someone who ordered 85 kg
> >> of a certain tea or spice)
> >> MinPreference: 50.0
> >>
> >> Here are the pure benchmarks for accuracy in RMSE. They change during
> >> every run of the evaluation (~15%):
> >>
> >> Evaluation of randomBased (baseline): 43045.380570443434
> >> (RandomRecommender(model)) (Time: ~0.3 s) (Memory: 16MB)
> >> Evaluation of ItemBased with Pearson Correlation: 315.5804958647985
> >> (GenericItemBasedRecommender(model, PearsonCorrelationSimilarity(model)))
> >> (Time: ~1s)  (Memory: 35MB)
> >> Evaluation of ItemBased with uncentered Cosine: 198.25393235323375
> >> (GenericItemBasedRecommender(model, UncenteredCosineSimilarity(model)))
> >> (Time: ~1s)  (Memory: 32MB)
> >> Evaluation of ItemBased with log likelihood: 176.45243607278724
> >> (GenericItemBasedRecommender(model, LogLikelihoodSimilarity(model)))
> >> (Time: ~5s)  (Memory: 42MB)
> >> Evaluation of UserBased 3 with Pearson Correlation: 1378.1188069379868
> >> (GenericUserBasedRecommender(model, NearestNUserNeighborhood(3,
> >> PearsonCorrelationSimilarity(model), model),
> >> PearsonCorrelationSimilarity(model)))  (Time: ~52s) (Memory: 57MB)
> >> Evaluation of UserBased 20 with Pearson Correlation: 1144.1905989614288
> >> (GenericUserBasedRecommender(model, NearestNUserNeighborhood(20,
> >> PearsonCorrelationSimilarity(model), model),
> >> PearsonCorrelationSimilarity(model)))  (Time: ~51s) (Memory: 57MB)
> >> Evaluation of SlopeOne: 464.8989330869532 (SlopeOneRecommender(model))
> >> (Time: ~4s) (Memory: 604MB)
> >> Evaluation of SVD based: 326.1050823499026 (ALSWRFactorizer(model, 100,
> >> 0.3, 5)) (Time: ) (Memory: 691MB)
> >>
> >> These were measured with the following method:
> >>
> >> RecommenderEvaluator evaluator = new
> >> AverageAbsoluteDifferenceRecommenderEvaluator();
> >> double evaluation = evaluator.evaluate(randomBased, null, myModel,
> >>       0.9, 1.0);
> >>
> >> Memory usage was about 50 MB in the item-based case. Slope One and the
> >> SVD-based recommender seem to use the most memory (615MB & 691MB).
> >>
> >> The performance differs a lot. The fastest ones were the item-based
> >> recommenders. They took about 1 to 5 seconds
> >> (PearsonCorrelationSimilarity and UncenteredCosineSimilarity 1 s,
> >> LogLikelihoodSimilarity 5 s). The user-based ones were a lot slower.
> >>
> >> The conclusion is that in my case the item-based approach is the
> >> fastest, has the lowest memory consumption and is the most accurate.
> >> Further, I can use the recommendedBecause function.
> >>
> >> Here is the spec of the computer:
> >> 2.3GHz Intel Core i5 (4 Cores). 1024 MB for java virtual machine.
> >>
> >> In the next step, probably within the next 2 months, I have to design
> >> a newsletter and send it to the customers. Then I can benchmark the
> >> user acceptance rate of the recommendations.
> >>
> >> Any suggestions for enhancements are appreciated. If anybody is
> >> interested in the dataset or the evaluation code send me a private
> >> email. I might be able to convince the company to give out the dataset
> >> if the person is doing some interesting research.
> >>
> >> /Manuel
> >> --
> >> Manuel Blechschmidt
> >> Dortustr. 57
> >> 14467 Potsdam
> >> Mobil: 0173/6322621
> >> Twitter: http://twitter.com/Manuel_B
> >>
> >>
>
>
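
On the L1 versus L2 point in the quoted thread, and the suggestion to take
the logarithm of the weight: a small plain-Java sketch of both error
measures and of a log-scaled preference. It does not use Mahout's
evaluator classes, the names are made up, and the choice of base 10 for
the logarithm is an assumption (the thread does not name a base).

```java
final class ErrorMeasures {

    // L1 measure: mean of |actual - predicted| (MAE).
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        checkInputs(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // L2 measure: square root of the mean squared difference (RMSE).
    // Large individual errors weigh more heavily here than in MAE.
    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        checkInputs(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    // Compresses a purchased weight in grams onto a log scale
    // (base 10 is an assumption made for this sketch).
    static double logScaledWeight(double grams) {
        if (grams <= 0.0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        return Math.log10(grams);
    }

    private static void checkInputs(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException(
                    "arrays must be non-empty and of equal length");
        }
    }
}
```

With the extremes from the data set, 50 g maps to roughly 1.7 and 85850 g
to roughly 4.9, so the heaviest buyer counts about three times as much as
the lightest instead of about 1700 times.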
