Re: Evaluation of different recommendation algorithms for 12.000 user data set

Manuel Blechschmidt Mon, 21 Nov 2011 03:46:47 -0800

Hello Sean,

On 21.11.2011, at 12:16, Sean Owen wrote:


> Yes, because you have fewer items, an item-item-similarity-based algorithm
> probably runs much faster.

Thanks for your blazing fast feedback.

> 
> I would not necessarily use the raw number of kg as a preference. It's not
> really true that someone who buys 10kg of an item likes it 10x more than
> one he buys 1kg of. Maybe the second spice is much more valuable? I would
> at least try taking the logarithm of the weight, but, I think this is very
> noisy as a proxy for "preference". It creates illogical leaps -- because
> one user bought 85kg of X, and Y is "similar" to X, this would conclude
> that you're somewhat likely to buy 85kg of Y too. I would probably not use
> weight at all this way.

Thanks for these suggestions. I will consider integrating a logarithmic weight
into the recommender. At the moment I am more concerned with getting the user
feedback component working. From some manual tests I can already tell that the
recommendations for some users make sense.
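
A minimal sketch of the logarithmic weighting suggested above, in plain Java
with no Mahout dependency. The class LogWeight and the method logPreference are
hypothetical names chosen for illustration, and using ln(1 + grams) rather than
some other logarithm is an assumption. The two inputs are the MinPreference and
MaxPreference values from the original post.

```java
final class LogWeight {
    // Hypothetical helper: maps a purchased amount in grams to a dampened
    // preference value using ln(1 + grams).
    static double logPreference(double grams) {
        if (grams <= 0) {
            throw new IllegalArgumentException("grams must be positive");
        }
        return Math.log1p(grams);
    }

    public static void main(String[] args) {
        double min = logPreference(50.0);     // MinPreference from the data set
        double max = logPreference(85850.0);  // MaxPreference from the data set
        // Print both transformed values and how far apart they still are.
        System.out.println("min=" + min + " max=" + max + " ratio=" + (max / min));
    }
}
```

With this transform the raw ratio of 1717 between the largest and the smallest
order shrinks to less than 3, so a single bulk order no longer dominates the
preference values.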

Based on my own profile I can tell that when I buy more of a certain product,
I also like that product more.

I am also thinking about some seasonal tweaking. Tea is a very seasonal product:
during winter and Christmas, other flavors are sold than in summer.
http://diuf.unifr.ch/main/is/sites/diuf.unifr.ch.main.is/files/documents/publications/WS07-08-011.pdf

> 
> It is not therefore surprising that log-likelihood works well, since it
> ignores this value actually.
> 
> (You mentioned RMSE but your evaluation metric is
> average-absolute-difference -- L1, not L2).

You are right, RMSE (root-mean-squared error) is wrong. I think it is MAE
(mean absolute error).
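
To make the difference between the two metrics concrete, here is a small
sketch in plain Java that computes the mean absolute error (L1-based) and the
root-mean-squared error (L2-based) over the same predictions. The class
ErrorMetrics, its method names, and the sample values are made up for
illustration; they are not taken from the data set or from Mahout.

```java
final class ErrorMetrics {
    // Mean absolute error: average of |actual - predicted|.
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        checkSameLength(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // Root-mean-squared error: square root of the average squared difference.
    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        checkSameLength(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    private static void checkSameLength(double[] a, double[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }

    public static void main(String[] args) {
        // Made-up amounts in grams, for illustration only.
        double[] actual = {100, 200, 50, 1000};
        double[] predicted = {150, 150, 50, 500};
        System.out.println("MAE  = " + meanAbsoluteError(actual, predicted));
        System.out.println("RMSE = " + rootMeanSquaredError(actual, predicted));
    }
}
```

For these made-up values the MAE is 150, while the RMSE is roughly 252 because
the single 500 g miss is squared before averaging. RMSE therefore punishes
large individual errors more strongly than MAE does.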

> 
> This is quite a small data set so you should have no performance issues.
> Your evaluations, which run over all users in the data set, are taking mere
> seconds. I am sure you could get away with much less memory/processing if
> you like.

This is by far good enough. The more important part is sending the newsletter.
I have to generate about 10,000 emails, which is more of a headache than the
recommender.

/Manuel

> 
> 
> On Mon, Nov 21, 2011 at 11:06 AM, Manuel Blechschmidt <
> [email protected]> wrote:
> 
>> Hello Mahout Team, hello users,
>> a friend and I are currently evaluating recommendation techniques for
>> personalizing a newsletter for a company selling tea, spices and some other
>> products. Mahout is such a great product; it saves me hours of time and a
>> great deal of money. Because I want to give something back, I am writing
>> this small case study to the mailing list.
>> 
>> I am conducting an offline test to find out which recommender is the most
>> accurate. I am also interested in runtime behavior such as memory
>> consumption and execution time.
>> 
>> The data contains implicit feedback. The preference of a user is the
>> amount in grams that he bought of a certain product (453 g ~ 1 pound). If
>> a certain product does not have this data, it is replaced with 50. So
>> basically I want Mahout to predict how much of a certain product a user
>> will buy next. This is also helpful for demand planning. I am currently not
>> using any time data because I did not find a recommender that uses this
>> data.
>> 
>> Users: 12858
>> Items: 5467
>> 121304 preferences
>> MaxPreference: 85850.0 (Meaning that there is someone who ordered 85 kg of
>> a certain tea or spice)
>> MinPreference: 50.0
>> 
>> Here are the pure benchmarks for accuracy in RMSE. They change during
>> every run of the evaluation (~15%):
>> 
>> Evaluation of randomBased (baseline): 43045.380570443434
>>   (RandomRecommender(model))
>>   (Time: ~0.3 s) (Memory: 16MB)
>> Evaluation of ItemBased with Pearson Correlation: 315.5804958647985
>>   (GenericItemBasedRecommender(model, PearsonCorrelationSimilarity(model)))
>>   (Time: ~1s) (Memory: 35MB)
>> Evaluation of ItemBased with uncentered Cosine: 198.25393235323375
>>   (GenericItemBasedRecommender(model, UncenteredCosineSimilarity(model)))
>>   (Time: ~1s) (Memory: 32MB)
>> Evaluation of ItemBased with log likelihood: 176.45243607278724
>>   (GenericItemBasedRecommender(model, LogLikelihoodSimilarity(model)))
>>   (Time: ~5s) (Memory: 42MB)
>> Evaluation of UserBased 3 with Pearson Correlation: 1378.1188069379868
>>   (GenericUserBasedRecommender(model, NearestNUserNeighborhood(3,
>>   PearsonCorrelationSimilarity(model), model),
>>   PearsonCorrelationSimilarity(model)))
>>   (Time: ~52s) (Memory: 57MB)
>> Evaluation of UserBased 20 with Pearson Correlation: 1144.1905989614288
>>   (GenericUserBasedRecommender(model, NearestNUserNeighborhood(20,
>>   PearsonCorrelationSimilarity(model), model),
>>   PearsonCorrelationSimilarity(model)))
>>   (Time: ~51s) (Memory: 57MB)
>> Evaluation of SlopeOne: 464.8989330869532
>>   (SlopeOneRecommender(model))
>>   (Time: ~4s) (Memory: 604MB)
>> Evaluation of SVD based: 326.1050823499026
>>   (ALSWRFactorizer(model, 100, 0.3, 5))
>>   (Time: ) (Memory: 691MB)
>> 
>> These were measured with the following method:
>> 
>> RecommenderEvaluator evaluator = new
>> AverageAbsoluteDifferenceRecommenderEvaluator();
>> double evaluation = evaluator.evaluate(randomBased, null, myModel,
>>       0.9, 1.0);
>> 
>> Memory usage was about 50 MB in the item-based case. Slope One and the
>> SVD-based recommender seem to use the most memory (615MB & 691MB).
>> 
>> The performance differs a lot. The fastest ones were the item-based ones.
>> They took about 1 to 5 seconds (PearsonCorrelationSimilarity and
>> UncenteredCosineSimilarity 1 s, LogLikelihoodSimilarity 5s).
>> The user-based ones were a lot slower.
>> 
>> The conclusion is that in my case the item-based approach is the fastest,
>> has the lowest memory consumption, and is the most accurate. Furthermore,
>> I can use the recommendedBecause function.
>> 
>> Here is the spec of the computer:
>> 2.3GHz Intel Core i5 (4 Cores). 1024 MB for the Java virtual machine.
>> 
>> In the next step, probably within the next 2 months, I have to design a
>> newsletter and send it to the customers. Then I can benchmark the user
>> acceptance rate of the recommendations.
>> 
>> Any suggestions for enhancements are appreciated. If anybody is interested
>> in the dataset or the evaluation code, send me a private email. I might be
>> able to convince the company to give out the dataset if the person is doing
>> some interesting research.
>> 
>> /Manuel
>> --
>> Manuel Blechschmidt
>> Dortustr. 57
>> 14467 Potsdam
>> Mobil: 0173/6322621
>> Twitter: http://twitter.com/Manuel_B
>> 
>> 

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
