Yes, because you have fewer items than users, an item-item-similarity-based
algorithm will probably run much faster.

I would not necessarily use the raw number of kg as a preference. It's not
really true that someone who buys 10 kg of an item likes it ten times more
than an item he buys only 1 kg of; maybe the second spice is simply much more
valuable per gram. I would at least try taking the logarithm of the weight,
but I think weight is a very noisy proxy for "preference". It creates
illogical leaps: because one user bought 85 kg of X, and Y is "similar" to X,
the recommender would conclude that he is somewhat likely to buy 85 kg of Y
too. I would probably not use weight this way at all.
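If you do want to experiment with the log transform anyway, here is a rough,
untested sketch (LogScaledModel is just a name I made up) that copies your
DataModel with log(1 + grams) as the preference value:

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.PreferenceArray;

public final class LogScaledModel {

  // Copies every user's preferences, replacing each value (grams bought)
  // with log(1 + grams), so an 85 kg order no longer counts ~170x more
  // than a 500 g one.
  public static DataModel logScale(DataModel original) throws TasteException {
    FastByIDMap<PreferenceArray> userData = new FastByIDMap<PreferenceArray>();
    LongPrimitiveIterator userIDs = original.getUserIDs();
    while (userIDs.hasNext()) {
      long userID = userIDs.nextLong();
      PreferenceArray prefs = original.getPreferencesFromUser(userID);
      PreferenceArray scaled = new GenericUserPreferenceArray(prefs.length());
      scaled.setUserID(0, userID);
      for (int i = 0; i < prefs.length(); i++) {
        scaled.setItemID(i, prefs.getItemID(i));
        scaled.setValue(i, (float) Math.log1p(prefs.getValue(i)));
      }
      userData.put(userID, scaled);
    }
    return new GenericDataModel(userData);
  }
}

Whether that actually improves your evaluation score is something you would
have to measure; I still suspect weight is a weak preference signal either way.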

It is therefore not surprising that log-likelihood works well, since it
actually ignores the preference values entirely.
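You could make that explicit by dropping the values altogether. A rough,
untested sketch (the class name and file name are only placeholders) of
treating the data as plain bought / did-not-buy with the same item-based
log-likelihood setup:

import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.GenericBooleanPrefDataModel;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public final class BooleanLogLikelihood {

  // Treats every purchase as a simple "bought it" signal, discarding the
  // gram amounts, and builds the same item-based log-likelihood recommender.
  public static Recommender build(File ordersFile)
      throws IOException, TasteException {
    DataModel booleanModel = new GenericBooleanPrefDataModel(
        GenericBooleanPrefDataModel.toDataMap(new FileDataModel(ordersFile)));
    return new GenericItemBasedRecommender(
        booleanModel, new LogLikelihoodSimilarity(booleanModel));
  }
}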

(You mentioned RMSE, but your evaluation metric is average absolute
difference -- L1, not L2.)
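If you want a true RMSE number it is only a matter of swapping the evaluator.
A quick sketch (MetricComparison and printBoth are just names I made up, and
the builder argument is whatever RecommenderBuilder you already pass in):

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.model.DataModel;

public final class MetricComparison {

  // Evaluates the same recommender with both metrics (each call draws its own
  // random 90/10 split) so the reported numbers are labelled correctly:
  // the first is mean absolute error (L1), the second RMSE (L2).
  public static void printBoth(RecommenderBuilder builder, DataModel model)
      throws TasteException {
    RecommenderEvaluator mae = new AverageAbsoluteDifferenceRecommenderEvaluator();
    RecommenderEvaluator rmse = new RMSRecommenderEvaluator();
    System.out.println("MAE:  " + mae.evaluate(builder, null, model, 0.9, 1.0));
    System.out.println("RMSE: " + rmse.evaluate(builder, null, model, 0.9, 1.0));
  }
}

Since each evaluation uses its own random split, expect the same run-to-run
variation you already see.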

This is quite a small data set, so you should have no performance issues.
Your evaluations, which run over all users in the data set, take mere
seconds. I am sure you could get away with much less memory and processing
if you like.


On Mon, Nov 21, 2011 at 11:06 AM, Manuel Blechschmidt <
[email protected]> wrote:

> Hello Mahout team, hello users,
> a friend and I are currently evaluating recommendation techniques for
> personalizing a newsletter for a company selling tea, spices and some other
> products. Mahout is a great product that saves me hours of time and a lot of
> money, so to give something back I am writing this small case study to the
> mailing list.
>
> I am conducting offline testing to find out which recommender is the most
> accurate. I am also interested in runtime behavior such as memory
> consumption and execution time.
>
> The data contains implicit feedback. A user's preference is the amount in
> grams that he bought of a certain product (453 g ~ 1 pound). If a product
> does not have this data, the value is replaced with 50. So basically I want
> Mahout to predict how much of a certain product a user will buy next. This
> is also helpful for demand planning. I am currently not using any time data
> because I did not find a recommender that uses it.
>
> Users: 12858
> Items: 5467
> Preferences: 121304
> MaxPreference: 85850.0 (Meaning that there is someone who ordered 85 kg of
> a certain tea or spice)
> MinPreference: 50.0
>
> Here are the raw benchmarks for accuracy in RMSE. They vary by about 15%
> from run to run of the evaluation:
>
> Evaluation of randomBased (baseline): 43045.380570443434
> (RandomRecommender(model)) (Time: ~0.3 s) (Memory: 16MB)
> Evaluation of ItemBased with Pearson Correlation: 315.5804958647985
> (GenericItemBasedRecommender(model, PearsonCorrelationSimilarity(model))
> (Time: ~1s)  (Memory: 35MB)
> Evaluation of ItemBased with uncentered Cosine: 198.25393235323375
> (GenericItemBasedRecommender(model, UncenteredCosineSimilarity(model)))
> (Time: ~1s)  (Memory: 32MB)
> Evaluation of ItemBased with log likelihood: 176.45243607278724
> (GenericItemBasedRecommender(model, LogLikelihoodSimilarity(model)))
>  (Time: ~5s)  (Memory: 42MB)
> Evaluation of UserBased 3 with Pearson Correlation: 1378.1188069379868
> (GenericUserBasedRecommender(model, NearestNUserNeighborhood(3,
> PearsonCorrelationSimilarity(model), model),
> PearsonCorrelationSimilarity(model)))  (Time: ~52s) (Memory: 57MB)
> Evaluation of UserBased 20 with Pearson Correlation: 1144.1905989614288
> (GenericUserBasedRecommender(model, NearestNUserNeighborhood(20,
> PearsonCorrelationSimilarity(model), model),
> PearsonCorrelationSimilarity(model)))  (Time: ~51s) (Memory: 57MB)
> Evaluation of SlopeOne: 464.8989330869532 (SlopeOneRecommender(model))
> (Time: ~4s) (Memory: 604MB)
> Evaluation of SVD based: 326.1050823499026 (ALSWRFactorizer(model, 100,
> 0.3, 5)) (Time: ) (Memory: 691MB)
>
> These were measured with the following method:
>
> RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
> double evaluation = evaluator.evaluate(randomBased, null, myModel, 0.9, 1.0);
>
> Memory usage was about 50 MB in the item-based case. Slope One and the
> SVD-based recommender seem to use the most memory (615 MB & 691 MB).
>
> The performance differs a lot. The fastest recommenders were the item-based
> ones; they took about 1 to 5 seconds (PearsonCorrelationSimilarity and
> UncenteredCosineSimilarity 1 s, LogLikelihoodSimilarity 5 s). The user-based
> ones were a lot slower.
>
> My conclusion is that in my case the item-based approach is the fastest,
> has the lowest memory consumption and is the most accurate. Furthermore, I
> can use the recommendedBecause function.
>
> Here is the spec of the computer:
> 2.3 GHz Intel Core i5 (4 cores), 1024 MB for the Java virtual machine.
>
> In the next step, probably within the next two months, I have to design a
> newsletter and send it to the customers. Then I can benchmark the user
> acceptance rate of the recommendations.
>
> Any suggestions for enhancements are appreciated. If anybody is interested
> in the dataset or the evaluation code, send me a private email. I might be
> able to convince the company to release the dataset if the person is doing
> some interesting research.
>
> /Manuel
> --
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
>
>
