Re: Scalability concerns with concurrent recommendations

Sebastian Schelter Wed, 28 Sep 2011 02:55:50 -0700

Maybe this article will help you:

http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/


Can you give some more details about your recommender setup?

Some general hints for scaling this out:

* precompute item similarities and keep only the top k similar items per
item (start with something like 50 and experiment with this number)

* make sure these item similarities are loaded into memory and that the
candidate item strategy of your recommender accesses them directly (e.g.
use MySQLJDBCInMemoryItemSimilarity together with
AllSimilarItemsCandidateItemsStrategy)

* cap the number of preferences used per user for recommendation, e.g.
the n latest interactions, and experiment with this number. Make sure
you either fetch those preferences from memory (if the data fits into
RAM) or use a setup similar to the one in my blog post, where you can
fetch all preferences for a user in a single database query
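
To make the first and third hints concrete, here is a rough sketch in plain
Java. It deliberately does not use the Mahout API: the class
RecommenderPruning, its nested types and the methods topK and latestN are
hypothetical names made up for illustration, and the timestamp field assumes
that you store the time of each interaction. It only shows the pruning idea
(keep the k most similar items per item, keep the n latest preferences per
user), not how Mahout implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: illustrates the two pruning steps.
public class RecommenderPruning {

    /** A precomputed similarity from some fixed item to another item. */
    public static final class SimilarItem {
        public final long itemId;
        public final double similarity;

        public SimilarItem(long itemId, double similarity) {
            this.itemId = itemId;
            this.similarity = similarity;
        }
    }

    /** One user interaction; the timestamp is an assumed field (epoch millis). */
    public static final class Preference {
        public final long itemId;
        public final long timestamp;

        public Preference(long itemId, long timestamp) {
            this.itemId = itemId;
            this.timestamp = timestamp;
        }
    }

    /** Returns at most k entries, ordered by descending similarity. */
    public static List<SimilarItem> topK(List<SimilarItem> candidates, int k) {
        List<SimilarItem> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.similarity, a.similarity));
        int limit = Math.max(0, Math.min(k, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    /** Returns at most n preferences, newest first. */
    public static List<Preference> latestN(List<Preference> prefs, int n) {
        List<Preference> sorted = new ArrayList<>(prefs);
        sorted.sort((a, b) -> Long.compare(b.timestamp, a.timestamp));
        int limit = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        List<SimilarItem> similarities = new ArrayList<>();
        similarities.add(new SimilarItem(1L, 0.2));
        similarities.add(new SimilarItem(2L, 0.9));
        similarities.add(new SimilarItem(3L, 0.5));

        // Keep only the 2 most similar items (the hint above suggests ~50).
        for (SimilarItem s : topK(similarities, 2)) {
            System.out.println("item " + s.itemId + " similarity " + s.similarity);
        }

        List<Preference> prefs = new ArrayList<>();
        prefs.add(new Preference(10L, 1000L));
        prefs.add(new Preference(11L, 3000L));
        prefs.add(new Preference(12L, 2000L));

        // Keep only the 2 most recent interactions of this user.
        for (Preference p : latestN(prefs, 2)) {
            System.out.println("item " + p.itemId + " at " + p.timestamp);
        }
    }
}
```

You would run topK once per item in an offline job and store the result;
latestN would be applied per request (or pushed into the database query) so
that each recommendation touches a bounded amount of data.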

--sebastian




On 28.09.2011 10:38, udachny wrote:
> I am building a large scale user-based recommender which could have up to 1
> billion preferences. I am planning to reduce it to 100M chunks, as
> recommended by Sean:
> http://lucene.472066.n3.nabble.com/Evaluating-Mahout-s-recommender-support-td2161876.html#a2167800
> 
> With this size, the application can be set up in a non-distributed mode. The
> recommender will be set up as a web servlet whose service will be consumed
> by web applications. That means there will be lots of concurrent
> recommendation requests. 
> 
> Thus my main concern is how well Mahout recommenders handle high volumes
> of concurrent recommendation requests. I have done some benchmarking with
> JMeter using the out-of-the-box Mahout GenericBooleanPrefUserBasedRecommender
> examples and am seeing the following trends:
> 
> Number of concurrent recommendations | Time per recommendation
> 10                                   | 150 ms
> 100                                  | 2000 ms
> 1000                                 | 14000 ms
> 
> As you can see, even with 100 concurrent users the time per recommendation
> is unacceptably high.
> 
> Has anyone done more benchmarks about concurrent recommendations?
> Can you post any architectural ideas about setting up scalable distributed
> recommenders that can handle high concurrency?
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Scalability-concerns-with-concurrent-recommendations-tp3375424p3375424.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
