Those times sound about right to me. Past about 10 concurrent requests you're bottlenecked on CPU time, so the response time scales up nearly linearly with the number of concurrent requests. You're just going to need to throw more machines at it: for 10x the volume, put up 10x the servers. That's actually the good news.
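
As a rough sanity check, here is a small Python sketch that turns the quoted benchmark figures into throughput using Little's law (requests in flight divided by time per request). It assumes the JMeter run kept the stated number of requests in flight and that the times are averages. The helper names `throughput` and `servers_needed` and the 500 requests/s target are made up for illustration and are not part of Mahout.

```python
import math

# Benchmark figures quoted from the original message:
# (concurrent recommendations, seconds per recommendation)
measurements = [(10, 0.150), (100, 2.0), (1000, 14.0)]


def throughput(concurrency, latency_s):
    """Little's law: completed requests per second = requests in flight / latency."""
    return concurrency / latency_s


def servers_needed(target_rps, per_server_rps):
    """Hypothetical sizing helper: assumes load spreads evenly and scales linearly."""
    return math.ceil(target_rps / per_server_rps)


for n, latency in measurements:
    print(f"{n:>5} concurrent: {throughput(n, latency):6.1f} recommendations/s")

# Illustrative only: an assumed target of 500 requests/s at about 50 requests/s per server.
print("servers for 500 req/s:", servers_needed(500, 50))
```

Under those assumptions the single machine stays in the same throughput range (roughly 50 to 70 recommendations per second) at all three load levels, which is what a CPU-bound server looks like: extra concurrent requests just wait longer, and only more machines raise the total.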

On Wed, Sep 28, 2011 at 9:38 AM, udachny <[email protected]> wrote:
> I am building a large scale user-based recommender which could have up to
> 1 billion preferences. I am planning to reduce it to the 100M chunks as
> recommended by Sean:
>
> http://lucene.472066.n3.nabble.com/Evaluating-Mahout-s-recommender-support-td2161876.html#a2167800
>
> With this size, the application can be set up in a non-distributed mode.
> The recommender will be set up as a web servlet whose service will be
> consumed by web applications. That means there will be lots of concurrent
> recommendation requests.
>
> Thus my main concern is how well do Mahout recommenders handle the volumes
> of concurrent recommendations. I have done some benchmarking with JMeter
> using out-of-the-box Mahout GenericBooleanPrefUserBasedRecommender examples
> and am seeing the following trends:
>
> Number of concurrent recommendations | time per recommendation
>                                   10 |   150 ms
>                                  100 |  2000 ms
>                                 1000 | 14000 ms
>
> As you can see, even with 100 concurrent users the time-per-recommendation
> is unacceptably slow.
>
> Has anyone done more benchmarks about concurrent recommendations?
> Can you post any architectural ideas about setting up scalable distributed
> recommenders that can handle high concurrency?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Scalability-concerns-with-concurrent-recommendations-tp3375424p3375424.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
