I am building a large-scale user-based recommender which could have up to 1 billion preferences. I am planning to reduce it to 100M-preference chunks, as recommended by Sean: http://lucene.472066.n3.nabble.com/Evaluating-Mahout-s-recommender-support-td2161876.html#a2167800
At this size, the application can be set up in non-distributed mode. The recommender will run as a web servlet whose service will be consumed by web applications, so there will be many concurrent recommendation requests. My main concern is therefore how well Mahout recommenders handle a high volume of concurrent recommendations.

I have done some benchmarking with JMeter using the out-of-the-box Mahout GenericBooleanPrefUserBasedRecommender examples and am seeing the following trend:

Concurrent recommendations | Time per recommendation
10                         | 150 ms
100                        | 2000 ms
1000                       | 14000 ms

As you can see, even with 100 concurrent users the time per recommendation is unacceptably slow.

Has anyone done more benchmarks on concurrent recommendations? Can you post any architectural ideas for setting up scalable distributed recommenders that can handle high concurrency?
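In case anyone wants to reproduce this kind of measurement without JMeter, here is a minimal sketch in plain Java. It is not the JMeter test plan that produced the numbers above. The class name ConcurrencyBench, the method averageLatencyMillis, and the fake recommend call (a 5 ms sleep) are all stand-ins I made up for illustration; in a real test the Callable would invoke the recommender's recommend method instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrencyBench {

    // Runs `task` once on each of `concurrency` threads, releasing them all
    // at the same moment, and returns the mean time per call in milliseconds.
    static double averageLatencyMillis(int concurrency, Callable<?> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Long>> timings = new ArrayList<>();
        try {
            for (int i = 0; i < concurrency; i++) {
                timings.add(pool.submit(() -> {
                    start.await();                    // wait until every worker is queued
                    long t0 = System.nanoTime();
                    task.call();                      // the request being measured
                    return System.nanoTime() - t0;
                }));
            }
            start.countDown();                        // release all workers together
            long totalNanos = 0;
            for (Future<Long> f : timings) {
                totalNanos += f.get();
            }
            return totalNanos / 1e6 / concurrency;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a recommendation request (assumption, not Mahout code).
        Callable<Object> fakeRecommend = () -> { Thread.sleep(5); return null; };
        for (int n : new int[] {10, 100}) {
            System.out.printf("%d concurrent: %.1f ms per call%n",
                    n, averageLatencyMillis(n, fakeRecommend));
        }
    }
}
```

The latch is there so that all requests hit the task at the same time, which is the situation I am worried about. Note that this measures only the time spent inside the call itself, not servlet or network overhead.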
