Mahout Similarity Caching

Gabor Bernat Mon, 22 Apr 2013 15:22:56 -0700

Hello,

I'm using Mahout in a system where the typical response time should be
below 100 ms. I'm using an item-based recommender with float preference
values (with Tanimoto similarity for now, which is passed into a
CachingItemSimilarity object for performance reasons). My model has around
7k items and 26k users, with around 100k preferences linking them.


Instead of performing a recommendation, I only need to estimate preferences
of the user for around 3-4k items (this is important, as this allows the
integration of a business rule engine in the recommendation process inside
the system where I'm working).

Now my problem is that for users with lots of preferences (200+) this
estimation process takes forever (49+ seconds). I'm assuming the issue lies
in the calculation of the similarity measurements, so I thought I would do
this asynchronously in a training-like process, save the result, and at
start-up just load this precomputed information into memory. However, I
cannot see any way to load this information into the CachingItemSimilarity
object, nor can I persist the CachingItemSimilarity object and load it.
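
To make the precompute-and-load idea concrete, here is a minimal sketch in
plain Java. It does not use Mahout's API: the class PrecomputedSimilarities
and all of its methods are hypothetical names made up for illustration, item
and user IDs are assumed to fit in an int, and the binary file layout is an
arbitrary choice. It computes pairwise Tanimoto similarities offline, writes
the non-zero ones to a file, and reads them back into a map at start-up.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// Hypothetical helper, NOT part of Mahout: precompute item-item Tanimoto
// similarities offline, save them, and load them into memory at start-up.
final class PrecomputedSimilarities {

    // Pack an unordered pair of item IDs into one map key
    // (assumption: IDs fit in 32 bits).
    static long key(int a, int b) {
        int lo = Math.min(a, b), hi = Math.max(a, b);
        return ((long) lo << 32) | (hi & 0xFFFFFFFFL);
    }

    // Tanimoto coefficient |A ∩ B| / |A ∪ B| over the sets of users
    // that expressed a preference for each item.
    static double tanimoto(Set<Integer> usersA, Set<Integer> usersB) {
        if (usersA.isEmpty() && usersB.isEmpty()) return 0.0;
        Set<Integer> small = usersA.size() <= usersB.size() ? usersA : usersB;
        Set<Integer> large = small == usersA ? usersB : usersA;
        int common = 0;
        for (Integer u : small) if (large.contains(u)) common++;
        return (double) common / (usersA.size() + usersB.size() - common);
    }

    // Offline step: compute every pairwise similarity, keep the non-zero ones.
    static Map<Long, Double> computeAll(Map<Integer, Set<Integer>> usersByItem) {
        List<Integer> items = new ArrayList<>(usersByItem.keySet());
        Map<Long, Double> sims = new HashMap<>();
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                int a = items.get(i), b = items.get(j);
                double s = tanimoto(usersByItem.get(a), usersByItem.get(b));
                if (s > 0.0) sims.put(key(a, b), s);
            }
        }
        return sims;
    }

    // Persist as: entry count, then (key, similarity) pairs.
    static void save(Map<Long, Double> sims, Path file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(sims.size());
            for (Map.Entry<Long, Double> e : sims.entrySet()) {
                out.writeLong(e.getKey());
                out.writeDouble(e.getValue());
            }
        }
    }

    // Start-up step: read the file written by save() back into a map.
    static Map<Long, Double> load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int n = in.readInt();
            Map<Long, Double> sims = new HashMap<>(Math.max(16, n * 2));
            for (int i = 0; i < n; i++) {
                long k = in.readLong();
                double s = in.readDouble();
                sims.put(k, s);
            }
            return sims;
        }
    }

    // Symmetric lookup; pairs that were not stored count as similarity 0.
    static double lookup(Map<Long, Double> sims, int a, int b) {
        if (a == b) return 1.0;
        return sims.getOrDefault(key(a, b), 0.0);
    }
}
```

With roughly 7k items there are about 24.5 million item pairs, but only the
pairs sharing at least one user are stored, so the in-memory map should stay
far smaller than that for a model with about 100k preferences.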

So, any ideas on how to cut down the estimation times?

Thanks,

Bernát GÁBOR
