Re: Mahout Similarity Caching

Sean Owen Mon, 22 Apr 2013 15:53:56 -0700

49 seconds is orders of magnitude too long -- something is very wrong
here, for so little data. Are you running this off a database? Or are
you somehow counting the overhead of 3-4K network calls?


On Mon, Apr 22, 2013 at 11:22 PM, Gabor Bernat <ber...@primeranks.net> wrote:
> Hello,
>
> I'm using Mahout in a system where the typical response time should be
> below 100ms. I'm using an item-based recommender with float preference
> values (with Tanimoto similarity for now, which is passed into a
> CachingItemSimilarity object for performance reasons). My model has around
> 7k items and 26k users, with around 100k preferences linking them.
>
> Instead of performing a recommendation, I only need to estimate preferences
> of the user for around 3-4k items (this is important, as this allows the
> integration of a business rule engine in the recommendation process inside
> the system where I'm working).
>
> Now my problem is that for users with lots of preferences (200+) this
> estimation process takes forever (49+ seconds). I'm assuming the issue lies
> in the calculation of the similarity measurements, so I thought I'd do
> this asynchronously in a training-like process, save it, and at startup just
> load this precomputed information into memory. However, I cannot see any
> way to load this information into the CachingSimilarity object, nor can I
> persist the CachingSimilarity object and load it.
>
> So, any ideas on how to cut down the estimation times?
>
> Thanks,
>
> Bernát GÁBOR
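
For illustration only, here is a minimal sketch of the precomputation idea
described in the quoted message: compute every item-item Tanimoto
coefficient once, up front, and answer later lookups from an in-memory map.
It uses only the JDK and is not Mahout code; the class name
PrecomputedTanimoto, its methods, and the item-to-users input map are all
hypothetical names invented for this example. Within Mahout itself,
GenericItemSimilarity is, as far as I know, the class intended for holding
precomputed item-item similarities in memory, but check the javadoc for
your version before relying on that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of Mahout): precomputed item-item Tanimoto similarities. */
public class PrecomputedTanimoto {

    // Packed (lower item ID, higher item ID) -> similarity; zero similarities are not stored.
    private final Map<Long, Double> cache = new HashMap<>();

    /** itemUsers maps each item ID to the set of user IDs with a preference for that item. */
    public PrecomputedTanimoto(Map<Integer, Set<Integer>> itemUsers) {
        List<Integer> items = new ArrayList<>(itemUsers.keySet());
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                int a = items.get(i);
                int b = items.get(j);
                double sim = tanimoto(itemUsers.get(a), itemUsers.get(b));
                if (sim > 0.0) {
                    cache.put(key(a, b), sim);
                }
            }
        }
    }

    /** Tanimoto coefficient: |A intersect B| / |A union B|, defined here as 0 for two empty sets. */
    static double tanimoto(Set<Integer> a, Set<Integer> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        // Iterate over the smaller set and probe the larger one.
        Set<Integer> small = a.size() <= b.size() ? a : b;
        Set<Integer> large = (small == a) ? b : a;
        int intersection = 0;
        for (Integer user : small) {
            if (large.contains(user)) {
                intersection++;
            }
        }
        int union = a.size() + b.size() - intersection;
        return (double) intersection / union;
    }

    // Order-independent key, so similarity(a, b) == similarity(b, a).
    private static long key(int a, int b) {
        int lo = Math.min(a, b);
        int hi = Math.max(a, b);
        return ((long) lo << 32) | (hi & 0xFFFFFFFFL);
    }

    /** Lookup only; no similarity is computed at request time. */
    public double similarity(int a, int b) {
        if (a == b) {
            return 1.0;
        }
        return cache.getOrDefault(key(a, b), 0.0);
    }
}
```

With around 7k items this loop visits roughly 24.5 million pairs, so it
belongs in an offline or startup step rather than in the request path.
Persisting the result would amount to writing out the (item, item, value)
triples and reading them back at startup; that part is left out here.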
