Hello Daniel,

On 01.12.2011, at 16:06, Daniel Zohar wrote:

> Hi Manuel,
> I haven't got to the point where CacheItemSimilarity kicks in. That is, I
> will have to run a lot of recommendations in order to get a real benefit
> from it. I would first like to optimize the 'cold start' so that it at least
> serves in a reasonable time. Usually a cache is used to prevent repeated
> calculations, but personally I don't think it's a replacement for optimized
> performance. Don't you agree?

I agree, but making recommendations work in real time is not an engineering
problem; it is an academic one. Mahout is already implemented in an
optimized manner: it avoids the Java Collections Framework in favor of its
own collections (such as FastIDSet) and uses the Colt library for math
calculations.

What you are currently doing is tuning different parameters of the recommender
to find the best trade-off between:
- accuracy
- space
- time

It would be great if you could just specify your requirements, for example 90%
of recommendations in less than 0.5 s with 512 MB of RAM, and have the system
automatically adjust the different tuning parameters to get the best accuracy
within this setup.
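
To make the measurable half of that idea concrete, here is a minimal sketch
in plain Java. It is not part of Mahout, and the class and method names are
hypothetical. It only checks whether a given share of measured request times
stays under a limit, using a nearest-rank percentile; memory use and the
automatic tuning itself are left out.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Mahout class): checks latencies against a target. */
final class LatencyBudget {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0 || p <= 0.0 || p > 100.0) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = latenciesMillis.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);  // 1-based rank
        return sorted[rank - 1];
    }

    /** True if the p-th percentile latency is strictly below the limit. */
    static boolean meetsTarget(long[] latenciesMillis, double p, long limitMillis) {
        return percentile(latenciesMillis, p) < limitMillis;
    }

    public static void main(String[] args) {
        // Made-up timings in milliseconds, for illustration only.
        long[] samples = {120, 180, 210, 250, 300, 320, 350, 400, 450, 900};
        System.out.println("p90 = " + percentile(samples, 90.0) + " ms");
        System.out.println("90% under 500 ms: " + meetsTarget(samples, 90.0, 500));
    }
}
```

Feeding it the per-request timings from a test run would show whether a
given parameter setting stays inside the time budget, so that different
settings can be compared on accuracy afterwards.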

/Manuel

> 
> Also, I will try to profile the app now as you suggest and send the results
> asap.
> 
> Thanks!
> 
> On Thu, Dec 1, 2011 at 4:56 PM, Manuel Blechschmidt <
> [email protected]> wrote:
> 
>> Hi Daniel,
>> actually you are running the profiler inside Tomcat. You should take a
>> snapshot and then drill down to the functions where the actual
>> recommendation takes place. The current screenshots also contain
>> profiles of Tomcat threads, which sleep a lot and therefore appear to
>> take a lot of time.
>> 
>> Furthermore, the screenshots do not show how often the different
>> functions are called.
>> 
>> You have to profile multiple requests in isolation. The CacheItemSimilarity
>> gets filled over time, so requests should get faster and faster.
>> 
>> On 01.12.2011, at 15:11, Daniel Zohar wrote:
>> 
>>> @Manuel thanks for the tips. I have installed VisualVM and the following
>>> are the results.
>>> I did two sampling runs:
>>> - With the optimized SamplingCandidateItemsStrategy (
>>> http://pastebin.com/6n9C8Pw1): http://static.inky.ws/image/934/image.jpg
>>> - Without the optimized SamplingCandidateItemsStrategy:
>>> http://static.inky.ws/image/935/image.jpg
>>> 
>> 
>> The big hot spot is the function FastIDSet.find():
>> 
>> Optimized: 13,759 s
>> Unoptimized: 246,487 s
>> 
>> So you see that your optimization already made this hot spot roughly
>> 18 times faster (246,487 / 13,759 is about 17.9).
>> 
>> Did you play around with the CacheItemSimilarity cache sizes?
>> 
>> /Manuel
>> 

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
