Hello Daniel,

On 01.12.2011, at 16:06, Daniel Zohar wrote:
> Hi Manuel,
> I haven't got to the point where CacheItemSimilarity kicks in. That is, I
> will have to run a lot of recommendations in order to get a real benefit
> from it. I would first like to optimize the 'cold start' so it at least
> serves in reasonable time. Usually a cache is used to prevent repeated
> calculations, but personally I don't think it's a replacement for optimized
> performance. Don't you agree?

I agree, but making recommendations work in real time is not an engineering
problem. It is an academic problem.

Mahout is already implemented in an optimized manner. It does not use the
Java Collections Framework, and it uses the Colt library for math
calculations.

What you are currently doing is tuning different parameters of the
recommender to get the best fit between:
- accuracy
- space
- time

It would be great if you could just specify your requirements, for example
"90% of recommendations in less than 0.5 s and 512 MB of RAM", and the system
automatically adjusted the different tuning parameters to get the best
accuracy within this setup.

/Manuel

>
> Also, I will try to profile the app now as you suggest and send the results
> asap.
>
> Thanks!
>
> On Thu, Dec 1, 2011 at 4:56 PM, Manuel Blechschmidt <
> [email protected]> wrote:
>
>> Hi Daniel,
>> actually you are running the profiler inside Tomcat. You should take a
>> snapshot and then drill down to the functions where the actual
>> recommendation takes place. The current screenshots also contain some
>> profiles from Tomcat threads which are sleeping a lot and therefore taking
>> a lot of time.
>>
>> Furthermore, the screenshots do not show how often the different
>> functions are called.
>>
>> You have to profile multiple requests in isolation. The CacheItemSimilarity
>> gets filled, therefore it should go faster and faster.
>>
>> On 01.12.2011, at 15:11, Daniel Zohar wrote:
>>
>>> @Manuel thanks for the tips. I have installed VisualVM and the results
>>> follow. I did two samplings:
>>> - With the optimized SamplingCandidateItemsStrategy
>>> (http://pastebin.com/6n9C8Pw1): http://static.inky.ws/image/934/image.jpg
>>> - Without the optimized SamplingCandidateItemsStrategy:
>>> http://static.inky.ws/image/935/image.jpg
>>>
>>
>> The big hot spot is the function FastIDSet.find():
>>
>> Optimized: 13,759 s
>> Unoptimized: 246,487 s
>>
>> So you see that your optimization already made this hot spot roughly
>> 18 times faster.
>>
>> Did you play around with the CacheItemSimilarity cache sizes?
>>
>> /Manuel
>>
>> --
>> Manuel Blechschmidt
>> Dortustr. 57
>> 14467 Potsdam
>> Mobil: 0173/6322621
>> Twitter: http://twitter.com/Manuel_B
>>
>>

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
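
PS: To make the "90% of recommendations in less than 0.5 s" requirement concrete, here is a minimal sketch of how one might check it for a given parameter setting. It is plain Java and not part of Mahout; the class LatencyBudget and its methods are hypothetical names made up for this illustration, and the workload in main() is only a stand-in for a real recommendation request.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Mahout class): checks a latency requirement. */
final class LatencyBudget {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * 'percent' percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samples, int percent) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percent * n / 100, so no floating-point rounding.
        int rank = (int) (((long) percent * sorted.length + 99) / 100);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True if 'percent' percent of the samples finished within the budget. */
    static boolean meetsBudget(long[] samplesNanos, int percent, long budgetNanos) {
        return percentile(samplesNanos, percent) <= budgetNanos;
    }

    /** Runs the request 'runs' times and returns each elapsed time in nanoseconds. */
    static long[] measure(Runnable request, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            request.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption); in a real test this would be one
        // recommendation request against your recommender.
        Runnable request = () -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        long[] elapsed = measure(request, 200);
        long budgetNanos = 500_000_000L; // 0.5 s
        System.out.println("p90 = " + percentile(elapsed, 90) / 1_000_000.0 + " ms");
        System.out.println("meets budget: " + meetsBudget(elapsed, 90, budgetNanos));
    }
}
```

You would run this once per candidate parameter setting (cache sizes, sampling limits) and keep the most accurate setting that still meets the budget. The memory side of the requirement can be enforced by starting the JVM with -Xmx512m.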
