That's way too long. :) I haven't looked at your implementation, but is it in-memory? If not, it's never going to be fast. A single recommendation request will generate thousands of hits to the NoSQL store, and no amount of tuning makes that many round trips quick. The data model has to act like a cache in front of the store.
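To make that concrete, here's a minimal sketch of the difference between hitting the store per lookup and loading everything into memory once. It doesn't use Mahout or the AWS SDK at all: `PreferenceStore`, `CountingStore`, `InMemorySnapshot` and `usersWhoRated` are made-up names for illustration, and the "remote" store is a fake that only counts round trips. The loop is a toy stand-in for a recommender's access pattern, not what KnnItemBasedRecommender actually does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a remote store such as DynamoDB; not a real client API. */
interface PreferenceStore {
    /** Returns itemID -> preference value for one user. */
    Map<Long, Float> preferencesFor(long userID);

    /** Returns every user's preferences in one bulk read. */
    Map<Long, Map<Long, Float>> scanAll();
}

/** Fake remote store: counts round trips instead of doing network I/O. */
final class CountingStore implements PreferenceStore {
    final AtomicLong roundTrips = new AtomicLong();
    private final Map<Long, Map<Long, Float>> rows = new HashMap<>();

    CountingStore(int users, int items) {
        // Fill with arbitrary synthetic preferences.
        for (long u = 0; u < users; u++) {
            Map<Long, Float> prefs = new HashMap<>();
            for (long i = 0; i < items; i++) {
                if ((u + i) % 3 == 0) {
                    prefs.put(i, 1.0f + (u * 31 + i) % 5);
                }
            }
            rows.put(u, prefs);
        }
    }

    @Override
    public Map<Long, Float> preferencesFor(long userID) {
        roundTrips.incrementAndGet(); // each call would be a network request
        return rows.getOrDefault(userID, Collections.<Long, Float>emptyMap());
    }

    @Override
    public Map<Long, Map<Long, Float>> scanAll() {
        roundTrips.incrementAndGet(); // counted as a single bulk request
        return rows;
    }
}

/** Loads everything once, then answers all reads from local memory. */
final class InMemorySnapshot implements PreferenceStore {
    private final Map<Long, Map<Long, Float>> data = new HashMap<>();

    InMemorySnapshot(PreferenceStore remote) {
        for (Map.Entry<Long, Map<Long, Float>> e : remote.scanAll().entrySet()) {
            data.put(e.getKey(), new HashMap<>(e.getValue()));
        }
    }

    @Override
    public Map<Long, Float> preferencesFor(long userID) {
        return data.getOrDefault(userID, Collections.<Long, Float>emptyMap());
    }

    @Override
    public Map<Long, Map<Long, Float>> scanAll() {
        return data;
    }
}

final class SnapshotDemo {
    /** Toy access pattern: reads every user's preferences once per request. */
    static int usersWhoRated(PreferenceStore store, int users, long itemID) {
        int count = 0;
        for (long u = 0; u < users; u++) {
            if (store.preferencesFor(u).containsKey(itemID)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        CountingStore remote = new CountingStore(1000, 50);
        usersWhoRated(remote, 1000, 7L);
        System.out.println("direct round trips:   " + remote.roundTrips.get());

        remote.roundTrips.set(0);
        InMemorySnapshot snapshot = new InMemorySnapshot(remote);
        usersWhoRated(snapshot, 1000, 7L);
        System.out.println("snapshot round trips: " + remote.roundTrips.get());
    }
}
```

With 1000 users, the direct version makes one round trip per user for a single request, while the snapshot pays for one bulk load and nothing after that. In Mahout terms the rough equivalent would be reading the table once into an in-memory model such as GenericDataModel and refreshing it periodically, though I'm assuming here that your data fits in memory.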
These algos are generally pretty intensive in random data access. That's why parallelizing them is hard but, when done well, can be very handy. I don't think there's anything so special about knn in this regard.

Sean

On Sun, Feb 26, 2012 at 4:29 PM, Nick Jordan <[email protected]> wrote:
> I've continued working on this. Everything appears to return correctly,
> but in doing some debugging by using it in my own application I'm seeing
> some performance issues.
>
> Specifically when I run it as the data model as part of
> a KnnItemBasedRecommender the results are taking on the order of hours for
> a single recommendation to come back. I've looked at the Caching to see if
> I could the problem there (and have even primed the cache with every
> user/item) and the performance is still atrocious.
>
> I had originally modeled this after the CassandraDataModel and it doesn't
> seem that once the cache is primed that this has anything to do with
> accessing the data in DynamoDB. Are KnnItemBasedRecommenders generally
> slow for something like this? I used to run this off of a flat file and
> never had performance problems.
>
> Thanks.
>
> Nick
>
> On Thu, Feb 9, 2012 at 9:05 AM, Sean Owen (Commented) (JIRA) <
> [email protected]> wrote:
>
>>
>> [
>> https://issues.apache.org/jira/browse/MAHOUT-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204538#comment-13204538]
>>
>> Sean Owen commented on MAHOUT-972:
>> ----------------------------------
>>
>> Ok, good start. This will go in integration/ and it will need to refer to
>> Amazon libs in pom.xml. When done you'll want to add copyright headers and
>> standardize the format and all that, but that's a detail. Ping when you've
>> got something you feel is committable.
>>
>> > Implement Taste DynamoDBDataModel
>> > ---------------------------------
>> >
>> > Key: MAHOUT-972
>> > URL: https://issues.apache.org/jira/browse/MAHOUT-972
>> > Project: Mahout
>> > Issue Type: Improvement
>> > Components: Collaborative Filtering
>> > Affects Versions: 0.6
>> > Reporter: Nick Jordan
>> > Priority: Minor
>> > Labels: datamodel
>> > Attachments: DynamoDBDataModel.java
>> >
>> > Original Estimate: 504h
>> > Remaining Estimate: 504h
>> >
>> > Implement Amazon's DynamoDB as a data model to be used for collaborative
>> filtering Taste models.
>> > I've actually begun work on this, but have never submitted to an ASF
>> project before. I'll submit the patch when I've done enough testing that I
>> think it is ready. If anyone has any hints/tips that will make the
>> patch/submission process easier I'd be happy to hear them.
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA
>> administrators:
>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
