That's way too long. :) I haven't looked at your implementation, but
is it in-memory? If not, it's never going to be fast. A single
recommendation request will generate thousands of hits to the NoSQL
store, and that's just not going to be fast. It has to act like a
cache.
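
To make the round-trip point concrete, here is a rough sketch. It is not
Mahout or DynamoDB code: PreferenceStore, CountingStore, CachedStore and
InMemoryCacheSketch are made-up names, and the "remote store" is just a
map that counts how often it is hit. It only illustrates why a
read-through, in-memory layer matters when one request does thousands of
lookups.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a remote key-value store (e.g. a NoSQL table). */
interface PreferenceStore {
    /** Returns itemID -> rating for one user. In a real store each call is a network round trip. */
    Map<Long, Float> preferencesForUser(long userID);
}

/** Fake remote store that counts round trips so the effect of caching is visible. */
final class CountingStore implements PreferenceStore {
    private final Map<Long, Map<Long, Float>> rows;
    final AtomicLong roundTrips = new AtomicLong();

    CountingStore(Map<Long, Map<Long, Float>> rows) {
        this.rows = rows;
    }

    @Override
    public Map<Long, Float> preferencesForUser(long userID) {
        roundTrips.incrementAndGet();
        return rows.getOrDefault(userID, Collections.emptyMap());
    }
}

/** Read-through cache: each user's row is fetched from the backing store at most once. */
final class CachedStore implements PreferenceStore {
    private final PreferenceStore backing;
    private final Map<Long, Map<Long, Float>> cache = new HashMap<>();

    CachedStore(PreferenceStore backing) {
        this.backing = backing;
    }

    @Override
    public Map<Long, Float> preferencesForUser(long userID) {
        return cache.computeIfAbsent(userID, backing::preferencesForUser);
    }
}

final class InMemoryCacheSketch {
    /** Simulates one request that reads `lookups` user rows, cycling over `users` distinct IDs. */
    static long roundTripsFor(PreferenceStore store, CountingStore counter, int lookups, int users) {
        long before = counter.roundTrips.get();
        for (int i = 0; i < lookups; i++) {
            store.preferencesForUser(i % users);
        }
        return counter.roundTrips.get() - before;
    }

    public static void main(String[] args) {
        // Build 100 users with one preference each (made-up data).
        Map<Long, Map<Long, Float>> rows = new HashMap<>();
        for (long u = 0; u < 100; u++) {
            Map<Long, Float> prefs = new HashMap<>();
            prefs.put(u % 10, 4.0f);
            rows.put(u, prefs);
        }
        CountingStore remote = new CountingStore(rows);
        // Every lookup goes to the store: prints "uncached: 5000"
        System.out.println("uncached: " + roundTripsFor(remote, remote, 5000, 100));
        // Each distinct user is fetched once: prints "cached:   100"
        System.out.println("cached:   " + roundTripsFor(new CachedStore(remote), remote, 5000, 100));
    }
}
```

The sketch ignores eviction, refresh and thread safety, all of which a
real data model would need to think about.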

These algos are generally pretty intensive in random data access.
That's why parallelizing them is hard, but when done well it can be
very handy.

I don't think there's anything so special about knn in this regard.

Sean

On Sun, Feb 26, 2012 at 4:29 PM, Nick Jordan <[email protected]> wrote:
> I've continued working on this.  Everything appears to return correctly,
> but in doing some debugging by using it in my own application I'm seeing
> some performance issues.
>
> Specifically when I run it as the data model as part of
> a KnnItemBasedRecommender the results are taking on the order of hours for
> a single recommendation to come back.  I've looked at the caching to see if
> I could find the problem there (and have even primed the cache with every
> user/item) and the performance is still atrocious.
>
> I had originally modeled this after the CassandraDataModel and it doesn't
> seem that once the cache is primed that this has anything to do with
> accessing the data in DynamoDB.  Are KnnItemBasedRecommenders generally
> slow for something like this?  I used to run this off of a flat file and
> never had performance problems.
>
> Thanks.
>
> Nick
>
> On Thu, Feb 9, 2012 at 9:05 AM, Sean Owen (Commented) (JIRA) <
> [email protected]> wrote:
>
>>
>>    [
>> https://issues.apache.org/jira/browse/MAHOUT-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204538#comment-13204538]
>>
>> Sean Owen commented on MAHOUT-972:
>> ----------------------------------
>>
>> Ok, good start. This will go in integration/ and it will need to refer to
>> Amazon libs in pom.xml. When done you'll want to add copyright headers and
>> standardize the format and all that, but that's a detail. Ping when you've
>> got something you feel is committable.
>>
>> > Implement Taste DynamoDBDataModel
>> > ---------------------------------
>> >
>> >                 Key: MAHOUT-972
>> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-972
>> >             Project: Mahout
>> >          Issue Type: Improvement
>> >          Components: Collaborative Filtering
>> >    Affects Versions: 0.6
>> >            Reporter: Nick Jordan
>> >            Priority: Minor
>> >              Labels: datamodel
>> >         Attachments: DynamoDBDataModel.java
>> >
>> >   Original Estimate: 504h
>> >  Remaining Estimate: 504h
>> >
>> > Implement Amazon's DynamoDB as a data model to be used for collaborative
>> filtering Taste models.
>> > I've actually begun work on this, but have never submitted to an ASF
>> project before.  I'll submit the patch when I've done enough testing that I
>> think it is ready.  If anyone has any hints/tips that will make the
>> patch/submission process easier I'd be happy to hear them.
>>
