Is there a job that can do LLR-based trimming?

On Mon, Oct 3, 2011 at 7:36 AM, Ted Dunning (Commented) (JIRA) <[email protected]> wrote:
> [
> https://issues.apache.org/jira/browse/MAHOUT-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119335#comment-13119335]
>
> Ted Dunning commented on MAHOUT-824:
> ------------------------------------
>
> For any method that doesn't have good regularization, trimming helps avoid
> over-training. Slope-one and all of the correlation methods have zero
> regularization and are seriously susceptible to coincidence. LLR trimming
> is kind of the simplest level of regularization. Methods like latent factor
> log-linear have serious and real regularization and probably don't need
> trimming.
>
> > FastByIDRunningAverage: Optimize SlopeOneRecommender by optimizing MemoryDiffStorage
> > ------------------------------------------------------------------------------------
> >
> >                 Key: MAHOUT-824
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-824
> >             Project: Mahout
> >          Issue Type: Improvement
> >            Reporter: Lance Norskog
> >            Assignee: Sean Owen
> >            Priority: Trivial
> >             Fix For: 0.6
> >
> >         Attachments: MAHOUT-824.patch, MAHOUT-824.short.patch
> >
> >
> > The SlopeOneRecommender has by far the best RMS of all of the online
> > recommenders in Mahout (that I've found). Unfortunately the implementation
> > also uses much more memory and is unusable on my laptop.
> > This patch optimizes memory (and speed) by folding
> > FastByIDMap<RunningAverage> into one class: FastByIDRunningAverage. This is
> > what it sounds like: a Long-addressable array of running averages (and
> > optionally standard deviation).

--
Lance Norskog
[email protected]
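
For reference, here is a minimal sketch of what LLR-based trimming could look like, assuming each item pair comes with a 2x2 co-occurrence table (k11 = both items seen together, k12 and k21 = one seen without the other, k22 = neither). The function names, the example counts, and the threshold are hypothetical and chosen only for illustration; this is not the code of any existing Mahout job.

```python
from math import log


def x_log_x(x):
    """Return x * ln(x), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) score for a 2x2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Rounding can push a true zero slightly negative, so clamp at 0.
    return max(0.0, 2.0 * (row + col - mat))


def trim(pair_counts, threshold):
    """Keep only the pairs whose LLR score reaches the threshold."""
    return {pair: c for pair, c in pair_counts.items() if llr(*c) >= threshold}


# Hypothetical counts: ("a", "b") co-occur far more often than chance,
# while ("a", "c") co-occur exactly as often as independence predicts.
pair_counts = {
    ("a", "b"): (100, 5, 5, 1000),
    ("a", "c"): (10, 10, 10, 10),
}
kept = trim(pair_counts, threshold=10.0)  # threshold is an arbitrary choice
print(sorted(kept))
```

Pairs that survive the cut would then be the only ones fed to the diff or correlation computation, which is the sense in which trimming acts as a crude regularizer against coincidental co-occurrence.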
