+1 for changing the algorithms. The safest and least disruptive course would be to simply leave the issue as it is (or, better, mark it 'Won't fix') and change the algorithms.
I haven't taken a look at the MDCA, so I can't comment. I will take a look at the Dirichlet Process implementation that Ted attached as soon as I find some time.

-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 16, 2008 1:27 AM
To: [email protected]
Subject: Re: [jira] Commented: (MAHOUT-31) Implementation of PLSI that uses EM

We should consider changing algorithms. MDCA is a good candidate. So would be nested Dirichlet processes. Neither of these is necessarily all that much more difficult to implement than PLSI and both should give better results.

On 4/15/08 12:52 PM, "Grant Ingersoll (JIRA)" <[EMAIL PROTECTED]> wrote:

>
> [
> https://issues.apache.org/jira/browse/MAHOUT-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589206#action_12589206
> ]
>
> Grant Ingersoll commented on MAHOUT-31:
> ---------------------------------------
>
> My bad, I thought there was a patch here. I just want to avoid the
> case of someone who has knowledge that they think they are infringing
> and still puts up a patch.
>
> So, in that case, I am fine if someone other than Ankur takes it up
> (or who works with Ankur, I think). I just am a bit paranoid since we
> are so early stage, I don't want anything to derail the positive
> momentum we have going here.
>
>> Implementation of PLSI that uses EM
>> -----------------------------------
>>
>> Key: MAHOUT-31
>> URL: https://issues.apache.org/jira/browse/MAHOUT-31
>> Project: Mahout
>> Issue Type: New Feature
>> Reporter: Isabel Drost
>>
>> This should implement the proposal in the original Google Paper on
>> PLSI in news retrieval.
