[ 
https://issues.apache.org/jira/browse/MAHOUT-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659282#action_12659282
 ] 

Isabel Drost commented on MAHOUT-30:
------------------------------------

> Regarding the question of whether something should be called a Distribution 
> or a Sampler, the mathematical terminology is that a distribution is 
> something you can sample, so the Distribution terminology would be most 
> compatible that way. The fact that only one method is currently 
> defined is likely a temporary thing ... other methods could well be required 
> for later efforts.

I understand, and I have no strong objections. I think a short class 
comment in DirichletDistribution would already help to avoid at least my 
confusion. (Although not trying to understand code at 2 a.m. local time might 
help as well... ;) )
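
To illustrate the kind of class comment meant here, a rough sketch follows. 
It is not the code from the MAHOUT-30 patches: the Distribution interface, 
the constructor signature, and the gamma-based sampling (Marsaglia-Tsang) are 
all assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical interface: a distribution is something you can sample from. */
interface Distribution<T> {
    T sample();
}

/**
 * A Dirichlet distribution over probability vectors of fixed length.
 *
 * It is called a Distribution rather than a Sampler because, in mathematical
 * terms, a distribution is the object one draws samples from. Only sample()
 * is defined for now; further methods may be added later.
 *
 * Illustrative sketch only, not the MAHOUT-30 implementation.
 */
final class DirichletDistribution implements Distribution<double[]> {
    private final double[] alpha; // concentration parameters, all > 0
    private final Random rng;

    DirichletDistribution(double[] alpha, Random rng) {
        for (double a : alpha) {
            if (!(a > 0.0)) {
                throw new IllegalArgumentException("alpha must be > 0");
            }
        }
        this.alpha = alpha.clone();
        this.rng = rng;
    }

    /** Draws Gamma(alpha_i, 1) variates and normalizes them to sum to 1. */
    @Override
    public double[] sample() {
        double[] p = new double[alpha.length];
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            p[i] = sampleGamma(alpha[i]);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    /** Gamma(shape, 1) variate via the Marsaglia-Tsang rejection method. */
    private double sampleGamma(double shape) {
        if (shape < 1.0) {
            // Boost: Gamma(a) = Gamma(a + 1) * U^(1/a) for 0 < a < 1.
            return sampleGamma(shape + 1.0)
                    * Math.pow(rng.nextDouble(), 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rng.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) {
                continue; // proposal outside the support, try again
            }
            v = v * v * v;
            double u = rng.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) {
                return d * v;
            }
        }
    }
}
```

With such a comment in place, a reader would see immediately that calling 
sample() on new DirichletDistribution(new double[] {1.0, 2.0, 3.0}, new 
Random()) yields one probability vector of length three.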

> I was thinking of moving the display code into the examples directory.

Sounds like a great idea to me.

> I did that so Ted could use his favorite library but he has not been pursuing 
> it. I'm happy with blog and, as commons does not have the 
> needed sampling methods without Ted's patches, suggest we could go with blog. 
> Removing the plugability would clean up the code some too.

Do you know what the current status of the patches is? I must admit I have a 
slight preference for commons-math as well, provided it supports what we need.


> dirichlet process implementation
> --------------------------------
>
>                 Key: MAHOUT-30
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-30
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Clustering
>            Reporter: Isabel Drost
>            Assignee: Jeff Eastman
>         Attachments: jeastman.vcf, MAHOUT-30.patch, MAHOUT-30b.patch, 
> MAHOUT-30c.patch
>
>
> Copied over from original issue:
> > Further extension can also be made by assuming an infinite mixture model. 
> > The implementation is only slightly more difficult and the result is a 
> > (nearly)
> > non-parametric clustering algorithm.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
