[ 
https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900393#action_12900393
 ] 

Hudson commented on MAHOUT-479:
-------------------------------

Integrated in Mahout-Quality #208 (See 
[https://hudson.apache.org/hudson/job/Mahout-Quality/208/])
    MAHOUT-479: This commit refactors Cluster, rather than AbstractCluster,
to inherit from Model<VectorWritable>; AbstractCluster now inherits from
Cluster alone. The existing Dirichlet models also now inherit from Cluster,
which simplifies the use of generics and cleans up a lot of the code.
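
As a rough illustration of the hierarchy described above, here is a minimal
sketch in Java. Every name in it (SketchModel, SketchCluster,
SketchAbstractCluster, SketchDistanceCluster) is a hypothetical stand-in, and
plain double[] replaces VectorWritable; none of this is Mahout's actual API.
The 1 / (1 + distance) score is likewise only an assumption chosen so that a
distance-based cluster can answer a pdf() call.

```java
// Hypothetical stand-in for a generic model over observations of type O.
interface SketchModel<O> {
    double pdf(O x);            // likelihood-like score of x under this model
    void observe(O x);          // accumulate one sample
    void computeParameters();   // update parameters from the accumulated samples
}

// The cluster contract is itself a model, here over double[] points.
interface SketchCluster extends SketchModel<double[]> {
    double[] getCenter();
    long getNumPoints();
}

// Shared bookkeeping lives in the abstract class, which implements only the
// cluster interface.
abstract class SketchAbstractCluster implements SketchCluster {
    private final double[] center;
    private double[] sum;
    private long pending;
    private long numPoints;

    protected SketchAbstractCluster(double[] center) {
        this.center = center.clone();
        this.sum = new double[center.length];
    }

    @Override
    public void observe(double[] x) {
        for (int i = 0; i < sum.length; i++) {
            sum[i] += x[i];
        }
        pending++;
    }

    @Override
    public void computeParameters() {
        if (pending == 0) {
            return; // nothing observed since the last update
        }
        for (int i = 0; i < center.length; i++) {
            center[i] = sum[i] / pending; // new center = mean of observed points
        }
        numPoints += pending;
        sum = new double[center.length];
        pending = 0;
    }

    @Override
    public double[] getCenter() {
        return center.clone();
    }

    @Override
    public long getNumPoints() {
        return numPoints;
    }
}

// A distance-based cluster usable wherever a model is expected.
final class SketchDistanceCluster extends SketchAbstractCluster {
    SketchDistanceCluster(double[] center) {
        super(center);
    }

    @Override
    public double pdf(double[] x) {
        double[] c = getCenter();
        double squared = 0.0;
        for (int i = 0; i < c.length; i++) {
            double d = x[i] - c[i];
            squared += d * d;
        }
        // Assumed scoring rule: closer points score nearer to 1.
        return 1.0 / (1.0 + Math.sqrt(squared));
    }
}
```

Because the cluster interface extends the model interface, code that iterates
over models can accept any cluster without extra generic parameters.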

Since Dirichlet can now iterate over arbitrary Clusters as its models, this
opens up the entire set of DistanceMeasure-based clusters for Dirichlet
processing. This should allow the output of, e.g., Canopy to become the
prior model distribution in a subsequent Dirichlet step. Is this a feature?

The AbstractCluster hierarchy has been adjusted so that
GaussianClusterDistribution and DistanceMeasureClusterDistribution can
instantiate its subclasses. Both have tests and seem to work as expected.

All unit tests run. More to follow.


> Streamline classification/ clustering data structures
> -----------------------------------------------------
>
>                 Key: MAHOUT-479
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-479
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering
>    Affects Versions: 0.1, 0.2, 0.3, 0.4
>            Reporter: Isabel Drost
>
> Opening this JIRA issue to collect ideas on how to streamline our 
> classification and clustering algorithms to make integration for users easier 
> as per mailing list thread http://markmail.org/message/pnzvrqpv5226twfs
> {quote}
> Jake and Robin and I were talking the other evening and a common lament was 
> that our classification (and clustering) stuff was all over the map in terms 
> of data structures.  Driving that to rest and getting those comments even 
> vaguely as plug and play as our much more advanced recommendation components 
> would be very, very helpful.
> {quote}
> This issue probably also relates to MAHOUT-287 (the intention there is to
> make naive bayes run on vectors as input).
> Ted, Jake, Robin: It would be great if one of you could add a comment on 
> some of the issues you discussed "the other evening" and (if applicable) any 
> minor or major changes you think could help solve this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
