Re: [jira] [Commented] (MAHOUT-843) Top Down Clustering

Lance Norskog Thu, 08 Dec 2011 14:24:35 -0800

Paritosh, thanks for jumping through all of these hoops. (If only the
committers' code went through this much scrutiny :)


On Wed, Dec 7, 2011 at 9:57 PM, Hudson (Commented) (JIRA)
<[email protected]> wrote:

>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165014#comment-13165014]
>
> Hudson commented on MAHOUT-843:
> -------------------------------
>
> Integrated in Mahout-Quality #1236 (See [
> https://builds.apache.org/job/Mahout-Quality/1236/])
>    MAHOUT-843: Final patch plus some integration fixes. All tests run
>
> jeastman :
> http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1211715
> Files :
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/PathDirectory.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/TopDownClusteringPathConstants.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReader.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessor.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorDriver.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorMapper.java
> * /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorReducer.java
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/PathDirectoryTest.java
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReaderTest.java
> * /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorTest.java
> * /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java
> * /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterEvaluator.java
> * /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/cdbw/TestCDbwEvaluator.java
> * /mahout/trunk/src/conf/clusterpp.props
> * /mahout/trunk/src/conf/driver.classes.props
>
>
> > Top Down Clustering
> > -------------------
> >
> >                 Key: MAHOUT-843
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-843
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Clustering
> >    Affects Versions: 0.6
> >            Reporter: Paritosh Ranjan
> >            Assignee: Jeff Eastman
> >              Labels: clustering, patch
> >             Fix For: 0.6
> >
> >         Attachments: MAHOUT-843-patch,
> MAHOUT-843-patch-only-postprocessor,
> MAHOUT-843-patch-only-postprocessor-final,
> MAHOUT-843-patch-only-postprocessor-v1,
> MAHOUT-843-patch-only-postprocessor-v2,
> MAHOUT-843-patch-only-postprocessor-v3,
> MAHOUT-843-patch-only-postprocessor-v4,
> MAHOUT-843-patch-only-postprocessor-v5, MAHOUT-843-patch-v1,
> Top-Down-Clustering-patch
> >
> >
> > Top Down Clustering works in multiple steps. The first step is to find
> > comparatively bigger clusters. The second step is to cluster those bigger
> > chunks into meaningful clusters. This can improve performance when
> > clustering a large amount of data. It also removes the need to provide
> > input clusters/numbers to the clustering algorithm.
> > "Big" is a relative term, as is the smaller "meaningful". So the sizes of
> > the "bigger" and "smaller/meaningful" clusters will be controlled by the
> > user.
> > Which clustering algorithm to use at the top level and which to use at
> > the bottom level can also be selected by the user. Initially, this can be
> > done for only one or a few clustering algorithms; later, an option can be
> > provided to use all the algorithms that suit the case.
>
>
>
>
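
For readers skimming the thread, the two-step idea in the issue description
can be sketched roughly as follows. This is a minimal illustration in plain
Python, not Mahout's API and not the code in the patch: the function names
(threshold_cluster, top_down_cluster), the radius parameters, and the sample
points are all assumptions made up for the example. A greedy
distance-threshold grouping stands in for whichever algorithms are actually
used at the top and bottom levels.

```python
import math


def threshold_cluster(points, radius):
    """Greedy grouping (hypothetical stand-in for a real clustering step).

    A point joins the first cluster whose seed (that cluster's first point)
    lies within `radius`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of points; clusters[i][0] is its seed
    for p in points:
        for c in clusters:
            if math.dist(p, c[0]) <= radius:
                c.append(p)
                break
        else:
            # No existing seed was close enough, so p seeds a new cluster.
            clusters.append([p])
    return clusters


def top_down_cluster(points, coarse_radius, fine_radius):
    """Step 1: find big clusters. Step 2: re-cluster each big cluster alone."""
    hierarchy = []
    for big in threshold_cluster(points, coarse_radius):
        hierarchy.append(threshold_cluster(big, fine_radius))
    return hierarchy


# Made-up sample data: two well-separated regions, each with finer structure.
points = [(0, 0), (0, 1), (3, 0), (3, 1), (20, 20), (20, 21), (24, 20)]
hierarchy = top_down_cluster(points, coarse_radius=10.0, fine_radius=2.0)

for i, sub_clusters in enumerate(hierarchy):
    print(f"big cluster {i}: {sub_clusters}")
```

The two radii play the role of the user-controlled "bigger" and
"smaller/meaningful" settings mentioned in the description. Because the
second step only ever looks at points inside one big cluster, each of those
sub-problems is smaller than the full data set, which is where the hoped-for
performance gain would come from.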


-- 
Lance Norskog
[email protected]
