[https://issues.apache.org/jira/browse/MAHOUT-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165014#comment-13165014]
Hudson commented on MAHOUT-843:
-------------------------------
Integrated in Mahout-Quality #1236 (See [https://builds.apache.org/job/Mahout-Quality/1236/])
MAHOUT-843: Final patch plus some integration fixes. All tests run
jeastman : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1211715
Files :
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/PathDirectory.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/TopDownClusteringPathConstants.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReader.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessor.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorReducer.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/PathDirectoryTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReaderTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorTest.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterEvaluator.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/cdbw/TestCDbwEvaluator.java
* /mahout/trunk/src/conf/clusterpp.props
* /mahout/trunk/src/conf/driver.classes.props
Top Down Clustering
-------------------
Key: MAHOUT-843
URL: https://issues.apache.org/jira/browse/MAHOUT-843
Project: Mahout
Issue Type: New Feature
Components: Clustering
Affects Versions: 0.6
Reporter: Paritosh Ranjan
Assignee: Jeff Eastman
Labels: clustering, patch
Fix For: 0.6
Attachments: MAHOUT-843-patch,
MAHOUT-843-patch-only-postprocessor,
MAHOUT-843-patch-only-postprocessor-final,
MAHOUT-843-patch-only-postprocessor-v1,
MAHOUT-843-patch-only-postprocessor-v2,
MAHOUT-843-patch-only-postprocessor-v3,
MAHOUT-843-patch-only-postprocessor-v4,
MAHOUT-843-patch-only-postprocessor-v5, MAHOUT-843-patch-v1,
Top-Down-Clustering-patch
Top Down Clustering works in multiple steps. The first step finds
comparatively bigger clusters. The second step clusters each of those
bigger chunks into meaningful clusters. This can improve performance
when clustering a large amount of data. It also removes the need to
provide input clusters or cluster counts to the clustering algorithm.
"Big" is a relative term, as is the smaller "meaningful". So the size
of these "bigger" and "smaller/meaningful" clusters will be controlled
by the user.
The user can also select which clustering algorithm to use at the top
level and which to use at the bottom level. Initially, this can be done
for only one or a few clustering algorithms; later, an option can be
provided to use any algorithm that suits the case.
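The two-step idea described above can be illustrated with a minimal sketch. This is not the code from the attached patches and does not use any Mahout API. It assumes a simplified single-threshold canopy pass at both levels, and the names canopy, top_down, coarse_t and fine_t are hypothetical. The two thresholds stand in for the user-controlled notions of "bigger" and "smaller/meaningful" clusters.

```python
import math

def canopy(points, threshold):
    """Greedy single-threshold canopy pass (a simplification, not Mahout's
    CanopyDriver): each point joins the first center within `threshold`,
    otherwise it becomes a new center. No cluster count is required."""
    centers, groups = [], []
    for p in points:
        for i, c in enumerate(centers):
            if math.dist(p, c) <= threshold:
                groups[i].append(p)
                break
        else:
            # No existing center is close enough: start a new cluster.
            centers.append(p)
            groups.append([p])
    return groups

def top_down(points, coarse_t, fine_t):
    """Step 1: find coarse clusters with a large threshold.
    Step 2: re-cluster each coarse cluster on its own with a small threshold."""
    return [canopy(group, fine_t) for group in canopy(points, coarse_t)]

points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.4, 0.0),
          (20.0, 20.0), (20.3, 20.0), (24.0, 20.0)]
result = top_down(points, coarse_t=10.0, fine_t=1.0)
for i, sub in enumerate(result):
    print(f"coarse cluster {i}: {len(sub)} fine clusters")
```

With these sample points, the coarse pass yields two clusters and the fine pass splits each of them into two. Because the second step only compares points inside one coarse cluster, it does less distance work than a single fine-grained pass over all the data, which is the performance argument made in the description. A different algorithm (for example k-means) could be substituted at either level.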