There was an issue with an empty cluster file being created for Canopy, which has since been fixed in the present trunk, so you may want to work off of trunk. Also, Canopy has been marked for deprecation in a future release, so whatever you are trying to do, you may want to look at the alternatives.
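Once the JSON dump is working, something along these lines might help pull the points out per cluster. This is only a rough sketch, not anything shipped with Mahout: it assumes the ClusterDumper output is a sequence of JSON objects shaped like the example quoted further down this thread (a "cluster" object with an "identifier", plus a "points" list whose entries carry a "vector_name"). The function names and the trimmed SAMPLE string are made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON object in text, whether the objects are
    written one per line or pretty-printed across several lines."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def points_by_cluster(text):
    """Map each cluster identifier (e.g. "C-7") to the vector names of
    the points listed under it. Assumes the field names from the quoted
    example; clusters without a "points" list map to an empty list."""
    result = {}
    for cluster in iter_json_objects(text):
        ident = cluster["cluster"]["identifier"]
        result[ident] = [p.get("vector_name") for p in cluster.get("points", [])]
    return result


# Trimmed-down version of the example output quoted in this thread.
SAMPLE = """
{
  "cluster_id": 7,
  "cluster": {"r": [], "c": [{"dogs": 1.916}], "n": 1, "identifier": "C-7"},
  "points": [
    {"point": [{"dogs": 1.916}], "vector_name": "P(14)", "weight": "1.0"}
  ]
}
"""

print(points_by_cluster(SAMPLE))  # {'C-7': ['P(14)']}
```

To use it on a real dump, you would read the file that ClusterDumper wrote and pass its contents to points_by_cluster. If the part file under clusteredPoints is empty, as described below, the "points" lists will presumably be empty too, so that has to be sorted out first.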

On Fri, Jun 20, 2014 at 4:53 AM, Kamesh <[email protected]> wrote:

> Hi Andrew,
> I am invoking Canopy Driver class to perform clustering. I am able to see
> the results when output format is either TEXT or CSV. However, when I am
> using JSON, I am getting the exception as I mentioned above.
>
> On Wed, Jun 18, 2014 at 10:32 PM, Andrew Musselman <[email protected]> wrote:
>
> > Kamesh, can you please describe the schema of your input data, along with
> > your command to perform the clustering?
> >
> > On Mon, Jun 16, 2014 at 12:44 AM, Kamesh <[email protected]> wrote:
> >
> > > Thanks for the response Andrew. I am using Mahout 0.9 version. However, I
> > > tried with trunk version but still I am getting output in the following
> > > format
> > >
> > > C-55{n=1 c=[15993058.000] r=[]}
> > > C-56{n=2 c=[15993061.167] r=[]}
> > > C-57{n=1 c=[15993062.000] r=[]}
> > >
> > > C-97{n=1 c=[15993103.000] r=[]}
> > > C-98{n=2 c=[15993119.333] r=[0.395]}
> > > C-99{n=1 c=[15993105.000] r=[]}
> > >
> > > and hence, not able to figure out the data points inside each cluster.
> > >
> > > Also, When I am running with "-of JSON" getting NPE
> > >
> > > Exception in thread "main" java.lang.NullPointerException
> > > at org.apache.mahout.utils.clustering.JsonClusterWriter.getTopFeaturesList(JsonClusterWriter.java:118)
> > > at org.apache.mahout.utils.clustering.JsonClusterWriter.write(JsonClusterWriter.java:73)
> > > at org.apache.mahout.utils.clustering.AbstractClusterWriter.write(AbstractClusterWriter.java:115)
> > > at org.apache.mahout.utils.clustering.AbstractClusterWriter.write(AbstractClusterWriter.java:102)
> > >
> > > I am executing cluster dump using the following way
> > >
> > > hadoop jar mahout-integration-1.0-SNAPSHOT.jar
> > > org.apache.mahout.utils.clustering.ClusterDumper -i
> > > /canopy/clusters-0-final -p /canopy/clusteredPoints -of JSON -n 1000
> > >
> > > Also I have observed that the *part* file created inside *clusteredPoints*
> > > is empty.
> > >
> > > Please help me how to get data points from each cluster.
> > >
> > > On Fri, Jun 13, 2014 at 9:24 PM, Andrew Musselman <[email protected]> wrote:
> > >
> > > > That's going to be easier if you can work off of trunk, since the output of
> > > > clustering has been cleaned up to write a better format, per
> > > > https://issues.apache.org/jira/browse/MAHOUT-1505
> > > >
> > > > E.g.,
> > > >
> > > > {
> > > >   "top_terms": [
> > > >     {"all":3.0149030685424805},
> > > >     {"english":3.0149030685424805},
> > > >     {"best":3.0149030685424805},
> > > >     {"spaniel":3.0149030685424805},
> > > >     {"springer":3.0149030685424805},
> > > >     {"dogs":1.9162907600402832}
> > > >   ],
> > > >   "cluster_id": 7,
> > > >   "cluster": {
> > > >     "r": [],
> > > >     "c": [
> > > >       {"all":3.015},
> > > >       {"best":3.015},
> > > >       {"dogs":1.916},
> > > >       {"english":3.015},
> > > >       {"spaniel":3.015},
> > > >       {"springer":3.015}
> > > >     ],
> > > >     "n": 1,
> > > >     "identifier": "C-7"
> > > >   },
> > > >   "points": [
> > > >     {
> > > >       "point": [
> > > >         {"all":3.015},
> > > >         {"best":3.015},
> > > >         {"dogs":1.916},
> > > >         {"english":3.015},
> > > >         {"spaniel":3.015},
> > > >         {"springer":3.015}
> > > >       ],
> > > >       "vector_name": "P(14)",
> > > >       "weight": "1.0"
> > > >     }
> > > >   ]
> > > > }
> > > >
> > > > On Fri, Jun 13, 2014 at 2:42 AM, Kamesh <[email protected]> wrote:
> > > >
> > > > > Hi All,
> > > > > Please help me in getting the data points inside each cluster.
> > > > > The output of the clustering algorithm is center of the cluster and radius
> > > > > of the cluster. How do we derive actual data points inside each cluster
> > > > > from this output.
> > > > >
> > > > > --
> > > > > Kamesh.
> > >
> > > --
> > > Kamesh.
>
> --
> Kamesh.
