Hi All,

My HADOOP_CLASSPATH setting was interfering somehow. After clearing it, clusterdump works fine:

-bash-4.1$ export HADOOP_CLASSPATH=""

./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final --pointsDir /scratch/clusterdump
MAHOUT-JOB: 
/apps/mahout/trunk/examples/target/mahout-examples-0.9-SNAPSHOT-job.jar
Warning: $HADOOP_HOME is deprecated.

13/12/20 15:06:15 INFO common.AbstractJob: Command line arguments: 
{--dictionaryType=[text], 
--distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure],
 --endPhase=[2147483647], 
--input=[/scratch/dummyvectoroutput/clusters-*-final], --outputFormat=[TEXT], 
--pointsDir=[/scratch/clusterdump], --startPhase=[0], --tempDir=[temp]}
CL-92{n=10 c=[343.032, 272.783, 78.239, 4.934, 54.654] r=[72.995, 74.388, 
75.692, 14.803, 80.172]}
CL-7{n=34 c=[61.475, 67.234, 94.989, 75.609, 267.051] r=[80.386, 84.565, 
124.621, 86.960, 90.146]}
CL-98{n=30 c=[28.038, 81.483, 145.317, 269.980, 52.420] r=[43.357, 114.179, 
136.547, 119.696, 84.281]}
CL-3{n=8 c=[339.604, 28.429, 124.278, 61.143, 84.997] r=[73.463, 44.537, 
128.509, 40.645, 100.324]}
VL-46{n=18 c=[58.082, 299.551, 79.124, 65.438, 39.663] r=[61.926, 96.523, 
91.026, 91.622, 66.675]}
13/12/20 15:06:16 INFO clustering.ClusterDumper: Wrote 5 clusters
13/12/20 15:06:16 INFO driver.MahoutDriver: Program took 841 ms (Minutes: 
0.014016666666666667)
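
For anyone who hits the same thing, the workaround above can be sketched as a pair of small shell helpers: one lists the HADOOP_CLASSPATH entries so a suspicious jar is easier to spot, and the other runs a single command with the variable emptied instead of clearing it for the whole session. The helper names list_classpath_entries and run_without_hadoop_classpath are made up for illustration and are not part of Mahout or Hadoop; which entry actually caused the conflict is not known from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Mahout or Hadoop), assuming a
# colon-separated HADOOP_CLASSPATH as on Linux.

# Print each HADOOP_CLASSPATH entry on its own line.
# Prints nothing if the variable is unset or empty.
list_classpath_entries() {
  if [ -z "${HADOOP_CLASSPATH:-}" ]; then
    return 0
  fi
  printf '%s\n' "$HADOOP_CLASSPATH" | tr ':' '\n'
}

# Run the given command with HADOOP_CLASSPATH emptied. The subshell keeps
# the caller's value untouched once the command returns.
run_without_hadoop_classpath() {
  (
    export HADOOP_CLASSPATH=""
    "$@"
  )
}

# Example usage (same command as above, left commented out here):
#   list_classpath_entries
#   run_without_hadoop_classpath ./mahout clusterdump \
#     -i /scratch/dummyvectoroutput/clusters-*-final \
#     --pointsDir /scratch/clusterdump
```

If the command works only when run through the second helper, the entries printed by the first are the place to look for a conflicting jar.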

> Date: Fri, 20 Dec 2013 15:02:13 -0800
> From: [email protected]
> Subject: Re: clusterdump
> To: [email protected]
> 
> I would investigate all of those 'Unable to add .....' messages first.
> Check out the latest code and run a clean build.
> 
> 
> 
> 
> 
> On Friday, December 20, 2013 5:58 PM, Sameer Tilak <[email protected]> wrote:
>  
> Suneel:
> Yes, I am working off of trunk. I saw that example. In my case the data is
> numeric, so I assume there is no need for a dictionary, etc. I am not sure
> what is going on, but I still get the following errors:
> 
> ./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -o 
> /scratch/clusterdump
> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> Warning: $HADOOP_HOME is deprecated.
> 
> Running on hadoop, using /users/p529444/software/hadoop-1.0.3/bin/hadoop and 
> HADOOP_CONF_DIR=/apps/hadoop/hadoop-conf
> MAHOUT-JOB: 
> /apps/mahout/trunk/examples/target/mahout-examples-0.9-SNAPSHOT-job.jar
> Warning: $HADOOP_HOME is deprecated.
> 
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.clustering.ClusterDumper
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.TrainLogistic
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.vectors.lucene.Driver
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.RunAdaptiveLogistic
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.SequenceFileDumper
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.PrintResourceOrFile
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.ValidateAdaptiveLogistic
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.text.WikipediaToSequenceFile
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.ConfusionMatrixDumper
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.regex.RegexConverterDriver
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.text.SequenceFilesFromMailArchives
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.TrainAdaptiveLogistic
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.vectors.VectorDumper
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.vectors.RowIdJob
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.clustering.streaming.tools.ClusterQualitySummarizer
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.SplitInput
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.clustering.streaming.tools.ResplitSequenceFiles
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.text.SequenceFilesFromLuceneStorageDriver
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.MatrixDumper
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.text.SequenceFilesFromDirectory
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.classifier.sgd.RunLogistic
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.ConcatenateVectorsJob
> 13/12/20 14:57:02 WARN driver.MahoutDriver: Unable to add class: 
> org.apache.mahout.utils.vectors.arff.Driver
> Unknown program 'clusterdump' chosen.
> Valid program names are:
>   baumwelch: : Baum-Welch algorithm for unsupervised HMM training
>   canopy: : Canopy clustering
>   cleansvd: : Cleanup and verification of SVD output
>   clusterpp: : Groups Clustering Output In Clusters
>   cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
>   cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
>   evaluateFactorization: : compute RMSE and MAE of a rating matrix 
> factorization against probes
>   fkmeans: : Fuzzy K-means clustering
>   hmmpredict: : Generate random sequence of observations by given HMM
>   itemsimilarity: : Compute the item-item-similarities for item-based 
> collaborative filtering
>   kmeans: : K-means clustering
>   matrixmult: : Take the product of two matrices
>   parallelALS: : ALS-WR factorization of a rating matrix
>   recommendfactorized: : Compute recommendations using the factorization of a 
> rating matrix
>   recommenditembased: : Compute recommendations using item-based 
> collaborative filtering
>   rowsimilarity: : Compute the pairwise similarities of the rows of a matrix
>   seq2encoded: : Encoded Sparse Vector generation from Text sequence files
>   seq2sparse: : Sparse Vector generation from Text sequence files
>   spectralkmeans: : Spectral k-means clustering
>   splitDataset: : split a rating dataset into training and probe parts
>   ssvd: : Stochastic SVD
>   streamingkmeans: : Streaming k-means clustering
>   svd: : Lanczos Singular Value Decomposition
>   testnb: : Test the Vector-based Bayes classifier
>   trainnb: : Train the Vector-based Bayes classifier
>   transpose: : Take the transpose of a matrix
>   vecdist: : Compute the distances between a set of Vectors (or Cluster or 
> Canopy, they must fit in memory) and a list of Vectors
>   viterbi: : Viterbi decoding of hidden states from given output states 
> sequence
> 
> 
> > Date: Fri, 20 Dec 2013 14:42:33 -0800
> > From: [email protected]
> > Subject: Re: clusterdump
> > To: [email protected]
> > 
> > Are you working off of trunk? 'clusterdump' is being used in 
> > examples/bin/cluster-reuters.sh.
> > 
> > 
> > 
> > 
> > 
> > On Friday, December 20, 2013 5:33 PM, Sameer Tilak <[email protected]> wrote:
> >  
> > Hi All,
> > I was able to run the clustering and need some help viewing the results.
> > I get the following error when I run clusterdump:
> > 
> > ./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -d 
> > /scratch/dummyvectorfinalclusters
> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> > Warning: $HADOOP_HOME is deprecated.
> > 
> > Running on hadoop, using /users/p529444/software/hadoop-1.0.3/bin/hadoop 
> > and HADOOP_CONF_DIR=/apps/hadoop/hadoop-conf
> > MAHOUT-JOB: 
> > /apps/mahout/trunk/examples/target/mahout-examples-0.9-SNAPSHOT-job.jar
> > Warning: $HADOOP_HOME is deprecated.
> > 
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.clustering.ClusterDumper
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.TrainLogistic
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.vectors.lucene.Driver
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.RunAdaptiveLogistic
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.SequenceFileDumper
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.PrintResourceOrFile
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.ValidateAdaptiveLogistic
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.text.WikipediaToSequenceFile
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.ConfusionMatrixDumper
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.regex.RegexConverterDriver
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.text.SequenceFilesFromMailArchives
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.TrainAdaptiveLogistic
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.vectors.VectorDumper
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.vectors.RowIdJob
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.clustering.streaming.tools.ClusterQualitySummarizer
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.SplitInput
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.clustering.streaming.tools.ResplitSequenceFiles
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.text.SequenceFilesFromLuceneStorageDriver
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.MatrixDumper
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.text.SequenceFilesFromDirectory
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.classifier.sgd.RunLogistic
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.ConcatenateVectorsJob
> > 13/12/20 14:21:56 WARN driver.MahoutDriver: Unable to add class: 
> > org.apache.mahout.utils.vectors.arff.Driver
> > Unknown program 'clusterdump' chosen.
> > Valid program names are:
> >   baumwelch: : Baum-Welch algorithm for unsupervised HMM training
> >   canopy: : Canopy clustering
> >   cleansvd: : Cleanup and verification of SVD output
> >   clusterpp: : Groups Clustering Output In Clusters
> >   cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
> >   cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
> >   evaluateFactorization: : compute RMSE and MAE of a rating matrix 
> > factorization against probes
> >   fkmeans: : Fuzzy K-means clustering
> >   hmmpredict: : Generate random sequence of observations by given HMM
> >   itemsimilarity: : Compute the item-item-similarities for item-based 
> > collaborative filtering
> >   kmeans: : K-means clustering
> >   matrixmult: : Take the product of two matrices
> >   parallelALS: : ALS-WR factorization of a rating matrix
> >   recommendfactorized: : Compute recommendations using the factorization of 
> > a rating matrix
> >   recommenditembased: : Compute recommendations using item-based 
> > collaborative filtering
> >   rowsimilarity: : Compute the pairwise similarities of the rows of a matrix
> >   seq2encoded: : Encoded Sparse Vector generation from Text sequence files
> >   seq2sparse: : Sparse Vector generation from Text sequence files
> >   spectralkmeans: : Spectral k-means clustering
> >   splitDataset: : split a rating dataset into training and probe parts
> >   ssvd: : Stochastic SVD
> >   streamingkmeans: : Streaming k-means clustering
> >   svd: : Lanczos Singular Value Decomposition
> >   testnb: : Test the Vector-based Bayes classifier
> >   trainnb: : Train the Vector-based Bayes classifier
> >   transpose: : Take the transpose of a matrix
> >   vecdist: : Compute the distances between a set of Vectors (or Cluster or 
> > Canopy, they must fit in memory) and a list of Vectors
> >   viterbi: : Viterbi decoding of hidden states from given output states 
> > sequence