Re: Does clusterdump still support option "--seqFileDir"?

Paritosh Ranjan Wed, 05 Sep 2012 01:04:11 -0700

I think it's a version/doc mismatch. The current version just takes the input path (--input) as seqFileDir.


seqFileDir = getInputPath();
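
Going by the --help output you pasted, the same command with --input in place of --seqFileDir should work. A rough, untested sketch, assuming MAHOUT_HOME is set as in your command and the paths contain no spaces (the ARGS variable and the existence check are only for illustration):

```shell
# Assumption: MAHOUT_HOME is set as in the original command.
# --input replaces the old --seqFileDir; --pointsDir and --output are unchanged.
ARGS="clusterdump --input output/clusters-10 --pointsDir output/clusteredPoints --output $MAHOUT_HOME/examples/output/clusteranalyze.txt"

if [ -x "$MAHOUT_HOME/bin/mahout" ]; then
  # ARGS is left unquoted on purpose so it splits into separate arguments.
  "$MAHOUT_HOME/bin/mahout" $ARGS
else
  # Mahout launcher not found: print the command instead of running it.
  echo "mahout $ARGS"
fi
```

If this still fails, the short forms -i, -p and -o listed in the help output are worth trying too.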



On 05-09-2012 12:56, javaboom wrote:

I tried to use "clusterdump", following this manual:
https://cwiki.apache.org/MAHOUT/cluster-dumper.html

I ran the following command:

  $MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-10 \
    --pointsDir output/clusteredPoints \
    --output $MAHOUT_HOME/examples/output/clusteranalyze.txt

The problem is that "clusterdump" does not recognize the option
"--seqFileDir". I then checked the command's help output:

============================================================================
root@ubuntu:~/trunk/bin# ./mahout clusterdump --help
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /root/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
Usage:
  [--input <input> --output <output> --outputFormat <outputFormat> --substring
<substring> --numWords <numWords> --pointsDir <pointsDir> --samplePoints
<samplePoints> --dictionary <dictionary> --dictionaryType <dictionaryType>
--evaluate --distanceMeasure <distanceMeasure> --help --tempDir <tempDir>
--startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
   --input (-i) input                         Path to job input directory.
   --output (-o) output                       The directory pathname for
                                              output.
   --outputFormat (-of) outputFormat          The optional output format to
                                              write the results as.  Options:
                                              TEXT, CSV or GRAPH_ML
   --substring (-b) substring                 The number of chars of the
                                              asFormatString() to print
   --numWords (-n) numWords                   The number of top terms to
                                              print
   --pointsDir (-p) pointsDir                 The directory containing points
                                              sequence files mapping input
                                              vectors to their cluster.  If
                                              specified, then the program will
                                              output the points associated with
                                              a cluster
   --samplePoints (-sp) samplePoints          Specifies the maximum number of
                                              points to include _per_ cluster.
                                              The default is to include all
                                              points
   --dictionary (-d) dictionary               The dictionary file
   --dictionaryType (-dt) dictionaryType      The dictionary file type
                                              (text|sequencefile)
   --evaluate (-e)                            Run ClusterEvaluator and
                                              CDbwEvaluator over the input. The
                                              output will be appended to the
                                              rest of the output at the end.
   --distanceMeasure (-dm) distanceMeasure    The classname of the
                                              DistanceMeasure. Default is
                                              SquaredEuclidean
   --help (-h)                                Print out help
   --tempDir tempDir                          Intermediate output directory
   --startPhase startPhase                    First phase to run
   --endPhase endPhase                        Last phase to run
Specify HDFS directories while running on hadoop; else specify local file
system directories
12/09/05 15:17:25 INFO driver.MahoutDriver: Program took 170 ms (Minutes: 0.0028333333333333335)
============================================================================

Could you please help me? How can I solve this problem? Am I using a
different Mahout version from the one the manual describes?

Thank you in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Does-clusterdump-still-support-option-seqFileDir-tp4005517.html
Sent from the Mahout User List mailing list archive at Nabble.com.
