Suneel Marthi <[email protected]> wrote:
>Are you still specifying the -c option? It's only needed if you have initial
>centroids to launch KMeans from; otherwise KMeans picks random centroids.
>
>Also, CosineDistanceMeasure doesn't make sense with KMeans, which is in
>Euclidean space. Try using SquaredEuclidean or Euclidean distances.
>
>On Tue, Mar 10, 2015 at 1:27 AM, Raghuveer <[email protected]>
>wrote:
>
>> Hi All,
>> I am trying to run the command:
>> ./mahout kmeans -i
>> hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-00000
>> -o
>> hdfs://master:54310//user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer
>> -c hdfs://master:54310/user/netlog/upload/mahoutoutput -dm
>> org.apache.mahout.common.distance.CosineDistanceMeasure -x 5 -ow -cl -k 25
>> -xm mapreduce
>> Since I don't have any clusters yet to give it as input, the forums
>> suggested I can remove the -c option. But now I get the error:
>>
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB:
>> /home/raghuveer/trunk/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar
>> 15/03/10 10:52:53 ERROR common.AbstractJob: Missing required option
>> --clusters
>> Missing required option
>> --clusters
>>
>> Usage:
>> [--input <input> --output <output> --distanceMeasure <distanceMeasure>
>> --clusters <clusters> --numClusters <k> --randomSeed <randomSeed1>
>> [<randomSeed2> ...] --convergenceDelta <convergenceDelta> --maxIter
>> <maxIter> --overwrite --clustering --method <method> --outlierThreshold
>> <outlierThreshold> --help --tempDir <tempDir> --startPhase <startPhase>
>> --endPhase <endPhase>]
>> --clusters (-c) clusters    The input centroids, as Vectors. Must be a
>>                             SequenceFile of Writable, Cluster/Canopy. If k is
>>                             also specified, then a random set of vectors will
>>                             be selected and written out to this path first
>> 15/03/10 10:52:53 INFO driver.MahoutDriver: Program took 370 ms (Minutes:
>> 0.006166666666666667)
>> Kindly help me out.
>> Thanks
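
On the distance-measure point above, a small numeric sketch may help show how the two measures relate. It is plain Python, not Mahout code, and the two vectors are made-up stand-ins for TF-IDF rows (an assumption, not data from this thread). It checks that once vectors are scaled to unit length, the squared Euclidean distance between them equals twice the cosine distance of the originals, so a Euclidean measure on length-normalized vectors ranks neighbours the same way cosine would.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_distance(a, b):
    """1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical TF-IDF-like vectors, for illustration only.
doc_a = [0.0, 1.2, 3.4, 0.5]
doc_b = [2.0, 0.1, 1.1, 0.0]

unit_a = l2_normalize(doc_a)
unit_b = l2_normalize(doc_b)

# For unit vectors: ||a - b||^2 = 2 - 2*cos(a, b) = 2 * cosine_distance(a, b)
lhs = squared_euclidean(unit_a, unit_b)
rhs = 2.0 * cosine_distance(doc_a, doc_b)
print(lhs, rhs)
```

Whether the TF-IDF vectors in this job are already length-normalized is not stated in the thread, so treat that as something to check before relying on this equivalence.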
