I don't quite get what it is that you are trying to say... :)
On Tuesday, March 10, 2015 6:16 PM, Roshan Kedar <[email protected]>
wrote:
Ec dccddxxedddxxdddcddc.exd.ddxxxccxxxcxdccdddciidcdxddddxdvd.m Shuru se i
told ppl ,cx.fddcuxddcxcrdcdddcddcrccdddrr
rrcdxcddmcxndxxccdddd.ccxxcdddxcdddxccfddcdxccdxxdxdeddxcdsd
Ccddjcdd..cdxdd.gddmcddcddcfexfex
N cec
Xdedd
E eexx.vdxcdddcccrm.rccdd.cccddd dddcdr..r
. C xxcxxxexcxecmdddx dddcdr.
Egc.mmr.m.crmmd co.mi d. Heddrd,
Suneel Marthi <[email protected]> wrote:
>Are you still specifying the -c option? It's only needed if you have initial
>centroids to launch KMeans from; otherwise KMeans picks random centroids.
>
>Also, CosineDistanceMeasure doesn't make sense with KMeans, which operates in
>Euclidean space. Try using the SquaredEuclidean or Euclidean distance measures.
>
>On Tue, Mar 10, 2015 at 1:27 AM, Raghuveer <[email protected]>
>wrote:
>
>> Hi All,
>> I am trying to run the command:
>> ./mahout kmeans -i
>> hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-00000
>> -o
>> hdfs://master:54310//user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer
>> -c hdfs://master:54310/user/netlog/upload/mahoutoutput -dm
>> org.apache.mahout.common.distance.CosineDistanceMeasure -x 5 -ow -cl -k 25
>> -xm mapreduce
>> Since I don't have any clusters yet to give it as input, the forums
>> suggested that I can remove that option. But now I get this error:
>>
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /home/raghuveer/trunk/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar
>> 15/03/10 10:52:53 ERROR common.AbstractJob: Missing required option --clusters
>> Missing required option --clusters
>>
>> Usage:
>>  [--input <input> --output <output> --distanceMeasure <distanceMeasure>
>>  --clusters <clusters> --numClusters <k> --randomSeed <randomSeed1>
>>  [<randomSeed2> ...] --convergenceDelta <convergenceDelta> --maxIter <maxIter>
>>  --overwrite --clustering --method <method> --outlierThreshold
>>  <outlierThreshold> --help --tempDir <tempDir> --startPhase <startPhase>
>>  --endPhase <endPhase>]
>>   --clusters (-c) clusters    The input centroids, as Vectors. Must be a
>>                               SequenceFile of Writable, Cluster/Canopy. If k is
>>                               also specified, then a random set of vectors will
>>                               be selected and written out to this path first
>> 15/03/10 10:52:53 INFO driver.MahoutDriver: Program took 370 ms (Minutes: 0.006166666666666667)
>> Kindly help me out.
>> Thanks
>>
>>
>>
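For reference, here is a rough sketch of how the invocation might look once the points in this thread are combined. It assumes the usage text quoted above is accurate: --clusters is a required option, and when -k is also given, a random set of vectors is written to the -c path first. The paths and flags are copied from the original command; the switch to SquaredEuclideanDistanceMeasure follows the suggestion above. The RUN variable is only a convention of this sketch (not a Mahout feature) so that the command is printed rather than launched by default.

```shell
#!/bin/sh
# Sketch only: paths are copied from the original command in this thread;
# adjust them for your own cluster.
INPUT="hdfs://master:54310/user/netlog/upload/output4/tfidf-vectors/part-r-00000"
OUTPUT="hdfs://master:54310//user/netlog/upload/output4/tfidf-vectors-kmeans-clusters-raghuveer"

# --clusters is required. Per the usage text, when -k is also given, a random
# set of vectors is selected and written out to this path first.
CLUSTERS="hdfs://master:54310/user/netlog/upload/mahoutoutput"

# Distance measure swapped from Cosine to SquaredEuclidean, as suggested above.
DISTANCE="org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure"

CMD="./mahout kmeans -i $INPUT -o $OUTPUT -c $CLUSTERS -dm $DISTANCE -x 5 -ow -cl -k 25 -xm mapreduce"

# Print the command by default; set RUN=1 (an assumption of this sketch)
# to actually launch the job.
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```

Running the script as-is only prints the assembled command, so it can be checked before anything is submitted to Hadoop. Running it with RUN=1 from Mahout's bin directory would launch the job.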