thx
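
For anyone finding this thread later, here is a minimal sketch of the seq2sparse call with the suggested --namedVector option added. It assumes the same paths as in my command quoted below. The variable names and the guard around the launcher are only illustrative, and I have not confirmed that the option alone carries the keys through canopy and kmeans.

```shell
# Paths copied from the thread; adjust them to your own layout.
INPUT=../temp/input
OUTPUT=../temp/vector/

# Same seq2sparse call as in the thread, plus --namedVector as suggested.
CMD="mahout seq2sparse -i $INPUT -o $OUTPUT -chunk 100 -wt TFIDF -ow --namedVector"
echo "$CMD"

# Run it only where the mahout launcher is actually on the PATH.
if command -v mahout >/dev/null 2>&1; then
  $CMD
fi
```

Afterwards the output can be inspected with seqdumper, as suggested, to check whether the vectors now carry the original sequenceFile key.
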
At 2011-10-12 21:51:51,"Grant Ingersoll" <[email protected]> wrote:
>
>On Oct 12, 2011, at 9:26 AM, beneo_7 wrote:
>
>>
>> // sequenceFile -> vector
>> mahout seq2sparse -i ../temp/input -o ../temp/vector/ -chunk 100 -wt TFIDF
>> -ow
>
>I think you need the --namedVector option to get/keep named vectors. You
>might try using the SequenceFile dumper (seqdumper) to examine the output of
>this.
>
>(Also, in the future, this question is best asked on [email protected])
>
>>
>> // vector -> canopy
>> mahout canopy -i /home/hduser/temp/vector/vector -o /home/hduser/temp/canopy/
>> -dm org.apache.mahout.common.distance.CosineDistanceMeasure -t1 0.032 -t2
>> 0.008 -ow
>>
>> // canopy -> kmeans
>> KMeansDriver.run(
>>     conf, // configuration
>>     vectorPath, // the directory pathname for input points
>>     canopyClusterPath, // the directory pathname for initial & computed clusters
>>     kmeansPath, // the directory pathname for output points
>>     new CosineDistanceMeasure(), // cos
>>     0.1d, // the convergence delta value
>>     10, // the maximum number of iterations
>>     true, // run clustering
>>     false // execute map reduce
>> );
>>
>> no exception thrown and thx in advance
>>
>> At 2011-10-12 20:27:19,"Grant Ingersoll" <[email protected]> wrote:
>>> Can you share your actual commands?
>>>
>>> On Oct 12, 2011, at 6:21 AM, beneo_7 wrote:
>>>
>>>> hi all
>>>> i create vector using lucene index, and the mahout will use NamedVector,
>>>> but how about create vector from sequenceFile???
>>>>
>>>> now, i create vector from text with the follow steps:
>>>>
>>>> step #1
>>>> text -> sequenceFile
>>>> key = text, value = text
>>>> i do not use seqdirectory, cuz i want to put the String key into
>>>> the sequenceFile, not the doc Id
>>>>
>>>> step #2
>>>> seq2sparse using TFIDF
>>>> the output i use tfidf-vectors/
>>>>
>>>> step #3 #4
>>>> canopy -> kmeans
>>>>
>>>> step #4
>>>> clusterDump
>>>>
>>>> i found the vector is
>>>> org.apache.mahout.math.RandomAccessSparseVector, and where i can found the
>>>> sequenceFile key??
>>>>
>>>> thx in advance
>>>
>>> --------------------------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com
>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>
>
>--------------------------------------------
>Grant Ingersoll
>http://www.lucidimagination.com
>Lucene Eurocon 2011: http://www.lucene-eurocon.com
>
