> Jeff Eastman <jdog <at> windwardsolutions.com> writes:
> Try naming the input *directory* not the particular input file.

I tried, but the result was the same. However, I discovered what looks like a bug in Mahout. I convert a text file to a sequence file with:

    bin/mahout seqdirectory --input <PATH> --output <PATH> --charset UTF-8

and then to sparse vectors with:

    bin/mahout seq2sparse --input <PATH>/content/reuters/seqfiles/ --norm 2 --weight TF --output <PATH>/content/reuters/seqfiles-TF/ --minDF 5 --maxDFPercent 90

If the original file is not valid, or the path is incorrect, Mahout still creates a dummy chunk-0 that is of no use to seq2sparse. The second command then produces more useless output because its input files are empty. You can tell because the part-00000 file in the vector folder is only around 90 bytes.

I think that was an old answer of yours to a problem similar to mine ^^

Do you have a link to a site where I can download a valid text file to use as a dataset? Then I can convert it to a sequence file and then to vectors, to see what Mahout kmeans produces. Thanks in advance!
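
The silent failure described above (an empty chunk-0, then a part-00000 of roughly 90 bytes) can be caught by checking sizes before and after each step. This is only a minimal sketch, assuming the files live on the local filesystem rather than on HDFS. The function names `has_nonempty_file` and `is_larger_than` are hypothetical helpers, not part of Mahout, and any byte threshold you pass is a guess based on the ~90 bytes observed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): succeed only if the directory
# exists and contains at least one non-empty regular file, searched recursively.
has_nonempty_file() {
    dir="$1"
    [ -d "$dir" ] || return 1
    # find lists regular files larger than 0 bytes; head keeps the first hit.
    [ -n "$(find "$dir" -type f -size +0c -print | head -n 1)" ]
}

# Hypothetical helper: succeed only if the file exists and is strictly
# larger than the given number of bytes.
is_larger_than() {
    file="$1"
    min_bytes="$2"
    [ -f "$file" ] || return 1
    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -gt "$min_bytes" ]
}

# Example use, with placeholder paths as in the post:
#   has_nonempty_file "<PATH>" || echo "input directory is missing or empty"
#   is_larger_than "<PATH>/part-00000" 100 || echo "vector file looks empty"
```

Running the first check on the input directory before seqdirectory, and the second on part-00000 after seq2sparse, would flag a wrong path or an empty input instead of letting both commands finish with unusable output.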
