./cluster-reuters.sh --help
This script clusters the Reuters data set using a variety of algorithms. The data set is downloaded automatically.
The data is hosted online and downloaded from http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz (line 82 of the script).

Reading the script, I can see this at line 118:

    $MAHOUT kmeans \
      -i ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/ \
      -c ${WORK_DIR}/reuters-kmeans-clusters \
      -o ${WORK_DIR}/reuters-kmeans \
      -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
      -x 10 -k 20 -ow --clustering \

The path ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/ exists locally after the script fails, so I can see the input is being passed in, but I still get this error. Thoughts?
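
For reference, this is roughly how I am checking that the input path exists and is not empty. It is only a sketch: the function name check_input_dir is made up for illustration, and the WORK_DIR default is my assumption about what the script uses, so adjust it to your setup. It only looks at the local filesystem.

```shell
#!/bin/sh
# Sketch only: check_input_dir is a hypothetical helper, not part of Mahout.
# Assumption: WORK_DIR defaults to a per-user directory under /tmp.
WORK_DIR="${WORK_DIR:-/tmp/mahout-work-${USER:-user}}"

# Succeeds only if the directory exists locally and contains at least one entry.
check_input_dir() {
  dir="$1"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

INPUT="${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors"
if check_input_dir "$INPUT"; then
  echo "input present: $INPUT"
else
  echo "input missing or empty: $INPUT"
fi
```

In my case this reports the input as present, which is why the failure confuses me.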