./cluster-reuters.sh --help
This script clusters the Reuters data set using a variety of algorithms.
The data set is downloaded automatically.

The data is hosted online and is downloaded from (line 82 of the script):
http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz

Reading the script, I can see the kmeans call (line 118):
  $MAHOUT kmeans \
    -i ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/ \
    -c ${WORK_DIR}/reuters-kmeans-clusters \
    -o ${WORK_DIR}/reuters-kmeans \
    -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
    -x 10 -k 20 -ow --clustering \

The directory ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/ exists
locally after the script fails, so I can see it is being passed in, but I still
get this error.
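
For reference, here is a minimal sketch of how the local check can be reproduced. The helper name check_input_dir is hypothetical and not part of cluster-reuters.sh, and the WORK_DIR default is an assumption; substitute whatever value the script actually sets. It only inspects the local filesystem, so it says nothing about where the kmeans job itself resolves the path.

```shell
#!/bin/sh
# Hypothetical helper (not part of cluster-reuters.sh): succeeds only if the
# given path is a directory on the local filesystem with at least one entry.
check_input_dir() {
  dir=$1
  # Return 1 when the path is missing or is not a directory.
  [ -d "$dir" ] || { echo "not a local directory: $dir"; return 1; }
  # ls -A lists all entries except . and .. ; empty output means no files.
  [ -n "$(ls -A "$dir")" ] || { echo "directory is empty: $dir"; return 2; }
  echo "ok: $dir"
}

# Assumed default; use the WORK_DIR value from your copy of the script.
WORK_DIR=${WORK_DIR:-/tmp/mahout-work-${USER}}

# '|| true' keeps the snippet from aborting when the directory is absent.
check_input_dir "${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/tfidf-vectors" || true
```

A return status of 1 means the path is not a local directory, 2 means it exists but is empty, and 0 means it contains at least one entry.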

Thoughts?
