+1. The -s option got changed to -i some time back, and it looks like some of the $MAHOUT clusterdump invocations didn't get updated. I agree it needs fixing.
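
For anyone who wants to check the other example scripts for leftover -s usages, here is a rough sketch. The function name find_stale_s_flag is made up for illustration, and the pattern assumes the flag sits at the start of a continuation line, as it does in cluster-reuters.sh. It will miss a -s written on the same line as the command, and it may flag a legitimate -s belonging to some other tool, so treat the output as a list of candidates to inspect by hand.

```shell
# Hypothetical helper (not part of Mahout): print, with line numbers, every
# line of a script whose first token is the old -s option.
# Assumption: options are laid out one per continuation line, as in
# examples/bin/cluster-reuters.sh.
find_stale_s_flag() {
  grep -nE '^[[:space:]]*-s[[:space:]]' "$1"
}
```

Something like `find_stale_s_flag examples/bin/cluster-reuters.sh` should then list the lines worth a look.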

On 6/9/12 8:27 AM, Drew Farris wrote:
Hi All,

In kicking the tires of the 0.7 release, I've discovered that the
arguments for clusterdump in examples/bin/cluster-reuters.sh aren't
quite right.

When running what's checked in, I get:

12/06/09 08:10:47 ERROR common.AbstractJob: Unexpected -s while processing Job-Specific Options:
usage:<command>  [Generic Options] [Job-Specific Options]

The current dump commands look like:

   $MAHOUT clusterdump \
     -s ${WORK_DIR}/reuters-kmeans/clusters-*-final \
     -d ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
     -dt sequencefile -b 100 -n 20 --evaluate \
     -dm org.apache.mahout.common.distance.CosineDistanceMeasure \
     --pointsDir ${WORK_DIR}/reuters-kmeans/clusteredPoints

I think they should be:

   $MAHOUT clusterdump \
     -i ${WORK_DIR}/reuters-kmeans/clusters-*-final \
     -o ${WORK_DIR}/reuters-kmeans/clusters-dump -of TEXT \
     -d ${WORK_DIR}/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
     -dt sequencefile -b 100 -n 20 --evaluate \
     -dm org.apache.mahout.common.distance.CosineDistanceMeasure \
     --pointsDir ${WORK_DIR}/reuters-kmeans/clusteredPoints

Anyone opposed to getting this fix in for 0.7?

Drew


