Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT) Page: Creating Vectors from Weka's ARFF Format (https://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors+from+Weka%27s+ARFF+Format)
Edited by Joe Prasanna Kumar:
---------------------------------------------------------------------
h1. Introduction

Mahout can convert Weka's [ARFF|http://www.cs.waikato.ac.nz/~ml/weka/arff.html] (2.1) format to Mahout's Vector format.

h1. Running the Converter

ARFF files are converted with the org.apache.mahout.utils.arff.Driver program. To list its input arguments, run it with the \--help argument, which produces output similar to:

{noformat}
Usage:
 [--input <input> --output <output> --max <max> --help --dictOut <dictOut>
 --outputWriter <outputWriter> --delimiter <delimiter>]
Options
  --input (-d) input                  The file or directory containing the ARFF
                                      files. If it is a directory, all .arff
                                      files will be converted.
                                      (Mandatory parameter)
  --output (-o) output                The output directory. Files will have the
                                      same name as the input, but with the
                                      extension .mvc (Mandatory parameter)
  --max (-m) max                      The maximum number of vectors to output.
                                      If not specified, then it will loop over
                                      all docs (Optional parameter)
  --help (-h)                         Print out help (Optional parameter)
  --dictOut (-t) dictOut              The file to output the label bindings
                                      (Mandatory parameter)
  --outputWriter (-e) outputWriter    The VectorWriter to use, either seq
                                      (SequenceFileVectorWriter - default) or
                                      file (Writes to a File using JSON format)
                                      (Optional parameter)
  --delimiter (-l) delimiter          The delimiter for outputing the dictionary
                                      (Optional parameter)
{noformat}

Each parameter can be given in its long form, such as \--input, or by its equivalent short name, such as \-d.

From here, running the Driver is a matter of pointing it at an ARFF file or at a directory of ARFF files, and supplying the two other mandatory parameters (\-t and \-o):

{noformat}
$MAHOUT_HOME/bin/mahout arff.vector -d ./content/reuters-modapte/ \
  -t ./content/reuters-modapte/output/dict.txt -o ./content/reuters-modapte/output/convert
{noformat}
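To try the converter end to end without an existing data set, the following is a minimal sketch that writes a tiny ARFF file and then runs the Driver on it using the long option names from the help text above. The working directory {{./arff-demo}}, the file name {{weather.arff}}, and the sample attributes and rows are hypothetical and chosen only for illustration. The conversion step is skipped when {{MAHOUT_HOME}} does not point at a Mahout installation.

```shell
#!/bin/sh
# Sketch only: directory, file name, and sample data are illustrative assumptions.
WORK_DIR="${WORK_DIR:-./arff-demo}"
mkdir -p "$WORK_DIR/input" "$WORK_DIR/output"

# Write a small ARFF file: a header (@relation, @attribute) followed by @data rows.
cat > "$WORK_DIR/input/weather.arff" <<'EOF'
@relation weather

@attribute temperature numeric
@attribute humidity numeric
@attribute outlook {sunny, overcast, rainy}

@data
85,85,sunny
83,86,overcast
70,96,rainy
EOF

# Run the converter with the three mandatory parameters, if Mahout is available.
if [ -n "${MAHOUT_HOME:-}" ] && [ -x "${MAHOUT_HOME}/bin/mahout" ]; then
  "${MAHOUT_HOME}/bin/mahout" arff.vector \
    --input "$WORK_DIR/input" \
    --output "$WORK_DIR/output/convert" \
    --dictOut "$WORK_DIR/output/dict.txt"
else
  echo "MAHOUT_HOME is not set or bin/mahout is missing; skipping conversion" >&2
fi
```

Because \--input is given a directory here, every .arff file placed in {{$WORK_DIR/input}} would be picked up, as described in the help text.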
