Author: srowen
Date: Fri Sep 24 11:49:18 2010
New Revision: 1000822
URL: http://svn.apache.org/viewvc?rev=1000822&view=rev
Log:
MAHOUT-508
Added:
mahout/trunk/mahout/
mahout/trunk/mahout/conf/
mahout/trunk/mahout/conf/arff.vector.props
Modified:
mahout/trunk/conf/driver.classes.props
Modified: mahout/trunk/conf/driver.classes.props
URL:
http://svn.apache.org/viewvc/mahout/trunk/conf/driver.classes.props?rev=1000822&r1=1000821&r2=1000822&view=diff
==============================================================================
--- mahout/trunk/conf/driver.classes.props (original)
+++ mahout/trunk/conf/driver.classes.props Fri Sep 24 11:49:18 2010
@@ -12,6 +12,7 @@ org.apache.mahout.clustering.canopy.Cano
org.apache.mahout.math.hadoop.TransposeJob = transpose : Take the transpose of a matrix
org.apache.mahout.math.hadoop.MatrixMultiplicationJob = matrixmult : Take the produc of two matrices
org.apache.mahout.utils.vectors.lucene.Driver = lucene.vector : Generate Vectors from a Lucene index
+org.apache.mahout.utils.vectors.arff.Driver = arff.vector : Generate Vectors from an ARFF file or directory
org.apache.mahout.text.SequenceFilesFromDirectory = seqdirectory : Generate sequence files (of Text) from a directory
org.apache.mahout.text.SparseVectorsFromSequenceFiles = seq2sparse: Sparse Vector generation from Text sequence files
org.apache.mahout.utils.vectors.RowIdJob = rowid : Map SequenceFile<Text,VectorWritable> to {SequenceFile<IntWritable,VectorWritable>, SequenceFile<IntWritable,Text>}
Added: mahout/trunk/mahout/conf/arff.vector.props
URL:
http://svn.apache.org/viewvc/mahout/trunk/mahout/conf/arff.vector.props?rev=1000822&view=auto
==============================================================================
--- mahout/trunk/mahout/conf/arff.vector.props (added)
+++ mahout/trunk/mahout/conf/arff.vector.props Fri Sep 24 11:49:18 2010
@@ -0,0 +1,9 @@
+# The following parameters must be specified
+#d|input = /path/to/input
+#o|output = /path/to/output
+#t|dictOut = /path/to/dictionaryFileOrDirectory
+
+# The following parameters all have default values if not specified
+#m|max = <Max number of vectors to output. Defaults to Long.MAX_VALUE>
+#e|outputWriter <Defaults to 'seq' for SequenceFileVectorWriter or 'file' for JSON output>
+#l|delimiter <Delimiter for outputing the dictionary. Defaults to '\t'>
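
The keys in arff.vector.props follow a `short|long = value` pattern. This commit does not show how the driver consumes the file, so the following is only a minimal sketch of one plausible reading: each uncommented line becomes a `--long value` argument pair. The function name `props_to_args` and the sample values are hypothetical and are not part of Mahout.

```python
def props_to_args(text):
    """Turn 'short|long = value' lines into a flat argument list.

    Assumption: the name after '|' is the long option name. Blank lines
    and lines starting with '#' are skipped.
    """
    args = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        long_name = key.strip().split("|")[-1]
        args.append("--" + long_name)
        # Options written without a value are treated as bare flags.
        if sep and value.strip():
            args.append(value.strip())
    return args


# Hypothetical props content: the three required parameters uncommented,
# one optional parameter left commented out.
example = """\
# The following parameters must be specified
d|input = /path/to/input
o|output = /path/to/output
t|dictOut = /path/to/dictionaryFileOrDirectory

# The following parameters all have default values if not specified
#m|max = 100
"""

if __name__ == "__main__":
    print(props_to_args(example))
```

Because every line in the committed file starts with `#`, this sketch would produce an empty argument list for it until the required entries are uncommented and filled in.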