Try running mahout seq2sparse --help, which lists each option with a short description.

On Mon, Jun 27, 2011 at 1:36 PM, wine lover <[email protected]> wrote:

> Hello Everyone,
>
> When using seqdirectory to convert a directory of documents to SequenceFile
> format, it asks me to set the chunk size parameter:
> <-chunk <MAX SIZE OF EACH CHUNK in Megabytes> 64>
>
> In the build-reuters.sh example, the chunk size is set to 5, but I do not
> know why. Is this parameter input-dependent or system-dependent? Is there
> any rule for setting it?
>
> When using seq2sparse to create vectors from a SequenceFile, I notice that
> build-reuters.sh uses it as follows:
> $MAHOUT seq2sparse \
>   -i mahout-work/reuters-out-seqdir/ \
>   -o mahout-work/reuters-out-seqdir-sparse-lda \
>   -wt tf -seq -nr 3 \
>
> What does "-nr 3" stand for?
>
> Thanks,
>
> Wenyia
