Hello! I am having trouble running the "seq2sparse" example with TFIDF weights. My TF vectors are OK, but the TFIDF vectors are 10 times smaller. It looks like seq2sparse drops my terms during the TFxIDF step: Document1 has 20 terms in the TF vector, but only 2 terms in the TFIDF vector. What is wrong? I have spent 2 days looking for the answer and tuning the seq2sparse parameters ((
Thanks in advance!

mahout seq2sparse -ow \
    -chunk 512 \
    --maxDFPercent 90 \
    --maxNGramSize 1 \
    --numReducers 128 \
    --minSupport 150 \
    -i --- \
    -o --- \
    -wt tfidf \
    --namedVector \
    -a org.apache.lucene.analysis.WhitespaceAnalyzer

Pavel
