On Jul 2, 2009, at 12:09 PM, Allan Roberto Avendano Sudario wrote:

Regards,
This is the entire exception message:


java -cp $JAVACLASSPATH org.apache.mahout.utils.vectors.Driver --dir
/home/hadoop/Desktop/<urls>/index  --field content  --dictOut
/home/hadoop/Desktop/dictionary/dict.txt --output
/home/hadoop/Desktop/dictionary/out.txt --max 50 --norm 2


09/07/02 09:35:47 INFO vectors.Driver: Output File:
/home/hadoop/Desktop/dictionary/out.txt
09/07/02 09:35:47 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
09/07/02 09:35:47 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
09/07/02 09:35:47 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.lang.NullPointerException
       at org.apache.mahout.utils.vectors.lucene.LuceneIteratable$TDIterator.next(LuceneIteratable.java:111)
       at org.apache.mahout.utils.vectors.lucene.LuceneIteratable$TDIterator.next(LuceneIteratable.java:82)
       at org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter.write(SequenceFileVectorWriter.java:25)
       at org.apache.mahout.utils.vectors.Driver.main(Driver.java:204)


Well, I used a Nutch crawl index, is that correct? Hmm... I have changed it to the
content field, but nothing happened.
Possibly the Nutch crawl doesn't have Term Vectors indexed.
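One way to confirm is to open the index with plain Lucene and ask for a term vector directly; if it comes back null, the field was indexed without term vectors, which matches the NullPointerException above. A minimal sketch against the Lucene 2.x API of that era (the index path, field name, and document number 0 are just placeholders):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.store.FSDirectory;

public class CheckTermVectors {
  public static void main(String[] args) throws Exception {
    // Path to the Lucene index produced by the Nutch crawl (placeholder).
    IndexReader reader = IndexReader.open(
        FSDirectory.getDirectory("/home/hadoop/Desktop/crawl/index"));
    try {
      // Ask for the term vector of the "content" field on the first document.
      TermFreqVector tfv = reader.getTermFreqVector(0, "content");
      if (tfv == null) {
        System.out.println("No term vector stored for field 'content'");
      } else {
        System.out.println("Term vector present with " + tfv.size() + " terms");
      }
    } finally {
      reader.close();
    }
  }
}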

This would be my guess. A small edit to the Nutch code would probably allow it. Just find where it creates a new Field and add in the term vector option.
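For reference, the change amounts to using the Field constructor that takes a TermVector argument wherever Nutch builds the content field. A minimal sketch of the idea (the class, the store/analyze flags, and the surrounding document-building code are assumptions; only the TermVector parameter is the point):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Sketch: build the "content" field with term vectors enabled so the
// Mahout lucene vector Driver has something to iterate over.
public class TermVectorFieldExample {
  public static Document buildDoc(String content) {
    Document doc = new Document();
    doc.add(new Field("content", content,
        Field.Store.NO,           // whether Nutch stores the raw text is an assumption
        Field.Index.ANALYZED,     // analyzed/tokenized, as for normal full-text search
        Field.TermVector.YES));   // the key addition: index term vectors for this field
    return doc;
  }
}

After reindexing with that change, the --field content option in the Driver command above should find term vectors instead of hitting the NullPointerException.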
