[ https://issues.apache.org/jira/browse/MAHOUT-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll reassigned MAHOUT-546:
--------------------------------------

    Assignee: Grant Ingersoll

> Bug creating vector from Solr index with TrieFields
> ---------------------------------------------------
>
>                 Key: MAHOUT-546
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-546
>             Project: Mahout
>          Issue Type: Bug
>          Components: Utils
>    Affects Versions: 0.4
>            Reporter: G Fernandes
>            Assignee: Grant Ingersoll
>
> A Solr index with an Id field of type TrieLong causes NPE when trying to extract vectors from the lucene index
> Command line used:
> ../bin/mahout lucene.vector -d ~/solr/data/index/ -o vectors.vec -t d.dic -f text -i articleId
> where 'articleId' is a Stored numeric field declared as 'tlong' in Solr.
> Output:
> no HADOOP_HOME set, running locally
> Nov 16, 2010 2:59:50 PM org.slf4j.impl.JCLLoggerAdapter info
> INFO: Output File: moreover.vec
> Nov 16, 2010 2:59:51 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
> WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Nov 16, 2010 2:59:51 PM org.apache.hadoop.io.compress.CodecPool getCompressor
> INFO: Got brand-new compressor
> Exception in thread "main" java.lang.NullPointerException
>       at java.io.DataOutputStream.writeUTF(DataOutputStream.java:330)
>       at java.io.DataOutputStream.writeUTF(DataOutputStream.java:306)
>       at org.apache.mahout.math.VectorWritable.write(VectorWritable.java:125)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>       at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>       at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.append(SequenceFile.java:1128)
>       at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:977)
>       at org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter.write(SequenceFileVectorWriter.java:46)
>       at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:226)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>       at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>       at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
> The problem is at class LuceneIteratable line 130:
> name = indexReader.document(doc, idFieldSelector).get(idField);
> It does not work with the way Solr stores numeric fields in the index.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
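
For readers following the trace: the top frames show DataOutputStream.writeUTF failing with a NullPointerException, which is what happens when it is handed a null String. That fits the report if the id lookup quoted above returns null for the 'tlong' field, so the vector name is null by the time VectorWritable.write runs. The sketch below reproduces only that mechanism using the JDK, with no Lucene or Mahout classes. The class NullNameSketch and the helper nameOrFallback are hypothetical names invented for illustration, and falling back to the document number is an assumption, not the fix that was applied to Mahout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullNameSketch {

    // Hypothetical guard: use the stored id when it could be read as a String,
    // otherwise fall back to the document number so the name is never null.
    public static String nameOrFallback(String storedId, int docNumber) {
        return storedId != null ? storedId : String.valueOf(docNumber);
    }

    // Mimics the failing step in the trace: writeUTF throws a
    // NullPointerException when given a null String.
    public static boolean writesCleanly(String name) throws IOException {
        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            out.writeUTF(name);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: this is what the id lookup yields for the 'tlong' field.
        String storedId = null;
        System.out.println(writesCleanly(storedId));                      // false
        System.out.println(writesCleanly(nameOrFallback(storedId, 42)));  // true
    }
}
```

A guard like this only avoids the crash. Recovering the actual numeric id would require decoding the stored value in whatever form Solr wrote it, which this sketch does not attempt.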