Hi Grant,

I applied the patch but still no luck. In debugging, I found that in LuceneIterable, line 129:

<< result = result.normalize(normPower); >>

seems to turn result, which was previously a NamedVector, back into a plain Vector, which causes the name to be lost. If I change the code to keep the name by replacing the line with:

<< result = new NamedVector(result.normalize(normPower), name); >>

then the name is included and the result remains a NamedVector, but the VectorDumper code still just prints out Vectors and not NamedVectors. Perhaps I am going about this the wrong way, but shouldn't there be a check in the VectorDumper to find out the type of vector being dumped?

Thanks,
Kris

2010/6/30 Grant Ingersoll <[email protected]>

> Kris,
>
> Can you try the patch at
> https://issues.apache.org/jira/secure/attachment/12448396/MAHOUT-379-lucene.patch
>
> Thanks,
> Grant
>
> On Jun 30, 2010, at 8:53 AM, Grant Ingersoll wrote:
>
> >
> > On Jun 30, 2010, at 8:39 AM, Grant Ingersoll wrote:
> >
> >>
> >> On Jun 29, 2010, at 1:54 PM, Kris Jack wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I have been using mahout to generate vectors from a lucene index using:
> >>>
> >>> $MAHOUT_HOME/bin/mahout lucene.vector
> >>>
> >>> In doing so, mahout creates an output file that has new ids for my
> >>> documents, that are completely unlike my original --idField, that is a
> >>> string. How can I relate the new ids to my original ids? Is there a
> >>> method that allows me to output the vectors with the original --idField
> >>> values that appear in the lucene index rather than the new doc ids?
> >>
> >>
> >> Hmm, it seems the --idField stuff has been commented out, likely with
> >> the change of labels.
> >>
> >
> > I've brought the issue up over on dev@, as it is a bug.
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
>

--
Dr Kris Jack,
http://www.mendeley.com/profiles/kris-jack/

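P.S. To make the two points above concrete, here is a rough, self-contained sketch. It does not use Mahout's own classes: SimpleVector, SimpleNamedVector and describe() are hypothetical stand-ins I made up for illustration. It only shows the pattern I mean, assuming normalize() returns a plain vector and the dumper has to check the runtime type to print the name.

```java
import java.util.Arrays;

// Illustration only: these are stand-in types, NOT Mahout's Vector/NamedVector/VectorDumper.
class NamedVectorSketch {

    /** Minimal dense vector stand-in (hypothetical). */
    static class SimpleVector {
        final double[] values;

        SimpleVector(double... values) {
            this.values = values.clone();
        }

        /** Returns a new plain vector scaled by its p-norm; any name is not carried over. */
        SimpleVector normalize(double power) {
            double sum = 0.0;
            for (double v : values) {
                sum += Math.pow(Math.abs(v), power);
            }
            double norm = Math.pow(sum, 1.0 / power);
            double[] out = new double[values.length];
            for (int i = 0; i < values.length; i++) {
                out[i] = (norm == 0.0) ? 0.0 : values[i] / norm;
            }
            return new SimpleVector(out);
        }
    }

    /** Vector that also carries a name, e.g. the original document id (hypothetical). */
    static class SimpleNamedVector extends SimpleVector {
        final String name;

        SimpleNamedVector(SimpleVector delegate, String name) {
            super(delegate.values);
            this.name = name;
        }
    }

    /** Dumper-style helper: checks the runtime type and prefixes the name when there is one. */
    static String describe(SimpleVector v) {
        String body = Arrays.toString(v.values);
        if (v instanceof SimpleNamedVector) {
            return ((SimpleNamedVector) v).name + ":" + body;
        }
        return body;
    }

    public static void main(String[] args) {
        SimpleVector named = new SimpleNamedVector(new SimpleVector(3.0, 4.0), "doc-42");

        // Normalizing alone yields a plain vector, so the name is gone.
        SimpleVector lost = named.normalize(2.0);

        // Re-wrapping the normalized vector keeps the name.
        SimpleVector kept = new SimpleNamedVector(named.normalize(2.0), "doc-42");

        System.out.println(describe(lost));
        System.out.println(describe(kept));
    }
}
```

In the sketch, the first line printed has no name and the second is prefixed with it, which is the behaviour I was hoping to get from VectorDumper once the vector is re-wrapped.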