Re: Reading Vectors Created from a Lucene Index

Kris Jack Thu, 01 Jul 2010 07:32:50 -0700

Hi Grant,

I applied the patch but still no luck.  In debugging, I found that in
LuceneIterable, line 129:


<<
  result = result.normalize(normPower);
>>

seems to turn result, which was previously a NamedVector, back into a plain
Vector, so the name is lost.  If I change the code to keep the name by
replacing the line with:

<<
  result = new NamedVector(result.normalize(normPower), name);
>>

then the name is included and the result remains a NamedVector, but the
VectorDumper code still just prints out Vectors and not NamedVectors.
Perhaps I am going about this wrong, but shouldn't there be a check in the
VectorDumper to find out the type of vector being dumped?
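
As a rough illustration of both points, here is a minimal, self-contained
sketch.  It does not use the Mahout classes: SimpleVector, NamedSimpleVector
and describe() are hypothetical stand-ins invented for this example, and the
real Vector, NamedVector and VectorDumper may well behave differently.  It
only shows why a normalize() that returns a plain vector drops the name, and
what a type check in a dumper could look like.

```java
import java.util.Arrays;

public class NamedVectorSketch {

    // Hypothetical stand-in for a plain vector; NOT Mahout's Vector.
    static class SimpleVector {
        final double[] values;

        SimpleVector(double[] values) {
            this.values = values.clone();
        }

        // Returns a NEW plain vector scaled by its p-norm.
        // Because the result is a SimpleVector, any name is dropped.
        SimpleVector normalize(double power) {
            double sum = 0.0;
            for (double v : values) {
                sum += Math.pow(Math.abs(v), power);
            }
            double norm = Math.pow(sum, 1.0 / power);
            double[] scaled = new double[values.length];
            for (int i = 0; i < values.length; i++) {
                scaled[i] = norm == 0.0 ? 0.0 : values[i] / norm;
            }
            return new SimpleVector(scaled);
        }
    }

    // Hypothetical stand-in for a vector that carries a name (e.g. a doc id).
    static class NamedSimpleVector extends SimpleVector {
        final String name;

        NamedSimpleVector(SimpleVector delegate, String name) {
            super(delegate.values);
            this.name = name;
        }
    }

    // What a dumper-side type check could look like: print the name if present.
    static String describe(SimpleVector v) {
        String body = Arrays.toString(v.values);
        if (v instanceof NamedSimpleVector) {
            return ((NamedSimpleVector) v).name + ": " + body;
        }
        return body;
    }

    public static void main(String[] args) {
        SimpleVector named =
            new NamedSimpleVector(new SimpleVector(new double[] {3.0, 4.0}), "doc-42");

        // Name is lost: normalize() hands back a plain SimpleVector.
        SimpleVector lost = named.normalize(2.0);

        // Name is kept: re-wrap the normalized result, as in my change above.
        SimpleVector kept = new NamedSimpleVector(named.normalize(2.0), "doc-42");

        System.out.println(describe(lost));
        System.out.println(describe(kept));
    }
}
```

With the re-wrapping in place, describe() prints the name in front of the
values; without it, only the values come out, which matches what I am seeing
from the dumper.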

Thanks,
Kris



2010/6/30 Grant Ingersoll <[email protected]>

> Kris,
>
> Can you try the patch at
> https://issues.apache.org/jira/secure/attachment/12448396/MAHOUT-379-lucene.patch
>
> Thanks,
> Grant
>
> On Jun 30, 2010, at 8:53 AM, Grant Ingersoll wrote:
>
> >
> > On Jun 30, 2010, at 8:39 AM, Grant Ingersoll wrote:
> >
> >>
> >> On Jun 29, 2010, at 1:54 PM, Kris Jack wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I have been using mahout to generate vectors from a lucene index using:
> >>>
> >>> $MAHOUT_HOME/bin/mahout lucene.vector
> >>>
> >>> In doing so, mahout creates an output file that has new ids for my
> >>> documents that are completely unlike my original --idField, which is a
> >>> string.  How can I relate the new ids to my original ids?  Is there a
> >>> method that allows me to output the vectors with the original --idField
> >>> values that appear in the lucene index rather than the new doc ids?
> >>
> >>
> >> Hmm, it seems the --idField stuff has been commented out, likely with
> >> the change of labels.
> >>
> >
> > I've brought the issue up over on dev@, as it is a bug.
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
>
>


-- 
Dr Kris Jack,
http://www.mendeley.com/profiles/kris-jack/
