FWIW, Grant, I only see termIdx used in two places:

TFDFMapper.map(), where it's used as an index into a vector
JWriterTermInfoWriter.write(), where it is merely output, not really used
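
To make the two roles concrete, here is a rough sketch of how an index like termIdx typically ties a dictionary to a vector. It is not Mahout's code: SketchTermEntry and TermIdxSketch are hypothetical names that only mirror the (String term, int termIdx, int docFreq) constructor shape, and the sketch assumes the indexes are dense (0 to n-1 with no gaps).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a dictionary entry; NOT Mahout's actual TermEntry.
final class SketchTermEntry {
    final String term;
    final int termIdx;  // position of this term in every document vector
    final int docFreq;  // number of documents that contain the term

    SketchTermEntry(String term, int termIdx, int docFreq) {
        this.term = term;
        this.termIdx = termIdx;
        this.docFreq = docFreq;
    }
}

final class TermIdxSketch {
    // Writing side: termIdx picks the slot in a dense term-count vector.
    // Assumes indexes run from 0 to dict.size() - 1 with no gaps.
    static double[] toVector(Map<String, SketchTermEntry> dict, List<String> tokens) {
        double[] vector = new double[dict.size()];
        for (String token : tokens) {
            SketchTermEntry entry = dict.get(token);
            if (entry != null) {  // tokens missing from the dictionary are skipped
                vector[entry.termIdx] += 1.0;
            }
        }
        return vector;
    }

    // Reading side: invert the mapping so a vector position can be shown as a term.
    static String[] indexToTerm(Map<String, SketchTermEntry> dict) {
        String[] terms = new String[dict.size()];
        for (SketchTermEntry entry : dict.values()) {
            terms[entry.termIdx] = entry.term;
        }
        return terms;
    }

    // Small made-up dictionary, for illustration only.
    static Map<String, SketchTermEntry> sampleDictionary() {
        Map<String, SketchTermEntry> dict = new HashMap<>();
        dict.put("foobar", new SketchTermEntry("foobar", 0, 3));
        dict.put("baz", new SketchTermEntry("baz", 1, 1));
        dict.put("qux", new SketchTermEntry("qux", 2, 2));
        return dict;
    }

    public static void main(String[] args) {
        Map<String, SketchTermEntry> dict = sampleDictionary();
        double[] vector = toVector(dict, Arrays.asList("foobar", "baz", "foobar"));
        String[] terms = indexToTerm(dict);
        for (int i = 0; i < vector.length; i++) {
            System.out.println("position " + i + " is \"" + terms[i] + "\", count " + vector[i]);
        }
    }
}
```

If you generate the vectors yourself, the part that matters is that the index stored in the dictionary is the same one used when filling the vector. Otherwise the printed terms will not line up with the vector positions.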

On Wed, Sep 23, 2009 at 4:32 PM, Grant Ingersoll <[email protected]> wrote:
> The term entries are used to map the text to a position in the Vector.  So,
> the readDictionary is just loading up that mapping such that when it
> examines the vector it can print out that term 14534 is really "foobar", or
> whatever.
>
> There may be an abstraction to be made here, but I'd have to dig a little
> deeper into the code to say for sure.
>
>
> On Sep 23, 2009, at 4:58 PM, Jack Tanner wrote:
>
>>
>> The TermEntry constructor is (String term, int termIdx, int docFreq).
>> What's the point of termIdx? I see that it gets used for an assert in
>> LDAPrintTopics.java:readDictionary(), but it seems redundant otherwise.
>> (Background: I'd like to generate vectors for LDA directly, bypassing
>> Lucene. Following o.a.m.utils.vectors.lucene.Driver, I see that I need to
>> generate a dictionary file for the "printing out top terms per topic" step.
>> This uses TermInfo, which contains lots of TermEntry elements.)
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
>
>
