FWIW Grant, I only see termIdx used in two places:

- TFDFMapper.map(), where it's used as an index into a vector
- JWriterTermInfoWriter.write(), where it is merely written out, not really used

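
To make the "index into a vector" role concrete, here is a rough sketch of how a dictionary could be rebuilt from (term, termIdx, docFreq) triples so that a vector position can be printed as a term. This is not Mahout code: the Entry class and buildDictionary() are hypothetical stand-ins, and the sketch assumes the indices are dense (0..n-1).

```java
import java.util.Arrays;
import java.util.List;

public class TermIndexSketch {

    // Hypothetical stand-in for a (term, termIdx, docFreq) entry; not Mahout's TermEntry.
    public static final class Entry {
        final String term;
        final int termIdx;
        final int docFreq;

        public Entry(String term, int termIdx, int docFreq) {
            this.term = term;
            this.termIdx = termIdx;
            this.docFreq = docFreq;
        }
    }

    // Builds an index -> term lookup. Assumes indices are dense: 0..entries.size()-1.
    public static String[] buildDictionary(List<Entry> entries) {
        String[] dictionary = new String[entries.size()];
        for (Entry e : entries) {
            if (e.termIdx < 0 || e.termIdx >= dictionary.length) {
                throw new IllegalArgumentException("termIdx out of range: " + e.termIdx);
            }
            // termIdx decides the slot, so the order of the entries does not matter.
            dictionary[e.termIdx] = e.term;
        }
        return dictionary;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("foobar", 1, 3),
                new Entry("baz", 0, 7));
        String[] dictionary = buildDictionary(entries);

        // A vector position can now be reported as a term.
        int position = 1;
        System.out.println("term " + position + " is really \"" + dictionary[position] + "\"");
    }
}
```

If the entries are always written in index order, termIdx carries no extra information beyond the position in the file, which is why it can look redundant. Storing it explicitly only matters if the order is not guaranteed.
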
On Wed, Sep 23, 2009 at 4:32 PM, Grant Ingersoll <[email protected]> wrote:

> The term entries are used to map the text to a position in the Vector. So,
> the readDictionary is just loading up that mapping such that when it
> examines the vector it can print out that term 14534 is really "foobar", or
> whatever.
>
> There may be an abstraction to be made here, but I'd have to dig a little
> deeper into the code to say for sure.
>
>
> On Sep 23, 2009, at 4:58 PM, Jack Tanner wrote:
>
>> The TermEntry constructor is (String term, int termIdx, int docFreq).
>> What's the point of termIdx? I see that it gets used for an assert in
>> LDAPrintTopics.java:readDictionary(), but it seems redundant otherwise.
>> (Background: I'd like to generate vectors for LDA directly, bypassing
>> Lucene. Following o.a.m.utils.vectors.lucene.Driver, I see that I need to
>> generate a dictionary file for the "printing out top terms per topic" step.
>> This uses TermInfo, which contains lots of TermEntry elements.)
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
