On Thu, Aug 21, 2008 at 7:20 PM, David Lee <[EMAIL PROTECTED]> wrote: > Clarification question: > > If I don't store term vectors, then I: > -- won't have information on the position of matching terms > -- I don't have the term frequency vector > > -- but I should still have the frequency of terms per document in the .frq > file, right? > > So what's the difference between the term frequency vector and the > information saved in the .frq file?
It's how the data can be efficiently accessed... by term or by document. Lucene is naturally an inverted index, and thus makes it easy to ask "what documents contain this term". Term vectors store the term information indexed by document and make it easy to ask "what terms does this specific document have". -Yonik --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]