On 17 Mar 2007, at 08:15, Doron Cohen wrote:
"Mike Klaas" <[EMAIL PROTECTED]> wrote on 16/03/2007 14:26:46:
On 3/15/07, karl wettin <[EMAIL PROTECTED]> wrote:
I propose changing the current IndexReader.getTermFreqVector(s)
code so that it /always/ returns the vector space model of a
document, even when fields are set to Field.TermVector.NO.
Is that crazy? It could be really slow, but apart from that... And
whether it is cached can be found out by inspecting the fields.
People don't go fetching term vectors without knowing what they are
doing, do they?
The highlighting contrib code does this: attempt to retrieve the
term vector, catch IllegalArgumentException, fall back to re-analysis
of the data.
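
A minimal sketch of that try-then-fall-back pattern, for illustration only.
Nothing here is the Lucene or highlighter API: StoredVectors, reanalyze and
termFreqs are hypothetical names, the "analysis" is a plain split on
non-word characters, and IllegalArgumentException is assumed as the signal
that no term vector was stored.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Lucene's API: every name below is made up.
class TermVectorFallback {

    /** Stand-in for a term vector store; assumed to throw when nothing was stored. */
    interface StoredVectors {
        Map<String, Integer> get(int docId, String field);
    }

    /** Re-analysis: lowercase the text, split on non-word characters, count the terms. */
    static Map<String, Integer> reanalyze(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    /** Try the stored vector first; rebuild it from the text if none was stored. */
    static Map<String, Integer> termFreqs(StoredVectors store, int docId,
                                          String field, String text) {
        try {
            return store.get(docId, field);
        } catch (IllegalArgumentException noVectorStored) {
            return reanalyze(text);
        }
    }
}
```

The caller gets term frequencies either way; only the cost differs.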
This way makes more sense to me. IndexReader.getTermFreqVector()
means it's there, just bring it.
The way I look at it, the vector space model is there all the time and
Field.TermVector.YES really means Field.TermVector.Level1Cached.
Also, I would not mind a soft-referenced map in IndexReader that keeps
track of all resolved term vectors. Perhaps that should be a decorator.
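
Roughly what I have in mind for the soft-referenced map, as a sketch only.
VectorSource and Caching are hypothetical names, not existing Lucene
classes, and a real version would decorate IndexReader rather than this
made-up interface.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Lucene's API: every name below is made up.
class SoftVectorCache {

    /** Stand-in for anything that can resolve a term vector, however slowly. */
    interface VectorSource {
        Map<String, Integer> resolve(int docId, String field);
    }

    /** Decorator that remembers resolved vectors until the GC needs the memory. */
    static final class Caching implements VectorSource {
        private final VectorSource delegate;
        private final Map<String, SoftReference<Map<String, Integer>>> cache =
                new ConcurrentHashMap<>();

        Caching(VectorSource delegate) {
            this.delegate = delegate;
        }

        @Override
        public Map<String, Integer> resolve(int docId, String field) {
            String key = docId + "/" + field;
            SoftReference<Map<String, Integer>> ref = cache.get(key);
            Map<String, Integer> vector = (ref == null) ? null : ref.get();
            if (vector == null) {
                // Never resolved, or the soft reference was cleared: resolve again.
                vector = delegate.resolve(docId, field);
                cache.put(key, new SoftReference<>(vector));
            }
            return vector;
        }
    }
}
```

Two threads that miss at the same time may both resolve the same vector;
that is wasted work but harmless for a cache like this.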
--
karl