Hi,

Term vectors are, in a sense, duplicate information. They are used to quickly
retrieve, *per document*, all vectors for *one field*. This means you get the
positions, offsets, and frequencies for the requested document as one blob,
like a stored field, which can be used e.g. for MoreLikeThis or highlighting
(FastVectorHighlighter also needs term vectors).

The relationship is the same as the one between indexed fields and stored
fields (in fact, the information stored when you enable term vectors during
indexing is similar to a stored field: think of it as a binary stored field
containing all vectors for the corresponding document).
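
To make the difference concrete, here is a toy sketch in plain Java. It is
not Lucene's API or its on-disk format; the class TermVectorSketch and all of
its method names are made up for illustration, and the tokenizer just
lowercases and splits on whitespace. It shows a per-field inverted index
(term -> documents) next to a per-document term vector (document -> terms),
and what it takes to rebuild the latter from the former.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical names); not how Lucene stores its data. */
final class TermVectorSketch {
    // Inverted index: field -> term -> docId -> positions.
    private final Map<String, Map<String, Map<Integer, List<Integer>>>> postings = new TreeMap<>();
    // Term vectors: docId -> field -> term -> positions (only if requested).
    private final Map<Integer, Map<String, Map<String, List<Integer>>>> termVectors = new TreeMap<>();

    void addField(int docId, String field, String text, boolean storeTermVector) {
        String[] tokens = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            String term = tokens[pos];
            postings.computeIfAbsent(field, f -> new TreeMap<>())
                    .computeIfAbsent(term, t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
            if (storeTermVector) {
                // The same information again, organized by document.
                termVectors.computeIfAbsent(docId, d -> new TreeMap<>())
                           .computeIfAbsent(field, f -> new TreeMap<>())
                           .computeIfAbsent(term, t -> new ArrayList<>())
                           .add(pos);
            }
        }
    }

    /** Search path: a single lookup in the postings of one field. */
    List<Integer> search(String field, String term) {
        Map<String, Map<Integer, List<Integer>>> terms =
                postings.getOrDefault(field, Collections.emptyMap());
        return new ArrayList<>(terms.getOrDefault(term, Collections.emptyMap()).keySet());
    }

    /** Term-vector path: a single lookup by document and field. */
    Map<String, List<Integer>> termVector(int docId, String field) {
        return termVectors.getOrDefault(docId, Collections.emptyMap())
                          .getOrDefault(field, Collections.emptyMap());
    }

    /** Without a term vector: walk every term of the field, keep those hitting docId. */
    Map<String, List<Integer>> rebuildFromPostings(int docId, String field) {
        Map<String, List<Integer>> vector = new TreeMap<>();
        Map<String, Map<Integer, List<Integer>>> terms =
                postings.getOrDefault(field, Collections.emptyMap());
        for (Map.Entry<String, Map<Integer, List<Integer>>> e : terms.entrySet()) {
            List<Integer> positions = e.getValue().get(docId);
            if (positions != null) {
                vector.put(e.getKey(), positions);
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        TermVectorSketch index = new TermVectorSketch();
        index.addField(0, "subject", "Requisition request", false);
        index.addField(0, "body", "Attached is an Urgent requisition request", true);

        System.out.println(index.search("subject", "urgent"));
        System.out.println(index.search("body", "urgent"));
        System.out.println(index.termVector(0, "body"));
        System.out.println(index.rebuildFromPostings(0, "body"));
    }
}
```

In this sketch both routes give the same answer for the "body" field, but
rebuildFromPostings has to visit every term of the field, while termVector is
one lookup. That is the sense in which the information is duplicated: it can
be deduced from the index, just not cheaply per document.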

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]


> -----Original Message-----
> From: sol myr [mailto:[email protected]]
> Sent: Tuesday, October 04, 2011 12:08 PM
> To: [email protected]
> Subject: Re: [Lucene] Frequencies and positions - are they stored per field?
> 
> Thanks a lot.
> But then what's the added value of Field.TermVector?
> 
> Can't it be deduced from the overall Lucene index? Or is it just
> inefficient to deduce?
> 
> Thanks again :)
> 
> 
> 
> ----- Original Message -----
> From: Uwe Schindler <[email protected]>
> To: [email protected]; 'sol myr' <[email protected]>
> Cc:
> Sent: Tuesday, October 4, 2011 11:53 AM
> Subject: RE: [Lucene] Frequencies and positions - are they stored per field?
> 
> Lucene always uses a field; a query using a term without a field is
> impossible. See each field as a parallel inverted index; all statistics
> are per field, too. If you pass a query without a field name to
> QueryParser, it will choose the default field, which is given when
> creating the QueryParser.
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
> 
> 
> > -----Original Message-----
> > From: sol myr [mailto:[email protected]]
> > Sent: Tuesday, October 04, 2011 11:46 AM
> > To: lucene
> > Subject: [Lucene] Frequencies and positions - are they stored per field?
> >
> >
> >
> > Hi,
> >
> > I use Lucene, but am not familiar with its internals.
> > I'd appreciate help understanding whether Term Frequencies and
> > Positions are stored per Document or per Field?
> > On the one hand, I never ask for "Field.TermVector" because I read
> > it's only required for "MoreLikeThis" (which I don't need).
> > On the other hand, my searches *are* based on fields...
> >
> > Here's my code:
> > // Write (without Field.TermVector):
> >
> > Document doc = new Document();
> > doc.add(new Field("subject", "Requisition request",
> >     Store.YES, Index.ANALYZED));
> > doc.add(new Field("body", "Attached is an Urgent requisition request",
> >     Store.YES, Index.ANALYZED));
> > write.addDocument(doc);
> >
> > // And my Query:
> > Query query=parser.parse("subject : urgent");
> >
> > Now how does Lucene manage this query?
> > I asked it to search the "subject" Field.
> > But if the "inverted index" doesn't keep fields, it would only
> > remember that "The term 'Urgent' appears in SOME FIELD of document#1"...
> > Isn't it true?
> >
> > If so, how would it make sure to retrieve only documents that match in
> > the Subject ?
> >
> > Thanks.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> 
> 


