Query in a doc context

2017-12-14 Thread Vadim Gindin
Hi all. As far as I understand, all Queries (or most of them?) are single-field oriented. They may implement different search/score logic, but they are intended for a single field. For example, a simple TermQuery or PhraseQuery. If I need to search across different fields, I should use Bo

Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
Hi all, I have a question about the API, particularly about the terminology it uses. 1. LeafReader. Why does it start with "Leaf"? Should I understand that such a reader is intended for reading only one leaf of the index tree? Does it mean that it is working inside a context (LeafReaderContext) of several document

Re: Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
I made a mistake in question 5. It is PostingsEnum that has many implementations, not DocIdSetIterator. Please read question 5 as follows. 5. Should I use a concrete implementation of PostingsEnum? When does that make sense? Or should I always get a PostingsEnum as a result of a call TermEnum

Re: Tracking that all query terms are matched in one document

2017-12-14 Thread Vadim Gindin
Thank you. On Wed, Dec 13, 2017 at 3:32 PM, Mikhail Khludnev wrote: > There are two algorithms for scoring disjunctions: term-at-a-time and doc-at-a-time. > The former was called BooleanScorer and the latter was called > BooleanScorer2. > I remember that they were drastically renamed and/or replaced with > B
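
The two disjunction-scoring strategies named in the quoted reply can be illustrated with a toy model. The sketch below is plain Java and does not use Lucene; the class name `DisjunctionScoring`, the postings layout (one sorted `int[]` of doc ids per term), and the fixed score of 1.0 per matching term are all assumptions made for illustration, not how BooleanScorer or BooleanScorer2 are implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DisjunctionScoring {

    // Term-at-a-time: walk each term's postings list fully, adding its
    // contribution into a per-document accumulator.
    public static Map<Integer, Double> termAtATime(List<int[]> postings) {
        Map<Integer, Double> acc = new TreeMap<>();
        for (int[] list : postings) {
            for (int doc : list) {
                acc.merge(doc, 1.0, Double::sum); // assumed score: 1.0 per term
            }
        }
        return acc;
    }

    // Doc-at-a-time: advance all postings lists in parallel, finishing the
    // score of the smallest current doc id before moving to the next one.
    public static Map<Integer, Double> docAtATime(List<int[]> postings) {
        Map<Integer, Double> result = new TreeMap<>();
        int[] pos = new int[postings.size()]; // cursor into each sorted list
        while (true) {
            int minDoc = Integer.MAX_VALUE;
            for (int i = 0; i < postings.size(); i++) {
                int[] list = postings.get(i);
                if (pos[i] < list.length) {
                    minDoc = Math.min(minDoc, list[pos[i]]);
                }
            }
            if (minDoc == Integer.MAX_VALUE) {
                break; // every list is exhausted
            }
            double score = 0.0;
            for (int i = 0; i < postings.size(); i++) {
                int[] list = postings.get(i);
                if (pos[i] < list.length && list[pos[i]] == minDoc) {
                    score += 1.0;
                    pos[i]++;
                }
            }
            result.put(minDoc, score);
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> postings = Arrays.asList(new int[] {1, 3, 5}, new int[] {3, 4, 5});
        System.out.println(termAtATime(postings)); // {1=1.0, 3=2.0, 4=1.0, 5=2.0}
        System.out.println(docAtATime(postings));  // {1=1.0, 3=2.0, 4=1.0, 5=2.0}
    }
}
```

Both methods return the same scores; they differ only in traversal order. The doc-at-a-time variant knows a document's final score as soon as it leaves that document, whereas the term-at-a-time variant needs the accumulator until all terms are processed.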

Re: Query in a doc context

2017-12-14 Thread Mikhail Khludnev
Hello, Vadim. Please find my answers inline. On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin wrote: > Hi all. > > As far as I understand, all Queries (or most of them?) are single-field > oriented. They may implement different search/score logic, but they are > intended for a single field. For example, simple

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Thanks, Mikhail. Could you explain your points in more detail? Vadim On Thu, Dec 14, 2017 at 7:08 PM, Mikhail Khludnev wrote: > Hello, Vadim. > > Please find inline. > > On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin > wrote: > > > Hi all. > > > > As I can understand. All Queries (or most o

Re: Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Mikhail Khludnev
Vadim, I suppose https://vimeo.com/32065505 is a good old explanation of all the Lucene API dimensions. It covers most of your questions. FWIW, a leaf is a segment, and postings are a list of occurrences. Regarding attributes in postings, iirc they are only used in some suggester, but now I even can't find
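
The remark that a leaf is a segment and postings are a list of occurrences can be pictured with a small model. This is a plain-Java sketch, not Lucene code: the names `SegmentedIndex`, `Segment`, and `globalDocs` are hypothetical, and the only detail borrowed from Lucene is the idea behind `LeafReaderContext.docBase`, where each segment numbers its documents from 0 and a per-segment base offset maps them to index-wide ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SegmentedIndex {

    // One toy segment ("leaf"): its postings use segment-local doc ids.
    public static final class Segment {
        final int docBase;                 // global id of this segment's local doc 0
        final Map<String, int[]> postings; // term -> sorted local doc ids

        public Segment(int docBase, Map<String, int[]> postings) {
            this.docBase = docBase;
            this.postings = postings;
        }
    }

    // Visit each segment independently and translate local ids to global ones.
    public static List<Integer> globalDocs(List<Segment> segments, String term) {
        List<Integer> out = new ArrayList<>();
        for (Segment segment : segments) {
            int[] local = segment.postings.getOrDefault(term, new int[0]);
            for (int doc : local) {
                out.add(segment.docBase + doc);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, int[]> first = new HashMap<>();
        first.put("lucene", new int[] {0, 2});   // segment holding 3 docs
        Map<String, int[]> second = new HashMap<>();
        second.put("lucene", new int[] {1});     // next segment, docBase 3

        List<Segment> segments = Arrays.asList(new Segment(0, first), new Segment(3, second));
        System.out.println(globalDocs(segments, "lucene")); // [0, 2, 4]
    }
}
```

In this model a per-leaf reader only ever sees one segment's postings, which is why the local ids have to be rebased before results from different segments are combined.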

Re: Query in a doc context

2017-12-14 Thread Mike Dinescu (DNQ)
Apologies if I completely misunderstood, but if you are looking to do a full doc match, you could duplicate the doc into another field that is a true full-text index of the document, and search on that. Wouldn't that be exactly what you want? On Thu, Dec 14, 2017 at 6:53 AM Vadim Gindin

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Mike, I don't need a full doc match. I need a multi-field match, and later I need to know which fields matched for a document so that I can calculate other multi-field-oriented metrics. Regards, Vadim Gindin On Thu, Dec 14, 2017 at 8:46 PM, Mike Dinescu (DNQ) wrote: > Apologies if I complet
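
The requirement stated here (match across several fields, then report which fields matched for each document) can be sketched independently of Lucene. The code below is a toy model under stated assumptions: the class `MultiFieldMatch`, the method `matchedFields`, and the nested-map index layout (field -> term -> doc ids) are hypothetical and do not correspond to any Lucene API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MultiFieldMatch {

    // For one query term, collect per document the set of fields it matched in.
    public static Map<Integer, Set<String>> matchedFields(
            Map<String, Map<String, int[]>> index, String term) {
        Map<Integer, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, int[]>> field : index.entrySet()) {
            int[] docs = field.getValue().getOrDefault(term, new int[0]);
            for (int doc : docs) {
                result.computeIfAbsent(doc, d -> new TreeSet<>()).add(field.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, int[]> title = new HashMap<>();
        title.put("lucene", new int[] {1, 2});
        Map<String, int[]> body = new HashMap<>();
        body.put("lucene", new int[] {2, 3});

        Map<String, Map<String, int[]>> index = new HashMap<>();
        index.put("title", title);
        index.put("body", body);

        // Doc 2 matches in both fields; docs 1 and 3 match in one field each.
        System.out.println(matchedFields(index, "lucene"));
        // {1=[title], 2=[body, title], 3=[body]}
    }
}
```

With the per-document field sets available, a multi-field metric such as "number of fields matched" is simply the size of each set.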