Re: Improvements to the Explanation class

2019-01-12 Thread Vadim Gindin
Hi all, I think it is a good idea. I have a similar situation and I had to store additional data (feature values) as a string and parse it later. So I'd be glad if your proposal were implemented. Regards, Vadim Gindin On Fri, Jan 11, 2019 at 7:17 PM Sambhav Kothari (BLOOMBERG/ LONDON

Re: Camel case search with Lucene

2018-10-04 Thread Vadim Gindin
Hi Ira. If you want to use a camel case query, for example, search "redHotChilly" instead of "red hot chilly" - you should use your own pattern tokenizer to split the query by a regex pattern. Regards Vadim Gindin On Thu, Oct 4, 2018 at 11:58 AM Gordin, Ira wrote: > Hi
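A minimal sketch of the splitting step suggested above, in plain Java with no Lucene dependency. The class name `CamelCaseSplitter` and the particular regex are assumptions made for illustration; the original message does not say which pattern was meant. A similar pattern could presumably be handed to a pattern-based tokenizer at query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): splits a camel case query into lowercase terms.
final class CamelCaseSplitter {
    // Zero-width boundary between a lowercase letter or digit and an uppercase letter.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[a-z0-9])(?=[A-Z])");

    static List<String> split(String query) {
        List<String> tokens = new ArrayList<>();
        for (String part : BOUNDARY.split(query)) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [red, hot, chilly]
        System.out.println(split("redHotChilly"));
    }
}
```

The pattern only detects lower-to-upper transitions, so runs of capitals such as "HTTPServer" stay in one piece; a real analyzer would likely need extra rules for that case.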

Re: Question about BytesRef and BinaryDocValues

2018-08-24 Thread Vadim Gindin
(postingsEnum) } return null; } After that you're getting a payload in a CustomFieldScorer.score() in the following way: postingsEnum.nextPosition(); BytesRef payload = postings.getPayload(); Regards, Vadim Gindin On Fri, Aug 24, 2018 at 10:16 AM Kevin Manuel wrote: > Hi Va

Re: Question about BytesRef and BinaryDocValues

2018-08-23 Thread Vadim Gindin
Hi Kevin! I think that your field is "analyzed" and so your field value is split into 2 terms "hey" and "tom". So a doc value is written for each of them. Regards Vadim Gindin On Fri, Aug 24, 2018, 5:19 Kevin Manuel wrote: > Hi, > > I'm using lucene version 4.3.1 an

Re: CustomQuery.bulkScorer isn't called from BooleanQuery with filter block

2018-07-26 Thread Vadim Gindin
AM Adrien Grand wrote: > Hello Vadim, > > It looks like your query only supports bulkScorer() and not scorer()? > Unfortunately this is illegal: queries must implement scorer(). Today, > conjunctions never use the bulkScorer API. > > On Wed, Jul 25, 2018 at 18:47, Vadim G

CustomQuery.bulkScorer isn't called from BooleanQuery with filter block

2018-07-25 Thread Vadim Gindin
ems they should work together. And that is why bulkScorer isn't called. Is there a way to integrate CustomQuery.bulkScorer() with possible adjacent filters? Regards, Vadim Gindin

Re: Explain flag in CustomQuery

2018-06-27 Thread Vadim Gindin
a *search* action. I'll probably ask that in the Elasticsearch forum. Thanks :) Regards Vadim Gindin On Tue, Jun 26, 2018 at 1:48 AM Mikhail Khludnev wrote: > Vadim, > Why wouldn't you ask in Elastic forum? > > On Mon, Jun 25, 2018 at 11:39 PM Vadim Gindin > wrote: >

Explain flag in CustomQuery

2018-06-25 Thread Vadim Gindin
you advise me? Regards, Vadim Gindin

Postings.getPayload() returns null

2018-03-23 Thread Vadim Gindin
ngsEnum = te.postings(null, PostingsEnum.ALL); int pos = postingsEnum.nextPosition(); BytesRef payload = postingsEnum.getPayload(); // assert payload.bytesEquals(new BytesRef(new byte[]{1})); // TODO: use payload in scoring formula fldScorers.

Re: Read DocValue twice

2018-02-22 Thread Vadim Gindin
do this for all matches. We don't have a solution for this. > > Caching the scorer doesn't work since scorers can only be iterated once. > > On Thu, Feb 22, 2018 at 12:11, Vadim Gindin <vgin...@detectum.com> > wrote: > > > I'd like to use "explain" me

Re: Read DocValue twice

2018-02-22 Thread Vadim Gindin
effective way to do this? Is there a possibility to accelerate "explain", for example with scorer caching? - Lucene uses a single Scorer (for the entire segment) for calling the score() method. What about explain()? - Are iterators really readable only once? Regards, Vadim Gindin On Thu, F

Re: Read DocValue twice

2018-02-21 Thread Vadim Gindin
Grand <jpou...@gmail.com> wrote: > This might not solve all problems, but you should stop caching the weight > in the query and stop caching the scorer in the weight: just create a new > scorer in calls to explain(). > > On Wed, Feb 21, 2018 at 14:05, Vadim Gindin <vgin...

Re: Read DocValue twice

2018-02-21 Thread Vadim Gindin
. On Tue, Feb 20, 2018 at 8:03 PM, Vadim Gindin <vgin...@detectum.com> wrote: > Probably it is not possible to attach files from email letter. Here they > are: > > ConstTermScorer.java > <http://lucene.472066.n3.nabble.com/file/t493564/ConstTermScorer.java> > Prize

Re: Read DocValue twice

2018-02-20 Thread Vadim Gindin
It is probably not possible to attach files to an email. Here they are: ConstTermScorer.java PrizeDisjunctionScorer.java PhraseQuery.java

Re: Read DocValue twice

2018-02-20 Thread Vadim Gindin
() and in explanation(). Isn't it? Regards, Vadim Gindin On Mon, Feb 19, 2018 at 10:55 PM, Adrien Grand <jpou...@gmail.com> wrote: > Yes, this is the problem. This doc ID is a special sentinel value that > means that the iterator is exhausted. I don't have enough context to know > what th

Re: Read DocValue twice

2018-02-19 Thread Vadim Gindin
he values of topList.doc and > reader.maxDoc() are before you call advanceExact? > > What do you mean by "I reuse the same DisiPriorityQueue of scorers in > score() and explain()"? This shouldn't be possible. > > On Mon, Feb 19, 2018 at 15:23, Vadim Gindin <vgin.

Re: Read DocValue twice

2018-02-19 Thread Vadim Gindin
lue(). > > On Mon, Feb 19, 2018 at 14:41, Vadim Gindin <vgin...@detectum.com> > wrote: > > > Hi all > > > > I use DocValue for scoring function. I.e. I have some column with > integers, > > that are used in scoring formula. So I have a scorer that c

Read DocValue twice

2018-02-19 Thread Vadim Gindin
can't read the values twice? 2) How can I manage this situation? 3) Can it work for NumericDocValues? Regards, Vadim Gindin

Custom explain implementation - how to transfer the data

2018-01-19 Thread Vadim Gindin
Assume I have a scorer. During the execution of the score() method, I cache the document ID and scoring details in a Map. Later, in the explain(docID) method, I look up the scoring details in that map by docID. Is this a correct scheme? If not, how should it be implemented correctly? Regards, Vadim Gindin
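The scheme described in this question can be sketched in plain Java as follows. This only illustrates the idea as asked; it is not Lucene API and not a recommended design. The class `ScoreDetailsCache`, its parameters and the toy formula are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a scorer that remembers how each score was computed.
final class ScoreDetailsCache {
    private final Map<Integer, String> detailsByDoc = new HashMap<>();

    // Computes a toy score and caches a human-readable description keyed by doc ID.
    float score(int docId, float termWeight, int featureValue) {
        float score = termWeight * featureValue;
        detailsByDoc.put(docId, "termWeight=" + termWeight + " * featureValue=" + featureValue);
        return score;
    }

    // Returns the cached description, or a fallback when the doc was never scored here.
    String explain(int docId) {
        return detailsByDoc.getOrDefault(docId, "no cached details for doc " + docId);
    }
}
```

The fallback branch matters: the scheme only works if explain(docID) is invoked on the same object that scored the document, which may not hold in practice. Recomputing the details inside explain() avoids that dependency.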

Re: Wrong ID in explain() method.

2017-12-31 Thread Vadim Gindin
Yes, thanks a lot for your help. Do you mean that the ID of the category must not be passed to explain()? If so, why does it happen? Regards Vadim Gindin On Dec 29, 2017, 14:22, "Mikhail Khludnev" <m...@apache.org> wrote: > Responded on the elastic forum. Have you se

Re: Query in a doc context

2017-12-31 Thread Vadim Gindin
Thanks Mikhail! I'll look there. Happy new year ) Regards Vadim Gindin On Dec 31, 2017, 2:21, "Mikhail Khludnev" <m...@apache.org> wrote: > Literally it's done in Solr (excuse moi) via > q=field1:(foo bar baz)^=3 field2:(foo bar baz)^=4 field3:(foo

Re: Wrong ID in explain() method.

2017-12-28 Thread Vadim Gindin
rved word for Lucene? Regards, Vadim Gindin On Wed, Dec 27, 2017 at 12:43 PM, Vadim Gindin <vgin...@detectum.com> wrote: > Hi all. > > I've written a simple plugin, that implements custom scoring logic and > extending `Explanation`. I have some real index data that looks like this:

Wrong ID in explain() method.

2017-12-26 Thread Vadim Gindin
real installation, but in the test case - it works fine. 1. ID=342 and others are passed to the explain(id) method. Note, it is not a document ID - it is the ID of the nested object (category). Why does it happen? 2. I have a test case, based on ESIntegTestCase. It works fine with this document. But this document is not found in the real index. Regards, Vadim Gindin

Re: Query in a doc context

2017-12-26 Thread Vadim Gindin
like explanation extending and composing sum scores. Regards, Vadim Gindin On Fri, Dec 15, 2017 at 10:33 PM, Mike Dinescu (DNQ) <mdine...@donaq.com> wrote: > Got it. I misunderstood the question (actually I'm still not convinced I > fully understand what you're looking for)

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Mike, I don't need a full doc match. I need a multi-field match, and later I need to know which fields matched for a document so that I can calculate other multi-field-oriented metrics. Regards, Vadim Gindin On Thu, Dec 14, 2017 at 8:46 PM, Mike Dinescu (DNQ) <mdine...@donaq.com>

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Thanks, Mikhail. Could you explain your points in more detail? Vadim On Thu, Dec 14, 2017 at 7:08 PM, Mikhail Khludnev <m...@apache.org> wrote: > Hello, Vadim. > > Please find inline. > > On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin <vgin...@detectum.co

Re: Tracking that all query terms are matched in one document

2017-12-14 Thread Vadim Gindin
rastically renamed and/or replaced with > BulkScorer or so. Anyway, you need to find a way to prevent term-at-time > scoring, when FakeScorer is injected. > You need to make it score doc-at-time. As I told you, it's far way. > > On Wed, Dec 13, 2017 at 11:55 AM, Vadim Gindin <vgin...@det

Re: Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
that I need to keep some coefficients along with tokens to use them further in scoring. For example, if the matched token is a synonym - I could multiply the query score by 0.75. Regards, Vadim Gindin On Thu, Dec 14, 2017 at 2:15 PM, Vadim Gindin <vgin...@detectum.com> wrote: > Hi All

Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
tations of that interface. Why is it used in LeafReader? What is the principal difference between these 20 implementations and which of them can be really useful? Regards, Vadim Gindin

Query in a doc context

2017-12-14 Thread Vadim Gindin
Hi all. As far as I understand, all Queries (or most of them?) are single-field oriented. They may implement different search/score logic, but they are intended for a single field. For example, simple TermQuery or PhraseQuery. If I need to implement search across different fields I should use

Re: Tracking that all query terms are matched in one document

2017-12-13 Thread Vadim Gindin
nks, Vadim Gindin On Fri, Dec 8, 2017 at 2:01 PM, Vadim Gindin <vgin...@detectum.com> wrote: > Thanks for your help. I'll try that. > > On Tue, Dec 5, 2017 at 4:18 PM, Mikhail Khludnev <m...@apache.org> wrote: > >> Vadim, >> You can create a collec

Re: Tracking that all query terms are matched in one document

2017-12-05 Thread Vadim Gindin
value - is a list of terms by which this document was matched. I need to save somewhere the document ID and the term that matched that document. Could somebody advise me on an appropriate place? Regards, Vadim Gindin On Tue, Dec 5, 2017 at 12:04 PM, Vadim Gindin <vgin...@detectum.com> wrote:

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
stQuery(bq, queryBoost); Vadim On Tue, Dec 5, 2017 at 9:24 AM, Michael Sokolov <msoko...@gmail.com> wrote: > Well how did you make the original query? > > On Dec 4, 2017 12:05 PM, "Vadim Gindin" <vgin...@detectum.com> wrote: > > > Yes, thanks. My qu

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
will be added, > not multiplied. > > On Dec 4, 2017 5:22 AM, "Vadim Gindin" <vgin...@detectum.com> wrote: > > > Thanks, Michael! > > > > Yes, I'm sure. Could you explain your proposal in more detail? > > > > Regards, > > Vadi

Re: Scorer.iterator() - how to implement correctly

2017-12-04 Thread Vadim Gindin
to query boost. Now it works. Thanks a lot! Regards, Vadim Gindin On Mon, Dec 4, 2017 at 3:17 PM, Adrien Grand <jpou...@gmail.com> wrote: > It is correct... but ConstantScoreQuery is the way to go with your > use-case. It should not return scores of 0 unless you are misusing the API

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Thanks, Michael! Yes, I'm sure. Could you explain your proposal in more detail? Regards, Vadim Gindin On Mon, Dec 4, 2017 at 3:18 PM, Michael Sokolov <msoko...@gmail.com> wrote: > You could combine a Boolean and query with the same terms, as an optional > clause. Are yo

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Sorry, I accidentally sent an unfinished message. Could somebody advise me on how to implement the following? Regards Vadim Gindin On Mon, Dec 4, 2017 at 3:12 PM, Vadim Gindin <vgin...@detectum.com> wrote: > Hi all. > > I need to track that all query terms are

Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Hi all. I need to track that all query terms are matched in one document. When all terms are matched I need to multiply the score of such a document by some constant coefficient.
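The requirement stated here can be written down as a small plain-Java sketch, assuming the set of terms that matched a document is already known (how to collect that set inside Lucene is exactly what the thread goes on to discuss). The class `AllTermsBoost` and its parameter names are hypothetical.

```java
import java.util.Set;

// Hypothetical helper: applies a constant coefficient only when every query term matched.
final class AllTermsBoost {
    static float boosted(float baseScore, Set<String> queryTerms,
                         Set<String> matchedTerms, float coefficient) {
        // containsAll is true only if no query term is missing from the matched set.
        return matchedTerms.containsAll(queryTerms) ? baseScore * coefficient : baseScore;
    }
}
```

For example, with query terms {red, hot}, a base score of 2.0 and a coefficient of 1.5, a document matching both terms would score 3.0, while a document matching only "red" would keep 2.0.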

Re: Scorer.iterator() - how to implement correctly

2017-12-04 Thread Vadim Gindin
, Vadim Gindin <vgin...@detectum.com> wrote: > Hi Adrien. > > ConstantScoreQuery - I'd tried that earlier. There is the problem. It > returns score = 0.0 for my configuration with Boolean.. I've debugged and > found, that it happens because of the following: > > @Override

Scorer.iterator() - how to implement correctly

2017-11-30 Thread Vadim Gindin
ld "vendor" - score - 5f. I'm creating a subquery for each field and specifying a score for it using a custom Query that is almost the same as TermQuery except for Weight.Scorer. Any help is appreciated. Regards, Vadim Gindin

Re: COST vs SCORE vs WEIGHT

2017-11-30 Thread Vadim Gindin
); And then public DocIdSetIterator iterator() { return iterator; } Is that a correct implementation? Are there other ways to implement it? Thanks a lot for your response Regards, Vadim Gindin On Thu, Nov 30, 2017 at 8:56 PM, Adrien Grand <jpou...@gmail.com> wrote: > Hi Vadim, > >

COST vs SCORE vs WEIGHT

2017-11-30 Thread Vadim Gindin
that? Regards, Vadim Gindin

Re: Custom scoring algorithm and Explanation extending.

2017-11-22 Thread Vadim Gindin
Thanks a lot! On Mon, Nov 20, 2017 at 11:22 PM, Adrien Grand <jpou...@gmail.com> wrote: > Hi Vadim, > > On Thu, Nov 16, 2017 at 18:09, Vadim Gindin <vgin...@detectum.com> wrote > : > > > 1. I would like to use my custom scoring algorithm. Is it make sense to

Custom scoring algorithm and Explanation extending.

2017-11-16 Thread Vadim Gindin
plain" that uses Lucene's Explanation class under the hood. But this class covers only scoring aspects. I would like to include matching logic details there. It seems like a good place, but this class is final. Regards, Vadim Gindin

Extending Explanation class information

2017-11-16 Thread Vadim Gindin
g/querying documents by concrete query? Regards, Vadim Gindin