Re: Finding out which fields matched the query

2022-06-29 Thread Shai Erera
I think it's a matter of tradeoff. For example, faceting requires complete evaluation, and since this field matching is a kind of aggregation, I think it's OK if that's how it works. Users can choose which technique to apply based on their use case. Anyway I don't think

Re: Finding out which fields matched the query

2022-06-28 Thread Alan Woodward
I think it depends on what information we actually want to get here. If it’s just finding which fields matched in which document, then running Matches over the top-k results is fine. If you want to get some kind of aggregate data, as in you want to get a list of fields that matched in *any*

Re: Finding out which fields matched the query

2022-06-27 Thread Walter Underwood
For a quick hack, you can use highlighting. That does more than you want, showing which words match, but it does have the info. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog)

Re: Finding out which fields matched the query

2022-06-27 Thread Uwe Schindler
Hi Adrien, maybe it changed a bit, but last time I looked into it, it was somehow wrapping all Queries using a wrapper "NamedQuery" or similar. When it collected hits it was able to figure out by a wrapper somewhere around weight/scorer/DISI and set a flag that the query was a hit. It could

Re: Finding out which fields matched the query

2022-06-27 Thread Adrien Grand
Uwe, Elasticsearch's named queries are not using a collector actually. After top hits have been evaluated for the whole query, they are evaluated independently on each of the top hits. It's probably faster than the collector approach since it doesn't add per-document overhead to collection, but

Re: Finding out which fields matched the query

2022-06-27 Thread Shai Erera
Thanks Alan, yeah I guess I was thinking about the use case I described, which involves (usually) simple term queries, but you're definitely right about complex boolean clauses as well as non-term queries. I think the case for highlighter is different though? I mean you usually generate highlights

Re: Finding out which fields matched the query

2022-06-27 Thread Dawid Weiss
A side note - I've been using a highlighter based on matches API for quite some time now and it's been fantastic. Very precise and handles non-trivial queries (interval queries) very well.

Re: Finding out which fields matched the query

2022-06-27 Thread Alan Woodward
Your approach is almost certainly more efficient, but it might give you false matches in some cases - for example, if you have a complex query with many nested MUST and SHOULD clauses, you can have a leaf TermScorer that is positioned on the correct document, but which is part of a clause that
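To make the false-match concern concrete, here is a toy sketch in plain Java. It is not Lucene code: Node, Term and Bool are hypothetical stand-ins for a query tree, and the example query and document are made up. It assumes the situation in question is a leaf term that hits on the document while the MUST clause it belongs to does not match as a whole. The "naive" pass reports every leaf hit regardless of its parent clause; the "clause-aware" pass only descends into clauses that actually match.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types are Lucene classes.
final class FieldMatchDemo {

    /** A query node evaluated against one document (field name -> terms). */
    interface Node {
        boolean matches(Map<String, Set<String>> doc);

        /** Naive: report every leaf that hits, ignoring parent clauses. */
        void leafHits(Map<String, Set<String>> doc, Set<String> out);

        /** Clause-aware: report leaves only under clauses that match. */
        void matchedFields(Map<String, Set<String>> doc, Set<String> out);
    }

    static final class Term implements Node {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        public boolean matches(Map<String, Set<String>> doc) {
            return doc.getOrDefault(field, Collections.emptySet()).contains(text);
        }

        public void leafHits(Map<String, Set<String>> doc, Set<String> out) {
            if (matches(doc)) out.add(field);
        }

        public void matchedFields(Map<String, Set<String>> doc, Set<String> out) {
            if (matches(doc)) out.add(field);
        }
    }

    static final class Bool implements Node {
        final List<Node> must;
        final List<Node> should;

        Bool(List<Node> must, List<Node> should) {
            this.must = must;
            this.should = should;
        }

        public boolean matches(Map<String, Set<String>> doc) {
            for (Node n : must) if (!n.matches(doc)) return false;
            if (!must.isEmpty()) return true; // SHOULD is optional next to MUST
            for (Node n : should) if (n.matches(doc)) return true;
            return false;
        }

        public void leafHits(Map<String, Set<String>> doc, Set<String> out) {
            for (Node n : must) n.leafHits(doc, out);
            for (Node n : should) n.leafHits(doc, out);
        }

        public void matchedFields(Map<String, Set<String>> doc, Set<String> out) {
            if (!matches(doc)) return; // the whole clause failed, skip its leaves
            for (Node n : must) n.matchedFields(doc, out);
            for (Node n : should) n.matchedFields(doc, out);
        }
    }

    /** (+title:lucene +body:solr) author:smith */
    static Node exampleQuery() {
        Node inner = new Bool(
            Arrays.asList(new Term("title", "lucene"), new Term("body", "solr")),
            Collections.emptyList());
        return new Bool(Collections.emptyList(),
            Arrays.asList(inner, new Term("author", "smith")));
    }

    static Map<String, Set<String>> exampleDoc() {
        Map<String, Set<String>> doc = new HashMap<>();
        doc.put("title", new HashSet<>(Arrays.asList("lucene", "basics")));
        doc.put("body", new HashSet<>(Arrays.asList("search", "index")));
        doc.put("author", new HashSet<>(Arrays.asList("smith")));
        return doc;
    }

    public static void main(String[] args) {
        Set<String> naive = new TreeSet<>();
        Set<String> aware = new TreeSet<>();
        exampleQuery().leafHits(exampleDoc(), naive);
        exampleQuery().matchedFields(exampleDoc(), aware);
        System.out.println("naive leaf hits: " + naive); // [author, title]
        System.out.println("clause-aware:    " + aware); // [author]
    }
}
```

Here the document matches only through author:smith. The title:lucene leaf does hit, but its enclosing MUST clause fails on body:solr, so reporting "title" as a matched field would be a false match.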

Re: Finding out which fields matched the query

2022-06-27 Thread Shai Erera
Thanks Uwe, I didn't know about named queries, but it seems useful. Is there interest in getting similar functionality in Lucene, or perhaps just the FieldMatching collector? I'd be happy to PR-it. As for use case, I was thinking of using something similar to this collector for some kind of

Re: Finding out which fields matched the query

2022-06-27 Thread Uwe Schindler
I think the collector approach is perfectly fine for mass-processing of queries. By the way: Elasticsearch/Opensearch have a feature already built-in and it is working based on collector API in a way similar to what you mentioned (as far as I remember). It is a bit different as you can tag any

Re: Finding out which fields matched the query

2022-06-27 Thread Shai Erera
Out of curiosity and for education purposes, is the Collector approach I proposed wrong/inefficient? Or less efficient than the matches() API? I'm thinking, if you want to both match/rank documents and as a side effect know which fields matched, the Collector will perform better than

Re: Finding out which fields matched the query

2022-06-27 Thread Dawid Weiss
The matches API is awesome. Use it. You can also get a rough glimpse into a superset of fields potentially matching the query via:
query.visit(
    new QueryVisitor() {
      @Override
      public boolean acceptField(String field) {
        affectedFields.add(field);

Re: Finding out which fields matched the query

2022-06-27 Thread Alan Woodward
The Matches API will give you this information - it’s still likely to be fairly slow, but it’s a lot easier to use than trying to parse Explain output.
Query q = ….;
Weight w = searcher.createWeight(searcher.rewrite(q), ScoreMode.COMPLETE_NO_SCORES, 1.0f);
Matches m = w.matches(context,

Re: Finding out which fields matched the query

2022-06-27 Thread Jörn Franke
What is the reason you need the matched fields? Maybe your use case can be solved using something completely different than knowing which fields were matched.

Re: Finding out which fields matched the query

2022-06-26 Thread Shai Erera
Hi Yichen, I think you can implement a custom Collector which tracks the fields that were matched for each Scorer. I implemented an example of such a Collector below:
public class FieldMatchingCollector implements Collector {
  /** Holds the number of matching documents for each field. */
  public

Finding out which fields matched the query

2022-06-24 Thread Yichen Sun
Hello! I’m an MSCS student from BU and learning to use Lucene. Recently I have been trying to output the fields matched by a query. For example, for one document, there are 10 fields and 2 of them match the query. I want to get the names of these fields. I have tried using the explain() method and getting