There could be considerable overlap with the features provided by Luwak,
which was contributed to Lucene fairly recently and, I think, does the
query inversion work required for this, so more of it may already exist
in Lucene. I don't know whether that module handles the query rewriting
or the term indexing you're describing, though.

On Mon, Dec 14, 2020 at 11:25 PM Atri Sharma <a...@apache.org> wrote:
>
> +1
>
> I would suggest that this be an independent project hosted on GitHub (there
> have been similar projects in the past that have seen success that way).
>
> On Tue, 15 Dec 2020, 09:37 David Smiley, <dsmi...@apache.org> wrote:
>>
>> Great optimization!
>>
>> I'm dubious about it being a good contribution to Lucene itself, however,
>> because what you propose fits cleanly above Lucene.  Even at an ES/Solr layer
>> (which I know you don't use, but hypothetically speaking), I'm dubious
>> as well.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Mon, Dec 14, 2020 at 2:37 PM Michael Froh <msf...@gmail.com> wrote:
>>>
>>> My team at work has a neat feature that we've built on top of Lucene that 
>>> has provided a substantial (20%+) increase in maximum qps and some 
>>> reduction in query latency.
>>>
>>> Basically, we run a training process that looks at historical queries to 
>>> find frequently co-occurring combinations of required clauses, say "+A +B 
>>> +C +D". Then at indexing time, if a document satisfies one of these known 
>>> combinations, we add a new term to the doc, like "opto:ABCD". At query 
>>> time, we can then replace the required clauses with a single TermQuery for 
>>> the "optimized" term.
>>>
>>> It adds a little bit of extra work at indexing time and requires the 
>>> offline training step, but we've found that it yields a significant boost 
>>> at query time.
>>>
>>> We're interested in open-sourcing this feature. Is it something worth 
>>> adding to Lucene? Since it doesn't require any core changes, maybe as a 
>>> module?
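
The scheme described in the original message (train on historical queries, add an "opto:" term at indexing time, rewrite required clauses at query time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's implementation and not Lucene code: documents are modelled as plain sets of terms, and the function names (train, index_terms, rewrite, search) and the min_count threshold are hypothetical.

```python
from collections import Counter

OPT_PREFIX = "opto:"  # prefix taken from the example in the thread


def opt_term(combo):
    """Build the synthetic term for a combination, e.g. {A,B,C,D} -> 'opto:ABCD'."""
    return OPT_PREFIX + "".join(sorted(combo))


def train(query_log, min_count=2):
    """Offline step: keep sets of required clauses seen at least min_count times.

    Assumption: only whole required-clause sets are counted, not their subsets.
    """
    counts = Counter(frozenset(q) for q in query_log if len(set(q)) > 1)
    return {combo for combo, n in counts.items() if n >= min_count}


def index_terms(doc_terms, combos):
    """Indexing step: add one synthetic term per known combination the doc satisfies."""
    original = set(doc_terms)
    extra = {opt_term(c) for c in combos if c <= original}
    return original | extra


def rewrite(required, combos):
    """Query step: replace the largest matching combination with its synthetic term."""
    required = set(required)
    best = max((c for c in combos if c <= required), key=len, default=None)
    if best is None:
        return required  # nothing to optimize
    return (required - best) | {opt_term(best)}


def search(index, required):
    """Toy conjunctive search: ids of docs containing every required term."""
    return sorted(doc_id for doc_id, terms in index.items() if required <= terms)


if __name__ == "__main__":
    combos = train([["A", "B", "C", "D"], ["D", "C", "B", "A"], ["A", "E"]])
    docs = {1: {"A", "B", "C", "D", "X"}, 2: {"A", "B", "C"}}
    index = {doc_id: index_terms(terms, combos) for doc_id, terms in docs.items()}
    query = rewrite(["A", "B", "C", "D", "X"], combos)
    print(sorted(query), search(index, query))
```

In this toy model the rewritten query returns the same documents as the original one while checking fewer clauses; in Lucene the analogous change would be swapping several required clauses for a single TermQuery, as the message describes.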
