I suspect there is considerable overlap with the features provided by Luwak, which was contributed to Lucene fairly recently and, I think, does the query inversion work this would require; maybe more of it already exists in Lucene than you expect? I don't know whether that module handles the query rewriting or the term indexing you're describing, though.

On Mon, Dec 14, 2020 at 11:25 PM Atri Sharma <a...@apache.org> wrote:
>
> +1
>
> I would suggest that this be an independent project hosted on Github (there
> have been similar projects in the past that have seen success that way)
>
> On Tue, 15 Dec 2020, 09:37 David Smiley, <dsmi...@apache.org> wrote:
>>
>> Great optimization!
>>
>> I'm dubious on it being a good contribution to Lucene itself however,
>> because what you propose fits cleanly above Lucene. Even at a ES/Solr layer
>> (which I know you don't use, but hypothetically speaking), I'm dubious there
>> as well.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Mon, Dec 14, 2020 at 2:37 PM Michael Froh <msf...@gmail.com> wrote:
>>>
>>> My team at work has a neat feature that we've built on top of Lucene that
>>> has provided a substantial (20%+) increase in maximum qps and some
>>> reduction in query latency.
>>>
>>> Basically, we run a training process that looks at historical queries to
>>> find frequently co-occurring combinations of required clauses, say "+A +B
>>> +C +D". Then at indexing time, if a document satisfies one of these known
>>> combinations, we add a new term to the doc, like "opto:ABCD". At query
>>> time, we can then replace the required clauses with a single TermQuery for
>>> the "optimized" term.
>>>
>>> It adds a little bit of extra work at indexing time and requires the
>>> offline training step, but we've found that it yields a significant boost
>>> at query time.
>>>
>>> We're interested in open-sourcing this feature. Is it something worth
>>> adding to Lucene? Since it doesn't require any core changes, maybe as a
>>> module?
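
For anyone skimming the thread, here is a rough sketch of how I read the approach quoted above. It is plain Java with no Lucene dependency, and it is not the original poster's implementation: the class name, the method names, and the modelling of clauses and document terms as sets of strings are all my own assumptions. Only the "opto:" prefix and the "+A +B +C +D" example come from the thread. The training step that discovers the combinations is assumed to have already run.

```java
import java.util.*;

// Hypothetical sketch: required clauses and document terms are modelled as plain strings.
public class OptimizedTermSketch {
    // Prefix taken from the "opto:ABCD" example in the thread.
    static final String PREFIX = "opto:";

    // Builds the synthetic term for a combination. Sorting makes the result
    // independent of clause order. Plain concatenation is only unambiguous for
    // single-character names like the thread's example; a real scheme would
    // need a delimiter or an ID.
    static String optimizedTerm(Set<String> combination) {
        return PREFIX + String.join("", new TreeSet<>(combination));
    }

    // Index time: for each known combination the document fully satisfies,
    // add the synthetic term alongside the document's own terms.
    static Set<String> indexTerms(Set<String> docTerms, List<Set<String>> knownCombinations) {
        Set<String> out = new TreeSet<>(docTerms);
        for (Set<String> combo : knownCombinations) {
            if (docTerms.containsAll(combo)) {
                out.add(optimizedTerm(combo));
            }
        }
        return out;
    }

    // Query time: replace the first known combination contained in the
    // required clauses with the single synthetic term; other clauses stay.
    static Set<String> rewriteRequired(Set<String> required, List<Set<String>> knownCombinations) {
        for (Set<String> combo : knownCombinations) {
            if (required.containsAll(combo)) {
                Set<String> out = new TreeSet<>(required);
                out.removeAll(combo);
                out.add(optimizedTerm(combo));
                return out;
            }
        }
        return new TreeSet<>(required);
    }
}
```

In a real Lucene setup the synthetic term would presumably be indexed as an extra field value and the rewritten clause would become a single TermQuery, as the original message describes; this sketch only shows the matching and substitution logic.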