Thanks Adrien. Longish term planning in open source is such a hard thing so I'm glad you are helping to herd us cats ;)
I've also finally switched our nightly benchmarks to use concurrent search (intra-query concurrency)! It's annotation GM in the charts. Some queries got faster, like BooleanQuery disjunction of two high frequency terms (https://home.apache.org/~mikemccand/lucenebench/OrHighHigh.html) and some got slower e.g. simple TermQuery (https://home.apache.org/~mikemccand/lucenebench/Term.html).

Now as we make improvements to Lucene's cross-slice / cross-thread search concurrency, e.g. intra-segment concurrency, we should be able to see the gains in our nightly benchmarks.

Adding concurrency to Lucene has been such a long and fun road, and we are really only getting started in search-time concurrency.

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jun 26, 2024 at 10:59 AM Adrien Grand <jpou...@gmail.com> wrote:

> Hello everyone,
>
> Time flies, I started this email thread ~3.5 months ago and we now have ~3
> months before September 22nd, where 10.0 will go on feature freeze.
>
> Robert kindly added a description to the GitHub milestone that refers to
> this thread: https://github.com/apache/lucene/milestone/2.
>
> Overall, progress looks rather good to me:
> - I/O concurrency is progressing nicely
> https://github.com/apache/lucene/issues/13179. In particular I'm hoping
> to merge I/O concurrency for terms dictionary lookups soon.
> - Ignacio recently merged initial support for sparse indexing.
> https://github.com/apache/lucene/issues/11432 There are follow-ups we
> need to address, but they look reasonable in terms of amount of work and
> uncontroversial.
>
> Some things have got less traction:
> - We haven't made significant progress on intra-segment search
> concurrency: https://github.com/apache/lucene/issues/9721.
> - Relatedly, if we think that IndexSearcher should enable concurrency by
> default, a major version is a good time to make such a big change to
> runtime behavior. https://github.com/apache/lucene/issues/11523
>
> In any case, help is welcome. I know people have been creating more issues
> that they attached to the 10.0 milestone, e.g. doing more off-heap scoring
> for vectors https://github.com/apache/lucene/issues/13515 or deprecating
> the COSINE similarity https://github.com/apache/lucene/issues/13281. This
> is great too, the list isn't closed, I'll start thinking harder about which
> changes specifically should block the release as we get closer to September
> (I can't think of any at the moment). In the meantime, it's fine to
> optimistically attach issues to the 10.0 milestone.
>
> On Wed, Mar 20, 2024 at 2:09 PM Adrien Grand <jpou...@gmail.com> wrote:
>
>> Thanks Mike and Dawid for the kind words, and thanks Patrick, Luca and
>> Egor for your interest in decoupling index geometry from search
>> concurrency, this would be a great release highlight if we can get it into
>> Lucene 10!
>>
>> I haven't seen pushback on the proposed schedule so I plan on proceeding
>> with this timeline in mind.
>>
>> If you have changes that you would like to include in Lucene 10.0, please
>> add the 10.0 milestone
>> <https://github.com/apache/lucene/milestones/10.0.0> to them. It's ok to
>> be a bit ambitious at this stage and optimistically mark some changes as
>> scheduled for 10.0, we'll have opportunities for removing items from this
>> list when the date comes closer and some issues are not getting proper
>> traction. I'll take care of that.
>>
>> On Mon, Mar 18, 2024 at 11:39 AM Dawid Weiss <dawid.we...@gmail.com>
>> wrote:
>>
>>> [...] but Adrien I don't honestly believe anyone who is
>>>> paying attention thinks that is what you have been doing!
>>>
>>>
>>> +1. I wish I were procrastinating as productively!
>>>
>>> D.
>>>
>>
>>
>> --
>> Adrien
>>
>
>
> --
> Adrien
>