Thanks Adrien.  Longish-term planning in open source is such a hard thing,
so I'm glad you are helping to herd us cats ;)

I've also finally switched our nightly benchmarks to use concurrent search
(intra-query concurrency)!  It's annotation GM in the charts.  Some queries
got faster, like a BooleanQuery disjunction of two high-frequency terms
(https://home.apache.org/~mikemccand/lucenebench/OrHighHigh.html), and some
got slower, e.g. a simple TermQuery
(https://home.apache.org/~mikemccand/lucenebench/Term.html).  Now, as we make
improvements to Lucene's cross-slice / cross-thread search concurrency,
e.g. intra-segment concurrency, we should be able to see the gains in our
nightly benchmarks.  Adding concurrency to Lucene has been such a long and
fun road, and we are really only getting started in search-time concurrency.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jun 26, 2024 at 10:59 AM Adrien Grand <jpou...@gmail.com> wrote:

> Hello everyone,
>
> Time flies: I started this email thread ~3.5 months ago, and we now have ~3
> months before September 22nd, when 10.0 will go into feature freeze.
>
> Robert kindly added a description to the GitHub milestone that refers to
> this thread: https://github.com/apache/lucene/milestone/2.
>
> Overall, progress looks rather good to me:
>  - I/O concurrency is progressing nicely:
> https://github.com/apache/lucene/issues/13179. In particular I'm hoping
> to merge I/O concurrency for terms dictionary lookups soon.
>  - Ignacio recently merged initial support for sparse indexing:
> https://github.com/apache/lucene/issues/11432. There are follow-ups we
> need to address, but they look uncontroversial and reasonable in terms of
> amount of work.
>
> Some things have got less traction:
>  - We haven't made significant progress on intra-segment search
> concurrency: https://github.com/apache/lucene/issues/9721.
>  - Relatedly, if we think that IndexSearcher should enable concurrency by
> default, a major version is a good time to make such a big change to
> runtime behavior. https://github.com/apache/lucene/issues/11523
>
> In any case, help is welcome. I know people have been creating more issues
> that they attached to the 10.0 milestone, e.g. doing more off-heap scoring
> for vectors https://github.com/apache/lucene/issues/13515 or deprecating
> the COSINE similarity https://github.com/apache/lucene/issues/13281. This
> is great too, and the list isn't closed. I'll start thinking harder about which
> changes specifically should block the release as we get closer to September
> (I can't think of any at the moment). In the meantime, it's fine to
> optimistically attach issues to the 10.0 milestone.
>
> On Wed, Mar 20, 2024 at 2:09 PM Adrien Grand <jpou...@gmail.com> wrote:
>
>> Thanks Mike and Dawid for the kind words, and thanks Patrick, Luca and
>> Egor for your interest in decoupling index geometry from search
>> concurrency, this would be a great release highlight if we can get it into
>> Lucene 10!
>>
>> I haven't seen pushback on the proposed schedule so I plan on proceeding
>> with this timeline in mind.
>>
>> If you have changes that you would like to include in Lucene 10.0, please
>> add the 10.0 milestone
>> <https://github.com/apache/lucene/milestones/10.0.0> to them. It's ok to
>> be a bit ambitious at this stage and optimistically mark some changes as
>> scheduled for 10.0; we'll have opportunities for removing items from this
>> list when the date comes closer and some issues are not getting proper
>> traction. I'll take care of that.
>>
>> On Mon, Mar 18, 2024 at 11:39 AM Dawid Weiss <dawid.we...@gmail.com>
>> wrote:
>>
>>>> [...] but Adrien I don't honestly believe anyone who is
>>>> paying attention thinks that is what you have been doing!
>>>
>>>
>>> +1. I wish I were procrastinating as productively!
>>>
>>> D.
>>>
>>
>>
>> --
>> Adrien
>>
>
>
> --
> Adrien
>
