Re: Lucene 10.0 and 9.12 blockers

Uwe Schindler Tue, 01 Oct 2024 07:38:21 -0700

Oh Thanks,

did not notice. So all looks fine. Because I was checking and found only9.x files.

We still have some 9.x codec versions in the checkout, but thats thesame for older ones. The only important thing is that the top-levelcodec is on 10.x, we don't need to care about postings, docvalues,...


Uwe

Am 01.10.2024 um 09:53 schrieb Chris Hegarty:

Uwe,

On 1 Oct 2024, at 08:08, Uwe Schindler <[email protected]> wrote:

Hi,
are we ready for a release. Because at moment I don't think we have fully 
cleaned up the codecs and this could lead to problems when going to 11.x 
because of version numbers in the index files!?
     • Rename the most recent codec to 10x
     • Remove older 9.x stuff and move to backwards, only keep latest.
Is this still planned or ongoing?

Yes. I believe everything that is needed is in Adrien’s PR  - 
https://github.com/apache/lucene/pull/13815. I actually thought that it was 
already merged.

I’ll do some final testing, etc, on this PR and get it merged.

-Chris.

Uwe
Am 30.09.2024 um 19:35 schrieb Luca Cavanna:

Hi all,
given Adrien's limited availability, Chris and I flipped a coin and decided
that I will be the release manager for Lucene 10. branch_10x and branch_10_0
have already been created. I will follow up with the first RC tomorrow.

On Mon, Sep 30, 2024 at 5:14 PM Adrien Grand <[email protected]> wrote:
FYI I have little availability this week so Luca and Chris offered to help with
the release. Thank you both!

Le sam. 14 sept. 2024, 18:07, Luca Cavanna <[email protected]> a écrit :
On the topic of removing search(Query, Collector), I think it is unlikely that
we will be able to complete that task before the Lucene 10 branch is cut. It
would be nice to still work on the remaining issues, but it's probably good to
have less urgency around it hence more time to think about how to properly
address the remaining issues. I think that removing a deprecated method leaving
users without a proper replacement is not a good trade-off.

I have an outstanding PR around CollectorManager#forSequentialExecution, based
on the risk I described in my previous message.I am now concluding that it
would be safer to remove such a public method from branch_9x before it gets
released. I described the reason in the PR I just opened:
https://github.com/apache/lucene/pull/13790 . I'd like to know what folks think
about this. I myself have changed my mind a few times on the topic.

On Thu, Sep 12, 2024 at 10:09 PM Michael Froh <[email protected]> wrote:
Aha! I realized that I was totally misreading some stacktraces while debugging some of
the Join tests. You're absolutely right that a ScorerSupplier is created for each doc ID
range -- not just the Scorer. Also, the test that I "helped" Vamsi with didn't
really get fixed by moving construction of the TermsEnum into the get() method -- it just
didn't happen to fail that time. *facepalm*

Please ignore my unfounded panic :D

Thanks,
Froh

On Thu, Sep 12, 2024 at 1:33 AM Luca Cavanna <[email protected]> wrote:
Thanks for raising this Michael.

I see a risk as well around removing the deprecated search(Query, Collector)
method by using a sequential collector manager. Previously, we would happily
execute sequentially despite an executor being provided to the searcher
(because search(Query, Collector) bypasses concurrency entirely). After the
migration, we would execute concurrently, hence the collector manager would
throw an exception as soon as more than one slice gets searched. The sequential
collector manager can be used in tests to overcome the lack of a collector
manager, but that only moves the problem to the user side: what are users of
these collectors going to do given they need a collector manager to call
search, and we don't provide one? Their only option would be to not set an
executor to their searcher. Maybe it is a possibility to clearly document that
but is it acceptable to require users to have a non concurrent index searcher
instance around to use for non concurrent queries? There's only a few cases of
this fortunately. There's also a couple of cases like QueryUtils and JoinUtil
where we have utility methods that call search internally and accept an
external searcher. Those searchers may have an executor set to them, and the
only safe way to migrate these is to add concurrent collector managers support
or explicitly state that the provided search shouldn't have an executor set to
it. Another option if we are not happy with the workaround I mentioned, is to
consider leaving search(Query, Collector) deprecated in Lucene 10 and removing
it in Lucene 11. It is a shame because we are not far off, but I am not sure
that this warrants delaying the release.
I am not entirely sure how this aligns with the risk you mentioned, which cases
of SimpleCollector are you referring to specifically?

Regarding your second concern around intra-segment concurrency: while I had to
adapt a couple of tests to be intra-segment ready as they made wrong
assumptions, we are now leveraging intra-segment concurrency in all tests
(provided that the searcher is created using LuceneTestCase#newSearcher),
besides when DrillSideways is used. I have seen a couple of post-merge failures
that may be related which I will look into, but nothing that would suggest that
the design is entirely problematic. When you retrieve a scorer supplier or a
bulk scorer you provide a LeafReaderContext. The overall expectation is that
you get a different instance each time, regardless of whether you have already
seen the segment or not. If that is the case there is no state shared between
threads, because ScorerSuppliers should not get shared across threads? It is
not the expectation that the support for intra-segment concurrency requires
bulk scorers and scorer suppliers to become thread-safe.
With that in mind, I checked DefaultScorerSupplier and I do not see why it
would not work, as long as each call to retrieve a scorer supplier returns a
new instance of if that points to a new instance of Scorer, which holds state
created as part of that same scorerSupplier call. The problem that we have is
that we duplicate ahead of time work for partitions of the same segment
(tracked in https://github.com/apache/lucene/issues/13745), because if we need
to pull doc_values, we will do so for the same segment multiple times. I would
assume that if something is off with DefaultScorerSupplier, tests would show
that clearly, as it is widely used.
I also checked FeatureQuery, and I see that each call to
scorerSupplier(LeafReaderContext) returns a new instance of the supplier which
points to different TermsEnum instance retrieved multiple times for the same
segment. Removing this duplication will require additional work, and there will
be bugs, or incorrect assumptions made in existing scorer supplier instances,
but those should not be too hard to fix.
Does this make sense to you? Perhaps there are additional changes to make in
the migrate guide or javadocs to clarify what I described, let me know what you
think.

Cheers
Luca

On Thu, Sep 12, 2024 at 9:42 AM Michael Froh <[email protected]> wrote:
Oh, I forgot to mention:

I think we should deprecate the DefaultScorerSupplier constructor that takes a
Scorer. There's no way that works with intra-segment concurrency.

Maybe we remove DefaultScorerSupplier altogether?

On Thu, Sep 12, 2024 at 12:35 AM Michael Froh <[email protected]> wrote:
I'm a big fan of both of Luca's topics. I'd like to raise a small red flag
around them, though, since they seem to be connected.

Working through the join module and helping my colleague @harshavamsi on the QueryUtils
side, I see two layers of unpreparatedness for the modern "concurrency first"
architecture. (Again, I want to make clear that I think the modern architecture is the
way to go and we can and should get there in time for Lucene 10.)

1. There are several uses of SimpleCollector, where it's assumed that one
collector will collect all results on a single thread. With the deprecated
method, this forces single-threaded behavior all the time. In my opinion, these
represent 13+ year technical debt for cases where you couldn't properly use an
IndexSearcher to do concurrent searches.
2. With the merge of intra-segment searches, we have another layer:
ScorerSuppliers that share mutable state across the Scorers that they produce.
For example, @harshavamsi came across a case today in the sigmoid function for
FeatureQuery where a TermsEnum was created in the ScorerSupplier and passed
into the Scorers. Each Scorer shared the same TermsEnum. What changed? In the
old concurrency model, one thread might search a few segments, but each segment
was guaranteed to only be searched by one thread. Now, with intra-segment
concurrency, we produce one ScorerSupplier per segment, but may produce
multiple Scorers across different threads. If the ScorerSupplier produces some
mutable object and shares it across the resulting Scorers, you're going to have
a bad time. Fun fact: back in 2012, we had an office Halloween party and I
dressed as the thing that scares me the most. I printed a picture of Texas
(since everyone recognizes Texas) with a TV remote control mute button in the
middle. I sewed it to my shirt in the four corners. It was mutable state held
by multiple threads.

I definitely think we should address these before the Lucene 10 release, as
they provide a clean break from the old world. I also think it's a decent
amount of work (but not unsurmountable). I'm also maybe no longer a fan of the
helper method that Greg added in his PR for the monitor module, since it risks
sweeping non-threadsafe code under the rug, if folks make single-threaded tests
(which is essentially what they've been doing all along -- see my first point
above).

I haven't properly looked into the scope of my second point above, but I've
seen at least two cases in the past two days. Hopefully it's not too bad, but
it might be a risk. I think the first point is still pretty easy to address.

Thanks,
Froh

On Thu, Aug 29, 2024 at 2:15 AM Luca Cavanna <[email protected]> wrote:
For Lucene 10.0, I have two topics to raise:

1. Remove the deprecated IndexSearcher#search(Query, Collector) in favour of
IndexSearcher#search(Query, CollectorManager)
(https://github.com/apache/lucene/issues/12892): this involves removing the
leftover usages in facet, grouping, join and test-framework, plus in some
tests. A list of the leftover usages is in the description of the issue. It
would be great to complete this for Lucene 10, otherwise this deprecated method
and usages will stick around for much longer. What do others think? Should we
make this a blocker for the release? I think this is not a huge effort and it
is parallelizable across different people.

2. Intra-segment concurrency (https://github.com/apache/lucene/pull/13542):
current thinking is to add support for partitioning segments when searching,
and searching across segment partitions concurrently. My intention is to
introduce breaking changes and documentation in Lucene 10 (really only the
basics), without switching the default slicing of IndexSearcher to create
segment partitions. We will want to leverage segment partitions in testing.
More iterations are going to be needed to remove duplicated work across
partitions of the same segment, which is my next step, but currently out of
scope for Lucene 10. Judging from the reviews I got so far, my PR is not far
and I am working on it to address comments, polish it a bit more and merge it
soon.

Feedback is welcome

Cheers
Luca

On Wed, Aug 28, 2024 at 3:05 PM Adrien Grand <[email protected]> wrote:
Thanks Mike.

On Wed, Aug 28, 2024 at 2:16 PM Michael McCandless <[email protected]>
wrote:
I think maybe also https://github.com/apache/lucene/issues/13519 should be a
blocker? It looks like 8 bit vector HNSW quantization is broken (unless I'm
making a silly mistake with luceneutil tooling).

I've also set its milestone to 10.0.0.

Do we really not have a way to mark an issue a blocker for a given release? That's insane. OK
well I went and created "blocker" label, and added that to GH 13519. Greg, I'll also go
mark your linked issue as "blocker".

Mike McCandless

http://blog.mikemccandless.com

On Sat, Aug 24, 2024 at 2:33 PM Uwe Schindler <[email protected]> wrote:
Hi,
I updated Policeman Jenkins to have JDK 23 RC and JDK 24 EA releases.
Uwe
P.S.: Unfortunately I have to update the macOS Hackintosh VM to have a newer
operating system version: JDK 22 and later no longer run on this machine.
Am 23.08.2024 um 10:41 schrieb Uwe Schindler:

Hi,
In 9.x there's still the backport of 
https://github.com/apache/lucene/pull/13570 to be done. The PR apperas in the 
changelog, but was not backported yet. Chris and I will do this soon.
9.last release on Sept 22 fits perfectly with the JDK 23 release (and we will 
have Panama Vector Support). I am seeting up Jenkins Job with latest RC now to 
verify all vector stuff works with 23.
Uwe
Am 08.08.2024 um 18:50 schrieb Adrien Grand:

Hello everyone,

As previously discussed, I plan on releasing 9.last and 10.0 under the 
following timeline:
- ~September 15th: 10.0 feature freeze - main becomes 11.0
- ~September 22nd: 9.last release,
- ~October 1st: 10.0 release.

Unless someone shortly volunteers to do a 9.x release, this 9.last release will 
likely be 9.12.

As these dates are coming shortly, I would like to start tracking blockers. 
Please reply to this thread with issues that you know about that should delay 
the 9.last or 10.0 releases.

Chris, Uwe: I also wanted to check with you if this timeline works well with 
regards to supporting Java 23 in 9.last and 10.0?

--
Adrien

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: [email protected]

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: [email protected]


--
Adrien

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: [email protected]

Re: Lucene 10.0 and 9.12 blockers

Reply via email to