Re: Lucene 10.0 and 9.12 blockers

Chris Hegarty Tue, 01 Oct 2024 09:13:15 -0700

Uwe,

> On 1 Oct 2024, at 08:08, Uwe Schindler <u...@thetaphi.de> wrote:
> 
> Hi,
> are we ready for a release. Because at moment I don't think we have fully 
> cleaned up the codecs and this could lead to problems when going to 11.x 
> because of version numbers in the index files!?
>     • Rename the most recent codec to 10x
>     • Remove older 9.x stuff and move to backwards, only keep latest.
> Is this still planned or ongoing?


Yes. I believe everything that is needed is in Adrien’s PR  - 
https://github.com/apache/lucene/pull/13815. I actually thought that it was 
already merged. 

I’ll do some final testing, etc, on this PR and get it merged.

-Chris.

> Uwe
> Am 30.09.2024 um 19:35 schrieb Luca Cavanna:
>> Hi all,  
>> given Adrien's limited availability, Chris and I flipped a coin and decided 
>> that I will be the release manager for Lucene 10. branch_10x and branch_10_0 
>> have already been created. I will follow up with the first RC tomorrow. 
>> 
>> On Mon, Sep 30, 2024 at 5:14 PM Adrien Grand <jpou...@gmail.com> wrote:
>> FYI I have little availability this week so Luca and Chris offered to help 
>> with the release. Thank you both!
>> 
>> Le sam. 14 sept. 2024, 18:07, Luca Cavanna <java...@apache.org> a écrit :
>> On the topic of removing search(Query, Collector), I think it is unlikely 
>> that we will be able to complete that task before the Lucene 10 branch is 
>> cut. It would be nice to still work on the remaining issues, but it's 
>> probably good to have less urgency around it hence more time to think about 
>> how to properly address the remaining issues. I think that removing a 
>> deprecated method leaving users without a proper replacement is not a good 
>> trade-off. 
>> 
>> I have an outstanding PR around CollectorManager#forSequentialExecution, 
>> based on the risk I described in my previous message.I am now concluding 
>> that it would be safer to remove such a public method from branch_9x before 
>> it gets released. I described the reason in the PR I just opened: 
>> https://github.com/apache/lucene/pull/13790 . I'd like to know what folks 
>> think about this. I myself have changed my mind a few times on the topic.
>> 
>> On Thu, Sep 12, 2024 at 10:09 PM Michael Froh <msf...@gmail.com> wrote:
>> Aha! I realized that I was totally misreading some stacktraces while 
>> debugging some of the Join tests. You're absolutely right that a 
>> ScorerSupplier is created for each doc ID range -- not just the Scorer. 
>> Also, the test that I "helped" Vamsi with didn't really get fixed by moving 
>> construction of the TermsEnum into the get() method -- it just didn't happen 
>> to fail that time. *facepalm* 
>> 
>> Please ignore my unfounded panic :D
>> 
>> Thanks,
>> Froh
>> 
>> On Thu, Sep 12, 2024 at 1:33 AM Luca Cavanna <java...@apache.org> wrote:
>> Thanks for raising this Michael. 
>> 
>> I see a risk as well around removing the deprecated search(Query, Collector) 
>> method by using a sequential collector manager. Previously, we would happily 
>> execute sequentially despite an executor being provided to the searcher 
>> (because search(Query, Collector) bypasses concurrency entirely). After the 
>> migration, we would execute concurrently, hence the collector manager would 
>> throw an exception as soon as more than one slice gets searched. The 
>> sequential collector manager can be used in tests to overcome the lack of a 
>> collector manager, but that only moves the problem to the user side: what 
>> are users of these collectors going to do given they need a collector 
>> manager to call search, and we don't provide one? Their only option would be 
>> to not set an executor to their searcher. Maybe it is a possibility to 
>> clearly document that but is it acceptable to require users to have a non 
>> concurrent index searcher instance around to use for non concurrent queries? 
>> There's only a few cases of this fortunately. There's also a couple of cases 
>> like QueryUtils and JoinUtil where we have utility methods that call search 
>> internally and accept an external searcher. Those searchers may have an 
>> executor set to them, and the only safe way to migrate these is to add 
>> concurrent collector managers support or explicitly state that the provided 
>> search shouldn't have an executor set to it. Another option if we are not 
>> happy with the workaround I mentioned, is to consider leaving search(Query, 
>> Collector) deprecated in Lucene 10 and removing it in Lucene 11. It is a 
>> shame because we are not far off, but I am not sure that this warrants 
>> delaying the release.
>> I am not entirely sure how this aligns with the risk you mentioned, which 
>> cases of SimpleCollector are you referring to specifically?
>> 
>> Regarding your second concern around intra-segment concurrency: while I had 
>> to adapt a couple of tests to be intra-segment ready as they made wrong 
>> assumptions, we are now leveraging intra-segment concurrency in all tests 
>> (provided that the searcher is created using LuceneTestCase#newSearcher), 
>> besides when DrillSideways is used. I have seen a couple of post-merge 
>> failures that may be related which I will look into, but nothing that would 
>> suggest that the design is entirely problematic. When you retrieve a scorer 
>> supplier or a bulk scorer you provide a LeafReaderContext. The overall 
>> expectation is that you get a different instance each time, regardless of 
>> whether you have already seen the segment or not. If that is the case there 
>> is no state shared between threads, because ScorerSuppliers should not get 
>> shared across threads? It is not the expectation that the support for 
>> intra-segment concurrency requires bulk scorers and scorer suppliers to 
>> become thread-safe. 
>> With that in mind, I checked DefaultScorerSupplier and I do not see why it 
>> would not work, as long as each call to retrieve a scorer supplier returns a 
>> new instance of if that points to a new instance of Scorer, which holds 
>> state created as part of that same scorerSupplier call. The problem that we 
>> have is that we duplicate ahead of time work for partitions of the same 
>> segment (tracked in https://github.com/apache/lucene/issues/13745), because 
>> if we need to pull doc_values, we will do so for the same segment multiple 
>> times. I would assume that if something is off with DefaultScorerSupplier, 
>> tests would show that clearly, as it is widely used.
>> I also checked FeatureQuery, and I see that each call to 
>> scorerSupplier(LeafReaderContext) returns a new instance of the supplier 
>> which points to different TermsEnum instance retrieved multiple times for 
>> the same segment. Removing this duplication will require additional work, 
>> and there will be bugs, or incorrect assumptions made in existing scorer 
>> supplier instances, but those should not be too hard to fix. 
>> Does this make sense to you? Perhaps there are additional changes to make in 
>> the migrate guide or javadocs to clarify what I described, let me know what 
>> you think.
>> 
>> Cheers
>> Luca
>> 
>> On Thu, Sep 12, 2024 at 9:42 AM Michael Froh <msf...@gmail.com> wrote:
>> Oh, I forgot to mention: 
>> 
>> I think we should deprecate the DefaultScorerSupplier constructor that takes 
>> a Scorer. There's no way that works with intra-segment concurrency.
>> 
>> Maybe we remove DefaultScorerSupplier altogether?
>> 
>> On Thu, Sep 12, 2024 at 12:35 AM Michael Froh <msf...@gmail.com> wrote:
>> I'm a big fan of both of Luca's topics. I'd like to raise a small red flag 
>> around them, though, since they seem to be connected.
>> 
>> Working through the join module and helping my colleague @harshavamsi on the 
>> QueryUtils side, I see two layers of unpreparatedness for the modern 
>> "concurrency first" architecture. (Again, I want to make clear that I think 
>> the modern architecture is the way to go and we can and should get there in 
>> time for Lucene 10.)
>> 
>> 1. There are several uses of SimpleCollector, where it's assumed that one 
>> collector will collect all results on a single thread. With the deprecated 
>> method, this forces single-threaded behavior all the time. In my opinion, 
>> these represent 13+ year technical debt for cases where you couldn't 
>> properly use an IndexSearcher to do concurrent searches.
>> 2. With the merge of intra-segment searches, we have another layer: 
>> ScorerSuppliers that share mutable state across the Scorers that they 
>> produce. For example, @harshavamsi came across a case today in the sigmoid 
>> function for FeatureQuery where a TermsEnum was created in the 
>> ScorerSupplier and passed into the Scorers. Each Scorer shared the same 
>> TermsEnum. What changed? In the old concurrency model, one thread might 
>> search a few segments, but each segment was guaranteed to only be searched 
>> by one thread. Now, with intra-segment concurrency, we produce one 
>> ScorerSupplier per segment, but may produce multiple Scorers across 
>> different threads. If the ScorerSupplier produces some mutable object and 
>> shares it across the resulting Scorers, you're going to have a bad time. Fun 
>> fact: back in 2012, we had an office Halloween party and I dressed as the 
>> thing that scares me the most. I printed a picture of Texas (since everyone 
>> recognizes Texas) with a TV remote control mute button in the middle. I 
>> sewed it to my shirt in the four corners. It was mutable state held by 
>> multiple threads.
>> 
>> I definitely think we should address these before the Lucene 10 release, as 
>> they provide a clean break from the old world. I also think it's a decent 
>> amount of work (but not unsurmountable). I'm also maybe no longer a fan of 
>> the helper method that Greg added in his PR for the monitor module, since it 
>> risks sweeping non-threadsafe code under the rug, if folks make 
>> single-threaded tests (which is essentially what they've been doing all 
>> along -- see my first point above).
>> 
>> I haven't properly looked into the scope of my second point above, but I've 
>> seen at least two cases in the past two days. Hopefully it's not too bad, 
>> but it might be a risk. I think the first point is still pretty easy to 
>> address.
>> 
>> Thanks,
>> Froh
>> 
>> On Thu, Aug 29, 2024 at 2:15 AM Luca Cavanna <l...@elastic.co.invalid> wrote:
>> For Lucene 10.0, I have two topics to raise: 
>> 
>> 1. Remove the deprecated IndexSearcher#search(Query, Collector) in favour of 
>> IndexSearcher#search(Query, CollectorManager)  
>> (https://github.com/apache/lucene/issues/12892): this involves removing the 
>> leftover usages in facet, grouping, join and test-framework, plus in some 
>> tests. A list of the leftover usages is in the description of the issue. It 
>> would be great to complete this for Lucene 10, otherwise this deprecated 
>> method and usages will stick around for much longer. What do others think? 
>> Should we make this a blocker for the release? I think this is not a huge 
>> effort and it is parallelizable across different people.
>> 
>> 2. Intra-segment concurrency (https://github.com/apache/lucene/pull/13542): 
>> current thinking is to add support for partitioning segments when searching, 
>> and searching across segment partitions concurrently. My intention is to 
>> introduce breaking changes and documentation in Lucene 10 (really only the 
>> basics), without switching the default slicing of IndexSearcher to create 
>> segment partitions. We will want to leverage segment partitions in testing. 
>> More iterations are going to be needed to remove duplicated work across 
>> partitions of the same segment, which is my next step, but currently out of 
>> scope for Lucene 10. Judging from the reviews I got so far, my PR is not far 
>> and I am working on it to address comments, polish it a bit more and merge 
>> it soon.
>> 
>> Feedback is welcome
>> 
>> Cheers
>> Luca
>> 
>> On Wed, Aug 28, 2024 at 3:05 PM Adrien Grand <jpou...@gmail.com> wrote:
>> Thanks Mike.
>> 
>> On Wed, Aug 28, 2024 at 2:16 PM Michael McCandless 
>> <luc...@mikemccandless.com> wrote:
>> I think maybe also https://github.com/apache/lucene/issues/13519 should be a 
>> blocker?  It looks like 8 bit vector HNSW quantization is broken (unless I'm 
>> making a silly mistake with luceneutil tooling). 
>> 
>> I've also set its milestone to 10.0.0.
>> 
>> Do we really not have a way to mark an issue a blocker for a given release?  
>> That's insane.  OK well I went and created "blocker" label, and added that 
>> to GH 13519.  Greg, I'll also go mark your linked issue as "blocker".
>> 
>> Mike McCandless
>> 
>> http://blog.mikemccandless.com
>> 
>> 
>> On Sat, Aug 24, 2024 at 2:33 PM Uwe Schindler <u...@thetaphi.de> wrote:
>> Hi,
>> I updated Policeman Jenkins to have JDK 23 RC and JDK 24 EA releases.
>> Uwe
>> P.S.: Unfortunately I have to update the macOS Hackintosh VM to have a newer 
>> operating system version: JDK 22 and later no longer run on this machine.
>> Am 23.08.2024 um 10:41 schrieb Uwe Schindler:
>>> Hi,
>>> In 9.x there's still the backport of 
>>> https://github.com/apache/lucene/pull/13570 to be done. The PR apperas in 
>>> the changelog, but was not backported yet. Chris and I will do this soon.
>>> 9.last release on Sept 22 fits perfectly with the JDK 23 release (and we 
>>> will have Panama Vector Support). I am seeting up Jenkins Job with latest 
>>> RC now to verify all vector stuff works with 23.
>>> Uwe
>>> Am 08.08.2024 um 18:50 schrieb Adrien Grand:
>>>> Hello everyone, 
>>>> 
>>>> As previously discussed, I plan on releasing 9.last and 10.0 under the 
>>>> following timeline:
>>>> - ~September 15th: 10.0 feature freeze - main becomes 11.0
>>>> - ~September 22nd: 9.last release,
>>>> - ~October 1st: 10.0 release.
>>>> 
>>>> Unless someone shortly volunteers to do a 9.x release, this 9.last release 
>>>> will likely be 9.12.
>>>> 
>>>> As these dates are coming shortly, I would like to start tracking 
>>>> blockers. Please reply to this thread with issues that you know about that 
>>>> should delay the 9.last or 10.0 releases.
>>>> 
>>>> Chris, Uwe: I also wanted to check with you if this timeline works well 
>>>> with regards to supporting Java 23 in 9.last and 10.0?
>>>> 
>>>> -- 
>>>> Adrien
>>> -- 
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>> -- 
>> Uwe Schindler
>> Achterdiek 19, D-28357 Bremen
>> https://www.thetaphi.de
>> eMail: u...@thetaphi.de
>> 
>> 
>> -- 
>> Adrien
> -- 
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Lucene 10.0 and 9.12 blockers

Reply via email to