Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-26 Thread via GitHub
zhaih commented on code in PR #13124: URL: https://github.com/apache/lucene/pull/13124#discussion_r1503101967 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -910,4 +936,58 @@ public void setSuppressExceptions(ConcurrentMergeScheduler cms)

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
mccullocht commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964602685 @rmuir @uschindler Would it be sufficient to add a factory function to `VectorizationProvider` for this like `GroupVIntUtil.Decoder createGroupVIntDecoder(MemorySegment segment)`? I

Re: [PR] Fix DV update files referenced by merge will be deleted by concurrent flush [lucene]

2024-02-26 Thread via GitHub
stefanvodita commented on PR #13017: URL: https://github.com/apache/lucene/pull/13017#issuecomment-1964882814 Thanks for updating the PR. Before merging, can you also add an entry to [CHANGES.txt, as a bug fix under

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
uschindler commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964649237 Hi, you can try it out. Two important comments: - In Java 21, MemorySegment is still a "preview" class, so it can't be used in any public method signatures. So actually to pass the

Re: [PR] Choose sparse values in IntTaxonomyFacets when FacetsCollector has em… [lucene]

2024-02-26 Thread via GitHub
gsmiller commented on PR #12559: URL: https://github.com/apache/lucene/pull/12559#issuecomment-1964636964 Closing this out since we fixed the underlying root cause last year (sorry, don't have the PR in front of me but can dig it up if necessary). -- This is an automated message from the

Re: [PR] Choose sparse values in IntTaxonomyFacets when FacetsCollector has em… [lucene]

2024-02-26 Thread via GitHub
gsmiller closed pull request #12559: Choose sparse values in IntTaxonomyFacets when FacetsCollector has em… URL: https://github.com/apache/lucene/pull/12559

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
uschindler commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964672015 > I'm surprised by how slow this is with AVX off given that this can be implemented with SSE2 :(. This is why we try to avoid the incubating vector API as much as possible.

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
dweiss commented on PR #13108: URL: https://github.com/apache/lucene/pull/13108#issuecomment-1965293218 Just FYI - I've returned to trying to add a nightly smoketester workflow check. The main branch required some changes to the script as well, which sort of proves Rob's point that we need

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
dweiss commented on PR #13108: URL: https://github.com/apache/lucene/pull/13108#issuecomment-1965313190 I don't think it's going to be so easy - some things just didn't work for me with local artifacts. I'll follow up, perhaps tomorrow.

Re: [I] (Byte|Float)KnnVectorFieldSource unusable if segment has no vector values [lucene]

2024-02-26 Thread via GitHub
hossman closed issue #13105: (Byte|Float)KnnVectorFieldSource unusable if segment has no vector values URL: https://github.com/apache/lucene/issues/13105

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
uschindler commented on PR #13108: URL: https://github.com/apache/lucene/pull/13108#issuecomment-1965306394 I checked the script. Forward port should be more or less: - copy file to main branch - change BASE_VERSION constant in the script The new script is no longer a search

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
rmuir commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1965433852 > I'm surprised by how slow this is with AVX off given that this can be implemented with SSE2 :(. Yes, it is surprising: we found the same situation with VectorUtil byte[] methods.

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
dweiss commented on PR #13108: URL: https://github.com/apache/lucene/pull/13108#issuecomment-1965938522 I think we can merge this?

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
stefanvodita commented on PR #13108: URL: https://github.com/apache/lucene/pull/13108#issuecomment-1964103920 > If you run with Lucene 9.10.0 (just released last week), it should work. I think so, but the 9.10.0 RC artefacts were no longer available at

Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

2024-02-26 Thread via GitHub
msokolov commented on PR #13124: URL: https://github.com/apache/lucene/pull/13124#issuecomment-1964229191 ooh exciting! I left some comments in a related issue that were maybe a little clueless given all the discussion here that I missed until now. Still I'm happy about the direction this

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
stefanvodita commented on code in PR #13108: URL: https://github.com/apache/lucene/pull/13108#discussion_r1502581266 ## dev-tools/scripts/smokeTestRelease.py: ## @@ -633,9 +636,12 @@ def verifyUnpacked(java, artifact, unpackPath, gitRevision, version, testArgs):

Re: [PR] Allow multiple JDKs in smoke test [lucene]

2024-02-26 Thread via GitHub
stefanvodita commented on code in PR #13108: URL: https://github.com/apache/lucene/pull/13108#discussion_r1502581745 ## dev-tools/scripts/smokeTestRelease.py: ## @@ -911,33 +917,45 @@ def crawl(downloadedFiles, urlString, targetDir, exclusions=set()):

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
rmuir commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964303622 Hi, a couple suggestions: 1. Somehow, we need to avoid Vector API code inside the MemorySegment code. Just because MemorySegment is available, does not mean Vector API is usable,

Re: [PR] Vector accelerated GroupVInt decoder for MemorySegmentIndexInput [lucene]

2024-02-26 Thread via GitHub
uschindler commented on PR #13133: URL: https://github.com/apache/lucene/pull/13133#issuecomment-1964318304 I have to agree with Robert. We can't glue that code inside MemorySegmentIndexInput. The only way to do this is to move the vector optimized decoding to a special