[GitHub] [lucene] vigyasharma closed issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure
vigyasharma closed issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure URL: https://github.com/apache/lucene/issues/12097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] vigyasharma merged pull request #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery
vigyasharma merged PR #12098: URL: https://github.com/apache/lucene/pull/12098 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery
rmuir commented on PR #12098: URL: https://github.com/apache/lucene/pull/12098#issuecomment-1397749519 if a test wants to enforce it only has one segment, it should `forceMerge()`, make use of `LuceneTestCase.getOnlyLeafReader()`, etc. Otherwise the number of segments can vary based on flushing/merging, especially when using `LuceneTestCase.newIndexWriterConfig()`, `RandomIndexWriter`, etc. so in general, we want to avoid assertions that rely upon certain segment structure. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jmazanec15 commented on a diff in pull request #12050: Reuse HNSW graph for intialization during merge
jmazanec15 commented on code in PR #12050: URL: https://github.com/apache/lucene/pull/12050#discussion_r1081896861 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -94,36 +93,83 @@ public int size() { } /** - * Add node on the given level + * Add node on the given level. Nodes can be inserted out of order, but it requires that the nodes Review Comment: Updated structure to use treemap to represent upper levels of graph. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jmazanec15 commented on pull request #12050: Reuse HNSW graph for intialization during merge
jmazanec15 commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1397643952 Per [this discussion](https://github.com/apache/lucene/pull/12050#discussion_r1061034056), I refactored OnHeapHnswGraph to use a TreeMap to represent the graph structure for levels greater than 0. I ran performance tests with the same setup as https://github.com/apache/lucene/issues/11354#issuecomment-1239961308, and the results did not show a significant difference in indexing time between my previous implementation, the implementation using the map, and the current implementation with no merge optimization. Additionally, the results did not show a difference in merge time between by previous implementation and the implementation using the map. Here are the results: ### Segment Size 10K Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall -- | -- | -- | -- Control-1 | 189s | 697280 | 0.979 Control-2 | 190s | 722042 | 0.979 Control-3 | 191s | 713402 | 0.979 Test-array 1 | 190s | 683966 | 0.98 Test-array 2 | 187s | 683584 | 0.98 Test-array 3 | 190s | 702458 | 0.98 Test-map 1 | 189s | 723582 | 0.98 Test-map 2 | 187s | 658196 | 0.98 Test-map 3 | 190s | 66 | 0.98 ### Segment Size 100K Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall -- | -- | -- | -- Control-1 | 366s | 675361 | 0.981 Control-2 | 370s | 695974 | 0.981 Control-3 | 367s | 684418 | 0.981 Test-array 1 | 368s | 651814 | 0.981 Test-array 2 | 368s | 654862 | 0.981 Test-array 3 | 368s | 656062 | 0.981 Test-map 1 | 364s | 637257 | 0.981 Test-map 2 | 370s | 628755 | 0.981 Test-map 3 | 366s | 647569 | 0.981 ### Segment Size 500K Exper. | Total indexing time (s) | Total time to merge numeric vectors (ms) | Recall -- | -- | -- | -- Control-1 | 633s | 655538 | 0.98 Control-2 | 631s | 664622 | 0.98 Control-3 | 627s | 635919 | 0.98 Test-array 1 | 639s | 376139 | 0.98 Test-array 2 | 636s | 378071 | 0.98 Test-array 3 | 638s | 352633 | 0.98 Test-map 1 | 645s | 373572 | 0.98 Test-map 2 | 635s | 374309 | 0.98 Test-map 3 | 633s | 381212 | 0.98 Given that the results do not show a significant difference, I switched to use the treemap to avoid multiple large array copies. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] vigyasharma opened a new pull request, #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery
vigyasharma opened a new pull request, #12098: URL: https://github.com/apache/lucene/pull/12098 Fixes bug in `TestIndexSortSortedNumericDocValuesRangeQuery. testCountBoundary`. Addresses #12097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] vigyasharma commented on issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure
vigyasharma commented on issue #12097: URL: https://github.com/apache/lucene/issues/12097#issuecomment-1397542712 Wait.. I think the assert should simply be on total no. of documents, not documents per leaf. Something like: ```java int count = 0; for (LeafReaderContext context : searcher.getLeafContexts()) { count += weight.count(context); } assertEquals(2, count); ``` This passes on my workspace. I'll raise a PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] vigyasharma opened a new issue, #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure
vigyasharma opened a new issue, #12097: URL: https://github.com/apache/lucene/issues/12097 ### Description Found this test failing in Lucene-Check-9.x - Build # 4239. **Steps to repro:** ```ruby gradlew test --tests TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary -Dtests.seed=A11A06AE642497F1 -Dtests.multiplier=2 -Dtests.locale=en-KE -Dtests.timezone=Singapore -Dtests.asserts=true -Dtests.file.encoding=UTF-8 ``` I've been able to repro this failure in both `branch_9x` as well as `main`. This is what I found with a debugger - It seems that in this test seed, the missing document [(created here)](https://github.com/apache/lucene/blob/main/lucene/core/src/test/org/apache/lucene/search/TestIndexSortSortedNumericDocValuesRangeQuery.java#L589), goes into a separate leaf. Then for that leaf context, [the IteratorAndCount](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java#L200) has `count = -1`, taking the flow to the fallback query. And since the document does not have a points value, the fallback query gets `PointValues == null`, leading it to return a 0 [here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L381). I'm not very familiar with this part of Lucene, but I guess it is likely for the missing document to land in a leaf of its own. Would it help if the missing doc had a points value but for a different field? ### Gradle command to reproduce ```ruby gradlew test --tests TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary -Dtests.seed=A11A06AE642497F1 -Dtests.multiplier=2 -Dtests.locale=en-KE -Dtests.timezone=Singapore -Dtests.asserts=true -Dtests.file.encoding=UTF-8 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)
uschindler commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1397386789 I am fine with both PRs, both technically correct. I don't care about username. If I would do a relaese I would insert "policeman" into the artifacts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] magibney commented on a diff in pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)
magibney commented on code in PR #12094: URL: https://github.com/apache/lucene/pull/12094#discussion_r1081614175 ## gradle/java/jar-manifest.gradle: ## @@ -46,7 +46,9 @@ subprojects { if (snapshotBuild) { return "${project.version} ${gitRev} [snapshot build, details omitted]" } else { - return "${project.version} ${gitRev} - ${System.properties['user.name']} - ${buildDate} ${buildTime}" + def sysProps = System.properties Review Comment: I've adjusted this PR accordingly, but following on Robert's feedback also opened a PR that simply removes the username from MANIFEST.MF. Assuming we want to go that direction, we can just close this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] magibney opened a new pull request, #12096: remove username from MANIFEST.MF in build artifacts
magibney opened a new pull request, #12096: URL: https://github.com/apache/lucene/pull/12096 Following on discussion from #12094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] magibney commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)
magibney commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1397358626 > Should we just remove the username from the manifest? This doesn't make sense to me, we don't put usernames anywhere else (e.g. no @author at apache)... This seems fine to me. The RM is associated with the release via the GPG signature, which really is the only meaningful association with the release. That would substantially change the nature of this PR though. If we go that route I'll close this and open a new PR for removing the username from manifest? Is any further discussion necessary to go ahead with removing the username from the manifest? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] magibney commented on pull request #12095: buildAndPushRelease should optionally pause before assembleRelease
magibney commented on PR #12095: URL: https://github.com/apache/lucene/pull/12095#issuecomment-1397344248 The main reason I didn't make this the default is because I'm not sure whether running this through the releaseWizard would support user input. I'm using the releaseWizard to guide me through the steps but running them all manually, so user input is definitely supported. TBH I'm not sure -- maybe releaseWizard _would_ support user input to subcommands? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] javanna commented on pull request #12085: update releaseWizard.py to support offline gpg key
javanna commented on PR #12085: URL: https://github.com/apache/lucene/pull/12085#issuecomment-1396993570 Thanks @magibney ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] javanna merged pull request #12085: update releaseWizard.py to support offline gpg key
javanna merged PR #12085: URL: https://github.com/apache/lucene/pull/12085 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)
rmuir commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1396942514 I have also witnessed harassment from solr users towards the person whose name happens to be in there. Please, lets remove the username. If I am ignored and this option is kept, I will use the option to populate something extremely offensive into the field rather than my real username, and create a release candidate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)
rmuir commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1396882440 Should we just remove the username from the manifest? This doesn't make sense to me, we don't put usernames anywhere else (e.g. no `@author` at apache)... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] romseygeek commented on pull request #12095: buildAndPushRelease should optionally pause before assembleRelease
romseygeek commented on PR #12095: URL: https://github.com/apache/lucene/pull/12095#issuecomment-1396781927 +1, this has caught me multiple times! I think I'd personally make it the default but I don't know if others have things set up so that they don't need to type in their GPG pin. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org