[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1383064967 The benchmark above uses queries such as `"la|21,22,23",// 2226 hits`. In this case we form a boolean query of TermQuery:"la" AND admin2code in (21,22,23). The admin2 codes are

[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1383064386 Here are my benchmarks, with the attached Java program: [NumSetBenchmark.java.txt](https://github.com/apache/lucene/files/10419558/NumSetBenchmark.java.txt) * `main` uses

[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1382954253 Intended as follow-ups: * Look into PointRangeQuery and implement the necessary estimation for IndexOrDocValuesQuery to do the right thing * Add newSetQuery() to

[GitHub] [lucene] rmuir commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

2023-01-14 Thread GitBox
rmuir commented on issue #12028: URL: https://github.com/apache/lucene/issues/12028#issuecomment-1382953573 I don't think it is good to degrade to `BooleanQuery` when using points or doc-values; it will only hurt performance. Let's add `NumericDocValuesField.newSlowSetQuery()` and

[GitHub] [lucene] rmuir opened a new pull request, #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir opened a new pull request, #12087: URL: https://github.com/apache/lucene/pull/12087 Clean up this query a bit, and move it around to support: * NumericDocValuesField.newSlowSetQuery() * SortedNumericDocValuesField.newSlowSetQuery() This complements the existing

[GitHub] [lucene] rmuir merged pull request #12086: Upgrade to errorprone 2.18

2023-01-14 Thread GitBox
rmuir merged PR #12086: URL: https://github.com/apache/lucene/pull/12086 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] rmuir closed issue #12057: ban finalizers in the build somehow (worst-case: use error-prone)

2023-01-14 Thread GitBox
rmuir closed issue #12057: ban finalizers in the build somehow (worst-case: use error-prone) URL: https://github.com/apache/lucene/issues/12057

[GitHub] [lucene] rmuir opened a new pull request, #12086: Upgrade to errorprone 2.18

2023-01-14 Thread GitBox
rmuir opened a new pull request, #12086: URL: https://github.com/apache/lucene/pull/12086 Went through the new checks as usual. Now that `Finalize` has our bugfix, I enabled it. Closes #12057

[GitHub] [lucene] rmuir merged pull request #12056: Update to error-prone 2.17

2023-01-14 Thread GitBox
rmuir merged PR #12056: URL: https://github.com/apache/lucene/pull/12056

[GitHub] [lucene] rmuir merged pull request #12038: remove non-NRT replication support

2023-01-14 Thread GitBox
rmuir merged PR #12038: URL: https://github.com/apache/lucene/pull/12038

[GitHub] [lucene] rmuir closed issue #11381: remove non-NRT replication support [LUCENE-10345]

2023-01-14 Thread GitBox
rmuir closed issue #11381: remove non-NRT replication support [LUCENE-10345] URL: https://github.com/apache/lucene/issues/11381

[GitHub] [lucene] benwtrent commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2023-01-14 Thread GitBox
benwtrent commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1382728572 This for sure has to do with reading for the memory offsets and then reading the neighbors. I can dig into this a little bit next week unless somebody else has a really good

[GitHub] [lucene] jpountz commented on pull request #12079: Speed up 1D BKD merging.

2023-01-14 Thread GitBox
jpountz commented on PR #12079: URL: https://github.com/apache/lucene/pull/12079#issuecomment-1382690674 The last data point at https://people.apache.org/~mikemccand/lucenebench/sparseResults.html#tot_merge_times has a drop for overall merging that I expect to be mostly contributed by this

[GitHub] [lucene] jpountz commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2023-01-14 Thread GitBox
jpountz commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1382689973 For reference, there seems to be a 6-7% QPS drop on nightly benchmarks associated with this change. https://people.apache.org/~mikemccand/lucenebench/VectorSearch.html I think it's