[GitHub] [lucene-solr] itygh commented on pull request #2676: SOLR-16626: Upgrade to Netty 4.1.87.Final

2023-01-20 Thread GitBox
itygh commented on PR #2676: URL: https://github.com/apache/lucene-solr/pull/2676#issuecomment-1398560219 This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [lucene-solr] janhoy opened a new pull request, #2676: SOLR-16626: Upgrade to Netty 4.1.87.Final

2023-01-20 Thread GitBox
janhoy opened a new pull request, #2676: URL: https://github.com/apache/lucene-solr/pull/2676 Backport from 9.x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [lucene] risdenk commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2023-01-20 Thread GitBox
risdenk commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1398549982 Re: immutable Query in Solr - See https://issues.apache.org/jira/browse/SOLR-16509 and https://github.com/apache/solr/pull/1146 -- This is an automated message from the Apache Git

[GitHub] [lucene] alessandrobenedetti commented on pull request #12029: introduce support in KnnVectorQuery for getters/setters

2023-01-20 Thread GitBox
alessandrobenedetti commented on PR #12029: URL: https://github.com/apache/lucene/pull/12029#issuecomment-1398524390 waiting for the checks and then I'll merge tonight! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #12029: introduce support in KnnVectorQuery for getters/setters

2023-01-20 Thread GitBox
alessandrobenedetti commented on code in PR #12029: URL: https://github.com/apache/lucene/pull/12029#discussion_r1082676520 ## lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java: ## @@ -33,6 +33,7 @@ import org.apache.lucene.store.Directory; import

[GitHub] [lucene] mkhludnev commented on issue #11218: GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows [LUCENE-10181]

2023-01-20 Thread GitBox
mkhludnev commented on issue #11218: URL: https://github.com/apache/lucene/issues/11218#issuecomment-1398122320 @hassenome , can you share versions, stacktrace and invocation arguments? -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [lucene] hassenome commented on issue #11218: GraphTokenStreamFiniteStrings#articulationPointsRecurse can run into stack overflows [LUCENE-10181]

2023-01-20 Thread GitBox
hassenome commented on issue #11218: URL: https://github.com/apache/lucene/issues/11218#issuecomment-1398087851 Hello, We are facing this error, as a root cause for a feature used by ElasticSearch. We are wondering if there is an update to this issue? -- This is an automated message

[GitHub] [lucene] vigyasharma closed issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure

2023-01-19 Thread GitBox
vigyasharma closed issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure URL: https://github.com/apache/lucene/issues/12097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [lucene] vigyasharma merged pull request #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery

2023-01-19 Thread GitBox
vigyasharma merged PR #12098: URL: https://github.com/apache/lucene/pull/12098 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] rmuir commented on pull request #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery

2023-01-19 Thread GitBox
rmuir commented on PR #12098: URL: https://github.com/apache/lucene/pull/12098#issuecomment-1397749519 if a test wants to enforce it only has one segment, it should `forceMerge()`, make use of `LuceneTestCase.getOnlyLeafReader()`, etc. Otherwise the number of segments can vary based

[GitHub] [lucene] jmazanec15 commented on a diff in pull request #12050: Reuse HNSW graph for intialization during merge

2023-01-19 Thread GitBox
jmazanec15 commented on code in PR #12050: URL: https://github.com/apache/lucene/pull/12050#discussion_r1081896861 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -94,36 +93,83 @@ public int size() { } /** - * Add node on the given

[GitHub] [lucene] jmazanec15 commented on pull request #12050: Reuse HNSW graph for intialization during merge

2023-01-19 Thread GitBox
jmazanec15 commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1397643952 Per [this discussion](https://github.com/apache/lucene/pull/12050#discussion_r1061034056), I refactored OnHeapHnswGraph to use a TreeMap to represent the graph structure for levels

[GitHub] [lucene] vigyasharma opened a new pull request, #12098: Fix failure in TestIndexSortSortedNumericDocValuesRangeQuery

2023-01-19 Thread GitBox
vigyasharma opened a new pull request, #12098: URL: https://github.com/apache/lucene/pull/12098 Fixes bug in `TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary`. Addresses #12097 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [lucene] vigyasharma commented on issue #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure

2023-01-19 Thread GitBox
vigyasharma commented on issue #12097: URL: https://github.com/apache/lucene/issues/12097#issuecomment-1397542712 Wait.. I think the assert should simply be on total no. of documents, not documents per leaf. Something like: ```java int count = 0; for (LeafReaderContext

[GitHub] [lucene] vigyasharma opened a new issue, #12097: TestIndexSortSortedNumericDocValuesRangeQuery.testCountBoundary failure

2023-01-19 Thread GitBox
vigyasharma opened a new issue, #12097: URL: https://github.com/apache/lucene/issues/12097 ### Description Found this test failing in Lucene-Check-9.x - Build # 4239. **Steps to repro:** ```ruby gradlew test --tests

[GitHub] [lucene] uschindler commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-19 Thread GitBox
uschindler commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1397386789 I am fine with both PRs, both technically correct. I don't care about username. If I would do a release I would insert "policeman" into the artifacts. -- This is an automated

[GitHub] [lucene] magibney commented on a diff in pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-19 Thread GitBox
magibney commented on code in PR #12094: URL: https://github.com/apache/lucene/pull/12094#discussion_r1081614175 ## gradle/java/jar-manifest.gradle: ## @@ -46,7 +46,9 @@ subprojects { if (snapshotBuild) { return "${project.version} ${gitRev}

[GitHub] [lucene] magibney opened a new pull request, #12096: remove username from MANIFEST.MF in build artifacts

2023-01-19 Thread GitBox
magibney opened a new pull request, #12096: URL: https://github.com/apache/lucene/pull/12096 Following on discussion from #12094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [lucene] magibney commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-19 Thread GitBox
magibney commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1397358626 > Should we just remove the username from the manifest? This doesn't make sense to me, we don't put usernames anywhere else (e.g. no @author at apache)... This seems fine to me.

[GitHub] [lucene] magibney commented on pull request #12095: buildAndPushRelease should optionally pause before assembleRelease

2023-01-19 Thread GitBox
magibney commented on PR #12095: URL: https://github.com/apache/lucene/pull/12095#issuecomment-1397344248 The main reason I didn't make this the default is because I'm not sure whether running this through the releaseWizard would support user input. I'm using the releaseWizard to guide me

[GitHub] [lucene] javanna commented on pull request #12085: update releaseWizard.py to support offline gpg key

2023-01-19 Thread GitBox
javanna commented on PR #12085: URL: https://github.com/apache/lucene/pull/12085#issuecomment-1396993570 Thanks @magibney ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [lucene] javanna merged pull request #12085: update releaseWizard.py to support offline gpg key

2023-01-19 Thread GitBox
javanna merged PR #12085: URL: https://github.com/apache/lucene/pull/12085 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] rmuir commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-19 Thread GitBox
rmuir commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1396942514 I have also witnessed harassment from solr users towards the person whose name happens to be in there. Please, let's remove the username. If I am ignored and this option is kept, I

[GitHub] [lucene] rmuir commented on pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-19 Thread GitBox
rmuir commented on PR #12094: URL: https://github.com/apache/lucene/pull/12094#issuecomment-1396882440 Should we just remove the username from the manifest? This doesn't make sense to me, we don't put usernames anywhere else (e.g. no `@author` at apache)... -- This is an automated

[GitHub] [lucene] romseygeek commented on pull request #12095: buildAndPushRelease should optionally pause before assembleRelease

2023-01-19 Thread GitBox
romseygeek commented on PR #12095: URL: https://github.com/apache/lucene/pull/12095#issuecomment-1396781927 +1, this has caught me multiple times! I think I'd personally make it the default but I don't know if others have things set up so that they don't need to type in their GPG pin.

[GitHub] [lucene] vigyasharma commented on issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2023-01-18 Thread GitBox
vigyasharma commented on issue #12000: URL: https://github.com/apache/lucene/issues/12000#issuecomment-1396551522 Removed UTF8TaxonomyWriterCache from main, and deprecated it in 9.x. We now default to LruTaxonomyWriterCache. PRs have been merged in. Closing this issue. -- This is an

[GitHub] [lucene] vigyasharma closed issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2023-01-18 Thread GitBox
vigyasharma closed issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak URL: https://github.com/apache/lucene/issues/12000 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [lucene] vigyasharma commented on issue #12082: LeafFieldComparator setBottom not being called before compareBottom

2023-01-18 Thread GitBox
vigyasharma commented on issue #12082: URL: https://github.com/apache/lucene/issues/12082#issuecomment-1396549638 I think you're right that `bottom` should be scoped outside the `LeafFieldComparator`. It stores the bottom slot value for competitive hits and should survive across leaf

[GitHub] [lucene] LuXugang merged pull request #12084: Same bound with fallbackQuery

2023-01-18 Thread GitBox
LuXugang merged PR #12084: URL: https://github.com/apache/lucene/pull/12084 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] uschindler commented on a diff in pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-18 Thread GitBox
uschindler commented on code in PR #12094: URL: https://github.com/apache/lucene/pull/12094#discussion_r1080680435 ## dev-tools/scripts/buildAndPushRelease.py: ## @@ -120,6 +120,8 @@ def prepare(root, version, gpg_key_id, gpg_password, gpg_home=None, sign_gradle= print('

[GitHub] [lucene] uschindler commented on a diff in pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-18 Thread GitBox
uschindler commented on code in PR #12094: URL: https://github.com/apache/lucene/pull/12094#discussion_r1080679818 ## dev-tools/scripts/buildAndPushRelease.py: ## @@ -120,6 +120,8 @@ def prepare(root, version, gpg_key_id, gpg_password, gpg_home=None, sign_gradle= print('

[GitHub] [lucene] uschindler commented on a diff in pull request #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-18 Thread GitBox
uschindler commented on code in PR #12094: URL: https://github.com/apache/lucene/pull/12094#discussion_r1080678559 ## gradle/java/jar-manifest.gradle: ## @@ -46,7 +46,9 @@ subprojects { if (snapshotBuild) { return "${project.version} ${gitRev}

[GitHub] [lucene] vigyasharma merged pull request #12093: Deprecate support for UTF8TaxonomyWriterCache

2023-01-18 Thread GitBox
vigyasharma merged PR #12093: URL: https://github.com/apache/lucene/pull/12093 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] jmazanec15 commented on a diff in pull request #12050: Reuse HNSW graph for intialization during merge

2023-01-18 Thread GitBox
jmazanec15 commented on code in PR #12050: URL: https://github.com/apache/lucene/pull/12050#discussion_r1080646383 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -94,36 +93,83 @@ public int size() { } /** - * Add node on the given

[GitHub] [lucene] vigyasharma merged pull request #12092: Remove UTF8TaxonomyWriterCache

2023-01-18 Thread GitBox
vigyasharma merged PR #12092: URL: https://github.com/apache/lucene/pull/12092 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] vigyasharma commented on pull request #12093: Deprecate support for UTF8TaxonomyWriterCache

2023-01-18 Thread GitBox
vigyasharma commented on PR #12093: URL: https://github.com/apache/lucene/pull/12093#issuecomment-1387643504 > change the default implementation in branch_9x to LRU as well? (either here on this issue or via #12092). I think it would be good to not default to the deprecated impl. Ah,

[GitHub] [lucene] rmuir commented on pull request #12093: Deprecate support for UTF8TaxonomyWriterCache

2023-01-18 Thread GitBox
rmuir commented on PR #12093: URL: https://github.com/apache/lucene/pull/12093#issuecomment-1387508863 @vigyasharma do you intend to change the default implementation in branch_9x to LRU as well? (either here on this issue or via #12092). I think it would be good to not default to the

[GitHub] [lucene] magibney opened a new pull request, #12095: buildAndPushRelease should optionally pause before assembleRelease

2023-01-18 Thread GitBox
magibney opened a new pull request, #12095: URL: https://github.com/apache/lucene/pull/12095 buildAndPushRelease currently proceeds directly from running tests to assembling the release (and signing jars). Since assembleRelease prompts for GPG key PIN, it can easily happen that the RM

[GitHub] [lucene] magibney opened a new pull request, #12094: releaseWizard: allow explicitly setting MANIFEST.MF userid (e.g., to apache id)

2023-01-18 Thread GitBox
magibney opened a new pull request, #12094: URL: https://github.com/apache/lucene/pull/12094 buildAndPushRelease (release script) currently sets the username portion of the `ImplementationVersion` property MANIFEST.MF entry for built jars according to the local machine username of the active

[GitHub] [lucene] rmuir commented on issue #12091: Speeding up Lucene Vector Similarity through the Java Vector API

2023-01-18 Thread GitBox
rmuir commented on issue #12091: URL: https://github.com/apache/lucene/issues/12091#issuecomment-1386986370 There is nothing to do here about it. Convince OpenJDK to stop hostaging the vector api in incubating status like they have done for years. When it is at least in "Preview"

[GitHub] [lucene] rmuir commented on issue #12090: Building a Lucene posting format that leverages the Java Vector API

2023-01-18 Thread GitBox
rmuir commented on issue #12090: URL: https://github.com/apache/lucene/issues/12090#issuecomment-1386986113 There is nothing to do here about it. Convince OpenJDK to stop hostaging the vector api in incubating status like they have done for years. When it is at least in "Preview"

[GitHub] [lucene] rmuir commented on issue #11902: Customization of Edit distance costs for different operations

2023-01-18 Thread GitBox
rmuir commented on issue #11902: URL: https://github.com/apache/lucene/issues/11902#issuecomment-1386981136 this would be far too trappy, entirely too slow. use toy python libraries like the one referenced if you want to build toys, but this is a library for building search engines --

[GitHub] [lucene] rmuir closed issue #11902: Customization of Edit distance costs for different operations

2023-01-18 Thread GitBox
rmuir closed issue #11902: Customization of Edit distance costs for different operations URL: https://github.com/apache/lucene/issues/11902 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [lucene] mohamedniyaz1996 commented on issue #11902: Customization of Edit distance costs for different operations

2023-01-18 Thread GitBox
mohamedniyaz1996 commented on issue #11902: URL: https://github.com/apache/lucene/issues/11902#issuecomment-1386830082 @tang-hi , I agree it will be a dip in performance - but still it can be provided as a feature with a warning about performance drop. -- This is an automated message

[GitHub] [lucene] vigyasharma commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2023-01-17 Thread GitBox
vigyasharma commented on PR #12013: URL: https://github.com/apache/lucene/pull/12013#issuecomment-1386565076 PR - https://github.com/apache/lucene/pull/12093 to deprecate `UTF8TaxonomyWriterCache` in 9.x -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [lucene] vigyasharma opened a new pull request, #12093: Deprecate support for UTF8TaxonomyWriterCache

2023-01-17 Thread GitBox
vigyasharma opened a new pull request, #12093: URL: https://github.com/apache/lucene/pull/12093 As discussed in PR #12013 , deprecating support for `UTF8TaxonomyWriterCache` in branch_9x. Addresses #12000 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [lucene] vigyasharma merged pull request #12045: fix typo in KoreanNumberFilter

2023-01-17 Thread GitBox
vigyasharma merged PR #12045: URL: https://github.com/apache/lucene/pull/12045 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] vigyasharma closed pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2023-01-17 Thread GitBox
vigyasharma closed pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close() URL: https://github.com/apache/lucene/pull/12013 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [lucene] vigyasharma commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2023-01-17 Thread GitBox
vigyasharma commented on PR #12013: URL: https://github.com/apache/lucene/pull/12013#issuecomment-1386545577 Created a separate PR - #12092 to remove support for `UTF8TaxonomyWriterCache` from main. Will close this PR. -- This is an automated message from the Apache Git Service. To

[GitHub] [lucene] vigyasharma opened a new pull request, #12092: Remove UTF8TaxonomyWriterCache

2023-01-17 Thread GitBox
vigyasharma opened a new pull request, #12092: URL: https://github.com/apache/lucene/pull/12092 As per the discussion in PR #12013, this change removes the never evicting `UTF8TaxonomyWriterCache` and uses `LruTaxonomyWriterCache` as the default taxonomy writer cache implementation.

[GitHub] [lucene] jebnix commented on issue #11870: Create a Markdown based documentation

2023-01-17 Thread GitBox
jebnix commented on issue #11870: URL: https://github.com/apache/lucene/issues/11870#issuecomment-1386297416 @uschindler That's nice, but I personally miss two things about the Lucene repo: 1. The ability to find the documentation in a central place (that makes the contribution much

[GitHub] [lucene] mulugetam opened a new issue, #12091: Speeding up Lucene Vector Similarity through the Java Vector API

2023-01-17 Thread GitBox
mulugetam opened a new issue, #12091: URL: https://github.com/apache/lucene/issues/12091 ### Description Lucene's implementation of ANN relies on a scalar implementation of the vector similarity functions

[GitHub] [lucene] mulugetam opened a new issue, #12090: Building a Lucene posting format that leverages the Java Vector API

2023-01-17 Thread GitBox
mulugetam opened a new issue, #12090: URL: https://github.com/apache/lucene/issues/12090 ### Description This issue is to start a conversation on implementing a vectorized encoding and decoding scheme for postings. A few months ago, we implemented vectorized integer

[GitHub] [lucene] gsmiller commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072874867 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] gsmiller commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072872477 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] gsmiller commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072871141 ## lucene/core/src/java/org/apache/lucene/search/DisiWrapper.java: ## @@ -57,4 +57,14 @@ public DisiWrapper(Scorer scorer) { matchCost = 0f; } } + +

[GitHub] [lucene] rmuir commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
rmuir commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072855208 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] rmuir commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
rmuir commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072841550 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] gsmiller commented on pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1386068784 @rmuir > I was naively thinking to try to the same approach with the DocValuesTermsQuery that is in sandbox... I think that's probably a good place to start honestly. I was

[GitHub] [lucene] gsmiller commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072835306 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] rmuir commented on a diff in pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
rmuir commented on code in PR #12089: URL: https://github.com/apache/lucene/pull/12089#discussion_r1072830614 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/queries/TermInSetQuery.java: ## @@ -0,0 +1,527 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] rmuir commented on pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
rmuir commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1385982331 I modified the benchmark from #12087 to just use StringField instead of IntField. The queries are supposed to be "hard" in that I'm not trying to benchmark what is necessarily typical,

[GitHub] [lucene] rmuir commented on pull request #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
rmuir commented on PR #12089: URL: https://github.com/apache/lucene/pull/12089#issuecomment-1385954675 Thanks for looking at this. I can alter benchmark from #12087 to test this case, honestly we could even just take the benchmark and index the numeric field as a string instead as a start

[GitHub] [lucene] gsmiller commented on pull request #12054: Introduce a new `KeywordField`.

2023-01-17 Thread GitBox
gsmiller commented on PR #12054: URL: https://github.com/apache/lucene/pull/12054#issuecomment-1385952712 Somewhat related to this PR, I've been experimenting with the idea of a "self optimizing" `TermInSetQuery` implementation that toggles between using postings and doc values based on

[GitHub] [lucene] gsmiller opened a new pull request, #12089: [DRAFT] Explore TermInSet Query that "self optimizes"

2023-01-17 Thread GitBox
gsmiller opened a new pull request, #12089: URL: https://github.com/apache/lucene/pull/12089 ### Description This is a DRAFT PR to sketch out the idea of a "self optimizing" TermInSetQuery. The idea is to build on the new `KeywordField` being proposed in #12054, which indexes both

[GitHub] [lucene] rmuir closed issue #11869: Add RangeOnRangeFacetCounts

2023-01-17 Thread GitBox
rmuir closed issue #11869: Add RangeOnRangeFacetCounts URL: https://github.com/apache/lucene/issues/11869 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[GitHub] [lucene] rmuir commented on issue #11795: Add FilterDirectory to track write amplification factor

2023-01-17 Thread GitBox
rmuir commented on issue #11795: URL: https://github.com/apache/lucene/issues/11795#issuecomment-1385823162 Closing as the PR has been merged and is in the 9.5.0 section of CHANGES.txt -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [lucene] rmuir closed issue #11795: Add FilterDirectory to track write amplification factor

2023-01-17 Thread GitBox
rmuir closed issue #11795: Add FilterDirectory to track write amplification factor URL: https://github.com/apache/lucene/issues/11795 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [lucene] rmuir commented on issue #11869: Add RangeOnRangeFacetCounts

2023-01-17 Thread GitBox
rmuir commented on issue #11869: URL: https://github.com/apache/lucene/issues/11869#issuecomment-1385822481 Closing as the PR has been merged and is in the 9.5.0 section of CHANGES.txt -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [lucene] rmuir merged pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-16 Thread GitBox
rmuir merged PR #12087: URL: https://github.com/apache/lucene/pull/12087 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] rmuir commented on a diff in pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-16 Thread GitBox
rmuir commented on code in PR #12087: URL: https://github.com/apache/lucene/pull/12087#discussion_r1071265859 ## lucene/core/src/java/org/apache/lucene/document/NumericDocValuesField.java: ## @@ -97,6 +97,27 @@ SortedNumericDocValues getValues(LeafReader reader, String field)

[GitHub] [lucene] jpountz commented on a diff in pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-16 Thread GitBox
jpountz commented on code in PR #12087: URL: https://github.com/apache/lucene/pull/12087#discussion_r1071254421 ## lucene/core/src/java/org/apache/lucene/document/NumericDocValuesField.java: ## @@ -97,6 +97,27 @@ SortedNumericDocValues getValues(LeafReader reader, String

[GitHub] [lucene] romseygeek commented on pull request #12088: Don't throw UOE when highlighting FieldExistsQuery

2023-01-16 Thread GitBox
romseygeek commented on PR #12088: URL: https://github.com/apache/lucene/pull/12088#issuecomment-1383932158 Thanks for the review @mkhludnev! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [lucene] romseygeek merged pull request #12088: Don't throw UOE when highlighting FieldExistsQuery

2023-01-16 Thread GitBox
romseygeek merged PR #12088: URL: https://github.com/apache/lucene/pull/12088 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] romseygeek opened a new pull request, #12088: Don't throw UOE when highlighting FieldExistsQuery

2023-01-16 Thread GitBox
romseygeek opened a new pull request, #12088: URL: https://github.com/apache/lucene/pull/12088 WeightedSpanTermExtractor will try to rewrite queries that it doesn't know about, to see if they end up as something it does know about and that it can extract terms from. To support field

[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1383064967 The benchmark above uses queries such as `"la|21,22,23",// 2226 hits`. In this case we form a boolean query of TermQuery:"la" AND admin2code in (21,22,23). The admin2 codes are

[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1383064386 Here are my benchmarks with the attached Java program: [NumSetBenchmark.java.txt](https://github.com/apache/lucene/files/10419558/NumSetBenchmark.java.txt) * `main` uses

[GitHub] [lucene] rmuir commented on pull request #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir commented on PR #12087: URL: https://github.com/apache/lucene/pull/12087#issuecomment-1382954253 intended as followups: * look into PointRangeQuery and implement necessary estimation for IndexOrDocValuesQuery to do the right thing * Add newSetQuery() to

[GitHub] [lucene] rmuir commented on issue #12028: Add newSetQuery for IntField, LongField, FloatField, DoubleField

2023-01-14 Thread GitBox
rmuir commented on issue #12028: URL: https://github.com/apache/lucene/issues/12028#issuecomment-1382953573 I don't think it is good to degrade to `BooleanQuery` when using points or doc-values, it will only hurt performance. Let's add `NumericDocValuesField.newSlowSetQuery()` and

[GitHub] [lucene] rmuir opened a new pull request, #12087: Graduate DocValuesNumbersQuery from lucene/sandbox to newSlowSetQuery()

2023-01-14 Thread GitBox
rmuir opened a new pull request, #12087: URL: https://github.com/apache/lucene/pull/12087 Clean up this query a bit, and move it around to support: * NumericDocValuesField.newSlowSetQuery() * SortedNumericDocValuesField.newSlowSetQuery() This complements the existing

[GitHub] [lucene] rmuir merged pull request #12086: Upgrade to errorprone 2.18

2023-01-14 Thread GitBox
rmuir merged PR #12086: URL: https://github.com/apache/lucene/pull/12086

[GitHub] [lucene] rmuir closed issue #12057: ban finalizers in the build somehow (worst-case: use error-prone)

2023-01-14 Thread GitBox
rmuir closed issue #12057: ban finalizers in the build somehow (worst-case: use error-prone) URL: https://github.com/apache/lucene/issues/12057

[GitHub] [lucene] rmuir opened a new pull request, #12086: Upgrade to errorprone 2.18

2023-01-14 Thread GitBox
rmuir opened a new pull request, #12086: URL: https://github.com/apache/lucene/pull/12086 Went thru the new checks as usual. Now that `Finalize` has our bugfix, I enabled it. Closes #12057

[GitHub] [lucene] rmuir merged pull request #12056: Update to error-prone 2.17

2023-01-14 Thread GitBox
rmuir merged PR #12056: URL: https://github.com/apache/lucene/pull/12056

[GitHub] [lucene] rmuir merged pull request #12038: remove non-NRT replication support

2023-01-14 Thread GitBox
rmuir merged PR #12038: URL: https://github.com/apache/lucene/pull/12038

[GitHub] [lucene] rmuir closed issue #11381: remove non-NRT replication support [LUCENE-10345]

2023-01-14 Thread GitBox
rmuir closed issue #11381: remove non-NRT replication support [LUCENE-10345] URL: https://github.com/apache/lucene/issues/11381

[GitHub] [lucene] benwtrent commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2023-01-14 Thread GitBox
benwtrent commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1382728572 This for sure has to do with reading for the memory offsets and then reading the neighbors. I can dig into this a little bit next week unless somebody else has a really good

[GitHub] [lucene] jpountz commented on pull request #12079: Speed up 1D BKD merging.

2023-01-14 Thread GitBox
jpountz commented on PR #12079: URL: https://github.com/apache/lucene/pull/12079#issuecomment-1382690674 The last data point at https://people.apache.org/~mikemccand/lucenebench/sparseResults.html#tot_merge_times has a drop for overall merging that I expect to be mostly contributed by this

[GitHub] [lucene] jpountz commented on pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2023-01-14 Thread GitBox
jpountz commented on PR #11860: URL: https://github.com/apache/lucene/pull/11860#issuecomment-1382689973 For reference, there seems to be a 6-7% QPS drop on nightly benchmarks associated with this change. https://people.apache.org/~mikemccand/lucenebench/VectorSearch.html I think it's

[GitHub] [lucene] zhaih commented on pull request #12050: Reuse HNSW graph for intialization during merge

2023-01-13 Thread GitBox
zhaih commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1382268387 +1, That sounds good!

[GitHub] [lucene] jmazanec15 commented on a diff in pull request #12050: Reuse HNSW graph for intialization during merge

2023-01-13 Thread GitBox
jmazanec15 commented on code in PR #12050: URL: https://github.com/apache/lucene/pull/12050#discussion_r1069901702 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -94,36 +93,83 @@ public int size() { } /** - * Add node on the given

[GitHub] [lucene] magibney opened a new pull request, #12085: update releaseWizard.py to support offline gpg key

2023-01-13 Thread GitBox
magibney opened a new pull request, #12085: URL: https://github.com/apache/lucene/pull/12085 porting analogous change from solr: https://github.com/apache/solr/pull/1288

[GitHub] [lucene] LuXugang opened a new pull request, #12084: Same bound with fallbackQuery

2023-01-13 Thread GitBox
LuXugang opened a new pull request, #12084: URL: https://github.com/apache/lucene/pull/12084 ## Description IndexSortSortedNumericDocValuesRangeQuery should have the same bound as fallbackQuery. According to the comment, my guess is that it is a typo?

[GitHub] [lucene] jpountz commented on pull request #12083: MultiCollector shouldn't report that scores are needed when they're not.

2023-01-13 Thread GitBox
jpountz commented on PR #12083: URL: https://github.com/apache/lucene/pull/12083#issuecomment-1381870866 Thanks Luca!

[GitHub] [lucene] jpountz merged pull request #12083: MultiCollector shouldn't report that scores are needed when they're not.

2023-01-13 Thread GitBox
jpountz merged PR #12083: URL: https://github.com/apache/lucene/pull/12083

[GitHub] [lucene] jpountz opened a new pull request, #12083: MultiCollector shouldn't report that scores are needed when they're not.

2023-01-13 Thread GitBox
jpountz opened a new pull request, #12083: URL: https://github.com/apache/lucene/pull/12083 When sub collectors don't agree on their `ScoreMode`, `MultiCollector` currently returns `COMPLETE`. This makes sense when assuming that there is likely one collector computing top hits

[GitHub] [lucene] LuXugang merged pull request #12078: Enhance XXXField#newRangeQuery

2023-01-13 Thread GitBox
LuXugang merged PR #12078: URL: https://github.com/apache/lucene/pull/12078

[GitHub] [lucene] LuXugang closed issue #12074: Enhance XXXField#newRangeQuery

2023-01-13 Thread GitBox
LuXugang closed issue #12074: Enhance XXXField#newRangeQuery URL: https://github.com/apache/lucene/issues/12074

[GitHub] [lucene] romseygeek commented on pull request #11807: No need to rewrite queries in unified highlighter

2023-01-13 Thread GitBox
romseygeek commented on PR #11807: URL: https://github.com/apache/lucene/pull/11807#issuecomment-1381532407 Oops, yes, I should have backported it at the time. Will do that now!
