[GitHub] [lucene] rmuir commented on issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir commented on issue #12023: URL: https://github.com/apache/lucene/issues/12023#issuecomment-1353025015 determinization has already been removed here. that is the problem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [lucene] rmuir closed issue #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
rmuir closed issue #12023: Mechanism to interrupt long-running/resource intensive queries URL: https://github.com/apache/lucene/issues/12023

[GitHub] [lucene] Bukhtawar opened a new issue, #12023: Mechanism to interrupt long-running/resource intensive queries

2022-12-15 Thread GitBox
Bukhtawar opened a new issue, #12023: URL: https://github.com/apache/lucene/issues/12023 ### Description As a part of https://github.com/opensearch-project/OpenSearch/issues/687 we detected that regex queries can run into tight loops for quite long. Below is the stack trace of the

[GitHub] [lucene] rmuir commented on issue #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
rmuir commented on issue #12021: URL: https://github.com/apache/lucene/issues/12021#issuecomment-1352977757 This looks like a bug in solr code (SolrDocumentFetcher) so I'd recommend opening a bug over at https://github.com/apache/solr

[GitHub] [lucene] iverase commented on a diff in pull request #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
iverase commented on code in PR #12022: URL: https://github.com/apache/lucene/pull/12022#discussion_r1049521314 ## lucene/CHANGES.txt: ## @@ -68,6 +68,8 @@ Bug Fixes * LUCENE-10599: LogMergePolicy is more likely to keep merging segments until they reach the maximum merge

[GitHub] [lucene] craigtaverner opened a new pull request, #12022: Fix flat polygons incorrectly containing intersecting geometries

2022-12-15 Thread GitBox
craigtaverner opened a new pull request, #12022: URL: https://github.com/apache/lucene/pull/12022 Fixes https://github.com/apache/lucene/issues/12020 ### Description When performing a search using a shape geometry query of relation type `QueryRelation.CONTAINS`, it is possible

[GitHub] [lucene] nosvalds opened a new issue, #12021: Large fields with large="true" can be truncated in v9+

2022-12-15 Thread GitBox
nosvalds opened a new issue, #12021: URL: https://github.com/apache/lucene/issues/12021 ### Description ## Issue For fields using `large="true"`, large fields (which is what they are intended for) can be truncated in v9+ of Lucene. Example fieldtype definition: ```

[GitHub] [lucene] craigtaverner opened a new issue, #12020: Very flat polygons give incorrect 'contains' result

2022-12-15 Thread GitBox
craigtaverner opened a new issue, #12020: URL: https://github.com/apache/lucene/issues/12020 ### Description When performing a search using a shape geometry query of relation type `QueryRelation.CONTAINS`, it is possible to get a false positive when two geometries intersect, but

[GitHub] [lucene] LuXugang commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

2022-12-15 Thread GitBox
LuXugang commented on code in PR #12017: URL: https://github.com/apache/lucene/pull/12017#discussion_r1049162986 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -1015,6 +1015,80 @@ public void testDisjunctionRandomClausesMatchesCount() throws

[GitHub] [lucene] LuXugang commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

2022-12-14 Thread GitBox
LuXugang commented on code in PR #12017: URL: https://github.com/apache/lucene/pull/12017#discussion_r1049163040 ## lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java: ## @@ -1015,6 +1015,80 @@ public void testDisjunctionRandomClausesMatchesCount() throws

[GitHub] [lucene] LuXugang commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

2022-12-14 Thread GitBox
LuXugang commented on code in PR #12017: URL: https://github.com/apache/lucene/pull/12017#discussion_r1049162892 ## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ## @@ -470,14 +470,19 @@ private int reqCount(LeafReaderContext context) throws IOException {

[GitHub] [lucene] uschindler commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352328979 > I will check it out and inspect the coverage report from .js tests and see if there are any holes. If i find them I will push more tests. I am just really paranoid about some of

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352322297 I will check it out and inspect the coverage report from .js tests and see if there are any holes. If i find them I will push more tests. I am just really paranoid about some of the slow

[GitHub] [lucene] uschindler commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049026102 ## lucene/expressions/src/java/module-info.java: ## @@ -19,7 +19,7 @@ module org.apache.lucene.expressions { Review Comment: Ok. No problem. I just noticed

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352291207 > sorry i didnt do the other tests, I gotta run for now. i can do them later tonight or tomorrow if you need but just wanted to prototype it out @rmuir tests have been migrated,

[GitHub] [lucene] reta commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
reta commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049019777 ## lucene/expressions/src/java/module-info.java: ## @@ -19,7 +19,7 @@ module org.apache.lucene.expressions { Review Comment: ah ... it is still automatic :(

[GitHub] [lucene] rmuir commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049016403 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompilerSettings.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] uschindler commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049015663 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompilerSettings.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] reta commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
reta commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049014391 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompilerSettings.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [lucene] uschindler commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049010145 ## lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompilerSettings.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] uschindler commented on a diff in pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on code in PR #12016: URL: https://github.com/apache/lucene/pull/12016#discussion_r1049005864 ## lucene/expressions/src/java/module-info.java: ## @@ -19,7 +19,7 @@ module org.apache.lucene.expressions { Review Comment: Can we now remove this

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352259233 sorry i didnt do the other tests, I gotta run for now. i can do them later tonight or tomorrow if you need but just wanted to prototype it out

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352258413 > I pushed a proposal to your branch (only fixing one of the tests in .js to use it). If you are really against it, just revert the commit. But i think it keeps tests simpler. :+1:

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352248407 I pushed a proposal to your branch (only fixing one of the tests in .js to use it). If you are really against it, just revert the commit. But i think it keeps tests simpler.

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352177919 we don't need to parameterize the pickiness IMO, we can just turn it on in these tests. Thanks for getting this hooked up. I will look at your branch and try to play with it.

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1352084948 @rmuir first pass over pickiness, all tests are now run w/o diagnostics listener (picky / not picky mode), the rough observations so far are encouraging, there are no significant changes

[GitHub] [lucene] benwtrent opened a new pull request, #12019: Clean up vector backward-codecs

2022-12-14 Thread GitBox
benwtrent opened a new pull request, #12019: URL: https://github.com/apache/lucene/pull/12019 I noticed that the Lucene94 backward-codec was still allowing KNN fields writer. I have removed it, and updated the related tests similarly to other KNN backward-codec changes. There are

[GitHub] [lucene] benwtrent commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-14 Thread GitBox
benwtrent commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1048856364 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { *

[GitHub] [lucene] msokolov commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-14 Thread GitBox
msokolov commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1048846385 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { *

[GitHub] [lucene] rmuir commented on issue #11963: Improve vector quantization API

2022-12-14 Thread GitBox
rmuir commented on issue #11963: URL: https://github.com/apache/lucene/issues/11963#issuecomment-1351946133 I would tend to expect some organization like this: indexing: * ByteVectorField * FloatVectorField searching: * ByteVectorField.newQuery *

[GitHub] [lucene] benwtrent commented on issue #11963: Improve vector quantization API

2022-12-14 Thread GitBox
benwtrent commented on issue #11963: URL: https://github.com/apache/lucene/issues/11963#issuecomment-1351900658 OK, step one of this is done with the `KnnByteVectorQuery`. I am next approaching a new `KnnByteVectorField` and this will cause some refactoring to `LeafReader` or

[GitHub] [lucene] AlmostFamiliar commented on issue #9231: HyphenationCompoundWordTokenFilter creates overlapping tokens with onlyLongestMatch enabled [LUCENE-8183]

2022-12-14 Thread GitBox
AlmostFamiliar commented on issue #9231: URL: https://github.com/apache/lucene/issues/9231#issuecomment-1351830837 Are there still plans to merge this? I would also be very interested in this feature.

[GitHub] [lucene] benwtrent commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-14 Thread GitBox
benwtrent commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1048764283 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { *

[GitHub] [lucene] benwtrent commented on a diff in pull request #11946: add similarity threshold for hnsw

2022-12-14 Thread GitBox
benwtrent commented on code in PR #11946: URL: https://github.com/apache/lucene/pull/11946#discussion_r1048763428 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) { *

[GitHub] [lucene] jdconrad commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
jdconrad commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351772233 > javacc? I was under the impression this was EOL? If it's still well supported I'm not familiar enough with the code generation to know if this would avoid the pitfalls ANTLR

[GitHub] [lucene] msokolov commented on pull request #11946: add similarity threshold for hnsw

2022-12-14 Thread GitBox
msokolov commented on PR #11946: URL: https://github.com/apache/lucene/pull/11946#issuecomment-1351747886 It seems there are conflicts due to a recent refactor of this query - would you mind merging from main and resolving those please?

[GitHub] [lucene] uschindler commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351743460 > One other possibility would be to hand-roll the expression lexer/parser. This would get rid of the need for any additional dependencies and generated code. From what I can tell the

[GitHub] [lucene] uschindler commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-14 Thread GitBox
uschindler commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1351734347 +1

[GitHub] [lucene] rmuir commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-14 Thread GitBox
rmuir commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1351680479 I looked at what e.g. glibc does here as a fallback out of curiosity, for floats it is very simple (using Dekker algorithm), but requires changing the FP rounding mode, which you can't do

[GitHub] [lucene] jdconrad commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
jdconrad commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351666949 One other possibility would be to hand-roll the expression lexer/parser. This would get rid of the need for any additional dependencies and generated code. From what I can tell the API

[GitHub] [lucene] dweiss commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-14 Thread GitBox
dweiss commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1351660443 I honestly don't know who can use this method without any provided cpuid check... We actually use fma in our code but do so by detecting the performance difference between a naive

[GitHub] [lucene] benwtrent commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-14 Thread GitBox
benwtrent commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1351560532 Holy crap, creating `BigDecimal` and then multiplying & adding is crazy. This is a completely unacceptable fallback calculation for this method. +1 on banning its use in the
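
For context on the comment above, the following is a minimal sketch of why a software fallback for a fused multiply-add is costly: the product and the sum have to be computed exactly and rounded only once at the end. The class and method names are hypothetical, special cases (NaN, infinities, signed zero) are deliberately skipped, and this is not presented as the JDK's actual code, only as an illustration of the `BigDecimal` approach the comment describes.

```java
import java.math.BigDecimal;

/** Hypothetical illustration; not code from the JDK or from Lucene. */
public final class FmaFallbackSketch {

  /**
   * Computes a * b + c with a single rounding by doing the arithmetic exactly.
   * Assumes all three inputs are finite; new BigDecimal(double) throws otherwise.
   */
  public static double fmaViaBigDecimal(double a, double b, double c) {
    // new BigDecimal(double) captures the exact binary value of each operand.
    BigDecimal exactProduct = new BigDecimal(a).multiply(new BigDecimal(b));
    // The only rounding happens here, when converting back to double.
    return exactProduct.add(new BigDecimal(c)).doubleValue();
  }

  public static void main(String[] args) {
    double a = 0.1, b = 10.0, c = -1.0;
    // Naive expression: the product is rounded to a double before the add.
    System.out.println("naive:      " + (a * b + c));
    // Exact product and sum, rounded once.
    System.out.println("BigDecimal: " + fmaViaBigDecimal(a, b, c));
    // Library call, which may or may not map to a hardware instruction.
    System.out.println("Math.fma:   " + Math.fma(a, b, c));
  }
}
```

Each call allocates several arbitrary-precision objects, which is the kind of cost the thread objects to on a hot path.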

[GitHub] [lucene] jpountz merged pull request #12018: Move byte vector queries into new KnnByteVectorQuery (#12004)

2022-12-14 Thread GitBox
jpountz merged PR #12018: URL: https://github.com/apache/lucene/pull/12018

[GitHub] [lucene] benwtrent commented on pull request #12018: Move byte vector queries into new KnnByteVectorQuery (#12004)

2022-12-14 Thread GitBox
benwtrent commented on PR #12018: URL: https://github.com/apache/lucene/pull/12018#issuecomment-1351392230 @jpountz Here is the backport.

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351378582 another way to say it: shading is a terrible, TERRIBLE idea and you only hear about it in java, because java is the only language with developers that are bad enough to consider it.

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351374308 I feel that if we publish a shaded jar we become guilty of "contributing to the delinquency" of terrible supply chain management, by hiding third party dependencies and their versions

[GitHub] [lucene] uschindler commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351368145 > > Can we specify our dependencies in a different way (e.g. exact version) in our maven stuff so this won't happen? e.g. you can do this in python, and specify that you depend on

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351366346 Yes, java showed what a disaster it is around supply chains with the log4j vulnerability. Shading should not even be considered as an option.

[GitHub] [lucene] benwtrent opened a new pull request, #12018: Move byte vector queries into new KnnByteVectorQuery (#12004)

2022-12-14 Thread GitBox
benwtrent opened a new pull request, #12018: URL: https://github.com/apache/lucene/pull/12018 Backport of #12004 This is the first commit of a much larger refactor. The overall goal is to separate the concerns of byte vectors and float vectors. Making their usage and APIs clearer

[GitHub] [lucene] dweiss commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
dweiss commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351355396 > Can we specify our dependencies in a different way (e.g. exact version) in our maven stuff so this won't happen? e.g. you can do this in python, and specify that you depend on antlr ==

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351345609 I also dont understand the issue where ppl think they can modify arbitrary versions of lucene dependencies. Can we specify our dependencies in a different way (e.g. exact version)

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351332547 personally i am against the shading. I think it is a huge antipattern, it hides third-party artifacts completely. think about someone trying to do security or license audit and they have

[GitHub] [lucene] benwtrent commented on pull request #12004: Move byte vector queries into new KnnByteVectorQuery

2022-12-14 Thread GitBox
benwtrent commented on PR #12004: URL: https://github.com/apache/lucene/pull/12004#issuecomment-1351211685 @uschindler 100%! I can get the backport PR done asap

[GitHub] [lucene] jpountz commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

2022-12-14 Thread GitBox
jpountz commented on code in PR #12017: URL: https://github.com/apache/lucene/pull/12017#discussion_r1048383541 ## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ## @@ -470,14 +470,19 @@ private int reqCount(LeafReaderContext context) throws IOException {

[GitHub] [lucene] dweiss commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
dweiss commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1351155914 > With ASM this was hard to do but with Gradle it is quite easy. Just add another configuration with those two dependencies and use the "Gradle Shadow Plugin"

[GitHub] [lucene] LuXugang opened a new pull request, #12017: Aggressive `count` in BooleanWeight

2022-12-14 Thread GitBox
LuXugang opened a new pull request, #12017: URL: https://github.com/apache/lucene/pull/12017 When a BooleanQuery is a pure disjunction and at least one clause matches all docs, we can return the correct `count` even if another clause's `count` is unknown.
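
The idea in this pull request description can be sketched as follows. This is a standalone illustration, not the code from the PR: the class name, the method, and the list-of-counts representation are all hypothetical. The only convention borrowed from Lucene is that `Weight#count` returns -1 when a count cannot be computed cheaply; the all-zero shortcut is an extra assumption added to round out the example.

```java
import java.util.List;

/** Hypothetical sketch; not the actual BooleanWeight implementation. */
public final class DisjunctionCountSketch {

  /** Sentinel for "count not cheaply known", mirroring the -1 convention of Weight#count. */
  public static final int UNKNOWN = -1;

  /**
   * Returns the match count of a pure disjunction (OR of clauses) over a
   * segment with numDocs documents, or UNKNOWN if it cannot be derived from
   * the per-clause counts alone.
   */
  public static int count(List<Integer> clauseCounts, int numDocs) {
    boolean allZero = true;
    for (int clauseCount : clauseCounts) {
      if (clauseCount == numDocs) {
        // One clause matches every doc, so the OR matches every doc,
        // regardless of what the other clauses report.
        return numDocs;
      }
      if (clauseCount != 0) {
        allZero = false;
      }
    }
    // If no clause matches anything, neither does the disjunction.
    // Otherwise the clauses may overlap and the counts cannot simply be summed.
    return allZero ? 0 : UNKNOWN;
  }
}
```

For example, with 10 documents, per-clause counts of `[-1, 10]` yield 10 even though the first clause's count is unknown, while `[3, 4]` stays unknown because the two clauses may match overlapping documents.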

[GitHub] [lucene] fcofdez commented on pull request #11997: Add IntField, LongField, FloatField and DoubleField

2022-12-14 Thread GitBox
fcofdez commented on PR #11997: URL: https://github.com/apache/lucene/pull/11997#issuecomment-1351000741 @jpountz thanks for looking into this, unfortunately I didn't have time to go through the last round of review comments. The change makes sense to me.

[GitHub] [lucene] jpountz commented on pull request #11997: Add IntField, LongField, FloatField and DoubleField

2022-12-14 Thread GitBox
jpountz commented on PR #11997: URL: https://github.com/apache/lucene/pull/11997#issuecomment-1350964750 @fcofdez I hope you don't mind but I checked out your branch to look into the above issue and ended up pushing my changes. These new fields not only support indexing sorted numeric doc

[GitHub] [lucene] uschindler commented on pull request #12004: Move byte vector queries into new KnnByteVectorQuery

2022-12-14 Thread GitBox
uschindler commented on PR #12004: URL: https://github.com/apache/lucene/pull/12004#issuecomment-1350722742 > @benwtrent Would you mind helping with the backport by creating a PR? There are a few conflicts and I could use some help with resolving them. There are also Java 17 switch

[GitHub] [lucene] jpountz commented on pull request #12004: Move byte vector queries into new KnnByteVectorQuery

2022-12-14 Thread GitBox
jpountz commented on PR #12004: URL: https://github.com/apache/lucene/pull/12004#issuecomment-1350702865 @benwtrent Would you mind helping with the backport by creating a PR? There are a few conflicts and I could use some help with resolving them.

[GitHub] [lucene] iverase commented on pull request #12006: Do int compare instead of ArrayUtil#compareUnsigned4 in LatlonPointQueries

2022-12-14 Thread GitBox
iverase commented on PR #12006: URL: https://github.com/apache/lucene/pull/12006#issuecomment-1350678308 > If it gives a good speedup, we could think about applying the same trick to other queries too, if we can do it without making things too ugly. For example LatLonPoint, IntPoint, etc
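
The trick named in this pull request title can be illustrated with a small sketch. It assumes the sortable encoding Lucene uses for int points (flip the sign bit, write big-endian), under which unsigned byte-wise order and signed int order agree. The class and helper names are hypothetical and this is not the code from the PR.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the code from the pull request. */
public final class SortableIntCompareSketch {

  /** Encodes an int so that unsigned byte order equals signed int order. */
  public static byte[] toSortableBytes(int value) {
    int flipped = value ^ 0x80000000; // flip the sign bit
    return new byte[] {
      (byte) (flipped >>> 24), (byte) (flipped >>> 16), (byte) (flipped >>> 8), (byte) flipped
    };
  }

  /** Inverse of toSortableBytes. */
  public static int fromSortableBytes(byte[] bytes) {
    int flipped =
        ((bytes[0] & 0xFF) << 24)
            | ((bytes[1] & 0xFF) << 16)
            | ((bytes[2] & 0xFF) << 8)
            | (bytes[3] & 0xFF);
    return flipped ^ 0x80000000;
  }

  /** Byte-wise comparison: looks at up to four bytes per call. */
  public static int compareAsBytes(byte[] a, byte[] b) {
    return Arrays.compareUnsigned(a, b);
  }

  /** Decode once, then use a single primitive int comparison. */
  public static int compareAsInts(byte[] a, byte[] b) {
    return Integer.compare(fromSortableBytes(a), fromSortableBytes(b));
  }
}
```

In a query that tests many points against fixed bounds, the bounds could be decoded once up front so that each check is a plain int comparison; whether that is how the PR applies it is an assumption here.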

[GitHub] [lucene] uschindler commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-14 Thread GitBox
uschindler commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1350666129 Hi, I agree with all Robert says. I would also like to make another suggestion (that could also be applied when this has been fixed). To me it looks like a bad decision of antlr,

[GitHub] [lucene] jpountz merged pull request #12004: Move byte vector queries into new KnnByteVectorQuery

2022-12-14 Thread GitBox
jpountz merged PR #12004: URL: https://github.com/apache/lucene/pull/12004

[GitHub] [lucene] gf2121 commented on pull request #12006: Do int compare instead of ArrayUtil#compareUnsigned4 in LatlonPointQueries

2022-12-14 Thread GitBox
gf2121 commented on PR #12006: URL: https://github.com/apache/lucene/pull/12006#issuecomment-1350639546 This brings a small jump for Distance Filter LatLonPoint task. http://people.apache.org/~mikemccand/geobench.html

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1350455204 Simple package-private `static` method to turn on the pickiness should do it. I don't want to see 100 new constructors/abstractions added with booleans, just because antlr made a

[GitHub] [lucene] jdconrad commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
jdconrad commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1350071474 I agree with @rmuir that having an ambiguity check for tests similar to Painless would be great for expressions. I'm a bit surprised this change didn't require much additionally to the

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349804323 Thanks for encouraging @rmuir ! I will be working on the matter this week and share my findings, thank you!

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349778653 In general, sorry if i discouraged before, it is really just a frustrating situation If you get stuck, just leave the PR open. I will try to dig into this too, I have been through

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349610775 cc: @jdconrad who might remember a lot more about this than me

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349606909 here is coverage report using the current antlr. I guess i dont know why so much is missing here:

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349604029 As far as inspecting coverage, I suspect it is pretty good. But there are instructions in https://github.com/apache/lucene/blob/main/help/tests.txt on how to generate reports.

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349602516 @reta I remember doing this adds overhead, that's why it is a boolean there. so it really just needs to be something we do from tests. for example it could be a package-private setter or

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349601503 > Looks like this in painless:

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349599904 Looks like this in painless:

[GitHub] [lucene] rmuir commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
rmuir commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349597878 the only way i know to prevent the traps is to do like painless and "enable picky mode" which fails test instead of doing slow things. and to have 100% test coverage of grammar!

[GitHub] [lucene] reta commented on pull request #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
reta commented on PR #12016: URL: https://github.com/apache/lucene/pull/12016#issuecomment-1349213966 @rmuir @uschindler what kind of (performance? jmh?) testing would help to discard / prove that moving to 4.11.x makes / does not make sense. You have definitely seen traps in the past, I

[GitHub] [lucene] reta opened a new pull request, #12016: Upgrade ANTLR to version 4.11.1

2022-12-13 Thread GitBox
reta opened a new pull request, #12016: URL: https://github.com/apache/lucene/pull/12016 Signed-off-by: Andriy Redko ### Description Apache Lucene is using a quite old version of ANTLR, 4.5.1-1. By itself, it is not a showstopper, but the more profound issue is that some ANTLR

[GitHub] [lucene] rmuir closed issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-13 Thread GitBox
rmuir closed issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems URL: https://github.com/apache/lucene/issues/12012

[GitHub] [lucene] rmuir commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-13 Thread GitBox
rmuir commented on PR #12013: URL: https://github.com/apache/lucene/pull/12013#issuecomment-1348842317 Yeah, I'm highly suspicious of this cache. I dug into this and found that previously stored fields (!) were used for this lookup. So no surprise there was a cache around it, since this is

[GitHub] [lucene] rmuir commented on pull request #12015: Run spotless after javac (#12012)

2022-12-13 Thread GitBox
rmuir commented on PR #12015: URL: https://github.com/apache/lucene/pull/12015#issuecomment-1348790082 Thanks again, it really helps make the backporting process less painful for me.

[GitHub] [lucene] dweiss commented on pull request #12015: Run spotless after javac (#12012)

2022-12-13 Thread GitBox
dweiss commented on PR #12015: URL: https://github.com/apache/lucene/pull/12015#issuecomment-1348772038 Sorry - I see it now. I've pushed to my local fork at gh. Didn't notice it. I've just pushed to 9x on apache as well.

[GitHub] [lucene] dweiss commented on pull request #12015: Run spotless after javac (#12012)

2022-12-13 Thread GitBox
dweiss commented on PR #12015: URL: https://github.com/apache/lucene/pull/12015#issuecomment-1348767456 You have?! Well, that's strange - I've pushed it to 9x... Can you reproduce it with the head?

[GitHub] [lucene] rmuir commented on pull request #12015: Run spotless after javac (#12012)

2022-12-13 Thread GitBox
rmuir commented on PR #12015: URL: https://github.com/apache/lucene/pull/12015#issuecomment-1348669494 @dweiss do you mind pushing to 9.x, otherwise i will do it for you. I just hit this again in 9.x :)

[GitHub] [lucene] rmuir merged pull request #11998: Migrate away from per-segment-per-threadlocals on SegmentReader

2022-12-13 Thread GitBox
rmuir merged PR #11998: URL: https://github.com/apache/lucene/pull/11998

[GitHub] [lucene] rmuir commented on pull request #12015: Run spotless after javac (#12012)

2022-12-13 Thread GitBox
rmuir commented on PR #12015: URL: https://github.com/apache/lucene/pull/12015#issuecomment-1348488213 thank you @dweiss

[GitHub] [lucene] iverase merged pull request #11988: Fix algorithm that chooses the bridge between a polygon and a hole

2022-12-13 Thread GitBox
iverase merged PR #11988: URL: https://github.com/apache/lucene/pull/11988

[GitHub] [lucene] iverase closed issue #11986: Polygons failing to tessellate

2022-12-13 Thread GitBox
iverase closed issue #11986: Polygons failing to tessellate URL: https://github.com/apache/lucene/issues/11986

[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox
dweiss commented on issue #12012: URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347874082 I've applied the patch to 9x and main, btw.

[GitHub] [lucene] dweiss merged pull request #12015: Run spotless after javac (#12012)

2022-12-12 Thread GitBox
dweiss merged PR #12015: URL: https://github.com/apache/lucene/pull/12015

[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox
dweiss commented on issue #12012: URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347872711 I filed a PR #12015 . Indeed the 'core' formatting task seems to be named 'spotlessJava' (spotless + format name), the check and apply are just attached to it in a fancy manner via

[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox
dweiss commented on issue #12012: URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347844975 > I think we are not slowing down the spotlessJava task which is the one actually failing for me? Eh. I don't know what the relationships between those spotless tasks are.

[GitHub] [lucene] rmuir merged pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox
rmuir merged PR #12010: URL: https://github.com/apache/lucene/pull/12010

[GitHub] [lucene] rmuir closed issue #12009: Find (and fix) places where we treat a long as a double without an explicit cast

2022-12-12 Thread GitBox
rmuir closed issue #12009: Find (and fix) places where we treat a long as a double without an explicit cast URL: https://github.com/apache/lucene/issues/12009

[GitHub] [lucene] rmuir commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox
rmuir commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1347620183 Yeah, I think if the fallback java code was 2x, 4x, or 8x slower (like you would expect from these intrinsics), we wouldn't be having this conversation :)

[GitHub] [lucene] gsmiller commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox
gsmiller commented on PR #12014: URL: https://github.com/apache/lucene/pull/12014#issuecomment-1347617300 +1, seems reasonable to me. We can always remove this ban in the future if there's a good reason, but seems reasonable to put this in place to prevent it sneaking in for now.

[GitHub] [lucene] rmuir opened a new pull request, #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox
rmuir opened a new pull request, #12014: URL: https://github.com/apache/lucene/pull/12014 When FMA is not supported by the hardware, these methods fall back to `BigDecimal` usage [1] which causes them to be 2500x slower [2]. While most hardware in the last 10 years may have the
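
For readers unfamiliar with FMA, the following is a minimal sketch (not code from the PR) of what `Math.fma` computes and why a software fallback is expensive. `Math.fma(a, b, c)` returns `a * b + c` rounded once, whereas the plain expression rounds the product and then the sum. The class name `FmaSketch` and the helper `exactThenRound` are hypothetical; the `BigDecimal` helper only illustrates the "compute exactly, then round once" idea for finite inputs and is not the JDK's actual fallback code.

```java
import java.math.BigDecimal;

public class FmaSketch {
    // Unfused: the product is rounded to double, then the sum is rounded again.
    static double unfused(double a, double b, double c) {
        return a * b + c;
    }

    // Fused: a * b + c is evaluated as if exactly, then rounded a single time.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    // Hypothetical illustration of a software path for finite inputs:
    // exact arbitrary-precision arithmetic followed by one rounding to double.
    // Allocating BigDecimal objects per call is why such a path is slow.
    static double exactThenRound(double a, double b, double c) {
        return new BigDecimal(a)
            .multiply(new BigDecimal(b))
            .add(new BigDecimal(c))
            .doubleValue();
    }

    public static void main(String[] args) {
        // a * b is exactly 1 - 2^-60, which is not representable as a double.
        double a = 1.0 + 0x1.0p-30;
        double b = 1.0 - 0x1.0p-30;
        double c = -1.0;

        System.out.println("unfused        = " + unfused(a, b, c));
        System.out.println("fused          = " + fused(a, b, c));
        System.out.println("exactThenRound = " + exactThenRound(a, b, c));
    }
}
```

With these inputs the unfused form loses the low-order term when the product is rounded, while the fused and exact-then-round forms keep it. On hardware with an FMA instruction the JIT can compile `Math.fma` to that instruction; the slowdown described in the PR concerns the case where it cannot.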

[GitHub] [lucene] uschindler commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox
uschindler commented on PR #12013: URL: https://github.com/apache/lucene/pull/12013#issuecomment-1347501489 Please remove the cache. Totally useless. It hurts more because the escape analysis can't work. When tiered compilation stepped in, this is just a useless extra My rule:

[GitHub] [lucene] rmuir commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox
rmuir commented on PR #12013: URL: https://github.com/apache/lucene/pull/12013#issuecomment-1347439475 Also (summoning my inner @uschindler), curious what the perf impact is if we simply remove these caches completely. Maybe they are not relevant to the performance anymore due to faceting

[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox
rmuir commented on issue #12012: URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347423692 I ran this patch on the mac m1 and also got `BUILD SUCCESSFUL in 3m 33s`.
