[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024929#comment-17024929 ] Xin-Chun Zhang commented on LUCENE-9004: Is it possible to merge LUCENE-9136 with this issue? > Approximate nearest vector search > - > > Key: LUCENE-9004 > URL: https://issues.apache.org/jira/browse/LUCENE-9004 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael Sokolov >Priority: Major > Attachments: hnsw_layered_graph.png > > Time Spent: 3h 10m > Remaining Estimate: 0h > > "Semantic" search based on machine-learned vector "embeddings" representing > terms, queries and documents is becoming a must-have feature for a modern > search engine. SOLR-12890 is exploring various approaches to this, including > providing vector-based scoring functions. This is a spinoff issue from that. > The idea here is to explore approximate nearest-neighbor search. Researchers > have found that an approach based on navigating a graph that partially encodes the > nearest neighbor relation at multiple scales can provide accuracy > 95% (as > compared to exact nearest neighbor calculations) at a reasonable cost. This > issue will explore implementing HNSW (hierarchical navigable small-world) > graphs for the purpose of approximate nearest vector search (often referred > to as KNN or k-nearest-neighbor search). > At a high level the way this algorithm works is this. First assume you have a > graph that has a partial encoding of the nearest neighbor relation, with some > short and some long-distance links. If this graph is built in the right way > (has the hierarchical navigable small world property), then you can > efficiently traverse it to find nearest neighbors (approximately) in log N > time where N is the number of nodes in the graph. I believe this idea was > pioneered in [1]. 
The great insight in that paper is that if you use the > graph search algorithm to find the K nearest neighbors of a new document > while indexing, and then link those neighbors (undirectedly, i.e. both ways) to > the new document, then the graph that emerges will have the desired > properties. > The implementation I propose for Lucene is as follows. We need two new data > structures to encode the vectors and the graph. We can encode vectors using a > light wrapper around {{BinaryDocValues}} (we also want to encode the vector > dimension and have efficient conversion from bytes to floats). For the graph > we can use {{SortedNumericDocValues}} where the values we encode are the > docids of the related documents. Encoding the interdocument relations using > docids directly will make it relatively fast to traverse the graph since we > won't need to look up through an id-field indirection. This choice limits us > to building a graph-per-segment since it would be impractical to maintain a > global graph for the whole index in the face of segment merges. However > graph-per-segment is very natural at search time - we can traverse each > segment's graph independently and merge results as we do today for term-based > search. > At index time, however, merging graphs is somewhat challenging. While > indexing we build a graph incrementally, performing searches to construct > links among neighbors. When merging segments we must construct a new graph > containing elements of all the merged segments. Ideally we would somehow > preserve the work done when building the initial graphs, but at least as a > start I'd propose we construct a new graph from scratch when merging. The > process is going to be limited, at least initially, to graphs that can fit > in RAM since we require random access to the entire graph while constructing > it: In order to add links bidirectionally we must continually update existing > documents. 
> I think we want to express this API to users as a single joint > {{KnnGraphField}} abstraction that joins together the vectors and the graph > as a single joint field type. Mostly it just looks like a vector-valued > field, but has this graph attached to it. > I'll push a branch with my POC and would love to hear comments. It has many > nocommits, basic design is not really set, there is no Query implementation > and no integration with IndexSearcher, but it does work by some measure using > a standalone test class. I've tested with uniform random vectors and on my > laptop indexed 10K documents in around 10 seconds and searched them at 95% > recall (compared with exact nearest-neighbor baseline) at around 250 QPS. I > haven't made any attempt to use multithreaded search for this, but it is > amenable to per-segment concurrency. > [1] >
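The insert-by-search idea described in the issue above can be illustrated with a minimal, single-layer sketch. This is not the POC from the issue and not Lucene code: the class name `KnnGraphSketch`, the `maxLinks` parameter, the fixed entry point (node 0) and the in-memory lists are all assumptions made for illustration, and the hierarchical layers of HNSW are left out entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Single-layer navigable-graph sketch (hypothetical; no HNSW hierarchy). */
class KnnGraphSketch {
  private final List<float[]> vectors = new ArrayList<>();
  private final List<List<Integer>> neighbors = new ArrayList<>();
  private final int maxLinks; // hypothetical: links created for each new node

  KnnGraphSketch(int maxLinks) {
    this.maxLinks = maxLinks;
  }

  static float squaredDistance(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Best-first search starting at node 0; returns up to k ids, closest first. */
  List<Integer> search(float[] query, int k) {
    if (vectors.isEmpty()) {
      return new ArrayList<>();
    }
    Comparator<Integer> byDistance =
        Comparator.comparingDouble(n -> squaredDistance(query, vectors.get(n)));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance); // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed()); // farthest first
    Set<Integer> visited = new HashSet<>();
    candidates.add(0);
    results.add(0);
    visited.add(0);
    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop once the best remaining candidate is worse than the worst kept result.
      if (results.size() >= k
          && squaredDistance(query, vectors.get(current))
              > squaredDistance(query, vectors.get(results.peek()))) {
        break;
      }
      for (int n : neighbors.get(current)) {
        if (visited.add(n)) {
          candidates.add(n);
          results.add(n);
          if (results.size() > k) {
            results.poll(); // drop the farthest result
          }
        }
      }
    }
    List<Integer> ordered = new ArrayList<>(results);
    ordered.sort(byDistance);
    return ordered;
  }

  /** Searches for the new vector's neighbors, then links them in both directions. */
  int add(float[] vector) {
    List<Integer> nearest = search(vector, maxLinks);
    int id = vectors.size();
    vectors.add(vector);
    neighbors.add(new ArrayList<>(nearest));
    for (int n : nearest) {
      neighbors.get(n).add(id);
    }
    return id;
  }

  public static void main(String[] args) {
    KnnGraphSketch graph = new KnnGraphSketch(2);
    float[][] points = {{0, 0}, {1, 0}, {0, 1}, {5, 5}, {6, 5}};
    for (float[] p : points) {
      graph.add(p);
    }
    System.out.println(graph.search(new float[] {5.2f, 5.1f}, 1));
  }
}
```

The sketch keeps everything in RAM and updates existing nodes' neighbor lists on every insert, which mirrors the random-access requirement the issue mentions for graph construction.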
[jira] [Created] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020
Guruprasad K K created SOLR-14224: - Summary: Not able to build solr 6.6.2 from source after January 15, 2020 Key: SOLR-14224 URL: https://issues.apache.org/jira/browse/SOLR-14224 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 6.6.2 Reporter: Guruprasad K K Since Jan 15th, Maven allows only HTTPS connections to the repo, but Solr 6.6.2 uses HTTP connections, so our builds are failing. It looks like the latest version of Solr has a fix for this in common_build.xml and other places, where it uses HTTPS connections to Maven. What is the workaround if we can't upgrade Solr and still want to use 6.6.2? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12325) introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet
[ https://issues.apache.org/jira/browse/SOLR-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024847#comment-17024847 ] Munendra S N commented on SOLR-12325: - Apologies Mikhail, I was caught up in something else. +1 to the suggested approach > introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet > -- > > Key: SOLR-12325 > URL: https://issues.apache.org/jira/browse/SOLR-12325 > Project: Solr > Issue Type: New Feature > Components: Facet Module >Reporter: Mikhail Khludnev >Assignee: Mikhail Khludnev >Priority: Major > Fix For: 8.5 > > Attachments: SOLR-12325.patch, SOLR-12325.patch, SOLR-12325.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > It might be a faster twin for {{uniqueBlock(\_root_)}}. Please utilise the built-in > query parsing method, don't invent your own.
[GitHub] [lucene-solr] ErickErickson opened a new pull request #1218: Javacc erick
ErickErickson opened a new pull request #1218: Javacc erick URL: https://github.com/apache/lucene-solr/pull/1218 Here are the build changes to get javacc to run, modeled on the jflex changes; many thanks for the model. Only two files changed here ;) If the structure is OK, I'll fill in the "doLast" blocks with the cleanup code and maybe be able to extract some common parts. NOTE: you can't even compile the result of running this because I wanted the changes to the build structure to be clear first, so I didn't include the cleanup tasks yet. So if this structure is OK, should I merge it into master before or after the rest of the cleanup? My assumption is after. I want to try to get all the warnings etc. out of the generated code in the next phase to reduce the temptation for people to make hand-edits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader
madrob commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader URL: https://github.com/apache/lucene-solr/pull/1191#issuecomment-579026541 This looks pretty nice and was something I had been thinking about as well. Skimmed the first handful of commits and things made sense, I'll try to take a deeper look at this tomorrow or Wednesday!
[GitHub] [lucene-solr] madrob commented on issue #1205: SOLR-14206: Annotate HttpSolrCall as thread-safe
madrob commented on issue #1205: SOLR-14206: Annotate HttpSolrCall as thread-safe URL: https://github.com/apache/lucene-solr/pull/1205#issuecomment-579025145 I think you already did this in #1203
[GitHub] [lucene-solr] madrob opened a new pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests
madrob opened a new pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests URL: https://github.com/apache/lucene-solr/pull/1217 Use a non-blocking implementation of SecureRandom for generating RSA Keys.
[jira] [Commented] (SOLR-14223) PublicKeyHandler consumes a lot of entropy during tests
[ https://issues.apache.org/jira/browse/SOLR-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024777#comment-17024777 ] Mike Drob commented on SOLR-14223: -- cc: [~noble.paul] [~varun] - Interested in your thoughts since you were active on the original issue. > PublicKeyHandler consumes a lot of entropy during tests > --- > > Key: SOLR-14223 > URL: https://issues.apache.org/jira/browse/SOLR-14223 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 7.4, 8.0 >Reporter: Mike Drob >Priority: Major > > After the changes in SOLR-12354 to eagerly create a {{PublicKeyHandler}} for > the CoreContainer, the creation of the underlying {{RSAKeyPair}} uses > {{SecureRandom}} to generate primes. This eats up a lot of system entropy and > can slow down tests significantly (I observed it adding 10s to an individual > test). > Similar to what we do for SSL config for tests, we can swap in a non blocking > implementation of SecureRandom for the key pair generation to allow multiple > tests to run better in parallel. Primality testing with BigInteger is also > slow, so I'm not sure how much total speedup we can get here, maybe it's > worth checking if there are faster implementations out there in other > libraries. > In production cases, this also blocks creation of all cores. We should only > create the Handler if necessary, i.e. if the existing authn/z tell us that > they won't support internode requests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14223) PublicKeyHandler consumes a lot of entropy during tests
Mike Drob created SOLR-14223: Summary: PublicKeyHandler consumes a lot of entropy during tests Key: SOLR-14223 URL: https://issues.apache.org/jira/browse/SOLR-14223 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 8.0, 7.4 Reporter: Mike Drob After the changes in SOLR-12354 to eagerly create a {{PublicKeyHandler}} for the CoreContainer, the creation of the underlying {{RSAKeyPair}} uses {{SecureRandom}} to generate primes. This eats up a lot of system entropy and can slow down tests significantly (I observed it adding 10s to an individual test). Similar to what we do for SSL config for tests, we can swap in a non blocking implementation of SecureRandom for the key pair generation to allow multiple tests to run better in parallel. Primality testing with BigInteger is also slow, so I'm not sure how much total speedup we can get here, maybe it's worth checking if there are faster implementations out there in other libraries. In production cases, this also blocks creation of all cores. We should only create the Handler if necessary, i.e. if the existing authn/z tell us that they won't support internode requests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
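The proposal in SOLR-14223, supplying a non-blocking `SecureRandom` to RSA key-pair generation, might look roughly like the sketch below. It uses only standard JDK classes; the class and method names are hypothetical and this is not the code from the pull request. It assumes the `NativePRNGNonBlocking` algorithm, which the JDK offers on Unix-like platforms, and falls back to the platform default elsewhere.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

/** Hypothetical sketch: RSA key generation with a non-blocking entropy source. */
class NonBlockingKeyPairSketch {

  /** Prefers the non-blocking native PRNG; falls back to the default SecureRandom. */
  static SecureRandom nonBlockingRandom() {
    try {
      return SecureRandom.getInstance("NativePRNGNonBlocking");
    } catch (NoSuchAlgorithmException e) {
      // Not available on this platform (an assumption about where this runs).
      return new SecureRandom();
    }
  }

  /** Generates an RSA key pair using the supplied-by-us random source. */
  static KeyPair generateRsaKeyPair(int keySizeBits) {
    try {
      KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
      generator.initialize(keySizeBits, nonBlockingRandom());
      return generator.generateKeyPair();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("RSA is not available in this JDK", e);
    }
  }

  public static void main(String[] args) {
    KeyPair pair = generateRsaKeyPair(2048);
    System.out.println(pair.getPublic().getAlgorithm());
  }
}
```

As the issue notes, primality testing itself remains slow, so swapping the random source addresses only the entropy-blocking part of the cost.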
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse
tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371507235 ## File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java ## @@ -26,7 +26,9 @@ import java.util.Objects; public class OverseerSolrResponse extends SolrResponse { - + + private static final long serialVersionUID = 4721653044098960880L; Review comment: I agree, everything that uses java serialization should be setting a serialVersionUID, my concern is that it may be too late now. I think you discovered a bug with your test (thanks!), but I believe it's too late to add a serialVersionUID now because for some users it could mean exactly the same as if we'd had one before and now we are changing it. Hopefully this won't be an issue once the serialization is in javabin and the Java serialization part is removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
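The concern in the review comment above is easier to see with a small sketch. When a `Serializable` class declares no `serialVersionUID`, the JVM derives one from the class's shape; pinning a value later changes the UID that previously serialized streams expect unless the pinned value happens to equal the derived one. The class names and the value `1L` below are illustrative assumptions, not taken from `OverseerSolrResponse`.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Hypothetical sketch contrasting a derived and a pinned serialVersionUID. */
class SerialVersionUidSketch {

  /** No explicit UID: the JVM computes one from the class name, fields and methods. */
  static class ImplicitResponse implements Serializable {
    int status;
  }

  /** Explicit UID: stays the same even if fields or methods change later. */
  static class PinnedResponse implements Serializable {
    private static final long serialVersionUID = 1L; // illustrative value
    int status;
  }

  /** Returns the UID Java serialization uses for the given class. */
  static long uidOf(Class<?> type) {
    return ObjectStreamClass.lookup(type).getSerialVersionUID();
  }

  public static void main(String[] args) {
    System.out.println("derived UID: " + uidOf(ImplicitResponse.class));
    System.out.println("pinned UID:  " + uidOf(PinnedResponse.class));
  }
}
```

Running `uidOf` against the class as it existed before the change is one way to find the value an explicit declaration would have to match for old streams to keep deserializing.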
[jira] [Updated] (SOLR-14222) CloudSolrClient converts (update) 403 error to 500 error
[ https://issues.apache.org/jira/browse/SOLR-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-14222: -- Attachment: SOLR-14222_test.patch Status: Open (was: Open) attaching SOLR-14222_test.patch which shows the problem. > CloudSolrClient converts (update) 403 error to 500 error > - > > Key: SOLR-14222 > URL: https://issues.apache.org/jira/browse/SOLR-14222 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud, SolrJ >Reporter: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14222_test.patch > > > Something about the way CloudSolrClient pulls UpdateRequests apart to send > docs direct to leaders also seems to cause it to report status code "500" > Server Errors when 403 authorization errors are thrown by the server.
[jira] [Created] (SOLR-14222) CloudSolrClient converts (update) 403 error to 500 error
Chris M. Hostetter created SOLR-14222: - Summary: CloudSolrClient converts (update) 403 error to 500 error Key: SOLR-14222 URL: https://issues.apache.org/jira/browse/SOLR-14222 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud, SolrJ Reporter: Chris M. Hostetter Something about the way CloudSolrClient pulls UpdateRequests apart to send docs direct to leaders also seems to cause it to report status code "500" Server Errors when 403 authorization errors are thrown by the server.
[GitHub] [lucene-solr] dnhatn commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed
dnhatn commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed URL: https://github.com/apache/lucene-solr/pull/1215#issuecomment-578960010 @mikemccand @jpountz Would you mind taking a look? Thank you.
[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024672#comment-17024672 ] Chris M. Hostetter commented on SOLR-14040: --- FWIW: TestBulkSchemaConcurrent was failing a lot on master as well after your original commit, but the master failures seem to have dropped off after your Jan 22 commits while the 8x failures continued. I have not dug into the logs from the failures to compare 8x / master (or 8x before/after your "restore legacy Collection auto-creation" commits) to see if the *nature* of the failures is different, but you might want to before they get purged (my report system only keeps the past 7 days' worth of logs due to disk constraints) > solr.xml shareSchema does not work in SolrCloud > --- > > Key: SOLR-14040 > URL: https://issues.apache.org/jira/browse/SOLR-14040 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: 8.5 > > Time Spent: 0.5h > Remaining Estimate: 0h > > solr.xml has a shareSchema boolean option that can be toggled from the > default of false to true in order to share IndexSchema objects within the > Solr node. This is silently ignored in SolrCloud mode. The pertinent code > is {{org.apache.solr.core.ConfigSetService#createConfigSetService}} which > creates a CloudConfigSetService that is not related to the SchemaCaching > class. This may not be a big deal in SolrCloud which tends not to deal well > with many cores per node but I'm working on changing that.
[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024668#comment-17024668 ] David Smiley commented on SOLR-14040: - It appears that problem was recently fixed in SOLR-14211. Notice that fix was in master for awhile and only 13 hours ago was it back-ported to 8x.
[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024652#comment-17024652 ] David Smiley commented on SOLR-14040: - I have not; I didn't make the connection. Hmmm, it's interesting that only 8x has failed and not master. I checked that the changes happened on both branches for both commits. Hmmm, looking more...
[jira] [Commented] (SOLR-12325) introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet
[ https://issues.apache.org/jira/browse/SOLR-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024637#comment-17024637 ] Mikhail Khludnev commented on SOLR-12325: - No concerns so far. I'm going to revamp the syntax as follows:
||Syntax||Behavior||
|uniqueBlock(field)|as-is field logic|
|uniqueBlock($fieldparam)..=field|as-is field reference logic|
|uniqueBlock(\{!v=type_s:pipe\})|new query logic|
|uniqueBlock(\{!v=$qref\})...=K:amber some|new query referencing logic|
Looking forward to your opinion.
[jira] [Commented] (SOLR-11207) Add OWASP dependency checker to detect security vulnerabilities in third party libraries
[ https://issues.apache.org/jira/browse/SOLR-11207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024624#comment-17024624 ] ASF subversion and git services commented on SOLR-11207: Commit 53f7b394e49e9b6d5f3e3aa6980078421d87688e in lucene-solr's branch refs/heads/master from Jan Høydahl [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=53f7b39 ] SOLR-11207: Mute warnings for owasp false positives > Add OWASP dependency checker to detect security vulnerabilities in third > party libraries > > > Key: SOLR-11207 > URL: https://issues.apache.org/jira/browse/SOLR-11207 > Project: Solr > Issue Type: Improvement > Components: Build >Affects Versions: 6.0 >Reporter: Hrishikesh Gadre >Assignee: Jan Høydahl >Priority: Major > Fix For: master (9.0) > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Lucene/Solr project depends on number of third party libraries. Some of those > libraries contain security vulnerabilities. Upgrading to versions of those > libraries that have fixes for those vulnerabilities is a simple, critical > step we can take to improve the security of the system. But for that we need > a tool which can scan the Lucene/Solr dependencies and look up the security > database for known vulnerabilities. > I found that [OWASP > dependency-checker|https://jeremylong.github.io/DependencyCheck/dependency-check-ant/] > can be used for this purpose. It provides a ant task which we can include in > the Lucene/Solr build. We also need to figure out how (and when) to invoke > this dependency-checker. But this can be figured out once we complete the > first step of integrating this tool with the Lucene/Solr build system. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024612#comment-17024612 ] ASF subversion and git services commented on LUCENE-9184: - Commit ff635cf701f086241117c5dab925aa5ef825ce51 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ff635cf ] LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with -Pvalidation.git.failOnModified=false (or place this in gradle.properties to make it permanent). > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task does have 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit. If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enables > this mode > - Developer mode (default), that does not care about "modified" files.
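The two modes described in the issue can be summarised as a small decision function. The sketch below is only an illustration of that logic, not the Gradle or Ant task: the class, enum and method names are hypothetical, and the `failOnModified` flag merely mirrors the name of the `validation.git.failOnModified` property from the commit message. It assumes that staged states never fail the check, that untracked and missing files always do, and that modified files fail only in strict mode.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the strict and developer working-copy check modes. */
class WorkingCopyCheckSketch {

  /** File states as named in the issue text. */
  enum Status { ADDED, CHANGED, MODIFIED, UNTRACKED, MISSING }

  /**
   * Returns true when precommit may proceed.
   * Assumption: ADDED and CHANGED (staged) are ignored, UNTRACKED and MISSING
   * always fail, and MODIFIED fails only when failOnModified is true.
   */
  static boolean passes(Set<Status> found, boolean failOnModified) {
    if (found.contains(Status.UNTRACKED) || found.contains(Status.MISSING)) {
      return false;
    }
    return !(failOnModified && found.contains(Status.MODIFIED));
  }

  public static void main(String[] args) {
    // A GUI-client workflow: files are modified but never explicitly staged.
    Set<Status> guiWorkflow = EnumSet.of(Status.MODIFIED);
    System.out.println("strict mode passes: " + passes(guiWorkflow, true));
    System.out.println("developer mode passes: " + passes(guiWorkflow, false));
  }
}
```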
[jira] [Resolved] (LUCENE-9183) Allow optional skipping of git status check in precommit
[ https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9183. - > Allow optional skipping of git status check in precommit > > > Key: LUCENE-9183 > URL: https://issues.apache.org/jira/browse/LUCENE-9183 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > > Had an offline conversation with Uwe about it. For people who don't use git > staging > (only IDEs) the precommit may be problematic as it currently fails on locally > changed > files. > I'll add an option to skip it, if the developer so desires. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9183) Allow optional skipping of git status check in precommit
[ https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024613#comment-17024613 ] ASF subversion and git services commented on LUCENE-9183: - Commit ff635cf701f086241117c5dab925aa5ef825ce51 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ff635cf ] LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with -Pvalidation.git.failOnModified=false (or place this in gradle.properties to make it permanent).
[jira] [Resolved] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9184. - Resolution: Fixed > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way how files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other stati are ignored and its > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitely "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task does have 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit. 
If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-9184: -- Description: Depending on the type of Git Client you are using (I hate the command line, I use Eclipse Git or TortoiseGit -- my preference), the way files are committed differs. Normally with git command line you would first stage all files and then commit them. If you stage them and then run precommit, it works fine, as the "changed" and "added" and other statuses are ignored and it's still confirmed as "clean". After the pre-COMMIT task you finally commit. But Git GUIs don't have the concept of staging. You can (similar to Subversion) add files and delete files, but when you modify a file you cannot explicitly "stage" the change. What you do is to open the commit GUI, put checkboxes on all files you want to commit and then the GUI triggers a stage and commit directly after each other. In this workflow, the precommit check of course complains about "modified" files. This is the reason why the Ant task has 2 modes: - The strict mode which forbids any change in the working copy, so it must be 100% clean. By default, Ant only runs this if the property "is.jenkins.build" is enabled. The reason for that is to detect any change in the working copy caused by running the Jenkins CI (like temporary files munging around). - The default "committer/developer" mode: In this case the working copy check only complains about "untracked" or "missing" files. So a committer who changes some files can still pass precommit. If he adds a new file, he has to add it to the index, so it's not untracked. But generally normal modifications of working copy are allowed. Please add this back. There was a reason why I set up the check-working-copy Ant task like it was.
If others agree, I'd like to change the task so it has two modes: - Full clean mode (for CI builds), enabled only if it's a CI build -- we should maybe add some tasks like "jenkins-hourly" on root project that enable this mode - Developer mode (default), that does not care about "modified" files. was: Depending on the type of Git Client you are using (I hate the command line, I use Eclipse Git or TortoiseGit -- my preference), the way files are committed differs. Normally with git command line you would first stage all files and then commit them. If you stage them and then run precommit, it works fine, as the "changed" and "added" and other statuses are ignored and it's still confirmed as "clean". After the pre-COMMIT task you finally commit. But Git GUIs don't have the concept of staging. You can (similar to Subversion) add files and delete files, but when you modify a file you cannot explicitly "stage" the change. What you do is to open the commit GUI, put checkboxes on all files you want to commit and then the GUI triggers a stage and commit directly after each other. In this workflow, the precommit check of course complains about "modified" files. This is the reason why the Ant task has 2 modes: - The strict mode which forbids any change in the working copy, so it must be 100% clean. By default, Ant only runs this if the property "is.jenkins.build" is enabled. The reason for that is to detect any change in the working copy caused by running the Jenkins CI (like temporary files munging around). - The default "committer/developer" mode: In this case the working copy check only complains about "untracked" or "missing" files. So a committer who changes some files can still pass precommit. If he adds a new file, he has to add it to the index, so it's not untracked. But generally normal modifications of working copy are allowed. Please add this back. There was a reason why I set up the check-working-copy Ant tak like it was.
If others agree, I'd like to change the task so it has two modes: - Full clean mode (for CI builds), enabled only if it's a CI build -- we should maybe add some tasks like "jenkins-hourly" on root project that enable this mode - Developer mode (default), that does not care about "modified" files. > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and
[jira] [Reopened] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reopened LUCENE-9184: --- Assignee: Dawid Weiss Lol, we both closed the linked issues. > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task has 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files.
So a committer who > changes some files can still pass precommit. If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files.
[jira] [Updated] (LUCENE-9183) Allow optional skipping of git status check in precommit
[ https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9183: Status: Reopened (was: Closed) > Allow optional skipping of git status check in precommit > > > Key: LUCENE-9183 > URL: https://issues.apache.org/jira/browse/LUCENE-9183 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > > Had an offline conversation with Uwe about it. For people who don't use git > staging > (only IDEs) the precommit may be problematic as it currently fails on locally > changed > files. > I'll add an option to skip it, if the developer so desires.
[jira] [Resolved] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9184. - Assignee: (was: Dawid Weiss) Resolution: Duplicate Duplicate of LUCENE-9183 > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task has 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit.
If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files.
[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-9184: - Assignee: Dawid Weiss (was: Uwe Schindler) > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task has 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit.
If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files.
[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-9184: - Assignee: Dawid Weiss > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task has 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit.
If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files.
[jira] [Resolved] (LUCENE-9183) Allow optional skipping of git status check in precommit
[ https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-9183. --- Resolution: Duplicate > Allow optional skipping of git status check in precommit > > > Key: LUCENE-9183 > URL: https://issues.apache.org/jira/browse/LUCENE-9183 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > > Had an offline conversation with Uwe about it. For people who don't use git > staging > (only IDEs) the precommit may be problematic as it currently fails on locally > changed > files. > I'll add an option to skip it, if the developer so desires.
[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
[ https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-9184: - Assignee: Uwe Schindler (was: Dawid Weiss) > Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) > - > > Key: LUCENE-9184 > URL: https://issues.apache.org/jira/browse/LUCENE-9184 > Project: Lucene - Core > Issue Type: Wish > Components: general/build >Reporter: Uwe Schindler >Assignee: Uwe Schindler >Priority: Major > Fix For: master (9.0) > > > Depending on the type of Git Client you are using (I hate the command line, I > use Eclipse Git or TortoiseGit -- my preference), the way files are > committed differs. Normally with git command line you would first stage all > files and then commit them. If you stage them and then run precommit, it > works fine, as the "changed" and "added" and other statuses are ignored and it's > still confirmed as "clean". After the pre-COMMIT task you finally commit. > But Git GUIs don't have the concept of staging. You can (similar to > Subversion) add files and delete files, but when you modify a file you cannot > explicitly "stage" the change. What you do is to open the commit GUI, put > checkboxes on all files you want to commit and then the GUI triggers a stage > and commit directly after each other. > In this workflow, the precommit check of course complains about "modified" > files. > This is the reason why the Ant task has 2 modes: > - The strict mode which forbids any change in the working copy, so it must be > 100% clean. By default, Ant only runs this if the property "is.jenkins.build" > is enabled. The reason for that is to detect any change in the working copy > caused by running the Jenkins CI (like temporary files munging around). > - The default "committer/developer" mode: In this case the working copy check > only complains about "untracked" or "missing" files. So a committer who > changes some files can still pass precommit.
If he adds a new file, he has to > add it to the index, so it's not untracked. But generally normal modifications > of working copy are allowed. > Please add this back. There was a reason why I set up the check-working-copy > Ant task like it was. > If others agree, I'd like to change the task so it has two modes: > - Full clean mode (for CI builds), enabled only if it's a CI build -- we > should maybe add some tasks like "jenkins-hourly" on root project that enable > this mode > - Developer mode (default), that does not care about "modified" files.
[jira] [Closed] (LUCENE-9183) Allow optional skipping of git status check in precommit
[ https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler closed LUCENE-9183. - > Allow optional skipping of git status check in precommit > > > Key: LUCENE-9183 > URL: https://issues.apache.org/jira/browse/LUCENE-9183 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > > Had an offline conversation with Uwe about it. For people who don't use git > staging > (only IDEs) the precommit may be problematic as it currently fails on locally > changed > files. > I'll add an option to skip it, if the developer so desires.
[jira] [Created] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
Uwe Schindler created LUCENE-9184: - Summary: Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant) Key: LUCENE-9184 URL: https://issues.apache.org/jira/browse/LUCENE-9184 Project: Lucene - Core Issue Type: Wish Components: general/build Reporter: Uwe Schindler Fix For: master (9.0) Depending on the type of Git Client you are using (I hate the command line, I use Eclipse Git or TortoiseGit -- my preference), the way files are committed differs. Normally with git command line you would first stage all files and then commit them. If you stage them and then run precommit, it works fine, as the "changed" and "added" and other statuses are ignored and it's still confirmed as "clean". After the pre-COMMIT task you finally commit. But Git GUIs don't have the concept of staging. You can (similar to Subversion) add files and delete files, but when you modify a file you cannot explicitly "stage" the change. What you do is to open the commit GUI, put checkboxes on all files you want to commit and then the GUI triggers a stage and commit directly after each other. In this workflow, the precommit check of course complains about "modified" files. This is the reason why the Ant task has 2 modes: - The strict mode which forbids any change in the working copy, so it must be 100% clean. By default, Ant only runs this if the property "is.jenkins.build" is enabled. The reason for that is to detect any change in the working copy caused by running the Jenkins CI (like temporary files munging around). - The default "committer/developer" mode: In this case the working copy check only complains about "untracked" or "missing" files. So a committer who changes some files can still pass precommit. If he adds a new file, he has to add it to the index, so it's not untracked. But generally normal modifications of working copy are allowed. Please add this back. There was a reason why I set up the check-working-copy Ant task like it was.
If others agree, I'd like to change the task so it has two modes: - Full clean mode (for CI builds), enabled only if it's a CI build -- we should maybe add some tasks like "jenkins-hourly" on root project that enable this mode - Developer mode (default), that does not care about "modified" files.
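The two modes requested in LUCENE-9184 can be sketched in a few lines of Java. This is only an illustration of the rule described in the issue, not the actual Ant or Gradle task: the class `WorkingCopyCheck`, the enums `Mode` and `FileState`, and the helper `modeFor` are hypothetical names, and treating "strict" as "any listed file fails the check" is an assumption drawn from the phrase "100% clean".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WorkingCopyCheck {
    /** The two modes described in the issue. */
    enum Mode { STRICT, DEVELOPER }

    /** Hypothetical per-file states, named after the status words quoted in the issue. */
    enum FileState { UNTRACKED, MISSING, MODIFIED, CHANGED, ADDED }

    /** Assumption: a CI flag (like "is.jenkins.build" in Ant) selects strict mode. */
    static Mode modeFor(boolean ciBuild) {
        return ciBuild ? Mode.STRICT : Mode.DEVELOPER;
    }

    /** Returns the files that make the working copy "not clean" in the given mode. */
    static List<String> offendingFiles(Map<String, FileState> status, Mode mode) {
        List<String> offending = new ArrayList<>();
        // Copy into a TreeMap so the report order is stable (sorted by path).
        for (Map.Entry<String, FileState> entry : new TreeMap<>(status).entrySet()) {
            FileState state = entry.getValue();
            // Strict mode complains about everything; developer mode only
            // about untracked or missing files.
            boolean complain = mode == Mode.STRICT
                || state == FileState.UNTRACKED
                || state == FileState.MISSING;
            if (complain) {
                offending.add(state + " " + entry.getKey());
            }
        }
        return offending;
    }

    public static void main(String[] args) {
        Map<String, FileState> status = new TreeMap<>();
        status.put("build.gradle", FileState.MODIFIED);
        status.put("notes.txt", FileState.UNTRACKED);
        System.out.println("developer: " + offendingFiles(status, modeFor(false)));
        System.out.println("strict:    " + offendingFiles(status, modeFor(true)));
    }
}
```

In developer mode a locally modified file passes while an untracked one is still reported, which matches the GUI-client workflow described above. A real task would obtain the per-file states from the Git client library rather than from a hand-built map.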
[jira] [Created] (LUCENE-9183) Allow optional skipping of git status check in precommit
Dawid Weiss created LUCENE-9183: --- Summary: Allow optional skipping of git status check in precommit Key: LUCENE-9183 URL: https://issues.apache.org/jira/browse/LUCENE-9183 Project: Lucene - Core Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Had an offline conversation with Uwe about it. For people who don't use git staging (only IDEs) the precommit may be problematic as it currently fails on locally changed files. I'll add an option to skip it, if the developer so desires.
[jira] [Commented] (LUCENE-9171) Synonyms Boost by Payload
[ https://issues.apache.org/jira/browse/LUCENE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024589#comment-17024589 ] Alessandro Benedetti commented on LUCENE-9171: -- Thanks [~romseygeek], your feedback has been extremely valuable. I proceeded with the implementation. The code is attached to the PR and it seems much cleaner to me now that I followed the AttributeSource approach. Let me know. > Synonyms Boost by Payload > - > > Key: LUCENE-9171 > URL: https://issues.apache.org/jira/browse/LUCENE-9171 > Project: Lucene - Core > Issue Type: New Feature > Components: core/queryparser >Reporter: Alessandro Benedetti >Priority: Major > > I have been working on the additional capability of boosting queries by term > payloads through a parameter to enable it in Lucene Query Builder. > This has been done targeting the Synonyms Query. > It is parametric, so it is meant to make no difference unless the feature is > enabled. > Solr has its bits to comply through its SynonymsQueryStyles
[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed
[ https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024584#comment-17024584 ] Christine Poerschke commented on SOLR-14201: Thanks [~vinhlh] for sharing these steps to reproduce the issue, and details of the unreleased classes. I notice that 'optimise' and 'alias' use is part of the steps; if one or both of them was omitted and the issue then did or did not continue to happen, that might provide further insights, if not already tried? Specifically w.r.t. the 'optimise' step, it might be interesting to explore if the optimise has finished by the time the collection deletion is requested. [~GoodmanR]'s SOLR-13609 ticket is also about visibility into optimise progress. > some SolrCore are not released after being removed > -- > > Key: SOLR-14201 > URL: https://issues.apache.org/jira/browse/SOLR-14201 > Project: Solr > Issue Type: Bug >Reporter: Christine Poerschke >Priority: Major > Attachments: image-2020-01-22-10-39-15-301.png, > image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, > image-2020-01-22-14-45-52-730.png > > > [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0): > bq. In 7.7.2, some SolrCore still are not released after being removed. > https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png > Starting this ticket for a separate investigation and fix. A next > investigative step could be to try and reproduce the issue on the latest 8.x > release.
[jira] [Commented] (SOLR-14213) Configuring Solr Cloud to use Shared Storage
[ https://issues.apache.org/jira/browse/SOLR-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024570#comment-17024570 ] Andy Vuong commented on SOLR-14213: --- This should probably be in solr.xml and not solrconfig.xml as I said above. > Configuring Solr Cloud to use Shared Storage > > > Key: SOLR-14213 > URL: https://issues.apache.org/jira/browse/SOLR-14213 > Project: Solr > Issue Type: Sub-task > Components: SolrCloud >Reporter: Andy Vuong >Priority: Minor > > Clients can currently create shared collections by sending a collection > admin command such as > *_solr/admin/collections?action=CREATE=gettingstarted=true=1_* > > There is a set of shared-storage-specific classes such as > SharedStorageManager that get initialized on startup when the CoreContainer > loads. There are also components that are lazily loaded when shared storage > functionality is needed. This was initially written this way because a Solr > Cloud cluster could spin up and not use shared collections, in which case > shared store components wouldn’t need to be loaded. There is also no support > for configuring Solr Cloud to use shared storage via config files. Lazy > loading leads to some poor code and initialization flow that should be > revisited. > This JIRA is for designing the configuration of Solr Cloud to use shared > storage and initializing shared storage components based on this.
[jira] [Commented] (SOLR-8776) Support RankQuery in grouping
[ https://issues.apache.org/jira/browse/SOLR-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024561#comment-17024561 ] David White commented on SOLR-8776: --- Is there any plan on moving this fix forward into an official version of Solr? This is a crucial bug. > Support RankQuery in grouping > - > > Key: SOLR-8776 > URL: https://issues.apache.org/jira/browse/SOLR-8776 > Project: Solr > Issue Type: Improvement > Components: search >Affects Versions: 6.0 >Reporter: Diego Ceccarelli >Priority: Minor > Attachments: 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, > 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, > 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, > 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, > 0001-SOLR-8776-Support-RankQuery-in-grouping.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently it is not possible to use RankQuery [1] and Grouping [2] together > (see also [3]). In some situations Grouping can be replaced by Collapse and > Expand Results [4] (that supports reranking), but i) collapse cannot > guarantee that at least a minimum number of groups will be returned for a > query, and ii) in the Solr Cloud setting you will have constraints on how to > partition the documents among the shards. > I'm going to start working on supporting RankQuery in grouping. I'll start > attaching a patch with a test that fails because grouping does not support > the rank query and then I'll try to fix the problem, starting from the non > distributed setting (GroupingSearch). > My feeling is that since grouping is mostly performed by Lucene, RankQuery > should be refactored and moved (or partially moved) there. > Any feedback is welcome. 
> [1] https://cwiki.apache.org/confluence/display/solr/RankQuery+API
> [2] https://cwiki.apache.org/confluence/display/solr/Result+Grouping
> [3] http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201507.mbox/%3ccahm-lpuvspest-sw63_8a6gt-wor6ds_t_nb2rope93e4+s...@mail.gmail.com%3E
> [4] https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse
andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371411317 ## File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java ## @@ -26,7 +26,9 @@ import java.util.Objects; public class OverseerSolrResponse extends SolrResponse { - + + private static final long serialVersionUID = 4721653044098960880L; Review comment: hi Tomas, that's my understanding too - and it's possible that people may be running builds of earlier Solr versions whose serialVersionUID for this class differ from 472165... . I've some (slightly shaky!) evidence in support of doing it this way: in https://github.com/apache/lucene-solr/pull/1140 for SOLR-14165 I saw that the same earlier UID for SolrResponse had been generated by several different stacks, but it's possible the stacks weren't sufficiently different. Also I can see about a dozen instances of serialVersionUID being set to explicit values (other than 1) in the Lucene/Solr codebase, presumably for reasons similar to the current one - though to be fair I've no way to tell if these have caused compatibility issues with custom builds in the past. I do think that serialVersionUID should be set explicitly for all serializable classes, and that setting it to the value used in the earlier official release builds is the best value to use when it's being set retrospectively like this. In the interests of finding other approaches to making the serialization vs javabin change backwards-compatible, I've just run a build that didn't set serialVersionUID explicitly but made useUnsafeSerialization and useUnsafeDeserialization private, to see if this would give me the same default UID as before. It didn't - I got a third value -550706... instead. 
So as far as I can see the only options here are to remove all the changes from OverseerSolrResponse and put them elsewhere as you describe above, or to set its serialVersionUID to a possibly-incompatible value as I've done in this PR. hope this helps! Andy This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
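The point about pinning serialVersionUID can be illustrated with a minimal sketch. This is not the code from the pull request: the class names PinnedResponse and UnpinnedResponse are hypothetical, and only the standard java.io.ObjectStreamClass API is used. It shows that a declared UID is returned as-is, while an undeclared one is computed from the class shape and can therefore change when members or modifiers change.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo classes; they only stand in for a serializable response type.
final class SerialVersionUidDemo {

    // The UID is pinned explicitly, so later edits to the class do not change it.
    static final class PinnedResponse implements Serializable {
        private static final long serialVersionUID = 4721653044098960880L;
        String payload;
    }

    // No UID declared: the serialization runtime derives one from the class
    // structure (name, fields, methods, modifiers).
    static final class UnpinnedResponse implements Serializable {
        String payload;
    }

    // Returns the declared UID if present, otherwise the computed default.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("pinned:   " + uidOf(PinnedResponse.class));
        System.out.println("unpinned: " + uidOf(UnpinnedResponse.class));
    }
}
```

Running the same lookup against a class before and after a change (for example, making a field private) is one way to check whether the default UID moved.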
[jira] [Commented] (LUCENE-4702) Terms dictionary compression
[ https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024532#comment-17024532 ]

Adrien Grand commented on LUCENE-4702:

OK I benchmarked with multi-segment indices this time to try to better replicate nightly benchmarks. I opened a pull request at https://github.com/apache/lucene-solr/pull/1216 that:
- removes compression of suffix lengths since it didn't help much anyway,
- replaces LZ4 on stats by explicit run-length compression,
- only tries out LZ4 for suffix bytes if the average suffix length is > 6, since it's unlikely to meet the saving expectations otherwise; this reduces index-time overhead.

On wikibigall, the specialized RLE makes the tim file even smaller with this change (969MB vs. 996MB) and luceneutil seems to be a bit happier:

{noformat}
Task                 QPS baseline  StdDev   QPS patch  StdDev   Pct diff
IntNRQ                     144.16  (1.2%)      143.47  (1.9%)   -0.5% (  -3% -   2%)
TermBGroup1M                32.04  (5.1%)       31.93  (5.1%)   -0.4% ( -10% -  10%)
TermDTSort                  39.13  (0.9%)       39.05  (1.0%)   -0.2% (  -2% -   1%)
TermGroup1M                 40.18  (4.0%)       40.12  (3.4%)   -0.2% (  -7% -   7%)
TermTitleSort              124.62  (1.9%)      124.54  (1.6%)   -0.1% (  -3% -   3%)
TermDayOfYearSort           88.37  (6.9%)       88.34  (7.1%)   -0.0% ( -13% -  14%)
TermGroup10K                28.56  (5.0%)       28.56  (4.4%)    0.0% (  -8% -   9%)
IntervalsOrdered             4.50  (1.1%)        4.51  (0.6%)    0.0% (  -1% -   1%)
TermBGroup1M1P              45.83  (4.1%)       45.85  (4.0%)    0.0% (  -7% -   8%)
TermMonthSort              137.33  (1.8%)      137.40  (1.3%)    0.1% (  -2% -   3%)
AndHighHigh                 72.97  (2.8%)       73.05  (2.7%)    0.1% (  -5% -   5%)
OrHighMed                   77.75  (2.7%)       77.85  (2.7%)    0.1% (  -5% -   5%)
SpanNear                    10.66  (1.2%)       10.68  (1.2%)    0.2% (  -2% -   2%)
Phrase                      59.75  (4.9%)       59.91  (5.2%)    0.3% (  -9% -  10%)
Term                      1358.87  (6.8%)     1363.02  (6.1%)    0.3% ( -11% -  14%)
AndMedOrHighHigh            28.18  (3.0%)       28.27  (2.5%)    0.3% (  -5% -   6%)
OrHighHigh                  18.55  (3.2%)       18.61  (2.2%)    0.3% (  -4% -   5%)
SloppyPhrase                19.41  (3.9%)       19.49  (3.5%)    0.4% (  -6% -   8%)
AndHighMed                  65.81  (2.8%)       66.15  (2.4%)    0.5% (  -4% -   5%)
AndHighOrMedMed             36.49  (2.5%)       36.69  (1.9%)    0.5% (  -3% -   5%)
TermGroup100                12.19  (3.9%)       12.27  (4.0%)    0.6% (  -7% -   8%)
PKLookup                   217.61  (3.2%)      220.39  (3.3%)    1.3% (  -5% -   8%)
Prefix3                    197.95  (3.3%)      202.32  (3.4%)    2.2% (  -4% -   9%)
Wildcard                    37.78  (2.2%)       41.43  (2.8%)    9.6% (   4% -  14%)
Fuzzy1                      47.77  (5.5%)       53.35  (8.4%)   11.7% (  -2% -  27%)
Fuzzy2                      43.69  (7.5%)       49.50 (10.7%)   13.3% (  -4% -  34%)
Respell                     34.05  (1.6%)       41.94  (1.4%)   23.2% (  19% -  26%)
{noformat}

I plan to commit it and see how that affects nightly benchmarks.

> Terms dictionary compression
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
> Issue Type: Wish
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a
> call to IndexOutput.writeBytes to write suffix bytes with a call to
> LZ4.compressHC to test the performance hit. Interestingly, search performance
> was very good (see comparison table below) and the tim files were 14% smaller
> (from 150432 bytes overall to 129516).
> {noformat}
> Task       QPS baseline  StdDev   QPS compressed  StdDev   Pct diff
> Fuzzy1           111.50  (2.0%)            78.78  (1.5%)  -29.4% ( -32% - -26%)
> Fuzzy2            36.99  (2.7%)            28.59  (1.5%)  -22.7% ( -26% - -18%)
> Respell          122.86  (2.1%)           103.89  (1.7%)  -15.4% ( -18% - -11%)
>
[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024531#comment-17024531 ] Chris M. Hostetter commented on SOLR-14040: --- David: we're still seeing a much higher rate of jenkins failures from TestBulkSchemaConcurrent since your changes than we've ever seen in the past ... these don't appear to reproduce reliably, suggesting that there is some sort of timing/concurrency issue at play (not surprising given the nature of the changes and the nature of the test). Have you investigated these at all? > solr.xml shareSchema does not work in SolrCloud > --- > > Key: SOLR-14040 > URL: https://issues.apache.org/jira/browse/SOLR-14040 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: 8.5 > > Time Spent: 0.5h > Remaining Estimate: 0h > > solr.xml has a shareSchema boolean option that can be toggled from the > default of false to true in order to share IndexSchema objects within the > Solr node. This is silently ignored in SolrCloud mode. The pertinent code > is {{org.apache.solr.core.ConfigSetService#createConfigSetService}} which > creates a CloudConfigSetService that is not related to the SchemaCaching > class. This may not be a big deal in SolrCloud which tends not to deal well > with many cores per node but I'm working on changing that.
[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024529#comment-17024529 ] Lucene/Solr QA commented on LUCENE-9164: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 49s{color} | {color:green} core in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 5m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-9164 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991924/LUCENE-9164.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 9e4c445d174 | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/251/testReport/ | | modules | C: lucene/core U: lucene/core | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/251/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > Time Spent: 10m > Remaining Estimate: 0h > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz opened a new pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead.
jpountz opened a new pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead.
URL: https://github.com/apache/lucene-solr/pull/1216

Changes include:
- Removed LZ4 compression of suffix lengths which didn't save much space anyway.
- For stats, LZ4 was only really used for run-length compression of terms whose docFreq is 1. This has been replaced by explicit run-length compression.
- Since we only use LZ4 for suffix bytes if the compression ratio is < 75%, we now only try LZ4 out if the average suffix length is greater than 6, in order to reduce index-time overhead.
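The two thresholds mentioned in this pull request description (average suffix length greater than 6, and a compression ratio below 75%) can be sketched as a pair of checks. This is a hedged illustration only: the class and method names are hypothetical and are not taken from the Lucene code base, and the real writer may combine these conditions differently.

```java
// Hypothetical helper; names are illustrative and not part of Lucene.
final class SuffixCompressionHeuristic {

    // Only attempt compression when the average suffix length exceeds 6 bytes.
    // Written as a multiplication to avoid integer-division rounding.
    static boolean shouldTryCompression(int totalSuffixBytes, int numTerms) {
        return numTerms > 0 && (long) totalSuffixBytes > 6L * numTerms;
    }

    // Keep the compressed form only if it is smaller than 75% of the raw size.
    static boolean shouldKeepCompressed(int compressedLength, int rawLength) {
        return 4L * compressedLength < 3L * rawLength;
    }

    public static void main(String[] args) {
        System.out.println(shouldTryCompression(70, 10));
        System.out.println(shouldKeepCompressed(74, 100));
    }
}
```

The first check is cheap and runs before any compression work, which is where the index-time saving would come from under these assumptions.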
[jira] [Commented] (LUCENE-4702) Terms dictionary compression
[ https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024523#comment-17024523 ]

ASF subversion and git services commented on LUCENE-4702:

Commit 9e4c445d17415e8b8433872df4e263d1ef144dba in lucene-solr's branch refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9e4c445 ]

LUCENE-4702: CHANGES entry.

> Terms dictionary compression
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
> Issue Type: Wish
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a
> call to IndexOutput.writeBytes to write suffix bytes with a call to
> LZ4.compressHC to test the performance hit. Interestingly, search performance
> was very good (see comparison table below) and the tim files were 14% smaller
> (from 150432 bytes overall to 129516).
> {noformat}
> Task               QPS baseline  StdDev   QPS compressed  StdDev   Pct diff
> Fuzzy1                   111.50  (2.0%)            78.78  (1.5%)  -29.4% ( -32% - -26%)
> Fuzzy2                    36.99  (2.7%)            28.59  (1.5%)  -22.7% ( -26% - -18%)
> Respell                  122.86  (2.1%)           103.89  (1.7%)  -15.4% ( -18% - -11%)
> Wildcard                 100.58  (4.3%)            94.42  (3.2%)   -6.1% ( -13% -   1%)
> Prefix3                  124.90  (5.7%)           122.67  (4.7%)   -1.8% ( -11% -   9%)
> OrHighLow                169.87  (6.8%)           167.77  (8.0%)   -1.2% ( -15% -  14%)
> LowTerm                 1949.85  (4.5%)          1929.02  (3.4%)   -1.1% (  -8% -   7%)
> AndHighLow              2011.95  (3.5%)          1991.85  (3.3%)   -1.0% (  -7% -   5%)
> OrHighHigh               155.63  (6.7%)           154.12  (7.9%)   -1.0% ( -14% -  14%)
> AndHighHigh              341.82  (1.2%)           339.49  (1.7%)   -0.7% (  -3% -   2%)
> OrHighMed                217.55  (6.3%)           216.16  (7.1%)   -0.6% ( -13% -  13%)
> IntNRQ                    53.10 (10.9%)            52.90  (8.6%)   -0.4% ( -17% -  21%)
> MedTerm                  998.11  (3.8%)           994.82  (5.6%)   -0.3% (  -9% -   9%)
> MedSpanNear               60.50  (3.7%)            60.36  (4.8%)   -0.2% (  -8% -   8%)
> HighSpanNear              19.74  (4.5%)            19.72  (5.1%)   -0.1% (  -9% -   9%)
> LowSpanNear              101.93  (3.2%)           101.82  (4.4%)   -0.1% (  -7% -   7%)
> AndHighMed               366.18  (1.7%)           366.93  (1.7%)    0.2% (  -3% -   3%)
> PKLookup                 237.28  (4.0%)           237.96  (4.2%)    0.3% (  -7% -   8%)
> MedPhrase                173.17  (4.7%)           174.69  (4.7%)    0.9% (  -8% -  10%)
> LowSloppyPhrase          180.91  (2.6%)           182.79  (2.7%)    1.0% (  -4% -   6%)
> LowPhrase                374.64  (5.5%)           379.11  (5.8%)    1.2% (  -9% -  13%)
> HighTerm                 253.14  (7.9%)           256.97 (11.4%)    1.5% ( -16% -  22%)
> HighPhrase                19.52 (10.6%)            19.83 (11.0%)    1.6% ( -18% -  25%)
> MedSloppyPhrase          141.90  (2.6%)           144.11  (2.5%)    1.6% (  -3% -   6%)
> HighSloppyPhrase          25.26  (4.8%)            25.97  (5.0%)    2.8% (  -6% -  13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit
> (Fuzzy1, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved
> (surprisingly) well.
> Do you think of it as something worth exploring?
[jira] [Commented] (LUCENE-9116) Simplify postings API by removing long[] metadata
[ https://issues.apache.org/jira/browse/LUCENE-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024521#comment-17024521 ] ASF subversion and git services commented on LUCENE-9116: - Commit ace4fcc7be47e171d37932a191d646f1924a9319 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ace4fcc ] LUCENE-9116: Remove long[] from `PostingsWriterBase#encodeTerm`. (#1149) (#1158) All the metadata can be directly encoded in the `DataOutput`. > Simplify postings API by removing long[] metadata > - > > Key: LUCENE-9116 > URL: https://issues.apache.org/jira/browse/LUCENE-9116 > Project: Lucene - Core > Issue Type: Task >Reporter: Adrien Grand >Priority: Minor > Fix For: 8.5 > > Time Spent: 50m > Remaining Estimate: 0h > > The postings API allows to store metadata about a term either in a long[] or > in a byte[]. This is unnecessary as all information could be encoded in the > byte[], which is what most codecs do in practice. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
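The idea that per-term metadata can live entirely in a byte stream rather than in a separate long[] can be sketched with plain java.io classes. This is an assumption-laden illustration: it does not use Lucene's `DataOutput` or `PostingsWriterBase`, the field names docStartFP and posStartFP are hypothetical, and the variable-length encoding is a simple 7-bits-per-byte scheme written out by hand for non-negative values.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical codec; it only illustrates folding long values into a byte[].
final class TermMetadataCodec {

    // Writes a non-negative long using 7 data bits per byte; the high bit
    // marks that more bytes follow.
    static void writeVLong(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0L) {
            out.write((int) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes two illustrative file pointers into a single byte[].
    static byte[] encode(long docStartFP, long posStartFP) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVLong(out, docStartFP);
        writeVLong(out, posStartFP);
        return out.toByteArray();
    }

    // Decodes the two values written by encode, in the same order.
    static long[] decode(byte[] encoded) {
        long[] values = new long[2];
        int pos = 0;
        for (int i = 0; i < values.length; i++) {
            long result = 0L;
            int shift = 0;
            byte b;
            do {
                b = encoded[pos++];
                result |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            values[i] = result;
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] bytes = encode(300L, 5L);
        long[] decoded = decode(bytes);
        System.out.println(bytes.length + " bytes, values " + decoded[0] + " and " + decoded[1]);
    }
}
```

With a scheme like this, small values take a single byte, which is one reason a separate fixed-width long[] channel may be unnecessary.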
[jira] [Commented] (LUCENE-4702) Terms dictionary compression
[ https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024522#comment-17024522 ]

ASF subversion and git services commented on LUCENE-4702:

Commit 666bdac64d68c3f247760d0a2a1c7a441502af1e in lucene-solr's branch refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=666bdac ]

LUCENE-4702: CHANGES entry.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation
dsmiley commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371376748 ## File path: solr/core/src/java/org/apache/solr/search/TopLevelJoinQuery.java ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.search; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; + +import org.apache.lucene.index.DocValues; +import org.apache.lucene.index.LeafReader; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.SortedSetDocValues; +import org.apache.lucene.search.Collector; +import org.apache.lucene.search.ConstantScoreScorer; +import org.apache.lucene.search.ConstantScoreWeight; +import org.apache.lucene.search.DocIdSetIterator; +import org.apache.lucene.search.IndexSearcher; +import org.apache.lucene.search.Query; +import org.apache.lucene.search.ScoreMode; +import org.apache.lucene.search.Scorer; +import org.apache.lucene.search.TwoPhaseIterator; +import org.apache.lucene.search.Weight; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.LongBitSet; +import org.apache.solr.common.SolrException; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.search.join.MultiValueTermOrdinalCollector; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class TopLevelJoinQuery extends JoinQuery { Review comment: Always add at least one sentence javadoc for a class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024516#comment-17024516 ] Dawid Weiss commented on LUCENE-9182: - Ok. > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024515#comment-17024515 ] Robert Muir commented on LUCENE-9182: - Especially this section: [https://www.apache.org/legal/src-headers.html#faq-exceptions] I took a look at other projects such as tomcat, hadoop, etc. I am seeing headers for all build.xml, pom.xml, even things like build.properties have license headers. The current ant build files in our repo also have license headers. > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024514#comment-17024514 ] Robert Muir commented on LUCENE-9182: - I think so. This is just my interpretation from reading https://www.apache.org/legal/src-headers.html > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024512#comment-17024512 ] Dawid Weiss commented on LUCENE-9182: - I didn't add them on purpose... are they really required (is it an apache legal requirement)? If it's not required I wouldn't bother. > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024513#comment-17024513 ] ASF subversion and git services commented on LUCENE-9182: - Commit fd5a0ce7c26eff4524b6968b8e84322299516b17 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd5a0ce ] LUCENE-9182: the rat-sources.gradle was the one .gradle file already with a license header, we don't need it twice > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024511#comment-17024511 ] Nhat Nguyen commented on LUCENE-9164: - [~atris] Thanks for looking. This is not about double closing. An outstanding refresh (i.e., IndexWriter#getReader) considers ACE a tragedy if IndexWriter is closed midway. This behavior is bogus and requires another layer of locking. > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > Time Spent: 10m > Remaining Estimate: 0h > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8143) Remove SpanBoostQuery
[ https://issues.apache.org/jira/browse/LUCENE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024509#comment-17024509 ] Alan Woodward commented on LUCENE-8143: --- I'm not sure it's a fault in SpanScorer - PayloadScoreQuery for example will adjust a score depending on individual spans matched, so there is a way to do it. It's just that SpanBoostQuery doesn't... > Remove SpanBoostQuery > - > > Key: LUCENE-8143 > URL: https://issues.apache.org/jira/browse/LUCENE-8143 > Project: Lucene - Core > Issue Type: Task >Reporter: Adrien Grand >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I initially added it so that span queries could still be boosted, but this > was actually a mistake: boosts are ignored on inner span queries, only the > boost of the top-level span query, the one that performs scoring, is not > ignored. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-9182. - Fix Version/s: master (9.0) Resolution: Fixed > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024507#comment-17024507 ] ASF subversion and git services commented on LUCENE-9182: - Commit 975df9ddd3688fa3530cb975b77005c4eb863d05 in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=975df9d ] LUCENE-9182: add apache license headers to all .gradle files and enforce in rat task > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024505#comment-17024505 ] Robert Muir commented on LUCENE-9182: - the only interesting changes are to the rat task itself. the rest i auto-gen'd
{code}
diff --git a/gradle/validation/rat-sources.gradle b/gradle/validation/rat-sources.gradle
index c50bd5005e0..82875bab1c4 100644
--- a/gradle/validation/rat-sources.gradle
+++ b/gradle/validation/rat-sources.gradle
@@ -1,3 +1,20 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
 /*
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements. See the NOTICE file
@@ -40,6 +57,7 @@ configure(rootProject) {
 rat {
 includes += [
 "buildSrc/**/*.java",
+"gradle/**/*.gradle",
 "lucene/tools/forbiddenApis/**",
 "lucene/tools/prettify/**",
 ]
@@ -119,6 +137,7 @@ configure(project(":solr:webapp")) {
 class RatTask extends DefaultTask {
 @Input
 List includes = [
+"*.gradle",
 "*.xml",
 "src/tools/**"
 ]
@@ -131,7 +150,6 @@ class RatTask extends DefaultTask {
 "**/TODO",
 "**/*.txt",
 "**/*.iml",
-"**/*.gradle",
 "build/**"
 ]
{code}
> add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
[ https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-9182: Attachment: LUCENE-9182.patch > add apache license headers to all .gradle files and enforce in rat task > --- > > Key: LUCENE-9182 > URL: https://issues.apache.org/jira/browse/LUCENE-9182 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Major > Attachments: LUCENE-9182.patch > > > Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task
Robert Muir created LUCENE-9182: --- Summary: add apache license headers to all .gradle files and enforce in rat task Key: LUCENE-9182 URL: https://issues.apache.org/jira/browse/LUCENE-9182 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Assignee: Robert Muir Attachments: LUCENE-9182.patch Currently rat is ignoring the problem, let's fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse
tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371362174 ## File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java ## @@ -26,7 +26,9 @@ import java.util.Objects; public class OverseerSolrResponse extends SolrResponse { - + + private static final long serialVersionUID = 4721653044098960880L; Review comment: My understanding is that this number can actually vary depending on the compiler, so setting it to a specific value like this (expecting it to be the number the class had in previous versions) may not work for everyone. Since the changes done in SOLR-14095 are just the addition of static methods, maybe the better solution is to just revert them from OverseerSolrResponse and put them in some other util class. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
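The concern in this review comment can be illustrated with a small, self-contained sketch. It is not the code from the pull request: the nested classes below are hypothetical stand-ins, and only `java.io.ObjectStreamClass.lookup(...).getSerialVersionUID()` is a real JDK call. The sketch contrasts a class that pins its `serialVersionUID` with one that leaves it to the default computation, which hashes the class's shape, including non-private static methods.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical illustration only; none of these classes exist in Solr.
final class SerialVersionUidSketch {

  // Stand-in for a response class that pins its UID explicitly
  // (the literal mirrors the value quoted in the diff above).
  static class PinnedResponse implements Serializable {
    private static final long serialVersionUID = 4721653044098960880L;
    String payload;
  }

  // Stand-in for a class that relies on the default computed UID.
  static class UnpinnedResponse implements Serializable {
    String payload;

    // Non-private methods, static ones included, feed into the default UID,
    // so adding or removing a helper like this changes the computed value.
    static UnpinnedResponse of(String payload) {
      UnpinnedResponse response = new UnpinnedResponse();
      response.payload = payload;
      return response;
    }
  }

  // Returns the UID the serialization runtime will use for the given class.
  static long uidOf(Class<?> type) {
    return ObjectStreamClass.lookup(type).getSerialVersionUID();
  }

  public static void main(String[] args) {
    System.out.println("pinned   = " + uidOf(PinnedResponse.class));
    System.out.println("unpinned = " + uidOf(UnpinnedResponse.class));
  }
}
```

The pinned value is stable by construction. The unpinned value depends on the compiled class, which is why moving the static helpers into a separate utility class, as suggested above, would leave the default UID of the original class untouched.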
[jira] [Commented] (LUCENE-9181) gradlew(.bat) should pass --parallel flag
[ https://issues.apache.org/jira/browse/LUCENE-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024494#comment-17024494 ] Dawid Weiss commented on LUCENE-9181: - These wrappers are sometimes regenerated, so such tweaks would have to be reapplied, but it sounds ok to me! An alternative is to stop the build on the first run and require a re-run... Seems lame though. > gradlew(.bat) should pass --parallel flag > - > > Key: LUCENE-9181 > URL: https://issues.apache.org/jira/browse/LUCENE-9181 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > Followup to LUCENE-9179. > For example I have 2 real cores (4 apparent cpus, hyperthreads). > With the LUCENE-9179 change the build will work the first time, but it will only > use one builder and take an eternity. > Instead, if these wrappers passed --parallel then the first build would use 4 > builders (built-in gradle default). > Subsequent builds for me would only use 2: we still pass --parallel but now > our gradle.properties tells it to only use 2. > This would give a better first experience (fans spin a bit higher for that first > build, but better than slow as hell?) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9178) Use run-length encoding when writing docIds in BKD tree
[ https://issues.apache.org/jira/browse/LUCENE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024490#comment-17024490 ] Adrien Grand commented on LUCENE-9178: -- This is an interesting idea. In the multi-valued case we don't have any requirements for the order of points in leaves, so we could even sort leaves by doc ID in order to make this more likely to kick in. This would break the other storage optimization we have that does run-length encoding on the leading byte of the dimension that has the shortest shared prefix, but maybe there would be greater savings on doc IDs by doing delta plus run-length encoding. > Use run-length encoding when writing docIds in BKD tree > --- > > Key: LUCENE-9178 > URL: https://issues.apache.org/jira/browse/LUCENE-9178 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ignacio Vera >Priority: Major > > I think we can easily check if it makes sense to write docIds using run-length > compression in the BKD tree. This can probably save some space in the case of > multi-valued documents, e.g. LatLonShape and XYShape. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
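One possible reading of "delta plus run-length encoding" for a leaf whose doc IDs have been sorted is sketched below. This is not Lucene's BKD writer code: the class and method names are hypothetical, and the sketch ignores the byte-level layout entirely. Each run of identical doc IDs (as produced by multi-valued documents) becomes one pair of numbers: the delta from the previous distinct doc ID, and the number of repeats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene's BKD implementation.
final class DocIdRunLengthSketch {

  // Encodes sorted doc IDs as (deltaFromPreviousDoc, runLength) pairs.
  static int[] encode(int[] sortedDocIds) {
    List<Integer> out = new ArrayList<>();
    int previousDoc = 0;
    int i = 0;
    while (i < sortedDocIds.length) {
      int doc = sortedDocIds[i];
      int run = 1;
      // Count how many consecutive entries carry the same doc ID.
      while (i + run < sortedDocIds.length && sortedDocIds[i + run] == doc) {
        run++;
      }
      out.add(doc - previousDoc);
      out.add(run);
      previousDoc = doc;
      i += run;
    }
    return out.stream().mapToInt(Integer::intValue).toArray();
  }

  // Reverses encode(): expands each pair back into repeated doc IDs.
  static int[] decode(int[] pairs) {
    List<Integer> out = new ArrayList<>();
    int doc = 0;
    for (int i = 0; i < pairs.length; i += 2) {
      doc += pairs[i];
      for (int r = 0; r < pairs[i + 1]; r++) {
        out.add(doc);
      }
    }
    return out.stream().mapToInt(Integer::intValue).toArray();
  }

  public static void main(String[] args) {
    // Two documents, with six and four points respectively, in one leaf.
    int[] leaf = {7, 7, 7, 7, 7, 7, 8, 8, 8, 8};
    System.out.println(Arrays.toString(encode(leaf))); // prints [7, 6, 1, 4]
  }
}
```

In this toy leaf, ten doc IDs collapse to four integers. When every doc ID in a leaf is distinct, the same scheme doubles the count, which is why the issue talks about checking whether the encoding makes sense before using it.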
[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024488#comment-17024488 ] Robert Muir commented on LUCENE-9179: - I opened LUCENE-9181 for a simple thing we could do to improve that first experience. > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Dawid Weiss >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9175) gradle build leaks tons of gradle-worker-classpath* files in tmpdir
[ https://issues.apache.org/jira/browse/LUCENE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024489#comment-17024489 ] Dawid Weiss commented on LUCENE-9175: - I think it's a bug in gradle. These files are never cleaned up and temp file provider doesn't really clean them up either. https://github.com/gradle/gradle/issues/12020 > gradle build leaks tons of gradle-worker-classpath* files in tmpdir > --- > > Key: LUCENE-9175 > URL: https://issues.apache.org/jira/browse/LUCENE-9175 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > This may be a sign of classloader issues or similar that cause other issues > like LUCENE-9174? > {noformat} > $ ls /tmp/gradle-worker-classpath* | wc -l > 523 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9181) gradlew(.bat) should pass --parallel flag
Robert Muir created LUCENE-9181: --- Summary: gradlew(.bat) should pass --parallel flag Key: LUCENE-9181 URL: https://issues.apache.org/jira/browse/LUCENE-9181 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Followup to LUCENE-9179. For example I have 2 real cores (4 apparent cpus, hyperthreads). With the LUCENE-9179 change the build will work the first time, but it will only use one builder and take an eternity. Instead, if these wrappers passed --parallel then the first build would use 4 builders (built-in gradle default). Subsequent builds for me would only use 2: we still pass --parallel but now our gradle.properties tells it to only use 2. This would give a better first experience (fans spin a bit higher for that first build, but better than slow as hell?) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dnhatn opened a new pull request #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed
dnhatn opened a new pull request #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed URL: https://github.com/apache/lucene-solr/pull/1215 If an IndexWriter was closed, then AlreadyClosedException should not be considered a tragic event. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
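The rule stated in this pull request description can be sketched as a small predicate. This is an assumption-laden illustration, not the patch itself: the nested `AlreadyClosedException` is a local stand-in so the sketch compiles without Lucene on the classpath, and `shouldRecordAsTragic` and `writerClosedOrClosing` are hypothetical names.

```java
// Hypothetical illustration only; not the code from pull request #1215.
final class TragicEventSketch {

  // Local stand-in for Lucene's AlreadyClosedException.
  static class AlreadyClosedException extends IllegalStateException {
    AlreadyClosedException(String message) {
      super(message);
    }
  }

  // Assumed to be called only for failures that already reached the
  // tragic-event path. writerClosedOrClosing stands in for the writer's
  // own closed/closing state.
  static boolean shouldRecordAsTragic(Throwable failure, boolean writerClosedOrClosing) {
    if (failure instanceof AlreadyClosedException && writerClosedOrClosing) {
      // Expected once the writer is closed or closing, so not a tragedy.
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Throwable ace = new AlreadyClosedException("this IndexWriter is closed");
    System.out.println(shouldRecordAsTragic(ace, true));
    System.out.println(shouldRecordAsTragic(ace, false));
    System.out.println(shouldRecordAsTragic(new OutOfMemoryError(), true));
  }
}
```

Under these assumptions, only the combination of an already-closed exception and a closed or closing writer is exempted. Every other failure on that path is still treated as tragic.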
[jira] [Resolved] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9179. - Resolution: Fixed Thanks, didn't know about the issue. > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Dawid Weiss >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024479#comment-17024479 ] ASF subversion and git services commented on LUCENE-9179: - Commit b420ef8f77209690dcd47e45700a952409ccac62 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b420ef8 ] LUCENE-9179: don't invoke the same build recursively upon first run, just continue. Seems like gradle bug but let's not cry about it - it just happens once and CI defaults can be passed independently on command-line. > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Dawid Weiss >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup
[ https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024477#comment-17024477 ] Robert Muir commented on LUCENE-9180: - I fixed the inconsistent newlines. > newlines/gitattributes cleanup > -- > > Key: LUCENE-9180 > URL: https://issues.apache.org/jira/browse/LUCENE-9180 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Priority: Major > > merge the two .gitattributes files into a single one at the root, fix some > random files with DOS newlines that don't need them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup
[ https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024476#comment-17024476 ] ASF subversion and git services commented on LUCENE-9180: - Commit d614bb854d2b2892969c9b1f9de5f12f88f7181f in lucene-solr's branch refs/heads/branch_8x from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d614bb8 ] LUCENE-9180: dos2unix files that don't need dos line endings. gitignore gradle-specific stuff that shows up modified if you switch branches, no gradle here. > newlines/gitattributes cleanup > -- > > Key: LUCENE-9180 > URL: https://issues.apache.org/jira/browse/LUCENE-9180 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Priority: Major > > merge the two .gitattributes files into a single one at the root, fix some > random files with DOS newlines that don't need them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup
[ https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024473#comment-17024473 ] ASF subversion and git services commented on LUCENE-9180: - Commit 8e357b167bf742aacff39ddfff934a958b0a590d in lucene-solr's branch refs/heads/master from Robert Muir [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8e357b1 ] LUCENE-9180: dos2unix files that don't need dos line endings > newlines/gitattributes cleanup > -- > > Key: LUCENE-9180 > URL: https://issues.apache.org/jira/browse/LUCENE-9180 > Project: Lucene - Core > Issue Type: Task > Components: general/build >Reporter: Robert Muir >Priority: Major > > merge the two .gitattributes files into a single one at the root, fix some > random files with DOS newlines that don't need them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9020) Find a way to publish Solr RefGuide and Javadocs without checking into git
[ https://issues.apache.org/jira/browse/LUCENE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-9020. --- Resolution: Fixed > Find a way to publish Solr RefGuide and Javadocs without checking into git > -- > > Key: LUCENE-9020 > URL: https://issues.apache.org/jira/browse/LUCENE-9020 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Jan Høydahl >Assignee: Uwe Schindler >Priority: Major > > Currently we check in all versions of RefGuide (hundreds of small html files) > into svn to publish as part of the site. With new site we should find a > smoother way to do this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9020) Find a way to publish Solr RefGuide and Javadocs without checking into git
[ https://issues.apache.org/jira/browse/LUCENE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024469#comment-17024469 ] Uwe Schindler commented on LUCENE-9020: --- I was able to set everything up. It was a bit more complicated, as AliasMatch and absolute path names are not allowed in .htaccess files (which is a per-directory config, so the actual directory is already resolved). See INFRA-19439 for more details. We solved the issue mostly over the Slack channel. In short, what we did with Daniel: "Alias" and "AliasMatch" do not work in ".htaccess" (which is a per-directory config, so the file system path has already been determined and it is too late to apply aliases; aliases only work in the server config or a location). The workaround is to use "mod_rewrite". The fix consists of 2 separate parts: - INFRA added an Alias/Rewrite on their side in the global server config that can be used by all project servers. It maps URI path "/__root" to the filesystem path where all project webpages are hosted: See this initial commit: https://github.com/apache/infrastructure-p6/compare/a63511b7499f...63e23e52b18b - Lucene/Solr changed their .htaccess to use rewrite directives that just rewrite the above URLs to "/__root/old-svn-website.../". This makes it independent from real filesystem paths. We (Lucene) just know that below the URI path "/__root" we can reach all project folders that are checked out on the web server. The only downside: you can theoretically reach every website by hand-crafting a URL like https://lucene.apache.org/__root/someotherproject/somehtml. With nginx as the webserver you could define this URI path as "internal", but Apache HTTPD does not have this notion. Nginx uses this "internal" notion for resource endpoints only accessible by rewrites.
Our .htaccess now looks like this: https://github.com/apache/lucene-site/blob/3fa9933b276897f89525c61301d9e4e2da863b85/content/.htaccess#L121-L125 We did some tests by checking out part of the SVN tree on the staging machine. But the whole rewrite generally only works in production (which is no different from our old website, as the old CMS was also not showing our javadocs). The final step is to bring the website to production. We may need some more help once this is done, as we cannot guarantee that everything works perfectly in production (we hope so). > Find a way to publish Solr RefGuide and Javadocs without checking into git > -- > > Key: LUCENE-9020 > URL: https://issues.apache.org/jira/browse/LUCENE-9020 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Jan Høydahl >Assignee: Uwe Schindler >Priority: Major > > Currently we check in all versions of RefGuide (hundreds of small html files) > into svn to publish as part of the site. With new site we should find a > smoother way to do this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024461#comment-17024461 ] Atri Sharma commented on LUCENE-9164: - I am in two minds on this -- isn't trying to close an already closed IndexWriter a sign of a potentially fatal bug in the user code? This changes user-facing behaviour (unless I am reading the patch wrong), so I would want to understand the reasoning for this change. > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024460#comment-17024460 ] Dawid Weiss commented on LUCENE-9179: - I'll fix it to not run recursively; it'll just generate the defaults and continue with the build. It may be slower on the first run but it'll still print a message about it. Looks like a bug in gradle task to me. > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Assignee: Dawid Weiss >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024459#comment-17024459 ] Robert Muir commented on LUCENE-9166: - The only use-case IMO is individual test debugging. You got a SecurityException, and for some reason it's unclear why, so you have to dig a bit deeper. I haven't dug into the low-level issues with gradle here, but it's similar to other narrow use-cases such as wanting to use PrintCompilation or other such stuff in tests and get all the output without something trying to interpret it. > gradle build: test failures need stacktraces > > > Key: LUCENE-9166 > URL: https://issues.apache.org/jira/browse/LUCENE-9166 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9166.patch > > > Test failures are missing the stacktrace. Worse yet, it tells you go to look > at a separate (very long) filename which also has no stacktrace :( > I know gradle tries really hard to be quiet and not say anything, but when a > test fails, that isn't the time or place :) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024450#comment-17024450 ] Dawid Weiss commented on LUCENE-9179: - We don't need a recursive build at all -- it will just generate defaults and continue to run. I only wanted to run recursively because I hoped the new machine-specific defaults would be picked up (I don't think they are, not in full). > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9174) Bump default gradle memory to 2g
[ https://issues.apache.org/jira/browse/LUCENE-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024448#comment-17024448 ] Dawid Weiss commented on LUCENE-9174: - That's right - that's what I had in mind. The daemon runs out of heap. The situation depends on what you're running; javadocs are computed within the daemon's JVM, I think; maybe with too many parallel threads it just explodes. I haven't looked at the problem closely yet. > Bump default gradle memory to 2g > > > Key: LUCENE-9174 > URL: https://issues.apache.org/jira/browse/LUCENE-9174 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > > I see these from time to time so I'll bump the daemon's heap to 2 gigs. Don't > know why it needs so much... > {code} > Expiring Daemon because JVM heap space is exhausted > Daemon will be stopped at the end of the build after running out of JVM memory > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9180) newlines/gitattributes cleanup
Robert Muir created LUCENE-9180: --- Summary: newlines/gitattributes cleanup Key: LUCENE-9180 URL: https://issues.apache.org/jira/browse/LUCENE-9180 Project: Lucene - Core Issue Type: Task Components: general/build Reporter: Robert Muir merge the two .gitattributes files into a single one at the root, fix some random files with DOS newlines that don't need them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024442#comment-17024442 ] Dawid Weiss commented on LUCENE-9166: - bq. I can't remember how it worked, but I feel like it was still using your junit runner to actually run the tests versus the built-in gradle test support? Correct. They switched recently. There are benefits of using gradle's runner (better integration with the rest of the infrastructure is one of them). I don't know how to solve it properly yet. Security and other JVM-level messaging is a very narrow area and not commonly used. Maybe we just need a dumb substitute for running these (they're typically individual tests anyway). > gradle build: test failures need stacktraces > > > Key: LUCENE-9166 > URL: https://issues.apache.org/jira/browse/LUCENE-9166 > Project: Lucene - Core > Issue Type: Bug >Reporter: Robert Muir >Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9166.patch > > > Test failures are missing the stacktrace. Worse yet, it tells you go to look > at a separate (very long) filename which also has no stacktrace :( > I know gradle tries really hard to be quiet and not say anything, but when a > test fails, that isn't the time or place :) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9175) gradle build leaks tons of gradle-worker-classpath* files in tmpdir
[ https://issues.apache.org/jira/browse/LUCENE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024438#comment-17024438 ] Robert Muir commented on LUCENE-9175: - Gradle creates the file here: https://github.com/gradle/gradle/blob/b7f79aa9b29cd6ad7fb9f189dceb0311ef7b6bfd/subprojects/core/src/main/java/org/gradle/process/internal/worker/child/ApplicationClassesInSystemClassLoaderWorkerImplementationFactory.java#L96 > gradle build leaks tons of gradle-worker-classpath* files in tmpdir > --- > > Key: LUCENE-9175 > URL: https://issues.apache.org/jira/browse/LUCENE-9175 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > This may be a sign of classloader issues or similar that cause other issues > like LUCENE-9174? > {noformat} > $ ls /tmp/gradle-worker-classpath* | wc -l > 523 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024435#comment-17024435 ] Robert Muir commented on LUCENE-9166: - Interesting. I feel like this worked for the elasticsearch gradle build a long time ago. I can't remember how it worked, but I feel like it was still using your junit runner to actually run the tests versus the built-in gradle test support? Maybe it would solve several of our problems?
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024429#comment-17024429 ] Dawid Weiss commented on LUCENE-9166: - Correct. It is dumb. The API is there to provide a different ordering but it's all internal so I don't know if it makes sense to waste cycles right now to try to fix it. What worries me more is LUCENE-9120: this is something we will probably need sooner or later. I don't think there is an easy workaround inside gradle itself. It's more likely we'll have to redirect to ant or implement a custom java launcher for such corner-cases (which isn't a big deal but requires some coding).
[GitHub] [lucene-solr] gerlowskija commented on issue #1171: SOLR-13892: Add 'top-level' docValues Join implementation
gerlowskija commented on issue #1171: SOLR-13892: Add 'top-level' docValues Join implementation URL: https://github.com/apache/lucene-solr/pull/1171#issuecomment-578801824 > In JoinQuery.rewrite I see it says explicitly "don't rewrite the subQuery" but why not? If it never gets rewritten then that's a bug. That's a question for the original Join author maybe? I'm not familiar enough with how rewrites work to speak to it.
[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation
gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371286866 ## File path: solr/core/src/java/org/apache/solr/search/JoinQParserPlugin.java ## @@ -59,67 +60,124 @@ import org.apache.solr.search.join.ScoreJoinQParserPlugin; import org.apache.solr.util.RTimer; import org.apache.solr.util.RefCounted; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; public class JoinQParserPlugin extends QParserPlugin { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + public static final String NAME = "join"; + /** Choose the internal algorithm */ + private static final String METHOD = "method"; + + private static class JoinParams { +final String fromField; +final String fromCore; +final Query fromQuery; +final long fromCoreOpenTime; +final String toField; + +public JoinParams(String fromField, String fromCore, Query fromQuery, long fromCoreOpenTime, String toField) { + this.fromField = fromField; + this.fromCore = fromCore; + this.fromQuery = fromQuery; + this.fromCoreOpenTime = fromCoreOpenTime; + this.toField = toField; +} + } + + private enum Method { +index { + @Override + Query makeFilter(QParser qparser) throws SyntaxError { +final JoinParams jParams = parseJoin(qparser); +final JoinQuery q = new JoinQuery(jParams.fromField, jParams.toField, jParams.fromCore, jParams.fromQuery); +q.fromCoreOpenTime = jParams.fromCoreOpenTime; +return q; + } +}, +dvWithScore { + @Override + Query makeFilter(QParser qparser) throws SyntaxError { +return new ScoreJoinQParserPlugin().createParser(qparser.qstr, qparser.localParams, qparser.params, qparser.req).parse(); + } +}, +topLevelDV { + @Override + Query makeFilter(QParser qparser) throws SyntaxError { +final JoinParams jParams = parseJoin(qparser); +final JoinQuery q = new TopLevelJoinQuery(jParams.fromField, jParams.toField, jParams.fromCore, 
jParams.fromQuery); +q.fromCoreOpenTime = jParams.fromCoreOpenTime; +return q; + } +}; + +abstract Query makeFilter(QParser qparser) throws SyntaxError; + +JoinParams parseJoin(QParser qparser) throws SyntaxError { + final String fromField = qparser.getParam("from"); + final String fromIndex = qparser.getParam("fromIndex"); + final String toField = qparser.getParam("to"); + final String v = qparser.localParams.get("v"); + final String coreName; + + Query fromQuery; + long fromCoreOpenTime = 0; + + if (fromIndex != null && !fromIndex.equals(qparser.req.getCore().getCoreDescriptor().getName()) ) { +CoreContainer container = qparser.req.getCore().getCoreContainer(); + +// if in SolrCloud mode, fromIndex should be the name of a single-sharded collection +coreName = ScoreJoinQParserPlugin.getCoreName(fromIndex, container); + +final SolrCore fromCore = container.getCore(coreName); +if (fromCore == null) { + throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, + "Cross-core join: no such core " + coreName); +} + +RefCounted fromHolder = null; +LocalSolrQueryRequest otherReq = new LocalSolrQueryRequest(fromCore, qparser.params); +try { Review comment: Totally agree - should be using try-with-resources here. But I'm reluctant to introduce changes here that aren't strictly necessary. (Github shows this section as "added", but really it was just moved from elsewhere in the file.) My opinion on this changes, but I've been burned too many times recently by adding a "harmless" refactor into a related commit, only for that to cause issues later that force a revert. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
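For readers unfamiliar with the try-with-resources pattern mentioned in the review comment above, here is a minimal sketch of the general idea. It is not the JoinQParserPlugin code: `TrackedResource` and `TryWithResourcesSketch` are hypothetical stand-ins for closeable objects such as the request and searcher holder in the diff, and the event log exists only to make the close order visible.

```java
import java.util.ArrayList;
import java.util.List;

class TryWithResourcesSketch {
  /** Hypothetical closeable resource that records when it is opened and closed. */
  static final class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> log;

    TrackedResource(String name, List<String> log) {
      this.name = name;
      this.log = log;
      log.add("open " + name);
    }

    @Override
    public void close() {
      log.add("close " + name);
    }
  }

  /** Opens two resources, optionally fails mid-way, and returns the event log. */
  static List<String> run(boolean fail) {
    List<String> log = new ArrayList<>();
    // Resources are closed automatically, in reverse order of declaration,
    // whether the body completes normally or throws.
    try (TrackedResource request = new TrackedResource("request", log);
         TrackedResource searcher = new TrackedResource("searcher", log)) {
      if (fail) {
        throw new IllegalStateException("simulated failure");
      }
      log.add("work done");
    } catch (IllegalStateException e) {
      // By the time the catch block runs, both resources are already closed.
      log.add("caught " + e.getMessage());
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run(false));
    System.out.println(run(true));
  }
}
```

Compared with a manual `try`/`finally` and null checks, the cleanup order is fixed by the language, which is the attraction; the reply's point still stands that such a refactor can be deferred to a separate commit.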
[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation
gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation
URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371296171

## File path: solr/core/src/java/org/apache/solr/search/join/MultiValueTermOrdinalCollector.java ##

@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.search.join;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.index.SortedSetDocValues;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.util.LongBitSet;
+import org.apache.solr.search.DelegatingCollector;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Populates a bitset of (top-level) ordinals based on field values in a multi-valued field.
+ */
+public class MultiValueTermOrdinalCollector extends DelegatingCollector {

Review comment: Because it saves me from reimplementing `getLeafCollector()` and some other methods. If you're wondering why DelegatingCollector as opposed to SimpleCollector or other options, there's not a great answer - DelegatingCollector was needed in some earlier revision when things were postfilter based. I've changed it to use SimpleCollector; hopefully that addresses your concern.
[jira] [Commented] (SOLR-14193) Update tutorial.adoc(line no:664) so that command executes in windows enviroment
[ https://issues.apache.org/jira/browse/SOLR-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024396#comment-17024396 ] Cassandra Targett commented on SOLR-14193: -- OK, that makes sense. Looking forward to your new PR & thanks for your help! > Update tutorial.adoc(line no:664) so that command executes in windows > enviroment > > > Key: SOLR-14193 > URL: https://issues.apache.org/jira/browse/SOLR-14193 > Project: Solr > Issue Type: Bug > Components: documentation >Affects Versions: 8.4 >Reporter: balaji sundaram >Priority: Minor > > > When executing the following command in Windows 10, {{java -jar -Dc=films -Dparams=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=| -Dauto example\exampledocs\post.jar example\films\*.csv}}, it throws the error "& was unexpected at this time." > Fix: the command should escape the "&" and "|" symbols.
[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
[ https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024395#comment-17024395 ] Robert Muir commented on LUCENE-9179: - I know [~mikemccand] hit this trying to use gradle in luceneutils (with a clean checkout like a CI tool might do). As a workaround I suggested he run {{./gradlew help}} first so that it generates the properties file, then run again. > gradle setupLocalDefaultsOnce can screw up on the first run > --- > > Key: LUCENE-9179 > URL: https://issues.apache.org/jira/browse/LUCENE-9179 > Project: Lucene - Core > Issue Type: Task >Reporter: Robert Muir >Priority: Major > > To reproduce: > {noformat} > rm gradle.properties > ./gradlew -p lucene test > {noformat} > It will fail with a strange error: > {noformat} > > Included build in /home/rmuir/workspace/lucene-solr/lucene has name > > 'lucene' which is the same as a project of the main build. > {noformat} > It makes me wonder if we should try to do this recursive build stuff at all > on the first time, or do it a different way (e.g. alternatives are to fail > build, or maybe simply invoke ./gradlew ourselves so that it also picks up > parallelism changes)?
[jira] [Created] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run
Robert Muir created LUCENE-9179: --- Summary: gradle setupLocalDefaultsOnce can screw up on the first run Key: LUCENE-9179 URL: https://issues.apache.org/jira/browse/LUCENE-9179 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir To reproduce: {noformat} rm gradle.properties ./gradlew -p lucene test {noformat} It will fail with a strange error: {noformat} > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 'lucene' > which is the same as a project of the main build. {noformat} It makes me wonder if we should try to do this recursive build stuff at all on the first time, or do it a different way (e.g. alternatives are to fail build, or maybe simply invoke ./gradlew ourselves so that it also picks up parallelism changes)? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14220) Unable to build 7_7 or 8_4 due to missing dependency
[ https://issues.apache.org/jira/browse/SOLR-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett resolved SOLR-14220. -- Resolution: Duplicate This appears to be a duplicate of LUCENE-9170; closing this in favor of that one. > Unable to build 7_7 or 8_4 due to missing dependency > > > Key: SOLR-14220 > URL: https://issues.apache.org/jira/browse/SOLR-14220 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Build >Affects Versions: 7.7, 8.4 >Reporter: Karl Stoney >Priority: Major > Labels: build, build-failure > > Attempting to build from: > 7_7: > https://github.com/apache/lucene-solr/commit/7a309c21ebbc1b08d9edf67802b63fc0bc7affcf > or > 8_4: > https://github.com/apache/lucene-solr/commit/7d3ac7c284b26ce62f41d3b8686f70c7d6bd758d > Results in the same build failure: > {code:java} > BUILD FAILED > /usr/local/autotrader/app/lucene-solr/solr/build.xml:685: The following error > occurred while executing this line: > /usr/local/autotrader/app/lucene-solr/solr/build.xml:656: The following error > occurred while executing this line: > /usr/local/autotrader/app/lucene-solr/lucene/common-build.xml:653: Error > downloading wagon provider from the remote repository: Missing: > -- > 1) org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-7 > Try downloading the file manually from the project website. > Then, install it using the command: > mvn install:install-file -DgroupId=org.apache.maven.wagon > -DartifactId=wagon-ssh -Dversion=1.0-beta-7 -Dpackaging=jar > -Dfile=/path/to/file > Alternatively, if you host your own repository you can deploy the file > there: > mvn deploy:deploy-file -DgroupId=org.apache.maven.wagon > -DartifactId=wagon-ssh -Dversion=1.0-beta-7 -Dpackaging=jar > -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id] > Path to dependency: > 1) unspecified:unspecified:jar:0.0 > 2) org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-7 > -- > 1 required artifact is missing. 
> for artifact: > unspecified:unspecified:jar:0.0 > from the specified remote repositories: > central (http://repo1.maven.org/maven2) > {code} > Previously building 7_7 from 3aad3311a97256a8537dd04165c67edcce1c153c, and > 8_4 from c0b96fd305946b2564b967272e6e23c59ab0b5da worked fine. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-10665) POC for a PF4J based plugin system
[ https://issues.apache.org/jira/browse/SOLR-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-10665: Description: In SOLR-5103 we have been discussing improvements to Solr plugin system, with ability to bundle a plugin as zip, and easily install from shell or Admin UI. This task aims to create a working POC to demonstrate how PF4J (Plugin Framework4J) can be used to bring a very simple plugin packaging and installation system to Solr with a minimum of effort. Code speaks louder than words :) The POC effort is a quite large patch and will be cutting some corners to get the feature in the hands of people who can test and evaluate. If there is consensus to add this to Solr, there will be other sub tasks to split up the elephant into committable chunks. The design document is located here: [https://s.apache.org/solr-plugin] (Google Doc) - comments are welcome in the document or here. was: In SOLR-5103 we have been discussing improvements to Solr plugin system, with ability to bundle a plugin as zip, and easily install from shell or Admin UI. This task aims to create a working POC to demonstrate how PF4J (Plugin Framework4J) can be used to bring a very simple plugin packaging and installation system to Solr with a minimum of effort. Code speaks louder than words :) The POC effort is a quite large patch and will be cutting some corners to get the feature in the hands of people who can test and evaluate. If there is consensus to add this to Solr, there will be other sub tasks to split up the elephant into committable chunks. The design document is located here: https://s.apache.org/solr-plugin (Google Doc) - comments are welcome in the document or here. 
> POC for a PF4J based plugin system > -- > > Key: SOLR-10665 > URL: https://issues.apache.org/jira/browse/SOLR-10665 > Project: Solr > Issue Type: New Feature > Components: Plugin system >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Labels: pf4j, plugins > Attachments: SOLR-10665.patch > > Time Spent: 50m > Remaining Estimate: 0h > > In SOLR-5103 we have been discussing improvements to Solr plugin system, with > ability to bundle a plugin as zip, and easily install from shell or Admin UI. > This task aims to create a working POC to demonstrate how PF4J (Plugin > Framework4J) can be used to bring a very simple plugin packaging and > installation system to Solr with a minimum of effort. Code speaks louder than > words :) > The POC effort is a quite large patch and will be cutting some corners to get > the feature in the hands of people who can test and evaluate. If there is > consensus to add this to Solr, there will be other sub tasks to split up the > elephant into committable chunks. > The design document is located here: [https://s.apache.org/solr-plugin] > (Google Doc) - comments are welcome in the document or here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024377#comment-17024377 ] Robert Muir commented on LUCENE-9166: - LOL thanks, I am not digging too far yet: trying to balance time also fixing slow tests. It seems the gradle load balancing is quite a bit dumber than the junit4-work-stealing we had before. It makes up for it somewhat by parallelizing across modules but we have some big fat ones that can bottleneck builds. There is probably an easy win here...
[jira] [Created] (LUCENE-9178) Use run-length encoding when writing docIds in BKD tree
Ignacio Vera created LUCENE-9178: Summary: Use run-length encoding when writing docIds in BKD tree Key: LUCENE-9178 URL: https://issues.apache.org/jira/browse/LUCENE-9178 Project: Lucene - Core Issue Type: Improvement Reporter: Ignacio Vera I think we can easily check whether it makes sense to write docIds using run-length compression in the BKD tree. This can probably save some space in the case of multi-valued documents, e.g. LatLonShape and XYShape.
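To make the proposal concrete, here is a minimal sketch of run-length encoding applied to a block of docIds, together with the kind of cheap "does it make sense" check the issue mentions. This is not the BKD writer's on-disk format: the class `DocIdRunLengthSketch`, the flat `[docId, runLength, ...]` layout, and the size comparison in number of ints (rather than bytes) are all assumptions made for illustration.

```java
import java.util.Arrays;

class DocIdRunLengthSketch {
  /** Encodes docIds as a flat array [id0, runLength0, id1, runLength1, ...]. */
  static int[] encode(int[] docIds) {
    int[] out = new int[docIds.length * 2];
    int n = 0;
    int i = 0;
    while (i < docIds.length) {
      int j = i + 1;
      // Extend the run while the same docId repeats, as it does when one
      // multi-valued document contributes several adjacent values.
      while (j < docIds.length && docIds[j] == docIds[i]) {
        j++;
      }
      out[n++] = docIds[i];
      out[n++] = j - i;
      i = j;
    }
    return Arrays.copyOf(out, n);
  }

  /** Expands [id, runLength] pairs back into the original docId sequence. */
  static int[] decode(int[] encoded) {
    int total = 0;
    for (int k = 1; k < encoded.length; k += 2) {
      total += encoded[k];
    }
    int[] out = new int[total];
    int pos = 0;
    for (int k = 0; k < encoded.length; k += 2) {
      Arrays.fill(out, pos, pos + encoded[k + 1], encoded[k]);
      pos += encoded[k + 1];
    }
    return out;
  }

  /** Simplified check: use run-length encoding only if it stores fewer ints. */
  static boolean worthEncoding(int[] docIds) {
    return encode(docIds).length < docIds.length;
  }

  public static void main(String[] args) {
    int[] docIds = {7, 7, 7, 7, 9, 9, 12};
    System.out.println(Arrays.toString(encode(docIds)));
    System.out.println(worthEncoding(docIds));
  }
}
```

With mostly unique docIds the encoded form is larger than the input, which is why the decision would have to be made per block rather than globally.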
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024363#comment-17024363 ] Dawid Weiss commented on LUCENE-9166: - No worries, I didn't get that impression. :) As for debugging gradle - welcome to the club. The more I am involved in those complex gradle builds the more schizophrenic I become about them. One moment you're in awe and they're the greatest thing, the next you're debugging or digging through source to figure out what's wrong.
[jira] [Commented] (LUCENE-9173) SynonymGraphFilter doesn't correctly consume decompounded tokens (branched token graph)
[ https://issues.apache.org/jira/browse/LUCENE-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024358#comment-17024358 ] Michael McCandless commented on LUCENE-9173: Yeah this is a known tricky issue for both {{SynonymFilter}} and {{SynonymGraphFilter}} (though, maybe the former does not throw an exception if you feed it a graph?). Also, note that the exception above is while building the {{SynonymMap}} and not while actually tokenizing. You could successfully build a {{SynonymMap}} but then if you feed a graph to {{SynonymGraphFilter}} I think it detects that and throws an exception then. It is possible to fix this – it's just software! – it's just rather tricky for {{SynonymGraphFilter}} to find matches in an incoming graph, and in general could become quite costly in adversarial cases of e.g. high numbers of tokens at the same position in the input graph. > SynonymGraphFilter doesn't correctly consume decompounded tokens (branched > token graph) > > > Key: LUCENE-9173 > URL: https://issues.apache.org/jira/browse/LUCENE-9173 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Reporter: Tomoko Uchida >Priority: Minor > > This is a derived issue from LUCENE-9123. > When the tokenizer that is given to SynonymGraphFilter decompound tokens or > emit multiple tokens at the same position, SynonymGraphFilter cannot > correctly handle them (an exception will be thrown). > For example, JapaneseTokenizer (mode=SEARCH) would emit a token and two > decompounded tokens for the text "株式会社": > {code:java} > 株式会社 (positionIncrement=0, positionLength=2) > 株式 (positionIncrement=1, positionLength=1) > 会社 (positionIncrement=1, positionLength=1) > {code} > Then if we give a synonym "株式会社,コーポレーション" by SynonymGraphFilterFactory (set > tokenizerFactory=JapaneseTokenizerFactory) this exception is thrown. 
> {code:java} > Caused by: java.lang.IllegalArgumentException: term: 株式会社 analyzed to a token > (株式会社) with position increment != 1 (got: 0) > at > org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:325) > ~[lucene-analyzers-common-8.4.0.jar:8.4.0 > bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38] > at > org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:114) > ~[lucene-analyzers-common-8.4.0.jar:8.4.0 > bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38] > at > org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70) > ~[lucene-analyzers-common-8.4.0.jar:8.4.0 > bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38] > at > org.apache.lucene.analysis.synonym.SynonymGraphFilterFactory.loadSynonyms(SynonymGraphFilterFactory.java:179) > ~[lucene-analyzers-common-8.4.0.jar:8.4.0 > bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38] > at > org.apache.lucene.analysis.synonym.SynonymGraphFilterFactory.inform(SynonymGraphFilterFactory.java:154) > ~[lucene-analyzers-common-8.4.0.jar:8.4.0 > bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38] > {code} > This isn't only limited to JapaneseTokenizer but a more general issue about > handling branched token graph (decompounded tokens in the midstream). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
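The exception message in the stack trace above suggests that the synonym parser expects each analyzed term to come out as a simple linear chain of tokens. A minimal sketch of such a check on plain data follows; it does not call Lucene, `Token` and `LinearTokenCheckSketch` are hypothetical names, and the token values are copied from the listing in the issue (the strings are written as Unicode escapes so the source stays ASCII-only).

```java
import java.util.Arrays;
import java.util.List;

class LinearTokenCheckSketch {
  /** Hypothetical plain-data token carrying the two attributes shown in the issue. */
  static final class Token {
    final String term;
    final int positionIncrement;
    final int positionLength;

    Token(String term, int positionIncrement, int positionLength) {
      this.term = term;
      this.positionIncrement = positionIncrement;
      this.positionLength = positionLength;
    }
  }

  /** Rejects any token stream that is not a linear chain (every increment must be 1). */
  static void requireLinear(String text, List<Token> tokens) {
    for (Token t : tokens) {
      if (t.positionIncrement != 1) {
        throw new IllegalArgumentException(
            "term: " + text + " analyzed to a token (" + t.term
                + ") with position increment != 1 (got: " + t.positionIncrement + ")");
      }
    }
  }

  public static void main(String[] args) {
    String compound = "\u682a\u5f0f\u4f1a\u793e"; // the four-character compound from the issue
    List<Token> branched = Arrays.asList(
        new Token(compound, 0, 2),             // compound token spanning two positions
        new Token("\u682a\u5f0f", 1, 1),       // first decompounded part
        new Token("\u4f1a\u793e", 1, 1));      // second decompounded part
    try {
      requireLinear(compound, branched);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Handling the branched case would mean matching synonyms along every path through the token graph instead of rejecting it, which is the costly part described in the comment above.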
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024351#comment-17024351 ] Robert Muir commented on LUCENE-9166: - I didn't mean to give the impression this bug was your fault. It took some digging for me to figure out WTF was happening because I didn't see anything configured to filter out traces at all. I had to add system.out.printlns to figure out the Set actually had a filter in it by default put there by gradle, and what it was doing, etc...
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024349#comment-17024349 ] Dawid Weiss commented on LUCENE-9166: - Let's leave the stack trace in full. I don't think it harms anyone: the build is much less chatty anyway and when a failure happens the stack trace is fairly important. We can always trim it later. As for upgrading gradle... Seems like anything I touch recently turns out to have bugs in it so I'm careful with upgrades if something is working. ;) But feel free to try it out - altering gradle/wrapper/gradle-wrapper.properties should do the trick.
[jira] [Commented] (LUCENE-8143) Remove SpanBoostQuery
[ https://issues.apache.org/jira/browse/LUCENE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024346#comment-17024346 ] David Smiley commented on LUCENE-8143: -- The fact that SpanBoostQuery only works at the top appears not to be a fault of its own; this is a very simple / straightforward Query. Instead the fault / limitation seems to be in SpanWeight or somewhere around there. Killing SpanBoostQuery is a red herring then; no? > Remove SpanBoostQuery > - > > Key: LUCENE-8143 > URL: https://issues.apache.org/jira/browse/LUCENE-8143 > Project: Lucene - Core > Issue Type: Task >Reporter: Adrien Grand >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I initially added it so that span queries could still be boosted, but this > was actually a mistake: boosts are ignored on inner span queries, only the > boost of the top-level span query, the one that performs scoring, is not > ignored.
[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces
[ https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024344#comment-17024344 ] Robert Muir commented on LUCENE-9166: [~dweiss] I think this is actually working around https://github.com/gradle/gradle/issues/11220 which looks like it was fixed in 6.1.1. Still, I am skeptical of the filtering :) But alternatively we could revert this commit and upgrade, and it would solve at least the particular problem that I had here. Looks like the fix simply checks for the case where the filter would remove the whole stacktrace completely... > gradle build: test failures need stacktraces > > Key: LUCENE-9166 > URL: https://issues.apache.org/jira/browse/LUCENE-9166 > Project: Lucene - Core > Issue Type: Bug > Reporter: Robert Muir > Priority: Major > Fix For: master (9.0) > > Attachments: LUCENE-9166.patch > > > Test failures are missing the stacktrace. Worse yet, it tells you to go look > at a separate (very long) filename which also has no stacktrace :( > I know gradle tries really hard to be quiet and not say anything, but when a > test fails, that isn't the time or place :)
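The behaviour described in the comment above (a stack trace filter that backs off when it would remove every frame) can be sketched roughly as follows. This is not Gradle's code: the class name, the hidden package prefixes, and the method names are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not taken from Gradle or from the Lucene build.
final class StackTraceFilterSketch {

    // Hypothetical package prefixes whose frames a quiet build might want to hide.
    static final List<String> HIDDEN_PREFIXES =
        Arrays.asList("java.lang.reflect.", "org.gradle.");

    // Returns true if the frame belongs to one of the hidden packages.
    static boolean isHidden(StackTraceElement frame) {
        String className = frame.getClassName();
        for (String prefix : HIDDEN_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Drops hidden frames, but falls back to the full trace when the
    // filter would otherwise leave nothing to print.
    static StackTraceElement[] filter(StackTraceElement[] trace) {
        List<StackTraceElement> kept = new ArrayList<>();
        for (StackTraceElement frame : trace) {
            if (!isHidden(frame)) {
                kept.add(frame);
            }
        }
        return kept.isEmpty() ? trace : kept.toArray(new StackTraceElement[0]);
    }

    public static void main(String[] args) {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        for (StackTraceElement frame : filter(trace)) {
            System.out.println("    at " + frame);
        }
    }
}
```

The point of the fallback is that an empty trace is worse than a noisy one: a failure with no frames at all gives the reader nothing to act on.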
[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024336#comment-17024336 ] Jim Ferenczi commented on LUCENE-9177: They use the `kuromoji` tokenizer, so I think there's some value in applying NFKC as a char filter? > ICUNormalizer2CharFilter worst case is very slow > > Key: LUCENE-9177 > URL: https://issues.apache.org/jira/browse/LUCENE-9177 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Jim Ferenczi > Priority: Minor > Attachments: lucene.patch > > > ICUNormalizer2CharFilter is fast most of the time, but we've had some reports > in Elasticsearch that some unrealistic data can slow down the process very > significantly. For instance, an input that consists of characters to normalize > with no normalization-inert character in between can take up to several > seconds to process a few hundred kilobytes on my machine. While the input > is not realistic, this worst case can slow down indexing considerably when > dealing with uncleaned data. > I attached a small test that reproduces the slow processing using a stream > that contains many repetitions of the character `℃` and no > normalization-inert character. I am not surprised that the processing is > slower than usual, but several seconds to process seems a lot. Adding > normalization-inert characters makes the processing a lot faster, so I > wonder if we can improve the process to split the input more eagerly?
[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow
[ https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024333#comment-17024333 ] Robert Muir commented on LUCENE-9177: If they are just doing NFKC, then normalization won't impact most tokenizers (standard, icu), so just use the tokenfilter instead? It doesn't have these issues. The charfilter should only be needed to try to "clean up" for tokenizers that don't understand Unicode, so that they will then tokenize properly. > ICUNormalizer2CharFilter worst case is very slow > > Key: LUCENE-9177 > URL: https://issues.apache.org/jira/browse/LUCENE-9177 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Jim Ferenczi > Priority: Minor > Attachments: lucene.patch > > > ICUNormalizer2CharFilter is fast most of the time, but we've had some reports > in Elasticsearch that some unrealistic data can slow down the process very > significantly. For instance, an input that consists of characters to normalize > with no normalization-inert character in between can take up to several > seconds to process a few hundred kilobytes on my machine. While the input > is not realistic, this worst case can slow down indexing considerably when > dealing with uncleaned data. > I attached a small test that reproduces the slow processing using a stream > that contains many repetitions of the character `℃` and no > normalization-inert character. I am not surprised that the processing is > slower than usual, but several seconds to process seems a lot. Adding > normalization-inert characters makes the processing a lot faster, so I > wonder if we can improve the process to split the input more eagerly?
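For readers unfamiliar with what NFKC does to the character used in the reproduction, here is a minimal sketch. It assumes the JDK's java.text.Normalizer as a stand-in; it does not use ICU's Normalizer2 or Lucene's ICUNormalizer2CharFilter, and it does not reproduce the slow path discussed in this issue. The class and helper names are hypothetical.

```java
import java.text.Normalizer;

// Illustrative sketch only: shows the shape of the input from the issue
// description and its NFKC form, using the JDK normalizer rather than ICU.
final class NfkcSketch {

    // Builds a string made of n copies of s, with nothing in between.
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder(s.length() * n);
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Applies Unicode compatibility composition (NFKC).
    static String nfkc(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+2103 DEGREE CELSIUS has a compatibility decomposition to
        // U+00B0 DEGREE SIGN followed by the letter C.
        String input = repeat("\u2103", 5);
        String normalized = nfkc(input);
        System.out.println(input.length() + " chars in, " + normalized.length() + " chars out");
        System.out.println(normalized);
    }
}
```

Because each `℃` expands to two characters under NFKC, a char filter has to track offset corrections across the whole run, whereas a token filter only normalizes text after the tokenizer has already split it.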