[jira] [Commented] (SOLR-14282) /get handler doesn't return copied fields
[ https://issues.apache.org/jira/browse/SOLR-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212845#comment-17212845 ]

Andrei Minin commented on SOLR-14282:
--------------------------------------

Sure, I am interested in working on a PR - I need some time to set up a Solr environment and a test app.

> /get handler doesn't return copied fields
> ------------------------------------------
>
>                 Key: SOLR-14282
>                 URL: https://issues.apache.org/jira/browse/SOLR-14282
>             Project: Solr
>          Issue Type: Bug
>          Components: search, SolrJ
>    Affects Versions: 8.4
>         Environment: Solr 8.4.0, SolrJ, Oracle Java 8
>            Reporter: Andrei Minin
>            Priority: Major
>         Attachments: copied_fields_test.zip, managed-schema.xml
>
> We are using the /get handler to retrieve documents by id in our Java application (SolrJ).
> I found that copied fields are missing in documents returned by the /get handler, but the same documents returned by a query contain the fields copied by the schema.
> Attached documents:
> # Integration test project archive
> # Managed schema file for Solr
> Solr schema details:
> # Unique field name "d_ida_s"
> # Lowercase text type definition:
> {code:java}
> <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> 3. Copy field instruction sample:
> {code:java}
> <field name="ConcurrenceUserNameu_lca_s" type="lowercase" indexed="true" stored="true" multiValued="false"/>
> <copyField source="ConcurrenceUserNamea_s" dest="ConcurrenceUserNameu_lca_s"/>
> {code}
> ConcurrenceUserNamea_s is a string-type field and ConcurrenceUserNameu_lca_s is a lowercase text-type field.
> The integration test uploads a document to the Solr server and makes two requests: one using the /get REST endpoint to fetch the document by id, and one using a query <field name>:<value>.
> The document returned by /get doesn't have the copied fields, while the document returned by the query contains them.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
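The gap the report describes can be pictured with a small, self-contained sketch (illustrative only; this is not Solr's actual code, and only the field names are taken from the schema sample above). CopyField targets are materialized when a document is indexed, so a view built from the raw input document, as realtime /get effectively serves, will not contain them:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: simulate a single copyField rule that copies the
// string field into a lowercased target, as the schema above does.
public class CopyFieldSketch {
    static Map<String, String> applyCopyFields(Map<String, String> inputDoc) {
        Map<String, String> indexedView = new HashMap<>(inputDoc);
        String src = inputDoc.get("ConcurrenceUserNamea_s");
        if (src != null) {
            // copyField happens at index time, not at ingest time
            indexedView.put("ConcurrenceUserNameu_lca_s", src.toLowerCase(Locale.ROOT));
        }
        return indexedView;
    }

    public static void main(String[] args) {
        Map<String, String> input = new HashMap<>();
        input.put("d_ida_s", "42");
        input.put("ConcurrenceUserNamea_s", "JohnDoe");

        // the raw input document has no copied field...
        System.out.println(input.containsKey("ConcurrenceUserNameu_lca_s")); // false
        // ...while the indexed view does
        System.out.println(applyCopyFields(input).get("ConcurrenceUserNameu_lca_s")); // johndoe
    }
}
```

Under this (hedged) reading, /get returning the pre-index document and a query returning the indexed view would produce exactly the mismatch the integration test observes.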
[jira] [Commented] (LUCENE-9524) NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212830#comment-17212830 ]

Zach Chen commented on LUCENE-9524:
------------------------------------

Thanks Adrien! I've opened a PR to perform a null check in SpanWeight#explain and provide an alternative explanation without a score when it is null. Please let me know if it looks good to you.

> NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser
> ------------------------------------------------------------------------------------
>
>                 Key: LUCENE-9524
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9524
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser, core/search
>    Affects Versions: 8.6, 8.6.2
>            Reporter: Michał Słomkowski
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I get an NPE when I use {{IndexSearcher.explain()}}. Checked with Lucene 8.6.0 and 8.6.2.
> The query: {{(lorem AND NOT "dolor lorem") OR ipsum}}
> The text: {{dolor lorem ipsum}}
> Stack trace:
> {code}
> java.lang.NullPointerException
>     at java.util.Objects.requireNonNull(Objects.java:203)
>     at org.apache.lucene.search.LeafSimScorer.<init>(LeafSimScorer.java:38)
>     at org.apache.lucene.search.spans.SpanWeight.explain(SpanWeight.java:160)
>     at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>     at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>     at org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:716)
>     at org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:693)
> {code}
> Minimal example code:
> {code:java}
> val analyzer = new StandardAnalyzer();
> val query = new ComplexPhraseQueryParser("", analyzer).parse(queryString);
> final MemoryIndex memoryIndex = new MemoryIndex(true);
> memoryIndex.addField("", text, analyzer);
> final IndexSearcher searcher = memoryIndex.createSearcher();
> final TopDocs topDocs = searcher.search(query, 1);
> final ScoreDoc match = topDocs.scoreDocs[0];
> searcher.explain(query, match.doc);
> {code}
[GitHub] [lucene-solr] zacharymorn opened a new pull request #1978: LUCENE-9524: Fix NPE in SpanWeight#explain when no scoring is require…
zacharymorn opened a new pull request #1978:
URL: https://github.com/apache/lucene-solr/pull/1978

   # Description
   `SpanWeight#explain` uses `Similarity.SimScorer` to generate explanations, and may fail with a NullPointerException when scoring is not needed and `Similarity.SimScorer` is therefore set to null, such as when matching a must-not-occur span query clause.

   # Solution
   Check for a null `Similarity.SimScorer` in `SpanWeight#explain`, and provide an alternative explanation.

   # Tests
   Added one integration test to `TestMemoryIndex` that was broken before the fix.

   # Checklist
   Please review the following and check all that apply:
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
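The shape of the fix can be shown with a tiny self-contained example (hypothetical names only; the real change lives in `SpanWeight#explain` and uses Lucene's `Similarity.SimScorer`): when no scoring is requested the scorer is null, so `explain` must not dereference it and should return a score-less explanation instead.

```java
// Hedged sketch of a null-safe explain, not the actual Lucene code.
public class NullSafeExplain {
    interface SimScorer {
        float score(int doc, float freq);
    }

    static String explain(SimScorer scorer, int doc, float freq) {
        if (scorer == null) {
            // scoring was not requested, e.g. for a MUST_NOT span clause
            return "match without score";
        }
        return "score=" + scorer.score(doc, freq);
    }

    public static void main(String[] args) {
        System.out.println(explain(null, 0, 1f));             // match without score
        System.out.println(explain((d, f) -> 2f * f, 0, 1f)); // score=2.0
    }
}
```

The key point is that the null branch still reports a match; only the score component of the explanation is omitted.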
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1963: SOLR-14827: Refactor schema loading to not use XPath
noblepaul commented on a change in pull request #1963:
URL: https://github.com/apache/lucene-solr/pull/1963#discussion_r503655774

   ##########
   File path: solr/core/src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java
   ##########
   @@ -135,15 +134,13 @@ protected T create(SolrClassLoader loader, String name, String className, Node n
       * If a default element is defined, it will be returned from this function.
       *
       */
   -  public T load(SolrClassLoader loader, NodeList nodes )

   Review comment:
       it could be
[GitHub] [lucene-solr] msokolov commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
msokolov commented on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707431740

   > One option could be 200-dimensional GloVe word vectors, available from http://ann-benchmarks.com/glove-200-angular.hdf5. I think these are trained on Twitter data.

   +1 I'm looking into adding GloVe data to luceneutil benchmarks, initially just to index and retrieve them; then I hope to add tasks for scoring lexical matches, and then for knn matching. I think some of the GloVe datasets are trained on Wikipedia (plus other text), so they should be suitable for use in our benchmarks, which are based on Wikipedia text.

   I think for initial performance comparisons we can use our own tool; it wouldn't be as nicely controlled as running in the same framework, but if we are careful the results should be comparable. And it's good to know there is a reasonable path for integrating with ann-benchmarks using py4j; I hadn't realized there was a --batch option.
[GitHub] [lucene-solr] jtibshirani edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
jtibshirani edited a comment on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707404485

   > It seems to expect your algorithm to be delivered as an in-process extension to Python, which works OK for a native code library, but I'm not sure how we'd present Lucene to it. We don't want to have to call through a network API?

   I ended up using `py4j` to call out to Lucene, which sets up a 'gateway server' and passes data between the Python + Java processes through a socket. I found there to be a significant overhead from converting between Python <-> Java, but this can be largely mitigated by making sure to use 'batch mode' (the `--batch` option), which allows all query vectors to be passed to Lucene at once. Amortizing the overhead this way, I was able to get consistent + informative results.

   Let me know if you're interested in trying the py4j option and I can post set-up steps. I found it helpful while developing but it's quite tricky and maybe shouldn't be the main way to track performance right now (as you mentioned)!

   A note that it's possible to use vector data from ann-benchmarks without integrating with the framework. The datasets are listed [here](https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/datasets.py#L396) and made available on the website in hdf5 format. One option could be 200-dimensional GloVe word vectors, available from `http://ann-benchmarks.com/glove-200-angular.hdf5`. I think these are trained on Twitter data.
[GitHub] [lucene-solr] jtibshirani commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
jtibshirani commented on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707404485

   > It seems to expect your algorithm to be delivered as an in-process extension to Python, which works OK for a native code library, but I'm not sure how we'd present Lucene to it. We don't want to have to call through a network API?

   I ended up using `py4j` to call out to Lucene, which sets up a 'gateway server' and passes data between the Python + Java processes through a socket. I did find there to be a significant overhead just from converting between Python <-> Java, but it can be largely mitigated by making sure to use 'batch mode' (the `--batch` option), which allows all query vectors to be passed to Lucene at once. Amortizing the overhead this way, I was able to get consistent + informative results.

   Let me know if you're interested in trying the py4j option and I can post set-up steps. I found it helpful while developing but it's quite tricky and maybe shouldn't be the main way to track performance right now (as you mentioned)!

   A note that it's possible to use vector data from ann-benchmarks without integrating with the framework. The datasets are listed [here](https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/datasets.py#L396) and made available on the website in hdf5 format. One option could be 200-dimensional GloVe word vectors, available from `http://ann-benchmarks.com/glove-200-angular.hdf5`. I think these are trained on Twitter data.
[jira] [Commented] (LUCENE-9568) FuzzyTermEnums sets negative boost for fuzzy search & highlight
[ https://issues.apache.org/jira/browse/LUCENE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212738#comment-17212738 ]

Robert Muir commented on LUCENE-9568:
--------------------------------------

It will break most of the top-N optimizations of the query. If you use the max term length to compute the boost, then the PQ can never optimize. That would be because there could always exist some unseen term with a length of, say, 1MB (just for illustration) which would rank extremely high. By using the min, it is bounded by the query term, and once the PQ fills with ed=2, the query can restrict itself further and only look for ed=1, ed=0, etc.

Practically, it is this way because that's how fuzzy was defined, with the min, and with the tie-breaker after this boost being term sort order (you can check the code of an ancient version like 2.9 to see: https://github.com/apache/lucene-solr/blob/releases/lucene/2.9.2/src/java/org/apache/lucene/search/FuzzyTermEnum.java#L189 ). We just exploited it for all it's worth. It is especially important for small PQ (top-N) sizes such as spell checking.

Hopefully I have explained it ok, this thing is hairy :) Happy to try again if needed. I think first we should make a test, ideally one that doesn't use highlighting? I think there should be an alternative, simpler fix that won't break the top-N optimization.

> FuzzyTermEnums sets negative boost for fuzzy search & highlight
> ----------------------------------------------------------------
>
>                 Key: LUCENE-9568
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9568
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>    Affects Versions: 8.5.1
>            Reporter: Juraj Jurčo
>            Priority: Minor
>              Labels: highlighting, newbie
>         Attachments: FindSqlHighlightTest.java
>
> *Description*
> When a user indexes a word with an apostrophe and constructs a fuzzy query for the highlighter, it throws an exception with a negative boost set on the query.
> *Repro Steps*
> # Index a text with an apostrophe, e.g. doesn't
> # Parse a fuzzy query, e.g.: se~, se~2, se~3
> # Try to highlight a text with an apostrophe
> # The exception is thrown (for details see the attached test with repro steps)
> *Actual Result*
> {{java.lang.IllegalArgumentException: boost must be a positive float, got -1.0}}
> *Expected Result*
> * No exception.
> * Highlighting marks are inserted into the text.
> *Workaround*
> - not known.
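The min-length formula described above can be written out as a small sketch (a simplified rendering of the classic fuzzy similarity from the linked 2.9 code, not the current FuzzyTermsEnum implementation): boost = 1 - editDistance / min(queryTermLength, matchedTermLength). Using the min keeps the boost bounded by the query term, which is what lets the priority queue tighten the allowed edit distance as it fills; it is also exactly what can go negative when the edit distance exceeds the shorter length.

```java
// Hedged sketch of the classic fuzzy boost formula.
public class FuzzyBoost {
    static float boost(int editDistance, int queryLen, int termLen) {
        return 1.0f - ((float) editDistance / Math.min(queryLen, termLen));
    }

    public static void main(String[] args) {
        // "se"~1 matching a 3-char term: ed=1, min(2,3)=2 -> 0.5
        System.out.println(boost(1, 2, 3));
        // ed=2 against a 1-char term: 1 - 2/1 = -1.0, the negative
        // boost that the highlighter rejects
        System.out.println(boost(2, 2, 1));
    }
}
```

With the max instead of the min, a very long unseen term could always out-rank everything in the queue, so the enumeration could never prune; that is the top-N optimization at stake.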
[GitHub] [lucene-solr] jtibshirani edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
jtibshirani edited a comment on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-706813811

   > Hmm I tried to get that benchmarking suite to run and it requires some major Python-fu.

   I managed to get this working a few months ago while experimenting with a clustering-based approach: https://github.com/jtibshirani/ann-benchmarks/pull/2. It indeed involved a lot of set-up -- I can try to get it working again and post results. Going forward, I think it will be helpful to use ann-benchmarks to compare recall and QPS against the ANN reference implementations.
[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API
HoustonPutman commented on a change in pull request #1977:
URL: https://github.com/apache/lucene-solr/pull/1977#discussion_r503580189

   ##########
   File path: solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
   ##########
   @@ -170,10 +170,14 @@ private void handleConfigUploadRequest(SolrQueryRequest req, SolrQueryResponse r
        boolean overwritesExisting = zkClient.exists(configPathInZk, true);
   -    if (overwritesExisting && !req.getParams().getBool(ConfigSetParams.OVERWRITE, false)) {
   -      throw new SolrException(ErrorCode.BAD_REQUEST,
   -          "The configuration " + configSetName + " already exists in zookeeper");
   -    }
   +    // Get upload parameters
   +    String singleFilePath = req.getParams().get(ConfigSetParams.FILE_PATH, "");
   +    boolean allowOverwrite = req.getParams().getBool(ConfigSetParams.OVERWRITE, false);
   +    // Cleanup is not allowed while using singleFilePath upload
   +    boolean cleanup = singleFilePath.isEmpty() && req.getParams().getBool(ConfigSetParams.CLEANUP, false);

   Review comment:
       Added error handling for that and for cases where a bad filePath is given.
[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API
tflobbe commented on a change in pull request #1977:
URL: https://github.com/apache/lucene-solr/pull/1977#discussion_r503567179

   ##########
   File path: solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
   ##########
   @@ -170,10 +170,14 @@ private void handleConfigUploadRequest(SolrQueryRequest req, SolrQueryResponse r
        boolean overwritesExisting = zkClient.exists(configPathInZk, true);
   -    if (overwritesExisting && !req.getParams().getBool(ConfigSetParams.OVERWRITE, false)) {
   -      throw new SolrException(ErrorCode.BAD_REQUEST,
   -          "The configuration " + configSetName + " already exists in zookeeper");
   -    }
   +    // Get upload parameters
   +    String singleFilePath = req.getParams().get(ConfigSetParams.FILE_PATH, "");
   +    boolean allowOverwrite = req.getParams().getBool(ConfigSetParams.OVERWRITE, false);
   +    // Cleanup is not allowed while using singleFilePath upload
   +    boolean cleanup = singleFilePath.isEmpty() && req.getParams().getBool(ConfigSetParams.CLEANUP, false);

   Review comment:
       Should we error instead of silently ignoring the `cleanup` param? It defaults to `false`, so someone must have explicitly set it to `true`.
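The reviewer's suggestion can be sketched in a self-contained way (parameter names follow the diff above; the validation itself is hypothetical and not the code that was merged): reject an explicit `cleanup=true` when a single-file upload path is given, instead of silently ignoring it.

```java
import java.util.Map;

// Hedged sketch of "error instead of silently ignoring cleanup".
public class UploadParams {
    static boolean resolveCleanup(Map<String, String> params) {
        String singleFilePath = params.getOrDefault("filePath", "");
        boolean cleanup = Boolean.parseBoolean(params.getOrDefault("cleanup", "false"));
        if (!singleFilePath.isEmpty() && cleanup) {
            // the caller explicitly asked for something we cannot honor
            throw new IllegalArgumentException(
                "cleanup is not supported for single-file uploads");
        }
        return cleanup;
    }

    public static void main(String[] args) {
        System.out.println(resolveCleanup(Map.of("cleanup", "true"))); // true
        System.out.println(resolveCleanup(Map.of("filePath", "a.xml"))); // false
    }
}
```

Failing loudly here matches the review's reasoning: since the parameter defaults to false, a true value is always a deliberate user choice.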
[jira] [Commented] (LUCENE-9536) Optimize OrdinalMap when one segment contains all distinct values?
[ https://issues.apache.org/jira/browse/LUCENE-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212713#comment-17212713 ]

Julie Tibshirani commented on LUCENE-9536:
-------------------------------------------

I opened a pull request implementing the idea. It was indeed simple + fast to detect.

> Optimize OrdinalMap when one segment contains all distinct values?
> -------------------------------------------------------------------
>
>                 Key: LUCENE-9536
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9536
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Julie Tibshirani
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For doc values that are not too high-cardinality, it seems common to have some large segments that contain all distinct values (plus many small segments that are missing some values). In this case, we could check whether the first segment's ords map perfectly to global ords and, if so, store `globalOrdDeltas` and `firstSegments` as `LongValues.ZEROES`. This could save a small amount of space.
> I don't think it would help a huge amount, especially since the optimization might only kick in with small/medium cardinalities, which don't create huge `OrdinalMap` instances anyway? But it is simple and seemed worth mentioning.
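The condition the issue proposes can be illustrated with a hypothetical sketch (not the real `OrdinalMap` code, which works over term enumerations rather than lists): if the first segment's sorted terms already equal the merged global term set, then segment ordinals map to global ordinals 1:1, and both lookup tables can be an identity/zero mapping instead of packed arrays.

```java
import java.util.List;

// Hedged sketch: detect when segment #0 makes the ordinal map an identity.
public class OrdinalMapSketch {
    // true when the first segment contains every distinct value, in sorted
    // order, i.e. segment ord == global ord for all of its terms
    static boolean firstSegmentIsIdentity(List<String> firstSegmentTerms,
                                          List<String> globalTerms) {
        return firstSegmentTerms.equals(globalTerms);
    }

    public static void main(String[] args) {
        List<String> global = List.of("a", "b", "c");
        System.out.println(firstSegmentIsIdentity(List.of("a", "b", "c"), global)); // true
        System.out.println(firstSegmentIsIdentity(List.of("a", "c"), global));      // false
    }
}
```

When the check succeeds, `globalOrdDeltas` (all zeros) and `firstSegments` (all zeros, pointing at segment 0) carry no information, which is why a constant `LongValues.ZEROES` can stand in for both.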
[jira] [Commented] (LUCENE-9568) FuzzyTermEnums sets negative boost for fuzzy search & highlight
[ https://issues.apache.org/jira/browse/LUCENE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212703#comment-17212703 ]

Adrien Grand commented on LUCENE-9568:
---------------------------------------

This will change the top hits returned by fuzzy queries, so I suspect that some users will be a bit angry, but I can't think of a reason why minTermLength makes more sense than maxTermLength, so +1 to this suggestion to avoid the case when the edit distance is greater than the minimum term length.

> FuzzyTermEnums sets negative boost for fuzzy search & highlight
> ----------------------------------------------------------------
>
>                 Key: LUCENE-9568
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9568
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>    Affects Versions: 8.5.1
>            Reporter: Juraj Jurčo
>            Priority: Minor
>              Labels: highlighting, newbie
>         Attachments: FindSqlHighlightTest.java
>
> *Description*
> When a user indexes a word with an apostrophe and constructs a fuzzy query for the highlighter, it throws an exception with a negative boost set on the query.
> *Repro Steps*
> # Index a text with an apostrophe, e.g. doesn't
> # Parse a fuzzy query, e.g.: se~, se~2, se~3
> # Try to highlight a text with an apostrophe
> # The exception is thrown (for details see the attached test with repro steps)
> *Actual Result*
> {{java.lang.IllegalArgumentException: boost must be a positive float, got -1.0}}
> *Expected Result*
> * No exception.
> * Highlighting marks are inserted into the text.
> *Workaround*
> - not known.
[jira] [Commented] (SOLR-14282) /get handler doesn't return copied fields
[ https://issues.apache.org/jira/browse/SOLR-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212653#comment-17212653 ]

David Eric Pugh commented on SOLR-14282:
-----------------------------------------

I just ran into this issue as well. If you are interested in working on a PR, I'd love to work on it with you.

> /get handler doesn't return copied fields
> ------------------------------------------
>
>                 Key: SOLR-14282
>                 URL: https://issues.apache.org/jira/browse/SOLR-14282
>             Project: Solr
>          Issue Type: Bug
>          Components: search, SolrJ
>    Affects Versions: 8.4
>         Environment: Solr 8.4.0, SolrJ, Oracle Java 8
>            Reporter: Andrei Minin
>            Priority: Major
>         Attachments: copied_fields_test.zip, managed-schema.xml
>
> We are using the /get handler to retrieve documents by id in our Java application (SolrJ).
> I found that copied fields are missing in documents returned by the /get handler, but the same documents returned by a query contain the fields copied by the schema.
> Attached documents:
> # Integration test project archive
> # Managed schema file for Solr
> Solr schema details:
> # Unique field name "d_ida_s"
> # Lowercase text type definition:
> {code:java}
> <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> 3. Copy field instruction sample:
> {code:java}
> <field name="ConcurrenceUserNameu_lca_s" type="lowercase" indexed="true" stored="true" multiValued="false"/>
> <copyField source="ConcurrenceUserNamea_s" dest="ConcurrenceUserNameu_lca_s"/>
> {code}
> ConcurrenceUserNamea_s is a string-type field and ConcurrenceUserNameu_lca_s is a lowercase text-type field.
> The integration test uploads a document to the Solr server and makes two requests: one using the /get REST endpoint to fetch the document by id, and one using a query <field name>:<value>.
> The document returned by /get doesn't have the copied fields, while the document returned by the query contains them.
[jira] [Commented] (SOLR-9207) PeerSync recovery fails if number of updates requested is high
[ https://issues.apache.org/jira/browse/SOLR-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212621#comment-17212621 ]

Evgeny Ivanskiy commented on SOLR-9207:
----------------------------------------

Hi [~praste], [~shalin]. We are seeing an intermittent issue where some of our collections fail to elect a leader after a restart. The log shows that hosts are failing to become leader due to a sync failure, and that the sync failure is due to not receiving the expected number of updates.

Investigation shows that there are duplicate versions in the tlogs. So, in this case: if we get versions *1,1,2,2,3,3*, we then request the updates in range *1...3*. As a result we get *3 updates*, but *totalRequestedUpdates is 6* and the sync fails.

Is there an assumption that getVersions should return distinct values, or is this a bug in PeerSync.handleVersionsWithRanges, which doesn't take duplicate versions into account?

> PeerSync recovery fails if number of updates requested is high
> ---------------------------------------------------------------
>
>                 Key: SOLR-9207
>                 URL: https://issues.apache.org/jira/browse/SOLR-9207
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.1, 6.0
>            Reporter: Pushkar Raste
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 6.2, 7.0
>
>         Attachments: SOLR-9207.patch, SOLR-9207.patch, SOLR-9207.patch_updated
>
> {{PeerSync}} recovery fails if we request more than ~99K updates.
> This happens if we update solrconfig to retain more {{tlogs}} to leverage https://issues.apache.org/jira/browse/SOLR-6359
> During our testing we found out that recovery using {{PeerSync}} fails if we ask for more than ~99K updates, with the following error:
> {code}
> WARN PeerSync [RecoveryThread] - PeerSync: core=hold_shard1 url= exception talking to , failed
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Expected mime type application/octet-stream but got application/xml.
> application/x-www-form-urlencoded content length (4761994 bytes) exceeds upload limit of 2048 KB
> <int name="code">400</int>
> {code}
> We arrived at ~99K with the following math:
> * max_version_number = Long.MAX_VALUE = 9223372036854775807
> * bytes per version number = 20 (on the wire, as a POST request sends the version number as a string)
> * additional byte for the separator ,
> * max_versions_in_single_request = 2MB/21 = ~99864
> I could think of 2 ways to fix it:
> 1. Ask for updates in chunks of ~90K inside {{PeerSync.requestUpdates()}}
> 2. Use application/octet-stream encoding
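The mismatch described in the comment can be illustrated with a self-contained sketch (hypothetical names; not the actual PeerSync code): if getVersions reports duplicates, counting raw entries overstates how many distinct updates the version range can ever return.

```java
import java.util.List;

// Hedged sketch of the counting mismatch for versions 1,1,2,2,3,3.
public class PeerSyncCount {
    // what a naive totalRequestedUpdates-style count would see
    static int requestedUpdates(List<Long> reportedVersions) {
        return reportedVersions.size();
    }

    // what the peer can actually send back for the covering range
    static int distinctUpdatesInRange(List<Long> reportedVersions) {
        return (int) reportedVersions.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<Long> versions = List.of(1L, 1L, 2L, 2L, 3L, 3L);
        System.out.println(requestedUpdates(versions));       // 6
        System.out.println(distinctUpdatesInRange(versions)); // 3
    }
}
```

With 6 expected but only 3 distinct updates obtainable from the range 1...3, a check comparing the two would always conclude the sync failed, which matches the reported behavior.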
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212597#comment-17212597 ]

Thomas Wöckinger commented on SOLR-14923:
------------------------------------------

[~dsmiley], [~noble], [~ab] maybe one of you can help; you were involved in the last changes to the DistributedUpdateProcessor. Thx for your help.

> Indexing performance is unacceptable when child documents are involved
> -----------------------------------------------------------------------
>
>                 Key: SOLR-14923
>                 URL: https://issues.apache.org/jira/browse/SOLR-14923
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: update, UpdateRequestProcessors
>    Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>            Reporter: Thomas Wöckinger
>            Priority: Critical
>              Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the end of the method doVersionAdd whether the ulog caches should be refreshed.
> This check returns true if any child document is included in the AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive and is executed in a synchronized block of the UpdateLog instance, so all other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) uses a synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, so it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is true or false.
> The described behavior makes the usage of ChildDocuments useless, because the performance is unacceptable.
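The reported hot path can be sketched in miniature (hypothetical names mirroring the report; this is not the actual Solr code): any add that carries child documents triggers an expensive searcher reopen while still holding the UpdateLog monitor, so every other synchronized UpdateLog call waits behind it.

```java
// Hedged sketch of "expensive call inside the UpdateLog monitor".
public class UlogSketch {
    static class Doc {
        final boolean hasChildren;
        Doc(boolean hasChildren) { this.hasChildren = hasChildren; }
    }

    // returns true when the expensive reopen was triggered
    static synchronized boolean add(Doc doc) {
        // ...append to the tlog (elided)...
        if (doc.hasChildren) {
            openRealtimeSearcher(); // expensive, runs while holding the monitor
            return true;
        }
        return false;
    }

    static synchronized void openRealtimeSearcher() {
        // stand-in for the costly UpdateLog#openRealtimeSearcher call;
        // while it runs, add/delete on other threads block on the monitor
    }

    public static void main(String[] args) {
        System.out.println(add(new Doc(false))); // false: plain docs skip the reopen
        System.out.println(add(new Doc(true)));  // true: child-doc adds pay for it
    }
}
```

Because every entry point is `synchronized` on the same object, the duration of the reopen bounds the throughput of all concurrent indexing threads, which is the single-threaded behavior the issue describes.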
[jira] [Updated] (SOLR-14925) CVE-2020-13957: The checks added to unauthenticated configset uploads can be circumvented
[ https://issues.apache.org/jira/browse/SOLR-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomas Eduardo Fernandez Lobbe updated SOLR-14925:
--------------------------------------------------
    Security: Public  (was: Private (Security Issue))

> CVE-2020-13957: The checks added to unauthenticated configset uploads can be circumvented
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14925
>                 URL: https://issues.apache.org/jira/browse/SOLR-14925
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 6.6, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 7.0, 7.0.1, 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.6, 8.5.1, 8.5.2, 8.6.1, 8.6.2
>            Reporter: Tomas Eduardo Fernandez Lobbe
>            Assignee: Tomas Eduardo Fernandez Lobbe
>            Priority: Major
>             Fix For: master (9.0), 8.7, 8.6.3
>
> Severity: High
> Vendor: The Apache Software Foundation
> Versions Affected:
> 6.6.0 to 6.6.5
> 7.0.0 to 7.7.3
> 8.0.0 to 8.6.2
> Description:
> Solr prevents some features considered dangerous (which could be used for remote code execution) from being configured in a ConfigSet that's uploaded via API without authentication/authorization. The checks in place to prevent such features can be circumvented by using a combination of UPLOAD/CREATE actions.
> Mitigation:
> Any of the following is enough to prevent this vulnerability:
> * Disable the UPLOAD command in the ConfigSets API if it is not used, by setting the system property {{configset.upload.enabled}} to {{false}} [1]
> * Use Authentication/Authorization and make sure unknown requests aren't allowed [2]
> * Upgrade to Solr 8.6.3 or greater.
> * If upgrading is not an option, consider applying the patch in SOLR-14663 ([3])
> * No Solr API, including the Admin UI, is designed to be exposed to non-trusted parties. Tune your firewall so that only trusted computers and people are allowed access.
> Credit:
> Tomás Fernández Löbbe, András Salamon
> References:
> [1] https://lucene.apache.org/solr/guide/8_6/configsets-api.html
> [2] https://lucene.apache.org/solr/guide/8_6/authentication-and-authorization-plugins.html
> [3] https://issues.apache.org/jira/browse/SOLR-14663
> [4] https://issues.apache.org/jira/browse/SOLR-14925
> [5] https://wiki.apache.org/solr/SolrSecurity
[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212567#comment-17212567 ] Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 6:19 PM: Overall index time for a complete reindex is reduced from 2134 mins to 83 mins when simply commenting out ulog.openRealtimeSearcher and using 12 threads. So somehow we should find a solution for this, because the difference is nearly a factor of 26!! was (Author: thomas.woeckinger): Overall index time for a complete reindex is reduced from 2134 mins to 83 mins when simply commenting out ulog.openRealtimeSearcher and using 12 threads. So somehow we should find a solution for this. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so, ulog.openRealtimeSearcher() is called; this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block, almost every operation is blocked. > This reduces multi-threaded index updates to single-threaded behavior. 
> The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
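The locking pattern described in SOLR-14923 can be sketched in isolation. The following is an illustrative Java sketch with stand-in names, not Solr's actual UpdateLog code: once every operation and one expensive call share a single monitor, concurrent writers serialize behind it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the contention described above; class and method
// names are stand-ins, not Solr's actual UpdateLog implementation.
public class SerializedLog {
    private final List<String> entries = new ArrayList<>();

    // Every writer funnels through this single monitor, like UpdateLog#add.
    public synchronized void add(String doc) {
        entries.add(doc);
    }

    // An expensive call holding the same monitor (the role played by
    // ulog.openRealtimeSearcher() in the report above) blocks every
    // concurrent add/delete for its whole duration, collapsing parallel
    // indexing to effectively one thread.
    public synchronized void expensiveRefresh() {
        try {
            Thread.sleep(20); // stand-in for reopening a realtime searcher
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public synchronized int size() {
        return entries.size();
    }
}
```

Removing the refresh from the hot path (as the commenting-out experiment above did) shortens the critical section to the cheap list append, which is why twelve writer threads suddenly start to help.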
[jira] [Assigned] (SOLR-14907) Support single file upload/overwrite in configSet API
[ https://issues.apache.org/jira/browse/SOLR-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman reassigned SOLR-14907: - Assignee: Houston Putman > Support single file upload/overwrite in configSet API > - > > Key: SOLR-14907 > URL: https://issues.apache.org/jira/browse/SOLR-14907 > Project: Solr > Issue Type: Improvement > Components: configset-api >Reporter: Houston Putman >Assignee: Houston Putman >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > After SOLR-10391 was implemented, users are now able to overwrite existing > configSets using the configSet API. However the files uploaded are still > required to be zipped and indexed from the base configSet path in ZK. Users > might want to just update a single file, such as a synonyms list, and not > have to tar it up first. > The proposed solution is to add parameters to the UPLOAD configSet action, to > allow this single-file use case. This would utilize the protections already > provided by the API, such as maintaining the trustiness of configSets being > modified. > This feature is part of the solution to replace managed resources, which is > planned to be deprecated and removed by 9.0 (SOLR-14766). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212567#comment-17212567 ] Thomas Wöckinger commented on SOLR-14923: - Overall index time for a complete reindex is reduced from 2134 mins to 83 mins when simply commenting out ulog.openRealtimeSearcher and using 12 threads. So somehow we should find a solution for this. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so, ulog.openRealtimeSearcher() is called; this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block, almost every operation is blocked. > This reduces multi-threaded index updates to single-threaded behavior. > The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. 
> > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1976: LUCENE-9578: TermRangeQuery empty string lower bound edge case
jpountz commented on a change in pull request #1976: URL: https://github.com/apache/lucene-solr/pull/1976#discussion_r503439534 ## File path: lucene/core/src/java/org/apache/lucene/util/automaton/Automata.java ## @@ -254,7 +254,7 @@ public static Automaton makeBinaryInterval(BytesRef min, boolean minInclusive, B cmp = min.compareTo(max); } else { cmp = -1; - if (min.length == 0 && minInclusive) { + if (min.length == 0) { Review comment: this looks wrong as we should still make sure that the empty string is rejected if min=="" and minInclusive==false? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
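The boundary semantics at stake can be stated as a plain string predicate. This is a minimal sketch of the intended behavior, not Lucene's automaton construction: even when min is the empty string, minInclusive must still decide whether the empty string itself matches, which is the case the review comment flags.

```java
// Illustrative lower-bound predicate; not Automata#makeBinaryInterval itself.
public class RangeBounds {
    // Does s satisfy the lower bound (min, minInclusive)?
    public static boolean aboveLower(String s, String min, boolean minInclusive) {
        int cmp = s.compareTo(min);
        // Inclusive bound accepts equality; exclusive bound requires strictly greater.
        return minInclusive ? cmp >= 0 : cmp > 0;
    }
}
```

With min = "" and minInclusive = false, every non-empty string is accepted but "" itself must be rejected; dropping the minInclusive check, as the quoted one-line change does, removes exactly that distinction.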
[GitHub] [lucene-solr] HoustonPutman opened a new pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API
HoustonPutman opened a new pull request #1977: URL: https://github.com/apache/lucene-solr/pull/1977 https://issues.apache.org/jira/browse/SOLR-14907 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9524) NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212529#comment-17212529 ] Adrien Grand commented on LUCENE-9524: -- Thanks for digging this! I think it's a bug in SpanWeight#explain, which should keep working even if scores are not requested. I guess we could either create a dummy {{Similarity.SimScorer}} when scores are not requested to make sure that the explain logic keeps working, or change the explain() logic to keep working when the simScorer is null? > NullPointerException in IndexSearcher.explain() when using > ComplexPhraseQueryParser > --- > > Key: LUCENE-9524 > URL: https://issues.apache.org/jira/browse/LUCENE-9524 > Project: Lucene - Core > Issue Type: Bug > Components: core/queryparser, core/search >Affects Versions: 8.6, 8.6.2 >Reporter: Michał Słomkowski >Priority: Major > > I get NPE when I use {{IndexSearcher.explain()}}. Checked with Lucene 8.6.0 > and 8.6.2. > The query: {{(lorem AND NOT "dolor lorem") OR ipsum}} > The text: {{dolor lorem ipsum}} > Stack trace: > {code} > java.lang.NullPointerException at > java.util.Objects.requireNonNull(Objects.java:203) > at org.apache.lucene.search.LeafSimScorer.(LeafSimScorer.java:38) > at > org.apache.lucene.search.spans.SpanWeight.explain(SpanWeight.java:160) > at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87) > at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87) > at > org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:716) > at > org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:693) > {code} > Minimal example code: > {code:java} > val analyzer = new StandardAnalyzer(); > val query = new ComplexPhraseQueryParser("", analyzer).parse(queryString); > final MemoryIndex memoryIndex = new MemoryIndex(true); > memoryIndex.addField("", text, analyzer); > final IndexSearcher searcher = memoryIndex.createSearcher(); > final TopDocs topDocs = searcher.search(query, 
1); > final ScoreDoc match = topDocs.scoreDocs[0]; > searcher.explain(query, match.doc); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
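Of the two directions Adrien lists, the null-tolerant explain() can be sketched as below. Names are illustrative stand-ins only, not the actual SpanWeight code (which works with LeafSimScorer and Explanation objects): the idea is simply to branch before dereferencing a scorer that was never built because scores were not requested.

```java
// Illustrative sketch of a null-tolerant explain(), as discussed above.
// The real fix belongs in SpanWeight#explain; these names are stand-ins.
public class ExplainSketch {
    // simScore is null when the query was built with scoring disabled.
    public static String explain(Float simScore) {
        if (simScore == null) {
            // Fall back to a score-less explanation instead of throwing NPE.
            return "match (scores not computed)";
        }
        return "match, score=" + simScore;
    }
}
```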
[GitHub] [lucene-solr] cbuescher opened a new pull request #1976: LUCENE-9578: TermRangeQuery empty string lower bound edge case
cbuescher opened a new pull request #1976: URL: https://github.com/apache/lucene-solr/pull/1976 # Description Currently a TermRangeQuery with the empty String ("") as lower bound and includeLower=false internally constructs an Automaton that doesn't match anything. This is unexpected, especially for open upper bounds where any string should be considered to be "higher" than the empty string. # Solution This PR changes "Automata#makeBinaryInterval" so that for an empty string lower bound and an open upper bound, any String should match the query regardless of the includeLower flag. # Tests Added two new tests to `TestAutomaton`. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `./gradlew check` - [x] I have added tests for my changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-14870: -- Fix Version/s: master (9.0) Resolution: Fixed Status: Resolved (was: Patch Available) > gradle build does not validate ref-guide -> javadoc links > - > > Key: SOLR-14870 > URL: https://issues.apache.org/jira/browse/SOLR-14870 > Project: Solr > Issue Type: Task >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-14870.patch, SOLR-14870.patch > > > the ant build had (has on 8x) a feature that ensured we didn't have any > broken links between the ref guide and the javadocs... > {code} > <target name="documentation" depends="javadocs,changes-to-html,process-webpages"> > <ant dir="solr-ref-guide" target="bare-bones-html-validation" inheritall="false"> > <property name="local.javadocs" value="true"/> > </ant> > </target> > {code} > ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} > just did internal validation of the structure of the guide, but this hook > meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first > build the javadocs; then build the ref-guide; then validate _all_ links in > the ref-guide, even those to (local) javadocs > While the "local.javadocs" property logic _inside_ the > solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage > this functionality from the "solr" project doesn't seem to have been > preserved -- so currently, {{gradle check}} doesn't know/care if someone adds > a nonsense javadoc link to the ref-guide (or removes a class/method whose > javadoc is currently linked to from the ref guide) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links
[ https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212449#comment-17212449 ] ASF subversion and git services commented on SOLR-14870: Commit b4f044219319fc0a0a94b92e2d90a6b25dae9de0 in lucene-solr's branch refs/heads/master from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b4f0442 ] SOLR-14870: refactor ref-guide build.gradle logic to re-enable guide->javadoc link checking fix 'broken' javadoc links in ref-guide to match new documentation path structures for 9.x > gradle build does not validate ref-guide -> javadoc links > - > > Key: SOLR-14870 > URL: https://issues.apache.org/jira/browse/SOLR-14870 > Project: Solr > Issue Type: Task >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14870.patch, SOLR-14870.patch > > > the ant build had (has on 8x) a feature that ensured we didn't have any > broken links between the ref guide and the javadocs... 
> {code} > <target name="documentation" depends="javadocs,changes-to-html,process-webpages"> > <ant dir="solr-ref-guide" target="bare-bones-html-validation" inheritall="false"> > <property name="local.javadocs" value="true"/> > </ant> > </target> > {code} > ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} > just did internal validation of the structure of the guide, but this hook > meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first > build the javadocs; then build the ref-guide; then validate _all_ links in > the ref-guide, even those to (local) javadocs > While the "local.javadocs" property logic _inside_ the > solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage > this functionality from the "solr" project doesn't seem to have been > preserved -- so currently, {{gradle check}} doesn't know/care if someone adds > a nonsense javadoc link to the ref-guide (or removes a class/method whose > javadoc is currently linked to from the ref guide) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1963: SOLR-14827: Refactor schema loading to not use XPath
madrob commented on a change in pull request #1963: URL: https://github.com/apache/lucene-solr/pull/1963#discussion_r503332224 ## File path: solr/core/src/java/org/apache/solr/core/XmlConfigFile.java ## @@ -145,15 +139,15 @@ public XmlConfigFile(SolrResourceLoader loader, String name, InputSource is, Str db.setErrorHandler(xmllog); try { doc = db.parse(is); -origDoc = copyDoc(doc); +origDoc = doc; Review comment: If these are always the same, do we still need two copies of it? ## File path: solr/core/src/java/org/apache/solr/cloud/CloudConfigSetService.java ## @@ -39,14 +43,28 @@ */ public class CloudConfigSetService extends ConfigSetService { private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); - + private Map cache = new ConcurrentHashMap<>(); Review comment: nit: make final? ## File path: solr/solrj/src/java/org/apache/solr/common/ConfigNode.java ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.common; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Set; +import java.util.function.Function; +import java.util.function.Predicate; + +import org.apache.solr.cluster.api.SimpleMap; + +/** + * A generic interface that represents a config file, mostly XML + */ +public interface ConfigNode { + ThreadLocal<Function<String, String>> SUBSTITUTES = new ThreadLocal<>(); + + /** + * Name of the tag + */ + String name(); + + /** + * Text value of the node + */ + String textValue(); + + /** + * Attributes + */ + SimpleMap attributes(); + + /** + * Child by name + */ + default ConfigNode child(String name) { +return child(null, name); + } + + /** Iterate through child nodes with the name and return the first child that matches + */ + default ConfigNode child(Predicate<ConfigNode> test, String name) { Review comment: I think these are more natural with the order of the parameters reversed. It also aligns better with the single argument version calling `child(name, null)` - easier to reason about in an IDE autocomplete. `child(name, test)` reads more similarly to the previous XPath `//name[test]`. ## File path: solr/solrj/src/java/org/apache/solr/common/ConfigNode.java ## @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.common; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Set; +import java.util.function.Function; +import java.util.function.Predicate; + +import org.apache.solr.cluster.api.SimpleMap; + +/** + * A generic interface that represents a config file, mostly XML + */ +public interface ConfigNode { + ThreadLocal<Function<String, String>> SUBSTITUTES = new ThreadLocal<>(); + + /** + * Name of the tag + */ + String name(); + + /** + * Text value of the node + */ + String textValue(); + + /** + * Attributes + */ + SimpleMap attributes(); + + /** + * Child by name + */ + default ConfigNode child(String name) { +return child(null, name); + } + + /** Iterate through child nodes with the name and return the first child that matches + */ + default ConfigNode child(Predicate<ConfigNode> test, String name) { +ConfigNode[] result = new ConfigNode[1]; +forEachChild(it -> { + if (name!=null && !name.equals(it.name()))
[GitHub] [lucene-solr] sigram commented on a change in pull request #1974: SOLR-14914: Add option to disable metrics collection
sigram commented on a change in pull request #1974: URL: https://github.com/apache/lucene-solr/pull/1974#discussion_r503358068 ## File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java ## @@ -110,19 +110,22 @@ public static final int DEFAULT_CLOUD_REPORTER_PERIOD = 60; - private MetricRegistry.MetricSupplier<Counter> counterSupplier; - private MetricRegistry.MetricSupplier<Meter> meterSupplier; - private MetricRegistry.MetricSupplier<Timer> timerSupplier; - private MetricRegistry.MetricSupplier<Histogram> histogramSupplier; + private final MetricsConfig metricsConfig; + private final MetricRegistry.MetricSupplier<Counter> counterSupplier; + private final MetricRegistry.MetricSupplier<Meter> meterSupplier; + private final MetricRegistry.MetricSupplier<Timer> timerSupplier; + private final MetricRegistry.MetricSupplier<Histogram> histogramSupplier; public SolrMetricManager() { +metricsConfig = new MetricsConfig.MetricsConfigBuilder().build(); counterSupplier = MetricSuppliers.counterSupplier(null, null); meterSupplier = MetricSuppliers.meterSupplier(null, null); Review comment: @muse-dev is obviously wrong, this can never be null as it's a static method, and it accepts null args. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9578) TermRangeQuery with empty string lower bound edge case
Christoph Büscher created LUCENE-9578: - Summary: TermRangeQuery with empty string lower bound edge case Key: LUCENE-9578 URL: https://issues.apache.org/jira/browse/LUCENE-9578 Project: Lucene - Core Issue Type: Bug Components: core/search Affects Versions: 8.6.3, trunk Reporter: Christoph Büscher Currently a TermRangeQuery with the empty String ("") as lower bound and includeLower=false internally constructs an Automaton that doesn't match anything. This is unexpected, especially for open upper bounds where any string should be considered to be "higher" than the empty string. I think "Automata#makeBinaryInterval" should be changed so that for an empty string lower bound and an open upper bound, any String should match the query regardless of the includeLower flag. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14924) Some ReplicationHandler metrics are reported using incorrect types
Andrzej Bialecki created SOLR-14924: --- Summary: Some ReplicationHandler metrics are reported using incorrect types Key: SOLR-14924 URL: https://issues.apache.org/jira/browse/SOLR-14924 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: metrics Affects Versions: 8.6.3, 8.7 Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Some metrics reported from {{ReplicationHandler}} use incorrect types - they are reported as String values instead of as numbers. This is caused by calling the {{ReplicationHandler.addVal}} utility method with the type {{Integer.class}}, which the method doesn't support, so it returns the value as a string. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
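The failure mode described is a type-dispatch helper that only handles some requested types and silently falls back to the raw string for the rest. A hedged sketch of that pattern follows; it is illustrative only, and the real addVal in ReplicationHandler differs in its details:

```java
// Illustrative sketch of the bug pattern described above; not the actual
// ReplicationHandler.addVal implementation.
public class MetricVal {
    // Convert a raw property string to the requested type. Types the helper
    // doesn't know about (e.g. Integer.class) fall through to the String
    // branch, which is how a numeric metric ends up reported as a String.
    public static Object typedVal(String raw, Class<?> type) {
        if (raw == null) return null;
        if (type == Long.class) return Long.parseLong(raw);
        if (type == Double.class) return Double.parseDouble(raw);
        return raw; // unsupported type: silently returns the String
    }
}
```

The fix direction implied by the report is either to add the missing branch for the requested type or to fail loudly on unsupported types instead of stringifying.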
[GitHub] [lucene-solr] munendrasn opened a new pull request #1975: Include missing commands in package tool help section
munendrasn opened a new pull request #1975: URL: https://github.com/apache/lucene-solr/pull/1975 * Include add-key and uninstall commands > As this is a minor change, I haven't created a JIRA, will create one if required This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360 ] Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 1:24 PM: The critical lines (branch 8.6) in org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505. So the question is, can this code be avoided if 'waitSearcher' is specified with 'false'. This is not my first contribution, but in this case it will be a fundamental change, so if someone can guide me in the right direction on fixing this it would be great. was (Author: thomas.woeckinger): The critical lines (branch 8.6) in org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505. So the question is, can this code be avoided if 'waitSearcher' is specified with 'false'. This is not my first contribution, but in this case it will be fundamental change, so if someone can guide me in the right direction on fixing this it would be great. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so ulog.openRealtimeSearcher(); is called, this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. 
> Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block almost each operation is blocked. > This reduces multi threaded index update to a single thread behavior. > The described behavior is not depending on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Wöckinger updated SOLR-14923: Description: Parallel indexing does not make sense at moment when child documents are used. The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the end of the method doVersionAdd if Ulog caches should be refreshed. This check will return true if any child document is included in the AddUpdateCommand. If so ulog.openRealtimeSearcher(); is called, this call is very expensive, and executed in a synchronized block of the UpdateLog instance, therefore all other operations on the UpdateLog are blocked too. Because every important UpdateLog method (add, delete, ...) is done using a synchronized block almost each operation is blocked. This reduces multi threaded index update to a single thread behavior. The described behavior is not depending on any option of the UpdateRequest, so it does not make any difference if 'waitFlush', 'waitSearcher' or 'softCommit' is true or false. The described behavior makes the usage of ChildDocuments useless, because the performance is unacceptable. was: Parallel indexing does not make sense at moment when child documents are used. The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the end of the method doVersionAdd if Ulog caches should be refreshed. This check will return true if any child document is include in the AddUpdateCommand. If so ulog.openRealtimeSearcher(); is called, this call is very expensive, and executed in a synchronized block, therefor all other operations on the UpdateLog are blocked too. Because every important UpdateLog method (add, delete, ...) is done using a synchronized block almost each operation is blocked. This reduces multi threaded index update to a single thread behavior. 
The described behavior is not depending on any option of the UpdateRequest, so it does not make any difference if 'waitFlush', 'waitSearcher' or 'softCommit' is true or false. The described behavior makes the usage of ChildDocuments useless, because the performance is unacceptable. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so ulog.openRealtimeSearcher(); is called, this call is very expensive, > and executed in a synchronized block of the UpdateLog instance, therefore all > other operations on the UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block almost each operation is blocked. > This reduces multi threaded index update to a single thread behavior. > The described behavior is not depending on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360 ] Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 1:21 PM: The critical lines (branch 8.6) in org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505. So the question is, can this code be avoided if 'waitSearcher' is specified as 'false'? This is not my first contribution, but in this case it will be a fundamental change, so if someone can guide me in the right direction on fixing this, it would be great. was (Author: thomas.woeckinger): The critical lines (branch 8.6) in org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505. So the question is, can this code be avoided if 'waitSearcher' is specified as 'false'? If someone can guide me in the right direction on fixing this, it would be great. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so, ulog.openRealtimeSearcher() is called; this call is very expensive, > and executed in a synchronized block, therefore all other operations on the > UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block, almost every operation is blocked. 
> This reduces multi-threaded index updates to single-threaded behavior. > The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
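The question posed in the comment above — skipping the expensive refresh when 'waitSearcher' is false — amounts to adding one extra condition to the check. A minimal, hypothetical guard (not the actual Solr code at lines 500 to 505) might look like:

```java
// Hypothetical guard sketch: only reopen the realtime searcher for
// child-document updates when the client actually asked to wait for a
// searcher. Names are illustrative, not Solr's real method signatures.
public class SketchRefreshGuard {
    public static boolean shouldOpenRealtimeSearcher(boolean hasChildDocs, boolean waitSearcher) {
        return hasChildDocs && waitSearcher;
    }
}
```

Whether this is safe depends on what relies on the realtime searcher being fresh (e.g. /get requests for child documents), which is presumably why the author calls it a fundamental change.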
[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360 ] Thomas Wöckinger commented on SOLR-14923: - The critical lines (branch 8.6) in org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505. So the question is, can this code be avoided if 'waitSearcher' is specified as 'false'? If someone can guide me in the right direction on fixing this, it would be great. > Indexing performance is unacceptable when child documents are involved > -- > > Key: SOLR-14923 > URL: https://issues.apache.org/jira/browse/SOLR-14923 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6 >Reporter: Thomas Wöckinger >Priority: Critical > Labels: performance > > Parallel indexing does not make sense at the moment when child documents are used. > The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the > end of the method doVersionAdd if Ulog caches should be refreshed. > This check will return true if any child document is included in the > AddUpdateCommand. > If so, ulog.openRealtimeSearcher() is called; this call is very expensive, > and executed in a synchronized block, therefore all other operations on the > UpdateLog are blocked too. > Because every important UpdateLog method (add, delete, ...) is done using a > synchronized block, almost every operation is blocked. > This reduces multi-threaded index updates to single-threaded behavior. > The described behavior does not depend on any option of the UpdateRequest, > so it does not make any difference if 'waitFlush', 'waitSearcher' or > 'softCommit' is true or false. > The described behavior makes the usage of ChildDocuments useless, because the > performance is unacceptable. 
[jira] [Created] (SOLR-14923) Indexing performance is unacceptable when child documents are involved
Thomas Wöckinger created SOLR-14923: --- Summary: Indexing performance is unacceptable when child documents are involved Key: SOLR-14923 URL: https://issues.apache.org/jira/browse/SOLR-14923 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: update, UpdateRequestProcessors Affects Versions: 8.6, 8.5, 8.4, 8.3, master (9.0) Reporter: Thomas Wöckinger Parallel indexing does not make sense at the moment when child documents are used. The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the end of the method doVersionAdd if Ulog caches should be refreshed. This check will return true if any child document is included in the AddUpdateCommand. If so, ulog.openRealtimeSearcher() is called; this call is very expensive, and executed in a synchronized block, therefore all other operations on the UpdateLog are blocked too. Because every important UpdateLog method (add, delete, ...) is done using a synchronized block, almost every operation is blocked. This reduces multi-threaded index updates to single-threaded behavior. The described behavior does not depend on any option of the UpdateRequest, so it does not make any difference if 'waitFlush', 'waitSearcher' or 'softCommit' is true or false. The described behavior makes the usage of ChildDocuments useless, because the performance is unacceptable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930
[ https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212352#comment-17212352 ] Erick Erickson commented on SOLR-14844: --- [~samuelgmartinez] JIRAs can only be assigned to committers, so I'll leave it assigned to me. I see all the changes that happen here, so I'll be monitoring. When it's ready I'll commit it. It's a bit confusing at present because master is built with Gradle and 8x with ant. At root, you have to change versions.props for master and lucene/ivy-versions.properties in 8x. Then there are some magic incantations you have to do that regenerate checksums and similar, different in each, which show a lot of file changes for just a few code changes. I've attached a couple of patches that, absent this problem, are what I would have committed if it had "just worked". So you should be able to apply them, then fix whatever you need to. When you're ready, submit a PR or patch (whichever is more comfortable) and I'll take it from there. That should allow you to get to the real problem without getting frustrated by all the fiddly bits. Do change solr/CHANGES.txt in both versions to give yourself credit for fixing this. The convention I use for something like this where someone else does the heavy lifting is: "(Samuel Garcia Martinez via Erick Erickson)". That way you get credit for the work and I get the blame if something goes wrong ;) Finally, when switching back and forth between master and 8x there may be cruft left over that fails precommit. "git clean -dxf" is your friend in those cases, although that'll also erase your IDE files and you'll have to "ant idea" in 8x or just open the project in master. Real Soon Now I'll look at worktree to avoid this... Oh, and on master "gradlew check" sometimes runs out of memory on my machine, although I haven't heard others complain so maybe I'm just lucky. 
If that happens, you can bump the memory Gradle uses in gradle.properties. And thanks! [^SOLR-14884-8x.patch] [^SOLR-14844-master.patch] > Upgrade Jetty to 9.4.32.v20200930 > - > > Key: SOLR-14844 > URL: https://issues.apache.org/jira/browse/SOLR-14844 > Project: Solr > Issue Type: Improvement >Affects Versions: 8.6 >Reporter: Cassandra Targett >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14844-master.patch, SOLR-14884-8x.patch > > > A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools > raising red flags > ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]). > Here's the Jetty issue: > [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in > 9.4.30+, so we should upgrade to that for 8.7 > -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than > requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) > know if this problem is even exploitable in Solr, or b) if the workaround > suggested is even possible in Solr.- > In normal Solr installs, w/o jetty optimizations, this issue is largely > mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
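The memory bump mentioned above goes through Gradle's standard JVM-args property. A sketch of the relevant gradle.properties entry — the value is illustrative, not a recommendation from this thread:

```properties
# gradle.properties at the repository root.
# org.gradle.jvmargs sets the JVM flags for the Gradle daemon; raise -Xmx
# if "gradlew check" runs out of memory. The 4g figure is an assumption.
org.gradle.jvmargs=-Xmx4g
```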
[jira] [Updated] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930
[ https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14844: -- Attachment: SOLR-14844-master.patch SOLR-14884-8x.patch > Upgrade Jetty to 9.4.32.v20200930 > - > > Key: SOLR-14844 > URL: https://issues.apache.org/jira/browse/SOLR-14844 > Project: Solr > Issue Type: Improvement >Affects Versions: 8.6 >Reporter: Cassandra Targett >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14844-master.patch, SOLR-14884-8x.patch > > > A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools > raising red flags > ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]). > Here's the Jetty issue: > [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in > 9.4.30+, so we should upgrade to that for 8.7 > -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than > requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) > know if this problem is even exploitable in Solr, or b) if the workaround > suggested is even possible in Solr.- > In normal Solr installs, w/o jetty optimizations, this issue is largely > mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930
[ https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212311#comment-17212311 ] Samuel García Martínez commented on SOLR-14844: --- I can handle the upgrade completely if you want, so feel free to assign it to me and I'll submit a PR on Github. I may need some guidance on "non obvious" changes to upgrade the Jetty version (updating solr/licenses and some other things I may not be aware of). I would approach this as follows: * Understand why it is not reproducible on the master branch * Modify the unit tests to ensure they pass on both branches * Upgrade the Jetty version * Open a new ticket to improve gzip handling on the client > Upgrade Jetty to 9.4.32.v20200930 > - > > Key: SOLR-14844 > URL: https://issues.apache.org/jira/browse/SOLR-14844 > Project: Solr > Issue Type: Improvement >Affects Versions: 8.6 >Reporter: Cassandra Targett >Assignee: Erick Erickson >Priority: Major > > A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools > raising red flags > ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]). > Here's the Jetty issue: > [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in > 9.4.30+, so we should upgrade to that for 8.7 > -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than > requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) > know if this problem is even exploitable in Solr, or b) if the workaround > suggested is even possible in Solr.- > In normal Solr installs, w/o jetty optimizations, this issue is largely > mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930
[ https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212311#comment-17212311 ] Samuel García Martínez edited comment on SOLR-14844 at 10/12/20, 11:29 AM: --- I can handle the upgrade completely if you want, so feel free to assign it to me and I'll submit a PR on Github. I may need some guidance on "non obvious" changes to upgrade the Jetty version (updating solr/licenses and some other things I may not be aware of). I would approach this as follows: * Understand why it is not reproducible on the master branch * Modify the unit tests to ensure they pass on both branches * Upgrade the Jetty version * Open a new JIRA to improve gzip handling on the client was (Author: samuelgmartinez): I can handle the upgrade completely if you want, so feel free to assign it to me and I'll submit a PR on Github. I may need some guidance on "non obvious" changes to upgrade the Jetty version (updating solr/licenses and some other things I may not be aware of). I would approach this as follows: * Understand why it is not reproducible on the master branch * Modify the unit tests to ensure they pass on both branches * Upgrade the Jetty version * Open a new ticket to improve gzip handling on the client > Upgrade Jetty to 9.4.32.v20200930 > - > > Key: SOLR-14844 > URL: https://issues.apache.org/jira/browse/SOLR-14844 > Project: Solr > Issue Type: Improvement >Affects Versions: 8.6 >Reporter: Cassandra Targett >Assignee: Erick Erickson >Priority: Major > > A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools > raising red flags > ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]). > Here's the Jetty issue: > [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. 
It's fixed in > 9.4.30+, so we should upgrade to that for 8.7 > -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than > requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) > know if this problem is even exploitable in Solr, or b) if the workaround > suggested is even possible in Solr.- > In normal Solr installs, w/o jetty optimizations, this issue is largely > mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] msokolov edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
msokolov edited a comment on pull request #1930: URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707088977 > it will be helpful to use ann-benchmark so .. I finally did get it working, but I have a few questions. Aside: I think my main stumbling block was running on an ARM instance - that may have caused some dependency issues, and then I found most of the algorithms are compiled with x86-only compiler extensions, sigh. But my main concern there is about the way this benchmarking system runs. It seems to expect your algorithm to be delivered as an in-process extension to Python, which works OK for a native code library, but I'm not sure how we'd present Lucene to it. We don't want to have to call through a network API? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec
msokolov commented on pull request #1930: URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707088977 > it will be helpful to use ann-benchmark so .. I finally did get it working, but I have a few questions. Aside: I think my main stumbling block was running on an ARM instance - that may have caused some dependency issues, and then I found most of the algorithms are compiled with x86-only compiler extensions, sigh. But my main concern there is about the way this benchmarking system runs. It seems to expect your algorithm to be delivered as an in-process extension to Python, which works OK for a native code library, but I'm not sure how we'd present Lucene to it. We don't want to have to call through a network API? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1974: SOLR-14914: Add option to disable metrics collection
muse-dev[bot] commented on a change in pull request #1974: URL: https://github.com/apache/lucene-solr/pull/1974#discussion_r503255143 ## File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java ## @@ -110,19 +110,22 @@ public static final int DEFAULT_CLOUD_REPORTER_PERIOD = 60; - private MetricRegistry.MetricSupplier<Counter> counterSupplier; - private MetricRegistry.MetricSupplier<Meter> meterSupplier; - private MetricRegistry.MetricSupplier<Timer> timerSupplier; - private MetricRegistry.MetricSupplier<Histogram> histogramSupplier; + private final MetricsConfig metricsConfig; + private final MetricRegistry.MetricSupplier<Counter> counterSupplier; + private final MetricRegistry.MetricSupplier<Meter> meterSupplier; + private final MetricRegistry.MetricSupplier<Timer> timerSupplier; + private final MetricRegistry.MetricSupplier<Histogram> histogramSupplier; public SolrMetricManager() { +metricsConfig = new MetricsConfig.MetricsConfigBuilder().build(); counterSupplier = MetricSuppliers.counterSupplier(null, null); meterSupplier = MetricSuppliers.meterSupplier(null, null); Review comment: *NULL_DEREFERENCE:* object `null` is dereferenced by call to `meterSupplier(...)` at line 122. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
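The NULL_DEREFERENCE warning above is the classic Infer pattern: a factory is handed null and dereferences it somewhere inside. A minimal, hypothetical illustration of the guard that silences it — the names below are illustrative and not Solr's actual MetricSuppliers API:

```java
// Hypothetical sketch of the flagged pattern: a factory method that would
// throw NullPointerException if it dereferenced a null argument, fixed with
// an explicit null guard that falls back to a default.
public class SketchSupplierFactory {
    public static String meterSupplier(String pluginInfo) {
        if (pluginInfo == null) {
            return "default-meter"; // guard: return a default instead of dereferencing null
        }
        return pluginInfo.trim();  // safe: pluginInfo is known non-null here
    }
}
```

Static analyzers like Infer flag the call site (`meterSupplier(null, null)`) because they cannot prove the callee tolerates null; either the guard above or an annotation/contract makes the intent explicit.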
[jira] [Comment Edited] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930
[ https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212095#comment-17212095 ] Erick Erickson edited comment on SOLR-14844 at 10/12/20, 11:14 AM: --- [~samuelgmartinez] That'd be great (if you supplied a patch). I'm much more comfortable now that you have an explanation. I'm not clear on why this only fails in 8x and not master, but that's just my confusion. How do you want to proceed? If you just attach a patch file, I can put it in this upgrade or we can have a separate JIRA... Let me know and thanks! was (Author: erickerickson): [~samuelgmartinez] That'd be great (if you supplied a patch). I'm much more comfortable now that you have an explanation. I'm not clear on why this only fails in 8x and not master, but that's just my confusion. How do you want to proceed? If you just attach a patch file, I can put it in this upgrade or we can have a separate patch... Let me know and thanks! > Upgrade Jetty to 9.4.32.v20200930 > - > > Key: SOLR-14844 > URL: https://issues.apache.org/jira/browse/SOLR-14844 > Project: Solr > Issue Type: Improvement >Affects Versions: 8.6 >Reporter: Cassandra Targett >Assignee: Erick Erickson >Priority: Major > > A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools > raising red flags > ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]). > Here's the Jetty issue: > [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in > 9.4.30+, so we should upgrade to that for 8.7 > -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than > requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) > know if this problem is even exploitable in Solr, or b) if the workaround > suggested is even possible in Solr.- > In normal Solr installs, w/o jetty optimizations, this issue is largely > mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details. 
[GitHub] [lucene-solr] noblepaul opened a new pull request #1974: SOLR-14914: Add option to disable metrics collection
noblepaul opened a new pull request #1974: URL: https://github.com/apache/lucene-solr/pull/1974 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] iverase commented on pull request #1907: LUCENE-9538: Detect polygon self-intersections in the Tessellator
iverase commented on pull request #1907: URL: https://github.com/apache/lucene-solr/pull/1907#issuecomment-707042029 Thinking more about this, I am not sure if we can make it optional. There can be situations when the polygon is invalid but it actually does not fail when it is tessellated. We have an example in the tests, where the polygon looks like: https://user-images.githubusercontent.com/29038686/95737373-d8105780-0c87-11eb-9f1f-3bd16e392067.png This polygon is invalid and the new check will throw an error, but it does not fail if you try to tessellate it. The documentation of the `Tessellator` explicitly says that: ``` Holes may only touch at one vertex ``` IMHO we should fail in this case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ
[ https://issues.apache.org/jira/browse/SOLR-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212288#comment-17212288 ] ASF subversion and git services commented on SOLR-14922: Commit e444df1435f3e24e6f09db47d9d369b3d4e85f12 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e444df1 ] SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ (#1973) > Include solr-ref-guide tasks in sourceSets for IntelliJ > --- > > Key: SOLR-14922 > URL: https://issues.apache.org/jira/browse/SOLR-14922 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ
[ https://issues.apache.org/jira/browse/SOLR-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-14922. Fix Version/s: master (9.0) Resolution: Fixed > Include solr-ref-guide tasks in sourceSets for IntelliJ > --- > > Key: SOLR-14922 > URL: https://issues.apache.org/jira/browse/SOLR-14922 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss merged pull request #1973: SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ
dweiss merged pull request #1973: URL: https://github.com/apache/lucene-solr/pull/1973 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss opened a new pull request #1973: SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ
dweiss opened a new pull request #1973: URL: https://github.com/apache/lucene-solr/pull/1973 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ
Dawid Weiss created SOLR-14922: -- Summary: Include solr-ref-guide tasks in sourceSets for IntelliJ Key: SOLR-14922 URL: https://issues.apache.org/jira/browse/SOLR-14922 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Dawid Weiss Assignee: Dawid Weiss -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14915) Prometheus-exporter should not depend on Solr-core
[ https://issues.apache.org/jira/browse/SOLR-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212272#comment-17212272 ] Dawid Weiss commented on SOLR-14915: I attached a patch showing how to modify your PR so that the entire "distribution" is included in the Solr packaging. It's just a suggestion (the solution requires a decision on whether to include duplicated JARs, which scripts to include, etc.). > Prometheus-exporter should not depend on Solr-core > -- > > Key: SOLR-14915 > URL: https://issues.apache.org/jira/browse/SOLR-14915 > Project: Solr > Issue Type: Improvement > Components: contrib - prometheus-exporter >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: patch.patch > > Time Spent: 40m > Remaining Estimate: 0h > > I think it's *crazy* that our Prometheus exporter depends on Solr-core -- > this thing is a _client_ of Solr; it does not live within Solr. The exporter > ought to be fairly lean. One consequence of this dependency is that, for > example, security vulnerabilities reported against Solr (e.g. Jetty) can (and > do, where I work) wind up being reported against this module even though > Prometheus isn't using Jetty. > From my evaluation today of what's going on, it appears the crux of the > problem is that the prometheus exporter uses some utility mechanisms in > Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit > hole goes deeper...) and DOMUtils (further depends on PropertiesUtil). It > can easily be made to not use XmlConfig. DOMUtils & PropertiesUtil could move > to SolrJ which already has lots of little dependency-free utilities needed by > SolrJ and Solr-core alike. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14915) Prometheus-exporter should not depend on Solr-core
[ https://issues.apache.org/jira/browse/SOLR-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14915: --- Attachment: patch.patch > Prometheus-exporter should not depend on Solr-core > -- > > Key: SOLR-14915 > URL: https://issues.apache.org/jira/browse/SOLR-14915 > Project: Solr > Issue Type: Improvement > Components: contrib - prometheus-exporter >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Attachments: patch.patch > > Time Spent: 40m > Remaining Estimate: 0h > > I think it's *crazy* that our Prometheus exporter depends on Solr-core -- > this thing is a _client_ of Solr; it does not live within Solr. The exporter > ought to be fairly lean. One consequence of this dependency is that, for > example, security vulnerabilities reported against Solr (e.g. Jetty) can (and > do, where I work) wind up being reported against this module even though > Prometheus isn't using Jetty. > From my evaluation today of what's going on, it appears the crux of the > problem is that the prometheus exporter uses some utility mechanisms in > Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit > hole goes deeper...) and DOMUtils (further depends on PropertiesUtil). It > can easily be made to not use XmlConfig. DOMUtils & PropertiesUtil could move > to SolrJ which already has lots of little dependency-free utilities needed by > SolrJ and Solr-core alike. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785
sigram commented on a change in pull request #1964: URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503147180 ## File path: solr/core/src/java/org/apache/solr/api/AnnotatedApi.java ## @@ -85,9 +85,18 @@ public EndPoint getEndPoint() { } public static List<Api> getApis(Object obj) { -return getApis(obj.getClass(), obj); +return getApis(obj.getClass(), obj, true); } - public static List<Api> getApis(Class<?> theClass, Object obj) { + + /** + * Get a list of Api-s supported by this class. + * @param theClass class + * @param obj object of this class (may be null) + * @param required if true then an exception is thrown if no Api-s can be retrieved, if false + *then absence of Api-s is silently ignored. + * @return list of discovered Api-s + */ + public static List<Api> getApis(Class<?> theClass, Object obj, boolean required) { Review comment: Do you mean the name is confusing? Either name is fine with me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785
sigram commented on a change in pull request #1964: URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503146425 ## File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java ## @@ -168,6 +172,33 @@ public CoreLoadFailure(CoreDescriptor cd, Exception loadFailure) { } } + public static class ClusterSingletons { Review comment: That's the downside of splitting this PR into API & implementation ... The way we load CC it's nearly impossible to properly initialize singletons from within the load() method if we don't defer the initialization until the load() is completed. `ClusterSingletons` class is just a helper for deferred init of singletons once they are all discovered and loaded. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
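The comment above describes `ClusterSingletons` as a helper for deferring singleton initialization until `CoreContainer.load()` completes. A minimal, hypothetical sketch of that deferred-init pattern (this is not Solr's actual class, just an illustration of the idea):

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredSingletons {
  interface Singleton { void start(); }

  private final List<Singleton> pending = new ArrayList<>();
  private boolean loaded = false;

  // Called while the container is still loading: just remember the singleton.
  public synchronized void register(Singleton s) {
    if (loaded) {
      s.start(); // container already up: start immediately
    } else {
      pending.add(s);
    }
  }

  // Called once at the end of load(): start everything that was deferred.
  public synchronized void loadComplete() {
    loaded = true;
    for (Singleton s : pending) {
      s.start();
    }
    pending.clear();
  }

  public static void main(String[] args) {
    DeferredSingletons reg = new DeferredSingletons();
    List<String> started = new ArrayList<>();
    reg.register(() -> started.add("a")); // deferred: load() not done yet
    reg.register(() -> started.add("b"));
    reg.loadComplete();                   // both start now, in registration order
    reg.register(() -> started.add("c")); // after load: starts immediately
    System.out.println(started);
  }
}
```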
[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785
sigram commented on a change in pull request #1964: URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503145004 ## File path: solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java ## @@ -51,7 +51,7 @@ public class ContainerPluginsApi { private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); - public static final String PLUGIN = "plugin"; + public static final String PLUGINS = "plugin"; Review comment: it's a left-over from an earlier change that was reverted, but I missed this one. Still, I think we should change the name of the path to "plugins" because the singular form doesn't make sense. Back-compat can be preserved by initially accepting both names. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
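The back-compat approach suggested above (accept both the old and new path names while clients migrate) can be sketched as follows. The constant names echo the review; the routing check is illustrative, not Solr's actual request dispatch:

```java
import java.util.Set;

public class PluginPaths {
  public static final String PLUGIN = "plugin";   // legacy singular name
  public static final String PLUGINS = "plugins"; // preferred plural name

  // During the deprecation window, both names are accepted.
  private static final Set<String> ACCEPTED = Set.of(PLUGIN, PLUGINS);

  public static boolean isPluginsPath(String name) {
    return ACCEPTED.contains(name);
  }

  public static void main(String[] args) {
    System.out.println(isPluginsPath("plugin"));  // legacy clients keep working
    System.out.println(isPluginsPath("plugins")); // new name works too
    System.out.println(isPluginsPath("other"));
  }
}
```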
[jira] [Commented] (SOLR-14656) Deprecate current autoscaling framework, remove from master
[ https://issues.apache.org/jira/browse/SOLR-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212247#comment-17212247 ] Andrzej Bialecki commented on SOLR-14656: - {quote}this is not in CHANGES.txt, do you think it should be? {quote} Yes, I just added it - thanks for the reminder. > Deprecate current autoscaling framework, remove from master > --- > > Key: SOLR-14656 > URL: https://issues.apache.org/jira/browse/SOLR-14656 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.7 > > Attachments: Screenshot from 2020-07-18 07-49-01.png > > Time Spent: 3h > Remaining Estimate: 0h > > The autoscaling framework is being re-designed in SOLR-14613 (SIP: > https://cwiki.apache.org/confluence/display/SOLR/SIP-8+Autoscaling+policy+engine+V2). > The current autoscaling framework is very inefficient, improperly designed > and too bloated and doesn't receive the level of support we aspire to provide > for all components that we ship. > This issue is to deprecate current autoscaling framework in 8x, so we can > focus on the new autoscaling framework afresh. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14656) Deprecate current autoscaling framework, remove from master
[ https://issues.apache.org/jira/browse/SOLR-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212246#comment-17212246 ] ASF subversion and git services commented on SOLR-14656: Commit ae26be479c743d7c60ac28e050d491e91291f85a in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ae26be4 ] SOLR-14656: Add a deprecation notice to CHANGES.txt. > Deprecate current autoscaling framework, remove from master > --- > > Key: SOLR-14656 > URL: https://issues.apache.org/jira/browse/SOLR-14656 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Assignee: Andrzej Bialecki >Priority: Blocker > Fix For: 8.7 > > Attachments: Screenshot from 2020-07-18 07-49-01.png > > Time Spent: 3h > Remaining Estimate: 0h > > The autoscaling framework is being re-designed in SOLR-14613 (SIP: > https://cwiki.apache.org/confluence/display/SOLR/SIP-8+Autoscaling+policy+engine+V2). > The current autoscaling framework is very inefficient, improperly designed > and too bloated and doesn't receive the level of support we aspire to provide > for all components that we ship. > This issue is to deprecate current autoscaling framework in 8x, so we can > focus on the new autoscaling framework afresh. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] barrotsteindev commented on pull request #1970: SOLR-14869: do not add deleted docs in child doc transformer
barrotsteindev commented on pull request #1970: URL: https://github.com/apache/lucene-solr/pull/1970#issuecomment-706953245 Thanks @dsmiley This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on pull request #1972: SOLR-14915: Prometheus-exporter does not depend on Solr-core any longer
dweiss commented on pull request #1972: URL: https://github.com/apache/lucene-solr/pull/1972#issuecomment-706949342 Well, yes. I know it's tempting to use gradle's "ready-to-use" plugins but the situation is different when you have a stand-alone project and a multi-module aggregate. Let me explain. gradle plugins add tons of stuff (configurations, dependencies, tasks, conventions) and hide what's going on behind the scenes. This has pros and cons. While you can use the automatically-added 'run' task, the subproject is also populated with stuff that will increase the overall build time (like the stand-alone distributions on assemble) and may conflict with other parts of the multi-module build. For this reason I typically prefer to be explicit about packaging and distribution aspects, even if some things need to be done manually. My preference would be to just keep using java-library and add corresponding entries to the JAR's manifest; the "run" task is still achievable with plain java-library - Lucene's Luke configuration has an example of how this can be done. Please do as you please though - I don't even know what this particular contrib is :) As for your question why not all JARs are copied - this is a larger question that applies to the current Solr packaging. I tried to emulate the way ant worked (in packaging.gradle) - this means copying the module's "unique" set of JARs to avoid duplication. This isn't easy and makes configuration management fairly hairy. I think stand-alone tools like prometheus-exporter or luke should have full executable distribution in the binary release - even if this means duplicating some JARs (including Lucene JARs). This makes it easier to reason about what they need, makes the configuration simpler... at the cost of increased distribution size. The alternative is to hand-edit the build and include what's needed or hand-edit the scripts to point at the right JARs. 
I'm really not sure which way is right (I believe the full set of artifacts required to run a stand-alone tool is better but it's my personal opinion). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
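The "run task with plain java-library" approach mentioned above can be sketched as a manually registered `JavaExec` task, instead of applying the `application` plugin. This is a hypothetical fragment (task name, main class, and layout are illustrative, not the actual Luke or prometheus-exporter build configuration):

```groovy
plugins {
  id 'java-library'
}

// Manual 'run' task: gives the convenience of the application plugin's
// entry point without pulling in its distribution/assemble machinery.
tasks.register('run', JavaExec) {
  group = 'application'
  description = 'Runs the tool without applying the application plugin.'
  classpath = sourceSets.main.runtimeClasspath
  mainClass = 'org.example.Main' // hypothetical entry point
}
```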
[jira] [Resolved] (LUCENE-9562) Unify 'analysis' package with produced artifact names
[ https://issues.apache.org/jira/browse/LUCENE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9562. - Fix Version/s: master (9.0) Resolution: Fixed > Unify 'analysis' package with produced artifact names > - > > Key: LUCENE-9562 > URL: https://issues.apache.org/jira/browse/LUCENE-9562 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Fix For: master (9.0) > > Time Spent: 40m > Remaining Estimate: 0h > > Lucene has 'analysis' module but its sub-modules produce 'lucene-analyzers-*' > artifacts. This inconsistency is currently handled by setting artifact names > manually: > {code} > configure(subprojects.findAll { it.path.contains(':lucene:analysis:') }) { > project.archivesBaseName = project.archivesBaseName.replace("-analysis-", > "-analyzers-") > } > {code} > but I keep wondering if we should just make it one or the other - either > rename 'analysis' to 'analyzers' or produce 'lucene-analysis-' artifacts. > My personal opinion is to produce 'lucene-analysis-' packages because this > keeps repository structure the same (backports will be easier) and we're > targeting a major release anyway so people can adjust dependency names when > upgrading. This change would be also consistent with package naming inside > those modules. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9562) Unify 'analysis' package with produced artifact names
[ https://issues.apache.org/jira/browse/LUCENE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212178#comment-17212178 ] ASF subversion and git services commented on LUCENE-9562: - Commit c5cf13259e36582270042c46c9106138aadcb1d0 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c5cf132 ] LUCENE-9562: All binary analysis packages (and corresponding Maven artifacts) with names containing '-analyzers-' have been renamed to '-analysis-'. (#1968) > Unify 'analysis' package with produced artifact names > - > > Key: LUCENE-9562 > URL: https://issues.apache.org/jira/browse/LUCENE-9562 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Fix For: master (9.0) > > Time Spent: 40m > Remaining Estimate: 0h > > Lucene has 'analysis' module but its sub-modules produce 'lucene-analyzers-*' > artifacts. This inconsistency is currently handled by setting artifact names > manually: > {code} > configure(subprojects.findAll { it.path.contains(':lucene:analysis:') }) { > project.archivesBaseName = project.archivesBaseName.replace("-analysis-", > "-analyzers-") > } > {code} > but I keep wondering if we should just make it one or the other - either > rename 'analysis' to 'analyzers' or produce 'lucene-analysis-' artifacts. > My personal opinion is to produce 'lucene-analysis-' packages because this > keeps repository structure the same (backports will be easier) and we're > targeting a major release anyway so people can adjust dependency names when > upgrading. This change would be also consistent with package naming inside > those modules. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss merged pull request #1968: LUCENE-9562: Unify 'analysis' package with produced artifact names (-analyzers- to -analysis-)
dweiss merged pull request #1968: URL: https://github.com/apache/lucene-solr/pull/1968 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6831) Review LinkedList usage
[ https://issues.apache.org/jira/browse/LUCENE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212176#comment-17212176 ] ASF subversion and git services commented on LUCENE-6831: - Commit 7362c4ce603d81f276a83070e51bda52c3528bd7 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7362c4c ] LUCENE-6831: start removing LinkedList in favor of ArrayList or De/Queues (#1969) I'm committing it in, seems like a trivial thing. > Review LinkedList usage > --- > > Key: LUCENE-6831 > URL: https://issues.apache.org/jira/browse/LUCENE-6831 > Project: Lucene - Core > Issue Type: Task >Reporter: Dawid Weiss >Priority: Trivial > Fix For: 6.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I quickly scanned the code (out of curiosity) and most of the use cases of > LinkedList are as a Queue, in which case indeed an ArrayDeque would be a > better choice, especially if the maximum size is known in advance. > There are also some invalid/ incorrect uses like calling size() on a linked > list in {{MultiPhraseQueryNodeBuilder}}, which should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
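The swap described in the issue above, for code that only uses a `LinkedList` through the `Queue`/`Deque` interface, looks like this. The loop below is a generic stand-in for the kind of queue-driven traversal being updated, not a specific Lucene call site:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class QueueExample {
  public static void main(String[] args) {
    // Before: Deque<Integer> queue = new LinkedList<>();
    // ArrayDeque is the more efficient drop-in choice here: better memory
    // locality and no per-element node allocation.
    Deque<Integer> queue = new ArrayDeque<>();
    queue.add(1);
    queue.add(2);
    queue.add(3);

    int sum = 0;
    while (!queue.isEmpty()) {
      sum += queue.poll(); // FIFO order, same as LinkedList-as-Queue
    }
    System.out.println(sum);
  }
}
```

Note that `ArrayDeque` rejects null elements, so the swap only applies where nulls are never queued.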
[GitHub] [lucene-solr] dweiss merged pull request #1969: LUCENE-6831: start removing LinkedList in favor of ArrayList or De/Queues
dweiss merged pull request #1969: URL: https://github.com/apache/lucene-solr/pull/1969 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org