[jira] [Commented] (SOLR-14282) /get handler doesn't return copied fields

2020-10-12 Thread Andrei Minin (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212845#comment-17212845
 ] 

Andrei Minin commented on SOLR-14282:
-

Sure, I am interested in working on a PR - I need some time to set up a SOLR 
environment and test app.

> /get handler doesn't return copied fields
> -
>
> Key: SOLR-14282
> URL: https://issues.apache.org/jira/browse/SOLR-14282
> Project: Solr
>  Issue Type: Bug
>  Components: search, SolrJ
>Affects Versions: 8.4
> Environment: SOLR 8.4.0, SOLRJ, Oracle Java 8 
>Reporter: Andrei Minin
>Priority: Major
> Attachments: copied_fields_test.zip, managed-schema.xml
>
>
> We are using the /get handler to retrieve documents by id in our Java application 
> (SolrJ).
> I found that copied fields are missing in the documents returned by the /get handler, 
> but the same documents returned by a query contain the (schema-)copied fields.
> Attached documents:
>  # Integration test project archive
>  # Managed schema file for SOLR
> SOLR schema details:
>  # Unique field name "d_ida_s"
>  # Lowercase text type definition:
> {code:java}
> <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
>           3. Copy field instruction sample: 
> {code:java}
> <field name="ConcurrenceUserNameu_lca_s" type="lowercase" stored="true" multiValued="false"/>
> <copyField source="ConcurrenceUserNamea_s" dest="ConcurrenceUserNameu_lca_s"/>
> {code}
> ConcurrenceUserNamea_s is a string-type field and ConcurrenceUserNameu_lca_s is a 
> lowercase text-type field.
> The integration test uploads a document to the SOLR server and makes 2 requests: one 
> using the /get REST endpoint to fetch the document by id, and one using the query 
> <unique field name>:<document id>.
> The document returned by /get doesn't have the copied fields, while the document 
> returned by the query contains the copied fields.
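
A minimal SolrJ sketch of the two retrieval paths described above (the base URL, 
collection name, and document id are illustrative, not taken from the report):

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.Http2SolrClient;
import org.apache.solr.common.SolrDocument;

public class GetVsQueryDemo {
  public static void main(String[] args) throws Exception {
    try (Http2SolrClient client =
        new Http2SolrClient.Builder("http://localhost:8983/solr").build()) {
      // Path 1: real-time /get by unique key -- the report says copied fields
      // are missing here
      SolrDocument viaGet = client.getById("mycollection", "42");
      System.out.println("via /get:  " + viaGet.getFieldNames());

      // Path 2: regular query on the unique key -- copied fields are present here
      SolrDocument viaQuery = client
          .query("mycollection", new SolrQuery("d_ida_s:42"))
          .getResults().get(0);
      System.out.println("via query: " + viaQuery.getFieldNames());
    }
  }
}
{code}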



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9524) NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser

2020-10-12 Thread Zach Chen (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212830#comment-17212830
 ] 

Zach Chen commented on LUCENE-9524:
---

Thanks Adrien! I've opened a PR that performs a null check in SpanWeight#explain 
and provides an alternative explanation without a score when it is null. Please let 
me know if it looks good to you.

> NullPointerException in IndexSearcher.explain() when using 
> ComplexPhraseQueryParser
> ---
>
> Key: LUCENE-9524
> URL: https://issues.apache.org/jira/browse/LUCENE-9524
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/queryparser, core/search
>Affects Versions: 8.6, 8.6.2
>Reporter: Michał Słomkowski
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I get an NPE when I use {{IndexSearcher.explain()}}. Checked with Lucene 8.6.0
> and 8.6.2.
> The query: {{(lorem AND NOT "dolor lorem") OR ipsum}}
> The text: {{dolor lorem ipsum}}
> Stack trace:
> {code}
> java.lang.NullPointerException at 
> java.util.Objects.requireNonNull(Objects.java:203)
>   at org.apache.lucene.search.LeafSimScorer.<init>(LeafSimScorer.java:38)
>   at 
> org.apache.lucene.search.spans.SpanWeight.explain(SpanWeight.java:160)
>   at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>   at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>   at 
> org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:716)
>   at 
> org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:693)
> {code}
> Minimal example code:
> {code:java}
> StandardAnalyzer analyzer = new StandardAnalyzer();
> Query query = new ComplexPhraseQueryParser("", analyzer).parse(queryString);
> final MemoryIndex memoryIndex = new MemoryIndex(true);
> memoryIndex.addField("", text, analyzer);
> final IndexSearcher searcher = memoryIndex.createSearcher();
> final TopDocs topDocs = searcher.search(query, 1);
> final ScoreDoc match = topDocs.scoreDocs[0];
> searcher.explain(query, match.doc);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zacharymorn opened a new pull request #1978: LUCENE-9524: Fix NPE in SpanWeight#explain when no scoring is require…

2020-10-12 Thread GitBox


zacharymorn opened a new pull request #1978:
URL: https://github.com/apache/lucene-solr/pull/1978


   # Description
   `SpanWeight#explain` uses `Similarity.SimScorer` to generate explanations, 
and may fail with a NullPointerException when scoring is not needed and 
`Similarity.SimScorer` is therefore null, such as when matching a MUST_NOT 
span clause.
   
   # Solution
   The solution is to check for a null `Similarity.SimScorer` in 
`SpanWeight#explain` and provide an alternative explanation.
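
   For illustration, a minimal sketch of the shape of such a fix (this is not the 
actual patch; the `simScorer` and `field` members of SpanWeight are assumed from 
context):

{code:java}
// Inside SpanWeight -- sketch only, assuming the simScorer/field members exist:
public Explanation explain(LeafReaderContext context, int doc) throws IOException {
  SpanScorer scorer = scorer(context);
  if (scorer != null && scorer.iterator().advance(doc) == doc) {
    float freq = scorer.sloppyFreq();
    if (simScorer == null) {
      // Scores were not requested (e.g. a MUST_NOT clause): explain the match
      // without trying to build a LeafSimScorer from a null SimScorer.
      return Explanation.match(0f, "span match (scoring skipped), freq=" + freq);
    }
    LeafSimScorer docScorer = new LeafSimScorer(simScorer, context.reader(), field, true);
    return docScorer.explain(doc, Explanation.match(freq, "phraseFreq=" + freq));
  }
  return Explanation.noMatch("no matching term");
}
{code}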
   
   # Tests
   Added an integration test to `TestMemoryIndex` that fails without the fix.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1963: SOLR-14827: Refactor schema loading to not use XPath

2020-10-12 Thread GitBox


noblepaul commented on a change in pull request #1963:
URL: https://github.com/apache/lucene-solr/pull/1963#discussion_r503655774



##
File path: 
solr/core/src/java/org/apache/solr/util/plugin/AbstractPluginLoader.java
##
@@ -135,15 +134,13 @@ protected T create(SolrClassLoader loader, String name, 
String className, Node n
* If a default element is defined, it will be returned from this function.
* 
*/
-  public T load(SolrClassLoader loader, NodeList nodes )

Review comment:
   it could be





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


msokolov commented on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707431740


   > One option could be 200-dimensional GloVe word vectors, available from 
http://ann-benchmarks.com/glove-200-angular.hdf5. I think these are trained on 
Twitter data.
   
   +1 I'm looking into adding GloVe data to luceneutil benchmarks, initially 
just to index and retrieve them, then I hope to add tasks for scoring lexical 
matches, and then for knn matching. I think some of the GloVe datasets are 
trained on wikipedia (plus other text) so should be suitable for use in our 
benchmarks, which are based on wikipedia text.
   
   I think for initial performance comparisons we can use our own tool; it 
wouldn't be as nicely controlled as running in the same framework, but if we 
are careful the results should be comparable. And it's good to know there is a 
reasonable path for integrating with ann-benchmark using py4j, and I hadn't 
realized there was a --batch option.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jtibshirani edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


jtibshirani edited a comment on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707404485


   >  It seems to expect your algorithm to be delivered as an in-process 
extension to Python, which works OK for a native code library, but I'm not sure 
how we'd present Lucene to it. We don't want to have to call through a network 
API?
   
   I ended up using `py4j` to call out to Lucene, which sets up a 'gateway 
server' and passes data between the Python + Java processes through a socket. I 
found there to be a significant overhead from converting between Python <-> 
Java, but this can be largely mitigated by making sure to use 'batch mode' (the 
`--batch` option), which allows all query vectors to be passed to Lucene at 
once. Amortizing the overhead this way, I was able to get consistent + 
informative results. Let me know if you're interested in trying the py4j option 
and I can post set-up steps. I found it helpful while developing but it's quite 
tricky and maybe shouldn't be the main way to track performance right now (as 
you mentioned) !
   
   A note that it's possible to use vector data from ann-benchmarks without 
integrating with the framework. The datasets are listed 
[here](https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/datasets.py#L396)
 and made available on the website in hdf5 format. One option could be 
200-dimensional GloVe word vectors, available from 
`http://ann-benchmarks.com/glove-200-angular.hdf5`. I think these are trained 
on Twitter data.
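
   For readers who want to try the same setup, a rough sketch of the Java side of 
such a py4j bridge (py4j's GatewayServer is real; the entry-point class and its 
batchSearch method are made up for illustration):

{code:java}
import py4j.GatewayServer;

public class LuceneVectorGateway {
  // Exposed to Python. Scoring a whole batch of query vectors per call is what
  // amortizes the Python<->Java conversion overhead described above.
  public int[][] batchSearch(float[][] queryVectors, int topK) {
    int[][] results = new int[queryVectors.length][];
    for (int i = 0; i < queryVectors.length; i++) {
      results[i] = new int[topK]; // placeholder: real code would run a KNN search
    }
    return results;
  }

  public static void main(String[] args) {
    // The Python side connects with: py4j.java_gateway.JavaGateway()
    new GatewayServer(new LuceneVectorGateway()).start();
  }
}
{code}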



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jtibshirani commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


jtibshirani commented on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707404485


   >  It seems to expect your algorithm to be delivered as an in-process 
extension to Python, which works OK for a native code library, but I'm not sure 
how we'd present Lucene to it. We don't want to have to call through a network 
API?
   
   I ended up using `py4j` to call out to Lucene, which sets up a 'gateway 
server' and passes data between the Python + Java processes through a socket. I 
did find there to be a significant overhead just from converting between Python 
<-> Java, but it can be largely mitigated by making sure to use 'batch mode' (the 
`--batch` option), which allows all query vectors to be passed to Lucene at 
once. Amortizing the overhead this way, I was able to get consistent + 
informative results. Let me know if you're interested in trying the py4j option 
and I can post set-up steps. I found it helpful while developing but it's quite 
tricky and maybe shouldn't be the main way to track performance right now (as 
you mentioned) !
   
   A note that it's possible to use vector data from ann-benchmarks without 
integrating with the framework. The datasets are listed 
[here](https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/datasets.py#L396)
 and made available on the website in hdf5 format. One option could be 
200-dimensional GloVe word vectors, available from 
`http://ann-benchmarks.com/glove-200-angular.hdf5`. I think these are trained 
on Twitter data.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9568) FuzzyTermEnums sets negative boost for fuzzy search & highlight

2020-10-12 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212738#comment-17212738
 ] 

Robert Muir commented on LUCENE-9568:
-

It will break most of the top-N optimizations of the query. If you use the max 
term length to compute the boost, then the PQ can never optimize, because there 
could always exist some unseen term with a length of, say, 1MB (just for 
illustration) which would rank extremely high. 

By using the min, it is bounded by the query term, and once that PQ fills with 
ed=2, the query can restrict itself further and only look for ed=1, ed=0, etc. 
Practically it is this way because that's how fuzzy was defined, with the min 
and the tie-breaker after this boost being term sort order (you can check the 
code of an ancient version like 2.9 to see: 
https://github.com/apache/lucene-solr/blob/releases/lucene/2.9.2/src/java/org/apache/lucene/search/FuzzyTermEnum.java#L189
 ). We just exploited it for all it's worth. It is especially important for 
small PQ (top-N) sizes such as spell checking.
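
For reference, the boost computation in that 2.9-era code, paraphrased (names 
simplified; see the linked FuzzyTermEnum source):

{code:java}
// Paraphrased from the linked 2.9-era FuzzyTermEnum. Dividing by the *minimum*
// of the two lengths bounds the boost by the query term, which is what lets the
// PQ restrict itself to ed=1, ed=0 once it fills -- and also what goes negative
// when the edit distance exceeds the shorter length (LUCENE-9568).
static float fuzzyScore(int editDistance, int queryTermLen, int indexTermLen) {
  return 1.0f - ((float) editDistance / (float) Math.min(queryTermLen, indexTermLen));
}
// e.g. fuzzyScore(2, 2, 7) == 0.0f, but fuzzyScore(3, 2, 7) == -0.5f
{code}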

Hopefully I have explained it ok, this thing is hairy :) Happy to try again if 
needed.

I think first we should make a test, ideally one that doesn't use highlighting? 
I think there should be an alternative, simpler fix that won't break the top-N 
optimization.

> FuzzyTermEnums sets negative boost for fuzzy search & highlight
> ---
>
> Key: LUCENE-9568
> URL: https://issues.apache.org/jira/browse/LUCENE-9568
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 8.5.1
>Reporter: Juraj Jurčo
>Priority: Minor
>  Labels: highlighting, newbie
> Attachments: FindSqlHighlightTest.java
>
>
> *Description*
>  When a user indexes a word with an apostrophe and constructs a fuzzy query for 
> the highlighter, it throws an exception because a negative boost is set on the 
> query. 
> *Repro Steps*
>  # Index a text with an apostrophe, e.g. doesn't
>  # Parse a fuzzy query, e.g.: se~, se~2, se~3
>  # Try to highlight a text with an apostrophe
>  # The exception is thrown (for details see the attached test with repro 
> steps)
> *Actual Result*
>  {{java.lang.IllegalArgumentException: boost must be a positive float, got 
> -1.0}}
> *Expected Result*
>  * No exception.
>  * Highlighting marks are inserted into a text.
> *Workaround*
>  - not known.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jtibshirani edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


jtibshirani edited a comment on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-706813811


   > Hmm I tried to get that benchmarking suite to run and it requires some 
major Python-fu.
   
   I managed to get this working a few months ago while experimenting with a 
clustering-based approach: 
https://github.com/jtibshirani/ann-benchmarks/pull/2. It indeed involved a lot 
of set-up -- I can try to get it working again and post results. Going forward, 
I think it will be helpful to use ann-benchmarks to compare recall and QPS 
against the ANN reference implementations.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] HoustonPutman commented on a change in pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API

2020-10-12 Thread GitBox


HoustonPutman commented on a change in pull request #1977:
URL: https://github.com/apache/lucene-solr/pull/1977#discussion_r503580189



##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
##
@@ -170,10 +170,14 @@ private void handleConfigUploadRequest(SolrQueryRequest 
req, SolrQueryResponse r
 
 boolean overwritesExisting = zkClient.exists(configPathInZk, true);
 
-if (overwritesExisting && 
!req.getParams().getBool(ConfigSetParams.OVERWRITE, false)) {
-  throw new SolrException(ErrorCode.BAD_REQUEST,
-  "The configuration " + configSetName + " already exists in 
zookeeper");
-}
+// Get upload parameters
+String singleFilePath = req.getParams().get(ConfigSetParams.FILE_PATH, "");
+boolean allowOverwrite = 
req.getParams().getBool(ConfigSetParams.OVERWRITE, false);
+// Cleanup is not allowed while using singleFilePath upload
+boolean cleanup = singleFilePath.isEmpty() && 
req.getParams().getBool(ConfigSetParams.CLEANUP, false);

Review comment:
   Added error handling for that and cases where a bad filePath is given.
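
   For context, hypothetical client-side usage of the single-file upload this PR 
proposes (the filePath/overwrite parameter names follow the diff above, but may 
change before merge):

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class SingleFileConfigUpload {
  public static void main(String[] args) throws Exception {
    HttpClient http = HttpClient.newHttpClient();
    // Upload one file into an existing configSet instead of a full zip.
    HttpRequest req = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8983/solr/admin/configs"
            + "?action=UPLOAD&name=myConfig&filePath=synonyms.txt&overwrite=true"))
        .header("Content-Type", "application/octet-stream")
        .POST(HttpRequest.BodyPublishers.ofFile(Path.of("synonyms.txt")))
        .build();
    System.out.println(http.send(req, HttpResponse.BodyHandlers.ofString()).body());
  }
}
{code}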





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API

2020-10-12 Thread GitBox


tflobbe commented on a change in pull request #1977:
URL: https://github.com/apache/lucene-solr/pull/1977#discussion_r503567179



##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java
##
@@ -170,10 +170,14 @@ private void handleConfigUploadRequest(SolrQueryRequest 
req, SolrQueryResponse r
 
 boolean overwritesExisting = zkClient.exists(configPathInZk, true);
 
-if (overwritesExisting && 
!req.getParams().getBool(ConfigSetParams.OVERWRITE, false)) {
-  throw new SolrException(ErrorCode.BAD_REQUEST,
-  "The configuration " + configSetName + " already exists in 
zookeeper");
-}
+// Get upload parameters
+String singleFilePath = req.getParams().get(ConfigSetParams.FILE_PATH, "");
+boolean allowOverwrite = 
req.getParams().getBool(ConfigSetParams.OVERWRITE, false);
+// Cleanup is not allowed while using singleFilePath upload
+boolean cleanup = singleFilePath.isEmpty() && 
req.getParams().getBool(ConfigSetParams.CLEANUP, false);

Review comment:
   should we error instead of silently ignoring the `cleanup` param? it 
defaults to `false`, so someone must have explicitly set it to `true`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9536) Optimize OrdinalMap when one segment contains all distinct values?

2020-10-12 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212713#comment-17212713
 ] 

Julie Tibshirani commented on LUCENE-9536:
--

I opened a pull request implementing the idea. It was indeed simple + fast to 
detect.
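
The detection amounts to a single comparison; a rough sketch of the idea 
(variable names invented here, not taken from the PR):

{code:java}
// If the first (largest) segment already contains every distinct value, then
// each global ordinal equals its ordinal in segment 0, so both mappings are
// all-zero and can be shared constants instead of packed arrays.
if (segmentValueCounts[0] == globalValueCount) {
  globalOrdDeltas = LongValues.ZEROES; // global ord == segment-0 ord
  firstSegments = LongValues.ZEROES;   // every value first appears in segment 0
}
{code}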

> Optimize OrdinalMap when one segment contains all distinct values?
> --
>
> Key: LUCENE-9536
> URL: https://issues.apache.org/jira/browse/LUCENE-9536
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Julie Tibshirani
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For doc values that are not too high cardinality, it seems common to have 
> some large segments that contain all distinct values (plus many small 
> segments that are missing some values). In this case, we could check if the 
> first segment's ords map perfectly to global ords and, if so, store 
> `globalOrdDeltas` and `firstSegments` as `LongValues.ZEROES`. This could save 
> a small amount of space.
> I don't think it would help a huge amount, especially since the optimization 
> might only kick in with small/medium cardinalities, which don't create huge 
> `OrdinalMap` instances anyway? But it is simple and seemed worth mentioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9568) FuzzyTermEnums sets negative boost for fuzzy search & highlight

2020-10-12 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212703#comment-17212703
 ] 

Adrien Grand commented on LUCENE-9568:
--

This will change the top hits returned by fuzzy queries, so I suspect that some 
users will be a bit angry, but I can't think of a reason why minTermLength 
makes more sense than maxTermLength, so +1 to this suggestion to avoid the case 
where the edit distance is greater than the minimum term length.

> FuzzyTermEnums sets negative boost for fuzzy search & highlight
> ---
>
> Key: LUCENE-9568
> URL: https://issues.apache.org/jira/browse/LUCENE-9568
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 8.5.1
>Reporter: Juraj Jurčo
>Priority: Minor
>  Labels: highlighting, newbie
> Attachments: FindSqlHighlightTest.java
>
>
> *Description*
>  When a user indexes a word with an apostrophe and constructs a fuzzy query for 
> the highlighter, it throws an exception because a negative boost is set on the 
> query. 
> *Repro Steps*
>  # Index a text with an apostrophe, e.g. doesn't
>  # Parse a fuzzy query, e.g.: se~, se~2, se~3
>  # Try to highlight a text with an apostrophe
>  # The exception is thrown (for details see the attached test with repro 
> steps)
> *Actual Result*
>  {{java.lang.IllegalArgumentException: boost must be a positive float, got 
> -1.0}}
> *Expected Result*
>  * No exception.
>  * Highlighting marks are inserted into a text.
> *Workaround*
>  - not known.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14282) /get handler doesn't return copied fields

2020-10-12 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212653#comment-17212653
 ] 

David Eric Pugh commented on SOLR-14282:


I just ran into this issue as well.   If you are interested in working on a PR, 
I'd love to work on it with you.

> /get handler doesn't return copied fields
> -
>
> Key: SOLR-14282
> URL: https://issues.apache.org/jira/browse/SOLR-14282
> Project: Solr
>  Issue Type: Bug
>  Components: search, SolrJ
>Affects Versions: 8.4
> Environment: SOLR 8.4.0, SOLRJ, Oracle Java 8 
>Reporter: Andrei Minin
>Priority: Major
> Attachments: copied_fields_test.zip, managed-schema.xml
>
>
> We are using the /get handler to retrieve documents by id in our Java application 
> (SolrJ).
> I found that copied fields are missing in the documents returned by the /get handler, 
> but the same documents returned by a query contain the (schema-)copied fields.
> Attached documents:
>  # Integration test project archive
>  # Managed schema file for SOLR
> SOLR schema details:
>  # Unique field name "d_ida_s"
>  # Lowercase text type definition:
> {code:java}
> <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
>           3. Copy field instruction sample: 
> {code:java}
> <field name="ConcurrenceUserNameu_lca_s" type="lowercase" stored="true" multiValued="false"/>
> <copyField source="ConcurrenceUserNamea_s" dest="ConcurrenceUserNameu_lca_s"/>
> {code}
> ConcurrenceUserNamea_s is a string-type field and ConcurrenceUserNameu_lca_s is a 
> lowercase text-type field.
> The integration test uploads a document to the SOLR server and makes 2 requests: one 
> using the /get REST endpoint to fetch the document by id, and one using the query 
> <unique field name>:<document id>.
> The document returned by /get doesn't have the copied fields, while the document 
> returned by the query contains the copied fields.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-9207) PeerSync recovery fails if number of updates requested is high

2020-10-12 Thread Evgeny Ivanskiy (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212621#comment-17212621
 ] 

Evgeny Ivanskiy commented on SOLR-9207:
---

Hi [~praste], [~shalin].

We are seeing an intermittent issue where some of our collections fail to elect 
a leader after a restart. The log shows that hosts are failing to become leader 
due to a sync failure, and that the sync failure is due to not receiving the 
expected number of updates.
Investigation shows that there are duplicate versions in the tlogs.
So, in this case: if we get the versions *1,1,2,2,3,3*, we then request the 
updates in range *1...3*. As a result we get *3 updates*, but 
*totalRequestedUpdates is 6*, and the sync fails.
Is there an assumption that getVersions should return distinct values, or is this 
a bug in PeerSync.handleVersionsWithRanges, which doesn't take duplicate versions 
into account?
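
The mismatch is easy to reproduce with plain Java collections (a self-contained 
illustration, not PeerSync's actual code):

{code:java}
import java.util.List;
import java.util.TreeSet;

public class DuplicateVersionsDemo {
  public static void main(String[] args) {
    List<Long> reportedVersions = List.of(1L, 1L, 2L, 2L, 3L, 3L);

    int totalRequestedUpdates = reportedVersions.size();      // 6 -- what sync expects
    TreeSet<Long> distinct = new TreeSet<>(reportedVersions); // {1, 2, 3}
    int updatesReturnedByRange = distinct.size();             // 3 -- what range 1...3 yields

    System.out.println("expected " + totalRequestedUpdates
        + " but got " + updatesReturnedByRange + " -> sync fails");
  }
}
{code}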

 

 

> PeerSync recovery fails if number of updates requested is high
> --
>
> Key: SOLR-9207
> URL: https://issues.apache.org/jira/browse/SOLR-9207
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1, 6.0
>Reporter: Pushkar Raste
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 6.2, 7.0
>
> Attachments: SOLR-9207.patch, SOLR-9207.patch, SOLR-9207.patch_updated
>
>
> {{PeerSync}} recovery fails if we request more than ~99K updates. 
> We updated solrconfig to retain more {{tlogs}} to leverage 
> https://issues.apache.org/jira/browse/SOLR-6359
> During our testing we found that recovery using {{PeerSync}} fails if we 
> ask for more than ~99K updates, with the following error:
> {code}
>  WARN  PeerSync [RecoveryThread] - PeerSync: core=hold_shard1 url=
> exception talking to , failed
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> Expected mime type application/octet-stream but got application/xml. 
> <response>
>   <lst name="error">
>     <str name="msg">application/x-www-form-urlencoded content 
> length (4761994 bytes) exceeds upload limit of 2048 KB</str>
>     <int name="code">400</int>
>   </lst>
> </response>
> {code}
> We arrived at ~99K with the following math:
> * max_version_number = Long.MAX_VALUE = 9223372036854775807  
> * bytes per version number =  20 (on the wire as POST request sends version 
> number as string)
> * additional bytes for separator ,
> * max_versions_in_single_request = 2MB/21 = ~99864
> I could think of 2 ways to fix it:
> 1. Ask for updates in chunks of ~90K inside {{PeerSync.requestUpdates()}}
> 2. Use application/octet-stream encoding 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212597#comment-17212597
 ] 

Thomas Wöckinger commented on SOLR-14923:
-

[~dsmiley], [~noble], [~ab] maybe one of you can help; you were involved in the 
last changes to the DistributedUpdateProcessor.

Thanks for your help.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the method doVersionAdd whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called; this call is very expensive, 
> and it is executed in a synchronized block of the UpdateLog instance, so all 
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) runs in a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, 
> so it makes no difference whether 'waitFlush', 'waitSearcher' or 
> 'softCommit' is true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14925) CVE-2020-13957: The checks added to unauthenticated configset uploads can be circumvented

2020-10-12 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe updated SOLR-14925:
-
Security: Public  (was: Private (Security Issue))

> CVE-2020-13957: The checks added to unauthenticated configset uploads can be 
> circumvented
> -
>
> Key: SOLR-14925
> URL: https://issues.apache.org/jira/browse/SOLR-14925
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 7.0, 7.0.1, 7.1, 
> 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, 7.7.2, 8.0, 8.1, 8.2, 
> 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.6, 8.5.1, 8.5.2, 8.6.1, 8.6.2
>Reporter: Tomas Eduardo Fernandez Lobbe
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Major
> Fix For: master (9.0), 8.7, 8.6.3
>
>
> Severity: High
> Vendor: The Apache Software Foundation
> Versions Affected:
> 6.6.0 to 6.6.5
> 7.0.0 to 7.7.3
> 8.0.0 to 8.6.2
> Description:
> Solr prevents some features considered dangerous (which could be used for 
> remote code execution) to be configured in a ConfigSet that's uploaded via 
> API without authentication/authorization. The checks in place to prevent such 
> features can be circumvented by using a combination of UPLOAD/CREATE actions.
> Mitigation:
> Any of the following are enough to prevent this vulnerability:
> * Disable UPLOAD command in ConfigSets API if not used by setting the system 
> property: {{configset.upload.enabled}} to {{false}} [1]
> * Use Authentication/Authorization and make sure unknown requests aren't 
> allowed [2]
> * Upgrade to Solr 8.6.3 or greater.
> * If upgrading is not an option, consider applying the patch in SOLR-14663 
> ([3])
> * No Solr API, including the Admin UI, is designed to be exposed to 
> non-trusted parties. Tune your firewall so that only trusted computers and 
> people are allowed access
> Credit:
> Tomás Fernández Löbbe, András Salamon
> References:
> [1] https://lucene.apache.org/solr/guide/8_6/configsets-api.html
> [2] 
> https://lucene.apache.org/solr/guide/8_6/authentication-and-authorization-plugins.html
> [3] https://issues.apache.org/jira/browse/SOLR-14663
> [4] https://issues.apache.org/jira/browse/SOLR-14925
> [5] https://wiki.apache.org/solr/SolrSecurity



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212567#comment-17212567
 ] 

Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 6:19 PM:


Overall index time for a complete reindex is reduced from 2134 minutes to 83 minutes 
when simply commenting out ulog.openRealtimeSearcher and using 12 threads.

So somehow we should find a solution for this, because the difference is nearly 
a factor of 26!


was (Author: thomas.woeckinger):
Overall index time for a complete reindex is reduced from 2134 minutes to 83 minutes 
when simply commenting out ulog.openRealtimeSearcher and using 12 threads.

So somehow we should find a solution for this.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the method doVersionAdd whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called; this call is very expensive, 
> and it is executed in a synchronized block of the UpdateLog instance, so all 
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) runs in a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, 
> so it makes no difference whether 'waitFlush', 'waitSearcher' or 
> 'softCommit' is true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14907) Support single file upload/overwrite in configSet API

2020-10-12 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman reassigned SOLR-14907:
-

Assignee: Houston Putman

> Support single file upload/overwrite in configSet API
> -
>
> Key: SOLR-14907
> URL: https://issues.apache.org/jira/browse/SOLR-14907
> Project: Solr
>  Issue Type: Improvement
>  Components: configset-api
>Reporter: Houston Putman
>Assignee: Houston Putman
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After SOLR-10391 was implemented, users are now able to overwrite existing 
> configSets using the configSet API. However, the uploaded files are still 
> required to be zipped and are indexed from the base configSet path in ZK. Users 
> might want to just update a single file, such as a synonyms list, and not 
> have to zip it up first.
> The proposed solution is to add parameters to the UPLOAD configSet action to 
> allow this single-file use case. This would utilize the protections already 
> provided by the API, such as maintaining the trustiness of the configSets being 
> modified.
> This feature is part of the solution to replace managed resources, which are 
> planned to be deprecated and removed by 9.0 (SOLR-14766).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212567#comment-17212567
 ] 

Thomas Wöckinger commented on SOLR-14923:
-

Overall index time for a complete reindex is reduced from 2134 minutes to 83 minutes 
when simply commenting out ulog.openRealtimeSearcher and using 12 threads.

So somehow we should find a solution for this.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the method doVersionAdd whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called; this call is very expensive, 
> and it is executed in a synchronized block of the UpdateLog instance, so all 
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) runs in a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, 
> so it makes no difference whether 'waitFlush', 'waitSearcher' or 
> 'softCommit' is true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #1976: LUCENE-9578: TermRangeQuery empty string lower bound edge case

2020-10-12 Thread GitBox


jpountz commented on a change in pull request #1976:
URL: https://github.com/apache/lucene-solr/pull/1976#discussion_r503439534



##
File path: lucene/core/src/java/org/apache/lucene/util/automaton/Automata.java
##
@@ -254,7 +254,7 @@ public static Automaton makeBinaryInterval(BytesRef min, 
boolean minInclusive, B
   cmp = min.compareTo(max);
 } else {
   cmp = -1;
-  if (min.length == 0 && minInclusive) {
+  if (min.length == 0) {

Review comment:
   this looks wrong as we should still make sure that the empty string is 
rejected if min=="" and minInclusive==false?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] HoustonPutman opened a new pull request #1977: SOLR-14907: Support single file upload/overwrite in configSet API

2020-10-12 Thread GitBox


HoustonPutman opened a new pull request #1977:
URL: https://github.com/apache/lucene-solr/pull/1977


   https://issues.apache.org/jira/browse/SOLR-14907



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9524) NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser

2020-10-12 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212529#comment-17212529
 ] 

Adrien Grand commented on LUCENE-9524:
--

Thanks for digging into this! I think it's a bug in SpanWeight#explain, which should 
keep working even if scores are not requested. I guess we could either create a 
dummy {{Similarity.SimScorer}} when scores are not requested to make sure that 
the explain logic keeps working, or change the explain() logic to keep working 
when the simScorer is null?

> NullPointerException in IndexSearcher.explain() when using 
> ComplexPhraseQueryParser
> ---
>
> Key: LUCENE-9524
> URL: https://issues.apache.org/jira/browse/LUCENE-9524
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/queryparser, core/search
>Affects Versions: 8.6, 8.6.2
>Reporter: Michał Słomkowski
>Priority: Major
>
> I get an NPE when I use {{IndexSearcher.explain()}}. Checked with Lucene 8.6.0
> and 8.6.2.
> The query: {{(lorem AND NOT "dolor lorem") OR ipsum}}
> The text: {{dolor lorem ipsum}}
> Stack trace:
> {code}
> java.lang.NullPointerException at 
> java.util.Objects.requireNonNull(Objects.java:203)
>   at org.apache.lucene.search.LeafSimScorer.<init>(LeafSimScorer.java:38)
>   at 
> org.apache.lucene.search.spans.SpanWeight.explain(SpanWeight.java:160)
>   at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>   at org.apache.lucene.search.BooleanWeight.explain(BooleanWeight.java:87)
>   at 
> org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:716)
>   at 
> org.apache.lucene.search.IndexSearcher.explain(IndexSearcher.java:693)
> {code}
> Minimal example code:
> {code:java}
> StandardAnalyzer analyzer = new StandardAnalyzer();
> Query query = new ComplexPhraseQueryParser("", analyzer).parse(queryString);
> final MemoryIndex memoryIndex = new MemoryIndex(true);
> memoryIndex.addField("", text, analyzer);
> final IndexSearcher searcher = memoryIndex.createSearcher();
> final TopDocs topDocs = searcher.search(query, 1);
> final ScoreDoc match = topDocs.scoreDocs[0];
> searcher.explain(query, match.doc);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cbuescher opened a new pull request #1976: LUCENE-9578: TermRangeQuery empty string lower bound edge case

2020-10-12 Thread GitBox


cbuescher opened a new pull request #1976:
URL: https://github.com/apache/lucene-solr/pull/1976


   # Description
   
   Currently a TermRangeQuery with the empty String ("") as lower bound and
   includeLower=false internally constructs an Automaton that doesn't match
   anything. This is unexpected, especially for open upper bounds, where any
   string should be considered to be "higher" than the empty string.
   
   # Solution
   
   This PR changes `Automata#makeBinaryInterval` so that for an empty string
   lower bound and an open upper bound, any string matches the query regardless
   of the includeLower flag.
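
   A test-style sketch of the edge case, assuming the behavior proposed here:

{code:java}
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.automaton.Automata;
import org.apache.lucene.util.automaton.Automaton;
import org.apache.lucene.util.automaton.ByteRunAutomaton;

public class EmptyLowerBoundDemo {
  public static void main(String[] args) {
    // Empty-string lower bound, exclusive; null max == open upper bound.
    Automaton a = Automata.makeBinaryInterval(new BytesRef(""), false, null, true);
    ByteRunAutomaton run = new ByteRunAutomaton(a);
    BytesRef term = new BytesRef("abc");
    // Before this change the automaton accepted nothing; with it, any
    // non-empty term such as "abc" should be accepted.
    System.out.println(run.run(term.bytes, term.offset, term.length));

    // The equivalent query-level construction:
    System.out.println(TermRangeQuery.newStringRange("field", "", null, false, true));
  }
}
{code}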
   
   # Tests
   
   Added two new tests to `TestAutomaton`.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`
   - [x] I have added tests for my changes.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-12 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14870:
--
Fix Version/s: master (9.0)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-14870.patch, SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
> <target name="documentation" depends="javadocs,changes-to-html,process-webpages">
>   <ant dir="solr-ref-guide" target="bare-bones-html-validation" inheritall="false">
>     <propertyset>
>       <propertyref name="local.javadocs"/>
>     </propertyset>
>   </ant>
> </target>
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14870) gradle build does not validate ref-guide -> javadoc links

2020-10-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212449#comment-17212449
 ] 

ASF subversion and git services commented on SOLR-14870:


Commit b4f044219319fc0a0a94b92e2d90a6b25dae9de0 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b4f0442 ]

SOLR-14870: refactor ref-guide build.gradle logic to re-enable guide->javadoc 
link checking

fix 'broken' javadoc links in ref-guide to match new documentation path 
structures for 9.x


> gradle build does not validate ref-guide -> javadoc links
> -
>
> Key: SOLR-14870
> URL: https://issues.apache.org/jira/browse/SOLR-14870
> Project: Solr
>  Issue Type: Task
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14870.patch, SOLR-14870.patch
>
>
> the ant build had (has on 8x) a feature that ensured we didn't have any 
> broken links between the ref guide and the javadocs...
> {code}
> <target name="documentation" depends="javadocs,changes-to-html,process-webpages">
>   <ant dir="solr-ref-guide" target="bare-bones-html-validation" inheritall="false">
>     <propertyset>
>       <propertyref name="local.javadocs"/>
>     </propertyset>
>   </ant>
> </target>
> {code}
> ...by default {{cd solr/solr-ref-guide && ant bare-bones-html-validation}} 
> just did internal validation of the structure of the guide, but this hook 
> meant that {{cd solr && ant documentation}} (or {{ant precommit}}) would first 
> build the javadocs; then build the ref-guide; then validate _all_ links in 
> the ref-guide, even those to (local) javadocs
> While the "local.javadocs" property logic _inside_ the 
> solr-ref-guide/build.xml was ported to build.gradle, the logic to leverage 
> this functionality from the "solr" project doesn't seem to have been 
> preserved -- so currently, {{gradle check}} doesn't know/care if someone adds 
> a nonsense javadoc link to the ref-guide (or removes a class/method whose 
> javadoc is currently linked to from the ref guide)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1963: SOLR-14827: Refactor schema loading to not use XPath

2020-10-12 Thread GitBox


madrob commented on a change in pull request #1963:
URL: https://github.com/apache/lucene-solr/pull/1963#discussion_r503332224



##
File path: solr/core/src/java/org/apache/solr/core/XmlConfigFile.java
##
@@ -145,15 +139,15 @@ public XmlConfigFile(SolrResourceLoader loader, String 
name, InputSource is, Str
   db.setErrorHandler(xmllog);
   try {
 doc = db.parse(is);
-origDoc = copyDoc(doc);
+origDoc = doc;

Review comment:
   If these are always the same, do we still need two copies of it?

##
File path: solr/core/src/java/org/apache/solr/cloud/CloudConfigSetService.java
##
@@ -39,14 +43,28 @@
  */
 public class CloudConfigSetService extends ConfigSetService {
   private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
-  
+  private Map cache = new ConcurrentHashMap<>();

Review comment:
   nit: make final?

##
File path: solr/solrj/src/java/org/apache/solr/common/ConfigNode.java
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.function.Function;
+import java.util.function.Predicate;
+
+import org.apache.solr.cluster.api.SimpleMap;
+
+/**
+ * A generic interface that represents a config file, mostly XML
+ */
+public interface ConfigNode {
+  ThreadLocal<Function<String, String>> SUBSTITUTES = new ThreadLocal<>();
+
+  /**
+   * Name of the tag
+   */
+  String name();
+
+  /**
+   * Text value of the node
+   */
+  String textValue();
+
+  /**
+   * Attributes
+   */
+  SimpleMap<String> attributes();
+
+  /**
+   * Child by name
+   */
+  default ConfigNode child(String name) {
+return child(null, name);
+  }
+
+  /**Iterate through child nodes with the name and return the first child that 
matches
+   */
+  default ConfigNode child(Predicate<ConfigNode> test, String name) {

Review comment:
   I think these are more natural with the order of the parameters 
reversed. It also aligns better with the single argument version calling 
`child(name, null)` - easier to reason about in an IDE autocomplete.
   
   `child(name, test)` reads more similarly to the previous XPath 
`//name[test]`.
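
   A hypothetical call site comparing the two orders (the element and attribute 
names here are invented):

{code:java}
// Current order: predicate first.
ConfigNode cache = root.child(n -> "filterCache".equals(n.attributes().get("name")), "query");

// Proposed order: name first, predicate second -- reads like //query[@name='filterCache'].
ConfigNode cache2 = root.child("query", n -> "filterCache".equals(n.attributes().get("name")));
{code}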

##
File path: solr/solrj/src/java/org/apache/solr/common/ConfigNode.java
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.common;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.function.Function;
+import java.util.function.Predicate;
+
+import org.apache.solr.cluster.api.SimpleMap;
+
+/**
+ * A generic interface that represents a config file, mostly XML
+ */
+public interface ConfigNode {
+  ThreadLocal<Function<String, String>> SUBSTITUTES = new ThreadLocal<>();
+
+  /**
+   * Name of the tag
+   */
+  String name();
+
+  /**
+   * Text value of the node
+   */
+  String textValue();
+
+  /**
+   * Attributes
+   */
+  SimpleMap<String> attributes();
+
+  /**
+   * Child by name
+   */
+  default ConfigNode child(String name) {
+return child(null, name);
+  }
+
+  /**Iterate through child nodes with the name and return the first child that 
matches
+   */
+  default ConfigNode child(Predicate<ConfigNode> test, String name) {
+ConfigNode[] result = new ConfigNode[1];
+forEachChild(it -> {
+  if (name!=null && !name.equals(it.name())) 

[GitHub] [lucene-solr] sigram commented on a change in pull request #1974: SOLR-14914: Add option to disable metrics collection

2020-10-12 Thread GitBox


sigram commented on a change in pull request #1974:
URL: https://github.com/apache/lucene-solr/pull/1974#discussion_r503358068



##
File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java
##
@@ -110,19 +110,22 @@
 
   public static final int DEFAULT_CLOUD_REPORTER_PERIOD = 60;
 
-  private MetricRegistry.MetricSupplier<Counter> counterSupplier;
-  private MetricRegistry.MetricSupplier<Meter> meterSupplier;
-  private MetricRegistry.MetricSupplier<Timer> timerSupplier;
-  private MetricRegistry.MetricSupplier<Histogram> histogramSupplier;
+  private final MetricsConfig metricsConfig;
+  private final MetricRegistry.MetricSupplier<Counter> counterSupplier;
+  private final MetricRegistry.MetricSupplier<Meter> meterSupplier;
+  private final MetricRegistry.MetricSupplier<Timer> timerSupplier;
+  private final MetricRegistry.MetricSupplier<Histogram> histogramSupplier;
 
   public SolrMetricManager() {
+metricsConfig = new MetricsConfig.MetricsConfigBuilder().build();
 counterSupplier = MetricSuppliers.counterSupplier(null, null);
 meterSupplier = MetricSuppliers.meterSupplier(null, null);

Review comment:
   @muse-dev is obviously wrong, this can never be null as it's a static 
method, and it accepts null args.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9578) TermRangeQuery with empty string lower bound edge case

2020-10-12 Thread Jira
Christoph Büscher created LUCENE-9578:
-

 Summary: TermRangeQuery with empty string lower bound edge case
 Key: LUCENE-9578
 URL: https://issues.apache.org/jira/browse/LUCENE-9578
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 8.6.3, trunk
Reporter: Christoph Büscher


Currently a TermRangeQuery with the empty String ("") as lower bound and 
includeLower=false internally constructs an Automaton that doesn't match 
anything. This is unexpected, especially for open upper bounds, where any 
string should be considered "higher" than the empty string.

I think "Automata#makeBinaryInterval" should be changed so that for an empty 
string lower bound and an open upper bound, any String matches the query 
regardless of the includeLower flag.
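
A minimal reproduction sketch (the field name is illustrative; the factory method is Lucene's public TermRangeQuery API):

{code:java}
import org.apache.lucene.search.TermRangeQuery;

public class EmptyLowerBoundDemo {
  public static void main(String[] args) {
    // Empty-string lower bound, excluded; open (null) upper bound.
    TermRangeQuery q = TermRangeQuery.newStringRange("field", "", null, false, true);
    // Expected: every non-empty term is "higher" than "", so with an open upper
    // bound this should match all documents that have the field.
    // Observed (per this report): the Automaton built internally by
    // Automata#makeBinaryInterval matches nothing.
    System.out.println(q);
  }
}
{code}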



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14924) Some ReplicationHandler metrics are reported using incorrect types

2020-10-12 Thread Andrzej Bialecki (Jira)
Andrzej Bialecki created SOLR-14924:
---

 Summary: Some ReplicationHandler metrics are reported using 
incorrect types
 Key: SOLR-14924
 URL: https://issues.apache.org/jira/browse/SOLR-14924
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: metrics
Affects Versions: 8.6.3, 8.7
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki


Some metrics reported from {{ReplicationHandler}} use incorrect types - they 
are reported as String values instead of numeric values.

This is caused by using the {{ReplicationHandler.addVal}} utility method with 
the type {{Integer.class}}, which the method doesn't support, so it returns the 
value as a string.
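
A minimal sketch of the failure mode (simplified; this is not the actual {{ReplicationHandler}} code): a conversion utility that handles only a fixed set of types silently degrades anything else, including {{Integer.class}}, to a String.

{code:java}
import java.util.Date;

public class AddValSketch {
  // Sketch only: known types are converted, everything else becomes a String.
  static Object addVal(Class<?> clzz, Object val) {
    if (clzz == Long.class) {
      return (val instanceof Long) ? val : Long.parseLong(val.toString());
    }
    if (clzz == Date.class) {
      return val;
    }
    // Integer.class is not handled, so the metric is reported as a String.
    return val.toString();
  }

  public static void main(String[] args) {
    Object v = addVal(Integer.class, 42);
    System.out.println(v.getClass().getSimpleName()); // prints "String", not "Integer"
  }
}
{code}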



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] munendrasn opened a new pull request #1975: Include missing commands in package tool help section

2020-10-12 Thread GitBox


munendrasn opened a new pull request #1975:
URL: https://github.com/apache/lucene-solr/pull/1975


   * Include add-key and uninstall commands
   
   > As this is a minor change, I haven't created a JIRA, will create one if 
required



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360
 ] 

Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 1:24 PM:


The critical lines (branch 8.6) in 
org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505.

So the question is, can this code be avoided if 'waitSearcher' is specified 
with 'false'?

This is not my first contribution, but in this case it will be a fundamental 
change, so if someone can guide me in the right direction on fixing this, it 
would be great.


was (Author: thomas.woeckinger):
The critical lines (branch 8.6) in 
org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505.

So the question is, can this code be avoided if 'waitSearcher' is specified 
with 'false'?

This is not my first contribution, but in this case it will be fundamental 
change, so if someone can guide me in the right direction on fixing this, it 
would be great.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are 
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the doVersionAdd method whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive and 
> is executed in a synchronized block of the UpdateLog instance, therefore all 
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) is done inside a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, so 
> it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
> true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Wöckinger updated SOLR-14923:

Description: 
Parallel indexing does not make sense at the moment when child documents are 
used.

The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
end of the doVersionAdd method whether the Ulog caches should be refreshed.

This check will return true if any child document is included in the 
AddUpdateCommand.

If so, ulog.openRealtimeSearcher() is called. This call is very expensive and 
is executed in a synchronized block of the UpdateLog instance, therefore all 
other operations on the UpdateLog are blocked too.

Because every important UpdateLog method (add, delete, ...) is done inside a 
synchronized block, almost every operation is blocked.

This reduces multi-threaded index updates to single-threaded behavior.

The described behavior does not depend on any option of the UpdateRequest, so 
it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
true or false.

The described behavior makes the usage of ChildDocuments useless, because the 
performance is unacceptable.

 

 

  was:
Parallel indexing does not make sense at moment when child documents are used.

The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
end of the method doVersionAdd if Ulog caches should be refreshed.

This check will return true if any child document is include in the 
AddUpdateCommand.

If so ulog.openRealtimeSearcher(); is called, this call is very expensive, and 
executed in a synchronized block, therefor all other operations on the 
UpdateLog are blocked too.

Because every important UpdateLog method (add, delete, ...) is done using a 
synchronized block almost each operation is blocked.

This reduces multi threaded index update to a single thread behavior.

The described behavior is not depending on any option of the UpdateRequest, so 
it does not make any difference if 'waitFlush', 'waitSearcher' or 'softCommit'  
is true or false.

The described behavior makes the usage of ChildDocuments useless, because the 
performance is unacceptable.

 

 


> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are 
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the doVersionAdd method whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive and 
> is executed in a synchronized block of the UpdateLog instance, therefore all 
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) is done inside a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, so 
> it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
> true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360
 ] 

Thomas Wöckinger edited comment on SOLR-14923 at 10/12/20, 1:21 PM:


The critical lines (branch 8.6) in 
org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505.

So the question is, can this code be avoided if 'waitSearcher' is specified 
with 'false'?

This is not my first contribution, but in this case it will be fundamental 
change, so if someone can guide me in the right direction on fixing this, it 
would be great.


was (Author: thomas.woeckinger):
The critical lines (branch 8.6) in 
org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505.

So the question is, can this code be avoided if 'waitSearcher' is specified 
with 'false'?

If someone can guide me in the right direction on fixing this, it would be great.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are 
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the doVersionAdd method whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive, 
> and executed in a synchronized block, therefore all other operations on the 
> UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) is done inside a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, so 
> it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
> true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212360#comment-17212360
 ] 

Thomas Wöckinger commented on SOLR-14923:
-

The critical lines (branch 8.6) in 
org.apache.solr.update.processor.DistributedUpdateProcessor are 500 to 505.

So the question is, can this code be avoided if 'waitSearcher' is specified 
with 'false'?

If someone can guide me in the right direction on fixing this, it would be great.

> Indexing performance is unacceptable when child documents are involved
> --
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update, UpdateRequestProcessors
>Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>Reporter: Thomas Wöckinger
>Priority: Critical
>  Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are 
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
> end of the doVersionAdd method whether the Ulog caches should be refreshed.
> This check will return true if any child document is included in the 
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive, 
> and executed in a synchronized block, therefore all other operations on the 
> UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) is done inside a 
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest, so 
> it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
> true or false.
> The described behavior makes the usage of ChildDocuments useless, because the 
> performance is unacceptable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14923) Indexing performance is unacceptable when child documents are involved

2020-10-12 Thread Jira
Thomas Wöckinger created SOLR-14923:
---

 Summary: Indexing performance is unacceptable when child documents 
are involved
 Key: SOLR-14923
 URL: https://issues.apache.org/jira/browse/SOLR-14923
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: update, UpdateRequestProcessors
Affects Versions: 8.6, 8.5, 8.4, 8.3, master (9.0)
Reporter: Thomas Wöckinger


Parallel indexing does not make sense at the moment when child documents are 
used.

The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the 
end of the doVersionAdd method whether the Ulog caches should be refreshed.

This check will return true if any child document is included in the 
AddUpdateCommand.

If so, ulog.openRealtimeSearcher() is called. This call is very expensive, and 
executed in a synchronized block, therefore all other operations on the 
UpdateLog are blocked too.

Because every important UpdateLog method (add, delete, ...) is done inside a 
synchronized block, almost every operation is blocked.

This reduces multi-threaded index updates to single-threaded behavior.

The described behavior does not depend on any option of the UpdateRequest, so 
it makes no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is 
true or false.

The described behavior makes the usage of ChildDocuments useless, because the 
performance is unacceptable.
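
A minimal sketch of the contention pattern described above (illustrative only, 
not Solr code): when every mutating method synchronizes on the same UpdateLog 
instance, one expensive call such as openRealtimeSearcher() holds the lock and 
stalls all concurrent adds.

{code:java}
public class UpdateLogContentionSketch {
  private final Object ulogLock = new Object();

  void add(boolean hasChildDocs) {
    synchronized (ulogLock) {
      // ... append to the transaction log ...
      if (hasChildDocs) {
        openRealtimeSearcher(); // expensive, runs while holding the same lock
      }
    }
  }

  void openRealtimeSearcher() {
    try {
      Thread.sleep(100); // stand-in for an expensive searcher reopen
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}

With this structure, N indexing threads calling add(true) effectively run one 
at a time, which matches the single-threaded behavior reported here.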

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930

2020-10-12 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212352#comment-17212352
 ] 

Erick Erickson commented on SOLR-14844:
---

[~samuelgmartinez]

JIRAs can only be assigned to committers, so I'll leave it assigned to me. I 
see all the changes that happen here, so I'll be monitoring. When it's ready 
I'll commit it.

It's a bit confusing at present because master is built with Gradle and 8x with 
ant.

 At the root, you have to change versions.props for master and 
lucene/ivy-versions.properties in 8x. Then there are some magic incantations 
you have to run that regenerate checksums and the like, and they differ 
between the two. That shows a lot of file changes for just a few code changes.

I've attached a couple of patches that, absent this problem, are what I would 
have committed if it had "just worked". So you should be able to apply them, 
then fix whatever you need to. When you're ready, submit a PR or a patch 
(whichever is more comfortable) and I'll take it from there. That should let 
you get to the real problem without getting frustrated by all the fiddly bits.

Do change solr/CHANGES.txt in both versions to give yourself credit for fixing 
this. The convention I use for something like this where someone else does the 
heavy lifting is: "(Samuel Garcia Martinez via Erick Erickson)". That way you 
get credit for the work and I get the blame if something goes wrong ;)

Finally, when switching back and forth between master and 8x there may be cruft 
left over that fails precommit. "git clean -dxf" is your friend in those cases, 
although that'll also erase your IDE files and you'll have to "ant idea" in 8x 
or just re-open the project in master. Real Soon Now I'll look at git worktree 
to avoid this... Oh, and on master "gradlew check" sometimes runs out of memory 
on my machine, although I haven't heard others complain, so maybe I'm just 
unlucky. If that happens, you can bump the memory Gradle uses in 
gradle.properties.

And thanks!

[^SOLR-14884-8x.patch]  [^SOLR-14844-master.patch] 
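
(For reference: the regeneration steps differ per branch. Assuming the standard 
build targets on each branch, this is typically "./gradlew updateLicenses" on 
master after editing versions.props, and "ant clean-jars jar-checksums" on 8x 
after editing lucene/ivy-versions.properties.)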

> Upgrade Jetty to 9.4.32.v20200930
> -
>
> Key: SOLR-14844
> URL: https://issues.apache.org/jira/browse/SOLR-14844
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Cassandra Targett
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14844-master.patch, SOLR-14884-8x.patch
>
>
> A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools 
> raising red flags 
> ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]).
> Here's the Jetty issue: 
> [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in 
> 9.4.30+, so we should upgrade to that for 8.7
> -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than 
> requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) 
> know if this problem is even exploitable in Solr, or b) if the workaround 
> suggested is even possible in Solr.-
> In normal Solr installs, w/o jetty optimizations, this issue is largely 
> mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930

2020-10-12 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14844:
--
Attachment: SOLR-14844-master.patch
SOLR-14884-8x.patch

> Upgrade Jetty to 9.4.32.v20200930
> -
>
> Key: SOLR-14844
> URL: https://issues.apache.org/jira/browse/SOLR-14844
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Cassandra Targett
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14844-master.patch, SOLR-14884-8x.patch
>
>
> A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools 
> raising red flags 
> ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]).
> Here's the Jetty issue: 
> [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in 
> 9.4.30+, so we should upgrade to that for 8.7
> -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than 
> requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) 
> know if this problem is even exploitable in Solr, or b) if the workaround 
> suggested is even possible in Solr.-
> In normal Solr installs, w/o jetty optimizations, this issue is largely 
> mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212311#comment-17212311
 ] 

Samuel García Martínez commented on SOLR-14844:
---

I can handle the upgrade completely if you want, so feel free to assign it to 
me and I'll submit a PR on GitHub. I may need some guidance on "non obvious" 
changes involved in upgrading the Jetty version (updating solr/licenses and 
some other things I may not be aware of).

I would approach this as follows:
* Understand why it is not reproducible on the master branch
* Modify the unit tests to ensure they pass on both branches
* Upgrade the Jetty version
* Open a new ticket to improve gzip handling on the client

> Upgrade Jetty to 9.4.32.v20200930
> -
>
> Key: SOLR-14844
> URL: https://issues.apache.org/jira/browse/SOLR-14844
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Cassandra Targett
>Assignee: Erick Erickson
>Priority: Major
>
> A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools 
> raising red flags 
> ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]).
> Here's the Jetty issue: 
> [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in 
> 9.4.30+, so we should upgrade to that for 8.7
> -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than 
> requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) 
> know if this problem is even exploitable in Solr, or b) if the workaround 
> suggested is even possible in Solr.-
> In normal Solr installs, w/o jetty optimizations, this issue is largely 
> mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212311#comment-17212311
 ] 

Samuel García Martínez edited comment on SOLR-14844 at 10/12/20, 11:29 AM:
---

I can handle the upgrade completely if you want, so feel free to assign it to 
me and I'll submit a PR on GitHub. I may need some guidance on "non obvious" 
changes involved in upgrading the Jetty version (updating solr/licenses and 
some other things I may not be aware of).

I would approach this as follows:
* Understand why it is not reproducible on the master branch
* Modify the unit tests to ensure they pass on both branches
* Upgrade the Jetty version
* Open a new JIRA to improve gzip handling on the client


was (Author: samuelgmartinez):
I can handle the upgrade completely if you want, so feel free to assign it to 
me and I'll submit a PR on GitHub. I may need some guidance on "non obvious" 
changes involved in upgrading the Jetty version (updating solr/licenses and 
some other things I may not be aware of).

I would approach this as follows:
* Understand why it is not reproducible on the master branch
* Modify the unit tests to ensure they pass on both branches
* Upgrade the Jetty version
* Open a new ticket to improve gzip handling on the client

> Upgrade Jetty to 9.4.32.v20200930
> -
>
> Key: SOLR-14844
> URL: https://issues.apache.org/jira/browse/SOLR-14844
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Cassandra Targett
>Assignee: Erick Erickson
>Priority: Major
>
> A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools 
> raising red flags 
> ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]).
> Here's the Jetty issue: 
> [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in 
> 9.4.30+, so we should upgrade to that for 8.7
> -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than 
> requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) 
> know if this problem is even exploitable in Solr, or b) if the workaround 
> suggested is even possible in Solr.-
> In normal Solr installs, w/o jetty optimizations, this issue is largely 
> mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov edited a comment on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


msokolov edited a comment on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707088977


   > it will be helpful to use ann-benchmark
   
   so .. I finally did get it working, but I have a few questions. Aside: I 
think my main stumbling block was running on an ARM instance - that may have 
caused some dependency issues, and then I found most of the algorithms are 
compiled with x86-only compiler extensions, sigh. But my main concern there is 
about the way this benchmarking system runs. It seems to expect your algorithm 
to be delivered as an in-process extension to Python, which works OK for a 
native code library, but I'm not sure how we'd present Lucene to it. We don't 
want to have to call through a network API?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on pull request #1930: LUCENE-9322: add VectorValues to new Lucene90 codec

2020-10-12 Thread GitBox


msokolov commented on pull request #1930:
URL: https://github.com/apache/lucene-solr/pull/1930#issuecomment-707088977


   > it will be helpful to use ann-benchmark
   so .. I finally did get it working, but I have a few questions. Aside: I 
think my main stumbling block was running on an ARM instance - that may have 
caused some dependency issues, and then I found most of the algorithms are 
compiled with x86-only compiler extensions, sigh. But my main concern there is 
about the way this benchmarking system runs. It seems to expect your algorithm 
to be delivered as an in-process extension to Python, which works OK for a 
native code library, but I'm not sure how we'd present Lucene to it. We don't 
want to have to call through a network API?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1974: SOLR-14914: Add option to disable metrics collection

2020-10-12 Thread GitBox


muse-dev[bot] commented on a change in pull request #1974:
URL: https://github.com/apache/lucene-solr/pull/1974#discussion_r503255143



##
File path: solr/core/src/java/org/apache/solr/metrics/SolrMetricManager.java
##
@@ -110,19 +110,22 @@
 
   public static final int DEFAULT_CLOUD_REPORTER_PERIOD = 60;
 
-  private MetricRegistry.MetricSupplier<Counter> counterSupplier;
-  private MetricRegistry.MetricSupplier<Meter> meterSupplier;
-  private MetricRegistry.MetricSupplier<Timer> timerSupplier;
-  private MetricRegistry.MetricSupplier<Histogram> histogramSupplier;
+  private final MetricsConfig metricsConfig;
+  private final MetricRegistry.MetricSupplier<Counter> counterSupplier;
+  private final MetricRegistry.MetricSupplier<Meter> meterSupplier;
+  private final MetricRegistry.MetricSupplier<Timer> timerSupplier;
+  private final MetricRegistry.MetricSupplier<Histogram> histogramSupplier;
 
   public SolrMetricManager() {
+    metricsConfig = new MetricsConfig.MetricsConfigBuilder().build();
     counterSupplier = MetricSuppliers.counterSupplier(null, null);
     meterSupplier = MetricSuppliers.meterSupplier(null, null);

Review comment:
   *NULL_DEREFERENCE:*  object `null` is dereferenced by call to 
`meterSupplier(...)` at line 122.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14844) Upgrade Jetty to 9.4.32.v20200930

2020-10-12 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212095#comment-17212095
 ] 

Erick Erickson edited comment on SOLR-14844 at 10/12/20, 11:14 AM:
---

[~samuelgmartinez] That'd be great (if you supplied a patch). I'm much more 
comfortable now that you have an explanation. I'm not clear on why this only 
fails in 8x and not master, but that's just my confusion.

How do you want to proceed? If you just attach a patch file, I can put it in 
this upgrade or we can have a separate JIRA...

Let me know and thanks!


was (Author: erickerickson):
[~samuelgmartinez] That'd be great (if you supplied a patch). I'm much more 
comfortable now that you have an explanation. I'm not clear on why this only 
fails in 8x and not master, but that's just my confusion.

How do you want to proceed? If you just attach a patch file, I can put it in 
this upgrade or we can have a separate patch...

Let me know and thanks!

> Upgrade Jetty to 9.4.32.v20200930
> -
>
> Key: SOLR-14844
> URL: https://issues.apache.org/jira/browse/SOLR-14844
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 8.6
>Reporter: Cassandra Targett
>Assignee: Erick Erickson
>Priority: Major
>
> A CVE was found in Jetty 9.4.27-9.4.29 that has some security scanning tools 
> raising red flags 
> ([https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17638]).
> Here's the Jetty issue: 
> [https://bugs.eclipse.org/bugs/show_bug.cgi?id=564984]. It's fixed in 
> 9.4.30+, so we should upgrade to that for 8.7
> -It has a simple mitigation (raise Jetty's responseHeaderSize to higher than 
> requestHeaderSize), but I don't know how Solr uses Jetty well enough to a) 
> know if this problem is even exploitable in Solr, or b) if the workaround 
> suggested is even possible in Solr.-
> In normal Solr installs, w/o jetty optimizations, this issue is largely 
> mitigated in 8.6.3: see SOLR-14896 (and linked bug fixes) for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul opened a new pull request #1974: SOLR-14914: Add option to disable metrics collection

2020-10-12 Thread GitBox


noblepaul opened a new pull request #1974:
URL: https://github.com/apache/lucene-solr/pull/1974


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] iverase commented on pull request #1907: LUCENE-9538: Detect polygon self-intersections in the Tessellator

2020-10-12 Thread GitBox


iverase commented on pull request #1907:
URL: https://github.com/apache/lucene-solr/pull/1907#issuecomment-707042029


   Thinking more about this, I am not sure we can make it optional. There can 
be situations where the polygon is invalid but tessellation does not actually 
fail. We have an example in the tests, where the polygon looks like:
   
   https://user-images.githubusercontent.com/29038686/95737373-d8105780-0c87-11eb-9f1f-3bd16e392067.png
   
   This polygon is invalid and the new check will throw an error, but it does 
not fail if you try to tessellate it. The documentation of the `Tessellator` 
explicitly says that:
   
   ```
   Holes may only touch at one vertex
   ```
   IMHO we should fail in this case.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ

2020-10-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212288#comment-17212288
 ] 

ASF subversion and git services commented on SOLR-14922:


Commit e444df1435f3e24e6f09db47d9d369b3d4e85f12 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e444df1 ]

SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ (#1973)



> Include solr-ref-guide tasks in sourceSets for IntelliJ
> ---
>
> Key: SOLR-14922
> URL: https://issues.apache.org/jira/browse/SOLR-14922
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ

2020-10-12 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved SOLR-14922.

Fix Version/s: master (9.0)
   Resolution: Fixed

> Include solr-ref-guide tasks in sourceSets for IntelliJ
> ---
>
> Key: SOLR-14922
> URL: https://issues.apache.org/jira/browse/SOLR-14922
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #1973: SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ

2020-10-12 Thread GitBox


dweiss merged pull request #1973:
URL: https://github.com/apache/lucene-solr/pull/1973


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss opened a new pull request #1973: SOLR-14922: Include solr-ref-guide tasks in sourceSets for IntelliJ

2020-10-12 Thread GitBox


dweiss opened a new pull request #1973:
URL: https://github.com/apache/lucene-solr/pull/1973


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14922) Include solr-ref-guide tasks in sourceSets for IntelliJ

2020-10-12 Thread Dawid Weiss (Jira)
Dawid Weiss created SOLR-14922:
--

 Summary: Include solr-ref-guide tasks in sourceSets for IntelliJ
 Key: SOLR-14922
 URL: https://issues.apache.org/jira/browse/SOLR-14922
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Dawid Weiss
Assignee: Dawid Weiss






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14915) Prometheus-exporter should not depend on Solr-core

2020-10-12 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212272#comment-17212272
 ] 

Dawid Weiss commented on SOLR-14915:


I attached a patch showing how to modify your PR so that the entire 
"distribution" is included in the Solr packaging. It's just a suggestion (the 
solution requires a decision on whether to include duplicated JARs, which 
scripts to include, etc.).

> Prometheus-exporter should not depend on Solr-core
> --
>
> Key: SOLR-14915
> URL: https://issues.apache.org/jira/browse/SOLR-14915
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - prometheus-exporter
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
> Attachments: patch.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think it's *crazy* that our Prometheus exporter depends on Solr-core -- 
> this thing is a _client_ of Solr; it does not live within Solr.  The exporter 
> ought to be fairly lean.  One consequence of this dependency is that, for 
> example, security vulnerabilities reported against Solr (e.g. Jetty) can (and 
> do, where I work) wind up being reported against this module even though 
> Prometheus isn't using Jetty.
> From my evaluation today of what's going on, it appears the crux of the 
> problem is that the prometheus exporter uses some utility mechanisms in 
> Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit 
> hole goes deeper...) and DOMUtils (which further depends on PropertiesUtil).  It 
> can easily be made to not use XmlConfig.  DOMUtils & PropertiesUtil could move 
> to SolrJ which already has lots of little dependency-free utilities needed by 
> SolrJ and Solr-core alike.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14915) Prometheus-exporter should not depend on Solr-core

2020-10-12 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-14915:
---
Attachment: patch.patch

> Prometheus-exporter should not depend on Solr-core
> --
>
> Key: SOLR-14915
> URL: https://issues.apache.org/jira/browse/SOLR-14915
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - prometheus-exporter
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
> Attachments: patch.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I think it's *crazy* that our Prometheus exporter depends on Solr-core -- 
> this thing is a _client_ of Solr; it does not live within Solr.  The exporter 
> ought to be fairly lean.  One consequence of this dependency is that, for 
> example, security vulnerabilities reported against Solr (e.g. Jetty) can (and 
> do, where I work) wind up being reported against this module even though 
> Prometheus isn't using Jetty.
> From my evaluation today of what's going on, it appears the crux of the 
> problem is that the prometheus exporter uses some utility mechanisms in 
> Solr-core like XmlConfig (which depends on SolrResourceLoader and the rabbit 
> hole goes deeper...) and DOMUtils (which further depends on PropertiesUtil).  It 
> can easily be made to not use XmlConfig.  DOMUtils & PropertiesUtil could move 
> to SolrJ which already has lots of little dependency-free utilities needed by 
> SolrJ and Solr-core alike.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785

2020-10-12 Thread GitBox


sigram commented on a change in pull request #1964:
URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503147180



##
File path: solr/core/src/java/org/apache/solr/api/AnnotatedApi.java
##
@@ -85,9 +85,18 @@ public EndPoint getEndPoint() {
   }
 
   public static List<Api> getApis(Object obj) {
-    return getApis(obj.getClass(), obj);
+    return getApis(obj.getClass(), obj, true);
   }
-  public static List<Api> getApis(Class<?> theClass, Object obj)  {
+
+  /**
+   * Get a list of Api-s supported by this class.
+   * @param theClass class
+   * @param obj object of this class (may be null)
+   * @param required if true then an exception is thrown if no Api-s can be retrieved,
+   *                 if false then absence of Api-s is silently ignored.
+   * @return list of discovered Api-s
+   */
+  public static List<Api> getApis(Class<?> theClass, Object obj, boolean required)  {

Review comment:
   Do you mean the name is confusing? Either name is fine with me.
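
   For illustration, a hypothetical caller of the new signature (names assumed, not from this PR): with required=false, a class without annotated endpoints yields an empty list instead of throwing.
   
   ```java
   import java.util.List;
   import org.apache.solr.api.AnnotatedApi;
   import org.apache.solr.api.Api;

   public class GetApisUsageSketch {
     // Sketch only: discover Api-s, tolerating classes that expose none.
     static List<Api> discover(Object plugin) {
       return AnnotatedApi.getApis(plugin.getClass(), plugin, false);
     }
   }
   ```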





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785

2020-10-12 Thread GitBox


sigram commented on a change in pull request #1964:
URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503146425



##
File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java
##
@@ -168,6 +172,33 @@ public CoreLoadFailure(CoreDescriptor cd, Exception 
loadFailure) {
 }
   }
 
+  public static class ClusterSingletons {

Review comment:
   That's the downside of splitting this PR into API & implementation ... 
The way we load CC, it's nearly impossible to properly initialize singletons 
from within the load() method unless we defer the initialization until load() 
is completed. `ClusterSingletons` is just a helper class for deferred init of 
singletons once they are all discovered and loaded.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sigram commented on a change in pull request #1964: SOLR-14749: Cluster singleton part of PR-1785

2020-10-12 Thread GitBox


sigram commented on a change in pull request #1964:
URL: https://github.com/apache/lucene-solr/pull/1964#discussion_r503145004



##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java
##
@@ -51,7 +51,7 @@
 public class ContainerPluginsApi {
   private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
 
-  public static final String PLUGIN = "plugin";
+  public static final String PLUGINS = "plugin";

Review comment:
   it's a left-over from an earlier change that was reverted, but I missed this 
one.
   
   Still, I think we should change the name of the path to "plugins" because 
the singular form doesn't make sense. Back-compat can be preserved by initially 
accepting both names.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14656) Deprecate current autoscaling framework, remove from master

2020-10-12 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212247#comment-17212247
 ] 

Andrzej Bialecki commented on SOLR-14656:
-

{quote}this is not in CHANGES.txt, do you think it should be? 
{quote}
 

Yes, I just added it - thanks for the reminder.

> Deprecate current autoscaling framework, remove from master
> ---
>
> Key: SOLR-14656
> URL: https://issues.apache.org/jira/browse/SOLR-14656
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.7
>
> Attachments: Screenshot from 2020-07-18 07-49-01.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The autoscaling framework is being re-designed in SOLR-14613 (SIP: 
> https://cwiki.apache.org/confluence/display/SOLR/SIP-8+Autoscaling+policy+engine+V2).
> The current autoscaling framework is very inefficient, improperly designed, 
> and too bloated, and it doesn't receive the level of support we aspire to 
> provide for all components that we ship.
> This issue is to deprecate current autoscaling framework in 8x, so we can 
> focus on the new autoscaling framework afresh.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14656) Deprecate current autoscaling framework, remove from master

2020-10-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212246#comment-17212246
 ] 

ASF subversion and git services commented on SOLR-14656:


Commit ae26be479c743d7c60ac28e050d491e91291f85a in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ae26be4 ]

SOLR-14656: Add a deprecation notice to CHANGES.txt.


> Deprecate current autoscaling framework, remove from master
> ---
>
> Key: SOLR-14656
> URL: https://issues.apache.org/jira/browse/SOLR-14656
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Assignee: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.7
>
> Attachments: Screenshot from 2020-07-18 07-49-01.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The autoscaling framework is being re-designed in SOLR-14613 (SIP: 
> https://cwiki.apache.org/confluence/display/SOLR/SIP-8+Autoscaling+policy+engine+V2).
> The current autoscaling framework is very inefficient, improperly designed, 
> and too bloated, and it doesn't receive the level of support we aspire to 
> provide for all components that we ship.
> This issue is to deprecate current autoscaling framework in 8x, so we can 
> focus on the new autoscaling framework afresh.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] barrotsteindev commented on pull request #1970: SOLR-14869: do not add deleted docs in child doc transformer

2020-10-12 Thread GitBox


barrotsteindev commented on pull request #1970:
URL: https://github.com/apache/lucene-solr/pull/1970#issuecomment-706953245


   Thanks @dsmiley 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on pull request #1972: SOLR-14915: Prometheus-exporter does not depend on Solr-core any longer

2020-10-12 Thread GitBox


dweiss commented on pull request #1972:
URL: https://github.com/apache/lucene-solr/pull/1972#issuecomment-706949342


   Well, yes. I know it's tempting to use gradle's "ready-to-use" plugins but 
the situation is different when you have a stand-alone project and a 
multi-module aggregate. Let me explain.
   
   gradle plugins add tons of stuff (configurations, dependencies, tasks, 
conventions) and hide what's going on behind the scenes. This has pros and 
cons. While you can use the automatically-added 'run' task, the subproject is 
also populated with stuff that will increase the overall build time (like the 
stand-alone distributions on assemble) and may conflict with other parts of the 
multi-module build.
   
   For this reason I typically prefer to be explicit about packaging and 
distribution aspects, even if some things need to be done manually. My 
preference would be to just keep using java-library and add corresponding 
entries to the JAR's manifest; the "run" task is still achievable with plain 
java-library - Lucene's Luke configuration has an example of how this can be 
done.  Please do as you please though - I don't even know what this particular 
contrib is :)
   
   As for your question why not all JARs are copied - this is a larger question 
that applies to the current Solr packaging. I tried to emulate the way ant 
worked (in packaging.gradle) - this means copying the module's "unique" set of 
JARs to avoid duplication. This isn't easy and makes configuration management 
fairly hairy. I think stand-alone tools like prometheus-exporter or luke should 
have full executable distribution in the binary release - even if this means 
duplicating some JARs (including Lucene JARs). This makes it easier to reason 
about what they need, makes the configuration simpler... at the cost of 
increased distribution size.
   
   The alternative is to hand-edit the build and include what's needed or 
hand-edit the scripts to point at the right JARs.
   
   I'm really not sure which way is right (I believe the full set of artifacts 
required to run a stand-alone tool is better but it's my personal opinion).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9562) Unify 'analysis' package with produced artifact names

2020-10-12 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9562.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> Unify 'analysis' package with produced artifact names
> -
>
> Key: LUCENE-9562
> URL: https://issues.apache.org/jira/browse/LUCENE-9562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Lucene has 'analysis' module but its sub-modules produce 'lucene-analyzers-*' 
> artifacts. This inconsistency is currently handled by setting artifact names 
> manually:
> {code}
> configure(subprojects.findAll { it.path.contains(':lucene:analysis:') }) {
>   project.archivesBaseName = project.archivesBaseName.replace("-analysis-", 
> "-analyzers-")
> }
> {code}
> but I keep wondering if we should just make it one or the other - either 
> rename 'analysis' to 'analyzers' or produce 'lucene-analysis-' artifacts.
> My personal opinion is to produce 'lucene-analysis-' packages because this 
> keeps repository structure the same (backports will be easier) and we're 
> targeting a major release anyway so people can adjust dependency names when 
> upgrading. This change would be also consistent with package naming inside 
> those modules. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9562) Unify 'analysis' package with produced artifact names

2020-10-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212178#comment-17212178
 ] 

ASF subversion and git services commented on LUCENE-9562:
-

Commit c5cf13259e36582270042c46c9106138aadcb1d0 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c5cf132 ]

LUCENE-9562: All binary analysis packages (and corresponding Maven artifacts) 
with names containing '-analyzers-' have been renamed to '-analysis-'. (#1968)



> Unify 'analysis' package with produced artifact names
> -
>
> Key: LUCENE-9562
> URL: https://issues.apache.org/jira/browse/LUCENE-9562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: master (9.0)
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Lucene has 'analysis' module but its sub-modules produce 'lucene-analyzers-*' 
> artifacts. This inconsistency is currently handled by setting artifact names 
> manually:
> {code}
> configure(subprojects.findAll { it.path.contains(':lucene:analysis:') }) {
>   project.archivesBaseName = project.archivesBaseName.replace("-analysis-", 
> "-analyzers-")
> }
> {code}
> but I keep wondering if we should just make it one or the other - either 
> rename 'analysis' to 'analyzers' or produce 'lucene-analysis-' artifacts.
> My personal opinion is to produce 'lucene-analysis-' packages because this 
> keeps repository structure the same (backports will be easier) and we're 
> targeting a major release anyway so people can adjust dependency names when 
> upgrading. This change would be also consistent with package naming inside 
> those modules. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #1968: LUCENE-9562: Unify 'analysis' package with produced artifact names (-analyzers- to -analysis-)

2020-10-12 Thread GitBox


dweiss merged pull request #1968:
URL: https://github.com/apache/lucene-solr/pull/1968


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6831) Review LinkedList usage

2020-10-12 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212176#comment-17212176
 ] 

ASF subversion and git services commented on LUCENE-6831:
-

Commit 7362c4ce603d81f276a83070e51bda52c3528bd7 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7362c4c ]

LUCENE-6831: start removing LinkedList in favor of ArrayList or De/Queues 
(#1969)

I'm committing it in, seems like a trivial thing.

> Review LinkedList usage
> ---
>
> Key: LUCENE-6831
> URL: https://issues.apache.org/jira/browse/LUCENE-6831
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Priority: Trivial
> Fix For: 6.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I quickly scanned the code (out of curiosity) and most of the use cases of 
> LinkedList are as a Queue, in which case indeed an ArrayDeque would be a 
> better choice, especially if the maximum size is known in advance.
> There are also some invalid/incorrect uses, like calling size() on a linked 
> list in {{MultiPhraseQueryNodeBuilder}}, which should be fixed.
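
A small illustration of the suggested replacement (a generic example, not taken 
from the Lucene code base):

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class QueueMigrationDemo {
  public static void main(String[] args) {
    // LinkedList-as-Queue becomes ArrayDeque: no per-element node allocation,
    // better cache locality; pre-size it when the maximum size is known.
    Deque<String> queue = new ArrayDeque<>(16);
    queue.addLast("a");
    queue.addLast("b");
    System.out.println(queue.pollFirst()); // prints "a"
  }
}
{code}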



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #1969: LUCENE-6831: start removing LinkedList in favor of ArrayList or De/Queues

2020-10-12 Thread GitBox


dweiss merged pull request #1969:
URL: https://github.com/apache/lucene-solr/pull/1969


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org