[jira] [Updated] (SOLR-8306) Enhance ExpandComponent to allow expand.hits=0
[ https://issues.apache.org/jira/browse/SOLR-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-8306: --- Status: Patch Available (was: Open) > Enhance ExpandComponent to allow expand.hits=0 > -- > > Key: SOLR-8306 > URL: https://issues.apache.org/jira/browse/SOLR-8306 > Project: Solr > Issue Type: Improvement > Affects Versions: 5.3.1 > Reporter: Marshall Sanders > Priority: Minor > Labels: expand > Fix For: 5.5 > > Attachments: SOLR-8306.patch, SOLR-8306.patch, SOLR-8306.patch, SOLR-8306_branch_5x@1715230.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This enhancement allows the ExpandComponent to accept expand.hits=0 for those who don't want an expanded document returned and only want the numFound from the expand section. > This is useful for "See 54 more like this" use cases, but without the performance hit of gathering an entire expanded document.
[jira] [Commented] (SOLR-8306) Enhance ExpandComponent to allow expand.hits=0
[ https://issues.apache.org/jira/browse/SOLR-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058442#comment-17058442 ] Munendra S N commented on SOLR-8306: [^SOLR-8306.patch] Thanks [~adhenderson] for the PR. I have attached the patch generated from your PR, as I'm not sure whether the pre-commit build is supported for PRs. Finally, I will merge the PR so that you keep the attribution. A few minor changes: * The {{CHANGES.txt}} entry should go under 8.6 instead of 8.5, as the release branch has been cut, and I think this fits better under optimizations than improvements, based on the recent email thread about categorization of issues * When expand.rows=0, scores won't be computed, so {{maxScore}} would never be available even if scores are requested. This should be fine, but we might need to add this to the Solr documentation: https://lucene.apache.org/solr/guide/8_4/collapse-and-expand-results.html#expand-component
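For illustration, a minimal SolrJ sketch of the usage being discussed, assuming a collection collapsed on a hypothetical group_id field and an existing SolrClient; the parameter names ({{expand}}, {{expand.rows}}) are the expand component's, everything else is illustrative:
{code}
import java.util.Map;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;

// Hedged sketch: fetch only the per-group numFound from the expand section,
// without gathering any expanded documents (expand.rows=0).
static void printGroupCounts(SolrClient client, String collection) throws Exception {
  SolrQuery q = new SolrQuery("*:*");
  q.addFilterQuery("{!collapse field=group_id}"); // group_id is an assumed field
  q.set("expand", "true");
  q.set("expand.rows", "0"); // no expanded docs are returned, only counts
  QueryResponse rsp = client.query(collection, q);
  // each group maps to an empty doc list whose numFound drives "See N more like this"
  for (Map.Entry<String, SolrDocumentList> e : rsp.getExpandedResults().entrySet()) {
    System.out.println(e.getKey() + " -> " + e.getValue().getNumFound() + " more like this");
  }
}
{code}
As the comment above notes, with expand.rows=0 no scores are computed, so a maxScore should not be expected in the expand section.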
[jira] [Updated] (SOLR-8306) Enhance ExpandComponent to allow expand.hits=0
[ https://issues.apache.org/jira/browse/SOLR-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-8306: --- Attachment: SOLR-8306.patch
[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument
dnhatn commented on a change in pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument URL: https://github.com/apache/lucene-solr/pull/1346#discussion_r392011697 ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java ## @@ -474,49 +474,11 @@ long updateDocuments(final Iterable<? extends Iterable<? extends IndexableField>> docs, return seqNo; } + long updateDocument(final Iterable<? extends IndexableField> doc, final Analyzer analyzer, Review comment: Can we also remove this method and delegate `updateDocument` to `updateDocuments` in IndexWriter instead?
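To make the suggestion concrete, a hedged sketch of what the delegation in IndexWriter could look like; this is illustrative only, relying on the existing plural updateDocuments overload (List.of requires java.util.List):
{code}
// Hypothetical sketch inside IndexWriter: the singular method becomes a thin
// wrapper, so DocumentsWriter only needs the plural code path.
public long updateDocument(Term term, Iterable<? extends IndexableField> doc) throws IOException {
  // List.of(doc) satisfies Iterable<? extends Iterable<? extends IndexableField>>
  return updateDocuments(term, List.of(doc));
}
{code}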
[GitHub] [lucene-solr] dnhatn commented on a change in pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument
dnhatn commented on a change in pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument URL: https://github.com/apache/lucene-solr/pull/1346#discussion_r392011485 ## File path: lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java ## @@ -346,7 +285,7 @@ public long updateDocuments(Iterable<? extends Iterable<? extends IndexableField>> docs, Analyzer analyzer, DocumentsWriterDeleteQueue.Node<?> deleteNode) { + private long finishDocument(DocumentsWriterDeleteQueue.Node<?> deleteNode, int docCount) { Review comment: Should we call this finishDocument**S** ?
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1298: SOLR-14289 Skip ZkChroot check when not necessary
dsmiley commented on a change in pull request #1298: SOLR-14289 Skip ZkChroot check when not necessary URL: https://github.com/apache/lucene-solr/pull/1298#discussion_r391985435 ## File path: solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java ## @@ -285,7 +285,7 @@ public static NodeConfig loadNodeConfig(Path solrHome, Properties nodeProperties if (zkClient.exists("/solr.xml", true)) { log.info("solr.xml found in ZooKeeper. Loading..."); byte[] data = zkClient.getData("/solr.xml", null, null, true); -return SolrXmlConfig.fromInputStream(loader, new ByteArrayInputStream(data)); +return SolrXmlConfig.fromInputStream(loader, new ByteArrayInputStream(data), "zookeeper"); Review comment: Then I much prefer a boolean as it'd be much clearer -- "isInZooKeeper" or some-such.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1298: SOLR-14289 Skip ZkChroot check when not necessary
madrob commented on a change in pull request #1298: SOLR-14289 Skip ZkChroot check when not necessary URL: https://github.com/apache/lucene-solr/pull/1298#discussion_r391898475 ## File path: solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java ## @@ -285,7 +285,7 @@ public static NodeConfig loadNodeConfig(Path solrHome, Properties nodeProperties if (zkClient.exists("/solr.xml", true)) { log.info("solr.xml found in ZooKeeper. Loading..."); byte[] data = zkClient.getData("/solr.xml", null, null, true); -return SolrXmlConfig.fromInputStream(loader, new ByteArrayInputStream(data)); +return SolrXmlConfig.fromInputStream(loader, new ByteArrayInputStream(data), "zookeeper"); Review comment: Yea, a boolean would be sufficient. I was thinking about whether we might need other sources in the future, but we can change this to a string/enum later.
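A sketch of the boolean variant being discussed; the parameter name is hypothetical:
{code}
// Hedged sketch: a boolean whose meaning is self-evident at the call site,
// assuming a signature like fromInputStream(loader, is, boolean isInZooKeeper).
return SolrXmlConfig.fromInputStream(loader, new ByteArrayInputStream(data), true /* isInZooKeeper */);
{code}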
[jira] [Comment Edited] (SOLR-10336) NPE during queryCache warming
[ https://issues.apache.org/jira/browse/SOLR-10336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058204#comment-17058204 ] Joel Bernstein edited comment on SOLR-10336 at 3/12/20, 7:40 PM: - This is likely resolved as well. Multiple collapses should no longer be a problem. was (Author: joel.bernstein): This is likely resolved as well > NPE during queryCache warming > - > > Key: SOLR-10336 > URL: https://issues.apache.org/jira/browse/SOLR-10336 > Project: Solr > Issue Type: Bug > Affects Versions: 6.4.2 > Reporter: Markus Jelsma > Priority: Major > Fix For: 7.0 > > > Regular cache warming stumbles on this NPE. It seems to be related to SOLR-9104; it is the same collection, and the query that fails the cache warmer is similar to that of SOLR-9104, i.e., two CollapsingQParsers. > {code} > Error during auto-warming of key:org.apache.solr.search.QueryResultKey@fe9769ca:java.lang.NullPointerException > at org.apache.solr.search.CollapsingQParserPlugin$IntScoreCollector.finish(CollapsingQParserPlugin.java:816) > at org.apache.solr.search.CollapsingQParserPlugin$IntScoreCollector.finish(CollapsingQParserPlugin.java:853) > at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:256) > at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1823) > at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1640) > at org.apache.solr.search.SolrIndexSearcher.lambda$initRegenerators$3(SolrIndexSearcher.java:604) > at org.apache.solr.search.LFUCache.warm(LFUCache.java:188) > at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:2376) > at org.apache.solr.core.SolrCore.lambda$getSearcher$2(SolrCore.java:2054) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code}
[jira] [Commented] (SOLR-10336) NPE during queryCache warming
[ https://issues.apache.org/jira/browse/SOLR-10336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058204#comment-17058204 ] Joel Bernstein commented on SOLR-10336: --- This is likely resolved as well
[GitHub] [lucene-solr] madrob opened a new pull request #1347: SOLR-14322 Improve AbstractFullDistribZkTestBase.waitForThingsToLevelOut
madrob opened a new pull request #1347: SOLR-14322 Improve AbstractFullDistribZkTestBase.waitForThingsToLevelOut URL: https://github.com/apache/lucene-solr/pull/1347 https://issues.apache.org/jira/browse/SOLR-14322
[GitHub] [lucene-solr] s1monw commented on issue #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument
s1monw commented on issue #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument URL: https://github.com/apache/lucene-solr/pull/1346#issuecomment-598371903 @uschindler I would love to get your input here too
[jira] [Commented] (LUCENE-9274) UnifiedHighlighter cannot handle SpanMultiTermQueryWrapper with an Automaton of type SINGLE
[ https://issues.apache.org/jira/browse/LUCENE-9274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058188#comment-17058188 ] David Smiley commented on LUCENE-9274: -- Yes indeed; the latter part of my comment here: https://issues.apache.org/jira/browse/LUCENE-8158?focusedCommentId=16352779&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16352779 I just filed a new issue for this: LUCENE-9277. I have no plans to work on this anytime soon but I'm always happy to code review / merge (which itself is plenty of work – _alas_). > UnifiedHighlighter cannot handle SpanMultiTermQueryWrapper with an Automaton of type SINGLE > --- > > Key: LUCENE-9274 > URL: https://issues.apache.org/jira/browse/LUCENE-9274 > Project: Lucene - Core > Issue Type: Bug > Affects Versions: 8.4 > Reporter: Christoph Goller > Priority: Major > Attachments: TestUnifiedHighlighterMTQ.java > > > MultiTermHighlighting.extractAutomata ignores a Term from a SINGLE Automaton, and highlighting does not work. > Of course an AutomatonQuery with a single term does not make much sense, but it may be generated by an automatic process. > Possible fixes: > * Either implement consumeTerms in MultiTermHighlighting.AutomataCollector > * Or remove the special case for SINGLE in CompiledAutomaton.visit > I attach a unit test
[jira] [Created] (LUCENE-9277) UnifiedHighlighter: internally visit the query tree once
David Smiley created LUCENE-9277: Summary: UnifiedHighlighter: internally visit the query tree once Key: LUCENE-9277 URL: https://issues.apache.org/jira/browse/LUCENE-9277 Project: Lucene - Core Issue Type: Task Components: modules/highlighter Reporter: David Smiley Ideally the UnifiedHighlighter should "visit" the query tree *once* instead of several times (weight.extractTerms, MultiTermHighlighting, PhraseHelper). Perhaps this can happen in one new class, perhaps called QueryExtractor. It's debatable whether this would replace a bunch of fields presently on UHComponents or whether it would simply help produce the existing UHComponents; shrug. Admittedly, I don't know how much of an "optimization" this is, or whether this is just a refactoring that is done on principle. I simply like the principle of it; knowing there are multiple _visit_s to the query gnaws at me.
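For a sense of what a single visit could look like, here is a hedged sketch built on the existing Query#visit/QueryVisitor API; the proposed QueryExtractor does not exist yet, so all of this is illustrative:
{code}
import java.util.Collections;
import java.util.List;
import java.util.Set;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.AutomatonQuery;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;
import org.apache.lucene.util.automaton.CharacterRunAutomaton;

// Hedged sketch: gather exact terms and multi-term automata for one field in a
// single pass, instead of separate extractTerms / MultiTermHighlighting walks.
static void extractOnce(Query query, String field, Set<Term> terms, List<CharacterRunAutomaton> automata) {
  query.visit(new QueryVisitor() {
    @Override
    public boolean acceptField(String f) {
      return field.equals(f); // only the highlighted field matters
    }
    @Override
    public void consumeTerms(Query q, Term... ts) {
      Collections.addAll(terms, ts);
    }
    @Override
    public void visitLeaf(Query q) {
      if (q instanceof AutomatonQuery) { // multi-term queries surface here
        automata.add(new CharacterRunAutomaton(((AutomatonQuery) q).getAutomaton()));
      }
    }
    @Override
    public QueryVisitor getSubVisitor(BooleanClause.Occur occur, Query parent) {
      return occur == BooleanClause.Occur.MUST_NOT ? QueryVisitor.EMPTY_VISITOR : this;
    }
  });
}
{code}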
[GitHub] [lucene-solr] s1monw opened a new pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument
s1monw opened a new pull request #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument URL: https://github.com/apache/lucene-solr/pull/1346 Today we have a large amount of duplicated code that is rather complex in nature. This change consolidates the code-paths to always use the updateDocuments path.
[GitHub] [lucene-solr] s1monw commented on issue #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument
s1monw commented on issue #1346: LUCENE-9276: Use same code-path for updateDocuments and updateDocument URL: https://github.com/apache/lucene-solr/pull/1346#issuecomment-598351277 @mikemccand @dnhatn wanna take a look?
[jira] [Created] (LUCENE-9276) Consolidate DW(PT)#updateDocument and #updateDocuments
Simon Willnauer created LUCENE-9276: --- Summary: Consolidate DW(PT)#updateDocument and #updateDocuments Key: LUCENE-9276 URL: https://issues.apache.org/jira/browse/LUCENE-9276 Project: Lucene - Core Issue Type: Improvement Affects Versions: master (9.0), 8.5 Reporter: Simon Willnauer While I was working on another IW-related issue I made some changes to DW#updateDocument but forgot DW#updateDocuments, which is annoying since the code is 99% identical. The same applies to DWPT#updateDocument[s]. IMO this is the wrong place to optimize in order to save one or two object creations. Maybe we can remove this code duplication.
[GitHub] [lucene-solr] sigram commented on a change in pull request #1329: SOLR-14275: Policy calculations are very slow for large clusters and large operations
sigram commented on a change in pull request #1329: SOLR-14275: Policy calculations are very slow for large clusters and large operations URL: https://github.com/apache/lucene-solr/pull/1329#discussion_r391797970 ## File path: solr/core/src/java/org/apache/solr/cloud/autoscaling/InactiveShardPlanAction.java ## @@ -102,9 +104,14 @@ public void process(TriggerEvent event, ActionContext context) throws Exception String parentPath = ZkStateReader.COLLECTIONS_ZKNODE + "/" + coll.getName(); List<String> locks; try { - locks = cloudManager.getDistribStateManager().listData(parentPath).stream() - .filter(name -> name.endsWith("-splitting")) - .collect(Collectors.toList()); + DistribStateManager stateManager = cloudManager.getDistribStateManager(); Review comment: this change is an unrelated fix, please ignore for now - this should go into a separate Jira.
[jira] [Resolved] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Eduardo Fernandez Lobbe resolved SOLR-14316. -- Fix Version/s: 8.6 master (9.0) Resolution: Fixed Not sure what the problem with Git tagging is. Committed to: master: https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=9a8602c96eebfad97e3f1502cef6c3110653cf67 branch_8x: https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=commit;h=8d6349b2e0cf89daea6ffa07760ee18719e72eb6 > Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method > - > > Key: SOLR-14316 > URL: https://issues.apache.org/jira/browse/SOLR-14316 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ > Affects Versions: 7.7.2, 8.4.1 > Reporter: Aroop > Priority: Minor > Labels: patch > Fix For: master (9.0), 8.6 > > Time Spent: 2.5h > Remaining Estimate: 0h > > There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method. > This change removes that warning by handling a checked conversion and also adds tests for a previously untested API.
[GitHub] [lucene-solr] tflobbe merged pull request #1344: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() squashed
tflobbe merged pull request #1344: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() squashed URL: https://github.com/apache/lucene-solr/pull/1344
[jira] [Commented] (LUCENE-9274) UnifiedHighlighter cannot handle SpanMultiTermQueryWrapper with an Automaton of type SINGLE
[ https://issues.apache.org/jira/browse/LUCENE-9274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058078#comment-17058078 ] Alan Woodward commented on LUCENE-9274: --- I think ideally we'd merge UnifiedHighlighter.extractTerms() and MultiTermHighlighting.extractAutomata(), so that the single term here is handled the same as terms from any other query. I know this is something that [~dsmiley] has been thinking about?
[jira] [Resolved] (SOLR-14324) infra-solr commands are not working in Linux server
[ https://issues.apache.org/jira/browse/SOLR-14324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14324. --- Resolution: Incomplete First, there's not nearly enough information here to even begin to help. Second, please raise issues like this on the Solr users list first. If it's determined that this really is a code problem rather than an issue with your environment, we can open a new Jira or reopen this one. > infra-solr commands are not working in Linux server > --- > > Key: SOLR-14324 > URL: https://issues.apache.org/jira/browse/SOLR-14324 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud > Affects Versions: 7.3.1 > Reporter: GANESAN.P > Priority: Critical > Labels: linuc > > [root@node03 hduser]# systemctl status solr.service > Unit solr.service could not be found. > [root@node03 hduser]# solr status > bash: solr: command not found... > [root@node03 hduser]#
[jira] [Commented] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058031#comment-17058031 ] Ishan Chattopadhyaya commented on SOLR-14314: - Please ask in solr-users. The best Solr practitioners are there. Jira is not a support portal. > Solr does not response most of the update request some times > > > Key: SOLR-14314 > URL: https://issues.apache.org/jira/browse/SOLR-14314 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Aaron Sun > Priority: Critical > Attachments: jstack_bad_state.log, solrlog.tar.gz, solrlog.tar.gz > > > Solr version: > {noformat} > solr-spec > 8.4.1 > solr-impl > 8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan - 2020-01-10 13:40:28 > lucene-spec > 8.4.1 > lucene-impl > 8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan - 2020-01-10 13:35:00 > {noformat} > > Java process: > {noformat} > java -Xms100G -Xmx200G -DSTOP.PORT=8078 -DSTOP.KEY=ardsolrstop -Dsolr.solr.home=/ardome/solr -Djetty.port=8983 -Dsolr.log.dir=/var/ardendo/log -jar start.jar --module=http > {noformat} > Run on a powerful server with 32 cores and 265GB RAM. > The problem is that from time to time it gets very slow to update Solr documents, for example timing out after 30 minutes. > Document size is around 20k~50K each; each http request sent to /update is around 4MB~10MB. > /update requests are issued by multiple processes. > Some of the updates get a response, but the difference between "QTime" and http response time is big; in one example, qtime = 66s while the http response time is 2304s. > According to jstack, many threads are in BLOCKED state. > The thread dump log is attached. > Any hint would be appreciated, thanks!
[jira] [Commented] (LUCENE-8929) Early Terminating CollectorManager
[ https://issues.apache.org/jira/browse/LUCENE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057967#comment-17057967 ] Michael Sokolov commented on LUCENE-8929: - Thanks for the insightful comments, [~jim.ferenczi], you've given me a lot to think about! I had not really considered sorting segments: that makes a lot of sense when documents are at least roughly inserted in sort order. I would have thought merges would interfere with that opto, but I guess for the most part it works out? The performance improvements you saw are stunning. It would be great if we could get the segment sorting ideas merged into the Lucene code base, no? I wonder how we determine when they are applicable though. In Elasticsearch, is it done based on some a priori knowledge, or do you analyze the distribution and turn on the opto automatically? That would be compelling, I think. On the other hand, the use case inspiring this does not tend to correlate index sort order and insertion order, so I don't think it would benefit as much from segment sorting (except due to chance, or in special cases), so I think these are really two separate optimizations and issues. We should be sure to structure the code in such a way that it can accommodate them all and properly choose which one to apply. We don't have a formal query planner in Lucene, but I guess we are beginning to evolve one. I think the idea of splitting collectors is a good one, to avoid overmuch complexity in a single collector, but there is also a good deal of shared code across these. I can give that a try and see what it looks like. By the way, I did also run a test using luceneutil's "modification timestamp" field as the index sort and saw similar gains. I think that field is more tightly correlated with insertion order, and also has much higher cardinality, so it makes a good counterpoint: I'll post results here later once I can do a workup. I hear your concern about the non-determinism due to tie-breaking, but I *think* this is accounted for by including (global) docid in the comparison in MaxScoreTerminator.LeafState? I may be missing something though. It doesn't seem we have a good unit test checking for this tiebreak; I'll add to TestTopFieldCollector.testRandomMaxScoreTermination to make sure that case is covered. I'm not sure what to say about the `LeafFieldComparator` idea - it sounds powerful, but I am also a bit leery of these complex Comparators - they make other things more difficult since it becomes challenging to reason about the sort order "from the outside". I had to resort to some "instanceof" hackery to restrict consideration to cases where the comparator is numeric, and extracting the sort value from the comparator is pretty messy too. We pay a complexity cost here to handle some edge cases of more abstract comparators. > Early Terminating CollectorManager > -- > > Key: LUCENE-8929 > URL: https://issues.apache.org/jira/browse/LUCENE-8929 > Project: Lucene - Core > Issue Type: Sub-task > Reporter: Atri Sharma > Priority: Major > Time Spent: 7h 20m > Remaining Estimate: 0h > > We should have an early terminating collector manager which accurately tracks hits across all of its collectors and determines when there are enough hits, allowing all the collectors to abort. > The options for the same are: > 1) Shared total count: a global "scoreboard" where all collectors update their current hit count. At the end of each document's collection, the collector checks if N > threshold, and aborts if true. > 2) State Reporting Collectors: collectors report their total number of counts collected periodically using a callback mechanism, and get a proceed or abort decision. > 1) has the overhead of synchronization in the hot path; 2) can collect unnecessary hits before aborting. > I am planning to work on 2), unless objections
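For reference, a hedged sketch of option 1 from the issue description: a shared counter checked in the hot path. The names are illustrative, and it assumes the caller catches CollectionTerminatedException per leaf the way IndexSearcher does:
{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.CollectionTerminatedException;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.FilterCollector;
import org.apache.lucene.search.FilterLeafCollector;
import org.apache.lucene.search.LeafCollector;

// Hedged sketch: every collector produced by the manager shares one AtomicLong,
// and each stops collecting once the global threshold is crossed.
final class SharedCountTerminatingCollector extends FilterCollector {
  private final AtomicLong globalHits; // the shared "scoreboard"
  private final long threshold;

  SharedCountTerminatingCollector(Collector in, AtomicLong globalHits, long threshold) {
    super(in);
    this.globalHits = globalHits;
    this.threshold = threshold;
  }

  @Override
  public LeafCollector getLeafCollector(LeafReaderContext context) throws IOException {
    return new FilterLeafCollector(super.getLeafCollector(context)) {
      @Override
      public void collect(int doc) throws IOException {
        // the synchronization overhead of option 1 sits on this hot path
        if (globalHits.incrementAndGet() > threshold) {
          throw new CollectionTerminatedException();
        }
        in.collect(doc);
      }
    };
  }
}
{code}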
[jira] [Assigned] (SOLR-13264) unexpected autoscaling set-trigger response
[ https://issues.apache.org/jira/browse/SOLR-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki reassigned SOLR-13264: --- Assignee: Andrzej Bialecki > unexpected autoscaling set-trigger response > --- > > Key: SOLR-13264 > URL: https://issues.apache.org/jira/browse/SOLR-13264 > Project: Solr > Issue Type: Bug > Components: AutoScaling > Reporter: Christine Poerschke > Assignee: Andrzej Bialecki > Priority: Minor > Attachments: SOLR-13264.patch, SOLR-13264.patch > > > Steps to reproduce: > {code} > ./bin/solr start -cloud -noprompt > ./bin/solr create -c demo -d _default -shards 1 -replicationFactor 1 > curl "http://localhost:8983/solr/admin/autoscaling" -d' > { > "set-trigger" : { > "name" : "index_size_trigger", > "event" : "indexSize", > "aboveDocs" : 12345, > "aboveOp" : "SPLITSHARD", > "enabled" : true, > "actions" : [ > { > "name" : "compute_plan", > "class": "solr.ComputePlanAction" > } > ] > } > } > ' > ./bin/solr stop -all > {code} > The {{aboveOp}} is documented on https://lucene.apache.org/solr/guide/7_6/solrcloud-autoscaling-triggers.html#index-size-trigger and logically should be accepted (even though it is actually the default), but unexpectedly an error message is returned: {{"Error validating trigger config index_size_trigger: TriggerValidationException\{name=index_size_trigger, details='\{aboveOp=unknown property\}'\}"}}. > From a quick look it seems that in the {{IndexSizeTrigger}} constructor additional values need to be passed to the {{TriggerUtils.validProperties}} method, i.e. aboveOp, belowOp, and maybe others too, e.g. aboveSize/belowSize/etc. Illustrative patch to follow. Thank you.
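Illustrative of the fix sketched in the description; the exact constant names in IndexSizeTrigger are assumptions, not the committed patch:
{code}
// Hedged sketch (in the IndexSizeTrigger constructor): register the documented
// properties as valid so TriggerValidationException is no longer thrown for them.
TriggerUtils.validProperties(validProperties,
    ABOVE_DOCS_PROP, BELOW_DOCS_PROP,   // hypothetical constant names
    ABOVE_BYTES_PROP, BELOW_BYTES_PROP,
    ABOVE_OP_PROP, BELOW_OP_PROP);
{code}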
[jira] [Commented] (SOLR-13807) Caching for term facet counts
[ https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057947#comment-17057947 ] Michael Gibney commented on SOLR-13807: --- Still working at [PR #751|https://github.com/apache/lucene-solr/pull/751], I separated the earlier monolithic commit (and recent incremental adjustments) into two logical commits: first introducing single-sweep collection of facet term counts across different DocSet domains, second introducing a term facet count cache. (N.b. tests are passing for each commit, but validation is failing based on intentional nocommits.) [~hossman], I have yet to expand on the test stub you introduced, but if you don't think it's premature to take a second look at this now that the two features have been separated into logical commits, I'd appreciate any feedback you have to offer. I was reluctant to force-push, and wasn't sure whether to open new PRs or work with the existing one; but I left the old (monolithic + test-stub-patch + iterative-adjustments) available [here|https://github.com/magibney/lucene-solr/tree/SOLR-13132-mingled-sweep-and-cache], and figured the new 2-commit push would clarify things and be a good jumping-off point for however we want to proceed (whether new PRs, etc...). I know I had said I would make single-sweep collection dependent on facet cache, but (as you can see) I went the opposite way. Functionality-wise, facet-cache would have made sense first, but code/structure-wise, sweep-first was much cleaner and clearer. > Caching for term facet counts > - > > Key: SOLR-13807 > URL: https://issues.apache.org/jira/browse/SOLR-13807 > Project: Solr > Issue Type: New Feature > Components: Facet Module > Affects Versions: master (9.0), 8.2 > Reporter: Michael Gibney > Priority: Minor > Attachments: SOLR-13807__SOLR-13132_test_stub.patch > > > Solr does not have a facet count cache; so for _every_ request, term facets are recalculated for _every_ (facet) field, by iterating over _every_ field value for _every_ doc in the result domain, and incrementing the associated count. > As a result, subsequent requests end up redoing a lot of the same work, including all associated object allocation, GC, etc. This situation could benefit from integrated caching. > Because of the domain-based, serial/iterative nature of term facet calculation, latency is proportional to the size of the result domain. Consequently, one common/clear manifestation of this issue is high latency for faceting over an unrestricted domain (e.g., {{\*:\*}}), as might be observed on a top-level landing page that exposes facets. This type of "static" case is often mitigated by external (to Solr) caching, either with a caching layer between Solr and a front-end application, or within a front-end application, or even with a caching layer between the end user and a front-end application. > But in addition to the overhead of handling this caching elsewhere in the stack (or, for a new user, even being aware of this as a potential issue to mitigate), any external caching mitigation is really only appropriate for relatively static cases like the "landing page" example described above. A Solr-internal facet count cache (analogous to the {{filterCache}}) would provide the following additional benefits: > # ease of use/out-of-the-box configuration to address a common performance concern > # compact (specifically caching count arrays, without the extra baggage that accompanies a naive external caching approach) > # NRT-friendly (could be implemented to be segment-aware) > # modular, capable of reusing the same cached values in conjunction with variant requests over the same result domain (this would support common use cases like paging, but also potentially more interesting direct uses of facets) > # could be used for distributed refinement (i.e., if facet counts over a given domain are cached, a refinement request could simply look up the ordinal value for each enumerated term and directly grab the count out of the count array that was cached during the first phase of facet calculation) > # composable (e.g., in aggregate functions that calculate values based on facet counts across different domains, like SKG/relatedness – see SOLR-13132)
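As a rough illustration of the reuse described in points 4 and 5 above (all names hypothetical): the cache could be keyed on the facet field plus the query that produced the result domain, analogous to the filterCache being keyed on Query, with per-ordinal count arrays as values:
{code}
import java.util.Objects;
import org.apache.lucene.search.Query;

// Hedged sketch: counts are reusable whenever both the field and the domain match.
final class FacetCountKey {
  final String field;
  final Query domainQuery;
  FacetCountKey(String field, Query domainQuery) {
    this.field = field;
    this.domainQuery = domainQuery;
  }
  @Override public boolean equals(Object o) {
    return o instanceof FacetCountKey
        && field.equals(((FacetCountKey) o).field)
        && domainQuery.equals(((FacetCountKey) o).domainQuery);
  }
  @Override public int hashCode() {
    return Objects.hash(field, domainQuery);
  }
}
// usage sketch: a SolrCache<FacetCountKey, int[]> mapping key -> counts indexed by
// term ordinal; refinement then looks up a term's ordinal and reads its count directly.
{code}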
[GitHub] [lucene-solr] magibney commented on issue #751: SOLR-13132: single sweep iteration over base, foreground, and background sets for "relatedness" calculation
magibney commented on issue #751: SOLR-13132: single sweep iteration over base, foreground, and background sets for "relatedness" calculation URL: https://github.com/apache/lucene-solr/pull/751#issuecomment-598193540 Force push of bc4b18f separates into two logical commits: one introduces single-sweep collection of term facet counts over multiple domains, the second (which builds on the first) introduces a facet cache, which is more generally useful, but is particularly helpful for performance of relatedness calculation over relatively stable "background" sets. The original monolithic PR, with some iterative improvements after @hossman's test stub patch, is available [here](https://github.com/magibney/lucene-solr/tree/SOLR-13132-mingled-sweep-and-cache).
[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search
[ https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057931#comment-17057931 ] Tomoko Uchida commented on LUCENE-9136: --- [~jim.ferenczi] Thank you for elaborating. I agree with you, it would be great if we had some abstraction for vectors (an interface or abstract base class with a default implementation?) for experimenting with different ANN search algorithms. > Introduce IVFFlat to Lucene for ANN similarity search > - > > Key: LUCENE-9136 > URL: https://issues.apache.org/jira/browse/LUCENE-9136 > Project: Lucene - Core > Issue Type: New Feature > Reporter: Xin-Chun Zhang > Priority: Major > Attachments: glove-100-angular.png, glove-25-angular.png, image-2020-03-07-01-22-06-132.png, image-2020-03-07-01-25-58-047.png, image-2020-03-07-01-27-12-859.png, sift-128-euclidean.png > > Time Spent: 50m > Remaining Estimate: 0h > > Representation learning (RL) has been an established discipline in the machine learning space for decades, but it has drawn tremendous attention lately with the emergence of deep learning. The central problem of RL is to determine an optimal representation of the input data. By embedding the data into a high-dimensional vector, the vector retrieval (VR) method is then applied to search for the relevant items. > With the rapid development of RL over the past few years, the technique has been used extensively in industry, from online advertising to computer vision and speech recognition. There exist many open source implementations of VR algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various choices for potential users. However, the aforementioned implementations are all written in C++ with no plan for supporting a Java interface, making them hard to integrate into Java projects or use for those who are not familiar with C/C++ [https://github.com/facebookresearch/faiss/issues/105]. > The algorithms for vector retrieval can be roughly classified into four categories: > # Tree-based algorithms, such as KD-tree; > # Hashing methods, such as LSH (Locality Sensitive Hashing); > # Product quantization based algorithms, such as IVFFlat; > # Graph-based algorithms, such as HNSW, SSG, NSG; > where IVFFlat and HNSW are the most popular ones among all the VR algorithms. > IVFFlat is better for high-precision applications such as face recognition, while HNSW performs better in general scenarios including recommendation and personalized advertisement. *The recall ratio of IVFFlat can be gradually increased by adjusting the query parameter (nprobe), while it's hard for HNSW to improve its accuracy*. In theory, IVFFlat can achieve a 100% recall ratio. > Recently, the implementation of HNSW (Hierarchical Navigable Small World, LUCENE-9004) for Lucene has made great progress. The issue draws the attention of those who are interested in Lucene or hope to use HNSW with Solr/Lucene. > As an alternative for solving ANN similarity search problems, IVFFlat is also very popular with many users and supporters. Compared with HNSW, IVFFlat has a smaller index size but requires k-means clustering, while HNSW is faster in query (no training required) but requires extra storage for saving graphs [indexing 1M vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors]. Another advantage is that IVFFlat can be faster and more accurate when GPU parallel computing is enabled (currently not supported in Java). Both algorithms have their merits and demerits. Since HNSW is now under development, it may be better to provide both implementations (HNSW && IVFFlat) for potential users who are faced with very different scenarios and want more choices. > The latest branch is [*lucene-9136-ann-ivfflat*|https://github.com/irvingzhang/lucene-solr/commits/jira/lucene-9136-ann-ivfflat]
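To ground the IVFFlat description above, a self-contained sketch of the query path; the field names and layout are assumptions for illustration, not Lucene APIs:
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hedged sketch of IVFFlat search: probe the nprobe nearest k-means centroids,
// then compute exact distances only for the vectors in those clusters.
final class IvfFlatSketch {
  final float[][] centroids; // cluster centers from k-means, built at index time
  final int[][] postings;    // postings[c] = doc ids assigned to centroid c
  final float[][] vectors;   // flat (uncompressed) vectors addressed by doc id

  IvfFlatSketch(float[][] centroids, int[][] postings, float[][] vectors) {
    this.centroids = centroids;
    this.postings = postings;
    this.vectors = vectors;
  }

  static float l2(float[] a, float[] b) {
    float s = 0;
    for (int i = 0; i < a.length; i++) { float d = a[i] - b[i]; s += d * d; }
    return s;
  }

  int[] search(float[] q, int topK, int nprobe) {
    Integer[] byDist = new Integer[centroids.length];
    for (int i = 0; i < byDist.length; i++) byDist[i] = i;
    Arrays.sort(byDist, Comparator.comparingDouble(c -> l2(q, centroids[c])));
    List<Integer> candidates = new ArrayList<>();
    for (int i = 0; i < Math.min(nprobe, byDist.length); i++) {
      for (int doc : postings[byDist[i]]) candidates.add(doc); // raising nprobe raises recall
    }
    candidates.sort(Comparator.comparingDouble(doc -> l2(q, vectors[doc])));
    return candidates.stream().limit(topK).mapToInt(Integer::intValue).toArray();
  }
}
{code}
This also shows why recall is tunable at query time: with nprobe equal to the number of centroids, the search degenerates to an exact brute-force scan.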
[jira] [Updated] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Sun updated SOLR-14314: - Attachment: solrlog.tar.gz
[jira] [Commented] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057919#comment-17057919 ] Aaron Sun commented on SOLR-14314: -- After more stability testing, it turned out that big pauses can still happen even with a heap size of 25GB. {noformat} 2020-03-12 14:09:45.434 DEBUG (qtp1668016508-3378) [ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660963872229556224,id=2101611210074371724}) 2020-03-12 14:09:45.434 DEBUG (qtp1668016508-3406) [ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660963872228507650,id=2101703060188004924}) 2020-03-12 14:09:48.680 DEBUG (qtp1668016508-82044) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE add\{,id=2101702020780064124} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 14:09:48.696 DEBUG (qtp1668016508-82044) [ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660963875644768256,id=2101702020780064124}) 2020-03-12 14:10:01.879 DEBUG (qtp1668016508-82115) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE add\{,id=2102002130766448724} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 14:10:01.879 DEBUG (qtp1668016508-82115) [ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660963889483874304,id=2102002130766448724}) 2020-03-12 14:10:08.566 DEBUG (qtp1668016508-82155) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE add\{,id=2101702170061492124} \{{params(commit=true),defaults(wt=json)}} {noformat} {noformat} java -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/ardendo/log/solr_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M -Xms25G -Xmx25G -DSTOP.PORT=8078 -DSTOP.KEY=ardsolrstop -Dsolr.solr.home=/data1/solr8 -Djetty.port=8983 -Dsolr.log.dir=/var/ardendo/log -jar start.jar --module=http {noformat} [^solrlog.tar.gz]
[jira] [Commented] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057912#comment-17057912 ] Aaron Sun commented on SOLR-14314: -- Update: After more stability test, it turned out that big pause could still happen even with heapsize 25GB . {noformat} {noformat}
[jira] [Issue Comment Deleted] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Sun updated SOLR-14314: - Comment: was deleted (was: Update: After more stability test, it turned out that big pause could still happen even with heapsize 25GB . {noformat} {noformat} )
[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057911#comment-17057911 ] Jim Ferenczi commented on LUCENE-9004: -- I like this issue a lot and all the discussions around it, thanks all for working on this! I'd like to share some of the findings we had while working on similar solutions for Elasticsearch. The known limitations for graph based approach are the computational cost of building the graph and the memory needed to store the neighbors per node. Regarding the computational cost, inserting a new node is equivalent to a query in the solution so we should expect that the number of comparisons needed to insert a node in the graph will grow logarithmically with the size of the data. We've made some tests to check the number of comparisons needed on the million scale and found out that this number doesn't vary too much on the dataset present in the ann-benchmark repo. To get good performance at search time, the efConstruction parameter need to be set high (from 200 to 800 in the best results) while M (max numbers of neighbors per node) can can remain lower (16 to 64). This led to around 10k comparisons in average for the ann-benchmark dataset in the 1-10M ranges. 10K comparisons for 1-10M ranges at query time is very compelling. Users can also trade some recall with performance and get acceptable results in the 1-10k ranges. However this trade-offs are more difficult to apply at build time where the quality of the graph is important to maintain. I mainly see this cost as static due to its logarithmic growth that is verified in the paper around small-world graph approaches. This is the main trade-offs that users need to make when using graph-based approaches, building will be slow. Regarding the memory consumption, I have mixed feelings. The fact that we need to keep M nearest neighbors per node should not be a problem at search time since the graph can be static and accessed through a file. The random reads nature of a query in the graph will require disk seeks and reads but we retrieve M neighbors each time so we're not talking of tiny random reads and the filesystem cache will keep the hot nodes in direct memory (upper layer in the hierarchical graph?). I am saying this because it seems that we're expecting to load the entire graph in RAM at some point. I don't think this is needed at query time, hence my comment. The tricky part in my opinion here is at build time where the graph is updated dynamically. This requires more efficient access and the ability to change the data. We also need to keep the nearest neighbor distances for each neighbor so the total cost is N*M*8 where N is the total number of documents, M the maximum number of neighbors per node and 8 the cost associated with keeping a doc id and the distance for each neighbor (int+float). The formula is slightly more complicated for hierarchical graph but doesn't change the scale. This memory requirement seems acceptable for medium-sized graph in the range of 1-10M but can become problematic when building large graphs of hundreds of millions nodes. Considering the logarithmic growth of the number of operations needed to find a local minimum when the dataset grows, building large graphs is encouraged at the expense of more memory. I don't know what would be acceptable but requiring tens of gigabytes of heap memories to build such graph doesn't seem compelling to me. 
Considering that the benefits of using a graph are already visible in the 1-10M range, I also wonder if we could make a compromise and cap the size of the graphs that we build. So instead of having one graph per segment, we'd build N of them, depending on how much memory the user is willing to allocate for the build and the total number of docs present in the segment. Obviously, searching these graphs sequentially would be more costly than having a single giant graph. However, this could also have interesting properties when merging segments, since we wouldn't need to rebuild graphs that have reached the maximum size allowed (assuming there are no deleted documents). I share this idea so that we don't limit ourselves to a single graph per segment of up to 2 billion vectors (the maximum allowed per index in Lucene); a sketch of searching such capped graphs follows at the end of this message. > Approximate nearest vector search > - > > Key: LUCENE-9004 > URL: https://issues.apache.org/jira/browse/LUCENE-9004 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael Sokolov >Priority: Major > Attachments: hnsw_layered_graph.png > > Time Spent: 3h 20m > Remaining Estimate: 0h > > "Semantic" search based on machine-learned vector "embeddings" representing > terms, queries and documents is becoming a must-have feature for a modern search
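An illustrative sketch of the capped-graphs idea above (the {{Graph}} interface is a stand-in for whatever per-graph search we'd expose, not a Lucene API; {{TopDocs.merge}} is the existing Lucene helper):
{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.lucene.search.TopDocs;

public class CappedGraphsSketch {
  // Stand-in for a single size-capped graph within a segment (hypothetical).
  interface Graph {
    TopDocs search(float[] query, int topN) throws IOException;
  }

  // Search each capped graph sequentially, then merge the per-graph top hits.
  static TopDocs searchAll(List<Graph> graphs, float[] query, int topN) throws IOException {
    TopDocs[] perGraph = new TopDocs[graphs.size()];
    for (int i = 0; i < graphs.size(); i++) {
      perGraph[i] = graphs.get(i).search(query, topN); // N small searches instead of one
    }
    return TopDocs.merge(topN, perGraph); // merge by score, like sharded search
  }
}
{code}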
[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search
[ https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057886#comment-17057886 ] Jim Ferenczi commented on LUCENE-9136: -- Thanks for chiming in [~tomoko]. That said, I still think it would be valuable to discuss the minimal signature needed to share a new codec between both approaches. I also think that there is a consensus around the fact that multiple strategies could be needed depending on the trade-offs that users are willing to take. If we start adding codecs and formats for every strategy that we think valuable, I am afraid that this will block us sooner than we expect. If we agree that having a new codec for vectors and ANN is valuable in Lucene, my proposal is to have a generic codec that can be used to test different strategies (k-means, HNSW, ...). IMO this could also change the goal for these approaches, since we don't want to require users to tune tons of internal options (number of neighbors, number of levels, ...) upfront. > Introduce IVFFlat to Lucene for ANN similarity search > - > > Key: LUCENE-9136 > URL: https://issues.apache.org/jira/browse/LUCENE-9136 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Xin-Chun Zhang >Priority: Major > Attachments: glove-100-angular.png, glove-25-angular.png, > image-2020-03-07-01-22-06-132.png, image-2020-03-07-01-25-58-047.png, > image-2020-03-07-01-27-12-859.png, sift-128-euclidean.png > > Time Spent: 50m > Remaining Estimate: 0h > > Representation learning (RL) has been an established discipline in the > machine learning space for decades but it draws tremendous attention lately > with the emergence of deep learning. The central problem of RL is to > determine an optimal representation of the input data. By embedding the data > into a high dimensional vector, the vector retrieval (VR) method is then > applied to search the relevant items. > With the rapid development of RL over the past few years, the technique has > been used extensively in industry from online advertising to computer vision > and speech recognition. There exist many open source implementations of VR > algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various > choices for potential users. However, the aforementioned implementations are > all written in C++, with no plan for supporting a Java interface, making them hard > to integrate in Java projects or for those who are not familiar with C/C++ > [https://github.com/facebookresearch/faiss/issues/105]. > The algorithms for vector retrieval can be roughly classified into four > categories: > # Tree-based algorithms, such as KD-tree; > # Hashing methods, such as LSH (Locality-Sensitive Hashing); > # Product quantization based algorithms, such as IVFFlat; > # Graph-based algorithms, such as HNSW, SSG, NSG; > where IVFFlat and HNSW are the most popular ones among all the VR algorithms. > IVFFlat is better for high-precision applications such as face recognition, > while HNSW performs better in general scenarios including recommendation and > personalized advertisement. *The recall ratio of IVFFlat can be gradually > increased by adjusting the query parameter (nprobe), while it's hard for HNSW > to improve its accuracy*. In theory, IVFFlat could achieve a 100% recall ratio. > Recently, the implementation of HNSW (Hierarchical Navigable Small World, > LUCENE-9004) for Lucene has made great progress. The issue draws the attention > of those who are interested in Lucene or hope to use HNSW with Solr/Lucene. 
> As an alternative for solving ANN similarity search problems, IVFFlat is also > very popular with many users and supporters. Compared with HNSW, IVFFlat has > a smaller index size but requires k-means clustering, while HNSW is faster at > query time (no training required) but requires extra storage for saving graphs > [indexing 1M > vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors]. > Another advantage is that IVFFlat can be faster and more accurate when GPU > parallel computing is enabled (currently not supported in Java). Both algorithms > have their merits and demerits. Since HNSW is now under development, it may > be better to provide both implementations (HNSW && IVFFlat) for potential > users who are faced with very different scenarios and want more choices. > The latest branch is > [*lucene-9136-ann-ivfflat*|https://github.com/irvingzhang/lucene-solr/commits/jira/lucene-9136-ann-ivfflat] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands,
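To make the nprobe/recall trade-off from the description concrete, here is a toy, self-contained sketch of the IVFFlat query path (all names are illustrative, not from the patch): probe the nprobe nearest centroids, then exhaustively ("flat") scan only the inverted lists of vectors assigned to them.
{code:java}
import java.util.*;

public class IvfFlatQuerySketch {
  static float squaredDistance(float[] a, float[] b) {
    float d = 0;
    for (int i = 0; i < a.length; i++) { float t = a[i] - b[i]; d += t * t; }
    return d;
  }

  // centroids[c] is the c-th centroid; lists.get(c) holds ids of vectors assigned to it.
  // Raising nprobe scans more lists: slower queries, higher recall.
  static int[] search(float[] query, int k, int nprobe,
                      float[][] centroids, List<int[]> lists, float[][] vectors) {
    Integer[] byCentroidDist = new Integer[centroids.length];
    for (int c = 0; c < centroids.length; c++) byCentroidDist[c] = c;
    Arrays.sort(byCentroidDist,
        Comparator.comparingDouble(c -> squaredDistance(centroids[c], query)));
    // max-heap on distance keeps the k closest vectors seen so far
    PriorityQueue<Integer> top = new PriorityQueue<>((x, y) ->
        Float.compare(squaredDistance(vectors[y], query), squaredDistance(vectors[x], query)));
    for (int p = 0; p < Math.min(nprobe, byCentroidDist.length); p++) {
      for (int id : lists.get(byCentroidDist[p])) { // exhaustive scan of one inverted list
        top.offer(id);
        if (top.size() > k) top.poll();             // drop the farthest
      }
    }
    return top.stream().mapToInt(Integer::intValue).toArray();
  }
}
{code}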
[GitHub] [lucene-solr] iverase opened a new pull request #1345: make TestLatLonMultiPolygonShapeQueries more resilient for CONTAINS queries
iverase opened a new pull request #1345: make TestLatLonMultiPolygonShapeQueries more resilient for CONTAINS queries URL: https://github.com/apache/lucene-solr/pull/1345 This test can fail when a circle goes over the pole, as the point-to-line distance can have quite a bit of error. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9275) TestLatLonMultiPolygonShapeQueries failure
Ignacio Vera created LUCENE-9275: Summary: TestLatLonMultiPolygonShapeQueries failure Key: LUCENE-9275 URL: https://issues.apache.org/jira/browse/LUCENE-9275 Project: Lucene - Core Issue Type: Test Reporter: Ignacio Vera This test can fail for big circle queries when it goes over the pole. {code} Error Message: wrong hit (first of possibly more): FAIL: id=128 should match but did not relation=CONTAINS query=LatLonShapeQuery: field=shape:[CIRCLE([73.45044631686574,-43.522442537891635] radius = 1320857.7583952076 meters),] docID=127 shape=[[-43.60599318072272, -95.89632190395075] [1.401298464324817E-45, -95.89632190395075] [1.401298464324817E-45, 148.0564038690461] [-43.60599318072272, -95.89632190395075] , [-8.713707222781277, -137.43977030462523] [-8.665986874636296, -136.83720024522643] [-8.605159056677273, -135.67900228425023] [-9.022985319342514, -135.7748381870073] [-9.57551836995, -135.03944293912676] [-10.486875163146422, -133.75932451570236] [-12.667313123772418, -133.7153234402556] [-15.400299607273027, -133.5089745815] [-17.28330603483186, -134.4554641982157] [-21.607368456646313, -136.29612908889345] [-20.932241412751615, -139.63293025024942] [-20.650194586536255, -141.13774572688035] [-19.001635084539416, -144.5606838562986] [-15.72417778804206, -146.161554433355] [-15.56323460342411, -147.13460257950626] [-11.61552273270253, -144.82632867223] [-8.302765767406079, -143.5037337366715] [-9.07099844105521, -140.49240322673248] [-7.525403752869964, -140.08470342809397] [-8.713707222781277, -137.43977030462523] , [0.999403953552, -157.66023552014605] [90.0, -157.66023552014605] [90.0, 1.401298464324817E-45] [0.999403953552, 1.401298464324817E-45] [0.999403953552, -157.66023552014605] , [78.40177762548313, 0.999403953552] [90.0, 0.999403953552] [90.0, 107.68304478215401] [78.40177762548313, 0.999403953552] ] deleted?=false distanceQuery=CIRCLE([73.45044631686574,-43.522442537891635] radius = 1320857.7583952076 meters) {code} reproduce with: {code}ant test -Dtestcase=TestLatLonMultiPolygonShapeQueries -Dtests.method=testRandomMedium -Dtests.seed=B76D55AB11A1D02A -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=vi -Dtests.timezone=Etc/GMT-3 -Dtests.asserts=true -Dtests.file.encoding=UTF-8{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search
[ https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057878#comment-17057878 ] Tomoko Uchida commented on LUCENE-9136: --- {code} Do we need a new VectorFormat that can be shared with the graph-based approach ? {code} On this point, I think we don't need to consider both approaches at once. Please don't wait on or worry about the HNSW issue; concentrate on getting this into master. I, or someone with more knowledge/experience in this area, will find a good way to integrate the graph-based approach later. > Introduce IVFFlat to Lucene for ANN similarity search > - > > Key: LUCENE-9136 > URL: https://issues.apache.org/jira/browse/LUCENE-9136 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Xin-Chun Zhang >Priority: Major > Attachments: glove-100-angular.png, glove-25-angular.png, > image-2020-03-07-01-22-06-132.png, image-2020-03-07-01-25-58-047.png, > image-2020-03-07-01-27-12-859.png, sift-128-euclidean.png > > Time Spent: 50m > Remaining Estimate: 0h > > Representation learning (RL) has been an established discipline in the > machine learning space for decades but it draws tremendous attention lately > with the emergence of deep learning. The central problem of RL is to > determine an optimal representation of the input data. By embedding the data > into a high dimensional vector, the vector retrieval (VR) method is then > applied to search the relevant items. > With the rapid development of RL over the past few years, the technique has > been used extensively in industry from online advertising to computer vision > and speech recognition. There exist many open source implementations of VR > algorithms, such as Facebook's FAISS and Microsoft's SPTAG, providing various > choices for potential users. However, the aforementioned implementations are > all written in C++, with no plan for supporting a Java interface, making them hard > to integrate in Java projects or for those who are not familiar with C/C++ > [https://github.com/facebookresearch/faiss/issues/105]. > The algorithms for vector retrieval can be roughly classified into four > categories: > # Tree-based algorithms, such as KD-tree; > # Hashing methods, such as LSH (Locality-Sensitive Hashing); > # Product quantization based algorithms, such as IVFFlat; > # Graph-based algorithms, such as HNSW, SSG, NSG; > where IVFFlat and HNSW are the most popular ones among all the VR algorithms. > IVFFlat is better for high-precision applications such as face recognition, > while HNSW performs better in general scenarios including recommendation and > personalized advertisement. *The recall ratio of IVFFlat can be gradually > increased by adjusting the query parameter (nprobe), while it's hard for HNSW > to improve its accuracy*. In theory, IVFFlat could achieve a 100% recall ratio. > Recently, the implementation of HNSW (Hierarchical Navigable Small World, > LUCENE-9004) for Lucene has made great progress. The issue draws the attention > of those who are interested in Lucene or hope to use HNSW with Solr/Lucene. > As an alternative for solving ANN similarity search problems, IVFFlat is also > very popular with many users and supporters. Compared with HNSW, IVFFlat has > a smaller index size but requires k-means clustering, while HNSW is faster at > query time (no training required) but requires extra storage for saving graphs > [indexing 1M > vectors|https://github.com/facebookresearch/faiss/wiki/Indexing-1M-vectors]. 
> Another advantage is that IVFFlat can be faster and more accurate when GPU > parallel computing is enabled (currently not supported in Java). Both algorithms > have their merits and demerits. Since HNSW is now under development, it may > be better to provide both implementations (HNSW && IVFFlat) for potential > users who are faced with very different scenarios and want more choices. > The latest branch is > [*lucene-9136-ann-ivfflat*|https://github.com/irvingzhang/lucene-solr/commits/jira/lucene-9136-ann-ivfflat] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14324) infra-solr commands are not working in Linux server
GANESAN.P created SOLR-14324: Summary: infra-solr commands are not working in Linux server Key: SOLR-14324 URL: https://issues.apache.org/jira/browse/SOLR-14324 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Affects Versions: 7.3.1 Reporter: GANESAN.P [root@node03 hduser]# systemctl status solr.service Unit solr.service could not be found. [root@node03 hduser]# solr status bash: solr: command not found... [root@node03 hduser]# -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi commented on issue #1316: LUCENE-8929 parallel early termination in TopFieldCollector using minmin score
jimczi commented on issue #1316: LUCENE-8929 parallel early termination in TopFieldCollector using minmin score URL: https://github.com/apache/lucene-solr/pull/1316#issuecomment-598150417 Thanks for the ping @msokolov . > if you can comment on whether the MaxScoreAccumulator still provides additional benefit alongside this opto? I haven't tried removing it, but I wonder if it might be doing something redundant now - I'm not totally clear what impact setMinCompetitiveScore will have. It's redundant in spirit, but the MaxScoreAccumulator is for queries sorted by relevancy: it is used so that queries sorted by relevancy can use `setMinCompetitiveScore` even if they have a tiebreaker on another field (using TopFieldCollector). The logic is similar to what you added in the `MaxScoreTerminator` except that the side effects of changes in the maximum score are handled by the top collectors directly. I left a comment in the original issue, but I think we should try to merge the optimization you have for the sorted-index case into the current logic, or create a new top field collector dedicated to optimizing the retrieval of large top N on sorted indices. With this PR we would have 3 different objects used by concurrent requests to speed up search, but I think it would be preferable to specialize at this point. What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
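For context, a minimal sketch of the `setMinCompetitiveScore` mechanism being discussed (the shared bottom-score value and the class name are hypothetical; `Scorable.setMinCompetitiveScore` and `ScoreMode.TOP_SCORES` are the real Lucene 8 APIs):

```java
import java.io.IOException;
import org.apache.lucene.search.Scorable;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.SimpleCollector;

// A leaf collector that tells its scorer which scores are no longer
// competitive, allowing the scorer to skip whole blocks of documents.
public class MinCompetitiveSketchCollector extends SimpleCollector {
  private final float sharedBottomScore; // e.g. published via a MaxScoreAccumulator-like object
  private Scorable scorer;

  public MinCompetitiveSketchCollector(float sharedBottomScore) {
    this.sharedBottomScore = sharedBottomScore;
  }

  @Override
  public void setScorer(Scorable scorer) throws IOException {
    this.scorer = scorer;
    scorer.setMinCompetitiveScore(sharedBottomScore); // enables block-max skipping
  }

  @Override
  public void collect(int doc) throws IOException {
    // keep only hits that are still competitive
  }

  @Override
  public ScoreMode scoreMode() {
    return ScoreMode.TOP_SCORES; // declares that skipping non-competitive docs is legal
  }
}
```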
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task
dweiss commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task URL: https://github.com/apache/lucene-solr/pull/1304#discussion_r391552531 ## File path: gradle/invoke-javadoc.gradle ## @@ -0,0 +1,335 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// invoke javadoc tool + +allprojects { + + ext { +javadocRoot = project.path.startsWith(':lucene') ? project(':lucene').file("build/docs") : project(':solr').file("build/docs") +javadocDestDir = "${javadocRoot}/${project.name}" + } + + plugins.withType(JavaPlugin) { +def libName = project.path.startsWith(":lucene") ? "Lucene" : "Solr" +def title = "${libName} ${project.version} ${project.name} API".toString() +def srcDirs = sourceSets.main.java.srcDirs.findAll { dir -> dir.exists() } + +task invokeJavadoc { + description "Generates Javadoc API documentation for the main source code. This invokes Ant Javadoc Task." + group "documentation" + + dependsOn sourceSets.main.compileClasspath + + inputs.property("linksource", "no") + inputs.property("linkJUnit", false) + inputs.property("linkHref", []) + + inputs.files sourceSets.main.java.asFileTree + outputs.dir project.javadocRoot + + doFirst { +srcDirs.each { srcDir -> + ant.javadoc( + overview: file("${srcDir}/overview.html"), + packagenames: "org.apache.lucene.*,org.apache.solr.*", + destDir: project.javadocDestDir, + access: "protected", + encoding: "UTF-8", + charset: "UTF-8", + docencoding: "UTF-8", + noindex: "true", + includenosourcepackages: "true", + author: "true", + version: "true", + linksource: inputs.properties.linksource, + use: "true", + failonerror: "true", + locale: "en_US", + windowtitle: title, + doctitle: title, + maxmemory: "512m", + classpath: sourceSets.main.compileClasspath.asPath, + bottom: "Copyright 2000-${buildYear} Apache Software Foundation. All Rights Reserved." + ) { +packageset(dir: srcDir) + +tag(name: "lucene.experimental", description: "WARNING: This API is experimental and might change in incompatible ways in the next release.") +tag(name: "lucene.internal", description: "NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.") +tag(name: "lucene.spi", description: "SPI Name (Note: This is case-insensitive. 
e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):", scope: "types") + +// resolve links to JavaSE and JUnit API +link(offline: "true", href: "https://docs.oracle.com/en/java/javase/11/docs/api/", packageListLoc: project(":lucene").file("tools/javadoc/java11/").toString()) +if (inputs.properties.get("linkJUnit")) { + link(offline: "true", href: "https://junit.org/junit4/javadoc/4.12/", packageListLoc: project(":lucene").file("tools/javadoc/junit").toString()) +} +// resolve inter-module links if 'linkHref' property is specified +inputs.properties.get("linkHref").each { href -> + link(href: href) +} + +arg(line: "--release 11") +arg(line: "-Xdoclint:all,-missing") + +// force locale to be "en_US" (fix for: https://bugs.openjdk.java.net/browse/JDK-8222793) +arg(line: "-J-Duser.language=en -J-Duser.country=US") + } +} + +// append some special table css, prettify css +ant.concat(destfile: "${javadocDestDir}/stylesheet.css", append: "true", fixlastline: "true", encoding: "UTF-8") { + filelist(dir: project(":lucene").file("tools/javadoc"), files: "table_padding.css") + filelist(dir: project(":lucene").file("tools/prettify"), files: "prettify.css") +} +// append prettify to scripts +ant.concat(destfile:
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task
mocobeta commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task URL: https://github.com/apache/lucene-solr/pull/1304#discussion_r391547313 ## File path: gradle/invoke-javadoc.gradle ## @@ -0,0 +1,335 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// invoke javadoc tool + +allprojects { + + ext { +javadocRoot = project.path.startsWith(':lucene') ? project(':lucene').file("build/docs") : project(':solr').file("build/docs") +javadocDestDir = "${javadocRoot}/${project.name}" + } + + plugins.withType(JavaPlugin) { +def libName = project.path.startsWith(":lucene") ? "Lucene" : "Solr" +def title = "${libName} ${project.version} ${project.name} API".toString() +def srcDirs = sourceSets.main.java.srcDirs.findAll { dir -> dir.exists() } + +task invokeJavadoc { + description "Generates Javadoc API documentation for the main source code. This invokes Ant Javadoc Task." + group "documentation" + + dependsOn sourceSets.main.compileClasspath + + inputs.property("linksource", "no") + inputs.property("linkJUnit", false) + inputs.property("linkHref", []) + + inputs.files sourceSets.main.java.asFileTree + outputs.dir project.javadocRoot + + doFirst { +srcDirs.each { srcDir -> + ant.javadoc( + overview: file("${srcDir}/overview.html"), + packagenames: "org.apache.lucene.*,org.apache.solr.*", + destDir: project.javadocDestDir, + access: "protected", + encoding: "UTF-8", + charset: "UTF-8", + docencoding: "UTF-8", + noindex: "true", + includenosourcepackages: "true", + author: "true", + version: "true", + linksource: inputs.properties.linksource, + use: "true", + failonerror: "true", + locale: "en_US", + windowtitle: title, + doctitle: title, + maxmemory: "512m", + classpath: sourceSets.main.compileClasspath.asPath, + bottom: "Copyright 2000-${buildYear} Apache Software Foundation. All Rights Reserved." + ) { +packageset(dir: srcDir) + +tag(name: "lucene.experimental", description: "WARNING: This API is experimental and might change in incompatible ways in the next release.") +tag(name: "lucene.internal", description: "NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.") +tag(name: "lucene.spi", description: "SPI Name (Note: This is case-insensitive. 
e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):", scope: "types") + +// resolve links to JavaSE and JUnit API +link(offline: "true", href: "https://docs.oracle.com/en/java/javase/11/docs/api/", packageListLoc: project(":lucene").file("tools/javadoc/java11/").toString()) +if (inputs.properties.get("linkJUnit")) { + link(offline: "true", href: "https://junit.org/junit4/javadoc/4.12/", packageListLoc: project(":lucene").file("tools/javadoc/junit").toString()) +} +// resolve inter-module links if 'linkHref' property is specified +inputs.properties.get("linkHref").each { href -> + link(href: href) +} + +arg(line: "--release 11") +arg(line: "-Xdoclint:all,-missing") + +// force locale to be "en_US" (fix for: https://bugs.openjdk.java.net/browse/JDK-8222793) +arg(line: "-J-Duser.language=en -J-Duser.country=US") + } +} + +// append some special table css, prettify css +ant.concat(destfile: "${javadocDestDir}/stylesheet.css", append: "true", fixlastline: "true", encoding: "UTF-8") { + filelist(dir: project(":lucene").file("tools/javadoc"), files: "table_padding.css") + filelist(dir: project(":lucene").file("tools/prettify"), files: "prettify.css") +} +// append prettify to scripts +ant.concat(destfile:
[jira] [Commented] (LUCENE-9136) Introduce IVFFlat to Lucene for ANN similarity search
[ https://issues.apache.org/jira/browse/LUCENE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057803#comment-17057803 ] Jim Ferenczi commented on LUCENE-9136: -- > ??I was thinking we could actually reuse the existing `PostingsFormat` and `DocValuesFormat` implementations.?? That's one of the main reasons why this approach is interesting for Lucene. The main operation at query time is a basic inverted-list search, so it would be a shame not to reuse the existing formats that were designed for this purpose. In general I think that this approach (k-means clustering at index time) is very compelling since it's a light layer on top of existing functionality. The computational cost is big (running k-means and assigning vectors to centroids) but we can ensure that it remains acceptable by capping the number of centroids or by using a hybrid approach with a small-world graph like Julie suggested. Regarding the link with the graph-based approach, I wonder what the new ANN codec should expose. If the goal is to provide approximate nearest-neighbor capabilities to Lucene, I don't think we want to leak any implementation details there. It's difficult to tell now since both efforts are in the design phase, but I think we should aim at something very simple that only exposes an approximate nearest-neighbor search. Something like: {code:java} interface VectorFormat { TopDocs ann(int topN, int maxDocsToVisit); float[] getVector(int docID); }{code} should be enough. Most of the formats we have in Lucene have sensible defaults or compute parameters based on the shape of the data, so I don't think we should expose tons of options here. This is another advantage of this approach in my opinion, since we can compute the number of centroids needed for each segment automatically. Research in this area is also moving fast, so we need to remain open to new approaches without needing to add a new format all the time. > Actually, we need random access to the vector values! For a typical search engine, we are going to retrieve the best-matched documents after obtaining the TopK docIDs. Retrieving vectors via these docIDs requires random access to the vector values. You can sort the TopK (which should be small) by docID and then perform the lookup sequentially (see the sketch at the end of this message)? That's how we retrieve stored fields from top documents in a normal search. This is again an advantage over the graph-based approach, because it is compliant with the search model in Lucene that requires forward iteration. To move forward on this issue I'd like to list the things that need clarification in my opinion: * Do we need a new VectorFormat that can be shared with the graph-based approach? ** This decision and the design of the VectorFormat are important to ensure that both efforts can move independently. Currently it is not clear whether this approach can move forward if the graph-based approach is stalled or needs more work. I tend to think that having a simple format upfront can drive the decisions we make on both approaches, so we should tackle this first. * What is the acceptable state for this approach to be considered ready to merge? ** Lots of optimizations have been mentioned in both issues, but I think we should drive for simplicity first. That's the beauty of the k-means approach: it's simple to understand and reason about. We should have a first version that reuses the internal data formats since they fit perfectly. 
I think that's what Julie's PR brings here, while leaving room for further improvements like any Lucene feature. ** We should decouple the progress here from the other Lucene issue. This is linked to question 1, but I think it's key to move forward. In general I feel like the branch proposed by [~irvingzhang] and the additional changes by [~jtibshirani] are moving in the right direction. The QPS improvements over a brute-force approach are already compelling, as outlined in [https://github.com/apache/lucene-solr/pull/1314], so I don't think it will be difficult to reach a consensus on whether this would be useful to add to Lucene. > Introduce IVFFlat to Lucene for ANN similarity search > - > > Key: LUCENE-9136 > URL: https://issues.apache.org/jira/browse/LUCENE-9136 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Xin-Chun Zhang >Priority: Major > Attachments: glove-100-angular.png, glove-25-angular.png, > image-2020-03-07-01-22-06-132.png, image-2020-03-07-01-25-58-047.png, > image-2020-03-07-01-27-12-859.png, sift-128-euclidean.png > > Time Spent: 50m > Remaining Estimate: 0h > > Representation learning (RL) has been an established discipline in the > machine learning space for decades but it draws tremendous
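An illustrative fragment of that sequential TopK lookup ({{vectorFormat.ann}} refers to the interface proposed above, not an existing Lucene API; {{searcher}} is an ordinary {{IndexSearcher}}):
{code:java}
// Sort ANN hits by docID so stored-field retrieval proceeds in forward order.
TopDocs topDocs = vectorFormat.ann(10, 10_000);           // hypothetical call from above
ScoreDoc[] hits = topDocs.scoreDocs.clone();
Arrays.sort(hits, Comparator.comparingInt(sd -> sd.doc)); // ascending docID
for (ScoreDoc hit : hits) {
  Document doc = searcher.doc(hit.doc);                   // sequential, like normal search
}
{code}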
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task
dweiss commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task URL: https://github.com/apache/lucene-solr/pull/1304#discussion_r391523979 ## File path: gradle/invoke-javadoc.gradle ## @@ -0,0 +1,335 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// invoke javadoc tool + +allprojects { + + ext { +javadocRoot = project.path.startsWith(':lucene') ? project(':lucene').file("build/docs") : project(':solr').file("build/docs") +javadocDestDir = "${javadocRoot}/${project.name}" + } + + plugins.withType(JavaPlugin) { +def libName = project.path.startsWith(":lucene") ? "Lucene" : "Solr" +def title = "${libName} ${project.version} ${project.name} API".toString() +def srcDirs = sourceSets.main.java.srcDirs.findAll { dir -> dir.exists() } + +task invokeJavadoc { + description "Generates Javadoc API documentation for the main source code. This invokes Ant Javadoc Task." + group "documentation" + + dependsOn sourceSets.main.compileClasspath + + inputs.property("linksource", "no") + inputs.property("linkJUnit", false) + inputs.property("linkHref", []) + + inputs.files sourceSets.main.java.asFileTree + outputs.dir project.javadocRoot + + doFirst { +srcDirs.each { srcDir -> + ant.javadoc( + overview: file("${srcDir}/overview.html"), + packagenames: "org.apache.lucene.*,org.apache.solr.*", + destDir: project.javadocDestDir, + access: "protected", + encoding: "UTF-8", + charset: "UTF-8", + docencoding: "UTF-8", + noindex: "true", + includenosourcepackages: "true", + author: "true", + version: "true", + linksource: inputs.properties.linksource, + use: "true", + failonerror: "true", + locale: "en_US", + windowtitle: title, + doctitle: title, + maxmemory: "512m", + classpath: sourceSets.main.compileClasspath.asPath, + bottom: "Copyright 2000-${buildYear} Apache Software Foundation. All Rights Reserved." + ) { +packageset(dir: srcDir) + +tag(name: "lucene.experimental", description: "WARNING: This API is experimental and might change in incompatible ways in the next release.") +tag(name: "lucene.internal", description: "NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.") +tag(name: "lucene.spi", description: "SPI Name (Note: This is case-insensitive. 
e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):", scope: "types") + +// resolve links to JavaSE and JUnit API +link(offline: "true", href: "https://docs.oracle.com/en/java/javase/11/docs/api/", packageListLoc: project(":lucene").file("tools/javadoc/java11/").toString()) +if (inputs.properties.get("linkJUnit")) { + link(offline: "true", href: "https://junit.org/junit4/javadoc/4.12/", packageListLoc: project(":lucene").file("tools/javadoc/junit").toString()) +} +// resolve inter-module links if 'linkHref' property is specified +inputs.properties.get("linkHref").each { href -> + link(href: href) +} + +arg(line: "--release 11") +arg(line: "-Xdoclint:all,-missing") + +// force locale to be "en_US" (fix for: https://bugs.openjdk.java.net/browse/JDK-8222793) +arg(line: "-J-Duser.language=en -J-Duser.country=US") + } +} + +// append some special table css, prettify css +ant.concat(destfile: "${javadocDestDir}/stylesheet.css", append: "true", fixlastline: "true", encoding: "UTF-8") { + filelist(dir: project(":lucene").file("tools/javadoc"), files: "table_padding.css") + filelist(dir: project(":lucene").file("tools/prettify"), files: "prettify.css") +} +// append prettify to scripts +ant.concat(destfile:
[jira] [Commented] (SOLR-13944) CollapsingQParserPlugin throws NPE instead of bad request
[ https://issues.apache.org/jira/browse/SOLR-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057784#comment-17057784 ] Lucene/Solr QA commented on SOLR-13944: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 2m 5s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 32m 17s{color} | {color:red} core in the patch failed. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 13s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | solr.search.CurrencyRangeFacetCloudTest | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | SOLR-13944 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12996485/SOLR-13944.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene2-us-west.apache.org 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 8a940e7 | | ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 | | Default Java | LTS | | unit | https://builds.apache.org/job/PreCommit-SOLR-Build/710/artifact/out/patch-unit-solr_core.txt | | Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/710/testReport/ | | modules | C: solr/core U: solr/core | | Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/710/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. 
> CollapsingQParserPlugin throws NPE instead of bad request > - > > Key: SOLR-13944 > URL: https://issues.apache.org/jira/browse/SOLR-13944 > Project: Solr > Issue Type: Bug >Affects Versions: 7.3.1 >Reporter: Stefan >Assignee: Munendra S N >Priority: Minor > Attachments: SOLR-13944.patch, SOLR-13944.patch, SOLR-13944.patch, > SOLR-13944.patch > > > I noticed the following NPE: > {code:java} > java.lang.NullPointerException at > org.apache.solr.search.CollapsingQParserPlugin$OrdFieldValueCollector.finish(CollapsingQParserPlugin.java:1021) > at > org.apache.solr.search.CollapsingQParserPlugin$OrdFieldValueCollector.finish(CollapsingQParserPlugin.java:1081) > at > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:230) > at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1602) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1419) > at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:584) > {code} > If I am correct, the problem was already addressed in SOLR-8807. The fix was > not working in this case though, because of a syntax error in the query > (I used the local parameter syntax twice instead of combining it). The > relevant part of the query is: > {code:java} > ={!tag=collapser}{!collapse field=productId sort='merchantOrder asc, price > asc, id asc'} > {code} > After discussing that on the mailing list, I was asked to open a ticket, > because this situation should result in a bad request instead of a > NullPointerException (see > [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201911.mbox/%3CCAMJgJxTuSb%3D8szO8bvHiAafJOs08O_NMB4pcaHOXME4Jj-GO2A%40mail.gmail.com%3E]) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057767#comment-17057767 ] Aaron Sun edited comment on SOLR-14314 at 3/12/20, 10:09 AM: - [~ichattopadhyaya] Thanks for the valuable answer. After changing the JVM heap size to 25 GB, it indeed became much better; there are still a few pauses in the log here and there, but much shorter, around 1~2 seconds. Is it possible to make it even better? I also notice the pauses happen more often around "HttpSolrCall Closing out SolrRequest", which does not seem related to GC pauses. Regarding the multiple Solr nodes (JVMs), I guess you refer to this page: [https://lucene.apache.org/solr/guide/7_2/taking-solr-to-production.html#running-multiple-solr-nodes-per-host]. Does that mean each Solr instance has its own Solr home directory and port? If so, how should the data be split? One core per instance? Does that mean the client needs to manage which Solr instance to talk to? I couldn't find a good example on the internet; I'd appreciate it if you could provide some guidance (a minimal start-command sketch follows at the end of this message). {noformat} 2020-03-12 10:37:19.804 DEBUG (qtp1668016508-3474) [ x:aggprogram] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:37:20.543 DEBUG (qtp1668016508-4857) [ x:aggprogram] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE add\{,id=2101608110097976031} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:39:11.250 DEBUG (qtp1668016508-6123) [ x:aggasset] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:39:11.915 TRACE (qtp1668016508-3376) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2102003090810779924 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002583 refcount=1} LogPtr(1081326) map=1784607161 2020-03-12 10:40:08.746 DEBUG (qtp1668016508-382) [ x:aggasset] o.a.s.s.HttpSolrCall Closing out SolrRequest: \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:40:09.640 DEBUG (qtp1668016508-3239) [ x:aggasset] o.a.s.u.TransactionLog New TransactionLog file=/data1/solr8/aggasset/data/tlog/tlog.0001116, exists=false, size=0, openExisting=false 2020-03-12 10:40:58.182 DEBUG (qtp1668016508-3413) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:41:00.318 TRACE (qtp1668016508-381) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2101701290647113224 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002593 refcount=1} LogPtr(1940077) map=1984880505 2020-03-12 10:41:33.880 DEBUG (qtp1668016508-771) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:41:35.754 TRACE (qtp1668016508-3298) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2102003070806775224 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002598 refcount=1} LogPtr(4246525) map=1493020555 2020-03-12 10:42:23.140 
DEBUG (qtp1668016508-107) [ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660950824311848960,id=2101702170007764324}) 2020-03-12 10:42:23.935 TRACE (qtp1668016508-380) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2101806210189104124 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002605 refcount=1} LogPtr(5096503) map=2041040637 {noformat} And QTime values of 100+ seconds still don't sound too good: {noformat} 2020-03-12 11:08:27.586 INFO (qtp1668016508-15663) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory [agglogtrackitem] webapp=/solr path=/update params=\{commit=true}{add=[2102001130729569124 (1660952325002362880), 2102001220746002624 (1660952325018091520), 2102002130766975424 (1660952325216272385), 2102003020799380624 (1660952325224660992), 2102001150733370324 (1660952325239341056), 2102003090811568924 (1660952325280235520), 2102002130766460924 (1660952325295964161), 2102001220746002024 (1660952325313789954), 2102002200779134024 (1660952325333712896), 2102002280792794524 (1660952325357830145), ... (200 adds)],commit=} 0 134457 2020-03-12 11:08:27.586 DEBUG (qtp1668016508-15663) [ x:agglogtrackitem] o.a.s.s.HttpSolrCall Closing out SolrRequest:
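To illustrate the multiple-nodes-per-host setup asked about above, a minimal sketch based on the referenced guide (paths, ports, and heap sizes are examples, not recommendations): each node gets its own Solr home directory and port. In standalone mode the client does need to pick a node; in SolrCloud mode a client such as CloudSolrClient discovers nodes through ZooKeeper instead.
{noformat}
# one Solr home and one port per node (example values)
bin/solr start -cloud -s /var/solr/node1 -p 8983 -m 25g
bin/solr start -cloud -s /var/solr/node2 -p 8984 -m 25g -z localhost:9983
{noformat}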
[jira] [Commented] (LUCENE-8929) Early Terminating CollectorManager
[ https://issues.apache.org/jira/browse/LUCENE-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057773#comment-17057773 ] Jim Ferenczi commented on LUCENE-8929: -- Interesting results [~sokolov], and thanks for cleaning up the TopFieldCollector. Regarding the challenge, I wonder if DAY_OF_YEAR is a good candidate. Considering the cardinality of the field, it could be more efficient to sort the leaves based on their max values and number of documents before forking new threads? This is not the case here, but for time-based data where the order of segments follows the natural order of insertion, sorting segments prior to search can improve performance dramatically, even for small top N (a minimal sketch of this follows at the end of this comment). This is something we added in Elasticsearch to boost the performance of queries sorted by timestamp on time-based indices: [https://github.com/elastic/elasticsearch/pull/44021] For sorted queries in general, I think it could be interesting to differentiate requests that don't need to follow the natural order of segments. This is true for concurrent requests, but it shouldn't be limited to that case. Today we try to share global state between leaves so that concurrent and sequential requests can early-terminate efficiently. We also handle sorted indices and queries sorted by relevancy plus a tiebreaker, all in the same TopFieldCollector. I know you already made some cleanups, but maybe it is time to have a clear split? Optimizing queries on sorted indices for large top N could be enhanced further if we add a special top-field collector for this purpose. You could, for instance, remove the leaf priority queue entirely since results are already sorted? I am also not sure that we're comparing the same thing in the benchmark. If I understand the last PR correctly, leaves are terminated as soon as they've reached the global lower bound, so they don't break ties on doc IDs. Not sure if that makes a big difference in terms of performance, but it would at least make the top N non-deterministic, so that's a problem. I am supportive of any improvements we want to make on sorted queries, but we should also keep the TopFieldCollector simple. Another idea that we discussed with Adrien would be to give the LeafFieldComparator the ability to skip documents. This is similar in spirit to what we have in queries with setMinCompetitiveScore: {code:java} public interface LeafFieldComparator { void setBottom(final int slot) throws IOException; ... default DocIdSetIterator iterator() { return null; } }{code} If the returned iterator is used in conjunction with the query, it should be possible to stop/modify the remaining collection when setBottom is called by the top collector. With this mechanism in place it could be much simpler to implement the optimization we added in Elasticsearch in [https://github.com/elastic/elasticsearch/pull/49732]. I am not sure if this would be usable for the optimization you want, but I wanted to share this idea since it could have the same impact on sorted queries in Lucene as block-max WAND has on queries sorted by score. 
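A minimal sketch of the segment-ordering idea above, assuming a long point field named "timestamp" and a descending sort (the field name and surrounding setup are hypothetical; {{LeafReader#getPointValues}}, {{PointValues#getMaxPackedValue}} and {{LongPoint#decodeDimension}} are existing Lucene APIs):
{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.PointValues;

// Order segments by the max value of the sort field, descending, so that a
// descending sort can early-terminate after visiting the first few segments.
public class SortLeavesByMaxValue {
  static List<LeafReaderContext> sortedLeaves(IndexReader reader) {
    List<LeafReaderContext> leaves = new ArrayList<>(reader.leaves());
    leaves.sort(Comparator.comparingLong((LeafReaderContext ctx) -> {
      try {
        PointValues points = ctx.reader().getPointValues("timestamp");
        return points == null ? Long.MIN_VALUE
            : LongPoint.decodeDimension(points.getMaxPackedValue(), 0);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }).reversed());
    return leaves; // search leaves in this order instead of index order
  }
}
{code}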
> Early Terminating CollectorManager > -- > > Key: LUCENE-8929 > URL: https://issues.apache.org/jira/browse/LUCENE-8929 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Atri Sharma >Priority: Major > Time Spent: 7h 10m > Remaining Estimate: 0h > > We should have an early terminating collector manager which accurately tracks > hits across all of its collectors and determines when there are enough hits, > allowing all the collectors to abort. > The options for the same are: > 1) Shared total count : Global "scoreboard" where all collectors update their > current hit count. At the end of each document's collection, collector checks > if N > threshold, and aborts if true > 2) State Reporting Collectors: Collectors report their total number of counts > collected periodically using a callback mechanism, and get a proceed or abort > decision. > 1) has the overhead of synchronization in the hot path, 2) can collect > unnecessary hits before aborting. > I am planning to work on 2), unless objections -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14314) Solr does not response most of the update request some times
[ https://issues.apache.org/jira/browse/SOLR-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057767#comment-17057767 ] Aaron Sun commented on SOLR-14314: -- [~ichattopadhyaya] Thanks for the valuable answer. After changing the JVM heap size to 25 GB, it indeed became much better; there are still a few pauses in the log here and there, but much shorter, around 1~2 seconds. Is it possible to make it even better? I also notice the pauses happen more often around "HttpSolrCall Closing out SolrRequest", which does not seem related to GC pauses. Regarding the multiple Solr nodes (JVMs), I guess you refer to this page: [https://lucene.apache.org/solr/guide/7_2/taking-solr-to-production.html#running-multiple-solr-nodes-per-host]. Does that mean each Solr instance has its own Solr home directory and port? If so, how should the data be split? One core per instance? Does that mean the client needs to manage which Solr instance to talk to? I couldn't find a good example on the internet; I'd appreciate it if you could provide some guidance. {noformat} 2020-03-12 10:37:19.804 DEBUG (qtp1668016508-3474) [ x:aggprogram] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:37:20.543 DEBUG (qtp1668016508-4857) [ x:aggprogram] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE add\{,id=2101608110097976031} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:39:11.250 DEBUG (qtp1668016508-6123) [ x:aggasset] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:39:11.915 TRACE (qtp1668016508-3376) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2102003090810779924 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002583 refcount=1} LogPtr(1081326) map=1784607161 2020-03-12 10:40:08.746 DEBUG (qtp1668016508-382) [ x:aggasset] o.a.s.s.HttpSolrCall Closing out SolrRequest: \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:40:09.640 DEBUG (qtp1668016508-3239) [ x:aggasset] o.a.s.u.TransactionLog New TransactionLog file=/data1/solr8/aggasset/data/tlog/tlog.0001116, exists=false, size=0, openExisting=false 2020-03-12 10:40:58.182 DEBUG (qtp1668016508-3413) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:41:00.318 TRACE (qtp1668016508-381) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2101701290647113224 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002593 refcount=1} LogPtr(1940077) map=1984880505 2020-03-12 10:41:33.880 DEBUG (qtp1668016508-771) [ x:agglogtrackitem] o.a.s.u.p.LogUpdateProcessorFactory PRE_UPDATE commit\{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} \{{params(commit=true),defaults(wt=json)}} 2020-03-12 10:41:35.754 TRACE (qtp1668016508-3298) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2102003070806775224 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002598 refcount=1} LogPtr(4246525) map=1493020555 2020-03-12 10:42:23.140 DEBUG (qtp1668016508-107) 
[ x:agglogtrackitem] o.a.s.u.DirectUpdateHandler2 updateDocument(add\{_version_=1660950824311848960,id=2101702170007764324}) 2020-03-12 10:42:23.935 TRACE (qtp1668016508-380) [ x:agglogtrackitem] o.a.s.u.UpdateLog TLOG: added id 2101806210189104124 to tlog\{file=/data1/solr8/agglogtrackitem/data/tlog/tlog.0002605 refcount=1} LogPtr(5096503) map=2041040637 {noformat} > Solr does not response most of the update request some times > > > Key: SOLR-14314 > URL: https://issues.apache.org/jira/browse/SOLR-14314 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level. Issues are Public) >Reporter: Aaron Sun >Priority: Critical > Attachments: jstack_bad_state.log, solrlog.tar.gz > > > Solr version: > {noformat} > solr-spec > 8.4.1 > solr-impl > 8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan - 2020-01-10 13:40:28 > lucene-spec > 8.4.1 > lucene-impl > 8.4.1 832bf13dd9187095831caf69783179d41059d013 - ishan - 2020-01-10 13:35:00 > {noformat} > > Java process: > {noformat} > java -Xms100G -Xmx200G -DSTOP.PORT=8078 -DSTOP.KEY=ardsolrstop > -Dsolr.solr.home=/ardome/solr
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task
mocobeta commented on a change in pull request #1304: LUCENE-9242: generate javadocs by calling Ant javadoc task URL: https://github.com/apache/lucene-solr/pull/1304#discussion_r391504306 ## File path: gradle/invoke-javadoc.gradle ## @@ -0,0 +1,335 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// invoke javadoc tool + +allprojects { + + ext { +javadocRoot = project.path.startsWith(':lucene') ? project(':lucene').file("build/docs") : project(':solr').file("build/docs") +javadocDestDir = "${javadocRoot}/${project.name}" + } + + plugins.withType(JavaPlugin) { +def libName = project.path.startsWith(":lucene") ? "Lucene" : "Solr" +def title = "${libName} ${project.version} ${project.name} API".toString() +def srcDirs = sourceSets.main.java.srcDirs.findAll { dir -> dir.exists() } + +task invokeJavadoc { + description "Generates Javadoc API documentation for the main source code. This invokes Ant Javadoc Task." + group "documentation" + + dependsOn sourceSets.main.compileClasspath + + inputs.property("linksource", "no") + inputs.property("linkJUnit", false) + inputs.property("linkHref", []) + + inputs.files sourceSets.main.java.asFileTree + outputs.dir project.javadocRoot + + doFirst { +srcDirs.each { srcDir -> + ant.javadoc( + overview: file("${srcDir}/overview.html"), + packagenames: "org.apache.lucene.*,org.apache.solr.*", + destDir: project.javadocDestDir, + access: "protected", + encoding: "UTF-8", + charset: "UTF-8", + docencoding: "UTF-8", + noindex: "true", + includenosourcepackages: "true", + author: "true", + version: "true", + linksource: inputs.properties.linksource, + use: "true", + failonerror: "true", + locale: "en_US", + windowtitle: title, + doctitle: title, + maxmemory: "512m", + classpath: sourceSets.main.compileClasspath.asPath, + bottom: "Copyright 2000-${buildYear} Apache Software Foundation. All Rights Reserved." + ) { +packageset(dir: srcDir) + +tag(name: "lucene.experimental", description: "WARNING: This API is experimental and might change in incompatible ways in the next release.") +tag(name: "lucene.internal", description: "NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.") +tag(name: "lucene.spi", description: "SPI Name (Note: This is case-insensitive. 
e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):", scope: "types") + +// resolve links to JavaSE and JUnit API +link(offline: "true", href: "https://docs.oracle.com/en/java/javase/11/docs/api/", packageListLoc: project(":lucene").file("tools/javadoc/java11/").toString()) +if (inputs.properties.get("linkJUnit")) { + link(offline: "true", href: "https://junit.org/junit4/javadoc/4.12/", packageListLoc: project(":lucene").file("tools/javadoc/junit").toString()) +} +// resolve inter-module links if 'linkHref' property is specified +inputs.properties.get("linkHref").each { href -> + link(href: href) +} + +arg(line: "--release 11") +arg(line: "-Xdoclint:all,-missing") + +// force locale to be "en_US" (fix for: https://bugs.openjdk.java.net/browse/JDK-8222793) +arg(line: "-J-Duser.language=en -J-Duser.country=US") + } +} + +// append some special table css, prettify css +ant.concat(destfile: "${javadocDestDir}/stylesheet.css", append: "true", fixlastline: "true", encoding: "UTF-8") { + filelist(dir: project(":lucene").file("tools/javadoc"), files: "table_padding.css") + filelist(dir: project(":lucene").file("tools/prettify"), files: "prettify.css") +} +// append prettify to scripts +ant.concat(destfile:
[jira] [Comment Edited] (SOLR-13264) unexpected autoscaling set-trigger response
[ https://issues.apache.org/jira/browse/SOLR-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057728#comment-17057728 ] Christof Lorenz edited comment on SOLR-13264 at 3/12/20, 9:03 AM: -- This is where the problem is: it looks like belowOp and aboveOp are not being added to validProperties: {code:java} public IndexSizeTrigger(String name) { super(TriggerEventType.INDEXSIZE, name); TriggerUtils.validProperties(validProperties, ABOVE_BYTES_PROP, ABOVE_DOCS_PROP, BELOW_BYTES_PROP, BELOW_DOCS_PROP, COLLECTIONS_PROP); }{code} Without being able to define the Op, the trigger is not usable at all. I am currently working with 7.4 and looking to update to 8.x. was (Author: lochri): This is where the problem is: it looks like belowOp and aboveOp are not being added to validProperties: {code:java} public IndexSizeTrigger(String name) { super(TriggerEventType.INDEXSIZE, name); TriggerUtils.validProperties(validProperties, ABOVE_BYTES_PROP, ABOVE_DOCS_PROP, BELOW_BYTES_PROP, BELOW_DOCS_PROP, COLLECTIONS_PROP); }{code} Without being able to define the Op, the trigger is not usable at all. > unexpected autoscaling set-trigger response > --- > > Key: SOLR-13264 > URL: https://issues.apache.org/jira/browse/SOLR-13264 > Project: Solr > Issue Type: Bug > Components: AutoScaling >Reporter: Christine Poerschke >Priority: Minor > Attachments: SOLR-13264.patch, SOLR-13264.patch > > > Steps to reproduce: > {code} > ./bin/solr start -cloud -noprompt > ./bin/solr create -c demo -d _default -shards 1 -replicationFactor 1 > curl "http://localhost:8983/solr/admin/autoscaling" -d' > { > "set-trigger" : { > "name" : "index_size_trigger", > "event" : "indexSize", > "aboveDocs" : 12345, > "aboveOp" : "SPLITSHARD", > "enabled" : true, > "actions" : [ > { > "name" : "compute_plan", > "class": "solr.ComputePlanAction" > } > ] > } > } > ' > ./bin/solr stop -all > {code} > The {{aboveOp}} is documented on > https://lucene.apache.org/solr/guide/7_6/solrcloud-autoscaling-triggers.html#index-size-trigger > and logically should be accepted (even though it is actually the default) > but unexpectedly an error message is returned {{"Error validating trigger > config index_size_trigger: > TriggerValidationException\{name=index_size_trigger, > details='\{aboveOp=unknown property\}'\}"}}. > From a quick look it seems that in the {{IndexSizeTrigger}} constructor > additional values need to be passed to the {{TriggerUtils.validProperties}} > method i.e. aboveOp, belowOp and maybe others too i.e. > aboveSize/belowSize/etc. Illustrative patch to follow. Thank you. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
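For readers following along, here is a minimal sketch of the kind of one-line change the comment and the issue description point at. It is not the committed patch; it assumes TriggerUtils.validProperties(Set, String...) accepts plain strings and that "aboveOp"/"belowOp" are the property names the validator should allow:

{code:java}
// Illustrative sketch only: register the op-related properties so that
// trigger validation accepts them instead of rejecting the config with
// "aboveOp=unknown property".
public IndexSizeTrigger(String name) {
  super(TriggerEventType.INDEXSIZE, name);
  TriggerUtils.validProperties(validProperties,
      ABOVE_BYTES_PROP, ABOVE_DOCS_PROP,
      BELOW_BYTES_PROP, BELOW_DOCS_PROP,
      COLLECTIONS_PROP,
      "aboveOp", "belowOp"); // hypothetical literals for the documented op properties
}
{code}

With a change along these lines, the set-trigger payload from the steps to reproduce should validate rather than fail.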
[jira] [Commented] (SOLR-13264) unexpected autoscaling set-trigger response
[ https://issues.apache.org/jira/browse/SOLR-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057728#comment-17057728 ] Christof Lorenz commented on SOLR-13264: This is where the problem is: it looks like belowOp and aboveOp are not being added to validProperties: {code:java} public IndexSizeTrigger(String name) { super(TriggerEventType.INDEXSIZE, name); TriggerUtils.validProperties(validProperties, ABOVE_BYTES_PROP, ABOVE_DOCS_PROP, BELOW_BYTES_PROP, BELOW_DOCS_PROP, COLLECTIONS_PROP); }{code} Without being able to define the Op, the trigger is not usable at all. > unexpected autoscaling set-trigger response > --- > > Key: SOLR-13264 > URL: https://issues.apache.org/jira/browse/SOLR-13264 > Project: Solr > Issue Type: Bug > Components: AutoScaling >Reporter: Christine Poerschke >Priority: Minor > Attachments: SOLR-13264.patch, SOLR-13264.patch > > > Steps to reproduce: > {code} > ./bin/solr start -cloud -noprompt > ./bin/solr create -c demo -d _default -shards 1 -replicationFactor 1 > curl "http://localhost:8983/solr/admin/autoscaling" -d' > { > "set-trigger" : { > "name" : "index_size_trigger", > "event" : "indexSize", > "aboveDocs" : 12345, > "aboveOp" : "SPLITSHARD", > "enabled" : true, > "actions" : [ > { > "name" : "compute_plan", > "class": "solr.ComputePlanAction" > } > ] > } > } > ' > ./bin/solr stop -all > {code} > The {{aboveOp}} is documented on > https://lucene.apache.org/solr/guide/7_6/solrcloud-autoscaling-triggers.html#index-size-trigger > and logically should be accepted (even though it is actually the default) > but unexpectedly an error message is returned {{"Error validating trigger > config index_size_trigger: > TriggerValidationException\{name=index_size_trigger, > details='\{aboveOp=unknown property\}'\}"}}. > From a quick look it seems that in the {{IndexSizeTrigger}} constructor > additional values need to be passed to the {{TriggerUtils.validProperties}} > method i.e. aboveOp, belowOp and maybe others too i.e. > aboveSize/belowSize/etc. Illustrative patch to follow. Thank you. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
[ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057682#comment-17057682 ] Hongtai Xue commented on SOLR-14300: Hi, I attached a patch to fix this issue. h3. About the bug The if statement here is wrong. {code:java} for (BooleanClause clause : clauses) { ... // NOTE, for query "B:1 OR B:2" // when parsing comes to "B:2", // fieldValues here will not be null since "B:1" has been stored in fieldValues fieldValues = fmap.get(sfield); ... if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) { fieldValues = new ArrayList<>(2); // <-- here, if B is not indexed, fieldValues will be overwritten, and "B:1" will be lost fmap.put(sfield, fieldValues); } ... } {code} Please check the comment above: if sfield is not indexed, fieldValues will always be overwritten, even when fieldValues is not null. Another question is why only "q=A:1 OR B:1 OR A:2 OR B:2" causes a problem while "q=A:1 OR A:2 OR B:1 OR B:2" is OK. The answer is [here|https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L705]: the buggy code only runs when the field changes between clauses, so if adjacent clauses use the same field, nothing happens. h3. How to fix Obviously, it's a very simple bug, and we only changed one line to fix it. {code:java} -if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) { +if (fieldValues == null && (useTermsQuery || !sfield.indexed())) { {code} fieldValues will only be initialized when it's null. h3. Test We confirmed the issue is fixed: the following queries now return the same results. * query1: [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] {code:json} "debug":{ "rawquerystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)", "querystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)", "parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))", "parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])", "QParser":"LuceneQParser"} {code} * query2: [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)=query] {code:json} "debug":{ "rawquerystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)", "querystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)", "parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))", "parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])", "QParser":"LuceneQParser"}} {code} > Some conditional clauses on unindexed field will be ignored by query parser > in some specific cases > -- > > Key: SOLR-14300 > URL: https://issues.apache.org/jira/browse/SOLR-14300 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: query parsers >Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > Environment: Solr 7.3.1 > centos7.5 >Reporter: Hongtai Xue >Priority: Minor > Labels: newbie, patch > Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > > Attachments: SOLR-14300.patch > > > In some specific cases, some conditional clauses on unindexed field will be > ignored > * for a query like q=A:1 OR B:1 OR A:2 OR B:2, > if field B is not indexed (but docValues="true"), "B:1" will be lost. > > * but if you write a query like q=A:1 OR A:2 OR B:1 OR B:2, > it will work perfectly. > the only difference between the two queries is that they are written in different orders. > one is *ABAB*, another is *AABB.* > > *steps to reproduce* > you can easily reproduce this problem on a solr collection with _default > configset and exampledocs/books.csv data. > # create a _default collection > {code:java} > bin/solr create -c books -s 2 -rf 2{code} > # post books.csv. > {code:java} > bin/post -c books example/exampledocs/books.csv{code} > # run the following queries. > ** query1: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] > ** query2: >
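To see why the operator precedence matters, here is a small, self-contained simulation of the accumulation logic described in the comment above. The names fmap and fieldValues mirror the parser snippet; everything else is illustrative, not Solr code, and it deliberately ignores the detail that the real parser only reaches this code when the field changes between clauses (which is why only the ABAB ordering triggers the bug):

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldValuesDemo {
  public static void main(String[] args) {
    Map<String, List<String>> fmap = new HashMap<>();
    boolean useTermsQuery = false; // illustrative setting
    boolean indexed = false;       // field B: docValues only, not indexed

    for (String value : new String[] {"1", "2"}) { // clauses B:1, B:2
      List<String> fieldValues = fmap.get("B");
      // Buggy:  (fieldValues == null && useTermsQuery) || !indexed
      //   is true on every clause of an unindexed field, so the list is
      //   replaced each time and earlier values like "B:1" are lost.
      // Fixed:  fieldValues == null && (useTermsQuery || !indexed)
      //   is true only when no list exists yet.
      if (fieldValues == null && (useTermsQuery || !indexed)) {
        fieldValues = new ArrayList<>(2);
        fmap.put("B", fieldValues);
      }
      fieldValues.add(value);
    }
    System.out.println(fmap); // {B=[1, 2]} with the fix; the bug would print {B=[2]}
  }
}
{code}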
[jira] [Updated] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
[ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongtai Xue updated SOLR-14300: --- Attachment: SOLR-14300.patch > Some conditional clauses on unindexed field will be ignored by query parser > in some specific cases > -- > > Key: SOLR-14300 > URL: https://issues.apache.org/jira/browse/SOLR-14300 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > Environment: Solr 7.3.1 > centos7.5 >Reporter: Hongtai Xue >Priority: Minor > Labels: newbie, patch > Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > > Attachments: SOLR-14300.patch > > > In some specific cases, some conditional clauses on unindexed field will be > ignored > * for a query like q=A:1 OR B:1 OR A:2 OR B:2, > if field B is not indexed (but docValues="true"), "B:1" will be lost. > > * but if you write a query like q=A:1 OR A:2 OR B:1 OR B:2, > it will work perfectly. > the only difference between the two queries is that they are written in different orders. > one is *ABAB*, another is *AABB.* > > *steps to reproduce* > you can easily reproduce this problem on a solr collection with _default > configset and exampledocs/books.csv data. > # create a _default collection > {code:java} > bin/solr create -c books -s 2 -rf 2{code} > # post books.csv. > {code:java} > bin/post -c books example/exampledocs/books.csv{code} > # run the following queries. > ** query1: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] > ** query2: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)=query] > ** then you can see that the parsed queries are different. > *** query1. ("name_str:Foundation" is lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg > OR cat:cd)", > "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a > 68 65 72 65 67]]))", > "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] > TO [4a 68 65 72 65 67]])", > "QParser":"LuceneQParser"}}{code} > *** query2. ("name_str:Foundation" isn't lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book > OR cat:cd)", > "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f > 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO > [4a 68 65 72 65 67]])))", > "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 > 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 > 67] TO [4a 68 65 72 65 67]]))", > "QParser":"LuceneQParser"}{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
[ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongtai Xue updated SOLR-14300: --- Labels: newbie patch (was: patch) > Some conditional clauses on unindexed field will be ignored by query parser > in some specific cases > -- > > Key: SOLR-14300 > URL: https://issues.apache.org/jira/browse/SOLR-14300 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > Environment: Solr 7.3.1 > centos7.5 >Reporter: Hongtai Xue >Priority: Minor > Labels: newbie, patch > Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > > > In some specific cases, some conditional clauses on unindexed field will be > ignored > * for a query like q=A:1 OR B:1 OR A:2 OR B:2, > if field B is not indexed (but docValues="true"), "B:1" will be lost. > > * but if you write a query like q=A:1 OR A:2 OR B:1 OR B:2, > it will work perfectly. > the only difference between the two queries is that they are written in different orders. > one is *ABAB*, another is *AABB.* > > *steps to reproduce* > you can easily reproduce this problem on a solr collection with _default > configset and exampledocs/books.csv data. > # create a _default collection > {code:java} > bin/solr create -c books -s 2 -rf 2{code} > # post books.csv. > {code:java} > bin/post -c books example/exampledocs/books.csv{code} > # run the following queries. > ** query1: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] > ** query2: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)=query] > ** then you can see that the parsed queries are different. > *** query1. ("name_str:Foundation" is lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg > OR cat:cd)", > "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a > 68 65 72 65 67]]))", > "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] > TO [4a 68 65 72 65 67]])", > "QParser":"LuceneQParser"}}{code} > *** query2. ("name_str:Foundation" isn't lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book > OR cat:cd)", > "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f > 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO > [4a 68 65 72 65 67]])))", > "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 > 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 > 67] TO [4a 68 65 72 65 67]]))", > "QParser":"LuceneQParser"}{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
[ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongtai Xue updated SOLR-14300: --- Labels: patch (was: ) > Some conditional clauses on unindexed field will be ignored by query parser > in some specific cases > -- > > Key: SOLR-14300 > URL: https://issues.apache.org/jira/browse/SOLR-14300 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > Environment: Solr 7.3.1 > centos7.5 >Reporter: Hongtai Xue >Priority: Minor > Labels: patch > Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > > > In some specific cases, some conditional clauses on unindexed field will be > ignored > * for a query like q=A:1 OR B:1 OR A:2 OR B:2, > if field B is not indexed (but docValues="true"), "B:1" will be lost. > > * but if you write a query like q=A:1 OR A:2 OR B:1 OR B:2, > it will work perfectly. > the only difference between the two queries is that they are written in different orders. > one is *ABAB*, another is *AABB.* > > *steps to reproduce* > you can easily reproduce this problem on a solr collection with _default > configset and exampledocs/books.csv data. > # create a _default collection > {code:java} > bin/solr create -c books -s 2 -rf 2{code} > # post books.csv. > {code:java} > bin/post -c books example/exampledocs/books.csv{code} > # run the following queries. > ** query1: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] > ** query2: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)=query] > ** then you can see that the parsed queries are different. > *** query1. ("name_str:Foundation" is lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg > OR cat:cd)", > "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a > 68 65 72 65 67]]))", > "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] > TO [4a 68 65 72 65 67]])", > "QParser":"LuceneQParser"}}{code} > *** query2. ("name_str:Foundation" isn't lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book > OR cat:cd)", > "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f > 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO > [4a 68 65 72 65 67]])))", > "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 > 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 > 67] TO [4a 68 65 72 65 67]]))", > "QParser":"LuceneQParser"}{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14300) Some conditional clauses on unindexed field will be ignored by query parser in some specific cases
[ https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongtai Xue updated SOLR-14300: --- Fix Version/s: 7.3 7.4 7.5 7.6 7.7 8.0 8.1 8.2 8.3 8.4 Affects Version/s: (was: 7.3.1) 7.3 7.4 7.5 7.6 7.7 8.0 8.1 8.2 8.3 8.4 > Some conditional clauses on unindexed field will be ignored by query parser > in some specific cases > -- > > Key: SOLR-14300 > URL: https://issues.apache.org/jira/browse/SOLR-14300 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > Environment: Solr 7.3.1 > centos7.5 >Reporter: Hongtai Xue >Priority: Minor > Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4 > > > In some specific cases, some conditional clauses on unindexed field will be > ignored > * for a query like q=A:1 OR B:1 OR A:2 OR B:2, > if field B is not indexed (but docValues="true"), "B:1" will be lost. > > * but if you write a query like q=A:1 OR A:2 OR B:1 OR B:2, > it will work perfectly. > the only difference between the two queries is that they are written in different orders. > one is *ABAB*, another is *AABB.* > > *steps to reproduce* > you can easily reproduce this problem on a solr collection with _default > configset and exampledocs/books.csv data. > # create a _default collection > {code:java} > bin/solr create -c books -s 2 -rf 2{code} > # post books.csv. > {code:java} > bin/post -c books example/exampledocs/books.csv{code} > # run the following queries. > ** query1: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)=query] > ** query2: > [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)=query] > ** then you can see that the parsed queries are different. > *** query1. ("name_str:Foundation" is lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg > OR cat:cd)", > "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a > 68 65 72 65 67]]))", > "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] > TO [4a 68 65 72 65 67]])", > "QParser":"LuceneQParser"}}{code} > *** query2. ("name_str:Foundation" isn't lost.) > {code:json} > "debug":{ > "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book > OR cat:cd)", > "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR > cat:cd)", > "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f > 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO > [4a 68 65 72 65 67]])))", > "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 > 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 > 67] TO [4a 68 65 72 65 67]]))", > "QParser":"LuceneQParser"}{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyle updated SOLR-14317: Attachment: SOLR-14317.patch > HttpClusterStateProvider throws exception when only one node down > - > > Key: SOLR-14317 > URL: https://issues.apache.org/jira/browse/SOLR-14317 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7.1, 7.7.2 >Reporter: Lyle >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-14317.patch > > Time Spent: 10m > Remaining Estimate: 0h > > When creating a CloudSolrClient with solrUrls, if the first url in the solrUrls > list is invalid or the server is down, it will throw an exception directly rather > than trying the remaining urls. > In > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L65], > if fetchLiveNodes(initialClient) hits any IOException, then in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java#L648], > the exception will be caught and a SolrServerException thrown to the upper caller, > while no IOException will be caught in > HttpClusterStateProvider.fetchLiveNodes(HttpClusterStateProvider.java:200). > The SolrServerException should be caught as well in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L69], > so that if the first node provided in solrUrls is down, we can try to use the second > to fetch live nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
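A minimal sketch of the behaviour the description asks for, not the attached patch: iterate over the provided urls and treat SolrServerException like IOException, so a dead first node does not abort the lookup. The LiveNodesFetcher class and its fetchLiveNodes stand-in are hypothetical names introduced here for illustration:

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.Set;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

// Hypothetical helper, mirroring the loop in HttpClusterStateProvider.
class LiveNodesFetcher {
  Set<String> fetchFromFirstReachable(List<String> solrUrls) {
    for (String url : solrUrls) {
      try (HttpSolrClient client = new HttpSolrClient.Builder(url).build()) {
        return fetchLiveNodes(client); // stand-in for the provider's private fetchLiveNodes
      } catch (IOException | SolrServerException e) {
        // node invalid or down; fall through and try the next url in the list
      }
    }
    throw new RuntimeException("Could not fetch live nodes from any provided Solr url");
  }

  private Set<String> fetchLiveNodes(HttpSolrClient client)
      throws IOException, SolrServerException {
    // placeholder for the real live-nodes lookup
    throw new UnsupportedOperationException();
  }
}
{code}

The point of the catch clause is simply that both exception types fall through to the next iteration instead of propagating out of the loop.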
[jira] [Updated] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyle updated SOLR-14317: Attachment: (was: SOLR-14317) > HttpClusterStateProvider throws exception when only one node down > - > > Key: SOLR-14317 > URL: https://issues.apache.org/jira/browse/SOLR-14317 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7.1, 7.7.2 >Reporter: Lyle >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-14317.patch > > Time Spent: 10m > Remaining Estimate: 0h > > When creating a CloudSolrClient with solrUrls, if the first url in the solrUrls > list is invalid or the server is down, it will throw an exception directly rather > than trying the remaining urls. > In > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L65], > if fetchLiveNodes(initialClient) hits any IOException, then in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java#L648], > the exception will be caught and a SolrServerException thrown to the upper caller, > while no IOException will be caught in > HttpClusterStateProvider.fetchLiveNodes(HttpClusterStateProvider.java:200). > The SolrServerException should be caught as well in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L69], > so that if the first node provided in solrUrls is down, we can try to use the second > to fetch live nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lyle updated SOLR-14317: Attachment: SOLR-14317 > HttpClusterStateProvider throws exception when only one node down > - > > Key: SOLR-14317 > URL: https://issues.apache.org/jira/browse/SOLR-14317 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7.1, 7.7.2 >Reporter: Lyle >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-14317 > > Time Spent: 10m > Remaining Estimate: 0h > > When creating a CloudSolrClient with solrUrls, if the first url in the solrUrls > list is invalid or the server is down, it will throw an exception directly rather > than trying the remaining urls. > In > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L65], > if fetchLiveNodes(initialClient) hits any IOException, then in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java#L648], > the exception will be caught and a SolrServerException thrown to the upper caller, > while no IOException will be caught in > HttpClusterStateProvider.fetchLiveNodes(HttpClusterStateProvider.java:200). > The SolrServerException should be caught as well in > [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L69], > so that if the first node provided in solrUrls is down, we can try to use the second > to fetch live nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org