[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734499#comment-15734499 ]

Mark Miller commented on SOLR-9824:
-----------------------------------

bq. It may be safer to ensure that interrupt() only affects the queue.poll calls and not anything else;

It's fine - in our case we know all the outstanding documents have been added to the queue by the time we are in the blockUntilFinished block. We don't access CUSS in a multi-threaded manner. Once we are in blockUntilFinished and the queue is empty, we know we are just interrupting poll.

We want to use CUSS internally because I don't want to dupe a bunch of this logic. But our use case and its general use case are very different. We shouldn't try to fit both use cases in the same box. This option will be for use cases like ours. You are not just keeping a server around to pull out and use to add docs as needed over time. You are creating an instance for a known load of docs, it goes away after that load, you don't want to spin up or down connections or threads, and we access the CUSS instance single-threaded. That is the case we need to optimize for.

I'm much less interested in improving CUSS for non-internal use anyway; I'd rather spin any changes for that use case into another issue. It's really not a great client for SolrCloud for a lot of other reasons. And it's very easy to introduce bugs with changes that look like they don't hurt. We have seen those same types of changes hurt before.

> Documents indexed in bulk are replicated using too many HTTP requests
> ----------------------------------------------------------------------
>
>                 Key: SOLR-9824
>                 URL: https://issues.apache.org/jira/browse/SOLR-9824
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: SolrCloud
> Affects Versions: 6.3
>         Reporter: David Smiley
>      Attachments: SOLR-9824.patch, SOLR-9824.patch, SOLR-9824.patch
>
> This takes a while to explain; bear with me. While working on bulk indexing small documents, I looked at the logs of my SolrCloud nodes. I noticed that shards would see an /update log message every ~6ms, which is *way* too much. These are requests from one shard (that isn't a leader/replica for these docs but the recipient from my client) to the target shard leader (no additional replicas). One might ask why I'm not sending docs to the right shard in the first place; I have a reason but it's beside the point -- there's a real Solr perf problem here, and this probably applies equally to replicationFactor>1 situations too. I could turn off the logs but that would hide useful stuff, and it's disconcerting to me that so many short-lived HTTP requests are happening, somehow at the behest of DistributedUpdateProcessor. After lots of analysis and debugging and hair pulling, I finally figured it out.
> In SOLR-7333, [~tpot] introduced an optimization called {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will poll the internal queue with a '0' timeout, so that it can close the connection without it hanging around any longer than needed. This part makes sense to me. Currently the only spot that has the smarts to set this flag is {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the last document. So if a shard received docs in a javabin stream (but not other formats) one would expect the _last_ document to have this flag. There's even a test. Docs without this flag get the default poll time; for javabin it's 25ms. Okay.
> I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send javabin data in a batch, the intended efficiencies of SOLR-7333 would apply. I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW DistributedUpdateProcessor uses CUSC too). CUSC uses the RequestWriter (defaulting to javabin) to send each document separately without any leading marker or trailing marker. For the XML format by comparison, there is a leading and trailing marker ( ... ). Since there's no outer container for the javabin unmarshalling to detect the last document, it marks _every_ document as {{req.lastDocInBatch()}}! Ouch!
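To make the single-threaded usage pattern above concrete, here is a minimal sketch of the idea: one runner draining a queue over a very long poll, with blockUntilFinished() interrupting only the idle poll once every add has landed in the queue. The class and method names mirror the CUSS discussion but are illustrative stand-ins, not the actual ConcurrentUpdateSolrClient code.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch only: single worker, one long-lived "connection", interrupt-to-finish. */
class SingleRunnerSketch {
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private volatile Thread runnerThread;

  void start() {
    runnerThread = new Thread(() -> {
      try {
        while (true) {
          // Max-int-style poll: keep the one connection open for the whole
          // batch instead of relying on a per-document lastDocInBatch flag.
          String doc = queue.poll(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
          send(doc);
        }
      } catch (InterruptedException expected) {
        // blockUntilFinished() interrupted the idle poll; close the stream.
      }
    });
    runnerThread.start();
  }

  void add(String doc) throws InterruptedException {
    queue.put(doc); // in the internal use case, all adds precede blockUntilFinished()
  }

  void blockUntilFinished() throws InterruptedException {
    while (!queue.isEmpty()) {
      Thread.sleep(10); // wait for the runner to drain everything
    }
    // Per the discussion above, care is needed so the interrupt only ever
    // lands on the idle poll and never mid-send; this sketch glosses over
    // that race, which the single-threaded contract is meant to avoid.
    runnerThread.interrupt();
    runnerThread.join();
  }

  private void send(String doc) { /* write one doc to the open HTTP stream */ }
}
{code}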
[JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18479 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18479/
Java: 64bit/jdk-9-ea+140 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.

FAILED:  org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail

Error Message:
expected:<200> but was:<404>

Stack Trace:
java.lang.AssertionError: expected:<200> but was:<404>
	at __randomizedtesting.SeedInfo.seed([E7F611551FECF6A2:8F49247FCF76E44E]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.failNotEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:128)
	at org.junit.Assert.assertEquals(Assert.java:472)
	at org.junit.Assert.assertEquals(Assert.java:456)
	at org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.cancelDelegationToken(TestSolrCloudWithDelegationTokens.java:140)
	at org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail(TestSolrCloudWithDelegationTokens.java:304)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734466#comment-15734466 ]

Mark Miller commented on SOLR-9824:
-----------------------------------

I'd rather not break the current client behavior. As it is now, you can keep a CUSS around and it will spin up and down threads as activity comes and goes. We are using CUSS in a very specific way though - we don't want threads to spin up and down; really we want one thread and we want it to stick around, whether we have a long GC pause, too much load, or whatever. A lot of load is not when you want to start tearing down and building more connections.

I don't think we ever want to use 0 for this distrib update use case - only if you could fix that lastDoc marker issue, and given how things work, you really can't. There are also back-compat breaks I don't want to introduce, interruption behavior being one of them.

Considering we are dealing with Runnables, I see no problem with using Futures to cancel tasks; it seems odder to try and break threads out of the executor. Futures are how you are supposed to wait for or cancel executor tasks.
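Since the thread-vs-Future question comes up here, a small self-contained sketch of the Future-based cancellation Mark prefers: {{Future.cancel(true)}} delivers the interrupt through the executor rather than by grabbing the worker {{Thread}} directly. This is an illustration, not the patch's code, and the 100ms sleep is a demo-only stand-in for a real drained-queue check.

{code:java}
import java.util.concurrent.*;

public class CancelViaFuture {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // The task blocks in a long poll, like a CUSS runner waiting for docs.
    Future<?> runner = pool.submit(() -> {
      try {
        while (true) {
          String doc = queue.poll(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
          System.out.println("sending " + doc);
        }
      } catch (InterruptedException e) {
        // cancel(true) below delivers this interrupt via the executor
      }
    });

    queue.put("doc1");
    Thread.sleep(100);       // demo only: give the runner time to drain the queue
    runner.cancel(true);     // interrupt the task without touching its Thread
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}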
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734440#comment-15734440 ]

David Smiley commented on SOLR-9824:
------------------------------------

I like the idea of removing lastDoc; it's hard to work with. I'm not sure why we even need a longLastingThreads flag. Can't we just ensure that blockUntilFinished() always triggers the runners to interrupt their polls? You're doing this now in the patch (albeit only when longLastingThreads==true), but why not simply always? And why are Futures needed to do the cancel() when we can interrupt the Threads directly? We could expose them easily by having the Runner store its current thread when run() is called.

If we didn't need a "longLastingThreads" boolean, then the client could set the poll time independently, perhaps defaulting to '0' but setting it very long if it intends to call blockUntilFinished(). Arguably, blockUntilFinished() might log a warning if the poll time is zero, because that would amount to misconfiguration.

It may be safer to ensure that interrupt() *only* affects the queue.poll calls and not anything else; but I'm unsure if anything else internal to writing the document to the stream would interrupt/cancel partway, to warrant caring. Do you know? It's doable but would require some extra volatile boolean state variables like doStop and currentlyPolling.

I'm confused about something: the first line of sendUpdateStream() is {{while(!queue.isEmpty)}}, but why even do that given that we poll the queue? i.e. why not {{while(true)}}? Or perhaps why even loop at all, given the caller has a similar loop.
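For comparison with the Future approach, here is a sketch of the direct-interrupt alternative David suggests: the Runner records its own thread when {{run()}} starts, and flags like {{doStop}}/{{currentlyPolling}} try to confine the interrupt to the queue poll. All names and structure here are hypothetical, not the real CUSS Runner.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class RunnerSketch implements Runnable {
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private volatile Thread myThread;          // exposed for direct interrupt
  private volatile boolean doStop;
  private volatile boolean currentlyPolling;

  @Override public void run() {
    myThread = Thread.currentThread();       // store the thread, per the suggestion
    while (!doStop) {
      String doc;
      try {
        currentlyPolling = true;
        doc = queue.poll(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        continue; // loop back and re-check doStop
      } finally {
        currentlyPolling = false;
      }
      write(doc); // ideally never sees an interrupt
    }
  }

  /** What blockUntilFinished() would call once the queue has drained. */
  void stopRunner() {
    doStop = true;
    Thread t = myThread;
    if (t != null && currentlyPolling) {
      t.interrupt(); // knock the runner out of its idle poll
    }
    // Note: the currentlyPolling check is racy; closing that gap fully is
    // exactly the extra-state complication discussed in the comment above.
  }

  private void write(String doc) { /* stream one doc to the server */ }
}
{code}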
[jira] [Updated] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-9824:
------------------------------
    Attachment: SOLR-9824.patch
[JENKINS] Lucene-Solr-6.x-Linux (64bit/jdk1.8.0_102) - Build # 2375 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2375/
Java: 64bit/jdk1.8.0_102 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.

FAILED:  org.apache.solr.cloud.TestRandomRequestDistribution.test

Error Message:
Shard a1x2_shard1_replica2 received all 10 requests

Stack Trace:
java.lang.AssertionError: Shard a1x2_shard1_replica2 received all 10 requests
	at __randomizedtesting.SeedInfo.seed([76F5C3323B1962A6:FEA1FCE895E50F5E]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.apache.solr.cloud.TestRandomRequestDistribution.testRequestTracking(TestRandomRequestDistribution.java:121)
	at org.apache.solr.cloud.TestRandomRequestDistribution.test(TestRandomRequestDistribution.java:65)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
	at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Updated] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-9824:
------------------------------
    Attachment: (was: SOLR-9824.patch)
[jira] [Updated] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-9824:
------------------------------
    Attachment: SOLR-9824.patch
[jira] [Updated] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-9824:
------------------------------
    Attachment: SOLR-9824.patch
[jira] [Updated] (LUCENE-7570) Tragic events during merges can lead to deadlock
[ https://issues.apache.org/jira/browse/LUCENE-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-7570:
---------------------------------------
    Attachment: LUCENE-7570.patch

Here's a patch w/ a test case reproducing the deadlock, and a simple fix that just postpones launching merges until after we are out of the commit lock.

> Tragic events during merges can lead to deadlock
> -------------------------------------------------
>
>                 Key: LUCENE-7570
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7570
>             Project: Lucene - Core
>          Issue Type: Bug
>       Components: core/index
> Affects Versions: 5.5, master (7.0)
>         Reporter: Joey Echeverria
>      Attachments: LUCENE-7570.patch, thread_dump.txt
>
> When an {{IndexWriter#commit()}} is stalled due to too many pending merges, you can get a deadlock if the currently active merge thread hits a tragic event.
> # The thread performing the commit synchronizes on the {{commitLock}} in {{commitInternal}}.
> # The thread goes on to call {{ConcurrentMergeScheduler#doStall()}}, which {{waits()}} on the {{ConcurrentMergeScheduler}} object. This releases the merge scheduler's monitor lock, but not the {{commitLock}} in {{IndexWriter}}.
> # Sometime after this wait begins, the merge thread gets a tragic exception and calls {{IndexWriter#tragicEvent()}}, which in turn calls {{IndexWriter#rollbackInternal()}}.
> # The {{IndexWriter#rollbackInternal()}} synchronizes on the {{commitLock}}, which is still held by the committing thread from (1) above, which is waiting on the merge(s) to complete. Hence, deadlock.
> We hit this bug with Lucene 5.5, but I looked at the code in the master branch and it looks like the deadlock still exists there as well.
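The lock ordering described in the report is easy to reproduce in miniature. The sketch below uses stand-in monitors ({{commitLock}}, {{mergeScheduler}}) rather than real IndexWriter internals: run {{commit()}} on one thread and {{tragicEvent()}} on another after the wait begins, and the pair deadlocks exactly as in steps 1-4.

{code:java}
public class DeadlockSketch {
  private final Object commitLock = new Object();
  private final Object mergeScheduler = new Object();

  void commit() throws InterruptedException {
    synchronized (commitLock) {          // (1) commit takes commitLock
      synchronized (mergeScheduler) {
        // (2) stall waiting for merges; wait() releases only the
        // scheduler's monitor, NOT commitLock
        mergeScheduler.wait();
      }
    }
  }

  void tragicEvent() {
    // (3)+(4) the rollback path needs commitLock, still held by commit(),
    // so the notify that would wake the committer is never reached.
    synchronized (commitLock) {
      synchronized (mergeScheduler) {
        mergeScheduler.notifyAll();
      }
    }
  }

  public static void main(String[] args) throws Exception {
    DeadlockSketch s = new DeadlockSketch();
    Thread committer = new Thread(() -> {
      try { s.commit(); } catch (InterruptedException ignored) {}
    });
    committer.start();
    Thread.sleep(200);   // let the committer reach its wait()
    s.tragicEvent();     // blocks forever on commitLock: deadlock
  }
}
{code}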
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733911#comment-15733911 ]

Mark Miller commented on SOLR-9824:
-----------------------------------

So what I do is use a max-int poll time and take advantage of the fact that we know internally we use CUSS like:

* Create CUSS
* Add docs
* blockUntilFinished
* Done

This is not the general case where we may then add more docs or something; when we call blockUntilFinished, we know all the docs we are going to add are in the queue. So we use a max-int queue poll time - we want a single connection and we know how long we want to keep the connection up. That means we have to bail on those queue waits though - so in blockUntilFinished, we just wait until the queue has been emptied and then we interrupt the waits on the queue polling so that all the runners shut down.

The idea is, rather than this lastDoc flag that has so many issues, we just try to keep polling through the length of the request.
[jira] [Updated] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-9824:
------------------------------
    Attachment: SOLR-9824.patch

Here is an idea that would work for all formats in all cases.
[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues
[ https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733820#comment-15733820 ]

Hoss Man commented on SOLR-5944:
--------------------------------

I've pushed an update to {{TestInPlaceUpdatesDistrib}} that refactors some randomized index building into a new {{buildRandomIndex}} helper method, which is now used by most of the "test" methods in that class.

It's *NOT* currently used by {{docValuesUpdateTest()}} even though it was designed to be -- I made several aborted attempts to switch that method to use {{buildRandomIndex}}, knowing that many assertions in that test would need other tweaks to account for more docs in the index, but I kept running into weird failures that took me a while to explain. Ultimately I realized the problem is that currently, {{schema-inplace-updates.xml}} is configured with {{inplace_updatable_float}} having a {{default="0"}} setting -- which (besides making most of our testing using that field much weaker than I realized) means that the initial sanity checks in {{docValuesUpdateTest()}} are even less useful than I originally thought.

[~ichattopadhyaya]: do you remember why this default is set on {{inplace_updatable_float}} (and {{inplace_updatable_int}}) in {{schema-inplace-updates.xml}}? ... I see {{TestInPlaceUpdatesDistrib}} doing a preliminary sanity-check assertion that the defaults exist in the schema, but I don't see any test that seems to care/expect that default to work, and it seems to weaken our test coverage of the more common case...

Specifically: when I tried to remove it, I started seeing NPEs from {{SolrIndexSearcher.decorateDocValueFields}} in various tests:

{noformat}
   [junit4] ERROR   0.05s J2 | TestInPlaceUpdatesStandalone.testUpdateTwoDifferentFields <<<
   [junit4]    > Throwable #1: java.lang.NullPointerException
   [junit4]    > 	at __randomizedtesting.SeedInfo.seed([29D61963E75459C5:26F189B6F032D44A]:0)
   [junit4]    > 	at org.apache.solr.search.SolrIndexSearcher.decorateDocValueFields(SolrIndexSearcher.java:810)
   [junit4]    > 	at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:599)
   [junit4]    > 	at org.apache.solr.update.processor.AtomicUpdateDocumentMerger.doInPlaceUpdateMerge(AtomicUpdateDocumentMerger.java:286)
   [junit4]    > 	at org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:1414)
   [junit4]    > 	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1072)
   [junit4]    > 	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:751)
   [junit4]    > 	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorF
{noformat}

...while it certainly makes sense to have some testing of in-place updates when there is a schema-specified {{default}} that's *non-zero* (although see the previously mentioned SOLR-9838 for some issues with doing that currently), I'm now concerned about how much of the code may *only* be working _because_ these fields have an explicit {{default="0"}}?
> Support updates of numeric DocValues > > > Key: SOLR-5944 > URL: https://issues.apache.org/jira/browse/SOLR-5944 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Shalin Shekhar Mangar > Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, > TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, > TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, > TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, def
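For readers without the test schema handy, a field carrying the kind of docValues default Hoss describes looks roughly like this in a Solr schema. This is a hypothetical reconstruction (in-place-updatable fields are non-indexed, non-stored docValues fields), not a verbatim copy of {{schema-inplace-updates.xml}}:

{code:xml}
<!-- hypothetical reconstruction; the actual test schema may differ -->
<field name="inplace_updatable_float" type="float" indexed="false" stored="false"
       docValues="true" default="0"/>
<field name="inplace_updatable_int"   type="int"   indexed="false" stored="false"
       docValues="true" default="0"/>
{code}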
[jira] [Resolved] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-7583.
----------------------------------------
    Resolution: Fixed

> Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7583
>             Project: Lucene - Core
>          Issue Type: Improvement
>         Reporter: Michael McCandless
>             Fix For: master (7.0), 6.4
>
>      Attachments: LUCENE-7583-hardcode-writeVInt.patch, LUCENE-7583.fork-FastOutputStream.patch, LUCENE-7583.patch, LUCENE-7583.patch, LUCENE-7583.private-IndexOutput.patch
>
> When BKD writes its leaf blocks, it's essentially a lot of tiny writes (vint, int, short, etc.), and I've seen deep thread stacks through our IndexOutput impl ({{OutputStreamIndexOutput}}) when pulling hot threads while BKD is writing.
> So I tried a small change, to have BKDWriter do its own buffering, by first writing each leaf block into a {{RAMOutputStream}}, and then dumping that (in 1 KB byte[] chunks) to the actual IndexOutput.
> This gives a non-trivial reduction (~6%) in the total time for BKD writing + merging on the 20M NYC taxis nightly benchmark (2 runs each):
> Trunk, sparse:
> - total: 64.691 sec
> - total: 64.702 sec
> Patch, sparse:
> - total: 60.820 sec
> - total: 60.965 sec
> Trunk, dense:
> - total: 62.730 sec
> - total: 62.383 sec
> Patch, dense:
> - total: 58.805 sec
> - total: 58.742 sec
> The results seem to be consistent and reproducible. I'm using Java 1.8.0_101 on a fast SSD on Ubuntu 16.04.
> It's sort of weird and annoying that this helps so much, because {{OutputStreamIndexOutput}} already uses Java's {{BufferedOutputStream}} (default 8 KB buffer) to buffer writes.
> [~thetaphi] suggested maybe hotspot is failing to inline/optimize the {{writeByte}} call / the call stack just has too many layers.
> We could commit this patch (it's trivial) but it'd be nice to understand and fix why buffering writes is somehow costly, so any other Lucene codec components that write lots of little things can be improved too.
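The buffering trick in the description is simple enough to show in miniature. This sketch uses plain {{java.io}} stand-ins instead of Lucene's {{RAMOutputStream}}/{{IndexOutput}}: accumulate the many tiny leaf-block writes in memory, then hand them to the real output in fixed 1 KB chunks, as the patch does.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Sketch of buffer-then-chunk writing; not the actual BKDWriter change. */
class LeafBlockBufferSketch {
  private static final int CHUNK = 1024; // 1 KB chunks, as in the patch
  private final ByteArrayOutputStream scratch = new ByteArrayOutputStream();

  /** The kind of tiny variable-length write BKD does per value. */
  void writeVInt(int v) {
    while ((v & ~0x7F) != 0) {
      scratch.write((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    scratch.write(v);
  }

  /** Dump the buffered leaf block to the real output, then reset. */
  void flushTo(OutputStream out) throws IOException {
    byte[] bytes = scratch.toByteArray();
    for (int off = 0; off < bytes.length; off += CHUNK) {
      out.write(bytes, off, Math.min(CHUNK, bytes.length - off));
    }
    scratch.reset();
  }
}
{code}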
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733737#comment-15733737 ]

ASF subversion and git services commented on LUCENE-7583:
----------------------------------------------------------

Commit ca428ce2381fd9a8e6f56767ad9d0fb1638ba7dc in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ca428ce ]

LUCENE-7583: move this class to the right package
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733734#comment-15733734 ]

ASF subversion and git services commented on LUCENE-7583:
----------------------------------------------------------

Commit c185617582b4bf3ce2899c9ae67e9eeaf2c21741 in lucene-solr's branch refs/heads/master from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c185617 ]

LUCENE-7583: move this class to the right package
[jira] [Resolved] (SOLR-9837) Performance regression of numeric field uninversion time
[ https://issues.apache.org/jira/browse/SOLR-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-9837.
--------------------------------
    Resolution: Fixed

> Performance regression of numeric field uninversion time
> ----------------------------------------------------------
>
>                 Key: SOLR-9837
>                 URL: https://issues.apache.org/jira/browse/SOLR-9837
>             Project: Solr
>          Issue Type: Sub-task
>   Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: master (7.0)
>         Reporter: Yonik Seeley
>         Assignee: Yonik Seeley
>             Fix For: master (7.0)
>
> Somehow related to LUCENE-7407, after the transition, the uninvert time of numeric fields has gone up substantially. I haven't tested all field types yet, just integer fields, which show a 55% performance regression for the initial uninvert time.
> This was tested with a numeric field of cardinality 1M on a 10M doc index.
> {code}
> q=id:1&sort=my_numeric_field desc
> {code}
[jira] [Commented] (SOLR-9837) Performance regression of numeric field uninversion time
[ https://issues.apache.org/jira/browse/SOLR-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733719#comment-15733719 ] ASF subversion and git services commented on SOLR-9837:
---
Commit 1d2e440a8fe3df8d3207a7428841f79f63381e4f in lucene-solr's branch refs/heads/master from [~yo...@apache.org] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1d2e440 ]
SOLR-9837: fix redundant calculation of docsWithField for numeric fields in field cache
[jira] [Reopened] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-7583:
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733691#comment-15733691 ] Michael McCandless commented on LUCENE-7583:
---
bq. Note: this breaks my eclipse build (on 6x at least, and I presume master)
Gak, I'll fix! Thanks for reporting this.
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733653#comment-15733653 ] Mark Miller commented on SOLR-9824:
---
No, that marker-trick type of thing can't work. Bummer.

> Documents indexed in bulk are replicated using too many HTTP requests
> ----------------------------------------------------------------------
>
> Key: SOLR-9824
> URL: https://issues.apache.org/jira/browse/SOLR-9824
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 6.3
> Reporter: David Smiley
>
> This takes awhile to explain; bear with me. While working on bulk indexing small documents, I looked at the logs of my SolrCloud nodes. I noticed that shards would see an /update log message every ~6ms which is *way* too much. These are requests from one shard (that isn't a leader/replica for these docs but the recipient from my client) to the target shard leader (no additional replicas). One might ask why I'm not sending docs to the right shard in the first place; I have a reason but it's beside the point -- there's a real Solr perf problem here and this probably applies equally to replicationFactor>1 situations too. I could turn off the logs but that would hide useful stuff, and it's disconcerting to me that so many short-lived HTTP requests are happening, somehow at the behest of DistributedUpdateProcessor. After lots of analysis and debugging and hair pulling, I finally figured it out.
> In SOLR-7333, [~tpot] introduced an optimization called {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will poll with a '0' timeout to the internal queue, so that it can close the connection without it hanging around any longer than needed. This part makes sense to me. Currently the only spot that has the smarts to set this flag is {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the last document. So if a shard received docs in a javabin stream (but not other formats) one would expect the _last_ document to have this flag. There's even a test. Docs without this flag get the default poll time; for javabin it's 25ms. Okay.
> I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send javabin data in a batch, the intended efficiencies of SOLR-7333 would apply. I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW DistributedUpdateProcessor uses CUSC too). CUSC uses the RequestWriter (defaulting to javabin) to send each document separately without any leading marker or trailing marker. For the XML format by comparison, there is a leading and trailing marker ( ... ). Since there's no outer container for the javabin unmarshalling to detect the last document, it marks _every_ document as {{req.lastDocInBatch()}}! Ouch!
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733619#comment-15733619 ] Daniel Collins commented on LUCENE-7583:
---
Note: this breaks my eclipse build (on 6x at least, and I presume master) because lucene/core/src/java/org/apache/lucene/util/GrowableByteArrayDataOutput.java claims to be in package org.apache.lucene.store, but is actually in the util dir. Ant compile is fine, but I guess Eclipse is more pedantic.
[JENKINS] Lucene-Solr-Tests-master - Build # 1531 - Unstable
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1531/

1 tests failed.

FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI

Error Message:
Collection not found: awhollynewcollection_0

Stack Trace:
org.apache.solr.common.SolrException: Collection not found: awhollynewcollection_0
    at __randomizedtesting.SeedInfo.seed([40F287082DB55FCD:887F3BC2B867058]:0)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.getCollectionNames(CloudSolrClient.java:1362)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1058)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
    at org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:232)
    at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:516)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.T
[JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18476 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18476/
Java: 64bit/jdk-9-ea+140 -XX:+UseCompressedOops -XX:+UseSerialGC

1 tests failed.

FAILED: org.apache.solr.TestDistributedSearch.test

Error Message:
Expected to find shardAddress in the up shard info: {error=org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request,trace=org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:416)
    at org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:228)
    at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
    at java.util.concurrent.FutureTask.run(java.base@9-ea/FutureTask.java:264)
    at java.util.concurrent.Executors$RunnableAdapter.call(java.base@9-ea/Executors.java:514)
    at java.util.concurrent.FutureTask.run(java.base@9-ea/FutureTask.java:264)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@9-ea/ThreadPoolExecutor.java:1161)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@9-ea/ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(java.base@9-ea/Thread.java:843)
,time=2}

Stack Trace:
java.lang.AssertionError: Expected to find shardAddress in the up shard info: {error=org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request,trace=org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:416)
    at org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:228)
    at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
    at java.util.concurrent.FutureTask.run(java.base@9-ea/FutureTask.java:264)
    at java.util.concurrent.Executors$RunnableAdapter.call(java.base@9-ea/Executors.java:514)
    at java.util.concurrent.FutureTask.run(java.base@9-ea/FutureTask.java:264)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@9-ea/ThreadPoolExecutor.java:1161)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@9-ea/ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(java.base@9-ea/Thread.java:843)
,time=2}
    at __randomizedtesting.SeedInfo.seed([7CCE2BE5241BDF49:F49A143F8AE7B2B1]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.apache.solr.TestDistributedSearch.comparePartialResponses(TestDistributedSearch.java:1172)
    at org.apache.solr.TestDistributedSearch.queryPartialResults(TestDistributedSearch.java:1113)
    at org.apache.solr.TestDistributedSearch.test(TestDistributedSearch.java:973)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsRepeatStatement.callStatement(BaseDistributedSearchTestCase.java:1011)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleMark
[jira] [Commented] (SOLR-9837) Performance regression of numeric field uninversion time
[ https://issues.apache.org/jira/browse/SOLR-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733368#comment-15733368 ] Yonik Seeley commented on SOLR-9837:
---
OK, I found the culprit... https://github.com/apache/lucene-solr/commit/f7aa200d406dbd05a35d6116198302d90b92cb29#diff-595e0e789c5e7ac91fe0300782f1bea6R640
This causes the field to be traversed twice... the first time for docsWithValue, and the second time for the actual uninversion (which also calculates docsWithValue anyway but then doesn't use it).
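For illustration, a hedged sketch of the single-pass shape the fix restores. The class and method names below are invented; the real code lives in Solr/Lucene's field-cache/uninverting machinery and differs in detail. The point is that docsWithField falls out of the one uninversion pass, so no separate preliminary traversal is needed.

{code}
import java.io.IOException;

import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.FixedBitSet;

// Invented illustration of the single-pass pattern: fill per-doc values AND
// the docsWithField bitset in the same traversal, rather than walking the
// terms once for docsWithField and a second time for the values.
final class UninvertSketch {

  static FixedBitSet uninvertSinglePass(LeafReader reader, String field, long[] values)
      throws IOException {
    FixedBitSet docsWithField = new FixedBitSet(reader.maxDoc());
    Terms terms = reader.terms(field);
    if (terms == null) {
      return docsWithField; // no indexed values for this field
    }
    TermsEnum te = terms.iterator();
    PostingsEnum pe = null;
    while (te.next() != null) {
      long value = te.term().hashCode(); // stand-in for decoding the numeric term
      pe = te.postings(pe, PostingsEnum.NONE);
      int doc;
      while ((doc = pe.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
        values[doc] = value;
        docsWithField.set(doc); // computed as a side effect -- no second pass
      }
    }
    return docsWithField;
  }
}
{code}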
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1574#comment-1574 ] Steve Rowe commented on LUCENE-7585:
---
The patch appears to be against a non-existent version of {{CommonAnalysisFactoryParams.java}}, so it doesn't apply properly. Manual repair worked for me (saving the patch as the target file, trimming to include just the target file contents, removing {{-}} lines, and deleting {{+}} prefixes from lines). It'd be nice to alphabetize the constants. Otherwise, +1.

> Interface for common parameters used across analysis factories
> ---------------------------------------------------------------
>
> Key: LUCENE-7585
> URL: https://issues.apache.org/jira/browse/LUCENE-7585
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 6.3
> Reporter: Ahmet Arslan
> Assignee: David Smiley
> Priority: Minor
> Fix For: master (7.0)
>
> Attachments: LUCENE-7585.patch
>
> Certain parameters (String constants) are the same across multiple analysis factories. Some examples are {{ignoreCase}}, {{dictionary}}, and {{preserveOriginal}}. These string constants are handled inconsistently in different factories. This is an effort to define the most common constants in a {{CommonAnalysisFactoryParams}} interface and reuse them.
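As a hedged sketch of what such an interface might contain: the constant names come from the examples in the issue description, alphabetized as Steve suggests; the package and the exact constant set in the attached patch may differ.

{code}
package org.apache.lucene.analysis.util; // hypothetical location

/**
 * Sketch: parameter-name constants shared by multiple analysis factories,
 * so each factory stops redeclaring the same string literals.
 */
public interface CommonAnalysisFactoryParams {
  String DICTIONARY = "dictionary";
  String IGNORE_CASE = "ignoreCase";
  String PRESERVE_ORIGINAL = "preserveOriginal";
}
{code}

A factory implementing the interface could then read something like {{boolean ignoreCase = getBoolean(args, IGNORE_CASE, false);}} instead of repeating the literal in each factory.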
[jira] [Comment Edited] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733236#comment-15733236 ] Mark Miller edited comment on SOLR-9824 at 12/8/16 8:17 PM:
---
Another possible optimization that could make clients always work is to start writing a start and end marker for the whole request. This would allow all of the Solr clients to handle this perfectly for every case. We could use a PushbackInputStream to work with or without these special markers. And we could still do the best guess approach based on size for HTTP clients.
* Looks pretty difficult to incorporate the last doc detection with that idea though :(
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733236#comment-15733236 ] Mark Miller commented on SOLR-9824:
---
Another possible optimization that could make clients always work is to start writing a start and end marker for the whole request. This would allow all of the Solr clients to handle this perfectly for every case. We could use a PushbackInputStream to work with or without these special markers. And we could still do the best guess approach based on size for HTTP clients.
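A hedged sketch of the PushbackInputStream idea: the marker byte and class name are invented for illustration, and the real javabin stream layout (defined by JavaBinCodec) has no such marker today. The trick is to peek one byte and, if it isn't the marker, push it back so streams from old clients parse unchanged.

{code}
import java.io.IOException;
import java.io.PushbackInputStream;

// Illustration only: detect an optional batch-start marker while staying
// compatible with senders that never write one.
final class BatchMarkerDetectorSketch {

  static final int BATCH_START_MARKER = 0x7F; // invented for this sketch

  static boolean consumeBatchMarkerIfPresent(PushbackInputStream in) throws IOException {
    int first = in.read();
    if (first == -1) {
      return false; // empty stream, nothing to detect
    }
    if (first == BATCH_START_MARKER) {
      return true; // marker consumed; stream now positioned at the first document
    }
    in.unread(first); // no marker: restore the byte for the normal codec path
    return false;
  }
}
{code}

The caller would wrap the raw request stream once, e.g. {{new PushbackInputStream(rawIn, 1)}}, before handing it to the codec.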
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733233#comment-15733233 ] ASF subversion and git services commented on SOLR-4735:
---
Commit ab52041c9bfea8285446b79f39ddfbf02eebc845 in lucene-solr's branch refs/heads/feature/metrics from [~ab] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ab52041 ]
SOLR-4735 Test also core rename. Fix expectations when registries are reused across tests in the same JVM.

> Improve Solr metrics reporting
> ------------------------------
>
> Key: SOLR-4735
> URL: https://issues.apache.org/jira/browse/SOLR-4735
> Project: Solr
> Issue Type: Improvement
> Reporter: Alan Woodward
> Assignee: Andrzej Bialecki
> Priority: Minor
> Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, screenshot-1.png
>
> Following on from a discussion on the mailing list: http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+
> It would be good to make Solr play more nicely with existing devops monitoring systems, such as Graphite or Ganglia. Stats monitoring at the moment is poll-only, either via JMX or through the admin stats page. I'd like to refactor things a bit to make this more pluggable.
> This patch is a start. It adds a new interface, InstrumentedBean, which extends SolrInfoMBean to return a [Metrics|http://metrics.codahale.com/manual/core/] MetricRegistry, and a couple of MetricReporters (which basically just duplicate the JMX and admin page reporting that's there at the moment, but which should be more extensible). The patch includes a change to RequestHandlerBase showing how this could work. The idea would be to eventually replace the getStatistics() call on SolrInfoMBean with this instead.
> The next step would be to allow more MetricReporters to be defined in solrconfig.xml. The Metrics library comes with ganglia and graphite reporting modules, and we can add contrib plugins for both of those.
> There's some more general cleanup that could be done around SolrInfoMBean (we've got two plugin handlers at /mbeans and /plugins that basically do the same thing, and the beans themselves have some weirdly inconsistent data on them - getVersion() returns different things for different impls, and getSource() seems pretty useless), but maybe that's for another issue.
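For a sense of the shape being discussed, here is a minimal sketch of the InstrumentedBean idea from the description. The interface and handler below are invented to illustrate the pattern, not the actual patch; only Dropwizard's {{MetricRegistry}}/{{Timer}} API is assumed.

{code}
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

// Sketch: a component exposes a MetricRegistry instead of ad-hoc statistics,
// so pluggable reporters (JMX, Graphite, Ganglia) can all read from one place.
interface InstrumentedBeanSketch {
  MetricRegistry getMetricRegistry();
}

// Hypothetical handler showing how a request timer might be recorded.
class SketchRequestHandler implements InstrumentedBeanSketch {

  private final MetricRegistry registry = new MetricRegistry();
  private final Timer requestTimes = registry.timer("requestTimes");

  public void handleRequest(Runnable work) {
    try (Timer.Context ignored = requestTimes.time()) {
      work.run(); // duration is recorded when the context closes
    }
  }

  @Override
  public MetricRegistry getMetricRegistry() {
    return registry;
  }
}
{code}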
[JENKINS-EA] Lucene-Solr-6.x-Linux (32bit/jdk-9-ea+140) - Build # 2372 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2372/
Java: 32bit/jdk-9-ea+140 -server -XX:+UseG1GC

1 tests failed.

FAILED: org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.testAdd2Fields

Error Message:
No live SolrServers available to handle this request:[https://127.0.0.1:32772/solr/managed-preanalyzed, https://127.0.0.1:38766/solr/managed-preanalyzed]

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[https://127.0.0.1:32772/solr/managed-preanalyzed, https://127.0.0.1:38766/solr/managed-preanalyzed]
    at __randomizedtesting.SeedInfo.seed([98C58B21836EC85F:30D039CA5F685BA9]:0)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:414)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1344)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1095)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
    at org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.addField(PreAnalyzedFieldManagedSchemaCloudTest.java:61)
    at org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.testAdd2Fields(PreAnalyzedFieldManagedSchemaCloudTest.java:52)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evalu
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733062#comment-15733062 ] Mark Miller commented on SOLR-9824:
---
{code}
if (pollQueueTime > 0 && threadCount == 1 && req.isLastDocInBatch()) {
  // no need to wait to see another doc in the queue if we've hit the last doc in a batch
  System.out.println("set poll time to 0");
  upd = queue.poll(0, TimeUnit.MILLISECONDS);
} else {
  upd = queue.poll(pollQueueTime, TimeUnit.MILLISECONDS);
}
{code}
This extra optimization is the problem you describe in the description.
[jira] [Commented] (SOLR-9839) Use 'useFactory' in tests instead of setting manually
[ https://issues.apache.org/jira/browse/SOLR-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733041#comment-15733041 ] Erick Erickson commented on SOLR-9839:
---
Mike: Mostly I was just adding color commentary. What commit message?

> Use 'useFactory' in tests instead of setting manually
> ------------------------------------------------------
>
> Key: SOLR-9839
> URL: https://issues.apache.org/jira/browse/SOLR-9839
> Project: Solr
> Issue Type: Test
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Tests
> Reporter: Mike Drob
> Priority: Minor
> Attachments: SOLR-9839.patch
>
> We have several tests that will explicitly set a directory factory via SysProp, some of which forget to unset it.
> We should use {{useFactory}} so that we can benefit from the call to {{resetFactory}} in {{SolrTestCaseJ4.teardownTestCases}}.
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733033#comment-15733033 ] ASF subversion and git services commented on SOLR-8542:
---
Commit f87d672be749fde603f592021bba875fd01e0f01 in lucene-solr's branch refs/heads/branch_6x from [~Michael Nilsson] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f87d672 ]
SOLR-8542: disallow reRankDocs<1 i.e. must rerank at least 1 document (Michael Nilsson via Christine Poerschke)

> Integrate Learning to Rank into Solr
> ------------------------------------
>
> Key: SOLR-8542
> URL: https://issues.apache.org/jira/browse/SOLR-8542
> Project: Solr
> Issue Type: New Feature
> Reporter: Joshua Pantony
> Assignee: Christine Poerschke
> Priority: Minor
> Attachments: SOLR-8542-branch_5x.patch, SOLR-8542-trunk.patch, SOLR-8542.patch
>
> This is a ticket to integrate learning to rank machine learning models into Solr. Solr Learning to Rank (LTR) provides a way for you to extract features directly inside Solr for use in training a machine learned model. You can then deploy that model to Solr and use it to rerank your top X search results. This concept was previously [presented by the authors at Lucene/Solr Revolution 2015|http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp].
> [Read through the README|https://github.com/bloomberg/lucene-solr/tree/master-ltr-plugin-release/solr/contrib/ltr] for a tutorial on using the plugin, in addition to how to train your own external model.
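For context, reranking with a deployed model is driven through the {{ltr}} rerank query parser; a request shaped roughly like the following (collection and model name invented here; parameter names per the LTR contrib README) is what the commit above now validates by rejecting {{reRankDocs<1}}:

{code}
http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModel reRankDocs=100}&fl=id,score
{code}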
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733032#comment-15733032 ] ASF subversion and git services commented on SOLR-8542:
---
Commit 084809b77cc6b62be5f6f888d78574487cb3ec5b in lucene-solr's branch refs/heads/branch_6x from [~steve_rowe] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=084809b ]
SOLR-8542: Add maven config and improve IntelliJ config.
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733036#comment-15733036 ] ASF subversion and git services commented on SOLR-8542:
---
Commit 3e2657214e103290142d0facfc860cb01f6e033e in lucene-solr's branch refs/heads/branch_6x from [~cpoerschke] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3e26572 ]
SOLR-8542: couple of tweaks (Michael Nilsson, Diego Ceccarelli, Christine Poerschke)
* removed code triplication in ManagedModelStore
* LTRScoringQuery.java tweaks
* FeatureLogger.makeFeatureVector(...) can now safely be called repeatedly (though that doesn't happen at present)
* make Feature.FeatureWeight.extractTerms a no-op; (OriginalScore|SolrFeature)Weight now implement extractTerms
* LTRThreadModule javadocs and README.md tweaks
* add TestFieldValueFeature.testBooleanValue test; replace "T"/"F" magic string use in FieldValueFeature
* add TestOriginalScoreScorer test; add OriginalScoreScorer.freq() method
* in TestMultipleAdditiveTreesModel revive dead explain test
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733034#comment-15733034 ] ASF subversion and git services commented on SOLR-8542:
---
Commit 252c6e9385ba516887543eb1968c8654b35b2b81 in lucene-solr's branch refs/heads/branch_6x from [~cpoerschke] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=252c6e9 ]
SOLR-8542, SOLR-9746: prefix solr/contrib/ltr's search and response.transform packages with ltr
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733031#comment-15733031 ] ASF subversion and git services commented on SOLR-8542:
---
Commit a511b30a50672365d46c3d052e19a9fedd228e2e in lucene-solr's branch refs/heads/branch_6x from [~cpoerschke] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a511b30 ]
SOLR-8542: Adds Solr Learning to Rank (LTR) plugin for reranking results with machine learning models. (Michael Nilsson, Diego Ceccarelli, Joshua Pantony, Jon Dorando, Naveen Santhapuri, Alessandro Benedetti, David Grohmann, Christine Poerschke)
[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr
[ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733037#comment-15733037 ] ASF subversion and git services commented on SOLR-8542:
---
Commit 9e8dd854cda6d56cc8d498cc23d138eeb74732fd in lucene-solr's branch refs/heads/branch_6x from [~cpoerschke] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9e8dd85 ]
SOLR-8542: master-to-branch_6x backport changes (Michael Nilsson, Naveen Santhapuri, Christine Poerschke)
* removed 'boost' arg from LTRScoringQuery.createWeight signature
* classes extending Weight now implement normalize and getValueForNormalization
* FieldLengthFeatureScorer tweaks
[jira] [Commented] (SOLR-9746) Eclipse project broken due to duplicate package-info.java in LTR contrib
[ https://issues.apache.org/jira/browse/SOLR-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733035#comment-15733035 ] ASF subversion and git services commented on SOLR-9746:
---
Commit 252c6e9385ba516887543eb1968c8654b35b2b81 in lucene-solr's branch refs/heads/branch_6x from [~cpoerschke] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=252c6e9 ]
SOLR-8542, SOLR-9746: prefix solr/contrib/ltr's search and response.transform packages with ltr

> Eclipse project broken due to duplicate package-info.java in LTR contrib
> -------------------------------------------------------------------------
>
> Key: SOLR-9746
> URL: https://issues.apache.org/jira/browse/SOLR-9746
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Ishan Chattopadhyaya
> Assignee: Christine Poerschke
> Priority: Minor
>
> The Eclipse project generated through {{ant eclipse}} seems to be broken, since there are errors complaining about duplicate resources. The problem is that the following files have the same package and class names:
> {code}
> ./solr/core/src/java/org/apache/solr/response/transform/package-info.java
> ./solr/contrib/ltr/src/java/org/apache/solr/response/transform/package-info.java
> ./solr/core/src/java/org/apache/solr/search/package-info.java
> ./solr/contrib/ltr/src/java/org/apache/solr/search/package-info.java
> {code}
> Not sure if the IDEA project is affected.
[jira] [Commented] (SOLR-9839) Use 'useFactory' in tests instead of setting manually
[ https://issues.apache.org/jira/browse/SOLR-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733025#comment-15733025 ] Mike Drob commented on SOLR-9839: - Looks like SystemPropertiesRestoreRule is automatically applied to everything inheriting from SolrTestCaseJ4, in which case it would still be nice to remove the manual resetting from a bunch of our test cases. In light of this, I still think the code change is fine as is. The commit message in the patch references the wrong JIRA. [~erickerickson] - is there anything else that you think needs to be changed? > Use 'useFactory' in tests instead of setting manually > - > > Key: SOLR-9839 > URL: https://issues.apache.org/jira/browse/SOLR-9839 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: Tests >Reporter: Mike Drob >Priority: Minor > Attachments: SOLR-9839.patch > > > We have several tests that will explicitly set a directory factory via > SysProp, some of which forget to unset it. > We should use {{useFactory}} so that we can benefit from the call to > {{resetFactory}} in {{SolrTestCaseJ4.teardownTestCases}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
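For reference, a minimal sketch of the pattern being advocated; {{useFactory}} and {{resetFactory}} are the existing SolrTestCaseJ4 methods, while the test class itself is hypothetical: {code} import org.apache.solr.SolrTestCaseJ4; import org.junit.BeforeClass;

public class MyDirectoryFactoryTest extends SolrTestCaseJ4 {
  @BeforeClass
  public static void setupClass() throws Exception {
    // Instead of System.setProperty("solr.directoryFactory", ...) plus a
    // manual unset afterwards, let the framework restore state:
    // SolrTestCaseJ4.teardownTestCases calls resetFactory() for us.
    useFactory("solr.StandardDirectoryFactory");
  }
} {code}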
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732967#comment-15732967 ] Mark Miller commented on SOLR-9824: --- bq. This sounds really clever if we can verify that most clients do send the size for small single docs. I fixed clients to send the size like they are supposed to, and to avoid chunked encoding, in the last year or two. Clients like curl are also smart enough to do this when you are not trying to stream something in (if you don't send the size, you have to use chunked encoding). So a very quick and simple win would be to always wait when there is no size, since in that case we know the client is trying to stream in multiple documents unless it's broken. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
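A rough illustration of that "no size means streaming" heuristic; the helper class, method name, and 4096 threshold are made up for the sketch, and only {{getContentLengthLong()}} is a real servlet API: {code} import javax.servlet.http.HttpServletRequest;

// Hypothetical helper: choose the CUSS poll timeout from the request size.
class PollTimeHeuristic {
  static long pollQueueTimeMs(HttpServletRequest req) {
    long size = req.getContentLengthLong(); // -1 when chunked encoding is used
    if (size < 0) return 25;     // no size: the client is streaming, so wait
    return size < 4096 ? 0 : 25; // small known payload: close promptly
  }
} {code}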
[jira] [Closed] (LUCENE-3080) cutover highlighter to BytesRef
[ https://issues.apache.org/jira/browse/LUCENE-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley closed LUCENE-3080. Resolution: Won't Fix > cutover highlighter to BytesRef > --- > > Key: LUCENE-3080 > URL: https://issues.apache.org/jira/browse/LUCENE-3080 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/highlighter >Reporter: Michael McCandless > > Highlighter still uses char[] terms (consumes tokens from the analyzer as > char[] not as BytesRef), which is causing problems for merging SOLR-2497 to > trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned LUCENE-7585: Assignee: David Smiley > Interface for common parameters used across analysis factories > -- > > Key: LUCENE-7585 > URL: https://issues.apache.org/jira/browse/LUCENE-7585 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/analysis >Affects Versions: 6.3 >Reporter: Ahmet Arslan >Assignee: David Smiley >Priority: Minor > Fix For: master (7.0) > > Attachments: LUCENE-7585.patch > > > Certain parameters (String constants) are the same across multiple analysis > factories. Some examples are {{ignoreCase}}, {{dictionary}}, and > {{preserveOriginal}}. These string constants are handled inconsistently in > different factories. This is an effort to define the most common constants in > a {{CommonAnalysisFactoryParams}} interface and reuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
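For illustration, the kind of interface the description proposes might look like this; the constants are taken from the examples named above, and the exact contents of the attached patch may differ: {code} public interface CommonAnalysisFactoryParams {
  String IGNORE_CASE = "ignoreCase";
  String DICTIONARY = "dictionary";
  String PRESERVE_ORIGINAL = "preserveOriginal";
} {code}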
[jira] [Commented] (LUCENE-7585) Interface for common parameters used across analysis factories
[ https://issues.apache.org/jira/browse/LUCENE-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732930#comment-15732930 ] David Smiley commented on LUCENE-7585: -- +1 LGTM. The only thing I'd change is "LUCENE_MATCH_VERSION_PARAM" to "LUCENE_MATCH_VERSION". If there are no further comments then I'll commit this later this weekend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732899#comment-15732899 ] David Smiley commented on SOLR-9824: bq. streaming and chunked encoding would have no size and wait This sounds really clever if we can verify that most clients do send the size for small single docs. > Documents indexed in bulk are replicated using too many HTTP requests > - > > Key: SOLR-9824 > URL: https://issues.apache.org/jira/browse/SOLR-9824 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 6.3 >Reporter: David Smiley > > This takes awhile to explain; bear with me. While working on bulk indexing > small documents, I looked at the logs of my SolrCloud nodes. I noticed that > shards would see an /update log message every ~6ms which is *way* too much. > These are requests from one shard (that isn't a leader/replica for these docs > but the recipient from my client) to the target shard leader (no additional > replicas). One might ask why I'm not sending docs to the right shard in the > first place; I have a reason but it's besides the point -- there's a real > Solr perf problem here and this probably applies equally to > replicationFactor>1 situations too. I could turn off the logs but that would > hide useful stuff, and it's disconcerting to me that so many short-lived HTTP > requests are happening, somehow at the bequest of DistributedUpdateProcessor. > After lots of analysis and debugging and hair pulling, I finally figured it > out. > In SOLR-7333 ([~tpot]) introduced an optimization called > {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will > poll with a '0' timeout to the internal queue, so that it can close the > connection without it hanging around any longer than needed. This part makes > sense to me. Currently the only spot that has the smarts to set this flag is > {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the > last document. So if a shard received docs in a javabin stream (but not > other formats) one would expect the _last_ document to have this flag. > There's even a test. Docs without this flag get the default poll time; for > javabin it's 25ms. Okay. > I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send > javabin data in a batch, the intended efficiencies of SOLR-7333 would apply. > I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW > DistributedUpdateProcessor uses CUSC too). CUSC uses the RequestWriter > (defaulting to javabin) to send each document separately without any leading > marker or trailing marker. For the XML format by comparison, there is a > leading and trailing marker ( ... ). Since there's no outer > container for the javabin unmarshalling to detect the last document, it marks > _every_ document as {{req.lastDocInBatch()}}! Ouch! -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732865#comment-15732865 ] Mark Miller commented on SOLR-9824: --- Another idea is to look at the request size and use the wait when the size is large enough - streaming and chunked encoding would have no size and wait, small docs or a few docs would not wait, and lots of docs or a really large doc would wait. Given a really large doc will take a while anyway, the additional wait should not be that bad. Just another idea, will keep poking around this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732817#comment-15732817 ] Mark Miller commented on SOLR-9824: --- At least for the solrj clients - harder to do the same with raw http. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests
[ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732805#comment-15732805 ] Mark Miller commented on SOLR-9824: --- The lastDocInBatch trick may not be the right solution - it's not even really compatible with streaming in updates. I think we should be able to tackle this better somehow. The client really should know what is up and communicate that to the server. Are we sending a single doc? Let the server know and it can avoid waiting. Are we sending a batch of documents in a single request? Let the server know and it can use a wait. Are we streaming in documents? We wait. lastDocInBatch could be an optimization on top of that, avoiding the wait on the last doc in a batch, but it feels like we should be able to address this in a way that is not format specific as well. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
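To make the "let the server know" idea concrete, a purely illustrative client-side sketch; the {{updateIntent}} parameter and the {{buildDocs()}} helper are hypothetical and do not exist in Solr, while the SolrJ calls themselves are real: {code} import java.util.List; import org.apache.solr.client.solrj.SolrClient; import org.apache.solr.client.solrj.impl.HttpSolrClient; import org.apache.solr.client.solrj.request.UpdateRequest; import org.apache.solr.common.SolrInputDocument;

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build();
List<SolrInputDocument> docs = buildDocs(); // buildDocs() is a hypothetical helper producing the batch
UpdateRequest req = new UpdateRequest();
req.add(docs);
// Hypothetical hint ("single" | "batch" | "stream") telling the server
// whether more documents will follow, and therefore whether it should wait.
req.setParam("updateIntent", "batch");
req.process(client, "collection1"); {code}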
[jira] [Resolved] (SOLR-9834) A variety of spots in the code can create a collection zk node after the collection has been removed.
[ https://issues.apache.org/jira/browse/SOLR-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-9834. --- Resolution: Fixed Fix Version/s: 6.4 master (7.0) > A variety of spots in the code can create a collection zk node after the > collection has been removed. > - > > Key: SOLR-9834 > URL: https://issues.apache.org/jira/browse/SOLR-9834 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: master (7.0), 6.4 > > Attachments: SOLR-9834.patch, SOLR-9834.patch > > > The results of this have annoyed me for some time. We should fail rather than > create the collection node, and only ensure the rest of the path exists if the > collection node did not need to be created. > Currently, a leader election triggered by a delete can recreate a collection > zk node that was just removed. I think there was a bit of defense put in > against that, but I still see it happen; this change is more thorough, as well as a step > towards the ZK=Truth path. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
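A condensed sketch of the guard being described; {{exists}} and {{makePath}} are real SolrZkClient methods, but the surrounding logic and message are illustrative rather than the committed patch ({{zkClient}} and {{collection}} are assumed in scope): {code} // Refuse to (re)create descendant paths when the collection znode is gone.
String collectionPath = "/collections/" + collection;
if (!zkClient.exists(collectionPath, true)) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "Collection " + collection + " no longer exists; not recreating its znode");
}
zkClient.makePath(collectionPath + "/leader_elect", true); {code}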
[jira] [Commented] (SOLR-9834) A variety of spots in the code can create a collection zk node after the collection has been removed.
[ https://issues.apache.org/jira/browse/SOLR-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732760#comment-15732760 ] ASF subversion and git services commented on SOLR-9834: --- Commit 89327187439ca2dfa2d49b5ae2bf327031e6d730 in lucene-solr's branch refs/heads/branch_6x from markrmiller [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8932718 ] SOLR-9834: A variety of spots in the code can create a collection zk node after the collection has been removed. # Conflicts: # solr/CHANGES.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9834) A variety of spots in the code can create a collection zk node after the collection has been removed.
[ https://issues.apache.org/jira/browse/SOLR-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732751#comment-15732751 ] ASF subversion and git services commented on SOLR-9834: --- Commit 1055209940faec71bd8046af3323d5982529525b in lucene-solr's branch refs/heads/master from markrmiller [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1055209 ] SOLR-9834: A variety of spots in the code can create a collection zk node after the collection has been removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9834) A variety of spots in the code can create a collection zk node after the collection has been removed.
[ https://issues.apache.org/jira/browse/SOLR-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-9834: -- Attachment: SOLR-9834.patch I'll commit this soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-6.x - Build # 222 - Still Unstable
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.x/222/ 5 tests failed. FAILED: org.apache.solr.cloud.CdcrReplicationDistributedZkTest.testReplicationStartStop Error Message: Timeout while trying to assert number of documents @ target_collection Stack Trace: java.lang.AssertionError: Timeout while trying to assert number of documents @ target_collection at __randomizedtesting.SeedInfo.seed([81DBE39246B6B3AD:2183D01526E3824]:0) at org.apache.solr.cloud.BaseCdcrDistributedZkTest.assertNumDocs(BaseCdcrDistributedZkTest.java:271) at org.apache.solr.cloud.CdcrReplicationDistributedZkTest.testReplicationStartStop(CdcrReplicationDistributedZkTest.java:173) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(Threa
[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732576#comment-15732576 ] Shalin Shekhar Mangar commented on SOLR-9835: - Hold your questions for a bit. Dat and I are working on a design and we will try to answer your questions as much as we can. We will post it in a couple of days. > Create another replication mode for SolrCloud > - > > Key: SOLR-9835 > URL: https://issues.apache.org/jira/browse/SOLR-9835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Cao Manh Dat > > The current replication mechanism of SolrCloud is a state machine: replicas > start in the same initial state, and each input is distributed across the replicas, > so all replicas end up in the same next state. But this type of replication has > some drawbacks > - The commit (which is costly) has to run on all replicas > - Slow recovery: if a replica misses more than N updates while it is down, > it has to download the entire index from its leader. > So we propose another replication mode for SolrCloud called state > transfer, which acts like master/slave replication. Basically > - The leader distributes each update to the other replicas, but only the leader applies > the update to its IndexWriter; the other replicas just store the update in the UpdateLog (much like > replication). > - Replicas frequently poll the latest segments from the leader. > Pros: > - Lightweight indexing, because only the leader runs the commits and > updates. > - Very fast recovery: replicas just have to download the missing segments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
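A bare-bones sketch of the polling side of the proposal, reusing the {{indexversion}} command that the existing ReplicationHandler already exposes for master/slave replication; wiring this into SolrCloud replicas is the new part, and the URLs and generation bookkeeping below are placeholders: {code} import org.apache.solr.client.solrj.SolrClient; import org.apache.solr.client.solrj.impl.HttpSolrClient; import org.apache.solr.client.solrj.response.QueryResponse; import org.apache.solr.common.params.ModifiableSolrParams;

SolrClient leaderClient = new HttpSolrClient.Builder("http://leaderhost:8983/solr/collection1_shard1_replica1").build();
long localGeneration = 42L; // stand-in for the local index's current generation

ModifiableSolrParams p = new ModifiableSolrParams();
p.set("qt", "/replication");
p.set("command", "indexversion");
QueryResponse rsp = leaderClient.query(p);
long leaderGeneration = (Long) rsp.getResponse().get("generation");
if (leaderGeneration > localGeneration) {
  // only the missing segment files need downloading, as IndexFetcher
  // already does for slaves in master/slave replication
} {code}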
[jira] [Commented] (SOLR-9699) CoreStatus requests can fail if executed during a core reload
[ https://issues.apache.org/jira/browse/SOLR-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732572#comment-15732572 ] Erick Erickson commented on SOLR-9699: -- Easier to track if linked. > CoreStatus requests can fail if executed during a core reload > - > > Key: SOLR-9699 > URL: https://issues.apache.org/jira/browse/SOLR-9699 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Alan Woodward > > CoreStatus requests delegate some of their response down to a core's > IndexWriter. If the core is being reloaded, then there's a race between > these calls and the IndexWriter being closed, which can lead to the request > failing with an AlreadyClosedException. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732554#comment-15732554 ] Ishan Chattopadhyaya commented on SOLR-9835: I'm curious about the handling of searcher reopens (which would create a new segment, afaict). Such a searcher reopen can happen if an RTG request uses filters. {code} ulog.openRealtimeSearcher(); // force open a new realtime searcher {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732527#comment-15732527 ] Mark Miller commented on SOLR-9835: --- Soft commits are for near realtime and doing master->slave replication won't benefit from that. Effectively, soft commits won't be useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7586) fail precommit on varargsArgumentNeedCast
[ https://issues.apache.org/jira/browse/LUCENE-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732509#comment-15732509 ] Uwe Schindler commented on LUCENE-7586: --- I don't fully understand the issue description. The issue you enabled precommit checks for is only caught by the Eclipse compiler. The warning list on Jenkins is produced from javac's output - so how do the two relate to each other? I agree we should primarily fix the Oracle javac warnings (mostly rawtypes or unsafe casts and a few try-with-resources and fall-through warnings), but I can also work on enabling warning reports for the Eclipse compiler on Policeman Jenkins. But making it a failure for some cases (as you did) is also fine. > fail precommit on varargsArgumentNeedCast > - > > Key: LUCENE-7586 > URL: https://issues.apache.org/jira/browse/LUCENE-7586 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-7586.patch > > > Why this, why now? > I had noticed the Java Warnings (listing and trend chart) on [~thetaphi]'s > https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/ and after fixing > one really easy warning got curious about others of the same category. There > aren't any others, and so it would seem obvious to update the precommit checks (and > Eclipse settings) to prevent future introductions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
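For anyone wondering what this check catches, here is a self-contained example of the warning (the example is ours, not taken from the patch): {code} class VarargsDemo {
  static void log(String fmt, Object... args) { /* no-op, just for the warning demo */ }

  public static void main(String[] args) {
    log("value: %s%n", null);          // warning: is null the single Object arg or the whole Object[]?
    log("value: %s%n", (Object) null); // the explicit cast resolves the ambiguity
  }
} {code}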
[jira] [Commented] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active
[ https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732476#comment-15732476 ] Mark Miller commented on SOLR-9829: --- Well, you have a lot going on in this JIRA in terms of what you bring up. SOLR-7956 addresses the issue that causes that stack trace, though, and that issue is not related to GC. > Solr cannot provide index service after a large GC pause but core state in ZK > is still active > - > > Key: SOLR-9829 > URL: https://issues.apache.org/jira/browse/SOLR-9829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update >Affects Versions: 5.3.2 > Environment: Redhat enterprise server 64bit >Reporter: Forest Soup > > When Solr hits a large GC pause like > https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it > cannot provide service and never come back until restart. > But in ZooKeeper, the cores on that server still show active and the server > is also in live_nodes. > Some /update requests got http 500 due to "IndexWriter is closed". Some got > http 400 due to "possible analysis error." whose root cause is still > "IndexWriter is closed", which we think should return 500 > instead (documented in https://issues.apache.org/jira/browse/SOLR-9825). > Our questions in this JIRA are: > 1, should Solr mark cores as down in zk when it cannot provide index service? > 2, Is it possible for Solr to re-open the IndexWriter to provide index service again? > solr log snippets: > 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 > r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore > org.apache.solr.common.SolrException: Exception writing document id > Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C > to the index; possible analysis error. 
> at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167) > at > org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955) > at > org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110) > at > org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706) > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207) > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) > at > org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231) > at > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143) > at > org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113) > at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76) > at > org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068) > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672) > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) > a
[jira] [Updated] (LUCENE-7586) fail precommit on varargsArgumentNeedCast
[ https://issues.apache.org/jira/browse/LUCENE-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated LUCENE-7586: Attachment: LUCENE-7586.patch > fail precommit on varargsArgumentNeedCast > - > > Key: LUCENE-7586 > URL: https://issues.apache.org/jira/browse/LUCENE-7586 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-7586.patch > > > Why this, why now? > I had noticed the Java Warnings (listing and trend chart) on [~thetaphi]'s > https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/ and after fixing > one really easy warning got curious about others of the same category. There > aren't any and so it would seem obvious to update the precommit checks (and > Eclipse settings) to prevent future introductions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-7586) fail precommit on varargsArgumentNeedCast
Christine Poerschke created LUCENE-7586: --- Summary: fail precommit on varargsArgumentNeedCast Key: LUCENE-7586 URL: https://issues.apache.org/jira/browse/LUCENE-7586 Project: Lucene - Core Issue Type: Task Reporter: Christine Poerschke Assignee: Christine Poerschke Priority: Minor Why this, why now? I had noticed the Java Warnings (listing and trend chart) on [~thetaphi]'s https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/ and after fixing one really easy warning got curious about others of the same category. There aren't any and so it would seem obvious to update the precommit checks (and Eclipse settings) to prevent future introductions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active
[ https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732316#comment-15732316 ] Forest Soup commented on SOLR-9829: --- Thanks All! I have a mail thread tracking it. http://lucene.472066.n3.nabble.com/Solr-cannot-provide-index-service-after-a-large-GC-pause-but-core-state-in-ZK-is-still-active-td4308942.html Could you please comment on the questions in it? Thanks! @Mark and Varun, are you sure this issue is a dup of https://issues.apache.org/jira/browse/SOLR-7956 ? If yes, I'll try to backport it to 5.3.2. And also I see Daisy created a similar JIRA: https://issues.apache.org/jira/browse/SOLR-9830 . Although her root cause is too many open files, could you confirm whether it's also a dup of SOLR-7956?
[jira] [Commented] (SOLR-9828) Very long young generation stop the world GC pause
[ https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732274#comment-15732274 ] Forest Soup commented on SOLR-9828: --- The mail thread: http://lucene.472066.n3.nabble.com/Very-long-young-generation-stop-the-world-GC-pause-td4308911.html > Very long young generation stop the world GC pause > --- > > Key: SOLR-9828 > URL: https://issues.apache.org/jira/browse/SOLR-9828 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.2 > Environment: Linux Redhat 64bit >Reporter: Forest Soup > > We are using oracle jdk8u92 64bit. > The jvm memory related options: > -Xms32768m > -Xmx32768m > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -XX:+UseG1GC > -XX:+PerfDisableSharedMem > -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=100 > -XX:InitiatingHeapOccupancyPercent=35 > -XX:+AggressiveOpts > -XX:+AlwaysPreTouch > -XX:ConcGCThreads=16 > -XX:ParallelGCThreads=18 > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -verbose:gc > -XX:+PrintHeapAtGC > -XX:+PrintGCDetails > -XX:+PrintGCDateStamps > -XX:+PrintGCTimeStamps > -XX:+PrintTenuringDistribution > -XX:+PrintGCApplicationStoppedTime > -Xloggc:/mnt/solrdata1/log/solr_gc.log > It usually works fine. But recently we have hit very long stop-the-world young > generation GC pauses. Some snippets of the gc log are below: > 2016-11-22T20:43:16.436+: 2942054.483: Total time for which application > threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 > seconds > 2016-11-22T20:43:16.463+: 2942054.509: Total time for which application > threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 > seconds > {Heap before GC invocations=2246 (full 0): > garbage-first heap total 26673152K, used 4683965K [0x7f0c1000, > 0x7f0c108065c0, 0x7f141000) > region size 8192K, 162 young (1327104K), 17 survivors (139264K) > Metaspace used 56487K, capacity 57092K, committed 58368K, reserved > 59392K > 2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 88080384 bytes, new threshold 15 (max 15) > - age 1: 28176280 bytes, 28176280 total > - age 2:5632480 bytes, 33808760 total > - age 3:9719072 bytes, 43527832 total > - age 4:6219408 bytes, 49747240 total > - age 5:4465544 bytes, 54212784 total > - age 6:3417168 bytes, 57629952 total > - age 7:5343072 bytes, 62973024 total > - age 8:2784808 bytes, 65757832 total > - age 9:6538056 bytes, 72295888 total > - age 10:6368016 bytes, 78663904 total > - age 11: 695216 bytes, 79359120 total > , 97.2044320 secs] >[Parallel Time: 19.8 ms, GC Workers: 18] > [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: > 2942054612.7, Diff: 10.6] > [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: > 43.5] > [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] > [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.0] > [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: > 165.9] > [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] > [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: > 0.6] > [GC Worker Total (ms): Min: 
9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: > 308.7] > [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: > 2942054621.8, Diff: 0.0] >[Code Root Fixup: 0.1 ms] >[Code Root Purge: 0.0 ms] >[Clear CT: 0.2 ms] >[Other: 97184.3 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 8.5 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.4 ms] >[Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: > 4574.2M(25.4G)->3450.8M(26.8G)] > Heap after GC invocations=2247 (full 0): > garbage-first heap total 28049408K, used 3533601K [0x7f0c1000, > 0x7f0c10806b00, 0x7f141000) > region size 8192K, 21 young (172032K), 21 survivors (172032K) > Metaspace used 56487K, capacity 57092K, committed 58368K, reserved > 59392K > } > [Times: user=0.00 sys=94.28, real=97.19 secs] > 2016-11-22T20:44:53.760+: 2942151.806: Total time for which
[jira] [Commented] (SOLR-9828) Very long young generation stop the world GC pause
[ https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732269#comment-15732269 ] Forest Soup commented on SOLR-9828: --- Thanks Shawn, I'll use that mail thread to discuss this instead of this JIRA. Could you please help comment on the questions in the mail thread? Thanks! 1. As you can see in the GC log, the long GC pause is not a full GC; it's a young generation GC instead. In our case, full GC is fast while young GC hit some long stop-the-world pauses. Do you have any comments on that? We usually assume a full GC may cause a longer pause, while a young generation GC should be fine. 2. Will these JVM options make it better? -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=10 2016-11-22T20:43:16.463+0000: 2942054.509: Total time for which application threads were stopped: 0.0029195 seconds, Stopping threads took: 0.0000804 seconds {Heap before GC invocations=2246 (full 0): garbage-first heap total 26673152K, used 4683965K [0x00007f0c10000000, 0x00007f0c108065c0, 0x00007f1410000000) region size 8192K, 162 young (1327104K), 17 survivors (139264K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K 2016-11-22T20:43:16.555+0000: 2942054.602: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 88080384 bytes, new threshold 15 (max 15) - age 1: 28176280 bytes, 28176280 total - age 2: 5632480 bytes, 33808760 total - age 3: 9719072 bytes, 43527832 total - age 4: 6219408 bytes, 49747240 total - age 5: 4465544 bytes, 54212784 total - age 6: 3417168 bytes, 57629952 total - age 7: 5343072 bytes, 62973024 total - age 8: 2784808 bytes, 65757832 total - age 9: 6538056 bytes, 72295888 total - age 10: 6368016 bytes, 78663904 total - age 11: 695216 bytes, 79359120 total , 97.2044320 secs] [Parallel Time: 19.8 ms, GC Workers: 18] [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 2942054612.7, Diff: 10.6] [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 43.5] [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9] [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6] [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 308.7] [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 2942054621.8, Diff: 0.0] [Code Root Fixup: 0.1 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 97184.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 8.5 ms] [Ref Enq: 0.2 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.4 ms] [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 4574.2M(25.4G)->3450.8M(26.8G)] Heap after GC invocations=2247 (full 0): garbage-first heap total 28049408K, used 3533601K [0x00007f0c10000000, 0x00007f0c10806b00, 0x00007f1410000000) region size 8192K, 21 young (172032K), 21 survivors (172032K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K } [Times: user=0.00 sys=94.28, real=97.19 secs] 2016-11-22T20:44:53.760+0000: 2942151.806: Total time for which application threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 seconds > Very long young generation
stop the world GC pause > --- > > Key: SOLR-9828 > URL: https://issues.apache.org/jira/browse/SOLR-9828 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.2 > Environment: Linux Redhat 64bit >Reporter: Forest Soup > > We are using oracle jdk8u92 64bit. > The JVM memory-related options: > -Xms32768m > -Xmx32768m > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -XX:+UseG1GC > -XX:+PerfDisableSharedMem > -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=100 > -XX:InitiatingHeapOccupancyPercent=35 > -XX:+AggressiveOpts > -XX:+AlwaysPreTouch > -XX:ConcGCThreads=16 > -XX:ParallelGCThreads=18 > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -verbose:gc > -XX:+PrintHeapAtGC > -XX:+PrintGCDetails > -XX:+PrintGCDate
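On the second question in the comment above: those flags would normally go into the GC tuning section of solr.in.sh. A minimal sketch, assuming the stock GC_TUNE variable that bin/solr reads (the first three flags simply repeat settings already quoted in this issue, not a recommendation):
{code}
# Hypothetical solr.in.sh excerpt. -XX:G1NewSizePercent=10 raises the minimum
# young-generation size to 10% of the heap and must be preceded by
# -XX:+UnlockExperimentalVMOptions. Whether it helps here is exactly the open
# question asked above.
GC_TUNE="-XX:+UseG1GC \
  -XX:G1HeapRegionSize=8m \
  -XX:MaxGCPauseMillis=100 \
  -XX:+UnlockExperimentalVMOptions \
  -XX:G1NewSizePercent=10"
{code}
Separately, note that in the log above the 97-second pause is almost entirely kernel time ([Times: user=0.00 sys=94.28, real=97.19 secs]) while the parallel GC work itself took only 19.8 ms, which suggests the stall comes from the operating system (for example swapping or writing the GC log) rather than from the collector.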
[JENKINS] Lucene-Solr-6.x-Linux (32bit/jdk1.8.0_102) - Build # 2370 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2370/ Java: 32bit/jdk1.8.0_102 -server -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.core.TestDynamicLoading.testDynamicLoading Error Message: Could not get expected value 'X val' for path 'x' full output: { "responseHeader":{ "status":0, "QTime":0}, "params":{"wt":"json"}, "context":{ "webapp":"", "path":"/test1", "httpMethod":"GET"}, "class":"org.apache.solr.core.BlobStoreTestRequestHandler", "x":null}, from server: null Stack Trace: java.lang.AssertionError: Could not get expected value 'X val' for path 'x' full output: { "responseHeader":{ "status":0, "QTime":0}, "params":{"wt":"json"}, "context":{ "webapp":"", "path":"/test1", "httpMethod":"GET"}, "class":"org.apache.solr.core.BlobStoreTestRequestHandler", "x":null}, from server: null at __randomizedtesting.SeedInfo.seed([69AB5B57D561D90A:B1E6760022BC7CAA]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:535) at org.apache.solr.core.TestDynamicLoading.testDynamicLoading(TestDynamicLoading.java:232) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.l
[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active
[ https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forest Soup updated SOLR-9829: -- Description: When Solr hits a large GC pause like https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it cannot provide service and never come back until a restart. But in ZooKeeper, the cores on that server still show as active and the server is also in live_nodes. Some /update requests got http 500 due to "IndexWriter is closed". Some got http 400 due to "possible analysis error." whose root cause is still "IndexWriter is closed", which we think should return 500 instead (documented in https://issues.apache.org/jira/browse/SOLR-9825). Our questions in this JIRA are: 1. Should Solr mark cores as down in ZK when it cannot provide index service? 2. Is it possible for Solr to re-open the IndexWriter to provide index service again? solr log snippets: 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore org.apache.solr.common.SolrException: Exception writing document id Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C to the index; possible analysis error. at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955) at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113) at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068) at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672) at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) at org.eclipse.jetty.server.Server.handle(Server.java:499) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257) at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnec
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732199#comment-15732199 ] Jan Høydahl commented on SOLR-9815: --- I was more thinking about your solr.in.sh prior to upgrading to 6.3. Since the default in solr.in.sh used to be verbose GC, did you change it? Or is 6.3 the first Solr version you are running? Instead of empty GC_LOG_OPTS, you could try to pick any combination of the GC related JVM flags that suits your needs the best. > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
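To make "pick any combination of the GC related JVM flags" concrete, one possible solr.in.sh override (a sketch only, not a project default) keeps coarse pause information while dropping the per-collection detail; the GC_LOG_OPTS="" workaround quoted in the description disables GC logging entirely:
{code}
# Hypothetical solr.in.sh excerpt: keep date/time stamps and application
# pause summaries, but omit -verbose:gc, -XX:+PrintGCDetails and
# -XX:+PrintHeapAtGC, which produce most of the volume.
GC_LOG_OPTS="-XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime"
{code}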
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732190#comment-15732190 ] Gethin James commented on SOLR-9815: We are using an OOTB Solr 6.3 instance, no changes to GC_LOG_OPTS. After discovering the gc logging I am now using the workaround {code}GC_LOG_OPTS=""{code} in solr.in.sh > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9513) Introduce a generic authentication plugin which delegates all functionality to Hadoop authentication framework
[ https://issues.apache.org/jira/browse/SOLR-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732177#comment-15732177 ] Ishan Chattopadhyaya commented on SOLR-9513: GenericHadoopAuthPlugin and ConfigurableInternodeAuthHadoopPlugin sound fine. Just as an aside, I was wondering if we can drop "Generic" as well; but I leave it to your judgement. However, I would prefer to refrain from highlighting (while documenting) the distinction between standalone and SolrCloud too much. I think both are potentially useful in standalone mode, since there can be internode communication even in non-SolrCloud/standalone setups (during master/slave replication). > Introduce a generic authentication plugin which delegates all functionality > to Hadoop authentication framework > -- > > Key: SOLR-9513 > URL: https://issues.apache.org/jira/browse/SOLR-9513 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Hrishikesh Gadre > > Currently the Solr Kerberos authentication plugin delegates the core logic to the > Hadoop authentication framework, but the configuration parameters required by > the Hadoop authentication framework are hardcoded in the plugin code itself: > https://github.com/apache/lucene-solr/blob/5b770b56d012279d334f41e4ef7fe652480fd3cf/solr/core/src/java/org/apache/solr/security/KerberosPlugin.java#L119 > The problem with this approach is that we need to make code changes in Solr > to expose new capabilities added in the Hadoop authentication framework, e.g. > HADOOP-12082. > We should implement a generic Solr authentication plugin which will accept > configuration parameters via security.json (in ZooKeeper) and delegate them > to the Hadoop authentication framework. This will allow us to utilize new features > in Hadoop without code changes in Solr. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
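To make the pass-through idea concrete: the proposal is that whatever parameters the Hadoop authentication framework accepts would flow through security.json untouched. A hypothetical sketch follows; the plugin class name comes from the naming discussion above, but the config keys under defaultConfigs are illustrative placeholders, not a final API:
{code}
{
  "authentication": {
    "class": "solr.GenericHadoopAuthPlugin",
    "defaultConfigs": {
      "type": "kerberos",
      "kerberos.principal": "HTTP/host.example.com@EXAMPLE.COM",
      "kerberos.keytab": "/etc/security/keytabs/solr.keytab"
    }
  }
}
{code}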
[jira] [Commented] (SOLR-9835) Create another replication mode for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732174#comment-15732174 ] Pushkar Raste commented on SOLR-9835: - I am curious to know how soft commits (in-memory segments) would be handled. > Create another replication mode for SolrCloud > - > > Key: SOLR-9835 > URL: https://issues.apache.org/jira/browse/SOLR-9835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Cao Manh Dat > > The current replication mechanism of SolrCloud is called state machine: replicas > start in the same initial state, and each input is distributed across replicas so > all replicas end up in the same next state. But this type of replication has some > drawbacks: > - The commit (which is costly) has to run on all replicas > - Slow recovery, because if a replica misses more than N updates during its down > time, it has to download the entire index from its leader. > So we create another replication mode for SolrCloud called state transfer, which > acts like master/slave replication. Basically: > - The leader distributes the update to the other replicas, but only the leader > applies the update to its IndexWriter; the other replicas just store the update in > their UpdateLog (which acts like replication). > - Replicas frequently poll the latest segments from the leader. > Pros: > - Lightweight indexing, because only the leader runs commits and applies updates. > - Very fast recovery: replicas just have to download the missing segments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
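A hypothetical sketch of the "state transfer" update path described in the issue, using simplified stand-in types (this is not Solr's real update code; it only mirrors the bullet points above):
{code}
import java.util.ArrayList;
import java.util.List;

// Stand-in types: only the leader indexes; every replica logs the update,
// and non-leaders later pull finished segment files instead of re-indexing.
public class StateTransferSketch {
  interface IndexWriterLike { void addDocument(String doc); }

  static class Replica {
    final boolean isLeader;
    final IndexWriterLike writer;            // stand-in for Lucene's IndexWriter
    final List<String> updateLog = new ArrayList<>();

    Replica(boolean isLeader, IndexWriterLike writer) {
      this.isLeader = isLeader;
      this.writer = writer;
    }

    void onUpdate(String doc) {
      if (isLeader) {
        writer.addDocument(doc);             // only the leader indexes (and commits)
      }
      updateLog.add(doc);                    // every replica durably logs the update
    }

    void pollSegmentsFromLeader() {
      // A non-leader replica would periodically download only the segment
      // files it is missing, rather than replaying updates into its own writer.
    }
  }
}
{code}
Pushkar's question above is exactly about the gap in this sketch: a soft commit produces in-memory segments on the leader that segment polling alone would not transfer.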
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732169#comment-15732169 ] Jan Høydahl commented on SOLR-9815: --- Can you clarify what changes you made to GC_LOG_OPTS in your own environment prior to filing this issue? I assume you use a modified solr.in.sh since you did not expect to see verbose GC logging? > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732161#comment-15732161 ] Gethin James commented on SOLR-9815: So to be clear, you don't think this is a bug because OOTB solr should have verbose garbage collection logging, but there is a workaround (changing solr.in.sh) for those who don't want it? > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-9815: -- Issue Type: Wish (was: Bug) > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732148#comment-15732148 ] Jan Høydahl commented on SOLR-9815: --- So the OOTB settings are the same as the default solr.in.sh. If anyone tuned their solr.in.sh then that would still override the bin/solr defaults. If anyone REMOVED the GC_LOG_OPTS from solr.in.sh then they would see the defaults taking effect again, which I suppose is ok. They can always override with some custom settings of their own. So I agree this is not a bug. > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9815) Verbose Garbage Collection logging is on by default
[ https://issues.apache.org/jira/browse/SOLR-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732134#comment-15732134 ] David Smiley commented on SOLR-9815: I don't _think_ I introduced it; I kept the effective settings the same. GC_LOG_OPTS was *not* commented out in solr.in.sh when I did SOLR-7850. At least this was my intention; if I made a mistake then my bad. Anyway, if folks would rather this GC logging not happen by default then I'm fine with that. Personally I like it. Perhaps reducing the verbosity would address the O.P.'s concerns. > Verbose Garbage Collection logging is on by default > --- > > Key: SOLR-9815 > URL: https://issues.apache.org/jira/browse/SOLR-9815 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Affects Versions: 6.3 >Reporter: Gethin James >Priority: Minor > > There have been some excellent logging fixes in 6.3 > (http://www.cominvent.com/2016/11/07/solr-logging-just-got-better/). However > now, by default, Solr is logging a great deal of garbage collection > information. > It seems that this logging is excessive, can we make the default logging to > not be verbose? > For linux/mac setting GC_LOG_OPTS="" in solr.in.sh seems to work around the > issue, but looking at solr.cmd I don't think that will work for windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9699) CoreStatus requests can fail if executed during a core reload
[ https://issues.apache.org/jira/browse/SOLR-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731915#comment-15731915 ] Daisy.Yuan commented on SOLR-9699: -- I think https://issues.apache.org/jira/browse/SOLR-4668 is the same problem. The stack trace is provided. > CoreStatus requests can fail if executed during a core reload > - > > Key: SOLR-9699 > URL: https://issues.apache.org/jira/browse/SOLR-9699 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Alan Woodward > > CoreStatus requests delegate some of their response down to a core's > IndexWriter. If the core is being reloaded, then there's a race between > these calls and the IndexWriter being closed, which can lead to the request > failing with an AlreadyClosedException. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731833#comment-15731833 ] Michael McCandless commented on LUCENE-7583: bq. I was just wondering why you added the "if (newSize > currentSize)" like checks before ArrayUtil.grow. Because I wasn't trusting that the JVM would inline this method call. Also, I think these methods are poorly named. They should be {{maybeGrow}} or {{growIfNeeded}} if indeed they are lenient when you call them on an array that is not in fact in need of growing. If you feel strongly, I can remove that if, but I think it makes the code look sloppier. > Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD > leaf block? > - > > Key: LUCENE-7583 > URL: https://issues.apache.org/jira/browse/LUCENE-7583 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless > Fix For: master (7.0), 6.4 > > Attachments: LUCENE-7583-hardcode-writeVInt.patch, > LUCENE-7583.fork-FastOutputStream.patch, LUCENE-7583.patch, > LUCENE-7583.patch, LUCENE-7583.private-IndexOutput.patch > > > When BKD writes its leaf blocks, it's essentially a lot of tiny writes (vint, > int, short, etc.), and I've seen deep thread stacks through our IndexOutput > impl ({{OutputStreamIndexOutput}}) when pulling hot threads while BKD is > writing. > So I tried a small change, to have BKDWriter do its own buffering, by first > writing each leaf block into a {{RAMOutputStream}}, and then dumping that (in > 1 KB byte[] chunks) to the actual IndexOutput. > This gives a non-trivial reduction (~6%) in the total time for BKD writing + > merging time on the 20M NYC taxis nightly benchmark (2 times each): > Trunk, sparse: > - total: 64.691 sec > - total: 64.702 sec > Patch, sparse: > - total: 60.820 sec > - total: 60.965 sec > Trunk dense: > - total: 62.730 sec > - total: 62.383 sec > Patch dense: > - total: 58.805 sec > - total: 58.742 sec > The results seem to be consistent and reproducible. I'm using Java 1.8.0_101 > on a fast SSD on Ubuntu 16.04. > It's sort of weird and annoying that this helps so much, because > {{OutputStreamIndexOutput}} already uses java's {{BufferedOutputStream}} > (default 8 KB buffer) to buffer writes. > [~thetaphi] suggested maybe hotspot is failing to inline/optimize the > {{writeByte}} / the call stack just has too many layers. > We could commit this patch (it's trivial) but it'd be nice to understand and > fix why buffering writes is somehow costly so any other Lucene codec > components that write lots of little things can be improved too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
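For readers following the exchange: a minimal sketch of the pattern being debated (names are illustrative, but ArrayUtil.grow is the real Lucene utility). The explicit length check keeps the common no-growth path as a single branch the JIT can trivially inline, at the cost of the bounds logic being evaluated twice when growth does happen:
{code}
import org.apache.lucene.util.ArrayUtil;

public class GrowGuardExample {
  private byte[] scratch = new byte[128];

  void ensureCapacity(int needed) {
    if (needed > scratch.length) {                 // the "if" Uwe asked about
      scratch = ArrayUtil.grow(scratch, needed);   // grow() re-checks internally
    }
  }
}
{code}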
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731828#comment-15731828 ] Shalin Shekhar Mangar commented on SOLR-4735: - bq. Didn't Shalin already add a /admin/metrics endpoint for reporting metrics? In SOLR-9812, I've added the metrics servlet supplied by the library but it is no longer useful for the flexible registry scheme implemented here. I'm writing a custom handler for it now. I will upload a patch soon. > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Andrzej Bialecki >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, > SOLR-4735.patch, screenshot-1.png > > > Following on from a discussion on the mailing list: > http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+ > It would be good to make Solr play more nicely with existing devops > monitoring systems, such as Graphite or Ganglia. Stats monitoring at the > moment is poll-only, either via JMX or through the admin stats page. I'd > like to refactor things a bit to make this more pluggable. > This patch is a start. It adds a new interface, InstrumentedBean, which > extends SolrInfoMBean to return a > [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a > couple of MetricReporters (which basically just duplicate the JMX and admin > page reporting that's there at the moment, but which should be more > extensible). The patch includes a change to RequestHandlerBase showing how > this could work. The idea would be to eventually replace the getStatistics() > call on SolrInfoMBean with this instead. > The next step would be to allow more MetricReporters to be defined in > solrconfig.xml. The Metrics library comes with ganglia and graphite > reporting modules, and we can add contrib plugins for both of those. > There's some more general cleanup that could be done around SolrInfoMBean > (we've got two plugin handlers at /mbeans and /plugins that basically do the > same thing, and the beans themselves have some weirdly inconsistent data on > them - getVersion() returns different things for different impls, and > getSource() seems pretty useless), but maybe that's for another issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
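For readers unfamiliar with the library being wired in here, a minimal standalone sketch of the Codahale/Dropwizard MetricRegistry-and-reporter pattern the patch builds on (plain library usage, not Solr's integration; the metric name is arbitrary):
{code}
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;
import java.util.concurrent.TimeUnit;

public class MetricsExample {
  public static void main(String[] args) throws Exception {
    MetricRegistry registry = new MetricRegistry();
    Meter requests = registry.meter("requests");   // counts events and tracks rates

    // Reporters poll the registry on a schedule; the Graphite and Ganglia
    // reporters mentioned in the issue plug in the same way.
    ConsoleReporter reporter = ConsoleReporter.forRegistry(registry)
        .convertRatesTo(TimeUnit.SECONDS)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .build();
    reporter.start(10, TimeUnit.SECONDS);

    requests.mark();          // record one event
    Thread.sleep(11_000);     // let the reporter fire once
    reporter.stop();
  }
}
{code}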
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731809#comment-15731809 ] Ramkumar Aiyengar commented on SOLR-4735: - Didn't Shalin already add a /admin/metrics endpoint for reporting metrics? > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Andrzej Bialecki >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, > SOLR-4735.patch, screenshot-1.png > > > Following on from a discussion on the mailing list: > http://search-lucene.com/m/IO0EI1qdyJF1/codahale&subj=Solr+metrics+in+Codahale+metrics+and+Graphite+ > It would be good to make Solr play more nicely with existing devops > monitoring systems, such as Graphite or Ganglia. Stats monitoring at the > moment is poll-only, either via JMX or through the admin stats page. I'd > like to refactor things a bit to make this more pluggable. > This patch is a start. It adds a new interface, InstrumentedBean, which > extends SolrInfoMBean to return a > [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a > couple of MetricReporters (which basically just duplicate the JMX and admin > page reporting that's there at the moment, but which should be more > extensible). The patch includes a change to RequestHandlerBase showing how > this could work. The idea would be to eventually replace the getStatistics() > call on SolrInfoMBean with this instead. > The next step would be to allow more MetricReporters to be defined in > solrconfig.xml. The Metrics library comes with ganglia and graphite > reporting modules, and we can add contrib plugins for both of those. > There's some more general cleanup that could be done around SolrInfoMBean > (we've got two plugin handlers at /mbeans and /plugins that basically do the > same thing, and the beans themselves have some weirdly inconsistent data on > them - getVersion() returns different things for different impls, and > getSource() seems pretty useless), but maybe that's for another issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7583) Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD leaf block?
[ https://issues.apache.org/jira/browse/LUCENE-7583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731714#comment-15731714 ] Uwe Schindler commented on LUCENE-7583: --- Thanks Mike! I was just wondering why you added the "if (newSize > currentSize)" like checks before ArrayUtil.grow. ArrayUtil.grow does this already and exits early, so the check is done twice. > Can we improve OutputStreamIndexOutput's byte buffering when writing each BKD > leaf block? > - > > Key: LUCENE-7583 > URL: https://issues.apache.org/jira/browse/LUCENE-7583 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless > Fix For: master (7.0), 6.4 > > Attachments: LUCENE-7583-hardcode-writeVInt.patch, > LUCENE-7583.fork-FastOutputStream.patch, LUCENE-7583.patch, > LUCENE-7583.patch, LUCENE-7583.private-IndexOutput.patch > > > When BKD writes its leaf blocks, it's essentially a lot of tiny writes (vint, > int, short, etc.), and I've seen deep thread stacks through our IndexOutput > impl ({{OutputStreamIndexOutput}}) when pulling hot threads while BKD is > writing. > So I tried a small change, to have BKDWriter do its own buffering, by first > writing each leaf block into a {{RAMOutputStream}}, and then dumping that (in > 1 KB byte[] chunks) to the actual IndexOutput. > This gives a non-trivial reduction (~6%) in the total time for BKD writing + > merging time on the 20M NYC taxis nightly benchmark (2 times each): > Trunk, sparse: > - total: 64.691 sec > - total: 64.702 sec > Patch, sparse: > - total: 60.820 sec > - total: 60.965 sec > Trunk dense: > - total: 62.730 sec > - total: 62.383 sec > Patch dense: > - total: 58.805 sec > - total: 58.742 sec > The results seem to be consistent and reproducible. I'm using Java 1.8.0_101 > on a fast SSD on Ubuntu 16.04. > It's sort of weird and annoying that this helps so much, because > {{OutputStreamIndexOutput}} already uses java's {{BufferedOutputStream}} > (default 8 KB buffer) to buffer writes. > [~thetaphi] suggested maybe hotspot is failing to inline/optimize the > {{writeByte}} / the call stack just has too many layers. > We could commit this patch (it's trivial) but it'd be nice to understand and > fix why buffering writes is somehow costly so any other Lucene codec > components that write lots of little things can be improved too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-master - Build # 1176 - Still Unstable
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1176/ 5 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.ConcurrentDeleteAndCreateCollectionTest Error Message: ObjectTracker found 10 object(s) that were not released!!! [InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient, InternalHttpClient] org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException at org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:43) at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:267) at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:214) at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:201) at org.apache.solr.client.solrj.impl.HttpSolrClient.(HttpSolrClient.java:210) at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:830) at org.apache.solr.SolrTestCaseJ4.getHttpSolrClient(SolrTestCaseJ4.java:2275) at org.apache.solr.cloud.ConcurrentDeleteAndCreateCollectionTest.testConcurrentCreateAndDeleteDoesNotFail(ConcurrentDeleteAndCreateCollectionTest.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAda
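The ObjectTracker failure above means HttpSolrClient instances (and the InternalHttpClient each one wraps) were created but never closed. Not the actual fix for this test, but for illustration, the leak-free shape is a try-with-resources block, since the client is Closeable; the URL below is a placeholder and Builder.build() is the same call shown in the stack trace:
{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class ClosingClientExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
      client.query(new SolrQuery("*:*"));
    } // close() releases the underlying InternalHttpClient the tracker flagged
  }
}
{code}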