[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563361#comment-14563361 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1682293 from [~thelabdude] in branch 'dev/branches/lucene_solr_5_2' [ https://svn.apache.org/r1682293 ] SOLR-6820: fix numVersionBuckets name attribute in configsets The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication. - Key: SOLR-6820 URL: https://issues.apache.org/jira/browse/SOLR-6820 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Mark Miller Assignee: Timothy Potter Fix For: Trunk, 5.2 Attachments: SOLR-6820.patch, threads.png -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563336#comment-14563336 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1682291 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1682291 ] SOLR-6820: fix numVersionBuckets name attribute in configsets
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563320#comment-14563320 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1682288 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1682288 ] SOLR-6820: fix numVersionBuckets name attribute in configsets
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556774#comment-14556774 ] Timothy Potter commented on SOLR-6820: -- This has been committed to 5.2 but I forgot to include the ticket # in the commit message for the 5.2 branch :-(
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556412#comment-14556412 ] Erick Erickson commented on SOLR-6820: -- Should we put the 65K bucket default into 5.2? I don't see a good reason not to.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556417#comment-14556417 ] Yonik Seeley commented on SOLR-6820: I don't like that it's a band-aid around the real problem, but it's the best pseudo-workaround we currently have I guess (it's based on luck... we're just lowering the likelihood of a different thread hitting the blocked bucket). +1 for raising to 65536 for 5.2
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556497#comment-14556497 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1681169 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1681169 ] SOLR-6820: Increase the default number of buckets to 65536 instead of 256
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556512#comment-14556512 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1681171 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1681171 ] SOLR-6820: Increase the default number of buckets to 65536 instead of 256
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552506#comment-14552506 ] Shalin Shekhar Mangar commented on SOLR-6820: - Shouldn't we increase the default numBuckets to 65536? It doesn't look very expensive to do so -- the numBuckets are used to create an array of VersionBucket objects each of which contains a single long value. The benefit is quite significant for such a small cost.
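To put a rough number on the "small cost" mentioned above, here is a back-of-the-envelope sketch. `SketchBucket` is a hypothetical stand-in for `VersionBucket` (one object holding a single long), and the per-object and per-reference sizes are assumptions about a typical 64-bit JVM, not measurements from Solr.

```java
// Rough sketch only: SketchBucket is a hypothetical stand-in for Solr's VersionBucket.
final class VersionBucketCost {
    /** One lock object holding a single long value, as described in the comment above. */
    static final class SketchBucket {
        long highest;
    }

    /** Allocates one bucket object per slot. */
    static SketchBucket[] allocate(int numBuckets) {
        SketchBucket[] buckets = new SketchBucket[numBuckets];
        for (int i = 0; i < numBuckets; i++) {
            buckets[i] = new SketchBucket();
        }
        return buckets;
    }

    /** Estimated heap bytes for the array plus its bucket objects. */
    static long estimatedBytes(int numBuckets) {
        final long assumedBytesPerObject = 24;   // assumption: object header + one long, padded
        final long assumedBytesPerReference = 8; // assumption: uncompressed array references
        return numBuckets * (assumedBytesPerObject + assumedBytesPerReference);
    }

    public static void main(String[] args) {
        for (int n : new int[] {256, 65536}) {
            System.out.println(n + " buckets: ~" + estimatedBytes(n) / 1024 + " KiB");
        }
    }
}
```

Under these assumptions, 65536 buckets come to a couple of megabytes per core, which is the kind of cost the comment calls small.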
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552566#comment-14552566 ] Timothy Potter commented on SOLR-6820: -- [~shalinmangar] good point! I'm happy to make the default 65536 ... [~ysee...@gmail.com] [~markrmil...@gmail.com] any objections to changing the default setting from 256 to 65536? Thanks.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552456#comment-14552456 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1680593 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1680593 ] SOLR-6820: Make the number of version buckets used by the UpdateLog configurable as increasing beyond the default 256 has been shown to help with high volume indexing performance in SolrCloud
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552421#comment-14552421 ] ASF subversion and git services commented on SOLR-6820: --- Commit 1680586 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1680586 ] SOLR-6820: Make the number of version buckets used by the UpdateLog configurable as increasing beyond the default 256 has been shown to help with high volume indexing performance in SolrCloud
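For readers who want to try the setting, a configuration fragment along these lines should work. The `numVersionBuckets` name is the one used in the later commit messages on this issue. Placing it inside the `<updateLog>` element of solrconfig.xml is an assumption based on the commit saying the buckets belong to the UpdateLog, so check it against the configsets shipped with your release.

```xml
<!-- Sketch only: assumes the setting lives in the updateLog section of solrconfig.xml -->
<updateLog>
  <!-- number of version buckets used to synchronize concurrent updates -->
  <int name="numVersionBuckets">65536</int>
</updateLog>
```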
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14490307#comment-14490307 ] Timothy Potter commented on SOLR-6820: -- Thanks for the nice description of this Yonik! I think this thread dump shows the problem nicely:
{code}
recoveryExecutor-7-thread-1 prio=10 tid=0x7f0fe821e800 nid=0xbafc runnable [0x7f1003c3]
   java.lang.Thread.State: RUNNABLE
	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc(Lucene41PostingsReader.java:440)
	at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
	at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
	at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
	at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
	at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:545)
	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:284)
	- locked 0x0005d9e96768 (a org.apache.lucene.index.BufferedUpdatesStream)
	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3238)
	- locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3229)
	- locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
	at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:384)
	- locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
	- locked 0x0005dc1943a8 (a java.lang.Object)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:289)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:274)
	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
	at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1465)
	at org.apache.solr.update.UpdateLog.add(UpdateLog.java:424)
	- locked 0x0005dd011ab8 (a org.apache.solr.update.UpdateLog)
	at org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:449)
	- locked 0x0005dc1943d8 (a java.lang.Object)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:216)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
	at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:928)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1082)
	- locked 0x0005c2b560d0 (a org.apache.solr.update.VersionBucket)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:695)
	at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
	at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1343)
	at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1217)
	...
{code}
and
{code}
recoveryExecutor-7-thread-2 prio=10 tid=0x7ec5b4003000 nid=0xc131 waiting for monitor entry [0x7f1003aab000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:999)
	- waiting to lock 0x0005c2b560d0 (a org.apache.solr.update.VersionBucket)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:695)
	at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
	at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1343)
	at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1217)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	...
{code}
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488125#comment-14488125 ] Yonik Seeley commented on SOLR-6820: At first I couldn't understand why such a large number of buckets was needed... the math for two documents wanting to use the same bucket at the same time should be similar to the birthday problem: http://en.wikipedia.org/wiki/Birthday_problem , with n being the number of indexing threads, and d being the number of buckets. But then I realized... if one add takes a long time because Lucene used the thread to flush a segment, then other indexing threads will quickly pile up on the same bucket. Basically, an indexing thread quickly indexes documents into random slots until it accidentally hits the same bucket as the blocked thread, and then it stops. I guess having lots of buckets is the best work-around we can do for now (but it still doesn't cure the problem). The root cure would be a way for Lucene to not use client threads for long running index operations.
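The two effects described above can be put side by side in a small sketch. It is not taken from Solr: the class and method names are hypothetical, the thread count of 6 is only an illustrative value, and the model assumes every add picks a bucket uniformly at random while one bucket stays blocked.

```java
// Sketch under stated assumptions: uniform random bucket choice, one bucket held for a long time.
final class BucketContention {
    /** Birthday-problem probability that at least two of n threads pick the same of d buckets. */
    static double collisionProbability(int n, int d) {
        if (n > d) {
            return 1.0; // more threads than buckets: a shared bucket is certain
        }
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            allDistinct *= (double) (d - i) / d;
        }
        return 1.0 - allDistinct;
    }

    /**
     * Expected number of adds a thread attempts up to and including the one that
     * lands on the blocked bucket: the mean of a geometric distribution with p = 1/d.
     */
    static double expectedAddsUntilStall(int d) {
        return d;
    }

    public static void main(String[] args) {
        int threads = 6; // illustrative value, not from the issue
        for (int d : new int[] {256, 1024, 65536}) {
            System.out.printf("d=%d  P(simultaneous collision)=%.5f  adds until stall=%.0f%n",
                    d, collisionProbability(threads, d), expectedAddsUntilStall(d));
        }
    }
}
```

Under this model, the instantaneous collision probability is already low at 256 buckets. What more buckets buy is a longer expected run of adds before each thread hits the blocked bucket, which matches the "based on luck" description elsewhere in this thread.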
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389615#comment-14389615 ] Timothy Potter commented on SOLR-6820: -- jfyi - my patch for SOLR-7332 includes the fix for this as well, but I'll probably separate the two before committing
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243313#comment-14243313 ] Mark Miller commented on SOLR-6820: --- This may be related to SOLR-6838. I have to do some investigation with Overwrite = false.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14241957#comment-14241957 ] Mark Miller commented on SOLR-6820: --- In the instance I'm seeing, the hashing looks okay, but it seems that this can be hit and block for up to almost a minute in this test. It will hold the bucket sync and the other threads appear to lock up as well, I assume as each quickly hits a doc that needs the lock held by the thread blocked below:
{quote}
qtp1204167249-19 [BLOCKED]
org.apache.lucene.index.IndexWriter.publishFlushedSegment(SegmentCommitInfo, FrozenBufferedUpdates, FrozenBufferedUpdates) IndexWriter.java:2273
org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.publishFlushedSegment(IndexWriter, DocumentsWriterPerThread$FlushedSegment, FrozenBufferedUpdates) DocumentsWriterFlushQueue.java:198
org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.finishFlush(IndexWriter, DocumentsWriterPerThread$FlushedSegment, FrozenBufferedUpdates) DocumentsWriterFlushQueue.java:213
org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(IndexWriter) DocumentsWriterFlushQueue.java:249
org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(IndexWriter) DocumentsWriterFlushQueue.java:116
org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(IndexWriter) DocumentsWriterFlushQueue.java:149
org.apache.lucene.index.DocumentsWriter.purgeBuffer(IndexWriter, boolean) DocumentsWriter.java:183
org.apache.lucene.index.IndexWriter.purge(boolean) IndexWriter.java:4536
org.apache.lucene.index.IndexWriter.doAfterSegmentFlushed(boolean, boolean) IndexWriter.java:4550
org.apache.lucene.index.DocumentsWriter$MergePendingEvent.process(IndexWriter, boolean, boolean) DocumentsWriter.java:700
org.apache.lucene.index.IndexWriter.processEvents(Queue, boolean, boolean) IndexWriter.java:4578
org.apache.lucene.index.IndexWriter.processEvents(boolean, boolean) IndexWriter.java:4570
org.apache.lucene.index.IndexWriter.updateDocument(Term, IndexDocument, Analyzer) IndexWriter.java:1394
org.apache.solr.update.DirectUpdateHandler2.addDoc0(AddUpdateCommand) DirectUpdateHandler2.java:240
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand) DirectUpdateHandler2.java:164
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand) RunUpdateProcessorFactory.java:69
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(AddUpdateCommand) UpdateRequestProcessor.java:51
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(AddUpdateCommand) DistributedUpdateProcessor.java:931
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(AddUpdateCommand) DistributedUpdateProcessor.java:1085
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(AddUpdateCommand) DistributedUpdateProcessor.java:697
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(AddUpdateCommand) LogUpdateProcessorFactory.java:104
{quote}
Other intermittent times seem to lock up very briefly as all the threads hit the same bucket and one takes a moment to add the doc.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242024#comment-14242024 ] Yonik Seeley commented on SOLR-6820: bq. In the instance I'm seeing, the hashing looks okay, but it seems that this can be hit and block for up to almost a minute in this test. Hmmm, this looks like a Lucene limitation, and perhaps should be considered a bug. It looks like the IndexWriter is stealing the thread used to add a document in order to flush a complete segment (and hence a single add can sometimes take an order of magnitude longer?)
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240368#comment-14240368 ] Mark Miller commented on SOLR-6820: --- I'll collect the hot spots for each case shortly. Might be interesting to add some quick debug stats as well.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234189#comment-14234189 ] Mark Miller commented on SOLR-6820: --- In my simple testing for SOLR-6816, I can see that raising the number of buckets to 1024 from 256 seems to do nothing, but raising it to 65536 seems to give performance similar to removing the sync entirely. We should probably make the number of buckets configurable, consider our default, and review the hashing that is going on.
[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.
[ https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234328#comment-14234328 ] Yonik Seeley commented on SOLR-6820: That's pretty surprising... with a 6 core CPU, it's unclear how 256 buckets could cause that much contention. The ids are UUIDs, and they are hashed with murmur3... it should be well distributed. Is the occasional block causing something else bad to happen in the whole stack (sending clients, etc)?
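As a rough illustration of why hashed UUIDs should spread evenly over the buckets, here is a minimal sketch. It is not Solr's code: `mixHash` is a hypothetical stand-in that applies a murmur3-style finalizer to `String.hashCode()` rather than a full murmur3, and mapping the hash to a slot by masking assumes a power-of-two bucket count.

```java
import java.util.UUID;

// Sketch only: hypothetical names, not the Solr implementation.
final class BucketSelection {
    /** Stand-in for murmur3: a murmur3-style bit finalizer over String.hashCode(). */
    static int mixHash(String id) {
        int h = id.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Maps a hash onto a bucket slot; numBuckets is assumed to be a power of two. */
    static int slot(int hash, int numBuckets) {
        return hash & (numBuckets - 1);
    }

    public static void main(String[] args) {
        int numBuckets = 256;
        int[] counts = new int[numBuckets];
        for (int i = 0; i < 100_000; i++) {
            counts[slot(mixHash(UUID.randomUUID().toString()), numBuckets)]++;
        }
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int c : counts) {
            min = Math.min(min, c);
            max = Math.max(max, c);
        }
        // With a well-mixed hash, min and max should both sit near 100000 / 256.
        System.out.println("docs per bucket: min=" + min + " max=" + max);
    }
}
```

If the per-bucket counts come out roughly even, the hashing is unlikely to be the cause of the contention, which points back at a single slow add holding its bucket.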