[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563361#comment-14563361
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1682293 from [~thelabdude] in branch 'dev/branches/lucene_solr_5_2'
[ https://svn.apache.org/r1682293 ]

SOLR-6820: fix numVersionBuckets name attribute in configsets

 The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument 
 appears to be a large bottleneck when using replication.
 -

 Key: SOLR-6820
 URL: https://issues.apache.org/jira/browse/SOLR-6820
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Timothy Potter
 Fix For: Trunk, 5.2

 Attachments: SOLR-6820.patch, threads.png






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563336#comment-14563336
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1682291 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1682291 ]

SOLR-6820: fix numVersionBuckets name attribute in configsets




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-28 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563320#comment-14563320
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1682288 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1682288 ]

SOLR-6820: fix numVersionBuckets name attribute in configsets




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-22 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556774#comment-14556774
 ] 

Timothy Potter commented on SOLR-6820:
--

This has been committed to 5.2 but I forgot to include the ticket # in the 
commit message for the 5.2 branch :-(




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-22 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556412#comment-14556412
 ] 

Erick Erickson commented on SOLR-6820:
--

Should we put the 65K bucket default into 5.2? I don't see a good reason not to.




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-22 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556417#comment-14556417
 ] 

Yonik Seeley commented on SOLR-6820:


I don't like that it's a band-aid around the real problem, but it's the best 
pseudo-workaround we currently have I guess (it's based on luck... we're just 
lowering the likelihood of a different thread hitting the blocked bucket).

+1 for raising to 65536 for 5.2




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556497#comment-14556497
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1681169 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1681169 ]

SOLR-6820: Increase the default number of buckets to 65536 instead of 256




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556512#comment-14556512
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1681171 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1681171 ]

SOLR-6820: Increase the default number of buckets to 65536 instead of 256




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-20 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552506#comment-14552506
 ] 

Shalin Shekhar Mangar commented on SOLR-6820:
-

Shouldn't we increase the default numBuckets to 65536? It doesn't look very 
expensive to do so -- numBuckets is used to size an array of VersionBucket 
objects, each of which contains a single long value. The benefit is quite 
significant for such a small cost.
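
To make the cost argument above concrete, here is a minimal Java sketch of a striped bucket array. It is not Solr's VersionInfo code: the class name VersionBucketSketch, its methods, and the bytes-per-bucket figure are assumptions made for illustration. It only shows the idea discussed in this thread: a document id hash selects one bucket, and the add runs while holding that bucket's monitor.

```java
final class VersionBucketSketch {

    /** Hypothetical stand-in for a version bucket: a single long guarded by the object's monitor. */
    static final class Bucket {
        long highest;
    }

    private final Bucket[] buckets;

    VersionBucketSketch(int numBuckets) {
        // Assumption of this sketch: a power-of-two size, so a bit mask can pick the slot.
        if (numBuckets <= 0 || Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a positive power of two");
        }
        buckets = new Bucket[numBuckets];
        for (int i = 0; i < numBuckets; i++) {
            buckets[i] = new Bucket();
        }
    }

    /** Maps an id hash (possibly negative) to a slot in [0, numBuckets). */
    int slotFor(int idHash) {
        return idHash & (buckets.length - 1);
    }

    /** Records a version while holding only the lock of the bucket the id hashes to. */
    long recordVersion(int idHash, long version) {
        Bucket bucket = buckets[slotFor(idHash)];
        synchronized (bucket) {
            if (version > bucket.highest) {
                bucket.highest = version;
            }
            return bucket.highest;
        }
    }

    /** Rough memory estimate; bytesPerBucket is a caller-supplied guess, not a measured value. */
    static long estimateBytes(int numBuckets, int bytesPerBucket) {
        return (long) numBuckets * bytesPerBucket;
    }

    public static void main(String[] args) {
        VersionBucketSketch sketch = new VersionBucketSketch(65536);
        System.out.println("slot for hash 0x12345678: " + sketch.slotFor(0x12345678));
        System.out.println("estimated bytes: " + estimateBytes(65536, 32));
    }
}
```

With an assumed 32 bytes per bucket (object plus array slot), 65536 buckets come to 2,097,152 bytes, which is the sense in which the cost is small. The real per-bucket size depends on the JVM and is not measured here.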




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-20 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552566#comment-14552566
 ] 

Timothy Potter commented on SOLR-6820:
--

[~shalinmangar] good point! I'm happy to make the default 65536 ... 
[~ysee...@gmail.com] [~markrmil...@gmail.com] any objections to changing the 
default setting from 256 to 65536? Thanks.




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552456#comment-14552456
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1680593 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1680593 ]

SOLR-6820: Make the number of version buckets used by the UpdateLog 
configurable as increasing beyond the default 256 has been shown to help with 
high volume indexing performance in SolrCloud




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-05-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552421#comment-14552421
 ] 

ASF subversion and git services commented on SOLR-6820:
---

Commit 1680586 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1680586 ]

SOLR-6820: Make the number of version buckets used by the UpdateLog 
configurable as increasing beyond the default 256 has been shown to help with 
high volume indexing performance in SolrCloud




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-04-10 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14490307#comment-14490307
 ] 

Timothy Potter commented on SOLR-6820:
--

Thanks for the nice description of this Yonik! I think this thread dump shows 
the problem nicely:

{code}
recoveryExecutor-7-thread-1 prio=10 tid=0x7f0fe821e800 nid=0xbafc runnable [0x7f1003c3]
   java.lang.Thread.State: RUNNABLE
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc(Lucene41PostingsReader.java:440)
    at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
    at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:351)
    at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
    at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:545)
    at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:284)
    - locked 0x0005d9e96768 (a org.apache.lucene.index.BufferedUpdatesStream)
    at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3238)
    - locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
    at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3229)
    - locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:384)
    - locked 0x0005d9a4f2a8 (a org.apache.solr.update.SolrIndexWriter)
    - locked 0x0005dc1943a8 (a java.lang.Object)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:289)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:274)
    at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1465)
    at org.apache.solr.update.UpdateLog.add(UpdateLog.java:424)
    - locked 0x0005dd011ab8 (a org.apache.solr.update.UpdateLog)
    at org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:449)
    - locked 0x0005dc1943d8 (a java.lang.Object)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:216)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:928)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1082)
    - locked 0x0005c2b560d0 (a org.apache.solr.update.VersionBucket)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:695)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
    at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1343)
    at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1217)
    ...
{code}

and

{code}
recoveryExecutor-7-thread-2 prio=10 tid=0x7ec5b4003000 nid=0xc131 waiting for monitor entry [0x7f1003aab000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:999)
    - waiting to lock 0x0005c2b560d0 (a org.apache.solr.update.VersionBucket)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:695)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
    at org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1343)
    at org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1217)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    ...
{code}


[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-04-09 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488125#comment-14488125
 ] 

Yonik Seeley commented on SOLR-6820:


At first I couldn't understand why such a large number of buckets was needed... 
the math for two documents wanting to use the same bucket at the same time 
should be similar to the birthday problem: 
http://en.wikipedia.org/wiki/Birthday_problem , with n being the number of 
indexing threads, and d being the number of buckets.

But then I realized... if one add takes a long time because lucene used the 
thread to flush a segment, then other indexing threads will quickly pile up on 
the same bucket.  Basically, an indexing thread quickly indexes documents into 
random slots until it accidentally hits the same bucket as the blocked thread, 
and then it stops.
I guess having lots of buckets is the best work-around we can do for now (but 
it still doesn't cure the problem).  The root cure would be a way for Lucene to 
not use client threads for long running index operations.
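
The two effects described above can be put into numbers with a small Java sketch. It assumes document hashes are uniform and independent across buckets, and the class and method names are hypothetical. The first method is the birthday-problem probability that at least two of n threads want the same one of d buckets at the same instant. The other two model the pile-up case, where one bucket stays locked and every other thread keeps indexing until it happens to hash into that bucket.

```java
final class BucketContentionMath {

    /** Birthday problem: probability that at least two of `threads` pick the same bucket. */
    static double collisionProbability(int threads, int buckets) {
        if (threads > buckets) {
            return 1.0; // pigeonhole: a shared bucket is unavoidable
        }
        double allDistinct = 1.0;
        for (int i = 0; i < threads; i++) {
            allDistinct *= (double) (buckets - i) / buckets;
        }
        return 1.0 - allDistinct;
    }

    /**
     * With one bucket blocked, each add hits it with probability 1/buckets,
     * so the number of adds until a thread blocks is geometric with mean `buckets`.
     */
    static double expectedAddsUntilBlocked(int buckets) {
        return buckets;
    }

    /** Probability that a thread has not yet hit the blocked bucket after `adds` adds. */
    static double probabilityStillRunning(int buckets, long adds) {
        return Math.pow(1.0 - 1.0 / buckets, adds);
    }

    public static void main(String[] args) {
        int threads = 6; // illustrative thread count, not a measured setting
        for (int buckets : new int[] {256, 1024, 65536}) {
            System.out.printf("buckets=%d collision=%.5f meanAdds=%.0f stillRunningAfter1000=%.5f%n",
                    buckets,
                    collisionProbability(threads, buckets),
                    expectedAddsUntilBlocked(buckets),
                    probabilityStillRunning(buckets, 1000));
        }
    }
}
```

Under these assumptions, once a bucket is blocked a thread gets through about 256 more adds on average with 256 buckets, versus about 65536 adds with 65536 buckets. More buckets only make it less likely that a thread reaches the blocked bucket; the long-running add is still there.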






[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2015-03-31 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389615#comment-14389615
 ] 

Timothy Potter commented on SOLR-6820:
--

jfyi - my patch for SOLR-7332 includes the fix for this as well, but I'll 
probably separate the two before committing




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243313#comment-14243313
 ] 

Mark Miller commented on SOLR-6820:
---

This may be related to SOLR-6838. I have to do some investigation with 
Overwrite = false.




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14241957#comment-14241957
 ] 

Mark Miller commented on SOLR-6820:
---

In the instance I'm seeing, the hashing looks okay, but it seems that this can 
be hit and block for up to almost a minute in this test. The blocked thread 
holds the bucket sync and the other threads appear to lock up as well, I assume 
as each quickly hits a doc that needs the lock held by the thread blocked below:

{quote}
 qtp1204167249-19 [BLOCKED]
org.apache.lucene.index.IndexWriter.publishFlushedSegment(SegmentCommitInfo, FrozenBufferedUpdates, FrozenBufferedUpdates) IndexWriter.java:2273
org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.publishFlushedSegment(IndexWriter, DocumentsWriterPerThread$FlushedSegment, FrozenBufferedUpdates) DocumentsWriterFlushQueue.java:198
org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.finishFlush(IndexWriter, DocumentsWriterPerThread$FlushedSegment, FrozenBufferedUpdates) DocumentsWriterFlushQueue.java:213
org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(IndexWriter) DocumentsWriterFlushQueue.java:249
org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(IndexWriter) DocumentsWriterFlushQueue.java:116
org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(IndexWriter) DocumentsWriterFlushQueue.java:149
org.apache.lucene.index.DocumentsWriter.purgeBuffer(IndexWriter, boolean) DocumentsWriter.java:183
org.apache.lucene.index.IndexWriter.purge(boolean) IndexWriter.java:4536
org.apache.lucene.index.IndexWriter.doAfterSegmentFlushed(boolean, boolean) IndexWriter.java:4550
org.apache.lucene.index.DocumentsWriter$MergePendingEvent.process(IndexWriter, boolean, boolean) DocumentsWriter.java:700
org.apache.lucene.index.IndexWriter.processEvents(Queue, boolean, boolean) IndexWriter.java:4578
org.apache.lucene.index.IndexWriter.processEvents(boolean, boolean) IndexWriter.java:4570
org.apache.lucene.index.IndexWriter.updateDocument(Term, IndexDocument, Analyzer) IndexWriter.java:1394
org.apache.solr.update.DirectUpdateHandler2.addDoc0(AddUpdateCommand) DirectUpdateHandler2.java:240
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand) DirectUpdateHandler2.java:164
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand) RunUpdateProcessorFactory.java:69
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(AddUpdateCommand) UpdateRequestProcessor.java:51
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(AddUpdateCommand) DistributedUpdateProcessor.java:931
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(AddUpdateCommand) DistributedUpdateProcessor.java:1085
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(AddUpdateCommand) DistributedUpdateProcessor.java:697
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(AddUpdateCommand) LogUpdateProcessorFactory.java:104
{quote}

Other intermittent times seem to lock up very briefly as all the threads hit 
the same bucket and one takes a moment to add the doc.




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-10 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242024#comment-14242024
 ] 

Yonik Seeley commented on SOLR-6820:


bq. In the instance I'm seeing, the hashing looks okay, but it seems that this 
can be hit and block for up to almost a minute in this test.

Hmmm, this looks like a Lucene limitation, and perhaps should be considered a 
bug.  It looks like the IndexWriter is stealing the thread used to add a 
document in order to flush a complete segment (and hence a single add can 
sometimes take an order of magnitude longer?)




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240368#comment-14240368
 ] 

Mark Miller commented on SOLR-6820:
---

I'll collect the hot spots for each case shortly. Might be interesting to add 
some quick debug stats as well.




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234189#comment-14234189
 ] 

Mark Miller commented on SOLR-6820:
---

In my simple testing for SOLR-6816, I can see that raising the number of 
buckets to 1024 from 256 seems to do nothing, but raising it to 65536 seems to 
give performance similar to removing the sync entirely.

We should probably make the number of buckets configurable, consider our 
default, and review the hashing that is going on. 




[jira] [Commented] (SOLR-6820) The sync on the VersionInfo bucket in DistributedUpdateProcesser#addDocument appears to be a large bottleneck when using replication.

2014-12-04 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234328#comment-14234328
 ] 

Yonik Seeley commented on SOLR-6820:


That's pretty surprising... with a 6 core CPU, it's unclear how 256 buckets 
could cause that much contention.  The ids are UUIDs, and they are hashed with 
murmur3... it should be well distributed.  Is the occasional block causing 
something else bad to happen in the whole stack (sending clients, etc)?
