[jira] [Commented] (SOLR-12338) Replay buffering tlog in parallel

Cao Manh Dat (JIRA) Thu, 10 May 2018 19:24:50 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471392#comment-16471392
 ]


Cao Manh Dat commented on SOLR-12338:
-------------------------------------

{quote}I have doubts on the use of a new ArrayBlockingQueue<>(1) per doc ID 
hash bucket. What if the client adds a Runnable for doc1, then immediately adds 
another Runnable for doc1. You're intending for the second runnable to block 
until the first completes to achieve the per-doc ID serialization. But this may 
not happen; a thread may start on the first runnable (which frees up the second 
runnable to be submitted), then the thread doesn't get CPU time, and then the 
other Runnable zooms ahead out-of-order. See what I mean?
{quote}
The queues are per thread (a small number), not per bucket. If I understand 
correctly, you mean that two threads are waiting for a lock to be released and 
the one that arrived later wins the lock. It seems this can be solved by 
setting the fair flag of {{ArrayBlockingQueue}} to true, right?
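
For illustration only, here is a minimal sketch of what the fair flag does. 
{{FairGate}} is a hypothetical name, not a class from the patch. The 
two-argument constructor {{ArrayBlockingQueue(int capacity, boolean fair)}} is 
the standard JDK one; with fair = true, threads blocked on insertion or removal 
are served in FIFO order. Whether that is enough to address the ordering 
concern quoted above is exactly the open question.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical one-slot gate; NOT the class from the SOLR-12338 patch. */
final class FairGate {
    private static final Object TOKEN = new Object();

    // Capacity 1, fair = true: threads blocked in put() are admitted
    // in the order in which they started waiting.
    private final BlockingQueue<Object> slot = new ArrayBlockingQueue<>(1, true);

    /** Blocks until the slot is free, then occupies it. */
    public void enter() {
        try {
            slot.put(TOKEN);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Frees the slot so that the longest-waiting thread can proceed. */
    public void exit() {
        slot.poll();
    }

    /** Returns true while some caller holds the slot. */
    public boolean isOccupied() {
        return !slot.isEmpty();
    }
}
```

A caller would wrap each task in enter() ... exit(), ideally with exit() in a 
finally block.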

{quote}
Also if you submit without an ID, then it should probably proceed right to the 
delegate Executor.  Why does it pick an ID at random?
{quote}
This lets us know how many tasks are running or pending, so OrderedExecutor 
never executes more than {{numThreads}} tasks in parallel. It also handles the 
case where the ExecutorService's queue is full and it would otherwise throw 
RejectedExecutionException.
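
A rough sketch of that idea, under assumptions: {{BoundedSubmitter}} and 
{{awaitIdle}} are hypothetical names, and a {{Semaphore}} stands in for the 
per-thread queues used in the patch. Every task, with or without an ID, must 
take a permit before it reaches the delegate, so at most {{numThreads}} tasks 
are queued or running at any time. If the delegate's queue can hold at least 
{{numThreads}} entries, it should therefore never fill up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch; NOT the OrderedExecutor from the SOLR-12338 patch. */
final class BoundedSubmitter {
    private final ExecutorService delegate;
    private final Semaphore permits;
    private final int numThreads;

    public BoundedSubmitter(ExecutorService delegate, int numThreads) {
        this.delegate = delegate;
        this.numThreads = numThreads;
        this.permits = new Semaphore(numThreads, true); // fair: FIFO among waiters
    }

    /** Blocks the caller while numThreads tasks are already in flight. */
    public void execute(Runnable task) {
        permits.acquireUninterruptibly();
        try {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot even if the task throws
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // the task was never accepted by the delegate
            throw e;
        }
    }

    /** Blocks until every task submitted so far has released its permit. */
    public void awaitIdle() {
        permits.acquireUninterruptibly(numThreads);
        permits.release(numThreads);
    }
}
```

The point of the design is that back-pressure is applied to the submitting 
thread rather than surfacing as a rejection from the delegate.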

> Replay buffering tlog in parallel
> ---------------------------------
>
>                 Key: SOLR-12338
>                 URL: https://issues.apache.org/jira/browse/SOLR-12338
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>         Attachments: SOLR-12338.patch, SOLR-12338.patch
>
>
> Since updates with different ids are independent, it is safe to replay them 
> in parallel. This will significantly reduce the recovery time of replicas in 
> high-load indexing environments. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

