[
https://issues.apache.org/jira/browse/SOLR-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474832#comment-16474832
]
David Smiley commented on SOLR-12338:
-------------------------------------
I looked at this again (after a few days of vacation) and I withdraw my concern
that there's a bug. The use of ArrayBlockingQueue(1) is acting as a sort of
Lock in the same way I suggested to use a Lock. Couldn't you simply replace it
with a Lock? The put() becomes a lock(), and the poll() becomes an unlock();
see what I mean? I think this is clearer since it's a simpler mechanism than
an ArrayBlockingQueue, and the use of ABQ in this specific way (size 1) could
lend itself to misuse later if someone thinks increasing its size or changing
its type gains us parallelism. And I don't think the fairness setting matters
here. And
although you initialized the size of this array of ABQ to be the number of
threads, I think we ought to use a larger array to prevent collisions (prevent
needlessly blocking on different docIDs that hash to the same thread).
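Roughly what I'm picturing (a sketch only; the class and method names here are
invented, not from the patch). One wrinkle: since the slot is acquired on the
submitting thread but released on the worker thread, the sketch uses a binary
Semaphore per slot rather than a ReentrantLock (which insists the owning
thread do the unlock); the effect is the same lock-per-bucket mechanism:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;

// Illustrative sketch only; class and method names are invented here.
class OrderedExecutorSketch {
  private final ExecutorService delegate;
  // Sized larger than the thread count to reduce hash collisions between
  // unrelated doc IDs.
  private final Semaphore[] slots;

  OrderedExecutorSketch(ExecutorService delegate, int numSlots) {
    this.delegate = delegate;
    this.slots = new Semaphore[numSlots];
    for (int i = 0; i < numSlots; i++) {
      slots[i] = new Semaphore(1); // binary: acts as a per-bucket lock
    }
  }

  void execute(Object id, Runnable command) throws InterruptedException {
    Semaphore slot = slots[Math.floorMod(id.hashCode(), slots.length)];
    slot.acquire();              // was: ArrayBlockingQueue(1).put(...)
    boolean submitted = false;
    try {
      delegate.execute(() -> {
        try {
          command.run();
        } finally {
          slot.release();        // was: ArrayBlockingQueue(1).poll()
        }
      });
      submitted = true;
    } finally {
      if (!submitted) {
        slot.release();          // e.g. the delegate rejected the task
      }
    }
  }
}
{code}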
I was also thinking of a way to have more "on-deck" runnables for a given
docID, waiting in line. The Runnable we submit to the delegate could be some
inner class OrderedRunnable that has a "next" pointer to the next
OrderedRunnable. We could maintain a parallel array of the top OrderedRunnable
(parallel to an array of Locks). Manipulating the OrderedRunnable chain
requires holding the lock. To bound how many of these wait in line, we
could use one Semaphore for the whole OrderedExecutor instance. There's more
to it than this. Of course this adds complexity, but the current approach
(either ABQ or Lock) can unfortunately block needlessly if the doc ID is locked
yet more/different doc IDs will soon be submitted while there are available
threads. Perhaps this is overthinking it (over-optimization / complexity) since
this won't be the common case? It would also matter even less if we increase
the Lock array to prevent collisions, so never mind, I guess.
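For reference, the rough shape of that idea (purely illustrative; none of
these names exist in the patch):
{code:java}
// Rough shape only; the promotion logic would have to run while holding the
// slot's lock, and an executor-wide Semaphore would bound how many of these
// are allowed to wait on deck.
class OrderedRunnable implements Runnable {
  final Runnable task;
  volatile OrderedRunnable next; // next task waiting in line for the same doc-ID slot

  OrderedRunnable(Runnable task) {
    this.task = task;
  }

  @Override
  public void run() {
    task.run();
    // After running: if 'next' is non-null, submit it to the delegate so the
    // chain drains in order; otherwise clear this slot's head pointer.
  }
}
{code}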
{quote}(RE Submit without ID) This can help us to know how many threads are
running (pending). Therefore OrderedExecutor does not execute more than
{{numThreads}} in parallel. It also solves the case when the ExecutorService's
queue is full and it throws RejectedExecutionException.
{quote}
Isn't this up to how the backing delegate is configured? If it's using a fixed
thread pool, then there won't be more threads running. Likewise for
RejectedExecutionException.
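i.e. wiring along these lines (illustrative only) already gives both behaviors
from the delegate itself:
{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class DelegateWiring {
  // The delegate, not OrderedExecutor, decides how many threads run and when
  // to reject work.
  static ThreadPoolExecutor newDelegate(int numThreads, int queueCapacity) {
    return new ThreadPoolExecutor(
        numThreads, numThreads,                   // fixed pool: never more than numThreads running
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(queueCapacity)); // bounded queue; the default AbortPolicy
                                                  // throws RejectedExecutionException when full
  }
}
{code}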
> Replay buffering tlog in parallel
> ---------------------------------
>
> Key: SOLR-12338
> URL: https://issues.apache.org/jira/browse/SOLR-12338
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Attachments: SOLR-12338.patch, SOLR-12338.patch
>
>
> Since updates with different ids are independent, it is safe to replay them
> in parallel. This will significantly reduce the recovery time of replicas in
> high-load indexing environments.