[
https://issues.apache.org/jira/browse/SOLR-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479197#comment-16479197
]
David Smiley commented on SOLR-12338:
-------------------------------------
Maybe you can propose {{SetBlockingQueue}} (or whatever name we settle on) to
Guava? Even if it's not accepted ultimately; there might be some great
feedback and/or pointers to something similar that proves useful, as this stuff
is hard so the more eyes the better.
I like that you've avoided hash collisions altogether by not doing hashes! Use
of ConcurrentHashMap<Integer,...> makes sense to me for such an approach.
However it appears we have some complexity to deal with since keys need to be
added and removed on demand, safely, which seems to be quite tricky.
* I think the "hash" variable should not be called this to avoid confusion as
there is no hashing. Maybe just "id" or "lockId"
* Do we still need the Random stuff?
* Maybe rename your "SetBlockingQueue" to "SetSemaphore" or probably better
"SetLock" as it does not hold anything (Queues hold stuff)
* Can "Semaphore sizeLock" be renamed to "sizeSemaphore" or "sizePermits" is it
does not extend Lock?
* Can the "closed" state be removed from SetBlockingQueue altogether? It's not
clear it actually needs to be "closed". It seems wrong; other concurrent
mechanisms don't have this notion (no Queue, Lock, or Semaphore does, etc.)
FWIW I stripped this from the class and the test passed.
* Perhaps its better to acquire() the size permit first in add() instead of
last to prevent lots of producing threads inserting keys into a map only to
eventually wait. Although it might add annoying try-finally to add() to ensure
we put the permit back if there's an exception after (e.g. interrupt). Heck;
maybe that's an issue no matter what the sequence is.
* Can the value side of the ConcurrentHashMap be a Lock (I guess ReentrantLock
impl)? It seems like the most direct concept we want; Semaphore is more than a
Lock as it tracks permits that we don't need here?
* The hot while loop of map.putIfAbsent seems fishy to me. Even if it may be
rare in practice, I wonder if we can do something simpler? You may get luck
with map.compute\* methods on ConcurrentHashMap which execute the lambda
atomically. Though I don't know if it's bad to block if we try to acquire a
lock within there. I see remove() removes the value of the Map but perhaps it
the value were a mechanism that tracked that there's a producer pending, then
we should not remove the value from the lock? If we did this, then maybe that
would simplify add()? I'm not sure.
Perhaps a simpler approach would involve involve a Set of weakly referenced
objects, and thus we don't need to worry about removal. In such a design add()
would need to return a reference to the member of the set, and that object
would have a "release()" method when done. I'm not sure if in practice these
might be GC'ed fast enough if they end up being usually very temporary? Shrug.
> Replay buffering tlog in parallel
> ---------------------------------
>
> Key: SOLR-12338
> URL: https://issues.apache.org/jira/browse/SOLR-12338
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Attachments: SOLR-12338.patch, SOLR-12338.patch, SOLR-12338.patch
>
>
> Since updates with different id are independent, therefore it is safe to
> replay them in parallel. This will significantly reduce recovering time of
> replicas in high load indexing environment.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]