[
https://issues.apache.org/jira/browse/HADOOP-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904363#comment-13904363
]
Daryn Sharp commented on HADOOP-10278:
--------------------------------------
Up front, my requirement is I don't want a feature that will be rarely used to
add additional lock contention during normal operation. I'm -1 to more locking.
bq. LinkedBlockingQueue separates its lock into only two locks, put side and
get side, so that congesion in each side causes context switches. Moreover the
separation of the lock is ineffective when there are almost always empty in the
queue, and collision between put and get causes context switches.
I'm not sure I understand. Context switches are inevitable if the queue is
empty. Under normal conditions the put/take lock reduces contention.
bq. An issue arises when production > consumption:
I thought about that case with the original analysis but I didn't want the post
to be too big. If the NN is under heavy load, the callq is full, and the user
uses a new queue is of a much smaller size - I'd argue that's a case of user
error. The draining thread will compete with new calls but should eventually
complete.
bq. The atomic reference is not needed for the queue and volatile is enough if
you only set and get. Volatile variables may have much chance to be optimized
by VM because volatile is within the language specification.
AtomicRef is just a wrapper around a volatile (by nature can't be optimized)
with CAS methods. I like the fact AtomicRef provides an obvious hint that the
value is going to be changed, plus we should be using CAS to swap the queue ref
to prevent races.
---
So how about something like this rough idea:
# CallQueueManager (or whatever it's called) maintains a putRef and takeRef,
initially set to the same queue
# for a swap, use CAS to update the putRef with the new callq so readers begin
filling it - note it's not yet being serviced by handlers
# now wait for handlers to drain all the calls in takeRef's old callq (1) -
periodically check isEmpty()?
# finally update takeRef to point to the new callq
(1)There's a race where readers may drop in 1 more call after the callq
swapped. Using isEmpty() is imperfect due to the race. I dislike time-based
solutions, but if handlers are polling for a few seconds, that should be plenty
of time to get straggler calls before the handlers switch to the new/current
queue.
> Refactor to make CallQueue pluggable
> ------------------------------------
>
> Key: HADOOP-10278
> URL: https://issues.apache.org/jira/browse/HADOOP-10278
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: ipc
> Reporter: Chris Li
> Attachments: HADOOP-10278-atomicref-adapter.patch,
> HADOOP-10278-atomicref-adapter.patch, HADOOP-10278-atomicref-rwlock.patch,
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch,
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch,
> HADOOP-10278.patch, HADOOP-10278.patch
>
>
> * Refactor CallQueue into an interface, base, and default implementation that
> matches today's behavior
> * Make the call queue impl configurable, keyed on port so that we minimize
> coupling
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)