[ 
https://issues.apache.org/jira/browse/HADOOP-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904363#comment-13904363
 ] 

Daryn Sharp commented on HADOOP-10278:
--------------------------------------

Up front, my requirement is I don't want a feature that will be rarely used to 
add additional lock contention during normal operation.  I'm -1 to more locking.

bq.  LinkedBlockingQueue separates its lock into only two locks, put side and 
get side, so that congesion in each side causes context switches. Moreover the 
separation of the lock is ineffective when there are almost always empty in the 
queue, and collision between put and get causes context switches.

I'm not sure I understand.  Context switches are inevitable if the queue is 
empty.  Under normal conditions the put/take lock reduces contention.

bq. An issue arises when production > consumption: 
I thought about that case with the original analysis but I didn't want the post 
to be too big.  If the NN is under heavy load, the callq is full, and the user 
uses a new queue is of a much smaller size - I'd argue that's a case of user 
error.  The draining thread will compete with new calls but should eventually 
complete.

bq. The atomic reference is not needed for the queue and volatile is enough if 
you only set and get. Volatile variables may have much chance to be optimized 
by VM because volatile is within the language specification. 

AtomicRef is just a wrapper around a volatile (by nature can't be optimized) 
with CAS methods.  I like the fact AtomicRef provides an obvious hint that the 
value is going to be changed, plus we should be using CAS to swap the queue ref 
to prevent races.
---

So how about something like this rough idea:
# CallQueueManager (or whatever it's called) maintains a putRef and takeRef, 
initially set to the same queue
# for a swap, use CAS to update the putRef with the new callq so readers begin 
filling it - note it's not yet being serviced by handlers
# now wait for handlers to drain all the calls in takeRef's old callq (1) - 
periodically check isEmpty()?
# finally update takeRef to point to the new callq

(1)There's a race where readers may drop in 1 more call after the callq 
swapped.  Using isEmpty() is imperfect due to the race.  I dislike time-based 
solutions, but if handlers are polling for a few seconds, that should be plenty 
of time to get straggler calls before the handlers switch to the new/current 
queue.

> Refactor to make CallQueue pluggable
> ------------------------------------
>
>                 Key: HADOOP-10278
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10278
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>            Reporter: Chris Li
>         Attachments: HADOOP-10278-atomicref-adapter.patch, 
> HADOOP-10278-atomicref-adapter.patch, HADOOP-10278-atomicref-rwlock.patch, 
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch, 
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch, 
> HADOOP-10278.patch, HADOOP-10278.patch
>
>
> * Refactor CallQueue into an interface, base, and default implementation that 
> matches today's behavior
> * Make the call queue impl configurable, keyed on port so that we minimize 
> coupling



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to