[ 
https://issues.apache.org/jira/browse/HADOOP-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902175#comment-13902175
 ] 

Chris Li commented on HADOOP-10278:
-----------------------------------

I'm going to post an updated patch with tests and [~daryn]'s suggested changes 
to offer shortly, but I wanted to comment on a potential issue with queue 
swapping.

bq. The logic that attempts to fallback to the old queue probably isn't 
required. The thread swapping should just block until it adds all calls to the 
new queue. Losing or dropping calls under any condition is not desirable. A 
client may be left waiting indefinitely for the lost call's response.

Currently, queue swapping can fail if the new queue throws an exception or 
returns false from offer(). When this happens, the old queue is restored as the 
active queue, and calls are drained back from the new queue to the old queue. 
This behavior was included to protect against the user accidentally 
misconfiguring the queue, such as going from a queue size of 10k to 10.
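
To make the fallback concrete, here is a minimal single-threaded sketch of the revert path. It is not the code from the patch: the class name, the method name, the AtomicReference holder, and the String call type are all assumptions chosen for illustration, and it ignores concurrent producers entirely.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: every name here is hypothetical, not taken from the patch.
class QueueSwapSketch {

  /**
   * Makes newQ the active queue and moves all pending calls into it.
   * Returns true on success, false if the swap was reverted.
   */
  static boolean swapWithFallback(AtomicReference<BlockingQueue<String>> active,
                                  BlockingQueue<String> newQ) {
    BlockingQueue<String> oldQ = active.getAndSet(newQ);
    String call;
    while ((call = oldQ.poll()) != null) {
      boolean accepted;
      try {
        accepted = newQ.offer(call);
      } catch (RuntimeException e) {
        accepted = false; // treat an exception like a rejected offer
      }
      if (!accepted) {
        // Revert: the old queue becomes active again.
        active.set(oldQ);
        // Return values are ignored here; with concurrent producers a false
        // result from any of these offers would mean a lost call.
        // Ordering of calls is not preserved by this sketch.
        oldQ.offer(call);
        String moved;
        while ((moved = newQ.poll()) != null) {
          oldQ.offer(moved);
        }
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    BlockingQueue<String> oldQ = new ArrayBlockingQueue<>(4);
    oldQ.add("call-1");
    oldQ.add("call-2");
    oldQ.add("call-3");
    AtomicReference<BlockingQueue<String>> active = new AtomicReference<>(oldQ);
    // New queue is too small for the backlog, so the swap is reverted.
    boolean swapped = swapWithFallback(active, new ArrayBlockingQueue<>(2));
    System.out.println("swapped=" + swapped + ", pending=" + active.get().size());
  }
}
```

In this toy run nothing is lost because no producer refills the old queue while the revert is in progress. That guarantee disappears once producers are active.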

An issue arises when production > consumption: 
1. Because we are using poll(timeout) and requests come in faster than we can 
poll them out, poll(timeout) will always return a non-null result, so the old 
queue never appears empty.
2. Handlers will switch to the new queue. They will either:
3a. not be able to keep up, as before, so the new queue will fill up to 
capacity, or
3b. be able to keep up, so the swapping will continue indefinitely, holding the 
Server's intrinsic lock forever.

In the case of 3a:
1. If we revert the swap, we will probably lose calls.
2. If we block and just keep going, we will be swapping forever.
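
The "swapping forever" condition can be illustrated with a small deterministic simulation. This is only a sketch under simplifying assumptions: production and draining happen in fixed-size rounds, there are no real threads or timeouts, and all names (DrainRaceSketch, roundsToDrain, and its parameters) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: models the drain loop as alternating rounds of
// production into the old queue and draining out of it.
class DrainRaceSketch {

  /**
   * Returns the round in which the old queue is first observed empty,
   * or -1 if it is still non-empty after maxRounds.
   */
  static int roundsToDrain(int backlog, int producedPerRound,
                           int drainedPerRound, int maxRounds) {
    Queue<Integer> oldQ = new ArrayDeque<>();
    for (int i = 0; i < backlog; i++) {
      oldQ.add(i);
    }
    for (int round = 1; round <= maxRounds; round++) {
      // Clients keep adding calls to the old queue during the swap.
      for (int i = 0; i < producedPerRound; i++) {
        oldQ.add(i);
      }
      // The swapping thread polls calls out, up to its capacity per round.
      for (int i = 0; i < drainedPerRound; i++) {
        if (oldQ.poll() == null) {
          break;
        }
      }
      if (oldQ.isEmpty()) {
        return round; // poll would finally return null; the swap can finish
      }
    }
    return -1; // never saw an empty queue: the swap would still be running
  }

  public static void main(String[] args) {
    // Consumption outpaces production: the drain terminates.
    System.out.println(roundsToDrain(10, 1, 3, 100)); // prints 5
    // Production >= consumption: the drain never terminates.
    System.out.println(roundsToDrain(10, 3, 3, 100)); // prints -1
  }
}
```

The second case corresponds to the situation above: as long as calls arrive at least as fast as they are polled out, the loop never observes an empty old queue, so any lock held for the duration of the swap is never released.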

> Refactor to make CallQueue pluggable
> ------------------------------------
>
>                 Key: HADOOP-10278
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10278
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>            Reporter: Chris Li
>         Attachments: HADOOP-10278-atomicref-adapter.patch, 
> HADOOP-10278-atomicref-rwlock.patch, HADOOP-10278-atomicref.patch, 
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch, 
> HADOOP-10278-atomicref.patch, HADOOP-10278.patch, HADOOP-10278.patch
>
>
> * Refactor CallQueue into an interface, base, and default implementation that 
> matches today's behavior
> * Make the call queue impl configurable, keyed on port so that we minimize 
> coupling



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
