[
https://issues.apache.org/jira/browse/HADOOP-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902175#comment-13902175
]
Chris Li commented on HADOOP-10278:
-----------------------------------
I'm going to post an updated patch with tests and [~daryn]'s suggested changes
to offer shortly, but I wanted to comment on a potential issue with queue
swapping.
bq. The logic that attempts to fallback to the old queue probably isn't
required. The thread swapping should just block until it adds all calls to the
new queue. Losing or dropping calls under any condition is not desirable. A
client may be left waiting indefinitely for the lost call's response.
Currently, queue swapping can fail if the new queue throws an exception or
returns false from offer(). When this happens, the old queue is restored as the
active queue, and calls are drained back from the new queue to the old queue. This
behavior was included to protect against the user accidentally misconfiguring
the queue, such as going from a queue size of 10k to 10.
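To make the fallback behavior concrete, here is a minimal sketch of a swap that reverts when the new queue rejects a call. It is not the code from the patch: the class SwapWithFallback, its methods, and the use of String as the call type are all hypothetical, and it uses a non-blocking poll() instead of poll(timeout) to keep the example short.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; names are illustrative and not taken from the patch.
class SwapWithFallback {
  private final AtomicReference<BlockingQueue<String>> active;

  SwapWithFallback(BlockingQueue<String> initial) {
    active = new AtomicReference<>(initial);
  }

  BlockingQueue<String> activeQueue() {
    return active.get();
  }

  // Returns true if the swap succeeded, false if it was reverted.
  // synchronized stands in for holding the server's intrinsic lock.
  synchronized boolean swap(BlockingQueue<String> newQueue) {
    BlockingQueue<String> oldQueue = active.getAndSet(newQueue);
    String call;
    while ((call = oldQueue.poll()) != null) {
      boolean accepted;
      try {
        accepted = newQueue.offer(call);
      } catch (RuntimeException e) {
        accepted = false; // treat an exception like a rejected offer
      }
      if (!accepted) {
        // Revert: make the old queue active again and drain calls back.
        // Ordering is not preserved, and these offers can themselves
        // fail if producers have refilled the old queue in the meantime,
        // which is where calls could be lost.
        active.set(oldQueue);
        oldQueue.offer(call);
        String moved;
        while ((moved = newQueue.poll()) != null) {
          oldQueue.offer(moved);
        }
        return false;
      }
    }
    return true;
  }
}
```

With no concurrent producers, swapping 20 queued calls into a queue of capacity 10 reverts cleanly and leaves all 20 calls in the old queue. The concern raised in this comment is what happens when producers are active during the revert.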
An issue arises when production > consumption:
1. Because we are using poll(timeout) and requests come in faster than we can
poll them out, poll(timeout) will always return a non-null result.
2. Handlers will switch to the new queue. They will either:
2a. not be able to keep up, as before, so the new queue will fill up to capacity
2b. be able to keep up, so the swapping will continue indefinitely, holding the
Server's intrinsic lock forever
In the case of 2a,
1. if we revert the swap, we will probably lose calls
2. if we block and just keep going, we will be swapping forever
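The "poll never returns null" point can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions: it assumes producers keep adding to the queue being drained, it models time as discrete steps rather than using threads or poll(timeout), and the class DrainSimulation and its parameters are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical single-threaded model of draining a queue under load.
class DrainSimulation {
  // Returns the step at which the drainer first sees an empty queue
  // (poll() == null), or -1 if that never happens within maxSteps.
  static int stepsUntilDrained(int backlog, int producedPerStep,
                               int drainedPerStep, int maxSteps) {
    Queue<Integer> oldQueue = new ArrayDeque<>();
    for (int i = 0; i < backlog; i++) {
      oldQueue.add(i);
    }
    for (int step = 1; step <= maxSteps; step++) {
      // Producers enqueue new calls during this step.
      for (int i = 0; i < producedPerStep; i++) {
        oldQueue.add(i);
      }
      // The swapping thread drains; a null poll means the swap can finish.
      for (int i = 0; i < drainedPerStep; i++) {
        if (oldQueue.poll() == null) {
          return step;
        }
      }
    }
    return -1;
  }
}
```

In this model, stepsUntilDrained(3, 0, 1, 100) finishes at step 4, while stepsUntilDrained(3, 1, 1, 100) returns -1: as long as production keeps pace with draining, the loop never observes an empty queue and the swap never completes.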
> Refactor to make CallQueue pluggable
> ------------------------------------
>
> Key: HADOOP-10278
> URL: https://issues.apache.org/jira/browse/HADOOP-10278
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: ipc
> Reporter: Chris Li
> Attachments: HADOOP-10278-atomicref-adapter.patch,
> HADOOP-10278-atomicref-rwlock.patch, HADOOP-10278-atomicref.patch,
> HADOOP-10278-atomicref.patch, HADOOP-10278-atomicref.patch,
> HADOOP-10278-atomicref.patch, HADOOP-10278.patch, HADOOP-10278.patch
>
>
> * Refactor CallQueue into an interface, base, and default implementation that
> matches today's behavior
> * Make the call queue impl configurable, keyed on port so that we minimize
> coupling