[
https://issues.apache.org/jira/browse/HBASE-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348059#comment-15348059
]
Hiroshi Ikeda commented on HBASE-15146:
---------------------------------------
If all clients are well-behaved and patiently wait for their responses,
blocking the reader threads doesn't create a vicious circle. Latency gets worse
in proportion to the congestion, which is expected as part of graceful
degradation. It may be true that latency becomes unacceptable under heavy load,
but who knows whether the blocked tasks can't be answered in a reasonable time?
The preceding tasks might be quite light. There is also a plan to adopt
AdaptiveLifoCoDelCallQueue these days, and with it a blocked task might be
executed immediately once it is released into the queue.
When I think about a full-queue error response, I can't imagine what a client
should do when it receives one, even if the error arrives with very low latency.
Such clients will probably resend their request again and again until they draw
a winning ticket, and the overall latency to get a useful result will be
unpredictably longer. The reader threads keep robbing the worker threads of CPU
time, with excessive context-switch overhead, and the worker threads can hardly
execute tasks and clear the full-queue condition. Ironically, this gets worse
the quicker the error response is.
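To make the two behaviours being compared concrete, here is a minimal sketch
using a plain java.util.concurrent bounded queue. It is not the actual HBase
dispatch code; the class DispatchSketch and both method names are hypothetical
and exist only for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for a reader thread handing calls to a scheduler queue. */
public class DispatchSketch {

    /** Blocking hand-off: the calling (reader) thread waits until a slot is free. */
    public static void dispatchBlocking(BlockingQueue<Runnable> queue, Runnable call)
            throws InterruptedException {
        queue.put(call);
    }

    /**
     * Non-blocking hand-off: returns false at once when the queue is full,
     * so the caller has to answer the client with a full-queue error.
     */
    public static boolean dispatchOrReject(BlockingQueue<Runnable> queue, Runnable call) {
        return queue.offer(call);
    }

    /** Small demonstration with a queue of capacity 1 (an arbitrary value). */
    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);
        System.out.println(dispatchOrReject(queue, () -> { })); // first call fits
        System.out.println(dispatchOrReject(queue, () -> { })); // queue is full, rejected
    }
}
```

With the second variant, a rejected client that simply retries brings the same
request straight back to the reader thread, which is the retry loop described
above.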
BTW, I now doubt whether the queue should be bounded in the first place. If all
clients are well-behaved and wait for their responses, the number of requests is
naturally bounded by the number of connections, and native socket resources
will run out before the heap does.
AdaptiveLifoCoDelCallQueue also seems to assume that all requests can be
queued.
There are some cheating clients that don't wait for their responses and send
multiple requests to be executed simultaneously. I think we can count the
number of simultaneous requests per connection and roughly cap it, probably
based on the number of connections (excluding idle connections that have no
requests?), the number of queued requests, some threshold, etc.
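A rough sketch of such a per-connection cap follows. Everything in it is an
assumption for illustration: ConnectionLimiter is not an HBase class, and the
capFor formula (an overall request budget divided among active connections) is
only one possible way to derive the cap.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-connection counter of simultaneous (in-flight) requests. */
public class ConnectionLimiter {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final int maxInFlight;

    public ConnectionLimiter(int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be at least 1");
        }
        this.maxInFlight = maxInFlight;
    }

    /**
     * Tries to admit one more request on this connection. A false result
     * means the connection is already at its cap, so the reader could stop
     * reading from it until a response has been sent.
     */
    public boolean tryAcquire() {
        while (true) {
            int current = inFlight.get();
            if (current >= maxInFlight) {
                return false;
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Called once the response for an admitted request has been sent. */
    public void release() {
        inFlight.decrementAndGet();
    }

    public int inFlight() {
        return inFlight.get();
    }

    /**
     * Assumed formula: share an overall request budget evenly among the
     * active (non-idle) connections, allowing at least one request each.
     */
    public static int capFor(int requestBudget, int activeConnections) {
        return Math.max(1, requestBudget / Math.max(1, activeConnections));
    }
}
```

A well-behaved client that waits for each response never has more than one
request in flight, so only clients that pipeline requests would ever hit the
cap.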
> Don't block on Reader threads queueing to a scheduler queue
> -----------------------------------------------------------
>
> Key: HBASE-15146
> URL: https://issues.apache.org/jira/browse/HBASE-15146
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Priority: Blocker
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-15146-v7.patch, HBASE-15146-v8.patch,
> HBASE-15146-v8.patch, HBASE-15146.0.patch, HBASE-15146.1.patch,
> HBASE-15146.2.patch, HBASE-15146.3.patch, HBASE-15146.4.patch,
> HBASE-15146.5.patch, HBASE-15146.6.patch
>
>
> Blocking on the epoll thread is awful. The new rpc scheduler can have lots of
> different queues. Those queues have different capacity limits. Currently the
> dispatch method can block trying to add to the blocking queue in any of the
> schedulers.
> This causes readers to block, tcp acks are delayed, and everything slows down.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)