[ https://issues.apache.org/jira/browse/HBASE-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348059#comment-15348059 ]

Hiroshi Ikeda commented on HBASE-15146:
---------------------------------------

If all clients are well-behaved and patiently wait for their responses, blocking 
the reader threads doesn't become a vicious circle. Latency degrades according to 
how congested the server is, and that is expected as part of gracefully degrading 
performance. It is true that the latency may become unacceptable under heavy 
load, but who is to say that a blocked task can't still be answered in a 
reasonable time? The tasks ahead of it might be quite light. Moreover, there is 
now a plan to adopt AdaptiveLifoCoDelCallQueue, and a blocked task might be 
executed immediately once it is released into the queue.
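
To make the CoDel point concrete, here is a rough sketch of the adaptive-LIFO/CoDel 
idea (my own illustration only, not the actual AdaptiveLifoCoDelCallQueue code; the 
class and field names are made up): while the queue is congested, the newest call 
is served first and calls that have already waited past a target delay are shed, 
so a call queued behind a long backlog can still be answered quickly.

    import java.util.concurrent.LinkedBlockingDeque;

    class SketchLifoCoDelQueue {
      private final LinkedBlockingDeque<Call> deque = new LinkedBlockingDeque<>();
      private final int lifoThreshold;       // switch to LIFO when the backlog exceeds this
      private final long targetDelayMillis;  // shed calls older than this while congested

      SketchLifoCoDelQueue(int lifoThreshold, long targetDelayMillis) {
        this.lifoThreshold = lifoThreshold;
        this.targetDelayMillis = targetDelayMillis;
      }

      void offer(Call call) {
        deque.offerLast(call);
      }

      // Called by a handler (worker) thread to pick the next call to run.
      Call take() throws InterruptedException {
        while (true) {
          boolean congested = deque.size() > lifoThreshold;
          // Under congestion serve the newest call first (LIFO), otherwise FIFO.
          Call call = congested ? deque.takeLast() : deque.takeFirst();
          long waited = System.currentTimeMillis() - call.enqueueTimeMillis;
          if (congested && waited > targetDelayMillis) {
            call.dropAsTooOld();  // shed the stale call instead of running it
            continue;
          }
          return call;
        }
      }

      static class Call {
        final long enqueueTimeMillis = System.currentTimeMillis();
        void dropAsTooOld() { /* respond with an error/timeout to the stale call */ }
      }
    }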

When I think about a full-queue error response, I can't imagine what a client 
should do when it receives that error, even if the error itself comes back with 
very low latency. Such clients will probably resend their request again and again 
until they draw a winning ticket, and the total latency to get a useful result 
becomes unpredictably long. Meanwhile the reader threads keep robbing the worker 
threads of CPU time, with the extra overhead of context switches, so the worker 
threads can hardly execute tasks and clear the full-queue condition. Ironically, 
the quicker the error response, the worse this gets.
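
For example, a client facing that error will likely end up in a loop like the 
following (a hypothetical illustration, not HBase client code; Rpc and 
FullQueueException are made-up names). Even with backoff, every rejected attempt 
is another request the reader threads must read, decode and answer.

    import java.util.concurrent.ThreadLocalRandom;

    class RetryOnFullQueue {
      interface Rpc { Object call() throws FullQueueException; }
      static class FullQueueException extends Exception {}

      static Object callWithRetry(Rpc rpc) throws InterruptedException {
        long backoffMillis = 10;
        while (true) {
          try {
            return rpc.call();  // yet another request for the reader threads to parse
          } catch (FullQueueException e) {
            // Each rejected attempt still cost the server a read, a decode and a response.
            Thread.sleep(backoffMillis + ThreadLocalRandom.current().nextLong(backoffMillis));
            backoffMillis = Math.min(backoffMillis * 2, 1000);
          }
        }
      }
    }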

BTW, in the first place, I'm now doubting whether the queue should be bounded at 
all. If all clients are well-behaved and wait for their responses, the number of 
outstanding requests is naturally bounded by the number of connections, and the 
native socket resources will run out before the heap does. 
AdaptiveLifoCoDelCallQueue also seems to assume that every request can be queued.

There are also cheating clients that don't wait for their responses and send 
multiple requests to be executed simultaneously. I think we can count the number 
of simultaneous requests for each connection and roughly cap it, with the cap 
probably derived from the number of connections (excluding idle connections that 
have no outstanding request?) and the number of queued requests, with some 
threshold, etc.
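
As a rough sketch of what I mean (my own illustration; the names and the cap 
formula are assumptions, and it relies on a single reader thread servicing a 
given connection):

    import java.util.concurrent.atomic.AtomicInteger;

    class PerConnectionThrottle {
      static class Connection {
        final AtomicInteger inFlight = new AtomicInteger();
      }

      private final int totalCapacity;  // e.g. call-queue capacity plus handler count
      private final AtomicInteger busyConnections = new AtomicInteger();

      PerConnectionThrottle(int totalCapacity) {
        this.totalCapacity = totalCapacity;
      }

      // Fair share per non-idle connection, never below 1.
      private int capPerConnection() {
        int conns = Math.max(1, busyConnections.get());
        return Math.max(1, totalCapacity / conns);
      }

      // The reader thread calls this before dispatching a call; false means reject.
      // Assumes a single reader thread services a given connection.
      boolean tryAdmit(Connection c) {
        if (c.inFlight.getAndIncrement() == 0) {
          busyConnections.incrementAndGet();  // the connection is no longer idle
        }
        if (c.inFlight.get() <= capPerConnection()) {
          return true;
        }
        release(c);  // over the cap: undo the count and reject the call
        return false;
      }

      // A worker thread calls this when the call completes (or was rejected).
      void release(Connection c) {
        if (c.inFlight.decrementAndGet() == 0) {
          busyConnections.decrementAndGet();  // the connection is idle again
        }
      }
    }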

> Don't block on Reader threads queueing to a scheduler queue
> -----------------------------------------------------------
>
>                 Key: HBASE-15146
>                 URL: https://issues.apache.org/jira/browse/HBASE-15146
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Blocker
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-15146-v7.patch, HBASE-15146-v8.patch, 
> HBASE-15146-v8.patch, HBASE-15146.0.patch, HBASE-15146.1.patch, 
> HBASE-15146.2.patch, HBASE-15146.3.patch, HBASE-15146.4.patch, 
> HBASE-15146.5.patch, HBASE-15146.6.patch
>
>
> Blocking on the epoll thread is awful. The new rpc scheduler can have lots of 
> different queues. Those queues have different capacity limits. Currently the 
> dispatch method can block trying to add to the blocking queue in any of the 
> schedulers.
> This causes readers to block, TCP ACKs are delayed, and everything slows down.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
