[ https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479622#comment-15479622 ]

Phil Yang commented on HBASE-16583:
-----------------------------------

As I said, what we need is not the "traditional" SEDA. Maybe the title should 
be changed :) 

I think our real goal is to reduce the time our worker threads spend blocked 
(or even eliminate blocking entirely), and then do the same for all threads. 
When no thread ever blocks, we have TPC. Right now we have to configure a large 
number of handlers to counteract the blocked ones, which results in more thread 
scheduling. HBASE-16492 does a simple piece of work: it sets timeouts on the 
blocking points according to the rpc timeout from clients, so handlers will not 
stay blocked too long and waste time. It is easy and not a long-term effort, so 
I think we can do it before 1.4 releases.
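To illustrate the idea (this is only a hedged sketch, not the HBASE-16492 patch; the method and variable names are invented for illustration): instead of a handler blocking indefinitely on a slow operation, it waits at most the client's remaining rpc timeout and then gives up, freeing the handler thread.

```java
import java.util.concurrent.*;

// Illustrative sketch of bounding a handler's wait by the client's rpc
// timeout. Nothing here is actual HBase code; readWithDeadline and
// remainingRpcTimeoutMs are hypothetical names.
public class BoundedWait {
    // Wait for the blocking operation, but never past the client's deadline.
    static String readWithDeadline(Future<String> blockingRead, long remainingRpcTimeoutMs)
            throws Exception {
        try {
            return blockingRead.get(remainingRpcTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The client would have given up anyway: release the handler
            // instead of keeping it blocked and wasting a thread.
            blockingRead.cancel(true);
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Fast operation: completes well within the deadline.
        Future<String> fast = pool.submit(() -> "row-value");
        System.out.println(readWithDeadline(fast, 1000));
        // Slow operation: the handler is released after 50 ms instead of 5 s.
        Future<String> slow = pool.submit(() -> { Thread.sleep(5000); return "late"; });
        System.out.println(readWithDeadline(slow, 50));
        pool.shutdownNow();
    }
}
```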

We have several blocking points in the read/write path. Some blocking 
operations can easily be changed to asynchronous, but the others are still 
synchronous. Now we have AsyncWAL, so we have an asynchronous way to write to 
HDFS. But our HDFS reader and the API to the NN are still synchronous. We can 
use an EventLoop to do all the asynchronous logic, but we have to use a thread 
pool for the synchronous work, although the API exposed to the read/write path 
can still be non-blocking, like FSHLog. The thread pools are why I titled this 
SEDA. So before we get to TPC completely, we may use a mixed architecture where 
some places are like SEDA and some places are like TPC.
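The mixed architecture can be sketched as follows (a minimal illustration, assuming a plain `ExecutorService` for the blocking pool; none of these names are real HBase or HDFS APIs): the event loop side stays non-blocking, and any still-synchronous call is shipped to a dedicated pool that hands the result back as a future, FSHLog-style.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

// Sketch of exposing a non-blocking API over synchronous work (e.g. an HDFS
// read or a NN call) by running it on a dedicated pool. Illustrative only.
public class SyncToAsyncBridge {
    private final ExecutorService blockingPool = Executors.newFixedThreadPool(4);

    // The caller's thread never blocks; it gets a future and moves on.
    public <T> CompletableFuture<T> submitBlocking(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, blockingPool);
    }

    public void shutdown() {
        blockingPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SyncToAsyncBridge bridge = new SyncToAsyncBridge();
        // Chain the continuation instead of blocking a handler on the result.
        CompletableFuture<String> f = bridge.submitBlocking(() -> "block-data")
                .thenApply(String::toUpperCase);
        System.out.println(f.get()); // BLOCK-DATA
        bridge.shutdown();
    }
}
```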

No matter how we change the current blocking operations (use a thread pool or 
make them async), we should split each blocking point into before/after stages. 
We will have several stages, and when we end a stage because we hit a blocking 
point, we can schedule the remaining work and finish the current stage task, 
which lets other work execute on the current thread. And sometimes we can avoid 
switching the task on the current thread if we can get the result immediately. 
For example, when we need to acquire a lock in our read path, we can tryLock 
first; if we get the lock we can continue with the next logic, otherwise the 
logic has to wait and this thread can do other things until the task has 
acquired the lock or timed out.
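The tryLock fast path above can be sketched like this (illustrative only, not real HBase code; a real implementation would also enforce the rpc timeout on the slow path):

```java
import java.util.concurrent.*;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the tryLock fast path: take the lock inline when it is free, so
// no task switch is needed; otherwise re-schedule the work and keep the
// current thread available for other requests. Hypothetical names throughout.
public class TryLockFastPath {
    static void runLocked(ReentrantLock lock, Runnable work, ExecutorService scheduler) {
        if (lock.tryLock()) {
            // Fast path: got the lock immediately, continue on this thread.
            try {
                work.run();
            } finally {
                lock.unlock();
            }
        } else {
            // Slow path: park the work on the scheduler; a real version would
            // give up after the rpc timeout instead of retrying forever.
            scheduler.execute(() -> runLocked(lock, work, scheduler));
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService scheduler = Executors.newFixedThreadPool(2);
        ReentrantLock rowLock = new ReentrantLock();
        CountDownLatch done = new CountDownLatch(1);
        // Lock is free here, so the work runs inline with no task switch.
        runLocked(rowLock, done::countDown, scheduler);
        done.await();
        scheduler.shutdown();
    }
}
```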

And if we make our worker threads (handlers) async and no worker thread blocks 
(there may still be other thread pools), we can schedule all requests for the 
same region onto the same thread. If we can do this, our MemStore can be 
non-thread-safe, and the mvcc/rowlock/idlock logic may be much simpler. Of 
course, this is not easy because different regions have different qps. C* can 
do this easily after TPC because they have no guarantees across rows and no 
mvcc, so they can shard by partition key directly if they want.

A fully asynchronous Region is not easy work, but at some important points we 
may be able to do something first. Maybe executing some synchronous work in 
another thread (pool) is easier?

> Staged Event-Driven Architecture
> --------------------------------
>
>                 Key: HBASE-16583
>                 URL: https://issues.apache.org/jira/browse/HBASE-16583
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into 
> several stages; each stage is executed in a thread pool, and stages are 
> connected by queues.
> Currently, in the region server we use a thread pool to handle requests from 
> clients. The number of handlers is configurable, and reading and writing use 
> different pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For example, 
> accessing the MemStore and the cache mainly consumes CPU, but accessing HDFS 
> mainly consumes network/disk IO. If we use SEDA and split them into two 
> different stages, we can size the two pools differently according to the 
> CPU/disk/network performance, case by case.
> Availability:
> HBASE-16388 describes a scenario where, if the client uses a thread pool and 
> blocking methods to access region servers, a single slow server may exhaust 
> most of the client's threads. For HBase, we are the client and the HDFS 
> datanodes are the servers. A slow datanode may exhaust most of our handlers. 
> The best way to resolve this issue is to make HDFS requests non-blocking, 
> which is exactly what SEDA does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)