[ https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475617 ]

Duo Zhang commented on HBASE-16583:
-----------------------------------

In fact, the current architecture of the RS is already SEDA. We have several 
threads reading or writing sockets, a thread pool for read requests (read 
handlers), a thread pool for write requests (write handlers), and a thread for 
writing the WAL (or multiple threads if you use MultiWAL).
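
To make the pattern concrete: a SEDA stage is just a bounded event queue 
drained by its own thread pool, handing results to the next stage's queue. A 
minimal sketch in Java (names hypothetical, not HBase code):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// One SEDA stage: events arrive in a bounded queue and are drained by a
// dedicated pool; results are handed to the next stage through its queue.
final class Stage<I, O> {
  private final BlockingQueue<I> inbox = new LinkedBlockingQueue<>(10_000);
  private final ExecutorService pool;

  Stage(int threads, Function<I, O> work, Stage<O, ?> next) {
    pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          while (!Thread.currentThread().isInterrupted()) {
            I event = inbox.take();        // block until an event arrives
            O result = work.apply(event);  // stage-local processing
            if (next != null) {
              next.submit(result);         // hand off to the next stage
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
  }

  void submit(I event) throws InterruptedException {
    inbox.put(event);  // back-pressure: blocks when the stage is saturated
  }
}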

I think the thread pool introduced at the WAL layer is reasonable, as we will 
be doing network IO there, which is considered time consuming. But the rpc 
handler thread pool is just a simple pattern copied from other rpc frameworks. 
For a general rpc framework, the thread pool is needed because we do not know 
what the users will do in a rpc call, so the safe way is to give them a 
separate thread. But for us, we do know what will happen in the rpc handler; 
the code is written by us. For example, with AsyncFSWAL, ideally we could 
execute a write request directly in the RpcServer's thread. When we reach the 
WAL layer, we schedule a sync request and just return. The AsyncFSWAL will 
trigger a callback when the sync is done, and there we finish the remaining 
work and write the response back (or let the RpcServer's thread do that work). 
And if we make the RpcServer also run in the netty EventLoopGroup, we get a 
TPC (thread-per-core) architecture, right?
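
To illustrate that ideal flow, a minimal sketch where CompletableFuture 
stands in for the future returned by an asynchronous WAL sync. All the names 
here (AsyncWal, RpcContext, handleWrite, ...) are hypothetical illustrations, 
not actual HBase APIs:

import java.util.concurrent.CompletableFuture;

interface AsyncWal { CompletableFuture<Void> sync(long txid); }
interface RpcContext { void sendResponse(); void sendError(Throwable t); }

final class WriteHandler {
  // Runs directly in the RpcServer's thread: do the cheap CPU work inline,
  // schedule the WAL sync, and return without blocking.
  static void handleWrite(long txid, AsyncWal wal, RpcContext ctx) {
    // ... append the edit to the memstore here (cheap, CPU-only) ...
    wal.sync(txid).whenComplete((v, err) -> {
      // Invoked later, on the WAL's callback thread, once the sync is done.
      if (err != null) {
        ctx.sendError(err);
      } else {
        // ... finish the remaining bookkeeping (e.g. advance mvcc) ...
        ctx.sendResponse();  // write the response back to the client
      }
    });
    // No blocking above: this thread is immediately free for the next call.
  }
}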

But sadly, it is only an ideal... We have increment, which acquires a write 
row lock and holds it for a really long time, and we also wait for mvcc 
completion before returning. So the first step here is to find the places 
where a thread could be blocked for a long time; we can split our workflow 
into several stages at these points. And in fact, a thread switch is not 
always needed when we cross these points. For example, if you can acquire the 
row lock directly, just keep running on the same thread. At a high level, this 
means we could share a thread pool between different stages. And if we could 
finally share one thread pool across all stages, that is TPC. Of course, there 
is still lots of work to be done, such as making the requests from one 
connection always run in the same thread (good for locking and cpu caching), 
using lock-free data structures as much as possible... (And how do we 
implement priority?)
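
A minimal sketch of the "no thread switch when possible" idea, plus the 
connection-pinning trick, again with hypothetical names:

import java.util.concurrent.Executor;
import java.util.concurrent.locks.ReentrantLock;

final class RowWriteStage {
  // If the row lock is free, keep going on the current thread; only hand
  // off to the stage's pool when we would otherwise block.
  static void process(ReentrantLock rowLock, Runnable writePath, Executor stagePool) {
    if (rowLock.tryLock()) {
      try {
        writePath.run();           // same thread, no queueing or switch cost
      } finally {
        rowLock.unlock();
      }
    } else {
      stagePool.execute(() -> {    // would block: queue it for the stage pool
        rowLock.lock();            // a pool thread may wait here; handlers never do
        try {
          writePath.run();
        } finally {
          rowLock.unlock();
        }
      });
    }
  }

  // Pinning all requests from one connection to one thread (good for locking
  // and cpu caching) can be as simple as hashing over single-thread executors.
  static Executor forConnection(Executor[] loops, int connectionId) {
    return loops[Math.floorMod(connectionId, loops.length)];
  }
}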

Thanks.


> Staged Event-Driven Architecture
> --------------------------------
>
>                 Key: HBASE-16583
>                 URL: https://issues.apache.org/jira/browse/HBASE-16583
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into 
> several stages; each stage is executed in its own thread pool, and the 
> stages are connected by queues.
> Currently, in the region server we use thread pools to handle requests from 
> clients. The number of handlers is configurable, and reading and writing use 
> different pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For 
> example, accessing the MemStore and cache mainly consumes CPU, while 
> accessing HDFS mainly consumes network/disk IO. If we use SEDA and split 
> them into two different stages, we can size the two pools independently 
> according to the CPU/disk/network situation, case by case (see the sketch 
> after this quoted description).
> Availability:
> HBASE-16388 described a scenario where, if the client uses a thread pool and 
> blocking methods to access region servers, a single slow server may exhaust 
> most of the client's threads. For HBase, we are the client and the HDFS 
> datanodes are the servers: a slow datanode may exhaust most of the handlers. 
> The best way to resolve this issue is to make HDFS requests non-blocking, 
> which is exactly what SEDA does.
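
As a minimal sketch of the two-pool split described under Performance above 
(all numbers illustrative, not tuned recommendations):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StagePools {
  static final int CORES = Runtime.getRuntime().availableProcessors();

  // CPU-bound stage (MemStore/cache): more threads than cores only adds
  // context switching, so size it to the core count.
  static final ExecutorService CPU_STAGE = Executors.newFixedThreadPool(CORES);

  // IO-bound stage (HDFS): threads spend most of their time waiting on
  // network/disk, so oversubscribe relative to the core count.
  static final ExecutorService IO_STAGE = Executors.newFixedThreadPool(CORES * 8);
}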



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
