[ https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475617#comment-15475617 ]
Duo Zhang commented on HBASE-16583:
-----------------------------------

In fact, the current architecture of the RS is already SEDA. We have several threads reading or writing sockets, a thread pool for read requests (read handlers), a thread pool for write requests (write handlers), and a thread for writing the WAL (or multiple threads if you use MultiWAL).

I think the thread pool introduced at the WAL layer is reasonable, as we will be doing network IO there, which is considered time consuming. But the rpc handler thread pool is just a simple pattern copied from other rpc frameworks. For a general rpc framework the thread pool is needed because we do not know what users will do in a rpc call, so the safe way is to give them a separate thread. But for us, we do know what will happen in the rpc handler; the code is written by us.

For example, with AsyncFSWAL, ideally we could execute a write request directly in the RpcServer's thread. When we reach the WAL layer, we schedule a sync request and just return. The AsyncFSWAL will trigger a callback when the sync is done, finish the remaining work, and write the response back (or let the RpcServer's thread do that work). And if we also make the RpcServer run in the netty EventLoopGroup, we get a TPC architecture, right?

But sadly, it is only an ideal... We have increment, which acquires a write row lock and holds it for a really long time, and we also wait for mvcc completion before returning. So the first intention here is to find the places where a thread could be blocked for a long time. We can split our workflow into several stages at these points. And in fact, a thread switch is not always needed when we cross these points: for example, if you can acquire the row lock directly, just run the request directly. At a high level, this means we could share a thread pool between different stages. And if we can eventually share one thread pool across all stages, that is TPC.
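To make the "no thread switch when the lock is free" point concrete, here is a minimal, self-contained sketch (not HBase code — the names `InlineOrQueue`, `submit`, and `deferred` are illustrative, and a `Semaphore` stands in for the per-row write lock). If the lock can be taken without blocking, the operation runs inline in the calling (RPC) thread; only when the lock is busy is the operation parked for later, avoiding an unconditional hand-off to another stage's pool:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

public class InlineOrQueue {
    // One permit stands in for the per-row write lock. Semaphore is used
    // instead of ReentrantLock so the single-threaded demo in main() can
    // observe a "busy" lock (ReentrantLock.tryLock would re-enter).
    static final Semaphore rowLock = new Semaphore(1);
    static final Queue<Runnable> deferred = new ArrayDeque<>();

    // Run op directly in the calling (RPC) thread if the row lock is free;
    // otherwise defer it instead of blocking the caller.
    static String submit(Runnable op) {
        if (rowLock.tryAcquire()) {          // lock free: no thread switch
            try {
                op.run();
                return "inline";
            } finally {
                rowLock.release();
            }
        }
        deferred.add(op);                    // lock held: park for later
        return "deferred";
    }

    public static void main(String[] args) {
        System.out.println(submit(() -> {}));  // lock free, runs inline
        rowLock.acquireUninterruptibly();      // simulate a long-held lock (e.g. increment)
        System.out.println(submit(() -> {}));  // lock busy, gets deferred
        rowLock.release();
        System.out.println("deferred: " + deferred.size());
    }
}
```

A real implementation would also need a hand-off so the lock holder drains the deferred queue on release, but the decision point above is the core of sharing one pool across stages.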
Of course, there is still lots of work to be done, such as making the requests from one connection always run in the same thread (good for locking and CPU caching), and using lock-free data structures as much as possible... (And how to implement priority?) Thanks.

> Staged Event-Driven Architecture
> --------------------------------
>
> Key: HBASE-16583
> URL: https://issues.apache.org/jira/browse/HBASE-16583
> Project: HBase
> Issue Type: Umbrella
> Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into several stages; each stage is executed in a thread pool, and the stages are connected by queues.
>
> Currently, in the region server we use a thread pool to handle requests from clients. The number of handlers is configurable, and reading and writing use different pools. The current architecture has two limitations:
>
> Performance:
> Different parts of the handling path have different bottlenecks. For example, accessing the MemStore and cache mainly consumes CPU, but accessing HDFS mainly consumes network/disk IO. If we use SEDA and split them into two different stages, we can size the two pools differently according to CPU/disk/network performance, case by case.
>
> Availability:
> HBASE-16388 describes a scenario where, if the client uses a thread pool and blocking methods to access region servers, a single slow server may exhaust most of the client's threads. For HBase, we are the client and the HDFS datanodes are the servers. A slow datanode may exhaust most of the handlers. The best way to resolve this issue is to make the HDFS requests non-blocking, which is exactly what SEDA does.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
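The connection-affinity idea mentioned in the comment above (requests from one connection always running in the same thread) can be sketched as hashing the connection id to a fixed single-threaded executor, the way an event-loop group pins a channel to one loop. This is only an illustration — `ConnectionAffinity`, `loopFor`, and `dispatch` are hypothetical names, not HBase or Netty APIs:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConnectionAffinity {
    private final ExecutorService[] loops;

    ConnectionAffinity(int nThreads) {
        loops = new ExecutorService[nThreads];
        for (int i = 0; i < nThreads; i++) {
            // Each "loop" is a single thread, so everything routed to it
            // runs serially on that one thread.
            loops[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Same connectionId always maps to the same index, hence the same
    // thread: good for lock locality and CPU cache friendliness.
    int loopFor(long connectionId) {
        return (Long.hashCode(connectionId) & 0x7fffffff) % loops.length;
    }

    void dispatch(long connectionId, Runnable request) {
        loops[loopFor(connectionId)].execute(request);
    }

    void shutdown() {
        for (ExecutorService e : loops) {
            e.shutdown();
        }
    }

    public static void main(String[] args) {
        ConnectionAffinity ca = new ConnectionAffinity(4);
        // Two requests from connection 42 land on the same loop.
        System.out.println(ca.loopFor(42) == ca.loopFor(42));  // true
        ca.shutdown();
    }
}
```

Because each loop is a single thread, requests from one connection are implicitly serialized, which also sidesteps some of the locking the comment worries about.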