[
https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475617#comment-15475617
]
Duo Zhang commented on HBASE-16583:
-----------------------------------
In fact, the current architecture of the RS is already SEDA. We have several
threads reading from or writing to sockets, a thread pool for read requests
(read handlers), a thread pool for write requests (write handlers), and a
thread for writing the WAL (or multiple threads if you use MultiWAL).
I think the thread pool introduced at the WAL layer is reasonable, as we will
be doing network IO there, which is considered time consuming. But the rpc
handler thread pool is just a simple pattern copied from other rpc frameworks.
For a general rpc framework, the thread pool is needed because we do not know
what users will do in an rpc call, so the safe way is to give them a separate
thread. But for us, we do know what will happen in the rpc handler; the code is
written by us. For example, with AsyncFSWAL, ideally we could execute a write
request directly in the RpcServer's thread. When we reach the WAL layer, we
schedule a sync request and just return. The AsyncFSWAL will trigger a callback
when the sync is done, finish the remaining work, and write the response
back (or let the RpcServer's thread do the work). And if we make the
RpcServer also run in the netty EventLoopGroup, we get a TPC (thread per core)
architecture, right?
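To make that concrete, here is a minimal sketch of the callback style,
assuming a hypothetical Wal.sync() that returns a CompletableFuture; none of
these types are the real HBase or AsyncFSWAL classes:

import java.util.concurrent.CompletableFuture;

// Illustrative stand-ins, not HBase APIs.
interface Wal {
  CompletableFuture<Void> sync(byte[] edit);
}

interface RpcContext {
  void sendResponse(byte[] response);
  void sendError(Throwable error);
}

final class WriteHandler {
  private final Wal wal;

  WriteHandler(Wal wal) {
    this.wal = wal;
  }

  // Called directly on the RpcServer's thread: apply the edit, schedule the
  // WAL sync, and return without blocking. The response is written from the
  // sync callback once the edit is durable.
  void handleWrite(RpcContext ctx, byte[] edit) {
    applyToMemStore(edit); // CPU-bound work, done inline
    wal.sync(edit).whenComplete((ignored, err) -> {
      if (err != null) {
        ctx.sendError(err);
      } else {
        ctx.sendResponse(edit); // or hand back to the RpcServer's thread
      }
    });
  }

  private void applyToMemStore(byte[] edit) {
    // elided: in-memory update
  }
}

The point is that the RpcServer thread never blocks waiting for the sync, so
no dedicated handler thread is needed for the write path.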
But sadly, it is only an ideal... We have increment, which will acquire a write
row lock and hold it for a really long time, and we also wait for mvcc
completion before returning. So the first intention here is to find the places
where a thread could be blocked for a long time. We could split our workflow
into several stages at these points. And in fact, a thread switch is not always
needed when we cross these points. For example, if you can acquire the row lock
directly, just run the next stage directly (see the sketch below). At a high
level, this means we could share a thread pool between different stages. And
if we could finally share one thread pool across all stages, this is TPC. Of
course, there is still lots of work to be done, such as making the requests
from one connection always run in the same thread (good for locking and cpu
caching), using lock-free data structures as much as possible... (And how do
we implement priority?)
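A sketch of crossing a stage boundary without a mandatory thread switch; the
names are illustrative, not HBase internals, and each stage is assumed to own
its own ExecutorService:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.locks.ReentrantLock;

final class StageBoundary {
  private final ExecutorService stagePool;

  StageBoundary(ExecutorService stagePool) {
    this.stagePool = stagePool;
  }

  // If the row lock is free, keep going on the current thread; only pay for
  // a handoff (queue + context switch) when we would otherwise block.
  void runLocked(ReentrantLock rowLock, Runnable stage) {
    if (rowLock.tryLock()) {
      try {
        stage.run(); // fast path: no thread switch
      } finally {
        rowLock.unlock();
      }
    } else {
      stagePool.execute(() -> { // slow path: hand off instead of blocking
        rowLock.lock();
        try {
          stage.run();
        } finally {
          rowLock.unlock();
        }
      });
    }
  }
}

Pinning all requests from one connection to one thread could then work the way
netty binds a channel to a single EventLoop: pick one executor per connection
up front and always submit to it, so ordering and cache locality come for free.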
Thanks.
> Staged Event-Driven Architecture
> --------------------------------
>
> Key: HBASE-16583
> URL: https://issues.apache.org/jira/browse/HBASE-16583
> Project: HBase
> Issue Type: Umbrella
> Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into
> several stages; each stage is executed in its own thread pool, and the
> stages are connected by queues (see the sketch at the end of this
> description).
> Currently, in the region server we use a thread pool to handle requests from
> clients. The number of handlers is configurable, and reading and writing use
> different pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For
> example, accessing the MemStore and cache mainly consumes CPU, while
> accessing HDFS mainly consumes network/disk IO. If we use SEDA and split
> them into two different stages, we can size the two pools differently
> according to the CPU/disk/network characteristics, case by case.
> Availability:
> HBASE-16388 described a scenario where, if a client uses a thread pool and
> blocking methods to access region servers, a single slow server may exhaust
> most of the client's threads. For HBase, we are the client and the HDFS
> datanodes are the servers. A slow datanode may exhaust most of the handlers.
> The best way to resolve this issue is to make HDFS requests non-blocking,
> which is exactly what SEDA does.
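> A minimal sketch of the stage/queue idea, with illustrative names only (not
> HBase code): each stage owns a queue drained by its own pool, and the two
> pools can be sized independently.
>
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.LinkedBlockingQueue;
> import java.util.function.Consumer;
>
> // One SEDA stage: an event queue drained by a fixed-size thread pool.
> final class Stage<T> {
>   private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
>
>   Stage(int threads, Consumer<T> handler) {
>     ExecutorService pool = Executors.newFixedThreadPool(threads);
>     for (int i = 0; i < threads; i++) {
>       pool.execute(() -> {
>         try {
>           while (true) {
>             handler.accept(queue.take()); // block until an event arrives
>           }
>         } catch (InterruptedException e) {
>           Thread.currentThread().interrupt(); // worker shutdown
>         }
>       });
>     }
>   }
>
>   void submit(T event) {
>     queue.add(event);
>   }
> }
>
> final class Demo {
>   public static void main(String[] args) {
>     // The IO-bound stage gets a big pool, the CPU-bound stage a small one.
>     Stage<String> hdfsStage = new Stage<>(32, req -> { /* read HDFS */ });
>     Stage<String> cpuStage = new Stage<>(4, hdfsStage::submit);
>     cpuStage.submit("get:row1");
>   }
> }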
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)