[
https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479622#comment-15479622
]
Phil Yang commented on HBASE-16583:
-----------------------------------
As I said, what we need is not the "traditional" SEDA. Maybe the title should
be changed :)
I think our real goal is to reduce the time our worker threads spend blocked
(ideally, not blocked at all), and eventually the same for all threads. When no
thread ever blocks, we have TPC. Currently we have to configure a large number
of handlers to compensate for the blocked ones, which results in more
thread-scheduling overhead.
HBASE-16492 does a simpler piece of work: it sets timeouts on blocking points
according to the RPC timeout from clients, so handlers will not stay blocked
too long and waste time. It is easy and not a long-term effort, so I think we
can do it before 1.4 is released.
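The timeout idea can be sketched as follows: instead of blocking indefinitely
at a contention point, a handler gives up once the client's RPC deadline would
have passed anyway. This is only an illustrative sketch, not HBASE-16492's
actual patch; the class and method names are made up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: bound a blocking point by the client's remaining
// RPC timeout instead of blocking forever (names are illustrative).
public class BoundedBlocking {
    private final ReentrantLock rowLock = new ReentrantLock();

    public boolean doWork(long remainingRpcTimeoutMs) throws InterruptedException {
        // Give up once the client would have timed out anyway,
        // freeing this handler for other requests.
        if (!rowLock.tryLock(remainingRpcTimeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // caller translates this into a timeout response
        }
        try {
            // ... critical section (e.g. the guarded write) ...
            return true;
        } finally {
            rowLock.unlock();
        }
    }
}
```

The handler still blocks, but only for a bounded time, so a slow resource can
no longer pin a handler past the point where the response is useful.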
We have several blocking points in the read/write path. Some blocking
operations can easily be changed to asynchronous, but others remain
synchronous. We now have AsyncWAL, so we have an asynchronous way to write to
HDFS, but our HDFS reader and the API to the NameNode are still synchronous.
We can use an EventLoop for all asynchronous logic, but we have to use a
thread pool for the synchronous work, even though the API exposed to the
read/write path can be non-blocking, like FSHLog's. These thread pools are why
I titled this issue SEDA. So before we go to TPC completely, we may use a
mixed architecture in which some places look like SEDA and some look like TPC.
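The mixed style above can be sketched like this: a synchronous call (standing
in for an HDFS read) is confined to a dedicated pool, while callers see a
non-blocking future, the way FSHLog hides its internals. All names here are
illustrative assumptions, not actual HBase or HDFS APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: blocking work runs on its own pool (the "SEDA" part), while the
// read/write path only ever sees a non-blocking future (the "TPC" part).
public class AsyncFacade {
    private final ExecutorService blockingPool = Executors.newFixedThreadPool(4);

    // Non-blocking API exposed to the read/write path.
    public CompletableFuture<byte[]> read(String path) {
        return CompletableFuture.supplyAsync(() -> blockingRead(path), blockingPool);
    }

    // Stand-in for a synchronous HDFS read that we cannot (yet) make async.
    private byte[] blockingRead(String path) {
        return new byte[] { 1, 2, 3 };
    }

    public void shutdown() {
        blockingPool.shutdown();
    }
}
```

Only the pool's threads ever block; the event-loop threads that call
`read(...)` stay free to serve other requests.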
No matter how we change the current blocking operations (use a thread pool or
make them async), we should split each blocking point into two stages, before
and after. We will then have several stages, and when we reach a blocking
point we can schedule the remaining work and end the current stage task, which
lets other work execute on the current thread. Sometimes we can even avoid the
task switch entirely if we can get the result immediately. For example, when
we need to acquire a lock in the read path, we can tryLock first; if we get
the lock we continue with the next step, otherwise the logic must wait and
this thread can do other things until the task has acquired the lock or timed
out.
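The tryLock-first idea can be sketched as a fast path and a slow path: on the
fast path the work continues inline on the current thread; only under
contention is the remainder split off as a later stage. This is a minimal
sketch under assumed names, not HBase code.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of "avoid the task switch when the result is immediate":
// tryLock first and stay on this thread; only when contended do we split
// the work into a before/after stage and hand the rest off.
public class TryLockFirst {
    private final ReentrantLock lock = new ReentrantLock();

    public void runLocked(Runnable criticalSection, Executor stageExecutor) {
        if (lock.tryLock()) {
            // Fast path: no contention, no task switch needed.
            try {
                criticalSection.run();
            } finally {
                lock.unlock();
            }
        } else {
            // Slow path: defer to a stage executor where blocking is
            // acceptable, freeing the current thread for other work.
            stageExecutor.execute(() -> {
                lock.lock();
                try {
                    criticalSection.run();
                } finally {
                    lock.unlock();
                }
            });
        }
    }
}
```

In the common uncontended case nothing is queued at all, which is exactly the
saving over an unconditional stage switch.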
And if we make our worker threads (handlers) async so that no worker thread
blocks (there may still be other thread pools), we can schedule all requests
for the same region onto the same thread. If we can do this, our MemStore can
be non-thread-safe, and the logic for mvcc/rowlock/idlock may become much
simpler. Of course, this is not easy, because different regions have different
QPS. C* can do this easily after TPC because they have no guarantees across
rows and no mvcc, so they can shard by partition key directly if they want.
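Per-region thread affinity can be sketched by hashing each region onto a fixed
single-threaded lane, so per-region state (e.g. a MemStore) is only ever
touched by one thread and needs no locking. The names are illustrative, not
HBase code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: all requests for one region land on the same single-threaded
// lane, so per-region structures need no synchronization.
public class RegionSharding {
    private final ExecutorService[] lanes;

    public RegionSharding(int nLanes) {
        lanes = new ExecutorService[nLanes];
        for (int i = 0; i < nLanes; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    public void submit(String regionName, Runnable request) {
        // Same region name -> same lane -> same thread, always.
        int lane = Math.floorMod(regionName.hashCode(), lanes.length);
        lanes[lane].execute(request);
    }
}
```

The caveat from the text applies directly: because regions have very uneven
QPS, static hashing like this can leave lanes unbalanced, which is the hard
part that C*'s partition-key sharding does not have to solve.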
A fully asynchronous Region is not easy work, but at some important points we
may be able to do something first. Maybe executing some synchronous work in
another thread (pool) is easier?
> Staged Event-Driven Architecture
> --------------------------------
>
> Key: HBASE-16583
> URL: https://issues.apache.org/jira/browse/HBASE-16583
> Project: HBase
> Issue Type: Umbrella
> Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into
> several stages; each stage is executed in a thread pool, and the stages are
> connected by queues.
> Currently, the region server uses a thread pool to handle requests from
> clients. The number of handlers is configurable, and reading and writing use
> different pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For
> example, accessing the MemStore and cache mainly consumes CPU, while
> accessing HDFS mainly consumes network/disk IO. If we use SEDA and split
> them into two different stages, we can size the two pools differently
> according to the CPU/disk/network characteristics, case by case.
> Availability:
> HBASE-16388 described a scenario in which, if the client uses a thread pool
> and blocking methods to access region servers, a single slow server may
> exhaust most of the client's threads. Here HBase is the client and the HDFS
> datanodes are the servers: a slow datanode may exhaust most of the handlers.
> The best way to resolve this is to make HDFS requests non-blocking, which is
> exactly what SEDA does.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)