[
https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475866#comment-15475866
]
Phil Yang commented on HBASE-16583:
-----------------------------------
{quote}
One of the most common and classic critique of the SEDA architecture, by the
original proponent of the idea as well as others, is the overhead of connecting
stages through event queues lowers the ceiling for performance.
{quote}
SEDA was proposed many years ago, and its original design is not a perfect
match for current hardware and kernels. What we need is to use the idea as a
reference to guide us.
I agree that queues decrease performance. We don't need many stages, and we
may indeed be able to remove some queues; as mentioned above, if the RPC
reader threads execute the call directly, performance can be improved. I
think we don't need to split the path into two stages if the bottleneck
before and after the split point is the same, for example, if both parts
mainly consume CPU. More specifically, I think the only place where we need
to discuss whether to use a queue is IO, especially network IO.
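To make the idea concrete, here is a minimal Java sketch of a path with a
queue only at the IO boundary: the reader thread runs the CPU-bound part
inline, and only the IO-bound part goes through an executor's queue. All
class and method names here are hypothetical illustrations, not HBase APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: keep CPU-bound work inline in the reader thread and place a
// queue only at the IO boundary, instead of one queue per stage.
public class IoBoundaryQueue {
    // Single pool dedicated to (simulated) network IO; its internal work
    // queue is the only hand-off point in the whole path.
    static final ExecutorService ioPool = Executors.newFixedThreadPool(4);

    // CPU-bound part of the call, executed directly by the caller
    // (the "RPC reader thread" above) with no queue in between.
    static int decodeAndValidate(int request) {
        return request * 2; // stand-in for parsing/validation work
    }

    // The IO-bound part is the only stage behind a queue.
    static Future<String> handle(int request) {
        int decoded = decodeAndValidate(request); // inline, no hand-off
        return ioPool.submit(() -> "io-result-" + decoded); // queued hand-off
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle(21).get()); // prints io-result-42
        ioPool.shutdown();
    }
}
```

The point of the sketch is that adding a second queue between decoding and
IO would buy nothing here, since both sides of that split would contend for
the same resource.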
One of the features that makes HBase different from other databases is that
we use a distributed file system, HDFS, rather than saving data locally.
Databases that save data locally don't need any RPC calls in their engine
logic (LSM-tree or B-tree); all the resources they need are CPU, memory,
and disks, and these local resources (especially disks) are not accessed by
other nodes. So in that architecture, if one of the resources
(CPU/memory/disk) reaches its bottleneck, the whole node has reached its
bottleneck, and one "stage" from start to end is enough.
However, HBase has RPC calls in the read/write path. Even if we bring
locality to 1.0 and use short-circuit reads, we still make RPC calls, and
today we make them in a blocking way (even with AsyncWAL, the handler blocks
on WAL.sync). We don't know whether the remote RPC server is healthy, and a
DN reaching its bottleneck doesn't mean our RS has reached its bottleneck.
So if we use two thread pools, one for the local logic and one for the RPC
logic, I am not sure the raw performance will necessarily improve, but I
think availability, meaning the performance (both throughput and latency)
when one DN is slower than the others, can be protected.
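The two-pool idea above could be sketched as follows. This is only an
illustration under the assumptions stated in the comments; the class and
method names are hypothetical, not actual HBase classes, and the remote call
is simulated.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: handlers run the local (CPU/memory) logic, while a separate
// pool runs the blocking RPC, so one slow DataNode can only stall the
// RPC pool and cannot exhaust the handler pool.
public class TwoPoolWrite {
    static final ExecutorService handlerPool = Executors.newFixedThreadPool(8);
    static final ExecutorService rpcPool = Executors.newFixedThreadPool(8);

    // Stand-in for the local part of a write (e.g. MemStore update).
    static String applyLocally(String mutation) {
        return "applied:" + mutation;
    }

    // Stand-in for the blocking remote call (e.g. waiting on WAL.sync).
    static String syncToDataNode(String applied) {
        return applied + ":synced";
    }

    // The handler finishes its local work and returns immediately; only
    // the RPC pool blocks on the (possibly slow) remote call.
    static CompletableFuture<String> write(String mutation) {
        return CompletableFuture
            .supplyAsync(() -> applyLocally(mutation), handlerPool)
            .thenApplyAsync(TwoPoolWrite::syncToDataNode, rpcPool);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("row1").get()); // prints applied:row1:synced
        handlerPool.shutdown();
        rpcPool.shutdown();
    }
}
```

With this split, the two pools can also be sized independently, since the
local stage is CPU-bound while the RPC stage spends its time waiting on the
network.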
> Staged Event-Driven Architecture
> --------------------------------
>
> Key: HBASE-16583
> URL: https://issues.apache.org/jira/browse/HBASE-16583
> Project: HBase
> Issue Type: Umbrella
> Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into
> several stages; each stage is executed by a thread pool, and the stages are
> connected by queues.
> Currently, the region server uses thread pools to handle requests from
> clients. The number of handlers is configurable, and reading and writing
> use different pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For
> example, accessing the MemStore and the cache mainly consumes CPU, while
> accessing HDFS mainly consumes network/disk IO. If we use SEDA and split
> them into two stages, we can size the two pools differently according to
> CPU/disk/network performance, case by case.
> Availability:
> HBASE-16388 described a scenario where, if a client uses a thread pool and
> blocking methods to access region servers, a single slow server can exhaust
> most of the client's threads. For HBase, we are the client and the HDFS
> DataNodes are the servers, so a slow DataNode may exhaust most of the
> handlers. The best way to resolve this issue is to make HDFS requests
> non-blocking, which is exactly what SEDA does.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)