[ 
https://issues.apache.org/jira/browse/HBASE-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475134#comment-15475134
 ] 

Andrew Purtell commented on HBASE-16583:
----------------------------------------

One of the most common and classic critiques of the SEDA architecture, raised 
by the original proponent of the idea as well as by others, is that the overhead 
of connecting stages through event queues lowers the performance ceiling. 
Consider http://matt-welsh.blogspot.com/2010/07/retrospective-on-seda.html .

{quote}
If I were to design SEDA today, I would decouple stages (i.e., code modules) 
from queues and thread pools (i.e., concurrency boundaries). Stages are still 
useful as a structuring primitive, but it is probably best to group multiple 
stages within a single "thread pool domain" where latency is critical. Most 
stages should be connected via direct function call. I would only put a 
separate thread pool and queue in front of a group of stages that have long 
latency or nondeterministic runtime, such as performing disk I/O. 
{quote}

Note also that he no longer considers the benchmarks that validated the original 
SEDA paper to support applying SEDA to real-world applications. 
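Welsh's revised advice can be sketched in a few lines of Java. This is a 
hypothetical illustration (the stage names and `ioPool` are invented for the 
example, not taken from HBase): latency-critical stages call each other 
directly inside one "thread pool domain", and only the long-latency, 
nondeterministic stage sits behind its own queue and pool.

```java
import java.util.concurrent.*;

public class StageDomains {
    // Fast, CPU-only stages: connected by direct function calls, no queue hop.
    static String parse(String req)  { return req.trim(); }
    static String validate(String r) { return r.toLowerCase(); }

    // Slow, nondeterministic stage (e.g. disk I/O) gets its own pool;
    // the ExecutorService's internal queue is the only concurrency boundary.
    static final ExecutorService ioPool = Executors.newFixedThreadPool(4);

    static Future<String> handle(String request) {
        // One thread pool domain: parse and validate run back to back.
        String r = validate(parse(request));
        // Queue boundary only here, in front of the long-latency work.
        return ioPool.submit(() -> r + ":persisted");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle("  GET /Row1 ").get()); // get /row1:persisted
        ioPool.shutdown();
    }
}
```

The point of the sketch is what it omits: there is no queue between `parse` 
and `validate`, so those stages pay no hand-off cost.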

I'm not saying don't go in this direction, but let's not assume the payoff will 
be automatic. 

> Staged Event-Driven Architecture
> --------------------------------
>
>                 Key: HBASE-16583
>                 URL: https://issues.apache.org/jira/browse/HBASE-16583
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Phil Yang
>
> Staged Event-Driven Architecture (SEDA) splits request-handling logic into 
> several stages; each stage executes in its own thread pool, and the stages 
> are connected by queues.
> Currently, the region server uses a thread pool to handle requests from 
> clients. The number of handlers is configurable, and reading and writing use 
> separate pools. The current architecture has two limitations:
> Performance:
> Different parts of the handling path have different bottlenecks. For example, 
> accessing the MemStore and cache mainly consumes CPU, while accessing HDFS 
> mainly consumes network/disk IO. If we use SEDA and split them into two 
> different stages, we can size the two pools independently according to 
> CPU/disk/network performance, case by case.
> Availability:
> HBASE-16388 describes a scenario where, if the client uses a thread pool and 
> blocking methods to access region servers, a single slow server can exhaust 
> most of the client's threads. For HBase, we are the client and the HDFS 
> datanodes are the servers: a slow datanode may exhaust most of the handlers. 
> The best way to resolve this is to make the HDFS requests non-blocking, which 
> is exactly what SEDA does.
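The split described above can be sketched as a minimal SEDA pipeline in Java. 
All names here (`cpuStage`, `ioStage`, the pool sizes) are illustrative 
stand-ins, not HBase code: a CPU-bound stage and an I/O-bound stage each get 
their own thread pool, connected by a queue, so the pools can be sized 
independently.

```java
import java.util.concurrent.*;

public class SedaSketch {
    // Queue between the stages: the concurrency boundary SEDA introduces.
    static final BlockingQueue<String> cpuToIo = new LinkedBlockingQueue<>();
    // Pools sized separately, e.g. few threads for CPU work,
    // many for slow HDFS I/O.
    static final ExecutorService cpuPool = Executors.newFixedThreadPool(2);
    static final ExecutorService ioPool  = Executors.newFixedThreadPool(8);

    // Stage 1: CPU-bound work (stand-in for a MemStore/cache lookup),
    // which hands its result to the next stage's queue instead of blocking.
    static void cpuStage(String key) {
        cpuToIo.add(key.toUpperCase());
    }

    // Stage 2: I/O-bound work (stand-in for an HDFS read).
    static String ioStage() throws InterruptedException {
        return "hdfs-read:" + cpuToIo.take();
    }

    public static void main(String[] args) throws Exception {
        cpuPool.submit(() -> cpuStage("row1"));
        Future<String> result = ioPool.submit(SedaSketch::ioStage);
        System.out.println(result.get()); // hdfs-read:ROW1
        cpuPool.shutdown();
        ioPool.shutdown();
    }
}
```

Because `cpuStage` only enqueues and returns, a slow `ioStage` (a slow 
datanode, in the issue's terms) backs up `cpuToIo` rather than tying up the 
CPU pool's threads.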



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
