[ 
https://issues.apache.org/jira/browse/ACCUMULO-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978550#comment-14978550
 ] 

Adam Fuchs commented on ACCUMULO-4039:
--------------------------------------

I think that if you do this right, you create a system in which a thread 
never waits for another thread to complete its work, whether local or remote. 
Any time a thread grabs some work, it should be able to make progress, and 
whenever it can't make progress, it puts the session back on a queue of some 
sort. We've done something similar with our readahead threads, but we still 
have client threads waiting on server threads, server threads waiting on client 
threads, server threads waiting on shared resources (like the write-ahead log), 
and server threads acting as client threads in the case of writes. 
QoS/scheduling is absolutely essential for performance, but I think we should 
be able to solve the deadlock problem without it.
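
To make the "put the session back on a queue" idea concrete, here is a minimal 
single-threaded sketch. Everything in it (RequeueSketch, Session, advance, run) 
is a hypothetical name made up for illustration, not existing Accumulo code, 
and each "step" stands in for whatever bounded work a session can do without 
blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class RequeueSketch {
    /** Hypothetical session: does one bounded unit of work per call and never blocks. */
    static final class Session {
        final String name;
        int stepsLeft;

        Session(String name, int stepsLeft) {
            this.name = name;
            this.stepsLeft = stepsLeft;
        }

        /** Performs one step; returns true once all of the session's work is done. */
        boolean advance() {
            stepsLeft--;
            return stepsLeft <= 0;
        }
    }

    /** Drains the queue; an unfinished session is requeued instead of holding the thread. */
    static List<String> run(Queue<Session> ready) {
        List<String> finished = new ArrayList<>();
        Session s;
        while ((s = ready.poll()) != null) {
            if (s.advance()) {
                finished.add(s.name);
            } else {
                ready.add(s); // cannot finish yet: back on the queue
            }
        }
        return finished;
    }

    /** Three sessions needing 3, 1, and 2 steps; returns names in completion order. */
    static List<String> demo() {
        Queue<Session> ready = new ArrayDeque<>();
        ready.add(new Session("scan", 3));
        ready.add(new Session("write", 1));
        ready.add(new Session("metadata", 2));
        return run(ready);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [write, metadata, scan]
    }
}
```

The worker never waits on any one session, so short sessions finish ahead of 
long ones. In a real tserver the requeue would presumably be triggered by I/O 
or write-ahead log readiness rather than by a step counter.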

One thing I'd like to explore, given this approach, is how many threads are 
needed to saturate the network with N clients and N servers, NxN total 
connections, and a given message size. Does anybody know of such a study? 
This feels a bit like the [C10K 
problem|https://en.wikipedia.org/wiki/C10k_problem], but with lots of servers.

> try out a proactor design pattern for tserver services
> ------------------------------------------------------
>
>                 Key: ACCUMULO-4039
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4039
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>            Priority: Minor
>
> For large instances (i.e. lots of clients for a given tserver) we create 
> oodles of threads on the tserver. This makes for difficulty in predicting 
> performance, memory usage, etc. Moreover, we have operations that recurse, 
> like a server querying itself, that we currently solve by having separate 
> thread pools for regular table operations and metadata table operations, and 
> we "disallow" things like an iterator writing to another table. One 
> alternative option would be to switch to a Proactor pattern: 
> https://en.wikipedia.org/wiki/Proactor_pattern
> The core of this would be to switch to using a selection set rather than a 
> thread per active connection, and then wrap everything in sessions that make 
> progress in something like a state model, with states that account for 
> asynchronous communications and remote work.
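
As a rough illustration of the "selection set rather than a thread per active 
connection" idea in the description above, the sketch below uses 
java.nio.channels.Selector with in-process Pipe channels standing in for 
client connections. The class, Session, and State names are hypothetical, and 
a readiness-based Selector loop is strictly closer to a reactor than to a true 
completion-based proactor, so treat this as a starting point under those 
assumptions rather than the proposed design.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class SelectionSetSketch {
    /** Hypothetical session states; AWAITING_INPUT models an asynchronous wait. */
    enum State { AWAITING_INPUT, DONE }

    /** Hypothetical per-connection session, attached to its selection key. */
    static final class Session {
        final String name;
        State state = State.AWAITING_INPUT;
        final StringBuilder received = new StringBuilder();

        Session(String name) { this.name = name; }
    }

    /** One thread services every registered channel until `expected` sessions finish. */
    static List<String> serve(Selector selector, int expected) throws IOException {
        List<String> results = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.allocate(256);
        while (results.size() < expected) {
            selector.select(); // waits only while no channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                Session s = (Session) key.attachment();
                Pipe.SourceChannel ch = (Pipe.SourceChannel) key.channel();
                buf.clear();
                int n = ch.read(buf);
                if (n > 0) { // partial input: record it and stay in AWAITING_INPUT
                    buf.flip();
                    s.received.append(StandardCharsets.UTF_8.decode(buf));
                } else if (n < 0) { // writer closed: the session is complete
                    s.state = State.DONE;
                    key.cancel();
                    ch.close();
                    results.add(s.name + "=" + s.received);
                }
            }
        }
        return results;
    }

    /** Registers three pipe-backed "connections" and serves them on the calling thread. */
    static List<String> demo() {
        try (Selector selector = Selector.open()) {
            String[] names = {"a", "b", "c"};
            for (String name : names) {
                Pipe pipe = Pipe.open();
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ, new Session(name));
                byte[] payload = ("hello-" + name).getBytes(StandardCharsets.UTF_8);
                pipe.sink().write(ByteBuffer.wrap(payload));
                pipe.sink().close();
            }
            List<String> out = serve(selector, names.length);
            Collections.sort(out); // readiness order is not guaranteed
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A fuller state model would add states for outstanding remote calls and pending 
writes, with each transition driven by a readiness or completion event instead 
of a blocked thread.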



