[ 
https://issues.apache.org/jira/browse/CASSANDRA-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314705#comment-15314705
 ] 

Tyler Hobbs commented on CASSANDRA-10993:
-----------------------------------------

I've pushed a 
[branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-10993-WIP] with our 
current progress on the PoC.  Unfortunately, neither of us has had as much 
time to work on this as we'd like, so some parts are still quite rough.

I'll point out a few classes in particular that demonstrate the approach:
* 
[EventLoop|https://github.com/thobbs/cassandra/blob/CASSANDRA-10993-WIP/src/java/org/apache/cassandra/poc/EventLoop.java]
 - our internal event loop for scheduling and running tasks.  Right now there 
is one EventLoop per netty worker process.  We're still playing with the netty 
integration, but at the moment we simply run {{EventLoop.cycle()}} once every 
time the netty event loop runs.
* 
[WriteTask|https://github.com/thobbs/cassandra/blob/CASSANDRA-10993-WIP/src/java/org/apache/cassandra/poc/WriteTask.java]
 - this contains almost all of the logic for the standard write path.
* 
[PaxosWriteTask|https://github.com/thobbs/cassandra/blob/CASSANDRA-10993-WIP/src/java/org/apache/cassandra/poc/PaxosWriteTask.java]
 - similar to WriteTask, but for paxos operations.
* 
[CommitLog|https://github.com/thobbs/cassandra/blob/6729cb63a8d172744832ddec49e56a71ea15e3dd/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L294-L359]
 - the highlighted methods are an example of the kinds of changes we need to 
make in select places to add async operations.  This also demonstrates how we 
handle cases where we _may_ need to defer to the event loop to avoid blocking 
I/O, but don't always need to.  In particular, we can almost always allocate 
space on the current commitlog segment, so it's nice to be able to avoid 
deferring to the event loop and dealing with emitted events in the common case.
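
To make the "may defer, but usually doesn't" pattern from the CommitLog item concrete, here is a minimal Java sketch. It is not the code from the branch: every name in it ({{SketchEventLoop}}, {{SketchSegment}}, {{SketchCommitLog}}, {{DEFERRED}}) is hypothetical, and the segment handling is a stand-in that only tracks byte offsets. It assumes a single-threaded loop, so nothing is synchronized.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.LongConsumer;

/** Hypothetical single-threaded event loop: a queue of tasks drained by cycle(). */
final class SketchEventLoop {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    void schedule(Runnable task) {
        pending.add(task);
    }

    /** Runs the tasks that were queued when the cycle began; returns how many ran. */
    int cycle() {
        int n = pending.size();
        for (int i = 0; i < n; i++) {
            pending.poll().run();
        }
        return n;
    }
}

/** Hypothetical stand-in for a commitlog segment; it only tracks used bytes. */
final class SketchSegment {
    private final long capacity;
    private long used;

    SketchSegment(long capacity) {
        this.capacity = capacity;
    }

    /** Returns the segment-relative offset of the allocation, or -1 if it does not fit. */
    long tryAllocate(int size) {
        if (used + size > capacity) {
            return -1;
        }
        long pos = used;
        used += size;
        return pos;
    }
}

/** Hypothetical allocator showing a synchronous fast path and a deferred slow path. */
final class SketchCommitLog {
    static final long DEFERRED = -1;

    private final SketchEventLoop loop;
    private final long segmentCapacity;
    private SketchSegment active;

    SketchCommitLog(SketchEventLoop loop, long segmentCapacity) {
        this.loop = loop;
        this.segmentCapacity = segmentCapacity;
        this.active = new SketchSegment(segmentCapacity);
    }

    /**
     * Fast path: returns the offset immediately and never invokes the callback.
     * Slow path: returns DEFERRED and invokes the callback from a later cycle().
     * A request larger than a whole segment makes the callback receive -1.
     */
    long allocate(int size, LongConsumer onAllocated) {
        long pos = active.tryAllocate(size);
        if (pos >= 0) {
            return pos; // common case: no event emitted, no deferral
        }
        loop.schedule(() -> {
            // Retry first: an earlier deferred task may already have switched segments.
            long retry = active.tryAllocate(size);
            if (retry < 0) {
                active = new SketchSegment(segmentCapacity); // stand-in for a segment switch
                retry = active.tryAllocate(size);
            }
            onAllocated.accept(retry);
        });
        return DEFERRED;
    }
}
```

A caller checks the return value: anything other than {{DEFERRED}} can be used right away, otherwise the work continues in the callback once the loop next cycles. The trade-off is that callers handle two completion paths, in exchange for skipping the scheduling overhead in the common case.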

This branch does compile and run (although not everything is supported yet).  
It currently uses "dummy" Tasks to handle cases where we haven't created 
specific async code.  Some local stress write benchmarks show throughput to be 
between 90 and 105% of trunk throughput, so this approach is roughly on par 
with trunk without any real optimization work having been done.

> Make read and write requests paths fully non-blocking, eliminate related 
> stages
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10993
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Coordination, Local Write-Read Paths
>            Reporter: Aleksey Yeschenko
>            Assignee: Tyler Hobbs
>             Fix For: 3.x
>
>
> Building on work done by [~tjake] (CASSANDRA-10528), [~slebresne] 
> (CASSANDRA-5239), and others, convert read and write request paths to be 
> fully non-blocking, to enable the eventual transition from SEDA to TPC 
> (CASSANDRA-10989)
> Eliminate {{MUTATION}}, {{COUNTER_MUTATION}}, {{VIEW_MUTATION}}, {{READ}}, 
> and {{READ_REPAIR}} stages, move read and write execution directly to Netty 
> context.
> For lack of decent async I/O options on Linux, we’ll still have to retain an 
> extra thread pool for serving read requests for data not residing in our page 
> cache (CASSANDRA-5863), however.
> Implementation-wise, we only have two options available to us: explicit FSMs 
> and chained futures. Fibers would be the third, and easiest option, but 
> aren’t feasible in Java without resorting to direct bytecode manipulation 
> (ourselves or using [quasar|https://github.com/puniverse/quasar]).
> I have seen 4 implementations based on chained futures/promises now - three 
> in Java and one in C++ - and I’m not convinced that it’s the optimal (or 
> sane) choice for representing our complex logic - think 2i quorum read 
> requests with timeouts at all levels, read repair (blocking and 
> non-blocking), and speculative retries in the mix, {{SERIAL}} reads and 
> writes.
> I’m currently leaning towards an implementation based on explicit FSMs, and 
> intend to provide a prototype - soonish - for comparison with 
> {{CompletableFuture}}-like variants.
> Either way, the transition is a relatively boring, straightforward refactoring.
> There are, however, some extension points on both write and read paths that 
> we do not control:
> - authorisation implementations will have to be non-blocking. We have control 
> over built-in ones, but for any custom implementation we will have to execute 
> them in a separate thread pool
> - 2i hooks on the write path will need to be non-blocking
> - any trigger implementations will not be allowed to block
> - UDFs and UDAs
> We are further limited by API compatibility restrictions in the 3.x line, 
> which forbid us from altering those extension points or adding any 
> non-{{default}} interface methods to them, so these pose a problem.
> Depending on logistics, expecting to get this done in time for 3.4 or 3.6 
> feature release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
