Francesco Nigro created ARTEMIS-2945:
----------------------------------------

             Summary: LibAIO JNI code could be rewritten in Java
                 Key: ARTEMIS-2945
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2945
             Project: ActiveMQ Artemis
          Issue Type: Improvement
          Components: Broker
    Affects Versions: 2.15.0
            Reporter: Francesco Nigro
            Assignee: Francesco Nigro


LibAIO JNI code could be rewritten in Java, while keeping the LibaioContext and 
LibaioFile APIs the same. 

 

There are a few benefits from this:
 # simplification of the C code logic, making it easier to maintain
 # quicker development process (implement-try-test-debug cycle) for non-C 
programmers, including simpler integration with Java test suites
 # easier monitoring/telemetry integration

 

To demonstrate these benefits, I would introduce several changes in the Java 
version:
 # using the libaio async fdatasync feature so that the LibaioContext duty 
cycle loop frees CPU resources and can handle compaction reads without being 
slowed down by an in-progress fdatasync
 # using a lock-free, high-performance data structure to reuse iocbs instead 
of a mutex-protected one
 # exposing in-flight callbacks so that future PRs can introduce Java-only 
per-request latency telemetry, or simply error-check/kill "slow" in-flight 
requests
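
The second change above can be pictured with a small sketch. This is only an 
illustration, assuming the reusable iocbs are addressed by slot index; 
IocbSlotPool and its methods are hypothetical names, not the data structure 
used in the PR. It is a Treiber-style free stack whose head carries a version 
tag, so that a compare-and-set cannot succeed on a stale snapshot:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical lock-free pool of reusable iocb slot indices (illustration only). */
final class IocbSlotPool {
    static final int EMPTY = -1;

    // next.get(i) is the index of the free slot below slot i on the stack.
    private final AtomicIntegerArray next;
    // High 32 bits: version tag (guards against ABA); low 32 bits: top free slot.
    private final AtomicLong head;

    IocbSlotPool(int capacity) {
        next = new AtomicIntegerArray(capacity);
        for (int i = 0; i < capacity; i++) {
            next.set(i, i - 1); // slot 0 is the bottom of the stack
        }
        head = new AtomicLong(pack(0, capacity - 1));
    }

    private static long pack(int version, int index) {
        return ((long) version << 32) | (index & 0xFFFFFFFFL);
    }

    private static int version(long packed) {
        return (int) (packed >>> 32);
    }

    /** Returns a free slot index, or EMPTY if every slot is in flight. */
    int acquire() {
        while (true) {
            long h = head.get();
            int top = (int) h;
            if (top == EMPTY) {
                return EMPTY;
            }
            int below = next.get(top);
            // The version bump makes this CAS fail if the stack changed meanwhile.
            if (head.compareAndSet(h, pack(version(h) + 1, below))) {
                return top;
            }
        }
    }

    /** Returns a slot to the pool once its request has completed. */
    void release(int slot) {
        while (true) {
            long h = head.get();
            next.set(slot, (int) h);
            if (head.compareAndSet(h, pack(version(h) + 1, slot))) {
                return;
            }
        }
    }
}
```

A submitter would call acquire() before preparing a request and the completion 
path would call release(slot); neither path takes a lock.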

 

The possible drawbacks are:
 # slower performance due to the cost of GC barriers (see the notes on the PR 
code)
 # slower performance on JVMs without Unsafe (there are very few of these)

The latter issue could be addressed by using the proper VarHandle features 
once the Artemis minimum supported version moves on from Java 8, or by using 
the same approach as other projects that rely on Unsafe, e.g. Netty.
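
As a rough illustration of the VarHandle route, the sketch below performs 
release/acquire accesses on off-heap memory without Unsafe. It requires Java 9 
or later, and OrderedLongAccess is a hypothetical helper: the issue does not 
say which Unsafe operations the PR actually relies on, so this only shows the 
general shape of such a replacement.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: ordered long access to a ByteBuffer without Unsafe (Java 9+). */
final class OrderedLongAccess {
    // Views the buffer's bytes as longs in native byte order.
    private static final VarHandle LONG_VIEW =
        MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private OrderedLongAccess() {
    }

    /** Allocates a direct buffer whose start address is 8-byte aligned. */
    static ByteBuffer allocateAligned(int bytes) {
        return ByteBuffer.allocateDirect(bytes + Long.BYTES).alignedSlice(Long.BYTES);
    }

    /** Release store, comparable in ordering to Unsafe.putOrderedLong. */
    static void putOrdered(ByteBuffer buffer, int byteOffset, long value) {
        LONG_VIEW.setRelease(buffer, byteOffset, value);
    }

    /** Acquire load that pairs with putOrdered. */
    static long getAcquire(ByteBuffer buffer, int byteOffset) {
        return (long) LONG_VIEW.getAcquire(buffer, byteOffset);
    }
}
```

The byte offsets passed in must themselves be multiples of 8, since the 
release and acquire access modes require aligned accesses.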

A note about how to benchmark this correctly, given how I've implemented async 
fdatasync: to avoid having both LibaioContext and TimedBuffer batch writes for 
fdatasync, I've preferred to simplify the LibaioContext. This means that the 
buffer timeout used in broker.xml should be the one obtained from 
./artemis perf-journal --sync-writes, i.e. batching writes at the speed of the 
measured fdatasync RTT latency.

This last behaviour could be changed to use a more apples-to-apples approach, 
although I still think that the beauty of using Java is exactly that it brings 
new features/logic in with a shorter development cycle :) 

We're not in a hurry to get this done, so, perf-wise, this feature could be 
implemented in a way that also improves performance over the original version, 
if possible.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
