Francesco Nigro created ARTEMIS-2945:
----------------------------------------
Summary: LibAIO JNI code could be rewritten in Java
Key: ARTEMIS-2945
URL: https://issues.apache.org/jira/browse/ARTEMIS-2945
Project: ActiveMQ Artemis
Issue Type: Improvement
Components: Broker
Affects Versions: 2.15.0
Reporter: Francesco Nigro
Assignee: Francesco Nigro
LibAIO JNI code could be rewritten in Java, while keeping the LibaioContext and
LibaioFile APIs the same.
There are a few benefits from this:
# simplification of the C code logic, making it easier to maintain
# quicker development process (implement-try-test-debug cycle) for non-C
programmers, including simpler integration with Java test suites
# easier monitoring/telemetry integration
As a demonstration of these benefits, I would introduce several changes in
the Java version:
# use the libaio async fdatasync feature so that the LibaioContext duty cycle
loop frees CPU resources to handle compaction reads without being slowed down
by an in-progress fdatasync
# use a lock-free, high-performance data structure to reuse iocbs instead of a
mutex-protected one
# expose in-flight callbacks to allow future PRs to introduce Java-only
per-request latency telemetry, or simply error checking/killing of "slow"
in-flight requests
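To illustrate the second point, here is a minimal sketch of a lock-free pool
of reusable slots. It is not the code from the PR: the class name
IocbSlotPool and the idea of identifying each iocb by an integer slot index
are assumptions made for the example. It is a Treiber-style stack whose head
carries a version tag, so a CAS cannot succeed on a stale head (ABA).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lock-free pool of iocb slot indices (illustration only).
final class IocbSlotPool {
    static final int NIL = -1;

    // next[i] is the slot below slot i in the free stack.
    private final int[] next;
    // High 32 bits: version tag; low 32 bits: index of the top free slot.
    private final AtomicLong head;

    IocbSlotPool(int capacity) {
        next = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            next[i] = (i + 1 < capacity) ? i + 1 : NIL;
        }
        head = new AtomicLong(pack(0, capacity > 0 ? 0 : NIL));
    }

    private static long pack(int version, int index) {
        return ((long) version << 32) | (index & 0xFFFFFFFFL);
    }

    // Returns a free slot index, or NIL if every slot is in flight.
    int acquire() {
        for (;;) {
            long h = head.get();
            int top = (int) h;
            if (top == NIL) {
                return NIL;
            }
            int below = next[top];
            // The version bump makes the CAS fail if the head changed meanwhile.
            if (head.compareAndSet(h, pack((int) (h >>> 32) + 1, below))) {
                return top;
            }
        }
    }

    // Returns a slot to the pool once its request has completed.
    void release(int slot) {
        for (;;) {
            long h = head.get();
            next[slot] = (int) h;
            if (head.compareAndSet(h, pack((int) (h >>> 32) + 1, slot))) {
                return;
            }
        }
    }
}
```

A submitting thread would call acquire() before preparing a request and the
thread polling completions would call release(slot) afterwards. Neither
path takes a mutex.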
The possible drawbacks are:
# slower performance due to the cost of GC barriers (see the notes on the PR
code)
# slower performance on JVMs without Unsafe (very few in practice)
The latter issue could be addressed by using the proper VarHandle features
once the Artemis minimum supported version moves on from Java 8, or by using
the same approach as other projects that rely on Unsafe, e.g. Netty.
A note about how to benchmark this correctly, given how I've implemented async
fdatasync: to avoid having both LibaioContext and TimedBuffer batch writes
around fdatasync, I've preferred to simplify the LibaioContext. This means
that the buffer timeout used in broker.xml should be obtained from
./artemis perf-journal --sync-writes, i.e. batching writes at the speed of the
measured fdatasync RTT latency.
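As a rough illustration of what "measured fdatasync RTT latency" means, the
sketch below times repeated data-only syncs of a small file from plain Java.
It is not how perf-journal is implemented and should not replace it: the
class name SyncLatencyProbe is hypothetical, and FileChannel.force(false) is
used as a stand-in for fdatasync. The resulting average, in nanoseconds, is
the kind of value one would expect to use as the buffer timeout (assumed here
to be the journal-buffer-timeout setting in broker.xml).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical probe: average latency of a data-only sync, in nanoseconds.
final class SyncLatencyProbe {

    static long averageSyncNanos(Path dir, int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        try {
            Path file = Files.createTempFile(dir, "sync-probe", ".dat");
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ByteBuffer block = ByteBuffer.allocateDirect(4096);
                long total = 0;
                for (int i = 0; i < rounds; i++) {
                    block.clear();
                    ch.write(block, 0);      // dirty one 4 KiB block
                    long start = System.nanoTime();
                    ch.force(false);         // flush data, not metadata
                    total += System.nanoTime() - start;
                }
                return total / rounds;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The probe should be pointed at the directory on the same device as the
journal, since sync latency depends heavily on the underlying storage.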
This last behaviour could be changed using a more apples-to-apples approach,
although I still think that the beauty of using Java is exactly to bring in
new features/logic with a shorter development cycle :)
We're not in a hurry to get this done, so perf-wise this feature could be
implemented to improve performance over the original version too, if possible.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)