[
https://issues.apache.org/jira/browse/IGNITE-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847377#comment-17847377
]
Vladislav Pyatkov commented on IGNITE-15568:
--------------------------------------------
{code}
New
Raft metrics:
raft.logmanager.disruptor.Batch: [
0_10:2487232,
10_20:5693,
20_30:2096,
30_40:574,
40_50:295,
50_inf:4]
Benchmark                              (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
MyInsertBenchmarkWithMetrics.kvInsert              1    false                 2  avgt  200  6723,882 ± 639,991  us/op
MyInsertBenchmarkWithMetrics.kvInsert              1     true                 2  avgt  200  7722,169 ± 504,716  us/op
Old
raft.logmanager.disruptor.Batch: [
0_10:2788769,
10_20:8218,
20_30:4532,
30_40:1579,
40_50:782,
50_inf:61]
raft.nodeimpl.disruptor.Batch: [
0_10:3274036,
10_20:2066,
20_30:446,
30_40:128,
40_50:35,
50_inf:8]
raft.readonlyservice.disruptor.Batch: [
0_10:2,
10_20:0,
20_30:0,
30_40:0,
40_50:0,
50_inf:0]
raft.fsmcaller.disruptor.Batch: [
0_10:9135,
10_20:6197,
20_30:4795,
30_40:6800,
40_50:80328,
50_inf:73]
Benchmark                              (clusterSize)  (fsync)  (partitionCount)  Mode  Cnt     Score     Error  Units
MyInsertBenchmarkWithMetrics.kvInsert              1    false                 2  avgt  200  7611,808 ± 695,469  us/op
MyInsertBenchmarkWithMetrics.kvInsert              1     true                 2  avgt  200  7681,789 ± 433,490  us/op
{code}
> Striped Disruptor doesn't work with JRaft event handlers properly
> -----------------------------------------------------------------
>
> Key: IGNITE-15568
> URL: https://issues.apache.org/jira/browse/IGNITE-15568
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexey Scherbakov
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3, performance
> Fix For: 3.0.0-beta2
>
> Attachments: InsertBenchmark.java, MyInsertBenchmarkWithMetrics.java
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> The following scenario is broken:
> # Two raft groups are started and mapped to the same stripe.
> # Two LogEntryAndClosure events are added in quick succession so they form
> a disruptor batch: the first for group 1, the second for group 2.
> The first event is delivered to group 1 with endOfBatch=false, so it is cached in
> org.apache.ignite.raft.jraft.core.NodeImpl.LogEntryAndClosureHandler#tasks
> and is not processed.
> The second event is delivered to group 2 with endOfBatch=true and is processed,
> but the first event remains in the queue unprocessed forever, because
> each raft group has its own LogEntryAndClosureHandler instance.
> A possible workaround is to set
> org.apache.ignite.raft.jraft.option.RaftOptions#applyBatch=1
> Reproducible by
> org.apache.ignite.internal.table.TxDistributedTest_1_1_1#testCrossTable +
> applyBatch=32 in ignite-15085 branch
> *Implementation notes*
> My proposal is limited to the Disruptor layer. The striped disruptor
> implementation has a common interceptor that forwards each event to a
> group-specific interceptor. Only the last event in the batch carries the
> batch-completion flag. For every other RAFT group that was notified in the
> same striped disruptor batch, an extra event must be created to close the
> batch for that group. The new event will be created in the common striped
> disruptor interceptor and sent to the group-specific interceptor with the
> batch-completion flag set.
> The rule for handling the new event differs between interceptors:
> {code:java|title=ApplyTaskHandler (FSMCallerImpl#runApplyTask)}
> if (maxCommittedIndex >= 0) {
> doCommitted(maxCommittedIndex);
> return -1;
> }
> {code}
> {code:java|title=LogEntryAndClosureHandler(LogEntryAndClosureHandler#onEvent)}
> if (this.tasks.size() > 0) {
> executeApplyingTasks(this.tasks);
> this.tasks.clear();
> }
> {code}
> {code:java|title=ReadIndexEventHandler(ReadIndexEventHandler#onEvent)}
> if (this.events.size() > 0) {
> executeReadIndexEvents(this.events);
> this.events.clear();
> }
> {code}
> {code:java|title=StableClosureEventHandler(StableClosureEventHandler#onEvent)}
> if (this.ab.size > 0) {
> this.lastId = this.ab.flush();
> setDiskId(this.lastId);
> }
> {code}
> Also, as part of this issue, the benchmarks need to be rerun. They are
> expected to show an improvement in the case of high parallelism within one
> partition.
> There is [an example of the
> benchmark|https://github.com/gridgain/apache-ignite-3/tree/4b9de922caa4aef97a5e8e159d5db76a3fc7a3ad/modules/runner/src/test/java/org/apache/ignite/internal/benchmark].
>
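The scenario and the proposed fix in the description above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: it does not use the LMAX Disruptor or any Ignite class, and the names StripedBatchSketch, GroupHandler, dispatch and notifyOtherGroups are hypothetical. A null event stands in for the extra batch-completion event that the implementation notes propose.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of one stripe shared by several raft groups (not the Ignite code). */
public class StripedBatchSketch {
    /** Hypothetical stand-in for a per-group handler such as LogEntryAndClosureHandler. */
    public static class GroupHandler {
        /** Events cached until this handler sees endOfBatch=true. */
        public final List<String> tasks = new ArrayList<>();
        /** Events that were actually applied. */
        public final List<String> processed = new ArrayList<>();

        /** A null event models the assumed synthetic "batch completed" notification. */
        public void onEvent(String event, boolean endOfBatch) {
            if (event != null) {
                tasks.add(event);
            }
            if (endOfBatch && !tasks.isEmpty()) {
                processed.addAll(tasks);
                tasks.clear();
            }
        }
    }

    /**
     * Delivers one batch. groups[i] is the index of the handler that owns events[i].
     * Only the last event of the whole batch gets endOfBatch=true, as in the bug report.
     */
    public static void dispatch(List<GroupHandler> handlers, int[] groups, String[] events,
                                boolean notifyOtherGroups) {
        // Handlers that received events in this batch but not the final one.
        Set<GroupHandler> pending = new LinkedHashSet<>();
        for (int i = 0; i < events.length; i++) {
            GroupHandler handler = handlers.get(groups[i]);
            boolean endOfBatch = i == events.length - 1;
            handler.onEvent(events[i], endOfBatch);
            if (endOfBatch) {
                pending.remove(handler); // this handler has already flushed its cache
            } else {
                pending.add(handler);
            }
        }
        if (notifyOtherGroups) {
            // Sketch of the proposed fix: close the batch for every other group.
            for (GroupHandler handler : pending) {
                handler.onEvent(null, true);
            }
        }
    }

    /** Runs the two-group scenario from the description, with or without the fix. */
    public static List<GroupHandler> run(boolean notifyOtherGroups) {
        List<GroupHandler> handlers = List.of(new GroupHandler(), new GroupHandler());
        dispatch(handlers, new int[] {0, 1}, new String[] {"e1", "e2"}, notifyOtherGroups);
        return handlers;
    }

    public static void main(String[] args) {
        List<GroupHandler> broken = run(false);
        System.out.println("without fix, group 1 still cached: " + broken.get(0).tasks);
        List<GroupHandler> fixed = run(true);
        System.out.println("with fix, group 1 processed: " + fixed.get(0).processed);
    }
}
```

Without the extra notification, the event for group 1 stays in its handler's cache. With it, both groups process their events. In this model, setting applyBatch=1 would correspond to each handler flushing after every event, which is why the workaround avoids the problem at the cost of batching.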