Thanks to Xinyu. I just read this design and code, which can improve the 
performance of writing. I made a few suggestions that we need to consider. The 
order of data is a problem. In multleader mode, the operations to ensure data 
order need to be explained in the user guide. 1. The data of the same sequence 
of the client is written to a node. 2. The client remains unchanged, the server 
optimizes session segmentation, and the same sequence is transferred to a node 
to control writing.

best
-------
HW-Chao Wang

---Original---
From: "Xinyu Tan"<[email protected]&gt;
Date: Wed, May 25, 2022 23:14 PM
To: "dev"<[email protected]&gt;;
Subject: Consensus layer in new cluster architecture


Hi everyone,

In our new cluster architecture, the consensus layer is a general
interface, which is mainly responsible for providing abstractions of read,
write and shard management for the upper side, and applying the logs to the
state machine in sequence for the below side. Similar to the idea in the
OSDI 2020 best paper Delos[1],&nbsp; our very simple idea is to enable the
consensus layer to support multiple consensus algorithms, and to shield the
details of the consensus for the upper and lower layers through interface.

Currently our consensus layer supports two common implementations
StandAloneConsensus and RatisConsensus. The former is a very simple
single-copy abstraction, and the latter uses Apache Ratis to achieve
Replicate state machine among multiple nodes. However, Ratis guarantees
strong consistency over multiple replicas, which undoubtedly affects
performance, and we don't necessarily need such consistency in some
time-series user scenarios. Therefore, we have recently implemented a
consensus algorithm for multi-master replication that provides weak
consistency but higher performance. Welcome to review this pr [2].

BTW, Although this multi-master consensus algorithm has good theoretical
performance, it provides very weak consistency. We may enhance the
consistency level it can provide by adding certain constraints later, just
like the CAD algorithm in Fast 2020 Best Paper [3], We welcome friends in
the community who are interested in this area to contribute your ideas.

In addition, since the consensus layer is a general interface, we also
welcome friends in the community who are interested in implementing
consensus algorithms to implement more consensus algorithms for the
consensus layer, such as EPaxos, etc. This is undoubtedly a very meaningful
independent work.

[1] https://www.usenix.org/conference/osdi20/presentation/balakrishnan
[2] https://github.com/apache/iotdb/pull/5939
[3] https://www.usenix.org/system/files/fast20-ganesan.pdf

Best
-----------------
Xinyu Tan

Reply via email to