[
https://issues.apache.org/jira/browse/IOTDB-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091091#comment-17091091
]
Houliang Qi commented on IOTDB-606:
-----------------------------------
The following operations can change the contents of the partition table:
1. Adding a node;
2. Removing a node.
A node needs to pull a MetaSnapshot mainly in the following cases:
1. The node is newly added to the cluster;
2. The node restarts after downtime, and its meta information has fallen far
behind the leader's;
3. The node rejoins the cluster after a network partition, and its meta
information has fallen far behind the leader's.
For 1, no request will arrive at the new node before the new partition table
is applied, so the partition table can simply be applied directly.
For 2 and 3, if a request is routed to this node while its partition table is
stale, the metadata obtained by the metamember or datamember will also be
wrong. In this case the operation will definitely fail, so the upper layer
should retry. The node can replace its partition table directly. Until the
replacement is complete, all operations on the node are blocked (the request
flow is drained first).
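The blocking described above could be sketched with a read-write lock. This is
only an illustration, not the actual IoTDB code: the class
PartitionTableHolder, its methods, the version field, and the list of node
names standing in for the partition table are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder for a node's partition table (not an IoTDB class).
class PartitionTableHolder {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private List<String> nodes;  // stand-in for the real partition table
    private long version;        // assumed version counter, for illustration

    PartitionTableHolder(List<String> nodes, long version) {
        this.nodes = new ArrayList<>(nodes);
        this.version = version;
    }

    // Regular requests hold the read lock, so many can run concurrently.
    String route(String key) {
        lock.readLock().lock();
        try {
            return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
        } finally {
            lock.readLock().unlock();
        }
    }

    // Installing a snapshot takes the write lock: it waits for in-flight
    // requests to finish and blocks new ones until the table is swapped.
    boolean installSnapshot(List<String> newNodes, long newVersion) {
        lock.writeLock().lock();
        try {
            if (newVersion <= version) {
                return false;  // assumed rule: ignore a snapshot that is not newer
            }
            nodes = new ArrayList<>(newNodes);
            version = newVersion;
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    long version() {
        lock.readLock().lock();
        try {
            return version;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A request that was routed with the stale table before the swap would still
fail and be retried by the upper layer; the lock only ensures that no request
observes a half-replaced table.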
The above considers adding or removing only one node at a time. Now consider
adding or removing multiple nodes. Since all operations are performed
sequentially on the leader, the leader always has the newest partition table,
and Raft guarantees that the partition table the leader gives to a follower is
accurate. So for a follower, this case is the same as adding or removing a
single node.
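To illustrate the multi-node argument, here is a small sketch in which the
leader applies membership changes one at a time and a follower simply installs
the leader's latest table. All names (LeaderTable, TableSnapshot,
FollowerTable) and the version counter are hypothetical and only meant to show
the idea; they are not taken from the IoTDB code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Immutable copy of the leader's table at some version (hypothetical type).
class TableSnapshot {
    final List<String> nodes;
    final long version;

    TableSnapshot(List<String> nodes, long version) {
        this.nodes = new ArrayList<>(nodes);
        this.version = version;
    }
}

// The leader applies every membership change sequentially.
class LeaderTable {
    private final TreeSet<String> nodes = new TreeSet<>();
    private long version = 0;

    synchronized void addNode(String node) {
        if (nodes.add(node)) {
            version++;
        }
    }

    synchronized void removeNode(String node) {
        if (nodes.remove(node)) {
            version++;
        }
    }

    synchronized TableSnapshot snapshot() {
        return new TableSnapshot(new ArrayList<>(nodes), version);
    }
}

// A follower does not replay individual changes; it replaces its table
// with the leader's snapshot, however many changes it has missed.
class FollowerTable {
    private TableSnapshot current = new TableSnapshot(new ArrayList<>(), 0);

    void install(TableSnapshot snapshot) {
        current = snapshot;
    }

    TableSnapshot current() {
        return current;
    }
}
```

Whether the follower missed one change or five, the install step is the same
single replacement, which is why the multi-node case reduces to the
single-node case from the follower's point of view.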
Please leave your opinion, thanks.
> [Distributed] Replace raw logs in MetaSnapshot
> ----------------------------------------------
>
> Key: IOTDB-606
> URL: https://issues.apache.org/jira/browse/IOTDB-606
> Project: Apache IoTDB
> Issue Type: Improvement
> Reporter: Tian Jiang
> Priority: Major
> Labels: cluster, metadata, snapshot
>
> The current MetaSnapshot is using the simplest way, storing the raw committed
> logs. It would be more efficient to replace the logs with compact structures
> like the partition table and other objects that will be affected by meta logs.