luoluoyuyu commented on code in PR #17609:
URL: https://github.com/apache/iotdb/pull/17609#discussion_r3309337931


##########
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/consensus/statemachine/ConfigRegionStateMachine.java:
##########
@@ -122,16 +122,32 @@ public TSStatus write(IConsensusRequest request) {
 
   /** Transmit {@link ConfigPhysicalPlan} to {@link ConfigPlanExecutor} */
   protected TSStatus write(ConfigPhysicalPlan plan) {
+    SimpleConsensusPersistResult persistResult = null;
+    if (ConsensusFactory.SIMPLE_CONSENSUS.equals(CONF.getConfigNodeConsensusProtocolClass())) {
+      persistResult = persistPlanForSimpleConsensus(plan);

Review Comment:
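   With the **WAL ordering persist → execute**, if `persistPlanForSimpleConsensus` at L127 succeeds but `executeNonQueryPlan` at L136 fails, the plan is already on disk and **will be replayed on restart**.
   
   This requires **every** `ConfigPhysicalPlan` implementation to be idempotent. Please provide:
   1. A list or audit conclusion (covering at least RegisterDataNode, partition, and pipe-related plans)
   2. A unit test/IT that simulates a restart after an execute failure and verifies that nothing is double-registered and no unrecoverable error is thrown
   
   Otherwise, crash-recovery edge cases may corrupt metadata. This is consistent with @CRZbulabula's comment; please reply to each point.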
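
To make the idempotency requirement concrete, here is a minimal sketch of the kind of check requested in item 2. It is not IoTDB code: `RegisterNodePlan`, `NodeRegistry`, and `replay` are hypothetical stand-ins, the endpoint string is made up, and the real `ConfigPhysicalPlan` and executor classes have different shapes. It only shows that replaying a persisted plan twice should leave a single registration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdempotentReplaySketch {

  /** Hypothetical stand-in for a register plan; not the real IoTDB class. */
  static final class RegisterNodePlan {
    final int nodeId;
    final String endpoint;

    RegisterNodePlan(int nodeId, String endpoint) {
      this.nodeId = nodeId;
      this.endpoint = endpoint;
    }
  }

  /** Hypothetical in-memory state that a plan executor would mutate. */
  static final class NodeRegistry {
    private final Map<Integer, String> nodes = new HashMap<>();

    /** Idempotent apply: re-applying an identical plan leaves the state unchanged. */
    boolean apply(RegisterNodePlan plan) {
      String existing = nodes.putIfAbsent(plan.nodeId, plan.endpoint);
      // Accept a first registration or an exact repeat; reject a conflicting one.
      return existing == null || existing.equals(plan.endpoint);
    }

    int size() {
      return nodes.size();
    }
  }

  /** Rebuilds state by replaying the persisted log from the start, as a restart would. */
  static NodeRegistry replay(List<RegisterNodePlan> log) {
    NodeRegistry registry = new NodeRegistry();
    for (RegisterNodePlan plan : log) {
      registry.apply(plan);
    }
    return registry;
  }

  public static void main(String[] args) {
    List<RegisterNodePlan> log = new ArrayList<>();
    RegisterNodePlan plan = new RegisterNodePlan(1, "127.0.0.1:10730");
    log.add(plan); // persisted before execute; execute then fails
    log.add(plan); // the same plan persisted again, e.g. after a client retry
    System.out.println("registered nodes after replay: " + replay(log).size());
  }
}
```

A plan whose apply step is not written this way (for example, one that appends to a list or increments a counter) would need either a guard like the one above or a guarantee that it is never replayed.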



##########
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/consensus/statemachine/ConfigRegionStateMachine.java:
##########
@@ -122,16 +122,32 @@ public TSStatus write(IConsensusRequest request) {
 
   /** Transmit {@link ConfigPhysicalPlan} to {@link ConfigPlanExecutor} */
   protected TSStatus write(ConfigPhysicalPlan plan) {
+    SimpleConsensusPersistResult persistResult = null;
+    if (ConsensusFactory.SIMPLE_CONSENSUS.equals(CONF.getConfigNodeConsensusProtocolClass())) {
+      persistResult = persistPlanForSimpleConsensus(plan);
+      final TSStatus persistStatus = persistResult.status;
+      if (persistStatus.getCode() != TSStatusCode.SUCCESS_STATUS.getStatusCode()) {
+        return persistStatus;
+      }
+    }
+
     TSStatus result;
     try {
       result = executor.executeNonQueryPlan(plan);

Review Comment:
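   The rollback path `rollbackFailedPlanForSimpleConsensus` is triggered when execute fails, 👍.
   
   Please confirm that rollback and WAL replay ordering are consistent for all failure types, and that when the rollback itself fails, the log/alert at L145 is sufficient for operators to step in (to avoid a silent inconsistency between metadata and the WAL).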
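
As an illustration of the concern, the following is a rough sketch of a persist → execute → rollback flow in which a failed rollback is reported as its own outcome rather than being swallowed. All names here (`PlanLog`, `Outcome`, `write`, `failOnRollback`) are hypothetical and do not reflect the actual code in this PR; the log is a plain in-memory deque standing in for the persisted file.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.logging.Level;
import java.util.logging.Logger;

class PersistExecuteRollbackSketch {
  private static final Logger LOGGER =
      Logger.getLogger(PersistExecuteRollbackSketch.class.getName());

  enum Outcome { SUCCESS, ROLLED_BACK, ROLLBACK_FAILED }

  /** Hypothetical append-only log; the real persistence layer differs. */
  static final class PlanLog {
    final Deque<String> entries = new ArrayDeque<>();
    boolean failOnRollback = false; // hook to simulate a rollback failure

    void append(String plan) {
      entries.addLast(plan);
    }

    void removeLast() {
      if (failOnRollback) {
        throw new IllegalStateException("simulated rollback failure");
      }
      entries.removeLast();
    }
  }

  /** Persists first, then executes; undoes the persisted entry if execution fails. */
  static Outcome write(PlanLog log, String plan, Predicate<String> executor) {
    log.append(plan);
    boolean executed;
    try {
      executed = executor.test(plan);
    } catch (RuntimeException e) {
      executed = false;
    }
    if (executed) {
      return Outcome.SUCCESS;
    }
    try {
      log.removeLast();
      return Outcome.ROLLED_BACK;
    } catch (RuntimeException e) {
      // The log still holds a plan that was never applied: report it loudly
      // and return a distinct outcome so callers cannot mistake it for a clean failure.
      LOGGER.log(Level.SEVERE, "Rollback failed for plan " + plan
          + "; persisted log and in-memory state may diverge", e);
      return Outcome.ROLLBACK_FAILED;
    }
  }

  public static void main(String[] args) {
    PlanLog log = new PlanLog();
    System.out.println(write(log, "plan-ok", p -> true));
    System.out.println(write(log, "plan-bad", p -> false));
    log.failOnRollback = true;
    System.out.println(write(log, "plan-stuck", p -> false));
  }
}
```

In the third case the unapplied plan stays in the log, which is exactly the state that a restart would replay, so a distinct status plus a SEVERE-level log entry is the minimum needed for someone to notice.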


