Re: [PR] Region migration improvement [iotdb]
OneSizeFitsQuorum merged PR #12165: URL: https://github.com/apache/iotdb/pull/12165 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1538711725 ## iotdb-core/datanode/src/main/java/org/apache/iotdb/db/service/RegionMigrateService.java: ## @@ -192,7 +192,7 @@ private boolean addToTaskResultMap(long taskId) { if (taskResultMap.containsKey(taskId)) { return false; } -taskResultMap.put(taskId, unfinishedResult); +taskResultMap.putIfAbsent(taskId, unfinishedResult); Review Comment: you can always learn something from Chief Tan ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1538703346 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/region/RegionMigrateProcedure.java: ## @@ -216,7 +214,10 @@ public void deserialize(ByteBuffer byteBuffer) { coordinatorForAddPeer = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); coordinatorForRemovePeer = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); } catch (ThriftSerDeException e) { - LOGGER.error("Error in deserialize {}", this.getClass(), e); + LOGGER.error( + "Error in deserialize {}, this procedure may come from old version and already cannot be used.", + this.getClass(), Review Comment: good point -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
OneSizeFitsQuorum commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1538662721 ## iotdb-core/datanode/src/main/java/org/apache/iotdb/db/service/RegionMigrateService.java: ## @@ -192,7 +192,7 @@ private boolean addToTaskResultMap(long taskId) { if (taskResultMap.containsKey(taskId)) { return false; } -taskResultMap.put(taskId, unfinishedResult); +taskResultMap.putIfAbsent(taskId, unfinishedResult); Review Comment: putifAbsent -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
OneSizeFitsQuorum commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1538659472 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/region/RegionMigrateProcedure.java: ## @@ -216,7 +214,10 @@ public void deserialize(ByteBuffer byteBuffer) { coordinatorForAddPeer = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); coordinatorForRemovePeer = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); } catch (ThriftSerDeException e) { - LOGGER.error("Error in deserialize {}", this.getClass(), e); + LOGGER.error( + "Error in deserialize {}, this procedure may come from old version and already cannot be used.", + this.getClass(), Review Comment: print id -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
CRZbulabula commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1537513344 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/service/StatisticsService.java: ## @@ -279,17 +279,21 @@ private void recordRegionPriorityMap( regionPriorityEntry : priorityMap.entrySet()) { if (!Objects.equals( regionPriorityEntry.getValue().getRight(), regionPriorityEntry.getValue().getLeft())) { -LOGGER.info( -"[RegionPriority]\t {}: {}->{}", -regionPriorityEntry.getKey(), -regionPriorityEntry.getValue().getLeft() == null -? "null" -: regionPriorityEntry.getValue().getLeft().getDataNodeLocations().stream() -.map(TDataNodeLocation::getDataNodeId) -.collect(Collectors.toList()), - regionPriorityEntry.getValue().getRight().getDataNodeLocations().stream() -.map(TDataNodeLocation::getDataNodeId) -.collect(Collectors.toList())); +try { + LOGGER.info( + "[RegionPriority]\t {}: {}->{}", + regionPriorityEntry.getKey(), + regionPriorityEntry.getValue().getLeft() == null + ? "null" + : regionPriorityEntry.getValue().getLeft().getDataNodeLocations().stream() + .map(TDataNodeLocation::getDataNodeId) + .collect(Collectors.toList()), + regionPriorityEntry.getValue().getRight().getDataNodeLocations().stream() + .map(TDataNodeLocation::getDataNodeId) + .collect(Collectors.toList())); +} catch (Exception e) { Review Comment: My mistake!!! I should fix this bug these days >_< -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535039756 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/region/RegionMigrateProcedure.java: ## @@ -234,8 +213,10 @@ public void deserialize(ByteBuffer byteBuffer) { originalDataNode = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); destDataNode = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); consensusGroupId = ThriftCommonsSerDeUtils.deserializeTConsensusGroupId(byteBuffer); + coordinatorForAddPeer = ThriftCommonsSerDeUtils.deserializeTDataNodeLocation(byteBuffer); Review Comment: The compatibility for UpdateRegionLocationPlan is remained. For RegionMigrateProcedure, seems not necessary to keep compatibility. User should not upgrade IoTDB if any RegionMigrateProcedure is not finished. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535038134 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/env/ConfigNodeProcedureEnv.java: ## @@ -207,15 +207,18 @@ public boolean verifySucceed(TSStatus... status) { .allMatch(tsStatus -> tsStatus.getCode() == TSStatusCode.SUCCESS_STATUS.getStatusCode()); } - public boolean doubleCheckReplica(TDataNodeLocation removedDatanode) { -return getNodeManager() -.filterDataNodeThroughStatus(NodeStatus.Running, NodeStatus.ReadOnly) -.size() -- Boolean.compare( -getLoadManager().getNodeStatus(removedDatanode.getDataNodeId()) -!= NodeStatus.Unknown, -false) ->= NodeInfo.getMinimumDataNode(); + public boolean checkEnoughDataNodeAfterRemoving(TDataNodeLocation removedDatanode) { +final int runningOrReadOnlyDataNodeNum = +getNodeManager() +.filterDataNodeThroughStatus(NodeStatus.Running, NodeStatus.ReadOnly) +.size(); +int dataNodeNumAfterRemoving; +if (getLoadManager().getNodeStatus(removedDatanode.getDataNodeId()) != NodeStatus.Unknown) { Review Comment: considered -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535036692 ## iotdb-protocol/thrift-commons/src/main/thrift/common.thrift: ## @@ -90,12 +90,28 @@ struct TDataNodeConfiguration { 2: required TNodeResource resource } +// TODO: deprecated enum TRegionMigrateFailedType { Review Comment: For some reason it seems still useful, maybe remove it in next version ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535026292 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/RootProcedureStack.java: ## @@ -74,6 +74,7 @@ protected synchronized void unsetRollback() { state = State.FAILED; } + // TODO: check whether this method return sub-procedures from every level Review Comment: no problem -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535005566 ## iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/iot/IoTConsensusServerImpl.java: ## @@ -635,22 +623,68 @@ public void recoverConfiguration() { // interrupted // unexpectedly, we need substitute configuration with tmpConfiguration file if (Files.exists(tmpConfigurationPath)) { -if (Files.exists(configurationPath)) { - Files.delete(configurationPath); -} +Files.deleteIfExists(configurationPath); Files.move(tmpConfigurationPath, configurationPath); } - buffer = ByteBuffer.wrap(Files.readAllBytes(configurationPath)); - int size = buffer.getInt(); - for (int i = 0; i < size; i++) { -configuration.add(Peer.deserialize(buffer)); + if (Files.exists(configurationPath)) { +// recover from old configuration file +buffer = ByteBuffer.wrap(Files.readAllBytes(configurationPath)); +int size = buffer.getInt(); +for (int i = 0; i < size; i++) { + configuration.add(Peer.deserialize(buffer)); +} +Files.delete(configurationPath); +persistConfiguration(); Review Comment: good point -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1535005469 ## iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/iot/IoTConsensusServerImpl.java: ## @@ -635,22 +623,68 @@ public void recoverConfiguration() { // interrupted // unexpectedly, we need substitute configuration with tmpConfiguration file if (Files.exists(tmpConfigurationPath)) { -if (Files.exists(configurationPath)) { - Files.delete(configurationPath); -} +Files.deleteIfExists(configurationPath); Files.move(tmpConfigurationPath, configurationPath); } - buffer = ByteBuffer.wrap(Files.readAllBytes(configurationPath)); - int size = buffer.getInt(); - for (int i = 0; i < size; i++) { -configuration.add(Peer.deserialize(buffer)); + if (Files.exists(configurationPath)) { Review Comment: extract a recoverFromOldConfigurationFile() function here ## iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/iot/IoTConsensusServerImpl.java: ## @@ -635,22 +623,68 @@ public void recoverConfiguration() { // interrupted // unexpectedly, we need substitute configuration with tmpConfiguration file if (Files.exists(tmpConfigurationPath)) { -if (Files.exists(configurationPath)) { - Files.delete(configurationPath); -} +Files.deleteIfExists(configurationPath); Files.move(tmpConfigurationPath, configurationPath); } - buffer = ByteBuffer.wrap(Files.readAllBytes(configurationPath)); - int size = buffer.getInt(); - for (int i = 0; i < size; i++) { -configuration.add(Peer.deserialize(buffer)); + if (Files.exists(configurationPath)) { Review Comment: extract a recoverFromOldConfigurationFile() function here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1534991736 ## iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/utils/DataNodeKillPoints.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.commons.utils; + +public enum DataNodeKillPoints { Review Comment: @CRZbulabula Because it should be used in datanode package, for test purpose. This functionality is not completed yet (doge > Why not move this class to integration-test if it will only be used in that package? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1534991736 ## iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/utils/DataNodeKillPoints.java: ## @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.commons.utils; + +public enum DataNodeKillPoints { Review Comment: Because it should be used in datanode package, for test purpose. This functionality is not completed yet (doge -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1534990599 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/ProcedureManager.java: ## @@ -535,125 +537,136 @@ public boolean removeDataNode(RemoveDataNodePlan removeDataNodePlan) { return true; } - public TSStatus migrateRegion(TMigrateRegionReq migrateRegionReq) { -TConsensusGroupId regionGroupId; + // region region migration + + private TConsensusGroupId idToTConsensusGroupId(final int regionId) throws ProcedureException { Review Comment: good point~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1534982777 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/region/AddRegionPeerProcedure.java: ## @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.confignode.procedure.impl.region; + +import org.apache.iotdb.common.rpc.thrift.TConsensusGroupId; +import org.apache.iotdb.common.rpc.thrift.TDataNodeLocation; +import org.apache.iotdb.common.rpc.thrift.TRegionMigrateResult; +import org.apache.iotdb.common.rpc.thrift.TSStatus; +import org.apache.iotdb.commons.exception.runtime.ThriftSerDeException; +import org.apache.iotdb.commons.utils.ThriftCommonsSerDeUtils; +import org.apache.iotdb.confignode.procedure.env.ConfigNodeProcedureEnv; +import org.apache.iotdb.confignode.procedure.env.RegionMaintainHandler; +import org.apache.iotdb.confignode.procedure.exception.ProcedureException; +import org.apache.iotdb.confignode.procedure.exception.ProcedureSuspendedException; +import org.apache.iotdb.confignode.procedure.exception.ProcedureYieldException; +import org.apache.iotdb.confignode.procedure.impl.StateMachineProcedure; +import org.apache.iotdb.confignode.procedure.state.AddRegionPeerState; +import org.apache.iotdb.confignode.procedure.store.ProcedureType; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.DataOutputStream; +import java.io.IOException; +import java.nio.ByteBuffer; + +import static org.apache.iotdb.confignode.procedure.state.AddRegionPeerState.UPDATE_REGION_LOCATION_CACHE; +import static org.apache.iotdb.rpc.TSStatusCode.SUCCESS_STATUS; + +public class AddRegionPeerProcedure +extends StateMachineProcedure { + private static final Logger LOGGER = LoggerFactory.getLogger(AddRegionPeerProcedure.class); + private TConsensusGroupId consensusGroupId; + + private TDataNodeLocation coordinator; + + private TDataNodeLocation destDataNode; + + public AddRegionPeerProcedure() { +super(); + } + + public AddRegionPeerProcedure( + TConsensusGroupId consensusGroupId, + TDataNodeLocation coordinator, + TDataNodeLocation destDataNode) { +super(); +this.consensusGroupId = consensusGroupId; +this.coordinator = coordinator; +this.destDataNode = destDataNode; + } + + @Override + protected Flow executeFromState(ConfigNodeProcedureEnv env, AddRegionPeerState state) + throws ProcedureSuspendedException, ProcedureYieldException, InterruptedException { +if (consensusGroupId == null) { + return Flow.NO_MORE_STATE; +} +RegionMaintainHandler handler = env.getRegionMaintainHandler(); +try { + outerSwitch: + switch (state) { +case CREATE_NEW_REGION_PEER: + handler.createNewRegionPeer(consensusGroupId, destDataNode); + setNextState(AddRegionPeerState.DO_ADD_REGION_PEER); + break; +case DO_ADD_REGION_PEER: + TSStatus tsStatus = + handler.addRegionPeer(this.getProcId(), destDataNode, consensusGroupId, coordinator); + TRegionMigrateResult result; + if (tsStatus.getCode() == SUCCESS_STATUS.getStatusCode()) { +result = handler.waitTaskFinish(this.getProcId(), coordinator); + } else { +throw new ProcedureException("ADD_REGION_PEER executed failed in DataNode"); + } + switch (result.getTaskStatus()) { +case TASK_NOT_EXIST: + // coordinator crashed and lost its task table +case FAIL: + // maybe some DataNode crash + LOGGER.warn( + "result is {}, will use resetPeerList to clean in the future", + result.getTaskStatus()); + // List correctDataNodeLocations = + // + // env.getConfigManager().getPartitionManager().getAllReplicaSets().stream() + // .filter( + // tRegionReplicaSet -> + // + // tRegionReplicaSet.getRegionId().equals(consensusGroupId)) + //
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1534980790 ## integration-test/src/main/java/org/apache/iotdb/it/env/cluster/env/AbstractEnv.java: ## @@ -1007,4 +1009,22 @@ public String getToolsPath() { public String getLibPath() { return TEMPLATE_NODE_LIB_PATH; } + + @Override + public Optional dataNodeIdToWrapper(int nodeId) { +try (SyncConfigNodeIServiceClient leaderClient = +(SyncConfigNodeIServiceClient) getLeaderConfigNodeConnection()) { + TShowDataNodesResp resp = leaderClient.showDataNodes(); Review Comment: Because DataNodeWrapper not contains node id -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1533699684 ## iotdb-core/datanode/src/main/java/org/apache/iotdb/db/service/RegionMigrateService.java: ## @@ -139,6 +168,34 @@ public synchronized boolean submitDeleteOldRegionPeerTask(TMaintainPeerReq req) return submitSucceed; } + public synchronized TSStatus resetPeerList(TResetPeerListReq req) { +List correctPeers = +req.getCorrectLocations().stream() +.map(location -> Peer.valueOf(req.getRegionId(), location)) +.collect(Collectors.toList()); +ConsensusGroupId regionId = + ConsensusGroupId.Factory.createFromTConsensusGroupId(req.getRegionId()); +try { + if (regionId instanceof DataRegionId) { +DataRegionConsensusImpl.getInstance().resetPeerList(regionId, correctPeers); + } else { +SchemaRegionConsensusImpl.getInstance().resetPeerList(regionId, correctPeers); + } +} catch (ConsensusException e) { + LOGGER.error("reset peer list fail", e); + return new TSStatus(TSStatusCode.EXECUTE_STATEMENT_ERROR.getStatusCode()); +} +return new TSStatus(TSStatusCode.SUCCESS_STATUS.getStatusCode()); + } + + private boolean addToTaskResultMap(long taskId) { Review Comment: use putIfAbsent -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1533693248 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/env/RegionMaintainHandler.java: ## @@ -69,8 +80,16 @@ public class DataNodeRemoveHandler { /** region migrate lock */ private final LockQueue regionMigrateLock = new LockQueue(); - public DataNodeRemoveHandler(ConfigManager configManager) { + private static final long DATANODE_MAX_DISCONNECTION_MS = 1; Review Comment: while (configManager.getLoadManager().getNodeStatus(dataNodeLocation.getDataNodeId()) != NodeStatus.Unknown) ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1533430325 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/service/StatisticsService.java: ## @@ -279,17 +279,21 @@ private void recordRegionPriorityMap( regionPriorityEntry : priorityMap.entrySet()) { if (!Objects.equals( regionPriorityEntry.getValue().getRight(), regionPriorityEntry.getValue().getLeft())) { -LOGGER.info( -"[RegionPriority]\t {}: {}->{}", -regionPriorityEntry.getKey(), -regionPriorityEntry.getValue().getLeft() == null -? "null" -: regionPriorityEntry.getValue().getLeft().getDataNodeLocations().stream() -.map(TDataNodeLocation::getDataNodeId) -.collect(Collectors.toList()), - regionPriorityEntry.getValue().getRight().getDataNodeLocations().stream() -.map(TDataNodeLocation::getDataNodeId) -.collect(Collectors.toList())); +try { + LOGGER.info( + "[RegionPriority]\t {}: {}->{}", + regionPriorityEntry.getKey(), + regionPriorityEntry.getValue().getLeft() == null + ? "null" + : regionPriorityEntry.getValue().getLeft().getDataNodeLocations().stream() + .map(TDataNodeLocation::getDataNodeId) + .collect(Collectors.toList()), + regionPriorityEntry.getValue().getRight().getDataNodeLocations().stream() + .map(TDataNodeLocation::getDataNodeId) + .collect(Collectors.toList())); +} catch (Exception e) { Review Comment: > Why surrounding with try-catch? because NPE @CRZbulabula -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1533429359 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/ProcedureManager.java: ## @@ -535,125 +537,136 @@ public boolean removeDataNode(RemoveDataNodePlan removeDataNodePlan) { return true; } - public TSStatus migrateRegion(TMigrateRegionReq migrateRegionReq) { -TConsensusGroupId regionGroupId; + // region region migration + + private TConsensusGroupId idToTConsensusGroupId(final int regionId) throws ProcedureException { if (configManager .getPartitionManager() -.isRegionGroupExists( -new TConsensusGroupId( -TConsensusGroupType.SchemaRegion, migrateRegionReq.getRegionId( { - regionGroupId = - new TConsensusGroupId(TConsensusGroupType.SchemaRegion, migrateRegionReq.getRegionId()); -} else if (configManager +.isRegionGroupExists(new TConsensusGroupId(TConsensusGroupType.SchemaRegion, regionId))) { + return new TConsensusGroupId(TConsensusGroupType.SchemaRegion, regionId); +} +if (configManager .getPartitionManager() -.isRegionGroupExists( -new TConsensusGroupId( -TConsensusGroupType.DataRegion, migrateRegionReq.getRegionId( { - regionGroupId = - new TConsensusGroupId(TConsensusGroupType.DataRegion, migrateRegionReq.getRegionId()); -} else { - LOGGER.warn( - "Submit RegionMigrateProcedure failed, because RegionGroup: {} doesn't exist", - migrateRegionReq.getRegionId()); - TSStatus status = new TSStatus(TSStatusCode.MIGRATE_REGION_ERROR.getStatusCode()); - status.setMessage( - String.format( - "Submit RegionMigrateProcedure failed, because RegionGroup: %s doesn't exist", - migrateRegionReq.getRegionId())); - return status; +.isRegionGroupExists(new TConsensusGroupId(TConsensusGroupType.DataRegion, regionId))) { + return new TConsensusGroupId(TConsensusGroupType.DataRegion, regionId); } +String msg = +String.format( +"Submit RegionMigrateProcedure failed, because RegionGroup: %s doesn't exist", +regionId); +LOGGER.warn(msg); +throw new ProcedureException(msg); + } -TDataNodeLocation originalDataNode = -configManager -.getNodeManager() -.getRegisteredDataNode(migrateRegionReq.getFromId()) -.getLocation(); -TDataNodeLocation destDataNode = -configManager -.getNodeManager() -.getRegisteredDataNode(migrateRegionReq.getToId()) -.getLocation(); - + private TSStatus checkRegionMigrate( + TMigrateRegionReq migrateRegionReq, + TConsensusGroupId regionGroupId, + TDataNodeLocation originalDataNode, + TDataNodeLocation destDataNode, + TDataNodeLocation coordinatorForAddPeer) { +String failMessage = null; if (originalDataNode == null) { - LOGGER.warn( - "Submit RegionMigrateProcedure failed, because no original DataNode {}", - migrateRegionReq.getFromId()); - TSStatus status = new TSStatus(TSStatusCode.MIGRATE_REGION_ERROR.getStatusCode()); - status.setMessage( - "Submit RegionMigrateProcedure failed, because no original DataNode " - + migrateRegionReq.getFromId()); - return status; + failMessage = + String.format( + "Submit RegionMigrateProcedure failed, because no original DataNode %d", + migrateRegionReq.getFromId()); } else if (destDataNode == null) { - LOGGER.warn( - "Submit RegionMigrateProcedure failed, because no target DataNode {}", - migrateRegionReq.getToId()); - TSStatus status = new TSStatus(TSStatusCode.MIGRATE_REGION_ERROR.getStatusCode()); - status.setMessage( - "Submit RegionMigrateProcedure failed, because no target DataNode " - + migrateRegionReq.getToId()); - return status; + failMessage = + String.format( + "Submit RegionMigrateProcedure failed, because no target DataNode %s", + migrateRegionReq.getToId()); +} else if (coordinatorForAddPeer == null) { + failMessage = + String.format( + "%s, There are no other DataNodes could be selected to perform the add peer process, " + + "please check RegionGroup: %s by show regions sql command", + REGION_MIGRATE_PROCESS, regionGroupId); } else if (configManager.getPartitionManager() .getAllReplicaSets(originalDataNode.getDataNodeId()).stream() .noneMatch(replicaSet -> replicaSet.getRegionId().equals(regionGroupId))) { - LOGGER.warn( - "Submit RegionMigrateProcedure failed, because the original DataNode {} doesn't contain Region {}", -
Re: [PR] Region migration improvement [iotdb]
CRZbulabula commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1527743866 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/region/AddRegionPeerProcedure.java: ## @@ -0,0 +1,203 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.confignode.procedure.impl.region; + +import org.apache.iotdb.common.rpc.thrift.TConsensusGroupId; +import org.apache.iotdb.common.rpc.thrift.TDataNodeLocation; +import org.apache.iotdb.common.rpc.thrift.TRegionMigrateResult; +import org.apache.iotdb.common.rpc.thrift.TSStatus; +import org.apache.iotdb.commons.exception.runtime.ThriftSerDeException; +import org.apache.iotdb.commons.utils.ThriftCommonsSerDeUtils; +import org.apache.iotdb.confignode.procedure.env.ConfigNodeProcedureEnv; +import org.apache.iotdb.confignode.procedure.env.RegionMaintainHandler; +import org.apache.iotdb.confignode.procedure.exception.ProcedureException; +import org.apache.iotdb.confignode.procedure.exception.ProcedureSuspendedException; +import org.apache.iotdb.confignode.procedure.exception.ProcedureYieldException; +import org.apache.iotdb.confignode.procedure.impl.StateMachineProcedure; +import org.apache.iotdb.confignode.procedure.state.AddRegionPeerState; +import org.apache.iotdb.confignode.procedure.store.ProcedureType; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.DataOutputStream; +import java.io.IOException; +import java.nio.ByteBuffer; + +import static org.apache.iotdb.confignode.procedure.state.AddRegionPeerState.UPDATE_REGION_LOCATION_CACHE; +import static org.apache.iotdb.rpc.TSStatusCode.SUCCESS_STATUS; + +public class AddRegionPeerProcedure +extends StateMachineProcedure { + private static final Logger LOGGER = LoggerFactory.getLogger(AddRegionPeerProcedure.class); + private TConsensusGroupId consensusGroupId; + + private TDataNodeLocation coordinator; + + private TDataNodeLocation destDataNode; + + public AddRegionPeerProcedure() { +super(); + } + + public AddRegionPeerProcedure( + TConsensusGroupId consensusGroupId, + TDataNodeLocation coordinator, + TDataNodeLocation destDataNode) { +super(); +this.consensusGroupId = consensusGroupId; +this.coordinator = coordinator; +this.destDataNode = destDataNode; + } + + @Override + protected Flow executeFromState(ConfigNodeProcedureEnv env, AddRegionPeerState state) + throws ProcedureSuspendedException, ProcedureYieldException, InterruptedException { +if (consensusGroupId == null) { + return Flow.NO_MORE_STATE; +} +RegionMaintainHandler handler = env.getRegionMaintainHandler(); +try { + outerSwitch: + switch (state) { +case CREATE_NEW_REGION_PEER: + handler.createNewRegionPeer(consensusGroupId, destDataNode); + setNextState(AddRegionPeerState.DO_ADD_REGION_PEER); + break; +case DO_ADD_REGION_PEER: + TSStatus tsStatus = + handler.addRegionPeer(this.getProcId(), destDataNode, consensusGroupId, coordinator); + TRegionMigrateResult result; + if (tsStatus.getCode() == SUCCESS_STATUS.getStatusCode()) { +result = handler.waitTaskFinish(this.getProcId(), coordinator); + } else { +throw new ProcedureException("ADD_REGION_PEER executed failed in DataNode"); + } + switch (result.getTaskStatus()) { +case TASK_NOT_EXIST: + // coordinator crashed and lost its task table +case FAIL: + // maybe some DataNode crash + LOGGER.warn( + "result is {}, will use resetPeerList to clean in the future", + result.getTaskStatus()); + // List correctDataNodeLocations = + // + // env.getConfigManager().getPartitionManager().getAllReplicaSets().stream() + // .filter( + // tRegionReplicaSet -> + // + // tRegionReplicaSet.getRegionId().equals(consensusGroupId)) + //
Re: [PR] Region migration improvement [iotdb]
OneSizeFitsQuorum commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1525978921 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/consensus/request/ConfigPhysicalPlanType.java: ## @@ -53,15 +53,17 @@ public enum ConfigPhysicalPlanType { CreateRegionGroups((short) 300), DeleteRegionGroups((short) 301), GetRegionInfoList((short) 302), + @Deprecated UpdateRegionLocation((short) 303), Review Comment: consider compatibility ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/ProcedureManager.java: ## @@ -535,125 +537,136 @@ public boolean removeDataNode(RemoveDataNodePlan removeDataNodePlan) { return true; } - public TSStatus migrateRegion(TMigrateRegionReq migrateRegionReq) { -TConsensusGroupId regionGroupId; + // region region migration + + private TConsensusGroupId idToTConsensusGroupId(final int regionId) throws ProcedureException { Review Comment: regionIdToTConsensusGroupId ## integration-test/src/test/java/org/apache/iotdb/confignode/it/IoTDBRegionMigrateReliabilityIT.java: ## @@ -0,0 +1,428 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.confignode.it; + +import org.apache.iotdb.common.rpc.thrift.TConsensusGroupType; +import org.apache.iotdb.commons.client.sync.SyncConfigNodeIServiceClient; +import org.apache.iotdb.commons.concurrent.IoTDBThreadPoolFactory; +import org.apache.iotdb.commons.conf.IoTDBConstant; +import org.apache.iotdb.commons.utils.DataNodeKillPoints; +import org.apache.iotdb.confignode.procedure.state.AddRegionPeerState; +import org.apache.iotdb.confignode.procedure.state.RegionTransitionState; +import org.apache.iotdb.confignode.procedure.state.RemoveRegionPeerState; +import org.apache.iotdb.consensus.ConsensusFactory; +import org.apache.iotdb.db.queryengine.common.header.ColumnHeaderConstant; +import org.apache.iotdb.it.env.EnvFactory; +import org.apache.iotdb.it.env.cluster.node.AbstractNodeWrapper; +import org.apache.iotdb.it.env.cluster.node.ConfigNodeWrapper; +import org.apache.iotdb.itbase.exception.InconsistentDataException; + +import org.awaitility.Awaitility; +import org.awaitility.core.ConditionTimeoutException; +import org.junit.After; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Ignore; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.BufferedReader; +import java.io.File; +import java.io.IOException; +import java.io.InputStreamReader; +import java.sql.Connection; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Statement; +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentHashMap.KeySetView; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicReference; +import java.util.stream.Collectors; + +public class IoTDBRegionMigrateReliabilityIT { + private static final Logger LOGGER = + LoggerFactory.getLogger(IoTDBRegionMigrateReliabilityIT.class); + private static final String INSERTION = + "INSERT INTO root.sg.d1(timestamp,speed,temperature) values(100, 10.1, 20.7)"; + private static final String SHOW_REGIONS = "show regions"; + private static final String SHOW_DATANODES = "show datanodes"; + private static final String REGION_MIGRATE_COMMAND_FORMAT = "migrate region %d from %d to %d"; + + @Before + public void setUp() throws Exception { +EnvFactory.getEnv() +.getConfig() +.getCommonConfig() +.setDataRegionConsensusProtocolClass(ConsensusFactory.IOT_CONSENSUS) + .setSchemaRegionConsensusProtocolClass(ConsensusFactory.RATIS_CONSENSUS) +.setConfigNodeConsensusProtocolClass(ConsensusFactory.RATIS_CONSENSUS); + } + + @After + public void tearDown() throws InterruptedException { +EnvFactory.getEnv().cleanClusterEnvironment(); + } + + // region Normal
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1522954522 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/store/ProcedureFactory.java: ## Review Comment: need serDe test for new procedure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1522825858 ## iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/consensus/request/write/partition/UpdateRegionLocationPlan.java: ## Review Comment: Already deleted this file -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Region migration improvement [iotdb]
liyuheng5 commented on code in PR #12165: URL: https://github.com/apache/iotdb/pull/12165#discussion_r1522789551 ## integration-test/src/main/java/org/apache/iotdb/it/env/cluster/node/AbstractNodeWrapper.java: ## Review Comment: This optimization will merge in another pr -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] Region migration improvement [iotdb]
liyuheng5 opened a new pull request, #12165: URL: https://github.com/apache/iotdb/pull/12165 https://apache-iotdb.feishu.cn/docx/MC8gdgTgOoG9eYx9MIpcrtB9n4e -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@iotdb.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org