[
https://issues.apache.org/jira/browse/IOTDB-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
刘珍 reopened IOTDB-4851:
-----------------------
1124_cd839a4, 1 replica, 3C5D
Scaled in ip76
ip62 reports an error
> 1rep 3C 5D , remove 1 datanode , region migration failed(Unknown)
> -----------------------------------------------------------------
>
> Key: IOTDB-4851
> URL: https://issues.apache.org/jira/browse/IOTDB-4851
> Project: Apache IoTDB
> Issue Type: Bug
> Components: mpp-cluster
> Affects Versions: 0.14.0-SNAPSHOT
> Reporter: 刘珍
> Assignee: Gaofei Cao
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2022-11-04-14-40-54-764.png
>
>
> Version: m_1103_f857667
> {color:#DE350B}1 replica{color}, 3C5D
> schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.{color:#DE350B}MultiLeaderConsensus{color}
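> Scaled in by one node (ip76). The node was removed from the cluster successfully, but:
> {color:#DE350B}Problem 1: the data on the removed node was not migrated successfully{color}
> !image-2022-11-04-14-40-54-764.png!
> {color:#DE350B}Problem 2: the DataNode process on the removed node does not exit, and the DataNode log keeps printing:{color}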
> 2022-11-04 13:52:56,659 [7@group-000200000006-SegmentedRaftLogWorker] INFO
> o.a.r.s.r.s.SegmentedRaftLogWorker:345 -
> 7@group-000200000006-SegmentedRaftLogWorker was interrupted, exiting. There
> are 0 tasks remaining in the queue.
> 2022-11-04 13:52:56,659 [7@group-000200000001-SegmentedRaftLogWorker] INFO
> o.a.r.s.r.s.SegmentedRaftLogWorker:345 -
> 7@group-000200000001-SegmentedRaftLogWorker was interrupted, exiting. There
> are 0 tasks remaining in the queue.
> 2022-11-04 13:52:56,684 [7-impl-thread2] INFO
> o.a.r.s.r.s.SegmentedRaftLogWorker:255 -
> 7@group-000200000006-SegmentedRaftLogWorker close()
> 2022-11-04 13:52:56,701 [7-impl-thread3] INFO
> o.a.r.s.r.s.SegmentedRaftLogWorker:255 -
> 7@group-000200000001-SegmentedRaftLogWorker close()
> 2022-11-04 13:52:56,703 [JvmPauseMonitor0] INFO o.a.r.u.JvmPauseMonitor:111
> - JvmPauseMonitor-7: Stopped
> {color:#FF8B00}2022-11-04 13:52:56,705
> [pool-24-IoTDB-DataNodeInternalRPC-Processor-146] INFO
> o.a.i.c.s.ThriftService:158 - IoTDB: closing Multi Leader consensus
> Service...
> 2022-11-04 13:52:56,707 [pool-24-IoTDB-DataNodeInternalRPC-Processor-146]
> INFO o.a.i.c.s.ThriftService:165 - IoTDB: close Multi Leader consensus
> Service successfully
> 2022-11-04 13:52:56,707 [pool-24-IoTDB-DataNodeInternalRPC-Processor-146]
> INFO o.a.i.c.s.RegisterManager:67 - deregister all service. {color}
> 2022-11-04 13:53:23,771 [DataNodeInternalRPC-Service] ERROR
> o.a.t.s.TThreadPoolServer:144 - Shutdown is not done after 60SECONDS
> 2022-11-04 13:53:23,778 [MPPDataExchangeRPC-Service] ERROR
> o.a.t.s.TThreadPoolServer:144 - Shutdown is not done after 60SECONDS
> {color:#DE350B}2022-11-04 13:53:54,354 [Thread-0] WARN
> o.a.i.d.e.s.DataRegion:1848 - root.test.g_2-21 has spent 60s to wait for
> closing all TsFiles.
> 2022-11-04 13:54:54,354 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 120s to wait for closing all TsFiles.
> 2022-11-04 13:55:54,354 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 180s to wait for closing all TsFiles. {color}
> 2022-11-04 13:56:54,355 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 240s to wait for closing all TsFiles.
> 2022-11-04 13:57:54,355 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 300s to wait for closing all TsFiles.
> 2022-11-04 13:58:54,355 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 360s to wait for closing all TsFiles.
> 2022-11-04 13:59:54,356 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 420s to wait for closing all TsFiles.
> 2022-11-04 14:00:54,356 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 480s to wait for closing all TsFiles.
> 2022-11-04 14:01:54,357 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 540s to wait for closing all TsFiles.
> 2022-11-04 14:02:54,357 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 600s to wait for closing all TsFiles.
> 2022-11-04 14:03:54,357 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 660s to wait for closing all TsFiles.
> 2022-11-04 14:04:54,358 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 720s to wait for closing all TsFiles.
> 2022-11-04 14:05:54,358 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 780s to wait for closing all TsFiles.
> 2022-11-04 14:06:54,358 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 840s to wait for closing all TsFiles.
> 2022-11-04 14:07:54,358 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 900s to wait for closing all TsFiles.
> 2022-11-04 14:08:54,359 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 960s to wait for closing all TsFiles.
> 2022-11-04 14:09:54,359 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1020s to wait for closing all TsFiles.
> 2022-11-04 14:10:54,359 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1080s to wait for closing all TsFiles.
> 2022-11-04 14:11:54,360 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1140s to wait for closing all TsFiles.
> 2022-11-04 14:12:54,360 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1200s to wait for closing all TsFiles.
> 2022-11-04 14:13:54,360 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1260s to wait for closing all TsFiles.
> 2022-11-04 14:14:54,360 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1320s to wait for closing all TsFiles.
> 2022-11-04 14:15:54,361 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1380s to wait for closing all TsFiles.
> 2022-11-04 14:16:54,361 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -
> root.test.g_2-21 has spent 1440s to wait for closing all TsFiles.
> ……
> Test procedure
> 1. Start the cluster on 192.168.10.72~75
> ConfigNode 72,73,74
> MAX_HEAP_SIZE="8G"
> Common
> max_connection_for_internal_service=300
> query_timeout_threshold=3600000
> schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
> schema_replication_factor=1
> data_replication_factor=1
> DataNode
> MAX_HEAP_SIZE="256G"
> MAX_DIRECT_MEMORY_SIZE="32G"
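> 2. Write data with bm (benchmark)
> See the attachment for the configuration
> 3. After the write completes, with no other client operations running, scale in ip76
> Detailed logs are on the machine: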
> /data/mpp_test/m_1103_f857667
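
For anyone triaging the DataNode log for Problem 2, the repeated "has spent Ns to wait for closing all TsFiles" warnings can be summarized per region. The following is only a minimal sketch, assuming the log text looks like the excerpt quoted above (possibly hard-wrapped and prefixed with "> " by the mail client). The function name `longest_tsfile_wait` is hypothetical and not part of IoTDB.

```python
import re

# Matches the DataRegion warning from the log excerpt, e.g.
# "root.test.g_2-21 has spent 60s to wait for closing all TsFiles."
WAIT_PATTERN = re.compile(
    r"(\S+) has spent (\d+)s to wait for closing all TsFiles"
)


def longest_tsfile_wait(log_text):
    """Return {region_name: longest reported wait in seconds}.

    Hypothetical helper: it only understands the warning format shown
    in the quoted log and makes no other assumptions about IoTDB logs.
    """
    # Drop mail quote markers ("> ") at the start of each line.
    unquoted = re.sub(r"^>\s?", "", log_text, flags=re.MULTILINE)
    # Collapse hard line wraps so a warning split over two lines still matches.
    flattened = " ".join(unquoted.split())

    longest = {}
    for region, seconds in WAIT_PATTERN.findall(flattened):
        longest[region] = max(longest.get(region, 0), int(seconds))
    return longest


if __name__ == "__main__":
    sample = (
        "> 2022-11-04 13:53:54,354 [Thread-0] WARN\n"
        "> o.a.i.d.e.s.DataRegion:1848 - root.test.g_2-21 has spent 60s to wait for\n"
        "> closing all TsFiles.\n"
        "> 2022-11-04 13:54:54,354 [Thread-0] WARN o.a.i.d.e.s.DataRegion:1848 -\n"
        "> root.test.g_2-21 has spent 120s to wait for closing all TsFiles.\n"
    )
    print(longest_tsfile_wait(sample))
```

A region whose longest wait keeps growing between two runs over the same log file is still blocked on closing its TsFiles, which matches the hang described in Problem 2.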
--
This message was sent by Atlassian Jira
(v8.20.10#820010)