[
https://issues.apache.org/jira/browse/IOTDB-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643311#comment-17643311
]
Gaofei Cao commented on IOTDB-5073:
-----------------------------------
A possible reason: port 50010 is already in use.
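One way to test this hypothesis on the affected DataNode host is sketched below. It is a minimal check, not part of the original report: the helper name `port_in_use` is hypothetical, and it assumes bash (for the `/dev/tcp` pseudo-device) and that the listener accepts connections on 127.0.0.1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the issue): exits 0 if something accepts
# TCP connections on host:port, non-zero otherwise.
port_in_use() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device attempts a TCP connect; the subshell
  # closes the descriptor again when it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# 50010 is the port named in the comment above.
if port_in_use 127.0.0.1 50010; then
  echo "port 50010 is already in use"
else
  echo "port 50010 looks free"
fi
```

This only detects a listener reachable on the given address. A process bound to a different interface would need the check repeated against that address.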
> [rel/1.0] Internal error processing createSchemaRegion : NPE
> ------------------------------------------------------------
>
> Key: IOTDB-5073
> URL: https://issues.apache.org/jira/browse/IOTDB-5073
> Project: Apache IoTDB
> Issue Type: Bug
> Components: mpp-cluster
> Affects Versions: 1.0.0
> Reporter: 刘珍
> Assignee: Song Ziyang
> Priority: Major
> Attachments: config.properties
>
>
> rel/1.0 1128 61139d9
>
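> Test description:
> Start 3 ConfigNodes, 3 DataNodes, and 1 benchmark (BM) client (3C3D1BM). Every 10 minutes, scale out by 3 DataNodes and add 1 BM, ending at 3C21D7BM.
> BM configuration (all BMs write to a single database):
> Mixed read/write: OPERATION_PROPORTION=70:1:1:1:1:0:1:1:1:1:1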
> DEVICE_NUMBER=2000
> SENSOR_NUMBER=1000
> CLIENT_NUMBER=50
> DEVICE_NAME_PREFIX=d[${idx}]_
> About 20 minutes into the test, the DataNode on ip4 reported an ERROR:
> 2022-11-28 18:01:29,969 [pool-25-IoTDB-DataNodeInternalRPC-Processor-2] ERROR o.a.t.ProcessFunction:47 - Internal error processing createSchemaRegion
> java.lang.NullPointerException: null
> at org.apache.iotdb.db.service.thrift.impl.DataNodeRegionManager.createSchemaRegion(DataNodeRegionManager.java:127)
> at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.createSchemaRegion(DataNodeInternalRPCServiceImpl.java:383)
> at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$createSchemaRegion.getResult(IDataNodeRPCService.java:3807)
> at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$createSchemaRegion.getResult(IDataNodeRPCService.java:3787)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
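> Test environment: private cloud, phase 1
> ConfigNode consensus protocol: ratis
> SchemaRegion consensus protocol: ratis
> DataRegion consensus protocol: IoT
> Schema and data replication factor: 3
> 1. 21 DataNodes (172.16.2.2 ~ 22, 8C32GB)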
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> dn_data_dirs=data/datanode/data,/data1/iotdb/datanode/data
> 2. 3 ConfigNodes (172.16.2.23 ~ 25, 8C32GB)
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> 3. 7 BMs (172.16.2.26 ~ 32, 8C32GB)
> 4. Start 3C3D1BM
> Main test procedure:
> # Start the 3 ConfigNodes first
> exec 3<confignode.txt
> while read hostname <&3
> do
> echo $hostname
> start_confignode ${hostname}
> sleep 2
> done
> no_dn=0
> # Each loop iteration starts 3 DataNodes and 1 BM
> exec 2<datanode.txt
> exec 3<bm_node.txt
> for i in {1..7}
> do
> target_num_datanode=$((i*3))
> echo "target num datanode : ${target_num_datanode}"
> # start datanode
> while read d_host <&2
> do
> echo $d_host
> start_datanode ${d_host}
> let no_dn++
> if [[ ${no_dn} = "${target_num_datanode}" ]];then
> break
> fi
> done
> # start benchmark
> while read bm_host <&3
> do
> echo $bm_host
> start_bm ${bm_host} "1128_v1_test_${i}"
> break
> done
> sleep 10m
> ${cluster_dir}/${cur_cluster}/sbin/start-cli.sh -h ${get_res} -e "show storage group" > ${i}_v1_1128_mpp_info.out
> ${cluster_dir}/${cur_cluster}/sbin/start-cli.sh -h ${get_res} -e "show regions" >> ${i}_v1_1128_mpp_info.out
> ${cluster_dir}/${cur_cluster}/sbin/start-cli.sh -h ${get_res} -e "count devices" >> ${i}_v1_1128_mpp_info.out
> ${cluster_dir}/${cur_cluster}/sbin/start-cli.sh -h ${get_res} -e "count timeseries" >> ${i}_v1_1128_mpp_info.out
> done
--
This message was sent by Atlassian Jira
(v8.20.10#820010)