刘珍 created IOTDB-4020:
-------------------------
Summary: [MultiLeaderConsensus] After the followers are stopped and
restarted, the leader reports errors (some data writes fail)
Key: IOTDB-4020
URL: https://issues.apache.org/jira/browse/IOTDB-4020
Project: Apache IoTDB
Issue Type: Bug
Components: mpp-cluster
Affects Versions: 0.14.0-SNAPSHOT
Reporter: 刘珍
Assignee: Jinrui Zhang
Attachments: image-2022-08-02-16-13-34-186.png,
image-2022-08-02-16-14-14-823.png
master_0801_55b5b17
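Problem description
MultiLeaderConsensus, 3 replicas, 3C9D, 1 bm (benchmark) client writing concurrently. Stop 2 follower nodes one after the other, then start them again one after the other. The leader reports errors (some data writes fail). After bm finishes, run flush and query the data on the leader and on the 2 follower nodes that were taken down: the followers hold less data than the leader.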
2022-08-02 14:51:08,233 [20220802_065108_12919_3.1.0-145] ERROR o.a.i.c.c.s.SyncThriftClientWithErrorHandler:86 - Broken pipe error happened in calling method recv_sendPlanNode, we need to clear all previous cached connection, err: {}
org.apache.thrift.TException: Error in calling method receiveBase
at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:94)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.receiveBase(<generated>)
at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.recv_sendPlanNode(IDataNodeRPCService.java:307)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$recv_sendPlanNode$11(<generated>)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.recv_sendPlanNode(<generated>)
at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.sendPlanNode(IDataNodeRPCService.java:294)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$sendPlanNode$71(<generated>)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.sendPlanNode(<generated>)
at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchRemote(FragmentInstanceDispatcherImpl.java:173)
at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchOneInstance(FragmentInstanceDispatcherImpl.java:141)
at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchWriteSync(FragmentInstanceDispatcherImpl.java:121)
at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatch(FragmentInstanceDispatcherImpl.java:92)
at org.apache.iotdb.db.mpp.plan.scheduler.ClusterScheduler.start(ClusterScheduler.java:102)
at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.schedule(QueryExecution.java:258)
at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.start(QueryExecution.java:185)
at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:146)
at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:160)
at org.apache.iotdb.db.service.thrift.impl.ClientRPCServiceImpl.insertTablet(ClientRPCServiceImpl.java:977)
at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$insertTablet.getResult(IClientRPCService.java:3328)
at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$insertTablet.getResult(IClientRPCService.java:3308)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: Socket is closed by peer.
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:181)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.iotdb.rpc.TElasticFramedTransport.readFrame(TElasticFramedTransport.java:112)
at org.apache.iotdb.rpc.TElasticFramedTransport.read(TElasticFramedTransport.java:107)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:463)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:361)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:244)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$receiveBase$90(<generated>)
at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
... 31 common frames omitted
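For readers unfamiliar with the "clear all previous cached connection" step named in the log message, here is a minimal sketch of that general pattern: when a send fails with an I/O error, every idle connection cached for the same endpoint is dropped, on the assumption that the peer restarted and the siblings are stale as well. This is not IoTDB's client pool; ConnectionCacheSketch, Client, warm and idleCount are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of evict-on-failure; not IoTDB code. */
final class ConnectionCacheSketch {

    /** Hypothetical stand-in for a pooled RPC client. */
    interface Client {
        String send(String payload) throws IOException;
    }

    private final Map<String, Deque<Client>> idleByEndpoint = new HashMap<>();
    private final Function<String, Client> factory;

    ConnectionCacheSketch(Function<String, Client> factory) {
        this.factory = factory;
    }

    /** Pre-creates n idle clients, as a pool would hold after concurrent use. */
    synchronized void warm(String endpoint, int n) {
        Deque<Client> idle = idleByEndpoint.computeIfAbsent(endpoint, k -> new ArrayDeque<>());
        for (int i = 0; i < n; i++) {
            idle.push(factory.apply(endpoint));
        }
    }

    /** Sends once; on an I/O failure, drops every idle client cached for that endpoint. */
    synchronized String send(String endpoint, String payload) throws IOException {
        Deque<Client> idle = idleByEndpoint.computeIfAbsent(endpoint, k -> new ArrayDeque<>());
        Client client = idle.isEmpty() ? factory.apply(endpoint) : idle.pop();
        try {
            String reply = client.send(payload);
            idle.push(client); // healthy: return it to the cache
            return reply;
        } catch (IOException e) {
            // The peer probably restarted, so sibling connections are likely stale too.
            idle.clear();
            throw e; // the caller decides whether to retry the write
        }
    }

    synchronized int idleCount(String endpoint) {
        Deque<Client> idle = idleByEndpoint.get(endpoint);
        return idle == null ? 0 : idle.size();
    }
}
```

Note that the sketch only rethrows the failure. Whether the failed write is retried or reported to the client is a separate decision, and that decision is what determines whether data ends up missing.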
1. Reproduction steps
Private cloud: 172.20.70.2/3/4/5/13/14/16/18/19
benchmark runs on ip15 (connected to ip4)
ip4 is the leader
!image-2022-08-02-16-13-34-186.png!
!image-2022-08-02-16-14-14-823.png!
2. Start benchmark
3. Stop ip16, then stop ip14
4. Start ip16, then start ip14
5. The leader reports errors
6. After bm finishes, check the data
The leader has failed writes
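
The data check in step 6 can be sketched as a comparison of per-node row counts taken after flush. This is only an illustration: how the counts are obtained from each node is left out, ReplicaCountCheck and missingRows are hypothetical names, and the numbers in main are placeholders rather than measurements from this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for comparing replica row counts; not part of IoTDB. */
final class ReplicaCountCheck {

    /** Returns, for each lagging follower, how many rows it is missing relative to the leader. */
    static Map<String, Long> missingRows(long leaderCount, Map<String, Long> followerCounts) {
        Map<String, Long> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : followerCounts.entrySet()) {
            long gap = leaderCount - e.getValue();
            if (gap > 0) {
                missing.put(e.getKey(), gap);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder counts, not measurements from this issue.
        Map<String, Long> followers = new LinkedHashMap<>();
        followers.put("ip14", 900L);
        followers.put("ip16", 1000L);
        System.out.println(missingRows(1000L, followers));
    }
}
```

An empty result means every follower has at least as many rows as the leader. A non-empty result names the followers that did not catch up after the restart.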
--
This message was sent by Atlassian Jira
(v8.20.10#820010)