刘珍 created IOTDB-4020:
-------------------------

             Summary: [MultiLeaderConsensus] Stop the followers and then restart them; the leader reports errors (some data writes fail)
                 Key: IOTDB-4020
                 URL: https://issues.apache.org/jira/browse/IOTDB-4020
             Project: Apache IoTDB
          Issue Type: Bug
          Components: mpp-cluster
    Affects Versions: 0.14.0-SNAPSHOT
            Reporter: 刘珍
            Assignee: Jinrui Zhang
         Attachments: image-2022-08-02-16-13-34-186.png, image-2022-08-02-16-14-14-823.png

master_0801_55b5b17
Problem description
 
MultiLeaderConsensus, 3 replicas, 3C9D (3 ConfigNodes, 9 DataNodes), 1 benchmark client writing concurrently. Stop 2 follower nodes in sequence, then restart them in sequence; the leader reports errors (some data writes fail). After the benchmark finishes, flush, then query the data on the leader and on the 2 followers that were stopped: the followers hold less data than the leader.
2022-08-02 14:51:08,233 [20220802_065108_12919_3.1.0-145] ERROR o.a.i.c.c.s.SyncThriftClientWithErrorHandler:86 - Broken pipe error happened in calling method recv_sendPlanNode, we need to clear all previous cached connection, err: {}
org.apache.thrift.TException: Error in calling method receiveBase
        at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:94)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.receiveBase(<generated>)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.recv_sendPlanNode(IDataNodeRPCService.java:307)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$recv_sendPlanNode$11(<generated>)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
        at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.recv_sendPlanNode(<generated>)
        at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Client.sendPlanNode(IDataNodeRPCService.java:294)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$sendPlanNode$71(<generated>)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
        at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.sendPlanNode(<generated>)
        at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchRemote(FragmentInstanceDispatcherImpl.java:173)
        at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchOneInstance(FragmentInstanceDispatcherImpl.java:141)
        at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatchWriteSync(FragmentInstanceDispatcherImpl.java:121)
        at org.apache.iotdb.db.mpp.plan.scheduler.FragmentInstanceDispatcherImpl.dispatch(FragmentInstanceDispatcherImpl.java:92)
        at org.apache.iotdb.db.mpp.plan.scheduler.ClusterScheduler.start(ClusterScheduler.java:102)
        at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.schedule(QueryExecution.java:258)
        at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.start(QueryExecution.java:185)
        at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:146)
        at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:160)
        at org.apache.iotdb.db.service.thrift.impl.ClientRPCServiceImpl.insertTablet(ClientRPCServiceImpl.java:977)
        at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$insertTablet.getResult(IClientRPCService.java:3328)
        at org.apache.iotdb.service.rpc.thrift.IClientRPCService$Processor$insertTablet.getResult(IClientRPCService.java:3308)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: Socket is closed by peer.
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:181)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
        at org.apache.iotdb.rpc.TElasticFramedTransport.readFrame(TElasticFramedTransport.java:112)
        at org.apache.iotdb.rpc.TElasticFramedTransport.read(TElasticFramedTransport.java:107)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:463)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:361)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:244)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be.CGLIB$receiveBase$90(<generated>)
        at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$d29f64be$$FastClassByCGLIB$$f64335b4.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
        at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.intercept(SyncThriftClientWithErrorHandler.java:55)
        ... 31 common frames omitted

1. Reproduction setup
Private cloud nodes: 172.20.70.2/3/4/5/13/14/16/18/19
benchmark runs on ip15 (connected to ip4)
ip4 is the leader
 !image-2022-08-02-16-13-34-186.png! 
 !image-2022-08-02-16-14-14-823.png! 
2. Start the benchmark

3. Stop ip16, then stop ip14

4. Start ip16, then start ip14

5. The leader reports errors

6. After the benchmark finishes, check the data
The leader has write failures
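The check in step 6 boils down to comparing per-node row counts after the flush. A minimal sketch of that comparison is below; the concrete counts are placeholders (assumptions), and in a real run they would come from issuing the same count query (e.g. `select count(*) from root.**` via the IoTDB CLI) against the leader and against each restarted follower.

```shell
# Hypothetical data-loss check: the counts are placeholder values, not from
# this report. In practice, fill them from the same count query run against
# the leader (ip4) and a restarted follower (ip14 or ip16).
leader_count=1000000      # rows visible on the leader
follower_count=999200     # rows visible on a restarted follower

if [ "$follower_count" -lt "$leader_count" ]; then
  missing=$((leader_count - follower_count))
  echo "follower is missing $missing rows compared to the leader"
fi
```

A follower count lower than the leader's after flush is what this issue observed.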



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
