adoroszlai opened a new pull request, #4335:
URL: https://github.com/apache/ozone/pull/4335

   ## What changes were proposed in this pull request?
   
   Avoid calling response observer's `onCompleted()` after `onError()`.
   
   Steps to reproduce:
   
   1. Change `BlockManagerImpl.persistPutBlock` to throw some NPE
   2. Start cluster with 5 nodes
   3. Create EC key: `ozone freon ockg -n1 -t1 --replication-type EC 
--replication rs-3-2-1024k -s100`
   
   Result in datanode log:
   
   ```
   2023-03-01 17:54:22,957 [ChunkReader-1] ERROR server.GrpcXceiverService: Got 
exception when processing ContainerCommandRequestProto cmdType: PutBlock
   traceID: ""
   containerID: 1
   datanodeUuid: "1c284e9e-d128-4c95-bded-7a1d75eecd30"
   putBlock {
     blockData {
       blockID {
         containerID: 1
         localID: 111677748019200001
         blockCommitSequenceId: 0
         replicaIndex: 2
       }
       metadata {
         key: "TYPE"
         value: "KEY"
       }
       metadata {
         key: "blockGroupLen"
         value: "100"
       }
     }
     eof: true
   }
   
   java.lang.NullPointerException: testing
        at 
org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.persistPutBlock(BlockManagerImpl.java:153)
        at 
org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(BlockManagerImpl.java:103)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:539)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:243)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:219)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:171)
        at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:170)
        at 
org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:57)
        at 
org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:50)
        at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
        at 
org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
        at 
org.apache.hadoop.hdds.tracing.GrpcServerInterceptor$1.onMessage(GrpcServerInterceptor.java:49)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:332)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:315)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:834)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
   Mar 01, 2023 5:54:22 PM 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor run
   SEVERE: Exception while executing runnable 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@31454226
   java.lang.IllegalStateException: call already closed
        at 
org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:502)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl.closeInternal(ServerCallImpl.java:218)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl.close(ServerCallImpl.java:211)
        at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onCompleted(ServerCalls.java:395)
        at 
org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onCompleted(GrpcXceiverService.java:76)
        at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onHalfClose(ServerCalls.java:273)
        at 
org.apache.ratis.thirdparty.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
        at 
org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
        at 
org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   https://issues.apache.org/jira/browse/HDDS-8019
   
   ## How was this patch tested?
   
   Same repro steps.  Result in datanode: same as previously, without the 
`IllegalStateException`.
   
   ```
   2023-03-01 20:08:50,269 [ChunkReader-1] ERROR server.GrpcXceiverService: Got 
exception when processing ContainerCommandRequestProto cmdType: PutBlock
   traceID: ""
   containerID: 1
   datanodeUuid: "2a1cdc3b-a7ae-419d-bf22-0ecb7b011516"
   putBlock {
     blockData {
       blockID {
         containerID: 1
         localID: 111677748019200001
         blockCommitSequenceId: 0
         replicaIndex: 3
       }
       metadata {
         key: "TYPE"
         value: "KEY"
       }
       metadata {
         key: "blockGroupLen"
         value: "100"
       }
     }
     eof: true
   }
   
   java.lang.NullPointerException: testing
        at 
org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.persistPutBlock(BlockManagerImpl.java:153)
        at 
org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(BlockManagerImpl.java:103)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:539)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:243)
        at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:219)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:171)
        at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
        at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:170)
        at 
org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:57)
        at 
org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:50)
        at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
        at 
org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
        at 
org.apache.hadoop.hdds.tracing.GrpcServerInterceptor$1.onMessage(GrpcServerInterceptor.java:49)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:332)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:315)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:834)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/4307537351


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to