Kohei Sugihara created HDDS-7596:
------------------------------------

             Summary: OM crashes with NullPointerException while handling 
S3MultipartUploadCompleteResponseWithFSO
                 Key: HDDS-7596
                 URL: https://issues.apache.org/jira/browse/HDDS-7596
             Project: Apache Ozone
          Issue Type: Bug
          Components: OM
    Affects Versions: 1.3.0
            Reporter: Kohei Sugihara
            Assignee: Kohei Sugihara


OM crashes with a NullPointerException while processing a MultipartComplete 
request against an FSO-enabled bucket, due to a corner case. The 
S3MultipartUploadCompleteResponseWithFSO handler dereferences a bucketInfo 
object to get the bucket's objectId, but in the special case where 
usedBytesDiff = 0 the bucketInfo is set to null. Even with OM-HA enabled, the 
crash propagates to all remaining OMs, and the entire service becomes 
unavailable.
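
The failure pattern described above can be sketched in isolation. This is a 
minimal illustration, not the Ozone code: BucketInfo, bucketInfoForResponse, 
unsafeBucketId and safeBucketId are hypothetical stand-ins, and the null guard 
is only one possible way to avoid the dereference, not necessarily the fix 
adopted for this issue.

```java
import java.util.Optional;

public class BucketInfoNullSketch {

    // Hypothetical stand-in for OmBucketInfo; only the object ID matters here.
    static final class BucketInfo {
        private final long objectId;

        BucketInfo(long objectId) {
            this.objectId = objectId;
        }

        long getObjectID() {
            return objectId;
        }
    }

    // Assumption: models the corner case from the report, where the response
    // carries no bucket info when the used-bytes difference is zero.
    static BucketInfo bucketInfoForResponse(BucketInfo bucket, long usedBytesDiff) {
        return usedBytesDiff == 0 ? null : bucket;
    }

    // Mirrors the crashing access: throws NullPointerException when info is null.
    static long unsafeBucketId(BucketInfo info) {
        return info.getObjectID();
    }

    // Null-safe variant: the caller must handle the "no bucket info" case.
    static Optional<Long> safeBucketId(BucketInfo info) {
        return Optional.ofNullable(info).map(BucketInfo::getObjectID);
    }

    public static void main(String[] args) {
        BucketInfo bucket = new BucketInfo(42L);
        BucketInfo fromResponse = bucketInfoForResponse(bucket, 0L);

        System.out.println("safe lookup present: " + safeBucketId(fromResponse).isPresent());
        try {
            unsafeBucketId(fromResponse);
        } catch (NullPointerException e) {
            System.out.println("unsafe lookup threw NullPointerException");
        }
    }
}
```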

Restarting the OM process does not help. The MultipartComplete request was 
already flushed to the Raft log before the crash, so on restart OM replays the 
flushed Raft log and processes the same MultipartComplete request again, 
which results in a crash loop.
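
Why a restart cannot recover can be shown with a toy replay loop. This is a 
sketch under stated assumptions: the log is modelled as a plain list of 
strings, and replay and countFailedRestarts are hypothetical helpers, not 
Ratis or Ozone APIs. Because the failing entry is already persisted and 
applying it fails deterministically, every simulated restart stops at the same 
entry.

```java
import java.util.Arrays;
import java.util.List;

public class ReplayCrashLoopSketch {

    // Hypothetical marker for the persisted request that triggers the bug.
    static final String POISON = "MultipartComplete(usedBytesDiff=0)";

    // Applies every persisted entry in order, as a restart would.
    // Throws on the poison entry, modelling the NullPointerException.
    static int replay(List<String> persistedLog) {
        int applied = 0;
        for (String entry : persistedLog) {
            if (POISON.equals(entry)) {
                throw new NullPointerException("bucketInfo is null");
            }
            applied++;
        }
        return applied;
    }

    // Simulates several restarts against the same persisted log and
    // counts how many of them crash during replay.
    static int countFailedRestarts(List<String> persistedLog, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                replay(persistedLog);
            } catch (NullPointerException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("CreateKey", POISON, "CreateKey");
        System.out.println("failed restarts: " + countFailedRestarts(log, 3));
    }
}
```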

{{2022-12-02 20:48:24,809 [OMDoubleBufferFlushThread] ERROR org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with exit status 2: OMDoubleBuffer flush thread OMDoubleBufferFlushThread encountered Throwable error}}
{{java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.ozone.om.helpers.OmBucketInfo.getObjectID()" because the return value of "org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponseWithFSO.getOmBucketInfo()" is null}}
{{        at org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponseWithFSO.addToKeyTable(S3MultipartUploadCompleteResponseWithFSO.java:90)}}
{{        at org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponse.addToDBBatch(S3MultipartUploadCompleteResponse.java:101)}}
{{        at org.apache.hadoop.ozone.om.response.OMClientResponse.checkAndUpdateDB(OMClientResponse.java:73)}}
{{        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$2(OzoneManagerDoubleBuffer.java:283)}}
{{        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatchWithTrace(OzoneManagerDoubleBuffer.java:228)}}
{{        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$1(OzoneManagerDoubleBuffer.java:281)}}
{{        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)}}
{{        at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:277)}}
{{        at java.base/java.lang.Thread.run(Thread.java:833)}}


