[
https://issues.apache.org/jira/browse/HDDS-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Attila Doroszlai resolved HDDS-7596.
------------------------------------
Fix Version/s: 1.4.0
Resolution: Fixed
> OM crashes with NullPointerException while handling
> S3MultipartUploadCompleteResponseWithFSO
> --------------------------------------------------------------------------------------------
>
> Key: HDDS-7596
> URL: https://issues.apache.org/jira/browse/HDDS-7596
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM
> Affects Versions: 1.3.0
> Reporter: Kohei Sugihara
> Assignee: Kohei Sugihara
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> OM crashes with a NullPointerException while processing a MultipartComplete
> request against an FSO-enabled bucket, due to a corner case.
> The S3MultipartUploadCompleteResponseWithFSO handler reads the bucketInfo
> object to get the bucket's objectId, but in the special case where
> usedBytesDiff = 0, bucketInfo is set to null. Even with OM HA enabled, the
> crash propagates to all the remaining OMs, and the entire service becomes
> unavailable.
> Restarting the OM process does not help: the MultipartComplete request was
> already flushed to the Raft log before the crash, so on restart the OM
> replays the flushed log, processes the same MultipartComplete request again,
> and crashes again, resulting in a crash loop.
> {{2022-12-02 20:48:24,809 [OMDoubleBufferFlushThread] ERROR
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with
> exit status 2: OMDoubleBuffer flush thread OMDoubleBufferFlushThread
> encountered Throwable error}}
> {{java.lang.NullPointerException: Cannot invoke
> "org.apache.hadoop.ozone.om.helpers.OmBucketInfo.getObjectID()" because the
> return value of
> "org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponseWithFSO.getOmBucketInfo()"
> is null}}
> {{ at
> org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponseWithFSO.addToKeyTable(S3MultipartUploadCompleteResponseWithFSO.java:90)}}
> {{ at
> org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCompleteResponse.addToDBBatch(S3MultipartUploadCompleteResponse.java:101)}}
> {{ at
> org.apache.hadoop.ozone.om.response.OMClientResponse.checkAndUpdateDB(OMClientResponse.java:73)}}
> {{ at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$2(OzoneManagerDoubleBuffer.java:283)}}
> {{ at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.addToBatchWithTrace(OzoneManagerDoubleBuffer.java:228)}}
> {{ at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$1(OzoneManagerDoubleBuffer.java:281)}}
> {{ at
> java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)}}
> {{ at
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:277)}}
> {{ at java.base/java.lang.Thread.run(Thread.java:833)}}
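
The failure mode described in the report can be illustrated with a minimal, self-contained sketch. This is not the Ozone code and not necessarily how the fix was implemented: BucketInfo, bucketInfoForResponse, unguardedBucketId, guardedBucketId, and the fallback bucket ID are all hypothetical names, assumed here only to show how a null bucketInfo in the usedBytesDiff = 0 case leads to the NullPointerException, and how a null check with an independently carried bucket ID would avoid it.

```java
public class BucketInfoNullGuardSketch {

    /** Hypothetical stand-in for OmBucketInfo; only the object ID is modeled. */
    static final class BucketInfo {
        private final long objectId;

        BucketInfo(long objectId) {
            this.objectId = objectId;
        }

        long getObjectID() {
            return objectId;
        }
    }

    /**
     * Models the corner case from the report (an assumption about its shape):
     * when usedBytesDiff is 0, the response carries no bucket info (null).
     */
    static BucketInfo bucketInfoForResponse(BucketInfo bucket, long usedBytesDiff) {
        return usedBytesDiff == 0 ? null : bucket;
    }

    /** Unguarded lookup: throws NullPointerException when bucketInfo is null. */
    static long unguardedBucketId(BucketInfo bucketInfo) {
        return bucketInfo.getObjectID();
    }

    /** Guarded lookup: falls back to a bucket ID that is carried separately. */
    static long guardedBucketId(BucketInfo bucketInfo, long fallbackBucketId) {
        return bucketInfo != null ? bucketInfo.getObjectID() : fallbackBucketId;
    }

    public static void main(String[] args) {
        BucketInfo bucket = new BucketInfo(1001L);
        // usedBytesDiff = 0 reproduces the null bucketInfo from the report.
        BucketInfo fromResponse = bucketInfoForResponse(bucket, 0L);

        try {
            unguardedBucketId(fromResponse);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup failed with NullPointerException");
        }

        System.out.println("guarded lookup: " + guardedBucketId(fromResponse, 1001L));
    }
}
```

In this sketch the unguarded lookup fails every time it is re-run on the same input, which mirrors why replaying the same request from the Raft log keeps crashing the OM until the code path itself handles the null case.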