[
https://issues.apache.org/jira/browse/HDDS-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832005#comment-17832005
]
Wei-Chiu Chuang commented on HDDS-10609:
----------------------------------------
It was supposed to be fixed by HBASE-27732. But we're seeing it again now.
> [HBase Ozone] HMaster aborts with "RENAME_OPEN_FILE
> org.apache.hadoop.ozone.om.exceptions.OMException"
> ------------------------------------------------------------------------------------------------------
>
> Key: HDDS-10609
> URL: https://issues.apache.org/jira/browse/HDDS-10609
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM
> Reporter: Pratyush Bhatt
> Priority: Major
>
> _HMaster_ continuously throws "RENAME_OPEN_FILE
> org.apache.hadoop.ozone.om.exceptions.OMException" for mostly a single
> file({color:#0747a6}_key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466_{color})
> for approx 18 minutes and then ABORTS.
> {code:java}
> 2024-03-27 10:51:43,369 ERROR
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem: rename key failed:
> Open file cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466.
>
> source:volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466,
>
> destin:volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466
> 2024-03-27 10:51:43,369 ERROR
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Failed log archiving
> for the log
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466,
> RENAME_OPEN_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Open file
> cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:756)
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.renameKey(OzoneManagerProtocolClientSideTranslatorPB.java:932)
> at
> org.apache.hadoop.ozone.client.rpc.RpcClient.renameKey(RpcClient.java:1612)
> at
> org.apache.hadoop.ozone.client.OzoneBucket.renameKey(OzoneBucket.java:645)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.rename(BasicRootedOzoneClientAdapterImpl.java:497)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.renameFSO(BasicRootedOzoneFileSystem.java:487)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.rename(BasicRootedOzoneFileSystem.java:400)
> at
> org.apache.hadoop.hbase.util.CommonFSUtils.renameAndSetModifyTime(CommonFSUtils.java:711)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archiveLogFile(AbstractFSWAL.java:785)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archive(AbstractFSWAL.java:750)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.lambda$cleanOldLogs$1(AbstractFSWAL.java:738)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2024-03-27 10:51:43,370 INFO
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Archiving
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> to
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> ~
> search hit BOTTOM, continuing at TOP
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2024-03-27 10:51:43,125 ERROR
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Failed log archiving
> for the log
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466,
> RENAME_OPEN_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Open file
> cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:756)
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.renameKey(OzoneManagerProtocolClientSideTranslatorPB.java:932)
> at
> org.apache.hadoop.ozone.client.rpc.RpcClient.renameKey(RpcClient.java:1612)
> at
> org.apache.hadoop.ozone.client.OzoneBucket.renameKey(OzoneBucket.java:645)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.rename(BasicRootedOzoneClientAdapterImpl.java:497)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.renameFSO(BasicRootedOzoneFileSystem.java:487)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.rename(BasicRootedOzoneFileSystem.java:400)
> at
> org.apache.hadoop.hbase.util.CommonFSUtils.renameAndSetModifyTime(CommonFSUtils.java:711)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archiveLogFile(AbstractFSWAL.java:785)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archive(AbstractFSWAL.java:750)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.lambda$cleanOldLogs$1(AbstractFSWAL.java:738)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2024-03-27 10:51:43,125 INFO
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Archiving
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> to
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> 2024-03-27 10:51:43,133 ERROR
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem: rename key failed:
> Open file cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466.
>
> source:volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466,
>
> destin:volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> 2024-03-27 10:51:43,133 ERROR
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Failed log archiving
> for the log
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466,
> RENAME_OPEN_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Open file
> cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:756)
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.renameKey(OzoneManagerProtocolClientSideTranslatorPB.java:932)
> at
> org.apache.hadoop.ozone.client.rpc.RpcClient.renameKey(RpcClient.java:1612)
> at
> org.apache.hadoop.ozone.client.OzoneBucket.renameKey(OzoneBucket.java:645)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.rename(BasicRootedOzoneClientAdapterImpl.java:497)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.renameFSO(BasicRootedOzoneFileSystem.java:487)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.rename(BasicRootedOzoneFileSystem.java:400)
> at
> org.apache.hadoop.hbase.util.CommonFSUtils.renameAndSetModifyTime(CommonFSUtils.java:711)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archiveLogFile(AbstractFSWAL.java:785)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archive(AbstractFSWAL.java:750)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.lambda$cleanOldLogs$1(AbstractFSWAL.java:738)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> /ABORTING
> at
> org.apache.hadoop.hbase.wal.AbstractWALRoller.run(AbstractWALRoller.java:240)
> 2024-03-27 10:51:43,196 ERROR
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem: rename key failed:
> Open file cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466.
>
> source:volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466,
>
> destin:volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466
> 2024-03-27 10:51:43,196 ERROR
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Failed log archiving
> for the log
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.root.comops.site%2C22001%2C1711528430307.1711534016466,
> RENAME_OPEN_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Open file
> cannot be renamed since it is hsync'ed: volumeName=volhbase1,
> bucketName=buckethbase1,
> key=hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.root.comops.site,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:756)
> at
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.renameKey(OzoneManagerProtocolClientSideTranslatorPB.java:932)
> at
> org.apache.hadoop.ozone.client.rpc.RpcClient.renameKey(RpcClient.java:1612)
> at
> org.apache.hadoop.ozone.client.OzoneBucket.renameKey(OzoneBucket.java:645)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.rename(BasicRootedOzoneClientAdapterImpl.java:497)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.renameFSO(BasicRootedOzoneFileSystem.java:487)
> at
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.rename(BasicRootedOzoneFileSystem.java:400)
> at
> org.apache.hadoop.hbase.util.CommonFSUtils.renameAndSetModifyTime(CommonFSUtils.java:711)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archiveLogFile(AbstractFSWAL.java:785)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.archive(AbstractFSWAL.java:750)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.lambda$cleanOldLogs$1(AbstractFSWAL.java:738)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2024-03-27 10:51:43,197 INFO
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Archiving
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/WALs/ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466
> to
> ofs://ozone1711476173/volhbase1/buckethbase1/hbase/MasterData/oldWALs/ccycloud-8.ozn-hb932chf3oz.xyz%2C22001%2C1711528430307.1711534016466
> {code}
> Then aborts
> {code:java}
> 016466
> 2024-03-27 10:51:43,199 ERROR org.apache.hadoop.hbase.master.HMaster: Master
> server abort: loaded coprocessors are:
> [org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor]
> 2024-03-27 10:51:43,199 ERROR org.apache.hadoop.hbase.master.HMaster: *****
> ABORTING master ccycloud-8.ozn-hb932chf3oz.xyz,22001,1711528430307: Log
> rolling failed *****
> java.lang.RuntimeException
> at
> org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.writeWALMetadata(AsyncProtobufLogWriter.java:217)
> at
> org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.writeMagicAndWALHeader(AsyncProtobufLogWriter.java:223)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:164)
> at
> org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:116)
> at
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:726)
> at
> org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:129)
> at
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:886)
> at
> org.apache.hadoop.hbase.wal.AbstractWALRoller$RollController.rollWal(AbstractWALRoller.java:304)
> at
> org.apache.hadoop.hbase.wal.AbstractWALRoller.run(AbstractWALRoller.java:211)
> 2024-03-27 10:51:43,200 INFO org.apache.ranger.plugin.util.PolicyRefresher:
> PolicyRefresher(serviceName=cm_hbase).run(): interrupted! Exiting thread
> java.lang.InterruptedException
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at
> org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:208)
> 2024-03-27 10:51:43,200 INFO
> org.apache.ranger.audit.provider.AuditProviderFactory: ==>
> JVMShutdownHook.run() {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]