dombizita commented on PR #8613:
URL: https://github.com/apache/ozone/pull/8613#issuecomment-2974737589
@adoroszlai could you please help us? This test passes locally but fails on
CI: the datanode is not able to come back up after being stopped. I found
this in the DN logs:
```
2025-06-12 12:29:49,098 [ForkJoinPool.commonPool-worker-1] ERROR ozoneimpl.OzoneContainer: Load db store for HddsVolume /data/hdds/hdds failed
java.io.IOException: Can't init db instance under path /data/hdds/hdds/CID-30e32454-4a0d-413e-9242-916f8be902f5/DS-336374ce-4a59-4b63-b83a-5b18029927b0/container.db for volume DS-336374ce-4a59-4b63-b83a-5b18029927b0
    at org.apache.hadoop.ozone.container.common.volume.HddsVolume.loadDbStore(HddsVolume.java:446)
    at org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.loadVolume(HddsVolumeUtil.java:110)
    at org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.lambda$loadAllHddsVolumeDbStore$0(HddsVolumeUtil.java:96)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1796)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: java.io.IOException: Failed to create RDBStore from /data/hdds/hdds/CID-30e32454-4a0d-413e-9242-916f8be902f5/DS-336374ce-4a59-4b63-b83a-5b18029927b0/container.db
    at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:178)
    at org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:226)
    at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.initDBStore(AbstractDatanodeStore.java:96)
    at org.apache.hadoop.ozone.container.metadata.AbstractRDBStore.start(AbstractRDBStore.java:75)
    at org.apache.hadoop.ozone.container.metadata.AbstractRDBStore.<init>(AbstractRDBStore.java:56)
    at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.<init>(AbstractDatanodeStore.java:72)
    at org.apache.hadoop.ozone.container.metadata.DatanodeStoreWithIncrementalChunkList.<init>(DatanodeStoreWithIncrementalChunkList.java:53)
    at org.apache.hadoop.ozone.container.metadata.DatanodeStoreSchemaThreeImpl.<init>(DatanodeStoreSchemaThreeImpl.java:73)
    at org.apache.hadoop.ozone.container.keyvalue.helpers.BlockUtils.getUncachedDatanodeStore(BlockUtils.java:83)
    at org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil.initPerDiskDBStore(HddsVolumeUtil.java:73)
    at org.apache.hadoop.ozone.container.common.volume.HddsVolume.loadDbStore(HddsVolume.java:442)
    ... 9 more
Caused by: org.apache.hadoop.hdds.utils.db.RocksDatabaseException: IOError: class org.apache.hadoop.hdds.utils.db.RocksDatabase: Failed to open /data/hdds/hdds/CID-30e32454-4a0d-413e-9242-916f8be902f5/DS-336374ce-4a59-4b63-b83a-5b18029927b0/container.db
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.toRocksDatabaseException(RocksDatabase.java:111)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.open(RocksDatabase.java:180)
    at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:110)
    ... 19 more
Caused by: org.rocksdb.RocksDBException: While open a file for appending: /data/hdds/hdds/CID-30e32454-4a0d-413e-9242-916f8be902f5/DS-336374ce-4a59-4b63-b83a-5b18029927b0/container.db/LOG: Permission denied
    at org.rocksdb.RocksDB.open(Native Method)
    at org.rocksdb.RocksDB.open(RocksDB.java:307)
    at org.apache.hadoop.hdds.utils.db.managed.ManagedRocksDB.open(ManagedRocksDB.java:83)
    at org.apache.hadoop.hdds.utils.db.RocksDatabase.open(RocksDatabase.java:174)
    ... 20 more
```
I then found the following in the acceptance logs:
```
Using Docker Compose v2
Executing test ozonesecure-ha/test-debug-tools.sh
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/om2': Operation not permitted
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/dn3': Operation not permitted
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/scm2': Operation not permitted
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/om3': Operation not permitted
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/kms': Operation not permitted
chown: changing ownership of '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-2.1.0-SNAPSHOT/compose/ozonesecure-ha/data/dn1': Operation not permitted
...
```
This points to a problem with this part of the debug tools test:
https://github.com/apache/ozone/blob/5a3e4e79c375cebe0ba598b1133b5718c9fd4fda/hadoop-ozone/dist/src/main/compose/ozonesecure-ha/test-debug-tools.sh#L37-L47
The same approach is used in other acceptance tests as well. Do you see any
issue with how the debug tools test does it? Thanks in advance!
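
To check whether the failed chown leaves the data directories owned by a different user than the one running the test, something like the following could be run on the CI runner. This is only a sketch: the `list_owners` helper is hypothetical (it is not part of the Ozone scripts), the directory passed to it is an assumption, and it relies on GNU `stat`.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper, not part of the Ozone test scripts.
# Prints the numeric owner of each subdirectory of the given data directory
# and flags the ones that are not owned by the current user.
list_owners() {
  local data_dir="$1" me d owner
  me="$(id -u)"
  for d in "$data_dir"/*; do
    [ -d "$d" ] || continue            # skip plain files and unmatched globs
    owner="$(stat -c '%u' "$d")"       # GNU stat: numeric uid of the owner
    if [ "$owner" = "$me" ]; then
      echo "ok       uid=$owner $d"
    else
      echo "MISMATCH uid=$owner $d"
    fi
  done
}

# Run only when a directory is passed on the command line, for example
# the compose data directory (path is an assumption):
#   ./list-owners.sh compose/ozonesecure-ha/data
if [ "$#" -gt 0 ]; then
  list_owners "$1"
fi
```

A `MISMATCH` line for a directory such as `dn1` would be consistent with the `Permission denied` on `container.db/LOG` above, but it would not by itself prove that the chown failure is the cause.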
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]