[
https://issues.apache.org/jira/browse/HDDS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706169#comment-17706169
]
Siyao Meng commented on HDDS-8282:
----------------------------------
This looks to be caused by HDDS-7559. I suspect that, at the time of the
crash, the column family (CF) [handle in the
listener|https://github.com/apache/ozone/blob/428fe1de25c5598856c4e0414d758607e07b8f7e/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L447-L451]
is invalid.
However, the CF handle is [checked and
set|https://github.com/apache/ozone/blob/2a826133d681e3ed8a789c77b6c6f8c622e4c743/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBStore.java#L164-L168]
right after the OM DB is initialized, so I don't see any obvious cause for an
invalid handle. Could an OM reload or something similar invalidate the handle?
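
To illustrate the suspected failure mode, here is a minimal sketch of one way a listener callback could avoid touching a handle that has already been invalidated. It is not code from Ozone or RocksDB: the HandleGuard class and its ifOpen and close methods are hypothetical names, and the sketch assumes that the crash comes from a callback racing with the DB (and its CF handle) being closed, which has not been confirmed.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/**
 * Hypothetical guard, not part of Ozone or RocksDB.
 * A callback runs its native access through ifOpen(); whoever owns the
 * DB calls close() before releasing the native handles. close() waits
 * for in-flight callbacks, and later callbacks get the fallback value
 * instead of dereferencing a stale handle.
 */
final class HandleGuard {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private boolean closed = false; // guarded by lock

  /** Runs action only while the handle is still valid; otherwise returns fallback. */
  <T> T ifOpen(Supplier<T> action, T fallback) {
    lock.readLock().lock();
    try {
      return closed ? fallback : action.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  /**
   * Marks the handle invalid. Blocks until every in-flight ifOpen() call
   * has returned. Must not be called from inside an ifOpen() action,
   * because a read lock cannot be upgraded to a write lock.
   */
  void close() {
    lock.writeLock().lock();
    try {
      closed = true;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Under these assumptions, a callback such as onCompactionCompleted would wrap its table check in ifOpen and pick a safe fallback for the closed case, and the code path that closes or reloads the DB would call close() on the guard first.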
> [Snapshot] Intermittent DB crash in RocksDBCheckpointDiffer
> -----------------------------------------------------------
>
> Key: HDDS-8282
> URL: https://issues.apache.org/jira/browse/HDDS-8282
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: Snapshot
> Affects Versions: 1.4.0
> Reporter: Attila Doroszlai
> Assignee: Hemant Kumar
> Priority: Critical
> Labels: ozone-snapshot
>
> {code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/03/25/21082/it-om/output.log}
> [ERROR] Crashed tests:
> [ERROR] org.apache.hadoop.ozone.om.TestOzoneManagerHAWithData
> {code}
> {code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/03/25/21082/it-om/hs_err_pid122309.log}
> Current thread (0x00007f99644d1800): JavaThread "Thread-1929" [_thread_in_native, id=125898, stack(0x00007f995389a000,0x00007f995489a000)]
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007f998c3766c0
> ...
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j org.rocksdb.RocksDB.iteratorCF(JJ)J+0
> j org.rocksdb.RocksDB.newIterator(Lorg/rocksdb/ColumnFamilyHandle;)Lorg/rocksdb/RocksIterator;+14
> j org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.isSnapshotInfoTableEmpty(Lorg/rocksdb/RocksDB;)Z+24
> j org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.access$000(Lorg/apache/ozone/rocksdiff/RocksDBCheckpointDiffer;Lorg/rocksdb/RocksDB;)Z+2
> j org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer$2.onCompactionCompleted(Lorg/rocksdb/RocksDB;Lorg/rocksdb/CompactionJobInfo;)V+5
> j org.rocksdb.AbstractEventListener.onCompactionCompletedProxy(JLorg/rocksdb/CompactionJobInfo;)V+19
> {code}
> CC [~smeng], [~hemantk]