[ 
https://issues.apache.org/jira/browse/HDDS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706169#comment-17706169
 ] 

Siyao Meng commented on HDDS-8282:
----------------------------------

This looks to be caused by HDDS-7559. I suspect that at the time of the crash the 
[column family handle used in the listener|https://github.com/apache/ozone/blob/428fe1de25c5598856c4e0414d758607e07b8f7e/hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java#L447-L451]
 is invalid.

However, the CF handle is [checked and set|https://github.com/apache/ozone/blob/2a826133d681e3ed8a789c77b6c6f8c622e4c743/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBStore.java#L164-L168]
 right after the OM DB is initialized, so I don't see any apparent cause of the 
invalid handle. Could an OM reload or something similar invalidate the handle?
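
If the handle can indeed be invalidated while a compaction callback is still running, one possible way to guard against it is sketched below. This is only an illustration under that assumption, not the actual fix: GuardedHandle and withHandle are hypothetical names that exist in neither Ozone nor RocksDB, and the handle type is left generic so the sketch runs without the RocksDB JNI library.

```java
import java.util.Optional;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/**
 * Hypothetical wrapper (not part of Ozone or RocksDB) that lets a
 * listener use a handle only while it is still valid.
 */
final class GuardedHandle<H> implements AutoCloseable {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private H handle; // set to null once the owner invalidates it

  GuardedHandle(H handle) {
    this.handle = handle;
  }

  /**
   * Runs the action against the handle under the read lock.
   * Returns Optional.empty() if the handle was already invalidated
   * (or if the action itself returned null).
   */
  <R> Optional<R> withHandle(Function<H, R> action) {
    lock.readLock().lock();
    try {
      if (handle == null) {
        return Optional.empty();
      }
      return Optional.ofNullable(action.apply(handle));
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Invalidates the handle; blocks until in-flight readers finish. */
  @Override
  public void close() {
    lock.writeLock().lock();
    try {
      handle = null;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

With something like this, the listener would do its emptiness check inside withHandle(...) and skip the work when it gets an empty result, while whatever closes or reloads the DB would call close() first, before the native handle is released.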

> [Snapshot] Intermittent DB crash in RocksDBCheckpointDiffer
> -----------------------------------------------------------
>
>                 Key: HDDS-8282
>                 URL: https://issues.apache.org/jira/browse/HDDS-8282
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Snapshot
>    Affects Versions: 1.4.0
>            Reporter: Attila Doroszlai
>            Assignee: Hemant Kumar
>            Priority: Critical
>              Labels: ozone-snapshot
>
> {code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/03/25/21082/it-om/output.log}
> [ERROR] Crashed tests:
> [ERROR] org.apache.hadoop.ozone.om.TestOzoneManagerHAWithData
> {code}
> {code:title=https://github.com/adoroszlai/ozone-build-results/blob/master/2023/03/25/21082/it-om/hs_err_pid122309.log}
> Current thread (0x00007f99644d1800):  JavaThread "Thread-1929" [_thread_in_native, id=125898, stack(0x00007f995389a000,0x00007f995489a000)]
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007f998c3766c0
> ...
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.rocksdb.RocksDB.iteratorCF(JJ)J+0
> j  org.rocksdb.RocksDB.newIterator(Lorg/rocksdb/ColumnFamilyHandle;)Lorg/rocksdb/RocksIterator;+14
> j  org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.isSnapshotInfoTableEmpty(Lorg/rocksdb/RocksDB;)Z+24
> j  org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer.access$000(Lorg/apache/ozone/rocksdiff/RocksDBCheckpointDiffer;Lorg/rocksdb/RocksDB;)Z+2
> j  org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer$2.onCompactionCompleted(Lorg/rocksdb/RocksDB;Lorg/rocksdb/CompactionJobInfo;)V+5
> j  org.rocksdb.AbstractEventListener.onCompactionCompletedProxy(JLorg/rocksdb/CompactionJobInfo;)V+19
> {code}
> CC [~smeng], [~hemantk]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
