[ 
https://issues.apache.org/jira/browse/HDDS-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-7284:
---------------------------------
    Labels: pull-request-available  (was: )

> JVM crash for rocksdb for read/write after close
> ------------------------------------------------
>
>                 Key: HDDS-7284
>                 URL: https://issues.apache.org/jira/browse/HDDS-7284
>             Project: Apache Ozone
>          Issue Type: Bug
>         Environment: Ozone integration test (randomly observed)
> Stack: [0x000000030cea5000,0x000000030cfa5000], sp=0x000000030cfa3620, free space=1017k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [librocksdbjni2897401891378344213.jnilib+0x20cf9] _Z18rocksdb_put_helperP7JNIEnv_PN7rocksdb2DBERKNS1_12WriteOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayiiSA_ii+0x109
> j org.rocksdb.RocksDB.put(JJ[BII[BIIJ)V+0
> j org.rocksdb.RocksDB.put(Lorg/rocksdb/ColumnFamilyHandle;Lorg/rocksdb/WriteOptions;[B[B)V+23
> j org.apache.hadoop.hdds.utils.db.RocksDatabase.put(Lorg/apache/hadoop/hdds/utils/db/RocksDatabase$ColumnFamily;[B[B)V+25
> j org.apache.hadoop.hdds.utils.db.RDBTable.put([B[B)V+14
> This can be reproduced in isolation:
> 1. One thread keeps calling read / write on the DB.
> 2. The main thread closes the DB store.
>            Reporter: Sumit Agrawal
>            Assignee: Sumit Agrawal
>            Priority: Minor
>              Labels: pull-request-available
>
> During Ozone integration tests, the JVM is randomly observed to crash with a 
> RocksDB stack.
> The crash is observed when a Recon thread is processing an FCR/ICR report.
> Proposed solution:
> 1. On every DB access in RocksDatabase:
>    - check isClosed(); if closed, throw IOException
>    - increment a counter on method entry and decrement it on exit
> 2. On RocksDB close:
>    - set isClosed to true
>    - wait for the counter to reach 0, rechecking every millisecond
>    - in addition, force the close after 5 seconds
> This avoids a performance penalty because no lock is taken.
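
The counter-based proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual RocksDatabase change: the class name CounterGuardedDb, the DbOp interface, the access() wrapper, and the nativeClose callback are all hypothetical stand-ins. The sketch increments the counter before checking the closed flag, an ordering chosen here so that close() cannot miss an access that has already passed the check.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical lock-free guard around a closeable native DB handle. */
class CounterGuardedDb {
  /** Hypothetical callback type standing in for a RocksDB read or write. */
  interface DbOp<T> {
    T run() throws IOException;
  }

  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final AtomicLong inFlight = new AtomicLong();
  private final long forceCloseMillis;   // the issue suggests 5 seconds
  private final Runnable nativeClose;    // stand-in for closing the real DB

  CounterGuardedDb(long forceCloseMillis, Runnable nativeClose) {
    this.forceCloseMillis = forceCloseMillis;
    this.nativeClose = nativeClose;
  }

  /** Runs one DB operation, or throws IOException if the DB is closed. */
  <T> T access(DbOp<T> op) throws IOException {
    inFlight.incrementAndGet();          // counter increment on entry
    try {
      if (closed.get()) {
        throw new IOException("DB is closed");
      }
      return op.run();
    } finally {
      inFlight.decrementAndGet();        // counter decrement on exit
    }
  }

  /** Marks the DB closed, waits for in-flight operations, then closes. */
  void close() {
    if (!closed.compareAndSet(false, true)) {
      return;                            // already closed by another caller
    }
    long deadline = System.nanoTime()
        + TimeUnit.MILLISECONDS.toNanos(forceCloseMillis);
    // Recheck the counter every millisecond until it reaches 0 or the
    // force-close timeout expires.
    while (inFlight.get() > 0 && System.nanoTime() < deadline) {
      try {
        Thread.sleep(1);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    nativeClose.run();
  }
}
```

Note that if the force-close timeout fires while an operation is still running, that operation can still touch a closed handle, so the timeout trades safety for bounded shutdown time.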
> Alternative solution (may hurt performance due to frequent lock/unlock):
> 1. On every DB access:
>    - take a read lock
>    - check isClosed(); if closed, throw IOException
> 2. On RocksDB close:
>    - take a write lock
>    - set isClosed and close the DB
> This solution can become a performance bottleneck because of the frequent 
> read lock / unlock.
>   
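
For comparison, the lock-based alternative described in the issue might look like the sketch below. As before, LockGuardedDb, DbOp, access(), and nativeClose are hypothetical names used only for illustration. ReentrantReadWriteLock is one possible lock choice, which the issue itself does not specify.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical read/write-lock guard around a closeable native DB handle. */
class LockGuardedDb {
  /** Hypothetical callback type standing in for a RocksDB read or write. */
  interface DbOp<T> {
    T run() throws IOException;
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Runnable nativeClose;    // stand-in for closing the real DB
  private boolean closed;                // guarded by lock

  LockGuardedDb(Runnable nativeClose) {
    this.nativeClose = nativeClose;
  }

  /** Runs one DB operation under the read lock. */
  <T> T access(DbOp<T> op) throws IOException {
    lock.readLock().lock();              // many accessors may hold this at once
    try {
      if (closed) {
        throw new IOException("DB is closed");
      }
      return op.run();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Closes the DB under the write lock, after all readers have left. */
  void close() {
    lock.writeLock().lock();             // blocks until no read lock is held
    try {
      if (closed) {
        return;
      }
      closed = true;
      nativeClose.run();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Here close() can never overlap an in-flight access, at the cost of a read lock acquire and release on every DB call.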



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

