Sumit Agrawal created HDDS-7284:
-----------------------------------

             Summary: JVM crash for rocksdb for read/write after close
                 Key: HDDS-7284
                 URL: https://issues.apache.org/jira/browse/HDDS-7284
             Project: Apache Ozone
          Issue Type: Bug
         Environment: Ozone integration test (randomly observed)

Stack: [0x000000030cea5000,0x000000030cfa5000], sp=0x000000030cfa3620, free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [librocksdbjni2897401891378344213.jnilib+0x20cf9] _Z18rocksdb_put_helperP7JNIEnv_PN7rocksdb2DBERKNS1_12WriteOptionsEPNS1_18ColumnFamilyHandleEP11_jbyteArrayiiSA_ii+0x109
j org.rocksdb.RocksDB.put(JJ[BII[BIIJ)V+0
j org.rocksdb.RocksDB.put(Lorg/rocksdb/ColumnFamilyHandle;Lorg/rocksdb/WriteOptions;[B[B)V+23
j org.apache.hadoop.hdds.utils.db.RocksDatabase.put(Lorg/apache/hadoop/hdds/utils/db/RocksDatabase$ColumnFamily;[B[B)V+25
j org.apache.hadoop.hdds.utils.db.RDBTable.put([B[B)V+14

This can be reproduced in isolation:
1. One thread keeps calling read / write on the DB.
2. The main thread closes the DB store.

            Reporter: Sumit Agrawal


During Ozone integration tests, the JVM is randomly observed to crash with a
RocksDB native stack.
The crash is seen when a Recon thread is processing an FCR/ICR report.

Proposed solution:
1. On every DB access in RocksDatabase:
   - check isClosed(); if the DB is closed, throw an IOException
   - increment a counter on method entry and decrement it on method exit
2. On RocksDB close:
   - set isClosed to true
   - wait for the counter to reach 0, rechecking every millisecond
   - as a fallback, force the close after 5 seconds

This avoids taking a lock on the access path, so it should perform better.
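
The counter-based proposal above can be sketched roughly as follows. This is
only an illustration, not the RocksDatabase code: the class ClosableDbGuard,
its method names, and the nativeClose callback are hypothetical. One
assumption differs from the literal order in the list: the counter is
incremented before the closed flag is checked, so that close() cannot see a
count of 0 while an operation that already passed the check is about to start.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical lock-free guard around a closable native DB handle. */
final class ClosableDbGuard {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final AtomicLong inFlight = new AtomicLong(0);

  /** Call on entry to every DB access method. */
  void enter() throws IOException {
    // Increment first, then check the flag (assumption, see lead-in).
    inFlight.incrementAndGet();
    if (closed.get()) {
      inFlight.decrementAndGet();
      throw new IOException("DB is closed");
    }
  }

  /** Call in a finally block on exit from every DB access method. */
  void exit() {
    inFlight.decrementAndGet();
  }

  /**
   * Marks the DB closed, waits for in-flight operations to drain, then runs
   * nativeClose. Returns false if the close was forced after the timeout.
   */
  boolean close(long timeoutMillis, Runnable nativeClose) {
    if (!closed.compareAndSet(false, true)) {
      return true; // already closed by another caller
    }
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    boolean drained = true;
    while (inFlight.get() > 0) {
      if (System.nanoTime() - deadline >= 0) {
        drained = false; // timeout reached: force the close
        break;
      }
      try {
        Thread.sleep(1); // recheck every millisecond
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        drained = false;
        break;
      }
    }
    nativeClose.run();
    return drained;
  }
}
```

A DB access method would call enter() first, do the RocksDB call inside a
try block, and call exit() in finally. Note that a forced close (return value
false) can still race with an operation that has not finished, so the timeout
only bounds how long close() waits; it does not make the forced path safe.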

Alternate solution (expected to perform worse because of frequent lock/unlock):
1. On every DB access:
   - take a read lock
   - check isClosed(); if the DB is closed, throw an IOException
2. On RocksDB close:
   - take a write lock
   - set isClosed and close the DB
The read lock / unlock on every access can become a performance bottleneck.
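
For comparison, the lock-based alternative might look like the sketch below.
Again, LockedDbGuard, DbCall, and the nativeClose callback are hypothetical
names chosen for illustration; only ReentrantReadWriteLock is a real JDK
class. Readers share the read lock, so concurrent accesses do not block each
other, but close() blocks until every in-flight access has released it.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical read/write-lock guard around a closable native DB handle. */
final class LockedDbGuard {
  /** A DB operation that may throw IOException. */
  interface DbCall<T> {
    T call() throws IOException;
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private boolean closed = false; // guarded by lock

  /** Runs op under the read lock, or throws if the DB is already closed. */
  <T> T access(DbCall<T> op) throws IOException {
    lock.readLock().lock();
    try {
      if (closed) {
        throw new IOException("DB is closed");
      }
      return op.call();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Waits for all readers to finish, then closes the DB exactly once. */
  void close(Runnable nativeClose) {
    lock.writeLock().lock();
    try {
      if (!closed) {
        closed = true;
        nativeClose.run();
      }
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

This variant never closes the DB under a running operation, at the cost of a
read lock acquire and release on every access.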



  




