anishshri-db opened a new pull request, #40981:
URL: https://github.com/apache/spark/pull/40981

   ### What changes were proposed in this pull request?
   Add RocksDB state store memory management enhancements
   
   This change does the following:
   
   - remove use of writeBatchWithIndex
   - move towards using native RocksDB operations
   - remove use of the RocksDB WAL
   - add support for bounding memory usage across all RocksDB state store 
instances on an executor using the write buffer manager
   
   ### Why are the changes needed?
   Today, when RocksDB is used as the state store provider, memory usage when 
writing through writeBatch is not capped. In addition, the state store 
coordinator can create multiple RocksDB instances on a single node without 
enforcing a global limit on native memory usage. Together, these gaps can lead 
to OOM errors and task failures.
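
As a rough illustration of why a shared limit helps, here is a toy model of the idea. It is not the RocksDB API and not the code in this PR: `SharedWriteBudget`, `StoreInstance`, and the byte sizes are hypothetical names and numbers chosen for the sketch. Every instance charges its buffered writes to one budget, and the instance whose write pushes the total over the limit flushes its own buffer.

```python
class SharedWriteBudget:
    """Hypothetical stand-in for one write buffer budget shared per executor."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def adjust(self, delta_bytes):
        # Positive when an instance buffers data, negative when it flushes.
        self.used_bytes += delta_bytes

    def over_limit(self):
        return self.used_bytes > self.limit_bytes


class StoreInstance:
    """Hypothetical state store instance that buffers writes in memory."""

    def __init__(self, name, budget):
        self.name = name
        self.budget = budget
        self.buffer = {}
        self.buffered_bytes = 0
        self.flushes = 0

    def put(self, key, value):
        old = self.buffer.get(key)
        old_size = len(key) + len(old) if old is not None else 0
        delta = len(key) + len(value) - old_size
        self.buffer[key] = value
        self.buffered_bytes += delta
        self.budget.adjust(delta)
        # Assumption of this sketch: the writer that crosses the shared
        # limit is the one that flushes.
        if self.budget.over_limit():
            self.flush()

    def flush(self):
        # A real store would persist the buffer; here we only drop it.
        self.budget.adjust(-self.buffered_bytes)
        self.buffer.clear()
        self.buffered_bytes = 0
        self.flushes += 1


budget = SharedWriteBudget(limit_bytes=100)
a = StoreInstance("a", budget)
b = StoreInstance("b", budget)

peak = 0
for i in range(10):
    for store in (a, b):
        # 4-byte key + 6-byte value = 10 bytes per entry
        store.put(f"k{i:03d}".encode(), b"v" * 6)
        peak = max(peak, budget.used_bytes)

print(f"peak={peak} used={budget.used_bytes} flushes={a.flushes + b.flushes}")
```

Without the shared budget, each instance in this model could buffer its full 100 bytes independently, so the combined footprint would grow with the number of instances on the node.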
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added unit tests and fixed existing ones
   
   RocksDBStateStoreSuite
   ```
   [info] Run completed in 40 seconds, 916 milliseconds.
   [info] Total number of tests run: 33
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 33, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```
   
   StateStoreSuite
   ```
   [info] Run completed in 2 minutes, 33 seconds.
   [info] Total number of tests run: 85
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 85, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

