HeartSaVioR commented on code in PR #38880:
URL: https://github.com/apache/spark/pull/38880#discussion_r1040611003


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBSuite.scala:
##########
@@ -116,7 +116,9 @@ class RocksDBSuite extends SparkFunSuite {
     withDB(remoteDir, conf = conf) { db =>
       // Generate versions without cleaning up
       for (version <- 1 to 50) {
-        db.put(version.toString, version.toString)  // update "1" -> "1", "2" -> "2", ...

Review Comment:
   I'm not an expert on RocksDB, but I can explain this based on the high-level architecture.
   
   
https://github.com/facebook/rocksdb/wiki/RocksDB-Overview#3-high-level-architecture
   
   During RocksDB.commit(), we write synchronously and flush synchronously, meaning that we produce a new SST file for the new writes. (Since SST files are immutable, you can't append new writes to an existing SST file.)
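
   To make this concrete, here is a minimal standalone sketch using the plain RocksDB Java (JNI) API directly, not Spark's `RocksDB` wrapper; the object and variable names are made up for illustration. Each synchronous flush turns the current memtable into a brand-new immutable SST file in L0, even when the keys are overwrites of earlier ones:

   ```scala
   import org.rocksdb.{FlushOptions, Options, RocksDB}

   object FlushPerCommitSketch {
     def main(args: Array[String]): Unit = {
       RocksDB.loadLibrary()
       val dir = java.nio.file.Files.createTempDirectory("rocksdb-flush-sketch")
       // Disable background compaction so we can watch L0 files pile up.
       val options = new Options().setCreateIfMissing(true).setDisableAutoCompactions(true)
       val db = RocksDB.open(options, dir.toString)
       val flushOptions = new FlushOptions().setWaitForFlush(true)
       try {
         for (version <- 1 to 5) {
           // Overwrite the same keys on every iteration, so each flushed file
           // covers the same (overlapping) key range.
           Seq("a", "b", "c").foreach { k => db.put(k.getBytes, s"$k-$version".getBytes) }
           // Synchronous flush: the memtable becomes a new immutable L0 SST file.
           db.flush(flushOptions)
           println(s"after flush #$version: L0 files = " +
             db.getProperty("rocksdb.num-files-at-level0"))
         }
       } finally {
         flushOptions.close()
         db.close()
         options.close()
       }
     }
   }
   ```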
   
   That said, each loop iteration writes a bunch of overwritten keys and produces a new SST file containing those keys. L0 allows SST files with overlapping key ranges, but that is not allowed in the levels below L0 (L1 and deeper), which is ensured via "compaction". We trigger manual compaction based on the config `compactOnCommit = true`.
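
   Continuing the sketch above, a full-range manual compaction (the JNI-level analogue of what `compactOnCommit = true` triggers in the wrapper after each commit) merges those overlapping L0 files into non-overlapping files at a deeper level. The helper below is hypothetical; call it with the `db` handle right after the write loop:

   ```scala
   import org.rocksdb.RocksDB

   object CompactionSketch {
     def compactAndReport(db: RocksDB): Unit = {
       // Full-range manual compaction: overlapping L0 files are merged into
       // non-overlapping files at a deeper level. The call blocks until done.
       db.compactRange()
       println("after compaction: L0 files = " +
         db.getProperty("rocksdb.num-files-at-level0"))
       // Per-level file counts, to see where the merged data ended up.
       println(db.getProperty("rocksdb.levelstats"))
     }
   }
   ```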


