chaoqin-li1123 commented on code in PR #41099:
URL: https://github.com/apache/spark/pull/41099#discussion_r1193351850
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -286,44 +329,51 @@ class RocksDB(
    */
   def commit(): Long = {
     val newVersion = loadedVersion + 1
-    val checkpointDir = createTempDir("checkpoint")
-    var rocksDBBackgroundThreadPaused = false
     try {
-      // Make sure the directory does not exist. Native RocksDB fails if the directory to
-      // checkpoint exists.
-      Utils.deleteRecursively(checkpointDir)
       logInfo(s"Flushing updates for $newVersion")
-      val flushTimeMs = timeTakenMs { db.flush(flushOptions) }
-
-      val compactTimeMs = if (conf.compactOnCommit) {
-        logInfo("Compacting")
-        timeTakenMs { db.compactRange() }
-      } else 0
-
-      logInfo("Pausing background work")
-      val pauseTimeMs = timeTakenMs {
-        db.pauseBackgroundWork() // To avoid files being changed while committing
-        rocksDBBackgroundThreadPaused = true
-      }
-      logInfo(s"Creating checkpoint for $newVersion in $checkpointDir")
-      val checkpointTimeMs = timeTakenMs {
-        val cp = Checkpoint.create(db)
-        cp.createCheckpoint(checkpointDir.toString)
+      var flushTimeMs = 0L
+      var checkpointTimeMs = 0L
+      if (shouldCreateSnapshot()) {
+        // Need to flush the change to disk before creating a checkpoint
+        // because rocksdb wal is disabled.
+        flushTimeMs = timeTakenMs { db.flush(flushOptions) }
+        checkpointTimeMs = timeTakenMs {
+          val checkpointDir = createTempDir("checkpoint")
+          // Make sure the directory does not exist. Native RocksDB fails if the directory to
+          // checkpoint exists.
+          Utils.deleteRecursively(checkpointDir)
+          val cp = Checkpoint.create(db)
+          cp.createCheckpoint(checkpointDir.toString)
+          synchronized {
+            latestCheckpoint.foreach(_.close())

Review Comment:
   I considered this from the beginning. However, the three places where `latestCheckpoint` is modified each follow a slightly different pattern, so they cannot be folded into a single helper function.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org