HeartSaVioR commented on code in PR #47778:
URL: https://github.com/apache/spark/pull/47778#discussion_r1721437164
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -550,6 +641,12 @@ class RocksDB(
} finally {
changelogWriter = None
}
+ // If we have changed the columnFamilyId mapping, we have set a new
Review Comment:
Shall we cancel the changelog writer when we upload snapshot instead of
writing both? It's still not harmful if we succeed to commit changelog but fail
to upload snapshot assuming that the batch will be marked as failed, but it's
still redundant to commit changelog.
Also, the changelog and the snapshot is not exactly the same (not just a
pure replacement), so if there are both changelog and snapshot for the same
version, it is giving more confusion, and in worst case, you lost snapshot file
by any reason and the query will take changelog and lose the information of
column family update, instead of failing query (proper expectation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]