anishshri-db commented on code in PR #43961:
URL: https://github.com/apache/spark/pull/43961#discussion_r1439278954
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -50,13 +51,15 @@ import org.apache.spark.util.{NextIterator, Utils}
* @param localRootDir Root directory in local disk that is used to working and checkpointing dirs
* @param hadoopConf Hadoop configuration for talking to the remote file system
* @param loggingId Id that will be prepended in logs for isolating concurrent RocksDBs
+ * @param useColumnFamilies Used to determine whether a single or multiple column families are used
*/
class RocksDB(
dfsRootDir: String,
val conf: RocksDBConf,
localRootDir: File = Utils.createTempDir(),
hadoopConf: Configuration = new Configuration,
- loggingId: String = "") extends Logging {
+ loggingId: String = "",
+ useColumnFamilies: Boolean = false) extends Logging {
Review Comment:
I did consider this, but I added the flag for two reasons:
- First, to isolate the users of this flag. In the current implementation, it is
set to true only for the `transformWithState` operator. No other operators are
touched, which limits the impact surface.
- Second, to identify which changelog writer format to use.

If we distinguished old vs. new based only on the `default` column family name,
then either we couldn't use the `default` column family with the new operator,
or we couldn't tell which writers/formats to use.
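
To illustrate the ambiguity, here is a rough sketch. All names in it (`ChangelogFormat`, `SingleFamilyFormat`, `MultiFamilyFormat`, `chooseByFlag`, `chooseByName`) are hypothetical and are not the actual classes or methods in this PR. It only assumes that the two changelog formats differ in whether records carry a column family name:

```scala
// Hypothetical names throughout; not the actual Spark classes.
sealed trait ChangelogFormat
case object SingleFamilyFormat extends ChangelogFormat // records carry no column family name
case object MultiFamilyFormat extends ChangelogFormat  // records are tagged with a column family name

object ChangelogFormatSketch {
  // With an explicit flag, the format depends only on what the operator asked for.
  def chooseByFlag(useColumnFamilies: Boolean): ChangelogFormat =
    if (useColumnFamilies) MultiFamilyFormat else SingleFamilyFormat

  // Inferring from the name alone: a new operator writing to `default`
  // cannot be told apart from an existing operator.
  def chooseByName(colFamilyName: String): ChangelogFormat =
    if (colFamilyName == "default") SingleFamilyFormat else MultiFamilyFormat

  def main(args: Array[String]): Unit = {
    // A new operator that uses the `default` column family:
    println(chooseByFlag(useColumnFamilies = true)) // MultiFamilyFormat
    println(chooseByName("default"))                // SingleFamilyFormat, wrong for the new operator
  }
}
```

With name-based inference, the only way to avoid the wrong result would be to forbid the `default` column family for the new operator, which is the trade-off described above.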
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]