HeartSaVioR commented on code in PR #47445:
URL: https://github.com/apache/spark/pull/47445#discussion_r1691375121
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala:
##########
@@ -208,13 +208,16 @@ class IncrementalExecution(
}
val schemaValidationResult = statefulOp.
  validateAndMaybeEvolveStateSchema(hadoopConf, currentBatchId, stateSchemaVersion)
+ val stateSchemaPaths = schemaValidationResult.map(_.schemaPath)
// write out the state schema paths to the metadata file
statefulOp match {
- case stateStoreWriter: StateStoreWriter =>
- val metadata = stateStoreWriter.operatorStateMetadata()
- // TODO: [SPARK-48849] Populate metadata with stateSchemaPaths if metadata version is v2
- val metadataWriter = new OperatorStateMetadataWriter(new Path(
-   checkpointLocation, stateStoreWriter.getStateInfo.operatorId.toString), hadoopConf)
+ case ssw: StateStoreWriter =>
+ val metadata = ssw.operatorStateMetadata(stateSchemaPaths)
Review Comment:
I'm a bit concerned about the number of files we are going to write over the
query lifecycle, but we can defer that discussion and decision until we handle
purge.
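
To make the concern concrete: a rough back-of-envelope sketch (not Spark code;
the estimator name and the assumption of one metadata file per stateful
operator per micro-batch are mine) of how the file count grows until a purge
mechanism exists:

```scala
// Hypothetical estimate: if each stateful operator writes one
// OperatorStateMetadata file per committed micro-batch, the checkpoint
// directory accumulates files linearly until something purges them.
object MetadataFileEstimate {
  def filesWritten(numBatches: Long, numStatefulOps: Int): Long =
    numBatches * numStatefulOps.toLong

  def main(args: Array[String]): Unit = {
    // e.g. a query with 2 stateful operators after 100,000 micro-batches
    println(filesWritten(100000L, 2)) // 200000 files with no purge
  }
}
```

Even a modest long-running query reaches hundreds of thousands of small files,
which is why deferring the purge design rather than the write itself seems
reasonable to me.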
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]