HeartSaVioR commented on code in PR #47445:
URL: https://github.com/apache/spark/pull/47445#discussion_r1691375121
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala:
##########
@@ -208,13 +208,16 @@ class IncrementalExecution(
}
val schemaValidationResult = statefulOp.
  validateAndMaybeEvolveStateSchema(hadoopConf, currentBatchId, stateSchemaVersion)
+ val stateSchemaPaths = schemaValidationResult.map(_.schemaPath)
// write out the state schema paths to the metadata file
statefulOp match {
- case stateStoreWriter: StateStoreWriter =>
- val metadata = stateStoreWriter.operatorStateMetadata()
- // TODO: [SPARK-48849] Populate metadata with stateSchemaPaths if metadata version is v2
- val metadataWriter = new OperatorStateMetadataWriter(new Path(
-   checkpointLocation, stateStoreWriter.getStateInfo.operatorId.toString), hadoopConf)
+ case ssw: StateStoreWriter =>
+ val metadata = ssw.operatorStateMetadata(stateSchemaPaths)
Review Comment:
I'm a bit concerned about the number of files we are going to write over the
query lifecycle, but we can defer that discussion and decision until we handle
purge.
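
To make the concern concrete: a rough back-of-envelope sketch (not Spark code;
the estimator name and the assumption of one metadata file per stateful
operator per micro-batch are mine) of how the file count grows until a purge
mechanism exists:

```scala
// Hypothetical estimate: if each stateful operator writes one
// OperatorStateMetadata file per committed micro-batch, the checkpoint
// directory accumulates files linearly until something purges them.
object MetadataFileEstimate {
  def filesWritten(numBatches: Long, numStatefulOps: Int): Long =
    numBatches * numStatefulOps.toLong

  def main(args: Array[String]): Unit = {
    // e.g. a query with 2 stateful operators after 100,000 micro-batches
    println(filesWritten(100000L, 2)) // 200000 files with no purge
  }
}
```

Even a modest long-running query reaches hundreds of thousands of small files,
which is why deferring the purge design rather than the write itself seems
reasonable to me.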
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]