HeartSaVioR commented on a change in pull request #24173:
URL: https://github.com/apache/spark/pull/24173#discussion_r534776507
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala
##########
@@ -150,6 +172,25 @@ private class StateStoreCoordinator(override val rpcEnv: RpcEnv)
         storeIdsToRemove.mkString(", "))
       context.reply(true)
+    case ValidateSchema(providerId, keySchema, valueSchema, checkEnabled) =>
+      // normalize partition ID to validate only once for one state operator
+      val newProviderId = StateStoreProviderId.withNoPartitionInformation(providerId)
+
+      val result = schemaValidated.getOrElseUpdate(newProviderId, {
+        val checker = new StateSchemaCompatibilityChecker(newProviderId, hadoopConf)
+
+        // regardless of configuration, we check compatibility to at least write schema file
+        // if necessary
+        val ret = Try(checker.check(keySchema, valueSchema)).toEither.fold(Some(_), _ => None)
Review comment:
https://github.com/apache/spark/pull/24173/files#r534772017
Let's just check it without sending an RPC. We'll need some trick - I hope that's
acceptable.
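
The `Try(...).toEither.fold(Some(_), _ => None)` expression in the diff turns a check that throws on failure into an `Option[Throwable]`: `Some(error)` when the check threw, `None` when it succeeded. A minimal self-contained sketch of that pattern, where `checkSchema` is a hypothetical stand-in for `StateSchemaCompatibilityChecker.check`:

```scala
import scala.util.Try

object SchemaCheckSketch {
  // Hypothetical stand-in for StateSchemaCompatibilityChecker.check:
  // throws on incompatibility, returns Unit on success.
  def checkSchema(compatible: Boolean): Unit =
    if (!compatible) throw new IllegalStateException("incompatible schema")

  // Fold the Try into Option[Throwable]:
  // Failure(e) -> Some(e), Success(_) -> None.
  def validationError(compatible: Boolean): Option[Throwable] =
    Try(checkSchema(compatible)).toEither.fold(Some(_), _ => None)

  def main(args: Array[String]): Unit = {
    assert(validationError(compatible = true).isEmpty)
    assert(validationError(compatible = false).isDefined)
  }
}
```

Caching the resulting `Option` per normalized provider ID (as `getOrElseUpdate` does above) is what lets the coordinator validate each state operator's schema only once.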
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]