HeartSaVioR commented on a change in pull request #31700:
URL: https://github.com/apache/spark/pull/31700#discussion_r587970348
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
##########
@@ -166,6 +166,52 @@ object OverwritePartitionsDynamic {
}
}
+case class AppendMicroBatch(
Review comment:
Streaming write semantics are not the same as batch semantics. The semantics
are bound to the stateful operation: there should only be `append`, `update`
(not the same as overwrite), and `truncate and append` (complete). For
`update` we haven't yet constructed a proper way to define it.
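
For context, here is a minimal sketch (not part of this PR) of how these three
modes surface in the user-facing API, written spark-shell style; the `rate`
source, `console` sink, and the grouping expression are just placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.OutputMode

val spark = SparkSession.builder().appName("output-modes").master("local[2]").getOrCreate()
import spark.implicits._

// A stateful aggregation: the group key becomes the logical "key" of the result.
val counts = spark.readStream
  .format("rate")                    // built-in test source
  .load()
  .groupBy(($"value" % 10).as("key"))
  .count()

// Append: only finalized rows (requires a watermark for aggregations).
// Update: rows changed since the last trigger; downstream must upsert by key.
// Complete: the full result table every trigger (truncate and append).
val query = counts.writeStream
  .outputMode(OutputMode.Update())   // or Append() / Complete()
  .format("console")
  .start()
```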
The major concern is that the group keys of the stateful operation must be
used as the keys in update mode. That is currently not possible (though there
are some sketched ideas on it), so Spark has been dealing with update by
taking a huge risk: it does the same thing as append and delegates the risk to
the sink (or the user), which then has to reflect the appended output as an
"upsert". That's why I renamed `SupportsStreamingUpdate` to
`SupportsStreamingUpdateAsAppend`, to clarify the behavior.
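
To illustrate the burden this delegates, here is a hypothetical sketch of what
a sink must do under "update as append": Spark emits updated rows as plain
appends, and the sink applies them as upserts keyed by the group keys.
`KeyedSink` is a made-up name for illustration, not any Spark API:

```scala
import scala.collection.mutable

// `KeyedSink` is a hypothetical illustration, not part of Spark.
final class KeyedSink[K, V] {
  private val table = mutable.Map.empty[K, V]

  // Each "appended" row overwrites any prior row with the same key: an upsert.
  def append(key: K, value: V): Unit = table.update(key, value)

  def snapshot: Map[K, V] = table.toMap
}

object UpdateAsAppendDemo extends App {
  val sink = new KeyedSink[String, Long]
  sink.append("user-1", 3L) // micro-batch 0
  sink.append("user-1", 5L) // micro-batch 1: upsert, not a duplicate row
  assert(sink.snapshot == Map("user-1" -> 5L))
}
```

If the sink instead treated each row as a genuine append, the two batches
would produce duplicate rows for the same key, which is exactly the mismatch
the renamed interface makes explicit.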
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]