HeartSaVioR commented on a change in pull request #29461:
URL: https://github.com/apache/spark/pull/29461#discussion_r483430511
##########
File path: docs/structured-streaming-programming-guide.md
##########
@@ -861,6 +861,10 @@ isStreaming(df)
</div>
</div>
+You may want to check the logical plan of the query, as Spark converts the
operation into another operation, which includes adding streaming aggregation.
(e.g. count, distinct, union, etc.)
Review comment:
We could probably reword this as well to simplify it, for example:
> You may want to check the query plan, as Spark could inject stateful
operations while interpreting the query. Once stateful operations are
injected into the query plan, you may need to review your query with the
considerations for stateful operations in mind (e.g. output mode, watermark,
state store size maintenance, etc.).
If the reworded sentences sound better, I can update.
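To illustrate what "checking the query plan" could look like in practice, here is a minimal sketch that scans plan text (such as what `explain()` prints) for stateful operators. The helper `find_stateful_operators`, the `STATEFUL_OPERATORS` list, and the sample plan are all hypothetical: the operator names are assumptions based on node names commonly seen in physical plans and may differ between Spark versions, and the sample is hand-written rather than captured from a real run.

```python
# Assumed physical-plan node names for stateful streaming operators.
# Illustrative only; the exact names may vary between Spark versions.
STATEFUL_OPERATORS = (
    "StateStoreSave",
    "StateStoreRestore",
    "StreamingDeduplicate",
    "StreamingSymmetricHashJoin",
    "FlatMapGroupsWithState",
    "StreamingGlobalLimit",
)


def find_stateful_operators(plan_text):
    """Return the assumed stateful operator names that appear in plan_text.

    The result follows the order of STATEFUL_OPERATORS, not the plan order.
    """
    return [name for name in STATEFUL_OPERATORS if name in plan_text]


# Hand-written sample shaped like explain() output (hypothetical, not real output).
sample_plan = """\
== Physical Plan ==
*(4) HashAggregate(keys=[value#1], functions=[count(1)])
+- StateStoreSave [value#1], state info [...]
   +- *(3) HashAggregate(keys=[value#1], functions=[merge_count(1)])
      +- StateStoreRestore [value#1], state info [...]
         +- *(2) HashAggregate(keys=[value#1], functions=[merge_count(1)])
            +- StreamingRelation ...
"""

print(find_stateful_operators(sample_plan))
```

A non-empty result would be the cue to revisit output mode, watermark, and state store size for the query, as the suggested wording says.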