HeartSaVioR opened a new pull request #27333: [WIP][DO-NOT-MERGE][SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState URL: https://github.com/apache/spark/pull/27333 ### What changes were proposed in this pull request? This patch adds new UTs to prevent SPARK-29438 for streaming aggregation as well as flatMapGroupsWithState, as we agree about the review comment quote here: https://github.com/apache/spark/pull/26162#issuecomment-576929692 > LGTM for this PR. But on a additional note, this is a very subtle and easy-to-make bug with TaskContext.getPartitionId. I wonder if this bug is present in any other stateful operation. I wonder if this bug is present in any other stateful operation. Can you please verify how partitionId is used in the other stateful operations? For now they're not broken, but even better if we have UTs to prevent the case for the future. ### Why are the changes needed? New UTs will prevent streaming aggregation and flatMapGroupsWithState to be broken in future where it is placed on the right side of UNION and the number of partition is changing on the left side of UNION. Please refer SPARK-29438 for more details. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Added UTs.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
