sunchao commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-1023547628
@HeartSaVioR no worries, I should have pinged you too :)

> In Structured Streaming, state is partitioned with grouping keys based on Spark's internal hash function, and the number of partition is static. That said, if Spark does not respect the distribution of state against stateful operator, it leads to correctness problem.

Could you give me a concrete example of this? Currently the rule only skips the shuffle in a join if both sides report the same distribution. Also, with the first follow-up by @cloud-fan, I think we've already restored the previous behavior. I'm no Spark streaming expert, so I'm still trying to learn more about the problem here. :)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org