sunchao commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-1023547628


   @HeartSaVioR no worries, I should have pinged you too :)
   
   > In Structured Streaming, state is partitioned with grouping keys based on 
Spark's internal hash function, and the number of partition is static. That 
said, if Spark does not respect the distribution of state against stateful 
operator, it leads to correctness problem.
   
   Could you give me a concrete example of this? Currently the rule only skips 
the shuffle in a join if both sides report the same distribution. Also, with 
the first follow-up by @cloud-fan I think we've already restored the previous 
behavior.
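
   For reference, here is how I currently read the quoted concern, as a toy 
Python model. This is not Spark code: `NUM_PARTITIONS`, `state_partition`, 
`other_partition` and `running_counts` are hypothetical names, and `zlib.crc32` 
only stands in for Spark's internal hash function. It keeps one running count 
per key in a fixed set of per-partition state stores, and compares routing 
every batch with the same partitioner against switching partitioners mid-query.

```python
import zlib

NUM_PARTITIONS = 4  # assumed fixed for the lifetime of the query


def state_partition(key: str) -> int:
    """Stand-in for the hash partitioning used to place state."""
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


def other_partition(key: str) -> int:
    """Hypothetical different distribution; always differs from state_partition."""
    return (zlib.crc32(key.encode()) + 1) % NUM_PARTITIONS


def running_counts(batches, partitioners):
    """Run a per-key running count, one partitioner per batch.

    Returns the latest count the operator observed for each key.
    """
    stores = [dict() for _ in range(NUM_PARTITIONS)]  # one state store per partition
    seen = {}
    for rows, partitioner in zip(batches, partitioners):
        for key in rows:
            p = partitioner(key)
            # The operator only sees the state store of the partition it runs in.
            stores[p][key] = stores[p].get(key, 0) + 1
            seen[key] = stores[p][key]
    return seen


batches = [["a", "b"], ["a"]]

# Same partitioning in every batch: the second "a" finds its existing state.
consistent = running_counts(batches, [state_partition, state_partition])

# Different partitioning in batch 2: "a" lands where no state exists yet.
inconsistent = running_counts(batches, [state_partition, other_partition])
```

In this toy, `consistent` reports 2 for key `a`, while `inconsistent` reports 1, 
because the second batch looks up `a` in a partition that never held its state. 
Whether the rule in this PR can actually produce such a mismatch for stateful 
operators is exactly what I'd like to confirm.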
   
   I'm no Spark streaming expert, so I'm still trying to understand the problem 
here. :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


