srowen commented on pull request #32415:
URL: https://github.com/apache/spark/pull/32415#issuecomment-832333786


   Yeah, it does seem like the variation here is due to distributing the 
computation. It might even be 'reasonable' to expect, given the tiny data set. 
But it isn't very good for confidence in the implementation via tests.
   
   I do agree that this variation is not due to the changes here. For that 
reason I'd suggest upping the iterations in the relevant tests to hundreds of 
iterations, as that seems necessary for proper convergence. And then just 
assert whatever result it comes up with. We can take a look at why it's so 
sensitive later; at worst it is already an issue.
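
   As an illustration only (not taken from this PR): one well-known way that 
distributing a computation can change a result is that floating-point addition 
is not associative, so partial sums combined per partition can differ in the 
last bits from a single-pass sum. Whether that is the mechanism behind the 
variation seen here is an assumption, and `naive_sum` and `partitioned_sum` are 
hypothetical helpers, not Spark APIs. A minimal Python sketch:

```python
import math


def naive_sum(values):
    """Left-to-right floating-point sum with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def partitioned_sum(partitions):
    """Sum each partition on its own, then combine the partial sums.

    This mimics (very loosely) an aggregate computed over distributed data.
    """
    return naive_sum([naive_sum(p) for p in partitions])


# Same three values, two hypothetical partitionings.
single_partition = partitioned_sum([[0.1, 0.2, 0.3]])   # (0.1 + 0.2) + 0.3
two_partitions = partitioned_sum([[0.1], [0.2, 0.3]])   # 0.1 + (0.2 + 0.3)

# The results are not bit-identical, but they agree to within a tolerance.
print(single_partition == two_partitions)
print(math.isclose(single_partition, two_partitions, rel_tol=1e-12))
```

   Differences of this size are tiny on their own; an iterative solver that 
stops well short of convergence may be more sensitive to them than one that is 
run to convergence, which is the motivation for raising the iteration count 
before pinning down the asserted values.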

