luhenry commented on pull request #32415: URL: https://github.com/apache/spark/pull/32415#issuecomment-832072136
Judging by https://github.com/luhenry/spark/runs/2500079223 and https://github.com/luhenry/spark/runs/2500065249, this looks like an intermittent failure. I haven't yet managed to reproduce the numbers with scikit-learn. In your experience, what could cause this variability in the results? Is there a source of randomness in these algorithms? (I'm far from a data scientist and have only played with a handful of ML algorithms, nothing serious. I'm trying to learn, though!)