srowen commented on pull request #32415: URL: https://github.com/apache/spark/pull/32415#issuecomment-831545290
Right, yeah. If there is a comparable implementation in R or sklearn and it gives a certain answer, that's decent evidence that it's more correct. The difference could also be due to a different choice of defaults or the like. I wouldn't expect a change from just changing the math, but at logs of 65 vs 93? Sure, I could believe it comes down to small but important differences. We may even decide to ignore it, but that's non-trivial.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org