GitHub user MaxGekk opened a pull request: https://github.com/apache/spark/pull/21228
[SPARK-24171] Adding a note for non-deterministic functions

## What changes were proposed in this pull request?

I propose adding a clear note to functions like `collect_list()` stating that their behavior is non-deterministic. Users must take this behavior into account when writing and running queries.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 deterministic-comments

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21228

----
commit c1e5ade5cd0401519bb7b798741a66e88fd8504a
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-05-03T11:24:34Z

    Adding a note for non-deterministic functions

commit 077da4ecf12fe6b3375ed52e5cf4743ca942a3c6
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-05-03T11:47:13Z

    Updating comments for PySpark

commit 1761ed9cc189e3a4593090b5eaa488312f8e76b2
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-05-03T11:52:43Z

    Updating comments for R

----
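
To illustrate the kind of non-determinism the description refers to for `collect_list()`, here is a minimal plain-Python sketch. It does not use Spark. It assumes that the collected order simply follows the order in which rows arrive, which can change after a shuffle. The helper `simulated_collect_list` and the sample partitions are hypothetical names and data invented for this illustration.

```python
def simulated_collect_list(partitions):
    """Concatenate values in the order the partitions happen to arrive.

    Hypothetical stand-in for an order-sensitive aggregate; not a Spark API.
    """
    collected = []
    for partition in partitions:
        collected.extend(partition)
    return collected


# Hypothetical rows of a single group, spread over three partitions.
partitions = [[1, 2], [3, 4], [5, 6]]

# First run: partitions are consumed in their original order.
first_run = simulated_collect_list(partitions)

# Second run: assume a shuffle delivers the same partitions in reverse order.
second_run = simulated_collect_list(list(reversed(partitions)))

print(first_run)   # [1, 2, 3, 4, 5, 6]
print(second_run)  # [5, 6, 3, 4, 1, 2]

# Both runs contain the same elements, but the order differs.
print(sorted(first_run) == sorted(second_run))  # True
```

Under this assumption, a query that depends on the element order of the collected list may give different results between runs, so a caller who needs a stable order would have to sort the result explicitly.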