[ 
https://issues.apache.org/jira/browse/SPARK-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153687#comment-15153687
 ] 

Apache Spark commented on SPARK-13380:
--------------------------------------

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/11232

> Document Rand(seed) and Randn(seed) Return Indeterministic Results When Data 
> Partitions are not fixed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13380
>                 URL: https://issues.apache.org/jira/browse/SPARK-13380
>             Project: Spark
>          Issue Type: Documentation
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Xiao Li
>            Priority: Minor
>
> The rand and randn functions are commonly called with a seed argument. Users 
> generally expect the results of rand and randn to be deterministic when a 
> seed value is provided. For example, MS SQL Server also has a rand function, 
> and its documentation describes the seed parameter as follows: Seed 
> is an integer expression (tinyint, smallint, or int) that gives the seed 
> value. If seed is not specified, the SQL Server Database Engine assigns a 
> seed value at random. For a specified seed value, the result returned is 
> always the same.
> Update: the current implementation cannot generate deterministic results 
> when the data partitions are not fixed. This PR documents this limitation 
> in the function descriptions.
> @jkbradley hit an issue and provided an example in the following JIRA: 
> https://issues.apache.org/jira/browse/SPARK-13333
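
The behavior described above can be illustrated with a small sketch in plain
Python rather than Spark. It assumes a simplified model in which each partition
gets its own generator seeded with the user seed plus the partition index; the
function name rand_column, the contiguous chunking, and the use of Python's
random.Random are all hypothetical stand-ins, not Spark's actual code.

```python
import random

def rand_column(rows, num_partitions, seed):
    """Attach a seeded random value to each row, one generator per partition.

    Assumed model (not Spark's real implementation): rows are split into
    contiguous chunks, and partition i uses a generator seeded with seed + i.
    """
    size = -(-len(rows) // num_partitions)  # ceiling division
    out = []
    for pid in range(num_partitions):
        rng = random.Random(seed + pid)  # per-partition generator
        for row in rows[pid * size:(pid + 1) * size]:
            out.append((row, rng.random()))
    return out

rows = list(range(8))
two = rand_column(rows, 2, seed=42)
two_again = rand_column(rows, 2, seed=42)
four = rand_column(rows, 4, seed=42)

print(two == two_again)  # same seed, same partitioning
print(two == four)       # same seed, different partitioning
```

Under this model, the same seed reproduces the same values only while the
partitioning stays the same; once the same rows are split differently, a row
may draw from a different generator or from a different position in its
sequence.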



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

