[ 
https://issues.apache.org/jira/browse/SPARK-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin resolved SPARK-13380.
---------------------------------
       Resolution: Fixed
         Assignee: Xiao Li
    Fix Version/s: 2.0.0

> Document Rand(seed) and Randn(seed) Return Indeterministic Results When Data 
> Partitions are not fixed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-13380
>                 URL: https://issues.apache.org/jira/browse/SPARK-13380
>             Project: Spark
>          Issue Type: Documentation
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Xiao Li
>            Assignee: Xiao Li
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> The rand and randn functions are commonly used with a seed argument. Users 
> would reasonably expect the results of rand and randn to be deterministic 
> when a seed value is provided. For example, MS SQL Server also has a rand 
> function, and its documentation describes the seed parameter as follows: 
> "Seed is an integer expression (tinyint, smallint, or int) that gives the 
> seed value. If seed is not specified, the SQL Server Database Engine assigns 
> a seed value at random. For a specified seed value, the result returned is 
> always the same."
> Update: the current implementation cannot generate deterministic results 
> when the data partitioning is not fixed. This PR documents this limitation 
> in the function descriptions.
> @jkbradley hit this issue and provided an example in the following JIRA: 
> https://issues.apache.org/jira/browse/SPARK-13333
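
To illustrate why a fixed seed alone may not be enough, here is a minimal plain-Python sketch (not Spark code). It assumes each partition gets its own generator seeded from the user seed plus the partition index, and that rows are split into contiguous chunks. Both are assumptions made for illustration, and rand_column is a hypothetical helper, not a Spark API.

```python
import random


def rand_column(rows, num_partitions, seed):
    """Simulate a seeded random column computed partition by partition.

    Assumptions (illustrative only): every partition uses its own
    generator seeded with seed + partition_index, and rows are split
    into contiguous, equally sized chunks.
    """
    chunk = -(-len(rows) // num_partitions)  # ceiling division
    values = {}
    for p in range(num_partitions):
        rng = random.Random(seed + p)  # one generator per partition
        for row in rows[p * chunk:(p + 1) * chunk]:
            values[row] = rng.random()
    return values


rows = list(range(8))
two = rand_column(rows, 2, seed=42)        # 2 partitions of 4 rows
two_again = rand_column(rows, 2, seed=42)  # same layout, same seed
four = rand_column(rows, 4, seed=42)       # 4 partitions of 2 rows

print(two == two_again)  # same layout and seed: identical values
print(two == four)       # different layout, same seed: values differ
```

Under these assumptions, the value assigned to a row depends on which partition the row lands in and on its position within that partition, so repartitioning the same data changes the result even though the seed is unchanged.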



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
