[ https://issues.apache.org/jira/browse/SPARK-11534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Pal updated SPARK-11534:
--------------------------------
    Description: 
According to the definition of uniformVectorRDD, it is supposed to generate 
uniformly distributed data with mean ~0.0 and standard deviation ~1.
[Definition: Generates an RDD comprised of i.i.d. samples from the uniform 
distribution U(0.0, 1.0).]
However, it generates data with mean ~0.5 and standard deviation ~0.28.

PS: uniformRDD works correctly as described, producing the uniform 
distribution U(0.0, 1.0), and uniformVectorRDD should generate data with the 
same distribution.


  was:
According to the definition of uniformVectorRDD, it is supposed to generate the 
uniformly distributed data with mean ~0.0 and standard deviation ~1.
[Definition: Generates an RDD comprised of i.i.d. samples from the uniform 
distribution U(0.0, 1.0).]
But it is generating the data with mean ~0.5 and sd of ~0.28.

PS: uniformRDD is working correctly as per description with uniform 
distribution U(0.0, 1.0) .



> MLLib uniformVectorRDD wrong data generation
> --------------------------------------------
>
>                 Key: SPARK-11534
>                 URL: https://issues.apache.org/jira/browse/SPARK-11534
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.5.1
>         Environment: Spark 1.6.0-SNAPSHOT standalone on CentOS 6.6
> Single-node VM, 8 cores, 16 GB memory
>            Reporter: Sandeep Pal
>              Labels: mllib
>
> According to the definition of uniformVectorRDD, it is supposed to generate 
> uniformly distributed data with mean ~0.0 and standard deviation ~1.
> [Definition: Generates an RDD comprised of i.i.d. samples from the uniform 
> distribution U(0.0, 1.0).]
> However, it generates data with mean ~0.5 and standard deviation ~0.28.
> PS: uniformRDD works correctly as described, producing the uniform 
> distribution U(0.0, 1.0), and uniformVectorRDD should generate data with the 
> same distribution.
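
As a sanity check on the figures quoted above, here is a minimal sketch in 
plain Python (standard library only, not Spark code) that computes the 
theoretical mean and standard deviation of U(a, b) and compares them with a 
seeded sample from U(0.0, 1.0). The function names uniform_moments and 
sample_moments are hypothetical and used only for illustration. For 
U(0.0, 1.0) the theoretical mean is 0.5 and the standard deviation is 
1/sqrt(12), roughly 0.289, so it may be worth comparing the reported 
mean ~0.5 and sd ~0.28 against these values.

```python
import math
import random
import statistics
from typing import Tuple


def uniform_moments(a: float, b: float) -> Tuple[float, float]:
    """Return the theoretical (mean, standard deviation) of U(a, b)."""
    mean = (a + b) / 2.0
    std_dev = (b - a) / math.sqrt(12.0)
    return mean, std_dev


def sample_moments(n: int, seed: int = 42) -> Tuple[float, float]:
    """Draw n samples from U(0.0, 1.0); return their (mean, standard deviation)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sample
    samples = [rng.random() for _ in range(n)]
    return statistics.fmean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    print("theoretical U(0.0, 1.0):", uniform_moments(0.0, 1.0))
    print("empirical, 100000 draws:", sample_moments(100_000))
```

The same comparison could be run against the values collected from 
uniformRDD and uniformVectorRDD to see whether the two generators agree.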



