[ https://issues.apache.org/jira/browse/SPARK-11534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandeep Pal updated SPARK-11534:
--------------------------------
Summary: MLLib uniformVectorRDD wrong data generation (was: MLLib
uniformVectorRDD wrong data distribution)
> MLLib uniformVectorRDD wrong data generation
> --------------------------------------------
>
> Key: SPARK-11534
> URL: https://issues.apache.org/jira/browse/SPARK-11534
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.5.1
> Environment: Spark 1.6.0-SNAPSHOT standalone on CentOS 6.6
> Single-node VM, 8 cores, 16 GB memory
> Reporter: Sandeep Pal
> Labels: mllib
>
> According to its definition, uniformVectorRDD is supposed to generate
> uniformly distributed data with mean ~0.0 and standard deviation ~1.
> [Definition: Generates an RDD comprised of i.i.d. samples from the uniform
> distribution U(0.0, 1.0).]
> However, it generates data with mean ~0.5 and standard deviation ~0.28.
> PS: uniformRDD works correctly as described, producing data from the
> uniform distribution U(0.0, 1.0).
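
For reference when reading the numbers above, here is a minimal sketch in plain Python (not Spark or MLlib) of the moments that i.i.d. samples from U(low, high) are expected to have. The helper names `uniform_moments` and `sample_moments` are hypothetical and chosen only for this illustration. The closed forms used are mean = (low + high) / 2 and standard deviation = (high - low) / sqrt(12), which for U(0.0, 1.0) give 0.5 and roughly 0.2887.

```python
import math
import random
import statistics


def uniform_moments(low: float, high: float) -> tuple[float, float]:
    """Theoretical mean and standard deviation of U(low, high)."""
    mean = (low + high) / 2.0
    std = (high - low) / math.sqrt(12.0)
    return mean, std


def sample_moments(n: int, low: float = 0.0, high: float = 1.0,
                   seed: int = 42) -> tuple[float, float]:
    """Empirical mean and population standard deviation of n uniform draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    xs = [rng.uniform(low, high) for _ in range(n)]
    return statistics.fmean(xs), statistics.pstdev(xs)


if __name__ == "__main__":
    # Compare the closed-form values with a simulated sample.
    print("theoretical:", uniform_moments(0.0, 1.0))
    print("empirical:  ", sample_moments(100_000))
```

The same comparison could be run against the values collected from a generated RDD to see which of the two sets of figures quoted in the report matches the documented distribution.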