[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565315#comment-15565315 ] Peng Meng commented on SPARK-17870: --- https://github.com/apache/spark/pull/1484#issuecomment-51024568 Hi

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565251#comment-15565251 ] Peng Meng commented on SPARK-17870: --- The scikit-learn code is here:

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565238#comment-15565238 ] Sean Owen commented on SPARK-17870: --- I don't quite understand this example, can you point me to the

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565225#comment-15565225 ] Peng Meng commented on SPARK-17870: --- Yes, the selectKBest and selectPercentile in scikit-learn only use

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565180#comment-15565180 ] Sean Owen commented on SPARK-17870: --- I don't think the raw statistic can be directly compared here

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565041#comment-15565041 ] Peng Meng commented on SPARK-17870: --- Hi [~srowen], thanks very much for your quick reply. Yes, the

[jira] [Commented] (SPARK-17870) ML/MLLIB: Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564959#comment-15564959 ] Sean Owen commented on SPARK-17870: --- Oof, I'm pretty certain you're correct. You can rank on the