Thanks for the reference! Many tests are not designed for big data:
http://magazine.amstat.org/blog/2010/09/01/statrevolution/ . So we
need to understand which tests are proper. Feel free to create a JIRA
and let's move our discussion there. -Xiangrui
On Fri, Aug 22, 2014 at 8:44 PM, guxiaobo1982 guxiaobo1...@qq.com wrote:
Hi Xiangrui,
You can refer to An Introduction to Statistical Learning with Applications
in R. There are many standard hypothesis tests for linear regression and
logistic regression, and they should be implemented first. Then we will
list some other tests, which are also important when using logistic
regression to build scorecards.
Xiaobo Gu
-- Original --
From: Xiangrui Meng;men...@gmail.com;
Send time: Wednesday, Aug 20, 2014 2:18 PM
To: guxiaobo1...@qq.com;
Cc: user@spark.apache.org;
Subject: Re: What about implementing various hypothesis test for
LogisticRegression in MLlib
We implemented chi-squared tests in v1.1:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala#L166
and we will add more after v1.1. Feedback on which tests should come
first would be greatly appreciated. -Xiangrui
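For readers unfamiliar with the test mentioned above, here is a minimal sketch of what a Pearson chi-squared independence test computes. It is plain Python and is not the MLlib implementation; the function name chi_squared_independence and the counts are hypothetical, chosen only for illustration, and the p-value lookup is left out.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. This is an illustrative
    sketch, not Spark's code; it assumes every expected count is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are a binary feature, columns are a binary label.
observed = [[10, 20], [30, 40]]
stat, dof = chi_squared_independence(observed)
print(stat, dof)
```

The statistic is then compared against a chi-squared distribution with dof degrees of freedom to obtain a p-value.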
On Tue, Aug 19, 2014 at 9:50 PM, guxiaobo1982 guxiaobo1...@qq.com wrote:
Hi,
From the documentation, I think only the model fitting part is implemented.
What about the various hypothesis tests and performance indexes used to
evaluate the model fit?
Regards,
Xiaobo Gu
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org