Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-30 Thread DB Tsai
Alternatively, I will give a talk at Spark Summit on the elastic-net implementation of LOR and LIR and on how to interpret those models. https://spark-summit.org/2015/events/large-scale-lasso-and-elastic-net-regularized-generalized-linear-models/ You may attend or watch online. Sincerely, DB Ts

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-30 Thread ayan guha
I hope they will come up with 1.4 before Spark Summit in mid-June. On 31 May 2015 10:07, "Joseph Bradley" wrote: > Spark 1.4 should be available next month, but I'm not sure about the exact > date. > Your interpretation of high lambda is reasonable. "High" lambda is really > data-dependent. > "lam

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-30 Thread Joseph Bradley
Spark 1.4 should be available next month, but I'm not sure about the exact date. Your interpretation of high lambda is reasonable. "High" lambda is really data-dependent. "lambda" is the same as the "regParam" in Spark, available in all recent Spark versions. On Fri, May 29, 2015 at 5:35 AM, méla
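
Since the right size of lambda is data-dependent, one common way to pick it is cross-validation over a grid of regParam values. The sketch below assumes the spark.ml pipeline API from Spark 1.4 or later; the grid values, the fold count, the function names, and the `training` DataFrame (with the default "label" and "features" columns) are illustrative assumptions, not something stated in the thread.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.{CrossValidator, CrossValidatorModel, ParamGridBuilder}
import org.apache.spark.sql.DataFrame

// Candidate regParam (lambda) values; the numbers are placeholders to adapt to the data.
def regParamGrid(lr: LogisticRegression): Array[ParamMap] =
  new ParamGridBuilder()
    .addGrid(lr.regParam, Array(0.001, 0.01, 0.1, 1.0))
    .build()

// `training` is a hypothetical DataFrame with "label" and "features" columns.
def tuneRegParam(training: DataFrame): CrossValidatorModel = {
  // elasticNetParam = 1.0 requests a pure L1 penalty.
  val lr = new LogisticRegression().setElasticNetParam(1.0)
  val cv = new CrossValidator()
    .setEstimator(lr)
    .setEvaluator(new BinaryClassificationEvaluator()) // area under ROC by default
    .setEstimatorParamMaps(regParamGrid(lr))
    .setNumFolds(3)
  cv.fit(training)
}
```

The returned CrossValidatorModel exposes the model fitted with the best-scoring regParam as `bestModel`.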

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-29 Thread mélanie gallois
When exactly will Spark 1.4 be available? Regarding "Model selection can be achieved through high lambda resulting lots of zero in the coefficients": do you mean that passing a high lambda as a parameter of the logistic regression keeps only a few significant variables and "deletes" the others

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread Joseph Bradley
If you want to select specific variable combinations by hand, then you will need to modify the dataset before passing it to the ML algorithm. The DataFrame API should make that easy to do. If you want to have an ML algorithm select variables automatically, then I would recommend using L1 regulari
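
Both options can be sketched against the RDD-based MLlib classes named in the subject line. This is a minimal sketch under assumptions: it works on RDD[LabeledPoint] rather than the DataFrame API mentioned above, the helper names `keepFeatures` and `trainWithL1` are hypothetical, and the iteration count is arbitrary. It uses LogisticRegressionWithSGD because its optimizer accepts an L1Updater, following the pattern shown in the MLlib linear-methods guide for Spark 1.x.

```scala
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.L1Updater
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Option 1 (by hand): keep only the feature indices listed in `keep`.
def keepFeatures(p: LabeledPoint, keep: Array[Int]): LabeledPoint = {
  val values = p.features.toArray
  LabeledPoint(p.label, Vectors.dense(keep.map(i => values(i))))
}

// Option 2 (automatic): train with an L1 penalty so that weak variables
// tend to be shrunk toward zero weight.
def trainWithL1(training: RDD[LabeledPoint], regParam: Double): LogisticRegressionModel = {
  val lr = new LogisticRegressionWithSGD()
  lr.optimizer
    .setNumIterations(200)     // arbitrary choice for this sketch
    .setRegParam(regParam)     // the "lambda" of the regularization term
    .setUpdater(new L1Updater) // switches the penalty to L1
  lr.run(training)
}
```

For example, `trainWithL1(data.map(keepFeatures(_, Array(0, 3))), 0.1)` would train on features 0 and 3 only, where `data` is an assumed RDD[LabeledPoint].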

Re: MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread DB Tsai
In Spark 1.4, logistic regression with elastic net is implemented in the ML pipeline framework. Model selection can be achieved through a high lambda, which results in many zeros in the coefficients. Sincerely, DB Tsai --- Blog: https://www.dbtsai.com On F
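
As a rough illustration of that idea, the sketch below fits a spark.ml LogisticRegression with a pure L1 penalty and reports which coefficients stayed non-zero. It assumes the Spark 1.4 API, where the fitted coefficients are exposed as `weights` (later releases renamed the member to `coefficients`); the function names and the `training` DataFrame are hypothetical.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.sql.DataFrame

// Indices of coefficients that were not shrunk to exactly zero.
def nonZeroIndices(weights: Array[Double]): Array[Int] =
  weights.zipWithIndex.collect { case (w, i) if w != 0.0 => i }

// `training` is a hypothetical DataFrame with a "label" column (Double)
// and a "features" column (Vector), the defaults expected by spark.ml.
def selectedFeatures(training: DataFrame, lambda: Double): Array[Int] = {
  val lr = new LogisticRegression()
    .setMaxIter(100)
    .setRegParam(lambda)      // "lambda" in the thread's terminology
    .setElasticNetParam(1.0)  // 1.0 = pure L1 penalty, 0.0 = pure L2
  val model = lr.fit(training)
  // Spark 1.4 naming; use `model.coefficients` on later releases.
  nonZeroIndices(model.weights.toArray)
}
```

Calling `selectedFeatures` with increasing values of `lambda` should generally return fewer indices, which is the model-selection effect described above.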

MLlib: how to get the best model with only the most significant explanatory variables in LogisticRegressionWithLBFGS or LogisticRegressionWithSGD ?

2015-05-22 Thread SparknewUser
I am new to MLlib and to Spark (I use Scala). I'm trying to understand how LogisticRegressionWithLBFGS and LogisticRegressionWithSGD work. I usually use R for logistic regression, but now I do it on Spark to be able to analyze Big Data. The model only returns weights and intercept. My problem is