[jira] [Issue Comment Deleted] (SPARK-18713) using SparkR build step wise regression model (glm)
[ https://issues.apache.org/jira/browse/SPARK-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasann modi updated SPARK-18713:
---
    Comment: was deleted

(was: Can you add a stepwise regression function to an upcoming Spark version?)

> using SparkR build step wise regression model (glm)
> ---
>
>                 Key: SPARK-18713
>                 URL: https://issues.apache.org/jira/browse/SPARK-18713
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Prasann modi
>
> In R, a stepwise regression model can be built with
> step(glm(formula, data, family), direction = "forward")
> How can a stepwise regression model be built using SparkR?
> I am using Spark 2.0.0 and R 3.3.1.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-18713) using SparkR build step wise regression model (glm)
[ https://issues.apache.org/jira/browse/SPARK-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasann modi reopened SPARK-18713:
---

Can you add a stepwise regression function to an upcoming Spark version?
[jira] [Commented] (SPARK-18713) using SparkR build step wise regression model (glm)
[ https://issues.apache.org/jira/browse/SPARK-18713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727956#comment-15727956 ]

Prasann modi commented on SPARK-18713:
---

Can you add a stepwise regression function to an upcoming Spark version?
[jira] [Created] (SPARK-18713) using SparkR build step wise regression model (glm)
Prasann modi created SPARK-18713:
---

             Summary: using SparkR build step wise regression model (glm)
                 Key: SPARK-18713
                 URL: https://issues.apache.org/jira/browse/SPARK-18713
             Project: Spark
          Issue Type: Bug
            Reporter: Prasann modi

In R, a stepwise regression model can be built with
step(glm(formula, data, family), direction = "forward")
How can a stepwise regression model be built using SparkR?
I am using Spark 2.0.0 and R 3.3.1.
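
One way to approximate step(..., direction = "forward") on a SparkDataFrame is a greedy loop that refits with spark.glm and compares AIC. The sketch below is only an illustration, not a SparkR feature: forward_select is a hypothetical helper, and it assumes that summary() of a spark.glm fit exposes an aic field, as it appears to in SparkR 2.0. Unlike step(), it starts from the best single-predictor model rather than from an intercept-only model.

```r
library(SparkR)

# Hypothetical helper (not part of SparkR): greedy forward selection by AIC.
# df         : SparkDataFrame
# response   : name of the response column (character)
# candidates : character vector of candidate predictor column names
forward_select <- function(df, response, candidates, family = "gaussian") {
  selected <- character(0)
  best_aic <- Inf
  repeat {
    remaining <- base::setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Fit one model per remaining candidate and record its AIC.
    aics <- sapply(remaining, function(v) {
      f <- reformulate(c(selected, v), response = response)
      summary(spark.glm(df, f, family = family))$aic
    })
    # Stop when no candidate lowers the AIC any further.
    if (min(aics) >= best_aic) break
    best_aic <- min(aics)
    selected <- c(selected, remaining[which.min(aics)])
  }
  list(selected = selected, aic = best_aic)
}
```

A call might look like forward_select(df, "y", c("x1", "x2", "x3"), family = "binomial"), where df and the column names are placeholders. Each candidate fit launches a Spark job, so this is practical only for a modest number of candidates.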
[jira] [Commented] (SPARK-17588) java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.
[ https://issues.apache.org/jira/browse/SPARK-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587813#comment-15587813 ]

Prasann modi commented on SPARK-17588:
---

Hi Sean,
Can you help me resolve this issue? Please check my comment below.

Thanks & Regards,
Prasann Modi

> java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.
> ---
>
>                 Key: SPARK-17588
>                 URL: https://issues.apache.org/jira/browse/SPARK-17588
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, SparkR
>    Affects Versions: 2.0.0
>            Reporter: sai pavan kumar chitti
>            Assignee: Sean Owen
>            Priority: Minor
>
> hi,
> I am getting a java.lang.AssertionError when running glm, using the gaussian
> link function, on a dataset with 109 columns and 81318461 rows.
> Below is the call trace. Can someone please tell me what the issue is
> related to and how to go about resolving it? Is it because native
> acceleration is not working? I am also seeing the following warning messages.
> WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
> WARN netlib.LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
> WARN netlib.LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
> 16/09/17 13:08:13 ERROR r.RBackendHandler: fit on org.apache.spark.ml.r.GeneralizedLinearRegressionWrapper failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
>   java.lang.AssertionError: assertion failed: lapack.dppsv returned 105.
>         at scala.Predef$.assert(Predef.scala:170)
>         at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:40)
>         at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:140)
>         at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:265)
>         at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:139)
>         at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
>         at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
>         at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
>         at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:145)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>         at scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.sc
> thanks,
> pavan.
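
For context on the return code: LAPACK documents a positive value i from dppsv as "the leading minor of order i is not positive definite", meaning the Cholesky factorization of the normal-equations matrix could not be completed. That usually points to collinear or constant features; the native-library warnings only indicate a fallback to the pure-JVM implementation and are probably unrelated. A small base R sketch on toy data (not the reporter's dataset) reproduces the same kind of failure with chol():

```r
# Toy design matrix: an intercept column of ones plus a constant feature.
# The two columns are perfectly collinear (x2 is exactly 2 * x1).
x1 <- c(1, 1, 1, 1)
X  <- cbind(x1 = x1, x2 = 2 * x1)

# Gram matrix X'X, the matrix a normal-equations solver has to factor.
A <- crossprod(X)

# chol() calls LAPACK as well; on a singular matrix it stops with an error
# that names the order of the leading minor where the factorization failed.
res <- tryCatch(chol(A), error = function(e) conditionMessage(e))
print(res)
```

Dropping constant or duplicated columns before fitting is one way to check whether collinearity is the cause.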
[jira] [Comment Edited] (SPARK-17588) java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.
[ https://issues.apache.org/jira/browse/SPARK-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587790#comment-15587790 ]

Prasann modi edited comment on SPARK-17588 at 10/19/16 6:03 AM:
---

I'm getting the same issue. I'm using SparkR in RStudio (OS: Windows) to build a binomial glm model, but I get an error, and the code takes a long time to run. The dataset contains 30 columns and 20 records. Please suggest how to improve this code and resolve this error.

R Code:

    # Set Spark Home
    Sys.setenv(SPARK_HOME="C:/spark/spark-2.0.0-bin-hadoop2.7")
    # set library path
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"), .libPaths()))
    Sys.setenv(JAVA_HOME="C:/Program Files/Java/jdk1.7.0_71")
    # loading SparkR library
    library(SparkR)
    library(rJava)
    sc <- sparkR.session(enableHiveSupport = FALSE, master = "local[*]", appName = "SparkR-Modi", sparkConfig = list(spark.sql.warehouse.dir="file:///c:/tmp/spark-warehouse"))
    sqlContext <- sparkRSQL.init(sc)
    spdf <- read.df(sqlContext, "C:/Users/prasann/Desktop/V/bigdata11.csv", source = "com.databricks.spark.csv", header = "true")
    showDF(spdf)
    # glm model
    md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)

Error:

    > md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)
    Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
      java.lang.AssertionError: assertion failed: lapack.dppsv returned 226.
            at scala.Predef$.assert(Predef.scala:170)
            at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:40)
            at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:140)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression$FamilyAndLink.initialize(GeneralizedLinearRegression.scala:340)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:275)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:139)
            at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
            at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
            at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
            at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:145)
            at scala.collection.Iterator$class.foreach(Iterator.scala:893)
            at scala.c
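
A possible reading of this report: with 20 records and 30 columns the design matrix has more columns than rows, so its Gram matrix X'X is necessarily singular, and reading the CSV without schema inference makes every column a string, which the formula then expands into even more dummy columns. The sketch below shows a smaller reproducer under those assumptions. It uses the csv source built into Spark 2.0 with the inferSchema option, and col_a and col_b are hypothetical column names standing in for a few real predictors.

```r
library(SparkR)

# Start a local session (SPARK_HOME and the library path are assumed to be
# configured as in the comment above).
sparkR.session(master = "local[*]", appName = "SparkR-Modi")

# Read the file with the built-in CSV source and let Spark infer numeric
# types instead of treating every column as a string.
spdf <- read.df("C:/Users/prasann/Desktop/V/bigdata11.csv", source = "csv",
                header = "true", inferSchema = "true")
printSchema(spdf)

# Fit with far fewer predictors than rows. col_a and col_b are hypothetical
# placeholders; replace them with real, non-constant columns.
md <- spark.glm(spdf, NP_OfferCurrentResponse ~ col_a + col_b,
                family = "binomial")
summary(md)
```

If the fit succeeds on a small set of predictors, adding columns back a few at a time can help locate the ones that make the system singular.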
[jira] [Commented] (SPARK-17588) java.lang.AssertionError: assertion failed: lapack.dppsv returned 105. when running glm using gaussian link function.
[ https://issues.apache.org/jira/browse/SPARK-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587790#comment-15587790 ]

Prasann modi commented on SPARK-17588:
---

I'm getting the same issue. I'm using SparkR in RStudio (OS: Windows) to build a binomial glm model, but I get an error, and the code takes a long time to run. Please suggest what to do.

R Code:

    # Set Spark Home
    Sys.setenv(SPARK_HOME="C:/spark/spark-2.0.0-bin-hadoop2.7")
    # set library path
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"), .libPaths()))
    Sys.setenv(JAVA_HOME="C:/Program Files/Java/jdk1.7.0_71")
    # loading SparkR library
    library(SparkR)
    library(rJava)
    sc <- sparkR.session(enableHiveSupport = FALSE, master = "local[*]", appName = "SparkR-Modi", sparkConfig = list(spark.sql.warehouse.dir="file:///c:/tmp/spark-warehouse"))
    sqlContext <- sparkRSQL.init(sc)
    spdf <- read.df(sqlContext, "C:/Users/prasann/Desktop/V/bigdata11.csv", source = "com.databricks.spark.csv", header = "true")
    showDF(spdf)
    # glm model
    md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)

Error:

    > md <- glm(NP_OfferCurrentResponse ~., family = "binomial", data = spdf)
    Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
      java.lang.AssertionError: assertion failed: lapack.dppsv returned 226.
            at scala.Predef$.assert(Predef.scala:170)
            at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:40)
            at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:140)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression$FamilyAndLink.initialize(GeneralizedLinearRegression.scala:340)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:275)
            at org.apache.spark.ml.regression.GeneralizedLinearRegression.train(GeneralizedLinearRegression.scala:139)
            at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
            at org.apache.spark.ml.Predictor.fit(Predictor.scala:71)
            at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
            at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:145)
            at scala.collection.Iterator$class.foreach(Iterator.scala:893)
            at scala.c