Hi all,

I’ve tried the GLM (Generalized Linear Model) in Spark 2.0.0-preview and
have encountered some unexpected problems.
• First problem:
I tested a “poisson” family GLM on a very small dataset using SparkR
2.0.0. The same dataset fits a “poisson” family GLM successfully in plain
R, but SparkR reported the error below, and I have no idea where it comes
from.

16/06/13 14:10:58 WARN WeightedLeastSquares: regParam is zero, which might
cause numerical instability and overfitting.
16/06/13 14:10:58 ERROR Executor: Exception in task 0.0 in stage 28.0 (TID
28)
java.lang.IllegalArgumentException: requirement failed: The response
variable of Poisson family should be positive, but got 0.0
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n27145/P.png> 
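
For reference, here is a minimal sketch in plain R of the kind of fit that
works for me outside Spark. The data frame `toy` and its columns `x` and `y`
are made up for illustration and are not my actual dataset; the only point
is that a response containing 0 is accepted by base R's Poisson family.

```r
# Hypothetical toy data (not the dataset from this report).
# The response y contains a zero, which is a valid Poisson count.
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(0, 1, 1, 3, 4, 7)
)

# Base R (stats::glm) fits a Poisson model with the log link.
fit <- glm(y ~ x, data = toy, family = poisson(link = "log"))

# Inspect the two coefficients (intercept and slope) and the
# convergence flag of the fit.
print(coef(fit))
print(fit$converged)
```

I would expect the same kind of data to be accepted by SparkR as well, since
a count of zero is a normal Poisson outcome.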

• Second problem:
When I run a dataset that I ran successfully on Spark 1.6.0, Spark 2.0.0
generates the error below.

ERROR RBackendHandler: fit on
org.apache.spark.ml.r.GeneralizedLinearRegressionWrapper failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  org.apache.spark.SparkException: Currently, GeneralizedLinearRegression
only supports number of features <= 4096. Found 7664 in the input dataset.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n27145/P2.png> 

This is the R code:

    model <- glm(flow ~ Origin + Destination, data = distance_flow,
                 family = gaussian(link = "identity"))

Is this because Spark 2.0.0 does not support datasets as large as the
previous version did?
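
My guess, which I have not confirmed against the Spark source, is that the
7664 features come from expanding the categorical columns `Origin` and
`Destination` into dummy variables rather than from the number of rows. A
minimal sketch in plain R of how that count grows, using a made-up data
frame `toy_flow` (not `distance_flow`):

```r
# Hypothetical data: 4 origins x 4 destinations (not the real distance_flow).
zones <- c("A", "B", "C", "D")
toy_flow <- expand.grid(Origin = zones, Destination = zones,
                        stringsAsFactors = TRUE)
toy_flow$flow <- seq_len(nrow(toy_flow))

# model.matrix() expands each factor into dummy columns
# (default treatment contrasts: one level per factor is the baseline).
X <- model.matrix(flow ~ Origin + Destination, data = toy_flow)

# Columns = 1 intercept + (4 - 1) Origin dummies + (4 - 1) Destination dummies.
print(ncol(X))
print(colnames(X))
```

With thousands of distinct origin and destination values, the same expansion
would give thousands of columns, which could be what the 4096 limit is
counting.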
