[ https://issues.apache.org/jira/browse/SPARK-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021720#comment-15021720 ]
Yanbo Liang commented on SPARK-11918:
-------------------------------------
I used the same
dataset (https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt)
to train a linear regression model with R's glm. It did not throw an exception, but
the result is not reliable: the coefficients of the model contain many
NA and NaN values, which is not reasonable. Please see the attached file for the
R glm output.
> WLS can not resolve some kinds of equation
> ------------------------------------------
>
> Key: SPARK-11918
> URL: https://issues.apache.org/jira/browse/SPARK-11918
> Project: Spark
> Issue Type: Bug
> Components: ML
> Reporter: Yanbo Liang
> Attachments: R_GLM_output
>
>
> Weighted Least Squares (WLS) is one of the optimization methods for solving
> Linear Regression (when #features < 4096). But if the dataset is very
> ill-conditioned (such as a 0-1 label used for classification, with an
> underdetermined system of equations), WLS fails. The failure is caused by the
> underlying Cholesky decomposition.
> This issue is easy to reproduce: train a LinearRegressionModel with the
> "normal" solver on the example
> dataset (https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt).
> The following is the exception:
> {code}
> assertion failed: lapack.dpotrs returned 1.
> java.lang.AssertionError: assertion failed: lapack.dpotrs returned 1.
>     at scala.Predef$.assert(Predef.scala:179)
>     at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:42)
>     at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:117)
>     at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:180)
>     at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:67)
>     at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
> {code}
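
To illustrate why an underdetermined system can make a Cholesky-based solver fail, here is a minimal sketch in plain Python. It is not Spark's implementation: the helper names (normal_matrix, cholesky, can_factor) and the tiny datasets are hypothetical, and it assumes the solver factors the unregularized matrix A = X^T X. With fewer samples than features, A is singular, so the factorization hits a pivot that is not strictly positive.

```python
import math


def normal_matrix(rows):
    """Return the Gram matrix A = X^T X for a list of feature rows."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)]
            for i in range(n)]


def cholesky(a, tol=1e-12):
    """Factor A = L L^T and return the lower-triangular L.

    Raises ValueError at the first pivot that is not strictly positive
    (within tol), which is what happens when A is singular.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    raise ValueError(f"pivot {i} is not positive: {s!r}")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


def can_factor(a):
    """Return True if the Cholesky factorization of A succeeds."""
    try:
        cholesky(a)
        return True
    except ValueError:
        return False


# Hypothetical data: two samples, three features (underdetermined).
underdetermined = [[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]]
# Adding a third, linearly independent sample makes X^T X positive definite.
well_posed = underdetermined + [[0.0, 0.0, 1.0]]

print(can_factor(normal_matrix(underdetermined)))  # False
print(can_factor(normal_matrix(well_posed)))       # True
```

In this sketch the failure shows up as a zero pivot on the last diagonal entry. A solver for such data would typically need regularization or a rank-revealing method instead of a plain Cholesky factorization.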