[ 
https://issues.apache.org/jira/browse/SPARK-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-11918:
--------------------------------
    Description: 
Weighted Least Squares (WLS) is one of the optimization methods for solving 
Linear Regression (when #features < 4096). But if the dataset is very 
ill-conditioned (for example, 0-1 labels used for classification, with an 
underdetermined system of equations), WLS fails, although the "l-bfgs" solver 
can still train and produce a model. The failure is caused by the underlying 
LAPACK library returning an error value during Cholesky decomposition.
This issue is easy to reproduce: train a LinearRegressionModel with the 
"normal" solver on the example dataset 
(https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt).
The following is the exception:
{code}
assertion failed: lapack.dpotrs returned 1.
java.lang.AssertionError: assertion failed: lapack.dpotrs returned 1.
        at scala.Predef$.assert(Predef.scala:179)
        at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:42)
        at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:117)
        at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:180)
        at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:67)
        at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
{code}
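
To illustrate why an underdetermined system can break a Cholesky-based solver, 
here is a minimal pure-Python sketch. It is not Spark's WeightedLeastSquares 
code: it ignores weights, intercept and standardization, and the helper names 
(gram, cholesky, try_cholesky) and the toy matrices are made up for this 
example. It only shows that X^T X built from fewer samples than features is 
singular, so the factorization hits a non-positive pivot.

```python
import math


def gram(rows):
    """Return X^T X for a matrix X given as a list of rows."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)]
            for i in range(n)]


def cholesky(a):
    """Return lower-triangular L with L L^T = a.

    Raises ValueError when a pivot is not strictly positive, i.e. when
    the matrix is not positive definite.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError(
                f"matrix is not positive definite (pivot {j} = {pivot})")
        l[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            off_diag = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = off_diag / l[j][j]
    return l


def try_cholesky(rows):
    """Return "ok" if X^T X can be factorized, "failed" otherwise."""
    try:
        cholesky(gram(rows))
        return "ok"
    except ValueError:
        return "failed"


# Hypothetical toy data, not taken from sample_libsvm_data.txt.
underdetermined = [[1.0, 2.0]]         # 1 sample, 2 features
well_posed = [[1.0, 0.0], [1.0, 1.0]]  # 2 samples, 2 independent features

print(try_cholesky(underdetermined))
print(try_cholesky(well_posed))
```

An iterative solver such as "l-bfgs" never factorizes X^T X, which is 
consistent with it still producing a model on the same data.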

  was:
Weighted Least Squares (WLS) is one of the optimization methods for solving 
Linear Regression (when #features < 4096). But if the dataset is very 
ill-conditioned (for example, 0-1 labels used for classification, with an 
underdetermined system of equations), WLS fails. The failure is caused by the 
underlying LAPACK library returning an error value during Cholesky 
decomposition.
This issue is easy to reproduce: train a LinearRegressionModel with the 
"normal" solver on the example dataset 
(https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt).
The following is the exception:
{code}
assertion failed: lapack.dpotrs returned 1.
java.lang.AssertionError: assertion failed: lapack.dpotrs returned 1.
        at scala.Predef$.assert(Predef.scala:179)
        at org.apache.spark.mllib.linalg.CholeskyDecomposition$.solve(CholeskyDecomposition.scala:42)
        at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:117)
        at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:180)
        at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:67)
        at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
{code}


> WLS can not resolve some kinds of equation
> ------------------------------------------
>
>                 Key: SPARK-11918
>                 URL: https://issues.apache.org/jira/browse/SPARK-11918
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>            Reporter: Yanbo Liang
>         Attachments: R_GLM_output
>
>


