[ 
https://issues.apache.org/jira/browse/SPARK-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187791#comment-15187791
 ] 

Imran Younus commented on SPARK-13777:
--------------------------------------

My understanding is that the Cholesky method decomposes the A^T.A matrix 
rather than A itself. A^T.A can be computed in a single pass through the 
data and moved to the driver, where the decomposition is done. As far as I 
know, this is not possible with the QR decomposition method of solving the 
normal equations.
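
To make the single-pass idea concrete, here is a minimal sketch in plain 
Python. It is not the Spark implementation: the function names and the toy 
data are made up for illustration, and an ordinary loop stands in for the 
distributed aggregation. The point is only that A^T.A and A^T.b are small 
(d x d and d) sums over rows, so they can be accumulated in one pass and 
then factored locally.

```python
import math

def gram_single_pass(rows):
    """Accumulate A^T A and A^T b in one pass over (features, label) rows."""
    ata, atb = None, None
    for x, y in rows:
        d = len(x)
        if ata is None:
            ata = [[0.0] * d for _ in range(d)]
            atb = [0.0] * d
        for i in range(d):
            atb[i] += x[i] * y
            for j in range(d):
                ata[i][j] += x[i] * x[j]
    return ata, atb

def cholesky(m):
    """Return lower-triangular L with L L^T = m; raise if m is not positive definite."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_normal_equations(ata, atb):
    """Solve (A^T A) x = A^T b using the Cholesky factor of A^T A."""
    L = cholesky(ata)
    n = len(atb)
    # Forward substitution: L z = A^T b
    z = [0.0] * n
    for i in range(n):
        z[i] = (atb[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Toy data generated from y = 1 + 2 * x; the first feature is an intercept column.
rows = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0)]
ata, atb = gram_single_pass(iter(rows))  # iterator: the data is read exactly once
coef = solve_normal_equations(ata, atb)
```

Only the d x d matrix and the length-d vector need to reach the driver, 
never the rows themselves.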

> Weighted Least Squares fails when there are features with identical values.
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-13777
>                 URL: https://issues.apache.org/jira/browse/SPARK-13777
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>            Reporter: Imran Younus
>            Priority: Minor
>
> The "normal" solver in LinearRegression uses Cholesky decomposition to 
> calculate the coefficients. If the data has features with identical values 
> (zero variance), then the (A^T A) matrix is no longer positive definite and 
> the Cholesky decomposition fails.
> For the same case, the "l-bfgs" solver sets the coefficients of these 
> constant features to zero and produces valid coefficients for the rest of 
> the features. This behaviour is consistent with glmnet in R. The "normal" 
> solver should do the same.
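
The failure described above can be reproduced with a small sketch in plain 
Python. This is an illustration only, not the Spark code: it assumes the 
features are centered before A^T A is formed, and all names and data are 
hypothetical. Under that assumption a constant feature becomes an all-zero 
column, its diagonal entry in A^T A is zero, and the Cholesky pivot for 
that feature cannot be computed. Dropping such features (equivalently, 
fixing their coefficients to zero) leaves a matrix that factors normally.

```python
import math

def centered_gram(rows):
    """Form A^T A after subtracting each column's mean (assumed preprocessing)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
             for j in range(d)] for i in range(d)]

def cholesky_failing_pivot(m):
    """Return None if Cholesky succeeds, else the index of the non-positive pivot."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return i
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return None

# The second feature is constant (zero variance).
rows = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
gram = centered_gram(rows)
failed_at = cholesky_failing_pivot(gram)

# Keep only features with non-zero variance; the dropped ones would get coefficient 0.
keep = [j for j in range(len(gram)) if gram[j][j] > 0.0]
reduced = [[gram[i][j] for j in keep] for i in keep]
reduced_failed_at = cholesky_failing_pivot(reduced)
```

Here `failed_at` identifies the constant feature, while `reduced_failed_at` 
is None because the remaining matrix is positive definite.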



