Imran Younus created SPARK-13777:
------------------------------------

             Summary: Weighted Least Squares fails when there are features 
with identical values.
                 Key: SPARK-13777
                 URL: https://issues.apache.org/jira/browse/SPARK-13777
             Project: Spark
          Issue Type: Bug
          Components: ML
            Reporter: Imran Younus
            Priority: Minor


The "normal" solver in LinearRegression uses Cholesky decomposition to 
calculate the coefficients. If the data contains a feature whose values are 
all identical (zero variance), the (A^T A) matrix is no longer positive 
definite and the Cholesky decomposition fails.
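
A minimal sketch of the failure, in plain Python rather than Spark code. It
assumes the features are centered before A^T A is formed; the helper names
(gram, center, cholesky) are made up for illustration and are not part of
any Spark API. A constant feature becomes an all-zero column after
centering, so its diagonal entry in A^T A is zero and the factorization hits
a non-positive pivot.

```python
import math


def center(column):
    """Subtract the mean from every value in a column."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def gram(columns):
    """Return A^T A for a matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]


def cholesky(m):
    """Textbook Cholesky; raises ValueError on a non-positive pivot."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [5.0, 5.0, 5.0, 5.0]  # constant feature: zero variance

ata = gram([center(x1), center(x2)])

try:
    cholesky(ata)
except ValueError as err:
    print("Cholesky failed:", err)
```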

In the same case, the "l-bfgs" solver sets the coefficients of these constant 
features to zero and produces valid coefficients for the remaining features. 
This behaviour is consistent with glmnet in R. The "normal" solver should do 
the same.
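
One way the requested behaviour could look, sketched in plain Python. This
is an assumption about a possible approach, not the "l-bfgs" or glmnet
implementation and not a proposed Spark patch: constant columns are left out
of the normal equations, the reduced system is solved, and the omitted
features get a coefficient of zero. The function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ignoring_constant_features(columns, y):
    """Least squares with intercept; constant columns get coefficient 0.0."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]

    # Keep only columns that still vary after centering. An exact-zero test
    # is enough for this toy data; real data would need a tolerance.
    keep = [j for j, c in enumerate(centered) if any(v != 0.0 for v in c)]

    # Normal equations restricted to the kept columns.
    a = [[sum(p * q for p, q in zip(centered[i], centered[j])) for j in keep]
         for i in keep]
    b = [sum(p * q for p, q in zip(centered[i], yc)) for i in keep]

    coef = [0.0] * len(columns)
    for j, value in zip(keep, solve(a, b)):
        coef[j] = value
    intercept = y_mean - sum(c * m for c, m in zip(coef, means))
    return coef, intercept


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [5.0, 5.0, 5.0, 5.0]        # constant feature
y = [3.0, 5.0, 7.0, 9.0]         # y = 2 * x1 + 1

coef, intercept = fit_ignoring_constant_features([x1, x2], y)
print(coef, intercept)
```

Here the constant feature x2 ends up with a zero coefficient while x1 and
the intercept are still recovered.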






