Seth Hendrickson created SPARK-21405:
----------------------------------------

             Summary: Add LBFGS solver for GeneralizedLinearRegression
                 Key: SPARK-21405
                 URL: https://issues.apache.org/jira/browse/SPARK-21405
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 2.3.0
            Reporter: Seth Hendrickson


GeneralizedLinearRegression in Spark ML currently supports at most 4096 
features because its optimizer is IRLS, which in turn uses WLS, and WLS relies 
on collecting the covariance matrix to the driver. GLMs can also be fit by 
simple gradient-based methods such as L-BFGS.
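
To make the scaling difference concrete, here is a rough back-of-the-envelope 
sketch of the state each approach has to hold on one node. It assumes the 
matrix is stored as a packed upper triangle of 8-byte doubles and that L-BFGS 
keeps 10 correction pairs; both figures are assumptions for illustration, and 
the function names are hypothetical, not Spark APIs.

```python
def irls_driver_bytes(num_features, bytes_per_value=8):
    """Bytes for the packed upper triangle of a d x d symmetric matrix.

    Assumption: d * (d + 1) / 2 values of 8 bytes each are collected.
    """
    d = num_features
    return d * (d + 1) // 2 * bytes_per_value


def lbfgs_driver_bytes(num_features, history=10, bytes_per_value=8):
    """Bytes for one gradient vector plus `history` (s, y) correction pairs.

    Assumption: history=10 is an illustrative setting, not a confirmed default.
    """
    d = num_features
    return (1 + 2 * history) * d * bytes_per_value


for d in (4096, 100_000, 1_000_000):
    print(d, irls_driver_bytes(d), lbfgs_driver_bytes(d))
```

The matrix cost grows quadratically in the number of features, while the 
L-BFGS state grows linearly, which is why a gradient-based solver can go well 
past the current limit.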

The new API from 
[SPARK-19762|https://issues.apache.org/jira/browse/SPARK-19762] makes this easy 
to add. I've already prototyped it, and it works pretty well. This change would 
allow an arbitrary number of features (limited only by what fits on a single 
node), as in linear and logistic regression.
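
As an illustration of the idea (not the prototype mentioned above, and not 
Spark code), here is a minimal single-machine sketch that fits a Poisson GLM 
with a log link using a hand-written L-BFGS. The helper names, the Armijo 
backtracking line search, and the noise-free synthetic data are all 
assumptions chosen to keep the example small and checkable.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def poisson_loss_and_grad(beta, rows):
    """Mean negative log-likelihood (up to a constant) and its gradient
    for a Poisson GLM with log link. rows is a list of (x, y) pairs."""
    n, d = len(rows), len(beta)
    loss, grad = 0.0, [0.0] * d
    for x, y in rows:
        eta = dot(beta, x)          # linear predictor
        mu = math.exp(eta)          # inverse of the log link
        loss += mu - y * eta
        for j in range(d):
            grad[j] += (mu - y) * x[j]
    return loss / n, [g / n for g in grad]


def lbfgs(f, x0, history=10, max_iter=100, tol=1e-6):
    """Minimize f, where f(x) returns (value, gradient)."""
    x = list(x0)
    fx, g = f(x)
    pairs = []  # (s, y, rho) correction pairs, oldest first
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        # Two-loop recursion: apply the inverse-Hessian approximation to g.
        q, alphas = list(g), []
        for s, y, rho in reversed(pairs):
            a = rho * dot(s, q)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if pairs:
            s, y, _ = pairs[-1]
            gamma = dot(s, y) / dot(y, y)
            q = [gamma * qi for qi in q]
        for (s, y, rho), a in zip(pairs, reversed(alphas)):
            b = rho * dot(y, q)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        direction = [-qi for qi in q]
        # Backtracking line search with the Armijo sufficient-decrease test.
        slope, step = dot(g, direction), 1.0
        while True:
            x_new = [xi + step * di for xi, di in zip(x, direction)]
            f_new, g_new = f(x_new)
            if f_new <= fx + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        if sy > 1e-12:  # keep only pairs with positive curvature
            pairs.append((s, y, 1.0 / sy))
            pairs = pairs[-history:]
        x, fx, g = x_new, f_new, g_new
    return x


# Noise-free synthetic data: y equals the model mean, so the minimizer
# is the generating coefficient vector (intercept 0.5, slope 1.0).
true_beta = [0.5, 1.0]
rows = [([1.0, x], math.exp(true_beta[0] + true_beta[1] * x))
        for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
beta_hat = lbfgs(lambda b: poisson_loss_and_grad(b, rows), [0.0, 0.0])
print(beta_hat)
```

Only the loss and gradient touch the data, and both are sums over rows, so in 
a distributed setting they could be aggregated across partitions while the 
optimizer state stays proportional to the number of features.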

For reference, other GLM packages also support this - e.g. statsmodels, H2O.



