[
https://issues.apache.org/jira/browse/SPARK-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shubhanshu Mishra closed SPARK-12808.
-------------------------------------
Resolution: Duplicate
> Formula based GLM in PySpark
> ----------------------------
>
> Key: SPARK-12808
> URL: https://issues.apache.org/jira/browse/SPARK-12808
> Project: Spark
> Issue Type: New Feature
> Components: MLlib, PySpark
> Reporter: Shubhanshu Mishra
> Priority: Minor
> Labels: GLM, mllib, pyspark
>
> I believe PySpark's mllib module should support GLMs, including the ability
> to define models using a formula. This is done in the Python package
> statsmodels:
> http://statsmodels.sourceforge.net/devel/example_formulas.html
> The formula feature can be implemented using the Python module patsy.
> Currently, SparkR supports a GLM module with a formula feature.
> I can take a shot at implementing the feature.
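
For readers unfamiliar with formula interfaces, here is a minimal pure-Python sketch of the idea: a formula string such as "clicks ~ age + income" names the response and the predictors, and is expanded into a response vector and a design matrix with an intercept column. The helper names (parse_formula, design_matrix) and the sample data are hypothetical and purely illustrative. This is not the patsy, statsmodels, or Spark API, and it handles only additive terms.

```python
def parse_formula(formula):
    """Split a formula like 'y ~ x1 + x2' into ('y', ['x1', 'x2']).

    Hypothetical helper: supports only '+'-separated column names.
    """
    lhs, sep, rhs = formula.partition("~")
    if not sep or not lhs.strip() or not rhs.strip():
        raise ValueError("expected a formula of the form 'y ~ x1 + x2'")
    terms = [t.strip() for t in rhs.split("+")]
    if any(not t for t in terms):
        raise ValueError("empty term in formula")
    return lhs.strip(), terms


def design_matrix(formula, rows):
    """Build (y, X, column_names) from a formula and a list of dict rows.

    X gets a leading intercept column of 1.0, as formula interfaces
    commonly add by default.
    """
    response, terms = parse_formula(formula)
    y = [row[response] for row in rows]
    X = [[1.0] + [float(row[t]) for t in terms] for row in rows]
    return y, X, ["Intercept"] + terms


# Made-up sample data, for illustration only.
rows = [
    {"clicks": 3, "age": 25, "income": 40.0},
    {"clicks": 7, "age": 41, "income": 65.5},
]
y, X, columns = design_matrix("clicks ~ age + income", rows)
```

A GLM fitting routine would then consume y and X. The appeal of a formula is that the model specification stays a single readable string instead of manual column selection and feature assembly.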
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)