[
https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182643#comment-16182643
]
Weichen Xu commented on SPARK-3181:
-----------------------------------
I also vote to combine them into one estimator. Here are my two cents:
1. Regression with Huber loss is a kind of linear regression, so it makes
sense to let users switch between different loss functions.
2. Combining them into one estimator makes the feature more visible to users,
who can then easily try linear regression with different loss functions.
3. It will remove a lot of code duplication.
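To illustrate point 1, here is a minimal sketch in plain Python (not Spark code) of one fitting routine whose loss is just a parameter. The names loss_gradient and fit_intercept, the "squaredError"/"huber" strings, and the epsilon default of 1.35 are assumptions made for this example. It also ignores scale estimation, which a real implementation would likely need.

```python
def loss_gradient(residual, loss="squaredError", epsilon=1.35):
    """Derivative of the pointwise loss with respect to the residual."""
    if loss == "squaredError":
        return residual
    if loss == "huber":
        # Same as squared error near zero, but clipped to +/-epsilon so that
        # a single large residual has bounded influence on the fit.
        return max(-epsilon, min(epsilon, residual))
    raise ValueError(f"unknown loss: {loss}")


def fit_intercept(ys, loss="squaredError", epsilon=1.35, steps=200, lr=0.5):
    """Fit an intercept-only model y ~ b by gradient descent (hypothetical helper)."""
    b = 0.0
    for _ in range(steps):
        grad = sum(loss_gradient(y - b, loss, epsilon) for y in ys) / len(ys)
        # residual = y - b, so moving b along +grad decreases the loss
        b += lr * grad
    return b


ys = [1.0, 1.1, 0.9, 1.0, 50.0]  # four inliers and one outlier
b_squared = fit_intercept(ys, loss="squaredError")
b_huber = fit_intercept(ys, loss="huber")
```

With squared error the fit lands on the mean (10.8), dragged there by the single outlier, while with Huber loss it stays near the inliers (about 1.34). Only the loss argument changes between the two calls, and that is the kind of switch a combined estimator would expose.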
Thanks!
> Add Robust Regression Algorithm with Huber Estimator
> ----------------------------------------------------
>
> Key: SPARK-3181
> URL: https://issues.apache.org/jira/browse/SPARK-3181
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Affects Versions: 2.2.0
> Reporter: Fan Jiang
> Assignee: Yanbo Liang
> Labels: features
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> Linear least squares estimates assume the errors are normally distributed
> and can behave badly when the errors are heavy-tailed. In practice we get
> various types of data. We need to include Robust Regression to employ a
> fitting criterion that is not as vulnerable as least squares.
> In 1973, Huber introduced M-estimation for regression, where the "M" stands
> for "maximum likelihood type". The method is resistant to outliers in the
> response variable and has been widely used.
> The new feature for MLlib will add three new files:
> /main/scala/org/apache/spark/mllib/regression/RobustRegression.scala
> /test/scala/org/apache/spark/mllib/regression/RobustRegressionSuite.scala
> /main/scala/org/apache/spark/examples/mllib/HuberRobustRegression.scala
> and one new class HuberRobustGradient in
> /main/scala/org/apache/spark/mllib/optimization/Gradient.scala