Github user yanboliang commented on the pull request:
https://github.com/apache/spark/pull/10639#issuecomment-173219998
@mengxr Thanks for your comments! IRLS is not inherently tied to GLMs, so
it makes sense to decouple them. Based on your suggestion, I propose the
following API for ```IterativelyReweightedLeastSquares```:
```Scala
private[ml] class IterativelyReweightedLeastSquares(
    val initialModel: WeightedLeastSquaresModel,
    val reweightedFunction: (RDD[Instance], WeightedLeastSquaresModel) =>
      RDD[(Double, Double)],
    val fitIntercept: Boolean,
    val regParam: Double,
    val standardizeFeatures: Boolean,
    val standardizeLabel: Boolean,
    val maxIter: Int,
    val tol: Double) extends Logging with Serializable {
  ......
}
```
where ```initialModel``` is the initial guess and ```reweightedFunction``` is
used to update ```y/z``` (the adjusted response variable) and ```weights``` on
each iteration. The loop terminates when the change in the model between
iterations is no greater than the tolerance ```tol```, or after ```maxIter```
iterations.
This framework can fit GLMs, Lp regression and Lasso. I will update this PR
along these lines if that's OK. If I have misunderstood anything, please
correct me. Thanks!
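
To make the proposed loop concrete, here is a rough, Spark-free sketch of the same idea for L1 regression. It is only an illustration, not the code in this PR: ```IrlsSketch```, ```Instance```, ```Model```, ```fitWls```, ```irls``` and ```l1Reweight``` are all hypothetical names, the data is a local ```Seq``` with a single feature instead of an ```RDD```, and I am assuming the pair returned by the reweight function means (new label, new weight).

```Scala
// Hypothetical, local illustration of IRLS; not the Spark ML implementation.
object IrlsSketch {
  // One observation with a single scalar feature (kept scalar for brevity).
  final case class Instance(label: Double, weight: Double, feature: Double)
  // Fitted line: label ~ intercept + slope * feature.
  final case class Model(intercept: Double, slope: Double)

  // Closed-form weighted least squares for one feature plus an intercept.
  def fitWls(data: Seq[Instance]): Model = {
    val sw = data.map(_.weight).sum
    val xBar = data.map(i => i.weight * i.feature).sum / sw
    val yBar = data.map(i => i.weight * i.label).sum / sw
    val sxy = data.map(i => i.weight * (i.feature - xBar) * (i.label - yBar)).sum
    val sxx = data.map(i => i.weight * (i.feature - xBar) * (i.feature - xBar)).sum
    val slope = sxy / sxx
    Model(yBar - slope * xBar, slope)
  }

  // Generic IRLS loop. `reweight` is assumed to return (new label, new weight)
  // for each instance, given the original data and the current model.
  def irls(
      data: Seq[Instance],
      initialModel: Model,
      reweight: (Seq[Instance], Model) => Seq[(Double, Double)],
      maxIter: Int,
      tol: Double): Model = {
    var model = initialModel
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      // Replace label and weight, keep the feature, then refit with WLS.
      val updated = data.zip(reweight(data, model)).map {
        case (inst, (z, w)) => inst.copy(label = z, weight = w)
      }
      val next = fitWls(updated)
      // Stop when the largest coefficient change is no greater than tol.
      val delta = math.max(
        math.abs(next.intercept - model.intercept),
        math.abs(next.slope - model.slope))
      converged = delta <= tol
      model = next
      iter += 1
    }
    model
  }

  // Reweighting for L1 (least absolute deviations) regression: labels stay
  // unchanged, each weight becomes priorWeight / max(|residual|, eps).
  def l1Reweight(data: Seq[Instance], m: Model): Seq[(Double, Double)] =
    data.map { i =>
      val r = math.abs(i.label - (m.intercept + m.slope * i.feature))
      (i.label, i.weight / math.max(r, 1e-8))
    }
}
```

Usage would look like ```IrlsSketch.irls(data, IrlsSketch.fitWls(data), IrlsSketch.l1Reweight, maxIter = 100, tol = 1e-6)```, with an ordinary WLS fit as the initial guess. A GLM would plug in a different reweight function that also replaces the label with the adjusted response ```z```.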
BTW, do you know which package is the most authoritative one for Lp
regression? I found ```pracma:::l1linreg``` and ```L1pack:::l1fit```, but they
use ```qr.solve()``` to solve the WLS equations and produce different results
from ML's ```WeightedLeastSquares``` when the ```weights``` are not all equal
to 1.0.