Hi Joseph,
I've just tried that out. MLlib indeed returns different models. I see
no problem here, then. How is Filipp's issue possible?
Best,
Valeriy.
On 04/27/2018 10:00 PM, Valeriy Avanesov wrote:
Hi all,
maybe I'm missing something, but from what was discussed here I've
gathered
Hi Valeriy,
Let me make sure we are on the same page.
"the current mllib implementation returns exactly the same model whether
standardization is turned on or off. " This should be corrected as "the
current mllib implementation returns exactly the same model whether
standardization is turned on o
Hi all,
maybe I'm missing something, but from what was discussed here I've
gathered that the current MLlib implementation returns exactly the same
model whether standardization is turned on or off.
I suggest considering an R script (please see below) which trains two
penalized logistic regr
As I’m one of the original authors, let me chime in with some comments.
Without standardization, LBFGS will be unstable. For example, if a
feature is multiplied by 10, then the corresponding coefficient should be
divided by 10 to make the same prediction. But without standardization, the LBFGS will con
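The scale-invariance point above can be sketched in plain Python (illustrative values only, not MLlib internals): multiplying a feature by 10 while dividing its coefficient by 10 leaves the logistic prediction unchanged.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Plain logistic regression prediction: sigmoid(w . x + b)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

w = [0.5, -1.2]   # hypothetical coefficients
b = 0.3
x = [2.0, 4.0]    # hypothetical feature vector

# Multiply the first feature by 10 and divide its coefficient by 10:
x_scaled = [x[0] * 10.0, x[1]]
w_scaled = [w[0] / 10.0, w[1]]

p1 = predict(w, b, x)
p2 = predict(w_scaled, b, x_scaled)
assert abs(p1 - p2) < 1e-12  # identical predictions
```

So the unpenalized optimum is the same up to this rescaling; the issue is purely about the conditioning of the optimization, which is what standardization addresses.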
Right. If the regularization term isn't zero, then enabling/disabling
standardization will give different results.
But when comparing results between R's glmnet and MLlib, if we set the same
parameters for regularization/standardization/... , then we should get the
same result. If not, then maybe there's a bug.
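A minimal sketch of why regularization breaks the equivalence (hypothetical numbers, not MLlib code): an L2 penalty computed on coefficients in the standardized space weights each original-scale coefficient by its feature's standard deviation, so the two objectives differ unless every standard deviation is 1.

```python
sigma = [2.0, 0.5]      # hypothetical per-feature standard deviations
w_orig = [0.8, -0.4]    # coefficients on the original feature scale

# The same model expressed in standardized space has coefficients
# w_std[j] = w_orig[j] * sigma[j]:
w_std = [wj * sj for wj, sj in zip(w_orig, sigma)]

penalty_orig = sum(wj ** 2 for wj in w_orig)  # L2 penalty, original scale
penalty_std = sum(wj ** 2 for wj in w_std)    # L2 penalty, standardized scale

# The penalties differ, so the penalized optima differ too.
assert penalty_orig != penalty_std
```

The unpenalized loss is identical in both parameterizations, but the penalty is not, which is why the fitted models diverge once L1/L2/elastic-net regularization is turned on.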
Hi all.
Filipp, do you use L1/L2/elastic-net penalization? I believe in this case
standardization matters.
Best,
Valeriy.
On 04/17/2018 11:40 AM, Weichen Xu wrote:
Not a bug.
When disabling standardization, MLlib LR will still do standardization
for features, but it will scale the coefficie
Not a bug.
When disabling standardization, MLlib LR will still do standardization for
features, but it will scale the coefficients back at the end (after
training finishes). So it will get the same result as training with no
standardization. The purpose of this is to improve the rate of convergence. So th
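The back-scaling step described above can be sketched as follows (a simplified illustration with made-up numbers, not the actual MLlib code): coefficients learned on standardized features z = (x - mu) / sigma map back to the original scale via w_orig[j] = w_std[j] / sigma[j], with the intercept absorbing the mean shift.

```python
def logit(w, b, x):
    # Linear part of logistic regression: w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

mu = [3.0, -1.0]          # hypothetical feature means
sigma = [2.0, 0.5]        # hypothetical feature standard deviations
w_std, b_std = [0.8, -0.4], 0.1   # model fitted on standardized features

x = [5.0, 0.0]                               # a raw feature vector
z = [(xi - m) / s for xi, m, s in zip(x, mu, sigma)]  # its standardized form

# Map the model back to the original feature scale:
w_orig = [wj / sj for wj, sj in zip(w_std, sigma)]
b_orig = b_std - sum(wj * mj / sj for wj, mj, sj in zip(w_std, mu, sigma))

# Both parameterizations produce the same linear predictor:
assert abs(logit(w_std, b_std, z) - logit(w_orig, b_orig, x)) < 1e-12
```

With no regularization this round trip is exact, which is why training on standardized features and rescaling afterwards yields the same model while converging faster.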
Hi Filipp,
MLlib’s LR implementation handles standardization the same way as R’s glmnet.
Actually you don’t need to care about the implementation detail, as the
coefficients are always returned on the original scale, so it should return
the same results as other popular ML libraries.
Could yo
Hi all,
While migrating from a custom LR implementation to MLlib's LR implementation,
my colleagues noticed that prediction quality dropped (according to
different business metrics).
It turned out that this issue is caused by the feature standardization performed
by MLlib's LR: regardless of the 'standardizatio