Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/3325#issuecomment-63398036
This is actually an alternate form of the same gradient, but it works when
y is -1/+1, not when y is 0/1. It comes from the loss function `log(1 + exp(-y
* x.dot(w)))`. Here's a decent writeup: http://work.caltech.edu/library/093.pdf
You can work out that it's the same thing by plugging in the two possible
values of y in each case. Note there's an extra sign flip, since this is a
loss you minimize, whereas in the lecture notes you refer to, it's a
log-likelihood you maximize.
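For what it's worth, here's a quick numpy sketch checking that the two forms
agree once labels are mapped with y01 = (ypm1 + 1) / 2; the helper names
`grad_pm1` and `grad_01` are just mine, for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_pm1(w, x, y):
    # Gradient of log(1 + exp(-y * x.dot(w))) for y in {-1, +1}
    return -y * x / (1.0 + np.exp(y * x.dot(w)))

def grad_01(w, x, y):
    # Gradient of the negative log-likelihood for y in {0, 1}
    return (sigmoid(x.dot(w)) - y) * x

rng = np.random.default_rng(0)
w = rng.standard_normal(3)
x = rng.standard_normal(3)
for y in (-1.0, 1.0):
    # Same gradient under the -1/+1 -> 0/1 label mapping
    assert np.allclose(grad_pm1(w, x, y), grad_01(w, x, (y + 1.0) / 2.0))
```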
The "real" implementation in Spark uses a gradient more like what you
propose, and I find it a bit simpler myself.
If this changes, you'd have to make sure the input is not assumed to be
-1/+1 anywhere, since the new gradient wouldn't work then. I'm not sure how
much that's depended on in this example.
At the least, you could add clarifying comments about the input label
requirement!
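Even just a hypothetical guard like this (the `check_labels` name is mine)
would make the requirement explicit:

```python
def check_labels(labels):
    # This gradient form assumes labels are encoded as -1/+1, not 0/1.
    for y in labels:
        assert y in (-1.0, 1.0), "labels must be -1/+1 for this gradient form"

check_labels([-1.0, 1.0, 1.0])  # OK; a 0/1-encoded label would trip the assert
```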