GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/16699
[SPARK-18710] Add offset in GLM
## What changes were proposed in this pull request?
Add support for offset in GLM. This is useful for at least two reasons:
1. Account for exposure: e.g., when modeling the number of accidents, we
may need to use miles driven (typically on the log scale, with a log link) as
an offset to assess the effect of factors on accident frequency.
2. Test incremental effects of new variables: we can use predictions from
the existing model as offset and run a much smaller model on only new
variables. This avoids re-estimating the large model with all variables (old +
new) and can be very important for efficient large-scale analysis.
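
To make the role of the offset concrete, here is a minimal sketch in plain Python. It is not the Spark implementation in this patch. It fits a Poisson GLM with a log link, `log(E[y]) = b0 + b1 * x + offset`, where the offset enters the linear predictor with its coefficient fixed at 1. The function name `fit_poisson_with_offset`, the data, and the rate values are all hypothetical and chosen only for illustration.

```python
import math

def fit_poisson_with_offset(x, y, offset, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x + offset by Newton-Raphson.

    The offset is added to the linear predictor as-is; no coefficient
    is estimated for it. Illustrative helper, not a Spark API.
    """
    # Start from the pooled rate with no covariate effect.
    b0 = math.log(sum(y) / sum(math.exp(o) for o in offset))
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi + oi) for xi, oi in zip(x, offset)]
        # Score (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information, a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g, using the closed-form 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical exposure data: miles driven per policy and a binary risk factor.
miles = [1000.0, 2000.0, 1500.0, 500.0, 3000.0, 2500.0]
x = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

# Noise-free expected accident counts: 0.002 accidents per mile at baseline,
# multiplied by exp(0.5) when the risk factor is present.
true_b0, true_b1 = math.log(0.002), 0.5
y = [m * math.exp(true_b0 + true_b1 * xi) for m, xi in zip(miles, x)]

# Use log(miles) as the offset so the coefficients describe frequency per mile.
offset = [math.log(m) for m in miles]
b0, b1 = fit_poisson_with_offset(x, y, offset)
print("baseline rate per mile:", math.exp(b0))
print("risk factor coefficient:", b1)
```

Because the counts above are noise-free, the fit recovers the baseline rate and the coefficient up to numerical tolerance. The second use case follows the same pattern under the assumption that the existing model's linear predictor (its prediction on the link scale) is passed as `offset`, so that only the coefficients of the new variables are estimated.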
## How was this patch tested?
Added a new test for GLM with offset.
@yanboliang @srowen @felixcheung @sethah
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/actuaryzhang/spark offset
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16699.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16699
----
commit 3bf2718c1a1e68273508e63499bb5d1cc8230155
Author: actuaryzhang <[email protected]>
Date: 2017-01-24T23:46:16Z
add trait offset
commit 0e240eb313aa91cb645fb3ab8d70e51b6c65b3c7
Author: actuaryzhang <[email protected]>
Date: 2017-01-24T23:48:03Z
add offset setter
commit 9c41453a19c0f9c31403fafaf1995c642c37c70d
Author: actuaryzhang <[email protected]>
Date: 2017-01-25T05:15:50Z
implement offset in GLM
commit 7823f8af8b0926790816c9e79e9425e503e494ad
Author: actuaryzhang <[email protected]>
Date: 2017-01-25T06:55:56Z
add test for glm with offset
----