Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16630
Could somebody help review this PR? I think this will make gathering the
estimation results in Scala much easier. This will also be helpful in
constructing the tests. For example, the GLM
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah @imatiach-msft
Please review the new commit. Main changes:
- Fix issue in null deviance calculation in the presence of offset. Except
for special cases (Gaussian with
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah Yes, that is lots of work. However, the only critical change (since
the last commit) is on the calculation of the null deviance. The other changes
are mainly because of updating
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100581520
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1218,16 +1266,35 @@ class
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah Thanks much for your review.
Regarding prediction, both R and my implementation here allow prediction
with offsets. If the users want to get the predicted rates (instead of
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah The predict method can work with new data in R. See below. Shall we
focus on the current implementation, instead of discussing the details of the R
behavior? :)
Let me know if
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100974891
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala
---
@@ -798,77 +798,160 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100974912
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala
---
@@ -798,77 +798,160 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100975164
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -168,6 +179,7 @@ private[regression] trait
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100975556
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -406,6 +435,14 @@ object
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100975590
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -944,15 +981,27 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100976709
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1139,54 +1189,52 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16699#discussion_r100976816
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Instance.scala
---
@@ -27,3 +27,25 @@ import org.apache.spark.ml.linalg.Vector
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah Thanks much for your review. I've made a new commit that addressed
all your comments. Please see my inline comments. Let me know if there are any
other suggestions. Thanks.
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101158362
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -915,6 +917,22 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101158825
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1152,4 +1170,32 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101159105
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -915,6 +917,22 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101159146
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1152,4 +1170,32 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101159255
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala
---
@@ -1104,6 +1103,83 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101160069
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -915,6 +917,22 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101160217
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1152,4 +1170,32 @@ class
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16630
@felixcheung @imatiach-msft Thanks much for the review. Made most changes
suggested. Please see my inline replies.
---
If your project is set up for it, you can reply to this email and have
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Thanks for the discussions. Will work on this in two weeks.
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101822971
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala
---
@@ -1104,6 +1103,83 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r101822942
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala
---
@@ -1104,6 +1103,83 @@ class
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16630
@imatiach-msft @felixcheung
I cleaned up the tests as suggested, and also updated the R GLM wrapper to
use the result from this PR. Please let me know if there are any other
suggestions
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16630
@imatiach-msft I'm not sure the R^2s are used much in the GLM context. The
deviance, loglikelihood and AIC/BICs are most often used for ANOVA and model
comparison. The GLM
[book](
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16630
@felixcheung Could you take another look at this PR? Thanks.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16699
@sethah Is there anything else you would recommend for this PR? Thanks.
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17005
[SPARK-14659][ML] RFormula supports setting base level both by frequency
and alphabetically
## What changes were proposed in this pull request?
Current RFormula drops the least frequent
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17005
@srowen @jkbradley @felixcheung @sethah @yanboliang
One question is: is it better to move the `HasStringOrderType` trait to the
shared params? This is only used by StringIndexer and
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17005
Cannot figure out what exactly is causing the test to fail. The error message is
not informative. Any help please?
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17005
@HyukjinKwon Thanks. I'll try retesting this.
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17017
[SPARK-19682][SparkR] Issue warning (or error) when subset method "[["
takes vector index
## What changes were proposed in this pull request?
The `[[` method is supposed to tak
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17017
@felixcheung
Simple example to illustrate this
```
df <- suppressWarnings(createDataFrame(iris))
df[[1:2]]
```
Instead of issuing warning and taking the first elem
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/17017#discussion_r102648326
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1776,6 +1780,10 @@ setMethod("[[", signature(x = "SparkDataFrame", i =
"numericOrc
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r103003515
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -34,6 +35,7 @@ import org.apache.spark.rdd.RDD
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r103003591
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/r/GeneralizedLinearRegressionWrapper.scala
---
@@ -99,37 +95,23 @@ private[r] object
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16630#discussion_r103004564
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -1152,4 +1173,33 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r93565335
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -64,6 +64,27 @@ private[regression] trait
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r93565567
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -303,14 +341,15 @@ object
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen @yanboliang Thanks much for the feedback. I now have a better
understanding of the code and the issue. I have made new commits reflecting
your suggestions. The major changes are
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen Thanks for the comments. Makes lots of sense to move the switch to
subclass. I did not know one could override a `val`.
In the new commit, I have moved the `defaultLink` and
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r93672741
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -303,20 +337,24 @@ object
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Thanks much for the detailed comments. I have addressed all of
them in the new commits. Please take another look.
@srowen
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen @yanboliang
Any additional issues regarding this PR?
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen Made a new commit according to your suggestion. Everything looking
good now?
@yanboliang
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Did you get a chance to take another look at this? Thanks.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Thanks for the detailed review. I have made all changes you
suggested except for the part on the new power link function. Yes, the
canonical link in the Tweedie in general is `1.0
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r94849501
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -158,6 +183,16 @@ class
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r94849540
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -365,7 +401,6 @@ object
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r94849556
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -397,32 +432,121 @@ object
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Thanks for the feedback. However, I'm not sure why we need to
be consistent with R on this one. The usage of 'tweedie' glm almost always uses
`link.power = 0, 1
Github user actuaryzhang closed the pull request at:
https://github.com/apache/spark/pull/16344
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@srowen @yanboliang
I'm closing this PR since it does not seem to be very clean to integrate
into the current GLM setup. I appreciate all the comments and discussions.
GitHub user actuaryzhang reopened a pull request:
https://github.com/apache/spark/pull/16344
[SPARK-18929][ML] Add Tweedie distribution in GLM
## What changes were proposed in this pull request?
I propose to add the full Tweedie family into the
GeneralizedLinearRegression model
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
Jenkins, test this please
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
Sorry about closing this prematurely. I'm giving it another shot and I
think I have an elegant solution to include `linkPower`. The new commit adds
the following:
1. It implement
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Thanks. Look forward to your feedback.
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r96061873
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -613,25 +758,67 @@ object
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16344#discussion_r96061883
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -242,9 +316,9 @@ class
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
@yanboliang Thanks for the review and comments. I have made a new commit
that addressed all your comments. The main change is the new companion object
`FamilyAndLink` and factory methods to
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
Jenkins, test this please.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16344
Jenkins, retest this please
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17084
Thanks for the PR. I think this is helpful. Will take a look next week.
Quite swamped recently.
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17553
[SPARK-20026][Doc] Add Tweedie example for SparkR in programming guide
## What changes were proposed in this pull request?
Add Tweedie example for SparkR in programming guide.
The doc
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17553
@felixcheung
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17103
[Minor][Doc] Update GLM doc to include tweedie distribution
Update GLM documentation to include the Tweedie distribution. #16344
@jkbradley @yanboliang
You can merge this pull
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17105
[SPARK-19773][SparkR] SparkDataFrame should not allow duplicate names
## What changes were proposed in this pull request?
SparkDataFrame in SparkR seems to accept duplicate names at
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17105
@felixcheung
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17105
@felixcheung Ahh, it seems that we have some conflicting design issues.
1. From the test in collect() and crossJoin, it seems to allow dup names in
SparkDataFrame by design
Github user actuaryzhang closed the pull request at:
https://github.com/apache/spark/pull/17105
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17105
@felixcheung Thanks for the clarification. I will close this then.
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17115
[Doc][Minor] Update R doc
Update R doc:
1. columns, names and colnames return a vector of strings, not **list** as
in current doc.
2. `colnames<-` does allow the subset assignm
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17115
@felixcheung I see lots of the SparkDataFrame methods use the following in
examples:
```
path <- "path/to/file.json"
df <- read.json(path)
```
I'
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17115
@HyukjinKwon Thanks. Updated title.
@felixcheung Updated doc and added tests. Thanks!
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17115
@srowen @felixcheung
Thanks for the clarification. I will open another PR to add real data
examples for the SparkDataFrame methods.
I have seen lots of R packages document the
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17159
[SPARK-19818][SparkR] union should check for name consistency of input data
frames
## What changes were proposed in this pull request?
Added checks for name consistency of input data
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17159
The current implementation accepts data frames with different schemas. See
issues below:
```
df <- createDataFrame(data.frame(name = c("Michael", "Andy", "
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17161
[SPARK-19819][SparkR] Use concrete data in SparkR DataFrame examples
## What changes were proposed in this pull request?
Many examples in SparkDataFrame methods use:
```
path
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17161
I think most examples in R packages are (supposed to be) runnable. Coming
from a user perspective, I find it useful if I can run the examples directly
and see what the function does in action
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17159
@felixcheung OK, did not know it was by design. It does seem that the
`union` behavior is similar to R's SQL (in `sqldf`), but as you pointed out,
the overload method `rbind` is diff
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17159
Makes sense. Made changes to rbind and added tests. Please take a look.
Thanks.
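The kind of name-consistency check discussed in this thread can be sketched in plain Python, without Spark. This is an illustration under assumptions, not the SparkR implementation: `check_same_names` is a hypothetical helper, and each data frame is represented only by its list of column names.

```python
from typing import List, Sequence


def check_same_names(frames: Sequence[List[str]]) -> None:
    """Raise ValueError unless every input has the same column names, in order.

    Hypothetical helper for illustration; each "frame" is just a list of names.
    """
    if not frames:
        return
    expected = frames[0]
    # Compare every later input against the first one, position by position.
    for position, names in enumerate(frames[1:], start=2):
        if names != expected:
            raise ValueError(
                f"Names of input {position} ({names}) differ "
                f"from the first input ({expected})"
            )
```

Calling `check_same_names([["name", "age"], ["age", "name"]])` raises, because the names match as a set but not in order; whether order should matter is a design choice this sketch makes for simplicity.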
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/17159#discussion_r104335939
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2685,7 +2686,8 @@ setMethod("unionAll",
#' Union two or more SparkDataFrames
#
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Sorry for taking so long for this update.
I think your first suggestion makes most sense, i.e., we do not expose the
internal `tweedie`.
When `statmod` is loaded
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Yes, the SparkR `tweedie` is not exported. See below.
```
model1 <- spark.glm(training, Sepal_Width ~ Sepal_Length + Species,
+ fam
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17146
Will take a look tonight.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/17146
This looks good to me. Thanks
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/17146#discussion_r104853968
--- Diff: python/pyspark/ml/tests.py ---
@@ -1223,6 +1223,26 @@ def test_apply_binary_term_freqs(self
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Could you take a look at this new fix when you get a chance?
Thanks.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung If we go with # 3, do we still want compatibility with
statmod::tweedie? It's confusing to have two different ways of specifying the
same model.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung OK, new implementation of # 3. Now works in two ways:
1. `family = "tweedie"` + `variancePower` + `linkPower`
2. When `statmod` is available, `tweedie()`
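To make the two parameters concrete, here is a minimal Python sketch (not the SparkR or Spark ML code) of what a variance power and a link power conventionally mean for a Tweedie GLM: the variance function is V(mu) = mu^variancePower, and the power link is g(mu) = mu^linkPower, with a link power of 0 read as the log link. The function names are hypothetical and chosen for illustration only.

```python
import math


def tweedie_variance(mu: float, variance_power: float) -> float:
    """Variance function V(mu) = mu ** variance_power."""
    return mu ** variance_power


def power_link(mu: float, link_power: float) -> float:
    """Power link g(mu) = mu ** link_power; link_power == 0 means log(mu)."""
    if link_power == 0.0:
        return math.log(mu)
    return mu ** link_power


def power_unlink(eta: float, link_power: float) -> float:
    """Inverse of power_link: recover the mean mu from the linear predictor eta."""
    if link_power == 0.0:
        return math.exp(eta)
    return eta ** (1.0 / link_power)
```

Under this convention, variance powers of 0, 1 and 2 give the Gaussian, Poisson and Gamma variance functions, and link powers of 1, -1 and 0.5 give the identity, inverse and square-root links.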
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
One other change I could make is to change `variancePower` and `linkPower`
to `var.power` and `link.power` to be consistent with `statmod`. But I would
like to get your feedback on this new
Github user actuaryzhang commented on a diff in the pull request:
https://github.com/apache/spark/pull/16729#discussion_r105576051
--- Diff: R/pkg/R/mllib_regression.R ---
@@ -100,6 +120,12 @@ setMethod("spark.glm", signature(data =
"SparkDataFrame"
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Thanks for the feedback. Made a new commit that
1. change `variancePower` and `linkPower` to `var.power` and `link.power`.
2. use `link = NULL` for tweedie family
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
Sorry that I forgot to address that comment. Fixed now.
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/16729
@felixcheung Could you merge this please? Thanks!
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/15683
@srowen @thunterdb
I just updated the unit test for the Poisson GLM (only for the log link). The
simulated data are now forced to take values of zero. Existing data generation
is not
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/15683
@sethah Thanks for the review and comments. I now created a separate unit
test. It also passed the style test.
I accidentally merged master into a branch... and don't know h
Github user actuaryzhang commented on the issue:
https://github.com/apache/spark/pull/15683
@sethah Thanks for your review and suggestion. I have made a new commit
reflecting your comments.
@srowen Thanks for all the suggestions. When do you think this change could
be