[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@srowen @sethah 
Thanks for all the helpful discussions! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16149
  
Merged to master





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/16149
  
LGTM 





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16149
  
**[Test build #3495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3495/consoleFull)** for PR 16149 at commit [`6e6c48b`](https://github.com/apache/spark/commit/6e6c48b79065666e1e896eec76e1ffa8cb751b6e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-13 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16149
  
**[Test build #3495 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3495/consoleFull)** for PR 16149 at commit [`6e6c48b`](https://github.com/apache/spark/commit/6e6c48b79065666e1e896eec76e1ffa8cb751b6e).





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-12 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@srowen @sethah 
One more commit that adds a test case with `weight = 4.7` which will round 
up to 5 to test the case @sethah described. All tests passed.  I'm pretty sure 
R's rounding is the same as what I'm doing here. Please merge if there is no 
other issue. Thanks. 
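
A small sketch of the rounding question, in Python rather than the PR's Scala. It assumes the patch rounds halves upward (as `floor(x + 0.5)` does); the helper name `round_half_up` is hypothetical. R documents its `round` as following the IEC 60559 ties-to-even rule, which Python's built-in `round` also uses, so the two conventions can disagree only on exact halves and agree on a value like 4.7.

```python
import math

def round_half_up(x: float) -> int:
    """Round to the nearest integer, sending exact halves upward (assumed convention)."""
    return int(math.floor(x + 0.5))

# Non-tie values: both conventions agree.
print(round_half_up(4.7), round(4.7))
print(round_half_up(0.4), round(0.4))

# Exact halves: the conventions can differ.
print(round_half_up(2.5), round(2.5))
```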





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-10 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@sethah Would you please review this? Thanks. 





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-08 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@sethah @srowen 
I updated the documentation. I think we have everything needed for this 
fix. Please merge and close this PR if there is no other issue. Thanks much for 
all the comments. 





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-07 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@sethah @srowen I have added a comment to the `weightCol` doc for the 
Binomial case. 
I also updated to test the case `weight < 0.5`, i.e., `round(weight) = 0`. 
All tests passed.  
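
To illustrate why `round(weight) = 0` needs its own test, here is a hedged sketch in Python (not the PR's Scala code). It assumes the log-likelihood term behind the AIC treats the rounded weight as the number of trials and the rounded `y * weight` as the number of successes, and that a record whose weight rounds to 0 contributes nothing. The names `round_half_up`, `log_binom_pmf`, and `binomial_neg2_loglik` are hypothetical.

```python
import math

def round_half_up(x: float) -> int:
    """Round to the nearest integer, halves upward (assumed convention)."""
    return int(math.floor(x + 0.5))

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of k successes, for 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)

def binomial_neg2_loglik(records) -> float:
    """-2 * log-likelihood over (y, mu, weight) records, y being a proportion in [0, 1]."""
    total = 0.0
    for y, mu, weight in records:
        n = round_half_up(weight)
        if n == 0:
            continue  # weight < 0.5 rounds to zero trials: no contribution
        k = round_half_up(y * weight)
        total += log_binom_pmf(k, n, mu)
    return -2.0 * total

# A record with weight 0.4 drops out; one success in two trials at mu = 0.5 does not.
print(binomial_neg2_loglik([(1.0, 0.5, 0.4)]))
print(binomial_neg2_loglik([(0.5, 0.5, 2.0)]))
```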





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-07 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/16149
  
@srowen We can add a note to the doc for `setWeightCol`. We could also use 
`logInfo` about weights needing to be integer values for `Binomial` family, but 
that may not be very effective.





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-07 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@srowen @sethah 
I have cleaned up the change as suggested. Please review and let me know if 
there is any question.





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-07 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16149
  
What you have is fine (though I might name it `ylogy` or something). I 
don't see other places in the code that compute x ln x or something similar, so 
it's OK to make this a private function. You might just make it a `private def` or 
a local helper method rather than instantiate a lambda, but the difference is 
trivial.
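
A minimal sketch of such a helper, written in Python rather than Scala. The name follows the `ylogy` suggestion; the two-argument signature `(y, mu)` and the choice to return 0 at `y == 0` (the limit of `y ln y` as `y` goes to 0) are assumptions for illustration, not the signature in the patch.

```python
import math

def ylogy(y: float, mu: float) -> float:
    """Return y * ln(y / mu), defined as 0.0 when y == 0 to avoid log(0)."""
    if y == 0.0:
        return 0.0
    return y * math.log(y / mu)

print(ylogy(0.0, 0.3))  # the y == 0 branch
print(ylogy(1.0, 0.5))  # equals ln(2)
```

A named function like this keeps the zero guard in one place instead of repeating it at every call site.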





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-07 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
@srowen @sethah 
Thanks for the comments. Yes, the major use case is to be able to handle 
multiple trials (integer weight, real-valued response). Indeed, a better way to 
do this is through `offset`, which I have proposed to do in this JIRA 
[SPARK-18710](https://issues.apache.org/jira/browse/SPARK-18710). Please let me 
know if this is worth pursuing. 

I have submitted another two commits.

1. One commit makes a minimal modification to the existing Binomial GLM test 
so that one response record is now non-integer. This test still failed because 
the deviance residual calculation seems to work only when `y` is exactly 0 or 1.
2. The second commit fixes the issue in calculating the deviance. But I 
think the code can be improved, especially regarding the function `y_logy`. 
What's the best way to create a utility function like this? Please advise. 

With the two commits, all tests now pass, including the ones on AIC and 
the deviance.
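
For reference, a hedged sketch of the standard binomial unit deviance that stays valid for a fractional response, in Python rather than the PR's Scala. The function names are hypothetical and this is not the patch itself; it only shows how a `y_logy`-style helper makes the formula `2 * w * [y ln(y/mu) + (1 - y) ln((1 - y)/(1 - mu))]` safe at `y = 0` and `y = 1`.

```python
import math

def _ylogy(y: float, mu: float) -> float:
    """y * ln(y / mu), taken as 0.0 at y == 0."""
    return 0.0 if y == 0.0 else y * math.log(y / mu)

def binomial_unit_deviance(y: float, mu: float, weight: float = 1.0) -> float:
    """Weighted binomial unit deviance for 0 <= y <= 1 and 0 < mu < 1."""
    return 2.0 * weight * (_ylogy(y, mu) + _ylogy(1.0 - y, 1.0 - mu))

# Binary responses reduce to -2 * ln(mu) and -2 * ln(1 - mu).
print(binomial_unit_deviance(1.0, 0.8))
print(binomial_unit_deviance(0.0, 0.8))

# A fractional response is handled by the same expression.
print(binomial_unit_deviance(0.6, 0.5, weight=5.0))
```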






[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16149
  
**[Test build #3470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3470/consoleFull)** for PR 16149 at commit [`7fdab86`](https://github.com/apache/spark/commit/7fdab860f740de558fa1281255b5e7dc35480d7d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16149
  
**[Test build #3470 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3470/consoleFull)** for PR 16149 at commit [`7fdab86`](https://github.com/apache/spark/commit/7fdab860f740de558fa1281255b5e7dc35480d7d).





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-05 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16149
  
Jenkins, add to whitelist





[GitHub] spark issue #16149: [SPARK-18715][ML]Fix AIC calculations in Binomial GLM

2016-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16149
  
Can one of the admins verify this patch?

