Github user mattf commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53416722
lgtm
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53481077
LGTM. Merged into `master` and `branch-1.1` (since it only adds new
methods and doesn't modify any existing code).
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/2091
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53305144
Why do we even need `evenBuckets`? Can't we just check whether the buckets
are evenly-spaced and automatically perform the optimization if they are? This
only
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53343906
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19159/consoleFull)
for PR 2091 at commit
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53343907
@JoshRosen I have removed `evenBuckets`, added more tests, and added some test
cases for the `str` type.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53350206
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19159/consoleFull)
for PR 2091 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53354293
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19165/consoleFull)
for PR 2091 at commit
Github user holdenk commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53355210
@JoshRosen Sure, doing a linear scan works. `evenBuckets` was there because the
caller knows whether it's providing even buckets.
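
The automatic check discussed in this exchange can be sketched in plain Python. This is only an illustration, not the code from the pull request: `is_evenly_spaced`, `bucket_index`, and the `tol` parameter are hypothetical names, and the buckets are assumed to be a sorted list of numeric boundaries. One linear scan over the boundaries decides whether the constant-time index computation can be used; otherwise the lookup falls back to a binary search.

```python
from bisect import bisect_right


def is_evenly_spaced(buckets, tol=1e-9):
    """Return True if consecutive boundaries are (almost) equally far apart.

    Hypothetical helper: a single linear scan over the boundaries.
    """
    if len(buckets) < 2:
        return False
    step = (buckets[-1] - buckets[0]) / (len(buckets) - 1)
    if step <= 0:
        return False
    return all(abs((b - a) - step) <= tol * abs(step)
               for a, b in zip(buckets, buckets[1:]))


def bucket_index(value, buckets):
    """Return the index of the bucket holding value, or None if out of range."""
    lo, hi = buckets[0], buckets[-1]
    if value < lo or value > hi:
        return None
    n = len(buckets) - 1  # number of buckets
    if is_evenly_spaced(buckets):
        # Constant-time path: divide the offset by the common bucket width.
        step = (hi - lo) / n
        return min(int((value - lo) / step), n - 1)
    # General path: binary search over the sorted boundaries.
    return min(bisect_right(buckets, value) - 1, n - 1)
```

In real use the even-spacing check would be done once per call rather than once per value, and floating-point boundaries may need extra care near bucket edges.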
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53359625
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19165/consoleFull)
for PR 2091 at commit
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53375345
When you guys merge this, please close
https://github.com/apache/spark/pull/122 as well. You should just edit the pull
request description to also say closes #122, and
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53378149
done.
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53180978
@mateiz @JoshRosen I would like to change `evenBuckets` to `even`; the
latter is meaningful enough and much shorter.
One concern is that we will have
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53144107
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19094/consoleFull)
for PR 2091 at commit
Github user davies commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53145718
Jenkins, retest this please.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53145884
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19098/consoleFull)
for PR 2091 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53147036
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19098/consoleFull)
for PR 2091 at commit
Github user mattf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16630390
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16633581
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53172796
These are _excellent_ unit tests. I ran a coverage report with
[coverage.py](http://nedbatchelder.com/code/coverage/) and it reports
essentially 100% coverage for the
Github user mattf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16627566
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user mattf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16627673
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user mattf commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16627753
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/2091#discussion_r16628887
--- Diff: python/pyspark/rdd.py ---
@@ -856,6 +856,104 @@ def redFunc(left_counter, right_counter):
return self.mapPartitions(lambda i:
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53143433
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19094/consoleFull)
for PR 2091 at commit
GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/2091
[SPARK-2871] [PySpark] add histogram() API
Compute a histogram using the provided buckets. The buckets
are all open to the right except for the last, which is closed.
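
The bucket semantics described above can be illustrated without Spark. The following is a minimal sketch, not the implementation from the pull request: the standalone `histogram` function and its arguments are hypothetical, and it assumes the buckets are given as a sorted list of boundaries, so that `[1, 10, 50]` means the buckets `[1, 10)` and `[10, 50]`.

```python
from bisect import bisect_right


def histogram(values, buckets):
    """Count values per bucket; every bucket is half-open except the last."""
    if len(buckets) < 2:
        raise ValueError("need at least two bucket boundaries")
    if list(buckets) != sorted(buckets):
        raise ValueError("bucket boundaries must be sorted")
    counts = [0] * (len(buckets) - 1)
    lo, hi = buckets[0], buckets[-1]
    for v in values:
        if v < lo or v > hi:
            continue  # values outside the overall range are not counted
        # Index of the last boundary that is <= v.
        i = bisect_right(buckets, v) - 1
        if i == len(counts):
            i -= 1  # v equals the upper bound: the last bucket is closed
        counts[i] += 1
    return counts


print(histogram([1, 5, 10, 50, 51], [1, 10, 50]))  # prints [2, 2]
```

Here 1 and 5 fall into `[1, 10)`, 10 and 50 fall into `[10, 50]`, and 51 lies outside the range and is dropped.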
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53022070
[QA tests have
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19071/consoleFull)
for PR 2091 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2091#issuecomment-53024894
[QA tests have
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19071/consoleFull)
for PR 2091 at commit