Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/944
---
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45053625
Thanks. I've merged this into the master branch.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45051509
Merged build finished. All automated tests passed.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45051510
All automated tests passed.
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15431/
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45049954
@rxin rebased just now. Thanks!
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45049924
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45049915
Merged build triggered.
---
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45047703
Yeah, this is one case where we need to manually add excludes. The reason is
that MiMa doesn't have proper support for package privacy, and our work-around
can't handle the case …
---
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45046091
BTW due to two recent PR merges, I think this no longer merges cleanly. Do
you mind updating the PR to rebase it to master? Thanks.
---
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-45046041
Yea we should just add the rules. I had to work around the same problem
when I removed SerializableHyperLogLog.
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44922414
(This is probably an oversight in the way we set up MiMa.)
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44922397
Ah, got it. @pwendell what do you think? I think we'd just add them to the
excludes.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44922339
MiMa complains as follows, since I deleted ZippedRDD.scala. However, both
classes are marked ```private[spark]```, so I suppose it is OK to exclude?
```
[error] …
```
---
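For illustration, excludes of this kind typically go in project/MimaExcludes.scala; a hedged sketch of what such entries look like (the class names below are inferred from the discussion, not copied from the merged patch):
```scala
import com.typesafe.tools.mima.core.{MissingClassProblem, ProblemFilters}

// Hypothetical excludes for classes deleted along with ZippedRDD.scala;
// the entries in the actual patch may differ.
Seq(
  ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.rdd.ZippedRDD"),
  ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.rdd.ZippedPartition")
)
```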
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44922033
Wait, why did you need to add this to MimaExcludes? You're not changing the
API.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44918370
Are we going to include this patch for 1.0.x? If so, I'll need to update
MimaExcludes for it.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44918295
All automated tests passed.
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15371/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44918294
Merged build finished. All automated tests passed.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44916587
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44916590
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44915339
Build finished.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44915341
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15369/
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44915289
@mateiz updated
@marmbrus thanks for your note. I'll rebase if it still fails.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44915286
Build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44915291
Build started.
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44914041
Weird, might be a temporary issue in Jenkins.
---
Github user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/944#discussion_r13317153
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -655,7 +655,19 @@ abstract class RDD[T: ClassTag](
* partitions* and the *same number of elements in each partition* …
---
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44913258
I think you might need to rebase your change as it probably is not merging
cleanly.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44910313
@pwendell I got the following test error, not sure what I can do. Can you
help?
```
Could not find Apache license headers in the following files:
!? /roo…
```
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909972
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15362/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909971
Build finished.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909915
Build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909907
Build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909251
Merged build finished.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909252
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15360/
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909095
Merged build started.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44909083
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44907718
Merged build finished.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44907721
Refer to this link for build results:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15357/
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44905482
@mateiz This is to address the remaining issue of #760 (after #776). Since
zip() now calls zipPartitions(), I removed ZippedRDD.scala. Please take a look.
---
GitHub user kanzhang opened a pull request:
https://github.com/apache/spark/pull/944
[SPARK-1817] RDD.zip() should verify partition sizes for each partition
RDD.zip() will throw an exception if it finds partition sizes are not the
same.
---
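To illustrate the intended behavior, a hypothetical spark-shell session (not output from this patch's tests): both RDDs have four partitions, but parallelize splits 6 and 8 elements into partitions of unequal sizes, so zip should fail rather than silently truncate each partition to the shorter side.
```scala
// Hypothetical session; assumes a running SparkContext `sc`.
val a = sc.parallelize(1 to 6, 4)  // partition sizes 1, 2, 1, 2
val b = sc.parallelize(1 to 8, 4)  // partition sizes 2, 2, 2, 2
a.zip(b).count()  // expected: an exception about unequal partition sizes
```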
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44905227
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/944#issuecomment-44905235
Merged build started.
---
Github user kanzhang closed the pull request at:
https://github.com/apache/spark/pull/760
---
GitHub user kanzhang reopened a pull request:
https://github.com/apache/spark/pull/760
[SPARK-1817] RDD.zip() should verify partition sizes for each partition
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-44483523
Ok.
---
Github user kanzhang closed the pull request at:
https://github.com/apache/spark/pull/760
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-44483155
Alright, you should just close this PR. I think we need to keep the JIRA
issue open, but make the fix a better exception.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-44482528
@witgo for your workaround idea, maybe you want to send a separate PR so
that other people can take a look (I'm no expert)?
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-44479189
@mateiz Given #776, I think we can close this.
Btw, do you still want to keep SPARK-1817 open and find a fix for it?
---
GitHub user kanzhang reopened a pull request:
https://github.com/apache/spark/pull/760
[SPARK-1817] RDD.zip() should verify partition sizes for each partition
---
Github user kanzhang closed the pull request at:
https://github.com/apache/spark/pull/760
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-44475838
@kanzhang this pull request was primarily about ranges, but I don't think
it addresses all of SPARK-1817. The real solution there would be to throw an
error and tell the user …
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43680692
IMHO, slicing a sequence shouldn't change its element values
(floating-point representations), same for ```take``` and ```drop```.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43579042
I mean, different methods on `NumericRange[Double]` get different results,
so we just guarantee that the `slice` method returns consistent results.
---
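A hypothetical REPL probe of the inconsistency being described (outputs omitted, since they depend on the Scala version; see SI-8518):
```scala
// On affected Scala versions, different access paths over the same
// NumericRange[Double] can disagree about element values.
val r = (1.0 to 2.0).by(0.2)
r.toList                   // materialize the whole range
r.slice(2, 4).toList       // slice the range directly
r.drop(2).take(2).toList   // may differ from the slice above
```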
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43528643
@witgo I'm not sure what you are trying to say in the above comment, but if
you meant "only string formatting is different, the underlying floating-point
representations are …"
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43461175
@kanzhang
```
scala> val d = (1D to 2D).by(0.2)
d: scala.collection.immutable.NumericRange[Double] = NumericRange(1.0, 1.2, 1.4, 1.5999, 1.799…
```
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43460411
@witgo your approach is similar to how Range is partitioned (i.e., using
the ```step``` value to recalculate some of the elements in the sequence). One
issue with using t…
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43433310
A simple solution:
```scala
object ParallelCollectionRDD {
  /**
   * Slice a collection into numSlices sub-collections. One extra thing we
   * do here is to treat …
```
---
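As a sketch of the idea under discussion, with illustrative helper names rather than the code from this PR: compute the slice boundaries once, then take every sub-range from the original range, so that all element values come from a single, consistent method.
```scala
import scala.collection.immutable.NumericRange

// Boundaries of numSlices roughly equal slices over `length` elements.
def positions(length: Long, numSlices: Int): Iterator[(Int, Int)] =
  (0 until numSlices).iterator.map { i =>
    (((i * length) / numSlices).toInt, (((i + 1) * length) / numSlices).toInt)
  }

// Slice the range itself, rather than mixing take/drop with slice.
def sliceRange[T](r: NumericRange[T], numSlices: Int): Seq[Seq[T]] =
  positions(r.length, numSlices).map { case (start, end) =>
    r.slice(start, end)
  }.toSeq
```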
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43423066
@witgo unfortunately I don't see an easy alternative. If you see one, please
let me know.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43395929
@witgo yes, the first one is wrong and that is due to the Scala bug I
referenced above (https://issues.scala-lang.org/browse/SI-8518).
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43396698
@kanzhang
All told, we should fix the following code
`slices += r.take(sliceSize).asInstanceOf[Seq[T]]`.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43394756
@kanzhang
```
scala> sc.parallelize((1D to 2D).by(0.2),4).collect
res0: Array[Double] = Array(1.0, 1.2, 1.6, 1.8)
```
```
scala> sc.parallelize((…
```
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43373217
@witgo I'm not sure it is the correct semantics (@mateiz ?), but based on
how we partition sequences, that is expected. We zip by partition (see below).
Btw, as …
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43378994
@witgo for sequences of the same size, we can get the same result as Scala
zip, as long as we partition them in the same way. For sequences of different
sizes (your examp…
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43309863
In your code:
```
sc.parallelize(1L to 2L,4).zip(sc.parallelize(11 to 12,4)).collect
=> Array[(Long, Int)] = Array((1,11), (2,12))
```
This is right.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43264224
Looked a little further, it seems once a seq is sliced into partitions,
those partitions are no longer sequences. To find partition length, one has to
iterate through it?
---
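One way to address this: equal sizes cannot be checked up front without an extra pass over the data, but they can be verified on the fly while zipping. A minimal sketch with a lockstep iterator (illustrative, not the merged code):
```scala
import org.apache.spark.SparkException

// Zip two partition iterators, failing fast if one side is exhausted
// before the other, i.e. the partitions differ in size.
def zipStrict[A, B](left: Iterator[A], right: Iterator[B]): Iterator[(A, B)] =
  new Iterator[(A, B)] {
    def hasNext: Boolean = (left.hasNext, right.hasNext) match {
      case (true, true)   => true
      case (false, false) => false
      case _ => throw new SparkException(
        "Can only zip RDDs with same number of elements in each partition")
    }
    def next(): (A, B) = (left.next(), right.next())
  }
```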
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43142284
@mateiz I factored out the NumericRange changes to #776. Now back to this PR,
what semantics should we enforce? Specifically, should we enforce the same-size
requirement on the l…
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43169279
Probably not easy to verify until you iterate through the data.
---
Github user kanzhang commented on the pull request:
https://github.com/apache/spark/pull/760#issuecomment-43168454
Looked at the Scaladoc just now: zip() assumes the same number of elements in
each partition, so we just verify that?
---