[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-07 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r49052171 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -448,7 +448,7 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169278782 **[Test build #48847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48847/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48938992 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169328695 **[Test build #48847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48847/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169332682 **[Test build #2335 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2335/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169328843 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169328832 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169365786 **[Test build #2335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2335/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-169515574 I'm going to merge this. Thanks Sean.

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r49032707 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -448,7 +448,7 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10554

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48931833 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -17,8 +17,8 @@ package org.apache.spark.api.java +import

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48931786 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48932093 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48931848 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -848,6 +848,6 @@ object JavaPairDStream { def

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168977430 @rxin OK I _almost_ did that. I realized that `JavaRDD.countByValue` already does a `mapValues`. I left `countByKey` to act the same way, doing the mapping. Other
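
The "doing the mapping" approach mentioned above could look roughly like the sketch below: each Scala `Long` value is explicitly boxed into a `java.lang.Long` before the map is handed to Java. This is only an illustration, not the code from the PR; `MapValuesSketch`, `scalaCounts` and `javaCounts` are hypothetical names, and a plain `map` is used in place of `mapValues` so that the snippet compiles the same way across Scala versions.

```scala
import java.{lang => jl, util => ju}
import scala.collection.JavaConverters._

object MapValuesSketch {
  // Hypothetical stand-in for a Scala-side result such as a count per key.
  def scalaCounts(): Map[String, Long] = Map("a" -> 2L, "b" -> 1L)

  // Convert every value to java.lang.Long explicitly, then wrap as a Java map.
  // This builds a new map, which is the extra work the cast-based variant avoids.
  def javaCounts(): ju.Map[String, jl.Long] =
    scalaCounts().map { case (k, v) => k -> Long.box(v) }.asJava

  def main(args: Array[String]): Unit = {
    val counts = javaCounts()
    println(counts.get("a").longValue())
  }
}
```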

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168979486 **[Test build #48759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48759/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168996713 **[Test build #48759 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48759/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168996918 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168996921 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168790343 @srowen If this doesn't change any signature, I think it actually makes things slower by adding another layer of iterators. Maybe it'd make more sense to just change the

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168807831 Yeah that's a good point -- let me see if I can understand that better. At some level, it has to be an `Object` and not a `long` in order to be part of a
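
As a small illustration of the point about `Object` versus `long`: a Scala `Long` stored as a generic value is boxed, and the object actually held by the collection is a `java.lang.Long`. The sketch below inspects the runtime class of such a value; `BoxingSketch` and `storedClassName` are hypothetical names used only for this example.

```scala
object BoxingSketch {
  // Returns the runtime class name of a Scala Long once it sits in a generic map.
  def storedClassName(): String = {
    val m = new java.util.HashMap[String, Long]()
    m.put("foo", 1L)
    // View the values as plain objects to see what the map really holds.
    val raw = m.asInstanceOf[java.util.Map[String, AnyRef]].get("foo")
    raw.getClass.getName
  }

  def main(args: Array[String]): Unit =
    println(storedClassName()) // prints "java.lang.Long"
}
```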

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168801616 One thing I don't understand is how these methods can return Scala types at the bytecode level. Scala Long is just a primitive long. All the places you find are generics,

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168794854 It doesn't compile that way though since the values are Scala Longs and the signature says Java Longs. One way or the other, such a conversion has to happen somewhere.

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10554#discussion_r48713706 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala --- @@ -848,6 +848,6 @@ object JavaPairDStream {

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168617727 I don't think it does. It may cause source incompatibility since the generic type changes. You can see an example of a fix/change that can happen in the caller in the

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168829287 ... or I suppose the implementation can just cast to `java.util.Map[K, java.lang.Long]` since that's safe. It's less change, and mimics what callers are doing anyway in
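
A minimal sketch of the cast suggested above, assuming the Scala side produces a `Map[K, Long]`: because generic values are already boxed as `java.lang.Long` at runtime, the Java view of the map can be cast to `java.util.Map[K, java.lang.Long]` without copying or converting anything. `CastSketch`, `scalaCounts` and `javaCounts` are hypothetical names, not identifiers from the PR.

```scala
import java.{lang => jl, util => ju}
import scala.collection.JavaConverters._

object CastSketch {
  // Hypothetical stand-in for a Scala-side result such as a count per key.
  def scalaCounts(): Map[String, Long] = Map("a" -> 2L, "b" -> 1L)

  // Wrap as a Java map, then cast the value type. The cast is unchecked
  // (generics are erased), but the stored values are java.lang.Long already.
  def javaCounts(): ju.Map[String, jl.Long] =
    scalaCounts().asJava.asInstanceOf[ju.Map[String, jl.Long]]

  def main(args: Array[String]): Unit = {
    val v: jl.Long = javaCounts().get("a")
    println(v.longValue())
  }
}
```

Compared with converting each value, the cast only changes the static type seen by callers, which fits the description of it as a smaller change.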

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168821601 @rxin so I tried unpacking this code in IntelliJ:

```
val m = new java.util.HashMap[String, java.lang.Long]()
val l = 1L
m.put("foo", l)
```

...

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168830850 Yea your latest suggestion sounds great (a very tiny change).

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-03 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168561746 Does this actually change anything w.r.t. bytecode signature?

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168391002 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168391003 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168393617 **[Test build #2297 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2297/consoleFull)** for PR 10554 at commit

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/10554 [SPARK-12604] [CORE] Java count(AprroxDistinct)ByKey methods return Scala Long not Java Change Java countByKey, countApproxDistinctByKey return types to use Java Long, not Scala; update similar

[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10554#issuecomment-168400458 **[Test build #2297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2297/consoleFull)** for PR 10554 at commit