[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-07 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r49052171
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala 
---
@@ -448,7 +448,7 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] 
extends Serializable {
* combine step happens locally on the master, equivalent to running a 
single reduce task.
*/
   def countByValue(): java.util.Map[T, jl.Long] =
-mapAsSerializableJavaMap(rdd.countByValue().map((x => (x._1, new 
jl.Long(x._2)
+mapAsSerializableJavaMap(rdd.countByValue().mapValues(jl.Long.valueOf))
--- End diff --

See my reply at 
https://github.com/apache/spark/pull/10554#discussion_r48938992 -- it was 
because I then saw this was how `countByKey` had been implemented. I picked 
consistency, but it's as reasonable to change `countByKey` I suppose. Do you 
feel strongly about it one way or the other?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169278782
  
**[Test build #48847 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48847/consoleFull)**
 for PR 10554 at commit 
[`293b5e4`](https://github.com/apache/spark/commit/293b5e454934092dc58090ea64550595328d0dd5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48938992
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
---
@@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* immediately to the master as a Map. This will also perform the 
merging locally on each mapper
* before sending results to a reducer, similarly to a "combiner" in 
MapReduce.
*/
-  def reduceByKeyLocally(func: JFunction2[V, V, V]): java.util.Map[K, V] =
+  def reduceByKeyLocally(func: JFunction2[V, V, V]): JMap[K, V] =
 mapAsSerializableJavaMap(rdd.reduceByKeyLocally(func))
 
   /** Count the number of elements for each key, and return the result to 
the master as a Map. */
-  def countByKey(): java.util.Map[K, Long] = 
mapAsSerializableJavaMap(rdd.countByKey())
+  def countByKey(): JMap[K, JLong] =
+mapAsSerializableJavaMap(rdd.countByKey().mapValues(JLong.valueOf))
--- End diff --

This is what I was referring to in the last comment -- I realized that this 
is how `RDD.countByValue` is implemented, so I was picking consistency. I don't 
feel strongly about it.

For the other naming -- yeah I don't know what to be consistent with. For 
the files I'm touching, I'll try to use `java.util.` and `jl.Long` consistently 
even if it means updating some other bits of these files, and leave it at that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169328695
  
**[Test build #48847 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48847/consoleFull)**
 for PR 10554 at commit 
[`293b5e4`](https://github.com/apache/spark/commit/293b5e454934092dc58090ea64550595328d0dd5).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169332682
  
**[Test build #2335 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2335/consoleFull)**
 for PR 10554 at commit 
[`293b5e4`](https://github.com/apache/spark/commit/293b5e454934092dc58090ea64550595328d0dd5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169328843
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48847/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169328832
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169365786
  
**[Test build #2335 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2335/consoleFull)**
 for PR 10554 at commit 
[`293b5e4`](https://github.com/apache/spark/commit/293b5e454934092dc58090ea64550595328d0dd5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-169515574
  
I'm going to merge this. Thanks Sean.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r49032707
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala 
---
@@ -448,7 +448,7 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] 
extends Serializable {
* combine step happens locally on the master, equivalent to running a 
single reduce task.
*/
   def countByValue(): java.util.Map[T, jl.Long] =
-mapAsSerializableJavaMap(rdd.countByValue().map((x => (x._1, new 
jl.Long(x._2)
+mapAsSerializableJavaMap(rdd.countByValue().mapValues(jl.Long.valueOf))
--- End diff --

@srowen how come there is still a mapValues here?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10554


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48931833
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
---
@@ -17,8 +17,8 @@
 
 package org.apache.spark.api.java
 
+import java.lang.{Iterable => JIterable, Long => JLong}
--- End diff --

can we be more consistent and maybe use jl.Long, or java.lang.Long?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48931786
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
---
@@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* immediately to the master as a Map. This will also perform the 
merging locally on each mapper
* before sending results to a reducer, similarly to a "combiner" in 
MapReduce.
*/
-  def reduceByKeyLocally(func: JFunction2[V, V, V]): java.util.Map[K, V] =
--- End diff --

fwiw, i actually think spelling out java.util.Map is more clear ...



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48932093
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala 
---
@@ -288,17 +288,18 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
* immediately to the master as a Map. This will also perform the 
merging locally on each mapper
* before sending results to a reducer, similarly to a "combiner" in 
MapReduce.
*/
-  def reduceByKeyLocally(func: JFunction2[V, V, V]): java.util.Map[K, V] =
+  def reduceByKeyLocally(func: JFunction2[V, V, V]): JMap[K, V] =
 mapAsSerializableJavaMap(rdd.reduceByKeyLocally(func))
 
   /** Count the number of elements for each key, and return the result to 
the master as a Map. */
-  def countByKey(): java.util.Map[K, Long] = 
mapAsSerializableJavaMap(rdd.countByKey())
+  def countByKey(): JMap[K, JLong] =
+mapAsSerializableJavaMap(rdd.countByKey().mapValues(JLong.valueOf))
--- End diff --

what's the reason we still need the mapValues here?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48931848
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala
 ---
@@ -848,6 +848,6 @@ object JavaPairDStream {
 
   def scalaToJavaLong[K: ClassTag](dstream: JavaPairDStream[K, Long])
   : JavaPairDStream[K, JLong] = {
-DStream.toPairDStreamFunctions(dstream.dstream).mapValues(new JLong(_))
+
DStream.toPairDStreamFunctions(dstream.dstream).mapValues(JLong.valueOf)
--- End diff --

+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168977430
  
@rxin OK I _almost_ did that. I realized that `JavaRDD.countByValue` 
already does a `mapValues`. I left `countByKey` to act the same way, doing the 
mapping. Other methods that return `JavaPairRDD` went back to doing a cast. 
(Although similar methods in the streaming API actually also do a `mapValues`.) 
I also tried to use `JLong` vs `jl.Long` consistently within a file without 
going overboard; both of these files use about every different form. I think 
the result is more consistent internally, but WDYT? I'm neutral, and would 
further change things while keeping them consistent if anyone had a preference.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168979486
  
**[Test build #48759 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48759/consoleFull)**
 for PR 10554 at commit 
[`39fa6e7`](https://github.com/apache/spark/commit/39fa6e7e93bdc8d6eb01ca3af6d921171f4edaf9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168996713
  
**[Test build #48759 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48759/consoleFull)**
 for PR 10554 at commit 
[`39fa6e7`](https://github.com/apache/spark/commit/39fa6e7e93bdc8d6eb01ca3af6d921171f4edaf9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168996918
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168996921
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48759/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168790343
  
@srowen  If this doesn't change any signature, I think it actually makes 
things slower by adding another layer of iterators. Maybe it'd make more sense 
to just change the signature rather than doing actual "conversions".



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168807831
  
Yeah that's a good point -- let me see if I can understand that better. At 
some level, it has to be an `Object` and not a `long` in order to be part of a 
`java.util.Map`, so something is transforming something. It may be that 
implicits convert it to a `java.lang.Long` in the end, maybe via `RichLong`, 
which would be pretty good news -- this is just then fixing the signature and 
making explicit some implicit conversion and maybe even making it more 
efficient. Let me experiment a bit


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168801616
  
One thing I don't understand is how can these methods return scala types at 
the bytecode level? Scala Long is just a primitive long. All the places you 
find are generics, which cannot be primitive types, and as a result I assume 
they are all java boxed types?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168794854
  
It doesn't compile that way though since the values are Scala Longs and the 
signature says Java Longs. One way or the other, such a conversion has to 
happen somewhere. If it works for anyone it's because they're already doing 
this in the calling code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10554#discussion_r48713706
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala
 ---
@@ -848,6 +848,6 @@ object JavaPairDStream {
 
   def scalaToJavaLong[K: ClassTag](dstream: JavaPairDStream[K, Long])
   : JavaPairDStream[K, JLong] = {
-DStream.toPairDStreamFunctions(dstream.dstream).mapValues(new JLong(_))
+
DStream.toPairDStreamFunctions(dstream.dstream).mapValues(JLong.valueOf)
--- End diff --

Changed for consistency; `valueOf` is slightly more efficient than the 
constructor since the JDK actually caches `Long` objects for small values.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168617727
  
I don't think it does. It may cause source incompatibility since the 
generic type changes. You can see an example of a fix/change that can happen in 
the caller in the example that changed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168829287
  
... or I suppose the implementation can just cast to `java.util.Map[K, 
java.lang.Long]` since that's safe. It's less change, and mimics what callers 
are doing anyway in Java. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168821601
  
@rxin so I tried unpacking this code in IntelliJ:
```
val m = new java.util.HashMap[String, java.lang.Long]()
val l = 1L
m.put("foo", l)
```
... and it tells me that the implicit conversion that is applied here is 
`Predef.long2Long`:

```
implicit def long2Long(x: Long)   = java.lang.Long.valueOf(x)
```

It looks like this is the same underlying transformation that happens in 
`SerializableMapWrapper` and `mapAsSerializableJavaMap` already. So it looks to 
me like this is already the implicit transformation being applied anyway.

Right now the returned type of these methods in Java is `Map` 
and the wrapper does the conversion on the fly in `get()`. I believe. I think 
to make it a real `Map` requires making the transformation upfront -- 
well, short of making some other new special purpose wrapper.

WDYT?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168830850
  
Yea your latest suggestion sounds great (a very tiny change).



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-03 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168561746
  
Does this actually change anything w.r.t. bytecode singature? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168391002
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168391003
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48583/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168393617
  
**[Test build #2297 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2297/consoleFull)**
 for PR 10554 at commit 
[`1a27421`](https://github.com/apache/spark/commit/1a27421aea247420b4aed597f8cc19767d11b6f0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/10554

[SPARK-12604] [CORE] Java count(AprroxDistinct)ByKey methods return Scala 
Long not Java

Change Java countByKey, countApproxDistinctByKey return types to use Java 
Long, not Scala; update similar methods for consistency on 
java.long.Long.valueOf with no API change

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-12604

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10554.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10554


commit 1a27421aea247420b4aed597f8cc19767d11b6f0
Author: Sean Owen 
Date:   2016-01-02T13:09:40Z

Change Java countByKey, countApproxDistinctByKey return types to use Java 
Long, not Scala; update similar methods for consistency on 
java.long.Long.valueOf with no API change




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12604] [CORE] Java count(AprroxDistinct...

2016-01-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10554#issuecomment-168400458
  
**[Test build #2297 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2297/consoleFull)**
 for PR 10554 at commit 
[`1a27421`](https://github.com/apache/spark/commit/1a27421aea247420b4aed597f8cc19767d11b6f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org