[GitHub] spark pull request: SPARK-4963 [SQL] HiveTableScan return mutable ...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3827#issuecomment-68337920
  
  [Test build #24888 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24888/consoleFull)
 for   PR 3827 at commit 
[`cea7e2e`](https://github.com/apache/spark/commit/cea7e2ec42c44f81965f8adf462faa887e2dae89).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5003][SQL]cast support date data type

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3839#issuecomment-68338141
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-5003][SQL]cast support date data type

2014-12-30 Thread haiyangsea
GitHub user haiyangsea opened a pull request:

https://github.com/apache/spark/pull/3839

[SPARK-5003][SQL]cast support date data type

Enable cast to support the date data type, for example:
select * from tableX where dateFiled > cast('2014-12-30' as date)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/haiyangsea/spark datatype

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3839.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3839


commit 8bdd08f4fb08d6442861d855f015544a3d9c96a4
Author: haiyang huhaiy...@huawei.com
Date:   2014-12-29T11:05:58Z

support date datatype

commit 81bd51da01813ade238fee65f4f2accdf3b6eda9
Author: haiyang huhaiy...@huawei.com
Date:   2014-12-30T07:22:27Z

add test case







[GitHub] spark pull request: [SPARK-5002][SQL] Using ascending by default w...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3838#issuecomment-68338232
  
  [Test build #24886 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24886/consoleFull)
 for   PR 3838 at commit 
[`114b64a`](https://github.com/apache/spark/commit/114b64a9b8dba469c44a455cb6f239ea1e8c0d2a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class GaussianMixtureModel(`






[GitHub] spark pull request: [SPARK-5002][SQL] Using ascending by default w...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3838#issuecomment-68338233
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24886/
Test PASSed.





[GitHub] spark pull request: SPARK-4963 [SQL] HiveTableScan return mutable ...

2014-12-30 Thread yanbohappy
Github user yanbohappy commented on the pull request:

https://github.com/apache/spark/pull/3827#issuecomment-68339079
  
@liancheng I agree with moving the copy call to execution.Sample.execute and 
have added new commits.
HiveTableScan is not affected.
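
For context, a self-contained illustration of why the copy matters when an operator reuses one mutable row buffer (generic Scala, not Spark SQL's actual `Sample` operator; mapping a copy over the rows, as the patch now does inside `execution.Sample`, restores per-row snapshots):

```scala
import scala.collection.mutable.ArrayBuffer

object CopyDemo {
  def main(args: Array[String]): Unit = {
    // One shared, mutated buffer stands in for a reused MutableRow.
    val reused = ArrayBuffer(0)
    def rows = (1 to 3).iterator.map { i => reused(0) = i; reused }

    // Without copying, every collected element points at the same final value.
    println(rows.toList)                 // List(ArrayBuffer(3), ArrayBuffer(3), ArrayBuffer(3))

    // Copying each row as it is produced keeps the individual values.
    println(rows.map(_.clone()).toList)  // List(ArrayBuffer(1), ArrayBuffer(2), ArrayBuffer(3))
  }
}
```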





[GitHub] spark pull request: [SPARK-4988][SQL] Fix: 'Create table ..as sele...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3821#issuecomment-68339552
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24887/
Test PASSed.





[GitHub] spark pull request: [SPARK-4988][SQL] Fix: 'Create table ..as sele...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3821#issuecomment-68339547
  
  [Test build #24887 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24887/consoleFull)
 for   PR 3821 at commit 
[`1bab9e4`](https://github.com/apache/spark/commit/1bab9e4b782e62485f01f4f650a54c5ccb86f2a1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3827#issuecomment-68341428
  
  [Test build #24888 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24888/consoleFull)
 for   PR 3827 at commit 
[`cea7e2e`](https://github.com/apache/spark/commit/cea7e2ec42c44f81965f8adf462faa887e2dae89).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-4963 [SQL] Add copy to SQL's Sample oper...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3827#issuecomment-68341432
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24888/
Test PASSed.





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/3246#issuecomment-68342204
  
Anyone would like to review this pr?





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22343345
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala
 ---
@@ -25,6 +25,91 @@ import 
org.apache.spark.streaming.api.java.{JavaReceiverInputDStream, JavaDStrea
 import org.apache.spark.streaming.dstream.{ReceiverInputDStream, DStream}
 
 object TwitterUtils {
+
+  // For implicit parameter used to avoid to have same type after erasure
+  case class Ignore(value: String ) {
--- End diff --

This looks like a big hack. Just use different method names if you are 
trying to avoid type conflict after erasure.
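
A minimal sketch of the erasure clash being worked around and the rename-based alternative (names below are illustrative, not the actual TwitterUtils API):

```scala
// These two overloads cannot coexist: after erasure both have the JVM signature
// createStream(Seq), which is what the implicit Ignore marker tries to dodge.
//   def createStream(filters: Seq[String]): Unit = ???
//   def createStream(locations: Seq[Seq[Double]]): Unit = ???

// Distinct method names avoid the clash with no marker parameter at all.
object StreamApiSketch {
  def createFilteredStream(filters: Seq[String]): Unit =
    println(s"tracking ${filters.size} terms")

  def createLocationStream(locations: Seq[(Double, Double)]): Unit =
    println(s"bounding box with ${locations.size} corners")
}
```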





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22343369
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala
 ---
@@ -60,6 +61,7 @@ private[streaming]
 class TwitterReceiver(
 twitterAuth: Authorization,
 filters: Seq[String],
+locations: Seq[Seq[Double]],
--- End diff --

These are supposed to be a bunch of (lat,lon) pairs, right? why not 
`Seq[(Double,Double)]`?
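
A small sketch of what the typed-pair suggestion looks like at the Twitter4J boundary (variable names are illustrative); the pairs flatten easily into the `double[][]` that `FilterQuery.locations` expects:

```scala
// South-west and north-east corners of one bounding box, as (longitude, latitude) pairs.
val boundingBox: Seq[(Double, Double)] = Seq((-122.75, 36.8), (-121.75, 37.8))

// Flattened to Array[Array[Double]] (i.e. double[][]) for twitter4j's FilterQuery.locations.
val forTwitter4j: Array[Array[Double]] =
  boundingBox.map { case (lon, lat) => Array(lon, lat) }.toArray
```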





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22343378
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala
 ---
@@ -88,6 +90,11 @@ class TwitterReceiver(
   val query = new FilterQuery
  if (filters.size > 0) {
 query.track(filters.toArray)
+  }
+  if (locations.size > 0) {
+query.locations(locations.map(_.toArray).toArray)
+  }
+  if (filters.size > 0 || locations.size > 0) {
--- End diff --

It seems like this could be rewritten to avoid the redundant checks and the 
creation of `FilterQuery` when there is no filtering.
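
A rough sketch of that restructuring (assumed helper shape, not the PR's code): construct the `FilterQuery` once, and only when there is actually something to filter on, otherwise fall back to the sample stream:

```scala
import twitter4j.{FilterQuery, TwitterStream}

def startStream(stream: TwitterStream,
                filters: Seq[String],
                locations: Seq[Seq[Double]]): Unit = {
  if (filters.nonEmpty || locations.nonEmpty) {
    val query = new FilterQuery
    if (filters.nonEmpty) query.track(filters.toArray)
    if (locations.nonEmpty) query.locations(locations.map(_.toArray).toArray)
    stream.filter(query)   // filtered stream only when a predicate exists
  } else {
    stream.sample()        // no filtering requested
  }
}
```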





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22343430
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala
 ---
@@ -112,20 +269,91 @@ object TwitterUtils {
 ): JavaReceiverInputDStream[Status] = {
 createStream(jssc.ssc, Some(twitterAuth), filters)
   }
-
+ 
+  /**
+   * Create a input stream that returns tweets received from Twitter.
+   * Storage level of the data will be the default 
StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc        JavaStreamingContext object
+   * @param twitterAuth Twitter4J Authorization
+   * @param filters Set of filter strings to get only those tweets 
that match them
+   * @param locations Set of longitude, latitude pairs to get only those 
tweets
+   *that falling within the requested bounding boxes
+   */
+  def createStream(
+  jssc: JavaStreamingContext,
+  twitterAuth: Authorization,
+  filters: Array[String],
+  locations: Array[Array[Double]]
+): JavaReceiverInputDStream[Status] = {
+createStream(jssc.ssc, Some(twitterAuth), filters, 
locations.map(_.toSeq).toSeq)
+  }
+ 
+  /**
+   * Create a input stream that returns tweets received from Twitter.
+   * Storage level of the data will be the default 
StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc        JavaStreamingContext object
+   * @param twitterAuth Twitter4J Authorization
+   * @param locations Set of longitude, latitude pairs to get only those 
tweets
+   *that falling within the requested bounding boxes
+   */
+  def createStream(
+  jssc: JavaStreamingContext,
+  twitterAuth: Authorization,
+  locations: Array[Array[Double]]
+): JavaReceiverInputDStream[Status] = {
+createStream(jssc.ssc, Some(twitterAuth), Nil, 
locations.map(_.toSeq).toSeq)
+  }
+ 
+  /**
+   * Create a input stream that returns tweets received from Twitter.
+   * @param jssc JavaStreamingContext object
+   * @param twitterAuth  Twitter4J Authorization object
+   * @param filters  Set of filter strings to get only those tweets 
that match them
+   * @param storageLevel Storage level to use for storing the received 
objects
+   */
+  def createStream(
+  jssc: JavaStreamingContext,
+  twitterAuth: Authorization,
+  filters: Array[String],
+  storageLevel: StorageLevel
+): JavaReceiverInputDStream[Status] = {
+createStream(jssc.ssc, Some(twitterAuth), filters, Nil, storageLevel)
+  }
+ 
   /**
* Create a input stream that returns tweets received from Twitter.
* @param jssc JavaStreamingContext object
* @param twitterAuth  Twitter4J Authorization object
* @param filters  Set of filter strings to get only those tweets 
that match them
+   * @param locations Set of longitude, latitude pairs to get only those 
tweets
+   *that falling within the requested bounding boxes
* @param storageLevel Storage level to use for storing the received 
objects
*/
   def createStream(
   jssc: JavaStreamingContext,
   twitterAuth: Authorization,
   filters: Array[String],
+  locations: Array[Array[Double]],
+  storageLevel: StorageLevel
+): JavaReceiverInputDStream[Status] = {
+createStream(jssc.ssc, Some(twitterAuth), filters, 
locations.map(_.toSeq).toSeq, storageLevel)
+  }
+ 
+  /**
+   * Create a input stream that returns tweets received from Twitter.
+   * @param jssc JavaStreamingContext object
+   * @param twitterAuth  Twitter4J Authorization object
+   * @param locations Set of longitude, latitude pairs to get only those 
tweets
+   *that falling within the requested bounding boxes
+   * @param storageLevel Storage level to use for storing the received 
objects
+   */
+  def createStream(
--- End diff --

There are *12* new overloads of `createStream` on top of the existing 4. 
This seems like overkill. There should be one version each in Java and Scala that 
takes all arguments, one each that takes minimal arguments, and any others 
needed to retain binary compatibility. The rest seem superfluous.



[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22343534
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala
 ---
@@ -60,6 +61,7 @@ private[streaming]
 class TwitterReceiver(
 twitterAuth: Authorization,
 filters: Seq[String],
+locations: Seq[Seq[Double]],
--- End diff --

I remember that was so it could be used from both Scala and Java, so we can 
simply give a (lat, lon) pair as an array.





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22344218
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala
 ---
@@ -60,6 +61,7 @@ private[streaming]
 class TwitterReceiver(
 twitterAuth: Authorization,
 filters: Seq[String],
+locations: Seq[Seq[Double]],
--- End diff --

There is already a separate set of methods for Java, no? Those use `Array` 
and not things like `Seq`. This one is private to the Scala-based Spark code.





[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2014-12-30 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/3820#issuecomment-68346947
  
retest this please.





[GitHub] spark pull request: [SPARK-4382] Add locations parameter to Twitte...

2014-12-30 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/3246#discussion_r22344303
  
--- Diff: 
external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala
 ---
@@ -60,6 +61,7 @@ private[streaming]
 class TwitterReceiver(
 twitterAuth: Authorization,
 filters: Seq[String],
+locations: Seq[Seq[Double]],
--- End diff --

There is. I meant that the main problem is the inconsistency between the Scala 
and Java APIs: one would use `Seq[(Double,Double)]` and the other uses 
`Array[Array[Double]]`. If that is OK, I will revise it.





[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3820#issuecomment-68347060
  
  [Test build #24889 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24889/consoleFull)
 for   PR 3820 at commit 
[`dc6eaba`](https://github.com/apache/spark/commit/dc6eaba7db957eb9038532c7c57282c040e870d4).
 * This patch merges cleanly.





[GitHub] spark pull request: Changes to illustrate the principles of functi...

2014-12-30 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/3835#issuecomment-68347289
  
@yujunliang If you're not working on a change that you want to be 
considered for merging later, then I would not open a pull request at all. Just 
work in your local branch.





[GitHub] spark pull request: [MLlib]delete the train function

2014-12-30 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/3836#issuecomment-68348124
  
Although this is an experimental class, and so API methods could be 
removed, I think you'd still want a decent reason to remove this method at this 
point, even if it's deprecated. 

Here, there is no method with the same signature in the object though, so 
I'm not sure what the problem is. It's common to have many methods with the 
same name in a class anyway. With reflection you differentiate them by their 
method signature, so this shouldn't be an obstacle. (I think the object even 
appears as a separate class `DecisionTree$` in the JVM?)

Lastly, you'd generally want to file a JIRA for this too, but first I think 
the points above would need to be answered.
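
On the reflection point above, a tiny self-contained example (a hypothetical class, not MLlib's `DecisionTree`) of how same-named methods are told apart by their parameter types:

```scala
class Trainer {
  def train(data: Seq[Double]): Double = data.sum
  def train(data: Seq[Double], iterations: Int): Double = data.sum * iterations
}

object ReflectionDemo {
  def main(args: Array[String]): Unit = {
    // getMethod resolves the exact overload matching the requested parameter types.
    val m = classOf[Trainer].getMethod("train", classOf[Seq[_]], classOf[Int])
    println(m.invoke(new Trainer, Seq(1.0, 2.0), Int.box(3)))  // prints 9.0
  }
}
```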





[GitHub] spark pull request: SPARK-4660: Use correct default classloader in...

2014-12-30 Thread pkolaczk
GitHub user pkolaczk opened a pull request:

https://github.com/apache/spark/pull/3840

SPARK-4660: Use correct default classloader in JavaSerializer.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pkolaczk/spark SPARK-4660

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3840


commit 86bc5ebdfb2c5f0d58ffeaf184f94f60923fe676
Author: Piotr Kolaczkowski pkola...@datastax.com
Date:   2014-12-30T11:01:47Z

SPARK-4660: Use correct default classloader in JavaSerializer.







[GitHub] spark pull request: SPARK-4660: Use correct default classloader in...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3840#issuecomment-68349098
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3820#issuecomment-68351142
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24889/
Test PASSed.





[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3820#issuecomment-68351139
  
  [Test build #24889 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24889/consoleFull)
 for   PR 3820 at commit 
[`dc6eaba`](https://github.com/apache/spark/commit/dc6eaba7db957eb9038532c7c57282c040e870d4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/spark/pull/3841

[SPARK-5006][Deploy]spark.port.maxRetries doesn't work

https://issues.apache.org/jira/browse/SPARK-5006

I think the issue was introduced in https://github.com/apache/spark/pull/1777. 

I have not dug into the Mesos backend yet; maybe the same logic should be added 
there as well.
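
For reference, a minimal sketch of how the retry setting is normally supplied (standard `SparkConf` usage with an illustrative value); SPARK-5006 is about this value not being honoured everywhere:

```scala
import org.apache.spark.SparkConf

// Allow more bind attempts after a port collision (can also be passed as
// --conf spark.port.maxRetries=32 on spark-submit).
val conf = new SparkConf()
  .setAppName("port-retries-example")
  .set("spark.port.maxRetries", "32")
```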

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/spark SPARK-5006

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3841


commit 62ec336fd3c600a5646d3614287cbb1de72e930d
Author: WangTaoTheTonic barneystin...@aliyun.com
Date:   2014-12-30T12:12:39Z

spark.port.maxRetries doesn't work







[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68352288
  
  [Test build #24890 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24890/consoleFull)
 for   PR 3841 at commit 
[`62ec336`](https://github.com/apache/spark/commit/62ec336fd3c600a5646d3614287cbb1de72e930d).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68352388
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24890/
Test FAILed.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68352385
  
  [Test build #24890 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24890/consoleFull)
 for   PR 3841 at commit 
[`62ec336`](https://github.com/apache/spark/commit/62ec336fd3c600a5646d3614287cbb1de72e930d).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68353060
  
  [Test build #24891 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24891/consoleFull)
 for   PR 3841 at commit 
[`191face`](https://github.com/apache/spark/commit/191face9291c8d455223858882ef509406a8826d).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3794#issuecomment-68353372
  
  [Test build #24892 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24892/consoleFull)
 for   PR 3794 at commit 
[`6e95955`](https://github.com/apache/spark/commit/6e95955c9c67ce509372fe08f9ced962eb251593).
 * This patch merges cleanly.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/3842

SPARK-2757 [BUILD] [STREAMING] Add Mima test for Spark Sink after 1.1.0 is 
released

Re-enable MiMa for Streaming Flume Sink module, now that 1.1.0 is released, 
per the JIRA TO-DO. That's pretty much all there is to this.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-2757

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3842.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3842


commit 0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f
Author: Sean Owen so...@cloudera.com
Date:   2014-12-30T12:46:10Z

Re-enable MiMa for Streaming Flume Sink module







[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68354029
  
  [Test build #24893 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24893/consoleFull)
 for   PR 3842 at commit 
[`0e5ba5c`](https://github.com/apache/spark/commit/0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...

2014-12-30 Thread MickDavies
GitHub user MickDavies opened a pull request:

https://github.com/apache/spark/pull/3843

[SPARK-4386] Improve performance when writing Parquet files

Convert type of RowWriteSupport.attributes to Array.

Analysis of the performance of writing very wide tables shows that time is 
spent predominantly in the apply method on the attributes var. The type of 
attributes was previously LinearSeqOptimized, whose apply is O(N), which made 
the write O(N^2).

Measurements on a 575-column table showed this change made a 6x improvement 
in write times.
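
A self-contained illustration of the complexity argument (generic Scala, not the `RowWriteSupport` code): indexed access on a `List` costs O(i) per call, so an index-driven loop over all columns is O(N^2), while a one-off `toArray` makes each access O(1):

```scala
val columns: List[String] = List.tabulate(575)(i => s"col_$i")

// O(N^2) overall: columns(i) walks the list from the head on every access.
for (i <- columns.indices) { val _ = columns(i) }

// O(N) overall after a single conversion: array indexing is constant time.
val columnsArray: Array[String] = columns.toArray
for (i <- columnsArray.indices) { val _ = columnsArray(i) }
```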

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MickDavies/spark SPARK-4386

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3843.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3843


commit 892519d3bb7166ea184f0c070759b8a3b679e2c4
Author: Michael Davies michael.belldav...@gmail.com
Date:   2014-12-30T13:00:25Z

[SPARK-4386] Improve performance when writing Parquet files

Convert type of RowWriteSupport.attributes to Array.

Analysis of performance for writing very wide tables shows that time is 
spent predominantly in apply method on  attributes var. Type of attributes 
previously was LinearSeqOptimized and apply is O(N) which made write O(N 
squared).

Measurements on a 575-column table showed this change made a 6x improvement 
in write times.







[GitHub] spark pull request: [SPARK-4631] unit test for MQTT

2014-12-30 Thread Bilna
GitHub user Bilna opened a pull request:

https://github.com/apache/spark/pull/3844

[SPARK-4631] unit test for MQTT



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Bilna/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3844


commit 86164950acfc794c6c9b1db3663716ac4626c55b
Author: bilna bil...@am.amrita.edu
Date:   2014-12-30T13:06:09Z

[SPARK-4631] unit test for MQTT







[GitHub] spark pull request: [SPARK-4631] unit test for MQTT

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3844#issuecomment-68355440
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3843#issuecomment-68355445
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...

2014-12-30 Thread MickDavies
Github user MickDavies commented on the pull request:

https://github.com/apache/spark/pull/3254#issuecomment-68355607
  
@jimfcarroll sorry, I misunderstood your comment. Good that you have 
verified the performance gain.

I have added a PR. It is number 3843.





[GitHub] spark pull request: [SPARK-4631] unit test for MQTT

2014-12-30 Thread prabeesh
Github user prabeesh commented on the pull request:

https://github.com/apache/spark/pull/3844#issuecomment-68355645
  
@tdas verify this patch





[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2014-12-30 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request:

https://github.com/apache/spark/pull/3845

[SPARK-5007] [CORE] Try random port when startServiceOnPort to reduce the 
chance of port collision

When multiple Spark programs are submitted on the same node (a so-called 
springboard machine), the SparkUI ports of these programs range from the 
default 4040 up to 4056, and the Spark programs submitted later fail because 
of SparkUI port collisions.
The chance of a port collision could be reduced by setting spark.ui.port or 
spark.port.maxRetries.
However, I think it's better to try a random port in startServiceOnPort to 
reduce the chance of port collision.
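
A minimal sketch of the proposed behaviour (assumed names, not Spark's `Utils.startServiceOnPort`): instead of probing basePort, basePort + 1, ... sequentially, each retry picks a random port in the non-privileged range, which makes collisions between concurrently submitted applications far less likely:

```scala
import java.io.IOException
import java.net.ServerSocket
import scala.util.Random

def startOnRandomPort(maxRetries: Int)(start: Int => Unit): Int = {
  for (_ <- 0 until maxRetries) {
    val candidate = 1024 + Random.nextInt(65536 - 1024)
    try {
      new ServerSocket(candidate).close()  // crude availability probe (racy; sketch only)
      start(candidate)
      return candidate
    } catch {
      case _: IOException => // candidate was taken; try another random port
    }
  }
  sys.error(s"No free port found after $maxRetries attempts")
}
```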

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/YanTangZhai/spark SPARK-5007

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3845.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3845


commit cdef539abc5d2d42d4661373939bdd52ca8ee8e6
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-08-06T13:07:08Z

Merge pull request #1 from apache/master

update

commit cbcba66ad77b96720e58f9d893e87ae5f13b2a95
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-08-20T13:14:08Z

Merge pull request #3 from apache/master

Update

commit 8a0010691b669495b4c327cf83124cabb7da1405
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-09-12T06:54:58Z

Merge pull request #6 from apache/master

Update

commit 03b62b043ab7fd39300677df61c3d93bb9beb9e3
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-09-16T12:03:22Z

Merge pull request #7 from apache/master

Update

commit 76d40277d51f709247df1d3734093bf2c047737d
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-10-20T12:52:22Z

Merge pull request #8 from apache/master

update

commit d26d98248a1a4d0eb15336726b6f44e05dd7a05a
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-11-04T09:00:31Z

Merge pull request #9 from apache/master

Update

commit e249846d9b7967ae52ec3df0fb09e42ffd911a8a
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-11-11T03:18:24Z

Merge pull request #10 from apache/master

Update

commit 6e643f81555d75ec8ef3eb57bf5ecb6520485588
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-12-01T11:23:56Z

Merge pull request #11 from apache/master

Update

commit 718afebe364bd54ac33be425e24183eb1c76b5d3
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-12-05T11:08:31Z

Merge pull request #12 from apache/master

update

commit e4c2c0a18bdc78cc17823cbc2adf3926944e6bc5
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-12-24T03:15:22Z

Merge pull request #15 from apache/master

update

commit 2fb4f4450230fee09ff8932eb107f09ef72f2402
Author: yantangzhai tyz0...@163.com
Date:   2014-12-30T13:41:59Z

[SPARK-5007] [CORE] Try random port when startServiceOnPort to reduce the 
chance of port collision







[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3845#issuecomment-68357320
  
  [Test build #24894 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24894/consoleFull)
 for   PR 3845 at commit 
[`2fb4f44`](https://github.com/apache/spark/commit/2fb4f4450230fee09ff8932eb107f09ef72f2402).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3845#issuecomment-68357384
  
  [Test build #24894 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24894/consoleFull)
 for   PR 3845 at commit 
[`2fb4f44`](https://github.com/apache/spark/commit/2fb4f4450230fee09ff8932eb107f09ef72f2402).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5007] [CORE] Try random port when start...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3845#issuecomment-68357385
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24894/
Test FAILed.





[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...

2014-12-30 Thread james64
GitHub user james64 opened a pull request:

https://github.com/apache/spark/pull/3846

[Spark-4995] Replace Vector.toBreeze.activeIterator with foreachActive

The new foreachActive method on Vector was introduced by SPARK-4431 as a more 
efficient alternative to vector.toBreeze.activeIterator. There are some parts 
of the codebase where the old pattern has not yet been replaced.
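
A small before/after sketch of the replacement, using the MLlib `Vector.foreachActive` added by SPARK-4431 (it was initially `private[spark]`, so the call below compiles inside MLlib or against later releases where it is public):

```scala
import org.apache.spark.mllib.linalg.Vectors

val v = Vectors.sparse(5, Array(1, 3), Array(2.0, 4.0))

// Old pattern, going through a temporary Breeze vector:
//   v.toBreeze.activeIterator.foreach { case (i, value) => println(s"$i -> $value") }

// New pattern, iterating the active entries directly:
v.foreachActive { (i, value) => println(s"$i -> $value") }
```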

@dbtsai


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/james64/spark SPARK-4995-foreachActive

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3846.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3846


commit 90a7d982e298863d16455108208fdb0765fe2ec6
Author: Jakub Dubovsky dubov...@avast.com
Date:   2014-12-30T13:00:23Z

activeIterator removed in MLUtils.saveAsLibSVMFile

commit 47a49c2e13a4828ce633b3080e2ff7a92f6a
Author: Jakub Dubovsky dubov...@avast.com
Date:   2014-12-30T13:22:29Z

activeIterator removed in RowMatrix.toBreeze

commit 32fe6c67e46837f6625ad8ed5ed5eee20c3793d2
Author: Jakub Dubovsky dubov...@avast.com
Date:   2014-12-30T13:29:35Z

activeIterator removed - IndexedRowMatrix.toBreeze

commit 3eb7e3711fcae74031a94708233db0d8da348ea4
Author: Jakub Dubovsky dubov...@avast.com
Date:   2014-12-30T13:35:17Z

Scalastyle fix







[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3846#issuecomment-68357963
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68358227
  
  [Test build #24891 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24891/consoleFull)
 for   PR 3841 at commit 
[`191face`](https://github.com/apache/spark/commit/191face9291c8d455223858882ef509406a8826d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68358232
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24891/
Test PASSed.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68359334
  
  [Test build #24893 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24893/consoleFull)
 for   PR 3842 at commit 
[`0e5ba5c`](https://github.com/apache/spark/commit/0e5ba5cefaca04c188aadf5309ca6d5dffe1c63f).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68359340
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24893/
Test FAILed.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68360108
  
Interesting situation. There is a MiMa failure since SPARK-3154 / 
https://github.com/apache/spark/commit/bcb5cdad614d4fce43725dfec3ce88172d2f8c11 
changed a method after 1.2.0, but it's `private[sink]`. I believe I should just 
exclude this failure.
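
For context, such an exclusion normally goes into `project/MimaExcludes.scala` as a 
one-line problem filter; a minimal, hedged sketch is below (the problem type and the 
fully qualified target name are placeholders, not the entry that was actually added):

```
// Hypothetical sketch of a MiMa exclusion for a private[sink] change; the problem
// type and target name below are placeholders, not the real entry.
import com.typesafe.tools.mima.core._

val flumeSinkExcludes = Seq(
  ProblemFilters.exclude[MissingMethodProblem](
    "org.apache.spark.streaming.flume.sink.SomePrivateSinkClass.someChangedMethod")
)
```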





[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3794#issuecomment-68361338
  
**[Test build #24892 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24892/consoleFull)**
 for PR 3794 at commit 
[`6e95955`](https://github.com/apache/spark/commit/6e95955c9c67ce509372fe08f9ced962eb251593)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3794#issuecomment-68361347
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24892/
Test FAILed.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68361334
  
  [Test build #24895 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24895/consoleFull)
 for   PR 3842 at commit 
[`50ff80e`](https://github.com/apache/spark/commit/50ff80e4498c2cb0a30793fb41fa2d20942811d6).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...

2014-12-30 Thread jimfcarroll
Github user jimfcarroll commented on the pull request:

https://github.com/apache/spark/pull/3254#issuecomment-68361866
  
@MickDavies thanks. I needed the change and was beginning the process of 
profiling again. Writing 5.5 million rows with 2000+ columns to a Parquet file took 
over 15 hours for me, so I incorporated your change when I saw your description.





[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3433#issuecomment-68363448
  
  [Test build #24896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24896/consoleFull)
 for   PR 3433 at commit 
[`6b47555`](https://github.com/apache/spark/commit/6b4755503e64153fb05425b2085076450a7cbe4a).
 * This patch merges cleanly.





[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...

2014-12-30 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/3771#issuecomment-68367226
  
@SaintBacchus so I'm still a bit unclear about the exact scenario. I just want 
to make sure we are handling everything properly, so I want to be sure I 
understand fully.

So this is when the RM goes down and is being brought back up or fails over 
to a standby. At that point it restarts the applications to start a new 
attempt. The shutdown hook is run and the code you mention above runs and 
unregisters. I understand client mode can't set it because the spark context is not 
in the same process. The thing that is unclear to me is how cluster mode is 
setting the finalStatus to something other than SUCCEEDED. Is the SparkContext 
being signalled and then throwing an exception so that `startUserClass` catches it 
and marks it as failed?






[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68369026
  
  [Test build #24895 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24895/consoleFull)
 for   PR 3842 at commit 
[`50ff80e`](https://github.com/apache/spark/commit/50ff80e4498c2cb0a30793fb41fa2d20942811d6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-2757 [BUILD] [STREAMING] Add Mima test f...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3842#issuecomment-68369035
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24895/
Test PASSed.





[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...

2014-12-30 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/3771#discussion_r22352924
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -153,6 +153,19 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
   }
 
   /**
+   * we should distinct the default final status between client and cluster,
+   * because the SUCCEEDED status may cause the HA failed in client mode and
+   * UNDEFINED may cause the error reporter in cluster when using sys.exit.
+   */
+  final def getDefaultFinalStatus() = {
--- End diff --

I assume we are hitting the logic on line 108 above in `if (!finished) {`... 
I think that comment and code are based on the final status defaulting to 
success. At the very least we should update that comment to explain what is 
going to happen in client vs cluster mode. Since the DisassociatedEvent exits 
with success for client mode, I think making the default UNDEFINED for client 
mode is fine.





[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3433#issuecomment-68370094
  
  [Test build #24896 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24896/consoleFull)
 for   PR 3433 at commit 
[`6b47555`](https://github.com/apache/spark/commit/6b4755503e64153fb05425b2085076450a7cbe4a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Concat(left: Expression, right: Expression) extends 
BinaryExpression `






[GitHub] spark pull request: [SPARK-4576][SQL] Add concatenation operator

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3433#issuecomment-68370097
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24896/
Test PASSed.





[GitHub] spark pull request: [YARN][SPARK-4929] Bug fix: fix the yarn-clien...

2014-12-30 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/3771#discussion_r22353308
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -153,6 +153,19 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
   }
 
   /**
+   * we should distinct the default final status between client and cluster,
--- End diff --

Can we clarify this comment a little? Perhaps something like the following 
(feel free to reword):

Set the default final application status for client mode to UNDEFINED so that 
the application properly retries if YARN HA restarts it. Set the final status to 
SUCCEEDED in cluster mode to handle the case where the user calls System.exit 
from the application code.
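
For illustration, the semantics described here could be sketched as below, assuming 
YARN's `FinalApplicationStatus` enum and an `isClusterMode` flag (names are 
illustrative, not the actual ApplicationMaster code):

```
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus

// Default to UNDEFINED in client mode so a YARN HA restart leads to a retry;
// default to SUCCEEDED in cluster mode so a user System.exit(0) reads as success.
def defaultFinalStatus(isClusterMode: Boolean): FinalApplicationStatus =
  if (isClusterMode) FinalApplicationStatus.SUCCEEDED
  else FinalApplicationStatus.UNDEFINED
```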






[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68371410
  
  [Test build #24897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24897/consoleFull)
 for   PR 3431 at commit 
[`44eb70c`](https://github.com/apache/spark/commit/44eb70cda9049a68d7a3a4a4ca74e5bc41f04991).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68371497
  
  [Test build #24897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24897/consoleFull)
 for   PR 3431 at commit 
[`44eb70c`](https://github.com/apache/spark/commit/44eb70cda9049a68d7a3a4a4ca74e5bc41f04991).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DefaultSource extends SchemaRelationProvider `
  * `case class ParquetRelation2(`
  * `trait SchemaRelationProvider `






[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68371499
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24897/
Test FAILed.





[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68372387
  
  [Test build #24898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24898/consoleFull)
 for   PR 3431 at commit 
[`02a662c`](https://github.com/apache/spark/commit/02a662c4cb3605b3abc7033ad14e3b7400c30964).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3769#issuecomment-68375782
  
I've made the SPARK_VERSION change in the maintenance branches, so I'm now 
going to merge this into `master` (1.3.0), `branch-1.2` (1.2.1), `branch-1.1` 
(1.1.2), and `branch-1.0` (1.0.3).  Thanks!





[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...

2014-12-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3769





[GitHub] spark pull request: [SPARK-4920][UI] add version on master and wor...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3769#issuecomment-68376239
  
Actually, it looks like there are other patches that need to be 
cherry-picked before this can be pulled into `branch-1.1` (1.1.2) and 
`branch-1.0` (1.0.3); I'll tag this in JIRA for followup and handle it myself.





[GitHub] spark pull request: [SPARK-4920][UI]:current spark version in UI i...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3763#issuecomment-68376619
  
Branch-1.1 backport is here: #3768





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68376638
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68376710
  
LGTM, pending Jenkins.





[GitHub] spark pull request: [SPARK-4882] Register PythonBroadcast with Kry...

2014-12-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3831





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68377078
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24899/consoleFull)
 for   PR 3768 at commit 
[`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4882] Register PythonBroadcast with Kry...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3831#issuecomment-68377184
  
I've merged this into `master` (1.3.0) and `branch-1.2` (1.2.1).





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68377345
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24899/consoleFull)
 for   PR 3768 at commit 
[`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68377346
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24899/
Test FAILed.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3841#discussion_r22356478
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -176,6 +176,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient
   logInfo(s"Running Spark version $SPARK_VERSION")
 
   private[spark] val conf = config.clone()
+  val portRetriesConf = conf.getOption("spark.port.maxRetries")
--- End diff --

You could use `conf.getOption(...).foreach { portRetriesConf => [...] }`, 
but I'm not sure that it's a huge win. 
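
For illustration, the two styles being compared look roughly like this (a standalone 
sketch, not the actual SparkContext code):

```
import org.apache.spark.SparkConf

val conf = new SparkConf()

// Style in the patch: explicit isDefined/get on the Option.
val portRetriesConf = conf.getOption("spark.port.maxRetries")
if (portRetriesConf.isDefined) {
  System.setProperty("spark.port.maxRetries", portRetriesConf.get)
}

// Suggested alternative: foreach over the Option.
conf.getOption("spark.port.maxRetries").foreach { value =>
  System.setProperty("spark.port.maxRetries", value)
}
```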





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3841#discussion_r22356523
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1691,15 +1691,12 @@ private[spark] object Utils extends Logging {
   /**
    * Default maximum number of retries when binding to a port before giving up.
    */
-  val portMaxRetries: Int = {
+  lazy val portMaxRetries: Int = {
--- End diff --

Why is this lazy?





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3841#discussion_r22356578
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1719,6 +1716,7 @@ private[spark] object Utils extends Logging {
   serviceName: String = "",
   maxRetries: Int = portMaxRetries): (T, Int) = {
 val serviceString = if (serviceName.isEmpty) "" else s" '$serviceName'"
+logInfo(s"Starting service$serviceString on port $startPort with maximum $maxRetries retries. ")
--- End diff --

Typo: an extra space is needed in `service$serviceString`.





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3841#discussion_r22356620
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala ---
@@ -76,7 +76,9 @@ trait ExecutorRunnableUtil extends Logging {
 // uses Akka to connect to the scheduler, the akka settings are needed as well as the
 // authentication settings.
 sparkConf.getAll.
-  filter { case (k, v) => k.startsWith("spark.auth") || k.startsWith("spark.akka") }.
+  filter { case (k, v) =>
+  k.startsWith("spark.auth") || k.startsWith("spark.akka") || k.equals("spark.port.maxRetries")
--- End diff --

This line is underindented relative to `filter`; I'd move the `filter { case (k, v) =>` 
to the previous line, and the matching brace to the next line.
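
For illustration, the suggested layout would read roughly like this (a standalone 
sketch with its own SparkConf, not the ExecutorRunnableUtil code itself):

```
import org.apache.spark.SparkConf

val sparkConf = new SparkConf()

// Open the filter block at the end of the line, indent the predicate body
// relative to `filter`, and close the brace on its own line.
val forwarded = sparkConf.getAll.filter { case (k, v) =>
  k.startsWith("spark.auth") || k.startsWith("spark.akka") || k.equals("spark.port.maxRetries")
}
```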





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3841#discussion_r22356634
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -176,6 +176,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient
   logInfo(s"Running Spark version $SPARK_VERSION")
 
   private[spark] val conf = config.clone()
+  val portRetriesConf = conf.getOption("spark.port.maxRetries")
+  if (portRetriesConf.isDefined) {
+    System.setProperty("spark.port.maxRetries", portRetriesConf.get)
--- End diff --

Won't changing from SparkConf to system properties break the ability to set 
this configuration via SparkConf?
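
For what it's worth, keeping the read on the SparkConf side could look roughly like 
the sketch below (the helper, the fallback order, and the default of 16 are 
assumptions for illustration, not the actual Utils code):

```
import org.apache.spark.SparkConf

// Prefer the SparkConf value, then the system property, then an assumed default.
def portMaxRetries(conf: Option[SparkConf]): Int =
  conf.flatMap(_.getOption("spark.port.maxRetries"))
    .orElse(sys.props.get("spark.port.maxRetries"))
    .map(_.toInt)
    .getOrElse(16)
```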





[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3841#issuecomment-68378792
  
I'm a bit confused about this change, since it seems like changing the code 
to read that value from system properties instead of SparkConf breaks our 
ability to configure it via SparkConf.

Can you add a failing unit test which demonstrates the problem / bug that 
this patch addresses?

If this issue has to do with initialization ordering, I'd like to see if we 
can come up with a cleaner approach which doesn't involve things like 
unexplained `lazy` keywords (since I'm concerned that such approaches will 
inevitably break when the code is modified).





[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68378891
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24898/
Test PASSed.





[GitHub] spark pull request: [SPARK-4574][SQL] Adding support for defining ...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3431#issuecomment-68378887
  
  [Test build #24898 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24898/consoleFull)
 for   PR 3431 at commit 
[`02a662c`](https://github.com/apache/spark/commit/02a662c4cb3605b3abc7033ad14e3b7400c30964).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DefaultSource extends SchemaRelationProvider `
  * `case class ParquetRelation2(`
  * `trait SchemaRelationProvider `






[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68380658
  
That latest failure is my fault (bad merge that I've reverted).






[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68380668
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-4920][UI]: back port the PR-3763 to bra...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3768#issuecomment-68380897
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24900/consoleFull)
 for   PR 3768 at commit 
[`ec2365f`](https://github.com/apache/spark/commit/ec2365fafa159bd4cc1d3a62a125ac76d4e0dd16).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3794#discussion_r22357607
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -202,9 +202,6 @@ abstract class RDD[T: ClassTag](
*/
   final def partitions: Array[Partition] = {
 checkpointRDD.map(_.partitions).getOrElse {
-  if (partitions_ == null) {
--- End diff --

Won't this now throw an NPE if we call `partitions` from a worker, since now 
this will return `null` after the RDD is serialized and deserialized? I guess 
maybe we never do that?





[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...

2014-12-30 Thread OopsOutOfMemory
GitHub user OopsOutOfMemory opened a pull request:

https://github.com/apache/spark/pull/3847

[SPARK-5011][SQL] Add support for WITH SERDEPROPERTIES, TBLPROPERTIES in 
CREATE TEMPORARY TABLE

The issue is here:
https://issues.apache.org/jira/browse/SPARK-5011

Currently there is a bug which blocks this PR:
https://issues.apache.org/jira/browse/SPARK-5009

As a temporary workaround, I replace `SERDEPROPERTIES` with `SERDEPROP` and 
`TBLPROPERTIES` with `TBLPROP`.

After that bug is fixed, I will rename them back.

And the final version will be like this, see below:
```
val hbaseDDL = s"""
  |CREATE TEMPORARY TABLE hbase_people(row_key string, name string, age int, job string)
  |USING com.shengli.spark.hbase
  |OPTIONS (
  |  someOptions 'abcdefg'
  |)
  |WITH SERDEPROPERTIES (
  |  'hbase.columns.mapping'=':key, profile:name, profile:age, career:job'
  |)
  |TBLPROPERTIES (
  | 'hbase.table.name' = 'people'
  |)
""".stripMargin
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/OopsOutOfMemory/spark params

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3847


commit 0657df4a95eb0d5db8bcbfe87eedfe1477ffa1a4
Author: OopsOutOfMemory victorshen...@126.com
Date:   2014-12-30T17:50:52Z

add support for SERDEPROPERTIES TBLPROPERTIES







[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3794#discussion_r22357840
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/BinaryFileRDD.scala ---
@@ -46,6 +47,7 @@ private[spark] class BinaryFileRDD[T](
 for (i <- 0 until rawSplits.size) {
   result(i) = new NewHadoopPartition(id, i, rawSplits(i).asInstanceOf[InputSplit with Writable])
 }
+logDebug("Get these partitions took %f s".format((System.nanoTime - start) / 1e9))
--- End diff --

Since this `getPartitions` method is guaranteed to only be called once, I 
think we can just move this logging to its call site in `RDD.scala` (e.g. add a 
block near where we assign to `partitions_`).
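
For illustration, timing the computation once at the call site could look like this 
standalone sketch (the `timed` helper and `computePartitions` stand-in are 
hypothetical names, not Spark code):

```
// Time a block once, at the single place it is invoked.
def timed[T](label: String)(body: => T): T = {
  val start = System.nanoTime
  val result = body
  println(s"$label took ${(System.nanoTime - start) / 1e9} s")
  result
}

// Stand-in for an RDD's getPartitions implementation.
def computePartitions(): Array[Int] = Array.tabulate(4)(identity)

val partitions = timed("getPartitions")(computePartitions())
```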





[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...

2014-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3847#issuecomment-68381796
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4501][Core] - Create build/mvn to autom...

2014-12-30 Thread brennonyork
Github user brennonyork commented on the pull request:

https://github.com/apache/spark/pull/3707#issuecomment-68381889
  
@witgo just for clarity, does this mean you aren't seeing this issue 
anymore? Want to ensure you aren't having any more troubles! :)





[GitHub] spark pull request: [SPARK-4961] [CORE] Put HadoopRDD.getPartition...

2014-12-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3794#issuecomment-68382391
  
@markhamstra 

> How would this interact with the idea of @erikerlandson to defer partition computation?
> #3079

Maybe I'm overlooking something, but #3079 seems kind of orthogonal.  It 
seems like that issue is concerned with making the `sortByKey` transformation 
lazy so that it does not eagerly trigger a Spark job to compute the range 
partition boundaries, whereas this pull request is related to eager vs. lazy 
evaluation of what's effectively a Hadoop filesystem metadata call.

Maybe eager vs. lazy is the wrong way to think about this PR's issue, 
though, since I guess we're more concerned with _where_ the call is performed 
(blocking DAGScheduler's event loop vs. a driver user-code thread) than when 
it's performed.  I suppose that maybe you could contrive an example where this 
patch changes the behavior of a user job, since maybe someone defines some 
transformations up-front, runs jobs to generate output, then reads it back in 
another RDD, in which case the data to be read might not exist at the time that 
the RDD is defined but will exist when the first action on it is invoked.  So, 
maybe we should consider moving the first `partitions` call closer to the 
DAGScheduler's job submission methods, but not inside of the actor (e.g. don't 
change any code in `RDD`, but just add a call that traverses the lineage chain 
and calls `partitions` on each RDD, making sure that this call occurs before 
the job submitter sends a message into the DAGScheduler actor).
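
A rough sketch of that last idea, assuming it runs on the submitting thread before 
anything is sent to the DAGScheduler actor (the helper name is hypothetical, not 
existing Spark code):

```
import scala.collection.mutable
import org.apache.spark.rdd.RDD

// Walk the lineage and force partitions on every RDD so the (possibly slow)
// filesystem metadata calls happen in driver user code, not in the scheduler's event loop.
def eagerlyComputePartitions(rdd: RDD[_]): Unit = {
  val visited = mutable.HashSet[RDD[_]]()
  def visit(r: RDD[_]): Unit = {
    if (visited.add(r)) {
      r.partitions                         // triggers getPartitions once per RDD
      r.dependencies.foreach(dep => visit(dep.rdd))
    }
  }
  visit(rdd)
}
```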





[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...

2014-12-30 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/3846#issuecomment-68382800
  
add to whitelist





[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...

2014-12-30 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/3846#issuecomment-68382810
  
ok to test





[GitHub] spark pull request: [SPARK-4998][MLlib]delete the train function

2014-12-30 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/3836#issuecomment-68383731
  
@srowen The Scala compiler doesn't generate the static `train` method under 
`DecisionTree` if there is a `train` method under `class DecisionTree`, 
regardless of the method signature. That's why we deprecated this method. From 
javap:

~~~
public class org.apache.spark.mllib.tree.DecisionTree implements scala.Serializable,org.apache.spark.Logging {
  public static scala.Option<org.apache.spark.mllib.tree.impl.NodeIdCache> findBestSplits$default$10();
  public static org.apache.spark.mllib.tree.impl.TimeTracker findBestSplits$default$9();
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainRegressor(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint>, int, java.util.Map<java.lang.Integer, java.lang.Integer>, java.lang.String, int, int);
  public static org.apache.spark.mllib.tree.model.DecisionTreeModel trainClassifier(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>, int, scala.collection.immutable.Map<java.lang.Object, java.lang.Object>, java.lang.String, int, int);
  public org.slf4j.Logger org$apache$spark$Logging$$log_();
  public void org$apache$spark$Logging$$log__$eq(org.slf4j.Logger);
  public java.lang.String logName();
  public org.slf4j.Logger log();
  public void logInfo(scala.Function0<java.lang.String>);
  public void logDebug(scala.Function0<java.lang.String>);
  public void logTrace(scala.Function0<java.lang.String>);
  public void logWarning(scala.Function0<java.lang.String>);
  public void logError(scala.Function0<java.lang.String>);
  public void logInfo(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logDebug(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logTrace(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logWarning(scala.Function0<java.lang.String>, java.lang.Throwable);
  public void logError(scala.Function0<java.lang.String>, java.lang.Throwable);
  public boolean isTraceEnabled();
  public org.apache.spark.mllib.tree.model.DecisionTreeModel run(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.model.DecisionTreeModel train(org.apache.spark.rdd.RDD<org.apache.spark.mllib.regression.LabeledPoint>);
  public org.apache.spark.mllib.tree.DecisionTree(org.apache.spark.mllib.tree.configuration.Strategy);
}
~~~

One way to call that method from Java is quite ugly:

~~~
DecisionTree$.MODULE$.train(...)
~~~





[GitHub] spark pull request: [Spark-4995] Replace Vector.toBreeze.activeIte...

2014-12-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3846#issuecomment-68383326
  
  [Test build #24901 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24901/consoleFull)
 for   PR 3846 at commit 
[`3eb7e37`](https://github.com/apache/spark/commit/3eb7e3711fcae74031a94708233db0d8da348ea4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-5011][SQL] Add support for WITH SERDEPR...

2014-12-30 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/3847#issuecomment-68384166
  
What is the rationale behind this change? You already have options for 
passing key/value pairs to the library. Also, there is nothing called a 
`SerDe` in the external datasources API. Why not just pass them all as options?




