[GitHub] spark issue #17596: [SPARK-12837][CORE] Do not send the accumulator name to ...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17596
  
**[Test build #76017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76017/testReport)** for PR 17596 at commit [`e8633f9`](https://github.com/apache/spark/commit/e8633f9ef48638ea241a137f9a1608256f435152).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17681: [SPARK-20383][SQL] Supporting Create [temporary] Functio...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17681
  
The implementation of function existence checking is a little bit messy. We 
might need to clean it up before starting the implementation of this PR.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76016/testReport)** for PR 17711 at commit [`72d1ae1`](https://github.com/apache/spark/commit/72d1ae191c64d5f8063750ff64daef6c91dfacf4).





[GitHub] spark pull request #17707: [SPARK-20412] Throw ParseException from visitNonO...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17707#discussion_r112611803
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -214,8 +214,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
* Create a partition specification map without optional values.
*/
   protected def visitNonOptionalPartitionSpec(
-  ctx: PartitionSpecContext): Map[String, String] = withOrigin(ctx) {
-visitPartitionSpec(ctx).mapValues(_.orNull).map(identity)
+ctx: PartitionSpecContext): Map[String, String] = withOrigin(ctx) {
+visitPartitionSpec(ctx).map {
+  case (key, None) => throw new ParseException(s"Found empty key '$key'.", ctx)
--- End diff --

How about `Found an empty partition key '$key'`?
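For illustration, a standalone sketch of the fail-fast behavior this diff introduces, using toy types rather than Spark's actual `AstBuilder` or `ParseException`:

```scala
// Toy stand-in for Spark's ParseException, which in Spark also carries
// the parser context (ctx) for error positioning.
final case class ParseError(message: String) extends Exception(message)

// Convert an optional-valued partition spec into a required-valued one,
// failing fast on a key with no value instead of mapping it to null.
def nonOptionalPartitionSpec(spec: Map[String, Option[String]]): Map[String, String] =
  spec.map {
    case (key, None)        => throw ParseError(s"Found an empty partition key '$key'.")
    case (key, Some(value)) => key -> value
  }
```

With an input like `Map("dt" -> None)` this now raises a parse error rather than silently producing a null value, which is the motivation for the change.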





[GitHub] spark pull request #17707: [SPARK-20412] Throw ParseException from visitNonO...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17707#discussion_r112611620
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -214,8 +214,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
* Create a partition specification map without optional values.
*/
   protected def visitNonOptionalPartitionSpec(
-  ctx: PartitionSpecContext): Map[String, String] = withOrigin(ctx) {
-visitPartitionSpec(ctx).mapValues(_.orNull).map(identity)
+ctx: PartitionSpecContext): Map[String, String] = withOrigin(ctx) {
--- End diff --

Nit: indentation issue; two extra spaces need to be added.





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112611521
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
 ---
@@ -290,4 +290,14 @@ class SparkSqlParserSuite extends PlanTest {
   basePlan,
   numPartitions = newConf.numShufflePartitions)))
   }
+
+  test("pipeline concatenation") {
+val concat = Concat(
+  UnresolvedAttribute("a") ::
+  Concat(UnresolvedAttribute("b") :: UnresolvedAttribute("c") :: Nil) ::
+  Nil)
--- End diff --

@viirya How about the latest fix?





[GitHub] spark issue #17700: [SPARK-20391][Core] Rename memory related fields in Exec...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17700
  
**[Test build #76015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76015/testReport)** for PR 17700 at commit [`e1d618b`](https://github.com/apache/spark/commit/e1d618b3396b7b89b731d4f1b3d19781416fe85b).





[GitHub] spark issue #17467: [SPARK-20140][DStream] Remove hardcoded kinesis retry wa...

2017-04-20 Thread yssharma
Github user yssharma commented on the issue:

https://github.com/apache/spark/pull/17467
  
@budde @brkyvz - Implemented the review changes. Please review.

- Using SparkConf for all the user parameters
- Changed kinesisWait from a var to a val
- Fixed the documentation
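As a sketch of the first bullet, making the retry wait configurable instead of hardcoded looks roughly like the following. The toy `Conf` class and the key name `spark.streaming.kinesis.retry.waitTime` are assumptions for illustration, not the PR's actual API:

```scala
// Minimal stand-in for SparkConf: string settings with a typed getter.
final class Conf(settings: Map[String, String]) {
  def getLong(key: String, default: Long): Long =
    settings.get(key).map(_.toLong).getOrElse(default)
}

// The wait is a val resolved once from configuration (with a default),
// rather than a hardcoded var, per the review feedback.
def kinesisRetryWaitMs(conf: Conf): Long =
  conf.getLong("spark.streaming.kinesis.retry.waitTime", 100L)
```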





[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17712
  
cc: @gatorsmile 





[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17712
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17712
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76013/
Test PASSed.





[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17712
  
**[Test build #76013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76013/testReport)** for PR 17712 at commit [`90c516f`](https://github.com/apache/spark/commit/90c516fb3a7163a7291eb72b058a82efee0e0c1e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16699: [SPARK-18710][ML] Add offset in GLM

2017-04-20 Thread actuaryzhang
Github user actuaryzhang commented on the issue:

https://github.com/apache/spark/pull/16699
  
@yanboliang @sethah 
Any suggestions on moving this PR forward? I'd appreciate your comments and 
reviews. 





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76014/testReport)** for PR 17711 at commit [`0701f87`](https://github.com/apache/spark/commit/0701f876a6d05513b8ca25d521f0370a4621a15a).





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76011/
Test PASSed.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Build finished. Test PASSed.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #76011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76011/testReport)** for PR 17191 at commit [`d3ab817`](https://github.com/apache/spark/commit/d3ab817c59c1421b2fc2b93024874b1b4e81a004).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76012/
Test PASSed.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #76012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76012/testReport)** for PR 17191 at commit [`e2d9310`](https://github.com/apache/spark/commit/e2d9310dc0d9051c5553f0a28a4fb79a3c9b92f5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112606757
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
@@ -353,11 +353,19 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging {
 }
 
 val driverUpdates = executionUIData.driverAccumUpdates.toSeq
-val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter {
-  case (id, _) => executionUIData.accumulatorMetrics.contains(id)
+// filter out updates that aren't Longs
--- End diff --

Because all of the SQL UI metrics are longs. We can safely discard anything 
else without missing SQL metrics. It's as close as we can get to the correct 
set.
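The filtering being defended here can be sketched standalone, using simplified `(id, value)` pairs rather than Spark's actual accumulator types:

```scala
// Keep only Long-valued updates; all SQL UI metrics are longs, so anything
// else can be discarded without losing SQL metrics.
def longMetricUpdates(updates: Seq[(Long, Any)]): Seq[(Long, Long)] =
  updates.collect { case (id, value: Long) => (id, value) }
```

The `collect` both filters out non-Long values and narrows the type in one pass, which is the idiomatic form of the `filter`-on-`isInstanceOf` shown in the diff.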





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112606701
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/util/DataFrameCallbackSuite.scala 
---
@@ -183,21 +183,22 @@ class DataFrameCallbackSuite extends QueryTest with SharedSQLContext {
 }
 
 withTable("tab") {
-  sql("CREATE TABLE tab(i long) using parquet")
+  sql("CREATE TABLE tab(i long) using parquet") // adds commands(1) via onSuccess
   spark.range(10).write.insertInto("tab")
-  assert(commands.length == 2)
-  assert(commands(1)._1 == "insertInto")
-  assert(commands(1)._2.isInstanceOf[InsertIntoTable])
-  assert(commands(1)._2.asInstanceOf[InsertIntoTable].table
+  assert(commands.length == 3)
+  assert(commands(2)._1 == "insertInto")
+  assert(commands(2)._2.isInstanceOf[InsertIntoTable])
+  assert(commands(2)._2.asInstanceOf[InsertIntoTable].table
 .asInstanceOf[UnresolvedRelation].tableIdentifier.table == "tab")
 }
+// exiting withTable adds commands(3) via onSuccess (drops tab)
--- End diff --

I don't think it is a good idea to come up with a way to exclude commands, 
since this is captured by a listener. Simply noting that onSuccess will issue a 
drop command is sufficient.





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112606599
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
@@ -353,11 +353,19 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging {
 }
 
 val driverUpdates = executionUIData.driverAccumUpdates.toSeq
-val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter {
-  case (id, _) => executionUIData.accumulatorMetrics.contains(id)
+// filter out updates that aren't Longs
+val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter { update =>
+  update._2.isInstanceOf[Long]
 }
+// Hack alert!
+// metrics may be missing from the executionUIData because metrics are taken from the
+// SparkPlan passed to withNewExecutionId, but some nodes in that plan link in LogicalPlan
+// nodes that don't expose the actual SparkPlan or metrics because the SparkPlan is built
--- End diff --

I think it's better to have the metrics than to discard them. Why discard 
metrics if we know that they can be safely summarized and used?





[GitHub] spark issue #17688: [MINOR][DOCS][PYTHON] Adding missing boolean type for re...

2017-04-20 Thread vundela
Github user vundela commented on the issue:

https://github.com/apache/spark/pull/17688
  
Hi @felixcheung, thanks for the review. I have added a small test case. 





[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17191#discussion_r112605454
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -136,6 +136,7 @@ class Analyzer(
   ResolveGroupingAnalytics ::
   ResolvePivot ::
   ResolveOrdinalInOrderByAndGroupBy ::
+  ResolveAggAliasInGroupBy ::
--- End diff --

One idea to put this rule outside the `resolution` batch is to uncheck grouping 
expression resolution in 
[Aggregate.resolved](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala#L558).
 But I feel this is a bit unsafe.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17711
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17711
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76010/
Test PASSed.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76010/testReport)** for PR 17711 at commit [`9cfaef6`](https://github.com/apache/spark/commit/9cfaef639292390f52a75b4434912dd8f18118ed).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112603493
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", false)
   private val outputBufferSize = sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, hadoopConf)
--- End diff --

Yes, in your case it is workable, but I'm not sure it could handle all the 
cases in the UTs.





[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17191#discussion_r112603375
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -136,6 +136,7 @@ class Analyzer(
   ResolveGroupingAnalytics ::
   ResolvePivot ::
   ResolveOrdinalInOrderByAndGroupBy ::
+  ResolveAggAliasInGroupBy ::
--- End diff --

Is it safer to put it in an individual batch after the `resolution` batch? 
Ideally we should only run this rule once we are sure there is no other way to 
resolve the grouping expressions except this rule. cc @gatorsmile 
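The rule being discussed can be illustrated with toy types (not Spark's actual `Analyzer` classes): grouping expressions that are still unresolved names get substituted with the matching SELECT-list alias, and everything else is left for other rules to handle.

```scala
sealed trait Expr
final case class Unresolved(name: String) extends Expr
final case class Col(name: String) extends Expr
final case class Alias(child: Expr, name: String) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr

// Substitute GROUP BY names that match a SELECT alias; leave others untouched.
def resolveAggAliases(groupBy: List[Expr], selectList: List[Expr]): List[Expr] = {
  val aliases = selectList.collect { case Alias(child, name) => name -> child }.toMap
  groupBy.map {
    case u @ Unresolved(name) => aliases.getOrElse(name, u)
    case other                => other
  }
}
```

Running this only after other resolution has had a chance (the batch-ordering question above) matters because an unresolved name might instead refer to a real input column.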





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112602983
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
 ---
@@ -290,4 +290,14 @@ class SparkSqlParserSuite extends PlanTest {
   basePlan,
   numPartitions = newConf.numShufflePartitions)))
   }
+
+  test("pipeline concatenation") {
+val concat = Concat(
+  UnresolvedAttribute("a") ::
+  Concat(UnresolvedAttribute("b") :: UnresolvedAttribute("c") :: Nil) ::
+  Nil)
--- End diff --

aha, I'll re-think a bit more, thanks!





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112602929
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/string-functions.sql 
---
@@ -1,3 +1,6 @@
+-- A pipe operation for string concatenation
+select 'a' || 'b' || 'c';
--- End diff --

ok





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112602578
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
 ---
@@ -290,4 +290,14 @@ class SparkSqlParserSuite extends PlanTest {
   basePlan,
   numPartitions = newConf.numShufflePartitions)))
   }
+
+  test("pipeline concatenation") {
+val concat = Concat(
+  UnresolvedAttribute("a") ::
+  Concat(UnresolvedAttribute("b") :: UnresolvedAttribute("c") :: Nil) 
::
+  Nil)
--- End diff --

oh. I see. But I think we may simplify nested `Concat` in `visitConcat`.
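The flattening suggested here can be sketched with a minimal stand-in for the Catalyst expression tree. The `Expr`, `Attr`, and `Concat` types below are simplified illustrations, not Spark's actual classes:

```scala
// Minimal stand-in for Catalyst expressions, for illustration only.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Concat(children: Seq[Expr]) extends Expr

// Collapse directly nested Concat nodes into a single child list,
// so Concat(a, Concat(b, c)) becomes Concat(a, b, c).
def flattenConcat(e: Expr): Expr = e match {
  case Concat(children) =>
    Concat(children.map(flattenConcat).flatMap {
      case Concat(cs) => cs
      case other      => Seq(other)
    })
  case other => other
}

val nested = Concat(Seq(Attr("a"), Concat(Seq(Attr("b"), Attr("c")))))
val flat   = flattenConcat(nested)
// flat == Concat(Seq(Attr("a"), Attr("b"), Attr("c")))
```

A pass like this inside `visitConcat` (or a later simplification rule) would let `a || b || c` parse to a single flat `Concat` node.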





[GitHub] spark issue #17681: [SPARK-20383][SQL] Supporting Create [temporary] Functio...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17681
  
The proposal LGTM





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112602421
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
 ---
@@ -290,4 +290,14 @@ class SparkSqlParserSuite extends PlanTest {
   basePlan,
   numPartitions = newConf.numShufflePartitions)))
   }
+
+  test("pipeline concatenation") {
+val concat = Concat(
+  UnresolvedAttribute("a") ::
+  Concat(UnresolvedAttribute("b") :: UnresolvedAttribute("c") :: Nil) 
::
+  Nil)
--- End diff --

Because you do `Concat(exprs.map(expression))`, isn't it 
`Concat(UnresolvedAttribute("a") :: UnresolvedAttribute("b") :: 
UnresolvedAttribute("c") :: Nil)`?







[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112601926
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
@@ -353,11 +353,19 @@ class SQLListener(conf: SparkConf) extends 
SparkListener with Logging {
 }
 
 val driverUpdates = executionUIData.driverAccumUpdates.toSeq
-val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter {
-  case (id, _) => executionUIData.accumulatorMetrics.contains(id)
+// filter out updates that aren't Longs
+val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter { 
update =>
+  update._2.isInstanceOf[Long]
 }
+// Hack alert!
+// metrics may be missing from the executionUIData because metrics 
are taken from the
+// SparkPlan passed to withNewExecutionId, but some nodes in that 
plan link in LogicalPlan
+// nodes that don't expose the actual SparkPlan or metrics because 
the SparkPlan is built
--- End diff --

for this case, can't we just ignore these metrics?
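The type-based filtering being discussed can be sketched as follows; the update pairs here are made-up sample data for illustration, not Spark's real accumulator plumbing:

```scala
// Accumulator updates arrive as (id, value) pairs where the value is untyped.
val updates: Seq[(Long, Any)] =
  Seq((1L, 10L), (2L, "not a Long metric"), (3L, 42L))

// Keep only the updates whose value is a Long; everything else is dropped.
val longUpdates = updates.collect {
  case (id, value: Long) => (id, value)
}
// longUpdates == Seq((1L, 10L), (3L, 42L))
```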





[GitHub] spark issue #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17712
  
**[Test build #76013 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76013/testReport)**
 for PR 17712 at commit 
[`90c516f`](https://github.com/apache/spark/commit/90c516fb3a7163a7291eb72b058a82efee0e0c1e).





[GitHub] spark issue #17702: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2017-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/17702
  
@marmbrus Can you take a look at this? Thanks :)





[GitHub] spark pull request #17712: [SPARK-20416][SQL] Print UDF names in EXPLAIN

2017-04-20 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/17712

[SPARK-20416][SQL] Print UDF names in EXPLAIN

## What changes were proposed in this pull request?
This PR adds `withName` to `UserDefinedFunction` so that UDF names are printed in 
EXPLAIN output.

## How was this patch tested?
Added tests in `UDFSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-20416

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17712.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17712


commit 90c516fb3a7163a7291eb72b058a82efee0e0c1e
Author: Takeshi Yamamuro 
Date:   2017-04-21T02:57:34Z

Print UDF names in Dataset APIs







[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112601515
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

OK, I will try to fix `resolveURI` to handle the space case, thanks.
What is your opinion on using `val uri = new Path(path).toUri` instead of 
`val uri = new URI(path)` in `resolveURI`? Then we would not need to encode, right?
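The difference in behavior can be reproduced with the plain JDK; `java.nio.file.Paths` stands in here for Hadoop's `Path`, which behaves analogously for this purpose (an illustration only, not Spark's code):

```scala
import java.net.{URI, URISyntaxException}
import java.nio.file.Paths

val raw = "/tmp/event log dir" // an event-log dir containing spaces

// The single-argument URI constructor rejects the raw space outright.
val direct: Option[URI] =
  try Some(new URI(raw))
  catch { case _: URISyntaxException => None }

// Building the URI from a path object percent-encodes the space instead.
val viaPath: URI = Paths.get(raw).toUri

// direct == None; viaPath.getPath == "/tmp/event log dir" (decoded form)
```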





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112601511
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/util/DataFrameCallbackSuite.scala 
---
@@ -183,21 +183,22 @@ class DataFrameCallbackSuite extends QueryTest with 
SharedSQLContext {
 }
 
 withTable("tab") {
-  sql("CREATE TABLE tab(i long) using parquet")
+  sql("CREATE TABLE tab(i long) using parquet") // adds commands(1) 
via onSuccess
   spark.range(10).write.insertInto("tab")
-  assert(commands.length == 2)
-  assert(commands(1)._1 == "insertInto")
-  assert(commands(1)._2.isInstanceOf[InsertIntoTable])
-  assert(commands(1)._2.asInstanceOf[InsertIntoTable].table
+  assert(commands.length == 3)
+  assert(commands(2)._1 == "insertInto")
+  assert(commands(2)._2.isInstanceOf[InsertIntoTable])
+  assert(commands(2)._2.asInstanceOf[InsertIntoTable].table
 .asInstanceOf[UnresolvedRelation].tableIdentifier.table == "tab")
 }
+// exiting withTable adds commands(3) via onSuccess (drops tab)
--- End diff --

how about we update `onSuccess` to only collect commands we are interested?





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112601099
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
@@ -353,11 +353,19 @@ class SQLListener(conf: SparkConf) extends 
SparkListener with Logging {
 }
 
 val driverUpdates = executionUIData.driverAccumUpdates.toSeq
-val totalUpdates = (accumulatorUpdates ++ driverUpdates).filter {
-  case (id, _) => executionUIData.accumulatorMetrics.contains(id)
+// filter out updates that aren't Longs
--- End diff --

why?





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r112601049
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
 ---
@@ -281,42 +281,54 @@ class StreamExecution(
 // Unblock `awaitInitialization`
 initializationLatch.countDown()
 
-triggerExecutor.execute(() => {
--- End diff --

cc @zsxwing for streaming changes





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112599634
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/string-functions.sql 
---
@@ -1,3 +1,6 @@
+-- A pipe operation for string concatenation
+select 'a' || 'b' || 'c';
--- End diff --

Please move this to the end of the file. It can minimize the code changes.





[GitHub] spark pull request #17670: [SPARK-20281][SQL] Print the identical Range para...

2017-04-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17670





[GitHub] spark issue #17670: [SPARK-20281][SQL] Print the identical Range parameters ...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17670
  
Thanks! Merging to master/2.2





[GitHub] spark pull request #15009: [SPARK-17443][SPARK-11035] Stop Spark Application...

2017-04-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15009#discussion_r112597680
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/SparkAppHandle.java ---
@@ -95,7 +95,8 @@ public boolean isFinal() {
   void kill();
 
   /**
-   * Disconnects the handle from the application, without stopping it. 
After this method is called,
+   * Disconnects the handle from the application. If using {@link 
SparkLauncher#autoShutdown()}
--- End diff --

It looks like the Javadoc 8 documentation generation is failing due to this line:

```
[error] 
/home/jenkins/workspace/SparkPullRequestBuilder/launcher/src/main/java/org/apache/spark/launcher/SparkAppHandle.java:98:
 error: reference not found
[error]* Disconnects the handle from the application. If using {@link 
SparkLauncher#autoShutdown()}
[error]   ^
```

Probably wrap it with `` `...` `` as I did before in 
https://github.com/apache/spark/pull/16013, or find a way to make the link 
resolve properly.

The other errors seem spurious. Please refer to my observation at 
https://github.com/apache/spark/pull/17389#issuecomment-288438704





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #76012 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76012/testReport)**
 for PR 17191 at commit 
[`e2d9310`](https://github.com/apache/spark/commit/e2d9310dc0d9051c5553f0a28a4fb79a3c9b92f5).





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112597030
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -405,9 +405,7 @@ class SparkContext(config: SparkConf) extends Logging {
 
 _eventLogDir =
   if (isEventLogEnabled) {
-val unresolvedDir = conf.get("spark.eventLog.dir", 
EventLoggingListener.DEFAULT_LOG_DIR)
-  .stripSuffix("/")
-Some(Utils.resolveURI(unresolvedDir))
--- End diff --

I suggest using `new Path(path).toUri()` instead of `new URI(path)`, since 
`new URI(path)` does not support spaces in the path. 
Encoding is not necessary if we use `new Path(path).toUri()`.





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112596999
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

I don't agree with you. The String and URI representations should be equivalent; 
it is not the case that switching to a String representation works around the issue.

I think in your case we need to fix `resolveURI` to handle the space case.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #76011 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76011/testReport)**
 for PR 17191 at commit 
[`d3ab817`](https://github.com/apache/spark/commit/d3ab817c59c1421b2fc2b93024874b1b4e81a004).





[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

2017-04-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/13440#discussion_r112596837
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/ChiSquared.scala ---
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.tree.impurity
+
+import org.apache.spark.annotation.{DeveloperApi, Experimental, Since}
+
+/**
+ * :: Experimental ::
+ * Class for calculating [[https://en.wikipedia.org/wiki/Chi-squared_test 
chi-squared]]
--- End diff --

The documentation generation is failing due to this line:

```
[error] 
/home/jenkins/workspace/SparkPullRequestBuilder/mllib/target/java/org/apache/spark/mllib/tree/impurity/ChiSquared.java:4:
 error: unexpected text
[error]  * Class for calculating {@link 
https://en.wikipedia.org/wiki/Chi-squared_test chi-squared}
[error]
```

Probably remove the link or wrap it in an `href` as I did before in 
https://github.com/apache/spark/pull/16013.

The other errors seem spurious. Please refer to my observation at 
https://github.com/apache/spark/pull/17389#issuecomment-288438704





[GitHub] spark pull request #17641: [SPARK-20329][SQL] Make timezone aware expression...

2017-04-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17641





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112596553
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

So I think we should not use `new URI(path)`, since it does not support spaces 
in the path.
I suggest using `new Path(path).toUri()` instead of `new URI(path)`.





[GitHub] spark issue #17641: [SPARK-20329][SQL] Make timezone aware expression withou...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17641
  
LGTM, merging to master/2.2





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112595717
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -405,9 +405,7 @@ class SparkContext(config: SparkConf) extends Logging {
 
 _eventLogDir =
   if (isEventLogEnabled) {
-val unresolvedDir = conf.get("spark.eventLog.dir", 
EventLoggingListener.DEFAULT_LOG_DIR)
-  .stripSuffix("/")
-Some(Utils.resolveURI(unresolvedDir))
--- End diff --

I think in your case we need to percent-encode the `unresolvedDir` 
before calling `resolveURI`.
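One way to percent-encode a path before resolving it is the multi-argument `java.net.URI` constructor, which quotes illegal characters such as spaces. This is only a sketch of the idea under discussion, not the actual fix; `unresolvedDir` below is a hypothetical value:

```scala
import java.net.URI

val unresolvedDir = "/tmp/spark events" // hypothetical eventLog dir with a space

// The (scheme, host, path, fragment) constructor quotes characters that are
// illegal in a URI, so the space becomes %20 in the encoded form.
val encoded = new URI(null, null, unresolvedDir, null)

// encoded.toString == "/tmp/spark%20events"
// encoded.getPath  == "/tmp/spark events"  (getPath returns the decoded form)
```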





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112595663
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

Yes, I have tested it. In the `resolveURI` function, if the path contains a 
space, `new URI(path)` will throw an exception and the path will then be 
treated as a local FS path.
Thanks, shao.
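To make the failure mode concrete, here is a minimal Java sketch of the `java.net.URI` behavior under discussion (the `nn:9000` host is just the example path from this thread, not a real cluster):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSpaceDemo {
    public static void main(String[] args) {
        // An unencoded space is illegal URI syntax, so the single-argument
        // constructor throws; this is the failure that resolveURI catches
        // before falling back to treating the string as a local path.
        try {
            new URI("hdfs://nn:9000/a b/c");
            System.out.println("parsed");
        } catch (URISyntaxException e) {
            System.out.println("URISyntaxException"); // this branch is taken
        }
        // The percent-encoded form parses fine and keeps the hdfs scheme.
        try {
            System.out.println(new URI("hdfs://nn:9000/a%20b/c").getScheme()); // hdfs
        } catch (URISyntaxException e) {
            System.out.println("unexpected");
        }
    }
}
```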





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112595579
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

That is because `resolveURI` gets a `URISyntaxException` when resolving 
`hdfs://nn:9000/a b/c`, and it then falls back to a local file path instead. 
Please see the implementation of `resolveURI`.





[GitHub] spark issue #17707: [SPARK-20412] Throw ParseException from visitNonOptional...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17707
  
looks like we need to fix a test





[GitHub] spark pull request #17703: [SPARK-20367] Properly unescape column names of p...

2017-04-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17703





[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17191#discussion_r112595036
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1005,6 +1005,31 @@ class Analyzer(
   }
 
   /**
+   * Replace unresolved expressions in grouping keys with resolved ones in 
SELECT clauses.
+   */
+  object ResolveAggAliasInGroupBy extends Rule[LogicalPlan] {
+
+override def apply(plan: LogicalPlan): LogicalPlan = 
plan.resolveOperators {
+  case agg @ Aggregate(groups, aggs, child)
+  if conf.groupByAliases && child.resolved && 
aggs.forall(_.resolved) &&
+groups.exists(!_.resolved) =>
+agg.copy(groupingExpressions = groups.map {
+  case u: UnresolvedAttribute =>
+val resolvedAgg = aggs.find(ne => resolver(ne.name, u.name))
+// Check if no aggregate function exists in GROUP BY
+resolvedAgg.foreach {
+  case Alias(e, _) if 
ResolveAggregateFunctions.containsAggregate(e) =>
+throw new AnalysisException(
+  s"Aggregate function `$e` is not allowed in GROUP BY")
--- End diff --

okay, I'll update soon





[GitHub] spark issue #17703: [SPARK-20367] Properly unescape column names of partitio...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17703
  
thanks, merging to master/2.2!





[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17191#discussion_r112594814
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1005,6 +1005,31 @@ class Analyzer(
   }
 
   /**
+   * Replace unresolved expressions in grouping keys with resolved ones in 
SELECT clauses.
+   */
+  object ResolveAggAliasInGroupBy extends Rule[LogicalPlan] {
+
+override def apply(plan: LogicalPlan): LogicalPlan = 
plan.resolveOperators {
+  case agg @ Aggregate(groups, aggs, child)
+  if conf.groupByAliases && child.resolved && 
aggs.forall(_.resolved) &&
+groups.exists(!_.resolved) =>
+agg.copy(groupingExpressions = groups.map {
+  case u: UnresolvedAttribute =>
+val resolvedAgg = aggs.find(ne => resolver(ne.name, u.name))
+// Check if no aggregate function exists in GROUP BY
+resolvedAgg.foreach {
+  case Alias(e, _) if 
ResolveAggregateFunctions.containsAggregate(e) =>
+throw new AnalysisException(
+  s"Aggregate function `$e` is not allowed in GROUP BY")
--- End diff --

please update your code, We have this check in `CheckAnalysis` now





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76010 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76010/testReport)**
 for PR 17711 at commit 
[`9cfaef6`](https://github.com/apache/spark/commit/9cfaef639292390f52a75b4434912dd8f18118ed).





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112594234
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

Are you sure? Let me investigate a bit.





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112594056
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -50,22 +49,22 @@ import org.apache.spark.util.{JsonProtocol, Utils}
 private[spark] class EventLoggingListener(
 appId: String,
 appAttemptId : Option[String],
-logBaseDir: URI,
+logBaseDir: String,
 sparkConf: SparkConf,
 hadoopConf: Configuration)
   extends SparkListener with Logging {
 
   import EventLoggingListener._
 
-  def this(appId: String, appAttemptId : Option[String], logBaseDir: URI, 
sparkConf: SparkConf) =
+  def this(appId: String, appAttemptId : Option[String], logBaseDir: 
String, sparkConf: SparkConf) =
 this(appId, appAttemptId, logBaseDir, sparkConf,
   SparkHadoopUtil.get.newConfiguration(sparkConf))
 
   private val shouldCompress = 
sparkConf.getBoolean("spark.eventLog.compress", false)
   private val shouldOverwrite = 
sparkConf.getBoolean("spark.eventLog.overwrite", false)
   private val testing = sparkConf.getBoolean("spark.eventLog.testing", 
false)
   private val outputBufferSize = 
sparkConf.getInt("spark.eventLog.buffer.kb", 100) * 1024
-  private val fileSystem = Utils.getHadoopFileSystem(logBaseDir, 
hadoopConf)
--- End diff --

What about the URI `hdfs://nn:9000/a b/c`? Even though it has the right FS 
scheme, the local FS will be used instead.





[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-20 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/17582
  
Just update the description, please review again @vanzin , thanks!





[GitHub] spark pull request #17638: [SPARK-20338][CORE]Spaces in spark.eventLog.dir a...

2017-04-20 Thread zuotingbing
Github user zuotingbing commented on a diff in the pull request:

https://github.com/apache/spark/pull/17638#discussion_r112593597
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -405,9 +405,7 @@ class SparkContext(config: SparkConf) extends Logging {
 
 _eventLogDir =
   if (isEventLogEnabled) {
-val unresolvedDir = conf.get("spark.eventLog.dir", 
EventLoggingListener.DEFAULT_LOG_DIR)
-  .stripSuffix("/")
-Some(Utils.resolveURI(unresolvedDir))
--- End diff --

If the dir contains a space and also contains `%20` (e.g. `hdfs://nn:9000/a 
b%20c`), it seems to me that the encoding does not work well.
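A small Java sketch of this double-encoding concern (using the example path from the comment): the multi-argument `java.net.URI` constructors quote every `%` they see, so a path that already contains `%20` gets encoded a second time:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DoubleEncodeDemo {
    public static void main(String[] args) throws URISyntaxException {
        // The multi-argument constructors quote illegal characters (the space
        // becomes %20) and also quote any '%' already present, so the
        // pre-encoded "%20" turns into "%2520" -- encoded twice.
        URI u = new URI("hdfs", null, "nn", 9000, "/a b%20c", null, null);
        System.out.println(u); // hdfs://nn:9000/a%20b%2520c
    }
}
```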





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-04-20 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17191
  
ping





[GitHub] spark issue #17666: [SPARK-20311][SQL] Support aliases for table value funct...

2017-04-20 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17666
  
ping





[GitHub] spark issue #17670: [SPARK-20281][SQL] Print the identical Range parameters ...

2017-04-20 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17670
  
ping





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17711
  
okay, I'll add soon.





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112591328
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1483,4 +1483,12 @@ class SparkSqlAstBuilder(conf: SQLConf) extends 
AstBuilder {
   query: LogicalPlan): LogicalPlan = {
 RepartitionByExpression(expressions, query, conf.numShufflePartitions)
   }
+
+  /**
+   * Create a [[Concat]] expression for pipeline concatenation.
+   */
+  override def visitConcat(ctx: ConcatContext): Expression = {
+val exprs = ctx.primaryExpression().asScala
+Concat(expression(exprs.head) +: exprs.drop(1).map(expression))
--- End diff --

oh, I missed.. you're right. I'll fix





[GitHub] spark issue #17700: [SPARK-20391][Core] Rename memory related fields in Exec...

2017-04-20 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/17700
  
Thanks @squito for clarification, sorry I misunderstood it.

Regarding this new `memoryMetrics`, will all the memory-related metrics be 
shown here, like what you mentioned in the JIRA?

Also, PR #17625 exposes the Netty memory usage; should that memory 
usage also be shown here? I'm thinking of an extensible and reasonable place to 
track all the memory usages, so that our follow-up work will not break this API 
again.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76009 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76009/testReport)**
 for PR 17711 at commit 
[`bd36e58`](https://github.com/apache/spark/commit/bd36e58d9572de7418cea4b46360dca4c35052f7).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17711
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76009/
Test FAILed.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17711
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17711
  
**[Test build #76009 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76009/testReport)**
 for PR 17711 at commit 
[`bd36e58`](https://github.com/apache/spark/commit/bd36e58d9572de7418cea4b46360dca4c35052f7).





[GitHub] spark issue #17711: [SPARK-19951][SQL] Add string concatenate operator || to...

2017-04-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/17711
  
can you add a test case in sql query file tests? 





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17711#discussion_r112590613
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1483,4 +1483,12 @@ class SparkSqlAstBuilder(conf: SQLConf) extends 
AstBuilder {
   query: LogicalPlan): LogicalPlan = {
 RepartitionByExpression(expressions, query, conf.numShufflePartitions)
   }
+
+  /**
+   * Create a [[Concat]] expression for pipeline concatenation.
+   */
+  override def visitConcat(ctx: ConcatContext): Expression = {
+val exprs = ctx.primaryExpression().asScala
+Concat(expression(exprs.head) +: exprs.drop(1).map(expression))
--- End diff --

isn't this just `expression(exprs)`?
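The simplification being suggested can be sketched generically in Java (toy list and function, not Spark code): prepending `f(head)` to `f` mapped over the tail is the same as mapping `f` over the whole list.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MapWholeList {
    // Mirrors expression(exprs.head) +: exprs.drop(1).map(expression).
    static List<Integer> headPlusMappedTail(List<Integer> xs, Function<Integer, Integer> f) {
        return Stream.concat(
                Stream.of(f.apply(xs.get(0))),
                xs.subList(1, xs.size()).stream().map(f)
            ).collect(Collectors.toList());
    }

    // Mirrors exprs.map(expression).
    static List<Integer> mapAll(List<Integer> xs, Function<Integer, Integer> f) {
        return xs.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3);
        Function<Integer, Integer> f = x -> x * 10;
        // Both forms produce the same result, so the one-map version is simpler.
        System.out.println(headPlusMappedTail(xs, f).equals(mapAll(xs, f))); // true
    }
}
```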





[GitHub] spark pull request #17711: [SPARK-19951][SQL] Add string concatenate operato...

2017-04-20 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/17711

[SPARK-19951][SQL] Add string concatenate operator || to Spark SQL

## What changes were proposed in this pull request?
This PR adds code to support `||` for string concatenation. This string 
operation is supported in PostgreSQL and MySQL.

## How was this patch tested?
Added tests in `SparkSqlParserSuite`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-19951

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17711.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17711


commit bd36e58d9572de7418cea4b46360dca4c35052f7
Author: Takeshi Yamamuro 
Date:   2017-04-17T11:05:38Z

Add string concatenate operator || to Spark SQL







[GitHub] spark issue #17709: Small rewording about history server use case

2017-04-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17709
  
Yeah, it would be good to match the title to other PRs and follow the 
guidelines. It does not block this PR though.





[GitHub] spark issue #17710: [SPARK-20420][SQL] Add events to the external catalog

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17710
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76007/
Test PASSed.





[GitHub] spark issue #17710: [SPARK-20420][SQL] Add events to the external catalog

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17710
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17710: [SPARK-20420][SQL] Add events to the external catalog

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17710
  
**[Test build #76007 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76007/testReport)**
 for PR 17710 at commit 
[`8086e1a`](https://github.com/apache/spark/commit/8086e1a5564c8f73c175be342729fb354436b7a7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait ExternalCatalogEventListener `
  * `trait DatabaseEvent extends ExternalCatalogEvent `
  * `case class CreateDatabasePreEvent(database: String) extends 
DatabaseEvent`
  * `case class CreateDatabaseEvent(database: String) extends DatabaseEvent`
  * `case class DropDatabasePreEvent(database: String) extends 
DatabaseEvent`
  * `case class DropDatabaseEvent(database: String) extends DatabaseEvent`
  * `trait TableEvent extends DatabaseEvent `
  * `case class CreateTablePreEvent(database: String, name: String) extends 
TableEvent`
  * `case class CreateTableEvent(database: String, name: String) extends 
TableEvent`
  * `case class DropTablePreEvent(database: String, name: String) extends 
TableEvent`
  * `case class DropTableEvent(database: String, name: String) extends 
TableEvent`
  * `case class RenameTablePreEvent(`
  * `case class RenameTableEvent(`
  * `trait FunctionEvent extends DatabaseEvent `
  * `case class CreateFunctionPreEvent(database: String, name: String) 
extends FunctionEvent`
  * `case class CreateFunctionEvent(database: String, name: String) extends 
FunctionEvent`
  * `case class DropFunctionPreEvent(database: String, name: String) 
extends FunctionEvent`
  * `case class DropFunctionEvent(database: String, name: String) extends 
FunctionEvent`
  * `case class RenameFunctionPreEvent(`
  * `case class RenameFunctionEvent(`





[GitHub] spark issue #17665: [SPARK-16742] Mesos Kerberos Support

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17665
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17665: [SPARK-16742] Mesos Kerberos Support

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17665
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76008/
Test FAILed.





[GitHub] spark issue #17665: [SPARK-16742] Mesos Kerberos Support

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17665
  
**[Test build #76008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76008/testReport)** for PR 17665 at commit [`4c387eb`](https://github.com/apache/spark/commit/4c387ebcb584732d0d67e83c0b9d5f4cfd1db247).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17693: [SPARK-20314][SQL] Inconsistent error handling in...

2017-04-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17693#discussion_r112587862
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -393,7 +393,7 @@ case class JsonTuple(children: Seq[Expression])
 }
 
 try {
-  Utils.tryWithResource(jsonFactory.createParser(json.getBytes)) {
+  Utils.tryWithResource(jsonFactory.createParser(json.toString)) {
--- End diff --

this change has a performance penalty; we should just catch more exceptions 
below.
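
For context, a self-contained toy sketch of the cost difference being pointed at here (the "parser" below is an illustrative stand-in, not the real Jackson API): routing through a `String` forces a UTF-8 decode, and a byte-oriented parser must then re-encode, so the byte path avoids two extra passes and two allocations.

```java
import java.nio.charset.StandardCharsets;

public class ByteVsStringParse {
    // Illustrative stand-in for a parser that consumes raw UTF-8 bytes,
    // like jsonFactory.createParser(byte[]). NOT the real Jackson API.
    static int parseFromBytes(byte[] data) { return data.length; }

    // Byte path: hand over the existing bytes directly (no extra copies).
    static int byteJsonLength(byte[] utf8) { return parseFromBytes(utf8); }

    // String path: decode bytes -> String, then re-encode String -> bytes;
    // this round trip is the extra work the toString change would introduce.
    static int stringJsonLength(byte[] utf8) {
        String s = new String(utf8, StandardCharsets.UTF_8);       // decode pass + allocation
        return parseFromBytes(s.getBytes(StandardCharsets.UTF_8)); // re-encode pass + allocation
    }

    public static void main(String[] args) {
        byte[] json = "{\"a\":1}".getBytes(StandardCharsets.UTF_8);
        // Same result either way; the string path just does more work.
        System.out.println(byteJsonLength(json) == stringJsonLength(json)); // true
    }
}
```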





[GitHub] spark issue #17445: [SPARK-20115] [CORE] Fix DAGScheduler to recompute all t...

2017-04-20 Thread umehrot2
Github user umehrot2 commented on the issue:

https://github.com/apache/spark/pull/17445
  
@kayousterhout @mridulm @rxin @lins05 @markhamstra @tgravescs @squito Can 
you take a look at this?





[GitHub] spark issue #17680: [SPARK-20364][SQL] Support Parquet predicate pushdown on...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17680
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76006/
Test PASSed.





[GitHub] spark issue #17680: [SPARK-20364][SQL] Support Parquet predicate pushdown on...

2017-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17680
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17680: [SPARK-20364][SQL] Support Parquet predicate pushdown on...

2017-04-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17680
  
**[Test build #76006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76006/testReport)** for PR 17680 at commit [`fdc3943`](https://github.com/apache/spark/commit/fdc3943dc7ced2b74eaa0b85240611e426997c6c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17693: [SPARK-20314][SQL] Inconsistent error handling in...

2017-04-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17693#discussion_r112584915
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -149,7 +149,7 @@ case class GetJsonObject(json: Expression, path: 
Expression)
 
 if (parsed.isDefined) {
   try {
-Utils.tryWithResource(jsonFactory.createParser(jsonStr.getBytes)) { parser =>
+Utils.tryWithResource(jsonFactory.createParser(jsonStr.toString)) { parser =>
--- End diff --

I thought `JsonParseException` extends `IOException`.
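
(In Jackson, `JsonParseException` does extend `IOException`, via `JsonProcessingException`, so a catch on `IOException` already covers parse errors.) A toy sketch of why that subtype relationship matters for catch clauses; the classes below are stand-ins mirroring the hierarchy, not the real Jackson types:

```java
public class CatchOrderDemo {
    // Stand-ins mirroring Jackson's hierarchy (JsonParseException extends
    // IOException); NOT the real Jackson classes.
    static class MockIOException extends RuntimeException {
        MockIOException(String msg) { super(msg); }
    }
    static class MockJsonParseException extends MockIOException {
        MockJsonParseException(String msg) { super(msg); }
    }

    static String classify(Runnable body) {
        try {
            body.run();
            return "ok";
        } catch (MockJsonParseException e) {
            // The more specific subtype must be caught first, or this branch
            // would be unreachable behind the broader catch below.
            return "parse error";
        } catch (MockIOException e) {
            return "io error";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> { throw new MockJsonParseException("bad json"); })); // parse error
        System.out.println(classify(() -> { throw new MockIOException("disk failure"); }));    // io error
        System.out.println(classify(() -> {}));                                                // ok
    }
}
```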





[GitHub] spark issue #17693: [SPARK-20314][SQL] Inconsistent error handling in JSON p...

2017-04-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17693
  
I like the idea, but I am not sure about `DROPMALFORMED` mode. If we use 
an expression with that mode enabled, the whole record (not only the column but all 
columns) will be dropped in some JSON expressions, though probably not in generator 
expressions (did I understand correctly?).

I think we don't explicitly support parse modes in either 
`from_json` or `to_json` - 
https://github.com/apache/spark/blob/465818389aab1217c9de5c685cfaee3ffaec91bb/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L551

It sets `FAILFAST` but, to my knowledge, resembles `PERMISSIVE` mode.
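
To make the three parse modes concrete, here is a toy, self-contained sketch of their row-level semantics; this is an illustration under simplified assumptions (a fake "is valid JSON" check), not Spark's actual JSON parsing code:

```java
import java.util.ArrayList;
import java.util.List;

public class ParseModeDemo {
    // Toy model of the three parse modes discussed above.
    enum Mode { PERMISSIVE, DROPMALFORMED, FAILFAST }

    // Toy "is valid JSON" check, only for this demo.
    static boolean looksLikeJson(String s) {
        return s.startsWith("{") && s.endsWith("}");
    }

    // Returns one entry per surviving row; null marks a row kept with null columns.
    static List<String> parse(List<String> records, Mode mode) {
        List<String> out = new ArrayList<>();
        for (String r : records) {
            if (looksLikeJson(r)) {
                out.add(r);
            } else switch (mode) {
                case PERMISSIVE:    out.add(null); break;  // keep the row, null value
                case DROPMALFORMED: break;                 // drop the whole row
                case FAILFAST:      throw new RuntimeException("malformed: " + r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = List.of("{\"a\":1}", "oops");
        System.out.println(parse(in, Mode.PERMISSIVE));    // [{"a":1}, null]
        System.out.println(parse(in, Mode.DROPMALFORMED)); // [{"a":1}]
    }
}
```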





[GitHub] spark pull request #17693: [SPARK-20314][SQL] Inconsistent error handling in...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17693#discussion_r112584338
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
@@ -149,7 +149,7 @@ case class GetJsonObject(json: Expression, path: 
Expression)
 
 if (parsed.isDefined) {
   try {
-Utils.tryWithResource(jsonFactory.createParser(jsonStr.getBytes)) { parser =>
+Utils.tryWithResource(jsonFactory.createParser(jsonStr.toString)) { parser =>
--- End diff --

Keep the `jsonStr.getBytes` unchanged. 

```Java
/**
 * Method for constructing parser for parsing
 * the contents of given byte array.
 *
 * @since 2.1
 */
public JsonParser createParser(byte[] data) throws IOException, JsonParseException {
    IOContext ctxt = _createContext(data, true);
    if (_inputDecorator != null) {
        InputStream in = _inputDecorator.decorate(ctxt, data, 0, data.length);
        if (in != null) {
            return _createParser(in, ctxt);
        }
    }
    return _createParser(data, 0, data.length, ctxt);
}
```

I think we should capture both `IOException` and `JsonParseException`.





[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-04-20 Thread kevinyu98
Github user kevinyu98 commented on the issue:

https://github.com/apache/spark/pull/12646
  
@hvanhovell: Would you have some cycles to review this PR? I would 
appreciate some feedback on it. Thanks. 





[GitHub] spark issue #17693: [SPARK-20314][SQL] Inconsistent error handling in JSON p...

2017-04-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17693
  
@liancheng Good suggestion! Just like what we did for `from_json/to_json`. 





[GitHub] spark pull request #17495: [SPARK-20172][Core] Add file permission check whe...

2017-04-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17495





[GitHub] spark issue #17495: [SPARK-20172][Core] Add file permission check when listi...

2017-04-20 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/17495
  
LGTM, merging to master / 2.2.




