[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18559 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79312/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18560: Revise rand comparison in BatchEvalPythonExecSuit...
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/18560 Revise rand comparison in BatchEvalPythonExecSuite ## What changes were proposed in this pull request? Revise the rand comparison in BatchEvalPythonExecSuite. In BatchEvalPythonExecSuite, two cases use the predicate "rand() > 3". Rand() generates a random value in [0, 1), so comparing it with 3 is weird; use 0.3 instead. ## How was this patch tested? Unit test. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gengliangwang/spark revise_BatchEvalPythonExecSuite Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18560.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18560 commit 67841437eb2cffec5686fafd07cb1233a1e5072a Author: Wang Gengliang Date: 2017-07-07T05:50:24Z revise BatchEvalPythonExecSuite
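The reasoning behind the change can be sketched outside Spark. This is a minimal illustration in plain Python with hypothetical sample counts, not the suite's actual code; it relies only on the fact that SQL's rand(), like Python's random.random(), returns values in the half-open interval [0, 1):

```python
import random

# Draw samples from the same range as SQL rand(): [0, 1).
samples = [random.random() for _ in range(10_000)]

# A threshold of 3 can never match, so "rand() > 3" is a vacuous predicate:
# the filter under test drops every row regardless of the data.
print(sum(s > 3 for s in samples))  # always 0

# A threshold of 0.3 actually exercises both branches of the filter.
print(sum(s > 0.3 for s in samples) > 0)
```

With a vacuous predicate the test would pass even if the filter logic were broken, which is why 0.3 is the more meaningful threshold.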
[GitHub] spark pull request #17758: [SPARK-20460][SQL] Make it more consistent to han...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17758#discussion_r126074450 --- Diff: sql/core/src/test/resources/sql-tests/inputs/create.sql --- @@ -0,0 +1,23 @@ +-- Catch case-sensitive name duplication +SET spark.sql.caseSensitive=true; + +CREATE TABLE t(c0 STRING, c1 INT, c1 DOUBLE, c0 INT) USING parquet; --- End diff -- We should keep them in one place. For now I think we still need to put them in `DDLSuite` because we need to run it with and without hive support. Can we pick some typical test cases here and move them to `DDLSuite`?
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18559 Merged build finished. Test FAILed.
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18559 **[Test build #79312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79312/testReport)** for PR 18559 at commit [`7279262`](https://github.com/apache/spark/commit/72792627d76e0e3452f84af1322a35e3f0d82580). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17758: [SPARK-20460][SQL] Make it more consistent to han...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/17758#discussion_r126074238 --- Diff: sql/core/src/test/resources/sql-tests/inputs/create.sql --- @@ -0,0 +1,23 @@ +-- Catch case-sensitive name duplication +SET spark.sql.caseSensitive=true; + +CREATE TABLE t(c0 STRING, c1 INT, c1 DOUBLE, c0 INT) USING parquet; --- End diff -- In `DDLSuite`, we already have simple tests for duplicate columns. Would it be better to move these tests there?
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18559 Merged build finished. Test FAILed.
[GitHub] spark pull request #17758: [SPARK-20460][SQL] Make it more consistent to han...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17758#discussion_r126073404 --- Diff: sql/core/src/test/resources/sql-tests/inputs/create.sql --- @@ -0,0 +1,23 @@ +-- Catch case-sensitive name duplication +SET spark.sql.caseSensitive=true; + +CREATE TABLE t(c0 STRING, c1 INT, c1 DOUBLE, c0 INT) USING parquet; --- End diff -- We didn't have test cases for create table before?
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18559 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79311/ Test FAILed.
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18559 **[Test build #79311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79311/testReport)** for PR 18559 at commit [`4d99c11`](https://github.com/apache/spark/commit/4d99c11802efa2d6ee5c36de5941226bf12e1a55). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18444 Thanks for asking @ueshin. Sounds OK to me too. I currently have some pending review comments for minor nits. Let me finish mine today.
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126072754 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2638,4 +2638,17 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-21335: support un-aliased subquery") { +withTempView("v") { + Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v") + checkAnswer(sql("SELECT i from (SELECT i FROM v)"), Row(1)) + + val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)")) + assert(e.message == +"cannot resolve '`v.i`' given input columns: [_auto_generated_subquery_name.i]") --- End diff -- yea that seems wrong ...
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126072760 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2638,4 +2638,17 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-21335: support un-aliased subquery") { +withTempView("v") { + Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v") + checkAnswer(sql("SELECT i from (SELECT i FROM v)"), Row(1)) + + val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)")) + assert(e.message == +"cannot resolve '`v.i`' given input columns: [_auto_generated_subquery_name.i]") --- End diff -- It's been supported since 2.0.x, so there are definitely existing user queries and apps. I agree with this PR and want to understand the scope of changes. It looks good to me.

```scala
scala> sc.version
res0: String = 2.0.2

scala> Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v")

scala> sql("SELECT v.i from (SELECT i FROM v)").show
+---+
| i|
+---+
| 1|
+---+
```
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126072118 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2638,4 +2638,17 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-21335: support un-aliased subquery") { +withTempView("v") { + Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v") + checkAnswer(sql("SELECT i from (SELECT i FROM v)"), Row(1)) + + val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)")) + assert(e.message == +"cannot resolve '`v.i`' given input columns: [_auto_generated_subquery_name.i]") --- End diff -- we may have, but this is definitely wrong IMO. BTW at least we don't have this usage in our tests, so I think it's probably fine. also cc @rxin
[GitHub] spark pull request #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector shoul...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/18557
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18557 Yep. It's officially internal. What I meant by `performance issue` is that 3rd parties can still use it, and there might be a performance gap between `float` and `double`. I'll close this PR. Thank you again.
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126071406 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2638,4 +2638,17 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-21335: support un-aliased subquery") { +withTempView("v") { + Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v") + checkAnswer(sql("SELECT i from (SELECT i FROM v)"), Row(1)) + + val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)")) + assert(e.message == +"cannot resolve '`v.i`' given input columns: [_auto_generated_subquery_name.i]") --- End diff -- Do we have such usage in existing queries?
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 LGTM, pending Jenkins. @HyukjinKwon, @holdenk, Do you have any other concerns?
[GitHub] spark issue #609: SPARK-1691: Support quoted arguments inside of spark-submi...
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/609 @ganeshm25 It seems to work in newer Spark versions; I haven't tried Spark 1.4.2. However, it's still very tricky to get right, and I would prefer a simpler solution.
[GitHub] spark pull request #18462: [SPARK-21333][Docs] Removed invalid joinTypes fro...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18462#discussion_r126071195 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -1007,6 +1007,10 @@ class Dataset[T] private[sql]( JoinType(joinType), Some(condition.expr))).analyzed.asInstanceOf[Join] +if (joined.joinType == LeftSemi || joined.joinType == LeftAnti) { + throw new AnalysisException("Invalid join type in joinWith: " + joined.joinType) --- End diff -- Nit: `joined.joinType.sql`?
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126071223 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2638,4 +2638,17 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + test("SPARK-21335: support un-aliased subquery") { +withTempView("v") { + Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v") + checkAnswer(sql("SELECT i from (SELECT i FROM v)"), Row(1)) + + val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)")) + assert(e.message == +"cannot resolve '`v.i`' given input columns: [_auto_generated_subquery_name.i]") --- End diff -- Then, the scope of the breaking change is reduced to this kind of query?

```scala
scala> sc.version
res0: String = 2.1.1

scala> Seq(1 -> "a").toDF("i", "j").createOrReplaceTempView("v")

scala> sql("SELECT v.i from (SELECT i FROM v)").show
+---+
| i|
+---+
| 1|
+---+
```
[GitHub] spark pull request #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics w...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18558
[GitHub] spark issue #609: SPARK-1691: Support quoted arguments inside of spark-submi...
Github user ganeshm25 commented on the issue: https://github.com/apache/spark/pull/609 @koertkuipers I am trying to run multiple driver-java-options with Spark 1.4.2 inside a bash script. Is there a solution you found for this?
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18444 **[Test build #79314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79314/testReport)** for PR 18444 at commit [`f2774c6`](https://github.com/apache/spark/commit/f2774c639fdf653ec7d48127b529124dbbb9b60b).
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18558 thanks, merging to master!
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 We didn't change `spark.shuffle.io.numConnectionsPerPeer`. Our biggest cluster has 6000 `NodeManager`s, and there are 50 executors running on the same host at the same time.
[GitHub] spark pull request #18425: [SPARK-21217][SQL] Support ColumnVector.Array.to<...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18425
[GitHub] spark issue #18425: [SPARK-21217][SQL] Support ColumnVector.Array.toAr...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18425 thanks, merging to master!
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @cloud-fan To be honest, it's a little bit tricky to reject "open blocks" by closing the connection; the subsequent reconnection will surely have extra cost. In the current change we rely on the retry mechanism of `RetryingBlockFetcher`. `spark.shuffle.io.maxRetries` and `spark.shuffle.io.retryWait` should also be tuned; with this change their meanings may become different, and users should know this. This is the sacrifice for compatibility. It occurs to me: could we add back `OpenBlocksFailed` behind a flag (default false)? If users want to turn it on, we can tell them they should upgrade the client.
[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Jenkins, retest this please.
[GitHub] spark pull request #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor shou...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18553
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18557 `ColumnVector` is totally internal in Spark 2.2, so there won't be a 3rd-party Spark library issue.
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18558 Merged build finished. Test PASSed.
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18558 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79309/ Test PASSed.
[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18553 Thanks for reviewing! merging to master.
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18558 **[Test build #79309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79309/testReport)** for PR 18558 at commit [`dedafd9`](https://github.com/apache/spark/commit/dedafd95835ddd65118825d74c4592f35b73b3d8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/18388
> there are 200K+ connections and 3.5M blocks(FileSegmentManagedBuffer) being fetched.

Did you use a large `spark.shuffle.io.numConnectionsPerPeer`? If not, the number of connections seems too large, since each ShuffleClient should have only one connection to one shuffle service. How large is your cluster, and how many applications are running at the same time?
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18557 BTW, thank you for the swift reviews and feedback on my PR. :)
[GitHub] spark issue #18388: [SPARK-21175] Reject OpenBlocks when memory shortage on ...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Analyzing the heap dump, there are 200K+ connections and 3.5M blocks (`FileSegmentManagedBuffer`) being fetched. Yes, flow control is a good idea, but I still think it makes sense to control the concurrency: by rejecting some "open blocks" requests, we can keep sufficient bandwidth for the existing connections and finish the reduce tasks as soon as possible. Simple flow control (slowing down connections under pressure) can help avoid OOM, but it seems more reduce tasks will run longer.
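The concurrency control jinxing64 describes — reject new "open blocks" requests under memory pressure rather than merely throttling existing connections — can be sketched as a simple admission check. The object name and threshold below are illustrative only, not Spark's actual shuffle-service code:

```scala
// Illustrative sketch (hypothetical names, not Spark's ExternalShuffleService):
// track bytes owned by in-flight "open blocks" requests and reject new
// requests once a cap is exceeded, so existing fetches keep their bandwidth
// and finish sooner.
object OpenBlocksAdmission {
  private val maxInFlightBytes = 512L * 1024 * 1024 // hypothetical cap
  private var inFlightBytes = 0L

  // Returns true if admitted; a real server would answer `false` with an
  // error response, prompting the client to back off and retry.
  def tryAdmit(requestBytes: Long): Boolean = synchronized {
    if (inFlightBytes + requestBytes > maxInFlightBytes) {
      false
    } else {
      inFlightBytes += requestBytes
      true
    }
  }

  // Called when a fetch completes and its buffers are released.
  def release(requestBytes: Long): Unit = synchronized {
    inFlightBytes -= requestBytes
  }
}
```

The trade-off against pure flow control is visible here: a rejected request frees the server entirely instead of holding a slowed-down connection open.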
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18557 I know that 'there is no usage of this API internally in Spark 2.2', but that is only for 2.2.0. My reason was that no 3rd-party Spark library can use `ColumnVector` for the `float` type in Spark 2.2.1+. Anyway, @cloud-fan changed the bug type. If that means backporting is not allowed for this patch, I have no objection to the community decision. So, @kiszk and @cloud-fan, given that, may I close this PR?
[GitHub] spark issue #18307: [SPARK-21100][SQL] Add summary method as alternative to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18307 **[Test build #79313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79313/testReport)** for PR 18307 at commit [`3b548cc`](https://github.com/apache/spark/commit/3b548cc3d5ad8928785fe644db9ea788dfb8fad2).
[GitHub] spark issue #18307: [SPARK-21100][SQL] Add summary method as alternative to ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18307 retest this please
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18557 I've changed the ticket type from `bug` to `improvement`; adding a new API is not fixing a bug.
[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r126068180

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -589,18 +590,43 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
         col.getType.startsWith(serdeConstants.CHAR_TYPE_NAME))
         .map(col => col.getName).toSet
-    filters.collect {
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: IntegralType)) =>
-        s"${a.name} ${op.symbol} $v"
-      case op @ BinaryComparison(Literal(v, _: IntegralType), a: Attribute) =>
-        s"$v ${op.symbol} ${a.name}"
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType))
+    object ExtractableLiteral {
+      def unapply(expr: Expression): Option[String] = expr match {
+        case Literal(value, _: IntegralType) => Some(value.toString)
+        case Literal(value, _: StringType) => Some(quoteStringLiteral(value.toString))
+        case _ => None
+      }
+    }
+
+    object ExtractableLiterals {
+      def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
+        exprs.map(ExtractableLiteral.unapply).foldLeft(Option(Seq.empty[String])) {
+          case (Some(accum), Some(value)) => Some(accum :+ value)
+          case _ => None
+        }
+      }
+    }
+
+    lazy val convert: PartialFunction[Expression, String] = {
+      case In(a: Attribute, ExtractableLiterals(values)) if !varcharKeys.contains(a.name) =>
+        val or = values
+          .map(value => s"${a.name} = $value")
+          .reduce(_ + " or " + _)
+        "(" + or + ")"
+      case op @ BinaryComparison(a: Attribute, ExtractableLiteral(value))
         if !varcharKeys.contains(a.name) =>
-        s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}"""
-      case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute)
+        s"${a.name} ${op.symbol} $value"
+      case op @ BinaryComparison(ExtractableLiteral(value), a: Attribute)
         if !varcharKeys.contains(a.name) =>
-        s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}"""
-    }.mkString(" and ")
+        s"$value ${op.symbol} ${a.name}"
+      case op @ And(expr1, expr2) =>
+        s"(${convert(expr1)} and ${convert(expr2)})"
+      case op @ Or(expr1, expr2) =>
+        s"(${convert(expr1)} or ${convert(expr2)})"
+    }
+
+    filters.flatMap(f => Try(convert(f)).toOption).mkString(" and ")
--- End diff --

I do think we should follow `InMemoryTableScanExec.buildFilters`. For example, if the left side of `And` is not supported but the right side is, we can still push down the right side. But here, we simply catch the exception and push nothing.
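The per-conjunct behavior cloud-fan asks for can be sketched on a toy expression tree (the classes below are illustrative, not Spark's `Expression` hierarchy): instead of dropping a whole `And` when one side fails to convert, convert each side independently and keep whichever side succeeds.

```scala
// Toy model of the pushdown behavior from InMemoryTableScanExec.buildFilters:
// an And whose sides convert independently keeps whichever side succeeds.
sealed trait Pred
case class Convertible(sql: String) extends Pred // predicate Hive accepts
case object Unconvertible extends Pred           // predicate Hive cannot take
case class Conj(left: Pred, right: Pred) extends Pred

def convert(p: Pred): Option[String] = p match {
  case Convertible(s) => Some(s)
  case Unconvertible  => None
  case Conj(l, r) =>
    (convert(l), convert(r)) match {
      case (Some(a), Some(b)) => Some(s"($a and $b)")
      case (Some(a), None)    => Some(a) // push down only the left side
      case (None, Some(b))    => Some(b) // push down only the right side
      case (None, None)       => None
    }
}
```

Under the `Try(...).toOption` approach in the diff, `Conj(Unconvertible, Convertible("p = 1"))` would yield nothing at all; the sketch above still pushes `p = 1`.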
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18557 We have not seen any failure in test suites. And, [there is no usage of this API](https://github.com/apache/spark/pull/17836#discussion_r114488839) in Spark 2.2. Does this missing API cause any failure in a test or application program? If so, it would be good to put a sample program in this PR.
[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r126067892

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -589,18 +590,43 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
         col.getType.startsWith(serdeConstants.CHAR_TYPE_NAME))
         .map(col => col.getName).toSet
-    filters.collect {
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: IntegralType)) =>
-        s"${a.name} ${op.symbol} $v"
-      case op @ BinaryComparison(Literal(v, _: IntegralType), a: Attribute) =>
-        s"$v ${op.symbol} ${a.name}"
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType))
+    object ExtractableLiteral {
+      def unapply(expr: Expression): Option[String] = expr match {
+        case Literal(value, _: IntegralType) => Some(value.toString)
+        case Literal(value, _: StringType) => Some(quoteStringLiteral(value.toString))
+        case _ => None
+      }
+    }
+
+    object ExtractableLiterals {
+      def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
+        exprs.map(ExtractableLiteral.unapply).foldLeft(Option(Seq.empty[String])) {
+          case (Some(accum), Some(value)) => Some(accum :+ value)
+          case _ => None
+        }
+      }
+    }
+
+    lazy val convert: PartialFunction[Expression, String] = {
+      case In(a: Attribute, ExtractableLiterals(values)) if !varcharKeys.contains(a.name) =>
--- End diff --

cc @gatorsmile, any concerns about not doing it?
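The `ExtractableLiterals` extractor under discussion is essentially an Option-sequencing fold: a sequence of `Option` values collapses to `Some(values)` only when every element converted. A standalone sketch of that idea (hypothetical helper name, not Spark code):

```scala
// Mirrors the foldLeft inside ExtractableLiterals.unapply: one None in the
// input makes the whole result None; otherwise the values are accumulated.
def sequence[A](xs: Seq[Option[A]]): Option[Seq[A]] =
  xs.foldLeft(Option(Seq.empty[A])) {
    case (Some(accum), Some(value)) => Some(accum :+ value)
    case _                          => None
  }
```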
[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16697 LGTM, pending tests
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18556 Thank you @cloud-fan!
[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r126067471

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -589,18 +590,40 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
         col.getType.startsWith(serdeConstants.CHAR_TYPE_NAME))
         .map(col => col.getName).toSet
-    filters.collect {
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: IntegralType)) =>
-        s"${a.name} ${op.symbol} $v"
-      case op @ BinaryComparison(Literal(v, _: IntegralType), a: Attribute) =>
-        s"$v ${op.symbol} ${a.name}"
-      case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType))
-        if !varcharKeys.contains(a.name) =>
-        s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}"""
-      case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute)
-        if !varcharKeys.contains(a.name) =>
-        s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}"""
-    }.mkString(" and ")
+    def isExtractable(expr: Expression): Boolean =
+      expr match {
+        case Literal(_, _: IntegralType) | Literal(_, _: StringType) => true
+        case _ => false
+      }
+
+    def extractValue(expr: Expression): String =
+      expr match {
+        case Literal(v, _: IntegralType) => v.toString
+        case Literal(v, _: StringType) => quoteStringLiteral(v.toString)
+      }
+
+    lazy val convert: PartialFunction[Expression, String] = {
+      case In(a: Attribute, exprs)
+          if !varcharKeys.contains(a.name) && exprs.forall(isExtractable) =>
+        val or = exprs
+          .map(expr => s"${a.name} = ${extractValue(expr)}")
+          .reduce(_ + " or " + _)
+        "(" + or + ")"
+      case op @ BinaryComparison(a: Attribute, expr2)
--- End diff --

how about `ExtractLiteralToString`?
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18559 **[Test build #79312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79312/testReport)** for PR 18559 at commit [`7279262`](https://github.com/apache/spark/commit/72792627d76e0e3452f84af1322a35e3f0d82580).
[GitHub] spark pull request #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18556
[GitHub] spark pull request #18288: [SPARK-21066][ML] LibSVM load just one input file
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18288
[GitHub] spark issue #18554: [SPARK-21306][ML] OneVsRest should cache weightCol if ne...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18554 I'm not familiar with R, but grepping for "OneVsRest" returns nothing. Hence it seems that nothing needs to be done on the R side.
[GitHub] spark issue #18523: [SPARK-21285][ML] VectorAssembler reports the column nam...
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18523 @SparkQA test again, please.
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18557 Hi, @kiszk. I think this is a bug fix of `ColumnVector` as described in [SPARK-20566](https://issues.apache.org/jira/browse/SPARK-20566).
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18556 LGTM, merging to master!
[GitHub] spark pull request #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18556#discussion_r126066952

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala ---
@@ -102,6 +104,25 @@ object MLUtils extends Logging {
       .map(parseLibSVMRecord)
   }

+  private[spark] def parseLibSVMFile(
+      sparkSession: SparkSession, paths: Seq[String]): RDD[(Double, Array[Int], Array[Double])] = {
+    val lines = sparkSession.baseRelationToDataFrame(
+      DataSource.apply(
+        sparkSession,
+        paths = paths,
+        className = classOf[TextFileFormat].getName
+      ).resolveRelation(checkFilesExist = false))
+      .select("value")
--- End diff --

Is this needed? I think the text format is known to have only one column.
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18559 LGTM
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126066595

--- Diff: sql/core/src/test/resources/sql-tests/results/string-functions.sql.out ---
@@ -30,20 +30,20 @@ abc
 -- !query 3
 EXPLAIN EXTENDED SELECT (col1 || col2 || col3 || col4) col
-FROM (SELECT id col1, id col2, id col3, id col4 FROM range(10)) t
+FROM (SELECT id col1, id col2, id col3, id col4 FROM range(10))
 -- !query 3 schema
 struct
 -- !query 3 output
 == Parsed Logical Plan ==
 'Project [concat(concat(concat('col1, 'col2), 'col3), 'col4) AS col#x]
-+- 'SubqueryAlias t
++- 'SubqueryAlias _auto_generated_subquery_name
--- End diff --

I think it's ok, as the name makes it quite clear that it's auto-generated. And I think it's hard to hide it.
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18558 LGTM pending jenkins, also cc @rxin
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18559 **[Test build #79311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79311/testReport)** for PR 18559 at commit [`4d99c11`](https://github.com/apache/spark/commit/4d99c11802efa2d6ee5c36de5941226bf12e1a55).
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126066489

--- Diff: sql/core/src/test/resources/sql-tests/results/string-functions.sql.out ---
@@ -30,20 +30,20 @@ abc
 -- !query 3
 EXPLAIN EXTENDED SELECT (col1 || col2 || col3 || col4) col
-FROM (SELECT id col1, id col2, id col3, id col4 FROM range(10)) t
+FROM (SELECT id col1, id col2, id col3, id col4 FROM range(10))
 -- !query 3 schema
 struct
 -- !query 3 output
 == Parsed Logical Plan ==
 'Project [concat(concat(concat('col1, 'col2), 'col3), 'col4) AS col#x]
-+- 'SubqueryAlias t
++- 'SubqueryAlias _auto_generated_subquery_name
--- End diff --

Do we want to show the internal subquery name?
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18559#discussion_r126066311

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
@@ -751,15 +751,17 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
    * hooks.
    */
   override def visitAliasedQuery(ctx: AliasedQueryContext): LogicalPlan = withOrigin(ctx) {
-    // The unaliased subqueries in the FROM clause are disallowed. Instead of rejecting it in
-    // parser rules, we handle it here in order to provide better error message.
-    if (ctx.strictIdentifier == null) {
-      throw new ParseException("The unaliased subqueries in the FROM clause are not supported.",
-        ctx)
+    val alias = if (ctx.strictIdentifier == null) {
+      // For un-aliased subqueries, ues a default alias name that is not likely to conflict with
--- End diff --

nit: typo `ues`.
[GitHub] spark pull request #18559: [SPARK-21335][SQL] support un-aliased subquery
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/18559 [SPARK-21335][SQL] support un-aliased subquery

## What changes were proposed in this pull request?

Un-aliased subqueries have been supported by Spark SQL for a long time. Their semantics were not well defined and had confusing behaviors, and since this is not standard SQL syntax, we disallowed it in https://issues.apache.org/jira/browse/SPARK-20690 . However, that was a breaking change, and we do have existing queries that use un-aliased subqueries. We should add the support back and fix its semantics. This PR fixes un-aliased subqueries by assigning them a default alias name.

## How was this patch tested?

new regression test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark sub-query

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18559.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18559

commit 4d99c11802efa2d6ee5c36de5941226bf12e1a55
Author: Wenchen Fan
Date: 2017-07-07T04:03:34Z

    support un-aliased subquery
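The core of the fix can be sketched as a tiny helper (illustrative only, not the actual `AstBuilder` change): when the FROM-clause subquery carries no user alias, fall back to the internal default name that appears in the new test output, instead of throwing a `ParseException`.

```scala
// Hypothetical helper mirroring the PR's behavior: a user-supplied alias
// wins; otherwise the internal default alias is assigned.
def subqueryAlias(userAlias: Option[String]): String =
  userAlias.getOrElse("_auto_generated_subquery_name")
```

So `FROM (SELECT ...) t` keeps the alias `t`, while the previously rejected `FROM (SELECT ...)` now parses with the generated alias.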
[GitHub] spark issue #18559: [SPARK-21335][SQL] support un-aliased subquery
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18559 cc @rxin @viirya
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18557 @dongjoon-hyun Is there any reason to backport this to previous versions? I ask because we had [a discussion](https://github.com/apache/spark/pull/17836#pullrequestreview-35957231) about this. Obviously, it makes sense to support it in the latest version.
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18557 Hi, @cloud-fan. This is the backport of #17836.
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18557 Merged build finished. Test PASSed.
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18557 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79306/
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18557 **[Test build #79306 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79306/testReport)** for PR 18557 at commit [`39839bf`](https://github.com/apache/spark/commit/39839bf5b70aab603e538d424cda00ec7cde1402). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16697 **[Test build #79310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79310/testReport)** for PR 16697 at commit [`554cd39`](https://github.com/apache/spark/commit/554cd391b3ddb5fb3f7c52950610e832ad40047b).
[GitHub] spark issue #18465: [SPARK-21093][R] Terminate R's worker processes in the p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18465 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79308/
[GitHub] spark issue #18465: [SPARK-21093][R] Terminate R's worker processes in the p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18465 Merged build finished. Test PASSed.
[GitHub] spark issue #18465: [SPARK-21093][R] Terminate R's worker processes in the p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18465 **[Test build #79308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79308/testReport)** for PR 18465 at commit [`c08ccd5`](https://github.com/apache/spark/commit/c08ccd59f438fce1f841aa70f760ffb9dc24cf50). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/16697 retest this please
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18558 **[Test build #79309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79309/testReport)** for PR 18558 at commit [`dedafd9`](https://github.com/apache/spark/commit/dedafd95835ddd65118825d74c4592f35b73b3d8).
[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18553 Merged build finished. Test PASSed.
[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18553 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79304/
[GitHub] spark issue #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with dat...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18558 cc @cloud-fan This removes the writeTime metrics.
[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18553 **[Test build #79304 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79304/testReport)** for PR 18553 at commit [`15b7497`](https://github.com/apache/spark/commit/15b7497b76c031488b8ec414f1363f3393f0a3e4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18558: [SPARK-20703][SQL][FOLLOW-UP] Associate metrics w...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/18558 [SPARK-20703][SQL][FOLLOW-UP] Associate metrics with data writes onto DataFrameWriter operations ## What changes were proposed in this pull request? Remove the write-time metric, since there seems to be no way to measure it without per-row tracking. ## How was this patch tested? Existing tests. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 SPARK-20703-followup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18558.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18558 commit dedafd95835ddd65118825d74c4592f35b73b3d8 Author: Liang-Chi Hsieh Date: 2017-07-07T02:35:48Z Remove time metrics since it seems no way to measure it in non per-row tracking.
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18556 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79307/
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18556 Merged build finished. Test PASSed.
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18556 **[Test build #79307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79307/testReport)** for PR 18556 at commit [`b345cb1`](https://github.com/apache/spark/commit/b345cb14758ae6f16d699b2f38b17eefbf316468). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18465: [SPARK-21093][R] Terminate R's worker processes in the p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18465 **[Test build #79308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79308/testReport)** for PR 18465 at commit [`c08ccd5`](https://github.com/apache/spark/commit/c08ccd59f438fce1f841aa70f760ffb9dc24cf50).
[GitHub] spark issue #18465: [SPARK-21093][R] Terminate R's worker processes in the p...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18465 (simply rebased)
[GitHub] spark pull request #18482: [SPARK-21262] Stop sending 'stream request' when ...
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18482
[GitHub] spark pull request #18159: [SPARK-20703][SQL] Associate metrics with data wr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18159#discussion_r126056717

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala ---
@@ -314,21 +339,40 @@ object FileFormatWriter extends Logging {
           recordsInFile = 0
           releaseResources()
+          numOutputRows += recordsInFile
           newOutputWriter(fileCounter)
         }

         val internalRow = iter.next()
+        val startTime = System.nanoTime()
         currentWriter.write(internalRow)
+        timeOnCurrentFile += (System.nanoTime() - startTime)
--- End diff --

Yeah, I also considered this option.
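The pattern under discussion, wrapping each row's write in two `System.nanoTime()` reads and accumulating the delta into a per-file counter, can be sketched as follows. This is an illustrative standalone sketch, not the actual FileFormatWriter code; the `RowWriter` interface and all names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of per-row write timing as in the diff above.
public class PerRowTiming {
    interface RowWriter {
        void write(String row);
    }

    static long timeOnCurrentFile = 0L;

    static void writeAll(List<String> rows, RowWriter writer) {
        for (String row : rows) {
            long startTime = System.nanoTime();
            writer.write(row);
            // Two clock reads per record: this is the per-row overhead
            // that ultimately led to dropping the write-time metric.
            timeOnCurrentFile += System.nanoTime() - startTime;
        }
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        writeAll(List.of("a", "b", "c"), sink::append);
        if (timeOnCurrentFile < 0 || !sink.toString().equals("abc")) {
            throw new AssertionError("unexpected result");
        }
        System.out.println("ok");
    }
}
```

The cost is two monotonic-clock reads per record, which is why the follow-up PR (#18558) removes the metric rather than keeping per-row tracking.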
[GitHub] spark issue #18482: [SPARK-21262] Stop sending 'stream request' when shuffle...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18482 Sure, I will update the document soon.
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18556 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79305/
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18556 Merged build finished. Test PASSed.
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18556 **[Test build #79305 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79305/testReport)** for PR 18556 at commit [`de33d6d`](https://github.com/apache/spark/commit/de33d6d8809e87edbe42c2dfab4b914da29c7143). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18556#discussion_r126053740

--- Diff: mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,14 @@ private[libsvm] class LibSVMFileFormat extends TextBasedFileFormat with DataSour
       files: Seq[FileStatus]): Option[StructType] = {
     val libSVMOptions = new LibSVMOptions(options)
     val numFeatures: Int = libSVMOptions.numFeatures.getOrElse {
-      // Infers number of features if the user doesn't specify (a valid) one.
-      val dataFiles = files.filterNot(_.getPath.getName startsWith "_")
-      val path = if (dataFiles.length == 1) {
-        dataFiles.head.getPath.toUri.toString
-      } else if (dataFiles.isEmpty) {
-        throw new IOException("No input path specified for libsvm data")
-      } else {
-        throw new IOException("Multiple input paths are not supported for libsvm data.")
-      }
-
-      val sc = sparkSession.sparkContext
-      val parsed = MLUtils.parseLibSVMFile(sc, path, sc.defaultParallelism)
+      require(files.nonEmpty, "No input path specified for libsvm data")
--- End diff --

Please refer https://github.com/apache/spark/pull/18556#discussion_r126045375.
[GitHub] spark issue #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat in Lib...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18556 **[Test build #79307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79307/testReport)** for PR 18556 at commit [`b345cb1`](https://github.com/apache/spark/commit/b345cb14758ae6f16d699b2f38b17eefbf316468).
[GitHub] spark pull request #18509: [SPARK-21329][SS] Make EventTimeWatermarkExec exp...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18509
[GitHub] spark issue #18509: [SPARK-21329][SS] Make EventTimeWatermarkExec explicitly...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/18509 Thanks! Merging to master.
[GitHub] spark pull request #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18556#discussion_r126051332

--- Diff: mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends TextBasedFileFormat with DataSour
       files: Seq[FileStatus]): Option[StructType] = {
     val libSVMOptions = new LibSVMOptions(options)
     val numFeatures: Int = libSVMOptions.numFeatures.getOrElse {
-      // Infers number of features if the user doesn't specify (a valid) one.
-      val dataFiles = files.filterNot(_.getPath.getName startsWith "_")
-      val path = if (dataFiles.length == 1) {
-        dataFiles.head.getPath.toUri.toString
-      } else if (dataFiles.isEmpty) {
+      if (files.isEmpty) {
         throw new IOException("No input path specified for libsvm data")
--- End diff --

Actually, that should be right after this function call so probably fine :). Yea, but at least using `require` should be shorter.
[GitHub] spark issue #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should suppo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18557 **[Test build #79306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79306/testReport)** for PR 18557 at commit [`39839bf`](https://github.com/apache/spark/commit/39839bf5b70aab603e538d424cda00ec7cde1402).
[GitHub] spark pull request #18556: [SPARK-21326][SPARK-21066][ML] Use TextFileFormat...
Github user facaiy commented on a diff in the pull request: https://github.com/apache/spark/pull/18556#discussion_r126050849

--- Diff: mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends TextBasedFileFormat with DataSour
       files: Seq[FileStatus]): Option[StructType] = {
     val libSVMOptions = new LibSVMOptions(options)
     val numFeatures: Int = libSVMOptions.numFeatures.getOrElse {
-      // Infers number of features if the user doesn't specify (a valid) one.
-      val dataFiles = files.filterNot(_.getPath.getName startsWith "_")
-      val path = if (dataFiles.length == 1) {
-        dataFiles.head.getPath.toUri.toString
-      } else if (dataFiles.isEmpty) {
+      if (files.isEmpty) {
        throw new IOException("No input path specified for libsvm data")
--- End diff --

In my opinion, it is safe / necessary to check whether the parameter is valid in advance. Perhaps `IllegalArgumentException` is more suitable.
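For context on this exchange: Scala's `require(cond, message)` throws `IllegalArgumentException` with the message prefixed by "requirement failed: " when the predicate is false, so the `require` suggestion and the `IllegalArgumentException` suggestion amount to the same behavior. A minimal Java analogue of the two styles being compared (all names here are hypothetical, not the LibSVMRelation code):

```java
import java.util.List;

// Sketch of the validation styles discussed above: a require-like helper
// (what Scala's require produces) versus a checked IOException.
public class InputValidation {
    // Analogue of Scala's Predef.require(cond, message).
    static void require(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException("requirement failed: " + message);
        }
    }

    // Hypothetical stand-in for the schema-inference entry point.
    static int inferNumFeatures(List<String> files) {
        require(!files.isEmpty(), "No input path specified for libsvm data");
        return files.size(); // placeholder for the real feature-count inference
    }

    public static void main(String[] args) {
        String caught = "";
        try {
            inferNumFeatures(List.of());
        } catch (IllegalArgumentException e) {
            caught = e.getMessage();
        }
        System.out.println(caught);
    }
}
```

The practical difference is that `IllegalArgumentException` is unchecked and signals a caller error, while `IOException` is checked and conventionally signals an environmental failure.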
[GitHub] spark pull request #18557: [SPARK-20566][SQL][BRANCH-2.2] ColumnVector shoul...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/18557 [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should support `appendFloats` for array ## What changes were proposed in this pull request? This PR aims to add a missing `appendFloats` API for array into **ColumnVector** class. For double type, there is `appendDoubles` for array [here](https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java#L818-L824). ## How was this patch tested? Pass the Jenkins with a newly added test case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-20566-BRANCH-2.2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18557.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18557 commit 39839bf5b70aab603e538d424cda00ec7cde1402 Author: Dongjoon Hyun Date: 2017-05-04T13:04:15Z [SPARK-20566][SQL][BRANCH-2.2] ColumnVector should support `appendFloats` for array This PR aims to add a missing `appendFloats` API for array into **ColumnVector** class. For double type, there is `appendDoubles` for array [here](https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java#L818-L824). Pass the Jenkins with a newly added test case. Author: Dongjoon Hyun Closes #17836 from dongjoon-hyun/SPARK-20566.
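The `appendDoubles` array API linked in the PR description takes a count, a source array, and a source offset, copies that many elements into the vector, and returns the starting row id. A simplified on-heap sketch of what the float variant looks like (illustrative only; the real ColumnVector also handles nulls, capacity policy, and off-heap storage):

```java
import java.util.Arrays;

// Simplified sketch of an appendFloats(count, src, srcIndex) API in the
// style of ColumnVector.appendDoubles; not the actual Spark class.
public class FloatVector {
    private float[] data = new float[4];
    private int elementsAppended = 0;

    // Grow the backing array when needed (real ColumnVector has a
    // more elaborate reservation policy).
    private void reserve(int capacity) {
        if (capacity > data.length) {
            data = Arrays.copyOf(data, Math.max(capacity, data.length * 2));
        }
    }

    // Bulk-append count floats from src starting at srcIndex;
    // returns the row id where the copied range begins.
    public int appendFloats(int count, float[] src, int srcIndex) {
        reserve(elementsAppended + count);
        System.arraycopy(src, srcIndex, data, elementsAppended, count);
        int result = elementsAppended;
        elementsAppended += count;
        return result;
    }

    public float getFloat(int rowId) {
        return data[rowId];
    }

    public static void main(String[] args) {
        FloatVector v = new FloatVector();
        int start = v.appendFloats(3, new float[]{1f, 2f, 3f}, 0);
        System.out.println(start + " " + v.getFloat(2));
    }
}
```

The bulk copy via `System.arraycopy` is the point of the array overloads: appending element by element would pay a bounds-and-capacity check per value.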