date:20190612

[GitHub] [spark] wangyum commented on a change in pull request #24767: [SPARK-27918][SQL] Port boolean.sql

2019-06-12 Thread GitBox

wangyum commented on a change in pull request #24767: [SPARK-27918][SQL] Port 
boolean.sql
URL: https://github.com/apache/spark/pull/24767#discussion_r293226760
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/results/pgSQL/boolean.sql.out
 ##
 @@ -0,0 +1,710 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 81
+
+
+-- !query 0
+SELECT 1 AS one
+-- !query 0 schema
+struct
+-- !query 0 output
+1
+
+
+-- !query 1
+SELECT true AS true
+-- !query 1 schema
+struct
+-- !query 1 output
+true
+
+
+-- !query 2
+SELECT false AS false
+-- !query 2 schema
+struct
+-- !query 2 output
+false
+
+
+-- !query 3
+SELECT cast('t' as boolean) AS true
+-- !query 3 schema
+struct
+-- !query 3 output
+true
+
+
+-- !query 4
+SELECT cast('   f   ' as boolean) AS false
+-- !query 4 schema
+struct
+-- !query 4 output
+NULL
+
+
+-- !query 5
+SELECT cast('true' as boolean) AS true
+-- !query 5 schema
+struct
+-- !query 5 output
+true
+
+
+-- !query 6
+SELECT cast('test' as boolean) AS error
+-- !query 6 schema
+struct
+-- !query 6 output
+NULL
+
+
+-- !query 7
+SELECT cast('false' as boolean) AS false
+-- !query 7 schema
+struct
+-- !query 7 output
+false
+
+
+-- !query 8
+SELECT cast('foo' as boolean) AS error
+-- !query 8 schema
+struct
+-- !query 8 output
+NULL
+
+
+-- !query 9
+SELECT cast('y' as boolean) AS true
+-- !query 9 schema
+struct
+-- !query 9 output
+true
+
+
+-- !query 10
+SELECT cast('yes' as boolean) AS true
+-- !query 10 schema
+struct
+-- !query 10 output
+true
+
+
+-- !query 11
+SELECT cast('yeah' as boolean) AS error
+-- !query 11 schema
+struct
+-- !query 11 output
+NULL
+
+
+-- !query 12
+SELECT cast('n' as boolean) AS false
+-- !query 12 schema
+struct
+-- !query 12 output
+false
+
+
+-- !query 13
+SELECT cast('no' as boolean) AS false
+-- !query 13 schema
+struct
+-- !query 13 output
+false
+
+
+-- !query 14
+SELECT cast('nay' as boolean) AS error
+-- !query 14 schema
+struct
+-- !query 14 output
+NULL
+
+
+-- !query 15
+SELECT cast('on' as boolean) AS true
+-- !query 15 schema
+struct
+-- !query 15 output
+NULL
+
+
+-- !query 16
+SELECT cast('off' as boolean) AS false
+-- !query 16 schema
+struct
+-- !query 16 output
+NULL
+
+
+-- !query 17
+SELECT cast('of' as boolean) AS false
+-- !query 17 schema
+struct
+-- !query 17 output
+NULL
 
 Review comment:
   PostgreSQL fixed the doc: 
https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d
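
   For readers skimming the result file above, the recorded outputs can be summarized with a small model. This is only a sketch of what the listed queries returned, not Spark's cast implementation; `BooleanCastModel` and `castToBoolean` are hypothetical names, and letter case is not exercised by the queries shown, so the model matches lowercase literals exactly.

```scala
// Toy model of the string-to-boolean outputs recorded in boolean.sql.out above.
// NOT Spark's implementation; object and method names are hypothetical.
object BooleanCastModel {
  // Literals whose recorded output is `true`.
  private val trueStrings = Set("t", "true", "y", "yes")
  // Literals whose recorded output is `false`.
  private val falseStrings = Set("f", "false", "n", "no")

  // Some(true) / Some(false) for an accepted literal,
  // None wherever the result file shows NULL.
  def castToBoolean(s: String): Option[Boolean] =
    if (trueStrings.contains(s)) Some(true)
    else if (falseStrings.contains(s)) Some(false)
    else None
}
```

   In this model `castToBoolean("   f   ")`, `castToBoolean("on")`, `castToBoolean("off")` and `castToBoolean("of")` all give `None`, matching queries 4 and 15 to 17, whose column aliases (`AS true` / `AS false`) record a different expectation in the original PostgreSQL test.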


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501571198
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501571202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11701/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501571198
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501571202
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11701/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501569729
 
 
   **[Test build #106458 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106458/testReport)**
 for PR 24842 at commit 
[`0a00a03`](https://github.com/apache/spark/commit/0a00a036148e898bd88bf3dd6e7b7ca0c67fa270).



[GitHub] [spark] peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] 
Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293224600
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
 ##
 @@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }
 
+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }
 
 Review comment:
   Fixed, thanks.



[GitHub] [spark] peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] 
Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293224390
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
 ##
 @@ -633,4 +634,15 @@ class AnalysisSuite extends AnalysisTest with Matchers {
     val res = ViewAnalyzer.execute(view)
     comparePlans(res, expected)
   }
+
+  test("SPARK-28002: CTE with non-existing column alias") {
 
 Review comment:
   Thanks, fixed.



[GitHub] [spark] dongjoon-hyun commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on issue #24068: [SPARK-27105][SQL] Optimize away 
exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#issuecomment-501568337
 
 
   +1 for @cloud-fan's suggestion. @gatorsmile gave the same advice 
earlier.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] 
Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293223242
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
 ##
 @@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }
 
+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }
 
 Review comment:
   Oh, this PR is already rebased onto master. In that case, please move this 
function below the existing `cte` function so that similar functions are 
grouped together. If possible, please consolidate them into one.
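
   One way the requested consolidation could look is sketched here. The plan-node case classes are hypothetical stand-ins so the snippet compiles without Catalyst, and `Table` is invented purely for illustration; only the body of `cte` follows the helper in the diff above. A CTE without column aliases is passed with an empty `Seq`, so a single helper covers both cases.

```scala
object CteHelperSketch {
  // Hypothetical stand-ins for Catalyst plan nodes (NOT the real classes).
  sealed trait LogicalPlan
  case class Table(name: String) extends LogicalPlan
  case class UnresolvedSubqueryColumnAliases(names: Seq[String], child: LogicalPlan)
    extends LogicalPlan
  case class SubqueryAlias(alias: String, child: LogicalPlan) extends LogicalPlan
  case class With(child: LogicalPlan, ctes: Seq[(String, SubqueryAlias)]) extends LogicalPlan

  // Single helper: wrap the CTE plan in column aliases only when some are given.
  def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
    val ctes = namedPlans.map { case (name, (ctePlan, columnAliases)) =>
      val subquery: LogicalPlan =
        if (columnAliases.isEmpty) ctePlan
        else UnresolvedSubqueryColumnAliases(columnAliases, ctePlan)
      name -> SubqueryAlias(name, subquery)
    }
    With(plan, ctes)
  }
}
```

   With these stand-ins, `cte(Table("main"), "t" -> (Table("src"), Seq.empty[String]))` yields a plain `SubqueryAlias("t", Table("src"))`, while passing `Seq("a", "b")` wraps the child in `UnresolvedSubqueryColumnAliases` first.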



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567326
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106456/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH 
clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501565642
 
 
   **[Test build #106456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)**
 for PR 24842 at commit 
[`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567322
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567139
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11700/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support 
WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567134
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567326
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106456/
   Test FAILed.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] 
Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293222471
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
 ##
 @@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }
 
+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }
 
 Review comment:
   This test suite was updated a few minutes ago. Could you rebase once more?
   There will be another `cte` function in this test suite.



[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567309
 
 
   **[Test build #106456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)**
 for PR 24842 at commit 
[`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] 
Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293222471
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala
 ##
 @@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }
 
+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }
 
 Review comment:
   This test suite was updated a few minutes ago. Could you rebase against 
the `master` branch once more?
   There will be another `cte` function in this test suite.



[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567322
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567134
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501567139
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11700/
   Test PASSed.



[GitHub] [spark] cloud-fan commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

cloud-fan commented on issue #24068: [SPARK-27105][SQL] Optimize away 
exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#issuecomment-501566535
 
 
   @IvanVergiliev I'd suggest we revert all the benchmark changes, write a 
simple microbenchmark to test `OrcFilters`, and post the benchmark code and 
results in the PR description.
   
   Currently we do not run benchmarks automatically for Spark, so catching 
perf regressions relies on user reports.



[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases

2019-06-12 Thread GitBox

SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause 
column aliases
URL: https://github.com/apache/spark/pull/24842#issuecomment-501565642
 
 
   **[Test build #106456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)**
 for PR 24842 at commit 
[`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).



[GitHub] [spark] SparkQA commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

SparkQA commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use 
per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501565653
 
 
   **[Test build #106457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106457/testReport)**
 for PR 24735 at commit 
[`053b3ba`](https://github.com/apache/spark/commit/053b3ba1b7a84d6a4b355a865f4741935208d978).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save 
default constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501565304
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106452/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save 
default constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501565300
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] 
LambdaVariable should use per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501565084
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11699/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501565304
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106452/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501530960
 
 
   **[Test build #106452 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106452/testReport)**
 for PR 24792 at commit 
[`46c12d8`](https://github.com/apache/spark/commit/46c12d8896ef1022ca3e3ee6c2b21a376ae7f378).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] 
LambdaVariable should use per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501565079
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501565300
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable 
should use per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501565084
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11699/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable 
should use per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501565079
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint 
with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501564896
 
 
   **[Test build #106452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106452/testReport)**
 for PR 24792 at commit 
[`46c12d8`](https://github.com/apache/spark/commit/46c12d8896ef1022ca3e3ee6c2b21a376ae7f378).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs

2019-06-12 Thread GitBox

cloud-fan commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should 
use per-query unique IDs instead of globally unique IDs
URL: https://github.com/apache/spark/pull/24735#issuecomment-501564869
 
 
   retest this please



[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] 
DataSourceV2: InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501564017
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106455/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24832: [SPARK-27845][SQL][WIP] 
DataSourceV2: InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563694
 
 
   **[Test build #106455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)**
 for PR 24832 at commit 
[`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] 
DataSourceV2: InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501564013
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501564013
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501564017
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106455/
   Test FAILed.



[GitHub] [spark] SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501564006
 
 
   **[Test build #106455 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)**
 for PR 24832 at commit 
[`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class InsertTableStatement(`



[GitHub] [spark] SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563694
 
 
   **[Test build #106455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)**
 for PR 24832 at commit 
[`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e).



[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] 
DataSourceV2: InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563207
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11698/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] 
DataSourceV2: InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563202
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563202
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501563207
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11698/
   Test PASSed.



[GitHub] [spark] jzhuge commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable

2019-06-12 Thread GitBox

jzhuge commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: 
InsertTable
URL: https://github.com/apache/spark/pull/24832#issuecomment-501562120
 
 
   Rebase and squash



[GitHub] [spark] cloud-fan commented on a change in pull request #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed

2019-06-12 Thread GitBox

cloud-fan commented on a change in pull request #24699: [SPARK-27666][CORE] Do 
not release lock while TaskContext already completed
URL: https://github.com/apache/spark/pull/24699#discussion_r293217786
 
 

 ##
 File path: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala
 ##
 @@ -1176,6 +1176,24 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext {
      }.collect()
    }
 
+  test("SPARK-27666: Do not release lock while TaskContext already completed") {
+    val rdd = sc.parallelize(Range(0, 10), 1).cache()
+    // validate cache
+    rdd.collect()
+    rdd.mapPartitions { iter =>
+      val t = new Thread(() => {
+        while (iter.hasNext) {
+          iter.next()
+          Thread.sleep(100)
+        }
+      })
+      t.setDaemon(false)
+      t.start()
+      Iterator(0)
+    }.collect()
+    Thread.sleep(10 * 150)
 
 Review comment:
   We shouldn't use sleep in tests, as the test will become flaky sooner or 
later. If `CountDownLatch` doesn't work, can we use a Spark `Accumulator` as a 
signal?
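
   As a rough illustration of the suggestion above (not code from the PR), the 
sketch below replaces a fixed `Thread.sleep` with a `CountDownLatch` that the 
background thread counts down when it finishes. It is a standalone example 
that does not use Spark; the object and method names (`LatchInsteadOfSleep`, 
`consumeInBackground`, `runAndWait`) are hypothetical. Inside `mapPartitions` 
the closure is serialized and `CountDownLatch` is not `Serializable`, so this 
pattern may not carry over directly, which is presumably why the comment also 
mentions an `Accumulator`.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical standalone sketch: wait on a latch instead of sleeping.
object LatchInsteadOfSleep {

  /** Consumes `iter` on a background thread and counts down `done` when finished. */
  def consumeInBackground(iter: Iterator[Int], done: CountDownLatch): Thread = {
    val t = new Thread(new Runnable {
      override def run(): Unit = {
        try {
          while (iter.hasNext) iter.next()
        } finally {
          done.countDown() // signal completion even if iteration throws
        }
      }
    })
    t.start()
    t
  }

  /** Returns true if the background consumer finished within the timeout. */
  def runAndWait(timeoutSeconds: Long): Boolean = {
    val done = new CountDownLatch(1)
    consumeInBackground(Iterator.range(0, 10), done)
    // Blocks only as long as needed; the timeout is an upper bound, not a fixed delay.
    done.await(timeoutSeconds, TimeUnit.SECONDS)
  }
}
```

   A test could then assert on the result of `runAndWait` (for example with a 
generous timeout) rather than sleeping for a guessed duration.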



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ##
 @@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
 benchmark.run()
   }
 
+  def filterPushDownBenchmarkWithColumn(
 
 Review comment:
   @IvanVergiliev, the following doesn't mean it should be put here. 
   > I think we should definitely have some automated benchmark for this. 
Otherwise there's nothing in the codebase exercising the behaviour being 
changed, and so nothing to prevent future regressions.
   
   Since this contribution is big, it's worth having its own benchmark 
focusing on filter conversion. Also, the benchmark should include both ORCv1 
and ORCv2 results.



[GitHub] [spark] jzhuge commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

jzhuge commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501561040
 
 
   Thanks @cloud-fan @dongjoon-hyun @gatorsmile @rdblue for the excellent 
reviews! Thanks @rdblue for the great help!



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293216015
 
 

 ##
 File path: sql/core/benchmarks/FilterPushdownBenchmark-results.txt
 ##
 @@ -2,669 +2,695 @@
 Pushdown for many distinct value case
 

 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row (value IS NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11405 / 11485        1.4        725.1       1.0X
-Parquet Vectorized (Pushdown)            675 /   690       23.3         42.9      16.9X
-Native ORC Vectorized                   7127 /  7170        2.2        453.1       1.6X
-Native ORC Vectorized (Pushdown)         519 /   541       30.3         33.0      22.0X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row ('7864320' < value < '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11457 / 11473        1.4        728.4       1.0X
-Parquet Vectorized (Pushdown)            656 /   686       24.0         41.7      17.5X
-Native ORC Vectorized                   7328 /  7342        2.1        465.9       1.6X
-Native ORC Vectorized (Pushdown)         539 /   565       29.2         34.2      21.3X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value = '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11878 / 11888        1.3        755.2       1.0X
-Parquet Vectorized (Pushdown)            630 /   654       25.0         40.1      18.9X
-Native ORC Vectorized                   7342 /  7362        2.1        466.8       1.6X
-Native ORC Vectorized (Pushdown)         519 /   537       30.3         33.0      22.9X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value <=> '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11423 / 11440        1.4        726.2       1.0X
-Parquet Vectorized (Pushdown)            625 /   643       25.2         39.7      18.3X
-Native ORC Vectorized                   7315 /  7335        2.2        465.1       1.6X
-Native ORC Vectorized (Pushdown)         507 /   520       31.0         32.2      22.5X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row ('7864320' <= value <= '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11440 / 11478        1.4        727.3       1.0X
-Parquet Vectorized (Pushdown)            634 /   652       24.8         40.3      18.0X
-Native ORC Vectorized                   7311 /  7324        2.2        464.8       1.6X
-Native ORC Vectorized (Pushdown)         517 /   548       30.4         32.8      22.1X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select all string rows (value IS NOT NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     20750 / 20872        0.8       1319.3       1.0X
-Parquet Vectorized (Pushdown)          21002 / 21032        0.7       1335.3       1.0X
-Native ORC Vectorized                  16714 / 16742        0.9       1062.6       1.2X
-Native ORC Vectorized (Pushdown)       16926 / 16965        0.9       1076.1       1.2X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 int row (value IS NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293215498
 
 

 ##
 File path: sql/core/benchmarks/FilterPushdownBenchmark-results.txt
 ##
 @@ -2,669 +2,695 @@
 Pushdown for many distinct value case
 

 
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row (value IS NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11405 / 11485        1.4        725.1       1.0X
-Parquet Vectorized (Pushdown)            675 /   690       23.3         42.9      16.9X
-Native ORC Vectorized                   7127 /  7170        2.2        453.1       1.6X
-Native ORC Vectorized (Pushdown)         519 /   541       30.3         33.0      22.0X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row ('7864320' < value < '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11457 / 11473        1.4        728.4       1.0X
-Parquet Vectorized (Pushdown)            656 /   686       24.0         41.7      17.5X
-Native ORC Vectorized                   7328 /  7342        2.1        465.9       1.6X
-Native ORC Vectorized (Pushdown)         539 /   565       29.2         34.2      21.3X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value = '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11878 / 11888        1.3        755.2       1.0X
-Parquet Vectorized (Pushdown)            630 /   654       25.0         40.1      18.9X
-Native ORC Vectorized                   7342 /  7362        2.1        466.8       1.6X
-Native ORC Vectorized (Pushdown)         519 /   537       30.3         33.0      22.9X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value <=> '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11423 / 11440        1.4        726.2       1.0X
-Parquet Vectorized (Pushdown)            625 /   643       25.2         39.7      18.3X
-Native ORC Vectorized                   7315 /  7335        2.2        465.1       1.6X
-Native ORC Vectorized (Pushdown)         507 /   520       31.0         32.2      22.5X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row ('7864320' <= value <= '7864320'):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     11440 / 11478        1.4        727.3       1.0X
-Parquet Vectorized (Pushdown)            634 /   652       24.8         40.3      18.0X
-Native ORC Vectorized                   7311 /  7324        2.2        464.8       1.6X
-Native ORC Vectorized (Pushdown)         517 /   548       30.4         32.8      22.1X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select all string rows (value IS NOT NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
-
-Parquet Vectorized                     20750 / 20872        0.8       1319.3       1.0X
-Parquet Vectorized (Pushdown)          21002 / 21032        0.7       1335.3       1.0X
-Native ORC Vectorized                  16714 / 16742        0.9       1062.6       1.2X
-Native ORC Vectorized (Pushdown)       16926 / 16965        0.9       1076.1       1.2X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 int row (value IS NULL):   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative

[GitHub] [spark] AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add 
a runtime buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501559017
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a 
runtime buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501559024
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106453/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add 
a runtime buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501559024
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106453/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a 
runtime buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501559017
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan closed pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

cloud-fan closed pull request #24741: [SPARK-27322][SQL] DataSourceV2 table 
relation
URL: https://github.com/apache/spark/pull/24741
 
 
   



[GitHub] [spark] SparkQA removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a 
runtime buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501536110
 
 
   **[Test build #106453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106453/testReport)**
 for PR 24826 at commit 
[`614013e`](https://github.com/apache/spark/commit/614013e0b0e87ef71a082a7ac269244157025aad).



[GitHub] [spark] SparkQA commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs

2019-06-12 Thread GitBox

SparkQA commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime 
buffer size configuration for Pandas UDFs
URL: https://github.com/apache/spark/pull/24826#issuecomment-501558556
 
 
   **[Test build #106453 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106453/testReport)**
 for PR 24826 at commit 
[`614013e`](https://github.com/apache/spark/commit/614013e0b0e87ef71a082a7ac269244157025aad).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save 
default constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501557552
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501557557
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106451/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501557552
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save 
default constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501557557
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106451/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default 
constraint with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501525743
 
 
   **[Test build #106451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106451/testReport)**
 for PR 24792 at commit 
[`9931eb6`](https://github.com/apache/spark/commit/9931eb63c0715ba190717a593ce51b949d5355b2).



[GitHub] [spark] SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table

2019-06-12 Thread GitBox

SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint 
with Column into table properties when create Hive table
URL: https://github.com/apache/spark/pull/24792#issuecomment-501557174
 
 
   **[Test build #106451 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106451/testReport)**
 for PR 24792 at commit 
[`9931eb6`](https://github.com/apache/spark/commit/9931eb63c0715ba190717a593ce51b949d5355b2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

cloud-fan commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table 
relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501556439
 
 
   I have only one comment, about adding more code comments, which can be 
addressed later. I'm merging it to unblock the DS v2 project. Thanks for your 
hard work, @jzhuge @rdblue!



[GitHub] [spark] cloud-fan edited a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

cloud-fan edited a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 
table relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501556439
 
 
   I have only one comment, about adding more code comments, which can be 
addressed later. I'm merging it to unblock the DS v2 project. Thanks for your 
hard work, @jzhuge @rdblue!



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ##
 @@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
 benchmark.run()
   }
 
+  def filterPushDownBenchmarkWithColumn(
 
 Review comment:
   @IvanVergiliev, the following does not mean that the benchmark should be put 
here.
   > I think we should definitely have some automated benchmark for this. 
Otherwise there's nothing in the codebase exercising the behaviour being 
changed, and so nothing to prevent future regressions.
   
   Since this contribution is big, it is worth having its own benchmark 
focused on filter conversion. The benchmark should also include both ORCv1 and 
ORCv2 results.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ##
 @@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
 benchmark.run()
   }
 
+  def filterPushDownBenchmarkWithColumn(
 
 Review comment:
   @IvanVergiliev, the following does not mean that the benchmark should be put 
here. Since this contribution is big, it is worth having its own benchmark 
focused on filter conversion.
   > I think we should definitely have some automated benchmark for this. 
Otherwise there's nothing in the codebase exercising the behaviour being 
changed, and so nothing to prevent future regressions.



[GitHub] [spark] cloud-fan commented on a change in pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

cloud-fan commented on a change in pull request #24741: [SPARK-27322][SQL] 
DataSourceV2 table relation
URL: https://github.com/apache/spark/pull/24741#discussion_r293210583
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -731,20 +753,16 @@ class Analyzer(
 //and the default database is only used to look up a view);
 // 3. Use the currentDb of the SessionCatalog.
 private def lookupTableFromCatalog(
+tableIdentifier: TableIdentifier,
 u: UnresolvedRelation,
 defaultDatabase: Option[String] = None): LogicalPlan = {
-  val tableIdentWithDb = u.tableIdentifier.copy(
-database = u.tableIdentifier.database.orElse(defaultDatabase))
+  val tableIdentWithDb = tableIdentifier.copy(
+database = tableIdentifier.database.orElse(defaultDatabase))
   try {
 catalog.lookupRelation(tableIdentWithDb)
   } catch {
-case e: NoSuchTableException =>
-  u.failAnalysis(s"Table or view not found: ${tableIdentWithDb.unquotedString}", e)
-// If the database is defined and that database is not found, throw an AnalysisException.
-// Note that if the database is not defined, it is possible we are looking up a temp view.
-case e: NoSuchDatabaseException =>
-  u.failAnalysis(s"Table or view not found: ${tableIdentWithDb.unquotedString}, the " +
-s"database ${e.db} doesn't exist.", e)
+case _: NoSuchTableException | _: NoSuchDatabaseException =>
+  u
 
 Review comment:
   We should add some comments to explain why we need to delay the exception 
here. To me it's because we still have a chance to resolve the table relation 
with v2 rules.
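
   The idea of delaying the exception can be sketched in plain Scala as follows. This is only a toy model under stated assumptions: `Plan`, `Unresolved`, `Resolved`, `v1Rule`, `v2Rule` and `analyze` are hypothetical names invented for illustration and are not Spark's classes or rules. The first rule returns the node unchanged on a miss, so a later rule can still resolve it, and the "not found" error is reported only if the node is still unresolved after all rules have run.

```scala
// Hypothetical stand-ins for illustration; none of these are Spark's real classes.
sealed trait Plan
case class Unresolved(name: String) extends Plan
case class Resolved(name: String, source: String) extends Plan

object DeferredFailure {
  type Rule = Plan => Plan

  // First rule: look the name up in a toy "v1" catalog. On a miss, return the
  // node unchanged instead of failing, so that later rules still get a chance.
  def v1Rule(catalog: Set[String]): Rule = {
    case Unresolved(n) if catalog(n) => Resolved(n, "v1")
    case other => other
  }

  // Second rule: the same lookup against a toy "v2" catalog.
  def v2Rule(catalog: Set[String]): Rule = {
    case Unresolved(n) if catalog(n) => Resolved(n, "v2")
    case other => other
  }

  // Apply every rule in order, and report "not found" only if the node is
  // still unresolved after the last rule.
  def analyze(plan: Plan, rules: Seq[Rule]): Either[String, Plan] =
    rules.foldLeft(plan)((p, rule) => rule(p)) match {
      case Unresolved(n) => Left(s"Table or view not found: $n")
      case resolved => Right(resolved)
    }
}
```

   With `Seq(v1Rule(Set("t1")), v2Rule(Set("t2")))`, a lookup of `t2` misses in the first rule, is resolved by the second, and only a name unknown to both ends in the error.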



[GitHub] [spark] cloud-fan commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

cloud-fan commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293209902
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ##
 @@ -362,6 +394,13 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
 }
 
 runBenchmark(s"Pushdown benchmark with many filters") {
+  // This benchmark and the next one are similar in that they both test predicate pushdown
+  // where the filter itself is very large. There have been cases where the filter conversion
+  // would take minutes to hours for large filters due to it being implemented with exponential
+  // complexity in the height of the filter tree.
+  // The difference between these two benchmarks is that this one benchmarks pushdown with a
+  // large string filter (`a AND b AND c ...`), whereas the next one benchmarks pushdown with
+  // a large Column-based filter (`col(a) || (col(b) || (col(c)...))`).
 
 Review comment:
   I still don't get it. Both the string filter and the column-based filter 
become an `Expression` in the `Filter` operator. The only differences I see are:
   1. the new benchmark builds a larger filter
   2. the new benchmark uses `Or` instead of `And`.
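
   For reference, the "exponential complexity in the height of the filter tree" mentioned in the code comment above can be illustrated with a toy model. This is a sketch under stated assumptions, not Spark's conversion code: `Pred`, `Leaf`, `Or` and `ConversionCost` are hypothetical names. The `naive` function checks each child with a full trial conversion and then converts it again, so the number of visits doubles with every level of nesting, while `singlePass` converts each child once.

```scala
// Toy model only: these types are hypothetical and are not Spark's Expression
// or ORC's SearchArgument API.
sealed trait Pred
case class Leaf(name: String, convertible: Boolean = true) extends Pred
case class Or(left: Pred, right: Pred) extends Pred

object ConversionCost {
  // Naive strategy: test each child with a full trial conversion, then convert
  // it again for real. Each child is visited twice per level.
  def naive(p: Pred, visits: Array[Long]): Option[String] = {
    visits(0) += 1
    p match {
      case Leaf(n, ok) => if (ok) Some(n) else None
      case Or(l, r) =>
        for {
          _ <- naive(l, visits) // trial run, result discarded
          _ <- naive(r, visits) // trial run, result discarded
          lhs <- naive(l, visits) // real run
          rhs <- naive(r, visits) // real run
        } yield s"($lhs OR $rhs)"
    }
  }

  // Single-pass strategy: convert each child exactly once and reuse the result.
  def singlePass(p: Pred, visits: Array[Long]): Option[String] = {
    visits(0) += 1
    p match {
      case Leaf(n, ok) => if (ok) Some(n) else None
      case Or(l, r) =>
        for {
          lhs <- singlePass(l, visits)
          rhs <- singlePass(r, visits)
        } yield s"($lhs OR $rhs)"
    }
  }

  // Right-deep chain c1 OR (c2 OR (... OR cN)), the nesting shape quoted above.
  def chain(n: Int): Pred =
    (1 to n).map(i => Leaf(s"c$i"): Pred).reduceRight[Pred]((l, r) => Or(l, r))

  // Run a strategy on a predicate and return how many nodes it visited.
  def count(f: (Pred, Array[Long]) => Option[String], p: Pred): Long = {
    val visits = Array(0L)
    f(p, visits)
    visits(0)
  }
}
```

   In this model a right-deep chain of n leaves costs 2^(n+1) - 3 visits with `naive` and 2n - 1 visits with `singlePass`, so the gap depends on the nesting depth of the tree rather than on whether the operator is `Or` or `And`.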



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] 
Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293207010
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
 ##
 @@ -131,13 +129,6 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends We
   return UIUtils.headerSparkPage(request, stageHeader, content, parent)
 }
 
-val storedTasks = store.taskCount(stageData.stageId, stageData.attemptId)
-val numCompleted = stageData.numCompleteTasks
-val totalTasksNumStr = if (totalTasks == storedTasks) {
-  s"$totalTasks"
-} else {
-  s"$totalTasks, showing $storedTasks"
-}
 
 Review comment:
   @imback82, before removing lines, please read the commit history. For 
example, this is live code. Please see the following PR.
   - https://github.com/apache/spark/pull/22525
   



[GitHub] [spark] AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] 
DataSourceV2 table relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501549531
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106449/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] 
DataSourceV2 table relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501549526
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table 
relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501549526
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table 
relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501549531
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106449/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 
table relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501517370
 
 
   **[Test build #106449 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106449/testReport)**
 for PR 24741 at commit 
[`b8cdf6c`](https://github.com/apache/spark/commit/b8cdf6c22172585b3b3a9452d5e4d2d591ece88e).



[GitHub] [spark] SparkQA commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation

2019-06-12 Thread GitBox

SparkQA commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table 
relation
URL: https://github.com/apache/spark/pull/24741#issuecomment-501549203
 
 
   **[Test build #106449 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106449/testReport)**
 for PR 24741 at commit 
[`b8cdf6c`](https://github.com/apache/spark/commit/b8cdf6c22172585b3b3a9452d5e4d2d591ece88e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] 
Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205670
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/memory/ExecutionMemoryPool.scala
 ##
 @@ -151,7 +151,7 @@ private[memory] class ExecutionMemoryPool(
*/
   def releaseMemory(numBytes: Long, taskAttemptId: Long): Unit = lock.synchronized {
 val curMem = memoryForTask.getOrElse(taskAttemptId, 0L)
-var memoryToFree = if (curMem < numBytes) {
+val memoryToFree = if (curMem < numBytes) {
 
 Review comment:
   Let's not put unrelated changes in the same PR.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] 
Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205706
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala
 ##
 @@ -364,7 +364,7 @@ private class DefaultPartitionCoalescer(val balanceSlack: Double = 0.10)
   val partNoLocIter = partitionLocs.partsWithoutLocs.iterator
   groupArr.filter(pg => pg.numPartitions == 0).foreach { pg =>
 while (partNoLocIter.hasNext && pg.numPartitions == 0) {
-  var nxt_part = partNoLocIter.next()
+  val nxt_part = partNoLocIter.next()
 
 Review comment:
   ditto.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] 
Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205471
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolMessage.scala
 ##
 @@ -46,9 +46,6 @@ private[rest] abstract class SubmitRestProtocolMessage {
   val action: String = messageType
   var message: String = null
 
-  // For JSON deserialization
-  private def setAction(a: String): Unit = { }
-
 
 Review comment:
   This has been here from the 
[beginning](https://github.com/apache/spark/commit/6ec0cdc14390d4dc45acf31040f21e1efc476fc0#diff-fb39e366f633463136727a6b6d5b832fR52), 
and the comment suggests that it is used. Shall we keep the existing one?



[GitHub] [spark] IvanVergiliev commented on issue #24783: [SPARK-27105][SQL][test-hadoop3.2] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

IvanVergiliev commented on issue #24783: [SPARK-27105][SQL][test-hadoop3.2] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24783#issuecomment-501548338
 
 
   @cloud-fan cool, this sounds good to me too! I can also bring my PR back to 
a state similar to the one before I merged 
https://github.com/IvanVergiliev/spark/pull/2/files, with `filter` and `build` 
in separate functions, and then @gengliangwang can follow up with the change to 
reuse `build` for determining whether leaf nodes are convertible?



[GitHub] [spark] IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion

2019-06-12 Thread GitBox

IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293204913
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ##
 @@ -362,6 +394,13 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
  }
 
  runBenchmark(s"Pushdown benchmark with many filters") {
 +  // This benchmark and the next one are similar in that they both test predicate pushdown
 +  // where the filter itself is very large. There have been cases where the filter conversion
 +  // would take minutes to hours for large filters due to it being implemented with exponential
 +  // complexity in the height of the filter tree.
 +  // The difference between these two benchmarks is that this one benchmarks pushdown with a
 +  // large string filter (`a AND b AND c ...`), whereas the next one benchmarks pushdown with
 +  // a large Column-based filter (`col(a) || (col(b) || (col(c)...))`).
 
 Review comment:
   @cloud-fan the two go through different code paths. The string-based one was 
added in https://github.com/apache/spark/pull/22313 , but it doesn't expose the 
slowness when passing a `Column` filter directly. That is, the string-based one 
was fast before this PR. The one this PR fixes is specifically when passing in 
a `Column` directly to something like `df.filter(Column)`.
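
   To illustrate why conversion cost can grow exponentially with the height of the filter tree, here is a toy model in Python. It is not Spark's ORC conversion code: `Leaf`, `Or`, `right_deep_or`, `convert_naive`, and `convert_single_pass` are hypothetical names invented for this sketch. It assumes the slow path does a trial conversion of each child to check convertibility and then converts each child again for real, which doubles the work at every level of the tree.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    name: str


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[Leaf, Or]


def right_deep_or(n: int) -> Node:
    """Build c0 || (c1 || (... || c{n-1})), a tree whose height grows with n."""
    node: Node = Leaf(f"c{n - 1}")
    for i in range(n - 2, -1, -1):
        node = Or(Leaf(f"c{i}"), node)
    return node


def convert_naive(node: Node, calls: List[int]) -> str:
    """Assumed slow shape: trial-convert both children, then convert them again."""
    calls[0] += 1
    if isinstance(node, Leaf):
        return node.name
    # Trial pass: results are discarded, only used to check convertibility.
    convert_naive(node.left, calls)
    convert_naive(node.right, calls)
    # Real pass: the same subtrees are converted a second time.
    left = convert_naive(node.left, calls)
    right = convert_naive(node.right, calls)
    return f"({left} OR {right})"


def convert_single_pass(node: Node, calls: List[int]) -> str:
    """Visit every node exactly once."""
    calls[0] += 1
    if isinstance(node, Leaf):
        return node.name
    left = convert_single_pass(node.left, calls)
    right = convert_single_pass(node.right, calls)
    return f"({left} OR {right})"


tree = right_deep_or(10)
naive_calls, single_calls = [0], [0]
naive_result = convert_naive(tree, naive_calls)
single_result = convert_single_pass(tree, single_calls)
```

   For a right-deep chain of `n` leaves, the naive version in this model makes `2^(n+1) - 3` calls, while the single-pass version makes `2n - 1`, and both produce the same output string.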



[GitHub] [spark] AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup 
resources when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501547045
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106450/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup 
resources when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501547038
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources 
when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501547045
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106450/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources 
when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501547038
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

SparkQA removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources 
when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501524364
 
 
   **[Test build #106450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106450/testReport)**
 for PR 24841 at commit 
[`14203f5`](https://github.com/apache/spark/commit/14203f53604ce0b63a964e8c11288c3f9014792d).



[GitHub] [spark] SparkQA commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

SparkQA commented on issue #24841: [SPARK-27369][CORE] Setup resources when 
Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501546738
 
 
   **[Test build #106450 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106450/testReport)**
 for PR 24841 at commit 
[`14203f5`](https://github.com/apache/spark/commit/14203f53604ce0b63a964e8c11288c3f9014792d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] jiangxb1987 commented on issue #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed

2019-06-12 Thread GitBox

jiangxb1987 commented on issue #24699: [SPARK-27666][CORE] Do not release lock 
while TaskContext already completed
URL: https://github.com/apache/spark/pull/24699#issuecomment-501545350
 
 
   LGTM



[GitHub] [spark] dongjoon-hyun edited a comment on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun edited a comment on issue #24857: [MINOR][CORE] Remove an unused 
variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#issuecomment-501543348
 
 
   Ur, thank you for the update, but let's remove the `unused imports` changes. You 
can get them reviewed later in another PR. They're good to have, but sometimes they're 
borderline because of their intrusiveness. Also, they're beyond the scope of the PR title.



[GitHub] [spark] dongjoon-hyun commented on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala

2019-06-12 Thread GitBox

dongjoon-hyun commented on issue #24857: [MINOR][CORE] Remove an unused 
variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#issuecomment-501543348
 
 
   Ur, thank you for the update, but let's remove the `unused imports` changes. You 
can get them reviewed later in another PR. They're good to have, but sometimes they're 
borderline because of their intrusiveness.



[GitHub] [spark] Ngone51 edited a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

Ngone51 edited a comment on issue #24841: [SPARK-27369][CORE] Setup resources 
when Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501538461
 
 
   @viirya IIUC, the executor sets up its resources from what the worker assigned to 
it. For example, the worker could "split" its own resources into separate 
resource files according to the Master's requirements for executors. Then each executor 
could set itself up from its corresponding resource file when it starts up.
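
   As a rough illustration of the "split" idea, here is a minimal Python sketch. It is not Spark code: the function names, the `executor-N-resources.json` file naming, and the JSON layout are all hypothetical choices made for this example. It assumes the worker knows its resource addresses and is told how many addresses each executor needs.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def split_addresses(addresses: List[str], per_executor: int,
                    num_executors: int) -> List[List[str]]:
    """Carve the worker's addresses into one disjoint slice per executor."""
    if per_executor * num_executors > len(addresses):
        raise ValueError("worker does not have enough resources")
    return [addresses[i * per_executor:(i + 1) * per_executor]
            for i in range(num_executors)]


def write_resource_files(directory: str, resource_name: str,
                         allocations: List[List[str]]) -> List[Path]:
    """Worker side: write one (hypothetical) resource file per executor."""
    paths = []
    for idx, addrs in enumerate(allocations):
        path = Path(directory) / f"executor-{idx}-resources.json"
        path.write_text(json.dumps([{"name": resource_name, "addresses": addrs}]))
        paths.append(path)
    return paths


def load_resources(path: Path) -> List[Dict]:
    """Executor side: read back only the resources assigned to this executor."""
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as workdir:
    allocations = split_addresses(["0", "1", "2", "3"],
                                  per_executor=2, num_executors=2)
    files = write_resource_files(workdir, "gpu", allocations)
    loaded = [load_resources(f) for f in files]
```

   In this sketch each executor reads only its own file, so it never sees addresses that were handed to another executor.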
   



[GitHub] [spark] SparkQA commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL

2019-06-12 Thread GitBox

SparkQA commented on issue #24706: [SPARK-23128][SQL] A new approach to do 
adaptive execution in Spark SQL
URL: https://github.com/apache/spark/pull/24706#issuecomment-501538447
 
 
   **[Test build #106454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106454/testReport)**
 for PR 24706 at commit 
[`5688cb4`](https://github.com/apache/spark/commit/5688cb47b5171fcb590819c101dacfb73ffde356).



[GitHub] [spark] Ngone51 commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up

2019-06-12 Thread GitBox

Ngone51 commented on issue #24841: [SPARK-27369][CORE] Setup resources when 
Standalone Worker starts up
URL: https://github.com/apache/spark/pull/24841#issuecomment-501538461
 
 
   @viirya IIUC, the executor sets up its resources from what the worker assigned to 
it. For example, the worker could "split" its own resources into separate 
resource files according to the Master's requirements for executors. Then, 
executors could set themselves up from those resource files when they start up.
   



[GitHub] [spark] AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to 
do adaptive execution in Spark SQL
URL: https://github.com/apache/spark/pull/24706#issuecomment-501538146
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL

2019-06-12 Thread GitBox

AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to 
do adaptive execution in Spark SQL
URL: https://github.com/apache/spark/pull/24706#issuecomment-501538151
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11697/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL

2019-06-12 Thread GitBox

AmplabJenkins removed a comment on issue #24706: [SPARK-23128][SQL] A new 
approach to do adaptive execution in Spark SQL
URL: https://github.com/apache/spark/pull/24706#issuecomment-501538151
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11697/
   Test PASSed.


