[GitHub] [spark] wangyum commented on a change in pull request #24767: [SPARK-27918][SQL] Port boolean.sql
wangyum commented on a change in pull request #24767: [SPARK-27918][SQL] Port boolean.sql
URL: https://github.com/apache/spark/pull/24767#discussion_r293226760

## File path: sql/core/src/test/resources/sql-tests/results/pgSQL/boolean.sql.out ##

@@ -0,0 +1,710 @@
+-- Automatically generated by SQLQueryTestSuite
+-- Number of queries: 81
+
+
+-- !query 0
+SELECT 1 AS one
+-- !query 0 schema
+struct
+-- !query 0 output
+1
+
+
+-- !query 1
+SELECT true AS true
+-- !query 1 schema
+struct
+-- !query 1 output
+true
+
+
+-- !query 2
+SELECT false AS false
+-- !query 2 schema
+struct
+-- !query 2 output
+false
+
+
+-- !query 3
+SELECT cast('t' as boolean) AS true
+-- !query 3 schema
+struct
+-- !query 3 output
+true
+
+
+-- !query 4
+SELECT cast(' f ' as boolean) AS false
+-- !query 4 schema
+struct
+-- !query 4 output
+NULL
+
+
+-- !query 5
+SELECT cast('true' as boolean) AS true
+-- !query 5 schema
+struct
+-- !query 5 output
+true
+
+
+-- !query 6
+SELECT cast('test' as boolean) AS error
+-- !query 6 schema
+struct
+-- !query 6 output
+NULL
+
+
+-- !query 7
+SELECT cast('false' as boolean) AS false
+-- !query 7 schema
+struct
+-- !query 7 output
+false
+
+
+-- !query 8
+SELECT cast('foo' as boolean) AS error
+-- !query 8 schema
+struct
+-- !query 8 output
+NULL
+
+
+-- !query 9
+SELECT cast('y' as boolean) AS true
+-- !query 9 schema
+struct
+-- !query 9 output
+true
+
+
+-- !query 10
+SELECT cast('yes' as boolean) AS true
+-- !query 10 schema
+struct
+-- !query 10 output
+true
+
+
+-- !query 11
+SELECT cast('yeah' as boolean) AS error
+-- !query 11 schema
+struct
+-- !query 11 output
+NULL
+
+
+-- !query 12
+SELECT cast('n' as boolean) AS false
+-- !query 12 schema
+struct
+-- !query 12 output
+false
+
+
+-- !query 13
+SELECT cast('no' as boolean) AS false
+-- !query 13 schema
+struct
+-- !query 13 output
+false
+
+
+-- !query 14
+SELECT cast('nay' as boolean) AS error
+-- !query 14 schema
+struct
+-- !query 14 output
+NULL
+
+
+-- !query 15
+SELECT cast('on' as boolean) AS true
+-- !query 15 schema
+struct
+-- !query 15 output
+NULL
+
+
+-- !query 16
+SELECT cast('off' as boolean) AS false
+-- !query 16 schema
+struct
+-- !query 16 output
+NULL
+
+
+-- !query 17
+SELECT cast('of' as boolean) AS false
+-- !query 17 schema
+struct
+-- !query 17 output
+NULL

Review comment: PostgreSQL fixed the doc: https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
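The cast results quoted in the hunk above can be summarized as a small lookup. The following is only a sketch that reproduces the outputs listed for queries 3 to 17; `BooleanCastSketch` and `castToBoolean` are hypothetical names, `None` stands in for SQL NULL, and this is not Spark's actual cast implementation (case handling and other literals are deliberately left out).

```scala
// Toy model of the string-to-boolean results shown in the quoted test output.
// Assumption: only the literals that appear in that output are covered.
object BooleanCastSketch {
  // Literals that produced `true` in the quoted output.
  private val trueStrings = Set("t", "true", "y", "yes")
  // Literals that produced `false` in the quoted output.
  private val falseStrings = Set("f", "false", "n", "no")

  // Returns None (standing in for NULL) for anything outside the two sets.
  // The input is not trimmed, so " f " maps to None, as in query 4.
  def castToBoolean(s: String): Option[Boolean] =
    if (trueStrings.contains(s)) Some(true)
    else if (falseStrings.contains(s)) Some(false)
    else None
}
```

With this model, `castToBoolean("on")`, `castToBoolean("off")` and `castToBoolean("of")` all yield `None`, which matches the NULL outputs of queries 15 to 17.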
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501571198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501571202 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11701/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501571198 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501571202 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11701/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501569729 **[Test build #106458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106458/testReport)** for PR 24842 at commit [`0a00a03`](https://github.com/apache/spark/commit/0a00a036148e898bd88bf3dd6e7b7ca0c67fa270).
[GitHub] [spark] peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293224600

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ##

@@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }

+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }

Review comment: Fixed, thanks.
[GitHub] [spark] peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
peter-toth commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293224390

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ##

@@ -633,4 +634,15 @@ class AnalysisSuite extends AnalysisTest with Matchers {
     val res = ViewAnalyzer.execute(view)
     comparePlans(res, expected)
   }
+
+  test("SPARK-28002: CTE with non-existing column alias") {

Review comment: Thanks, fixed.
[GitHub] [spark] dongjoon-hyun commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-501568337 +1 for @cloud-fan's suggestion. I saw that @gatorsmile gave the same advice before.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293223242

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ##

@@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }

+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }

Review comment: Oh, this PR is already up to date with master. In that case, please move this function below the existing `cte` function so that similar functions are kept together. If possible, please consolidate them into one.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567326 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106456/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
SparkQA removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501565642 **[Test build #106456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)** for PR 24842 at commit [`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567322 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11700/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins removed a comment on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567134 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567326 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106456/ Test FAILed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293222471

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ##

@@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }

+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }

Review comment: This test suite was updated a few minutes ago. Could you rebase once more? After rebasing, there will be another `cte` function in this test suite.
[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567309 **[Test build #106456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)** for PR 24842 at commit [`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
dongjoon-hyun commented on a change in pull request #24842: [SPARK-28002][SQL] Support WITH clause column aliases
URL: https://github.com/apache/spark/pull/24842#discussion_r293222471

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ##

@@ -81,18 +81,31 @@ class PlanParserSuite extends AnalysisTest {
     assertEqual("select * from a intersect all select * from b", a.intersect(b, isAll = true))
   }

+  private def cte(plan: LogicalPlan, namedPlans: (String, (LogicalPlan, Seq[String]))*): With = {
+    val ctes = namedPlans.map {
+      case (name, (cte, columnAliases)) =>
+        val subquery = if (columnAliases.isEmpty) {
+          cte
+        } else {
+          UnresolvedSubqueryColumnAliases(columnAliases, cte)
+        }
+        name -> SubqueryAlias(name, subquery)
+    }
+    With(plan, ctes)
+  }

Review comment: This test suite was updated a few minutes ago. Could you rebase against the `master` branch once more? After rebasing, there will be another `cte` function in this test suite.
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567322 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567134 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
AmplabJenkins commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501567139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11700/ Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
cloud-fan commented on issue #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24068#issuecomment-501566535 @IvanVergiliev I'd suggest we revert all the benchmark changes, write a simple microbenchmark to test `OrcFilters`, and post the benchmark code and results in the PR description. Currently we do not run benchmarks automatically for Spark, so detecting perf regressions relies on user reports.
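As a rough illustration of what such a microbenchmark might measure, the sketch below counts conversion calls on a toy predicate tree instead of timing Spark code, so the result is deterministic. Everything in it is hypothetical: `Pred`, `Leaf`, `And`, `naiveConvert` and `singlePassConvert` are stand-ins and not Spark's `OrcFilters` API, and the "check convertibility, then build" pattern is only an assumption about how exponential cost in predicate conversion can arise.

```scala
// Toy comparison of a "check, then build" conversion against a single pass.
// All types and functions here are hypothetical stand-ins, not Spark code.
object PredicateConversionSketch {
  sealed trait Pred
  final case class Leaf(name: String) extends Pred
  final case class And(left: Pred, right: Pred) extends Pred

  // Builds a left-deep chain containing `depth` And nodes.
  def chain(depth: Int): Pred =
    (1 to depth).foldLeft[Pred](Leaf("c0"))((acc, i) => And(acc, Leaf(s"c$i")))

  // Converts each child once to test convertibility and a second time to
  // build the result, so the work doubles with every level of nesting.
  def naiveConvert(p: Pred, calls: Array[Long]): Option[String] = {
    calls(0) += 1
    p match {
      case Leaf(n) => Some(n)
      case And(l, r) =>
        val convertible =
          naiveConvert(l, calls).isDefined && naiveConvert(r, calls).isDefined
        if (convertible) {
          for (a <- naiveConvert(l, calls); b <- naiveConvert(r, calls))
            yield s"($a AND $b)"
        } else None
    }
  }

  // Converts each child exactly once and reuses the result.
  def singlePassConvert(p: Pred, calls: Array[Long]): Option[String] = {
    calls(0) += 1
    p match {
      case Leaf(n) => Some(n)
      case And(l, r) =>
        for (a <- singlePassConvert(l, calls); b <- singlePassConvert(r, calls))
          yield s"($a AND $b)"
    }
  }

  // Runs a converter on `p` and returns how many calls it made.
  def countCalls(f: (Pred, Array[Long]) => Option[String], p: Pred): Long = {
    val calls = Array(0L)
    f(p, calls)
    calls(0)
  }
}
```

For a chain of depth d, the naive variant makes 4 * 2^d - 3 calls while the single-pass variant makes 2d + 1, so comparing `countCalls` at a few depths shows the gap without depending on wall-clock timing.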
[GitHub] [spark] SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases
SparkQA commented on issue #24842: [SPARK-28002][SQL] Support WITH clause column aliases URL: https://github.com/apache/spark/pull/24842#issuecomment-501565642 **[Test build #106456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106456/testReport)** for PR 24842 at commit [`510eee1`](https://github.com/apache/spark/commit/510eee10c9c8b5938cec8bbc867c576ff0080103).
[GitHub] [spark] SparkQA commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
SparkQA commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501565653 **[Test build #106457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106457/testReport)** for PR 24735 at commit [`053b3ba`](https://github.com/apache/spark/commit/053b3ba1b7a84d6a4b355a865f4741935208d978).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501565304 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106452/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501565300 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501565084 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11699/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501565304 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106452/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501530960 **[Test build #106452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106452/testReport)** for PR 24792 at commit [`46c12d8`](https://github.com/apache/spark/commit/46c12d8896ef1022ca3e3ee6c2b21a376ae7f378).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
AmplabJenkins removed a comment on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501565079 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501565300 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501565084 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11699/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
AmplabJenkins commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501565079 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501564896 **[Test build #106452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106452/testReport)** for PR 24792 at commit [`46c12d8`](https://github.com/apache/spark/commit/46c12d8896ef1022ca3e3ee6c2b21a376ae7f378). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs
cloud-fan commented on issue #24735: [SPARK-27871][SQL] LambdaVariable should use per-query unique IDs instead of globally unique IDs URL: https://github.com/apache/spark/pull/24735#issuecomment-501564869 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501564017 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106455/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
SparkQA removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563694 **[Test build #106455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)** for PR 24832 at commit [`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501564013 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501564013 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501564017 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106455/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501564006 **[Test build #106455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)** for PR 24832 at commit [`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class InsertTableStatement(`
[GitHub] [spark] SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
SparkQA commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563694 **[Test build #106455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106455/testReport)** for PR 24832 at commit [`2c17ced`](https://github.com/apache/spark/commit/2c17ced4e8effb401e2d0d1c1de14d4939e1c34e).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563207 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11698/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins removed a comment on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563202 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563202 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
AmplabJenkins commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501563207 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11698/ Test PASSed.
[GitHub] [spark] jzhuge commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable
jzhuge commented on issue #24832: [SPARK-27845][SQL][WIP] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#issuecomment-501562120 Rebase and squash
[GitHub] [spark] cloud-fan commented on a change in pull request #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed
cloud-fan commented on a change in pull request #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed
URL: https://github.com/apache/spark/pull/24699#discussion_r293217786

## File path: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala ##
@@ -1176,6 +1176,24 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext {
     }.collect()
   }
 
+  test("SPARK-27666: Do not release lock while TaskContext already completed") {
+    val rdd = sc.parallelize(Range(0, 10), 1).cache()
+    // validate cache
+    rdd.collect()
+    rdd.mapPartitions { iter =>
+      val t = new Thread(() => {
+        while (iter.hasNext) {
+          iter.next()
+          Thread.sleep(100)
+        }
+      })
+      t.setDaemon(false)
+      t.start()
+      Iterator(0)
+    }.collect()
+    Thread.sleep(10 * 150)

Review comment: we shouldn't use sleep in tests, as the test will become flaky sooner or later. If `CountDownLatch` doesn't work, can we use Spark `Accumulator` as signals?
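The review comment above asks for a signal in place of a fixed `Thread.sleep`. A minimal sketch of that idea in plain Scala follows, assuming nothing about how the PR was eventually changed: the object name `LatchInsteadOfSleep` and its helpers are hypothetical, and no Spark classes are involved. The background thread counts down a `java.util.concurrent.CountDownLatch` when it finishes, and the caller waits on the latch with a timeout that only bounds a hung run.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical illustration only: plain JVM threads, no Spark involved.
object LatchInsteadOfSleep {

  /** Drains `iter` on a background thread and counts `done` down when finished. */
  def drainInBackground(iter: Iterator[Int], done: CountDownLatch): Thread = {
    val t = new Thread(new Runnable {
      override def run(): Unit = {
        try {
          while (iter.hasNext) iter.next()
        } finally {
          done.countDown() // signal completion even if iteration throws
        }
      }
    })
    t.setDaemon(false)
    t.start()
    t
  }

  /** Returns true if the background thread signalled completion within the timeout. */
  def runOnce(timeoutSeconds: Long): Boolean = {
    val done = new CountDownLatch(1)
    drainInBackground(Iterator.range(0, 10), done)
    // Wait for the signal instead of sleeping for a guessed duration;
    // the timeout is an upper bound, not the expected wait.
    done.await(timeoutSeconds, TimeUnit.SECONDS)
  }
}
```

Inside a Spark closure the latch may not be usable this directly, since closures are serialized before they run; that may be why the comment offers `Accumulator` as a fallback, but the message itself does not say so.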
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala ##
@@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
     benchmark.run()
   }
 
+  def filterPushDownBenchmarkWithColumn(

Review comment: @IvanVergiliev, the following doesn't mean the benchmark should be put here.

> I think we should definitely have some automated benchmark for this. Otherwise there's nothing in the codebase exercising the behaviour being changed, and so nothing to prevent future regressions.

Since this contribution is big, it's worth having its own benchmark focusing on filter conversion. Also, the benchmark should have both ORCv1 and ORCv2 benchmark results.
[GitHub] [spark] jzhuge commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
jzhuge commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501561040 Thanks @cloud-fan @dongjoon-hyun @gatorsmile @rdblue for the excellent reviews! Thanks @rdblue for the great help!
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293216015

## File path: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ##
@@ -2,669 +2,695 @@
 Pushdown for many distinct value case

-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row (value IS NULL):                    Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11405 / 11485       1.4         725.1         1.0X
-Parquet Vectorized (Pushdown)                           675 / 690           23.3        42.9          16.9X
-Native ORC Vectorized                                   7127 / 7170         2.2         453.1         1.6X
-Native ORC Vectorized (Pushdown)                        519 / 541           30.3        33.0          22.0X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row ('7864320' < value < '7864320'):    Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11457 / 11473       1.4         728.4         1.0X
-Parquet Vectorized (Pushdown)                           656 / 686           24.0        41.7          17.5X
-Native ORC Vectorized                                   7328 / 7342         2.1         465.9         1.6X
-Native ORC Vectorized (Pushdown)                        539 / 565           29.2        34.2          21.3X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value = '7864320'):                Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11878 / 11888       1.3         755.2         1.0X
-Parquet Vectorized (Pushdown)                           630 / 654           25.0        40.1          18.9X
-Native ORC Vectorized                                   7342 / 7362         2.1         466.8         1.6X
-Native ORC Vectorized (Pushdown)                        519 / 537           30.3        33.0          22.9X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value <=> '7864320'):              Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11423 / 11440       1.4         726.2         1.0X
-Parquet Vectorized (Pushdown)                           625 / 643           25.2        39.7          18.3X
-Native ORC Vectorized                                   7315 / 7335         2.2         465.1         1.6X
-Native ORC Vectorized (Pushdown)                        507 / 520           31.0        32.2          22.5X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row ('7864320' <= value <= '7864320'):  Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11440 / 11478       1.4         727.3         1.0X
-Parquet Vectorized (Pushdown)                           634 / 652           24.8        40.3          18.0X
-Native ORC Vectorized                                   7311 / 7324         2.2         464.8         1.6X
-Native ORC Vectorized (Pushdown)                        517 / 548           30.4        32.8          22.1X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select all string rows (value IS NOT NULL):             Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      20750 / 20872       0.8         1319.3        1.0X
-Parquet Vectorized (Pushdown)                           21002 / 21032       0.7         1335.3        1.0X
-Native ORC Vectorized                                   16714 / 16742       0.9         1062.6        1.2X
-Native ORC Vectorized (Pushdown)                        16926 / 16965       0.9         1076.1        1.2X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 int row (value IS NULL):                       Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
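The derived columns in the results above appear to follow from the best time alone. The sketch below is an illustration under one loud assumption: the row count of 15 * 1024 * 1024 is inferred from the posted numbers and is not stated anywhere in this message. The object `BenchmarkColumns` and its method names are hypothetical helpers, not part of Spark's benchmark utilities.

```scala
// Hypothetical helper for reading the table; not Spark's Benchmark API.
object BenchmarkColumns {
  // ASSUMPTION: row count inferred from the posted figures, not given in the message.
  val assumedRows: Long = 15L * 1024 * 1024

  /** Per Row(ns): best time in milliseconds converted to ns, divided by rows. */
  def perRowNs(bestMs: Double, rows: Long = assumedRows): Double =
    bestMs * 1e6 / rows

  /** Rate(M/s): millions of rows processed per second at the best time. */
  def rateMPerSec(bestMs: Double, rows: Long = assumedRows): Double =
    (rows / 1e6) / (bestMs / 1e3)

  /** Relative: speedup of a case against the first (baseline) row's best time. */
  def relative(baselineBestMs: Double, bestMs: Double): Double =
    baselineBestMs / bestMs

  /** Rounds to one decimal place, as the table prints. */
  def round1(x: Double): Double = math.round(x * 10) / 10.0
}

// First case (value IS NULL), Parquet Vectorized baseline 11405 ms vs Pushdown 675 ms:
// BenchmarkColumns.round1(BenchmarkColumns.perRowNs(11405))       // 725.1
// BenchmarkColumns.round1(BenchmarkColumns.rateMPerSec(675))      // 23.3
// BenchmarkColumns.round1(BenchmarkColumns.relative(11405, 675))  // 16.9
```

Under that assumed row count, the three values match the first two rows of the first case in the table, which is the only reason the row count is suggested here.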
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293215498

## File path: sql/core/benchmarks/FilterPushdownBenchmark-results.txt ##
@@ -2,669 +2,695 @@
 Pushdown for many distinct value case

-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row (value IS NULL):                    Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11405 / 11485       1.4         725.1         1.0X
-Parquet Vectorized (Pushdown)                           675 / 690           23.3        42.9          16.9X
-Native ORC Vectorized                                   7127 / 7170         2.2         453.1         1.6X
-Native ORC Vectorized (Pushdown)                        519 / 541           30.3        33.0          22.0X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 string row ('7864320' < value < '7864320'):    Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11457 / 11473       1.4         728.4         1.0X
-Parquet Vectorized (Pushdown)                           656 / 686           24.0        41.7          17.5X
-Native ORC Vectorized                                   7328 / 7342         2.1         465.9         1.6X
-Native ORC Vectorized (Pushdown)                        539 / 565           29.2        34.2          21.3X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value = '7864320'):                Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11878 / 11888       1.3         755.2         1.0X
-Parquet Vectorized (Pushdown)                           630 / 654           25.0        40.1          18.9X
-Native ORC Vectorized                                   7342 / 7362         2.1         466.8         1.6X
-Native ORC Vectorized (Pushdown)                        519 / 537           30.3        33.0          22.9X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row (value <=> '7864320'):              Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11423 / 11440       1.4         726.2         1.0X
-Parquet Vectorized (Pushdown)                           625 / 643           25.2        39.7          18.3X
-Native ORC Vectorized                                   7315 / 7335         2.2         465.1         1.6X
-Native ORC Vectorized (Pushdown)                        507 / 520           31.0        32.2          22.5X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 1 string row ('7864320' <= value <= '7864320'):  Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      11440 / 11478       1.4         727.3         1.0X
-Parquet Vectorized (Pushdown)                           634 / 652           24.8        40.3          18.0X
-Native ORC Vectorized                                   7311 / 7324         2.2         464.8         1.6X
-Native ORC Vectorized (Pushdown)                        517 / 548           30.4        32.8          22.1X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select all string rows (value IS NOT NULL):             Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------
-Parquet Vectorized                                      20750 / 20872       0.8         1319.3        1.0X
-Parquet Vectorized (Pushdown)                           21002 / 21032       0.7         1335.3        1.0X
-Native ORC Vectorized                                   16714 / 16742       0.9         1062.6        1.2X
-Native ORC Vectorized (Pushdown)                        16926 / 16965       0.9         1076.1        1.2X
-
-OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
-Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
-Select 0 int row (value IS NULL):                       Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
[GitHub] [spark] AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501559017 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501559024 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106453/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
AmplabJenkins removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501559024 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106453/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
AmplabJenkins commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501559017 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan closed pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation
cloud-fan closed pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741
[GitHub] [spark] SparkQA removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
SparkQA removed a comment on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501536110 **[Test build #106453 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106453/testReport)** for PR 24826 at commit [`614013e`](https://github.com/apache/spark/commit/614013e0b0e87ef71a082a7ac269244157025aad).
[GitHub] [spark] SparkQA commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs
SparkQA commented on issue #24826: [SPARK-27870][SQL][PYTHON] Add a runtime buffer size configuration for Pandas UDFs URL: https://github.com/apache/spark/pull/24826#issuecomment-501558556 **[Test build #106453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106453/testReport)** for PR 24826 at commit [`614013e`](https://github.com/apache/spark/commit/614013e0b0e87ef71a082a7ac269244157025aad). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501557552 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501557557 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106451/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501557552 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
AmplabJenkins removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501557557 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106451/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
SparkQA removed a comment on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501525743 **[Test build #106451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106451/testReport)** for PR 24792 at commit [`9931eb6`](https://github.com/apache/spark/commit/9931eb63c0715ba190717a593ce51b949d5355b2).
[GitHub] [spark] SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table
SparkQA commented on issue #24792: [SPARK-27953][SQL] Save default constraint with Column into table properties when create Hive table URL: https://github.com/apache/spark/pull/24792#issuecomment-501557174 **[Test build #106451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106451/testReport)** for PR 24792 at commit [`9931eb6`](https://github.com/apache/spark/commit/9931eb63c0715ba190717a593ce51b949d5355b2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
cloud-fan commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501556439 I have only comment about adding more code comments, which can be addressed later. I'm merging it to unblock the DS v2 project, thanks for your hard work @jzhuge @rdblue !
[GitHub] [spark] cloud-fan edited a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
cloud-fan edited a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501556439 I have only one comment about adding more code comments, which can be addressed later. I'm merging it to unblock the DS v2 project, thanks for your hard work @jzhuge @rdblue !
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala ##

@@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
     benchmark.run()
   }

+  def filterPushDownBenchmarkWithColumn(

Review comment:
@IvanVergiliev . The following doesn't mean put that into here.

> I think we should definitely have some automated benchmark for this. Otherwise there's nothing in the codebase exercising the behaviour being changed, and so nothing to prevent future regressions.

Since this contribution is big, it is worth having its own benchmark focusing on filter conversion. Also, the benchmark should include both ORCv1 and ORCv2 benchmark results.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
dongjoon-hyun commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293212240

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala ##

@@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
     benchmark.run()
   }

+  def filterPushDownBenchmarkWithColumn(

Review comment:
@IvanVergiliev . The following doesn't mean put that into here. Since this contribution is big, it is worth having its own benchmark focusing on filter conversion.

> I think we should definitely have some automated benchmark for this. Otherwise there's nothing in the codebase exercising the behaviour being changed, and so nothing to prevent future regressions.
[GitHub] [spark] cloud-fan commented on a change in pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation
cloud-fan commented on a change in pull request #24741: [SPARK-27322][SQL] DataSourceV2 table relation
URL: https://github.com/apache/spark/pull/24741#discussion_r293210583

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##

@@ -731,20 +753,16 @@ class Analyzer(
     //    and the default database is only used to look up a view);
     // 3. Use the currentDb of the SessionCatalog.
     private def lookupTableFromCatalog(
+        tableIdentifier: TableIdentifier,
         u: UnresolvedRelation,
         defaultDatabase: Option[String] = None): LogicalPlan = {
-      val tableIdentWithDb = u.tableIdentifier.copy(
-        database = u.tableIdentifier.database.orElse(defaultDatabase))
+      val tableIdentWithDb = tableIdentifier.copy(
+        database = tableIdentifier.database.orElse(defaultDatabase))
       try {
         catalog.lookupRelation(tableIdentWithDb)
       } catch {
-        case e: NoSuchTableException =>
-          u.failAnalysis(s"Table or view not found: ${tableIdentWithDb.unquotedString}", e)
-        // If the database is defined and that database is not found, throw an AnalysisException.
-        // Note that if the database is not defined, it is possible we are looking up a temp view.
-        case e: NoSuchDatabaseException =>
-          u.failAnalysis(s"Table or view not found: ${tableIdentWithDb.unquotedString}, the " +
-            s"database ${e.db} doesn't exist.", e)
+        case _: NoSuchTableException | _: NoSuchDatabaseException =>
+          u

Review comment:
We should add some comments to explain why we need to delay the exception here. To me it's because we still have a chance to resolve the table relation with v2 rules.
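The "delay the exception" idea in the review comment above can be sketched in isolation. The following is a toy model and not Spark's `Analyzer`: the types `Plan`, `Unresolved`, `Resolved`, the two catalogs, and the rule names are all hypothetical, and the order in which the rules run is an assumption made for illustration. It only shows the shape of the pattern: a rule that cannot resolve a relation returns it unchanged, so a later rule still gets a chance, and the error is raised once at the end.

```scala
// Toy plan nodes (hypothetical, not Spark classes).
sealed trait Plan
case class Unresolved(name: String) extends Plan
case class Resolved(name: String, source: String) extends Plan

object DelayedFailure {
  // Hypothetical catalogs: one for v1 tables, one for v2 tables.
  val v1Catalog: Set[String] = Set("t1")
  val v2Catalog: Set[String] = Set("t2")

  // A v1-style rule: on a miss it returns the node untouched instead of throwing.
  def resolveV1(p: Plan): Plan = p match {
    case Unresolved(n) if v1Catalog(n) => Resolved(n, "v1")
    case other => other
  }

  // A v2-style rule that gets its chance because the v1 rule did not fail early.
  def resolveV2(p: Plan): Plan = p match {
    case Unresolved(n) if v2Catalog(n) => Resolved(n, "v2")
    case other => other
  }

  // The failure is reported only after every rule has run.
  def check(p: Plan): Either[String, Plan] = p match {
    case Unresolved(n) => Left(s"Table or view not found: $n")
    case resolved => Right(resolved)
  }

  def analyze(name: String): Either[String, Plan] =
    check(resolveV2(resolveV1(Unresolved(name))))
}
```

With this layout, `DelayedFailure.analyze("t2")` succeeds through the second rule even though the first rule missed, while a name known to neither catalog still ends in a "not found" error.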
[GitHub] [spark] cloud-fan commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
cloud-fan commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293209902

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala ##

@@ -362,6 +394,13 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
     }

     runBenchmark(s"Pushdown benchmark with many filters") {
+      // This benchmark and the next one are similar in that they both test predicate pushdown
+      // where the filter itself is very large. There have been cases where the filter conversion
+      // would take minutes to hours for large filters due to it being implemented with exponential
+      // complexity in the height of the filter tree.
+      // The difference between these two benchmarks is that this one benchmarks pushdown with a
+      // large string filter (`a AND b AND c ...`), whereas the next one benchmarks pushdown with
+      // a large Column-based filter (`col(a) || (col(b) || (col(c)...))`).

Review comment:
I still can't get it. Both the string filter and the column-based filter will become an `Expression` in the `Filter` operator. The differences I see are:
1. the new benchmark builds a larger filter
2. the new benchmark uses `Or` instead of `And`.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293207010

## File path: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ##

@@ -131,13 +129,6 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends We
       return UIUtils.headerSparkPage(request, stageHeader, content, parent)
     }

-    val storedTasks = store.taskCount(stageData.stageId, stageData.attemptId)
-    val numCompleted = stageData.numCompleteTasks
-    val totalTasksNumStr = if (totalTasks == storedTasks) {
-      s"$totalTasks"
-    } else {
-      s"$totalTasks, showing $storedTasks"
-    }

Review comment:
@imback82 . Before removing lines, please read the commit history. For example, this is live code. Please see the following PR.
- https://github.com/apache/spark/pull/22525
[GitHub] [spark] AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501549531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106449/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
AmplabJenkins removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501549526 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501549526 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
AmplabJenkins commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501549531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106449/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
SparkQA removed a comment on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501517370 **[Test build #106449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106449/testReport)** for PR 24741 at commit [`b8cdf6c`](https://github.com/apache/spark/commit/b8cdf6c22172585b3b3a9452d5e4d2d591ece88e).
[GitHub] [spark] SparkQA commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation
SparkQA commented on issue #24741: [SPARK-27322][SQL] DataSourceV2 table relation URL: https://github.com/apache/spark/pull/24741#issuecomment-501549203 **[Test build #106449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106449/testReport)** for PR 24741 at commit [`b8cdf6c`](https://github.com/apache/spark/commit/b8cdf6c22172585b3b3a9452d5e4d2d591ece88e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205670

## File path: core/src/main/scala/org/apache/spark/memory/ExecutionMemoryPool.scala ##

@@ -151,7 +151,7 @@ private[memory] class ExecutionMemoryPool(
    */
   def releaseMemory(numBytes: Long, taskAttemptId: Long): Unit = lock.synchronized {
     val curMem = memoryForTask.getOrElse(taskAttemptId, 0L)
-    var memoryToFree = if (curMem < numBytes) {
+    val memoryToFree = if (curMem < numBytes) {

Review comment:
Let's not put different things in the same PR.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205706

## File path: core/src/main/scala/org/apache/spark/rdd/CoalescedRDD.scala ##

@@ -364,7 +364,7 @@ private class DefaultPartitionCoalescer(val balanceSlack: Double = 0.10)
     val partNoLocIter = partitionLocs.partsWithoutLocs.iterator
     groupArr.filter(pg => pg.numPartitions == 0).foreach { pg =>
       while (partNoLocIter.hasNext && pg.numPartitions == 0) {
-        var nxt_part = partNoLocIter.next()
+        val nxt_part = partNoLocIter.next()

Review comment:
ditto.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun commented on a change in pull request #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
URL: https://github.com/apache/spark/pull/24857#discussion_r293205471

## File path: core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolMessage.scala ##

@@ -46,9 +46,6 @@ private[rest] abstract class SubmitRestProtocolMessage {
   val action: String = messageType
   var message: String = null

-  // For JSON deserialization
-  private def setAction(a: String): Unit = { }
-

Review comment:
This was added from the [beginning](https://github.com/apache/spark/commit/6ec0cdc14390d4dc45acf31040f21e1efc476fc0#diff-fb39e366f633463136727a6b6d5b832fR52) and the comment seems to mean this is used. Shall we keep the existing one?
[GitHub] [spark] IvanVergiliev commented on issue #24783: [SPARK-27105][SQL][test-hadoop3.2] Optimize away exponential complexity in ORC predicate conversion
IvanVergiliev commented on issue #24783: [SPARK-27105][SQL][test-hadoop3.2] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24783#issuecomment-501548338 @cloud-fan cool, this sounds good to me too! I can also bring my PR back to a state similar to before I merged https://github.com/IvanVergiliev/spark/pull/2/files - with `filter` and `build` in separate functions - and then @gengliangwang can followup with the change to reuse `build` for determining whether leaf nodes are convertible?
[GitHub] [spark] IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r293204913

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala ##

@@ -362,6 +394,13 @@ object FilterPushdownBenchmark extends BenchmarkBase with SQLHelper {
     }

     runBenchmark(s"Pushdown benchmark with many filters") {
+      // This benchmark and the next one are similar in that they both test predicate pushdown
+      // where the filter itself is very large. There have been cases where the filter conversion
+      // would take minutes to hours for large filters due to it being implemented with exponential
+      // complexity in the height of the filter tree.
+      // The difference between these two benchmarks is that this one benchmarks pushdown with a
+      // large string filter (`a AND b AND c ...`), whereas the next one benchmarks pushdown with
+      // a large Column-based filter (`col(a) || (col(b) || (col(c)...))`).

Review comment:
@cloud-fan the two go through different code paths. The string-based one was added in https://github.com/apache/spark/pull/22313 , but it doesn't expose the slowness when passing a `Column` filter directly. That is, the string-based one was fast before this PR. The one this PR fixes is specifically when passing in a `Column` directly to something like `df.filter(Column)`.
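The phrase "exponential complexity in the height of the filter tree" from the benchmark comment above can be made concrete with a small cost model. This is a sketch under stated assumptions, not the ORC conversion code: `Pred`, `Leaf`, `Or`, and `ConversionCost` are hypothetical names, and the "naive" scheme simply assumes that every node converts each child twice (once to check that it is convertible, once to build the result). Under that assumption the number of conversion calls doubles with each level of a nested chain such as `col(a) || (col(b) || (col(c)...))`, whereas a single-pass scheme stays linear.

```scala
// Toy filter tree (hypothetical, not Spark's Filter or Expression classes).
sealed trait Pred
case class Leaf(name: String) extends Pred
case class Or(left: Pred, right: Pred) extends Pred

object ConversionCost {
  // Assumed naive scheme: each child is converted twice (check, then build),
  // so the call count doubles at every level of the tree.
  def naiveCalls(p: Pred): Long = p match {
    case Leaf(_) => 1L
    case Or(l, r) => 1L + 2L * (naiveCalls(l) + naiveCalls(r))
  }

  // Single-pass scheme: every node is visited exactly once.
  def singlePassCalls(p: Pred): Long = p match {
    case Leaf(_) => 1L
    case Or(l, r) => 1L + singlePassCalls(l) + singlePassCalls(r)
  }

  // Builds a right-nested chain with n leaves: c1 || (c2 || (... || cn)).
  def chain(n: Int): Pred =
    (1 until n).foldRight[Pred](Leaf(s"c$n"))((i, acc) => Or(Leaf(s"c$i"), acc))
}
```

For a chain with n leaves the naive count follows T(n) = 3 + 2·T(n-1) with T(1) = 1, which gives 2^(n+1) - 3 calls, while the single-pass count is 2n - 1. At 20 leaves that is 2,097,149 calls against 39, which is the kind of gap a benchmark over a large nested filter would be expected to expose.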
[GitHub] [spark] AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501547045 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106450/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
AmplabJenkins removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501547038 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501547045 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/106450/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
AmplabJenkins commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501547038 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
SparkQA removed a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501524364 **[Test build #106450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106450/testReport)** for PR 24841 at commit [`14203f5`](https://github.com/apache/spark/commit/14203f53604ce0b63a964e8c11288c3f9014792d).
[GitHub] [spark] SparkQA commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
SparkQA commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501546738 **[Test build #106450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106450/testReport)** for PR 24841 at commit [`14203f5`](https://github.com/apache/spark/commit/14203f53604ce0b63a964e8c11288c3f9014792d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] jiangxb1987 commented on issue #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed
jiangxb1987 commented on issue #24699: [SPARK-27666][CORE] Do not release lock while TaskContext already completed URL: https://github.com/apache/spark/pull/24699#issuecomment-501545350 LGTM
[GitHub] [spark] dongjoon-hyun edited a comment on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun edited a comment on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala URL: https://github.com/apache/spark/pull/24857#issuecomment-501543348 Ur, thank you for the update, but let's remove the `unused imports` stuff. You can get reviews for it later in another PR. It's good to have, but sometimes it's on the edge due to its intrusiveness. Also, it's beyond the scope of the PR title.
[GitHub] [spark] dongjoon-hyun commented on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala
dongjoon-hyun commented on issue #24857: [MINOR][CORE] Remove an unused variable in SparkSubmt.scala URL: https://github.com/apache/spark/pull/24857#issuecomment-501543348 Ur, thank you for the update, but let's remove the `unused imports` stuff. You can get reviews for it later in another PR. It's good to have, but sometimes it's on the edge due to its intrusiveness.
[GitHub] [spark] Ngone51 edited a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
Ngone51 edited a comment on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501538461 @viirya IIUC, an executor sets up its resources from what the worker assigned to it. For example, the worker could "split" its own resources into separate resource files according to Masters' requirements for executors. Then, an executor could set itself up from the corresponding resource file when it starts up.
[GitHub] [spark] SparkQA commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL
SparkQA commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL URL: https://github.com/apache/spark/pull/24706#issuecomment-501538447 **[Test build #106454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/106454/testReport)** for PR 24706 at commit [`5688cb4`](https://github.com/apache/spark/commit/5688cb47b5171fcb590819c101dacfb73ffde356).
[GitHub] [spark] Ngone51 commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
Ngone51 commented on issue #24841: [SPARK-27369][CORE] Setup resources when Standalone Worker starts up URL: https://github.com/apache/spark/pull/24841#issuecomment-501538461 @viirya IIUC, an executor sets up its resources from what the worker assigned to it. For example, the worker could "split" its own resources into separate resource files according to Masters' requirements for executors. Then, executors could set themselves up from those resource files when they start up.
[GitHub] [spark] AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL
AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL URL: https://github.com/apache/spark/pull/24706#issuecomment-501538146 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL
AmplabJenkins commented on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL URL: https://github.com/apache/spark/pull/24706#issuecomment-501538151 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11697/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL
AmplabJenkins removed a comment on issue #24706: [SPARK-23128][SQL] A new approach to do adaptive execution in Spark SQL URL: https://github.com/apache/spark/pull/24706#issuecomment-501538151 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/11697/ Test PASSed.