[GitHub] [spark] AmplabJenkins commented on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-658859828 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29090: [SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29090: URL: https://github.com/apache/spark/pull/29090#issuecomment-658859821 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29085: URL: https://github.com/apache/spark/pull/29085#issuecomment-658859834 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29080: [WIP][SPARK-32271][ML] Add option for k-fold cross-validation to CrossValidator

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658859802 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-658859622 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658859789 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-658859544 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-658859641 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-658860156 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-658860092 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658860051 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-658860126 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-658859990 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-658860191 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29080: [WIP][SPARK-32271][ML] Add option for k-fold cross-validation to CrossValidator

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29080: URL: https://github.com/apache/spark/pull/29080#issuecomment-658859802 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-658860285 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-658860271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-658860285 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-658860092 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-658860156 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-658859828 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-658860191 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29085: URL: https://github.com/apache/spark/pull/29085#issuecomment-658859834 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658859789 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29090: [SPARK-32293] Fix inconsistency between Spark memory configs and JVM option

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29090: URL: https://github.com/apache/spark/pull/29090#issuecomment-658859821 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-658860126 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-658859990 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-658860271 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658860051 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658860452 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-658860443 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-658860583 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-658860583 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-658860452 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-658860443 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658861923 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-15 Thread GitBox
SparkQA commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-658862178 **[Test build #125895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125895/testReport)** for PR 29002 at commit [`d768385`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
SparkQA commented on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658861887 **[Test build #125890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125890/testReport)** for PR 29074 at commit [`ab237bc`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658857778 **[Test build #125890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125890/testReport)** for PR 29074 at commit [`ab237bc`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658861923 Build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-658861935 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455176092 ## File path: docs/sql-ref-syntax-qry-select.md ## @@ -74,6 +74,12 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] } A

[GitHub] [spark] tgravescs commented on a change in pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
tgravescs commented on a change in pull request #28287: URL: https://github.com/apache/spark/pull/28287#discussion_r455153176 ## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ## @@ -789,6 +811,28 @@ private[spark] class ExecutorAllocationManage

[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455178397 ## File path: docs/sql-ref-syntax-qry.md ## @@ -45,4 +45,7 @@ ability to generate logical and physical plan for a given query using * [TABLESAMPLE

[GitHub] [spark] holdenk commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
holdenk commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r455181019 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ## @@ -1285,6 +1314,9 @@ private[spark] class BlockManager( require(b

[GitHub] [spark] tgravescs commented on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
tgravescs commented on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-658871451 test this please This is an automated message from the Apache Git Service. To respond to the message, please l

[GitHub] [spark] holdenk commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
holdenk commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r455184084 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala ## @@ -0,0 +1,230 @@ +/* + * Licensed to the Ap

[GitHub] [spark] tgravescs commented on pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature.

2020-07-15 Thread GitBox
tgravescs commented on pull request #28874: URL: https://github.com/apache/spark/pull/28874#issuecomment-658871868 Weird, let me try to commit again. It was reporting that it couldn't do it, but I also saw this PR was stuck and wouldn't say whether it was mergeable

[GitHub] [spark] holdenk commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
holdenk commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r455184586 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala ## @@ -0,0 +1,230 @@ +/* + * Licensed to the Ap

[GitHub] [spark] tgravescs commented on pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature.

2020-07-15 Thread GitBox
tgravescs commented on pull request #28874: URL: https://github.com/apache/spark/pull/28874#issuecomment-658872615 Thanks @xkrogen, merged to master. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-658857420 **[Test build #125882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125882/testReport)** for PR 29117 at commit [`843c9f0`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
SparkQA commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-658874043 **[Test build #125882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125882/testReport)** for PR 29117 at commit [`843c9f0`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-658875176 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-658874216 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-658874216 Build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] SparkQA commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-15 Thread GitBox
SparkQA commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-658874516 **[Test build #125896 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125896/testReport)** for PR 28977 at commit [`9600708`](https://github.com

[GitHub] [spark] holdenk commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
holdenk commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r455188725 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala ## @@ -1866,13 +1903,57 @@ class BlockManagerSuite extends SparkFunSuit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-658874226 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-658875176 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
SparkQA commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658876843 **[Test build #125886 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125886/testReport)** for PR 29089 at commit [`ba6a1bb`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658876982 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125886/

[GitHub] [spark] xkrogen commented on pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature.

2020-07-15 Thread GitBox
xkrogen commented on pull request #28874: URL: https://github.com/apache/spark/pull/28874#issuecomment-658877306 Thanks a lot @tgravescs ! This is an automated message from the Apache Git Service. To respond to the message, p

[GitHub] [spark] SparkQA commented on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.

2020-07-15 Thread GitBox
SparkQA commented on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-658877230 **[Test build #125897 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125897/testReport)** for PR 28917 at commit [`ec0d8d0`](https://github.com

[GitHub] [spark] SparkQA removed a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658857595 **[Test build #125886 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125886/testReport)** for PR 29089 at commit [`ba6a1bb`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658876966 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658876966 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond t

[GitHub] [spark] cloud-fan commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
cloud-fan commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455191578 ## File path: docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md ## @@ -0,0 +1,67 @@ +--- +layout: global +title: SET TIME ZONE +displayTitle: SET TIME Z

[GitHub] [spark] cloud-fan commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
cloud-fan commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455192200 ## File path: docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md ## @@ -0,0 +1,67 @@ +--- +layout: global +title: SET TIME ZONE +displayTitle: SET TIME Z

[GitHub] [spark] cloud-fan commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
cloud-fan commented on a change in pull request #29045: URL: https://github.com/apache/spark/pull/29045#discussion_r455190828 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala ## @@ -112,25 +112,26 @@ case c

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658876982 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] cloud-fan commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
cloud-fan commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455192732 ## File path: docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md ## @@ -0,0 +1,67 @@ +--- +layout: global +title: SET TIME ZONE +displayTitle: SET TIME Z

[GitHub] [spark] c21 commented on a change in pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-15 Thread GitBox
c21 commented on a change in pull request #29079: URL: https://github.com/apache/spark/pull/29079#discussion_r455193119 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2659,7 +2660,19 @@ object SQLConf { buildConf("spark.sql.b

[GitHub] [spark] cloud-fan commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
cloud-fan commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455194517 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ## @@ -90,6 +92,41 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455195732 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonFilters.scala ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [spark] asfgit closed pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature.

2020-07-15 Thread GitBox
asfgit closed pull request #28874: URL: https://github.com/apache/spark/pull/28874 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-658883356 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-658883356 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-658884309 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/30507/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-657480229 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] cloud-fan commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
cloud-fan commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658884805 In general, I think we can remove sort if it doesn't affect the final output ordering. The case caught by @dongjoon-hyun is a good example: the final output ordering changes a

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455199600 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/StructFiltersSuite.scala ## @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Soft

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455199981 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/StructFiltersSuite.scala ## @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Soft

[GitHub] [spark] AmplabJenkins commented on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658887689 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29111: [SPARK-29292][SQL][ML] Update rest of default modules (Hive, ML, etc) for Scala 2.13 compilation

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29111: URL: https://github.com/apache/spark/pull/29111#issuecomment-658887689 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] venkata91 commented on a change in pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
venkata91 commented on a change in pull request #28287: URL: https://github.com/apache/spark/pull/28287#discussion_r455204743 ## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ## @@ -696,7 +717,8 @@ private[spark] class ExecutorAllocationManager

[GitHub] [spark] venkata91 commented on a change in pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
venkata91 commented on a change in pull request #28287: URL: https://github.com/apache/spark/pull/28287#discussion_r455205081 ## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ## @@ -829,12 +873,28 @@ private[spark] class ExecutorAllocationManag

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455205295 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmark.scala ## @@ -508,6 +548,9 @@ object JsonBenchmark ext

[GitHub] [spark] huaxingao closed pull request #28960: [SPARK-32140][ML][PySpark] Add training summary to FMClassificationModel

2020-07-15 Thread GitBox
huaxingao closed pull request #28960: URL: https://github.com/apache/spark/pull/28960 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] venkata91 commented on a change in pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
venkata91 commented on a change in pull request #28287: URL: https://github.com/apache/spark/pull/28287#discussion_r455205236 ## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ## @@ -289,13 +290,27 @@ private[spark] class ExecutorAllocationManag

[GitHub] [spark] huaxingao commented on pull request #28960: [SPARK-32140][ML][PySpark] Add training summary to FMClassificationModel

2020-07-15 Thread GitBox
huaxingao commented on pull request #28960: URL: https://github.com/apache/spark/pull/28960#issuecomment-658892660 Merged to master. Thanks @srowen @zhengruifeng for reviewing! This is an automated message from the Apache Git

[GitHub] [spark] cloud-fan opened a new pull request #29125: [SPARK-32018][SQL] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
cloud-fan opened a new pull request #29125: URL: https://github.com/apache/spark/pull/29125 partially backport https://github.com/apache/spark/pull/29026 This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455208221 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonFilters.scala ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
cloud-fan commented on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-658894290 cc @dongjoon-hyun @viirya This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [spark] venkata91 commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark's bl

2020-07-15 Thread GitBox
venkata91 commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-658896316 > it would also be nice to add a test to make sure the abort timer still works as expected. Please check TaskSchedulerImplSuite @tgravescs Do you mean a test where alloc

[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-658896338 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] MaxGekk commented on a change in pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
MaxGekk commented on a change in pull request #27366: URL: https://github.com/apache/spark/pull/27366#discussion_r455213475 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonFilters.scala ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Softwa

[GitHub] [spark] SaurabhChawla100 commented on a change in pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
SaurabhChawla100 commented on a change in pull request #29045: URL: https://github.com/apache/spark/pull/29045#discussion_r455213803 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcPartitionReaderFactory.scala ## @@ -112,25 +112,26 @@

[GitHub] [spark] rdblue commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
rdblue commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-658899476 > To generate small final Parquet/ORC files, we do the above tricks, don't we? We don't rely on this. Our recommendation to users is to add a global sort to distribute the

[GitHub] [spark] Fokko commented on pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-15 Thread GitBox
Fokko commented on pull request #29121: URL: https://github.com/apache/spark/pull/29121#issuecomment-658900326 I've added the error code to Flake8. This was already in the pipeline, but only looking for a few allowed-listed error codes. Sometimes we have to suppress the warning becaus
