[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455451125 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,122 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455451217 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,122 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455451294 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,122 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL

[GitHub] [spark] leanken commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
leanken commented on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-659099057 @maropu Any further comments? This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] huaxingao commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
huaxingao commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455452971 ## File path: docs/sql-ref-syntax-qry-select-pivot.md ## @@ -0,0 +1,98 @@ +--- +layout: global +title: PIVOT Clause +displayTitle: PIVOT Clause

[GitHub] [spark] huaxingao commented on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-15 Thread GitBox
huaxingao commented on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-659103918 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] maropu commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-15 Thread GitBox
maropu commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659103736 ok to test This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HyukjinKwon commented on a change in pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklistin

2020-07-15 Thread GitBox
HyukjinKwon commented on a change in pull request #28874: URL: https://github.com/apache/spark/pull/28874#discussion_r455457528 ## File path: python/pyspark/cloudpickle.py ## @@ -87,8 +87,8 @@ PY2 = True PY2_WRAPPER_DESCRIPTOR_TYPE = type(object.__init__)

[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659103651 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659103651 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-658917336 Can one of the admins verify this patch? This is an automated message from the Apache Git

[GitHub] [spark] yaooqinn commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
yaooqinn commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455471147 ## File path: docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md ## @@ -0,0 +1,67 @@ +--- +layout: global +title: SET TIME ZONE +displayTitle: SET TIME

[GitHub] [spark] yaooqinn commented on a change in pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
yaooqinn commented on a change in pull request #29064: URL: https://github.com/apache/spark/pull/29064#discussion_r455470765 ## File path: docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md ## @@ -0,0 +1,67 @@ +--- +layout: global +title: SET TIME ZONE +displayTitle: SET TIME

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659119577 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] maropu edited a comment on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
maropu edited a comment on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-659122885 hm, it might be okay to support the limited optimization as a first step if it has a huge impact on the performance of common cases. But, I think the method (& parameter)

[GitHub] [spark] maropu commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
maropu commented on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-659122885 hm, it might be okay to support the limited optimization as a first step if it has a huge impact on the performance of common cases. But, I think the method names should be more

[GitHub] [spark] AmplabJenkins commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark'

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659131774 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29123: [SPARK-32283][CORE] Kryo should support multiple user registrators

2020-07-15 Thread GitBox
SparkQA commented on pull request #29123: URL: https://github.com/apache/spark/pull/29123#issuecomment-659131918 **[Test build #125928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125928/testReport)** for PR 29123 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spar

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659086627 **[Test build #125919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125919/testReport)** for PR 28287 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659141023 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659141023 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659141249 @maropu, per the documentation [Spark Project Improvement Proposals (SPIP)](http://spark.apache.org/improvement-proposals.html), if you feel like it needs an SPIP, it does.

[GitHub] [spark] xuanyuanking commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
xuanyuanking commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659141316 `Well, I guess I already explained why compactLogs is the culprit of the memory issue, right? (#28904 (comment))` Yep that's right. I'm also looking at the code in

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659148369 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659148369 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spar

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659083280 **[Test build #125918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125918/testReport)** for PR 28287 at commit

[GitHub] [spark] SparkQA commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark's blac

2020-07-15 Thread GitBox
SparkQA commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659148089 **[Test build #125918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125918/testReport)** for PR 28287 at commit

[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
agrawaldevesh commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659155613 jenkins retest this please This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] AmplabJenkins commented on pull request #29128: [SPARK-32329][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29128: URL: https://github.com/apache/spark/pull/29128#issuecomment-659155710 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659160481 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659160481 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] venkata91 edited a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spa

2020-07-15 Thread GitBox
venkata91 edited a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-658977388 > you can make a common function that has most of the code that gets called from 2 separate tests. one test passes with dynamic allocation on, the other with it off.

[GitHub] [spark] LantaoJin commented on pull request #29123: [SPARK-32283][CORE] Kryo should support multiple user registrators

2020-07-15 Thread GitBox
LantaoJin commented on pull request #29123: URL: https://github.com/apache/spark/pull/29123#issuecomment-659073795 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA removed a comment on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-658989107 **[Test build #125899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125899/testReport)** for PR 27735 at commit

[GitHub] [spark] SparkQA commented on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
SparkQA commented on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-659073387 **[Test build #125899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125899/testReport)** for PR 27735 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-659007979 **[Test build #125905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125905/testReport)** for PR 28708 at commit

[GitHub] [spark] SparkQA commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
SparkQA commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-659082726 **[Test build #125905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125905/testReport)** for PR 28708 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-659081881 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] sarutak commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-15 Thread GitBox
sarutak commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-659083037 retest this please. This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455443183 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -27,15 +27,23 @@ The `CREATE TABLE` statement defines a new table using Hive

[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659087425 **[Test build #125898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125898/testReport)** for PR 28904 at commit

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455449422 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,114 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: |

[GitHub] [spark] AmplabJenkins commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659093379 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455449524 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,114 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: |

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-659093058 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA removed a comment on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-659072030 **[Test build #125913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125913/testReport)** for PR 29115 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659020484 **[Test build #125909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125909/testReport)** for PR 29032 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29115: [SPARK-32315][ML] Provide an explanation error message when calling require

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29115: URL: https://github.com/apache/spark/pull/29115#issuecomment-659093058 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
SparkQA commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659093134 **[Test build #125922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit

[GitHub] [spark] SparkQA commented on pull request #29127: [SPARK-32327][SQL] Introduce UnresolvedTableOrPermanentView for commands that support a table and permanent view, but not a temporary view

2020-07-15 Thread GitBox
SparkQA commented on pull request #29127: URL: https://github.com/apache/spark/pull/29127#issuecomment-659093494 **[Test build #125923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125923/testReport)** for PR 29127 at commit

[GitHub] [spark] leanken commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
leanken commented on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-659125973 For example. -- Case 4 -- (one column null, other column matches a row in the subquery result -> row not returned) SELECT * FROM m WHERE b = 1.0 -- Matches

[GitHub] [spark] AmplabJenkins commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659125664 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659125664 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659135184 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659135184 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
HeartSaVioR commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659144203 I guess we have several possible approaches here: 1. place the suite to the Hive-version specific directory (with new config on pom.xml to add the test source based

[GitHub] [spark] HeartSaVioR edited a comment on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
HeartSaVioR edited a comment on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659144203 I guess we have several possible approaches here: 1. place the suite to the Hive-version specific directory (with new config on pom.xml to add the test source

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28676: URL: https://github.com/apache/spark/pull/28676#issuecomment-659154211 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] zhengruifeng commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-15 Thread GitBox
zhengruifeng commented on pull request #29095: URL: https://github.com/apache/spark/pull/29095#issuecomment-659165588 friendly ping @huaxingao @srowen @viirya Different another attempt to save RAM, this should be a clear optimization. I found that those methods can not be marked

[GitHub] [spark] viirya commented on a change in pull request #29107: [SPARK-32308][SQL] Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-15 Thread GitBox
viirya commented on a change in pull request #29107: URL: https://github.com/apache/spark/pull/29107#discussion_r455521016 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -1099,6 +1101,64 @@ object TypeCoercion {

[GitHub] [spark] zhengruifeng commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-15 Thread GitBox
zhengruifeng commented on pull request #29095: URL: https://github.com/apache/spark/pull/29095#issuecomment-659171262 @viirya This PR is only on cpu. This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] SparkQA commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659171509 **[Test build #125925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125925/testReport)** for PR 29032 at commit

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype

2020-07-15 Thread GitBox
dongjoon-hyun edited a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-659086163 I guess we can forbid that too consistently as a continuation of this approach. BTW, until now, it's beyond the scope because this PR was designed to prevent

[GitHub] [spark] dongjoon-hyun commented on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype

2020-07-15 Thread GitBox
dongjoon-hyun commented on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-659086163 I guess we can forbid that too consistently as a continuation of this approach. BTW, until now, it's beyond the scope because this PR was designed to prevent Hive

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-659091535 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659091597 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-659091535 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659091597 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] maropu commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs

2020-07-15 Thread GitBox
maropu commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455450809 ## File path: docs/sql-ref-syntax-qry-select-groupby.md ## @@ -260,6 +274,30 @@ SELECT city, car_model, sum(quantity) AS sum FROM dealer | San Jose|

[GitHub] [spark] rxin commented on pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
rxin commented on pull request #28708: URL: https://github.com/apache/spark/pull/28708#issuecomment-659095250 @rdblue for such a large change, it's pretty reasonable to get people like @tgravescs to take a look, isn't it? Maybe I missed it, but when that comment was brought up, there

[GitHub] [spark] AmplabJenkins commented on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659105687 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29123: [SPARK-32283][CORE] Kryo should support multiple user registrators

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29123: URL: https://github.com/apache/spark/pull/29123#issuecomment-659105655 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659105702 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659105692 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139670 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659139846 I just saw the comment. Thanks for summarizing @revans2. This is an automated message from the Apache Git

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139663 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
SparkQA commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-659139822 **[Test build #125931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125931/testReport)** for PR 29089 at commit

[GitHub] [spark] frankyin-factual commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
frankyin-factual commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659139939 I will also take a look. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] HyukjinKwon commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659146783 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] williamhyun opened a new pull request #29128: [SPARK-XXX][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
williamhyun opened a new pull request #29128: URL: https://github.com/apache/spark/pull/29128 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] frankyin-factual commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
frankyin-factual commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659154829 I am working on a combination of 1) and 2). Will push shortly. This is an automated message from the

[GitHub] [spark] SparkQA removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659118412 **[Test build #125926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125926/testReport)** for PR 29032 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659168327 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-15 Thread GitBox
SparkQA commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659168426 **[Test build #125936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125936/testReport)** for PR 29126 at commit

[GitHub] [spark] SparkQA commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659167771 **[Test build #125926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125926/testReport)** for PR 29032 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659065467 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-659073988 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] LantaoJin commented on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-15 Thread GitBox
LantaoJin commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659074037 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659027005 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27735: [SPARK-30985][k8s] Support propagating SPARK_CONF_DIR files to driver and executor pods.

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27735: URL: https://github.com/apache/spark/pull/27735#issuecomment-659073988 Build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] SparkQA commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-15 Thread GitBox
SparkQA commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659074097 **[Test build #125914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125914/testReport)** for PR 29101 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-659081876 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #28972: [SPARK-30794][CORE] Stage Level scheduling: Add ability to set off heap memory

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28972: URL: https://github.com/apache/spark/pull/28972#issuecomment-659081876 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29074: [SPARK-32282][SQL] Improve EnsureRquirement.reorderJoinKeys to handle more scenarios such as PartitioningCollection

2020-07-15 Thread GitBox
SparkQA commented on pull request #29074: URL: https://github.com/apache/spark/pull/29074#issuecomment-659081984 **[Test build #125917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125917/testReport)** for PR 29074 at commit

[GitHub] [spark] SparkQA commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark's blac

2020-07-15 Thread GitBox
SparkQA commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659086627 **[Test build #125919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125919/testReport)** for PR 28287 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-659086518 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-659086518 This is an automated message from the Apache Git Service. To respond to the message, please log on to
