[GitHub] [spark] AmplabJenkins commented on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845675575 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43310/ --

[GitHub] [spark] SparkQA commented on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
SparkQA commented on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845672347 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43310/ --

[GitHub] [spark] SparkQA commented on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
SparkQA commented on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845671780 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43309/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan closed pull request #32615: [SPARK-35479][SQL] Format PartitionFilters IN strings in scan nodes

2021-05-20 Thread GitBox
cloud-fan closed pull request #32615: URL: https://github.com/apache/spark/pull/32615 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] cloud-fan commented on pull request #32615: [SPARK-35479][SQL] Format PartitionFilters IN strings in scan nodes

2021-05-20 Thread GitBox
cloud-fan commented on pull request #32615: URL: https://github.com/apache/spark/pull/32615#issuecomment-845671031 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] Ngone51 commented on pull request #32590: [SPARK-35445][SQL] Reduce the execution time of DeduplicateRelations

2021-05-20 Thread GitBox
Ngone51 commented on pull request #32590: URL: https://github.com/apache/spark/pull/32590#issuecomment-845665122 thanks all! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] gengliangwang closed pull request #32590: [SPARK-35445][SQL] Reduce the execution time of DeduplicateRelations

2021-05-20 Thread GitBox
gengliangwang closed pull request #32590: URL: https://github.com/apache/spark/pull/32590 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] gengliangwang commented on pull request #32590: [SPARK-35445][SQL] Reduce the execution time of DeduplicateRelations

2021-05-20 Thread GitBox
gengliangwang commented on pull request #32590: URL: https://github.com/apache/spark/pull/32590#issuecomment-845662291 Thanks, merging to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] HyukjinKwon closed pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
HyukjinKwon closed pull request #32600: URL: https://github.com/apache/spark/pull/32600 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] HyukjinKwon commented on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
HyukjinKwon commented on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845661750 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AmplabJenkins commented on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845658495 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138781/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845658495 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138781/

[GitHub] [spark] SparkQA removed a comment on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845597607 **[Test build #138781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138781/testReport)** for PR 32600 at commit

[GitHub] [spark] SparkQA commented on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
SparkQA commented on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845657587 **[Test build #138781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138781/testReport)** for PR 32600 at commit

[GitHub] [spark] MaxGekk commented on pull request #32574: [SPARK-35427][SQL][TESTS] Check the `EXCEPTION` rebase mode for Avro/Parquet

2021-05-20 Thread GitBox
MaxGekk commented on pull request #32574: URL: https://github.com/apache/spark/pull/32574#issuecomment-845655136 @gengliangwang @HyukjinKwon Could you take a look at the PR, please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] allisonwang-db commented on a change in pull request #32606: [SPARK-35287][SQL] Allow RemoveRedundantProjects to preserve ProjectExec which generates UnsafeRow for DataSourceV2ScanRel

2021-05-20 Thread GitBox
allisonwang-db commented on a change in pull request #32606: URL: https://github.com/apache/spark/pull/32606#discussion_r636635430 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/RemoveRedundantProjectsSuite.scala ## @@ -215,6 +217,27 @@ abstract class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31830: [SPARK-34735][SQL][UI] Add modified configs for SQL execution in UI

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #31830: URL: https://github.com/apache/spark/pull/31830#issuecomment-845652568 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43304/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845652569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43308/

[GitHub] [spark] AmplabJenkins commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845652569 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43308/ --

[GitHub] [spark] AmplabJenkins commented on pull request #31830: [SPARK-34735][SQL][UI] Add modified configs for SQL execution in UI

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #31830: URL: https://github.com/apache/spark/pull/31830#issuecomment-845652568 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43304/ --

[GitHub] [spark] SparkQA commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845652182 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43308/ --

[GitHub] [spark] MaxGekk commented on pull request #32574: [SPARK-35427][SQL][TESTS] Check the `EXCEPTION` rebase mode for Avro/Parquet

2021-05-20 Thread GitBox
MaxGekk commented on pull request #32574: URL: https://github.com/apache/spark/pull/32574#issuecomment-845649011 @cloud-fan Any objections to the changes? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] SparkQA commented on pull request #31830: [SPARK-34735][SQL][UI] Add modified configs for SQL execution in UI

2021-05-20 Thread GitBox
SparkQA commented on pull request #31830: URL: https://github.com/apache/spark/pull/31830#issuecomment-845645932 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43304/ -- This is an automated message from the

[GitHub] [spark] itholic commented on a change in pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
itholic commented on a change in pull request #32546: URL: https://github.com/apache/spark/pull/32546#discussion_r636625094 ## File path: docs/sql-data-sources-orc.md ## @@ -172,3 +172,29 @@ When reading from Hive metastore ORC tables and inserting to Hive metastore ORC

[GitHub] [spark] chrismbryant commented on pull request #27278: [SPARK-30569][SQL][PYSPARK][SPARKR] Add percentile_approx DSL functions.

2021-05-20 Thread GitBox
chrismbryant commented on pull request #27278: URL: https://github.com/apache/spark/pull/27278#issuecomment-845642088 @HyukjinKwon Thanks, here's that ticket: https://issues.apache.org/jira/browse/SPARK-35480 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] HyukjinKwon commented on pull request #32577: [SPARK-35422][SQL] Fix plan-printing issues to pass the TPCDS plan stability tests in Scala v2.13

2021-05-20 Thread GitBox
HyukjinKwon commented on pull request #32577: URL: https://github.com/apache/spark/pull/32577#issuecomment-845639671 Nice, LGTM2 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] yaooqinn commented on a change in pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
yaooqinn commented on a change in pull request #32600: URL: https://github.com/apache/spark/pull/32600#discussion_r636621258 ## File path: core/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala ## @@ -104,7 +104,7 @@ private[spark] class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845638217 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43305/

[GitHub] [spark] AmplabJenkins commented on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845638217 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43305/ --

[GitHub] [spark] SparkQA commented on pull request #32600: [SPARK-35456][CORE] Print the invalid value in config validation error message

2021-05-20 Thread GitBox
SparkQA commented on pull request #32600: URL: https://github.com/apache/spark/pull/32600#issuecomment-845638203 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43305/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845634701 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138770/

[GitHub] [spark] SparkQA removed a comment on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845535109 **[Test build #138770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138770/testReport)** for PR 32586 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845635353 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138788/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845634865 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138786/

[GitHub] [spark] SparkQA removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845635027 **[Test build #138788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138788/testReport)** for PR 32611 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-845634702 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43302/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845634703 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43300/

[GitHub] [spark] SparkQA removed a comment on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845619416 **[Test build #138786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138786/testReport)** for PR 32609 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845635353 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138788/ -- This

[GitHub] [spark] SparkQA commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845635337 **[Test build #138788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138788/testReport)** for PR 32611 at commit

[GitHub] [spark] SparkQA commented on pull request #32587: [SPARK-35440][SQL] Add function type to `ExpressionInfo` for UDF

2021-05-20 Thread GitBox
SparkQA commented on pull request #32587: URL: https://github.com/apache/spark/pull/32587#issuecomment-845635076 **[Test build #138789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138789/testReport)** for PR 32587 at commit

[GitHub] [spark] SparkQA commented on pull request #31830: [SPARK-34735][SQL][UI] Add modified configs for SQL execution in UI

2021-05-20 Thread GitBox
SparkQA commented on pull request #31830: URL: https://github.com/apache/spark/pull/31830#issuecomment-845635070 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43304/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845635027 **[Test build #138788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138788/testReport)** for PR 32611 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845634865 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138786/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-845634702 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43302/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845634701 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138770/ -- This

[GitHub] [spark] SparkQA commented on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
SparkQA commented on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845634723 **[Test build #138786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138786/testReport)** for PR 32609 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845634703 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43300/ --

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32610: [SPARK-35460][K8S] invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang

2021-05-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #32610: URL: https://github.com/apache/spark/pull/32610#discussion_r636616275 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -250,11 +250,21 @@ private[spark]

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun edited a comment on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845632925 Thank you, @itholic and @HyukjinKwon . The refactoring idea looks good to me. I commented only a technical issue about the link usage. I'll leave this to

[GitHub] [spark] dongjoon-hyun commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845632925 Thank you, @itholic and @HyukjinKwon . The refactoring idea looks good to me. I commented only a technical issue about the link usage. -- This is an automated message

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #32546: URL: https://github.com/apache/spark/pull/32546#discussion_r636615565 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -874,23 +874,10 @@ class DataFrameReader

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #32546: URL: https://github.com/apache/spark/pull/32546#discussion_r636614962 ## File path: docs/sql-data-sources-orc.md ## @@ -172,3 +172,29 @@ When reading from Hive metastore ORC tables and inserting to Hive metastore ORC

[GitHub] [spark] SparkQA commented on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
SparkQA commented on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845632152 **[Test build #138770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138770/testReport)** for PR 32586 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #32546: URL: https://github.com/apache/spark/pull/32546#discussion_r636615149 ## File path: python/pyspark/sql/readwriter.py ## @@ -793,28 +793,13 @@ def orc(self, path, mergeSchema=None, pathGlobFilter=None,

[GitHub] [spark] yaooqinn commented on a change in pull request #32610: [SPARK-35460][K8S] invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang

2021-05-20 Thread GitBox
yaooqinn commented on a change in pull request #32610: URL: https://github.com/apache/spark/pull/32610#discussion_r636615055 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -250,11 +250,21 @@ private[spark] object

[GitHub] [spark] dongjoon-hyun commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
dongjoon-hyun commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845630052 Thank you for pinging me, @HyukjinKwon . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
SparkQA commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845629618 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43300/ -- This is an automated message from the

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32589: [SPARK-35444][SQL] Imporve the logic of createTable if table already exist and ignoreIfExists=true

2021-05-20 Thread GitBox
HyukjinKwon commented on a change in pull request #32589: URL: https://github.com/apache/spark/pull/32589#discussion_r636613685 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ## @@ -367,6 +367,7 @@ class SessionCatalog(

[GitHub] [spark] SparkQA commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
SparkQA commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-845629104 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43302/ -- This is an automated message from the

[GitHub] [spark] xinrong-databricks commented on a change in pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
xinrong-databricks commented on a change in pull request #32611: URL: https://github.com/apache/spark/pull/32611#discussion_r636607352 ## File path: python/pyspark/pandas/data_type_ops/num_ops.py ## @@ -16,31 +16,46 @@ # import numbers -from typing import TYPE_CHECKING,

[GitHub] [spark] SparkQA commented on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
SparkQA commented on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845620740 **[Test build #138787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138787/testReport)** for PR 32161 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845620125 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43301/

[GitHub] [spark] AmplabJenkins commented on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845620125 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43301/ --

[GitHub] [spark] SparkQA commented on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
SparkQA commented on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845620098 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43301/ --

[GitHub] [spark] xinrong-databricks commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
xinrong-databricks commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845619671 @HyukjinKwon Certainly, examples are added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA commented on pull request #32609: [SPARK-29223][SQL][SS] New option to specify timestamp on all subscribing topic-partitions in Kafka source

2021-05-20 Thread GitBox
SparkQA commented on pull request #32609: URL: https://github.com/apache/spark/pull/32609#issuecomment-845619416 **[Test build #138786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138786/testReport)** for PR 32609 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845618643 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138785/

[GitHub] [spark] SparkQA removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845618215 **[Test build #138785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138785/testReport)** for PR 32611 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845618643 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138785/ -- This

[GitHub] [spark] SparkQA commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845618630 **[Test build #138785 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138785/testReport)** for PR 32611 at commit

[GitHub] [spark] SparkQA commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
SparkQA commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845618215 **[Test build #138785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138785/testReport)** for PR 32611 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845596330 **[Test build #138777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138777/testReport)** for PR 32236 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845617484 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138777/

[GitHub] [spark] AmplabJenkins commented on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845617484 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138777/ -- This

[GitHub] [spark] SparkQA commented on pull request #32546: [SPARK-35395][DOCS] Move ORC data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
SparkQA commented on pull request #32546: URL: https://github.com/apache/spark/pull/32546#issuecomment-845617446 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43300/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-05-20 Thread GitBox
SparkQA commented on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-845617297 **[Test build #138777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138777/testReport)** for PR 32236 at commit

[GitHub] [spark] SparkQA commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
SparkQA commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-845617127 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43302/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845615196 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138767/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845615298 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43288/

[GitHub] [spark] HyukjinKwon commented on pull request #32595: [SPARK-35449][SQL] Only extract common expressions from CaseWhen values if elseValue is set

2021-05-20 Thread GitBox
HyukjinKwon commented on pull request #32595: URL: https://github.com/apache/spark/pull/32595#issuecomment-845616346 @Kimahriman would you mind describing what behaviour change (bug fix) happens in "Does this PR introduce any user-facing change?"? The fix itself seems to make sense but it

[GitHub] [spark] SparkQA commented on pull request #32204: [SPARK-34494][SQL][DOCS] Move JSON data source options from Python and Scala into a single page

2021-05-20 Thread GitBox
SparkQA commented on pull request #32204: URL: https://github.com/apache/spark/pull/32204#issuecomment-845616015 **[Test build #138784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138784/testReport)** for PR 32204 at commit

[GitHub] [spark] cloud-fan commented on pull request #32391: [SPARK-35264][SQL] Support AQE side broadcastJoin threshold

2021-05-20 Thread GitBox
cloud-fan commented on pull request #32391: URL: https://github.com/apache/spark/pull/32391#issuecomment-845616002 To add a bit more color: The static size estimation in Spark is usually underestimated, due to things like file compression. We can set the AQE broadcast threshold a bit

[GitHub] [spark] SparkQA commented on pull request #32616: [SPARK-35454][SQL] One LogicalPlan can match multiple dataset ids

2021-05-20 Thread GitBox
SparkQA commented on pull request #32616: URL: https://github.com/apache/spark/pull/32616#issuecomment-845615736 **[Test build #138783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138783/testReport)** for PR 32616 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32611: [SPARK-35314][PYTHON] Support arithmetic operations against bool IndexOpsMixin

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32611: URL: https://github.com/apache/spark/pull/32611#issuecomment-845615298 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43288/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845615196 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138767/ -- This

[GitHub] [spark] weixiuli commented on pull request #32571: [SPARK-35424][SHUFFLE] Remove some useless code in the ExternalBlockHandler.

2021-05-20 Thread GitBox
weixiuli commented on pull request #32571: URL: https://github.com/apache/spark/pull/32571#issuecomment-845613438 Thank you so much! @HyukjinKwon @srowen @mridulm -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] maropu commented on a change in pull request #32616: [SPARK-35454][SQL] One LogicalPlan can match multiple dataset ids

2021-05-20 Thread GitBox
maropu commented on a change in pull request #32616: URL: https://github.com/apache/spark/pull/32616#discussion_r636598360 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -231,9 +231,10 @@ class Dataset[T] private[sql]( case _ =>

[GitHub] [spark] SparkQA removed a comment on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845504266 **[Test build #138767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138767/testReport)** for PR 32586 at commit

[GitHub] [spark] SparkQA commented on pull request #32586: [SPARK-35439][SQL] Children subexpr should come first than parent subexpr

2021-05-20 Thread GitBox
SparkQA commented on pull request #32586: URL: https://github.com/apache/spark/pull/32586#issuecomment-845610984 **[Test build #138767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138767/testReport)** for PR 32586 at commit

[GitHub] [spark] itholic commented on pull request #32516: [SPARK-35364][PYTHON] Renaming the existing Koalas related codes

2021-05-20 Thread GitBox
itholic commented on pull request #32516: URL: https://github.com/apache/spark/pull/32516#issuecomment-845603789 Thanks for all the reviews :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] ulysses-you commented on pull request #32391: [SPARK-35264][SQL] Support AQE side broadcastJoin threshold

2021-05-20 Thread GitBox
ulysses-you commented on pull request #32391: URL: https://github.com/apache/spark/pull/32391#issuecomment-845603097 @Gabriel39 I guess you misunderstand the logic of AQE. > AQE should not optimize it to other join type since static stats (e.g sizeInBytes) is always larger or equal

[GitHub] [spark] Ngone51 commented on pull request #32616: [SPARK-35454][SQL] One LogicalPlan can match multiple dataset ids

2021-05-20 Thread GitBox
Ngone51 commented on pull request #32616: URL: https://github.com/apache/spark/pull/32616#issuecomment-845602413 cc @cloud-fan @maropu Please take a look, thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] Ngone51 opened a new pull request #32616: [SPARK-35454][SQL] One LogicalPlan can match multiple dataset ids

2021-05-20 Thread GitBox
Ngone51 opened a new pull request #32616: URL: https://github.com/apache/spark/pull/32616 ### What changes were proposed in this pull request? Change the type of `DATASET_ID_TAG` from `Long` to `HashSet[Long]` to allow the logical plan to match multiple datasets.
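
As a rough illustration of the idea in the description above (one tag value versus a set of values), here is a minimal Python sketch. It is not Spark's Scala code: the `LogicalPlan` class, the `tags` dict, and the helpers `tag_with_single_id`, `tag_with_id_set`, and `matches` are all hypothetical names invented for this example. Only the tag name `DATASET_ID_TAG` and the move from a single `Long` to a `HashSet[Long]` come from the PR text.

```python
from dataclasses import dataclass, field

# Tag key named after the PR description; everything else here is hypothetical.
DATASET_ID_TAG = "dataset_id"


@dataclass
class LogicalPlan:
    """Toy stand-in for a plan node that carries arbitrary tags."""
    name: str
    tags: dict = field(default_factory=dict)


def tag_with_single_id(plan: LogicalPlan, dataset_id: int) -> None:
    # Single-value tag: a second dataset overwrites the first one's id.
    plan.tags[DATASET_ID_TAG] = dataset_id


def tag_with_id_set(plan: LogicalPlan, dataset_id: int) -> None:
    # Set-valued tag: every dataset sharing the plan adds its own id.
    plan.tags.setdefault(DATASET_ID_TAG, set()).add(dataset_id)


def matches(plan: LogicalPlan, dataset_id: int) -> bool:
    # Membership test against the set-valued tag.
    return dataset_id in plan.tags.get(DATASET_ID_TAG, set())


# With a single value, only the last writer is remembered.
single = LogicalPlan("Project")
tag_with_single_id(single, 1)
tag_with_single_id(single, 2)

# With a set, both datasets remain associated with the shared plan.
shared = LogicalPlan("Project")
tag_with_id_set(shared, 1)
tag_with_id_set(shared, 2)
```

In this sketch the single-value variant ends up holding only id `2`, while the set-valued variant lets `matches(shared, 1)` and `matches(shared, 2)` both succeed, which is the behaviour the PR title describes.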

[GitHub] [spark] SparkQA removed a comment on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
SparkQA removed a comment on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845596403 **[Test build #138779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138779/testReport)** for PR 32161 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
AmplabJenkins removed a comment on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845597711 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138779/

[GitHub] [spark] AmplabJenkins commented on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
AmplabJenkins commented on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845597711 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138779/ -- This

[GitHub] [spark] SparkQA commented on pull request #32161: [SPARK-35025][SQL][PYTHON][DOCS] Move Parquet data source options from Python and Scala into a single page.

2021-05-20 Thread GitBox
SparkQA commented on pull request #32161: URL: https://github.com/apache/spark/pull/32161#issuecomment-845597698 **[Test build #138779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138779/testReport)** for PR 32161 at commit
