[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32824: [SPARK-30993][FOLLOWUP][SQL] Refactor LocalDateTimeUDT as YearUDT in UserDefinedTypeSuite

2021-06-09 Thread GitBox
dongjoon-hyun edited a comment on pull request #32824: URL: https://github.com/apache/spark/pull/32824#issuecomment-857401557 Hi, @gengliangwang. SPARK-30993 was released a long time ago in 2.4.6 and 3.0.0. If you don't mind, could you use a new JIRA instead of the follow-up next

[GitHub] [spark] SparkQA commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857404004 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44053/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-06-09 Thread GitBox
SparkQA commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-857411039 **[Test build #139529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139529/testReport)** for PR 32582 at commit

[GitHub] [spark] cloud-fan commented on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
cloud-fan commented on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857411344 looks reasonable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32826: [SPARK-35670][BUILD] Upgrade ZSTD-JNI to 1.5.0-1

2021-06-09 Thread GitBox
dongjoon-hyun edited a comment on pull request #32826: URL: https://github.com/apache/spark/pull/32826#issuecomment-857411986 Thank you for your efforts. BTW, @dchristle . Please note that your ORC PR is not about ZSTD-JNI. It's native ZSTD library only. I commented on your ORC PR about

[GitHub] [spark] itholic commented on a change in pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

2021-06-09 Thread GitBox
itholic commented on a change in pull request #32835: URL: https://github.com/apache/spark/pull/32835#discussion_r648002946 ## File path: python/docs/source/getting_started/ps_install.rst ## @@ -47,20 +47,20 @@ To put your self inside this environment run:: conda

[GitHub] [spark] AmplabJenkins commented on pull request #32693: [SPARK-35556][SQL][TESTS] Avoid log NoSuchMethodError when running multiple Hive version related tests

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32693: URL: https://github.com/apache/spark/pull/32693#issuecomment-857417817 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857428563 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139545/ -- This

[GitHub] [spark] HeartSaVioR edited a comment on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
HeartSaVioR edited a comment on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857426847 Please be aware of "existing state" when dealing with stateful operation. Existing states should have been stored as not-reordered one, hence once we change the key

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857428563 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139545/

[GitHub] [spark] c21 commented on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
c21 commented on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857436089 @HeartSaVioR - yeah, thanks for pointing it out. It seems that state store compatibility is the deal breaker here. > The different output of reorder breaks compatibility

[GitHub] [spark] HeartSaVioR commented on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
HeartSaVioR commented on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857438284 >> The different output of reorder breaks compatibility again. > Are you referring to the output for join, or something else? Output for join should not be affected.

[GitHub] [spark] SparkQA commented on pull request #32814: [SPARK-35664][SQL] Support java.time.LocalDateTime as an external type of TimestampWithoutTZ type

2021-06-09 Thread GitBox
SparkQA commented on pull request #32814: URL: https://github.com/apache/spark/pull/32814#issuecomment-857438239 **[Test build #139520 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139520/testReport)** for PR 32814 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #32833: [SPARK-35687][SQL][TEST] PythonUDFSuite move assume into its methods

2021-06-09 Thread GitBox
HyukjinKwon commented on pull request #32833: URL: https://github.com/apache/spark/pull/32833#issuecomment-857438967 Merged to master, branch-3.1 and branch-3.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] viirya commented on a change in pull request #32807: [SPARK-35669][SQL] Fix special char in CSV header with filter pushdown

2021-06-09 Thread GitBox
viirya commented on a change in pull request #32807: URL: https://github.com/apache/spark/pull/32807#discussion_r648019398 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/sources/filters.scala ## @@ -35,8 +35,9 @@ sealed abstract class Filter { /** *

[GitHub] [spark] HyukjinKwon closed pull request #32833: [SPARK-35687][SQL][TEST] PythonUDFSuite move assume into its methods

2021-06-09 Thread GitBox
HyukjinKwon closed pull request #32833: URL: https://github.com/apache/spark/pull/32833 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] SparkQA removed a comment on pull request #32814: [SPARK-35664][SQL] Support java.time.LocalDateTime as an external type of TimestampWithoutTZ type

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32814: URL: https://github.com/apache/spark/pull/32814#issuecomment-857322106 **[Test build #139520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139520/testReport)** for PR 32814 at commit

[GitHub] [spark] SparkQA commented on pull request #32763: [SPARK-35058][SQL] Group exception messages in hive/client

2021-06-09 Thread GitBox
SparkQA commented on pull request #32763: URL: https://github.com/apache/spark/pull/32763#issuecomment-857443803 **[Test build #139522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139522/testReport)** for PR 32763 at commit

[GitHub] [spark] SparkQA commented on pull request #32800: [SPARK-35661][SQL] Allow deserialized off-heap memory entry

2021-06-09 Thread GitBox
SparkQA commented on pull request #32800: URL: https://github.com/apache/spark/pull/32800#issuecomment-857452986 **[Test build #139534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139534/testReport)** for PR 32800 at commit

[GitHub] [spark] itholic edited a comment on pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

2021-06-09 Thread GitBox
itholic edited a comment on pull request #32835: URL: https://github.com/apache/spark/pull/32835#issuecomment-857452257 I've refined some sentences by simply using the term `pandas-on-Spark`. For example, if there are other words after `pandas APIs on Spark` e.g. `pandas APIs on

[GitHub] [spark] itholic edited a comment on pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

2021-06-09 Thread GitBox
itholic edited a comment on pull request #32835: URL: https://github.com/apache/spark/pull/32835#issuecomment-857452257 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA commented on pull request #32738: [SPARK-35474] Enable disallow_untyped_defs mypy check for pyspark.pandas.indexing.

2021-06-09 Thread GitBox
SparkQA commented on pull request #32738: URL: https://github.com/apache/spark/pull/32738#issuecomment-857452732 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44063/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #32800: [SPARK-35661][SQL] Allow deserialized off-heap memory entry

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32800: URL: https://github.com/apache/spark/pull/32800#issuecomment-857366545 **[Test build #139534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139534/testReport)** for PR 32800 at commit

[GitHub] [spark] beliefer commented on pull request #32763: [SPARK-35058][SQL] Group exception messages in hive/client

2021-06-09 Thread GitBox
beliefer commented on pull request #32763: URL: https://github.com/apache/spark/pull/32763#issuecomment-857452590 ping @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857467717 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139551/ -- This

[GitHub] [spark] sunchao commented on a change in pull request #32764: [SPARK-35390][SQL] Handle type coercion when resolving V2 functions

2021-06-09 Thread GitBox
sunchao commented on a change in pull request #32764: URL: https://github.com/apache/spark/pull/32764#discussion_r648050362 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -2189,7 +2194,7 @@ class Analyzer(override val

[GitHub] [spark] SparkQA commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
SparkQA commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857467418 **[Test build #139551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139551/testReport)** for PR 32830 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857456683 **[Test build #139551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139551/testReport)** for PR 32830 at commit

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-06-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-857472139 **[Test build #139527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139527/testReport)** for PR 32473 at commit

[GitHub] [spark] SparkQA commented on pull request #32800: [SPARK-35661][SQL] Allow deserialized off-heap memory entry

2021-06-09 Thread GitBox
SparkQA commented on pull request #32800: URL: https://github.com/apache/spark/pull/32800#issuecomment-857478591 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44060/ -- This is an automated message from the

[GitHub] [spark] wangyum commented on a change in pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
wangyum commented on a change in pull request #32781: URL: https://github.com/apache/spark/pull/32781#discussion_r648063132 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ## @@ -712,7 +712,9 @@ abstract class SparkStrategies extends

[GitHub] [spark] gengliangwang commented on a change in pull request #32838: [SPARK-35694][INFRA] Increase the default JVM stack size of SBT/Maven

2021-06-09 Thread GitBox
gengliangwang commented on a change in pull request #32838: URL: https://github.com/apache/spark/pull/32838#discussion_r648062670 ## File path: build/sbt ## @@ -53,7 +53,7 @@ realpath () { declare -r noshare_opts="-Dsbt.global.base=project/.sbtboot

[GitHub] [spark] cloud-fan commented on a change in pull request #32807: [SPARK-35669][SQL] Fix special char in CSV header with filter pushdown

2021-06-09 Thread GitBox
cloud-fan commented on a change in pull request #32807: URL: https://github.com/apache/spark/pull/32807#discussion_r648082097 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -699,20 +699,25 @@ abstract class

[GitHub] [spark] AmplabJenkins commented on pull request #32839: [SPARK-35679][SQL][WIP] instantToMicros overflow

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32839: URL: https://github.com/apache/spark/pull/32839#issuecomment-857496298 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] JkSelf commented on a change in pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
JkSelf commented on a change in pull request #32781: URL: https://github.com/apache/spark/pull/32781#discussion_r648082801 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -1782,4 +1782,36 @@ class

[GitHub] [spark] wangyum commented on a change in pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
wangyum commented on a change in pull request #32781: URL: https://github.com/apache/spark/pull/32781#discussion_r648086922 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala ## @@ -86,11 +86,15 @@ case object

[GitHub] [spark] SparkQA commented on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
SparkQA commented on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857500074 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44067/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-06-09 Thread GitBox
cloud-fan commented on a change in pull request #32401: URL: https://github.com/apache/spark/pull/32401#discussion_r648092216 ## File path: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java ## @@ -164,6 +180,13 @@ public void write(Iterator>

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-857507860 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139533/

[GitHub] [spark] gengliangwang opened a new pull request #32840: [SPARK-35674][SQL][TESTS] Test timestamp without time zone in UDF

2021-06-09 Thread GitBox
gengliangwang opened a new pull request #32840: URL: https://github.com/apache/spark/pull/32840 ### What changes were proposed in this pull request? Write tests for timestamp without time zone in UDF as input parameters and results. ### Why are the changes needed?

[GitHub] [spark] SparkQA commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857533584 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44071/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #32836: [SPARK-35693][SS][TEST] Add plan check for stream-stream join unit test

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32836: URL: https://github.com/apache/spark/pull/32836#issuecomment-857538157 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44075/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32836: [SPARK-35693][SS][TEST] Add plan check for stream-stream join unit test

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32836: URL: https://github.com/apache/spark/pull/32836#issuecomment-857538157 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44075/

[GitHub] [spark] SparkQA commented on pull request #31102: [SPARK-34054][CORE] BlockManagerDecommissioner code cleanup

2021-06-09 Thread GitBox
SparkQA commented on pull request #31102: URL: https://github.com/apache/spark/pull/31102#issuecomment-857541886 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44072/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32835: URL: https://github.com/apache/spark/pull/32835#issuecomment-857541540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44069/

[GitHub] [spark] sarutak commented on a change in pull request #32786: [SPARK-35296][SQL] Allow Dataset.observe to work even if CollectMetricsExec in a task handles multiple partitions.

2021-06-09 Thread GitBox
sarutak commented on a change in pull request #32786: URL: https://github.com/apache/spark/pull/32786#discussion_r648145444 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/AggregatingAccumulator.scala ## @@ -95,13 +95,13 @@ class AggregatingAccumulator

[GitHub] [spark] SparkQA commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857560220 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44071/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
SparkQA commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857577221 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44077/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-857456948 **[Test build #139555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139555/testReport)** for PR 32767 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857366765 **[Test build #139535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139535/testReport)** for PR 32303 at commit

[GitHub] [spark] SparkQA commented on pull request #32786: [SPARK-35296][SQL] Allow Dataset.observe to work even if CollectMetricsExec in a task handles multiple partitions.

2021-06-09 Thread GitBox
SparkQA commented on pull request #32786: URL: https://github.com/apache/spark/pull/32786#issuecomment-857580880 **[Test build #139568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139568/testReport)** for PR 32786 at commit

[GitHub] [spark] SparkQA commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857584466 **[Test build #139546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139546/testReport)** for PR 32303 at commit

[GitHub] [spark] gengliangwang commented on a change in pull request #32839: [SPARK-35679][SQL] instantToMicros overflow

2021-06-09 Thread GitBox
gengliangwang commented on a change in pull request #32839: URL: https://github.com/apache/spark/pull/32839#discussion_r648178608 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala ## @@ -820,4 +820,9 @@ class

[GitHub] [spark] MaxGekk commented on pull request #32825: [WIP][SPARK-35680][SQL] Support units by the year-month interval type

2021-06-09 Thread GitBox
MaxGekk commented on pull request #32825: URL: https://github.com/apache/spark/pull/32825#issuecomment-857584544 > where is the SQL parser change to fill the units? This is WIP PR. Changes are coming. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA removed a comment on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857419054 **[Test build #139546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139546/testReport)** for PR 32303 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857586028 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139546/

[GitHub] [spark] MaxGekk commented on a change in pull request #32839: [SPARK-35679][SQL] instantToMicros overflow

2021-06-09 Thread GitBox
MaxGekk commented on a change in pull request #32839: URL: https://github.com/apache/spark/pull/32839#discussion_r648186390 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ## @@ -385,9 +385,15 @@ object DateTimeUtils { *

[GitHub] [spark] SparkQA removed a comment on pull request #32738: [SPARK-35474] Enable disallow_untyped_defs mypy check for pyspark.pandas.indexing.

2021-06-09 Thread GitBox
SparkQA removed a comment on pull request #32738: URL: https://github.com/apache/spark/pull/32738#issuecomment-857387251 **[Test build #139539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139539/testReport)** for PR 32738 at commit

[GitHub] [spark] SparkQA commented on pull request #32693: [SPARK-35556][SQL][TESTS] Avoid log NoSuchMethodError when running multiple Hive version related tests

2021-06-09 Thread GitBox
SparkQA commented on pull request #32693: URL: https://github.com/apache/spark/pull/32693#issuecomment-857409440 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44057/ --

[GitHub] [spark] cloud-fan commented on a change in pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
cloud-fan commented on a change in pull request #32781: URL: https://github.com/apache/spark/pull/32781#discussion_r647997967 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ## @@ -712,7 +712,9 @@ abstract class SparkStrategies

[GitHub] [spark] cloud-fan commented on a change in pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
cloud-fan commented on a change in pull request #32781: URL: https://github.com/apache/spark/pull/32781#discussion_r647998499 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ## @@ -712,7 +712,9 @@ abstract class SparkStrategies

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-857417819 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] xuanyuanking commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-06-09 Thread GitBox
xuanyuanking commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-857418906 Great thanks for the help! @HeartSaVioR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-857417824 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139529/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-857417815 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44056/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32834: [SPARK-35690][SS] Stream-stream join keys should be reordered properly

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32834: URL: https://github.com/apache/spark/pull/32834#issuecomment-857417826 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139542/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32738: [SPARK-35474] Enable disallow_untyped_defs mypy check for pyspark.pandas.indexing.

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32738: URL: https://github.com/apache/spark/pull/32738#issuecomment-857417822 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139539/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-857417830 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44052/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32693: [SPARK-35556][SQL][TESTS] Avoid log NoSuchMethodError when running multiple Hive version related tests

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32693: URL: https://github.com/apache/spark/pull/32693#issuecomment-857417817 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32610: [SPARK-35460][K8S] invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang

2021-06-09 Thread GitBox
dongjoon-hyun commented on a change in pull request #32610: URL: https://github.com/apache/spark/pull/32610#discussion_r648001743 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -250,11 +250,32 @@ private[spark]

[GitHub] [spark] SparkQA commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
SparkQA commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857418482 **[Test build #139545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139545/testReport)** for PR 32830 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32610: [SPARK-35460][K8S] verify the content of`spark.kubernetes.executor.podNamePrefix` ahead before post it to k8s api-server

2021-06-09 Thread GitBox
dongjoon-hyun commented on a change in pull request #32610: URL: https://github.com/apache/spark/pull/32610#discussion_r648007320 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -250,11 +250,32 @@ private[spark]

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-06-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-857437264 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44058/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32814: [SPARK-35664][SQL] Support java.time.LocalDateTime as an external type of TimestampWithoutTZ type

2021-06-09 Thread GitBox
SparkQA commented on pull request #32814: URL: https://github.com/apache/spark/pull/32814#issuecomment-857442919 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44062/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #32835: [SPARK-35591][PYTHON][DOCS] Rename "Koalas" to "pandas API on Spark" in the documents

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32835: URL: https://github.com/apache/spark/pull/32835#issuecomment-857455567 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139544/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32763: [SPARK-35058][SQL] Group exception messages in hive/client

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32763: URL: https://github.com/apache/spark/pull/32763#issuecomment-85748 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139522/ -- This

[GitHub] [spark] yaooqinn opened a new pull request #32837: [SPARK-35692][K8S] Use AtomicInteger for executor id generating

2021-06-09 Thread GitBox
yaooqinn opened a new pull request #32837: URL: https://github.com/apache/spark/pull/32837 ### What changes were proposed in this pull request? `AtomicInteger` is enough for executor ids. In this PR, we use it to replace `AtomicLong`, as other cluster managers do, e.g. yarn,
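
The idea in the PR description above can be illustrated with a minimal sketch. It is not the code from the pull request: the class name `ExecutorIdCounterSketch` and the method `nextExecutorId` are hypothetical, and the only assumption is that executor ids are handed out from a shared counter that several threads may call at once.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the names here are illustrative and not taken from Spark.
class ExecutorIdCounterSketch {
    // A 32-bit atomic counter; incrementAndGet is atomic, so no lock is needed.
    private final AtomicInteger counter = new AtomicInteger(0);

    // Returns the next id, starting at 1.
    int nextExecutorId() {
        return counter.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorIdCounterSketch ids = new ExecutorIdCounterSketch();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[4];
        // Request ids from several threads and record every id handed out.
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    seen.add(ids.nextExecutorId());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        // If no id was handed out twice, the set holds one entry per request.
        System.out.println("distinct ids: " + seen.size());
    }
}
```

An `int` counter wraps around after roughly 2.1 billion increments, so this sketch relies on the claim in the description that such a range is enough for executor ids.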

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-85749 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins commented on pull request #32814: [SPARK-35664][SQL] Support java.time.LocalDateTime as an external type of TimestampWithoutTZ type

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32814: URL: https://github.com/apache/spark/pull/32814#issuecomment-85745 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] AmplabJenkins commented on pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32303: URL: https://github.com/apache/spark/pull/32303#issuecomment-85747 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44064/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32833: [SPARK-35687][SQL][TEST] PythonUDFSuite move assume into its methods

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32833: URL: https://github.com/apache/spark/pull/32833#issuecomment-85746 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44061/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32800: [SPARK-35661][SQL] Allow deserialized off-heap memory entry

2021-06-09 Thread GitBox
AmplabJenkins commented on pull request #32800: URL: https://github.com/apache/spark/pull/32800#issuecomment-857455565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139534/ -- This

[GitHub] [spark] dongjoon-hyun commented on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
dongjoon-hyun commented on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857454909 You are right. Then, shall we move the check from `val driverPod` to `start` method? `ExecutorPodsAllocator` creation and `start` invocation is different. -- This is an

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32800: [SPARK-35661][SQL] Allow deserialized off-heap memory entry

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32800: URL: https://github.com/apache/spark/pull/32800#issuecomment-857455565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139534/

[GitHub] [spark] viirya commented on pull request #32767: [SPARK-35628][SS] RocksDBFileManager - load checkpoint from DFS

2021-06-09 Thread GitBox
viirya commented on pull request #32767: URL: https://github.com/apache/spark/pull/32767#issuecomment-857463805 Thank you @xuanyuanking. I'll find some time to review this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] gengliangwang opened a new pull request #32838: [SPARK-35694][INFRA] Increase the default JVM stack size of SBT

2021-06-09 Thread GitBox
gengliangwang opened a new pull request #32838: URL: https://github.com/apache/spark/pull/32838 ### What changes were proposed in this pull request? The Jenkins SBT build keeps failing with a stack overflow error:

[GitHub] [spark] dgd-contributor opened a new pull request #32839: [SPARK-35679][SQL] instantToMicros overflow

2021-06-09 Thread GitBox
dgd-contributor opened a new pull request #32839: URL: https://github.com/apache/spark/pull/32839 ### Why are the changes needed? With `Long.MinValue` cast to an instant, `secs` is floored in `microsToInstant` and causes an overflow when multiplied by `MICROS_PER_SECOND` def
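
The arithmetic behind this report can be reproduced with a small sketch. It is an assumption-laden illustration, not Spark's source: the class name is hypothetical, and the two methods only mirror the floor-then-multiply pattern the description refers to, using `Math.floorDiv`, `Math.floorMod`, and `Math.multiplyExact` from the Java standard library.

```java
import java.time.Instant;

// Hypothetical sketch of the conversion pattern described above; not Spark's code.
class InstantMicrosOverflowSketch {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long NANOS_PER_MICRO = 1_000L;

    // Split microseconds into floored seconds plus a non-negative remainder.
    static Instant microsToInstant(long micros) {
        long secs = Math.floorDiv(micros, MICROS_PER_SECOND);
        long mos = Math.floorMod(micros, MICROS_PER_SECOND);
        return Instant.ofEpochSecond(secs, mos * NANOS_PER_MICRO);
    }

    // Recombine; multiplyExact throws ArithmeticException on long overflow.
    static long instantToMicros(Instant instant) {
        long us = Math.multiplyExact(instant.getEpochSecond(), MICROS_PER_SECOND);
        return Math.addExact(us, instant.getNano() / NANOS_PER_MICRO);
    }

    public static void main(String[] args) {
        // Flooring Long.MIN_VALUE / 1_000_000 rounds toward negative infinity,
        // so secs * 1_000_000 falls below Long.MIN_VALUE.
        Instant i = microsToInstant(Long.MIN_VALUE);
        System.out.println("seconds: " + i.getEpochSecond());
        try {
            instantToMicros(i);
        } catch (ArithmeticException e) {
            System.out.println("round trip failed: " + e.getMessage());
        }
    }
}
```

The forward conversion succeeds because the floored seconds value is well within the range of `Instant`; only the reverse multiplication overflows, since the product is computed before the positive remainder is added back.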

[GitHub] [spark] sarutak edited a comment on pull request #32786: [SPARK-35296][SQL] Allow Dataset.observe to work even if CollectMetricsExec in a task handles multiple partitions.

2021-06-09 Thread GitBox
sarutak edited a comment on pull request #32786: URL: https://github.com/apache/spark/pull/32786#issuecomment-857479722 I considered the following comments from @cloud-fan and @hvanhovell . https://github.com/apache/spark/pull/32786#discussion_r647073606

[GitHub] [spark] cloud-fan closed pull request #32763: [SPARK-35058][SQL] Group exception messages in hive/client

2021-06-09 Thread GitBox
cloud-fan closed pull request #32763: URL: https://github.com/apache/spark/pull/32763 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] cloud-fan commented on pull request #32763: [SPARK-35058][SQL] Group exception messages in hive/client

2021-06-09 Thread GitBox
cloud-fan commented on pull request #32763: URL: https://github.com/apache/spark/pull/32763#issuecomment-857493645 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] yaooqinn edited a comment on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
yaooqinn edited a comment on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857473012 > You are right. Then, shall we move the check from `val driverPod` to `start` method? `ExecutorPodsAllocator` creation and `start` invocation is different?

[GitHub] [spark] SparkQA commented on pull request #32781: [SPARK-35650][SQL] Enhance `RepartitionByExpression` to make it coalesce partitions efficiently by AQE

2021-06-09 Thread GitBox
SparkQA commented on pull request #32781: URL: https://github.com/apache/spark/pull/32781#issuecomment-857502656 **[Test build #139563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139563/testReport)** for PR 32781 at commit

[GitHub] [spark] Yikun commented on pull request #32806: [SPARK-35668][INFRA] Use "concurrency" syntax on Github Actions workflow

2021-06-09 Thread GitBox
Yikun commented on pull request #32806: URL: https://github.com/apache/spark/pull/32806#issuecomment-857502360 ![image](https://user-images.githubusercontent.com/1736354/121317951-5d462c00-c93d-11eb-871b-1d36a56ff487.png) emm, The reason as above. - `pull_request_target`

[GitHub] [spark] Yikun edited a comment on pull request #32806: [SPARK-35668][INFRA] Use "concurrency" syntax on Github Actions workflow

2021-06-09 Thread GitBox
Yikun edited a comment on pull request #32806: URL: https://github.com/apache/spark/pull/32806#issuecomment-857502360 ![image](https://user-images.githubusercontent.com/1736354/121317951-5d462c00-c93d-11eb-871b-1d36a56ff487.png) emm, The reason as above. -

[GitHub] [spark] gengliangwang commented on pull request #32839: [SPARK-35679][SQL][WIP] instantToMicros overflow

2021-06-09 Thread GitBox
gengliangwang commented on pull request #32839: URL: https://github.com/apache/spark/pull/32839#issuecomment-857528050 As I mentioned in https://github.com/apache/spark/pull/32814#issuecomment-856857466 > We can fix it but it is too corner. The fix makes performance worse, too.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32830: [SPARK-32975][K8S][FOLLOWUP] Avoid None.get exception

2021-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32830: URL: https://github.com/apache/spark/pull/32830#issuecomment-857537385 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44070/
