[GitHub] [spark] SparkQA commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
SparkQA commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972617174 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49850/ -- This is an automated message from the Apache

[GitHub] [spark] xuanyuanking commented on a change in pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
xuanyuanking commented on a change in pull request #34502: URL: https://github.com/apache/spark/pull/34502#discussion_r751969435 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1956,8 +1956,21 @@ Here are the configs regarding to RocksDB instance of the

[GitHub] [spark] SparkQA removed a comment on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34644: URL: https://github.com/apache/spark/pull/34644#issuecomment-972609462 **[Test build #145378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145378/testReport)** for PR 34644 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972611320 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145365/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34644: URL: https://github.com/apache/spark/pull/34644#issuecomment-972610843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145378/

[GitHub] [spark] AmplabJenkins commented on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972611320 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145365/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34644: URL: https://github.com/apache/spark/pull/34644#issuecomment-972610843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145378/ -- This

[GitHub] [spark] SparkQA commented on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
SparkQA commented on pull request #34644: URL: https://github.com/apache/spark/pull/34644#issuecomment-972610827 **[Test build #145378 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145378/testReport)** for PR 34644 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972467425 **[Test build #145365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145365/testReport)** for PR 34641 at commit

[GitHub] [spark] SparkQA commented on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
SparkQA commented on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972610031 **[Test build #145365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145365/testReport)** for PR 34641 at commit

[GitHub] [spark] SparkQA commented on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
SparkQA commented on pull request #34644: URL: https://github.com/apache/spark/pull/34644#issuecomment-972609462 **[Test build #145378 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145378/testReport)** for PR 34644 at commit

[GitHub] [spark] SparkQA commented on pull request #34643: [SPARK-37370][SQL] Add SQL configs to control newly added join code-gen in 3.3

2021-11-17 Thread GitBox
SparkQA commented on pull request #34643: URL: https://github.com/apache/spark/pull/34643#issuecomment-972609501 **[Test build #145379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145379/testReport)** for PR 34643 at commit

[GitHub] [spark] SparkQA commented on pull request #34435: [SPARK-37155][PYTHON] Inline type hints for python/pyspark/statcounter.py

2021-11-17 Thread GitBox
SparkQA commented on pull request #34435: URL: https://github.com/apache/spark/pull/34435#issuecomment-972609689 **[Test build #145381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145381/testReport)** for PR 34435 at commit

[GitHub] [spark] SparkQA commented on pull request #34630: [SPARK-37224][SS][FOLLOWUP] Add benchmark on basic state store operations

2021-11-17 Thread GitBox
SparkQA commented on pull request #34630: URL: https://github.com/apache/spark/pull/34630#issuecomment-972609567 **[Test build #145380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145380/testReport)** for PR 34630 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972608645 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49844/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972608646 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972608647 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145363/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-972608648 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145360/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972608643 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972608646 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AmplabJenkins commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972608645 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49844/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972608643 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AmplabJenkins commented on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-972608648 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145360/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972608647 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145363/ -- This

[GitHub] [spark] SparkQA commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-17 Thread GitBox
SparkQA commented on pull request #34326: URL: https://github.com/apache/spark/pull/34326#issuecomment-972607484 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49849/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34627: [SPARK-37270][3.2][SQL] Fix push foldable into CaseWhen branches if elseValue is empty

2021-11-17 Thread GitBox
SparkQA commented on pull request #34627: URL: https://github.com/apache/spark/pull/34627#issuecomment-972607312 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49847/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
SparkQA commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972606931 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49844/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972605581 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49848/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-972461642 **[Test build #145360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145360/testReport)** for PR 34638 at commit

[GitHub] [spark] SparkQA commented on pull request #34638: [SPARK-37360][SQL] Support TimestampNTZ in JSON data source

2021-11-17 Thread GitBox
SparkQA commented on pull request #34638: URL: https://github.com/apache/spark/pull/34638#issuecomment-972605437 **[Test build #145360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145360/testReport)** for PR 34638 at commit

[GitHub] [spark] beliefer opened a new pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-17 Thread GitBox
beliefer opened a new pull request #34644: URL: https://github.com/apache/spark/pull/34644 ### What changes were proposed in this pull request? This PR proposes to push down filters with `timestamp_ntz` to ORC. ### Why are the changes needed? It's great to be able to push

[GitHub] [spark] SparkQA removed a comment on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972461638 **[Test build #145359 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145359/testReport)** for PR 34639 at commit

[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972603379 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49846/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
SparkQA commented on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972603005 **[Test build #145359 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145359/testReport)** for PR 34639 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972464503 **[Test build #145364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145364/testReport)** for PR 34639 at commit

[GitHub] [spark] SparkQA commented on pull request #34639: [SPARK-37308][SQL][TESTS] Partially revert SPARK-37214's test due to flakiness

2021-11-17 Thread GitBox
SparkQA commented on pull request #34639: URL: https://github.com/apache/spark/pull/34639#issuecomment-972601830 **[Test build #145364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145364/testReport)** for PR 34639 at commit

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972600423 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49845/ -- This is an automated message from the

[GitHub] [spark] SparkQA removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972461629 **[Test build #145361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145361/testReport)** for PR 34611 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972461717 **[Test build #145363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145363/testReport)** for PR 34596 at commit

[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972599289 **[Test build #145363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145363/testReport)** for PR 34596 at commit

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972594185 **[Test build #145361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145361/testReport)** for PR 34611 at commit

[GitHub] [spark] c21 commented on pull request #34643: [SPARK-37370][SQL] Add SQL configs to control newly added join code-gen in 3.3

2021-11-17 Thread GitBox
c21 commented on pull request #34643: URL: https://github.com/apache/spark/pull/34643#issuecomment-972592537 cc @cloud-fan could you help take a look when you have time? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] c21 opened a new pull request #34643: [SPARK-37370][SQL] Add SQL configs to control newly added join code-gen in 3.3

2021-11-17 Thread GitBox
c21 opened a new pull request #34643: URL: https://github.com/apache/spark/pull/34643 ### What changes were proposed in this pull request? During Spark 3.3, we added code-gen for FULL OUTER shuffled hash join, FULL OUTER sort merge join, and Existence sort merge join. Given

[GitHub] [spark] HeartSaVioR commented on pull request #34630: [SPARK-37224][SS][FOLLOWUP] Add benchmark on basic state store operations

2021-11-17 Thread GitBox
HeartSaVioR commented on pull request #34630: URL: https://github.com/apache/spark/pull/34630#issuecomment-972589144 Just rebased since we merged #34502 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751948058 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -19,19 +19,14 @@ import

[GitHub] [spark] viirya commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
viirya commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751948694 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -225,7 +227,10 @@ def toPandas(self) -> "PandasDataFrameLike": else:

[GitHub] [spark] SparkQA commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
SparkQA commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972585271 **[Test build #145377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145377/testReport)** for PR 34631 at commit

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751947485 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -225,7 +227,10 @@ def toPandas(self) -> "PandasDataFrameLike": else:

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751947212 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -225,7 +227,10 @@ def toPandas(self) -> "PandasDataFrameLike": else:

[GitHub] [spark] viirya commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
viirya commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751946892 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -19,19 +19,14 @@ import

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751945528 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -225,7 +227,10 @@ def toPandas(self) -> "PandasDataFrameLike": else:

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.sendTokenConf' to support renewing delegation tokens in a multi-clust

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-972581194 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49841/

[GitHub] [spark] AmplabJenkins commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.sendTokenConf' to support renewing delegation tokens in a multi-cluster envir

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-972581194 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49841/ --

[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.sendTokenConf' to support renewing delegation tokens in a multi-cluster environment

2021-11-17 Thread GitBox
SparkQA commented on pull request #34635: URL: https://github.com/apache/spark/pull/34635#issuecomment-972581177 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49841/ -- This is an automated message from the

[GitHub] [spark] viirya commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
viirya commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751944243 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -225,7 +227,10 @@ def toPandas(self) -> "PandasDataFrameLike": else:

[GitHub] [spark] HeartSaVioR closed pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
HeartSaVioR closed pull request #34502: URL: https://github.com/apache/spark/pull/34502 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34502: URL: https://github.com/apache/spark/pull/34502#issuecomment-972580150 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49840/

[GitHub] [spark] HeartSaVioR commented on pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
HeartSaVioR commented on pull request #34502: URL: https://github.com/apache/spark/pull/34502#issuecomment-972580577 Thanks for the review @zsxwing ! Merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-17 Thread GitBox
SparkQA commented on pull request #34326: URL: https://github.com/apache/spark/pull/34326#issuecomment-972580431 **[Test build #145376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145376/testReport)** for PR 34326 at commit

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34435: [SPARK-37155][PYTHON] Inline type hints for python/pyspark/statcounter.py

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #34435: URL: https://github.com/apache/spark/pull/34435#discussion_r751943438 ## File path: python/pyspark/statcounter.py ## @@ -53,12 +54,12 @@ def merge(self, value): return self # Merge another StatCounter

[GitHub] [spark] SparkQA commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
SparkQA commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972580212 **[Test build #145375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145375/testReport)** for PR 34596 at commit

[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972580187 **[Test build #145373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145373/testReport)** for PR 34642 at commit

[GitHub] [spark] SparkQA commented on pull request #34627: [SPARK-37270][3.2][SQL] Fix push foldable into CaseWhen branches if elseValue is empty

2021-11-17 Thread GitBox
SparkQA commented on pull request #34627: URL: https://github.com/apache/spark/pull/34627#issuecomment-972580219 **[Test build #145374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145374/testReport)** for PR 34627 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34502: URL: https://github.com/apache/spark/pull/34502#issuecomment-972580150 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49840/ --

[GitHub] [spark] SparkQA commented on pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-17 Thread GitBox
SparkQA commented on pull request #34502: URL: https://github.com/apache/spark/pull/34502#issuecomment-972580130 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49840/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972578841 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49837/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972578842 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49842/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972578843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49843/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34471: [SPARK-36879][SQL] Support Parquet v2 data page encoding (DELTA_BINARY_PACKED) for the vectorized path

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34471: URL: https://github.com/apache/spark/pull/34471#issuecomment-972578844 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49839/

[GitHub] [spark] AmplabJenkins commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972578842 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49842/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972578841 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49837/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34471: [SPARK-36879][SQL] Support Parquet v2 data page encoding (DELTA_BINARY_PACKED) for the vectorized path

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34471: URL: https://github.com/apache/spark/pull/34471#issuecomment-972578844 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49839/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972578843 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49843/ --

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972578644 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49845/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972576578 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49842/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34471: [SPARK-36879][SQL] Support Parquet v2 data page encoding (DELTA_BINARY_PACKED) for the vectorized path

2021-11-17 Thread GitBox
SparkQA commented on pull request #34471: URL: https://github.com/apache/spark/pull/34471#issuecomment-972575765 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49839/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
SparkQA commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972575653 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49844/ -- This is an automated message from the Apache

[GitHub] [spark] Peng-Lei commented on pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
Peng-Lei commented on pull request #34631: URL: https://github.com/apache/spark/pull/34631#issuecomment-972575254 LGTM for *.scala and *.java -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #34620: [SPARK-37209][YARN][TESTS] Fix `YarnShuffleIntegrationSuite` releated UTs when using `hadoop-3.2` profile without `assembly/target/scala-

2021-11-17 Thread GitBox
LuciferYang commented on pull request #34620: URL: https://github.com/apache/spark/pull/34620#issuecomment-972574513 cc @srowen @HyukjinKwon @dongjoon-hyun -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] HyukjinKwon commented on pull request #34548: [SPARK-37282][TESTS] Add `ExtendedLevelDBTest` and disable LevelDB tests on Apple Silicon

2021-11-17 Thread GitBox
HyukjinKwon commented on pull request #34548: URL: https://github.com/apache/spark/pull/34548#issuecomment-972572642 SGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] LuciferYang commented on pull request #34576: [SPARK-37282][TESTS][FOLLOWUP] Mark `YarnShuffleServiceSuite` as ExtendedLevelDBTest

2021-11-17 Thread GitBox
LuciferYang commented on pull request #34576: URL: https://github.com/apache/spark/pull/34576#issuecomment-972570611 I have Apple Silicon. Let me verify this ~ I'm very happy that I can run the single test using Apple Silicon ~ thanks @dongjoon-hyun -- This is an

[GitHub] [spark] LuciferYang commented on pull request #34548: [SPARK-37282][TESTS] Add `ExtendedLevelDBTest` and disable LevelDB tests on Apple Silicon

2021-11-17 Thread GitBox
LuciferYang commented on pull request #34548: URL: https://github.com/apache/spark/pull/34548#issuecomment-972567773 Should we extract `SystemUtils.IS_OS_MAC_OSX && SystemUtils.OS_ARCH.equals("aarch64")` into an independent `val`, such as `Utils.isAppleSilicon` @dongjoon-hyun
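The suggestion above could be sketched roughly as follows. This is only an illustration in plain Java, not the code from the pull request: the class name `PlatformCheck` and the constant `IS_APPLE_SILICON` are hypothetical, and reading the `os.name` and `os.arch` system properties directly is an assumed stand-in for `SystemUtils.IS_OS_MAC_OSX` and `SystemUtils.OS_ARCH` from Apache Commons Lang, which may differ in detail.

```java
// Hypothetical helper; the comment proposes a `val` such as Utils.isAppleSilicon.
final class PlatformCheck {
    private PlatformCheck() {}

    // Pure function so the rule can be exercised with arbitrary property values.
    // Assumption: a "Mac OS X" name prefix approximates SystemUtils.IS_OS_MAC_OSX.
    static boolean isAppleSilicon(String osName, String osArch) {
        if (osName == null || osArch == null) {
            return false;
        }
        return osName.startsWith("Mac OS X") && osArch.equals("aarch64");
    }

    // Evaluated once at class load, playing the role of the independent `val`.
    static final boolean IS_APPLE_SILICON =
        isAppleSilicon(System.getProperty("os.name"), System.getProperty("os.arch"));
}
```

With a single named constant like this, each call site would test `PlatformCheck.IS_APPLE_SILICON` instead of repeating the two-part condition.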

[GitHub] [spark] cloud-fan commented on a change in pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-11-17 Thread GitBox
cloud-fan commented on a change in pull request #32875: URL: https://github.com/apache/spark/pull/32875#discussion_r751931689 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala ## @@ -352,3 +380,142 @@ case class

[GitHub] [spark] itholic commented on pull request #34509: [SPARK-34521][PYTHON][SQL] Fix spark.createDataFrame when using pandas with StringDtype

2021-11-17 Thread GitBox
itholic commented on pull request #34509: URL: https://github.com/apache/spark/pull/34509#issuecomment-972567563 Not a big deal, but could you add a simple example to PR description what issue is resolved with before/after ?? e.g. Before: ```python >>>

[GitHub] [spark] sadikovi commented on pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
sadikovi commented on pull request #34596: URL: https://github.com/apache/spark/pull/34596#issuecomment-972567167 @gengliangwang I addressed your comments. Would appreciate it if you could take a look. Thanks. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] sadikovi commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
sadikovi commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r751930879 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ## @@ -160,6 +169,15 @@ class CSVInferSchema(val

[GitHub] [spark] sadikovi commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-17 Thread GitBox
sadikovi commented on a change in pull request #34596: URL: https://github.com/apache/spark/pull/34596#discussion_r751930741 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ## @@ -1012,6 +1012,162 @@ abstract class CSVSuite

[GitHub] [spark] SparkQA commented on pull request #34641: [SPARK-37368][SQL][TESTS] Explicit GC for TPC-DS query runs

2021-11-17 Thread GitBox
SparkQA commented on pull request #34641: URL: https://github.com/apache/spark/pull/34641#issuecomment-972566219 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49837/ -- This is an automated message from the

[GitHub] [spark] wangyum commented on pull request #34627: [SPARK-37270][3.2][SQL] Fix push foldable into CaseWhen branches if elseValue is empty

2021-11-17 Thread GitBox
wangyum commented on pull request #34627: URL: https://github.com/apache/spark/pull/34627#issuecomment-972565986 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972562105 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49843/ --

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32340: [SPARK-35139][SQL] Support ANSI intervals as Arrow Column vectors

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #32340: URL: https://github.com/apache/spark/pull/32340#discussion_r751894690 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -172,6 +176,10 @@ public

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32340: [SPARK-35139][SQL] Support ANSI intervals as Arrow Column vectors

2021-11-17 Thread GitBox
HyukjinKwon commented on a change in pull request #32340: URL: https://github.com/apache/spark/pull/32340#discussion_r751925908 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -172,6 +176,10 @@ public

[GitHub] [spark] SparkQA removed a comment on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
SparkQA removed a comment on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972554088 **[Test build #145370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145370/testReport)** for PR 34642 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
cloud-fan commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751923607 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -549,21 +544,18 @@ int getInt(int rowId) {

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
AmplabJenkins removed a comment on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972558412 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145370/

[GitHub] [spark] AmplabJenkins commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
AmplabJenkins commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972558412 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145370/ -- This

[GitHub] [spark] SparkQA commented on pull request #34642: [SPARK-37369][SQL] Avoid redundant ColumnarToRow transistion on InMemoryTableScan

2021-11-17 Thread GitBox
SparkQA commented on pull request #34642: URL: https://github.com/apache/spark/pull/34642#issuecomment-972558390 **[Test build #145370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145370/testReport)** for PR 34642 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #34631: [SPARK-37277][PYTHON][SQL] Support DayTimeIntervalType in pandas UDF and Arrow optimization

2021-11-17 Thread GitBox
cloud-fan commented on a change in pull request #34631: URL: https://github.com/apache/spark/pull/34631#discussion_r751923194 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -549,21 +544,18 @@ int getInt(int rowId) {

[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-17 Thread GitBox
SparkQA commented on pull request #34611: URL: https://github.com/apache/spark/pull/34611#issuecomment-972557544 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49842/ -- This is an automated message from the Apache
