[GitHub] [spark] AmplabJenkins commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950091643 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144551/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950091643 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144551/

[GitHub] [spark] huaxingao commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-22 Thread GitBox
huaxingao commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r734929416 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ## @@ -298,17 +299,22 @@ private[sql] case

[GitHub] [spark] SparkQA removed a comment on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950063876 **[Test build #144551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144551/testReport)** for PR 34213 at commit

[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
SparkQA commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950086802 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49022/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
SparkQA commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950068793 **[Test build #144551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144551/testReport)** for PR 34213 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950068687 CRAN check in SparkR build validates the DESCRIPTION file. The check won't validate the values, but at least the format etc. Might need to make sure that the tests pass

[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
SparkQA commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950063876 **[Test build #144551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144551/testReport)** for PR 34213 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950063381 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144550/

[GitHub] [spark] AmplabJenkins commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950063381 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144550/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950038749 **[Test build #144550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144550/testReport)** for PR 34353 at commit

[GitHub] [spark] dchvn edited a comment on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn edited a comment on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950060026 CC @HyukjinKwon , updated some nit. May I resolve other improvements in F-UP PR later? -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] dchvn commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn commented on pull request #34213: URL: https://github.com/apache/spark/pull/34213#issuecomment-950060026 CC @HyukjinKwon , updated some nit. May I resolve other improvements in F-UP PR? -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] dchvn commented on a change in pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34213: URL: https://github.com/apache/spark/pull/34213#discussion_r734923937 ## File path: python/pyspark/pandas/frame.py ## @@ -8201,6 +8202,185 @@ def update(self, other: "DataFrame", join: str = "left", overwrite: bool = True)

[GitHub] [spark] dchvn commented on a change in pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34213: URL: https://github.com/apache/spark/pull/34213#discussion_r734923848 ## File path: python/pyspark/pandas/tests/test_dataframe.py ## @@ -6025,6 +6025,64 @@ def test_multi_index_dtypes(self): )

[GitHub] [spark] dchvn commented on a change in pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34213: URL: https://github.com/apache/spark/pull/34213#discussion_r734923406 ## File path: python/pyspark/pandas/frame.py ## @@ -8201,6 +8202,185 @@ def update(self, other: "DataFrame", join: str = "left", overwrite: bool = True)

[GitHub] [spark] dchvn commented on a change in pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34213: URL: https://github.com/apache/spark/pull/34213#discussion_r734923248 ## File path: python/pyspark/pandas/frame.py ## @@ -8201,6 +8202,185 @@ def update(self, other: "DataFrame", join: str = "left", overwrite: bool = True)

[GitHub] [spark] SparkQA commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
SparkQA commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950056971 **[Test build #144550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144550/testReport)** for PR 34353 at commit

[GitHub] [spark] dchvn commented on a change in pull request #34363: [SPARK-37083][PYTHON] Inline type hints for python/pyspark/accumulators.py

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34363: URL: https://github.com/apache/spark/pull/34363#discussion_r734921537 ## File path: python/pyspark/accumulators.py ## @@ -176,44 +193,44 @@ class AccumulatorParam(object): [7.0, 8.0, 9.0] """ -def zero(self,

[GitHub] [spark] dchvn commented on pull request #34238: [SPARK-36969][PYTHON] Inline type hints for SparkContext

2021-10-22 Thread GitBox
dchvn commented on pull request #34238: URL: https://github.com/apache/spark/pull/34238#issuecomment-950052545 @ueshin Thanks for your help! I updated this PR. Could you take another look? -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] dchvn commented on a change in pull request #34363: [SPARK-37083][PYTHON] Inline type hints for python/pyspark/accumulators.py

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34363: URL: https://github.com/apache/spark/pull/34363#discussion_r734920972 ## File path: python/pyspark/accumulators.py ## @@ -264,7 +281,12 @@ def authenticate_and_accum_updates(): class

[GitHub] [spark] dchvn commented on a change in pull request #34363: [SPARK-37083][PYTHON] Inline type hints for python/pyspark/accumulators.py

2021-10-22 Thread GitBox
dchvn commented on a change in pull request #34363: URL: https://github.com/apache/spark/pull/34363#discussion_r734920585 ## File path: python/pyspark/accumulators.py ## @@ -20,20 +20,32 @@ import struct import socketserver as SocketServer import threading +from typing

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950051491 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49021/

[GitHub] [spark] AmplabJenkins commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950051491 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49021/ --

[GitHub] [spark] SparkQA commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
SparkQA commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950048633 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49021/ -- This is an automated message from the

[GitHub] [spark] srowen commented on a change in pull request #34368: [WIP][SPARK-37072][CORE][TEST] Pass all UTs in `repl` with Java 17

2021-10-22 Thread GitBox
srowen commented on a change in pull request #34368: URL: https://github.com/apache/spark/pull/34368#discussion_r734914860 ## File path: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ## @@ -407,6 +417,24 @@ private[spark] object ClosureCleaner extends Logging

[GitHub] [spark] srowen commented on a change in pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-22 Thread GitBox
srowen commented on a change in pull request #34351: URL: https://github.com/apache/spark/pull/34351#discussion_r734914783 ## File path: core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala ## @@ -149,17 +149,12 @@ class OpenHashMap[K : ClassTag,

[GitHub] [spark] srowen commented on a change in pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-22 Thread GitBox
srowen commented on a change in pull request #34351: URL: https://github.com/apache/spark/pull/34351#discussion_r734914728 ## File path: core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala ## @@ -149,17 +149,12 @@ class OpenHashMap[K : ClassTag,

[GitHub] [spark] srowen commented on a change in pull request #34351: [SPARK-37071][CORE] Make OpenHashMap serialize without reference tracking

2021-10-22 Thread GitBox
srowen commented on a change in pull request #34351: URL: https://github.com/apache/spark/pull/34351#discussion_r734914681 ## File path: core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala ## @@ -149,17 +149,12 @@ class OpenHashMap[K : ClassTag,

[GitHub] [spark] srowen commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
srowen commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950044528 I also can't see the repo, but w/e. It seems fine to make this change - not sure any tests will test it anyway. This is a fine change for 3.3, but, only Java 11 is supported

[GitHub] [spark] SparkQA commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
SparkQA commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950043823 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49021/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon edited a comment on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon edited a comment on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950042051 I cannot access https://github.com/jupyter/drumsticks. Once it's enabled, you could rebase in this PR. That should kick the job -- This is an automated message

[GitHub] [spark] HyukjinKwon closed pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
HyukjinKwon closed pull request #34353: URL: https://github.com/apache/spark/pull/34353 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] HyukjinKwon commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950042051 I cannot access https://github.com/jupyter/drumsticks. Once it's enabled, you could rebase in this PR. That should kick the job -- This is an automated message from

[GitHub] [spark] HyukjinKwon commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950041897 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34298: URL: https://github.com/apache/spark/pull/34298#issuecomment-950040725 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144547/

[GitHub] [spark] AmplabJenkins commented on pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34298: URL: https://github.com/apache/spark/pull/34298#issuecomment-950040725 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144547/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34298: URL: https://github.com/apache/spark/pull/34298#issuecomment-949960083 **[Test build #144547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144547/testReport)** for PR 34298 at commit

[GitHub] [spark] SparkQA commented on pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
SparkQA commented on pull request #34298: URL: https://github.com/apache/spark/pull/34298#issuecomment-950040509 **[Test build #144547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144547/testReport)** for PR 34298 at commit

[GitHub] [spark] ueshin commented on a change in pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
ueshin commented on a change in pull request #34296: URL: https://github.com/apache/spark/pull/34296#discussion_r734910279 ## File path: dev/lint-python ## @@ -124,10 +125,66 @@ function pycodestyle_test { fi } -function mypy_test { + +function mypy_annotation_test {

[GitHub] [spark] wangyum commented on pull request #34367: [SPARK-37099][SQL] Impl a rank-based filter to optimize top-k computation

2021-10-22 Thread GitBox
wangyum commented on pull request #34367: URL: https://github.com/apache/spark/pull/34367#issuecomment-950039381 cc @opensky142857 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-948391980 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
SparkQA commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950038749 **[Test build #144550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144550/testReport)** for PR 34353 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950038360 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144548/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950038362 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144549/

[GitHub] [spark] AmplabJenkins commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950038360 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144548/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950038362 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144549/ -- This

[GitHub] [spark] Bidek56 commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
Bidek56 commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950038236 I did enable GA, but after the failure I am not sure how to rerun the action, besides using an API. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] Bidek56 commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
Bidek56 commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950037941 > https://github.com/jupyter/drumsticks repo seems private. Okay, if something is affected, then it's fine - I asked because these Java versions in the description are not used in

[GitHub] [spark] Bidek56 edited a comment on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
Bidek56 edited a comment on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950037038 > I agree with this change, and LGTM but I would like to understand how this change interacts with something external. This

[GitHub] [spark] HyukjinKwon commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950037653 Mind enabling GitHub Actions in your forked repository? Apache Spark leverages contributor's resources in forked repository in their PRs

[GitHub] [spark] SparkQA removed a comment on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950004282 **[Test build #144549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144549/testReport)** for PR 34297 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-95966 **[Test build #144548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144548/testReport)** for PR 34296 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950037574 https://github.com/jupyter/drumsticks repo seems private. Okay, if something is affected, then it's fine - I asked because these Java versions in the description are not used in

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34359: [SPARK-36986][SQL] Improving external schema management flexibility on DataSet and StructType

2021-10-22 Thread GitBox
HyukjinKwon commented on a change in pull request #34359: URL: https://github.com/apache/spark/pull/34359#discussion_r734908797 ## File path: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ## @@ -511,6 +511,20 @@ class SparkSession private(

[GitHub] [spark] Bidek56 commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
Bidek56 commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950037038 > I agree with this change, and LGTM but I would like to understand how this change interacts with something external. This

[GitHub] [spark] HyukjinKwon commented on pull request #34353: [SPARK-37084][SQL] Set spark.sql.files.openCostInBytes to bytesConf

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34353: URL: https://github.com/apache/spark/pull/34353#issuecomment-950035199 ok to test -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on pull request #34292: [SPARK-37017][SQL] Reduce the scope of synchronized to prevent potential deadlock

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34292: URL: https://github.com/apache/spark/pull/34292#issuecomment-950035138 Thanks @baibaichen and @chenzhx for addressing my comment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] HyukjinKwon edited a comment on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon edited a comment on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950034756 I agree with this change, and LGTM but I would like to understand how this change interacts with something external. -- This is an automated message from the

[GitHub] [spark] HyukjinKwon commented on pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #34371: URL: https://github.com/apache/spark/pull/34371#issuecomment-950034756 I agree with this change, and LGTM but I would like to understand how this change interacts with what. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34371: [SPARK-37091][R] Upgrading SystemRequirements to include Java <= 17

2021-10-22 Thread GitBox
HyukjinKwon commented on a change in pull request #34371: URL: https://github.com/apache/spark/pull/34371#discussion_r734906656 ## File path: R/pkg/DESCRIPTION ## @@ -13,7 +13,7 @@ Authors@R: c(person("Shivaram", "Venkataraman", role = "aut", License: Apache License (== 2.0)

[GitHub] [spark] HyukjinKwon commented on pull request #25490: [SPARK-28756][R][FOLLOW-UP] Specify minimum and maximum Java versions

2021-10-22 Thread GitBox
HyukjinKwon commented on pull request #25490: URL: https://github.com/apache/spark/pull/25490#issuecomment-950034587 @Bidek56 Why does the DESCRIPTION matter? It's just metadata. Also, JDK 17 isn't fully supported yet. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
SparkQA commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950034494 **[Test build #144548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144548/testReport)** for PR 34296 at commit

[GitHub] [spark] SparkQA commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
SparkQA commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950033141 **[Test build #144549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144549/testReport)** for PR 34297 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950029775 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49020/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950029774 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49019/

[GitHub] [spark] AmplabJenkins commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950029775 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49020/ --

[GitHub] [spark] AmplabJenkins commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950029774 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49019/ --

[GitHub] [spark] SparkQA commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
SparkQA commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950026540 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49019/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
SparkQA commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950026171 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49020/ -- This is an automated message from the

[GitHub] [spark] github-actions[bot] commented on pull request #32666: [SPARK-30696][SQL] fromUTCtime and toUTCtime produced wrong result on Daylight Saving Time changes days

2021-10-22 Thread GitBox
github-actions[bot] commented on pull request #32666: URL: https://github.com/apache/spark/pull/32666#issuecomment-950022442 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] SparkQA commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
SparkQA commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950016605 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49020/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
SparkQA commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-950014277 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49019/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34297: [WIP][SPARK-37022][PYTHON] Use black as a formatter for PySpark

2021-10-22 Thread GitBox
SparkQA commented on pull request #34297: URL: https://github.com/apache/spark/pull/34297#issuecomment-950004282 **[Test build #144549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144549/testReport)** for PR 34297 at commit

[GitHub] [spark] SparkQA commented on pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
SparkQA commented on pull request #34296: URL: https://github.com/apache/spark/pull/34296#issuecomment-95966 **[Test build #144548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144548/testReport)** for PR 34296 at commit

[GitHub] [spark] zero323 commented on a change in pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
zero323 commented on a change in pull request #34296: URL: https://github.com/apache/spark/pull/34296#discussion_r734874865 ## File path: dev/lint-python ## @@ -124,39 +125,76 @@ function pycodestyle_test { fi } -function mypy_test { -local MYPY_REPORT= -local

[GitHub] [spark] kazuyukitanimura commented on pull request #33930: [SPARK-36665][SQL] Add more Not operator simplifications

2021-10-22 Thread GitBox
kazuyukitanimura commented on pull request #33930: URL: https://github.com/apache/spark/pull/33930#issuecomment-949987313 Hi @sunchao It would be great if you could take one more look when you get a chance. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] zero323 edited a comment on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox
zero323 edited a comment on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-949961121 > > I think we might have to redefine `ColumnOrName` to fully support these > > What's your idea like? Long story short, I've been looking into different

[GitHub] [spark] sunchao commented on a change in pull request #34365: [SPARK-37098][SQL] Alter table properties should invalidate cache

2021-10-22 Thread GitBox
sunchao commented on a change in pull request #34365: URL: https://github.com/apache/spark/pull/34365#discussion_r734851214 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -276,6 +276,7 @@ case class AlterTableSetPropertiesCommand(

[GitHub] [spark] c21 commented on a change in pull request #33828: [SPARK-36579][CORE][SQL] Make spark source stagingDir can be customized

2021-10-22 Thread GitBox
c21 commented on a change in pull request #33828: URL: https://github.com/apache/spark/pull/33828#discussion_r734850205 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3412,6 +3412,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] zero323 commented on a change in pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox
zero323 commented on a change in pull request #34354: URL: https://github.com/apache/spark/pull/34354#discussion_r734849802 ## File path: python/pyspark/sql/functions.py ## @@ -1652,7 +1652,19 @@ def expr(str: str) -> Column: return Column(sc._jvm.functions.expr(str))

[GitHub] [spark] zero323 commented on pull request #34354: [WIP][SPARK-37085][PYTHON][SQL] Add list/tuple overloads to array, struct, create_map, map_concat

2021-10-22 Thread GitBox
zero323 commented on pull request #34354: URL: https://github.com/apache/spark/pull/34354#issuecomment-949961121 > > I think we might have to redefine `ColumnOrName` to fully support these > > What's your idea like? Long story short, I've been looking into different scenarios

[GitHub] [spark] SparkQA commented on pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
SparkQA commented on pull request #34298: URL: https://github.com/apache/spark/pull/34298#issuecomment-949960083 **[Test build #144547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144547/testReport)** for PR 34298 at commit

[GitHub] [spark] ankurdave commented on a change in pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
ankurdave commented on a change in pull request #34369: URL: https://github.com/apache/spark/pull/34369#discussion_r734840380 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -128,15 +139,21 @@ class FileScanRDD(

[GitHub] [spark] sunchao commented on a change in pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
sunchao commented on a change in pull request #34369: URL: https://github.com/apache/spark/pull/34369#discussion_r734842173 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -128,15 +139,21 @@ class FileScanRDD(

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34370: [SPARK-37047][SQL][FOLLOWUP] lpad/rpad should fail if parameter str and pad are different types

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34370: URL: https://github.com/apache/spark/pull/34370#issuecomment-949952367 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144539/

[GitHub] [spark] AmplabJenkins commented on pull request #34370: [SPARK-37047][SQL][FOLLOWUP] lpad/rpad should fail if parameter str and pad are different types

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34370: URL: https://github.com/apache/spark/pull/34370#issuecomment-949952367 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144539/ -- This

[GitHub] [spark] ankurdave commented on a change in pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
ankurdave commented on a change in pull request #34369: URL: https://github.com/apache/spark/pull/34369#discussion_r734822511 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -85,6 +85,17 @@ class FileScanRDD(

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
AmplabJenkins removed a comment on pull request #34369: URL: https://github.com/apache/spark/pull/34369#issuecomment-949951007 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144544/

[GitHub] [spark] sunchao commented on a change in pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
sunchao commented on a change in pull request #34369: URL: https://github.com/apache/spark/pull/34369#discussion_r734831796 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -128,15 +139,21 @@ class FileScanRDD(

[GitHub] [spark] ueshin commented on a change in pull request #34296: [SPARK-36989][TESTS][PYTHON] Add type hints data tests

2021-10-22 Thread GitBox
ueshin commented on a change in pull request #34296: URL: https://github.com/apache/spark/pull/34296#discussion_r734833122 ## File path: dev/lint-python ## @@ -124,39 +125,76 @@ function pycodestyle_test { fi } -function mypy_test { -local MYPY_REPORT= -local

[GitHub] [spark] AmplabJenkins commented on pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
AmplabJenkins commented on pull request #34369: URL: https://github.com/apache/spark/pull/34369#issuecomment-949951007 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144544/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #34370: [SPARK-37047][SQL][FOLLOWUP] lpad/rpad should fail if parameter str and pad are different types

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34370: URL: https://github.com/apache/spark/pull/34370#issuecomment-949752286 **[Test build #144539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144539/testReport)** for PR 34370 at commit

[GitHub] [spark] SparkQA commented on pull request #34370: [SPARK-37047][SQL][FOLLOWUP] lpad/rpad should fail if parameter str and pad are different types

2021-10-22 Thread GitBox
SparkQA commented on pull request #34370: URL: https://github.com/apache/spark/pull/34370#issuecomment-949950511 **[Test build #144539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144539/testReport)** for PR 34370 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
SparkQA removed a comment on pull request #34369: URL: https://github.com/apache/spark/pull/34369#issuecomment-949766425 **[Test build #144544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144544/testReport)** for PR 34369 at commit

[GitHub] [spark] SparkQA commented on pull request #34369: [SPARK-37089][SQL] Do not register ParquetFileFormat completion listener lazily

2021-10-22 Thread GitBox
SparkQA commented on pull request #34369: URL: https://github.com/apache/spark/pull/34369#issuecomment-949949305 **[Test build #144544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144544/testReport)** for PR 34369 at commit

[GitHub] [spark] c21 commented on a change in pull request #34291: [SPARK-37020][SQL] DS V2 LIMIT push down

2021-10-22 Thread GitBox
c21 commented on a change in pull request #34291: URL: https://github.com/apache/spark/pull/34291#discussion_r734830379 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ## @@ -298,17 +299,22 @@ private[sql] case class

[GitHub] [spark] c21 commented on a change in pull request #34298: [SPARK-34960][SQL] Aggregate push down for ORC

2021-10-22 Thread GitBox
c21 commented on a change in pull request #34298: URL: https://github.com/apache/spark/pull/34298#discussion_r734787209 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnsStatistics.java ## @@ -0,0 +1,56 @@ +/* + * Licensed to the
