[GitHub] [spark] itholic commented on a diff in pull request #39705: [SPARK-41488][SQL] Assign name to _LEGACY_ERROR_TEMP_1176 (and 1177)

2023-01-24 Thread via GitHub
itholic commented on code in PR #39705: URL: https://github.com/apache/spark/pull/39705#discussion_r1084933920 ## core/src/main/resources/error/error-classes.json: ## @@ -592,6 +592,17 @@ "Detected an incompatible DataSourceRegister. Please remove the incompatible

[GitHub] [spark] dongjoon-hyun commented on pull request #38518: [SPARK-33349][K8S] Reset the executor pods watcher when we receive a version changed from k8s

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #38518: URL: https://github.com/apache/spark/pull/38518#issuecomment-1401531745 May I ask which K8s fabric kubernetes-client version is used, @yeachan153 ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39716: [SPARK-42167][INFRA] Improve GitHub Action job to stop on failures earlier

2023-01-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #39716: URL: https://github.com/apache/spark/pull/39716 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[GitHub] [spark] dongjoon-hyun commented on pull request #39704: [MINOR][K8S][DOCS] Add all resource managers in `Scheduling Within an Application` section

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39704: URL: https://github.com/apache/spark/pull/39704#issuecomment-1401522248 Thank you, @viirya ! Merged to master/3.3/3.2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] dongjoon-hyun closed pull request #39704: [MINOR][K8S][DOCS] Add all resource managers in `Scheduling Within an Application` section

2023-01-24 Thread via GitHub
dongjoon-hyun closed pull request #39704: [MINOR][K8S][DOCS] Add all resource managers in `Scheduling Within an Application` section URL: https://github.com/apache/spark/pull/39704 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] dongjoon-hyun commented on pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39714: URL: https://github.com/apache/spark/pull/39714#issuecomment-1401562599 Could you review this, @Yikun ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39369: [SPARK-41775][PYTHON][ML] Adding support for PyTorch functions

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39369: URL: https://github.com/apache/spark/pull/39369#discussion_r1084987239 ## python/pyspark/ml/torch/distributor.py: ## @@ -15,16 +15,21 @@ # limitations under the License. # +import cloudpickle # type: ignore Review Comment:

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #39714: URL: https://github.com/apache/spark/pull/39714 ### What changes were proposed in this pull request? This PR aims to make `docker-image-tool.sh` usage message up-to-date. ### Why are the changes needed? - Use `v3.4.0`

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39715: [SPARK-41775][PYTHON][FOLLOWU] Use pyspark.cloudpickle

2023-01-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #39715: URL: https://github.com/apache/spark/pull/39715 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39637: [SPARK-41777][PYSPARK][ML] Integration testing for TorchDistributor

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39637: URL: https://github.com/apache/spark/pull/39637#discussion_r1085005879 ## dev/requirements.txt: ## @@ -59,3 +59,7 @@ googleapis-common-protos==1.56.4 mypy-protobuf==3.3.0 googleapis-common-protos-stubs==2.2.0 grpc-stubs==1.24.11

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39637: [SPARK-41777][PYSPARK][ML] Integration testing for TorchDistributor

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39637: URL: https://github.com/apache/spark/pull/39637#discussion_r1085007063 ## python/pyspark/ml/torch/tests/test_distributor.py: ## @@ -288,6 +288,13 @@ def test_local_training_succeeds(self) -> None: if cuda_env_var:

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39369: [SPARK-41775][PYTHON][ML] Adding support for PyTorch functions

2023-01-24 Thread via GitHub
HyukjinKwon commented on code in PR #39369: URL: https://github.com/apache/spark/pull/39369#discussion_r1085070034 ## python/pyspark/ml/torch/distributor.py: ## @@ -15,16 +15,21 @@ # limitations under the License. # +import cloudpickle # type: ignore Review Comment: +1

[GitHub] [spark] dongjoon-hyun commented on pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39714: URL: https://github.com/apache/spark/pull/39714#issuecomment-1401692271 Merged to master for Apache Spark 3.4.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun closed pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
dongjoon-hyun closed pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date URL: https://github.com/apache/spark/pull/39714 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier

2023-01-24 Thread via GitHub
HyukjinKwon commented on code in PR #39716: URL: https://github.com/apache/spark/pull/39716#discussion_r1085290756 ## .github/workflows/build_and_test.yml: ## @@ -587,9 +606,9 @@ jobs: # See also https://issues.apache.org/jira/browse/SPARK-35375. # Pin the

[GitHub] [spark] yeachan153 commented on pull request #38518: [SPARK-33349][K8S] Reset the executor pods watcher when we receive a version changed from k8s

2023-01-24 Thread via GitHub
yeachan153 commented on PR #38518: URL: https://github.com/apache/spark/pull/38518#issuecomment-1401962554 > May I ask which K8s fabric kubernetes-client version is used, @yeachan153 ? We are using 5.4.1 @dongjoon-hyun -- This is an automated message from the Apache Git Service.

[GitHub] [spark] HyukjinKwon commented on pull request #39543: [SPARK-42044][SQL] Fix incorrect error message for `MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY`

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39543: URL: https://github.com/apache/spark/pull/39543#issuecomment-1401997020 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] EnricoMi opened a new pull request, #39717: [SPARK-42168][3.3][SQL][PYTHON] Fix required child distribution of FlatMapCoGroupsInPandas (as in CoGroup)

2023-01-24 Thread via GitHub
EnricoMi opened a new pull request, #39717: URL: https://github.com/apache/spark/pull/39717 ### What changes were proposed in this pull request? Make `FlatMapCoGroupsInPandas` (used by PySpark) report its required child distribution as `HashClusteredDistribution`, rather than

[GitHub] [spark] dongjoon-hyun commented on pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39714: URL: https://github.com/apache/spark/pull/39714#issuecomment-1401687848 Thank you, @HyukjinKwon ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on pull request #39715: [SPARK-41775][PYTHON][FOLLOWUP] Use `pyspark.cloudpickle` instead of `cloudpickle`

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39715: URL: https://github.com/apache/spark/pull/39715#issuecomment-1401688304 Thank you, @HyukjinKwon ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39716: URL: https://github.com/apache/spark/pull/39716#issuecomment-1401676399 cc @Yikun too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] Yikun commented on pull request #39714: [SPARK-42166][K8S] Make `docker-image-tool.sh` usage message up-to-date

2023-01-24 Thread via GitHub
Yikun commented on PR #39714: URL: https://github.com/apache/spark/pull/39714#issuecomment-1401870194 late LGTM. > Use the smallest multi-arch image tag example, 11-jre, instead of 11-jre-focal. FYI, According to

[GitHub] [spark] HyukjinKwon closed pull request #39543: [SPARK-42044][SQL] Fix incorrect error message for `MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY`

2023-01-24 Thread via GitHub
HyukjinKwon closed pull request #39543: [SPARK-42044][SQL] Fix incorrect error message for `MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY` URL: https://github.com/apache/spark/pull/39543 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun commented on pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39716: URL: https://github.com/apache/spark/pull/39716#issuecomment-1401670611 cc @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dongjoon-hyun closed pull request #39715: [SPARK-41775][PYTHON][FOLLOWUP] Use `pyspark.cloudpickle` instead of `cloudpickle`

2023-01-24 Thread via GitHub
dongjoon-hyun closed pull request #39715: [SPARK-41775][PYTHON][FOLLOWUP] Use `pyspark.cloudpickle` instead of `cloudpickle` URL: https://github.com/apache/spark/pull/39715 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] dongjoon-hyun commented on pull request #39715: [SPARK-41775][PYTHON][FOLLOWUP] Use `pyspark.cloudpickle` instead of `cloudpickle`

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39715: URL: https://github.com/apache/spark/pull/39715#issuecomment-1401694385 All Python tests including Python linter passed. Merged to master. ![Screenshot 2023-01-24 at 2 24 43

[GitHub] [spark] HyukjinKwon commented on pull request #39681: [SPARK-18011] Fix SparkR NA date serialization

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39681: URL: https://github.com/apache/spark/pull/39681#issuecomment-1401918768 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon closed pull request #39681: [SPARK-18011] Fix SparkR NA date serialization

2023-01-24 Thread via GitHub
HyukjinKwon closed pull request #39681: [SPARK-18011] Fix SparkR NA date serialization URL: https://github.com/apache/spark/pull/39681 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cashmand opened a new pull request, #39718: [SPARK-42163] Fix schema pruning for non-foldable array index or map key

2023-01-24 Thread via GitHub
cashmand opened a new pull request, #39718: URL: https://github.com/apache/spark/pull/39718 ### What changes were proposed in this pull request? In parquet schema pruning, we use SelectedField to try to extract the field that is used in a struct. It looks through

[GitHub] [spark] AmplabJenkins commented on pull request #39711: [SPARK-41931][SQL] Better error message for incomplete complex type definition

2023-01-24 Thread via GitHub
AmplabJenkins commented on PR #39711: URL: https://github.com/apache/spark/pull/39711#issuecomment-1402189136 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] AmplabJenkins commented on pull request #39712: [TODO][Connect] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
AmplabJenkins commented on PR #39712: URL: https://github.com/apache/spark/pull/39712#issuecomment-1402189018 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun commented on pull request #39721: [SPARK-42171][PYSPARK] Enable `pyspark-errors` module test in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39721: URL: https://github.com/apache/spark/pull/39721#issuecomment-1402402762 cc @itholic and @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] ocworld commented on pull request #32397: [WIP][SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2023-01-24 Thread via GitHub
ocworld commented on PR #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-1402127571 @jbguerraz @GaruGaru The pr(https://github.com/apache/spark/pull/38828) is merged now -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] jchen5 commented on a diff in pull request #39375: [SPARK-36124][SQL] Support subqueries with correlation through UNION

2023-01-24 Thread via GitHub
jchen5 commented on code in PR #39375: URL: https://github.com/apache/spark/pull/39375#discussion_r1085530300 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -1056,6 +1056,11 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] NarekDW opened a new pull request, #39720: [SPARK-41500] [SQL] Year/Month Interval operations bug fix

2023-01-24 Thread via GitHub
NarekDW opened a new pull request, #39720: URL: https://github.com/apache/spark/pull/39720 ### What changes were proposed in this pull request? `YearMonthIntervalType` case was missed in `org.apache.spark.sql.catalyst.analysis.Analyzer.ResolveBinaryArithmetic` class. Because of

[GitHub] [spark] dongjoon-hyun closed pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier

2023-01-24 Thread via GitHub
dongjoon-hyun closed pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier URL: https://github.com/apache/spark/pull/39716 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39387: [SPARK-41586][PYTHON] Introduce `pyspark.errors` and error classes for PySpark.

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39387: URL: https://github.com/apache/spark/pull/39387#discussion_r1085727543 ## python/pyspark/errors/tests/test_errors.py: ## @@ -0,0 +1,47 @@ +# -*- encoding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] [spark] dongjoon-hyun commented on pull request #39721: [SPARK-42171][PYSPARK] Enable `pyspark-errors` module test in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39721: URL: https://github.com/apache/spark/pull/39721#issuecomment-1402409916 cc @xinrong-meng too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] NarekDW closed pull request #39097: [SPARK-42169] Implement code generation for to_csv function (StructsToCsv)

2023-01-24 Thread via GitHub
NarekDW closed pull request #39097: [SPARK-42169] Implement code generation for to_csv function (StructsToCsv) URL: https://github.com/apache/spark/pull/39097 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] NarekDW commented on pull request #39097: [SPARK-42169] Implement code generation for to_csv function (StructsToCsv)

2023-01-24 Thread via GitHub
NarekDW commented on PR #39097: URL: https://github.com/apache/spark/pull/39097#issuecomment-1402112421 @cloud-fan finally I've got my JIRA account and created a separate ticket for this. I've closed this PR as I've renamed my branch in accordance with new JIRA ticket. Please check

[GitHub] [spark] srowen commented on pull request #39566: Patched()Fix Protobuf Java vulnerable to Uncontrolled Resource Consumption

2023-01-24 Thread via GitHub
srowen commented on PR #39566: URL: https://github.com/apache/spark/pull/39566#issuecomment-1402320043 Hold up a sec. First please read https://spark.apache.org/contributing.html Where does this actually affect Spark? You have only updated a protobuf dependency in the Kinesis

[GitHub] [spark] dongjoon-hyun commented on pull request #39716: [SPARK-42167][INFRA] Improve GitHub Action `lint` job to stop on failures earlier

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39716: URL: https://github.com/apache/spark/pull/39716#issuecomment-1402356012 Thank you, @HyukjinKwon . The `lint` job passed. Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] RunyaoChen commented on pull request #39711: [SPARK-41931][SQL] Better error message for incomplete complex type definition

2023-01-24 Thread via GitHub
RunyaoChen commented on PR #39711: URL: https://github.com/apache/spark/pull/39711#issuecomment-1402362690 @MaxGekk Thanks! Done enabling GitHub action. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39721: [SPARK-42171][PYSPARK] Enable `pyspark-errors` module test in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #39721: URL: https://github.com/apache/spark/pull/39721 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] srowen commented on pull request #39720: [SPARK-41500] [SQL] Year/Month Interval operations bug fix

2023-01-24 Thread via GitHub
srowen commented on PR #39720: URL: https://github.com/apache/spark/pull/39720#issuecomment-1402394943 CC @MaxGekk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AmplabJenkins commented on pull request #39710: [SPARK-42090][3.2] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
AmplabJenkins commented on PR #39710: URL: https://github.com/apache/spark/pull/39710#issuecomment-1402189273 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] AmplabJenkins commented on pull request #39709: [SPARK-42090][3.3] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
AmplabJenkins commented on PR #39709: URL: https://github.com/apache/spark/pull/39709#issuecomment-1402189361 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] srowen commented on pull request #39381: [SPARK-41554] fix changing of Decimal scale when scale decreased by m…

2023-01-24 Thread via GitHub
srowen commented on PR #39381: URL: https://github.com/apache/spark/pull/39381#issuecomment-1402327013 Is there a backport for 3.3. too? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] mridulm commented on pull request #39709: [SPARK-42090][3.3] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
mridulm commented on PR #39709: URL: https://github.com/apache/spark/pull/39709#issuecomment-1402392140 The test failure is unrelated. Merged to 3.3. Thanks for backporting this @akpatnam25 ! -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] mridulm closed pull request #39709: [SPARK-42090][3.3] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
mridulm closed pull request #39709: [SPARK-42090][3.3] Introduce sasl retry count in RetryingBlockTransferor URL: https://github.com/apache/spark/pull/39709 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] sunchao commented on pull request #39633: [SPARK-42038][SQL] SPJ: Support partially clustered distribution

2023-01-24 Thread via GitHub
sunchao commented on PR #39633: URL: https://github.com/apache/spark/pull/39633#issuecomment-1402391567 Hmm @cloud-fan , it seems a bit tricky to split the logic specific to SPJ out of `EnsureRequirements`, since it depends on `reorderJoinPredicates`, which is currently co-mingled with

[GitHub] [spark] cashmand commented on pull request #39718: [SPARK-42163] Fix schema pruning for non-foldable array index or map key

2023-01-24 Thread via GitHub
cashmand commented on PR #39718: URL: https://github.com/apache/spark/pull/39718#issuecomment-1402093051 @sigmod @viirya do you want to take a look at this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] NarekDW opened a new pull request, #39719: [SPARK-42169] Implement code generation for to_csv function (StructsToCsv)

2023-01-24 Thread via GitHub
NarekDW opened a new pull request, #39719: URL: https://github.com/apache/spark/pull/39719 ### What changes were proposed in this pull request? This PR enhances `StructsToCsv` class with `doGenCode` function instead of extending it from `CodegenFallback` trait (performance improvement).

[GitHub] [spark] mridulm commented on pull request #39710: [SPARK-42090][3.2] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
mridulm commented on PR #39710: URL: https://github.com/apache/spark/pull/39710#issuecomment-1402390256 Merged to 3.2. Thanks for fixing this @akpatnam25 ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] mridulm closed pull request #39710: [SPARK-42090][3.2] Introduce sasl retry count in RetryingBlockTransferor

2023-01-24 Thread via GitHub
mridulm closed pull request #39710: [SPARK-42090][3.2] Introduce sasl retry count in RetryingBlockTransferor URL: https://github.com/apache/spark/pull/39710 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] db-scnakandala opened a new pull request, #39722: [SPARK-42162]

2023-01-24 Thread via GitHub
db-scnakandala opened a new pull request, #39722: URL: https://github.com/apache/spark/pull/39722 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] NarekDW commented on pull request #39723: [SPARK-41302][SQL] Assign name to _LEGACY_ERROR_TEMP_1185

2023-01-24 Thread via GitHub
NarekDW commented on PR #39723: URL: https://github.com/apache/spark/pull/39723#issuecomment-1402458211 @itholic FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] rmcyang opened a new pull request, #39725: [SPARK-33573][FOLLOW-UP] Increment ignoredBlockBytes when shuffle push blocks are late or colliding

2023-01-24 Thread via GitHub
rmcyang opened a new pull request, #39725: URL: https://github.com/apache/spark/pull/39725 ### What changes were proposed in this pull request? Currently, the `ignoredBlockBytes` of server side metrics for push-based shuffle does not take the late-pushed blocks or the

[GitHub] [spark] dongjoon-hyun commented on pull request #39721: [SPARK-42171][PYSPARK][TESTS] Fix `pyspark-errors` module and enable it in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39721: URL: https://github.com/apache/spark/pull/39721#issuecomment-1402642236 Could you review this, @huaxingao ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] dtenedor commented on pull request #39657: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output

2023-01-24 Thread via GitHub
dtenedor commented on PR #39657: URL: https://github.com/apache/spark/pull/39657#issuecomment-1402647607 I don't know what is going on with the broken GitHub actions CI. I am going to drop this PR, delete my fork, then create another separate fork and create a new PR with the same changes.

[GitHub] [spark] dtenedor closed pull request #39657: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output

2023-01-24 Thread via GitHub
dtenedor closed pull request #39657: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output URL: https://github.com/apache/spark/pull/39657 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] dongjoon-hyun closed pull request #39721: [SPARK-42171][PYSPARK][TESTS] Fix `pyspark-errors` module and enable it in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun closed pull request #39721: [SPARK-42171][PYSPARK][TESTS] Fix `pyspark-errors` module and enable it in GitHub Action URL: https://github.com/apache/spark/pull/39721 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085980466 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala: ## @@ -161,6 +161,30 @@ trait ShowCreateTableSuiteBase extends

[GitHub] [spark] rithwik-db opened a new pull request, #39724: [SPARK-41775][PYTHON][FOLLOWUP] Fix stdout rerouting

2023-01-24 Thread via GitHub
rithwik-db opened a new pull request, #39724: URL: https://github.com/apache/spark/pull/39724 ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/39369 which aims to fix `stderr` rerouting to `stdout` which is

[GitHub] [spark] jbguerraz commented on pull request #32397: [WIP][SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2023-01-24 Thread via GitHub
jbguerraz commented on PR #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-1402508589 Thank you @ocworld :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dtenedor opened a new pull request, #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dtenedor opened a new pull request, #39726: URL: https://github.com/apache/spark/pull/39726 ### What changes were proposed in this pull request? Include column default values in DESCRIBE and SHOW CREATE TABLE output. ### Why are the changes needed? This helps users work

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085978481 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala: ## @@ -645,6 +645,17 @@ case class DescribeTableCommand( } else if

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085985386 ## sql/core/src/test/resources/sql-tests/inputs/show-create-table.sql: ## @@ -45,6 +45,14 @@ SHOW CREATE TABLE tbl; DROP TABLE tbl; +-- default column

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085984751 ## sql/core/src/test/resources/sql-tests/inputs/describe.sql: ## @@ -97,3 +97,14 @@ DROP VIEW temp_v; DROP VIEW temp_Data_Source_View; DROP VIEW v; + +--

[GitHub] [spark] dongjoon-hyun commented on pull request #39721: [SPARK-42171][PYSPARK][TESTS] Fix `pyspark-errors` module and enable it in GitHub Action

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39721: URL: https://github.com/apache/spark/pull/39721#issuecomment-1402666052 Thank you so much, @viirya and @huaxingao ! Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085981834 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructField.scala: ## @@ -160,7 +164,7 @@ case class StructField( */ def toDDL: String = {

[GitHub] [spark] EnricoMi commented on pull request #39640: [SPARK-38591][SQL] Add flatMapSortedGroups and cogroupSorted

2023-01-24 Thread via GitHub
EnricoMi commented on PR #39640: URL: https://github.com/apache/spark/pull/39640#issuecomment-1402553814 Found it: `Analyzer.resolveExpressionByPlanOutput`, follow-up PR is imminent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dtenedor commented on pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dtenedor commented on PR #39726: URL: https://github.com/apache/spark/pull/39726#issuecomment-1402674297 @gengliangwang here is my PR to add DESCRIBE and SHOW CREATE TABLE support for column default values again, and the CI appears to be running now. -- This is an automated message

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085977201 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala: ## @@ -645,6 +645,17 @@ case class DescribeTableCommand( } else if

[GitHub] [spark] NarekDW opened a new pull request, #39723: [SPARK-41302][SQL] Assign name to _LEGACY_ERROR_TEMP_1020

2023-01-24 Thread via GitHub
NarekDW opened a new pull request, #39723: URL: https://github.com/apache/spark/pull/39723 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_1185 -> "INVALID_IDENTIFIER_HAS_MORE_THAN_2_NAME_PARTS". ### Why are the

[GitHub] [spark] rmcyang commented on pull request #39725: [SPARK-33573][FOLLOW-UP] Increment ignoredBlockBytes when shuffle push blocks are late or colliding

2023-01-24 Thread via GitHub
rmcyang commented on PR #39725: URL: https://github.com/apache/spark/pull/39725#issuecomment-1402573328 cc @mridulm @otterc Please take a look, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] vinodkc commented on a diff in pull request #39449: [SPARK-40688][SQL] Support data masking built-in function 'mask_first_n'

2023-01-24 Thread via GitHub
vinodkc commented on code in PR #39449: URL: https://github.com/apache/spark/pull/39449#discussion_r1085963701 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/maskExpressions.scala: ## @@ -257,19 +271,272 @@ case class Mask( otherChar =

[GitHub] [spark] otterc commented on a diff in pull request #39725: [SPARK-33573][FOLLOW-UP] Increment ignoredBlockBytes when shuffle push blocks are late or colliding

2023-01-24 Thread via GitHub
otterc commented on code in PR #39725: URL: https://github.com/apache/spark/pull/39725#discussion_r1085930450 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1356,6 +1356,17 @@ private boolean isTooLate(

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1085984055 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructField.scala: ## @@ -141,6 +141,10 @@ case class StructField( } } + private def

[GitHub] [spark] rmcyang commented on a diff in pull request #39725: [SPARK-33573][FOLLOW-UP] Increment ignoredBlockBytes when shuffle push blocks are late or colliding

2023-01-24 Thread via GitHub
rmcyang commented on code in PR #39725: URL: https://github.com/apache/spark/pull/39725#discussion_r1086006244 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1393,6 +1405,7 @@ public void onData(String streamId,

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39727: Use scikit-learn instead of sklearn

2023-01-24 Thread via GitHub
dongjoon-hyun opened a new pull request, #39727: URL: https://github.com/apache/spark/pull/39727 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] gengliangwang commented on a diff in pull request #39642: [SPARK-41677][CORE][SQL][SS][UI] Add Protobuf serializer for `StreamingQueryProgressWrapper`

2023-01-24 Thread via GitHub
gengliangwang commented on code in PR #39642: URL: https://github.com/apache/spark/pull/39642#discussion_r1086049823 ## sql/core/src/main/scala/org/apache/spark/status/protobuf/sql/StreamingQueryProgressSerializer.scala: ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] [spark] gengliangwang commented on a diff in pull request #39642: [SPARK-41677][CORE][SQL][SS][UI] Add Protobuf serializer for `StreamingQueryProgressWrapper`

2023-01-24 Thread via GitHub
gengliangwang commented on code in PR #39642: URL: https://github.com/apache/spark/pull/39642#discussion_r1086049614 ## sql/core/src/main/scala/org/apache/spark/status/protobuf/sql/StreamingQueryProgressSerializer.scala: ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] [spark] HyukjinKwon commented on pull request #39375: [SPARK-36124][SQL] Support subqueries with correlation through UNION

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39375: URL: https://github.com/apache/spark/pull/39375#issuecomment-1402860722 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
HyukjinKwon commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086082677 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -44,7 +45,7 @@ import org.apache.spark.sql.functions.lit * * @since

[GitHub] [spark] zhenlineo commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
zhenlineo commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086087241 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -44,7 +45,7 @@ import org.apache.spark.sql.functions.lit * * @since 3.4.0

[GitHub] [spark] zhenlineo commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
zhenlineo commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086093964 ## connector/connect/client/jvm/pom.xml: ## @@ -75,6 +76,13 @@ mockito-core test + Review Comment: The SBT MiMa check has some

[GitHub] [spark] zhenlineo commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
zhenlineo commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086095693 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/CompatibilitySuite.scala: ## @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39728: [SPARK-42173][CORE] RpcAddress equality can fail

2023-01-24 Thread via GitHub
dongjoon-hyun commented on code in PR #39728: URL: https://github.com/apache/spark/pull/39728#discussion_r1086103013 ## core/src/main/scala/org/apache/spark/rpc/RpcAddress.scala: ## @@ -23,30 +23,37 @@ import org.apache.spark.util.Utils /** * Address for an RPC environment,

[GitHub] [spark] LuciferYang commented on a diff in pull request #39642: [SPARK-41677][CORE][SQL][SS][UI] Add Protobuf serializer for `StreamingQueryProgressWrapper`

2023-01-24 Thread via GitHub
LuciferYang commented on code in PR #39642: URL: https://github.com/apache/spark/pull/39642#discussion_r1086118621 ## sql/core/src/main/scala/org/apache/spark/status/protobuf/sql/StreamingQueryProgressSerializer.scala: ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] HyukjinKwon closed pull request #39727: [SPARK-42174][PYTHON][INFRA] Use `scikit-learn` instead of `sklearn`

2023-01-24 Thread via GitHub
HyukjinKwon closed pull request #39727: [SPARK-42174][PYTHON][INFRA] Use `scikit-learn` instead of `sklearn` URL: https://github.com/apache/spark/pull/39727 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] HyukjinKwon commented on pull request #39727: [SPARK-42174][PYTHON][INFRA] Use `scikit-learn` instead of `sklearn`

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39727: URL: https://github.com/apache/spark/pull/39727#issuecomment-1402877171 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
HyukjinKwon commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086096820 ## connector/connect/client/jvm/pom.xml: ## @@ -75,6 +76,13 @@ mockito-core test + Review Comment: Gotya. Let's probably add a couple

[GitHub] [spark] dongjoon-hyun commented on pull request #39727: [SPARK-42174][PYTHON][INFRA] Use `scikit-learn` instead of `sklearn`

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39727: URL: https://github.com/apache/spark/pull/39727#issuecomment-1402900556 Thank you, @HyukjinKwon ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on pull request #39728: [SPARK-42173][CORE] RpcAddress equality can fail

2023-01-24 Thread via GitHub
dongjoon-hyun commented on PR #39728: URL: https://github.com/apache/spark/pull/39728#issuecomment-1402912399 cc @mridulm , too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dtenedor commented on a diff in pull request #39726: [SPARK-42123][SQL] Include column default values in DESCRIBE and SHOW CREATE TABLE output #39657

2023-01-24 Thread via GitHub
dtenedor commented on code in PR #39726: URL: https://github.com/apache/spark/pull/39726#discussion_r1086099890 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala: ## @@ -645,6 +645,17 @@ case class DescribeTableCommand( } else if (isExtended)

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39712: [SPARK-42172][CONNECT] Scala Client Mima Compatibility Tests

2023-01-24 Thread via GitHub
HyukjinKwon commented on code in PR #39712: URL: https://github.com/apache/spark/pull/39712#discussion_r1086085628 ## connector/connect/client/jvm/pom.xml: ## @@ -75,6 +76,13 @@ mockito-core test + Review Comment: Can we use SBT to check this instead

[GitHub] [spark] HyukjinKwon commented on pull request #39656: [SPARK-42119][SQL] Add built-in table-valued functions inline and inline_outer

2023-01-24 Thread via GitHub
HyukjinKwon commented on PR #39656: URL: https://github.com/apache/spark/pull/39656#issuecomment-1402878991 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific
