[GitHub] [spark] zhixingheyi-tian commented on pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord

2022-05-28 Thread GitBox
zhixingheyi-tian commented on PR #36659: URL: https://github.com/apache/spark/pull/36659#issuecomment-1140226861 > The GA job didn't pass, can you check? Hi @cloud-fan @srowen All three GA jobs passed. Thanks -- This is an automated message from the Apache Git Service. To
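
The idea in the PR title, rounding a byte count up to a word boundary without an if-else branch, can be sketched as follows. This is a minimal illustration, not the code from the PR: the 8-byte word size, the function names, and the particular bitwise formula are assumptions chosen for the example.

```python
def round_up_branching(num_bytes: int, word: int = 8) -> int:
    """Round up to a multiple of `word` using an explicit branch."""
    # For a power-of-two word size, masking equals num_bytes % word.
    remainder = num_bytes & (word - 1)
    if remainder == 0:
        return num_bytes
    return num_bytes + (word - remainder)


def round_up_bitwise(num_bytes: int, word: int = 8) -> int:
    """Round up without a branch: add word - 1, then clear the low bits."""
    return (num_bytes + word - 1) & ~(word - 1)
```

Both functions agree for non-negative inputs when `word` is a power of two. Python integers do not overflow, so this sketch says nothing about behavior near the maximum value of a fixed-width integer type.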

[GitHub] [spark] Yikun opened a new pull request, #36712: [SPARK-38819][PS] replace "NaN" with real "None" value in indexes

2022-05-28 Thread GitBox
Yikun opened a new pull request, #36712: URL: https://github.com/apache/spark/pull/36712 ### What changes were proposed in this pull request? Since pandas 1.4 https://github.com/pandas-dev/pandas/commit/aaba0efd630ed607c5aaaef7b5f43d2fe90ca81c > Series.__repr__() and

[GitHub] [spark] zhixingheyi-tian commented on pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord

2022-05-28 Thread GitBox
zhixingheyi-tian commented on PR #36659: URL: https://github.com/apache/spark/pull/36659#issuecomment-1140187223 > **continuous-integration/appveyor/pr** — AppVeyor build Hi @cloud-fan @srowen All three GA jobs passed. Thanks

[GitHub] [spark] cxzl25 opened a new pull request, #36710: [SPARK-39261][CORE][FOLLOWUP] Improve newline formatting for error messages

2022-05-28 Thread GitBox
cxzl25 opened a new pull request, #36710: URL: https://github.com/apache/spark/pull/36710 ### What changes were proposed in this pull request? Use `java.nio.file.Files.delete` instead of `org.apache.commons.io.FileUtils#delete` ### Why are the changes needed?

[GitHub] [spark] sandeepvinayak commented on pull request #36680: [SPARK-39283][CORE] Fix deadlock between TaskMemoryManager and UnsafeExternalSorter.SpillableIterator

2022-05-28 Thread GitBox
sandeepvinayak commented on PR #36680: URL: https://github.com/apache/spark/pull/36680#issuecomment-1140203155 @JoshRosen Just took another look at the code; the fix I made is for the deadlock between `TaskMemoryManager` and `UnsafeExternalSorter.SpillableIterator`, which is what we faced and

[GitHub] [spark] wankunde opened a new pull request, #36709: [SPARK-39325][CORE]Improve MapOutputTracker convertMapStatuses perfor…

2022-05-28 Thread GitBox
wankunde opened a new pull request, #36709: URL: https://github.com/apache/spark/pull/36709 …mance ### What changes were proposed in this pull request? Optimize `MapOutputTracker.convertMapStatuses()` method. ### Why are the changes needed?

[GitHub] [spark] Yikun opened a new pull request, #36711: [SPARK-39314][PS] Respect ps.concat sort parameter to follow pandas behavior

2022-05-28 Thread GitBox
Yikun opened a new pull request, #36711: URL: https://github.com/apache/spark/pull/36711 ### What changes were proposed in this pull request? Respect ps.concat sort parameter to follow pandas behavior: - Remove the multi-index special sort process case and add ut. - Still keep

[GitHub] [spark] sunchao commented on a diff in pull request #36697: [SPARK-39313][SQL] `toCatalystOrdering` should fail if V2Expression can not be translated

2022-05-28 Thread GitBox
sunchao commented on code in PR #36697: URL: https://github.com/apache/spark/pull/36697#discussion_r884084376 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanPartitioning.scala: ## @@ -32,15 +32,15 @@ import

[GitHub] [spark] Yikun commented on pull request #36712: [SPARK-39326][PYTHON][PS] replace "NaN" with real "None" value in indexes

2022-05-28 Thread GitBox
Yikun commented on PR #36712: URL: https://github.com/apache/spark/pull/36712#issuecomment-1140234328 We need to clean up all these doctests after bumping pandas to 1.4; will address them together in SPARK-39150.

[GitHub] [spark] pan3793 commented on pull request #36697: [SPARK-39313][SQL] `toCatalystOrdering` should fail if V2Expression can not be translated

2022-05-28 Thread GitBox
pan3793 commented on PR #36697: URL: https://github.com/apache/spark/pull/36697#issuecomment-1140265452 CI is green, please take another look @dongjoon-hyun

[GitHub] [spark] srowen commented on pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-05-28 Thread GitBox
srowen commented on PR #36499: URL: https://github.com/apache/spark/pull/36499#issuecomment-1140281363 So if I create a NUMBER in Teradata without a scale, then it uses a system default scale. Do we know what that is? I'm confused if Teradata doesn't record and return the actual scale

[GitHub] [spark] srowen commented on pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord

2022-05-28 Thread GitBox
srowen commented on PR #36659: URL: https://github.com/apache/spark/pull/36659#issuecomment-1140286823 I think the appveyor error is unrelated. I'll merge shortly

[GitHub] [spark] srowen commented on a diff in pull request #36666: [SPARK-39289][CORE][SQL][SS] Replace `map(_.toBoolean).getOrElse(false/true)` with `exists/forall(_.toBoolean)`

2022-05-28 Thread GitBox
srowen commented on code in PR #36666: URL: https://github.com/apache/spark/pull/36666#discussion_r884136877 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala: ## @@ -54,31 +54,31 @@ private[sql] class JSONOptions( val samplingRatio =
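
The rewrite named in the PR title relies on two equivalences for an optional value: `map(p).getOrElse(false)` behaves like `exists(p)`, and `map(p).getOrElse(true)` behaves like `forall(p)`. The sketch below checks this in Python, modelling a Scala `Option[String]` as `Optional[str]` with `None` for the empty case. All function names are hypothetical, and `to_boolean` only loosely mirrors Scala's `String.toBoolean`.

```python
from typing import Optional


def to_boolean(s: str) -> bool:
    """Parse "true"/"false" case-insensitively; reject anything else."""
    lowered = s.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"not a boolean: {s!r}")


def map_get_or_else(opt: Optional[str], default: bool) -> bool:
    """Analogue of opt.map(_.toBoolean).getOrElse(default)."""
    return default if opt is None else to_boolean(opt)


def exists(opt: Optional[str]) -> bool:
    """Analogue of opt.exists(_.toBoolean): False when the value is absent."""
    return opt is not None and to_boolean(opt)


def forall(opt: Optional[str]) -> bool:
    """Analogue of opt.forall(_.toBoolean): True when the value is absent."""
    return opt is None or to_boolean(opt)
```

The only case where the two forms could differ is the empty one, and there `exists` yields the `false` default and `forall` yields the `true` default.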

[GitHub] [spark] dongjoon-hyun commented on pull request #36707: [SPARK-39324][CORE] Log `ExecutorDecommission` as INFO level in `TaskSchedulerImpl`

2022-05-28 Thread GitBox
dongjoon-hyun commented on PR #36707: URL: https://github.com/apache/spark/pull/36707#issuecomment-1140303016 Thank you so much, @wangyum . Merged to master.

[GitHub] [spark] dongjoon-hyun closed pull request #36707: [SPARK-39324][CORE] Log `ExecutorDecommission` as INFO level in `TaskSchedulerImpl`

2022-05-28 Thread GitBox
dongjoon-hyun closed pull request #36707: [SPARK-39324][CORE] Log `ExecutorDecommission` as INFO level in `TaskSchedulerImpl` URL: https://github.com/apache/spark/pull/36707

[GitHub] [spark] ravwojdyla commented on a diff in pull request #36430: [WIP][SPARK-38904] Select by schema

2022-05-28 Thread GitBox
ravwojdyla commented on code in PR #36430: URL: https://github.com/apache/spark/pull/36430#discussion_r884187298 ## sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -1593,6 +1593,35 @@ class Dataset[T] private[sql]( @scala.annotation.varargs def

[GitHub] [spark] wangyum opened a new pull request, #36713: [SPARK-39265][SQL][FOLLOWUP] Fix test failure when SPARK_ANSI_SQL_MODE is enabled

2022-05-28 Thread GitBox
wangyum opened a new pull request, #36713: URL: https://github.com/apache/spark/pull/36713 ### What changes were proposed in this pull request? Fix test failure when `SPARK_ANSI_SQL_MODE` is enabled: ``` 2022-05-28T21:02:01.9025896Z - INSERT rows, ALTER TABLE ADD COLUMNS with

[GitHub] [spark] wangyum commented on pull request #36713: [SPARK-39265][SQL][FOLLOWUP] Fix test failure when SPARK_ANSI_SQL_MODE is enabled

2022-05-28 Thread GitBox
wangyum commented on PR #36713: URL: https://github.com/apache/spark/pull/36713#issuecomment-1140354174 cc @dtenedor @gengliangwang

[GitHub] [spark] AmplabJenkins commented on pull request #36701: [SPARK-39179][PYTHON][TESTS] Improve the test coverage for pyspark/shuffle.py

2022-05-28 Thread GitBox
AmplabJenkins commented on PR #36701: URL: https://github.com/apache/spark/pull/36701#issuecomment-1140376433 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36709: [SPARK-39325][CORE]Improve MapOutputTracker convertMapStatuses performance

2022-05-28 Thread GitBox
AmplabJenkins commented on PR #36709: URL: https://github.com/apache/spark/pull/36709#issuecomment-1140324927 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36710: [SPARK-39261][CORE][FOLLOWUP] Improve newline formatting for error messages

2022-05-28 Thread GitBox
AmplabJenkins commented on PR #36710: URL: https://github.com/apache/spark/pull/36710#issuecomment-1140324924 Can one of the admins verify this patch?

[GitHub] [spark] beliefer opened a new pull request, #36714: [SPARK-39320][SQL] Support aggregate function `MEDIAN`

2022-05-28 Thread GitBox
beliefer opened a new pull request, #36714: URL: https://github.com/apache/spark/pull/36714 ### What changes were proposed in this pull request? Many mainstream databases support the aggregate function `MEDIAN`. **Syntax:** Aggregate function `MEDIAN( )` Window function
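
For readers unfamiliar with the aggregate, a median over a column can be sketched in plain Python as below. This is an illustration of the general definition only, not the implementation proposed in the PR. Skipping `None` values (as SQL aggregates usually skip NULLs) and averaging the two middle values for an even count are assumptions of this example, and the function name is hypothetical.

```python
import statistics
from typing import Iterable, Optional


def median_ignoring_nulls(values: Iterable[Optional[float]]) -> Optional[float]:
    """Return the median of the non-null values, or None if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    # statistics.median averages the two middle values for an even count.
    return statistics.median(present)
```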

[GitHub] [spark] srowen closed pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord

2022-05-28 Thread GitBox
srowen closed pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord URL: https://github.com/apache/spark/pull/36659

[GitHub] [spark] srowen commented on pull request #36659: [SPARK-39282][SQL] Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord

2022-05-28 Thread GitBox
srowen commented on PR #36659: URL: https://github.com/apache/spark/pull/36659#issuecomment-1140316615 Merged to master

[GitHub] [spark] sunchao commented on pull request #36697: [SPARK-39313][SQL] `toCatalystOrdering` should fail if V2Expression can not be translated

2022-05-28 Thread GitBox
sunchao commented on PR #36697: URL: https://github.com/apache/spark/pull/36697#issuecomment-1140327360 Hmm, thinking more about this, I think maybe we should fail the analysis on the write path, even if a V2 transform exists in the function catalog. Otherwise, the write may fail at a later

[GitHub] [spark] dcoliversun commented on a diff in pull request #36666: [SPARK-39289][CORE][SQL][SS] Replace `map(_.toBoolean).getOrElse(false/true)` with `exists/forall(_.toBoolean)`

2022-05-28 Thread GitBox
dcoliversun commented on code in PR #36666: URL: https://github.com/apache/spark/pull/36666#discussion_r884194187 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala: ## @@ -54,31 +54,31 @@ private[sql] class JSONOptions( val samplingRatio =

[GitHub] [spark] dcoliversun closed pull request #36666: [SPARK-39289][CORE][SQL][SS] Replace `map(_.toBoolean).getOrElse(false/true)` with `exists/forall(_.toBoolean)`

2022-05-28 Thread GitBox
dcoliversun closed pull request #36666: [SPARK-39289][CORE][SQL][SS] Replace `map(_.toBoolean).getOrElse(false/true)` with `exists/forall(_.toBoolean)` URL: https://github.com/apache/spark/pull/36666

[GitHub] [spark] wangyum closed pull request #36710: [SPARK-39261][CORE][FOLLOWUP] Improve newline formatting for error messages

2022-05-28 Thread GitBox
wangyum closed pull request #36710: [SPARK-39261][CORE][FOLLOWUP] Improve newline formatting for error messages URL: https://github.com/apache/spark/pull/36710

[GitHub] [spark] wangyum commented on pull request #36710: [SPARK-39261][CORE][FOLLOWUP] Improve newline formatting for error messages

2022-05-28 Thread GitBox
wangyum commented on PR #36710: URL: https://github.com/apache/spark/pull/36710#issuecomment-114037 Merged to master.

[GitHub] [spark] github-actions[bot] commented on pull request #35334: [SPARK-38034][SQL] Optimize TransposeWindow rule

2022-05-28 Thread GitBox
github-actions[bot] commented on PR #35334: URL: https://github.com/apache/spark/pull/35334#issuecomment-1140348353 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #35536: [SPARK-38222][SQL] Expose Node Description attribute in SQL Rest API

2022-05-28 Thread GitBox
github-actions[bot] closed pull request #35536: [SPARK-38222][SQL] Expose Node Description attribute in SQL Rest API URL: https://github.com/apache/spark/pull/35536

[GitHub] [spark] github-actions[bot] commented on pull request #34453: [SPARK-37173][SQL] SparkGetFunctionOperation return builtin function only once

2022-05-28 Thread GitBox
github-actions[bot] commented on PR #34453: URL: https://github.com/apache/spark/pull/34453#issuecomment-1140348363 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #35278: [SPARK-37677][CORE] Decompress the ZIP file and retain the original file permissions

2022-05-28 Thread GitBox
github-actions[bot] commented on PR #35278: URL: https://github.com/apache/spark/pull/35278#issuecomment-1140348356 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] beliefer commented on pull request #36708: [SPARK-37623][SQL] Support ANSI Aggregate Function: regr_intercept

2022-05-28 Thread GitBox
beliefer commented on PR #36708: URL: https://github.com/apache/spark/pull/36708#issuecomment-1140376742 ping @MaxGekk cc @cloud-fan