[GitHub] [spark] grundprinzip commented on a diff in pull request #39016: [SPARK-41477][CONNECT][PYTHON] Correctly infer the datatype of literal integers

2022-12-09 Thread GitBox
grundprinzip commented on code in PR #39016: URL: https://github.com/apache/spark/pull/39016#discussion_r1045017515 ## python/pyspark/sql/connect/column.py: ## @@ -180,7 +180,12 @@ def to_plan(self, session: "SparkConnectClient") -> "proto.Expression": elif

[GitHub] [spark] MaxGekk commented on pull request #38972: [SPARK-41443][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061

2022-12-09 Thread GitBox
MaxGekk commented on PR #38972: URL: https://github.com/apache/spark/pull/38972#issuecomment-1345173649 @panbingkun Could you fix the test failure, seems it is related to your changes: ``` sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedException: "[COLUMN_NOT_FOUND] The

[GitHub] [spark] MaxGekk commented on a diff in pull request #38998: [SPARK-41463][SQL][TESTS] Ensure error class names contain only capital letters, numbers and underscores

2022-12-09 Thread GitBox
MaxGekk commented on code in PR #38998: URL: https://github.com/apache/spark/pull/38998#discussion_r1045013373 ## core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala: ## @@ -147,6 +147,18 @@ class SparkThrowableSuite extends SparkFunSuite {

[GitHub] [spark] MaxGekk closed pull request #38954: [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL`

2022-12-09 Thread GitBox
MaxGekk closed pull request #38954: [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL` URL: https://github.com/apache/spark/pull/38954 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] MaxGekk commented on pull request #38954: [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL`

2022-12-09 Thread GitBox
MaxGekk commented on PR #38954: URL: https://github.com/apache/spark/pull/38954#issuecomment-1345157464 +1, LGTM. Merging to master. Thank you, @LuciferYang. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins commented on pull request #39012: [SPARK-41475][CONNECT] Fix lint-scala command error and typo

2022-12-09 Thread GitBox
AmplabJenkins commented on PR #39012: URL: https://github.com/apache/spark/pull/39012#issuecomment-1345155848 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] AmplabJenkins commented on pull request #39013: [SPARK-41472][CONNECT][PYTHON] Implement the rest of string/binary functions

2022-12-09 Thread GitBox
AmplabJenkins commented on PR #39013: URL: https://github.com/apache/spark/pull/39013#issuecomment-1345155835 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun commented on pull request #39005: [SPARK-41467][BUILD] Upgrade httpclient from 4.5.13 to 4.5.14

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39005: URL: https://github.com/apache/spark/pull/39005#issuecomment-1345151502 All tests passed. Merged to master. Thank you, @panbingkun and all. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] dongjoon-hyun closed pull request #39005: [SPARK-41467][BUILD] Upgrade httpclient from 4.5.13 to 4.5.14

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #39005: [SPARK-41467][BUILD] Upgrade httpclient from 4.5.13 to 4.5.14 URL: https://github.com/apache/spark/pull/39005 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] dongjoon-hyun commented on pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39015: URL: https://github.com/apache/spark/pull/39015#issuecomment-1345151056 Thank you, @HyukjinKwon ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] zhengruifeng opened a new pull request, #39016: [SPARK-41477][CONNECT][PYTHON] Correctly infer the datatype of literal integers

2022-12-09 Thread GitBox
zhengruifeng opened a new pull request, #39016: URL: https://github.com/apache/spark/pull/39016 ### What changes were proposed in this pull request? check the bounds of integer and choose the correct datatypes ### Why are the changes needed? to match pyspark: ```
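For readers skimming this thread, the bounds check described in the PR summary above can be sketched roughly as follows. This is a minimal illustration, not the code from #39016: the function name `infer_integer_literal_type` and the string labels `"integer"` and `"long"` are hypothetical, and it assumes that a value fitting in a signed 32-bit range maps to the narrower type while anything else up to a signed 64-bit range maps to the wider one.

```python
# Hypothetical sketch of bounds-based type inference for integer literals.
# None of these names come from the PR; they are illustrative only.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def infer_integer_literal_type(value: int) -> str:
    """Return an illustrative type label for a Python int literal."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    if INT32_MIN <= value <= INT32_MAX:
        return "integer"  # fits in a signed 32-bit integer
    if INT64_MIN <= value <= INT64_MAX:
        return "long"  # needs a signed 64-bit integer
    raise ValueError(f"integer {value} does not fit in 64 bits")
```

Under these assumptions, `infer_integer_literal_type(1)` yields the narrower label and `infer_integer_literal_type(2**31)` yields the wider one; values beyond 64 bits are rejected rather than silently truncated.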

[GitHub] [spark] dongjoon-hyun commented on pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39015: URL: https://github.com/apache/spark/pull/39015#issuecomment-1345146063 BTW, if you don't mind, I'd like to land this to all live branches to reduce the community resources. WDYT, @HyukjinKwon and @gengliangwang ? -- This is an automated message from

[GitHub] [spark] dongjoon-hyun closed pull request #38991: [SPARK-41457][PYTHON][TESTS] Refactor type annotations and dependency checks in tests

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #38991: [SPARK-41457][PYTHON][TESTS] Refactor type annotations and dependency checks in tests URL: https://github.com/apache/spark/pull/38991 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] dongjoon-hyun commented on pull request #38991: [SPARK-41457][PYTHON][TESTS] Refactor type annotations and dependency checks in tests

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38991: URL: https://github.com/apache/spark/pull/38991#issuecomment-1345145538 All python and linter tests passed. Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] dongjoon-hyun closed pull request #39012: [SPARK-41475][CONNECT] Fix lint-scala command error and typo

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #39012: [SPARK-41475][CONNECT] Fix lint-scala command error and typo URL: https://github.com/apache/spark/pull/39012 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on code in PR #39015: URL: https://github.com/apache/spark/pull/39015#discussion_r1044989412 ## dev/sparktestsupport/utils.py: ## @@ -34,19 +34,22 @@ def determine_modules_for_files(filenames): Given a list of filenames, return the set of modules

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on code in PR #39015: URL: https://github.com/apache/spark/pull/39015#discussion_r1044988801 ## dev/sparktestsupport/utils.py: ## @@ -84,7 +84,7 @@ def identify_changed_files_from_git_commits(patch_sha, target_branch=None, targe ["git", "diff",

[GitHub] [spark] shuyouZZ commented on a diff in pull request #38983: [SPARK-41447][CORE] Clean up expired event log files that don't exist in listing db

2022-12-09 Thread GitBox
shuyouZZ commented on code in PR #38983: URL: https://github.com/apache/spark/pull/38983#discussion_r1044983924 ## core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala: ## @@ -1705,6 +1705,47 @@ abstract class FsHistoryProviderSuite extends

[GitHub] [spark] viirya commented on pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on PR #38993: URL: https://github.com/apache/spark/pull/38993#issuecomment-1345117121 > BTW, seems you change the PR description template by mistake. Can you restore the template? Can you restore to the standard template? -- This is an automated message from the

[GitHub] [spark] viirya commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044982387 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] idealspark commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
idealspark commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044970950 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] idealspark commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
idealspark commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044969362 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] dengziming commented on a diff in pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-09 Thread GitBox
dengziming commented on code in PR #38984: URL: https://github.com/apache/spark/pull/38984#discussion_r1044965012 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -305,7 +305,11 @@ class

[GitHub] [spark] idealspark commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
idealspark commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044964246 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] panbingkun commented on a diff in pull request #38972: [SPARK-41443][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061

2022-12-09 Thread GitBox
panbingkun commented on code in PR #38972: URL: https://github.com/apache/spark/pull/38972#discussion_r1044960581 ## core/src/main/resources/error/error-classes.json: ## @@ -109,6 +109,11 @@ "The column already exists. Consider to choose another name or rename the

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-09 Thread GitBox
HyukjinKwon commented on code in PR #38984: URL: https://github.com/apache/spark/pull/38984#discussion_r1044958569 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -305,7 +305,11 @@ class

[GitHub] [spark] Ngone51 commented on a diff in pull request #38876: [SPARK-41360][CORE] Avoid BlockManager re-registration if the executor has been lost

2022-12-09 Thread GitBox
Ngone51 commented on code in PR #38876: URL: https://github.com/apache/spark/pull/38876#discussion_r1044955782 ## core/src/main/scala/org/apache/spark/storage/BlockManager.scala: ## @@ -637,9 +637,11 @@ private[spark] class BlockManager( def reregister(): Unit = { //

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38991: [SPARK-41457][PYTHON][TESTS] Refactor type annotations and dependency checks in tests

2022-12-09 Thread GitBox
HyukjinKwon commented on code in PR #38991: URL: https://github.com/apache/spark/pull/38991#discussion_r1044955147 ## dev/lint-python: ## @@ -104,7 +104,7 @@ function mypy_data_test { -c dev/pyproject.toml \ --rootdir python \ --mypy-only-local-stub \ -

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
HyukjinKwon commented on code in PR #39015: URL: https://github.com/apache/spark/pull/39015#discussion_r1044952894 ## dev/sparktestsupport/utils.py: ## @@ -84,7 +84,7 @@ def identify_changed_files_from_git_commits(patch_sha, target_branch=None, targe ["git", "diff",

[GitHub] [spark] dongjoon-hyun commented on pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39014: URL: https://github.com/apache/spark/pull/39014#issuecomment-1344990963 All tests passed. Merged to master for Apache Spark 3.4.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] dongjoon-hyun closed pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar URL: https://github.com/apache/spark/pull/39014 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] dongjoon-hyun commented on pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39015: URL: https://github.com/apache/spark/pull/39015#issuecomment-1344988102 Thank you so much, @gengliangwang ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] dongjoon-hyun commented on pull request #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39015: URL: https://github.com/apache/spark/pull/39015#issuecomment-1344987151 Could you review this too please, @gengliangwang ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38922: [SPARK-41396][SQL][PROTOBUF] OneOf field support and recursion checks

2022-12-09 Thread GitBox
SandishKumarHN commented on code in PR #38922: URL: https://github.com/apache/spark/pull/38922#discussion_r1044857157 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala: ## @@ -38,6 +38,12 @@ private[sql] class ProtobufOptions(

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39015: [SPARK-41476][INFRA] Prevent `README.md` from triggering CIs

2022-12-09 Thread GitBox
dongjoon-hyun opened a new pull request, #39015: URL: https://github.com/apache/spark/pull/39015 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] dongjoon-hyun commented on pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39014: URL: https://github.com/apache/spark/pull/39014#issuecomment-1344983516 Thank you, @gengliangwang ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39014: URL: https://github.com/apache/spark/pull/39014#issuecomment-1344977841 Thank you, @SandishKumarHN . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] SandishKumarHN commented on pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
SandishKumarHN commented on PR #39014: URL: https://github.com/apache/spark/pull/39014#issuecomment-1344977544 @dongjoon-hyun LGTM, thanks for the PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] dongjoon-hyun commented on pull request #39014: [SPARK-41474][PROTOBUF][BUILD] Exclude `proto` files from `spark-protobuf` jar

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39014: URL: https://github.com/apache/spark/pull/39014#issuecomment-1344964855 Could you review this, @SandishKumarHN and @gengliangwang ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] zhengruifeng commented on pull request #38946: [SPARK-41414][CONNECT][PYTHON] Implement date/timestamp functions

2022-12-09 Thread GitBox
zhengruifeng commented on PR #38946: URL: https://github.com/apache/spark/pull/38946#issuecomment-1344961425 merged into master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] zhengruifeng closed pull request #38946: [SPARK-41414][CONNECT][PYTHON] Implement date/timestamp functions

2022-12-09 Thread GitBox
zhengruifeng closed pull request #38946: [SPARK-41414][CONNECT][PYTHON] Implement date/timestamp functions URL: https://github.com/apache/spark/pull/38946 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] srowen commented on a diff in pull request #38996: [SPARK-41008][MLLIB] Follow-up isotonic regression features deduplica…

2022-12-09 Thread GitBox
srowen commented on code in PR #38996: URL: https://github.com/apache/spark/pull/38996#discussion_r1044938891 ## docs/mllib-isotonic-regression.md: ## @@ -43,7 +43,17 @@ best fitting the original data points. which uses an approach to [parallelizing isotonic

[GitHub] [spark] dengziming commented on a diff in pull request #39012: [ SPARK-41475][CONNECT]: Fix lint-scala command error and typo

2022-12-09 Thread GitBox
dengziming commented on code in PR #39012: URL: https://github.com/apache/spark/pull/39012#discussion_r1044935825 ## dev/lint-scala: ## @@ -29,14 +29,14 @@ ERRORS=$(./build/mvn \ -Dscalafmt.skip=false \ -Dscalafmt.validateOnly=true \ -Dscalafmt.changedOnly=false

[GitHub] [spark] dengziming commented on a diff in pull request #39012: [MINOR][CONNECT]: Fix command error and typo

2022-12-09 Thread GitBox
dengziming commented on code in PR #39012: URL: https://github.com/apache/spark/pull/39012#discussion_r1044932932 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -322,8 +322,7 @@ class SparkConnectPlanner(session:

[GitHub] [spark] dongjoon-hyun commented on pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38994: URL: https://github.com/apache/spark/pull/38994#issuecomment-1344949254 No problem at all. If you checked locally, that's more than enough. :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun opened a new pull request, #39014: [SPARK-41474][BUILD] Exclude proto files from spark-protobuf

2022-12-09 Thread GitBox
dongjoon-hyun opened a new pull request, #39014: URL: https://github.com/apache/spark/pull/39014 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] HyukjinKwon commented on pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
HyukjinKwon commented on PR #38994: URL: https://github.com/apache/spark/pull/38994#issuecomment-1344946168 oh yeah. the tests passed but the linter failed. I just removed one ignore in linter that's not used, and I checked it locally. But sorry I should have waited for the test

[GitHub] [spark] dongjoon-hyun commented on pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38994: URL: https://github.com/apache/spark/pull/38994#issuecomment-1344945736 Ur, does this pass CI? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon closed pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
HyukjinKwon closed pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect URL: https://github.com/apache/spark/pull/38994 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] HyukjinKwon commented on pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
HyukjinKwon commented on PR #38994: URL: https://github.com/apache/spark/pull/38994#issuecomment-1344945421 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38994: [SPARK-41329][CONNECT] Resolve circular imports in Spark Connect

2022-12-09 Thread GitBox
HyukjinKwon commented on code in PR #38994: URL: https://github.com/apache/spark/pull/38994#discussion_r1044928308 ## python/pyspark/sql/connect/column.py: ## @@ -706,28 +705,30 @@ def substr(self, startPos: Union[int, "Column"], length: Union[int, "Column"]) -

[GitHub] [spark] gengliangwang commented on a diff in pull request #39000: [SPARK-41461][BUILD][CORE] Support user configurable protoc executables when building Spark Core.

2022-12-09 Thread GitBox
gengliangwang commented on code in PR #39000: URL: https://github.com/apache/spark/pull/39000#discussion_r1044927630 ## core/pom.xml: ## @@ -713,6 +693,71 @@ + + default-protoc + + + !skipDefaultProtoc + + +

[GitHub] [spark] github-actions[bot] closed pull request #36921: [SPARK-39481][SQL] Do not push down complex filter condition

2022-12-09 Thread GitBox
github-actions[bot] closed pull request #36921: [SPARK-39481][SQL] Do not push down complex filter condition URL: https://github.com/apache/spark/pull/36921 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] dongjoon-hyun commented on pull request #38985: [SPARK-41451][K8S] Avoid using empty abbrevMarker in StringUtils.abbreviate

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38985: URL: https://github.com/apache/spark/pull/38985#issuecomment-1344930220 Thank you for closing, @pan3793 . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] gengliangwang closed pull request #38988: [SPARK-41456][SQL] Improve the performance of try_cast

2022-12-09 Thread GitBox
gengliangwang closed pull request #38988: [SPARK-41456][SQL] Improve the performance of try_cast URL: https://github.com/apache/spark/pull/38988 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] gengliangwang commented on pull request #38988: [SPARK-41456][SQL] Improve the performance of try_cast

2022-12-09 Thread GitBox
gengliangwang commented on PR #38988: URL: https://github.com/apache/spark/pull/38988#issuecomment-1344894973 @cloud-fan @HyukjinKwon @LuciferYang Thanks for the review. Merging to master -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] xinrong-meng commented on pull request #39009: [SPARK-41225][CONNECT][PYTHON][FOLLOW-UP] Disable unsupported functions.

2022-12-09 Thread GitBox
xinrong-meng commented on PR #39009: URL: https://github.com/apache/spark/pull/39009#issuecomment-1344864352 Merged to master, thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] xinrong-meng closed pull request #39009: [SPARK-41225][CONNECT][PYTHON][FOLLOW-UP] Disable unsupported functions.

2022-12-09 Thread GitBox
xinrong-meng closed pull request #39009: [SPARK-41225][CONNECT][PYTHON][FOLLOW-UP] Disable unsupported functions. URL: https://github.com/apache/spark/pull/39009 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] xinrong-meng opened a new pull request, #39013: [SPARK-41472][CONNECT][PYTHON] Implement the rest of string/binary functions

2022-12-09 Thread GitBox
xinrong-meng opened a new pull request, #39013: URL: https://github.com/apache/spark/pull/39013 ### What changes were proposed in this pull request? Implement the rest of string/binary functions. The first commit is https://github.com/apache/spark/pull/38921. ### Why are the

[GitHub] [spark] dongjoon-hyun closed pull request #38982: [SPARK-41376][CORE][3.2] Correct the Netty preferDirectBufs check logic on executor start

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #38982: [SPARK-41376][CORE][3.2] Correct the Netty preferDirectBufs check logic on executor start URL: https://github.com/apache/spark/pull/38982 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] otterc commented on a diff in pull request #36165: [SPARK-36620][SHUFFLE] Add Push Based Shuffle client side read metrics

2022-12-09 Thread GitBox
otterc commented on code in PR #36165: URL: https://github.com/apache/spark/pull/36165#discussion_r1044862699 ## core/src/test/resources/HistoryServerExpectations/one_stage_json_with_partitionId_expectation.json: ## @@ -26,13 +26,23 @@ "outputBytes" : 0, "outputRecords" :

[GitHub] [spark] mridulm commented on a diff in pull request #36165: [SPARK-36620][SHUFFLE] Add Push Based Shuffle client side read metrics

2022-12-09 Thread GitBox
mridulm commented on code in PR #36165: URL: https://github.com/apache/spark/pull/36165#discussion_r1044823829 ## core/src/main/scala/org/apache/spark/status/api/v1/api.scala: ## @@ -302,7 +312,9 @@ class StageData private[spark]( @JsonDeserialize(using =

[GitHub] [spark] anchovYu commented on a diff in pull request #38776: [SPARK-27561][SQL] Support implicit lateral column alias resolution on Project and refactor Analyzer

2022-12-09 Thread GitBox
anchovYu commented on code in PR #38776: URL: https://github.com/apache/spark/pull/38776#discussion_r1044807443 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala: ## @@ -424,8 +424,51 @@ case class OuterReference(e: NamedExpression)

[GitHub] [spark] WolverineJiang commented on pull request #39000: [SPARK-41461][BUILD][CORE] Support user configurable protoc executables when building Spark Core.

2022-12-09 Thread GitBox
WolverineJiang commented on PR #39000: URL: https://github.com/apache/spark/pull/39000#issuecomment-1344724455 The pom of the core module has active profiles, and activeByDefault does not take effect. Therefore, property is used instead. -- This is an automated message from the Apache

[GitHub] [spark] amaliujia commented on pull request #39004: [SPARK-41466][BUILD] Change Scala Style configuration to catch AnyFunSuite instead of FunSuite

2022-12-09 Thread GitBox
amaliujia commented on PR #39004: URL: https://github.com/apache/spark/pull/39004#issuecomment-1344680268 late LGTM thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] anchovYu commented on a diff in pull request #38776: [SPARK-27561][SQL] Support implicit lateral column alias resolution on Project and refactor Analyzer

2022-12-09 Thread GitBox
anchovYu commented on code in PR #38776: URL: https://github.com/apache/spark/pull/38776#discussion_r1044744657 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -638,6 +638,14 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] viirya commented on pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on PR #38993: URL: https://github.com/apache/spark/pull/38993#issuecomment-1344649595 BTW, seems you change the PR description template by mistake. Can you restore the template? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] viirya commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044740898 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] viirya commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044738513 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] viirya commented on a diff in pull request #38993: [SPARK-41459][SQL] fix thrift server operation log output is empty

2022-12-09 Thread GitBox
viirya commented on code in PR #38993: URL: https://github.com/apache/spark/pull/38993#discussion_r1044737028 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/LogDivertAppender.java: ## @@ -276,12 +276,19 @@ private static StringLayout

[GitHub] [spark] dongjoon-hyun commented on pull request #38982: [SPARK-41376][CORE][3.2] Correct the Netty preferDirectBufs check logic on executor start

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38982: URL: https://github.com/apache/spark/pull/38982#issuecomment-1344618354 Merged to branch-3.2 too. Thank you, @pan3793 and @Yikun . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] vinodkc commented on pull request #38608: [SPARK-41080][SQL] Support Bit manipulation function SETBIT

2022-12-09 Thread GitBox
vinodkc commented on PR #38608: URL: https://github.com/apache/spark/pull/38608#issuecomment-1344618388 @gengliangwang, can you please review it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] amaliujia commented on a diff in pull request #38984: [SPARK-41349][CONNECT][PYTHON] Implement DataFrame.hint

2022-12-09 Thread GitBox
amaliujia commented on code in PR #38984: URL: https://github.com/apache/spark/pull/38984#discussion_r1044705096 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -305,7 +305,11 @@ class SparkConnectPlanner(session:

[GitHub] [spark] dongjoon-hyun commented on pull request #38991: [SPARK-41457][PYTHON][TESTS] Refactor type annotations and dependency checks in tests

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38991: URL: https://github.com/apache/spark/pull/38991#issuecomment-1344612390 Could you check the linter failure? ``` starting mypy data test... annotations failed data checks: = test session starts

[GitHub] [spark] xinrong-meng commented on a diff in pull request #38946: [SPARK-41414][CONNECT][PYTHON] Implement date/timestamp functions

2022-12-09 Thread GitBox
xinrong-meng commented on code in PR #38946: URL: https://github.com/apache/spark/pull/38946#discussion_r1044694589 ## python/pyspark/sql/tests/connect/test_connect_function.py: ## @@ -645,6 +645,153 @@ def test_string_functions(self): sdf.select(SF.encode("c",

[GitHub] [spark] gengliangwang commented on pull request #38999: [SPARK-41450][BUILD] Fix shading in `core` module

2022-12-09 Thread GitBox
gengliangwang commented on PR #38999: URL: https://github.com/apache/spark/pull/38999#issuecomment-1344591174 @pan3793 Thanks for fixing it! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] otterc commented on a diff in pull request #36165: [SPARK-36620][SHUFFLE] Add Push Based Shuffle client side read metrics

2022-12-09 Thread GitBox
otterc commented on code in PR #36165: URL: https://github.com/apache/spark/pull/36165#discussion_r1044679570 ## core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala: ## @@ -726,6 +736,61 @@ final class ShuffleBlockFetcherIterator( } } + //

[GitHub] [spark] steveloughran commented on pull request #38974: [SPARK-41392][BUILD] Make maven build Spark master with Hadoop 3.4.0-SNAPSHOT successful

2022-12-09 Thread GitBox
steveloughran commented on PR #38974: URL: https://github.com/apache/spark/pull/38974#issuecomment-1344581573 yeah, not going to happen for a while; 3.3.5 RC0 coming soon though; just trying to wrap up an abfs prefetch bug -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] LuciferYang commented on pull request #38974: [SPARK-41392][BUILD] Make maven build Spark master with Hadoop 3.4.0-SNAPSHOT successful

2022-12-09 Thread GitBox
LuciferYang commented on PR #38974: URL: https://github.com/apache/spark/pull/38974#issuecomment-1344577731 fine to me, close first ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] LuciferYang closed pull request #38974: [SPARK-41392][BUILD] Make maven build Spark master with Hadoop 3.4.0-SNAPSHOT successful

2022-12-09 Thread GitBox
LuciferYang closed pull request #38974: [SPARK-41392][BUILD] Make maven build Spark master with Hadoop 3.4.0-SNAPSHOT successful URL: https://github.com/apache/spark/pull/38974 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] MaxGekk commented on pull request #38997: [SPARK-41462][SQL] Date and timestamp type can up cast to TimestampNTZ

2022-12-09 Thread GitBox
MaxGekk commented on PR #38997: URL: https://github.com/apache/spark/pull/38997#issuecomment-1344574665 +1, LGTM. Merging to master. Thank you, @gengliangwang and @cloud-fan for review. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39012: [MINOR][CONNECT]: Fix command error and typo

2022-12-09 Thread GitBox
dongjoon-hyun commented on code in PR #39012: URL: https://github.com/apache/spark/pull/39012#discussion_r1044671065 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -322,8 +322,7 @@ class

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #39012: [MINOR][CONNECT]: Fix command error and typo

2022-12-09 Thread GitBox
dongjoon-hyun commented on code in PR #39012: URL: https://github.com/apache/spark/pull/39012#discussion_r1044670574 ## dev/lint-scala: ## @@ -29,14 +29,14 @@ ERRORS=$(./build/mvn \ -Dscalafmt.skip=false \ -Dscalafmt.validateOnly=true \

[GitHub] [spark] dongjoon-hyun commented on pull request #39004: [SPARK-41466][BUILD] Change Scala Style configuration to catch AnyFunSuite instead of FunSuite

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #39004: URL: https://github.com/apache/spark/pull/39004#issuecomment-1344576249 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dongjoon-hyun closed pull request #39004: [SPARK-41466][BUILD] Change Scala Style configuration to catch AnyFunSuite instead of FunSuite

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #39004: [SPARK-41466][BUILD] Change Scala Style configuration to catch AnyFunSuite instead of FunSuite URL: https://github.com/apache/spark/pull/39004 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] MaxGekk closed pull request #38997: [SPARK-41462][SQL] Date and timestamp type can up cast to TimestampNTZ

2022-12-09 Thread GitBox
MaxGekk closed pull request #38997: [SPARK-41462][SQL] Date and timestamp type can up cast to TimestampNTZ URL: https://github.com/apache/spark/pull/38997 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] rednaxelafx commented on pull request #38923: [SPARK-41395][SQL] `InterpretedMutableProjection` should use `setDecimal` to set null values for decimals in an unsafe row

2022-12-09 Thread GitBox
rednaxelafx commented on PR #38923: URL: https://github.com/apache/spark/pull/38923#issuecomment-1344558867 Post-hoc review: LGTM, this is a good catch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] warrenzhu25 commented on pull request #39011: [WIP][SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated

2022-12-09 Thread GitBox
warrenzhu25 commented on PR #39011: URL: https://github.com/apache/spark/pull/39011#issuecomment-1344553081 > cc @warrenzhu25 too It's really the change I want. Great work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] wineternity commented on a diff in pull request #38702: [SPARK-41187][CORE] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen

2022-12-09 Thread GitBox
wineternity commented on code in PR #38702: URL: https://github.com/apache/spark/pull/38702#discussion_r1044596644 ## core/src/main/scala/org/apache/spark/status/AppStatusListener.scala: ## @@ -674,22 +674,30 @@ private[spark] class AppStatusListener( delta }.orNull

[GitHub] [spark] Ngone51 commented on a diff in pull request #38702: [SPARK-41187][CORE] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happen

2022-12-09 Thread GitBox
Ngone51 commented on code in PR #38702: URL: https://github.com/apache/spark/pull/38702#discussion_r1044581470 ## core/src/main/scala/org/apache/spark/status/AppStatusListener.scala: ## @@ -674,22 +674,30 @@ private[spark] class AppStatusListener( delta }.orNull -

[GitHub] [spark] dengziming opened a new pull request, #39012: [MINOR][CONNECT]: Fix command error and typo

2022-12-09 Thread GitBox
dengziming opened a new pull request, #39012: URL: https://github.com/apache/spark/pull/39012 ### What changes were proposed in this pull request? We separated connect into server and common, but failed to update the `lint-scala` tool. This PR also fixes a typo (fase -> false) and formats the code.

[GitHub] [spark] dongjoon-hyun commented on pull request #38999: [SPARK-41450][BUILD] Fix shading in `core` module

2022-12-09 Thread GitBox
dongjoon-hyun commented on PR #38999: URL: https://github.com/apache/spark/pull/38999#issuecomment-1344459575 Merged to master. Thank you, @pan3793 and all! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] dongjoon-hyun closed pull request #38999: [SPARK-41450][BUILD] Fix shading in `core` module

2022-12-09 Thread GitBox
dongjoon-hyun closed pull request #38999: [SPARK-41450][BUILD] Fix shading in `core` module URL: https://github.com/apache/spark/pull/38999 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] Ngone51 commented on pull request #39011: [WIP][SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated

2022-12-09 Thread GitBox
Ngone51 commented on PR #39011: URL: https://github.com/apache/spark/pull/39011#issuecomment-1344456134 Marking as WIP for now because of the compilation error and the missing unit tests. Any feedback is still welcome. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] vinodkc commented on pull request #38146: [SPARK-40687][SQL] Support data masking built-in function 'mask'

2022-12-09 Thread GitBox
vinodkc commented on PR #38146: URL: https://github.com/apache/spark/pull/38146#issuecomment-1344455185 @gengliangwang, Review comments are addressed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] Ngone51 commented on a diff in pull request #39011: [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated

2022-12-09 Thread GitBox
Ngone51 commented on code in PR #39011: URL: https://github.com/apache/spark/pull/39011#discussion_r1044572392 ## core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala: ## @@ -1046,17 +1046,46 @@ private[spark] class TaskSetManager( /** Called by

[GitHub] [spark] Ngone51 commented on pull request #39011: [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated

2022-12-09 Thread GitBox
Ngone51 commented on PR #39011: URL: https://github.com/apache/spark/pull/39011#issuecomment-137148 cc @warrenzhu25 too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] Ngone51 opened a new pull request, #39011: [SPARK-41469][CORE] Avoid unnecessary task rerun on decommissioned executor lost if shuffle data migrated

2022-12-09 Thread GitBox
Ngone51 opened a new pull request, #39011: URL: https://github.com/apache/spark/pull/39011 ### What changes were proposed in this pull request? This PR proposes to avoid rerunning the finished shuffle map task in `TaskSetManager.executorLost()` if the executor lost is
