[GitHub] [spark] beliefer commented on a diff in pull request #36877: [SPARK-39479][SQL] DS V2 supports push down math functions(non ANSI)

2022-06-15 Thread GitBox
beliefer commented on code in PR #36877: URL: https://github.com/apache/spark/pull/36877#discussion_r898714207 ## sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala: ## @@ -542,6 +544,60 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36887: [SPARK-39490][K8S] Support `ipFamilyPolicy` and `ipFamilies` in Driver Service

2022-06-15 Thread GitBox
dongjoon-hyun opened a new pull request, #36887: URL: https://github.com/apache/spark/pull/36887 ### What changes were proposed in this pull request? This PR aims to support `ipFamilyPolicy` and `ipFamilies` in the Driver Service: ```yaml $ kubectl get svc spark-xxx-driver-svc -oyaml apiVersion: v1 kind: Service
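The truncated yaml above can be sketched as a full Service manifest. This is a hypothetical illustration of the Kubernetes dual-stack fields the PR title names, not the actual output from the PR; all values are assumptions.

```yaml
# Illustrative sketch only: a driver Service with the dual-stack fields
# (`ipFamilyPolicy`, `ipFamilies`) the PR proposes to make configurable.
apiVersion: v1
kind: Service
metadata:
  name: spark-xxx-driver-svc
spec:
  clusterIP: None
  ipFamilyPolicy: PreferDualStack   # assumed example value
  ipFamilies:                       # assumed example values
    - IPv6
    - IPv4
```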

[GitHub] [spark] MaxGekk commented on a diff in pull request #36857: [SPARK-39470][SQL] Support cast of ANSI intervals to decimals

2022-06-15 Thread GitBox
MaxGekk commented on code in PR #36857: URL: https://github.com/apache/spark/pull/36857#discussion_r898699293 ## sql/core/src/test/resources/sql-tests/inputs/cast.sql: ## @@ -116,3 +116,11 @@ select cast(interval '10' day as bigint); select cast(interval '-1000' month as
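The diff touches `cast.sql`, whose existing cases cast intervals to integral types; the feature under review extends this to decimals. The queries below are a hedged sketch of what SPARK-39470 enables, not the PR's actual test cases.

```sql
-- Existing coverage in cast.sql casts ANSI intervals to integral types:
select cast(interval '10' day as bigint);
-- Illustrative only (assumed semantics of SPARK-39470): the same
-- intervals cast to decimal, preserving sign and magnitude.
select cast(interval '10' day as decimal(10, 2));
select cast(interval '-1000' month as decimal(10, 2));
```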

[GitHub] [spark] MaxGekk commented on pull request #36857: [SPARK-39470][SQL] Support cast of ANSI intervals to decimals

2022-06-15 Thread GitBox
MaxGekk commented on PR #36857: URL: https://github.com/apache/spark/pull/36857#issuecomment-1157241074 @cloud-fan @srielau I added more checks, please, take a look at this PR one more time. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] dongjoon-hyun commented on pull request #36879: [SPARK-39482][DOCS] Add build and test documentation on IPv6

2022-06-15 Thread GitBox
dongjoon-hyun commented on PR #36879: URL: https://github.com/apache/spark/pull/36879#issuecomment-1157238953 Thank you, @HyukjinKwon !

[GitHub] [spark] LuciferYang commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
LuciferYang commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898697170 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -50,8 +50,6 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] dongjoon-hyun commented on pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host`

2022-06-15 Thread GitBox
dongjoon-hyun commented on PR #36882: URL: https://github.com/apache/spark/pull/36882#issuecomment-1157238708 Oh, thank you for the approval and merging, @HyukjinKwon, too.

[GitHub] [spark] HyukjinKwon commented on pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36871: URL: https://github.com/apache/spark/pull/36871#issuecomment-1157236737 cc @bersprockets too if you find some time to review.

[GitHub] [spark] HyukjinKwon commented on pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36871: URL: https://github.com/apache/spark/pull/36871#issuecomment-1157236219 Took a cursory look. @MaxGekk do you remember the context here? I remember we didn't merge this change because the legacy fast format parser (Java 8 libraries) did not support the

[GitHub] [spark] Jonathancui123 commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
Jonathancui123 commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898694871 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala: ## @@ -358,4 +358,19 @@ class UnivocityParserSuite extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898694396 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala: ## @@ -206,28 +218,27 @@ class UnivocityParser( // If fails to

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898693212 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala: ## @@ -358,4 +358,19 @@ class UnivocityParserSuite extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898692835 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala: ## @@ -358,4 +358,19 @@ class UnivocityParserSuite extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898692373 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala: ## @@ -358,4 +358,19 @@ class UnivocityParserSuite extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36871: [SPARK-39469][SQL] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898692164 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala: ## @@ -169,6 +174,14 @@ class CSVInferSchema(val options: CSVOptions) extends

[GitHub] [spark] MaxGekk commented on a diff in pull request #36857: [SPARK-39470][SQL] Support cast of ANSI intervals to decimals

2022-06-15 Thread GitBox
MaxGekk commented on code in PR #36857: URL: https://github.com/apache/spark/pull/36857#discussion_r898691634 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: ## @@ -1015,6 +1014,11 @@ case class Cast( } catch { case _:

[GitHub] [spark] huaxingao commented on a diff in pull request #36877: [SPARK-39479][SQL] DS V2 supports push down math functions(non ANSI)

2022-06-15 Thread GitBox
huaxingao commented on code in PR #36877: URL: https://github.com/apache/spark/pull/36877#discussion_r898685809 ## sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala: ## @@ -542,6 +544,60 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with

[GitHub] [spark] huaxingao commented on pull request #36886: [MINOR][SQL] Remove duplicate code for `AggregateExpression.isAggregate` usage

2022-06-15 Thread GitBox
huaxingao commented on PR #36886: URL: https://github.com/apache/spark/pull/36886#issuecomment-1157223864 Merged to master. Thanks @wangyum

[GitHub] [spark] huaxingao closed pull request #36886: [MINOR][SQL] Remove duplicate code for `AggregateExpression.isAggregate` usage

2022-06-15 Thread GitBox
huaxingao closed pull request #36886: [MINOR][SQL] Remove duplicate code for `AggregateExpression.isAggregate` usage URL: https://github.com/apache/spark/pull/36886

[GitHub] [spark] huaxingao commented on pull request #36696: [SPARK-39312][SQL] Use parquet native In predicate for in filter push down

2022-06-15 Thread GitBox
huaxingao commented on PR #36696: URL: https://github.com/apache/spark/pull/36696#issuecomment-1157221620 @wangyum Thank you very much for helping me test this!

[GitHub] [spark] HyukjinKwon commented on pull request #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36413: URL: https://github.com/apache/spark/pull/36413#issuecomment-1157219031 that's fine. i think it's good that we know there's a problem there

[GitHub] [spark] srowen closed pull request #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files

2022-06-15 Thread GitBox
srowen closed pull request #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files URL: https://github.com/apache/spark/pull/36413

[GitHub] [spark] srowen commented on pull request #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files

2022-06-15 Thread GitBox
srowen commented on PR #36413: URL: https://github.com/apache/spark/pull/36413#issuecomment-1157218266 Ah shoot OK thank you. You were right

[GitHub] [spark] EnricoMi opened a new pull request, #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files

2022-06-15 Thread GitBox
EnricoMi opened a new pull request, #36413: URL: https://github.com/apache/spark/pull/36413 ### What changes were proposed in this pull request? The CI should not fail when there are no test result files to be downloaded. ### Why are the changes needed? The CI workflow "Report

[GitHub] [spark] wangyum commented on pull request #36696: [SPARK-39312][SQL] Use parquet native In predicate for in filter push down

2022-06-15 Thread GitBox
wangyum commented on PR #36696: URL: https://github.com/apache/spark/pull/36696#issuecomment-1157212653 I have tested it for more than 2 weeks with no data issues.

[GitHub] [spark] gengliangwang closed pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang closed pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around URL: https://github.com/apache/spark/pull/36880

[GitHub] [spark] LuciferYang commented on pull request #36876: [SPARK-39464][CORE][TESTS][FOLLOWUP] Use Utils.localHostNameForURI instead of Utils.localCanonicalHostName in tests

2022-06-15 Thread GitBox
LuciferYang commented on PR #36876: URL: https://github.com/apache/spark/pull/36876#issuecomment-1157208938 Yes, using `export SPARK_LOCAL_HOSTNAME="[fe80::e63d:1aff:fe28:...]"` can pass the suites, but using `export SPARK_LOCAL_IP=::1` still does not, so using `export SPARK_LOCAL_IP=::1`
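The distinction between the two variables can be sketched as follows. This is a hypothetical illustration (using the loopback address `::1` rather than the link-local address elided above): a hostname string substituted into a URI must carry brackets around an IPv6 literal, while a raw IP setting does not.

```shell
# Sketch: SPARK_LOCAL_HOSTNAME is spliced verbatim into URLs, so an IPv6
# literal needs brackets there; SPARK_LOCAL_IP is a plain address.
# Values are illustrative, not from the PR.
export SPARK_LOCAL_IP="::1"
export SPARK_LOCAL_HOSTNAME="[::1]"
echo "spark://${SPARK_LOCAL_HOSTNAME}:7077"
# prints spark://[::1]:7077
```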

[GitHub] [spark] gengliangwang commented on pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang commented on PR #36880: URL: https://github.com/apache/spark/pull/36880#issuecomment-1157208934 Thanks, merging to master

[GitHub] [spark] HyukjinKwon commented on pull request #36413: [SPARK-39074][CI] Fail on upload, not download of missing test files

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36413: URL: https://github.com/apache/spark/pull/36413#issuecomment-1157205469 Seems like it fails in some cases such as https://github.com/apache/spark/commit/3b709ebf8aea0f0b21f5ee721900f88e3f5e8b4e

[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
cloud-fan commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898655223 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -50,8 +50,6 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] wangyum commented on a diff in pull request #36886: [MINOR][SQL] Remove duplicate code for `AggregateExpression.isAggregate` usage

2022-06-15 Thread GitBox
wangyum commented on code in PR #36886: URL: https://github.com/apache/spark/pull/36886#discussion_r898653846 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -301,20 +301,16 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] wangyum opened a new pull request, #36886: [MINOR][SQL] Remove duplicate code for `AggregateExpression.isAggregate` usage

2022-06-15 Thread GitBox
wangyum opened a new pull request, #36886: URL: https://github.com/apache/spark/pull/36886 ### What changes were proposed in this pull request? Remove duplicate code for `AggregateExpression.isAggregate` usage. ### Why are the changes needed? Make the code easier to

[GitHub] [spark] HyukjinKwon closed pull request #34406: [MINOR][DOCS][PS] Fix the documentation for escapechar in read_csv

2022-06-15 Thread GitBox
HyukjinKwon closed pull request #34406: [MINOR][DOCS][PS] Fix the documentation for escapechar in read_csv URL: https://github.com/apache/spark/pull/34406

[GitHub] [spark] HyukjinKwon commented on pull request #34406: [MINOR][DOCS][PS] Fix the documentation for escapechar in read_csv

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #34406: URL: https://github.com/apache/spark/pull/34406#issuecomment-1157183575 Merged to master.

[GitHub] [spark] viirya commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
viirya commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898653220 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala: ## @@ -50,8 +50,6 @@ trait CheckAnalysis extends PredicateHelper with

[GitHub] [spark] HyukjinKwon commented on pull request #34406: Minor fix to docs for read_csv

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #34406: URL: https://github.com/apache/spark/pull/34406#issuecomment-1157183037 I pushed a fix to match with pandas' https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #34406: Minor fix to docs for read_csv

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #34406: URL: https://github.com/apache/spark/pull/34406#discussion_r898653016 ## python/pyspark/pandas/namespace.py: ## @@ -272,7 +272,7 @@ def read_csv( The character used to denote the start and end of a quoted item. Quoted items

[GitHub] [spark] viirya commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
viirya commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898652235 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] {

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898649842 ## core/src/main/scala/org/apache/spark/util/JsonProtocol.scala: ## @@ -663,66 +854,69 @@ private[spark] object JsonProtocol { case `stageExecutorMetrics` =>

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898648926 ## core/src/main/scala/org/apache/spark/util/JsonProtocol.scala: ## @@ -360,255 +460,342 @@ private[spark] object JsonProtocol { * * The behavior here must

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898647504 ## core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala: ## @@ -916,13 +958,13 @@ private[spark] object JsonProtocolSuite extends Assertions { }

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898646978 ## core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala: ## @@ -610,13 +618,34 @@ class JsonProtocolSuite extends SparkFunSuite { | "Event" :

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898646642 ## core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala: ## @@ -610,13 +618,34 @@ class JsonProtocolSuite extends SparkFunSuite { | "Event" :

[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
cloud-fan commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898644825 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan]

[GitHub] [spark] JoshRosen commented on a diff in pull request #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen commented on code in PR #36885: URL: https://github.com/apache/spark/pull/36885#discussion_r898644147 ## core/src/main/scala/org/apache/spark/util/JsonProtocol.scala: ## @@ -360,255 +460,342 @@ private[spark] object JsonProtocol { * * The behavior here must

[GitHub] [spark] LuciferYang commented on pull request #36876: [SPARK-39464][CORE][TESTS][FOLLOWUP] Use Utils.localHostNameForURI instead of Utils.localCanonicalHostName in tests

2022-06-15 Thread GitBox
LuciferYang commented on PR #36876: URL: https://github.com/apache/spark/pull/36876#issuecomment-1157169372 > Firewall Thanks, fell asleep yesterday... I'll try this today

[GitHub] [spark] viirya commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
viirya commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898640889 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] {

[GitHub] [spark] JoshRosen opened a new pull request, #36885: [SPARK-39489][CORE] Improve event logging JsonProtocol performance by using Jackson instead of Json4s

2022-06-15 Thread GitBox
JoshRosen opened a new pull request, #36885: URL: https://github.com/apache/spark/pull/36885 ### What changes were proposed in this pull request? This PR improves the performance of `org.apache.spark.util.JsonProtocol` by replacing all uses of Json4s with uses of Jackson
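The performance claim rests on streaming serialization: Jackson's `JsonGenerator` writes tokens directly to the output, whereas Json4s first builds an intermediate AST. The sketch below illustrates that pattern in isolation; it is an assumed example of the general Jackson API, not code from the PR, and the event name is only illustrative.

```scala
// Minimal sketch (assumed shape, not the PR's code): writing JSON with
// Jackson's streaming JsonGenerator, which avoids allocating a Json4s
// AST for every event.
import com.fasterxml.jackson.core.JsonFactory
import java.io.StringWriter

val writer = new StringWriter()
val gen = new JsonFactory().createGenerator(writer)
gen.writeStartObject()
gen.writeStringField("Event", "SparkListenerStageCompleted") // illustrative field
gen.writeNumberField("Stage ID", 1)
gen.writeEndObject()
gen.close()
// writer now holds {"Event":"SparkListenerStageCompleted","Stage ID":1}
```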

[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
cloud-fan commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898635585 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan]

[GitHub] [spark] dongjoon-hyun commented on pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host`

2022-06-15 Thread GitBox
dongjoon-hyun commented on PR #36882: URL: https://github.com/apache/spark/pull/36882#issuecomment-1157160262 Thank you @mridulm !

[GitHub] [spark] viirya commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
viirya commented on code in PR #36809: URL: https://github.com/apache/spark/pull/36809#discussion_r898623736 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] {

[GitHub] [spark] mridulm commented on pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-15 Thread GitBox
mridulm commented on PR #35906: URL: https://github.com/apache/spark/pull/35906#issuecomment-1157144340 Please do take a look at the changes @otterc. I will circle back to this later next week

[GitHub] [spark] mridulm commented on pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host`

2022-06-15 Thread GitBox
mridulm commented on PR #36882: URL: https://github.com/apache/spark/pull/36882#issuecomment-1157143900 Thanks for fixing this @dongjoon-hyun !

[GitHub] [spark] mridulm commented on a diff in pull request #36162: [SPARK-32170][CORE] Improve the speculation through the stage task metrics.

2022-06-15 Thread GitBox
mridulm commented on code in PR #36162: URL: https://github.com/apache/spark/pull/36162#discussion_r897072986 ## core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala: ## @@ -2245,6 +2250,220 @@ class TaskSetManagerSuite

[GitHub] [spark] cloud-fan commented on pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox
cloud-fan commented on PR #36809: URL: https://github.com/apache/spark/pull/36809#issuecomment-1157142097 cc @LuciferYang @amaliujia @viirya

[GitHub] [spark] cloud-fan commented on a diff in pull request #36857: [SPARK-39470][SQL] Support cast of ANSI intervals to decimals

2022-06-15 Thread GitBox
cloud-fan commented on code in PR #36857: URL: https://github.com/apache/spark/pull/36857#discussion_r898618909 ## sql/core/src/test/resources/sql-tests/inputs/cast.sql: ## @@ -116,3 +116,11 @@ select cast(interval '10' day as bigint); select cast(interval '-1000' month as

[GitHub] [spark] cloud-fan commented on a diff in pull request #36857: [SPARK-39470][SQL] Support cast of ANSI intervals to decimals

2022-06-15 Thread GitBox
cloud-fan commented on code in PR #36857: URL: https://github.com/apache/spark/pull/36857#discussion_r898617688 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: ## @@ -1015,6 +1014,11 @@ case class Cast( } catch { case _:

[GitHub] [spark] cloud-fan commented on pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

2022-06-15 Thread GitBox
cloud-fan commented on PR #36873: URL: https://github.com/apache/spark/pull/36873#issuecomment-1157137986 thanks, merging to master/3.3/3.2/3.1!

[GitHub] [spark] cloud-fan closed pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

2022-06-15 Thread GitBox
cloud-fan closed pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float URL: https://github.com/apache/spark/pull/36873

[GitHub] [spark] ulysses-you commented on a diff in pull request #33522: [SPARK-36290][SQL] Pull out join condition

2022-06-15 Thread GitBox
ulysses-you commented on code in PR #33522: URL: https://github.com/apache/spark/pull/33522#discussion_r898610674 ## sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala: ## @@ -1057,7 +1057,7 @@ class JoinSuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] koodin9 opened a new pull request, #36884: [SPARK-39485][SQL] When hiveMetastoreJars is path, get hive configs from origLoader

2022-06-15 Thread GitBox
koodin9 opened a new pull request, #36884: URL: https://github.com/apache/spark/pull/36884 ### What changes were proposed in this pull request? Adds hive resource URL of origLoader, the existing classloader, to the newly created classloader of IsolatedClientLoader. ### Why are

[GitHub] [spark] HyukjinKwon closed pull request #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes

2022-06-15 Thread GitBox
HyukjinKwon closed pull request #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes URL: https://github.com/apache/spark/pull/36883

[GitHub] [spark] HyukjinKwon commented on pull request #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36883: URL: https://github.com/apache/spark/pull/36883#issuecomment-1157104320 Merged to master, branch-3.3 and branch-3.2.

[GitHub] [spark] bersprockets commented on a diff in pull request #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes

2022-06-15 Thread GitBox
bersprockets commented on code in PR #36883: URL: https://github.com/apache/spark/pull/36883#discussion_r898586898 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala: ## @@ -444,7 +444,8 @@ case class Inline(child: Expression) extends

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes

2022-06-15 Thread GitBox
HyukjinKwon commented on code in PR #36883: URL: https://github.com/apache/spark/pull/36883#discussion_r898585360 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala: ## @@ -444,7 +444,8 @@ case class Inline(child: Expression) extends

[GitHub] [spark] HyukjinKwon closed pull request #36879: [SPARK-39482][DOCS] Add build and test documentation on IPv6

2022-06-15 Thread GitBox
HyukjinKwon closed pull request #36879: [SPARK-39482][DOCS] Add build and test documentation on IPv6 URL: https://github.com/apache/spark/pull/36879

[GitHub] [spark] HyukjinKwon commented on pull request #36879: [SPARK-39482][DOCS] Add build and test documentation on IPv6

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36879: URL: https://github.com/apache/spark/pull/36879#issuecomment-1157098949 Merged to master.

[GitHub] [spark] HyukjinKwon closed pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host`

2022-06-15 Thread GitBox
HyukjinKwon closed pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host` URL: https://github.com/apache/spark/pull/36882

[GitHub] [spark] HyukjinKwon commented on pull request #36882: [SPARK-39468][CORE][FOLLOWUP] Use `lazy val` for `host`

2022-06-15 Thread GitBox
HyukjinKwon commented on PR #36882: URL: https://github.com/apache/spark/pull/36882#issuecomment-1157098472 Merged to master.

[GitHub] [spark] wangyum commented on a diff in pull request #36874: [SPARK-39475][SQL] Pull out complex join keys for shuffled join

2022-06-15 Thread GitBox
wangyum commented on code in PR #36874: URL: https://github.com/apache/spark/pull/36874#discussion_r898559792 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: ## @@ -210,6 +210,8 @@ abstract class Optimizer(catalogManager: CatalogManager)

[GitHub] [spark] wangyum commented on a diff in pull request #33522: [SPARK-36290][SQL] Pull out join condition

2022-06-15 Thread GitBox
wangyum commented on code in PR #33522: URL: https://github.com/apache/spark/pull/33522#discussion_r898557548 ## sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala: ## @@ -1057,7 +1057,7 @@ class JoinSuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] Jonathancui123 commented on a diff in pull request #36871: [WIP][SPARK-39469] Infer date type for CSV schema inference

2022-06-15 Thread GitBox
Jonathancui123 commented on code in PR #36871: URL: https://github.com/apache/spark/pull/36871#discussion_r898222912 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala: ## @@ -117,8 +124,8 @@ class CSVInferSchema(val options: CSVOptions)
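
The idea behind CSV date inference, sketched outside of Spark's actual `CSVInferSchema` code: during type inference, a field is promoted to a date type only if it parses against the configured date pattern. The object and method names below are hypothetical, as is the fixed `yyyy-MM-dd` pattern; Spark's real implementation consults `CSVOptions`.

```scala
import java.time.LocalDate
import java.time.format.{DateTimeFormatter, DateTimeParseException}

object DateInferenceSketch {
  // Assumed pattern; the real code would read the dateFormat option.
  private val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")

  // True if the field parses as a date under the configured pattern.
  def looksLikeDate(field: String): Boolean =
    try { LocalDate.parse(field, formatter); true }
    catch { case _: DateTimeParseException => false }
}
```

A schema inferrer would call this per sampled value and fall back to a string type on the first mismatch.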

[GitHub] [spark] bersprockets opened a new pull request, #36883: [SPARK-39061][SQL] Set nullable correctly for `Inline` output attributes

2022-06-15 Thread GitBox
bersprockets opened a new pull request, #36883: URL: https://github.com/apache/spark/pull/36883 ### What changes were proposed in this pull request? Change `Inline#elementSchema` to make each struct field nullable when the containing array has a null element. ### Why are the
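
The fix described above can be illustrated with a standalone sketch (the `Field` case class and `elementSchema` helper are simplifications, not Spark's actual `StructField`/`Inline` API): when the input array may contain null struct elements, every field of the exploded output must be marked nullable, otherwise downstream operators may assume non-null values that turn out to be null.

```scala
// Simplified stand-in for Spark's StructField.
case class Field(name: String, nullable: Boolean)

object InlineNullabilitySketch {
  // If the containing array can hold a null struct, exploding it yields
  // null for every field of that row, so each field must be nullable.
  def elementSchema(fields: Seq[Field], arrayContainsNull: Boolean): Seq[Field] =
    if (arrayContainsNull) fields.map(_.copy(nullable = true)) else fields
}
```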

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36868: [SPARK-39468][CORE] Improve `RpcAddress` to add `[]` to `IPv6` if needed

2022-06-15 Thread GitBox
dongjoon-hyun commented on code in PR #36868: URL: https://github.com/apache/spark/pull/36868#discussion_r898459231 ## core/src/main/scala/org/apache/spark/rpc/RpcAddress.scala: ## @@ -23,7 +23,9 @@ import org.apache.spark.util.Utils /** * Address for an RPC environment,
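
The motivation for bracketing IPv6 addresses can be shown with a minimal sketch (names are hypothetical, not Spark's `RpcAddress` API): a bare IPv6 literal such as `::1` already contains colons, so a `host:port` string is ambiguous unless the host is wrapped in `[]`, as URI syntax requires.

```scala
object RpcAddressSketch {
  // Assumption: a host containing ':' and not already bracketed is an
  // unbracketed IPv6 literal and must be wrapped before appending the port.
  def hostPort(host: String, port: Int): String = {
    val h = if (host.contains(":") && !host.startsWith("[")) s"[$host]" else host
    s"$h:$port"
  }
}
```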

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36882: [SPARK-39468][CORE][FOLLOWUP] Use 'lazy val' for host

2022-06-15 Thread GitBox
dongjoon-hyun opened a new pull request, #36882: URL: https://github.com/apache/spark/pull/36882 ### What changes were proposed in this pull request? This PR aims to use `lazy val host` instead of `val host`. ### Why are the changes needed? To address the review comments
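
The `lazy val` change suggested here relies on standard Scala semantics, sketched below with a hypothetical class (not Spark's actual `RpcAddress`): a plain `val` is computed eagerly in the constructor, while `lazy val` defers the computation until first access and caches the result, avoiding work for instances whose `host` is never read.

```scala
class AddrSketch(hostPort: String) {
  // With `val`, this split would run for every instance at construction;
  // `lazy val` runs it only on first access, then memoizes the result.
  lazy val host: String = hostPort.split(":")(0)
}
```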

[GitHub] [spark] dtenedor commented on a diff in pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
dtenedor commented on code in PR #36880: URL: https://github.com/apache/spark/pull/36880#discussion_r898422483 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -241,4 +242,36 @@ object ResolveDefaultColumns { }

[GitHub] [spark] dtenedor commented on pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
dtenedor commented on PR #36880: URL: https://github.com/apache/spark/pull/36880#issuecomment-1156934535 > @dtenedor Thanks for the work! This simplifies the code implementation! Note that after this refactoring we will need extra work if we decide to support column default value with

[GitHub] [spark] gengliangwang commented on pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang commented on PR #36880: URL: https://github.com/apache/spark/pull/36880#issuecomment-1156912614 @dtenedor Thanks for the work! This simplifies the code implementation! Note that after this refactoring we will need extra work if we decide to support column default value with

[GitHub] [spark] gengliangwang commented on a diff in pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang commented on code in PR #36880: URL: https://github.com/apache/spark/pull/36880#discussion_r898396863 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -241,4 +242,36 @@ object ResolveDefaultColumns { }

[GitHub] [spark] edgarRd opened a new pull request, #36881: [SPARK-39484][SQL] Fix field name case sensitivity for struct type in V2WriteCommand.outputResolved

2022-06-15 Thread GitBox
edgarRd opened a new pull request, #36881: URL: https://github.com/apache/spark/pull/36881 ### What changes were proposed in this pull request? When a V2 write uses an input with a struct type which contains differences in the casing of field names, the `caseSensitive` config
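
The class of bug described here comes from comparing field names with plain `==` regardless of the session's case-sensitivity setting. A minimal sketch of the intended behavior (the object and method names are hypothetical, not Spark's resolver API):

```scala
object ResolutionSketch {
  // Honor the caseSensitive flag when matching struct field names,
  // instead of always using case-sensitive equality.
  def namesMatch(expected: String, actual: String, caseSensitive: Boolean): Boolean =
    if (caseSensitive) expected == actual else expected.equalsIgnoreCase(actual)
}
```

Under case-insensitive analysis, `col` and `COL` should resolve to the same field; only with `caseSensitive = true` should they be treated as distinct.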

[GitHub] [spark] dtenedor commented on a diff in pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
dtenedor commented on code in PR #36880: URL: https://github.com/apache/spark/pull/36880#discussion_r898349505 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -241,4 +243,33 @@ object ResolveDefaultColumns { }

[GitHub] [spark] mridulm commented on a diff in pull request #36868: [SPARK-39468][CORE] Improve `RpcAddress` to add `[]` to `IPv6` if needed

2022-06-15 Thread GitBox
mridulm commented on code in PR #36868: URL: https://github.com/apache/spark/pull/36868#discussion_r898332785 ## core/src/main/scala/org/apache/spark/rpc/RpcAddress.scala: ## @@ -23,7 +23,9 @@ import org.apache.spark.util.Utils /** * Address for an RPC environment, with

[GitHub] [spark] gengliangwang commented on a diff in pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang commented on code in PR #36880: URL: https://github.com/apache/spark/pull/36880#discussion_r898328782 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -241,4 +243,33 @@ object ResolveDefaultColumns { }

[GitHub] [spark] gengliangwang commented on a diff in pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
gengliangwang commented on code in PR #36880: URL: https://github.com/apache/spark/pull/36880#discussion_r898328157 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -143,6 +144,7 @@ object ResolveDefaultColumns { }

[GitHub] [spark] otterc commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-15 Thread GitBox
otterc commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r898325741 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -656,6 +771,206 @@ public void registerExecutor(String

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36868: [SPARK-39468][CORE] Improve `RpcAddress` to add `[]` to `IPv6` if needed

2022-06-15 Thread GitBox
dongjoon-hyun commented on code in PR #36868: URL: https://github.com/apache/spark/pull/36868#discussion_r898323400 ## core/src/main/scala/org/apache/spark/rpc/RpcAddress.scala: ## @@ -23,7 +23,9 @@ import org.apache.spark.util.Utils /** * Address for an RPC environment,

[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

2022-06-15 Thread GitBox
dtenedor commented on code in PR #36771: URL: https://github.com/apache/spark/pull/36771#discussion_r898323664 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala: ## @@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)

[GitHub] [spark] dtenedor commented on pull request #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the primary Analyzer around

2022-06-15 Thread GitBox
dtenedor commented on PR #36880: URL: https://github.com/apache/spark/pull/36880#issuecomment-1156828533 Hi @gengliangwang, here is a separate PR to refactor DEFAULT column support to avoid passing around the main `Analyzer` as we talked about offline.

[GitHub] [spark] dtenedor opened a new pull request, #36880: [SPARK-39383][SQL] Refactor DEFAULT column support to skip passing the Analyzer around

2022-06-15 Thread GitBox
dtenedor opened a new pull request, #36880: URL: https://github.com/apache/spark/pull/36880 ### What changes were proposed in this pull request? Refactor DEFAULT column support to skip passing the main `Analyzer` around. Instead, the `ResolveDefaultColumnsUtil.scala` file gains the

[GitHub] [spark] dongjoon-hyun commented on pull request #36876: [SPARK-39464][CORE][TESTS][FOLLOWUP] Use Utils.localHostNameForURI instead of Utils.localCanonicalHostName in tests

2022-06-15 Thread GitBox
dongjoon-hyun commented on PR #36876: URL: https://github.com/apache/spark/pull/36876#issuecomment-1156820299 BTW, if you are using Mac, please disable `Firewall` during testing.

[GitHub] [spark] AmplabJenkins commented on pull request #36872: [SPARK-36252][CORE]: Add log files rolling policy for driver running in cluster mode with spark standalone cluster

2022-06-15 Thread GitBox
AmplabJenkins commented on PR #36872: URL: https://github.com/apache/spark/pull/36872#issuecomment-1156803887 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36871: [WIP] infer date type in csv

2022-06-15 Thread GitBox
AmplabJenkins commented on PR #36871: URL: https://github.com/apache/spark/pull/36871#issuecomment-1156803934 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

2022-06-15 Thread GitBox
AmplabJenkins commented on PR #36873: URL: https://github.com/apache/spark/pull/36873#issuecomment-1156803849 Can one of the admins verify this patch?

[GitHub] [spark] amaliujia commented on a diff in pull request #36641: [SPARK-39263][SQL] Make GetTable, TableExists and DatabaseExists be compatible with 3 layer namespace

2022-06-15 Thread GitBox
amaliujia commented on code in PR #36641: URL: https://github.com/apache/spark/pull/36641#discussion_r898293801 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -250,8 +251,18 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-15 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r898266509 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -576,6 +661,7 @@ public MergeStatuses

[GitHub] [spark] zhouyejoe commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-15 Thread GitBox
zhouyejoe commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r898238961 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -656,6 +771,206 @@ public void registerExecutor(String
