Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-20 Thread via GitHub


uros-db commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2121183739

   fix should be ready https://github.com/apache/spark/pull/46661
   please review @cloud-fan @HyukjinKwon @dongjoon-hyun
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-20 Thread via GitHub


dongjoon-hyun commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2121013158

   Gentle ping, @uros-db . The CI is still broken.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-20 Thread via GitHub


HyukjinKwon closed pull request #46649: [SPARK-44838][SQL][FOLLOW-UP] Fix the 
test for raise_error by using default type for strings
URL: https://github.com/apache/spark/pull/46649


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


uros-db commented on code in PR #46649:
URL: https://github.com/apache/spark/pull/46649#discussion_r1606212289


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -1048,6 +1048,9 @@ object TypeCoercion extends TypeCoercionBase {
   }
 }
 
+  // Allows the cast between different collated strings to match with ANSI 
behavior.

Review Comment:
   on it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


HyukjinKwon commented on code in PR #46649:
URL: https://github.com/apache/spark/pull/46649#discussion_r1606157978


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -1048,6 +1048,9 @@ object TypeCoercion extends TypeCoercionBase {
   }
 }
 
+  // Allows the cast between different collated strings to match with ANSI 
behavior.

Review Comment:
   @uros-db please open a separate PR - I can close mine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


HyukjinKwon commented on code in PR #46649:
URL: https://github.com/apache/spark/pull/46649#discussion_r1606157733


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -1048,6 +1048,9 @@ object TypeCoercion extends TypeCoercionBase {
   }
 }
 
+  // Allows the cast between different collated strings to match with ANSI 
behavior.

Review Comment:
   `AbstractMapType(StringTypeAnyCollation, StringTypeAnyCollation)` does not 
work with non ANSI type coercion here. We should fix it here in any event.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


cloud-fan commented on code in PR #46649:
URL: https://github.com/apache/spark/pull/46649#discussion_r1606154991


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -1048,6 +1048,9 @@ object TypeCoercion extends TypeCoercionBase {
   }
 }
 
+  // Allows the cast between different collated strings to match with ANSI 
behavior.

Review Comment:
   This is a behavior change.
   
   @uros-db I looked at the code base, we have implicit cast rules for 
`AbstractArrayType`, but not `AbstractMapType`, seems like a miss?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


HyukjinKwon commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2119477261

   im gonna switch back to just fix ANSI disabeld type coercion rule but I 
think we should revisit those whole thing .. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-19 Thread via GitHub


HyukjinKwon commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2119473364

   Seems like that still doesn't work with ANSI disabled:
   
   ```
   org.apache.spark.sql.AnalysisException: 
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve 
"raise_error(USER_RAISED_EXCEPTION, map(errorMessage, 1))" due to data type 
mismatch: The second parameter requires the "MAP" type, however 
"map(errorMessage, 1)" has the type "MAP". SQLSTATE: 42K09; line 1 
pos 7;
   'Project [unresolvedalias(raise_error(USER_RAISED_EXCEPTION, 
map(errorMessage, 1), NullType))]
   +- OneRowRelation
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


HyukjinKwon commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2118804792

   Sure, that sounds like more localized fix


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


uros-db commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2118751764

   @HyukjinKwon I believe you could also use: 
`AbstractMapType(StringTypeAnyCollation, StringTypeAnyCollation)` for 
`inputTypes` in `RaiseError` (misc.scala) instead of `MapType(StringType, 
StringType)`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


uros-db commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2118751278

   It was my understanding that this wouldn't be a problem, since this second 
parameter (MapType) is only used internally in Spark to raise errors


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


HyukjinKwon commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2118744355

   For a bit of more context, the test fails as below:
   
   ```
   org.apache.spark.sql.AnalysisException: 
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve 
"raise_error(USER_RAISED_EXCEPTION, map(errorMessage, 'aa' collate 
UTF8_BINARY_LCASE))" due to data type mismatch: The second parameter requires 
the "MAP" type, however "map(errorMessage, 'aa' collate 
UTF8_BINARY_LCASE)" has the type "MAP". SQLSTATE: 42K09; line 1 pos 7;
   'Project [unresolvedalias(raise_error(cast(USER_RAISED_EXCEPTION as string 
collate UTF8_BINARY_LCASE), map(errorMessage, aa), NullType))]
   +- OneRowRelation
   
at 
org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.dataTypeMismatch(package.scala:73)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7(CheckAnalysis.scala:315)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$7$adapted(CheckAnalysis.scala:302)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:243)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:243)
at scala.collection.immutable.Vector.foreach(Vector.scala:2124)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:243)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$6(CheckAnalysis.scala:302)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$6$adapted(CheckAnalysis.scala:302)
at scala.collection.immutable.List.foreach(List.scala:334)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$2(CheckAnalysis.scala:302)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$2$adapted(CheckAnalysis.scala:216)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis0(CheckAnalysis.scala:216)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis0$(CheckAnalysis.scala:198)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis0(Analyzer.scala:192)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:190)
at 
org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:161)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:192)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:214)
at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:393)
at 
org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:212)
at 
org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:92)
at 
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
at 
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:225)
at 
org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:599)
at 
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:225)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:923)
at 
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:224)
at 
org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:92)
at 
org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:73)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$3(Dataset.scala:118)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:923)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:115)
at 
org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:660)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:923)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:681)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 

Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


HyukjinKwon commented on PR #46649:
URL: https://github.com/apache/spark/pull/46649#issuecomment-2118743892

   cc @cloud-fan and @uros-db 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-44838][SQL][FOLLOW-UP] Fix the test for raise_error by using default type for strings [spark]

2024-05-18 Thread via GitHub


HyukjinKwon commented on code in PR #46649:
URL: https://github.com/apache/spark/pull/46649#discussion_r1605735539


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala:
##
@@ -969,6 +969,8 @@ object TypeCoercion extends TypeCoercionBase {
 // Note that ret is nullable to avoid typing a lot of Some(...) in this 
local scope.
 // We wrap immediately an Option after this.
 @Nullable val ret: DataType = (inType, expectedType) match {
+  case (_: StringType, _: StringType) => expectedType.defaultConcreteType

Review Comment:
   This seems to be already working in ANSI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org