harshmotw-db commented on code in PR #48379:
URL: https://github.com/apache/spark/pull/48379#discussion_r1804216301


##########
sql/core/src/test/scala/org/apache/spark/sql/CollationSQLRegexpSuite.scala:
##########
@@ -96,6 +98,31 @@ class CollationSQLRegexpSuite
     }
   }
 
+  test("RegExpReplace throws the right exception when replace fails on a particular row") {
+    val tableName = "regexpReplaceException"
+    withTable(tableName) {
+      Seq("NO_CODEGEN", "CODEGEN_ONLY").foreach { codegenMode =>
+        withSQLConf("spark.sql.codegen.factoryMode" -> codegenMode) {
+          sql(s"CREATE TABLE IF NOT EXISTS $tableName(s STRING)")
+          sql(s"INSERT INTO $tableName VALUES('first last')")
+          val query = s"SELECT regexp_replace(s, '(?<first>[a-zA-Z]+) (?<last>[a-zA-Z]+)', " +
+            s"'$$3 $$1') FROM $tableName"

Review Comment:
   @beliefer The designer of `RegExpReplace` may well have considered this
case, and in my opinion it works exactly as intended. The issue is that a group
index in the replacement string can be out of bounds (here `$3`, when the
pattern defines only two groups), and in that case we currently surface the raw
library error. This PR aims to handle these exceptions better.
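
To make the out-of-bounds case concrete, here is a minimal sketch in plain Scala, outside Spark. It assumes the library error in question comes from `java.util.regex`; the object `RegexGroupDemo` and its `rewrite` helper are hypothetical names used only for illustration and are not part of the PR.

```scala
import java.util.regex.Pattern
import scala.util.Try

// Hypothetical helper, not Spark code: applies a replacement string to the
// same two-group pattern used in the test query above.
object RegexGroupDemo {
  private val pattern =
    Pattern.compile("(?<first>[a-zA-Z]+) (?<last>[a-zA-Z]+)")

  // Wraps the call in Try so the library exception can be inspected
  // instead of propagating to the caller.
  def rewrite(input: String, replacement: String): Try[String] =
    Try(pattern.matcher(input).replaceAll(replacement))

  def main(args: Array[String]): Unit = {
    // Both referenced groups exist, so the replacement succeeds.
    println(rewrite("first last", "$2 $1"))
    // Group 3 does not exist in the pattern; java.util.regex rejects the
    // reference with an IndexOutOfBoundsException.
    println(rewrite("first last", "$3 $1"))
  }
}
```

Note that the test query writes `$$3 $$1` only because it sits inside an `s"..."` interpolated string, where `$$` escapes a literal `$`; the regex engine still sees `$3 $1`.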



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

