LorenzoMartini commented on code in PR #36457:
URL: https://github.com/apache/spark/pull/36457#discussion_r866721488


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala:
##########
@@ -642,7 +642,7 @@ case class RegExpReplace(subject: Expression, regexp: 
Expression, rep: Expressio
     }
     val source = s.toString()
     val position = i.asInstanceOf[Int] - 1
-    if (position < source.length) {
+    if (position < source.length || (position == 0 && source.equals(""))) {

Review Comment:
   Yes that's a good point. Compiler actually suggests `source.isEmpty()`. The 
default position passed in is 1 so position will be 0 by default so `position 
== 0 || position < source.length` doesn't catch the empty string.
   I will replace with `isEmpty`.
   
   I would also argue that any non-default position passed in with an empty 
string for the regex replace is an error on the user side, so I'm wondering if 
we should special case that?
   As in, if user gives us a regex replace with empty string match (`^$`) and 
explicitly sets a position, we should probably just throw? I don't really have 
a strong opinion here, happy to keep as is. Seems just a bit odd to specify a 
position when you are matching on an empty string that should always have 
length 0



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to