LorenzoMartini commented on code in PR #36457:
URL: https://github.com/apache/spark/pull/36457#discussion_r866721488
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala:
##########
@@ -642,7 +642,7 @@ case class RegExpReplace(subject: Expression, regexp:
Expression, rep: Expressio
}
val source = s.toString()
val position = i.asInstanceOf[Int] - 1
- if (position < source.length) {
+ if (position < source.length || (position == 0 && source.equals(""))) {
Review Comment:
Yes that's a good point. Compiler actually suggests `source.isEmpty()`. The
default position passed in is 1 so position will be 0 by default so `position
== 0 || position < source.length` doesn't catch the empty string.
I will replace with `isEmpty`.
I would also argue that any non-default position passed in with an empty
string for the regex replace is an error on the user side, so I'm wondering if
we should special case that?
As in, if user gives us a regex replace with empty string match (`^$`) and
explicitly sets a position, we should probably just throw? I don't really have
a strong opinion here, happy to keep as is. Seems just a bit odd to specify a
position when you are matching on an empty string that should always have
length 0
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]