maropu commented on a change in pull request #28750:
URL: https://github.com/apache/spark/pull/28750#discussion_r437998610
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
##########
@@ -123,7 +123,11 @@ object StringUtils extends Logging {
val stringToAppend = if (available >= sLen) s else s.substring(0, available)
strings.append(stringToAppend)
}
- length += sLen
+
+ // Cap the length at ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH. Otherwise, we
+ // will overflow length causing StringIndexOutOfBoundsException in the substring call
+ // above.
Review comment:
@dilipbiswal Sorry, my earlier comment was ambiguous. I just wanted to
leave a note about your comment above:
> I think, the original intent of keeping this length outside the if block
> (which i missed and @MaxGekk pointed out) is to keep the true length of the
> input. So when we are producing a SQL plan, we want to tell users how much was
> that the size of the explain string and how much we truncated based on max
> length specification. So caller could call append method even after we have
> gone past the max.

I couldn't tell why the count-up (`length += sLen`) is placed outside the `if (!atLimit)` block.
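
To make the intent concrete, here is a minimal sketch of the pattern being discussed. It is not the actual `StringUtils` code: the class name `CappedConcat`, the `cap` parameter (a stand-in for `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH`), and `totalLength` are all hypothetical names chosen for illustration. It shows the counter being updated outside the `if (!atLimit)` block so that it tracks everything the caller tried to append, with the addition widened to `Long` and capped so the `Int` cannot wrap around.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the class under review. `cap` plays the role of
// ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH (an assumption for this sketch).
class CappedConcat(maxLength: Int, cap: Int) {
  private val strings = new ArrayBuffer[String]
  private var length: Int = 0

  def atLimit: Boolean = length >= maxLength

  def append(s: String): Unit = {
    if (s != null) {
      val sLen = s.length
      if (!atLimit) {
        // Only the part that still fits under maxLength is stored.
        val available = maxLength - length
        strings += (if (available >= sLen) s else s.substring(0, available))
      }
      // Counted outside the `if` so `length` reflects the requested input,
      // not just what was kept. Widening to Long and capping avoids Int
      // overflow, which would make `length` negative and break `available`.
      length = math.min(length.toLong + sLen, cap.toLong).toInt
    }
  }

  def totalLength: Int = length
  override def toString: String = strings.mkString
}

object Demo {
  def main(args: Array[String]): Unit = {
    val c = new CappedConcat(maxLength = 5, cap = 100)
    Seq("abc", "defg", "hi").foreach(c.append)
    println(s"kept='$c' total=${c.totalLength}") // kept='abcde' total=9
  }
}
```

With this shape, the difference between `totalLength` and the stored string's length tells the caller how much was truncated, which matches the explanation quoted above. A short code comment to that effect next to the count-up would have answered my question.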