maropu commented on a change in pull request #28750:
URL: https://github.com/apache/spark/pull/28750#discussion_r437998610



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
##########
@@ -123,7 +123,11 @@ object StringUtils extends Logging {
           val stringToAppend = if (available >= sLen) s else s.substring(0, available)
           strings.append(stringToAppend)
         }
-        length += sLen
+
+        // Cap the length at ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH. Otherwise, we
+        // will overflow length causing StringIndexOutOfBoundsException in the substring call
+        // above.

Review comment:
   @dilipbiswal Sorry, my comment was ambiguous. I just wanted to leave a note about your comment above:
   > I think the original intent of keeping this length outside the if block (which I missed and @MaxGekk pointed out) is to keep the true length of the input. So when we are producing a SQL plan, we want to tell users how large the explain string was and how much we truncated based on the max length specification. So a caller could call the append method even after we have gone past the max.
   
   I couldn't tell why the length increment is placed outside the `if (!atLimit)` block.
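
   To make the point concrete, here is a minimal, self-contained sketch of the idea, not the actual Spark code. `CappedConcat`, `trackedLength`, and `Limits.MaxTrackedLength` are hypothetical names, and the constant is only an assumed stand-in for `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH`. The counter is updated outside the `if (!atLimit)` block so it still reflects how much input was seen after truncation starts, and it is widened to `Long` and capped so the `Int` cannot wrap to a negative value.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH (assumed value).
object Limits {
  val MaxTrackedLength: Int = Int.MaxValue - 15
}

// Hypothetical simplified concatenator; assumes maxLength > 0.
class CappedConcat(val maxLength: Int) {
  private val strings = new ArrayBuffer[String]
  private var length: Int = 0

  def atLimit: Boolean = length >= maxLength

  // Total length of everything passed to append, capped at MaxTrackedLength.
  def trackedLength: Int = length

  def append(s: String): Unit = {
    if (s != null) {
      val sLen = s.length
      if (!atLimit) {
        // Only store as much of s as still fits under maxLength.
        val available = maxLength - length
        strings.append(if (available >= sLen) s else s.substring(0, available))
      }
      // Counted outside the `if` so callers can report how much was truncated.
      // Widened to Long and capped so the Int counter cannot overflow.
      length = math.min(length.toLong + sLen, Limits.MaxTrackedLength.toLong).toInt
    }
  }

  override def toString: String = strings.mkString
}
```

   With `new CappedConcat(5)`, appending `"abc"` and then `"defg"` stores `"abcde"` while `trackedLength` is 7, so a caller can tell that 2 characters were dropped. Without the cap, enough further appends would push the counter past `Int.MaxValue`, make it negative, and flip `atLimit` back to false.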



