DaveDeCaprio commented on a change in pull request #23169: [SPARK-26103][SQL] 
Limit the length of debug strings for query plans
URL: https://github.com/apache/spark/pull/23169#discussion_r249585186
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
 ##########
 @@ -92,31 +96,64 @@ object StringUtils {
 
   /**
    * Concatenation of sequence of strings to final string with cheap append 
method
-   * and one memory allocation for the final string.
+   * and one memory allocation for the final string.  Can also bound the final 
size of
+   * the string.
    */
-  class StringConcat {
+  class StringConcat(val maxLength: Int = 
ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
     private val strings = new ArrayBuffer[String]
     private var length: Int = 0
 
+    def atLimit: Boolean = length >= maxLength
+
     /**
      * Appends a string and accumulates its length to allocate a string buffer 
for all
-     * appended strings once in the toString method.
+     * appended strings once in the toString method.  Returns true if the 
string still
+     * has room for further appends before it hits its max limit.
      */
-    def append(s: String): Unit = {
-      if (s != null) {
+    def append(s: String): Boolean = {
+      if (!atLimit && s != null) {
         strings.append(s)
         length += s.length
       }
+      return !atLimit
     }
 
     /**
      * The method allocates memory for all appended strings, writes them to 
the memory and
      * returns concatenated string.
      */
     override def toString: String = {
-      val result = new java.lang.StringBuilder(length)
-      strings.foreach(result.append)
+      val finalLength = Math.min(length, maxLength)
+      val result = new java.lang.StringBuilder(finalLength)
+      var ix = 0
+      while(ix < strings.length) {
+        var s = strings(ix)
+        if(ix < strings.length - 1) {
+          result.append(s)
+        }
+        else {
+          val lastLength = Math.min(s.length, maxLength - result.length())
+          result.append(s, 0, lastLength)
+        }
+        ix += 1
+      }
       result.toString
     }
   }
+
+  /** Whether we have warned about plan string truncation yet. */
+  private val planSizeWarningPrinted = new AtomicBoolean(false)
+
+  /** A string concatenator for plan strings.  Uses length from a configured 
value, and
+   *  prints a warning the first time a plan is truncated. */
 
 Review comment:
   I was mirroring the behavior used by 
org.apache.spark.sql.catalyst.util.truncatedString.  This does an equivalent 
kind of truncation if a data frame has too many fields.  It only prints the 
warning out once.  It's a similar warning so I copied the existing behavior.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to