Ahir Reddy created SPARK-48568:
----------------------------------

             Summary: Improve Performance of CodeFormatter with Java StringBuilder
                 Key: SPARK-48568
                 URL: https://issues.apache.org/jira/browse/SPARK-48568
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Ahir Reddy


- Update CodeFormatter to use Java's `StringBuilder` directly instead of
Scala's.
- The Scala `StringBuilder` is a very thin API that wraps Java's.
- The call sites that need to change simply copy what Scala was delegating to
Java. For example, `clear()` simply calls `setLength(0)` under the hood.
- The reason for this change is that it substantially improves the performance
of the CodeFormatter.
- From some basic profiling, in a ~100s suite, code formatting took ~2.7
seconds of CPU time. After the change it takes about 800ms.
- My hypothesis is that Java's `StringBuilder` is much more JIT friendly.
Scala's `StringBuilder` has many layers, as it implements a significant portion
of the Scala mutable collection API. It's also likely that the JVM has special
JIT handling for Java's `StringBuilder`.
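
The kind of call-site change described above might look roughly like the
following sketch. It is not Spark's actual CodeFormatter code: the object
`BuilderSketch` and the helpers `indent` and `resetBoth` are hypothetical names
made up for illustration. It only shows Java's `StringBuilder` being used
directly from Scala, with `setLength(0)` in place of the Scala wrapper's
`clear()`.

```scala
object BuilderSketch {
  // Hypothetical helper, not Spark's CodeFormatter: indents each line by
  // two spaces per level, appending to Java's StringBuilder directly.
  def indent(lines: Seq[String], level: Int): String = {
    val sb = new java.lang.StringBuilder
    lines.foreach { line =>
      var i = 0
      while (i < level) { sb.append("  "); i += 1 }
      sb.append(line).append('\n')
    }
    sb.toString
  }

  // Empties one builder of each kind and returns their lengths afterwards.
  def resetBoth(text: String): (Int, Int) = {
    val scalaSb = new scala.collection.mutable.StringBuilder
    scalaSb.append(text)
    scalaSb.clear() // Scala wrapper call

    val javaSb = new java.lang.StringBuilder
    javaSb.append(text)
    javaSb.setLength(0) // direct Java call that replaces clear()

    (scalaSb.length, javaSb.length())
  }
}
```

Both resets leave an empty builder, so swapping one for the other should not
change behavior; any speedup would come from skipping the wrapper layers, which
this sketch does not attempt to measure.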



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

