[
https://issues.apache.org/jira/browse/SPARK-39361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Rosen resolved SPARK-39361.
--------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed
Issue resolved by pull request 36747
[https://github.com/apache/spark/pull/36747]
> Stop using Log4J2's extended throwable logging pattern in default logging
> configurations
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-39361
> URL: https://issues.apache.org/jira/browse/SPARK-39361
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Josh Rosen
> Assignee: Josh Rosen
> Priority: Major
> Fix For: 3.3.0
>
>
> This PR addresses a performance problem in Log4J2 related to exception
> logging: in certain scenarios I observed that Log4J2's exception stacktrace
> logging can be ~10x slower than Log4J 1's.
> The problem stems from a new log pattern format in Log4J2 called ["extended
> exception"|https://logging.apache.org/log4j/2.x/manual/layouts.html#PatternExtendedException],
> which enriches the regular stacktrace string with information on the name of
> the JAR files that contained the classes in each stack frame.
> Log4J2 queries the classloader to determine the source JAR for each class.
> This isn't cheap, but the information is cached and reused in future
> exception logging calls. In certain scenarios involving runtime-generated
> classes, however, this lookup fails and the failed lookup result is _not_
> cached. As a result, expensive classloading operations are performed every
> time such an exception is logged. In addition to being very slow, these
> operations take out a lock on the classloader and thus can cause severe lock
> contention when multiple threads log errors concurrently. This issue is
> described in more detail in a comment on a Log4J2 JIRA and in a linked
> blogpost:
> https://issues.apache.org/jira/browse/LOG4J2-2391?focusedCommentId=16667140&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16667140
> Spark frequently uses generated classes and lambdas, so Spark executor logs
> will almost always trigger this edge case and suffer from poor performance.
> By default, if you do not specify an explicit exception format in your
> logging pattern, Log4J2 appends this "extended exception" pattern (see
> PatternLayout's {{alwaysWriteExceptions}} flag in Log4J's documentation, plus
> [the code implementing that
> flag|https://github.com/apache/logging-log4j2/blob/d6c8ab0863c551cdf0f8a5b1966ab45e3cddf572/log4j-core/src/main/java/org/apache/logging/log4j/core/pattern/PatternParser.java#L206-L209]
> in Log4J2).
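> To make the difference concrete, here is an illustrative comparison of the
> two formats (the class and JAR names are hypothetical). The extended format
> decorates each stack frame with its source JAR and version, which is what
> requires the classloader lookups described above:
> {code}
> %ex (plain):
> java.lang.RuntimeException: boom
>     at com.example.App.run(App.java:42)
>
> %xEx (extended, the implicit default):
> java.lang.RuntimeException: boom
>     at com.example.App.run(App.java:42) ~[app.jar:1.0.0]
> {code}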
> In this PR, I have updated Spark's default Log4J2 configurations so that each
> pattern layout includes an explicit {{%ex}}, forcing the normal
> (non-extended) exception logging format.
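> As a minimal sketch of the kind of change involved (the appender name and
> base pattern here are assumptions for illustration, not Spark's exact
> shipped configuration), appending an explicit {{%ex}} to a PatternLayout
> pattern in {{log4j2.properties}} suppresses the implicit extended-exception
> converter:
> {code}
> appender.console.type = Console
> appender.console.name = console
> appender.console.layout.type = PatternLayout
> # The trailing %ex selects the plain stacktrace converter; without it,
> # Log4J2 silently appends the extended %xEx converter instead.
> appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex
> {code}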
> Although it's true that any program logging exceptions at a high rate should
> probably just fix the source of the exceptions, I think it's still a good
> idea for us to try to fix this out-of-the-box performance difference so that
> users' existing workloads do not regress when upgrading to 3.3.0.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)