[ 
https://issues.apache.org/jira/browse/SPARK-39361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen resolved SPARK-39361.
--------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 36747
[https://github.com/apache/spark/pull/36747]

> Stop using Log4J2's extended throwable logging pattern in default logging 
> configurations
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-39361
>                 URL: https://issues.apache.org/jira/browse/SPARK-39361
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Major
>             Fix For: 3.3.0
>
>
> This PR addresses a performance problem in Log4J 2 related to exception 
> logging: in certain scenarios I observed that Log4J2's exception stacktrace 
> logging can be ~10x slower than Log4J 1.
> The problem stems from a new log pattern format in Log4J2 called ["extended 
> exception"|https://logging.apache.org/log4j/2.x/manual/layouts.html#PatternExtendedException],
>  which enriches the regular stacktrace string with information on the name of 
> the JAR files that contained the classes in each stack frame.
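> For illustration only (hypothetical class and JAR names, not output from 
> this issue): a frame rendered with plain {{%ex}} versus the extended form 
> looks roughly like:
> {code}
> at org.apache.spark.SomeClass.someMethod(SomeClass.scala:42)
> at org.apache.spark.SomeClass.someMethod(SomeClass.scala:42) [spark-core_2.12-3.3.0.jar:3.3.0]
> {code}
> The bracketed suffix is the per-frame JAR/version lookup whose cost is 
> discussed below.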
> Log4J queries the classloader to determine the source JAR for each class. 
> This isn't cheap, but this information is cached and reused in future 
> exception logging calls. In certain scenarios involving runtime-generated 
> classes, this lookup will fail and the failed lookup result will _not_ be 
> cached. As a result, expensive classloading operations will be performed 
> every time such an exception is logged. In addition to being very slow, these 
> operations take out a lock on the classloader and thus can cause severe lock 
> contention if multiple threads are logging errors. This issue is described in 
> more detail in a comment on a Log4J2 JIRA and in a linked blog post: 
> https://issues.apache.org/jira/browse/LOG4J2-2391?focusedCommentId=16667140&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16667140
> Spark frequently uses generated classes and lambdas, and thus Spark 
> executor logs will almost always trigger this edge-case and suffer from poor 
> performance.
> By default, if you do not specify an explicit exception format in your 
> logging pattern, Log4J2 will add this "extended exception" pattern (see 
> PatternLayout's {{alwaysWriteExceptions}} flag in Log4J's documentation, plus 
> [the code implementing that 
> flag|https://github.com/apache/logging-log4j2/blob/d6c8ab0863c551cdf0f8a5b1966ab45e3cddf572/log4j-core/src/main/java/org/apache/logging/log4j/core/pattern/PatternParser.java#L206-L209]
>  in Log4J2).
> In this PR, I have updated Spark's default Log4J2 configurations so that each 
> pattern layout includes an explicit {{%ex}} so that it uses the normal 
> (non-extended) exception logging format.
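> Concretely, the change amounts to appending {{%ex}} to each pattern layout, 
> along these lines (a sketch of a log4j2.properties console appender; the 
> exact patterns vary across Spark's bundled configuration files):
> {code}
> appender.console.type = Console
> appender.console.name = console
> appender.console.target = SYSTEM_ERR
> appender.console.layout.type = PatternLayout
> # Explicit %ex opts out of Log4J2's default "extended exception" converter
> appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex
> {code}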
> Although it's true that any program logging exceptions at a high rate should 
> probably just fix the source of the exceptions, I think it's still a good 
> idea for us to try to fix this out-of-the-box performance difference so that 
> users' existing workloads do not regress when upgrading to 3.3.0.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
