[ 
https://issues.apache.org/jira/browse/SPARK-46330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhou Yifan updated SPARK-46330:
-------------------------------
    Summary: Loading of Spark UI blocks for a long time when HybridStore 
enabled  (was: Spark UI's loading blocks for a long time when HybridStore 
enabled)

> Loading of Spark UI blocks for a long time when HybridStore enabled
> -------------------------------------------------------------------
>
>                 Key: SPARK-46330
>                 URL: https://issues.apache.org/jira/browse/SPARK-46330
>             Project: Spark
>          Issue Type: Bug
>          Components: UI
>    Affects Versions: 3.3.1
>            Reporter: Zhou Yifan
>            Priority: Major
>
> In our SparkHistoryServer, we set these two properties to speed up Spark UI 
> loading:
>  
> {code:java}
> spark.history.store.hybridStore.enabled true
> spark.history.store.hybridStore.maxMemoryUsage 16g {code}
> Occasionally, a small event log that usually loads in seconds took minutes 
> to load.
> In the jstack output of SparkHistoryServer, we found that 4 threads were 
> blocked, waiting to lock the 
> org.apache.spark.deploy.history.FsHistoryProvider object monitor, which was 
> held by thread "spark-history-task-0" while it was closing a HybridStore.
> {code:java}
> "qtp791499503-2688947" #2688947 daemon prio=5 os_prio=0 
> tid=0x00007f4044042800 nid=0x8d98 waiting for monitor entry 
> [0x00007f3f64760000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.getAppUI(FsHistoryProvider.scala:386)
>     - waiting to lock <0x00000004c64433f0> (a 
> org.apache.spark.deploy.history.FsHistoryProvider)
>     at 
> org.apache.spark.deploy.history.HistoryServer.getAppUI(HistoryServer.scala:194)
>     at 
> org.apache.spark.deploy.history.ApplicationCache.$anonfun$loadApplicationEntry$2(ApplicationCache.scala:182)
>     at 
> org.apache.spark.deploy.history.ApplicationCache$$Lambda$805/90086258.apply(Unknown
>  Source)
>     at 
> org.apache.spark.deploy.history.ApplicationCache.time(ApplicationCache.scala:154)
>     at 
> org.apache.spark.deploy.history.ApplicationCache.org$apache$spark$deploy$history$ApplicationCache$$loadApplicationEntry(ApplicationCache.scala:180)
>     at 
> org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:71)
>     at 
> org.apache.spark.deploy.history.ApplicationCache$$anon$1.load(ApplicationCache.scala:58)
>     at 
> org.sparkproject.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>     at 
> org.sparkproject.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>     at 
> org.sparkproject.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>     - locked <0x000000066effc3e8> (a 
> org.sparkproject.guava.cache.LocalCache$StrongAccessEntry)
>     at 
> org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
>     at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
>     at org.sparkproject.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>     at 
> org.sparkproject.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>     at 
> org.apache.spark.deploy.history.ApplicationCache.get(ApplicationCache.scala:108)
>     at 
> org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:120)
>     at 
> org.apache.spark.deploy.history.HistoryServer.org$apache$spark$deploy$history$HistoryServer$$loadAppUi(HistoryServer.scala:251)
>     at 
> org.apache.spark.deploy.history.HistoryServer$$anon$1.doGet(HistoryServer.scala:99)
>
> "spark-history-task-0" #49 daemon prio=5 os_prio=0 tid=0x00007f431e55b800 
> nid=0x1ac6 in Object.wait() [0x00007f41b2cc9000]
>    java.lang.Thread.State: WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     at java.lang.Thread.join(Thread.java:1252)
>     - locked <0x000000063ccbc9f0> (a java.lang.Thread)
>     at java.lang.Thread.join(Thread.java:1326)
>     at 
> org.apache.spark.deploy.history.HybridStore.close(HybridStore.scala:106)
>     at org.apache.spark.status.AppStatusStore.close(AppStatusStore.scala:553)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1(FsHistoryProvider.scala:913)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$invalidateUI$1$adapted(FsHistoryProvider.scala:911)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider$$Lambda$416/229723341.apply(Unknown
>  Source)
>     at scala.Option.foreach(Option.scala:407)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.invalidateUI(FsHistoryProvider.scala:911)
>     - locked <0x00000004c64433f0> (a 
> org.apache.spark.deploy.history.FsHistoryProvider)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$7(FsHistoryProvider.scala:541)
>     at 
> org.apache.spark.deploy.history.FsHistoryProvider.$anonfun$checkForLogs$7$adapted(FsHistoryProvider.scala:498)
> {code}
>  
>  
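
The two stack traces suggest a simple contention pattern: one thread holds the FsHistoryProvider monitor inside invalidateUI while HybridStore.close waits in Thread.join, and every getAppUI caller queues behind that monitor. The following is a minimal, self-contained sketch of that pattern only, not Spark code. The names Provider, invalidate, getUi and measureBlockedMillis are hypothetical, and the sleeping "background-writer" thread is an assumed stand-in for whatever thread HybridStore.close is joining.

```java
import java.util.concurrent.CountDownLatch;

public class MonitorContentionSketch {

    /** Hypothetical stand-in for an object whose methods share one monitor. */
    static class Provider {
        // Same shape as the "spark-history-task-0" trace: the monitor is held
        // for as long as join() waits on another thread.
        synchronized void invalidate(Thread worker, CountDownLatch lockHeld)
                throws InterruptedException {
            lockHeld.countDown(); // signal that this thread now owns the monitor
            worker.join();        // waits on the worker, keeps the Provider monitor
        }

        // Same shape as the blocked request threads: needs the same monitor.
        synchronized String getUi() {
            return "ui";
        }
    }

    /** Returns roughly how long a reader was blocked, in milliseconds. */
    static long measureBlockedMillis(long workerMillis) throws InterruptedException {
        Provider provider = new Provider();

        // Assumed slow background work that close() has to wait for.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(workerMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "background-writer");
        worker.start();

        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread closer = new Thread(() -> {
            try {
                provider.invalidate(worker, lockHeld);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "history-task");
        closer.start();

        lockHeld.await();                 // the closer owns the monitor from here on
        long start = System.nanoTime();
        provider.getUi();                 // BLOCKED until the closer's join() returns
        long waitedMillis = (System.nanoTime() - start) / 1_000_000;
        closer.join();
        return waitedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader blocked for about "
                + measureBlockedMillis(500) + " ms");
    }
}
```

In this sketch the reader's wait grows with the duration of the joined work, which is consistent with a small event log taking minutes to load while a large store is being closed. Moving the join outside the synchronized region is one possible direction, but whether that is safe in FsHistoryProvider is not something the traces alone can confirm.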



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
