[ https://issues.apache.org/jira/browse/SPARK-23470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shixiong Zhu updated SPARK-23470:
---------------------------------
    Description: 
While testing 2.3.0 RC3, I found that it is easy to hit a "read timeout" when 
accessing the All Jobs page. The stack dump shows the UI thread running 
"org.apache.spark.ui.jobs.ApiHelper.lastStageNameAndDescription".

{code}
"SparkUI-59" #59 daemon prio=5 os_prio=0 tid=0x00007fc15b0a3000 nid=0x8dc runnable [0x00007fc0ce9f8000]
   java.lang.Thread.State: RUNNABLE
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.util.kvstore.KVTypeInfo$MethodAccessor.get(KVTypeInfo.java:154)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.compare(InMemoryStore.java:248)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.lambda$iterator$2(InMemoryStore.java:214)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView$$Lambda$36/1834982692.compare(Unknown Source)
        at java.util.TimSort.binarySort(TimSort.java:296)
        at java.util.TimSort.sort(TimSort.java:239)
        at java.util.Arrays.sort(Arrays.java:1512)
        at java.util.ArrayList.sort(ArrayList.java:1460)
        at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:387)
        at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:210)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
        at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
        at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryIterator.hasNext(InMemoryStore.java:278)
        at org.apache.spark.status.AppStatusStore.lastStageAttempt(AppStatusStore.scala:101)
        at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
        at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
        at org.apache.spark.status.AppStatusStore.asOption(AppStatusStore.scala:408)
        at org.apache.spark.ui.jobs.ApiHelper$.lastStageNameAndDescription(StagePage.scala:1014)
        at org.apache.spark.ui.jobs.JobDataSource.org$apache$spark$ui$jobs$JobDataSource$$jobRow(AllJobsPage.scala:434)
        at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
        at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at org.apache.spark.ui.jobs.JobDataSource.<init>(AllJobsPage.scala:412)
        at org.apache.spark.ui.jobs.JobPagedTable.<init>(AllJobsPage.scala:504)
        at org.apache.spark.ui.jobs.AllJobsPage.jobsTable(AllJobsPage.scala:246)
        at org.apache.spark.ui.jobs.AllJobsPage.render(AllJobsPage.scala:295)
        at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
        at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
        at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
{code}

According to the heap dump, there are 954 JobDataWrapper and 54690 
StageDataWrapper instances. The UI is slow because rendering the page sorts 
the 54690 stage entries for each of the 954 jobs.
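
The cost pattern described above can be illustrated with a minimal, standalone Java sketch. It is not Spark's code and not necessarily the fix applied for this issue: the class LastStageLookup, the StageAttempt type, and both method names are hypothetical and exist only for illustration. The first method re-sorts every stored attempt on each lookup, which is roughly what the stack trace suggests happens once per job row; the second builds an index in a single pass and answers each lookup from it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LastStageLookup {
    /** Hypothetical stand-in for one stored stage attempt (not Spark's StageDataWrapper). */
    static final class StageAttempt {
        final int stageId;
        final int attemptId;

        StageAttempt(int stageId, int attemptId) {
            this.stageId = stageId;
            this.attemptId = attemptId;
        }
    }

    /** Costly pattern: copy and sort every stored attempt on each lookup. Returns -1 if absent. */
    static int lastAttemptBySorting(List<StageAttempt> all, int stageId) {
        List<StageAttempt> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingInt((StageAttempt s) -> s.stageId)
                .thenComparingInt(s -> s.attemptId));
        // Walk backwards so the first match is the highest attempt for this stage.
        for (int i = sorted.size() - 1; i >= 0; i--) {
            if (sorted.get(i).stageId == stageId) {
                return sorted.get(i).attemptId;
            }
        }
        return -1;
    }

    /** Cheaper pattern: one pass builds stageId -> highest attemptId, reused by all lookups. */
    static Map<Integer, Integer> indexLastAttempts(List<StageAttempt> all) {
        Map<Integer, Integer> last = new HashMap<>();
        for (StageAttempt s : all) {
            last.merge(s.stageId, s.attemptId, Integer::max);
        }
        return last;
    }

    public static void main(String[] args) {
        List<StageAttempt> all = new ArrayList<>();
        all.add(new StageAttempt(2, 0));
        all.add(new StageAttempt(1, 1));
        all.add(new StageAttempt(1, 0));

        // Both approaches agree on the last attempt of stage 1.
        Map<Integer, Integer> index = indexLastAttempts(all);
        System.out.println(lastAttemptBySorting(all, 1));
        System.out.println(index.get(1));
    }
}
```

With the figures from the heap dump, the first pattern would sort 54690 entries 954 times per page load, whereas the index is built once and each of the 954 lookups is a single map access.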


  was:
While testing 2.3.0 RC3, I found that it is easy to hit a "read timeout" in the 
Spark UI. The stack dump shows the UI thread running 
"org.apache.spark.ui.jobs.ApiHelper.lastStageNameAndDescription".

{code}
"SparkUI-59" #59 daemon prio=5 os_prio=0 tid=0x00007fc15b0a3000 nid=0x8dc runnable [0x00007fc0ce9f8000]
   java.lang.Thread.State: RUNNABLE
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.util.kvstore.KVTypeInfo$MethodAccessor.get(KVTypeInfo.java:154)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.compare(InMemoryStore.java:248)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView.lambda$iterator$2(InMemoryStore.java:214)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryView$$Lambda$36/1834982692.compare(Unknown Source)
        at java.util.TimSort.binarySort(TimSort.java:296)
        at java.util.TimSort.sort(TimSort.java:239)
        at java.util.Arrays.sort(Arrays.java:1512)
        at java.util.ArrayList.sort(ArrayList.java:1460)
        at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:387)
        at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:210)
        at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
        at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
        at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
        at org.apache.spark.util.kvstore.InMemoryStore$InMemoryIterator.hasNext(InMemoryStore.java:278)
        at org.apache.spark.status.AppStatusStore.lastStageAttempt(AppStatusStore.scala:101)
        at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
        at org.apache.spark.ui.jobs.ApiHelper$$anonfun$38.apply(StagePage.scala:1014)
        at org.apache.spark.status.AppStatusStore.asOption(AppStatusStore.scala:408)
        at org.apache.spark.ui.jobs.ApiHelper$.lastStageNameAndDescription(StagePage.scala:1014)
        at org.apache.spark.ui.jobs.JobDataSource.org$apache$spark$ui$jobs$JobDataSource$$jobRow(AllJobsPage.scala:434)
        at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
        at org.apache.spark.ui.jobs.JobDataSource$$anonfun$24.apply(AllJobsPage.scala:412)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at org.apache.spark.ui.jobs.JobDataSource.<init>(AllJobsPage.scala:412)
        at org.apache.spark.ui.jobs.JobPagedTable.<init>(AllJobsPage.scala:504)
        at org.apache.spark.ui.jobs.AllJobsPage.jobsTable(AllJobsPage.scala:246)
        at org.apache.spark.ui.jobs.AllJobsPage.render(AllJobsPage.scala:295)
        at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
        at org.apache.spark.ui.WebUI$$anonfun$3.apply(WebUI.scala:98)
        at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:90)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
{code}

According to the heap dump, there are 954 JobDataWrapper and 54690 
StageDataWrapper instances. The UI is slow because rendering the page sorts 
the 54690 stage entries for each of the 954 jobs.



> org.apache.spark.ui.jobs.ApiHelper.lastStageNameAndDescription is too slow
> --------------------------------------------------------------------------
>
>                 Key: SPARK-23470
>                 URL: https://issues.apache.org/jira/browse/SPARK-23470
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.3.0
>            Reporter: Shixiong Zhu
>            Priority: Blocker



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
