[ https://issues.apache.org/jira/browse/SPARK-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295746#comment-16295746 ]
Apache Spark commented on SPARK-20657:
--------------------------------------

User 'vanzin' has created a pull request for this issue:
https://github.com/apache/spark/pull/20013

> Speed up Stage page
> -------------------
>
>                 Key: SPARK-20657
>                 URL: https://issues.apache.org/jira/browse/SPARK-20657
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Web UI
>    Affects Versions: 2.3.0
>            Reporter: Marcelo Vanzin
>
> The Stage page in the UI is very slow when a large number of tasks exist
> (tens of thousands). The new work being done in SPARK-18085 makes that worse,
> since it adds potential disk access to the mix.
> A lot of the slowness is because the code loads all the tasks in memory then
> sorts a really large list, and does a lot of calculations on all the data;
> both can be avoided with the new app state store by having smarter indices
> (so data is read from the store sorted in the desired order) and by keeping
> statistics about metrics pre-calculated (instead of re-doing that on every
> page access).
> Then only the tasks on the current page (100 items by default) need to
> actually be loaded. This also saves a lot on memory usage, not just CPU time.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
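
The approach in the issue description quoted above (keep an index that is already in display order, keep metric statistics up to date on write, and load only the current page) can be illustrated with a small sketch. This is not Spark's code or the pull request's implementation: `TaskStore`, `RunningStats`, `add_task`, and `page` are hypothetical names, the store is a plain in-memory list rather than the app state store, and task duration stands in for whichever metric the page is sorted by.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RunningStats:
    """Aggregates updated on every write, so a page view never rescans tasks."""
    count: int = 0
    total: int = 0
    minimum: Optional[int] = None
    maximum: Optional[int] = None

    def add(self, value: int) -> None:
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


class TaskStore:
    """Hypothetical in-memory stand-in for a store with a sorted index."""

    def __init__(self) -> None:
        # Index kept sorted by (duration_ms, task_id) at write time.
        self._by_duration: List[Tuple[int, int]] = []
        self.duration_stats = RunningStats()

    def add_task(self, task_id: int, duration_ms: int) -> None:
        # Insert in sorted position instead of sorting on every page view.
        bisect.insort(self._by_duration, (duration_ms, task_id))
        self.duration_stats.add(duration_ms)

    def page(self, page_number: int, page_size: int = 100,
             descending: bool = False) -> List[int]:
        """Return only the task ids on the requested 1-based page."""
        start = (page_number - 1) * page_size
        if descending:
            # Read the same index from its far end.
            end = len(self._by_duration) - start
            rows = self._by_duration[max(end - page_size, 0):max(end, 0)][::-1]
        else:
            rows = self._by_duration[start:start + page_size]
        return [task_id for _, task_id in rows]


store = TaskStore()
for task_id, duration_ms in enumerate([50, 10, 40, 20, 30]):
    store.add_task(task_id, duration_ms)

first_page = store.page(1, page_size=2)
slowest_first = store.page(1, page_size=2, descending=True)
```

Rendering a page here touches `page_size` entries plus the precomputed `duration_stats`, regardless of how many tasks the stage has. The list-based index is only for illustration; inserting into a Python list is linear in its length, so a real store would use an index structure suited to its backend.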