Reynold Xin created SPARK-2016:
----------------------------------
Summary: RDD in-memory storage UI becomes unresponsive when the
number of RDD partitions is large
Key: SPARK-2016
URL: https://issues.apache.org/jira/browse/SPARK-2016
Project: Spark
Issue Type: Sub-task
Reporter: Reynold Xin
Try running:
{code}
sc.parallelize(1 to 100, 1000000).cache().count()
{code}
Then open the storage UI for this RDD. With one million partitions, the page takes a very long time to load.
When the number of partitions is very large, I think there are a few
alternatives:
0. Only show the top 1000 blocks.
1. Paginate the block list.
2. Group by executor instead of by RDD block.
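
To make the three alternatives concrete, here is a rough sketch in plain Scala. It is not taken from Spark's UI code: BlockRow, StoragePageSketch, and the field names are hypothetical, and "top" is assumed to mean largest by in-memory size, which the list above does not specify.

```scala
// Hypothetical row type for illustration; not Spark's actual storage UI model.
final case class BlockRow(blockName: String, executor: String, memSize: Long)

object StoragePageSketch {
  // Alternative 0: keep only the `limit` largest blocks.
  // Assumption: "top" means largest by in-memory size.
  def topN(rows: Seq[BlockRow], limit: Int): Seq[BlockRow] =
    rows.sortBy(_.memSize)(Ordering[Long].reverse).take(limit)

  // Alternative 1: return a single page (1-based) of at most `pageSize` rows.
  def page(rows: Seq[BlockRow], pageNumber: Int, pageSize: Int): Seq[BlockRow] = {
    require(pageNumber >= 1 && pageSize >= 1, "pageNumber and pageSize must be positive")
    rows.drop((pageNumber - 1) * pageSize).take(pageSize)
  }

  // Alternative 2: one summary row per executor (block count, total bytes)
  // instead of one row per RDD block.
  def byExecutor(rows: Seq[BlockRow]): Map[String, (Int, Long)] =
    rows.groupBy(_.executor).map { case (exec, rs) =>
      (exec, (rs.size, rs.map(_.memSize).sum))
    }
}
```

In each case the rendered page stays bounded (by the limit, the page size, or the number of executors) rather than growing with the partition count.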
--
This message was sent by Atlassian JIRA
(v6.2#6252)