[ 
https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360757#comment-16360757
 ] 

Marcelo Vanzin edited comment on SPARK-23394 at 2/12/18 1:36 PM:
-----------------------------------------------------------------

I talked to Attila offline, and to me it seems like the new UI is more correct. 
There are only 10 cached partitions, each one replicated to 2 executors; the 
table also reflects that (whereas the old UI shows the same block twice). The 
only potential adjustment here would be to show the executor addresses instead 
of the executor IDs.

In the context of what led us here (SPARK-20659 / 
https://github.com/apache/spark/pull/20546#discussion_r167070392), I think that 
we should fix the tests that rely on the old code returning the total count 
including replication, so that they work with the new code that returns more 
accurate information.
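
To make the difference between the two counts concrete, here is a minimal 
sketch in plain Scala. It does not use Spark; BlockReplica and both helper 
methods are hypothetical names made up for this illustration, not Spark APIs. 
It only models the situation described above: 10 partitions, each stored on 
2 executors.

{code:scala}
// Hypothetical model, not a Spark type: one entry per stored block replica.
final case class BlockReplica(partition: Int, executorId: String)

object CachedPartitionCounts {
  // Count every replica, i.e. a total that includes replication.
  def replicaCount(replicas: Seq[BlockReplica]): Int = replicas.size

  // Count each partition once, no matter how many executors hold it.
  def cachedPartitions(replicas: Seq[BlockReplica]): Int =
    replicas.map(_.partition).distinct.size

  def main(args: Array[String]): Unit = {
    // 10 partitions, each replicated to executors "0" and "1".
    val replicas = for {
      partition  <- 0 until 10
      executorId <- Seq("0", "1")
    } yield BlockReplica(partition, executorId)

    println(s"replicas = ${replicaCount(replicas)}")             // 20
    println(s"cached partitions = ${cachedPartitions(replicas)}") // 10
  }
}
{code}

A test written against the first number would expect 20 here, while one 
written against the second would expect 10, which is the kind of adjustment 
suggested above.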


was (Author: vanzin):
I talked to Attila offline, and to me it seems like the new UI is more correct. 
There are only 10 cached partitions, each one replicated to 2 executors; the 
table also reflects that (instead of the old UI, where the same block showed up 
twice). The only potential adjustment here would be to show the executor 
addresses instead of the executor IDs.

In the context of what led us here (SPARK-20659 / 
https://github.com/apache/spark/pull/20546#discussion_r167070392), I think that 
we should fix the tests that rely on the old code returning the total count 
including replication, so that they work with the new code that returns more 
accurate information.

> Storage info's Cached Partitions doesn't consider the replications (but 
> sc.getRDDStorageInfo does)
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23394
>                 URL: https://issues.apache.org/jira/browse/SPARK-23394
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Attila Zsolt Piros
>            Priority: Major
>         Attachments: Spark_2.2.1.png, Spark_2.4.0-SNAPSHOT.png, 
> Storage_Tab.png
>
>
> Start Spark as:
> {code:bash}
> $ bin/spark-shell --master local-cluster[2,1,1024]
> {code}
> {code:scala}
> scala> import org.apache.spark.storage.StorageLevel._
> import org.apache.spark.storage.StorageLevel._
> scala> sc.parallelize((1 to 100), 10).persist(MEMORY_AND_DISK_2).count
> res0: Long = 100
> scala> sc.getRDDStorageInfo(0).numCachedPartitions
> res1: Int = 20
> {code}
> h2. Cached Partitions 
> In the UI, the Storage tab shows Cached Partitions as 10:
>  !Storage_Tab.png! .
> h2. Full tab
> Moreover, the replicated partitions were also listed in the old 2.2.1 UI:
>  !Spark_2.2.1.png! 
> But now it looks like this:
>  !Spark_2.4.0-SNAPSHOT.png! 



