Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/6672#issuecomment-119048589
On the screenshot:
- "Aggregated Stream Block Metrics by Executor" --> "Aggregated Metrics by
Executor" (the "Stream Block" part is obvious)
- Again, I am not sure that categorizing by Executor ID is a good idea. Rather,
I would like to see input blocks as the first column, sorted alphabetically,
with one row per replica under each block ID. That would make it easy to
visually search through the blocks of one input stream (the ordering will show
missing blocks). It will also be easy to see if any block has fewer replicas.
I think the columns should be:
Block ID | Replication Level | Location | Storage Level | Size
If the replication level is 2, there will be two subrows, each with its own
location + storage level + size. The Storage Level will be one of Memory,
Memory Serialized, Disk, or External.
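
To make the proposed layout concrete, here is a minimal sketch in Python (not
the Spark UI code) of how per-replica records could be grouped into one row per
block ID with one subrow per replica. The names BlockReplica and
build_block_table are hypothetical, and the sketch assumes the replication
level shown is simply the number of replicas currently reported for the block.

```python
from collections import defaultdict
from typing import NamedTuple


class BlockReplica(NamedTuple):
    """One stored copy of a stream block (hypothetical record type)."""
    block_id: str
    location: str       # executor host:port holding this copy
    storage_level: str  # "Memory", "Memory Serialized", "Disk", or "External"
    size: int           # size in bytes


def build_block_table(replicas):
    """Group replicas by block ID and return rows sorted by block ID.

    Each row is (block_id, replication_level, subrows), where subrows is a
    list of (location, storage_level, size) tuples, one per replica.
    Assumption: replication level is the count of replicas seen.
    """
    by_block = defaultdict(list)
    for replica in replicas:
        by_block[replica.block_id].append(replica)

    rows = []
    for block_id in sorted(by_block):  # plain alphabetical order
        subrows = sorted(by_block[block_id], key=lambda r: r.location)
        rows.append((
            block_id,
            len(subrows),
            [(r.location, r.storage_level, r.size) for r in subrows],
        ))
    return rows
```

With rows in this shape, a block that has fewer subrows than its neighbours
stands out, and a gap in the sorted block IDs points at a missing block.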
Does that make sense?