Github user squito commented on the pull request:
https://github.com/apache/spark/pull/7753#issuecomment-142113378
Hi @liyezhang556520 , sorry for the long delay on the review. I think this
is progressing in the right direction, and I really think it's going to help a
lot of users understand their memory usage. I left a handful of comments on the
code -- honestly I haven't been through it in a ton of detail, but hopefully
that helps you keep going. I also have some high-level comments:
1. I don't think there is any need to separate out the memory used by the
client and server portions. These are internal details that the end user
doesn't care about -- in fact, you're already correctly simplifying by the time
it gets to the web UI, but you could combine them immediately in
`TransportMetrics`. That would help simplify the code, I think (see the sketch
after this list).
2. We shouldn't say "Net" memory used; users might think that means *total*
memory used. I guess we should say "Network" instead.
3. I've never heard "direct-heap" before; it seems strange. I'd say
"off-heap" or just "direct".
4. Could the stage table include the max memory per executor per stage as
well? That would be great for helping users quickly identify the stages which
require the most memory.
5. How many additional events are getting logged? With the current
architecture, there is some pressure to not log too much -- both to keep log
sizes small for later processing, and also to make sure the driver doesn't get
too busy just logging (which eventually leads to it dropping events).
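
For point 1, here's a rough sketch of what I have in mind -- just an
illustrative example with placeholder field names, not the actual layout in
this PR:

```scala
// Minimal sketch: collapse the client/server split at the source instead of
// in the web UI layer. Field names are hypothetical.
case class TransportMetrics(
    onHeapSize: Long,   // combined client + server on-heap bytes
    offHeapSize: Long,  // combined client + server off-heap (direct) bytes
    timeStamp: Long)

object TransportMetrics {
  // Build one combined metric from the separate client and server numbers.
  def fromClientAndServer(
      clientOnHeap: Long,
      clientOffHeap: Long,
      serverOnHeap: Long,
      serverOffHeap: Long,
      timeStamp: Long): TransportMetrics =
    TransportMetrics(
      onHeapSize = clientOnHeap + serverOnHeap,
      offHeapSize = clientOffHeap + serverOffHeap,
      timeStamp = timeStamp)
}
```

That way everything downstream (event logging, the UI) only ever sees the
combined numbers, which should trim a fair amount of plumbing.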