Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/12990
Sorry, I haven't had time to look at this in great detail, but I have some
concerns.
So the way I read this, it will only ever show 1000 tasks in the task table
(by default at least)? I don't see a way for it to load this data dynamically,
or any indication to the user that they hit this limit, so I'm concerned it
will be confusing. It looks like it applies both to the active application and
to one that has been re-read by the history server, and I'm assuming the
history server uses the application's setting for this? It seems like it would
make more sense for the history server to have its own setting that overrides
the application's, if what you are trying to protect against is running out of
memory. I guess the other settings work that way, so perhaps that is a separate
JIRA.
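To make the override idea concrete, here is a minimal sketch of the precedence
I have in mind. Both key names below are hypothetical placeholders, not
existing Spark settings, and returning None to mean "no limit" is my own
assumption.

```python
from typing import Mapping, Optional

# Hypothetical key names, used only for illustration.
APP_KEY = "app.ui.retainedTasks"
HISTORY_KEY = "history.ui.retainedTasks"


def effective_task_limit(
    app_conf: Mapping[str, str],
    history_conf: Optional[Mapping[str, str]] = None,
) -> Optional[int]:
    """Return the task limit to apply, or None for no limit.

    A history-server setting, when present, wins over the value the
    application was run with.
    """
    if history_conf is not None and HISTORY_KEY in history_conf:
        return int(history_conf[HISTORY_KEY])
    if APP_KEY in app_conf:
        return int(app_conf[APP_KEY])
    return None
```

A live application would call this with only its own config, while the history
server would pass its config as the second argument.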
We have a ton of jobs that have stages with over 1000 tasks, and if I can't
see the data I need, this page is useless. 1000 seems way too low to me. I know
the same limit exists for stages, but that limit is much less likely to be hit,
and when you do hit it you either have all the data from a stage or none of it.
Here you have partial stage data, which could be confusing if we happen to
remove the "interesting" task.
If I read the heap dump screenshot properly, 65000 tasks were taking about
250MB. On a running application that doesn't seem too bad. On the history
server, where you can have thousands of applications, that can obviously add
up, but personally I would prefer that we do something smart, like not loading
the data until someone clicks on the button and needs it.
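As a rough back-of-the-envelope check on those numbers, the sketch below scales
the heap dump figures linearly. The 65000 tasks and 250MB are my approximate
reading of the screenshot; linear scaling per task and the application counts
passed in are assumptions, not measurements.

```python
# Approximate figures as read from the heap dump screenshot.
TASKS_IN_DUMP = 65_000
DUMP_BYTES = 250 * 1024 * 1024  # about 250MB


def bytes_per_task() -> float:
    """Average retained bytes per task, assuming memory scales linearly."""
    return DUMP_BYTES / TASKS_IN_DUMP


def estimated_mb(num_tasks: int, num_apps: int = 1) -> float:
    """Estimated MB held for num_apps applications of num_tasks tasks each."""
    return bytes_per_task() * num_tasks * num_apps / (1024 * 1024)


if __name__ == "__main__":
    print(f"per task:             {bytes_per_task():.0f} bytes")
    print(f"1000 tasks, 1 app:    {estimated_mb(1_000):.1f} MB")
    print(f"65000 tasks, 1 app:   {estimated_mb(65_000):.1f} MB")
    print(f"65000 tasks, 1000 apps: {estimated_mb(65_000, 1_000):.0f} MB")
```

Under these assumptions a task costs roughly 4KB, so a 1000-task cap saves only
a few hundred MB on one running application, while the total only becomes
large when many applications are held in memory at once.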
I'm OK with adding a config for this, but would prefer to see it default to
all or a very high number, so that those who want it smaller can decrease it.