Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/12990
  
    Sorry, I haven't had time to look at this in great detail, but I have some 
concerns.
    
    So the way I read this, it will only ever show 1000 tasks in the task table 
(by default at least)?  I don't see a way for it to dynamically load this data, 
and I don't see that the user is informed when they hit this limit, so I'm 
concerned it will be confusing.  It looks like the limit applies both to an 
active application and to one re-read by the history server, and I'm assuming 
the history server uses the application's setting for this?  If running out of 
memory is what you are trying to protect against, it seems like it would make 
more sense for the history server to have its own setting that overrides the 
application's.  I guess the other settings work that way, so perhaps that is a 
separate jira.
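
    A minimal sketch of the override order I have in mind, assuming two 
separate settings.  The key names and the helper function are hypothetical and 
exist only for illustration; they are not settings from this PR.

```python
from typing import Mapping, Optional

# Hypothetical keys, used only for illustration (not real Spark settings).
APP_KEY = "example.ui.retainedTasks"
HISTORY_KEY = "example.history.ui.retainedTasks"


def effective_task_limit(
    app_conf: Mapping[str, str],
    history_conf: Optional[Mapping[str, str]] = None,
    default: Optional[int] = None,
) -> Optional[int]:
    """Return the task limit to apply; None means "keep all tasks".

    A history-server setting, when present, wins over the value that was
    recorded with the application.
    """
    if history_conf is not None and HISTORY_KEY in history_conf:
        return int(history_conf[HISTORY_KEY])
    if APP_KEY in app_conf:
        return int(app_conf[APP_KEY])
    return default
```

    With this order, a live application uses its own value, while the history 
server can raise or lower the limit without touching what the application 
recorded.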
    
    We have a ton of jobs that have stages with over 1000 tasks, and if I can't 
see the data I need, this page is useless.  1000 seems way too low to me.  I 
know the same kind of limit exists for stages, but that limit is much less 
likely to be hit, and when you do hit it you either have all the data from a 
stage or none of it.  Here you have partial stage data, which could be 
confusing if we happen to remove the "interesting" task.
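
    To illustrate the partial-data concern, here is a small standalone model 
of a capped task table.  It assumes the oldest tasks are dropped first, which 
may not match what the PR does, and the class name is hypothetical.  The 
point is that the table can at least report how many tasks it dropped.

```python
from collections import deque
from typing import List


class BoundedTaskTable:
    """Keeps only the most recent `max_tasks` entries (oldest dropped first)."""

    def __init__(self, max_tasks: int) -> None:
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._tasks = deque(maxlen=max_tasks)
        self.total_seen = 0

    def add(self, task_id: int, duration_ms: int) -> None:
        self._tasks.append((task_id, duration_ms))
        self.total_seen += 1

    @property
    def dropped(self) -> int:
        return self.total_seen - len(self._tasks)

    def task_ids(self) -> List[int]:
        return [task_id for task_id, _ in self._tasks]

    def summary(self) -> str:
        if self.dropped:
            return (f"Showing {len(self._tasks)} of {self.total_seen} tasks "
                    f"({self.dropped} dropped)")
        return f"Showing all {self.total_seen} tasks"


table = BoundedTaskTable(max_tasks=3)
table.add(0, duration_ms=90_000)  # the slow, "interesting" task
for task_id in range(1, 5):
    table.add(task_id, duration_ms=100)
```

    After these five adds, task 0 is no longer in the table, and without 
something like `summary()` the user has no way to know it was ever there.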
    
    If I read the heap dump screenshot properly, 65000 tasks were taking 
about 250MB.  On a running application that doesn't seem too bad.  On the 
history server, where you can have thousands of applications, that can 
obviously add up, but personally I would prefer that we do something smart, 
like not loading the task data until someone clicks on the button and needs it.
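
    For scale, 250MB across 65000 tasks works out to roughly 4KB per task, 
assuming I read the heap dump correctly.  A minimal sketch of the 
load-on-demand idea follows; the class and the loader callback are hypothetical 
and do not correspond to anything in the history server code.

```python
from typing import Callable, List, Optional


class LazyTaskData:
    """Holds task rows for one application, loading them on first access."""

    def __init__(self, loader: Callable[[], List[dict]]) -> None:
        self._loader = loader          # e.g. replays the event log
        self._tasks: Optional[List[dict]] = None
        self.load_count = 0

    @property
    def loaded(self) -> bool:
        return self._tasks is not None

    def tasks(self) -> List[dict]:
        # Nothing is held in memory until a caller actually asks for it.
        if self._tasks is None:
            self._tasks = self._loader()
            self.load_count += 1
        return self._tasks


def example_loader() -> List[dict]:
    # Stand-in for an expensive read; real data would come from the event log.
    return [{"task_id": i, "duration_ms": 100} for i in range(3)]


app_tasks = LazyTaskData(example_loader)
```

    Here `app_tasks` costs almost nothing until `app_tasks.tasks()` is called, 
so applications that nobody opens never pay the per-task memory cost.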
    
    I'm ok with adding a config for this, but I would prefer to see it default 
to all tasks or a very high number, so that those who want it smaller can 
decrease it.


