Re: help understand/debug high memory footprint on jobmanager

2018-06-29 Thread Steven Wu
Thanks everyone for jumping in. BTW, we are using flink-1.4.1. The deployment is in stand-alone mode. Here is the JIRA: https://issues.apache.org/jira/browse/FLINK-9693

On Fri, Jun 29, 2018 at 12:09 PM, Stephan Ewen wrote:
> Just saw Stefan's response, it is basically the same.
>
> We either null out

Re: help understand/debug high memory footprint on jobmanager

2018-06-29 Thread Stephan Ewen
Just saw Stefan's response, it is basically the same.

We either null out the field on deploy or archival. On deploy would be even more memory friendly.

@Steven - can you open a JIRA ticket for this?

On Fri, Jun 29, 2018 at 9:08 PM, Stephan Ewen wrote:
> The problem seems to be that the Executi

Re: help understand/debug high memory footprint on jobmanager

2018-06-29 Thread Stephan Ewen
The problem seems to be that the Executions that are kept for history (mainly metrics / web UI) still hold a reference to their TaskStateSnapshot. Upon archival, that field needs to be cleared for GC.

This is quite clearly a bug...

On Fri, Jun 29, 2018 at 11:29 AM, Stefan Richter < s.rich...@dat
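
To illustrate the fix being discussed, here is a minimal Java sketch of the "null out the field on deploy or archival" idea. It is not Flink code: `StateSnapshot`, `ExecutionRecord`, `deploy()`, `archive()` and `holdsRestoreState()` are hypothetical stand-ins invented for this example, and the real `Execution` and `TaskStateSnapshot` classes look different. The only point is that an object kept around for history should not keep a strong reference to state it no longer needs.

```java
import java.util.ArrayList;
import java.util.List;

public class ClearStateOnArchive {
    public static void main(String[] args) {
        // History list, standing in for what is kept for metrics / web UI.
        List<ExecutionRecord> history = new ArrayList<>();

        ExecutionRecord execution =
                new ExecutionRecord(new StateSnapshot(new byte[1024]));

        // The task receives the state; the record forgets it immediately.
        StateSnapshot handedToTask = execution.deploy();
        execution.archive();
        history.add(execution);

        System.out.println("bytes handed to task: " + handedToTask.sizeInBytes());
        System.out.println("archived record still holds state: "
                + history.get(0).holdsRestoreState());
    }
}

/** Hypothetical stand-in for restored task state (not Flink's TaskStateSnapshot). */
final class StateSnapshot {
    private final byte[] payload;

    StateSnapshot(byte[] payload) {
        this.payload = payload;
    }

    int sizeInBytes() {
        return payload.length;
    }
}

/** Hypothetical stand-in for an execution attempt that is later kept for history. */
final class ExecutionRecord {
    // Only needed until the task has been deployed.
    private StateSnapshot restoreState;

    ExecutionRecord(StateSnapshot restoreState) {
        this.restoreState = restoreState;
    }

    /** Hands the state to the caller and drops this record's reference to it. */
    StateSnapshot deploy() {
        StateSnapshot state = restoreState;
        restoreState = null;
        return state;
    }

    /** Clears the reference on archival, in case deploy() never ran. */
    void archive() {
        restoreState = null;
    }

    boolean holdsRestoreState() {
        return restoreState != null;
    }
}
```

Clearing the field in `deploy()` releases the reference earliest, which matches the remark that doing it on deploy would be more memory friendly. Clearing it again in `archive()` covers records that were never deployed.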

Re: help understand/debug high memory footprint on jobmanager

2018-06-29 Thread Stefan Richter
Hi Steven, from your analysis, I would conclude that the problem is the following. ExecutionVertexes hold executions, which are bootstrapped with the state (in the form of a map of state handles) when the job is initialized from a checkpoint/savepoint. It holds a reference to this state, even when the task i