Hi,

This is probably a known issue in Hadoop [1]. Unfortunately, it was only fixed in 3.3.0.
Piotrek

[1] https://issues.apache.org/jira/browse/HADOOP-15658

> On 22 Jan 2020, at 13:56, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Thanks for reporting this issue, Mark. I'm pulling Klou into this
> conversation, who knows more about the StreamingFileSink. @Klou, does the
> StreamingFileSink rely on DeleteOnExitHooks to clean up files?
>
> Cheers,
> Till
>
> On Tue, Jan 21, 2020 at 3:38 PM Mark Harris <mark.har...@hivehome.com> wrote:
> Hi,
>
> We're using Flink 1.7.2 on an EMR cluster (emr-5.22.0), which runs Hadoop
> "Amazon 2.8.5". We've recently noticed that some TaskManagers fail (causing
> all the jobs running on them to fail) with a "java.lang.OutOfMemoryError: GC
> overhead limit exceeded". The taskmanager (and the jobs that should be
> running on it) remain down until manually restarted.
>
> I managed to take and analyze a memory dump from one of the afflicted
> taskmanagers.
>
> It showed that 85% of the heap was made up of the
> java.io.DeleteOnExitHook.files hashset. The majority of the strings in that
> hashset (9041060 out of ~9041100) pointed to files whose paths began with
> /tmp/hadoop-yarn/s3a/s3ablock
>
> The problem seems to affect jobs that make use of the StreamingFileSink: all
> of the taskmanager crashes have been on a taskmanager running at least one
> job using this sink, and a cluster running only a single taskmanager / job
> that uses the StreamingFileSink crashed with the GC overhead limit exceeded
> error.
>
> I've had a look for advice on handling this error more broadly, without luck.
>
> Any suggestions or advice gratefully received.
>
> Best regards,
>
> Mark Harris
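
The heap-dump finding in the quoted report (millions of path strings held in java.io.DeleteOnExitHook.files) matches how File.deleteOnExit() is documented to behave: a registration cannot be cancelled, and the list is only processed at JVM shutdown, so deleting the file earlier does not release the entry. The Java sketch below contrasts that pattern with explicit deletion. It is only an illustration under assumptions: the class and method names (TempBlockDemo, createLeaky, writeAndDiscard) and the "s3ablock-" prefix are hypothetical, and this is not the Hadoop code or the actual change made in HADOOP-15658.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempBlockDemo {

    // Pattern that grows memory in a long-running JVM: every call adds one
    // more path string to the JVM-wide delete-on-exit list. That entry stays
    // until shutdown, even if the file itself is deleted earlier.
    static File createLeaky(Path dir) throws IOException {
        File f = File.createTempFile("s3ablock-", ".tmp", dir.toFile());
        f.deleteOnExit();
        return f;
    }

    // Alternative: no deleteOnExit() registration; the caller removes the
    // file explicitly once the block is no longer needed.
    static int writeAndDiscard(Path dir, int blocks) throws IOException {
        int deleted = 0;
        for (int i = 0; i < blocks; i++) {
            File f = File.createTempFile("s3ablock-", ".tmp", dir.toFile());
            try {
                Files.write(f.toPath(), new byte[] {1, 2, 3});
            } finally {
                if (f.delete()) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tempblock-demo");
        System.out.println("deleted: " + writeAndDiscard(dir, 1000));
        Files.delete(dir); // directory is empty again, so this succeeds
    }
}
```

With explicit deletion nothing accumulates per file, at the cost of leaving files behind if the process is killed before the finally block runs.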