JobClient.runJob leaks file descriptors ---------------------------------------
                 Key: MAPREDUCE-821
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-821
             Project: Hadoop Map/Reduce
          Issue Type: Bug
         Environment: Driver running on Ubuntu Jaunty x86; cluster running a Linux variant.
            Reporter: Mark Desnoyer
            Priority: Critical

In a Java-based driver that runs multiple MapReduce jobs (e.g. Mahout's K-means implementation), repeated calls to JobClient.runJob open RPC connections that are never closed. As a result, the driver leaks file descriptors and eventually crashes with "Too many open files" once the OS limit is reached.

This has been verified in Hadoop 0.18.3: as the driver runs successive MapReduce jobs, lsof -p shows an increasing number of open TCP connections to the cluster.

Looking at the current code in trunk, the cause appears to be that runJob never calls close() on the JobClient object it creates. Alternatively, it's caused by the fact that JobClient does not implement a finalizer that calls close(). I am going to verify this hypothesis and post a patch.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
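To illustrate the suspected lifecycle bug, here is a minimal, self-contained sketch. FakeClient and its methods are hypothetical stand-ins for org.apache.hadoop.mapred.JobClient (they do not touch a real cluster); the point is only the resource pattern: the leaky variant mirrors what runJob appears to do in trunk, and the fixed variant closes the client in a finally block.

```java
// Hypothetical model of the JobClient.runJob leak; FakeClient stands in for
// org.apache.hadoop.mapred.JobClient, and openConnections models the RPC
// connections (file descriptors) held open against the cluster.
class FakeClient {
    static int openConnections = 0;

    FakeClient() {
        openConnections++;          // constructor opens an RPC connection
    }

    void submitAndWait() {
        // stand-in for submitting the job and polling until completion
    }

    void close() {
        openConnections--;          // releases the connection / descriptor
    }

    // Leaky pattern: the client created inside the method is never closed,
    // so every call leaves one more connection (and fd) open.
    static void runJobLeaky() {
        FakeClient jc = new FakeClient();
        jc.submitAndWait();
        // missing jc.close()
    }

    // Fixed pattern: close the client in a finally block so the connection
    // is released even if the job submission throws.
    static void runJobFixed() {
        FakeClient jc = new FakeClient();
        try {
            jc.submitAndWait();
        } finally {
            jc.close();
        }
    }
}
```

Running a driver loop over runJobLeaky makes openConnections grow without bound, which matches the lsof observation above, while runJobFixed keeps it flat.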