Hi Sudhan,

well, that was just an example. In reality the situation is slightly more complex: it's not a single thread but a substantial number of third-party libraries that are not fully under my control. While there is a chance that I can shut these things down in most cases, experience shows that some were not meant to be shut down. Also, as the process is reused several times between map and combine, the overhead of re-initializing frameworks is not insignificant.

Thanks,
Henning

On 10/28/2011 06:37 AM, Sudharsan Sampath wrote:

Hi Henning,

I feel it's the non-daemon thread that's causing the issue. A JVM will not exit until all its non-daemon threads have finished. Is there a reason why you want this thread to be non-daemon? If it is unavoidable, can you exit this thread when the reducer's job is completed?

Thanks
Sudhan S

On Thu, Oct 27, 2011 at 9:14 PM, Henning Blohm <henning.bl...@zfabrik.de> wrote:

Hi Harsh,

here's the simplest example I could come up with. Add

    protected void setup(Context context) throws IOException, InterruptedException {
        // start some non-daemon thread
        Thread t = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        });
        t.setDaemon(false);
        t.start();
        System.err.println("Started thread in reduce setup");
    }

to the Reduce inner class in the wordcount sample (source code attached). Assuming it's in wordcount.jar and files have been uploaded for counting (no matter what content, of course), running

    hadoop jar wordcount.jar org.myorg.WordCount wordcount/input wordcount/result

gives me, reproducibly, a hanging "Child" process.

Interestingly, that does not happen when starting a thread like the one above in Map.setup.

One more note: In our case, some non-trivial infrastructure is started and used in map, combine, and reduce. I believe it could be shut down and started again between map and reduce when run in the same JVM. That is, however, expensive and brings no benefit otherwise. If there were a way to know that the JVM will really not be used anymore, that would be a good time to really clean up. Unfortunately, shutdown hooks don't work here, as they will not be run before the non-daemon threads have stopped.

Thanks,
Henning

On 10/27/2011 01:18 PM, Henning Blohm wrote:

Hi Harsh,

that would be 0.20.3. Will try to prepare a stripped down sample later today or tomorrow.

Thanks,
Henning

On 10/27/2011 12:55 PM, Harsh J wrote:

Hey Henning,

What version of Hadoop are you running, and can we have a dumbed down sample to reproduce?

On Thu, Oct 27, 2011 at 3:28 PM, Henning Blohm <henning.bl...@zfabrik.de> wrote:

Hi,

I found that several people have run into this issue, but I have not been able to find a solution yet.

We have reduce tasks that leave a hanging "child" process. The implementation uses a lot of third-party code that leaves Timer threads running (as you can readily see in thread dumps). That is bad style, no doubt. But in the end we don't really care: when the reduce is done, it's done, and the process should simply be killed rather than hanging around and eventually impacting the cluster.

Is there a way to force killing of child processes, e.g. based on job configuration?

Thanks,
Henning

--
*Henning Blohm*
*ZFabrik Software KG*

T: +49/62278399955
F: +49/62278399956
M: +49/1781891820

Bunsenstrasse 1
69190 Walldorf

henning.bl...@zfabrik.de
Linkedin <http://de.linkedin.com/pub/henning-blohm/0/7b5/628>
www.zfabrik.de
www.z2-environment.eu
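
For anyone trying to reproduce this: since the thread keeps coming back to non-daemon threads blocking JVM exit, it may help to list which ones are still alive at the end of a task. The following is only a minimal sketch using the standard java.lang.Thread API. The class name LingeringThreads and the method findNonDaemonThreads are made up for illustration and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper (an assumption for illustration, not a Hadoop API).
class LingeringThreads {

    /** Returns all live non-daemon threads except the calling thread. */
    public static List<Thread> findNonDaemonThreads() {
        List<Thread> result = new ArrayList<Thread>();
        // getAllStackTraces() returns a map keyed by every live thread in the JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t != Thread.currentThread() && t.isAlive() && !t.isDaemon()) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the name of every thread that would keep the JVM from exiting.
        for (Thread t : findNonDaemonThreads()) {
            System.err.println("Non-daemon thread still running: " + t.getName());
        }
    }
}
```

Calling findNonDaemonThreads() from the reducer's cleanup(Context) and logging the names would be one way to see which third-party Timer threads are left behind. It only reports them; it does not stop them or force the process to exit.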