Looks like you have a good point, and I think you are right. I'll file a JIRA issue to handle this more generally, i.e., to fix every place where this kind of check needs to be done.

-----Original Message-----
From: Calvin Yu [mailto:[EMAIL PROTECTED]
Sent: Friday, June 01, 2007 8:50 PM
To: hadoop-user@lucene.apache.org
Subject: Bad concurrency bug in 0.12.3?

I've been experiencing some issues where my mapred tasks have been hanging after a lengthy period of execution. I believe I've found the problem and wanted to get others' thoughts about it.

The problem seems to be with the MapTask's (MapTask.java) sort progress thread (line #196) not stopping after the sort is completed, and hence the call to join() (line #190) never returns. This is because that thread only catches the InterruptedException and does not check the thread's interrupted flag as well. According to the Javadocs, an InterruptedException is thrown only if the Thread is in the middle of a sleep(), wait(), join(), etc. call; during normal operation only the interrupted flag is set.

Can someone confirm this? I'm going to patch my install to see if this is my problem, but I seem to run into it only after several hours of processing and would like to get earlier confirmation.

I did a search in JIRA and it looks like there are patches (HADOOP-1431) that might inadvertently solve this problem, but I didn't see any one ticket that specifically details this scenario.

Calvin
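
For reference, here is a minimal sketch of the kind of check discussed in this thread. It is not the actual MapTask code: the class name SortProgressSketch, the helper methods, and the tick counter are all made up for illustration. It only shows a background loop that both tests the interrupted flag on every pass and exits when a blocking call throws InterruptedException, so that a later join() can return.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names here are assumptions, not Hadoop's API.
public class SortProgressSketch {

    // Builds a stand-in for a "progress reporting" thread.
    static Thread newProgressThread(final AtomicLong ticks, final long sleepMillis) {
        Thread t = new Thread(() -> {
            // If the thread is not blocked when interrupt() is called, only the
            // flag is set, so the loop condition checks it on every pass.
            while (!Thread.currentThread().isInterrupted()) {
                ticks.incrementAndGet(); // placeholder for reporting progress
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // sleep() clears the flag when it throws, so the loop
                    // condition would not see it; leave the loop explicitly.
                    return;
                }
            }
        });
        t.setDaemon(true);
        return t;
    }

    // Starts the thread, interrupts it, and reports whether it terminated.
    static boolean stopsAfterInterrupt(long sleepMillis) throws InterruptedException {
        AtomicLong ticks = new AtomicLong();
        Thread t = newProgressThread(ticks, sleepMillis);
        t.start();
        Thread.sleep(50);   // let the loop run for a few passes
        t.interrupt();      // request a stop
        t.join(2000);       // bounded wait instead of a bare join()
        return !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stopped: " + stopsAfterInterrupt(10));
    }
}
```

The bounded join(2000) is a design choice for the sketch: even if the worker ignored the interrupt, the caller would not hang indefinitely the way the tasks described above do.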