Looks like you have a good point! I think you are right.
Let me raise a JIRA issue to handle this more generally, i.e., to fix all
the places where this kind of check needs to be done.

-----Original Message-----
From: Calvin Yu [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 01, 2007 8:50 PM
To: hadoop-user@lucene.apache.org
Subject: Bad concurrency bug in 0.12.3?

I've been experiencing some issues where my mapred tasks have been hanging
after a lengthy period of execution.  I believe I've found the problem and
wanted to get others' thoughts about it.

The problem seems to be with the MapTask's (MapTask.java) sort progress
thread (line #196) not stopping after the sort is completed, and hence the
call to join() (line #190) never returns.  This is because that thread is
only catching the InterruptedException, and not checking the thread's
interrupted flag as well.  According to the Javadocs, an
InterruptedException is thrown only if the thread is blocked in a call such
as sleep(), wait(), or join(); during normal execution only the interrupted
flag is set.  Can someone confirm this?  I'm going to patch my
install to see if this is my problem, but I seem to only run into this
problem after several hours of processing and would like to get earlier
confirmation.
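
To illustrate the behaviour described above, here is a minimal, self-contained
sketch. It is not the MapTask code: the class InterruptFlagSketch, the method
stopsAfterInterrupt, and the forceStop safety valve are hypothetical names made
up for this example. The worker loop never calls sleep(), wait(), or join(), so
interrupt() can only set the flag, and the loop exits only if it polls that
flag itself.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not taken from Hadoop.
public class InterruptFlagSketch {

    /**
     * Starts a worker that spins without any blocking call, interrupts it,
     * and reports whether it exited within half a second.
     *
     * @param checkFlag whether the worker polls its own interrupted flag
     */
    public static boolean stopsAfterInterrupt(final boolean checkFlag) {
        // Safety valve so the demo can always shut the worker down afterwards.
        final AtomicBoolean forceStop = new AtomicBoolean(false);

        Thread worker = new Thread(new Runnable() {
            public void run() {
                while (!forceStop.get()) {
                    // No blocking call here, so no InterruptedException is
                    // ever thrown; the flag must be checked explicitly.
                    if (checkFlag && Thread.currentThread().isInterrupted()) {
                        return;
                    }
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
        worker.interrupt();

        try {
            worker.join(500);                  // wait up to 500 ms
            boolean stopped = !worker.isAlive();
            forceStop.set(true);               // release a worker that ignored the interrupt
            worker.join();
            return stopped;
        } catch (InterruptedException e) {
            // The calling thread itself was interrupted: clean up and give up.
            forceStop.set(true);
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("polls flag:   stopped = " + stopsAfterInterrupt(true));
        System.out.println("ignores flag: stopped = " + stopsAfterInterrupt(false));
    }
}
```

In this sketch the worker that polls isInterrupted() should stop almost
immediately, while the one that ignores the flag keeps running until the demo
forces it to exit. Whether the sort progress thread hits exactly this path is
still the open question raised here.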

I did a search in JIRA and it looks like there are patches
(HADOOP-1431) that might inadvertently solve this problem, but didn't see
any one ticket that specifically details this scenario.

Calvin
