Mike Smith wrote:
I am in the same state again: the same reduce jobs keep failing on different
machines. I cannot get the dump using kill -3 pid; it does not make the
thread quit. I also tried to add some logging to FetcherOutputFormat,
but because of this bug:
https://issues.apache.org/jira/browse/HADOOP-406
logging is not possible in the child threads. Do you have any idea why the reducers don't catch the QUIT signal from the cache? I am running the
latest version from SVN; otherwise I could log some key, value, and URL
filtering information at the reduce stage.

SIGQUIT should not make the JVM quit; it should produce a thread dump on stderr. You need to manually find the process that corresponds to the child JVM of the task, e.g. with top(1) or ps(1), and then execute 'kill -SIGQUIT <pid>'.
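
If you would rather script the "find the pid, then send SIGQUIT" step, something along these lines might work. It is only a rough sketch for Unix hosts: the helper names are made up for illustration, and the marker string you search for (for instance the main class name of the task's child JVM) is an assumption you should check against your own ps output first.

```python
import os
import signal
import subprocess


def find_pids(ps_output, marker):
    """Return PIDs of processes whose command line contains `marker`.

    `ps_output` is text in the format printed by `ps -eo pid,args`.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # The header row is skipped because its first field is not numeric.
        if len(parts) == 2 and parts[0].isdigit() and marker in parts[1]:
            pids.append(int(parts[0]))
    return pids


def request_thread_dump(pid):
    """Send SIGQUIT to `pid` (Unix only), the same as 'kill -3 <pid>'."""
    os.kill(pid, signal.SIGQUIT)


def dump_all(marker):
    """Hypothetical helper: signal every process whose command line matches."""
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_pids(listing, marker):
        print("sending SIGQUIT to", pid)
        request_thread_dump(pid)
```

Run it on the machine where the task is hanging and call dump_all() with whatever string identifies the child JVM in your ps listing. The dump goes to the stderr of the child process, not to the terminal you run this from.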

You can use Hadoop's log4j.properties to quickly enable a lot of log info, including stderr - put it in conf on every tasktracker and restart the cluster.

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

