Mike Smith wrote:
I am in the same state again, and the same reduce jobs keep failing on
different machines. I cannot get the dump using kill -3 pid; it does not
make the thread quit. Also, I tried to add some logging to
FetcherOutputFormat, but because of this bug:
https://issues.apache.org/jira/browse/HADOOP-406
logging is not possible in the child threads. Do you have any idea why
the reducers don't catch the QUIT signal from the cache? I am running the
latest version from SVN; otherwise I could log some key, value and URL
filtering information at the reduce stage.
SIGQUIT should not make the JVM quit; it should produce a thread dump on
stderr. You need to manually find the process that corresponds to the
child JVM of the task, e.g. with top(1) or ps(1), and then execute 'kill
-SIGQUIT <pid>'.
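As a rough sketch of the steps above, the signal can be wrapped in a small
shell function. The name dump_threads is made up for illustration, and the
pid still has to be picked by hand from the ps(1) output; nothing here is
specific to Hadoop.

```shell
#!/bin/sh
# Ask a JVM for a thread dump by sending it SIGQUIT.
# dump_threads is a hypothetical helper name, not part of Hadoop or Nutch.
# Usage: dump_threads <pid>
dump_threads() {
    pid="$1"
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    # A JVM normally answers SIGQUIT by printing all thread stacks to its
    # own stderr and then keeps running.
    kill -QUIT "$pid"
}

# Finding the pid: list processes with their full command lines and pick
# the task's child JVM by eye, for example:
#   ps -eo pid,args | grep '[j]ava'
```

The dump shows up wherever the stderr of that child JVM ends up, not in
the terminal where kill was run.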
You can use Hadoop's log4j.properties to quickly enable a lot of log
info, including stderr. Put it in the conf directory on every tasktracker
and restart the cluster.
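For illustration only, a log4j.properties along these lines would send
verbose output to stderr. This is a generic log4j 1.x fragment, not a copy
of the file shipped with Hadoop; the appender name "console", the DEBUG
level and the pattern are assumptions to adjust as needed.

```properties
# Illustrative fragment; the appender name "console" is an arbitrary choice.
# Log everything at DEBUG and above to the "console" appender.
log4j.rootLogger=DEBUG,console

# Console appender writing to stderr instead of the default stdout.
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# Timestamp, level, logger name, message.
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

DEBUG on the root logger is very noisy on a busy cluster, so it may be
worth switching back to INFO once the failing reduce has been captured.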
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com