Hello,

(Inline)

On Tue, Oct 18, 2011 at 12:04 AM, W.P. McNeill <[email protected]> wrote:
<snip>
> 1. *Turn on JMX remote for the tasks*...I added the following options to
> mapred.child.java.opts:
> com.sun.management.jmxremote,
> com.sun.management.jmxremote.port=8004,com.sun.management.jmxremote.authenticate=false,com.sun.management.jmxremote.ssl=false.
>
> This does not work because there is contention for the JMX remote port when
> multiple tasks run on the same node. All but the first task fail at JVM
> initialization time, causing the job to fail before I can see the repro.

For profiling this way, you are probably interested in just one task
anyway. So switching your slots down to 1 would be an easy way out:
one mapper at a time, reusing the port as it goes.
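A sketch of that slot change, assuming a 0.20-era mapred-site.xml on each TaskTracker (the TaskTracker needs a restart for these to take effect):

```xml
<!-- mapred-site.xml: run at most one task of each kind at a time, so
     only one child JVM per node tries to bind the fixed jmxremote port. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
```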

> 2. *Use jstatd*...I tried running jstatd in the background on my cluster
> nodes. It launches and runs, but when I try to connect using Visual VM,
> nothing happens.

While I find it odd that jstatd doesn't seem to expose the host's JVM
metrics for you, I don't think jstatd would let you do memory
profiling anyway; AFAIK you need JMX for that. You can still observe
heap charts with jstatd running, though.
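One common reason VisualVM shows nothing is that jstatd was started without a security policy, so its RMI server never comes up properly. A sketch of the usual workaround (the policy file path /tmp/all.policy and port 1099 are my choices here, not anything your cluster requires):

```shell
# Write a permissive policy granting tools.jar (which hosts jstatd) all
# permissions; the single-quoted EOF keeps ${java.home} unexpanded so the
# JVM, not the shell, resolves it.
cat > /tmp/all.policy <<'EOF'
grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
EOF

# Start jstatd with that policy on the default RMI registry port,
# if jstatd is on the PATH.
command -v jstatd >/dev/null &&
    jstatd -J-Djava.security.policy=/tmp/all.policy -p 1099 &
```

Then point VisualVM at host:1099; if it still shows nothing, check the node's firewall, since jstatd also uses ephemeral ports beyond 1099.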

> I am going to try adding -XX:+HeapDumpOnOutOfMemoryError, which will at
> least give me post-mortem information. Does anyone know where the heap dump
> file will be written?

Set keep.failed.task.files to true for your job, then hunt the
attempt directory down under the mapred.local.dir of the TaskTracker
that ran it. An easier way is to also log the child's working
directory from your Java code, so the logs tell you which disk it's
on. Under the attempt dir, you should be able to locate your heap
dump.
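A minimal sketch of logging that working directory (the class and method names are illustrative, not Hadoop API; you'd call cwd() from your map/reduce code):

```java
import java.io.File;

public class CwdLogger {
    // The child JVM's working directory is the attempt directory under
    // one of the TaskTracker's mapred.local.dir disks.
    static String cwd() {
        return new File(".").getAbsolutePath();
    }

    public static void main(String[] args) {
        // By default, -XX:+HeapDumpOnOutOfMemoryError writes its
        // java_pid<pid>.hprof into this same directory.
        System.err.println("task cwd = " + cwd());
    }
}
```

Grepping the task logs for "task cwd" then points you straight at the right disk and attempt dir.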

> Has anyone debugged a similar setup? What tools did you use?

I think you'll find some (possibly odd looking) ways described on
https://issues.apache.org/jira/browse/MAPREDUCE-2637, similar to your
approach.

-- 
Harsh J
