If you know which vertex or task is slow, you can enable profiling on just
those. Please check the "Profiling in Tez" section in
https://cwiki.apache.org/confluence/display/TEZ/How+to+Diagnose+Tez+App.
Note that those instructions use YourKit, which dumps the snapshot at the end
of the run. I haven't tried this with hprof options.
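
As a rough, untested sketch of what that could look like with hprof instead
of YourKit, assuming Tez's task-specific launch options: the vertex name `v1`
and the task indexes 0 and 1 are placeholders, and the hprof arguments are
simply copied from the options quoted below.

```
-Dtez.task-specific.launch.cmd-opts.list="v1[0,1]" \
-Dtez.task-specific.launch.cmd-opts="-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=<LOG_DIR>/profile-__TASK_INDEX__.out"
```

Here `v1[0,1]` should restrict the extra JVM options to tasks 0 and 1 of
vertex `v1`, and `__TASK_INDEX__` is expected to be replaced with each task's
index, so only those tasks pay the profiling overhead.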

~Rajesh.B

On Thu, Jan 19, 2017 at 5:39 AM, Piyush Narang <pnar...@twitter.com> wrote:

> hi folks,
>
> I had a couple of Cascading3 on Tez jobs that seemed to be running slower
> on Tez than on Hadoop. I wanted to get some hprof profiles to see what the
> jobs are spending time on, so I tried running the job with:
>
> -Dmapreduce.task.profile=true \
> -Dtez.task.launch.cluster-default.cmd-opts="-XX:+UseSerialGC
> -Djava.net.preferIPv4Stack=true -XX:ReservedCodeCacheSize=128M
> -XX:MaxMetaspaceSize=256M -XX:CompressedClassSpaceSize=256M
> -XX:CICompilerCount=2 -XX:HeapDumpPath=<LOG_DIR>/heapdump-@taskid@.hprof
> -XX:ErrorFile=<LOG_DIR>/hs_err_pid-@taskid@.log
> -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=<LOG_DIR>/profile-@taskid@.out"
>
> Now when I kick off the Tez job, it seems to spend a very long time
> (upwards of an hour) stuck at 0%. The tasks don't seem to proceed beyond
> this:
>
> 2017-01-18 23:53:26,261 [INFO] [TezChild] |tez.FlowProcessor|: flow node id: 
> E08C07BFB10141D8B6D7211E5AF172E4, all 1 inputs ready in: 00:00:00.002
>
> I tried capturing some jstacks, and the top of the stack seems to hold some
> hprof-related frames.
>
> Has anyone been able to profile their Tez jobs with hprof? Are there any
> other settings I'm missing?
>
> The same set of options (replacing tez.task.launch.cluster-default.cmd-opts
> with mapreduce.task.profile.params) seems to work fine on Hadoop; I end up
> getting the hprof profiles there.
>
> Thanks,
>
> --
> - Piyush
>
