Hi folks,

I have a couple of Cascading3 on Tez jobs that seem to run slower on Tez than on Hadoop. I wanted to capture some hprof profiles to see where the jobs are spending their time, so I tried running the job with:

-Dmapreduce.task.profile=true \
-Dtez.task.launch.cluster-default.cmd-opts="-XX:+UseSerialGC -Djava.net.preferIPv4Stack=true -XX:ReservedCodeCacheSize=128M -XX:MaxMetaspaceSize=256M -XX:CompressedClassSpaceSize=256M -XX:CICompilerCount=2 -XX:HeapDumpPath=<LOG_DIR>/heapdump-@taskid@.hprof -XX:ErrorFile=<LOG_DIR>/hs_err_pid-@taskid@.log -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=<LOG_DIR>/profile-@taskid@.out"

Now when I kick off the Tez job, it spends a very long time (upwards of an hour) stuck at 0%. The tasks don't seem to proceed beyond this log line:

2017-01-18 23:53:26,261 [INFO] [TezChild] |tez.FlowProcessor|: flow node id: E08C07BFB10141D8B6D7211E5AF172E4, all 1 inputs ready in: 00:00:00.002

I tried capturing some jstacks, and the top of the stack seems to hold some hprof-related frames.

Has anyone been able to profile their Tez jobs with hprof? Are there any other settings I'm missing? The same set of options (with tez.task.launch.cluster-default.cmd-opts replaced by mapreduce.task.profile.params) seems to work fine with Hadoop; I do get the hprof profiles there.

Thanks,
--
- Piyush