Is it possible to upload just the AM logs? That would be helpful. The problem appears to be with "scope_38_INPUT_scope_37", but without the logs and without knowing the DAG it would be hard to locate the issue.
Otherwise, try:

    yarn logs -applicationId appId | grep "HISTORY" > history.log

If you are using SimpleHistoryLoggingService (which is the default), check whether the "history.txt" logs are available and can be shared. If you are not sure about the location, check:

    yarn logs -applicationId appId | grep 'Initializing SimpleHistoryLoggingService, logFileLocation='

~Rajesh.B

On Thu, Sep 3, 2015 at 3:30 PM, Sandeep Kumar <[email protected]> wrote:

> @Rohini, I used the new version of Pig, i.e. 0.15.0; unfortunately the
> performance of my script degraded.
> 2015-09-03 15:15:24,698 [main] INFO org.apache.pig.Main - Pig script completed in 4 minutes, 1 second and 22 milliseconds (241022 ms)
>
> whereas earlier it was taking hardly 3 minutes and 27 seconds.
>
> PFA the task counters. Following are the versions of the software being used:
>
> HadoopVersion:
> 2.6.0-cdh5.4.4
>
> PigVersion:
> 0.15.1-SNAPSHOT
>
> TezVersion:
> 0.7.0
>
>
> Regards,
> Sandeep
>
> On Thu, Sep 3, 2015 at 2:46 PM, Sandeep Kumar <[email protected]>
> wrote:
>
>> @Rajesh, PFA the required statistics. It's difficult to share the application
>> logs because they are huge in size (i.e. 167MB). In case you want anything
>> specific from those logs, please let me know.
>>
>> @Rohini,
>> Thanks for the suggestion regarding the new version of Pig. I'll give it a try
>> for sure.
>>
>> Regards,
>> Sandeep
>>
>> On Thu, Sep 3, 2015 at 2:31 PM, Rohini Palaniswamy <
>> [email protected]> wrote:
>>
>>> Sandeep,
>>> Can you try with Pig 0.15 first? There are a ton of fixes that have gone
>>> in for Pig on Tez into that release and many of them are performance fixes.
>>>
>>> Regards,
>>> Rohini
>>>
>>> On Thu, Sep 3, 2015 at 1:05 AM, Rajesh Balamohan <[email protected]>
>>> wrote:
>>>
>>>> Can you post the application logs? It would be helpful if you could
>>>> run with "tez.task.generate.counters.per.io=true". This would generate
>>>> the per-IO statistics, which can be useful for debugging.
>>>>
>>>>
>>>> ~Rajesh.B
>>>>
>>>> On Thu, Sep 3, 2015 at 1:20 PM, Sandeep Kumar <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm using Pig-0.14.0 over Tez-0.7.0 for running some basic Pig
>>>>> scripts. I'm not able to see any performance gain using Tez. My Pig
>>>>> scripts are taking the same amount of time on the mapred executionType
>>>>> as well.
>>>>>
>>>>> Following are the parameters which are in mapred-site.xml and being
>>>>> read by Tez, and I'm not able to override them even if I mention them in my
>>>>> tez-site.xml:
>>>>>
>>>>> tez.runtime.shuffle.merge.percent=0.66
>>>>> tez.runtime.shuffle.fetch.buffer.percent=0.70
>>>>> tez.runtime.io.sort.mb=256
>>>>> tez.runtime.shuffle.memory.limit.percent=0.25
>>>>> tez.runtime.io.sort.factor=64
>>>>> tez.runtime.shuffle.connect.timeout=180000
>>>>> tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort
>>>>> tez.runtime.merge.progress.records=10000
>>>>> tez.runtime.compress=true
>>>>> tez.runtime.sort.spill.percent=0.8
>>>>> tez.runtime.shuffle.ssl.enable=false
>>>>> tez.runtime.ifile.readahead=true
>>>>> tez.runtime.shuffle.parallel.copies=10
>>>>> tez.runtime.ifile.readahead.bytes=4194304
>>>>> tez.runtime.task.input.post-merge.buffer.percent=0.0
>>>>> tez.runtime.shuffle.read.timeout=180000
>>>>> tez.runtime.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
>>>>>
>>>>>
>>>>>
>>>>> PFA the list of task counters. I can see a lot of data is being spilled,
>>>>> but if I try to increase tez.runtime.io.sort.mb through
>>>>> mapred-site.xml then my script terminates with an OOM exception.
>>>>>
>>>>> Can you please suggest what parameters I should change to improve the
>>>>> performance of Pig using Tez?
>>>>>
>>>>> Regards,
>>>>> Sandeep
>>>>>
>>>>
>>>>
>>>
>>
>
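For reference, the two log-extraction commands suggested at the top of this reply could be wrapped roughly as below. This is only a sketch: the function names filter_history and find_history_location are made up for illustration and are not part of YARN or Tez, and both simply filter whatever "yarn logs" writes to stdout.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not YARN or Tez commands.

# Keep only the lines containing "HISTORY" from aggregated logs on stdin.
filter_history() {
    grep "HISTORY"
}

# Print the line in which SimpleHistoryLoggingService reports where it
# writes its history file.
find_history_location() {
    grep "Initializing SimpleHistoryLoggingService, logFileLocation="
}

# Usage (replace appId with the real application ID):
#   yarn logs -applicationId appId | filter_history > history.log
#   yarn logs -applicationId appId | find_history_location
```

Filtering on stdin keeps the helpers usable on a log file that has already been saved locally, which may matter when the aggregated logs are as large as the 167MB mentioned in the thread.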
