Hi Rajesh Thanks for your response. *This seems to be working for me.* By setting pig.exec.reducers.max to 10 i am able to complete my run in under 4 mins.(Initially it was running in 14-15 mins). I'm new to pig/tez/hadoop world. Do you write any blogs about pig/tez/hadoop etc? Can you suggest any tutorials/links to read about tez? I need to understand concepts like scope, DAG, parallelism etc. I just have a very basic understanding of tez. If i understand all these concepts i'll be able to tune my job.
Thanks On Tue, Jul 7, 2015 at 1:23 PM, Rajesh Balamohan <[email protected] > wrote: > Forgot to add the following. Ideally auto-reduce implementation should > have kicked-in and on need basis, it should have decreased the number of > reducers needed. However, for the vertices of concern (scope-2037 & > scope-2162), auto-reducer has been turned off in configuration by Pig and > for the rest of the vertices it is turned on. > > Pig folks would be able to help in terms of providing details on why > auto-reduce parallelism is turned off in certain vertices. > > 2015-07-06 14:11:35,109 INFO [AsyncDispatcher event handler] > impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for > vertex_1436152736518_0210_1_28* [scope-2037]* > 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] > vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings > minFrac:0.25 maxFrac:0.75* auto:false* desiredTaskIput:104857600 > minTasks:1 > 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] > impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_28 > [scope-2037] > .... > > 2015-07-06 14:11:35,245 INFO [AsyncDispatcher event handler] > impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for > vertex_1436152736518_0210_1_39 *[scope-2162]* > 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler] > vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings > minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600 > minTasks:1 > 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler] > impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_39 > [scope-2162] > .... > > 2015-07-06 14:11:35,417 INFO [AsyncDispatcher event handler] > impl.VertexImpl: Setting user vertex manager plugin: > org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager on vertex:* > scope-2185* > 2015-07-06 14:11:35,419 INFO [AsyncDispatcher event handler] > vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings > minFrac:0.25 maxFrac:0.75 *auto:true* desiredTaskIput:104857600 minTasks:1 > ... > > > On Tue, Jul 7, 2015 at 12:47 PM, Rajesh Balamohan < > [email protected]> wrote: > >> >> Attaching the DAG and the swimlane for the job. >> >> scope-2052 which had to give the data to other vertices slowed down (~ >> 150-180 seconds) due to multiple spills and NumberFormatExceptions in >> data. You might want to try setting >> "tez.task.scale.memory.additional-reservation.fraction.max='PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:12,SORTED_OUTPUT:12,SORTED_MERGED_INPUT:1,PROCESSOR:1,OTHER:1' >> " to allocate more memory for unordered outputs. >> Following are the details for this scope. >> - attempt_1436152736518_0210_1_31_000000_0, >> PigLatin:dmwith1tapin.pig-0_scope-0, VertexName: scope-2052, >> VertexParallelism: 1, >> TaskAttemptID:attempt_1436152736518_0210_1_31_000000_0, >> - numInputs=1, numOutputs=4, JVM.maxFree=734527488 >> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: >> Informing: INPUT, scope-546, org.apache.tez.mapreduce.input.MRInput: >> requested=0, allocated=0 >> - Small allocation of ~7 MB allocation to unordered output lead to >> multiple spills. >> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: >> Informing: OUTPUTPUT_RECORDS, scope-2117, >> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput: >> requested=268435456, allocated=222303401 >> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: >> Informing: OUTPUT, scope-2251, >> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput: >> requested=268435456, allocated=222303401 >> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: >> Informing: OUTPUT, scope-2063, >> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: >> requested=104857600, allocated=7236438 >> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: >> Informing: OUTPUT, scope-2068, >> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: >> requested=104857600, allocated=7236438 >> - Too many number of records had issues in NumberFormatException leading >> to large amount of logs. This dragged the runtime of this task to around >> e.g "mapReduceLayer.PigHadoopLogger: >> java.lang.Class(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to >> interpret value in field being converted to long, caught >> NumberFormatException <empty String> field discarded" >> >> >> - scope-2037 and scope-2162 had set the vertex parallelism to "999" >> affecting subsequent execution. >> - VertexName: scope-2037, VertexParallelism: 999 vertex: >> vertex_1436152736518_0210_1_28 finished in *410* seconds. Tasks >> themselves were small, but due to large number of tasks that had to be >> executed in small containers (pretty much used the same container to >> execute this) it took time. >> - VertexName: scope-2162, VertexParallelism: 999 vertex: >> vertex_1436152736518_0210_1_39 finished in *697* seconds. Similar >> observation as previous vertex. >> *Above 2 vertices have caused the entire job to slow down.* >> >> "999" is set as the reducer parallelism at compile time by Pig. This is >> not for the input. I am not sure how pig sets the parallelism at compile >> time. You can possibly try setting "pig.exec.reducers.max=50" in your case >> and give it a try. Pig folks would be in a better position to explain that. >> >> >> On Tue, Jul 7, 2015 at 11:22 AM, Sachin Sabbarwal < >> [email protected]> wrote: >> >>> >>> logs.gz >>> <https://drive.google.com/file/d/0B-RFcYxUIHzzUVJpRzVDZXB5TUk/view?usp=drive_web> >>> Hi Rajesh >>> >>> PFA the gziped logs. >>> FYI It's a single file, when you'll gunzip it, it'll be around 1.5gb in >>> size. >>> One more thing which you might find useful: >>> >>> In the dmOutputTez file i could see following line, which suggests that >>> TEZ created a total of 7660 tasks. This is surprising as my data is only >>> few mbs(10-15 mb max). How is this number of tasks decided? is there any >>> property to tune it? >>> >>> 2015-07-07 05:37:02,647 [Timer-0] INFO >>> org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG >>> Status: status=RUNNING, progress=TotalTasks: 7660 Succeeded: 0 Running: >>> 0 Failed: 0 Killed: 0, diagnostics= >>> >>> Thanks >>> >>> On Mon, Jul 6, 2015 at 8:34 PM, Rajesh Balamohan < >>> [email protected]> wrote: >>> >>>> yarn logs -applicationId application_1436152736518_0210 >>>> >>>> You can possibly send the output to a log file, gzip it and post it. >>>> >>>> ~Rajesh.B >>>> >>>> On Mon, Jul 6, 2015 at 8:12 PM, Sachin Sabbarwal < >>>> [email protected]> wrote: >>>> >>>>> Hi >>>>> Thanks for reply. My tez-site.xml contains only following: >>>>> >>>>> <configuration> >>>>> <property> >>>>> <name>tez.lib.uris</name> >>>>> <value>${fs.defaultFS}/apps/tez-0.5/tez-0.5.3.tar.gz, >>>>> ${fs.defaultFS}/apps/tez-0.5/*,${fs.defaultFS}/apps/tez-0.5/lib/*</value> >>>>> </property> >>>>> </configuration> >>>>> >>>>> PFA the application logs. Here is the version information: >>>>> 1. Hadoop version: Hadoop 2.5.0-cdh5.3.1 >>>>> 2. Pig: Apache Pig version 0.14.0 (r1640057) >>>>> 3. Tez: 0.5.3 >>>>> >>>>> Lemme know if anything else is needed. >>>>> >>>>> Thanks in advance >>>>> >>>>> On Mon, Jul 6, 2015 at 7:07 PM, Rajesh Balamohan < >>>>> [email protected]> wrote: >>>>> >>>>>> Can you post the application logs, tez-site.xml and also the version >>>>>> details? >>>>>> >>>>>> ~Rajesh.B >>>>>> >>>>>> On Mon, Jul 6, 2015 at 6:38 PM, Sachin Sabbarwal < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> >>>>>>> ---------- Forwarded message ---------- >>>>>>> From: Sachin Sabbarwal <[email protected]> >>>>>>> Date: Mon, Jul 6, 2015 at 5:34 PM >>>>>>> Subject: Same pig script running slower with Tez as compared with >>>>>>> run in Mapred mode >>>>>>> To: [email protected] >>>>>>> >>>>>>> >>>>>>> Hello Guys >>>>>>> Trying Apache Tez. >>>>>>> I've setup to use pig in TEZ mode. >>>>>>> I'm running a pig script against i) no data and ii) with some data. >>>>>>> In case i) when i run with pig using TEZ mode my pig scripts >>>>>>> completes run in ~40secs. Whereas when i run case i) with mapred it >>>>>>> takes >>>>>>> around 7-8 mins. >>>>>>> in case ii) when run with pig using TEZ, same pig script takes >>>>>>> around 14-15 mins but with mapred it takes around 10 mins. >>>>>>> When i'm running same pig script with production data(which is much >>>>>>> more than the data i used here to run case i) and (ii) ) the job takes >>>>>>> hours to complete. >>>>>>> Hence I'm trying tez to run my pig job in a faster mode. I'm not >>>>>>> really sure what i might be missing here. Please help, ask for any >>>>>>> further >>>>>>> info if required. >>>>>>> >>>>>>> >>>>>>> Thanks >>>>>>> -- >>>>>>> Sachin Sabbarwal >>>>>>> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265 >>>>>>> Facebook: facebook.com/sachinsabbarwal >>>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal >>>>>>> Blog: http://sachinsabbarwal.tumblr.com/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Sachin Sabbarwal >>>>>>> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265 >>>>>>> Facebook: facebook.com/sachinsabbarwal >>>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal >>>>>>> Blog: http://sachinsabbarwal.tumblr.com/ >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> ~Rajesh.B >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Sachin Sabbarwal >>>>> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265 >>>>> Facebook: facebook.com/sachinsabbarwal >>>>> Quora: http://www.quora.com/Sachin-Sabbarwal >>>>> Blog: http://sachinsabbarwal.tumblr.com/ >>>>> >>>> >>>> >>>> >>>> -- >>>> ~Rajesh.B >>>> >>> >>> >>> >>> -- >>> Sachin Sabbarwal >>> Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265 >>> Facebook: facebook.com/sachinsabbarwal >>> Quora: http://www.quora.com/Sachin-Sabbarwal >>> Blog: http://sachinsabbarwal.tumblr.com/ >>> >> >> >> >> -- >> ~Rajesh.B >> > > > > -- > ~Rajesh.B > -- Sachin Sabbarwal Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265 Facebook: facebook.com/sachinsabbarwal Quora: http://www.quora.com/Sachin-Sabbarwal Blog: http://sachinsabbarwal.tumblr.com/
