Re: Same pig script running slower with Tez as compared with run in Mapred mode

Sachin Sabbarwal Tue, 07 Jul 2015 02:45:12 -0700

Thanks Rajesh. I've also sent a mail to the Pig user mailing list and am
waiting for their reply. I'll follow up here so that others know about the
problem.


On Tue, Jul 7, 2015 at 2:12 PM, Rajesh Balamohan <[email protected]
> wrote:

> Hi Sachin,
>
> That was just a temporary workaround to confirm that this was the issue.
> Ideally the user should not need to set this parameter. The real issue is
> why auto-reduce parallelism is set to false for certain vertices in Pig on
> Tez. Will wait for Pig folks to chime in.
>
> For docs and tutorials, you can start with the following:
> - http://tez.apache.org/talks.html
> - A couple of YouTube videos are available from Hadoop Summits and meetups.
> - http://hortonworks.com/blog/apache-tez-a-new-chapter-in-hadoop-data-processing/
>   (this is pretty old)
> - Pig on Tez:
>   http://www.slideshare.net/Hadoop_Summit/pig-on-tez-low-latency-data-processing-with-big-data
>
> ~Rajesh.B
>
> On Tue, Jul 7, 2015 at 1:54 PM, Sachin Sabbarwal <
> [email protected]> wrote:
>
>> Hi Rajesh
>> Thanks for your response. *This seems to be working for me.*
>> By setting pig.exec.reducers.max to 10 I am able to complete my run in
>> under 4 mins (initially it was running in 14-15 mins).
>> I'm new to the Pig/Tez/Hadoop world. Do you write any blogs about
>> Pig/Tez/Hadoop? Can you suggest any tutorials or links to read about Tez?
>> I need to understand concepts like scope, DAG, parallelism etc. I have
>> only a very basic understanding of Tez. Once I understand these concepts,
>> I'll be able to tune my job.
>>
>> Thanks
>>
>>
>> On Tue, Jul 7, 2015 at 1:23 PM, Rajesh Balamohan <
>> [email protected]> wrote:
>>
>>> Forgot to add the following. Ideally the auto-reduce implementation should
>>> have kicked in and decreased the number of reducers as needed.
>>> However, for the vertices of concern (scope-2037 &
>>> scope-2162), auto-reduce has been turned off in the configuration by Pig,
>>> while for the rest of the vertices it is turned on.
>>>
>>> Pig folks would be able to help in terms of providing details on why
>>> auto-reduce parallelism is turned off in certain vertices.
>>>
>>> 2015-07-06 14:11:35,109 INFO [AsyncDispatcher event handler]
>>> impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for
>>> vertex_1436152736518_0210_1_28* [scope-2037]*
>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
>>> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
>>> minFrac:0.25 maxFrac:0.75* auto:false* desiredTaskIput:104857600
>>> minTasks:1
>>> 2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler]
>>> impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_28
>>> [scope-2037]
>>> ....
>>>
>>> 2015-07-06 14:11:35,245 INFO [AsyncDispatcher event handler]
>>> impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for
>>> vertex_1436152736518_0210_1_39 *[scope-2162]*
>>> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler]
>>> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
>>> minFrac:0.25 maxFrac:0.75 *auto:false* desiredTaskIput:104857600
>>> minTasks:1
>>> 2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler]
>>> impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_39
>>> [scope-2162]
>>> ....
>>>
>>> 2015-07-06 14:11:35,417 INFO [AsyncDispatcher event handler]
>>> impl.VertexImpl: Setting user vertex manager plugin:
>>> org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager on vertex:*
>>> scope-2185*
>>> 2015-07-06 14:11:35,419 INFO [AsyncDispatcher event handler]
>>> vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings
>>> minFrac:0.25 maxFrac:0.75 *auto:true* desiredTaskIput:104857600
>>> minTasks:1
>>> ...
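The auto flag per vertex can be pulled out of a large application log with a short script instead of reading it by eye. What follows is a minimal sketch, not a Tez tool. It assumes the log lines are unwrapped (one entry per line, as in the raw YARN log rather than this mail) and that each "Shuffle Vertex Manager: settings" line belongs to the vertex named on the closest preceding VertexImpl "Setting ..." line, as in the excerpt above. The function name and sample list are made up for illustration.

```python
import re

# Vertex name appears either as "[scope-N]" or as "on vertex: scope-N".
VERTEX_RE = re.compile(r"\[(scope-\d+)\]|on vertex:\s*(scope-\d+)")
AUTO_RE = re.compile(r"Shuffle Vertex Manager: settings .*\bauto:(true|false)\b")


def auto_parallelism_by_vertex(lines):
    """Return {vertex_name: bool} for the auto flag of each vertex.

    Assumption: a settings line refers to the vertex named on the
    closest preceding "impl.VertexImpl: Setting ..." line.
    """
    result = {}
    current = None
    for line in lines:
        match = AUTO_RE.search(line)
        if match and current is not None:
            result[current] = match.group(1) == "true"
            current = None
            continue
        vertex = VERTEX_RE.search(line)
        if vertex and "impl.VertexImpl: Setting" in line:
            current = vertex.group(1) or vertex.group(2)
    return result


# Lines taken from the excerpt above, unwrapped and without the mail emphasis marks.
SAMPLE_LINES = [
    "2015-07-06 14:11:35,109 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for vertex_1436152736518_0210_1_28 [scope-2037]",
    "2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 auto:false desiredTaskIput:104857600 minTasks:1",
    "2015-07-06 14:11:35,123 INFO [AsyncDispatcher event handler] impl.VertexImpl: Creating 999 for vertex: vertex_1436152736518_0210_1_28 [scope-2037]",
    "2015-07-06 14:11:35,245 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting vertexManager to ShuffleVertexManager for vertex_1436152736518_0210_1_39 [scope-2162]",
    "2015-07-06 14:11:35,257 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 auto:false desiredTaskIput:104857600 minTasks:1",
    "2015-07-06 14:11:35,417 INFO [AsyncDispatcher event handler] impl.VertexImpl: Setting user vertex manager plugin: org.apache.tez.dag.library.vertexmanager.ShuffleVertexManager on vertex: scope-2185",
    "2015-07-06 14:11:35,419 INFO [AsyncDispatcher event handler] vertexmanager.ShuffleVertexManager: Shuffle Vertex Manager: settings minFrac:0.25 maxFrac:0.75 auto:true desiredTaskIput:104857600 minTasks:1",
]

flags = auto_parallelism_by_vertex(SAMPLE_LINES)
disabled = sorted(name for name, auto in flags.items() if not auto)
print(disabled)
```

On the sample lines this lists scope-2037 and scope-2162 as the vertices with auto set to false.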
>>>
>>>
>>> On Tue, Jul 7, 2015 at 12:47 PM, Rajesh Balamohan <
>>> [email protected]> wrote:
>>>
>>>>
>>>> Attaching the DAG and the swimlane for the job.
>>>>
>>>> scope-2052, which had to feed data to other vertices, slowed down
>>>> (~150-180 seconds) due to multiple spills and NumberFormatExceptions in
>>>> the data. You might want to try setting
>>>> "tez.task.scale.memory.additional-reservation.fraction.max='PARTITIONED_UNSORTED_OUTPUT:12,UNSORTED_INPUT:1,UNSORTED_OUTPUT:12,SORTED_OUTPUT:12,SORTED_MERGED_INPUT:1,PROCESSOR:1,OTHER:1'
>>>> " to allocate more memory for unordered outputs.
>>>> Following are the details for this scope.
>>>> - attempt_1436152736518_0210_1_31_000000_0,
>>>> PigLatin:dmwith1tapin.pig-0_scope-0, VertexName: scope-2052,
>>>> VertexParallelism: 1,
>>>> TaskAttemptID:attempt_1436152736518_0210_1_31_000000_0,
>>>> - numInputs=1, numOutputs=4, JVM.maxFree=734527488
>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
>>>> Informing: INPUT, scope-546, org.apache.tez.mapreduce.input.MRInput:
>>>> requested=0, allocated=0
>>>> - The small allocation of ~7 MB to unordered output led to
>>>> multiple spills.
>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
>>>> Informing: OUTPUT, scope-2117,
>>>> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:
>>>> requested=268435456, allocated=222303401
>>>> - 2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor:
>>>> Informing: OUTPUT, scope-2251,
>>>> org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput:
>>>> requested=268435456, allocated=222303401
>>>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor:
>>>> Informing: OUTPUT, scope-2063,
>>>> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput:
>>>> requested=104857600, allocated=7236438
>>>> - 2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor:
>>>> Informing: OUTPUT, scope-2068,
>>>> org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput:
>>>> requested=104857600, allocated=7236438
>>>>  - A large number of records hit NumberFormatException, which
>>>> produced a large volume of logs. This dragged out the runtime of this
>>>> task.
>>>>    e.g "mapReduceLayer.PigHadoopLogger:
>>>> java.lang.Class(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to
>>>> interpret value  in field being converted to long, caught
>>>> NumberFormatException <empty String> field discarded"
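The requested-versus-allocated numbers in the MemoryDistributor lines above can be turned into fractions to see which outputs were starved. This is a minimal sketch that assumes unwrapped log lines in the format shown above. The 0.5 threshold and the function name are arbitrary choices for illustration, not Tez settings.

```python
import re

ALLOC_RE = re.compile(
    r"Informing: (\w+), (scope-\d+), ([\w.]+): requested=(\d+), allocated=(\d+)"
)


def starved_components(lines, min_fraction=0.5):
    """Return (scope, class_name, fraction) for allocations below min_fraction.

    min_fraction is a hypothetical threshold chosen for this sketch.
    Entries that requested 0 bytes are skipped.
    """
    starved = []
    for line in lines:
        match = ALLOC_RE.search(line)
        if not match:
            continue
        _, scope, cls, requested, allocated = match.groups()
        requested, allocated = int(requested), int(allocated)
        if requested == 0:
            continue
        fraction = allocated / requested
        if fraction < min_fraction:
            starved.append((scope, cls.rsplit(".", 1)[-1], round(fraction, 3)))
    return starved


# Lines taken from the task log quoted above, unwrapped.
SAMPLE_LINES = [
    "2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: Informing: INPUT, scope-546, org.apache.tez.mapreduce.input.MRInput: requested=0, allocated=0",
    "2015-07-06 14:11:40,047 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2251, org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput: requested=268435456, allocated=222303401",
    "2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2063, org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: requested=104857600, allocated=7236438",
    "2015-07-06 14:11:40,048 INFO [TezChild] resources.MemoryDistributor: Informing: OUTPUT, scope-2068, org.apache.tez.runtime.library.output.UnorderedPartitionedKVOutput: requested=104857600, allocated=7236438",
]

for scope, cls, fraction in starved_components(SAMPLE_LINES):
    print(f"{scope} {cls}: got {fraction:.1%} of the requested memory")
```

Each unordered output asked for 100 MB (104857600 bytes) and received about 7% of it, while the ordered output received roughly 83% of its 256 MB request, which is consistent with the spills being on the unordered side.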
>>>>
>>>>
>>>> - scope-2037 and scope-2162 had their vertex parallelism set to "999",
>>>> affecting subsequent execution.
>>>> - VertexName: scope-2037, VertexParallelism: 999 vertex:
>>>> vertex_1436152736518_0210_1_28 finished in *410* seconds. The tasks
>>>> themselves were small, but because a large number of tasks had to be
>>>> executed in small containers (pretty much the same container was reused
>>>> for all of them), it took time.
>>>> - VertexName: scope-2162, VertexParallelism: 999 vertex:
>>>> vertex_1436152736518_0210_1_39 finished in *697* seconds. Similar
>>>> observation as for the previous vertex.
>>>> *The above 2 vertices caused the entire job to slow down.*
>>>>
>>>> "999" is set as the reducer parallelism at compile time by Pig. It is
>>>> not based on the input. I am not sure how Pig sets the parallelism at
>>>> compile time. You can try setting "pig.exec.reducers.max=50" in your case
>>>> and see if it helps. Pig folks would be in a better position to explain that.
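One way to try the suggested cap without editing the script is to pass it as a property on the command line. This is a minimal sketch that only builds the command. It assumes the `pig` launcher is on the PATH and accepts Java-style `-D` properties ahead of its other options. The script name is taken from the task log quoted earlier, and the helper name is made up for illustration.

```python
import shlex


def pig_tez_command(script, reducers_max):
    """Build an argv list that runs `script` on Tez with a reducer cap."""
    if reducers_max < 1:
        raise ValueError("reducers_max must be at least 1")
    return [
        "pig",
        # Assumption: -D properties are placed before the other options.
        f"-Dpig.exec.reducers.max={reducers_max}",
        "-x", "tez",
        script,
    ]


cmd = pig_tez_command("dmwith1tapin.pig", 50)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where Pig is installed.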
>>>>
>>>>
>>>> On Tue, Jul 7, 2015 at 11:22 AM, Sachin Sabbarwal <
>>>> [email protected]> wrote:
>>>>
>>>>> 
>>>>>  logs.gz
>>>>> <https://drive.google.com/file/d/0B-RFcYxUIHzzUVJpRzVDZXB5TUk/view?usp=drive_web>
>>>>> Hi Rajesh
>>>>>
>>>>> PFA the gzipped logs.
>>>>> FYI, it's a single file; when you gunzip it, it will be around 1.5 GB
>>>>> in size.
>>>>> One more thing which you might find useful:
>>>>>
>>>>> In the dmOutputTez file I could see the following line, which suggests
>>>>> that Tez created a total of 7660 tasks. This is surprising, as my data is
>>>>> only a few MB (10-15 MB max). How is this number of tasks decided? Is
>>>>> there any property to tune it?
>>>>>
>>>>> 2015-07-07 05:37:02,647 [Timer-0] INFO
>>>>>  org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG
>>>>> Status: status=RUNNING, progress=TotalTasks: 7660 Succeeded: 0
>>>>> Running: 0 Failed: 0 Killed: 0, diagnostics=
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Mon, Jul 6, 2015 at 8:34 PM, Rajesh Balamohan <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> yarn logs -applicationId application_1436152736518_0210
>>>>>>
>>>>>> You can possibly send the output to a log file, gzip it and post it.
>>>>>>
>>>>>> ~Rajesh.B
>>>>>>
>>>>>> On Mon, Jul 6, 2015 at 8:12 PM, Sachin Sabbarwal <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>> Thanks for the reply. My tez-site.xml contains only the following:
>>>>>>>
>>>>>>> <configuration>
>>>>>>> <property>
>>>>>>>   <name>tez.lib.uris</name>
>>>>>>>   <value>${fs.defaultFS}/apps/tez-0.5/tez-0.5.3.tar.gz,
>>>>>>> ${fs.defaultFS}/apps/tez-0.5/*,${fs.defaultFS}/apps/tez-0.5/lib/*</value>
>>>>>>> </property>
>>>>>>> </configuration>
>>>>>>>
>>>>>>> PFA the application logs. Here is the version information:
>>>>>>> 1. Hadoop version: Hadoop 2.5.0-cdh5.3.1
>>>>>>> 2. Pig: Apache Pig version 0.14.0 (r1640057)
>>>>>>> 3. Tez: 0.5.3
>>>>>>>
>>>>>>> Lemme know if anything else is needed.
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 7:07 PM, Rajesh Balamohan <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Can you post the application logs, tez-site.xml and also the
>>>>>>>> version details?
>>>>>>>>
>>>>>>>> ~Rajesh.B
>>>>>>>>
>>>>>>>> On Mon, Jul 6, 2015 at 6:38 PM, Sachin Sabbarwal <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>> From: Sachin Sabbarwal <[email protected]>
>>>>>>>>> Date: Mon, Jul 6, 2015 at 5:34 PM
>>>>>>>>> Subject: Same pig script running slower with Tez as compared with
>>>>>>>>> run in Mapred mode
>>>>>>>>> To: [email protected]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hello Guys
>>>>>>>>> I'm trying Apache Tez.
>>>>>>>>> I've set up Pig to run in Tez mode.
>>>>>>>>> I'm running a Pig script against i) no data and ii) some data.
>>>>>>>>> In case i), when I run Pig in Tez mode, my Pig script
>>>>>>>>> completes in ~40 secs, whereas with mapred it takes
>>>>>>>>> around 7-8 mins.
>>>>>>>>> In case ii), when run with Pig on Tez, the same Pig script takes
>>>>>>>>> around 14-15 mins, but with mapred it takes around 10 mins.
>>>>>>>>> When I run the same Pig script with production data (which is
>>>>>>>>> much more than the data I used for cases i) and ii)), the job takes
>>>>>>>>> hours to complete.
>>>>>>>>> Hence I'm trying Tez to run my Pig job faster. I'm not
>>>>>>>>> really sure what I might be missing here. Please help, and ask for
>>>>>>>>> any further info if required.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> --
>>>>>>>>> Sachin Sabbarwal
>>>>>>>>> Linkedin:
>>>>>>>>> https://www.linkedin.com/profile?viewProfile=&key=95777265
>>>>>>>>> Facebook: facebook.com/sachinsabbarwal
>>>>>>>>> Quora: http://www.quora.com/Sachin-Sabbarwal
>>>>>>>>> Blog: http://sachinsabbarwal.tumblr.com/



-- 
Sachin Sabbarwal
Linkedin: https://www.linkedin.com/profile?viewProfile=&key=95777265
Facebook: facebook.com/sachinsabbarwal
Quora: http://www.quora.com/Sachin-Sabbarwal
Blog: http://sachinsabbarwal.tumblr.com/
