Re: SLS and Tez

Hitesh Shah Fri, 14 Nov 2014 21:04:54 -0800

Thanks for the detailed answers. Most of the differences that I mentioned will 
apply when running MR in “yarn-tez” mode (except for the split handling). I 
believe the automatic update of the number of reducers should also work if 
configured to do so.


Let us know if we can help in any other way. Also, it would be great if you can 
get back to the community with any other insights that you discover as part of 
your analysis. 

thanks
— Hitesh 


On Nov 14, 2014, at 8:22 PM, Fabio <[email protected]> wrote:

> Hi Hitesh,
> 
> - "When you say run a trace of a map-reduce job against SLS+Tez, could you 
> clarify what that means?"
> 
> Well, I have some "job profiles", where each job profile gives statistics 
> about a particular map-reduce application, such as the minimum and maximum 
> Map and Reduce execution times, the number of Map tasks, and the number of 
> Reduce tasks. As part of a bigger project, some artificial traces have been 
> created from these job profiles and then provided as input to SLS (see 
> http://goo.gl/ueuXzr for a sample trace) to analyze job execution times 
> under heavy load.
> I know for sure that these profiles were made without using Tez (plain MR 
> applications). I also know that Tez can run "old" MR applications (let's say 
> the yarn wordcount example) by setting it as the default framework. Of 
> course, a rewritten MR application can take much more advantage of Tez, but 
> I guess that even just running it on top of Tez benefits at least from 
> container reuse.
> So I wanted to take those same artificial traces and give them as input to 
> SLS first (plain yarn architecture), and possibly to SLS while simulating Tez 
> to see the different behavior.
> Actually, I could see the different behavior of single jobs even on a single 
> machine, but the idea here was to launch hundreds of concurrent jobs and see 
> how Tez influences the overall execution.
> 
> - "I have not yet had a chance to look at SLS or the MR AM simulator in 
> detail. What does writing a Tez-specific simulator entail? "
> 
> I honestly have no idea, and considering both the relative complexity of the 
> already implemented class and the lack of documentation for developers, I 
> assume it's not a few minutes' job (not for me, for sure).
> I'm not even sure it's a job worth doing, since I wasn't able to find any 
> information about the reliability of SLS as a Yarn simulator. Officially, it 
> is just used to test scheduler behavior, and as far as I remember it 
> simulates neither the network nor the shuffle phase.
> 
> Best regards
> 
> Fabio
> 
> On 11/15/2014 04:20 AM, Hitesh Shah wrote:
>> Hi Fabio
>> 
>> The behavior that Tez induces on a cluster for a MapReduce-like job may be 
>> vastly different to what MapReduce does today:
>>    - Tez can do split calculation on the cluster and make use of 
>> information such as available cluster resources to decide how many tasks to 
>> run.
>>    - Tez does container reuse across tasks. Multiple factors can affect 
>> reuse: the kind of workload, where tasks run, how long each task ran, and 
>> how much locality the different tasks have in common. At this point, 
>> depending on the locality-scheduling related flags, Tez waits for a certain 
>> delay before assigning a task to a rack-local container or an off-rack 
>> container. This assumes that the RM has not provided a matching container 
>> within the timeframe, and even then a newly allocated container may not be 
>> used, as there is a penalty for launching a new JVM.
>>    - Tez also keeps containers around (configurable) so as to use them later 
>> for tasks that have not yet been scheduled or future DAGs to be run within 
>> the same Application Master.
>>    - Lastly, Tez, in the case of Hive/Pig, has vastly complex and dynamic 
>> DAGs. Unlike MapReduce, the resource needs are not completely known 
>> upfront. Resource asks may also change depending on how much data is 
>> actually being processed.
>> 
>> Had a couple of questions to ask:
>> 
>>   - When you say run a trace of a map-reduce job against SLS+Tez, could you 
>> clarify what that means?
>>   - I have not yet had a chance to look at SLS or the MR AM simulator in 
>> detail. What does writing a Tez-specific simulator entail?
>>   Let us know if you are planning to look at implementing some of the 
>> missing simulation pieces for Tez.
>> 
>> thanks
>> — Hitesh
>> 
>> 
>> On Nov 14, 2014, at 5:46 PM, Fabio <[email protected]> wrote:
>> 
>>> Thanks for the reply. Actually, what I was planning to do is to generate 
>>> artificial traces of map-reduce jobs and run them against SLS and SLS+Tez 
>>> to analyze the differences.
>>> I asked here directly since I am pretty sure that on the Hadoop mailing 
>>> list they would have told me to ask you about it, since that class is 
>>> app-specific, so I thought someone here might have already written an AM 
>>> simulator class for Tez. I doubt the SLS developers will do it any time 
>>> soon (unless Tez becomes part of the Hadoop code).
>>> Anyway, no problem. Some tests in a small cluster will do the job ;)
>>> 
>>> Thanks again
>>> 
>>> Fabio
>>> 
>>> On 11/14/2014 07:54 PM, Hitesh Shah wrote:
>>>> Hello Fabio
>>>> 
>>>> We do not have a job trace file generated by Tez and therefore no 
>>>> simulator that can re-run the trace. We do store some historical data for 
>>>> the job but the level of tooling around it is pretty minimal.
>>>> 
>>>> — Hitesh
>>>> 
>>>> On Nov 14, 2014, at 3:29 AM, Fabio <[email protected]> wrote:
>>>> 
>>>>> With SLS (Yarn Scheduler Load Simulator) I can test a MR job trace 
>>>>> against different schedulers, but to do so I see one has to specify 
>>>>> "yarn.sls.am.type.mapreduce" that is "The AMSimulator implementation for 
>>>>> MapReduce-like applications. Users can specify implementations for other 
>>>>> type of applications.". As far as I understand this class is a simulator 
>>>>> of the AM, so I suppose that if I want to execute a job trace as if it is 
>>>>> run on top of Tez, I should implement this class in order to simulate a 
>>>>> Tez AM.
>>>>> Is this correct? As of today, is there already an implementation of this?
>>>>> 
>>>>> Thanks in advance
>>>>> 
>>>>> Fabio
>> 
>
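
The trace generation mentioned in the thread (job profiles with minimum and 
maximum Map/Reduce times and task counts turned into artificial SLS input) 
could look roughly like the sketch below. It is only an illustration under 
assumptions: the profile keys (num_maps, map_min_ms, ...) and the helper name 
make_job are hypothetical, the host is a placeholder, and the JSON key names 
follow my recollection of the sample input in the SLS documentation, so they 
should be checked against the Hadoop version in use. Starting all reduces 
after the last map finishes is a simplification, not how the traces in the 
thread were built.

```python
import json
import random


def make_job(job_id, profile, start_ms, rng):
    """Turn one job profile into one trace entry (hypothetical helper)."""
    tasks = []
    phase_start = start_ms
    for kind, priority in (("map", 20), ("reduce", 10)):
        count = profile[f"num_{kind}s"]
        lo, hi = profile[f"{kind}_min_ms"], profile[f"{kind}_max_ms"]
        phase_end = phase_start
        for _ in range(count):
            # Draw a task duration between the profile's min and max.
            end = phase_start + rng.randint(lo, hi)
            tasks.append({
                "container.host": "/default-rack/node1",  # placeholder host
                "container.start.ms": phase_start,
                "container.end.ms": end,
                "container.priority": priority,
                "container.type": kind,
            })
            phase_end = max(phase_end, end)
        # Simplification: the next phase starts when this one has finished.
        phase_start = phase_end
    return {
        "am.type": "mapreduce",
        "job.start.ms": start_ms,
        "job.end.ms": phase_start,
        "job.queue.name": "default",
        "job.id": job_id,
        "job.user": "default",
        "job.tasks": tasks,
    }


profile = {"num_maps": 4, "map_min_ms": 5000, "map_max_ms": 9000,
           "num_reduces": 2, "reduce_min_ms": 3000, "reduce_max_ms": 6000}
rng = random.Random(42)  # fixed seed so the trace is reproducible
# Three jobs submitted 10 s apart; a heavy-load trace would use many more.
trace = [make_job(f"job_{i}", profile, i * 10_000, rng) for i in range(3)]
print(json.dumps(trace[0], indent=2)[:200])
```

How the entries are serialized into the file that SLS reads is left out on 
purpose; the sample trace linked in the thread is the better reference for 
that.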

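The container-reuse behavior described in the thread (wait for a delay before 
falling back from a node-local container to a rack-local or off-rack one, 
depending on flags) is the part a Tez-aware AM simulator would most need to 
model. A minimal sketch of that decision follows. Every name in it 
(can_reuse, delay_ms, allow_rack, allow_off_rack) is hypothetical and chosen 
for illustration; it is not Tez's actual scheduler code or configuration, and 
the 250 ms default is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Where a task wants to run, or where a held container lives."""
    node: str
    rack: str


def can_reuse(task, container, waited_ms, delay_ms=250,
              allow_rack=True, allow_off_rack=False):
    """Decide whether a held container may be reused for a waiting task.

    Node-local matches are taken immediately. Rack-local and off-rack
    matches are only taken once the task has waited at least delay_ms,
    and only if the corresponding (hypothetical) flag allows it.
    """
    if task.node == container.node:
        return True
    if waited_ms < delay_ms:
        return False  # keep waiting for a better-placed container
    if task.rack == container.rack:
        return allow_rack
    return allow_off_rack


task = Placement(node="node1", rack="rack1")
same_node = Placement(node="node1", rack="rack1")
same_rack = Placement(node="node2", rack="rack1")
other_rack = Placement(node="node9", rack="rack3")

print(can_reuse(task, same_node, waited_ms=0))
print(can_reuse(task, same_rack, waited_ms=100))
print(can_reuse(task, same_rack, waited_ms=300))
print(can_reuse(task, other_rack, waited_ms=300))
```

The sketch ignores the other factors named in the thread, such as the cost of 
launching a new JVM for a freshly allocated container and keeping idle 
containers around for later tasks or DAGs.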