Bikas,

Yes, this feature is similar to the Dryad approach. The design of the current prototype is a bit hacky, to minimize implementation cost; the only modified file is MapTask.java. The processing flow is as follows:
1. After the mapper function finishes, move its output files to a temporary directory.
2. A leader among the containers runs the combiner every 3 minutes if there are files under the temporary directory. The leader is elected by a file lock.

I know that this design breaks the fault tolerance of MapReduce and isn't acceptable for Hadoop. I'll revise the design and document it.

> That is useful to reduce the input arity (as you suggest) and also helps
> with reducing the chance of seeing failures.

What does the latter mean? Could you explain with an example?

Thank you for your comment,
Tsuyoshi OZAWA

On Wed, Aug 1, 2012 at 2:32 AM, Bikas Saha <[email protected]> wrote:
> Can you please share a brief note on the design. Just a few sentences on
> the main changes.
>
> What you are saying sounds similar to multi-level aggregation done in the
> Dryad <http://www.cs.cmu.edu/~./15712/papers/isard07.pdf> runtime. That is
> useful to reduce the input arity (as you suggest) and also helps with
> reducing the chance of seeing failures.
>
> Bikas
>
> -----Original Message-----
> From: Tsuyoshi OZAWA [mailto:[email protected]]
> Sent: Monday, July 30, 2012 6:11 PM
> To: [email protected]
> Subject: Multi-level aggregation with combining the result of maps per
> node/rack
>
> Hi,
>
> We consider the shuffle cost a main concern in MapReduce, in particular
> for aggregation processing.
> The shuffle cost is expensive in Hadoop despite the existence of the
> combiner, because the scope of combining is limited to a single MapTask.
>
> To solve this problem, I've implemented a prototype that combines the
> results of multiple maps per node [1].
> This is the first step toward making Hadoop faster with a multi-level
> aggregation technique like Google Dremel [2].
>
> I took a benchmark with the prototype.
> We used a WordCount program with in-mapper combining optimization as the
> benchmark. The benchmark was run on 40 nodes [3].
> The input data sets are 300GB, 500GB, 1TB, and 2TB of text generated by
> the default RandomTextWriter. The number of reducers is configured as 1,
> on the assumption that some workloads force 1 reducer, as in Google
> Dremel. The results are as follows:
>
>                          | 300GB | 500GB | 1TB   | 2TB   |
> Normal (sec)             | 4004  | 5551  | 12177 | 27608 |
> Combining per node (sec) | 3678  | 3844  | 7440  | 15591 |
>
> Note that a MapTask runs the combiner per node every 3 minutes in the
> current prototype, so the aggregation rate is very limited.
>
> "Normal" is the result of current Hadoop, and "Combining per node" is the
> result with my optimization. Despite the 3-minute restriction, the
> prototype is 1.7 times faster than normal Hadoop in the 2TB case. Another
> benchmark also shows that the shuffle cost is cut by 50%.
>
> I want to know from you guys: do you think this is a useful feature?
> If yes, I will work on contributing it.
> You are also welcome to tell me which benchmarks you want me to run with
> my prototype.
>
> Regards,
> Tsuyoshi
>
> [1] The idea is also described in the Hadoop wiki:
> http://wiki.apache.org/hadoop/HadoopResearchProjects
> [2] The Dremel paper is available at:
> http://research.google.com/pubs/pub36632.html
> [3] The specification of each node is as follows:
> CPU: Core(TM)2 Duo E7400 2.80GHz x 2
> Memory: 8 GB
> Network: 1 GbE

--
OZAWA Tsuyoshi
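(For readers following along: the leader-election-by-file-lock step described at the top of the thread can be sketched roughly as below. This is a minimal standalone illustration, not code from the actual MapTask.java patch; the class name, lock-file path, and method names are all hypothetical.)

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Sketch: per-node leader election by file lock. Each map task tries to
// grab a non-blocking lock on a well-known file; the winner becomes the
// node's leader and would periodically (e.g. every 3 minutes) run the
// combiner over the map outputs parked in the temporary directory.
public class NodeCombinerLeaderSketch {
    // Hypothetical per-node lock file location.
    static final String LOCK_FILE =
            System.getProperty("java.io.tmpdir") + File.separator + "combiner.lock";

    // Returns a held FileLock if this process became the leader, or null
    // if another task on the node already holds the lock.
    static FileLock tryBecomeLeader() throws Exception {
        RandomAccessFile raf = new RandomAccessFile(new File(LOCK_FILE), "rw");
        return raf.getChannel().tryLock(); // non-blocking; null means we lost
    }

    public static void main(String[] args) throws Exception {
        FileLock lock = tryBecomeLeader();
        if (lock != null) {
            System.out.println("leader");
            // A real leader would now loop: scan the temporary directory,
            // run the combiner over any waiting map outputs, sleep 3 minutes.
            lock.release();
        } else {
            System.out.println("follower");
        }
    }
}
```

With a single process on the node, the lock is always won, so this prints "leader"; a second concurrent task on the same node would see null from tryLock() and print "follower".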
