After searching the Hadoop mailing list again, I found this link, which tries to optimize Hadoop on Lustre by using hardlinks instead of HTTP for the map output copy ( http://search-hadoop.com/m/JkHSa17oHp12 ).
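For reference, a hardlink-based shuffle on a shared mount could look roughly like this. This is only a minimal sketch in plain Java, not Hadoop's real shuffle code; the /lustre/... paths and the per-reducer link directory are assumptions made up for illustration:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class HardlinkFetch {
        public static void main(String[] args) throws IOException {
            // The map output already sits on the shared Lustre mount, so
            // instead of pulling it over HTTP the reducer just hardlinks it
            // into its own working directory: same data, no copy, only one
            // extra directory entry. (Paths below are an assumed layout.)
            Path mapOutput = Paths.get("/lustre/job_001/map_0007/file.out");
            Path linkName  = Paths.get("/lustre/job_001/reduce_0002/map_0007.out");
            Files.createDirectories(linkName.getParent());
            Files.createLink(linkName, mapOutput); // hardlink instead of HTTP fetch
        }
    }

Hardlinks only work because source and target live on the same file system, which is exactly the situation on a shared Lustre mount.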
Any other suggestion? Thanks, all!

yours,
Ling Kun

On Thu, Feb 28, 2013 at 4:57 PM, Ling Kun <lkun.e...@gmail.com> wrote:
> Dear Arun C Murthy, Pavan Kulkarni and all,
>
> Hello! I am currently working on optimizing a Hadoop cluster based on the
> Lustre FS. According to the TeraSort benchmark, the remote map output copy
> takes a large part of the total runtime.
>
> While searching, I found your discussion from half a year ago
> ( http://search-hadoop.com/m/jj3y46KUwC1 ).
>
> I am writing to ask whether we can make each reducer directly read its
> part of every map output file based on an index file, and merge those
> parts together, instead of making each map task generate a separate
> output file for each reduce task. This way, far fewer inodes would be
> needed.
>
> @Pavan Kulkarni: you have not posted since Sep. 2012. Could you please
> kindly share some experience on how to optimize Hadoop for this kind of
> file system, like Lustre?
>
> Does anyone have similar work experience?
>
> Any comment or reply is welcome and appreciated!
>
> yours,
> Ling Kun
>
> --
> http://www.lingcc.com

--
http://www.lingcc.com
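P.S. To make the direct-read idea above concrete, here is a rough sketch of what I have in mind: each map task writes one output file plus one index file, and each reducer seeks straight to its own partition. This is plain Java, not Hadoop internals; the two-longs-per-partition (offset, length) index layout and the file.out / file.out.index pairing are only assumptions for illustration, and Hadoop's actual file.out.index format differs in detail:

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class PartitionReader {

        // Reads the (offset, length) index entry for one reduce partition.
        // Assumed layout: two 8-byte longs per partition, written in order.
        static long[] readIndexEntry(String indexPath, int partition)
                throws IOException {
            try (DataInputStream in =
                    new DataInputStream(new FileInputStream(indexPath))) {
                in.skipBytes(partition * 16); // 2 longs x 8 bytes per partition
                return new long[] { in.readLong(), in.readLong() };
            }
        }

        // Reads this reducer's slice of one map output file directly from
        // the shared mount -- no per-reducer files, no HTTP copy.
        static byte[] readPartition(String mapOutPath, String indexPath,
                int partition) throws IOException {
            long[] entry = readIndexEntry(indexPath, partition);
            byte[] buf = new byte[(int) entry[1]];
            try (RandomAccessFile f = new RandomAccessFile(mapOutPath, "r")) {
                f.seek(entry[0]); // jump straight to our partition
                f.readFully(buf);
            }
            return buf;
        }

        public static void main(String[] args) throws IOException {
            // Usage: PartitionReader <partition> <file.out> <file.out.index> ...
            int myPartition = Integer.parseInt(args[0]);
            for (int i = 1; i + 1 < args.length; i += 2) {
                byte[] data = readPartition(args[i], args[i + 1], myPartition);
                System.out.printf("map output %s: %d bytes for partition %d%n",
                        args[i], data.length, myPartition);
                // ...feed 'data' into the merge phase here...
            }
        }
    }

With this scheme, a job with M maps and R reduces needs only 2*M map-side files instead of M*R, which is where the inode saving mentioned above comes from.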