After searching the Hadoop mailing list again, I found this link, which tries to optimize Hadoop on Lustre by using hardlinks instead of HTTP for the map output copy ( http://search-hadoop.com/m/JkHSa17oHp12 ).
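For reference, a hardlink-based shuffle on a shared mount could look roughly like this. This is only a minimal sketch in plain Java, not Hadoop's real shuffle code; the /lustre/... paths and the per-reducer link directory are assumptions made up for illustration:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class HardlinkFetch {
        public static void main(String[] args) throws IOException {
            // The map output already sits on the shared Lustre mount, so
            // instead of pulling it over HTTP the reducer just hardlinks it
            // into its own working directory: same data, no copy, only one
            // extra directory entry. (Paths below are an assumed layout.)
            Path mapOutput = Paths.get("/lustre/job_001/map_0007/file.out");
            Path linkName  = Paths.get("/lustre/job_001/reduce_0002/map_0007.out");
            Files.createDirectories(linkName.getParent());
            Files.createLink(linkName, mapOutput); // hardlink instead of HTTP fetch
        }
    }

Hardlinks only work because source and target live on the same file system, which is exactly the situation on a shared Lustre mount.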
Any other suggestion? Thanks, all!

yours,
Ling Kun

On Thu, Feb 28, 2013 at 4:57 PM, Ling Kun <lkun.e...@gmail.com> wrote:
> Dear Arun C Murthy, Pavan Kulkarni and all,
>
> Hello! I am currently working on optimizing a Hadoop cluster based on the
> Lustre FS. According to the TeraSort benchmark, the remote map output copy
> takes a large part of the total runtime.
>
> While searching, I found your discussion from half a year ago
> ( http://search-hadoop.com/m/jj3y46KUwC1 ).
>
> I am writing to ask whether we can make each reducer directly read its
> part of every map output file based on an index file, and merge those
> parts together, instead of making each map task generate a separate
> output file for each reduce task. This way, far fewer inodes would be
> needed.
>
> @Pavan Kulkarni: you have not posted since Sep. 2012. Could you please
> kindly share some experience on how to optimize Hadoop for this kind of
> file system, like Lustre?
>
> Does anyone have similar work experience?
>
> Any comment or reply is welcome and appreciated!
>
> yours,
> Ling Kun
>
> --
> http://www.lingcc.com

--
http://www.lingcc.com
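P.S. To make the direct-read idea above concrete, here is a rough sketch of what I have in mind: each map task writes one output file plus one index file, and each reducer seeks straight to its own partition. This is plain Java, not Hadoop internals; the two-longs-per-partition (offset, length) index layout and the file.out / file.out.index pairing are only assumptions for illustration, and Hadoop's actual file.out.index format differs in detail:

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class PartitionReader {

        // Reads the (offset, length) index entry for one reduce partition.
        // Assumed layout: two 8-byte longs per partition, written in order.
        static long[] readIndexEntry(String indexPath, int partition)
                throws IOException {
            try (DataInputStream in =
                    new DataInputStream(new FileInputStream(indexPath))) {
                in.skipBytes(partition * 16); // 2 longs x 8 bytes per partition
                return new long[] { in.readLong(), in.readLong() };
            }
        }

        // Reads this reducer's slice of one map output file directly from
        // the shared mount -- no per-reducer files, no HTTP copy.
        static byte[] readPartition(String mapOutPath, String indexPath,
                int partition) throws IOException {
            long[] entry = readIndexEntry(indexPath, partition);
            byte[] buf = new byte[(int) entry[1]];
            try (RandomAccessFile f = new RandomAccessFile(mapOutPath, "r")) {
                f.seek(entry[0]); // jump straight to our partition
                f.readFully(buf);
            }
            return buf;
        }

        public static void main(String[] args) throws IOException {
            // Usage: PartitionReader <partition> <file.out> <file.out.index> ...
            int myPartition = Integer.parseInt(args[0]);
            for (int i = 1; i + 1 < args.length; i += 2) {
                byte[] data = readPartition(args[i], args[i + 1], myPartition);
                System.out.printf("map output %s: %d bytes for partition %d%n",
                        args[i], data.length, myPartition);
                // ...feed 'data' into the merge phase here...
            }
        }
    }

With this scheme, a job with M maps and R reduces needs only 2*M map-side files instead of M*R, which is where the inode saving mentioned above comes from.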