Hi,

On behalf of Ralph Castain, whom you may know from the Open MPI mailing list,
I am forwarding this email to your attention.

-- Reuti

> I have a question for the Gridengine community, but thought I'd run it 
> through you as I believe you work in that area?
> 
> As you may know, I am now employed by Greenplum/EMC to work on resource
> management for Hadoop as well as MPI. The main concern, frankly, is that the
> current Hadoop RM (YARN) scales poorly in terms of launch and provides no
> support for MPI wireup, thus causing MPI jobs to exhibit quadratic scaling of
> startup times.
> 
> The only reason for using YARN is that it has the HDFS interface required to
> determine file locality, thus allowing users to place processes network-near
> to the files they will use. I have initiated an effort here at GP to create a
> C library for accessing HDFS to obtain that locality info, and expect to have
> it completed in the next few weeks.
> 
> Armed with that capability, it would be possible to extend more capable RMs 
> such as Gridengine so that users could obtain HDFS-based allocations for 
> their MapReduce applications. This would allow Gridengine to support Hadoop
> operations, and make Hadoop clusters that use Gridengine as their RM
> "multi-use".
> 
> Would this be of interest to the community? I can contribute the C-lib code 
> for their use under a BSD-like license structure, if that would help.
> 
> Regards,
> Ralph
> 
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
