Hi,

on behalf of Ralph Castain, whom you may know from the Open MPI mailing list, I would like to forward this email to your attention.

-- Reuti

> I have a question for the Gridengine community, but thought I'd run it
> through you as I believe you work in that area?
>
> As you may know, I am now employed by Greenplum/EMC to work on resource
> management for Hadoop as well as MPI. The main concern frankly is that the
> current Hadoop RM (yarn) scales poorly in terms of launch and provides no
> support for MPI wireup, thus causing MPI jobs to exhibit quadratic scaling of
> startup times.
>
> The only reason for using yarn is that it has the HDFS interface required to
> determine file locality, thus allowing users to place processes network-near
> to the files they will use. I have initiated an effort here at GP to create a
> C-library for accessing HDFS to obtain that locality info, and expect to have
> it completed in the next few weeks.
>
> Armed with that capability, it would be possible to extend more capable RMs
> such as Gridengine so that users could obtain HDFS-based allocations for
> their MapReduce applications. This would allow Gridengine to support Hadoop
> operations, and make Hadoop clusters that used Gridengine as their RM be
> "multi-use".
>
> Would this be of interest to the community? I can contribute the C-lib code
> for their use under a BSD-like license structure, if that would help.
>
> Regards,
> Ralph
>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
