This is how we run Hadoop under Grid Engine (or, with appropriate changes, under any other scheduler):

http://www.ats.ucla.edu/clusters/hoffman2/hadoop/default.htm

Basically, either run a prolog or call a script from within the job submission script itself. The script parses the file named by $PE_HOSTFILE and generates the Hadoop *-site.xml, masters and slaves files at run time. The method does not rely on any scheduler-specific feature, so it should carry over to other schedulers once the host list is read from their equivalent of PE_HOSTFILE. If there is interest, I can post the prolog script. Thanks.
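
To illustrate the idea, here is a minimal Python sketch (not the actual prolog script mentioned above). It assumes the usual PE_HOSTFILE layout of one line per host with the host name in the first column, takes the first allocated host as the master, and writes Hadoop 1.x style masters and slaves files. The function names and the fallback "conf" directory are made up for the example; generating the *-site.xml files would follow the same pattern.

```python
import os
from pathlib import Path


def parse_pe_hostfile(text):
    """Return the unique host names found in PE_HOSTFILE content, in order.

    Assumed line layout: <host> <slots> <queue> <processor range>
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] not in hosts:
            hosts.append(fields[0])
    return hosts


def write_hadoop_host_files(hosts, conf_dir):
    """Write masters and slaves files for the given hosts into conf_dir."""
    if not hosts:
        raise ValueError("no hosts found in PE_HOSTFILE")
    conf = Path(conf_dir)
    conf.mkdir(parents=True, exist_ok=True)
    # Assumption: first host is the master; a single-host job
    # uses that same host as its only slave.
    master = hosts[0]
    workers = hosts[1:] or [master]
    (conf / "masters").write_text(master + "\n")
    (conf / "slaves").write_text("\n".join(workers) + "\n")
    return master, workers


if __name__ == "__main__":
    hostfile = os.environ.get("PE_HOSTFILE")
    if hostfile:
        # "conf" is a hypothetical fallback when HADOOP_CONF_DIR is unset.
        conf_dir = os.environ.get("HADOOP_CONF_DIR", "conf")
        hosts = parse_pe_hostfile(Path(hostfile).read_text())
        write_hadoop_host_files(hosts, conf_dir)
```

Called from a prolog or at the top of the job script, this would run before the Hadoop daemons are started, so that they pick up the hosts granted to that particular job.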

Prakashan


On 05/28/2012 06:50 AM, Rayson Ho wrote:
Vic,

If you just want to run Hadoop jobs with Grid Engine, then the
integration (mainly an HDFS monitor) written by DanT in SGE 6.2u5 will
work:

https://blogs.oracle.com/templedf/entry/welcome_sun_grid_engine_6

The on-going discussion is related to running jobs that request
dynamic allocation - it is going to be more complicated... and we have
not even defined the interface yet!

Rayson




On Mon, May 28, 2012 at 7:35 AM, Vic <[email protected]> wrote:

Hi All.

I've just re-read the thread from a few months ago about integrating SGE
with Hadoop. This might suddenly have become very useful to me!

So what sort of state is it in? Is it the sort of thing I can get my hands
dirty with (bearing in mind I'm an SGE neophyte), or will I get my fingers
burnt?

Thanks!

Vic.



_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
