Thanks guys, it is good to hear that Hadoop is spreading... :-)

Regards,
Lukas

On Wed, Feb 18, 2009 at 5:24 PM, Steve Loughran <[email protected]> wrote:

> Amin Astaneh wrote:
>
>> Lukáš-
>>
>>> Hi Amin,
>>> I am not familiar with SGE, do you think you could tell me what did you
>>> get from this combination? What is the benefit of running Hadoop on SGE?
>>>
>> Sun Grid Engine is a distributed resource management platform for
>> supercomputing centers. We use it to allocate resources to a supercomputing
>> task, such as requesting 32 processors to run a particular simulation. This
>> mechanism is analogous to the scheduler on a multi-user OS. What I was able
>> to accomplish was to turn Hadoop into an as-needed service. When you submit
>> a job request to run Hadoop as the documentation describes, a Hadoop cluster
>> of arbitrary size is instantiated depending on how many nodes were requested
>> by generating a cluster configuration specific to that job request. This
>> allows the Hadoop cluster to be deployed within the context of Gridengine,
>> as well as being able to coexist with other running simulations on the
>> cluster.
>>
>> To the researcher or user needing to run a mapreduce code, all they need
>> to worry about is telling Hadoop to execute it as well as determining how
>> many machines should be dedicated to the task. This benefit makes Hadoop
>> very accessible to people since they don't need to worry about configuring a
>> cluster; SGE and its helper scripts do it for them.
>>
>> As Steve Loughran accurately commented, as of now we can only run one set
>> of Hadoop slave processes per machine, due to the network binding issue.
>> That problem is mitigated by configuring SGE to spread the slaves one per
>> machine automatically to avoid failures.
>>
>
> Only the Namenode and JobTracker need hard-coded/well-known port numbers,
> the rest could all be done dynamically.
>
> One thing SGE does offer over Xen-hosted images is better performance than
> virtual machines, for both CPU and storage, as virtualised disk
> performance can be awful, and even on the latest x86 parts, there is a
> measurable hit from VM overheads.
>

--
http://blog.lukas-vlcek.com/
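
For anyone following the thread, here is a rough sketch of the idea Amin describes above: turning the node list that SGE hands to a job into a per-job Hadoop cluster configuration. This is not Amin's actual helper script. It assumes the allocation arrives as a hostfile with one "host slots queue processor-range" entry per line (the usual layout of SGE's $PE_HOSTFILE), that the first host acts as master, and that each machine runs at most one set of slave processes. The function names and the ports 9000 and 9001 are made up for illustration; fs.default.name and mapred.job.tracker are the Hadoop properties that carry the two well-known addresses Steve mentions.

```python
import xml.etree.ElementTree as ET


def parse_pe_hostfile(text):
    """Return unique host names in order of first appearance.

    Assumes each non-empty line starts with the host name, as in an
    SGE parallel-environment hostfile.
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        host = fields[0]
        # One set of slave processes per machine, so drop repeated hosts.
        if host not in hosts:
            hosts.append(host)
    return hosts


def build_cluster_config(hosts, namenode_port=9000, jobtracker_port=9001):
    """Pick a master and slaves and derive the two fixed addresses.

    The port defaults are illustrative assumptions, not values taken
    from the thread.
    """
    if not hosts:
        raise ValueError("no hosts allocated")
    master = hosts[0]
    # With a single allocated machine, it has to act as a slave too.
    slaves = hosts[1:] or [master]
    properties = {
        "fs.default.name": f"hdfs://{master}:{namenode_port}",
        "mapred.job.tracker": f"{master}:{jobtracker_port}",
    }
    return {"master": master, "slaves": slaves, "properties": properties}


def render_site_xml(properties):
    """Render the properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

A job script would read the hostfile SGE provides, call these functions, and write the results into a job-specific conf directory (the masters and slaves files plus the site XML) before starting the Hadoop daemons.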
