Rayson,
I reviewed my notes and don't see any changes needed in the GE source
itself to integrate it with Hadoop.
To build the Hadoop module and include it in the distribution
binary, you need to adjust the following two things:
1. Modify $GE_SOURCE/libs/herd/nbproject/project.properties to match your
Hadoop installation.
2. $GE_SOURCE/dist/hadoop, JSV.jar and herd.jar are not automatically
pulled into the GE distribution binary. To add them, add hadoop,
lib/JSV.jar and lib/herd.jar to the common packages in the mk_dist script:
COMMON_PACKAGE="3rd_party catman ckpt dtrace examples/jobs \
examples/drmaa hadoop include inst_sge install_qmaster \
install_execd start_gui_installer \
man mpi pvm qmon util lib/drmaa.jar lib/juti.jar \
lib/jgdi.jar lib/JSV.jar lib/herd.jar"
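
Before running mk_dist, it may be worth confirming that the three extra
entries actually exist. A minimal sketch, assuming a hypothetical helper
named check_dist_entries (it is not part of Grid Engine) and assuming the
entries are resolved relative to whatever base directory mk_dist packages
from:

```shell
# Hypothetical helper, not part of Grid Engine or mk_dist.
# Usage: check_dist_entries BASE_DIR ENTRY...
# Prints each missing entry to stderr and returns 1 if any is missing,
# 0 if all entries exist under BASE_DIR.
check_dist_entries() {
    dir=$1
    shift
    missing=0
    for entry in "$@"; do
        # -e accepts both files (the jars) and directories (hadoop)
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $dir/$entry" >&2
            missing=1
        fi
    done
    return $missing
}
```

For example: check_dist_entries /path/to/base hadoop lib/JSV.jar lib/herd.jar,
where /path/to/base is a placeholder for your own build output directory.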
HTH,
- Chansup
On Tue, Mar 6, 2012 at 1:09 PM, Rayson Ho <[email protected]> wrote:
> Chansup,
>
> Did you need to change anything in GE2011.11 to integrate it with
> Hadoop? I am finishing up GE2011.11 patch 1 (i.e. GE2011.11 update-0
> patch-1), so if the changes are small and isolated, I can quickly
> integrate them into the patch 1 release; otherwise I will push them
> into patch 2 and GE2011.11 u1.
>
>
> Todd,
>
> The SGE-Hadoop integration uses Grid Engine as the job scheduler for
> Hadoop jobs, and the integration has the Herd JSV & load sensor that
> talk to HDFS to request & report data locality. There was a big API
> change in Hadoop 0.20.x for the Hadoop 1.0 release. I recall someone
> contributed a small patch that fixed things related to Hadoop, and
> that part is in GE 2011.11 already, but I don't recall changing any of
> the Java code in the GE2011.11 release for Hadoop.
>
> However, to be honest, using the SGE-Hadoop integration means that you
> need to give up the Hadoop job scheduler, and thus to get the full
> functionality of a normal Hadoop cluster, Grid Engine needs to
> implement all the features of the scheduler in Hadoop. For example,
> the Hadoop scheduler supports "Speculative Execution" and Grid Engine
> does not have it.
>
> Rayson
>
>
>
> On Tue, Mar 6, 2012 at 12:53 PM, CB <[email protected]> wrote:
> > Hi Todd,
> >
> > I have implemented a Hadoop (version 0.20.2) integration with the
> > OGE2011.11 release based on Dan T's work as described in the link below.
> > We are experimenting with it on a development cluster for internal projects.
> >
> > Dan T's Hadoop module was built against the Hadoop 0.20.x release, so it
> > will require some changes in order to work with the latest Hadoop 1.x
> > release. This is on my to-do list. :-)
> >
> > Regards,
> > - Chansup
> >
> >
> > On Tue, Mar 6, 2012 at 12:21 PM, Heywood, Todd <[email protected]> wrote:
> >>
> >> Yes. There also used to be something similar called Hadoop-on-Demand.
> >>
> >> But the idea is to schedule jobs to a persistent HDFS, sending jobs to
> >> where the data is, as opposed to setting up and tearing down HDFS for
> >> every job.
> >>
> >> I probably should have given this as background:
> >>
> >> https://blogs.oracle.com/templedf/entry/beta_testing_the_sun_grid
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: "Hung-Sheng Tsao (LaoTsao) Ph.D" <[email protected]>
> >> Date: Tue, 6 Mar 2012 12:12:06 -0500
> >> To: Todd Heywood <[email protected]>
> >> Cc: "[email protected]" <[email protected]>
> >> Subject: Re: [gridengine users] Hadoop integration
> >>
> >> >did you see this blog?
> >> >https://blogs.oracle.com/ravee/entry/creating_hadoop_pe_under_sge
> >> >
> >> >Sent from my iPad
> >> >
> >> >On Mar 6, 2012, at 11:45, "Heywood, Todd" <[email protected]> wrote:
> >> >
> >> >> Way back when SGE was still at Sun, Dan Templeton wrote an SGE-Hadoop
> >> >>integration for 6.2u5 (Sun's distribution as a value-added feature).
> >> >>
> >> >> I have been told that because of changes made to the Hadoop
> >> >>API since Oracle purchased Sun, this integration no longer works, at
> >> >>least not in the open source versions following 6.2u5.
> >> >>
> >> >> Does anyone know if this is true? Has anyone worked with this recently?
> >> >>I do see a hadoop.tar.gz at the SoGE site
> >> >>http://arc.liv.ac.uk/downloads/SGE/releases/8.0.0d/ but it looks to me
> >> >>like it is probably the 2-3 year old code from Sun (with no documentation
> >> >>since it was a value-added feature for Sun).
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Todd Heywood
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> users mailing list
> >> >> [email protected]
> >> >> https://gridengine.org/mailman/listinfo/users