When I last checked, Torque's DRMAA binding was still a pre-alpha release and not all features were implemented, as its README notes.
I suspect casperlabs.sourceforge.net has the only Torque DRMAA binding. However, given the way this is
handled in JobMonarch, it would be trivial to support any DRM that has corresponding DRMAA support.
Once the DRMAA implementations for other DRMs are complete, we can make JobMonarch support each of them by passing an option via the config file, such as BATCH_API=DRMAA-sge or BATCH_API=DRMAA-Torque.
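To make the idea concrete, here is a rough sketch of how such a BATCH_API value could be read and dispatched. This is not the JobMonarch code: the function names, the KEY=VALUE parsing, and the "pbs" default are all assumptions made for illustration only.

```python
# Hypothetical sketch; none of these names come from the JobMonarch source.

def read_batch_api(config_text, default="pbs"):
    """Return the BATCH_API value from KEY=VALUE config text.

    The "pbs" default is an assumption for this sketch.
    """
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "BATCH_API":
            return value.strip()
    return default


def parse_batch_api(value):
    """Split a value such as 'DRMAA-sge' into (api, drm), both lowercased."""
    api, _, drm = value.strip().partition("-")
    return api.lower(), (drm.lower() or None)


def select_backend(value):
    """Pick the interface to use: the pbs interface or DRMAA."""
    api, drm = parse_batch_api(value)
    if api == "drmaa":
        # A real implementation would open a DRMAA session for this DRM here.
        return ("drmaa", drm)
    if api == "pbs":
        return ("pbs", None)
    raise ValueError("unsupported BATCH_API value: %r" % value)


if __name__ == "__main__":
    sample = "# monitoring config\nBATCH_API = DRMAA-sge\n"
    print(select_backend(read_batch_api(sample)))
```

With this shape, supporting a new DRM would only mean accepting another suffix after "DRMAA-"; the rest of the monitoring code would not need to change.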
Regards,
Babu
On 6/26/06, Bernard Li <[EMAIL PROTECTED]> wrote:
Sounds good. I was just wondering what the progress with TORQUE's DRMAA is - because I didn't see that mentioned in the report.
Thanks,
Bernard
Subject: Re: [Oscar-devel] Mid-term Progress Update for SoC 2006 Project - HPCMetrics in OSCAR
Hi Babu,
thanks for the detailed report. Progress looks good and I am looking forward to
the first JobMonarch checkins.
To Bernard's question: Babu once explained that Torque's DRMAA support is not
complete. Once it is, JobMonarch can switch over to DRMAA usage for
Torque, too.
Best regards,
Erich
On Sunday 25 June 2006 19:25, Babu Sundaram wrote:
> Hi All:
>
> Please find below a mid-term update on my SoC work so far. Let me know if
> you guys have any comments/suggestions.
>
> Mid-term Progress Update for SoC 2006 - HPCMetrics in OSCAR
> ================================================
> Summary of work accomplished so far:
>
> 1, New addons for Ganglia with libe, authd and gexec
> 2, Modified Ganglia-OSCAR package with gexec support
> 3, DRMAA-Python OSCAR package
> 4, Modified implementation of JobMonarch to facilitate integration of SGE
> via DRMAA
>
> The latest code and the SRPMs and binary RPMs (for FC4-i386 and FC5-i386)
> are available at OSCAR repository under
> .../oscar-soc/soc-2006/hpcmetrics
>
> Note: The JobMonarch code is not on the SVN yet.
>
> Weekly tasks:
>
> Week 1: May 24th - May 31st
> - Completed Ganglia compilation with gexec
> - Building libe, authd, gexec
> There were some problems getting the correct versions of the above
> that work correctly with latest Ganglia 3.0.x
> - Identified the correct versions of the components above for building
> - Wrote correct spec files for libe(0.3.0), authd(0.2.2) and gexec(0.3.6)
> - Successfully built the RPMs and SRPMs on FC4-i386
> - There were some portions with gexec implementation that were using old
> Ganglia 2.x
>
> Week 2: Jun 1st - 7th
> - Implemented patches to gexec-0.3.6 so it built correctly with Ganglia
> 3.0.x
> Modified the paths to header files
> Added the requirement for ganglia-devel and libe >= 0.3.0
> Added the linking to expat library
> - Created the updated spec file for gexec
> - Got SVN access to OSCAR repository; Created hpcmetrics dir for the SoC
> code
> - Completed a test bed setup in UH using FC5 on i386 with OSCAR 5.0 from
> trunk
> - Rebuilt all the RPMs for FC5
>
> Week 3: Jun 8th - 15th
> - Made changes to Ganglia's spec file - to allow gexec support
> --enable-gexec as part of configure phase in ganglia build
> - Tested the modified Ganglia package on OSCAR cluster on Master node
> - Brushed up on my Python knowledge to start work with JobMonarch
> - Read up on DRMAA, obtained some familiarity with DRMAA-Python
> implementation
>
> Week 4: Jun 15th - 22nd
> - Built DRMAA Python on FC5-i386 with SGE's C bindings as the DRM
> - Created DRMAA python spec file for building RPMs; Requires DRMAA
> - Modified SGE-OSCAR package spec so it provides DRMAA that is required by
> DRMAA-Python
> - Created RPMs, SRPM for DRMAA-Python-0.2
> - Preliminary tests to monitor SGE jobs via DRMAA API
> - OSCAR Package for DRMAA-Python
> - Renamed authd RPMS to gexec-authd to avoid conflict with RFC 1413 identd
> daemon (Also called authd)
> Otherwise, the identd daemon RPM was installed instead of authd prior to
> gexec
>
> Week 5: Jun 23rd - today
> - Changes to JobMonarch implementation were requested from Ramon
> An 'if' test is added to check whether to use pbs interface or DRMAA's
> - Support was added to express the Interface needed as part of Monarch's
> config file
> - BATCH_API option; When set to DRMAA it will use the Python binding (onto
> SGE's C binding)
> - some unexpected delays this week
> A few servers were compromised by external access at my department in Univ
> of Houston
> DRMAA API issues - Could submit jobs to OSCAR-SGE; But wait() call on SGE
> jobs fails due to ValueError
> Need to clarify with SGE developers
>
> =============================
> *** Some issues currently ***
> - Having some trouble in network booting the client nodes in OSCAR cluster
> So testing of client side install of gexec and DRMAA remains; Hopefully
> should be resolved this coming week
> - Cannot access the testbed within the Computer Science @ UH due to complete
> rebuild of systems; Should be ready sometime this week
> =============================
>
> Plan for the next 3 weeks:
> Week 6: Jun 26th - Jul 3rd
> DRMAA-Monarch integration completion
> Refine the gexec and authd package and test them
> - need to add a few additions to post_install scripts
> for RSA key setup via authd on master node and copy it
> out to client nodes for transparent gexec
>
> Weeks 7 & 8: Jul 4th - 19th
> Obtain sensord from Erich and start working on it for adding timing control
> support
> Identifying mechanisms to collect DRM job breakdown, network and disk
> statistics
> Integrate these tools with sensord (either as separate programs or as addons
> to Monarch)
> - needs to be discussed with Erich
> JobMonarch package for OSCAR
>
> And the last 4 weeks would be spent for extensions to Ganglia interface (3
> weeks) and
> documentation of the work in Summer
>
> Regards,
> Babu
>
> Note: My access to [EMAIL PROTECTED] is temporarily unavailable. Once we get
> our access to the servers restored, I will post the weekly updates on my
> webpage once a week so you all can take a look when you get the time.
> Thanks.
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
