Title: [Oscar-devel] Weekly update - SoC JobMonarch & Ganglia
Hi Babu:
 
When you get a chance, could you post some screenshots? That would be great.
 
Thanks!
 
Bernard


From: [EMAIL PROTECTED] on behalf of Babu Sundaram
Sent: Mon 10/07/2006 14:03
To: [email protected]
Subject: [Oscar-devel] Weekly update - SoC JobMonarch & Ganglia

Hi All:

Here is my update for the past week. I held this e-mail until I had
checked the code into the Subversion repo, which I did a few minutes ago.

So, here is the list of things accomplished:

Mainly, I completed the jobmond.py implementation and have it working with
SGE. Because of some issues with DRMAA in SGE, the current code is not 100%
DRMAA. From the SGE-devel mailing list, it appears that these will be
sorted out by the 6.0u9 release of SGE. I will keep the pure DRMAA code
in place but commented out, so we can easily enable it once DRMAA support
is available in SGE. More info about the issue and my discussion on
SGE-devel can be found at:

http://gridengine.sunsource.net/servlets/ReadMsg?list=dev&msgNo=2771
http://gridengine.sunsource.net/issues/show_bug.cgi?id=1485

Here are some of the notes about the current jobmond.py:

- It collects the SGE job info by running qstat -ext -xml and writing the
output to a file (the location of which is set in jobmond.conf)
- I have implemented an XML parser (based on SAX) for sifting through this
output and collecting the requisite information about all the jobs. (A DOM
parser might slow things down when SGE returns verbose XML for a large
number of jobs, and we don't need write access to the XML tree, so I went
with SAX.)
- Jobs with no change in their status are reported as such; new jobs are
added; jobs with changed status are updated
- This information is formed into a dictionary with the job IDs acting as
keys and the corresponding values holding each job's status and attributes

An example entry is shown below:

The key is '169'

Status=pending
JB_job_number=169
JAT_prio=0.00000
JAT_ntix=0.00000
JB_name=Sleeper
JB_owner=babu
JB_project=Unknown
JB_department=defaultdepartment
state=qw
tickets=0
JB_override_tickets=0
JB_jobshare=0
otickets=0
ftickets=0
stickets=0
JAT_share=0.00000
queue_name=sample.q
slots=1

This information is collected for every job that is currently under the
control of SGE.
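
As a rough illustration of the SAX step described above (not the actual
jobmond.py code), the sketch below parses qstat-style XML into a dictionary
keyed by job ID. The names QstatHandler, parse_qstat_xml and SAMPLE_XML are
hypothetical, and the sample XML is a hand-written approximation; real
qstat -ext -xml output carries more fields and may be laid out differently.

```python
import xml.sax

# Hand-written approximation of qstat XML output (assumption, not real output).
SAMPLE_XML = """<job_info>
  <queue_info/>
  <job_info>
    <job_list state="pending">
      <JB_job_number>169</JB_job_number>
      <JB_name>Sleeper</JB_name>
      <JB_owner>babu</JB_owner>
      <state>qw</state>
      <slots>1</slots>
    </job_list>
  </job_info>
</job_info>"""


class QstatHandler(xml.sax.ContentHandler):
    """Collect one dict of fields per <job_list> element, keyed by job ID."""

    def __init__(self):
        super().__init__()
        self.jobs = {}      # job ID -> {field name: text}
        self._job = None    # fields of the <job_list> currently being read
        self._field = None  # name of the child element currently being read
        self._text = []     # text chunks seen inside that child element

    def startElement(self, name, attrs):
        if name == "job_list":
            # The state attribute (e.g. "pending") is stored as Status.
            self._job = {"Status": attrs.get("state", "unknown")}
        elif self._job is not None:
            self._field, self._text = name, []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "job_list":
            if self._job is not None and "JB_job_number" in self._job:
                self.jobs[self._job["JB_job_number"]] = self._job
            self._job = None
        elif self._job is not None and name == self._field:
            self._job[name] = "".join(self._text).strip()
            self._field = None


def parse_qstat_xml(xml_text):
    """Return {job ID: {field: value}} for every job in the XML text."""
    handler = QstatHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.jobs
```

Because the handler only keeps the fields of the job it is currently
reading, memory use stays flat however many jobs the XML lists, which is
the reason given above for preferring SAX over DOM.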

- This information is then multicast using the gmetric tool of Ganglia so
as to be available via gmond daemons.
- The above steps are repeated every BATCH_POLL_INTERVAL seconds as
indicated in jobmond.conf
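
The update and publish steps above could be sketched roughly as follows.
This is an illustration under assumptions, not the jobmond.py code: the
function names, the "job-<id>" metric name and the key=value encoding of
the metric value are all hypothetical. Only the gmetric options --name,
--value and --type are taken from the Ganglia tool itself. The command is
built but not run here.

```python
def classify_jobs(previous, current):
    """Label each current job ID as 'new', 'changed' or 'unchanged'."""
    labels = {}
    for job_id, info in current.items():
        if job_id not in previous:
            labels[job_id] = "new"
        elif previous[job_id].get("Status") != info.get("Status"):
            labels[job_id] = "changed"
        else:
            labels[job_id] = "unchanged"
    return labels


def gmetric_command(job_id, info):
    """Build (but do not run) a gmetric call carrying one job's fields."""
    # Hypothetical encoding: all fields packed into one string metric.
    value = " ".join(f"{key}={info[key]}" for key in sorted(info))
    return [
        "gmetric",
        "--name", f"job-{job_id}",  # hypothetical metric name
        "--value", value,
        "--type", "string",
    ]
```

In a daemon, each command list would be handed to subprocess.run() and the
whole cycle repeated after sleeping for BATCH_POLL_INTERVAL seconds.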

Note: When the DRMAA issue is sorted out, the job status will instead be
obtained by establishing a DRMAA session. It would look like this:

s=DRMAA.Session()
s.init()

for jobid in self.qstatparser.attribs:
    job_status[jobid] = s.getJobProgramStatus(jobid)
    ##...Code here for translating the DRMAA status code
    ##...into string such as queued active, on hold, etc.,
    ##...and update the status key in the attribs dictionary
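
The translation step left as a comment above might look like the sketch
below. The numeric codes are the job program status values from the DRMAA
1.0 C binding (drmaa.h); whether the Python binding returns these same
integers is an assumption, and the names DRMAA_STATUS_NAMES, status_name
and update_status are hypothetical.

```python
# DRMAA 1.0 job program status codes as defined in the C binding's drmaa.h.
# Assumption: the Python binding reports the same integer values.
DRMAA_STATUS_NAMES = {
    0x00: "undetermined",
    0x10: "queued active",
    0x11: "system on hold",
    0x12: "user on hold",
    0x13: "user and system on hold",
    0x20: "running",
    0x21: "system suspended",
    0x22: "user suspended",
    0x23: "user and system suspended",
    0x30: "done",
    0x40: "failed",
}


def status_name(code):
    """Translate a DRMAA status code into a readable string."""
    return DRMAA_STATUS_NAMES.get(code, "unknown")


def update_status(attribs, job_status):
    """Write the translated status into each job's Status key."""
    for jobid, code in job_status.items():
        if jobid in attribs:
            attribs[jobid]["Status"] = status_name(code)
    return attribs
```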


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel

