Title: Re: [Oscar-devel] job status
Ganglia generates the graphs from the Round-Robin Database files (in /var/lib/ganglia/rrds).  Those PHP Notices should just be "warnings" - I know a lot of them have been fixed in newer versions of Ganglia.
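
One quick check on the RRD side is whether the files under that directory are still being written. The following is only a rough sketch: the function name stale_rrds and the 300-second threshold are made up for illustration, and it assumes the metrics are stored as *.rrd files below the path above. It only shows whether updates have stopped recently; gaps further back in the graph would have to be inspected with something like rrdtool fetch.

```python
import os
import time


def stale_rrds(root, max_age_seconds, now=None):
    """Return sorted (path, age_in_seconds) pairs for .rrd files under root
    whose last modification is older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".rrd"):
                continue  # ignore anything that is not a round-robin database
            path = os.path.join(dirpath, name)
            age = now - os.path.getmtime(path)
            if age > max_age_seconds:
                stale.append((path, age))
    return sorted(stale)


if __name__ == "__main__":
    # Path taken from the message above; the threshold is an arbitrary choice.
    for path, age in stale_rrds("/var/lib/ganglia/rrds", 300):
        print(f"{age:8.0f}s  {path}")
```

If every file for one node shows up as stale, the problem is more likely on the collection side than in the PHP frontend.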
 
I guess perhaps you can see if you get the same problems under OSCAR5.
 
BTW, I would strongly recommend that you re-install CentOS on the headnode before trying the nightly tarball - currently multiple versions don't play nice with each other.  Oh and perhaps use CentOS 4.3 while you're at it ;-)
 
Thanks,
 
Bernard


From: John Meskes [mailto:[EMAIL PROTECTED]
Sent: Wed 07/06/2006 20:04
To: Bernard Li
Cc: [email protected]
Subject: Re: [Oscar-devel] job status

> Perhaps you can post the relevant bits from your mom log.

The job ran for 1 minute. I did a qdel the following day.

node: /var/spool/pbs/mom_logs/20060606
06/06/2006 17:00:29;0001;   pbs_mom;Job;TMomFinalizeJob3;job 49.master started, pid = 20628
06/06/2006 17:00:29;0008;   pbs_mom;Job;49.master;Job Modified at request of [EMAIL PROTECTED]
06/06/2006 17:01:32;0008;   pbs_mom;Job;49.master;Terminated
06/06/2006 17:01:32;0001;   pbs_mom;Job;49.master;server rejected job obit - 15008

master: /var/spool/pbs/server_logs/pbs_server.log
06/06/2006 17:00:27;0008;PBS_Server;Job;49.master;Job Queued at request of [EMAIL PROTECTED], owner = [EMAIL PROTECTED], job name = STDIN, queue = parallel
06/06/2006 17:00:27;0040;PBS_Server;Svr;master;Scheduler sent command new
06/06/2006 17:00:29;0008;PBS_Server;Job;49.master;Job Modified at request of [EMAIL PROTECTED]
06/06/2006 17:00:29;0008;PBS_Server;Job;49.master;Job Run at request of [EMAIL PROTECTED]
06/06/2006 17:00:29;0008;PBS_Server;Job;49.master;Job Modified at request of [EMAIL PROTECTED]
06/06/2006 17:01:27;0040;PBS_Server;Svr;master;Scheduler sent command time
06/06/2006 17:02:27;0040;PBS_Server;Svr;master;Scheduler sent command time
06/06/2006 17:03:27;0040;PBS_Server;Svr;master;Scheduler sent command time
. . .
06/07/2006 15:48:55;0008;PBS_Server;Job;49.master;Job deleted at request of [EMAIL PROTECTED]
06/07/2006 15:48:55;0008;PBS_Server;Job;49.master;Job sent signal SIGTERM on delete
06/07/2006 15:48:55;0008;PBS_Server;Job;49.master;MOM rejected signal during delete
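
To line up the mom and server logs for a single job, the entries can be split on their semicolons and merged by timestamp. This is a minimal sketch based only on the six-field layout visible in the excerpts above; the field names (time, code, daemon, type, id, message) and both function names are labels chosen here, not anything defined by PBS.

```python
from datetime import datetime


def parse_pbs_line(line):
    """Split one log line into its six semicolon-separated fields."""
    stamp, code, daemon, kind, ident, message = line.rstrip("\n").split(";", 5)
    return {
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "code": code,
        "daemon": daemon.strip(),  # the mom log pads this field with spaces
        "type": kind,
        "id": ident,
        "message": message,
    }


def job_events(lines, job_id):
    """Return the Job records that mention job_id, ordered by time."""
    events = []
    for line in lines:
        if line.count(";") < 5:
            continue  # skip blank lines and elided sections such as ". . ."
        rec = parse_pbs_line(line)
        if rec["type"] == "Job" and (rec["id"] == job_id or job_id in rec["message"]):
            events.append(rec)
    return sorted(events, key=lambda r: r["time"])


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python pbs_events.py 49.master < combined.log
    for rec in job_events(sys.stdin, sys.argv[1]):
        print(rec["time"], rec["daemon"], rec["message"])
```

Run over the mom excerpt, this puts the "started" and "Terminated" entries 63 seconds apart, immediately followed by the rejected obit.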


> The gaps in the Ganglia graph look really strange - any errors in the http error logs?

Only PHP "Notice" reports. The graphs are generated, but some of the data is missing. Would you know if the data is coming from a database or from a log file?

/var/log/httpd/error_log
[client 65.x.x.x] PHP Notice:  Undefined index:  G in /var/www/html/ganglia/get_context.php on line 9, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  s in /var/www/html/ganglia/get_context.php on line 13, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  cr in /var/www/html/ganglia/get_context.php on line 14, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  hc in /var/www/html/ganglia/get_context.php on line 15, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  sh in /var/www/html/ganglia/get_context.php on line 16, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  p in /var/www/html/ganglia/get_context.php on line 18, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  t in /var/www/html/ganglia/get_context.php on line 19, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  jr in /var/www/html/ganglia/get_context.php on line 21, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  js in /var/www/html/ganglia/get_context.php on line 23, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  gw in /var/www/html/ganglia/get_context.php on line 25, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  gs in /var/www/html/ganglia/get_context.php on line 27, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  g in /var/www/html/ganglia/graph.php on line 8, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  G in /var/www/html/ganglia/graph.php on line 9, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  me in /var/www/html/ganglia/graph.php on line 10, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined index:  vl in /var/www/html/ganglia/graph.php on line 15, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined variable:  command in /var/www/html/ganglia/graph.php on line 40, referer: http://142.x.x.x/ganglia/
[client 65.x.x.x] PHP Notice:  Undefined variable:  extras in /var/www/html/ganglia/graph.php on line 288, referer: http://142.x.x.x/ganglia/
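
Since the error log is long and repetitive, it can help to boil the notices down to a count per script and notice type. The sketch below assumes only the line format shown in the excerpt above; the regular expression and the function name summarise_notices are illustrative.

```python
import re
from collections import Counter

# Matches e.g. "PHP Notice:  Undefined index:  G in /path/file.php on line 9"
NOTICE = re.compile(
    r"PHP Notice:\s+(Undefined \w+):\s+(\S+) in (\S+) on line (\d+)"
)


def summarise_notices(lines):
    """Count PHP notices per (script path, notice type)."""
    counts = Counter()
    for line in lines:
        match = NOTICE.search(line)
        if match:
            kind, _name, path, _lineno = match.groups()
            counts[(path, kind)] += 1
    return counts


if __name__ == "__main__":
    # Log path taken from the message above.
    with open("/var/log/httpd/error_log") as log:
        for (path, kind), n in sorted(summarise_notices(log).items()):
            print(f"{n:5d}  {kind:20s} {path}")
```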

> You can also try updating to the newer version of Ganglia and see if that helps:

I'll try that in the morning before I destroy the cluster (4.2.1 -> 5.)

_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
