Indeed, the script would not have been able to echo to my home directory, so I changed the destination of the echo to /tmp/LD. I then changed the name of the script slightly, went into qmon, and entered the new spelling in both the global configuration and the configuration of the one host I am looking at. After a few minutes there is still no sign of my echoes in /tmp, or of the script name in the output of ps -elf on that host.
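One way to separate "the script is broken" from "execd never starts it" is to drive the script by hand in the way the load sensor examples in this thread expect: an empty line on standard input requests one report, and the line quit asks the script to exit. The helper below is only a sketch. probe_sensor is a hypothetical name, not an SGE command, and it assumes the sensor answers each input line with a begin ... end block, as the quoted script does.

```shell
#!/bin/bash
# Hypothetical helper (not part of SGE): feed a load sensor one request
# followed by "quit", print what it answered, and check the framing.
probe_sensor() {
    local sensor=$1
    local report first last

    if [ ! -x "$sensor" ]; then
        echo "not executable: $sensor" >&2
        return 2
    fi

    # An empty line asks for one report; "quit" asks the sensor to exit.
    report=$(printf '\nquit\n' | "$sensor") || {
        echo "sensor exited with a non-zero status" >&2
        return 1
    }
    printf '%s\n' "$report"

    # A report is expected to start with "begin" and finish with "end".
    first=$(printf '%s\n' "$report" | head -n 1)
    last=$(printf '%s\n' "$report" | tail -n 1)
    [ "$first" = "begin" ] && [ "$last" = "end" ]
}
```

For example, `probe_sensor /path/to/sensor.sh; echo "status: $?"` prints the report and a status of 0 when the framing looks right. Running the same probe under the account that sge_execd uses on the host (which account that is depends on the installation) may also expose permission problems such as the home-directory write mentioned above.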
On Fri, Apr 20, 2012 at 9:42 AM, Reuti <[email protected]> wrote:

> On 20.04.2012 at 16:28, Earl Lazarus wrote:
>
> > Yes the load sensor is under my home directory which is visible on all
> > machines. Would it be a true statement that my load sensor should be
> > running as soon as I specify it in the host configuration? I need not
> > submit jobs that reference the load that it is measuring.
>
> If you change the definition in a local host configuration it's necessary
> to change the global configuration to distribute it to the node (just
> remove a blank somewhere). Then after 2 cycles of the load_report_time the
> process should be visible on a node:
>
> $ ps -e f
> ...
>  5081 ?        Sl   125:47 /usr/sge/bin/lx24-amd64/sge_execd
>  5147 ?        S      2:39  \_ /bin/sh /usr/sge/cluster/tmpspace.sh
>
> -- Reuti
>
> > On Thu, Apr 19, 2012 at 9:30 PM, Ron Chen <[email protected]> wrote:
> > I don't have access to a Unix machine now, so I assume the script works.
> >
> > However, it is always the execution daemons that run the load sensors, so
> > make sure the load sensor is available on all the machines.
> >
> > -Ron
> >
> > ________________________________
> > From: Earl Lazarus <[email protected]>
> > To: Rayson Ho <[email protected]>
> > Cc: [email protected]
> > Sent: Thursday, April 19, 2012 9:49 PM
> > Subject: Re: [gridengine users] Load sensors
> >
> > Here is the load sensor...it basically checks to see if a server is
> > running on the host, returning 1 if yes and 0 if no. It currently
> > contains diagnostic prints to my home directory. It runs fine from the
> > command prompt.
> >
> > When is a user provided load monitor actually run? Every time the
> > scheduler runs?
> >
> > #!/bin/bash
> > #PURPOSE SGE load monitor
> > #
> > #
> > good(){
> >     echo "begin"
> >     echo "$hst:earl_ecs_jun:1"
> >     echo "end"
> > }
> > bad(){
> >     echo "begin"
> >     echo "$hst:earl_ecs_jun:0"
> >     echo "end"
> > }
> > echo START `date` >>/home/elazarus/LD
> > hst=$(uname -n)
> > pf="PID_FILE"
> > while [ 1 ] ; do
> >     read input
> >     result=$?
> >     echo READ `date` >>/home/elazarus/LD
> >     if [ $result != 0 ] ; then
> >         exit 1
> >     fi
> >     if [ "$input" = "quit" ] ; then
> >         echo END `date` >>/home/elazarus/LD
> >         exit 0
> >     fi
> >     # --ASSERT VALID QUERY
> >     tmpname=/tmp/jaeger/0p1/EDB/ECS_JUN_SS3_SL4h
> >     if [ -d $tmpname ] ; then
> >         cd $tmpname
> >         # --EXAMINE THE PID_FILE
> >         if [ -e $pf ] ; then
> >             # --FOUND PID_FILE
> >             pid=$(cat $pf)
> >             l=$(ps h -p $pid |wc -l)
> >             if [ $l -eq 0 ] ; then
> >                 # --CANNOT FIND THE SPECIFIED PROCESS
> >                 bad
> >             else
> >                 # --IT'S RUNNING!!
> >                 good
> >             fi
> >         else
> >             # --NO PID_FILE
> >             bad
> >         fi
> >     else
> >         # --NO SERVER DIRECTORY
> >         bad
> >     fi
> > done
> >
> > On Thu, Apr 19, 2012 at 7:18 PM, Rayson Ho <[email protected]> wrote:
> >
> > > Can you post your load sensor, or at least the main structure of your
> > > load sensor script??
> > >
> > > If you run the script interactively, what do you get??
> > >
> > > Rayson
> > >
> > > On Thu, Apr 19, 2012 at 8:14 PM, Earl Lazarus <[email protected]> wrote:
> > >> I followed all of those directions...it just doesn't run. Permissions
> > >> are 777.
> > >> I put an "echo START `date` >>/home/<myid>/LD"
> > >>
> > >> The file is always empty.
> > >>
> > >> On Thu, Apr 19, 2012 at 12:37 PM, Rayson Ho <[email protected]>
> > >> wrote:
> > >>>
> > >>> There is not a lot of actual "REQUIREMENTS" for a load sensor. As long
> > >>> as it prints the proper values to standard output, then it is good
> > >>> enough in most cases.
> > >>>
> > >>> You can get more detail from Oracle's doc:
> > >>>
> > >>> http://docs.oracle.com/cd/E24901_01/doc.62/e21978/configuration.htm#sthref182
> > >>>
> > >>> Rayson
> > >>>
> > >>> On Thu, Apr 19, 2012 at 1:31 PM, Earl Lazarus <[email protected]>
> > >>> wrote:
> > >>> > Based upon earlier postings, it looks like a load sensor will solve my
> > >>> > problem. Others have pointed to the following link (which contains an
> > >>> > example of a load sensor script).
> > >>> >
> > >>> > http://gridscheduler.sourceforge.net/howto/loadsensor.html
> > >>> >
> > >>> > The example script at this site contains a "read" statement and seems to
> > >>> > communicate with SGE via "echo". Is there someplace where I can
> > >>> > find the actual REQUIREMENTS for a load sensor script instead of
> > >>> > having to reverse engineer the requirements from an example?
> > >>> >
> > >>> > _______________________________________________
> > >>> > users mailing list
> > >>> > [email protected]
> > >>> > https://gridengine.org/mailman/listinfo/users
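
For reference, the requirements the quoted messages circle around can be condensed into a short skeleton: wait for a line on standard input, answer it with begin, one host:name:value line per value, and end, and stop when the line is quit. This is a sketch under assumptions, not the documented specification. The complex name earl_ecs_jun and the PID_FILE layout come from the quoted script, while LOG, SERVER_DIR, and the function names are illustrative, and the log goes to /tmp rather than a home directory.

```shell
#!/bin/bash
# Minimal load sensor sketch. LOG, SERVER_DIR and the function names are
# assumptions made for illustration; adjust them to the local setup.
LOG=${LOG:-/tmp/LD}    # somewhere the daemon's account can write
SERVER_DIR=${SERVER_DIR:-/tmp/jaeger/0p1/EDB/ECS_JUN_SS3_SL4h}
HOST=$(uname -n)

# Print 1 if PID_FILE in the given directory names a running process, else 0.
server_value() {
    local pid
    if [ -r "$1/PID_FILE" ] && pid=$(cat "$1/PID_FILE") \
        && ps -p "$pid" >/dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}

# One complete report: begin, host:name:value, end.
emit_report() {
    echo "begin"
    echo "$HOST:earl_ecs_jun:$(server_value "$SERVER_DIR")"
    echo "end"
}

# Answer every input line with a report; stop on "quit" or end of input.
sensor_loop() {
    local input
    echo "START $(date)" >>"$LOG"
    while read -r input; do
        [ "$input" = "quit" ] && break
        emit_report
    done
    echo "END $(date)" >>"$LOG"
}

# When installing this as the load sensor script, end the file with:
# sensor_loop
```

The loop is kept in a function so it can be tried from a prompt, for example with `printf '\nquit\n' | sensor_loop` after sourcing the file, before it is handed to the execution daemon.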
