Is your home directory on NFS, and is that NFS export mounted with root squash? The load sensor runs as root and thus won't be able to write to your home directory.
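One quick way to rule this out is to send the diagnostic output to a local path instead of the NFS home. Below is a minimal sketch, assuming /tmp is a local filesystem on the execution host. The log_diag helper and the LOG_FILE default are hypothetical names for illustration, not part of SGE:

```shell
#!/bin/bash
# Hypothetical helper: append a timestamped diagnostic line to a log file.
# LOG_FILE defaults to a path under /tmp (assumed to be local, not NFS).
LOG_FILE="${LOG_FILE:-/tmp/load_sensor_diag.log}"

log_diag() {
    # Report a failed write on stderr instead of failing silently, so a
    # permission problem (such as root squash) becomes visible.
    if ! echo "$1 $(date)" >> "$LOG_FILE"; then
        echo "cannot write to $LOG_FILE" >&2
        return 1
    fi
}
```

In the load sensor, a call such as "log_diag START" would then replace the "echo START `date` >>/home/..." line. If the local log fills up while the one in the home directory stays empty, root squash is the likely culprit.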
Also, log on to the execution node and check whether the load sensor is running with "ps -elf | grep <name of load sensor>".

Finally, if all else fails, we can increase the logging level to see what's going on...

Rayson

On Fri, Apr 20, 2012 at 10:28 AM, Earl Lazarus <[email protected]> wrote:
> Yes the load sensor is under my home directory which is visible on all
> machines. Would it be a true statement that my load sensor should be
> running as soon as I specify it in the host configuration? I need not
> submit jobs that reference the load that it is measuring.
>
>
> On Thu, Apr 19, 2012 at 9:30 PM, Ron Chen <[email protected]> wrote:
>>
>> I don't have access to a Unix machine now, so I assume the script works.
>>
>> However, it is always the execution daemons that run the load sensors, so
>> make sure the load sensor is available on all the machines.
>>
>> -Ron
>>
>>
>> ________________________________
>> From: Earl Lazarus <[email protected]>
>> To: Rayson Ho <[email protected]>
>> Cc: [email protected]
>> Sent: Thursday, April 19, 2012 9:49 PM
>> Subject: Re: [gridengine users] Load sensors
>>
>>
>> Here is the load sensor...it basically checks to see if a server is
>> running on the host, returning 1 if yes and 0 if no. It currently
>> contains diagnostic prints to my home directory. It runs fine from
>> the command prompt.
>>
>> When is a user provided load monitor actually run? Every time the
>> scheduler runs?
>>
>> #!/bin/bash
>> #PURPOSE SGE load monitor
>> #
>> #
>> good(){
>> echo "begin"
>> echo "$hst:earl_ecs_jun:1"
>> echo "end"
>> }
>> bad(){
>> echo "begin"
>> echo "$hst:earl_ecs_jun:0"
>> echo "end"
>> }
>> echo START `date` >>/home/elazarus/LD
>> hst=$(uname -n)
>> pf="PID_FILE"
>> while [ 1 ] ; do
>> read input
>> result=$?
>> echo READ `date` >>/home/elazarus/LD
>> if [ $result != 0 ] ; then
>> exit 1
>> fi
>> if [ "$input" = "quit" ] ; then
>> echo END `date` >>/home/elazarus/LD
>> exit 0
>> fi
>> # --ASSERT VALID QUERY
>> tmpname=/tmp/jaeger/0p1/EDB/ECS_JUN_SS3_SL4h
>> if [ -d $tmpname ] ; then
>> cd $tmpname
>> # --EXAMINE THE PID_FILE
>> if [ -e $pf ] ; then
>> # --FOUND PID_FILE
>> pid=$(cat $pf)
>> l=$(ps h -p $pid |wc -l)
>> if [ $l -eq 0 ] ; then
>> # --CANNOT FIND THE SPECIFIED PROCESS
>> bad
>> else
>> # --IT'S RUNNING!!
>> good
>> fi
>> else
>> # --NO PID_FILE
>> bad
>> fi
>> else
>> # --NO SERVER DIRECTORY
>> bad
>> fi
>> done
>>
>>
>> On Thu, Apr 19, 2012 at 7:18 PM, Rayson Ho <[email protected]> wrote:
>>
>> >Can you post your load sensor, or at least the main structure of your
>> >load sensor script??
>> >
>> >If you run the script interactively, what do you get??
>> >
>> >Rayson
>> >
>> >
>> >On Thu, Apr 19, 2012 at 8:14 PM, Earl Lazarus <[email protected]> wrote:
>> >> I followed all of those directions...it just doesn't run. Permissions
>> >> are 777.
>> >> I put an "echo START `date` >>/home/<myid>/LD"
>> >>
>> >> The file is always empty.
>> >>
>> >>
>> >> On Thu, Apr 19, 2012 at 12:37 PM, Rayson Ho <[email protected]> wrote:
>> >>>
>> >>> There is not a lot of actual "REQUIREMENTS" for a load sensor. As long
>> >>> as it prints the proper values to standard output, then it is good
>> >>> enough in most cases.
>> >>>
>> >>> You can get more detail from Oracle's doc:
>> >>>
>> >>> http://docs.oracle.com/cd/E24901_01/doc.62/e21978/configuration.htm#sthref182
>> >>>
>> >>> Rayson
>> >>>
>> >>>
>> >>> On Thu, Apr 19, 2012 at 1:31 PM, Earl Lazarus <[email protected]> wrote:
>> >>> > Based upon earlier postings, it looks like a load sensor will solve
>> >>> > my problem. Others have pointed to the following link (which
>> >>> > contains an example of a load sensor script).
>> >>> >
>> >>> > http://gridscheduler.sourceforge.net/howto/loadsensor.html
>> >>> >
>> >>> > The example script at this site contains a "read" statement and
>> >>> > seems to communicate with SGE via "echo". Is there someplace where
>> >>> > I can find the actual REQUIREMENTS for a load sensor script instead
>> >>> > of having to reverse engineer the requirements from an example?
>> >>> >
>> >>> > _______________________________________________
>> >>> > users mailing list
>> >>> > [email protected]
>> >>> > https://gridengine.org/mailman/listinfo/users
