Indeed, the script would not have been able to echo to my home directory, so
I changed the destination of the echo to /tmp/LD.  I then changed the name
of the script slightly, went into qmon, and entered the new name in both the
global configuration and the configuration of the one host I am looking at.
After a few minutes there is no sign of my echoes in /tmp, and the script
name does not show up in a ps -elf on that host.
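
To separate problems in the script from problems in the configuration, the
sensor can be driven by hand the way the load sensor howto describes
sge_execd driving it: an empty line on stdin requests one report, and "quit"
ends the script.  The sketch below is only an illustration under that
assumption; probe_sensor and check_report are made-up helper names, not SGE
tools.

```shell
#!/bin/bash
# Hypothetical helpers (not part of SGE) for exercising a load sensor by hand.
# Assumed protocol: a newline on stdin requests one report, "quit" ends the
# sensor, and a report is "begin", then host:name:value lines, then "end".

# probe_sensor CMD [ARGS...]: request exactly one report, then tell the
# sensor to quit; the sensor's stdout is passed through unchanged.
probe_sensor() {
    printf '\nquit\n' | "$@"
}

# check_report: read one report on stdin and succeed only if it starts with
# "begin", ends with "end", and every line in between has three
# colon-separated, non-empty fields.
check_report() {
    awk '
        NR == 1 && $0 != "begin" { bad = 1 }
        { lines[NR] = $0 }
        END {
            if (NR < 3 || lines[NR] != "end") bad = 1
            for (i = 2; i < NR; i++)
                if (lines[i] !~ /^[^:]+:[^:]+:[^:]+$/) bad = 1
            exit bad
        }'
}

# When called with arguments, treat them as the sensor command to check.
if [ "$#" -gt 0 ]; then
    if probe_sensor "$@" | check_report; then
        echo "report looks well-formed"
    else
        echo "report is missing or malformed" >&2
    fi
fi
```

Saved under any name and called with the path of the sensor script as its
argument, this prints one line saying whether the report was well-formed.  It
is most informative when run on the execution host as the same user that
sge_execd runs as, since that user may not have the same file permissions as
an interactive login.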

On Fri, Apr 20, 2012 at 9:42 AM, Reuti <[email protected]> wrote:

> On 20.04.2012 at 16:28, Earl Lazarus wrote:
>
> > Yes, the load sensor is under my home directory which is visible on all
> > machines.  Would it be a true statement that my load sensor should be
> > running as soon as I specify it in the host configuration?  I need not
> > submit jobs that reference the load that it is measuring.
>
> If you change the definition in a local host configuration, it's also
> necessary to touch the global configuration so that the change is
> distributed to the node (just remove a blank somewhere). Then after 2
> cycles of the load_report_time the process should be visible on the node:
>
> $ ps -e f
> ...
>  5081 ?        Sl   125:47 /usr/sge/bin/lx24-amd64/sge_execd
>  5147 ?        S      2:39  \_ /bin/sh /usr/sge/cluster/tmpspace.sh
>
> -- Reuti
>
>
> > On Thu, Apr 19, 2012 at 9:30 PM, Ron Chen <[email protected]> wrote:
> > I don't have access to a Unix machine now, so I assume the script works.
> >
> > However, it is always the execution daemons that run the load sensors, so
> > make sure the load sensor is available on all the machines.
> >
> >  -Ron
> >
> >
> >
> > ________________________________
> > From: Earl Lazarus <[email protected]>
> > To: Rayson Ho <[email protected]>
> > Cc: [email protected]
> > Sent: Thursday, April 19, 2012 9:49 PM
> > Subject: Re: [gridengine users] Load sensors
> >
> >
> > Here is the load sensor...it basically checks to see if a server is
> > running on the host, returning 1 if yes and 0 if no.  It currently
> > contains diagnostic prints to my home directory.  It runs fine from the
> > command prompt.
> >
> > When is a user provided load monitor actually run?  Every time the
> > scheduler runs?
> >
> > #!/bin/bash
> > #PURPOSE  SGE load monitor
> > #
> > #
> > good(){
> >    echo "begin"
> >    echo "$hst:earl_ecs_jun:1"
> >    echo "end"
> > }
> > bad(){
> >    echo "begin"
> >    echo "$hst:earl_ecs_jun:0"
> >    echo "end"
> > }
> >    echo START `date`  >>/home/elazarus/LD
> >    hst=$(uname -n)
> >    pf="PID_FILE"
> >    while [ 1 ] ; do
> >       read input
> >       result=$?
> >       echo READ `date`  >>/home/elazarus/LD
> >       if [ $result != 0 ] ; then
> >          exit 1
> >       fi
> >       if [ "$input" = "quit" ] ; then
> >          echo END `date`  >>/home/elazarus/LD
> >          exit 0
> >       fi
> > #     --ASSERT VALID QUERY
> >       tmpname=/tmp/jaeger/0p1/EDB/ECS_JUN_SS3_SL4h
> >       if [ -d $tmpname ] ; then
> >          cd $tmpname
> > #        --EXAMINE THE PID_FILE
> >          if [ -e $pf ] ; then
> > #           --FOUND PID_FILE
> >             pid=$(cat $pf)
> >             l=$(ps h -p $pid |wc -l)
> >             if [ $l -eq 0 ] ; then
> > #              --CANNOT FIND THE SPECIFIED PROCESS
> >                bad
> >             else
> > #              --IT'S RUNNING!!
> >                good
> >             fi
> >          else
> > #           --NO PID_FILE
> >             bad
> >          fi
> >       else
> > #        --NO SERVER DIRECTORY
> >          bad
> >       fi
> >    done
> >
> >
> >
> >
> > On Thu, Apr 19, 2012 at 7:18 PM, Rayson Ho <[email protected]> wrote:
> >
> > >Can you post your load sensor, or at least the main structure of your
> > >load sensor script??
> > >
> > >If you run the script interactively, what do you get??
> > >
> > >Rayson
> > >
> > >
> > >
> > >
> > >On Thu, Apr 19, 2012 at 8:14 PM, Earl Lazarus <[email protected]> wrote:
> > >> I followed all of those directions...it just doesn't run.  Permissions
> > >> are 777.
> > >>  I put an "echo START `date` >>/home/<myid>/LD"
> > >>
> > >> The file is always empty.
> > >>
> > >>
> > >> On Thu, Apr 19, 2012 at 12:37 PM, Rayson Ho <[email protected]> wrote:
> > >>>
> > >>> There are not a lot of actual "REQUIREMENTS" for a load sensor. As
> > >>> long as it prints the proper values to standard output, it is good
> > >>> enough in most cases.
> > >>>
> > >>> You can get more detail from Oracle's doc:
> > >>>
> > >>>
> > >>> http://docs.oracle.com/cd/E24901_01/doc.62/e21978/configuration.htm#sthref182
> > >>>
> > >>> Rayson
> > >>>
> > >>>
> > >>>
> > >>> On Thu, Apr 19, 2012 at 1:31 PM, Earl Lazarus <[email protected]> wrote:
> > >>> > Based upon earlier postings, it looks like a load sensor will
> > >>> > solve my problem.  Others have pointed to the following link
> > >>> > (which contains an example of a load sensor script).
> > >>> >
> > >>> > http://gridscheduler.sourceforge.net/howto/loadsensor.html
> > >>> >
> > >>> > The example script at this site contains a "read" statement and
> > >>> > seems to communicate with SGE via "echo".  Is there someplace
> > >>> > where I can find the actual REQUIREMENTS for a load sensor script
> > >>> > instead of having to reverse engineer the requirements from an
> > >>> > example?
> > >>> >
> > >>> > _______________________________________________
> > >>> > users mailing list
> > >>> > [email protected]
> > >>> > https://gridengine.org/mailman/listinfo/users
> > >>> >
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
