I don't have access to a Unix machine right now, so I will assume the script
itself works.

However, load sensors are always run by the execution daemons, so make sure
the load sensor script is available on all of the execution hosts.

 -Ron
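
A minimal sketch of the availability check suggested above, assuming passwordless ssh to the execution hosts. The helper name check_sensor, the REMOTE_SH override, and the example path are hypothetical and not part of Grid Engine; qconf -sel is the usual way to list the execution hosts.

```shell
#!/bin/bash
# Hypothetical helper: report every host on which the load sensor script
# is missing or not executable. REMOTE_SH (an assumption, defaults to ssh)
# is the command used to reach each host.
check_sensor() {
    local sensor=$1; shift
    local host rc=0
    for host in "$@"; do
        # </dev/null keeps ssh from swallowing the caller's stdin
        if ${REMOTE_SH:-ssh} "$host" test -x "$sensor" </dev/null; then
            echo "$host: ok"
        else
            echo "$host: MISSING $sensor"
            rc=1
        fi
    done
    return "$rc"
}

# Example (assumes qconf is on PATH; the sensor path is a placeholder):
#   check_sensor /path/to/load_sensor.sh $(qconf -sel)
```

The function returns non-zero if any host fails the check, so it can be used in a larger script. Paths containing spaces would need extra quoting for the remote shell.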



________________________________
From: Earl Lazarus <[email protected]>
To: Rayson Ho <[email protected]> 
Cc: [email protected] 
Sent: Thursday, April 19, 2012 9:49 PM
Subject: Re: [gridengine users] Load sensors


Here is the load sensor. It basically checks whether a server is running on
the host, returning 1 if yes and 0 if no. It currently contains diagnostic
prints that append to a file in my home directory.
It runs fine from the command prompt.

When is a user-provided load monitor actually run? Every time the scheduler
runs?

#!/bin/bash
#PURPOSE  SGE load monitor
#
#
good(){
   echo "begin"
   echo "$hst:earl_ecs_jun:1"
   echo "end"
}
bad(){
   echo "begin"
   echo "$hst:earl_ecs_jun:0"
   echo "end"
}
   echo "START $(date)" >>/home/elazarus/LD
   hst=$(uname -n)
   pf="PID_FILE"
   while true ; do
      read -r input
      result=$?
      echo "READ $(date)" >>/home/elazarus/LD
      if [ "$result" -ne 0 ] ; then
         exit 1
      fi
      if [ "$input" = "quit" ] ; then
         echo "END $(date)" >>/home/elazarus/LD
         exit 0
      fi
#     --ASSERT VALID QUERY
      tmpname=/tmp/jaeger/0p1/EDB/ECS_JUN_SS3_SL4h
      if [ -d "$tmpname" ] ; then
         cd "$tmpname"
#        --EXAMINE THE PID_FILE
         if [ -e "$pf" ] ; then
#           --FOUND PID_FILE
            pid=$(cat "$pf")
            l=$(ps h -p "$pid" | wc -l)
            if [ "$l" -eq 0 ] ; then
#              --CANNOT FIND THE SPECIFIED PROCESS
               bad
            else
#              --IT'S RUNNING!!
               good
            fi
         else
#           --NO PID_FILE
            bad
         fi
      else
#        --NO SERVER DIRECTORY
         bad
      fi
   done
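
To reproduce from the command prompt what the execution daemon is understood to do, the script above can be driven through its standard input: an empty line asks for one report and "quit" asks it to exit. The sketch below is only an illustration; probe_sensor and valid_report are hypothetical helper names, and the format check covers just the begin / host:name:value / end framing used by the script.

```shell
#!/bin/bash
# Feed a load sensor one report request (an empty line) followed by "quit".
# Usage: probe_sensor COMMAND [ARGS...]
probe_sensor() {
    printf '\nquit\n' | "$@"
}

# Read sensor output on stdin and succeed only if at least one report is
# framed by begin/end and every line in between looks like host:name:value.
valid_report() {
    awk -F: '
        $0 == "begin" { inside = 1; next }
        $0 == "end"   { if (inside) complete = 1; inside = 0; next }
        inside && NF != 3 { bad = 1 }
        !inside { bad = 1 }
        END { exit (complete && !bad) ? 0 : 1 }
    '
}

# Example (the script name is a placeholder):
#   probe_sensor ./load_sensor.sh | valid_report && echo "format ok"
```

Running this as the same user and on the same host as the execution daemon is a closer test than running it from a login shell, since paths and permissions may differ.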




On Thu, Apr 19, 2012 at 7:18 PM, Rayson Ho <[email protected]> wrote:

>Can you post your load sensor, or at least the main structure of your
>load sensor script?
>
>If you run the script interactively, what do you get?
>
>Rayson
>
>
>
>
>On Thu, Apr 19, 2012 at 8:14 PM, Earl Lazarus <[email protected]> wrote:
>> I followed all of those directions, but it just doesn't run. Permissions
>> are 777. I put an "echo START `date` >>/home/<myid>/LD" in the script.
>>
>> The file is always empty.
>>
>>
>> On Thu, Apr 19, 2012 at 12:37 PM, Rayson Ho <[email protected]>
>> wrote:
>>>
>>> There are not a lot of actual "REQUIREMENTS" for a load sensor. As long
>>> as it prints the proper values to standard output, it is good
>>> enough in most cases.
>>>
>>> You can get more detail from Oracle's doc:
>>>
>>>
>>> http://docs.oracle.com/cd/E24901_01/doc.62/e21978/configuration.htm#sthref182
>>>
>>> Rayson
>>>
>>>
>>>
>>> On Thu, Apr 19, 2012 at 1:31 PM, Earl Lazarus <[email protected]>
>>> wrote:
>>> > Based upon earlier postings, it looks like a load sensor will solve my
>>> > problem. Others have pointed to the following link (which contains an
>>> > example of a load sensor script).
>>> >
>>> > http://gridscheduler.sourceforge.net/howto/loadsensor.html
>>> >
>>> > The example script at this site contains a "read" statement and seems to
>>> > communicate with SGE via "echo".  Is there someplace where I can
>>> > find the actual REQUIREMENTS for a load sensor script instead of
>>> > having to reverse engineer the requirements from an example?
>>> >
>>> > _______________________________________________
>>> > users mailing list
>>> > [email protected]
>>> > https://gridengine.org/mailman/listinfo/users
>>> >
