On Wed, Feb 1, 2012 at 7:25 AM, Reuti <[email protected]> wrote:
> Am 01.02.2012 um 15:22 schrieb Michael Coffman:
>
>> I don't really understand what this means.  You mean that nfsd is
>> designed to run under the group ID of the user for a short time?  Yes,
>> we run NFS servers on all grid exec hosts.
>
> Yes, at least for the userspace daemons at that time. I'm not aware of the 
> situation for kernel threads. But it looks like it's still (or again?) the 
> case.

Thanks for the help!  Oh, and by the way, this is on RHEL 5 update 4,
running the update 7 kernel:

kernel-2.6.18-194.11.3.el5
nfs-utils-1.0.9-42.el5

>
> -- Reuti
>
>
>>>
>>> The additional group ID is for sure 22024, and/or is 27903 a second one 
>>> from another process?
>>
>> Yes, the 27903 and 90 are non-SGE group IDs.
>>
>> Sounds like the right thing to do is just to have the script exit once
>> all non-nfsd processes with the group ID have exited.
>>
>>>
>>> -- Reuti
>>>
>>>
>>>> This may be more appropriate for an NFS mailing list at this point,
>>>> but any clues as to how and why this group ID gets added to nfsd?
>>>>
>>>> Thanks.
>>>>
>>>> On Fri, Jan 13, 2012 at 2:57 PM, Reuti <[email protected]> wrote:
>>>>> Am 13.01.2012 um 19:40 schrieb Michael Coffman:
>>>>>
>>>>>>>> <snip>
>>>>>>>> It currently determines the pid of the shepherd process then watches 
>>>>>>>> all
>>>>>>>> the children processes.
>>>>>>>
>>>>>>> I think it's easier to use the additional group ID, which is attached 
>>>>>>> to all kids by SGE, whether they jump out of the process tree or not. 
>>>>>>> This one is recorded in $SGE_JOB_SPOOL_DIR in the file "addgrpid".
>>>>>>>
>>>>>>
>>>>>> Had not thought of this.  Sounds like a good idea.  At first glance I
>>>>>> am not seeing how to list the jobs via ps that are identified by the
>>>>>> GID in the addgrpid file.  I tried `ps -G $(cat addgrpid) -o vsz,rss,args`
>>>>>> but it returns nothing.  I'll have to dig into this a bit more.
>>>>>
>>>>> Yes, it's most likely only visible in /proc:
>>>>>
>>>>> $ qrsh
>>>>> Running inside SGE
>>>>> Job 3696
>>>>> $ id
>>>>> uid=1000(reuti) gid=100(users) 
>>>>> groups=10(wheel),16(dialout),33(video),100(users),20007
>>>>> $ grep -l -r "^Groups.* 20007" /proc/*/status 2>/dev/null | sed -n 
>>>>> "s|/proc/\([0-9]*\)/status|\1|p"
>>>>> 13306
>>>>> 13628
>>>>> 13629
>>>>>
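To turn that /proc scan into something a monitoring script can use directly, a small sketch along these lines might work (the `pids_with_group` name and the reporting columns are my own; the GID would come from the addgrpid file in $SGE_JOB_SPOOL_DIR mentioned earlier):

```shell
#!/bin/bash
# List pid, vsz, rss and command line for every process whose supplementary
# groups contain the given GID. ps -G matches only the real group ID, which
# is why scanning /proc/*/status is needed for SGE's additional group ID.
pids_with_group() {
    grep -l "^Groups:.*\b$1\b" /proc/[0-9]*/status 2>/dev/null |
        sed -n 's|/proc/\([0-9]*\)/status|\1|p'
}

if [ -n "${1:-}" ]; then
    for pid in $(pids_with_group "$1"); do
        ps -p "$pid" -o pid=,vsz=,rss=,args=
    done
fi
```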
>>>>>
>>>>>>>> Initially it will be watching memory usage and if a job begins using 
>>>>>>>> more
>>>>>>>> physical memory than requested, the user will be notified.  That's 
>>>>>>>> where
>>>>>>>> my question comes from.
>>>>>>>
>>>>>>> What about setting a soft limit for h_vmem and preparing the job script 
>>>>>>> to handle the signal and send an email? How will they request memory - 
>>>>>>> by virtual_free?
>>>>>>
>>>>>> Memory is requested via a consumable complex that we define as the
>>>>>> amount of physical memory.  The way most of the jobs are run
>>>>>> currently, we could not do this.  Job scripts typically call a
>>>>>> commercial vendor's binary, so there is nothing listening for the
>>>>>> signals.
>>>>>
>>>>> Ok. Depending on the application, and whether it resets the traps, you 
>>>>> can try a subshell to ignore the signal for the application, since the 
>>>>> signal is sent to the complete process group:
>>>>>
>>>>> #!/bin/bash
>>>>> trap 'echo USR1' USR1
>>>>> (trap '' USR1; exec your_binary) &   # child ignores USR1; ignored signals stay ignored across exec
>>>>> PID=$!
>>>>> wait $PID
>>>>> RET=$?
>>>>> # wait returns 128+signal (138 for SIGUSR1) when interrupted by the trap,
>>>>> # so call it again until the child really exits.
>>>>> while [ $RET -eq 138 ]; do wait $PID; RET=$?; done
>>>>>
>>>>>
>>>>> '' = two single quotation marks
>>>>> After the first signal `wait` must be called again.
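For trying this pattern outside SGE, a self-contained toy version might look like the following - `sleep 2` stands in for the real binary, and a helper subshell stands in for SGE delivering the signal (it signals only the script itself rather than the whole process group, to keep the demo harmless):

```shell
#!/bin/bash
# Toy demo of the subshell trick: the parent traps SIGUSR1 while the child
# ignores it and keeps running. 'sleep 2' stands in for the real binary.
trap 'echo got USR1' USR1
(trap '' USR1; exec sleep 2) &
PID=$!
(trap '' USR1; sleep 1; kill -USR1 $$) &   # stand-in for SGE's soft-limit signal
wait $PID
RET=$?
# 138 = 128 + SIGUSR1(10): wait was interrupted by the trap, not by a child exit
while [ $RET -eq 138 ]; do wait $PID; RET=$?; done
echo "child exit status: $RET"
```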
>>>>>
>>>>>
>>>>>>>> Is there any way in the prolog to get access to the hard_request 
>>>>>>>> options
>>>>>>>> besides using qstat?
>>>>>>>>
>>>>>>>> What I'm currently doing:
>>>>>>>>
>>>>>>>>  cmd = "bash -c '. #{@sge_root}/default/common/settings.sh && qstat
>>>>>>>> -xml -j #{@number}'"
>>>>>>>>
>>>>>>>> I have thought of possibly setting an environment variable via a jsv 
>>>>>>>> script
>>>>>>>> that can be queried by the prolog script.  Is this a good idea?  How 
>>>>>>>> much impact
>>>>>>>> on submission time does jsv_send_env() add?
>>>>>>>
>>>>>>> You can use either a JSV or a `qsub` wrapper for it.
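A minimal JSV sketch of the environment-variable idea (hedged: the function names are from SGE's shell JSV binding, jsv_include.sh, as I remember them; `mem_free` and `MONITOR_MEM_REQUEST` are placeholders - double-check both against your installation):

```shell
#!/bin/bash
# Hypothetical JSV: copy the job's memory request into an environment
# variable so the prolog can read it without calling qstat per job.
jsv_on_start() {
    jsv_send_env                               # ask qmaster for the job's environment
}

jsv_on_verify() {
    mem=$(jsv_sub_get_param l_hard mem_free)   # 'mem_free' stands in for your consumable
    [ -n "$mem" ] && jsv_add_env MONITOR_MEM_REQUEST "$mem"
    jsv_correct "Job was modified"
}

# Wire up the JSV protocol only when the SGE helper library is present.
if [ -n "${SGE_ROOT:-}" ] && [ -r "$SGE_ROOT/util/resources/jsv/jsv_include.sh" ]; then
    . "$SGE_ROOT/util/resources/jsv/jsv_include.sh"
    jsv_main
fi
```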
>>>>>>>
>>>>>>>
>>>>>>>> Any one else doing anything like this have any suggestions?
>>>>>>>>
>>>>>>>>
>>>>>>>> The end goal is to have a utility that users can also interact with to
>>>>>>>> monitor their jobs.  By either setting environment variables or grid
>>>>>>>> complexes
>>>>>>>
>>>>>>> Complexes are only handled internally by SGE. There is no user command 
>>>>>>> to change them for a non-admin.
>>>>>>
>>>>>> My thoughts on the complex were that there would be a complex flag
>>>>>> indicating that the user wanted to monitor memory, CPU, etc. - not
>>>>>> that it would be changeable by the user, just an indicator for the
>>>>>> JSV script.
>>>>>
>>>>> Ok.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>
>>>>>>>> to affect the behavior of what is being watched and how they
>>>>>>>> are notified.
>>>>>>>
>>>>>>> AFAIK you can't change the content of an already inherited variable, as 
>>>>>>> the process got a copy of the value. Also, /proc/12345/environ is 
>>>>>>> read-only. And your "observation daemon" will run on all nodes - one for 
>>>>>>> each job from the prolog, if I get you right?
>>>>>>
>>>>>> Correct.
>>>>>>
>>>>>>>
>>>>>>> But a nice solution could be the usage of the job context. This can be 
>>>>>>> set by the user on the command line, and your job can access it by 
>>>>>>> issuing a command similar to the one you use already. If the exec hosts 
>>>>>>> are submit hosts, the job can also change it using `qalter`, like the 
>>>>>>> user does on the command line. We use the job context only for 
>>>>>>> documentation purposes, to record the issued command and append it to 
>>>>>>> the email which is sent after the job.
>>>>>>>
>>>>>>> http://gridengine.org/pipermail/users/2011-September/001629.html
>>>>>>>
>>>>>>> $ qstat -j 12345
>>>>>>> ...
>>>>>>> context:                    COMMAND=subturbo -v 631 -g -m 3500 -p 8 -t 
>>>>>>> infinity -s 
>>>>>>> aoforce,OUTPUT=/home/foobar/carbene/gecl4_2carb228/trans_tzvp_3.out
>>>>>>>
>>>>>>> It's only one long line, and I split it later on into individual 
>>>>>>> entries. In your case you have to watch out for commas, as they are 
>>>>>>> already used to separate entries.
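Splitting such a context line could be sketched like this, using the example line above (and, as noted, it breaks if a value itself contains a comma):

```shell
#!/bin/bash
# Split a 'context:' line from qstat -j into NAME=VALUE entries.
context='COMMAND=subturbo -v 631 -g -m 3500 -p 8 -t infinity -s aoforce,OUTPUT=/home/foobar/carbene/gecl4_2carb228/trans_tzvp_3.out'
IFS=',' read -ra entries <<< "$context"
for e in "${entries[@]}"; do
    printf '%s -> %s\n' "${e%%=*}" "${e#*=}"
done
```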
>>>>>>
>>>>>> The context sounds very interesting.  Not something we have really
>>>>>> played around with.
>>>>>>
>>>>>> Again, thanks for the input.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -- Reuti
>>>>>>>
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> --
>>>>>>>> -MichaelC
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> [email protected]
>>>>>>>> https://gridengine.org/mailman/listinfo/users
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -MichaelC
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -MichaelC
>>>>
>>>
>>
>>
>>
>> --
>> -MichaelC
>>
>



-- 
-MichaelC
