On Wed, 13 May 2015 07:44:08 +0000
"[email protected]" <[email protected]> wrote:

> Hi Reuti,
> 
> The value in /opt/sge/default/spool/active_jobs/8143543.1/addgrpid does not 
> appear in /proc/.
> 
> But the child processes of the job are present in /proc/.
> 
> Can you please suggest a solution?
> 
There are two possible solutions described in the section of the man page
Reuti referred you to: either compile and use a patched sshd, or configure
your existing sshd to use the pam_sge-qrsh-setup PAM module.
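
To check whether the additional group ID is attached, as Reuti describes
further down, one option is to compare the number stored in the job's
addgrpid file with the "Groups:" line of each /proc/<pid>/status. The Python
sketch below is only an illustration under assumptions: the helper names
(read_addgrpid, supplementary_groups, pids_with_group) are hypothetical and
not part of SGE, and it assumes a Linux /proc layout and that addgrpid holds
a single integer.

```python
import os


def read_addgrpid(path):
    """Return the additional group ID stored in a job's addgrpid file.

    Assumption: the file contains a single integer.
    """
    with open(path) as fh:
        return int(fh.read().strip())


def supplementary_groups(status_text):
    """Parse the 'Groups:' line of /proc/<pid>/status text into a set of ints."""
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            return {int(g) for g in line.split()[1:]}
    return set()


def pids_with_group(gid, proc_root="/proc"):
    """Return the sorted PIDs whose supplementary groups include gid."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'net'
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                groups = supplementary_groups(fh.read())
        except OSError:
            continue  # the process exited or its status is unreadable
        if gid in groups:
            pids.append(int(entry))
    return sorted(pids)
```

Calling pids_with_group(read_addgrpid(path)) with the addgrpid path quoted
below would list the processes that carry the job's extra group. An empty
list while the job's children are still running would match what you report,
and would suggest that sshd is not passing the group on.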

> Regards,
> Sudha
> 
> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Tuesday, May 12, 2015 8:53 PM
> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> Cc: [email protected]; [email protected]
> Subject: Re: [gridengine users] grid jobs not visible with qstat output
> 
> 
> > On 12.05.2015 at 17:03, <[email protected]> 
> > <[email protected]> wrote:
> >
> > Hi Reuti,
> >
> > The link you suggested
> > (https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html ) states
> > the following:
> >
> > "To  have a tight integration of SSH into SGE, the started sshd needs an 
> > additional group ID to be attached."
> >
> > We checked the configuration on our side, and the addgrpid file is generated:
> >
> > /opt/sge/default/spool/active_jobs/8143543.1 : ls addgrpid
> 
> Yes, but it is not attached to all processes. Processes running in a tight 
> integration need it attached, as shown for example in /proc:
> 
> reuti@node:/proc/24989> cat status
> ...
> Groups: 20082 24000 25000
> 
> And the 20082 is the additional one.
> 
> -- Reuti
> 
> 
> >
> > Regards,
> > Sudha
> >
> > -----Original Message-----
> > From: Reuti [mailto:[email protected]]
> > Sent: Monday, May 11, 2015 2:08 AM
> > To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> > Cc: [email protected]; [email protected]
> > Subject: Re: [gridengine users] grid jobs not visible with qstat
> > output
> >
> >
> > On 10.05.2015 at 19:30, <[email protected]> 
> > <[email protected]> wrote:
> >
> >> Hi Reuti,
> >>
> >> The startup mechanism is as below
> >>
> >> qlogin_daemon                /usr/sbin/sshd -i
> >> qlogin_command               /gridapl1/HWEE_ge6/new/qssh
> >
> > Then most likely `ssh` is not tightly integrated into SGE. 
> > Please have a look at:
> >
> > https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
> >
> > section "SSH TIGHT INTEGRATION".
> >
> > -- Reuti
> >
> >
> >> Regards,
> >> Sudha
> >>
> >> -----Original Message-----
> >> From: Reuti [mailto:[email protected]]
> >> Sent: Friday, May 08, 2015 10:50 PM
> >> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> >> Cc: [email protected]; [email protected]
> >> Subject: Re: [gridengine users] grid jobs not visible with qstat
> >> output
> >>
> >>
> >>> On 08.05.2015 at 16:57, [email protected] wrote:
> >>>
> >>> Hi Zhang,
> >>>
> >>> Please find the output below:
> >>>
> >>> 32682 61457200 27020 karppa 32682 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spo
> >>> 32734 61457200 27020 karppa 32734  \_ /bin/ksh ./run_it_file.vcs
> >>> 33043 61457200 27020 karppa 32734      \_ /bin/ksh ./vcs.start.dh.no_gui
> >>> 33059 61457200 27020 karppa 32734          \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+
> >>> 38048 61457200 27020 karppa 32734              \_ [target.bin] <defunct>
> >>> 5049 61457200 27020 karppa 5049 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spoo
> >>> 5101 61457200 27020 karppa 5101  \_ /bin/ksh ./run_it_file.vcs
> >>> 5408 61457200 27020 karppa 5101      \_ /bin/ksh ./vcs.start.dh.no_gui
> >>> 5424 61457200 27020 karppa 5101          \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+a
> >>> 9089 61457200 27020 karppa 5101              \_ [target.bin] <defunct>
> >>
> >> The problem seems to be that the `qrsh_starter` is no longer bound to the 
> >> "sge_shepherd". Was this captured after the job ended? What does it look 
> >> like while SGE still knows about the job? And what is the startup mechanism:
> >>
> >> $ qconf -sconf
> >> ...
> >> qlogin_command               builtin
> >> qlogin_daemon                builtin
> >> rlogin_command               builtin
> >> rlogin_daemon                builtin
> >> rsh_command                  builtin
> >> rsh_daemon                   builtin
> >>
> >> -- Reuti
> >>
> >>
> >>> Regards,
> >>> Sudha
> >>>
> >>> -----Original Message-----
> >>> From: Feng Zhang [mailto:[email protected]]
> >>> Sent: Friday, May 08, 2015 7:35 PM
> >>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> >>> Subject: Re: [gridengine users] grid jobs not visible with qstat
> >>> output
> >>>
> >>> Sudha,
> >>>
> >>> Can you run "ps -e f -o pid,ppid,command", which can show more details?
> >>>
> >>> On Fri, May 8, 2015 at 4:09 AM,  <[email protected]> wrote:
> >>>> Hi Reuti,
> >>>>
> >>>> The processes are not bound to sge_shepherd anymore.
> >>>>
> >>>> Below are the qrsh_starter processes that are still running:
> >>>>
> >>>> 5049 ?        00:00:00 qrsh_starter
> >>>> 5101 ?        00:00:00 run_it_file.vcs
> >>>> 5408 ?        00:00:00 vcs.start.dh.no
> >>>> 5424 ?        8-20:57:02 simv
> >>>> 9089 ?        00:00:00 target.bin <defunct>
> >>>> 16868 ?        00:00:00 sshd
> >>>> 16913 pts/9    00:00:00 bash
> >>>> 17371 pts/9    00:00:00 ps
> >>>> 32682 ?        00:00:00 qrsh_starter
> >>>> 32734 ?        00:00:00 run_it_file.vcs
> >>>> 33043 ?        00:00:00 vcs.start.dh.no
> >>>> 33059 ?        8-21:19:03 simv
> >>>> 38048 ?        00:00:00 target.bin <defunct>
> >>>>
> >>>> Regards,
> >>>> Sudha
> >>>>
> >>>> -----Original Message-----
> >>>> From: Reuti [mailto:[email protected]]
> >>>> Sent: Thursday, May 07, 2015 9:52 PM
> >>>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> >>>> Cc: [email protected]; [email protected]
> >>>> Subject: Re: [gridengine users] grid jobs not visible with qstat
> >>>> output
> >>>>
> >>>> Are the processes still bound to the sge_shepherd, or did they jump out 
> >>>> of the process tree? By what method were they started by qrsh_starter: 
> >>>> "builtin" or by defining `ssh`?
> >>>>
> >>>> -- Reuti
> >>>>
> >>>>
> >>>>> On 07.05.2015 at 18:00, <[email protected]> 
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> No, the slots are no longer in use.
> >>>>>
> >>>>> According to qstat, I have no jobs on the host. However, my processes 
> >>>>> (launched by qrsh_starter) are still running on that host and together 
> >>>>> consume 200% CPU as well as licenses. 
> >>>>> The problem is that these processes have been running there for over a 
> >>>>> week without my being aware of them. I had assumed that the processes 
> >>>>> were killed when the job was killed with qdel.
> >>>>>
> >>>>> What could be the reason for this?
> >>>>>
> >>>>> Regards,
> >>>>> Sudha
> >>>>>
> >>>>> From: Srirangam Addepalli [mailto:[email protected]]
> >>>>> Sent: Wednesday, May 06, 2015 7:52 PM
> >>>>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> >>>>> Subject: Re: [gridengine users] grid jobs not visible with qstat
> >>>>> output
> >>>>>
> >>>>> That would be strange.  Do the slots on the host show as being used?
> >>>>>
> >>>>> qhost -j -h hostname should list the jobs that Grid Engine is aware of, 
> >>>>> unless qrsh somehow spawned a process that is not bound by sge_execd. 
> >>>>> On the client/execution host, what do you have in the active_jobs and 
> >>>>> jobs directories?  It is more likely that the qrsh session was 
> >>>>> terminated but left resident processes behind.
> >>>>>
> >>>>> Rangam
> >>>>>
> >>>>> On Wed, May 6, 2015 at 9:05 AM, <[email protected]> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I noticed that two of my grid jobs have been running for over a week on 
> >>>>> a machine without my being aware of them. Both jobs were launched 
> >>>>> with qrsh, but they are not visible with qstat, so for some reason 
> >>>>> they are no longer included in the grid's bookkeeping. This issue 
> >>>>> wastes grid resources on ghost jobs; for example, 
> >>>>> both of my jobs seem to consume 100% CPU on the host.
> >>>>>
> >>>>> Can anyone please explain this?
> >>>>>
> >>>>> Regards,
> >>>>> Sudha
> >>>>>
> >>>>> The information contained in this electronic message and any
> >>>>> attachments to this message are intended for the exclusive use of
> >>>>> the addressee(s) and may contain proprietary, confidential or
> >>>>> privileged information. If you are not the intended recipient, you
> >>>>> should not disseminate, distribute or copy this e-mail. Please
> >>>>> notify the sender immediately and destroy all copies of this
> >>>>> message and any attachments. WARNING: Computer viruses can be
> >>>>> transmitted via email. The recipient should check this email and
> >>>>> any attachments for the presence of viruses. The company accepts
> >>>>> no liability for any damage caused by any virus transmitted by
> >>>>> this email. www.wipro.com
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> [email protected]
> >>>>> https://gridengine.org/mailman/listinfo/users
> >>>>>
> >>>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best,
> >>>
> >>> Feng
> >>>
> >>
> >>
> >
> >
> 


-- 
William Hay <[email protected]>

