> On 08.05.2015 at 16:57, [email protected] wrote:
> 
> Hi Zhang,
> 
> Please find the output below:
> 
> 32682 61457200 27020 karppa 32682 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spo
> 32734 61457200 27020 karppa 32734  \_ /bin/ksh ./run_it_file.vcs
> 33043 61457200 27020 karppa 32734      \_ /bin/ksh ./vcs.start.dh.no_gui
> 33059 61457200 27020 karppa 32734          \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+
> 38048 61457200 27020 karppa 32734              \_ [target.bin] <defunct>
> 5049 61457200 27020 karppa 5049 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spoo
> 5101 61457200 27020 karppa 5101  \_ /bin/ksh ./run_it_file.vcs
> 5408 61457200 27020 karppa 5101      \_ /bin/ksh ./vcs.start.dh.no_gui
> 5424 61457200 27020 karppa 5101          \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+a
> 9089 61457200 27020 karppa 5101              \_ [target.bin] <defunct>

The problem seems to be that the `qrsh_starter` is no longer bound to the 
"sge_shepherd". Was this output taken after the job had ended? What does it 
look like while SGE still knows about the job? What is the startup mechanism:

$ qconf -sconf
...
qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin
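
To check the first point on the node itself, something along these lines might 
help. It is only a rough sketch: the function name find_orphaned_starters is 
made up for illustration, and it assumes that a detached qrsh_starter simply 
has no sge_shepherd anywhere among its ancestors in the process table. It 
reads "PID PPID COMMAND" lines on stdin and prints the PID of every 
qrsh_starter that does not sit below an sge_shepherd:

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): list qrsh_starter
# processes that have no sge_shepherd among their ancestors.
# Input on stdin: one process per line as "PID PPID COMMAND".
find_orphaned_starters() {
    awk '
        # Remember parent PID and command name for every process.
        { ppid[$1] = $2; cmd[$1] = $3 }
        END {
            for (p in cmd) {
                if (cmd[p] !~ /qrsh_starter/) continue
                bound = 0
                q = ppid[p]
                # Walk up the tree until the parent is unknown.
                # The counter only guards against malformed input.
                for (n = 0; n < 1000 && (q in cmd); n++) {
                    if (cmd[q] ~ /sge_shepherd/) { bound = 1; break }
                    q = ppid[q]
                }
                if (!bound) print p
            }
        }
    ' | sort -n
}
```

On the execution host it could be fed with 
"ps -e -o pid= -o ppid= -o comm= | find_orphaned_starters". Any PID it prints 
would be a qrsh_starter that is no longer below a shepherd, so SGE would not 
be expected to signal it on qdel.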

-- Reuti


> Regards,
> Sudha
> 
> -----Original Message-----
> From: Feng Zhang [mailto:[email protected]]
> Sent: Friday, May 08, 2015 7:35 PM
> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> Subject: Re: [gridengine users] grid jobs not visible with qstat output
> 
> Sudha,
> 
> Can you run "ps -e f -o pid,ppid,command", which can show more details?
> 
> On Fri, May 8, 2015 at 4:09 AM,  <[email protected]> wrote:
>> Hi Reuti,
>> 
>> The processes are not bound to sge_shepherd anymore.
>> 
>> Below are the qrsh_starter processes that are still running:
>> 
>> 5049 ?        00:00:00 qrsh_starter
>> 5101 ?        00:00:00 run_it_file.vcs
>> 5408 ?        00:00:00 vcs.start.dh.no
>> 5424 ?        8-20:57:02 simv
>> 9089 ?        00:00:00 target.bin <defunct>
>> 16868 ?        00:00:00 sshd
>> 16913 pts/9    00:00:00 bash
>> 17371 pts/9    00:00:00 ps
>> 32682 ?        00:00:00 qrsh_starter
>> 32734 ?        00:00:00 run_it_file.vcs
>> 33043 ?        00:00:00 vcs.start.dh.no
>> 33059 ?        8-21:19:03 simv
>> 38048 ?        00:00:00 target.bin <defunct>
>> 
>> Regards,
>> Sudha
>> 
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Thursday, May 07, 2015 9:52 PM
>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
>> Cc: [email protected]; [email protected]
>> Subject: Re: [gridengine users] grid jobs not visible with qstat output
>> 
>> Are the processes still bound to the sge_shepherd, or did they jump out of 
>> the process tree? By what method were they started by qrsh_starter: 
>> "builtin" or by defining `ssh`?
>> 
>> -- Reuti
>> 
>> 
>>> On 07.05.2015 at 18:00, <[email protected]> 
>>> <[email protected]> wrote:
>>> 
>>> Hi,
>>> 
>>> No, the slots are not being used anymore.
>>> 
>>> According to qstat I do not have any jobs on the host. However, my 
>>> processes are still running on that specific host (launched by 
>>> qrsh_starter), and together they are consuming 200% of CPU as well as 
>>> licenses. The problem is that the processes have been running there for 
>>> over a week without my being aware of them. I had assumed that the 
>>> processes were killed when the job was killed with qdel.
>>> 
>>> What could be the reason for this?
>>> 
>>> Regards,
>>> Sudha
>>> 
>>> From: Srirangam Addepalli [mailto:[email protected]]
>>> Sent: Wednesday, May 06, 2015 7:52 PM
>>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
>>> Subject: Re: [gridengine users] grid jobs not visible with qstat output
>>> 
>>> That would be strange. Do the slots on the host show as being used?
>>> 
>>> qhost -j -h hostname should list the jobs that Grid Engine is aware of, 
>>> unless qrsh somehow spawned a process that is not bound by sge_execd. On 
>>> the client/execution host, what info do you have in the active_jobs and 
>>> jobs directories? It is more likely that the qrsh session was terminated 
>>> but left resident processes behind.
>>> 
>>> Rangam
>>> 
>>> On Wed, May 6, 2015 at 9:05 AM, <[email protected]> wrote:
>>> Hi,
>>> 
>>> I noticed that two of my grid jobs have been running for over a week on a 
>>> machine without my being aware of it. Both jobs were launched with qrsh, 
>>> but they are not visible with qstat, so for one reason or another they 
>>> are no longer included in the grid book-keeping. As a result, grid 
>>> resources are wasted on ghost jobs; for example, both of my jobs seem to 
>>> consume 100% CPU on the host.
>>> 
>>> Can anyone please explain this?
>>> 
>>> Regards,
>>> Sudha
>>> 
>>> The information contained in this electronic message and any attachments to 
>>> this message are intended for the exclusive use of the addressee(s) and may 
>>> contain proprietary, confidential or privileged information. If you are not 
>>> the intended recipient, you should not disseminate, distribute or copy this 
>>> e-mail. Please notify the sender immediately and destroy all copies of this 
>>> message and any attachments. WARNING: Computer viruses can be transmitted 
>>> via email. The recipient should check this email and any attachments for 
>>> the presence of viruses. The company accepts no liability for any damage 
>>> caused by any virus transmitted by this email. www.wipro.com
>>> 
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
>>> 
>>> 
>> 
> 
> 
> 
> --
> Best,
> 
> Feng
> 


