Can you please explain in more detail how to launch with the debugger enabled?

Thanks

On 08/31/2012 12:25 PM, Rayson Ho wrote:
Two things you can try:

1) Run the qmaster under a debugger with the $SGE_ND environment
variable set (so that it stays in the foreground instead of
daemonizing), and send us a backtrace of the crash.

2) Try the qmaster binary in a newer release (you don't need to
upgrade other parts of your cluster, and don't need to drain the
jobs), and if it really is the job report issue, then the newer
qmaster should be able to handle the job reports without crashing:

http://dl.dropbox.com/u/47200624/respin/ge2011.11.tar.gz

Of course, you can compile from source if you want:
http://gridscheduler.sourceforge.net/
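
For option 1, a minimal sketch of the launch, written as a small Python
wrapper. It assumes gdb is installed on the qmaster host, and the
/opt/sge fallback and the lx24-amd64 arch string are guesses for a
64-bit Linux install, so adjust them to your cluster. It only sets
SGE_ND so the daemon does not detach, starts sge_qmaster under gdb in
batch mode, and prints backtraces once the process stops:

```python
import os
import subprocess


def build_debug_launch(sge_root, arch, gdb="gdb"):
    """Return (argv, env) for running sge_qmaster in the foreground under gdb.

    sge_root and arch are site-specific; the values used below are
    assumptions, not something taken from this thread.
    """
    qmaster = os.path.join(sge_root, "bin", arch, "sge_qmaster")

    env = dict(os.environ)
    env["SGE_ROOT"] = sge_root
    # Any value works: when SGE_ND is set the daemon does not daemonize,
    # so gdb stays attached to the process that actually crashes.
    env["SGE_ND"] = "1"

    argv = [
        gdb, "-batch",                 # run the -ex commands, then exit
        "-ex", "run",                  # start the qmaster
        "-ex", "bt",                   # backtrace of the crashing thread
        "-ex", "thread apply all bt",  # backtraces of all threads
        "--args", qmaster,
    ]
    return argv, env


if __name__ == "__main__":
    # Hypothetical defaults; replace with your own install path and arch.
    root = os.environ.get("SGE_ROOT", "/opt/sge")
    argv, env = build_debug_launch(root, "lx24-amd64")
    print("Running:", " ".join(argv))
    subprocess.call(argv, env=env)
```

Stop any running qmaster first, run this as the user that normally
starts the daemon, then enable a queue so that a job triggers the
crash. The backtrace text printed by gdb is what to send to the list.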

Rayson



On Fri, Aug 31, 2012 at 3:20 PM, Bob Tupper <[email protected]> wrote:
Thanks for your help.
I do have a PE defined, but the qmaster crashes even with a simple job
that just sleeps.
It crashes every time.
-Bob



On 08/31/2012 11:59 AM, Rayson Ho wrote:
Do you have parallel (or PE) jobs in your cluster? A bug in SGE 6.2u5
can cause the qmaster to seg fault when it receives the job reports
from parallel jobs.

Rayson



On Fri, Aug 31, 2012 at 2:52 PM, Bob Tupper <[email protected]> wrote:
Greetings,

Hope someone can help me out.
I have a 6.2u5 install on CentOS 5.x.

Last night the power company shut us down.
This morning I cannot get the sge_qmaster daemon to stay running.
If I disable all the queues or shut down all the execution host daemons
so jobs cannot run, it will stay up.

As soon as I enable the queues and a job attempts to run, the
sge_qmaster daemon crashes. Sometimes the job sends an error email,
often not, but the daemon always segfaults.

I restored from backup and have the same problem.

I have a shadow master, and the daemon crashes on both the main and
backup masters.

I'm at a loss. Any help would be most appreciated.

Thanks
-Bob

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users


