On 28.04.2012 at 06:12, M Stauffer -V- wrote:

>> Nevertheless, if the admin increases the wallclock time, one
>> could start the tool `screen` and inside the `screen` session
>> issue a `qlogin`. You can detach from and reattach to this
>> `screen` session from wherever and whenever you like (supposing
>> it was started on an always switched-on headnode of the cluster).
>>
>> http://linux.die.net/man/1/screen
>>
>> Most important: "CTRL-A d" to detach, and later start it with
>> `screen -xRR` to get the session back.
>
> I've tried screen a bit before, thanks. Someone else had an idea which might
> work even if the admin doesn't increase wallclock time: qlogin, *then*
> start screen and start the debugging process, then detach and log out. Then
> qlogin into the *same node* and reattach. I'm going to experiment with that,
> see if it works.

Well, this would violate the granted scheduling, and AFAICS the screen session will be terminated in a proper way due to the additional group ID attached to the job's processes. NB: the ownership of the generated /dev/pts/x is wrong and needs to be fixed to have access to it as a user (in case you want to test it on your own).

-- Reuti

> Cheers,
> Michael
>
>>
>>> Rayson
>>>
>>>
>>> On Thu, Apr 26, 2012 at 7:20 PM, "Hung-Sheng Tsao (LaoTsao 老曹) Ph. D."
>>> <[email protected]> wrote:
>>>> if this is SGE related
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I've been assigned a debugging task on a Rocks 5.4.3 cluster (I helped
>>>> build the software, but on OSX). I know only the very basics of
>>>> using the cluster, like making sure I qlogin to run something instead of
>>>> doing it on the head node, and that I can submit tasks using qsub, but
>>>> that's about it.
>>>>
>>>> The app I'm debugging is segfaulting but only on the cluster, not on my
>>>> Mac, and it will take 20+ hours to segfault judging from the current
>>>> rate. I'm currently running it in gdb via a qlogin session.
>>>>
>>>> My question is whether there's a way to start an interactive session,
>>>> then suspend the qlogin session without ending the interactive job
>>>> itself, and then reattach to the job later on to debug once the
>>>> segfault has happened. The main reason for doing this is that the
>>>> cluster admin has set a 24-hour limit on qlogin sessions. Also, it'd be
>>>> easier to not have to maintain my terminal connection, but that's not a
>>>> big issue.
>>>>
>>>> Or are there other ways to accomplish this? Many thanks for any help!
>>>>
>>>> Cheers,
>>>> Michael
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> [email protected]
>>>> https://gridengine.org/mailman/listinfo/users
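
For reference, the head-node variant discussed in this thread (start `screen` first, then `qlogin` inside it) might be sketched roughly as below. This is only an illustration, not a tested recipe for any particular cluster: the function names `start_or_reattach`, `in_screen` and `interactive_job` are made up for this sketch, and the guard relies on the `STY` environment variable that `screen` sets inside its sessions. The wallclock limit still applies to the `qlogin` job, so this only helps if the admin allows a long enough run.

```shell
#!/bin/sh
# Sketch only: the functions are defined here but not called, so sourcing
# this file starts nothing. Call them by hand, in the order shown below.

# Step 1 (on the head node): reattach to an existing screen session, or
# create a new one if none exists. Flags are the ones quoted in the thread.
start_or_reattach() {
    screen -xRR
}

# Helper (hypothetical name): succeeds only inside a screen session,
# because screen exports STY to the processes it starts.
in_screen() {
    [ -n "${STY:-}" ]
}

# Step 2 (inside the screen session): request the interactive job.
# Refuses to run outside screen, since the qlogin would then die
# together with the terminal connection.
interactive_job() {
    if ! in_screen; then
        echo "not inside a screen session; run start_or_reattach first" >&2
        return 1
    fi
    qlogin
}

# Step 3: on the compute node, start the program under gdb as usual.
# Step 4: press CTRL-A d to detach; log out of the head node if you like.
# Step 5: later, log in to the head node again and run start_or_reattach.
```

The guard in `interactive_job` is a design choice of this sketch rather than anything Grid Engine requires; it simply prevents the easy mistake of running `qlogin` in a plain terminal and losing the job when the connection drops.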
