OK, here are the backtraces of the running processes. There are two processes
running:

0 S becsekba  54451  54421  0  80   0 - 76108 futex_ 12:39 pts/92   00:00:00 /opt/slurm/16.05.8/bin/srun -n 8 whale-dbg -i IMP/RunImpact2D.i
1 S becsekba  54477  54451  0  80   0 - 24908 pipe_w 12:39 pts/92   00:00:00 /opt/slurm/16.05.8/bin/srun -n 8 whale-dbg -i IMP/RunImpact2D.i

Attaching gdb to the first (PID 54451) gives me this backtrace:
(gdb) bt
#0  0x00002b0f815c003f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000000000580ee4 in slurm_step_launch_wait_finish (ctx=0x99dd40) at step_launch.c:622
#2  0x00002b0f85db2490 in launch_p_step_wait (job=0x99e3e0, got_alloc=false) at launch_slurm.c:692
#3  0x0000000000587a82 in launch_g_step_wait (job=0x99e3e0, got_alloc=false) at launch.c:523
#4  0x000000000042d27a in srun (ac=6, av=0x7ffd0d0f2c58) at srun.c:288
#5  0x000000000042dc21 in main (argc=6, argv=0x7ffd0d0f2c58) at srun.wrapper.c:17

Attaching gdb to the second (PID 54477) gives me this backtrace:
(gdb) bt
#0  0x00002b0f815c2a60 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x00000000005918f7 in _shepard_spawn (job=0x99e3e0, got_alloc=false) at srun_job.c:1383
#2  0x000000000058fe15 in create_srun_job (p_job=0x7ecd00 <job>, got_alloc=0x7ffd0d0f2a6f, slurm_started=false, handle_signals=true) at srun_job.c:652
#3  0x000000000042cd6c in srun (ac=6, av=0x7ffd0d0f2c58) at srun.c:194
#4  0x000000000042dc21 in main (argc=6, argv=0x7ffd0d0f2c58) at srun.wrapper.c:17
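
In case it helps to repeat this without typing into an interactive gdb session each time, here is a minimal sketch of how the attach-and-backtrace step could be scripted. It assumes gdb is on the PATH and that I am allowed to attach to the process. The helper names (gdb_backtrace_command, collect_backtrace) are made up for illustration, and the PIDs are just the two srun PIDs from the listing above. It asks for "thread apply all bt" rather than a plain "bt" so that every thread is covered.

```python
import shlex
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace for every thread, and then exits (batch mode)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def collect_backtrace(pid, timeout=60.0):
    """Run gdb against a live process and return whatever it printed.

    Hypothetical helper: needs gdb installed and permission to attach.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # gdb may exit non-zero if the attach is refused
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # PIDs taken from the process listing above; replace as needed.
    for pid in (54451, 54477):
        # Only print the command here; call collect_backtrace(pid)
        # to actually attach.
        print(shlex.join(gdb_backtrace_command(pid)))
```

The same call should work for any other PID I can attach to, for example one of the whale-dbg ranks on a compute node, as long as it is run on the node where that process lives.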

–Barna

> On 12 Jan 2017, at 17:51, Roy Stogner <royst...@ices.utexas.edu> wrote:
> 
> 
> On Thu, 12 Jan 2017, Barna Becsek wrote:
> 
>> What I meant was the program will not exit gather_neighboring_elements. I 
>> think the processes are still running.
> 
> Right.  But you can e.g. attach gdb to a running process to get a
> stack trace.  If there's an infinite loop then we can at least find
> out *where* it's looping.
> ---
> Roy


_______________________________________________
Libmesh-users mailing list
Libmesh-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-users
