Hello Jeff,

"Jeff Squyres (jsquyres)" <jsquy...@cisco.com> writes:

> With THREAD_FUNNELED, it means that there can only be one thread in
> MPI at a time -- and it needs to be the same thread as the one that
> called MPI_INIT_THREAD.
>
> Is that the case in your app?


The master rank (i.e. rank 0) never creates threads. The other ranks go
through the following code to communicate with it, which checks that only
the master thread (tid == 0) does the communicating:

,----
|        tid = 0                                                          
| #ifdef _OPENMP                                                          
|        tid = omp_get_thread_num()                                       
| #endif                                                                  
|                                                                         
|        do                                                               
|           if (tid == 0) then                                            
|              call mpi_send(my_rank, 1, mpi_integer, master, ask_job, &  
|                   mpi_comm_world, mpierror)                             
|              call mpi_probe(master,mpi_any_tag,mpi_comm_world,stat,mpierror)
|                                                                             
|              if (stat(mpi_tag) == stop_signal) then                         
|                 call mpi_recv(b_,1,mpi_integer,master,stop_signal, &        
|                      mpi_comm_world,stat,mpierror)                          
|              else                                                           
|                 call mpi_recv(iyax,1,mpi_integer,master,give_job, &         
|                      mpi_comm_world,stat,mpierror)                          
|              end if                                                         
|           end if                                                            
|                                                                             
|           !$omp barrier
| 
|           [... actual work...]
`----
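
A minimal sketch of how the "master thread only" assumption could be checked explicitly, in case it helps narrow things down. With THREAD_FUNNELED, what matters is that the communicating thread is the one that called MPI_INIT_THREAD, and MPI_IS_THREAD_MAIN reports exactly that. The program name and the variable is_main are illustrative and not taken from pcorona_main.f90; the sketch assumes the `mpi` module and an OpenMP-enabled compile.

```fortran
program check_main_thread
  ! Sketch only: names are illustrative, not from the real application.
  use mpi
  !$ use omp_lib
  implicit none
  integer :: provided, mpierror, tid
  logical :: is_main

  call mpi_init_thread(mpi_thread_funneled, provided, mpierror)

  !$omp parallel default(none) private(tid, is_main, mpierror)
  tid = 0
  !$ tid = omp_get_thread_num()

  ! Only OpenMP thread 0 asks MPI whether it is the thread that
  ! called mpi_init_thread; the other threads stay out of MPI.
  if (tid == 0) then
     call mpi_is_thread_main(is_main, mpierror)
     if (.not. is_main) then
        print *, 'OpenMP thread 0 is not the MPI main thread'
     end if
  end if
  !$omp end parallel

  call mpi_finalize(mpierror)
end program check_main_thread
```

If the message is never printed, OpenMP thread 0 of the parallel region is the thread that initialised MPI, which is what THREAD_FUNNELED requires.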


> Also, what is your app doing at src/pcorona_main.f90:627?

It is the mpi_probe call above.


In case it helps to clarify things, my app follows a master-worker paradigm,
where rank 0 hands out jobs, and all MPI ranks > 0 just do the following:

,----
| !$OMP PARALLEL DEFAULT(NONE)
| do
|   ! (the code above): if (tid == 0), receive a job number or the
|   ! stop signal, followed by the barrier
|  
|   !$OMP DO schedule(dynamic)
|   loop_izax: do izax=sol_nz_min,sol_nz_max
| 
|      [big computing loop body]
| 
|   end do loop_izax              
|   !$OMP END DO                  
| 
|   if (tid == 0) then                                             
|       call mpi_send(iyax,1,mpi_integer,master,results_tag, &     
|            mpi_comm_world,mpierror)                              
|       call mpi_send(stokes_buf_y,nz*8,mpi_double_precision, &    
|            master,results_tag,mpi_comm_world,mpierror)           
|   end if                                                         
|                                                                  
|   !$omp barrier
|                                                                  
| end do                                                           
| !$OMP END PARALLEL  
`----



Following Gilles' suggestion, I also tried changing MPI_THREAD_FUNNELED
to MPI_THREAD_MULTIPLE just in case, but I get the same segmentation
fault at the same line (mind you, the segmentation fault doesn't happen
every time). But again, there are no issues when running with
--bind-to socket (and no apparent issues at all on the other computer,
even with --bind-to none).
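
When switching the requested level, it may also be worth confirming what the library actually grants, since MPI_INIT_THREAD can return a lower level than the one asked for. A minimal sketch, assuming the `mpi` module; the program name and the variable required are illustrative and not from the real code:

```fortran
program check_thread_level
  ! Sketch only: names are illustrative, not from the real application.
  use mpi
  implicit none
  integer :: required, provided, mpierror

  required = mpi_thread_multiple
  call mpi_init_thread(required, provided, mpierror)

  ! The thread-level constants are ordered, so a plain integer
  ! comparison tells whether the request was honoured.
  if (provided < required) then
     print *, 'requested thread level ', required, ' but got ', provided
  end if

  call mpi_finalize(mpierror)
end program check_thread_level
```

The same comparison works for MPI_THREAD_FUNNELED by changing the value assigned to required.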

Many thanks for any suggestions,
-- 
Ángel de Vicente

Tel.: +34 922 605 747
Web.: http://research.iac.es/proyecto/polmag/
