I'd have to check to be sure, but I'm fairly confident it's because we
activate lots of locks in threaded mode that aren't there in single-threaded
mode.

I say this because I *assume* that when you say you use MPI_INIT_THREAD, you
are actually calling it with MPI_THREAD_MULTIPLE.  Calling MPI_INIT should be
exactly equivalent to calling MPI_INIT_THREAD with MPI_THREAD_SINGLE.


On Nov 1, 2010, at 12:46 PM, <ananda.mu...@wipro.com> wrote:

> Hi
>  
> I have the following small program in which the rank-0 process sleeps and 
> then all the processes perform barrier().
> #include "mpi.h"
> #include <stdio.h>
> #include <unistd.h>   /* sleep() */
> int main(int argc, char *argv[])
> {
>     int rank, nprocs;
>  
>     MPI_Init(&argc,&argv);
>     MPI_Comm_size(MPI_COMM_WORLD,&nprocs);
>     MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>     /* Rank 0 reaches the barrier 60 seconds late. */
>     if (rank == 0) sleep(60);
>     /* All other ranks wait here until rank 0 arrives. */
>     MPI_Barrier(MPI_COMM_WORLD);
>     printf("Hello, world.  I am %d of %d\n", rank, nprocs);
>     fflush(stdout);
>     MPI_Finalize();
>     return 0;
> }
>  
> When I run this program on two nodes consuming 16 cores, I see that the 
> non-rank-0 processes, which are waiting for the rank-0 process to reach 
> barrier(), are consuming only user time. I was expecting this behavior and 
> have no questions about it.
>  
> However, if I initialize MPI threads by replacing MPI_Init() with 
> MPI_Init_thread(), I see quite different behavior from this program. While 
> the rank-0 process is sleeping, all non-rank-0 processes seem to be spending 
> time in kernel mode (thus increasing system time) instead of waiting in user 
> mode.
>  
> Following is the sar output on the node where rank-0 process is running
> Node1> sar 2 10
> Linux 2.6.18-128.1.10.el5-perfctr (Node1)  10/29/2010
>  
> 02:33:51 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
> 02:33:53 PM       all      6.69      0.00     80.88      0.00      0.00     12.44
> 02:33:55 PM       all      6.56      0.00     81.00      0.00      0.00     12.44
> 02:33:57 PM       all      6.62      0.00     80.89      0.00      0.00     12.49
> 02:33:59 PM       all      6.68      0.00     80.89      0.00      0.00     12.43
> 02:34:01 PM       all      6.69      0.00     81.00      0.00      0.00     12.31
> 02:34:03 PM       all      6.75      0.00     80.76      0.00      0.00     12.49
> 02:34:05 PM       all      6.75      0.00     80.82      0.00      0.00     12.43
> 02:34:07 PM       all      6.75      0.00     81.19      0.00      0.00     12.06
> 02:34:09 PM       all      6.93      0.00     80.64      0.00      0.00     12.43
> 02:34:11 PM       all      6.75      0.00     80.81      0.00      0.00     12.44
> Average:          all      6.72      0.00     80.89      0.00      0.00     12.40
>  
> And following is the sar output on the second node:
> Node2> sar 2 10
> Linux 2.6.18-128.1.10.el5-perfctr (Node2)  10/29/2010
>  
> 02:33:48 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
> 02:33:50 PM       all      6.37      0.00     93.63      0.00      0.00      0.00
> 02:33:52 PM       all      6.19      0.00     93.81      0.00      0.00      0.00
> 02:33:54 PM       all      6.31      0.00     93.69      0.00      0.00      0.00
> 02:33:56 PM       all      6.50      0.00     93.50      0.00      0.00      0.00
> 02:33:58 PM       all      6.81      0.00     93.19      0.00      0.00      0.00
> 02:34:00 PM       all      6.56      0.00     93.44      0.00      0.00      0.00
> 02:34:02 PM       all      6.50      0.00     93.50      0.00      0.00      0.00
> 02:34:04 PM       all      6.50      0.00     93.50      0.00      0.00      0.00
> 02:34:06 PM       all      6.56      0.00     93.44      0.00      0.00      0.00
> 02:34:08 PM       all      6.68      0.00     93.32      0.00      0.00      0.00
> Average:          all      6.50      0.00     93.50      0.00      0.00      0.00
>  
> Can someone please explain the difference in behavior of barrier() call when 
> I use MPI_Init() vs MPI_Init_thread()?
>  
> Thanks
> Ananda
>  
> Ananda B Mudar, PMP
> Senior Technical Architect
> Wipro Technologies
> Ph: 972 765 8093
> ananda.mu...@wipro.com


-- 
Jeff Squyres
jsquy...@cisco.com

