On Thursday 24 February 2011, Fabien wrote:
> I am trying to profile a parallel code on a large cluster
> (8 cores per node; Intel MPI and Open MPI available) with:
>
>   mpirun ... valgrind --tool=callgrind --base=$PWD/callgrind.out ./exe
>
> ++ Everything seems fine when I profile on 1 to 8 cores (1 node),
>    and the CPU time is acceptable (10 min with 8 cores).
> -- But I get some empty callgrind.out.* files when I profile on 16
>    cores or more (computing on different nodes);
> -- and the CPU time increases dramatically, to ~1 day, from 16 cores
>    onwards.
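A side note on the invocation above: when several ranks on different nodes share one output base, files can end up clobbered or empty. A minimal sketch of per-rank output naming, assuming Open MPI (which exports OMPI_COMM_WORLD_RANK to each process; Intel MPI sets PMI_RANK instead) and Valgrind's %q{VAR} file-name expansion:

  # Hedged sketch, not from the original thread: give each MPI rank its
  # own Callgrind output file. %q{OMPI_COMM_WORLD_RANK} is expanded by
  # Valgrind itself from the rank's environment, so every process on
  # every node writes to a distinct file under the shared $PWD.
  mpirun -np 16 \
    valgrind --tool=callgrind \
             --callgrind-out-file=$PWD/callgrind.out.%q{OMPI_COMM_WORLD_RANK} \
             ./exe

If the files are still empty with distinct names, the problem is more likely in how the remote ranks are launched or terminated than in the output path.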
The difference between an 8-core and a 16-core run is that inter-node MPI communication comes into play. No idea what could happen there. Do the profile results give any hints? How big is the slowdown with the "none" tool?

Josef
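A sketch of the baseline test Josef suggests: the "none" tool runs the program under Valgrind's core with no analysis, so it isolates the basic Valgrind overhead from Callgrind's instrumentation cost (the -np value here is illustrative):

  # Hedged sketch: re-run the failing 16-core case under --tool=none.
  # If this is already ~100x slower than the native run, the slowdown
  # comes from running under Valgrind across nodes at all, not from
  # Callgrind specifically.
  mpirun -np 16 valgrind --tool=none ./exe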