Yujie,

Funny you should ask: Derek and I (Cody) are working on a talk for the upcoming SIAM conference illustrating the various tradeoffs of using a hybrid threaded-MPI scheme in libMesh. I've been running various combinations of threads and MPI processes for the last two weeks to solve a variety of problems on our large parallel system.

The libMesh developers have done a good job of wrapping the TBB APIs so that you can embed those functions in your code and they still work whether you have TBB or not. Check out the Threads namespace for these functions. There is a fair amount of work involved, as you'll need to create functors out of your hotspots and call them with constructs like parallel_for, but the work may have big payoffs for your codes.

I wish I had a better example for you than just explaining at a high level what to do, but our threaded loops are buried in our own custom framework and won't stand well as examples on their own.

Hope this helps,
Cody

On Feb 11, 2010, at 4:00 PM, Yujie wrote:

> Dear Derek,
>
> Thank you for your comments.
> I am wondering whether multithreading can significantly reduce the cost of
> MeshCommunication. If so, can the current libMesh support multithreading with
> MPI? Is there an example of it? Thanks a lot.
>
> Regards,
> Yujie
>
> On Thu, Feb 11, 2010 at 4:38 PM, Derek Gaston <[email protected]> wrote:
>
>> This plot looks reasonable to me. For a normal linear problem... I
>> wouldn't expect to be able to scale a 350,000 dof problem (that only takes
>> 30 seconds in serial) over about 10 cpus... just too much communication
>> involved (which is what you're seeing).
>>
>> I'm not saying that we couldn't do something better in MeshCommunication
>> (or that there might be a bug)... I'm just trying to say that you shouldn't
>> expect this problem to scale all that well. Try a bigger (or harder)
>> problem and I bet that plot changes.
>>
>> Derek
>>
>> On Feb 11, 2010, at 3:23 PM, Yujie wrote:
>>
>> Dear John,
>>
>> Please check the attached figure. It is better to show the problem.
>> I will test it according to your advice. I am just wondering whether it is
>> reasonable.
>> Thanks a lot.
>>
>> Regards,
>> Yujie
>>
>> On Thu, Feb 11, 2010 at 4:19 PM, John Peterson <
>> [email protected]> wrote:
>>
>>> Yujie,
>>>
>>> I don't understand the numbers you posted, not least of which because
>>> the columns don't line up in my email...
>>>
>>> MeshCommunication is a class that does a lot of different things, is
>>> that what you are referring to in the second column?
>>>
>>> How about attaching some actual perflog logs? How about adding some
>>> additional logging to parallel_sort if that part is getting slower,
>>> and seeing which part of it takes the longest?
>>>
>>> --
>>> John
>>>
>>
>> <forlibmesh.jpg>
>>
>>
>>
> ------------------------------------------------------------------------------
> SOLARIS 10 is the OS for Data Centers - provides features such as DTrace,
> Predictive Self Healing and Award Winning ZFS. Get Solaris 10 NOW
> http://p.sf.net/sfu/solaris-dev2dev
> _______________________________________________
> Libmesh-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/libmesh-users
