On Nov 11, 2010, at 8:24 PM, Mark F. Adams wrote:

> This is a great technical discussion on the very vexing question of future
> programming models.
>
> In addition to these issues there are facts on the ground. My limited view
> of this elephant, if you will, is that OpenMP seems to be getting a certain
> critical mass, for better or worse. We may not see as homogeneous a world in
> the future as we had, say, 10-15 years ago, when MPI + C/FORTRAN was dominant
> and supported well (with some exceptions) everywhere, but I think we could
> see some coalescence around MPI + OpenMP + ? + C/F.
>
> Anecdotally, I can say that I've had several discussions about potential
> development with PETSc where I've had to say "well, we can just add threads
> to PETSc ourselves, I know where the loops are". As a big PETSc fan this is
> a bit awkward; it would be nice to be able to say something like "PETSc has
> some support for threads ...".
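(As a rough illustration of what "adding threads to the loops" means in
practice, a hand-threaded, PETSc-style vector kernel might look like the
sketch below. This is not actual PETSc source; the function name is made up
and the pragma placement is only meant to show where the thread-level
parallelism would go.)

  /* Illustrative only: a VecAXPY-like kernel (y = alpha*x + y) with a
     hand-inserted OpenMP pragma.  Real PETSc vector kernels carry more
     machinery (error checking, logging, multiple vector types). */
  void axpy_threaded(int n, double alpha, const double *x, double *y)
  {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
      y[i] += alpha * x[i];
    }
  }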
I need a good student or equivalent to work with to do this; as Mark notes it
is not that hard, but it needs someone focused on it. Anybody have someone
available?

   Barry

> Simple OpenMP will not survive for long because of the data locality issues,
> but perhaps there is a path vis-a-vis Aron's comments on "MPI-like"
> constructs.
>
> Anyway, I have only anecdotal experience and no quantitative data on this,
> but OpenMP may get too big to ignore, as C++ has: not by being particularly
> good but by being used a lot.
>
> Mark
>
> On Nov 11, 2010, at 8:34 PM, Barry Smith wrote:
>
>>
>> On Nov 11, 2010, at 7:22 PM, Jed Brown wrote:
>>
>>> On Fri, Nov 12, 2010 at 02:18, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> How do you get adaptive load balancing (across the cores inside a process)
>>> if you have the OpenMP compiler decide the partitioning/parallelism? This
>>> was Bill's point about why not to use OpenMP. For example, if you give
>>> each core the same amount of work up front, they will end up not finishing
>>> at the same time, so you have wasted cycles.
>>>
>>> Hmm, I think this issue is largely subordinate to the memory locality
>>> issue (for the sort of work we usually care about), but OpenMP could be
>>> more dynamic about distributing work. I.e. this could be an OpenMP
>>> implementation or tuning issue, but I don't see it as a fundamental
>>> disadvantage of that programming model. I could be wrong.
>>
>> You are probably right; your previous explanation was better. Here is
>> something related that Bill and I discussed: static load balancing has
>> lower overhead, while dynamic has more. Static load balancing, however,
>> will end up with some imbalance. Thus one could do an up-front static load
>> balancing of most of the data, and then, when the first cores run out of
>> their static work, they do the rest of the work with dynamic balancing.
>>
>> Barry
>>
>>> Jed
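(To make the static-then-dynamic idea above concrete, here is a minimal
OpenMP sketch. The 90% static fraction and the chunk size of 64 are arbitrary
illustrative choices, the helper work() is hypothetical, and the iterations
are assumed to be independent.)

  /* Hybrid load balancing sketch: the bulk of the iterations is handed out
     statically (low scheduling overhead); the remainder is pulled
     dynamically in small chunks by whichever threads finish their static
     share first. */
  void process_hybrid(int n, void (*work)(int))
  {
    const int nstatic = (int)(0.9 * n);  /* portion assigned up front */

    #pragma omp parallel
    {
      /* Static part: fixed contiguous blocks, one per thread; nowait lets
         early finishers move on without waiting at a barrier. */
      #pragma omp for schedule(static) nowait
      for (int i = 0; i < nstatic; i++) work(i);

      /* Dynamic part: leftover iterations are grabbed chunk by chunk by
         whichever threads get here first. */
      #pragma omp for schedule(dynamic, 64)
      for (int i = nstatic; i < n; i++) work(i);
    }
  }

The nowait on the first loop is what lets threads that exhaust their static
share early start pulling leftover chunks immediately, which is the behavior
described above.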
