On Nov 11, 2010, at 8:24 PM, Mark F. Adams wrote:

> This is a great technical discussion on the very vexing question of future
> programming models.
>
> In addition to these issues there are facts on the ground. My limited view
> of this elephant, if you will, is that OpenMP seems to be getting a certain
> critical mass, for better or worse. We may not see as homogeneous a world in
> the future as we had, say, 10-15 years ago, when MPI + C/FORTRAN was dominant
> and supported well (with some exceptions) everywhere, but I think we could
> see some coalescence around MPI + OpenMP + ? + C/F.
>
> Anecdotally, I can say that I've had several discussions about potential
> development with PETSc where I've had to say "well, we can just add threads
> to PETSc ourselves, I know where the loops are". As a big PETSc fan this is
> a bit awkward; it would be nice to be able to say something like "PETSc has
> some support for threads ...".
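(As a rough illustration of what "adding threads to the loops" means in
practice, a hand-threaded, PETSc-style vector kernel might look like the
sketch below. This is not actual PETSc source; the function name is made up
and the pragma placement is only meant to show where the thread-level
parallelism would go.)

  /* Illustrative only: a VecAXPY-like kernel (y = alpha*x + y) with a
     hand-inserted OpenMP pragma.  Real PETSc vector kernels carry more
     machinery (error checking, logging, multiple vector types). */
  void axpy_threaded(int n, double alpha, const double *x, double *y)
  {
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
      y[i] += alpha * x[i];
    }
  }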
I need a good student or equivalent to work with to do this; as Mark notes it
is not that hard, but it needs someone focused on it. Anybody have someone
available?

   Barry

> Simple OpenMP will not survive for long because of the data locality issues,
> but perhaps there is a path vis-a-vis Aron's comments on "MPI-like"
> constructs.
>
> Anyway, I have only anecdotal experience and no quantitative data on this,
> but OpenMP may get too big to ignore, as C++ has: not by being particularly
> good but by being used a lot.
>
> Mark
>
> On Nov 11, 2010, at 8:34 PM, Barry Smith wrote:
>
>>
>> On Nov 11, 2010, at 7:22 PM, Jed Brown wrote:
>>
>>> On Fri, Nov 12, 2010 at 02:18, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>> How do you get adaptive load balancing (across the cores inside a process)
>>> if you have the OpenMP compiler decide the partitioning/parallelism? This
>>> was Bill's point about why not to use OpenMP. For example, if you give
>>> each core the same amount of work up front, they will end up not finishing
>>> at the same time, so you have wasted cycles.
>>>
>>> Hmm, I think this issue is largely subordinate to the memory locality
>>> issue (for the sort of work we usually care about), but OpenMP could be
>>> more dynamic about distributing work. I.e. this could be an OpenMP
>>> implementation or tuning issue, but I don't see it as a fundamental
>>> disadvantage of that programming model. I could be wrong.
>>
>> You are probably right; your previous explanation was better. Here is
>> something related that Bill and I discussed: static load balancing has
>> lower overhead, while dynamic has more. Static load balancing, however,
>> will end up with some imbalance. Thus one could do an up-front static load
>> balancing of most of the data, and then, when the first cores run out of
>> their static work, they do the rest of the work with dynamic balancing.
>>
>> Barry
>>
>>> Jed
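(To make the static-then-dynamic idea above concrete, here is a minimal
OpenMP sketch. The 90% static fraction and the chunk size of 64 are arbitrary
illustrative choices, the helper work() is hypothetical, and the iterations
are assumed to be independent.)

  /* Hybrid load balancing sketch: the bulk of the iterations is handed out
     statically (low scheduling overhead); the remainder is pulled
     dynamically in small chunks by whichever threads finish their static
     share first. */
  void process_hybrid(int n, void (*work)(int))
  {
    const int nstatic = (int)(0.9 * n);  /* portion assigned up front */

    #pragma omp parallel
    {
      /* Static part: fixed contiguous blocks, one per thread; nowait lets
         early finishers move on without waiting at a barrier. */
      #pragma omp for schedule(static) nowait
      for (int i = 0; i < nstatic; i++) work(i);

      /* Dynamic part: leftover iterations are grabbed chunk by chunk by
         whichever threads get here first. */
      #pragma omp for schedule(dynamic, 64)
      for (int i = nstatic; i < n; i++) work(i);
    }
  }

The nowait on the first loop is what lets threads that exhaust their static
share early start pulling leftover chunks immediately, which is the behavior
described above.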
