Mark Adams writes:
>>
>> I don't know if Chris has ever lived there. And he's great, but GFDL is
>> an application, not a library.
>>
>
> GFDL is a lab next door to PPPL.
My recollection (perhaps flawed) is that Chris was remote.
On Mon, Jul 9, 2018 at 10:04 AM, Jed Brown wrote:
> Jeff Hammond writes:
>
> > This is the textbook Wrong Way to write OpenMP and the reason that the
> > thread-scalability of DOE applications using MPI+OpenMP sucks. It leads to
> > codes that do fork-join far too often and suffer from death
>
> I don't know if Chris has ever lived there. And he's great, but GFDL is
> an application, not a library.
>
GFDL is a lab next door to PPPL.
Jeff Hammond writes:
> If PETSc was an application, it could do whatever it wanted, but it's not.
> If PETSc is a library that intends to meet the needs of HPC applications,
> it needs to support the programming models the applications are using. Or
> I suppose you will continue to disparage
On Mon, Jul 9, 2018 at 7:19 PM Jeff Hammond wrote:
>
>
> On Mon, Jul 9, 2018 at 7:38 AM, Mark Adams wrote:
>
>> I agree with Matt's comment and let me add (somewhat redundantly)
>>
>>
>>> This isn't how you'd write MPI, is it? No, you'd figure out how to
>>> decompose your data properly to
> If PETSc was an application, it could do whatever it wanted, but it's
> not. If PETSc is a library that intends to meet the needs of HPC
> applications, it needs to support the programming models the applications
> are using.
>
To repeat, PETSc supports threads, currently with MKL kernels and
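A minimal sketch of how an application opts into those MKL-backed kernels
(my own illustration; the helper name and exact call sequence are
assumptions, not code from this thread):

  /* Sketch: select the MKL-backed AIJ format so SpMV/SpMM go through
     MKL's threaded kernels, with no OpenMP in the calling code itself. */
  #include <petscmat.h>

  PetscErrorCode CreateMKLMatrix(MPI_Comm comm, PetscInt n, Mat *A)
  {
    PetscErrorCode ierr;
    ierr = MatCreate(comm, A);CHKERRQ(ierr);
    ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
    ierr = MatSetType(*A, MATAIJMKL);CHKERRQ(ierr);   /* or -mat_type aijmkl at runtime */
    ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
    ierr = MatSetUp(*A);CHKERRQ(ierr);
    return 0;
  }

How many threads the kernels use is then controlled by MKL itself (e.g.
the MKL_NUM_THREADS environment variable), not by the PETSc caller.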
Jeff Hammond writes:
> This is the textbook Wrong Way to write OpenMP and the reason that the
> thread-scalability of DOE applications using MPI+OpenMP sucks. It leads to
> codes that do fork-join far too often and suffer from death by Amdahl,
> unless you do a second pass where you fuse all
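To make the fork-join complaint concrete, here is my own illustration of
the pattern being criticized (not code from any of the applications
discussed): each short loop opens and closes its own parallel region, so
the threads fork, hit an implicit barrier, and join at every step.

  /* Anti-pattern: three tiny parallel regions, three fork-join barriers. */
  void scale_and_add(double *x, double *y, double *z, int n, double a)
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) x[i] *= a;       /* fork-join #1 */

    #pragma omp parallel for
    for (int i = 0; i < n; i++) y[i] += x[i];    /* fork-join #2 */

    #pragma omp parallel for
    for (int i = 0; i < n; i++) z[i] = y[i];     /* fork-join #3 */
  }

The serial glue between such regions is the "death by Amdahl" part; the
fix Jeff alludes to is fusing the loops so the region is entered once.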
I agree with Matt's comment and let me add (somewhat redundantly)
> This isn't how you'd write MPI, is it? No, you'd figure out how to
> decompose your data properly to exploit locality and then implement an
> algorithm that minimizes communication and synchronization. Do that with
> OpenMP.
>
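A sketch of the restructuring being advocated, under the assumption of a
simple 1-D block decomposition (again my illustration, not code from the
thread): each thread owns one contiguous slice for the whole computation
and the parallel region is entered once, not once per loop.

  /* One long-lived parallel region; each thread touches only its own
     block, so the data stays local to that thread and no barriers are
     needed between the steps below. */
  #include <omp.h>

  void scale_and_add_fused(double *x, double *y, double *z, int n, double a)
  {
    #pragma omp parallel
    {
      int t  = omp_get_thread_num();
      int nt = omp_get_num_threads();
      int lo = (int)((long long)n * t / nt);        /* this thread's slice */
      int hi = (int)((long long)n * (t + 1) / nt);

      for (int i = lo; i < hi; i++) x[i] *= a;
      for (int i = lo; i < hi; i++) y[i] += x[i];
      for (int i = lo; i < hi; i++) z[i] = y[i];
    }
  }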
On Fri, Jul 6, 2018 at 4:28 PM, Smith, Barry F. wrote:
>
> Richard,
>
> The problem is that OpenMP is too large and has too many different
> programming models imbedded in it (and it will get worse) to "support
> OpenMP" from PETSc.
>
This is also true of MPI. You can write CSP, BSP,
The number of people who are using PETSc with HPF for well-thought-out
reasons is the same as the number of people using OpenMP with PETSc for
well-thought-out reasons.
That said, as you saw in a previous email, I have no problem with pull
requests that provide some "OpenMP usage"
"Smith, Barry F." writes:
> You could use your same argument to argue PETSc should do "something" to
> help people who have (rightly or wrongly) chosen to code their application in
> High Performance Fortran or any other similar inane parallel programming
> model.
If a large fraction
Hi all,
(...)Since it looks like MPI endpoints are going to be a long time (or
possibly forever) in coming, I think we need (a) stopgap plan(s) to
support this crappy MPI + OpenMP model in the meantime. One possible
> approach is to do what Mark is trying to do with MKL: Use a third
Richard,
The problem is that OpenMP is too large and has too many different
programming models imbedded in it (and it will get worse) to "support OpenMP"
from PETSc.
One way to use #pragma based optimization tools (which is one way to treat
OpenMP) is to run the application code
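To illustrate the "pragma based optimization tool" view (my example, not
necessarily what Barry has in mind): the annotation below is pure advice
to the compiler, and a build without OpenMP simply ignores it.

  /* Compiled without -fopenmp/-qopenmp this is plain serial C; with
     OpenMP enabled the loop is threaded and vectorized, and the result
     differs only by floating-point rounding from reassociation. */
  double dot(const double *x, const double *y, int n)
  {
    double sum = 0.0;
    #pragma omp parallel for simd reduction(+:sum)
    for (int i = 0; i < n; i++) sum += x[i] * y[i];
    return sum;
  }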
True, Barry. But, unfortunately, I think Jed's argument has something to it
because the hybrid MPI + OpenMP model has become so popular. I know of a
few codes where adopting this model makes some sense, though I believe
that, more often, the model has been adopted simply because it is the
>
>
> Please share the results of your experiments that prove OpenMP does not
> improve performance for Mark’s users.
>
This obviously does not "prove" anything but my users use OpenMP primarily
because they do not distribute their mesh metadata. They cannot replicate
the mesh on every core, on
Jed,
You could use your same argument to argue PETSc should do "something" to
help people who have (rightly or wrongly) chosen to code their application in
High Performance Fortran or any other similar inane parallel programming model.
Barry
Hi Mark,
I'm glad to see you trying out the AIJMKL stuff. I think you are the first
person trying to actually use it, so we are probably going to expose some
bugs and also some performance issues. My somewhat limited testing has
shown that the MKL sparse routines often perform worse than our own
GAMG drills into AIJ data structures and will need to be fixed up to work
with MKL matrices, I guess, but it is failing now from a logic error.
This example works with one processor but fails with 2 (appended). The code
looks like this:
ierr =
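Purely as an illustration of the setup being described (the function name
and call sequence are my guesses, not the code from Mark's message), a
GAMG solve on an MKL-backed matrix might look roughly like this:

  /* Hypothetical sketch, not Mark's code: convert an assembled AIJ
     matrix to the MKL-backed type and solve with GAMG. */
  #include <petscksp.h>

  PetscErrorCode SolveWithGAMG(Mat A, Vec b, Vec x)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    ierr = MatConvert(A, MATAIJMKL, MAT_INPLACE_MATRIX, &A);CHKERRQ(ierr);
    ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCGAMG);CHKERRQ(ierr);       /* algebraic multigrid */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);      /* honors -ksp_* and -pc_gamg_* options */
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    return 0;
  }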