On Fri, May 7, 2010 at 2:24 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> On May 7, 2010, at 1:04 PM, Matthew Knepley wrote:
>
> On Thu, May 6, 2010 at 10:16 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>> I'd like to add an MPI_Comm as the first argument to PetscError() and
>> friends.
>>
>> In this way, if the same error is known over all the communicator ranks
>> it can print just one nice error message and stack instead of spewing out
>> many of the same messages all over the place.
>>
>> Does anyone object to this?
>>
>
> I am just worried that it will introduce deadlocks. If an error occurs on
> only one process and not another (like a NaN), but we use the entire
> communicator, we can get deadlock on the error message which will be very
> confusing.
>
>
> The idea is that by default we would pass in PETSC_COMM_SELF. Only when
> we KNOW 100% that ALL ranks in the comm WILL FOR ABSOLUTE sure generate the
> same error would we pass the entire comm to SETERRQ(). For example, if the
> user has set an invalid PC type etc. So a process generating a NAN would use
> only a PETSC_COMM_SELF in the SETERRQ().
>
> You are right that there is a chance when totally bizarre shit happens
> that rank 1 of a comm calls SETERRQ() but rank 0 does not; then no
> appropriate error message will be printed. I don't see a way to totally
> avoid this chance. So we can
>
> 1) ignore this chance and make the change and see what happens
> 2) leave things the same as they are now.
>
> Even if it turns out we cannot have only rank 0 of the SETERRQ() print
> the message because bizarre shit happens too often, I think conceptually it
> is the right thing to do to pass in the MPI_Comm over which the error
> happens. So I'd like to make the change and we can always take out the
> control over printing by rank 0 if it is a problem (i.e. the default error
> handlers could ignore the comm).
>
> I'm going to try this and see if I can work out the kinks before
> pushing.
>
> Is there a nice MPI way of checking whether everyone is present, and if
> not then just use the current method?
>
>
> No, absolutely not since that would require communication with everyone
> who may not be there.
>

I was thinking of something like a barrier+probe+cancel (if necessary).

   Matt

> Barry
>
>
> Matt
>
>
>> It does mean for each SETERRQXXX() we call we need to select the correct
>> comm that is passed in. I will do all that, worst case just use
>> MPI_COMM_SELF for some and get the same effect as today.
>>
>> Barry
>>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
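
The trade-off in this thread can be modelled without MPI at all. What follows is a minimal sketch under stated assumptions, not PETSc code: `ToyComm`, `toy_error()` and `toy_count_messages()` are hypothetical names invented for illustration, and a communicator is reduced to a (rank, size) pair. In real code the rank would come from `MPI_Comm_rank()`. The sketch counts how many messages appear when some ranks hit the same error, either reporting on a shared communicator (only rank 0 prints) or each on its own "self" communicator (every failing rank prints).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an MPI communicator as seen by one process. */
typedef struct {
  int rank; /* this process's rank within the communicator */
  int size; /* number of processes in the communicator */
} ToyComm;

/* Toy error routine taking the communicator as its first argument.
   Only rank 0 of that communicator prints; returns 1 if it printed. */
static int toy_error(ToyComm comm, const char *func, const char *msg)
{
  if (comm.rank != 0) return 0;
  fprintf(stderr, "[rank 0 of %d] %s(): %s\n", comm.size, func, msg);
  return 1;
}

/* Simulate ranks first_failing..nranks-1 all hitting the same error.
   collective != 0: each reports on the shared communicator.
   collective == 0: each reports on its own one-process communicator.
   Returns the total number of messages printed across all ranks. */
static int toy_count_messages(int nranks, int first_failing, int collective)
{
  int r, printed = 0;
  assert(nranks > 0 && first_failing >= 0);
  for (r = first_failing; r < nranks; r++) {
    ToyComm comm;
    if (collective) { comm.rank = r; comm.size = nranks; }
    else            { comm.rank = 0; comm.size = 1; }
    printed += toy_error(comm, "example_set_type", "invalid type requested");
  }
  return printed;
}
```

With four ranks that all fail, `toy_count_messages(4, 0, 1)` gives one message and `toy_count_messages(4, 0, 0)` gives four. If rank 0 does not fail, `toy_count_messages(4, 1, 1)` gives zero, which is the lost-message case Barry describes; the self-communicator variant still reports from the three failing ranks.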
