I see your point about setting MPI_ERR_PENDING on the internal status
versus the status returned by MPI_Waitall. As I mentioned, the reason
I choose to do that is to support the ompi_errhandler_request_invoke()
function. I could not think of an better way to fix this, so I'm open
to ideas.

MPI_Waitall returns MPI_ERR_PENDING for 'not completed' requests (or
sets the exposed status.MPI_ERROR field appropriately, depending on
the completion function used). The user must still call a completion
function to complete the request at which point the true error value
is reported (MPI_SUCCESS, or something else). So I think we are on the
same page here.

Setting the request->req_status.MPI_ERROR to MPI_ERR_PENDING in
MPI_Waitall does not prevent the OMPI system from overwriting that
value with the correct error (e.g., MPI_ERR_TRUNCATE). When the OMPI
system wants to set an error on the request it sets the value (without
checking the previous value in req_status.MPI_ERROR) before calling
the completion operation. We reset the value from MPI_ERR_PENDING to
MPI_SUCCESS at the start of the test/wait operations so that the rest
of the checks for (MPI_SUCCESS != req->req_status.MPI_ERROR) complete
appropriately in those functions.

So I am still failing to see why the patch is incorrect. Is it
'incorrect' or just 'not implemented the way you think it should be'?
If the latter, then I'd love to see a patch. If the former, then I
would really like to understand why.

Do we both agree that before this patch Open MPI did not support
MPI_ERR_PENDING?

-- Josh


On Wed, Mar 21, 2012 at 2:45 PM, George Bosilca <bosi...@eecs.utk.edu> wrote:
> My point was that MPI_ERR_PENDING should never be set on a specific request. 
> MPI_ERR_PENDING should only be returned in the array of statuses attached to 
> MPI_Waitall. Thus, there is no need to remove it from any request.
>
> In addition, there is another reason why this is unnecessary (and I was too 
> lazy to describe it in my previous email). The MPI_Waitall is only allowed to 
> return MPI_ERR_PENDING for "not completed" MPI request. Such requests will 
> therefore be completed later by the MPI library, and at that moment the error 
> in the status will be set to the correct value, or MPI_SUCCESS.
>
>  george.
>
> On Mar 21, 2012, at 14:32 , Josh Hursey wrote:
>
>> In the patch for errhandler_invoke.c, you can see that we need to
>> check for MPI_ERR_PENDING to make sure that we do not free the request
>> when we are trying to decide if we should invoke the error handler. So
>> setting the internal req->req_status.MPI_ERROR to MPI_ERR_PENDING made
>> it possible to check for this state in this code. Maybe you have a
>> better way around this, but that is why I chose to implement it this
>> way.
>>
>> I agree that MPI_ERR_PENDING is only to be set in MPI_Waitall. And, in
>> this patch, MPI_ERR_PENDING is only set in that operation. The other
>> operations must look for the MPI_ERR_PENDING and reset the value -
>> since we are using the internal status object to carry it around per
>> above.
>>
>> You said that the implementation in ompi_request_default_wait_all was
>> incorrect. Would you care to elaborate as to what specifically is
>> incorrect?
>>
>> -- Josh
>>
>> On Wed, Mar 21, 2012 at 2:17 PM, George Bosilca <bosi...@eecs.utk.edu> wrote:
>>> Josh,
>>>
>>> I don't agree that these changes are required. In the current standard 
>>> (2.2), MPI_ERR_PENDING is only allowed to be returned by MPI_WAITALL, in 
>>> some very specific conditions. Here is the snippet from the MPI standard 
>>> clarifying this behavior.
>>>
>>>> When one or more of the communications completed by a call to MPI_WAITALL 
>>>> fail, it is desireable to return specific information on each 
>>>> communication. The function MPI_WAITALL will return in such case the error 
>>>> code MPI_ERR_IN_STATUS and will set the error field of each status to a 
>>>> specific error code. This code will be MPI_SUCCESS, if the specific 
>>>> communication completed; it will be another specific error code, if it 
>>>> failed; or it can be MPI_ERR_PENDING if it has neither failed nor 
>>>> completed.
>>>
>>> As you can notice, the MPI_ERR_PENDING is only for the error in the status 
>>> array and not for the error field in the status attached to the MPI 
>>> request. Moreover, we don't use this inside Open MPI at all.
>>>
>>> The usage we're doing in the default_wait_all of MPI_ERR_PENDING is 
>>> incorrect as well. I will fix it, once we clarify the issue with your patch.
>>>
>>> Thanks,
>>>  george.
>>>
>>> On Mar 21, 2012, at 13:46 , jjhur...@osl.iu.edu wrote:
>>>
>>>> Author: jjhursey
>>>> Date: 2012-03-21 13:46:15 EDT (Wed, 21 Mar 2012)
>>>> New Revision: 26172
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26172
>>>>
>>>> Log:
>>>> Add support for MPI_ERR_PENDING - Per MPI 2.2 p 60
>>>>
>>>> Tested with:
>>>>  ompi-tests/mpich_tester/mpich_pt2pt/truncmult.c
>>>>
>>>>
>>>> Text files modified:
>>>>   trunk/ompi/errhandler/errhandler_invoke.c |     9 ++
>>>>   trunk/ompi/request/req_test.c             |    32 ++++++++++
>>>>   trunk/ompi/request/req_wait.c             |   120 
>>>> +++++++++++++++++++++++++++++++++++++--
>>>>   trunk/ompi/request/request.c              |     2
>>>>   trunk/ompi/request/request.h              |     5 +
>>>>   5 files changed, 158 insertions(+), 10 deletions(-)
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey

Reply via email to