Rainer Keller wrote:

Hi Terry,
On Wednesday 22 August 2007 16:22, Terry D. Dontje wrote:
I thought I would run this by the group before trying to unravel the
code and figure out how to fix the problem.  It looks to me from some
experimentation that when a process matches an unexpected message, the
PERUSE framework incorrectly fires a
PERUSE_COMM_MSG_MATCH_POSTED_REQ in addition to a
PERUSE_COMM_REQ_MATCH_UNEX event.  I believe this is wrong and that the
former event should not be fired in this case.
You are right, the former event PERUSE_COMM_MSG_MATCH_POSTED_REQ should not be fired, as this was an unexpected message.

If the above assumption is true, I think the problem arises because
the PERUSE_COMM_MSG_MATCH_POSTED_REQ event is fired in the function
mca_pml_ob1_recv_request_progress, which is called by
mca_pml_ob1_recv_request_match_specific when a match of an unexpected
message has occurred.  I am wondering if the
PERUSE_COMM_MSG_MATCH_POSTED_REQ event should be moved to a more
posted-queue-centric routine, something like mca_pml_ob1_recv_frag_match?
I believe this is correct -- at least it works for the large-message late-sender and late-receiver test program mpi_peruse.c.
Should be fixed with the committed patch v15947.
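
To make the intended split concrete, here is a toy model in C. It is not the ob1 code: every toy_* name, the fixed-size tag queues and the event log are hypothetical, and matching is reduced to comparing integer tags. It only illustrates the rule discussed above, namely that a newly posted receive which finds its message in the unexpected queue fires the REQ_MATCH_UNEX-style event alone, while the MSG_MATCH_POSTED_REQ-style event is fired only on the path where an arriving message finds a posted receive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the two PERUSE events discussed in this thread. */
enum toy_event {
    TOY_MSG_MATCH_POSTED_REQ, /* arriving message matched a posted receive */
    TOY_REQ_MATCH_UNEX        /* new receive matched an unexpected message */
};

#define TOY_MAX 8

/* Hypothetical state: two tag queues plus a log of fired events. */
struct toy_state {
    int posted[TOY_MAX];
    size_t n_posted;
    int unex[TOY_MAX];
    size_t n_unex;
    enum toy_event log[TOY_MAX];
    size_t n_log;
};

static void toy_fire(struct toy_state *s, enum toy_event ev)
{
    assert(s->n_log < TOY_MAX);
    s->log[s->n_log++] = ev;
}

/* Remove the first entry equal to tag; return true if one was found. */
static bool toy_take(int *q, size_t *n, int tag)
{
    for (size_t i = 0; i < *n; i++) {
        if (q[i] == tag) {
            for (size_t j = i + 1; j < *n; j++)
                q[j - 1] = q[j];
            (*n)--;
            return true;
        }
    }
    return false;
}

static void toy_put(int *q, size_t *n, int tag)
{
    assert(*n < TOY_MAX);
    q[(*n)++] = tag;
}

/* Receive side: a receive is posted. If the message is already in the
 * unexpected queue, only the "match unexpected" event is fired. */
static void toy_post_recv(struct toy_state *s, int tag)
{
    if (toy_take(s->unex, &s->n_unex, tag))
        toy_fire(s, TOY_REQ_MATCH_UNEX);
    else
        toy_put(s->posted, &s->n_posted, tag);
}

/* Message side: a message arrives. Only this path may fire the
 * "match posted request" event. */
static void toy_msg_arrives(struct toy_state *s, int tag)
{
    if (toy_take(s->posted, &s->n_posted, tag))
        toy_fire(s, TOY_MSG_MATCH_POSTED_REQ);
    else
        toy_put(s->unex, &s->n_unex, tag);
}
```

Starting from a zero-initialized struct toy_state, calling toy_msg_arrives() and then toy_post_recv() with the same tag models the late-receiver case and logs a single TOY_REQ_MATCH_UNEX; the reverse order models the late-sender case and logs a single TOY_MSG_MATCH_POSTED_REQ.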
Actually, there are two other items; one is a missing PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q...

This works for large posted messages but when the posted message is small
you don't see the unexpected messages at all now.

--td

Additionally, we have a problem: we fire the PERUSE_COMM_REQ_ACTIVATE event for MPI_*Probe function calls. The solution is to move the event out of pml_base_sendreq.h / pml_base_recv_req.h
and into
 pml_ob1_isend.c and pml_ob1_irecv.c, respectively.
Please see v15945.

With best regards,
Rainer
