This didn't happen until recently, but that was pure luck. Previously there was a pending queue in the SM BTL, so the message eventually got sent at some point without involving the PML. Anyway, as I said before, the problem could happen with any other BTL if we post the right number of non-blocking sends.
Here is the solution I propose. If you think there is any problem with it, please let me know asap.
Move the progress function from the BML layer back into the PML. Then the PML will have a way to check on its pending requests and progress them accordingly. This solution requires the same number of function calls as what we have today, and should affect performance only minimally (one more if in the progress function).
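To make the idea concrete, here is a minimal sketch of what such a PML-level progress function could look like. All names in it (pml_progress, bml_progress, btl_try_send, send_req_t, the fixed-size pending array) are hypothetical stand-ins for illustration, not the actual Open MPI symbols, and the toy transport with a counter of free send slots is an assumption made only so the example is self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Open MPI symbols. */
#define MAX_PENDING 8

typedef struct {
    int id;
    int sent;   /* set to 1 once the toy transport has accepted the request */
} send_req_t;

/* Toy transport state: a limited number of send slots. */
static int free_slots = 1;
static int in_flight  = 0;

/* Try to hand a request to the transport; returns 1 on success, 0 if full. */
static int btl_try_send(send_req_t *req)
{
    if (free_slots == 0) {
        return 0;
    }
    free_slots--;
    in_flight++;
    req->sent = 1;
    return 1;
}

/* Lower-layer progress: completes in-flight sends and frees their slots.
 * Returns the number of completions. */
static int bml_progress(void)
{
    int completed = in_flight;
    free_slots += in_flight;
    in_flight = 0;
    return completed;
}

/* Pending queue owned by the PML (FIFO, fixed size for the sketch). */
static send_req_t *pending[MAX_PENDING];
static size_t pending_count = 0;

/* Non-blocking send: 1 if sent now, 0 if queued, -1 if the queue is full. */
static int pml_isend(send_req_t *req)
{
    if (btl_try_send(req)) {
        return 1;
    }
    if (pending_count == MAX_PENDING) {
        return -1;
    }
    pending[pending_count++] = req;
    return 0;
}

/* Progress function living in the PML: progress the lower layer first,
 * then retry the PML's own pending requests in order. */
static int pml_progress(void)
{
    int events = bml_progress();

    if (pending_count > 0) {            /* the "one more if" */
        size_t done = 0;
        /* Stop at the first failure so the send order is preserved. */
        while (done < pending_count && btl_try_send(pending[done])) {
            done++;
            events++;
        }
        /* Shift the requests that are still pending to the front. */
        for (size_t i = done; i < pending_count; i++) {
            pending[i - done] = pending[i];
        }
        pending_count -= done;
    }
    return events;
}
```

With one free slot and three calls to pml_isend, the first request goes out immediately and the other two are queued; each later call to pml_progress frees the slot and pushes out the next queued request, so nothing stays stuck even though the transport itself keeps no pending queue.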
  george.

On Jun 25, 2008, at 4:06 AM, Lenny Verkhovsky wrote:
Hi,
I downloaded the new version from trunk and got the following:
1. opal_output for no reason (probably something was forgotten)
2. it got stuck.

/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 2 -hostfile hostfile_w4_8 ./osu_bw
[witch4:20920] Using eager rdma: 1
[witch4:20921] Using eager rdma: 1
# OSU MPI Bandwidth Test (Version 2.1)
# Size Bandwidth (MB/s)
(got stuck)

Lenny.