I have a fix for ticket 1944 working, but the broader problem is unpleasant. E.g., let's say we have zillions of uncountered (one-way) Bcasts or something. Say the root is repeatedly emitting sends but never polling its incoming FIFO. Return fragments will be accumulating, the FIFO will be congested, pending-send queues on peer processes will be growing, etc. The code now handles this by growing the pending-send queues and eventually draining them. (Pre-1.3.2, we would also have handled it, by growing the FIFO and using up the shared memory.) But the whole thing is disturbing. E.g., queues might drain only when the root reaches MPI_Finalize. (Okay, it's unclear to me what sort of real application would have communications going out from only one process.)

So, is this (one-way communications, e.g., repeated Bcasts) pathological and not worth worrying about? Or are other solutions worth considering? E.g., I'd like to have a sending process run mca_btl_sm_component_progress occasionally, even if it is successfully completing its sends.
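
To make that last idea concrete, here is a toy sketch in C. It is not the sm BTL code: every toy_* name is made up for illustration, toy_progress is only a stand-in for a call like mca_btl_sm_component_progress, and I'm assuming each successful send eventually puts one returned fragment in the sender's incoming FIFO. It just compares the FIFO high-water mark when the sender never polls with the mark when it polls every poll_interval sends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a one-way sender; not Open MPI code. */
typedef struct {
    size_t fifo_depth;       /* returned fragments waiting in the incoming FIFO */
    size_t sends_since_poll; /* successful sends since the last progress call */
    size_t max_fifo_depth;   /* high-water mark of fifo_depth */
} toy_sender_t;

/* Stand-in for a progress call: drains the incoming FIFO and
 * returns the number of fragments it handled. */
static size_t toy_progress(toy_sender_t *s)
{
    assert(s != NULL);
    size_t drained = s->fifo_depth;
    s->fifo_depth = 0;
    s->sends_since_poll = 0;
    return drained;
}

/* One successful send. Assumption: each send yields exactly one
 * returned fragment. A poll_interval of 0 means "never poll". */
static void toy_send(toy_sender_t *s, size_t poll_interval)
{
    s->fifo_depth++;
    if (s->fifo_depth > s->max_fifo_depth) {
        s->max_fifo_depth = s->fifo_depth;
    }
    s->sends_since_poll++;

    /* The proposal: poll occasionally even though the send succeeded. */
    if (poll_interval != 0 && s->sends_since_poll >= poll_interval) {
        (void)toy_progress(s);
    }
}

/* Issue nsends one-way sends and report the FIFO high-water mark. */
static size_t toy_run(size_t nsends, size_t poll_interval)
{
    toy_sender_t s = {0, 0, 0};
    for (size_t i = 0; i < nsends; i++) {
        toy_send(&s, poll_interval);
    }
    return s.max_fifo_depth;
}
```

In this model, toy_run(1000, 0) lets the backlog grow with the number of sends, while toy_run(1000, 16) caps it at the polling interval. The real cost question is how much an extra progress call adds to the fast path, which this sketch says nothing about.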
