Galen asked for a writeup of where the UD BTL support is at and what (important) issues remain, so here it is.

Right now, to ensure MPI's guaranteed-delivery semantics, the DR PML must be used with UD -- the UD BTL does not implement its own reliability. The best solution would be to implement a lightweight reliability protocol within the UD BTL; this would be most effective with a progress thread.
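
To make "lightweight reliability protocol" concrete, here is a minimal sketch of the sender-side bookkeeping such a protocol might need: per-peer sequence numbers, cumulative ACKs, and timeout-driven retransmission. None of this is taken from the UD BTL; the names (ud_sender_t, sender_send, sender_ack, sender_retransmit_due), the window size, and the abstract clock are all hypothetical, and the actual posting of sends is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 8  /* hypothetical limit on unacknowledged fragments per peer */

/* Hypothetical per-peer sender state; not an Open MPI structure. */
typedef struct {
    uint32_t next_seq;        /* sequence number for the next new fragment */
    uint32_t oldest_unacked;  /* lowest sequence number not yet acknowledged */
    uint64_t sent_at[WINDOW]; /* last (re)send time of each in-flight fragment */
} ud_sender_t;

static void sender_init(ud_sender_t *s)
{
    s->next_seq = 0;
    s->oldest_unacked = 0;
}

/* Number of fragments sent but not yet acknowledged. */
static uint32_t sender_in_flight(const ud_sender_t *s)
{
    return s->next_seq - s->oldest_unacked;
}

/* Record a new send at time `now`. Returns false if the window is full;
 * otherwise stores the assigned sequence number in *seq. */
static bool sender_send(ud_sender_t *s, uint64_t now, uint32_t *seq)
{
    assert(seq != NULL);
    if (sender_in_flight(s) >= WINDOW) {
        return false;
    }
    *seq = s->next_seq++;
    s->sent_at[*seq % WINDOW] = now;
    return true;
}

/* Cumulative ACK: the peer has received every fragment below `ack_seq`.
 * Stale or out-of-range ACKs are ignored (unsigned wraparound makes the
 * difference larger than the in-flight count in both cases). */
static void sender_ack(ud_sender_t *s, uint32_t ack_seq)
{
    if (ack_seq - s->oldest_unacked <= sender_in_flight(s)) {
        s->oldest_unacked = ack_seq;
    }
}

/* Count in-flight fragments whose timer has expired and restart their
 * timers. A progress thread (or progress loop) would re-post each one. */
static uint32_t sender_retransmit_due(ud_sender_t *s, uint64_t now,
                                      uint64_t timeout)
{
    uint32_t due = 0;
    for (uint32_t seq = s->oldest_unacked; seq != s->next_seq; seq++) {
        if (now - s->sent_at[seq % WINDOW] >= timeout) {
            s->sent_at[seq % WINDOW] = now;
            due++;
        }
    }
    return due;
}
```

The timeout scan is the part that benefits from a progress thread: without one, retransmissions only happen when the application happens to call into the MPI library.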

Progress threads are a whole other issue. With a quick implementation, I was hitting all sorts of segfaults in the PML. The UD BTL seems unique in that it is common for messages to be received and passed up to the PML out of order. I can revisit this and file some bug reports sooner rather than later if desired.

I know of one outstanding bug -- any of the tests in the Intel suite that use buffered sends fail with incorrect data. I've shown this problem to George, Galen, and Brian, and we have yet to come up with a fix -- it appears to be an issue with messages arriving at the PML out of order, at which point the PML has no datatype information and so cannot reassemble the messages correctly. This would need to be fixed for 1.3.
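
This is not a fix for the bug above, only an illustration of what "out of order" means at the hand-off point: a minimal sketch of a receive-side reorder buffer that parks early fragments and delivers them upward strictly by sequence number. Everything here (reorder_t, reorder_receive, the deliver callback, the slot count) is hypothetical and not part of the UD BTL or the PML.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REORDER_SLOTS 16  /* hypothetical bound on how far ahead a fragment may be */

/* Hypothetical callback type standing in for "pass the fragment up". */
typedef void (*deliver_fn)(uint32_t seq, const void *frag, void *ctx);

/* Hypothetical per-peer receive state. */
typedef struct {
    uint32_t expected;                 /* next sequence number to deliver */
    bool present[REORDER_SLOTS];       /* slot holds a parked fragment */
    const void *frag[REORDER_SLOTS];   /* parked fragment pointers */
} reorder_t;

static void reorder_init(reorder_t *r)
{
    r->expected = 0;
    for (size_t i = 0; i < REORDER_SLOTS; i++) {
        r->present[i] = false;
        r->frag[i] = NULL;
    }
}

/* Accept one fragment. Returns the number of fragments delivered upward
 * by this call (0 if the fragment was parked or duplicates a parked one),
 * or -1 if it falls outside the window (already delivered or too far ahead). */
static int reorder_receive(reorder_t *r, uint32_t seq, const void *frag,
                           deliver_fn deliver, void *ctx)
{
    assert(deliver != NULL);
    uint32_t offset = seq - r->expected;  /* wraps to a large value if seq is old */
    if (offset >= REORDER_SLOTS) {
        return -1;
    }
    size_t slot = seq % REORDER_SLOTS;
    if (r->present[slot]) {
        return 0;
    }
    r->present[slot] = true;
    r->frag[slot] = frag;

    /* Drain every fragment that is now contiguous with `expected`. */
    int delivered = 0;
    while (r->present[r->expected % REORDER_SLOTS]) {
        slot = r->expected % REORDER_SLOTS;
        deliver(r->expected, r->frag[slot], ctx);
        r->present[slot] = false;
        r->expected++;
        delivered++;
    }
    return delivered;
}

/* Example callback: appends delivered sequence numbers to a small log. */
typedef struct {
    uint32_t seqs[2 * REORDER_SLOTS];
    int count;
} delivery_log_t;

static void log_delivery(uint32_t seq, const void *frag, void *ctx)
{
    (void)frag;
    delivery_log_t *dlog = ctx;
    if (dlog->count < 2 * REORDER_SLOTS) {
        dlog->seqs[dlog->count++] = seq;
    }
}
```

Feeding fragments 1, 2, 0 into reorder_receive parks the first two and delivers all three, in order, when fragment 0 arrives.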

When the UD BTL goes into the trunk, it will always de-select itself unless specifically requested with the MCA btl parameter (e.g. -mca btl ud,self). This prevents the UD BTL from being used by default alongside the existing RC (openib) BTL, which could lower performance.

Some minor issues: when it hits the trunk, it will be called 'ofud', short for OpenFabrics Unreliable Datagrams. Currently RDMA CM is not used, though it will not be hard to switch over (doing it at the same time as the openib BTL seems appropriate to me).

Andrew
