Hi Roland,

On Wednesday 16 January 2008 22:54, Roland Dreier wrote:
> However I'm a little puzzled about how this can lead to memory
> corruption in practice: the only thing that flushing FMRs should do is
> make memory keys that should no longer be in use anyway become
> invalid. So the only effect of this fix should be to expose a bug in
> your ULP by having some RDMA operation complete with a protection
> error -- and you're not relying on that behavior in normal operation,
> are you? What am I missing?
The corruption happened when the process that allocated the MRs went
away in the middle of the operation. We would free the MR and
invalidate it, expecting the in-flight RDMA to error out. RDS does not
know who is doing RDMA to or from an MR at any given time.

There is a second potential issue, however. When RDS performs an RDMA,
the initiator queues two work requests: one for the actual RDMA,
immediately followed by a normal SEND carrying an RDS packet. When the
consumer sees that RDS packet, it releases the MR to which the RDMA was
directed.

Is that a safe thing to do? I found the spec a little unclear on the
ordering rules. It *seems* that RDMA writes always fence against
subsequent operations, and that RDMA reads will fence if we ask for it.
But I'm not sure whether the ordering applies to the sending system
only, or whether IB also guarantees that the RDMA has completed by the
time the incoming message is put on the completion queue at the
consumer.

If there is no such guarantee, then we have a second potential issue in
RDS wrt RDMA and memory corruption.

Thanks,
Olaf
-- 
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
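
For illustration only, here is a toy Python model of the sequence in question: an RDMA write followed by a SEND whose RDS packet tells the consumer to release the MR. It is not the verbs API and it does not answer the ordering question. All names (`WorkRequest`, `Consumer`, `post_rdma_then_send`, `deliver`) are hypothetical, and the "reordered" case simply assumes the SEND is acted on before the RDMA has landed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class WorkRequest:
    """Hypothetical stand-in for a posted work request."""
    opcode: str                 # "RDMA_WRITE" or "SEND"
    payload: bytes
    rkey: Optional[int] = None  # only meaningful for RDMA_WRITE


class Consumer:
    """Toy consumer that owns MRs and releases them on an RDS packet."""

    def __init__(self) -> None:
        self.mrs: Dict[int, bytearray] = {}
        self.released: List[Tuple[int, bytes]] = []

    def register_mr(self, rkey: int, size: int) -> None:
        self.mrs[rkey] = bytearray(size)

    def handle(self, wr: WorkRequest) -> str:
        if wr.opcode == "RDMA_WRITE":
            mr = self.mrs.get(wr.rkey)
            # An invalidated key or an oversized write fails the access check.
            if mr is None or len(wr.payload) > len(mr):
                return "PROTECTION_ERROR"
            mr[:len(wr.payload)] = wr.payload
            return "OK"
        if wr.opcode == "SEND":
            # The toy RDS packet just carries the rkey of the MR to release.
            rkey = int.from_bytes(wr.payload, "big")
            self.released.append((rkey, bytes(self.mrs.pop(rkey))))
            return "OK"
        raise ValueError("unknown opcode: " + wr.opcode)


def post_rdma_then_send(rkey: int, data: bytes) -> List[WorkRequest]:
    """Initiator side: the RDMA, immediately followed by a normal SEND."""
    return [
        WorkRequest("RDMA_WRITE", data, rkey=rkey),
        WorkRequest("SEND", rkey.to_bytes(4, "big")),
    ]


def deliver(consumer: Consumer, wrs: List[WorkRequest]) -> List[str]:
    """Apply work requests to the consumer in the order given."""
    return [consumer.handle(wr) for wr in wrs]


if __name__ == "__main__":
    in_order = Consumer()
    in_order.register_mr(7, 4)
    print(deliver(in_order, post_rdma_then_send(7, b"data")), in_order.released)

    # Assumed worst case: the SEND is processed before the RDMA has landed.
    reordered = Consumer()
    reordered.register_mr(7, 4)
    wrs = list(reversed(post_rdma_then_send(7, b"data")))
    print(deliver(reordered, wrs), reordered.released)
```

In this model the in-order case releases an MR that already holds the data, while the reordered case releases an empty MR and the late RDMA fails the key check. If the freed memory had been reused instead of the key being invalidated, that late write is the kind of thing that could corrupt memory.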