> What's the state of fault-tolerance in OFI?  Would it be prudent for
> someone to write OFI code that aspired to survive process failures?  Are
> any implementations known to support this robustly right now?

This would be provider specific.  I'm not aware of anything that's coded to 
handle failures.

Having an example of this over libfabric would be great, though I'm not sure 
who's going to volunteer to write this.

It's not clear to me how fault tolerance relates to a networking API.  For 
example, what specific lower-level features does an app need to make this 
happen?  Are there restrictions that providers need to report to apps regarding 
their level of support?  Is this something that even belongs at this level of 
the API?
_______________________________________________
ofiwg mailing list
[email protected]
http://lists.openfabrics.org/mailman/listinfo/ofiwg