> Sean, please let me know your preference (as it was somehow unclear from
> the thread) if you want the delivery of this event to be dependent on
> the ulp asking for it or no.

I spent most of the morning looking at this, and until I know what the trade-offs really are in the implementation, I can't say that I have a strong preference for how to deal with any of this. My main concerns are:

* All callbacks from the rdma_cm are serialized
* We minimize the overhead of reporting events
* We don't lose events
* If the user returns a non-zero value from a callback, the rdma_cm_id is destroyed, and no further callbacks are invoked.

and in concept I prefer to:

* Always report the event and let ULPs ignore it
* Let someone come up with a fantastically simple way of reporting new events

The existing rdma_cm callbacks are naturally serialized with each other. (Callback for connect after resolve route after resolve address...) This allows using the stack for event structures, but the cost is complex synchronization with device removal. Supporting additional events while meeting the concerns listed above will be equally challenging. So if we can simplify device removal handling, then supporting similar types of events should be easier as well.

If we can guarantee that this works, one option is to acquire a mutex before invoking a callback on an rdma_cm_id. I hesitate to hold any locks while in a callback, since it restricts what the user can do, but if the mutex is only used to synchronize calling the user back, it may work, since the rdma_cm never invokes a callback from a downcall. This should simplify the device removal handling, eliminating wait_remove and dev_remove from the rdma_cm_id.

Alternatively, the ib_cm serializes callbacks using different logic (see cm_process_work() and use of work_count/work_list). I've been looking at what it would take to use the ib_cm event logic in the rdma_cm. The trick is to minimize the event reporting overhead without losing any events (and minimizing the overhead may require registering for events...).

What I've been exploring is adding an event_list to the rdma_cm_id.
Whenever the user performs an asynchronous operation, one or more event structures are allocated and placed on the event_list. When an asynchronous operation completes, its event structure is removed from this list, placed on a work_list, and a call like cma_process_work() is invoked. Note that some operations (e.g. connect) result in multiple callbacks to the rdma_cm (connect and disconnect).

And the more I consider this option, the more appealing just holding a mutex around the callbacks becomes.

- Sean
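
For illustration only, here is a rough userspace sketch of the mutex-around-callbacks idea, using pthreads rather than kernel primitives. Every name in it (sketch_id, sketch_report_event, handler_mutex, the SKETCH_* events, stop_on_removal) is made up for the example and is not the rdma_cm code. It assumes, as described above, that a callback is never invoked from a downcall, so a plain non-recursive mutex is enough.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the rdma_cm definitions. */
enum sketch_event {
    SKETCH_ADDR_RESOLVED,
    SKETCH_ROUTE_RESOLVED,
    SKETCH_DEVICE_REMOVAL
};

struct sketch_id;
typedef int (*sketch_handler)(struct sketch_id *id, enum sketch_event ev);

struct sketch_id {
    pthread_mutex_t handler_mutex; /* held only while calling the user back */
    sketch_handler handler;
    int destroyed;                 /* set once a callback returns non-zero */
    int delivered;                 /* number of callbacks actually invoked */
};

void sketch_id_init(struct sketch_id *id, sketch_handler handler)
{
    pthread_mutex_init(&id->handler_mutex, NULL);
    id->handler = handler;
    id->destroyed = 0;
    id->delivered = 0;
}

/*
 * Report one event. Callbacks on the same id are serialized by the mutex.
 * Returns 1 if the callback ran, 0 if it was skipped because an earlier
 * callback returned non-zero.
 */
int sketch_report_event(struct sketch_id *id, enum sketch_event ev)
{
    int ran = 0;

    assert(id != NULL && id->handler != NULL);

    pthread_mutex_lock(&id->handler_mutex);
    if (!id->destroyed) {
        ran = 1;
        id->delivered++;
        if (id->handler(id, ev) != 0)
            id->destroyed = 1;     /* no further callbacks after this */
    }
    pthread_mutex_unlock(&id->handler_mutex);

    return ran;
}

/* Example ULP handler: ignores most events, asks for teardown on removal. */
int stop_on_removal(struct sketch_id *id, enum sketch_event ev)
{
    (void)id;
    return ev == SKETCH_DEVICE_REMOVAL ? 1 : 0;
}
```

With this shape, a device removal event is just another call to sketch_report_event() from whichever thread notices the removal, and the "non-zero return means no more callbacks" rule is enforced under the same lock. What the sketch does not show is the cost mentioned above: the handler runs with the mutex held, so it must not block on anything that itself needs to deliver a callback on the same id.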
