[EMAIL PROTECTED] wrote:
> At 12:57 PM 7/11/2007, Roland Dreier wrote:
>> However on another level your question gets to the reason why we
>> haven't implemented support for multiple completion event vectors.
>> Namely, it's not clear how consumers, kernel or userspace, can make a
>> good choice of which vector to assign a given CQ to.
>
> Got it, thanks. But aren't the vectors shared across all
> consumers on an HCA? As such, it seems problematic to expect
> consumers to make optimal choices, since they have no way of
> knowing what other consumers are doing.
>
> In any case, all NFS/RDMA does is to check the completion
> status, queue the event and schedule a tasklet, so there is
> little or no parallelism to be gained in the upcall. I'd
> prefer to not have to wait for other ULPs on the same vector, of
> course.
>

What a single Consumer could do is to clump as many of its CQs as
possible into a single "bag" where serialization of notifications for
these CQs would have little detrimental impact on the application. As
you point out, for most applications this is all of their CQs.

This would presume that when the Consumer supplied too many bags, the
lower layers would simply say "tough" and combine some of them
(achieving less than optimal results, but better than having the OS
assign notification queues on a totally arbitrary basis).

To use the actual number of vectors implies that it would be meaningful
for *each* application to divide its CQs over that set, without any
mechanism to balance the applications against each other. That would
seem to imply that a typical Consumer would have a large number of CQs,
when I've never understood the need for more than one per core per
application.

At the minimum, if the actual number were published by the device,
would the kernel consumers actually be able to distribute their CQs
over the set?

Tom, I definitely agree that userland consumers have absolutely no way
to do that reasonably, but do you think it is plausible for the kernel
to do so for kernel-resident consumers? If not, what would be needed to
bridge that gap? Or is the need for parallelism so small amongst kernel
completion handlers that the kernel does not need this feature?
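
To make the "bag" idea above concrete, here is a minimal sketch in
Python. It is not any existing verbs or kernel API: the function name,
the round-robin folding policy, and the CQ and bag names are all
assumptions made up for illustration. It only shows the behaviour
described above, where the Consumer names its bags and the lower layer
combines bags when there are more of them than completion vectors.

```python
def assign_bags_to_vectors(bag_ids, num_vectors):
    """Map each Consumer-declared bag of CQs to a completion vector.

    Hypothetical policy: bags are numbered in order of first appearance
    and folded onto the available vectors round-robin, so asking for
    more bags than vectors combines some of them instead of failing.
    """
    if num_vectors < 1:
        raise ValueError("need at least one completion vector")
    mapping = {}
    # dict.fromkeys() drops duplicate bag ids while preserving order.
    for index, bag in enumerate(dict.fromkeys(bag_ids)):
        mapping[bag] = index % num_vectors
    return mapping


# Hypothetical Consumer: four CQs in three bags, but only two vectors.
cq_to_bag = {
    "cq_send": "bag0",
    "cq_recv": "bag0",  # same bag: serializing these two is harmless
    "cq_rdma": "bag1",
    "cq_ctrl": "bag2",
}

bag_to_vector = assign_bags_to_vectors(cq_to_bag.values(), num_vectors=2)
cq_to_vector = {cq: bag_to_vector[bag] for cq, bag in cq_to_bag.items()}
```

Note that the Consumer never sees the real vector count here. It only
states which CQs may share a notification queue, and CQs in the same
bag always land on the same vector. Balancing between different
applications is still left entirely to the lower layer.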
