Bernard> The assumption you have here is that one CPU is capable
Bernard> of handling the completions without impacting
Bernard> bandwidth. We have seen the opposite in that we end up
Bernard> with one CPU pegged at high throughput. The benefit you
Bernard> are working on is latency will be faster if we handle
Bernard> both send and receive processing off the same
Bernard> thread/interrupt, but you have to balance that with
Bernard> bandwidth limitations. You think 4X has a bandwidth
Bernard> problem using IPoIB, wait till 12X comes out.
I still don't understand why splitting the CQ allows you to use more
than one CPU to handle completions. Both CQ events get handled on the
same CPU -- you just have more overhead in getting to the CQ event
handlers if there are two of them.
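
To make that concrete, here is a toy cost model of the argument. It is
only a sketch: the cost constants, the batch sizes, and the name
cpu_cost are all hypothetical and nothing here is measured. It assumes
every CQ event handler runs on the one CPU that takes the interrupt, so
splitting the CQ adds a second dispatch but no parallelism.

```python
# Toy cost model; every number and name here is hypothetical, not measured.
DISPATCH_COST = 5      # assumed cost units to reach one CQ event handler
COMPLETION_COST = 1    # assumed cost units to process one completion


def cpu_cost(cq_batches):
    """Cost on the single CPU that takes the interrupt.

    cq_batches holds one entry per CQ: the number of completions
    waiting on that CQ when its event handler runs.
    """
    dispatch = DISPATCH_COST * len(cq_batches)
    processing = COMPLETION_COST * sum(cq_batches)
    return dispatch + processing


shared = cpu_cost([64])       # send and receive completions on one CQ
split = cpu_cost([32, 32])    # the same completions on two CQs
print(shared, split)
```

Under these assumptions the split case costs exactly one extra dispatch
and the per-completion work is unchanged, which is the overhead I mean
above.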
Also, why is 12X any worse? With current hardware at least the 4X
link is not the bottleneck anyway.
Bernard> What per CPU utilization do you see on mthca on a
Bernard> multiple CPU machine running peak bandwidth?
I've never really measured it. It's especially tough to account for
interrupt handler time.
- R.
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general