On Wed, Feb 15, 2006 at 01:34:00PM -0800, Roland Dreier wrote:
>     Michael> AFAIK, which of the two options gives better performance
>     Michael> might depend on the application and the specific system.
>     Michael> For now, Eli made the simpler option the default.
> 
> Have you seen cases where using the HCR is faster?  It seems that in
> both cases we are doing posted writes to PCI memory, except that the
> HCR case has to do at least one (slow) read to check the go bit.  The
> doorbell case does use more write barriers since all the writes have
> to be ordered, but I have a hard time believing that the write
> barriers are anywhere near as expensive as the read of the go bit.

Me too.

AFAIK, the write barriers only guarantee that the write has left the
CPU, is in flight, and is subject to PCI ordering rules. The MMIO read
is going to cost 1000-3000 CPU cycles depending on chipset, CPU speed,
and which device register is being read.
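
To put rough numbers on that, here is a minimal sketch in C. Only the
1000-3000 cycle read cost comes from the estimate above; the
per-barrier cost and all names (post_cost, hcr_cpu_cycles,
doorbell_cpu_cycles, break_even_barriers) are hypothetical and exist
only for illustration. Posted writes are left out because both paths
do them.

```c
#include <stdint.h>

/* Illustrative cost model. Only the MMIO read range (1000-3000 cycles)
 * comes from the discussion; the barrier cost is an assumed input. */
struct post_cost {
    uint32_t mmio_read_cycles;  /* one uncached read of a device register */
    uint32_t wmb_cycles;        /* one write barrier (hypothetical figure) */
};

/* HCR-style path: at least one read of the go bit per command. */
static uint32_t hcr_cpu_cycles(const struct post_cost *c,
                               uint32_t go_bit_reads)
{
    return go_bit_reads * c->mmio_read_cycles;
}

/* Doorbell-style path: no reads, only ordered writes. */
static uint32_t doorbell_cpu_cycles(const struct post_cost *c,
                                    uint32_t barriers)
{
    return barriers * c->wmb_cycles;
}

/* Number of barriers whose total cost equals one go-bit read. */
static uint32_t break_even_barriers(const struct post_cost *c)
{
    return c->wmb_cycles ? c->mmio_read_cycles / c->wmb_cycles
                         : UINT32_MAX;
}
```

With the low end of 1000 cycles per read and an assumed 50 cycles per
barrier, the doorbell path would need 20 barriers to cost as much as a
single read of the go bit.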

However, that doesn't mean all metrics are better just because
the CPU is used more efficiently. Forcing things down the PCI bus
will sometimes improve latency-sensitive benchmarks.
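
One mechanism that could explain this (my assumption, not something
measured here): under PCI ordering rules a read may not pass earlier
posted writes, so a read of the device also pushes any queued writes
out to it. The toy model below sketches that behaviour in plain C. It
is not driver code, and toy_bus, toy_write, toy_read and toy_drain are
made-up names.

```c
#include <stddef.h>
#include <stdint.h>

#define NREGS   4
#define QDEPTH  8

/* Toy model: writes sit in a posted-write queue until it is drained. */
struct toy_bus {
    uint32_t dev_regs[NREGS];   /* what the device has actually seen */
    struct { size_t reg; uint32_t val; } q[QDEPTH];
    size_t qlen;
};

/* Deliver queued writes to the device in the order they were issued. */
static void toy_drain(struct toy_bus *b)
{
    for (size_t i = 0; i < b->qlen; i++)
        b->dev_regs[b->q[i].reg] = b->q[i].val;
    b->qlen = 0;
}

/* Posted write: returns at once; the device has not seen it yet.
 * The caller must pass reg < NREGS (not checked in this sketch). */
static void toy_write(struct toy_bus *b, size_t reg, uint32_t val)
{
    if (b->qlen == QDEPTH)
        toy_drain(b);           /* queue full, so it has to drain */
    b->q[b->qlen].reg = reg;
    b->q[b->qlen].val = val;
    b->qlen++;
}

/* Read: cannot pass earlier posted writes, so it drains them first. */
static uint32_t toy_read(struct toy_bus *b, size_t reg)
{
    toy_drain(b);
    return b->dev_regs[reg];
}
```

After toy_write() alone, dev_regs is unchanged; the value only shows
up once toy_read() (or a full queue) drains the pending writes. In
this model the slow read is also what gets the command to the device
promptly.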

grant
_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general
