On Thu, May 2, 2013 at 4:56 AM, Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, May 02, 2013 at 02:11:15AM +0300, Or Gerlitz wrote:
> > So we've noted that when configuring the kernel && booting with intel
> > iommu set to on on a physical node (non VM, and without enabling SRIOV
> > by the HW device driver) raw performance of the iSER (iSCSI RDMA) SAN
> > initiator is reduced notably, e.g in the testbed we looked today we
> > had ~260K 1KB random IOPS and 5.5GBs BW for 128KB IOs with iommu
> > turned off for single LUN, and ~150K IOPS and 4GBs BW with iommu
> > turned on. No change on the target node between runs.
>
> That's why we have iommu=pt.
> See definition of iommu_pass_through in arch/x86/kernel/pci-dma.c.
Hi Michael (hope you feel better),

We did some runs with the pt approach you suggested and still didn't get
the promised gain. In parallel we came across the 2012 commit f800326dc
("ixgbe: Replace standard receive path with a page based receive"), where
they say: "[...] we are able to see a considerable performance gain when
an IOMMU is enabled because we are no longer unmapping every buffer on
receive [...] instead we can simply call sync_single_range [...]"

Looking at the commit, you can see that they allocate a page/skb, dma_map
it once up front, and later in the life cycle of that buffer use
dma_sync_*_for_device/cpu, avoiding dma_map/unmap on the fast path
altogether; a boiled-down sketch of the pattern follows below.

A few questions I'd love to hear people's opinions on:

1st, this approach seems cool for a network device RX path, but what
about the TX path -- any idea how to avoid dma_map there? Or, put the
other way, why doesn't calling dma_map/unmap for every buffer on the TX
path (the conventional flow, also sketched below) involve a notable perf
hit?

2nd, I don't see how to apply the method to a block device, since these
devices don't allocate buffers; rather, they get a scatter-gather list of
pages from the upper layers, issue dma_map_sg on it, submit the IO, and
call dma_unmap_sg when done (see the last sketch below).
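To make the RX pattern concrete, here's roughly what the commit does,
boiled down to the DMA calls. This is a sketch, not the actual ixgbe
code: my_rx_buffer, my_alloc_rx and my_rx_clean are names I made up, and
only the DMA API calls are the real kernel interfaces.

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>

struct my_rx_buffer {
	struct page *page;
	dma_addr_t dma;		/* mapping lives as long as the page */
};

/* Slow path: allocate and map the page once, at ring setup/refill time. */
static int my_alloc_rx(struct device *dev, struct my_rx_buffer *buf)
{
	buf->page = alloc_page(GFP_ATOMIC);
	if (!buf->page)
		return -ENOMEM;

	buf->dma = dma_map_page(dev, buf->page, 0, PAGE_SIZE,
				DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, buf->dma)) {
		__free_page(buf->page);
		return -ENOMEM;
	}
	return 0;
}

/* Fast path: no map/unmap, just pass ownership back and forth with syncs. */
static void my_rx_clean(struct device *dev, struct my_rx_buffer *buf,
			unsigned int len)
{
	/* CPU is about to read the packet data the NIC wrote. */
	dma_sync_single_range_for_cpu(dev, buf->dma, 0, len,
				      DMA_FROM_DEVICE);

	/* ... copy/attach the data to an skb here ... */

	/* Hand the buffer back to the device for the next packet. */
	dma_sync_single_range_for_device(dev, buf->dma, 0, len,
					 DMA_FROM_DEVICE);
}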
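For contrast, this is the conventional TX flow the 1st question is
about -- one map at xmit time and one unmap at completion time for every
packet, so every packet pays the IOMMU cost. Again a sketch with made-up
names (my_xmit, my_tx_complete) and no descriptor handling:

#include <linux/dma-mapping.h>
#include <linux/skbuff.h>

static int my_xmit(struct device *dev, struct sk_buff *skb,
		   dma_addr_t *dma_out)
{
	dma_addr_t dma;

	/* One map per packet -- the per-buffer cost in question. */
	dma = dma_map_single(dev, skb->data, skb_headlen(skb),
			     DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma))
		return -ENOMEM;

	*dma_out = dma;
	/* ... fill a TX descriptor with 'dma' and ring the doorbell ... */
	return 0;
}

static void my_tx_complete(struct device *dev, struct sk_buff *skb,
			   dma_addr_t dma)
{
	/* ... and one unmap per packet once the device is done with it. */
	dma_unmap_single(dev, dma, skb_headlen(skb), DMA_TO_DEVICE);
	dev_kfree_skb(skb);
}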
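And the block-side flow from the 2nd question: the pages arrive from the
upper layers in a scatterlist, so there's nothing driver-owned to pre-map.
A sketch with a hypothetical my_submit_io(); the dma_map_sg/dma_unmap_sg
calls and the for_each_sg accessors are the real interfaces:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static int my_submit_io(struct device *dev, struct scatterlist *sgl,
			int nents)
{
	struct scatterlist *sg;
	int i, mapped;

	/* Map the whole list handed down by the upper layers. */
	mapped = dma_map_sg(dev, sgl, nents, DMA_BIDIRECTIONAL);
	if (!mapped)
		return -ENOMEM;

	/* Program one DMA element per mapped entry. */
	for_each_sg(sgl, sg, mapped, i) {
		dma_addr_t addr = sg_dma_address(sg);
		unsigned int len = sg_dma_len(sg);
		/* ... build the HW descriptor from addr/len ... */
		(void)addr;
		(void)len;
	}

	/* ... later, on IO completion: */
	dma_unmap_sg(dev, sgl, nents, DMA_BIDIRECTIONAL);
	return 0;
}

So the map/unmap here is inherently per-IO on pages we never own, which
is exactly what makes the pre-mapping trick hard to apply.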
Or.