David Dillow wrote:
> We're talking about different things --
> max_segments(sg_tablesize)/max_sectors are the limits as we're adding
> pages to the bio via bio_add_page(). blk_rq_map_sg() uses
> max_segment_size as a bound on the largest S/G entry into which it can
> coalesce multiple entries from the BIO. It is considered in the line
>
>     if (sg->length + nbytes > queue_max_segment_size(q))
>             goto new_segment;
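
To make the check quoted above concrete, here is a small user-space sketch of the same idea. It is not the kernel code: count_sg_entries() and PAGE_SZ are hypothetical names invented for illustration, pages are assumed to be 4 KB, and "physical addresses" are just integers passed in by the caller. It only mimics the rule that a page is merged into the current S/G entry when it is contiguous with it and the merged length would not exceed max_segment_size.

```c
#include <stddef.h>

#define PAGE_SZ 4096u   /* assumption: 4 KB pages */

/*
 * Hypothetical helper (not a kernel API): count how many S/G entries a
 * list of page addresses collapses into when contiguous pages are merged,
 * subject to an upper bound on the size of a single entry.
 */
static unsigned count_sg_entries(const unsigned long *page_addr,
                                 size_t npages,
                                 unsigned max_segment_size)
{
    unsigned entries = 0;         /* S/G entries produced so far */
    unsigned cur_len = 0;         /* length of the entry being built */
    unsigned long next_addr = 0;  /* address that would continue it */

    for (size_t i = 0; i < npages; i++) {
        int contiguous = entries > 0 && page_addr[i] == next_addr;

        /* Same shape as the quoted test: start a new entry if the page
         * is not contiguous or merging it would exceed the bound. */
        if (!contiguous || cur_len + PAGE_SZ > max_segment_size) {
            entries++;
            cur_len = 0;
        }
        cur_len += PAGE_SZ;
        next_addr = page_addr[i] + PAGE_SZ;
    }
    return entries;
}
```

Under these assumptions, 128 physically contiguous 4 KB pages (512 KB) with a 64 KB bound come out as 8 entries, and with a 512 KB bound as a single entry. A fully scattered list of 128 pages stays at 128 entries whatever the bound is.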

Dave, thanks for the detailed explanation, I understand it much better
now. Just to make sure: blk_rq_map_sg() is called through the flow of
dma_map_sg, correct? If this is the case, we're talking about decisions
made by the block layer during DMA mapping, which later affect the
"IB IOMMU mapping" at the IB driver (e.g. srp, iser, etc.).

>> In iser we want to support up to 512KB IOs, so sg_tablesize is 128
>> (=512>>12) which on systems with 4K page size accounts to 512K totally
>> (we also set max_sectors to 1024, so on systems with 16K or whatever
>> page size, we'll not get > 512K IOs).

> Yes, but without this change, you will get your 512 KB request in 8 S/G
> entries minimum, when it could be in one if contiguous. For our systems
> where we're trying to get 1 MB or larger IOs over SRP, we get 16 S/G
> entries when we could get one, potentially forcing us into using FMRs
> and doing additional work when we could just map the single entry
> directly.

Since the block layer did its best to coalesce multiple entries from the
BIO into SG(s), you would need to use an FMR whenever dma_map_sg returns
a value > 1.

As you mention later on, I wonder what the benefit of not using FMRs
would be, since we're talking about large IOs (> 64K, on the assumption
that the block layer already coalesces today the BIOs that allow it,
i.e. those whose pages are contiguous), for which I would expect
latency, bandwidth and IOPS not to be affected by skipping the FMR. So
we're left with the CPU usage saving. Do you have (say) "vmstat 1"
snapshots before/after this patch, with roughly the same IO tool/load,
that can help quantify this saving?

Also, when working with direct I/O from user space and/or under a
file system, did you really see many BIOs that can be merged? I was
under the impression that (specifically after the system has been
active for some time) in most cases I get totally scattered SGs, whose
pages can't be coalesced at all.

Or.
