On Saturday, August 21, 2010, David Dillow wrote:
> On Sat, 2010-08-21 at 13:14 +0200, Bart Van Assche wrote:
> > On Fri, Aug 20, 2010 at 9:49 AM, Bernd Schubert
> > <[email protected]> wrote:
> > > In ib_srp.c sg_tablesize is defined as 255. With that value we see
> > > lots of IO requests of size 1020. As I already wrote on linux-scsi,
> > > that is really sub-optimal for DDN storage, as lots of IO requests
> > > of size 1020 come up.
> > >
> > > Now the question is if we can safely increase it. Is there a
> > > definition somewhere of the real hardware-supported size? And
> > > shouldn't we not only increase sg_tablesize, but also set the
> > > .dma_boundary value?
> >
> > (resending as plain text)
> >
> > The request size of 1020 indicates that there are fewer than 60 data
> > buffer descriptors in the SRP_CMD request. So you are probably hitting
> > another limit than srp_sg_tablesize.
>
> 4 KB * 255 descriptors = 1020 KB
We at least verified it indirectly. Lustre 1.8.4 will include a patch to
increase SG_ALL from 255 to 256 (not ideal, at least for older kernels, as
it will require at least an order-1 allocation instead of the previous
order 0). But including that patch in our release and then testing IO
sizes with QLogic FC definitely made the 1020K IO requests vanish.

> IIRC, we verified that we were seeing 255 entries in the S/G list with a
> few printk()s, but it has been a few years.

I probably should do that as well, just some time limitations.

> I'm not sure how you came up with 60 descriptors -- could you elaborate
> please?
>
> > Did this occur with buffered (asynchronous) or unbuffered (direct)
> > I/O? And in the first case, which I/O scheduler did you use?
>
> I'm sure Bernd will speak for his situation, but we've seen it with both
> buffered and unbuffered, with the deadline and noop schedulers (mostly
> on vendor 2.6.18 kernels). CFQ never gave us larger than 512 KB
> requests. Our main use is Lustre, which does unbuffered IO from the
> kernel.

I'm in the DDN Lustre group, so I mainly speak for Lustre as well. I
think Lustre's filterio is direct-IO-like: it is not the classical kernel
direct-IO interface and provides a few buffers for writes, AFAIK. But it
is still almost direct IO, and filterio also immediately sends a disk
commit request.

We use the deadline scheduler by default. Differences from noop are small
for streaming writes, but mke2fs, for example, is about 5 times faster
with deadline than with noop.

Cheers,
Bernd

--
Bernd Schubert
DataDirect Networks

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
