On Sat, 2010-08-21 at 20:20 +0200, Bernd Schubert wrote:
> On Saturday, August 21, 2010, Bart Van Assche wrote:

> > What might help - depending on how the target is implemented - is
> > using an I/O depth larger than one. ib_srp sends all SRP_CMDs with the
> 
> It depends on whether we enable the write-back cache or not. The older S2A
> architecture does not mirror the cache at all, and therefore the write-back
> cache is supposed to be disabled. The recent SFA architecture mirrors the
> write-back cache, so it is supposed to be enabled. With the write-back cache
> enabled, an 'improved' command processing is done (I don't know the details
> myself). However, cache mirroring is an expensive operation if the system can
> do 10 GB/s, and I/Os will only go into the cache if their size is not a
> multiple of 1024K. 1 MiB I/Os are sent directly to the disks. And that leaves
> us with SRP, where we see too many 1020K requests, which need to be processed
> by the write-back cache....

You have a few options here -- if it was a 1024 KB request broken into a
1020 KB and a 4 KB request, you can hold onto the 1020 KB request for a
fraction of a second to see if the next request completes the stripe. The
4 KB request will almost always be the next request for the LUN. If the
next request doesn't fill out the stripe, then go to the effort of write
mirroring.

That can be extended as well -- perhaps start the mirror of the 1020 KB
request, but decide not to mirror the 4 KB request since it completes the
full stripe write. Then just hold off completing the 4 KB write until the
full stripe write has reached the disks, as you would if a 1 MB,
stripe-aligned request had come in.

And similarly if the next request fills the stripe but spills into the
next one -- mirror only the portion that's needed, and switch to waiting
for the disk write once you can make a full stripe out of the pending
requests.
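The decision logic above could be sketched roughly like this. To be clear, the function and its return values are illustrative assumptions on my part, not the actual controller firmware -- only the 1024K full-stripe threshold comes from Bernd's description:

```python
# Hypothetical sketch of the hold-and-coalesce idea; names and structure
# are assumptions for illustration, not the real S2A/SFA logic.

STRIPE = 1024 * 1024  # full-stripe size that bypasses the write-back cache

def plan_write(offset, length, next_req=None, hold_window_expired=False):
    """Decide how to handle a write request.

    Returns one of:
      'passthrough' - full, aligned stripe: send straight to the disks
      'hold'        - sub-stripe: wait briefly for a coalescing request
      'mirror'      - no completion arrived in time: mirror into the cache
      'coalesce'    - the next request fills out the stripe: treat the
                      pair as one full-stripe write and skip mirroring it
    """
    end = offset + length
    if offset % STRIPE == 0 and length % STRIPE == 0:
        return 'passthrough'
    stripe_end = (offset // STRIPE + 1) * STRIPE
    if next_req is None:
        return 'mirror' if hold_window_expired else 'hold'
    n_off, n_len = next_req
    # Contiguous with this request and reaching (or spilling past) the
    # stripe boundary counts as completing the stripe.
    if n_off == end and n_off + n_len >= stripe_end:
        return 'coalesce'
    return 'mirror'
```

So a 1020 KB request at a stripe boundary would be held, coalesced with the trailing 4 KB request when it arrives, and only mirrored if nothing useful shows up before the hold window expires.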

> > task attribute SIMPLE, so a target is allowed to process these
> > requests concurrently. For the ib_srpt target I see the following
> > results over a single QDR link and a NULLIO target (fio

> How exactly do you do that? Is that something I would try with our storage as 
> well? I guess only with a special firmware version, which I also do not have 
> access to.

Bart is referring to keeping multiple requests in flight. On your
client, use a non-zero -qd with xdd, for example, or the --threads
--numjobs=X fio options he showed.
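For reference, a minimal fio job along those lines might look like the fragment below. The device path, depth, and job count are placeholders to adapt to your setup, not values from Bart's run:

```ini
; hypothetical fio job: keep multiple 1 MiB writes in flight
[srp-qd-test]
ioengine=libaio    ; async I/O so iodepth actually takes effect
direct=1           ; bypass the page cache
rw=write
bs=1M              ; stripe-sized requests, per the discussion above
iodepth=16         ; commands outstanding per job
numjobs=4          ; the --numjobs=X from the mail
thread             ; the --threads from the mail
filename=/dev/sdX  ; placeholder: your SRP block device
```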

If you're not doing direct I/O, then you have less direct control over
how the page cache does its writeback, but I would expect it to try to
maintain a decent queue depth -- and if it doesn't, the block/mm
developers may be interested in patches to get there if it makes sense
for a particular device.

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office

