Hi Mike,

Sorry for the delay, and thanks for your quick response.  We've had good
success tuning these parameters; adjusting the timeouts has produced
noticeable stability improvements in a heavily congested environment.

Incidentally, can you recommend any further reading on the iSCSI
implementation in Linux?
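
In case it helps anyone searching the archives, the timeout tuning we ended
up with looks roughly like this in /etc/iscsi/iscsid.conf (the
replacement_timeout value here is just an illustration of "cranked up"; pick
whatever suits your recovery requirements):

```
# /etc/iscsi/iscsid.conf -- illustrative values, not a universal recommendation
# Disable NOP-Out pings so a congested network can't tear down sessions
# that boot-from-iSCSI guests depend on:
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
# Give the transport much longer to recover before failing IO up the stack
# (86400 = 24 hours, an arbitrary "very high" example):
node.session.timeo.replacement_timeout = 86400
```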

On 30 July 2012 19:19, Mike Christie <micha...@cs.wisc.edu> wrote:

> On 07/30/2012 07:29 AM, lwade wrote:
> > Hi all,
> >
> > Second try at posting, it appears I am incapable of using Google Groups ;)
>
> You are just not subscribed with that email address so I have to go in
> and approve your message.
>
> >
> > I'm busy tuning a virtualisation environment and now I'm looking at the
> > iscsi initiators.  We have about 16 virtual hosts, each of these will be
> > connecting to exported LUNs from tgtd on a separate host.  These LUNs
> > will then be passed as block devices to the virtual guests.  Thus on each
> > of these hosts we will have sessions to the iSCSI targets, with
> > potentially up to 24 per host.  We won't be reusing existing sessions, so
> > we'll be spawning a new session per connection to each target on the
> > hosts.  Since we'll also be running "boot from iSCSI" on a number of the
> > virtual machines, I've tweaked the following values:
> >
> > node.conn[0].timeo.noop_out_interval
> > node.conn[0].timeo.noop_out_timeout
> >
> > Setting both to 0 and also cranking up
> > node.session.timeo.replacement_timeout, per the open-iscsi docs/README.
> >
> > Next up, I'm looking at queue_depth and cmds_max.  Since we're using one
> > session per LUN, I guess these variables play a less important role?
> >
>
> Yeah, since you are doing a LUN per session you would want to just set
> them to the same value.
>
> I think tgtd has a per-session limit of 32 or 128 commands, but you might
> want to double-check for your version.
>
>
> > However, is it still worth me upping both values, say queue_depth to 256
> > and cmds_max to 512, over the defaults?  The nature of the I/O in the
> > platform is quite bursty, with peaks of heavy writes onto these devices
> > from VMs to the hosts' connected LUNs.  I'm wondering whether tweaking
> > these values will buy me leeway during the peaks, particularly if we have
> > hosts filling their page/buffer cache and potentially struggling to clear
> > it during these bursty periods.
> >
>
> It depends on the backing storage and network. If you set them too high
> then the initiator layer is going to overqueue, IO will not get executed
> quickly enough, commands may time out, and that will start the SCSI error
> handler. That will result in a large slowdown and possibly IO errors if it
> happens too often.
>
> If you set the queueing limits high then you might also want to set the
> SCSI command timeout higher to deal with temporary slowdowns in the
> network or bursts in IO activity (if you are setting the noop values to
> nonzero you would want to increase them as well).
>
> If you set the queue depth values to greater than 128 then you would
> also want to raise the block-layer request limits (there are some
> settings in /sys/block/sdXYZ/queue, like nr_requests).
>
> In general 32 for queue_depth and 128 for cmds_max (when doing multiple
> LUNs per session) is more on the low side. We selected those values to
> be on the safe side. Increasing to 128 or 256 would help, but beyond that
> I am not sure it helps. If you were doing multiple LUNs per session then
> increasing cmds_max would help (1024 is normally an OK value).
>
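
P.S. For the record, if we do raise the queueing limits as discussed above, I
believe the per-node settings can be updated with iscsiadm along these lines
(the target name is a made-up example and the values are illustrative):

```
# Update the recorded node settings for one target (example IQN):
iscsiadm -m node -T iqn.2012-07.com.example:storage.lun1 \
    -o update -n node.session.cmds_max -v 256
iscsiadm -m node -T iqn.2012-07.com.example:storage.lun1 \
    -o update -n node.session.queue_depth -v 256
# The new values take effect on the next login to that node.
```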
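
And per the note about the block layer, the per-device request limit can be
raised at runtime through sysfs (the device name is a placeholder for
whichever sd device backs the session):

```
# Raise the block-layer queue limit to match a queue_depth above 128:
echo 256 > /sys/block/sdX/queue/nr_requests
```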

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.
