On 07/30/2012 07:29 AM, lwade wrote:
> Hi all,
> 
> Second try at posting, it appears I am incapable of using Google Groups ;)

You are just not subscribed with that email address, so I have to go in
and approve your message.

> 
> I'm busy tuning a virtualisation environment and now I'm looking at the 
> iSCSI initiators.  We have about 16 virtual hosts, and each of these will be 
> connecting to LUNs exported by tgtd on a separate host.  These LUNs will 
> then be passed as block devices to the virtual guests.  Thus on each of 
> these hosts we will have sessions to the iSCSI targets, potentially up 
> to 24 per host.  We won't be reusing existing sessions, so we'll be spawning 
> a session per connection to each new target on the hosts.  Since we'll also 
> be running "boot from iSCSI" on a number of the virtual machines, I've 
> tweaked the following values:
> 
> node.conn[0].timeo.noop_out_interval
> node.conn[0].timeo.noop_out_timeout 
> 
> Setting both to 0 and also cranking up 
> node.session.timeo.replacement_timeout, per the open-iscsi docs/README.
> 
> Next up, I'm looking at queue_depth and cmds_max.  Since we're using one 
> session per LUN, I guess these variables play a less important role?
> 

Yeah, since you are doing a LUN per session, you would just want to set
them to the same value.

I think tgtd has a per-session limit of 32 or 128 commands, but you might
want to double-check for your version.
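
If it helps, something like the following should bump both settings on an
existing node record with iscsiadm (the target name and portal are just
placeholders for your setup):

  iscsiadm -m node -T iqn.2012-07.com.example:storage.lun1 \
      -p 192.168.1.10 -o update -n node.session.queue_depth -v 128
  iscsiadm -m node -T iqn.2012-07.com.example:storage.lun1 \
      -p 192.168.1.10 -o update -n node.session.cmds_max -v 128

Sessions logged in after the update pick the new values up, or you can
change the defaults in /etc/iscsi/iscsid.conf before running discovery.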


> However, is it still worth upping both values, say queue_depth to 256 and 
> cmds_max to 512, over the defaults?  The nature of the I/O in the platform is 
> quite bursty, with peaks of heavy writes onto these devices from VMs to the 
> hosts' connected LUNs.  I'm wondering whether tweaking these values will buy me 
> leeway during the peaks, particularly if we have hosts filling their 
> page/buffer cache and potentially struggling to clear it during these bursty 
> periods.
> 

It depends on the backing storage and network. If you set them too high,
the initiator layer is going to overqueue, IO is not going to get executed
quickly enough, and commands could time out, which will kick off the SCSI
error handler (EH). That will result in a large slowdown and possibly IO
errors if it happens too often.

If you set the queueing limits high, then you might also want to set the
SCSI command timeout higher to deal with temporary slowdowns in the
network or bursts in IO activity (if you are setting the noop values to
nonzero, you would also want to increase them).
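
For example, the per-device SCSI command timeout can be raised through
sysfs (sdb and the 120 second value are just placeholders here):

  echo 120 > /sys/block/sdb/device/timeout

and the noop settings are the usual node record / iscsid.conf ones, e.g.:

  node.conn[0].timeo.noop_out_interval = 15
  node.conn[0].timeo.noop_out_timeout = 30

The sysfs timeout is per device and does not survive a reboot, so it is
normally set from a udev rule or an init script.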

If you set the queue depth values to greater than 128, then you would
also want to raise the block layer request queue limits (there are
settings in /sys/block/sdXYZ/queue like nr_requests).
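
For example (sdb is again a placeholder, and 256 assumes you ended up
with a queue depth around that size):

  echo 256 > /sys/block/sdb/queue/nr_requests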

In general, 32 for queue_depth and 128 for cmds_max (when doing multiple
LUNs per session) is on the low side. We selected those values to be
safer. Increasing them to 128 or 256 would help, but beyond that I am not
sure it helps. If you were doing multiple LUNs per session, then
increasing cmds_max would help (1024 is normally an OK value).
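
So for your one-LUN-per-session case, a starting point might look
something like this in iscsid.conf (the numbers are just an illustration
to tune from, not a recommendation):

  node.session.queue_depth = 128
  node.session.cmds_max = 128

Then watch the logs for command timeouts and SCSI EH activity while you
load test, and back off if you hit them.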
