Hi Mike,

Sorry for the delay, and thanks for your quick response. We've had good success tuning these parameters; adjusting the timeouts has given us some solid stability improvements in a heavily congested environment.
Can you recommend any further reading on the iSCSI implementation in Linux, by the way?

On 30 July 2012 19:19, Mike Christie <micha...@cs.wisc.edu> wrote:
> On 07/30/2012 07:29 AM, lwade wrote:
>> Hi all,
>>
>> Second try at posting, it appears I am incapable of using Google Groups ;)
>
> You are just not subscribed with that email address, so I have to go in
> and approve your message.
>
>> I'm busy tuning a virtualisation environment and now I'm looking at the
>> iSCSI initiators. We have about 16 virtual hosts, each of which will be
>> connecting to LUNs exported by tgtd on a separate host. These LUNs will
>> then be passed as block devices to the virtual guests. Thus on each of
>> these hosts we will have sessions to the iSCSI targets, potentially up
>> to 24 per host. We won't be reusing existing sessions, so we'll be
>> spawning a session per connection to each new target on the hosts.
>> Since we'll also be running "boot from iSCSI" on a number of the
>> virtual machines, I've tweaked the following values:
>>
>> node.conn[0].timeo.noop_out_interval
>> node.conn[0].timeo.noop_out_timeout
>>
>> Setting both to 0 and also cranking up
>> node.session.timeo.replacement_timeout, per the open-iscsi docs/README.
>>
>> Next up, I'm looking at queue_depth and cmds_max. Since we're using one
>> session per LUN, I guess these variables play a less important role?
>
> Yeah, since you are doing a LUN per session you would want to just set
> them to the same value.
>
> I think tgtd has a per-session limit of 32 or 128 commands, but you might
> want to double-check for your version.
>
>> However, is it still worth me upping both values, say queue_depth to 256
>> and cmds_max to 512, over the defaults? The nature of the I/O on the
>> platform is quite bursty, with peaks of heavy writes onto these devices
>> from VMs to the hosts' connected LUNs.
>> I'm wondering whether tweaking these values will buy me leeway during
>> the peaks, particularly if we have hosts filling their page/buffer cache
>> and potentially struggling to clear it during these bursty periods.
>
> It depends on the backing storage and network. If you set them too high
> then the initiator layer is going to overqueue, I/O is not going to get
> executed quickly enough, and I/O could time out; that will kick off the
> SCSI error handler. That will result in a large slowdown, and possibly
> I/O errors if it happens too often.
>
> If you set the queueing limits high then you might also want to set the
> SCSI command timeout higher to deal with temporary slowdowns in the
> network or bursts in I/O activity (if you are setting the noop values to
> non-zero you would also want to increase them too).
>
> If you set the queue depth values to greater than 128 then you would
> also want to set the block layer request limits higher (there are some
> settings in /sys/block/sdXYZ/queue, like nr_requests).
>
> In general, 32 for queue_depth and 128 for cmds_max (when doing multiple
> LUNs per session) is more on the low side. We selected those values to
> be safer. Increasing to 128 or 256 would help, but after that I am not
> sure if it helps. If you were doing multiple LUNs per session then
> increasing cmds_max would help (1024 is normally an OK value).

--
You received this message because you are subscribed to the Google Groups
"open-iscsi" group. To post to this group, send email to
open-iscsi@googlegroups.com. To unsubscribe from this group, send email to
open-iscsi+unsubscr...@googlegroups.com. For more options, visit this group
at http://groups.google.com/group/open-iscsi?hl=en.
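For reference, the settings discussed in this thread can be applied per node record with iscsiadm, and the block-layer knobs via sysfs. This is only a hedged sketch: the target IQN, portal address, device name (sdb), and the specific values are illustrative assumptions, not values recommended in the thread.

```shell
#!/bin/sh
# Illustrative only: substitute your own target, portal, and device.
TARGET="iqn.2012-07.com.example:storage.lun1"   # hypothetical IQN
PORTAL="192.168.0.10"                           # hypothetical portal

# Boot-from-iSCSI sessions: disable NOP-Out pings and raise the
# replacement timeout, per the open-iscsi README guidance cited above.
iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o update \
    -n 'node.conn[0].timeo.noop_out_interval' -v 0
iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o update \
    -n 'node.conn[0].timeo.noop_out_timeout' -v 0
iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o update \
    -n 'node.session.timeo.replacement_timeout' -v 86400

# One LUN per session: keep queue_depth and cmds_max at the same value,
# as suggested above (256 is an illustrative choice).
iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o update \
    -n 'node.session.cmds_max' -v 256
iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o update \
    -n 'node.session.queue_depth' -v 256

# With queue_depth above 128, also raise the block-layer request limit
# and the SCSI command timeout (seconds) on the resulting device:
echo 512 > /sys/block/sdb/queue/nr_requests
echo 120 > /sys/block/sdb/device/timeout
```

The sysfs writes do not persist across reboots or device re-discovery; in practice they would usually be reapplied from a udev rule or boot script.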