Re: Timeout values for iSCSI
On Thursday, October 4, 2018 at 10:20:56 AM UTC-7, Sai Chaitanya Mitta wrote:
> On Wednesday, 14 April 2010 01:33:13 UTC+5:30, Mike Christie wrote:
>> On 04/13/2010 03:23 AM, Christian Iversen wrote:
>> > Hi iSCSI guys
>> >
>> > I've set up iSCSI storage on our servers, using IETD and OpenISCSI.
>> >
>> > It works and performs great, but I am a little unsure of how to adjust
>> > the timeout values properly.
>> >
>> > On our storage servers, we use heartbeat to achieve HA failover, which
>> > works nicely. However, the client machines only try for a fixed amount
>> > of time before giving up, so if the failover for some reason does not
>> > happen relatively quickly, everything grinds to a halt in a really bad way.
>> >
>> > I would like to set up open-iscsi to keep trying, preferably at low
>> > intervals, and not give up contacting the server.
>> >
>> > There are quite a few different timeouts, and I have been unable to find
>> > any sort of reference documentation for this. Maybe someone here can help?
>>
>> Did you read the README? I tried to document the timeouts that are asked
>> about most frequently on the list.
>>
>> > What I'd like is the following:
>> >
>> > - Never give up trying (or at least try for a month :)
>>
>> The iscsi initiator almost always tries to reconnect to the target. If
>> it gets a successful login and then that fails, it will try to relogin until
>> the user runs some iscsiadm command to logout.
>>
>> If you mean you want it to hold onto IO and not fail it, then you want
>> the replacement_timeout/recovery_timeout. There should be info in the
>> README and iscsid.conf about this. If it is not clear let me know.
>>
>> If in the iscsid.conf you see this for
>> node.session.timeo.replacement_timeout then this is what I think you are
>> asking for (that is if you are saying you do not want IO failed) and you
>> want to set the value to 0.
>> # - If the value is 0, IO will be failed immediately.
>> # - If the value is less than 0, IO will remain queued until the session
>> #   is logged back in, or until the user runs the logout command.
>>
>> > - Try every 1 second
>> > - Timeout should work for all stages of the session,
>> >   be it logged in or not.
>
> Even though I changed node.session.timeo.replacement_timeout to 300, the
> value in /sys/class/iscsi_session/session/recov_tmo is not updated, because
> the multipathd service overrides the iscsi_timeout setting and sets it to
> 5 seconds. Is there any way to change the value without stopping the
> multipathd service?

There is always a trade-off when using timeouts with multipathing.

You want failures to be detected fairly quickly with multipathing so that
you can fail over to the other path. If you have a long failure timeout,
then the timeouts for multipathing become *really* long, because they use
the individual path timeouts times the number of paths.

So if you have a 300-second failure timeout and two paths, you could easily
have a 600-second timeout for the MP device. This is likely why the MP
software sets the timeout to 5 seconds.

--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to open-iscsi+unsubscr...@googlegroups.com.
To post to this group, send email to open-iscsi@googlegroups.com.
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.
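The arithmetic in the reply above (per-path timeout times the number of paths) can be sketched as follows. This is only a rough illustration, assuming the paths are tried one after another and each waits out its full timeout; `worst_case_multipath_timeout` is a hypothetical helper, not part of open-iscsi or multipath-tools.

```python
def worst_case_multipath_timeout(per_path_timeout_s: int, path_count: int) -> int:
    """Rough upper bound on how long IO to the multipath device may block.

    Assumption: every path is tried in turn and each one waits out its
    full failure timeout before the next path is used.
    """
    if per_path_timeout_s < 0:
        raise ValueError("per-path timeout must not be negative")
    if path_count < 1:
        raise ValueError("at least one path is required")
    return per_path_timeout_s * path_count


# The example from the thread: 300 seconds per path, two paths.
print(worst_case_multipath_timeout(300, 2))
# The 5-second value that multipathd applied, again with two paths.
print(worst_case_multipath_timeout(5, 2))
```

Under these assumptions, a short per-path timeout keeps the total delay before failover small, which matches the explanation given for the 5-second value.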
Re: Timeout values for iSCSI
On Wednesday, 14 April 2010 01:33:13 UTC+5:30, Mike Christie wrote:
> On 04/13/2010 03:23 AM, Christian Iversen wrote:
> > Hi iSCSI guys
> >
> > I've set up iSCSI storage on our servers, using IETD and OpenISCSI.
> >
> > It works and performs great, but I am a little unsure of how to adjust
> > the timeout values properly.
> >
> > On our storage servers, we use heartbeat to achieve HA failover, which
> > works nicely. However, the client machines only try for a fixed amount
> > of time before giving up, so if the failover for some reason does not
> > happen relatively quickly, everything grinds to a halt in a really bad way.
> >
> > I would like to set up open-iscsi to keep trying, preferably at low
> > intervals, and not give up contacting the server.
> >
> > There are quite a few different timeouts, and I have been unable to find
> > any sort of reference documentation for this. Maybe someone here can help?
>
> Did you read the README? I tried to document the timeouts that are asked
> about most frequently on the list.
>
> > What I'd like is the following:
> >
> > - Never give up trying (or at least try for a month :)
>
> The iscsi initiator almost always tries to reconnect to the target. If
> it gets a successful login and then that fails, it will try to relogin until
> the user runs some iscsiadm command to logout.
>
> If you mean you want it to hold onto IO and not fail it, then you want
> the replacement_timeout/recovery_timeout. There should be info in the
> README and iscsid.conf about this. If it is not clear let me know.
>
> If in the iscsid.conf you see this for
> node.session.timeo.replacement_timeout then this is what I think you are
> asking for (that is if you are saying you do not want IO failed) and you
> want to set the value to 0.
> # - If the value is 0, IO will be failed immediately.
> # - If the value is less than 0, IO will remain queued until the session
> #   is logged back in, or until the user runs the logout command.
>
> > - Try every 1 second
> > - Timeout should work for all stages of the session,
> >   be it logged in or not.

Even though I changed node.session.timeo.replacement_timeout to 300, the
value in /sys/class/iscsi_session/session/recov_tmo is not updated, because
the multipathd service overrides the iscsi_timeout setting and sets it to
5 seconds. Is there any way to change the value without stopping the
multipathd service?

> > Can anybody help?
> >
> > Please CC me as I'm not on the list.
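To see which recovery timeout is actually in effect after multipathd has had its say, the `recov_tmo` attribute mentioned above can be read back from sysfs. The sketch below is a minimal illustration, assuming the session directories are named `session<N>` under `/sys/class/iscsi_session`; `read_recovery_timeouts` is a hypothetical helper name, not an open-iscsi tool.

```python
from pathlib import Path


def read_recovery_timeouts(sysfs_root="/sys/class/iscsi_session"):
    """Return {session directory name: recov_tmo in seconds}.

    Assumption: each iSCSI session has a directory matching 'session*'
    that contains a 'recov_tmo' file holding a single integer.
    """
    timeouts = {}
    for attr in sorted(Path(sysfs_root).glob("session*/recov_tmo")):
        timeouts[attr.parent.name] = int(attr.read_text().strip())
    return timeouts


if __name__ == "__main__":
    # Prints nothing on a machine without iSCSI sessions.
    for session, seconds in read_recovery_timeouts().items():
        print(f"{session}: recov_tmo = {seconds}s")
```

Comparing this output with the value of node.session.timeo.replacement_timeout shows whether another service has overridden the setting.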
Re: Timeout values for iSCSI
On 04/15/2010 04:33 AM, Christian Iversen wrote:
> What about these timeouts?
>
> node.session.err_timeo.abort_timeout = x
> node.session.err_timeo.lu_reset_timeout = y
> node.session.err_timeo.host_reset_timeout = z

I would just use the defaults.

> What are reasonable values for x, y and z, and when are they used?
>
> > > If there is a low-level error, I'd like iscsi to detect this quickly
> > > and reconnect right away. (this will happen when there's a failover).
> > >
> > > Will the following settings work for this purpose:
> > >
> > > node.conn[0].timeo.noop_out_interval = 2
> > > node.conn[0].timeo.noop_out_timeout = 2
> > > node.session.timeo.replacement_timeout = 86400
> >
> > Yes.
>
> I'll use this then.
>
> > > Per my understanding:
> > >
> > > This will ping the server every 2 seconds, and wait 2 seconds for a
> > > reply. If a connection problem is discovered, the client will try for
> > > 24 hours (86400 seconds) to reestablish a connection before giving up
> > > and returning IO errors to higher layers.
> > >
> > > Is this correct?
> >
> > Yes.
> >
> > > From your description it seems like replacement_timeout = 0 would
> > > cause immediate IO errors in case of connection problems? Or did I
> > > misunderstand?
> >
> > Yeah, on newer versions 0 causes the IO to be failed immediately. I
> > wrote that wrong before.
>
> Was it different on old versions?

Yes, it was different in older versions. If your iscsid.conf does not have
that info about -1 and 0, then you have older tools, and those tools did not
let you set the value to 0. If you did, the iscsi tools would spit out an
error in /var/log/messages saying it was an invalid value and that it was
going to use the default 120 instead. And if you tried to use -1 then it
could overflow and you end up with all kinds of weirdness.
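The replacement_timeout semantics settled on in this thread (for newer tools) can be summarized in a short sketch. It only restates the three cases discussed above; `describe_replacement_timeout` is a hypothetical helper, not an open-iscsi function, and it says nothing about the older tools that rejected 0 or overflowed on -1.

```python
def describe_replacement_timeout(value: int) -> str:
    """Describe how newer open-iscsi tools treat replacement_timeout,
    as explained in this thread."""
    if value < 0:
        # Negative: IO stays queued until relogin or an explicit logout.
        return "queue IO until the session is logged back in or the user logs out"
    if value == 0:
        # Zero: IO is failed as soon as the connection problem is seen.
        return "fail IO immediately"
    # Positive: IO is held for that many seconds, then failed upward.
    return f"queue IO for up to {value} seconds, then return IO errors"


for candidate in (-1, 0, 86400):
    print(candidate, "->", describe_replacement_timeout(candidate))
```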
Re: Timeout values for iSCSI
On 2010-04-14 19:42, Mike Christie wrote:
> On 04/14/2010 07:02 AM, Christian Iversen wrote:
> > > > What I'd like is the following:
> > > >
> > > > - Never give up trying (or at least try for a month :)
> > >
> > > The iscsi initiator almost always tries to reconnect to the target. If
> > > it gets a successful login and then that fails, it will try to relogin
> > > until the user runs some iscsiadm command to logout.
> > >
> > > If you mean you want it to hold onto IO and not fail it, then you want
> > > the replacement_timeout/recovery_timeout. There should be info in the
> > > README and iscsid.conf about this. If it is not clear let me know.
> >
> > There's info about replacement_timeout, but no recovery_timeout. Maybe
> > only the former is a valid name?
>
> replacement_timeout is the name of the setting in iscsid.conf, but for
> some dumb reason I named it recovery_timeout in the kernel.

Ah, ok. I'll go with replacement_timeout then :)

> > > If in the iscsid.conf you see this for
> > > node.session.timeo.replacement_timeout then this is what I think you
> > > are asking for (that is if you are saying you do not want IO failed)
> > > and you want to set the value to 0.
> > > # - If the value is 0, IO will be failed immediately.
> > > # - If the value is less than 0, IO will remain queued until the session
> > > #   is logged back in, or until the user runs the logout command.
> >
> > I'm a little unsure about the semantics for "failed io". What I want is
> > the iscsi client to see all IO as working, or hanging indefinitely if
> > the server cannot be contacted.
>
> Then set the replacement_timeout to -1.

Ok.

What about these timeouts?

node.session.err_timeo.abort_timeout = x
node.session.err_timeo.lu_reset_timeout = y
node.session.err_timeo.host_reset_timeout = z

What are reasonable values for x, y and z, and when are they used?

> > If there is a low-level error, I'd like iscsi to detect this quickly and
> > reconnect right away. (this will happen when there's a failover).
> >
> > Will the following settings work for this purpose:
> >
> > node.conn[0].timeo.noop_out_interval = 2
> > node.conn[0].timeo.noop_out_timeout = 2
> > node.session.timeo.replacement_timeout = 86400
>
> Yes.

I'll use this then.

> > Per my understanding:
> >
> > This will ping the server every 2 seconds, and wait 2 seconds for a
> > reply. If a connection problem is discovered, the client will try for
> > 24 hours (86400 seconds) to reestablish a connection before giving up
> > and returning IO errors to higher layers.
> >
> > Is this correct?
>
> Yes.
>
> > From your description it seems like replacement_timeout = 0 would cause
> > immediate IO errors in case of connection problems? Or did I
> > misunderstand?
>
> Yeah, on newer versions 0 causes the IO to be failed immediately. I wrote
> that wrong before.

Was it different on old versions?

In any case, I'll use a value of 86400 for that timeout :)

--
Best regards

Christian Iversen

Sikkerhed.org ApS
Fuglebakkevej 88        E-mail: supp...@sikkerhed.org
1. sal                  Web: www.sikkerhed.org
DK-2000 Frederiksberg   Direct: c...@sikkerhed.org
Re: Timeout values for iSCSI
On 04/14/2010 07:02 AM, Christian Iversen wrote:
> On 2010-04-13 22:03, Mike Christie wrote:
> > On 04/13/2010 03:23 AM, Christian Iversen wrote:
> > > Hi iSCSI guys
> > >
> > > I've set up iSCSI storage on our servers, using IETD and OpenISCSI.
> > >
> > > It works and performs great, but I am a little unsure of how to adjust
> > > the timeout values properly.
> > >
> > > On our storage servers, we use heartbeat to achieve HA failover, which
> > > works nicely. However, the client machines only try for a fixed amount
> > > of time before giving up, so if the failover for some reason does not
> > > happen relatively quickly, everything grinds to a halt in a really bad
> > > way.
> > >
> > > I would like to set up open-iscsi to keep trying, preferably at low
> > > intervals, and not give up contacting the server.
> > >
> > > There are quite a few different timeouts, and I have been unable to
> > > find any sort of reference documentation for this. Maybe someone here
> > > can help?
> >
> > Did you read the README? I tried to document the timeouts that are asked
> > about most frequently on the list.
>
> Thank you! I've been looking for that kind of document for a while. Things
> are somewhat clearer now :)
>
> > > What I'd like is the following:
> > >
> > > - Never give up trying (or at least try for a month :)
> >
> > The iscsi initiator almost always tries to reconnect to the target. If
> > it gets a successful login and then that fails, it will try to relogin
> > until the user runs some iscsiadm command to logout.
> >
> > If you mean you want it to hold onto IO and not fail it, then you want
> > the replacement_timeout/recovery_timeout. There should be info in the
> > README and iscsid.conf about this. If it is not clear let me know.
>
> There's info about replacement_timeout, but no recovery_timeout. Maybe
> only the former is a valid name?

replacement_timeout is the name of the setting in iscsid.conf, but for some
dumb reason I named it recovery_timeout in the kernel.

> > If in the iscsid.conf you see this for
> > node.session.timeo.replacement_timeout then this is what I think you are
> > asking for (that is if you are saying you do not want IO failed) and you
> > want to set the value to 0.
> > # - If the value is 0, IO will be failed immediately.
> > # - If the value is less than 0, IO will remain queued until the session
> > #   is logged back in, or until the user runs the logout command.
>
> I'm a little unsure about the semantics for "failed io". What I want is
> the iscsi client to see all IO as working, or hanging indefinitely if the
> server cannot be contacted.

Then set the replacement_timeout to -1.

> If there is a low-level error, I'd like iscsi to detect this quickly and
> reconnect right away. (this will happen when there's a failover).
>
> Will the following settings work for this purpose:
>
> node.conn[0].timeo.noop_out_interval = 2
> node.conn[0].timeo.noop_out_timeout = 2
> node.session.timeo.replacement_timeout = 86400

Yes.

> Per my understanding:
>
> This will ping the server every 2 seconds, and wait 2 seconds for a reply.
> If a connection problem is discovered, the client will try for 24 hours
> (86400 seconds) to reestablish a connection before giving up and returning
> IO errors to higher layers.
>
> Is this correct?

Yes.

> From your description it seems like replacement_timeout = 0 would cause
> immediate IO errors in case of connection problems? Or did I misunderstand?

Yeah, on newer versions 0 causes the IO to be failed immediately. I wrote
that wrong before.
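The three settings confirmed above can be turned into a rough timeline. The sketch below parses the settings as quoted in the thread and estimates how long it takes to notice a dead connection and how long IO is then held. It assumes, as a simplification not stated in the thread, that the worst-case detection time is one ping interval plus one ping timeout; `parse_settings` and the two estimate functions are hypothetical helpers.

```python
SETTINGS = """
node.conn[0].timeo.noop_out_interval = 2
node.conn[0].timeo.noop_out_timeout = 2
node.session.timeo.replacement_timeout = 86400
"""


def parse_settings(text):
    """Parse 'key = value' lines with integer values, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings


def worst_case_detection_seconds(settings):
    """Assumed bound: wait one full interval for the next ping, then its timeout."""
    return (settings["node.conn[0].timeo.noop_out_interval"]
            + settings["node.conn[0].timeo.noop_out_timeout"])


def seconds_until_io_errors(settings):
    """Detection time plus the time IO is held before errors are returned."""
    return (worst_case_detection_seconds(settings)
            + settings["node.session.timeo.replacement_timeout"])


cfg = parse_settings(SETTINGS)
print("detect failure within about", worst_case_detection_seconds(cfg), "seconds")
print("IO errors after about", seconds_until_io_errors(cfg), "seconds")
```

This estimate only applies to a positive replacement_timeout; with -1, as suggested above, IO would stay queued rather than fail.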
Re: Timeout values for iSCSI
On 2010-04-13 22:03, Mike Christie wrote:
> On 04/13/2010 03:23 AM, Christian Iversen wrote:
> > Hi iSCSI guys
> >
> > I've set up iSCSI storage on our servers, using IETD and OpenISCSI.
> >
> > It works and performs great, but I am a little unsure of how to adjust
> > the timeout values properly.
> >
> > On our storage servers, we use heartbeat to achieve HA failover, which
> > works nicely. However, the client machines only try for a fixed amount
> > of time before giving up, so if the failover for some reason does not
> > happen relatively quickly, everything grinds to a halt in a really bad way.
> >
> > I would like to set up open-iscsi to keep trying, preferably at low
> > intervals, and not give up contacting the server.
> >
> > There are quite a few different timeouts, and I have been unable to find
> > any sort of reference documentation for this. Maybe someone here can help?
>
> Did you read the README? I tried to document the timeouts that are asked
> about most frequently on the list.

Thank you! I've been looking for that kind of document for a while. Things
are somewhat clearer now :)

> > What I'd like is the following:
> >
> > - Never give up trying (or at least try for a month :)
>
> The iscsi initiator almost always tries to reconnect to the target. If
> it gets a successful login and then that fails, it will try to relogin until
> the user runs some iscsiadm command to logout.
>
> If you mean you want it to hold onto IO and not fail it, then you want
> the replacement_timeout/recovery_timeout. There should be info in the
> README and iscsid.conf about this. If it is not clear let me know.

There's info about replacement_timeout, but no recovery_timeout. Maybe only
the former is a valid name?

> If in the iscsid.conf you see this for
> node.session.timeo.replacement_timeout then this is what I think you are
> asking for (that is if you are saying you do not want IO failed) and you
> want to set the value to 0.
> # - If the value is 0, IO will be failed immediately.
> # - If the value is less than 0, IO will remain queued until the session
> #   is logged back in, or until the user runs the logout command.

I'm a little unsure about the semantics for "failed io". What I want is the
iscsi client to see all IO as working, or hanging indefinitely if the server
cannot be contacted.

If there is a low-level error, I'd like iscsi to detect this quickly and
reconnect right away. (this will happen when there's a failover).

Will the following settings work for this purpose:

node.conn[0].timeo.noop_out_interval = 2
node.conn[0].timeo.noop_out_timeout = 2
node.session.timeo.replacement_timeout = 86400

Per my understanding:

This will ping the server every 2 seconds, and wait 2 seconds for a reply.
If a connection problem is discovered, the client will try for 24 hours
(86400 seconds) to reestablish a connection before giving up and returning
IO errors to higher layers.

Is this correct? From your description it seems like replacement_timeout = 0
would cause immediate IO errors in case of connection problems? Or did I
misunderstand?

--
Best regards

Christian Iversen

Sikkerhed.org ApS
Fuglebakkevej 88        E-mail: supp...@sikkerhed.org
1. sal                  Web: www.sikkerhed.org
DK-2000 Frederiksberg   Direct: c...@sikkerhed.org
Re: Timeout values for iSCSI
On 04/13/2010 03:23 AM, Christian Iversen wrote:
> Hi iSCSI guys
>
> I've set up iSCSI storage on our servers, using IETD and OpenISCSI.
>
> It works and performs great, but I am a little unsure of how to adjust
> the timeout values properly.
>
> On our storage servers, we use heartbeat to achieve HA failover, which
> works nicely. However, the client machines only try for a fixed amount
> of time before giving up, so if the failover for some reason does not
> happen relatively quickly, everything grinds to a halt in a really bad way.
>
> I would like to set up open-iscsi to keep trying, preferably at low
> intervals, and not give up contacting the server.
>
> There are quite a few different timeouts, and I have been unable to find
> any sort of reference documentation for this. Maybe someone here can help?

Did you read the README? I tried to document the timeouts that are asked
about most frequently on the list.

> What I'd like is the following:
>
> - Never give up trying (or at least try for a month :)

The iscsi initiator almost always tries to reconnect to the target. If
it gets a successful login and then that fails, it will try to relogin until
the user runs some iscsiadm command to logout.

If you mean you want it to hold onto IO and not fail it, then you want
the replacement_timeout/recovery_timeout. There should be info in the
README and iscsid.conf about this. If it is not clear let me know.

If in the iscsid.conf you see this for
node.session.timeo.replacement_timeout then this is what I think you are
asking for (that is if you are saying you do not want IO failed) and you
want to set the value to 0.
# - If the value is 0, IO will be failed immediately.
# - If the value is less than 0, IO will remain queued until the session
#   is logged back in, or until the user runs the logout command.

> - Try every 1 second
> - Timeout should work for all stages of the session,
>   be it logged in or not.
>
> Can anybody help?
>
> Please CC me as I'm not on the list.