Re: How to investigate these error codes
What version of the C client are you using: multi-threaded or single-threaded? If multi-threaded, the library (including pthreads) takes care of the periodic heartbeats for you. If single-threaded, you might be starving the event processing, which includes the heartbeat loop. See the THREADED sections of cli.c for an example.

Patrick

On Mon, Aug 15, 2016 at 12:01 AM, Krizansky, Jan <jkrizan...@netsuite.com> wrote:
> Yes, we're using the C client, but we don't seem to have any network or
> load issues (in fact, the setup is still in development mode, so there is
> little to no traffic going through it).
> We have also set a fairly high session timeout of 1,800,000 and a tickTime
> of 900,000. Yet we're getting the SESSIONEXPIRED error as often as 2-3
> times a minute. Are there any investigation steps you could recommend to
> pinpoint the problem?
>
> Thank you,
> Jan
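Patrick's point about a single-threaded client starving its own heartbeat loop can be illustrated with a toy model. The sketch below is not the ZooKeeper client code: the names session_model, session_model_pump and first_starving_item are hypothetical, and the model simply assumes that a heartbeat can only go out when the application returns to its event loop, and that the session is lost once the gap between two such returns exceeds the session timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only (an assumption for illustration, not the real client). */
typedef struct {
    long session_timeout_ms; /* timeout agreed with the server */
    long last_heartbeat_ms;  /* model time of the last heartbeat */
    bool expired;            /* set once the session has been lost */
} session_model;

static void session_model_init(session_model *s, long session_timeout_ms)
{
    assert(session_timeout_ms > 0);
    s->session_timeout_ms = session_timeout_ms;
    s->last_heartbeat_ms = 0;
    s->expired = false;
}

/* Called each time the application returns to its event loop at now_ms.
 * Returns false if the session is (or has just become) expired. */
static bool session_model_pump(session_model *s, long now_ms)
{
    if (s->expired)
        return false;
    if (now_ms - s->last_heartbeat_ms > s->session_timeout_ms) {
        s->expired = true; /* the loop was starved for too long */
        return false;
    }
    s->last_heartbeat_ms = now_ms; /* heartbeat goes out in time */
    return true;
}

/* Runs n work items of busy_ms[i] each, pumping the loop after every item.
 * Returns the index of the first item that starved the loop long enough to
 * expire the session, or -1 if the session survived all items. */
int first_starving_item(long session_timeout_ms, const long *busy_ms, int n)
{
    session_model s;
    long now_ms = 0;

    session_model_init(&s, session_timeout_ms);
    for (int i = 0; i < n; i++) {
        now_ms += busy_ms[i];
        if (!session_model_pump(&s, now_ms))
            return i;
    }
    return -1;
}
```

In a real single-threaded client the "pump" step corresponds to driving the library's event processing (zookeeper_interest() and zookeeper_process()) from the application's own loop, so any long blocking call between two iterations is a candidate for the starvation Patrick describes. Logging how long each loop iteration takes is one cheap way to check for it.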
RE: How to investigate these error codes
Yes, we're using the C client, but we don't seem to have any network or load issues (in fact, the setup is still in development mode, so there is little to no traffic going through it).
We have also set a fairly high session timeout of 1,800,000 and a tickTime of 900,000. Yet we're getting the SESSIONEXPIRED error as often as 2-3 times a minute. Are there any investigation steps you could recommend to pinpoint the problem?

Thank you,
Jan

-----Original Message-----
From: Flavio Junqueira [mailto:f...@apache.org]
Sent: Friday, August 12, 2016 6:05 PM
To: user@zookeeper.apache.org
Subject: Re: How to investigate these error codes

Hi Jan,

Connection loss means that the client has disconnected from the server it was connected to, and it will try to connect to another server to avoid session expiration.

Session expired means that your session has expired. :-)

Session expiration is important because if you have ephemerals associated with that session, they will be gone, so it might trigger some recovery path in your application.

You're using the C client? If so, then it is not going to be garbage collection on the client side causing your clients to disconnect, which is a pretty common cause for applications using the Java client. You may want to investigate whether you're having some network issues or whether your servers are overwhelmed with something. If you're sharing the disk devices and other applications are inducing a good number of IOs, then you may end up affecting the performance of the server.

-Flavio

NOTICE: This email and any attachments may contain confidential and proprietary information of NetSuite Inc. and is for the sole use of the intended recipient for the stated purpose. Any improper use or distribution is prohibited. If you are not the intended recipient, please notify the sender; do not review, copy or distribute; and promptly delete or destroy all transmitted information. Please note that all communications and information transmitted through this email system may be monitored by NetSuite or its agents and that all incoming email is automatically scanned by a third party spam and filtering service
Re: How to investigate these error codes
On top of what Flavio pointed out:

The liveness of a session is maintained by regular heartbeats between client and server, and heartbeats can fail for several reasons:

- Network: increased latency or network errors.
- Server overloaded: IO contention or swapping; server GC taking too long; too many clients connected to the server; server running in a multi-tenant environment.
- Client overloaded.
- Configuration issue: the configured tickTime / minSessionTimeout / maxSessionTimeout is too low for the specific environment; a shared dataDir and dataLogDir (which can cause IO contention in some cases).

I think it's hard to tell exactly what's going on in the cluster from the information posted here, given how many reasons could cause the issue. It seems that the cluster was running fine previously, so identifying what has changed that correlates with the points above might be a good start.

On Fri, Aug 12, 2016 at 9:05 AM, Flavio Junqueira wrote:
> Hi Jan,
>
> Connection loss means that the client has disconnected from the server it
> was connected to, and it will try to connect to another server to avoid
> session expiration.
>
> Session expired means that your session has expired. :-)
>
> Session expiration is important because if you have ephemerals associated
> with that session, they will be gone, so it might trigger some recovery
> path in your application.
>
> You're using the C client? If so, then it is not going to be garbage
> collection on the client side causing your clients to disconnect, which is
> a pretty common cause for applications using the Java client. You may want
> to investigate whether you're having some network issues or whether your
> servers are overwhelmed with something. If you're sharing the disk devices
> and other applications are inducing a good number of IOs, then you may end
> up affecting the performance of the server.
>
> -Flavio

--
Cheers
Michael.
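The configuration item in Michael's list can be made concrete with a little arithmetic. As far as I know, the server does not accept the client's requested session timeout blindly: it clamps it to the range [minSessionTimeout, maxSessionTimeout], which default to 2 x tickTime and 20 x tickTime. The sketch below only reproduces that clamping under the default bounds. The function name is hypothetical, and it assumes both values are in milliseconds and that tickTime is the server-side setting.

```c
#include <assert.h>

/* Hypothetical helper: the timeout a server using the DEFAULT bounds would
 * grant for a requested session timeout. Assumes milliseconds throughout
 * and that min/maxSessionTimeout are not overridden in the server config. */
long default_negotiated_timeout_ms(long requested_ms, long tick_time_ms)
{
    assert(tick_time_ms > 0);

    long min_ms = 2 * tick_time_ms;  /* default minSessionTimeout */
    long max_ms = 20 * tick_time_ms; /* default maxSessionTimeout */

    if (requested_ms < min_ms)
        return min_ms;
    if (requested_ms > max_ms)
        return max_ms;
    return requested_ms;
}
```

With the numbers from this thread (requested 1,800,000 and tickTime 900,000, if both are milliseconds on the same server), the request sits exactly on the default lower bound, so it would be granted unchanged. If the tickTime actually configured on the servers is much smaller, the upper bound would cut the timeout down. On a connected handle, the C client's zoo_recv_timeout() reports the value that was actually negotiated, which is worth logging when investigating.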
Re: How to investigate these error codes
Hi Jan,

Connection loss means that the client has disconnected from the server it was connected to, and it will try to connect to another server to avoid session expiration.

Session expired means that your session has expired. :-)

Session expiration is important because if you have ephemerals associated with that session, they will be gone, so it might trigger some recovery path in your application.

You're using the C client? If so, then it is not going to be garbage collection on the client side causing your clients to disconnect, which is a pretty common cause for applications using the Java client. You may want to investigate whether you're having some network issues or whether your servers are overwhelmed with something. If you're sharing the disk devices and other applications are inducing a good number of IOs, then you may end up affecting the performance of the server.

-Flavio

> On 12 Aug 2016, at 15:39, Krizansky, Jan <jkrizan...@netsuite.com> wrote:
>
> I'm trying to reach out to you as we couldn't find any satisfying info online.
> We've recently started seeing some errors in our cluster. The prevailing one
> is ZSESSIONEXPIRED, but there is sometimes also a ZCONNECTIONLOSS error.
> We couldn't find any documentation about possible causes of these issues. Any
> recommendation on where we should investigate and what might be causing these?
>
> The ZCONNECTIONLOSS error is fairly rare, but ZSESSIONEXPIRED is very common,
> happening on almost every other hit.
>
> Thank you,
>
> Jan Krizansky
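The distinction Flavio draws between the two errors translates into different recovery paths in application code. The sketch below is one possible way to encode that: the SKETCH_* constants mirror the values of ZOK, ZCONNECTIONLOSS and ZSESSIONEXPIRED from the C client's zookeeper.h and are copied here only so the example compiles on its own (real code should include the header), while recovery_action and classify_error are hypothetical names.

```c
#include <assert.h>

/* Mirrors of the C client's error codes, copied so the sketch is
 * self-contained; real code should use the constants from zookeeper.h. */
enum {
    SKETCH_ZOK = 0,
    SKETCH_ZCONNECTIONLOSS = -4,
    SKETCH_ZSESSIONEXPIRED = -112
};

/* Hypothetical recovery decisions for an application. */
typedef enum {
    ACTION_NONE,              /* call succeeded */
    ACTION_RETRY_SAME_HANDLE, /* session may still be alive: retry later */
    ACTION_NEW_SESSION,       /* session is gone: open a new one */
    ACTION_REPORT             /* any other error: handle case by case */
} recovery_action;

recovery_action classify_error(int rc)
{
    switch (rc) {
    case SKETCH_ZOK:
        return ACTION_NONE;
    case SKETCH_ZCONNECTIONLOSS:
        /* The client tries to reconnect on its own; the session survives
         * if it reaches a server before the timeout runs out. */
        return ACTION_RETRY_SAME_HANDLE;
    case SKETCH_ZSESSIONEXPIRED:
        /* Ephemerals tied to the session are gone, so the application
         * needs a new session and its own recovery path. */
        return ACTION_NEW_SESSION;
    default:
        return ACTION_REPORT;
    }
}

/* Human-readable label for logging. */
const char *recovery_action_name(recovery_action a)
{
    switch (a) {
    case ACTION_NONE:              return "none";
    case ACTION_RETRY_SAME_HANDLE: return "retry on same handle";
    case ACTION_NEW_SESSION:       return "open new session";
    case ACTION_REPORT:            return "report";
    }
    assert(!"unknown recovery_action");
    return "unknown";
}
```

Logging the classified action together with a timestamp for every failed call gives a quick picture of how often each error really occurs, which is a reasonable first step before looking at the network or the servers.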