Re: How to investigate these error codes

2016-08-16 Thread Patrick Hunt
Which version of the C client are you using: multi-threaded or
single-threaded? If multi-threaded, the library (incl. pthreads) takes care
of the periodic heartbeats for you. If single-threaded, you might be
starving the event processing, which includes the heartbeat loop.
See the THREADED sections of cli.c for an example.

Patrick
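
The starvation effect described above can be illustrated with a toy model. This is only a sketch and does not use the ZooKeeper C API: toy_session, toy_process and toy_run are hypothetical names, and the model assumes a heartbeat goes out only when the application gets around to running its event-processing pass.

```c
#include <assert.h>

/* Toy model only; none of these names exist in the ZooKeeper C client. */
typedef struct {
    long session_timeout_ms; /* silence after which the session is expired */
    long last_heartbeat_ms;  /* time of the last heartbeat the client sent */
    int expired;             /* becomes 1 once the session is lost */
} toy_session;

/* One pass of event processing at time now_ms: if the client has been
 * silent for longer than the session timeout, the session is gone;
 * otherwise a heartbeat is sent and the silence counter restarts. */
void toy_process(toy_session *s, long now_ms)
{
    if (now_ms - s->last_heartbeat_ms > s->session_timeout_ms) {
        s->expired = 1;
        return;
    }
    s->last_heartbeat_ms = now_ms;
}

/* Alternate work_ms of blocking application work with one
 * event-processing pass. Returns 1 if the session expired, else 0. */
int toy_run(long session_timeout_ms, long work_ms, int passes)
{
    toy_session s = { session_timeout_ms, 0, 0 };
    long now_ms = 0;
    int i;

    assert(session_timeout_ms > 0 && work_ms > 0);
    for (i = 0; i < passes && !s.expired; i++) {
        now_ms += work_ms;       /* the application blocks here */
        toy_process(&s, now_ms); /* only now can a heartbeat go out */
    }
    return s.expired;
}
```

With a 30,000 ms timeout, toy_run(30000, 1000, 100) returns 0, while toy_run(30000, 45000, 100) returns 1: a single 45-second stretch without event processing is enough to lose the session in this model, even on an idle network.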

On Mon, Aug 15, 2016 at 12:01 AM, Krizansky, Jan <jkrizan...@netsuite.com>
wrote:

> Yes, we're using the C client but we don't seem to have any network issues
> or load issues (in fact the setup is still in development mode so there is
> little to no traffic going through it).
> We have also set a fairly high session timeout of 1,800,000 and a tickTime
> of 900,000. Yet we're getting SESSIONEXPIRED errors as often as 2-3 times a
> minute.
> Are there any investigation steps you could recommend to pinpoint the
> problem?
>
> Thank you,
> Jan


RE: How to investigate these error codes

2016-08-15 Thread Krizansky, Jan
Yes, we're using the C client but we don't seem to have any network issues or 
load issues (in fact the setup is still in development mode so there is little 
to no traffic going through it).
We have also set a fairly high session timeout of 1,800,000 and a tickTime of 
900,000. Yet we're getting SESSIONEXPIRED errors as often as 2-3 times a minute.
Are there any investigation steps you could recommend to pinpoint the problem?

Thank you,
Jan


NOTICE: This email and any attachments may contain confidential and proprietary 
information of NetSuite Inc. and is for the sole use of the intended recipient 
for the stated purpose. Any improper use or distribution is prohibited. If you 
are not the intended recipient, please notify the sender; do not review, copy 
or distribute; and promptly delete or destroy all transmitted information. 
Please note that all communications and information transmitted through this 
email system may be monitored by NetSuite or its agents and that all incoming 
email is automatically scanned by a third party spam and filtering service




Re: How to investigate these error codes

2016-08-12 Thread Michael Han
On top of what Flavio pointed out:

The liveness of a session is maintained by regular heartbeats between
client and server, and heartbeats can fail for several reasons:

- Network: increased latency or network errors.
- Server overloaded: IO contention or swapping; server GC pauses that take
too long; too many clients connected; a multi-tenant environment.
- Client overloaded.
- Configuration issue: the pre-configured tickTime / minSessionTimeout /
maxSessionTimeout is too low for the specific environment; a shared dataDir
and dataLogDir (which can cause IO contention in some cases).

I think it's hard to tell exactly what's going on in the cluster based on
the information posted here, given how many reasons could cause the issue.
It seems that the cluster was running fine previously, so identifying
what's changed that correlates to the above points might be a good start.
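
As a rough aid for the configuration point in the list above: the server does not necessarily grant the session timeout a client asks for. The sketch below assumes the documented server defaults (minSessionTimeout = 2 x tickTime, maxSessionTimeout = 20 x tickTime) and clamps the requested value into that range. The function name and the convention of passing -1 for "not configured" are hypothetical; this is not ZooKeeper code.

```c
#include <assert.h>

/* Hypothetical helper: estimate the session timeout a server would grant.
 * Pass -1 for min_ms or max_ms to use the assumed defaults of
 * 2 * tickTime and 20 * tickTime. All values are in milliseconds. */
long negotiate_session_timeout_ms(long requested_ms, long tick_ms,
                                  long min_ms, long max_ms)
{
    assert(tick_ms > 0);
    if (min_ms < 0)
        min_ms = 2 * tick_ms;   /* assumed default minSessionTimeout */
    if (max_ms < 0)
        max_ms = 20 * tick_ms;  /* assumed default maxSessionTimeout */

    if (requested_ms < min_ms)
        return min_ms;          /* request too low: raised to the minimum */
    if (requested_ms > max_ms)
        return max_ms;          /* request too high: capped at the maximum */
    return requested_ms;
}
```

With the values reported in this thread (tickTime 900,000 and a requested timeout of 1,800,000), the function returns 1,800,000, which is exactly the lower bound. Comparing the requested value with the timeout the client actually reports after connecting is one quick way to check whether configuration is involved.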



-- 
Cheers
Michael.


Re: How to investigate these error codes

2016-08-12 Thread Flavio Junqueira
Hi Jan,

Connection loss means that the client has disconnected from the server it was 
connected to and it will try to connect to another server to avoid session 
expiration.

Session expired means that your session has expired. :-)

Session expiration is important because if you have ephemeral nodes associated 
with that session, they will be gone, so it might trigger some recovery path in 
your application.

You're using the C client? If so, then it is not going to be garbage collection 
on the client side causing your clients to disconnect, which is a pretty common 
cause for applications using the Java client. You may want to investigate if 
you're having some network issues or if perhaps your servers are overwhelmed 
with something. If you're sharing the disk devices and other applications are 
inducing a good number of IOs, then you may end up affecting the performance of 
the server.

-Flavio 
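
The difference between the two errors can be sketched as a small decision function. The numeric values mirror ZOK, ZCONNECTIONLOSS and ZSESSIONEXPIRED from the ZOO_ERRORS enum in zookeeper.h; everything else (the SKETCH_ prefix, recovery_action, recovery_for) is a hypothetical name used for illustration, and the recovery policy is an assumption about a typical application rather than something the library prescribes.

```c
/* Values mirror the ZOO_ERRORS enum in zookeeper.h; the SKETCH_ names
 * are hypothetical so this compiles without the ZooKeeper headers. */
enum {
    SKETCH_ZOK = 0,
    SKETCH_ZCONNECTIONLOSS = -4,
    SKETCH_ZSESSIONEXPIRED = -112
};

typedef enum {
    ACTION_NONE,        /* call succeeded */
    ACTION_RETRY,       /* session may still be alive: retry once reconnected */
    ACTION_NEW_SESSION, /* session and its ephemerals are gone: start over */
    ACTION_REPORT       /* any other error: surface it to the caller */
} recovery_action;

/* Map a return code to the recovery path the application should take. */
recovery_action recovery_for(int rc)
{
    switch (rc) {
    case SKETCH_ZOK:
        return ACTION_NONE;
    case SKETCH_ZCONNECTIONLOSS:
        return ACTION_RETRY;
    case SKETCH_ZSESSIONEXPIRED:
        return ACTION_NEW_SESSION;
    default:
        return ACTION_REPORT;
    }
}
```

Keeping the two paths separate matters because only the expired-session path has to recreate ephemeral nodes and re-register watches; a retry after connection loss should also take into account that the original request may already have been applied.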


> On 12 Aug 2016, at 15:39, Krizansky, Jan wrote:
> 
>  
> I'm trying to reach out to you as we couldn't find any satisfying info online.
> We've recently started seeing some errors in our cluster. The prevailing one 
> is ZSESSIONEXPIRED but there sometimes is also a ZCONNECTIONLOSS error.
> We couldn't find any documentation about possible causes of these issues. Any 
> recommendation where we should investigate and what might be causing these?
> 
> The ZCONNECTIONLOSS error is fairly rare, but ZSESSIONEXPIRED is very common, 
> happening on almost every other hit.
> 
> Thank you,
> 
> Jan Krizansky



How to investigate these error codes

2016-08-12 Thread Krizansky, Jan