Hi,

We have the following setup:
HttpClient 3.0.1 --> HWLB --> Apache Synapse 1.2 --> AS hosting web services

We encounter a couple of java.net.SocketException: Connection reset errors
while performing a graceful restart of Apache Synapse 1.2.

By default, HttpClient retries some of these requests but not others. We have
not changed the default implementation of the retry handler. According to my
current understanding, only requests that were not completely sent will be
retried (which is generally a very good idea, as most of the backend operations
involved are not idempotent and cannot be retried safely).

So before trying to modify the default behaviour by providing a custom
implementation of a retry handler, we would first like to understand the exact
cause of the connection reset in this particular case.
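
To make clear what kind of retry decision we mean, here is a minimal sketch in
plain Java. The class name ConservativeRetryPolicy and the method shouldRetry
are hypothetical and not part of HttpClient. In HttpClient 3.x the same logic
would, as far as we understand, go into an implementation of
HttpMethodRetryHandler.retryMethod(HttpMethod, IOException, int), with the
requestSent flag taken from HttpMethod.isRequestSent().

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.NoRouteToHostException;
import java.net.UnknownHostException;

// Hypothetical helper, not an HttpClient class: it only models the decision
// "retry only while the request has not been fully sent".
final class ConservativeRetryPolicy {
    private final int maxRetries;

    ConservativeRetryPolicy(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /**
     * @param failure        the I/O error raised while executing the request
     * @param requestSent    true if the request was completely written to the wire
     * @param executionCount number of attempts made so far (1 for the first failure)
     */
    boolean shouldRetry(IOException failure, boolean requestSent, int executionCount) {
        if (executionCount > maxRetries) {
            return false; // give up after the configured number of attempts
        }
        if (failure instanceof InterruptedIOException) {
            return false; // timeouts: the server may still be working on the request
        }
        if (failure instanceof UnknownHostException
                || failure instanceof NoRouteToHostException) {
            return false; // retrying will not fix a resolution or routing problem
        }
        // A non-idempotent request is only safe to repeat if the server
        // cannot have received it in full.
        return !requestSent;
    }
}
```

With this policy a "Connection reset" seen before the request body was fully
written is retried, while the same exception after a complete send is reported
to the caller.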

Generally the reason should be an "unexpected" server-side connection close.
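
For reference, the following self-contained sketch shows one way such a reset
can be provoked on a plain socket. It is not a claim about what Synapse
actually does. The class name ConnectionResetDemo is made up for illustration,
and the use of SO_LINGER with a zero timeout is only an assumption about how an
abortive close might come about.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustration only: a server that aborts an accepted connection so that the
// client's next read fails instead of seeing a clean end of stream.
final class ConnectionResetDemo {

    /** Returns the exception the client saw on read, or null if the read ended cleanly. */
    static IOException provokeReset() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                try {
                    Socket accepted = server.accept();
                    // Linger enabled with timeout 0: close() sends RST instead of FIN.
                    accepted.setSoLinger(true, 0);
                    accepted.close();
                } catch (IOException ignored) {
                    // Nothing to do in this demo.
                }
            });
            serverThread.start();

            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                serverThread.join(); // wait until the server has aborted the connection
                try {
                    client.getInputStream().read();
                    return null; // clean end of stream, no reset observed
                } catch (IOException e) {
                    return e; // typically a java.net.SocketException
                }
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("demo setup failed", e);
        }
    }
}
```

A normal close (FIN) would instead make read() return -1, which is why we
suspect that something on the server side or in between is aborting the
connection rather than closing it in an orderly way.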

From a high-level perspective, the following happens during a graceful restart
of Synapse:

1. Graceful shutdown request is received
2. Request pause of any listener, sender or task threads
3. Acknowledgement of pause execution
4. Periodically check the processing of requests that were already accepted
(before the listener pause completed) and wait until all threads are idle and
there are no active connections
5. End of graceful period
6. Restart of instance

We encounter the connection resets in the phase between steps 3 and 5. First we
tried to understand whether only persistent connections are involved or not.
So we disabled keep-alive on the HttpClient side. This did not change the
situation. We then also disabled keep-alive on the ESB side, between the ESB
and the service. We still received those exceptions on the server side.

We are not quite sure how the HWLB affects the problem. It generally notices
the listener pause and directs traffic to the other Apache Synapse nodes. In
this environment we are not able to capture traffic via tcpmon, but we could
activate wire logs on both the HttpClient and the Synapse end.
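
On the HttpClient side, the wire log could be switched on roughly as follows.
This is a sketch based on the logger names given in the HttpClient 3.x logging
guide. It assumes that commons-logging's SimpleLog is acceptable in this
environment and that the method is called before the first HttpClient class is
loaded. The class name WireLogSetup is hypothetical.

```java
// Hypothetical helper: sets the commons-logging system properties that
// route HttpClient 3.x wire and context logging through SimpleLog.
final class WireLogSetup {

    static void enableWireLog() {
        // Use the simple logger shipped with commons-logging.
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        // Timestamps help to correlate entries with the Synapse shutdown phases.
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Everything written to and read from the connection.
        System.setProperty("org.apache.commons.logging.simplelog.log.httpclient.wire", "debug");
        // HttpClient's own context logging (connection handling, retries).
        System.setProperty(
                "org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient",
                "debug");
    }
}
```

With timestamps in place it should be possible to see whether a reset occurs
before or after the request was written completely.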

Can anyone please point us in the right direction for analyzing and
understanding the cause of the connection reset?
Once this part is done, we need to decide how to handle this properly. We would
like to lose no requests and not mistakenly retry a request (which might
already have been processed).

Any help on this is greatly appreciated.

Thanks a lot,
   Eric
