[
https://issues.apache.org/jira/browse/AMQ-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gregg Saffell updated AMQ-3522:
-------------------------------
Fix Version/s: (was: 5.10.2)
> Bridge connection lost, client side tries to reconnect for a while then stops
> trying (HTTPS through a proxy server)
> -------------------------------------------------------------------------------------------------------------------
>
> Key: AMQ-3522
> URL: https://issues.apache.org/jira/browse/AMQ-3522
> Project: ActiveMQ
> Issue Type: Bug
> Affects Versions: 5.4.2
> Environment: We have a hub-and-spoke bridge architecture where the
> hub is the client that connects (over HTTPS) to seventy some servers. A proxy
> server resides at each server node thus the connection is https over a proxy.
> The client side of the bridge is running on Linux (CentOS), the server side
> of bridge is running on Windows 2003. The proxy server is Apache HTTP server.
> All of this runs over a VPN.
> The specific amq build we are using is from FUSE, namely, 5.4.2-fuse-03-09.
> On the client side, we have amq logging set as follows:
> log4j.rootLogger=INFO, console, logfile
> log4j.logger.org.apache.activemq.spring=WARN
> log4j.logger.org.apache.activemq.web.handler=WARN
> log4j.logger.org.springframework=WARN
> log4j.logger.org.apache.xbean=WARN
> log4j.logger.org.apache.camel=INFO
> log4j.logger.org.apache.activemq.network.DiscoveryNetworkConnector=DEBUG
> Reporter: Gregg Saffell
> Attachments: client-activemq.log, server-access.log,
> server-wrapper.log
>
>
> Each server node has a queue that is dedicated to messages going from the
> server to the client. That queue is therefore expected always to have one
> consumer (the one that does the bridge forwarding). At least two times per
> week, we detect that one of the servers shows zero consumers on that queue
> (in which case messages are not being forwarded across the bridge) and this
> persists until we stop and restart the client amq instance. Examination of
> the logs shows that the client side detected the connection had been lost and
> it retried to establish it for a time but repeatedly failed and then seemed
> to give up. The wrapper.log file on the server side shows "IDLE
> SCEP@...[d=false,io=1,w=true,rb=false,wb=false]" for the period where the
> client is not able to re-establish the connection.
> The client-side network connector definition is as follows:
> <networkConnector name="ECENTRE_TSOLAR_EU"
> uri="static:(https://localhost:61617?
> readCheckTime=20000&initialDelayTime=4000&keepAliveResponseRequired=true&proxyHost=ecentre.tsolar.eu&proxyPort=80)?useExponentialBackOff=false&initialReconnectDelay=5000&maxReconnectDelay=5000"
> duplex="true">
> <dynamicallyIncludedDestinations>
> <queue physicalName="TO_EC.ECENTRE_TSOLAR_EU.>"/>
> <queue physicalName="TO_SCA.>"/>
> </dynamicallyIncludedDestinations>
> <staticallyIncludedDestinations>
> <queue
> physicalName="TO_EC.ECENTRE_TSOLAR_EU.ASYNC_MSGS"/>
> <queue
> physicalName="TO_EC.ECENTRE_TSOLAR_EU.SYNC_RESP"/>
> <queue physicalName="TO_SCA.ASYNC_MSGS"/>
> <queue physicalName="TO_SCA.ADMIN.SYNC_REQ"/>
> <queue physicalName="TO_SCA.SSO.SYNC_REQ"/>
> </staticallyIncludedDestinations>
> </networkConnector>
> The server-side transport connector definition is as follows:
> <transportConnectors>
> <transportConnector name="https"
> uri="https://127.0.0.1:61617?transport.readCheckTime=20000&transport.initialDelayTime=20000&transport.keepAliveResponseRequired=true&needClientAuth=true"/>
> </transportConnectors>
> The proxy server resides on the same machine and is configured with the
> following virtual host:
> <VirtualHost _default_:80>
> ErrorLog logs/default80.ecentre_error.log
> ProxyRequests on
> AllowCONNECT 61617
> ProxyVia on
> LogLevel warn
> RewriteEngine on
> RewriteCond %{REQUEST_METHOD} (GET|POST|HEAD|PUT|DELETE)
> RewriteCond %{REQUEST_URI} !^/eCentreToolServices*
> RewriteCond %{REQUEST_URI} !^/eCentreServices*
> RewriteRule ^(.*)$ https://%{SERVER_NAME}$1 [L,R,NC]
>
> #Take out the TRACE and TRACK
> RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
> RewriteRule .* -[F]
> </VirtualHost>
> I'm about to attach amq logs from both the client and server along with the
> apache access log. The specific remote server that had the problem is
> ecentre.tsolar.eu. Allow me to direct your attention to some specific
> entries:
> -------------------
> client-activemq.log
> -------------------
> 2011-09-30 07:46:33,357 | WARN -- connection to tsolar shutdown due to a
> remote error: java.io.IOException: Could not post command: KeepAliveInfo {}
> due to: java.net.SocketException: Connection reset
> (several attempts to re-establish the connection to tsolar)
> 2011-09-30 07:48:29,566 | INFO -- the last attempt is made to re-establish
> the connection to tsolar
> 2011-09-30 14:06:34,816 | INFO -- we restart amq
> 2011-09-30 14:07:10,646 | INFO -- connection established to tsolar
> -------------------
> server-wrapper.log
> -------------------
> (Note: the server node's clock is around 5 hours 50 minutes ahead of the
> client's)
> The tsolar server
> INFO | jvm 1 | 2011/09/30 13:38:06 | WARN | Stopping an existing active
> duplex connection [Transport Connection to: blockingQueue_8847135] for
> network connector
> (ECENTRE_TSOLAR_EU@ID:amq-us01.ilstechnology.net-44428-1317330422396-0:1).
> (connection is stopped and re-established several times)
> INFO | jvm 1 | 2011/09/30 13:39:57 | INFO - Network connection shutdown
> due to inactivity timeout
> (note the series of "IDLE SCEP@... [d=false,io=1,w=true,rb=false,wb=false]"
> entries here, not sure if they are relevant, they don't appear in
> activemq.log, only in wrapper.log)
> INFO | jvm 1 | 2011/09/30 19:58:38 | INFO - connection re-established
> ------------------------
> server apache access log
> ------------------------
> 216.244.116.13 - - [30/Sep/2011:13:38:05 +0200] -- begins a series of rapid
> "CONNECT localhost:61617 HTTP/1.1" 200 entries
> 216.244.116.13 - - [30/Sep/2011:13:40:04 +0200] -- last in series of rapid
> "CONNECT localhost:61617 HTTP/1.1" 200 entries
> 216.244.116.13 - - [30/Sep/2011:19:58:34 +0200] -- the next "CONNECT
> localhost:61617 HTTP/1.1" 200 entry
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)