Richard Sheath created QPID-4233:
------------------------------------
Summary: Windows C++ client does not reconnect when port is blocked
then re-opened (e.g. by firewall)
Key: QPID-4233
URL: https://issues.apache.org/jira/browse/QPID-4233
Project: Qpid
Issue Type: Bug
Components: C++ Client
Affects Versions: 0.16, 0.14, Future
Environment: Windows 7 64bit
Reporter: Richard Sheath
Looking at the Qpid client code, it does not seem to handle reconnection if the
TCP port is blocked and then unblocked.
We are using the Windows client with SSL and SASL, having applied this patch to
the 0.14 codebase: https://issues.apache.org/jira/browse/QPID-3914
I then noticed this JIRA, which sounded like the same issue that I was
experiencing: https://issues.apache.org/jira/browse/QPID-3759. I applied its
patch, but I am still getting the same issue.
This is a sample of how we are creating the connection:
m_pConnection = new qpid::messaging::Connection(
    "amqp:ssl:<IP1>:<port1>,<IP2>:<port2>", "");
// IP1:port1 and IP2:port2 are the same, as we currently only have one
// server to connect to.
m_pConnection->setOption("transport", "ssl");
m_pConnection->setOption("sasl_mechanisms", "EXTERNAL");
m_pConnection->setOption("ssl-cert-filename", m_strSslCertFileName.c_str());
m_pConnection->setOption("ssl-cert-filenamepass", m_strSslCertFileNamePassword.c_str());
m_pConnection->setOption("host-cert-filename", m_strHostCertFileName.c_str());
m_pConnection->setOption("heartbeat", 30); // 30 seconds; defaults to 0, which is no heartbeats
m_pConnection->setOption("reconnect", true); // defaults to false
m_pConnection->setOption("reconnect-interval", 30); // 30 seconds; default is 60 seconds
m_pConnection->open();
We then create 3 Sessions from the connection e.g.:
m_SessionResponse = m_pConnection->createSession("Response");
Using one of these sessions we create both a receiver and a sender.
And we create a receiver for each of the other 2 sessions.
I am expecting these receivers and sender to remain active for the lifetime of
the program. We call receiver.fetch(Duration::SECOND * 10); in a loop on its
own thread for each receiver.
We start the application and it connects and runs OK. We then block the port
using Windows Firewall to simulate a network issue. From that point on, the
call to .fetch(Duration::SECOND * 10) never returns, and a call to
qpid::messaging::Sender::send returns without throwing an exception.
I am not sure exactly what should happen in this scenario. These are my
thoughts; please advise or correct:
1) At a minimum, the fetch should throw an exception so the calling application
knows there is a problem.
2) Possibly the send should also throw an exception, again so the calling
application knows there is a problem.
3) If "reconnect" is enabled then we should try to reconnect (to the same
IP:port).
4) If multiple IPs are specified we should failover to the next IP on reconnect.
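As a rough sketch of the behaviour described in points 3) and 4), the retry logic could look something like the following. This is not the Qpid implementation: ReconnectResult, reconnectWithFailover, tryConnect and waitBeforeRetry are hypothetical names introduced only for illustration, and the actual connection attempt is passed in as a callback so the example does not depend on the qpid library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of one reconnect pass; not a qpid type.
struct ReconnectResult {
    bool connected;   // true if some attempt succeeded
    std::string url;  // address that accepted the connection, if any
    int attempts;     // number of connection attempts made
};

// Try each address in turn, wrapping around to the first one, until
// tryConnect succeeds or maxAttempts attempts have been made.
// waitBeforeRetry is called after every failed attempt; a real client would
// sleep for its reconnect interval there.
inline ReconnectResult reconnectWithFailover(
    const std::vector<std::string>& urls,
    int maxAttempts,
    const std::function<bool(const std::string&)>& tryConnect,
    const std::function<void()>& waitBeforeRetry = [] {})
{
    assert(maxAttempts >= 0);
    ReconnectResult result{false, "", 0};
    if (urls.empty()) return result;

    for (int i = 0; i < maxAttempts; ++i) {
        // Round-robin over the configured addresses.
        const std::string& url = urls[static_cast<std::size_t>(i) % urls.size()];
        ++result.attempts;
        if (tryConnect(url)) {
            result.connected = true;
            result.url = url;
            return result;
        }
        waitBeforeRetry();
    }
    return result;  // every attempt failed
}
```

With a single configured address (as in our setup, where IP1:port1 and IP2:port2 are the same), this simply retries the same IP:port until the port is unblocked or the attempt limit is reached.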
I can see in the qpid log that the heartbeats are timing out with this message
"Traffic timeout". This could possibly be used to trigger the reconnect.
Also, I noticed when debugging that void TCPConnector::eof(AsynchIO&) is called
well before the heartbeat timeout, so maybe this could be used instead.
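To illustrate the idea of using a traffic timeout as the reconnect trigger, here is a minimal sketch of an idle watchdog. It is only an assumption about how such a check could be structured, not how Qpid does it: TrafficWatchdog and its methods are hypothetical names, and the timeout value is left to the caller rather than derived from the "heartbeat" option.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of qpid: remembers when traffic was last
// seen and reports when the connection has been silent for too long.
class TrafficWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    TrafficWatchdog(std::chrono::seconds timeout, Clock::time_point now)
        : timeout_(timeout), lastTraffic_(now)
    {
        assert(timeout.count() > 0);
    }

    // Call whenever any frame (data or heartbeat) arrives from the broker.
    void onTraffic(Clock::time_point now) { lastTraffic_ = now; }

    // True once no traffic has been seen for at least the timeout. The caller
    // could then close the transport and start a reconnect.
    bool timedOut(Clock::time_point now) const
    {
        return now - lastTraffic_ >= timeout_;
    }

private:
    std::chrono::seconds timeout_;
    Clock::time_point lastTraffic_;
};
```

The current time is passed in explicitly so the check can be driven from whatever timer the client already has. An end-of-file notification on the socket, such as the eof callback mentioned above, could trigger the same reconnect path without waiting for the timeout.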
I am new to this codebase, so any help would be appreciated. It could even be
that I am just using the qpid library in the wrong way, or am missing a
required config setting or setOption call.
Thanks
Richard