> On Jan. 30, 2013, 12:28 p.m., Gordon Sim wrote: > > Is abort() the right call to make? What about close()? It looks to me like > > the 'purpose' of abort is to trigger a simulated eof call on the connection > > processing thread from e.g. the heartbeat timer thread. Since in this case > > we are already on the connection processing thread, why would close() not > > do the job (this is not the same as issuing a clean connection.close > > sequence I don't believe). > > > > One thing to remember with any change to the IO code is that different > > 'transports' (ssl, rdma) and platforms (windows) may involve different > > codepaths. > > Alan Conway wrote: > Clients do not fail-over if the connection is closed politely, which is > the objective here. > > Gordon Sim wrote: > Understood and to be clear I'm not suggesting what I would call a > 'polite' close (i.e. connection-close; connection-close-ok handshake). > However calling qpid::amqp_0_10::Connection::close() should result in the IO > layer detecting that the upper layer wants to close and doing so (see > qpid::sys::AsynchIOHandler::idle() for example). > > The behaviour of these calls is not well defined certainly. My only > concern is we seem to be modifying what abort() is supposed to do and at > least from a skim of the code it seems close() might do what is required > (certainly it should result in aio->queueWriteClose() being called in > AsynchIOHandler). > > Andrew Stitcher wrote: > It is currently true on trunk that that ssl and tcp share the code paths > at this level but rdma does not have abort() implemented at all. > > I tend to agree that overloading abort() is not entirely the correct > semantic although does seem pretty close - abort() says "abort this > connection; I think it has failed but is still connected for some reason". > The correct semantic here could be "close this connection with error". > > I wonder if it is possible to hook into the authentication in a similar > way to ACL to reject the connection there with a failure code.
I'd argue that this fixes a bug in abort() - you shouldn't be able to write to an aborted connection. It all works nicely now, unless there's a compelling reason to take a different approach I'd like to leave it as-is. Semantically it seems fine, abort() means terminate the connection abruptly without the close handshake, which is exactly what the HA code wants to do. - Alan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9137/#review15837 ----------------------------------------------------------- On Jan. 29, 2013, 9:49 p.m., Alan Conway wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/9137/ > ----------------------------------------------------------- > > (Updated Jan. 29, 2013, 9:49 p.m.) > > > Review request for qpid and Andrew-Duplicate-Accct-Inactiv Konwinski. > > > Description > ------- > > HA Fix race condition in rejecting connections. > > Sporadic failure of test_failover_python was caused by a race in rejecting > connections. There was a very small window where work could be done by a > connection after it was rejected. > > > Diffs > ----- > > /trunk/qpid/cpp/src/qpid/sys/AsynchIOHandler.cpp 1439431 > > Diff: https://reviews.apache.org/r/9137/diff/ > > > Testing > ------- > > make check, manual heartbeat test > > > Thanks, > > Alan Conway > >
