> On Jan. 30, 2013, 12:28 p.m., Gordon Sim wrote:
> > Is abort() the right call to make? What about close()? It looks to me like 
> > the 'purpose' of abort is to trigger a simulated eof call on the connection 
> > processing thread from e.g. the heartbeat timer thread. Since in this case 
> > we are already on the connection processing thread, why would close() not 
> > do the job (this is not the same as issuing a clean connection.close 
> > sequence I don't believe).
> > 
> > One thing to remember with any change to the IO code is that different 
> > 'transports' (ssl, rdma) and platforms (windows) may involve different 
> > codepaths.
> 
> Alan Conway wrote:
>     Clients do not fail-over if the connection is closed politely, which is 
> the objective here.
> 
> Gordon Sim wrote:
>     Understood and to be clear I'm not suggesting what I would call a 
> 'polite' close (i.e. connection-close; connection-close-ok handshake). 
> However calling qpid::amqp_0_10::Connection::close() should result in the IO 
> layer detecting that the upper layer wants to close and doing so (see 
> qpid::sys::AsynchIOHandler::idle() for example).
>     
>     The behaviour of these calls is not well defined certainly. My only 
> concern is we seem to be modifying what abort() is supposed to do and at 
> least from a skim of the code it seems close() might do what is required 
> (certainly it should result in aio->queueWriteClose() being called in 
> AsynchIOHandler).
> 
> Andrew Stitcher wrote:
>     It is currently true on trunk that that ssl and tcp share the code paths 
> at this level but rdma does not have abort() implemented at all.
>     
>     I tend to agree that overloading abort() is not entirely the correct 
> semantic although does seem pretty close - abort() says "abort this 
> connection; I think it has failed but is still connected for some reason". 
> The correct semantic here could be "close this connection with error".
>     
>     I wonder if it is possible to hook into the authentication in a similar 
> way to ACL to reject the connection there with a failure code.
> 
> Alan Conway wrote:
>     I'd argue that this fixes a bug in abort() - you shouldn't be able to 
> write to an aborted connection. It all works nicely now, unless there's a 
> compelling reason to take a different approach I'd like to leave it as-is. 
> Semantically it seems fine, abort() means terminate the connection abruptly 
> without the close handshake, which is exactly what the HA code wants to do.

I would argue that at present a key part of the role of abort() in practice is 
to be called on another thread, hence the requesting of a callback to eof on 
the IO processing thread which is likely the source of the original problem 
(i.e. abort returning doesn't mean the eof has actually been handled). Is this 
change safe if called from a thread other than the connections IO thread? It 
seems to me that the right fix is to provide an 'inline' way of doing the same 
thing as abort(). (I'm suggesting calling qpid::amqp_0_10::Connection::close() 
_may_ be that).


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9137/#review15837
-----------------------------------------------------------


On Jan. 29, 2013, 9:49 p.m., Alan Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9137/
> -----------------------------------------------------------
> 
> (Updated Jan. 29, 2013, 9:49 p.m.)
> 
> 
> Review request for qpid and Andrew-Duplicate-Accct-Inactiv Konwinski.
> 
> 
> Description
> -------
> 
> HA Fix race condition in rejecting connections.
> 
> Sporadic failure of test_failover_python was caused by a race in rejecting
> connections. There was a very small window where work could be done by a
> connection after it was rejected.
> 
> 
> Diffs
> -----
> 
>   /trunk/qpid/cpp/src/qpid/sys/AsynchIOHandler.cpp 1439431 
> 
> Diff: https://reviews.apache.org/r/9137/diff/
> 
> 
> Testing
> -------
> 
> make check, manual heartbeat test
> 
> 
> Thanks,
> 
> Alan Conway
> 
>

Reply via email to