Re: [PATCH] Bug 3643: Connection Auth redesign
Okay then, applied as trunk 12750. For the record: We have managed to test everything short of the timing change about when unpin() is performed. The effects of leaving the server connection pinned for a few milliseconds longer now than before is unknown - but probably not too bad. Amos
Re: [PATCH] Bug 3643: Connection Auth redesign
On 3/04/2013 3:55 a.m., Alex Rousskov wrote: On 03/28/2013 12:41 AM, Amos Jeffries wrote: New patch attached with the above small changes to conn-auth included. Thanks for addressing all of my concerns, Alex. Is that a +1? Amos
Re: [PATCH] Bug 3643: Connection Auth redesign
On 04/02/2013 11:55 PM, Amos Jeffries wrote: On 3/04/2013 3:55 a.m., Alex Rousskov wrote: On 03/28/2013 12:41 AM, Amos Jeffries wrote: New patch attached with the above small changes to conn-auth included. Thanks for addressing all of my concerns, Is that a +1? Well, I cannot +1 this code because I do not understand enough of it, but the problems I could see seem to be resolved. So it is more like a removal of -1, and even that sounds too extreme :-). Cheers, Alex.
Re: [PATCH] Bug 3643: Connection Auth redesign
On 03/28/2013 12:41 AM, Amos Jeffries wrote: New patch attached with the above small changes to conn-auth included. Thanks for addressing all of my concerns, Alex.
Re: [PATCH] Bug 3643: Connection Auth redesign
On 28/03/2013 7:50 a.m., Alex Rousskov wrote: On 03/27/2013 05:38 AM, Amos Jeffries wrote: On 27/03/2013 10:16 a.m., Alex Rousskov wrote: On 03/25/2013 09:59 PM, Amos Jeffries wrote: snip +// clobbered with alternative credentials +if (aur != credentialsState) { +debugs(33, 2, HERE ERROR: Closing clientConnection due to change of connection-auth from by); +credentialsState-releaseAuthServer(); +credentialsState = NULL; +unpinConnection(); +// this is a fatal type of problem. Close the connection immediately +clientConnection-close(); // XXX: need to send TCP RST packet if we can. +stopReceiving(connection-auth change attempted); +return; +} IIRC, we should either stopReceiving() or close the connection. Doing both is conceptually wrong because stopReceiving() implies that we should keep sending if needed (which connection closure prohibits) and because connection closure already implies that we will not receive anything. Doing both may also lead to confusing double-closures. For this case we need to close the connection ASAP and data loss of everything buffered for this connection is the desired outcome. Let's call comm_reset_close() then (directly or indirectly, see below). The stopReceiving() is to both speed up the signalling that the client connection has closed, and produce a nice error message about why. Are you aware of any better way to exit ConnStateData with an error? There is no API to do that and we should not abuse stop*() API. Our choices are: 1) Just use comm_reset_close(). Fastest closure possible but no nice error message. 2) Add an optional reason to comm_reset_close(). Fastest closure possible, with an error message. Perhaps make this a Connection method? 3) Add ConnStateData::close(message, reset) or similar method. Fastest closure possible, with a context-aware error message. Please note that we should not use ConnStateData::mustStop() because setAuth() may not be under job call protection. We have to close the connection to kill our job, unfortunately. Yes. I'm minded to do (2) and (3) as a followup patch. For now just using comm_reset_close(). NP: Comm::Connection::close() can be called any number of times and protects against repeated comm_close() cycles if that was what you mean? Yes, I know, but that protection does not make repeated closures less confusing. I am also having second thoughts about whether the unpin and stopReceiving side effects matter when called from swanSong(). That comment and code was written for the mk1 patch which used deleteThis in the abort case (and crashed sometimes as a result). What do you think about just calling setAuth(NULL) there? Well, it is difficult to suggest the best action because the need for releaseAuthServer() API itself is a sign of a problem -- the release should be done automatically (by default) when UserRequest is destroyed. However, I am sure it is difficult to fix that. The UserRequest details are not solely locked by this reference. The original bug which prompted this cleanup was about these helper locks staying around after the connection closed simply due to the credentials staying around in use by other state. Which results in the helpers all being falsely in use/reserved under heavy traffic. None of the other code releases the If a proper solution is too difficult, how about adding a private ConnStateData::endAuth() method that will do this: +auth_-releaseAuthServer(); +auth_ = NULL; +unpinConnection(); and then call if from setAuth() and from swanSong(), as needed? This way any changes to that cleanup code will be in one place. Nevermind. I've got it. The bad side effect is that in the placement of clearing auth before closing the connections all the connection closures triggered get the setAuth removal message reportes. If we ensure setAuth(NULL) is used after the connections are closed that disappears. I've adjusted swanSong to cleanup connection auth a lot better now. BTW, 1) Why are we unpinning the connection in setAuth()? Seems like that is not directly related to authentication. Can that call wait until we close the connection and the regular cleanup takes place? In majority of cases it was caused by the authentication, but so long as the unpin is actioned on cleanup I think you are right. All the re-pin and server connection re-use logics in the presence of pinning are gone now yes? I left this in the patch from last year because I was't clear on when unpin occured. 2) How come ConnStateData::swanSong() does not call unpinConnection()? I dont know the answer to that one. I agree it should. Are we sometimes leaking pinning.peer and pinning.host because of that? Quite right about the leak, I have applied the patch changing the swansong cleanup to simply call unpinConnection(). I suspect it was undetected
Re: [PATCH] Bug 3643: Connection Auth redesign
On 27/03/2013 10:16 a.m., Alex Rousskov wrote: On 03/25/2013 09:59 PM, Amos Jeffries wrote: Bug 3643: NTLM helpers stuck in reserved state by Safari NTLM failures are not always cleaning up connection-auth credentials properly. In particular they are not releasing the NTLM helpers when the connection is closed between challenge and handshake completion. Resulting in permanently reserved helpers locking up all access through the proxy. This change redesigns the connection authentication state management to move the auth link/unlink operations into the connection state manager objects instead of being managed by NTLM auth components. As a result we are able to manage credentials from any auth scheme consistently and terminate the connection properly on several error conditions which the auth components are not easily aware of. Fix sponsored by Netbox Blue Pty (http://netboxblue.com/) The bulk of the patch is symbol changes since we are moving the credentials from a public member variable to private one with accessors. The core of the patch logics is contained in ConnStateData::setConnectionAuth which performs error checking and triggers termination of the connection in various ways determined by the type of error that was encountered. In summary: - once credentials are set they are baked into the connection state. - all following requests require a credentials token matching the state one. - any request lacking credentials (or setting NULL) will terminate the connection gracefully. It should end with an auth re-challenge, but that specific detail does depend on ACLs. - any request with a new or altered auth token (injection attacks, broken relay pinning) will terminate the connection immediately (expect lost bytes). - any pinned server connection, and any pinned stateful auth helper is unpined/released on credential errors. Specific cases and handling are covered in greater detail in the patch comments. It also adds a stub file for client_side.h functionality. Amos -if (conn-auth_user_request != NULL) { -*auth_user_request = conn-auth_user_request; +if (conn-getConnectionAuth() != NULL) { +*auth_user_request = conn-getConnectionAuth(); } else { /* failed connection based authentication */ debugs(29, 4, HERE Auth user request *auth_user_request conn-auth user request - conn-auth_user_request conn type - conn-auth_user_request-user()-auth_type authentication failed.); + conn-getConnectionAuth() conn type + conn-getConnectionAuth()-user()-auth_type authentication failed.); This is an old bug not introduced by this patch, but please note that the last debugs() line above is always dereferencing a NULL pointer (and the next-to-last line is always logging a nil pointer). Fixed. +if (credentialsState == NULL) { +debugs(33, 2, HERE Adding connection-auth to clientConnection from by); +credentialsState = aur; +return; +} The if-statement after the above implies that aur can be NULL. If aur is NULL, the above code will not change anything. Should we check for that before we claim that we are adding something or at least debugs() aur? Here and elsewhere, please remove HERE from the new debugs() messages. Done. +private: +/// state of some credentials that can be used to perform authentication on this connection +Auth::UserRequest::Pointer credentialsState; The words state, data, and info usually do not add to the description or understanding of a member. How about renaming credentialsState to just credentials? UserRequest is not just the credentials. It is the credentials along with all the additional state information. ie how, when and if they validated, any protocol options line realm which are relevant, which helper was used if stateful, the HelperReply line details etc. whatever the potocol scheme is written to preserve. I think this is possibly one case where saying state is actually warranted. Changed anyways with the method re-naming. Please move the private section down, after the public/protected sections. Done. Also, can the description be simplified to just say connection-based credentials for this connection or something similar? Done. Finally, why do getConnectionAuth() and setConnectionAuth() use the term connectionAuth while the class member storing the same information is called credentialsState? Can all three names be made consistent/similar? I would drop connection from the accessor names because the ConnStateData class itself determines their scope as connection. For example: public: const Auth::UserRequest::Pointer credentials() {...} void credentials(...); private: Auth::UserRequest::Pointer credentials_; Used getAuth() /
Re: [PATCH] Bug 3643: Connection Auth redesign
On 03/27/2013 05:38 AM, Amos Jeffries wrote: On 27/03/2013 10:16 a.m., Alex Rousskov wrote: On 03/25/2013 09:59 PM, Amos Jeffries wrote: +private: +/// state of some credentials that can be used to perform authentication on this connection +Auth::UserRequest::Pointer credentialsState; The words state, data, and info usually do not add to the description or understanding of a member. How about renaming credentialsState to just credentials? UserRequest is not just the credentials. It is the credentials along with all the additional state information. ie how, when and if they validated, any protocol options line realm which are relevant, which helper was used if stateful, the HelperReply line details etc. whatever the potocol scheme is written to preserve. Yes, of course. It is possible that credentials itself is wrong. As is UserRequest, apparently :-). I think this is possibly one case where saying state is actually warranted. Not in my opinion. The word state does not add any meaning in this context. Pure credentials without all that extra stuff would still be state (just a smaller one). This is, of course, not as bad as something like naming a class implementing a protocol client ServerStateData :-). Used getAuth() / setAuth() / auth_ since its more than just credentials. Looks good to me, thanks! +// clobbered with alternative credentials +if (aur != credentialsState) { +debugs(33, 2, HERE ERROR: Closing clientConnection due to change of connection-auth from by); +credentialsState-releaseAuthServer(); +credentialsState = NULL; +unpinConnection(); +// this is a fatal type of problem. Close the connection immediately +clientConnection-close(); // XXX: need to send TCP RST packet if we can. +stopReceiving(connection-auth change attempted); +return; +} IIRC, we should either stopReceiving() or close the connection. Doing both is conceptually wrong because stopReceiving() implies that we should keep sending if needed (which connection closure prohibits) and because connection closure already implies that we will not receive anything. Doing both may also lead to confusing double-closures. For this case we need to close the connection ASAP and data loss of everything buffered for this connection is the desired outcome. Let's call comm_reset_close() then (directly or indirectly, see below). The stopReceiving() is to both speed up the signalling that the client connection has closed, and produce a nice error message about why. Are you aware of any better way to exit ConnStateData with an error? There is no API to do that and we should not abuse stop*() API. Our choices are: 1) Just use comm_reset_close(). Fastest closure possible but no nice error message. 2) Add an optional reason to comm_reset_close(). Fastest closure possible, with an error message. Perhaps make this a Connection method? 3) Add ConnStateData::close(message, reset) or similar method. Fastest closure possible, with a context-aware error message. Please note that we should not use ConnStateData::mustStop() because setAuth() may not be under job call protection. We have to close the connection to kill our job, unfortunately. NP: Comm::Connection::close() can be called any number of times and protects against repeated comm_close() cycles if that was what you mean? Yes, I know, but that protection does not make repeated closures less confusing. I am also having second thoughts about whether the unpin and stopReceiving side effects matter when called from swanSong(). That comment and code was written for the mk1 patch which used deleteThis in the abort case (and crashed sometimes as a result). What do you think about just calling setAuth(NULL) there? Well, it is difficult to suggest the best action because the need for releaseAuthServer() API itself is a sign of a problem -- the release should be done automatically (by default) when UserRequest is destroyed. However, I am sure it is difficult to fix that. If a proper solution is too difficult, how about adding a private ConnStateData::endAuth() method that will do this: +auth_-releaseAuthServer(); +auth_ = NULL; +unpinConnection(); and then call if from setAuth() and from swanSong(), as needed? This way any changes to that cleanup code will be in one place. BTW, 1) Why are we unpinning the connection in setAuth()? Seems like that is not directly related to authentication. Can that call wait until we close the connection and the regular cleanup takes place? 2) How come ConnStateData::swanSong() does not call unpinConnection()? Are we sometimes leaking pinning.peer and pinning.host because of that? Cheers, Alex.
Re: [PATCH] Bug 3643: Connection Auth redesign
On 03/25/2013 09:59 PM, Amos Jeffries wrote: Bug 3643: NTLM helpers stuck in reserved state by Safari NTLM failures are not always cleaning up connection-auth credentials properly. In particular they are not releasing the NTLM helpers when the connection is closed between challenge and handshake completion. Resulting in permanently reserved helpers locking up all access through the proxy. This change redesigns the connection authentication state management to move the auth link/unlink operations into the connection state manager objects instead of being managed by NTLM auth components. As a result we are able to manage credentials from any auth scheme consistently and terminate the connection properly on several error conditions which the auth components are not easily aware of. Fix sponsored by Netbox Blue Pty (http://netboxblue.com/) The bulk of the patch is symbol changes since we are moving the credentials from a public member variable to private one with accessors. The core of the patch logics is contained in ConnStateData::setConnectionAuth which performs error checking and triggers termination of the connection in various ways determined by the type of error that was encountered. In summary: - once credentials are set they are baked into the connection state. - all following requests require a credentials token matching the state one. - any request lacking credentials (or setting NULL) will terminate the connection gracefully. It should end with an auth re-challenge, but that specific detail does depend on ACLs. - any request with a new or altered auth token (injection attacks, broken relay pinning) will terminate the connection immediately (expect lost bytes). - any pinned server connection, and any pinned stateful auth helper is unpined/released on credential errors. Specific cases and handling are covered in greater detail in the patch comments. It also adds a stub file for client_side.h functionality. Amos -if (conn-auth_user_request != NULL) { -*auth_user_request = conn-auth_user_request; +if (conn-getConnectionAuth() != NULL) { +*auth_user_request = conn-getConnectionAuth(); } else { /* failed connection based authentication */ debugs(29, 4, HERE Auth user request *auth_user_request conn-auth user request - conn-auth_user_request conn type - conn-auth_user_request-user()-auth_type authentication failed.); + conn-getConnectionAuth() conn type + conn-getConnectionAuth()-user()-auth_type authentication failed.); This is an old bug not introduced by this patch, but please note that the last debugs() line above is always dereferencing a NULL pointer (and the next-to-last line is always logging a nil pointer). +if (credentialsState == NULL) { +debugs(33, 2, HERE Adding connection-auth to clientConnection from by); +credentialsState = aur; +return; +} The if-statement after the above implies that aur can be NULL. If aur is NULL, the above code will not change anything. Should we check for that before we claim that we are adding something or at least debugs() aur? Here and elsewhere, please remove HERE from the new debugs() messages. +private: +/// state of some credentials that can be used to perform authentication on this connection +Auth::UserRequest::Pointer credentialsState; The words state, data, and info usually do not add to the description or understanding of a member. How about renaming credentialsState to just credentials? Please move the private section down, after the public/protected sections. Also, can the description be simplified to just say connection-based credentials for this connection or something similar? Finally, why do getConnectionAuth() and setConnectionAuth() use the term connectionAuth while the class member storing the same information is called credentialsState? Can all three names be made consistent/similar? I would drop connection from the accessor names because the ConnStateData class itself determines their scope as connection. For example: public: const Auth::UserRequest::Pointer credentials() {...} void credentials(...); private: Auth::UserRequest::Pointer credentials_; +Auth::UserRequest::Pointer getConnectionAuth() const { return credentialsState; }; Extra semicolon at the end. The above method can return const Auth::UserRequest::Pointer if you want to avoid refcounting overheads during simple checks. + * NP: once set the credentials are fixed until closure because for any change to be needed + * since something invalid has happened: Please rephrase the because ... part. I am not sure what it means. +if (aur == NULL) { +