Re: Strange SSL_shutdown() error return (SSL_ERROR_SYSCALL but errno == 0)

Darryl Miles Thu, 08 Apr 2010 01:42:06 -0700

Antoine Pitrou wrote:

These issues are tracked together at http://bugs.python.org/issue8108 ,
because they both appeared when someone tried OpenSSL 0.9.8m.

I have read through the discussion. First, I'd like to confirm the scenario for the errno==0 situation through a particular sequence of events.

I have an SSL protocol test-case creator that can manipulate both ends' OpenSSL API usage in a co-ordinated fashion, so it should be straightforward to cause an abrupt socket closure around/during SSL_shutdown() usage.

Ok, thanks for the clarification. We were a bit baffled by errno==0
(EPIPE, ECONNABORTED, EBADF... would have been much more helpful).


I agree with this, it should return a more useful value.

So, in any case, I can interpret an SSL_ERROR_SYSCALL return from
SSL_shutdown() as "the socket was closed more or less abruptly"
response? There are no other possible reasons for this error return?

This is the intention of the error indication. The presumption by me at this time is to believe it, as no proof has been submitted otherwise. Further investigation may alter this statement.
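Since the report comes from Python's ssl module, here is a minimal sketch of how that interpretation might be applied on the Python side. The function name and the category strings are hypothetical, and the exception classes used (ssl.SSLSyscallError, ssl.SSLEOFError and friends) only exist in Python 3.3 and later, so treat this as an illustration of the reading given above rather than a statement of what OpenSSL guarantees.

```python
import ssl


def classify_shutdown_error(exc):
    """Map an exception raised during a TLS shutdown to a coarse category.

    The categories are illustrative assumptions, following the reading
    "SSL_ERROR_SYSCALL means the socket was closed more or less abruptly".
    """
    if isinstance(exc, (ssl.SSLWantReadError, ssl.SSLWantWriteError)):
        return "retry"         # non-blocking socket: call again later
    if isinstance(exc, (ssl.SSLSyscallError, ssl.SSLEOFError)):
        return "abrupt-close"  # system-level error or unexpected EOF
    if isinstance(exc, ssl.SSLError):
        return "tls-error"     # some other TLS-level failure
    if isinstance(exc, OSError):
        return "abrupt-close"  # EPIPE, ECONNRESET, EBADF, ...
    raise TypeError("not a socket/TLS exception: %r" % (exc,))
```

A caller would typically wrap its unwrap() call in try/except and pass the caught exception to this function.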

But to be a good, well-meaning TLS/SSL citizen, both ends should continue their non-blocking event loops for a reasonable amount of time (in the order of 5 to TCP timeout seconds) even after the last SSL_write() has been made.

He, well. The interesting thing here is that we are testing a blocking
FTP TLS client with a non-blocking (event loop-based) server. The
blocking client can't really sleep() for 5 seconds when closing the FTP
session. At least I think users wouldn't like it :-)

Also, the client doesn't try to shutdown the SSL layer when closing its
connection. According to the client's author, this is contrary to the
RFC. In his own words:

This is in sympathy with my claim. To reiterate, it is up to an individual protocol/application to decide if it requires a secure cryptographic shutdown or not. It is also up to the individual protocol/application to decide the course of action to take when it doesn't happen.

So if the protocol spec for "FTP TLS" makes a claim one way or the other, that is a matter for that specification. Since the FTP protocol has a clear "QUIT" command to mark the moment when the client has no further use of the control connection, there is actually no need to perform a full SSL_shutdown() to make the system safe from attack. This doesn't mean you shouldn't attempt to do SSL_shutdown().

ftplib.FTP_TLS class already calls unwrap() but only when

        closing a "secured" *data* connection.
        This is never done for the *control* connection as the examples
        shown in RFC-4217 do that only when dealing with the CCC command
        which is intended to switch the control connection back to clear
        text.
        Since ftplib.py does not implement the CCC command I would avoid
        to override its close() method.

You need to be clear in your own mind about what the "FTP TLS" specification is:

 * mandating,
 * suggesting / recommending, and
 * not expressing any opinion on.

The fact that something ISN'T shown in an example should not be taken as any kind of statement; it is just that: that specific example didn't express that particular matter. Interpret only the rules that are written as rules; anything else is open to interpretation.

You also need to go and read the original RFC first-hand and come to your own interpretation. Then compare your interpretation to that of the ftplib author.

(if you have an opinion on this specific point -- no implicit SSL
shutdown when closing the FTP session --, I'd like to hear it. Although
it isn't really part of the issue at hand).

You'd need to educate me in the specifics of the "FTP TLS" protocol. I am very experienced with all the details of the classic "FTP" protocol.




About "FTP TLS":

 * does it make use of 2 sockets like FTP ?
 * are both sockets encrypted with TLS (at all times before any transaction starts) ?
 * is the ftp-data socket opened/closed once for each file like FTP ?
 * is the payload data inside the ftp-data socket just the exact number of bytes in the single file being transferred ?

So in the interests of trying to convey a better understanding of the TLS shutdown issue, please read the following claims and attempt to understand the goals behind each claim rather than the specific detail (in respect of FTP TLS, since I do not fully understand every detail of FTP TLS at this time).




Things to consider:

 * Any unencrypted channel falls outside the scope of TLS (and thus of any points made right below).

 * If the encrypted command channel has a "QUIT" command, and the specification (or de facto default implementation) requires that the channel, after receiving such a command, writes back a single response and then stops processing any further commands, then it can be said that you already have an in-band shutdown process and SSL_shutdown() provides no additional benefit to your application.

 * "FTP TLS" is transactional in the sense that an individual file operation is a single unit-of-work (1 transaction). Therefore, if tampering with the TLS stream is detected, at most you would attempt to roll back the transaction you are currently on. No previously completed transactions would be affected.

 * Does the specification talk about what to do in the case of a protocol error? I use the parallel of "transactions" to describe this predicament. It mainly affects stuff being written (transactions with persistent side-effects), i.e. the rollback strategy is: if the new file didn't exist before, delete it; if the new file is being appended to, then truncate it back to the old length; etc... Single operation commands like "Make A Directory" are begun and committed before their response is returned. A command response is not part of the transaction, just advice about transaction status.

 * If the ftp-data stream works just like the classic (non-TLS) FTP protocol, that is, one connection per file with the entire data contents of the connection being exactly the data in the file (there are no in-band start and end markers), then in this situation you MUST make use of SSL_shutdown() and check that it returns 1 at both ends before you consider the file data contents to be valid and commit the transaction. In this situation there is no in-band end-of-file marker; it is implied from the end of the network socket stream. This is just the situation SSL_shutdown() provides cryptographic guarantees over.
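To make the transaction idea concrete, here is a rough Python sketch of the receiving side of a data connection. The name receive_to_file and the read_chunk callable are hypothetical. The assumption is that read_chunk returns b"" only on a clean TLS end-of-stream and raises otherwise, for example lambda: tls_sock.recv(65536) on a socket wrapped with suppress_ragged_eofs=False. It only shows the commit/rollback decision on the receiving end, not the "returns 1 at both ends" check in full.

```python
import os


def receive_to_file(read_chunk, path):
    """Append incoming chunks to path; keep them only on a clean end-of-stream.

    read_chunk is an assumed callable: it returns bytes, returns b"" once the
    peer has cleanly ended the TLS stream, and raises OSError (ssl.SSLError is
    a subclass) on a truncated or broken stream.
    Returns True if the transfer was committed, False if it was rolled back.
    """
    existed = os.path.exists(path)
    old_length = os.path.getsize(path) if existed else 0
    try:
        with open(path, "ab") as out:
            while True:
                chunk = read_chunk()
                if not chunk:
                    return True  # clean end-of-stream: commit
                out.write(chunk)
    except OSError:
        # Rollback as described above: truncate an appended file back to its
        # old length, delete a file that did not exist before.
        if existed:
            with open(path, "r+b") as out:
                out.truncate(old_length)
        else:
            os.remove(path)
        return False
```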


Now to talk in respect of SSL_shutdown() more specifically:

 * Since SSL_shutdown() is part of the SSL protocol, and since implementing it doesn't contradict any other part of the FTP_TLS protocol, then even where it isn't a required part of the FTP_TLS protocol a BEST EFFORT attempt should be made to use/implement it.

 * A BEST EFFORT attempt does not mean you are required to enforce any kind of extra delay purely for the purpose of implementing a complete SSL_shutdown() sequence. BEST EFFORT might mean you call SSL_shutdown(), which will attempt to write out to the socket the end-of-stream notify packet at least once. If it fails, it fails; you tried!

 * A client SHOULD attempt to receive the "QUIT command response" (or wait for server instigated socket disconnection) before indicating to the user that it has finished being a client.

 * A server MUST ensure it sends the "QUIT command response" with the same amount of effort as it would any other kind of response. That is, while the socket remains open it will be persistent with flushing the data out the socket.

 * A server SHOULD (after making its last successful SSL_write() to send the "QUIT command response") immediately call SSL_shutdown(). Note - which MAY return 1 immediately.

 * Both client and server, if using non-standard OpenSSL BIO layers, should ensure that during a QUIT command/response those layers are actively flushed downwards into the kernel, BEFORE the socket descriptor goes under consideration to be close()d at kernel level. Notes - This reinforces the point that you must ensure a data flush down the stack from application -> OpenSSL -> BIO -> Kernel BEFORE you close the socket. Only once all data has been written to the socket do you consider when to close().

 * Both client and server, after they call SSL_shutdown() and it returns the specific value of 0 (or 1), MAY call shutdown(fd, SHUT_WR) on the socket. Notes - You are not guaranteed SSL_shutdown() will always return 0 on the first call, even if you observe that to be the case.

 * A server SHOULD implement the SSL_shutdown() wait loop even after it has written its last byte to the socket. Consider this to be a STRONG BEST EFFORT (i.e. actually code it ! ha ha). Notes - A server more so than a client should implement a wait loop. A server is designed to be hanging around for work to do, a server is usually capable of handling multiple clients simultaneously, a server is usually a non-interactive application. This is the logic on "more so".
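In Python terms, the BEST EFFORT sequence above might look roughly like the sketch below. The function name is hypothetical. tls_sock is assumed to be an ssl.SSLSocket (or anything with the same settimeout/unwrap/shutdown/close methods), and any failure of the TLS shutdown is deliberately swallowed, on the "you tried" principle.

```python
import socket


def best_effort_tls_close(tls_sock, wait_seconds=0.0):
    """Attempt a TLS shutdown, then half-close and close the socket.

    wait_seconds bounds how long unwrap() may wait for the peer; 0.0 means
    "try once and do not wait". Returns True only if unwrap() completed.
    """
    completed = False
    try:
        tls_sock.settimeout(wait_seconds)
        tls_sock.unwrap()  # sends our close_notify, waits for the peer's
        completed = True
    except OSError:
        pass  # covers ssl.SSLError and timeouts: it failed, but we tried
    try:
        tls_sock.shutdown(socket.SHUT_WR)  # the optional shutdown(fd, SHUT_WR)
    except OSError:
        pass  # socket may already be gone
    finally:
        tls_sock.close()
    return completed
```

A server would more likely spread the waiting part over its event loop instead of passing a non-zero wait_seconds here.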



[Errors and Omissions Exempt.]

Now in respect of implementing an "FTP Client Access Library", you should consider your "ftpcli.quit()" method to have 5 return states to provide back to the caller:

 * ERROR before "QUIT" committed (terminal state 1)
 * "QUIT" committed, ERROR before response (terminal state 2)
 * "QUIT" sent, "QUIT" response received. (terminal state 3)
 * "QUIT" sent, "QUIT" response received, ERROR before TLS shutdown complete. (terminal state 4)
 * "QUIT" sent, "QUIT" response received, TLS shutdown completed. (terminal state 5)

The term committed means you got the data flushed into the kernel. So the data was committed into the kernel layer and is past the point of no return.

If Python has an "exception" system, then I would suggest you consider only the first case to raise an exception. The other four are indicated in soft-error returns. The logic in this is that you should raise exceptions for instructions that you failed to execute on behalf of the caller.
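One possible Python rendering of the terminal states and of the "only state 1 raises" policy is sketched below. The names QuitStatus, QuitNotSent and report are all made up for illustration; nothing like them exists in ftplib.

```python
from enum import Enum


class QuitNotSent(Exception):
    """Hypothetical exception for terminal state 1: QUIT never committed."""


class QuitStatus(Enum):
    ERROR_BEFORE_QUIT_COMMITTED = 1
    ERROR_BEFORE_RESPONSE = 2
    RESPONSE_RECEIVED = 3
    ERROR_BEFORE_TLS_SHUTDOWN = 4
    TLS_SHUTDOWN_COMPLETE = 5


def report(status):
    """Raise for state 1, hand every other state back as a soft return."""
    if status is QuitStatus.ERROR_BEFORE_QUIT_COMMITTED:
        raise QuitNotSent("QUIT could not be committed to the kernel")
    return status
```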

Most users might just choose to IGNORE the return status of "ftpcli.quit()" because they are also acting in a best-efforts kind of way by sending a quit command in the first place, and the course of action the client will take after the ftpcli.quit() method returns is the same regardless of its error state.

You might also like to provide an argument to the quit() command to indicate a maximum waiting time. This would be applied to the "waiting for QUIT response" aspect, as well as the "waiting for TLS shutdown complete" aspect. You might like to consider a value where zero milliseconds of wait can be indicated for impatient client users. You might also like to consider a value to mean an INFINITE wait. This would also mean you need additional states to indicate:

 * "QUIT" committed, no-error, waiting for response (interim state 1.5)
 * "QUIT" committed, "QUIT" response received, no-error, waiting for TLS shutdown complete (interim state 3.5)

In order to implement an assured max-wait time, you might need to change a socket that was blocking into non-blocking mode, and then put it back to blocking before returning from the ftpcli method.
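With Python sockets, switching modes temporarily and restoring the caller's setting could be done along these lines. This is only a sketch: temporary_timeout is a hypothetical helper, and the convention that None means an INFINITE wait and 0 means no wait is an assumption borrowed from Python's own settimeout().

```python
import contextlib


@contextlib.contextmanager
def temporary_timeout(sock, max_wait_ms):
    """Apply a maximum wait to sock for the duration of the block.

    max_wait_ms=None means wait forever, 0 means do not wait at all
    (non-blocking). The previous timeout/blocking mode is always restored.
    """
    previous = sock.gettimeout()
    sock.settimeout(None if max_wait_ms is None else max_wait_ms / 1000.0)
    try:
        yield sock
    finally:
        sock.settimeout(previous)
```

Used as "with temporary_timeout(sock, 5000): ...", the socket is back in its original mode when the block exits, even if an exception is raised inside it.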

You might also like to make your "ftpcli.quit()" method restartable. That is, make it valid for a client to call it multiple times; the ftpcli library will track the state and not resend the QUIT command, or not expect to see a quit response, etc... You might also like to convert a previous error state (state 2, state 4) into an exception raising event if the ftpcli user calls the quit() method again, after already having been told an error occurred via a soft-error return value on a previous invocation.

You might want to enforce that ftpcli.quit() will never wait for "TLS shutdown" on the first invocation. This means an ftpcli user who wants to do that must call it again (potentially with a new timeout value) in the hope the return status changes from state 3.5 to state 5 (in my list) in that time. As an afterthought to this, if the socket is already non-blocking, the first invocation of ftpcli.quit() might like to attempt a one-shot non-blocking test of SSL_shutdown() to see if it would/can complete (right after it received the "QUIT response message") but before returning for the first time. What I'm trying to emphasise is: give the ftpcli API user the control over the two waits.
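A restartable quit() along those lines could be tracked with a small state holder like the one below. Everything here is hypothetical: the class name, the three hook callables and their signatures are assumptions made for the sketch, and each call advances at most one stage, which is one simple way of making sure the first invocation never waits for the TLS shutdown.

```python
class RestartableQuit:
    """Remembers QUIT progress so quit() can safely be called repeatedly.

    Assumed hooks (not a real library API):
      send_quit()           raises OSError if QUIT cannot be committed
      poll_response(t)      returns True once the QUIT response has arrived
      poll_tls_shutdown(t)  returns True once the TLS shutdown has completed
    The numeric states follow the numbering used in this message.
    """
    WAIT_RESPONSE, RESPONSE_ERROR = 1.5, 2
    WAIT_TLS, TLS_ERROR, DONE = 3.5, 4, 5

    def __init__(self, send_quit, poll_response, poll_tls_shutdown):
        self._send_quit = send_quit
        self._poll_response = poll_response
        self._poll_tls_shutdown = poll_tls_shutdown
        self.state = None  # QUIT not sent yet

    def quit(self, timeout=0.0):
        if self.state in (self.RESPONSE_ERROR, self.TLS_ERROR):
            # A soft error was already reported once; now make it loud.
            raise RuntimeError("quit() already failed in state %s" % self.state)
        if self.state == self.DONE:
            return self.state
        if self.state is None:
            self._send_quit()  # a failure here is state 1: let it raise
            self.state = self.WAIT_RESPONSE
        stage = self.state
        try:
            if stage == self.WAIT_RESPONSE:
                if self._poll_response(timeout):
                    self.state = self.WAIT_TLS
            elif self._poll_tls_shutdown(timeout):
                self.state = self.DONE
        except OSError:
            self.state = (self.RESPONSE_ERROR if stage == self.WAIT_RESPONSE
                          else self.TLS_ERROR)
        return self.state
```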

Putting all these things together allows the ftpcli API users to decide what they want, allows fast users to get what they want, allows fully compliant users to get what they want, and allows the shutdown blocking timeouts to be finely controlled.

I am somewhat practical about matters; your FTP Client Access Library should seek to provide:

 * the ability for someone to use the FTP protocol "by the book" and do everything possible.
 * the ability for users to gain performance by cutting corners on stuff that is unimportant to them (anything after sending the QUIT command may be unimportant).

A healthy balance of the two makes for a good API that everyone can like. With an API you always have to consider how it is used and the ethos of the language/paradigm, and always try to make the API easier to use by providing the complex/difficult stuff.

Here is my attempt at using my own API and being a fully compliant citizen, allowing up to ~30 seconds for stuff to happen:


#define QUIT_WAIT_MS 5000   /* per-call wait: 6 calls x 5 s = ~30 s in total */

rc = ftpcli.quit(QUIT_WAIT_MS);   // Send command, maybe we get response too

for(int i = 0; i < 5; i++) {
   if(ftpcli.quit_is_terminal_status(rc) == TRUE)
      break;    // No more progress can be made

   rc = ftpcli.quit(QUIT_WAIT_MS);   // Wait for response and SSL shutdown
}
// Examine rc status now. We tried to push the progress as much as possible.


Here is my attempt at using my own API and cutting corners:

rc = ftpcli.quit();   // No argument implies a system default or historically
                      // compatible timeout value is used; it will commence the
                      // SSL_shutdown() but will-never/may-never be able to
                      // confirm completion of that.

You stick both examples in the documentation; this helps reassure simple users that their use is valid too.


Everybody wins at the expense of the ftpcli maintainer(s).

The one thing I think is worth abstracting (and part of my patches) is
when SSL_shutdown returns ERROR_WANT_{READ,WRITE} *and* the socket is in
*blocking* mode. In that case, shipping the select-and-retry loop as
part of the ssl abstraction, instead of having each user replicate the
boring logic, looks reasonable to me. What do you think?

This should not be an issue since if the socket is in blocking mode they will never return EAGAIN (in the case of reads) and EAGAIN/partial-writes (in the case of writes).

So -1/WANT_READ and -1/WANT_WRITE soft-error returns are facets of non-blocking socket usage.

Certainly classic BSD socket interpretation of blocking and non-blocking mode makes my above comments true.
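For the non-blocking case, where those soft-error returns do occur, the select-and-retry logic being discussed might be sketched in Python as follows. unwrap_with_retry is a hypothetical helper, and tls_sock is assumed to be a non-blocking ssl.SSLSocket (or any object with unwrap() and fileno()).

```python
import select
import ssl
import time


def unwrap_with_retry(tls_sock, timeout=5.0):
    """Retry unwrap() on WANT_READ/WANT_WRITE until done or timeout expires.

    Returns whatever unwrap() returns; raises TimeoutError if the TLS
    shutdown has not completed within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return tls_sock.unwrap()
        except ssl.SSLWantReadError:
            want_read = True
        except ssl.SSLWantWriteError:
            want_read = False
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("TLS shutdown did not complete in time")
        # Sleep until the socket is ready in the direction OpenSSL asked for.
        readers = [tls_sock] if want_read else []
        writers = [] if want_read else [tls_sock]
        select.select(readers, writers, [], remaining)
```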



Darryl
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
