Comments sent. I forgot to add this mailing list as a cc: target, though. On Thu, Jun 19, 2008 at 1:17 PM, Jacob Burkhart <[EMAIL PROTECTED]> wrote:
> patch posted: http://codereview.appspot.com/2341 > > On Mon, Jun 16, 2008 at 11:59 PM, Brad Fitzpatrick <[EMAIL PROTECTED]> wrote: > >> || instead of 'or' (local style), >> spaces after # in comments, >> wrap long lines, >> keep the => lined up in the constants at top, >> use logging system instead of literal "warn"? >> >> If you could also upload this patch once updated to >> codereview.appspot.com, that'd be great. (there's a command-line tool >> you can download from there which does it all... no need to use the web-app) >> >> Brad >> >> >> On Mon, Jun 16, 2008 at 8:19 AM, Jacob Burkhart <[EMAIL PROTECTED]> >> wrote: >> >>> Hi, >>> Can I get this patch in to djabberd? >>> >>> Incrementing a counter on empty SSL reads is not the correct way to >>> determine ssl_eof >>> (http://code.sixapart.com/trac/djabberd/changeset/758) >>> >>> Instead we should check the error code to see if it is one of the SSL >>> errors for which we are supposed to retry( >>> http://www.openssl.org/docs/ssl/SSL_get_error.html) >>> >>> thanks, >>> Jacob >>> >>> >>> >>> On Thu, Feb 28, 2008 at 12:40 PM, Jacob Burkhart <[EMAIL PROTECTED]> >>> wrote: >>> >>>> looks like the number was original 3, and then bumped up to 10 because >>>> of observed disconnects "in practice" >>>> http://code.sixapart.com/trac/djabberd/changeset/758 >>>> >>>> >>>> On Thu, Feb 28, 2008 at 11:33 AM, Jacob Burkhart <[EMAIL PROTECTED]> >>>> wrote: >>>> >>>>> Hi, >>>>> So we've been experiencing some bizarre behavior with clients randomly >>>>> disconnecting when trying to send messages over SSL. The disconnects vary >>>>> based on size of message and speed of network connections. >>>>> >>>>> The problems seems to stem from: >>>>> >>>>> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Connection.pm#L494 >>>>> >>>>> In http://code.sixapart.com/trac/djabberd/changeset/756 bradfitz added >>>>> code to handle non-graceful SSL disconnects. However, the number of times >>>>> to retry read before giving up (10) proved to be 1 to 2 too few times in >>>>> our >>>>> particular situation. >>>>> >>>>> But why did he pick 10? >>>>> >>>>> In our tests, the number of times that Net::SSLeay::read can return >>>>> zero bytes in a row varies based on the speed of a client's connection and >>>>> the size of bytes attempting to be sent. >>>>> >>>>> During stream initiation for instance, we experience about 2 reads of >>>>> zero in a row from clients connected on reasonably fast connections, and 0 >>>>> reads of zero in a row from a client running on the same machine as the >>>>> server. >>>>> >>>>> When sending messages larger than 8k, the number of zero reads is >>>>> somewhere under 10 for clients connection over local ethernet, but is in >>>>> the >>>>> 11 to 12 range for clients connecting over WiFi. >>>>> >>>>> So clearly, counting the number of zero bytes reads that happen is a >>>>> row is not the most reliable way to determine if the client has improperly >>>>> disconnected >>>>> >>>>> It turns out that if we look at the Djabberd code for SSL write, we can >>>>> see that there is a different answer for handling write failures. >>>>> >>>>> >>>>> http://code.sixapart.com/trac/djabberd/browser/trunk/DJabberd/lib/DJabberd/Stanza/StartTLS.pm#L90 >>>>> >>>>> Here, get_error is called to determine if the problem might have been >>>>> caused by SSL_ERROR_WANT_READ or perhaps SSL_ERROR_WANT_WRITE. If it is, >>>>> then this is not a fatal condition, and we don't want to close the >>>>> connection. instead, we simply return zero and expect that the writer >>>>> will >>>>> come around and try again at some point. >>>>> >>>>> So why not do the same thing for read failures? >>>>> >>>>> Please consider my patch (attached) as a possible solution for this >>>>> problem. >>>>> >>>>> thanks, >>>>> Jacob >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>> >> >