Essentially, what's happening is that the core driver keeps calling nsopenssl's read function even when there's nothing ready to be read yet. The loop isn't truly infinite: the connection is still alive, but the client hasn't sent anything for us to read yet. In the case of Mozilla, it appears that when the cert warning pops up, the client doesn't send anything to the server until the user takes action (that's why it happens in the SSL handshake and not later). That could be a while, but the server core still tries to read bytes repeatedly anyway. This makes sense from a performance standpoint: if you're serving a small number of bytes (single HTML pages) to thousands of clients, connections will be short and you'll want to free up the conn to serve other clients as quickly as possible. This model breaks with SSL, and maybe with non-SSL conns too if the client is really, really slow to respond, but only on servers with lots of idle time.
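To make the difference concrete, here is a minimal sketch in Python (not AOLserver or nsopenssl code). It assumes a plain non-blocking socket pair as a stand-in for the SSL connection; the function names busy_poll_reads and wait_then_read are made up for illustration. The first function mimics the aggressive read described above, the second waits for the socket to become readable before reading.

```python
import select
import socket


def busy_poll_reads(sock, attempts):
    """Call recv() repeatedly on a non-blocking socket.

    Returns (data, empty_reads). Every read that finds nothing raises
    BlockingIOError and returns immediately, so a silent client just
    burns CPU in this loop.
    """
    sock.setblocking(False)
    empty_reads = 0
    for _ in range(attempts):
        try:
            # Successful read (b"" here would mean the peer closed).
            return sock.recv(4096), empty_reads
        except BlockingIOError:
            empty_reads += 1  # nothing to read yet; try again right away
    return b"", empty_reads


def wait_then_read(sock, timeout):
    """Sleep in select() until the socket is readable or timeout expires.

    Returns the bytes read, or None if the client stayed silent.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)
```

With a client that sends nothing, busy_poll_reads(conn, 1000) makes 1000 fruitless calls, while wait_then_read(conn, 0.05) makes none and simply sleeps. The trade-off is latency on busy servers, which is presumably why the core reads aggressively.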
Let's assume you have 100 connections per second coming in, and one of the clients is really slow. That single client won't take 100% of the server CPU, because the SSL read call returns immediately and the core can service other requests. The other requests will take their slice of time, and eventually another read call will be issued to nsopenssl. Remember that a box under this much load won't be very useful for other processing chores anyway. If it has idle cycles, why not try a read on that slow socket and see if there's something there yet? nsd is going to be hogging the CPU in any case.
If you have 1 or 2 conns per second, and other processing to do on your system, it's a different story. One slow client on such a system can potentially take up all the idle CPU cycles on your system, making anything else you're doing move very slowly.
Take this with a grain of salt; this is pure speculation at this point. I'll be analyzing the core server to see if I can fix this cleanly in there. If not, it's probably time to look at supporting multiple communication models in the core: not in the old 3.x way of doing it, but more along the lines of allowing an nssock or nsopenssl module to request a particular comm model for its driver(s), so that nssock can use the aggressive read while nsopenssl requests a more sedate model. That may not fully solve the problem, however: if nssock uses aggressive reads and nsopenssl uses a more sedate comm model, nssock could still chew up CPU with a slow client.
Between now and then, I'll get a beta 14 up with a timeout on reads, so that if an nsopenssl conn has failed every read over the past 3 seconds, it'll close the conn. I'll see if I can have that up this weekend.
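The read timeout could work roughly like the sketch below. This is Python for illustration only, not the actual beta 14 code; the class name ReadIdleTimer and its methods are hypothetical, and the 3-second limit is the figure mentioned above. The clock is injectable so the logic can be checked without sleeping.

```python
import time


class ReadIdleTimer:
    """Tracks how long a connection has gone without a successful read."""

    def __init__(self, limit_seconds=3.0, clock=time.monotonic):
        self.limit = limit_seconds
        self.clock = clock
        self.last_success = clock()  # treat connection start as a success

    def record(self, bytes_read):
        """Record the result of one read attempt.

        Returns True when every read since the last success has come
        back empty for at least limit_seconds, meaning the caller
        should close the connection.
        """
        now = self.clock()
        if bytes_read > 0:
            self.last_success = now  # any data resets the window
            return False
        return now - self.last_success >= self.limit
```

The read loop would call record() after each attempt and drop the conn when it returns True. Note that this only bounds how long a silent client can keep the loop spinning; it does not stop the spinning within those 3 seconds.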
/s.
On Feb 26, 2004, at 5:12 PM, Jamie Rasmussen wrote:
We see nsopenssl eating the CPU too - the cause seems to be an infinite loop in NsOpenSSLConnHandshake. You can see it if you use Netscape/Mozilla and your domain name doesn't match the certificate. It also seems to happen when the domain name matches but you close the browser window during the handshake. (I haven't had the chance to confirm what error that triggers.)
Scott has been looking into this, and thought that a solution could require changes to the core because of the aggressive read-ahead in AOLserver 4.x. As a temporary measure, he is also considering adding a timer to the conn that would exit the loop if no data has been read over the specified period.
-- AOLserver - http://www.aolserver.com/
To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED]> with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.