On Tue, May 10, 2011 at 5:06 AM, Ger Hobbelt <[email protected]> wrote:
> On Mon, May 9, 2011 at 9:15 PM, Michael Gale <[email protected]>wrote: > >> I checked OpenSSL 1.0.0d release and the same comment is there "does not >> happen". If it does not happen why is it being checked? > > > Might be more correctly stated as '/* _should_ not happen */' - see the > code: it's a basic sanity check to ensure the BIO_read() doesn't overrun the > rx buffer, which would cause arbitrary memory corruption if the BIO_read() > were allowed to do that. The error is keeping you from worse (and even > harder to debug) things happening to you. > > It is unclear (assuming the stack trace you provided only lists the python > side of things) who invoked ssl3_read_n() and where the error originates: it > may either be an input argument error to ssl3_read_n(): ..., int n, int max, > ... where n>max from the start of the function, or due to a 'max' fixup > inside ssl3_read_n() combined with yet unknown context conditions. > > Incidentally, did you grep whether ssl3_read_n() is invoked directly from > the Python glue (just to be sure only the 'usual suspects' inside openSSL > itself may invoke ssl3_read_n(); it is marked as an internal-use-only > function (prototype sits in ssl_locl.h) > > I haven't seen this error before, but what might help you in improving > diagnosis, assuming you can build your own openSSL (+ python glue code ?), > is adding a bit of info to the error report plus activating the openssl > assertions in the build. Then you can add these lines (typed off the top of > my head, so reckon with the compiler yakking due to typos by me) to see > where the issue 'starts' within the function: > > > --snip-- Whole function below from s3_pkt.c > >> 132 int i,off,newb; >> 133 >> > > OPENSSL_assert(n <= max); /* turn on assertions in build to have this > one fire on incorrect input */ > * > alternative code, which might be more useful and doesn't require assertions > enabled to fire anyway when this 'should not happen' occasion happens > anyway: > > if (n > max) /* should not happen */* > * { > char errbuf[80]; > * > * SSLerr(SSL_F_SSL3_READ_N,ERR_R_INTERNAL_ERROR); > * > * sprintf(errbuf, ' @ start of ssl3_read_n: n = %d, max = > %d", n, max);* > * ERR_add_error_data(1, errbuf);* > * return -1;* > * }* > > The ERR_add_error_data() call accepts a series of strings which are > appended as extra data to the error report; very handy to transport debug > info out of openSSL. Note that it must be invoked AFTER the SSLerr() pushed > the error on the error stack. > > >> 134 if (!extend) >> [...] >> > 165 { >> >> 166 /* avoid buffer overflow */ >> 167 int max_max = s->s3->rbuf.len - s->packet_length; >> 168 if (max > max_max) >> 169 max = max_max; >> 170 } >> * 171 if (n > max) /* does not happen */* >> * 172 {* >> > > char errbuf[120]; /* and here, it might be useful to have a look at the > rbuf, as by now we 'know' this will only fire when max was reduced above */ > > >> * 173 SSLerr(SSL_F_SSL3_READ_N,ERR_R_INTERNAL_ERROR);* >> > > * ** sprintf(errbuf, ' @ line %d in ssl3_read_n: n = %d, max > = %d, *rbuf.len = %d, packet_length* = %d", n, max, (int)*s->s3->rbuf.len, > (int)s->packet_length*);* > * ERR_add_error_data(1, errbuf);* > > >> * 174 return -1;* >> * 175 }* >> > > > -- > Met vriendelijke groeten / Best regards, > > Ger Hobbelt > > -------------------------------------------------- > web: http://www.hobbelt.com/ > http://www.hebbut.net/ > mail: [email protected] > mobile: +31-6-11 120 978 > -------------------------------------------------- > > Hey, First, thanks for the reply and feedback. The section of the python code that invokes the function call is below, I highlighted line 138 which is where the exception is occurring according to the stack trace. --snip-- *Python-2.7/lib/python2.7/ssl.py* 132 def read(self, len=1024): 133 134 """Read up to LEN bytes and return them. 135 Return zero-length string on EOF.""" 136 137 try: *138 return self._sslobj.read(len)* 139 except SSLError, x: 140 if x.args[0] == SSL_ERROR_EOF and self.suppress_ragged_eofs: 141 return '' 142 else: 143 raise --snip-- I traced this to *Python-2.7/Modules/_ssl.c*, then to * openssl-0.9.8b/ssl/ssl_lib.c* where it calls SSL_read. So it does not look like it is calling the internal method. We build our own version of Python from the python source because we make it into an RPM with a custom installation directory. I can also build a version of openSSL but getting a client to create this condition has proved difficult so far. We ran this new service on QA for several months with out an issue, last weekend we placed it into production and this issue occurred after about 5hrs of running. At first we thought how odd and restarted the service and the issue occurred again with in minutes. My belief is that it is caused by a client stuck in a broken state. However with so many connections going on at this time I am not in a position to trouble shoot it :( ... it just forces us to roll back the deployment. I was really hoping that someone knows how a connection could get into this state and maybe had a test utility I could run against our service create the issue. So far my attempts are re-creating it on QA have failed :( Any ideas on how maybe I could create the issue? I have added a bunch of source code below to try and layout the flow from Python to OpenSSL. Michael --snip-- *openssl-0.9.8b/ssl/ssl_lib.c* 858 int SSL_read(SSL *s,void *buf,int num) 859 { 860 if (s->handshake_func == 0) 861 { 862 SSLerr(SSL_F_SSL_READ, SSL_R_UNINITIALIZED); 863 return -1; 864 } 865 866 if (s->shutdown & SSL_RECEIVED_SHUTDOWN) 867 { 868 s->rwstate=SSL_NOTHING; 869 return(0); 870 } 871 return(s->method->ssl_read(s,buf,num)); 872 } --snip-- --snip-- *Python-2.7/Modules/_ssl.c* 1262 static PyObject *PySSL_SSLread(PySSLObject *self, PyObject *args) 1263 { 1264 PyObject *buf; 1265 int count = 0; 1266 int len = 1024; 1267 int sockstate; 1268 int err; 1269 int nonblocking; 1270 1271 if (!PyArg_ParseTuple(args, "|i:read", &len)) 1272 return NULL; 1273 1274 if (!(buf = PyString_FromStringAndSize((char *) 0, len))) 1275 return NULL; 1276 1277 /* just in case the blocking state of the socket has been changed */ 1278 nonblocking = (self->Socket->sock_timeout >= 0.0); 1279 BIO_set_nbio(SSL_get_rbio(self->ssl), nonblocking); 1280 BIO_set_nbio(SSL_get_wbio(self->ssl), nonblocking); 1281 1282 /* first check if there are bytes ready to be read */ 1283 PySSL_BEGIN_ALLOW_THREADS 1284 count = SSL_pending(self->ssl); 1285 PySSL_END_ALLOW_THREADS 1286 1287 if (!count) { 1288 sockstate = check_socket_and_wait_for_timeout(self->Socket, 0); 1289 if (sockstate == SOCKET_HAS_TIMED_OUT) { 1290 PyErr_SetString(PySSLErrorObject, 1291 "The read operation timed out"); 1292 Py_DECREF(buf); 1293 return NULL; 1294 } else if (sockstate == SOCKET_TOO_LARGE_FOR_SELECT) { 1295 PyErr_SetString(PySSLErrorObject, 1296 "Underlying socket too large for select()."); 1297 Py_DECREF(buf); 1298 return NULL; 1299 } else if (sockstate == SOCKET_HAS_BEEN_CLOSED) { 1300 if (SSL_get_shutdown(self->ssl) != 1301 SSL_RECEIVED_SHUTDOWN) 1302 { 1303 Py_DECREF(buf); 1304 PyErr_SetString(PySSLErrorObject, 1305 "Socket closed without SSL shutdown handshake"); 1306 return NULL; 1307 } else { 1308 /* should contain a zero-length string */ 1309 _PyString_Resize(&buf, 0); 1310 return buf; 1311 } 1312 } 1313 } 1314 do { 1315 PySSL_BEGIN_ALLOW_THREADS *1316 count = SSL_read(self->ssl, PyString_AsString(buf), len);* 1317 err = SSL_get_error(self->ssl, count); 1318 PySSL_END_ALLOW_THREADS 1319 if(PyErr_CheckSignals()) { 1320 Py_DECREF(buf); 1321 return NULL; 1322 } 1323 if (err == SSL_ERROR_WANT_READ) { 1324 sockstate = check_socket_and_wait_for_timeout(self->Socket, 0); 1325 } else if (err == SSL_ERROR_WANT_WRITE) { 1326 sockstate = check_socket_and_wait_for_timeout(self->Socket, 1); 1327 } else if ((err == SSL_ERROR_ZERO_RETURN) && 1328 (SSL_get_shutdown(self->ssl) == 1329 SSL_RECEIVED_SHUTDOWN)) 1330 { 1331 _PyString_Resize(&buf, 0); 1332 return buf; 1333 } else { 1334 sockstate = SOCKET_OPERATION_OK; 1335 } 1336 if (sockstate == SOCKET_HAS_TIMED_OUT) { 1337 PyErr_SetString(PySSLErrorObject, 1338 "The read operation timed out"); 1339 Py_DECREF(buf); 1340 return NULL; 1341 } else if (sockstate == SOCKET_IS_NONBLOCKING) { 1342 break; 1343 } 1344 } while (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_WRITE); *1345 if (count <= 0) {* *1346 Py_DECREF(buf);* *1347 return PySSL_SetError(self, count, __FILE__, __LINE__);* *1348 }* 1349 if (count != len) 1350 _PyString_Resize(&buf, count); 1351 return buf; 1352 } --snip-- -- Ecclesiastes 1:18 18 For with much wisdom comes much sorrow; the more knowledge, the more grief.
