Ok, the culprit is definitely ignoring exceptions raised in sendall. In my 
humble opinion this is serious enough to be on the 2.0 blocker list.

How to reproduce: you need a WSGI worker that produces its output in 
parts (that is, returns a list or yields parts from a generator), e.g. 
web2py's "static" file server (which uses WSGI and does not use the 
FileSystemWorker). 

   1. Make sure that there's a large payload produced, and that it is made 
   of a lot of small parts. e.g. put a 10MB file in 
   web2py/applications/welcome/static/file10mb.data (web2py will use 64K parts 
   by default)
   2. Consume the file slowly, e.g. wget --limit-rate=100k 
   http://localhost:8000/welcome/static/file10mb.data ; this takes about 100 
   seconds to download the whole file, even on localhost.
   3. Let file download for 10 seconds, then pause wget (e.g. suspend it by 
   using Ctrl-Z on linux/osx)
   4. Wait 20 seconds
   5. Let it continue (e.g. type 'fg' if you suspended it with ctrl-z)
   6. Notice that when it reaches the end, wget will complain about missing 
   bytes, reconnect and download the rest of the file (and will be happy with 
   it). However, the file will be corrupt: A block (or many blocks) will be 
   missing from the middle, and the last few blocks will be repeated (by the 
   2nd wget connection; if you disallow wget from resuming, the file will just 
   be shorter).
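
For anyone who wants to reproduce this without web2py, the payload in step 1 can be approximated with a tiny WSGI app. This is only a sketch under my own assumptions: the names app, chunk_for, CHUNK_SIZE and TOTAL_SIZE are made up for illustration and are not part of web2py or Rocket. Each 64K part is filled with a byte derived from its index, so a missing or repeated block shows up immediately when you inspect the downloaded file.

```python
CHUNK_SIZE = 64 * 1024          # same part size as in step 1
TOTAL_SIZE = 10 * 1024 * 1024   # 10MB payload


def chunk_for(index):
    # Fill the whole chunk with one byte value derived from its index,
    # so dropped or duplicated chunks are easy to see in the output.
    return bytes(bytearray([index % 256])) * CHUNK_SIZE


def app(environ, start_response):
    # Hypothetical WSGI app: returns the body as a generator of small parts.
    start_response('200 OK', [
        ('Content-Type', 'application/octet-stream'),
        ('Content-Length', str(TOTAL_SIZE)),
    ])

    def body():
        for index in range(TOTAL_SIZE // CHUNK_SIZE):
            yield chunk_for(index)

    return body()
```

Serve app with the server under test and run steps 2 to 6 against it. In an intact download, byte number N has the value (N // 65536) % 256.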

A better idea of where the problem is can be had from the following ugly 
patch (applied against web2py's "one file" rocket.py):

@@ -1929,6 +1929,9 @@ class WSGIWorker(Worker):
                 self.conn.sendall(b('%x\r\n%s\r\n' % (len(data), data)))
             else:
                 self.conn.sendall(data)
+        except socket.timeout:
+            self.closeConnection = True
+            print 'Exception lost'
         except socket.error:
             # But some clients will close the connection before that
             # resulting in a socket error.
 

Running the same experiment with the patched rocket.py shows that the file 
gets corrupted exactly when 'Exception lost' is printed to web2py's terminal.

Discussion: The only way to use sendall() reliably is to terminate the 
connection immediately upon any error (including a timeout), as there is no 
way to know how many bytes were sent. (That there is no way to know how 
many bytes were sent is clearly stated in the documentation; the 
implication that it is impossible to recover reliably from this is not.) 
However, there are sendall() calls all over rocket.py, and in some places a 
failed sendall() is followed by further sendall() calls on the same 
connection. The worst offender seems to be WSGIWorker.write(), but I'm not 
sure the other sendall() calls are safe either.
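
To illustrate the rule (this is not a patch for rocket.py), here is a minimal sketch of a wrapper that refuses any further writes once a sendall() has failed. The GuardedConnection class and its broken attribute are hypothetical names of mine; Rocket has nothing like this as far as I know.

```python
import socket


class GuardedConnection(object):
    """Hypothetical wrapper: after any failed sendall(), refuse to send more."""

    def __init__(self, conn):
        self.conn = conn
        self.broken = False

    def sendall(self, data):
        if self.broken:
            # An earlier sendall() failed part-way; the stream is unusable.
            raise socket.error('connection already failed; refusing to send')
        try:
            self.conn.sendall(data)
        except socket.error:
            # This also catches socket.timeout, which subclasses socket.error.
            # An unknown number of bytes went out, so give up on the
            # connection instead of continuing with the next part.
            self.broken = True
            try:
                self.conn.close()
            except socket.error:
                pass
            raise
```

The point of re-raising is that the caller has to stop producing output too; merely setting a flag and carrying on is what loses blocks from the middle of the file.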

Temporary workaround: increase SOCKET_TIMEOUT significantly (the default is 
1 second; bump it to e.g. 10), and do not swallow socket.timeout in 
WSGIWorker.write().

Increasing the chunk size is NOT helpful, because it only changes the 
number of bytes sent before the first loss (at a given bandwidth); from 
that point on, the problem is the same.

Cross reference: 
https://github.com/explorigin/Rocket/issues/1#issuecomment-3734231
