Borja, thanks a lot for the report and for the patch.

** Changed in: openobject-server
   Importance: Undecided => Wishlist

** Changed in: openobject-server
       Status: New => Confirmed

** Changed in: openobject-server
     Assignee: (unassigned) => OpenERP's Framework R&D (openerp-dev-framework)

-- 
tiny_socket does not handle EINTR and may fail if SIGCHLD signal received
https://bugs.launchpad.net/bugs/501617
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.

Status in OpenObject Server: Confirmed

Bug description:
The basic network functions used by OpenERP for receiving and sending data 
(tiny_socket.py) don't handle EINTR ("interrupted system call") errors, and 
that may cause weird race conditions.

EINTR errors happen when the process receives a signal while doing low-level 
I/O (like receiving or sending data over a socket): the kernel interrupts the 
I/O operation (it is simply not performed*) so the process can handle the 
signal, and the operation should be retried afterwards (the I/O didn't really 
fail, it was just interrupted to wake up the process).

   (*) EINTR for socket functions means "The recv() function was interrupted by 
a signal that was caught, before any data was available." / "A signal 
interrupted send() before any data was transmitted." 
(http://www.opengroup.org/onlinepubs/000095399/functions/recv.html) so the 
calls can be safely retried (http://www.wlug.org.nz/EINTR)

Python does not handle EINTR errors by itself (there have been discussions about 
this: http://bugs.python.org/issue1628205), so it is the Python programmer 
doing the I/O who must take care of them (and retry the operation).

***

This bug was first detected in the Koo client, which uses a copy of the 
tiny_socket.py file for NetRPC communication, but it may affect all the code 
that depends on tiny_socket.py (like the server itself, the GTK client and the 
Web client). The bug showed up on computers running Linux Mint 7 (kernel 
2.6.28-16, 32-bit) and Ubuntu 9.10 64-bit (kernel 2.6.31-14); see 
https://bugs.launchpad.net/openobject-client-kde/+bug/484651.


In mysocket.myreceive (tiny_socket.py), some data may already have been 
received (by earlier calls to recv) when the EINTR error happens; as EINTR is 
not handled, mysocket.myreceive simply propagates the exception and the current 
operation fails. That means OpenERP is susceptible to weird race conditions (it 
fails only when SIGCHLD, or another non-ignored signal, arrives while 
performing I/O) and to denial-of-service attacks (sending lots of signals to 
OpenERP).

For example, on the OpenERP server, some addons use spawnlp or similar 
functions to create sub-processes. Some of them, like jasper_reports, need to 
run those sub-processes without waiting for them to end (os.P_NOWAIT). In that 
context, OpenERP receives a SIGCHLD signal whenever a spawned sub-process ends. 
If one of those signals arrives while OpenERP is performing a socket I/O 
operation (mainly socket.recv or socket.send in tiny_socket.py), the call may 
fail with an EINTR error (errno 4) and data may be lost.

***

A possible fix is to patch tiny_socket.py so it handles EINTR errors, retrying 
the recv/send operations. This would make sure that no signal breaks 
mysocket.mysend or mysocket.myreceive.
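
The retry approach described above could look roughly like the sketch below. 
It is not the patch attached to this report: the helper names retry_on_eintr 
and receive_exactly are hypothetical and only illustrate the idea of repeating 
an interrupted recv/send call while keeping data that was already received. On 
Python 3.5 and later the interpreter already retries interrupted system calls 
(PEP 475), so a wrapper like this mainly matters for older interpreters.

```python
import errno


def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR.

    Hypothetical helper, not part of tiny_socket.py.
    """
    while True:
        try:
            return func(*args)
        except (OSError, IOError) as exc:
            # Any error other than "interrupted system call" is a real failure.
            if exc.errno != errno.EINTR:
                raise


def receive_exactly(sock, size):
    """Read exactly `size` bytes from sock, surviving interrupted recv() calls.

    Chunks received before an interruption are kept, so no data is lost.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = retry_on_eintr(sock.recv, remaining)
        if not chunk:
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A send loop could be wrapped the same way, for example 
retry_on_eintr(sock.send, data), since EINTR on send() means nothing was 
transmitted by the interrupted call.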

As an optional workaround, if no fix is applied to tiny_socket.py, SIGCHLD 
signals could be ignored ("signal.signal(signal.SIGCHLD, signal.SIG_IGN)"), so 
that no EINTR error is raised when a sub-process ends. This would avoid the 
problem with spawn* and os.P_NOWAIT.
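
As a minimal sketch of the SIGCHLD workaround, the call could be wrapped at 
startup as shown below. The function name ignore_sigchld is hypothetical; only 
the signal.signal call comes from the report. It assumes a POSIX system and 
that it runs in the main thread, which is where Python allows signal handlers 
to be changed.

```python
import signal


def ignore_sigchld():
    """Ignore SIGCHLD and return the previously installed handler.

    Hypothetical helper: call it once from the main thread, before any
    sub-process is spawned with os.P_NOWAIT.
    """
    return signal.signal(signal.SIGCHLD, signal.SIG_IGN)
```

One trade-off to keep in mind: on Linux, once SIGCHLD is ignored, terminated 
children are reaped automatically, so code that later calls os.waitpid on them 
may get an ECHILD error instead of an exit status.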



_______________________________________________
Mailing list: https://launchpad.net/~c2c-oerpscenario
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~c2c-oerpscenario
More help   : https://help.launchpad.net/ListHelp
