Borja, thanks a lot for the report and for the patch.
** Changed in: openobject-server
Importance: Undecided => Wishlist
** Changed in: openobject-server
Status: New => Confirmed
** Changed in: openobject-server
Assignee: (unassigned) => OpenERP's Framework R&D (openerp-dev-framework)
--
tiny_socket does not handle EINTR and may fail if SIGCHLD signal received
https://bugs.launchpad.net/bugs/501617
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.
Status in OpenObject Server: Confirmed
Bug description:
The basic network functions used by OpenERP for receiving and sending data
(tiny_socket.py) don't handle EINTR ("interrupted system call") errors, and
that may cause weird race conditions.
EINTR errors happen when the process receives a signal while doing some
low-level I/O (like receiving or sending data over a socket): the kernel
interrupts the I/O operation (it is simply not performed*) so the process can
handle the signal, and the operation should be retried afterwards (the I/O
didn't really fail, it was just interrupted to wake up the process).
(*) EINTR for socket functions means "The recv() function was interrupted by
a signal that was caught, before any data was available." / "A signal
interrupted send() before any data was transmitted."
(http://www.opengroup.org/onlinepubs/000095399/functions/recv.html) so the
calls can be safely retried (http://www.wlug.org.nz/EINTR)
Python does not handle EINTR errors by itself (there have been discussions
about this: http://bugs.python.org/issue1628205), so it is up to the Python
programmer who uses I/O to take care of them (and retry the operation).
***
This bug was first detected in the Koo client, which uses a copy of the
tiny_socket.py file for NetRPC communication, but it may affect all the code
that depends on tiny_socket.py (like the server itself, the GTK client and the
Web client). The bug showed up on computers running Linux Mint 7 (kernel
2.6.28-16, 32-bit) and Ubuntu 9.10 64-bit (kernel 2.6.31-14); see
https://bugs.launchpad.net/openobject-client-kde/+bug/484651.
In mysocket.myreceive (tiny_socket.py), some data may already have been
received (in earlier calls to recv) when the EINTR error happens; as EINTR is
not handled, mysocket.myreceive just propagates the exception, so the current
operation fails. That means that OpenERP is susceptible to intermittent,
timing-dependent failures (it fails only when SIGCHLD, or another non-ignored
signal, arrives while performing I/O) or to denial of service attacks (sending
lots of signals to OpenERP).
For example, on the OpenERP server, some addons use spawnlp or other similar
functions to create sub-processes. Some of them, like jasper_reports, need to
run those sub-processes without waiting for the spawned process to end
(os.P_NOWAIT). In that context, OpenERP will receive a SIGCHLD signal whenever
a spawned sub-process ends. If OpenERP receives one of those signals while it
is performing a socket I/O operation (mainly using the socket.recv or
socket.send functions in tiny_socket.py), the call may fail with an EINTR
error (errno 4) and data may be lost.
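
The scenario above can be sketched roughly as follows. This is only an
illustration, not code from any addon: the names on_sigchld and
spawn_and_observe, and the use of sys.executable as the child command, are
assumptions made for the example. It spawns a child with os.P_NOWAIT and
records the SIGCHLD that the parent receives when the child ends (Unix only).

```python
import os
import signal
import sys
import time

received = []  # signal numbers seen by the handler (hypothetical helper state)


def on_sigchld(signum, frame):
    """Record that a SIGCHLD arrived; a real server would be mid-I/O here."""
    received.append(signum)


def spawn_and_observe(timeout=5.0):
    """Spawn a short-lived child without waiting and report signals seen."""
    previous = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        # P_NOWAIT returns immediately with the child's pid.
        pid = os.spawnlp(os.P_NOWAIT, sys.executable, sys.executable,
                         "-c", "pass")
        deadline = time.time() + timeout
        # Any blocking call made in this window is the one at risk of EINTR.
        while not received and time.time() < deadline:
            time.sleep(0.01)
        os.waitpid(pid, 0)  # reap the child so it does not stay a zombie
        return list(received)
    finally:
        signal.signal(signal.SIGCHLD, previous)
```

Any socket.recv or socket.send in progress during that window is the call
that may be interrupted.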
***
A possible fix is to patch tiny_socket.py so it handles EINTR errors, retrying
the recv/send operations. This would make sure that no signal breaks
mysocket.mysend or mysocket.myreceive.
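
A minimal sketch of what such a retry could look like is given below. It is
not the attached patch and not the actual tiny_socket.py code: the names
retry_on_eintr, send_all and recv_exactly are hypothetical, and the sketch
assumes fixed-size reads on a blocking socket. On Python 3.5 and later the
interpreter already retries these calls itself (PEP 475), so the loop mainly
matters for the older interpreters this report targets.

```python
import errno
import socket


def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except (OSError, IOError) as exc:
            if exc.errno != errno.EINTR:
                raise  # a real error: let the caller see it


def send_all(sock, data):
    """Send all of data, retrying each send() that a signal interrupts."""
    total = 0
    while total < len(data):
        sent = retry_on_eintr(sock.send, data[total:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        total += sent


def recv_exactly(sock, size):
    """Receive exactly size bytes, keeping chunks already read on EINTR."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = retry_on_eintr(sock.recv, remaining)
        if not chunk:
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Because only the interrupted call is retried, data collected by earlier
recv calls is kept instead of being thrown away with the exception.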
As an optional workaround, if no fix is applied to tiny_socket.py, SIGCHLD
signals could be ignored ("signal.signal(signal.SIGCHLD, signal.SIG_IGN)"), so
that no EINTR error is raised when a sub-process ends. This would avoid the
spawn* with os.P_NOWAIT problem.
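
The workaround could be applied once at startup, roughly as sketched here.
The function name ignore_sigchld is hypothetical, and where to call it in the
server is left open. Note, as an assumption worth checking on the target
platform, that with SIGCHLD ignored the kernel typically reaps children
itself, so code that later waits for a child's exit status may no longer get
one.

```python
import signal


def ignore_sigchld():
    """Ignore SIGCHLD so ending sub-processes do not interrupt I/O.

    Returns the previous handler so the caller can restore it, or None on
    platforms that have no SIGCHLD. Must be called from the main thread.
    """
    if not hasattr(signal, "SIGCHLD"):
        return None
    return signal.signal(signal.SIGCHLD, signal.SIG_IGN)
```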