Hello,

It seems this subject has had quite a bit of history. Tim Peters demonstrated the problem in 2003 in this message:
http://mail.python.org/pipermail/python-dev/2003-June/036537.html

In short, Python file objects release the GIL before calling any C stdlib function on their embedded FILE pointer. Unfortunately, if another thread calls fclose on the FILE pointer concurrently, the contents pointed to can become garbage and the interpreter process crashes. Just by using the same file object in two threads running pure Python code, you can crash the interpreter.

(Another, easier-to-solve problem is that the FILE pointer stored in the file object could become NULL at the point it is used by another thread. If that were the only problem, you could just store the FILE pointer in a local variable before releasing the GIL, et voilà.)

There was some discussion at the time about possible resolutions. I've tried to fix the problem, and I've come to what I think is a satisfying solution, which I can sum up in the following bullet points:

* Each file object gets a dedicated counter, which is incremented before the object releases the GIL and decremented after the GIL is taken again; thus this counter keeps track of how many running "unlocked" sections of code are using that particular file object. (Please note the counter doesn't need its own lock, since it is only modified in GIL-protected sections.)

* In the close() method, if the aforementioned counter is greater than 0, we refuse to call fclose and instead raise an IOError.

This may seem like a worrying semantic change, but I don't think it is, for the following reasons:

1) if we closed the FILE pointer anyway, the interpreter would likely crash because another thread would be using garbage data (that's what we are trying to fix after all!)
2) if close() raises an IOError, it can be called again later, or at worst fclose will be called when the file object is garbage collected
3) close() can already raise an IOError if fclose fails for whatever reason (although that is probably very rare)
4) it doesn't seem wrong to notify the programmer that his code is very unsafe

The patch is attached at http://bugs.python.org/issue815646 . It addresses (or at least I hope it does) all potential problems with pure Python code, threads, and the file object.

It doesn't try to fix C extensions that use the PyFile_AsFile API and do their own dirty things with the FILE pointer. That could be a second step if the approach is accepted, but as noted in the 2003 discussions it would probably involve a new API. Whether we want to introduce such an API in Python 2.x while Python 3.0 has a different IO model anyway is left open to discussion :)

Regards

Antoine.
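
P.S. To make the counter idea concrete, here is a minimal Python sketch of the scheme described in the two bullet points above. It is only an illustration under stated assumptions, not the patch itself (the patch is C code in the file object implementation): the class name GuardedFile and its attributes are made up for this example, a threading.Lock stands in for the GIL, and an ordinary Python file-like object stands in for the FILE pointer.

```python
import threading


class GuardedFile:
    """Toy model of the proposed scheme (hypothetical, not the C patch)."""

    def __init__(self, fileobj):
        self._file = fileobj            # stand-in for the embedded FILE pointer
        self._gil = threading.Lock()    # stand-in for the GIL
        self._unlocked_count = 0        # number of running "unlocked" sections

    def read(self, size=-1):
        with self._gil:
            # Copy the "pointer" to a local while still protected, so it
            # cannot become None under our feet.
            f = self._file
            if f is None:
                raise ValueError("I/O operation on closed file")
            self._unlocked_count += 1   # about to "release the GIL"
        try:
            return f.read(size)         # the blocking call runs unprotected
        finally:
            with self._gil:             # "GIL taken again"
                self._unlocked_count -= 1

    def close(self):
        with self._gil:
            if self._unlocked_count > 0:
                # Another thread is inside an unlocked section: refuse to
                # close rather than pull the file out from under it.
                raise IOError("close() called during a concurrent "
                              "operation on the same file object")
            f, self._file = self._file, None
        if f is not None:
            f.close()
```

In this sketch, a close() that raises IOError leaves the object untouched, so the caller can simply retry it once the other thread has finished its read.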