[issue3139] bytearrays are not thread safe
Boya Sun boya@case.edu added the comment: I am still a little bit confused. Can you explain a little more in detail? What is the difference between the suspicious code and the ones that are fixed? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis mar...@v.loewis.de added the comment: A problem can only occur if you preserve a pointer to the buffer, and the object gets a chance to change its buffer underneath. This can happen when there are user-defined callback, and when other threads can get control. In the cases being fixed, other threads *can* get control, as the GIL is released. In the cases you discuss, this cannot happen, since the GIL is not released, and no codepath can lead to buffer reallocation. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Boya Sun boya@case.edu added the comment: Although the bug is fixed, the following three code segments seems suspicious in _codecsmodule.c in the latest revision 74624, and they are similar to the bug described here: (1) escape_decode(PyObject *self, PyObject *args) { ... const char *data; ... if (!PyArg_ParseTuple(args, s#|z:escape_decode, data, size, errors)) } (2) readbuffer_encode(PyObject *self, PyObject *args) { const char *data; ... if (!PyArg_ParseTuple(args, s#|z:readbuffer_encode, data, size, errors)) ... } (3) charbuffer_encode(PyObject *self, PyObject *args) { const char *data; ... if (!PyArg_ParseTuple(args, t#|z:charbuffer_encode, data, size, errors)) ... } Firstly, char *data; have been replaced by Py_buffer pbuf; in many procedures in this file in the bug fix, but these code did not; Secondly, they uses s# or t# which should probably changed to s*; I could be wrong about it. Does anyone have any opinions on the above code? Are they really buggy or am I misunderstanding anything? -- nosy: +boya ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: In theory, yes. But it happens that the GIL is not released before the buffer is used. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Travis Oliphant [EMAIL PROTECTED] added the comment: I'm sorry that I was unavailable for comment during July and August as it looks like a lot of decisions were made that have changed the semantics a bit. I'm still trying to figure out why the decisions were made that were made. I get the impression that most of the problems are related to objects incorrectly managing their exported buffers, but there may be some semantic issues related to t# that were not conceived of during the many discussions surrounding the design of PEP 3118. I'm not convinced that Py_buffer should have grown a link to an object. I think this is a shortcut solution due to misuse of the protocol that may have unfortunate consequences. I'm not sure where PyBuffer_Release came from. I can't find it in the PEP and don't remember what it's purpose is. Did I add it or did somebody elese? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Hi Travis, Glad you're back! I'm not convinced that Py_buffer should have grown a link to an object. I think this is a shortcut solution due to misuse of the protocol that may have unfortunate consequences. What consequences are you thinking about? Specifically, why shouldn't Py_buffer have a link to the object? It's the best way we've found to be able to release the buffer without having to keep a link to the originator ourselves. The concern is to simplify the API for most of its users. Especially, the new format codes (s* et al.) can just fill the Py_buffer rather than return several things at once. (please note that link can be NULL if you don't want to have the associated resource management) I'm not sure where PyBuffer_Release came from. I can't find it in the PEP and don't remember what it's purpose is. It's a replacement for PyObject_ReleaseBuffer(). Since a Py_buffer now has a link to its originator, there's no need to pass it separately to the releasing function. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le samedi 16 août 2008 à 05:50 +, Ismail Donmez a écrit : This seems to break test_unicode on MacOSX 10.5.4, test_unicode python(80320,0xa0659fa0) malloc: *** mmap(size=2147483648) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug python(80320,0xa0659fa0) malloc: *** mmap(size=2147483648) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug Can you run Lib/test/test_unicode.py directly to know which test precisely crashes? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Ismail Donmez [EMAIL PROTECTED] added the comment: Can you run Lib/test/test_unicode.py directly to know which test precisely crashes? The latest py3k branch is fine now and the test passes. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Ismail Donmez [EMAIL PROTECTED] added the comment: This seems to break test_unicode on MacOSX 10.5.4, test_unicode python(80320,0xa0659fa0) malloc: *** mmap(size=2147483648) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug python(80320,0xa0659fa0) malloc: *** mmap(size=2147483648) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I'm a bit confused. In the PyBuffer_Release implementation you committed, there is no DECREF at all. Oops, I meant to make the reference own by Py_buffer, but actually forgot to implement that. Fixed in r65677, r65678. Now, when I try to merge that into the 3k branch, test_unicode terribly leaks memory :-( It's really frustrating. Regards, Martin ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le jeudi 14 août 2008 à 16:13 +, Martin v. Löwis a écrit : Now, when I try to merge that into the 3k branch, test_unicode terribly leaks memory :-( It's really frustrating. If you don't have the time to work on it anymore, you can post the current patch here and I'll take a try. Regards Antoine. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: The patch is really trivial, and attached. Added file: http://bugs.python.org/file4/refcount.diff ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le jeudi 14 août 2008 à 18:52 +, Martin v. Löwis a écrit : The patch is really trivial, and attached. Added file: http://bugs.python.org/file4/refcount.diff By the way, even without that patch, there is a memory leak: Python 3.0b2+ (py3k, Aug 14 2008, 20:49:19) [GCC 4.3.1 20080626 (prerelease)] on linux2 Type help, copyright, credits or license for more information. import sys, codecs b = bytearray() sys.getrefcount(b) 2 codecs.ascii_decode(memoryview(b)) ('', 0) sys.getrefcount(b) 3 ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: By the way, even without that patch, there is a memory leak: With the patch, this memory leak goes away. Regards, Martin ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le jeudi 14 août 2008 à 19:06 +, Martin v. Löwis a écrit : Martin v. Löwis [EMAIL PROTECTED] added the comment: By the way, even without that patch, there is a memory leak: With the patch, this memory leak goes away. But now: 30 m = memoryview(b) sys.getrefcount(b) 32 del m sys.getrefcount(b) 31 ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Sorry, the roundup e-mail interface ate some lines of code: b = b'' sys.getrefcount(b) 30 m = memoryview(b) sys.getrefcount(b) 32 del m sys.getrefcount(b) 31 It doesn't happen with bytearrays, so I suspect it's that you only DECREF when releasebuffer method exists: b = bytearray() sys.getrefcount(b) 2 m = memoryview(b) sys.getrefcount(b) 4 del m sys.getrefcount(b) 2 ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: It doesn't happen with bytearrays, so I suspect it's that you only DECREF when releasebuffer method exists: Thanks, that was indeed the problem; this is now fixed in r65683 and r65685. My initial observation that test_unicode leaks a lot of memory was incorrect - it's rather that test_raiseMemError consumes all my memory (probably without leaking). test_unicode still leaks 6 references each time; one reference is leaked whenever a SyntaxError is raised. I'm not sure though whether this was caused by this patch, so I'll close this issue as fixed. Any further improvements should be done through separate patches (without my involvement, most likely). ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Martin v. Löwis [EMAIL PROTECTED]: -- resolution: - fixed status: open - closed ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le mardi 12 août 2008 à 16:15 +, Martin v. Löwis a écrit : I also started working on porting it to 3.0, but couldn't complete that port yet - the memoryview object doesn't play nicely. I've seen your recent merge and I don't know if you have finished with it. I think we should drop the base member in PyMemoryViewObject, it has become redundant and confusing. There are some places in memoryobject.c where base seems mistakingly used instead of view.obj, e.g. PyMemoryView_FromMemory INCREFs view.obj, but memory_dealloc DECREFs base. Also, I don't understand why memory_getbuf INCREFs view.obj, there is no corresponding DECREF in memory_releasebuf and view.obj should already have been INCREFed anyway. (if people want to get easily at the base object, we could provide be a macro e.g. PyMemory_GET_BASE()) By the way, perhaps PyBuffer_Release should set view-obj and view-buf to NULL (and view-len to -1?), it would be a simple way to signal that the buffer can't be used anymore. What do you think? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I've seen your recent merge and I don't know if you have finished with it. Indeed, I'm done, r65654. Unless there are actual bugs in these patches (in the sense that they don't fix the reported problem, or introduce new bugs), I'd like to close this issue. I think we should drop the base member in PyMemoryViewObject, it has become redundant and confusing. I tried, and failed; feel free to submit a patch (as a new issue). The tricky part is that base is sometimes used as a tuple. Also, I don't understand why memory_getbuf INCREFs view.obj, there is no corresponding DECREF in memory_releasebuf and view.obj should already have been INCREFed anyway. Ok, that's an open issue. Is the caller of FromMemory supposed to do Buffer_Release afterwards, or is ownership of the buffer transferred in the FromMemory call? (the issue didn't exist when the embedded reference was borrowed) To put it another way: view.obj has *not* been INCREFed. *view holds a reference, but that reference belongs to the caller, not to the memory object. Every time you initialize a PyObject* (such as view.obj), you need to INCREF, unless it's a borrowed reference. In any case, the corresponding DECREF *does* exist: in memory_dealloc, PyBuffer_Release is invoked, which releases the reference. By the way, perhaps PyBuffer_Release should set view-obj and view-buf to NULL (and view-len to -1?), it would be a simple way to signal that the buffer can't be used anymore. That can be done; it's again a separate patch. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: In any case, the corresponding DECREF *does* exist: in memory_dealloc, PyBuffer_Release is invoked, which releases the reference. I'm a bit confused. In the PyBuffer_Release implementation you committed, there is no DECREF at all. By the way, perhaps PyBuffer_Release should set view-obj and view-buf to NULL (and view-len to -1?), it would be a simple way to signal that the buffer can't be used anymore. That can be done; it's again a separate patch. Ok. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I have now committed the patch to 2.6 as r65654, adding changes for the bz2module. I also decided to make the Py_buffer structure own its reference, as I was running out of arguments why not to. In the process, I removed PyObject_ReleaseBuffer, as it is redundant and would have an unclear sematics (what if the object passed directly and the object passed indirectly were different?). ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I also started working on porting it to 3.0, but couldn't complete that port yet - the memoryview object doesn't play nicely. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: The last beta is arriving soon. We need to go ahead on this, or retarget it for 2.7/3.1. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: The last beta is arriving soon. We need to go ahead on this, or retarget it for 2.7/3.1. I'll look into it tomorrow. Regards, Martin ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: Two comments: * I like the new *-getarg parameters, but it would be better to test for '#' first since this is still by far the most used getarg parameter. * Antoine, I think your codecs.c patch has a glitch: +decoded = PyUnicode_DecodeMBCSStateful(pbuf.buf, pbuf.len, errors, + final ? NULL : consumed); + PyBuffer_Release(pbuf); +if (decoded == NULL) return NULL; -return codec_tuple(decoded, final ? size : consumed); +return codec_tuple(decoded, consumed); } You dropped the final ? size : for no apparent reason. -- nosy: +lemburg ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: You've missed the preceding line that says: +consumed = pbuf.len; If final is NULL, consumed isn't updated by the call to PyUnicode_DecodeMBCSStateful and keeps its original value of pbuf.len. (in all honesty, I didn't test under Windows so this particular function wasn't enabled when I compiled and ran the test suite) ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment: On 2008-08-02 16:49, Antoine Pitrou wrote: Antoine Pitrou [EMAIL PROTECTED] added the comment: You've missed the preceding line that says: +consumed = pbuf.len; If final is NULL, consumed isn't updated by the call to PyUnicode_DecodeMBCSStateful and keeps its original value of pbuf.len. (in all honesty, I didn't test under Windows so this particular function wasn't enabled when I compiled and ran the test suite) Thanks for clarifying this. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Here is a new patch wrapping up Martin's patch with the following additions: - getargs.c fixes - codecs module fixes - multiprocessing module fix - fileobject readinto fix - in bytearray deallocator, print out a SystemError if there are existing exports The whole test suite now passes, which is a good start :-) Added file: http://bugs.python.org/file11040/s_star2.patch ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Antoine Pitrou [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file11032/codecs.patch ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Antoine Pitrou [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file11033/getargs.patch ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: Amaury, I think the issue of thread-safety of the io library should be decoupled from this issue. I don't think it is release-critical that the io library is thread-safe (and even if it is release-critical, the problem should be tracked in a different issue, as it requires a completely independent patch). The original issue (bytearrays are not thread-safe) will not be completely resolved (for a certain definition of thread-safe): it will always be possible that one thread observes a locked byte object - this is by design of the buffer interface, and it is a good design. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Selon Martin v. Löwis [EMAIL PROTECTED]: Amaury, I think the issue of thread-safety of the io library should be decoupled from this issue. I don't think it is release-critical that the io library is thread-safe (and even if it is release-critical, the problem should be tracked in a different issue, as it requires a completely independent patch). Sorry Martin, I forgot to mention that I did open a separate issue in #3476. Regards Antoine. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I have now updated the patch to fix the socket bug, and the rejects bytearray for t#. As for making Py_buffer own a reference to the object: what should be the semantics for PyObject_ReleaseBuffer? I see the following options: - Drop PyObject_ReleaseBuffer - make it DECREF the embedded object, whether or not it is the same as the object being passed in - leave it as-is, and require the caller to DECREF. Added file: http://bugs.python.org/file11026/s_star.diff ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Martin v. Löwis [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file10969/s_star.diff ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Selon Martin v. Löwis [EMAIL PROTECTED]: As for making Py_buffer own a reference to the object: what should be the semantics for PyObject_ReleaseBuffer? I see the following options: - Drop PyObject_ReleaseBuffer - make it DECREF the embedded object, whether or not it is the same as the object being passed in - leave it as-is, and require the caller to DECREF. I don't know, is there supposed to be a semantic difference between PyObject_ReleaseBuffer and PyBuffer_Release? If not, I'd say drop it. Also, I think it's fine if you commit your fix without adding an incref/decref. In the absence of the designer of the buffer API it is difficult to know what subtleties should be taken into account when trying to change that API... ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le vendredi 01 août 2008 à 17:53 +, Amaury Forgeot d'Arc a écrit : There is a small issue with the patch: in the w# format handler, bf_getwritebuffer(arg, 0, res) is wrong. The third argument should be res (there is a compilation warning on windows), And a few lines after, in the if (*format == '#') block, there should be a line like *p = res; otherwise the user code never sees the buffer... Nice catch! Making those changes actually fixes a segfault I had in testReadinto in test_file.py. By the way, please note bytearray.decode is broken: Traceback (most recent call last): File stdin, line 1, in module TypeError: ascii_decode() argument 1 must be string or pinned buffer, not bytearray bytearray(b).decode(utf8) Traceback (most recent call last): File stdin, line 1, in module File /home/antoine/cpython/bufferedwriter/Lib/encodings/utf_8.py, line 16, in decode return codecs.utf_8_decode(input, errors, True) TypeError: utf_8_decode() argument 1 must be string or pinned buffer, not bytearray bytearray(b).decode(latin1) Traceback (most recent call last): File stdin, line 1, in module TypeError: latin_1_decode() argument 1 must be string or pinned buffer, not bytearray ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Giampaolo Rodola' [EMAIL PROTECTED]: -- nosy: -giampaolo.rodola ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I don't know, is there supposed to be a semantic difference between PyObject_ReleaseBuffer and PyBuffer_Release? If not, I'd say drop it. There are existing callers of it which would need to be changed, perhaps outside the core also; plus it's in PEP 3118. Technically, they do the same. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Martin, sorry for noticing this now but on python-dev you had proposed the following: « I propose that new codes s*, t*, w* are added, and that s#,t#,w# refuses objects which implement a releasebuffer procedure (alternatively, s# etc might be removed altogether right away) » I don't see any changes to s# and t# in your patch. Do you plan to do it later or do you think this part should be dropped? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I don't see any changes to s# and t# in your patch. Do you plan to do it later or do you think this part should be dropped? It's implemented. When bf_releasebuffer is not NULL, convertbuffer will complain string or read-only buffer. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: Ok for s#. But t# uses bf_getcharbuffer(), which does not seem to lock anything. I wonder if this can lead to the same kind of problems, for example when calling file_write with a binary file. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: But t# uses bf_getcharbuffer(), which does not seem to lock anything. Indeed. I think we can safely drop support for passing buffer objects into t# which have getbuffer/releasebuffer, so the check for releasebuffer being NULL should be added to t#. I think the bytearray object should refuse to implement getcharbuffer, anyway. I wonder if this can lead to the same kind of problems, for example when calling file_write with a binary file. It should be a text file to cause problems, right? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: The problem is that the fix for #3295 was committed in the py3k branch (in r64751) rather thank on the trunk! Once PyExc_BufferError is defined properly the crash disappears and exceptions are printed instead. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Benjamin Peterson [EMAIL PROTECTED] added the comment: Sorry, that was my oversight! I've backported the fix. -- nosy: +benjamin.peterson ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: It's indeed better. Now with when running my previous script I can see the exception ;-) Exception in thread Thread-2: Traceback (most recent call last): File C:\dev\python\trunk1\lib\threading.py, line 523, in __bootstrap_inner self.run() File C:\dev\python\trunk1\lib\threading.py, line 478, in run self.__target(*self.__args, **self.__kwargs) File stdin, line 3, in f File C:\dev\python\trunk1\lib\io.py, line 1473, in write self.buffer.write(b) File C:\dev\python\trunk1\lib\io.py, line 1041, in write self._write_buf.extend(b) BufferError: Existing exports of data: object cannot be re-sized Again, I think this is unfortunate for a simple script that prints from several threads. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le mercredi 30 juillet 2008 à 23:03 +, Amaury Forgeot d'Arc a écrit : Again, I think this is unfortunate for a simple script that prints from several threads. Yes, but it's an issue with the BufferedWriter implementation, since it changes the length of its underlying bytearray object. If it was rewritten to use a fixed-size bytearray, the problem would probably disappear. (in the middle term BufferedReader and BufferedWriter should perhaps be both rewritten in C) ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: If it was rewritten to use a fixed-size bytearray does such an object exist today? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Le jeudi 31 juillet 2008 à 00:00 +, Amaury Forgeot d'Arc a écrit : Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: If it was rewritten to use a fixed-size bytearray does such an object exist today? Manually, yes :) Just preallocate your bytearray, e.g.: b = bytearray(b * 4096) and then be careful to only do operations (e.g. slice assignments) which keep the size intact. In a BufferedWriter implementation, you would have to keep track of the currently used chunk in the bytearray (offset and size). Anyway, I'd question the efficiency of the bytearray approach; when removing the quadratic behaviour in BufferedReader I discovered that using a bytearray was slower than keeping a list of bytes instances and joining them when needed (not to mention that the latter is inherently thread-safe :-)). The reason is that the underlying raw stream expects and returns bytes, and the public buffered API also does, so using a bytearray internally means lots of additional memory copies. (a related problem is that readinto() does not let you specify an offset inside the given buffer object, it always starts writing at the beginning of the buffer. Perhaps memoryview() will support creating subbuffers, I don't know...) ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: With the patch the following script crashes the 2.6 interpreter on Windows: import sys, io, threading stdout2 = io.open(sys.stdout.fileno(), mode=w) def f(i): for i in range(10): stdout2.write(unicode((x, i)) + '\n') for x in range(10): t = threading.Thread(target=f, args=(x,)) t.start() (with py3k, replace stdout2.write with a simple print) ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: (1) are you sure it is safe not to INCREF the obj pointer in the Py_buffer? Yes, that's the semantics of the current buffer interface, and I cannot see a flaw in it. When you call getbuffer, you certainly hold a reference, and it is your obligation to hold onto this reference somehow. So it is definitely safe (if properly documented). It would seem more logical for PyBuffer_FillInfo to INCREF the obj, and for PyBuffer_Release to DECREF it and set it to NULL. Perhaps. I cannot quite see what undesirable consequences that might have - feel free to develop and test an alternative patch that takes that approach. (2) is it necessary to call directly bf_getbuffer the like or is there a higher-level API to do it? There is PyObject_GetBuffer and PyObject_ReleaseBuffer, but it is not higher-level. I need to check the function pointers, anyway (to determine whether the object supports getbuffer and requires releasebuffer or not), so calling them directly is the right level of abstraction (IMO). (3) didn't you forget to convert PyArg_ParseTuple(args, s#iO:sendto, [...]) in sock_sendto? True. (4) is it really necessary to do a special case with PyString_Check() rather than rely on the string type's getbuffer method? That's what the code always did (for s#). It would be possible to eliminate that case, yes. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: I haven't yet studied the patch in detail but I have a few questions: (1) are you sure it is safe not to INCREF the obj pointer in the Py_buffer? in the cases handled by your patch we still hold a reference to the original args tuple and dict before calling PyBuffer_Release(), but some functions may want to store the Py_buffer object somewhere to re-use it later. It would seem more logical for PyBuffer_FillInfo to INCREF the obj, and for PyBuffer_Release to DECREF it and set it to NULL. (2) is it necessary to call directly bf_getbuffer the like or is there a higher-level API to do it? (3) didn't you forget to convert PyArg_ParseTuple(args, s#iO:sendto, [...]) in sock_sendto? (4) is it really necessary to do a special case with PyString_Check() rather than rely on the string type's getbuffer method? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Travis, it would be really nice to have your input on this. -- nosy: +teoliphant ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: Here is a patch adding the s* format, and changing files, sockets, and fileio to use it. For bz2, the immediate effect is that you get a type error (as an object providing bf_releasebuffer cannot be converted through s#/w# anymore); it would be possible to fix bz2 also to use s*/w* instead. I'd like reviewers to focus on flaws in the approach and bugs in the implementation; existing code should be converted to the new API afterwards (or not converted at all for 2.6/3.0, but only as patches get contributed). If this is accepted in principle, I'll forward-port it to 3.0. Added file: http://bugs.python.org/file10969/s_star.diff ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Barry A. Warsaw [EMAIL PROTECTED]: -- priority: deferred blocker - release blocker ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Barry A. Warsaw [EMAIL PROTECTED] added the comment: Not blocking beta 2 because there's not enough time for the fix, but this will definitely block beta 3. -- nosy: +barry priority: release blocker - deferred blocker ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: Fixing this in the methods themselves has the disadvantage that the error reporting is not that good anymore. ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Ismail Donmez [EMAIL PROTECTED]: -- nosy: +cartman ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Martin v. Löwis [EMAIL PROTECTED]: -- priority: - release blocker ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: If I try to follow the chain the consequences: - all PyArg_ParseTuple(s#) calls that release the GIL afterwards should be re-written to use another API (which one I don't know exactly, but hopefully the appropriate functions are already provided by the buffer API); this applies to third-party extension modules as well - consequently, forward compatibility is broken in an important way (but it would probably be ok for py3k) - perhaps the bytearray type should not have been backported to 2.6, or perhaps it should carry a big warning in the documentation ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: By the way, here's a more reliable way to make it crash (on my Linux machine): import bz2, threading bz2c = bz2.BZ2Compressor() # Span at least a whole arena (256KB long) junk_len = 512 * 1024 junk = ba * junk_len buf = bytearray() for x in range(50): buf[:] = junk t = threading.Thread(target=bz2c.compress, args=[buf]) t.start() buf[:] = b t.join() ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: Now I've just discovered there is the same problem with the array.array() type (see following code). import bz2, threading, array bz2c = bz2.BZ2Compressor() # Span at least a whole arena (256KB long) junk_len = 512 * 1024 junk = array.array(c, ba) * junk_len empty = array.array(c) buf = empty[:] for x in range(50): buf[:] = junk t = threading.Thread(target=bz2c.compress, args=[buf]) t.start() buf[:] = empty t.join() ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Martin v. Löwis [EMAIL PROTECTED] added the comment: I believe the 2.6 s# processing works correctly; the error is in the bytearray object. This either shouldn't support the buffer interface, or it shouldn't reallocate the buffer after having returned a pointer to it. For 3k, convertbuffer shouldn't call bf_releasebuffer; instead, the caller of ParseTuple somehow needs to release the buffer. As a consequence, s# should refuse buffer objects who have a bf_releasebuffer operation, and another code might get added to fill out a Py_buffer - or s# can get changed to always produce a Py_buffer (which would affect the code significantly). -- nosy: +loewis ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Antoine Pitrou [EMAIL PROTECTED] added the comment: For reference, here is a proof-of-concept patch which shows how to fix the bytearray crasher above (it raises a BufferError instead). Added file: http://bugs.python.org/file10822/bzcrash.py ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Antoine Pitrou [EMAIL PROTECTED]: Removed file: http://bugs.python.org/file10822/bzcrash.py ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Changes by Antoine Pitrou [EMAIL PROTECTED]: Added file: http://bugs.python.org/file10823/bzcrash.patch ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3139] bytearrays are not thread safe
Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment: Probably, but this affects all PyArg_ParseTuple(s#) calls that release the GIL afterwards. How many of them are there? ___ Python tracker [EMAIL PROTECTED] http://bugs.python.org/issue3139 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com