Re: socket send O(N**2) complexity
exarkun at twistedmatrix.com writes: To the OP, you can get view-like behavior with the buffer builtin. And, on Python 3 (or even the 2.7 in development), you can use the memoryview builtin for similar effect. Regards Antoine. -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Mon, 21 Sep 2009 16:33:08 -0400, Jack Diederich wrote: AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. Not in CPython. While some special strings are re-used (empty string, single letters) if you take a slice of an existing string a new buffer is allocated and the slice memcpy'd into it. Er, why? I can understand doing this for mutable sequences, but it doesn't seem to make much sense for strings. -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Tue, Sep 22, 2009 at 7:25 PM, Nobody nob...@nowhere.com wrote: On Mon, 21 Sep 2009 16:33:08 -0400, Jack Diederich wrote: AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. Not in CPython. While some special strings are re-used (empty string, single letters) if you take a slice of an existing string a new buffer is allocated and the slice memcpy'd into it. Er, why? I can understand doing this for mutable sequences, but it doesn't seem to make much sense for strings. Because it adds large amounts of complexity. If you have a string that points to 8 bytes in a 1Gig string and the giant string goes out of scope, what do you do? You can keep it alive, you can copy out 8 bytes; you can do one of those things now or you can wait until later when some kind of garbage collection happens. Once you have that system why not pile ropes on top of it? and once you have ropes why not just make strings mutable ... If you want much longer mailing list threads about the topic search the python-dev archives. -Jack -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Wed, 23 Sep 2009 00:25:52 +0100, Nobody wrote: On Mon, 21 Sep 2009 16:33:08 -0400, Jack Diederich wrote: AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. Not in CPython. While some special strings are re-used (empty string, single letters) if you take a slice of an existing string a new buffer is allocated and the slice memcpy'd into it. Er, why? I can understand doing this for mutable sequences, but it doesn't seem to make much sense for strings. Consider: huge_string = abcdef*1000*1000*1000 tiny_string = huge_string[42:45] del huge_string Under the current behaviour, huge_string will be garbage collected. With the proposed string-view, it won't be. It would be surprising and disturbing if taking a tiny slice of a huge string prohibited the huge string from being garbage collected. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
socket send O(N**2) complexity
The mysocket.mysend method given at http://docs.python.org/howto/sockets.html has an (unwitting?) O(N**2) complexity for long msg due to the string slicing. I've been looking for a way to optimize this, but aside from a pure python 'string slice view' that looks at the original string I can't think of anything. Perhaps start and end keywords could be added to send? I can't think of a reason for the end keyword, but it would be there for symmetry. -- Zachary Burns (407)590-4814 Aim - Zac256FL Production Engineer (Digital Overlord) Zindagi Games -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Sep 21, 2:03 pm, Zac Burns zac...@gmail.com wrote: The mysocket.mysend method given athttp://docs.python.org/howto/sockets.htmlhas an (unwitting?) O(N**2) complexity for long msg due to the string slicing. I've been looking for a way to optimize this, but aside from a pure python 'string slice view' that looks at the original string I can't think of anything. Perhaps start and end keywords could be added to send? I can't think of a reason for the end keyword, but it would be there for symmetry. for ch in msg: add to current buffer or start another buffer for buf in buffers: send(...buf) -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
Zac Burns wrote in news:mailman.211.1253559803.2807.python-l...@python.org in comp.lang.python: The mysocket.mysend method given at http://docs.python.org/howto/sockets.html has an (unwitting?) O(N**2) complexity for long msg due to the string slicing. I've been looking for a way to optimize this, but aside from a pure python 'string slice view' that looks at the original string I can't think of anything. Perhaps start and end keywords could be added to send? I can't think of a reason for the end keyword, but it would be there for symmetry. I ran this script on various versions of python I have access to: #encoding: utf-8 raw_input( start ) s = 'x' * 100 r = [None] * 1000 raw_input( allocated 1 meg + ) for i in xrange(1000): r[i] = s[:] raw_input( end ) Niether of the CPython versions (2.5 and 3.0 (with modified code)) exibited any memory increase between allocated 1 meg + and end pypy-c (1.0.0) showed a 30k jump, and IronPython 2.0 showed a few megs jump. AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. So even though a slice assignment new_s = s[:] appears to a python programmer to make a copy of s, its only the a few bytes of metadata (the pointer and the length) that is really copied, the strings character data stays where it is. So the code you cite is in fact O(N) as the copy is constant size. Rob. -- http://www.victim-prime.dsl.pipex.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On 08:00 pm, r...@freenet.co.uk wrote: Zac Burns wrote in news:mailman.211.1253559803.2807.python- l...@python.org in comp.lang.python: The mysocket.mysend method given at http://docs.python.org/howto/sockets.html has an (unwitting?) O(N**2) complexity for long msg due to the string slicing. I've been looking for a way to optimize this, but aside from a pure python 'string slice view' that looks at the original string I can't think of anything. Perhaps start and end keywords could be added to send? I can't think of a reason for the end keyword, but it would be there for symmetry. I ran this script on various versions of python I have access to: #encoding: utf-8 raw_input( start ) s = 'x' * 100 r = [None] * 1000 raw_input( allocated 1 meg + ) for i in xrange(1000): r[i] = s[:] raw_input( end ) Niether of the CPython versions (2.5 and 3.0 (with modified code)) exibited any memory increase between allocated 1 meg + and end You bumped into a special case that CPython optimizes. s[:] is s. If you repeat your test with s[1:], you'll see memory climb as one might normally expect. pypy-c (1.0.0) showed a 30k jump, and IronPython 2.0 showed a few megs jump. AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. So even though a slice assignment new_s = s[:] appears to a python programmer to make a copy of s, its only the a few bytes of metadata (the pointer and the length) that is really copied, the strings character data stays where it is. So the code you cite is in fact O(N) as the copy is constant size. This all (basically) valid for the special case of s[:]. For any other string slicing, though, the behavior is indeed O(N**2). To the OP, you can get view-like behavior with the buffer builtin. Here's an example of its usage from Twisted, where it is employed for exactly the reason raised here: http://twistedmatrix.com/trac/browser/trunk/twisted/internet/abstract.py#L93 Jean-Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Mon, Sep 21, 2009 at 4:00 PM, Rob Williscroft r...@freenet.co.uk wrote: AIUI, as a python string is imutable, a slice of a string is a new string which points (C char *) to the start of the slice data and with a length that is the length of the slice, about 8 bytes on 32 bit machine. Not in CPython. While some special strings are re-used (empty string, single letters) if you take a slice of an existing string a new buffer is allocated and the slice memcpy'd into it. So even though a slice assignment new_s = s[:] appears to a python programmer to make a copy of s, its only the a few bytes of metadata (the pointer and the length) that is really copied, the strings character data stays where it is. There is a special case for copy in CPython that avoids copying the string bytes, BUT in that case it just INCREFs the existing object and returns it. There are no new allocations at all. Be very careful recommending optimizations based on how you think the underlying implementation works without double checking with the code first. -Jack -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
wrote in news:mailman.216.1253565002.2807.python-l...@python.org in comp.lang.python: Niether of the CPython versions (2.5 and 3.0 (with modified code)) exibited any memory increase between allocated 1 meg + and end You bumped into a special case that CPython optimizes. s[:] is s. If you repeat your test with s[1:], you'll see memory climb as one might normally expect. Thanks for the correction (to Jack also) Rob. -- http://www.victim-prime.dsl.pipex.com/ -- http://mail.python.org/mailman/listinfo/python-list
Re: socket send O(N**2) complexity
On Mon, Sep 21, 2009 at 2:10 PM, Rob Williscroft r...@freenet.co.uk wrote: wrote in news:mailman.216.1253565002.2807.python-l...@python.org in comp.lang.python: Niether of the CPython versions (2.5 and 3.0 (with modified code)) exibited any memory increase between allocated 1 meg + and end You bumped into a special case that CPython optimizes. s[:] is s. If you repeat your test with s[1:], you'll see memory climb as one might normally expect. Thanks for the correction (to Jack also) Rob. -- http://www.victim-prime.dsl.pipex.com/ -- http://mail.python.org/mailman/listinfo/python-list Thanks! The buffer object was exactly the 'slice view' that I was looking for to fix my version of the algorithm. -- Zachary Burns (407)590-4814 Aim - Zac256FL Production Engineer (Digital Overlord) Zindagi Games -- http://mail.python.org/mailman/listinfo/python-list