Changes by Glyph Lefkowitz gl...@twistedmatrix.com:
--
nosy: +glyph
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15958
___
___
Python-bugs-list
Serhiy Storchaka added the comment:
One more issue: you forgot to add join.h to BYTESTR_DEPS in Makefile.pre.in.
--
Roundup Robot added the comment:
New changeset 388e43bb519d by Antoine Pitrou in branch 'default':
Followup to issue #15958: add join.h to Makefile dependencies for byte strings
http://hg.python.org/cpython/rev/388e43bb519d
--
Antoine Pitrou added the comment:
Here is a new patch checking that the sequence size didn't change. I also
refactored the join() implementation into a shared function in stringlib.
--
Added file: http://bugs.python.org/file27594/bytes_join_buffers3.patch
Serhiy Storchaka added the comment:
I added new comments. :-(
--
Antoine Pitrou added the comment:
> I added new comments. :-(
Thanks. I think I will commit after adding the missing #undef :-)
--
Roundup Robot added the comment:
New changeset 16285c1b4dda by Antoine Pitrou in branch 'default':
Issue #15958: bytes.join and bytearray.join now accept arbitrary buffer objects.
http://hg.python.org/cpython/rev/16285c1b4dda
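A minimal sketch of what this changeset enables, assuming an interpreter that includes it (the variable names are illustrative only). Any object exporting the buffer protocol can be an item, and its raw bytes are copied regardless of the item's format:

```python
import array

# Items may be bytes, bytearray, memoryview, or any other buffer exporter.
parts = [b"foo", bytearray(b"bar"), memoryview(b"baz")]

joined = b"-".join(parts)                 # result is a bytes object
joined_ba = bytearray(b"-").join(parts)   # result is a bytearray

# An array of doubles contributes its underlying machine bytes,
# the same bytes that doubles.tobytes() returns.
doubles = array.array("d", [1.2345])
raw = b"".join([b"ABC", doubles])
```

The last call mirrors the array example discussed in this thread: the join copies memory as-is and does not check the item's format.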
--
nosy: +python-dev
Antoine Pitrou added the comment:
Done now.
--
Changes by Antoine Pitrou pit...@free.fr:
--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Serhiy Storchaka added the comment:
Well done.
However, the check at the top of Objects/stringlib/join.h does not protect
against using the file with asciilib or ucs1lib.
--
Antoine Pitrou added the comment:
> However, the check at the top of Objects/stringlib/join.h does not protect
> against using the file with asciilib or ucs1lib.
I'm not sure that's a problem. Someone would have to go out of their way
to use join.h with only UCS1 unicode strings. Also tests would probably
Antoine Pitrou added the comment:
Here is an updated patch using PySequence_Fast_GET_SIZE to avoid problems when
the sequence is resized during iteration.
--
Added file: http://bugs.python.org/file27585/bytes_join_buffers2.patch
Antoine Pitrou added the comment:
Here is a patch with tests.
--
Added file: http://bugs.python.org/file27554/bytes_join_buffers.patch
Serhiy Storchaka added the comment:
Patch LGTM, however...
$ ./python -m timeit -s "a=[b'a']*10" "b','.join(a)"
Vanilla: 3.69 msec per loop
Patched: 11.6 msec per loop
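The same measurement can be sketched from within Python using the standard timeit module. The loop count below is an arbitrary choice for this sketch, and the figures it prints depend on the machine and the build, so none are shown here:

```python
import timeit

# Same setup and statement as the command line above.
setup = "a = [b'a'] * 10"
stmt = "b','.join(a)"

number = 100_000  # arbitrary loop count chosen for this sketch
total = timeit.timeit(stmt, setup=setup, number=number)

per_loop_usec = total / number * 1e6
print(f"{per_loop_usec:.3f} usec per loop")
```

Running this once on an unpatched build and once on a patched build gives the two numbers to compare.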
--
Antoine Pitrou added the comment:
> Patch LGTM, however...
>
> $ ./python -m timeit -s "a=[b'a']*10" "b','.join(a)"
> Vanilla: 3.69 msec per loop
> Patched: 11.6 msec per loop

True. It is a bit of a pathological case, though.
--
Serhiy Storchaka added the comment:
Here is a patch with restored performance. Isn't it too complicated?
--
Added file: http://bugs.python.org/file27557/bytes_join_buffers_2.patch
Antoine Pitrou added the comment:
The problem with your approach is that the sequence could be mutated while
another thread is running (_getbuffer() may release the GIL). Then the
pre-computed size gets wrong.
--
Serhiy Storchaka added the comment:
> The problem with your approach is that the sequence could be mutated while
> another thread is running (_getbuffer() may release the GIL). Then the
> pre-computed size gets wrong.

Well, then I withdraw my patch.
But what if the sequence will be mutated and
Antoine Pitrou added the comment:
> But what if the sequence will be mutated and PySequence_Size(seq) will become
> less than seqlen? Then using PySequence_Fast_GET_ITEM() will be incorrect.

Perhaps we should detect that case and raise, then.
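A rough Python sketch of the "detect and raise" idea, not the C code from any of the patches. The function name, the get_buffer hook (which stands in for a buffer acquisition that may run other code), and the exception type and message are all assumptions made for illustration:

```python
def join_with_size_check(sep, seq, get_buffer=memoryview):
    """Join buffer objects, raising if seq is resized while buffers are taken.

    get_buffer is a hypothetical hook standing in for the C-level buffer
    request, which may run arbitrary code and mutate seq.
    """
    expected = len(seq)
    views = []
    try:
        for i in range(expected):
            views.append(get_buffer(seq[i]))
            # Re-check after every acquisition, so that indexing seq[i]
            # on the next iteration can never go out of bounds.
            if len(seq) != expected:
                raise RuntimeError("sequence changed size during iteration")
        return sep.join(views)
    finally:
        for view in views:
            view.release()
```

Checking after each acquisition, rather than once at the end, is what keeps the later item accesses safe when the sequence shrinks.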
--
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file27557/bytes_join_buffers_2.patch
Ezio Melotti added the comment:
Attached patch adds support for memoryviews to bytes.join:
>>> b''.join([memoryview(b'foo'), b'bar'])
b'foobar'
The implementation currently has some duplication, because it does a first pass
to calculate the total size to allocate, and another pass to create the
Serhiy Storchaka added the comment:
I think memoryview here is example only, and Antoine had in mind arbitrary
buffer objects.
--
Ezio Melotti added the comment:
Indeed. Attached new patch.
Tests still need to be improved; bytearrays are still not changed.
--
Added file: http://bugs.python.org/file27258/issue15958-2.diff
Stefan Krah added the comment:
We would need to release the buffers and also check for format 'B'.
With issue15958-2.diff this is possible:
>>> import array
>>> a = array.array('d', [1.2345])
>>> b''.join([b'ABC', a])
b'ABC\x8d\x97n\x12\x83\xc0\xf3?'
It is unfortunate that a PyBUF_SIMPLE request does
Stefan Krah added the comment:
Also, perhaps we can keep a fast path for bytes and bytearray, but I
didn't time the difference.
--
Antoine Pitrou added the comment:
Well, given the following works:
>>> import array
>>> a = array.array('d', [1.2345])
>>> b'' + a
b'\x8d\x97n\x12\x83\xc0\xf3?'
It should also work for bytes.join().
I guess that means I'm against the strict-typedness of memoryviews. As the name
suggests, it provides
Ezio Melotti added the comment:
Attached new refleakless patch.
--
Added file: http://bugs.python.org/file27259/issue15958-3.diff
Antoine Pitrou added the comment:
> Attached new refleakless patch.

Your approach is dangerous, because the buffers may change size between
two calls to PyObject_GetBuffer(). I think you should keep the
Py_buffers alive in an array, and only release them at the end (it may
also be slightly
Serhiy Storchaka added the comment:
> I think you should keep the
> Py_buffers alive in an array, and only release them at the end (it may
> also be slightly faster to do so).

However allocation of this array may considerably slow down the function. We
may need the special-case for bytes and
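The "keep the buffers alive" approach quoted above can be sketched in pure Python. This is an illustration under assumptions, not the stringlib implementation: memoryview objects stand in for Py_buffer structs, the function name is hypothetical, and no fast path for bytes or bytearray is attempted:

```python
def join_keep_buffers(sep, items):
    """Two-pass join that holds every buffer open until the copy is done."""
    seq = list(items)
    views = []
    try:
        # Pass 1: acquire each buffer exactly once and add up the sizes.
        total = 0
        for item in seq:
            view = memoryview(item)  # TypeError if item exports no buffer
            views.append(view)
            total += view.nbytes
        if views:
            total += len(sep) * (len(views) - 1)

        # Pass 2: copy from the views acquired in pass 1. An exporter such
        # as bytearray cannot be resized while a view on it is alive, so
        # the sizes computed above remain valid.
        out = bytearray(total)
        pos = 0
        for i, view in enumerate(views):
            if i:
                out[pos:pos + len(sep)] = sep
                pos += len(sep)
            out[pos:pos + view.nbytes] = view.tobytes()
            pos += view.nbytes
        return bytes(out)
    finally:
        # Release the buffers only at the very end.
        for view in views:
            view.release()
```

The list of views is the extra allocation that the reply above is concerned about; a special case for exact bytes and bytearray items would avoid it for the common inputs.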
New submission from Antoine Pitrou:
This should ideally succeed:
>>> b''.join([memoryview(b'foo'), b'bar'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected bytes, memoryview found
(ditto for bytearray.join)
--
components:
Changes by Serhiy Storchaka storch...@gmail.com:
--
nosy: +storchaka
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever