[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-05-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Note that this is not work with the punycode encoding (and may be some 
third-party encodings).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-05-13 Thread Nikolaus Rath

Nikolaus Rath added the comment:

On 05/13/2014 12:41 PM, Serhiy Storchaka wrote:
 Note that this is not work with the punycode encoding (and may be some 
 third-party encodings).

I do not understand. Could you elaborate?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-29 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 2a1d63f09560 by Antoine Pitrou in branch 'default':
Issue #21057: TextIOWrapper now allows the underlying binary stream's read() or 
read1() method to return an arbitrary bytes-like object (such as a memoryview).
http://hg.python.org/cpython/rev/2a1d63f09560

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Thank you, I've committed tha patch to 3.5.

--
resolution:  - fixed
stage: patch review - resolved
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-28 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 My usecase is that I have a binary stream class that internally uses 
 memoryviews.

Ok, I think it is a reasonable use case. I'm gonna look at your patch and give 
it a review.

--
stage:  - patch review

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-28 Thread Nikolaus Rath

Changes by Nikolaus Rath nikol...@rath.org:


Added file: http://bugs.python.org/file35083/issue_21057_r3.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-12 Thread Martin v . Löwis

Martin v. Löwis added the comment:

Nikolaus, can you please restate from scratch what the issue is? If the issue 
is still the one you posed in msg214776, I think the issue should be closed as 
invalid - it's *not* the case that there is no good reason for TextIOWrapper 
not accepting any bytes-like object. Or, to drop the double negation: as Serhiy 
states, read1() should return bytes, and it's perfectly fine to rely on that. 
Your MyByteStream ought to fail, so it is correct behaviour that it does fail.

--
nosy: +loewis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-04-12 Thread Nikolaus Rath

Nikolaus Rath added the comment:

My usecase is that I have a binary stream class that internally uses 
memoryviews. I would like to read text data from this stream and thus 
encapsulate it in a TextIOWrapper. Currently, TextIOWrapper (correctly) expects 
read() to return bytes and fails if it receives any other bytes-like object. I 
can change my custom class to internally convert to bytes, but this means that 
the data is needlessly copied around and affects every other consumer of the 
class as well.

Changing TextIOWrapper to work with any bytes-like object is (as far as I can 
see) rather simple. It does not introduce any new branches in the code, and it 
does not change the behavior for bytes objects at all. It does, however, 
eliminate unnecessary memcopies for classes that do not internally work with 
bytes objects. Therefore, I was hoping this patch could be considered for 
inclusion.

The MyByteStream example that I gave in the first message is useless. I merely 
included it as the smallest possible code fragment that currently does not 
work, but would work after the patch in an attempt to illustrate what I meant - 
but apparently it had the opposite effect.

Thanks for considering!

--
type: behavior - enhancement
versions:  -Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-31 Thread Nikolaus Rath

Nikolaus Rath added the comment:

Thanks for the feedback! I have attached an updated patch.

I did not include any testcase because the patch did not create any new code 
paths, so I was assuming it'd be covered by the existing test case. But of 
course I was wrong. In the revised patch, I added a testcase based on your 
example of a more complex memoryview.

(Note, however, that even with the previous implementation using nbytes = 
PySequence_Size(input_chunk) this test does not fail, because nbytes is used 
only to estimate the size of the text string).

--
Added file: http://bugs.python.org/file34690/issue21057.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-27 Thread Nikolaus Rath

Nikolaus Rath added the comment:

I'm attaching a patch that enables TextIOWrapper to work with bytes-like 
objects from the underlying file descriptor.

The code changes are pretty small, without introducing any significant 
additional complexity.

For streams providing bytes objects, this patch does not change anything. For 
streams that are able to provide bytearrays or memoryviews, this patch removes 
the overhead that would be incurred by explicitly converting to bytes.

--
keywords: +patch
Added file: http://bugs.python.org/file34645/issue21057.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-27 Thread Nikolaus Rath

Changes by Nikolaus Rath nikol...@rath.org:


Removed file: http://bugs.python.org/file34645/issue21057.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-27 Thread Nikolaus Rath

Changes by Nikolaus Rath nikol...@rath.org:


Added file: http://bugs.python.org/file34646/issue21057.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 If someone is willing to do the work (and I am), is there a reason *not* to
 allow TextIOWrapper to accept bytes-like objects?

Yes, there are. The code which works only with bytes is much simpler. Not only 
TextIOWrapper, but many other classes in the stdlib (GzipFile, BZ2File, 
LZMAFile, ZipFile) expect that the read method of underlied file object returns 
bytes, so your MyByteStream is just broken for such cases.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-26 Thread STINNER Victor

STINNER Victor added the comment:

2014-03-26 0:35 GMT+01:00 Antoine Pitrou rep...@bugs.python.org:
 - they are immutable: you can keep an internal reference to a bytes object 
 and be sure it won't change under your feet

Is it possible to request this feature using PyObject_GetBuffer()? I
don't see such flag.

If it's not possible, a new flag is maybe needed?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-26 Thread STINNER Victor

STINNER Victor added the comment:

 class MyByteStream(BytesIO):
def read1(self, len_):
return memoryview(super().read(len_))
 bs = MyByteStream(b'some data in ascii\n')

I guess that you are trying to implement a zero-copy I/O. The problem is that 
BytesIO does copy data. Example:

 data=b'abcdef'
 x=io.BytesIO(data)
 x.read() is  data
False

Before trying to avoid copies in the buffered layer, something should be done 
for the raw layer (BytesIO in this case).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-26 Thread Nikolaus Rath

Nikolaus Rath added the comment:

On 03/26/2014 03:43 AM, STINNER Victor wrote:
 class MyByteStream(BytesIO):
def read1(self, len_):
return memoryview(super().read(len_))
 bs = MyByteStream(b'some data in ascii\n')
 
 I guess that you are trying to implement a zero-copy I/O. The problem is that 
 BytesIO does copy data. 

Right on the first count, but wrong on the second. The class I'm
concerned with wants to do zero-copy I/O, but is not related to BytesIO.
I only picked that to produce a minimal example.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-25 Thread STINNER Victor

STINNER Victor added the comment:

I'm interested to review a patch, but I'm not sure that read1() can return a 
type different than bytes. You can use the Py_buffer API for your patch.

--
nosy: +haypo, pitrou
versions: +Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

read1() should return bytes. MyByteStream doesn't implement the 
io.BufferedIOBase interface.

--
nosy: +serhiy.storchaka

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-25 Thread Nikolaus Rath

Nikolaus Rath added the comment:

On 03/25/2014 01:39 PM, Serhiy Storchaka wrote:
 read1() should return bytes. MyByteStream doesn't implement the 
 io.BufferedIOBase interface.

Indeed, this is what this issue is about :-).

The question is: is there a good reason to require io.BufferedIOBase
implementors to return bytes rather than any bytes-like object from read1?

I'd also argue that the current documentation of io.BufferedIOBase
actually does not clearly require read1 to return bytes. The exact
formulation is Read and return up to *size* bytes (note that bytes
is not interpreted text). This can just as easily read as return binary
data of up to *size* bytes using one of the customary Python types.

So if it turns out that there is a good reason for this requirement, the
documentation should at least make this requirement more explicit.

Best,
-Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

 »Time flies like an arrow, fruit flies like a Banana.«

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-25 Thread Antoine Pitrou

Antoine Pitrou added the comment:

bytes objects have two things going for them:
- they have the full bytes API (all the startswith(), etc. methods) - not only 
buffer access
- they are immutable: you can keep an internal reference to a bytes object and 
be sure it won't change under your feet

This is especially handy when writing C code.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-25 Thread Nikolaus Rath

Nikolaus Rath added the comment:

Yes, bytes objects have some advantages. But if a bytes object is desired, it 
can always be created from bytes-like object. If a BufferedIOBase instance is 
required to only provide bytes objects, this conversion is forced even when it 
may not be necessary.

If someone is willing to do the work (and I am), is there a reason *not* to 
allow TextIOWrapper to accept bytes-like objects?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-24 Thread Nikolaus Rath

New submission from Nikolaus Rath:

In Python 3.4, TextIOWrapper can not read from streams that return bytearrays 
or memoryviews:

from io import TextIOWrapper, BytesIO
class MyByteStream(BytesIO):
def read1(self, len_):
return memoryview(super().read(len_))
bs = MyByteStream(b'some data in ascii\n')
ss = TextIOWrapper(bs, encoding='ascii')
print(ss.read(200))

results in:

TypeError: underlying read1() should have returned a bytes object, not 
'memoryview'

I don't think there's any good reason for TextIOWrapper not accepting any 
bytes-like object (especially now that b''.join finally accepts bytes-like 
objects as well).

--
components: Library (Lib)
messages: 214776
nosy: nikratio
priority: normal
severity: normal
status: open
title: TextIOWrapper does not support reading bytearrays or memoryviews
type: behavior
versions: Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21057] TextIOWrapper does not support reading bytearrays or memoryviews

2014-03-24 Thread Nikolaus Rath

Nikolaus Rath added the comment:

If someone is willing to review, I'd be happy to write a patch for this.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue21057
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com