[issue29178] Adding bytes.frombuffer(byteslike) constructor

2021-03-20 Thread Gregory P. Smith


Change by Gregory P. Smith :


--
stage:  -> patch review
versions:  -Python 3.8, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2021-03-20 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I am more warm to this feature now (+0). Not because it would allow to write 
some code shorter, but because it can be in line with other type-strict 
alternate constructors, like int.fromnumber() and int.parse(). But questions 
about parameters and subclasses still stay.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2021-03-20 Thread Eryk Sun


Change by Eryk Sun :


--
versions: +Python 3.10, Python 3.8, Python 3.9 -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-04-12 Thread INADA Naoki

INADA Naoki added the comment:

FYI, Tornado 4.5b switched buffer implementation from deque-of-bytes
to bytearray like asyncio.  I sent PR for fixing "unreleased memoryview".
https://github.com/tornadoweb/tornado/pull/2008

This is common pitfall when implementing buffer.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-07 Thread Nick Coghlan

Nick Coghlan added the comment:

The complexity you're hitting here is the main reason I'm a fan of creating a 
dedicated library for dealing with these problems, rather than trying to handle 
them directly on the builtins.

Given a bufferlib module, for example, you could have a higher level API like:

def snapshot(source, *, result_type=bytes):
...

That handled any source object that supported the buffer protocol, and any 
target type that accepted a memoryview instance as input to the constructor.

# Default bytes snapshot
data = snapshot(original)

# Mutable snapshot without copying and promptly releasing the view
data = snapshot(original, result_type=bytearray)

The start/stop/step or length+offset question could be handled at that level by 
allowing both, but also offering lower level APIs with less argument processing 
overhead:

def snapshot_slice(source, start, stop, step=1, *, result_type=bytes):
...

def snapshot_at(source, *, offset=0, count=None, result_type=bytes):
...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-07 Thread INADA Naoki

INADA Naoki added the comment:

You're right!  How difficult working with memoryview!

> The memoryview constructor could take start, stop, and step keyword-only 
> arguments to avoid having to immediately slice a new view.

Maybe, memoryview.to_bytes() is better place to add such options.

memoryview(x) can accept multi dimensional arrays, and itemsize can be >=1.
It's too complex when working with byte sequence.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-07 Thread Eryk Sun

Eryk Sun added the comment:

Isn't the proposed workaround also relying on CPython reference counting to 
immediately deallocate the sliced view? It fails if I keep a reference to the 
sliced view:

byteslike = bytearray(b'abc')

with memoryview(byteslike) as m1:
m2 = m1[1:]
bs = bytes(m2)
 
>>> byteslike += b'd'
Traceback (most recent call last):
  File "", line 1, in 
BufferError: Existing exports of data: object cannot be re-sized

It seems to me that it should be written as follows:

with memoryview(byteslike) as m1:
with m1[1:] as m2:
bs = bytes(m2)

>>> byteslike += b'd'
>>> byteslike
bytearray(b'abcd')

The memoryview constructor could take start, stop, and step keyword-only 
arguments to avoid having to immediately slice a new view.

--
nosy: +eryksun

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-07 Thread INADA Naoki

INADA Naoki added the comment:

> I'm -1 if the intention is about easiness and efficiency. 

Do you +1 when adding it to stdlib (say "bufferlib")?

> Creating a memoryview is not cheap enough in such a case.

Actually speaking, it's 5 calls + 2 temporary memoriview.

1. creating memoryview.
2. call __enter__.
3. slicing memoryview
4. create bytes from it
5. call __exit__

It can be bottleneck very easily, when writing protocol parser like HTTP parser.

> About easiness to use, when a user considering such low level details, it's 
> reasonable to know memoryview and it needs to be released.

closing memoryview is not strict requirements in some cases.
When memoryview is created from immutable, relying to GC is safe.
That's why closing memoryview is easily forgotten.

> But if this API is added to simplify bytes(), I think it makes sense but it's 
> not only about adding a frombuffer().

Yes. See msg284813.
There are PEP 467 already to make bytes() simple.

But if PEP 467 is accepted, bytes() constructor is simple enough.
Adding `offset` and `length` keyword-only argument to bytes() make sense.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-07 Thread Xiang Zhang

Xiang Zhang added the comment:

I'm -1 if the intention is about easiness and efficiency. 

I think a new API is usually added due to functional defect not performance 
defect. We get a way here though the performance seems not ideal, according to 
INADA's mail. I think we should first check if memoryview gets an optimization 
chance to fit more in such a case. Creating a memoryview is not cheap enough in 
such a case.

About easiness to use, when a user considering such low level details, it's 
reasonable to know memoryview and it needs to be released.

But if this API is added to simplify bytes(), I think it makes sense but it's 
not only about adding a frombuffer().

--
nosy: +xiang.zhang

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> Which virtually no one follows :(

Sad. But adding bytes.frombuffer() wouldn't make it magically used. If you are 
aware of the problem, you can use the above two-liner as well as 
bytes.frombuffer(). You even can use it in the current code with older Python 
releases, unlike to bytes.frombuffer() which would need 3.7.

> Any protocol parsing code has a lot of slicing.

How much code you expect to update with bytes.frombuffer()? And why not use the 
above two-liner instead?

> > There is also a problem with returned type for subclasses (this is always
> > a problem for alternate constructors).
> Good point. How do we usually solve this in CPython?

It is deemed that returning an instance of a subclass is more preferable. 
Otherwise you could use separate function rather of a class method. But it is 
not easy. You need either pass a memoryview or bytes instance to class 
constructor, or (only for mutable arrays) create an empty instance and 
concatenate a buffer to it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-06 Thread Yury Selivanov

Yury Selivanov added the comment:

> This is just a two-liner:
>
>with memoryview(bytelike) as m:
>bs = bytes(m[start:end])

Which virtually no one follows :(


> Adding new method to builtin type has high bar. I doubts that there are 
> enough use cases in which bytes.frombuffer() has an advantage.

Any protocol parsing code has a lot of slicing.

> The signature of bytes.frombuffer() looks questionable. Why length and offset 
> instead of start and stop indices as in slices? Why length is first and 
> offset is last? This contradicts the interface of Python 2 buffer(), 
> socket.sendfile(), os.sendfile(), etc.

I propose to make both arguments keyword-only.

> There is also a problem with returned type for subclasses (this is always a 
> problem for alternate constructors).

Good point. How do we usually solve this in CPython?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Count me as -1 too.

This is just a two-liner:

with memoryview(bytelike) as m:
bs = bytes(m[start:end])

In most cases, when all content is used, the bytes constructor works fine.

bs = bytes(bytelike)

This works not just with bytes, but with bytearray and most other bytes-like 
arrays. With frombuffer() you need to add a new class method to all these 
classes.

Adding new method to builtin type has high bar. I doubts that there are enough 
use cases in which bytes.frombuffer() has an advantage.

The signature of bytes.frombuffer() looks questionable. Why length and offset 
instead of start and stop indices as in slices? Why length is first and offset 
is last? This contradicts the interface of Python 2 buffer(), 
socket.sendfile(), os.sendfile(), etc.

There is also a problem with returned type for subclasses (this is always a 
problem for alternate constructors). Should B.frombuffer() where B is a bytes 
subclass return an instance of bytes or B? If it always returns a bytes object, 
we need to use a constructor for subclasses, if it returns an instance of a 
subclass, how can it be implemented efficiently?

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-06 Thread Yury Selivanov

Yury Selivanov added the comment:

I've added a couple of review comments.  Also, it looks like you can count 
Antoine Pitrou as +1 too.

Two questions:

1. length or count?  Need to look through builtins/stdlib and see what is more 
common in CPython.

2. Maybe we should make length/count and offset keyword-only arguments?

--
nosy: +yselivanov

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29178] Adding bytes.frombuffer(byteslike) constructor

2017-01-06 Thread INADA Naoki

New submission from INADA Naoki:

# Summary

## 1. Making bytes from slice of bytearray easy and efficient.

bs = bytes(memoryview(bytelike)[start:end]) works fine on CPython,
but it will cause issue on PyPy.

Since memoryview is not closed explicitly, exception like
"BufferError: Existing exports of data: object cannot be re-sized"
will be raised after it.
Where exception raised can be far from where unclosed memoryview is leaked.


## 2. bytes(x) constructor is too overloaded.

It has undocumented corner cases. See PEP 467 and #29159


# ML threads

https://mail.python.org/pipermail/python-dev/2016-October/146668.html
https://mail.python.org/pipermail/python-dev/2017-January/147109.html

+1 from: Nathaniel Smith, Alexander Belopolsky, Yury Selivanov
-1 from: Nick Coghlan

Nick proposed put it on separated module, instead of adding it as builtin 
method.


# First draft patch

bytes-frombuffer.patch is first draft patch.  It implements frombuffer to only 
bytes,
with signature proposed first. Only C-contiguous buffer is supported for now.

  frombuffer(byteslike, length=-1, offset=0) method of builtins.type instance
Create a bytes object from bytes-like object.

Examples:
bytes.frombuffer(b'abcd') -> b'abcd'
bytes.frombuffer(b'abcd', 2) -> b'ab'
bytes.frombuffer(b'abcd', 8) -> b'abcd'
bytes.frombuffer(b'abcd', offset=2) -> b'cd'
bytes.frombuffer(b'abcd', 1, 2) -> b'c'

--
components: Interpreter Core
files: bytes-frombuffer.patch
keywords: patch
messages: 284813
nosy: Yury.Selivanov, belopolsky, haypo, inada.naoki, ncoghlan
priority: normal
severity: normal
status: open
title: Adding bytes.frombuffer(byteslike) constructor
type: enhancement
versions: Python 3.7
Added file: http://bugs.python.org/file46173/bytes-frombuffer.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com