On 26Dec2018 12:18, Andrew Svetlov <andrew.svet...@gmail.com> wrote:
On Wed, Dec 26, 2018 at 11:26 AM Steven D'Aprano <st...@pearwood.info>
wrote:

On Wed, Dec 26, 2018 at 09:48:15AM +0200, Andrew Svetlov wrote:
> The perfect demonstration of io objects complexity.
> `stream.read(N)` can return None by spec if the file is non-blocking
> and has no data ready.
>
> Confusing but still possible and documented behavior.

https://docs.python.org/3/library/io.html#io.RawIOBase.read

Regardless, my point doesn't change. That has nothing to do with the
behaviour of unpack. If you pass a non-blocking file-like object which
returns None, you get exactly the same exception as if you wrote

    unpack(fmt, f.read(size))

and the call to f.read returned None. Why is it unpack's responsibility
to educate the caller that f.read can return None?
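To make the point concrete, here is a minimal sketch: whether the None comes from a non-blocking `read()` or is passed directly, `struct.unpack` raises the same TypeError either way.

```python
import struct

# Passing None (e.g. the result of a non-blocking read with no data ready)
# to struct.unpack fails with a TypeError, exactly as if the caller had
# called f.read() themselves and handed over the None.
try:
    struct.unpack("<I", None)
except TypeError as exc:
    print("TypeError:", exc)
```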
[...]
> You need to repeat reads until collecting the value of enough size.

That's not what the OP has asked for, it isn't what the OP's code does,
and it's not what I've suggested.

Do pickle and json block and repeat the read until they have a complete
object? I'm pretty sure they don't [...]
json is correct: if `read()` is called without an argument it reads the whole
content until EOF.
But with a size argument the behavior is different for interactive and
non-interactive streams.
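A quick sketch of that first point: `json.load()` calls `fp.read()` with no size argument, so it consumes the stream to EOF and never has to handle a short read or a None return.

```python
import io
import json

# json.load() reads the whole stream in one call, so the short-read
# problem never arises for it.
fp = io.StringIO('{"answer": 42}')
obj = json.load(fp)
print(obj)                # {'answer': 42}
print(repr(fp.read()))    # '' -- the stream was fully consumed
```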

Oh, it is better than that. At the low level, even blocking streams can return short reads - particularly serial streams like ttys and TCP connections.
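The usual remedy is a small loop that repeats the read until enough bytes have accumulated. A minimal sketch (the helper name `read_exact` is mine, not a stdlib function):

```python
import io

def read_exact(fp, size):
    """Collect exactly `size` bytes from fp, looping over possibly-short reads.

    Raises EOFError if the stream ends before `size` bytes arrive.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = fp.read(remaining)
        if not chunk:  # b'' at EOF (a None from a non-blocking fp also lands here)
            raise EOFError(f"wanted {size} bytes, stream ended {remaining} short")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

print(read_exact(io.BytesIO(b"abcdef"), 4))  # b'abcd'
```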

RawIOBase and BufferedIOBase also have slightly different behavior for
`.read()`.

Restricting fp to BufferedIOBase looks viable, though that is narrower than
"file-like object".

Also I'm thinking about the type annotations in typeshed.
Currently the type is Union[array[int], bytes, bytearray, memoryview].
Should it be Union[io.BinaryIO, array[int], bytes, bytearray, memoryview]?

And this is why I, personally, think augmenting struct.unpack and json.load and a myriad of other arbitrary methods to accept both file-like things and bytes is an open-ended can of worms.

And it is why I wrote myself my CornuCopyBuffer class (see my other post in this thread).

Its entire purpose is to wrap an iterable of bytes-like objects and do all that work via convenient methods; it also has factory methods to make these from files and other common things. Given a CornuCopyBuffer `bfr`:

   S = struct.Struct('spec-here...')
   sbuf = bfr.take(S.size)
   result = S.unpack(sbuf)

Under the covers `bfr` takes care of short "reads" (iteration values) etc in the underlying iterable. The return from .take is typically a memoryview from `bfr`'s internal buffer - it is _always_ exactly `size` bytes long if you don't pass short_ok=True, or it raises an exception. And so on.
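For readers without the other post, here is a toy stand-in illustrating that `.take()` contract: it is not the real CornuCopyBuffer (class and details are hypothetical), just a sketch of accumulating iterator chunks until exactly `size` bytes are available.

```python
import struct

class ChunkBuffer:
    """Toy stand-in for the CornuCopyBuffer .take() contract (hypothetical)."""

    def __init__(self, chunks):
        self._it = iter(chunks)
        self._buf = bytearray()

    def take(self, size, short_ok=False):
        # Accumulate chunks from the iterable until `size` bytes are buffered.
        while len(self._buf) < size:
            try:
                self._buf += next(self._it)
            except StopIteration:
                if short_ok:
                    break
                raise EOFError(f"wanted {size} bytes, only {len(self._buf)} available")
        out = bytes(self._buf[:size])
        del self._buf[:size]
        return out

S = struct.Struct("<HH")
# The 4-byte value is split awkwardly across three chunks:
bfr = ChunkBuffer([b"\x01", b"\x00\x02", b"\x00"])
print(S.unpack(bfr.take(S.size)))  # (1, 2)
```

The caller never sees the chunk boundaries; the buffer either hands back exactly `size` bytes or raises.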

The point here is: make a class to get what you actually need, and _don't_ stuff variable and hard to agree on extra semantics inside multiple basic utility classes like struct.

For myself, the CornuCopyBuffer is now my universal interface to byte streams (binary files, TCP connections, whatever) which need binary parsing, and it has the methods and internal logic to provide that, including presenting a simple read only file-like interface with read and seek-forward, should I need to pass it to a file-expecting object.

Do it _once_, and don't megacomplicatise all the existing utility classes.

Cheers,
Cameron Simpson <c...@cskk.id.au>
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/