The proposal can generate cryptic messages like `a bytes-like object is
required, not 'NoneType'`.

To produce more informative exception text, all of the mentioned cases should
be handled:

> - read partial structs from non-blocking files without failing
> - deal with file system errors without failing
> - support reading from text files when bytes are required without failing
> - if an exception occurs, the state of the file shouldn't change

I could add a couple more cases, but the list is already long enough for
demonstration purposes. When a user calls unpack(fmt, f.read(calcsize(fmt))),
the user is responsible for handling all of these edge cases (or, more
likely, ignoring them). If it is part of a library, robustness is the
library's responsibility.

On Mon, Dec 24, 2018 at 11:23 PM Steven D'Aprano <st...@pearwood.info> wrote:
> On Mon, Dec 24, 2018 at 03:36:07PM +0000, Paul Moore wrote:
>
> > > There should be no difference whether the text comes from a literal, a
> > > variable, or is read from a file.
> >
> > One difference is that with a file, it's (as far as I can see)
> > impossible to determine whether or not you're going to get bytes or
> > text without reading some data (and so potentially affecting the state
> > of the file object).
>
> Here are two ways: look at the type of the file object, or look at the
> mode of the file object:
>
> py> f = open('/tmp/spam.binary', 'wb')
> py> g = open('/tmp/spam.text', 'w')
> py> type(f), type(g)
> (<class '_io.BufferedWriter'>, <class '_io.TextIOWrapper'>)
>
> py> f.mode, g.mode
> ('wb', 'w')
>
> > This might be considered irrelevant
>
> Indeed :-)
>
> > (personally, I don't see a problem with a function definition that says
> > "parameter fd must be an object that has a read(length) method that
> > returns bytes" - that's basically what duck typing is all about) but it
> > *is* a distinguishing feature of files over in-memory data.
>
> But it's not a distinguishing feature between the proposal and writing:
>
>     unpack(fmt, f.read(size))
>
> which will also read from the file and affect the file state before
> failing. So it's a difference that makes no difference.
>
> > There is also the fact that read() is only defined to return *at most*
> > the requested number of bytes. Non-blocking reads and objects like
> > pipes that can return additional data over time add extra complexity.
>
> How do they add extra complexity?
>
> According to the proposal, unpack() attempts the read. If it returns the
> correct number of bytes, the unpacking succeeds. If it doesn't, you get
> an exception, precisely the same way you would get an exception if you
> manually did the read and passed it to unpack().
>
> It's the caller's responsibility to provide a valid file object. If your
> struct needs 10 bytes, and you provide a file that returns 6 bytes, you
> get an exception. There's no promise made that unpack() should repeat
> the read over and over again, hoping that it's a pipe and more data
> becomes available. It either works with a single read, or it fails.
>
> Just like similar APIs such as those provided by pickle, json, etc.,
> which provide load() and loads() functions.
>
> In hindsight, the precedent set by pickle, json, etc. suggests that we
> ought to have an unpack() function that reads from files and an
> unpacks() function that takes a string, but that ship has sailed.
>
> > Again, not insoluble, and potentially simple enough to handle with
> > "read N bytes, if you got something other than bytes or fewer than N
> > of them, raise an error", but still enough that the special cases
> > start to accumulate.
> > I can understand the argument that the benefit of this is trivial over
> > unpack(fmt, f.read(calcsize(fmt)))
>
> Unlike reading from a pickle or json record, it's pretty easy to know how
> much to read, so there is an argument that this convenience method
> doesn't gain us much convenience.
>
> But I'm just not seeing where all the extra complexity and special-case
> handling is supposed to be, except by having unpack() make promises that
> the OP didn't request:
>
> - read partial structs from non-blocking files without failing
> - deal with file system errors without failing
> - support reading from text files when bytes are required without failing
> - if an exception occurs, the state of the file shouldn't change
>
> Those promises *would* add enormous amounts of complexity, but I don't
> think we need to make those promises. I don't think the OP wants them,
> I don't want them, and I don't think they are reasonable promises to
> make.
>
> > The suggestion is a nice convenience method, and probably a useful
> > addition for the majority of cases where it would do exactly what was
> > needed, but still not completely trivial to actually implement and
> > document (if I were doing it, I'd go with the naive approach, and just
> > raise a ValueError when read(N) returns anything other than N bytes,
> > for what it's worth).
>
> Indeed. Except that we should raise precisely the same exception type
> that struct.unpack() currently raises in the same circumstances:
>
> py> struct.unpack("ddd", b"a")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> struct.error: unpack requires a bytes object of length 24
>
> rather than ValueError.
>
> --
> Steve

--
Thanks,
Andrew Svetlov
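For concreteness, here is a minimal sketch of the "naive approach" described
above; the helper name unpack_from_file and the exact error wording are
illustrative only, not part of the proposal. It performs a single read of
calcsize(fmt) bytes, rejects text-mode files with an explicit message, and
raises struct.error on a short read, the same exception type struct.unpack()
itself uses:

import struct

def unpack_from_file(fmt, f):
    """Illustrative helper: read exactly calcsize(fmt) bytes from f and
    unpack them.  A single read is attempted; there are no retries for
    pipes or non-blocking files."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if isinstance(data, str):
        # Text-mode file: complain explicitly instead of letting
        # struct.unpack() raise a less obvious TypeError.
        raise TypeError("file must be opened in binary mode; "
                        "read() returned str")
    if data is None or len(data) != size:
        # Short read (EOF, or a non-blocking file with no data ready):
        # fail the way struct.unpack() fails when given too few bytes.
        got = 0 if data is None else len(data)
        raise struct.error(
            "unpack requires a buffer of %d bytes, got %d" % (size, got))
    return struct.unpack(fmt, data)

# Example:
#   with open('spam.bin', 'rb') as f:
#       values = unpack_from_file('<2Hf', f)

Whether a short read should raise struct.error or ValueError is exactly the
point debated above; the sketch follows the struct.error suggestion.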