On Mon, Dec 24, 2018 at 03:36:07PM +0000, Paul Moore wrote:

> > There should be no difference whether the text comes from a literal, a
> > variable, or is read from a file.
>
> One difference is that with a file, it's (as far as I can see)
> impossible to determine whether or not you're going to get bytes or
> text without reading some data (and so potentially affecting the state
> of the file object).

Here are two ways: look at the type of the file object, or look at the
mode of the file object:

py> f = open('/tmp/spam.binary', 'wb')
py> g = open('/tmp/spam.text', 'w')
py> type(f), type(g)
(<class '_io.BufferedWriter'>, <class '_io.TextIOWrapper'>)

py> f.mode, g.mode
('wb', 'w')


> This might be considered irrelevant

Indeed :-)


> (personally,
> I don't see a problem with a function definition that says "parameter
> fd must be an object that has a read(length) method that returns
> bytes" - that's basically what duck typing is all about) but it *is* a
> distinguishing feature of files over in-memory data.

But it's not a distinguishing feature between the proposal and writing:

    unpack(fmt, f.read(size))

which will also read from the file and affect the file state before
failing. So it's a difference that makes no difference.


> There is also the fact that read() is only defined to return *at most*
> the requested number of bytes. Non-blocking reads and objects like
> pipes that can return additional data over time add extra complexity.

How do they add extra complexity?

According to the proposal, unpack() attempts the read. If it returns the
correct number of bytes, the unpacking succeeds. If it doesn't, you get
an exception, precisely the same way you would get an exception if you
manually did the read and passed it to unpack().

It's the caller's responsibility to provide a valid file object. If your
struct needs 10 bytes, and you provide a file that returns 6 bytes, you
get an exception. There's no promise made that unpack() should repeat
the read over and over again, hoping that it's a pipe and more data
becomes available. It either works with a single read, or it fails.

Just like the similar APIs provided by pickle, json, etc., which offer
load() and loads() functions.

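To make the single-read behaviour concrete, here is a minimal sketch.
The name unpack_from_file is hypothetical (it is not part of the struct
module), and this is one naive reading of the proposal, not a
definitive implementation:

```python
import io
import struct


def unpack_from_file(fmt, f):
    """Hypothetical helper: unpack one struct from a binary file object."""
    size = struct.calcsize(fmt)
    # A single read only: no retrying for pipes or non-blocking files.
    data = f.read(size)
    # If data is a bytes object of the wrong length, struct.unpack raises
    # struct.error, exactly as unpack(fmt, f.read(size)) would.
    return struct.unpack(fmt, data)


# Enough data: same result as the manual spelling.
full = io.BytesIO(struct.pack("<hi", 1, 2))
print(unpack_from_file("<hi", full))  # -> (1, 2)

# The struct needs 10 bytes but the file only has 6: we get an exception,
# and the file position has already moved past the bytes that were read.
short = io.BytesIO(b"abcdef")
try:
    unpack_from_file("<10s", short)
except struct.error as exc:
    print("struct.error:", exc)
print(short.tell())
```

Nothing here tries to restore the file position or to wait for more
data; the helper does no more and no less than the two-step version.
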
In hindsight, the precedent set by pickle, json, etc. suggests that we
ought to have an unpack() function that reads from files and an
unpacks() function that takes a string, but that ship has sailed.


> Again, not insoluble, and potentially simple enough to handle with
> "read N bytes, if you got something other than bytes or fewer than N
> of them, raise an error", but still enough that the special cases
> start to accumulate.

I can understand the argument that the benefit of this is trivial over

    unpack(fmt, f.read(calcsize(fmt)))

Unlike reading from a pickle or json record, it's pretty easy to know
how much to read, so there is an argument that this convenience method
doesn't gain us much convenience.

But I'm just not seeing where all the extra complexity and special-case
handling is supposed to be, except by having unpack make promises that
the OP didn't request:

- read partial structs from non-blocking files without failing
- deal with file system errors without failing
- support reading from text files when bytes are required without failing
- if an exception occurs, the state of the file shouldn't change

Those promises *would* add enormous amounts of complexity, but I don't
think we need to make those promises. I don't think the OP wants them,
I don't want them, and I don't think they are reasonable promises to
make.


> The suggestion is a nice convenience method, and probably a useful
> addition for the majority of cases where it would do exactly what was
> needed, but still not completely trivial to actually implement and
> document (if I were doing it, I'd go with the naive approach, and just
> raise a ValueError when read(N) returns anything other than N bytes,
> for what it's worth).

Indeed.
Except that we should raise precisely the same exception type that
struct.unpack() currently raises in the same circumstances:

py> struct.unpack("ddd", b"a")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a bytes object of length 24

rather than ValueError.


--
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/