On 24Dec2018 10:19, James Edwards <jh...@jheiv.com> wrote:
Here's a snippet of semi-production code we use:
    def read_and_unpack(handle, fmt):
        size = struct.calcsize(fmt)
        data = handle.read(size)
        if len(data) < size: return None
        return struct.unpack(fmt, data)
which was originally something like:
    def read_and_unpack(handle, fmt, offset=None):
        if offset is not None:
            handle.seek(*offset)
        size = struct.calcsize(fmt)
        data = handle.read(size)
        if len(data) < size: return None
        return struct.unpack(fmt, data)
until we pulled file seeking up out of the function.
Having struct.unpack and struct.unpack_from support files would seem
straightforward and be a nice quality of life change, imo.
These days I go the other way. I make it easy to get bytes from what I'm
working with and _expect_ to parse from a stream of bytes.
I have a pair of modules cs.buffer (for getting bytes from things) and
cs.binary (for parsing structures from binary data). (See PyPI.)
cs.buffer primarily offers a CornuCopyBuffer which manages access to any
iterable of bytes objects. It has a suite of factories to make these
from binary files, bytes, bytes[], a mmap, etc. Once you've got one of
these you have access to a suite of convenient methods. Particularly for
grabbing structs, there's a .take() method which obtains a precise
number of bytes. (Think that looks like a file read? Yes, and it offers
a basic file-like suite of methods too.)
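
As a rough illustration of the idea only (this is not the cs.buffer code, and ChunkBuffer and file_chunks are made-up names), a buffer over any iterable of bytes objects with a take() method might be sketched with the stdlib like this:

```python
import io


class ChunkBuffer:
    """Hypothetical stand-in for the idea; not cs.buffer.CornuCopyBuffer."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)  # any iterable of bytes objects
        self._pending = b""          # bytes fetched but not yet consumed

    def take(self, size):
        """Return exactly `size` bytes, or raise EOFError if the data runs out."""
        parts = [self._pending]
        have = len(self._pending)
        while have < size:
            try:
                chunk = next(self._chunks)
            except StopIteration:
                # Keep what we have so nothing is lost, then report the shortfall.
                self._pending = b"".join(parts)
                raise EOFError(f"wanted {size} bytes, only {have} available")
            parts.append(chunk)
            have += len(chunk)
        data = b"".join(parts)
        self._pending = data[size:]
        return data[:size]


def file_chunks(handle, chunk_size=4096):
    """Yield successive chunks from a binary file object until EOF."""
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            return
        yield chunk


# The same take() call works whether the chunks come from a list or a file.
from_list = ChunkBuffer([b"ab", b"cde", b"f"])
from_file = ChunkBuffer(file_chunks(io.BytesIO(b"abcdef"), chunk_size=4))
print(from_list.take(4), from_file.take(4))
```

The point of the design is that chunk boundaries in the source never leak into the parsing code.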
Anyway, cs.binary is based on a PacketField base class oriented around
pulling a binary structure from a CornuCopyBuffer. Obviously, structs
are very common, and cs.binary has a factory:
    def structtuple(class_name, struct_format, subvalue_names):
which gets you a PacketField subclass whose parse methods read a struct
and return it to you in a nice namedtuple.
Also, PacketFields self-transcribe: you can construct one from its
values and have it write out the binary form.
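
To make the parse-and-transcribe idea concrete, here is a minimal stdlib-only sketch of such a factory. It is an assumption-laden illustration, not the cs.binary implementation: make_struct_tuple, the Header example, and the exact signatures of from_bytes() and transcribe() are all hypothetical.

```python
import struct
from collections import namedtuple


def make_struct_tuple(class_name, struct_format, field_names):
    """Hypothetical factory: a namedtuple class that parses and writes one struct."""
    packer = struct.Struct(struct_format)
    base = namedtuple(class_name, field_names)

    class StructTuple(base):
        __slots__ = ()
        size = packer.size  # number of bytes one instance occupies

        @classmethod
        def from_bytes(cls, data, offset=0):
            """Parse one instance from `data`, starting at `offset`."""
            return cls(*packer.unpack_from(data, offset))

        def transcribe(self):
            """Return the binary form of this instance."""
            return packer.pack(*self)

    StructTuple.__name__ = StructTuple.__qualname__ = class_name
    return StructTuple


# Illustrative structure: two big-endian unsigned 16-bit fields.
Header = make_struct_tuple("Header", ">HH", "version flags")

header = Header.from_bytes(b"\x00\x01\x00\x02")
print(header.version, header.flags)

# Constructing from values and writing back out round-trips the bytes.
assert Header(version=1, flags=2).transcribe() == b"\x00\x01\x00\x02"
```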
Once you've got these, the tendency is just to make PacketField
instances from that function for the structs you need and then to just
grab things from a CornuCopyBuffer providing the data. And you no longer
have to waste effort on different code for bytes or files.
Example from cs.iso14496:
    PDInfo = structtuple('PDInfo', '>LL', 'rate initial_delay')
Then you can just use PDInfo.from_buffer() or PDInfo.from_bytes() to
parse out your structures from then on.
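
For comparison, the "one code path for bytes and files" effect can be approximated with the stdlib alone. This sketch does not use cs.binary; parse_pdinfo is a hypothetical helper, and the sample values 48000 and 250 are arbitrary.

```python
import io
import struct
import tempfile
from collections import namedtuple

PDInfo = namedtuple("PDInfo", "rate initial_delay")
PDINFO_STRUCT = struct.Struct(">LL")  # same format string as the example above


def parse_pdinfo(read):
    """Parse one PDInfo using `read(n)`, a callable returning up to n bytes."""
    data = read(PDINFO_STRUCT.size)
    if len(data) < PDINFO_STRUCT.size:
        raise EOFError("short read while parsing PDInfo")
    return PDInfo(*PDINFO_STRUCT.unpack(data))


raw = PDINFO_STRUCT.pack(48000, 250)  # arbitrary sample values

# In-memory bytes, wrapped so they expose a read() method.
from_memory = parse_pdinfo(io.BytesIO(raw).read)

# A real file on disk goes through exactly the same function.
with tempfile.TemporaryFile() as handle:
    handle.write(raw)
    handle.seek(0)
    from_file = parse_pdinfo(handle.read)

print(from_memory, from_file)
```

Unlike the helper quoted at the top, this raises on a short read instead of returning None, which is a design choice of the sketch rather than anything taken from the library.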
I used to have tedious duplicated code for bytes and files in various
places; I'm ripping it out and replacing with this as I encounter it.
Far more reliable, not to mention smaller and easier.
Cheers,
Cameron Simpson <c...@cskk.id.au>