On Sat, Aug 31, 2019 at 8:44 PM Steven D'Aprano <st...@pearwood.info> wrote:
> > So b"abc" should not be allowed?
>
> In what way are byte-STRINGS not strings? Unicode-strings and
> byte-strings share a significant fraction of their APIs, and are so
> similar that back in Python 2.2 the devs thought it was a good idea to
> try automagically coercing from one to the other.
>
> I was careful to write *string* rather than *str*. Sorry if that wasn't
> clear enough.
>

We call it a string, but a bytes object has as much in common with
bytearray and with a list of integers as it does with a text string.
Are the contents of a MIDI file a "string"? I would say no, they're not -
but they can *contain* strings, e.g. for metadata and lyrics. The MIDI
file representation of an integer might be stored in a byte-string,
but the common API between text strings and byte strings is going to
be mostly irrelevant here. You can't upper-case the
variable-length-integer b"\xe7\x61" any more than you can upper-case
the integer 13281. Those common methods are mostly built on the
assumption that the string contains ASCII text.
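
As a rough illustration of that point, here is a minimal sketch. The
helper name decode_vlq is hypothetical, and it assumes the usual MIDI
variable-length encoding (seven payload bits per byte, most significant
group first, high bit set on every byte except the last). The method
call succeeds, but it treats 0x61 as the letter "a" and quietly changes
the number:

```python
def decode_vlq(data: bytes) -> int:
    """Decode a MIDI-style variable-length quantity (hypothetical helper)."""
    value = 0
    for byte in data:
        # Each byte contributes its low seven bits.
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # a clear high bit marks the final byte
            break
    return value


raw = b"\xe7\x61"
print(decode_vlq(raw))          # 13281
print(raw.upper())              # b'\xe7A' (0x61 was treated as ASCII "a")
print(decode_vlq(raw.upper()))  # 13249, no longer the same integer
```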

There are a few string-like functions that truly can be used with
completely binary data, and which actually do make a lot more sense on
a byte string than on, say, a list of integers. Notably, finding a
particular byte sequence can be done without knowing what the bytes
actually mean (and similarly bytes.split(), which does the same sort
of search), and you can strip off trailing b"\0" without needing to
give much meaning to the content. But I cannot recollect *ever* using
these methods on any bytes object that wasn't storing some form of
encoded text.
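
For what it's worth, a small sketch of those content-agnostic
operations. The blob and marker values are made up for illustration and
don't correspond to any real file format:

```python
# Arbitrary binary data; no text encoding is assumed anywhere.
blob = b"\x00\x01\xff\xfe\x10\x11\xff\xfe\x12\x00\x00"
marker = b"\xff\xfe"

# Searching only compares byte values.
print(blob.find(marker))    # 2

# Splitting performs the same kind of search.
print(blob.split(marker))   # [b'\x00\x01', b'\x10\x11', b'\x12\x00\x00']

# Stripping trailing NUL padding needs no knowledge of the payload.
print(blob.rstrip(b"\0"))   # b'\x00\x01\xff\xfe\x10\x11\xff\xfe\x12'
```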

Bytes and text have a long relationship, and as such, there are
special similarities. That doesn't mean that bytes ARE text, any more
than a compiled regex is text just because it's traditional to
describe a regex in a textual form. Path objects also blur the "is
this text?" line, since you can divide a Path by a string to
concatenate them, and there are ways of smuggling arbitrary bytes
through them.
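
A short sketch of both Path behaviours, using PurePosixPath so it runs
the same on any platform. The file names are invented, and the
"smuggling" here is done with the surrogateescape error handler, which
is one of the possible ways rather than the only one:

```python
from pathlib import PurePosixPath

# Dividing a path by a string joins them with a separator.
base = PurePosixPath("/tmp")
print(base / "notes.txt")   # /tmp/notes.txt

# Bytes that are not valid UTF-8 can ride along as lone surrogates.
raw_name = b"caf\xe9"       # Latin-1 bytes, invalid as UTF-8
name = raw_name.decode("utf-8", "surrogateescape")
smuggled = base / name

# Encoding with the same handler recovers the original bytes exactly.
print(str(smuggled).encode("utf-8", "surrogateescape"))   # b'/tmp/caf\xe9'
```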

I don't think it's necessary to be too adamant about "must be some
sort of thing-we-call-string" here. Let practicality rule, since
purity has already waved a white flag at us.

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/7P43IPPY6WPTQ24QDLGFJC2IEBZTEXCL/
Code of Conduct: http://python.org/psf/codeofconduct/
