On Sat, Aug 31, 2019 at 8:44 PM Steven D'Aprano <st...@pearwood.info> wrote:
>
> > So b"abc" should not be allowed?
>
> In what way are byte-STRINGS not strings? Unicode-strings and
> byte-strings share a significant fraction of their APIs, and are so
> similar that back in Python 2.2 the devs thought it was a good idea to
> try automagically coercing from one to the other.
>
> I was careful to write *string* rather than *str*. Sorry if that wasn't
> clear enough.
>

We call it a string, but a bytes object has as much in common with bytearray and with a list of integers as it does with a text string. Are the contents of a MIDI file a "string"? I would say no, they're not - but they can *contain* strings, e.g. for metadata and lyrics. The MIDI file representation of an integer might be stored in a byte-string, but the common API between text strings and byte strings is going to be mostly irrelevant here. You can't upper-case the variable-length-integer b"\xe7\x61" any more than you can upper-case the integer 13281. Those common methods are mostly built on the assumption that the string contains ASCII text.

There are a few string-like functions that truly can be used with completely binary data, and which actually do make a lot more sense on a byte string than on, say, a list of integers. Notably, finding a particular byte sequence can be done without knowing what the bytes actually mean (and similarly bytes.split(), which does the same sort of search), and you can strip off trailing b"\0" without needing to give much meaning to the content. But I cannot recollect *ever* using these methods on any bytes object that wasn't storing some form of encoded text.

Bytes and text have a long relationship, and as such, there are special similarities. That doesn't mean that bytes ARE text, any more than a compiled regex is text just because it's traditional to describe a regex in a textual form. Path objects also blur the "is this text?" line, since you can divide a Path by a string to concatenate them, and there are ways of smuggling arbitrary bytes through them.

I don't think it's necessary to be too adamant about "must be some sort of thing-we-call-string" here. Let practicality rule, since purity has already waved a white flag at us.

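To make the variable-length-integer point concrete, here's a rough sketch. The helper name decode_vlq is something I made up for illustration, and it assumes the usual MIDI encoding (seven payload bits per byte, most significant group first, high bit set on every byte except the last). It also shows the handful of bytes methods that work without knowing what the bytes mean, plus Path division by a string.

```python
from pathlib import PurePosixPath


def decode_vlq(data: bytes) -> int:
    """Decode a MIDI-style variable-length quantity (hypothetical helper)."""
    value = 0
    for byte in data:
        # Each byte contributes its low seven bits.
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            # A clear high bit marks the final byte of the quantity.
            break
    return value


raw = b"\xe7\x61"
original = decode_vlq(raw)

# bytes.upper() only touches ASCII letters, so it rewrites 0x61 ("a")
# to 0x41 ("A") and silently changes the encoded number.
mangled = decode_vlq(raw.upper())

# Operations that need no knowledge of what the bytes mean.
blob = b"\x00\x01MThd\x00\x00\x00\x06\x00\x00"
offset = blob.find(b"MThd")
trimmed = blob.rstrip(b"\0")
pieces = b"\x01\xff\x02\xff\x03".split(b"\xff")

# Dividing a path by a plain string joins them.
joined = PurePosixPath("/tmp") / "song.mid"

print(original, mangled, offset, trimmed, pieces, joined)
```

Upper-casing "succeeds" here, which is rather the problem: it hands back a different integer without complaint, whereas find(), rstrip() and split() behave the same whatever the payload is.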
ChrisA