On Thu, May 19, 2011 at 1:43 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> OK, summarising the thread so far from my point of view.
>
> 1. There are some aspects of the behavior of bytes() objects that
> tempt people to think of them as string-like objects (primarily the
> b'' literals and their use in repr(), along with the fact that they
> fill roles that were filled by str in its "arbitrary binary data"
> incarnation in Python 2.x). The mental model this creates in the
> reader is incorrect, as bytes() are far closer to array.array('c') in
> their underlying behaviour (and deliberately so - cf. PEP 358, 3112,
> 3137).

I think most of this "wrong mental model" is actually due to people
not having completely internalized the Python 3 way.

> One proposal for addressing this is to add a x'deadbeef' literal and
> using that in repr() rather than the bytestring. Another would be to
> escape all characters, even printable ASCII, in the bytes()
> representation. Both of these are undesirable, as they miss the
> original purpose of this behaviour: making it easier to work with the
> many ASCII based wire protocols that are in widespread use.

Indeed, -1 on both.

> To be honest, I don't think there is a lot we can do here except to
> further emphasise in the documentation and elsewhere that *bytes is
> not a string type* (regardless of any API similarities retained to
> ease transition from the 2.x series). For example, if we have any
> lingering references to "byte strings" they should be replaced with
> "byte sequences" or "bytes objects" (depending on context, as the
> former phrasing also encompasses bytearray objects).

+1

> 2. As a concrete usability issue, it is awkward to programmatically
> check the value of a specific byte when working with an ASCII based
> protocol:
>
>   data[i] == b'a'      # Intuitive, but always False due to type mismatch
>   data[i:i+1] == b'a'  # Works, but clumsy
>   data[i] == b'a'[0]   # Ditto (but at least susceptible to compiler const-expression optimisation)
>   data[i] == ord('a')  # Clumsy and slow
>   data[i] == 97        # Hard to read
>
> Proposals to address this include:
> - introduce a "character" literal to allow c'a' as an alternative to ord('a')

-1; the result is not a *character* but an integer. I'm personally
favoring using b'a'[0] and possibly hiding this in a constant
definition.

> Potentially workable, but leaves the intuitive answer above
> silently producing an unexpected answer

I'm not convinced that that problem is any worse than other
comparison-related problems. E.g.
b'a' == 'a' also always returns False (most likely it'll be disguised
by at least one operand being a variable of course.)

> - allow 1-element byte sequences to compare equal to the corresponding
> integer values.
>   - would require reworking of bytes.__hash__ to use the hash of the
>     contained element when the data length is exactly 1
>   - transitivity of equality would recommend also supporting
>     equivalences such as b'a' == 97.0
>   - backwards compatibility concerns arise due to introduction of
>     new key collisions in dictionaries and sets and other value based
>     containers
>   - yet more string-like behaviour in a type that is *not* a string
>     (further reinforcing the mistaken impression from point 1)
>   - One thing that *isn't* a concern from my point of view is the
>     fact that we have ample precedent in decimal.Decimal for supporting
>     implicit coercion in comparison operations while disallowing them in
>     arithmetic operations (Decimal("1") == 1.0 is allowed, but
>     Decimal("1") + 1.0 will raise TypeError).
>
> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.

My gut feeling about this is that this will probably introduce some
confusing or unintended side effect elsewhere, and I am -1 on this
change.

-- 
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
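
For readers who want to check the behaviour discussed in this message, here is a minimal sketch of the comparison spellings from point 2, the "constant definition" approach, and the Decimal precedent. It assumes current Python 3 semantics. The names data, i and ASCII_A are illustrative only and do not come from the thread.

```python
from decimal import Decimal

data = b"abc"  # hypothetical payload from an ASCII based protocol
i = 0

# Indexing a bytes object yields an int; slicing yields a bytes object.
assert isinstance(data[i], int)
assert isinstance(data[i:i + 1], bytes)

# The five spellings quoted in the message:
intuitive = data[i] == b"a"         # int compared with bytes
by_slice = data[i:i + 1] == b"a"    # bytes compared with bytes
by_index = data[i] == b"a"[0]       # int compared with int
by_ord = data[i] == ord("a")        # int compared with int
by_literal = data[i] == 97          # int compared with int

# Hiding b'a'[0] behind a named constant (the name is an assumption).
ASCII_A = b"a"[0]
by_constant = data[i] == ASCII_A

# The analogous pitfall mentioned in the reply: bytes compared with str.
bytes_vs_str = b"a" == "a"

# Decimal allows mixed comparison with float but rejects mixed arithmetic.
decimal_equal = Decimal("1") == 1.0
try:
    Decimal("1") + 1.0
except TypeError:
    decimal_add_raises = True
else:
    decimal_add_raises = False
```

Only the first spelling and the bytes/str comparison evaluate to False; the remaining comparisons evaluate to True. Running the interpreter with the -b option should additionally emit a BytesWarning for the two mismatched comparisons, which is one way to catch them during testing.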