On Tuesday 27 February 2007 00:39, Greg Ewing wrote:

> I can't help feeling the people arguing for b"..." as the
> repr format haven't really accepted the fact that text and
> binary data will be distinct things in py3k, and are thinking
> of bytes as being a replacement for the old string type. But
> that's not true -- most of the time, *unicode* will be the
> replacement for str when it is used to represent characters,
> and bytes will mostly be used only for non-text.

[etc.]

... but Guido prefers to use b"..." as the repr format, on the grounds that byte-sequences quite often are lightly encoded text, and that when that's true it can be *much* better to report them as such.

Here's an ugly, impure, but possibly practical answer: give each bytes object a single-bit flag meaning something like "mostly textual"; make the bytes([1,2,3,4]) constructor set it to false, the b"abcde" constructor set it to true, and arbitrary operations on bytes objects do ... well, something plausible :-). (Textuality/non-textuality is generally preserved; combining textual and non-textual yields non-textual.) Then repr() can look at that flag and decide what to do on the basis of it.

This would mean that x==y ==> repr(x)==repr(y) would fail; it can already fail when x,y are of different types (3==3.0; 1==True) and perhaps in some weird situations where they are of the same type (signed IEEE zeros). It would make the behaviour of repr() less predictable, and that's probably bad; it would mean (unlike the examples I gave above) that you can have x==y, with x and y of different types, but have repr(x) and repr(y) not look at all similar.

Obviously the flag wouldn't affect comparisons or hashing.

I can't say I like this much -- it's exactly the sort of behaviour I've found painful in Perl, with too much magic happening behind the scenes for perhaps-insufficient reason -- but it still might be the best available compromise. (The other obvious compromise approach would be to sniff the contents of the bytes object and see whether it "looks" like a lightly-encoded string. That's a bit too much magic for fuzzy reasons too.)

-- 
Gareth McCaughan
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
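
To make the flag idea above concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the class name FlaggedBytes, the from_literal helper, the textual attribute and the bytes([...]) output format for non-textual data are all invented here and are not part of any actual py3k proposal or implementation. It shows the three properties described: the flag is set by the constructor used, mixing textual with non-textual gives non-textual, and equality and hashing ignore the flag while repr() does not.

```python
class FlaggedBytes:
    """Hypothetical byte sequence carrying a one-bit 'mostly textual' hint."""

    def __init__(self, data, textual=False):
        # Stands in for the bytes([1,2,3,4]) constructor: flag defaults to false.
        self.data = bytes(data)
        self.textual = bool(textual)

    @classmethod
    def from_literal(cls, data):
        # Stands in for the b"abcde" constructor: flag is set to true.
        return cls(data, textual=True)

    def __add__(self, other):
        if not isinstance(other, FlaggedBytes):
            return NotImplemented
        # Textual + textual stays textual; any other mix is non-textual.
        return FlaggedBytes(self.data + other.data,
                            self.textual and other.textual)

    def __eq__(self, other):
        if not isinstance(other, FlaggedBytes):
            return NotImplemented
        return self.data == other.data  # the flag is deliberately ignored

    def __hash__(self):
        return hash(self.data)  # likewise independent of the flag

    def __repr__(self):
        if self.textual:
            return repr(self.data)  # string-like form
        return "bytes(%r)" % (list(self.data),)  # assumed list-like form


x = FlaggedBytes.from_literal(b"abcd")
y = FlaggedBytes([97, 98, 99, 100])

print(x == y, hash(x) == hash(y))  # equal and same hash
print(repr(x))                     # string-like
print(repr(y))                     # list-like, despite x == y
print(repr(x + y))                 # mixed, so non-textual
```

Here x == y holds while repr(x) and repr(y) look nothing alike, which is exactly the loss of predictability discussed above.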