On 14 January 2014 19:54, Stephen J. Turnbull <step...@xemacs.org> wrote:
> Guido van Rossum writes:
>
> > And that is precisely my point. When you're using a format string,
> > all of the format string (not just the part between { and }) had
> > better use ASCII or an ASCII superset. And this (rightly)
> > constrains the output to an ASCII superset as well.
>
> Except that if you interpolate something like Shift JIS, much of the
> ASCII really isn't ASCII. That's a general issue, of course, if you
> do something that requires iterated format strings, but it's far more
> likely to appear to work most of the time with those encodings.
>
> Of course you can say "if it hurts, don't do that", but ....

Right, that's the danger I was worried about, but the problem is that there's at least *some* minimum level of ASCII compatibility that needs to be assumed in order to define an interpolation format at all (this is the point I originally missed). For printf-style formatting, it's % along with the various formatting characters and other syntax (like digits, parentheses, variable names and "."); with the format method, it's braces, brackets, colons, variable names, etc. The mini-language parser has to assume an encoding in order to interpret the format string, and that's *all* done assuming an ASCII compatible format string (which must make life interesting if you try to use an ASCII incompatible coding cookie for your source code - I'm actually not sure what the full implications of that *are* for bytes literals in Python 3).

The one remaining way I could potentially see a formatb method working is along the lines of what Glenn (I think) suggested: just like struct definitions, the formatb specifier would have to consist *solely* of substitution fields. However, that's getting awfully close to being just an alternate spelling for the struct module or bytes.join at that point, which hardly makes for a compelling case to add two new methods to a builtin type.

Given that one of the concepts with the Python 3 transition was to take certain problematic constructs (like ASCII compatible interpolation directly to binary without a separate encoding step) away and decide whether or not we were happy to live without them, I think this one has proven to have sufficient staying power to justify finally bringing it back in Python 3.5 (especially given the gain in lowering the barrier to porting Python 2 code that makes heavy use of interpolation to ASCII compatible binary formats).

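As a rough illustration of the points above, here is a minimal sketch. It assumes Python 3.5 or later, where bytes supports printf-style % interpolation; the header name, the values and the variable names are made up for the example and are not taken from any real protocol.

```python
import struct

# Assumption: Python 3.5+, where bytes objects accept printf-style % formatting.
# The literal text around %d is parsed as ASCII, so the result is only
# meaningful for ASCII compatible binary formats.
length = 42  # hypothetical payload length
header = b"Content-Length: %d\r\n" % length

# A format made up *solely* of substitution fields adds little over
# bytes.join or the struct module: all three spellings give the same bytes.
parts = (b"\x01\x02", b"\x03\x04")
only_fields = b"%b%b" % parts
joined = b"".join(parts)
packed = struct.pack("2s2s", *parts)

# The Shift JIS hazard: the trail byte of a two-byte character can fall in
# the ASCII range. U+30BD (KATAKANA LETTER SO) encodes to 0x83 0x5C, and
# 0x5C is the ASCII backslash.
sjis = "\u30bd".encode("shift_jis")
looks_like_backslash = sjis[-1:] == b"\\"

print(header)
print(only_fields == joined == packed)
print(sjis, looks_like_backslash)
```

The last part is why interpolating Shift JIS data can appear to work most of the time: any code that later scans the result for ASCII syntax characters may match the second byte of a two-byte character.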
It's certainly a decision that has its downsides, with the potential impact on users of ASCII incompatible encodings (mostly in Asia) being the main one, but I think the increased convenience in working with ASCII compatible binary protocols and file formats is worth the cost.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia