Executive summary: My experience is that having bytes APIs in the os module is very useful. But perhaps higher-level functions like os.scandir can do without (I present no arguments either way on that, just acknowledge it).
Andrew Barnert writes: > Anyway, Windows CDs can't cause this problem. My bad. I meant archival Mac CDs (or perhaps they were taken from a network filesystem) which is where I see MacRoman, and Windows (ie, FAT-formatted) USB drives, which is where I see Shift JIS. The point here is not what is technically possible or even standard, it's that though what I see in practice may not *require* bytes APIs, it's *very convenient* to have them (especially interactively). > The same thing is true with NTFS external drives, VFAT USB drives, > etc. Generally, it's usually not Windows media on *nix systems that > break Python 2 unicode; it's native *nix filesystems where users > mix locales. IMHO, Python 2 unicode is not breakable, let alone broken. ;-) Mailman 2 has managed to almost get to a state where you can't get it to raise a Unicode exception (except where deliberately used as EAFP), let alone one that is not handled (before the catch-all "except Exception" that keeps the daemon running). And that's in an application whose original encoding support assumed standard conformance by design in a realm where spammers and junior high school hackers regularly violate the most ancient of RFCs (the restriction to ASCII in headers goes back to a 6xx RFC at the latest!) Python 2 Unicode turns out to have been an excellent compromise between the needs of backward compatibility with uniformly encoded bytestreams for Europe, and the forward-looking needs of a globalizing Internet. (But you knew that! :-) As I wrote earlier, the world is broken, or at least Japan. The world "got bettah", thus Python 3. And most of the time Python 3 is wonderful in Japan (specifically, it's trivial to get recalcitrant students to use best I18N practice). My point is that *where I live* the experience is very different. There are *no* Japanese who use *nix (other than Mac OS X) for paperwork in my neighborhood. Shift JIS filenames *are* from Windows media recently written, though probably not by Microsoft-provided software. Bytes APIs are a very useful tool in dealing with these issues, at least in the hands of someone who has become expert in dealing with them. I suspect the same is true of China, except that like their business partner Apple they are in a position to legislate uniformity, and do. (Unfortunately that's GB18030, not Unicode.) So maybe they're better off than a place that coined the phrase "politics that can't decide". I admit I've not yet used os.scandir, let alone its bytes API. Perhaps we can, and perhaps we should, restrict the bytes API in the os module to a few basic functions, and require that the environment be sane for cases where we want to use higher-level or optimized functions. > > You contradict yourself! ;-) > > I'm perfectly happen to have been wrong earlier. And if catching > myself before someone else did makes me a flip-flopper, well, I'm > not running for president. :P I consider that the most important qualification for President, especially if your name is Trump or Sanders. That's one of the things I respect most about Python: with a few (negligible) exceptions, minds change to fit the facts. And, BTW, EAFP applies here, too. Make mistakes on the mailing lists before you commit them to code. Please!<wink/> _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com