On 21 Aug 2014 09:06, "Chris Barker" <chris.bar...@noaa.gov> wrote:
> > As I understand it, the whole problem with some posix systems is that there is NO filesystem encoding -- i.e. you can't know for sure what encoding a filename is in. So you need to be able to pass the bytes through as they are.
> >
> > (At least as I read Armin Ronacher's blog)

Armin lets his astonishment at the idea we'd expect Linux vendors to fix their broken OS get the better of him at times - he thinks the responsibility lies entirely with us to work around its quirks and limitations :)

The "surrogateescape" error handler is our main answer to the unreliability of the POSIX encoding model - fsdecode will squirrel away arbitrary undecodable bytes as lone surrogate code points (U+DC80 to U+DCFF), and then fsencode will restore them again later.

That works for the simple round tripping case, but we currently lack good default tools for "cleaning" strings that may contain surrogates (or even scanning a string to see if surrogates are present).

One idea I had along those lines is a surrogatereplace error handler (http://bugs.python.org/issue22016) that emitted an ASCII question mark for each smuggled byte, rather than propagating the encoding problem.

Cheers,
Nick.

> > -Chris
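
The round trip described above, and the missing "scan" and "clean" steps, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an explicit UTF-8 decode with "surrogateescape" rather than os.fsdecode so the result does not depend on the platform's filesystem encoding, has_surrogates and clean_surrogates are hypothetical helper names (not stdlib functions), and the cleaning step uses the existing "replace" handler as a stand-in for the proposed surrogatereplace handler.

```python
# Bytes that are valid Latin-1 but not valid UTF-8 (0xE9 is a stray lead byte).
raw = b"caf\xe9.txt"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns the surrogates back into the original bytes.
restored = name.encode("utf-8", "surrogateescape")


def has_surrogates(text):
    """Hypothetical helper: True if text contains any surrogate code point."""
    return any("\ud800" <= ch <= "\udfff" for ch in text)


def clean_surrogates(text, encoding="utf-8"):
    """Hypothetical helper: swap each lone surrogate for an ASCII '?'.

    Uses the stdlib "replace" handler as an approximation of the proposed
    surrogatereplace handler; it does not distinguish escaped bytes from
    other lone surrogates.
    """
    return text.encode(encoding, "replace").decode(encoding)


print(repr(name))
print(restored == raw)
print(has_surrogates(name))
print(clean_surrogates(name))
```

The cleaned string is safe to print or log, but it can no longer be encoded back to the original filename, so it is only suitable for display.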
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com