-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Feb 15, 2017 at 06:59:14PM +0200, Eli Zaretskii wrote:
> > Date: Wed, 15 Feb 2017 10:18:32 +0100
> > From: <[email protected]>

[...]

> > Most notably, the whole path might cross several mount points, thus
> > the whole path can well have fragments coming from several file systems.
>
> A possible solution would be to decode each mount point's part as it
> is being resolved.

...which can only be based on guesswork: there's no reliable info
on the encoding used for that file system (if it's consistent at
all). What can we do? Try different encodings until one "works"?
That amounts to trying UTF-8 and then some Latin-x (for any x),
which would fit, for any x.

> > I think the only sane way to see a Linux file system path is the way
> > Linux sees it: as a byte string.
>
> This would lose a lot in 99% of use cases. You are, in effect,
> suggesting a "reverse optimization", whereby the majority of use cases
> is punished in favor of a small minority, based on theoretical
> intractability.

I feel queasy doing some voodoo without the application having a
word on it. In the Emacs context it's a bit easier, because in the
"normal" case things are pretty quickly deferred to the user
(usually).

> > Sure, some helper infrastructure to try to make characters of that
> > mess will be welcome, but that should be absolutely robust wrt.
> > unexpected input (e.g. bad UTF-8) and leave control to the application.
>
> Most applications won't like this burden, because most application
> programmers don't know enough about the issue to solve them correctly,
> especially for users of other OSes and locales.
>
> > > But if OpenBSD requires all _filenames_ to be in valid UTF-8, that
> > > is a bad decision in my view.
> >
> > NT has done that too.
>
> Windows can do that because it also transparently translates file
> names to the locale's encoding when files are accessed with ANSI APIs.
> Without such translation, this kind of decision is unwise, IMO.

I guess (I don't *know*) Windows stores information about the
encoding at file system level (and keeps that consistent). Linux
doesn't have that: it just keeps out of it.
It hasn't even a place to state the encoding used.

Thanks & regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlikuCgACgkQBcgs9XrR2kauCACfTpfRpHhL2iUJXET5zqokA6US
+pkAnjIc7Q+hBPj9Vi9Pk46AsmI3yA5m
=RXAn
-----END PGP SIGNATURE-----
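
To make the "try encodings until one works" point above concrete, here is a
minimal Python sketch. It is only an illustration, not anything Emacs or the
kernel does: the helper names guess_decode and roundtrip are made up for this
example, and the choice of Latin-1 and Latin-2 as fallbacks is an assumption.
It shows that a single-byte fallback such as Latin-1 accepts any byte string,
so a successful decode says nothing about the real encoding, and that the
original bytes can still be kept intact if the path is treated as a byte
string underneath.

```python
def guess_decode(raw):
    """Hypothetical helper: try strict UTF-8 first, then fall back to Latin-1."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 assigns a character to every byte value, so this never
        # fails. "It decoded" is therefore not evidence of anything.
        return raw.decode("latin-1"), "latin-1"


def roundtrip(raw):
    """Hypothetical helper: decode to str and back without losing bytes."""
    # The surrogateescape error handler maps undecodable bytes to lone
    # surrogates on decode and restores the same bytes on encode.
    text = raw.decode("utf-8", "surrogateescape")
    return text.encode("utf-8", "surrogateescape")


name = b"caf\xe9"          # "café" written in Latin-1; not valid UTF-8
print(guess_decode(name))  # falls back to Latin-1

# The same byte "fits" more than one Latin-x and means different things:
print(b"\xf5".decode("latin-1"), b"\xf5".decode("iso8859_2"))

# The byte string itself survives, whatever it was meant to say:
print(roundtrip(b"caf\xe9/\xff") == b"caf\xe9/\xff")
```

Which fallback to pick, if any, is exactly the decision the application (or
the user) would have to make; the sketch only shows that the file name bytes
alone do not settle it.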
