On Wed, Mar 30, 2011 at 11:01, Alistair Bayley <[email protected]> wrote:
> On 30 March 2011 20:53, Max Bolingbroke <[email protected]> wrote:
>
>> On 30 March 2011 07:52, Michael Snoyman <[email protected]> wrote:
>> > I could
>> > manually do something like (utf8Decode . S8.pack), but that presumes
>> > that the character encoding on the system in question is UTF8. So two
>> > questions:
>>
>> Funnily enough I have been thinking about this quite hard recently,
>> and the situation is kind of a mess and short of implementing PEP383
>> (http://www.python.org/dev/peps/pep-0383/) in GHC I can't see how to
>> make it easier on the programmer. As Jason points out the best you can
>> really do is probably:
>>
>> 1. Treat Strings that represent filenames as raw byte sequences, even
>> though they claim to be strings
>>
>> 2. When presenting such Strings to the user, re-decode them by using
>> the current locale encoding (which will typically be UTF-8). You
>> probably want to have some means of avoiding decoding errors here too
>> -- ignoring or replacing undecodable bytes -- but presently this is
>> not so straightforward. If you happen to be on a system with GNU Iconv
>> you can use it's "C//TRANSLIT//IGNORE" encoding to achieve this,
>> however.
>>
>
> http://www.haskell.org/pipermail/libraries/2009-August/012493.html
>
> I took from this discussion that FilePath really should be a pair of the
> actual filename ByteString, and the printable String (decoded from the
> ByteString, with encoding specified by the user's locale). The conversion
> from ByteString to String (and vice versa) is not guaranteed to be lossless,
> so you need to remember both.
>

I'm not sure that I agree with that. Why does it have to be loss-less? The
problem, more likely, is the fact that FilePath is just a simple string.
Maybe we should go the way of Java where cross-platform file access is
based upon a File (or the new Path) type?
That way the internal representation could use whatever is necessary to
ensure a unique reference to a file or directory, while at the same time
providing a way to get a human-readable representation. Going from strings
to file/path types would need the correct encodings to work.

Cheers,
-Tako

PS: Just lurking here most of the time because I'm still a total Haskell
noob, so you can ignore me without risk.
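
A minimal sketch of the File/Path idea, for illustration only. The names RawPath, fromDisplay and toDisplay are hypothetical and not part of any existing library, and the sketch assumes UTF-8 where a real implementation would consult the user's locale. The raw bytes are the identity of the path; the String form is only for display and is allowed to be lossy.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- | Hypothetical path type: the raw bytes identify the file and are
-- never altered by decoding, so equality is byte-for-byte.
newtype RawPath = RawPath B.ByteString
  deriving (Eq, Ord, Show)

-- | Build a path from a String the user typed.
-- Assumption: UTF-8 stands in for the locale encoding here.
fromDisplay :: String -> RawPath
fromDisplay = RawPath . TE.encodeUtf8 . T.pack

-- | Human-readable form. Bytes that are not valid UTF-8 are replaced
-- with U+FFFD rather than raising an error, so this direction is lossy.
toDisplay :: RawPath -> String
toDisplay (RawPath bytes) = T.unpack (TE.decodeUtf8With lenientDecode bytes)
```

With this shape, two paths whose bytes differ only in undecodable positions render identically through toDisplay but still compare as different RawPath values, which is why the display conversion may not need to be lossless as long as file operations only ever use the raw bytes.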
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
