#3307: System.IO and System.Directory functions not Unicode-aware under Unix
----------------------------------+-----------------------------------------
Reporter: YitzGale | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 7.2.1
Component: libraries/base | Version: 6.11
Keywords: directory unicode | Testcase:
Blockedby: | Difficulty: Unknown
Os: Unknown/Multiple | Blocking:
Architecture: Unknown/Multiple | Failure: None/Unknown
----------------------------------+-----------------------------------------
Comment(by batterseapower):
Replying to [comment:11 tsuyoshi]:
> This has almost nothing to do with character encoding. It happens
> because a question mark happens to be a special character in shell
> filename expansion (wildcard). Apparently in your case Bash substitutes
> each question mark to one byte, not one character.
That is interesting, thanks - I've never seen ? used as a shell wildcard.
It's certainly a more reassuring explanation than what I wrongly thought
was going on!
However, I don't think this changes the argument about how we should
decode the command line. As I see it, the only reasonable thing to do is
to assume that:
1. All of argv has the same encoding.
2. That encoding is the locale encoding, as it should match the
terminal's encoding and hence the encoding in which typed user input
will arrive.
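
To make the two assumptions concrete, here is a minimal sketch of decoding
raw argument bytes with a single encoding. It is not the actual base
implementation: decodeArg and decodeArgv are hypothetical helper names, and
the byte lists stand in for a real argv. Only getLocaleEncoding, utf8,
latin1 and GHC.Foreign.peekCStringLen come from base.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (withArrayLen)
import Foreign.Ptr (castPtr)
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (TextEncoding, getLocaleEncoding, latin1, utf8)

-- Hypothetical helper: decode the raw bytes of one argument with the
-- given encoding.
decodeArg :: TextEncoding -> [Word8] -> IO String
decodeArg enc bytes =
  withArrayLen bytes $ \len ptr ->
    GHC.peekCStringLen enc (castPtr ptr, len)

-- Assumption 1: every element of argv shares one encoding.
decodeArgv :: TextEncoding -> [[Word8]] -> IO [String]
decodeArgv enc = mapM (decodeArg enc)

main :: IO ()
main = do
  -- The bytes of "café" as a UTF-8 terminal would deliver them.
  let rawArgv = [[0x63, 0x61, 0x66, 0xC3, 0xA9]]
  -- The same bytes give different Strings under different encodings.
  decodeArgv utf8 rawArgv >>= print    -- ["caf\233"]
  decodeArgv latin1 rawArgv >>= print  -- ["caf\195\169"]
  -- Assumption 2: use the locale encoding. The result depends on the
  -- environment, so it is only computed here, not printed.
  enc <- getLocaleEncoding
  _ <- decodeArgv enc [[0x61, 0x62, 0x63]]
  return ()
```

The fixed utf8 and latin1 calls show why the choice matters: the same five
bytes become four characters under one encoding and five under the other.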
It is unfortunate that Bash will tab-complete filenames without regard for
the current encoding, thus creating a command line that may have mixed-
encoding data with no way to tell which bit is which.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3307#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler