On Sat, Dec 27, 2008 at 8:53 AM, Eric Kow <[email protected]> wrote:
> Hi Judah,
>
> On Sat, Dec 27, 2008 at 08:37:01 -0800, Judah Jacobson wrote:
>> The following patch fixes Unicode input on POSIX systems with the
>> Haskeline backend. Haskeline returns decoded Chars, but Darcs expects
>> the input to be encoded (which is the behavior of the non-Haskeline
>> backend). This fix re-encodes the input received from Haskeline into
>> UTF-8.
>
> Thanks for the explanation, but I think I'm going to need a little more
> help understanding this patch. What do you mean by "Darcs expects the
> input to be encoded"? Is it because we use getLine which just assumes
> that the input encoding is ISO-8859-1 (or actually, as I understand it,
> the first 256 code points in the Unicode table, which happens to be the
> same)?

Yes, that's right. For example:

    Prelude> getLine
    [user input:]α
    "\206\177"
    Prelude> :m +System.Console.Haskeline
    Prelude System.Console.Haskeline> runInputT defaultSettings $ getInputLine ""
    [user input:]α
    Just "\945"

Haskeline decodes the two bytes into one Char, but Darcs expects to
receive the encoded string since it uses standard functions like
writeFile, putStr, etc. which ignore all but the low 8 bits of each
Char they are given.

> Also, what if the user is not using UTF-8 (ugh!) in their terminal?

Haskeline currently can't handle encodings besides ASCII or UTF-8 on
POSIX systems. (I'm hoping to fix that soon.) How serious of a concern
is this?

-Judah
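
To make the re-encoding step concrete, here is a minimal sketch of turning a decoded String (as returned by Haskeline) back into the one-byte-per-Char form that getLine produced in the session above. The names encodeChar, encodeUtf8 and truncateTo8Bits are hypothetical and are not taken from the Darcs patch; truncateTo8Bits is only a model of the "keep the low 8 bits" output behaviour described in this message, not a real library function.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Hypothetical helper: encode one Char as UTF-8, storing each output
-- byte in its own Char (code points 0-255).
encodeChar :: Char -> String
encodeChar c
  | n < 0x80    = [chr n]
  | n < 0x800   = map chr [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = map chr [ 0xE0 .|. (n `shiftR` 12)
                          , cont (n `shiftR` 6)
                          , cont n ]
  | otherwise   = map chr [ 0xF0 .|. (n `shiftR` 18)
                          , cont (n `shiftR` 12)
                          , cont (n `shiftR` 6)
                          , cont n ]
  where
    n = ord c
    -- Continuation byte: binary 10xxxxxx carrying the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Hypothetical helper: re-encode a whole decoded String.
encodeUtf8 :: String -> String
encodeUtf8 = concatMap encodeChar

-- Assumption: a model of output functions that keep only the low
-- 8 bits of each Char, as described in the message.
truncateTo8Bits :: String -> String
truncateTo8Bits = map (chr . (.&. 0xFF) . ord)
```

Loaded into GHCi, encodeUtf8 "\945" evaluates to "\206\177", the same string getLine returned for α, while truncateTo8Bits "\945" evaluates to "\177", which is why passing the decoded Char straight through would lose the first byte.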
