Re: [Haskell-cafe] Ready for testing: Unicode support for Handle I/O

2009-02-03 Thread John Goerzen
Simon Marlow wrote:
> I've been working on adding proper Unicode support to Handle I/O in GHC, and I finally have something that's ready for testing. I've put a patchset here:

Yay! Comments below.

> Comments/discussion please!

Do you expect Hugs will be able to pick up all of this? The

Re: [Haskell-cafe] Ready for testing: Unicode support for Handle I/O

2009-02-03 Thread John Goerzen
On Tue, Feb 03, 2009 at 10:56:13PM +, Duncan Coutts wrote: Thanks to suggestions from Duncan Coutts, it's possible to call hSetEncoding even on buffered read Handles, and the right thing happens. So we can read from text streams that include multiple encodings, such as an HTTP
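The mid-stream encoding switch described in this message can be illustrated with a small sketch. It assumes the System.IO names as they appear in released versions of base (hSetEncoding, latin1, utf8); readMessage and demo are hypothetical helpers, the file path is whatever the caller supplies, and the "headers" use a bare "\n" instead of HTTP's CRLF to keep the example short. It is not taken from the patchset itself.

```haskell
import Control.Exception (evaluate)
import System.IO

-- | Hypothetical helper: read header lines (up to the first empty line) as
-- Latin-1, then switch the same buffered Handle to UTF-8 and read the rest
-- of the stream as the body. Assumes the blank separator line is present;
-- hGetLine throws at end of file otherwise.
readMessage :: Handle -> IO ([String], String)
readMessage h = do
  hSetEncoding h latin1
  headers <- readHeaders
  hSetEncoding h utf8          -- change encoding on the already-buffered Handle
  body <- hGetContents h
  _ <- evaluate (length body)  -- force the lazy read before the Handle is closed
  return (headers, body)
  where
    readHeaders = do
      line <- hGetLine h
      if null line then return [] else (line :) <$> readHeaders

-- | Hypothetical demo: write raw bytes (an ASCII header, a blank line, and a
-- UTF-8 encoded body) to the given path, then read them back.
demo :: FilePath -> IO ([String], String)
demo path = do
  withBinaryFile path WriteMode $ \h ->
    hPutStr h "Content-Type: text/plain; charset=utf-8\n\ncaf\xC3\xA9\n"
  withFile path ReadMode readMessage
```

If the switch behaves as described, the two bytes C3 A9 in the body are decoded as the single character U+00E9 even though they were already sitting in the Handle's buffer while the header was being read as Latin-1.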

Re: [Haskell-cafe] Ready for testing: Unicode support for Handle I/O

2009-02-03 Thread Duncan Coutts
On Tue, 2009-02-03 at 11:03 -0600, John Goerzen wrote:
> Will there also be something to handle the UTF-16 BOM marker? I'm not sure what the best API for that is, since it may or may not be present, but it should be considered -- and could perhaps help autodetect encoding.

I think someone else
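As a rough illustration of the autodetection idea raised here, a byte-order mark can be recognised by looking at the first few bytes of a stream. The sketch below is a standalone pure function, not the API discussed in the thread; the Bom type and detectBom are hypothetical names. It only covers UTF-8 and UTF-16, and it does not try to distinguish UTF-16LE from UTF-32LE, whose BOM starts with the same two bytes.

```haskell
import Data.Word (Word8)

-- | Hypothetical result type for BOM sniffing.
data Bom = Utf8Bom | Utf16BE | Utf16LE | NoBom
  deriving (Eq, Show)

-- | Inspect the leading bytes of a stream and report which BOM, if any,
-- is present. Since a BOM may be absent, NoBom means "unknown", not
-- "not Unicode".
detectBom :: [Word8] -> Bom
detectBom (0xEF : 0xBB : 0xBF : _) = Utf8Bom   -- EF BB BF
detectBom (0xFE : 0xFF : _)        = Utf16BE   -- FE FF
detectBom (0xFF : 0xFE : _)        = Utf16LE   -- FF FE
detectBom _                        = NoBom
```

A caller could use the result to pick an encoding before reading, falling back to a default when the answer is NoBom.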

Re: [Haskell-cafe] Ready for testing: Unicode support for Handle I/O

2009-02-03 Thread Duncan Coutts
On Tue, 2009-02-03 at 17:39 -0600, John Goerzen wrote: On Tue, Feb 03, 2009 at 10:56:13PM +, Duncan Coutts wrote: Thanks to suggestions from Duncan Coutts, it's possible to call hSetEncoding even on buffered read Handles, and the right thing happens. So we can read from text

Re: [Haskell-cafe] Ready for testing: Unicode support for Handle I/O

2009-02-03 Thread John Goerzen
Duncan Coutts wrote:
> Sorry, I think we've been talking at cross purposes.

I think so.

>> There always has to be *some* conversion from a 32-bit Char to the system's selection, right?
>
> Yes. In text mode there is always some conversion going on. Internally there is a byte buffer and a char
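One way to see the text-mode conversion mentioned here is to write a String through a Handle with a chosen encoding and then read the file back in binary mode, where no decoding takes place. This is a minimal sketch assuming the System.IO names from released versions of base; encodedBytes is a hypothetical helper and the path is supplied by the caller.

```haskell
import Control.Exception (evaluate)
import Data.Char (ord)
import System.IO

-- | Hypothetical helper: write a String in text mode using the given
-- encoding, then reopen the file in binary mode and return the bytes that
-- actually reached the file.
encodedBytes :: FilePath -> TextEncoding -> String -> IO [Int]
encodedBytes path enc s = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hSetNewlineMode h noNewlineTranslation  -- keep '\n' as a single byte
    hPutStr h s
  withBinaryFile path ReadMode $ \h -> do
    raw <- hGetContents h                   -- binary mode: one Char per byte
    let bytes = map ord raw
    _ <- evaluate (length bytes)            -- force the read before closing
    return bytes
```

For example, `encodedBytes "out.txt" utf8 "\955"` should yield the two bytes 0xCE 0xBB, the UTF-8 encoding of U+03BB, while passing utf16le instead should yield 0xBB 0x03.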