Re: [GHC] #5512: UTF-16//ROUNDTRIP encoding behaves weirdly
#5512: UTF-16//ROUNDTRIP encoding behaves weirdly
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |        Owner:
        Type:  bug               |       Status:  new
    Priority:  normal            |    Milestone:  7.4.1
   Component:  libraries/base    |      Version:  7.2.1
    Keywords:                    |     Testcase:
   Blockedby:                    |   Difficulty:
          Os:  Unknown/Multiple  |     Blocking:
Architecture:  Unknown/Multiple  |      Failure:  Incorrect result at runtime
---------------------------------+------------------------------------------

Comment(by batterseapower):

 You are seeing exactly the expected output. My recent change to have
 mkTextEncoding try our Haskell TextEncodings before it falls back to iconv
 may have made this better.

 In practice, we don't really care whether UTF-16//ROUNDTRIP works because
 we only use //ROUNDTRIP for the fileSystemEncoding (a modified
 localeEncoding), UTF-16 is not an ASCII superset, and IIRC the Posix
 standard requires the locale encoding to be an ASCII superset.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5512#comment:3
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

___
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
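
To illustrate the "UTF-16 is not an ASCII superset" point in the comment above, here is a minimal sketch. The helpers `encodeAscii` and `encodeUtf16BE` are hypothetical names written for this illustration, not functions from base, and the UTF-16 helper only handles characters in the Basic Multilingual Plane.

```haskell
module Main where

import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Hypothetical helper: one byte per character, valid for ASCII input only.
encodeAscii :: String -> [Word8]
encodeAscii = map (fromIntegral . ord)

-- | Hypothetical helper: big-endian UTF-16 without a byte order mark.
-- Assumes every character is below U+10000 (no surrogate pairs needed).
encodeUtf16BE :: String -> [Word8]
encodeUtf16BE = concatMap unit
  where
    unit c =
      let n = ord c
      in [fromIntegral (n `shiftR` 8), fromIntegral (n .&. 0xFF)]

main :: IO ()
main = do
  print (encodeAscii "Hi")    -- [72,105]
  print (encodeUtf16BE "Hi")  -- [0,72,0,105]
  -- An ASCII superset would leave ASCII text byte-for-byte unchanged.
  print (encodeAscii "Hi" == encodeUtf16BE "Hi")  -- False
```

The `00 48 00 69` pattern is the same one visible in the hexdump posted elsewhere in this thread.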
Re: [GHC] #5512: UTF-16//ROUNDTRIP encoding behaves weirdly
#5512: UTF-16//ROUNDTRIP encoding behaves weirdly
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |        Owner:
        Type:  bug               |       Status:  closed
    Priority:  normal            |    Milestone:  7.4.1
   Component:  libraries/base    |      Version:  7.2.1
  Resolution:  fixed             |     Keywords:
    Testcase:                    |    Blockedby:
  Difficulty:                    |           Os:  Unknown/Multiple
    Blocking:                    | Architecture:  Unknown/Multiple
     Failure:  Incorrect result at runtime  |
---------------------------------+------------------------------------------
Changes (by batterseapower):

  * status:  new => closed
  * resolution:  => fixed

Comment:

 I've just realised that the first paragraph above is rubbish: I didn't
 change mkTextEncoding, only localeEncoding. So perhaps it was just a
 foolish missing hClose that was causing this behaviour. Sorry for the
 noise.

 The second paragraph still stands. Closing the ticket.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5512#comment:4
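
On the missing hClose mentioned above: a rough sketch of one way to avoid that class of mistake is to let `withFile` close (and therefore flush) the handle. The helper name `writeAndMeasure` is made up for this example, and it uses the builtin `utf16` encoding from System.IO rather than a //ROUNDTRIP encoding, so it only demonstrates the flushing behaviour, not the roundtrip mechanism.

```haskell
module Main where

import System.IO

-- | Hypothetical helper: write a string as UTF-16 and return the file size.
-- withFile closes the handle when the inner action finishes, which flushes
-- any buffered output; no explicit hClose is needed.
writeAndMeasure :: FilePath -> String -> IO Integer
writeAndMeasure path str = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf16
    hPutStr h str
  -- Reopen the file to see how many bytes actually reached the disk.
  withFile path ReadMode hFileSize

main :: IO ()
main = do
  size <- writeAndMeasure "out.temp" "HiHi"
  -- A byte order mark plus two bytes per character is expected here.
  print size
```

With a plain `openFile` and no `hClose`, the buffered bytes may never be written before the program exits, which would match the 0 byte file reported in this thread.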
Re: [GHC] #5512: UTF-16//ROUNDTRIP encoding behaves weirdly
#5512: UTF-16//ROUNDTRIP encoding behaves weirdly
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |        Owner:
        Type:  bug               |       Status:  new
    Priority:  normal            |    Milestone:  7.4.1
   Component:  libraries/base    |      Version:  7.2.1
    Keywords:                    |     Testcase:
   Blockedby:                    |   Difficulty:
          Os:  Unknown/Multiple  |     Blocking:
Architecture:  Unknown/Multiple  |      Failure:  Incorrect result at runtime
---------------------------------+------------------------------------------
Changes (by igloo):

  * milestone:  => 7.4.1

Comment:

 What's the expected output? I got a 0 byte output file, but if I add
 hClose h then I get
 {{{
 $ ls -l out.temp; hexdump -C out.temp
 -rw-r--r-- 1 ian ian 11 Nov 10 01:01 out.temp
 fe ff 00 48 00 69 e8 00 48 00 69 |...H.i..H.i|
 000b
 }}}
 (HEAD, amd64/Linux)

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5512#comment:2
[GHC] #5512: UTF-16//ROUNDTRIP encoding behaves weirdly
#5512: UTF-16//ROUNDTRIP encoding behaves weirdly
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |        Owner:
        Type:  bug               |       Status:  new
    Priority:  normal            |    Component:  libraries/base
     Version:  7.2.1             |     Keywords:
    Testcase:                    |    Blockedby:
          Os:  Unknown/Multiple  |     Blocking:
Architecture:  Unknown/Multiple  |      Failure:  Incorrect result at runtime
---------------------------------+------------------------------------------
 Try this program:

 {{{
 module Main where

 import System.IO

 main = do
     roundtrip_enc <- mkTextEncoding "UTF16//ROUNDTRIP"
     h <- openFile "out.temp" WriteMode
     hSetEncoding h roundtrip_enc
     hPutStr h "Hi\xEFE8Hi"
 }}}

 It fails with:

 {{{
 hSetEncoding: invalid argument (Invalid argument)
 }}}

 If you change "UTF16" to "UTF-16" (so we use the builtin encoding rather
 than iconv) it works, but the output file only contains the first "Hi".

 I think what is going on here is that iconv does not generate EILSEQ for
 identity transformations such as that between a UTF-16 text file and our
 UTF-16 CharBuffers. Since we never get that exception, we can't fix up the
 lone surrogates we use to encode roundtrip characters.

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5512
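
For readers unfamiliar with the roundtrip idea in the description above, here is a small sketch of escaping an undecodable byte as a lone surrogate and recovering it on output. The mapping `0xDC00 + byte` is an assumption borrowed from PEP 383-style escaping; this thread does not state the exact code points base uses, and `escapeByte` / `unescapeChar` are hypothetical names, not functions from base.

```haskell
module Main where

import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Hypothetical: represent a byte that could not be decoded as a Char.
-- ASCII bytes map to themselves; high bytes map to a lone low surrogate
-- (assumed mapping: 0xDC00 + byte).
escapeByte :: Word8 -> Char
escapeByte b
  | b < 0x80  = chr (fromIntegral b)
  | otherwise = chr (0xDC00 + fromIntegral b)

-- | Hypothetical: recover the original byte from an escape character.
-- Returns Nothing for any Char that is not one of the escapes above.
unescapeChar :: Char -> Maybe Word8
unescapeChar c
  | n >= 0xDC80 && n <= 0xDCFF = Just (fromIntegral (n - 0xDC00))
  | otherwise                  = Nothing
  where
    n = ord c

main :: IO ()
main = do
  print (ord (escapeByte 0xE8))           -- 56552, i.e. 0xDCE8
  print (unescapeChar (escapeByte 0xE8))  -- Just 232
  print (unescapeChar 'H')                -- Nothing
```

An encoder that relies on an error such as EILSEQ to notice these escape characters never gets the chance to call something like `unescapeChar` if the underlying conversion accepts them silently, which is the situation the description speculates about.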
Re: [GHC] #5512: UTF-16//ROUNDTRIP encoding behaves weirdly
#5512: UTF-16//ROUNDTRIP encoding behaves weirdly
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |        Owner:
        Type:  bug               |       Status:  new
    Priority:  normal            |    Component:  libraries/base
     Version:  7.2.1             |     Keywords:
    Testcase:                    |    Blockedby:
          Os:  Unknown/Multiple  |     Blocking:
Architecture:  Unknown/Multiple  |      Failure:  Incorrect result at runtime
---------------------------------+------------------------------------------
Description changed by batterseapower:

Old description:

> Try this program:
>
> {{{
> module Main where
>
> import System.IO
>
> main = do
>     roundtrip_enc <- mkTextEncoding "UTF16//ROUNDTRIP"
>     h <- openFile "out.temp" WriteMode
>     hSetEncoding h roundtrip_enc
>     hPutStr h "Hi\xEFE8Hi"
> }}}
>
> It fails with:
>
> {{{
> hSetEncoding: invalid argument (Invalid argument)
> }}}
>
> If you change "UTF16" to "UTF-16" (so we use the builtin encoding rather
> than iconv) it works, but the output file only contains the first "Hi".
>
> I think what is going on here is that iconv does not generate EILSEQ for
> identity transformations such as that between a UTF-16 text file and our
> UTF-16 CharBuffers. Since we never get that exception, we can't fix up
> the lone surrogates we use to encode roundtrip characters.

New description:

 Try this program:

 {{{
 module Main where

 import System.IO

 main = do
     roundtrip_enc <- mkTextEncoding "UTF16//ROUNDTRIP"
     h <- openFile "out.temp" WriteMode
     hSetEncoding h roundtrip_enc
     hPutStr h "Hi\xEFE8Hi"
 }}}

 It fails with:

 {{{
 hSetEncoding: invalid argument (Invalid argument)
 }}}

 If you change "UTF16" to "UTF-16" (so we use the builtin encoding rather
 than iconv) it works, but the output file only contains the first "Hi".

 I think part of what is going on here is that iconv does not generate
 EILSEQ for identity transformations such as that between a UTF-16 text
 file and our UTF-16 CharBuffers. Since we never get that exception, we
 can't fix up the lone surrogates we use to encode roundtrip characters.

--

--
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/5512#comment:1