Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
Got this back from the bug tracker:

> 6.12.1 will have Unicode support in the IO library which mostly fixes this problem. The rest is fixed by #3398.

Iain
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
On Aug 20, 2009, at 05:07 , Ketil Malde wrote:

>    % ghci -e 'map Data.Char.ord "饁"'
>    :1:21:
>        lexical error in string/character literal at character '\129'
>
> but again:
>
>    % ghci -e 'map Data.Char.ord "£"'
>    [194,163]
>
> So GHCi used interactively translates input from the terminal's UTF-8,
> but truncates output to eight bits. Executing a string with -e, it
> appears to read byte for byte (which I think was the original behavior
> at some point).

Makes sense; absent utf8-string, System.Environment.getArgs only groks bytes.

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH
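The byte values quoted above are what UTF-8 encoding would produce for those two characters. A minimal sketch, assuming nothing beyond the base library: `utf8Encode` is a hypothetical helper written for illustration (it is not the utf8-string API), and it does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)

-- Hypothetical helper: encode one character as its UTF-8 byte values.
-- Bytes are returned as Ints so they can be compared with the GHCi output.
utf8Encode :: Char -> [Int]
utf8Encode c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                  , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- Continuation byte: the bits 10 followed by the low six bits of x.
    cont x = 0x80 .|. (x .&. 0x3F)

main :: IO ()
main = do
  print (utf8Encode '\x00A3')  -- the pound sign: [194,163]
  print (utf8Encode '\x9941')  -- the CJK character: [233,165,129]
```

If that reading is right, the '\129' in the lexical error would be the third byte of the three-byte sequence, seen by the lexer as a character of its own.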
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
2009/8/20 Ketil Malde

> Stuart Cook writes:
>
> > GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
> > Loading package base ... linking ... done.
> > Prelude> map Data.Char.ord "饁"
> > [39233]    <== 0x9941
> > Prelude> putStrLn "饁"
> > A          <== 0x41
>
> > It seems that GHCi is clever enough to decode UTF-8 input, which only
> > serves to confuse System.IO even more.
>
> I get:
>
>    GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
>    Loading package base ... linking ... done.
>    Prelude> map Data.Char.ord "饁"
>    [39233]
>
> and
>
>    Prelude> map Data.Char.ord "£"
>    [163]
>
> but also:
>
>    % ghci -e 'map Data.Char.ord "饁"'
>    :1:21:
>        lexical error in string/character literal at character '\129'
>
> but again:
>
>    % ghci -e 'map Data.Char.ord "£"'
>    [194,163]
>
> So GHCi used interactively translates input from the terminal's UTF-8,
> but truncates output to eight bits. Executing a string with
> -e, it appears to read byte for byte (which I think was the original
> behavior at some point).
>
> -k
> --

I get the same behaviour here. If you want to try Latin 1 (ISO-8859-1) then you can use a utility called Luit (maybe only Linux?):

    luit -encoding ISO-8859-1 ghci

£ becomes £, but gives the same byte output as above.

Iain
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
Stuart Cook writes:

> GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
> Loading package base ... linking ... done.
> Prelude> map Data.Char.ord "饁"
> [39233]    <== 0x9941
> Prelude> putStrLn "饁"
> A          <== 0x41

> It seems that GHCi is clever enough to decode UTF-8 input, which only
> serves to confuse System.IO even more.

I get:

   GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
   Loading package base ... linking ... done.
   Prelude> map Data.Char.ord "饁"
   [39233]

and

   Prelude> map Data.Char.ord "£"
   [163]

but also:

   % ghci -e 'map Data.Char.ord "饁"'
   :1:21:
       lexical error in string/character literal at character '\129'

but again:

   % ghci -e 'map Data.Char.ord "£"'
   [194,163]

So GHCi used interactively translates input from the terminal's UTF-8, but truncates output to eight bits. Executing a string with -e, it appears to read byte for byte (which I think was the original behavior at some point).

-k
--
If I haven't seen further, it is by standing in the footprints of giants
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
On Thu, Aug 20, 2009 at 5:12 PM, Colin Paul Adams wrote:

> Yes, but surely this will work both ways. The same bytes on input
> should come back on output, shouldn't they?

I would have thought so, but apparently this isn't actually what happens.

  GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
  Loading package base ... linking ... done.
  Prelude> map Data.Char.ord "饁"
  [39233]    <== 0x9941
  Prelude> putStrLn "饁"
  A          <== 0x41

It seems that GHCi is clever enough to decode UTF-8 input, which only serves to confuse System.IO even more.

Stuart
Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC
Hello Colin,

Thursday, August 20, 2009, 11:12:53 AM, you wrote:

> Yes, but surely this will work both ways. The same bytes on input
> should come back on output, shouldn't they?

Only the ASCII subset has a fixed encoding. The rest may migrate in some way.

--
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
> "Stuart" == Stuart Cook writes:

    Stuart> On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul Adams wrote:
    >> But how do you get Latin-1 bytes from a Unicode string? This
    >> would need a transcoding process.

    Stuart> The first 256 code-points of Unicode coincide with
    Stuart> Latin-1. Therefore, if you truncate Unicode characters
    Stuart> down to 8 bits you'll effectively end up with Latin-1 text
    Stuart> (except that any code points above U+00FF will give
    Stuart> strange results).

    Stuart> If your terminal then interprets these bytes as UTF-8 (or
    Stuart> anything else, really), the result will be gibberish or
    Stuart> worse.

Yes, but surely this will work both ways. The same bytes on input should come back on output, shouldn't they?

--
Colin Adams
Preston Lancashire
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul Adams wrote:

> But how do you get Latin-1 bytes from a Unicode string? This would
> need a transcoding process.

The first 256 code-points of Unicode coincide with Latin-1. Therefore, if you truncate Unicode characters down to 8 bits you'll effectively end up with Latin-1 text (except that any code points above U+00FF will give strange results).

If your terminal then interprets these bytes as UTF-8 (or anything else, really), the result will be gibberish or worse.

Stuart
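The truncation described above can be imitated in a few lines. This is only a sketch of the effect, not GHC's actual I/O code: `truncateTo8Bits` is a hypothetical name, and the characters are written as numeric escapes so the result does not depend on how the source file is encoded.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: keep only the low eight bits of each code point,
-- imitating the output behaviour described in this thread.
truncateTo8Bits :: String -> String
truncateTo8Bits = map (chr . (`mod` 256) . ord)

main :: IO ()
main = do
  -- U+00A3 (pound sign) already fits in eight bits, so it survives as
  -- the single Latin-1 byte 163.
  print (map ord (truncateTo8Bits "\x00A3"))
  -- U+9941 loses its high byte 0x99 and comes out as 0x41, the letter A.
  print (map ord (truncateTo8Bits "\x9941"))
```

A lone byte 163 is not a valid UTF-8 sequence, which would explain why a UTF-8 terminal shows a replacement character for the pound sign.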
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
> "Bulat" == Bulat Ziganshin writes:

    Bulat> Hello Colin,
    Bulat> Thursday, August 20, 2009, 10:13:28 AM, you wrote:

    >> I don't understand where latin-1 comes into this. String is supposed
    >> to be a list of Unicode characters.

    Bulat> but ghc 6.10 i/o used String as list of bytes

But how do you get Latin-1 bytes from a Unicode string? This would need a transcoding process.

--
Colin Adams
Preston Lancashire
Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC
Hello Colin,

Thursday, August 20, 2009, 10:13:28 AM, you wrote:

> I don't understand where latin-1 comes into this. String is supposed
> to be a list of Unicode characters.

But GHC 6.10 I/O used String as a list of bytes.

--
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
> "Judah" == Judah Jacobson writes:

    Judah> On Wed, Aug 19, 2009 at 10:31 AM, Iain Barnett wrote:
    >> Quick question: I've tested this in a couple of different
    >> terminals (roxterm and xterm), so I'm fairly sure it's GHC
    >> that's the problem. Have I missed a setting?
    >>
    >> GHCi, version 6.10.4
    >> Prelude> putStrLn "£"
    >> �
    >>
    >> Hugs98 200609-3
    >> Hugs> putStrLn "£"
    >> £

    Judah> ghc-6.10.4 and earlier don't automatically encode/decode Unicode
    Judah> characters. So on terminals which don't use the latin-1
    Judah> encoding, you need to do the conversion explicitly with a
    Judah> separate package such as utf8-string, iconv or text-icu.

I don't understand where latin-1 comes into this. String is supposed to be a list of Unicode characters.

--
Colin Adams
Preston Lancashire
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
On Wed, Aug 19, 2009 at 10:31 AM, Iain Barnett wrote:

> Quick question: I've tested this in a couple of different terminals (roxterm
> and xterm), so I'm fairly sure it's GHC that's the problem. Have I missed a
> setting?
>
> GHCi, version 6.10.4
> Prelude> putStrLn "£"
> �
>
> Hugs98 200609-3
> Hugs> putStrLn "£"
> £

ghc-6.10.4 and earlier don't automatically encode/decode Unicode characters. So on terminals which don't use the latin-1 encoding, you need to do the conversion explicitly with a separate package such as utf8-string, iconv or text-icu. For example, on OS X:

    $ echo $LANG
    en_US.UTF-8
    $ ghci
    Prelude> putStrLn "£"
    ?
    Prelude> System.IO.UTF8.putStrLn "£"
    £

The conversion is done automatically by hugs, which is why the outputs differ. This feature will also be supported in ghc-6.12.

-Judah
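For readers on GHC 6.12 or later, where the IO library gained encoding support, a sketch along the following lines should make the encoding explicit rather than relying on the locale. It assumes the `hSetEncoding` and `utf8` exports of `System.IO` in that version's base library and will not compile on 6.10; `poundSign` is a name introduced here for illustration.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- The pound sign as a single Unicode character (U+00A3), written as a
-- decimal escape so the source file's own encoding does not matter.
poundSign :: String
poundSign = "\163"

main :: IO ()
main = do
  -- Assumption: GHC 6.12 or later. Force UTF-8 on stdout instead of
  -- relying on whatever encoding the locale selects.
  hSetEncoding stdout utf8
  putStrLn poundSign
```

On a UTF-8 terminal this should print the pound sign; on 6.10 and earlier the explicit conversion with a package such as utf8-string remains necessary, as described above.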
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
Both of these happen on Windows 7 64-bit for me too.

On Wed, Aug 19, 2009 at 7:08 PM, Iain Barnett wrote:

> 2009/8/19 David Leimbach
>
>> Interesting... GHCI bug? Didn't the readline dependency go away not too
>> long ago? Could it be related?
>
> I just tried this
>
> Prelude> putStrLn "\£"
> ghc: panic! (the 'impossible' happened)
>   (GHC version 6.10.4 for i386-unknown-linux):
>         charType: '\163'
>
> Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
>
> So perhaps I should put in a bug report, as that shouldn't happen (it
> doesn't with some other characters I tried), unless anyone has a different
> idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's
> something to do with my setup?
>
> Iain

--
Sebastian Sylvan
Re: Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC
2009/8/19 Bulat Ziganshin

> probably, terminals reports some unusual symbols. but any panic should
> be reported to GHC Trac - anyway

I've added a new ticket here, in case you feel you want to add to it (or not :)

http://hackage.haskell.org/trac/ghc/ticket/3443

Iain
Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC
Hello Alexander,

Wednesday, August 19, 2009, 10:16:26 PM, you wrote:

> Could it be a terminfo problem of some sort? It seems suspicious that
> there is a difference between terminals.

Probably the terminals report some unusual symbols. But any panic should be reported to GHC Trac anyway.

--
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
On Wed, Aug 19, 2009 at 11:08 AM, Iain Barnett wrote:

> 2009/8/19 David Leimbach
>>
>> Interesting... GHCI bug? Didn't the readline dependency go away not too
>> long ago? Could it be related?
>
> I just tried this
>
> Prelude> putStrLn "\£"
> ghc: panic! (the 'impossible' happened)
>   (GHC version 6.10.4 for i386-unknown-linux):
>         charType: '\163'
>
> Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
>
> So perhaps I should put in a bug report, as that shouldn't happen (it
> doesn't with some other characters I tried), unless anyone has a different
> idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's
> something to do with my setup?
>
> Iain

I can reproduce the panic on both urxvt and uxterm. On urxvt, GHCi works correctly with putStrLn "£". On uxterm, it just prints a blank space. I can type the GBP sign into both terminals.

Could it be a terminfo problem of some sort? It seems suspicious that there is a difference between terminals.

Alex
Re: [Haskell-cafe] gbp sign showing as unknown character by GHC
2009/8/19 David Leimbach

> Interesting... GHCI bug? Didn't the readline dependency go away not too
> long ago? Could it be related?

I just tried this

Prelude> putStrLn "\£"
ghc: panic! (the 'impossible' happened)
  (GHC version 6.10.4 for i386-unknown-linux):
        charType: '\163'

Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

So perhaps I should put in a bug report, as that shouldn't happen (it doesn't with some other characters I tried), unless anyone has a different idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's something to do with my setup?

Iain
[Haskell-cafe] gbp sign showing as unknown character by GHC
Quick question: I've tested this in a couple of different terminals (roxterm and xterm), so I'm fairly sure it's GHC that's the problem. Have I missed a setting?

GHCi, version 6.10.4
Prelude> putStrLn "£"
�

Hugs98 200609-3
Hugs> putStrLn "£"
£

I get the same character output from a password generator I've written, after compilation with GHC:

[iainb]$ ./makepass2 50 2 >> testfile.txt
[iainb]$ cat testfile.txt
H(xW!:maNyxZ;h,IW=Uu4G$ztc>k...@q[g6?y:�TbG&5Nd")+"5+

Iain