Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Iain Barnett
Got this back from the bug tracker:

> 6.12.1 will have Unicode support in the IO library which mostly fixes this
> problem.  The rest is fixed by #3398.


Iain
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Brandon S. Allbery KF8NH

On Aug 20, 2009, at 05:07 , Ketil Malde wrote:

>    % ghci -e 'map Data.Char.ord "饁"'
>    <interactive>:1:21:
>    lexical error in string/character literal at character '\129'
>
> but again:
>
>    % ghci -e 'map Data.Char.ord "£"'
>    [194,163]
>
> So GHCi used interactively translates input from the terminal's UTF-8,
> but truncates output to eight bits.  Executing a string with
> -e, it appears to read byte for byte (which I think was the original
> behavior at some point).



Makes sense; absent utf8-string, System.Environment.getArgs only groks  
bytes.
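
To make the byte-level view concrete, here is a rough sketch of turning a String whose Chars each hold one UTF-8 byte back into real code points. This is not how utf8-string is implemented; decodeBytes is a made-up name for illustration, and it only uses base.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)

-- Decode a String whose Chars each hold one UTF-8 byte (0..255).
-- Malformed or truncated sequences are passed through unchanged;
-- a production decoder would also reject overlong forms and surrogates.
decodeBytes :: String -> String
decodeBytes [] = []
decodeBytes (c : cs)
  | b < 0x80              = c : decodeBytes cs
  | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F)
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F)
  | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07)
  | otherwise             = c : decodeBytes cs
  where
    b = ord c
    -- n is the number of continuation bytes expected after the lead byte.
    multi n lead =
      let (ks, rest) = splitAt n cs
          v = foldl step lead ks
      in if length ks == n && all isCont ks && v <= 0x10FFFF
           then chr v : decodeBytes rest
           else c : decodeBytes cs
    step acc k = (acc `shiftL` 6) .|. (ord k .&. 0x3F)
    isCont k = (ord k .&. 0xC0) == 0x80

main :: IO ()
main = do
  print (map ord (decodeBytes "\194\163"))      -- the two bytes seen for the pound sign
  print (map ord (decodeBytes "\233\165\129"))  -- the three bytes of U+9941
```

Applied to the [194,163] seen with -e, this gives back the single code point 163.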


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH






Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Iain Barnett
2009/8/20 Ketil Malde 

>
> Stuart Cook  writes:
>
> > GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
> > Loading package base ... linking ... done.
> > Prelude> map Data.Char.ord "饁"
> > [39233]    <== 0x9941
> > Prelude> putStrLn "饁"
> > A <== 0x41
>
> > It seems that GHCi is clever enough to decode UTF-8 input, which only
> > serves to confuse System.IO even more.
>
> I get:
>
>    GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
>    Loading package base ... linking ... done.
>    Prelude> map Data.Char.ord "饁"
>    [39233]
>
> and
>
>    Prelude> map Data.Char.ord "£"
>    [163]
>
> but also:
>
>    % ghci -e 'map Data.Char.ord "饁"'
>    <interactive>:1:21:
>    lexical error in string/character literal at character '\129'
>
> but again:
>
>    % ghci -e 'map Data.Char.ord "£"'
>    [194,163]
>
> So GHCi used interactively translates input from the terminal's UTF-8,
> but truncates output to eight bits.  Executing a string with
> -e, it appears to read byte for byte (which I think was the original
> behavior at some point).
>
> -k
> --


I get the same behaviour here.

If you want to try Latin 1 (ISO-8859-1) then you can use a utility called
Luit (maybe only Linux?)

luit -encoding ISO-8859-1 ghci

£ becomes £, but gives the same byte output as above.

Iain


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Ketil Malde

Stuart Cook  writes:

> GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
> Loading package base ... linking ... done.
> Prelude> map Data.Char.ord "饁"
> [39233]    <== 0x9941
> Prelude> putStrLn "饁"
> A <== 0x41

> It seems that GHCi is clever enough to decode UTF-8 input, which only
> serves to confuse System.IO even more.

I get:

GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude> map Data.Char.ord "饁"
[39233]

and

Prelude> map Data.Char.ord "£"
[163]

but also:

% ghci -e 'map Data.Char.ord "饁"'
<interactive>:1:21:
lexical error in string/character literal at character '\129'

but again:

% ghci -e 'map Data.Char.ord "£"' 
[194,163]

So GHCi used interactively translates input from the terminal's UTF-8,
but truncates output to eight bits.  Executing a string with
-e, it appears to read byte for byte (which I think was the original
behavior at some point). 
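
The byte values above are what you would expect from UTF-8. As a small sketch (utf8Bytes is just an illustrative name, not a library function, and it does not reject surrogate code points), encoding the two characters by hand gives the same numbers:

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)

-- Encode one character as the list of its UTF-8 byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = [ 0xF0 .|. (n `shiftR` 18)
                  , cont (n `shiftR` 12)
                  , cont (n `shiftR` 6)
                  , cont n ]
  where
    n = ord c
    -- A continuation byte carries six payload bits under the 10xxxxxx prefix.
    cont x = 0x80 .|. (x .&. 0x3F)

main :: IO ()
main = do
  print (utf8Bytes '\163')    -- pound sign: [194,163]
  print (utf8Bytes '\39233')  -- U+9941: [233,165,129]
```

The third byte of U+9941 is 129, which is the '\129' the lexer complains about when -e hands it raw bytes.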

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Stuart Cook
On Thu, Aug 20, 2009 at 5:12 PM, Colin Paul Adams wrote:
> Yes, but surely this will work both ways. The same bytes on input
> should come back on output, shouldn't they?

I would have thought so, but apparently this isn't actually what happens.

GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Prelude> map Data.Char.ord "饁"
[39233]    <== 0x9941
Prelude> putStrLn "饁"
A <== 0x41

It seems that GHCi is clever enough to decode UTF-8 input, which only
serves to confuse System.IO even more.


Stuart


Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Bulat Ziganshin
Hello Colin,

Thursday, August 20, 2009, 11:12:53 AM, you wrote:

> Yes, but surely this will work both ways. The same bytes on input
> should come back on output, shouldn't they?

only the ascii subset has a fixed encoding. the rest may change in
some way



-- 
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com



Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-20 Thread Colin Paul Adams
> "Stuart" == Stuart Cook  writes:

Stuart> On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul Adams wrote:
>> But how do you get Latin-1 bytes from a Unicode string? This
>> would need a transcoding process.

Stuart> The first 256 code-points of Unicode coincide with
Stuart> Latin-1. Therefore, if you truncate Unicode characters
Stuart> down to 8 bits you'll effectively end up with Latin-1 text
Stuart> (except that any code points above U+00FF will give
Stuart> strange results).

Stuart> If your terminal then interprets these bytes as UTF-8 (or
Stuart> anything else, really), the result will be gibberish or
Stuart> worse.

Yes, but surely this will work both ways. The same bytes on input
should come back on output, shouldn't they?

-- 
Colin Adams
Preston Lancashire


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Stuart Cook
On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul Adams wrote:
> But how do you get Latin-1 bytes from a Unicode string? This would
> need a transcoding process.

The first 256 code-points of Unicode coincide with Latin-1. Therefore,
if you truncate Unicode characters down to 8 bits you'll effectively
end up with Latin-1 text (except that any code points above U+00FF
will give strange results).

If your terminal then interprets these bytes as UTF-8 (or anything
else, really), the result will be gibberish or worse.
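
A minimal sketch of that truncation, assuming (as this thread suggests, I have not checked the GHC source) that the old IO code simply keeps the low eight bits of each Char. truncate8 is a made-up name:

```haskell
import Data.Char (ord)

-- Keep only the low eight bits of each code point.
truncate8 :: String -> [Int]
truncate8 = map ((`mod` 256) . ord)

main :: IO ()
main = do
  print (truncate8 "\163")    -- U+00A3 fits in eight bits: [163]
  print (truncate8 "\39233")  -- U+9941 loses its high byte: [65], i.e. 'A'
```

The single byte 163 is valid Latin-1 for the pound sign, but on its own it is not a valid UTF-8 sequence, so a UTF-8 terminal has nothing sensible to draw.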


Stuart


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Colin Paul Adams
> "Bulat" == Bulat Ziganshin  writes:

Bulat> Hello Colin,
Bulat> Thursday, August 20, 2009, 10:13:28 AM, you wrote:

>> I don't understand where latin-1 comes into this. String is supposed
>> to be a list of Unicode characters.

Bulat> but ghc 6.10 i/o used String as list of bytes

But how do you get Latin-1 bytes from a Unicode string? This would
need a transcoding process.
-- 
Colin Adams
Preston Lancashire


Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Bulat Ziganshin
Hello Colin,

Thursday, August 20, 2009, 10:13:28 AM, you wrote:

> I don't understand where latin-1 comes into this. String is supposed
> to be a list of Unicode characters.

but ghc 6.10 i/o used String as list of bytes


-- 
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com



Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Colin Paul Adams
> "Judah" == Judah Jacobson  writes:

Judah> On Wed, Aug 19, 2009 at 10:31 AM, Iain Barnett wrote:
>> Quick question: I've tested this in a couple of different
>> terminals (roxterm and xterm), so I'm fairly sure it's GHC
>> that's the problem. Have I missed a setting?
>> GHCi, version 6.10.4
>> Prelude> putStrLn "£"
>> �
>> Hugs98 200609-3
>> Hugs> putStrLn "£"
>> £
>>

Judah> ghc-6.10.4 and earlier don't automatically encode/decode Unicode
Judah> characters.  So on terminals which don't use the latin-1
Judah> encoding, you need to do the conversion explicitly with a
Judah> separate package such as utf8-string, iconv or text-icu.

I don't understand where latin-1 comes into this. String is supposed
to be a list of Unicode characters.
-- 
Colin Adams
Preston Lancashire


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Judah Jacobson
On Wed, Aug 19, 2009 at 10:31 AM, Iain Barnett wrote:
> Quick question: I've tested this in a couple of different terminals (roxterm
> and xterm), so I'm fairly sure it's GHC that's the problem. Have I missed a
> setting?
> GHCi, version 6.10.4
> Prelude> putStrLn "£"
> �
> Hugs98 200609-3
> Hugs> putStrLn "£"
> £
>

ghc-6.10.4 and earlier don't automatically encode/decode Unicode
characters.  So on terminals which don't use the latin-1 encoding, you
need to do the conversion explicitly with a separate package such as
utf8-string, iconv or text-icu.  For example, on OS X:

$ echo $LANG
en_US.UTF-8
$ ghci
Prelude> putStrLn "£"
?
Prelude> System.IO.UTF8.putStrLn "£"
£

The conversion is done automatically by hugs, which is why the outputs
differ.  This feature will also be supported in ghc-6.12.
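
For reference, a sketch of what the explicit setup might look like once that support is available. It assumes a System.IO that exports hSetEncoding and utf8, which means GHC 6.12 or later; it will not compile on 6.10.4. The name pound is only for illustration.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- The pound sign written as an escape, so the source file's own
-- encoding does not matter.
pound :: String
pound = "\163"

main :: IO ()
main = do
  -- Force UTF-8 on stdout regardless of the locale.
  hSetEncoding stdout utf8
  putStrLn pound
```

With a UTF-8 locale the hSetEncoding call should be unnecessary, since the handle picks up the locale encoding by default; forcing it only matters when LANG is unset or set to something else.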

-Judah


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Sebastian Sylvan
Both of these happen on Windows 7 64-bit for me too.

On Wed, Aug 19, 2009 at 7:08 PM, Iain Barnett  wrote:

>
>
> 2009/8/19 David Leimbach 
>
>> Interesting... GHCI bug?  Didn't the readline dependency go away not too
>> long ago?  Could it be related?
>>
>
> I just tried this
>
> Prelude> putStrLn "\£"
> ghc: panic! (the 'impossible' happened)
>   (GHC version 6.10.4 for i386-unknown-linux):
> charType: '\163'
>
> Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
>
>
> So perhaps I should put in a bug report, as that shouldn't happen (it
> doesn't with some other characters I tried), unless anyone has a different
> idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's
> something to do with my setup?
>
> Iain


-- 
Sebastian Sylvan


Re: Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Iain Barnett
2009/8/19 Bulat Ziganshin 

>
> probably, terminals reports some unusual symbols. but any panic should
> be reported to GHC Trac - anyway
>
>
I've added a new ticket here, in case you feel you want to add to it (or not
:)
http://hackage.haskell.org/trac/ghc/ticket/3443


Iain


Re[2]: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Bulat Ziganshin
Hello Alexander,

Wednesday, August 19, 2009, 10:16:26 PM, you wrote:

> Could it be a terminfo problem of some sort? It seems suspicious that
> there is a difference between terminals.

probably the terminals report some unusual symbols. but any panic should
be reported to GHC Trac anyway


-- 
Best regards,
 Bulat                          mailto:bulat.zigans...@gmail.com



Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Alexander Dunlap
On Wed, Aug 19, 2009 at 11:08 AM, Iain Barnett wrote:
>
>
> 2009/8/19 David Leimbach 
>>
>> Interesting... GHCI bug?  Didn't the readline dependency go away not too
>> long ago?  Could it be related?
>
> I just tried this
> Prelude> putStrLn "\£"
> ghc: panic! (the 'impossible' happened)
>   (GHC version 6.10.4 for i386-unknown-linux):
> charType: '\163'
> Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
>
> So perhaps I should put in a bug report, as that shouldn't happen (it
> doesn't with some other characters I tried), unless anyone has a different
> idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's
> something to do with my setup?
> Iain

I can reproduce the panic on both urxvt and uxterm.

On urxvt, GHCi works correctly with putStrLn "£". On uxterm, it just
prints a blank space.

I can type the GBP sign into both terminals.

Could it be a terminfo problem of some sort? It seems suspicious that
there is a difference between terminals.

Alex


Re: [Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Iain Barnett
2009/8/19 David Leimbach 

> Interesting... GHCI bug?  Didn't the readline dependency go away not too
> long ago?  Could it be related?
>

I just tried this

Prelude> putStrLn "\£"
ghc: panic! (the 'impossible' happened)
  (GHC version 6.10.4 for i386-unknown-linux):
charType: '\163'

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug


So perhaps I should put in a bug report, as that shouldn't happen (it
doesn't with some other characters I tried), unless anyone has a different
idea? I'm running Arch Linux with xmonad and using roxterm, so perhaps it's
something to do with my setup?

Iain


[Haskell-cafe] gbp sign showing as unknown character by GHC

2009-08-19 Thread Iain Barnett
Quick question: I've tested this in a couple of different terminals (roxterm
and xterm), so I'm fairly sure it's GHC that's the problem. Have I missed a
setting?
GHCi, version 6.10.4
Prelude> putStrLn "£"
�

Hugs98 200609-3
Hugs> putStrLn "£"
£


I get the same character output from a password generator I've written,
after compilation with GHC

[iainb]$ ./makepass2 50 2 >> testfile.txt
[iainb]$ cat testfile.txt
H(xW!:maNyxZ;h,IW=Uu4G$ztc>k...@q[g6?y:�TbG&5Nd")+"5+


Iain