Re: unicode/internalization issues

2006-03-28 Thread John Meacham
On Sun, Mar 26, 2006 at 03:22:38PM +0400, Bulat Ziganshin wrote:
> 3. Unicode support in I/O routines, i.e. the ability to read/write UTF-8
> encoded files and files that use other Unicode byte encodings: not
> implemented in any compiler, afaik, but there are 3rd-party libs:
> the Streams library, the New I/O library, and even the CharIO module
> from the jhc sources

Programs compiled by jhc use the locale set by the system, so they
support any encoding your C libraries support.
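
To illustrate what choosing an encoding for I/O looks like from the
program's side, here is a minimal sketch. It rests on assumptions: it
uses the `System.IO` encoding functions (`hSetEncoding`, `utf8`,
`localeEncoding`) that appeared in later GHC releases, not anything from
jhc or from the libraries named in this thread, and the helper names
`writeFileWith` and `readFileWith` are hypothetical.

```haskell
import System.IO

-- Hypothetical helper: write a String to a file in the given encoding.
writeFileWith :: TextEncoding -> FilePath -> String -> IO ()
writeFileWith enc path str =
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStr h str

-- Hypothetical helper: read a whole file, decoding it with the given encoding.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    s <- hGetContents h
    -- Force the lazy string before withFile closes the handle.
    length s `seq` return s
```

Passing `localeEncoding` instead of `utf8` would pick up whatever the
system locale specifies, which is roughly the behaviour described here
for jhc-compiled programs.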

John

-- 
John Meacham - ⑆repetae.net⑆john⑈
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://haskell.org/mailman/listinfo/haskell-prime


Re: unicode/internalization issues

2006-03-26 Thread Ross Paterson
On Sun, Mar 26, 2006 at 03:22:38PM +0400, Bulat Ziganshin wrote:
> Some time ago I planned to open a unicode/internationalization wiki
> page that reflects the current state of the art in this area. Here is
> the information I have; please add to it or correct me if I have
> missed something or got it wrong.

You might want to look at:
http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/CharAsUnicode
http://haskell.galois.com/cgi-bin/haskell-prime/trac.cgi/wiki/UnicodeInHaskellSource



unicode/internalization issues

2006-03-26 Thread Bulat Ziganshin
Hello haskell-prime,

Some time ago I planned to open a unicode/internationalization wiki
page that reflects the current state of the art in this area. Here is
the information I have; please add to it or correct me if I have
missed something or got it wrong.

1. Char supports the full Unicode range (about a million code points)
instead of just 8-bit characters: implemented in GHC 6.0 and Hugs 2005
(I'm not sure about the exact versions).

2. Character classification/conversion routines in Data.Char: all
Unicode chars are handled properly starting from GHC 6.4 and Hugs 2005.
The author of this update, Dmitry Golubovsky, also provides it as an
additional lib for GHC 6.2.2, and I think his work could be extended
to any compiler supporting "wide" Chars.

3. Unicode support in I/O routines, i.e. the ability to read/write UTF-8
encoded files and files that use other Unicode byte encodings: not
implemented in any compiler, afaik, but there are 3rd-party libs:
the Streams library, the New I/O library, and even the CharIO module
from the jhc sources.

4. Support for UTF-8 encoded source files: implemented in GHC 6.5 and
jhc. Afaik, GHC's support is more advanced because it uses the
abovementioned routines to classify Chars, so you can use any national
characters in identifiers according to their case, and all other
symbols in operators. Because GHC 6.5 supports ONLY UTF-8 encoded
source files, this creates some problems when compiling files that were
created for previous versions of GHC (or for other compilers) and that
use an 8-bit encoding with national (> chr 127) chars in comments and
especially in string literals. The GHC team has asked its users for the
best solution to this problem.
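
Points 1, 2 and 4 can be made concrete with a small sketch. It assumes a
compiler whose Data.Char routines are Unicode-aware (as described for
GHC 6.4 and later); the names `maxCodePoint`, `IdentKind`, `identKind`
and `capitalize` are hypothetical, and the classification is
deliberately simplified, so it should not be read as how any compiler's
lexer actually works.

```haskell
import Data.Char (isLower, isUpper, ord, toUpper)

-- Largest code point a Char can hold (0x10FFFF in GHC).
maxCodePoint :: Int
maxCodePoint = ord (maxBound :: Char)

-- Hypothetical, simplified view of how the case of the first character
-- decides what kind of identifier a name is.
data IdentKind = ConstructorLike | VariableLike | NotIdentifier
  deriving (Eq, Show)

identKind :: String -> IdentKind
identKind (c:_)
  | isUpper c             = ConstructorLike
  | isLower c || c == '_' = VariableLike
identKind _ = NotIdentifier

-- Uppercase the first character using the Unicode-aware toUpper.
capitalize :: String -> String
capitalize []     = []
capitalize (c:cs) = toUpper c : cs
```

With Unicode-aware classification, a name starting with a Cyrillic
capital letter is treated like a constructor name and one starting with
a Cyrillic small letter like a variable name, which is the "according
to their case" behaviour mentioned in point 4.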

If I haven't mentioned some issue regarding
unicode/internationalization here, please add it.


-- 
Best regards,
 Bulat
