Re[2]: [Haskell-cafe] I/O and utf8

2006-01-12 Thread Bulat Ziganshin
Hello Einar,

Wednesday, January 11, 2006, 6:14:44 PM, you wrote:

EK> Do you plan on supporting things like HTTP where the character set
EK> is only known in the middle of the parsing?

yes, it is supported; see Examples/Encoding.hs in
http://freearc.narod.ru/Binary.tar.gz :

 h <- openWithEncoding latin1 =<< openBinaryFile "test" ReadMode
 print =<< vGetLine h
 vSetEncoding h utf8
 print =<< vGetLine h
 vSetEncoding h latin1
 print =<< vGetLine h
 vClose h

it's not optimized currently. if you need more speed, just yell
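
A rough sketch of the same mid-stream switch using only System.IO from
GHC's base (hSetEncoding, latin1, utf8), for a GHC recent enough to
provide them. This is not the Binary/Streams API shown above. The name
readMixed and the file layout (one Latin-1 line followed by one UTF-8
line) are assumptions made up for illustration.

```haskell
import System.IO

-- Hypothetical helper: read the first line as Latin-1 and the second
-- line as UTF-8 from the same handle, changing the decoder in between.
readMixed :: FilePath -> IO (String, String)
readMixed path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h latin1      -- header part: charset not known yet
    headerLine <- hGetLine h
    hSetEncoding h utf8        -- charset learned from the header
    bodyLine <- hGetLine h
    return (headerLine, bodyLine)
```

In an HTTP-like setting the second hSetEncoding call would be driven by
whatever charset the already-parsed header names.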


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] I/O and utf8

2006-01-11 Thread Einar Karttunen
On 10.01 10:25, Bulat Ziganshin wrote:
> i have a question about this issue - i also want to provide an
> autodetection mechanism, which relies on the first bytes of a text file to
> set the proper encoding. what are the standard rules for marking the
> utf8/utf16 encoding of a text file in these first bytes?

The BOM is used to mark the encoding
(http://en.wikipedia.org/wiki/Byte_Order_Mark), but most
UTF-8 streams lack it. I have not seen it used in UTF-8 files either.
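
A minimal sketch of BOM-based detection, assuming the input is already
available as a list of bytes. The byte signatures are the standard ones
(EF BB BF for UTF-8, FE FF for UTF-16 big-endian, FF FE for UTF-16
little-endian); the names Encoding and detectBOM are made up for this
example. Since most UTF-8 data carries no BOM, a Nothing result only
means "no BOM", not "not Unicode".

```haskell
import Data.Word (Word8)

-- Hypothetical type for this sketch; not taken from any library.
data Encoding = UTF8 | UTF16BE | UTF16LE
  deriving (Eq, Show)

-- Look at the leading bytes. On a match, return the encoding together
-- with the remaining bytes after the BOM; otherwise return Nothing.
-- Caveat: FF FE 00 00 is also the UTF-32LE BOM, which this sketch
-- reports as UTF-16LE.
detectBOM :: [Word8] -> Maybe (Encoding, [Word8])
detectBOM (0xEF : 0xBB : 0xBF : rest) = Just (UTF8, rest)
detectBOM (0xFE : 0xFF : rest)        = Just (UTF16BE, rest)
detectBOM (0xFF : 0xFE : rest)        = Just (UTF16LE, rest)
detectBOM _                           = Nothing
```

A caller would fall back to some default (or a caller-supplied encoding)
when detectBOM returns Nothing.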

Do you plan on supporting things like HTTP where the character set
is only known in the middle of the parsing?

- Einar Karttunen


RE: Re[2]: [Haskell-cafe] I/O and utf8

2006-01-10 Thread Bayley, Alistair
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Bulat Ziganshin
> 
> i have a question about this issue - i also want to provide an
> autodetection mechanism, which relies on the first bytes of a text file to
> set the proper encoding. what are the standard rules for marking the
> utf8/utf16 encoding of a text file in these first bytes?


Are you asking about the byte-order-mark in UTF encodings?
  http://www.unicode.org/faq/utf_bom.html#BOM

Note that UTF8 files typically lack the BOM, as UTF8 is meant to be
backwards-compatible with US7ASCII, I think. Windows Notepad is one of
the few programs that will insert it if a text file is saved as UTF8.
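
The ASCII compatibility comes from the encoding itself: every code point
below 0x80 is written as one byte with the same value, so a pure ASCII
file is already valid UTF-8 without any marker. A small sketch of a
UTF-8 encoder showing this (encodeChar is a made-up name; surrogate
code points are not rejected here):

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one character as UTF-8 bytes.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]                       -- ASCII: unchanged
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]                -- 2 bytes
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]       -- 3 bytes
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]  -- 4 bytes
  where
    n = ord c
    -- leading-byte payload: the bits above position s
    hi s = fromIntegral (n `shiftR` s)
    -- continuation byte: 10xxxxxx carrying six bits starting at position s
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)
```

For example, encodeChar 'A' gives the single byte 0x41, while
encodeChar '\1071' gives the two bytes 0xD0 0xAF.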

Alistair.


Re[2]: [Haskell-cafe] I/O and utf8

2006-01-10 Thread Bulat Ziganshin
Hello John,

Tuesday, January 10, 2006, 2:08:44 AM, you wrote:

>> i want to read a file encoded in utf8 and at a later time output portions of it
>> on the console. Is there an easy way to do this in haskell? using the standard
>> i/o functions i can read the file but the output gives me \1071 ... instead of
>> the unicode characters.

JM> Jhc does all of its IO in utf8. CharIO is a drop in replacement for the
JM> standard prelude routines which converts everything to and from UTF8

JM> http://repetae.net/john/repos/jhc/CharIO.hs
JM> http://repetae.net/john/repos/jhc/UTF8.hs

btw, i plan to add this functionality to my Binary/Streams library,
based on your code, John. so it will work something like:

unicode_stdout <- openWithEncoding unicode stdout
vPutStrLn unicode_stdout "it's a test"

i have a question about this issue - i also want to provide an
autodetection mechanism, which relies on the first bytes of a text file to
set the proper encoding. what are the standard rules for marking the
utf8/utf16 encoding of a text file in these first bytes?



-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]





Re: [Haskell-cafe] I/O and utf8

2006-01-09 Thread John Meacham
On Sun, Jan 08, 2006 at 11:26:05AM +, Andreas Kägi wrote:
> hello
> i want to read a file encoded in utf8 and at a later time output portions of it
> on the console. Is there an easy way to do this in haskell? using the standard
> i/o functions i can read the file but the output gives me \1071 ... instead of
> the unicode characters.

Jhc does all of its IO in utf8. CharIO is a drop in replacement for the
standard prelude routines which converts everything to and from UTF8

http://repetae.net/john/repos/jhc/CharIO.hs
http://repetae.net/john/repos/jhc/UTF8.hs
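
For readers who only want the idea behind the "from UTF8" direction,
here is a small standalone decoder sketch. It is not the code from
UTF8.hs; decodeUtf8 is a made-up name, and as a simplification it does
not reject overlong forms or surrogates. Malformed input is replaced by
U+FFFD.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Decode a list of UTF-8 bytes into a String.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b : bs)
  | b < 0x80           = chr (fromIntegral b) : decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F)   -- 110xxxxx: 1 continuation
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F)   -- 1110xxxx: 2 continuations
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07)   -- 11110xxx: 3 continuations
  | otherwise          = '\xFFFD' : decodeUtf8 bs
  where
    -- Take n continuation bytes and combine them with the lead bits.
    multi :: Int -> Word8 -> String
    multi n lead
      | length conts == n && all isCont conts && code <= 0x10FFFF
                  = chr code : decodeUtf8 rest
      | otherwise = '\xFFFD' : decodeUtf8 bs    -- skip the bad lead byte
      where
        (conts, rest) = splitAt n bs
        code = foldl step (fromIntegral lead) conts :: Int
    -- continuation bytes look like 10xxxxxx
    isCont x = x .&. 0xC0 == 0x80
    -- shift in six more payload bits
    step acc x = (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F)
```

Decoding the two bytes 0xD0 0xAF yields the single character '\1071'.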

John

-- 
John Meacham - ⑆repetae.net⑆john⑈ 


[Haskell-cafe] I/O and utf8

2006-01-08 Thread Andreas Kägi
hello
i want to read a file encoded in utf8 and at a later time output portions of it
on the console. Is there an easy way to do this in haskell? using the standard
i/o functions i can read the file but the output gives me \1071 ... instead of
the unicode characters. 
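
A note on where the \1071 comes from: show (and therefore print) renders
a String as a Haskell literal and writes every non-ASCII character as a
numeric escape, whereas putStr writes the characters themselves. A
minimal sketch, assuming a GHC whose base provides
System.IO.hSetEncoding (6.12 or later); asLiteral and putUtf8Ln are
made-up names:

```haskell
import System.IO

-- What 'print' would display: the string as a Haskell literal,
-- with non-ASCII characters escaped (e.g. \1071).
asLiteral :: String -> String
asLiteral = show

-- Write the characters themselves, encoded as UTF-8 on stdout
-- regardless of the current locale.
putUtf8Ln :: String -> IO ()
putUtf8Ln s = do
  hSetEncoding stdout utf8
  putStrLn s
```

So the usual fix is to output with putStr/putStrLn rather than print,
and to make sure the handle uses the encoding the console expects.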


