I have implemented UTF8-encode/decode. Unlike the code someone has already
posted, it handles all UTF8 sequences, including those longer than 3 bytes.
It also catches all illegal UTF8 sequences (such as characters encoded
with a longer sequence than necessary). Here is the code.
On Tue, Apr 27, 2004 at 10:55:57AM +0200, George Russell wrote:
I have implemented UTF8-encode/decode. Unlike the code someone has already
posted it handles all UTF8 sequences, including those longer than 3 bytes.
It also catches all illegal UTF8 sequences (such as characters encoded
with a
David Brown wrote (snipped):
What license is your code covered under? As it stands now, it is an
informative example, but cannot be used by anybody.
As author, I am quite happy for it to be used and modified by other people
for non-commercial purposes. As far as I know my employers wouldn't
any
I am writing some utilities to deal with UTF-8 encoded text files (not
source). Currently, I'm just reading in the UTF-8 directly, and things
work reasonably well: since my parse tokens are ASCII, they are easy to
parse.
However, the character type seems perfectly happy with larger values for
On Mon, 2004-04-26 at 18:49, David Brown wrote:
Is anyone aware of any Haskell libraries for doing UTF-8 decoding and
encoding? If not, I'll write something simple.
The gtk2hs library uses the following functions internally.
Credit to Axel Simon I believe unless he swiped them from somewhere
Duncan Coutts wrote:
On Mon, 2004-04-26 at 18:49, David Brown wrote: [...]
toUTF :: String -> String
Hmmm, String -> [Word8] would be nicer...
fromUTF :: String -> String
... and here: [Word8] -> String or [Word8] -> Maybe String.
Furthermore, UTF-8 is not restricted to a maximum of 3 bytes per
On Mon, Apr 26, 2004 at 08:33:38PM +0200, Sven Panne wrote:
Duncan Coutts wrote:
On Mon, 2004-04-26 at 18:49, David Brown wrote: [...]
toUTF :: String -> String
Hmmm, String -> [Word8] would be nicer...
fromUTF :: String -> String
... and here: [Word8] -> String or [Word8] -> Maybe String.
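
For illustration, here is a minimal sketch of what an encoder and decoder with the suggested types (String -> [Word8] and [Word8] -> Maybe String) might look like. It is not the code posted in this thread, nor the gtk2hs code; the names encodeUTF8 and decodeUTF8 are made up for this example. It assumes the modern definition of UTF-8 (at most 4 bytes per character, code points up to U+10FFFF), and it rejects over-long encodings, truncated sequences, stray continuation bytes and encoded surrogates.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Encode a String as UTF-8, using 1 to 4 bytes per character.
-- (Hypothetical name. Surrogate code points, which a Haskell Char
-- can hold, are not filtered out here.)
encodeUTF8 :: String -> [Word8]
encodeUTF8 = concatMap (map fromIntegral . enc . ord)
  where
    enc :: Int -> [Int]
    enc c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]

    -- Continuation byte: 10xxxxxx carrying the low 6 bits of x.
    cont :: Int -> Int
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Decode UTF-8 bytes; Nothing signals any malformed input.
-- (Hypothetical name. Not tail-recursive, so meant for modest inputs.)
decodeUTF8 :: [Word8] -> Maybe String
decodeUTF8 [] = Just []
decodeUTF8 (b : bs)
  | b < 0x80              = (chr (fromIntegral b) :) <$> decodeUTF8 bs
  | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F) 0x80
  | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F) 0x800
  | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07) 0x10000
  | otherwise             = Nothing  -- stray continuation or invalid lead byte
  where
    -- n continuation bytes are expected; minVal is the smallest code
    -- point that genuinely needs a sequence of this length, so any
    -- smaller value is an over-long encoding.
    multi :: Int -> Word8 -> Int -> Maybe String
    multi n lead minVal
      | length cs == n
      , all isCont cs
      , v >= minVal
      , v <= 0x10FFFF
      , v < 0xD800 || v > 0xDFFF = (chr v :) <$> decodeUTF8 rest
      | otherwise                = Nothing
      where
        (cs, rest) = splitAt n bs
        v = foldl step (fromIntegral lead) cs
        step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)

    isCont :: Word8 -> Bool
    isCont c = c .&. 0xC0 == 0x80
```

With these definitions, decodeUTF8 (encodeUTF8 s) should give back Just s for any string without surrogates, while a two-byte encoding of NUL such as [0xC0, 0x80] is rejected as over-long.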