On Oct 2, 2007, at 3:01 PM, Twan van Laarhoven wrote:
Lots of people wrote:
> I want a UTF-8 bikeshed!
> No, I want a UTF-16 bikeshed!
What the heck does it matter what encoding the library uses
internally? I expect the interface to be something like (from my own
CompactString library):
> fromByteString :: Encoding -> ByteString -> UnicodeString
> toByteString :: Encoding -> UnicodeString -> ByteString
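To make the point about the interface concrete, here is a minimal sketch of how those two signatures can hide the internal representation. It is not CompactString's actual code: the Encoding constructors, the newtype wrapper, and the use of the text package as the internal storage are all assumptions made for illustration.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical set of external encodings (an assumption, not a real library's type).
data Encoding = UTF8 | UTF16LE | UTF16BE
  deriving (Eq, Show)

-- The wrapper hides the internal representation. In a real module the
-- constructor would not be exported, so the storage could be swapped
-- (UTF-8, UTF-16, ...) without changing any caller.
newtype UnicodeString = UnicodeString T.Text
  deriving (Eq, Show)

-- Decode external bytes into the internal form.
-- Note: these text decoders throw an exception on malformed input.
fromByteString :: Encoding -> ByteString -> UnicodeString
fromByteString enc = UnicodeString . decode
  where
    decode = case enc of
      UTF8    -> TE.decodeUtf8
      UTF16LE -> TE.decodeUtf16LE
      UTF16BE -> TE.decodeUtf16BE

-- Encode the internal form back to external bytes.
toByteString :: Encoding -> UnicodeString -> ByteString
toByteString enc (UnicodeString t) = case enc of
  UTF8    -> TE.encodeUtf8 t
  UTF16LE -> TE.encodeUtf16LE t
  UTF16BE -> TE.encodeUtf16BE t

-- Usage: transcode "hi" from UTF-16LE bytes to UTF-8 bytes.
example :: ByteString
example = toByteString UTF8 (fromByteString UTF16LE (B.pack [0x68, 0x00, 0x69, 0x00]))
```

Callers only ever see Encoding, ByteString, and UnicodeString, so the API is the same whatever is stored inside. What changes is the cost: a conversion whose external encoding matches the internal one can be close to a copy, while any other conversion has to transcode.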
I agree: from an API perspective the internal encoding doesn't matter.
The only issue is efficiency for a particular encoding, and that
matters a lot.
I would suggest that we get a working library first. Either UTF-8 or
UTF-16 will do, as long as it works.
Even better would be to implement both (and perhaps more encodings),
and then benchmark them to get a sensible default. Then the choice
can be made available to the user as well, in case someone has
specific needs. But again: get it working first!
The problem is that the internal encoding can have a big effect on the
implementation of the library. It's better not to have to do it over
again if the first choice is not optimal.
I'm just trying to share the experience of the Unicode Consortium, the
ICU library contributors, and Apple, with the Haskell community. They,
and I personally, have many years of experience implementing support
for Unicode.
Anyway, I think we're starting to repeat ourselves...
Deborah
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe