At 4:10 AM -0700 8/4/04, Joshua Gatcomb wrote:
All:
After speaking with Dan in #parrot last night, I
either had originally misunderstood his position or he
has changed it (paraphrased):

We will ship Parrot with unicode support, but:

A.  The unicode support does not necessarily need to
be limited to a single library or ICU specifically.
B.  Just because CVS will have unicode support does
not mean the user will be forced to use it.
C.  Configure should detect a system unicode library
and do the right thing in choosing which one it uses.

Yup, you've got it. I *thought* that having ICU in would be more of a win than a loss. Given the hell this has been putting people through, I'm seriously changing my mind.


We have a single requirement -- Parrot, as shipped, *must* have a working Unicode solution. It won't have to be configured in when Parrot's built, but it must at least be configurable. Right now, that solution's ICU. Longer term, well... longer term I dunno.

So, here's the plan.

1) We beat up Configure to probe for and use the system ICU, if available. (Switches are needed right now; it should be automagic.)

2) I spec out the encoding and charset APIs for the loadable encoding and charset modules. (This is step one of teasing ICU out of the core)

3) We make Parrot's string system use the loadable encoding and charset system

4) We get non-unicode encodings and charsets in

5) We make ICU a loadable module tied into the proper encodings and charsets
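
For illustration only, here is a rough sketch of the kind of probe step 1 describes. It is written in Python rather than Configure's Perl, and it is not Parrot's actual Configure code. It assumes the system ICU ships the `icu-config` helper script; the function name `probe_icu`, the returned dictionary, and the fallback to a "bundled" copy are all hypothetical.

```python
import shutil
import subprocess


def probe_icu(which=shutil.which, run=subprocess.run):
    """Decide whether to build against a system ICU (hypothetical helper).

    `which` and `run` are injectable so the probe can be exercised
    without a real ICU installation.
    """
    bundled = {"source": "bundled", "version": None}

    # Assumption: a system ICU exposes the `icu-config` script on PATH.
    tool = which("icu-config")
    if tool is None:
        return bundled

    try:
        result = run([tool, "--version"],
                     capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # The script exists but does not run cleanly: do not trust it.
        return bundled

    return {"source": "system", "version": result.stdout.strip()}
```

The point of the sketch is that no switch is needed: the probe runs unconditionally and quietly falls back when nothing usable is found.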

Step 1 can be done by anyone willing to poke at the Configure Perl code. Step 2 needs me, and I'll get that done when I'm waiting for the train today. (No, don't ask.) Step 3 is the biggie here, as it touches a lot of string.c. Step 4's relatively easy (7-bit ASCII and binary'll be first :), and step 5 may or may not be straightforward, depending on how the design goes.
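
To make steps 2 through 4 a little more concrete, here is a minimal sketch of a loadable-encoding registry with 7-bit ASCII and binary as the first two modules. The API has not been specced yet, so everything here is an assumption: the `Encoding` interface, the `register_encoding` and `find_encoding` names, and the choice of Python are hypothetical and only show the general shape.

```python
class Encoding:
    """Hypothetical interface a loadable encoding module would implement."""
    name = "abstract"

    def decode(self, raw):
        raise NotImplementedError

    def encode(self, values):
        raise NotImplementedError


class AsciiEncoding(Encoding):
    name = "ascii"

    def decode(self, raw):
        values = list(raw)
        if any(v > 0x7F for v in values):
            raise ValueError("byte outside the 7-bit ASCII range")
        return values

    def encode(self, values):
        # `values` is expected to be a list of integer code points.
        if any(not 0 <= v <= 0x7F for v in values):
            raise ValueError("code point outside the 7-bit ASCII range")
        return bytes(values)


class BinaryEncoding(Encoding):
    name = "binary"

    def decode(self, raw):
        return list(raw)      # bytes pass through untouched

    def encode(self, values):
        return bytes(values)  # bytes() rejects values outside 0..255


_ENCODINGS = {}


def register_encoding(encoding):
    """Make an encoding module available to the string system."""
    _ENCODINGS[encoding.name] = encoding


def find_encoding(name):
    try:
        return _ENCODINGS[name]
    except KeyError:
        raise LookupError(f"no encoding module loaded for {name!r}") from None


register_encoding(AsciiEncoding())
register_encoding(BinaryEncoding())
```

With this shape, the string code only ever calls `find_encoding(...)`, so an ICU-backed module could later be registered the same way without the core knowing about ICU.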

I'd like to get work on steps 3 and 4 going quickly -- the sooner the better -- once the API design's done.

And yes, the API will support doing this in bytecode, though there'll be the obligatory performance penalty. So if someone later comes along and wants to reimplement the Unicode support in a Parrot language, well... that'd be keen, and we could toss ICU from the distribution entirely. (Though we'd still use it if there's a system version installed.)
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
