Re: RFC 312 (v1) Unicode Combinatorix

Bryan C. Warnock Mon, 25 Sep 2000 19:10:39 -0700

Someone's been busy...

On Mon, 25 Sep 2000, Simon Cozens wrote:
> Data which comes in through a line discipline B<must> be in UTF8, unless
> C<no unicode> is in force.

{snip}

> C<no unicode> just throws everything. None of the above happens.

What does "just throws everything" mean?

In RFC 294, data is internally stored in UTF8, no questions asked.
Line disciplines also assume the transformation to UTF8, but
there is a brief mention of a non-migration case.

My first question is, I guess, which is it?  Either all data will be
UTF8, or it won't.  What would be the exception cases?  What is the
impact of those exception cases?  Is only data which is UTF8 capable of
sorts and compares and regexes?

My second question: what *does* happen in C<no unicode> mode?
Up until recently, about half my non-one-liners dealt more with binary
data than textual data.  You mention that there will be this pragma so
that I may run against my data in its birthday suit, but you give zero
details as to how this fits in the UTF8 scheme.  Minds that need
inquiring want to know!

  -- 
Bryan C. Warnock
([EMAIL PROTECTED])
