> On Jun 1, 2016, at 10:43 AM, Kamil Cholewiński <[email protected]> wrote:
> 
>> On Wed, 01 Jun 2016, Ben Woolley <[email protected]> wrote:
>> That is the reason why I am erring on the side of 5% this time.
> 
> The 95% use case here is handling UTF8-encoded Unicode text. Secure by
> default should be the norm, not a magic flag, not buried in a readme.
> 

Yes, that is what I am suggesting for libutf. I believe we have the same 
concern. 

> If you need to encode an arbitrarily large integer into a stream of
> bytes, then use a library specifically designed for encoding arbitrarily
> large integers into streams of bytes.
> 

Yes, that is what I am suggesting for "libctf", and that it not be called UTF. 
Then the encoding expert making the next encoding update will hopefully be the 
only one messing with it. I could have used a "libctf" before, when updating an 
app beyond what was available in the libraries I was stuck with. 

> Yes, we're making up problems.

Or are we ultimately agreeing? :)

The reason I am looking at this on a several-year time span is this: how 
often do people review encoding implementations? Probably once every 5 years. 
With changes every 7 years to the standard, there is a need for random Joe to 
be able to glance at a libutf and see the quirks in a wrapper, and not have to 
touch a slightly convoluted transformation function just to see if a range is 
handled properly.

I am basing these thoughts on things that I have actually done. For example, at 
one company, I worked on a statistical dictionary compressor that placed its 
symbols in between "CTF" ranges. That was essentially a libctfcomp library that 
could be consumed by an unaltered libutf. That way, the change can be made in a 
secure way more easily. I have worked with UTF-8 at this level in 3 different 
companies already. Maybe there is a real need for a libctf. 
