Martin J. Dürst wrote:
> Assuming (conservatively) that it will take about a century to fill up
> all 17 (well, actually 15, because two are private) planes, this would
> give us another century.
Current estimates seem to indicate that 800 years is closer to the mark.
--
Doug Ewell | Thornton,
Martin J. Dürst wrote:
> Sorry to be late with this, but if 20.1 bits turn out to not be enough,
> what about 21 bits?
>
> That would still limit UTF-8 to four bytes, but would almost double the
> code space. Assuming (conservatively) that it will take about a century
> to fill up all 17
On Mon, 5 Jun 2017 13:08:06 +0900
"Martin J. Dürst via Unicode" wrote:
> On 2017/06/02 04:54, Doug Ewell via Unicode wrote:
> > Richard Wordingham wrote:
> >
> >> even supporting 6-byte patterns just in case 20.1 bits eventually
> >> turn out not to be enough,
>
>
On Sun, Jun 4, 2017 at 9:13 PM Martin J. Dürst via Unicode <
unicode@unicode.org> wrote:
> Sorry to be late with this, but if 20.1 bits turn out to not be enough,
> what about 21 bits?
>
> That would still limit UTF-8 to four bytes, but would almost double the
> code space. Assuming
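
For anyone who wants to check the arithmetic behind "20.1 bits" and "almost double", here is a rough sketch in Python. The constant and variable names are only illustrative; the figures assumed are the 17 planes of 65,536 code points each and the 3 + 6 + 6 + 6 payload bits of a four-byte UTF-8 sequence.

```python
import math

# Current code space: 17 planes of 0x10000 code points (U+0000..U+10FFFF).
UNICODE_CODE_POINTS = 17 * 0x10000

# Payload bits of a four-byte UTF-8 sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
FOUR_BYTE_PAYLOAD_BITS = 3 + 6 + 6 + 6

# Bits needed to number every code point in the current code space.
bits_now = math.log2(UNICODE_CODE_POINTS)

# Size of a hypothetical 21-bit code space, and how it compares to today's.
space_21 = 2 ** FOUR_BYTE_PAYLOAD_BITS
growth = space_21 / UNICODE_CODE_POINTS
planes_21 = space_21 // 0x10000

print(f"current code space: {bits_now:.4f} bits")
print(f"21-bit code space:  {planes_21} planes, {growth:.3f}x the current size")
```

Under those assumptions a full 21 bits would give 32 planes instead of 17, a growth factor of 32/17, which is where "almost double" comes from.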
On 2017/06/02 04:54, Doug Ewell via Unicode wrote:
Richard Wordingham wrote:
even supporting 6-byte patterns just in case 20.1 bits eventually turn
out not to be enough,
Sorry to be late with this, but if 20.1 bits turn out to not be enough,
what about 21 bits?
That would still limit
On 6/1/2017 8:32 PM, Richard Wordingham via Unicode wrote:
TUS Section 3 is like the Augean Stables. It is a complete mess as a
standards document,
That is a matter of editorial taste, I suppose.
imputing mental states to computing processes.
That, however, is false. The rhetorical turn
On Thu, 1 Jun 2017 19:19:51 -0700
Ken Whistler via Unicode wrote:
> > and therefore should start a
> > sequence of 6 characters.
>
> That is completely false, and has nothing to do with the current
> definition of UTF-8.
>
> The current, normative definition of UTF-8,
On 6/1/2017 6:21 PM, Richard Wordingham via Unicode wrote:
By definition D39b, either sequence of bytes, if encountered by a
conformant UTF-8 conversion process, would be interpreted as a
sequence of 6 maximal subparts of an ill-formed subsequence.
("D39b" is a typo for "D93b".)
Sorry about
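
The "sequence of 6 maximal subparts" reading can be tried out with a small Python sketch. This only shows what CPython's built-in decoder does with errors="replace", which appears to follow the one-U+FFFD-per-maximal-subpart practice; it is not a statement of what every conformant converter must emit. The helper name count_replacements is made up for this illustration.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def count_replacements(data: bytes) -> int:
    """Decode leniently and count the U+FFFD characters produced."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


# The two byte sequences discussed in the thread.
old_six_byte_form = bytes.fromhex("FC 80 80 80 80 80")
all_ff = bytes.fromhex("FF FF FF FF FF FF")

for label, data in (("FC 80 80 80 80 80", old_six_byte_form),
                    ("FF FF FF FF FF FF", all_ff)):
    print(label, "->", count_replacements(data), "replacement characters")

# A strict decode rejects both sequences outright.
for data in (old_six_byte_form, all_ff):
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        print("strict decode failed:", exc.reason)
```

In this decoder neither FC nor FF is accepted as a lead byte and a bare 80 cannot start a sequence, so each of the six bytes is replaced separately in both inputs.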
On Thu, 1 Jun 2017 17:10:54 -0700
Ken Whistler via Unicode wrote:
> Well, working from the *current* specification:
>
> FC 80 80 80 80 80
> and
> FF FF FF FF FF FF
>
> are equal trash, uninterpretable as *anything* in UTF-8.
>
> By definition D39b, either sequence of
On Thu, 1 Jun 2017 17:10:54 -0700
Ken Whistler via Unicode wrote:
> On 6/1/2017 2:39 PM, Richard Wordingham via Unicode wrote:
> > You were implicitly invited to argue that there was no need to
> > handle 5 and 6 byte invalid sequences.
> >
>
> Well, working from the
On 6/1/2017 2:39 PM, Richard Wordingham via Unicode wrote:
You were implicitly invited to argue that there was no need to handle
5 and 6 byte invalid sequences.
Well, working from the *current* specification:
FC 80 80 80 80 80
and
FF FF FF FF FF FF
are equal trash, uninterpretable as
This is still very unlikely to occur. There is a lot of discussion about
emoji, but they still don't count for much in the total.
The major updates were expected for CJK sinograms, but even the rate of
updates has slowed down, and we will eventually have another
sinographic plane, but it will not come soon
On Thu, 01 Jun 2017 12:54:45 -0700
Doug Ewell via Unicode wrote:
> Richard Wordingham wrote:
>
> > even supporting 6-byte patterns just in case 20.1 bits eventually
> > turn out not to be enough,
>
> Oh, gosh, here we go with this.
You were implicitly invited to argue
13 matches