RE: Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-05 Thread Doug Ewell via Unicode
Martin J. Dürst wrote:

> Assuming (conservatively) that it will take about a century to fill up
> all 17 (well, actually 15, because two are private) planes, this would
> give us another century.

Current estimates seem to indicate that 800 years is closer to the mark.
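
For reference, the figures quoted in this thread ("20.1 bits", "almost double") can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it makes no attempt to model how quickly code points are actually assigned.

```python
import math

PLANES = 17              # planes 0 through 16
PLANE_SIZE = 0x10000     # 65,536 code points per plane

# Current code space: U+0000..U+10FFFF
current = PLANES * PLANE_SIZE

# Bits needed to index the current code space (the "20.1 bits" in the thread)
bits_needed = math.log2(current)

# A full 21-bit code space, as suggested in the quoted message
full_21_bit = 1 << 21
planes_21_bit = full_21_bit // PLANE_SIZE

# Growth factor ("almost double") and number of extra planes gained
growth = full_21_bit / current
extra_planes = planes_21_bit - PLANES

print(f"current code points : {current:,}")
print(f"bits needed         : {bits_needed:.2f}")
print(f"21-bit code points  : {full_21_bit:,}")
print(f"growth factor       : {growth:.2f}")
print(f"extra planes        : {extra_planes}")
```

A full 21 bits gives 32 planes, so the gain is 15 planes, the same number as the 15 non-private planes counted in the quoted message.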
 
--
Doug Ewell | Thornton, CO, US | ewellic.org




Re: CLDR 'B'

2017-06-05 Thread Peter Edberg via Unicode

> On Jun 5, 2017, at 1:20 AM, Neil Shadrach via Unicode wrote:
> 
> http://cldr.unicode.org/translation/date-time-patterns
> 
> How are 'B' values added for languages that do not have them?
> I cannot see an option for this in the survey tool which just refers to the 
> existing list.

If you want to override the inherited pattern for one of the 5 existing 'B'
skeletons (Bh, Bhm, Bhms, EBhm, EBhms), you should be able to do that with no
problem; please let us know if that does not work for you.

If in a particular locale you want to add another skeleton to the existing 5, 
please file a ticket:
http://unicode.org/cldr/trac/newticket 
(we should be able to get to that within a few days)

- Peter E




Re: Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-05 Thread William_J_G Overington via Unicode
Martin J. Dürst > Sorry to be late with this, but if 20.1 bits turn out to not 
be enough, what about 21 bits?

Martin J. Dürst > That would still limit UTF-8 to four bytes, but would almost 
double the code space. Assuming (conservatively) that it will take about a 
century to fill up all 17 (well, actually 15, because two are private) planes, 
this would give us another century.

Martin J. Dürst > Just one more crazy idea :-(.

An interesting possibility for application of some of the code points of those 
extra planes is to encode one code point for each Esperanto word that is in the 
PanLex database.

https://www.panlex.org/

That could provide a platform for assisting communication through the language 
barrier.

William Overington

Monday 5 June 2017




Re: Running out of code points, redux (was: Re: Feedback on the proposal...)

2017-06-05 Thread Richard Wordingham via Unicode
On Mon, 5 Jun 2017 13:08:06 +0900
"Martin J. Dürst via Unicode" wrote:

> On 2017/06/02 04:54, Doug Ewell via Unicode wrote:
> > Richard Wordingham wrote:
> >   
> >> even supporting 6-byte patterns just in case 20.1 bits eventually
> >> turn out not to be enough,  
> 
> Sorry to be late with this, but if 20.1 bits turn out to not be
> enough, what about 21 bits?
> 
> That would still limit UTF-8 to four bytes, but would almost double
> the code space. Assuming (conservatively) that it will take about a
> century to fill up all 17 (well, actually 15, because two are
> private) planes, this would give us another century.

It all depends on how the lead byte is parsed.  With a block-if
construct ignorant of the original design or a look-up table, it may be
simplest to treat F5 onwards as out and out errors and not expect any
trailing bytes.  Code handling attempts at 6-byte code points
was the most complex case.  Of course, one **might** want to handle a
list of mostly small positive integers, at which point the old UTF-8
design might be useful.
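
A minimal sketch of the two lead-byte treatments discussed above, written as a plain if-chain. The function names are hypothetical; the first follows the current definition of UTF-8 (nothing above U+10FFFF), the second follows the older design that allowed sequences of up to 6 bytes. Neither validates trailing bytes or rejects overlong and surrogate encodings beyond what the lead byte alone reveals.

```python
def utf8_sequence_length(lead: int) -> int:
    """Expected sequence length for a lead byte under current UTF-8.

    Returns 0 when the byte cannot start a sequence.
    """
    if lead <= 0x7F:
        return 1    # ASCII
    if lead <= 0xBF:
        return 0    # 0x80..0xBF: trailing byte, not a lead
    if lead <= 0xC1:
        return 0    # 0xC0, 0xC1: could only start an overlong form
    if lead <= 0xDF:
        return 2
    if lead <= 0xEF:
        return 3
    if lead <= 0xF4:
        return 4    # reaches U+10FFFF
    return 0        # 0xF5..0xFF: outright error, expect no trailing bytes


def legacy_sequence_length(lead: int) -> int:
    """Expected sequence length under the older design (up to 6 bytes)."""
    if lead <= 0x7F:
        return 1
    if lead <= 0xBF:
        return 0    # trailing byte
    if lead <= 0xDF:
        return 2
    if lead <= 0xEF:
        return 3
    if lead <= 0xF7:
        return 4    # 0xF5..0xF7 would carry the rest of a 21-bit space
    if lead <= 0xFB:
        return 5
    if lead <= 0xFD:
        return 6
    return 0        # 0xFE, 0xFF never appear


for byte in (0x41, 0xC3, 0xE2, 0xF4, 0xF5, 0xFC):
    print(hex(byte), utf8_sequence_length(byte), legacy_sequence_length(byte))
```

Under the first function a decoder that meets F5 or above reports one bad byte and moves on; under the second it has to collect and check up to five trailing bytes before it can decide anything.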

Richard.



CLDR 'B'

2017-06-05 Thread Neil Shadrach via Unicode
http://cldr.unicode.org/translation/date-time-patterns

How are 'B' values added for languages that do not have them?
I cannot see an option for this in the survey tool which just refers to the
existing list.