All,

There are plenty of unassigned code points within blocks that are in use; these often come at the end of a block but there are plenty of holes as well.
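
To make "holes" concrete, here is a rough sketch of how I have been listing them. It assumes that "unassigned" means General_Category Cn in whatever Unicode version ships with the Python interpreter, and the helper name `holes_in_range` is my own. The block boundaries are typed in by hand, since `unicodedata` does not expose block data.

```python
import unicodedata


def holes_in_range(first, last):
    """Return the code points in [first, last] that are unassigned.

    "Unassigned" is taken to mean General_Category "Cn" according to the
    Unicode data bundled with this Python build, so results can differ
    between Python versions.
    """
    return [
        cp
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    ]


# Greek and Coptic block, U+0370..U+03FF (boundaries entered by hand).
greek_holes = holes_in_range(0x0370, 0x03FF)
print(unicodedata.unidata_version, [f"U+{cp:04X}" for cp in greek_holes])
```

On the versions I tried, U+03A2 shows up in the list. It sits exactly 0x20 below U+03C2 (final sigma), so I am guessing it is one of the casing-arithmetic cases, but that is only a guess.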

I have a cluster of interrelated questions:
1. What sorts of reasons are there (or have there been) for leaving holes? Code page conversion and changes to casing by simple arithmetic? What else?
1.1 The rationale for particular holes is not documented in the code charts I looked at; is there documentation? (Yes, in some instances the answer can be guessed.)
1.2 How is the number of holes determined? It seems like multiples of 16 are used for block sizes merely for practical reasons.
2. I notice that ranges are often used to describe where scripts are found. Do holes have properties? Are there other block-related policies that give holes certain semantics?
2.1 If not, how likely is it that Unicode assigns script-external characters to holes?
2.2 If yes, how does the number of assigned code points differ if holes that are assumed to be filled only by certain types of characters are counted?
2.2.1 Would this make much of a difference wrt the question (which seems to come up from time to time) of how much of Unicode will eventually fill up?
3. Have there been "mistakes" wrt hole assignment?

Stephan

