On 24-Oct-2015 02:45, Anon wrote:
On Friday, 23 October 2015 at 21:22:38 UTC, Vladimir Panteleev wrote:
That doesn't sound right. In fact, this puts into question why
dchar.max is at the value it is now. It might be the current maximum
at the current version of Unicode, but this seems like a completely
pointless restriction that breaks forward-compatibility with future
Unicode versions, meaning that D programs compiled today may be unable
to work with Unicode text in the future because of a pointless
artificial limitation.
Unless UTF-16 is deprecated and completely removed from all systems
everywhere, there is no way for the Unicode Consortium to increase the
limit beyond U+10FFFF. That limit is not arbitrary; it is based on the
technical limits of what UTF-16 can actually represent. UTF-8 and UTF-32
both have room for expansion, but have been defined to match UTF-16's
limitations.
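The ceiling falls straight out of the surrogate-pair arithmetic. A minimal sketch (in Python, just for illustration) of why UTF-16 tops out at exactly U+10FFFF:

```python
# Why UTF-16 cannot go past U+10FFFF: code points above the Basic
# Multilingual Plane are encoded as a surrogate pair, where the high
# surrogate (0xD800-0xDBFF) and the low surrogate (0xDC00-0xDFFF)
# each contribute 10 bits of the offset above 0x10000.
HIGH_SURROGATES = 0xDC00 - 0xD800   # 1024 possible high surrogates
LOW_SURROGATES = 0xE000 - 0xDC00    # 1024 possible low surrogates

# Highest encodable code point: plane-1 base plus every surrogate
# combination, minus one.
max_code_point = 0x10000 + HIGH_SURROGATES * LOW_SURROGATES - 1
print(hex(max_code_point))  # 0x10ffff
```

So 1024 × 1024 pair combinations on top of the 0x10000 base give exactly U+10FFFF as the last reachable value; there is no spare encoding space to extend.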
Exactly. Unicode officially limited UTF-8 to U+10FFFF in Unicode 6.0 or
so. Previously it was expected to (maybe) expand beyond that, but it was
decided to stay at U+10FFFF pretty much indefinitely because of UTF-16.
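That cap (formalized for UTF-8 in RFC 3629, which trimmed the original 5- and 6-byte sequences) is enforced by modern implementations. A small sketch of the behavior, using Python's built-in codecs as an example:

```python
# U+10FFFF is the highest valid code point; in UTF-8 it needs 4 bytes.
assert len(chr(0x10FFFF).encode("utf-8")) == 4

# Anything past the UTF-16-imposed ceiling is simply not a code point,
# so the runtime refuses to construct it at all.
try:
    chr(0x110000)
except ValueError as err:
    print("rejected:", err)
```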
Also: only ~114k code points have an assigned meaning; we are looking at
900K+ unassigned values reserved today.
--
Dmitry Olshansky