On Wednesday, June 4, 2014 9:22:54 AM UTC+5:30, Chris Angelico wrote:
> On Wed, Jun 4, 2014 at 1:37 PM, Rustom Mody wrote:
> > And so a pure BMP-supporting implementation may be a reasonable
> > compromise. [As long as no surrogate-pairs are there]
> Not if you're working on the internet. There are several critical
> groups of characters that aren't in the BMP, such as:
Of course. But what has the internet to do with micropython?
This is their stated goal:
| Micro Python is a lean and fast implementation of the Python
| programming language (python.org) that is optimised to run on a
> 1) Most or all Chinese and Japanese characters
I don't know how you count 'most'.
| One possible rationale is the desire to limit the size of the full
| Unicode character set, where CJK characters as represented by discrete
| ideograms may approach or exceed 100,000 (while those required for
| ordinary literacy in any language are probably under 3,000). Version 1
| of Unicode was designed to fit into 16 bits and only 20,940 characters
| (32%) out of the possible 65,536 were reserved for these CJK Unified
| Ideographs. Later Unicode has been extended to 21 bits allowing many
| more CJK characters (75,960 are assigned, with room for more).
| From http://en.wikipedia.org/wiki/Han_unification
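
To make the BMP point concrete, here is a rough sketch of what "pure BMP" means for a given string, plus a check of the 32% figure quoted above. It assumes Python 3.3 or later, where ord() on each element of a str yields a full code point; on a narrow build a non-BMP character shows up as two surrogates and this test would not catch it. The name is_bmp_only is my own, not anything from Micro Python.

```python
def is_bmp_only(text):
    """Return True if every code point in text lies in the BMP (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# U+4E2D and U+6587 are common CJK ideographs inside the BMP.
bmp_sample = "\u4e2d\u6587"

# U+20000 is the first code point of CJK Unified Ideographs Extension B,
# which lives in plane 2 and so needs a surrogate pair in UTF-16.
non_bmp_sample = "\U00020000"

print(is_bmp_only(bmp_sample))
print(is_bmp_only(non_bmp_sample))

# Share of the 16-bit code space reserved for CJK Unified Ideographs,
# using the two numbers from the Wikipedia quote.
cjk_share = 20940 / 65536
print(round(cjk_share * 100))
```

An implementation that only handles the BMP would be fine for the first sample and would have to reject or mangle the second.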