Originally UTF-8 allowed up to 6 bytes per character, which was enough to encode any code point in UCS-4. A later version of the Unicode standard decided that not all 32 bits of UCS-4 were needed, and the new, smaller range (code points up to U+10FFFF) corresponds to at most 4 bytes of UTF-8.

On Feb 28, 2014 4:42 AM, "Raul Miller" <rauldmil...@gmail.com> wrote:
> Unicode is messy, but it's not that messy.
>
> The utf-8 encoding places a limit on how many characters can be
> encoded, and if I understand properly, that limit is slightly over a
> million, and less than a quarter of those theoretical characters
> currently have been assigned. Of course... unicode is a standard and
> the nice thing about standards is that we have so many to pick from...
> but utf-32 just convenient.
>
> Still, even 250k characters is a lot to deal with. Just representing
> which sets each of them belongs in is a measurable load. Representing
> font information and all of the numerous special rules is going to
> occupy a certain amount of space. There's space/time tradeoffs but if
> we require that the language make those tradeoffs that's going to be
> good for some cases and bad for other cases.
>
> Basically, everyone has to pay for the storage (and other overhead) of
> every feature built into the language, every time the language gets
> used. That works for some contexts, but I think it plays against J's
> strengths.
>
> Also... a key issue here is that, if we cannot model a feature like
> this outside the language, we are not ready to implement it within the
> language.
>
> Thanks,
>
> --
> Raul
>
> On Thu, Feb 27, 2014 at 2:00 PM, Björn Helgason <gos...@gmail.com> wrote:
> > Unicode was supposed to be the solution to the problems with the APL chars
> > as well all the code pages with national characters.
> >
> > As should be obvious the solution is far from anywhere close.
> >
> > UTF-8 UTF-16 UTF-32 UTF-64 UTF-???
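To put numbers on the byte counts and on the "slightly over a million" limit discussed in this thread, here is a rough Python sketch. The function names `utf8_length` and `legacy_utf8_length` are made up for illustration and are not part of any library; the thresholds are the standard UTF-8 boundaries for the current 4-byte form and for the older form of up to 6 bytes.

```python
# Hypothetical helper names, for illustration only.

MAX_CODE_POINT = 0x10FFFF  # upper end of the current Unicode range


def utf8_length(code_point: int) -> int:
    """Bytes used by current UTF-8 (at most 4) for one code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("outside U+0000..U+10FFFF")
    if code_point < 0x80:       # 7 bits
        return 1
    if code_point < 0x800:      # 11 bits
        return 2
    if code_point < 0x10000:    # 16 bits
        return 3
    return 4                    # up to U+10FFFF


def legacy_utf8_length(code_point: int) -> int:
    """Bytes used by the original UTF-8 scheme (up to 6 bytes, 31-bit values)."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("outside the 31-bit range")
    # Exclusive upper limits for 1..6 byte sequences: 7, 11, 16, 21, 26, 31 bits.
    limits = (0x80, 0x800, 0x10000, 0x200000, 0x4000000, 0x80000000)
    for length, limit in enumerate(limits, start=1):
        if code_point < limit:
            return length
    raise AssertionError("unreachable")


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
        # Cross-check against Python's own encoder.
        print(hex(cp), utf8_length(cp), len(chr(cp).encode("utf-8")))
    print("code points in range:", MAX_CODE_POINT + 1)
```

The count of code points in the current range is 0x10FFFF + 1 = 1,114,112, which matches the "slightly over a million" figure quoted above. Note that this counts code points, not assigned characters, and it includes the surrogate range that UTF-8 does not encode.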
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm