2013/9/16 Stephan Stiller <[email protected]>
> That's exactly what happens when people confuse "code point" with "scalar value" ;-) Hmm, whom might we blame? :-)
Actually you never count scalar values. You are confusing them with code units. Twitter was originally counting UTF-16 code units, but now counts code points. Scalar values are unrelated: they are properties assigned to code points, so every code point has a scalar value, but the reverse is true only within the valid range 0 to 0x10FFFF. Scalar values are only used if you need to perform arithmetic to compute code points from others. This generally does not work well within the UCS except in a few very small ranges (like decimal digits). The scalar value is also needed to convert from one standard UTF to another.
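
To make the difference concrete, here is a rough Python sketch. It is only an illustration: the function names are made up for this message, and it says nothing about how Twitter actually implements its counter. It shows the two ways of counting, arithmetic on the numeric value within the decimal digit range, and the use of that value as the pivot when going from UTF-16 to UTF-8.

```python
def count_utf16_code_units(text):
    # Each UTF-16 code unit is 2 bytes; characters above U+FFFF take two units.
    return len(text.encode("utf-16-le")) // 2


def count_code_points(text):
    # A Python 3 str is a sequence of code points, so len() counts them directly.
    return len(text)


def digit_in_same_script(zero, n):
    # Arithmetic on the numeric value: only safe in small contiguous ranges,
    # such as a block of decimal digits starting at the script's digit zero.
    if not 0 <= n <= 9:
        raise ValueError("n must be a single decimal digit")
    return chr(ord(zero) + n)


def surrogate_pair_to_value(high, low):
    # Combine a UTF-16 surrogate pair into the numeric value it encodes.
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def surrogate_pair_to_utf8(high, low):
    # UTF-16 -> numeric value -> UTF-8: the value is the pivot between the UTFs.
    return chr(surrogate_pair_to_value(high, low)).encode("utf-8")


sample = "a\U0001F600"                     # "a" followed by U+1F600
print(count_utf16_code_units(sample))      # 3
print(count_code_points(sample))           # 2
print(digit_in_same_script("\u0660", 5) == "\u0665")   # True (Arabic-Indic digits)
print(hex(surrogate_pair_to_value(0xD83D, 0xDE00)))    # 0x1f600
print(surrogate_pair_to_utf8(0xD83D, 0xDE00).hex())    # f09f9880
```

The same message is 3 units long by the old count and 2 by the new one, which is the whole source of the discrepancy.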

