On Mar 26, 10 18:52, yigal chripun wrote:
KennyTM~ Wrote:
On Mar 26, 10 05:46, yigal chripun wrote:
while it's true that '?' has one unicode value for it, it's not true for all
sorts of diacritics and combining code-points. So your approach is to pass the
responsibility for that to the end user, who in 99.9999% of cases will not
handle this correctly.
Non-issue. Since when can a character literal store > 1 code-point?
character != code-point
D chars are really, as you say, code-points and not always complete characters.
here's a use case for you:
you want to write a fully unicode aware search engine.
If you just try to match the given sequence of code-points in the search term,
you will miss valid matches since, for instance, you do not take into account
permutations of the order of combining marks.
you can't just assume that the code-point value identifies the character.
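
To make the combining-mark point concrete, here is a minimal sketch in Python (not D, chosen only because its standard library ships `unicodedata`). The helper name `canonically_equal` and the sample strings are made up for this illustration; the only library call used is `unicodedata.normalize`.

```python
import unicodedata

def canonically_equal(s: str, t: str) -> bool:
    """Compare two strings after canonical decomposition (NFD).

    NFD also reorders combining marks by their combining class, so
    sequences that differ only in mark order compare equal.
    """
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)

# 'q' with a dot below (U+0323) and a dot above (U+0307), typed in two orders.
dot_below_first = "q\u0323\u0307"
dot_above_first = "q\u0307\u0323"

# Precomposed e-acute versus 'e' followed by a combining acute accent.
precomposed = "\u00e9"
decomposed = "e\u0301"

# Raw code-point comparison treats each pair as different strings...
raw_match_marks = dot_below_first == dot_above_first
raw_match_accent = precomposed == decomposed

# ...while comparison after normalization treats them as the same text.
norm_match_marks = canonically_equal(dot_below_first, dot_above_first)
norm_match_accent = canonically_equal(precomposed, decomposed)
```

A search that compares raw code-point sequences would miss both pairs; normalizing the search term and the indexed text to the same form first is one common way around that.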
Stop being off-topic. '?' is of type char, not string. A char always
holds one octet of a UTF-8 encoded sequence. Its numerical content is
unique and well-defined*. Therefore adding 4 to '?' also has a meaning.
* If you're paranoid you may request the spec to ensure the character is
in NFC form.
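
As a rough illustration of the "one octet" argument, the sketch below uses Python bytes to stand in for UTF-8 code units. It assumes the '?' in this thread is the plain ASCII question mark (0x3F); if the original post used a different character, the numbers differ but the reasoning is the same. The helper name `shift_code_unit` is made up for this example.

```python
def shift_code_unit(ch: str, delta: int) -> int:
    """Add delta to the single UTF-8 octet that encodes ch.

    Raises ValueError when ch needs more than one octet, since such a
    character cannot be held in one 8-bit code unit.
    """
    encoded = ch.encode("utf-8")
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} needs {len(encoded)} UTF-8 octets")
    return encoded[0] + delta

# ASCII '?' is the single octet 0x3F (63); adding 4 gives 67, which is 'C'.
shifted = shift_code_unit("?", 4)
shifted_char = chr(shifted)

# A non-ASCII character such as e-acute (U+00E9) takes two octets in UTF-8,
# so it does not fit in a single code unit.
e_acute_octets = "\u00e9".encode("utf-8")
```

Whether the result of the addition is a meaningful character is a separate question from whether the arithmetic is well-defined, which is the narrower claim being made here.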