--- In [email protected], "swzoh" <sean...@...> wrote:
>
> --- In [email protected], "brucexs" <bswitzer@> wrote:
> >
> > >
> > > IMHO, Each Korean character might have two dbcs lead bytes.
> > > http://tech.groups.yahoo.com/group/power-pro/message/34624
> > > please see also testresultyw9.png
> >
> > The referenced MS article says only two bytes max per character, but that
> > does seem too small to me if there are thousands of characters that must be
> > represented.
> >
> No, they can't have two lead bytes, they always consist of one lead-byte and
> one trail-byte. The problem is that the same code can be either a lead-byte
> or a trail-byte, and IsDBCSLeadByte API doesn't "validate the presence or
> validity of a trail byte".
>

OK, ignore d2 and use d3, which I just uploaded and which only checks for a single lead byte.

I thought that Korean and other similar languages could have thousands of characters. How are these represented if there are only around 10 lead bytes and around 200 usable trail bytes? That gives only about 2000 combinations. Is that enough?
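
To illustrate the point quoted above, that the same byte value can be either a lead byte or a trail byte, here is a minimal Python sketch. It is not the d2/d3 code. The lead-byte range 0x81-0xFE is an assumption (what I believe applies to the Korean code page 949), and `looks_like_lead_byte` and `classify_bytes` are hypothetical names. The first function looks only at the byte's value, the way a value-only check does; the second walks the string from the start so that each byte's role comes from its position.

```python
# Assumption: lead bytes fall in 0x81-0xFE (believed to match code page 949).
LEAD_MIN, LEAD_MAX = 0x81, 0xFE


def looks_like_lead_byte(b):
    """Value-only check: says nothing about whether a valid trail byte follows."""
    return LEAD_MIN <= b <= LEAD_MAX


def classify_bytes(data):
    """Scan from the start and label each byte 'single', 'lead', or 'trail'."""
    roles = []
    i = 0
    while i < len(data):
        # A lead byte consumes the next byte as its trail byte, whatever its value.
        if looks_like_lead_byte(data[i]) and i + 1 < len(data):
            roles.extend(["lead", "trail"])
            i += 2
        else:
            # Plain single byte, or a dangling lead byte at the very end.
            roles.append("single")
            i += 1
    return roles


if __name__ == "__main__":
    data = "A한글".encode("cp949")
    for byte, role in zip(data, classify_bytes(data)):
        print(hex(byte), role, "lead-range value:", looks_like_lead_byte(byte))
```

In this sample the trail bytes also fall inside the assumed lead-byte range, so testing a byte in isolation cannot tell the two roles apart. Only a scan from the beginning of the string can.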
