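UTF-8 works for all languages. It is a variable-width encoding: it uses 8 bits for a plain ASCII letter and up to 32 bits (four bytes) for other characters. Characters from scripts such as Japanese and Thai usually take three bytes in UTF-8 but only two in UTF-16, and I think that storage difference is the main reason UTF-16 gets used for them.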
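For anyone who wants to check the sizes rather than guess, here is a minimal sketch in Java. It assumes Java 7 or later (for `StandardCharsets`), and the class and method names (`EncodingSizes`, `utf8Length`, `utf16Length`) are made up for illustration; they are not part of iBATIS or any library.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodingSizes {

    // Number of bytes needed to encode s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s in UTF-16.
    // UTF_16BE is used so that no byte-order mark is added to the count.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                                     // U+0041, ASCII letter
            "\u00e9",                                // U+00E9, e with acute accent
            "\u0e01",                                // U+0E01, Thai letter
            "\u65e5",                                // U+65E5, CJK ideograph
            new String(Character.toChars(0x20000))   // U+20000, outside the BMP
        };
        for (String s : samples) {
            System.out.printf("U+%04X  UTF-8: %d bytes  UTF-16: %d bytes%n",
                    s.codePointAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

Running it should show the ASCII letter at 1 byte in UTF-8 and 2 in UTF-16, the accented letter at 2 and 2, the Thai and CJK characters at 3 and 2, and the last character at 4 bytes in both. So both encodings can represent every sample; only the storage cost differs.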
Miquel Angel

On 4/20/05, Brandon Goodin <[EMAIL PROTECTED]> wrote:
> I've done quite a bit with i18n working between UTF-8 and UTF-16. Even
> after all that... I'm still mystified. :D Encoding is a world unto
> itself. All I want is something that works :) Maybe one of these days
> I'll understand more... for now it's all about trial and error.
>
> On 4/20/05, Brice Ruth <[EMAIL PROTECTED]> wrote:
> > I don't see anywhere in there that UTF-8 cannot encode everything that
> > UTF-16 and UTF-32 can ... just that the storage requirements differ ?!
> >
> > Brice
> >
> > On 4/20/05, Brandon Goodin <[EMAIL PROTECTED]> wrote:
> > > http://icu.sourceforge.net/docs/papers/forms_of_unicode/
> > >
> > > On 4/20/05, Brice Ruth <[EMAIL PROTECTED]> wrote:
> > > > I had heard that Chinese does a lot with UTF-16, but I hadn't heard
> > > > about Arabic ... and I don't exactly understand why UTF-8 doesn't
> > > > support that ... is it simply because their character sets keep
> > > > expanding and UTF-8 is static?
> > > >
> > > > On 4/20/05, Brandon Goodin <[EMAIL PROTECTED]> wrote:
> > > > > Latin characters are fine. However, UTF-8 is not sufficient for several
> > > > > languages like Arabic and Chinese. For their FULL range of character
> > > > > representations these languages require UTF-16 and in the case of
> > > > > Chinese it is pushing for UTF-32.
> > > > >
> > > > > Brandon
> > > > >
> > > > > On 4/20/05, Brice Ruth <[EMAIL PROTECTED]> wrote:
> > > > > > OK ... that's more reasonable. Obviously, you need to use an editor
> > > > > > (such as Eclipse) that is capable of editing UTF-8 files, otherwise,
> > > > > > you'll get junk and that won't be fun.
> > > > > >
> > > > > > Whew ... glad UTF-8 isn't compromised :)
> > > > > >
> > > > > > On 4/20/05, Brandon Goodin <[EMAIL PROTECTED]> wrote:
> > > > > > > I found this quote when doing a search in google:
> > > > > > >
> > > > > > > --- quote ---
> > > > > > >
> > > > > > > Your actual problem is very typical.
> > > > > > > By default (without encoding specified in the XML declaration),
> > > > > > > XML is encoded in UTF-8. If you use an editor which is not
> > > > > > > encoding-aware and typically assuming an ISO-8859-1 encoding, and
> > > > > > > you insert characters such as accented letters, curly quotes,
> > > > > > > etc., you will get this error. As a workaround, you can put an
> > > > > > > XML declaration with the ISO-8859-1 encoding at the top of your
> > > > > > > XML file:
> > > > > > >
> > > > > > > <?xml version="1.0" encoding="ISO-8859-1"?>
> > > > > > >
> > > > > > > You can also use an editor which knows how to handle UTF-8.
> > > > > > >
> > > > > > > In your case it is also possible that somebody inserted incorrect
> > > > > > > characters by accident, and you can just remove those and then
> > > > > > > decide which encoding you want to use. UTF-8 gives you the whole
> > > > > > > range of Unicode, while ISO-8859-1 gives you a limited set of
> > > > > > > characters that work for the Western languages.
> > > > > > >
> > > > > > > --- quote ---
> > > > > > >
> > > > > > > maybe that will help,
> > > > > > > Brandon
> > > > > > >
> > > > > > > On 4/20/05, Brice Ruth <[EMAIL PROTECTED]> wrote:
> > > > > > > > What special characters aren't supported by UTF-8?! I have never
> > > > > > > > heard of such a thing. My understanding is that UTF-8 represents
> > > > > > > > the full Unicode character set as a multi-byte value. And since
> > > > > > > > Unicode is supposed to encompass all known characters for all
> > > > > > > > known languages (with space for new Chinese characters created
> > > > > > > > daily) - what's not covered?!
> > > > > > > >
> > > > > > > > There most certainly shouldn't be anything that iso-8859-1 or
> > > > > > > > latin1 (Windows-1252) covers that is not in Unicode.
> > > > > > > >
> > > > > > > > Brice
> > > > > > > >
> > > > > > > > On 4/20/05, Daniel H. F. e Silva <[EMAIL PROTECTED]> wrote:
> > > > > > > > > You could check also your xml encoding. If you work with
> > > > > > > > > special characters not in utf-8, you will get in trouble.
> > > > > > > > > I had this as my native language is portuguese and we have
> > > > > > > > > some special characters not supported by utf-8.
> > > > > > > > > So, if this is your case, try iso-8859-1 or one that fits
> > > > > > > > > better to your needs.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Daniel Silva.
> > > > > > > > >
> > > > > > > > > --- Larry Meadors <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > Make sure that there is no white space and no odd chars at
> > > > > > > > > > the top of your config file.
> > > > > > > > > >
> > > > > > > > > > Larry
> > > > > > > > > >
> > > > > > > > > > On 4/18/05, KK <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > > I get the following error when I try to build
> > > > > > > > > > > sqlCOnfigmap..does it look familiar to someone?
> > > > > > > > > > >
> > > > > > > > > > > com.ibatis.sqlmap.client.SqlMapException: There was an error while building the SqlMap instance.
> > > > > > > > > > > --- The error occurred in the SQL Map Configuration file.
> > > > > > > > > > > --- Cause: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
> > > > > > > > > > > Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
> > > > > > > > > > > Caused by: com.ibatis.sqlmap.client.SqlMapException: XML Parser Error. Cause: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
> > > > > > > > > > > Caused by: java.io.UTFDataFormatException: Invalid byte 3 of 3-byte UTF-8 sequence.
> > > > > > > > > > >   at com.ibatis.sqlmap.engine.builder.xml.XmlSqlMapClientBuilder.buildSqlMap(XmlSqlMapClientBuilder.java:203)
> > > > > > > > > > >   at com.ibatis.sqlmap.client.SqlMapClientBuilder.buildSqlMapClient(SqlMapClientBuilder.java:49)
> > > > > > > > > > >
> > > > > > > > > > > Your help is greatly appreciated.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > KK
> > > > > > > >
> > > > > > > > --
> > > > > > > > Brice Ruth
> > > > > > > > Software Engineer, Madison WI