Hi Phil,

Phil Vandry wrote:
> On Wed, Mar 25, 2009 at 09:03:35AM +0100, Jacek Konieczny wrote:
>
>> that is not a sane way to do things. The problems will start as soon as
>> someone will try to process this data as 'latin1' (according to the
>> declaration on the database), when it is not latin1.
>
> Agreed. And the database would be perfectly within its rights to reject
> or corrupt any byte in the range 0x80 to 0x9f if the encoding is latin1
> (those bytes are not used in latin1), so you cannot even count on binary
> transparency. (I doubt MySQL actually does this, though.)
>
>> But, back to my original question, as can understand that 'latin1' is ok
>> for some or even most people. My my question was: is there any specific,
>> technical reason, that 'utf8' is forbidden? I don't think OpenSIPs does
>
> I don't know why you are getting a problem with UTF-8 but there is one
> issue with MySQL and UTF-8 that's worth mentioning (it's not related to
> OpenSIPS). The MySQL docs do draw attention to this point.
>
> If you have a CHAR(n) column (not a VARCHAR column) and your table is
> using a fixed-length record (usually, myisam with no VARCHAR columns),
> the CHAR column must reserve 3*n bytes with UTF-8 but requires only n
> bytes with latin1 or ASCII.
>
> (Actually it should be 4*n, not 3*n, but MySQL's support for UTF-8 is
> crippled and only supports characters up to U+00FFFF, and that means it
> never needs more than 3 bytes to encode one character.)

So, more or less, it is about the table size. What is not clear to me
(from what you say) is why a CHAR(n) column needs only n bytes when using
the latin1 charset. Does that mean latin1 supports only 256 characters?
I ask because, according to the MySQL docs, latin1 supports a lot of
non-standard characters (extended codes).
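To make the byte counts in this thread concrete, here is a minimal Python
sketch. It is not MySQL code: encoded_len and reserved_bytes are helper
names made up for illustration, and Python's "latin-1" codec is strict
ISO-8859-1, whereas MySQL's latin1 is, as far as the MySQL docs describe
it, based on cp1252, which would explain the "extended codes" while still
being a one-byte-per-character set with at most 256 values.

```python
# Sample characters with a short description of each.
SAMPLES = {
    "A": "ASCII letter",
    "\u00e9": "e with acute accent, U+00E9",
    "\u20ac": "euro sign, U+20AC",
    "\U0001d11e": "musical G clef, U+1D11E (outside the BMP)",
}


def encoded_len(ch, encoding):
    """Return the number of bytes ch needs in encoding, or None if the
    character cannot be represented in that encoding at all."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None


def reserved_bytes(n, max_bytes_per_char):
    """Worst-case storage for a fixed-width CHAR(n) column.

    Hypothetical helper: it only mirrors the n * (max bytes per character)
    arithmetic described in the quoted message.
    """
    return n * max_bytes_per_char


if __name__ == "__main__":
    for ch, desc in SAMPLES.items():
        print(desc, "latin-1:", encoded_len(ch, "latin-1"),
              "utf-8:", encoded_len(ch, "utf-8"))
    # A CHAR(64) column: 64 bytes at 1 byte/char, 192 bytes at 3 bytes/char.
    print(reserved_bytes(64, 1), reserved_bytes(64, 3))
```

A single-byte charset can never need more than one byte per character,
so a fixed-width column only has to reserve n bytes. With a 3-byte UTF-8
limit the same column has to reserve the worst case of 3*n bytes, even if
the stored text is plain ASCII.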
Regards,
Bogdan

_______________________________________________
Users mailing list
[email protected]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
