On 22/11/16 13:56, Delmar Wichnieski wrote:
> But VARCHAR fields work correctly. Problem only in CHAR.

VARCHAR is trimmed to the number of bytes used ... not the number of
'characters'! CHAR is only designed for single-byte characters IN PHP, so
when multi-byte characters are supplied to a CHAR(1) field, PHP does not
know how many actual characters are displayed. I'm not saying that what is
currently happening is right, but it is 'safe', since PHP then needs help
to move the string into a variable where it can check whether the UTF8
data is a single character or multiple characters.

> And it should not be a gamble, because
> "Each UTF is reversible, thus every UTF supports lossless round tripping:
> mapping from any Unicode coded character sequence S to a sequence of bytes
> and back will produce S again."
> Source
> http://unicode.org/faq/utf_bom.html

Provided that there is no processing of the data, that is correct, but
operations like 'upper' and 'lower' can change the number of characters,
and adding accent characters can also produce differences. It is this
area that basically stopped the development of a Unicode-native PHP6.
Normalization in http://www.unicode.org/reports/tr15/ is a minefield even
for the Firebird collation process ... Just how long is the normalized
string?

> 2016-11-22 11:21 GMT-02:00 Lester Caine <les...@lsces.co.uk>:
>
>> On 22/11/16 12:58, Delmar Wichnieski wrote:
>>> Since there was no answer here on the list, I was feeling alone and
>>> afraid and wondering why no one else has this problem.
>>
>> Delmar I must apologise as I HAD posted a reply, but it did not actually
>> go through ... list in bounce emails mode which I missed ...
>>
>> The simple answer is that strings in PHP are not UTF8 so the 'bug' you
>> are listing is actually that we need to make sure that the single byte
>> buffer for a string is long enough. To ensure UTF8 strings are handled
>> properly, since PHP6 is not going to happen, we have to transfer the
>> simple php strings to mbstring objects.
>> UTF8 is a gamble in PHP if it is
>> going to be transferred properly as a simple string variable and will
>> give string length as bytes rather than characters ...

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
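
For anyone following the thread, the byte-versus-character point made in the reply can be sketched in a few lines. This is only an illustration under stated assumptions, not the driver's actual behaviour: it is written in Python purely as a neutral way to show the arithmetic, and the helper name byte_and_char_length is made up for this example. In PHP the comparable pair would be strlen() (bytes) and mb_strlen() (characters).

```python
import unicodedata
from typing import Tuple


def byte_and_char_length(text: str) -> Tuple[int, int]:
    """Hypothetical helper: return (UTF-8 byte count, code point count)."""
    return len(text.encode("utf-8")), len(text)


# One visible character, but two bytes once encoded as UTF-8,
# so a buffer sized for a single byte is too short for it.
e_acute = "\u00e9"  # precomposed e with acute accent
print(byte_and_char_length(e_acute))

# Case mapping can change the character count:
# the German sharp s uppercases to the two letters "SS".
sharp_s = "\u00df"
print(len(sharp_s), len(sharp_s.upper()))

# Normalization can change it as well: NFD splits the accented letter
# into a base "e" followed by a combining acute accent.
decomposed = unicodedata.normalize("NFD", e_acute)
print(byte_and_char_length(decomposed))

# Round tripping is still lossless as long as the same form is restored.
print(unicodedata.normalize("NFC", decomposed) == e_acute)
```

The same text therefore has three different "lengths" depending on whether bytes, precomposed code points, or decomposed code points are counted, which is why a fixed-size CHAR buffer is awkward to size.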