Re: [GTALUG] a solved problem unsolved itself: WordPress, MySQL, UTF-8

Jamon Camisso via talk Wed, 01 Dec 2021 18:53:42 -0800

On 01/12/2021 08:05, Stewart C. Russell via talk wrote:

On 2021-11-29 16:25, Jamon Camisso via talk wrote:
Another thing to try is using mysqli_set_charset("UTF8"); somewhere inyour site's code. Substitute in different character sets until youfind the correct one ...
Thanks, Jamon, but there isn't a valid encoding for what my databaseseems to be holding. It was UTF-8, and now it's seemingly UTF-8 decodedto CP1252 bytes re-encoded to UTF-8 characters again.
If WordPress were using Python (it's not), if my db held the 4character, 6 byte UTF-8 string, the equivalent Python code to end up inthe mess I'm in is:
>>> bytes(bytes("côté",encoding='utf-8').decode(encoding='cp1252'),encoding='utf-8')
     b'c\xc3\x83\xc2\xb4t\xc3\x83\xc2\xa9'

or 6 characters / 10 bytes of gibberish ('cÃ´tÃ©').

Since that encoding is reversible, can you attempt it on some of thecorrupted posts/pages? e.g.

>>> bytes(bytes('cÃ´tÃ©', encoding='utf-8').decode(),encoding='cp1252').decode()

'côté'

Since this happened in the last month or so, it's not really a legacyencoding issue. Perfectly good UTF-8 got destroyed with no input/changesfrom me.
I'd been fairly careful with backups for the first decade of runningthis blog, but the process got wearing after a while, especially sinceevery update went flawlessly so the manual backup process was a waste oftime. Wordpress offers automatic updates without forcing a backupcheckpoint, which I think is wrong.

Is it a managed Wordpress? That's terribly bad sounding if it is. WorseI suppose if Wordpress itself just did it.

Do any of the casting suggestions on that link that I sent fix it? Orare you going to have to dump each row and run them through thatdouble-decoding process?


Jamon
---
Post to this mailing list [email protected]
Unsubscribe from this mailing list https://gtalug.org/mailman/listinfo/talk

Re: [GTALUG] a solved problem unsolved itself: WordPress, MySQL, UTF-8

Reply via email to