+cc port maintainer and move the thread to ports@; misc@ is not a good place for this
On 2021-08-25, Ted Wynnychenko <ted....@comcast.net> wrote:
> Hello
>
> Ok, to start with, I am not sure about any of this, but, here goes:
>
> I don't know why this happened just now, since I last updated the system
> about 3 weeks ago, but today, I was unable to access data on my home server
> via a php web application (horde).
>
> This was working fine this morning, but I then restarted the server, and I
> started getting this error:
>
> utf8 is not supported by MySQL (big5, dec8, cp850, hp8, koi8r, latin1,
> latin2, swe7, ascii, ujis, sjis, hebrew, tis620, euckr, koi8u, gb2312,
> greek, cp1250, gbk, latin5, armscii8, utf8mb3, ucs2, cp866, keybcs2, macce,
> macroman, cp852, latin7, utf8mb4, cp1251, utf16, utf16le, cp1256, cp1257,
> utf32, binary, geostd8, cp932, eucjpms)
>
> The system is running MariaDB, and when I look at the available character
> sets, I see:
>
> Welcome to the MariaDB monitor.  Commands end with ; or \g.
> Your MariaDB connection id is 69
> Server version: 10.6.4-MariaDB-log OpenBSD port: mariadb-server-10.6.4p1v1
>
> Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
>
> Type 'help;' or '\h' for help. Type '\c' to clear the current input
> statement.
>
> MariaDB [(none)]> show character set;
> +----------+-----------------------------+---------------------+--------+
> | Charset  | Description                 | Default collation   | Maxlen |
> +----------+-----------------------------+---------------------+--------+
> | big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
> | dec8     | DEC West European           | dec8_swedish_ci     |      1 |
> | cp850    | DOS West European           | cp850_general_ci    |      1 |
> | hp8      | HP West European            | hp8_english_ci      |      1 |
> | koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
> | latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
> | latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
> | swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
> | ascii    | US ASCII                    | ascii_general_ci    |      1 |
> | ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
> | sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
> | hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
> | tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
> | euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
> | koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
> | gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
> | greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
> | cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
> | gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
> | latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
> | armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
> | utf8mb3  | UTF-8 Unicode               | utf8mb3_general_ci  |      3 |
> | ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
> | cp866    | DOS Russian                 | cp866_general_ci    |      1 |
> | keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
> | macce    | Mac Central European        | macce_general_ci    |      1 |
> | macroman | Mac West European           | macroman_general_ci |      1 |
> | cp852    | DOS Central European        | cp852_general_ci    |      1 |
> | latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
> | utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
> | cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
> | utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
> | utf16le  | UTF-16LE Unicode            | utf16le_general_ci  |      4 |
> | cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
> | cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
> | utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
> | binary   | Binary pseudo charset       | binary              |      1 |
> | geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
> | cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
> | eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
> +----------+-----------------------------+---------------------+--------+
> 40 rows in set (0.000 sec)
>
>
> Well, there is no "utf8" listed, so that explains the error (I think).
>
> But, I don't understand why it is missing.
> According to the MariaDB site:
> "Until MariaDB 10.5, this was a UTF-8 encoding using one to three bytes per
> character. Basic Latin letters, numbers and punctuation use one byte.
> European and Middle East letters mostly fit into 2 bytes. Korean, Chinese,
> and Japanese ideographs use 3-bytes. No supplementary characters are stored.
> From MariaDB 10.6, utf8 is an alias for utf8mb3, but this can be changed to
> utf8mb4 by changing the default value of the old_mode system variable."
>
> I tried running mysqld/mariadb with "--old-mode UTF8_IS_UTF8MB3" but the
> error is the same (not surprising).
>
> According to the MariaDB website, there should be a "utf8" character set
> available as an alias.
> But, it is not listed as an option on the mariadb version running on
> openbsd.
>
> I would have no idea how to update php scripts to change the character set
> they are specifying when they try to interact with mysqld, so I am wondering
> if there is a way to add back a "utf8" character set (as an alias for
> utf8mb3) into MariaDB?
>
> I hope this makes some sense.
> Any advice would be great.
>
> Thanks
> Ted
>

So this relates to https://jira.mariadb.org/browse/MDEV-8334 which
landed in 10.6.1,

"The utf8 character set (and related collations) is now by default an
alias for utf8mb3 rather than the other way around. It can be set to
imply utf8mb4 by changing the value of the old_mode system variable
(MDEV-8334)"

The list from "show character set" does not show aliases. From a 10.5
machine:

: Server version: 10.5.11-MariaDB OpenBSD port: mariadb-server-10.5.11v1
: ...
: MariaDB [(none)]> show character set where Charset like'utf%';
: +---------+------------------+--------------------+--------+
: | Charset | Description      | Default collation  | Maxlen |
: +---------+------------------+--------------------+--------+
: | utf8    | UTF-8 Unicode    | utf8_general_ci    |      3 |
: | utf8mb4 | UTF-8 Unicode    | utf8mb4_general_ci |      4 |
: | utf16   | UTF-16 Unicode   | utf16_general_ci   |      4 |
: | utf16le | UTF-16LE Unicode | utf16le_general_ci |      4 |
: | utf32   | UTF-32 Unicode   | utf32_general_ci   |      4 |
: +---------+------------------+--------------------+--------+
: 5 rows in set (0.003 sec)

And from 10.6:

: Server version: 10.6.4-MariaDB-log OpenBSD port: mariadb-server-10.6.4p2v1
: ...
: MariaDB [(none)]> show character set where Charset like'utf%';
: +---------+------------------+--------------------+--------+
: | Charset | Description      | Default collation  | Maxlen |
: +---------+------------------+--------------------+--------+
: | utf8mb3 | UTF-8 Unicode    | utf8mb3_general_ci |      3 |
: | utf8mb4 | UTF-8 Unicode    | utf8mb4_general_ci |      4 |
: | utf16   | UTF-16 Unicode   | utf16_general_ci   |      4 |
: | utf16le | UTF-16LE Unicode | utf16le_general_ci |      4 |
: | utf32   | UTF-32 Unicode   | utf32_general_ci   |      4 |
: +---------+------------------+--------------------+--------+
: 5 rows in set (0.000 sec)

Trying to use an invalid charset is rightly rejected:

MariaDB [(none)]> set names zzz;
ERROR 1115 (42000): Unknown character set: 'zzz'

and trying to use utf8 does work on both:

MariaDB [(none)]> set names utf8;
Query OK, 0 rows affected (0.000 sec)

What happens if you try that?

If you get an error then it seems like there's some problem with your
installation. (I don't know whether it's necessary for this particular
change, but maybe you forgot to run mariadb-upgrade after a mariadb
version update? If so, that might be one possible reason.)

If you _don't_ get an error then perhaps horde is doing some unnecessary
check on "show character set" before using a particular charset. If so,
it would seem strange that nobody else has reported this (your email is
the only search hit I can find for the error).
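
To illustrate that second possibility, here is a rough sketch of how a
client-side check against the "show character set" list would reject utf8
on 10.6 even though the server accepts it. This is NOT horde's actual
code, which I haven't looked at; the function names and the alias table
are made up for the example, and the only alias assumed is the default
utf8 -> utf8mb3 mapping described in the MDEV-8334 note above.

```python
# Hypothetical sketch only: this is not taken from horde.
# Charset names as listed by "show character set" on the 10.6 server
# (utf* rows only, copied from the output above).
LISTED_10_6 = {"utf8mb3", "utf8mb4", "utf16", "utf16le", "utf32"}

# Assumed alias table: per the MDEV-8334 note, utf8 is by default an
# alias for utf8mb3. Aliases do not appear in the listed names.
ALIASES = {"utf8": "utf8mb3"}


def naive_supported(charset, listed):
    """Exact match against the listed names; this misses aliases."""
    return charset.lower() in listed


def alias_aware_supported(charset, listed, aliases=ALIASES):
    """Resolve a known alias first, then check the listed names."""
    name = charset.lower()
    return aliases.get(name, name) in listed


if __name__ == "__main__":
    for cs in ("utf8", "utf8mb4", "zzz"):
        print(cs,
              naive_supported(cs, LISTED_10_6),
              alias_aware_supported(cs, LISTED_10_6))
```

A check shaped like naive_supported() would explain the "utf8 is not
supported" message on 10.6 while "set names utf8" still succeeds.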