Hi Adam,

Yeah... Here's the situation with MySQL/MariaDB and "utf8".

When MySQL introduced the utf8 charset, they went with a sort of "compressed" version of UTF-8 that stores at most 3 bytes per character and so excludes some character ranges (I am super simplifying this). Emojis and some other character ranges didn't exist at the time, and now cannot be represented by their "utf8".

utf8mb4 is the "real" UTF-8 charset type. However, it's not a drop-in replacement. It affects key lengths, amongst other things, and is incompatible with, well, many things.

There *is* a way to get true UTF-8 support. It requires utf8mb4, and a handful of global settings applied to the server to enable large keys and a different InnoDB file format. It then requires a special command to be set at the beginning of each MySQL/MariaDB session to opt into some better support.

Basically, it's invasive and not something that we can currently tell people to enable, or it'll cause new problems. It also requires full table rebuilds. The instructions also depend on the version of MySQL/MariaDB.

We plan to bake in some level of support for it in Review Board in the future, but Django doesn't natively support it, and it'll require a bunch of special logic to rebuild data.

I can't currently provide the settings you may need, because many of them are dependent on the version of MySQL/MariaDB you're using, and I haven't verified them lately (just working off internal notes). It boils down to:

1) Using utf8mb4 charsets for all databases, tables, and connections/sessions
2) Using utf8mb4_bin collation for all the above
3) Enabling innodb_large_prefix and innodb_file_per_table (might depend on the versions of MySQL/MariaDB)
4) Enabling innodb_file_format=barracuda (not needed on modern versions)

This is not an exhaustive step-by-step.

PostgreSQL will do UTF-8 by default, fwiw.

Hoping to revisit this support in MySQL/MariaDB after RB4 wraps up. Should be easier now that MySQL/MariaDB have made progress in this area, and I need to update my knowledge of what that progress looks like.
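
To make the key-length point concrete, here is a minimal Python sketch of the arithmetic. It assumes the 767-byte index key limit from the error quoted below, a worst case of 3 bytes per character for "utf8", and 4 bytes per character for utf8mb4. The 255-character column length and the function names are made up for illustration; they are not taken from Review Board's schema.

```python
# Worst-case bytes per character for each MySQL/MariaDB charset.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}

# Index key limit reported by the (1071, ...) error in this thread.
KEY_LIMIT_BYTES = 767


def index_key_bytes(charset, num_chars):
    """Worst-case size in bytes of an index key on a num_chars column."""
    return MAX_BYTES_PER_CHAR[charset] * num_chars


def max_indexable_chars(charset, limit=KEY_LIMIT_BYTES):
    """Longest column (in characters) whose full index key fits in limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    for charset in ("utf8", "utf8mb4"):
        size = index_key_bytes(charset, 255)  # 255 is an illustrative length
        verdict = "fits" if size <= KEY_LIMIT_BYTES else "too long"
        print(f"{charset}: 255 chars -> {size} bytes ({verdict}), "
              f"max {max_indexable_chars(charset)} chars")
```

Under these assumptions, an indexed 255-character column needs 765 bytes with "utf8" but 1020 bytes with utf8mb4, which is over the 767-byte limit. That is the gap the large-prefix settings in steps 3 and 4 are meant to close.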
Christian

On Fri, May 15, 2020 at 5:31 AM Adam Weremczuk <[email protected]> wrote:

> I don't think utf8mb4 was a good idea and I believe it's now leading to:
>
> sudo rb-site install /var/www/mysite
> (...)
> * Installing the site...
> (...)
> Creating table scmtools_repository
>
> [!] There was an error synchronizing the database. Make sure the
> database is created and has the appropriate permissions, and then
> continue.
> [!] Details: (1071, 'Specified key was too long; max key length is 767
> bytes')
>
> Press Enter to continue
>
> On Thursday, 14 May 2020 16:01:35 UTC+1, Adam Weremczuk wrote:
>>
>> Hi all,
>>
>> Following installation guide for MySQL I've added to /etc/mysql/my.cnf
>>
>> [client]
>> default-character-set=utf8
>>
>> [mysqld]
>> character-set-server=utf8
>>
>> MariaDB fails to start:
>>
>> May 14 14:01:41 gittest systemd[1]: Starting MariaDB 10.1.44 database
>> server...
>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>> 139687784537472 [Note] /usr/sbin/mysqld (mysqld 10.1.44-MariaDB-0+deb9u1)
>> starting as process 10318 ...
>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>> 139687784537472 [ERROR] COLLATION 'utf8mb4_general_ci' is not valid for
>> CHARACTER SET 'utf8'
>> May 14 14:01:41 gittest mysqld[10318]: 2020-05-14 14:01:41
>> 139687784537472 [ERROR] Aborting
>> May 14 14:01:41 gittest systemd[1]: mariadb.service: Main process exited,
>> code=exited, status=1/FAILURE
>> May 14 14:01:41 gittest systemd[1]: Failed to start MariaDB 10.1.44
>> database server.
>>
>> When I comment out these 2 additions it starts fine and I can retrieve the
>> following:
>>
>> MariaDB [(none)]> SHOW COLLATION LIKE 'utf8%';
>>
>> +------------------------------+---------+-----+---------+----------+---------+
>> | Collation                    | Charset | Id  | Default | Compiled | Sortlen |
>> +------------------------------+---------+-----+---------+----------+---------+
>> | utf8_general_ci              | utf8    |  33 | Yes     | Yes      |       1 |
>> | utf8_bin                     | utf8    |  83 |         | Yes      |       1 |
>> | utf8_unicode_ci              | utf8    | 192 |         | Yes      |       8 |
>> | utf8_icelandic_ci            | utf8    | 193 |         | Yes      |       8 |
>> | utf8_latvian_ci              | utf8    | 194 |         | Yes      |       8 |
>> | utf8_romanian_ci             | utf8    | 195 |         | Yes      |       8 |
>> | utf8_slovenian_ci            | utf8    | 196 |         | Yes      |       8 |
>> | utf8_polish_ci               | utf8    | 197 |         | Yes      |       8 |
>> | utf8_estonian_ci             | utf8    | 198 |         | Yes      |       8 |
>> | utf8_spanish_ci              | utf8    | 199 |         | Yes      |       8 |
>> | utf8_swedish_ci              | utf8    | 200 |         | Yes      |       8 |
>> | utf8_turkish_ci              | utf8    | 201 |         | Yes      |       8 |
>> | utf8_czech_ci                | utf8    | 202 |         | Yes      |       8 |
>> | utf8_danish_ci               | utf8    | 203 |         | Yes      |       8 |
>> | utf8_lithuanian_ci           | utf8    | 204 |         | Yes      |       8 |
>> | utf8_slovak_ci               | utf8    | 205 |         | Yes      |       8 |
>> | utf8_spanish2_ci             | utf8    | 206 |         | Yes      |       8 |
>> | utf8_roman_ci                | utf8    | 207 |         | Yes      |       8 |
>> | utf8_persian_ci              | utf8    | 208 |         | Yes      |       8 |
>> | utf8_esperanto_ci            | utf8    | 209 |         | Yes      |       8 |
>> | utf8_hungarian_ci            | utf8    | 210 |         | Yes      |       8 |
>> | utf8_sinhala_ci              | utf8    | 211 |         | Yes      |       8 |
>> | utf8_german2_ci              | utf8    | 212 |         | Yes      |       8 |
>> | utf8_croatian_mysql561_ci    | utf8    | 213 |         | Yes      |       8 |
>> | utf8_unicode_520_ci          | utf8    | 214 |         | Yes      |       8 |
>> | utf8_vietnamese_ci           | utf8    | 215 |         | Yes      |       8 |
>> | utf8_general_mysql500_ci     | utf8    | 223 |         | Yes      |       1 |
>> | utf8_croatian_ci             | utf8    | 576 |         | Yes      |       8 |
>> | utf8_myanmar_ci              | utf8    | 577 |         | Yes      |       8 |
>> | utf8_thai_520_w2             | utf8    | 578 |         | Yes      |       4 |
>> | utf8mb4_general_ci           | utf8mb4 |  45 | Yes     | Yes      |       1 |
>> | utf8mb4_bin                  | utf8mb4 |  46 |         | Yes      |       1 |
>> | utf8mb4_unicode_ci           | utf8mb4 | 224 |         | Yes      |       8 |
>> | utf8mb4_icelandic_ci         | utf8mb4 | 225 |         | Yes      |       8 |
>> | utf8mb4_latvian_ci           | utf8mb4 | 226 |         | Yes      |       8 |
>> | utf8mb4_romanian_ci          | utf8mb4 | 227 |         | Yes      |       8 |
>> | utf8mb4_slovenian_ci         | utf8mb4 | 228 |         | Yes      |       8 |
>> | utf8mb4_polish_ci            | utf8mb4 | 229 |         | Yes      |       8 |
>> | utf8mb4_estonian_ci          | utf8mb4 | 230 |         | Yes      |       8 |
>> | utf8mb4_spanish_ci           | utf8mb4 | 231 |         | Yes      |       8 |
>> | utf8mb4_swedish_ci           | utf8mb4 | 232 |         | Yes      |       8 |
>> | utf8mb4_turkish_ci           | utf8mb4 | 233 |         | Yes      |       8 |
>> | utf8mb4_czech_ci             | utf8mb4 | 234 |         | Yes      |       8 |
>> | utf8mb4_danish_ci            | utf8mb4 | 235 |         | Yes      |       8 |
>> | utf8mb4_lithuanian_ci        | utf8mb4 | 236 |         | Yes      |       8 |
>> | utf8mb4_slovak_ci            | utf8mb4 | 237 |         | Yes      |       8 |
>> | utf8mb4_spanish2_ci          | utf8mb4 | 238 |         | Yes      |       8 |
>> | utf8mb4_roman_ci             | utf8mb4 | 239 |         | Yes      |       8 |
>> | utf8mb4_persian_ci           | utf8mb4 | 240 |         | Yes      |       8 |
>> | utf8mb4_esperanto_ci         | utf8mb4 | 241 |         | Yes      |       8 |
>> | utf8mb4_hungarian_ci         | utf8mb4 | 242 |         | Yes      |       8 |
>> | utf8mb4_sinhala_ci           | utf8mb4 | 243 |         | Yes      |       8 |
>> | utf8mb4_german2_ci           | utf8mb4 | 244 |         | Yes      |       8 |
>> | utf8mb4_croatian_mysql561_ci | utf8mb4 | 245 |         | Yes      |       8 |
>> | utf8mb4_unicode_520_ci       | utf8mb4 | 246 |         | Yes      |       8 |
>> | utf8mb4_vietnamese_ci        | utf8mb4 | 247 |         | Yes      |       8 |
>> | utf8mb4_croatian_ci          | utf8mb4 | 608 |         | Yes      |       8 |
>> | utf8mb4_myanmar_ci           | utf8mb4 | 609 |         | Yes      |       8 |
>> | utf8mb4_thai_520_w2          | utf8mb4 | 610 |         | Yes      |       4 |
>> +------------------------------+---------+-----+---------+----------+---------+
>> 59 rows in set (0.00 sec)
>>
>> I've replaced utf8 with utf8mb4 in my.cnf and MariaDB is now starting
>> fine.
>>
>> Have I done the right thing?
>>
>> Shall the installation documentation be updated?
>>
>> Thanks,
>> Adam
>>
>> --
> Supercharge your Review Board with Power Pack:
> https://www.reviewboard.org/powerpack/
> Want us to host Review Board for you? Check out RBCommons:
> https://rbcommons.com/
> Happy user? Let us know! https://www.reviewboard.org/users/
> ---
> You received this message because you are subscribed to the Google Groups
> "Review Board Community" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/reviewboard/a02fb57b-6547-4d43-a028-4e8706a42860%40googlegroups.com
> <https://groups.google.com/d/msgid/reviewboard/a02fb57b-6547-4d43-a028-4e8706a42860%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

--
Christian Hammond
President/CEO of Beanbag <https://www.beanbaginc.com/>
Makers of Review Board <https://www.reviewboard.org/>

--
To view this discussion on the web visit https://groups.google.com/d/msgid/reviewboard/CAE7VndmhBFOxeePH3NGSV9dg2B1XQ8D-guiyRJxABps0%3D%2BK--Q%40mail.gmail.com.
