Uh, ok, something obviously went wrong there. Checking.

On Sat, Jul 21, 2018 at 8:30 AM, Rasmus Lerdorf <ras...@lerdorf.com> wrote:
> For future reference, here is what I did to fix the encoding problem:
>
> MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
> +-----------------------------------------------------------
> ------------------------------------------------------------
> ---------------------------------------------+
> | sdesc
>
> |
> +-----------------------------------------------------------
> ------------------------------------------------------------
> ---------------------------------------------+
> | Ð˜Ð¼Ñ Ð¿ÐµÑ€ÐµÐ¼ÐµÐ½Ð½Ð¾Ð¹ может Ñ Ð¾Ð´ÐµÑ€Ð¶Ð°Ñ‚ÑŒ ÑƒÐ¿Ñ€Ð°Ð²Ð»Ñ
> ющие
> |
> +-----------------------------------------------------------
> ------------------------------------------------------------
> ---------------------------------------------+
> 1 row in set (0.00 sec)
>
> MariaDB [phpbugsdb]> alter table bugdb drop index email;
> Query OK, 76298 rows affected (0.85 sec)
> Records: 76298 Duplicates: 0 Warnings: 0
>
> MariaDB [phpbugsdb]> alter table bugdb modify sdesc varbinary(80) NOT NULL
> DEFAULT '', modify ldesc binary NOT NULL, modify email varbinary(40) NOT
> NULL DEFAULT '';
> Query OK, 76298 rows affected, 65535 warnings (0.65 sec)
> Records: 76298 Duplicates: 0 Warnings: 76091
>
> MariaDB [phpbugsdb]> alter table bugdb modify sdesc varchar(80) CHARACTER
> SET utf8mb4 NOT NULL DEFAULT '', modify ldesc text CHARACTER SET utf8mb4
> NOT NULL, modify email varchar(40) CHARACTER SET utf8mb4 NOT NULL DEFAULT
> '';
> Query OK, 76298 rows affected, 127 warnings (0.57 sec)
> Records: 76298 Duplicates: 0 Warnings: 127
>
> MariaDB [phpbugsdb]> alter table bugdb add FULLTEXT INDEX `email`
> (`email`,`sdesc`,`ldesc`);
> Query OK, 76298 rows affected (1.56 sec)
> Records: 76298 Duplicates: 0 Warnings: 0
>
> MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
> +-----------------------------------------------------------
> -----------------------+
> | sdesc
> |
> +-----------------------------------------------------------
> -----------------------+
> | Имя переменной может содержать управляющие
> |
> +-----------------------------------------------------------
> -----------------------+
> 1 row in set (0.00 sec)
>
> The trick was to convert the columns to binary first. When I went straight
> from latin1 to utf8 I got the utf8 equivalent of the latin1 characters. By
> telling it that the data was actually binary first, it converted from
> binary to utf8 which appears to have worked. There were some warnings,
> which I assume are invalid utf8 byte sequences somewhere.
>
> -Rasmus
>
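
For anyone reading the archive later, the difference between the two conversion paths described in the quoted message can be sketched outside the database. This is a minimal Python illustration, not the commands that were run. It assumes the application had written UTF-8 bytes into columns declared as latin1, it uses Python's "latin-1" codec as a stand-in for the server's latin1 charset, and the function names are made up for the example.

```python
# Sketch of the double-encoding problem and the "binary first" repair.
# Assumptions: the stored bytes are valid UTF-8, and Python's "latin-1"
# codec approximates the server's latin1 column charset.
# misconvert() and reinterpret() are hypothetical helper names.


def misconvert(stored_bytes: bytes) -> bytes:
    """Simulate converting a latin1-declared column straight to utf8.

    Every stored byte is treated as one latin1 character and then
    re-encoded, so each byte of a multi-byte UTF-8 sequence becomes
    its own character.
    """
    return stored_bytes.decode("latin-1").encode("utf-8")


def reinterpret(stored_bytes: bytes) -> str:
    """Simulate the binary detour: bytes are kept as-is, only relabeled."""
    return stored_bytes.decode("utf-8")


original = "Имя переменной"
stored = original.encode("utf-8")  # bytes the application actually wrote

garbled = misconvert(stored).decode("utf-8")  # what the direct path shows
fixed = reinterpret(stored)                   # what the binary path shows

print(repr(garbled))
print(fixed)
```

The direct path changes the bytes (each original byte is re-encoded as a separate character), while the binary path leaves the bytes alone and only changes how they are labeled, which matches the behavior described in the quoted message.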