Hello,

Using a MySQL database keeping Unicode strings under control turned out to 
be a bit of a challenge. I could have sworn that character encoding and 
collation <https://dev.mysql.com/doc/refman/5.6/en/charset-database.html> 
are set up correctly, i.e.

+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| utf8                     | utf8_general_ci      |
+--------------------------+----------------------+

Alas, that turned out not to be the case. So now I’m looking at a few 
databases where random tables and columns are back to latin1 character set. 
Trying to get to a solution that takes a db, checks & switches its 
charset/encoding then iterates over all tables, checks & switches their 
charset/encoding then iterates over the columns and (for appropriate types) 
checks & switches their charset/encoding.

I can do that by issuing SQL statements directly:

SELECT @@character_set_database, @@collation_database;

then if need be

ALTER DATABASE db_name CHARACTER SET "utf8" COLLATE "utf8_general_ci";

Similarly for tables (link 
<https://dev.mysql.com/doc/refman/5.6/en/charset-table.html>) and table 
columns (link <https://dev.mysql.com/doc/refman/5.6/en/charset-column.html>
). 

However, I wonder if there is a more elegant way with SQLAlchemy than 
issuing SQL directly?

I have also noticed that some string types use Unicode 
<https://docs.sqlalchemy.org/en/13/core/type_basics.html#sqlalchemy.types.Unicode>
 (which 
maps to varchar <https://dev.mysql.com/doc/refman/5.6/en/char.html>) and 
others use UnicodeText 
<https://docs.sqlalchemy.org/en/13/core/type_basics.html#sqlalchemy.types.UnicodeText>
 (which 
maps to mediumtext <https://dev.mysql.com/doc/refman/5.6/en/blob.html>). I 
vaguely remember that there were size constraints, I think the maximum of 
65k across all columns of a table whereas mediumtext doesn’t have that 65k 
size constraint? Disregarding that, converting from UnicodeText to Unicode 
should not impact the value, right? While cleaning up the character set 
encoding and collation, I thought that it may be good to get the string 
types right.

Much thanks!
Jens

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/7f8346fa-91bc-45ac-b4f1-0c6086fb0352%40googlegroups.com.

Reply via email to