The first thing is to see what MySQL thinks its server and client character sets are:
$ paster shell development.ini >>> import pylons.database >>> engine = pylons.database.create_engine() >>> engine.execute("SHOW VARIABLES").fetchall() Look for the variables that start with "character_set_" and "collation_". On my workstation they show: (u'character_set_client', u'utf8'), (u'character_set_connection', u'utf8'), (u'character_set_database', u'utf8'), (u'character_set_filesystem', u'binary'), (u'character_set_results', u'utf8'), (u'character_set_server', u'utf8'), (u'character_set_system', u'utf8'), (u'character_sets_dir', u'/usr/share/mysql/charsets/'), (u'collation_connection', u'utf8_general_ci'), (u'collation_database', u'utf8_general_ci'), (u'collation_server', u'utf8_general_ci'), Then you have a choice of "?use_unicode=1" (a MySQLdb option in the DSN) or: create_engine(dsn, convert_unicode=True, encoding="utf8") SQLAlchemy wants to standardize on 'convert_unicode' because it's the same for all engines, but currently there are some problems with that: 1) You have to know what character set the MySQL connection is. 2) If you're using pylons.database.session_context, you can't add arguments to the create_engine call without hacking pylons.database or reimplementing the functions yourself. Hopefully this will be improved in a future version of Pylons -- as soon as we figure out what to do. 3) convert_unicode works at the Table level, so raw engine.execute() queries will still be 'str' strings. * * * * I have MySQLdb 1.2.1 because Kubuntu is still on the older version and I can't get the newer one to compile: $ easy_install MySQL-python File "/tmp/easy_install-p8zEac/MySQL-python-1.2.2/setup_posix.py", line 24, in mysql_config EnvironmentError: mysql_config not found Thank you, now what is "mysql_config"? * * * * With autoloaded tables you'll get unicode values if you use either use_unicode or convert_unicode, or str values if you don't. With declared columns you'll also have to use the Unicode column type. I had to use a my.cnf file to get the client and server charsets to match. Otherwise I was getting: (u'character_set_client', u'latin1'), (u'character_set_connection', u'latin1'), (u'character_set_database', u'utf8'), (u'character_set_filesystem', u'binary'), (u'character_set_results', u'latin1'), (u'character_set_server', u'utf8'), (u'character_set_system', u'utf8'), (u'character_sets_dir', u'/usr/share/mysql/charsets/'), (u'collation_connection', u'latin1_swedish_ci'), (u'collation_database', u'utf8_general_ci'), (u'collation_server', u'utf8_general_ci'), The my.cnf is a symlink to /etc/mysql/my.cnf and contains among other things: [client] default-character-set = utf8 [mysqld] character-set-server = utf8 default-character-set = utf8 The DSN looks like this: sqlalchemy.dbapi = mysql://...?read_default_file=%(here)s/my.cnf&use_unicode=1 Unfortunately MySQLdb does not raise an exception if the file is missing, so it's worth creating a unit test that checks whether all the character sets are correct. I haven't tried that yet. * * * * If you have existing records in the database which are in a different charset than what the column purports to be, and you want to keep the actual values as is, change the column type to BINARY in MySQL, then change it to the desired charset: http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.html If you want to change the data to the charset the column purports to be, write a Python program that connects WITHOUT Unicode support, read the values, decode them from the source charset, encode them into the destination charset, and write them back to the database. With "mysql://...?use_unicode=1" and reflected tables, I get Unicode values. Without use_unicode, I get string values. If I hack pylons.database to pass 'convert_unicode=True' to create_engine() in a way that it'll be propagated to the session_context, I get Unicode values again. Curiously, in all three cases the column instances are MSString with convert_unicode=False: Column(u'name',MSString(length=255,convert_unicode=False),nullable=False) * * * * The following didn't work: "mysql://...?use_unicode=1&charset=utf8" sqlalchemy.exceptions.DBAPIError: (Connection failed) (OperationalError) (2019, 'Can\'t initialize character set utf8" (path: /usr/share/mysql/charsets/)') This may be because I'm using the older MySQLdb, or maybe it's a bug Ubuntu's packaging of MySQL. -- Mike Orr <[EMAIL PROTECTED]> --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "pylons-discuss" group. To post to this group, send email to pylons-discuss@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/pylons-discuss?hl=en -~----------~----~----~----~------~----~------~--~---