On 11/3/06, Shannon -jj Behrens <[EMAIL PROTECTED]> wrote:
> I'm using convert_unicode=True. Everything is fine as long as I'm the
> one reading and writing the data. However, if I look at what's
> actually being stored in the database, it's like the data has been
> encoded twice. If I switch to use_unicode=True, which I believe is
> MySQL specific, things work just fine and what's being stored in the
> database looks correct.
>
> I started looking through the SQLAlchemy code, and I came across this:
>
>     def convert_bind_param(self, value, dialect):
>         if not dialect.convert_unicode or value is None or not isinstance(value, unicode):
>             return value
>         else:
>             return value.encode(dialect.encoding)
>     def convert_result_value(self, value, dialect):
>         if not dialect.convert_unicode or value is None or isinstance(value, unicode):
>             return value
>         else:
>             return value.decode(dialect.encoding)
>
> The logic looks backwards. It says, "If it's not a unicode object,
> return it. Otherwise, encode it." Later, "If it is a unicode object,
> return it. Otherwise decode it."
>
> Am I correct that this is backwards? If so, this is going to be
> *painful* to update all the databases out there!
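To make the question concrete, here is a standalone trace of the two quoted methods. It is only a sketch: FakeDialect is a hypothetical stand-in I made up for illustration (not the real SQLAlchemy dialect class), the methods are rewritten as plain functions, and text_type is a helper name so the snippet runs on both Python 2 and Python 3.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


class FakeDialect(object):
    """Hypothetical stand-in for a dialect; only the two attributes used here."""
    convert_unicode = True
    encoding = "utf-8"


def convert_bind_param(value, dialect):
    # Same condition as the quoted method: only text objects get encoded.
    if (not dialect.convert_unicode or value is None
            or not isinstance(value, text_type)):
        return value
    return value.encode(dialect.encoding)


def convert_result_value(value, dialect):
    # Same condition as the quoted method: only byte strings get decoded.
    if (not dialect.convert_unicode or value is None
            or isinstance(value, text_type)):
        return value
    return value.decode(dialect.encoding)


dialect = FakeDialect()
original = u"test \xe7"
stored = convert_bind_param(original, dialect)   # what would go to the driver
loaded = convert_result_value(stored, dialect)   # what would come back out
print(repr(stored))
print(repr(loaded))
```

With these stand-ins, a unicode value goes out as UTF-8 bytes and comes back as the same unicode value. This says nothing about what the driver does with those bytes in between.
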
Ok, MySQLdb doesn't have a mailing list, so I can't ask there. Here are some things I've learned:

Changing from convert_unicode=True to use_unicode=True doesn't do what you'd expect. SQLAlchemy is passing keyword arguments all over the place, and use_unicode actually gets ignored. <minor rant>I personally think that you should be strict *somewhere* when you're passing around keyword arguments. I've been bitten in this way too many times. Unknown keyword arguments should result in exceptions.</minor rant>

Anyway, I'm still a bit worried about the code quoted above. However, here's what's even scarier. If I use the following code:

    import MySQLdb

    for use_unicode in (True, False):
        connection = MySQLdb.connect(host="localhost", user="user",
                                     passwd='dataase', db="users",
                                     use_unicode=use_unicode)
        cursor = connection.cursor()
        cursor.execute("select firstName from users where username='test'")
        row = cursor.fetchone()
        print "use_unicode:%s %r" % (use_unicode, row)

I get:

    use_unicode:True (u'test \xc3\xa7',)
    use_unicode:False ('test \xc3\xa7',)

Notice the contents are the same, but one is a unicode object and the other isn't. Notice that it's \xc3\xa7 each time? It shouldn't be. Consider:

    >>> s = 'test \xc3\xa7'
    >>> s.decode('utf-8')
    u'test \xe7'

*It's creating a unicode object without actually doing any decoding!* This is happening somewhere lower level than SQLAlchemy, but I don't have anywhere else to turn.

SQLAlchemy: 0.2.8
MySQLdb: 1.36.2.4
mysql client and server: 5.0.22
Ubuntu: 6.0.6

Help!
-jj

-- 
http://jjinux.blogspot.com/
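
P.S. The symptom can be reproduced without a database. Mapping each byte straight to a code point is exactly what a latin-1 decode does, so the sketch below uses latin-1 to imitate the result I'm seeing. That MySQLdb is really doing a latin-1 decode is only my assumption; I haven't confirmed it. The variable names are just for illustration.

```python
# Bytes as they come back with use_unicode=False.
raw = b"test \xc3\xa7"

# What I expected with use_unicode=True: a real UTF-8 decode.
expected = raw.decode("utf-8")      # u'test \xe7'

# What I actually got: one code point per byte, same as a latin-1 decode.
observed = raw.decode("latin-1")    # u'test \xc3\xa7'

# Encoding that value as UTF-8 again gives the "encoded twice" bytes.
twice = observed.encode("utf-8")

# Undoing the damage: back to the original bytes, then decode properly.
repaired = observed.encode("latin-1").decode("utf-8")

print(repr(expected))
print(repr(observed))
print(repr(twice))
print(repr(repaired))
```

If that guess is right, it would also explain why the stored data looks like it has been encoded twice whenever something re-encodes the mis-decoded value.
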