Hello Michael,

I know there's a database engine parameter "encoding". It tells SQLAlchemy in which encoding Unicode objects should be saved to the database.
I suggest adding another encoding, say "client_encoding", which will be used when convert_unicode is True and the user assigns a string object to an object attribute. Currently, even if convert_unicode is set to True, strings go to the database as-is, bypassing conversion to unicode. This option will allow assigning strings in national/platform-specific encodings, like cp1251, straight to object attributes, and they will be properly converted to the database encoding (engine.encoding). The point is that the encoding on a client machine may differ from the encoding in the database.

You can see the changes I suggest in the attached diff. They can make life a little easier for users of multilingual/multi-encoding environments while not affecting other users of SQLAlchemy.

MB> On Apr 17, 2006, at 5:47 AM, Vasily Sulatskov wrote:

>> In my opinion that's a bug and that behaviour should be changed to
>> something like this:
>> 1. If the object is unicode, then convert it to the engine-specified
>> encoding (like utf8), as happens now.
>> 2. If it's a string, then convert it to unicode using another
>> specified encoding (it should be added to the engine parameters). This
>> encoding specifies the client-side encoding. It's often handy to have
>> different encodings in the database and on client machines (at least
>> for people with "alternate languages" :-)

MB> there already is an encoding parameter for the engine.
MB> http://www.sqlalchemy.org/docs/dbengine.myt#database_options
MB> does that solve your problem ?

-- 
Best regards,
 Vasily                          mailto:[EMAIL PROTECTED]
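For illustration only, the two-step rule from the quoted message can be sketched in modern Python, where `bytes` plays the role of the old string type and `str` the role of unicode. This is not the contents of the attached diff: `to_db_bytes` is a hypothetical helper, and the default encodings are assumptions taken from the examples in this thread.

```python
def to_db_bytes(value, engine_encoding="utf-8", client_encoding="cp1251"):
    """Hypothetical helper: return `value` as bytes in the database encoding."""
    if isinstance(value, bytes):
        # Rule 2: a plain (byte) string is first decoded with the
        # client-side encoding, giving unicode text.
        value = value.decode(client_encoding)
    # Rule 1: unicode text is encoded with the engine (database) encoding.
    return value.encode(engine_encoding)


text = "Какой-то текст"          # unicode text
raw = text.encode("cp1251")      # the same text as a cp1251 byte string

# Both forms end up as identical utf-8 bytes.
same = to_db_bytes(text) == to_db_bytes(raw) == text.encode("utf-8")
```

With such a rule, ASCII-only strings keep working unchanged, since ASCII bytes decode identically under cp1251 and utf-8.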
encodings.diff
# -*- coding: cp1251 -*-
import sqlalchemy

db = sqlalchemy.create_engine('sqlite://', echo=False, echo_uow=False,
                              convert_unicode=True, client_encoding='cp1251')

# a table to store companies
companies = sqlalchemy.Table('companies', db,
    sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column('name', sqlalchemy.Unicode(50)))

class Company(object):
    pass

sqlalchemy.assign_mapper(Company, companies)
companies.create()

# Company(name=u'Some text in cp1251 encoding')
# This line works perfectly: the unicode object is automatically encoded to
# utf8 before going to the database
Company(name=u'Какой-то текст в кодировке cp1251')

# This line still works fine:
# It goes to the database as is, i.e. as a string, and when decoded
# it is valid utf8 that can be converted to unicode without
# problems
Company(name='Some text in ascii')

# And this line causes problems:
# It goes to the database as is, i.e. as a string, and when it is read
# back it cannot be decoded as utf8 (see below)
Company(name='Какой-то текст в кодировке cp1251')

sqlalchemy.objectstore.commit()
sqlalchemy.objectstore.clear()

c = Company.get(1)
print type(c.name)

c = Company.get(2)
# Now we get something funny. We specified name as a string during
# object creation and get it out of the database as Unicode.
print type(c.name)

# And this line will crash the interpreter because sqlalchemy tries to
# convert name to Unicode as if it were utf8, and it is not. It is still in
# cp1251 encoding
c2 = Company.get(3)
print c2.name.encode('cp866')

print 'Done'
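The failure on the third row can be reproduced without SQLAlchemy at all. The following is a minimal sketch in modern Python (an assumption; the script above is Python 2), where `bytes` stands in for the old plain string type. It only shows the decoding step, not how the library reads rows.

```python
# What a cp1251 client would hand over as a plain string.
raw = "Какой-то текст".encode("cp1251")

# Reading it back as if it were utf8 fails: the cp1251 bytes are not
# a valid utf8 sequence.
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ASCII-only strings survive, because ASCII is a subset of utf8.
ascii_text = b"Some text in ascii".decode("utf-8")

# Decoding with the actual client encoding recovers the text.
recovered = raw.decode("cp1251")
```

This is why the second row appears to work by accident while the third does not: only the ASCII string happens to be valid utf8.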