Hello

I have a question about the convert_unicode engine option.

The documentation says:

convert_unicode=False : if set to True, all String/character based types will 
convert Unicode values to raw byte values going into the database, and all 
raw byte values to Python Unicode coming out in result sets. This is an 
engine-wide method to provide unicode across the board. For unicode 
conversion on a column-by-column level, use the Unicode column type 
instead.

But when convert_unicode is set to True, it converts Unicode objects to strings 
and leaves String objects unchanged, which can lead to problems.

Here is a simple example:
# -*- coding: cp1251 -*-

import sqlalchemy

db = sqlalchemy.create_engine('sqlite://', echo=True, echo_uow=False,
    convert_unicode=True)

# a table to store companies
companies = sqlalchemy.Table('companies', db, 
    sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
    sqlalchemy.Column('name', sqlalchemy.String(50)))
    
class Company(object):
    pass

sqlalchemy.assign_mapper(Company, companies)

companies.create()

# The string below means u'Some text in cp1251 encoding'.
# This line works perfectly: the unicode object is automatically encoded to
# utf8 before going to the database.
Company(name=u'Какой-то текст в кодировке cp1251')

# This line still works fine:
# It goes to database as is, i.e. as a string and when decoded
# it is a valid utf8 that can be converted to unicode without
# problems
Company(name='Some text in ascii')

# And this line causes problems:
# It goes to database as is, i.e. as a cp1251 string, and when
# decoded it is not valid utf8, so it cannot be converted to unicode
Company(name='Какой-то текст в кодировке cp1251')

sqlalchemy.objectstore.commit()

sqlalchemy.objectstore.clear()


c = Company.get(1)
print type(c.name)


c = Company.get(2)
# Now we get something funny. We specified name as a string during
# object creation and get it out of database as Unicode.
print type(c.name)

# And this line will crash the interpreter because sqlalchemy tries to convert
# name to Unicode as if it were utf8, and it is not. It is still in cp1251
# encoding.
c2 = Company.get(3)
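
The failure on the third row can be reproduced without sqlalchemy at all. The following is only a minimal sketch, written for Python 3 (where bytes plays the role of the old str type and str the role of unicode); the helper name try_decode is made up for illustration and is not part of any library.

```python
# Python 3 sketch: bytes stands in for the Python 2 str type.
text = 'Какой-то текст в кодировке cp1251'

utf8_bytes = text.encode('utf-8')      # what a unicode value is stored as
cp1251_bytes = text.encode('cp1251')   # what a raw cp1251 byte string is stored as
ascii_bytes = b'Some text in ascii'    # plain ASCII is also valid utf-8


def try_decode(raw, encoding='utf-8'):
    """Hypothetical helper: return decoded text, or None if raw is invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


print(try_decode(utf8_bytes))    # decodes cleanly
print(try_decode(ascii_bytes))   # decodes cleanly
print(try_decode(cp1251_bytes))  # cp1251 bytes are not valid utf-8 here
```

Only the cp1251 byte string fails to decode as utf8, which matches the three cases in the example above.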


So is this the intended behaviour of sqlalchemy, or is it a bug?

In my opinion that's a bug, and the behaviour should be changed to something 
like this:
1. If the object is unicode, convert it to the engine-specified encoding (like 
utf8), as happens now.
2. If it's a string, convert it to unicode using another specified 
encoding (which should be added to the engine parameters). This encoding specifies
the client-side encoding. It's often handy to have different encodings in the database 
and on client machines (at least for people with "alternate languages" :-)
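
The two rules could look roughly like the sketch below. It is written for Python 3 (bytes instead of the old str type), and everything in it is an assumption made for illustration: the function names and the client_encoding parameter are hypothetical and do not exist in sqlalchemy.

```python
def to_db_bytes(value, db_encoding='utf-8', client_encoding='cp1251'):
    """Hypothetical helper sketching the two proposed rules.

    db_encoding plays the role of the engine encoding; client_encoding is
    the proposed new engine parameter (an assumption, not a real option).
    """
    if isinstance(value, bytes):
        # Rule 2: a raw byte string is assumed to be in the client encoding,
        # so decode it to text first.
        value = value.decode(client_encoding)
    # Rule 1: text is encoded to the engine's encoding, as happens now.
    return value.encode(db_encoding)


def from_db_bytes(raw, db_encoding='utf-8'):
    """Decode a value coming out of the database back to text."""
    return raw.decode(db_encoding)


name = 'Какой-то текст в кодировке cp1251'
stored = to_db_bytes(name.encode('cp1251'))  # raw cp1251 bytes from the client
print(from_db_bytes(stored) == name)
```

With this, a cp1251 byte string and the equivalent unicode object end up as the same utf8 bytes in the database, so reading either row back gives the same text.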

If that's indeed a problem with sqlalchemy and not with my expectations of what 
sqlalchemy should be, then I can perhaps make those changes to sqlalchemy.


_______________________________________________
Sqlalchemy-users mailing list
Sqlalchemy-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users
