Hello Michael,

Here's a perhaps better version of that patch. I added a default value, client_encoding='ascii'. For people who only use ASCII it makes no difference, but for people with 8-bit encodings it raises an error if they pass a regular string to SQLAlchemy when convert_unicode=True.
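A minimal sketch of what the 'ascii' default means in practice, written in Python 3 terms (bytes plays the role of the old str, and str the role of unicode). The helper name decode_client_string is hypothetical and is not part of the patch:

```python
# Python 3 sketch: bytes stands in for Python 2's str, str for unicode.

def decode_client_string(raw, client_encoding="ascii"):
    """Hypothetical helper: decode a client byte string with the client encoding."""
    return raw.decode(client_encoding)

ascii_bytes = b"plain text"
cp1251_bytes = "текст".encode("cp1251")  # 8-bit data

# Pure ASCII input is unaffected by the default.
ascii_result = decode_client_string(ascii_bytes)

# 8-bit input is rejected under the 'ascii' default instead of being stored as-is.
try:
    decode_client_string(cp1251_bytes)
    rejected = False
except UnicodeDecodeError:
    rejected = True

# Naming the real client encoding makes the same bytes decode cleanly.
cp1251_result = decode_client_string(cp1251_bytes, client_encoding="cp1251")
```

The failure is deliberate: a loud UnicodeDecodeError is easier to diagnose than bytes in the wrong encoding silently ending up in the database.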
> OK, i think im getting this now...i think between the myghty list and here
> one can begin to see my lack of unicode awareness...
>
> so basically, since your python file has the -*- coding attribute, you
> dont really have to put the u'' around strings that contain multibyte
> characters, since the multibyte encoding is implicit throughout the file.
> so the client_encoding pretty much is designed to match up with a python
> script that has a -*- declaration, is that accurate ?
>
> ill add the patch to my list. code says it all for me ....

Not exactly. See this example:

    # -*- coding: cp1251 -*-
    s1 = 'текст в кодировке cp1251'
    s2 = unicode('текст в кодировке cp1251', 'cp1251')
    s3 = u'текст в кодировке cp1251'
    print 's1:', type(s1)  # prints: s1: <type 'str'>
    print 's2:', type(s2)  # prints: s2: <type 'unicode'>
    print 's3:', type(s3)  # prints: s3: <type 'unicode'>

s1 is a regular Python string, i.e. a sequence of bytes. It cannot be converted to unicode or to another encoding without knowing its encoding.

s2 is a unicode object. The constructor converts it from a regular string because we specified the correct encoding.

s3 is a unicode object as well. Creating s3 is just shorthand for s2: the u'' literal is decoded using the encoding declared in the file's coding line. The coding line does nothing for plain '' literals, which stay byte strings, so the u'' prefix is still needed.

If you are interested, there is a good article about unicode. It's modestly named "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" :-)

http://www.joelonsoftware.com/articles/Unicode.html

> Vasily Sulatskov wrote:
> > Hello Michael,
> >
> > I know there's a database engine parameter "encoding". It tells
> > sqlalchemy in which encoding Unicode objects should be saved to
> > the database.
> >
> > I suggest adding another encoding, let's say "client_encoding", which
> > will be used when convert_unicode is True and the user assigns a string
> > object to an object attribute. Currently, even if convert_unicode is set
> > to True, strings go to the database as-is, bypassing conversion to unicode.
> >
> > This option will allow assigning strings in national/platform
> > specific encodings, like cp1251, straight to object attributes, and they
> > will be properly converted to the database encoding (engine.encoding).
> >
> > See, the encoding on the client machine may be different from the encoding
> > in the database. You can see the changes I suggest in the attached diff.
> >
> > The suggested changes can make life a little easier for users of
> > multilingual/multi-encoding environments while not affecting other
> > users of SQLAlchemy.
> >
> > MB> On Apr 17, 2006, at 5:47 AM, Vasily Sulatskov wrote:
> >>> In my opinion that's a bug and that behaviour should be changed to
> >>> something like this:
> >>> 1. If the object is unicode, convert it to the engine-specified
> >>> encoding (like utf8), as happens now.
> >>> 2. If it's a string, convert it to unicode using another specified
> >>> encoding (it should be added to the engine parameters). This encoding
> >>> specifies the client-side encoding. It's often handy to have different
> >>> encodings in the database and on client machines (at least for people
> >>> with "alternate languages" :-)
> >
> > MB> there already is an encoding parameter for the engine.
> >
> > MB> http://www.sqlalchemy.org/docs/dbengine.myt#database_options
> >
> > MB> does that solve your problem ?
> >
> > --
> > Best regards,
> > Vasily mailto:[EMAIL PROTECTED]
Index: lib/sqlalchemy/types.py
===================================================================
--- lib/sqlalchemy/types.py (revision 1280)
+++ lib/sqlalchemy/types.py (working copy)
@@ -97,7 +97,10 @@
         return {'length':self.length}
     def convert_bind_param(self, value, engine):
         if not engine.convert_unicode or value is None or not isinstance(value, unicode):
-            return value
+            if isinstance(value, str) and engine.client_encoding:
+                return unicode(value, engine.client_encoding).encode(engine.encoding)
+            else:
+                return value
         else:
             return value.encode(engine.encoding)
     def convert_result_value(self, value, engine):
@@ -116,7 +119,10 @@
         if value is not None and isinstance(value, unicode):
             return value.encode(engine.encoding)
         else:
-            return value
+            if isinstance(value, str) and engine.client_encoding:
+                return unicode(value, engine.client_encoding).encode(engine.encoding)
+            else:
+                return value
     def convert_result_value(self, value, engine):
         if value is not None and not isinstance(value, unicode):
             return value.decode(engine.encoding)
Index: lib/sqlalchemy/engine.py
===================================================================
--- lib/sqlalchemy/engine.py (revision 1280)
+++ lib/sqlalchemy/engine.py (working copy)
@@ -229,7 +229,7 @@
     SQLEngines are constructed via the create_engine() function inside this package.
     """
 
-    def __init__(self, pool=None, echo=False, logger=None, default_ordering=False, echo_pool=False, echo_uow=False, convert_unicode=False, encoding='utf-8', **params):
+    def __init__(self, pool=None, echo=False, logger=None, default_ordering=False, echo_pool=False, echo_uow=False, convert_unicode=False, encoding='utf-8', client_encoding='ascii', **params):
         """constructs a new SQLEngine.
         SQLEngines should be constructed via the create_engine() function which will construct the appropriate subclass of SQLEngine."""
         # get a handle on the connection pool via the connect arguments
@@ -250,6 +250,7 @@
         self.echo_uow = echo_uow
         self.convert_unicode = convert_unicode
         self.encoding = encoding
+        self.client_encoding = client_encoding
         self.context = util.ThreadLocal()
         self._ischema = None
         self._figure_paramstyle()
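
For readers who want to try the idea outside SQLAlchemy, here is a rough standalone sketch of the bind-parameter conversion in Python 3 terms (bytes for the old str, str for unicode). FakeEngine and this free-standing convert_bind_param are hypothetical stand-ins, not the library's classes. One assumption to flag: the sketch checks convert_unicode before touching byte strings, as the message describes, whereas the first hunk of the diff as posted reaches the new branch even when convert_unicode is False.

```python
class FakeEngine:
    """Hypothetical stand-in for SQLEngine that carries only the encoding settings."""

    def __init__(self, convert_unicode=False, encoding="utf-8", client_encoding="ascii"):
        self.convert_unicode = convert_unicode
        self.encoding = encoding
        self.client_encoding = client_encoding


def convert_bind_param(value, engine):
    """Approximate the proposed conversion: client bytes -> text -> database bytes."""
    if not engine.convert_unicode or value is None:
        return value
    if isinstance(value, bytes) and engine.client_encoding:
        # Client byte string: decode with client_encoding before re-encoding.
        value = value.decode(engine.client_encoding)
    if isinstance(value, str):
        return value.encode(engine.encoding)
    return value


text = "текст в кодировке cp1251"
engine = FakeEngine(convert_unicode=True, encoding="utf-8", client_encoding="cp1251")

# A cp1251 byte string and the equivalent text object end up as the same UTF-8 bytes.
from_bytes = convert_bind_param(text.encode("cp1251"), engine)
from_text = convert_bind_param(text, engine)

# With convert_unicode off, values pass through untouched.
passthrough = convert_bind_param(text.encode("cp1251"), FakeEngine())
```

Keeping client_encoding separate from encoding is what lets the client and the database disagree about encodings without the application converting by hand.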