Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> No. APSW converts it *to* Unicode. SQLite only accepts Unicode so a Unicode string has to be supplied. If you supply a non-Unicode string then conversion has to happen. APSW asks Python to supply the string in Unicode. If Python can't do that (eg it doesn't know the encoding) then you get an error.

If what you say is true, I have to ask why I get a conversion error which states it can't convert to ASCII, not that it can't convert to Unicode?

> > Ok if SQLite uses unicode internally why do you need to ignore everything greater than 127,
>
> I never said that. I said that a special case is made so that if the string you supply only contains ASCII characters (ie <= 127) then the ASCII string is converted to Unicode. (In fact it is valid UTF-8, hence the shortcut.)
>
> > the ascii table (the 256-character one) fits into unicode just fine as far as I recall?
>
> No, ASCII characters have defined Unicode codepoints. The ASCII character number just happens to be the same as the Unicode codepoint. But there are only 128 ASCII characters.
>
> > Or did I miss the boat here?
>
> For bytes greater than 127, what character set is used? There are hundreds of character sets that define those characters. You have to tell the computer which one to use. See the Unicode article referenced above.

Yes, I know there are a million extended ASCII character sets, which happen to be the bane of all existence. Most computers deal in bytes natively, and the 7-bit coding still causes problems to this day. But since the error I get is a conversion error to ASCII, not from ASCII, I am willing to accept loss of information. You can't encode Unicode into ASCII without loss of information or two-character codes. In my mind, somewhere inside the cursor.execute function, it converts to ASCII. I say this because of the error message received. So I am missing how a function which supposedly converts everything to Unicode ends up doing an ASCII conversion?
--
http://mail.python.org/mailman/listinfo/python-list
Re: Convertion of Unicode to ASCII NIGHTMARE
> There's an Oracle environment variable that appears to make a difference: NLS_CHARSET, perhaps - it's been a while since I've had to deal with Oracle, and I'm not looking for another adventure into Oracle's hideous documentation to find out.

That is an EVIL setting which should not be used. The NLS_CHARSET environment variable causes so many headaches it's not worth playing with it at all.
Re: Convertion of Unicode to ASCII NIGHTMARE
ChaosKCW wrote:
> Roger Binns wrote:
> > No. APSW converts it *to* Unicode. SQLite only accepts Unicode so a Unicode string has to be supplied. If you supply a non-Unicode string then conversion has to happen. APSW asks Python to supply the string in Unicode. If Python can't do that (eg it doesn't know the encoding) then you get an error.
>
> If what you say is true, I have to ask why I get a conversion error which states it can't convert to ASCII, not that it can't convert to Unicode?

You do get an error about conversion to Unicode. Quote from your message:

    SQLiteCur.execute(sql, row)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 12: ordinal not in range(128)

Notice the name of the error: UnicodeDecode, or in other words ToUnicode.

> So I am missing how a function which supposedly converts everything to Unicode ends up doing an ASCII conversion?

When Python tries to concatenate a byte string and a unicode string, it assumes that the byte string is ASCII-encoded and tries to convert it from encoded ASCII to unicode. It calls the ascii decoder to do the decoding. If decoding fails, you see a message from the ascii decoder about the error.

Serge
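Serge's point can be reproduced directly. A minimal sketch, in Python 3 syntax (where the decode must be written out explicitly; in the Python 2 of this thread the same ascii decode happens implicitly on concatenation):

```python
# The byte 0xDC is outside range(128), so the 'ascii' codec refuses it --
# this is exactly the error in the OP's traceback.
data = b"M\xdcNCHEN"  # sample bytes from some unknown 8-bit encoding

try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    # The exception names the codec ('ascii'), not the target ('unicode'),
    # which is what confused the OP.
    print(exc.encoding, exc.start, hex(data[exc.start]))  # ascii 1 0xdc
```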
Re: Convertion of Unicode to ASCII NIGHTMARE
ChaosKCW wrote:
> > There's an Oracle environment variable that appears to make a difference: NLS_CHARSET, perhaps - it's been a while since I've had to deal with Oracle, and I'm not looking for another adventure into Oracle's hideous documentation to find out.
>
> That is an EVIL setting which should not be used. The NLS_CHARSET environment variable causes so many headaches it's not worth playing with it at all.

Well, at this very point in time I don't remember the preferred way of getting Oracle, the client libraries and the database adapter to agree on the character encoding used for communicating data between applications and the database system. Nevertheless, what you need to do is make sure that you know which encoding is used, so that if you either get plain strings (ie. not Unicode objects) out of the database, or need to write plain strings to the database, you can provide the encoding to the unicode built-in function or to the decode/encode methods. This is much better than just stripping out characters that can't be represented in ASCII.

Anyway, despite my objections to digging through Oracle documentation, I found the following useful documents: the Globalization Support index [1], an FAQ about NLS_LANG [2], and a white paper about Unicode support in Oracle [3]. It may well be the case that NLS_LANG might help you do what you want, but since the database systems I have installed (PostgreSQL, sqlite3) seem to do Unicode without such horsing around, I'm not really able to offer much more advice on this subject.

Paul

[1] http://www.oracle.com/technology/tech/globalization/index.html
[2] http://www.oracle.com/technology/tech/globalization/htdocs/nls_lang faq.htm
[3] http://www.oracle.com/technology/tech/globalization/pdf/TWP_AppDev_Unicode_10gR2.pdf
Re: Convertion of Unicode to ASCII NIGHTMARE
> When Python tries to concatenate a byte string and a unicode string, it assumes that the byte string is ASCII-encoded and tries to convert it from encoded ASCII to unicode. It calls the ascii decoder to do the decoding. If decoding fails, you see a message from the ascii decoder about the error.
>
> Serge

Ok, I get it now. Sorry for the slowness. I have to say, as a lover of Python for its simplicity and clarity, the character set thing has been harder than I would have liked to figure out. Thanks for all the help.
Help on exceptions (was: Convertion of Unicode to ASCII NIGHTMARE)
ChaosKCW wrote:
> Ok, I get it now. Sorry for the slowness. I have to say, as a lover of Python for its simplicity and clarity, the character set thing has been harder than I would have liked to figure out.

I think there is room for improvement here. In my opinion the message is too confusing for newbies. It would be easier for them if there were a mini tutorial available about what's going on, with links to other, broader tutorials (like a unicode tutorial). Instead of the generic error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xa5 in position 0: ordinal not in range(128)

it could be like this. Notice the special URL: it is pointing to a non-existing :) tutorial about why concatenating byte strings with unicode strings can produce UnicodeDecodeError:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xa5 in position 0: ordinal not in range(128)
    For additional information about this exception see:
    http://docs.python.org/2.4/exceptions/concat+UnicodeDecodeError+str

Here is sample code showing how it could be done (keys are strings rather than the function object, so the table can be built before the function is defined):

    extended_help = {
        ('concat', UnicodeDecodeError, str, unicode):
            "http://docs.python.org/2.4/exceptions/concat+UnicodeDecodeError+str",
        ('concat', UnicodeDecodeError, unicode, str):
            "http://docs.python.org/2.4/exceptions/concat+UnicodeDecodeError+str",
    }

    def get_more_help(error, key):
        if key not in extended_help:
            return
        error.reason += "\nFor additional information about this exception see:\n"
        error.reason += extended_help[key]

    def concat(s1, s2):
        try:
            return s1 + s2
        except Exception, e:
            key = ('concat', e.__class__, type(s1), type(s2))
            get_more_help(e, key)
            raise

    concat(chr(0xA5), unichr(0x5432))
Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> Serge Orlov [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> > I have an impression that handling/production of byte order marks is pretty clear: they are produced/consumed only by two codecs: utf-16 and utf-8-sig. What is not clear?
>
> Are you talking about the C APIs in Python/SQLite (that is what I have been discussing) or the language level?

Both. Documentation for PyUnicode_DecodeUTF16 and PyUnicode_EncodeUTF16 is pretty clear about when a BOM is produced/removed. The only problem is that you have to find out the host endianness yourself. In Python it's sys.byteorder; in C you use a hack like:

    unsigned long one = 1;
    int byteorder = (*(char *) &one == 1) ? -1 : 1;  /* -1: little-endian, 1: big-endian */

And then pass that byteorder to PyUnicode_(De/En)codeUTF16. So I still don't see what is unclear about BOM production/handling.

> At the C level, SQLite doesn't accept BOMs.

It would be surprising if it did. Quote from http://www.unicode.org/faq/utf_bom.html: "Where the data is typed, such as a field in a database, a BOM is unnecessary".
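The BOM behaviour being debated here can be checked from the language level. A sketch in Python 3 syntax: the generic utf-16 codec reads the BOM, picks the byte order from it, and strips it, while an endian-specific codec treats those bytes as an ordinary character:

```python
import codecs

payload = "Ü".encode("utf-16-le")        # b'\xdc\x00', no BOM
raw = codecs.BOM_UTF16_LE + payload      # BOM + little-endian data

# 'utf-16' consumes the BOM and decodes the rest as little-endian.
assert raw.decode("utf-16") == "Ü"

# 'utf-16-le' has a fixed byte order, so the BOM survives as U+FEFF.
assert raw.decode("utf-16-le") == "\ufeffÜ"
```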
Re: Convertion of Unicode to ASCII NIGHTMARE
Fredrik Lundh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Roger Binns wrote:
> > SQLite only accepts Unicode so a Unicode string has to be supplied.
>
> fact or FUD? let's see:

Note I said SQLite. For APIs that take/give strings, you can either supply/get a UTF-8 encoded sequence of bytes, or a two-bytes-per-character host-byte-order sequence. Any wrapper of SQLite that doesn't do Unicode in/out is seriously breaking things. I ended up using the UTF-8 versions of the API, as Python can't quite make its mind up how to represent Unicode strings at the C API level. You can have two bytes per char or four, and the handling/production of byte order marks isn't that clear either.

> import pysqlite2.dbapi2 as DB

pysqlite had several unicode problems in the past. It has since been cleaned up, as you saw.

Roger
Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> > fact or FUD? let's see:
>
> Note I said SQLite. For APIs that take/give strings, you can either supply/get a UTF-8 encoded sequence of bytes, or a two-bytes-per-character host-byte-order sequence. Any wrapper of SQLite that doesn't do Unicode in/out is seriously breaking things. I ended up using the UTF-8 versions of the API, as Python can't quite make its mind up how to represent Unicode strings at the C API level. You can have two bytes per char or four, and the handling/production of byte order marks isn't that clear either.

sounds like your understanding of Unicode and Python's Unicode system is a bit unclear.

/F
Re: Convertion of Unicode to ASCII NIGHTMARE
Fredrik Lundh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> sounds like your understanding of Unicode and Python's Unicode system is a bit unclear.

Err, no. Relaying unicode data between two disparate C APIs requires being careful and thorough. That means paying attention to when conversions happen, byte ordering (compile time) and BOMs (run time), and, when the API documentation isn't thorough, verifying the behaviour yourself. That requires a very clear understanding of Unicode in order to write the requisite test cases, as well as reading what the code does.

Roger
Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> Fredrik Lundh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> > Roger Binns wrote:
> > > SQLite only accepts Unicode so a Unicode string has to be supplied.
> >
> > fact or FUD? let's see:
>
> Note I said SQLite. For APIs that take/give strings, you can either supply/get a UTF-8 encoded sequence of bytes, or a two-bytes-per-character host-byte-order sequence. Any wrapper of SQLite that doesn't do Unicode in/out is seriously breaking things. I ended up using the UTF-8 versions of the API as Python can't quite make its mind up how to represent Unicode strings at the C API level. You can have two bytes per char or four, and the handling/production of byte order marks isn't that clear either.

I have an impression that handling/production of byte order marks is pretty clear: they are produced/consumed only by two codecs: utf-16 and utf-8-sig. What is not clear?

Serge
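Serge's claim about the two BOM-aware codecs is easy to check. A sketch in Python 3 syntax (the codec names are the same as in late Python 2):

```python
s = "hello"

# utf-8-sig prepends the UTF-8 encoded BOM on encoding...
assert s.encode("utf-8-sig") == b"\xef\xbb\xbf" + s.encode("utf-8")

# ...and strips it, if present, on decoding.
assert b"\xef\xbb\xbfhello".decode("utf-8-sig") == "hello"
assert b"hello".decode("utf-8-sig") == "hello"   # BOM is optional

# The plain utf-8 codec leaves the BOM alone: it decodes to U+FEFF.
assert b"\xef\xbb\xbfhello".decode("utf-8") == "\ufeffhello"
```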
Re: Convertion of Unicode to ASCII NIGHTMARE
Serge Orlov [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> I have an impression that handling/production of byte order marks is pretty clear: they are produced/consumed only by two codecs: utf-16 and utf-8-sig. What is not clear?

Are you talking about the C APIs in Python/SQLite (that is what I have been discussing) or the language level? At the C level, SQLite doesn't accept BOMs. You have to provide UTF-8, or host-byte-order two-bytes-per-char UTF-16.

Roger
Re: Convertion of Unicode to ASCII NIGHTMARE
Paul Boddie [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> It looks like you may have Unicode objects that you're presenting to sqlite. In any case, with earlier versions of pysqlite that I've used, you need to connect with a special unicode_results parameter,

He is using apsw. apsw correctly handles unicode. In fact it won't accept a str with bytes > 127, as they would be in an unknown encoding, and SQLite only uses Unicode internally. It does have a blob type using buffer for situations where binary data needs to be stored. pysqlite's mishandling of Unicode is one of the things that drove me to writing apsw in the first place.

Roger
Re: Convertion of Unicode to ASCII NIGHTMARE
Hi

Thanks for all the posts. I am still digesting it all, but here are my initial comments.

> Don't. You can't. Those characters don't exist in the ASCII character set. SQLite 3.0 deals with UTF-8 encoded SQL statements, though. http://www.sqlite.org/version3.html

As mentioned by the next poster, there is a way: it's supposed to be encoded with the 'ignore' option. Thus you lose data, but that's just dandy with me. As for SQLite supporting unicode, it probably does, but something on the Python side (probably in apsw) converts it to ascii at some point before it's handed to SQLite.

> The .encode() method returns a new value; it does not change an object in place. sql = sql.encode('utf-8')

Ah yes, big mistake on my part :-/

> He is using apsw. apsw correctly handles unicode. In fact it won't accept a str with bytes > 127, as they would be in an unknown encoding, and SQLite only uses Unicode internally. It does have a blob type using buffer for situations where binary data needs to be stored. pysqlite's mishandling of Unicode is one of the things that drove me to writing apsw in the first place.

Ok, if SQLite uses unicode internally, why do you need to ignore everything greater than 127? The ascii table (the 256-character one) fits into unicode just fine as far as I recall? Or did I miss the boat here?

Thanks,
Re: Convertion of Unicode to ASCII NIGHTMARE
> Thus you lose data, but that's just dandy with me.

Please reconsider this attitude before you perpetrate a nonsense or even a disaster. Wrt your last para:

1. Roger didn't say ignore -- he said won't accept (a major difference).
2. The ASCII code comprises 128 characters, *NOT* 256.
3. What Roger means is: given a Python 8-bit string and no other information, you don't have a clue what the encoding is. Most codes of interest these days have the ASCII code (or a mild perversion thereof) in the first 128 positions, but it's anyone's guess what the writer of the string had in mind with the next 128.
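Point 3 can be made concrete: the very byte from the OP's traceback, 0xDC, means a different character under each legacy 8-bit encoding. A sketch in Python 3 syntax, using three codecs picked purely for illustration:

```python
b = b"\xdc"  # the offending byte from the OP's traceback

# Without knowing the encoding, the byte is ambiguous:
assert b.decode("latin-1") == "Ü"   # ISO-8859-1: U with diaeresis
assert b.decode("cp1251") == "Ь"    # Windows Cyrillic: soft sign
assert b.decode("cp437") == "▄"     # old IBM PC set: lower half block
```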
Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> Paul Boddie [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> > It looks like you may have Unicode objects that you're presenting to sqlite. In any case, with earlier versions of pysqlite that I've used, you need to connect with a special unicode_results parameter,
>
> He is using apsw. apsw correctly handles unicode. In fact it won't accept a str with bytes > 127, as they would be in an unknown encoding, and SQLite only uses Unicode internally. It does have a blob type using buffer for situations where binary data needs to be stored. pysqlite's mishandling of Unicode is one of the things that drove me to writing apsw in the first place.

Ah, I misread the OP's traceback. Okay, the OP is getting regular strings, which are probably encoded in ISO-8859-1 if I had to guess, from the Oracle DB. He is trying to pass them in to SQLiteCur.execute(), which tries to make a unicode string from the input:

    In [1]: unicode('\xdc')
    ---------------------------------------------------------------
    UnicodeDecodeError          Traceback (most recent call last)
    /Users/kern/<ipython console>

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 0: ordinal not in range(128)

*Now*, my advice to the OP is to figure out the encoding of the strings that are being returned from Oracle. As I said, ISO-8859-1 is probably a good guess. Then, he would *decode* the string to a unicode string using that encoding. E.g.:

    row = row.decode('iso-8859-1')

Then everything should be peachy. I hope.

-- Robert Kern [EMAIL PROTECTED]

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
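A minimal sketch of Robert's advice, in Python 3 syntax. The value "München" is a hypothetical stand-in for whatever Oracle actually returns, and note that each string field of a fetched row needs decoding, not the row tuple itself:

```python
raw = b"M\xfcnchen"                  # ISO-8859-1 bytes (0xFC = 'ü')
text = raw.decode("iso-8859-1")
assert text == "München"

# Applied field-by-field across a fetched row:
row = (b"M\xfcnchen", 42)
decoded = tuple(v.decode("iso-8859-1") if isinstance(v, bytes) else v
                for v in row)
assert decoded == ("München", 42)
```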
Re: Convertion of Unicode to ASCII NIGHTMARE
Robert Kern wrote:
> Roger Binns wrote:
> > Paul Boddie [EMAIL PROTECTED] wrote:
> > > It looks like you may have Unicode objects that you're presenting to sqlite. In any case, with earlier versions of pysqlite that I've used, you need to connect with a special unicode_results parameter,

Note that I've since mentioned client_encoding, which seems to matter for pysqlite 1.x.

> > He is using apsw. apsw correctly handles unicode. In fact it won't accept a str with bytes > 127, as they would be in an unknown encoding, and SQLite only uses Unicode internally. It does have a blob type using buffer for situations where binary data needs to be stored. pysqlite's mishandling of Unicode is one of the things that drove me to writing apsw in the first place.

For pysqlite 2.x, it appears that Unicode objects can be handed straight to the API methods, and I'd be interested to hear about your problems with pysqlite, Unicode and what actually made you write apsw instead.

> Ah, I misread the OP's traceback. Okay, the OP is getting regular strings, which are probably encoded in ISO-8859-1 if I had to guess, from the Oracle DB. He is trying to pass them in to SQLiteCur.execute() which tries to make a unicode string from the input: [...]

There's an Oracle environment variable that appears to make a difference: NLS_CHARSET, perhaps - it's been a while since I've had to deal with Oracle, and I'm not looking for another adventure into Oracle's hideous documentation to find out.

> *Now*, my advice to the OP is to figure out the encoding of the strings that are being returned from Oracle. As I said, ISO-8859-1 is probably a good guess. Then, he would *decode* the string to a unicode string using that encoding. E.g.: row = row.decode('iso-8859-1') Then everything should be peachy. I hope.

Yes, just find out what Oracle wants first, then set it all up, noting that without looking into the Oracle wrapper being used, I can't suggest an easier way.

Paul
Re: Convertion of Unicode to ASCII NIGHTMARE
ChaosKCW [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> As for SQLite supporting unicode, it probably does,

No, SQLite *ONLY* supports Unicode. It will *only* accept strings in Unicode and only produces strings in Unicode. All the functionality built into SQLite, such as comparison operators, operates only on Unicode strings.

> but something on the python side (probably in apsw) converts it to ascii at some point before it's handed to SQLite.

No. APSW converts it *to* Unicode. SQLite only accepts Unicode so a Unicode string has to be supplied. If you supply a non-Unicode string then conversion has to happen. APSW asks Python to supply the string in Unicode. If Python can't do that (eg it doesn't know the encoding) then you get an error.

I strongly recommend reading this: "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets" http://www.joelonsoftware.com/articles/Unicode.html

> Ok if SQLite uses unicode internally why do you need to ignore everything greater than 127,

I never said that. I said that a special case is made so that if the string you supply only contains ASCII characters (ie <= 127) then the ASCII string is converted to Unicode. (In fact it is valid UTF-8, hence the shortcut.)

> the ascii table (the 256-character one) fits into unicode just fine as far as I recall?

No, ASCII characters have defined Unicode codepoints. The ASCII character number just happens to be the same as the Unicode codepoint. But there are only 128 ASCII characters.

> Or did I miss the boat here?

For bytes greater than 127, what character set is used? There are hundreds of character sets that define those characters. You have to tell the computer which one to use. See the Unicode article referenced above.

Roger
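The shortcut Roger mentions rests on two checkable facts, sketched here in Python 3 syntax:

```python
s = "plain old ascii text"

# 1. Every ASCII character's number equals its Unicode codepoint, so an
#    all-ASCII string encodes to identical bytes under ascii and utf-8.
assert s.encode("ascii") == s.encode("utf-8")

# 2. The shortcut only holds below 128; beyond that UTF-8 diverges from
#    any single-byte encoding (U+00DC becomes two bytes, not 0xDC).
assert "Ü".encode("utf-8") == b"\xc3\x9c"
```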
Re: Convertion of Unicode to ASCII NIGHTMARE
Roger Binns wrote:
> SQLite only accepts Unicode so a Unicode string has to be supplied.

fact or FUD? let's see:

    import pysqlite2.dbapi2 as DB

    db = DB.connect("test.db")
    cur = db.cursor()
    cur.execute("create table if not exists test (col text)")
    cur.execute("insert into test values (?)", ["this is an ascii string"])
    cur.execute("insert into test values (?)", [u"this is a unicode string"])
    cur.execute("insert into test values (?)", [u"thïs ïs ö unicöde strïng"])
    cur.execute("select * from test")
    for row in cur.fetchall():
        print row

prints

    (u'this is an ascii string',)
    (u'this is a unicode string',)
    (u'th\xefs \xefs \xf6 unic\xf6de str\xefng',)

which is correct behaviour under Python's Unicode model.

/F
Convertion of Unicode to ASCII NIGHTMARE
Hi

I am reading from an Oracle database using cx_Oracle. I am writing to a SQLite database using apsw. The Oracle database is returning UTF-8 characters for European item names, ie special characters from an ASCII perspective. I get the following error:

    SQLiteCur.execute(sql, row)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 12: ordinal not in range(128)

I have googled for several days now and still can't get it to encode to ascii. I encode the SQL as follows:

    sql = "insert into %s values %s" % (SQLiteTable, paramstr)
    sql.encode('ascii', 'ignore')

I then encode each of the row values returned from Oracle like this:

    row = map(encodestr, row)
    SQLiteCur.execute(sql, row)

where encodestr is as follows:

    def encodestr(item):
        if isinstance(item, types.StringTypes):
            return unicodedata.normalize('NFKD', unicode(item, 'utf-8', 'ignore')).encode('ASCII', 'ignore')
        else:
            return item

I have tried a thousand similar functions to the above, and permutations of the above from various google searches. But I still get the above exception on the line:

    SQLiteCur.execute(sql, row)

and the exception is related to the data in one field. In the end I resorted to using Oracle's convert function in the SQL statement, but I would like to understand why this is happening and why it's so hard to convert the string in Python. I have read many complaints about this from other people, some of whom have written custom stripping routines. I haven't tried a custom routine yet, because I think it should be possible in Python.

Thanks,
Re: Convertion of Unicode to ASCII NIGHTMARE
ChaosKCW wrote:
> Hi. I am reading from an Oracle database using cx_Oracle. I am writing to a SQLite database using apsw. The Oracle database is returning UTF-8 characters for European item names, ie special characters from an ASCII perspective.

And does cx_Oracle return those as Unicode objects or as plain strings containing UTF-8 byte sequences? It's very important to distinguish between these two cases, and I don't have any experience with cx_Oracle to be able to give advice here.

> I get the following error:
>
>     SQLiteCur.execute(sql, row)
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 12: ordinal not in range(128)

It looks like you may have Unicode objects that you're presenting to sqlite. In any case, with earlier versions of pysqlite that I've used, you need to connect with a special unicode_results parameter, although later versions should work with Unicode objects without special configuration. See here for a thread (in which I seem to have participated, coincidentally):

http://mail.python.org/pipermail/python-list/2002-June/107526.html

> I have googled for several days now and still can't get it to encode to ascii.

This is a tough thing to find out - whilst previous searches did uncover some discussions about it, I just tried and failed to find the enlightening documents - and I certainly didn't see many references to it on the official pysqlite site.

Paul
Re: Convertion of Unicode to ASCII NIGHTMARE
ChaosKCW wrote:
> Hi. I am reading from an Oracle database using cx_Oracle. I am writing to a SQLite database using apsw. The Oracle database is returning UTF-8 characters for European item names, ie special characters from an ASCII perspective.

I'm not sure that you are using those terms correctly. From your description below, it seems that your data is being returned from the Oracle database as unicode strings rather than regular strings containing UTF-8 encoded data. These European characters are not special characters from an ASCII perspective; they simply aren't characters in the ASCII character set at all.

> I get the following error:
>
>     SQLiteCur.execute(sql, row)
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 12: ordinal not in range(128)
>
> I have googled for several days now and still can't get it to encode to ascii.

Don't. You can't. Those characters don't exist in the ASCII character set. SQLite 3.0 deals with UTF-8 encoded SQL statements, though. http://www.sqlite.org/version3.html

> I encode the SQL as follows:
>
>     sql = "insert into %s values %s" % (SQLiteTable, paramstr)
>     sql.encode('ascii', 'ignore')

The .encode() method returns a new value; it does not change an object in place.

    sql = sql.encode('utf-8')

-- Robert Kern [EMAIL PROTECTED]

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
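Robert's point about .encode() returning a new value, sketched in Python 3 syntax (where str.encode returns bytes; in the thread's Python 2 it returned a new str):

```python
sql = "insert into test values (?)"
result = sql.encode("ascii", "ignore")

# Strings are immutable: calling .encode() leaves the original alone
# and hands back a brand-new object, which the OP's code discarded.
assert sql == "insert into test values (?)"
assert isinstance(result, bytes)
```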
Re: Convertion of Unicode to ASCII NIGHTMARE
> Don't. You can't. Those characters don't exist in the ASCII character set. SQLite 3.0 deals with UTF-8 encoded SQL statements, though.

That is not entirely correct - one can, if losing information is ok. With the OP's code that normalizes the decoded text to NFKD, an umlaut like ä is transformed into a two-character sequence basically saying "a with two dots on top". With 'ignore' specified as a parameter to the encoder, this should result in the letter a.

Regards, Diez
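Diez's decomposition trick, as a sketch in Python 3 syntax (the helper name is mine). It is lossy by design: anything with no ASCII decomposition simply vanishes:

```python
import unicodedata

def to_ascii_lossy(text):
    # NFKD splits 'ä' into 'a' + U+0308 combining diaeresis; the 'ignore'
    # error handler then drops the combining mark (and anything else
    # without an ASCII equivalent).
    nfkd = unicodedata.normalize("NFKD", text)
    return nfkd.encode("ascii", "ignore").decode("ascii")

assert to_ascii_lossy("ä") == "a"
assert to_ascii_lossy("München") == "Munchen"
assert to_ascii_lossy("日本") == ""   # no decomposition: data is lost
```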
Re: Convertion of Unicode to ASCII NIGHTMARE
Oh, and it occurs to me, as I seem to have mentioned a document about PgSQL rather than pysqlite (although they both have the same principal developer), that you might need to investigate the client_encoding parameter when setting up your connection. The following message gives some information (but not much):

http://groups.google.com/group/comp.lang.python/msg/f27fa9866c9b7b5f

Sadly, I can't find the information about getting result values as Unicode objects, but I believe it involves some kind of SQL comment that you send to the database system which actually tells pysqlite to change its behaviour.

Paul