On Fri, 13 Mar 2009 14:24:52 +0100, Peter Otten <__pete...@web.de>
wrote:
>It seems the database gives you the strings as unicode. When a unicode
>string is printed python tries to encode it using sys.stdout.encoding
>before writing it to stdout. As you run your script on the windows command
>line that encoding seems to be cp437. Unfortunately your database contains
>characters that cannot be expressed in that encoding.

Many thanks for the help :) I hadn't thought about the code page used
to display data in the DOS box in XP.

It turns out that the HTML page from which I was trying to extract
data using regexes was encoded in ISO-8859-1 instead of UTF-8. The SQLite
wrapper expects Unicode only, so it had a problem with some
characters.
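
To illustrate the mismatch, here is a small sketch in Python 3 syntax. The
sample string is made up, and the variable names are only for illustration:
bytes that came from an ISO-8859-1 page generally can't be read as UTF-8,
but decode fine with the charset the page declares.

```python
# Sample bytes as they might arrive from a page served as ISO-8859-1.
# The text itself is a made-up example, not data from the real page.
raw = "Caf\u00e9 M\u00fcller".encode("iso-8859-1")

# Reading them as UTF-8 fails: 0xE9 starts a multi-byte sequence in UTF-8,
# but the byte that follows is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the charset the page declares recovers the text.
name = raw.decode("iso-8859-1")
```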

For those interested, here's how I solved it, although there's likely
a smarter way to do it:

============
data = re_data.search(response)
if data:
        name = data.group(1).strip()
        address = data.group(2).strip()

        # The page declares content="text/html; charset=iso-8859-1",
        # so decode the byte strings to unicode before passing them to SQLite
        name  = name.decode('iso8859-1')
        address = address.decode('iso8859-1')
        
        sql = 'BEGIN;'
        sql = sql + 'UPDATE companies SET name=?,address=? WHERE id=?;'
        sql = sql + "COMMIT"

        try:
                cursor.execute(sql, (name,address,id) )
        except:
                print "Failed UPDATING"
                raise
else:
        print "Pattern not found"
============
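
For comparison, a rough sketch of the same idea in Python 3, assuming the
standard library's sqlite3 module rather than the wrapper I used above. The
helper name update_company, the table layout and the sample bytes are all
made up for the example. It is not a drop-in replacement for the snippet
above, just one way it could look.

```python
import sqlite3

def update_company(conn, company_id, raw_name, raw_address, charset="iso-8859-1"):
    """Decode the scraped byte strings and store them as text (hypothetical helper)."""
    name = raw_name.decode(charset).strip()
    address = raw_address.decode(charset).strip()
    # "with conn" wraps the statement in a transaction:
    # it commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "UPDATE companies SET name=?, address=? WHERE id=?",
            (name, address, company_id),
        )

# In-memory database with an assumed schema, just to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("INSERT INTO companies (id) VALUES (1)")

# Made-up sample bytes in ISO-8859-1.
update_company(conn, 1, b"Caf\xe9 M\xfcller ", b" 1 Rue de l'\xc9glise")
row = conn.execute("SELECT name, address FROM companies WHERE id=1").fetchone()
```

Letting the connection handle the transaction avoids putting BEGIN and
COMMIT into the SQL string, which sqlite3's execute() would not accept
anyway since it runs one statement at a time.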

Thanks again.
--
http://mail.python.org/mailman/listinfo/python-list
