Gandalf wrote:
[...]
I solved the problem by entering data manually but now the problem is
that i had to delete this function:
con.text_factory = str
and now I can't see the data that I entered thought my sqlite manager
Then apparently there is still non-UTF-8 data in the database. Perhaps
SQLite Manager allows you to insert invalid data as well?
Try this:
con.text_factory = lambda s: unicode(s, "utf-8", "replace")
This will decode to UTF-8 as good as possible, and for non-decodable
characters you will get a REPLACEMENT CHARACTER instead.
To demonstrate:
>>> result = con.execute("select ? || ? || ?", ('text ok ', chr(230), '
ok, too')).fetchone()[0]
The chr(230) is just random garbage that isn't valid UTF-8:
>>> print result
text ok �k, too
As you can see, there is now a strange character, but the second 'ok,
too' got messed up :-/ But at least you can then find out which rows in
the database have messed up data. You can then iterate over all rows
with something like:
Then, assuming the text to check for validity is in result, you can do
something like:
>>> import unicodedata
>>> unicodedata.lookup("REPLACEMENT CHARACTER") in result
True
Does this help?
-- Gerhard
PS: This thread reinforces my believe that I have to make it harder for
users of pysqlite to make themselves shoot in the foot with non-UTF-8 data.
--
http://mail.python.org/mailman/listinfo/python-list