On 11/17/2015 03:56 PM, Lars Skjærlund wrote:
Hi,
I’m afraid I’ve hit a bug:
I want to migrate our Kallithea database from SQLite to MySQL. In
order to do that, I dumped the SQLite database to an SQL script,
modified the SQL commands to MySQL dialect, and ran the script against
the MySQL database.
It worked like a charm – except that Kallithea kept crashing with
Unicode errors.
But everything /was/ Unicode: The dump from SQLite was Unicode, my
edits were fully Unicode compatible, and the database as well as the
tables were created in MySQL as UTF8 compatible. After fighting this
for a long time, I tried letting Kallithea populate a new MySQL
database, and discovered that Kallithea doesn’t store data in UTF8
format. It appears that the data is encoded as UTF8 twice, so my
record looks like
+-----------+--------------+
| firstname | lastname     |
+-----------+--------------+
| Lars      | SkjÃ¦rlund   |
+-----------+--------------+
If I update my name to be true UTF8, Kallithea crashes. I haven’t tried
other databases, but the encoding in SQLite is correct.
I solved my problem by running the SQL script file through iconv before
submitting it to MySQL, claiming the input was Latin1 and asking for
UTF8 as output. That way I got the same double-encoding that
Kallithea appears to require…
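
For reference, the effect of that iconv step can be reproduced in a few
lines of Python 3. This is only a sketch of the mechanism, assuming the
double-encoding is simply UTF8 bytes being re-read as Latin1 and encoded
again; the function names are made up for illustration and are not part
of Kallithea or MySQL.

```python
def double_encode(text):
    """Mimic running already-UTF8 text through `iconv -f LATIN1 -t UTF-8`."""
    utf8_bytes = text.encode("utf-8")
    # Misread each UTF8 byte as one Latin1 character (the hypothetical mistake).
    return utf8_bytes.decode("latin-1")


def undo_double_encode(text):
    """Reverse the mistake: map characters back to bytes, then read as UTF8."""
    return text.encode("latin-1").decode("utf-8")


name = "Skjærlund"
mangled = double_encode(name)

print(mangled)                       # the double-encoded form
print(len(name.encode("utf-8")))     # bytes stored for the correct value
print(len(mangled.encode("utf-8")))  # bytes stored for the mangled value
print(undo_double_encode(mangled) == name)
```

The mangled value takes two bytes more than the correct one, which is one
way to spot it in a dump.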
Generally Kallithea works fine with Unicode. It can however be tricky
when it is interfacing with a VCS or a database. It is my impression that
MySQL also just works, but I use PostgreSQL and haven't tried MySQL myself.
If the database really is in UTF8, I guess some other layer in the stack
(SQLAlchemy or the database driver) messes it up.
These two issue reports might give hints of what to check:
https://bitbucket.org/conservancy/kallithea/issues/9/doc-unicode-utf-8-issues-in-the-changelog
https://bitbucket.org/conservancy/kallithea/issues/147/unicodeencodeerror-ascii-codec-cant-encode
When debugging the problem, it might be simpler to let Kallithea create
a new database and focus on whether it can store and read Unicode. Next,
you can make sure your converted database uses the same encoding.
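
One rough way to compare the two databases is to read a few values back
and check whether they look double-encoded. The helper below is a
hypothetical sketch in Python 3, not something Kallithea provides, and it
assumes the only possible damage is UTF8 bytes that were misread as
Latin1. It is a heuristic and can be fooled by text that legitimately
contains such character sequences.

```python
def looks_double_encoded(value):
    """Return True if value can be 'repaired' into different, valid UTF8 text."""
    try:
        # Map each character back to a single byte, then try to read as UTF8.
        repaired = value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable as Latin1, or the bytes are not valid UTF8:
        # the value was most likely stored correctly.
        return False
    # Pure ASCII survives the round trip unchanged, so it is not flagged.
    return repaired != value


for sample in ["Lars", "Skjærlund", "SkjÃ¦rlund"]:
    print(sample, looks_double_encoded(sample))
```

If values from the freshly created database are flagged and values from
the converted one are not (or the other way round), the two are not using
the same encoding.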
/Mads