Your message dated Mon, 16 Nov 2015 20:03:06 +1100
with message-id <[email protected]>
and subject line Re: Bug#541198: python-mysqldb: utf8_bin collation will not 
convert to Unicode strings
has caused the Debian Bug report #541198,
regarding python-mysqldb: utf8_bin collation will not convert to Unicode strings
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
541198: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=541198
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: python-mysqldb
Version: 1.2.2-8
Severity: normal

A string type column with a utf8_bin collation will not be converted to a
Python Unicode string, but instead will be returned as a utf8 (byte) string.

The MySQL documentation though clearly states: "A nonbinary string has a
character set and is converted to another character set in many cases, even
when the string has a _bin collation"[1].

I understand that a string with utf8_bin collation is still a string and
thus should not be dealt with differently. The utf8_bin collation is
essential when working with Unicode without wanting the Unicode collation
algorithm to kick in.

How to reproduce:

CREATE TABLE t1 (
    a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin
);

INSERT INTO t1 VALUES ('ü');

In Python:
>>> import MySQLdb
>>> db = MySQLdb.connect(db='pymysqltest', charset='utf8', use_unicode=True)
>>> cur = db.cursor()
>>> cur.execute("SELECT a FROM t1;")
1L
>>> cur.fetchall()
(('\xc3\xbc',),)

Choosing utf8_general_ci instead of utf8_bin will properly yield Unicode
objects:

>>> cur.execute("SELECT a COLLATE utf8_general_ci FROM t1;")
1L
>>> cur.fetchall()
((u'\xfc',),)
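
As a possible workaround on the caller's side, the byte strings can be
decoded by hand. This is only a minimal sketch in Python 3 syntax, assuming
the returned bytes are valid UTF-8; decode_row is a hypothetical helper of
mine, not part of MySQLdb:

```python
def decode_row(row, encoding="utf-8"):
    """Return a copy of row with every bytes value decoded to str.

    Hypothetical helper for illustration; not part of MySQLdb.
    """
    return tuple(
        value.decode(encoding) if isinstance(value, bytes) else value
        for value in row
    )


# The byte string returned by the utf8_bin query, decoded row by row.
raw_rows = ((b"\xc3\xbc",),)
decoded_rows = tuple(decode_row(row) for row in raw_rows)
print(decoded_rows)
```

Values that are not byte strings (numbers, NULLs, strings that are already
decoded) pass through unchanged, so the helper can be applied to every row.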

[1] http://dev.mysql.com/doc/refman/5.1/en/charset-binary-collations.html

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.30-1-686 (SMP w/1 CPU core)
Locale: LANG=de_DE@euro, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages python-mysqldb depends on:
ii  libc6                         2.9-23     GNU C Library: Shared libraries
ii  libmysqlclient16              5.1.37-1   MySQL database client library
ii  python                        2.5.4-2    An interactive high-level object-o
ii  python-support                1.0.3      automated rebuilding support for P

python-mysqldb recommends no packages.

Versions of packages python-mysqldb suggests:
ii  mysql-server                  5.1.37-1   MySQL database server (metapackage
ii  mysql-server-5.1 [mysql-serve 5.1.37-1   MySQL database server binaries
ii  python-egenix-mxdatetime      3.1.2-1    date and time handling routines fo
pn  python-mysqldb-dbg            <none>     (no description available)

-- no debconf information



--- End Message ---
--- Begin Message ---
On Wed, Nov 04, 2015 at 08:41:29PM +1100, Brian May wrote:
> Is this bug still present in the python-mysqldb package in unstable?
> Version 1.3.6-1 based on the mysqlclient fork.

No response; I am going to assume this is fixed and close the ticket.

If the problem can be reproduced, this bug report can get reopened.
-- 
Brian May <[email protected]>

--- End Message ---
_______________________________________________
Python-modules-team mailing list
[email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/python-modules-team
