On 2018-03-25 06:30:54 +1100, Chris Angelico wrote:
> On Sun, Mar 25, 2018 at 3:35 AM, Peter J. Holzer <hjp-pyt...@hjp.at> wrote:
> > On 2018-03-24 11:21:09 +1100, Chris Angelico wrote:
> >> If the database has been configured to use UTF-8 (as mentioned, that's
> >> "utf8mb4" in MySQL), you won't get that byte sequence back. You'll get
> >> back valid UTF-8.
> >
> > Actually (with python3 and mysql.connector), you'll get back str values,
> > not byte values encoded in utf-8 or latin-1. You don't have to decode
> > them because the driver already did it.
> >
> > So as a Python programmer, you don't care what character set the
> > database uses internally, as this is almost completely hidden from you
> > (The one aspect that isn't hidden is of course the set of characters
> > that you can store in a character field: Obviously, you can't store
> > Chinese characters in a latin1 field).
>
> Good. I mentioned earlier that that's how it is with PostgreSQL and
> psycopg2, but wasn't sure about the MySQL interface modules. Glad to
> know that it is.
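The behaviour is easy to check with sqlite3 from the standard library. The following is only a minimal sketch, assuming a throwaway in-memory database; the table name `t` and column `name` are made up for illustration and do not come from this thread. The other drivers mentioned here need a running server, so they are not shown.

```python
import sqlite3

# In-memory database, so nothing outside this snippet is needed.
conn = sqlite3.connect(":memory:")

# Hypothetical table and column, used only for this illustration.
conn.execute("CREATE TABLE t (name VARCHAR(20))")
conn.execute("INSERT INTO t (name) VALUES (?)", ("Zoë 中文",))

# Fetch the value back; with the default settings the driver
# hands over an already decoded str, not bytes.
(value,) = conn.execute("SELECT name FROM t").fetchone()
conn.close()

print(type(value).__name__, value)
```

No `.decode()` call appears anywhere: the value that went in as a `str` comes back as a `str`.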
I'm surprised that PEP 249 doesn't specify this. It seems worth
standardizing (OTOH it's the "one obvious way to do it", so maybe it
doesn't need to be specified explicitly). The interfaces for the 4
databases I use (psycopg2 for PostgreSQL, cx_Oracle, mysql.connector and
sqlite3) all behave the same for varchar fields, which makes me
reasonably confident that this is also the case for other databases.

        hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
-- https://mail.python.org/mailman/listinfo/python-list