On 13Sep2019 09:31, Matt Billenstein <m...@vazor.com> wrote:
On Fri, Sep 13, 2019 at 08:37:26AM +1000, Cameron Simpson wrote:
On 10Sep2019 10:42, Daniel Holth <dho...@gmail.com> wrote:
[...]
> I stopped using Python 3 after learning about str(bytes) by finding it
> in my corrupted database. [...]
Could you outline how this happened to you?
Not the OP, but I've actually seen something like this happen in postgres, but
it's postgres doing the adaptation of bytea into a text column, not python str
afaict:
  >>> conn = psycopg2.connect(...)
  >>> with conn.cursor() as cursor:
  ...     cursor.execute('update note set notes=%s where id=%s returning notes',
  ...                    ('hi there', 'NwMVUksheafn'))
  ...     cursor.fetchall()
  ...     cursor.execute('update note set notes=%s where id=%s returning notes',
  ...                    (b'hi there', 'NwMVUksheafn'))
  ...     cursor.fetchall()
  ...
  [{'notes': 'hi there'}]
  [{'notes': '\\x6869207468657265'}]
We were storing the response of an api request from requests and had grabbed
response.content (bytes) instead of response.text (str). I was still able to
decode the original data from this bytes representation, so not ideal, but no
data lost.
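For anyone who hits the same thing, here is a minimal sketch of that recovery.
It assumes the stored value is the "\x" hex form shown above and that the
original bytes were UTF-8 text; decode_bytea_hex is a made-up helper name, not
part of psycopg2 or postgres.

```python
def decode_bytea_hex(stored, encoding="utf-8"):
    """Turn a '\\x...' hex string back into the text it was made from.

    Hypothetical helper: assumes `stored` is the hex representation that
    ended up in the text column and that the original bytes were encoded
    with `encoding` (UTF-8 is an assumption, not something the database
    records).
    """
    prefix = "\\x"  # a literal backslash followed by 'x'
    if not stored.startswith(prefix):
        raise ValueError("value does not look like a hex bytes representation")
    raw = bytes.fromhex(stored[len(prefix):])
    return raw.decode(encoding)


if __name__ == "__main__":
    print(decode_bytea_hex("\\x6869207468657265"))  # hi there
```

Rows that were written correctly do not start with the "\x" prefix, so the
ValueError gives a cleanup script a way to skip them.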
I did wish this sorta thing had raised an error instead of doing what
it did.
Aye. Somewhere there's some Python taking the b'' and accepting it for
the notes= parameter, presumably in the postgres dbapi code. That isn't
a Python language bug to my eye. It could be some careless 2->3 adaptation,
I guess. I suspect it isn't postgres itself (or its C library) mangling
things; that would be accepting a C string or character buffer.
Still, I can see how this can quietly leak mojibake into your database.
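One way to get the error Matt wished for is to check the type in your own code
before the value reaches the driver. This is only a sketch: require_text is a
hypothetical helper, not anything psycopg2 provides, and it assumes the column
should only ever receive str.

```python
def require_text(value, name="value"):
    """Return `value` unchanged if it is a str, otherwise raise TypeError.

    Hypothetical guard for parameters bound to text columns, so that bytes
    fail loudly instead of being adapted silently.
    """
    if not isinstance(value, str):
        raise TypeError(
            f"{name} must be str, not {type(value).__name__}; "
            "decode bytes explicitly before storing them"
        )
    return value


if __name__ == "__main__":
    # A str passes through untouched.
    params = (require_text("hi there", "notes"), "NwMVUksheafn")
    print(params)

    # bytes (e.g. response.content instead of response.text) is rejected.
    try:
        require_text(b"hi there", "notes")
    except TypeError as exc:
        print("rejected:", exc)
```

Wrapping the parameter as require_text(response.text, "notes") at the call site
would have turned the response.content mistake into an immediate TypeError.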
Thanks,
Cameron Simpson <c...@cskk.id.au>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VCPZ6EHTXQLULVVOKJWUOLBSRB6EG2XO/