[issue10976] json.loads() raises TypeError on bytes object
Martin Panter added the comment:

Issue 17909 (auto-detecting JSON encoding) looks like it has a patch which would probably satisfy this issue

--
nosy: +vadmium

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10976
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10976] json.loads() raises TypeError on bytes object
Hanxue Lee added the comment:

This seems to be an issue (bug?) for Python 3.3

When calling json.loads() with a byte array, this is the error

    json.loads(response.data, 'latin-1')
    TypeError: can't use a string pattern on a bytes-like object

When I decode the byte array to string

    json.loads(response.data.decode(), 'latin-1')

I get this error

    TypeError: bytes or integer address expected instead of str instance

--
nosy: +Hanxue.Lee
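
For readers hitting the same error, a minimal sketch of the usual workaround: decode the bytes yourself, then hand json.loads() a str and no encoding argument. The name `response_data` is a hypothetical stand-in for `response.data`, and the payload is assumed to be UTF-8; substitute whatever encoding the sender actually used.

```python
import json

# Hypothetical stand-in for response.data: JSON text encoded as UTF-8 bytes.
response_data = '{"city": "Zürich", "ok": true}'.encode("utf-8")

# Decode with the encoding the sender actually used (assumed UTF-8 here),
# then parse the resulting str. No encoding is passed to json.loads().
text = response_data.decode("utf-8")
result = json.loads(text)

print(result)
```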
[issue10976] json.loads() raises TypeError on bytes object
Changes by Chris Rebert pyb...@rebertia.com:

--
nosy: +cvrebert
[issue10976] json.loads() raises TypeError on bytes object
Martin v. Löwis added the comment:

Bike-shedding: instead of jsonb, make it json.bytes. Else, it may get confused with other protocols, such as JSONP or BSON.

--
nosy: +loewis
[issue10976] json.loads() raises TypeError on bytes object
Nick Coghlan added the comment:

json.bytes would also work for me. It wouldn't need to replicate the full main module API, just combine the text transform with UTF-8 encoding and decoding (as well as autodetected UTF-16 and UTF-32 decoding) for the main 4 functions (dump[s], load[s]).

If people want UTF-16 and UTF-32 *en*coding (which seem to be rarely used in combination with JSON), then they can invoke the text transform version directly, and then do a separate encoding step.
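
To make the proposal concrete, here is a rough sketch of what such a thin bytes layer could look like. The names `detect_json_encoding`, `dumps_bytes` and `loads_bytes` are hypothetical and not part of the json module. Detection follows the null-byte pattern from RFC 4627, section 3, which assumes the text starts with two ASCII characters (true when the top level is an object or array) and ignores byte order marks.

```python
import json

# Null-byte patterns of the first four octets, per RFC 4627 section 3.
# True marks a 0x00 byte, False marks any other byte.
_NULL_PATTERNS = {
    (True, True, True, False): "utf-32-be",
    (True, False, True, False): "utf-16-be",
    (False, True, True, True): "utf-32-le",
    (False, True, False, True): "utf-16-le",
}


def detect_json_encoding(data):
    """Guess the encoding of JSON bytes; fall back to UTF-8 (hypothetical helper)."""
    head = data[:4]
    if len(head) == 4:
        pattern = tuple(byte == 0 for byte in head)
        return _NULL_PATTERNS.get(pattern, "utf-8")
    return "utf-8"


def dumps_bytes(obj, **kwargs):
    """Serialise obj to JSON and always encode the result as UTF-8."""
    return json.dumps(obj, **kwargs).encode("utf-8")


def loads_bytes(data, **kwargs):
    """Decode UTF-8, UTF-16 or UTF-32 JSON bytes, then parse the text."""
    return json.loads(data.decode(detect_json_encoding(data)), **kwargs)


for encoding in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    wire = '{"a": [1, 2]}'.encode(encoding)
    print(encoding, detect_json_encoding(wire), loads_bytes(wire))
```

Anyone who needs UTF-16 or UTF-32 output would still call json.dumps() and encode the str themselves, as described above.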
[issue10976] json.loads() raises TypeError on bytes object
Changes by Antoine Pitrou pit...@free.fr:

--
nosy: +ncoghlan
[issue10976] json.loads() raises TypeError on bytes object
Nick Coghlan added the comment:

Issue 19837 is the complementary problem on the serialisation side - users migrating from Python 2 are accustomed to being able to use the json module directly as a wire protocol module, but the strict Python 3 interpretation as a text transform means that isn't possible - you have to apply the text encoding step separately.

What appears to have happened is that the way JSON is used in practice has diverged from JSON as a formal spec.

Formal spec (this is what the Py3k JSON module implements, and Py2 implements with ensure_ascii=False): JSON is a Unicode text transform, which may optionally be serialised as UTF-8, UTF-16 or UTF-32.

Practice (what the Py2 JSON module implements with ensure_ascii=True, and what is covered in RFC 4627): JSON is a UTF-8 encoded wire protocol.

So now we're left with the options:

- try to tweak the existing json APIs to handle both the str-str and str-bytes use cases (ugly)
- add new APIs within the existing json module
- add a new jsonb module, which dumps to UTF-8 encoded bytes, and reads from UTF-8, UTF-16 or UTF-32 encoded bytes in accordance with RFC 4627 (but being more tolerant in terms of what is allowed at the top level)

I'm currently leaning towards the jsonb module option, and deprecating the encoding argument in the pure text version. It's not pretty, but I think it's better than the alternatives.

--
versions: +Python 3.5 -Python 3.3
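
As an illustration of the "text transform plus separate encoding step" reading on the serialisation side, here is a small Python 3 sketch. The payload is made up for the example. It shows that json.dumps() returns str, that the default ensure_ascii=True output is pure ASCII (so it encodes to the same bytes as ASCII or UTF-8), and that ensure_ascii=False leaves non-ASCII characters for the encoding step to handle.

```python
import json

payload = {"name": "Löwis"}  # made-up example data

# Step 1: the text transform. The result is a str, never bytes.
ascii_text = json.dumps(payload)                        # ensure_ascii=True by default
unicode_text = json.dumps(payload, ensure_ascii=False)

# Step 2: the separate encoding step that turns text into a wire format.
ascii_wire = ascii_text.encode("utf-8")
unicode_wire = unicode_text.encode("utf-8")

print(ascii_text, ascii_wire)
print(unicode_text, unicode_wire)

# Both wire forms carry the same data once decoded and parsed again.
assert json.loads(ascii_wire.decode("utf-8")) == payload
assert json.loads(unicode_wire.decode("utf-8")) == payload
```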
[issue10976] json.loads() raises TypeError on bytes object
Changes by Nick Guenther n...@kousu.ca:

--
nosy: +kousu
[issue10976] json.loads() raises TypeError on bytes object
Changes by Josh Lee jlee...@gmail.com:

--
nosy: +jleedev
[issue10976] json.loads() raises TypeError on bytes object
Antoine Pitrou pit...@free.fr added the comment:

> According to current implementation this is acceptable.

Then perhaps auto-detection can be restricted to strict mode? Non-strict mode would always use utf-8. Or we can just skip auto-detection altogether (I don't think many people produce utf-16 or utf-32 JSON; that would be a waste of bandwidth for no obvious benefit).
[issue10976] json.loads() raises TypeError on bytes object
Serhiy Storchaka storch...@gmail.com added the comment:

Related to this question is a question about errors. How should the user be informed if an error occurs while decoding with the detected encoding? Leave UnicodeDecodeError or convert it to ValueError? If there is a syntax error in the JSON, the exception will refer to the position in the decoded string; should we translate it to the position in the original binary string?
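
One possible answer to both questions, sketched under assumptions: the input is UTF-8, and the interpreter is recent enough to have json.JSONDecodeError (added in Python 3.5, after this message was written). The helper name `loads_utf8` is hypothetical. UnicodeDecodeError is already a subclass of ValueError, so callers catching ValueError see both kinds of failure; the byte offset is recovered by re-encoding the text that precedes the reported character position.

```python
import json


def loads_utf8(data):
    """Parse UTF-8 JSON bytes and report syntax errors with a byte offset."""
    # May raise UnicodeDecodeError, which is itself a ValueError subclass.
    text = data.decode("utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.pos indexes the decoded str; re-encode the prefix to count bytes.
        byte_pos = len(text[:exc.pos].encode("utf-8"))
        raise ValueError(
            f"{exc.msg}: char {exc.pos}, byte {byte_pos}"
        ) from exc


try:
    loads_utf8('["é", oops]'.encode("utf-8"))
except ValueError as exc:
    print(exc)
```

Re-encoding the prefix only works this simply for a known, stateless encoding; with auto-detected UTF-16 or UTF-32 the same idea applies, using the detected codec instead.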
[issue10976] json.loads() raises TypeError on bytes object
Changes by Éric Araujo mer...@netwok.org:

--
title: json.loads() throws TypeError on bytes object -> json.loads() raises TypeError on bytes object
[issue10976] json.loads() raises TypeError on bytes object
Serhiy Storchaka storch...@gmail.com added the comment:

I mean a string that starts with '\u0000'. b'"\x00...'.
[issue10976] json.loads() raises TypeError on bytes object
Antoine Pitrou pit...@free.fr added the comment:

On Thursday 26 April 2012 at 15:48 +0000, Serhiy Storchaka wrote:
> I mean a string that starts with '\u0000'. b'"\x00...'.

According to the RFC, that should be escaped:

    All Unicode characters may be placed within the quotation marks
    except for the characters that must be escaped: quotation mark,
    reverse solidus, and the control characters (U+0000 through U+001F).

And indeed:

>>> json.loads('"\u0000"')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/opt/lib/python3.2/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/home/antoine/opt/lib/python3.2/json/decoder.py", line 351, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/antoine/opt/lib/python3.2/json/decoder.py", line 367, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 1 (char 1)
>>> json.loads('"\\u0000"')
'\x00'
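
The three cases under discussion can be put side by side in a short script. This is only a sketch using the documented `strict` parameter of json.loads(); the variable names are illustrative.

```python
import json

raw = '"\x00"'         # a literal NUL character between the quotes
escaped = '"\\u0000"'  # the six-character escape sequence \u0000

# Strict mode (the default) rejects raw control characters inside strings.
try:
    json.loads(raw)
    strict_rejected = False
except ValueError as exc:
    strict_rejected = True
    print("strict:", exc)

# The escaped form is valid JSON and decodes to a NUL character.
from_escaped = json.loads(escaped)

# Non-strict mode accepts the raw control character as well.
from_raw_lenient = json.loads(raw, strict=False)

print(repr(from_escaped), repr(from_raw_lenient))
```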
[issue10976] json.loads() raises TypeError on bytes object
Serhiy Storchaka storch...@gmail.com added the comment:

According to the current implementation this is acceptable.

>>> json.loads('"\u0000"', strict=False)
'\x00'