Nick Coghlan added the comment:
The patch needs to be rebased on top of the issue 19307 patch, but I like this
approach.
I say go ahead and commit it whenever you're ready :)
--
___
Python tracker rep...@bugs.python.org
Serhiy Storchaka added the comment:
LGTM.
--
http://bugs.python.org/issue18958
Python-bugs-list mailing list
Roundup Robot added the comment:
New changeset ac016cba7e64 by Ezio Melotti in branch 'default':
#18958: Improve error message for json.load(s) while passing a string that
starts with a UTF-8 BOM.
http://hg.python.org/cpython/rev/ac016cba7e64
--
nosy: +python-dev
Ezio Melotti added the comment:
Fixed, thanks for the feedback!
--
assignee: docs@python -> ezio.melotti
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Ezio Melotti added the comment:
Here is an updated patch with tests.
--
stage: needs patch -> patch review
Added file: http://bugs.python.org/file32235/issue18958-2.diff
Nick Coghlan added the comment:
Updated patch looks good to me.
--
Changes by Ezio Melotti ezio.melo...@gmail.com:
Added file: http://bugs.python.org/file32237/issue18958-2-py3k.diff
Nick Coghlan added the comment:
As does the Py3k version :)
--
Nick Coghlan added the comment:
Discussing this with Ezio on IRC, we decided that it probably makes more sense
to do this check outside the scanner as preliminary validation of the input
passed in via the public API. That will minimise the overhead and also avoid
any potential side effects
Ezio Melotti added the comment:
I opened a new issue about improving the error message: #19307.
After further discussion on IRC, we think that both #19307 and this issue
should only be applied on 3.4 (the attached patch produces an even more
misleading error that would require backporting
Ezio Melotti added the comment:
I'm not sure this should be documented in json.load/loads, and I'm not sure
people will look there once they get this exception.
The error is raised because the wrong codec is used (either by open() before
passing the file object to json.load or by json.loads),
Ezio Melotti added the comment:
Here is a proof of concept that raises this error:
>>> import json; json.load(open('input.json'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wolf/dev/py/2.7/Lib/json/__init__.py", line 290, in load
    **kw)
File
Changes by Ezio Melotti ezio.melo...@gmail.com:
Added file: http://bugs.python.org/file32213/issue18958.diff
Ezio Melotti added the comment:
Forgot to add that the patch is for 2.7, and it also needs to be implemented in
the unicode scanner.
--
Nick Coghlan added the comment:
I like the new error message as a low-risk immediate improvement that nudges
people in the direction of utf8-sig. It also leaves the door open to silently
ignoring the BoM in the future without immediately committing to that approach.
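
A minimal sketch of the kind of up-front check discussed in this thread, assuming the input is a str: reject text that still begins with U+FEFF before it reaches the parser, and point the caller at utf-8-sig. The wrapper name `loads_checked` and the message wording are illustrative assumptions, not the committed patch.

```python
import json

# U+FEFF is what a UTF-8 BOM turns into when the bytes are decoded as plain UTF-8.
BOM = "\ufeff"


def loads_checked(s):
    """Hypothetical wrapper: validate the input, then delegate to json.loads."""
    if isinstance(s, str) and s.startswith(BOM):
        # Illustrative wording; the point is to name the codec that fixes the problem.
        raise ValueError("Unexpected UTF-8 BOM (decode using utf-8-sig)")
    return json.loads(s)
```

Doing the check once at the public entry point keeps it out of the scanner's inner loop, which is the overhead argument made elsewhere in the thread.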
--
Anoop Thomas Mathew added the comment:
Patch for BOM signature documentation in json.loads
--
keywords: +patch
nosy: +Anoop.Thomas.Mathew
Added file:
http://bugs.python.org/file31764/json_BOM_signature_documentation.patch
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
keywords: +easy
nosy: +ezio.melotti
New submission from Adrián Chaves Fernández:
Calling json.load() with a file object or json.loads() with a string containing
the attached JSON code raises an exception with the message 'No JSON object
could be decoded'.
I’ve pasted the JSON code into http://jsonlint.com/ and it reports it as
Vajrasky Kok added the comment:
>>> a = open('/tmp/input.json')
>>> b = a.read()
>>> b[0]
'\ufeff'
>>> import json
>>> json.loads(b[1:])
loads just fine
>>> json.loads(b)
chokes.
Whether python json module should handle '\ufeff' gracefully or not, I am not
sure. Let me investigate it.
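
The same behaviour can be reproduced without the attached file. This is a self-contained sketch assuming Python 3, with a made-up JSON payload standing in for the reporter's input:

```python
import json

# A str that still carries the BOM character in front of otherwise valid JSON.
text = "\ufeff" + '{"key": "value"}'

try:
    json.loads(text)
    failed = False
except ValueError:
    # json's decode errors are ValueError (or a subclass of it).
    failed = True

# Dropping the first character leaves plain JSON, which parses normally.
parsed = json.loads(text[1:])
```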
--
nosy:
Vajrasky Kok added the comment:
The U+FEFF character is related to the byte order mark (BOM).
Reference:
http://en.wikipedia.org/wiki/Byte_Order_Mark
--
Serhiy Storchaka added the comment:
Use the utf-8-sig encoding.
See also issue17909.
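
A short sketch of that suggestion, assuming Python 3 (where open() takes an encoding argument). The file is a temporary one created only for the example, and its contents are made up:

```python
import codecs
import json
import os
import tempfile

# Create a small JSON file that starts with a UTF-8 BOM.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b'{"key": "value"}')

# utf-8-sig strips a leading BOM if one is present and otherwise acts like utf-8.
with open(path, encoding="utf-8-sig") as f:
    data = json.load(f)

os.remove(path)
```

Because utf-8-sig also reads files without a BOM, it is a reasonable choice when the origin of the file is not known.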
--
nosy: +serhiy.storchaka
Adrián Chaves Fernández added the comment:
I’ll leave how to address this up to you. Thanks a lot for finding out that the
cause was the BOM, I’ve just removed it from the file and now everything works
as expected.
--
Nick Coghlan added the comment:
Switching to a docs bug - this won't be fixed in 2.7, but it should probably be
documented as a limitation.
--
assignee:  -> docs@python
components: +Documentation -Extension Modules
nosy: +docs@python, ncoghlan
stage:  -> needs patch