json.load and json.dump already default to UTF-8 and already have parameters
for JSON loading and dumping.

json.loads and json.dumps exist only because there was no way to
distinguish between a string containing JSON and a file path string.
(They probably should've been .loadstr and .dumpstr, but it's too late for
that now)

TBH, I think it would be great to just have .load and .dump read the file
with standard params when a path-like ( hasattr(obj, '__fspath__'), i.e. an
os.PathLike ) is passed, but the suggested disadvantages of this are:

- https://docs.python.org/3/library/functions.html#open

  > The default encoding is platform dependent (whatever
  > locale.getpreferredencoding() returns), but any text encoding supported by
  > Python can be used. See the codecs module for the list of supported
  > encodings.

- .load and .dump don't default to UTF-8?
  AFAIU, they do default to UTF-8. Or do they currently default to
  locale.getpreferredencoding() rather than the encoding given in the JSON
  spec(s)?
  encoding= was removed from .loads and was never accepted by json.load or
json.dump
- .load and .dump would also need to accept an encoding= parameter, so that
  callers with non-spec data don't have to keep handling the file themselves
  - pickle.load has an encoding= parameter
  - marshal.load does not have (and probably doesn't need?) an encoding=
parameter
- What if you need to specify parameters for the file context manager?
  Accepting a path-like object should not break any existing code: you
could always still open and close a file-like yourself.
  with open('file', 'rb') as _file:
      json.load(_file)

- Should we be using open(pth, 'rb') and open(pth, 'wb')? (Binary mode)
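
On the binary-mode question, a small sketch. As far as I understand it,
json.loads has accepted bytes since Python 3.6 and detects UTF-8, UTF-16 or
UTF-32 itself, and json.load just passes fp.read() to it, so a file opened
with 'rb' needs no encoding argument. The file name and sample data below are
made up for illustration.

```python
import json
import os
import tempfile

data = {"name": "Jo\u00e3o", "n": 1}

with tempfile.TemporaryDirectory() as tmp:
    pth = os.path.join(tmp, "example.json")  # hypothetical file name

    # Write UTF-16 on purpose, to show the reader does not assume UTF-8.
    with open(pth, "w", encoding="utf-16") as f:
        json.dump(data, f, ensure_ascii=False)

    # Binary mode: json.load gets bytes and detects the encoding itself.
    with open(pth, "rb") as f:
        loaded = json.load(f)

print(loaded == data)
```

The write side is different: json.dump produces str, so it needs a text-mode
file (or you encode the result of json.dumps yourself).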

JSON Specs:
- https://tools.ietf.org/html/rfc7159#section-8.1  :

  > JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.  The default
   encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
   interoperable in the sense that they will be read successfully by the
   maximum number of implementations; there are many implementations
   that cannot successfully read texts in other encodings (such as
   UTF-16 and UTF-32).

   Implementations MUST NOT add a byte order mark to the beginning of a
   JSON text.  In the interests of interoperability, implementations
   that parse JSON texts MAY ignore the presence of a byte order mark
   rather than treating it as an error.

- https://www.json.org/ >
  http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
  (PDF!)

  > JSON syntax describes a sequence of Unicode code points. JSON also
  > depends on Unicode in the hex numbers used in the \u escapement notation
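
To make the spec text concrete, here is how the json module behaves, as far
as I can tell, for the encodings and the BOM mentioned above. Treat this as
an observation about current CPython 3.x, not a documented guarantee; the
sample document is made up.

```python
import codecs
import json

doc = '{"k": "\u00e9"}'  # the str contains a literal e-acute

# bytes input: UTF-8, UTF-16 and UTF-32 are all detected by json.loads.
results = [json.loads(doc.encode(enc)) for enc in ("utf-8", "utf-16", "utf-32")]

# bytes starting with a UTF-8 BOM are tolerated (RFC 7159's "MAY ignore").
with_bom = json.loads(codecs.BOM_UTF8 + doc.encode("utf-8"))

# str input that still carries a BOM is rejected.
try:
    json.loads("\ufeff" + doc)
    bom_in_str_ok = True
except json.JSONDecodeError:
    bom_in_str_ok = False

# ensure_ascii=True (the default) writes non-ASCII as \u escapes.
escaped = json.dumps({"k": "\u00e9"})
```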

So, could we just have .load and .dump accept a path-like and an encoding=
parameter (because they need to be able to specify UTF-8 / UTF-16 / UTF-32
anyway)?
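
A rough sketch of what that could look like, written as wrapper functions
rather than as a change to the json module. The names load_path_or_file and
dump_path_or_file, and the encoding='utf-8' default, are mine for
illustration; none of this is an existing json API.

```python
import json
import os


def _is_pathlike(obj):
    # str/bytes paths, or anything implementing os.PathLike (__fspath__).
    return isinstance(obj, (str, bytes, os.PathLike))


def load_path_or_file(src, *, encoding="utf-8", **kwargs):
    """Hypothetical json.load that also accepts a path."""
    if _is_pathlike(src):
        with open(src, "r", encoding=encoding) as f:
            return json.load(f, **kwargs)
    # Already-open file: its own mode/encoding applies; encoding= is ignored.
    return json.load(src, **kwargs)


def dump_path_or_file(obj, dst, *, encoding="utf-8", **kwargs):
    """Hypothetical json.dump that also accepts a path."""
    if _is_pathlike(dst):
        with open(dst, "w", encoding=encoding) as f:
            json.dump(obj, f, **kwargs)
    else:
        json.dump(obj, dst, **kwargs)
```

Existing callers that pass an open file keep working unchanged. For an
already-open file the encoding is still whatever the caller chose in open(),
which is the path-versus-file difference raised in the quoted message below.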

On Tue, Sep 15, 2020 at 3:22 AM Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Joao S. O. Bueno writes:
>
>  > If .load and .dump are super-charged, people coding with these
>  > methods in mind have _one_ less_ thing to worry about: if the
>  > method accepts a path or an open file becomes irrelevant.
>
> But then you either lose the primary benefit of this three line
> function (defaulting to the UTF-8 encoding to conform to the JSON
> standard), or you have a situation where what encoding you get can
> depend on whether you use the name of a file or that file already
> opened.
>
> I consider that worse because it's precisely the kind of thing that
> people *don't* worry about and *do* have some difficulty debugging.
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/KO3ZZNTDMFZD26QGPTSNEXP2ALRDWOMF/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DHRTCSYINOZFVBYOQZ4CKFS5ZHDUSIZ3/