New submission from Nick Coghlan:

In the Python 3 transition, we had to choose whether to treat the JSON module 
as a text transform (with load[s] reading Unicode code points and dump[s] 
producing them) or as a text encoding (with load[s] reading binary sequences 
and dump[s] producing them).

To minimise the changes to the module API, the decision was made to treat it as 
a text transform, with the text encoding handled externally.

This API design decision doesn't appear to have worked out that well in the web 
development context, since JSON is typically encountered as a UTF-8 encoded 
wire protocol, not as already decoded text.
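
For illustration, a minimal sketch of what "text encoding handled externally" 
means in practice when JSON travels as UTF-8 bytes. The payload and variable 
names are made up for this example, and the behaviour shown is that of the 
text-transform API described above:

```python
import json

# Illustrative payload; the names here are assumptions for the example.
payload = {"name": "Zoë"}

# dumps returns str, so the caller encodes it before putting it on the wire.
wire_bytes = json.dumps(payload, ensure_ascii=False).encode("utf-8")

# On the receiving side the caller decodes the bytes before parsing,
# because the text-transform API works on str, not on bytes.
received = json.loads(wire_bytes.decode("utf-8"))
```

Both the encode and the decode step are the caller's responsibility, and both 
have to agree on UTF-8.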

It also makes the module inconsistent with most of the other modules that offer 
"dumps" APIs, as those *are* specifically about wire protocols (Python 3.4):

>>> import json, marshal, pickle, plistlib, xmlrpc.client
>>> json.dumps('hello')
'"hello"'
>>> marshal.dumps('hello')
b'\xda\x05hello'
>>> pickle.dumps('hello')
b'\x80\x03X\x05\x00\x00\x00helloq\x00.'
>>> plistlib.dumps('hello')
b'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">\n<plist version="1.0">\n<string>hello</string>\n</plist>\n'

The only other module whose dumps function returns a string (like the json 
module) is the XML-RPC client module:

>>> xmlrpc.client.dumps(('hello',))
'<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'

And that's nonsensical, since that XML-RPC API *accepts an encoding argument*, 
which it now silently ignores:

>>> xmlrpc.client.dumps(('hello',), encoding='utf-8')
'<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'
>>> xmlrpc.client.dumps(('hello',), encoding='utf-16')
'<params>\n<param>\n<value><string>hello</string></value>\n</param>\n</params>\n'

I now believe that an "encoding" parameter should have been added to the 
json.dump API in the Py3k transition (defaulting to UTF-8), allowing all of the 
dump/load APIs in the standard library to be consistently about converting to 
and from a binary wire protocol.
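
As a rough sketch only, the behaviour of such an "encoding" parameter can be 
approximated with wrapper functions. The names dumps_bytes and loads_bytes are 
hypothetical and do not exist in the json module; they are shown only to make 
the bytes-in, bytes-out shape concrete:

```python
import json

def dumps_bytes(obj, encoding="utf-8", **kwargs):
    """Hypothetical helper: serialise obj to JSON text, then encode to bytes."""
    return json.dumps(obj, **kwargs).encode(encoding)

def loads_bytes(data, encoding="utf-8", **kwargs):
    """Hypothetical helper: decode bytes to text, then parse it as JSON."""
    return json.loads(data.decode(encoding), **kwargs)

# With these wrappers the result is bytes, matching marshal, pickle and plistlib.
encoded = dumps_bytes('hello')          # b'"hello"'
round_tripped = loads_bytes(encoded)    # 'hello'
```

Separate wrappers like these leave the return type of json.dumps itself 
untouched.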

Unfortunately, I don't have a solution to offer at this point (since backwards 
compatibility concerns rule out the simple solution of just changing the return 
type). I just wanted to get it on record as a problem with the current API, and 
as an internal inconsistency within the standard library for dump/load 
protocols.

----------
components: Library (Lib)
messages: 204764
nosy: chrism, ncoghlan
priority: normal
severity: normal
status: open
title: Wire protocol encoding for the JSON module
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19837>
_______________________________________