[issue36738] Add 'array_hook' for json module

2019-04-27 Thread matt farrugia


Change by matt farrugia :


--
keywords: +patch
pull_requests: +12906
stage:  -> patch review

___
Python tracker 
<https://bugs.python.org/issue36738>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36738] Add 'array_hook' for json module

2019-04-27 Thread matt farrugia


New submission from matt farrugia :

The json module allows a user to provide an `object_hook` function, which, if 
provided, is called to transform the dict that is created as a result of 
parsing a JSON Object.

It'd be nice if there were something analogous for JSON Arrays: an `array_hook` 
function to transform the list that is created as a result of parsing a JSON 
Array.

At the moment, transforming JSON Arrays requires one of the following 
approaches (as far as I can see):

(1) Providing an `object_hook` function that recursively transforms any lists 
among the values of an Object/dict, including nested lists, AND separately 
transforming the final result in case the top-level JSON value being parsed is 
an Array (a top-level Array is never inside a JSON Object, so it never passes 
through the `object_hook` transformation).
(2) Transforming the entire parsed result after parsing has finished, 
recursively traversing nested lists AND nested dicts to convert every list.
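
For concreteness, here is a minimal sketch of what the two workarounds look 
like today, taking list-to-tuple conversion as the transformation. All of the 
function names below are hypothetical helpers written for illustration; only 
`json.loads` and its `object_hook` parameter come from the standard library.

```python
import json


def lists_to_tuples(value):
    """Convert a list, and any lists nested directly inside it, to tuples.

    Dicts are returned unchanged: in approach (1) they have already been
    processed by the object_hook by the time we see them.
    """
    if isinstance(value, list):
        return tuple(lists_to_tuples(item) for item in value)
    return value


def tuple_object_hook(obj):
    """Approach (1): convert the lists found among a dict's values."""
    return {key: lists_to_tuples(val) for key, val in obj.items()}


def loads_via_object_hook(text):
    """Approach (1), including the extra step for a top-level Array."""
    result = json.loads(text, object_hook=tuple_object_hook)
    return lists_to_tuples(result)


def deep_tuplify(value):
    """Approach (2): walk nested lists AND nested dicts after parsing."""
    if isinstance(value, list):
        return tuple(deep_tuplify(item) for item in value)
    if isinstance(value, dict):
        return {key: deep_tuplify(val) for key, val in value.items()}
    return value


def loads_then_transform(text):
    """Approach (2): parse first, then transform the whole result."""
    return deep_tuplify(json.loads(text))
```

Both helpers give the same result for the same input; the difference is only 
in where the recursion lives.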

Providing an `array_hook` would remove the need for either approach: the 
per-list step of the recursive functions I mentioned could be used as the 
`array_hook` function directly (without the recursion).


## An example of usage:

Let's say we want JSON Arrays represented using tuples rather than lists, e.g. 
so that they are hashable straight out-of-the-(json)-box. Before this 
enhancement, this change requires one of the two methods I mentioned above. 
These recursive functions are not difficult to implement, but writing them 
seems inelegant. After the change, `tuple` could be used as the `array_hook` 
directly:

```
>>> json.loads('{"foo": [[1, 2], "spam", [], ["eggs"]]}', array_hook=tuple)
{'foo': ((1, 2), 'spam', (), ('eggs',))}
```

It seems (in my opinion) this is more elegant than converting via an 
`object_hook` or traversing the whole structure after parsing.

## The patch:

I am submitting a patch that adds an `array_hook` kwarg to the `json` module's 
functions `load` and `loads`, and to the `json.decoder` module's `JSONDecoder` 
class and `JSONArray` and `JSONObject` functions. I also hooked these together 
in the `json.scanner` module's `py_make_scanner` function.


It seems that `json.scanner` will prefer the `c_make_scanner` function defined 
in `Modules/_json.c` when it is available. I am not confident enough in my C 
skills or my knowledge of CPython's C code to dive into this module and make 
the analogous changes. But I assume they will be simple for someone who can 
read that code, and that they will mirror the changes required to 
`Lib/json/scanner.py`. I need help to accomplish this part of the patch.


## Testing:

In the meantime, I added a test to `test_json.test_decode`. It's CURRENTLY 
FAILING because the implementation of the patch is incomplete (I believe this 
is only due to the missing part of the patch, namely the required changes to 
`Modules/_json.c` I identified above).

When I manually reset `json.scanner.make_scanner` to 
`json.scanner.py_make_scanner` and play around with the new `array_hook` 
functionality, it seems to work.
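
For anyone who wants to try the pure-Python scanner path on an interpreter 
without the patch, the following is a rough sketch only. It does not use 
`array_hook` at all. Instead it assumes two undocumented implementation 
details of CPython's `json` package that may change between versions: that 
`py_make_scanner` reads a `parse_array` attribute from the decoder, and that 
`json.decoder.JSONArray` returns a `(list, end_index)` pair. 
`TupleArrayDecoder` is a hypothetical name introduced here for illustration.

```python
import json
import json.decoder
import json.scanner


class TupleArrayDecoder(json.JSONDecoder):
    """Hypothetical decoder that returns every JSON Array as a tuple."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Assumption: py_make_scanner picks up parse_array from the decoder
        # it is given, so install the wrapper before rebuilding the scanner.
        self.parse_array = self._parse_array_as_tuple
        # Force the pure-Python scanner rather than the default make_scanner,
        # which is the C scanner when the _json extension is available.
        self.scan_once = json.scanner.py_make_scanner(self)

    @staticmethod
    def _parse_array_as_tuple(*args, **kwargs):
        # Delegate to the stock array parser, then convert its list result.
        values, end = json.decoder.JSONArray(*args, **kwargs)
        return tuple(values), end
```

It can be passed to `json.loads` through the existing `cls` parameter, e.g. 
`json.loads(text, cls=TupleArrayDecoder)`.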

--
components: Extension Modules, Library (Lib)
messages: 340957
nosy: matomatical
priority: normal
severity: normal
status: open
title: Add 'array_hook' for json module
type: enhancement
versions: Python 3.8
