New submission from matt farrugia <m...@far.in.net>:
The json module allows a user to provide an `object_hook` function, which, if provided, is called to transform the dict that is created as a result of parsing a JSON Object.

It'd be nice if there were something analogous for JSON Arrays: an `array_hook` function to transform the list that is created as a result of parsing a JSON Array.

At the moment, transforming JSON Arrays requires one of the following approaches (as far as I can see):

(1) Providing an `object_hook` function that recursively transforms any lists in the values of an Object/dict, including any nested lists, AND recursively transforming the final result in case the top-level JSON value being parsed is an array (such an array is never inside a JSON Object that goes through the `object_hook` transformation).

(2) Transforming the entire parsed result after parsing is finished by recursively transforming any lists in the final result, including recursively traversing nested lists AND nested dicts.

Providing an `array_hook` would remove the need for either approach, because the per-list step of those recursive functions could be used as the `array_hook` function directly (without the recursion).

## An example of usage:

Let's say we want JSON Arrays represented using tuples rather than lists, e.g. so that they are hashable straight out-of-the-(json)-box. Before this enhancement, this change requires one of the two methods I mentioned above. These recursive functions are not difficult to implement, but the result seems inelegant.

After the change, `tuple` could be used as the `array_hook` directly:

```
>>> json.loads('{"foo": [[1, 2], "spam", [], ["eggs"]]}', array_hook=tuple)
{'foo': ((1, 2), 'spam', (), ('eggs',))}
```

It seems (in my opinion) this is more elegant than converting via an `object_hook` or traversing the whole structure after parsing.
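For comparison, here is a rough sketch of how the two existing workarounds might look against the current `json` API, without any `array_hook` support. The helper names (`convert_lists`, `make_object_hook`, `loads_via_object_hook`, `transform_arrays`) are hypothetical and exist only for this illustration; they are not part of the `json` module or of the patch.

```python
import json


def convert_lists(value, array_hook):
    """Apply array_hook to a list and to any lists nested directly inside it.

    Dicts are returned untouched: in approach (1) they have already been
    processed by the object_hook by the time this function sees them.
    """
    if isinstance(value, list):
        return array_hook([convert_lists(item, array_hook) for item in value])
    return value


def make_object_hook(array_hook):
    """Approach (1): build an object_hook that converts lists in dict values."""
    def object_hook(obj):
        return {key: convert_lists(val, array_hook) for key, val in obj.items()}
    return object_hook


def loads_via_object_hook(text, array_hook):
    result = json.loads(text, object_hook=make_object_hook(array_hook))
    # A top-level array never passes through object_hook, so handle it here.
    return convert_lists(result, array_hook)


def transform_arrays(value, array_hook):
    """Approach (2): walk the fully parsed result, through lists AND dicts."""
    if isinstance(value, list):
        return array_hook([transform_arrays(item, array_hook) for item in value])
    if isinstance(value, dict):
        return {key: transform_arrays(val, array_hook) for key, val in value.items()}
    return value


text = '{"foo": [[1, 2], "spam", [], ["eggs"]]}'
via_hook = loads_via_object_hook(text, tuple)
via_post = transform_arrays(json.loads(text), tuple)
top_level = loads_via_object_hook('[[1], {"a": [2]}]', tuple)

print(via_hook)   # {'foo': ((1, 2), 'spam', (), ('eggs',))}
print(via_post)   # {'foo': ((1, 2), 'spam', (), ('eggs',))}
print(top_level)  # ((1,), {'a': (2,)})
```

Both variants need their own recursion plus a special case (the top-level array in the first, the dict traversal in the second), which is the boilerplate the proposed `array_hook` is meant to avoid.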
## The patch:

I am submitting a patch that adds an `array_hook` kwarg to the `json` module's functions `load` and `loads`, and to the `json.decoder` module's `JSONDecoder`, `JSONArray` and `JSONObject` classes. I also hooked these together in the `json.scanner` module's `py_make_scanner` function.

It seems that `json.scanner` will prefer the `c_make_scanner` function defined in `Modules/_json.c` when it is available. I am not confident enough in my C skills or C x Python knowledge to dive into this module and make the analogous changes. But I assume they will be simple for someone who can read C x Python code, and that the changes will be analogous to those required to `Lib/json/scanner.py`.

I need help to accomplish this part of the patch.

## Testing:

In the meantime, I added a test to `test_json.test_decode`. It's CURRENTLY FAILING because the implementation of the patch is incomplete (I believe this is only due to the missing part of the patch: the required changes to `Modules/_json.c` I identified above).

When I manually reset `json.scanner.make_scanner` to `json.scanner.py_make_scanner` and play around with the new `array_hook` functionality, it seems to work.

----------
components: Extension Modules, Library (Lib)
messages: 340957
nosy: matomatical
priority: normal
severity: normal
status: open
title: Add 'array_hook' for json module
type: enhancement
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36738>
_______________________________________