Dear numpy developers,
I would like to share a proposal to make ndarray JSON serializable by
default, as detailed in this GitHub issue:
https://github.com/numpy/numpy/issues/20461
Briefly, my group and collaborators are working on a new NIH (National
Institutes of Health) funded initiative - NeuroJSON
(http://neurojson.org) - to further disseminate a lightweight data
annotation specification (JData
<https://github.com/NeuroJSON/jdata/blob/master/JData_specification.md>)
among the broad neuroimaging/scientific community. Python and numpy have
been widely used <http://neuro.debian.net/_files/nipy-handout.pdf> in
neuroimaging data analysis pipelines (nipy, nibabel, mne-python,
PySurfer ...), because the N-D array is THE most important data structure
used in scientific data. However, numpy currently does not support JSON
serialization by default. This is one of the frequently requested
features on GitHub (#16432, #12481).
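
To make the gap concrete, here is a minimal sketch using only the
standard json module and numpy. It shows json.dumps rejecting an ndarray,
followed by the common workaround of a custom json.JSONEncoder. The class
name NdarrayEncoder is hypothetical and not part of numpy or of our
modules; note that the tolist() fallback loses the dtype, which is one of
the things an annotated format is meant to preserve.

```python
import json
import numpy as np

a = np.arange(6, dtype=np.uint8).reshape(2, 3)

# The standard encoder does not know about ndarray and raises TypeError.
try:
    json.dumps(a)
    default_ok = True
except TypeError:
    default_ok = False


class NdarrayEncoder(json.JSONEncoder):
    """Hypothetical workaround: fall back to nested lists for ndarray."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)


text = json.dumps(a, cls=NdarrayEncoder)

# The shape survives via list nesting, but the dtype does not:
# the restored array uses the platform default integer type, not uint8.
restored = np.array(json.loads(text))
```
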
We have developed lightweight Python modules (jdata
<https://pypi.org/project/jdata/>, bjdata
<https://pypi.org/project/bjdata/>) to help export/import ndarray
objects to/from JSON (and a binary JSON format - BJData
<https://github.com/NeuroJSON/bjdata/blob/master/Binary_JData_Specification.md>/UBJSON
<http://ubjson.org/> - to gain efficiency). The approach is to convert
ndarray objects to a dictionary with subfields using standardized JData
annotation tags. The JData spec can serialize complex data structures
such as N-D arrays (solid, sparse, complex), trees, graphs, tables, etc.
It also permits data compression. These annotations have been
implemented in my MATLAB toolbox - JSONLab
<https://github.com/fangq/jsonlab> - since 2011 to help import/export
MATLAB data types, and have been broadly used among MATLAB/GNU Octave users.
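
As a rough illustration of the dictionary-with-annotation-tags approach,
the sketch below wraps a dense, real-valued ndarray using the
_ArrayType_, _ArraySize_ and _ArrayData_ tags and reads it back. It is a
simplified sketch, not the code in our jdata module: the helper names
to_annotated and from_annotated and the small type-name table are
assumptions made for this example, and row-major flattening of
_ArrayData_ is assumed.

```python
import json
import numpy as np

# Assumed subset of JData type names mapped to numpy dtype names.
JDATA_TO_NUMPY = {
    "uint8": "uint8",
    "int32": "int32",
    "single": "float32",
    "double": "float64",
}
NUMPY_TO_JDATA = {v: k for k, v in JDATA_TO_NUMPY.items()}


def to_annotated(arr):
    """Hypothetical helper: wrap a dense real ndarray in JData-style tags."""
    return {
        "_ArrayType_": NUMPY_TO_JDATA[arr.dtype.name],
        "_ArraySize_": list(arr.shape),
        "_ArrayData_": arr.ravel().tolist(),  # row-major (C order) flattening
    }


def from_annotated(obj):
    """Hypothetical helper: rebuild the ndarray with its original dtype and shape."""
    dtype = JDATA_TO_NUMPY[obj["_ArrayType_"]]
    return np.array(obj["_ArrayData_"], dtype=dtype).reshape(obj["_ArraySize_"])


a = np.array([[1.5, 2.0], [3.0, 4.25]], dtype=np.float32)
text = json.dumps(to_annotated(a))   # plain JSON, readable by any JSON parser
b = from_annotated(json.loads(text))  # dtype and shape are restored
```

Because the result is an ordinary dictionary, any JSON library on any
platform can parse it even without knowing the tags; a tag-aware reader
can additionally restore the array type and dimensions.
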
Examples of these portable JSON annotation tags representing N-D arrays
can be found at
http://openjdata.org/wiki/index.cgi?JData/Examples/Basic#2_D_arrays_in_the_annotated_format
http://openjdata.org/wiki/index.cgi?JData/Examples/Advanced
and the detailed formats on N-D array annotations can be found in the spec:
https://github.com/NeuroJSON/jdata/blob/master/JData_specification.md#annotated-storage-of-n-d-arrays
Our current Python code to encode/decode ndarray objects to/from
JSON-serializable forms is implemented in these compact functions
(handling lossless type/data conversion and data compression):
https://github.com/NeuroJSON/pyjdata/blob/63301d41c7b97fc678fa0ab0829f76c762a16354/jdata/jdata.py#L72-L97
https://github.com/NeuroJSON/pyjdata/blob/63301d41c7b97fc678fa0ab0829f76c762a16354/jdata/jdata.py#L126-L160
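
For readers who do not want to follow the links, the general idea of
lossless conversion with compression can be sketched as below. This is
not a copy of the linked functions: the helper names compress_array and
decompress_array are hypothetical, the numpy dtype name is stored
directly in _ArrayType_ for simplicity, native byte order is assumed, and
the use of the _ArrayZipType_, _ArrayZipSize_ and _ArrayZipData_ tags
follows my reading of the JData spec with the binary buffer
zlib-compressed and then base64-encoded.

```python
import base64
import json
import zlib

import numpy as np


def compress_array(arr):
    """Hypothetical helper: store the raw array bytes, zlib-compressed and base64-encoded."""
    raw = np.ascontiguousarray(arr).tobytes()  # C-order bytes, native endianness
    return {
        "_ArrayType_": arr.dtype.name,  # simplification: numpy dtype name
        "_ArraySize_": list(arr.shape),
        "_ArrayZipType_": "zlib",
        "_ArrayZipSize_": [1, int(arr.size)],  # size of the pre-compression vector
        "_ArrayZipData_": base64.b64encode(zlib.compress(raw)).decode("ascii"),
    }


def decompress_array(obj):
    """Hypothetical helper: invert compress_array bit-for-bit."""
    raw = zlib.decompress(base64.b64decode(obj["_ArrayZipData_"]))
    flat = np.frombuffer(raw, dtype=obj["_ArrayType_"])
    return flat.reshape(obj["_ArraySize_"]).copy()  # copy to get a writable array


a = np.linspace(0.0, 1.0, 1000).reshape(10, 100)
text = json.dumps(compress_array(a))
b = decompress_array(json.loads(text))
```

Since the raw bytes are stored rather than a decimal text rendering, the
round trip is exact for floating-point data as well.
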
We strongly believe that enabling JSON serialization by default will
benefit the numpy user community, making it a lot easier to share
complex data between platforms (MATLAB/Python/C/FORTRAN/JavaScript...)
via a standardized/NIH-backed data annotation scheme.
We would be happy to hear your thoughts and suggestions on how to
contribute, and would be glad to set up dedicated discussions.
Cheers
Qianqian
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org