On Thu, Nov 25, 2021 at 10:21 PM Qianqian Fang <q.f...@neu.edu> wrote:

> On 11/25/21 17:05, Stephan Hoyer wrote:
>
> Hi Qianqian,
>
> What is your concrete proposal for NumPy here?
>
> Are you suggesting new methods or functions like to_json/from_json in
> NumPy itself?
>
>
> that would work - either defining a subclass of JSONEncoder that
> serializes ndarray and letting users pass it to cls in json.dump, or, as
> you mentioned, defining to_json/from_json methods like pandas DataFrame's
> <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html>
> would save people from writing custom code and formats.
>
> I am also wondering whether there is a more automated way to tell
> json.dump/dumps to use a default serializer for ndarray without passing
> cls=...? I saw a Stack Overflow post mentioning a method called
> "__serialize__" on a class, but I can't find it in the official docs. Is
> anyone aware of a way to define a default JSON serializer on an object?
>
There isn't one. You have to explicitly provide the JSONEncoder. Which is
why there is nothing that we can really do in numpy to avoid the TypeError
that you mention below. The stdlib json module just doesn't give us the
hooks to be able to do that. We can provide top-level functions like
to_json()/from_json() to encode/decode a top-level ndarray to a JSON text,
but that doesn't help with ndarrays in dicts or other objects. We could
also provide a JSONEncoder/JSONDecoder pair, but as I mention in one of
the GitHub issues you link to, there are a number of different
expectations that people could have for what the JSON representation of an
array is. Some will want to use the JData standard. Others might just want
the arrays to be represented as lists of lists of plain-old JSON numbers in
order to talk with software in other languages that have no particular
standard for array data.
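
To make that second flavor concrete, here is a rough sketch of the kind of
encoder I mean (the class name is made up, and nothing like it exists in
numpy today):

import json
import numpy as np

class NDArrayEncoder(json.JSONEncoder):
    # Sketch only: render any ndarray as nested lists of plain JSON numbers.
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            # For ordinary numeric dtypes, tolist() yields native Python
            # numbers that json can serialize directly.
            return obj.tolist()
        return super().default(obj)

# The encoder still has to be passed explicitly on every call; json gives
# ndarray no way to register itself as serializable.
text = json.dumps({"weights": np.arange(6).reshape(2, 3)}, cls=NDArrayEncoder)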

> As far as I can tell, reading/writing in your custom JSON format already
> works with your jdata library.
>
> ideally, I was hoping the small jdata encoder/decoder functions could be
> integrated into numpy; that would help avoid the "TypeError: Object of
> type ndarray is not JSON serializable" in json.dump/dumps without needing
> additional modules; more importantly, it would simplify users' experience
> in exchanging complex arrays (complex-valued, sparse, special shapes) with
> other programming environments.
>
It seems to me that the jdata package is the right place for implementing
the JData standard. I'm happy for our documentation to point to it in all
the places that we talk about serialization of arrays. If the json module
did have some way for us to specify a default representation for our
objects, then that would be a different matter. But for the present
circumstances, I'm not seeing a substantial benefit to moving this code
inside of numpy. Outside of numpy, you can evolve the JData standard at its
own pace.
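
For what it's worth, users who want this behavior today can already opt in
per call through the default= and object_hook= arguments that json.dumps
and json.loads do accept; what's missing is any way for numpy to register a
representation globally. A throwaway illustration (deliberately not the
JData format; the tag and keys below are made up):

import json
import numpy as np

def encode_array(obj):
    # Made-up tagged-dict representation, purely for illustration.
    if isinstance(obj, np.ndarray):
        return {"__ndarray__": obj.tolist(), "dtype": obj.dtype.str}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_array(d):
    # Called for every decoded JSON object; rebuild an ndarray when tagged.
    if "__ndarray__" in d:
        return np.asarray(d["__ndarray__"], dtype=np.dtype(d["dtype"]))
    return d

data = {"x": np.linspace(0.0, 1.0, 5)}
text = json.dumps(data, default=encode_array)
restored = json.loads(text, object_hook=decode_array)
assert np.array_equal(restored["x"], data["x"])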

-- 
Robert Kern