[
https://issues.apache.org/jira/browse/ARROW-14115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419709#comment-17419709
]
Joris Van den Bossche commented on ARROW-14115:
-----------------------------------------------
[~apitrou] I was looking into what removing this would involve. However, we
still have one place internally where we use the (de)serialization machinery
ourselves, namely in plasma.
In the Python plasma code, we use it in
{{PlasmaClient.put}}/{{PlasmaClient.get}} to put/get generic Python objects in
the plasma store
(https://github.com/apache/arrow/blob/83e169852c21bed92924a53df8902d3a17935ceb/python/pyarrow/_plasma.pyx#L543-L547).
Could we replace this with storing the pickled object in the store (that is,
using pickle instead of our own custom (de)serialization)?
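To make the pickling idea concrete, here is a minimal sketch of a put/get pair that stores pickled bytes. The {{DictStore}} class and the {{put}}/{{get}} helpers are hypothetical stand-ins for illustration only; they are not pyarrow or plasma APIs, and a real change would write the bytes into a plasma buffer instead.

```python
import pickle


class DictStore:
    """Hypothetical in-memory stand-in for a plasma store (not a pyarrow API)."""

    def __init__(self):
        self._objects = {}

    def put_bytes(self, object_id, payload):
        # Keep an immutable copy of the serialized payload under the given id.
        self._objects[object_id] = bytes(payload)

    def get_bytes(self, object_id):
        return self._objects[object_id]


def put(store, object_id, value):
    # Serialize with the highest available pickle protocol, then store the bytes.
    payload = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    store.put_bytes(object_id, payload)


def get(store, object_id):
    # Fetch the stored bytes and rebuild the Python object.
    return pickle.loads(store.get_bytes(object_id))


store = DictStore()
put(store, b"obj-1", {"a": [1, 2, 3], "b": "text"})
restored = get(store, b"obj-1")
```

With this approach the store only ever sees opaque bytes, so none of the custom (de)serialization code would be needed on the Python side.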
In addition, we use part of the {{arrow::py}} serialize functionality in
the plasma C++ code: {{WriteNdarrayHeader}}
(https://github.com/apache/arrow/blob/83e169852c21bed92924a53df8902d3a17935ceb/python/pyarrow/tensorflow/plasma_op.cc#L141)
and {{NdarrayFromBuffer}}
(https://github.com/apache/arrow/blob/83e169852c21bed92924a53df8902d3a17935ceb/python/pyarrow/tensorflow/plasma_op.cc#L286).
It would be relatively easy to keep only those two functions in
(de)serialize.cc and remove the rest, which we no longer need. But are you
aware of other internal utilities that could cover this as well?
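For readers unfamiliar with what a header-writing and buffer-reading pair does, the following is a rough sketch of the general idea in plain Python. The header layout here (a one-byte type code, a uint32 dimension count, then one int64 per dimension) is an assumption made up for illustration; it is NOT the layout used by {{WriteNdarrayHeader}} or {{NdarrayFromBuffer}}, and the function names below are hypothetical.

```python
import struct
from array import array

# Hypothetical header layout (NOT the one used by arrow::py):
#   1 byte   ASCII type code (as used by the stdlib `array` module)
#   uint32   number of dimensions
#   int64    one entry per dimension
HEADER_PREFIX = "<cI"


def write_ndarray_header(typecode, shape):
    # Pack the type code and dimension count, followed by the shape itself.
    prefix = struct.pack(HEADER_PREFIX, typecode.encode("ascii"), len(shape))
    return prefix + struct.pack(f"<{len(shape)}q", *shape)


def ndarray_from_buffer(buf):
    # Read the fixed-size prefix, then the variable-length shape.
    typecode, ndim = struct.unpack_from(HEADER_PREFIX, buf, 0)
    offset = struct.calcsize(HEADER_PREFIX)
    shape = struct.unpack_from(f"<{ndim}q", buf, offset)
    offset += struct.calcsize(f"<{ndim}q")
    # Everything after the header is the flat element data.
    values = array(typecode.decode("ascii"))
    values.frombytes(bytes(buf[offset:]))
    return shape, values


data = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
buf = write_ndarray_header("d", (2, 3)) + data.tobytes()
shape, values = ndarray_from_buffer(buf)
```

The point is only that these two functions form a small, self-contained pair that can be kept independently of the rest of the serialization code.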
> [Python] Remove deprecated pyarrow.serialization functionality
> --------------------------------------------------------------
>
> Key: ARROW-14115
> URL: https://issues.apache.org/jira/browse/ARROW-14115
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Python
> Reporter: Joris Van den Bossche
> Assignee: Joris Van den Bossche
> Priority: Major
> Fix For: 6.0.0
>
>