[ https://issues.apache.org/jira/browse/ARROW-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246309#comment-16246309 ]
Wes McKinney commented on ARROW-1784:
-------------------------------------
It's hard to prevent memory doubling on receipt if you go column-wise (e.g.
{{pd.DataFrame(data)}}, where {{data}} is a dict of columns, will double memory). So
I think as long as we avoid memory doubling, we are good.
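
To make the column-wise point concrete, here is a minimal sketch that compares building a frame from a dict of columns with wrapping an existing 2-D array. The variable names are illustrative only, and the exact copy behaviour depends on the pandas version; {{np.shares_memory}} is used to check whether the frame still points at the input buffers.

```python
import numpy as np
import pandas as pd

n = 1000
a = np.arange(n, dtype="float64")
b = np.ones(n, dtype="float64")

# Column-wise: the two float64 columns are copied into pandas-owned storage,
# so the input arrays and the frame each hold a full copy of the data.
df_cols = pd.DataFrame({"a": a, "b": b})
cols_share = np.shares_memory(a, df_cols["a"].values)

# Block-wise: a single 2-D array passed with copy=False is wrapped in place.
block = np.zeros((n, 2), dtype="float64")
df_block = pd.DataFrame(block, columns=["a", "b"], copy=False)
block_shares = np.shares_memory(block, df_block.values)

print("dict of columns shares memory with input:", cols_share)
print("2-D block shares memory with input:", block_shares)
```

If the first check reports no sharing, the receiver briefly holds both the incoming column buffers and the frame's own copy, which is the doubling described above.
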
> [Python] Read and write pandas.DataFrame in pyarrow.serialize by decomposing
> the BlockManager rather than coercing to Arrow format
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: ARROW-1784
> URL: https://issues.apache.org/jira/browse/ARROW-1784
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Wes McKinney
> Fix For: 0.8.0
>
>
> See discussion in https://github.com/dask/distributed/pull/931
> This will permit zero-copy reads for DataFrames not containing Python
> objects. In the event of an {{ObjectBlock}}, these arrays will be no worse
> than pickle to reconstruct on the receiving side.
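
The idea in the description can be sketched roughly as follows. This is not the {{pyarrow.serialize}} implementation or wire format: {{pack_block}}, {{unpack_block}} and the header layout are hypothetical names made up for illustration, and a {{bytearray}} stands in for bytes received over the wire. Numeric block values are rebuilt by wrapping the received buffer with {{np.frombuffer}} (no copy), while object arrays fall back to pickle.

```python
import pickle
import numpy as np


def pack_block(values):
    """Hypothetical helper: turn one block's ndarray into (header, payload)."""
    if values.dtype == object:
        # Python objects have no flat buffer representation; use pickle.
        payload = pickle.dumps(values, protocol=pickle.HIGHEST_PROTOCOL)
        return {"kind": "pickle"}, payload
    header = {"kind": "buffer", "dtype": values.dtype.str, "shape": values.shape}
    # The bytearray stands in for the raw bytes as they arrive at the receiver.
    return header, bytearray(values.tobytes())


def unpack_block(header, payload):
    """Hypothetical helper: rebuild the ndarray on the receiving side."""
    if header["kind"] == "pickle":
        return pickle.loads(payload)
    # np.frombuffer wraps the received bytes; reshape returns a view, not a copy.
    return np.frombuffer(payload, dtype=header["dtype"]).reshape(header["shape"])


float_block = np.arange(6, dtype="float64").reshape(2, 3)
header, payload = pack_block(float_block)
restored = unpack_block(header, payload)
# True when the restored array is backed by the received buffer itself.
zero_copy = np.shares_memory(restored, np.frombuffer(payload, dtype="uint8"))

object_block = np.array(["a", None, 3], dtype=object)
obj_header, obj_payload = pack_block(object_block)
restored_objects = unpack_block(obj_header, obj_payload)
```

In this sketch the numeric path never allocates a second array on receipt, and the object path costs the same as a plain pickle round trip.
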