[ 
https://issues.apache.org/jira/browse/ARROW-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246309#comment-16246309
 ] 

Wes McKinney commented on ARROW-1784:
-------------------------------------

It's hard to prevent a memory doubling on receipt if you go column-wise (e.g. 
{{pd.DataFrame(data)}} where data is a dict of columns will double memory). So 
I think as long as we avoid memory doubling we are good

> [Python] Read and write pandas.DataFrame in pyarrow.serialize by decomposing 
> the BlockManager rather than coercing to Arrow format
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1784
>                 URL: https://issues.apache.org/jira/browse/ARROW-1784
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>             Fix For: 0.8.0
>
>
> See discussion in https://github.com/dask/distributed/pull/931
> This will permit zero-copy reads for DataFrames not containing Python 
> objects. In the event of an {{ObjectBlock}} these arrays will not be worse 
> than pickle to reconstruct on the receiving side



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to