[
https://issues.apache.org/jira/browse/ARROW-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16277943#comment-16277943
]
ASF GitHub Bot commented on ARROW-1784:
---------------------------------------
wesm commented on issue #1390: ARROW-1784: [Python] Enable zero-copy
serialization, deserialization of pandas.DataFrame via components
URL: https://github.com/apache/arrow/pull/1390#issuecomment-349181695
> I think I read that you all had set up nightly builds on the twosigma
> channel?
Yes, as soon as this is merged, it should show up in the next nightly build at
https://anaconda.org/twosigma/pyarrow/files. We are having a small problem
with the version numbers in the nightlies, though
(https://issues.apache.org/jira/browse/ARROW-1881), which needs to get fixed in
the next day or two (cc @xhochy).
> This is to be expected, right?
Yes, it's a nice confirmation that pandas is definitely not making any
unexpected memory copies (it can be quite zealous about copying stuff).
> That's surprisingly nice. Do you have a sense for what is going on here?
> 100ms in copying memory?
Yes, I think this is strictly from copying the internal numeric ndarrays.
Memory use will also be lower than with pickle, by whatever the total pickled
footprint of the copied numeric arrays is.
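
To make the copy vs. zero-copy distinction concrete, here is a minimal
stdlib-only sketch. It is an analogy, not the pyarrow implementation:
`pack_block` and `read_block_zero_copy` are hypothetical helpers, and an
`array.array` of doubles stands in for one numeric block of a DataFrame.

```python
import array
import pickle


def pack_block(values):
    """Hypothetical helper: raw float64 bytes for one numeric block."""
    return array.array("d", values).tobytes()


def read_block_zero_copy(payload):
    """Hypothetical helper: reinterpret the received bytes in place.

    memoryview.cast() only changes how the existing buffer is viewed,
    so no element data is copied.
    """
    return memoryview(payload).cast("d")


values = [1.0, 2.5, 4.0]

# Zero-copy path: the view keeps a reference to the payload it was built on.
payload = pack_block(values)
view = read_block_zero_copy(payload)
shares_payload = view.obj is payload

# Pickle path: loads() allocates a fresh array and copies the data into it.
original = array.array("d", values)
restored = pickle.loads(pickle.dumps(original))
is_new_object = restored is not original
```

The view is read-only because it is backed by an immutable `bytes` object. A
receiver that needs to mutate the data would have to copy it at that point.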
----------------------------------------------------------------
> [Python] Read and write pandas.DataFrame in pyarrow.serialize by decomposing
> the BlockManager rather than coercing to Arrow format
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: ARROW-1784
> URL: https://issues.apache.org/jira/browse/ARROW-1784
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Wes McKinney
> Assignee: Wes McKinney
> Labels: pull-request-available
> Fix For: 0.8.0
>
>
> See discussion in https://github.com/dask/distributed/pull/931
> This will permit zero-copy reads for DataFrames not containing Python
> objects. In the event of an {{ObjectBlock}}, these arrays will be no worse
> than pickle to reconstruct on the receiving side.