[
https://issues.apache.org/jira/browse/ARROW-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wes McKinney updated ARROW-2249:
--------------------------------
Fix Version/s: 0.12.0 (was: 0.11.0)
> [Java/Python] in-process vector sharing from Java to Python
> -----------------------------------------------------------
>
> Key: ARROW-2249
> URL: https://issues.apache.org/jira/browse/ARROW-2249
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Java, Python
> Reporter: Uwe L. Korn
> Assignee: Uwe L. Korn
> Priority: Major
> Labels: beginner
> Fix For: 0.12.0
>
>
> Currently, all applications of Arrow seem to use the IPC capabilities to
> move data between a Java process and a Python process. While this is
> zero-serialization, it is not zero-copy. By taking the address and offset, we
> can already create Python buffers from Java buffers:
> https://github.com/apache/arrow/pull/1693. This is still a very low-level
> interface and we should provide the user with:
> * A guide on how to load the Apache Arrow Java libraries in Python (either
> through a fat-jar shipped with Arrow or by describing how users should
> integrate them into their own Java packaging)
> * {{pyarrow.Array.from_jvm}}, {{pyarrow.RecordBatch.from_jvm}}, … functions
> that take the respective Java objects and emit Python objects. These Python
> objects should also ensure that the underlying memory regions are kept alive
> as long as the Python objects exist.
> This issue can also be used as a tracker for the various sub-tasks that will
> need to be done to complete this rather large milestone.
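The address-based sharing and the keep-alive requirement described above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the pyarrow implementation: `ForeignBuffer` and `java_side` are hypothetical names, and a `ctypes` allocation stands in for memory that would really be owned by a Java buffer and reached through a JVM bridge.

```python
import ctypes


class ForeignBuffer:
    """Zero-copy view over memory that is owned by another object.

    Hypothetical helper for illustration; `owner` is stored only so that
    the underlying allocation outlives this wrapper.
    """

    def __init__(self, address, size, owner):
        self._owner = owner  # keep-alive reference, never read
        # from_address() wraps the existing memory; nothing is copied.
        self._array = (ctypes.c_ubyte * size).from_address(address)
        # cast("B") gives a plain byte view that supports indexing.
        self.view = memoryview(self._array).cast("B")


payload = b"arrow-data"

# Stand-in for a buffer allocated on the Java side (assumption).
java_side = ctypes.create_string_buffer(payload)
address = ctypes.addressof(java_side)

buf = ForeignBuffer(address, len(payload), owner=java_side)

# A write through the owner is visible through the view: same memory.
java_side[0] = b"A"
shared = bytes(buf.view[:5])

# Dropping our own reference is safe; the wrapper still holds the owner.
del java_side
still_valid = bytes(buf.view)
```

A `from_jvm`-style function would presumably follow the same pattern, with the Java object passed as the owner so that its memory is not released while the Python object exists.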
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)