[ https://issues.apache.org/jira/browse/ARROW-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-2249:
--------------------------------
    Fix Version/s:     (was: 0.10.0)
                   0.11.0

> [Java/Python] in-process vector sharing from Java to Python
> -----------------------------------------------------------
>
>                 Key: ARROW-2249
>                 URL: https://issues.apache.org/jira/browse/ARROW-2249
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java - Vectors, Python
>            Reporter: Uwe L. Korn
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: beginner
>             Fix For: 0.11.0
>
>
> Currently, all applications of Arrow seem to use the IPC capabilities to
> move data between a Java process and a Python process. While this requires
> no serialization, it is not zero-copy. By taking the address and offset, we
> can already create Python buffers from Java buffers:
> https://github.com/apache/arrow/pull/1693. This is still a very low-level
> interface, and we should provide the user with:
> * A guide on how to load the Apache Arrow Java libraries in Python (either
> through a fat-jar that is shipped with Arrow, or by integrating them into
> the user's own Java packaging)
> * {{pyarrow.Array.from_jvm}}, {{pyarrow.RecordBatch.from_jvm}}, … functions
> that take the respective Java objects and emit Python objects. These Python
> objects should also ensure that the underlying memory regions are kept alive
> as long as the Python objects exist.
> This issue can also be used as a tracker for the various sub-tasks that
> need to be done to complete this rather large milestone.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
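
The "address and offset" idea in the description, together with the keep-alive requirement, can be illustrated without a JVM. The following is only a minimal sketch using the Python standard library: `ForeignBuffer` is a hypothetical name and not part of pyarrow, and the ctypes array `owner` merely stands in for memory that would be owned by a Java buffer. The view aliases the existing memory rather than copying it, and it holds a reference to the owner so the region stays alive for as long as the view exists.

```python
import ctypes


class ForeignBuffer:
    """Hypothetical zero-copy view over memory owned by another object.

    Covers `size` bytes starting at `address + offset`. The `owner`
    reference is stored so the underlying memory is not released while
    this view is alive.
    """

    def __init__(self, address, size, offset=0, owner=None):
        self._owner = owner  # keep-alive reference to the memory owner
        # from_address aliases the existing memory; nothing is copied.
        self._array = (ctypes.c_ubyte * size).from_address(address + offset)

    def view(self):
        # Expose the region through the buffer protocol as plain bytes.
        return memoryview(self._array).cast("B")


# Stand-in for a buffer allocated on the Java side (assumption: in the
# real feature the address would come from the Java buffer object).
owner = (ctypes.c_ubyte * 8)(*range(8))
buf = ForeignBuffer(ctypes.addressof(owner), size=4, offset=2, owner=owner)
view = buf.view()

owner[3] = 99                # a write on the owner side ...
shared_value = view[1]       # ... is visible through the view, no copy made
```

In this sketch the keep-alive is simply an attribute on the wrapper; the functions proposed in the issue would need an equivalent reference from the Python object back to the Java object that owns the memory.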