[
https://issues.apache.org/jira/browse/ARROW-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303899#comment-17303899
]
Liya Fan commented on ARROW-9513:
---------------------------------
[~benmosher] Thanks for your feedback. Your comments about {{allocator}} make
sense to me. Do you want to provide a PR to improve the documentation?
> [Java] Improve documentation in regards to basic-usage / memory-management
> --------------------------------------------------------------------------
>
> Key: ARROW-9513
> URL: https://issues.apache.org/jira/browse/ARROW-9513
> Project: Apache Arrow
> Issue Type: Wish
> Components: Documentation, Java
> Affects Versions: 0.17.1
> Reporter: sascha schnug
> Priority: Minor
>
> I'm experimenting with Arrow using Java, C++ and Python + IPC format
> (Bytestream, File) and Parquet. I am struggling a lot on the Java side, even
> after looking for external resources and reading some code in the
> dev repository.
>
> Judging by the state of the documentation, there is a strong bias towards
> C++ and Python, which is not surprising. The Java part, however, is hard to
> work with (at least for me; it is possible that I'm the problem, though).
> Sadly, the Java interface is also the one that diverges most from what
> people would usually do in Java.
>
> I am aware of the user-guide-like documentation in
> [repo/java|https://github.com/apache/arrow/tree/master/java#getting-started]
> (I don't think this is referenced in the docs; it might only be referenced
> by the Java part of the repository, so it looks "hidden", as the Java link
> in the docs points to Javadoc-based content; known issue: ARROW-9364) and
> its warnings about VectorSchemaRoot being special and temporary. I have
> also read [this external
> article|https://www.infoq.com/articles/apache-arrow-java],
> which also talks about manual memory management. Still, I'm struggling with
> a very simple use case:
>
> - create and fill VectorSchemaRoot
> - write VectorSchemaRoot in IPC format to disk
> - read VectorSchemaRoot from IPC format from disk
> - INTO some out of scope object not owned by the reader!
>
> I won't put example code here, but refer to my StackOverflow question,
> which shows my problem:
> [StackOverflow|https://stackoverflow.com/q/62938237/2320035]
>
> Something about memory-ownership is not working as expected for me.
>
> No matter which tests (dev repo) or articles (e.g. the second link above) I
> read, their examples did not help me here, as they all *process* the
> data *within the reader scope* (mostly a simple element-wise check),
> while I want to read into some *global* object that outlives the
> reader object (see my code on SO or the second link: printing out the read
> data works as long as the reader is open).
>
> The article above also says:
>
> {code:java}
> A vector is managed by one allocator. We say that the allocator owns the
> buffer backing the vector. Vector ownership can be transferred from one
> allocator to another.
> {code}
>
> But how exactly would I populate an empty VectorSchemaRoot (of my class)
> with whatever I read in, so that it survives closing the reader? I
> experimented with VectorLoader and VectorUnloader, including the only call
> I found that has "ownership" in its docstring (batch.cloneWithTransfer),
> but without success. And even if it worked, the Java-based RecordBatch
> ([link|https://arrow.apache.org/docs/java/]), which would be the one to use
> for this, looks completely different from Python's
> ([link|https://arrow.apache.org/docs/python/generated/pyarrow.RecordBatch.htm]).
>
>
> Should I be able to see my problem given the documentation? Is there
> anything else to read? (I know that there must be something in this regard
> within some Flight / Gandiva project code, but I have not found it yet.)
>
> Or would it be completely wrong to keep VectorSchemaRoot as the core object
> to handle all my data?
>
> Feel free to close this issue if you think that the documentation is *not*
> incomplete.
>
> Thanks,
> Sascha
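
The ownership rule quoted in the issue (a buffer is owned by one allocator,
and ownership can be transferred) can be illustrated with a small toy model
in plain Java. This is only a sketch under stated assumptions: ToyAllocator,
ToyBuffer and OwnershipSketch are hypothetical names invented for this
illustration and are not Arrow classes. The model only mimics the accounting
idea, namely that data read inside a reader scope survives the end of that
scope if its ownership is moved to a longer-lived allocator first.

```java
import java.util.Arrays;

/** Toy allocator (hypothetical, not Arrow's): tracks bytes charged to it. */
final class ToyAllocator implements AutoCloseable {
    private final String name;
    private long allocatedBytes;

    ToyAllocator(String name) { this.name = name; }

    /** Creates a buffer owned by this allocator and charges its size. */
    ToyBuffer allocate(int[] values) {
        ToyBuffer buffer = new ToyBuffer(this, values.clone());
        charge(buffer.sizeInBytes());
        return buffer;
    }

    void charge(long bytes) { allocatedBytes += bytes; }
    void release(long bytes) { allocatedBytes -= bytes; }
    long allocatedBytes() { return allocatedBytes; }

    /** Fails if memory is still outstanding, as a stand-in for leak checks. */
    @Override
    public void close() {
        if (allocatedBytes != 0) {
            throw new IllegalStateException(
                name + " closed with " + allocatedBytes + " bytes outstanding");
        }
    }
}

/** Toy buffer (hypothetical): owned by exactly one allocator at a time. */
final class ToyBuffer implements AutoCloseable {
    private final ToyAllocator owner;
    private int[] data;

    ToyBuffer(ToyAllocator owner, int[] data) {
        this.owner = owner;
        this.data = data;
    }

    long sizeInBytes() { return data == null ? 0L : 4L * data.length; }

    int[] values() {
        if (data == null) throw new IllegalStateException("buffer released");
        return data.clone();
    }

    /** Moves the data and its accounting to target; this buffer is emptied. */
    ToyBuffer transferTo(ToyAllocator target) {
        if (data == null) throw new IllegalStateException("buffer released");
        long size = sizeInBytes();
        ToyBuffer moved = new ToyBuffer(target, data);
        target.charge(size);
        owner.release(size);
        data = null; // the source no longer owns anything
        return moved;
    }

    /** Releases the memory if this buffer still owns it; otherwise a no-op. */
    @Override
    public void close() {
        if (data != null) {
            owner.release(sizeInBytes());
            data = null;
        }
    }
}

final class OwnershipSketch {
    /** Simulates a reader scope whose result must outlive the reader. */
    static ToyBuffer readInto(ToyAllocator longLived) {
        try (ToyAllocator readerAllocator = new ToyAllocator("reader");
             ToyBuffer scratch = readerAllocator.allocate(new int[] {1, 2, 3})) {
            // Transfer before the reader scope closes its resources.
            return scratch.transferTo(longLived);
        }
    }

    public static void main(String[] args) {
        try (ToyAllocator root = new ToyAllocator("root");
             ToyBuffer kept = readInto(root)) {
            System.out.println(Arrays.toString(kept.values())); // [1, 2, 3]
        }
    }
}
```

In this toy model, returning scratch directly instead of the transferred
buffer would hand back an emptied buffer, which resembles the symptom
described in the issue. Arrow's Java library exposes related operations such
as TransferPair, VectorUnloader and VectorLoader; how exactly they map onto
this sketch is not covered here and should be checked against the Arrow
documentation.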