[
https://issues.apache.org/jira/browse/ARROW-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17662223#comment-17662223
]
Rok Mihevc commented on ARROW-5200:
-----------------------------------
This issue has been migrated to [issue
#21675|https://github.com/apache/arrow/issues/21675] on GitHub. Please see the
[migration documentation|https://github.com/apache/arrow/issues/14542] for
further details.
> [Java] Provide light-weight arrow APIs
> --------------------------------------
>
> Key: ARROW-5200
> URL: https://issues.apache.org/jira/browse/ARROW-5200
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2019-04-23-15-19-34-187.png, safe_nocheck.jpg,
> unsafe.jpg
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> We are trying to incorporate Apache Arrow into the Apache Flink runtime. We
> find Arrow an amazing library, which greatly simplifies support for the
> columnar data format.
> However, in many scenarios we find the performance unacceptable. Our
> investigation shows that the cause is the large number of redundant checks
> and computations in the Arrow API.
> For example, the following figure shows that a single call to the
> Float8Vector.get(int) method (one of the most frequently used APIs in
> Flink computation) involves 20+ method invocations.
> !image-2019-04-23-15-19-34-187.png!
>
> Many other APIs have similar problems. We understand that these checks
> protect the integrity of the program. However, they also hurt performance
> severely. In our evaluation, access can be two or three orders of magnitude
> slower than accessing data in heap memory.
> We think that, at least in some scenarios, the responsibility for integrity
> checks can be handed to application owners. If they can be sure that all the
> checks would pass, we can provide them with light-weight APIs and the
> resulting high performance.
> The light-weight APIs would perform only minimal checks, or no checks at
> all. Application owners can still develop and debug their code using the
> original heavy-weight APIs. Once all bugs have been fixed, they can switch to
> the light-weight APIs in their products and enjoy the resulting high
> performance.
>
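
The split between heavy-weight and light-weight accessors proposed in the quoted description might look roughly like the sketch below. It is only an illustration under stated assumptions: Float8VectorSketch, getUnchecked, and the validity-bitmap layout are hypothetical names and choices made for this example. They are not Arrow's Float8Vector or any API the issue actually introduced.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Hypothetical stand-in for a fixed-width float8 vector.
 * This is NOT Arrow's Float8Vector; every name here is an assumption.
 */
final class Float8VectorSketch {
    private static final int TYPE_WIDTH = Double.BYTES;

    private final ByteBuffer values;  // off-heap value buffer
    private final byte[] validity;    // one validity bit per slot, 1 = set
    private final int valueCount;

    Float8VectorSketch(int valueCount) {
        this.valueCount = valueCount;
        this.values = ByteBuffer.allocateDirect(valueCount * TYPE_WIDTH)
                                .order(ByteOrder.LITTLE_ENDIAN);
        this.validity = new byte[(valueCount + 7) / 8];
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException(
                "index " + index + ", valueCount " + valueCount);
        }
    }

    /** Stores a value and marks the slot as non-null. */
    void set(int index, double value) {
        checkIndex(index);
        values.putDouble(index * TYPE_WIDTH, value);
        validity[index >> 3] |= (byte) (1 << (index & 7));
    }

    boolean isNull(int index) {
        checkIndex(index);
        return (validity[index >> 3] & (1 << (index & 7))) == 0;
    }

    /** Heavy-weight accessor: bounds check plus null check on every call. */
    double get(int index) {
        if (isNull(index)) {
            throw new IllegalStateException("value at index " + index + " is null");
        }
        return values.getDouble(index * TYPE_WIDTH);
    }

    /**
     * Light-weight accessor: skips the vector-level bounds and null checks.
     * The caller is responsible for passing a valid, non-null index.
     * (ByteBuffer still performs its own internal range check; a real
     * implementation would have to address that layer separately.)
     */
    double getUnchecked(int index) {
        return values.getDouble(index * TYPE_WIDTH);
    }

    public static void main(String[] args) {
        Float8VectorSketch vector = new Float8VectorSketch(4);
        vector.set(1, 2.5);
        System.out.println(vector.get(1));          // checked path
        System.out.println(vector.getUnchecked(1)); // unchecked path
    }
}
```

The intended workflow matches the description: develop and debug against get, then switch hot loops to getUnchecked once the calling code is known to pass only valid indexes.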