[ https://issues.apache.org/jira/browse/ARROW-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838260#comment-16838260 ]
Jacques Nadeau edited comment on ARROW-5224 at 5/13/19 5:16 AM:
----------------------------------------------------------------
What is the major downside of wrapping in a batch? It seems like we should
probably just do that and not introduce new APIs & protocols.
was (Author: jnadeau):
What is the major downside of wrapping in a batch? It seems like we should
probably just do that and not introduce new APIs.
> [Java] Add APIs for supporting directly serialize/deserialize ValueVector
> -------------------------------------------------------------------------
>
> Key: ARROW-5224
> URL: https://issues.apache.org/jira/browse/ARROW-5224
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Ji Liu
> Assignee: Ji Liu
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> There is no API to serialize or deserialize a ValueVector directly. The only
> way to do so today is to put a single FieldVector in a VectorSchemaRoot and
> convert it to an ArrowRecordBatch; deserialization works the same way.
> Providing a utility class for this may be better. I know all serialization
> should follow the IPC format so that data can be shared between different
> Arrow implementations, but for users who only use the Java API and want to do
> further optimization, this should not be a problem, and we could give them
> one more option.
> This may benefit Java users who work only with ValueVector rather than the
> IPC classes such as ArrowRecordBatch:
> * We could apply shuffle optimizations, such as compression and encoding
> algorithms for numerical types, which could greatly improve performance.
> * We could serialize/deserialize using the actual size of the data within the
> vector, since the allocated buffer size is a power of 2 and is often larger
> than needed.
> * We could reduce data conversions (VectorSchemaRoot, ArrowRecordBatch, etc.)
> to make the API more user-friendly.
>
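To make the buffer-size point in the description concrete, here is a rough sketch of the arithmetic in plain Java. It assumes, as the description states, that a buffer's capacity is rounded up to a power of two. The class and method names are hypothetical helpers for illustration and are not part of the Arrow API; the real allocator may behave differently.

```java
// Illustration only: hypothetical helpers, not Arrow API.
final class BufferSizeSketch {

    /** Rounds a byte count up to the next power of two (assumed allocation policy). */
    static long nextPowerOfTwo(long bytes) {
        if (bytes <= 1) {
            return 1;
        }
        // highestOneBit(bytes - 1) is the largest power of two strictly below
        // 'bytes' (or half of it when 'bytes' is itself a power of two).
        return Long.highestOneBit(bytes - 1) << 1;
    }

    /** Bytes actually occupied by the data buffer of a fixed-width vector. */
    static long dataBytes(int valueCount, int typeWidthBytes) {
        return (long) valueCount * typeWidthBytes;
    }

    /** Bytes occupied by a validity bitmap: one bit per value, rounded up to whole bytes. */
    static long validityBytes(int valueCount) {
        return (valueCount + 7L) / 8;
    }

    public static void main(String[] args) {
        int valueCount = 1025;                     // number of 4-byte int values
        long used = dataBytes(valueCount, 4);      // 4100 bytes of real data
        long allocated = nextPowerOfTwo(used);     // 8192 bytes under the assumption
        System.out.println("data bytes used:      " + used);
        System.out.println("data bytes allocated: " + allocated);
        System.out.println("padding not needed on the wire: " + (allocated - used));
    }
}
```

With 1025 int values, the data occupies 4100 bytes while a power-of-two capacity would be 8192 bytes, so writing only the occupied bytes would skip 4092 bytes of padding for that one buffer.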
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)