[ https://issues.apache.org/jira/browse/ARROW-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Micah Kornfield updated ARROW-6593:
-----------------------------------
    Description:
It has been posited that the Decoder object (and on-heap work in general) is potentially slow for decoding.

The scope of this Jira is to add a new method that, instead of consuming from a Decoder, consumes directly from a ByteBuffer. To do this, there need to be utility classes for zig-zag decoding from a ByteBuffer (one might already exist in Avro).

This is essentially rewriting the logic in the Decoder to work directly against a ByteBuffer and then measuring whether there is a meaningful performance impact.

  was:
Users should be able to pass in a set of fields they wish to decode from Avro, and the converter should avoid creating Vectors in the returned ArrowSchemaRoot. This would ideally support nested columns, so if there was:

Struct A {
    int B;
    int C;
}

the user could choose to read only A.B, only A.C, or both.

> [Java] Experiment with performance difference of avoiding the use of Avro Decoder
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-6593
>                 URL: https://issues.apache.org/jira/browse/ARROW-6593
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: Java
>            Reporter: Micah Kornfield
>            Priority: Major
>              Labels: avro
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
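
For illustration, the zig-zag decoding utility mentioned in the description might look roughly like the sketch below. It is not code from Arrow or Avro: the class name `ZigZagByteBufferDecoder` and its methods are hypothetical, and it assumes only that Avro writes int and long values as variable-length zig-zag integers, as the Avro specification describes. It reads straight from a `java.nio.ByteBuffer`, advancing the buffer's position.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not an Arrow or Avro class): decodes Avro-style
// variable-length zig-zag integers directly from a ByteBuffer.
public class ZigZagByteBufferDecoder {

  // Reads one zig-zag varint of at most 5 bytes and returns it as an int.
  public static int readInt(ByteBuffer buf) {
    int raw = 0;
    for (int shift = 0; shift < 35; shift += 7) {
      int b = buf.get() & 0xff;          // next byte, treated as unsigned
      raw |= (b & 0x7f) << shift;        // low 7 bits carry payload
      if ((b & 0x80) == 0) {             // high bit clear: last byte
        return (raw >>> 1) ^ -(raw & 1); // undo zig-zag mapping
      }
    }
    throw new IllegalStateException("Varint too long for a 32-bit int");
  }

  // Reads one zig-zag varint of at most 10 bytes and returns it as a long.
  public static long readLong(ByteBuffer buf) {
    long raw = 0L;
    for (int shift = 0; shift < 70; shift += 7) {
      long b = buf.get() & 0xffL;
      raw |= (b & 0x7fL) << shift;
      if ((b & 0x80L) == 0) {
        return (raw >>> 1) ^ -(raw & 1L);
      }
    }
    throw new IllegalStateException("Varint too long for a 64-bit long");
  }

  public static void main(String[] args) {
    // Bytes 0x80 0x01 are the zig-zag varint encoding of 64.
    ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) 0x80, 0x01});
    System.out.println(readInt(buf));
  }
}
```

The same methods work with a direct (off-heap) buffer obtained from `ByteBuffer.allocateDirect`, which is the case the experiment is interested in. Whether this is actually faster than going through the Decoder is exactly what the issue proposes to measure.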