[
https://issues.apache.org/jira/browse/AVRO-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874777#action_12874777
]
Kevin Oliver commented on AVRO-557:
-----------------------------------
We do a fair amount of one-time use of BinaryDecoders and
GenericDatumReaders. When we upgraded to Avro 1.3 we saw a significant
performance regression in decoding. A profiler showed the issue pretty quickly.
Basically, it boiled down to two issues:
1) Having GenericDatumReaders always create the ResolvingDecoder is too
expensive for one-time usage.
2) BinaryDecoders now create a bunch of arrays and have become more
complicated, again significantly slowing down one-time usage.
I'm attaching a patch that has a somewhat hacky workaround. I've resurrected
the BinaryDecoder code from v1.2 (more or less). I've also created a
GenericDatumReaderWithOptionalResolver class that basically forks
GenericDatumReader to allow for reading directly from the supplied decoder.
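
To make the idea concrete, here is a minimal sketch of the two shortcuts, not
the code from the patch. Every name in it (OneTimeReadSketch, Resolver,
readOnce, readLong) is hypothetical and none of it is Avro API. It assumes the
writer and reader schemas can be compared cheaply, stands in a counter for the
expensive resolver construction, and decodes a single zig-zag varint long
straight from the caller's buffer without allocating any intermediate arrays.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; these classes are not part of Avro. */
final class OneTimeReadSketch {

    /** Stand-in for expensive per-reader setup such as building a resolver. */
    static final class Resolver {
        static int builds = 0;

        Resolver(String writerSchema, String readerSchema) {
            builds++; // a real resolver would do its costly work here
        }
    }

    /** Decodes one zig-zag varint long directly from the supplied buffer. */
    static long readLong(ByteBuffer in) {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.get() & 0xFF; // throws BufferUnderflowException at end of input
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // high bit clear: last byte of this value
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalArgumentException("varint too long");
            }
        }
        // Undo the zig-zag mapping: 0, 1, 2, 3 ... -> 0, -1, 1, -2 ...
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Pays for a resolver only when the two schemas actually differ. */
    static long readOnce(String writerSchema, String readerSchema, ByteBuffer in) {
        if (!writerSchema.equals(readerSchema)) {
            new Resolver(writerSchema, readerSchema); // resolving path, details omitted
        }
        return readLong(in);
    }
}
```

With identical schemas, readOnce never constructs a Resolver, which is the
saving a one-time caller is after; the resolving path is still there when the
schemas differ.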
Running the newly added 'Perf -GoneTimeUse' shows the stark difference, with
the 1.3 code path taking roughly 6x as long as the resurrected 1.2 path:
GenericReaderOneTimeUsage12Test: 2175 ms, 1.9147720770945649 million
entries/sec. 0.008961491780473783 million bytes/sec
GenericReaderOneTimeUsage13Test: 13152 ms, 0.3167766318368232 million
entries/sec. 0.0014825739399539307 million bytes/sec
I don't believe we should commit the patch as is, but I'd like some feedback on
how to proceed from here to get this performance back.
> Speed up one-time data decoding
> -------------------------------
>
> Key: AVRO-557
> URL: https://issues.apache.org/jira/browse/AVRO-557
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.3.2
> Reporter: Kevin Oliver
> Assignee: Kevin Oliver
> Fix For: 1.4.0
>
>
> There are big gains to be had in performance when using a BinaryDecoder and a
> GenericDatumReader just one time. This is due to the relatively expensive
> parsing and initialization that came with 1.3. Patch with example code and a
> Perf harness to follow.