[ https://issues.apache.org/jira/browse/AVRO-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920620#action_12920620 ]
Stu Hood commented on AVRO-679:
-------------------------------
> That's what I meant by a schema transformation.
As far as I know, no schema transformation lets you bypass Avro's per-value
varint encoding and use group varint encoding instead. That is where I was
suggesting the encoding/decoding speed benefits of using multiple arrays
would come from.
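
For readers unfamiliar with the distinction, here is a minimal Python sketch of
group varint encoding as described in the slides cited in the issue. It is an
illustration only, not Avro code and not a proposed wire format: the function
names, the bit order of the slots inside the tag byte, the zero-padding of the
last group, and passing the element count out of band are all assumptions.
Avro's current encoding writes each int/long as its own zig-zag varint, so a
decoder has to test a continuation bit on every byte; here one tag byte gives
the byte lengths of four values up front.

```python
def group_varint_encode(values):
    """Encode unsigned 32-bit ints in groups of four: one tag byte, then payload."""
    out = bytearray()
    for i in range(0, len(values), 4):
        group = list(values[i:i + 4])
        group += [0] * (4 - len(group))  # assumption: pad the last group with zeros
        tag = 0
        payload = bytearray()
        for slot, v in enumerate(group):
            if not 0 <= v < 2 ** 32:
                raise ValueError("value out of unsigned 32-bit range: %r" % (v,))
            nbytes = max(1, (v.bit_length() + 7) // 8)  # 1..4 bytes per value
            tag |= (nbytes - 1) << (2 * slot)           # 2 bits of length per slot
            payload += v.to_bytes(nbytes, "little")
        out.append(tag)
        out += payload
    return bytes(out)


def group_varint_decode(data, count):
    """Decode `count` values; the count is assumed to be stored elsewhere."""
    values = []
    pos = 0
    while len(values) < count:
        tag = data[pos]
        pos += 1
        for slot in range(4):
            nbytes = ((tag >> (2 * slot)) & 0x3) + 1
            values.append(int.from_bytes(data[pos:pos + nbytes], "little"))
            pos += nbytes
    return values[:count]  # drop padding from the last group


values = [1, 300, 70000, 2 ** 31, 5]
encoded = group_varint_encode(values)
decoded = group_varint_decode(encoded, len(values))
```

The decoder branches once per group rather than once per byte, which is where
the decoding speed-up is expected to come from. Signed values would still need
something like zig-zag mapping first; that is left out here.
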
> Improved encodings for arrays
> -----------------------------
>
> Key: AVRO-679
> URL: https://issues.apache.org/jira/browse/AVRO-679
> Project: Avro
> Issue Type: New Feature
> Components: spec
> Reporter: Stu Hood
> Priority: Minor
>
> There are better ways to encode arrays of varints [1] which are faster to
> decode, and more space efficient than encoding varints independently.
> Extending the idea to other types of variable length data like 'bytes' and
> 'string', you could encode the entries for an array block as an array of
> lengths, followed by contiguous byte/utf8 data.
> [1] group varint encoding: slides 57-63 of
> http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/WSDM09-keynote.pdf
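
The block layout proposed in the description above (an array of lengths,
followed by contiguous byte/utf8 data) could look roughly like the Python sketch
below. This is a hypothetical illustration, not part of the Avro spec: the
function names are invented, and the count and lengths are written as plain
unsigned varints for brevity, whereas Avro itself writes lengths as zig-zag
longs. The lengths could just as well be group varint encoded.

```python
def write_uvarint(n):
    """Plain unsigned varint: 7 bits per byte, high bit marks continuation."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def read_uvarint(data, pos):
    """Return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def encode_string_block(strings):
    """Item count, then all lengths, then all UTF-8 data back to back."""
    chunks = [s.encode("utf-8") for s in strings]
    out = bytearray(write_uvarint(len(chunks)))
    for chunk in chunks:
        out += write_uvarint(len(chunk))
    for chunk in chunks:
        out += chunk
    return bytes(out)


def decode_string_block(data):
    count, pos = read_uvarint(data, 0)
    lengths = []
    for _ in range(count):
        length, pos = read_uvarint(data, pos)
        lengths.append(length)
    strings = []
    for length in lengths:
        strings.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    return strings


block = encode_string_block(["a", "héllo", ""])
roundtrip = decode_string_block(block)
```

Keeping the lengths together means they can be decoded in one tight loop, and
the string data can then be sliced out of a single contiguous buffer.
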