[ 
https://issues.apache.org/jira/browse/AVRO-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991257#comment-12991257
 ] 

Stu Hood commented on AVRO-679:
-------------------------------

I'm having trouble getting an implementation of group varint encoding to be 
significantly (or at all) faster than Avro's varint encoding, so this ticket 
is probably invalid.
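
For readers unfamiliar with the technique, here is a minimal sketch of group 
varint encoding as described in the slides linked from the ticket: four 
32-bit values share one tag byte holding four 2-bit length fields, followed 
by 1 to 4 data bytes per value. The class name GroupVarIntSketch, the bit 
order inside the tag, the little-endian byte order, and the requirement that 
the input length be a multiple of four are all assumptions of this sketch. 
It is not the code from the gvi repository and not part of Avro.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; names and layout details are assumptions.
public class GroupVarIntSketch {

    /** Number of bytes (1 to 4) needed to hold v as an unsigned 32-bit value. */
    static int byteLength(int v) {
        if ((v & 0xFFFFFF00) == 0) return 1;
        if ((v & 0xFFFF0000) == 0) return 2;
        if ((v & 0xFF000000) == 0) return 3;
        return 4;
    }

    /** Encodes values in groups of four: one tag byte, then the data bytes. */
    static byte[] encode(int[] values) {
        if (values.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < values.length; i += 4) {
            // Assumed tag layout: value j stores (byteLength - 1) in bits 2j..2j+1.
            int tag = 0;
            for (int j = 0; j < 4; j++) {
                tag |= (byteLength(values[i + j]) - 1) << (2 * j);
            }
            out.write(tag);
            for (int j = 0; j < 4; j++) {
                int v = values[i + j];
                int n = byteLength(v);
                for (int k = 0; k < n; k++) {
                    out.write((v >>> (8 * k)) & 0xFF); // little-endian (assumed)
                }
            }
        }
        return out.toByteArray();
    }

    /** Decodes count values (a multiple of 4) produced by encode. */
    static int[] decode(byte[] in, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i += 4) {
            int tag = in[pos++] & 0xFF;
            for (int j = 0; j < 4; j++) {
                int n = ((tag >>> (2 * j)) & 0x3) + 1;
                int v = 0;
                for (int k = 0; k < n; k++) {
                    v |= (in[pos++] & 0xFF) << (8 * k);
                }
                values[i + j] = v;
            }
        }
        return values;
    }
}
```

The decoder reads each value's length from the tag instead of testing a 
continuation bit on every byte, which is where the expected speedup over 
per-value varints would come from.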

As a side note, I'm curious what I might be doing wrong in my BinaryEncoder 
setup that makes it so expensive: see https://github.com/stuhood/gvi#readme 
and the benchmark code at 
https://github.com/stuhood/gvi/blob/master/src/test/java/net/hoodidge/gvi/GroupVarIntBenchmark.java#L53

> Improved encodings for arrays
> -----------------------------
>
>                 Key: AVRO-679
>                 URL: https://issues.apache.org/jira/browse/AVRO-679
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Stu Hood
>            Priority: Minor
>
> There are better ways to encode arrays of varints [1] which are faster to 
> decode, and more space efficient than encoding varints independently.
> Extending the idea to other types of variable length data like 'bytes' and 
> 'string', you could encode the entries for an array block as an array of 
> lengths, followed by contiguous byte/utf8 data.
> [1] group varint encoding: slides 57-63 of 
> http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/WSDM09-keynote.pdf
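
The layout proposed above for 'bytes' and 'string' arrays can be sketched 
roughly as follows: all lengths for a block are written first, then the 
item data is written contiguously. This is only an illustration. The class 
name LengthPrefixedBlockSketch is hypothetical, and the item count and 
lengths are written as fixed 4-byte ints purely to keep the example short; 
the ticket suggests a compact integer encoding for the lengths instead. 
None of this is an Avro API or part of the Avro spec.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not an Avro wire format.
public class LengthPrefixedBlockSketch {

    /** Writes: item count, then every length, then all UTF-8 data back to back. */
    static byte[] encode(String[] items) {
        byte[][] data = new byte[items.length][];
        int total = 0;
        for (int i = 0; i < items.length; i++) {
            data[i] = items[i].getBytes(StandardCharsets.UTF_8);
            total += data[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * items.length + total);
        buf.putInt(items.length);
        for (byte[] d : data) buf.putInt(d.length); // array of lengths
        for (byte[] d : data) buf.put(d);           // contiguous item data
        return buf.array();
    }

    /** Reads the lengths first, then slices the items out of the data region. */
    static String[] decode(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        int count = buf.getInt();
        int[] lengths = new int[count];
        for (int i = 0; i < count; i++) lengths[i] = buf.getInt();
        String[] items = new String[count];
        int pos = buf.position();
        for (int i = 0; i < count; i++) {
            items[i] = new String(block, pos, lengths[i], StandardCharsets.UTF_8);
            pos += lengths[i];
        }
        return items;
    }
}
```

Because the lengths sit together at the front, a reader can compute every 
item's offset without touching the data region.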

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
