I believe the encoding you describe here is the same as PrefixVarint. There is 
more discussion in this thread from when WebAssembly was evaluating which 
varint scheme to use:

https://github.com/WebAssembly/design/issues/601

Our existing varint encoding wins on simplicity, and WebAssembly chose it 
even though they were aware of PrefixVarint. That said, there may be cases 
where PrefixVarint would have given a noticeable speed boost.
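
For reference, the existing decode loop is roughly the following. This is a 
minimal C++ sketch of the format, not the actual protobuf implementation, and 
it omits the overflow and length checks a real decoder needs:

#include <stdint.h>

// Decode one varint in the existing format: the high bit of each byte says
// whether another byte follows, so the loop tests one byte at a time.
// A real decoder would also cap the loop at 10 bytes for a 64-bit value.
uint64_t DecodeVarint(const uint8_t* p, const uint8_t** end) {
  uint64_t result = 0;
  int shift = 0;
  while (true) {
    uint8_t byte = *p++;
    result |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) break;  // MSB clear: last byte of this varint
    shift += 7;
  }
  *end = p;
  return result;
}

That one loop is essentially the whole format, which is what "wins on 
simplicity" means here.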

On Tuesday, January 2, 2018 at 12:54:59 PM UTC-8, 
matthew.martin...@gmail.com wrote:
>
> What is the rationale behind the current variable-width integer encoding?
>
> As I understand it, an integer is terminated by a byte whose 
> most-significant bit is zero.  Thus, bytes must be read one at a 
> time, and this condition must be checked after reading each one to 
> determine whether to read another.  Why was this encoding chosen over a 
> variable-width encoding that would require at most two reads -- that is, an 
> encoding that specifies the number of subsequent bytes to read in the first 
> byte?
>
> No, I don't mean for the first byte's value to be the length of the rest 
> of the integer.  Rather, the number of leading ones in the first byte could 
> be the number of following bytes.  This would still allow 7 bits of a value 
> to be stored per byte, with the added bonus of a full 64-bit value being 
> encoded in 9 bytes instead of 10.
>
> Examples:
>
> 0 leading ones followed by a terminating zero and then 7 bits:
>
> 0b0.......
>
> 1 leading one followed by a terminating zero, then 6 bits, and then 1 byte:
>
> 0b10...... ........
>
> 7 leading ones followed by a terminating zero and then 7 bytes:
>
> 0b11111110 ........ ........ ........ ........ ........ ........ ........
>
> 8 leading ones followed by 8 bytes:
>
> 0b11111111 ........ ........ ........ ........ ........ ........ ........ 
> ........
>
> So, such an encoding is clearly possible.  Why does Protocol Buffers use 
> something different?  Is this to provide some level of protection against 
> dropped bytes?  Or has all of the data already been read into a buffer by 
> the time it is decoded, so that reducing the number of reads would not 
> provide much of a speed boost?
>
>
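
For concreteness, here is roughly how I read the leading-ones scheme described 
above, again as a C++ sketch. The continuation-byte order and the use of 
GCC/Clang's __builtin_clz are my own arbitrary choices, not something taken 
from protobuf or the WebAssembly discussion:

#include <stdint.h>

// Decode one integer in the proposed PrefixVarint-style format: the number of
// leading ones in the first byte gives the number of continuation bytes, so
// the length is known up front instead of being discovered one byte at a time.
uint64_t DecodePrefixVarint(const uint8_t* p, const uint8_t** end) {
  uint8_t first = *p++;
  int extra;
  if (first == 0xFF) {
    extra = 8;  // eight leading ones: no payload bits here, 8 bytes follow
  } else {
    // Count leading ones by counting leading zeros of the inverted byte.
    extra = __builtin_clz(static_cast<unsigned>(first ^ 0xFF)) - 24;
  }
  // Payload bits remaining in the first byte, below the terminating zero.
  uint64_t result = (extra == 8) ? 0 : (first & (0x7Fu >> extra));
  for (int i = 0; i < extra; ++i) {
    // In practice this would be one unaligned load plus a mask, not a loop.
    result = (result << 8) | *p++;
  }
  *end = p;
  return result;
}

The point the question is after is that the branch on length happens once, up 
front, rather than once per byte.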
