I unrolled the loop and reordered the bounds checking on the varint
"_VarUInt64ByteSizeNoTag"  function which gets called quite a bit within
wire_format.py and got some interesting speedups.

What surprised me is that my first attempt to speed this function up was by
using the actual mathematical formula for how large a varint will be
encoded, using log base 2 and knowing that it gets spread out across 7-bit
pieces.
something like this:
  if uint64 > UINT64_MAX:
    raise ValueError
  if uint64 < 1:
    uint64 = 1
  uint64 += 1
  size = math.ceil(math.log(uint64,  2) / 7)
  return size

This code has a consistent running time for every value sent in to it.  On
my simple hardware for encoded values under 2^28, this code is SLOWER than
the looping version (loop only goes around 4 times).  But for values larger
than 2^28, this logarithm method was up to 2x faster!  But that's the
opposite performance we want, since varints should be used less frequently
than fixed(32|64) for values greater than 2^28.

So instead, I unrolled the loop and moved the range bounds checking to a
more sensible place.  I then compared the runtimes using the timeit module
and found it's from 2x-3x faster than the old method.  I didn't really
expect that kind of speedup from just unrolling the loop, so I would
appreciate some feedback.  Given how often this code is called

I've only tested this with python 2.5.2 on linux (fedora10), with -O and
without -O on the cmdline.  I also ran the "python/setup.py test" tests and
got an "OK" for the 145 tests.  I haven't tested yet on a 64bit box, or
other versions of python, so perhaps others could try it out.  I've attached
the test code (named size_calc.py which is standalone code), and also the
patch, (named varintsizenotag.patch, taken from svn diff at trunk/../) which
has the unrolled loop (ATTACHED.)

Here are the measurements I got (output follows, also attached as
speedtest.txt):
---------------------------------------------------------------------------------------------------------------------
Testing speed using 4000000 iterations
Value 18446744073709551615 uses (old:10 == new:10) bytes to encode
(UINT64_MAX)
  Old method takes: 35.3816759586     New method takes: 10.5258858204
seconds
> New method is 3.4x the old method speed.  New method is FASTER.

Value 18446744073709551614 uses (old:10 == new:10) bytes to encode
(UINT64_MAX-1)
  Old method takes: 34.9271941185     New method takes: 10.4822030067
seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Value 0 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is -1)
  Old method takes: 4.06476712227     New method takes: 2.08561015129
seconds
> New method is 1.9x the old method speed.  New method is FASTER.

Value 1 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is 0)
  Old method takes: 4.05545282364     New method takes: 2.09449100494
seconds
> New method is 1.9x the old method speed.  New method is FASTER.

Value 2 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is 1)
  Old method takes: 4.31162905693     New method takes: 2.09951090813
seconds
> New method is 2.1x the old method speed.  New method is FASTER.

Value 127 uses (old:1 == new:1) bytes to encode (bit-shift is 7, offset is
-1)
  Old method takes: 4.07322001457     New method takes: 1.99916601181
seconds
> New method is 2.0x the old method speed.  New method is FASTER.

Value 128 uses (old:2 == new:2) bytes to encode (bit-shift is 7, offset is
0)
  Old method takes: 5.91175985336     New method takes: 2.56322216988
seconds
> New method is 2.3x the old method speed.  New method is FASTER.

Value 129 uses (old:2 == new:2) bytes to encode (bit-shift is 7, offset is
1)
  Old method takes: 5.71182394028     New method takes: 2.66633605957
seconds
> New method is 2.1x the old method speed.  New method is FASTER.

Value 16383 uses (old:2 == new:2) bytes to encode (bit-shift is 14, offset
is -1)
  Old method takes: 5.86016702652     New method takes: 2.5270512104 seconds
> New method is 2.3x the old method speed.  New method is FASTER.

Value 16384 uses (old:3 == new:3) bytes to encode (bit-shift is 14, offset
is 0)
  Old method takes: 7.12100481987     New method takes: 2.98018193245
seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 16385 uses (old:3 == new:3) bytes to encode (bit-shift is 14, offset
is 1)
  Old method takes: 7.09731388092     New method takes: 2.95949792862
seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 2097151 uses (old:3 == new:3) bytes to encode (bit-shift is 21, offset
is -1)
  Old method takes: 7.49546098709     New method takes: 2.94564318657
seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 2097152 uses (old:4 == new:4) bytes to encode (bit-shift is 21, offset
is 0)
  Old method takes: 9.02288794518     New method takes: 3.25170016289
seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 2097153 uses (old:4 == new:4) bytes to encode (bit-shift is 21, offset
is 1)
  Old method takes: 8.94438004494     New method takes: 3.33856105804
seconds
> New method is 2.7x the old method speed.  New method is FASTER.

Value 268435455 uses (old:4 == new:4) bytes to encode (bit-shift is 28,
offset is -1)
  Old method takes: 9.18178391457     New method takes: 3.2392680645 seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 268435456 uses (old:5 == new:5) bytes to encode (bit-shift is 28,
offset is 0)
  Old method takes: 11.2249140739     New method takes: 4.54431605339
seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 268435457 uses (old:5 == new:5) bytes to encode (bit-shift is 28,
offset is 1)
  Old method takes: 10.726238966     New method takes: 4.54060006142 seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 34359738367 uses (old:5 == new:5) bytes to encode (bit-shift is 35,
offset is -1)
  Old method takes: 18.0354709625     New method takes: 7.20030999184
seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 34359738368 uses (old:6 == new:6) bytes to encode (bit-shift is 35,
offset is 0)
  Old method takes: 21.0656189919     New method takes: 8.30025315285
seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 34359738369 uses (old:6 == new:6) bytes to encode (bit-shift is 35,
offset is 1)
  Old method takes: 21.5061860085     New method takes: 8.09376716614
seconds
> New method is 2.7x the old method speed.  New method is FASTER.

Value 4398046511103 uses (old:6 == new:6) bytes to encode (bit-shift is 42,
offset is -1)
  Old method takes: 21.0156860352     New method takes: 8.32196497917
seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 4398046511104 uses (old:7 == new:7) bytes to encode (bit-shift is 42,
offset is 0)
  Old method takes: 24.8848249912     New method takes: 8.75971508026
seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 4398046511105 uses (old:7 == new:7) bytes to encode (bit-shift is 42,
offset is 1)
  Old method takes: 25.0450389385     New method takes: 8.79610991478
seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 562949953421311 uses (old:7 == new:7) bytes to encode (bit-shift is
49, offset is -1)
  Old method takes: 24.9431898594     New method takes: 8.53739500046
seconds
> New method is 2.9x the old method speed.  New method is FASTER.

Value 562949953421312 uses (old:8 == new:8) bytes to encode (bit-shift is
49, offset is 0)
  Old method takes: 27.9076519012     New method takes: 9.38700890541
seconds
> New method is 3.0x the old method speed.  New method is FASTER.

Value 562949953421313 uses (old:8 == new:8) bytes to encode (bit-shift is
49, offset is 1)
  Old method takes: 28.1347260475     New method takes: 9.65255093575
seconds
> New method is 2.9x the old method speed.  New method is FASTER.

Value 72057594037927935 uses (old:8 == new:8) bytes to encode (bit-shift is
56, offset is -1)
  Old method takes: 28.4428110123     New method takes: 9.37137103081
seconds
> New method is 3.0x the old method speed.  New method is FASTER.

Value 72057594037927936 uses (old:9 == new:9) bytes to encode (bit-shift is
56, offset is 0)
  Old method takes: 31.4207801819     New method takes: 10.2984797955
seconds
> New method is 3.1x the old method speed.  New method is FASTER.

Value 72057594037927937 uses (old:9 == new:9) bytes to encode (bit-shift is
56, offset is 1)
  Old method takes: 31.3949120045     New method takes: 9.89139890671
seconds
> New method is 3.2x the old method speed.  New method is FASTER.

Value 9223372036854775807 uses (old:9 == new:9) bytes to encode (bit-shift
is 63, offset is -1)
  Old method takes: 31.50395298     New method takes: 9.49733114243 seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Value 9223372036854775808 uses (old:10 == new:10) bytes to encode (bit-shift
is 63, offset is 0)
  Old method takes: 35.6204199791     New method takes: 10.5821249485
seconds
> New method is 3.4x the old method speed.  New method is FASTER.

Value 9223372036854775809 uses (old:10 == new:10) bytes to encode (bit-shift
is 63, offset is 1)
  Old method takes: 34.8968119621     New method takes: 10.5793101788
seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Test done!
---------------------------------------------------------------------------------------------------------------------

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~----------~----~----~----~------~----~------~--~---

Testing speed using 4000000 iterations
Value 18446744073709551615 uses (old:10 == new:10) bytes to encode (UINT64_MAX)
  Old method takes: 35.3816759586     New method takes: 10.5258858204 seconds
> New method is 3.4x the old method speed.  New method is FASTER.

Value 18446744073709551614 uses (old:10 == new:10) bytes to encode 
(UINT64_MAX-1)
  Old method takes: 34.9271941185     New method takes: 10.4822030067 seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Value 0 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is -1)
  Old method takes: 4.06476712227     New method takes: 2.08561015129 seconds
> New method is 1.9x the old method speed.  New method is FASTER.

Value 1 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is 0)
  Old method takes: 4.05545282364     New method takes: 2.09449100494 seconds
> New method is 1.9x the old method speed.  New method is FASTER.

Value 2 uses (old:1 == new:1) bytes to encode (bit-shift is 0, offset is 1)
  Old method takes: 4.31162905693     New method takes: 2.09951090813 seconds
> New method is 2.1x the old method speed.  New method is FASTER.

Value 127 uses (old:1 == new:1) bytes to encode (bit-shift is 7, offset is -1)
  Old method takes: 4.07322001457     New method takes: 1.99916601181 seconds
> New method is 2.0x the old method speed.  New method is FASTER.

Value 128 uses (old:2 == new:2) bytes to encode (bit-shift is 7, offset is 0)
  Old method takes: 5.91175985336     New method takes: 2.56322216988 seconds
> New method is 2.3x the old method speed.  New method is FASTER.

Value 129 uses (old:2 == new:2) bytes to encode (bit-shift is 7, offset is 1)
  Old method takes: 5.71182394028     New method takes: 2.66633605957 seconds
> New method is 2.1x the old method speed.  New method is FASTER.

Value 16383 uses (old:2 == new:2) bytes to encode (bit-shift is 14, offset is 
-1)
  Old method takes: 5.86016702652     New method takes: 2.5270512104 seconds
> New method is 2.3x the old method speed.  New method is FASTER.

Value 16384 uses (old:3 == new:3) bytes to encode (bit-shift is 14, offset is 0)
  Old method takes: 7.12100481987     New method takes: 2.98018193245 seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 16385 uses (old:3 == new:3) bytes to encode (bit-shift is 14, offset is 1)
  Old method takes: 7.09731388092     New method takes: 2.95949792862 seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 2097151 uses (old:3 == new:3) bytes to encode (bit-shift is 21, offset is 
-1)
  Old method takes: 7.49546098709     New method takes: 2.94564318657 seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 2097152 uses (old:4 == new:4) bytes to encode (bit-shift is 21, offset is 
0)
  Old method takes: 9.02288794518     New method takes: 3.25170016289 seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 2097153 uses (old:4 == new:4) bytes to encode (bit-shift is 21, offset is 
1)
  Old method takes: 8.94438004494     New method takes: 3.33856105804 seconds
> New method is 2.7x the old method speed.  New method is FASTER.

Value 268435455 uses (old:4 == new:4) bytes to encode (bit-shift is 28, offset 
is -1)
  Old method takes: 9.18178391457     New method takes: 3.2392680645 seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 268435456 uses (old:5 == new:5) bytes to encode (bit-shift is 28, offset 
is 0)
  Old method takes: 11.2249140739     New method takes: 4.54431605339 seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 268435457 uses (old:5 == new:5) bytes to encode (bit-shift is 28, offset 
is 1)
  Old method takes: 10.726238966     New method takes: 4.54060006142 seconds
> New method is 2.4x the old method speed.  New method is FASTER.

Value 34359738367 uses (old:5 == new:5) bytes to encode (bit-shift is 35, 
offset is -1)
  Old method takes: 18.0354709625     New method takes: 7.20030999184 seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 34359738368 uses (old:6 == new:6) bytes to encode (bit-shift is 35, 
offset is 0)
  Old method takes: 21.0656189919     New method takes: 8.30025315285 seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 34359738369 uses (old:6 == new:6) bytes to encode (bit-shift is 35, 
offset is 1)
  Old method takes: 21.5061860085     New method takes: 8.09376716614 seconds
> New method is 2.7x the old method speed.  New method is FASTER.

Value 4398046511103 uses (old:6 == new:6) bytes to encode (bit-shift is 42, 
offset is -1)
  Old method takes: 21.0156860352     New method takes: 8.32196497917 seconds
> New method is 2.5x the old method speed.  New method is FASTER.

Value 4398046511104 uses (old:7 == new:7) bytes to encode (bit-shift is 42, 
offset is 0)
  Old method takes: 24.8848249912     New method takes: 8.75971508026 seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 4398046511105 uses (old:7 == new:7) bytes to encode (bit-shift is 42, 
offset is 1)
  Old method takes: 25.0450389385     New method takes: 8.79610991478 seconds
> New method is 2.8x the old method speed.  New method is FASTER.

Value 562949953421311 uses (old:7 == new:7) bytes to encode (bit-shift is 49, 
offset is -1)
  Old method takes: 24.9431898594     New method takes: 8.53739500046 seconds
> New method is 2.9x the old method speed.  New method is FASTER.

Value 562949953421312 uses (old:8 == new:8) bytes to encode (bit-shift is 49, 
offset is 0)
  Old method takes: 27.9076519012     New method takes: 9.38700890541 seconds
> New method is 3.0x the old method speed.  New method is FASTER.

Value 562949953421313 uses (old:8 == new:8) bytes to encode (bit-shift is 49, 
offset is 1)
  Old method takes: 28.1347260475     New method takes: 9.65255093575 seconds
> New method is 2.9x the old method speed.  New method is FASTER.

Value 72057594037927935 uses (old:8 == new:8) bytes to encode (bit-shift is 56, 
offset is -1)
  Old method takes: 28.4428110123     New method takes: 9.37137103081 seconds
> New method is 3.0x the old method speed.  New method is FASTER.

Value 72057594037927936 uses (old:9 == new:9) bytes to encode (bit-shift is 56, 
offset is 0)
  Old method takes: 31.4207801819     New method takes: 10.2984797955 seconds
> New method is 3.1x the old method speed.  New method is FASTER.

Value 72057594037927937 uses (old:9 == new:9) bytes to encode (bit-shift is 56, 
offset is 1)
  Old method takes: 31.3949120045     New method takes: 9.89139890671 seconds
> New method is 3.2x the old method speed.  New method is FASTER.

Value 9223372036854775807 uses (old:9 == new:9) bytes to encode (bit-shift is 
63, offset is -1)
  Old method takes: 31.50395298     New method takes: 9.49733114243 seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Value 9223372036854775808 uses (old:10 == new:10) bytes to encode (bit-shift is 
63, offset is 0)
  Old method takes: 35.6204199791     New method takes: 10.5821249485 seconds
> New method is 3.4x the old method speed.  New method is FASTER.

Value 9223372036854775809 uses (old:10 == new:10) bytes to encode (bit-shift is 
63, offset is 1)
  Old method takes: 34.8968119621     New method takes: 10.5793101788 seconds
> New method is 3.3x the old method speed.  New method is FASTER.

Test done!

Attachment: size_calc.py
Description: Binary data

Attachment: varintsizenotag.patch
Description: Binary data

Reply via email to