[Luarocks-developers] bit.lshift and performance - luabitop v.s. lua-5.2.0-work4

David Manura Tue, 12 Oct 2010 00:28:09 -0700

As the LuaBitOp vs. Lua-5.2.0-work4 bit libraries have been a point of
contention, I thought I'd post a little anecdotal feedback tonight.


LuaBitOp and Lua-5.2.0-work4 bitwise lshift behave differently here:

  bit.lshift(0xffffffff, 32) --> -1 (LuaJIT2) or 0 (lua-5.2.0-work4)

Indeed, the LuaBitOp docs say that only the lower 5 bits of the shift
count are used (i.e. 0-31), which matches what x86 shift instructions
do; in C itself, shifting a 32-bit value by 32 or more is undefined.
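
To make the difference concrete, here is a minimal pure-Lua model of the
two shift-count conventions.  It is only a sketch: the helper names
(tobit, lshift_masked, lshift_unmasked) are made up for this example and
are not part of either library, and the "unmasked" variant simply
encodes the work4 result quoted above rather than its actual source.

```lua
-- Pure-Lua model of the two shift-count conventions.
-- tobit, lshift_masked and lshift_unmasked are hypothetical names.

-- Wrap a number into the signed 32-bit range, the way LuaBitOp reports results.
local function tobit(x)
  x = x % 2^32
  if x >= 2^31 then x = x - 2^32 end
  return x
end

-- LuaBitOp-style: only the low 5 bits of the shift count are used.
local function lshift_masked(x, n)
  return tobit((x % 2^32) * 2^(n % 32))
end

-- lua-5.2.0-work4-style, as observed above: a shift by 32 or more
-- yields 0, and results stay unsigned.
local function lshift_unmasked(x, n)
  if n >= 32 then return 0 end
  return ((x % 2^32) * 2^n) % 2^32
end

print(lshift_masked(0xffffffff, 32), lshift_unmasked(0xffffffff, 32))
print(lshift_masked(1, 31), lshift_unmasked(1, 31))
```

For shift counts below 32 the two models agree modulo 2^32; they differ
only in signedness and in what happens at a count of 32 or more.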

I came across that when converting some pure Lua code to these native
bit libraries.  The code involves ModuleCompressDeflateLua [1] where
it wraps a bytestream in a bitstream [2].  A simplified version is
given below.  Basically, you push individual bytes onto a queue (a
four-byte buffer) and pop 0-32 bits at a time.

-----
local buffer = 0
local buffer_nbits = 0

local function push_byte(byte)
  buffer = buffer + bit.lshift(byte, buffer_nbits)
  buffer_nbits = buffer_nbits + 8
  return byte, buffer, buffer_nbits
end

local function pop_bits(nbits)
  local bits = bit.band(buffer, bit.lshift(1, nbits)-1)
  buffer = bit.rshift(buffer, nbits)
  --[[ pure Lua-5.1 alternative:
  local m = 2^nbits  -- or pow2[nbits]
  local bits = buffer % m
  buffer = (buffer - bits) / m
  --]]
  buffer_nbits = buffer_nbits - nbits
  return bits, buffer, buffer_nbits
end

print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(pop_bits(32))

$ lua52-work4 t.lua
255     255     8
255     65535   16
255     16777215        24
255     4294967295      32
4294967295      0       0

$ luajit2-20101011 t.lua
255     255     8
255     65535   16
255     16777215        24
255     -1      32
0       -1      0
-----

The two bit libraries (and the pure Lua alternative) give identical
final results for nbits < 32, but for nbits == 32 the LuaBitOp version
breaks as written, so I suppose that case would need to be handled
specially.  That's not to say LuaBitOp's behavior is wrong: having
bitwise semantics that differ between C and Lua could complicate
conversion of C/native code to Lua (and vice versa, e.g. LuaToCee).
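
One way the nbits == 32 case might be special-cased under LuaBitOp is
sketched below.  This is an illustration, not the code in
ModuleCompressDeflateLua: it assumes a "bit" module with LuaBitOp
semantics, and, so that it also runs on an interpreter without one, it
falls back to a small pure-Lua stand-in that I made up for this example.
The final `% 2^32` is likewise my own choice, so that the popped value
comes back unsigned.

```lua
-- Sketch only: use LuaBitOp / LuaJIT's "bit" module when it is available,
-- otherwise fall back to a small pure-Lua stand-in (an assumption of this
-- example) that mimics its masked shift counts and signed 32-bit results.
local ok, bit = pcall(require, "bit")
if not ok then
  local function tobit(x)
    x = x % 2^32
    if x >= 2^31 then x = x - 2^32 end
    return x
  end
  bit = {
    lshift = function(x, n) return tobit((x % 2^32) * 2^(n % 32)) end,
    rshift = function(x, n)
      return tobit(math.floor((x % 2^32) / 2^(n % 32)))
    end,
    band = function(a, b)
      a, b = a % 2^32, b % 2^32
      local result, place = 0, 1
      for _ = 1, 32 do
        local abit, bbit = a % 2, b % 2
        if abit == 1 and bbit == 1 then result = result + place end
        a, b, place = (a - abit) / 2, (b - bbit) / 2, place * 2
      end
      return tobit(result)
    end,
  }
end

local buffer = 0
local buffer_nbits = 0

local function push_byte(byte)
  buffer = buffer + bit.lshift(byte, buffer_nbits)
  buffer_nbits = buffer_nbits + 8
  return byte, buffer, buffer_nbits
end

local function pop_bits(nbits)
  local bits
  if nbits == 32 then
    -- A masked shift by 32 is a shift by 0, so take the whole buffer instead.
    bits, buffer = buffer, 0
  else
    bits = bit.band(buffer, bit.lshift(1, nbits) - 1)
    buffer = bit.rshift(buffer, nbits)
  end
  buffer_nbits = buffer_nbits - nbits
  -- Report the popped bits as an unsigned value.
  return bits % 2^32, buffer, buffer_nbits
end

for _ = 1, 4 do push_byte(0xff) end
local bits, rest, left = pop_bits(32)
print(bits, rest, left)
```

With this change, popping all 32 bits should return the full value
(4294967295) and leave an empty buffer, as in the lua52-work4 run above.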

I also took a brief look at performance.  The largest bottleneck
(~50%) in the pure Lua form is in the CRC calculation performing the
XOR operation, which is implemented as a variant of Roberto's XOR [3].
That bottleneck goes away with the native XOR operation.
Interestingly, it also largely goes away in LuaJIT2 even when using
[3].  However, most of the other bit operations in
ModuleCompressDeflateLua are shifts and modulus, which, although
cleaner with their native operations, tend to be slightly slower when
implemented as function calls:

  local x = 0
  local rshift = bit.rshift
  for i=1,1000000000 do
    x = (x - x % 256) / 256
    --x = rshift(x, 8)
  end

*except* under LuaJIT2, where the rshift version is faster by over an
order of magnitude.
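
For anyone wanting to repeat the comparison, a rough timing harness
could look like the sketch below.  The names (N, shift_arith,
shift_native, time_it) are hypothetical, the iteration count is much
smaller than the one above so that it finishes quickly, and the numbers
it prints will depend on the machine and interpreter; it is not the
setup that produced the observations in this post.

```lua
-- Hypothetical timing harness; N is kept far smaller than the 1e9
-- iterations used above so the sketch finishes quickly.
local N = 1e6

-- Look for a native right shift: LuaBitOp/LuaJIT's "bit" module first,
-- then the bit32 global of Lua 5.2.  rshift stays nil if neither exists.
local ok, bitlib = pcall(require, "bit")
if not ok then bitlib = rawget(_G, "bit32") end
local rshift = type(bitlib) == "table" and bitlib.rshift or nil

-- Shift right by 8 bits using plain arithmetic.
local function shift_arith(n)
  local x = 0
  for i = 1, n do
    x = (x - x % 256) / 256
  end
  return x
end

-- Shift right by 8 bits through the library function call.
local function shift_native(n)
  local x = 0
  for i = 1, n do
    x = rshift(x, 8)
  end
  return x
end

-- Run fn(n) once; return the CPU seconds it took, plus fn's result.
local function time_it(fn, n)
  local start = os.clock()
  local result = fn(n)
  return os.clock() - start, result
end

print(string.format("arithmetic: %.3f s", (time_it(shift_arith, N))))
if rshift then
  print(string.format("rshift:     %.3f s", (time_it(shift_native, N))))
else
  print("rshift:     skipped (no bit library found)")
end
```

os.clock measures CPU time, which seems adequate for a single-threaded
loop like this; a more careful benchmark would repeat each run several
times.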

[1] http://lua-users.org/wiki/ModuleCompressDeflateLua
[2] http://github.com/davidm/lua-compress-deflatelua/blob/master/module/lmod/compress/deflatelua.lua#L178
[3] http://lua-users.org/lists/lua-l/2002-09/msg00134.html
