Since the LuaBitOp vs. Lua-5.2.0-work4 bit libraries have been a point
of contention, I thought I'd post a little anecdotal feedback tonight.
LuaBitOp and Lua-5.2.0-work4 bitwise lshift behave differently here:
bit.lshift(0xffffffff, 32) --> -1 (LuaJIT2) or 0 (lua-5.2.0-work4)
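Indeed, the LuaBitOp docs say that only the lower 5 bits of the shift
count are used (i.e. the count is reduced to 0-31). That matches what C
code typically does on x86, although in C itself shifting a 32-bit value
by 32 or more is undefined.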
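To make the difference concrete, here is a minimal pure-Lua sketch of the two behaviours. The function names `lshift_masked` and `lshift_unmasked` are made up for illustration, the code is not taken from either library, and the "unmasked" model is only inferred from the single work4 result quoted above. It assumes a non-negative integer shift count.

```lua
-- Hypothetical pure-Lua models of the two lshift behaviours.
-- These are illustrations, not the libraries' actual implementations.

-- LuaBitOp-style: only the low 5 bits of the count are used, and the
-- result is reported as a signed 32-bit number.
local function lshift_masked(x, n)
  local r = ((x % 2^32) * 2^(n % 32)) % 2^32
  if r >= 2^31 then r = r - 2^32 end
  return r
end

-- work4-style (assumed): a count of 32 or more shifts every bit out,
-- and the result stays in the unsigned range 0..2^32-1.
local function lshift_unmasked(x, n)
  if n >= 32 then return 0 end
  return ((x % 2^32) * 2^n) % 2^32
end

print(lshift_masked(0xffffffff, 32), lshift_unmasked(0xffffffff, 32))
print(lshift_masked(0xff, 24), lshift_unmasked(0xff, 24))
```

The first print shows the disagreement at a count of 32. The second shows that for counts below 32 the two models differ only in whether the top bit is reported as a sign.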
I came across this while converting some pure Lua code to these native
bit libraries. The code is from ModuleCompressDeflateLua [1], where it
wraps a bytestream in a bitstream [2]. A simplified version is given
below. Basically, you push individual bytes onto a queue (a four-byte
buffer) and pop 0-32 bits at a time.
-----
local buffer = 0
local buffer_nbits = 0

-- Append one byte above the bits already queued.
local function push_byte(byte)
  buffer = buffer + bit.lshift(byte, buffer_nbits)
  buffer_nbits = buffer_nbits + 8
  return byte, buffer, buffer_nbits
end

-- Remove and return the lowest nbits bits (0-32).
local function pop_bits(nbits)
  local bits = bit.band(buffer, bit.lshift(1, nbits)-1)
  buffer = bit.rshift(buffer, nbits)
  --[[pure Lua-5.1 alternative:
  local m = 2^nbits -- or pow2[nbits]
  local bits = buffer % m
  buffer = (buffer - bits) / m
  --]]
  buffer_nbits = buffer_nbits - nbits
  return bits, buffer, buffer_nbits
end

print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(push_byte(0xff))
print(pop_bits(32))

$ lua52-work4 t.lua
255 255 8
255 65535 16
255 16777215 24
255 4294967295 32
4294967295 0 0
$ luajit2-20101011 t.lua
255 255 8
255 65535 16
255 16777215 24
255 -1 32
0 -1 0
-----
The two bit libraries (and the pure Lua alternative) give identical
final results for nbits < 32, but for nbits == 32 the LuaBitOp version
breaks as written, and I suppose that case would need to be handled
specially. That's not to say LuaBitOp is wrong: having bitwise
semantics that differ between C and Lua could complicate converting
C/native code to Lua (and vice versa, e.g. LuaToCee).
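One way the nbits == 32 special case might be handled is sketched below. This is not how ModuleCompressDeflateLua does it. `new_bitstream` is a hypothetical name, and `bitlib` is assumed to be any table providing `band`, `bor`, `lshift` and `rshift` with 32-bit semantics. The sketch also uses `bor` rather than `+` when pushing, and normalises the popped value to 0..2^32-1 so that LuaBitOp's signed results and unsigned results compare equal.

```lua
-- Sketch: a bitstream that never shifts by 32, so it does not depend on
-- whether the library masks the shift count.
-- `new_bitstream` and `bitlib` are hypothetical names for illustration.
local function new_bitstream(bitlib)
  local band, bor = bitlib.band, bitlib.bor
  local lshift, rshift = bitlib.lshift, bitlib.rshift
  local buffer, buffer_nbits = 0, 0
  local stream = {}

  -- Append one byte above the bits already queued (at most 4 bytes).
  function stream.push_byte(byte)
    buffer = bor(buffer, lshift(byte, buffer_nbits))
    buffer_nbits = buffer_nbits + 8
  end

  -- Remove and return the lowest nbits bits (0-32).
  function stream.pop_bits(nbits)
    local bits
    if nbits == 32 then
      -- Special case: take the whole buffer instead of shifting by 32.
      bits, buffer = buffer, 0
    else
      bits = band(buffer, lshift(1, nbits) - 1)
      buffer = rshift(buffer, nbits)
    end
    buffer_nbits = buffer_nbits - nbits
    -- Map a signed 32-bit result (e.g. -1) to its unsigned value.
    return bits % 2^32
  end

  return stream
end
```

With LuaBitOp or LuaJIT this would be created as `new_bitstream(require("bit"))`; with a library that exposes the same four functions under another name, pass that table instead.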
I also took a brief look at performance. The largest bottleneck
(~50%) in the pure Lua form is the XOR operation in the CRC
calculation, which is implemented as a variant of Roberto's XOR [3].
That bottleneck goes away with the native XOR operation.
Interestingly, it also largely goes away in LuaJIT2 even when using
[3]. However, most of the other bit operations in
ModuleCompressDeflateLua are shifts and modulus, which, although
cleaner when written with the native operations, tend to be slightly
slower when implemented as function calls:
local x = 0
local rshift = bit.rshift
for i=1,1000000000 do
  x = (x - x % 256) / 256
  --x = rshift(x, 8)
end
*except* under LuaJIT2, where the rshift version is faster by over an
order of magnitude.
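For readers who have not seen an arithmetic XOR of the kind mentioned above, a sketch along the lines of [3] follows. It is not necessarily identical to [3] or to the variant used in ModuleCompressDeflateLua, and it assumes both arguments are integers in the range 0..2^32-1. The 32-iteration loop per call gives a feel for why this was the bottleneck in plain Lua.

```lua
-- Sketch of a bitwise XOR built from plain arithmetic, one bit per
-- loop iteration. Assumes a and b are integers in 0..2^32-1.
local floor = math.floor

local function bxor(a, b)
  local r = 0
  for i = 0, 31 do
    -- a/2 + b/2 has a fractional part exactly when one (and only one)
    -- of the two low bits is set, i.e. when the low bits differ.
    local x = a / 2 + b / 2
    if x ~= floor(x) then
      r = r + 2^i
    end
    a = floor(a / 2)
    b = floor(b / 2)
  end
  return r
end

print(bxor(5, 3))
```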
[1] http://lua-users.org/wiki/ModuleCompressDeflateLua
[2] http://github.com/davidm/lua-compress-deflatelua/blob/master/module/lmod/compress/deflatelua.lua#L178
[3] http://lua-users.org/lists/lua-l/2002-09/msg00134.html