I tried to simplify the inner loop to:
~~~
    for word::D in data
        remainder = (remainder >>> word_size)
    end
~~~
It still took more than half the time of the original (with table indexing 
and two XORs) on my Win7-x64 machine. This suggests that not much speed-up of 
the real operations can be expected with this loop structure. Int32 tables 
instead of Int64 do not look promising, either. Are you sure the zlib 
crc32 function was fed the same data (100 million bytes fetched from 
memory, not generated on demand)? If so, the Julia loops could use some 
improvement.
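
For anyone who wants to repeat the comparison, here is a minimal sketch, not the code that was actually timed. The names `shift_only`, `table_driven` and `time_loops` are made up for illustration, the table is filled with random placeholder values rather than real CRC constants, and the remainder is fixed to UInt64 with 8-bit words instead of the generic `D` and `word_size`. It assumes a Julia version that provides `xor` and `time_ns`.

~~~
# Stripped-down loop: only the shift, as in the snippet above.
function shift_only(data::Vector{UInt8}, remainder::UInt64)
    for word in data
        remainder = remainder >>> 8
    end
    return remainder
end

# Table-driven loop: one table index and two XORs per byte.
# `table` must hold 256 entries; here it is a placeholder, not a CRC table.
function table_driven(data::Vector{UInt8}, table::Vector{UInt64}, remainder::UInt64)
    for word in data
        idx = xor(remainder % UInt8, word)              # low byte XOR input byte
        remainder = xor(remainder >>> 8, table[Int(idx) + 1])
    end
    return remainder
end

# Time both loops over the same in-memory data (hypothetical helper).
function time_loops(n::Integer)
    data  = rand(UInt8, n)        # bytes already in memory, not generated on demand
    table = rand(UInt64, 256)     # placeholder table contents
    seed  = rand(UInt64)
    # Warm-up calls so that compilation is not included in the timings.
    shift_only(data[1:1], seed)
    table_driven(data[1:1], table, seed)
    t0 = time_ns(); r1 = shift_only(data, seed);          t_shift = (time_ns() - t0) / 1e9
    t0 = time_ns(); r2 = table_driven(data, table, seed); t_table = (time_ns() - t0) / 1e9
    # Results are returned so the compiler cannot discard the loops.
    return (t_shift, t_table, r1, r2)
end
~~~

Calling `time_loops(100_000_000)` returns the two timings in seconds for the data size discussed here. The ratio will depend on the Julia version and the machine, and a newer compiler may simplify the shift-only loop, so treat the numbers as a rough check only.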
