@nexprs misunderstanding fixed (thanks tim!).  fast julia code at 
https://github.com/andrewcooke/CRC.jl/blob/master/src/CRC.jl#L349
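
for anyone who hasn't met @nexprs, here is a minimal sketch of the kind of unrolling it does. this is not the code in CRC.jl: the names make_table and crc32_unrolled are made up for illustration, the polynomial is assumed to be the standard reflected CRC-32 one (0xedb88320), and the syntax is current julia (⊻ and UInt32 where the 2014 code quoted below uses $ and Uint32).

```julia
using Base.Cartesian  # provides @nexprs

# Byte-at-a-time lookup table for a reflected CRC (polynomial is an assumption).
function make_table(poly::UInt32 = 0xedb88320)
    table = zeros(UInt32, 256)
    for i in 0:255
        c = UInt32(i)
        for _ in 1:8
            c = isodd(c) ? (c >> 1) ⊻ poly : c >> 1
        end
        table[i + 1] = c
    end
    return table
end

# Hypothetical helper: process four bytes per loop iteration.
function crc32_unrolled(data::AbstractVector{UInt8}, table = make_table())
    r = 0xffffffff
    n = length(data)
    i = 0
    while i + 4 <= n
        # Expands to four copies of the update, with k replaced by 1, 2, 3, 4.
        @nexprs 4 k -> (r = (r >> 8) ⊻ table[((r ⊻ data[i + k]) & 0xff) + 1])
        i += 4
    end
    # Leftover bytes when the length is not a multiple of four.
    for j in (i + 1):n
        r = (r >> 8) ⊻ table[((r ⊻ data[j]) & 0xff) + 1]
    end
    return r ⊻ 0xffffffff
end
```

crc32_unrolled(Vector{UInt8}("123456789")) should give 0xcbf43926, the usual CRC-32 check value. each unrolled step still depends on the previous r, so this only removes loop overhead.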

On Sunday, 20 April 2014 14:35:28 UTC-3, andrew cooke wrote:
>
>
> Just for the record - multiple tables and unrolling in Julia now beats C 
> (very slightly).
>
> Tim's @nexprs macro generally helps with the unrolling (although I seem to
> have hit a bug or misunderstanding in one particular case, so am having to
> copy + paste in one place).
>
> Thanks,
> Andrew
>
> On Thursday, 10 April 2014 19:52:03 UTC-3, andrew cooke wrote:
>>
>>
>> huh.  i had forgotten about this.
>>
>> i'll try four tables.  it shouldn't be that hard to add (although there's 
>> going to be extra book-keeping - it's not an obvious gain to me).
>>
>> cheers,
>> andrew
>>
>> On Thursday, 10 April 2014 19:08:21 UTC-3, Chris Foster wrote:
>>>
>>> On Fri, Apr 11, 2014 at 6:44 AM, Laszlo Hars wrote:
>>> > note that the running time does not change with a partial loop unroll,
>>> > like this:
>>> > ~~~ 
>>> > function signed_loop{D<:Unsigned, A<:Unsigned}(::Type{D}, r::A, data, 
>>> > table::Vector{A}) 
>>> >     local j = 0 
>>> >     for i = 1 : div(length(data),20)
>>> >         r = (r >>> 8) $ table[1 + (data[j+=1]$convert(D,r))] 
>>> [...] 
>>> >         r = (r >>> 8) $ table[1 + (data[j+=1]$convert(D,r))] 
>>> >     end 
>>> >     return r 
>>> > end 
>>> > ~~~ 
>>>
>>> In that case, it's probably because zlib is processing the bytes four 
>>> at a time, using four different CRC tables.  This is quite distinct 
>>> from the loop unrolling, and can have a larger effect because it 
>>> removes some of the data dependency between iterations.  It looks 
>>> something like this (very untested!  I didn't have time to figure out 
>>> how to make the four different CRC tables.) 
>>>
>>> # note: need special cases for trailing bytes
>>> data4 = reinterpret(Uint32, data)
>>> for i = 1:length(data4)
>>>     word::Uint32 = data4[i]
>>>     r = r $ word
>>>     r = table3[1 + (r & 0xff)] $ table2[1 + ((r >> 8) & 0xff)] $
>>>         table1[1 + ((r >> 16) & 0xff)] $ table0[1 + (r >> 24)]
>>> end
>>>
>>
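
For the record, a minimal sketch of the four-table scheme discussed above, including one way to build the four tables. Assumptions not taken from the thread: the standard reflected CRC-32 polynomial 0xedb88320 with the usual initial value and final xor, little-endian byte order within each word, current Julia syntax (⊻ and UInt32 where the 2014 code above uses $ and Uint32), and the hypothetical names make_tables and crc32_slice4. It is not the code in CRC.jl.

```julia
# Build four 256-entry tables. tables[1] is the ordinary byte-at-a-time table;
# tables[k] gives the effect of pushing a byte through k - 1 further zero bytes.
function make_tables(poly::UInt32 = 0xedb88320)
    tables = [zeros(UInt32, 256) for _ in 1:4]
    for i in 0:255
        c = UInt32(i)
        for _ in 1:8
            c = isodd(c) ? (c >> 1) ⊻ poly : c >> 1
        end
        tables[1][i + 1] = c
    end
    for k in 2:4, i in 1:256
        prev = tables[k - 1][i]
        tables[k][i] = (prev >> 8) ⊻ tables[1][(prev & 0xff) + 1]
    end
    return tables
end

# CRC over `data`, four bytes at a time, with a bytewise loop for the tail.
function crc32_slice4(data::AbstractVector{UInt8}, tables = make_tables())
    t0, t1, t2, t3 = tables
    r = 0xffffffff
    n = length(data)
    i = 1
    while i + 3 <= n
        # Assemble a little-endian 32-bit word from four bytes.
        word = UInt32(data[i]) | (UInt32(data[i + 1]) << 8) |
               (UInt32(data[i + 2]) << 16) | (UInt32(data[i + 3]) << 24)
        r = r ⊻ word
        # Four independent lookups, combined with xor.
        r = t3[(r & 0xff) + 1] ⊻ t2[((r >> 8) & 0xff) + 1] ⊻
            t1[((r >> 16) & 0xff) + 1] ⊻ t0[(r >> 24) + 1]
        i += 4
    end
    while i <= n
        r = (r >> 8) ⊻ t0[((r ⊻ data[i]) & 0xff) + 1]
        i += 1
    end
    return r ⊻ 0xffffffff
end
```

crc32_slice4(Vector{UInt8}("123456789")) should give 0xcbf43926, the usual CRC-32 check value; the nine-byte input exercises both the word loop and the trailing-byte loop. Assembling the word by hand avoids reinterpret and its alignment and endianness questions, at some cost in speed.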
