Looking again at your timings and mine, it appears that you allocate 10x as
many register-sized items as I do, and therefore need 10x the FillChar work to
initialize your filter array.
My timing is about 80 ms and yours looks like about 900 ms for 10x more
register-sized data, which looks like a reasonable ratio, since we may differ
in the way we take the timings (my timing routines being maybe a bit
optimistic).
Here I think the speed limit is the time it takes to actually transfer the
data to RAM. Whether you use FillChar or FillQWord, which end up being STOSB
or STOSQ, the fill itself runs fully cached inside the CPU, so the limiting
factor is the time it takes to move the initialized, CPU-cached data out to
RAM.
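
To check this on your side, something like the following rough sketch could
be used. It is not the code from either of our tests: the program name, the
buffer size and the use of GetTickCount64 as the timer are all my assumptions,
and the tick resolution is coarse, so treat the numbers as indicative only.
If the memory-bandwidth theory holds, the two fills should come out close to
each other once the buffer is much larger than the CPU caches.

```pascal
program FillTiming;
{$mode objfpc}

uses
  SysUtils;

const
  // Assumed size for illustration only: 16 Mi items of 8 bytes = 128 MiB,
  // picked to be well beyond typical CPU cache sizes.
  ItemCount = 16 * 1024 * 1024;

var
  Buf: array of QWord;
  T0, T1: QWord;
begin
  // SetLength zero-fills a dynamic array, so the memory has already been
  // touched once before the timed fills below.
  SetLength(Buf, ItemCount);

  // Byte-oriented fill over the whole buffer (count is in bytes).
  T0 := GetTickCount64;
  FillChar(Buf[0], ItemCount * SizeOf(QWord), 0);
  T1 := GetTickCount64;
  WriteLn('FillChar : ', T1 - T0, ' ms');

  // 64-bit-oriented fill over the same buffer (count is in QWords).
  T0 := GetTickCount64;
  FillQWord(Buf[0], ItemCount, 0);
  T1 := GetTickCount64;
  WriteLn('FillQWord: ', T1 - T0, ' ms');
end.
```

Running it a few times and with different ItemCount values (including one
small enough to fit in cache) should show whether the fill routine or the
transfer to RAM dominates.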
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal