Well... a blur is not a pixel-level operation either. It can be handled in
blocks too. I might see this fairly clearly because I come from an ASM
background. The thing you need to realize is that performing an action on
a 32-bit block is not much slower than performing the same action on a
single pixel. Actually, the pixel is most likely slower: the CPU still
reads 32 bits even though it looks like it only reads one. With the single
pixel it needs to perform an AND when it reads it and 2 x (AND + OR) when
it writes it. With a 32-bit block it does not. On top of that, since
a blur is essentially a form of convolution filter, the pixel version needs
to read the neighboring pixels within some radius, the radius being the
amount of blur to apply. So now we have another stack of single pixels to
process. What started out as a one-pixel process is actually
getting/putting another 20 pixels in a 3.0 blur (radius) scenario:
 323
32123
21X12
32123
 323
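
As a quick sanity check on the map above, here is a minimal sketch that rebuilds it and counts the neighbors. It assumes the numbers are |dx| + |dy| distances from X inside a 5x5 window, cut off at 3, which is a reading of the diagram and not something the post spells out. The function names are made up for the example.

```python
def neighbour_map(half=2, max_dist=3):
    """Rebuild the map: each cell holds its |dx| + |dy| distance from X.

    Assumption: a 5x5 window (half=2) with cells farther than max_dist
    left blank, which is what the diagram appears to show.
    """
    rows = []
    for dy in range(-half, half + 1):
        row = ""
        for dx in range(-half, half + 1):
            dist = abs(dx) + abs(dy)
            if dist == 0:
                row += "X"          # the pixel being blurred
            elif dist <= max_dist:
                row += str(dist)    # a neighbor that has to be read
            else:
                row += " "          # outside the blur footprint
        rows.append(row)
    return rows


def count_neighbours(rows):
    """Count every numbered cell, i.e. every neighbor except X itself."""
    return sum(ch.isdigit() for row in rows for ch in row)


rows = neighbour_map()
print("\n".join(rows))
print(count_neighbours(rows))  # 20
```

That gives the 20 extra pixels per output pixel mentioned above.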
Now multiply that by your 0.75 megapixels (20 x 750,000 is roughly 15
million extra pixel accesses)..... IT ADDS UP
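
To make the AND / OR point from earlier concrete, here is a rough sketch of per-pixel access versus whole-block access. It assumes 8-bit pixels packed four to a 32-bit word, which the post does not actually specify, and it uses a simple invert instead of a blur to keep it short. Python integers stand in for 32-bit words, so this shows which operations are needed, not how fast they run. All names are hypothetical.

```python
# Assumption: 8-bit pixels, four per 32-bit word. Names are illustrative.
WORD_MASK = 0xFFFFFFFF


def read_pixel(word, index):
    """Pull one pixel out of the word: a shift plus an AND."""
    return (word >> (index * 8)) & 0xFF


def write_pixel(word, index, value):
    """Put one pixel back: AND to clear its slot, AND to clip the new
    value, OR to merge the two."""
    shift = index * 8
    cleared = word & ~(0xFF << shift)
    return cleared | ((value & 0xFF) << shift)


def invert_word(word):
    """Whole-block version: one NOT covers all four pixels.

    The extra AND only keeps Python's unbounded ints to 32 bits.
    """
    return ~word & WORD_MASK


def invert_per_pixel(word):
    """Same result one pixel at a time, with masking on every access."""
    for i in range(4):
        word = write_pixel(word, i, 255 - read_pixel(word, i))
    return word


word = 0x11223344
print(hex(invert_word(word)))       # 0xeeddccbb
print(hex(invert_per_pixel(word)))  # 0xeeddccbb
```

Both paths give the same word, but the per-pixel one goes through four masked reads and four masked writes to get there. The exact instruction count on real hardware depends on the CPU and the pixel format.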
--Allan