On Monday 07 of May 2012 09:53:01 steve wrote:
> sorry for being vague.
>
> treating pixels as 64bit on amd64 as that is the natural size for the
> machine, vs using 32bits per pixel - 10 bits of r, g, and b or y, u, and v
> plus 2 spare leads to a significant speedup; where significant is a number
> lost in the mists of time.
>
> i believe this speedup is due to the reduction in the rate of cache line
> refills, as forsyth described.
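
for reference, a minimal sketch of the 32-bit layout described above: three 10-bit channels plus 2 spare bits in one word. the bit order (spare bits on top, then r, g, b) and the names pack10, chan_r, chan_g and chan_b are assumptions made for illustration only; the quoted message does not say how the channels were actually arranged.

```c
#include <stdint.h>

/* assumed layout: bits 31..30 spare, 29..20 r, 19..10 g, 9..0 b */
enum {
	CHANBITS = 10,
	CHANMASK = (1 << CHANBITS) - 1	/* 0x3ff */
};

/* pack three 10-bit channels into one 32-bit pixel; excess bits are masked off */
static inline uint32_t
pack10(uint32_t r, uint32_t g, uint32_t b)
{
	return (r & CHANMASK) << 20 | (g & CHANMASK) << 10 | (b & CHANMASK);
}

/* extract the individual channels again */
static inline uint32_t chan_r(uint32_t p) { return (p >> 20) & CHANMASK; }
static inline uint32_t chan_g(uint32_t p) { return (p >> 10) & CHANMASK; }
static inline uint32_t chan_b(uint32_t p) { return p & CHANMASK; }
```

a pixel stored this way takes half the space of a uint64_t, so twice as many pixels fit in a cache line of any given size. the same packing works for y, u, and v.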

on RISC, there's usually a significant penalty for accessing data units
smaller than the machine word, or unaligned ones, but it ain't so on the
benevolent x86 CISC.

both handling pixel graphics and transferring to the graphics card are
special cases. the speedup may be due to better prefetch during sequential
memory access, but a larger data size should not help much here. more data
causes FSB and PCIe contention, and cache thrashing. oops?

when i asked about int and long size on amd64, i was more concerned with
the ability to cast between pointer and integer, and with handling offsets
into large files.

--
dexen deVries

[[[↓][→]]]

Weightless and alone you speed through the eerie nothingness of space
you circle 'round the Moon and journey back to face
the punishing torment of re-entry

-- LUNA-C, ``Supaset8 (full release)'', #24m52s
