Hi all,

This patch yields a performance increase of about 4% on SPECfp2000 and 13% on the NAS benchmark suite on an Itanium-2 system. Further gains should be possible by tuning the parameters further and by improving the prefetch algorithm at the tree level.
Details of the NAS benchmark runs are listed below.

GCC options: -O3 -fprefetch-loop-arrays
Target: Itanium-2 1.6GHz; L2 Cache 256K, L3 Cache 6M

Execution times in seconds

           -this patch   +this patch
bt.W          14.43         14.17
cg.A          13.76          6.86
ep.W           7.83          7.79
ft.A          18.73         20.15
is.B          11.85         10.94
lu.W          20.55         20.27
mg.A          15.09         11.86
sp.W          37.11         35.49
geomean       15.84         13.94
speedup                     13.68%

2006-06-02  Canqun Yang  <[EMAIL PROTECTED]>

	* config/ia64/ia64.h (SIMULTANEOUS_PREFETCHES): Define to 18.
	(PREFETCH_BLOCK): Define to 128.
	(PREFETCH_LATENCY): Define to 400.

Index: ia64.h
===================================================================
--- ia64.h	(revision 114307)
+++ ia64.h	(working copy)
@@ -1985,13 +1985,18 @@
    ??? This number is bogus and needs to be replaced before the value is
    actually used in optimizations.  */
 
-#define SIMULTANEOUS_PREFETCHES 6
+#define SIMULTANEOUS_PREFETCHES 18
 
 /* If this architecture supports prefetch, define this to be the size of
    the cache line that is prefetched.  */
 
-#define PREFETCH_BLOCK 32
+#define PREFETCH_BLOCK 128
 
+/* A number that should roughly correspond to the number of instructions
+   executed before the prefetch is completed.  */
+
+#define PREFETCH_LATENCY 400
+
 #define HANDLE_SYSV_PRAGMA 1
 
 /* A C expression for the maximum number of instructions to execute via

Canqun Yang
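
As a rough illustration of how the macros touched by this patch interact, here is a simplified sketch. It is not the pass's actual code. It assumes the tree-level prefetcher picks its prefetch distance as the ceiling of PREFETCH_LATENCY divided by the estimated instruction count of one loop iteration, and that a reference with a small stride needs only one prefetch per PREFETCH_BLOCK bytes. The helper names prefetch_ahead and prefetch_every are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Values proposed by this patch for ia64.  */
#define PREFETCH_BLOCK 128
#define PREFETCH_LATENCY 400

/* Hypothetical helper: number of iterations to prefetch ahead, assuming
   the distance is ceil (latency / instructions per iteration).  */
static unsigned
prefetch_ahead (unsigned latency, unsigned insns_per_iter)
{
  assert (insns_per_iter > 0);
  return (latency + insns_per_iter - 1) / insns_per_iter;
}

/* Hypothetical helper: issue one prefetch every N iterations for a
   reference that advances STEP_BYTES per iteration, assuming one
   prefetch covers BLOCK bytes.  */
static unsigned
prefetch_every (unsigned block, unsigned step_bytes)
{
  assert (step_bytes > 0);
  if (step_bytes >= block)
    return 1;   /* Every iteration touches a new block.  */
  return block / step_bytes;
}
```

Under these assumptions, a loop body of 40 instructions gives prefetch_ahead (PREFETCH_LATENCY, 40) = 10 iterations, and a stride of 8 bytes (one double per iteration) gives prefetch_every (PREFETCH_BLOCK, 8) = 16, i.e. one prefetch per 16 iterations. The 40-instruction body and the 8-byte stride are made-up inputs, not measurements from the benchmarks above.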