Hi All,

What follows is an updated version of copy_4K_page that has been tuned for the Cell processor. With the new routine, the system time measured while building a 2.6.26 kernel with pseries_defconfig dropped by ~10s:

mainline (2.6.27-rc1-00632-g2e1e921):

    real 17m8.727s    user 59m48.693s    sys 3m56.089s
    real 17m9.350s    user 59m44.822s    sys 3m56.666s

new routine:

    real 17m7.311s    user 59m51.339s    sys 3m47.043s
    real 17m7.863s    user 59m49.028s    sys 3m46.608s

The same routine also improved performance on 970 CPUs, though by a much smaller amount (roughly 2s of system time):

mainline (2.6.27-rc1-00632-g2e1e921):

    real 16m8.545s    user 14m38.134s    sys 1m55.156s
    real 16m7.089s    user 14m37.974s    sys 1m55.010s

new routine:

    real 16m11.641s   user 14m37.251s    sys 1m52.618s
    real 16m6.139s    user 14m38.282s    sys 1m53.184s

I also tested on Power{3..6}. Power3, Power5 and Power6 did better with the new routine when the dcbt and dcbz weren't used (in which case they achieved performance comparable to the existing kernel copy_4K_page routine). Power4, on the other hand, performed slightly better with the dcbt and dcbz included (still comparable to the current kernel copy_4K_page).

So, to get the best performance across the board, I created a new CPU feature that governs whether the dcbt and dcbz are used (and uncreatively named it CPU_FTR_CP_USE_DCBTZ). I added it to the CPU features of Cell, Power4 and 970.

Unfortunately I don't have access to a PA6T, but judging by the marketing material I could find, it looks like it has a strong enough hardware prefetcher that it probably wouldn't benefit from the dcbt and dcbz...

Okay, that's probably enough prattling along - you can all go and look at the code now. All comments appreciated.

[I decided to post the whole copy routine rather than a diff between it and the current one because I found the diff quite unreadable. I'll post a real patchset after I've addressed any comments.]

Many thanks!
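
For readers who want to see the feature-gating idea in isolation, here is a minimal user-space sketch in C. It is not the kernel code: the real routine is assembly, and the real value of CPU_FTR_CP_USE_DCBTZ comes from the patch. The sketch_* names, the bit position and the lookup table are all made up for illustration; only the set of CPUs that get the bit (Cell, Power4, 970) is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit position, chosen only for this sketch. */
#define SKETCH_FTR_CP_USE_DCBTZ (UINT64_C(1) << 0)

/* Hypothetical per-CPU record: a name plus a feature bitmask. */
struct sketch_cpu {
    const char *name;
    uint64_t features;
};

/* CPUs mentioned above; the bit is set only where dcbt/dcbz helped. */
static const struct sketch_cpu sketch_cpus[] = {
    { "Cell",   SKETCH_FTR_CP_USE_DCBTZ },
    { "Power4", SKETCH_FTR_CP_USE_DCBTZ },
    { "970",    SKETCH_FTR_CP_USE_DCBTZ },
    { "Power3", 0 },
    { "Power5", 0 },
    { "Power6", 0 },
};

/*
 * Returns 1 if the named CPU should take the dcbt/dcbz path,
 * 0 if it should skip it, and -1 if the CPU is not in the table.
 */
static int sketch_use_dcbtz(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(sketch_cpus) / sizeof(sketch_cpus[0]); i++) {
        if (strcmp(sketch_cpus[i].name, name) == 0)
            return (sketch_cpus[i].features & SKETCH_FTR_CP_USE_DCBTZ) != 0;
    }
    return -1;
}
```

Calling sketch_use_dcbtz("Cell") returns 1 and sketch_use_dcbtz("Power5") returns 0. A CPU that is not in the table, such as the PA6T, returns -1 here; in a real feature mask an unset bit would simply mean the dcbt/dcbz path is skipped.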