In an initial test where I marked a ROM chip as WB cacheable on the P4, I appear to have been successful. It occurs to me that my earlier failure to do this on the Athlon may simply have been because I didn't put the second CPU to sleep, so I was getting cross-CPU cache contention.
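For anyone wanting to picture what "marking the ROM as WB cacheable" amounts to, here is a minimal sketch of how a variable-range MTRR base/mask pair could be computed for a ROM region. It is not the code used in the test above. The helper name mtrr_encode_wb, the struct, and the example ROM address and size are all assumptions made for illustration; the write-back type encoding (6), the valid bit (bit 11 of the mask register), and the MSR numbers in the comments follow the x86 MTRR layout.

```c
#include <assert.h>
#include <stdint.h>

#define MTRR_TYPE_WRBACK      6u            /* write-back memory type */
#define MTRR_PHYS_MASK_VALID  (1ull << 11)  /* V bit in the mask register */

/* Values destined for one variable-range pair, e.g. MSR 0x200
 * (IA32_MTRR_PHYSBASE0) and MSR 0x201 (IA32_MTRR_PHYSMASK0). */
struct mtrr_pair {
    uint64_t base;
    uint64_t mask;
};

/*
 * Hypothetical helper: encode a write-back variable MTRR for a region.
 * The size must be a power of two of at least 4 KiB and the base must be
 * aligned to the size. phys_bits is the CPU's physical address width
 * (36 is assumed for the P4 here). Returns 0 on success, -1 on bad input.
 * The values are only computed; writing them with wrmsr is left out.
 */
static int mtrr_encode_wb(uint64_t base, uint64_t size,
                          unsigned phys_bits, struct mtrr_pair *out)
{
    uint64_t addr_mask;

    assert(phys_bits > 12 && phys_bits < 64);
    addr_mask = ((1ull << phys_bits) - 1) & ~0xfffull;

    if (size < 4096 || (size & (size - 1)) != 0)
        return -1;                       /* not a power of two */
    if (base & (size - 1))
        return -1;                       /* base not aligned to size */

    out->base = (base & addr_mask) | MTRR_TYPE_WRBACK;
    out->mask = (~(size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID;
    return 0;
}
```

With an assumed 1 MiB ROM at 0xFFF00000 and 36 physical address bits, this yields base 0xFFF00006 and mask 0xFFFF00800. On an SMP board the same values would presumably have to be set on every running CPU, or the other CPUs kept asleep, to avoid the contention described above.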
To make this really useful I need a relocatable linuxBIOS, so I can use global variables both when running from the cache and once I have memory set up. The most portable way to do this appears to be setting up page tables so I can use virtual addresses in linuxBIOS.
Eric