On 29/04/08 17:41 +0200, NoiseEHC wrote:
> On this page
> http://wiki.laptop.org/go/Geode_LX
> I have named some instructions as "Synchronized ops" (in the MMX
> section). Are those real or did I mismeasured something?

That section is very difficult to understand. I'm not sure which
operations you have invented this name for.

> If those are
> real then would somebody from AMD just go through the databook and fix
> the instruction clock cycle numbers? Because in that case it is sure
> that they do not match reality and clearly I have better things to do
> than measuring clock cycles.

Clearly you must have some basis for assuming that the numbers are
wrong, so you must have done some measurement. I consulted the secret
documentation that you claim I am withholding from you, and the timings
there are the same as in the datasheet. I believe you are correct that
these are the clock counts for the instruction to go through the FPU,
and that they don't include the stall time for the pipeline to clear.
I am not a silicon designer, so I'm not the final word on whether they
are correct, but at least that should prove that there isn't a massive
marketing conspiracy to hide the details of the processor from our
customers. If they are lying to you, they are lying to me, and they're
not lying to me.

> Also the legend is clearly wrong in several
> cases so probably that would need checking too (like on page 668 note 4
> talks about 3DNOW ops in the table about FP ops).

That is a mistake. I have let the technical writer know about it.

> absolutely no info about L2 cache miss penalties or mispredicted jumps
> or about the pipeline stages of the FP unit.

I don't have any information about L2 cache miss penalties, but they
are easy to calculate. Please see:

http://homepages.cwi.nl/~manegold/Calibrator/

I will talk to somebody about documenting the FP unit pipeline. It does
handle 1 instruction per clock from the integer unit. In practice we
know that two floating point instructions back to back will stall the
IU. I can also tell you that it is optimized for single precision, so
double precision is handled by microcode and needs to go through the
path again.

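As a rough illustration of how a cache miss penalty can be measured,
here is a minimal pointer-chasing sketch in C. It is not taken from
Calibrator; the function names (build_cycle, chase_ns_per_load) and the
use of clock_gettime are assumptions made for this example. The idea is
to walk a randomly ordered chain of dependent loads, so each load has to
wait for the previous one, and to compare the time per load for a
buffer that fits in cache against one that does not.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Fill next[] with one random cycle over n slots (Sattolo's algorithm),
   so following next[] from slot 0 visits every slot exactly once before
   coming back to slot 0. */
static void build_cycle(size_t *next, size_t n, unsigned seed)
{
    assert(n > 1);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;  /* j < i keeps it a single cycle */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Perform `steps` dependent loads starting at slot 0 and return the
   average time per load in nanoseconds. The final slot is written to
   *end so the compiler cannot discard the loop. */
static double chase_ns_per_load(const size_t *next, size_t steps,
                                size_t *end)
{
    struct timespec t0, t1;
    size_t p = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *end = p;
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

To use it, allocate an array of size_t, call build_cycle() on it, and
time chase_ns_per_load() for several array sizes. The difference in
ns per load between a size that fits in a cache level and one that does
not approximates the miss penalty; multiplying by the core clock in GHz
converts it to cycles. On platforms where RAND_MAX is small the chain is
less random for large arrays, which is a limitation of this sketch.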
> See, all I would like to have is enough data that when I look at
> assembly code I could approximately calculate how many clock cycles will
> be consumed. Nothing more and nothing less.

You have nearly all the information you need, and you can collect the
additional information the same way we do, with careful analysis and
measurement. In fact, Bernie and Vladimir Makarov have done a lot of
work already in this area, resulting in the Geode-specific code for
gcc 4.2.0 and glibc. Perhaps you can work with them to figure out the
finer details of the FPU scheduling. I'm sure they would appreciate it.

Jordan
