On Sat, May 21, 2016 at 01:33:50PM -0400, D. Hugh Redelmeier wrote:
> <https://software.intel.com/en-us/articles/google-vp9-optimization>
>
> Intel describing how they improved the performance of the VP9 decoder for
> Silvermont, a recent Atom core.
>
> The meat is several not-really-obvious changes to the code to overcome
> limitations of the instruction decoder. The optimizations seem particular
> to Silvermont but the article says:
>     Testing against the future Intel Atom platforms, codenamed Goldmont and
>     Tremont, the VP9 optimizations delivered additional gains.
>
> These optimizations did nothing for Core processors as far as I can tell.
> I don't know if it affects any AMD processors.
>
> A RISC processor would not have a complex instruction decoder so this kind
> of hacking would not apply. I will admit that there are "hazards" in RISC
> processors that are worth paying attention to when selecting and ordering
> instructions but these tend to be clearer.
>
> Another thing in the paper:
>
>     The overall results were outstanding. The team improved user-level
>     performance by up to 16 percent (6.2 frames per second) in 64-bit
>     mode and by about 12 percent (1.65 frames per second) in 32-bit
>     mode. This testing included evaluation of 32-bit and 64-bit GCC
>     and Intel® compilers, and concluded that the Intel compilers
>     delivered the best optimizations by far for Intel® Atom™
>     processors. When you multiply this improvement by millions of
>     viewers and thousands of videos, it is significant. The WebM team
>     at Google also recognized this performance gain as extremely
>     significant. Frank Gilligan, a Google engineering manager,
>     responded to the team’s success: “Awesome. It looks good. I can’t
>     wait to try everything out.” Testing against the future Intel Atom
>     platforms, codenamed Goldmont and Tremont, the VP9 optimizations
>     delivered additional gains.
>
> Consider 64-bit. If 16% improvement is 6.2 f/s, then the remaining 84%
> would be 32.55 f/s. Not great, but OK.
>
> For 32-bit, 12% is 1.65 f/s; the remaining 88% would be 12 f/s. Totally
> useless, I think.
>
> Quite interesting how different these two are.
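The frame-rate arithmetic quoted above is easy to check with a few lines of Python. This is only a sketch: the article does not say whether "16 percent (6.2 frames per second)" is measured against the old or the new frame rate, so both readings are computed, and the function names are made up for illustration.

```python
def rates_if_share_of_final(gain_fps, gain_fraction):
    """Assume the gain is a fraction of the FINAL rate (the reading used
    in the quoted message). Returns (before, after) in frames/second."""
    after = gain_fps / gain_fraction
    return after - gain_fps, after


def rates_if_share_of_baseline(gain_fps, gain_fraction):
    """Assume the gain is a fraction of the BASELINE rate (the usual
    meaning of "improved by N percent"). Returns (before, after)."""
    before = gain_fps / gain_fraction
    return before, before + gain_fps


if __name__ == "__main__":
    # Figures taken from the quoted Intel article.
    cases = (("64-bit", 6.2, 0.16), ("32-bit", 1.65, 0.12))
    for label, gain_fps, fraction in cases:
        b1, a1 = rates_if_share_of_final(gain_fps, fraction)
        b2, a2 = rates_if_share_of_baseline(gain_fps, fraction)
        print(f"{label}: share-of-final    {b1:.2f} -> {a1:.2f} f/s")
        print(f"{label}: share-of-baseline {b2:.2f} -> {a2:.2f} f/s")
```

Under the first reading the 64-bit build goes from about 32.55 to 38.75 f/s and the 32-bit build from about 12.1 to 13.75 f/s, which matches the numbers in the quoted message. Under the second reading the starting points are 38.75 and 13.75 f/s. Either way the 32-bit build runs at well under half the 64-bit frame rate.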
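64-bit mode has twice the registers (16 general-purpose and 16 SSE
registers, versus 8 of each in 32-bit mode), which for a lot of code
is a huge difference. That is the biggest improvement AMD made to x86.
Effectively scrapping x87 (ordinary floating point goes through SSE2
instead) is probably number 2.

-- 
Len Sorensen
---
Talk Mailing List
[email protected]
https://gtalug.org/mailman/listinfo/talk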
