<https://software.intel.com/en-us/articles/google-vp9-optimization>

Intel describes how they improved the performance of the VP9 decoder on 
Silvermont, a recent Atom core.

The meat of it is several non-obvious changes to the code that work around 
limitations of the instruction decoder.  The optimizations seem specific 
to Silvermont, but the article says:
        Testing against the future Intel Atom platforms, codenamed Goldmont and 
        Tremont, the VP9 optimizations delivered additional gains.

These optimizations did nothing for Core processors as far as I can tell.  
I don't know if they affect any AMD processors.

A RISC processor would not have a complex instruction decoder so this kind 
of hacking would not apply.  I will admit that there are "hazards" in RISC 
processors that are worth paying attention to when selecting and ordering 
instructions but these tend to be clearer.

Another thing in the paper:

        The overall results were outstanding. The team improved user-level 
        performance by up to 16 percent (6.2 frames per second) in 64-bit 
        mode and by about 12 percent (1.65 frames per second) in 32-bit 
        mode. This testing included evaluation of 32-bit and 64-bit GCC 
        and Intel® compilers, and concluded that the Intel compilers 
        delivered the best optimizations by far for Intel® Atom™ 
        processors. When you multiply this improvement by millions of 
        viewers and thousands of videos, it is significant. The WebM team 
        at Google also recognized this performance gain as extremely 
        significant. Frank Gilligan, a Google engineering manager, 
        responded to the team’s success: “Awesome. It looks good. I can’t 
        wait to try everything out.” Testing against the future Intel Atom 
        platforms, codenamed Goldmont and Tremont, the VP9 optimizations 
        delivered additional gains.

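Consider 64-bit.  Taking the percentage as relative to the unoptimized 
decoder, if a 16% improvement is 6.2 f/s, then the baseline was 
6.2 / 0.16 = 38.75 f/s and the optimized decoder runs at about 44.95 f/s.  
Not great, but OK.

For 32-bit, 12% is 1.65 f/s, so the baseline was 1.65 / 0.12 = 13.75 f/s 
and the optimized decoder manages about 15.4 f/s.  Totally useless, I 
think.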
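
The arithmetic is easy to check with a few lines of Python.  This is only a 
sketch: it assumes the quoted percentages are relative to the unoptimized 
decoder, which the article does not state explicitly, and the function name 
is made up for illustration.

```python
def implied_rates(gain_fps, gain_fraction):
    """Return (baseline, optimized) frame rates implied by a relative gain.

    Assumption: gain_fraction is measured against the unoptimized baseline,
    so gain_fps = gain_fraction * baseline.
    """
    baseline = gain_fps / gain_fraction
    optimized = baseline + gain_fps
    return baseline, optimized


# Figures quoted in the article: (mode, gain in f/s, gain as a fraction).
quoted = (("64-bit", 6.2, 0.16), ("32-bit", 1.65, 0.12))

for mode, gain_fps, gain_fraction in quoted:
    baseline, optimized = implied_rates(gain_fps, gain_fraction)
    print(f"{mode}: baseline {baseline:.2f} f/s, optimized {optimized:.2f} f/s")
```

Since the article says "up to 16 percent" and "about 12 percent", the 
results are rough estimates at best.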

Quite interesting how different these two are.
---
Talk Mailing List
[email protected]
https://gtalug.org/mailman/listinfo/talk
