Added optimizations for several bilinear fast paths:
- src__8_
- src__8_0565
- src_0565_8_x888
- src_0565_8_0565
- add__8_
- src__
- src__0565
- src_0565_
- src_0565_0565
- over__
- add__
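
For context, a bilinear fast path produces each destination pixel by blending the four source pixels around the sampling point. The scalar C sketch below illustrates only that blend. It is not the optimized code from this patch: the names lerp_channel and bilinear_8888 are hypothetical, and the 8-bit weights in [0, 256] with truncating division are assumptions, since the message does not state the precision or rounding actually used.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one 8-bit channel from its four neighbors.
 * wx and wy are the horizontal and vertical weights in [0, 256]
 * (assumed precision, chosen for illustration only). */
static uint32_t lerp_channel(uint32_t tl, uint32_t tr,
                             uint32_t bl, uint32_t br,
                             uint32_t wx, uint32_t wy)
{
    uint32_t top    = tl * (256 - wx) + tr * wx;  /* scaled by 256 */
    uint32_t bottom = bl * (256 - wx) + br * wx;  /* scaled by 256 */

    /* Vertical blend adds a second factor of 256; shift both out.
     * The shift truncates instead of rounding. */
    return (top * (256 - wy) + bottom * wy) >> 16;
}

/* Blend four packed 32-bit pixels (four 8-bit channels each),
 * handling every channel independently. Hypothetical helper. */
uint32_t bilinear_8888(uint32_t tl, uint32_t tr,
                       uint32_t bl, uint32_t br,
                       uint32_t wx, uint32_t wy)
{
    uint32_t out = 0;
    int shift;

    assert(wx <= 256 && wy <= 256);

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t c = lerp_channel((tl >> shift) & 0xff,
                                  (tr >> shift) & 0xff,
                                  (bl >> shift) & 0xff,
                                  (br >> shift) & 0xff,
                                  wx, wy);
        out |= c << shift;
    }
    return out;
}
```

With both weights at 0 the result is the top-left pixel unchanged, and with wx = 256 and wy = 0 it is the top-right pixel. An optimized fast path would typically process several channels or pixels per instruction instead of looping over channels like this.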
Benchmark results (using lowlevel-blt-bench) on
From: Nemanja Lukic nemanja.lu...@rt-rk.com
Performance numbers before/after on MIPS-74kc @ 1GHz:
lowlevel-blt-bench -b
Reference (before):
src__8_ = L1: 6.37 L2: 6.08 M: 5.46 ( 32.57%) HT: 4.64 VT: 4.61 R: 4.52 RT: 2.85 ( 23Kops/s)
src__8_0565 =
Performance numbers before/after on MIPS-74kc @ 1GHz:
lowlevel-blt-bench -b
Reference (before):
src__ = L1: 8.18 L2: 7.79 M: 6.32 ( 33.51%) HT: 5.78 VT: 5.70 R: 5.61 RT: 3.79 ( 29Kops/s)
src__0565 =