Okay. I will check the IACA report, and try PXOR for m0 and buffering pw_1023.
On Mon, Jun 22, 2015 at 8:24 PM, chen chenm...@163.com wrote:
Right.
Some comments:
'psignb X, [pb_128]' is equivalent to 'psubb X, 0, X'. In AVX2 the second form
is faster; in SSE4 the choice depends on the IACA report.
In PMINSW, you buffer ZERO in M0 and use pw_1023 directly. Could you try
buffering pw_1023 and using PXOR to get ZERO?
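
The equivalence mentioned above can be checked with a small scalar model. This is only a sketch, not the x265 assembly: the helper names (psignb_lane, psubb_lane, negate_forms_agree) are hypothetical, each function models a single byte lane, and pb_128 is assumed to be a constant whose bytes are all 0x80 (that is, -128 as a signed byte).

```c
#include <stdint.h>

/* Scalar model of one byte lane of PSIGNB dst, src (hypothetical helper):
 * negate dst if src is negative, zero it if src is zero, else keep it.
 * Converting values above 127 back to int8_t assumes the usual
 * two's-complement wraparound of mainstream compilers. */
static int8_t psignb_lane(int8_t dst, int8_t src)
{
    if (src < 0)
        return (int8_t)(uint8_t)(0u - (uint8_t)dst);
    if (src == 0)
        return 0;
    return dst;
}

/* Scalar model of one byte lane of PSUBB: a - b with 8-bit wraparound. */
static int8_t psubb_lane(int8_t a, int8_t b)
{
    return (int8_t)(uint8_t)((uint8_t)a - (uint8_t)b);
}

/* Returns 1 if 'psignb X, [pb_128]' and '0 - X' agree for every byte value.
 * Every pb_128 byte is assumed to be -128, so PSIGNB always negates. */
static int negate_forms_agree(void)
{
    for (int v = -128; v <= 127; v++) {
        int8_t x = (int8_t)v;
        if (psignb_lane(x, (int8_t)-128) != psubb_lane(0, x))
            return 0;
    }
    return 1;
}
```

Both forms wrap the same way at -128, so the model agrees on all 256 inputs. Which form is faster is a separate question that the model does not address.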
At 2015-06-22
SAO_EO_0         8.97x    974.03     8740.81
SAO_EO_1        10.18x    492.67     5017.42
SAO_EO_1_2Rows  11.21x    900.82    10095.86
SAO_EO_2[0]      6.27x    207.22     1298.92
SAO_EO_2[1]      8.92x    555.20     4949.69
SAO_EO_3[0]      4.97x    236.72     1177.29