On 7/22/20 2:16 AM, frank.ch...@sifive.com wrote:
>  # Vector ordered and unordered reduction sum
> -vfredsum_vs     0000-1 . ..... ..... 001 ..... 1010111 @r_vm
> +vfredsum_vs     000001 . ..... ..... 001 ..... 1010111 @r_vm
> +vfredosum_vs    000011 . ..... ..... 001 ..... 1010111 @r_vm

"The vfredosum instruction is a valid implementation of the vfredsum instruction." Which is exactly what we're doing here. Why should we treat them differently? There is no parallelism that we can exploit in tcg, unlike in hardware. r~