On Tue, 2022-07-12 at 10:19 +0530, Richard Henderson wrote:
> On 7/12/22 07:27, Ilya Leoshkevich wrote:
> > +/*
> > + * vfmin/vfmax code generation.
> > + */
> > +extern const char vfminmax_template[];
> > +extern const int vfminmax_template_size;
> > +extern const int vfminmax_offset;
> > +asm(".globl vfminmax_template\n"
> > +    "vfminmax_template:\n"
> > +    "vl %v25,0(%r3)\n"
> > +    "vl %v26,0(%r4)\n"
> > +    "0: vfmax %v24,%v25,%v26,2,0,0\n"
> > +    "vst %v24,0(%r2)\n"
> > +    "br %r14\n"
> > +    "1: .align 4\n"
> > +    ".globl vfminmax_template_size\n"
> > +    "vfminmax_template_size: .long 1b - vfminmax_template\n"
> > +    ".globl vfminmax_offset\n"
> > +    "vfminmax_offset: .long 0b - vfminmax_template\n");
> ...
> > +
> > +#define VFMIN 0xEE
> > +#define VFMAX 0xEF
> > +
> > +static void vfminmax(unsigned char *buf, unsigned int op,
> > +                     unsigned int m4, unsigned int m5, unsigned int m6,
> > +                     void *v1, const void *v2, const void *v3)
> > +{
> > +    memcpy(buf, vfminmax_template, vfminmax_template_size);
> > +    buf[vfminmax_offset + 3] = (m6 << 4) | m5;
> > +    buf[vfminmax_offset + 4] &= 0x0F;
> > +    buf[vfminmax_offset + 4] |= (m4 << 4);
> > +    buf[vfminmax_offset + 5] = op;
> > +    ((void (*)(void *, const void *, const void *))buf)(v1, v2, v3);
> > +}
>
> This works, of course. It could be simpler using EXECUTE, to store
> just the one instruction and not worry about an executable mapped
> page, but I guess it doesn't matter.
>
> Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
>
>
> r~

Thanks! I thought about this too, but EX/EXRL can only modify the second
byte of the target instruction, and I need to patch bytes 3-5 here.
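
To make the byte layout concrete, here is a minimal, host-independent sketch of the same patching step applied to a bare 6-byte instruction buffer instead of the executable template. The helper name patch_vrrc and the range checks are my own illustration and are not part of the patch; the field positions are the ones the quoted vfminmax() already relies on (byte 3 = M6/M5, high nibble of byte 4 = M4, byte 5 = opcode).

```c
#include <assert.h>
#include <stdint.h>

/* Opcode bytes from the patch; they live in the last byte of the insn. */
#define VFMIN 0xEE
#define VFMAX 0xEF

/*
 * Hypothetical helper (not in the patch): rewrite the mask fields and the
 * trailing opcode byte of a 6-byte vfmin/vfmax instruction in place.
 *
 *   byte 3: M6 in the high nibble, M5 in the low nibble
 *   byte 4: M4 in the high nibble; the low nibble is left untouched
 *   byte 5: opcode (VFMIN or VFMAX)
 *
 * Bytes 0-2 (leading opcode byte and register fields) are not modified.
 */
static void patch_vrrc(uint8_t insn[6], unsigned int op,
                       unsigned int m4, unsigned int m5, unsigned int m6)
{
    /* Each mask is a 4-bit field; the opcode is a single byte. */
    assert(m4 <= 0xF && m5 <= 0xF && m6 <= 0xF && op <= 0xFF);

    insn[3] = (uint8_t)((m6 << 4) | m5);
    insn[4] = (uint8_t)((insn[4] & 0x0F) | (m4 << 4));
    insn[5] = (uint8_t)op;
}
```

Since three separate bytes change, and none of them is byte 1, a single EXECUTE (which ORs only into the second byte of its target) cannot express this, hence the copy-and-patch template.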