On 09/02/2015 07:31 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <r...@twiddle.net>
> ---
>  target-tilegx/Makefile.objs |  2 +-
>  target-tilegx/helper.h      |  4 +++
>  target-tilegx/simd_helper.c | 63 +++++++++++++++++++++++++++++++++++++++++++++
>  target-tilegx/translate.c   | 17 +++++++++++-
>  4 files changed, 84 insertions(+), 2 deletions(-)
>  create mode 100644 target-tilegx/simd_helper.c

Naive question:

> +
> +uint64_t helper_v1shl(uint64_t a, uint64_t b)
> +{
> +    uint64_t r = 0;
> +    int i;
> +
> +    b &= 7;
> +    for (i = 0; i < 64; i += 8) {
> +        uint64_t m = 0xffULL << i;
> +        r |= ((a & m) << b) & m;
> +    }

Is it any more efficient to use multiplies instead of looping, as in:

uint64_t m;
b &= 7;
m = 0x0101010101010101ULL * ((1 << (8 - b)) - 1);
return (a & m) << b;

Or if multiplies are bad, what about straight-line expansion of the
mask, as in:

uint64_t m;
b &= 7;
m = (1 << (8 - b)) - 1;
m |= m << 32;
m |= m << 16;
m |= m << 8;
return (a & m) << b;

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
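For anyone who wants to check that the three forms compute the same result, here is a minimal standalone sketch. It does not measure speed, only equivalence. The names v1shl_loop, v1shl_mul, v1shl_expand and v1shl_variants_agree are hypothetical and not part of the patch. The loop body is copied from the quoted hunk with a return added, and the two alternatives use 1ULL rather than 1 for the mask, a cosmetic change since 8 - b never exceeds 8.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form, as in the quoted patch: shift each byte lane separately. */
uint64_t v1shl_loop(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    b &= 7;
    for (i = 0; i < 64; i += 8) {
        uint64_t m = 0xffULL << i;
        r |= ((a & m) << b) & m;
    }
    return r;
}

/* Multiply form: replicate the low (8 - b)-bit mask into all eight lanes. */
uint64_t v1shl_mul(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    m = 0x0101010101010101ULL * ((1ULL << (8 - b)) - 1);
    return (a & m) << b;
}

/* Straight-line form: replicate the mask with three shift-and-or steps. */
uint64_t v1shl_expand(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    m = (1ULL << (8 - b)) - 1;
    m |= m << 32;
    m |= m << 16;
    m |= m << 8;
    return (a & m) << b;
}

/* Returns 1 if all three forms agree on a few sample inputs, else 0. */
int v1shl_variants_agree(void)
{
    static const uint64_t samples[] = {
        0x0000000000000000ULL,
        0xffffffffffffffffULL,
        0x0123456789abcdefULL,
        0x80ff7f0180ff7f01ULL,
    };
    unsigned int s;
    uint64_t b;

    for (s = 0; s < sizeof(samples) / sizeof(samples[0]); s++) {
        /* b runs past 7 to exercise the "b &= 7" masking as well. */
        for (b = 0; b < 16; b++) {
            uint64_t want = v1shl_loop(samples[s], b);
            if (v1shl_mul(samples[s], b) != want ||
                v1shl_expand(samples[s], b) != want) {
                return 0;
            }
        }
    }
    return 1;
}
```

Both alternatives rely on the same observation: masking off the top b bits of every byte first means the single 64-bit shift cannot carry bits across a lane boundary.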