On 09/02/2015 07:31 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <r...@twiddle.net>
> ---
>  target-tilegx/Makefile.objs |  2 +-
>  target-tilegx/helper.h      |  4 +++
>  target-tilegx/simd_helper.c | 63 +++++++++++++++++++++++++++++++++++++++++++++
>  target-tilegx/translate.c   | 17 +++++++++++-
>  4 files changed, 84 insertions(+), 2 deletions(-)
>  create mode 100644 target-tilegx/simd_helper.c

Naive question:

> +
> +uint64_t helper_v1shl(uint64_t a, uint64_t b)
> +{
> +    uint64_t r = 0;
> +    int i;
> +
> +    b &= 7;
> +    for (i = 0; i < 64; i += 8) {
> +        uint64_t m = 0xffULL << i;
> +        r |= ((a & m) << b) & m;
> +    }

Is it any more efficient to use multiplies instead of looping, as in:

uint64_t m;

b &= 7;
m = 0x0101010101010101ULL * ((1 << (8 - b)) - 1);
return (a & m) << b;
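
To spell out why the multiply works: (1 << (8 - b)) - 1 is the set of bits in one byte that survive a left shift by b, and multiplying it by 0x0101010101010101 copies it into each of the eight byte lanes. Since the per-byte mask never exceeds 0xff, the copies cannot carry into each other. A minimal sketch, where byte_mask_mul and v1shl_mul are hypothetical names for illustration and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replicate the per-byte "keep" mask into all 8 lanes. */
static uint64_t byte_mask_mul(unsigned b)
{
    assert(b < 8);
    /* The low (8 - b) bits of a byte survive a left shift by b. */
    uint64_t lane = (1u << (8 - b)) - 1;
    /* lane <= 0xff, so the eight copies never carry into each other. */
    return 0x0101010101010101ULL * lane;
}

/* Hypothetical name; same computation as the multiply variant above. */
static uint64_t v1shl_mul(uint64_t a, uint64_t b)
{
    b &= 7;
    /* Clear the bits that would cross a byte boundary, then shift once. */
    return (a & byte_mask_mul((unsigned)b)) << b;
}
```

For instance, b = 1 yields the mask 0x7f7f7f7f7f7f7f7f, so the top bit of every byte is dropped before the single 64-bit shift.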

Or if multiplies are bad, what about straight-line expansion of the
mask, as in:

uint64_t m;

b &= 7;
m = (1 << (8 - b)) - 1;
m |= m << 32;
m |= m << 16;
m |= m << 8;
return (a & m) << b;
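
A quick way to check that the straight-line version computes the same result as the loop is to compare the two on a few inputs. The sketch below assumes the helper returns r after the loop (the quoted hunk stops before that point). The names v1shl_loop, v1shl_expand and count_mismatches are hypothetical, and it says nothing about which form is faster:

```c
#include <stdint.h>

/* Reference: the byte-at-a-time loop from the quoted patch
 * (the final "return r" is assumed). */
static uint64_t v1shl_loop(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    b &= 7;
    for (i = 0; i < 64; i += 8) {
        uint64_t m = 0xffULL << i;
        r |= ((a & m) << b) & m;
    }
    return r;
}

/* Straight-line variant: build the mask by doubling, then shift once. */
static uint64_t v1shl_expand(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    m = (1u << (8 - b)) - 1;   /* per-byte mask in byte 0 */
    m |= m << 32;              /* bytes 0 and 4 */
    m |= m << 16;              /* bytes 0, 2, 4, 6 */
    m |= m << 8;               /* all eight bytes */
    return (a & m) << b;
}

/* Count the sample (a, b) pairs on which the two versions disagree. */
static int count_mismatches(void)
{
    static const uint64_t samples[] = {
        0x0000000000000000ULL, 0xffffffffffffffffULL,
        0x0123456789abcdefULL, 0x8080808080808080ULL,
        0x0102040810204080ULL, 0xdeadbeefcafef00dULL,
    };
    int bad = 0;
    unsigned s;
    uint64_t b;

    for (s = 0; s < sizeof(samples) / sizeof(samples[0]); s++) {
        /* b runs past 7 to exercise the "b &= 7" truncation as well. */
        for (b = 0; b < 16; b++) {
            if (v1shl_loop(samples[s], b) != v1shl_expand(samples[s], b)) {
                bad++;
            }
        }
    }
    return bad;
}
```

Calling count_mismatches() from a small test program should return 0 if the two forms agree on every sample.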

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
