On Tue, Oct 09, 2018 at 05:30:25PM +0200, Alberto Garcia wrote:
> >> >      for (i = 0; i < lim; i++) {
> >> > -        xts_tweak_encdec(datactx, decfunc, src, dst, (uint8_t *)&T);
> >> > +        xts_uint128 S, D;
> >> > +
> >> > +        memcpy(&S, src, XTS_BLOCK_SIZE);
> >> > +        xts_tweak_encdec(datactx, decfunc, &S, &D, &T);
> >> > +        memcpy(dst, &D, XTS_BLOCK_SIZE);
> >> 
> >> Why do you need S and D?
> >
> > I think src & dst pointers can't be guaranteed to be aligned
> > sufficiently for int64 operations, if we just cast from uint8_t *.
> 
> I see. I did a quick test without the memcpy() calls and it doesn't seem
> to have a visible effect on performance, but if it turns out that it
> does then maybe this is worth investigating further. I suspect all
> buffers received by this code are allocated with qemu_try_blockalign()
> anyway, so it should be safe.

The extra memcpy() calls certainly had a perf impact when I added
them, so if we can determine that we can safely do without, that
would be desirable.
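
For reference, a minimal standalone sketch of the two things being
discussed: bouncing each block through aligned locals with memcpy(), and
checking whether a caller's pointer is already aligned enough to skip that.
This is not the actual patch. The layout of xts_uint128 and the helpers
xts_xor_block(), xts_is_aligned() and xts_xor_buf() are assumptions made up
for illustration, and a plain XOR with the tweak stands in for the real
encrypt/decrypt step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XTS_BLOCK_SIZE 16

/* Assumed layout, for illustration only: two 64-bit halves. */
typedef struct {
    uint64_t a;
    uint64_t b;
} xts_uint128;

/* Hypothetical helper: XOR one block with the tweak using 64-bit ops. */
void xts_xor_block(xts_uint128 *d, const xts_uint128 *s,
                   const xts_uint128 *t)
{
    d->a = s->a ^ t->a;
    d->b = s->b ^ t->b;
}

/* Hypothetical helper: non-zero if p is aligned enough for xts_uint128. */
int xts_is_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(xts_uint128)) == 0;
}

/*
 * Works for any alignment of src/dst: each block is copied into an
 * aligned local, processed there, and copied back out.
 */
void xts_xor_buf(uint8_t *dst, const uint8_t *src, size_t len,
                 const xts_uint128 *t)
{
    size_t i;

    assert(len % XTS_BLOCK_SIZE == 0);
    for (i = 0; i < len; i += XTS_BLOCK_SIZE) {
        xts_uint128 S, D;

        memcpy(&S, src + i, XTS_BLOCK_SIZE);
        xts_xor_block(&D, &S, t);
        memcpy(dst + i, &D, XTS_BLOCK_SIZE);
    }
}
```

If every caller really does pass block-aligned buffers, a check like
xts_is_aligned() could be asserted at the entry point while we evaluate
dropping the copies, rather than relying on the assumption silently.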


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
