Re: [Intel-gfx] [PATCH i-g-t v5 3/6] lib: Add function to hash a framebuffer

2019-01-16 Thread Chris Wilson
Quoting Liviu Dudau (2019-01-16 11:20:09)
> On Tue, Jan 15, 2019 at 06:47:47PM +, Chris Wilson wrote:
> > Quoting Liviu Dudau (2019-01-15 17:47:44)
> > > +int igt_fb_get_crc(struct igt_fb *fb, igt_crc_t *crc)
> > > +{
> > > +#define FNV1a_OFFSET_BIAS 2166136261
> > > +#define FNV1a_PRIME 16777619
> > > +   uint32_t hash;
> > > +   void *map;
> > > +   char *ptr, *line = NULL;
> > > +   int x, y, cpp = igt_drm_format_to_bpp(fb->drm_format) / 8;
> > > +   uint32_t stride = calc_plane_stride(fb, 0);
> > > +
> > > +   if (fb->is_dumb)
> > > +   map = kmstest_dumb_map_buffer(fb->fd, fb->gem_handle, 
> > > fb->size,
> > > + PROT_READ);
> > > +   else
> > > +   map = gem_mmap__gtt(fb->fd, fb->gem_handle, fb->size,
> > > +   PROT_READ);
> > > +   ptr = map;
> > > +
> > > +   /*
> > > +* Framebuffers are often uncached, which can make byte-wise 
> > > accesses
> > > +* very slow. We copy each line of the FB into a local buffer to 
> > > speed
> > > +* up the hashing.
> > > +*/
> > > +   line = malloc(stride);
> > > +   if (!line) {
> > > +   munmap(map, fb->size);
> > > +   return -ENOMEM;
> > > +   }
> > > +
> > > +   hash = FNV1a_OFFSET_BIAS;
> > > +
> > > +   for (y = 0; y < fb->height; y++, ptr += stride) {
> > > +
> > > +   memcpy(line, ptr, stride);
> > 
> > igt_memcpy_from_wc() for the reasons cited above.
> > -Chris
> 
> Hi Chris,
> 
> Thanks for pointing out the function, I was not aware of it!
> 
> Now, looking at the implementation, and ignoring the fact that it is in a
> file called igt_x86.c, it looks to me that it will end up calling memcpy
> anyway for Arm drivers. Not being a GCC expert, I am wondering if the
> ifunc() wrapper around resolve_memcpy_from_wc() will not actually
> prevent the compiler from choosing an optimised version of memcpy for Arm.
> 
> I can refresh the patch if you think it is safe to use igt_memcpy_from_wc() 
> for
> non-x86 architectures.

You are memcpy'ing from uncached, there is no general purpose optimised
memcpy! :) The compiler builtin (e.g. rep movb for x86) may be the worst
thing you could do ;) For Arm, it will fallback to the general memcpy
routine which will do it own ifunc, just will not be inlined.

It should be safe, and it should be no worse than a call to memcpy with
variable arguments.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t v5 3/6] lib: Add function to hash a framebuffer

2019-01-16 Thread Liviu Dudau
On Tue, Jan 15, 2019 at 06:47:47PM +, Chris Wilson wrote:
> Quoting Liviu Dudau (2019-01-15 17:47:44)
> > +int igt_fb_get_crc(struct igt_fb *fb, igt_crc_t *crc)
> > +{
> > +#define FNV1a_OFFSET_BIAS 2166136261
> > +#define FNV1a_PRIME 16777619
> > +   uint32_t hash;
> > +   void *map;
> > +   char *ptr, *line = NULL;
> > +   int x, y, cpp = igt_drm_format_to_bpp(fb->drm_format) / 8;
> > +   uint32_t stride = calc_plane_stride(fb, 0);
> > +
> > +   if (fb->is_dumb)
> > +   map = kmstest_dumb_map_buffer(fb->fd, fb->gem_handle, 
> > fb->size,
> > + PROT_READ);
> > +   else
> > +   map = gem_mmap__gtt(fb->fd, fb->gem_handle, fb->size,
> > +   PROT_READ);
> > +   ptr = map;
> > +
> > +   /*
> > +* Framebuffers are often uncached, which can make byte-wise 
> > accesses
> > +* very slow. We copy each line of the FB into a local buffer to 
> > speed
> > +* up the hashing.
> > +*/
> > +   line = malloc(stride);
> > +   if (!line) {
> > +   munmap(map, fb->size);
> > +   return -ENOMEM;
> > +   }
> > +
> > +   hash = FNV1a_OFFSET_BIAS;
> > +
> > +   for (y = 0; y < fb->height; y++, ptr += stride) {
> > +
> > +   memcpy(line, ptr, stride);
> 
> igt_memcpy_from_wc() for the reasons cited above.
> -Chris

Hi Chris,

Thanks for pointing out the function, I was not aware of it!

Now, looking at the implementation, and ignoring the fact that it is in a
file called igt_x86.c, it looks to me that it will end up calling memcpy
anyway for Arm drivers. Not being a GCC expert, I am wondering if the
ifunc() wrapper around resolve_memcpy_from_wc() will not actually
prevent the compiler from choosing an optimised version of memcpy for Arm.

I can refresh the patch if you think it is safe to use igt_memcpy_from_wc() for
non-x86 architectures.

Best regards,
Liviu

-- 

| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---
¯\_(ツ)_/¯
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t v5 3/6] lib: Add function to hash a framebuffer

2019-01-15 Thread Chris Wilson
Quoting Liviu Dudau (2019-01-15 17:47:44)
> +int igt_fb_get_crc(struct igt_fb *fb, igt_crc_t *crc)
> +{
> +#define FNV1a_OFFSET_BIAS 2166136261
> +#define FNV1a_PRIME 16777619
> +   uint32_t hash;
> +   void *map;
> +   char *ptr, *line = NULL;
> +   int x, y, cpp = igt_drm_format_to_bpp(fb->drm_format) / 8;
> +   uint32_t stride = calc_plane_stride(fb, 0);
> +
> +   if (fb->is_dumb)
> +   map = kmstest_dumb_map_buffer(fb->fd, fb->gem_handle, 
> fb->size,
> + PROT_READ);
> +   else
> +   map = gem_mmap__gtt(fb->fd, fb->gem_handle, fb->size,
> +   PROT_READ);
> +   ptr = map;
> +
> +   /*
> +* Framebuffers are often uncached, which can make byte-wise accesses
> +* very slow. We copy each line of the FB into a local buffer to speed
> +* up the hashing.
> +*/
> +   line = malloc(stride);
> +   if (!line) {
> +   munmap(map, fb->size);
> +   return -ENOMEM;
> +   }
> +
> +   hash = FNV1a_OFFSET_BIAS;
> +
> +   for (y = 0; y < fb->height; y++, ptr += stride) {
> +
> +   memcpy(line, ptr, stride);

igt_memcpy_from_wc() for the reasons cited above.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx