On Thu, May 18, 2017 at 10:46:32AM +0100, Chris Wilson wrote:
> When userspace is doing most of the work, avoiding relocs (using
> NO_RELOC) and opting out of implicit synchronisation (using ASYNC), we
> still spend a lot of time processing the arrays in execbuf, even though
> we now should have nothing to do most of the time. One issue that
> becomes readily apparent in profiling anv is that iterating over the
> large execobj is unfriendly to the loop prefetchers of the CPU and it
> much prefers iterating over a pair of arrays rather than one big array.
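
For anyone following along, the one-big-array versus pair-of-arrays
contrast can be sketched roughly as below. This is only an illustration:
struct big_entry, its fields and both helpers are made-up names, not the
real execobj layout or any i915 code, and which form is actually faster
is something to profile rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wide record standing in for a large per-object entry. */
struct big_entry {
	uint64_t offset;
	uint64_t pad[6];	/* fields the hot loop never reads */
	uint32_t flags;
};

/* One big array: every iteration strides over a whole record. */
static uint64_t sum_flagged_aos(const struct big_entry *e, size_t n,
				uint32_t flag)
{
	uint64_t sum = 0;

	assert(e != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (e[i].flags & flag)
			sum += e[i].offset;
	return sum;
}

/* Pair of arrays: the two fields the loop needs are each packed densely. */
static uint64_t sum_flagged_split(const uint64_t *offsets,
				  const uint32_t *flags, size_t n,
				  uint32_t flag)
{
	uint64_t sum = 0;

	assert((offsets != NULL && flags != NULL) || n == 0);
	for (size_t i = 0; i < n; i++)
		if (flags[i] & flag)
			sum += offsets[i];
	return sum;
}
```

Both helpers compute the same result; only the memory access pattern
differs.
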
Joonas pointed out that, given the alignment of vma in our slab, I have a
few bits to spare in the eb->vma pointers, and we now only use a few bits
for state tracking of vma through execbuf. He does like bit packing
into pointers very much...
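
A minimal sketch of the general trick, assuming the pointed-to object is
at least 4-byte aligned so the low two bits of its address are always
zero. The struct, flag names and helpers here are hypothetical and only
illustrate the idea; they are not the execbuf code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags; assumes at least 4-byte alignment of the object. */
#define VMA_FLAG_MASK	((uintptr_t)0x3)
#define VMA_FLAG_PINNED	((uintptr_t)0x1)
#define VMA_FLAG_FENCED	((uintptr_t)0x2)

struct vma { int dummy; };	/* stand-in for the real object */

/* Store the flags in the low bits that alignment guarantees are zero. */
static inline uintptr_t vma_pack(struct vma *vma, uintptr_t flags)
{
	assert(((uintptr_t)vma & VMA_FLAG_MASK) == 0);
	assert((flags & ~VMA_FLAG_MASK) == 0);
	return (uintptr_t)vma | flags;
}

/* Mask the flag bits off again to recover the original pointer. */
static inline struct vma *vma_unpack_ptr(uintptr_t packed)
{
	return (struct vma *)(packed & ~VMA_FLAG_MASK);
}

static inline uintptr_t vma_unpack_flags(uintptr_t packed)
{
	return packed & VMA_FLAG_MASK;
}
```

The cost is a mask on every dereference; the gain is that the state
travels with the pointer instead of needing a separate lookup.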
Chris Wilson, Intel Open Source Technology Centre
Intel-gfx mailing list