Re: [Mesa-dev] [PATCH 02/13] i965: Allow passing target_bo=NULL to brw_emit_reloc()

2017-07-20 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-19 21:08:23) > On Wednesday, July 19, 2017 3:09:10 AM PDT Chris Wilson wrote: > > Sometimes we want to emit a relocation to a NULL surface when > > constructing the batch. If we push the NULL handling into the common > > brw_emit_reloc()

[Mesa-dev] [PATCH 02/13] i965: Allow passing target_bo=NULL to brw_emit_reloc()

2017-07-19 Thread Chris Wilson
Sometimes we want to emit a relocation to a NULL surface when constructing the batch. If we push the NULL handling into the common brw_emit_reloc() we can make the batch construction itself more readable. On the other hand, we often test for the existence of the bo separately and so would
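
The idea of handling a NULL target inside the common helper can be sketched as follows. This is a minimal illustration, not the patch itself: the struct and function names are hypothetical, and it assumes that a NULL target records no relocation and yields just the delta.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real i965 types. */
struct sketch_bo {
   uint64_t offset;          /* address the buffer is presumed to occupy */
};

struct sketch_batch {
   unsigned reloc_count;     /* relocation entries recorded so far */
};

/*
 * Return the value to write into the batch for (target + delta).
 * Assumption: a NULL target records no relocation and yields delta alone,
 * so callers no longer need their own NULL check.
 */
static uint64_t
sketch_emit_reloc(struct sketch_batch *batch,
                  const struct sketch_bo *target, uint32_t delta)
{
   if (target == NULL)
      return delta;

   batch->reloc_count++;
   return target->offset + delta;
}
```

With the check inside the helper, call sites can pass whatever pointer they hold without branching first.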

[Mesa-dev] [PATCH 08/13] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT

2017-07-19 Thread Chris Wilson
: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid a post-processing loop to fixup the relocations. v3: Move kernel probing from context creation to screen init. Use batch->use_exec_lut as it is more descriptive of what's going on (Daniel) Signed-off-by: Chris Wilson <ch...

[Mesa-dev] [PATCH 03/13] i965: Refactor __gen_combine_address()

2017-07-19 Thread Chris Wilson
Since brw_emit_reloc() now does the test for target==NULL itself, we can remove the test from __gen_combine_address() and call brw_emit_reloc() directly. --- src/mesa/drivers/dri/i965/genX_state_upload.c | 21 + 1 file changed, 5 insertions(+), 16 deletions(-) diff --git

[Mesa-dev] [PATCH 04/13] i965: Always use the pre-computed offset for the relocation entry

2017-07-19 Thread Chris Wilson
We must be careful to only compute the address once based on the per-context information (rather than accessing the unlocked global bo->offset64) so that the value in the batch does match the reloc.presumed_offset we declare to the kernel. Otherwise, highly unlikely, but we may see GPU hangs in
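
The invariant described here, that the value written into the batch must agree with the presumed offset reported to the kernel, can be sketched like this. The types and names are hypothetical, and the shared offset is modelled as a plain volatile variable rather than the driver's real bookkeeping.

```c
#include <stdint.h>

/* Hypothetical relocation record; only the fields needed for the sketch. */
struct sketch_reloc_entry {
   uint64_t presumed_offset; /* what we tell the kernel we assumed */
   uint32_t delta;           /* offset within the target buffer */
};

/*
 * Sample the (possibly concurrently updated) offset exactly once, then use
 * that one value both for the relocation record and for the batch contents.
 * Reading it twice could let the two disagree.
 */
static uint64_t
sketch_fill_reloc(struct sketch_reloc_entry *entry,
                  const volatile uint64_t *shared_offset, uint32_t delta)
{
   const uint64_t offset = *shared_offset; /* the only read */

   entry->presumed_offset = offset;
   entry->delta = delta;
   return offset + delta;    /* value to store in the batch */
}
```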

[Mesa-dev] [PATCH 07/13] i965: Move add_exec_bo()

2017-07-19 Thread Chris Wilson
To avoid a forward declaration in the next patch, move the definition of add_exec_bo() earlier. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@int

[Mesa-dev] [PATCH 12/13] i965: Per-context bo tracking for relocations

2017-07-19 Thread Chris Wilson
The kernel creates a unique binding for each instance of a GEM handle in the per-process GTT. Keeping a single bo->offset64 used by multiple contexts will therefore cause a lot of migration and relocation stalls when the bo are reused between contexts. Not a common problem, but when it does occur

[Mesa-dev] [PATCH 11/13] i965: Reuse intel_batchbuffer_reset_to_saved() for intel_batchbuffer_free()

2017-07-19 Thread Chris Wilson
Rather than have a separate implementation that discards all of the execobjects for the rare event of destroying the context, recast it as an operation to reset to the saved state of no batch. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 13 +++-- 1 file changed, 7 insertions(+), 6

[Mesa-dev] [PATCH 13/13] i965: Reduce passing 2x32b of reloc_domains to 2 bits

2017-07-19 Thread Chris Wilson
The kernel only cares about whether the object is to be written to or not, so it reduces (reloc.read_domains, reloc.write_domain) down to just !!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read those relocs and instead userspace has to pass that information in the

[Mesa-dev] [PATCH 10/13] i965: Push no_hw down to the execbuf call

2017-07-19 Thread Chris Wilson
For the common path where we want to execute the batch, if we push the no_hw detection down to the execbuf we can eliminate one loop over all the execobjects. For the less common path where we don't want to execute the batch, no_hw was leaving out_fence uninitialised. Cc: Kenneth Graunke

[Mesa-dev] [PATCH 01/13] i965: Assert that 64b immediate writes are correctly aligned

2017-07-19 Thread Chris Wilson
The HW can only write a 64b immediate into a 64b aligned address, so add an assert. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
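
A minimal sketch of such an alignment assert, with hypothetical helper names (the actual one-line change lives in intel_batchbuffer.c and is not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the offset is a multiple of 8 bytes (64 bits). */
static bool
sketch_is_qword_aligned(uint32_t offset)
{
   return (offset & 7u) == 0;
}

/*
 * Hypothetical wrapper: validate the destination of a 64b immediate write
 * before it is emitted. In a release build (NDEBUG) the assert vanishes.
 */
static uint32_t
sketch_checked_imm64_offset(uint32_t offset)
{
   assert(sketch_is_qword_aligned(offset));
   return offset;
}
```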

[Mesa-dev] [PATCH 09/13] i965: Always create the batch with the batch object in the first execobject slot

2017-07-19 Thread Chris Wilson
Even if we are using older kernels that do not accept the batch in the first slot, we can simplify our code by creating the batch with itself in the first slot and moving it to the end on execbuf submission. --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 70 --- 1
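
One way to picture "batch first while building, batch last on submission" is the sketch below. It works on a bare array of handles with a hypothetical function name, and it assumes the move is done as a swap with the final slot; the snippet does not say whether the real code swaps or shifts.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * During construction the batch's own handle lives in slot 0. For a kernel
 * that expects the batch in the last slot, exchange slot 0 with the final
 * slot just before submission (assumed strategy, see lead-in).
 */
static void
sketch_move_batch_last(uint32_t *handles, size_t count)
{
   if (count < 2)
      return;

   const uint32_t batch = handles[0];
   handles[0] = handles[count - 1];
   handles[count - 1] = batch;
}
```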

[Mesa-dev] [PATCH 05/13] i965: Track last location of bo used for the batch

2017-07-19 Thread Chris Wilson
ces() v3: Reset bo->index on creation (Daniel) Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@intel.com> Cc: Daniel Vetter <daniel.vet...@ffwll.c

[Mesa-dev] [PATCH 06/13] i965: Use I915_EXEC_NO_RELOC

2017-07-19 Thread Chris Wilson
-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@intel.com> --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 54 +-- 1 file chang

Re: [Mesa-dev] [PATCH 2/2] anv: ensure device name contains terminating character

2017-07-16 Thread Chris Wilson
Quoting Lionel Landwerlin (2017-07-16 15:31:38) > CID: 1415113 > Reported-by: Grazvydas Ignotas > Signed-off-by: Lionel Landwerlin > --- > src/intel/vulkan/anv_device.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git

Re: [Mesa-dev] [PATCH v3 04/14] i965/bufmgr: Add a BO_ALLOC_ZEROED flag

2017-07-15 Thread Chris Wilson
Quoting Chad Versace (2017-07-14 23:36:43) > On Wed 12 Jul 2017, Jason Ekstrand wrote: > > Cc: Kenneth Graunke > > > > --- > > src/mesa/drivers/dri/i965/brw_bufmgr.c | 28 ++-- > > src/mesa/drivers/dri/i965/brw_bufmgr.h | 1 + > > 2 files changed,

Re: [Mesa-dev] [PATCH 2/2] util: Make CLAMP turn NaN into MIN.

2017-07-14 Thread Chris Wilson
Quoting Roland Scheidegger (2017-07-14 13:22:27) > Reviewed-by: Roland Scheidegger > > Interesting side-effect there with the results being different if max > > min. But hopefully not an issue anywhere else... Is it worth a gccism to check? #ifdef __GNUC__ #define CLAMP(x,
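
For readers unfamiliar with the NaN behaviour named in the subject, here is a small sketch of a clamp whose comparisons are arranged so that NaN falls through to the minimum. The macro name is hypothetical and this is not necessarily the exact form used in the patch; it relies only on the fact that every ordered comparison involving NaN is false.

```c
#include <math.h>

/*
 * If x is NaN, (x) > (lo) is false, so the result is lo.
 * Otherwise this is an ordinary clamp to [lo, hi].
 */
#define SKETCH_CLAMP(x, lo, hi) \
   ((x) > (lo) ? ((x) > (hi) ? (hi) : (x)) : (lo))

/* Convenience wrapper so the arguments are evaluated only once. */
static float
sketch_clampf(float x, float lo, float hi)
{
   return SKETCH_CLAMP(x, lo, hi);
}
```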

[Mesa-dev] [PATCH 2/2] i965: Always use the pre-computed offset for the relocation entry

2017-07-14 Thread Chris Wilson
We must be careful to only compute the address once based on the per-context information (rather than accessing the unlocked global bo->offset64) so that the value in the batch does match the reloc.presumed_offset we declare to the kernel. Otherwise, highly unlikely, but we may see GPU hangs in

[Mesa-dev] [PATCH 1/2] i965: Allow passing target_bo=NULL to brw_emit_reloc()

2017-07-14 Thread Chris Wilson
Sometimes we want to emit a relocation to a NULL surface when constructing the batch. If we push the NULL handling into the common brw_emit_reloc() we can streamline the batch construction. Cc: Kenneth Graunke Cc: Matt Turner Cc: Jason Ekstrand

Re: [Mesa-dev] [EGL android: acquire fence implementation] i965: Queue the buffer with a sync fence for Android OS (v2)

2017-07-14 Thread Chris Wilson
Quoting Zhongmin Wu (2017-07-14 07:55:45) > Previously we queued the buffer with an invalid fence (-1), which > made some benchmarks, such as flatland, fail. > > Now we get the out fence while flushing the buffer and then pass > it to SurfaceFlinger in the eglSwapbuffer function. > > v2: a)

Re: [Mesa-dev] [PATCH] i965 : Performance Improvement

2017-07-14 Thread Chris Wilson
Quoting aravindan.muthuku...@intel.com (2017-07-14 05:09:09) > From: Aravindan M > > This patch improves CPI Rate (Cycles per Instruction) > and CPU time utilization for i965. The functions > check_state and brw_pipeline_state_finished were found > poor CPU

Re: [Mesa-dev] [PATCH 3/4] i965: Use async maps for BufferSubData to regions with no valid data.

2017-07-13 Thread Chris Wilson
Quoting Chris Wilson (2017-06-13 12:57:05) > Quoting Kenneth Graunke (2017-06-13 01:33:31) > > When writing a region of a buffer via glBufferSubData(), we can write > > the data asynchronously if the destination doesn't contain any data. > > Even if it's busy, the data was

Re: [Mesa-dev] [EGL android: acquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-13 Thread Chris Wilson
Quoting Wu, Zhongmin (2017-07-13 09:31:15) > As for the using of last fence when the batch buffer is empty for > create_fence_fd, I suggest it can be another story and we will try to > optimize it in the future... Note that is a backend problem. If you call a driver interface to create a fence

Re: [Mesa-dev] [PATCH 4/4] i965: Drop non-LLC lunacy in the program cache code.

2017-07-12 Thread Chris Wilson
Quoting Chris Wilson (2017-07-12 10:40:43) > Quoting Kenneth Graunke (2017-07-12 08:22:25) > > The non-LLC story was a horror show. We uploaded data via pwrite > > (drm_intel_bo_subdata), which would stall if the cache BO was in > > use (being read) by the GPU. Obviousl

Re: [Mesa-dev] [PATCH 4/4] i965: Drop non-LLC lunacy in the program cache code.

2017-07-12 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-12 08:22:25) > The non-LLC story was a horror show. We uploaded data via pwrite > (drm_intel_bo_subdata), which would stall if the cache BO was in > use (being read) by the GPU. Obviously, we wanted to avoid that. > So, we tried to detect whether the buffer was

Re: [Mesa-dev] [PATCH 3/4] i965: Use write-combine mappings where available

2017-07-12 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-12 08:22:24) > From: Matt Turner > > Write-combine mappings give much better performance on writes than > uncached access through the GTT. > > Improves performance of GFXBench 4's gl_driver2 benchmark at 1024x768 > on Apollolake by 3.6086%

Re: [Mesa-dev] [PATCH 1/4] i965: Drop bogus pthread_mutex_unlock in map_gtt error path.

2017-07-12 Thread Chris Wilson
. My apologies for not noticing! I appear to have fixed it in a rebase and so it disappeared from my tree. Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk> Doing a quick grep on the remaining bufmgr->lock shows that we are now only locking around the gl

[Mesa-dev] [PATCH] i965: Fix up a failed CPU/WC mmaping with a GTT mapping

2017-07-11 Thread Chris Wilson
Not all objects will be mappable for direct access by the CPU (either using WC/CPU or WC paths), for example, a dmabuf wrapping an object on a foreign device or an object wrapping access to stolen memory. Since either the physical pages are not known or even do not exist, we need to use the

[Mesa-dev] [PATCH] i965: Use VALGRIND_MAKE_MEM_x in place of MALLOCLIKE/FREELIKE

2017-07-11 Thread Chris Wilson
Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block inaccessible, but still leaves it defined in its allocation tracker i.e. it will report the mmap as lost despite the call to FREELIKE! Instead of treating the mmap as

Re: [Mesa-dev] [PATCH] anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE

2017-07-07 Thread Chris Wilson
out. That was until I saw what you were planning to do for anv. Hmm, that puts the oldest kernel that might support anv as commit 51bc140431e233284660b1d22c47dec9ecdb521e [v4.3] Author: Chris Wilson <ch...@chris-wilson.co.uk> Date: Mon Aug 31 15:10:39 2015 +0100 drm/i915: Always mark th

Re: [Mesa-dev] [Intel-gfx] [PATCH 1/1] drm/i915: Version the MOCS settings

2017-07-07 Thread Chris Wilson
Quoting Ben Widawsky (2017-07-07 19:42:25) > On 17-07-07 11:34:48, Chris Wilson wrote: > >Quoting Ben Widawsky (2017-07-07 00:27:01) > >> drivers/gpu/drm/i915/i915_drv.c | 3 +++ > >> drivers/gpu/drm/i915/i915_drv.h | 2 ++ > >> drivers/gpu/drm/i915/i915_pc

[Mesa-dev] [PATCH v2] i965: Resolve framebuffers before signaling the fence

2017-07-07 Thread Chris Wilson
ore as is currently the case. v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad) Reported-by: Sergi Granell <xerpi.g...@gmail.com> Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension") Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Sergi Granell <x

[Mesa-dev] [PATCH] i965: Use brw_bo_wait() for brw_bo_wait_rendering()

2017-07-07 Thread Chris Wilson
. Historically libdrm used set-domain as we did not have an explicit wait-ioctl (and the patches to teach it to use wait if available were lost in the mists). Since mesa already depends upon a kernel that supports the wait-ioctl, we do not need to supply a fallback. Signed-off-by: Chris Wilson <ch...@ch

Re: [Mesa-dev] [PATCH 7/7] i965: Fix asynchronous mappings on !LLC platforms.

2017-07-07 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-07 07:08:16) > On Thursday, July 6, 2017 10:51:49 PM PDT Kenneth Graunke wrote: > > On Wednesday, July 5, 2017 2:24:55 PM PDT Chris Wilson wrote: > > > Quoting Kenneth Graunke (2017-07-05 21:56:54) > > > > --- > > > > s

Re: [Mesa-dev] [PATCH 2/4] i965: Use I915_EXEC_NO_RELOC

2017-07-07 Thread Chris Wilson
Quoting Daniel Vetter (2017-07-07 11:04:00) > On Mon, Jun 19, 2017 at 11:06:48AM +0100, Chris Wilson wrote: > > - if (target != batch->bo) > > - add_exec_bo(batch, target); > > + if (target != batch->bo) { > > + unsigned int index = add_exec_bo(

Re: [Mesa-dev] [PATCH 4/4] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT

2017-07-07 Thread Chris Wilson
Quoting Daniel Vetter (2017-07-07 11:31:46) > On Mon, Jun 19, 2017 at 11:06:50AM +0100, Chris Wilson wrote: > > Passing the index of the target buffer via the reloc.target_handle is > > marginally more efficient for the kernel (it can avoid some allocations, > > and can use a

Re: [Mesa-dev] [Intel-gfx] [PATCH 1/1] drm/i915: Version the MOCS settings

2017-07-07 Thread Chris Wilson
Quoting Ben Widawsky (2017-07-07 00:27:01) > drivers/gpu/drm/i915/i915_drv.c | 3 +++ > drivers/gpu/drm/i915/i915_drv.h | 2 ++ > drivers/gpu/drm/i915/i915_pci.c | 13 + > include/uapi/drm/i915_drm.h | 8 > 4 files changed, 22 insertions(+), 4 deletions(-) > > diff

[Mesa-dev] [PATCH v3] i965: Track last location of bo used for the batch

2017-07-07 Thread Chris Wilson
ces() v3: Reset bo->index on creation (Daniel) Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@intel.com> Cc: Daniel Vetter <daniel.vet...@ffwll.c

Re: [Mesa-dev] [PATCH 1/4] i965: Track last location of bo used for the batch

2017-07-07 Thread Chris Wilson
Quoting Daniel Vetter (2017-07-07 10:55:49) > On Mon, Jun 19, 2017 at 11:06:47AM +0100, Chris Wilson wrote: > > Borrow a trick from anv, and use the last known index for the bo to skip > > a search of the batch->exec_bo when adding a new relocation. In defence > > a

Re: [Mesa-dev] [EGL android: acquire fence implementation] i965: Queue the buffer with a sync fence for Android OS

2017-07-07 Thread Chris Wilson
Quoting Zhongmin Wu (2017-07-07 09:07:06) > Previously we queued the buffer with an invalid fence (-1), which > made some benchmarks, such as flatland, fail. Create a fence, pass fence-fd to android? Instead of forcing a lot of busy work and using up another precious resource for everyone

Re: [Mesa-dev] [PATCH 7/7] i965: Fix asynchronous mappings on !LLC platforms.

2017-07-07 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-07 06:51:49) > On Wednesday, July 5, 2017 2:24:55 PM PDT Chris Wilson wrote: > > Quoting Kenneth Graunke (2017-07-05 21:56:54) > > > --- > > > src/mesa/drivers/dri/i965/brw_bufmgr.c | 15 +-- > > > 1 file ch

Re: [Mesa-dev] [PATCH 6/7] i965: Don't use PREAD for glGetBufferSubData().

2017-07-07 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-07 06:19:07) > On Thursday, July 6, 2017 4:21:28 AM PDT Chris Wilson wrote: > > Quoting Kenneth Graunke (2017-07-05 21:56:53) > > > diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > > > b/src/mesa/drivers/dri/i965/intel_b

Re: [Mesa-dev] [PATCH 6/7] i965: Don't use PREAD for glGetBufferSubData().

2017-07-06 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-05 21:56:53) > diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > b/src/mesa/drivers/dri/i965/intel_buffer_objects.c > index a9ac29a6a81..2b0f7b9a698 100644 > --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > +++

Re: [Mesa-dev] [PATCH 5/7] i965: Assert that we don't use CPU write maps to non-coherent buffers.

2017-07-06 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-05 21:56:52) > Using CPU maps of non-coherent buffers can get us in a lot of trouble, > and WC maps are a reasonable alternative anyway. Guard against shooting > ourselves in the foot by adding an assert, and comment. Reviewed-by: Chris Wilson <

Re: [Mesa-dev] [PATCH 7/7] i965: Fix asynchronous mappings on !LLC platforms.

2017-07-05 Thread Chris Wilson
Quoting Kenneth Graunke (2017-07-05 21:56:54) > --- > src/mesa/drivers/dri/i965/brw_bufmgr.c | 15 +-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c > b/src/mesa/drivers/dri/i965/brw_bufmgr.c > index

Re: [Mesa-dev] [PATCH 3/3] anv: Use DRM sync objects for external semaphores when available

2017-07-05 Thread Chris Wilson
Quoting Jason Ekstrand (2017-07-05 18:21:08) > static void > anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer, >struct anv_reloc_list *list) > @@ -1450,6 +1484,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device, > impl->fd = -1; >

[Mesa-dev] [PATCH] i965: Remove clearing of bo->map_gtt after failure

2017-07-01 Thread Chris Wilson
cessfully mmaped the GTT. Fixes: 314647c4c206 ("i965: Drop global bufmgr lock from brw_bo_map_* functions.") Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/driver

Re: [Mesa-dev] [PATCH] i965: Resolve framebuffers before signaling the fence

2017-06-20 Thread Chris Wilson
Quoting Chad Versace (2017-06-20 18:08:14) > On Mon 19 Jun 2017, Chris Wilson wrote: > > Quoting Chad Versace (2017-06-19 19:42:16) > > > On Mon 12 Jun 2017, Chris Wilson wrote: > > > > brw_emit_mi_flush(brw); > > > > > > > >

[Mesa-dev] [PATCH] i965: Discard bo->map_count

2017-06-20 Thread Chris Wilson
map. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@intel.com> --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 94 +++---

[Mesa-dev] [PATCH 1/2] i965: Disable access to CPU mmap for async access on non-LLC machines

2017-06-20 Thread Chris Wilson
that buffer to be clflushed and any further CPU access to be discarded.) To prevent this, simply disallow any CPU async mmap access. The cases where async CPU access to a non-LLC buffer should continue to be allowed via their preferred snooping path. Signed-off-by: Chris Wilson <ch...@chris-wilson.co

[Mesa-dev] [PATCH 2/2] i965: Track initial CPU domain for mappings

2017-06-20 Thread Chris Wilson
) are not permitted Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_bufmgr.c| 19 +-- src/mesa/drivers/dri/i965/brw_bufmgr.h| 10 ++

Re: [Mesa-dev] [PATCH v2 02/10] i965: Track initial CPU domain for mappings

2017-06-20 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-20 00:33:35) > On Monday, June 19, 2017 3:55:01 AM PDT Chris Wilson wrote: > > If we need to force a cache domain transition (e.g. a buffer was in the > > CPU domain and we want to access it via WC) then we need to trigger a > > clflush. T

Re: [Mesa-dev] [PATCH 2/4] i965: Use I915_EXEC_NO_RELOC

2017-06-19 Thread Chris Wilson
Quoting Jason Ekstrand (2017-06-19 22:00:45) > On Mon, Jun 19, 2017 at 12:53 PM, Chris Wilson <ch...@chris-wilson.co.uk> > wrote: > > Quoting Kenneth Graunke (2017-06-19 20:28:31) > > On Monday, June 19, 2017 3:06:48 AM PDT Chris Wilson wrote: > >

Re: [Mesa-dev] [PATCH] i965: Resolve framebuffers before signaling the fence

2017-06-19 Thread Chris Wilson
Quoting Chad Versace (2017-06-19 19:42:16) > On Mon 12 Jun 2017, Chris Wilson wrote: > > brw_emit_mi_flush(brw); > > > > switch (fence->type) { > > @@ -335,6 +363,8 @@ brw_gl_fence_sync(struct gl_context *ctx, struct > > gl_sync_object *_s

Re: [Mesa-dev] [PATCH 2/4] i965: Use I915_EXEC_NO_RELOC

2017-06-19 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-19 20:28:31) > On Monday, June 19, 2017 3:06:48 AM PDT Chris Wilson wrote: > > - if (target != batch->bo) > > - add_exec_bo(batch, target); > > + if (target != batch->bo) { > > + unsigned int index = add_exec_bo(

Re: [Mesa-dev] [PATCH 1/2] RFC i965: Bypass a couple of libraries for syscall on x86_64

2017-06-19 Thread Chris Wilson
Quoting Eric Engestrom (2017-06-19 15:30:46) > On Monday, 2017-06-19 13:02:11 +0100, Chris Wilson wrote: > > Quoting Emil Velikov (2017-06-19 12:43:42) > > > Hi Chris, > > > > > > On 19 June 2017 at 12:32, Chris Wilson <ch...@chris-wilson.co.uk>

Re: [Mesa-dev] [PATCH 1/2] RFC i965: Bypass a couple of libraries for syscall on x86_64

2017-06-19 Thread Chris Wilson
Quoting Emil Velikov (2017-06-19 12:43:42) > Hi Chris, > > On 19 June 2017 at 12:32, Chris Wilson <ch...@chris-wilson.co.uk> wrote: > > On linux/x86_64, calling into the kernel is just a single instruction > > with the parameters passed via registers. We can therefr

[Mesa-dev] [PATCH 1/2] RFC i965: Bypass a couple of libraries for syscall on x86_64

2017-06-19 Thread Chris Wilson
a slight impedance mismatch with the kernel interface in that it converts the -errno return into -1 + errno, which we immediately convert back into -errno for ourselves! Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <mat
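
The round trip described above can be modelled in a few lines. Both functions are hypothetical stand-ins: the first imitates what a libc-style wrapper does with a raw kernel return value (simplified to treat any negative value as an error), the second is what a caller wanting -errno then has to do to undo it.

```c
#include <errno.h>

/*
 * Model of a libc-style wrapper: a negative raw return from the kernel
 * becomes -1 with the error code stashed in errno.
 */
static int
sketch_libc_wrap(long raw)
{
   if (raw < 0) {
      errno = (int)-raw;
      return -1;
   }
   return (int)raw;
}

/* The caller's half: turn "-1 plus errno" back into -errno. */
static int
sketch_unwrap(int ret)
{
   return ret == -1 ? -errno : ret;
}
```

Calling the kernel directly would hand back the raw value and make both conversions unnecessary.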

[Mesa-dev] [PATCH 2/2] RFC anv: Use direct call into kernel for ioctl()

2017-06-19 Thread Chris Wilson
Bypass libc's PLT indirection and its impedance mismatch by emitting the single syscall instruction ourselves for linux/x86_64. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Eks

[Mesa-dev] [PATCH v2 09/10] i965: Pack simple pipelined query objects into the same buffer

2017-06-19 Thread Chris Wilson
Reuse the same query object buffer for multiple queries within the same batch. A task for the future is propagating the GL_NO_MEMORY errors. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com

[Mesa-dev] [PATCH v2 04/10] i965: Replace hard-coded indices with const named variables in gen6_queryobj

2017-06-19 Thread Chris Wilson
To simplify replacement later, replace repeated use of explicit 0/1 with local variables of the same value. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/ge

[Mesa-dev] [PATCH v2 08/10] i965: Use 'available' fence for polling query results

2017-06-19 Thread Chris Wilson
, the busy-ioctl is lightweight!). Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_context.h | 4 +-- src/mesa/drivers/dri/i965/ge

[Mesa-dev] [PATCH v2 05/10] i965: Replace open-coded gen6 queryobj offsets with simple helpers

2017-06-19 Thread Chris Wilson
Lots of places open-coded the assumed layout of the predicate/results within the query object, replace those with simple helpers. v2: Fix function decl style. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <mat

[Mesa-dev] [PATCH v2 03/10] i965: Check last known busy status on bo before asking the kernel

2017-06-19 Thread Chris Wilson
using the last known flag, the query is split into two. v2: Check against external bo before trusting our own tracking. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri

[Mesa-dev] [PATCH v2 02/10] i965: Track initial CPU domain for mappings

2017-06-19 Thread Chris Wilson
If we need to force a cache domain transition (e.g. a buffer was in the CPU domain and we want to access it via WC) then we need to trigger a clflush. This overrides the use of MAP_ASYNC as we call into the kernel to change domains on the whole object. Signed-off-by: Chris Wilson <ch...@ch

[Mesa-dev] [PATCH v2 07/10] i965: Use snoop bo for accessing query results on !llc

2017-06-19 Thread Chris Wilson
(where the results are used directly by the GPU and not CPU). Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_bufmgr.c| 23 +++ src/mes

[Mesa-dev] [PATCH v2 06/10] i965: Map the query results for the life of the bo

2017-06-19 Thread Chris Wilson
If we map the bo upon creation, we can avoid the latency of mmapping it when querying, and later use the asynchronous, persistent map of the predicate to do a quick query. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt T

[Mesa-dev] [PATCH v2 10/10] i965: Pass consistent args along gen6_queryobj.c

2017-06-19 Thread Chris Wilson
Be consistent in passing along brw_context rather than switching between that and gl_context. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> --- src/mesa/drivers/dri/i965/gen6_queryobj.c | 32 ++- 1 file changed, 14 insertions(+), 18 deletions(-) diff

[Mesa-dev] [PATCH v2 01/10] i965: Track when a bo is shared with an external client

2017-06-19 Thread Chris Wilson
will have more examples of non-reusable buffers in the near future. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 4 src/mesa/drivers/dri/i965/brw_bufmgr.h | 5 + 2 files changed,

[Mesa-dev] [PATCH 4/4] i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT

2017-06-19 Thread Chris Wilson
: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid a post-processing loop to fixup the relocations. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jas

[Mesa-dev] [PATCH 3/4] i965: Move add_exec_bo()

2017-06-19 Thread Chris Wilson
To avoid a forward declaration in the next patch, move the definition of add_exec_bo() earlier. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@int

[Mesa-dev] [PATCH 1/4] i965: Track last location of bo used for the batch

2017-06-19 Thread Chris Wilson
Borrow a trick from anv, and use the last known index for the bo to skip a search of the batch->exec_bo when adding a new relocation. In defence against the bo being used in multiple batches simultaneously, we check that this slot exists and points back to us. Signed-off-by: Chris Wilson
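
The cached-index lookup with its validity check can be sketched as below. All names and the fixed-size array are hypothetical simplifications; only the technique (try the remembered slot, verify that it exists and points back at this bo, otherwise fall back to a search) is taken from the description.

```c
#include <stddef.h>

#define SKETCH_MAX_BOS 16

/* Hypothetical, simplified types for illustration. */
struct lut_bo {
   unsigned index;                          /* slot used last time */
};

struct lut_batch {
   struct lut_bo *exec_bos[SKETCH_MAX_BOS];
   unsigned exec_count;
};

/* Return the slot of bo in batch, adding it if absent; -1 if full. */
static int
sketch_add_exec_bo(struct lut_batch *batch, struct lut_bo *bo)
{
   unsigned index = bo->index;

   /* Fast path: the remembered slot exists and still points back at us. */
   if (index < batch->exec_count && batch->exec_bos[index] == bo)
      return (int)index;

   /* The hint was stale (e.g. bo last used in another batch): search. */
   for (index = 0; index < batch->exec_count; index++) {
      if (batch->exec_bos[index] == bo) {
         bo->index = index;
         return (int)index;
      }
   }

   if (batch->exec_count == SKETCH_MAX_BOS)
      return -1;

   index = batch->exec_count++;
   batch->exec_bos[index] = bo;
   bo->index = index;
   return (int)index;
}
```

Because the hint is always verified against the batch's own table, a bo shared between two batches costs at worst one extra search, never a wrong slot.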

[Mesa-dev] [PATCH 2/4] i965: Use I915_EXEC_NO_RELOC

2017-06-19 Thread Chris Wilson
-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> Cc: Jason Ekstrand <jason.ekstr...@intel.com> --- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 53 +-- 1 file chang

Re: [Mesa-dev] [PATCH 2/9] i965: Check last known busy status on bo before asking the kernel

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-15 19:45:19) > On Thursday, June 15, 2017 1:41:39 AM PDT Chris Wilson wrote: > > Quoting Kenneth Graunke (2017-06-14 22:49:01) > > > On Friday, June 9, 2017 6:01:33 AM PDT Chris Wilson wrote: > > > > If we know the bo is idle (that is

Re: [Mesa-dev] [PATCH 0/7] i965: Stop hanging on Haswell

2017-06-15 Thread Chris Wilson
Quoting Jason Ekstrand (2017-06-15 16:58:13) > On Thu, Jun 15, 2017 at 4:15 AM, Chris Wilson <ch...@chris-wilson.co.uk> > wrote: > > Quoting Kenneth Graunke (2017-06-14 21:44:45) > > If Chris is right, and what we're really seeing is that MI_SET_CONTEXT >

Re: [Mesa-dev] [PATCH 4/7] i965: Add an end-of-pipe sync helper

2017-06-15 Thread Chris Wilson
Quoting Jason Ekstrand (2017-06-15 16:59:19) > On Thu, Jun 15, 2017 at 4:11 AM, Chris Wilson <ch...@chris-wilson.co.uk> > wrote: > The kernel does have a LRI after a flush before signaling the batch is > complete. I don't see a need to add another... > >

Re: [Mesa-dev] [PATCH 0/7] i965: Stop hanging on Haswell

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-14 21:44:45) > On Tuesday, June 13, 2017 2:53:20 PM PDT Jason Ekstrand wrote: > > As I've been working on converting more things in the GL driver over to > > blorp, I've been highly annoyed by all of the hangs on Haswell. About one > > in 3-5 Jenkins runs would

Re: [Mesa-dev] [PATCH 4/7] i965: Add an end-of-pipe sync helper

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-14 21:41:56) > On Tuesday, June 13, 2017 2:53:24 PM PDT Jason Ekstrand wrote: > > From: Topi Pohjolainen > > > > v2 (Jason Ekstrand): > > - Take a flags parameter to control the flushes > > - Refactoring > > > > Signed-off-by: Topi

Re: [Mesa-dev] [PATCH 9/9] i965: Pack simple pipelined query objects into the same buffer

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-15 00:19:35) > On Friday, June 9, 2017 6:01:40 AM PDT Chris Wilson wrote: > > Reuse the same query object buffer for multiple queries within the same > > batch. > > > > A task for the future is propagating the GL_NO_MEMORY errors. >

Re: [Mesa-dev] [PATCH 7/9] i965: Use snoop bo for accessing query results on !llc

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-15 00:13:14) > On Friday, June 9, 2017 6:01:38 AM PDT Chris Wilson wrote: > > On non-llc architectures where we are primarily reading back the > > results of the GPU queries, then we can improve performance by using a > > cacheable m

Re: [Mesa-dev] [PATCH 6/9] i965: Map the query results for the life of the bo

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-14 23:50:12) > On Friday, June 9, 2017 6:01:37 AM PDT Chris Wilson wrote: > > If we map the bo upon creation, we can avoid the latency of mmapping it > > when querying, and later use the asynchronous, persistent map of the > > predicat

Re: [Mesa-dev] [PATCH 5/9] i965: Replace open-coded gen6 queryobj offsets with simple helpers

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-14 23:10:38) > On Friday, June 9, 2017 6:01:36 AM PDT Chris Wilson wrote: > > diff --git a/src/mesa/drivers/dri/i965/hsw_queryobj.c > > b/src/mesa/drivers/dri/i965/hsw_queryobj.c > > index b81ab3b6f8..cb1a2df52d 100644 > > ---

Re: [Mesa-dev] [PATCH 2/9] i965: Check last known busy status on bo before asking the kernel

2017-06-15 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-14 22:49:01) > On Friday, June 9, 2017 6:01:33 AM PDT Chris Wilson wrote: > > If we know the bo is idle (that is we have not submitted a command buffer > > referencing this bo since the last query) we can skip asking the kernel. > > Note th
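
The caching described in the quoted patch can be sketched as follows. Everything here is hypothetical: the kernel query is replaced by a counting stub, and the struct carries a simulated GPU state so the example is self-contained.

```c
#include <stdbool.h>

/* Hypothetical bo with a cached idle flag and a simulated GPU state. */
struct idle_bo {
   bool idle;      /* true: known idle since the last query */
   bool gpu_busy;  /* stand-in for what the kernel would report */
};

static unsigned sketch_kernel_calls;

/* Stub standing in for the busy ioctl; counts how often it is invoked. */
static bool
sketch_kernel_busy(const struct idle_bo *bo)
{
   sketch_kernel_calls++;
   return bo->gpu_busy;
}

/* Submitting work that references the bo invalidates the cached flag. */
static void
sketch_submit(struct idle_bo *bo)
{
   bo->idle = false;
   bo->gpu_busy = true;
}

/* Only ask the kernel when we do not already know the bo is idle. */
static bool
sketch_bo_busy(struct idle_bo *bo)
{
   if (bo->idle)
      return false;

   const bool busy = sketch_kernel_busy(bo);
   bo->idle = !busy;
   return busy;
}
```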

Re: [Mesa-dev] [PATCH 0/7] i965: Stop hanging on Haswell

2017-06-14 Thread Chris Wilson
Quoting Jason Ekstrand (2017-06-13 22:53:20) > As I've been working on converting more things in the GL driver over to > blorp, I've been highly annoyed by all of the hangs on Haswell. About one > in 3-5 Jenkins runs would hang somewhere. After looking at about a > half-dozen error states, I

Re: [Mesa-dev] [PATCH 4/4] i965: Orphan storage in MapBufferRange if invalidating all valid data.

2017-06-13 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-13 01:33:32) > We can promote INVALIDATE_RANGE_BIT to INVALIDATE_BUFFER_BIT if the > range contains the only valid data in the buffer. This allows us to > orphan the storage, instead of doing stall avoidance blits. > --- >
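A minimal sketch of the promotion described above, assuming the buffer tracks a single half-open valid range. The struct, the function name and the `SKETCH_` constants are hypothetical; the constant values mirror the GL map access flags. Whether the real patch also clears the range bit is not visible in the snippet, so this sketch simply adds the buffer bit.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the GL map access flags; the names are local to this sketch. */
#define SKETCH_MAP_INVALIDATE_RANGE_BIT  0x0004u
#define SKETCH_MAP_INVALIDATE_BUFFER_BIT 0x0008u

/* Hypothetical buffer with one tracked range of valid data. */
struct sketch_buffer {
    size_t valid_start; /* inclusive */
    size_t valid_end;   /* exclusive; start == end means no valid data */
};

/*
 * If the caller invalidates a range that covers all valid data in the
 * buffer, nothing worth preserving remains, so the whole buffer may be
 * treated as invalidated (and its storage orphaned).
 */
static unsigned sketch_promote_invalidate(const struct sketch_buffer *buf,
                                          size_t offset, size_t length,
                                          unsigned access)
{
    if ((access & SKETCH_MAP_INVALIDATE_RANGE_BIT) &&
        offset <= buf->valid_start &&
        buf->valid_end <= offset + length)
        access |= SKETCH_MAP_INVALIDATE_BUFFER_BIT;

    return access;
}
```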

Re: [Mesa-dev] [PATCH 3/4] i965: Use async maps for BufferSubData to regions with no valid data.

2017-06-13 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-13 01:33:31) > When writing a region of a buffer via glBufferSubData(), we can write > the data asynchronously if the destination doesn't contain any data. > Even if it's busy, the data was undefined, so the new data is fine too. > > Decreases the number of stall
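The check behind this can be pictured as a simple overlap test against the tracked valid range. The following is a sketch under that assumption; `struct sketch_buffer`, `sketch_range_has_valid_data()` and `sketch_can_write_async()` are hypothetical names and not the driver's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer with one tracked range of valid data. */
struct sketch_buffer {
    size_t valid_start; /* inclusive */
    size_t valid_end;   /* exclusive; start == end means no valid data */
};

/* True if [offset, offset + length) overlaps the valid range. */
static bool sketch_range_has_valid_data(const struct sketch_buffer *buf,
                                        size_t offset, size_t length)
{
    if (buf->valid_start >= buf->valid_end)
        return false; /* buffer holds no valid data at all */

    return offset < buf->valid_end && buf->valid_start < offset + length;
}

/*
 * A write into a region with no valid data cannot clobber anything the
 * GPU might still read, so it does not need to wait for the buffer to
 * become idle.
 */
static bool sketch_can_write_async(const struct sketch_buffer *buf,
                                   size_t offset, size_t length)
{
    return !sketch_range_has_valid_data(buf, offset, length);
}
```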

Re: [Mesa-dev] [PATCH 2/4] i965: Track a range of the buffer which contains valid data.

2017-06-13 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-13 01:33:30) Every alloc_buffer_object() is followed by marking the valid range. I could not find a missed path, so Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk> At some point, mesa will have to get an rbtree and then it will be interesting

Re: [Mesa-dev] [PATCH 1/4] i965: Add a "write" parameter to intel_bufferobj_buffer.

2017-06-13 Thread Chris Wilson
Quoting Kenneth Graunke (2017-06-13 01:33:29) > This doesn't do anything yet, but soon we'll want to know whether an > access to a buffer section may write that data, or simply reads it. This series doesn't go further than a boolean, but would it be worth feeding through map flags? The immediate

[Mesa-dev] [PATCH] i965: Resolve framebuffers before signaling the fence

2017-06-12 Thread Chris Wilson
as is currently the case. Reported-by: Sergi Granell <xerpi.g...@gmail.com> Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension") Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Sergi Granell <xerpi.g...@gmail.com> Cc: Rob Clark <robdcl...@gmail.

Re: [Mesa-dev] [PATCH 3/9] i965: Only wait for the fence bo to be signaled

2017-06-12 Thread Chris Wilson
Quoting Chris Wilson (2017-06-09 14:01:34) > The fence bo may be reused as an input fence to another batch, which > will cause us to treat it as busy until that subsequent batch is idle. > We only need to check if the fence has been signaled, which we can do by > checking the

[Mesa-dev] [PATCH 8/9] i965: Use 'available' fence for polling query results

2017-06-09 Thread Chris Wilson
, the busy-ioctl is lightweight!). Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_context.h | 4 +-- src/mesa/drivers/dri/i965/ge

[Mesa-dev] [PATCH 9/9] i965: Pack simple pipelined query objects into the same buffer

2017-06-09 Thread Chris Wilson
Reuse the same query object buffer for multiple queries within the same batch. A task for the future is propagating the GL_NO_MEMORY errors. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com

[Mesa-dev] [PATCH 1/9] i965: Mark freshly allocate bo as idle

2017-06-09 Thread Chris Wilson
When created, buffers are idle, so mark them as such to save an early ioctl or mistakenly assuming the fresh buffer is busy. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mes

[Mesa-dev] [PATCH 7/9] i965: Use snoop bo for accessing query results on !llc

2017-06-09 Thread Chris Wilson
(where the results are used directly by the GPU and not CPU). Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_bufmgr.c| 21 + src/mes

[Mesa-dev] [PATCH 3/9] i965: Only wait for the fence bo to be signaled

2017-06-09 Thread Chris Wilson
is signaled. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/

[Mesa-dev] [PATCH 4/9] i965: Replace hard-coded indices with const named variables in gen6_queryobj

2017-06-09 Thread Chris Wilson
To simplify replacement later, replace repeated use of explicit 0/1 with local variables of the same value. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/ge

[Mesa-dev] [PATCH 6/9] i965: Map the query results for the life of the bo

2017-06-09 Thread Chris Wilson
If we map the bo upon creation, we can avoid the latency of mmapping it when querying, and later use the asynchronous, persistent map of the predicate to do a quick query. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt T

[Mesa-dev] [PATCH 5/9] i965: Replace open-coded gen6 queryobj offsets with simple helpers

2017-06-09 Thread Chris Wilson
Lots of places open-coded the assumed layout of the predicate/results within the query object; replace those with simple helpers. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mes

[Mesa-dev] [PATCH 2/9] i965: Check last known busy status on bo before asking the kernel

2017-06-09 Thread Chris Wilson
using the last known flag, the query is split into two. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Matt Turner <matts...@gmail.com> --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 17 + src/mesa/drivers/dri

[Mesa-dev] [PATCH v2 3/4] i965: Move add_exec_bo()

2017-05-26 Thread Chris Wilson
To avoid a forward declaration in the next patch, move the definition of add_exec_bo() earlier. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Cc: Kenneth Graunke <kenn...@whitecape.org> Cc: Jason Ekstrand <jason.ekstr...@intel.com> --- src/mesa/drivers/dri/i965/intel_b
