Quoting Kenneth Graunke (2017-07-19 21:08:23)
> On Wednesday, July 19, 2017 3:09:10 AM PDT Chris Wilson wrote:
> > Sometimes we want to emit a relocation to a NULL surface when
> > constructing the batch. If we push the NULL handling into the common
> > brw_emit_reloc()
Sometimes we want to emit a relocation to a NULL surface when
constructing the batch. If we push the NULL handling into the common
brw_emit_reloc() we can make the batch construction itself more
readable.
On the other hand, we often test for the existence of the bo separately
and so would
: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid
a post-processing loop to fixup the relocations.
v3: Move kernel probing from context creation to screen init.
Use batch->use_exec_lut as it is more descriptive of what's going on (Daniel)
Signed-off-by: Chris Wilson <ch...
Since brw_emit_reloc() now does the test for target==NULL itself, we can
remove the test from __gen_combine_address() and call brw_emit_reloc()
directly.
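
As a rough illustration of the idea (not the i965 code itself), the sketch below centralises the NULL check in a relocation helper so that an address-combining helper can call it directly. Every name here (`sketch_bo`, `sketch_batch`, `sketch_emit_reloc`, `sketch_combine_address`) is hypothetical, and returning the bare delta for a NULL target is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the i965 structures. */
struct sketch_bo {
   uint64_t presumed_offset;   /* address we last believed the bo to have */
};

struct sketch_batch {
   unsigned reloc_count;       /* relocations recorded so far */
};

/* Returns the value to write into the batch. A NULL target records no
 * relocation and yields just the delta, so callers need no NULL check. */
static uint64_t
sketch_emit_reloc(struct sketch_batch *batch, const struct sketch_bo *target,
                  uint32_t delta)
{
   assert(batch != NULL);

   if (target == NULL)
      return delta;

   batch->reloc_count++;
   return target->presumed_offset + delta;
}

/* With the check centralised, the combining helper shrinks to one call. */
static uint64_t
sketch_combine_address(struct sketch_batch *batch,
                       const struct sketch_bo *bo, uint32_t delta)
{
   return sketch_emit_reloc(batch, bo, delta);
}
```

The caller no longer branches on the target, which is the readability gain the commit message is after.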
---
src/mesa/drivers/dri/i965/genX_state_upload.c | 21 +
1 file changed, 5 insertions(+), 16 deletions(-)
diff --git
We must be careful to only compute the address once based on the
per-context information (rather than accessing the unlocked global
bo->offset64) so that the value in the batch does match the
reloc.presumed_offset we declare to the kernel. Otherwise, highly
unlikely, but we may see GPU hangs in
To avoid a forward declaration in the next patch, move the definition of
add_exec_bo() earlier.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@int
The kernel creates a unique binding for each instance of a GEM handle in
the per-process GTT. Keeping a single bo->offset64 used by multiple
contexts will therefore cause a lot of migration and relocation stalls
when the bo are reused between contexts. Not a common problem, but when
it does occur
Rather than have a separate implementation that discards all of the
execobjects for the rare event of destroying the context, recast it as
an operation to reset to the saved state of no batch.
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 13 +++--
1 file changed, 7 insertions(+), 6
The kernel only cares about whether the object is to be written to or
not; it reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
For the common path where we want to execute the batch, if we push the
no_hw detection down to the execbuf we can eliminate one loop over all
the execobjects. For the less common path where we don't want to execute
the batch, no_hw was leaving out_fence uninitialised.
Cc: Kenneth Graunke
The HW can only write a 64b immediate into a 64b aligned address, so
add an assert.
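
A minimal sketch of such a check, assuming the offset is a byte offset and that "64b aligned" means a multiple of 8 bytes. The helper names and the three-dword packing are made up for the example and are not the real command encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a byte offset is a multiple of 8, i.e. qword aligned. */
static bool
sketch_is_qword_aligned(uint32_t offset)
{
   return (offset & 7) == 0;
}

/* Hypothetical packer: refuses (in debug builds) to emit a 64b immediate
 * store to a misaligned destination. */
static void
sketch_store_imm64(uint32_t cmd[3], uint32_t offset, uint64_t imm)
{
   assert(sketch_is_qword_aligned(offset));

   cmd[0] = offset;
   cmd[1] = (uint32_t)imm;           /* low dword */
   cmd[2] = (uint32_t)(imm >> 32);   /* high dword */
}
```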
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
Even if we are using older kernels that do not accept the batch in the
first slot, we can simplify our code by creating the batch with itself
in the first slot and moving it to the end on execbuf submission.
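
One way to picture this, as a sketch only: build the object list with the batch at slot 0, and at submission swap it with the last entry if the kernel cannot take the batch first. The types and the `kernel_accepts_batch_first` flag are hypothetical, and using a swap (rather than shifting every entry) is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an execbuf object entry. */
struct sketch_exec_object {
   uint32_t handle;
};

/* The list is always built with the batch in slot 0. On kernels that
 * expect the batch last, exchange the first and last entries on submit. */
static void
sketch_prepare_submit(struct sketch_exec_object *objs, unsigned count,
                      bool kernel_accepts_batch_first)
{
   assert(count > 0);

   if (kernel_accepts_batch_first || count == 1)
      return;

   struct sketch_exec_object batch = objs[0];
   objs[0] = objs[count - 1];
   objs[count - 1] = batch;
}
```

Note that the swap changes which object sits in slot 0 and in the last slot, so anything that refers to objects by slot index would need fixing up afterwards.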
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 70 ---
1
ces()
v3: Reset bo->index on creation (Daniel)
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.c
-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 54 +--
1 file chang
Quoting Lionel Landwerlin (2017-07-16 15:31:38)
> CID: 1415113
> Reported-by: Grazvydas Ignotas
> Signed-off-by: Lionel Landwerlin
> ---
> src/intel/vulkan/anv_device.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git
Quoting Chad Versace (2017-07-14 23:36:43)
> On Wed 12 Jul 2017, Jason Ekstrand wrote:
> > Cc: Kenneth Graunke
> >
> > ---
> > src/mesa/drivers/dri/i965/brw_bufmgr.c | 28 ++--
> > src/mesa/drivers/dri/i965/brw_bufmgr.h | 1 +
> > 2 files changed,
Quoting Roland Scheidegger (2017-07-14 13:22:27)
> Reviewed-by: Roland Scheidegger
>
> Interesting side-effect there with the results being different if max >
> min. But hopefully not an issue anywhere else...
Is it worth a gccism to check?
#ifdef __GNUC__
#define CLAMP(x,
We must be careful to only compute the address once based on the
per-context information (rather than accessing the unlocked global
bo->offset64) so that the value in the batch does match the
reloc.presumed_offset we declare to the kernel. Otherwise, highly
unlikely, but we may see GPU hangs in
Sometimes we want to emit a relocation to a NULL surface when
constructing the batch. If we push the NULL handling into the common
brw_emit_reloc() we can streamline the batch construction.
Cc: Kenneth Graunke
Cc: Matt Turner
Cc: Jason Ekstrand
Quoting Zhongmin Wu (2017-07-14 07:55:45)
> Before, we queued the buffer with an invalid fence (-1), which
> caused some benchmarks, such as flatland, to fail.
>
> Now we get the out fence while flushing the buffer and then pass
> it to SurfaceFlinger in the eglSwapbuffer function.
>
> v2: a)
Quoting aravindan.muthuku...@intel.com (2017-07-14 05:09:09)
> From: Aravindan M
>
> This patch improves the CPI rate (cycles per instruction)
> and CPU time utilization for i965. The functions
> check_state and brw_pipeline_state_finished were found
> poor CPU
Quoting Chris Wilson (2017-06-13 12:57:05)
> Quoting Kenneth Graunke (2017-06-13 01:33:31)
> > When writing a region of a buffer via glBufferSubData(), we can write
> > the data asynchronously if the destination doesn't contain any data.
> > Even if it's busy, the data was
Quoting Wu, Zhongmin (2017-07-13 09:31:15)
> As for using the last fence when the batch buffer is empty for
> create_fence_fd, I suggest that is another story, and we will try to
> optimize it in the future...
Note that is a backend problem. If you call a driver interface to create
a fence
Quoting Chris Wilson (2017-07-12 10:40:43)
> Quoting Kenneth Graunke (2017-07-12 08:22:25)
> > The non-LLC story was a horror show. We uploaded data via pwrite
> > (drm_intel_bo_subdata), which would stall if the cache BO was in
> > use (being read) by the GPU. Obviousl
Quoting Kenneth Graunke (2017-07-12 08:22:25)
> The non-LLC story was a horror show. We uploaded data via pwrite
> (drm_intel_bo_subdata), which would stall if the cache BO was in
> use (being read) by the GPU. Obviously, we wanted to avoid that.
> So, we tried to detect whether the buffer was
Quoting Kenneth Graunke (2017-07-12 08:22:24)
> From: Matt Turner
>
> Write-combine mappings give much better performance on writes than
> uncached access through the GTT.
>
> Improves performance of GFXBench 4's gl_driver2 benchmark at 1024x768
> on Apollolake by 3.6086%
.
My apologies for not noticing! I appear to have fixed it in a rebase and
so it disappeared from my tree.
Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
Doing a quick grep on the remaining bufmgr->lock shows that we are now
only locking around the gl
Not all objects will be mappable for direct access by the CPU (either
using WC/CPU or WC paths), for example, a dmabuf wrapping an object on a
foreign device or an object wrapping access to stolen memory. Since
either the physical pages are not known or even do not exist, we need to
use the
Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the
exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block
inaccessible, but still leaves it defined in its allocation tracker i.e.
it will report the mmap as lost despite the call to FREELIKE!
Instead of treating the mmap as
out. That was until I saw what you were planning to do for anv. Hmm,
that puts the oldest kernel that might support anv as
commit 51bc140431e233284660b1d22c47dec9ecdb521e [v4.3]
Author: Chris Wilson <ch...@chris-wilson.co.uk>
Date: Mon Aug 31 15:10:39 2015 +0100
drm/i915: Always mark th
Quoting Ben Widawsky (2017-07-07 19:42:25)
> On 17-07-07 11:34:48, Chris Wilson wrote:
> >Quoting Ben Widawsky (2017-07-07 00:27:01)
> >> drivers/gpu/drm/i915/i915_drv.c | 3 +++
> >> drivers/gpu/drm/i915/i915_drv.h | 2 ++
> >> drivers/gpu/drm/i915/i915_pc
ore as is currently the case.
v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad)
Reported-by: Sergi Granell <xerpi.g...@gmail.com>
Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension")
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Sergi Granell <x
. Historically libdrm used set-domain as we did not
have an explicit wait-ioctl (and the patches to teach it to use wait if
available were lost in the mists). Since mesa already depends upon a
kernel that supports the wait-ioctl, we do not need to supply a fallback.
Signed-off-by: Chris Wilson <ch...@ch
Quoting Kenneth Graunke (2017-07-07 07:08:16)
> On Thursday, July 6, 2017 10:51:49 PM PDT Kenneth Graunke wrote:
> > On Wednesday, July 5, 2017 2:24:55 PM PDT Chris Wilson wrote:
> > > Quoting Kenneth Graunke (2017-07-05 21:56:54)
> > > > ---
> > > > s
Quoting Daniel Vetter (2017-07-07 11:04:00)
> On Mon, Jun 19, 2017 at 11:06:48AM +0100, Chris Wilson wrote:
> > - if (target != batch->bo)
> > - add_exec_bo(batch, target);
> > + if (target != batch->bo) {
> > + unsigned int index = add_exec_bo(
Quoting Daniel Vetter (2017-07-07 11:31:46)
> On Mon, Jun 19, 2017 at 11:06:50AM +0100, Chris Wilson wrote:
> > Passing the index of the target buffer via the reloc.target_handle is
> > marginally more efficient for the kernel (it can avoid some allocations,
> > and can use a
Quoting Ben Widawsky (2017-07-07 00:27:01)
> drivers/gpu/drm/i915/i915_drv.c | 3 +++
> drivers/gpu/drm/i915/i915_drv.h | 2 ++
> drivers/gpu/drm/i915/i915_pci.c | 13 +
> include/uapi/drm/i915_drm.h | 8
> 4 files changed, 22 insertions(+), 4 deletions(-)
>
> diff
ces()
v3: Reset bo->index on creation (Daniel)
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.c
Quoting Daniel Vetter (2017-07-07 10:55:49)
> On Mon, Jun 19, 2017 at 11:06:47AM +0100, Chris Wilson wrote:
> > Borrow a trick from anv, and use the last known index for the bo to skip
> > a search of the batch->exec_bo when adding a new relocation. In defence
> > a
Quoting Zhongmin Wu (2017-07-07 09:07:06)
> Before, we queued the buffer with an invalid fence (-1), which
> caused some benchmarks, such as flatland, to fail.
Create a fence, pass fence-fd to android? Instead of forcing a lot of
busy work and using up another precious resource for everyone
Quoting Kenneth Graunke (2017-07-07 06:51:49)
> On Wednesday, July 5, 2017 2:24:55 PM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-07-05 21:56:54)
> > > ---
> > > src/mesa/drivers/dri/i965/brw_bufmgr.c | 15 +--
> > > 1 file ch
Quoting Kenneth Graunke (2017-07-07 06:19:07)
> On Thursday, July 6, 2017 4:21:28 AM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-07-05 21:56:53)
> > > diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> > > b/src/mesa/drivers/dri/i965/intel_b
Quoting Kenneth Graunke (2017-07-05 21:56:53)
> diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> index a9ac29a6a81..2b0f7b9a698 100644
> --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
> +++
Quoting Kenneth Graunke (2017-07-05 21:56:52)
> Using CPU maps of non-coherent buffers can get us in a lot of trouble,
> and WC maps are a reasonable alternative anyway. Guard against shooting
> ourselves in the foot by adding an assert, and comment.
Reviewed-by: Chris Wilson <
Quoting Kenneth Graunke (2017-07-05 21:56:54)
> ---
> src/mesa/drivers/dri/i965/brw_bufmgr.c | 15 +--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index
Quoting Jason Ekstrand (2017-07-05 18:21:08)
> static void
> anv_cmd_buffer_process_relocs(struct anv_cmd_buffer *cmd_buffer,
>struct anv_reloc_list *list)
> @@ -1450,6 +1484,14 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> impl->fd = -1;
>
cessfully mmaped the GTT.
Fixes: 314647c4c206 ("i965: Drop global bufmgr lock from brw_bo_map_*
functions.")
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/driver
Quoting Chad Versace (2017-06-20 18:08:14)
> On Mon 19 Jun 2017, Chris Wilson wrote:
> > Quoting Chad Versace (2017-06-19 19:42:16)
> > > On Mon 12 Jun 2017, Chris Wilson wrote:
> > > > brw_emit_mi_flush(brw);
> > > >
> > > > >
map.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 94 +++---
that buffer to be clflushed and any further CPU access to be
discarded.) To prevent this, simply disallow any CPU async mmap access.
Cases that want async CPU access to a non-LLC buffer should continue to
be allowed via their preferred snooping path.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co
) are not permitted
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c| 19 +--
src/mesa/drivers/dri/i965/brw_bufmgr.h| 10 ++
Quoting Kenneth Graunke (2017-06-20 00:33:35)
> On Monday, June 19, 2017 3:55:01 AM PDT Chris Wilson wrote:
> > If we need to force a cache domain transition (e.g. a buffer was in the
> > CPU domain and we want to access it via WC) then we need to trigger a
> > clflush. T
Quoting Jason Ekstrand (2017-06-19 22:00:45)
> On Mon, Jun 19, 2017 at 12:53 PM, Chris Wilson <ch...@chris-wilson.co.uk>
> wrote:
>
> Quoting Kenneth Graunke (2017-06-19 20:28:31)
> > On Monday, June 19, 2017 3:06:48 AM PDT Chris Wilson wrote:
> >
Quoting Chad Versace (2017-06-19 19:42:16)
> On Mon 12 Jun 2017, Chris Wilson wrote:
> > brw_emit_mi_flush(brw);
> >
> > switch (fence->type) {
> > @@ -335,6 +363,8 @@ brw_gl_fence_sync(struct gl_context *ctx, struct
> > gl_sync_object *_s
Quoting Kenneth Graunke (2017-06-19 20:28:31)
> On Monday, June 19, 2017 3:06:48 AM PDT Chris Wilson wrote:
> > - if (target != batch->bo)
> > - add_exec_bo(batch, target);
> > + if (target != batch->bo) {
> > + unsigned int index = add_exec_bo(
Quoting Eric Engestrom (2017-06-19 15:30:46)
> On Monday, 2017-06-19 13:02:11 +0100, Chris Wilson wrote:
> > Quoting Emil Velikov (2017-06-19 12:43:42)
> > > Hi Chris,
> > >
> > > On 19 June 2017 at 12:32, Chris Wilson <ch...@chris-wilson.co.uk>
Quoting Emil Velikov (2017-06-19 12:43:42)
> Hi Chris,
>
> On 19 June 2017 at 12:32, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> > On linux/x86_64, calling into the kernel is just a single instruction
> > with the parameters passed via registers. We can therefr
a slight impedance mismatch with the kernel interface in
that it converts the -errno return into -1 + errno, which we immediately
convert back into -errno for ourselves!
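
The round trip described here can be sketched as follows. The wrapper name `sketch_ioctl` is hypothetical; it only shows the conversion from the libc convention (-1 with the code in errno) back to the kernel's -errno convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* The kernel returns -errno; libc turns that into -1 and sets errno;
 * this wrapper folds it straight back into -errno for the caller. */
static int
sketch_ioctl(int fd, unsigned long request, void *arg)
{
   int ret = ioctl(fd, request, arg);
   return ret == -1 ? -errno : ret;
}
```

For instance, calling `sketch_ioctl(-1, 0, NULL)` on an invalid descriptor should yield `-EBADF` rather than -1.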
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <mat
Bypass libc's PLT indirection and its impedance mismatch by emitting the
single syscall instruction ourselves for linux/x86_64.
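
A hedged sketch of what emitting the instruction directly might look like, using GCC/Clang extended inline assembly and the x86_64 Linux syscall register convention (number in rax, arguments in rdi, rsi, rdx; rcx and r11 clobbered). The name `sketch_raw_ioctl` is hypothetical, and the libc fallback for other platforms is an addition made so the example builds elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>

/* Returns 0 or a positive value on success and -errno on failure. */
static inline int
sketch_raw_ioctl(int fd, unsigned long request, void *arg)
{
#if defined(__linux__) && defined(__x86_64__)
   long ret;

   /* One syscall instruction; the kernel's -errno comes back in rax. */
   __asm__ volatile("syscall"
                    : "=a"(ret)
                    : "a"((long)SYS_ioctl), "D"((long)fd),
                      "S"(request), "d"(arg)
                    : "rcx", "r11", "memory");
   return (int)ret;
#else
   /* Elsewhere, go through libc and convert -1 + errno back to -errno. */
   int ret = ioctl(fd, request, arg);
   return ret == -1 ? -errno : ret;
#endif
}
```

Both paths present the same -errno convention to the caller, so no conversion is needed at the call site.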
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Eks
Reuse the same query object buffer for multiple queries within the same
batch.
A task for the future is propagating the GL_NO_MEMORY errors.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com
To simplify replacement later, replace repeated use of explicit 0/1 with
local variables of the same value.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/ge
, the busy-ioctl is lightweight!).
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_context.h | 4 +--
src/mesa/drivers/dri/i965/ge
Lots of places open-coded the assumed layout of the predicate/results
within the query object, replace those with simple helpers.
v2: Fix function decl style.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <mat
using the last known flag, the query is split into two.
v2: Check against external bo before trusting our own tracking.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri
If we need to force a cache domain transition (e.g. a buffer was in the
CPU domain and we want to access it via WC) then we need to trigger a
clflush. This overrides the use of MAP_ASYNC as we call into the kernel
to change domains on the whole object.
Signed-off-by: Chris Wilson <ch...@ch
(where the results are used directly by the GPU and not
CPU).
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c| 23 +++
src/mes
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt T
Be consistent in passing along brw_context rather than switching between
that and gl_context.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
src/mesa/drivers/dri/i965/gen6_queryobj.c | 32 ++-
1 file changed, 14 insertions(+), 18 deletions(-)
diff
will have more examples of non-reusable buffers in the
near future.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 4
src/mesa/drivers/dri/i965/brw_bufmgr.h | 5 +
2 files changed,
: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid
a post-processing loop to fixup the relocations.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jas
To avoid a forward declaration in the next patch, move the definition of
add_exec_bo() earlier.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@int
Borrow a trick from anv, and use the last known index for the bo to skip
a search of the batch->exec_bo when adding a new relocation. In defence
against the bo being used in multiple batches simultaneously, we check
that this slot exists and points back to us.
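
The check described above can be sketched roughly as below. The structures, the fixed-size array and the linear-search fallback are assumptions of this example rather than the driver's actual code; only the idea of validating the cached index (slot exists and points back at the bo) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_EXEC 64

/* Hypothetical stand-ins for the bo and the batch's exec list. */
struct sketch_bo {
   unsigned index;   /* last slot this bo was known to occupy (a hint) */
};

struct sketch_batch {
   struct sketch_bo *exec_bos[SKETCH_MAX_EXEC];
   unsigned exec_count;
};

static unsigned
sketch_add_exec_bo(struct sketch_batch *batch, struct sketch_bo *bo)
{
   /* Fast path: trust the hint only if that slot exists in this batch
    * and points back at this bo. */
   unsigned index = bo->index;
   if (index < batch->exec_count && batch->exec_bos[index] == bo)
      return index;

   /* The hint may have been set by another batch; search this one. */
   for (index = 0; index < batch->exec_count; index++) {
      if (batch->exec_bos[index] == bo)
         return index;
   }

   /* Not present yet: append and remember where it went. */
   assert(batch->exec_count < SKETCH_MAX_EXEC);
   index = batch->exec_count++;
   batch->exec_bos[index] = bo;
   bo->index = index;
   return index;
}
```

When a bo is shared between two batches, the hint written by one batch fails the validation in the other, which then falls back to the search instead of returning a wrong slot.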
Signed-off-by: Chris Wilson
-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 53 +--
1 file chang
Quoting Kenneth Graunke (2017-06-15 19:45:19)
> On Thursday, June 15, 2017 1:41:39 AM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-06-14 22:49:01)
> > > On Friday, June 9, 2017 6:01:33 AM PDT Chris Wilson wrote:
> > > > If we know the bo is idle (that is
Quoting Jason Ekstrand (2017-06-15 16:58:13)
> On Thu, Jun 15, 2017 at 4:15 AM, Chris Wilson <ch...@chris-wilson.co.uk>
> wrote:
>
> Quoting Kenneth Graunke (2017-06-14 21:44:45)
> > If Chris is right, and what we're really seeing is that MI_SET_CONTEXT
>
Quoting Jason Ekstrand (2017-06-15 16:59:19)
> On Thu, Jun 15, 2017 at 4:11 AM, Chris Wilson <ch...@chris-wilson.co.uk>
> wrote:
> The kernel does have a LRI after a flush before signaling the batch is
> complete. I don't see a need to add another...
>
>
Quoting Kenneth Graunke (2017-06-14 21:44:45)
> On Tuesday, June 13, 2017 2:53:20 PM PDT Jason Ekstrand wrote:
> > As I've been working on converting more things in the GL driver over to
> > blorp, I've been highly annoyed by all of the hangs on Haswell. About one
> > in 3-5 Jenkins runs would
Quoting Kenneth Graunke (2017-06-14 21:41:56)
> On Tuesday, June 13, 2017 2:53:24 PM PDT Jason Ekstrand wrote:
> > From: Topi Pohjolainen
> >
> > v2 (Jason Ekstrand):
> > - Take a flags parameter to control the flushes
> > - Refactoring
> >
> > Signed-off-by: Topi
Quoting Kenneth Graunke (2017-06-15 00:19:35)
> On Friday, June 9, 2017 6:01:40 AM PDT Chris Wilson wrote:
> > Reuse the same query object buffer for multiple queries within the same
> > batch.
> >
> > A task for the future is propagating the GL_NO_MEMORY errors.
>
Quoting Kenneth Graunke (2017-06-15 00:13:14)
> On Friday, June 9, 2017 6:01:38 AM PDT Chris Wilson wrote:
> > On non-llc architectures where we are primarily reading back the
> > results of the GPU queries, then we can improve performance by using a
> > cacheable m
Quoting Kenneth Graunke (2017-06-14 23:50:12)
> On Friday, June 9, 2017 6:01:37 AM PDT Chris Wilson wrote:
> > If we map the bo upon creation, we can avoid the latency of mmapping it
> > when querying, and later use the asynchronous, persistent map of the
> > predicat
Quoting Kenneth Graunke (2017-06-14 23:10:38)
> On Friday, June 9, 2017 6:01:36 AM PDT Chris Wilson wrote:
> > diff --git a/src/mesa/drivers/dri/i965/hsw_queryobj.c
> > b/src/mesa/drivers/dri/i965/hsw_queryobj.c
> > index b81ab3b6f8..cb1a2df52d 100644
> > ---
Quoting Kenneth Graunke (2017-06-14 22:49:01)
> On Friday, June 9, 2017 6:01:33 AM PDT Chris Wilson wrote:
> > If we know the bo is idle (that is we have not submitted a command buffer
> > referencing this bo since the last query) we can skip asking the kernel.
> > Note th
Quoting Jason Ekstrand (2017-06-13 22:53:20)
> As I've been working on converting more things in the GL driver over to
> blorp, I've been highly annoyed by all of the hangs on Haswell. About one
> in 3-5 Jenkins runs would hang somewhere. After looking at about a
> half-dozen error states, I
Quoting Kenneth Graunke (2017-06-13 01:33:32)
> We can promote INVALIDATE_RANGE_BIT to INVALIDATE_BUFFER_BIT if the
> range contains the only valid data in the buffer. This allows us to
> orphan the storage, instead of doing stall avoidance blits.
> ---
>
Quoting Kenneth Graunke (2017-06-13 01:33:31)
> When writing a region of a buffer via glBufferSubData(), we can write
> the data asynchronously if the destination doesn't contain any data.
> Even if it's busy, the data was undefined, so the new data is fine too.
>
> Decreases the number of stall
Quoting Kenneth Graunke (2017-06-13 01:33:30)
Every alloc_buffer_object() is followed by marking the valid range. I
could not find a missed path, so
Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
At some point, mesa will have to get an rbtree and then it will be
interesting
Quoting Kenneth Graunke (2017-06-13 01:33:29)
> This doesn't do anything yet, but soon we'll want to know whether an
> access to a buffer section may write that data, or simply reads it.
This series doesn't go further than a boolean, but would it be worth
feeding through map flags? The immediate
as is currently the case.
Reported-by: Sergi Granell <xerpi.g...@gmail.com>
Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension")
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Sergi Granell <xerpi.g...@gmail.com>
Cc: Rob Clark <robdcl...@gmail.
Quoting Chris Wilson (2017-06-09 14:01:34)
> The fence bo may be reused as an input fence to another batch, which
> will cause us to treat it as busy until that subsequent batch is idle.
> We only need to check if the fence has been signaled, which we can do by
> checking the
, the busy-ioctl is lightweight!).
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_context.h | 4 +--
src/mesa/drivers/dri/i965/ge
Reuse the same query object buffer for multiple queries within the same
batch.
A task for the future is propagating the GL_NO_MEMORY errors.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com
When created, buffers are idle, so mark them as such to save an early
ioctl or mistakenly assuming the fresh buffer is busy.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mes
(where the results are used directly by the GPU and not
CPU).
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c| 21 +
src/mes
is signaled.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_sync.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/
To simplify replacement later, replace repeated use of explicit 0/1 with
local variables of the same value.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/ge
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt T
Lots of places open-coded the assumed layout of the predicate/results
within the query object, replace those with simple helpers.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mes
using the last known flag, the query is split into two.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Matt Turner <matts...@gmail.com>
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 17 +
src/mesa/drivers/dri
To avoid a forward declaration in the next patch, move the definition of
add_exec_bo() earlier.
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: Jason Ekstrand <jason.ekstr...@intel.com>
---
src/mesa/drivers/dri/i965/intel_b