Quoting Eric Engestrom (2019-10-31 14:06:40)
> On Thursday, 2019-10-31 07:35:04 +0000, Chris Wilson wrote:
> > The system can be disabling HW acceleration unbeknownst to the user,
> > leading to a long debug session trying to work out which component is
> > fai
The system can be disabling HW acceleration unbeknownst to the user,
leading to a long debug session trying to work out which component is
failing. A quick mention that it is the environment override would be
very useful.
---
src/egl/main/egldriver.c | 2 ++
1 file changed, 2 insertions(+)
diff
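A minimal sketch of the kind of notice described above, assuming the override is a boolean-style environment variable (Mesa's LIBGL_ALWAYS_SOFTWARE is one such variable, but the patch body is not shown here, so the variable, the helper names and the message text are all hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: treat unset, empty, "0" and "false" as disabled,
 * and any other value as enabled. */
static bool
env_value_is_true(const char *value)
{
   if (value == NULL || value[0] == '\0')
      return false;
   return strcmp(value, "0") != 0 && strcmp(value, "false") != 0;
}

/* Hypothetical helper: report whether the named override is active and,
 * if so, say so on stderr so the user knows why HW acceleration is off. */
static bool
software_override_requested(const char *var_name)
{
   assert(var_name != NULL);

   if (!env_value_is_true(getenv(var_name)))
      return false;

   fprintf(stderr, "%s is set: hardware acceleration disabled by the environment\n",
           var_name);
   return true;
}
```

The point is only that the message names the environment variable, so the user does not have to guess which component turned acceleration off.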
Quoting Daniel Stone (2019-08-30 14:13:08)
> Hi,
>
> On Thu, 29 Aug 2019 at 21:35, Chris Wilson wrote:
> > Quoting Kristian Høgsberg (2019-08-29 21:20:12)
> > > On Thu, Aug 29, 2019 at 12:44 PM Chris Wilson
> > > wrote:
> > > > Quoting Kenneth Gra
Quoting Kristian Høgsberg (2019-08-29 21:20:12)
> On Thu, Aug 29, 2019 at 12:44 PM Chris Wilson
> wrote:
> >
> > Quoting Kenneth Graunke (2019-08-29 19:52:51)
> > > Some cons:
> > >
> > > - Moving bug reports between the kernel and Mesa would b
Quoting Kenneth Graunke (2019-08-29 19:52:51)
> Some cons:
>
> - Moving bug reports between the kernel and Mesa would be harder.
> We would have to open a bug in the other system. (Then again,
> moving bugs between Mesa and X or Wayland would be easier...)
All that I ask is that we move the
Not all hardware is made equal and some does not have the full
complement of 48b of address space. Ask what the actual size of the virtual
address space allocated for contexts is, and bail if that is not enough to
satisfy our static partitioning needs.
Cc: Kenneth Graunke
---
Quoting Jordan Justen (2019-03-31 10:57:09)
> On 2019-03-25 03:58:59, Chris Wilson wrote:
> > iris currently uses two distinct GEM contexts to have distinct logical
> > HW contexts for the compute and render pipelines. However, using two
> > distinct GEM contexts implies t
Quoting Jordan Justen (2019-03-31 10:53:06)
> Where are these changes from (repo/commit)? It could be good to
> reference in the commit message.
They don't exist in drm-next yet, so they don't have a reference.
-Chris
___
mesa-dev mailing list
Quoting Kenneth Graunke (2019-03-26 17:01:57)
> On Tuesday, March 26, 2019 12:16:20 AM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2019-03-26 05:52:10)
> > > On Monday, March 25, 2019 3:58:59 AM PDT Chris Wilson wrote:
> > > > iris currently uses two distinct
Quoting Kenneth Graunke (2019-03-26 05:52:10)
> On Monday, March 25, 2019 3:58:59 AM PDT Chris Wilson wrote:
> > iris currently uses two distinct GEM contexts to have distinct logical
> > HW contexts for the compute and render pipelines. However, using two
> > distinct
We want to opt out of the automatic GPU recovery and replay performed by
the kernel of a guilty context after a GPU reset as our incremental
batch construction very often implies that subsequent batches after a GPU
reset are incomplete and will trigger fresh GPU hangs. As we are aware
of how we need
For use in GPU recovery and pipeline construction.
---
include/drm-uapi/i915_drm.h | 389 +---
1 file changed, 317 insertions(+), 72 deletions(-)
diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index d2792ab3640..59baacd265d 100644
---
iris currently uses two distinct GEM contexts to have distinct logical
HW contexts for the compute and render pipelines. However, using two
distinct GEM contexts implies that they are distinct timelines, yet as
they are a single GL context that implies they belong to a single
timeline from the
Quoting Rafael Antognolli (2019-03-05 19:33:03)
> On Tue, Mar 05, 2019 at 09:40:20AM +0000, Chris Wilson wrote:
> > Not all commands support being preempted as they execute, and for those
> > make sure we at least check for being preempted before we start so as to
> > try and
Not all commands support being preempted as they execute, and for those
make sure we at least check for being preempted before we start so as to
try and minimise the latency of whoever is more important than
ourselves.
Cc: Jari Tahvanainen
Cc: Rafael Antognolli
Cc: Kenneth Graunke
---
Always
Quoting Emil Velikov (2019-02-28 11:44:28)
> On Tue, 26 Feb 2019 at 21:52, Chris Wilson wrote:
> >
> > A few of the GEM drivers provide matching ioctls to allow control of
> > their bo caches. Hook these up to APPLE_object_purgeable to allow
> > clients to discard
Quoting Eric Anholt (2019-02-27 02:19:32)
> Overall, I'm hesitant to land support for actually doing anything with
> APPLE_object_purgeable when there are no functional tests of it. I
> don't mean to actually have tests that force purging, but at least
> making sure that we don't accidentally
A few of the GEM drivers provide matching ioctls to allow control of
their bo caches. Hook these up to APPLE_object_purgeable to allow
clients to discard video memory under pressure where they are able to
fallback to restoring content themselves, e.g. from their own (presumably
compressed, on
Quoting Lionel Landwerlin (2019-02-21 12:57:09)
> I did not find the PRM bit that says it must be 64b aligned, but I can
> see that's what i915 checks.
>
> Chris: If you have a pointer to it, I could add the quote.
In amongst the register specs,
PLANE_STRIDE:
For Linear memory, this field
Quoting Lionel Landwerlin (2019-02-18 15:06:15)
> On 15/02/2019 14:43, Samuel Iglesias Gonsálvez wrote:
> > There are formats which bpp are not aligned to a power-of-two and
> > that can cause problems in the checks we do.
> >
> > The cacheline size was a requirement for using the BLT engine,
To make wedging even more likely, we use a new "no recovery" context
parameter that tells the kernel to not even attempt to replay any
batches in flight against the default context image, as experience shows
the HW is not always robust enough to cope with the conflicting state.
This allows us to
XXX Not in drm-next XXX
Pull i915_drm.h to include
commit ba4fda620a5f7db521aa9e0262cf49854c1b1d9c (HEAD -> drm-intel-next-queued,
drm-intel/drm-intel-next-queued)
Author: Chris Wilson
Date: Mon Feb 18 10:58:21 2019 +
drm/i915: Optionally disable automatic recovery after a GPU re
If we hang the GPU and end up banning our context, we will no longer be
able to submit and abort with an error (exit(1) no less). As we submit
minimal incremental batches that rely on the logical context state of
previous batches, we can not rely on the kernel's recovery mechanism
which tries to
Introduce a new debug option to wilfully cause the GPU to hang and for
the kernel to accuse us of being neglectful.
---
src/intel/Makefile.sources| 2 +
src/intel/common/gen_debug.c | 1 +
src/intel/common/gen_debug.h | 1 +
Object handles are local to the device fd, so double check we are not
mixing together objects from multiple screens on execbuf submission.
Cc: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git
Quoting Chris Wilson (2019-02-14 12:05:00)
> If we hang the GPU and end up banning our context, we will no longer be
> able to submit and abort with an error (exit(1) no less). As we submit
> minimal incremental batches that rely on the logical context state of
> previous batches, we
The kernel tries to repair a hanging context by restoring the default
state (or else we have discovered that the context may be unusably
corrupt by the reset). However, this is unsuitable for mesa as it
(rightfully) assumes that the context image contains the state it has
earlier set and so only
Quoting Kenneth Graunke (2019-01-08 20:17:01)
> On Tuesday, January 8, 2019 3:11:37 AM PST Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-01-08 11:03:26)
> > > Hi Andrii,
> > >
> > > Although I think what these patches do makes sense, I think it
Quoting andrey simiklit (2019-01-08 16:00:45)
> On Tue, Jan 8, 2019 at 1:11 PM Chris Wilson wrote:
>
> Quoting Lionel Landwerlin (2019-01-08 11:03:26)
> > Hi Andrii,
> >
> > Although I think what these patches do makes sense, I think it's missing
Quoting Lionel Landwerlin (2019-01-08 11:03:26)
> Hi Andrii,
>
> Although I think what these patches do makes sense, I think it's missing
> the bigger picture.
> There is a lot more state that gets lost if we have to revert all of the
> emitted commands.
> A quick look at
Quoting Chris Wilson (2018-10-24 09:40:08)
> If we hang the GPU and end up banning our context, we will no longer be
> able to submit and abort with an error (exit(1) no less). As we submit
> minimal incremental batches that rely on the logical context state of
> previous batches, we
Quoting Mathias Fröhlich (2018-11-23 17:14:45)
> Hi Chris,
>
> On Friday, 23 November 2018 16:12:38 CET Chris Wilson wrote:
> >
> > Something to note here is that valgrind reports
> > (piglit/bin/drawoverhead):
> >
> > ==492== Use of uninitialised valu
Quoting mathias.froehl...@gmx.net (2018-11-23 08:07:29)
> From: Mathias Fröhlich
>
> Factor out vertex array setup routines from the array state atom.
> The factored functions will be used in feedback rendering in the
> next change.
>
> Signed-off-by: Mathias Fröhlich
> ---
>
Quoting Rafael Antognolli (2018-10-29 17:19:53)
> +void
> +brw_enable_obj_preemption(struct brw_context *brw, bool enable)
> +{
> + const struct gen_device_info *devinfo = &brw->screen->devinfo;
> + assert(devinfo->gen >= 9);
> +
> + if (enable == brw->object_preemption)
> + return;
> +
> +
Quoting Kenneth Graunke (2018-10-19 18:51:36)
> Usually when making a new file, people copy some random other file
> to get the copyright header comments. Unfortunately, some of them
> are commented in a decades-old style, are word wrapped poorly, or
> worse, have a few subtle variations in the
Quoting Ian Romanick (2018-10-10 00:24:00)
> On 10/09/2018 06:24 AM, Chris Wilson wrote:
> > The userspace driver does not exist in isolation and occasionally
> > depends on kernel uapi, and so it is useful in bug reports to include
> > that information. (radeonsi, r600 and
The userspace driver does not exist in isolation and occasionally
depends on kernel uapi, and so it is useful in bug reports to include
that information. (radeonsi, r600 and radv already include utsname)
References: https://bugs.freedesktop.org/show_bug.cgi?id=108282
---
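For reference, collecting that information is a single POSIX uname() call. A small sketch follows; the helper name and the output layout are illustrative and not taken from the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: write "sysname release version machine" into buf.
 * Returns the snprintf() result, or -1 if uname() fails. */
static int
format_kernel_info(char *buf, size_t len)
{
   struct utsname uts;

   assert(buf != NULL && len > 0);

   if (uname(&uts) < 0) {
      buf[0] = '\0';
      return -1;
   }

   return snprintf(buf, len, "%s %s %s %s",
                   uts.sysname, uts.release, uts.version, uts.machine);
}
```

The resulting string can then be appended to whatever the driver already prints for bug reports, such as its renderer or version string.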
Quoting Kenneth Graunke (2018-10-06 02:57:29)
> On Tuesday, October 2, 2018 11:06:23 AM PDT Chris Wilson wrote:
> > Reuse the same query object buffer for multiple queries within the same
> > batch.
> >
> > A task for the future is propagating the GL_NO_MEMORY
If we map the bo upon creation, we can avoid the latency of mmapping it
when querying, and later use the asynchronous, persistent map of the
predicate to do a quick query.
v2: Inline the wait on results; it disappears shortly in the next few
patches.
Signed-off-by: Chris Wilson
Cc: Kenneth
using the last known flag, the query is split into two.
v2: Check against external bo before trusting our own tracking.
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 40 --
src/mesa/drivers/dri/i965
, the busy-ioctl is lightweight!).
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
src/mesa/drivers/dri/i965/brw_context.h | 4 +-
src/mesa/drivers/dri/i965/gen6_queryobj.c | 54 ++-
2 files changed, 25 insertions(+), 33 deletions(-)
diff --git a/src/mesa
Skip the next check for brw_batch_references() by recording when we
flush the query.
---
src/mesa/drivers/dri/i965/gen6_queryobj.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c
b/src/mesa/drivers/dri/i965/gen6_queryobj.c
(where the results are used directly by the GPU and not
CPU).
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
src/mesa/drivers/dri/i965/brw_bufmgr.c| 24 +++
src/mesa/drivers/dri/i965/brw_bufmgr.h| 2 ++
src/mesa/drivers/dri/i965/gen6_queryobj.c
Lots of places open-coded the assumed layout of the predicate/results
within the query object, replace those with simple helpers.
v2: Fix function decl style.
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
.../drivers/dri/i965/brw_conditional_render.c | 10
Be consistent in passing along brw_context rather than switching between
that and gl_context.
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 +++
1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/src/mesa/drivers/dri/i965
To simplify replacement later, replace repeated use of explicit 0/1 with
local variables of the same value.
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
src/mesa/drivers/dri/i965/gen6_queryobj.c | 30 ---
1 file changed, 16 insertions(+), 14
Reuse the same query object buffer for multiple queries within the same
batch.
A task for the future is propagating the GL_NO_MEMORY errors.
Signed-off-by: Chris Wilson
Cc: Kenneth Graunke
Cc: Matt Turner
---
src/mesa/drivers/dri/i965/brw_context.c | 3 ++
src/mesa/drivers/dri/i965
Quoting andrey simiklit (2018-08-21 13:00:57)
> Hi all,
>
> The bug for this issue was created:
> https://bugs.freedesktop.org/show_bug.cgi?id=107626
What about something like
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c
b/src/mesa/drivers/dri/i965/brw_draw.c
index
As a prelude to handling large address spaces, first allow ourselves the
luxury of handling the full 4G.
Reported-by: Andrey Simiklit
Cc: Kenneth Graunke
---
src/mesa/drivers/dri/i965/brw_context.h | 2 +-
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 9 +
Quoting Kenneth Graunke (2018-09-04 16:13:59)
> On Tuesday, September 4, 2018 2:57:29 AM PDT Lionel Landwerlin wrote:
> > Both brw_bo_map_cpu() & brw_bo_map_wc() assert if mapping the
> > underlying BO fails. Failing back to brw_bo_map_gtt() doesn't seem to
> > make any sense for that reason.
> >
->tiling_mode != I915_TILING_NONE);
> + assert((flags & MAP_COHERENT) == 0);
Lucky you only have to support platforms with working WC :)
The explanation matches with my understanding,
Reviewed-by: Chris Wilson
-Chris
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Quoting Lionel Landwerlin (2018-09-04 10:57:29)
> Both brw_bo_map_cpu() & brw_bo_map_wc() assert if mapping the
> underlying BO fails. Failing back to brw_bo_map_gtt() doesn't seem to
> make any sense for that reason.
>
> We also only call brw_bo_map_gtt() for tiled buffers which as far as
> we
Quoting Lionel Landwerlin (2018-09-04 01:07:12)
> Talking with Ken about this, it seems we might not actually coherent use
> GTT maps because those are just for buffer (which are allocated linear).
> GTT maps are only used with tiled buffers.
>
> So we most likely don't even need this patch.
>
Quoting Lionel Landwerlin (2018-08-31 12:32:23)
> On 31/08/2018 12:22, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2018-08-31 12:16:19)
> >> We would need a fairly recent kernel (drm-tip?) to test this in CI.
> > Unpatched mesa, assumes all is fine.
> >
Quoting Lionel Landwerlin (2018-08-31 12:16:19)
> We would need a fairly recent kernel (drm-tip?) to test this in CI.
Unpatched mesa, assumes all is fine.
Post-patch mesa, assumes all is broken.
So we can quickly see if anything actually fails if a persistent GGTT
mmap is rejected. Which is the
On more recent HW, the indirect writes via the GGTT are internally
buffered presenting a nuisance to trying to use them for persistent
client maps. (We cannot guarantee that any write by the client will
then be visible to either the GPU or third parties in a timely manner,
leading to corruption.)
Quoting Anuj Phogat (2018-08-28 18:53:59)
> h/w specification requires this bit to be always set.
>
> Suggested-by: Kenneth Graunke
> Signed-off-by: Anuj Phogat
> ---
> src/mesa/drivers/dri/i965/brw_defines.h | 4
> src/mesa/drivers/dri/i965/brw_state_upload.c | 7 +++
> 2 files
Since v3.16 (though universal access was only enabled by default in v4.6),
the kernel has offered the ability to wrap any system memory (i.e. RAM
and not I/O mapped memory) into an object that can be used by the GPU. The
caveat is that this object is marked as cache coherent (so that the client
Recent kernels do exclude snoop access for i965g/i965gm as it does not
work as advertised. However to avoid depending on a recent kernel for
old hardware, mark the presence of the bug in gen_device_info.
See kernel commit df0700e53047662c167836bd6fdeea55d5d8dcfa
Author: Chris Wilson
Date: Wed
All GEN GPUs can bind to any piece of memory (thanks UMA), and so through
a special ioctl we can map a chunk of page-aligned client memory into
the GPU address space. However, not all GEN are equal. Some have
cache-coherency between the CPU and the GPU, whilst the others are
incoherent and rely on
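The page-alignment requirement mentioned above can be checked before handing client memory to the kernel. A sketch, assuming the page size is passed in by the caller (for example from sysconf(_SC_PAGESIZE)); the helper name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if [ptr, ptr + size) starts on a page boundary
 * and covers a whole, non-zero number of pages. */
static bool
range_is_page_aligned(const void *ptr, size_t size, size_t page_size)
{
   /* page_size must be a non-zero power of two for the mask trick below. */
   assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

   const uintptr_t mask = page_size - 1;

   return size != 0 &&
          ((uintptr_t)ptr & mask) == 0 &&
          (size & mask) == 0;
}
```

A client buffer that fails this test would have to go through a staging copy rather than being wrapped directly.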
Technically only for Sandybridge and later core designs, but finally we
can claim support for allowing clients to create glBufferObjects from
their own memory.
---
docs/relnotes/18.3.0.html | 1 +
1 file changed, 1 insertion(+)
diff --git a/docs/relnotes/18.3.0.html b/docs/relnotes/18.3.0.html
Quoting Michal Srb (2018-08-15 09:22:19)
> Hi,
>
> This is my first attempt to review a patch for Mesa, so please take it with a
> grain of salt.
>
> On Tuesday, 14 August 2018 20:21:40 CEST Chris Wilson wrote:
> > @@ -504,6 +506,24 @@ bo_alloc_internal(struct brw_bufmgr *b
allocations, while in the mean time making sure that
we do not waste any extra pages on them.
Signed-off-by: Chris Wilson
Cc: Sergii Romantsov
Cc: Lionel Landwerlin
Cc: Kenneth Graunke
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 24
1 file changed, 24 insertions
Quoting Lionel Landwerlin (2018-07-30 17:08:47)
> On 30/07/18 16:45, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2018-07-30 16:28:37)
> >> Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end
> >> of the context image.
> > Hmm, do you think
Quoting Lionel Landwerlin (2018-07-30 16:28:37)
> Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end
> of the context image.
Hmm, do you think we should perhaps include the BBE in the protocontext
we create in the kernel?
-Chris
Quoting Sergii Romantsov (2018-07-25 10:37:29)
> Hello, Chris.
> Your variant also works.
> But i wonder about comment:
> /* If we don't have caching at this size, don't actually round the
> * allocation up.
> */
> if (bucket == NULL) {
>
> Has it any sense now? If 'no' - will
Quoting Sergii Romantsov (2018-07-25 08:42:55)
> Hello,
> here is a backtrace:
...
Please try:
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c
b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 09d45e30ecc..8274c2e0b2f 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++
Quoting Lionel Landwerlin (2018-07-24 13:45:18)
> On 24/07/18 13:42, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2018-07-24 13:34:57)
> >> That looks correct to me (and we do the same in Anv).
> >> Also a bit baffled that we haven't run into issues earlier :(
>
Quoting Lionel Landwerlin (2018-07-24 13:34:57)
> That looks correct to me (and we do the same in Anv).
> Also a bit baffled that we haven't run into issues earlier :(
All the allocations should be in multiples of page size, alignment less
than a page size should be a no-op. Tracking down who
Quoting Nanley Chery (2018-07-23 18:17:15)
> Satisfy the BLT engine's row pitch limitation on the destination
> miptree. The destination miptree is untiled, so its row_pitch will be
> slightly less than or equal to the source miptree's row_pitch. Use the
> source miptree's row_pitch in
Quoting aravindan.muthuku...@intel.com (2018-07-20 09:32:57)
> From: "Muthukumar, Aravindan"
>
> The Patch here is to give control to user/ application to really
> decide what's the max GPU load it would put. If that can be
> known in advance, rpcs can be programmed accordingly.
> This
info, we see that non-linear tilings have
> widths greater than or equal to 128B.
Yup, we only have non-linear at this point and pitch has to be a multiple
of tiles.
> Cc:
Reviewed-by: Chris Wilson
-Chris
Quoting Nanley Chery (2018-07-12 18:28:16)
> Retile miptrees to a linear tiling less often. Retiling can cause issues
> with imported BOs.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738
> Suggested-by: Chris Wilson
> Cc:
Reviewed-by: Ch
Quoting Nanley Chery (2018-07-12 18:28:14)
> We'd like to reuse this helper.
>
> Cc:
Reviewed-by: Chris Wilson
-Chris
Quoting Lionel Landwerlin (2018-06-21 17:29:04)
> From: Jason Ekstrand
>
> This is a simple, invasive, liberally licensed red-black tree
> implementation. It's an invasive data structure similar to the
> Linux kernel linked-list where the intention is that you embed a
s/linked-list/rbtree/
Quoting Chris Wilson (2018-06-18 11:10:23)
> We should have all the kinks worked out and full-ppgtt now works
> reliably on gen7 (Ivybridge, Valleyview/Baytrail and Haswell). If we can
> let userspace have full control over their own ppgtt, it makes softpinning
> far more effect
We believe we have all the kinks worked out, even for the early
Valleyview devices, for whom we currently disable all ppgtt.
References: 62942ed7279d ("drm/i915/vlv: disable PPGTT on early revs v3")
Signed-off-by: Chris Wilson
Cc: Ville Syrjälä
Cc: Joonas Lahtinen
Reviewed-by: Joona
(due to better mm segregation). On the other hand, switching
over to a different GTT for every client does incur noticeable overhead.
Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Mika Kuoppala
Cc: Matthew Auld
Reviewed-by: Joonas Lahtinen
Cc: Jason Ekstrand
Cc: Kenneth Graunke
Quoting Nanley Chery (2018-06-14 19:46:09)
> On Thu, Jun 14, 2018 at 10:01:18AM -0700, Nanley Chery wrote:
> > On Thu, Jun 14, 2018 at 04:18:30PM +0300, Martin Peres wrote:
> > > This fixes screenshots using 8k+ wide display setups in modesetting.
> > >
> >
Quoting Lionel Landwerlin (2018-06-10 13:15:10)
> Now that we're softpinning the address of our BOs in anv & i965, the
> addresses selected start at the top of the addressing space. This is a
> problem for the current implementation of aubinator which uses only a
> 40bit mmapped address space.
>
Technically only for Sandybridge and later core designs, but finally we
can claim support for allowing clients to create glBufferObjects from
their own memory.
---
docs/relnotes/18.2.0.html | 1 +
1 file changed, 1 insertion(+)
diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
All GEN GPU can bind to any piece of memory (thanks UMA), and so through
a special ioctl we can map a chunk of page-aligned client memory into
the GPU address space. However, not all GEN are equal. Some have
cache-coherency between the CPU and the GPU, whilst the others are
incoherent and rely on
The primary benefit for this is that we get format conversion for
"free", along with detiling and cache flushing (most relevant for !llc).
Using the GPU does impose a bandwidth cost that is presumably better
used for rendering, hence we limit the use to readback into client
memory (not pbo) where
Since v3.16 (though universal access was only enabled by default in v4.6),
the kernel has offered the ability to wrap any system memory (i.e. RAM
and not I/O mapped memory) into an object that can be used by the GPU. The
caveat is that this object is marked as cache coherent (so that the client
Recent kernels do exclude snoop access for i965g/i965gm as it does not
work as advertised. However to avoid depending on a recent kernel for
old hardware, mark the presence of the bug in gen_device_info.
See kernel commit df0700e53047662c167836bd6fdeea55d5d8dcfa
Author: Chris Wilson
Date: Wed
the normal case.
>
> Fixes: 29ba502a4e28471f67e4e904ae503157087efd20 (i965: Use
> I915_EXEC_BATCH_FIRST when available.)
Reviewed-by: Chris Wilson
One thing that may have helped is if we do the post-execbuf processing
in submit_batch; execbuffer() then just becomes stuffing the pointers
into struct drm_i915_g
Quoting mathias.froehl...@gmx.net (2018-05-17 07:38:27)
> From: Mathias Fröhlich
>
> The merge_inputs function handles that part that changes when the
> inputs change. The clear_buffers function triggers when we may need
> a new upload. Thus the merge_inputs can be limited to be once
> per
Quoting mathias.froehl...@gmx.net (2018-05-17 07:38:26)
> From: Mathias Fröhlich
>
> Avoid looping over all VARYING_SLOT_MAX urb_setup array
> entries from genX_upload_sbe. Prepare an array indirection
> to the active entries of urb_setup already in the compile
> step. On upload only walk the
Quoting Nanley Chery (2018-05-30 21:44:35)
> We previously retiled miptrees to work around limitations of the BLT
> engine. BLORP fallbacks can overcome these, so we no longer have need
> for retiling.
>
> Removing retiling fixes a number of problems. If the row pitch was too
> wide for the BLT
Commit 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit
addressing bugs with softpin.") tried to only emit the VF invalidate if
the high bits changed, but it accidentally always set need_invalidate to
true, causing it to unconditionally emit the pipe control before
every primitive.
The limit to the working set of a single batch is the amount we can fit
into the GTT. The GTT is a per-context address space, so query the
context rather than use an estimate based on the legacy global aperture.
Furthermore, we can fine tune our soft-limit based on the knowledge of
whether we are
Just a small series to put the new cache-line read back to good use for
ye olde Xorg on bxt (and older/newer with very similar effect).
From
4 trep @ 0.7007 msec ( 1430.0/sec): ShmPutImage 500x500 square
4000 trep @ 9.0367 msec ( 111.0/sec): ShmGetImage 500x500 square
to
Allow the tiled_memcpy backend to determine if it is able to copy
between the source and destination pixel buffer. This allows us to
eliminate some duplication in the callers, and permits us to be more
flexible in checking for compatible formats.
(Hmm, is sRGB handling right?)
---