Re: [Intel-gfx] [PATCH 8/9] drm/i915: Implement .get_format_info() hook for CCS
On Wed 04 Jan 2017, ville.syrj...@linux.intel.com wrote: > From: Ville Syrjälä> > SKL+ display engine can scan out certain kinds of compressed surfaces > produced by the render engine. This involved telling the display engine > the location of the color control surfae (CCS) which describes which > parts of the main surface are compressed and which are not. The location > of CCS is provided by userspace as just another plane with its own offset. > > By providing our own format information for the CCS formats, we should > be able to make framebuffer_check() do the right thing for the CCS > surface as well. > > Note that we'll return the same format info for both Y and Yf tiled > format as that's what happens with the non-CCS Y vs. Yf as well. If > desired, we could potentially return a unique pointer for each > pixel_format+tiling+ccs combination, in which case we immediately be > able to tell if any of that stuff changed by just comparing the > pointers. But that does sound a bit wasteful space wise. > > v2: Drop the 'dev' argument from the hook > v3: Include the description of the CCS surface layout > > Cc: Vandana Kannan > Cc: Daniel Vetter > Cc: Ben Widawsky > Cc: Jason Ekstrand > Reviewed-by: Ben Widawsky > Signed-off-by: Ville Syrjälä > --- > drivers/gpu/drm/i915/intel_display.c | 36 ++ > include/uapi/drm/drm_fourcc.h| 49 > > 2 files changed, 85 insertions(+) > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h > index 9e1bb7fabcde..4581e3d41e5c 100644 > --- a/include/uapi/drm/drm_fourcc.h > +++ b/include/uapi/drm/drm_fourcc.h > @@ -230,6 +230,55 @@ extern "C" { > #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3) > > /* > + * Intel color control surface (CCS) for render compression > + * > + * The framebuffer format must be one of the 8:8:8:8 RGB formats, > + * the main surface will be plane index 0 and will be Y/Yf-tiled, > + * the CCS will be plane index 1. > + * The paragraph below is... > + * Each byte of CCS contains 4 pairs of bits, with each pair of bits > + * matching an area of 8x4 pixels of the main surface. Which would seem > + * to match 2 cachelines containing 4x4 pixels each. The pairs bits within > + * the byte form a 2x2 grid, which thus matches a 16x8 pixel area of the > + * main surface. This is the 2x2 pattern the bits form (0=lsb, 7=msb): > + * --- > + * | 01 | 23 | > + * -- > + * | 45 | 67 | > + * --- ...almost correct. The hw docs state that each bit-pair of the CCS maps cacheline-pair, horizontally adjacent in the Y tile, of the primary surface. As a consequence, the remainder of the above paragraph is correct for 32-bit formats, but not others. This is not a nitpick, because Vulkan and EGL users may want to share dma_bufs with a fat format like R32G32B32A32_UNORM, and want to have CCS enabled when possible. As long as the users use the dma_buf only the 3D engine, and don't submit it to KMS, it's all safe. For those users, we should document the bit-pair/cacheline-pair relationship correctly. And then preface the following detailed explanation and nice ascii diagrams by saying "this applies only to the 32-bit formats". Here's the relevant PRM quote: The Color Control Surface (CCS) contains the compression status of the cache-line pairs. The compression state of the cache-line pair is specified by 2 bits in the CCS. Each CCS cache-line represents an area on the main surface of 16x16 sets of 128 byte Y-tiled cache-line-pairs. CCS is always Y tiled. See this Mesa comment for more details: https://cgit.freedesktop.org/mesa/mesa/tree/src/intel/isl/isl.c?h=17.0#n211 > + * Actually only the lower bit of the pair seems to have any effect. > + * No idea why. 0 in the lower bit would seem to mean not compressed, > + * and 1 is compressed. The interpreation of the main surface data > + * when the block is marked compressed is unknown as of now. If I recall correctly (it's been several months since I inspected the bits), the high bit is actually significant. Bit pattern 11 means that the data in primary surface's cacheline-pair is invalid; the 3D engine interprets the color of all pixels in that cacheline-pair to be the clear color stored in RENDER_SURFACE_STATE. Before the display engine can consume the surface, userspace is required to do a partial resolve, flushing the clear color into the primary surface. So it makes sense that the kernel would never observe that bit pattern in an incoming ccs surface, as long as userspace is doing its job correctly. And it makes sense that the display engine would ignore the high bit, because there is no way to provide the clear color to the display engine (at least according my reading of the docs).
[Intel-gfx] [PATCH] intel: Update intel-symbol-check
Fixes make check. The symbols below were missing: drm_intel_bufmgr_gem_can_disable_implicit_sync drm_intel_gem_bo_disable_implicit_sync drm_intel_gem_bo_enable_implicit_sync drm_intel_gem_bo_fence_exec Signed-off-by: Chad Versace <chadvers...@chromium.org> --- intel/intel-symbol-check | 4 1 file changed, 4 insertions(+) diff --git a/intel/intel-symbol-check b/intel/intel-symbol-check index 038c9829..2aa2d819 100755 --- a/intel/intel-symbol-check +++ b/intel/intel-symbol-check @@ -50,6 +50,7 @@ drm_intel_bufmgr_fake_init drm_intel_bufmgr_fake_set_exec_callback drm_intel_bufmgr_fake_set_fence_callback drm_intel_bufmgr_fake_set_last_dispatch +drm_intel_bufmgr_gem_can_disable_implicit_sync drm_intel_bufmgr_gem_enable_fenced_relocs drm_intel_bufmgr_gem_enable_reuse drm_intel_bufmgr_gem_get_devid @@ -69,6 +70,9 @@ drm_intel_decode_set_output_file drm_intel_gem_bo_aub_dump_bmp drm_intel_gem_bo_clear_relocs drm_intel_gem_bo_context_exec +drm_intel_gem_bo_disable_implicit_sync +drm_intel_gem_bo_enable_implicit_sync +drm_intel_gem_bo_fence_exec drm_intel_gem_bo_get_reloc_count drm_intel_gem_bo_map__cpu drm_intel_gem_bo_map__gtt -- 2.11.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [dinq] Request to merge in padovan's fence fd fix
Chris just pushed execbuffer fence fds to dinq. Could someone please get the below fix from airlied/drm-fixes into dinq? (What's the usual method here? Merge drm-fixes -> dinq, cherry-pick drm-fixes -> dinq, rebase?) Thanks commit 7e9081c5aac73b8a0bc22e0b3e7a12c3e9cf5256 Author: Gustavo Padovan <gustavo.pado...@collabora.com> Date: Fri Jan 13 12:22:09 2017 -0200 drm/fence: fix memory overwrite when setting out_fence fd Currently if the userspace declares a int variable to store the out_fence fd and pass it to OUT_FENCE_PTR the kernel will overwrite the 32 bits above the int variable on 64 bits systems. Fix this by making the internal storage of out_fence in the kernel a s32 pointer. Reported-by: Chad Versace <chadvers...@chromium.org> Signed-off-by: Gustavo Padovan <gustavo.pado...@collabora.com> Fixes: beaf5af48034 ("drm/fence: add out-fences support") Cc: Daniel Vetter <dan...@ffwll.ch> Cc: Rafael Antognolli <rafael.antogno...@intel.com> Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com> Acked-by: Laurent Pinchart <laurent.pinch...@ideasonboard.com> Reviewed-and-Tested-by: Chad Versace <chadvers...@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/1484317329-9293-1-git-send-email-gust...@padovan.org ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 13/14] drm/i915: Enable userspace to opt-out of implicit fencing
On Thu 26 Jan 2017, Chris Wilson wrote: > On Wed, Jan 25, 2017 at 12:38:32PM -0800, Chad Versace wrote: > > On Mon 14 Nov 2016, Chris Wilson wrote: > > > Userspace is faced with a dilemma. The kernel requires implicit fencing > > > to manage resource usage (we always must wait for the GPU to finish > > > before releasing its PTE) and for third parties. However, userspace may > > > wish to avoid this serialisation if it is either using explicit fencing > > > between parties and wants more fine-grained access to buffers (e.g. it > > > may partition the buffer between uses and track fences on ranges rather > > > than the implicit fences tracking the whole object). It follows that > > > userspace needs a mechanism to avoid the kernel's serialisation on its > > > implicit fences before execbuf execution. > > > > > > The next question is whether this is an object, execbuf or context flag. > > > Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on > > > shared winsys buffers, but implicit fencing on internal surfaces) > > > require a per-object level flag. Given that this flag need to be only > > > set once for the lifetime of the object, this reduces the convenience of > > > having an execbuf or context level flag (and avoids having multiple > > > pieces of uABI controlling the same feature). > > > > > > Incorrect use of this flag will result in rendering corruption and GPU > > > hangs - but will not result in use-after-free or similar resource > > > tracking issues. > > > > > > Serious caveat: write ordering is not strictly correct after setting > > > this flag on a render target on multiple engines. This affects all > > > subsequent GEM operations (execbuf, set-domain, pread) and shared > > > dma-buf operations. A fix is possible - but costly (both in terms of > > > further ABI changes and runtime overhead). > > > > > > Testcase: igt/gem_exec_async > > > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > > > Reviewed-by: Joonas Lahtinen <joonas.lahti...@linux.intel.com> > > > --- > > > drivers/gpu/drm/i915/i915_drv.c| 1 + > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ > > > include/uapi/drm/i915_drm.h| 29 > > > - > > > 3 files changed, 32 insertions(+), 1 deletion(-) > > > > I'm neutral about this patch. I believe patch 14/14 is useful with or > > without this patch, and I want to see patch 14 land regardless of what > > happens with this one. > > I don't like the patch, it opens up a big wart in the GEM api (incorrect > write tracking on GEM/dma-buf across multiple timelines). Otoh, being > able to discard the implicit fence tracking seems to be an important > feature request - if we go forward witout it, we will then be lacking a > feature that is common across other drivers and in particular seems to > be commonplace in the Android ecosystem. I agree. The explicit fence fds provide more benefit (that is, less blocking and, in general, more *explicitness*) when implicit fencing is disabled. Userspace should have some API to disable the implicit fencing, and this patch seems like an ok approach. I certainly can think of nothing better. > Daniel, what's your feeling? One problem will be that the > synchronisation issue may be hard to track down in future (proving that > the cause of a stall is an avoidable implicit fence). > > > I'm not opposed to this patch. It's just that I don't yet understand > > exactly if Mesa's EGL/GL code could effectively use this feature for > > Android winsys buffers. The amount of information loss between the > > EGL/GL apis and the eventual execbuffer submission may prevent Mesa from > > annotating the Android winsys buffers with this. I'm unsure. I'm still > > thinking about it. > > > > But, if Chris, or anyone, already has plans to use this somehow, perhaps > > in the DDX, then don't let my hesitation block the patch. > > Actually, the example I have would be for mesa. It can use this on its > own scratch buffers to just discard writes and prevent ordering on > a single scratch shared between contexts, and for its fence tracking using > a single page for multiple rings. Those use cases sound good to me. This patch is Acked-by: Chad Versace <chadvers...@chromium.org> ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] i965: Share the workaround bo between all contexts
On Thu 26 Jan 2017, Chad Versace wrote: > On Thu 26 Jan 2017, Chris Wilson wrote: > > Since the workaround bo is used strictly as a write-only buffer, we need > > only allocate one per screen and use the same one from all contexts. > > > > (The caveat here is during extension initialisation, where we write into > > and read back register values from the buffer, but that is performed only > > once for the first context - and baring synchronisation issues should not > > be a problem. Safer would be to move that also to the screen.) > > > > v2: Give the workaround bo its own init function and don't piggy back > > intel_bufmgr_init() since it is not that related. > > > > v3: Drop the reference count of the workaround bo for the context since > > the context itself is owned by the screen (and so we can rely on the bo > > existing for the lifetime of the context). > > I like this idea, but I have questions and comments about the details. > More questions than comments, really. > > Today, with only Mesa changes, could we effectively do the same as > drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo); > by hacking Mesa to set no read/write domain when emitting relocs for the > workaround_bo? (I admit I don't fully understand the kernel's domain > tracking). If that does work, then it just would require a small hack to > brw_emit_pipe_control_write(). > > > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > > Cc: Kenneth Graunke <kenn...@whitecape.org> > > Cc: Martin Peres <martin.pe...@linux.intel.com> > > Cc: Chad Versace <chadvers...@chromium.org> > > Cc: Daniel Vetter <daniel.vet...@ffwll.ch> > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > > b/src/mesa/drivers/dri/i965/intel_screen.c > > + /* We want to use this bo from any and all contexts, without undue > > +* writing ordering between them. To prevent the kernel enforcing > > +* the order due to writes from different contexts, we disable > > +* the use of (the kernel's) implicit sync on this bo. > > +*/ > > + drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo); > > +#ifndef HAVE_DRM_INTEL_GEM_BO_DISABLE_IMPLICIT_SYNC > > +#define drm_intel_gem_bo_disable_implicit_sync(BO) do { } while (0) > > +#endif Until Mesa can actually disable the implicit sync, I think this patch should be postponed. If it landed now, it may cause additional unneccessary stalls between contexts. Chrome OS uses many contexts in the same process, so if problems exist, they'll exhibit on CrOS. Perhaps the extra stalls will be imperceptible, but I don't want to take the risk. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] i965: Share the workaround bo between all contexts
On Thu 26 Jan 2017, Chris Wilson wrote: > On Thu, Jan 26, 2017 at 09:39:51AM -0800, Chad Versace wrote: > > On Thu 26 Jan 2017, Chris Wilson wrote: > > > Since the workaround bo is used strictly as a write-only buffer, we need > > > only allocate one per screen and use the same one from all contexts. > > > > > > (The caveat here is during extension initialisation, where we write into > > > and read back register values from the buffer, but that is performed only > > > once for the first context - and baring synchronisation issues should not > > > be a problem. Safer would be to move that also to the screen.) > > > > > > v2: Give the workaround bo its own init function and don't piggy back > > > intel_bufmgr_init() since it is not that related. > > > > > > v3: Drop the reference count of the workaround bo for the context since > > > the context itself is owned by the screen (and so we can rely on the bo > > > existing for the lifetime of the context). > > > > I like this idea, but I have questions and comments about the details. > > More questions than comments, really. > > > > Today, with only Mesa changes, could we effectively do the same as > > drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo); > > by hacking Mesa to set no read/write domain when emitting relocs for the > > workaround_bo? (I admit I don't fully understand the kernel's domain > > tracking). If that does work, then it just would require a small hack to > > brw_emit_pipe_control_write(). > > Yes, for anything that is totally scratch just not setting the write > hazard is the same. For something like the seqno page where we have > multiple engines that we do want to be preserved, not settting the write > hazzard had the consequence that page could be lost under memory pressure > or across resume. (As usual there are some details that this part of the > ABI had to be relaxed because userspace didn't have this flag.) > But that doesn't sell many bananas. Good. That's how I thought it worked. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] i965: Share the workaround bo between all contexts
On Thu 26 Jan 2017, Chris Wilson wrote: > Since the workaround bo is used strictly as a write-only buffer, we need > only allocate one per screen and use the same one from all contexts. > > (The caveat here is during extension initialisation, where we write into > and read back register values from the buffer, but that is performed only > once for the first context - and baring synchronisation issues should not > be a problem. Safer would be to move that also to the screen.) > > v2: Give the workaround bo its own init function and don't piggy back > intel_bufmgr_init() since it is not that related. > > v3: Drop the reference count of the workaround bo for the context since > the context itself is owned by the screen (and so we can rely on the bo > existing for the lifetime of the context). I like this idea, but I have questions and comments about the details. More questions than comments, really. Today, with only Mesa changes, could we effectively do the same as drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo); by hacking Mesa to set no read/write domain when emitting relocs for the workaround_bo? (I admit I don't fully understand the kernel's domain tracking). If that does work, then it just would require a small hack to brw_emit_pipe_control_write(). > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > Cc: Kenneth Graunke <kenn...@whitecape.org> > Cc: Martin Peres <martin.pe...@linux.intel.com> > Cc: Chad Versace <chadvers...@chromium.org> > Cc: Daniel Vetter <daniel.vet...@ffwll.ch> > --- > src/mesa/drivers/dri/i965/Makefile.am| 2 +- > src/mesa/drivers/dri/i965/brw_pipe_control.c | 12 +- > src/mesa/drivers/dri/i965/intel_screen.c | 24 > src/mesa/drivers/dri/i965/intel_screen.h | 1 + > src/mesa/drivers/dri/i965/libdrm_compat.h| 33 > > 5 files changed, 64 insertions(+), 8 deletions(-) > create mode 100644 src/mesa/drivers/dri/i965/libdrm_compat.h > > diff --git a/src/mesa/drivers/dri/i965/Makefile.am > b/src/mesa/drivers/dri/i965/Makefile.am > index 6602a17995..b208563f7d 100644 > --- a/src/mesa/drivers/dri/i965/Makefile.am > +++ b/src/mesa/drivers/dri/i965/Makefile.am > @@ -77,7 +77,7 @@ noinst_LTLIBRARIES = \ > libi965_compiler.la \ > $(I965_PERGEN_LIBS) > > -libi965_dri_la_SOURCES = $(i965_FILES) > +libi965_dri_la_SOURCES = $(i965_FILES) libdrm_compat.h > libi965_dri_la_LIBADD = \ > $(top_builddir)/src/intel/common/libintel_common.la \ > $(top_builddir)/src/intel/isl/libisl.la \ > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c > b/src/mesa/drivers/dri/i965/brw_pipe_control.c > index b8f740640f..22c946f744 100644 > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c > @@ -371,20 +371,18 @@ brw_init_pipe_control(struct brw_context *brw, > /* We can't just use brw_state_batch to get a chunk of space for > * the gen6 workaround because it involves actually writing to > * the buffer, and the kernel doesn't let us write to the batch. > +* > +* As the screen has a long lifetime than the contexts derived from > +* it, we do not need to add our own reference count and can simply > +* rely on the bo always existing for the duration of the context. > */ > - brw->workaround_bo = drm_intel_bo_alloc(brw->bufmgr, > - "pipe_control workaround", > - 4096, 4096); > - if (brw->workaround_bo == NULL) > - return -ENOMEM; > + brw->workaround_bo = brw->screen->workaround_bo; > > brw->pipe_controls_since_last_cs_stall = 0; > - > return 0; > } > > void > brw_fini_pipe_control(struct brw_context *brw) > { > - drm_intel_bo_unreference(brw->workaround_bo); > } > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index 5f88c1..6e788c41cc 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -107,6 +107,7 @@ DRI_CONF_END > #include "brw_context.h" > > #include "i915_drm.h" > +#include "libdrm_compat.h" > > /** > * For debugging purposes, this returns a time in seconds. > @@ -1030,6 +1031,7 @@ intelDestroyScreen(__DRIscreen * sPriv) > { > struct intel_screen *screen = sPriv->driverPrivate; > > + drm_intel_bo_unreference(screen->workaround_bo); > dri_bufmgr_destroy(screen->bufmgr); > driDestroyOptionInfo(>optionCache); > > @@ -1210,6 +1212,25 @@ intel_i
Re: [Intel-gfx] [PATCH v3 13/14] drm/i915: Enable userspace to opt-out of implicit fencing
On Mon 14 Nov 2016, Chris Wilson wrote: > Userspace is faced with a dilemma. The kernel requires implicit fencing > to manage resource usage (we always must wait for the GPU to finish > before releasing its PTE) and for third parties. However, userspace may > wish to avoid this serialisation if it is either using explicit fencing > between parties and wants more fine-grained access to buffers (e.g. it > may partition the buffer between uses and track fences on ranges rather > than the implicit fences tracking the whole object). It follows that > userspace needs a mechanism to avoid the kernel's serialisation on its > implicit fences before execbuf execution. > > The next question is whether this is an object, execbuf or context flag. > Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on > shared winsys buffers, but implicit fencing on internal surfaces) > require a per-object level flag. Given that this flag need to be only > set once for the lifetime of the object, this reduces the convenience of > having an execbuf or context level flag (and avoids having multiple > pieces of uABI controlling the same feature). > > Incorrect use of this flag will result in rendering corruption and GPU > hangs - but will not result in use-after-free or similar resource > tracking issues. > > Serious caveat: write ordering is not strictly correct after setting > this flag on a render target on multiple engines. This affects all > subsequent GEM operations (execbuf, set-domain, pread) and shared > dma-buf operations. A fix is possible - but costly (both in terms of > further ABI changes and runtime overhead). > > Testcase: igt/gem_exec_async > Signed-off-by: Chris Wilson> Reviewed-by: Joonas Lahtinen > --- > drivers/gpu/drm/i915/i915_drv.c| 1 + > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ > include/uapi/drm/i915_drm.h| 29 - > 3 files changed, 32 insertions(+), 1 deletion(-) I'm neutral about this patch. I believe patch 14/14 is useful with or without this patch, and I want to see patch 14 land regardless of what happens with this one. I'm not opposed to this patch. It's just that I don't yet understand exactly if Mesa's EGL/GL code could effectively use this feature for Android winsys buffers. The amount of information loss between the EGL/GL apis and the eventual execbuffer submission may prevent Mesa from annotating the Android winsys buffers with this. I'm unsure. I'm still thinking about it. But, if Chris, or anyone, already has plans to use this somehow, perhaps in the DDX, then don't let my hesitation block the patch. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 14/14] drm/i915: Support explicit fencing for execbuf
If I understand correctly, this patch preserves the kernel's current implicit fencing, even when an input fence fd is given to execbuffer. I'm convinced that's the right approach. If userspace does want to disable the implicit fencing during an execbuffer, then it should disable that explicit fencing through an *explicit* knob. I believe the kernel should not interpret the presence of an in fence fd in execbuffer as that knob. If it did, then using this feature from GL/EGL userspace would be unwieldy. In other words, I like this. Patch 14 gets my Acked-by: Chad Versace <chadvers...@chromium.org> On Mon 14 Nov 2016, Chris Wilson wrote: > Now that the user can opt-out of implicit fencing, we need to give them > back control over the fencing. We employ sync_file to wrap our > drm_i915_gem_request and provide an fd that userspace can merge with > other sync_file fds and pass back to the kernel to wait upon before > future execution. > > Testcase: igt/gem_exec_fence > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > Reviewed-by: Joonas Lahtinen <joonas.lahti...@linux.intel.com> > --- > drivers/gpu/drm/i915/Kconfig | 1 + > drivers/gpu/drm/i915/i915_drv.c| 3 +- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 54 > +++--- > include/uapi/drm/i915_drm.h| 35 ++- > 4 files changed, 86 insertions(+), 7 deletions(-) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/3] intel: Support passing of explicit fencing from execbuf
From: Chris Wilson <ch...@chris-wilson.co.uk> Allow the caller to pass in an fd to an array of fences to control serialisation of the execbuf in the kernel and on the GPU, and in return allow creation of a fence fd for signaling the completion (and flushing) of the batch. When the returned fence is signaled, all writes to the buffers inside the batch will be complete and coherent from the cpu, or other consumers. The return fence is a sync_file object and can be passed to other users (such as atomic modesetting, or other drivers). Fixes from rantogno: - Fix the in/out fence flags on execbuf. Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> Squashed-fixes-from: Rafael Antognolli <rafael.antogno...@intel.com> Signed-off-by: Chad Versace <chadvers...@chromium.org> --- intel/intel_bufmgr.h | 6 ++ intel/intel_bufmgr_gem.c | 32 +++- 2 files changed, 33 insertions(+), 5 deletions(-) diff --git a/intel/intel_bufmgr.h b/intel/intel_bufmgr.h index 6f128679..49cabf67 100644 --- a/intel/intel_bufmgr.h +++ b/intel/intel_bufmgr.h @@ -219,6 +219,12 @@ int drm_intel_gem_context_get_id(drm_intel_context *ctx, void drm_intel_gem_context_destroy(drm_intel_context *ctx); int drm_intel_gem_bo_context_exec(drm_intel_bo *bo, drm_intel_context *ctx, int used, unsigned int flags); +int drm_intel_gem_bo_fence_exec(drm_intel_bo *bo, + drm_intel_context *ctx, + int used, + int in_fence, + int *out_fence, + unsigned int flags); int drm_intel_bo_gem_export_to_prime(drm_intel_bo *bo, int *prime_fd); drm_intel_bo *drm_intel_bo_gem_create_from_prime(drm_intel_bufmgr *bufmgr, diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c index c5fc1b36..da993002 100644 --- a/intel/intel_bufmgr_gem.c +++ b/intel/intel_bufmgr_gem.c @@ -2376,6 +2376,7 @@ drm_intel_gem_bo_exec(drm_intel_bo *bo, int used, static int do_exec2(drm_intel_bo *bo, int used, drm_intel_context *ctx, drm_clip_rect_t *cliprects, int num_cliprects, int DR4, +int in_fence, int *out_fence, unsigned int flags) { drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *)bo->bufmgr; @@ -2429,13 +2430,20 @@ do_exec2(drm_intel_bo *bo, int used, drm_intel_context *ctx, i915_execbuffer2_set_context_id(execbuf, 0); else i915_execbuffer2_set_context_id(execbuf, ctx->ctx_id); - execbuf.rsvd2 = 0; + if (in_fence != -1) { + execbuf.rsvd2 = in_fence; + execbuf.flags |= I915_EXEC_FENCE_IN; + } + if (out_fence != NULL) { + *out_fence = -1; + execbuf.flags |= I915_EXEC_FENCE_OUT; + } if (bufmgr_gem->no_exec) goto skip_execution; ret = drmIoctl(bufmgr_gem->fd, - DRM_IOCTL_I915_GEM_EXECBUFFER2, + DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, ); if (ret != 0) { ret = -errno; @@ -2451,6 +2459,9 @@ do_exec2(drm_intel_bo *bo, int used, drm_intel_context *ctx, } drm_intel_update_buffer_offsets2(bufmgr_gem); + if (out_fence != NULL) + *out_fence = execbuf.rsvd2 >> 32; + skip_execution: if (bufmgr_gem->bufmgr.debug) drm_intel_gem_dump_validation_list(bufmgr_gem); @@ -2476,7 +2487,7 @@ drm_intel_gem_bo_exec2(drm_intel_bo *bo, int used, int DR4) { return do_exec2(bo, used, NULL, cliprects, num_cliprects, DR4, - I915_EXEC_RENDER); + -1, NULL, I915_EXEC_RENDER); } static int @@ -2485,14 +2496,25 @@ drm_intel_gem_bo_mrb_exec2(drm_intel_bo *bo, int used, unsigned int flags) { return do_exec2(bo, used, NULL, cliprects, num_cliprects, DR4, - flags); + -1, NULL, flags); } int drm_intel_gem_bo_context_exec(drm_intel_bo *bo, drm_intel_context *ctx, int used, unsigned int flags) { - return do_exec2(bo, used, ctx, NULL, 0, 0, flags); + return do_exec2(bo, used, ctx, NULL, 0, 0, -1, NULL, flags); +} + +int +drm_intel_gem_bo_fence_exec(drm_intel_bo *bo, + drm_intel_context *ctx, + int used, + int in_fence, + int *out_fence, + unsigned int flags) +{ + return do_exec2(bo, used, ctx, NULL, 0, 0, in_fence, out_fence, flags); } static int -- 2.11.0.21.ga274e0a ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] intel: Allow the client to control implicit synchronisation
From: Chris WilsonThe kernel allows implicit synchronisation to be disabled on individual buffers. Use at your own risk. Signed-off-by: Chris Wilson --- intel/intel_bufmgr.h | 2 ++ intel/intel_bufmgr_gem.c | 32 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/intel/intel_bufmgr.h b/intel/intel_bufmgr.h index 85e4ff78..6f128679 100644 --- a/intel/intel_bufmgr.h +++ b/intel/intel_bufmgr.h @@ -184,6 +184,8 @@ int drm_intel_gem_bo_map_unsynchronized(drm_intel_bo *bo); int drm_intel_gem_bo_map_gtt(drm_intel_bo *bo); int drm_intel_gem_bo_unmap_gtt(drm_intel_bo *bo); +void drm_intel_gem_bo_disable_implicit_sync(drm_intel_bo *bo); + void *drm_intel_gem_bo_map__cpu(drm_intel_bo *bo); void *drm_intel_gem_bo_map__gtt(drm_intel_bo *bo); void *drm_intel_gem_bo_map__wc(drm_intel_bo *bo); diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c index 75949b9f..c5fc1b36 100644 --- a/intel/intel_bufmgr_gem.c +++ b/intel/intel_bufmgr_gem.c @@ -195,8 +195,11 @@ struct _drm_intel_bo_gem { uint32_t swizzle_mode; unsigned long stride; + unsigned long kflags; + time_t free_time; + /** Array passed to the DRM containing relocation information. */ struct drm_i915_gem_relocation_entry *relocs; /** @@ -575,12 +578,11 @@ drm_intel_add_validate_buffer2(drm_intel_bo *bo, int need_fence) bufmgr_gem->exec2_objects[index].relocation_count = bo_gem->reloc_count; bufmgr_gem->exec2_objects[index].relocs_ptr = (uintptr_t)bo_gem->relocs; bufmgr_gem->exec2_objects[index].alignment = bo->align; - bufmgr_gem->exec2_objects[index].offset = bo_gem->is_softpin ? - bo->offset64 : 0; - bufmgr_gem->exec_bos[index] = bo; - bufmgr_gem->exec2_objects[index].flags = flags; + bufmgr_gem->exec2_objects[index].offset = bo->offset64; + bufmgr_gem->exec2_objects[index].flags = flags | bo_gem->kflags; bufmgr_gem->exec2_objects[index].rsvd1 = 0; bufmgr_gem->exec2_objects[index].rsvd2 = 0; + bufmgr_gem->exec_bos[index] = bo; bufmgr_gem->exec_count++; } @@ -1368,6 +1370,7 @@ drm_intel_gem_bo_unreference_final(drm_intel_bo *bo, time_t time) for (i = 0; i < bo_gem->softpin_target_count; i++) drm_intel_gem_bo_unreference_locked_timed(bo_gem->softpin_target[i], time); + bo_gem->kflags = 0; bo_gem->reloc_count = 0; bo_gem->used_as_reloc_target = false; bo_gem->softpin_target_count = 0; @@ -2765,6 +2768,27 @@ drm_intel_bufmgr_gem_enable_reuse(drm_intel_bufmgr *bufmgr) } /** + * Disables implicit synchronisation before executing the bo + * + * This will cause rendering corruption unless you correctly manage explicit + * fences for all rendering involving this buffer - including use by others. + * Disabling the implicit serialisation is only required if that serialisation + * is too coarse (for example, you have split the buffer into many + * non-overlapping regions and are sharing the whole buffer between concurrent + * independent command streams). + * + * Note the kernel must advertise support via I915_PARAM_HAS_EXEC_ASYNC or + * subsequent execbufs involving the bo will generate EINVAL. + */ +void +drm_intel_gem_bo_disable_implicit_sync(drm_intel_bo *bo) +{ + drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *) bo; + + bo_gem->kflags |= EXEC_OBJECT_ASYNC; +} + +/** * Enable use of fenced reloc type. * * New code should enable this to avoid unnecessary fence register -- 2.11.0.21.ga274e0a ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/3] WAIT: headers: Update drm_i915.h
Generated with `make headers_install` from tag 'chadv/test/i915-exec-fence-v03' of . TODO: Wait until header updates are in upstream kernel. References: http://git.kiwitree.net/cgit/~chadv/linux/tag/?h=chadv/test/i915-exec-fence-v03 Signed-off-by: Chad Versace <chadvers...@chromium.org> --- include/drm/i915_drm.h | 277 +++-- 1 file changed, 267 insertions(+), 10 deletions(-) diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h index eb611a7a..48e023fa 100644 --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -29,6 +29,10 @@ #include "drm.h" +#if defined(__cplusplus) +extern "C" { +#endif + /* Please note that modifications to all structs defined here are * subject to backwards-compatibility constraints. */ @@ -58,6 +62,30 @@ #define I915_ERROR_UEVENT "ERROR" #define I915_RESET_UEVENT "RESET" +/* + * MOCS indexes used for GPU surfaces, defining the cacheability of the + * surface data and the coherency for this data wrt. CPU vs. GPU accesses. + */ +enum i915_mocs_table_index { + /* +* Not cached anywhere, coherency between CPU and GPU accesses is +* guaranteed. +*/ + I915_MOCS_UNCACHED, + /* +* Cacheability and coherency controlled by the kernel automatically +* based on the DRM_I915_GEM_SET_CACHING IOCTL setting and the current +* usage of the surface (used for display scanout or not). +*/ + I915_MOCS_PTE, + /* +* Cached in all GPU caches available on the platform. +* Coherency between CPU and GPU accesses to the surface is not +* guaranteed without extra synchronization. +*/ + I915_MOCS_CACHED, +}; + /* Each region is a minimum of 16k, and there are at most 255 of them. */ #define I915_NR_TEX_REGIONS 255/* table size 2k - maximum due to use @@ -218,6 +246,7 @@ typedef struct _drm_i915_sarea { #define DRM_I915_OVERLAY_PUT_IMAGE 0x27 #define DRM_I915_OVERLAY_ATTRS 0x28 #define DRM_I915_GEM_EXECBUFFER2 0x29 +#define DRM_I915_GEM_EXECBUFFER2_WRDRM_I915_GEM_EXECBUFFER2 #define DRM_I915_GET_SPRITE_COLORKEY 0x2a #define DRM_I915_SET_SPRITE_COLORKEY 0x2b #define DRM_I915_GEM_WAIT 0x2c @@ -230,6 +259,7 @@ typedef struct _drm_i915_sarea { #define DRM_I915_GEM_USERPTR 0x33 #define DRM_I915_GEM_CONTEXT_GETPARAM 0x34 #define DRM_I915_GEM_CONTEXT_SETPARAM 0x35 +#define DRM_I915_PERF_OPEN 0x36 #define DRM_IOCTL_I915_INITDRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t) #define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH) @@ -251,6 +281,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GEM_INITDRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_INIT, struct drm_i915_gem_init) #define DRM_IOCTL_I915_GEM_EXECBUFFER DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER, struct drm_i915_gem_execbuffer) #define DRM_IOCTL_I915_GEM_EXECBUFFER2 DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER2, struct drm_i915_gem_execbuffer2) +#define DRM_IOCTL_I915_GEM_EXECBUFFER2_WR DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER2_WR, struct drm_i915_gem_execbuffer2) #define DRM_IOCTL_I915_GEM_PIN DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_PIN, struct drm_i915_gem_pin) #define DRM_IOCTL_I915_GEM_UNPIN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_UNPIN, struct drm_i915_gem_unpin) #define DRM_IOCTL_I915_GEM_BUSYDRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_BUSY, struct drm_i915_gem_busy) @@ -283,6 +314,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GEM_USERPTR DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr) #define DRM_IOCTL_I915_GEM_CONTEXT_GETPARAMDRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_GETPARAM, struct drm_i915_gem_context_param) #define DRM_IOCTL_I915_GEM_CONTEXT_SETPARAMDRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_SETPARAM, struct drm_i915_gem_context_param) +#define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param) /* Allow drivers to submit batchbuffers directly to hardware, relying * on the security mechanisms provided by hardware. @@ -357,8 +389,28 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_GPU_RESET35 #define I915_PARAM_HAS_RESOURCE_STREAMER 36 #define I915_PARAM_HAS_EXEC_SOFTPIN 37 -#define I915_PARAM_HAS_POOLED_EU 38 -#define I915_PARAM_MIN_EU_IN_POOL39 +#define I915_PARAM_HAS_POOLED_EU38 +#define I915_PARAM_MIN_EU_IN_POOL 39 +#define I915_PARAM_MMAP_GTT_VERSION 40 + +/* Query whether DRM_I915_GEM_EXECBUFFER2 supports user defined execution + * priorities and the driver will attempt to execute batches in priority order. + * The initial prio
[Intel-gfx] [PATCH 0/3] intel: Add fence fd support to execbuf
Chris wrote these patches, and the kernel patches too. I wrote the Mesa patches that use the new feature. I'm submitting these patches myself to get things moving. This series depends on fence fd support in I915_GEM_EXECBUFFER2, which isn't upstream in the kernel yet. I tested this with kmscube on Skylake, and everything looked good to me. I pushed tags for this series as well as all the code I tested with: mesa: http://git.kiwitree.net/cgit/~chadv/mesa/tag/?h=chadv/review/i965-exec-fence-v03 libdrm: http://git.kiwitree.net/cgit/~chadv/libdrm/tag/?h=chadv/review/intel-exec-fence-v01 linux: http://git.kiwitree.net/cgit/~chadv/linux/tag/?h=chadv/test/i915-exec-fence-v03 kmscube: http://git.kiwitree.net/cgit/~chadv/kmscube/tag/?h=chadv/test/fences-v02 I also sent Mesa patches to mesa-dev. Someone else should submit the kernel patches, as I tested them but don't grok them. Chad Versace (1): WAIT: headers: Update drm_i915.h Chris Wilson (2): intel: Allow the client to control implicit synchronisation intel: Support passing of explicit fencing from execbuf include/drm/i915_drm.h | 277 +-- intel/intel_bufmgr.h | 8 ++ intel/intel_bufmgr_gem.c | 64 +-- 3 files changed, 330 insertions(+), 19 deletions(-) -- 2.11.0.21.ga274e0a ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC] i915: Add fence fds to execbuffer2 uapi
On Tue 28 Jun 2016, John Harrison wrote: > The latest set of patches (including changes from feedback about rsvd > fields) never actually got posted to the mailing list due to the above issue > with de-staging. I have just updated my FDO git account with them instead. > The kernel patch is: > https://cgit.freedesktop.org/~johnharr/scheduler/commit/?h=all=b7cd5e85edce4af9d7d4c34bb640cd49e31236a8 > > The userland LibDRM patch is: > https://cgit.freedesktop.org/~johnharr/scheduler/commit/?h=LibDRM=f11b2d577904b1a096d5b36384a9cc83ba51cbb8 Thanks for the sharing the code. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC] i915: Add fence fds to execbuffer2 uapi
On Mon 27 Jun 2016, Chris Wilson wrote: > On Mon, Jun 27, 2016 at 01:18:59PM -0700, Chad Versace wrote: > > On Mon 27 Jun 2016, Chad Versace wrote: > > > Let the hight 32 bits of drm_i915_gem_execbuffer2::rsvd1 contain an > > > input and/or output fence fd, whose presence is controlled by flags. > > > Also add I915_PARAM_HAS_FENCE_FD. > > > > > > Signed-off-by: Chad Versace <chad.vers...@intel.com> > > > --- > > > include/uapi/drm/i915_drm.h | 24 ++-- > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > Oops. git-send-email stripped the notes to the patch. Here's the notes: > > > > This RFC proposes a uapi that integrates execbuf with Android sync fds. Of > > course, this is *only* an RFC because other devs are working on the i915 > > internals, and this patch depends on that work. > > Why not just use the earlier patches for the uAPI as well? I examined all the patches that John Harrison sent to the list, and they contained no uapi. Is there another patchset, from someone else (possibly you), that proposes a uapi? > > Why am I sending an RFC this early? I will soon begin prototyping Intel's > > Mesa implementation of EGL_ANDROID_native_fence_sync, and that prototype > > will > > be easier to write if I have a rough expectation of i915's eventual fence fd > > uapi. > > > > Please provide feedback: Does this roughly look like the uapi that the i915 > > devs expect? > > Not quite. You have to use separate in/out dwords (i.e. rsvd2) in order > to ensure that we don't overwite the in-fence when dealing with error > paths (i.e. so that userspace can feed in the same execbuf parameters > following EINTR, and you don't have confusion between in/out parameters). Right. I forgot about resubmission on EINTR. > You have to also mark the ioctl as writing the new structures which is an > ABI break and so requires a new identifier (otherwise you break userspace > passing in the args from read-only memory). Thanks for explaining the obvious ABI break to this kernel noob. > Playind devil's advocate, an alternative to every driver implementing > their own fence extension for execbuf/cmdsubmit would be to add support > for explicit sync_fences to be added via dmabuf. (Instead of setting the > fence on the execbuf, you would set the fence on the batch buffer obj, > or surface of interest - though for CreateSync, it would have to be the > batch. Extracting the fence is then supplied by querying the batch buffer > dmabuf. It's not as explicit, but I suspect such uABI will be added to > dmabuf and will be required to be supported in the driver to handle > implicit fencing between PRIME anyway.) If I were arguing with the devil, I would claim that uapi that attached fence fds to dma_bufs seems more elegant, but API that attaches fence fds to batch bos prevents an optimized use case in Vulkan's submission model. In Vulkan, the user submits work by compiling a VkCommandBuffer (which is closely related to Intel's batch bo) and then submitting it to a VkQueue (which is related to a GPU ring). For repetive rendering tasks, the Vulkan API encourages the user to re-use the compiled VkCommandBuffer by re-submitting it to the VkQueue. When the user resubmits a VkCommandBuffer, the Vulkan spec doesn't *require* the driver to resubmit the same exact batch buffer; but that's the spec's *intent*. And Mesa does that today; when the user compiles a VkCommandBuffer, we compile the batch buffer immediately, and resubmit that exact batch buffer each time the user submits the VkCommandBuffer. Vulkan doesn't use fence fds today, but it will someday. If the kernel doesn't add fence fds to the execbuffer ioctl, but instead requires that the fences be associated with a batch buffer, then that would prevents the natural batch buffer re-use in Vulkan. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC] i915: Add fence fds to execbuffer2 uapi
On Mon 27 Jun 2016, Chad Versace wrote: > Let the hight 32 bits of drm_i915_gem_execbuffer2::rsvd1 contain an > input and/or output fence fd, whose presence is controlled by flags. > Also add I915_PARAM_HAS_FENCE_FD. > > Signed-off-by: Chad Versace <chad.vers...@intel.com> > --- > include/uapi/drm/i915_drm.h | 24 ++-- > 1 file changed, 22 insertions(+), 2 deletions(-) Oops. git-send-email stripped the notes to the patch. Here's the notes: This RFC proposes a uapi that integrates execbuf with Android sync fds. Of course, this is *only* an RFC because other devs are working on the i915 internals, and this patch depends on that work. Why am I sending an RFC this early? I will soon begin prototyping Intel's Mesa implementation of EGL_ANDROID_native_fence_sync, and that prototype will be easier to write if I have a rough expectation of i915's eventual fence fd uapi. Please provide feedback: Does this roughly look like the uapi that the i915 devs expect? (Pardon any stupidity in the patch. It's my first non-trivial kernel patch). ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC] i915: Add fence fds to execbuffer2 uapi
Let the hight 32 bits of drm_i915_gem_execbuffer2::rsvd1 contain an input and/or output fence fd, whose presence is controlled by flags. Also add I915_PARAM_HAS_FENCE_FD. Signed-off-by: Chad Versace <chad.vers...@intel.com> --- include/uapi/drm/i915_drm.h | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index c17d63d..6f26b79 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -361,6 +361,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_GPU_RESET35 #define I915_PARAM_HAS_RESOURCE_STREAMER 36 #define I915_PARAM_HAS_EXEC_SOFTPIN 37 +#define I915_PARAM_HAS_FENCE_FD 38 typedef struct drm_i915_getparam { __s32 param; @@ -742,7 +743,17 @@ struct drm_i915_gem_execbuffer2 { #define I915_EXEC_CONSTANTS_ABSOLUTE (1<<6) #define I915_EXEC_CONSTANTS_REL_SURFACE (2<<6) /* gen4/5 only */ __u64 flags; - __u64 rsvd1; /* now used for context info */ + + /* The low word (bits 0:31) contains the context id. +* +* The high word (bits 32:63) contains an optional fence fd. If flag +* I915_EXEC_FENCE_FD_IN is set, then the high word is an input fence +* fd. The batch will not begin execution before the input fence +* signals. If flag I915_EXEC_FENCE_FD_OUT is set, then an output +* fence fd is returned in the high word. The output fence will signal +* after the batch completes execution. It is legal to set both flags. +*/ + __u64 rsvd1; __u64 rsvd2; }; @@ -788,7 +799,10 @@ struct drm_i915_gem_execbuffer2 { */ #define I915_EXEC_RESOURCE_STREAMER (1<<15) -#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_RESOURCE_STREAMER<<1) +#define I915_EXEC_FENCE_FD_IN (1<<16) +#define I915_EXEC_FENCE_FD_OUT (1<<17) + +#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_FENCE_FD_OUT<<1) #define I915_EXEC_CONTEXT_ID_MASK (0x) #define i915_execbuffer2_set_context_id(eb2, context) \ @@ -796,6 +810,12 @@ struct drm_i915_gem_execbuffer2 { #define i915_execbuffer2_get_context_id(eb2) \ ((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK) +#define I915_EXEC_FENCE_FD_MASK(0x) +#define i915_execbuffer2_set_fence_fd(eb2, fence_fd) \ + ((eb2).rsvd2 = (fence_fd) & I915_EXEC_FENCE_MASK) +#define i915_execbuffer2_get_fence_fd(eb2, fence_fd) \ + ((eb2).rsvd2 & I915_EXEC_FENCE_FD_MASK) + struct drm_i915_gem_pin { /** Handle of the buffer to be pinned. */ __u32 handle; -- 2.9.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH v2 mesa] vk/intel: use negative VK_NO_PROTOTYPES scheme
On Mon 09 May 2016, Eric Engestrom wrote: > On Sun, May 01, 2016 at 09:39:58PM -0700, Jason Ekstrand wrote: > > On May 1, 2016 6:04 PM, "Kenneth Graunke" <kenn...@whitecape.org> wrote: > > > On Sunday, May 1, 2016 9:51:00 AM PDT Emil Velikov wrote: > > > > Adding the anv authors. > > > > > > > > Jason, Chad, is there a canonical place where changes to > > > > vulkan_intel.h should land first ? Eric has a nice fix which we want > > > > in mesa. > > > > > > > > Thanks > > > > Emil > > > > > > I'm pretty sure Mesa is the only repository where this lives. > > > > Yup > > Ping? > > I'm not sure why this didn't get at least an ACK from Intel, but I don't > see any reason not to apply this fix. > I don't have commit access; Emil, can you do it? Bump. Jason, does this patch look good to you? Reviewed-by: Chad Versace <chad.vers...@intel.com> ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC/Draft] Testing requirements for upstream drm/i915 patches
On 10/30/2013 12:05 PM, Daniel Vetter wrote: On Wed, Oct 30, 2013 at 7:11 PM, Ian Romanick i...@freedesktop.org wrote: test coverage of the existing code _before_ starting a feature instead of when the patches are ready for merging should help a lot, before everyone is invested into patches already and mounting rebase pain looms large. Yeah, that would be a great way to bring up new people. The problem is a bit that on the kernel side we have a few disadvantages compared to mesa: We don't have a nice spec and we also don't have a fairly decent reference implementation (the nvidia blob). So ime writing kernel tests is much more open-ended than reading a gl extension spec and just nocking off all the new enums and api interface points. Writing *meaningful* GL tests is much more open-ended than reading a gl spec and knocking off each item. To really test some GL features, you must be exceedingly clever and even have an understanding of the underlying hardware implementation to test the significant corner cases. In that sense, it's not too different from writing a kernel test case. My comment is intended to be positive rather than a negative correction. The Mesa team frequently succeeds at creating good test coverage of new GL features despite the difficulty. That fact hopefully confirms it will be possible for the kernel team too. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH intel-gpu-tools] configure: Don't bail if libdrm_nouveau isn't available.
On 10/11/2013 02:29 PM, Matt Turner wrote: On Fri, Oct 11, 2013 at 2:24 PM, Daniel Vetter dan...@ffwll.ch wrote: On Fri, Oct 11, 2013 at 12:51:34PM -0700, Matt Turner wrote: On Fri, Oct 11, 2013 at 12:40 AM, Daniel Vetter dan...@ffwll.ch wrote: On Fri, Oct 11, 2013 at 5:56 AM, Matt Turner matts...@gmail.com wrote: We were seriously *requiring* libdrm_nouveau unless explicitly disabled? I've had a bit of hilarious fail with optional testcases that automatically get disabled when depencies aren't around. Hence why they're all required by default with the optional switch to disable them. You've had this problem in general, or with nouveau? In general. So I think what I'm saying is that the nouveau support in intel-gpu-tools isn't the project's primary purpose, so we should allow it to build without nouveau support if you don't have nouveau installed. If you don't have nouveau installed, you're not making a mistake by not building intel-gpu-tools without support. And if you have nouveau installed, you still get nouveau support. I completely agree with Matt. If the system doesn't already have nouveau installed, then it there is very low likelihood that it makes any sense to install nouveau onto that system. It very likely does not have NVidia hardware at all. So why require it? Let's just autodetect it and behave sanely. Requiring libdrm_nouveau is a nuisance for developers working on platforms where libdrm doesn't get built with NVidia support. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH intel-gpu-tools] configure: Don't bail if libdrm_nouveau isn't available.
On 10/11/2013 03:51 PM, Chad Versace wrote: On 10/11/2013 02:29 PM, Matt Turner wrote: On Fri, Oct 11, 2013 at 2:24 PM, Daniel Vetter dan...@ffwll.ch wrote: On Fri, Oct 11, 2013 at 12:51:34PM -0700, Matt Turner wrote: On Fri, Oct 11, 2013 at 12:40 AM, Daniel Vetter dan...@ffwll.ch wrote: On Fri, Oct 11, 2013 at 5:56 AM, Matt Turner matts...@gmail.com wrote: We were seriously *requiring* libdrm_nouveau unless explicitly disabled? I've had a bit of hilarious fail with optional testcases that automatically get disabled when depencies aren't around. Hence why they're all required by default with the optional switch to disable them. You've had this problem in general, or with nouveau? In general. So I think what I'm saying is that the nouveau support in intel-gpu-tools isn't the project's primary purpose, so we should allow it to build without nouveau support if you don't have nouveau installed. If you don't have nouveau installed, you're not making a mistake by not building intel-gpu-tools without support. And if you have nouveau installed, you still get nouveau support. I completely agree with Matt. If the system doesn't already have nouveau installed, then it there is very low likelihood that it makes any sense to install nouveau onto that system. It very likely does not have NVidia hardware at all. So why require it? Let's just autodetect it and behave sanely. Requiring libdrm_nouveau is a nuisance for developers working on platforms where libdrm doesn't get built with NVidia support. And, this patch gets Reviewed-by: Chad Versace chad.vers...@linux.intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/hsw: Change default LLC age to 3
On 07/31/2013 04:07 PM, Ben Widawsky wrote: The default LLC age was changed: commit 0d8ff15e9a15f2b393e53337a107b7a1e5919b6d Author: Ben Widawsky benjamin.widaw...@intel.com Date: Thu Jul 4 11:02:03 2013 -0700 drm/i915/hsw: Set correct Haswell PTE encodings. This caused a regression in performance on certain benchmarks. While I think a discussion still needs to happen about how the kernel should default for both eLLC, and LLC - just revert this behavior for now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67062 Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index e7b4204..4b1e6e3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -52,8 +52,10 @@ */ #define HSW_CACHEABILITY_CONTROL(bits)bits) 0x7) 1) | \ (((bits) 0x8) (11 - 3))) +#define HSW_WB_LLC_AGE3HSW_CACHEABILITY_CONTROL(0x2) #define HSW_WB_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0x3) #define HSW_WB_ELLC_LLC_AGE0 HSW_CACHEABILITY_CONTROL(0xb) +#define HSW_LLCHSW_WB_LLC_AGE3 static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr, enum i915_cache_level level) @@ -105,7 +107,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr, pte |= HSW_PTE_ADDR_ENCODE(addr); if (level != I915_CACHE_NONE) - pte |= HSW_WB_LLC_AGE0; + pte |= HSW_LLC; return pte; } I think the patch is clearer without the wrapper define HSW_LLC. But, anyways, with or without the wrapper, Reviewed-by: Chad Versace chad.vers...@linux.intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915: split PCI IDs out into i915_drm.h v3
On 07/24/2013 05:04 PM, Jesse Barnes wrote: For use by userspace (at some point in the future) and other kernel code. v2: move PCI IDs to uabi (Chris) move PCI IDs to drm/ (Dave) v3: fixup Quanta detection - needs to come first (Daniel) Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.c | 164 +++--- include/drm/i915_drm.h |2 + include/drm/i915_pciids.h | 208 +++ 3 files changed, 244 insertions(+), 130 deletions(-) create mode 100644 include/drm/i915_pciids.h +#define INTEL_VGA_DEVICE(id, info) { \ + .class = PCI_BASE_CLASS_DISPLAY 16,\ + .class_mask = 0xff, \ + .vendor = 0x8086, \ + .device = id, \ + .subvendor = PCI_ANY_ID,\ + .subdevice = PCI_ANY_ID,\ + .driver_data = (unsigned long) info } I retract my objections from yesterday. I expected the header to define a static table (like static const struct xxx i915_pci_ids[] = ...), which I didn't like due its inflexibility. But, this macro I do like. It's flexible enough. Acked-by: Chad Versace chad.vers...@linux.intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] [v3] drm/i915: Make i915 events part of uapi
On 07/19/2013 09:16 AM, Ben Widawsky wrote: Make the uevent strings part of the user API for people who wish to write their own listeners. v2: Make a space in the string concatenation. (Chad) Use the UEVENT suffix intead of EVENT (Chad) Make kernel-doc parseable Docbook comments (Daniel) v3: Undid reset change introduced in last submission (Daniel) Fixed up comments to address removal changes. Thanks to Daniel Vetter for a majority of the parity error comments. CC: Chad Versace chad.vers...@linux.intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_irq.c | 8 include/uapi/drm/i915_drm.h | 24 2 files changed, 28 insertions(+), 4 deletions(-) For what it's worth, Reviewed-by: Chad Versace chad.vers...@linux.intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Make i915 events part of uapi
On 07/16/2013 10:07 PM, Daniel Vetter wrote: On Tue, Jul 16, 2013 at 04:41:13PM -0700, Ben Widawsky wrote: Make the uevent strings part of the user API for people who wish to write their own listeners. CC: Chad Versace chad.vers...@linux.intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net One thing I've toyed around with a bit is that we should add kerneldoc to our uapi headers and create a DocBook out of it (maybe as a subsection in the drm userspace api chapter). I guess the DocBook integration needs an overall approach, but we should start to add comments to each piece of userspace api to clearly spec them. See below for what I have in mind ... Yes, docs please. I don't have the kernel-fu of a kernel-dev, so without docs I don't know what these events mean and what to expect from them. - parity_event[0] = L3_PARITY_ERROR=1; + parity_event[0] = I915_L3_PARITY_EVENT=1; Small nitpick. I usually find string concatenation more readable like this, with a space: parity_event[0] = I915_L3_PARITY_EVENT =1; But, that's just my preference, so feel free to ignore me. +#define I915_L3_PARITY_EVENT L3_PARITY_ERROR +#define I915_ERROR_EVENT ERROR +#define I915_RESET_EVENT RESET Maybe this is a dumb question... since these are uevents, do you think the names would be improved by given them a UEVENT suffix? Like I915_ERROR_UEVENT? Or is that dumb, because these tokens are intended to serve more purposes than uevents? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] [v2] drm/i915: Expose LLC size to user space
On 07/11/2013 11:52 AM, Ben Widawsky wrote: The algorithm/information was originally written by Chad, though I changed the control flow, and I think his original code had a couple of bugs, though I didn't look very hard before rewriting. That could have also been different interpretations of the spec. I've tested this on two platforms, and it seems to perform how I want. With this patch is a small tool for igt to query the size. This can be used as a reference for DRI clients wishing to query the information. v2: Update name of the SDM location (Bryan) Dissent: Use a new param instead of reusing HAS_LLC param (Chris, Chad) Fix unicode multiply symbol (Ben) CC: Chad Versace chad.vers...@linux.intel.com CC: Bryan Bell bryan.j.b...@intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 55 + include/uapi/drm/i915_drm.h | 1 + 4 files changed, 61 insertions(+) As long as this is tested and returns the same as L3 in /proc/cpuinfo, this patch is Reviewed-by: Chad Versace chad.vers...@linux.intel.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] intel_get_llc_size: Small tool to query LLC size
On 07/11/2013 11:53 AM, Ben Widawsky wrote: v2: Use the new param CC: Chad Versace chad.vers...@linux.intel.com CC: Bryan Bell bryan.j.b...@intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- tools/Makefile.am | 1 + tools/intel_get_llc_size.c | 58 ++ 2 files changed, 59 insertions(+) create mode 100644 tools/intel_get_llc_size.c diff --git a/tools/Makefile.am b/tools/Makefile.am index 2519169..a064b65 100644 --- a/tools/Makefile.am +++ b/tools/Makefile.am @@ -9,6 +9,7 @@ bin_PROGRAMS = \ intel_bios_dumper \ intel_bios_reader \ intel_error_decode \ + intel_get_llc_size \ intel_gpu_top \ intel_gpu_time \ intel_gtt \ diff --git a/tools/intel_get_llc_size.c b/tools/intel_get_llc_size.c new file mode 100644 index 000..498e252 --- /dev/null +++ b/tools/intel_get_llc_size.c @@ -0,0 +1,58 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + */ + +#include sys/ioctl.h +#include drmtest.h +#include i915_drm.h + +#define LOCAL__I915_PARAM_LLC_SIZE 27 +static int get_llc_size(int fd) +{ + struct drm_i915_getparam gp; + int size; + + gp.param = LOCAL__I915_PARAM_LLC_SIZE; + gp.value = size; + + if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, gp, sizeof(gp))) + return 0; + + return size; +} + +int main(int argc, char **argv) +{ + int size, fd = drm_open_any(); + + size = get_llc_size(fd); + + if (size == 0) + fprintf(stdout, Doesn't have LLC\n); + else if (size == 1) + fprintf(stdout, Kernel is too old to determine LLC size\n); + else + fprintf(stdout, LLC size = %dK\n, size10); This if-tree looks wrong. If the kernel doesn't know about I915_PARAM_LLC_SIZE, the ioctl returns -1, and get_llc_size() returns 0, and the if-tree prints the wrong error. How about making get_llc_size() return -1 on error? Or am I missing something here? + + return 0; +} ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Expose LLC size to user space
On 07/09/2013 07:58 PM, Ben Widawsky wrote: The algorithm/information was originally written by Chad, though I changed the control flow, and I think his original code had a couple of bugs, though I didn't look very hard before rewriting. That could have also been different interpretations of the spec. The excellent comments remain entirely copied from Chad's code. I've tested this on two platforms, and it seems to perform how I want. CC: Chad Versace chad.vers...@linux.intel.com CC: Bryan Bell bryan.j.b...@intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- drivers/gpu/drm/i915/i915_dma.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 53 + 3 files changed, 56 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 0e22142..377949e 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -974,7 +974,7 @@ static int i915_getparam(struct drm_device *dev, void *data, value = 1; break; case I915_PARAM_HAS_LLC: - value = HAS_LLC(dev); + value = dev_priv-llc_size; break; case I915_PARAM_HAS_ALIASING_PPGTT: value = dev_priv-mm.aliasing_ppgtt ? 1 : 0; I would like to see a dedicated param for the llc size. 'has' is always boolean valued, right? diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c8d6104..43a549d 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1187,6 +1187,8 @@ typedef struct drm_i915_private { /* Old dri1 support infrastructure, beware the dragons ya fools entering * here! */ struct i915_dri1_state dri1; + + size_t llc_size; } drm_i915_private_t; /* Iterate over initialised rings */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index af61be8..a070686 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4282,6 +4282,57 @@ i915_gem_lastclose(struct drm_device *dev) DRM_ERROR(failed to idle hardware: %d\n, ret); } +/** + * Return the size, in bytes, of the CPU L3 cache size. If the CPU has no L3 + * cache, or if an error occurs in obtaining the cache size, then return 0. + * From Intel Processor Identification and the CPUID Instruction 5.15 + * Deterministic Cache Parmaeters (Function 04h): + *When EAX is initialized to a value of 4, the CPUID instruction returns + *deterministic cache information in the EAX, EBX, ECX and EDX registers. + *This function requires ECX be initialized with an index which indicates + *which cache to return information about. The OS is expected to call this + *function (CPUID.4) with ECX = 0, 1, 2, until EAX[4:0] == 0, indicating no + *more caches. The order in which the caches are returned is not specified + *and may change at Intel's discretion. + * + * Equation 5-4. Calculating the Cache Size in bytes: + * = (Ways +1) � (Partitions +1) � (Line Size +1) � (Sets +1) + * = (EBX[31:22] +1) � (EBX[21:12] +1) � (EBX[11:0] +1 � (ECX + 1) + */ +static size_t get_llc_size(struct drm_device *dev) +{ + u8 cnt = 0; + unsigned int eax, ebx, ecx, edx; + + if (!HAS_LLC(dev)) + return 0; + + do { + uint32_t cache_level; + uint32_t associativity, line_partitions, line_size, sets; + + eax = 4; + ecx = cnt; + __cpuid(eax, ebx, ecx, edx); + + cache_level = (eax 5) 0x7; + if (cache_level != 3) + continue; + + associativity = ((ebx 22) 0x3ff) + 1; + line_partitions = ((ebx 12) 0x3ff) + 1; + line_size = (ebx 0xfff) + 1; + sets = ecx + 1; + + return associativity * line_partitions * line_size * sets; + } while (eax 0x1f ++cnt); + + /* Let user space know we have LLC, but we can't figure it out */ + DRM_DEBUG_DRIVER(Couldn't find LLC size. Bug?\n); + return 1; +} This function looks good to me, since I wrote most of it ;) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: debugfs entries for [e]LLC
On 07/04/2013 11:46 AM, Ben Widawsky wrote: On Thu, Jul 04, 2013 at 08:43:58PM +0200, Daniel Vetter wrote: On Thu, Jul 4, 2013 at 8:40 PM, Ben Widawsky b...@bwidawsk.net wrote: On Thu, Jul 04, 2013 at 08:14:41PM +0200, Daniel Vetter wrote: On Thu, Jul 04, 2013 at 11:02:07AM -0700, Ben Widawsky wrote: To make users life a little easier figuring out what they have on their system. Ideally, I'd really like to report LLC size, but it turned out to be a bit of a pain. Maybe I'll revisit it in the future. Signed-off-by: Ben Widawsky b...@bwidawsk.net I think a getparam for eLLC would be neat, so that usespace can use it to tune working set sizes. -Daniel And I assume drop debugfs? Yeah, I guess the DRM_INFO message in dmesg should be good enough then. For userspace's convenience we could even look into exposing the LLC size with a getparam. -Daniel I would like to do this since we have easy access to cpuid. I know Chad really wants it. If you'll accept the patch, I'll write it. I really want to know the cache sizes. Actually, I didn't expect the kernel to do this for me. So, I've prototyped a patch for Mesa to probe the cache sizes with CPUID. If the kernel does that for Mesa, then I can likely drop my Mesa patch. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Add pci id for Haswell Harris Beach Mobile GT2+
Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- intel/intel_chipset.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h index b73fa0f..da2fbee 100644 --- a/intel/intel_chipset.h +++ b/intel/intel_chipset.h @@ -82,6 +82,7 @@ #define PCI_CHIP_HASWELL_CRW_S_GT1 0x0D1A /* Server */ #define PCI_CHIP_HASWELL_CRW_S_GT2 0x0D2A #define PCI_CHIP_HASWELL_CRW_S_GT2_PLUS 0x0D3A +#define PCI_CHIP_HASWELL_HSB_M_GT2_PLUS 0x0A26 /* Mobile */ #define IS_830(dev) (dev == 0x3577) #define IS_845(dev) (dev == 0x2562) @@ -198,7 +199,8 @@ devid == PCI_CHIP_HASWELL_ULT_S_GT2_PLUS || \ devid == PCI_CHIP_HASWELL_CRW_GT2_PLUS || \ devid == PCI_CHIP_HASWELL_CRW_M_GT2_PLUS || \ -devid == PCI_CHIP_HASWELL_CRW_S_GT2_PLUS) +devid == PCI_CHIP_HASWELL_CRW_S_GT2_PLUS || \ +devid == PCI_CHIP_HASWELL_HSB_M_GT2_PLUS) #define IS_HASWELL(devid) (IS_HSW_GT1(devid) || \ IS_HSW_GT2(devid)) -- 1.8.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Working from home (Mon 3 Dec)
I'm not coming to the office today. I have a cold. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Fix bufmgr_gem-gen for gen 4
If the pci_device's actual gen was 4, then we stupidly set bufmgr_gem-gen = 6. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- intel/intel_bufmgr_gem.c |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c index 2b4fab1..26e3a5c 100644 --- a/intel/intel_bufmgr_gem.c +++ b/intel/intel_bufmgr_gem.c @@ -2321,8 +2321,14 @@ drm_intel_bufmgr_gem_init(int fd, int batch_size) bufmgr_gem-gen = 3; else if (IS_GEN4(bufmgr_gem-pci_device)) bufmgr_gem-gen = 4; - else + else if (IS_GEN5(bufmgr_gem-pci_device)) + bufmgr_gem-gen = 5; + else if (IS_GEN6(bufmgr_gem-pci_device)) bufmgr_gem-gen = 6; + else if (IS_GEN7(bufmgr_gem-pci_device)) + bufmgr_gem-gen = 7; + else + assert(0); if (IS_GEN3(bufmgr_gem-pci_device) bufmgr_gem-gtt_size 256*1024*1024) { -- 1.7.7.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] Android port of intel-gpu-tools
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/10/2012 08:50 AM, Kavuri, Sateesh wrote: -Original Message- From: Adam Jackson [mailto:a...@redhat.com] Sent: Tuesday, January 10, 2012 8:34 PM To: Kavuri, Sateesh Cc: intel-gfx@lists.freedesktop.org Subject: Re: [Intel-gfx] [PATCH] Android port of intel-gpu-tools On 1/9/12 11:45 PM, Sateesh Kavuri wrote: Added support for Android. Changes include fixes for compilation issues related to Android using an older version of GCC compiler (ver 4.3.3) while the latest version of intel-gpu-tools confirms to GCC ver 4.5.2 (C99 standard functions), using functions like getline(). Fixed such functions, header dependencies for android and added an Android.mk file. I can understand avoiding C99 functions that android doesn't have, but this kind of thing: +#ifdef ANDROID + int i; + for (i = 1; i len; i++) { +#else for (int i = 1; i len; i++) { +#endif is silly. Does gcc -std=c99 on android seriously not cope with this? Yes, -std=c99 would help to get rid of such silly checks (would fix it). Continued this, since there has to be a ANDROID definition for checks like fcntl.h header path - ajax -- Sateesh I want to see the C99 workarounds removed. In Mesa we enabled -std=c99 in the Android build with this hack: # Use C99. ifeq ($(LOCAL_CC),) ifeq ($(LOCAL_IS_HOST_MODULE),true) LOCAL_CC := $(HOST_CC) -std=c99 else LOCAL_CC := $(TARGET_CC) -std=c99 endif endif - Chad Versace chad.vers...@linux.intel.com -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJPFGVPAAoJEAIvNt057x8i+6MP/iX2T5Q9z4zeQB4LlgpNKUVO 5Y6Sl/ZC/AyfWFg1u188pPZvSi6jI86wvZg6MPULzP7NVBx/xG5bsexO0zqDRJ7l PLridgrxOzUbkreN5jzOI0vV+hVcHfJfuOz9YJ5WTq6LHskPpj/fd3sIbZJRJstg sGPEFU9Nfp2bVteE35ci4ASYPzdisO/O6sWB5RiDkUJRa+xvBm7NrzBeA4TbxD8f 8lRhKUAcmv1AFtFVJHOUoyL+UkjJXRiI19cSnAB6mr3r6Nf+NCvrN+Kp298ze3xM /VzLv0GdkmcmtPGzursd8VEEgWwWxWTC39QcURsQjf6+eHs57I77T6XC7YILfGOu bwMRqCSBygVo6MnCcNlzjCPNHATITgDvRbthSiN8tX2wcrndquNKD1+qBlpoFfVg d/m8TQD8UAravGkDWwmTxnFFeqXM9pktrTpk55gO+ZNQtkK5SbWtJ9F5O6Yn3trb 8opFiFi6rJdORzSV/4Ma8ySeTaWc7JnANzYaB4PXIJOTTTgUukX9ARWJqYIyRnj9 sBZdG1wrRt89Tlk7KGBQ7f3nXIpMwe9aLYmnajbd6xHMTnzXrEuBZ65r320uUXHo h+mba6lzJuow3TwcyudvjK5yKPmgQ7ZOV7DmT8wW2pwhVUcIjd4xxgEQsV4xKjLk SciBCELXKhdkHan8ucml =fHSl -END PGP SIGNATURE- ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] Android port of intel-gpu-tools
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/10/2012 04:47 AM, Daniel Vetter wrote: On Tue, Jan 10, 2012 at 10:15:01AM +0530, Sateesh Kavuri wrote: Added support for Android. Changes include fixes for compilation issues related to Android using an older version of GCC compiler (ver 4.3.3) while the latest version of intel-gpu-tools confirms to GCC ver 4.5.2 (C99 standard functions), using functions like getline(). Fixed such functions, header dependencies for android and added an Android.mk file. signed-off-by: Sateesh Kavuri sateesh.kav...@intel.com A few comments - It looks like you need a completely separate makefile for android. Is there no way to let the automake tools generate that somehow? Because this simply won't scale. - There's too much ANDRIOD #ifdef'ery in the code. Either switch to a construct that works on all platforms or extract things into a little helper functions (like the get_total_ram helper that has recently been ported to Solaris). - You don't seem to touch the testsuite, and I think you want it on Andriod, too. Added xorg-devel to cc, maybe someone else has already tried this with a different package, my buildsystem fu is not up to this. Yours, Daniel Daniel, the Android.mk's are the curse of every project that is ported to Android. Android has it's own build system, and those makefiles can't be generated with autotools. This was a contentious issue when Chia-I Wu and I ported Mesa to Android and led to a discussion [1] on mesa-dev. Below is quoted my key email from that discussion (the Dan I'm speaking to is a Debian maitainer). [1] http://article.gmane.org/gmane.comp.video.mesa3d.devel/28881/match=add+toplevel+android+mk To address Dan's questions, the Android build cannot be fitted into autoconf or configs/android. I explored this option, and discovered that it was impossible. The Android build system is... well... interesting. Allow me to explain. The entirety of the Android project --- libc, webkit, the window manager, *everything* --- exists in a single source tree [1]. And that source tree is built with a single, non-recursive invocation of make. Every time I say that, I find it hard to believe myself, so I'll say it again: The entirety of the Android OS, all core libraries and apps, are built with a single, non-recursive invocation of make. (The kernel is the special exception to this all-encompassing build). The final build artifact is a bootable iso image. [1] http://android.git.kernel.org/ Given this unified design of the Android source tree and build process, it requires system components, such as Mesa, to be integrated into its build system. If Mesa is going to support Android, the Android.mk's are necessary. To address another concern of Dan's, this will not add another way to configure the Mesa build. Android handles the system's build configuration in a single location which is outside of Mesa. I'm aware that the extra set of makefiles is unwelcome, but their presence will be innocuous. The only people that will need to touch them are those maintaining the Android build. - Chad Versace chad.vers...@linux.intel.com -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJPFGubAAoJEAIvNt057x8iTckQAKf3DPB3zAnVwEPcmad3BacJ +3SfLISJUJPxoNpROYXiM/50bti/xir6x68Hez3jryzgFpSUTo2jpdXT2CalXSeN ITdScopl4p0gNu4zxXiWzIv5445RLswkf1IArvL25LABRnQ1xSLdp6qTUZcfZXTa bpzPM27p7k2FNvWN9z0vwNI8ebtMAwapvsb2lLyFeIfiB2ldQJIsclP1iOMylRlS wSSEdqmWm7OTzpnocAaptCFDeolNplN7zvBfzMs9/AOCs1miKo+nUlf0GUf1CN0V xuLrIf4UZn8sNLlsw5Mnc3uBIT5q8NR/rz60IswDKgm/W4pJbjRViaaecNjpbuvL tzUE99CUtSmK/MY6TiSqlgTzgYRFk1qGu9bGxhbWyJE76gAjOX7A1OL+DsPj+wgR DfAEKfcfFsUCDCkm3604uhl6hJauu3lDEVxL/6jAm/jMLzvwaxBPoMdS6QFtQ3QR tTlZ4Y13jDnp+SOar3eLjgDE+pcdsFIHaFZ0ZxYEjgCHGVhP4ye3BaqMtMopwths HcscB47/xz4y4CmEgBvDmcuPaxw6rgNd9d8543zvWlheb/dGH9yNJcoLBq0dz3tW P/m0A+Hnv2gfLaopbPZIAPUKdR/iqA4cgoSKQq7OI46He4CHApatPapunL4IKCpU eSulLcm0Dqpc6fz2zL1B =Xlzm -END PGP SIGNATURE- ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] Android port of intel-gpu-tools
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/16/2012 10:36 AM, Daniel Vetter wrote: On Mon, Jan 16, 2012 at 10:25:32AM -0800, Chad Versace wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/10/2012 04:47 AM, Daniel Vetter wrote: On Tue, Jan 10, 2012 at 10:15:01AM +0530, Sateesh Kavuri wrote: Added support for Android. Changes include fixes for compilation issues related to Android using an older version of GCC compiler (ver 4.3.3) while the latest version of intel-gpu-tools confirms to GCC ver 4.5.2 (C99 standard functions), using functions like getline(). Fixed such functions, header dependencies for android and added an Android.mk file. signed-off-by: Sateesh Kavuri sateesh.kav...@intel.com A few comments - It looks like you need a completely separate makefile for android. Is there no way to let the automake tools generate that somehow? Because this simply won't scale. - There's too much ANDRIOD #ifdef'ery in the code. Either switch to a construct that works on all platforms or extract things into a little helper functions (like the get_total_ram helper that has recently been ported to Solaris). - You don't seem to touch the testsuite, and I think you want it on Andriod, too. Added xorg-devel to cc, maybe someone else has already tried this with a different package, my buildsystem fu is not up to this. Yours, Daniel Daniel, the Android.mk's are the curse of every project that is ported to Android. Android has it's own build system, and those makefiles can't be generated with autotools. This was a contentious issue when Chia-I Wu and I ported Mesa to Android and led to a discussion [1] on mesa-dev. Below is quoted my key email from that discussion (the Dan I'm speaking to is a Debian maitainer). [1] http://article.gmane.org/gmane.comp.video.mesa3d.devel/28881/match=add+toplevel+android+mk Meh. I've just read about androgenizer: http://cgit.collabora.com/git/user/derek/androgenizer.git/ Would that be a useful to at least generate the Android.mk in a sensible fashion? I don't have clue about this ... Never heard of androgenizer, but looks interesting. When the Mesa build gets its overdue conversion to autotools, I'll have to give it a try. Otherwise we'll just stick Android.mk into the root dir and I'll forget about this (and probably break it every time I change something). Hah! That's expected if the Android maintainers aren't vigilant. We've mostly fixed that problem in Mesa (when people change the real makefile, the Android build breaks) by forcing the two build systems to share common makefiles when sensible. But I don't suggest that intel-gpu-tools take that approach unless build-breakage problems become painful. - Chad Versace chad.vers...@linux.intel.com -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJPFHIjAAoJEAIvNt057x8ilVAQALd5Z1TSD/SfNBGp1NRTinnz HFN17Zp56apz2bDQgC1hftkJJXNZW83IPm9lseBvpfeKLVLfxkIsp6uklPbVputK YJdI4G7TxDWpjDxHdh/tA45q+5H6hrtsPOBcnQwfDGq99DrAI/Pt6fZAO4c/0fbe ZYnQ42DGeAkHK/XNxvF7zrkc9gJqwYhpUp8EyuguilYmnzyMkwR2SGu8hbDGRTmc L0dNRuArZaaaTnFWLKAWPjgTZVXg9Ah2ZEawWOlvUNmygtOoqGqtcgIsRWdxr4QR 6hRTcbmiwc6CdmR1QsdL+hHs7MPAq1Kvj5CQjjNmCefkZBleu2W7EOnyLXZmfNDM MbUOJ79JesS7z8wEkwLgcYgas9Qj+vaHw7tWWaqLxvxXEldFCcfub+Xjf1+Vi2X4 MoZtGAyNQQ6KHPpPS/ObI4D1Ati7eQy4ZiN7ILRgjmPWj4vy+RVwlwNbC7C9yLe2 kugwJGzRePFSl6Sagf1kFymKyK0MzIfquN/7vNh/2+Z/AOR6yISPqNkoxQveXk06 PK/ee4DngWcsUxuDpYcc1/NgNLccRfddyNINxCcbz2OuwQGbhRSa9Y8dpxJeXSIR yaElS7FFUXCaebeXur3U9KsVq+s9mDpBRRZjaPaSsoss2tWb/F1wz93Yr96QUIIq FOZrSKKdOxM8Q+Fj0SqP =REOX -END PGP SIGNATURE- ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] libdrm,intel: Add Android makefiles (v2)
This enables libdrm.so and libdrm_intel.so to build on Android IceCreamSandwich. v2: Link libdrm_intel to libpciaccess. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- Android.mk | 52 + intel/Android.mk | 57 ++ 2 files changed, 109 insertions(+), 0 deletions(-) create mode 100644 Android.mk create mode 100644 intel/Android.mk diff --git a/Android.mk b/Android.mk new file mode 100644 index 000..1029dc6 --- /dev/null +++ b/Android.mk @@ -0,0 +1,52 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +LIBDRM_VALID_GPU_DRIVERS := i915 i965 + +# Skip this makefile if we're not building for a known GPU. +ifneq ($(filter $(BOARD_GPU_DRIVERS), $(LIBDRM_VALID_GPU_DRIVERS)),) + +LOCAL_PATH := $(call my-dir) +include $(CLEAR_VARS) + +LIBDRM_TOP := $(LOCAL_PATH) + +# Import variable LIBDRM_SOURCES. +include $(LOCAL_PATH)/sources.mk + +LOCAL_MODULE := libdrm +LOCAL_MODULE_TAGS := optional + +LOCAL_SRC_FILES := $(LIBDRM_SOURCES) + +LOCAL_C_INCLUDES := \ + $(LIBDRM_TOP)/include/drm + +LOCAL_CFLAGS := \ + -DHAVE_LIBDRM_ATOMIC_PRIMITIVES=1 + +include $(BUILD_SHARED_LIBRARY) + +include $(LOCAL_PATH)/intel/Android.mk + +endif # BOARD_GPU_DRIVERS diff --git a/intel/Android.mk b/intel/Android.mk new file mode 100644 index 000..3647a14 --- /dev/null +++ b/intel/Android.mk @@ -0,0 +1,57 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +LIBDRM_INTEL_VALID_GPU_DRIVERS := i915 i965 + +# Skip this makefile if we're not building for an Intel GPU. +ifneq ($(filter $(BOARD_GPU_DRIVERS), $(LIBDRM_INTEL_VALID_GPU_DRIVERS)),) + +LOCAL_PATH := $(call my-dir) +include $(CLEAR_VARS) + +# Import variable LIBDRM_INTEL_SOURCES. +include $(LOCAL_PATH)/sources.mk + +LOCAL_MODULE := libdrm_intel +LOCAL_MODULE_TAGS := optional + +LOCAL_SHARED_LIBRARIES := libdrm + +LOCAL_SRC_FILES := $(LIBDRM_INTEL_SOURCES) + +LOCAL_C_INCLUDES := \ + $(LIBDRM_TOP) \ + $(LIBDRM_TOP)/intel \ + $(LIBDRM_TOP)/include/drm \ + external/libpciaccess/include + +LOCAL_CFLAGS := \ + -DHAVE_LIBDRM_ATOMIC_PRIMITIVES=1 + +LOCAL_SHARED_LIBRARIES := \ + libdrm \ + libpciaccess + +include $(BUILD_SHARED_LIBRARY) + +endif # BOARD_GPU_DRIVERS -- 1.7.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/2] libdrm,intel: Factor source file lists into sources.mk
Factor the source file list for libdrm.so from Makefile.am into sources.mk. Ditto for libdrm_intel.so. This is in preparation for adding support for Android. The sources.mk's will be shared between autotools and Android. Rationale: The most commonly changed parts of any makefile are the source lists. So, by sharing the lists between the two build systems, we can reduce the frequency at which modifications to the Linux build breaks the Android build. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- Makefile.am |9 - intel/Makefile.am |9 - intel/sources.mk | 30 ++ sources.mk| 30 ++ 4 files changed, 68 insertions(+), 10 deletions(-) create mode 100644 intel/sources.mk create mode 100644 sources.mk diff --git a/Makefile.am b/Makefile.am index a4d07f4..b62287c 100644 --- a/Makefile.am +++ b/Makefile.am @@ -41,6 +41,9 @@ if HAVE_RADEON RADEON_SUBDIR = radeon endif +# Import variable LIBDRM_SOURCES. +include sources.mk + SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) tests include libdrm_la_LTLIBRARIES = libdrm.la @@ -51,11 +54,7 @@ libdrm_la_LIBADD = @CLOCK_LIB@ libdrm_la_CPPFLAGS = -I$(top_srcdir)/include/drm libdrm_la_SOURCES =\ - xf86drm.c \ - xf86drmHash.c \ - xf86drmRandom.c \ - xf86drmSL.c \ - xf86drmMode.c \ + $(LIBDRM_SOURCES) \ xf86atomic.h\ libdrm_lists.h diff --git a/intel/Makefile.am b/intel/Makefile.am index 801210f..4896861 100644 --- a/intel/Makefile.am +++ b/intel/Makefile.am @@ -22,6 +22,9 @@ # Authors: #Eric Anholt e...@anholt.net +# Import variable LIBDRM_INTEL_SOURCES. +include sources.mk + AM_CFLAGS = \ $(WARN_CFLAGS) \ -I$(top_srcdir) \ @@ -39,13 +42,9 @@ libdrm_intel_la_LIBADD = ../libdrm.la \ @CLOCK_LIB@ libdrm_intel_la_SOURCES = \ - intel_bufmgr.c \ + $(LIBDRM_INTEL_SOURCES) \ intel_bufmgr_priv.h \ - intel_bufmgr_fake.c \ - intel_bufmgr_gem.c \ - intel_decode.c \ intel_chipset.h \ - mm.c \ mm.h intel_bufmgr_gem_o_CFLAGS = $(AM_CFLAGS) -c99 diff --git a/intel/sources.mk b/intel/sources.mk new file mode 100644 index 000..9e8cc5f --- /dev/null +++ b/intel/sources.mk @@ -0,0 +1,30 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +# for libdrm_intel.so +LIBDRM_INTEL_SOURCES := \ + intel_bufmgr.c \ + intel_bufmgr_fake.c \ + intel_bufmgr_gem.c \ + intel_decode.c \ + mm.c diff --git a/sources.mk b/sources.mk new file mode 100644 index 000..3d54e7e --- /dev/null +++ b/sources.mk @@ -0,0 +1,30 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
Re: [Intel-gfx] [PATCH 1/3] intel: Disable drm_intel_probe_agp_aperture_size() on Android
On 12/28/2011 07:11 AM, Purushothaman, Vijay A wrote: You need libpciaccess in Android if you want to use intel-gpu-tools in Android. We have already ported these tools to work in Android. Thanks, Vijay Vijay, please share the source for your Android ports of libpciaccess and intel-gpu-tools. We should try to avoid duplication of effort here. Chad ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/3] libdrm/intel: Add support for Android
From: Chad Versace c...@chad.versace.us This series enables libdrm.so and libdrm_intel.so to build on Android IceCreamSandwich. Chad Versace (3): intel: Disable drm_intel_probe_agp_aperture_size() on Android libdrm,intel: Factor source file lists into sources.mk libdrm,intel: Add Android makefiles Android.mk | 52 ++ Makefile.am |9 +++ intel/Android.mk | 49 +++ intel/Makefile.am|8 +++--- intel/intel_bufmgr.c | 10 - intel/sources.mk | 29 +++ sources.mk | 30 7 files changed, 177 insertions(+), 10 deletions(-) create mode 100644 Android.mk create mode 100644 intel/Android.mk create mode 100644 intel/sources.mk create mode 100644 sources.mk -- 1.7.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/3] intel: Disable drm_intel_probe_agp_aperture_size() on Android
The function uses libpciaccess, which is not present on Android. So, make it a no-op that returns 0. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- intel/intel_bufmgr.c | 10 +- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/intel/intel_bufmgr.c b/intel/intel_bufmgr.c index 905556f..d80b00e 100644 --- a/intel/intel_bufmgr.c +++ b/intel/intel_bufmgr.c @@ -36,11 +36,14 @@ #include errno.h #include drm.h #include i915_drm.h -#include pciaccess.h #include intel_bufmgr.h #include intel_bufmgr_priv.h #include xf86drm.h +#ifndef ANDROID +#include pciaccess.h +#endif + /** @file intel_bufmgr.c * * Convenience functions for buffer management methods. @@ -275,6 +278,10 @@ int drm_intel_get_pipe_from_crtc_id(drm_intel_bufmgr *bufmgr, int crtc_id) static size_t drm_intel_probe_agp_aperture_size(int fd) { +#ifdef ANDROID + /* Android does not have libpciaccess. */ + return 0; +#else struct pci_device *pci_dev; size_t size = 0; int ret; @@ -296,6 +303,7 @@ drm_intel_probe_agp_aperture_size(int fd) err: pci_system_cleanup (); return size; +#endif } int drm_intel_get_aperture_sizes(int fd, -- 1.7.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] libdrm,intel: Factor source file lists into sources.mk
Factor the source file list for libdrm.so from Makefile.am into sources.mk. Ditto for libdrm_intel.so. This is in preparation for adding support for Android. The sources.mk's will be shared between autotools and Android. Rationale: The most commonly changed parts of any makefile are the source lists. So, by sharing the lists between the two build systems, we can reduce the frequency at which modifications to the Linux build breaks the Android build. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- Makefile.am |9 - intel/Makefile.am |8 intel/sources.mk | 29 + sources.mk| 30 ++ 4 files changed, 67 insertions(+), 9 deletions(-) create mode 100644 intel/sources.mk create mode 100644 sources.mk diff --git a/Makefile.am b/Makefile.am index a4d07f4..b62287c 100644 --- a/Makefile.am +++ b/Makefile.am @@ -41,6 +41,9 @@ if HAVE_RADEON RADEON_SUBDIR = radeon endif +# Import variable LIBDRM_SOURCES. +include sources.mk + SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) tests include libdrm_la_LTLIBRARIES = libdrm.la @@ -51,11 +54,7 @@ libdrm_la_LIBADD = @CLOCK_LIB@ libdrm_la_CPPFLAGS = -I$(top_srcdir)/include/drm libdrm_la_SOURCES =\ - xf86drm.c \ - xf86drmHash.c \ - xf86drmRandom.c \ - xf86drmSL.c \ - xf86drmMode.c \ + $(LIBDRM_SOURCES) \ xf86atomic.h\ libdrm_lists.h diff --git a/intel/Makefile.am b/intel/Makefile.am index 7a44aaf..11f0b97 100644 --- a/intel/Makefile.am +++ b/intel/Makefile.am @@ -22,6 +22,9 @@ # Authors: #Eric Anholt e...@anholt.net +# Import variable LIBDRM_INTEL_SOURCES. +include sources.mk + AM_CFLAGS = \ $(WARN_CFLAGS) \ -I$(top_srcdir) \ @@ -36,12 +39,9 @@ libdrm_intel_la_LDFLAGS = -version-number 1:0:0 -no-undefined libdrm_intel_la_LIBADD = ../libdrm.la @PTHREADSTUBS_LIBS@ @PCIACCESS_LIBS@ @CLOCK_LIB@ libdrm_intel_la_SOURCES = \ - intel_bufmgr.c \ + $(LIBDRM_INTEL_SOURCES) \ intel_bufmgr_priv.h \ - intel_bufmgr_fake.c \ - intel_bufmgr_gem.c \ intel_chipset.h \ - mm.c \ mm.h libdrm_intelincludedir = ${includedir}/libdrm diff --git a/intel/sources.mk b/intel/sources.mk new file mode 100644 index 000..24cba76 --- /dev/null +++ b/intel/sources.mk @@ -0,0 +1,29 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +# for libdrm_intel.so +LIBDRM_INTEL_SOURCES := \ + intel_bufmgr.c \ + intel_bufmgr_fake.c \ + intel_bufmgr_gem.c \ + mm.c diff --git a/sources.mk b/sources.mk new file mode 100644 index 000..3d54e7e --- /dev/null +++ b/sources.mk @@ -0,0 +1,30 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS
[Intel-gfx] [PATCH 3/3] libdrm,intel: Add Android makefiles
This enables libdrm.so and libdrm_intel.so to build on Android IceCreamSandwich. Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- Android.mk | 52 intel/Android.mk | 49 + 2 files changed, 101 insertions(+), 0 deletions(-) create mode 100644 Android.mk create mode 100644 intel/Android.mk diff --git a/Android.mk b/Android.mk new file mode 100644 index 000..1029dc6 --- /dev/null +++ b/Android.mk @@ -0,0 +1,52 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +LIBDRM_VALID_GPU_DRIVERS := i915 i965 + +# Skip this makefile if we're not building for a known GPU. +ifneq ($(filter $(BOARD_GPU_DRIVERS), $(LIBDRM_VALID_GPU_DRIVERS)),) + +LOCAL_PATH := $(call my-dir) +include $(CLEAR_VARS) + +LIBDRM_TOP := $(LOCAL_PATH) + +# Import variable LIBDRM_SOURCES. +include $(LOCAL_PATH)/sources.mk + +LOCAL_MODULE := libdrm +LOCAL_MODULE_TAGS := optional + +LOCAL_SRC_FILES := $(LIBDRM_SOURCES) + +LOCAL_C_INCLUDES := \ + $(LIBDRM_TOP)/include/drm + +LOCAL_CFLAGS := \ + -DHAVE_LIBDRM_ATOMIC_PRIMITIVES=1 + +include $(BUILD_SHARED_LIBRARY) + +include $(LOCAL_PATH)/intel/Android.mk + +endif # BOARD_GPU_DRIVERS diff --git a/intel/Android.mk b/intel/Android.mk new file mode 100644 index 000..b3459c0 --- /dev/null +++ b/intel/Android.mk @@ -0,0 +1,49 @@ +# +# Copyright © 2011 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. +# + +LIBDRM_INTEL_VALID_GPU_DRIVERS := i915 i965 + +# Skip this makefile if we're not building for an Intel GPU. +ifneq ($(filter $(BOARD_GPU_DRIVERS), $(LIBDRM_INTEL_VALID_GPU_DRIVERS)),) + +LOCAL_PATH := $(call my-dir) +include $(CLEAR_VARS) + +# Import variable LIBDRM_INTEL_SOURCES. +include $(LOCAL_PATH)/sources.mk + +LOCAL_MODULE := libdrm_intel +LOCAL_MODULE_TAGS := optional + +LOCAL_SHARED_LIBRARIES := libdrm + +LOCAL_SRC_FILES := $(LIBDRM_INTEL_SOURCES) + +LOCAL_C_INCLUDES := \ + $(LIBDRM_TOP) \ + $(LIBDRM_TOP)/intel \ + $(LIBDRM_TOP)/include/drm + +include $(BUILD_SHARED_LIBRARY) + +endif # BOARD_GPU_DRIVERS -- 1.7.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled
On 07/18/2011 01:20 AM, Paul Menzel wrote: Am Montag, den 18.07.2011, 00:55 -0700 schrieb Chad Versace: There are alignment/white space issues above. + unsigned stride = irb-region-pitch;\ + unsigned height = 2 * irb-region-height; \ + bool flip = rb-Name == 0; \ […] Thanks, Paul Thanks. Fix whitespace in both patches. -- Chad Versace c...@chad-versace.us signature.asc Description: OpenPGP digital signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled
Chad Versace (2): xf86-video-intel dri: Do not tile stencil buffer src/intel_dri.c | 16 mesa intel: Fix stencil buffer to be W tiled src/mesa/drivers/dri/intel/intel_clear.c |6 ++ src/mesa/drivers/dri/intel/intel_context.c |9 ++- src/mesa/drivers/dri/intel/intel_fbo.c | 13 +++-- src/mesa/drivers/dri/intel/intel_screen.h |9 ++- src/mesa/drivers/dri/intel/intel_span.c| 85 5 files changed, 89 insertions(+), 33 deletions(-) -- 1.7.6 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] dri: Do not tile stencil buffer
Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This commit mutually depends on the mesa commit: intel: Fix stencil buffer to be W tiled Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:37:45 2011 -0700 CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/intel_dri.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/intel_dri.c b/src/intel_dri.c index 5ea7c2c..4652dc7 100644 --- a/src/intel_dri.c +++ b/src/intel_dri.c @@ -336,7 +336,6 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, switch (attachment) { case DRI2BufferDepth: case DRI2BufferDepthStencil: - case DRI2BufferStencil: case DRI2BufferHiz: if (SUPPORTS_YTILING(intel)) { hint |= INTEL_CREATE_PIXMAP_TILING_Y; @@ -351,6 +350,14 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, case DRI2BufferFrontRight: hint |= INTEL_CREATE_PIXMAP_TILING_X; break; + case DRI2BufferStencil: + /* +* The stencil buffer is W tiled. However, we +* request from the kernel a non-tiled buffer +* because the GTT is incapable of W fencing. +*/ + hint |= INTEL_CREATE_PIXMAP_TILING_NONE; + break; default: free(privates); free(buffer); @@ -368,11 +375,12 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, * To accomplish this, we resort to the nasty hack of doubling * the drm region's cpp and halving its height. * -* If we neglect to double the pitch, then -* drm_intel_gem_bo_map_gtt() maps the memory incorrectly. +* If we neglect to double the pitch, then render corruption + * occurs. */ if (attachment == DRI2BufferStencil) { - pixmap_height /= 2; + pixmap_width = ALIGN(pixmap_width, 64); + pixmap_height = ALIGN(pixmap_height / 2, 64); pixmap_cpp *= 2; } -- 1.7.6 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled
Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This fix touches the following portions of code: - In intel_allocate_renderbuffer_storage(), allocate the stencil buffer with I915_TILING_NONE. - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is not tiled. - In the stencil buffer's span functions, the tile's layout must be decoded in software. This commit mutually depends on the xf86-video-intel commit dri: Do not tile stencil buffer Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:38:00 2011 -0700 On Gen6 with separate stencil enabled, fixes the following Piglit tests: bugs/fdo23670-drawpix_stencil general/stencil-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels spec/EXT_packed_depth_stencil/readpixels-24_8 Note: This is a candidate for the 7.11 branch. CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/mesa/drivers/dri/intel/intel_clear.c |6 ++ src/mesa/drivers/dri/intel/intel_context.c |9 ++- src/mesa/drivers/dri/intel/intel_fbo.c | 13 +++-- src/mesa/drivers/dri/intel/intel_screen.h |9 ++- src/mesa/drivers/dri/intel/intel_span.c| 85 5 files changed, 89 insertions(+), 33 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_clear.c b/src/mesa/drivers/dri/intel/intel_clear.c index dfca03c..5ab9873 100644 --- a/src/mesa/drivers/dri/intel/intel_clear.c +++ b/src/mesa/drivers/dri/intel/intel_clear.c @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask) */ tri_mask |= BUFFER_BIT_STENCIL; } +else if (intel-has_separate_stencil + stencilRegion-tiling == I915_TILING_NONE) { + /* The stencil buffer is actually W tiled, which the hardware +* cannot blit to. */ + tri_mask |= BUFFER_BIT_STENCIL; +} else { /* clearing all stencil bits, use blitting */ blit_mask |= BUFFER_BIT_STENCIL; diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c index 2ba1363..fe8be08 100644 --- a/src/mesa/drivers/dri/intel/intel_context.c +++ b/src/mesa/drivers/dri/intel/intel_context.c @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, assert(stencil_rb-Base.Format == MESA_FORMAT_S8); assert(depth_rb depth_rb-Base.Format == MESA_FORMAT_X8_Z24); - if (stencil_rb-region-tiling == I915_TILING_Y) { + if (stencil_rb-region-tiling == I915_TILING_NONE) { +/* + * The stencil buffer is actually W tiled. The region's tiling is + * I915_TILING_NONE, however, because the GTT is incapable of W + * fencing. + */ intel-intelScreen-dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE; return; } else { @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, * Presently, however, no verification or clean up is necessary, and * execution should not reach here. If the framebuffer still has a hiz * region, then we have already set dri2_has_hiz to true after - * confirming above that the stencil buffer is Y tiled. + * confirming above that the stencil buffer is W tiled. */ assert(0); } diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c index 1669af2..507cc33 100644 --- a/src/mesa/drivers/dri/intel/intel_fbo.c +++ b/src/mesa/drivers/dri/intel/intel_fbo.c @@ -173,6
Re: [Intel-gfx] [Mesa-dev] [PATCH] dri: Do not tile stencil buffer
On 07/18/2011 11:45 AM, Ian Romanick wrote: On 07/18/2011 12:55 AM, Chad Versace wrote: Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This commit mutually depends on the mesa commit: intel: Fix stencil buffer to be W tiled Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:37:45 2011 -0700 CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/intel_dri.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/intel_dri.c b/src/intel_dri.c index 5ea7c2c..4652dc7 100644 --- a/src/intel_dri.c +++ b/src/intel_dri.c @@ -336,7 +336,6 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, switch (attachment) { case DRI2BufferDepth: case DRI2BufferDepthStencil: -case DRI2BufferStencil: case DRI2BufferHiz: if (SUPPORTS_YTILING(intel)) { hint |= INTEL_CREATE_PIXMAP_TILING_Y; @@ -351,6 +350,14 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, case DRI2BufferFrontRight: hint |= INTEL_CREATE_PIXMAP_TILING_X; break; +case DRI2BufferStencil: +/* + * The stencil buffer is W tiled. However, we + * request from the kernel a non-tiled buffer + * because the GTT is incapable of W fencing. + */ +hint |= INTEL_CREATE_PIXMAP_TILING_NONE; +break; default: free(privates); free(buffer); Eh... it seems like this will break compatibility with older versions of Mesa. I haven't dug around, but there used to be a hack in DRI2 where a client would request a depth buffer and a stencil buffer, but it would get the same packed depth-stencil buffer for both. I guess that might all be up in the DRI2 layer and not in the driver... The 'case DRI2BufferStencil' modified in this hunk was added by me when implementing separate stencil support. It was never used for this hack. That hack is implemented by an alternate definition of I830DRI2CreateBuffer() which is ifdef'd out when DRI2INFOREC_VERSION = 2. FYI, the line that implements the hack can be found by grepping xf86-video-intel:src/intel_dri.c for } else if (attachments[i] == DRI2BufferStencil pDepthPixmap) { @@ -368,11 +375,12 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, * To accomplish this, we resort to the nasty hack of doubling * the drm region's cpp and halving its height. * - * If we neglect to double the pitch, then - * drm_intel_gem_bo_map_gtt() maps the memory incorrectly. + * If we neglect to double the pitch, then render corruption + * occurs. Mangled whitespace? Probably mixed tabs and spaces... Oops. Will fix. */ if (attachment == DRI2BufferStencil) { -pixmap_height /= 2; +pixmap_width = ALIGN(pixmap_width, 64); +pixmap_height = ALIGN(pixmap_height / 2, 64); pixmap_cpp *= 2; } -- Chad Versace c...@chad-versace.us signature.asc Description: OpenPGP digital signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled
On 07/18/2011 11:49 AM, Ian Romanick wrote: On 07/18/2011 12:55 AM, Chad Versace wrote: Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This fix touches the following portions of code: - In intel_allocate_renderbuffer_storage(), allocate the stencil buffer with I915_TILING_NONE. - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is not tiled. - In the stencil buffer's span functions, the tile's layout must be decoded in software. This commit mutually depends on the xf86-video-intel commit dri: Do not tile stencil buffer Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:38:00 2011 -0700 On Gen6 with separate stencil enabled, fixes the following Piglit tests: bugs/fdo23670-drawpix_stencil general/stencil-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels spec/EXT_packed_depth_stencil/readpixels-24_8 Note: This is a candidate for the 7.11 branch. CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/mesa/drivers/dri/intel/intel_clear.c |6 ++ src/mesa/drivers/dri/intel/intel_context.c |9 ++- src/mesa/drivers/dri/intel/intel_fbo.c | 13 +++-- src/mesa/drivers/dri/intel/intel_screen.h |9 ++- src/mesa/drivers/dri/intel/intel_span.c| 85 5 files changed, 89 insertions(+), 33 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_clear.c b/src/mesa/drivers/dri/intel/intel_clear.c index dfca03c..5ab9873 100644 --- a/src/mesa/drivers/dri/intel/intel_clear.c +++ b/src/mesa/drivers/dri/intel/intel_clear.c @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask) */ tri_mask |= BUFFER_BIT_STENCIL; } + else if (intel-has_separate_stencil + stencilRegion-tiling == I915_TILING_NONE) { +/* The stencil buffer is actually W tiled, which the hardware + * cannot blit to. */ +tri_mask |= BUFFER_BIT_STENCIL; + } else { /* clearing all stencil bits, use blitting */ blit_mask |= BUFFER_BIT_STENCIL; diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c index 2ba1363..fe8be08 100644 --- a/src/mesa/drivers/dri/intel/intel_context.c +++ b/src/mesa/drivers/dri/intel/intel_context.c @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, assert(stencil_rb-Base.Format == MESA_FORMAT_S8); assert(depth_rb depth_rb-Base.Format == MESA_FORMAT_X8_Z24); - if (stencil_rb-region-tiling == I915_TILING_Y) { + if (stencil_rb-region-tiling == I915_TILING_NONE) { + /* + * The stencil buffer is actually W tiled. The region's tiling is + * I915_TILING_NONE, however, because the GTT is incapable of W + * fencing. + */ intel-intelScreen-dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE; return; } else { @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, * Presently, however, no verification or clean up is necessary, and * execution should not reach here. If the framebuffer still has a hiz * region, then we have already set dri2_has_hiz to true after - * confirming above that the stencil buffer is Y tiled. + * confirming above that the stencil buffer is W tiled. */ assert(0); } diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri
Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled
On 07/18/2011 08:57 AM, Eric Anholt wrote: On Mon, 18 Jul 2011 00:55:03 -0700, Chad Versace c...@chad-versace.us wrote: diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c index 1669af2..507cc33 100644 --- a/src/mesa/drivers/dri/intel/intel_fbo.c +++ b/src/mesa/drivers/dri/intel/intel_fbo.c @@ -173,6 +173,9 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffer if (irb-Base.Format == MESA_FORMAT_S8) { /* + * The stencil buffer is W tiled. However, we request from the kernel a + * non-tiled buffer because the GTT is incapable of W fencing. + * * The stencil buffer has quirky pitch requirements. From Vol 2a, * 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field Surface Pitch: *The pitch must be set to 2x the value computed based on width, as @@ -180,14 +183,14 @@ intel_alloc_renderbuffer_storage(struct gl_context * ctx, struct gl_renderbuffer * To accomplish this, we resort to the nasty hack of doubling the drm * region's cpp and halving its height. * - * If we neglect to double the pitch, then drm_intel_gem_bo_map_gtt() - * maps the memory incorrectly. + * If we neglect to double the pitch, then render corruption occurs. */ irb-region = intel_region_alloc(intel-intelScreen, - I915_TILING_Y, + I915_TILING_NONE, cpp * 2, - width, - height / 2, + ALIGN(width, 64), + /* XXX: Maybe align to 128? */ + ALIGN(height / 2, 64), GL_TRUE); if (!irb-region) return false; This looks like it would fail on a buffer with height = 129. Doesn't seem like 128 pitch requirement would be needed -- has it been tested? diff --git a/src/mesa/drivers/dri/intel/intel_span.c b/src/mesa/drivers/dri/intel/intel_span.c index 153803f..d306432 100644 --- a/src/mesa/drivers/dri/intel/intel_span.c +++ b/src/mesa/drivers/dri/intel/intel_span.c @@ -131,38 +131,77 @@ intel_set_span_functions(struct intel_context *intel, int miny = 0;\ int maxx = rb-Width;\ int maxy = rb-Height; \ - int stride = rb-RowStride; \ - uint8_t *buf = rb-Data; \ +\ + /* \ +* Here we ignore rb-Data and rb-RowStride as set by \ +* intelSpanRenderStart. Since intel_offset_S8 decodes the W tile \ +* manually, the region's *real* base address and stride is \ +* required. \ +*/ \ + struct intel_renderbuffer *irb = intel_renderbuffer(rb); \ + uint8_t *buf = irb-region-buffer-virtual; \ + unsigned stride = irb-region-pitch;\ + unsigned height = 2 * irb-region-height; \ + bool flip = rb-Name == 0; \ -/* Don't flip y. */ #undef Y_FLIP -#define Y_FLIP(y) y +#define Y_FLIP(y) ((1 - flip) * y + flip * (height - 1 - y)) The flip is usually handled by a scale and bias variable, so that Y_FLIP is ((y) * scale + bias). I think it'll produce less code, since Y_FLIP is used a lot. Good call. Does changing the hunk like this look good to you? + struct intel_renderbuffer *irb = intel_renderbuffer(rb);\ + uint8_t *buf = irb-region-buffer-virtual;\ + unsigned stride = irb-region-pitch; \ + unsigned height = 2 * irb-region-height; \ + bool flip = rb-Name == 0; \ + int y_scale = flip ? -1 : 1;\ + int y_bias = flip ? (height - 1) : 0; \ -/* Don't flip y. */ #undef Y_FLIP -#define Y_FLIP(y) y +#define Y_FLIP(y) (y_scale * (y) + y_bias) -- Chad Versace c...@chad-versace.us signature.asc Description: OpenPGP digital signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] [PATCH] intel: Fix stencil buffer to be W tiled
On 07/18/2011 02:02 PM, Ian Romanick wrote: On 07/18/2011 01:54 PM, Chad Versace wrote: On 07/18/2011 11:49 AM, Ian Romanick wrote: On 07/18/2011 12:55 AM, Chad Versace wrote: Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This fix touches the following portions of code: - In intel_allocate_renderbuffer_storage(), allocate the stencil buffer with I915_TILING_NONE. - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is not tiled. - In the stencil buffer's span functions, the tile's layout must be decoded in software. This commit mutually depends on the xf86-video-intel commit dri: Do not tile stencil buffer Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:38:00 2011 -0700 On Gen6 with separate stencil enabled, fixes the following Piglit tests: bugs/fdo23670-drawpix_stencil general/stencil-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels spec/EXT_packed_depth_stencil/readpixels-24_8 Note: This is a candidate for the 7.11 branch. CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/mesa/drivers/dri/intel/intel_clear.c |6 ++ src/mesa/drivers/dri/intel/intel_context.c |9 ++- src/mesa/drivers/dri/intel/intel_fbo.c | 13 +++-- src/mesa/drivers/dri/intel/intel_screen.h |9 ++- src/mesa/drivers/dri/intel/intel_span.c| 85 5 files changed, 89 insertions(+), 33 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_clear.c b/src/mesa/drivers/dri/intel/intel_clear.c index dfca03c..5ab9873 100644 --- a/src/mesa/drivers/dri/intel/intel_clear.c +++ b/src/mesa/drivers/dri/intel/intel_clear.c @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask) */ tri_mask |= BUFFER_BIT_STENCIL; } + else if (intel-has_separate_stencil + stencilRegion-tiling == I915_TILING_NONE) { + /* The stencil buffer is actually W tiled, which the hardware + * cannot blit to. */ + tri_mask |= BUFFER_BIT_STENCIL; + } else { /* clearing all stencil bits, use blitting */ blit_mask |= BUFFER_BIT_STENCIL; diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c index 2ba1363..fe8be08 100644 --- a/src/mesa/drivers/dri/intel/intel_context.c +++ b/src/mesa/drivers/dri/intel/intel_context.c @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, assert(stencil_rb-Base.Format == MESA_FORMAT_S8); assert(depth_rb depth_rb-Base.Format == MESA_FORMAT_X8_Z24); - if (stencil_rb-region-tiling == I915_TILING_Y) { + if (stencil_rb-region-tiling == I915_TILING_NONE) { + /* +* The stencil buffer is actually W tiled. The region's tiling is +* I915_TILING_NONE, however, because the GTT is incapable of W +* fencing. +*/ intel-intelScreen-dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE; return; } else { @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, * Presently, however, no verification or clean up is necessary, and * execution should not reach here. If the framebuffer still has a hiz * region, then we have already set dri2_has_hiz to true after - * confirming above that the stencil buffer is Y tiled. + * confirming above that the stencil buffer is W tiled. */ assert(0); } diff --git
[Intel-gfx] [PATCH] dri: Do not tile stencil buffer
Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This commit mutually depends on the mesa commit: intel: Fix stencil buffer to be W tiled Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:37:45 2011 -0700 CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org CC: Ian Romancik ian.d.roman...@intel.com Signed-off-by: Chad Versace c...@chad-versace.us --- src/intel_dri.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/intel_dri.c b/src/intel_dri.c index 1269422..90abe5f 100644 --- a/src/intel_dri.c +++ b/src/intel_dri.c @@ -335,7 +335,6 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, switch (attachment) { case DRI2BufferDepth: case DRI2BufferDepthStencil: - case DRI2BufferStencil: case DRI2BufferHiz: if (SUPPORTS_YTILING(intel)) { hint |= INTEL_CREATE_PIXMAP_TILING_Y; @@ -350,6 +349,14 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, case DRI2BufferFrontRight: hint |= INTEL_CREATE_PIXMAP_TILING_X; break; + case DRI2BufferStencil: + /* +* The stencil buffer is W tiled. However, we +* request from the kernel a non-tiled buffer +* because the GTT is incapable of W fencing. +*/ + hint |= INTEL_CREATE_PIXMAP_TILING_NONE; + break; default: free(privates); free(buffer); @@ -367,11 +374,12 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, * To accomplish this, we resort to the nasty hack of doubling * the drm region's cpp and halving its height. * -* If we neglect to double the pitch, then -* drm_intel_gem_bo_map_gtt() maps the memory incorrectly. +* If we neglect to double the pitch, then render corruption +* occurs. */ if (attachment == DRI2BufferStencil) { - pixmap_height /= 2; + pixmap_width = ALIGN(pixmap_width, 64); + pixmap_height = ALIGN((pixmap_height + 1) / 2, 64); pixmap_cpp *= 2; } -- 1.7.6 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] intel: Fix stencil buffer to be W tiled
Until now, the stencil buffer was allocated as a Y tiled buffer, because in several locations the PRM states that it is. However, it is actually W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section 4.5.2.1 W-Major Format: W-Major Tile Format is used for separate stencil. The GTT is incapable of W fencing, so we allocate the stencil buffer with I915_TILING_NONE and decode the tile's layout in software. This fix touches the following portions of code: - In intel_allocate_renderbuffer_storage(), allocate the stencil buffer with I915_TILING_NONE. - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is not tiled. - In the stencil buffer's span functions, the tile's layout must be decoded in software. This commit mutually depends on the xf86-video-intel commit dri: Do not tile stencil buffer Author: Chad Versace c...@chad-versace.us Date: Mon Jul 18 00:38:00 2011 -0700 On Gen6 with separate stencil enabled, fixes the following Piglit tests: bugs/fdo23670-drawpix_stencil general/stencil-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels spec/EXT_packed_depth_stencil/readpixels-24_8 Note: This is a candidate for the 7.11 branch. CC: Eric Anholt e...@anholt.net CC: Kenneth Graunke kenn...@whitecape.org CC: Ian Romancik ian.d.roman...@intel.com Signed-off-by: Chad Versace c...@chad-versace.us --- src/mesa/drivers/dri/intel/intel_clear.c |6 ++ src/mesa/drivers/dri/intel/intel_context.c |9 ++- src/mesa/drivers/dri/intel/intel_fbo.c | 12 ++-- src/mesa/drivers/dri/intel/intel_screen.h |9 ++- src/mesa/drivers/dri/intel/intel_span.c| 88 +--- 5 files changed, 93 insertions(+), 31 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_clear.c b/src/mesa/drivers/dri/intel/intel_clear.c index dfca03c..5ab9873 100644 --- a/src/mesa/drivers/dri/intel/intel_clear.c +++ b/src/mesa/drivers/dri/intel/intel_clear.c @@ -143,6 +143,12 @@ intelClear(struct gl_context *ctx, GLbitfield mask) */ tri_mask |= BUFFER_BIT_STENCIL; } +else if (intel-has_separate_stencil + stencilRegion-tiling == I915_TILING_NONE) { + /* The stencil buffer is actually W tiled, which the hardware +* cannot blit to. */ + tri_mask |= BUFFER_BIT_STENCIL; +} else { /* clearing all stencil bits, use blitting */ blit_mask |= BUFFER_BIT_STENCIL; diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c index 2ba1363..fe8be08 100644 --- a/src/mesa/drivers/dri/intel/intel_context.c +++ b/src/mesa/drivers/dri/intel/intel_context.c @@ -1439,7 +1439,12 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, assert(stencil_rb-Base.Format == MESA_FORMAT_S8); assert(depth_rb depth_rb-Base.Format == MESA_FORMAT_X8_Z24); - if (stencil_rb-region-tiling == I915_TILING_Y) { + if (stencil_rb-region-tiling == I915_TILING_NONE) { +/* + * The stencil buffer is actually W tiled. The region's tiling is + * I915_TILING_NONE, however, because the GTT is incapable of W + * fencing. + */ intel-intelScreen-dri2_has_hiz = INTEL_DRI2_HAS_HIZ_TRUE; return; } else { @@ -1527,7 +1532,7 @@ intel_verify_dri2_has_hiz(struct intel_context *intel, * Presently, however, no verification or clean up is necessary, and * execution should not reach here. If the framebuffer still has a hiz * region, then we have already set dri2_has_hiz to true after - * confirming above that the stencil buffer is Y tiled. + * confirming above that the stencil buffer is W tiled. */ assert(0); } diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c index 1669af2..2206391 100644 --- a/src/mesa/drivers/dri/intel/intel_fbo.c +++ b/src/mesa
Re: [Intel-gfx] [PATCH 1/2] Add attachment token DRI2BufferHiz
On Sun, 05 Jun 2011 01:53:29 -0700, Kenneth Graunke kenn...@whitecape.org wrote: On 06/04/2011 05:06 PM, Chad Versace wrote: ... and bump version to 2.6. Signed-off-by: Chad Versacec...@chad-versace.us --- configure.ac |2 +- dri2proto.txt |6 +- dri2tokens.h |1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 297be0e..d671f5a 100644 --- a/configure.ac +++ b/configure.ac @@ -1,5 +1,5 @@ AC_PREREQ([2.60]) -AC_INIT([DRI2Proto], [2.5], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) +AC_INIT([DRI2Proto], [2.6], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) AM_INIT_AUTOMAKE([foreign dist-bzip2]) AM_MAINTAINER_MODE diff --git a/dri2proto.txt b/dri2proto.txt index dc46e58..df763c7 100644 --- a/dri2proto.txt +++ b/dri2proto.txt @@ -178,7 +178,8 @@ DRI2ATTACHMENT { DRI2BufferFrontLeft DRI2BufferAccum DRI2BufferFakeFrontLeft DRI2BufferFakeFrontRight -DRI2BufferDepthStencil } +DRI2BufferDepthStencil +DRI2BufferHiz } These values describe various attachment points for DRI2 buffers. @@ -509,6 +510,8 @@ The DRI2 extension has undergone a number of revisions before 2.3: Added the DRI2InvalidateBuffers event. + 2.6: Enlightenment attained. Added the DRI2BufferHiz attachment. + I hate to be a killjoy, but I'd leave out the first sentence lest someone think this is to support Enlightenment/EFL (enlightenment.org). Reviewed-by: Kenneth Graunke kenn...@whitecape.org It's not evident from the hunk, but I was just following the silly precedent in previous version bumps. 2.0: Awesomeness! 2.1: True excellence. Added DRI2GetBuffersWithFormat to allow more flexible object creation. 2.2: Approaching perfection. Added requests for swapbuffers, MSC and SBC related requests, and events. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/3] [RESUBMIT] ddx/dri: Add support for hiz and separate stencil buffers
patch 1: dri2proto patch 2-3: xf86-video-intel On Sun, 05 Jun 2011 10:10:39 -0700, Eric Anholt e...@anholt.net wrote: This should make configure.ac depend on the bumped version of dri2proto.h, so people get an explanation instead of just build failure. Also, I'm surprised to see you didn't fix the we'll allocate a buffer for *anything* behavior that made the compat path in Mesa strange this time around. Eric, I've fixed the patch series according to your comments and resubmitted. Chad Versace (2): dri: Do not create DRI2 buffers for unrecognized DRI2 buffer tokens dri: Add support for DRI2BufferStencil and DRI2BufferHiz configure.ac|1 + src/intel_dri.c | 44 ++-- 2 files changed, 39 insertions(+), 6 deletions(-) -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] Add attachment token DRI2BufferHiz
... and bump version to 2.6. CC: Eric Anholt e...@anholt.net CC: Ian Romanick i...@freedesktop.org CC: Kristian Høgsberg k...@bitplanet.net Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- configure.ac |2 +- dri2proto.txt |6 +- dri2tokens.h |1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 297be0e..d671f5a 100644 --- a/configure.ac +++ b/configure.ac @@ -1,5 +1,5 @@ AC_PREREQ([2.60]) -AC_INIT([DRI2Proto], [2.5], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) +AC_INIT([DRI2Proto], [2.6], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) AM_INIT_AUTOMAKE([foreign dist-bzip2]) AM_MAINTAINER_MODE diff --git a/dri2proto.txt b/dri2proto.txt index dc46e58..df763c7 100644 --- a/dri2proto.txt +++ b/dri2proto.txt @@ -178,7 +178,8 @@ DRI2ATTACHMENT { DRI2BufferFrontLeft DRI2BufferAccum DRI2BufferFakeFrontLeft DRI2BufferFakeFrontRight -DRI2BufferDepthStencil } +DRI2BufferDepthStencil +DRI2BufferHiz } These values describe various attachment points for DRI2 buffers. @@ -509,6 +510,8 @@ The DRI2 extension has undergone a number of revisions before 2.3: Added the DRI2InvalidateBuffers event. + 2.6: Enlightenment attained. Added the DRI2BufferHiz attachment. + Compatibility up to 2.0 is not preserved, but was also never released. @@ -569,6 +572,7 @@ A.1 Common Types 0x7 DRI2BufferFakeFrontLeft 0x8 DRI2BufferFakeFrontRight 0x9 DRI2BufferDepthStencil + 0xa DRI2BufferHiz └─── Used to encode the possible attachment points. The attachment DRI2BufferDepthStencil is only available with protocol version 1.1 or diff --git a/dri2tokens.h b/dri2tokens.h index 7804e4d..16c9008 100644 --- a/dri2tokens.h +++ b/dri2tokens.h @@ -43,6 +43,7 @@ #define DRI2BufferFakeFrontLeft7 #define DRI2BufferFakeFrontRight 8 #define DRI2BufferDepthStencil 9 +#define DRI2BufferHiz 10 #define DRI2DriverDRI 0 #define DRI2DriverVDPAU1 -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] dri: Do not create DRI2 buffers for unrecognized DRI2 buffer tokens
Before this commit, if a client were to request an unrecognized DRI2 buffer, such as DRI2BufferStencil, then I830DRI2CreateBuffer() allocated and returned an X-tiled buffer by accident. The problem was that unrecognized tokens were caught by the default case of a switch statement. Now, when given unrecognized DRI2 tokens, I830DRI2CreateBuffers() returns null. This shouldn't break older Mesa versions, because they never query (via DRI2GetBuffersWithFormat) for the drawable's DRI2BufferStencil. CC: Eric Anholt e...@anholt.net CC: Ian Romanick i...@freedesktop.org CC: Kristian Høgsberg k...@bitplanet.net CC: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Chad Versace c...@chad-versace.us --- src/intel_dri.c | 14 -- 1 files changed, 12 insertions(+), 2 deletions(-) diff --git a/src/intel_dri.c b/src/intel_dri.c index 48d0f56..4571d07 100644 --- a/src/intel_dri.c +++ b/src/intel_dri.c @@ -338,10 +338,20 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, hint |= INTEL_CREATE_PIXMAP_TILING_Y; break; } - default: + case DRI2BufferAccum: + case DRI2BufferBackLeft: + case DRI2BufferBackRight: + case DRI2BufferFakeFrontLeft: + case DRI2BufferFakeFrontRight: + case DRI2BufferFrontLeft: + case DRI2BufferFrontRight: hint |= INTEL_CREATE_PIXMAP_TILING_X; break; - } + default: + free(privates); + free(buffer); + return NULL; +} } pixmap = screen-CreatePixmap(screen, -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/2] ddx/dri: Add support for hiz and separate stencil buffers
These patches belong to my effort to enable hiz and separate stencil for IVB. Patch 1 belongs to dri2proto. Patch 2 belongs to xf86-video-intel. There is a related patch series on the mesa-dev list that uses this functionality. See the thread with subject: i965: Implement hiz and separate stencil for window framebuffer -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] dri: Add support for DRI2BufferStencil and DRI2BufferHiz
Before this commit, if a client were to request a stencil or hiz buffer, then I830DRI2CreateBuffer() allocated and returned an X-tiled buffer by accident. (DRI2BufferStencil and DRI2BufferHiz were unintentionally caught by the default case of a switch statement.) Now, I830DRI2CreateBuffer() correctly returns a Y-tiled buffer and handles the stencil buffer as a special case due its quirky pitch requirements. This shouldn't break older Mesa versions, because they never query (via DRI2GetBuffersWithFormat) for the drawable's DRI2BufferStencil. Signed-off-by: Chad Versace c...@chad-versace.us --- src/intel_dri.c | 30 ++ 1 files changed, 26 insertions(+), 4 deletions(-) diff --git a/src/intel_dri.c b/src/intel_dri.c index 48d0f56..4bcec45 100644 --- a/src/intel_dri.c +++ b/src/intel_dri.c @@ -329,11 +329,16 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, pixmap = get_front_buffer(drawable); if (pixmap == NULL) { unsigned int hint = INTEL_CREATE_PIXMAP_DRI2; + int pixmap_width = drawable-width; + int pixmap_height = drawable-height; + int pixmap_cpp = (format != 0) ? format : drawable-depth; if (intel-tiling INTEL_TILING_3D) { switch (attachment) { case DRI2BufferDepth: case DRI2BufferDepthStencil: + case DRI2BufferStencil: + case DRI2BufferHiz: if (SUPPORTS_YTILING(intel)) { hint |= INTEL_CREATE_PIXMAP_TILING_Y; break; @@ -344,11 +349,28 @@ I830DRI2CreateBuffer(DrawablePtr drawable, unsigned int attachment, } } + /* +* The stencil buffer has quirky pitch requirements. From Vol +* 2a, 11.5.6.2.1 3DSTATE_STENCIL_BUFFER, field Surface +* Pitch: +*The pitch must be set to 2x the value computed based on +*width, as the stencil buffer is stored with two rows +*interleaved. +* To accomplish this, we resort to the nasty hack of doubling +* the drm region's cpp and halving its height. +* +* If we neglect to double the pitch, then +* drm_intel_gem_bo_map_gtt() maps the memory incorrectly. +*/ + if (attachment == DRI2BufferStencil) { + pixmap_height /= 2; + pixmap_cpp *= 2; + } + pixmap = screen-CreatePixmap(screen, - drawable-width, - drawable-height, - (format != 0) ? format : - drawable-depth, + pixmap_width, + pixmap_height, + pixmap_cpp, hint); if (pixmap == NULL || intel_get_pixmap_bo(pixmap) == NULL) { if (pixmap) -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/2] Add attachment token DRI2BufferHiz
... and bump version to 2.6. Signed-off-by: Chad Versace c...@chad-versace.us --- configure.ac |2 +- dri2proto.txt |6 +- dri2tokens.h |1 + 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 297be0e..d671f5a 100644 --- a/configure.ac +++ b/configure.ac @@ -1,5 +1,5 @@ AC_PREREQ([2.60]) -AC_INIT([DRI2Proto], [2.5], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) +AC_INIT([DRI2Proto], [2.6], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg]) AM_INIT_AUTOMAKE([foreign dist-bzip2]) AM_MAINTAINER_MODE diff --git a/dri2proto.txt b/dri2proto.txt index dc46e58..df763c7 100644 --- a/dri2proto.txt +++ b/dri2proto.txt @@ -178,7 +178,8 @@ DRI2ATTACHMENT { DRI2BufferFrontLeft DRI2BufferAccum DRI2BufferFakeFrontLeft DRI2BufferFakeFrontRight -DRI2BufferDepthStencil } +DRI2BufferDepthStencil +DRI2BufferHiz } These values describe various attachment points for DRI2 buffers. @@ -509,6 +510,8 @@ The DRI2 extension has undergone a number of revisions before 2.3: Added the DRI2InvalidateBuffers event. + 2.6: Enlightenment attained. Added the DRI2BufferHiz attachment. + Compatibility up to 2.0 is not preserved, but was also never released. @@ -569,6 +572,7 @@ A.1 Common Types 0x7 DRI2BufferFakeFrontLeft 0x8 DRI2BufferFakeFrontRight 0x9 DRI2BufferDepthStencil + 0xa DRI2BufferHiz └─── Used to encode the possible attachment points. The attachment DRI2BufferDepthStencil is only available with protocol version 1.1 or diff --git a/dri2tokens.h b/dri2tokens.h index 7804e4d..16c9008 100644 --- a/dri2tokens.h +++ b/dri2tokens.h @@ -43,6 +43,7 @@ #define DRI2BufferFakeFrontLeft7 #define DRI2BufferFakeFrontRight 8 #define DRI2BufferDepthStencil 9 +#define DRI2BufferHiz 10 #define DRI2DriverDRI 0 #define DRI2DriverVDPAU1 -- 1.7.5.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx