[Intel-gfx] ✗ Ro.CI.BAT: failure for Enable i915 perf stream for Haswell OA unit

2016-08-18 Thread Patchwork
== Series Details == Series: Enable i915 perf stream for Haswell OA unit URL : https://patchwork.freedesktop.org/series/11295/ State : failure == Summary == Applying: drm/i915: Add i915 perf infrastructure Using index info to reconstruct a base tree... M drivers/gpu/drm/i915/Makefile M

[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb

2016-08-18 Thread Patchwork
== Series Details == Series: series starting with [1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb URL : https://patchwork.freedesktop.org/series/11285/ State : failure == Summary == Series 11285v1 Series without cover letter

[Intel-gfx] ✗ Ro.CI.BAT: failure for Reclassify messages from GuC loader/submission (rev4)

2016-08-18 Thread Patchwork
== Series Details == Series: Reclassify messages from GuC loader/submission (rev4) URL : https://patchwork.freedesktop.org/series/10918/ State : failure == Summary == Applying: drm: extra printk() wrapper macros Using index info to reconstruct a base tree... M include/drm/drmP.h Falling

[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915/guc: use symbolic names for module parameter values (rev3)

2016-08-18 Thread Patchwork
== Series Details == Series: drm/i915/guc: use symbolic names for module parameter values (rev3) URL : https://patchwork.freedesktop.org/series/10188/ State : failure == Summary == Series 10188v3 drm/i915/guc: use symbolic names for module parameter values

Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Add function to return port from an encoder

2016-08-18 Thread Rodrigo Vivi
On Mon, Aug 15, 2016 at 05:00:53PM -0700, Dhinakaran Pandiyan wrote: > There are places in the driver where we just need the 'port' associated > with an encoder and not 'struct intel_digital_port' that contains it. > This basically is a generic implementation of intel_ddi_get_encoder_port() >

[Intel-gfx] linux-next: manual merge of the jc_docs tree with the drm-misc tree

2016-08-18 Thread Stephen Rothwell
Hi Jonathan, Today's linux-next merge of the jc_docs tree got a conflict in: Documentation/gpu/index.rst between commit: b754b35b089d ("vgaarbiter: rst-ifiy and polish kerneldoc") from the drm-misc tree and commit: 505f711174b0 ("doc-rst: add index to sub-folders") from the jc_docs

[Intel-gfx] [PATCH] drm/i915/dp/mst: Validate modes against the available link bandwidth

2016-08-18 Thread Anusha Srivatsa
Change intel_dp_mst_mode_valid() to use available link bandwidth rather than the link's maximum supported bandwidth to evaluate whether modes are legal for the current configuration. This takes into account the fact that link bandwidth may already be dedicated to other virtual channels.

[Intel-gfx] [PATCH v4 11/11] drm/i915: Add a kerneldoc summary for i915_perf.c

2016-08-18 Thread Robert Bragg
In particular this tries to capture for posterity some of the early challenges we had with using the core perf infrastructure in case we ever want to revisit adapting perf for device metrics. Cc: Chris Wilson Signed-off-by: Robert Bragg ---

[Intel-gfx] [PATCH v4 08/11] drm/i915: Add dev.i915.perf_event_paranoid sysctl option

2016-08-18 Thread Robert Bragg
Consistent with the kernel.perf_event_paranoid sysctl option that can allow non-root users to access system wide cpu metrics, this can optionally allow non-root users to access system wide OA counter metrics from Gen graphics hardware. Signed-off-by: Robert Bragg ---

[Intel-gfx] [PATCH v4 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit

2016-08-18 Thread Robert Bragg
Gen graphics hardware can be set up to periodically write snapshots of performance counters into a circular buffer via its Observation Architecture and this patch exposes that capability to userspace via the i915 perf interface. Cc: Chris Wilson Signed-off-by: Robert

[Intel-gfx] [PATCH v4 07/11] drm/i915: advertise available metrics via sysfs

2016-08-18 Thread Robert Bragg
Each metric set is given a sysfs entry like: /sys/class/drm/card0/metrics//id This allows userspace to enumerate the specific sets that are available for the current system. The 'id' file contains an unsigned integer that can be used to open the associated metric set via

[Intel-gfx] [PATCH v4 03/11] drm/i915: return EACCES for check_cmd() failures

2016-08-18 Thread Robert Bragg
check_cmd() is checking whether a command adheres to certain restrictions that ensure it's safe to execute within a privileged batch buffer. Returning false implies a privilege problem, not that the command is invalid. The distinction makes the difference between allowing the buffer to be

[Intel-gfx] [PATCH v4 02/11] drm/i915: rename OACONTROL GEN7_OACONTROL

2016-08-18 Thread Robert Bragg
OACONTROL changes quite a bit for gen8, with some bits split out into a per-context OACTXCONTROL register. Rename now before adding more gen7 OA registers Signed-off-by: Robert Bragg --- drivers/gpu/drm/i915/i915_cmd_parser.c | 4 ++-- drivers/gpu/drm/i915/i915_reg.h

[Intel-gfx] [PATCH v4 10/11] drm/i915: Add more Haswell OA metric sets

2016-08-18 Thread Robert Bragg
This adds 'compute', 'compute extended', 'memory reads', 'memory writes' and 'sampler balance' metric sets for Haswell. The code is auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml >

[Intel-gfx] [PATCH v4 09/11] drm/i915: add oa_event_min_timer_exponent sysctl

2016-08-18 Thread Robert Bragg
The minimal sampling period is now configurable via a dev.i915.oa_min_timer_exponent sysctl parameter. Following the precedent set by perf, the default is the minimum that won't (on its own) exceed the default kernel.perf_event_max_sample_rate of 100000 samples/s. Signed-off-by: Robert

[Intel-gfx] [PATCH v4 01/11] drm/i915: Add i915 perf infrastructure

2016-08-18 Thread Robert Bragg
Adds base i915 perf infrastructure for Gen performance metrics. This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64 properties to configure a stream of metrics and returns a new fd usable with standard VFS system calls including read() to read typed and sized records; ioctl()

[Intel-gfx] [PATCH v4 05/11] drm/i915: Add 'render basic' Haswell OA unit config

2016-08-18 Thread Robert Bragg
Adds a static OA unit, MUX + B Counter configuration for basic render metrics on Haswell. This is auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C

[Intel-gfx] [PATCH v4 00/11] Enable i915 perf stream for Haswell OA unit

2016-08-18 Thread Robert Bragg
I've updated the stream->ops->read() interface to avoid the struct i915_perf_read_state so it's hopefully a bit clearer to see the state being passed around: int (*read)(struct i915_perf_stream *stream, char __user *buf, size_t count, size_t

[Intel-gfx] [PATCH v4 04/11] drm/i915: don't whitelist oacontrol in cmd parser

2016-08-18 Thread Robert Bragg
Being able to program OACONTROL from a non-privileged batch buffer is not sufficient to be able to configure the OA unit. This was originally allowed to help enable Mesa to expose OA counters via the INTEL_performance_query extension, but the current implementation based on programming OACONTROL

Re: [Intel-gfx] [PATCH v3 03/11] drm/i915: return EACCES for check_cmd() failures

2016-08-18 Thread Robert Bragg
On Mon, Aug 15, 2016 at 4:04 PM, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 03:41:20PM +0100, Robert Bragg wrote: > > check_cmd() is checking whether a command adheres to certain > > restrictions that ensure it's safe to execute within a privileged batch > > buffer.

Re: [Intel-gfx] [PATCH 1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb

2016-08-18 Thread Daniel Vetter
On Thu, Aug 18, 2016 at 07:00:16PM +0100, Chris Wilson wrote: > The drivers have to modify the atomic plane state during the prepare_fb > callback so they track allocations, reservations and dependencies for > this atomic operation involving this fb. In particular, how else do we > set the

[Intel-gfx] [PATCH 1/2] drm: Allow drivers to modify plane_state in prepare_fb/cleanup_fb

2016-08-18 Thread Chris Wilson
The drivers have to modify the atomic plane state during the prepare_fb callback so they track allocations, reservations and dependencies for this atomic operation involving this fb. In particular, how else do we set the plane->fence from the framebuffer! Signed-off-by: Chris Wilson

[Intel-gfx] [PATCH 2/2] drm/i915: Replace intel_plane->wait_req with plane->fence

2016-08-18 Thread Chris Wilson
Now that we subclass our request from struct fence, we start using the common primitives more freely and so avoid hand-rolling routines already provided for by the helpers. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_atomic_plane.c | 3 --

[Intel-gfx] [PATCH v4 3/4] drm/i915/guc: revisit GuC loader message levels

2016-08-18 Thread Dave Gordon
Some downgraded from DRM_ERROR() to DRM_WARN() or DRM_NOTE(), a few upgraded from DRM_INFO() to DRM_NOTE() or DRM_WARN(), and one eliminated completely. v2: different permutation of levels :) v3: convert a couple of "this shouldn't happen" messages to WARN() Signed-off-by: Dave Gordon

[Intel-gfx] [PATCH v4 4/4] NOMERGE: next version of GuC firmware is 8.11

2016-08-18 Thread Dave Gordon
Update GuC firmware version to 8.11, and re-enable GuC loading and submission by default on suitable platforms, since it's Intel's Plan of Record that GuC submission shall be used where available. Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_params.c |

[Intel-gfx] [PATCH v4 1/4] drm: extra printk() wrapper macros

2016-08-18 Thread Dave Gordon
We had only DRM_INFO() and DRM_ERROR(), whereas the underlying printk() provides several other useful intermediate levels such as NOTICE and WARNING. So this patch fills out the set by providing both regular and once-only macros for each of the levels INFO, NOTICE, and WARNING, using a common
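
The idea of regular and once-only wrappers built on one common helper can be sketched in plain userspace C. This is only an illustration, not the patch itself: the SK_* macro names, log_emit() and emit_count are hypothetical, and fprintf() stands in for printk().

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels for this sketch (the kernel uses KERN_* prefixes). */
enum log_level { LOG_INFO, LOG_NOTICE, LOG_WARNING };

/* Number of messages actually emitted; only here so the sketch is observable. */
static unsigned int emit_count;

/* Common helper shared by every wrapper macro below. */
static void log_emit(enum log_level level, const char *fmt, ...)
{
	static const char *const names[] = { "info", "notice", "warning" };
	va_list ap;

	emit_count++;
	fprintf(stderr, "[sketch:%s] ", names[level]);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
}

/* Regular wrappers: emit on every call. */
#define SK_INFO(...)	log_emit(LOG_INFO, __VA_ARGS__)
#define SK_NOTE(...)	log_emit(LOG_NOTICE, __VA_ARGS__)
#define SK_WARN(...)	log_emit(LOG_WARNING, __VA_ARGS__)

/* Once-only wrapper: each call site owns a static flag, so it emits at most once. */
#define SK_ONCE(level, ...)				\
	do {						\
		static int done_;			\
		if (!done_) {				\
			done_ = 1;			\
			log_emit(level, __VA_ARGS__);	\
		}					\
	} while (0)

#define SK_INFO_ONCE(...)	SK_ONCE(LOG_INFO, __VA_ARGS__)
#define SK_NOTE_ONCE(...)	SK_ONCE(LOG_NOTICE, __VA_ARGS__)
#define SK_WARN_ONCE(...)	SK_ONCE(LOG_WARNING, __VA_ARGS__)
```

Calling SK_WARN_ONCE() inside a loop emits a single message, because the static flag belongs to that one call site; a second SK_WARN_ONCE() elsewhere in the code has its own flag.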

[Intel-gfx] [PATCH v4 2/4] drm/i915/guc: downgrade some DRM_ERROR() messages to DRM_WARN()

2016-08-18 Thread Dave Gordon
Where we're going to continue regardless of the problem, rather than fail, then the message should be a WARNing rather than an ERROR. Signed-off-by: Dave Gordon Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_guc_submission.c | 18

[Intel-gfx] [PATCH v4 0/4] Reclassify messages from GuC loader/submission

2016-08-18 Thread Dave Gordon
Various downgrading, upgrading, or general reorganisation of the messages emitted by the GuC code. As general principles: * "can't happen" cases (inconsistencies/misconfiguration) are ERRORs * recoverable (ignored) errors are downgraded to WARNINGs * important auxiliary messages about failure or

[Intel-gfx] [PATCH v4 3/5] drm/i915/guc: symbolic name for GuC log-level none

2016-08-18 Thread Dave Gordon
The existing code that accesses the "guc_log_level" parameter uses an explicit numerical value for the "no logging" case, whereas there are symbolic names for the other levels. So this patch just provides and uses a name for the default log level (NONE), with the same numeric value that is

[Intel-gfx] [PATCH v4 5/5] drm/i915/guc: ignore unrecognised loading & submission options

2016-08-18 Thread Dave Gordon
Previously the code allowed *any* values for the enable_guc_loading and enable_guc_submission parameters, and forced them into range by clipping at each extremum. This version instead ignores unknown values, treating them as DEFAULT (which then gets converted to DISABLED or PREFERRED). Of course
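
The difference between clipping out-of-range values and ignoring them can be sketched as follows. The enum names, their numeric values, and the platform_has_guc flag are assumptions made for this illustration; the snippet above does not show the real definitions or the rule used to convert DEFAULT.

```c
/* Hypothetical preference values; names and numbers are assumed for this sketch. */
enum guc_pref {
	PREF_DEFAULT   = -1,
	PREF_DISABLED  = 0,
	PREF_PREFERRED = 1,
	PREF_MANDATORY = 2,
};

/* Old behaviour as described: force any value into range by clipping. */
static int clamp_pref(int value)
{
	if (value < PREF_DEFAULT)
		return PREF_DEFAULT;
	if (value > PREF_MANDATORY)
		return PREF_MANDATORY;
	return value;
}

/* New behaviour as described: unrecognised values are treated as DEFAULT. */
static int sanitize_pref(int value)
{
	switch (value) {
	case PREF_DISABLED:
	case PREF_PREFERRED:
	case PREF_MANDATORY:
		return value;
	default:
		return PREF_DEFAULT;
	}
}

/*
 * DEFAULT is then converted to DISABLED or PREFERRED. Deciding on platform
 * support is an assumption of this sketch, not taken from the patch.
 */
static int resolve_pref(int value, int platform_has_guc)
{
	value = sanitize_pref(value);
	if (value == PREF_DEFAULT)
		return platform_has_guc ? PREF_PREFERRED : PREF_DISABLED;
	return value;
}
```

With clipping, a bogus value such as 7 silently becomes the strictest setting; with the sanitizing variant it falls back to whatever the default resolves to.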

[Intel-gfx] [PATCH v4 1/5] drm/i915/guc: symbolic names for GuC submission preferences

2016-08-18 Thread Dave Gordon
The existing code that accesses the "enable_guc_submission" parameter uses explicit numerical values for the various possibilities, including in one case relying on boolean 0/1 mapping to specific values (which could be confusing for maintainers). So this patch just provides and uses names for

[Intel-gfx] [PATCH v4 0/5] drm/i915/guc: use symbolic names for module parameter values

2016-08-18 Thread Dave Gordon
There are various literal constants used in the GuC module-parameter processing code; this sequence of patches replaces them with symbolic names for greater clarity. And then it re-enables GuC submission by default v3: Original patch broken into two (1/4 + 2/4) Name for GuC log level NONE

[Intel-gfx] [PATCH v4 2/5] drm/i915/guc: symbolic names for GuC firmware loading preferences

2016-08-18 Thread Dave Gordon
The existing code that accesses the "enable_guc_loading" parameter uses explicit numerical values for the various possibilities, including in one case relying on boolean 0/1 mapping to specific values (which could be confusing for maintainers). So this patch just provides and uses names for the

[Intel-gfx] [PATCH v4 4/5] drm/i915/guc: use symbolic names in setting defaults for module parameters

2016-08-18 Thread Dave Gordon
Of course, this also re-enables GuC loading and submission by default on suitable platforms, since it's Intel's Plan of Record that GuC submission shall be used where available. Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_params.c | 10 +- 1 file

Re: [Intel-gfx] [PATCH 2/2] drm/i915/error: capture errored context based on request context-id

2016-08-18 Thread Dave Gordon
On 11/08/16 17:43, Chris Wilson wrote: On Thu, Aug 11, 2016 at 05:09:01PM +0100, Arun Siluvery wrote: From: Dave Gordon Context capture hasn't worked for a while now, since the introduction of execlists because the function that records active context is using CCID

[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [CI,01/39] drm/i915: Unconditionally flush any chipset buffers before execbuf

2016-08-18 Thread Patchwork
== Series Details == Series: series starting with [CI,01/39] drm/i915: Unconditionally flush any chipset buffers before execbuf URL : https://patchwork.freedesktop.org/series/11278/ State : failure == Summary == Series 11278v1 Series without cover letter

[Intel-gfx] [CI 09/39] drm/i915: Before accessing an object via the cpu, flush GTT writes

2016-08-18 Thread Chris Wilson
If we want to read the pages directly via the CPU, we have to be sure that we flush the writes made via the GTT (as the CPU cannot see the address aliasing). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen ---

[Intel-gfx] [CI 23/39] drm/i915: Fix partial GGTT faulting

2016-08-18 Thread Chris Wilson
We want to always use the partial VMA as a fallback for a failure to bind the object into the GGTT. This extends the support for partial objects in the GGTT to cover everything, not just objects too large. v2: Call the partial view, view not partial. Signed-off-by: Chris Wilson

[Intel-gfx] [CI 05/39] drm/i915: Mark up the GTT flush following WC writes as ORIGIN_CPU

2016-08-18 Thread Chris Wilson
Similarly to invalidating beforehand, if the object is mmapped via I915_MMAP_WC we cannot track writes through the I915_GEM_DOMAIN_GTT. At the conclusion of the write, i915_gem_object_flush_gtt_writes() we also need to treat the origin carefully in case it may have been untracked. See also commit

[Intel-gfx] [CI 29/39] drm/i915: Bump the inactive tracking for all VMA accessed

2016-08-18 Thread Chris Wilson
We track the LRU access for eviction and bump the last access for the user GGTT on set-to-gtt. When we do so we need to not only bump the primary GGTT VMA but all partials as well. Similarly we want to bump the last access tracking for when unpinning an object from the scanout so that they do not

[Intel-gfx] [CI 24/39] drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object

2016-08-18 Thread Chris Wilson
If we want to create a partial vma from a chunk that is the same size as the object, create a normal ggtt vma instead. The benefit is that it will match future requests for the normal ggtt. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen

[Intel-gfx] [CI 18/39] drm/i915: Allocate rings from stolen

2016-08-18 Thread Chris Wilson
If we have stolen available, make use of it for ringbuffer allocation. Previously this was restricted to !llc platforms, as writing to stolen requires a GGTT mapping - but now that we have partial mappable support, the mappable aperture isn't quite so precious so we can use it more freely and

[Intel-gfx] [CI 31/39] drm/i915/cmdparser: Make initialisation failure non-fatal

2016-08-18 Thread Chris Wilson
If the developer adds a register in the wrong order, we BUG during boot. That makes development and testing very difficult. Let's be a bit more friendly and disable the command parser with a big warning if the tables are invalid. Signed-off-by: Chris Wilson Reviewed-by:

[Intel-gfx] [CI 34/39] drm/i915/cmdparser: Only cache the dst vmap

2016-08-18 Thread Chris Wilson
For simplicity, we want to continue using a contiguous mapping of the command buffer, but we can reduce the number of vmappings we hold by switching over to a page-by-page copy from the user batch buffer to the shadow. The cost for saving one linear mapping is about 5% in trivial workloads - which

[Intel-gfx] [CI 37/39] drm/i915/cmdparser: Check for SKIP descriptors first

2016-08-18 Thread Chris Wilson
If the command descriptor says to skip it, ignore checking for any other conflict. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_cmd_parser.c | 3 +++ 1 file changed, 3 insertions(+) diff --git

[Intel-gfx] [CI 19/39] drm/i915/userptr: Make gup errors stickier

2016-08-18 Thread Chris Wilson
Keep any error reported by the gup_worker until we are notified that the arena has changed (via the mmu-notifier). This has the importance of making two consecutive calls to i915_gem_object_get_pages() reporting the same error, and curtailing a loop of detecting a fault and requeueing a

[Intel-gfx] [CI 26/39] drm/i915: Choose not to evict faultable objects from the GGTT

2016-08-18 Thread Chris Wilson
Oftentimes we do not want to evict mapped objects from the GGTT as these are quite expensive to tear down and frequently reused (causing an equally, if not more so, expensive setup). In particular, when faulting in a new object we want to avoid evicting an active object, or else we may trigger a

[Intel-gfx] [CI 36/39] drm/i915/cmdparser: Compare against the previous command descriptor

2016-08-18 Thread Chris Wilson
On the blitter (and in test code), we see long sequences of repeated commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, we can skip the hashtable lookup by remembering the previous command descriptor and doing a straightforward compare of the command header. The corollary is
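
Remembering the previous descriptor amounts to a one-entry cache in front of the table lookup. A minimal sketch, assuming a hypothetical struct cmd_desc with a mask/value pair and using a linear scan (slow_find) as a stand-in for the hashtable lookup; none of these names come from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: a header matches when (header & mask) == value. */
struct cmd_desc {
	uint32_t mask;
	uint32_t value;
};

/* Counts fallbacks to the full lookup; only here to make the effect visible. */
static unsigned int slow_lookups;

/* Stand-in for the hashtable lookup: a plain linear scan over the table. */
static const struct cmd_desc *
slow_find(const struct cmd_desc *table, size_t count, uint32_t header)
{
	size_t i;

	slow_lookups++;
	for (i = 0; i < count; i++)
		if ((header & table[i].mask) == table[i].value)
			return &table[i];
	return NULL;
}

/* Try the previously matched descriptor first, then fall back to the table. */
static const struct cmd_desc *
find_cmd(const struct cmd_desc *table, size_t count, uint32_t header,
	 const struct cmd_desc **prev)
{
	const struct cmd_desc *desc = *prev;

	if (desc && (header & desc->mask) == desc->value)
		return desc;

	desc = slow_find(table, count, header);
	if (desc)
		*prev = desc;	/* remember for the next command */
	return desc;
}
```

For a run of identical commands only the first one pays for the full lookup; every later one is resolved by a single mask-and-compare against the cached descriptor.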

[Intel-gfx] [CI 39/39] drm/i915/cmdparser: Accelerate copies from WC memory

2016-08-18 Thread Chris Wilson
If we need to use clflush to prepare our batch for reads from memory, we can bypass the cache instead by using non-temporal copies. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_cmd_parser.c | 70

[Intel-gfx] [CI 33/39] drm/i915/cmdparser: Use cached vmappings

2016-08-18 Thread Chris Wilson
The single largest factor in the overhead of parsing the commands is the setup of the virtual mapping to provide a continuous block for the batch buffer. If we keep those vmappings around (against the better judgement of mm/vmalloc.c, which we offset by handwaving and looking suggestively at the

[Intel-gfx] [CI 38/39] drm/i915/cmdparser: Use binary search for faster register lookup

2016-08-18 Thread Chris Wilson
A significant proportion of the cmdparsing time for some batches is the cost to find the register in the mmiotable. We ensure that those tables are in ascending order such that we could do a binary search if it was ever merited. It is. Signed-off-by: Chris Wilson
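
Because the tables are kept in ascending order, the lookup can be a textbook binary search. The sketch below assumes a hypothetical struct reg_entry and made-up register offsets in sample_table; it is not the driver's table layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical whitelist entry for this sketch; not the driver's structure. */
struct reg_entry {
	uint32_t addr;
	uint32_t mask;
};

/* Made-up offsets, sorted in ascending order as the search requires. */
static const struct reg_entry sample_table[] = {
	{ 0x1000, 0 },
	{ 0x1004, 0 },
	{ 0x2000, 0 },
	{ 0x2358, 0 },
};

/*
 * Binary search over a table sorted by ascending addr.
 * Returns the matching entry, or NULL if addr is not listed.
 */
static const struct reg_entry *
find_reg(const struct reg_entry *table, size_t count, uint32_t addr)
{
	size_t lo = 0, hi = count;	/* search the half-open range [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].addr < addr)
			lo = mid + 1;
		else if (table[mid].addr > addr)
			hi = mid;
		else
			return &table[mid];
	}
	return NULL;
}
```

The search only gives correct answers if the ordering invariant holds, which is why validating the table order at initialisation matters.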

[Intel-gfx] [CI 32/39] drm/i915/cmdparser: Add the TIMESTAMP register for the other engines

2016-08-18 Thread Chris Wilson
Since I have been using the BCS_TIMESTAMP to measure latency of execution upon the blitter ring, allow regular userspace to also read from that register. They are already allowed RCS_TIMESTAMP! Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld

[Intel-gfx] [CI 30/39] drm/i915: Stop discarding GTT cache-domain on unbind vma

2016-08-18 Thread Chris Wilson
Since commit 43566dedde54 ("drm/i915: Broaden application of set-domain(GTT)") we allowed objects to be in the GTT domain, but unbound. Therefore removing the GTT cache domain when removing the GGTT vma is no longer semantically correct. An unfortunate side-effect is we lose the wondrously named

[Intel-gfx] [CI 35/39] drm/i915/cmdparser: Improve hash function

2016-08-18 Thread Chris Wilson
The existing code's hash function is very suboptimal (most 3D commands use the same bucket, degrading the hash to a long list). The code even acknowledges that the issue was known and the fix simple: /* * If we attempt to generate a perfect hash, we should be able to look at bits * 31:29 of a

[Intel-gfx] [CI 20/39] drm/i915: Rename fence.lru_list to link

2016-08-18 Thread Chris Wilson
Our current practice is to only name the actual list (here dev_priv->fence_list) using "list", and elements upon that list are referred to as "link". Further, the lru nature is of the list and not of the node, and including it in the name does not disambiguate the link from anything else.

[Intel-gfx] [CI 21/39] drm/i915: Move fence tracking from object to vma

2016-08-18 Thread Chris Wilson
In order to handle tiled partial GTT mmappings, we need to associate the fence with an individual vma. v2: A couple of silly drops replaced spotted by Joonas Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen ---

[Intel-gfx] [CI 16/39] drm/i915: Move map-and-fenceable tracking to the VMA

2016-08-18 Thread Chris Wilson
By moving map-and-fenceable tracking from the object to the VMA, we gain fine-grained tracking and the ability to track individual fences on the VMA (subsequent patch). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen ---

[Intel-gfx] [CI 27/39] drm/i915: Fallback to using unmappable memory for scanout

2016-08-18 Thread Chris Wilson
The existing ABI says that scanouts are pinned into the mappable region so that legacy clients (e.g. old Xorg or plymouthd) can write directly into the scanout through a GTT mapping. However if the surface does not fit into the mappable region, we are better off just trying to fit it anywhere and

[Intel-gfx] [CI 25/39] drm/i915: Drop ORIGIN_GTT for untracked GTT writes

2016-08-18 Thread Chris Wilson
If FBC is set on a framebuffer that is unmapped, all GTT faults will be from a partial mapping. Writes by the user through the partial VMA are then untracked by the FBC and so we must use the ORIGIN_CPU when flushing the I915_GEM_DOMAIN_GTT. v2: Keep ORIGIN_CPU for set-to-domain(.write=CPU)

[Intel-gfx] [CI 28/39] drm/i915: Track display alignment on VMA

2016-08-18 Thread Chris Wilson
When using the aliasing ppgtt and pageflipping with the shrinker/eviction active, we note that we often have to rebind the backbuffer before flipping onto the scanout because it has an invalid alignment. If we store the worst-case alignment required for a VMA, we can avoid having to rebind at

[Intel-gfx] [CI 22/39] drm/i915: Choose partial chunksize based on tile row size

2016-08-18 Thread Chris Wilson
In order to support setting up fences for partial mappings of an object, we have to align those mappings with the fence. The minimum chunksize we choose is at least the size of a single tile row. v2: Make minimum chunk size a define for later use Signed-off-by: Chris Wilson

[Intel-gfx] [CI 15/39] drm/i915: Disallow direct CPU access to stolen pages for relocations

2016-08-18 Thread Chris Wilson
As we cannot access the backing pages behind stolen objects, we should not attempt to do so for relocations. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ 1 file

[Intel-gfx] [CI 07/39] drm/i915: Cache kmap between relocations

2016-08-18 Thread Chris Wilson
When doing relocations, we have to obtain a mapping to the page containing the target address. This is either a kmap or iomap depending on GPU and its cache coherency. Neighbouring relocation entries are typically within the same page and so we can cache our kmapping between them and avoid those

[Intel-gfx] [CI 11/39] drm/i915: Pin the pages first in shmem prepare read/write

2016-08-18 Thread Chris Wilson
There is an improbable, but not impossible, case that if we leave the pages unpinned as we operate on the object, then somebody via the shrinker may steal the lock (which lock? right now, it is struct_mutex, THE lock) and change the cache domains after we have already inspected them. (Whilst here,

[Intel-gfx] [CI 12/39] drm/i915: Tidy up flush cpu/gtt write domains

2016-08-18 Thread Chris Wilson
Since we know the write domain, we can drop the local variable and make the code look a tiny bit simpler. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c | 15 --- 1 file

[Intel-gfx] [CI 13/39] drm/i915: Refactor execbuffer relocation writing

2016-08-18 Thread Chris Wilson
With the introduction of the reloc page cache, we are just one step away from refactoring the relocation write functions into one. Not only does it tidy the code (slightly), but it greatly simplifies the control logic much to gcc's satisfaction. v2: Add selftests to document the relationship

[Intel-gfx] [CI 14/39] drm/i915: Fallback to single page GTT mmappings for relocations

2016-08-18 Thread Chris Wilson
If we cannot pin the entire object into the mappable region of the GTT, try to pin a single page instead. This is much more likely to succeed, and prevents us falling back to the clflush slow path. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen

[Intel-gfx] [CI 17/39] drm/i915: Allow ringbuffers to be bound anywhere

2016-08-18 Thread Chris Wilson
Now that we have WC vmapping available, we can bind our rings anywhere in the GGTT and do not need to restrict them to the mappable region. Except for stolen objects, for which direct access is verbatim and we must use the mappable aperture. Signed-off-by: Chris Wilson

[Intel-gfx] [CI 10/39] drm/i915: Wait for writes through the GTT to land before reading back

2016-08-18 Thread Chris Wilson
If we quickly switch from writing through the GTT to a read of the physical page directly with the CPU (e.g. performing relocations through the GTT and then running the command parser), we can observe that the writes are not visible to the CPU. It is not a coherency problem, as extensive

[Intel-gfx] [CI 02/39] agp/intel: Flush chipset writes after updating a single PTE

2016-08-18 Thread Chris Wilson
After we update one PTE for a page, the caller expects to be able to immediately use that through a GGTT read/write. To comply with the caller's expectations we therefore need to flush the chipset buffers before returning. Reported-by: Matti Hämäläinen Fixes: d6473f566417

[Intel-gfx] [CI 08/39] drm/i915: Extract i915_gem_obj_prepare_shmem_write()

2016-08-18 Thread Chris Wilson
This is a companion to i915_gem_obj_prepare_shmem_read() that prepares the backing storage for direct writes. It first serialises with the GPU, pins the backing storage and then indicates what clflushes are required in order for the writes to be coherent. Whilst here, fix support for ancient CPUs

[Intel-gfx] [CI 06/39] drm/i915: Fallback to single page pwrite/pread if unable to release fence

2016-08-18 Thread Chris Wilson
If we cannot release the fence (for example if someone is inexplicably trying to write into a tiled framebuffer that is currently pinned to the display! *cough* kms_frontbuffer_tracking *cough*) fallback to using the page-by-page pwrite/pread interface, rather than fail the syscall entirely.

[Intel-gfx] [CI 04/39] drm/i915: Use ORIGIN_CPU for fb invalidation from pwrite

2016-08-18 Thread Chris Wilson
As pwrite does not use the fence for its GTT access, and may even go through a secondary interface avoiding the main VMA, we cannot treat the write as automatically invalidated by the hardware and so we require ORIGIN_CPU frontbuffer invalidate/flushes. Signed-off-by: Chris Wilson

[Intel-gfx] [CI 03/39] drm/i915: vfree() no longer ignores the low bits of the address

2016-08-18 Thread Chris Wilson
Since vfree() now likes to WARN when passed a non-page-aligned pointer, we need to discard the low bits to comply with it. Fixes: d31d7cb1460c ("drm/i915: Support for creating write combined type vmaps") Signed-off-by: Chris Wilson Cc: Joonas Lahtinen
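
Discarding the low bits only makes sense when a page-aligned pointer carries extra flag bits in its otherwise-zero low bits. A small sketch of that packing and unpacking, assuming a 4096-byte page; SKETCH_PAGE_SIZE and the helper names are hypothetical and not the driver's own macros:

```c
#include <stdint.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_LOW_BITS		((uintptr_t)SKETCH_PAGE_SIZE - 1)

/* Store small flag bits in the low bits of a page-aligned pointer. */
static void *pack_bits(void *ptr, unsigned int bits)
{
	return (void *)((uintptr_t)ptr | ((uintptr_t)bits & SKETCH_LOW_BITS));
}

/* Recover the flag bits. */
static unsigned int unpack_bits(const void *packed)
{
	return (unsigned int)((uintptr_t)packed & SKETCH_LOW_BITS);
}

/* Recover the page-aligned pointer, which is what a free routine must be given. */
static void *unpack_ptr(void *packed)
{
	return (void *)((uintptr_t)packed & ~SKETCH_LOW_BITS);
}
```

Any routine that insists on a page-aligned address has to be handed the result of unpack_ptr(), never the packed value.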

[Intel-gfx] [CI 01/39] drm/i915: Unconditionally flush any chipset buffers before execbuf

2016-08-18 Thread Chris Wilson
If userspace is asynchronously streaming into the batch or other execobjects, we may not flush those writes along with a change in cache domain (as there is no change). Therefore those writes may end up in internal chipset buffers and not visible to the GPU upon execution. We must issue a flush

Re: [Intel-gfx] [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

2016-08-18 Thread Dave Gordon
On 18/08/16 16:27, Dave Gordon wrote: On 18/08/16 13:01, John Harrison wrote: [snip] Can you post the numbers that you get? I seem to get massive variability on my BDW. The render ring always gives me around 2.9us/batch but the other rings sometimes give me region of 1.2us and sometimes

Re: [Intel-gfx] [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

2016-08-18 Thread Dave Gordon
On 18/08/16 16:36, Dave Gordon wrote: On 18/08/16 16:27, Dave Gordon wrote: [snip] Note that SKL GuC firmware 6.1 didn't support dual submission or lite restore, whereas the next version (8.11) does. Therefore, with that firmware we don't see the same slowdown when going to 1-at-a-time

Re: [Intel-gfx] [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

2016-08-18 Thread Dave Gordon
On 18/08/16 16:27, Dave Gordon wrote: [snip] Note that SKL GuC firmware 6.1 didn't support dual submission or lite restore, whereas the next version (8.11) does. Therefore, with that firmware we don't see the same slowdown when going to 1-at-a-time round-robin. I have a different (new) test

Re: [Intel-gfx] [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

2016-08-18 Thread Dave Gordon
On 18/08/16 13:01, John Harrison wrote: On 03/08/2016 17:05, Dave Gordon wrote: On 03/08/16 16:45, Chris Wilson wrote: On Wed, Aug 03, 2016 at 04:36:46PM +0100, Dave Gordon wrote: The parallel execution test in gem_exec_nop chooses a pessimal distribution of work to multiple engines;

Re: [Intel-gfx] [PATCH 2/2] agp/intel: Flush chipset writes after updating a single PTE

2016-08-18 Thread Mika Kuoppala
Chris Wilson writes: > After we update one PTE for a page, the caller expects to be able to > immediately use that through a GGTT read/write. To comply with the > caller's expectations we therefore need to flush the chipset buffers > before returning. > > Reported-by:

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash
On 8/18/2016 8:25 PM, Imre Deak wrote: On Thu, 2016-08-18 at 20:05 +0530, Goel, Akash wrote: On 8/18/2016 7:48 PM, Imre Deak wrote: On Thu, 2016-08-18 at 19:17 +0530, Goel, Akash wrote: [...] Thanks for the inputs. Sorry not familiar with freezable WQ semantics. But after looking at code,

Re: [Intel-gfx] [PATCH v2] drm/i915: Drop ORIGIN_GTT for untracked GTT writes

2016-08-18 Thread Joonas Lahtinen
On Thu, 2016-08-18 at 15:26 +0100, Chris Wilson wrote: > If FBC is set on a framebuffer that is unmapped, all GTT faults will be > from a partial mapping. Writes by the user through the partial VMA are > then untracked by the FBC and so we must use the ORIGIN_CPU when flushing > the

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Imre Deak
On to, 2016-08-18 at 20:05 +0530, Goel, Akash wrote: > > On 8/18/2016 7:48 PM, Imre Deak wrote: > > On to, 2016-08-18 at 19:17 +0530, Goel, Akash wrote: > > > [...] > > > Thanks for the inputs. Sorry not familiar with freezable WQ semantics. > > > But after looking at code, this is what I

Re: [Intel-gfx] [PATCH v2 0/4] Picture aspect ratio support in DRM layer

2016-08-18 Thread Jose Abreu
Hi, On 09-08-2016 15:55, Shashank Sharma wrote: > This patch series adds 4 patches. > - The first two patches add aspect ratio support in DRM layer > - Next two patches add new aspect ratios defined in CEA-861-F > supported for HDMI 2.0 4k modes. > > Adding aspect ratio support in DRM layer: >

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Goel, Akash
On 8/18/2016 7:48 PM, Imre Deak wrote: On to, 2016-08-18 at 19:17 +0530, Goel, Akash wrote: [...] Thanks for the inputs. Sorry not familiar with freezable WQ semantics. But after looking at code, this is what I understood :- 1. freezable Workqueues will be frozen before the system suspend

Re: [Intel-gfx] [PATCH 1/2] drm/i915: Unconditionally flush any chipset buffers before execbuf

2016-08-18 Thread Chris Wilson
On Thu, Aug 18, 2016 at 04:59:35PM +0300, Mika Kuoppala wrote: > Chris Wilson writes: > > > If userspace is asynchronously streaming into the batch or other > > execobjects, we may not flush those writes along with a change in cache > > domain (as there is no change).

[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Drop ORIGIN_GTT for untracked GTT writes (rev2)

2016-08-18 Thread Patchwork
== Series Details == Series: drm/i915: Drop ORIGIN_GTT for untracked GTT writes (rev2) URL : https://patchwork.freedesktop.org/series/11255/ State : failure == Summary == Applying: drm/i915: Drop ORIGIN_GTT for untracked GTT writes fatal: sha1 information is lacking or useless

[Intel-gfx] [PATCH v2] drm/i915: Drop ORIGIN_GTT for untracked GTT writes

2016-08-18 Thread Chris Wilson
If FBC is set on a framebuffer that is unmapped, all GTT faults will be from a partial mapping. Writes by the user through the partial VMA are then untracked by the FBC and so we must use the ORIGIN_CPU when flushing the I915_GEM_DOMAIN_GTT. v2: Keep ORIGIN_CPU for set-to-domain(.write=CPU)

Re: [Intel-gfx] [PATCH 2/2] drm/i915/fbc: Allow on unfenced surfaces, for recent gen

2016-08-18 Thread Joonas Lahtinen
On to, 2016-08-18 at 09:21 +0100, Chris Wilson wrote: > Only fbc1 is tied to using a fence. Later iterations of fbc are more > flexible and allow operation on unfenced frontbuffers. > > Signed-off-by: Chris Wilson > Cc: Daniel Vetter > Cc:

Re: [Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

2016-08-18 Thread Imre Deak
On to, 2016-08-18 at 19:17 +0530, Goel, Akash wrote: > [...] > Thanks for the inputs. Sorry not familiar with freezable WQ semantics. > But after looking at code, this is what I understood :- > 1. freezable Workqueues will be frozen before the system suspend > callbacks are invoked for the

[Intel-gfx] ✗ Ro.CI.BAT: warning for series starting with [v2] drm/i915: Fallback to single page pwrite/pread if unable to release fence (rev2)

2016-08-18 Thread Patchwork
== Series Details == Series: series starting with [v2] drm/i915: Fallback to single page pwrite/pread if unable to release fence (rev2) URL : https://patchwork.freedesktop.org/series/11266/ State : warning == Summary == Series 11266v2 Series without cover letter

Re: [Intel-gfx] [PATCH v12 2/7] drm/i915/skl: Add support for the SAGV, fix underrun hangs

2016-08-18 Thread Lyude Paul
On Thu, 2016-08-18 at 09:39 +0200, Maarten Lankhorst wrote: > Hey, > > Op 17-08-16 om 21:55 schreef Lyude: > > > > Since the watermark calculations for Skylake are still broken, we're apt > > to hitting underruns very easily under multi-monitor configurations. > > While it would be lovely if

Re: [Intel-gfx] [PATCH 00/15] drm/i915: Use connector atomic state in encoders.

2016-08-18 Thread Daniel Vetter
On Tue, Aug 09, 2016 at 05:03:59PM +0200, Maarten Lankhorst wrote: > This is required for supporting nonblocking modeset and atomic connector > properties. > Connector properties will need the connector state to be passed or it will > not work > as intended. > > Nonblocking modesets need to

Re: [Intel-gfx] [PATCH 2/2] drm/i915/fbc: Allow on unfenced surfaces, for recent gen

2016-08-18 Thread ch...@chris-wilson.co.uk
On Thu, Aug 18, 2016 at 01:56:56PM +, Zanoni, Paulo R wrote: > Em Qui, 2016-08-18 às 09:21 +0100, Chris Wilson escreveu: > > Only fbc1 is tied to using a fence. Later iterations of fbc are more > > flexible and allow operation on unfenced frontbuffers. > > But then we'll lose GTT tracking -

Re: [Intel-gfx] [PATCH 15/15] drm/i915: Use more atomic state in intel_color.c

2016-08-18 Thread Daniel Vetter
On Tue, Aug 09, 2016 at 05:04:14PM +0200, Maarten Lankhorst wrote: > crtc_state is already passed around, use it instead of crtc->config. > > Signed-off-by: Maarten Lankhorst Reviewed-by: Daniel Vetter > --- >

Re: [Intel-gfx] [PATCH 14/15] drm/i915: Convert intel_dp to use atomic state

2016-08-18 Thread Daniel Vetter
On Tue, Aug 09, 2016 at 05:04:13PM +0200, Maarten Lankhorst wrote: > Slightly less straightforward. Some of the drrs calls are done from > workers or from intel_ddi.c, pass along crtc_state when we can, > or crtc->config when we can't. > > Signed-off-by: Maarten Lankhorst

Re: [Intel-gfx] [PATCH 1/2] drm/i915: Unconditionally flush any chipset buffers before execbuf

2016-08-18 Thread Mika Kuoppala
Chris Wilson writes: > If userspace is asynchronously streaming into the batch or other > execobjects, we may not flush those writes along with a change in cache > domain (as there is no change). Therefore those writes may end up in > internal chipset buffers and not

Re: [Intel-gfx] [PATCH 2/2] drm/i915/fbc: Allow on unfenced surfaces, for recent gen

2016-08-18 Thread Zanoni, Paulo R
Em Qui, 2016-08-18 às 09:21 +0100, Chris Wilson escreveu: > Only fbc1 is tied to using a fence. Later iterations of fbc are more > flexible and allow operation on unfenced frontbuffers. But then we'll lose GTT tracking - which we currently rely on - and I'm 87.5% sure we'll need to implement some

Re: [Intel-gfx] [PATCH 13/15] drm/i915: Convert intel_dp_mst to use atomic state

2016-08-18 Thread Daniel Vetter
On Tue, Aug 09, 2016 at 05:04:12PM +0200, Maarten Lankhorst wrote: > Signed-off-by: Maarten Lankhorst > --- > drivers/gpu/drm/i915/intel_dp_mst.c | 48 > ++--- > 1 file changed, 18 insertions(+), 30 deletions(-) > > diff --git

[Intel-gfx] ✓ Ro.CI.BAT: success for series starting with [1/2] drm/i915: Unconditionally flush any chipset buffers before execbuf

2016-08-18 Thread Patchwork
== Series Details == Series: series starting with [1/2] drm/i915: Unconditionally flush any chipset buffers before execbuf URL : https://patchwork.freedesktop.org/series/11266/ State : success == Summary == Series 11266v1 Series without cover letter
