Re: [Intel-gfx] [PATCH 0/9] drm: More vblank on/off work
On Mon, 26 May 2014 14:46:23 +0300 ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä ville.syrj...@linux.intel.com
>
> Another vblank series with the following features:
> - Plug a race between drm_vblank_off() and marking the crtc inactive
> - Don't send zeroed vblank events to userspace at drm_vblank_off()
> - Have the user visible vblank counter account for the entire time the
>   crtc was active, regardless of how long vblank interrupts were enabled
> - Avoid random jumps in the user visible vblank counter if the hardware
>   counter gets reset
> - Allow disabling vblank interrupts immediately at drm_vblank_put()
> - Some polish via coccinelle
>
> While setting drm_vblank_offdelay to 0 is now possible, I'm not sure if
> we should set it to 0 automatically in the i915 driver. If there are
> multiple GPUs in the system that setting will affect them all, which
> might have bad consequences if the other GPU doesn't have a hardware
> frame counter, or if it's just buggy. So perhaps we should move that
> option to be per-driver?
>
> Ville Syrjälä (9):
>   drm: Always reject drm_vblank_get() after drm_vblank_off()
>   drm/i915: Warn if drm_vblank_get() still works after drm_vblank_off()
>   drm: Don't clear vblank timestamps when vblank interrupt is disabled
>   drm: Move drm_update_vblank_count()
>   drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()
>   drm: Avoid random vblank counter jumps if the hardware counter has been reset
>   drm: Disable vblank interrupt immediately when drm_vblank_offdelay==0
>   drm: Reduce the amount of dev->vblank[crtc] in the code
>   drm/i915: Leave interrupts enabled while disabling crtcs during suspend

Here's one that may be fixed by this series, needs testing though:
https://bugs.freedesktop.org/show_bug.cgi?id=79054

-- 
Jesse Barnes, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
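The counter-accounting problem the series addresses can be illustrated with a small standalone model (hypothetical names; the DRM core's real bookkeeping lives in drm_update_vblank_count()): the user-visible counter stays monotonic because it accumulates deltas of the wrapping hardware frame counter, instead of being zeroed whenever vblank interrupts are toggled off and on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration, not the DRM core's actual code: keep the
 * user-visible vblank counter monotonic by accumulating deltas of a
 * wrapping hardware frame counter.  'max' is the wrap modulus, e.g.
 * 1 << 24 for a 24-bit counter. */
struct vblank_state {
	uint32_t last_hw;	/* hw counter at last update */
	uint64_t user_count;	/* monotonic, user-visible */
};

static void vblank_update(struct vblank_state *vbl, uint32_t hw_now,
			  uint32_t max)
{
	/* Modular subtraction handles a single wrap of the hw counter. */
	uint32_t diff = (hw_now - vbl->last_hw) & (max - 1);

	vbl->user_count += diff;
	vbl->last_hw = hw_now;
}
```

Accounting this way, an irq-off period only loses ticks if the hardware counter wraps more than once while interrupts are disabled, which is the motivation for sampling it at drm_vblank_off() time.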
Re: [Intel-gfx] [PATCH] drm/i915/bdw: BDW Software Turbo
On Mon, 16 Jun 2014 13:13:38 -0700 Daisy Sun daisy@intel.com wrote:

BDW supports GT C0 residency reporting in constant time unit. Driver
calculates GT utilization based on C0 residency and adjusts RP frequency
up/down accordingly.

Signed-off-by: Daisy Sun daisy@intel.com
---
 drivers/gpu/drm/i915/i915_drv.h      |  17
 drivers/gpu/drm/i915/i915_irq.c      |  10 +++
 drivers/gpu/drm/i915/i915_reg.h      |   4 +
 drivers/gpu/drm/i915/intel_display.c |   2 +
 drivers/gpu/drm/i915/intel_drv.h     |   1 +
 drivers/gpu/drm/i915/intel_pm.c      | 148 +--
 6 files changed, 158 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6b0e174..3a52e84 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -880,6 +880,19 @@ struct vlv_s0ix_state {
 	u32 clock_gate_dis2;
 };
 
+struct intel_rps_bdw_cal {
+	u32 it_threshold_pct; /* interrupt, in percentage */
+	u32 eval_interval; /* evaluation interval, in us */
+	u32 last_ts;
+	u32 last_c0;
+	bool is_up;
+};
+
+struct intel_rps_bdw_turbo {
+	struct intel_rps_bdw_cal up;
+	struct intel_rps_bdw_cal down;
+};
+
 struct intel_gen6_power_mgmt {
 	/* work and pm_iir are protected by dev_priv->irq_lock */
 	struct work_struct work;
@@ -910,6 +923,9 @@ struct intel_gen6_power_mgmt {
 	bool enabled;
 	struct delayed_work delayed_resume_work;
 
+	bool is_bdw_sw_turbo; /* Switch of BDW software turbo */
+	struct intel_rps_bdw_turbo sw_turbo; /* Calculate RP interrupt timing */
+
 	/*
 	 * Protects RPS/RC6 register access and PCU communication.
 	 * Must be taken after struct_mutex if nested.
@@ -2579,6 +2595,7 @@ extern void intel_disable_fbc(struct drm_device *dev);
 extern bool ironlake_set_drps(struct drm_device *dev, u8 val);
 extern void intel_init_pch_refclk(struct drm_device *dev);
 extern void gen6_set_rps(struct drm_device *dev, u8 val);
+extern void bdw_software_turbo(struct drm_device *dev);
 extern void valleyview_set_rps(struct drm_device *dev, u8 val);
 extern int valleyview_rps_max_freq(struct drm_i915_private *dev_priv);
 extern int valleyview_rps_min_freq(struct drm_i915_private *dev_priv);

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b10fbde..9ad1e93 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1530,6 +1530,16 @@ static void i9xx_pipe_crc_irq_handler(struct drm_device *dev, enum pipe pipe)
 			 res1, res2);
 }
 
+void gen8_flip_interrupt(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!dev_priv->rps.is_bdw_sw_turbo)
+		return;
+
+	bdw_software_turbo(dev);
+}
+
 /* The RPS events need forcewake, so we add them to a work queue and mask their
  * IMR bits until the work is done. Other interrupts can be processed without
  * the work queue. */

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 122ed3f..d929f3b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5240,6 +5240,10 @@ enum punit_power_well {
 #define GEN8_UCGCTL6				0x9430
 #define   GEN8_SDEUNIT_CLOCK_GATE_DISABLE	(1 << 14)
 
+#define TIMESTAMP_CTR				0x44070
+#define FREQ_1_28_US(us)			(((us) * 100) >> 7)
+#define MCHBAR_PCU_C0				(MCHBAR_MIRROR_BASE_SNB + 0x5960)
+
 #define GEN6_GFXPAUSE				0xA000
 #define GEN6_RPNSWREQ				0xA008
 #define   GEN6_TURBO_DISABLE			(1 << 31)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 767ca96..2a45617 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9176,6 +9176,8 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	unsigned long flags;
 	int ret;
 
+	gen8_flip_interrupt(dev);
+
 	/* Can't change pixel format via MI display flips. */
 	if (fb->pixel_format != crtc->primary->fb->pixel_format)
 		return -EINVAL;

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index acfc5c8..b8f375e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -948,6 +948,7 @@ void ironlake_teardown_rc6(struct drm_device *dev);
 void gen6_update_ring_freq(struct drm_device *dev);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
 void gen6_rps_boost(struct drm_i915_private *dev_priv);
+void gen8_flip_interrupt(struct drm_device *dev);
 void intel_aux_display_runtime_get(struct drm_i915_private *dev_priv);
 void intel_aux_display_runtime_put(struct drm_i915_private *dev_priv);
 void
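As a rough sketch of the decision logic the commit message describes (the function name and threshold parameters here are illustrative, not the patch's exact code): GT busyness over one evaluation interval is the C0-residency delta expressed as a percentage of elapsed time, with both counters in the same 1.28 us-based unit, compared against up/down thresholds to pick an RP frequency adjustment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the software-turbo up/down decision. */
enum rp_action { RP_NONE, RP_UP, RP_DOWN };

static enum rp_action sw_turbo_decide(uint32_t c0_delta, uint32_t ts_delta,
				      uint32_t up_pct, uint32_t down_pct)
{
	uint32_t busy_pct;

	if (ts_delta == 0)		/* no elapsed time, nothing to judge */
		return RP_NONE;

	/* Widen before multiplying so large counter deltas can't overflow. */
	busy_pct = (uint32_t)(((uint64_t)c0_delta * 100) / ts_delta);

	if (busy_pct > up_pct)
		return RP_UP;		/* GT busy: raise RP frequency */
	if (busy_pct < down_pct)
		return RP_DOWN;		/* GT mostly idle: lower it */
	return RP_NONE;
}
```

The real patch keeps separate `up` and `down` calibration structs (`it_threshold_pct`, `eval_interval`, `last_ts`, `last_c0`), which correspond to the two threshold parameters and the deltas here.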
Re: [Intel-gfx] [PATCH] drm/i915/bdw: BDW Software Turbo
On Thu, 26 Jun 2014 09:42:45 -0700 Jesse Barnes jbar...@virtuousgeek.org wrote:
> On Mon, 16 Jun 2014 13:13:38 -0700 Daisy Sun daisy@intel.com wrote:
> > BDW supports GT C0 residency reporting in constant time unit. Driver
> > calculates GT utilization based on C0 residency and adjusts RP
> > frequency up/down accordingly.
> >
> > Signed-off-by: Daisy Sun daisy@intel.com
[Intel-gfx] [RFC 29/44] drm/i915: Hook scheduler into intel_ring_idle()
From: John Harrison john.c.harri...@intel.com

The code to wait for a ring to be idle ends by calling __wait_seqno() on the
value in the last request structure. However, with a scheduler, there may be
work queued up but not yet submitted. There is also the possibility of
pre-emption re-ordering work after it has been submitted. Thus the last
request structure at the current moment is not necessarily the last piece of
work by the time that particular seqno has completed.

It is not possible to force the scheduler to submit all work from inside the
ring idle function as it might not be a safe place to do so. Instead, it must
simply return early if the scheduler has outstanding work and roll back as
far as releasing the driver mutex lock and returning the system to a
consistent state.
---
 drivers/gpu/drm/i915/i915_scheduler.c   | 12
 drivers/gpu/drm/i915/i915_scheduler.h   |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8
 3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 6b6827f..6a10a76 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -165,6 +165,13 @@ int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
 	return 0;
 }
 
+bool i915_scheduler_is_idle(struct intel_engine_cs *ring)
+{
+	/* Do stuff... */
+
+	return true;
+}
+
 #else	/* CONFIG_DRM_I915_SCHEDULER */
 
 int i915_scheduler_init(struct drm_device *dev)
@@ -177,6 +184,11 @@ int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
 	return 0;
 }
 
+bool i915_scheduler_is_idle(struct intel_engine_cs *ring)
+{
+	return true;
+}
+
 int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 {
 	return i915_gem_do_execbuffer_final(&qe->params);

diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 898d2bb..1b3d51a 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -74,6 +74,7 @@ int i915_scheduler_closefile(struct drm_device *dev,
 			     struct drm_file *file);
 int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
 int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring);
+bool	i915_scheduler_is_idle(struct intel_engine_cs *ring);
 
 #ifdef CONFIG_DRM_I915_SCHEDULER

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1ef0cbd..1ad162b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1651,6 +1651,14 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 			return ret;
 	}
 
+	/* If there is anything outstanding within the scheduler then give up
+	 * now as the submission of such work requires the mutex lock. While
+	 * the lock is definitely held at this point (i915_wait_seqno will BUG
+	 * if called without), the driver is not necessarily at a safe point
+	 * to start submitting ring work. */
+	if (!i915_scheduler_is_idle(ring))
+		return -EAGAIN;
+
 	/* Wait upon the last request to be completed */
 	if (list_empty(&ring->request_list))
 		return 0;
-- 
1.7.9.5
[Intel-gfx] [RFC 09/44] drm/i915: Start of GPU scheduler
From: John Harrison john.c.harri...@intel.com

Created GPU scheduler source files with only a basic init function.
---
 drivers/gpu/drm/i915/Makefile         |  1 +
 drivers/gpu/drm/i915/i915_drv.h       |  4 +++
 drivers/gpu/drm/i915/i915_gem.c       |  3 ++
 drivers/gpu/drm/i915/i915_scheduler.c | 59 +
 drivers/gpu/drm/i915/i915_scheduler.h | 40 ++
 5 files changed, 107 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler.c
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index cad1683..12817a8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -11,6 +11,7 @@ i915-y := i915_drv.o \
 	  i915_params.o \
 	  i915_suspend.o \
 	  i915_sysfs.o \
+	  i915_scheduler.o \
 	  intel_pm.o
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 53f6fe5..6e592d3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1331,6 +1331,8 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+struct i915_scheduler;
+
 struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1540,6 +1542,8 @@ struct drm_i915_private {
 
 	struct i915_runtime_pm pm;
 
+	struct i915_scheduler *scheduler;
+
 	/* Old dri1 support infrastructure, beware the dragons ya fools entering
 	 * here! */
 	struct i915_dri1_state dri1;

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 898660c..b784eb2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -37,6 +37,7 @@
 #include <linux/swap.h>
 #include <linux/pci.h>
 #include <linux/dma-buf.h>
+#include "i915_scheduler.h"
 
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
@@ -4669,6 +4670,8 @@ static int i915_gem_init_rings(struct drm_device *dev)
 			goto cleanup_vebox_ring;
 	}
 
+	i915_scheduler_init(dev);
+
 	ret = i915_gem_set_seqno(dev, ((u32)~0 - 0x1000));
 	if (ret)
 		goto cleanup_bsd2_ring;

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
new file mode 100644
index 0000000..9ec0225
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -0,0 +1,59 @@
+/*
+ * Copyright (c) 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include "intel_drv.h"
+#include "i915_scheduler.h"
+
+#ifdef CONFIG_DRM_I915_SCHEDULER
+
+int i915_scheduler_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_scheduler *scheduler = dev_priv->scheduler;
+
+	if (scheduler)
+		return 0;
+
+	scheduler = kzalloc(sizeof(*scheduler), GFP_KERNEL);
+	if (!scheduler)
+		return -ENOMEM;
+
+	spin_lock_init(&scheduler->lock);
+
+	scheduler->index = 1;
+
+	dev_priv->scheduler = scheduler;
+
+	return 0;
+}
+
+#else /* CONFIG_DRM_I915_SCHEDULER */
+
+int i915_scheduler_init(struct drm_device *dev)
+{
+	return 0;
+}
+
+#endif /* CONFIG_DRM_I915_SCHEDULER */

diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
new file mode 100644
index 0000000..bbe1934
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the
[Intel-gfx] [RFC 07/44] drm/i915: Disable 'get seqno' workaround for VLV
From: John Harrison john.c.harri...@intel.com

There is a workaround for a hardware bug when reading the seqno from the
status page. The bug does not exist on VLV; however, the workaround was
still being applied.
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 5 +++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 279488a..bad5db0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1960,7 +1960,10 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			ring->irq_put = gen6_ring_put_irq;
 		}
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		ring->get_seqno = gen6_ring_get_seqno;
+		if (IS_VALLEYVIEW(dev))
+			ring->get_seqno = ring_get_seqno;
+		else
+			ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		ring->semaphore.sync_to = gen6_ring_sync;
 		ring->semaphore.signal = gen6_signal;
-- 
1.7.9.5
[Intel-gfx] [RFC 03/44] drm/i915: Add extra add_request calls
From: John Harrison john.c.harri...@intel.com

The scheduler needs to track batch buffers by seqno without extra, non-batch
buffer work being attached to the same seqno. This means that anywhere which
adds work to the ring should explicitly call i915_add_request() when it has
finished writing to the ring.

The add_request() function does extra work, such as flushing caches, that
does not necessarily want to be done everywhere. Instead, a new
i915_add_request_wo_flush() function has been added which skips the cache
flush and just tidies up request structures and seqno values.

Note, much of this patch was implemented by Naresh Kumar Kachhi for pending
power management improvements. However, it is also directly applicable to
the scheduler work as noted above.
---
 drivers/gpu/drm/i915/i915_dma.c              |  5 +
 drivers/gpu/drm/i915/i915_drv.h              |  9 +---
 drivers/gpu/drm/i915/i915_gem.c              | 31 --
 drivers/gpu/drm/i915/i915_gem_context.c      |  9
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |  4 ++--
 drivers/gpu/drm/i915/i915_gem_render_state.c |  2 +-
 drivers/gpu/drm/i915/intel_display.c         | 10 -
 7 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 67f2918..494b156 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -456,6 +456,7 @@ static int i915_dispatch_cmdbuffer(struct drm_device * dev,
 				   struct drm_clip_rect *cliprects,
 				   void *cmdbuf)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	int nbox = cmd->num_cliprects;
 	int i = 0, count, ret;
 
@@ -482,6 +483,7 @@ static int i915_dispatch_cmdbuffer(struct drm_device * dev,
 	}
 
 	i915_emit_breadcrumb(dev);
+	i915_add_request_wo_flush(LP_RING(dev_priv));
 	return 0;
 }
 
@@ -544,6 +546,7 @@ static int i915_dispatch_batchbuffer(struct drm_device * dev,
 	}
 
 	i915_emit_breadcrumb(dev);
+	i915_add_request_wo_flush(LP_RING(dev_priv));
 	return 0;
 }
 
@@ -597,6 +600,7 @@ static int i915_dispatch_flip(struct drm_device * dev)
 		ADVANCE_LP_RING();
 	}
 
+	i915_add_request_wo_flush(LP_RING(dev_priv));
 	master_priv->sarea_priv->pf_current_page = dev_priv->dri1.current_page;
 	return 0;
 }
@@ -774,6 +778,7 @@ static int i915_emit_irq(struct drm_device * dev)
 		OUT_RING(dev_priv->dri1.counter);
 		OUT_RING(MI_USER_INTERRUPT);
 		ADVANCE_LP_RING();
+		i915_add_request_wo_flush(LP_RING(dev_priv));
 	}
 
 	return dev_priv->dri1.counter;

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7a96ca0..e3295cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2199,7 +2199,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
-			 struct intel_engine_cs *to);
+			 struct intel_engine_cs *to, bool add_request);
 void i915_vma_move_to_active(struct i915_vma *vma,
 			     struct intel_engine_cs *ring);
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -2272,9 +2272,12 @@ int __must_check i915_gem_suspend(struct drm_device *dev);
 int __i915_add_request(struct intel_engine_cs *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *batch_obj,
-		       u32 *seqno);
+		       u32 *seqno,
+		       bool flush_caches);
 #define i915_add_request(ring, seqno) \
-	__i915_add_request(ring, NULL, NULL, seqno)
+	__i915_add_request(ring, NULL, NULL, seqno, true)
+#define i915_add_request_wo_flush(ring) \
+	__i915_add_request(ring, NULL, NULL, NULL, false)
 int __must_check i915_wait_seqno(struct intel_engine_cs *ring,
 				 uint32_t seqno);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5a13d9e..898660c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2320,7 +2320,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
 int __i915_add_request(struct intel_engine_cs *ring,
 		       struct drm_file *file,
 		       struct drm_i915_gem_object *obj,
-		       u32 *out_seqno)
+		       u32 *out_seqno,
+		       bool flush_caches)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct drm_i915_gem_request *request;
@@ -2335,9 +2336,11 @@ int __i915_add_request(struct
[Intel-gfx] [RFC 05/44] drm/i915: Updating assorted register and status page definitions
From: John Harrison john.c.harri...@intel.com

Added various definitions that will be useful for the scheduler in general
and pre-emptive context switching in particular.
---
 drivers/gpu/drm/i915/i915_drv.h         |  5 ++-
 drivers/gpu/drm/i915/i915_reg.h         | 30 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h | 61 ++-
 3 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e3295cb..53f6fe5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -584,7 +584,10 @@ struct i915_ctx_hang_stats {
 };
 
 /* This must match up with the value previously used for execbuf2.rsvd1. */
-#define DEFAULT_CONTEXT_ID 0
+#define DEFAULT_CONTEXT_ID	0
+/* This must not match any user context */
+#define PREEMPTION_CONTEXT_ID	(-1)
+
 struct intel_context {
 	struct kref ref;
 	int id;

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 242df99..cfc918d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -205,6 +205,10 @@
 #define   MI_GLOBAL_GTT    (1 << 22)
 
 #define MI_NOOP			MI_INSTR(0, 0)
+#define   MI_NOOP_WRITE_ID	(1 << 22)
+#define   MI_NOOP_ID_MASK	((1 << 22) - 1)
+#define   MI_NOOP_MID(id)	((id) & MI_NOOP_ID_MASK)
+#define MI_NOOP_WITH_ID(id)	MI_INSTR(0, MI_NOOP_WRITE_ID | MI_NOOP_MID(id))
 #define MI_USER_INTERRUPT	MI_INSTR(0x02, 0)
 #define MI_WAIT_FOR_EVENT       MI_INSTR(0x03, 0)
 #define   MI_WAIT_FOR_OVERLAY_FLIP	(1 << 16)
@@ -222,6 +226,7 @@
 #define MI_ARB_ON_OFF		MI_INSTR(0x08, 0)
 #define   MI_ARB_ENABLE			(1 << 0)
 #define   MI_ARB_DISABLE		(0 << 0)
+#define MI_ARB_CHECK		MI_INSTR(0x05, 0)
 #define MI_BATCH_BUFFER_END	MI_INSTR(0x0a, 0)
 #define MI_SUSPEND_FLUSH	MI_INSTR(0x0b, 0)
 #define   MI_SUSPEND_FLUSH_EN	(1 << 0)
@@ -260,6 +265,8 @@
 #define   MI_SEMAPHORE_SYNC_INVALID (3 << 16)
 #define   MI_SEMAPHORE_SYNC_MASK    (3 << 16)
 #define MI_SET_CONTEXT		MI_INSTR(0x18, 0)
+#define   MI_CONTEXT_ADDR_MASK		((~0) << 12)
+#define   MI_SET_CONTEXT_FLAG_MASK	((1 << 12) - 1)
 #define   MI_MM_SPACE_GTT		(1 << 8)
 #define   MI_MM_SPACE_PHYSICAL		(0 << 8)
 #define   MI_SAVE_EXT_STATE_EN		(1 << 3)
@@ -270,6 +277,10 @@
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 965+ only */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
 #define   MI_STORE_DWORD_INDEX_SHIFT 2
+#define MI_STORE_REG_MEM	MI_INSTR(0x24, 1)
+#define   MI_STORE_REG_MEM_GTT		(1 << 22)
+#define   MI_STORE_REG_MEM_PREDICATE	(1 << 21)
+
 /* Official intel docs are somewhat sloppy concerning MI_LOAD_REGISTER_IMM:
  * - Always issue a MI_NOOP _before_ the MI_LOAD_REGISTER_IMM - otherwise hw
  *   simply ignores the register load under certain conditions.
@@ -283,7 +294,10 @@
 #define MI_FLUSH_DW		MI_INSTR(0x26, 1) /* for GEN6 */
 #define   MI_FLUSH_DW_STORE_INDEX	(1 << 21)
 #define   MI_INVALIDATE_TLB		(1 << 18)
+#define   MI_FLUSH_DW_OP_NONE		(0 << 14)
 #define   MI_FLUSH_DW_OP_STOREDW	(1 << 14)
+#define   MI_FLUSH_DW_OP_RSVD		(2 << 14)
+#define   MI_FLUSH_DW_OP_STAMP		(3 << 14)
 #define   MI_FLUSH_DW_OP_MASK		(3 << 14)
 #define   MI_FLUSH_DW_NOTIFY		(1 << 8)
 #define   MI_INVALIDATE_BSD		(1 << 7)
@@ -1005,6 +1019,19 @@ enum punit_power_well {
 #define GEN6_VERSYNC	(RING_SYNC_1(VEBOX_RING_BASE))
 #define GEN6_VEVSYNC	(RING_SYNC_2(VEBOX_RING_BASE))
 #define GEN6_NOSYNC 0
+
+/*
+ * Pre-emption-related registers
+ */
+#define RING_UHPTR(base)	((base)+0x134)
+#define   UHPTR_GFX_ADDR_ALIGN	(0x7)
+#define   UHPTR_VALID		(0x1)
+#define RING_PREEMPT_ADDR	0x0214c
+#define   PREEMPT_BATCH_LEVEL_MASK (0x3)
+#define BB_PREEMPT_ADDR		0x02148
+#define SBB_PREEMPT_ADDR	0x0213c
+#define RS_PREEMPT_STATUS	0x0215c
+
 #define RING_MAX_IDLE(base)	((base)+0x54)
 #define RING_HWS_PGA(base)	((base)+0x80)
 #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
@@ -5383,7 +5410,8 @@ enum punit_power_well {
 #define VLV_SPAREG2H	0xA194
 
 #define GTFIFODBG	0x120000
-#define   GT_FIFO_SBDROPERR		(1 << 6)
+#define   GT_FIFO_CPU_ERROR_MASK	0xf
+#define   GT_FIFO_SDDROPERR		(1 << 6)
 #define   GT_FIFO_BLOBDROPERR		(1 << 5)
 #define   GT_FIFO_SB_READ_ABORTERR	(1 << 4)
 #define   GT_FIFO_DROPERR		(1 << 3)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 910c83c..30841ea 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -40,6 +40,12 @@ struct intel_hw_status_page {
 #define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
 #define
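The MI_NOOP identification-register macros added above can be exercised standalone; MI_INSTR's definition is the one from i915_reg.h, and the id round-trips through the low 22 bits of the emitted dword (the decode helper is an illustration, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/* MI_INSTR as defined in i915_reg.h; the MI_NOOP_* macros mirror the
 * patch's new definitions. */
#define MI_INSTR(opcode, flags)	(((opcode) << 23) | (flags))
#define MI_NOOP_WRITE_ID	(1 << 22)
#define MI_NOOP_ID_MASK		((1 << 22) - 1)
#define MI_NOOP_MID(id)		((id) & MI_NOOP_ID_MASK)
#define MI_NOOP_WITH_ID(id)	MI_INSTR(0, MI_NOOP_WRITE_ID | MI_NOOP_MID(id))

/* Hypothetical decode helper: recover the id from an emitted MI_NOOP. */
static uint32_t mi_noop_get_id(uint32_t dword)
{
	return dword & MI_NOOP_ID_MASK;
}
```

Since MI_NOOP's opcode is 0, the whole instruction is just the write-enable bit plus the id, which is what makes it usable as a lightweight in-stream marker for tracking pre-emption progress.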
[Intel-gfx] [RFC 06/44] drm/i915: Fixes for FIFO space queries
From: John Harrison john.c.harri...@intel.com

The previous code was not correctly masking the value of the GTFIFOCTL
register, leading to overruns and the message "MMIO read or write has been
dropped". In addition, the checks were repeated in several different places.
This commit replaces these various checks with a simple (inline) function to
encapsulate the read-and-mask operation. In addition, it adds a custom
wait-for-fifo function for VLV, as the timing parameters are somewhat
different from those on earlier chips.
---
 drivers/gpu/drm/i915/intel_uncore.c | 49 ++-
 1 file changed, 42 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 871c284..6a3dddf 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -47,6 +47,12 @@ assert_device_not_suspended(struct drm_i915_private *dev_priv)
 		  "Device suspended\n");
 }
 
+static inline u32 fifo_free_entries(struct drm_i915_private *dev_priv)
+{
+	u32 count = __raw_i915_read32(dev_priv, GTFIFOCTL);
+
+	return count & GT_FIFO_FREE_ENTRIES_MASK;
+}
+
 static void __gen6_gt_wait_for_thread_c0(struct drm_i915_private *dev_priv)
 {
 	u32 gt_thread_status_mask;
@@ -154,6 +160,28 @@ static void __gen7_gt_force_wake_mt_put(struct drm_i915_private *dev_priv,
 	gen6_gt_check_fifodbg(dev_priv);
 }
 
+static int __vlv_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
+{
+	u32 free = fifo_free_entries(dev_priv);
+	int loop1, loop2;
+
+	for (loop1 = 0; loop1 < 5000 && free <= GT_FIFO_NUM_RESERVED_ENTRIES; ) {
+		for (loop2 = 0; loop2 < 1000 && free <= GT_FIFO_NUM_RESERVED_ENTRIES; loop2 += 10) {
+			udelay(10);
+			free = fifo_free_entries(dev_priv);
+		}
+		loop1 += loop2;
+		if (loop1 > 1000 || free < 48)
+			DRM_DEBUG("after %d us, the FIFO has %d slots", loop1, free);
+	}
+
+	dev_priv->uncore.fifo_count = free;
+	if (WARN(free <= GT_FIFO_NUM_RESERVED_ENTRIES,
+		 "FIFO has insufficient space (%d slots)", free))
+		return -1;
+	return 0;
+}
+
 static int __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 {
 	int ret = 0;
@@ -161,16 +189,15 @@ static int __gen6_gt_wait_for_fifo(struct drm_i915_private *dev_priv)
 	/* On VLV, FIFO will be shared by both SW and HW.
 	 * So, we need to read the FREE_ENTRIES everytime */
 	if (IS_VALLEYVIEW(dev_priv->dev))
-		dev_priv->uncore.fifo_count =
-			__raw_i915_read32(dev_priv, GTFIFOCTL) & GT_FIFO_FREE_ENTRIES_MASK;
+		return __vlv_gt_wait_for_fifo(dev_priv);
 
 	if (dev_priv->uncore.fifo_count < GT_FIFO_NUM_RESERVED_ENTRIES) {
 		int loop = 500;
-		u32 fifo = __raw_i915_read32(dev_priv, GTFIFOCTL) & GT_FIFO_FREE_ENTRIES_MASK;
+		u32 fifo = fifo_free_entries(dev_priv);
+
 		while (fifo <= GT_FIFO_NUM_RESERVED_ENTRIES && loop--) {
 			udelay(10);
-			fifo = __raw_i915_read32(dev_priv, GTFIFOCTL) & GT_FIFO_FREE_ENTRIES_MASK;
+			fifo = fifo_free_entries(dev_priv);
 		}
 		if (WARN_ON(loop < 0 && fifo <= GT_FIFO_NUM_RESERVED_ENTRIES))
 			++ret;
@@ -194,6 +221,11 @@ static void vlv_force_wake_reset(struct drm_i915_private *dev_priv)
 static void __vlv_force_wake_get(struct drm_i915_private *dev_priv,
 				 int fw_engine)
 {
+#if 1
+	if (__gen6_gt_wait_for_fifo(dev_priv))
+		gen6_gt_check_fifodbg(dev_priv);
+#endif
+
 	/* Check for Render Engine */
 	if (FORCEWAKE_RENDER & fw_engine) {
 		if (wait_for_atomic((__raw_i915_read32(dev_priv,
@@ -238,6 +270,10 @@ static void __vlv_force_wake_get(struct drm_i915_private *dev_priv,
 static void __vlv_force_wake_put(struct drm_i915_private *dev_priv,
 				 int fw_engine)
 {
+#if 1
+	if (__gen6_gt_wait_for_fifo(dev_priv))
+		gen6_gt_check_fifodbg(dev_priv);
+#endif
 
 	/* Check for Render Engine */
 	if (FORCEWAKE_RENDER & fw_engine)
@@ -355,8 +391,7 @@ static void intel_uncore_forcewake_reset(struct drm_device *dev, bool restore)
 
 		if (IS_GEN6(dev) || IS_GEN7(dev))
 			dev_priv->uncore.fifo_count =
-				__raw_i915_read32(dev_priv, GTFIFOCTL) &
-				GT_FIFO_FREE_ENTRIES_MASK;
+				fifo_free_entries(dev_priv);
 	} else {
 		dev_priv->uncore.forcewake_count = 0;
 		dev_priv->uncore.fw_rendercount = 0;
-- 
1.7.9.5
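The point of the `fifo_free_entries()` helper is that only the low bits of GTFIFOCTL carry the free-entry count; the rest is unrelated status, which is what the unmasked reads were picking up. A standalone model of the masked read and the reserved-entry check it centralizes (the mask and reserved count here match GT_FIFO_FREE_ENTRIES_MASK and GT_FIFO_NUM_RESERVED_ENTRIES as I understand them from i915_reg.h and intel_uncore.c, but treat them as assumptions):

```c
#include <assert.h>
#include <stdint.h>

#define GT_FIFO_FREE_ENTRIES_MASK	0x7f
#define GT_FIFO_NUM_RESERVED_ENTRIES	20

/* Only the low 7 bits of the register value are the free-entry count;
 * the upper bits are unrelated status and must be masked off. */
static uint32_t fifo_free_entries(uint32_t gtfifoctl)
{
	return gtfifoctl & GT_FIFO_FREE_ENTRIES_MASK;
}

/* Writes may proceed only while more than the reserved number of
 * entries remain free. */
static int fifo_has_space(uint32_t gtfifoctl)
{
	return fifo_free_entries(gtfifoctl) > GT_FIFO_NUM_RESERVED_ENTRIES;
}
```

Without the mask, a set status bit in the upper part of the register inflates the apparent free count, the wait loop exits early, and the FIFO overruns with the "MMIO read or write has been dropped" message the commit describes.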
[Intel-gfx] [RFC 25/44] drm/i915: Added hook to catch 'unexpected' ring submissions
From: John Harrison john.c.harri...@intel.com

The scheduler needs to know what each seqno that pops out of the ring is referring to. This change adds a hook into the 'submit some random work that got forgotten about' clean-up code to inform the scheduler that a new seqno has been sent to the ring for some non-batch-buffer operation.
---
 drivers/gpu/drm/i915/i915_gem.c       | 20 +++-
 drivers/gpu/drm/i915/i915_scheduler.c |  7 +++
 drivers/gpu/drm/i915/i915_scheduler.h |  1 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 57b24f0..7727f0f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2347,6 +2347,25 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	if (WARN_ON(request == NULL))
 		return -ENOMEM;
 
+	request->seqno = intel_ring_get_seqno(ring);
+
+#ifdef CONFIG_DRM_I915_SCHEDULER
+	/* The scheduler needs to know about all seqno values that can pop out
+	 * of the ring. Otherwise, things can get confused when batch buffers
+	 * are re-ordered. Specifically, the scheduler has to work out which
+	 * buffers have completed by matching the last completed seqno with its
+	 * internal list of all seqnos ordered by when they were sent to the
+	 * ring. If an unknown seqno appears, the scheduler is unable to process
+	 * any batch buffers that might have completed just before the unknown
+	 * one.
+	 * NB: The scheduler must be told before the request is actually sent
+	 * to the ring as it needs to know about it before the interrupt occurs.
+	 */
+	ret = i915_scheduler_fly_seqno(ring, request->seqno);
+	if (ret)
+		return ret;
+#endif
+
 	/* Record the position of the start of the request so that
 	 * should we detect the updated seqno part-way through the
 	 * GPU processing the request, we never over-estimate the
@@ -2358,7 +2377,6 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	if (ret)
 		return ret;
 
-	request->seqno = intel_ring_get_seqno(ring);
 	request->ring = ring;
 	request->head = request_start;
 	request->tail = request_ring_position;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 1e4d7c313..b5d391c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -92,6 +92,13 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 	return ret;
 }
 
+int i915_scheduler_fly_seqno(struct intel_engine_cs *ring, uint32_t seqno)
+{
+	/* Do stuff... */
+
+	return 0;
+}
+
 int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index dd7d699..57e001a 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -72,6 +72,7 @@ struct i915_scheduler {
 	uint32_t	index;
 };
 
+int i915_scheduler_fly_seqno(struct intel_engine_cs *ring, uint32_t seqno);
 int i915_scheduler_remove(struct intel_engine_cs *ring);
 bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring,
 				       uint32_t seqno, bool *completed);
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 27/44] drm/i915: Added scheduler support to page fault handler
From: John Harrison john.c.harri...@intel.com

GPU page faults can now require scheduler operation in order to complete. For example, in order to free up sufficient memory to handle the fault, the handler must wait for a batch buffer to complete that has not even been sent to the hardware yet. Thus EAGAIN no longer means a GPU hang; it can occur under normal operation.
---
 drivers/gpu/drm/i915/i915_gem.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5ed5f66..aa1e0b2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1622,10 +1622,16 @@ out:
 	}
 	case -EAGAIN:
 		/*
-		 * EAGAIN means the gpu is hung and we'll wait for the error
-		 * handler to reset everything when re-faulting in
+		 * EAGAIN can mean the gpu is hung and we'll have to wait for
+		 * the error handler to reset everything when re-faulting in
 		 * i915_mutex_lock_interruptible.
+		 *
+		 * It can also indicate various other nonfatal errors for which
+		 * the best response is to give other threads a chance to run,
+		 * and then retry the failing operation in its entirety.
 		 */
+		set_need_resched();
+		/*FALLTHRU*/
 	case 0:
 	case -ERESTARTSYS:
 	case -EINTR:
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 33/44] drm/i915: Added trace points to scheduler
From: John Harrison john.c.harri...@intel.com

Added trace points to the scheduler to track all the various events, node state transitions and other interesting things that occur.
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +
 drivers/gpu/drm/i915/i915_scheduler.c      |  31 -
 drivers/gpu/drm/i915/i915_trace.h          | 194 
 3 files changed, 226 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 98cc95e..bf19e02 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1413,6 +1413,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	ring->outstanding_lazy_seqno    = 0;
 	ring->preallocated_lazy_request = NULL;
 
+	trace_i915_gem_ring_queue(ring, &qe);
+
 	ret = i915_scheduler_queue_execbuffer(&qe);
 	if (ret)
 		goto err;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 71d8db4..6d0f4cb 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -87,6 +87,8 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 
 	qe->params.scheduler_index = scheduler->index++;
 
+	trace_i915_scheduler_queue(qe->params.ring, qe);
+
 	scheduler->flags[qe->params.ring->id] |= i915_sf_submitting;
 	ret = i915_gem_do_execbuffer_final(&qe->params);
 	scheduler->flags[qe->params.ring->id] &= ~i915_sf_submitting;
@@ -215,6 +217,9 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 	not_flying = i915_scheduler_count_flying(scheduler, ring) <
 					 scheduler->min_flying;
 
+	trace_i915_scheduler_queue(ring, node);
+	trace_i915_scheduler_node_state_change(ring, node);
+
 	spin_unlock_irqrestore(&scheduler->lock, flags);
 
 	if (not_flying)
@@ -253,6 +258,8 @@ int i915_scheduler_fly_seqno(struct intel_engine_cs *ring, uint32_t seqno)
 	node->stamp  = stamp;
 	node->status = i915_sqs_none;
 
+	trace_i915_scheduler_node_state_change(ring, node);
+
 	spin_lock_irqsave(&scheduler->lock, flags);
 	ret = i915_scheduler_fly_node(node);
 	spin_unlock_irqrestore(&scheduler->lock, flags);
@@ -279,6 +286,9 @@ int i915_scheduler_fly_node(struct i915_scheduler_queue_entry *node)
 
 	node->status = i915_sqs_flying;
 
+	trace_i915_scheduler_fly(ring, node);
+	trace_i915_scheduler_node_state_change(ring, node);
+
 	if (!(scheduler->flags[ring->id] & i915_sf_interrupts_enabled)) {
 		bool    success = true;
@@ -343,6 +353,8 @@ static void i915_scheduler_node_requeue(struct i915_scheduler_queue_entry *node)
 	BUG_ON(!I915_SQS_IS_FLYING(node));
 
 	node->status = i915_sqs_queued;
+	trace_i915_scheduler_unfly(node->params.ring, node);
+	trace_i915_scheduler_node_state_change(node->params.ring, node);
 }
 
 /* Give up on a popped node completely. For example, because it is causing the
@@ -353,6 +365,8 @@ static void i915_scheduler_node_kill(struct i915_scheduler_queue_entry *node)
 	BUG_ON(!I915_SQS_IS_FLYING(node));
 
 	node->status = i915_sqs_complete;
+	trace_i915_scheduler_unfly(node->params.ring, node);
+	trace_i915_scheduler_node_state_change(node->params.ring, node);
 }
 
 /*
@@ -377,13 +391,17 @@ static int i915_scheduler_seqno_complete(struct intel_engine_cs *ring, uint32_t
 	 * if a completed entry is found then there is no need to scan further.
 	 */
 	list_for_each_entry(node, &scheduler->node_queue[ring->id], link) {
-		if (I915_SQS_IS_COMPLETE(node))
+		if (I915_SQS_IS_COMPLETE(node)) {
+			trace_i915_scheduler_landing(ring, seqno, node);
 			goto done;
+		}
 
 		if (seqno == node->params.seqno)
 			break;
 	}
 
+	trace_i915_scheduler_landing(ring, seqno, node);
+
 	/*
 	 * NB: Lots of extra seqnos get added to the ring to track things
 	 * like cache flushes and page flips. So don't complain about if
@@ -405,6 +423,7 @@ static int i915_scheduler_seqno_complete(struct intel_engine_cs *ring, uint32_t
 
 		/* Node was in flight so mark it as complete. */
 		node->status = i915_sqs_complete;
+		trace_i915_scheduler_node_state_change(ring, node);
 	}
 
 	/* Should submit new work here if flight list is empty but the DRM
@@ -425,6 +444,8 @@ int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring)
 
 	seqno = ring->get_seqno(ring, false);
 
+	trace_i915_scheduler_irq(ring, seqno);
+
 	if (i915.scheduler_override & i915_so_direct_submit)
 		return 0;
@@ -526,6 +547,8 @@ int
[Intel-gfx] [RFC 18/44] drm/i915: Added scheduler debug macro
From: John Harrison john.c.harri...@intel.com

Added a DRM debug facility for use by the scheduler.
---
 include/drm/drmP.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 76ccaab..2f477c9 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -120,6 +120,7 @@ struct videomode;
 #define DRM_UT_DRIVER		0x02
 #define DRM_UT_KMS		0x04
 #define DRM_UT_PRIME		0x08
+#define DRM_UT_SCHED		0x40
 
 extern __printf(2, 3)
 void drm_ut_debug_printk(const char *function_name,
@@ -221,10 +222,16 @@ int drm_err(const char *func, const char *format, ...);
 		if (unlikely(drm_debug & DRM_UT_PRIME))			\
 			drm_ut_debug_printk(__func__, fmt, ##args);	\
 	} while (0)
+#define DRM_DEBUG_SCHED(fmt, args...)					\
+	do {								\
+		if (unlikely(drm_debug & DRM_UT_SCHED))			\
+			drm_ut_debug_printk(__func__, fmt, ##args);	\
+	} while (0)
 #else
 #define DRM_DEBUG_DRIVER(fmt, args...) do { } while (0)
 #define DRM_DEBUG_KMS(fmt, args...)	do { } while (0)
 #define DRM_DEBUG_PRIME(fmt, args...)	do { } while (0)
+#define DRM_DEBUG_SCHED(fmt, args...)	do { } while (0)
 #define DRM_DEBUG(fmt, arg...)		do { } while (0)
 #endif
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 04/44] drm/i915: Fix null pointer dereference in error capture
From: John Harrison john.c.harri...@intel.com

The i915_gem_record_rings() code was unconditionally querying and saving state for the batch_obj of a request structure. This is not necessarily set. Thus a null pointer dereference can occur.
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 87ec60e..0738f21 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -902,12 +902,13 @@ static void i915_gem_record_rings(struct drm_device *dev,
 		 * as the simplest method to avoid being overwritten
 		 * by userspace.
 		 */
-		error->ring[i].batchbuffer =
-			i915_error_object_create(dev_priv,
-						 request->batch_obj,
-						 request->ctx ?
-						 request->ctx->vm :
-						 &dev_priv->gtt.base);
+		if (request->batch_obj)
+			error->ring[i].batchbuffer =
+				i915_error_object_create(dev_priv,
+							 request->batch_obj,
+							 request->ctx ?
+							 request->ctx->vm :
+							 &dev_priv->gtt.base);
 
 		if (HAS_BROKEN_CS_TLB(dev_priv->dev) && ring->scratch.obj)
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 11/44] drm/i915: Added scheduler hook into i915_seqno_passed()
From: John Harrison john.c.harri...@intel.com

The GPU scheduler can cause seqno values to become out of order. This means that a straightforward 'is seqno X > seqno Y' test is no longer valid. Instead, a call into the scheduler must be made to see if the value being queried is known to be out of order.
---
 drivers/gpu/drm/i915/i915_drv.h       | 23 ++-
 drivers/gpu/drm/i915/i915_gem.c       | 14 +++---
 drivers/gpu/drm/i915/i915_irq.c       |  4 ++--
 drivers/gpu/drm/i915/i915_scheduler.c | 20 
 drivers/gpu/drm/i915/i915_scheduler.h |  3 +++
 5 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6e592d3..0977653 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2214,14 +2214,35 @@ int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_mode_create_dumb *args);
 int i915_gem_mmap_gtt(struct drm_file *file_priv, struct drm_device *dev,
 		      uint32_t handle, uint64_t *offset);
+
+bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring,
+				       uint32_t seqno, bool *completed);
+
 /**
  * Returns true if seq1 is later than seq2.
  */
 static inline bool
-i915_seqno_passed(uint32_t seq1, uint32_t seq2)
+i915_seqno_passed(struct intel_engine_cs *ring, uint32_t seq1, uint32_t seq2)
 {
+#ifdef CONFIG_DRM_I915_SCHEDULER
+	bool	completed;
+
+	if (i915_scheduler_is_seqno_in_flight(ring, seq2, &completed))
+		return completed;
+#endif
+
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
+static inline int32_t
+i915_compare_seqno_values(uint32_t seq1, uint32_t seq2)
+{
+	int32_t diff = seq1 - seq2;
+
+	if (!diff)
+		return 0;
+
+	return (diff > 0) ? 1 : -1;
+}
+
 int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
 int __must_check i915_gem_set_seqno(struct drm_device *dev, u32 seqno);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7e53446..fece5e7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1165,7 +1165,7 @@ static int __wait_seqno(struct intel_engine_cs *ring, u32 seqno,
 	WARN(dev_priv->pm.irqs_disabled, "IRQs disabled\n");
 
-	if (i915_seqno_passed(ring->get_seqno(ring, true), seqno))
+	if (i915_seqno_passed(ring, ring->get_seqno(ring, true), seqno))
 		return 0;
 
 	timeout_expire = timeout ? jiffies + timespec_to_jiffies_timeout(timeout) : 0;
@@ -1201,7 +1201,7 @@ static int __wait_seqno(struct intel_engine_cs *ring, u32 seqno,
 			break;
 		}
 
-		if (i915_seqno_passed(ring->get_seqno(ring, false), seqno)) {
+		if (i915_seqno_passed(ring, ring->get_seqno(ring, false), seqno)) {
 			ret = 0;
 			break;
 		}
@@ -2243,7 +2243,7 @@ i915_gem_object_retire(struct drm_i915_gem_object *obj)
 	if (ring == NULL)
 		return;
 
-	if (i915_seqno_passed(ring->get_seqno(ring, true),
+	if (i915_seqno_passed(ring, ring->get_seqno(ring, true),
 			      obj->last_read_seqno))
 		i915_gem_object_move_to_inactive(obj);
 }
@@ -2489,7 +2489,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring)
 	completed_seqno = ring->get_seqno(ring, false);
 
 	list_for_each_entry(request, &ring->request_list, list) {
-		if (i915_seqno_passed(completed_seqno, request->seqno))
+		if (i915_seqno_passed(ring, completed_seqno, request->seqno))
 			continue;
 
 		return request;
@@ -2620,7 +2620,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 	 */
 	list_for_each_entry_safe(req, req_next, &ring->request_list, list) {
-		if (!i915_seqno_passed(seqno, req->seqno))
+		if (!i915_seqno_passed(ring, seqno, req->seqno))
 			continue;
 
 		trace_i915_gem_request_retire(ring, req->seqno);
@@ -2639,14 +2639,14 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 	 * before we free the context associated with the requests.
 	 */
 	list_for_each_entry_safe(obj, obj_next, &ring->active_list, ring_list) {
-		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
+		if (!i915_seqno_passed(ring, seqno, obj->last_read_seqno))
 			continue;
 
 		i915_gem_object_move_to_inactive(obj);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
-		     i915_seqno_passed(seqno, ring->trace_irq_seqno))) {
+		     i915_seqno_passed(ring, seqno, ring->trace_irq_seqno))) {
 		ring->irq_put(ring);
 		ring->trace_irq_seqno = 0;
 	}
diff --git
[Intel-gfx] [RFC 28/44] drm/i915: Added scheduler flush calls to ring throttle and idle functions
From: John Harrison john.c.harri...@intel.com

When requesting that all GPU work is completed, it is now necessary to get the scheduler involved in order to flush out work that is queued but not yet submitted.
---
 drivers/gpu/drm/i915/i915_gem.c       | 16 +++-
 drivers/gpu/drm/i915/i915_scheduler.c |  7 +++
 drivers/gpu/drm/i915/i915_scheduler.h |  5 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index aa1e0b2..1c508b7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3049,6 +3049,10 @@ int i915_gpu_idle(struct drm_device *dev)
 
 	/* Flush everything onto the inactive list. */
 	for_each_ring(ring, dev_priv, i) {
+		ret = I915_SCHEDULER_FLUSH_ALL(ring, true);
+		if (ret < 0)
+			return ret;
+
 		ret = i915_switch_context(ring, ring->default_context);
 		if (ret)
 			return ret;
@@ -4088,7 +4092,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct intel_engine_cs *ring = NULL;
 	unsigned reset_counter;
 	u32 seqno = 0;
-	int ret;
+	int i, ret;
 
 	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
 	if (ret)
@@ -4098,6 +4102,16 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret)
 		return ret;
 
+	for_each_ring(ring, dev_priv, i) {
+		/* Need a mechanism to flush out scheduler entries that were
+		 * submitted more than 'recent_enough' time ago as well! In the
+		 * meantime, just flush everything out to ensure that entries
+		 * can not sit around indefinitely. */
+		ret = I915_SCHEDULER_FLUSH_ALL(ring, false);
+		if (ret < 0)
+			return ret;
+	}
+
 	spin_lock(&file_priv->mm.lock);
 	list_for_each_entry(request, &file_priv->mm.request_list, client_list) {
 		if (time_after_eq(request->emitted_jiffies, recent_enough))
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index d579bab..6b6827f 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -125,6 +125,13 @@ int i915_scheduler_flush_seqno(struct intel_engine_cs *ring, bool is_locked,
 	return 0;
 }
 
+int i915_scheduler_flush(struct intel_engine_cs *ring, bool is_locked)
+{
+	/* Do stuff... */
+
+	return 0;
+}
+
 bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring,
 				       uint32_t seqno, bool *completed)
 {
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 3811359..898d2bb 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -58,9 +58,13 @@ struct i915_scheduler_queue_entry {
 };
 
 #ifdef CONFIG_DRM_I915_SCHEDULER
+# define I915_SCHEDULER_FLUSH_ALL(ring, locked)				\
+	i915_scheduler_flush(ring, locked)
+
 # define I915_SCHEDULER_FLUSH_SEQNO(ring, locked, seqno)		\
 	i915_scheduler_flush_seqno(ring, locked, seqno)
 #else
+# define I915_SCHEDULER_FLUSH_ALL(ring, locked)           0
 # define I915_SCHEDULER_FLUSH_SEQNO(ring, locked, seqno)  0
 #endif
 
@@ -81,6 +85,7 @@ struct i915_scheduler {
 
 int i915_scheduler_fly_seqno(struct intel_engine_cs *ring, uint32_t seqno);
 int i915_scheduler_remove(struct intel_engine_cs *ring);
+int i915_scheduler_flush(struct intel_engine_cs *ring, bool is_locked);
 int i915_scheduler_flush_seqno(struct intel_engine_cs *ring, bool is_locked,
 			       uint32_t seqno);
 bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring,
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 22/44] drm/i915: Ensure OLS &amp; PLR are always in sync
From: John Harrison john.c.harri...@intel.com

The new seqno allocation code pre-allocates a 'lazy' request structure and then tries to allocate the 'lazy' seqno. The seqno allocation can potentially wrap around zero and, when doing so, tries to idle the ring by waiting for all outstanding work to complete. With a scheduler in place, this can mean first submitting extra work to the ring. However, at this point in time, the lazy request is valid but the lazy seqno is not. Some existing code was getting confused by this state and Bad Things would happen.

The safest solution is to still allocate the lazy request in advance (to avoid having to roll back in an out-of-memory situation) but to save the pointer in a local variable rather than immediately updating the lazy pointer. Only after a valid seqno has been acquired is the lazy request pointer actually updated. This guarantees that the lazy values are either both invalid or both valid. There can no longer be an inconsistent state.
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 737c41b..1ef0cbd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1665,20 +1665,31 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 
 int intel_ring_alloc_seqno(struct intel_engine_cs *ring)
 {
-	if (ring->outstanding_lazy_seqno)
+	int ret;
+	struct drm_i915_gem_request *request;
+
+	/* NB: Some code seems to test the OLS and other code tests the PLR.
+	 * Therefore it is only safe if the two are kept in step. */
+
+	if (ring->outstanding_lazy_seqno) {
+		BUG_ON(ring->preallocated_lazy_request == NULL);
 		return 0;
+	}
 
-	if (ring->preallocated_lazy_request == NULL) {
-		struct drm_i915_gem_request *request;
+	BUG_ON(ring->preallocated_lazy_request != NULL);
 
-		request = kmalloc(sizeof(*request), GFP_KERNEL);
-		if (request == NULL)
-			return -ENOMEM;
+	request = kmalloc(sizeof(*request), GFP_KERNEL);
+	if (request == NULL)
+		return -ENOMEM;
 
-		ring->preallocated_lazy_request = request;
+	ret = i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
+	if (ret) {
+		kfree(request);
+		return ret;
 	}
 
-	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
+	ring->preallocated_lazy_request = request;
+
+	return 0;
 }
 
 static int __intel_ring_prepare(struct intel_engine_cs *ring,
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 34/44] drm/i915: Added scheduler queue throttling by DRM file handle
From: John Harrison john.c.harri...@intel.com

The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now tracks how many buffers are in progress (queued in software or executing in hardware) and limits this to a given (tunable) number. If this number is exceeded then the queue to the driver will return -EAGAIN and thus prevent the scheduler's queue becoming arbitrarily large.
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 12 +++
 drivers/gpu/drm/i915/i915_scheduler.c      | 32 
 drivers/gpu/drm/i915/i915_scheduler.h      |  5 +
 4 files changed, 51 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4d52c67..872e869 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1785,6 +1785,8 @@ struct drm_i915_file_private {
 	atomic_t rps_wait_boost;
 	struct  intel_engine_cs *bsd_ring;
+
+	u32 scheduler_queue_length;
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index bf19e02..3227a39 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1614,6 +1614,12 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+#ifdef CONFIG_DRM_I915_SCHEDULER
+	/* Throttle batch requests per device file */
+	if (i915_scheduler_file_queue_is_full(file))
+		return -EAGAIN;
+#endif
+
 	/* Copy in the exec list from userland */
 	exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
 	exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
@@ -1702,6 +1708,12 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+#ifdef CONFIG_DRM_I915_SCHEDULER
+	/* Throttle batch requests per device file */
+	if (i915_scheduler_file_queue_is_full(file))
+		return -EAGAIN;
+#endif
+
 	exec2_list = kmalloc(sizeof(*exec2_list)*args->buffer_count,
 			     GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
 	if (exec2_list == NULL)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 6d0f4cb..6782249 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -61,6 +61,7 @@ int i915_scheduler_init(struct drm_device *dev)
 	scheduler->priority_level_max     = ~0U;
 	scheduler->priority_level_preempt = 900;
 	scheduler->min_flying             = 2;
+	scheduler->file_queue_max         = 64;
 
 	dev_priv->scheduler = scheduler;
 
@@ -211,6 +212,8 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 
 	list_add_tail(&node->link, &scheduler->node_queue[ring->id]);
 
+	i915_scheduler_file_queue_inc(node->params.file);
+
 	if (i915.scheduler_override & i915_so_submit_on_queue)
 		not_flying = true;
 	else
@@ -530,6 +533,12 @@ int i915_scheduler_remove(struct intel_engine_cs *ring)
 		/* Strip the dependency info while the mutex is still locked */
 		i915_scheduler_remove_dependent(scheduler, node);
 
+		/* Likewise clean up the file descriptor before it might disappear. */
+		if (node->params.file) {
+			i915_scheduler_file_queue_dec(node->params.file);
+			node->params.file = NULL;
+		}
+
 		continue;
 	}
 
@@ -1079,6 +1088,29 @@ bool i915_scheduler_is_idle(struct intel_engine_cs *ring)
 	return true;
 }
 
+bool i915_scheduler_file_queue_is_full(struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private      *dev_priv  = file_priv->dev_priv;
+	struct i915_scheduler        *scheduler = dev_priv->scheduler;
+
+	return (file_priv->scheduler_queue_length >= scheduler->file_queue_max);
+}
+
+void i915_scheduler_file_queue_inc(struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+
+	file_priv->scheduler_queue_length++;
+}
+
+void i915_scheduler_file_queue_dec(struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+
+	file_priv->scheduler_queue_length--;
+}
+
 #else   /* CONFIG_DRM_I915_SCHEDULER */
 
 int i915_scheduler_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index e824e700..78a92c9 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -112,6 +112,7
[Intel-gfx] [RFC 13/44] drm/i915: Added scheduler hook when closing DRM file handles
From: John Harrison john.c.harri...@intel.com

The scheduler decouples the submission of batch buffers to the driver from the submission of batch buffers to the hardware. Thus it is possible for an application to submit work, then close the DRM handle and free up all the resources that piece of work wishes to use before the work has even been submitted to the hardware. To prevent this, the scheduler needs to be informed of the DRM close event so that it can force through any outstanding work attributed to that file handle.
---
 drivers/gpu/drm/i915/i915_dma.c       |  3 +++
 drivers/gpu/drm/i915/i915_scheduler.c | 18 ++
 drivers/gpu/drm/i915/i915_scheduler.h |  2 ++
 3 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 494b156..6c9ce82 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -42,6 +42,7 @@
 #include <linux/vga_switcheroo.h>
 #include <linux/slab.h>
 #include <acpi/video.h>
+#include "i915_scheduler.h"
 #include <linux/pm.h>
 #include <linux/pm_runtime.h>
 #include <linux/oom.h>
@@ -1930,6 +1931,8 @@ void i915_driver_lastclose(struct drm_device * dev)
 
 void i915_driver_preclose(struct drm_device *dev, struct drm_file *file)
 {
+	i915_scheduler_closefile(dev, file);
+
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_context_close(dev, file);
 	i915_gem_release(dev, file);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index d9c1879..66a6568 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -78,6 +78,19 @@ bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring,
 	return found;
 }
 
+int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv  = dev->dev_private;
+	struct i915_scheduler   *scheduler = dev_priv->scheduler;
+
+	if (!scheduler)
+		return 0;
+
+	/* Do stuff... */
+
+	return 0;
+}
+
 #else   /* CONFIG_DRM_I915_SCHEDULER */
 
 int i915_scheduler_init(struct drm_device *dev)
@@ -85,4 +98,9 @@ int i915_scheduler_init(struct drm_device *dev)
 	return 0;
 }
 
+int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
+{
+	return 0;
+}
+
 #endif  /* CONFIG_DRM_I915_SCHEDULER */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 4044b6e..95641f6 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -27,6 +27,8 @@
 bool i915_scheduler_is_enabled(struct drm_device *dev);
 int  i915_scheduler_init(struct drm_device *dev);
+int  i915_scheduler_closefile(struct drm_device *dev,
+			      struct drm_file *file);
 
 #ifdef CONFIG_DRM_I915_SCHEDULER
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 20/44] drm/i915: Redirect execbuffer_final() via scheduler
From: John Harrison john.c.harri...@intel.com

Updated the execbuffer() code to pass the packaged-up batch buffer information to the scheduler rather than calling execbuffer_final() directly. The scheduler queue() code is currently a stub which simply chains on to _final() immediately.
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 +-
 drivers/gpu/drm/i915/i915_scheduler.c      | 23 +++
 drivers/gpu/drm/i915/i915_scheduler.h      |  2 ++
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 334e8c6..f73c936 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1348,14 +1348,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
 
-	ret = i915_gem_do_execbuffer_final(&qe.params);
+	ret = i915_scheduler_queue_execbuffer(&qe);
 	if (ret)
 		goto err;
 
-	/* Free everything that was stored in the QE structure (until the
-	 * scheduler arrives and does it instead): */
-	kfree(qe.params.cliprects);
-
 	/* The eb list is no longer required. The scheduler has extracted all
 	 * the information that needs to persist. */
 	eb_destroy(eb);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 37f8a98..d95c789 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -58,6 +58,24 @@ int i915_scheduler_init(struct drm_device *dev)
 	return 0;
 }
 
+int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
+{
+	struct drm_i915_private *dev_priv  = qe->params.dev->dev_private;
+	struct i915_scheduler   *scheduler = dev_priv->scheduler;
+	int ret;
+
+	BUG_ON(!scheduler);
+
+	qe->params.scheduler_index = scheduler->index++;
+
+	ret = i915_gem_do_execbuffer_final(&qe->params);
+
+	/* Free everything that is owned by the QE structure: */
+	kfree(qe->params.cliprects);
+
+	return ret;
+}
+
 int i915_scheduler_remove(struct intel_engine_cs *ring)
 {
 	/* Do stuff... */
@@ -110,4 +128,9 @@ int i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
 	return 0;
 }
 
+int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
+{
+	return i915_gem_do_execbuffer_final(&qe->params);
+}
+
 #endif  /* CONFIG_DRM_I915_SCHEDULER */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 68a9543..4c3e081 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -42,6 +42,7 @@ struct i915_execbuffer_params {
 	uint32_t	mask;
 	int		mode;
 	struct intel_context	*ctx;
+	uint32_t	scheduler_index;
 };
 
 struct i915_scheduler_queue_entry {
@@ -52,6 +53,7 @@
 bool i915_scheduler_is_enabled(struct drm_device *dev);
 int  i915_scheduler_init(struct drm_device *dev);
 int  i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file);
+int  i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
 
 #ifdef CONFIG_DRM_I915_SCHEDULER
--
1.7.9.5
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 37/44] drm/i915: Added facility for cancelling an outstanding request
From: John Harrison john.c.harri...@intel.com If the scheduler pre-empts a batch buffer that is queued in the ring or even executing in the ring then that buffer must be returned to the queued in software state. Part of this re-queueing is to clean up the request structure. --- drivers/gpu/drm/i915/i915_drv.h |1 + drivers/gpu/drm/i915/i915_gem.c | 16 2 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 872e869..f8980c0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2319,6 +2319,7 @@ int __i915_add_request(struct intel_engine_cs *ring, __i915_add_request(ring, NULL, NULL, seqno, true) #define i915_add_request_wo_flush(ring) \ __i915_add_request(ring, NULL, NULL, NULL, false) +int i915_gem_cancel_request(struct intel_engine_cs *ring, u32 seqno); int __must_check i915_wait_seqno(struct intel_engine_cs *ring, uint32_t seqno); int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1c508b7..dd0fac8 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2655,6 +2655,22 @@ void i915_gem_reset(struct drm_device *dev) i915_gem_restore_fences(dev); } +int +i915_gem_cancel_request(struct intel_engine_cs *ring, u32 seqno) +{ + struct drm_i915_gem_request *req, *next; + int found = 0; + + list_for_each_entry_safe(req, next, ring-request_list, list) { + if (req-seqno == seqno) { + found += 1; + i915_gem_free_request(req); + } + } + + return found; +} + /** * This function clears the request list as sequence numbers are passed. */ -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
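i915_gem_cancel_request() relies on list_for_each_entry_safe() so that a node can be freed while the list is still being walked. The same idiom, sketched without the kernel list helpers (hypothetical types, plain C):

```c
#include <assert.h>
#include <stdlib.h>

struct request { unsigned int seqno; struct request *next; };

/* Remove and free every request matching `seqno`, returning the number
 * removed. The pointer-to-pointer walk plays the role of
 * list_for_each_entry_safe(): the next link is read before the node is
 * freed, so deletion mid-walk is safe. */
int cancel_request(struct request **head, unsigned int seqno)
{
    int found = 0;
    struct request **pp = head;

    while (*pp) {
        struct request *req = *pp;
        if (req->seqno == seqno) {
            *pp = req->next;   /* unlink first... */
            free(req);         /* ...then free */
            found++;
        } else {
            pp = &req->next;
        }
    }
    return found;
}
```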
[Intel-gfx] [RFC 23/44] drm/i915: Added manipulation of OLS/PLR
From: John Harrison john.c.harri...@intel.com The scheduler requires each batch buffer to be tagged with the seqno it has been assigned and for that seqno to only be attached to the given batch buffer. Note that the seqno assigned to a batch buffer that is being submitted to the hardware might be very different to the next seqno that would be assigned automatically on ring submission. This means manipulating the lazy seqno and request values around batch buffer submission. At the start of execbuffer() the lazy seqno should be zero, if not it means that something has been written to the ring without a request being added. The lazy seqno also needs to be reset back to zero at the end ready for the next request to start. Then, execbuffer_final() needs to manually set the lazy seqno to the batch buffer's pre-assigned value rather than grabbing the next available value. There is no need to explictly clear the lazy seqno at the end of _final() as the add_request() call within _retire_commands() will do that automatically. --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 68 +++- drivers/gpu/drm/i915/i915_scheduler.h |2 + 2 files changed, 69 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 6bb1fd6..98cc95e 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1328,10 +1328,22 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, vma-bind_vma(vma, batch_obj-cache_level, GLOBAL_BIND); } + /* OLS should be zero at this point. If not then this buffer is going +* to be tagged as someone else's work! */ + BUG_ON(ring-outstanding_lazy_seqno!= 0); + BUG_ON(ring-preallocated_lazy_request != NULL); + /* Allocate a seqno for this batch buffer nice and early. 
*/ ret = intel_ring_alloc_seqno(ring); if (ret) goto err; + qe.params.seqno = ring-outstanding_lazy_seqno; + qe.params.request = ring-preallocated_lazy_request; + + BUG_ON(ring-outstanding_lazy_seqno== 0); + BUG_ON(ring-outstanding_lazy_seqno!= qe.params.seqno); + BUG_ON(ring-preallocated_lazy_request != qe.params.request); + BUG_ON(ring-preallocated_lazy_request == NULL); /* Save assorted stuff away to pass through to execbuffer_final() */ qe.params.dev = dev; @@ -1373,6 +1385,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, qe.params.ctx = ctx; #endif // CONFIG_DRM_I915_SCHEDULER + /* OLS should have been set to something useful above */ + BUG_ON(ring-outstanding_lazy_seqno!= qe.params.seqno); + BUG_ON(ring-preallocated_lazy_request != qe.params.request); + if (flags I915_DISPATCH_SECURE) qe.params.batch_obj_vm_offset = i915_gem_obj_ggtt_offset(batch_obj); else @@ -1384,6 +1400,19 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, i915_gem_execbuffer_move_to_active(eb-vmas, ring); + /* Make sure the OLS hasn't advanced (which would indicate a flush +* of the work in progess which in turn would be a Bad Thing). */ + BUG_ON(ring-outstanding_lazy_seqno!= qe.params.seqno); + BUG_ON(ring-preallocated_lazy_request != qe.params.request); + + /* +* A new seqno has been assigned to the buffer and saved away for +* future reference. So clear the OLS to ensure that any further +* work is assigned a brand new seqno: +*/ + ring-outstanding_lazy_seqno= 0; + ring-preallocated_lazy_request = NULL; + ret = i915_scheduler_queue_execbuffer(qe); if (ret) goto err; @@ -1425,6 +1454,12 @@ err: } #endif // CONFIG_DRM_I915_SCHEDULER + /* Clear the OLS again in case the failure occurred after it had been +* assigned. 
*/ + kfree(ring-preallocated_lazy_request); + ring-preallocated_lazy_request = NULL; + ring-outstanding_lazy_seqno= 0; + mutex_unlock(dev-struct_mutex); pre_mutex_err: @@ -1443,6 +1478,7 @@ int i915_gem_do_execbuffer_final(struct i915_execbuffer_params *params) struct intel_engine_cs *ring = params-ring; u64 exec_start, exec_len; int ret, i; + u32 seqno; /* The mutex must be acquired before calling this function */ BUG_ON(!mutex_is_locked(params-dev-struct_mutex)); @@ -1454,6 +1490,14 @@ int i915_gem_do_execbuffer_final(struct i915_execbuffer_params *params) intel_runtime_pm_get(dev_priv); + /* Ensure the correct seqno gets assigned to the correct buffer: */ + BUG_ON(ring-outstanding_lazy_seqno!= 0); + BUG_ON(ring-preallocated_lazy_request != NULL); + ring-outstanding_lazy_seqno= params-seqno; + ring-preallocated_lazy_request =
[Intel-gfx] [RFC 12/44] drm/i915: Disable hardware semaphores when GPU scheduler is enabled
From: John Harrison john.c.harri...@intel.com Hardware sempahores require seqno values to be continuously incrementing. However, the scheduler's reordering of batch buffers means that the seqno values going through the hardware could be out of order. Thus semaphores can not be used. On the other hand, the scheduler superceeds the need for hardware semaphores anyway. Having one ring stall waiting for something to complete on another ring is inefficient if that ring could be working on some other, independent task. This is what the scheduler is meant to do - keep the hardware as busy as possible by reordering batch buffers to avoid dependency stalls. --- drivers/gpu/drm/i915/i915_drv.c |9 + drivers/gpu/drm/i915/i915_scheduler.c |9 + drivers/gpu/drm/i915/i915_scheduler.h |1 + drivers/gpu/drm/i915/intel_ringbuffer.c |4 4 files changed, 23 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index e2bfdda..748b13a 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -33,6 +33,7 @@ #include i915_drv.h #include i915_trace.h #include intel_drv.h +#include i915_scheduler.h #include linux/console.h #include linux/module.h @@ -468,6 +469,14 @@ void intel_detect_pch(struct drm_device *dev) bool i915_semaphore_is_enabled(struct drm_device *dev) { + /* Hardware semaphores are not compatible with the scheduler due to the +* seqno values being potentially out of order. However, semaphores are +* also not required as the scheduler will handle interring dependencies +* and try do so in a way that does not cause dead time on the hardware. 
+*/ + if (i915_scheduler_is_enabled(dev)) + return 0; + if (INTEL_INFO(dev)-gen 6) return false; diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index e9aa566..d9c1879 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -26,6 +26,15 @@ #include intel_drv.h #include i915_scheduler.h +bool i915_scheduler_is_enabled(struct drm_device *dev) +{ +#ifdef CONFIG_DRM_I915_SCHEDULER + return true; +#else + return false; +#endif +} + #ifdef CONFIG_DRM_I915_SCHEDULER int i915_scheduler_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index 67260b7..4044b6e 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -25,6 +25,7 @@ #ifndef _I915_SCHEDULER_H_ #define _I915_SCHEDULER_H_ +booli915_scheduler_is_enabled(struct drm_device *dev); int i915_scheduler_init(struct drm_device *dev); #ifdef CONFIG_DRM_I915_SCHEDULER diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index bad5db0..34d6d6e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -32,6 +32,7 @@ #include drm/i915_drm.h #include i915_trace.h #include intel_drv.h +#include i915_scheduler.h /* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill, * but keeps the logic simple. Indeed, the whole purpose of this macro is just @@ -765,6 +766,9 @@ gen6_ring_sync(struct intel_engine_cs *waiter, u32 wait_mbox = signaller-semaphore.mbox.wait[waiter-id]; int ret; + /* Arithmetic on sequence numbers is unreliable with a scheduler. */ + BUG_ON(i915_scheduler_is_enabled(signaller-dev)); + /* Throughout all of the GEM code, seqno passed implies our current * seqno is = the last seqno executed. However for hardware the * comparison is strictly greater than. 
-- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
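The BUG_ON added to gen6_ring_sync() exists because, as the surrounding comment notes, seqno arithmetic assumes monotonic submission order. GEM's wrap-safe ordering test looks essentially like this (a sketch of the i915_seqno_passed() idiom, not the exact driver code); once the scheduler reorders submission, "a passed b" on raw seqnos no longer implies that b's batch has completed:

```c
#include <assert.h>
#include <stdint.h>

/* Wrap-safe "seqno a is at or beyond seqno b" test, in the style of
 * i915_seqno_passed(). Correct only while seqnos reach the hardware in
 * increasing order - exactly the assumption a reordering scheduler breaks. */
static int seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}
```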
[Intel-gfx] [RFC 15/44] drm/i915: Added deferred work handler for scheduler
From: John Harrison john.c.harri...@intel.com The scheduler needs to do interrupt triggered work that is too complex to do in the interrupt handler. Thus it requires a deferred work handler to process this work asynchronously. --- drivers/gpu/drm/i915/i915_dma.c |3 +++ drivers/gpu/drm/i915/i915_drv.h | 10 ++ drivers/gpu/drm/i915/i915_gem.c | 27 +++ drivers/gpu/drm/i915/i915_scheduler.c |7 +++ drivers/gpu/drm/i915/i915_scheduler.h |1 + 5 files changed, 48 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 1668316..d1356f3 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1813,6 +1813,9 @@ int i915_driver_unload(struct drm_device *dev) WARN_ON(unregister_oom_notifier(dev_priv-mm.oom_notifier)); unregister_shrinker(dev_priv-mm.shrinker); + /* Cancel the scheduler work handler, which should be idle now. */ + cancel_work_sync(dev_priv-mm.scheduler_work); + io_mapping_free(dev_priv-gtt.mappable); arch_phys_wc_del(dev_priv-gtt.mtrr); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 0977653..fbafa68 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1075,6 +1075,16 @@ struct i915_gem_mm { struct delayed_work idle_work; /** +* New scheme is to get an interrupt after every work packet +* in order to allow the low latency scheduling of pending +* packets. The idea behind adding new packets to a pending +* queue rather than directly into the hardware ring buffer +* is to allow high priority packets to over take low priority +* ones. +*/ + struct work_struct scheduler_work; + + /** * Are we in a non-interruptible section of code like * modesetting? 
*/ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fece5e7..57b24f0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2712,6 +2712,29 @@ i915_gem_idle_work_handler(struct work_struct *work) intel_mark_idle(dev_priv-dev); } +#ifdef CONFIG_DRM_I915_SCHEDULER +static void +i915_gem_scheduler_work_handler(struct work_struct *work) +{ + struct intel_engine_cs *ring; + struct drm_i915_private *dev_priv; + struct drm_device *dev; + int i; + + dev_priv = container_of(work, struct drm_i915_private, mm.scheduler_work); + dev = dev_priv-dev; + + mutex_lock(dev-struct_mutex); + + /* Do stuff: */ + for_each_ring(ring, dev_priv, i) { + i915_scheduler_remove(ring); + } + + mutex_unlock(dev-struct_mutex); +} +#endif + /** * Ensures that an object will eventually get non-busy by flushing any required * write domains, emitting any outstanding lazy request and retiring and @@ -4916,6 +4939,10 @@ i915_gem_load(struct drm_device *dev) i915_gem_retire_work_handler); INIT_DELAYED_WORK(dev_priv-mm.idle_work, i915_gem_idle_work_handler); +#ifdef CONFIG_DRM_I915_SCHEDULER + INIT_WORK(dev_priv-mm.scheduler_work, + i915_gem_scheduler_work_handler); +#endif init_waitqueue_head(dev_priv-gpu_error.reset_queue); /* On GEN3 we really need to make sure the ARB C3 LP bit is set */ diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 66a6568..37f8a98 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -58,6 +58,13 @@ int i915_scheduler_init(struct drm_device *dev) return 0; } +int i915_scheduler_remove(struct intel_engine_cs *ring) +{ + /* Do stuff... 
*/ + + return 0; +} + bool i915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring, uint32_t seqno, bool *completed) { diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index 95641f6..6b2cc51 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -38,6 +38,7 @@ struct i915_scheduler { uint32_tindex; }; +int i915_scheduler_remove(struct intel_engine_cs *ring); booli915_scheduler_is_seqno_in_flight(struct intel_engine_cs *ring, uint32_t seqno, bool *completed); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
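The split above - an interrupt handler that does no more than poke a work item, with the heavy lifting deferred to process context - can be modelled in a few lines. This is a single-threaded simulation with hypothetical names; the real code defers via a struct work_struct and takes struct_mutex in the handler:

```c
#include <assert.h>

/* Single-threaded model of the top-half/bottom-half split: the "IRQ"
 * side only counts pending completions; the work handler, run later,
 * drains and processes them. */
struct sched_model { int pending; int processed; };

void fake_irq(struct sched_model *s)               /* cheap, IRQ-context safe */
{
    s->pending++;
}

int scheduler_work_handler(struct sched_model *s)  /* deferred, heavyweight */
{
    int done = 0;
    while (s->pending) {
        s->pending--;
        s->processed++;     /* stands in for i915_scheduler_remove() */
        done++;
    }
    return done;
}
```

As in the driver, several "interrupts" can pile up before the handler runs; the handler must cope with draining more than one completion per invocation.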
[Intel-gfx] [RFC 02/44] drm/i915: Added getparam for native sync
From: John Harrison john.c.harri...@intel.com Validation tests need a run time mechanism for querying whether or not the driver supports the Android native sync facility. --- drivers/gpu/drm/i915/i915_dma.c |7 +++ include/uapi/drm/i915_drm.h |1 + 2 files changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 6cce55b..67f2918 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1022,6 +1022,13 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_CMD_PARSER_VERSION: value = i915_cmd_parser_get_version(); break; + case I915_PARAM_HAS_NATIVE_SYNC: +#ifdef CONFIG_DRM_I915_SYNC + value = 1; +#else + value = 0; +#endif + break; default: DRM_DEBUG(Unknown parameter %d\n, param-param); return -EINVAL; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index ff57f07..bf54c78 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -340,6 +340,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_EXEC_HANDLE_LUT 26 #define I915_PARAM_HAS_WT 27 #define I915_PARAM_CMD_PARSER_VERSION 28 +#define I915_PARAM_HAS_NATIVE_SYNC 30 typedef struct drm_i915_getparam { int param; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
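The kernel-side dispatch for the new parameter reduces to a switch that either fills in the value or rejects unknown parameters with -EINVAL. A kernel-free sketch of that pattern (HAVE_NATIVE_SYNC is a hypothetical stand-in for CONFIG_DRM_I915_SYNC):

```c
#include <assert.h>

#define NEG_EINVAL (-22)                  /* -EINVAL */
#define I915_PARAM_HAS_NATIVE_SYNC 30

#define HAVE_NATIVE_SYNC 1                /* stand-in for CONFIG_DRM_I915_SYNC */

/* Mirrors the i915_getparam() pattern: fill *value and return 0 for a
 * known parameter, -EINVAL otherwise. */
int getparam(int param, int *value)
{
    switch (param) {
    case I915_PARAM_HAS_NATIVE_SYNC:
#if HAVE_NATIVE_SYNC
        *value = 1;
#else
        *value = 0;
#endif
        return 0;
    default:
        return NEG_EINVAL;
    }
}
```

Userspace would reach this through the usual DRM getparam ioctl path and treat a failure the same as "feature absent" on older kernels.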
[Intel-gfx] [RFC 08/44] drm/i915: Added GPU scheduler config option
From: John Harrison john.c.harri...@intel.com Added a Kconfig option for enabling/disabling the GPU scheduler. --- drivers/gpu/drm/i915/Kconfig |8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 437e182..22a036b 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -81,3 +81,11 @@ config DRM_I915_UMS enable this only if you have ancient versions of the DDX drivers. If in doubt, say N. + +config DRM_I915_SCHEDULER + bool "Enable GPU scheduler on Intel hardware" + depends on DRM_I915 + default y + help + Choose this option to enable GPU task scheduling for improved + performance and efficiency. -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [RFC 32/44] drm/i915: Added immediate submission override to scheduler
From: John Harrison john.c.harri...@intel.com To aid with debugging issues related to the scheduler, it can be useful to ensure that all batch buffers are submitted immediately rather than queued until later. This change adds an override flag via the module parameter to force instant submission. --- drivers/gpu/drm/i915/i915_scheduler.c |7 +-- drivers/gpu/drm/i915/i915_scheduler.h |1 + 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 1816f1d..71d8db4 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -209,8 +209,11 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe) list_add_tail(node-link, scheduler-node_queue[ring-id]); - not_flying = i915_scheduler_count_flying(scheduler, ring) -scheduler-min_flying; + if (i915.scheduler_override i915_so_submit_on_queue) + not_flying = true; + else + not_flying = i915_scheduler_count_flying(scheduler, ring) +scheduler-min_flying; spin_unlock_irqrestore(scheduler-lock, flags); diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index f93d57d..e824e700 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -123,6 +123,7 @@ enum { /* Options for 'scheduler_override' module parameter: */ enum { i915_so_direct_submit = (1 0), + i915_so_submit_on_queue = (1 1), }; booli915_scheduler_is_busy(struct intel_engine_cs *ring); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
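With the override in place, the queue-time decision reduces to: submit immediately if the debug flag is set, otherwise only when the ring has fewer than min_flying batches in flight. A standalone sketch of that predicate (hypothetical helper name):

```c
#include <assert.h>

enum {
    so_direct_submit   = (1 << 0),
    so_submit_on_queue = (1 << 1),   /* the override bit added here */
};

/* Should a freshly queued node be pushed to the hardware right away?
 * Either the debug override forces it, or the ring is under-committed. */
int submit_now(unsigned int override_flags, int flying, int min_flying)
{
    if (override_flags & so_submit_on_queue)
        return 1;
    return flying < min_flying;
}
```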
[Intel-gfx] [RFC 42/44] drm/i915: Added scheduler statistic reporting to debugfs
From: John Harrison john.c.harri...@intel.com It is useful for know what the scheduler is doing for both debugging and performance analysis purposes. This change adds a bunch of counters and such that keep track of various scheduler operations (batches submitted, preempted, interrupts processed, flush requests, etc.). The data can then be read in userland via the debugfs mechanism. --- drivers/gpu/drm/i915/i915_debugfs.c | 85 + drivers/gpu/drm/i915/i915_scheduler.c | 66 +++-- drivers/gpu/drm/i915/i915_scheduler.h | 50 +++ 3 files changed, 198 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 1c20c8c..cb9839b 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2482,6 +2482,88 @@ static int i915_display_info(struct seq_file *m, void *unused) return 0; } +#ifdef CONFIG_DRM_I915_SCHEDULER +static int i915_scheduler_info(struct seq_file *m, void *unused) +{ + struct drm_info_node *node = (struct drm_info_node *) m-private; + struct drm_device *dev = node-minor-dev; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + struct i915_scheduler_stats *stats = scheduler-stats; + struct i915_scheduler_stats_nodes node_stats[I915_NUM_RINGS]; + struct intel_engine_cs *ring; + char str[50 * (I915_NUM_RINGS + 1)], name[50], *ptr; + int ret, i, r; + + ret = mutex_lock_interruptible(dev-mode_config.mutex); + if (ret) + return ret; + +#define PRINT_VAR(name, fmt, var) \ + do {\ + sprintf(str, %-22s, name ); \ + ptr = str + strlen(str);\ + for_each_ring(ring, dev_priv, r) { \ + sprintf(ptr, %10 fmt, var); \ + ptr += strlen(ptr); \ + } \ + seq_printf(m, %s\n, str); \ + } while(0) + + PRINT_VAR(Ring name:, s, dev_priv-ring[r].name); + seq_printf(m, Batch submissions:\n); + PRINT_VAR( Queued, u, stats[r].queued); + PRINT_VAR( Queued preemptive,u, stats[r].queued_preemptive); + PRINT_VAR( Submitted,u, 
stats[r].submitted); + PRINT_VAR( Submitted preemptive, u, stats[r].submitted_preemptive); + PRINT_VAR( Preempted,u, stats[r].preempted); + PRINT_VAR( Completed,u, stats[r].completed); + PRINT_VAR( Completed preemptive, u, stats[r].completed_preemptive); + PRINT_VAR( Expired, u, stats[r].expired); + seq_putc(m, '\n'); + + seq_printf(m, Flush counts:\n); + PRINT_VAR( By object,u, stats[r].flush_obj); + PRINT_VAR( By seqno, u, stats[r].flush_seqno); + PRINT_VAR( Blanket, u, stats[r].flush_all); + PRINT_VAR( Entries bumped, u, stats[r].flush_bump); + PRINT_VAR( Entries submitted,u, stats[r].flush_submit); + seq_putc(m, '\n'); + + seq_printf(m, Interrupt counts:\n); + PRINT_VAR( Regular, llu, stats[r].irq.regular); + PRINT_VAR( Preemptive, llu, stats[r].irq.preemptive); + PRINT_VAR( Idle, llu, stats[r].irq.idle); + PRINT_VAR( Inter-batch, llu, stats[r].irq.interbatch); + PRINT_VAR( Mid-batch,llu, stats[r].irq.midbatch); + seq_putc(m, '\n'); + + seq_printf(m, Seqno values at last IRQ:\n); + PRINT_VAR( Seqno,d, stats[r].irq.last_seqno); + PRINT_VAR( Batch done, d, stats[r].irq.last_b_done); + PRINT_VAR( Preemptive done, d, stats[r].irq.last_p_done); + PRINT_VAR( Batch active, d, stats[r].irq.last_b_active); + PRINT_VAR( Preemptive active,d, stats[r].irq.last_p_active); + seq_putc(m, '\n'); + + seq_printf(m, Queue contents:\n); + for_each_ring(ring, dev_priv, i) + i915_scheduler_query_stats(ring, node_stats + ring-id); + + for (i = 0; i i915_sqs_MAX; i++) { + sprintf(name, %s, i915_scheduler_queue_status_str(i)); + PRINT_VAR(name, d, node_stats[r].counts[i]); + } + seq_putc(m, '\n'); + +#undef PRINT_VAR + + mutex_unlock(dev-mode_config.mutex); + + return 0; +} +#endif + struct pipe_crc_info { const char *name; struct drm_device *dev; @@ -3928,6 +4010,9 @@ static const struct drm_info_list i915_debugfs_list[] = {
[Intel-gfx] [RFC 30/44] drm/i915: Added a module parameter for allowing scheduler overrides
From: John Harrison john.c.harri...@intel.com It can be useful to be able to disable certain features (e.g. the entire scheduler) via a module parameter for debugging purposes. A parameter has the advantage of not being a compile time switch but without implying that it can be changed dynamically at runtime. --- drivers/gpu/drm/i915/i915_drv.h |1 + drivers/gpu/drm/i915/i915_params.c|4 drivers/gpu/drm/i915/i915_scheduler.h |5 + 3 files changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index fbafa68..4d52c67 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2053,6 +2053,7 @@ struct i915_params { bool reset; bool disable_display; bool disable_vtd_wa; + int scheduler_override; }; extern struct i915_params i915 __read_mostly; diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index d05a2af..ce99733 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = { .disable_display = 0, .enable_cmd_parser = 1, .disable_vtd_wa = 0, + .scheduler_override = 0, }; module_param_named(modeset, i915.modeset, int, 0400); @@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, Disable all VT-d workarounds (default: false) module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600); MODULE_PARM_DESC(enable_cmd_parser, Enable command parsing (1=enabled [default], 0=disabled)); + +module_param_named(scheduler_override, i915.scheduler_override, int, 0600); +MODULE_PARM_DESC(scheduler_override, Scheduler override option (default: 0)); diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index 1b3d51a..6dd4fea 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -84,6 +84,11 @@ struct i915_scheduler { uint32_tindex; }; +/* Options for 'scheduler_override' module parameter: */ +enum { + 
i915_so_normal = 0, +}; + int i915_scheduler_fly_seqno(struct intel_engine_cs *ring, uint32_t seqno); int i915_scheduler_remove(struct intel_engine_cs *ring); int i915_scheduler_flush(struct intel_engine_cs *ring, bool is_locked); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 01/44] drm/i915: Corrected 'file_priv' to 'file' in 'i915_driver_preclose()'
From: John Harrison john.c.harri...@intel.com The 'i915_driver_preclose()' function has a parameter called 'file_priv'. However, this is misleading as the structure it points to is a 'drm_file' not a 'drm_i915_file_private'. It should be named just 'file' to avoid confusion. --- drivers/gpu/drm/i915/i915_dma.c |6 +++--- drivers/gpu/drm/i915/i915_drv.h |6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index b9159ad..6cce55b 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1916,11 +1916,11 @@ void i915_driver_lastclose(struct drm_device * dev) i915_dma_cleanup(dev); } -void i915_driver_preclose(struct drm_device * dev, struct drm_file *file_priv) +void i915_driver_preclose(struct drm_device *dev, struct drm_file *file) { mutex_lock(dev-struct_mutex); - i915_gem_context_close(dev, file_priv); - i915_gem_release(dev, file_priv); + i915_gem_context_close(dev, file); + i915_gem_release(dev, file); mutex_unlock(dev-struct_mutex); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bea9ab40..7a96ca0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2044,12 +2044,12 @@ void i915_update_dri1_breadcrumb(struct drm_device *dev); extern void i915_kernel_lost_context(struct drm_device * dev); extern int i915_driver_load(struct drm_device *, unsigned long flags); extern int i915_driver_unload(struct drm_device *); -extern int i915_driver_open(struct drm_device *dev, struct drm_file *file_priv); +extern int i915_driver_open(struct drm_device *dev, struct drm_file *file); extern void i915_driver_lastclose(struct drm_device * dev); extern void i915_driver_preclose(struct drm_device *dev, -struct drm_file *file_priv); +struct drm_file *file); extern void i915_driver_postclose(struct drm_device *dev, - struct drm_file *file_priv); + struct drm_file *file); extern int 
i915_driver_device_is_agp(struct drm_device * dev); #ifdef CONFIG_COMPAT extern long i915_compat_ioctl(struct file *filp, unsigned int cmd, -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 31/44] drm/i915: Implemented the GPU scheduler
From: John Harrison john.c.harri...@intel.com Filled in all the 'do stuff here' blanks... The general theory of operation is that when batch buffers are submitted to the driver, the execbuffer() code assigns a unique seqno value and then packages up all the information required to execute the batch buffer at a later time. This package is given over to the scheduler, which adds it to an internal node list. The scheduler also scans the list of objects associated with the batch buffer and compares them against the objects already in use by other buffers in the node list. If matches are found then the new batch buffer node is marked as being dependent upon the matching node. The same is done for the context object. The scheduler also bumps up the priority of such matching nodes on the grounds that the more dependencies a given batch buffer has the more important it is likely to be. The scheduler aims to have a given (tuneable) number of batch buffers in flight on the hardware at any given time. If fewer than this are currently executing when a new node is queued, then the node is passed straight through to the submit function. Otherwise it is simply added to the queue and the driver returns to userland. As each batch buffer completes, it raises an interrupt which wakes up the scheduler. Note that it is possible for multiple buffers to complete before the IRQ handler gets to run. Further, the seqno values of the individual buffers are not necessarily incrementing as the scheduler may have re-ordered their submission. However, the scheduler keeps the list of executing buffers in order of hardware submission. Thus it can scan through the list until a matching seqno is found and then mark all in-flight nodes from that point on as completed. A deferred work queue is also poked by the interrupt handler. 
When this wakes up it can do more involved processing such as actually removing completed nodes from the queue and freeing up the resources associated with them (internal memory allocations, DRM object references, context reference, etc.). The work handler also checks the in-flight count and calls the submission code if a new slot has appeared. When the scheduler's submit code is called, it scans the queued node list for the highest-priority node that has no unmet dependencies. Note that the dependency calculation is complex as it must take inter-ring dependencies and potential preemptions into account. Note also that in the future this will be extended to include external dependencies such as the Android Native Sync file descriptors and/or the Linux dma-buf synchronisation scheme. If a suitable node is found then it is sent to execbuffer_final() for submission to the hardware. The in-flight count is then re-checked and a new node popped from the list if appropriate. Note that this change does not implement pre-emptive scheduling. Only basic scheduling by re-ordering batch buffer submission is currently implemented. 
--- drivers/gpu/drm/i915/i915_scheduler.c | 945 +++-- drivers/gpu/drm/i915/i915_scheduler.h | 59 +- 2 files changed, 965 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 6a10a76..1816f1d 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -41,6 +41,7 @@ int i915_scheduler_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; struct i915_scheduler *scheduler = dev_priv-scheduler; + int r; if (scheduler) return 0; @@ -51,8 +52,16 @@ int i915_scheduler_init(struct drm_device *dev) spin_lock_init(scheduler-lock); + for (r = 0; r I915_NUM_RINGS; r++) + INIT_LIST_HEAD(scheduler-node_queue[r]); + scheduler-index = 1; + /* Default tuning values: */ + scheduler-priority_level_max = ~0U; + scheduler-priority_level_preempt = 900; + scheduler-min_flying = 2; + dev_priv-scheduler = scheduler; return 0; @@ -60,50 +69,371 @@ int i915_scheduler_init(struct drm_device *dev) int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe) { - struct drm_i915_private *dev_priv = qe-params.dev-dev_private; - struct i915_scheduler *scheduler = dev_priv-scheduler; - int ret, i; + struct drm_i915_private *dev_priv = qe-params.dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + struct intel_engine_cs *ring = qe-params.ring; + struct i915_scheduler_queue_entry *node; + struct i915_scheduler_queue_entry *test; + struct timespec stamp; + unsigned long flags; + boolnot_flying, found; + int i, j, r, got_batch = 0; + int incomplete = 0; BUG_ON(!scheduler); -
[Intel-gfx] [RFC 44/44] drm/i915: Fake batch support for page flips
From: John Harrison john.c.harri...@intel.com Any commands written to the ring without the scheduler's knowledge can get lost during a pre-emption event. This checkin updates the page flip code to send the ring commands via the scheduler's 'fake batch' interface. Thus the page flip is kept safe from being clobbered. --- drivers/gpu/drm/i915/intel_display.c | 84 -- 1 file changed, 40 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index fa1ffbb..8bbc5d3 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9099,8 +9099,8 @@ static int intel_gen7_queue_flip(struct drm_device *dev, uint32_t flags) { struct intel_crtc *intel_crtc = to_intel_crtc(crtc); - uint32_t plane_bit = 0; - int len, ret; + uint32_t plane_bit = 0, sched_flags; + int ret; switch (intel_crtc-plane) { case PLANE_A: @@ -9117,18 +9117,6 @@ static int intel_gen7_queue_flip(struct drm_device *dev, return -ENODEV; } - len = 4; - if (ring-id == RCS) { - len += 6; - /* -* On Gen 8, SRM is now taking an extra dword to accommodate -* 48bits addresses, and we need a NOOP for the batch size to -* stay even. -*/ - if (IS_GEN8(dev)) - len += 2; - } - /* * BSpec MI_DISPLAY_FLIP for IVB: * The full packet must be contained within the same cache line. @@ -9139,13 +9127,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, * then do the cacheline alignment, and finally emit the * MI_DISPLAY_FLIP. */ - ret = intel_ring_cacheline_align(ring); - if (ret) - return ret; - - ret = intel_ring_begin(ring, len); - if (ret) - return ret; + sched_flags = i915_ebp_sf_cacheline_align; /* Unmask the flip-done completion message. Note that the bspec says that * we should do this for both the BCS and RCS, and that we must not unmask @@ -9157,32 +9139,46 @@ static int intel_gen7_queue_flip(struct drm_device *dev, * to zero does lead to lockups within MI_DISPLAY_FLIP. 
*/ if (ring-id == RCS) { - intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); - intel_ring_emit(ring, DERRMR); - intel_ring_emit(ring, ~(DERRMR_PIPEA_PRI_FLIP_DONE | - DERRMR_PIPEB_PRI_FLIP_DONE | - DERRMR_PIPEC_PRI_FLIP_DONE)); - if (IS_GEN8(dev)) - intel_ring_emit(ring, MI_STORE_REGISTER_MEM_GEN8(1) | - MI_SRM_LRM_GLOBAL_GTT); - else - intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) | - MI_SRM_LRM_GLOBAL_GTT); - intel_ring_emit(ring, DERRMR); - intel_ring_emit(ring, ring-scratch.gtt_offset + 256); - if (IS_GEN8(dev)) { - intel_ring_emit(ring, 0); - intel_ring_emit(ring, MI_NOOP); - } - } + uint32_t cmds[] = { + MI_LOAD_REGISTER_IMM(1), + DERRMR, + ~(DERRMR_PIPEA_PRI_FLIP_DONE | + DERRMR_PIPEB_PRI_FLIP_DONE | + DERRMR_PIPEC_PRI_FLIP_DONE), + IS_GEN8(dev) ? (MI_STORE_REGISTER_MEM_GEN8(1) | + MI_SRM_LRM_GLOBAL_GTT) : + (MI_STORE_REGISTER_MEM(1) | + MI_SRM_LRM_GLOBAL_GTT), + DERRMR, + ring-scratch.gtt_offset + 256, +// if (IS_GEN8(dev)) { + 0, + MI_NOOP, +// } + MI_DISPLAY_FLIP_I915 | plane_bit, + fb-pitches[0] | obj-tiling_mode, + intel_crtc-unpin_work-gtt_offset, + MI_NOOP + }; + uint32_t len = sizeof(cmds) / sizeof(*cmds); + + ret = i915_scheduler_queue_nonbatch(ring, cmds, len, obj, 1, sched_flags); + } else { + uint32_t cmds[] = { + MI_DISPLAY_FLIP_I915 | plane_bit, + fb-pitches[0] | obj-tiling_mode, + intel_crtc-unpin_work-gtt_offset, + MI_NOOP + }; + uint32_t len = sizeof(cmds) / sizeof(*cmds); - intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit); - intel_ring_emit(ring, (fb-pitches[0] | obj-tiling_mode)); -
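The diff above replaces a string of intel_ring_emit() calls with a single command array handed to the scheduler's fake-batch interface in one call. A minimal userspace sketch of that build-then-submit pattern follows; the opcode values and function names here are illustrative placeholders, not the kernel's:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative stand-ins for the MI_* opcodes in the patch; the real
 * values live in i915_reg.h. */
#define MI_NOOP              0x00000000u
#define MI_DISPLAY_FLIP_I915 0x14000000u

/* Build the flip packet into out[] and return the dword count.  This
 * mirrors the patch's structure: collect every dword up front, then
 * hand the whole array over in one call rather than emitting into the
 * ring piecemeal. */
size_t build_flip_cmds(uint32_t plane_bit, uint32_t pitch_and_tiling,
                       uint32_t gtt_offset, uint32_t out[4])
{
    const uint32_t cmds[] = {
        MI_DISPLAY_FLIP_I915 | plane_bit,
        pitch_and_tiling,
        gtt_offset,
        MI_NOOP,                /* keep the packet dword count even */
    };
    size_t i;

    for (i = 0; i < ARRAY_SIZE(cmds); i++)
        out[i] = cmds[i];
    return ARRAY_SIZE(cmds);
}
```

Collecting the dwords up front means nothing is written to the ring until the scheduler decides to submit, which is what keeps the flip safe from being clobbered by a preemption event.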
[Intel-gfx] [RFC 36/44] drm/i915: Added debug state dump facilities to scheduler
From: John Harrison john.c.harri...@intel.com When debugging batch buffer submission issues, it is useful to be able to see what the current state of the scheduler is. This change adds functions for decoding the internal scheduler state and reporting it. --- drivers/gpu/drm/i915/i915_scheduler.c | 255 + drivers/gpu/drm/i915/i915_scheduler.h | 17 +++ 2 files changed, 272 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 6782249..7c03fb7 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -37,6 +37,101 @@ bool i915_scheduler_is_enabled(struct drm_device *dev) #ifdef CONFIG_DRM_I915_SCHEDULER +const char *i915_qe_state_str(struct i915_scheduler_queue_entry *node) +{ + static char str[50]; + char*ptr = str; + + *(ptr++) = node-bumped ? 'B' : '-', + + *ptr = 0; + + return str; +} + +char i915_scheduler_queue_status_chr(enum i915_scheduler_queue_status status) +{ + switch (status) { + case i915_sqs_none: + return 'N'; + + case i915_sqs_queued: + return 'Q'; + + case i915_sqs_flying: + return 'F'; + + case i915_sqs_complete: + return 'C'; + + default: + break; + } + + return '?'; +} + +const char *i915_scheduler_queue_status_str( + enum i915_scheduler_queue_status status) +{ + static char str[50]; + + switch (status) { + case i915_sqs_none: + return None; + + case i915_sqs_queued: + return Queued; + + case i915_sqs_flying: + return Flying; + + case i915_sqs_complete: + return Complete; + + default: + break; + } + + sprintf(str, [Unknown_%d!], status); + return str; +} + +const char *i915_scheduler_flag_str(uint32_t flags) +{ + static char str[100]; + char *ptr = str; + + *ptr = 0; + +#define TEST_FLAG(flag, msg) \ + if (flags (flag)) { \ + strcpy(ptr, msg); \ + ptr += strlen(ptr); \ + flags = ~(flag); \ + } + + TEST_FLAG(i915_sf_interrupts_enabled, IntOn|); + TEST_FLAG(i915_sf_submitting, Submitting|); + TEST_FLAG(i915_sf_dump_force, DumpForce|); + 
TEST_FLAG(i915_sf_dump_details, DumpDetails|); + TEST_FLAG(i915_sf_dump_dependencies, DumpDeps|); + +#undef TEST_FLAG + + if (flags) { + sprintf(ptr, Unknown_0x%X!, flags); + ptr += strlen(ptr); + } + + if (ptr == str) + strcpy(str, -); + else + ptr[-1] = 0; + + return str; +}; + int i915_scheduler_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev-dev_private; @@ -589,6 +684,166 @@ int i915_scheduler_remove(struct intel_engine_cs *ring) return ret; } +int i915_scheduler_dump_all(struct drm_device *dev, const char *msg) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + unsigned long flags; + int ret; + + spin_lock_irqsave(scheduler-lock, flags); + ret = i915_scheduler_dump_all_locked(dev, msg); + spin_unlock_irqrestore(scheduler-lock, flags); + + return ret; +} + +int i915_scheduler_dump_all_locked(struct drm_device *dev, const char *msg) +{ + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + struct intel_engine_cs *ring; + int i, r, ret = 0; + + for_each_ring(ring, dev_priv, i) { + scheduler-flags[ring-id] |= i915_sf_dump_force | + i915_sf_dump_details | + i915_sf_dump_dependencies; + r = i915_scheduler_dump_locked(ring, msg); + if (ret == 0) + ret = r; + } + + return ret; +} + +int i915_scheduler_dump(struct intel_engine_cs *ring, const char *msg) +{ + struct drm_i915_private *dev_priv = ring-dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + unsigned long flags; + int ret; + + spin_lock_irqsave(scheduler-lock, flags); + ret = i915_scheduler_dump_locked(ring, msg); + spin_unlock_irqrestore(scheduler-lock, flags); + + return ret; +} + +int i915_scheduler_dump_locked(struct intel_engine_cs *ring, const char *msg) +{ + struct drm_i915_private *dev_priv = ring-dev-dev_private; + struct i915_scheduler *scheduler =
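The flag-decoding helper in the patch builds a '|'-separated summary string with a small macro that also clears each bit it names, so any leftover bits can be reported as unknown. A standalone sketch of that technique (the flag names are illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Illustrative flag bits standing in for the scheduler's i915_sf_* values. */
#define SF_INTERRUPTS_ENABLED (1u << 0)
#define SF_SUBMITTING         (1u << 1)

/* Decode a flags word into "Name|Name" the way the patch's
 * i915_scheduler_flag_str() does: each TEST_FLAG names a bit and clears
 * it, so whatever remains afterwards must be an unknown bit. */
const char *flag_str(uint32_t flags)
{
    static char str[100];
    char *ptr = str;

    *ptr = 0;

#define TEST_FLAG(flag, msg)          \
    if (flags & (flag)) {             \
        strcpy(ptr, msg);             \
        ptr += strlen(ptr);           \
        flags &= ~(flag);             \
    }

    TEST_FLAG(SF_INTERRUPTS_ENABLED, "IntOn|");
    TEST_FLAG(SF_SUBMITTING,         "Submitting|");
#undef TEST_FLAG

    if (flags) {
        sprintf(ptr, "Unknown_0x%X!", (unsigned)flags);
        ptr += strlen(ptr);
    }

    if (ptr == str)
        strcpy(str, "-");            /* nothing set at all */
    else if (ptr[-1] == '|')
        ptr[-1] = 0;                 /* drop the trailing separator */

    return str;
}
```

Note the static buffer: like the kernel helper, this is only safe for single-threaded debug output, which is acceptable for a dump facility.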
[Intel-gfx] [RFC 41/44] drm/i915: Added validation callback to trace points
From: John Harrison john.c.harri...@intel.com The validation tests require hooks into the GPU scheduler to allow them to analyse what the scheduler is doing internally. --- drivers/gpu/drm/i915/i915_scheduler.c |4 drivers/gpu/drm/i915/i915_scheduler.h | 16 drivers/gpu/drm/i915/i915_trace.h | 16 3 files changed, 36 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c index 0eb6a31..8d45b73 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.c +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -26,6 +26,10 @@ #include intel_drv.h #include i915_scheduler.h +i915_scheduler_validation_callback_type + i915_scheduler_validation_callback = NULL; +EXPORT_SYMBOL(i915_scheduler_validation_callback); + bool i915_scheduler_is_enabled(struct drm_device *dev) { #ifdef CONFIG_DRM_I915_SCHEDULER diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h index f86b687..2f8c566 100644 --- a/drivers/gpu/drm/i915/i915_scheduler.h +++ b/drivers/gpu/drm/i915/i915_scheduler.h @@ -196,4 +196,20 @@ void i915_scheduler_file_queue_dec(struct drm_file *file); int i915_gem_do_execbuffer_final(struct i915_execbuffer_params *params); +/* A callback mechanism to allow validation tests to hook into the internal + * state of the scheduler. 
*/ +enum i915_scheduler_validation_op { + i915_scheduler_validation_op_state_change = 1, + i915_scheduler_validation_op_queue, + i915_scheduler_validation_op_dispatch, + i915_scheduler_validation_op_complete, +}; +typedef int (*i915_scheduler_validation_callback_type) + (enum i915_scheduler_validation_op op, +struct intel_engine_cs *ring, +uint32_t seqno, +struct i915_scheduler_queue_entry *node); +extern i915_scheduler_validation_callback_type + i915_scheduler_validation_callback; + #endif /* _I915_SCHEDULER_H_ */ diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 40b1c6f..2029d8b 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -369,6 +369,10 @@ TRACE_EVENT(i915_gem_ring_dispatch, __entry-seqno = seqno; __entry-flags = flags; i915_trace_irq_get(ring, seqno); + if (i915_scheduler_validation_callback) + i915_scheduler_validation_callback( + i915_scheduler_validation_op_dispatch, + ring, seqno, NULL); ), TP_printk(dev=%u, ring=%u, seqno=%u, flags=%x, @@ -660,6 +664,10 @@ TRACE_EVENT(i915_scheduler_landing, __entry-ring = ring-id; __entry-seqno = seqno; __entry-status = node ? 
node-status : ~0U; + if (i915_scheduler_validation_callback) + i915_scheduler_validation_callback( + i915_scheduler_validation_op_complete, + ring, seqno, node); ), TP_printk(ring=%d, seqno=%d, status=%d, @@ -740,6 +748,10 @@ TRACE_EVENT(i915_scheduler_node_state_change, __entry-ring = ring-id; __entry-seqno = node-params.seqno; __entry-status = node-status; + if (i915_scheduler_validation_callback) + i915_scheduler_validation_callback( + i915_scheduler_validation_op_state_change, + ring, node-params.seqno, node); ), TP_printk(ring=%d, seqno=%d, status=%d, @@ -789,6 +801,10 @@ TRACE_EVENT(i915_gem_ring_queue, TP_fast_assign( __entry-ring = ring-id; __entry-seqno = node-params.seqno; + if (i915_scheduler_validation_callback) + i915_scheduler_validation_callback( + i915_scheduler_validation_op_queue, + ring, node-params.seqno, node); ), TP_printk(ring=%d, seqno=%d, __entry-ring, __entry-seqno) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
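The validation hook is a single global function pointer that each trace point tests for NULL before calling, so production builds pay almost nothing when no test harness is attached. A userspace model of the pattern (the names are illustrative, not the kernel symbols):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the patch's validation hook: one global function pointer that
 * instrumented paths invoke only when a harness has installed it. */
enum validation_op { OP_QUEUE = 1, OP_DISPATCH, OP_COMPLETE };

typedef int (*validation_cb)(enum validation_op op, uint32_t seqno);

validation_cb validation_callback = NULL;

/* Call sites stay cheap when no harness is attached: one NULL test. */
static int notify(enum validation_op op, uint32_t seqno)
{
    if (validation_callback)
        return validation_callback(op, seqno);
    return 0;
}

/* An instrumented path, standing in for a trace point. */
int dispatch_batch(uint32_t seqno)
{
    /* ... real dispatch work would go here ... */
    return notify(OP_DISPATCH, seqno);
}

/* A recording callback such as a validation test might install. */
static uint32_t last_seen_seqno;
static int record_cb(enum validation_op op, uint32_t seqno)
{
    (void)op;
    last_seen_seqno = seqno;
    return 1;
}

void install_recorder(void) { validation_callback = record_cb; }
uint32_t last_seqno_seen(void) { return last_seen_seqno; }
```

In the kernel the pointer is additionally EXPORT_SYMBOL'd so an out-of-tree test module can install its own callback.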
[Intel-gfx] [RFC 39/44] drm/i915: Added support for pre-emptive scheduling
From: John Harrison john.c.harri...@intel.com Added support for pre-empting batch buffers that have already been submitted to the ring. Currently this implements Gen7 level pre-emption which means pre-empting only at voluntary points within the batch buffer. The ring submission code itself adds such points between batch buffers and the OpenCL driver should be adding them within GPGPU specific batch buffers. Other types of workloads cannot be preempted by the hardware and so will not be adding pre-emption points to their buffers. When a pre-emption occurs, the scheduler must work out which buffers have been pre-empted versus which actually managed to complete first, and of those that were pre-empted was the last one pre-empted mid-batch or had it not yet begun to execute. This is done by extending the seqno mechanism to four slots: batch buffer start, batch buffer end, preemption start and preemption end. By querying these four numbers (and only allowing a single preemption event at a time) the scheduler can guarantee to work out exactly what happened to all batch buffers that had been submitted to the ring. A Kconfig option has also been added to allow pre-emption support to be enabled or disabled. --- drivers/gpu/drm/i915/Kconfig |8 + drivers/gpu/drm/i915/i915_gem.c| 12 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 273 drivers/gpu/drm/i915/i915_scheduler.c | 467 +++- drivers/gpu/drm/i915/i915_scheduler.h | 25 +- drivers/gpu/drm/i915/i915_trace.h | 23 +- drivers/gpu/drm/i915/intel_ringbuffer.h|4 + 7 files changed, 797 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 22a036b..b94d4c7 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -89,3 +89,11 @@ config DRM_I915_SCHEDULER help Choose this option to enable GPU task scheduling for improved performance and efficiency. 
+ +config DRM_I915_SCHEDULER_PREEMPTION + bool Enable pre-emption within the GPU scheduler + depends on DRM_I915_SCHEDULER + default y + help + Choose this option to enable pre-emptive context switching within the + GPU scheduler for even more performance and efficiency improvements. diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index dd0fac8..2cb4484 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2312,6 +2312,18 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno) ring-semaphore.sync_seqno[j] = 0; } +#ifdef CONFIG_DRM_I915_SCHEDULER_PREEMPTION + /* Also reset sw batch tracking state */ + for_each_ring(ring, dev_priv, i) { + ring-last_regular_batch = 0; + ring-last_preemptive_batch = 0; + intel_write_status_page(ring, I915_BATCH_DONE_SEQNO, 0); + intel_write_status_page(ring, I915_BATCH_ACTIVE_SEQNO, 0); + intel_write_status_page(ring, I915_PREEMPTIVE_DONE_SEQNO, 0); + intel_write_status_page(ring, I915_PREEMPTIVE_ACTIVE_SEQNO, 0); + } +#endif + return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a9570ff..81acdf2 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1470,6 +1470,238 @@ pre_mutex_err: return ret; } +#ifdef CONFIG_DRM_I915_SCHEDULER_PREEMPTION +/* + * The functions below emit opcodes into the ring buffer. + * The simpler ones insert a single instruction, whereas the + * prequel/preamble/postamble functions generate a sequence + * of operations according to the nature of the current batch. + * Top among them is i915_gem_do_execbuffer_final() which is + * called by the scheduler to pass a batch to the hardware. + * + * There are three different types of batch handled here: + * 1. non-preemptible batches (using the default context) + * 2. preemptible batches (using a non-default context) + * 3. 
preemptive batches (using a non-default context) + * and three points at which the code paths vary (prequel, at the very + * start of per-batch processing; preamble, just before the call to the + * batch buffer; and postamble, which after the batch buffer completes). + * + * The preamble is simple; it logs the sequence number of the batch that's + * about to start, and enables or disables preemption for the duration of + * the batch. The postamble is similar: it logs the sequence number of the + * batch that's just finished, and clears the in-progress sequence number + * (except for preemptive batches, where this is deferred to the interrupt + * handler). + * + * The prequel is the part that differs most. In the case of a regular batch, + * it contains an ARB ON/ARB CHECK sequence that allows preemption before + *
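The scheme described above relies on wraparound-safe seqno comparison plus the extra status-page slots to work out what happened to each batch after a preemption event. A sketch of that classification under simplifying assumptions — it uses only two of the four slots, and the helper names are made up:

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Wraparound-safe "completed has passed seqno", as i915_seqno_passed()
 * does in the driver. */
bool seqno_passed(uint32_t completed, uint32_t seqno)
{
    return (int32_t)(completed - seqno) >= 0;
}

/* The patch tracks four status-page slots per ring; this illustrative
 * classifier uses two of them to decide the fate of one batch after a
 * preemption event.  The slot names follow the patch, the logic is a
 * sketch. */
enum batch_fate { BATCH_COMPLETED, BATCH_PREEMPTED_MID, BATCH_NOT_STARTED };

enum batch_fate classify(uint32_t batch_seqno,
                         uint32_t batch_done,    /* I915_BATCH_DONE_SEQNO */
                         uint32_t batch_active)  /* I915_BATCH_ACTIVE_SEQNO */
{
    if (seqno_passed(batch_done, batch_seqno))
        return BATCH_COMPLETED;       /* finished before the preemption */
    if (batch_active == batch_seqno)
        return BATCH_PREEMPTED_MID;   /* stopped part way through */
    return BATCH_NOT_STARTED;         /* never began executing */
}
```

Allowing only one preemption event in flight at a time is what makes this bookkeeping unambiguous, as the commit message notes.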
[Intel-gfx] [RFC 35/44] drm/i915: Added debugfs interface to scheduler tuning parameters
From: John Harrison john.c.harri...@intel.com There are various parameters within the scheduler which can be tuned to improve performance, reduce memory footprint, etc. This change adds support for altering these via debugfs. --- drivers/gpu/drm/i915/i915_debugfs.c | 117 +++ 1 file changed, 117 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 5858cbb..1c20c8c 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -39,6 +39,7 @@ #include intel_ringbuffer.h #include drm/i915_drm.h #include i915_drv.h +#include i915_scheduler.h enum { ACTIVE_LIST, @@ -983,6 +984,116 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_next_seqno_fops, i915_next_seqno_get, i915_next_seqno_set, 0x%llx\n); +#ifdef CONFIG_DRM_I915_SCHEDULER +static int +i915_scheduler_priority_max_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + *val = (u64) scheduler-priority_level_max; + return 0; +} + +static int +i915_scheduler_priority_max_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + scheduler-priority_level_max = (u32) val; + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_priority_max_fops, + i915_scheduler_priority_max_get, + i915_scheduler_priority_max_set, + 0x%llx\n); + +static int +i915_scheduler_priority_preempt_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + *val = (u64) scheduler-priority_level_preempt; + return 0; +} + +static int +i915_scheduler_priority_preempt_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = 
dev_priv-scheduler; + + scheduler-priority_level_preempt = (u32) val; + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_priority_preempt_fops, + i915_scheduler_priority_preempt_get, + i915_scheduler_priority_preempt_set, + 0x%llx\n); + +static int +i915_scheduler_min_flying_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + *val = (u64) scheduler-min_flying; + return 0; +} + +static int +i915_scheduler_min_flying_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + scheduler-min_flying = (u32) val; + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_min_flying_fops, + i915_scheduler_min_flying_get, + i915_scheduler_min_flying_set, + 0x%llx\n); + +static int +i915_scheduler_file_queue_max_get(void *data, u64 *val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + *val = (u64) scheduler-file_queue_max; + return 0; +} + +static int +i915_scheduler_file_queue_max_set(void *data, u64 val) +{ + struct drm_device *dev = data; + struct drm_i915_private *dev_priv = dev-dev_private; + struct i915_scheduler *scheduler = dev_priv-scheduler; + + scheduler-file_queue_max = (u32) val; + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(i915_scheduler_file_queue_max_fops, + i915_scheduler_file_queue_max_get, + i915_scheduler_file_queue_max_set, + 0x%llx\n); +#endif /* CONFIG_DRM_I915_SCHEDULER */ + static int i915_rstdby_delays(struct seq_file *m, void *unused) { struct drm_info_node *node = m-private; @@ -3834,6 +3945,12 @@ static const struct i915_debugfs_files { {i915_gem_drop_caches, i915_drop_caches_fops}, {i915_error_state, i915_error_state_fops}, {i915_next_seqno, i915_next_seqno_fops}, +#ifdef CONFIG_DRM_I915_SCHEDULER + 
{i915_scheduler_priority_max, i915_scheduler_priority_max_fops}, + {i915_scheduler_priority_preempt, i915_scheduler_priority_preempt_fops}, + {i915_scheduler_min_flying, i915_scheduler_min_flying_fops}, +
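Each tunable in the patch follows the same recipe: a trivial get/set pair over u64, wrapped by DEFINE_SIMPLE_ATTRIBUTE to produce the debugfs file operations. A userspace model of one such pair (the struct layout is an illustrative subset of the scheduler's tunables):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative subset of the scheduler's tunables. */
struct scheduler {
    uint32_t priority_level_max;
    uint32_t min_flying;
};

/* The debugfs recipe from the patch: every tunable gets a trivial
 * get/set pair over u64, which DEFINE_SIMPLE_ATTRIBUTE then wraps into
 * file operations.  The pair is modelled here directly. */
int min_flying_get(struct scheduler *s, uint64_t *val)
{
    *val = s->min_flying;
    return 0;
}

int min_flying_set(struct scheduler *s, uint64_t val)
{
    s->min_flying = (uint32_t)val;
    return 0;
}
```

The repetition across four near-identical get/set pairs is the usual cost of the simple-attribute pattern; in exchange, each debugfs file needs no parsing code of its own.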
[Intel-gfx] [RFC 16/44] drm/i915: Alloc early seqno
From: John Harrison john.c.harri...@intel.com

The scheduler needs to explicitly allocate a seqno to track each submitted batch buffer. This must happen a long time before any commands are actually written to the ring.

---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h    | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ee836a6..ec274ef 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1317,6 +1317,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		vma->bind_vma(vma, batch_obj->cache_level, GLOBAL_BIND);
 	}
 
+	/* Allocate a seqno for this batch buffer nice and early. */
+	ret = intel_ring_alloc_seqno(ring);
+	if (ret)
+		goto err;
+
 	if (flags & I915_DISPATCH_SECURE)
 		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
 	else
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 34d6d6e..737c41b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1662,7 +1662,7 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 	return i915_wait_seqno(ring, seqno);
 }
 
-static int
+int
 intel_ring_alloc_seqno(struct intel_engine_cs *ring)
 {
 	if (ring->outstanding_lazy_seqno)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 30841ea..cc92de2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -347,6 +347,7 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring);
 
 int __must_check intel_ring_begin(struct intel_engine_cs *ring, int n);
 int __must_check intel_ring_cacheline_align(struct intel_engine_cs *ring);
+int __must_check intel_ring_alloc_seqno(struct intel_engine_cs *ring);
 
 static inline void intel_ring_emit(struct intel_engine_cs *ring, u32 data)
 {
--
1.7.9.5
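Exposing intel_ring_alloc_seqno() this way is safe because allocation is lazy and idempotent: if a seqno is already outstanding, the call is a no-op, so the later ring code can call the same helper again without consuming a second number. A small model of that behaviour (the struct here is hypothetical, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* Models intel_ring_alloc_seqno(): allocation is lazy and idempotent,
 * so the scheduler can grab a seqno early in execbuffer and later ring
 * code can call the same helper again harmlessly. */
struct ring {
    uint32_t next_seqno;
    uint32_t outstanding_lazy_seqno;   /* 0 means none allocated yet */
};

int ring_alloc_seqno(struct ring *ring)
{
    if (ring->outstanding_lazy_seqno)
        return 0;                      /* already allocated: no-op */

    ring->outstanding_lazy_seqno = ring->next_seqno++;
    return 0;
}
```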
[Intel-gfx] [RFC 24/44] drm/i915: Added scheduler interrupt handler hook
From: John Harrison john.c.harri...@intel.com

The scheduler needs to be informed of each batch buffer completion. This is done via the user interrupt mechanism. The epilogue of each batch buffer submission updates a sequence number value (seqno) and triggers a user interrupt. This change hooks the scheduler in to the processing of that interrupt via the notify_ring() function. The scheduler also has clean up code that needs to be done outside of the interrupt context, thus notify_ring() now also pokes the scheduler's work queue.

---
 drivers/gpu/drm/i915/i915_irq.c       |  3 +++
 drivers/gpu/drm/i915/i915_scheduler.c | 16 ++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index eff08a3e..7089242 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -36,6 +36,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "i915_scheduler.h"
 
 static const u32 hpd_ibx[] = {
 	[HPD_CRT] = SDE_CRT_HOTPLUG,
@@ -1218,6 +1219,8 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	i915_scheduler_handle_IRQ(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index fc165c2..1e4d7c313 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -92,6 +92,17 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 	return ret;
 }
 
+int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	/* Do stuff... */
+
+	queue_work(dev_priv->wq, &dev_priv->mm.scheduler_work);
+
+	return 0;
+}
+
 int i915_scheduler_remove(struct intel_engine_cs *ring)
 {
 	/* Do stuff... */
@@ -149,4 +160,9 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
 	return i915_gem_do_execbuffer_final(&qe->params);
 }
 
+int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring)
+{
+	return 0;
+}
+
 #endif /* CONFIG_DRM_I915_SCHEDULER */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index e62254a..dd7d699 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -62,6 +62,7 @@ int i915_scheduler_init(struct drm_device *dev);
 int i915_scheduler_closefile(struct drm_device *dev,
 			     struct drm_file *file);
 int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
+int i915_scheduler_handle_IRQ(struct intel_engine_cs *ring);
 
 #ifdef CONFIG_DRM_I915_SCHEDULER
--
1.7.9.5
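The split described above — minimal bookkeeping in the interrupt handler, heavier cleanup deferred to a work queue — can be modelled like this (types are illustrative, and the queue_work() call is stood in for by a flag):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the split in the patch: notify_ring() does only cheap seqno
 * bookkeeping in interrupt context, then queues the heavier cleanup
 * onto a work queue to run later, outside the IRQ handler. */
struct sched_state {
    uint32_t last_completed_seqno;
    bool cleanup_work_queued;
};

int handle_irq(struct sched_state *s, uint32_t seqno)
{
    s->last_completed_seqno = seqno;   /* cheap: safe in IRQ context */
    s->cleanup_work_queued = true;     /* defer the rest, as queue_work() would */
    return 0;
}
```

The design choice is standard top-half/bottom-half splitting: anything that can sleep or take long-held locks must not run in the interrupt handler itself.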
[Intel-gfx] [RFC 10/44] drm/i915: Prepare retire_requests to handle out-of-order seqnos
From: John Harrison john.c.harri...@intel.com A major point of the GPU scheduler is that it re-orders batch buffers after they have been submitted to the driver. Rather than attempting to re-assign seqno values, it is much simpler to have each batch buffer keep its initially assigned number and modify the rest of the driver to cope with seqnos being returned out of order. In practice, very little code actually needs updating to cope. One such place is the retire request handler. Rather than stopping as soon as an uncompleted seqno is found, it must now keep iterating through the requests in case later seqnos have completed. There is also a problem with doing the free of the request before the move to inactive. Thus the requests are now moved to a temporary list first, then the objects de-activated and finally the requests on the temporary list are freed. --- drivers/gpu/drm/i915/i915_gem.c | 60 +-- 1 file changed, 32 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index b784eb2..7e53446 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2602,7 +2602,10 @@ void i915_gem_reset(struct drm_device *dev) void i915_gem_retire_requests_ring(struct intel_engine_cs *ring) { + struct drm_i915_gem_object *obj, *obj_next; + struct drm_i915_gem_request *req, *req_next; uint32_t seqno; + LIST_HEAD(deferred_request_free); if (list_empty(ring-request_list)) return; @@ -2611,43 +2614,35 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring) seqno = ring-get_seqno(ring, true); - /* Move any buffers on the active list that are no longer referenced -* by the ringbuffer to the flushing/inactive lists as appropriate, -* before we free the context associated with the requests. + /* Note that seqno values might be out of order due to rescheduling and +* pre-emption. Thus both lists must be processed in their entirety +* rather than stopping at the first 'non-passed' entry. 
*/ - while (!list_empty(ring-active_list)) { - struct drm_i915_gem_object *obj; - - obj = list_first_entry(ring-active_list, - struct drm_i915_gem_object, - ring_list); - - if (!i915_seqno_passed(seqno, obj-last_read_seqno)) - break; - i915_gem_object_move_to_inactive(obj); - } - - - while (!list_empty(ring-request_list)) { - struct drm_i915_gem_request *request; - - request = list_first_entry(ring-request_list, - struct drm_i915_gem_request, - list); - - if (!i915_seqno_passed(seqno, request-seqno)) - break; + list_for_each_entry_safe(req, req_next, ring-request_list, list) { + if (!i915_seqno_passed(seqno, req-seqno)) + continue; - trace_i915_gem_request_retire(ring, request-seqno); + trace_i915_gem_request_retire(ring, req-seqno); /* We know the GPU must have read the request to have * sent us the seqno + interrupt, so use the position * of tail of the request to update the last known position * of the GPU head. */ - ring-buffer-last_retired_head = request-tail; + ring-buffer-last_retired_head = req-tail; - i915_gem_free_request(request); + list_move_tail(req-list, deferred_request_free); + } + + /* Move any buffers on the active list that are no longer referenced +* by the ringbuffer to the flushing/inactive lists as appropriate, +* before we free the context associated with the requests. +*/ + list_for_each_entry_safe(obj, obj_next, ring-active_list, ring_list) { + if (!i915_seqno_passed(seqno, obj-last_read_seqno)) + continue; + + i915_gem_object_move_to_inactive(obj); } if (unlikely(ring-trace_irq_seqno @@ -2656,6 +2651,15 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring) ring-trace_irq_seqno = 0; } + /* Finish processing active list before freeing request */ + while (!list_empty(deferred_request_free)) { + req = list_first_entry(deferred_request_free, + struct drm_i915_gem_request, + list); + + i915_gem_free_request(req); + } + WARN_ON(i915_verify_lists(ring-dev)); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org
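The core change above is that the retire loop may no longer stop at the first not-yet-passed seqno, because completion order is no longer submission order. The sketch below contrasts the two behaviours using the driver's wraparound-safe comparison; the kernel's list walk is simplified to an array:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Wraparound-safe comparison used throughout the driver
 * (i915_seqno_passed). */
bool seqno_passed(uint32_t completed, uint32_t seqno)
{
    return (int32_t)(completed - seqno) >= 0;
}

/* With in-order completion the retire loop could stop at the first
 * request not yet passed; with rescheduling and preemption it must
 * scan the whole list.  This counts retirable requests both ways to
 * show the difference. */
size_t count_retirable(const uint32_t *reqs, size_t n,
                       uint32_t completed, bool stop_at_first_miss)
{
    size_t retired = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!seqno_passed(completed, reqs[i])) {
            if (stop_at_first_miss)
                break;                 /* old behaviour: early exit */
            continue;                  /* new behaviour: keep scanning */
        }
        retired++;
    }
    return retired;
}
```

With out-of-order seqnos, the early-exit loop would leave completed requests stranded behind an incomplete one, which is exactly why the patch switches to a full scan plus a deferred-free list.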
[Intel-gfx] [RFC 14/44] drm/i915: Added getparam for GPU scheduler
From: John Harrison john.c.harri...@intel.com

This is required by userland validation programs that need to know whether the scheduler is available for testing or not.

---
 drivers/gpu/drm/i915/i915_dma.c | 3 +++
 include/uapi/drm/i915_drm.h     | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 6c9ce82..1668316 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1035,6 +1035,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		value = 0;
 #endif
 		break;
+	case I915_PARAM_HAS_GPU_SCHEDULER:
+		value = i915_scheduler_is_enabled(dev);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index bf54c78..de6f603 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -341,6 +341,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_WT		 27
 #define I915_PARAM_CMD_PARSER_VERSION	 28
 #define I915_PARAM_HAS_NATIVE_SYNC	 30
+#define I915_PARAM_HAS_GPU_SCHEDULER	 31
 
 typedef struct drm_i915_getparam {
 	int param;
--
1.7.9.5
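The new getparam case is a one-line switch entry; from userspace, a validation tool would issue DRM_IOCTL_I915_GETPARAM with this parameter. A hardware-free model of the kernel-side switch (the parameter value matches the patch; the error code is simplified from the kernel's -EINVAL):

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the i915_getparam() switch from the patch. */
#define I915_PARAM_HAS_GPU_SCHEDULER 31

bool scheduler_enabled = true;  /* stands in for i915_scheduler_is_enabled() */

int getparam(int param, int *value)
{
    switch (param) {
    case I915_PARAM_HAS_GPU_SCHEDULER:
        *value = scheduler_enabled;
        return 0;
    default:
        return -1;               /* unknown parameter */
    }
}
```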
[Intel-gfx] [PULL] drm-intel-fixes
Hi Dave -

Fixes for 3.16-rc2; regressions, races, and warns; Broadwell PCI IDs.

BR, Jani.

The following changes since commit a497c3ba1d97fc69c1e78e7b96435ba8c2cb42ee:

  Linux 3.16-rc2 (2014-06-21 19:02:54 -1000)

are available in the git repository at:

  git://anongit.freedesktop.org/drm-intel tags/drm-intel-fixes-2014-06-26

for you to fetch changes up to 8525a235c96a548873c6c5644f50df32b31f04c6:

  drm/i915: vlv_prepare_pll is only needed in case of non DSI interfaces (2014-06-25 11:22:18 +0300)

----------------------------------------------------------------
Chris Wilson (2):
      drm/i915: Only mark the ctx as initialised after a SET_CONTEXT operation
      drm/i915: Hold the table lock whilst walking the file's idr and counting the objects in debugfs

Imre Deak (1):
      drm/i915: cache hw power well enabled state

Jani Nikula (1):
      drm/i915: default to having backlight if VBT not available

Rodrigo Vivi (1):
      drm/i915: BDW: Adding Reserved PCI IDs.

Shobhit Kumar (1):
      drm/i915: vlv_prepare_pll is only needed in case of non DSI interfaces

 drivers/gpu/drm/i915/i915_debugfs.c     |  2 ++
 drivers/gpu/drm/i915/i915_drv.h         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c |  8 ---
 drivers/gpu/drm/i915/intel_bios.c       |  6 +++---
 drivers/gpu/drm/i915/intel_display.c    | 13 ++--
 drivers/gpu/drm/i915/intel_drv.h        |  4 ++--
 drivers/gpu/drm/i915/intel_pm.c         | 37 +
 include/drm/i915_pciids.h               | 12 +--
 8 files changed, 46 insertions(+), 38 deletions(-)

--
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [RFC 00/44] GPU scheduler for i915 driver
Implemented a batch buffer submission scheduler for the i915 DRM driver. While this seems very interesting, you might want to address in the commit message or the cover email a) why this is needed, and b) any improvements in speed, power consumption, or throughput it generates, i.e. benchmarks. Also, some notes on which hardware supports preemption would be useful. Dave.
Re: [Intel-gfx] xf86-video-intel-2.99.912 with dri3 enabled breaks gnome-shell
On Thursday 26 June 2014 at 13:36 +0100, Steven Newbury wrote: Hi Hans! Have you got to the bottom of this? I see the same thing, and possibly also with GLX contexts under XWayland with gnome-shell. If it's the same issue, it points to the mesa i965 DRI(3) driver side rather than the DDX driver, since the DDX is no longer involved with XWayland. FTR, I disabled DRI3 in our DDX ebuild because of this bug. IIRC, Julien did the same in Debian. Cheers, Rémi
Re: [Intel-gfx] [PATCH v2 3/3] drm/i915: gmch: fix stuck primary plane due to memory self-refresh mode
Hi Daniel, hi Imre, Daniel Vetter writes: Adding Egbert since he's done the original hack here. Imre please keep him on cc. -Daniel I finally managed to get this set of patches tested on the platform that exhibited the intermittent blanking problem when terminating the Xserver. I can confirm that Imre's patches resolve the issue and that g4x_fixup_plane() which I had introduced after extensive experiments is no longer needed to prevent the blanking from happening. If you want I can provide a patch to back this out with the appropriate comments once Imre's patches are in. Cheers, Egbert. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] xf86-video-intel-2.99.912 with dri3 enabled breaks gnome-shell
On 27 June 2014 07:54, Rémi Cardona r...@gentoo.org wrote: Le jeudi 26 juin 2014 à 13:36 +0100, Steven Newbury a écrit : Hi Hans! Have you got to the bottom of this? I see the same thing, and also possibly with GLX contexts under XWayland with gnome-shell. If it's the same thing, it points to the Mesa i965 DRI(3) driver side rather than the DDX driver since it's not involved any more with XWayland. FTR, I disabled DRI3 in our DDX ebuild because of this bug. IIRC, Julien did too in Debian. This is a bug that is to be fixed in Mesa: the GLX_INTEL_swap_event extension was implemented wrongly in DRI2, and DRI3 needs to be fixed to be bug-compatible. I'll push a fix in a while once I test it. Dave.
Re: [Intel-gfx] [PATCH] drm/i915/opregion: ignore firmware requests for backlight change
On 06/25/2014 07:08 PM, Jani Nikula wrote: On Tue, 24 Jun 2014, Aaron Lu aaron...@intel.com wrote: Some Thinkpad laptops' firmware will initiate a backlight level change request through operation region on the events of AC plug/unplug, but since we are not using firmware's interface to do the backlight setting on these affected laptops, we do not want the firmware to use some arbitrary value from its ASL variable to set the backlight level on AC plug/unplug either. I'm curious whether this happens with EFI boot, or only with legacy. Igor, Anton, Are you using legacy boot or UEFI boot? Possible to test the other case? One comment inline, otherwise Will add that in next revision. Acked-by: Jani Nikula jani.nik...@intel.com Thanks for the review! -Aaron for merging through the ACPI tree, as the change is more likely to conflict there. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=76491 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=77091 Reported-and-tested-by: Igor Gnatenko i.gnatenko.br...@gmail.com Reported-and-tested-by: Anton Gubarkov anton.gubar...@gmail.com Signed-off-by: Aaron Lu aaron...@intel.com --- drivers/acpi/video.c | 3 ++- drivers/gpu/drm/i915/intel_opregion.c | 7 +++ include/acpi/video.h | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c index fb9ffe9adc64..cf99d6d2d491 100644 --- a/drivers/acpi/video.c +++ b/drivers/acpi/video.c @@ -241,13 +241,14 @@ static bool acpi_video_use_native_backlight(void) return use_native_backlight_dmi; } -static bool acpi_video_verify_backlight_support(void) +bool acpi_video_verify_backlight_support(void) { if (acpi_osi_is_win8() acpi_video_use_native_backlight() backlight_device_registered(BACKLIGHT_RAW)) return false; return acpi_video_backlight_support(); } +EXPORT_SYMBOL(acpi_video_verify_backlight_support); /* backlight device sysfs support */ static int acpi_video_get_brightness(struct backlight_device *bd) diff --git 
a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c index 2e2c71fcc9ed..02943d93e88e 100644 --- a/drivers/gpu/drm/i915/intel_opregion.c +++ b/drivers/gpu/drm/i915/intel_opregion.c @@ -403,6 +403,13 @@ static u32 asle_set_backlight(struct drm_device *dev, u32 bclp) DRM_DEBUG_DRIVER(bclp = 0x%08x\n, bclp); +/* + * If the acpi_video interface is not supposed to be used, don't + * bother processing backlight level change requests from firmware. + */ +if (!acpi_video_verify_backlight_support()) +return 0; I'd appreciate a DRM_DEBUG_KMS here about what happened. We're bound to wonder about that staring at some dmesg later on! + if (!(bclp ASLE_BCLP_VALID)) return ASLC_BACKLIGHT_FAILED; diff --git a/include/acpi/video.h b/include/acpi/video.h index ea4c7bbded4d..92f8c4bffefb 100644 --- a/include/acpi/video.h +++ b/include/acpi/video.h @@ -22,6 +22,7 @@ extern void acpi_video_unregister(void); extern void acpi_video_unregister_backlight(void); extern int acpi_video_get_edid(struct acpi_device *device, int type, int device_id, void **edid); +extern bool acpi_video_verify_backlight_support(void); #else static inline int acpi_video_register(void) { return 0; } static inline void acpi_video_unregister(void) { return; } @@ -31,6 +32,7 @@ static inline int acpi_video_get_edid(struct acpi_device *device, int type, { return -ENODEV; } +static bool acpi_video_verify_backlight_support() { return false; } #endif #endif -- 1.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [QA] Testing report for `drm-intel-testing` (was: Updated -next) on ww26
Summary We covered the platforms: Broadwell, Baytrail-M, Haswell mobile, HSW desktop, HSW ULT, IvyBridge, SandyBridge, IronLake. In this cycle, 1 new bug was filed. From now on, QA can cover power testing, including power consumption measurement and functional testing. Only the functional part is involved this week.
Test Environment Kernel: (drm-intel-testing) ac710a93740e609759fa75dacdc96f1dfc34b5c5 Merge: 2124f7d efd4b76 Author: Daniel Vetter daniel.vet...@ffwll.ch Date: Thu Jun 19 21:00:32 2014 +0200
Finding New Bugs: 1 bug
80549 https://bugs.freedesktop.org/show_bug.cgi?id=80549 [IVB,HSW] Resuming from S3 causes a Call Trace, with warm boot
New power-related coverage is as follows. All passed.
[PM_BLC] Verify display current backlight value
[PM_BLC] Set and verify the backlight values by increase
[PM_BLC] Set and verify the backlight values by decrease
[PM_BLC] Set and verify the backlight value out of the supported range
[PM_MSR] Verify Memory Self-Refresh is enabled
[PM_MSR] Check Memory Self-Refresh working with different modes
[PM_MSR] Check Memory Self-Refresh working with rotation
Thanks --Sun, Yi
[Intel-gfx] [PATCH 0/8] Execlists prep-work (II)
From: Oscar Mateo oscar.ma...@intel.com These patches contain more refactoring and preparatory work for Execlists [1]. [1] http://lists.freedesktop.org/archives/intel-gfx/2014-June/047138.html
Oscar Mateo (8):
  drm/i915: Extract context backing object allocation
  drm/i915: Rename ctx->obj to ctx->rcs_state
  drm/i915: Rename ctx->is_initialized to ctx->rcs_is_initialized
  drm/i915: Rename ctx->id to ctx->handle
  drm/i915: Extract ringbuffer destroy & generalize alloc to take a ringbuf
  drm/i915: Generalize ring_space to take a ringbuf
  drm/i915: Generalize intel_ring_get_tail to take a ringbuf
  drm/i915: Extract the actual workload submission mechanism from execbuffer

 drivers/gpu/drm/i915/i915_debugfs.c        |   8 +-
 drivers/gpu/drm/i915/i915_drv.h            |  10 +-
 drivers/gpu/drm/i915/i915_gem.c            |   4 +-
 drivers/gpu/drm/i915/i915_gem_context.c    | 132 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 300 -
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  39 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   4 +-
 drivers/gpu/drm/i915/intel_uncore.c        |   2 +-
 8 files changed, 273 insertions(+), 226 deletions(-)
-- 1.9.0
[Intel-gfx] [PATCH 1/8] drm/i915: Extract context backing object allocation
From: Oscar Mateo oscar.ma...@intel.com This is preparatory work for Execlists: we plan to use it later to allocate our own context objects (since Logical Ring Contexts do not have the same kind of backing objects). No functional changes. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/i915_gem_context.c | 54 +
 1 file changed, 35 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 21eda88..ab25368 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -198,6 +198,36 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	kfree(ctx);
 }
 
+static struct drm_i915_gem_object *
+i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
+{
+	struct drm_i915_gem_object *obj;
+	int ret;
+
+	obj = i915_gem_alloc_object(dev, size);
+	if (obj == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Try to make the context utilize L3 as well as LLC.
+	 *
+	 * On VLV we don't have L3 controls in the PTEs so we
+	 * shouldn't touch the cache level, especially as that
+	 * would make the object snooped which might have a
+	 * negative performance impact.
+	 */
+	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev)) {
+		ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC);
+		/* Failure shouldn't ever happen this early */
+		if (WARN_ON(ret)) {
+			drm_gem_object_unreference(&obj->base);
+			return ERR_PTR(ret);
+		}
+	}
+
+	return obj;
+}
+
 static struct i915_hw_ppgtt *
 create_vm_for_ctx(struct drm_device *dev, struct intel_context *ctx)
 {
@@ -234,27 +264,13 @@ __create_hw_context(struct drm_device *dev,
 	list_add_tail(&ctx->link, &dev_priv->context_list);
 
 	if (dev_priv->hw_context_size) {
-		ctx->obj = i915_gem_alloc_object(dev, dev_priv->hw_context_size);
-		if (ctx->obj == NULL) {
-			ret = -ENOMEM;
+		struct drm_i915_gem_object *obj =
+			i915_gem_alloc_context_obj(dev, dev_priv->hw_context_size);
+		if (IS_ERR(obj)) {
+			ret = PTR_ERR(obj);
 			goto err_out;
 		}
-
-		/*
-		 * Try to make the context utilize L3 as well as LLC.
-		 *
-		 * On VLV we don't have L3 controls in the PTEs so we
-		 * shouldn't touch the cache level, especially as that
-		 * would make the object snooped which might have a
-		 * negative performance impact.
-		 */
-		if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev)) {
-			ret = i915_gem_object_set_cache_level(ctx->obj,
-							      I915_CACHE_L3_LLC);
-			/* Failure shouldn't ever happen this early */
-			if (WARN_ON(ret))
-				goto err_out;
-		}
+		ctx->obj = obj;
 	}
 
 	/* Default context will never have a file_priv */
-- 1.9.0
[Intel-gfx] [PATCH 3/8] drm/i915: Rename ctx->is_initialized to ctx->rcs_is_initialized
From: Oscar Mateo oscar.ma...@intel.com We only use this flag to signify that the render state (a.k.a. golden context, a.k.a. null context) has been initialized. It doesn't mean anything for the other engines, so make that distinction obvious. This renaming was suggested by Daniel Vetter. Implemented with this cocci script (plus manual changes to the struct declaration):

@@
struct intel_context c;
@@
- (c).is_initialized
+ c.rcs_is_initialized

@@
struct intel_context *c;
@@
- (c)->is_initialized
+ c->rcs_is_initialized

No functional changes. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 2 +-
 drivers/gpu/drm/i915/i915_drv.h         | 2 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b7bcfd5..d4b8391 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -176,7 +176,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 static void describe_ctx(struct seq_file *m, struct intel_context *ctx)
 {
-	seq_putc(m, ctx->is_initialized ? 'I' : 'i');
+	seq_putc(m, ctx->rcs_is_initialized ? 'I' : 'i');
 	seq_putc(m, ctx->remap_slice ? 'R' : 'r');
 	seq_putc(m, ' ');
 }

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b7c6388..122e942 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -588,7 +588,7 @@ struct i915_ctx_hang_stats {
 struct intel_context {
 	struct kref ref;
 	int id;
-	bool is_initialized;
+	bool rcs_is_initialized;
 	uint8_t remap_slice;
 	struct drm_i915_file_private *file_priv;
 	struct drm_i915_gem_object *rcs_state;

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9cc31c6..b8b9859 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -669,7 +669,7 @@ static int do_switch(struct intel_engine_cs *ring,
 		vma->bind_vma(vma, to->rcs_state->cache_level, GLOBAL_BIND);
 	}
 
-	if (!to->is_initialized || i915_gem_context_is_default(to))
+	if (!to->rcs_is_initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
 
 	ret = mi_set_context(ring, to, hw_flags);
@@ -716,13 +716,13 @@ done:
 	i915_gem_context_reference(to);
 	ring->last_context = to;
 
-	if (ring->id == RCS && !to->is_initialized && from == NULL) {
+	if (ring->id == RCS && !to->rcs_is_initialized && from == NULL) {
 		ret = i915_gem_render_state_init(ring);
 		if (ret)
 			DRM_ERROR("init render state: %d\n", ret);
 	}
 
-	to->is_initialized = true;
+	to->rcs_is_initialized = true;
 
 	return 0;
-- 1.9.0
[Intel-gfx] [PATCH 6/8] drm/i915: Generalize ring_space to take a ringbuf
From: Oscar Mateo oscar.ma...@intel.com It's simple enough that it doesn't need to know anything about the engine. Trivial change. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ffdb366..405edec 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -48,9 +48,8 @@ static inline int __ring_space(int head, int tail, int size)
 	return space;
 }
 
-static inline int ring_space(struct intel_engine_cs *ring)
+static inline int ring_space(struct intel_ringbuffer *ringbuf)
 {
-	struct intel_ringbuffer *ringbuf = ring->buffer;
 	return __ring_space(ringbuf->head & HEAD_ADDR, ringbuf->tail, ringbuf->size);
 }
 
@@ -545,7 +544,7 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	else {
 		ringbuf->head = I915_READ_HEAD(ring);
 		ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		ringbuf->last_retired_head = -1;
 	}
 
@@ -1537,7 +1536,7 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n)
 		ringbuf->head = ringbuf->last_retired_head;
 		ringbuf->last_retired_head = -1;
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n)
 			return 0;
 	}
@@ -1560,7 +1559,7 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n)
 	ringbuf->head = ringbuf->last_retired_head;
 	ringbuf->last_retired_head = -1;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 
 	return 0;
 }
@@ -1589,7 +1588,7 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	trace_i915_ring_wait_begin(ring);
 	do {
 		ringbuf->head = I915_READ_HEAD(ring);
-		ringbuf->space = ring_space(ring);
+		ringbuf->space = ring_space(ringbuf);
 		if (ringbuf->space >= n) {
 			ret = 0;
 			break;
 		}
@@ -1641,7 +1640,7 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
 		iowrite32(MI_NOOP, virt++);
 
 	ringbuf->tail = 0;
-	ringbuf->space = ring_space(ring);
+	ringbuf->space = ring_space(ringbuf);
 
 	return 0;
 }
-- 1.9.0
[Intel-gfx] [PATCH 5/8] drm/i915: Extract ringbuffer destroy & generalize alloc to take a ringbuf
From: Oscar Mateo oscar.ma...@intel.com More prep work: with Execlists, we are going to start creating a lot of extra ringbuffers soon, so these functions are handy. No functional changes. v2: rename allocate/destroy_ring_buffer to alloc/destroy_ringbuffer_obj because the name is more meaningful and to mirror a similar function in the context world: i915_gem_alloc_context_obj(). Change suggested by Brad Volkin. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2faef26..ffdb366 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1380,15 +1380,25 @@ static int init_phys_status_page(struct intel_engine_cs *ring)
 	return 0;
 }
 
-static int allocate_ring_buffer(struct intel_engine_cs *ring)
+static void intel_destroy_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
+{
+	if (!ringbuf->obj)
+		return;
+
+	iounmap(ringbuf->virtual_start);
+	i915_gem_object_ggtt_unpin(ringbuf->obj);
+	drm_gem_object_unreference(&ringbuf->obj->base);
+	ringbuf->obj = NULL;
+}
+
+static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
+				      struct intel_ringbuffer *ringbuf)
 {
-	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_ringbuffer *ringbuf = ring->buffer;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
-	if (intel_ring_initialized(ring))
+	if (ringbuf->obj)
 		return 0;
 
 	obj = NULL;
@@ -1460,7 +1470,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 		goto error;
 	}
 
-	ret = allocate_ring_buffer(ring);
+	ret = intel_alloc_ringbuffer_obj(dev, ringbuf);
 	if (ret) {
 		DRM_ERROR("Failed to allocate ringbuffer %s: %d\n", ring->name, ret);
 		goto error;
 	}
@@ -1501,11 +1511,7 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
 	intel_stop_ring_buffer(ring);
 	WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
-	iounmap(ringbuf->virtual_start);
-
-	i915_gem_object_ggtt_unpin(ringbuf->obj);
-	drm_gem_object_unreference(&ringbuf->obj->base);
-	ringbuf->obj = NULL;
+	intel_destroy_ringbuffer_obj(ringbuf);
 
 	ring->preallocated_lazy_request = NULL;
 	ring->outstanding_lazy_seqno = 0;
-- 1.9.0
[Intel-gfx] [PATCH 7/8] drm/i915: Generalize intel_ring_get_tail to take a ringbuf
From: Oscar Mateo oscar.ma...@intel.com Again, it's low-level enough to simply take a ringbuf and nothing else. Trivial change. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/i915_gem.c         | 4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f6d1238..ac7d50a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2330,7 +2330,7 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	u32 request_ring_position, request_start;
 	int ret;
 
-	request_start = intel_ring_get_tail(ring);
+	request_start = intel_ring_get_tail(ring->buffer);
 	/*
 	 * Emit any outstanding flushes - execbuf can fail to emit the flush
 	 * after having emitted the batchbuffer command. Hence we need to fix
@@ -2351,7 +2351,7 @@ int __i915_add_request(struct intel_engine_cs *ring,
 	 * GPU processing the request, we never over-estimate the
 	 * position of the head.
 	 */
-	request_ring_position = intel_ring_get_tail(ring);
+	request_ring_position = intel_ring_get_tail(ring->buffer);
 
 	ret = ring->add_request(ring);
 	if (ret)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e72017b..070568b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -318,9 +318,9 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev);
 u64 intel_ring_get_active_head(struct intel_engine_cs *ring);
 void intel_ring_setup_status_page(struct intel_engine_cs *ring);
 
-static inline u32 intel_ring_get_tail(struct intel_engine_cs *ring)
+static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf)
 {
-	return ring->buffer->tail;
+	return ringbuf->tail;
 }
 
 static inline u32 intel_ring_get_seqno(struct intel_engine_cs *ring)
-- 1.9.0
[Intel-gfx] [PATCH 8/8] drm/i915: Extract the actual workload submission mechanism from execbuffer
From: Oscar Mateo oscar.ma...@intel.com So that we isolate the legacy ringbuffer submission mechanism, which becomes a good candidate to be abstracted away. This is prep-work for Execlists (which will have its own workload submission mechanism). No functional changes. Signed-off-by: Oscar Mateo oscar.ma...@intel.com
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 298 -
 1 file changed, 162 insertions(+), 136 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c97178e..60998fc 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1026,6 +1026,163 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
 	return 0;
 }
 
+static int
+legacy_ringbuffer_submission(struct drm_device *dev, struct drm_file *file,
+			     struct intel_engine_cs *ring,
+			     struct intel_context *ctx,
+			     struct drm_i915_gem_execbuffer2 *args,
+			     struct list_head *vmas,
+			     struct drm_i915_gem_object *batch_obj,
+			     u64 exec_start, u32 flags)
+{
+	struct drm_clip_rect *cliprects = NULL;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	u64 exec_len;
+	int instp_mode;
+	u32 instp_mask;
+	int i, ret = 0;
+
+	if (args->num_cliprects != 0) {
+		if (ring != &dev_priv->ring[RCS]) {
+			DRM_DEBUG("clip rectangles are only valid with the render ring\n");
+			return -EINVAL;
+		}
+
+		if (INTEL_INFO(dev)->gen >= 5) {
+			DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
+			return -EINVAL;
+		}
+
+		if (args->num_cliprects > UINT_MAX / sizeof(*cliprects)) {
+			DRM_DEBUG("execbuf with %u cliprects\n",
+				  args->num_cliprects);
+			return -EINVAL;
+		}
+
+		cliprects = kcalloc(args->num_cliprects,
+				    sizeof(*cliprects),
+				    GFP_KERNEL);
+		if (cliprects == NULL) {
+			ret = -ENOMEM;
+			goto error;
+		}
+
+		if (copy_from_user(cliprects,
+				   to_user_ptr(args->cliprects_ptr),
+				   sizeof(*cliprects)*args->num_cliprects)) {
+			ret = -EFAULT;
+			goto error;
+		}
+	} else {
+		if (args->DR4 == 0xffffffff) {
+			DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
+			args->DR4 = 0;
+		}
+
+		if (args->DR1 || args->DR4 || args->cliprects_ptr) {
+			DRM_DEBUG("0 cliprects but dirt in cliprects fields\n");
+			return -EINVAL;
+		}
+	}
+
+	ret = i915_gem_execbuffer_move_to_gpu(ring, vmas);
+	if (ret)
+		goto error;
+
+	ret = i915_switch_context(ring, ctx);
+	if (ret)
+		goto error;
+
+	instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
+	instp_mask = I915_EXEC_CONSTANTS_MASK;
+	switch (instp_mode) {
+	case I915_EXEC_CONSTANTS_REL_GENERAL:
+	case I915_EXEC_CONSTANTS_ABSOLUTE:
+	case I915_EXEC_CONSTANTS_REL_SURFACE:
+		if (instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
+			DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
+			ret = -EINVAL;
+			goto error;
+		}
+
+		if (instp_mode != dev_priv->relative_constants_mode) {
+			if (INTEL_INFO(dev)->gen < 4) {
+				DRM_DEBUG("no rel constants on pre-gen4\n");
+				ret = -EINVAL;
+				goto error;
+			}
+
+			if (INTEL_INFO(dev)->gen > 5 &&
+			    instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
+				DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
+				ret = -EINVAL;
+				goto error;
+			}
+
+			/* The HW changed the meaning on this bit on gen6 */
+			if (INTEL_INFO(dev)->gen >= 6)
+				instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
+		}
+		break;
+	default:
+		DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
+		ret = -EINVAL;
+		goto error;
+	}
+
+	if (ring == &dev_priv->ring[RCS] &&
+	    instp_mode !=
Re: [Intel-gfx] [PATCH] drm/i915: fix sanitize_enable_ppgtt for full PPGTT
On Wed, Jun 25, 2014 at 03:45:33PM -0700, Jesse Barnes wrote: Apparently trinary logic is hard. We were falling through all the forced cases and simply enabling aliasing PPGTT or not based on hardware, rather than full PPGTT if available. References: https://bugs.freedesktop.org/show_bug.cgi?id=80083 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a4153ee..86521a7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -69,7 +69,13 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 		return 0;
 	}
 
-	return HAS_ALIASING_PPGTT(dev) ? 1 : 0;
+	/* Fall through to auto-detect */
+	if (HAS_PPGTT(dev))
+		return 2;
+	else if (HAS_ALIASING_PPGTT(dev))
+		return 1;
+
+	return 0;

I don't get it. This would just enable full ppgtt by default. But full ppgtt is still a bit broken so I don't think we want this. The dmesg in the bug shows that ppgtt was forced off by the PCI revision check, so that part seems to have worked. Looks like the test doesn't expect the batches to be accepted by the kernel, but the cmd parser is bypassed when ppgtt is disabled, so the test fails. -- Ville Syrjälä Intel OTC
Re: [Intel-gfx] [PATCH] drm/i915: fix sanitize_enable_ppgtt for full PPGTT
On Thu, 26 Jun 2014 17:59:48 +0300 Ville Syrjälä ville.syrj...@linux.intel.com wrote: On Wed, Jun 25, 2014 at 03:45:33PM -0700, Jesse Barnes wrote: Apparently trinary logic is hard. We were falling through all the forced cases and simply enabling aliasing PPGTT or not based on hardware, rather than full PPGTT if available. References: https://bugs.freedesktop.org/show_bug.cgi?id=80083 Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index a4153ee..86521a7 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -69,7 +69,13 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt) return 0; } - return HAS_ALIASING_PPGTT(dev) ? 1 : 0; + /* Fall through to auto-detect */ + if (HAS_PPGTT(dev)) + return 2; + else if (HAS_ALIASING_PPGTT(dev)) + return 1; + + return 0; I don't get it. This would just enable full ppgtt by default. But full ppgtt is still a bit broken so I don't think we want this. The dmesg in the bug shows that ppgtt was forced off by the PCI revision check, so that part seems to have worked. Looks like the tests doesn't expect the batches to be accepted by the kernel, but the cmd parser is bypassed when ppgtt is disabled, so the test fails. Yeah and I was testing on a platform that had PPGTT support, and figured I had screwed up the PPGTT enable. But I guess not, and we can expect these tests to fail until we have the command parser enabled everywhere. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] gem_exec_parse: require PPGTT as well
The command parser may be present, but not active, so check for PPGTT before allowing this test to run. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 tests/gem_exec_parse.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index e78a192..f7376e3 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -198,7 +198,7 @@ int fd;
 igt_main
 {
 	igt_fixture {
-		int parser_version = 0;
+		int parser_version = 0, has_ppgtt = 0;
 		drm_i915_getparam_t gp;
 		int rc;
@@ -209,6 +209,11 @@ igt_main
 		rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
 		igt_require(!rc && parser_version > 0);
 
+		gp.param = I915_PARAM_HAS_ALIASING_PPGTT;
+		gp.value = &has_ppgtt;
+		rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+		igt_require(!rc && has_ppgtt > 0);
+
 		handle = gem_create(fd, 4096);
 
 		/* ATM cmd parser only exists on gen7. */
-- 1.9.1
Re: [Intel-gfx] [PATCH] gem_exec_parse: require PPGTT as well
On Thu, Jun 26, 2014 at 08:48:46AM -0700, Jesse Barnes wrote: The command parser may be present, but not active, so check for PPGTT before allowing this test to run. Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 tests/gem_exec_parse.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/gem_exec_parse.c b/tests/gem_exec_parse.c
index e78a192..f7376e3 100644
--- a/tests/gem_exec_parse.c
+++ b/tests/gem_exec_parse.c
@@ -198,7 +198,7 @@ int fd;
 igt_main
 {
 	igt_fixture {
-		int parser_version = 0;
+		int parser_version = 0, has_ppgtt = 0;
 		drm_i915_getparam_t gp;
 		int rc;
@@ -209,6 +209,11 @@ igt_main
 		rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
 		igt_require(!rc && parser_version > 0);
 
+		gp.param = I915_PARAM_HAS_ALIASING_PPGTT;
+		gp.value = &has_ppgtt;
+		rc = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+		igt_require(!rc && has_ppgtt > 0);
+

You could also shorten it to igt_require(gem_uses_aliasing_ppgtt(fd)); if you like. And sorry, I should have added that check in the first place. Brad

 		handle = gem_create(fd, 4096);
 
 		/* ATM cmd parser only exists on gen7. */
-- 1.9.1