Re: [Intel-gfx] [PATCH 4/9] drm/i915: Delete unnecessary braces in three functions
On Thu, 04 May 2017, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Thu, 4 May 2017 13:40:53 +0200
>
> Do not use curly brackets at some source code places
> where a single statement should be sufficient.

We tend to make this kind of change only when we're changing the
surrounding code anyway. I'm sure there are plenty of places where you
could add or remove braces, but it's not productive to go around
changing just them.

BR,
Jani.

>
> Signed-off-by: Markus Elfring
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 19 ++++++++-----------
> 1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index 296108464f2b..bf9a2e8d8c16 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -565,13 +565,13 @@ static int i915_gem_pageflip_info(struct seq_file *m,
> void *data)
> u32 addr;
>
> pending = atomic_read(&work->pending);
> - if (pending) {
> + if (pending)
> seq_printf(m, "Flip ioctl preparing on pipe %c
> (plane %c)\n",
> pipe, plane);
> - } else {
> + else
> seq_printf(m, "Flip pending (waiting for vsync)
> on pipe %c (plane %c)\n",
> pipe, plane);
> - }
> +
> if (work->flip_queued_req) {
> struct intel_engine_cs *engine =
> work->flip_queued_req->engine;
>
> @@ -3130,13 +3130,11 @@ static void intel_plane_info(struct seq_file *m,
> struct intel_crtc *intel_crtc)
> }
>
> state = plane->state;
> -
> - if (state->fb) {
> + if (state->fb)
> drm_get_format_name(state->fb->format->format,
> &format_name);
> - } else {
> + else
> sprintf(format_name.str, "N/A");
> - }
>
> seq_printf(m, "\t--Plane id %d: type=%s, crtc_pos=%4dx%4d,
> crtc_size=%4dx%4d, src_pos=%d.%04ux%d.%04u, src_size=%d.%04ux%d.%04u,
> format=%s, rotation=%s\n",
> plane->base.id,
> @@ -4636,13 +4634,12 @@ static int i915_sseu_status(struct seq_file *m, void
> *unused)
>
> intel_runtime_pm_get(dev_priv);
>
> - if (IS_CHERRYVIEW(dev_priv)) {
> + if (IS_CHERRYVIEW(dev_priv))
> cherryview_sseu_device_status(dev_priv, );
> - } else if (IS_BROADWELL(dev_priv)) {
> + else if (IS_BROADWELL(dev_priv))
> broadwell_sseu_device_status(dev_priv, );
> - } else if (INTEL_GEN(dev_priv) >= 9) {
> + else if (INTEL_GEN(dev_priv) >= 9)
> gen9_sseu_device_status(dev_priv, );
> - }
>
> intel_runtime_pm_put(dev_priv);

--
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()
On Thu, 04 May 2017, Chris Wilson wrote:
> On Thu, May 04, 2017 at 06:54:16PM +0200, SF Markus Elfring wrote:
>> From: Markus Elfring
>> Date: Thu, 4 May 2017 13:20:47 +0200
>>
>> Some strings which did not contain data format specifications should be put
>> into a sequence. Thus use the corresponding function "seq_puts".
>
> debugfs / seq_file is not performance critical. Familiar idiomatic code is
> much preferred over continually switching between seq_printf and seq_puts.
>
> And don't even start on converting seq_printf / seq_puts to seq_putc...

Agreed. I don't want any of the seq_* changes in this series.

BR,
Jani.

--
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [PATCH 6/9] drm/i915: Add spaces for better code readability
On Thu, 04 May 2017, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Thu, 4 May 2017 14:04:38 +0200
>
> Use space characters at some source code places according to
> the Linux coding style convention.

LGTM. Frankly, the only concern I have with accepting this patch is that
it encourages you and others to submit more patches like this.
Generally, we make this kind of change only when touching the nearby
code for some real changes.

BR,
Jani.

>
> Signed-off-by: Markus Elfring
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index d9c699d7245e..6f3119d40c50 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2358,7 +2358,7 @@ static int i915_llc(struct seq_file *m, void *data)
>
> seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev_priv)));
> seq_printf(m, "%s: %lluMB\n", edram ? "eDRAM" : "eLLC",
> -intel_uncore_edram_size(dev_priv)/1024/1024);
> +intel_uncore_edram_size(dev_priv) / 1024 / 1024);
>
> return 0;
> }
> @@ -4502,7 +4502,7 @@ static void gen9_sseu_device_status(struct
> drm_i915_private *dev_priv,
> {
> int s_max = 3, ss_max = 4;
> int s, ss;
> - u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
> + u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
>
> /* BXT has a single slice and at most 3 subslices. */
> if (IS_GEN9_LP(dev_priv)) {
> @@ -4512,8 +4512,8 @@ static void gen9_sseu_device_status(struct
> drm_i915_private *dev_priv,
>
> for (s = 0; s < s_max; s++) {
> s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
> - eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
> - eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
> + eu_reg[2 * s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
> + eu_reg[2 * s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
> }
>
> eu_mask[0] = GEN9_PGCTL_SSA_EU08_ACK |
> @@ -4547,8 +4547,8 @@ static void gen9_sseu_device_status(struct
> drm_i915_private *dev_priv,
> sseu->subslice_mask |= BIT(ss);
> }
>
> - eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
> -eu_mask[ss%2]);
> + eu_cnt = 2 * hweight32(eu_reg[2 * s + ss / 2] &
> +eu_mask[ss % 2]);
> sseu->eu_total += eu_cnt;
> sseu->eu_per_subslice = max_t(unsigned int,
> sseu->eu_per_subslice,

--
Jani Nikula, Intel Open Source Technology Center
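The index arithmetic touched by the last hunk of the spacing patch is easier to follow with concrete numbers. Below is a minimal userspace sketch, assuming only what the quoted hunk shows: slice s owns registers 2 * s and 2 * s + 1, each register covers two subslices, and ss % 2 selects one of two masks. The helper names are hypothetical, and popcount32() merely stands in for the kernel's hweight32().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's hweight32(): number of set bits in v. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Register index for subslice ss of slice s: two registers per slice,
 * two subslices per register (hypothetical helper). */
static unsigned int eu_reg_index(unsigned int s, unsigned int ss)
{
	return 2 * s + ss / 2;
}

/* Which of the two masks applies within that register. */
static unsigned int eu_mask_index(unsigned int ss)
{
	return ss % 2;
}

/* The quoted hunk doubles the bit count; this sketch does the same. */
static unsigned int eu_count(const uint32_t *eu_reg, const uint32_t *eu_mask,
			     unsigned int s, unsigned int ss)
{
	assert(eu_reg && eu_mask);
	return 2 * popcount32(eu_reg[eu_reg_index(s, ss)] &
			      eu_mask[eu_mask_index(ss)]);
}
```

For slice 1, subslice 2, the lookup lands on register 3 and mask 0; subslice 3 of the same slice reads the same register through mask 1. The added spaces change none of this, which is why the patch is purely cosmetic.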
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Adjust seven checks for null pointers
On Thu, 04 May 2017, SF Markus Elfringwrote: > From: Markus Elfring > Date: Thu, 4 May 2017 13:52:19 +0200 > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > The script “checkpatch.pl” pointed information out like the following. > > Comparison to NULL could be written … Could be written one way or the other. We have and accept both. Sometimes explicit comparison with NULL is preferred, depending on judgement, not based on what a tool says. BR, Jani. > > Thus fix affected source code places. > > Signed-off-by: Markus Elfring > --- > drivers/gpu/drm/i915/i915_debugfs.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > b/drivers/gpu/drm/i915/i915_debugfs.c > index bf9a2e8d8c16..d9c699d7245e 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -242,7 +242,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, > void *data) > if (count == total) > break; > > - if (obj->stolen == NULL) > + if (!obj->stolen) > continue; > > objects[count++] = obj; > @@ -254,7 +254,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, > void *data) > if (count == total) > break; > > - if (obj->stolen == NULL) > + if (!obj->stolen) > continue; > > objects[count++] = obj; > @@ -557,7 +557,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, > void *data) > > spin_lock_irq(>event_lock); > work = crtc->flip_work; > - if (work == NULL) { > + if (!work) { > seq_printf(m, "No flip due on pipe %c (plane %c)\n", > pipe, plane); > } else { > @@ -3717,7 +3717,7 @@ static ssize_t > i915_displayport_test_active_write(struct file *file, > continue; > > if (connector->status == connector_status_connected && > - connector->encoder != NULL) { > + connector->encoder) { > intel_dp = enc_to_intel_dp(connector->encoder); > status = kstrtoint(input_buffer, 10, ); > if (status < 0) > @@ -3756,7 +3756,7 @@ static int 
i915_displayport_test_active_show(struct > seq_file *m, void *data) > continue; > > if (connector->status == connector_status_connected && > - connector->encoder != NULL) { > + connector->encoder) { > intel_dp = enc_to_intel_dp(connector->encoder); > seq_putc(m, >intel_dp->compliance.test_active ? '1' : '0'); > @@ -3801,7 +3801,7 @@ static int i915_displayport_test_data_show(struct > seq_file *m, void *data) > continue; > > if (connector->status == connector_status_connected && > - connector->encoder != NULL) { > + connector->encoder) { > intel_dp = enc_to_intel_dp(connector->encoder); > if (intel_dp->compliance.test_type == > DP_TEST_LINK_EDID_READ) > @@ -3855,7 +3855,7 @@ static int i915_displayport_test_type_show(struct > seq_file *m, void *data) > continue; > > if (connector->status == connector_status_connected && > - connector->encoder != NULL) { > + connector->encoder) { > intel_dp = enc_to_intel_dp(connector->encoder); > seq_printf(m, "%02lx", intel_dp->compliance.test_type); > } else { -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
On Thu, May 04, 2017 at 09:12:32PM +0100, Chris Wilson wrote:
> On Thu, May 04, 2017 at 06:59:23PM +0200, SF Markus Elfring wrote:
> > From: Markus Elfring
> > Date: Thu, 4 May 2017 14:15:00 +0200
> >
> > The script "checkpatch.pl" pointed information out like the following.
> >
> > WARNING: quoted string split across lines
> >
> > Thus fix the affected source code place.
> >
> > Signed-off-by: Markus Elfring
> > ---
> > drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 6f3119d40c50..dbd52ea89fb4 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
> >
> > forcewake_count =
> > READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
> > if (forcewake_count) {
> > - seq_puts(m, "RC information inaccurate because somebody "
> > - "holds a forcewake reference \n");
> > + seq_puts(m,
> > +"RC information inaccurate because somebody holds a
> > forcewake reference.\n");
>
> And now you break the 80col rule. Blind adherence to checkpatch is
> impossible.
> -Chris

No. Checkpatch allows you to go over 80 characters to avoid splitting a
string.

regards,
dan carpenter
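For readers following the checkpatch argument: C concatenates adjacent string literals at translation time, so the split and joined spellings produce the same bytes, and the warning is only about being able to grep for the whole message. A small sketch to illustrate; the function names are hypothetical and the message is a simplified version of the one in the patch.

```c
#include <string.h>

/* Message written as two adjacent literals, as in the pre-patch style. */
static const char *msg_split(void)
{
	return "RC information inaccurate because somebody "
	       "holds a forcewake reference\n";
}

/* Same message as a single literal, which one grep for the full text finds. */
static const char *msg_joined(void)
{
	return "RC information inaccurate because somebody holds a forcewake reference\n";
}

/* Returns 1 when both spellings yield byte-identical strings. */
static int same_message(void)
{
	return strcmp(msg_split(), msg_joined()) == 0;
}
```

Since the compiled output is identical, the choice between the two is purely a source-level trade-off between line length and searchability, which is what the thread is arguing about.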
Re: [Intel-gfx] [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces
On Thu, 2017-05-04 at 21:25 +0100, Chris Wilson wrote:
> On Thu, May 04, 2017 at 11:45:48AM -0700, Laura Abbott wrote:
> >
> > Enable the GEM dma-buf import interfaces in addition to the export
> > interfaces. This lets vgem be used as a test source for other allocators
> > (e.g. Ion).
> >
> > Reviewed-by: Chris Wilson
> > Signed-off-by: Laura Abbott
> > ---
> > v4: Use new drm_gem_prime_import_dev function
> > ---
> > static const struct vm_operations_struct vgem_gem_vm_ops = {
> > @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev,
> > struct drm_file *file)
> > kfree(vfile);
> > }
> >
> > -/* ioctls */
> > -
> > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > - struct drm_file *file,
> > - unsigned int *handle,
> > - unsigned long size)
> > +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device
> > *dev,
> > + unsigned long size)
>
> I'm going to guess that doesn't line up anymore. If checkpatch isn't
> complaining, then sorry for the noise.

Because of the very long identifiers, perhaps a nicer way to write this
is something like:

static struct drm_vgem_gem_object *
__vgem_gem_create(struct drm_device *dev, unsigned long size);

> > +static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > + struct drm_file *file,
> > + unsigned int *handle,
> > + unsigned long size)
>
> Ditto.

etc...
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On 5/4/2017 9:51 AM, Kenneth Graunke wrote:
> MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace driver
> produced by Intel engineers, one which Intel has the full capability to
> change. What you're saying is that Intel's MediaSDK engineers are unwilling
> to change their software to provide better performance for their Linux
> users. That's pretty mental.

You are mistaken. Media SDK is not a driver. It is a user space library
which talks to the user space driver. And Media SDK does not set _any_
of the caching policies you are discussing here. It is the driver that
sets these policies. I don't want to go further into who supports this
driver, Intel or not, but you are blaming Media SDK engineers for being
unwilling to do something when they are only indirectly related to this
topic. Please, if you mean the driver, say the driver.
Re: [Intel-gfx] [PATCH RESEND] drm: i915: Don't try detecting sinks on ports already in use
On Thu, May 04, 2017 at 05:36:49PM -0300, Gabriel Krisman Bertazi wrote:
> On systems where more than one connector is attached to the same port,
> the HPD pin is also shared, and attaching one connector will trigger a
> hotplug on every other connector on that port. But, according to the
> documentation, connectors sharing the port cannot be enabled
> simultaneously, such that we can abort the detection process early if
> another connector was detected succesfully.
>
> This has the good side effect of preventing DP timeouts whenever
> something else is connected to the shared port, like below:
>
> [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7143003f
> [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -110
>
> Since this reduces the overhead of the i915_hotplug_work_func, it may
> address the following vblank misses detected by the CI:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100215
> https://bugs.freedesktop.org/show_bug.cgi?id=100558
>
> Signed-off-by: Gabriel Krisman Bertazi
> ---
> drivers/gpu/drm/i915/intel_display.c | 26 ++++++++++++++++++++++++++
> drivers/gpu/drm/i915/intel_dp.c | 3 +++
> drivers/gpu/drm/i915/intel_drv.h | 1 +
> drivers/gpu/drm/i915/intel_hdmi.c | 3 +++
> 4 files changed, 33 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 85b9e2f521a0..618b5138c0c7 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11270,6 +11270,32 @@ static bool check_digital_port_conflicts(struct
> drm_atomic_state *state)
> return true;
> }
>
> +bool intel_shared_digital_port_in_use(struct drm_connector *conn)
> +{
> + struct drm_connector *peer;
> + struct drm_connector_list_iter iter;
> + struct intel_encoder *enc = to_intel_connector(conn)->encoder;
> + int ret = false;
> +
> + drm_connector_list_iter_begin(conn->dev, &iter);
> + drm_for_each_connector_iter(peer, &iter) {
> + struct intel_encoder *peer_enc;
> +
> + if (peer == conn ||
> + peer->status != connector_status_connected)
> + continue;

So here you are trying to find another connector in the list that is
not the same as the passed connector but is connected, right?

> +
> + peer_enc = to_intel_connector(peer)->encoder;
> + if (peer_enc->port == enc->port) {
> + ret = true;
> + break;

And the intention of this check is to see whether the connector that
just got hotplugged is on the same port as one that already has a
connector connected?
How does this handle the case where displays are cloned? I guess it
would be the same encoder but still different ports?

Manasi

> + }
> + }
> + drm_connector_list_iter_end(&iter);
> +
> + return ret;
> +}
> +
> static void
> clear_intel_crtc_state(struct intel_crtc_state *crtc_state)
> {
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 08834f74d396..0823c588575f 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -4628,6 +4628,9 @@ intel_dp_long_pulse(struct intel_connector
> *intel_connector)
>
>
> WARN_ON(!drm_modeset_is_locked(>dev->mode_config.connection_mutex));
>
> + if (intel_shared_digital_port_in_use(connector))
> + return connector_status_disconnected;
> +
> intel_display_power_get(to_i915(dev), intel_dp->aux_power_domain);
>
> /* Can't disconnect eDP, but you can close the lid... */
> diff --git a/drivers/gpu/drm/i915/intel_drv.h
> b/drivers/gpu/drm/i915/intel_drv.h
> index 54f3ff840812..fd5f2517bede 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1411,6 +1411,7 @@ int vlv_force_pll_on(struct drm_i915_private *dev_priv,
> enum pipe pipe,
> const struct dpll *dpll);
> void vlv_force_pll_off(struct drm_i915_private *dev_priv, enum pipe pipe);
> int lpt_get_iclkip(struct drm_i915_private *dev_priv);
> +bool intel_shared_digital_port_in_use(struct drm_connector *conn);
>
> /* modesetting asserts */
> void assert_panel_unlocked(struct drm_i915_private *dev_priv,
> diff --git a/drivers/gpu/drm/i915/intel_hdmi.c
> b/drivers/gpu/drm/i915/intel_hdmi.c
> index 52f0b2d5fad2..9ce0b1f45cde 100644
> --- a/drivers/gpu/drm/i915/intel_hdmi.c
> +++ b/drivers/gpu/drm/i915/intel_hdmi.c
> @@ -1532,6 +1532,9 @@ intel_hdmi_detect(struct drm_connector *connector, bool
> force)
> DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
> connector->base.id, connector->name);
>
> + if (intel_shared_digital_port_in_use(connector))
> + return connector_status_disconnected;;
> +
> intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
>
> intel_hdmi_unset_edid(connector);
> --
> 2.11.0
>
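The peer scan that the review comments ask about can be restated outside the driver. The following is a simplified userspace sketch, not the patch itself: struct conn, the status enum and shared_port_in_use() are hypothetical stand-ins that keep only a port id and a connection status per connector.

```c
#include <stdbool.h>
#include <stddef.h>

enum conn_status { CONN_DISCONNECTED, CONN_CONNECTED };

/* Hypothetical, simplified connector: just a port id and a status. */
struct conn {
	int port;
	enum conn_status status;
};

/* True if some *other* connector that is currently connected sits on the
 * same port as conn. */
static bool shared_port_in_use(const struct conn *list, size_t n,
			       const struct conn *conn)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct conn *peer = &list[i];

		/* Skip the connector being probed and any idle peers. */
		if (peer == conn || peer->status != CONN_CONNECTED)
			continue;
		if (peer->port == conn->port)
			return true;
	}
	return false;
}
```

With a connected connector and a disconnected one on port 1, probing the disconnected one reports the port as busy, which is the early-exit case the patch is after. The sketch says nothing about cloned displays or about a race between unplugging one connector and plugging another; those questions remain open in the thread.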
Re: [Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load
On Thu, May 04, 2017 at 09:26:35PM +, Srivatsa, Anusha wrote:
> >+void i915_guc_load_error_log_capture(struct drm_i915_private *i915) {
> >+void *log, *buf;
> >+struct i915_vma *vma = i915->guc.log.vma;
> >+
> >+if (i915->gpu_error.guc_load_fail_log || !vma)
> >+return;
> >+
> >+/*
> >+ * the vma should be already pinned and mapped for log runtime
> >+ * management but let's play safe
> >+ */
> >+log = i915_gem_object_pin_map(vma->obj, I915_MAP_WC);
> >+if (IS_ERR(log)) {
> >+DRM_ERROR("Failed to pin guc_log vma\n");
> >+return;
> >+}
> >+
> >+buf = kzalloc(GUC_LOG_SIZE, GFP_KERNEL);
> >+if (buf) {
> >+memcpy(buf, log, GUC_LOG_SIZE);
> >+i915->gpu_error.guc_load_fail_log = buf;
> >+} else {
> >+DRM_ERROR("Failed to copy guc log\n");
> >+}
> >+
> >+i915_gem_object_unpin_map(vma->obj);

You are trading a swappable object for unswappable kernel memory. If you
want to have the guc log after guc is disabled, just keep the log object
around.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load
>-Original Message- >From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of >Daniele Ceraolo Spurio >Sent: Thursday, May 4, 2017 11:52 AM >To: intel-gfx@lists.freedesktop.org >Subject: [Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load > >We're currently deleting the GuC logs if the FW fails to load, but those are >still >useful to understand why the loading failed. Instead of deleting them, taking a >snapshot allows us to access them after driver load is completed. Hi Daniele, I like the idea. But just to confirm, we are still going to get the status of fetch and load-like PENDING or FAIL, but the reason of failure is going to be in the debugfs. Correct? Anusha >Cc: Oscar Mateo>Cc: Michal Wajdeczko >Signed-off-by: Daniele Ceraolo Spurio >--- > drivers/gpu/drm/i915/i915_debugfs.c | 36 --- > drivers/gpu/drm/i915/i915_drv.c | 3 +++ > drivers/gpu/drm/i915/i915_drv.h | 6 ++ > drivers/gpu/drm/i915/i915_gpu_error.c | 36 >+++ > drivers/gpu/drm/i915/intel_guc_fwif.h | 14 +++--- >drivers/gpu/drm/i915/intel_guc_log.c | 10 ++ > drivers/gpu/drm/i915/intel_uc.c | 7 +-- > 7 files changed, 84 insertions(+), 28 deletions(-) > >diff --git a/drivers/gpu/drm/i915/i915_debugfs.c >b/drivers/gpu/drm/i915/i915_debugfs.c >index 870c470..4ff20fc 100644 >--- a/drivers/gpu/drm/i915/i915_debugfs.c >+++ b/drivers/gpu/drm/i915/i915_debugfs.c >@@ -2543,26 +2543,32 @@ static int i915_guc_info(struct seq_file *m, void >*data) static int i915_guc_log_dump(struct seq_file *m, void *data) { > struct drm_i915_private *dev_priv = node_to_i915(m->private); >- struct drm_i915_gem_object *obj; >- int i = 0, pg; >- >- if (!dev_priv->guc.log.vma) >+ u32 *log; >+ int i = 0; >+ >+ if (dev_priv->guc.log.vma) { >+ log = i915_gem_object_pin_map(dev_priv->guc.log.vma->obj, >+I915_MAP_WC); >+ if (IS_ERR(log)) { >+ DRM_ERROR("Failed to pin guc_log vma\n"); >+ return -ENOMEM; >+ } >+ } else if (dev_priv->gpu_error.guc_load_fail_log) { >+ log = 
dev_priv->gpu_error.guc_load_fail_log; >+ } else { > return 0; >- >- obj = dev_priv->guc.log.vma->obj; >- for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) { >- u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg)); >- >- for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4) >- seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", >- *(log + i), *(log + i + 1), >- *(log + i + 2), *(log + i + 3)); >- >- kunmap_atomic(log); > } > >+ for (i = 0; i < GUC_LOG_SIZE / sizeof(u32); i += 4) >+ seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", >+ *(log + i), *(log + i + 1), >+ *(log + i + 2), *(log + i + 3)); >+ > seq_putc(m, '\n'); > >+ if (dev_priv->guc.log.vma) >+ i915_gem_object_unpin_map(dev_priv->guc.log.vma->obj); >+ > return 0; > } > >diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c >index 452c265..c7cb36c 100644 >--- a/drivers/gpu/drm/i915/i915_drv.c >+++ b/drivers/gpu/drm/i915/i915_drv.c >@@ -1354,6 +1354,9 @@ void i915_driver_unload(struct drm_device *dev) > cancel_delayed_work_sync(_priv->gpu_error.hangcheck_work); > i915_reset_error_state(dev_priv); > >+ /* release GuC error log (if any) */ >+ i915_guc_load_error_log_free(dev_priv); >+ > /* Flush any outstanding unpin_work. */ > drain_workqueue(dev_priv->wq); > >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h >index 4588b3e..761c663 100644 >--- a/drivers/gpu/drm/i915/i915_drv.h >+++ b/drivers/gpu/drm/i915/i915_drv.h >@@ -1555,6 +1555,9 @@ struct i915_gpu_error { > /* Protected by the above dev->gpu_error.lock. 
*/ > struct i915_gpu_state *first_error; > >+ /* Log snapshot if GuC errors during load */ >+ void *guc_load_fail_log; >+ > unsigned long missed_irq_rings; > > /** >@@ -3687,6 +3690,9 @@ static inline void i915_reset_error_state(struct >drm_i915_private *i915) > > #endif > >+void i915_guc_load_error_log_capture(struct drm_i915_private *i915); >+void i915_guc_load_error_log_free(struct drm_i915_private *i915); >+ > const char *i915_cache_level_str(struct drm_i915_private *i915, int type); > > /* i915_cmd_parser.c */ >diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c >b/drivers/gpu/drm/i915/i915_gpu_error.c >index ec526d9..44a873b 100644 >--- a/drivers/gpu/drm/i915/i915_gpu_error.c >+++ b/drivers/gpu/drm/i915/i915_gpu_error.c >@@ -1809,3 +1809,39 @@ void
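The rewritten dump loop in the quoted i915_guc_log_dump() prints the log as rows of four 32-bit words. A userspace sketch of that formatting follows, with snprintf() standing in for seq_printf(); the helper names are hypothetical and the buffer size is passed in rather than taken from GUC_LOG_SIZE.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one row of four 32-bit words starting at index i.
 * Returns what snprintf() returns: the row length without the NUL. */
static int format_log_row(char *out, size_t len, const uint32_t *log, size_t i)
{
	return snprintf(out, len, "0x%08x 0x%08x 0x%08x 0x%08x\n",
			(unsigned int)log[i], (unsigned int)log[i + 1],
			(unsigned int)log[i + 2], (unsigned int)log[i + 3]);
}

/* Number of four-word rows in a buffer of size_bytes bytes. */
static size_t log_rows(size_t size_bytes)
{
	return size_bytes / sizeof(uint32_t) / 4;
}
```

Each row is 44 characters long including the newline, so the size of the debugfs output scales directly with the log buffer size. Because the loop walks a single linear mapping, the same code can read either the pinned log object or a saved snapshot.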
Re: [Intel-gfx] i915 4.9 regression: DP AUX CH sanitization no longer working on Asus desktops
On Thu, May 04, 2017 at 02:52:09PM -0600, Daniel Drake wrote:
> On Thu, May 4, 2017 at 2:37 PM, Ville Syrjälä wrote:
> > Please check if commit bb1d132935c2 ("drm/i915/vbt: split out defaults
> > that are set when there is no VBT") fixes things for you.
>
> I think this is not going to help. This would only make a difference
> when there is no VBT at all at which point we would see this message
> in the logs:
>
> DRM_INFO("Failed to find VBIOS tables (VBT)\n");
>
> but in this case we have a VBT for ports B, C and E.
>
> [drm:intel_bios_init [i915]] Port B VBT info: DP:1 HDMI:1 DVI:1 EDP:0 CRT:0
> [drm:intel_bios_init [i915]] VBT HDMI level shift for port B: 8
> [drm:intel_bios_init [i915]] Port C VBT info: DP:0 HDMI:1 DVI:1 EDP:0 CRT:0
> [drm:intel_bios_init [i915]] VBT HDMI level shift for port C: 8
> [drm:intel_bios_init [i915]] Port E VBT info: DP:1 HDMI:0 DVI:0 EDP:0 CRT:0
> [drm:intel_bios_init [i915]] VBT HDMI level shift for port E: 0
>
> Let me know if I'm missing something and we will test it anyway
>

I think that without the "split out defaults that are set when there is
no VBT" patch, what happens is that DP gets enabled on Port A by default
and then, since there is a DP-VGA adapter on Port E, DP is also enabled
on Port E. Since both of these ports use the same AUX channel, this
causes the AUX CH sanitization issues. At least that's my guess at
triage.

So with correct VBT parsing, it should not enable any AUX transactions
on Port A and should detect that no child devices are connected to
Port A. The AUX CH should then only be used for Port E, and VGA output
should work fine.

Manasi

> Thanks
> Daniel
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
Chris Wilson writes:

> On Thu, May 04, 2017 at 10:56:54AM -0700, Francisco Jerez wrote:
>> David Weinehall writes:
>>
>> > On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
>> >> A good default for garbage entries from the user is to follow the
>> >> default setting of the object (i.e. the PTE). Currently they use the
>> >> uncached entry, and now the only way to accidentally hit uncached
>> >> performance is via explicit use of the uncached MOCS or setting the
>> >> object to uncached. Note that these entries are currently undefined in
>> >> the ABI and we reserve the right to change them. We originally chose
>> >> uncached to eliminate any problem with reducing the caching level in
>> >> future, but the object is a much better definition of the minimum
>> >> caching level.
>> >>
>>
>> NAK. The reason for the default being UC is that it's the only setting
>> that guarantees full forwards compatibility with any other entry that
>> might be added in the future. If you default to PTE on (e)LLC and WB on
>> L3, userspace will no longer be able to use any newly introduced entry
>> with stricter coherency guarantees than that (e.g. any L3-uncached
>> entry) in a backwards-compatible way. Attempting to do so may break
>> memory coherency assumptions of the application and lead to misrendering
>> when run on older kernel versions (which to my judgment is a scarier
>> failure mode than reduced performance).
>
> You can't use a weaker coherency model in mocs than that specified for
> the object as you can't control other uses of the object (even just
> memory pressure will break your assumptions).

Exactly, but you can use a stronger coherency model than the application
requested, which is why falling back to UC should generally work for
unknown entries but falling back to PTE+WB isn't guaranteed to.

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH RESEND] drm: i915: Don't try detecting sinks on ports already in use
On Thu, May 04, 2017 at 05:36:49PM -0300, Gabriel Krisman Bertazi wrote:
> On systems where more than one connector is attached to the same port,
> the HPD pin is also shared, and attaching one connector will trigger a
> hotplug on every other connector on that port. But, according to the
> documentation, connectors sharing the port cannot be enabled
> simultaneously, such that we can abort the detection process early if
> another connector was detected succesfully.
>
> This has the good side effect of preventing DP timeouts whenever
> something else is connected to the shared port, like below:
>
> [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7143003f
> [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -110
>
> Since this reduces the overhead of the i915_hotplug_work_func, it may
> address the following vblank misses detected by the CI:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100215
> https://bugs.freedesktop.org/show_bug.cgi?id=100558
>
> Signed-off-by: Gabriel Krisman Bertazi

The key problem here is, say, a race between DP unplug and HDMI plug,
and users are evil enough (or common enough) for it to happen.

I thought the idea was reasonable though, and perhaps we could make more
use of the knowledge of the shared ports to improve detection of common
DP/HDMI DDIs (i.e. run detection once for a ddi and have it decide
whether it is hdmi or dp).
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
On Thu, May 04, 2017 at 10:48:10PM +0200, SF Markus Elfring wrote:
> >> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >> @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
> >>
> >>forcewake_count =
> >> READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
> >>if (forcewake_count) {
> >> - seq_puts(m, "RC information inaccurate because somebody "
> >> - "holds a forcewake reference \n");
> >> + seq_puts(m,
> >> + "RC information inaccurate because somebody holds a
> >> forcewake reference.\n");
> >
> > And now you break the 80col rule. Blind adherence to checkpatch is
> > impossible.
>
> Have you got any other coding style preferences around the grepping
> of longer message strings from such source code?

I personally use long strings (because they are less hassle to write),
except when they are ridiculously long. But checkpatch complains either
way, so checkpatch itself is not a reason to make a change. Certainly
grepping for a complete seq_printf() is unlikely (i.e. you had to open
the debugfs file to see it, so you must already know where to look in
the code).
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] ✓ Fi.CI.BAT: success for drm: i915: Don't try detecting sinks on ports already in use (rev2)
== Series Details ==

Series: drm: i915: Don't try detecting sinks on ports already in use (rev2)
URL   : https://patchwork.freedesktop.org/series/23299/
State : success

== Summary ==

Series 23299v2 drm: i915: Don't try detecting sinks on ports already in use
https://patchwork.freedesktop.org/api/1.0/series/23299/revisions/2/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:422s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:426s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:570s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:500s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:544s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:466s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:477s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:416s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:403s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:490s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:457s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:463s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:459s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:568s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:460s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:490s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:434s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:528s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:398s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
723be02 drm: i915: Don't try detecting sinks on ports already in use

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4627/
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
On Thu, May 04, 2017 at 06:26:04PM +0200, Michal Wajdeczko wrote:
> On Thu, May 04, 2017 at 04:22:15PM +0300, Jani Nikula wrote:
> > On Thu, 04 May 2017, Michal Wajdeczko wrote:
> > > We are using some scratch registers in MMIO based send function.
> > > Make their base and count flexible in preparation of upcoming
> > > GuC firmware/hardware changes. While around, change cmd len
> > > parameter verification from WARN_ON to GEM_BUG_ON as we don't
> > > need this all the time.
> >
> > I'm not generally fond of caching the registers like this or adding
> > _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here
> > and there, but here it's hard to see the rationale because you do this
> > in preparation for something that you're not sharing.
> >
>
> I can't share details atm, but as commit message says, there will be a
> change in both offsets and number of scratch registers.
>
> Imho any wrapping around these values can't go to the i915_[guc_]reg.h file
> as that file shall include only raw MMIO definitions, without any extra
> logic that is based on GEN or PLATFORM or FW version.

The guc->send.base + offset approach is reasonable; it is certainly the
tried and trusted approach. I would stick with it, but we just can't
help with any suggestions without seeing the destination. Oh well, we can
dream that instead of using mmio space for datagrams they move to ring
(even WC will be better than a bunch of UC)!

Don't overqualify the ints though, u32 base is ok, but it could be
unsigned count (though an alternative would be u32 end, and even mark it
as GEM_DEBUG_DECL!) and definitely unsigned fw_domains as that is not
defined as being u32. (Just more than 32 domains is unlikely before
tomorrow ;)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] i915 4.9 regression: DP AUX CH sanitization no longer working on Asus desktops
On Thu, May 4, 2017 at 2:37 PM, Ville Syrjälä wrote:
> Please check if commit bb1d132935c2 ("drm/i915/vbt: split out defaults
> that are set when there is no VBT") fixes things for you.

I think this is not going to help. This would only make a difference
when there is no VBT at all, at which point we would see this message in
the logs:

    DRM_INFO("Failed to find VBIOS tables (VBT)\n");

but in this case we have a VBT for ports B, C and E.

[drm:intel_bios_init [i915]] Port B VBT info: DP:1 HDMI:1 DVI:1 EDP:0 CRT:0
[drm:intel_bios_init [i915]] VBT HDMI level shift for port B: 8
[drm:intel_bios_init [i915]] Port C VBT info: DP:0 HDMI:1 DVI:1 EDP:0 CRT:0
[drm:intel_bios_init [i915]] VBT HDMI level shift for port C: 8
[drm:intel_bios_init [i915]] Port E VBT info: DP:1 HDMI:0 DVI:0 EDP:0 CRT:0
[drm:intel_bios_init [i915]] VBT HDMI level shift for port E: 0

Let me know if I'm missing something and we will test it anyway.

Thanks
Daniel
Re: [Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
>>
>>  	forcewake_count = READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
>>  	if (forcewake_count) {
>> -		seq_puts(m, "RC information inaccurate because somebody "
>> -			    "holds a forcewake reference \n");
>> +		seq_puts(m,
>> +			 "RC information inaccurate because somebody holds a forcewake reference.\n");
>
> And now you break the 80col rule. Blind adherence to checkpatch is impossible.

Have you got any other coding style preferences around the grepping
of longer message strings from such source code?

Regards,
Markus
Re: [Intel-gfx] i915 4.9 regression: DP AUX CH sanitization no longer working on Asus desktops
On Thu, May 04, 2017 at 02:21:26PM -0600, Daniel Drake wrote:
> Hi,
>
> Numerous Asus desktops and All-in-one computers (e.g. D520MT) have a
> regression on Linux 4.9 where the VGA output is shown all-white.
>
> This is a regression caused by:
>
> commit 0ce140d45a8398b501934ac289aef0eb7f47c596
> Author: Ville Syrjälä
> Date:   Tue Oct 11 20:52:47 2016 +0300
>
>     drm/i915: Clean up DDI DDC/AUX CH sanitation
>
>
> On these platforms, the VGA output is detected as DP (presumably
> there's a DP-to-VGA converter on the motherboard). The sanitization
> done by the code that was removed here was correctly realising that
> port E's DP aux channel was DP_AUX_A, so it disabled DP output on port
> A, also showing this message:
>
>    [drm:intel_ddi_init] VBT says port A is not DVI/HDMI/DP compatible, respect it
>
> But after this cleanup commit, both port A and port E are activated
> and the screen shows all-white. Reverting the commit restores usable
> VGA display output.
>
> The reason the new implementation doesn't catch the duplicate
> configuration is because the new code only considers ports that are
> present in the VBT where parse_ddi_port() has run on them (in order to
> set that port's info->alternate_aux_channel).
>
> In this case, port A is not present in the VBT so it will not have
> info->alternate_aux_channel set, and the new sanitize_aux_ch will run
> on port E but will not consider any overlap with port A.
>
> debug logs from an affected kernel:
> https://gist.github.com/dsd/7e56c9bca7b2345b678cfacdab30ec55
>
> Should we modify sanitize_aux_ch to look at all aux channels, not only
> for the ports specified in the VBT?

Please check if commit bb1d132935c2 ("drm/i915/vbt: split out defaults
that are set when there is no VBT") fixes things for you.

-- 
Ville Syrjälä
Intel OTC
[Intel-gfx] [PATCH RESEND] drm: i915: Don't try detecting sinks on ports already in use
On systems where more than one connector is attached to the same port,
the HPD pin is also shared, and attaching one connector will trigger a
hotplug on every other connector on that port. But, according to the
documentation, connectors sharing the port cannot be enabled
simultaneously, such that we can abort the detection process early if
another connector was detected successfully.

This has the good side effect of preventing DP timeouts whenever
something else is connected to the shared port, like below:

[drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7143003f
[drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -110

Since this reduces the overhead of the i915_hotplug_work_func, it may
address the following vblank misses detected by the CI:

https://bugs.freedesktop.org/show_bug.cgi?id=100215
https://bugs.freedesktop.org/show_bug.cgi?id=100558

Signed-off-by: Gabriel Krisman Bertazi
---
 drivers/gpu/drm/i915/intel_display.c | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_dp.c      |  3 +++
 drivers/gpu/drm/i915/intel_drv.h     |  1 +
 drivers/gpu/drm/i915/intel_hdmi.c    |  3 +++
 4 files changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..618b5138c0c7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11270,6 +11270,32 @@ static bool check_digital_port_conflicts(struct drm_atomic_state *state)
 	return true;
 }
 
+bool intel_shared_digital_port_in_use(struct drm_connector *conn)
+{
+	struct drm_connector *peer;
+	struct drm_connector_list_iter iter;
+	struct intel_encoder *enc = to_intel_connector(conn)->encoder;
+	bool ret = false;
+
+	drm_connector_list_iter_begin(conn->dev, &iter);
+	drm_for_each_connector_iter(peer, &iter) {
+		struct intel_encoder *peer_enc;
+
+		if (peer == conn ||
+		    peer->status != connector_status_connected)
+			continue;
+
+		peer_enc = to_intel_connector(peer)->encoder;
+		if (peer_enc->port == enc->port) {
+			ret = true;
+			break;
+		}
+	}
+	drm_connector_list_iter_end(&iter);
+
+	return ret;
+}
+
 static void
 clear_intel_crtc_state(struct intel_crtc_state *crtc_state)
 {
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 08834f74d396..0823c588575f 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4628,6 +4628,9 @@ intel_dp_long_pulse(struct intel_connector *intel_connector)
 
 	WARN_ON(!drm_modeset_is_locked(&connector->dev->mode_config.connection_mutex));
 
+	if (intel_shared_digital_port_in_use(connector))
+		return connector_status_disconnected;
+
 	intel_display_power_get(to_i915(dev), intel_dp->aux_power_domain);
 
 	/* Can't disconnect eDP, but you can close the lid... */
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 54f3ff840812..fd5f2517bede 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1411,6 +1411,7 @@ int vlv_force_pll_on(struct drm_i915_private *dev_priv, enum pipe pipe,
 		     const struct dpll *dpll);
 void vlv_force_pll_off(struct drm_i915_private *dev_priv, enum pipe pipe);
 int lpt_get_iclkip(struct drm_i915_private *dev_priv);
+bool intel_shared_digital_port_in_use(struct drm_connector *conn);
 
 /* modesetting asserts */
 void assert_panel_unlocked(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 52f0b2d5fad2..9ce0b1f45cde 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -1532,6 +1532,9 @@ intel_hdmi_detect(struct drm_connector *connector, bool force)
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
 		      connector->base.id, connector->name);
 
+	if (intel_shared_digital_port_in_use(connector))
+		return connector_status_disconnected;
+
 	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
 
 	intel_hdmi_unset_edid(connector);
-- 
2.11.0
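For readers who want to see the peer scan from the patch above in isolation, here is a minimal userspace sketch. It is only a model under stated assumptions: the model_* types and the function name are hypothetical stand-ins, not the DRM or i915 definitions, and a plain array replaces the kernel's connector list iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM types; not the kernel definitions. */
enum model_status { MODEL_DISCONNECTED, MODEL_CONNECTED };

struct model_connector {
	int port;                 /* digital port the connector's encoder sits on */
	enum model_status status; /* result of the last detect cycle */
};

/*
 * Return true if any connector other than conn is currently connected
 * on the same port as conn. This mirrors the loop in the patch: skip
 * the connector itself and every peer that is not connected, then
 * compare ports.
 */
static bool model_shared_port_in_use(const struct model_connector *all,
				     size_t n,
				     const struct model_connector *conn)
{
	for (size_t i = 0; i < n; i++) {
		const struct model_connector *peer = &all[i];

		if (peer == conn || peer->status != MODEL_CONNECTED)
			continue;

		if (peer->port == conn->port)
			return true;
	}

	return false;
}
```

In this model a detect routine would return "disconnected" early whenever model_shared_port_in_use() is true, which is the short-circuit the patch adds to the DP and HDMI detect paths.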
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On 5/4/2017 9:51 AM, Kenneth Graunke wrote: On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote: On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote: Thanks for rephrasing - that's exactly what I am concerned with. Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the set of the 3 we have defined had been naively utilized? If that's the case it is probably the cause of the performance difference - everything beyond "the 3" means UNCACHED. Can you try changing MediaSDK to only use entries that are already in? How the performance differs in that case? We're benchmarking using upstream MediaSDK without changes, since that's the only thing that's relevant. Customising benchmarks to get better results isn't really an acceptable solution :) Obviously fixing MediaSDK upstream is a different story, in case one of the three pre-defined entries we have turns out to be the best possible MOCS-settings for that workload. You're right about customizing benchmarks, but... MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace driver produced by Intel engineers, one which Intel has the full capability to change. What you're saying is that Intel's MediaSDK engineers are unwilling to change their software to provide better performance for their Linux users. That's pretty mental. We don't warp the core operating system to work around userspace software simply because they don't want to change it. Agreed, that isn't the intention. This isn't about open vs. closed or internal vs. public projects, either. I work on a public userspace driver for Intel graphics. If I sent a kernel patch, the kernel developers would ask me the exact same questions, to justify my new additions: 1. Is your userspace actually using all these new additions? If not, which ones are you using? 
They would ask me to drop anything I wasn't actually using yet, because speculatively adding things to the kernel that we have to maintain backwards compatibility for has caused both kernel and userspace developers a lot of trouble. 2. Are you sure that you need them all? Is there a simpler solution - are some existing things good enough? What's the additional benefit of each new addition? I would have to answer these questions to the satisfaction of the kernel developers before they would even consider taking my patch. You keep pointing to your large performance improvement, but all it's shown is that actually using the GPU cache is faster than having a broken userspace driver explicitly set everything to uncached. Many people have pointed this out. Arek and Tvrtko have good suggestions. I don't think you're going to get anywhere with this until you demonstrate that the new MOCS entries provide some non-zero value over using the existing WB entries Here are a couple more data points: 1. We likely can't implement the documented "MOCS Version 1" table as is. The kernel exposes existing entries with specific semantics. Changing their meaning would introduce a backwards-incompatible change that would likely regress the performance of existing userspace. This is almost certainly unacceptable - our customers, distro partners, users, and even people like Linus Torvalds will suffer and complain loudly. The way the existing entries are exposed should not be changing. We could add the new entries at an offset - i.e. leave the existing 3 entries, and append the rest after that. But that would require changing userspace that assumes the Windows tables, such as MediaSDK (they would have to add 3 to their MOCS indexes). At which point, we're changing them, so...the "runs unaltered" argument falls over. The intention is certainly that the existing 3 entries already in use by open source user mode stack are not changed. This is additive to what is there. 
The BSpec should be reflecting that - if not that is an error. The Intel UMDs will change to support this. 2. The docs finally contain "recommended MOCS settings" - i.e. where to cache various types of objects, and at what age. However, I believe those recommendations can be implemented with 1-2 new table entries and a PTE change to be eLLC-only by default. Most of the table is completely unnecessary to implement the recommendations. I personally would like to try implementing their recommended settings in my driver. I have not had time yet, but plan to try. I'm very glad to see the Windows MOCS recommendations documented. I'd been asking for that information for literally years. If we'd gotten it earlier, a lot of mess could have been avoided. For future platforms, we may want to coordinate and use the same table. But Gen9 has been shipping for ages, and we don't have that luxury. I would hope that is not the case and the change can be made for gen9, as it should have no impact on
Re: [Intel-gfx] [PATCH v2] tests/pm_sseu: Re-enable the test
On 05/04/2017 08:37 AM, Petri Latvala wrote:
> On Wed, Apr 26, 2017 at 03:28:09AM -0700, Oscar Mateo wrote:
>> This test got inadvertently disabled by commit 83884e97 (Restore
>> "lib: Open debugfs files for the given DRM device") when the
>> initialization order got changed (dbg_init before gem_init).
>>
>> v2:
>>   - The asserts on fd are useless (Petri)
>>   - Deinit in inverse order.
>>
>> Cc: Petri Latvala
>> Signed-off-by: Oscar Mateo
>
> Thanks, pushed with R-b.
>
> Btw, can you do
>
> git config format.subjectprefix "PATCH i-g-t"
>
> for your future patches?

Will do. Thanks!
Re: [Intel-gfx] [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces
On Thu, May 04, 2017 at 11:45:48AM -0700, Laura Abbott wrote:
>
> Enable the GEM dma-buf import interfaces in addition to the export
> interfaces. This lets vgem be used as a test source for other allocators
> (e.g. Ion).
>
> Reviewed-by: Chris Wilson
> Signed-off-by: Laura Abbott
> ---
> v4: Use new drm_gem_prime_import_dev function
> ---

>  static const struct vm_operations_struct vgem_gem_vm_ops = {
> @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
>  	kfree(vfile);
>  }
>  
> -/* ioctls */
> -
> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> -					      struct drm_file *file,
> -					      unsigned int *handle,
> -					      unsigned long size)
> +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> +					      unsigned long size)

I'm going to guess that doesn't line up anymore. If checkpatch isn't
complaining, then sorry for the noise.

> +static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> +					      struct drm_file *file,
> +					      unsigned int *handle,
> +					      unsigned long size)

Ditto.

Lgtm, so r-b still good.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCHv4 1/3] drm/vgem: Add a dummy platform device
On Thu, May 04, 2017 at 11:45:46AM -0700, Laura Abbott wrote:
>
> The vgem driver is currently registered independent of any actual
> device. Some usage of the dmabuf APIs require an actual device structure
> to do anything. Register a dummy platform device for use with dmabuf.
>
> Reviewed-by: Chris Wilson
> Signed-off-by: Laura Abbott
> ---
> v4: Switch from the now removed platformdev to a static platform device.

I was thinking of avoiding the static, i.e.

static struct vgem_device {
	struct drm_device drm;
	struct device *platform;
} *vgem_device;

vgem_init():
	vgem_device = kzalloc(sizeof(*vgem_device), GFP_KERNEL);

	ret = drm_dev_init(&vgem_device->drm, &vgem_drv, NULL);

	vgem_device->platform = platform_device_register_simple("vgem");

And then platform_device_unregister() should be done in a new
vgem_drv.release callback.

I'm not going to insist upon it as I can send a patch to move over to
the "modern" drm_device subclassing later.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
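As an aside on the "drm_device subclassing" idea mentioned above: the pattern is to embed the base structure inside a driver-specific one and recover the outer structure from a pointer to the embedded member. The sketch below shows that in plain userspace C. Every model_* name is hypothetical and chosen for illustration; none of this is the DRM or vgem API, and the kernel would use its own container_of() macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the base device structure. */
struct model_drm_device {
	int registered;
};

/* Driver-specific "subclass": the base device is embedded, not pointed to. */
struct model_vgem_device {
	struct model_drm_device drm;
	void *platform; /* placeholder for the platform device handle */
};

/* Userspace equivalent of the kernel's container_of() idiom. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the outer structure from a pointer to its embedded base. */
static struct model_vgem_device *to_model_vgem(struct model_drm_device *drm)
{
	return model_container_of(drm, struct model_vgem_device, drm);
}
```

With this layout a callback that only receives the base pointer (for example a release hook) can reach the driver-private fields without any global static, which appears to be the point of the suggestion.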
[Intel-gfx] i915 4.9 regression: DP AUX CH sanitization no longer working on Asus desktops
Hi,

Numerous Asus desktops and All-in-one computers (e.g. D520MT) have a
regression on Linux 4.9 where the VGA output is shown all-white.

This is a regression caused by:

commit 0ce140d45a8398b501934ac289aef0eb7f47c596
Author: Ville Syrjälä
Date:   Tue Oct 11 20:52:47 2016 +0300

    drm/i915: Clean up DDI DDC/AUX CH sanitation


On these platforms, the VGA output is detected as DP (presumably
there's a DP-to-VGA converter on the motherboard). The sanitization
done by the code that was removed here was correctly realising that
port E's DP aux channel was DP_AUX_A, so it disabled DP output on port
A, also showing this message:

   [drm:intel_ddi_init] VBT says port A is not DVI/HDMI/DP compatible, respect it

But after this cleanup commit, both port A and port E are activated
and the screen shows all-white. Reverting the commit restores usable
VGA display output.

The reason the new implementation doesn't catch the duplicate
configuration is because the new code only considers ports that are
present in the VBT where parse_ddi_port() has run on them (in order to
set that port's info->alternate_aux_channel).

In this case, port A is not present in the VBT so it will not have
info->alternate_aux_channel set, and the new sanitize_aux_ch will run
on port E but will not consider any overlap with port A.

debug logs from an affected kernel:
https://gist.github.com/dsd/7e56c9bca7b2345b678cfacdab30ec55

Should we modify sanitize_aux_ch to look at all aux channels, not only
for the ports specified in the VBT?

Thanks
Daniel
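The overlap described in the report can be pictured with a small userspace model of the question it ends on, namely checking every port rather than only those listed in the VBT. This is a sketch under stated assumptions, not i915 code: the model_* names are hypothetical, the AUX channel numbers are illustrative, and the idea that a port absent from the VBT still carries a default AUX channel is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical port indices; not the i915 enum. */
enum {
	MODEL_PORT_A,
	MODEL_PORT_B,
	MODEL_PORT_C,
	MODEL_PORT_D,
	MODEL_PORT_E,
	MODEL_PORT_COUNT
};

struct model_port_info {
	bool supports_dp; /* port would be brought up as DP */
	int aux_ch;       /* AUX channel in use (assumed default if not in VBT) */
};

/*
 * Walk every port, whether or not it came from the VBT, and disable DP
 * on any other port that claims the same AUX channel as 'port'.
 * Returns the number of ports that were disabled.
 */
static int model_sanitize_aux_ch(struct model_port_info *ports, int port)
{
	int disabled = 0;

	if (!ports[port].supports_dp)
		return 0;

	for (int other = 0; other < MODEL_PORT_COUNT; other++) {
		if (other == port || !ports[other].supports_dp)
			continue;

		if (ports[other].aux_ch == ports[port].aux_ch) {
			ports[other].supports_dp = false;
			disabled++;
		}
	}

	return disabled;
}
```

In the scenario from the report, port E and port A would both carry channel 0 in this model, so running the check for port E disables DP on port A, which matches the behaviour the report attributes to the code before the cleanup commit.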
Re: [Intel-gfx] [PATCHv4 2/3] drm/prime: Introduce drm_gem_prime_import_dev
On Thu, May 04, 2017 at 11:45:47AM -0700, Laura Abbott wrote:
>
> The existing drm_gem_prime_import function uses the underlying
> struct device of a drm_device for attaching to a dma_buf. Some drivers
> (notably vgem) may not have an underlying device structure. Offer
> an alternate function to attach using any available device structure.
>
> Signed-off-by: Laura Abbott
> ---
> v4: Alternate implementation to take an arbitrary struct dev instead of just
> a platform device.
>
> This was different enough that I dropped the previous Reviewed-by
> ---
>  drivers/gpu/drm/drm_prime.c | 30 --
>  include/drm/drm_prime.h     |  5 +
>  2 files changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 9fb65b7..5ad9a26 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -595,15 +595,18 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
>  EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
>  
>  /**
> - * drm_gem_prime_import - helper library implementation of the import callback
> + * drm_gem_prime_import_dev - core implementation of the import callback
>   * @dev: drm_device to import into
>   * @dma_buf: dma-buf object to import
> + * @attach_dev: struct device to dma_buf attach
>   *
> - * This is the implementation of the gem_prime_import functions for GEM drivers
> - * using the PRIME helpers.
> + * This is the core of drm_gem_prime_import. It's designed to be called by
> + * drivers who want to use a different device structure than dev->dev for
> + * attaching via dma_buf.
>   */
> -struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
> -					    struct dma_buf *dma_buf)
> +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
> +					    struct dma_buf *dma_buf,
> +					    struct device *attach_dev)

My critique would be that this should be called
drm_gem_prime_import_for_device()

Either way (though naturally I like my suggestion ;),
Reviewed-by: Chris Wilson
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
On Thu, May 04, 2017 at 06:59:23PM +0200, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Thu, 4 May 2017 14:15:00 +0200
>
> The script "checkpatch.pl" pointed information out like the following.
>
> WARNING: quoted string split across lines
>
> Thus fix the affected source code place.
>
> Signed-off-by: Markus Elfring
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 6f3119d40c50..dbd52ea89fb4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
>  
>  	forcewake_count = READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
>  	if (forcewake_count) {
> -		seq_puts(m, "RC information inaccurate because somebody "
> -			    "holds a forcewake reference \n");
> +		seq_puts(m,
> +			 "RC information inaccurate because somebody holds a forcewake reference.\n");

And now you break the 80col rule. Blind adherence to checkpatch is
impossible.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()
On Thu, May 04, 2017 at 06:54:16PM +0200, SF Markus Elfring wrote:
> From: Markus Elfring
> Date: Thu, 4 May 2017 13:20:47 +0200
>
> Some strings which did not contain data format specifications should be put
> into a sequence. Thus use the corresponding function "seq_puts".

debugfs / seq_file is not performance critical. Familiar idiomatic code
is much preferred over continually switching between seq_printf and
seq_puts. And don't even start on converting seq_printf / seq_puts to
seq_putc...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 10:56:54AM -0700, Francisco Jerez wrote:
> David Weinehall writes:
>
> > On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
> >> A good default for garbage entries from the user is to follow the
> >> default setting of the object (i.e. the PTE). Currently they use the
> >> uncached entry, and now the only way to accidentally hit uncached
> >> performance is via explicit use of the uncached MOCS or setting the
> >> object to uncached. Note that these entries are currently undefined in
> >> the ABI and we reserve the right to change them. We originally chose
> >> uncached to eliminate any problem with reducing the caching level in
> >> future, but the object is a much better definition of the minimum
> >> caching level.
> >>
>
> NAK. The reason for the default being UC is that it's the only setting
> that guarantees full forwards compatibility with any other entry that
> might be added in the future. If you default to PTE on (e)LLC and WB on
> L3, userspace will no longer be able to use any newly introduced entry
> with stricter coherency guarantees than that (e.g. any L3-uncached
> entry) in a backwards-compatible way. Attempting to do so may break
> memory coherency assumptions of the application and lead to misrendering
> when run on older kernel versions (which to my judgment is a scarier
> failure mode than reduced performance).

You can't use a weaker coherency model in mocs than that specified for
the object as you can't control other uses of the object (even just
memory pressure will break your assumptions).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 04/15] drm/i915: Clean up cursor junk from intel_crtc
On Mon, Mar 27, 2017 at 09:55:35PM +0300, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä
>
> Move cursor_base, cursor_cntl, and cursor_size from intel_crtc
> into intel_plane so that we don't need the crtc for cursor stuff
> so much.
>
> Also entirely nuke cursor_addr which IMO doesn't provide any benefit
> since it's not actually used by the cursor code itself. I'm not 100%
> sure what the SKL+ DDB code is after by looking at cursor_addr so
> I just make it do its checks unconditionally. If that's not correct
> then we should likely replace it with something like
> plane_state->visible.

Yes, AFAICS in case it's not visible the cursor DDB and WM will be still
computed (to fixed minimum and 0 accordingly) and programmed. Maybe
historical left-over? The code comment about this is also stale then.

>
> Signed-off-by: Ville Syrjälä

Reviewed-by: Imre Deak

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c  | 48 +-
>  drivers/gpu/drm/i915/intel_display.c | 80 +++-
>  drivers/gpu/drm/i915/intel_drv.h     |  9 ++--
>  3 files changed, 48 insertions(+), 89 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 316bc47a8eea..b9410cb845f3 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3040,36 +3040,6 @@ static void intel_connector_info(struct seq_file *m,
>  		intel_seq_print_mode(m, 2, mode);
>  }
>  
> -static bool cursor_active(struct drm_i915_private *dev_priv, int pipe)
> -{
> -	u32 state;
> -
> -	if (IS_I845G(dev_priv) || IS_I865G(dev_priv))
> -		state = I915_READ(CURCNTR(PIPE_A)) & CURSOR_ENABLE;
> -	else
> -		state = I915_READ(CURCNTR(pipe)) & CURSOR_MODE;
> -
> -	return state;
> -}
> -
> -static bool cursor_position(struct drm_i915_private *dev_priv,
> -			    int pipe, int *x, int *y)
> -{
> -	u32 pos;
> -
> -	pos = I915_READ(CURPOS(pipe));
> -
> -	*x = (pos >> CURSOR_X_SHIFT) & CURSOR_POS_MASK;
> -	if (pos & (CURSOR_POS_SIGN << CURSOR_X_SHIFT))
> -		*x = -*x;
> -
> -	*y = (pos >> CURSOR_Y_SHIFT) & CURSOR_POS_MASK;
> -	if (pos & (CURSOR_POS_SIGN << CURSOR_Y_SHIFT))
> -		*y = -*y;
> -
> -	return cursor_active(dev_priv, pipe);
> -}
> -
>  static const char *plane_type(enum drm_plane_type type)
>  {
>  	switch (type) {
> @@ -3191,9 +3161,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
>  	seq_printf(m, "CRTC info\n");
>  	seq_printf(m, "-\n");
>  	for_each_intel_crtc(dev, crtc) {
> -		bool active;
>  		struct intel_crtc_state *pipe_config;
> -		int x, y;
>  
>  		drm_modeset_lock(&crtc->base.mutex, NULL);
>  		pipe_config = to_intel_crtc_state(crtc->base.state);
> @@ -3205,14 +3173,18 @@ static int i915_display_info(struct seq_file *m, void *unused)
>  			   yesno(pipe_config->dither), pipe_config->pipe_bpp);
>  
>  		if (pipe_config->base.active) {
> +			struct intel_plane *cursor =
> +				to_intel_plane(crtc->base.cursor);
> +
>  			intel_crtc_info(m, crtc);
>  
> -			active = cursor_position(dev_priv, crtc->pipe, &x, &y);
> -			seq_printf(m, "\tcursor visible? %s, position (%d, %d), size %dx%d, addr 0x%08x, active? %s\n",
> -				   yesno(crtc->cursor_base),
> -				   x, y, crtc->base.cursor->state->crtc_w,
> -				   crtc->base.cursor->state->crtc_h,
> -				   crtc->cursor_addr, yesno(active));
> +			seq_printf(m, "\tcursor visible? %s, position (%d, %d), size %dx%d, addr 0x%08x\n",
> +				   yesno(cursor->base.state->visible),
> +				   cursor->base.state->crtc_x,
> +				   cursor->base.state->crtc_y,
> +				   cursor->base.state->crtc_w,
> +				   cursor->base.state->crtc_h,
> +				   cursor->cursor.base);
>  			intel_scaler_info(m, crtc);
>  			intel_plane_info(m, crtc);
>  		}
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 5e04f64a0f76..1d55fac397ad 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9126,8 +9126,7 @@ static bool haswell_get_pipe_config(struct intel_crtc *crtc,
>  	return active;
>  }
>  
> -static u32 intel_cursor_base(struct intel_crtc *crtc,
> -			     const struct intel_plane_state *plane_state)
> +static u32 intel_cursor_base(const struct
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/guc: capture GuC logs if FW fails to load
== Series Details ==

Series: drm/i915/guc: capture GuC logs if FW fails to load
URL   : https://patchwork.freedesktop.org/series/23982/
State : success

== Summary ==

Series 23982v1 drm/i915/guc: capture GuC logs if FW fails to load
https://patchwork.freedesktop.org/api/1.0/series/23982/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:439s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:432s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:572s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:500s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:565s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:488s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:481s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:486s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:467s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:461s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:567s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:576s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:467s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:498s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:437s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:403s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
ed316a1 drm/i915/guc: capture GuC logs if FW fails to load

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4626/
[Intel-gfx] ✓ Fi.CI.BAT: success for dma_buf import support for vgem (rev2)
== Series Details ==

Series: dma_buf import support for vgem (rev2)
URL   : https://patchwork.freedesktop.org/series/23824/
State : success

== Summary ==

Series 23824v2 dma_buf import support for vgem
https://patchwork.freedesktop.org/api/1.0/series/23824/revisions/2/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600) fdo#17

fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:431s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:584s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:508s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:553s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:492s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:480s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:460s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:560s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:566s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:453s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:498s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29  time:417s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
0e6a5c5 drm/vgem: Enable dmabuf import interfaces
36b39d3 drm/prime: Introduce drm_gem_prime_import_dev
d231a4f drm/vgem: Add a dummy platform device

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4625/
[Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load
We're currently deleting the GuC logs if the FW fails to load, but those are still useful to understand why the loading failed. Instead of deleting them, taking a snapshot allows us to access them after driver load is completed. Cc: Oscar MateoCc: Michal Wajdeczko Signed-off-by: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/i915_debugfs.c | 36 --- drivers/gpu/drm/i915/i915_drv.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 6 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 36 +++ drivers/gpu/drm/i915/intel_guc_fwif.h | 14 +++--- drivers/gpu/drm/i915/intel_guc_log.c | 10 ++ drivers/gpu/drm/i915/intel_uc.c | 7 +-- 7 files changed, 84 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 870c470..4ff20fc 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2543,26 +2543,32 @@ static int i915_guc_info(struct seq_file *m, void *data) static int i915_guc_log_dump(struct seq_file *m, void *data) { struct drm_i915_private *dev_priv = node_to_i915(m->private); - struct drm_i915_gem_object *obj; - int i = 0, pg; - - if (!dev_priv->guc.log.vma) + u32 *log; + int i = 0; + + if (dev_priv->guc.log.vma) { + log = i915_gem_object_pin_map(dev_priv->guc.log.vma->obj, + I915_MAP_WC); + if (IS_ERR(log)) { + DRM_ERROR("Failed to pin guc_log vma\n"); + return -ENOMEM; + } + } else if (dev_priv->gpu_error.guc_load_fail_log) { + log = dev_priv->gpu_error.guc_load_fail_log; + } else { return 0; - - obj = dev_priv->guc.log.vma->obj; - for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) { - u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg)); - - for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4) - seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", - *(log + i), *(log + i + 1), - *(log + i + 2), *(log + i + 3)); - - kunmap_atomic(log); } + for (i = 0; i < GUC_LOG_SIZE / sizeof(u32); i += 4) + seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", + *(log + i), *(log + i + 1), + *(log + i + 2), 
*(log + i + 3)); + seq_putc(m, '\n'); + if (dev_priv->guc.log.vma) + i915_gem_object_unpin_map(dev_priv->guc.log.vma->obj); + return 0; } diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 452c265..c7cb36c 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1354,6 +1354,9 @@ void i915_driver_unload(struct drm_device *dev) cancel_delayed_work_sync(_priv->gpu_error.hangcheck_work); i915_reset_error_state(dev_priv); + /* release GuC error log (if any) */ + i915_guc_load_error_log_free(dev_priv); + /* Flush any outstanding unpin_work. */ drain_workqueue(dev_priv->wq); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4588b3e..761c663 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1555,6 +1555,9 @@ struct i915_gpu_error { /* Protected by the above dev->gpu_error.lock. */ struct i915_gpu_state *first_error; + /* Log snapshot if GuC errors during load */ + void *guc_load_fail_log; + unsigned long missed_irq_rings; /** @@ -3687,6 +3690,9 @@ static inline void i915_reset_error_state(struct drm_i915_private *i915) #endif +void i915_guc_load_error_log_capture(struct drm_i915_private *i915); +void i915_guc_load_error_log_free(struct drm_i915_private *i915); + const char *i915_cache_level_str(struct drm_i915_private *i915, int type); /* i915_cmd_parser.c */ diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index ec526d9..44a873b 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1809,3 +1809,39 @@ void i915_reset_error_state(struct drm_i915_private *i915) i915_gpu_state_put(error); } + +void i915_guc_load_error_log_capture(struct drm_i915_private *i915) +{ + void *log, *buf; + struct i915_vma *vma = i915->guc.log.vma; + + if (i915->gpu_error.guc_load_fail_log || !vma) + return; + + /* +* the vma should be already pinned and mapped for log runtime +* 
management but let's play safe +*/ + log = i915_gem_object_pin_map(vma->obj, I915_MAP_WC); + if (IS_ERR(log)) { +
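The dump loop in i915_guc_log_dump() above prints the log as rows of four 32-bit words. A minimal user-space sketch of that formatting step follows; dump_words() and its buffer-based interface are hypothetical stand-ins for the seq_printf() calls in the patch, not driver code.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format n_words 32-bit words into out, four per
 * line, in the same "0x%08x 0x%08x 0x%08x 0x%08x\n" layout the patch
 * uses. Returns the number of characters written, excluding the NUL.
 * Trailing words that do not fill a group of four are ignored.
 */
static size_t dump_words(char *out, size_t out_len,
			 const uint32_t *log, size_t n_words)
{
	size_t used = 0;
	size_t i;

	if (out_len > 0)
		out[0] = '\0';

	for (i = 0; i + 3 < n_words; i += 4) {
		int n = snprintf(out + used, out_len - used,
				 "0x%08" PRIx32 " 0x%08" PRIx32
				 " 0x%08" PRIx32 " 0x%08" PRIx32 "\n",
				 log[i], log[i + 1], log[i + 2], log[i + 3]);

		if (n < 0 || (size_t)n >= out_len - used) {
			/* Line did not fit: drop the partial line and stop. */
			if (out_len > used)
				out[used] = '\0';
			break;
		}
		used += (size_t)n;
	}

	return used;
}
```

Each complete row is 44 characters long (four 10-character words, three spaces, one newline), so a caller can size the buffer as 44 * (n_words / 4) + 1.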
[Intel-gfx] [PATCHv4 1/3] drm/vgem: Add a dummy platform device
The vgem driver is currently registered independent of any actual device. Some usage of the dmabuf APIs require an actual device structure to do anything. Register a dummy platform device for use with dmabuf. Reviewed-by: Chris WilsonSigned-off-by: Laura Abbott --- v4: Switch from the now removed platformdev to a static platform device. --- drivers/gpu/drm/vgem/vgem_drv.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index 9fee38a..d1d98af 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -42,6 +42,8 @@ #define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 +static struct platform_device *vgem_platform; + static void vgem_gem_free_object(struct drm_gem_object *obj) { struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); @@ -335,11 +337,20 @@ static int __init vgem_init(void) int ret; vgem_device = drm_dev_alloc(_driver, NULL); - if (IS_ERR(vgem_device)) { - ret = PTR_ERR(vgem_device); + if (IS_ERR(vgem_device)) + return PTR_ERR(vgem_device); + + vgem_platform = platform_device_register_simple("vgem", + -1, NULL, 0); + + if (!vgem_platform) { + ret = -ENODEV; goto out; } + dma_coerce_mask_and_coherent(_platform->dev, + DMA_BIT_MASK(64)); + ret = drm_dev_register(vgem_device, 0); if (ret) goto out_unref; @@ -347,13 +358,15 @@ static int __init vgem_init(void) return 0; out_unref: - drm_dev_unref(vgem_device); + platform_device_unregister(vgem_platform); out: + drm_dev_unref(vgem_device); return ret; } static void __exit vgem_exit(void) { + platform_device_unregister(vgem_platform); drm_dev_unregister(vgem_device); drm_dev_unref(vgem_device); } -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
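The reworked vgem_init() above acquires three things in order and releases them in reverse through the out_unref/out labels. The following user-space sketch models only that ordering; sketch_init(), step() and the trace buffer are hypothetical, and the error codes are picked for illustration.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical trace of the calls made, so the unwind order is visible. */
static char trace[128];
static int fail_at;	/* 0: no failure, 1..3: which step fails */

static void note(const char *s)
{
	strcat(trace, s);
	strcat(trace, ";");
}

/* Record a step and fail it if it is the one selected by fail_at. */
static int step(const char *name, int id, int err)
{
	note(name);
	return fail_at == id ? err : 0;
}

/* Mirrors the label layout of the patched vgem_init(). */
static int sketch_init(int fail)
{
	int ret;

	trace[0] = '\0';
	fail_at = fail;

	ret = step("alloc", 1, -ENOMEM);
	if (ret)
		return ret;	/* nothing to undo yet */

	ret = step("register_platform", 2, -ENODEV);
	if (ret)
		goto out;

	ret = step("register_drm", 3, -EIO);
	if (ret)
		goto out_unref;

	return 0;

out_unref:
	note("unregister_platform");
out:
	note("unref_dev");
	return ret;
}
```

In the kernel, platform_device_register_simple() reports failure with an ERR_PTR() value rather than NULL, so a caller would normally test the result with IS_ERR() and propagate PTR_ERR().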
[Intel-gfx] [PATCHv4 2/3] drm/prime: Introduce drm_gem_prime_import_dev
The existing drm_gem_prime_import function uses the underlying struct device of a drm_device for attaching to a dma_buf. Some drivers (notably vgem) may not have an underlying device structure. Offer an alternate function to attach using any available device structure. Signed-off-by: Laura Abbott--- v4: Alternate implemntation to take an arbitrary struct dev instead of just a platform device. This was different enough that I dropped the previous Reviewed-by --- drivers/gpu/drm/drm_prime.c | 30 -- include/drm/drm_prime.h | 5 + 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 9fb65b7..5ad9a26 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -595,15 +595,18 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev, EXPORT_SYMBOL(drm_gem_prime_handle_to_fd); /** - * drm_gem_prime_import - helper library implementation of the import callback + * drm_gem_prime_import_dev - core implementation of the import callback * @dev: drm_device to import into * @dma_buf: dma-buf object to import + * @attach_dev: struct device to dma_buf attach * - * This is the implementation of the gem_prime_import functions for GEM drivers - * using the PRIME helpers. + * This is the core of drm_gem_prime_import. It's designed to be called by + * drivers who want to use a different device structure than dev->dev for + * attaching via dma_buf. 
*/ -struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, + struct dma_buf *dma_buf, + struct device *attach_dev) { struct dma_buf_attachment *attach; struct sg_table *sgt; @@ -625,7 +628,7 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL); - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(dma_buf, attach_dev); if (IS_ERR(attach)) return ERR_CAST(attach); @@ -655,6 +658,21 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, return ERR_PTR(ret); } +EXPORT_SYMBOL(drm_gem_prime_import_dev); + +/** + * drm_gem_prime_import - helper library implementation of the import callback + * @dev: drm_device to import into + * @dma_buf: dma-buf object to import + * + * This is the implementation of the gem_prime_import functions for GEM drivers + * using the PRIME helpers. + */ +struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, + struct dma_buf *dma_buf) +{ + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); +} EXPORT_SYMBOL(drm_gem_prime_import); /** diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h index 0b2a235..46fd1fb 100644 --- a/include/drm/drm_prime.h +++ b/include/drm/drm_prime.h @@ -65,6 +65,11 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev, int *prime_fd); struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf); + +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, + struct dma_buf *dma_buf, + struct device *attach_dev); + int drm_gem_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle); struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev, -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
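The split above keeps drm_gem_prime_import() as a thin wrapper that passes dev->dev to the new core function. A small user-space model of that shape is sketched below; the mock_* functions and the cut-down structs are hypothetical, and the NULL check is an assumption standing in for whatever validation the attach path performs.

```c
#include <errno.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used here. */
struct device { const char *name; };
struct drm_device { struct device *dev; };
struct dma_buf { struct device *attached; };

/* Core path: attach through whichever device the caller supplies. */
static int mock_import_dev(struct drm_device *drm, struct dma_buf *buf,
			   struct device *attach_dev)
{
	(void)drm;
	if (!buf || !attach_dev)
		return -EINVAL;	/* assumed: attaching needs a real device */
	buf->attached = attach_dev;
	return 0;
}

/* Wrapper: existing callers keep attaching through drm->dev. */
static int mock_import(struct drm_device *drm, struct dma_buf *buf)
{
	return mock_import_dev(drm, buf, drm->dev);
}
```

A driver without an underlying device, such as vgem in this series, would call the core function directly with its own struct device instead of going through the wrapper.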
[Intel-gfx] [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces
Enable the GEM dma-buf import interfaces in addition to the export interfaces. This lets vgem be used as a test source for other allocators (e.g. Ion). Reviewed-by: Chris WilsonSigned-off-by: Laura Abbott --- v4: Use new drm_gem_prime_import_dev function --- drivers/gpu/drm/vgem/vgem_drv.c | 136 +++- drivers/gpu/drm/vgem/vgem_drv.h | 2 + 2 files changed, 109 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index d1d98af..c9381d45 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -48,6 +48,11 @@ static void vgem_gem_free_object(struct drm_gem_object *obj) { struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); + drm_free_large(vgem_obj->pages); + + if (obj->import_attach) + drm_prime_gem_destroy(obj, vgem_obj->table); + drm_gem_object_release(obj); kfree(vgem_obj); } @@ -58,26 +63,49 @@ static int vgem_gem_fault(struct vm_fault *vmf) struct drm_vgem_gem_object *obj = vma->vm_private_data; /* We don't use vmf->pgoff since that has the fake offset */ unsigned long vaddr = vmf->address; - struct page *page; - - page = shmem_read_mapping_page(file_inode(obj->base.filp)->i_mapping, - (vaddr - vma->vm_start) >> PAGE_SHIFT); - if (!IS_ERR(page)) { - vmf->page = page; - return 0; - } else switch (PTR_ERR(page)) { - case -ENOSPC: - case -ENOMEM: - return VM_FAULT_OOM; - case -EBUSY: - return VM_FAULT_RETRY; - case -EFAULT: - case -EINVAL: - return VM_FAULT_SIGBUS; - default: - WARN_ON_ONCE(PTR_ERR(page)); - return VM_FAULT_SIGBUS; + int ret; + loff_t num_pages; + pgoff_t page_offset; + page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; + + num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); + + if (page_offset > num_pages) + return VM_FAULT_SIGBUS; + + if (obj->pages) { + get_page(obj->pages[page_offset]); + vmf->page = obj->pages[page_offset]; + ret = 0; + } else { + struct page *page; + + page = shmem_read_mapping_page( + file_inode(obj->base.filp)->i_mapping, + 
page_offset); + if (!IS_ERR(page)) { + vmf->page = page; + ret = 0; + } else switch (PTR_ERR(page)) { + case -ENOSPC: + case -ENOMEM: + ret = VM_FAULT_OOM; + break; + case -EBUSY: + ret = VM_FAULT_RETRY; + break; + case -EFAULT: + case -EINVAL: + ret = VM_FAULT_SIGBUS; + break; + default: + WARN_ON(PTR_ERR(page)); + ret = VM_FAULT_SIGBUS; + break; + } + } + return ret; } static const struct vm_operations_struct vgem_gem_vm_ops = { @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) kfree(vfile); } -/* ioctls */ - -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, - struct drm_file *file, - unsigned int *handle, - unsigned long size) +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, + unsigned long size) { struct drm_vgem_gem_object *obj; int ret; @@ -129,8 +153,31 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, return ERR_PTR(-ENOMEM); ret = drm_gem_object_init(dev, >base, roundup(size, PAGE_SIZE)); - if (ret) - goto err_free; + if (ret) { + kfree(obj); + return ERR_PTR(ret); + } + + return obj; +} + +static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) +{ + drm_gem_object_release(>base); + kfree(obj); +} + +static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, + struct drm_file *file, + unsigned int *handle, + unsigned long size) +{
[Intel-gfx] [PATCHv4 0/3] dma_buf import support for vgem
Hi,

This is v4 of the series to add dma_buf import functions for vgem. This
version primarily focuses on adding a new approach for an alternate
dma_buf attach after platformdev was removed.

Thanks,
Laura

Laura Abbott (3):
  drm/vgem: Add a dummy platform device
  drm/prime: Introduce drm_gem_prime_import_dev
  drm/vgem: Enable dmabuf import interfaces

 drivers/gpu/drm/drm_prime.c     |  30 ++--
 drivers/gpu/drm/vgem/vgem_drv.c | 155 +++-
 drivers/gpu/drm/vgem/vgem_drv.h |   2 +
 include/drm/drm_prime.h         |   5 ++
 4 files changed, 154 insertions(+), 38 deletions(-)

--
2.7.4
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix rawclk readout for g4x
== Series Details ==

Series: drm/i915: Fix rawclk readout for g4x
URL   : https://patchwork.freedesktop.org/series/23978/
State : success

== Summary ==

Series 23978v1 drm/i915: Fix rawclk readout for g4x
https://patchwork.freedesktop.org/api/1.0/series/23978/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:579s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:515s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:564s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:480s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:458s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:571s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:454s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:573s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:455s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:492s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:529s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:413s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
be10d0a drm/i915: Fix rawclk readout for g4x

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4624/
[Intel-gfx] [PATCH] drm/i915: Fix rawclk readout for g4x
From: Ville Syrjälä

Turns out our skills in decoding the CLKCFG register weren't good
enough. On this particular elk the answer we got was 400 MHz when
in reality the clock was running at 266 MHz, which then caused us
to program a bogus AUX clock divider that caused all AUX communication
to fail.

Sadly the docs are now in bit heaven, so the fix will have to be based
on empirical evidence. Using another elk machine I was able to frob
the FSB frequency from the BIOS and see how it affects the CLKCFG
register. The machine seems to use a frequency of 266 MHz by default,
and fortunately it still boots even with the 50% CPU overclock that
we get when we bump the FSB up to 400 MHz.

It turns out the actual FSB frequency and the register have no real
link whatsoever. The register value is based on some straps or something,
but fortunately those too can be configured from the BIOS on this board,
although it doesn't seem to respect the settings 100%. In the end I was
able to derive the following relationship:

BIOS FSB / strap | CLKCFG
-------------------------
             200 | 0x2
             266 | 0x0
             333 | 0x4
             400 | 0x4

So only the 200 and 400 MHz cases actually match how we're currently
decoding that register. But as the comment next to some of the defines
says, we have been just guessing anyway.

So let's fix things up so that at least the 266 MHz case will work
correctly as that is actually the setting used by both the buggy
machine and my test machine.

The fact that 333 and 400 MHz BIOS settings result in the same register
value is a little disappointing, as that means we can't tell them apart.
However, according to the gmch datasheet for both elk and ctg 400 MHz is
not even a supported FSB frequency, so I'm going to make the assumption
that we should decode it as 333 MHz instead.
Cc: sta...@vger.kernel.org Cc: Tomi Sarvela Reported-by: Tomi Sarvela Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100926 Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/i915_reg.h| 10 +++--- drivers/gpu/drm/i915/intel_cdclk.c | 6 ++ 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index ee8170cda93e..524fdfda9d45 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3059,10 +3059,14 @@ enum skl_disp_power_wells { #define CLKCFG_FSB_667 (3 << 0)/* hrawclk 166 */ #define CLKCFG_FSB_800 (2 << 0)/* hrawclk 200 */ #define CLKCFG_FSB_1067(6 << 0) /* hrawclk 266 */ +#define CLKCFG_FSB_1067_ALT(0 << 0)/* hrawclk 266 */ #define CLKCFG_FSB_1333(7 << 0) /* hrawclk 333 */ -/* Note, below two are guess */ -#define CLKCFG_FSB_1600(4 << 0) /* hrawclk 400 */ -#define CLKCFG_FSB_1600_ALT(0 << 0)/* hrawclk 400 */ +/* + * Note that on at least on ELK the below value is reported for both + * 333 and 400 MHz BIOS FSB setting, but given that the gmch datasheet + * lists only 200/266/333 MHz FSB as supported let's decode it as 333 MHz. 
+ */ +#define CLKCFG_FSB_1333_ALT(4 << 0)/* hrawclk 333 */ #define CLKCFG_FSB_MASK(7 << 0) #define CLKCFG_MEM_533 (1 << 4) #define CLKCFG_MEM_667 (2 << 4) diff --git a/drivers/gpu/drm/i915/intel_cdclk.c b/drivers/gpu/drm/i915/intel_cdclk.c index 763010f8ad89..29792972d55d 100644 --- a/drivers/gpu/drm/i915/intel_cdclk.c +++ b/drivers/gpu/drm/i915/intel_cdclk.c @@ -1808,13 +1808,11 @@ static int g4x_hrawclk(struct drm_i915_private *dev_priv) case CLKCFG_FSB_800: return 20; case CLKCFG_FSB_1067: + case CLKCFG_FSB_1067_ALT: return 27; case CLKCFG_FSB_1333: + case CLKCFG_FSB_1333_ALT: return 33; - /* these two are just a guess; one of them might be right */ - case CLKCFG_FSB_1600: - case CLKCFG_FSB_1600_ALT: - return 40; default: return 13; } -- 2.10.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
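The decode that results from this patch can be sketched as a standalone function. The field values below are copied from the CLKCFG_FSB_* defines shown in the hunk; hrawclk_mhz(), the shortened macro names, the MHz return unit and the -1 fallback are assumptions made for illustration and are not the driver's g4x_hrawclk().

```c
#include <stdint.h>

/* Field values as listed in the i915_reg.h hunk of the patch. */
#define FSB_667		(3u << 0)	/* hrawclk 166 */
#define FSB_800		(2u << 0)	/* hrawclk 200 */
#define FSB_1067	(6u << 0)	/* hrawclk 266 */
#define FSB_1067_ALT	(0u << 0)	/* hrawclk 266 */
#define FSB_1333	(7u << 0)	/* hrawclk 333 */
#define FSB_1333_ALT	(4u << 0)	/* hrawclk 333; also seen for 400 MHz BIOS setting */
#define FSB_MASK	(7u << 0)

/*
 * Hypothetical decode: map the FSB field of CLKCFG to hrawclk in MHz.
 * Encodings not visible in the hunk return -1 here instead of guessing.
 */
static int hrawclk_mhz(uint32_t clkcfg)
{
	switch (clkcfg & FSB_MASK) {
	case FSB_667:
		return 166;
	case FSB_800:
		return 200;
	case FSB_1067:
	case FSB_1067_ALT:
		return 266;
	case FSB_1333:
	case FSB_1333_ALT:
		return 333;
	default:
		return -1;
	}
}
```

With this mapping the three register values from the table in the commit message (0x2, 0x0, 0x4) decode to 200, 266 and 333 MHz respectively.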
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
David Weinehall writes:

> On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
>> A good default for garbage entries from the user is to follow the
>> default setting of the object (i.e. the PTE). Currently they use the
>> uncached entry, and now the only way to accidentally hit uncached
>> performance is via explicit use of the uncached MOCS or setting the
>> object to uncached. Note that these entries are currently undefined in
>> the ABI and we reserve the right to change them. We originally chose
>> uncached to eliminate any problem with reducing the caching level in
>> future, but the object is a much better definition of the minimum
>> caching level.
>>

NAK. The reason for the default being UC is that it's the only setting
that guarantees full forwards compatibility with any other entry that
might be added in the future. If you default to PTE on (e)LLC and WB on
L3, userspace will no longer be able to use any newly introduced entry
with stricter coherency guarantees than that (e.g. any L3-uncached entry)
in a backwards-compatible way. Attempting to do so may break memory
coherency assumptions of the application and lead to misrendering when
run on older kernel versions (which in my judgment is a scarier failure
mode than reduced performance).

My other concern is that this change may make inadvertent use of
undefined MOCS entries extremely difficult to detect in some cases -- UC
gives userspace a pretty obvious (if functionally harmless) indication
that it's got its caching settings wrong, and is a strong motivation for
userspace developers to contribute MOCS table changes to the kernel
instead of blindly making assumptions about them (e.g. that they match
the Android kernel as media-sdk was probably doing). With this change
checked in, the performance drawback from using media-sdk on an upstream
kernel may have been subtle enough that David would never have bothered
to look into the issue.
People may have started shipping copies of media-sdk making bogus MOCS table assumptions (with potential correctness implications), at which point you would have to deal with userspace regressions anytime the MOCS table is extended in the future. >> Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS") >> Signed-off-by: Chris Wilson >> Cc: David Weinehall >> Cc: Arkadiusz Hiler >> Cc: Tvrtko Ursulin >> Cc: sta...@vger.kernel.org > > LGTM, and passes our nightly msdk test case. > > Tested-by: David Weinehall > Reviewed-by: David Weinehall > >> --- >> drivers/gpu/drm/i915/intel_mocs.c | 39 >> +++ >> 1 file changed, 15 insertions(+), 24 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/intel_mocs.c >> b/drivers/gpu/drm/i915/intel_mocs.c >> index 92e461c68385..e7a7781ca457 100644 >> --- a/drivers/gpu/drm/i915/intel_mocs.c >> +++ b/drivers/gpu/drm/i915/intel_mocs.c >> @@ -85,10 +85,7 @@ struct drm_i915_mocs_table { >> * >> * Entries not part of the following tables are undefined as far as >> * userspace is concerned and shouldn't be relied upon. For the time >> - * being they will be implicitly initialized to the strictest caching >> - * configuration (uncached) to guarantee forwards compatibility with >> - * userspace programs written against more recent kernels providing >> - * additional MOCS entries. >> + * being they will be implicitly initialized to follow the PTE. >> * >> * NOTE: These tables MUST start with being uncached and the length >> * MUST be less than 63 as the last two registers are reserved >> @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs >> *engine) >> table.table[index].control_value); >> >> /* >> - * Ok, now set the unused entries to uncached. These entries >> + * Ok, now set the unused entries to follow the PTE. These entries >> * are officially undefined and no contract for the contents >> * and settings is given for these entries. 
>> - * >> - * Entry 0 in the table is uncached - so we are just writing >> - * that value to all the used entries. >> */ >> for (; index < GEN9_NUM_MOCS_ENTRIES; index++) >> I915_WRITE(mocs_register(engine->id, index), >> - table.table[0].control_value); >> + table.table[I915_MOCS_PTE].control_value); >> >> return 0; >> } >> @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct >> drm_i915_gem_request *req, >> } >> >> /* >> - * Ok, now set the unused entries to uncached. These entries >> + * Ok, now set the unused entries to follow the PTE. These entries >> * are officially undefined and no contract for the contents >> * and settings is given for these
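The two policies under discussion differ only in which defined entry is used to pad the undefined slots: index 0 (uncached) in the existing code, the PTE entry in the proposed patch. A user-space sketch of the programming loop with that choice made explicit follows; program_table(), NUM_ENTRIES and the register array are hypothetical stand-ins for the I915_WRITE() loop and GEN9_NUM_MOCS_ENTRIES.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_ENTRIES 8	/* hypothetical; stands in for GEN9_NUM_MOCS_ENTRIES */

/*
 * Write the defined table entries, then pad every remaining slot with
 * the entry at fallback_idx. Returns 0 on success, -1 on bad arguments.
 */
static int program_table(uint32_t *regs, const uint32_t *table,
			 size_t table_len, size_t fallback_idx)
{
	size_t i;

	if (table_len == 0 || table_len > NUM_ENTRIES ||
	    fallback_idx >= table_len)
		return -1;

	/* Defined entries are copied through unchanged. */
	for (i = 0; i < table_len; i++)
		regs[i] = table[i];

	/* Undefined entries all alias the chosen fallback. */
	for (; i < NUM_ENTRIES; i++)
		regs[i] = table[fallback_idx];

	return 0;
}
```

Passing 0 as fallback_idx models the current uncached default; passing the index of the PTE entry models the behaviour the patch proposes.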
Re: [Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
On 05/04/2017 11:42 AM, Ville Syrjälä wrote:
> On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
>> Hi,
>>
>> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
>> a lot of the below warnings. Things seem to work fine (in fact it seems
>> faster in general use than previously), but it's a lot of warning spew.
>>
>> [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under evasion is 100 us
>
> I tried to optimize this a bit recently but indeed it's still known to be too
> slow. Looks like all of that stuff did land in Linus's tree already,
> so presumably you have it all already.

Yes, this is Linus' tree...

> I did have some further ideas that should help but I got sidetracked by
> other things before I managed to finish the work. I guess I'll need to get
> back on that horse and try to finish what I started.
>
> In the meantime, maybe we should just silence this error spew again
> until we're more confident about meeting the deadlines. Maarten?
>
> Do you have lockdep enabled BTW? Based on what I've seen lockdep does
> seem to be a major contributor to slowness here.

Nope, running a fairly optimized build on my laptop.

--
Jens Axboe
Re: [Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
> Hi,
>
> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
> a lot of the below warnings. Things seem to work fine (in fact it seems
> faster in general use than previously), but it's a lot of warning spew.
>
> [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under evasion is 100 us

I tried to optimize this a bit recently but indeed it's still known to be too
slow. Looks like all of that stuff did land in Linus's tree already,
so presumably you have it all already.

I did have some further ideas that should help but I got sidetracked by
other things before I managed to finish the work. I guess I'll need to get
back on that horse and try to finish what I started.

In the meantime, maybe we should just silence this error spew again
until we're more confident about meeting the deadlines. Maarten?

Do you have lockdep enabled BTW? Based on what I've seen lockdep does
seem to be a major contributor to slowness here.

--
Ville Syrjälä
Intel OTC
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
== Series Details ==

Series: series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
URL   : https://patchwork.freedesktop.org/series/23969/
State : success

== Summary ==

Series 23969v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23969/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:572s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:552s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:486s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:481s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:484s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:456s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:568s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:473s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:500s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:438s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:404s
fi-bdw-gvtdvm failed to collect. IGT log at Patchwork_4623/fi-bdw-gvtdvm/igt.log

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
5bb846f drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
54ed0e1 lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
bafac0f lib/scatterlist: Avoid potential scatterlist entry overflow
b5fb37a lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4623/
[Intel-gfx] [PATCH 5/9] drm/i915: Adjust seven checks for null pointers
From: Markus ElfringDate: Thu, 4 May 2017 13:52:19 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The script “checkpatch.pl” pointed information out like the following. Comparison to NULL could be written … Thus fix affected source code places. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index bf9a2e8d8c16..d9c699d7245e 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -242,7 +242,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (count == total) break; - if (obj->stolen == NULL) + if (!obj->stolen) continue; objects[count++] = obj; @@ -254,7 +254,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (count == total) break; - if (obj->stolen == NULL) + if (!obj->stolen) continue; objects[count++] = obj; @@ -557,7 +557,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) spin_lock_irq(>event_lock); work = crtc->flip_work; - if (work == NULL) { + if (!work) { seq_printf(m, "No flip due on pipe %c (plane %c)\n", pipe, plane); } else { @@ -3717,7 +3717,7 @@ static ssize_t i915_displayport_test_active_write(struct file *file, continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); status = kstrtoint(input_buffer, 10, ); if (status < 0) @@ -3756,7 +3756,7 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); seq_putc(m, intel_dp->compliance.test_active ? 
'1' : '0'); @@ -3801,7 +3801,7 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); if (intel_dp->compliance.test_type == DP_TEST_LINK_EDID_READ) @@ -3855,7 +3855,7 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); seq_printf(m, "%02lx", intel_dp->compliance.test_type); } else { -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 9/9] drm/i915: Combine substrings for two messages in i915_ggtt_probe_hw()
From: Markus Elfring
Date: Thu, 4 May 2017 14:30:37 +0200

The script "checkpatch.pl" pointed information out like the following.

WARNING: quoted string split across lines

Thus fix the affected source code place.

Signed-off-by: Markus Elfring
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9f64dc3f2d05..508431f42b65 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2905,16 +2905,14 @@ int i915_ggtt_probe_hw(struct drm_i915_private *dev_priv)
 	}
 
 	if ((ggtt->base.total - 1) >> 32) {
-		DRM_ERROR("We never expected a Global GTT with more than 32bits"
-			  " of address space! Found %lldM!\n",
+		DRM_ERROR("We never expected a Global GTT with more than 32bits of address space! Found %lldM!\n",
 			  ggtt->base.total >> 20);
 		ggtt->base.total = 1ULL << 32;
 		ggtt->mappable_end = min(ggtt->mappable_end, ggtt->base.total);
 	}
 
 	if (ggtt->mappable_end > ggtt->base.total) {
-		DRM_ERROR("mappable aperture extends past end of GGTT,"
-			  " aperture=%llx, total=%llx\n",
+		DRM_ERROR("mappable aperture extends past end of GGTT, aperture=%llx, total=%llx\n",
 			  ggtt->mappable_end, ggtt->base.total);
 		ggtt->mappable_end = ggtt->base.total;
 	}
-- 
2.12.2
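
The two checks touched by this hunk amount to a small clamping routine. What follows is a minimal userspace sketch of that logic only, not the driver code: `struct ggtt_sketch`, `min_u64()` and `ggtt_clamp()` are hypothetical names introduced for illustration, and the DRM_ERROR() messages are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two GGTT fields the hunk reads and writes. */
struct ggtt_sketch {
	uint64_t total;        /* size of the global GTT address space, in bytes */
	uint64_t mappable_end; /* end of the CPU-mappable aperture, in bytes */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Apply the same two corrections as the hunk, minus the error messages.
 * Returns the number of corrections that were applied.
 */
static int ggtt_clamp(struct ggtt_sketch *g)
{
	int fixes = 0;

	/* Anything that needs more than 32 bits is cut back to exactly 4 GiB. */
	if ((g->total - 1) >> 32) {
		g->total = 1ULL << 32;
		g->mappable_end = min_u64(g->mappable_end, g->total);
		fixes++;
	}

	/* The aperture may never extend past the end of the GGTT. */
	if (g->mappable_end > g->total) {
		g->mappable_end = g->total;
		fixes++;
	}

	assert(g->mappable_end <= g->total);
	return fixes;
}
```

Note that a total of exactly 4 GiB passes the first check unchanged, because the test is made on `total - 1`.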
[Intel-gfx] [PATCH 8/9] drm/i915: Replace a seq_puts() call by seq_putc() in two functions
From: Markus Elfring
Date: Thu, 4 May 2017 14:23:32 +0200

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2aa6b97fd22f..9f64dc3f2d05 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1254,7 +1254,7 @@ static void gen8_dump_pdp(struct i915_hw_ppgtt *ppgtt,
 				else
 					seq_puts(m, " SCRATCH ");
 			}
-			seq_puts(m, "\n");
+			seq_putc(m, '\n');
 		}
 		kunmap_atomic(pt_vaddr);
 	}
@@ -1437,7 +1437,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 				else
 					seq_puts(m, " SCRATCH ");
 			}
-			seq_puts(m, "\n");
+			seq_putc(m, '\n');
 		}
 		kunmap_atomic(pt_vaddr);
 	}
-- 
2.12.2
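
For readers unfamiliar with the two helpers, the point of the change is that the output stays byte-for-byte identical while the single character skips the string path. The sketch below is a userspace mock, assuming a fixed-size buffer: `struct seq_sketch`, `sketch_putc()` and `sketch_puts()` are hypothetical stand-ins and do not reproduce the kernel's seq_file implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock of an output sequence: a small NUL-terminated buffer. */
struct seq_sketch {
	char buf[64];
	size_t count; /* number of characters stored, excluding the NUL */
};

/* Append one character; no string length has to be computed. */
static void sketch_putc(struct seq_sketch *m, char c)
{
	if (m->count + 1 < sizeof(m->buf)) {
		m->buf[m->count++] = c;
		m->buf[m->count] = '\0';
	}
}

/* Append a string; the length is measured first, then the bytes are copied. */
static void sketch_puts(struct seq_sketch *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < sizeof(m->buf)) {
		memcpy(m->buf + m->count, s, len + 1);
		m->count += len;
	}
}
```

Writing `"\n"` through `sketch_puts()` and `'\n'` through `sketch_putc()` leaves the same bytes in the buffer, which is the property the patch relies on.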
[Intel-gfx] [PATCH 4/9] drm/i915: Delete unnecessary braces in three functions
From: Markus ElfringDate: Thu, 4 May 2017 13:40:53 +0200 Do not use curly brackets at some source code places where a single statement should be sufficient. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 19 --- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 296108464f2b..bf9a2e8d8c16 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -565,13 +565,13 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) u32 addr; pending = atomic_read(>pending); - if (pending) { + if (pending) seq_printf(m, "Flip ioctl preparing on pipe %c (plane %c)\n", pipe, plane); - } else { + else seq_printf(m, "Flip pending (waiting for vsync) on pipe %c (plane %c)\n", pipe, plane); - } + if (work->flip_queued_req) { struct intel_engine_cs *engine = work->flip_queued_req->engine; @@ -3130,13 +3130,11 @@ static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc) } state = plane->state; - - if (state->fb) { + if (state->fb) drm_get_format_name(state->fb->format->format, _name); - } else { + else sprintf(format_name.str, "N/A"); - } seq_printf(m, "\t--Plane id %d: type=%s, crtc_pos=%4dx%4d, crtc_size=%4dx%4d, src_pos=%d.%04ux%d.%04u, src_size=%d.%04ux%d.%04u, format=%s, rotation=%s\n", plane->base.id, @@ -4636,13 +4634,12 @@ static int i915_sseu_status(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); - if (IS_CHERRYVIEW(dev_priv)) { + if (IS_CHERRYVIEW(dev_priv)) cherryview_sseu_device_status(dev_priv, ); - } else if (IS_BROADWELL(dev_priv)) { + else if (IS_BROADWELL(dev_priv)) broadwell_sseu_device_status(dev_priv, ); - } else if (INTEL_GEN(dev_priv) >= 9) { + else if (INTEL_GEN(dev_priv) >= 9) gen9_sseu_device_status(dev_priv, ); - } intel_runtime_pm_put(dev_priv); -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()
From: Markus ElfringDate: Thu, 4 May 2017 13:20:47 +0200 Some strings which did not contain data format specifications should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 4adf96be9146..296108464f2b 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -149,7 +149,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) } seq_printf(m, " (pinned x %d)", pin_count); if (obj->pin_display) - seq_printf(m, " (display)"); + seq_puts(m, " (display)"); list_for_each_entry(vma, >vma_list, obj_link) { if (!drm_mm_node_allocated(>node)) continue; @@ -581,8 +581,10 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) intel_engine_last_submit(engine), intel_engine_get_seqno(engine), i915_gem_request_completed(work->flip_queued_req)); - } else - seq_printf(m, "Flip not associated with any ring\n"); + } else { + seq_puts(m, +"Flip not associated with any ring\n"); + } seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n", work->flip_queued_vblank, work->flip_ready_vblank, @@ -2048,7 +2050,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused) int ret; if (!i915.enable_execlists) { - seq_printf(m, "Logical Ring Contexts are disabled\n"); + seq_puts(m, "Logical Ring Contexts are disabled\n"); return 0; } @@ -2402,7 +2404,7 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) if (!HAS_GUC_UCODE(dev_priv)) return 0; - seq_printf(m, "GuC firmware status:\n"); + seq_puts(m, "GuC firmware status:\n"); seq_printf(m, "\tpath: %s\n", guc_fw->path); seq_printf(m, "\tfetch: %s\n", @@ -2510,7 +2512,7 @@ static int i915_guc_info(struct seq_file *m, void *data) return 0; } - 
seq_printf(m, "Doorbell map:\n"); + seq_puts(m, "Doorbell map:\n"); seq_printf(m, "\t%*pb\n", GUC_NUM_DOORBELLS, guc->doorbell_bitmap); seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc->db_cacheline); @@ -2521,7 +2523,7 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "GuC last action error code: %d\n", guc->action_err); total = 0; - seq_printf(m, "\nGuC submissions:\n"); + seq_puts(m, "\nGuC submissions:\n"); for_each_engine(engine, dev_priv, id) { u64 submissions = guc->submissions[id]; total += submissions; @@ -2795,7 +2797,7 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused) seq_printf(m, "Usage count: %d\n", atomic_read(_priv->drm.dev->power.usage_count)); #else - seq_printf(m, "Device Power Management (CONFIG_PM) disabled\n"); + seq_puts(m, "Device Power Management (CONFIG_PM) disabled\n"); #endif seq_printf(m, "PCI device power state: %s [%d]\n", pci_power_name(pdev->current_state), @@ -2914,7 +2916,7 @@ static void intel_encoder_info(struct seq_file *m, drm_get_connector_status_name(connector->status)); if (connector->status == connector_status_connected) { struct drm_display_mode *mode = >mode; - seq_printf(m, ", mode:\n"); + seq_puts(m, ", mode:\n"); intel_seq_print_mode(m, 2, mode); } else { seq_putc(m, '\n'); @@ -2945,7 +2947,7 @@ static void intel_panel_info(struct seq_file *m, struct intel_panel *panel) { struct drm_display_mode *mode = panel->fixed_mode; - seq_printf(m, "\tfixed mode:\n"); + seq_puts(m, "\tfixed mode:\n"); intel_seq_print_mode(m, 2, mode); } @@ -3038,7 +3040,7 @@ static void intel_connector_info(struct seq_file *m, break; } - seq_printf(m, "\tmodes:\n"); + seq_puts(m, "\tmodes:\n"); list_for_each_entry(mode, >modes, head) intel_seq_print_mode(m, 2, mode); } @@ -3266,9 +3268,7 @@
[Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
From: Markus ElfringDate: Thu, 4 May 2017 14:15:00 +0200 The script "checkpatch.pl" pointed information out like the following. WARNING: quoted string split across lines Thus fix the affected source code place. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 6f3119d40c50..dbd52ea89fb4 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m) forcewake_count = READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count); if (forcewake_count) { - seq_puts(m, "RC information inaccurate because somebody " - "holds a forcewake reference \n"); + seq_puts(m, +"RC information inaccurate because somebody holds a forcewake reference.\n"); } else { /* NB: we cannot use forcewake, else we read the wrong values */ while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1)) -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/9] drm/i915: Add spaces for better code readability
From: Markus ElfringDate: Thu, 4 May 2017 14:04:38 +0200 Use space characters at some source code places according to the Linux coding style convention. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index d9c699d7245e..6f3119d40c50 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2358,7 +2358,7 @@ static int i915_llc(struct seq_file *m, void *data) seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev_priv))); seq_printf(m, "%s: %lluMB\n", edram ? "eDRAM" : "eLLC", - intel_uncore_edram_size(dev_priv)/1024/1024); + intel_uncore_edram_size(dev_priv) / 1024 / 1024); return 0; } @@ -4502,7 +4502,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, { int s_max = 3, ss_max = 4; int s, ss; - u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2]; + u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2]; /* BXT has a single slice and at most 3 subslices. 
*/ if (IS_GEN9_LP(dev_priv)) { @@ -4512,8 +4512,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, for (s = 0; s < s_max; s++) { s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s)); - eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s)); - eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s)); + eu_reg[2 * s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s)); + eu_reg[2 * s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s)); } eu_mask[0] = GEN9_PGCTL_SSA_EU08_ACK | @@ -4547,8 +4547,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, sseu->subslice_mask |= BIT(ss); } - eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] & - eu_mask[ss%2]); + eu_cnt = 2 * hweight32(eu_reg[2 * s + ss / 2] & + eu_mask[ss % 2]); sseu->eu_total += eu_cnt; sseu->eu_per_subslice = max_t(unsigned int, sseu->eu_per_subslice, -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
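
The respaced expression in the last hunk is easier to follow in isolation. Here is a minimal userspace sketch of what it computes, assuming made-up register and mask values: `popcount32()` is a portable replacement for the kernel's hweight32(), and `eu_count()` is a hypothetical helper that does not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Portable replacement for hweight32(): number of set bits in v. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical helper mirroring the hunk: each slice s owns two registers,
 * subslices 0-1 use the first and 2-3 the second (ss / 2), and the mask
 * alternates with the subslice parity (ss % 2). The result is twice the
 * number of acknowledged bits, as in the original expression.
 */
static unsigned int eu_count(const uint32_t *eu_reg, const uint32_t eu_mask[2],
			     unsigned int s, unsigned int ss)
{
	assert(eu_reg && eu_mask);
	return 2 * popcount32(eu_reg[2 * s + ss / 2] & eu_mask[ss % 2]);
}
```

The added spaces change nothing in how the index is evaluated; `2 * s + ss / 2` still parses as `(2 * s) + (ss / 2)`.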
[Intel-gfx] [PATCH 2/9] drm/i915: Combine five seq_printf() calls in i915_display_info()
From: Markus ElfringDate: Thu, 4 May 2017 13:17:10 +0200 Some text was put into a sequence by separate function calls. Print the same data by two single function calls instead. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f2bda699749a..4adf96be9146 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3191,8 +3191,7 @@ static int i915_display_info(struct seq_file *m, void *unused) struct drm_connector_list_iter conn_iter; intel_runtime_pm_get(dev_priv); - seq_printf(m, "CRTC info\n"); - seq_printf(m, "-\n"); + seq_puts(m, "CRTC info\n-\n"); for_each_intel_crtc(dev, crtc) { bool active; struct intel_crtc_state *pipe_config; @@ -3226,9 +3225,7 @@ static int i915_display_info(struct seq_file *m, void *unused) drm_modeset_unlock(>base.mutex); } - seq_printf(m, "\n"); - seq_printf(m, "Connector info\n"); - seq_printf(m, "--\n"); + seq_puts(m, "\nConnector info\n--\n"); mutex_lock(>mode_config.mutex); drm_connector_list_iter_begin(dev, _iter); drm_for_each_connector_iter(connector, _iter) -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/9] drm/i915: Replace ten seq_puts() calls by seq_putc()
From: Markus ElfringDate: Thu, 4 May 2017 11:04:45 +0200 Some single characters should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index d689e511744e..f2bda699749a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -190,7 +190,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " , fence: %d%s", vma->fence->id, i915_gem_active_isset(>last_fence) ? "*" : ""); - seq_puts(m, ")"); + seq_putc(m, ')'); } if (obj->stolen) seq_printf(m, " (stolen: %08llx)", obj->stolen->start); @@ -2689,7 +2689,7 @@ static int i915_edp_psr_status(struct seq_file *m, void *data) (stat[pipe] == VLV_EDP_PSR_ACTIVE_SF_UPDATE)) seq_printf(m, " pipe %c", pipe_name(pipe)); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); /* * VLV/CHV PSR has no kind of performance counter @@ -3176,7 +3176,7 @@ static void intel_scaler_info(struct seq_file *m, struct intel_crtc *intel_crtc) seq_printf(m, ", scalers[%d]: use=%s, mode=%x", i, yesno(sc->in_use), sc->mode); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } else { seq_puts(m, "\tNo scalers available on this platform\n"); } @@ -3384,8 +3384,7 @@ static int i915_engine_info(struct seq_file *m, void *unused) w->tsk->comm, w->tsk->pid, w->seqno); } spin_unlock_irq(>rb_lock); - - seq_puts(m, "\n"); + seq_putc(m, '\n'); } intel_runtime_pm_put(dev_priv); @@ -3629,7 +3628,7 @@ static void drrs_status_per_crtc(struct seq_file *m, /* DRRS not supported. 
Print the VBT parameter*/ seq_puts(m, "\tDRRS Supported : No"); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } static int i915_drrs_status(struct seq_file *m, void *unused) @@ -3764,12 +3763,11 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data) if (connector->status == connector_status_connected && connector->encoder != NULL) { intel_dp = enc_to_intel_dp(connector->encoder); - if (intel_dp->compliance.test_active) - seq_puts(m, "1"); - else - seq_puts(m, "0"); - } else - seq_puts(m, "0"); + seq_putc(m, +intel_dp->compliance.test_active ? '1' : '0'); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(_iter); @@ -3823,8 +3821,9 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data) seq_printf(m, "bpc: %u\n", intel_dp->compliance.test_data.bpc); } - } else - seq_puts(m, "0"); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(_iter); @@ -3864,8 +3863,9 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data) connector->encoder != NULL) { intel_dp = enc_to_intel_dp(connector->encoder); seq_printf(m, "%02lx", intel_dp->compliance.test_type); - } else - seq_puts(m, "0"); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(_iter); -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote:
> On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> > Thanks for rephrasing - that's exactly what I am concerned with.
> >
> > Did you just use the MediaSDK as it is - meaning that MOCS entries
> > beyond the set of the 3 we have defined had been naively utilized?
> >
> > If that's the case it is probably the cause of the performance
> > difference - everything beyond "the 3" means UNCACHED.
> >
> > Can you try changing MediaSDK to only use entries that are already in?
> > How the performance differs in that case?
>
> We're benchmarking using upstream MediaSDK without changes, since that's
> the only thing that's relevant. Customising benchmarks to get better
> results isn't really an acceptable solution :)
>
> Obviously fixing MediaSDK upstream is a different story, in case one of
> the three pre-defined entries we have turns out to be the best possible
> MOCS-settings for that workload.

You're right about customizing benchmarks, but...

MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace
driver produced by Intel engineers, one which Intel has the full
capability to change. What you're saying is that Intel's MediaSDK
engineers are unwilling to change their software to provide better
performance for their Linux users. That's pretty mental.

We don't warp the core operating system to work around userspace
software simply because they don't want to change it.

This isn't about open vs. closed or internal vs. public projects,
either. I work on a public userspace driver for Intel graphics. If I
sent a kernel patch, the kernel developers would ask me the exact same
questions, to justify my new additions:

1. Is your userspace actually using all these new additions?
   If not, which ones are you using?

   They would ask me to drop anything I wasn't actually using yet,
   because speculatively adding things to the kernel that we have to
   maintain backwards compatibility for has caused both kernel and
   userspace developers a lot of trouble.

2. Are you sure that you need them all? Is there a simpler solution -
   are some existing things good enough? What's the additional benefit
   of each new addition?

I would have to answer these questions to the satisfaction of the
kernel developers before they would even consider taking my patch.

You keep pointing to your large performance improvement, but all it's
shown is that actually using the GPU cache is faster than having a
broken userspace driver explicitly set everything to uncached. Many
people have pointed this out.

Arek and Tvrtko have good suggestions. I don't think you're going to
get anywhere with this until you demonstrate that the new MOCS entries
provide some non-zero value over using the existing WB entry.

Here are a couple more data points:

1. We likely can't implement the documented "MOCS Version 1" table as is.

   The kernel exposes existing entries with specific semantics.
   Changing their meaning would introduce a backwards-incompatible
   change that would likely regress the performance of existing
   userspace. This is almost certainly unacceptable - our customers,
   distro partners, users, and even people like Linus Torvalds will
   suffer and complain loudly.

   We could add the new entries at an offset - i.e. leave the existing
   3 entries, and append the rest after that. But that would require
   changing userspace that assumes the Windows tables, such as MediaSDK
   (they would have to add 3 to their MOCS indexes). At which point,
   we're changing them, so...the "runs unaltered" argument falls over.

2. The docs finally contain "recommended MOCS settings" - i.e. where to
   cache various types of objects, and at what age.

   However, I believe those recommendations can be implemented with 1-2
   new table entries and a PTE change to be eLLC-only by default. Most
   of the table is completely unnecessary to implement the
   recommendations.

   I personally would like to try implementing their recommended
   settings in my driver. I have not had time yet, but plan to try.

I'm very glad to see the Windows MOCS recommendations documented. I'd
been asking for that information for literally years. If we'd gotten
it earlier, a lot of mess could have been avoided.

For future platforms, we may want to coordinate and use the same table.
But Gen9 has been shipping for ages, and we don't have that luxury.

--Ken
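
Two points from this exchange can be put in code form: an index beyond the three defined entries behaves as uncached, and appending new entries after the existing three shifts every index from the documented table by 3. This is only an illustrative sketch, not the i915 MOCS code: the enum, the order of the three entries, and both function names are assumptions made for the example.

```c
#include <assert.h>

/* Illustrative cache policies; the names are assumptions, not kernel symbols. */
enum cache_policy {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_WRITE_BACK
};

/* The three entries already defined (their order here is assumed). */
#define NUM_DEFINED_ENTRIES 3

static const enum cache_policy defined_entries[NUM_DEFINED_ENTRIES] = {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_WRITE_BACK,
};

/* Anything beyond the defined entries ends up uncached. */
static enum cache_policy mocs_lookup(unsigned int index)
{
	if (index >= NUM_DEFINED_ENTRIES)
		return POLICY_UNCACHED;
	return defined_entries[index];
}

/*
 * If the documented table were appended after the existing entries,
 * userspace written against it would have to add 3 to each index.
 */
static unsigned int mocs_appended_index(unsigned int documented_index)
{
	return NUM_DEFINED_ENTRIES + documented_index;
}
```

With this model, userspace that picks index 7 from a larger table it assumes to exist gets uncached behaviour, which matches the explanation given above for the measured performance difference.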
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
On Thu, May 04, 2017 at 04:22:15PM +0300, Jani Nikula wrote:
> On Thu, 04 May 2017, Michal Wajdeczko wrote:
> > We are using some scratch registers in MMIO based send function.
> > Make their base and count flexible in preparation of upcoming
> > GuC firmware/hardware changes. While around, change cmd len
> > parameter verification from WARN_ON to GEM_BUG_ON as we don't
> > need this all the time.
>
> I'm not generally fond of caching the registers like this or adding
> _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here
> and there, but here it's hard to see the rationale because you do this
> in preparation for something that we you're not sharing.
>

I can't share details atm, but as commit message says, there will be
a change in both offsets and number of scratch registers.

Imho any wrapping around these values can't go to the i915_[guc_]reg.h
file as that file shall include only raw MMIO definitions, without any
extra logic that is based on GEN or PLATFORM or FW version.

Alternate approach would be, thanks to the already defined virtual
function send(), to create new send_mmio function(s) that will be 100%
the same as the old send_mmio except offset and count of the scratch
registers. Then we can benefit from most optimal implementation per
GEN|PLATFORM|FW that can run without reading cached regs offsets/count,
but at the cost of extra code that need to be maintained to be in sync
with the original function. And then someone else can point out that we
missed code sharing opportunity.

I'm afraid there is no clear winner.

-Michal

> BR,
> Jani.
> > > > > v2: call out WARN/GEM_BUG change in the commit msg (Daniele) > > > > Signed-off-by: Michal Wajdeczko > > Suggested-by: Daniele Ceraolo Spurio > > Cc: Daniele Ceraolo Spurio > > Cc: Joonas Lahtinen > > Reviewed-by: Daniele Ceraolo Spurio > > --- > > drivers/gpu/drm/i915/intel_uc.c | 41 > > ++--- > > drivers/gpu/drm/i915/intel_uc.h | 7 +++ > > 2 files changed, 41 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_uc.c > > b/drivers/gpu/drm/i915/intel_uc.c > > index 72f49e6..9d11c42 100644 > > --- a/drivers/gpu/drm/i915/intel_uc.c > > +++ b/drivers/gpu/drm/i915/intel_uc.c > > @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private > > *dev_priv) > > __intel_uc_fw_fini(_priv->huc.fw); > > } > > > > +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i) > > +{ > > + GEM_BUG_ON(!guc->send_regs.base); > > + GEM_BUG_ON(!guc->send_regs.count); > > + GEM_BUG_ON(i >= guc->send_regs.count); > > + > > + return _MMIO(guc->send_regs.base + 4 * i); > > +} > > + > > +static void guc_init_send_regs(struct intel_guc *guc) > > +{ > > + struct drm_i915_private *dev_priv = guc_to_i915(guc); > > + enum forcewake_domains fw_domains = 0; > > + u32 i; > > + > > + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0)); > > + guc->send_regs.count = SOFT_SCRATCH_COUNT - 1; > > + > > + for (i = 0; i < guc->send_regs.count; i++) { > > + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, > > + guc_send_reg(guc, i), > > + FW_REG_READ | FW_REG_WRITE); > > + } > > + guc->send_regs.fw_domains = fw_domains; > > +} > > + > > static int guc_enable_communication(struct intel_guc *guc) > > { > > /* XXX: placeholder for alternate setup */ > > + guc_init_send_regs(guc); > > guc->send = intel_guc_send_mmio; > > return 0; > > } > > @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > > u32 *action, u32 len) > > int i; > > int ret; > > > > - if (WARN_ON(len < 1 || len > 15)) > > - return -EINVAL; > > + 
GEM_BUG_ON(!len); > > + GEM_BUG_ON(len > guc->send_regs.count); > > > > mutex_lock(>send_mutex); > > - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER); > > + intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains); > > > > dev_priv->guc.action_count += 1; > > dev_priv->guc.action_cmd = action[0]; > > > > for (i = 0; i < len; i++) > > - I915_WRITE(SOFT_SCRATCH(i), action[i]); > > + I915_WRITE(guc_send_reg(guc, i), action[i]); > > > > - POSTING_READ(SOFT_SCRATCH(i - 1)); > > + POSTING_READ(guc_send_reg(guc, i - 1)); > > > > intel_guc_notify(guc); > > > > @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > > u32 *action, u32 len) > > * Fast commands should still complete in 10us. > > */ > > ret = __intel_wait_for_register_fw(dev_priv, > > - SOFT_SCRATCH(0), > > + guc_send_reg(guc, 0), > >
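
The cached base/count approach under discussion reduces to a bounds-checked offset computation. The sketch below restates it in userspace, with assert() standing in for GEM_BUG_ON(); `struct send_regs_sketch` and `send_reg_offset()` are hypothetical names, and the base value used in any example call is made up rather than taken from the register headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the cached values: first offset and register count. */
struct send_regs_sketch {
	uint32_t base;  /* MMIO offset of the first scratch register */
	uint32_t count; /* number of scratch registers usable for sending */
};

/*
 * Offset of the i-th send register. Registers are 32 bits wide, so
 * consecutive registers are 4 bytes apart.
 */
static uint32_t send_reg_offset(const struct send_regs_sketch *regs, uint32_t i)
{
	assert(regs->base);      /* cached values must have been initialised */
	assert(regs->count);
	assert(i < regs->count); /* stay inside the scratch register window */

	return regs->base + 4 * i;
}
```

The alternative described in the reply, one send function per offset/count combination, would drop the cached fields at the cost of duplicating the send loop.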
Re: [Intel-gfx] [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
On Thu, 4 May 2017 03:09:40 + "Chen, Xiaoguang" wrote:
> Hi Alex, do you have any comments for this interface?
>
> >-Original Message-
> >From: intel-gvt-dev [mailto:intel-gvt-dev-boun...@lists.freedesktop.org] On
> >Behalf Of Chen, Xiaoguang
> >Sent: Wednesday, May 03, 2017 9:39 AM
> >To: Gerd Hoffmann
> >Cc: Tian, Kevin; intel-gfx@lists.freedesktop.org;
> >linux-ker...@vger.kernel.org; zhen...@linux.intel.com;
> >alex.william...@redhat.com; Lv, Zhiyuan;
> >intel-gvt-...@lists.freedesktop.org; Wang, Zhi A
> >Subject: RE: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
> >
> >
> >
> >>-Original Message-
> >>From: Gerd Hoffmann [mailto:kra...@redhat.com]
> >>Sent: Tuesday, May 02, 2017 5:51 PM
> >>To: Chen, Xiaoguang
> >>Cc: alex.william...@redhat.com; intel-gfx@lists.freedesktop.org;
> >>intel-gvt-...@lists.freedesktop.org; Wang, Zhi A;
> >>zhen...@linux.intel.com; linux-ker...@vger.kernel.org;
> >>Lv, Zhiyuan; Tian, Kevin
> >>Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the
> >>dmabuf
> >>
> >>On Fr, 2017-04-28 at 17:35 +0800, Xiaoguang Chen wrote:
> >>> +static size_t intel_vgpu_reg_rw_gvtg(struct intel_vgpu *vgpu, char
> >>> *buf,
> >>> +		size_t count, loff_t *ppos, bool iswrite) {
> >>> +	unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) -
> >>> +			VFIO_PCI_NUM_REGIONS;
> >>> +	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
> >>> +	int fd;
> >>> +
> >>> +	if (pos >= vgpu->vdev.region[i].size || iswrite) {
> >>> +		gvt_vgpu_err("invalid op or offset for Intel vgpu fd
> >>> region\n");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	fd = anon_inode_getfd("gvtg", _vgpu_gvtg_ops, vgpu,
> >>> +			O_RDWR | O_CLOEXEC);
> >>> +	if (fd < 0) {
> >>> +		gvt_vgpu_err("create intel vgpu fd failed:%d\n", fd);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	count = min(count, (size_t)(vgpu->vdev.region[i].size - pos));
> >>> +	memcpy(buf, , count);
> >>> +
> >>> +	return count;
> >>> +}
> >>
> >>Hmm, that looks like a rather strange way to return a file descriptor.
> >>
> >>What is the reason to not use ioctls on the vfio file handle, like
> >>older version of these patches did?
> >If I understood correctly that Alex prefer not to change the ioctls on the
> >vfio file handle like the old version.
> >So I used this way the smallest change to general vfio framework only adding
> >a subregion definition.

I think I was hoping we could avoid a separate file descriptor
altogether and use a vfio region instead. However, it was explained
previously why this really needs to be a separate fd and I agree that
using a region to expose an fd is really awkward. If we're going to
have a separate fd, let's use a device specific ioctl to get it.
Thanks,

Alex
[Intel-gfx] [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
From: Tvrtko Ursulin

With the addition of __sg_alloc_table_from_pages we can control the
maximum coalescing size and eliminate a separate path for allocating
backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max
segment size") this enables more compact sg lists to be created and so
has a beneficial effect on workloads with many and/or large objects of
this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)

v7: Rebase.

Signed-off-by: Tvrtko Ursulin
Cc: Chris Wilson
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v4)
Cc: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_drv.h         | 15 +++
 drivers/gpu/drm/i915/i915_gem.c         |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b20ed16da0ad..320c16df1c9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2676,6 +2676,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
          (((__iter).curr += PAGE_SIZE) < (__iter).max) || \
          ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))

+static inline unsigned int i915_sg_segment_size(void)
+{
+        unsigned int size = swiotlb_max_segment();
+
+        if (size == 0)
+                return SCATTERLIST_MAX_SEGMENT;
+
+        size = rounddown(size, PAGE_SIZE);
+        /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+        if (size < PAGE_SIZE)
+                size = PAGE_SIZE;
+
+        return size;
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9c6b9b5002c..b2727905ef2b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2336,7 +2336,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
         struct sgt_iter sgt_iter;
         struct page *page;
         unsigned long last_pfn = 0; /* suppress gcc warning */
-        unsigned int max_segment;
+        unsigned int max_segment = i915_sg_segment_size();
         int ret;
         gfp_t gfp;

@@ -2347,10 +2347,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
         GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
         GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

-        max_segment = swiotlb_max_segment();
-        if (!max_segment)
-                max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
         st = kmalloc(sizeof(*st), GFP_KERNEL);
         if (st == NULL)
                 return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 58ccf8b8ca1c..d003076702ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -399,64 +399,42 @@ struct get_pages_work {
         struct task_struct *task;
 };

-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-        struct scatterlist *sg;
-        int ret, n;
-
-        *st = kmalloc(sizeof(**st), GFP_KERNEL);
-        if (*st == NULL)
-                return -ENOMEM;
-
-        if (swiotlb_active()) {
-                ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-                if (ret)
-                        goto err;
-
-                for_each_sg((*st)->sgl, sg, num_pages, n)
-                        sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-        } else {
-                ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-                                                0, num_pages << PAGE_SHIFT,
-                                                GFP_KERNEL);
-                if (ret)
-                        goto err;
-        }
-
-        return 0;
-
-err:
-        kfree(*st);
-        *st = NULL;
-        return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-                             struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+                               struct page **pvec, int num_pages)
 {
-        struct sg_table *pages;
+        unsigned
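
For readers following the i915_sg_segment_size() helper in this patch, here is a minimal userspace sketch of the clamping it performs. It is not the kernel code: the value that swiotlb_max_segment() would return is passed in as a plain parameter, and SKETCH_PAGE_SIZE (assumed 4096) and SKETCH_MAX_SEGMENT are stand-ins invented for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u
/* Stand-in for SCATTERLIST_MAX_SEGMENT: largest page-aligned unsigned int. */
#define SKETCH_MAX_SEGMENT (UINT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Model of the clamping logic: swiotlb_limit plays the role of the value
 * returned by swiotlb_max_segment() (0 meaning "no limit").
 */
static unsigned int sketch_sg_segment_size(unsigned int swiotlb_limit)
{
        unsigned int size = swiotlb_limit;

        if (size == 0)
                return SKETCH_MAX_SEGMENT;

        /* Round down to a whole number of pages. */
        size -= size % SKETCH_PAGE_SIZE;

        /* A limit smaller than one page is treated as exactly one page. */
        if (size < SKETCH_PAGE_SIZE)
                size = SKETCH_PAGE_SIZE;

        return size;
}
```

With these assumptions, a limit of 0 yields the maximum page-aligned segment, a limit of 1 yields one page, and a limit of 10000 is rounded down to 8192.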
[Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
From: Tvrtko Ursulin

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets within a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: Pawel Osciak
Cc: Marek Szyprowski
Cc: Kyungmin Park
Cc: Tomasz Stanislawski
Cc: Matt Porter
Cc: Alexandre Bounine
Cc: linux-me...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Marek Szyprowski (v1)
Reviewed-by: Chris Wilson
Reviewed-by: Mauro Carvalho Chehab
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
 include/linux/scatterlist.h                    | 2 +-
 lib/scatterlist.c                              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 2db0413f5d57..b5009c1649bc 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
         struct vb2_dc_buf *buf;
         struct frame_vector *vec;
-        unsigned long offset;
+        unsigned int offset;
         int n_pages, i;
         int ret = 0;
         struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
         buf->dev = dev;
         buf->dma_dir = dma_dir;

-        offset = vaddr & ~PAGE_MASK;
+        offset = lower_32_bits(offset_in_page(vaddr));
         vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
         if (IS_ERR(vec)) {
                 ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 50b617af81bd..a8b6696ab6cb 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
          * offset within the internal buffer specified by handle parameter.
          */
         if (xfer->loc_addr) {
-                unsigned long offset;
+                unsigned int offset;
                 long pinned;

-                offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+                offset = lower_32_bits(offset_in_page(xfer->loc_addr));
                 nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;

                 page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
         struct page **pages, unsigned int n_pages,
-        unsigned long offset, unsigned long size,
+        unsigned int offset, unsigned long size,
         gfp_t gfp_mask);

 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c6cf82242d65..11f172c383cb 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
         struct page **pages, unsigned int n_pages,
-        unsigned long offset, unsigned long size,
+        unsigned int offset, unsigned long size,
         gfp_t gfp_mask)
 {
         unsigned int chunks;
-- 
2.9.3
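
As a rough illustration of why an unsigned int is wide enough for the offset, the following userspace sketch masks an address down to its in-page offset and derives the number of pages a buffer spans, in the spirit of the rio_mport_cdev hunk. The helper names and SKETCH_PAGE_SIZE (assumed 4096) are made up for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Offset of an address within its page. The result is always smaller than
 * SKETCH_PAGE_SIZE, so it fits in an unsigned int whatever the address width.
 */
static unsigned int sketch_offset_in_page(uintptr_t addr)
{
        return (unsigned int)(addr & (SKETCH_PAGE_SIZE - 1));
}

/* Number of pages touched by a buffer of 'length' bytes starting at 'addr'. */
static unsigned long sketch_pages_spanned(uintptr_t addr, unsigned long length)
{
        unsigned int offset = sketch_offset_in_page(addr);

        return (length + offset + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For example, a 4096-byte buffer starting at 0x12345 has an in-page offset of 0x345 and therefore spans two pages, while the same buffer starting at 0x1000 spans one.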
[Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
From: Tvrtko Ursulin

Drivers like i915 benefit from being able to control the maximum
size of the sg coalesced segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow them that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.
v6: Rebase.

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson
Reviewed-by: Chris Wilson (v2)
Cc: Joonas Lahtinen
---
 include/linux/scatterlist.h | 11 +
 lib/scatterlist.c           | 58 +++--
 2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4768eeeb7054..4d67a9652c7d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
                      struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-        struct page **pages, unsigned int n_pages,
-        unsigned int offset, unsigned long size,
-        gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+                                unsigned int n_pages, unsigned int offset,
+                                unsigned long size, unsigned int max_segment,
+                                gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+                              unsigned int n_pages, unsigned int offset,
+                              unsigned long size, gfp_t gfp_mask);

 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
                       size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index ca4ccd8c80b9..73dace1bd5bb 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);

 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *                             an array of pages
- * @sgt: The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages: Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size: Number of valid bytes in the buffer (after offset)
- * @gfp_mask: GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *                               an array of pages
+ * @sgt:         The sg table header to use
+ * @pages:       Pointer to an array of page pointers
+ * @n_pages:     Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:    GFP allocation mask
  *
  * Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,16 +390,18 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-        struct page **pages, unsigned int n_pages,
-        unsigned int offset, unsigned long size,
-        gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+                                unsigned int n_pages, unsigned int offset,
+                                unsigned long size, unsigned int max_segment,
+                                gfp_t gfp_mask)
 {
-        const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
         unsigned int chunks, cur_page, seg_len, i;
         int ret;
         struct scatterlist *s;

+        if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+                return -EINVAL;
+
         /* compute number of contiguous chunks */
         chunks = 1;
         seg_len = 0;
@@ -440,6 +443,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

         return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *                             an array of pages
+ * @sgt:     The sg table header to use
+ * @pages:   Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:    Number of valid bytes
[Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
From: Tvrtko Ursulin

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
    (Joonas Lahtinen)
v5: Simplified loops and use post-increments rather than
    pre-increments. Use PAGE_MASK and fix comment typo.
    (Andy Shevchenko)

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v2)
Cc: Joonas Lahtinen
Cc: Andy Shevchenko
---
 include/linux/scatterlist.h |  6 ++
 lib/scatterlist.c           | 31 ---
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..4768eeeb7054 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
 };

 /*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * length in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
+
+/*
  * These macros should be used after a dma_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 11f172c383cb..ca4ccd8c80b9 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
         unsigned int offset, unsigned long size,
         gfp_t gfp_mask)
 {
-        unsigned int chunks;
-        unsigned int i;
-        unsigned int cur_page;
+        const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+        unsigned int chunks, cur_page, seg_len, i;
         int ret;
         struct scatterlist *s;

         /* compute number of contiguous chunks */
         chunks = 1;
-        for (i = 1; i < n_pages; ++i)
-                if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
-                        ++chunks;
+        seg_len = 0;
+        for (i = 1; i < n_pages; i++) {
+                seg_len += PAGE_SIZE;
+                if (seg_len >= max_segment ||
+                    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
+                        chunks++;
+                        seg_len = 0;
+                }
+        }

         ret = sg_alloc_table(sgt, chunks, gfp_mask);
         if (unlikely(ret))
@@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
         /* merging chunks and putting them into the scatterlist */
         cur_page = 0;
         for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-                unsigned long chunk_size;
-                unsigned int j;
+                unsigned int j, chunk_size;

                 /* look for the end of the current chunk */
-                for (j = cur_page + 1; j < n_pages; ++j)
-                        if (page_to_pfn(pages[j]) !=
+                seg_len = 0;
+                for (j = cur_page + 1; j < n_pages; j++) {
+                        seg_len += PAGE_SIZE;
+                        if (seg_len >= max_segment ||
+                            page_to_pfn(pages[j]) !=
                             page_to_pfn(pages[j - 1]) + 1)
                                 break;
+                }

                 chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-                sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+                sg_set_page(s, pages[cur_page],
+                            min_t(unsigned long, size, chunk_size), offset);
                 size -= chunk_size;
                 offset = 0;
                 cur_page = j;
-- 
2.9.3
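
The chunk-counting loop in this patch can be tried outside the kernel. Below is a minimal sketch, assuming the pages are represented by a plain array of page frame numbers instead of struct page pointers; sketch_count_chunks and SKETCH_PAGE_SIZE (assumed 4096) are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Count how many scatterlist entries are needed for n_pages pages whose
 * page frame numbers are given in pfns[]. A new entry starts whenever the
 * next page is not physically contiguous or the running segment would
 * reach max_segment bytes.
 */
static unsigned int sketch_count_chunks(const unsigned long *pfns,
                                        size_t n_pages,
                                        unsigned int max_segment)
{
        unsigned int chunks = 1, seg_len = 0;
        size_t i;

        /* max_segment must be a non-zero multiple of the page size. */
        assert(max_segment >= SKETCH_PAGE_SIZE);
        assert(max_segment % SKETCH_PAGE_SIZE == 0);

        for (i = 1; i < n_pages; i++) {
                seg_len += SKETCH_PAGE_SIZE;
                if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
                        chunks++;
                        seg_len = 0;
                }
        }

        return chunks;
}
```

Four contiguous pages collapse into one entry when the limit is large, into two entries with a two-page limit, and into four entries with a one-page limit; a gap in the page frame numbers always starts a new entry.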
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On May 04 2017 or thereabouts, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
>
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
>
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
>
> Cc: Rafael J. Wysocki
> Cc: Mika Westerberg
> Cc: Borislav Petkov
> Cc: Dan Williams
> Cc: Amir Goldstein
> Cc: Jarkko Sakkinen
> Cc: Jani Nikula
> Cc: Ben Skeggs
> Cc: Benjamin Tissoires
> Cc: Joerg Roedel
> Cc: Adrian Hunter
> Cc: Yisen Zhuang
> Cc: Bjorn Helgaas
> Cc: Zhang Rui
> Cc: Felipe Balbi
> Cc: Mathias Nyman
> Cc: Heikki Krogerus
> Cc: Liam Girdwood
> Cc: Mark Brown
> Signed-off-by: Andy Shevchenko
> ---

For i2c-hid:
Acked-by: Benjamin Tissoires
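
The point about using memcmp() on binary data can be shown with a small standalone sketch. It does not reproduce the driver code in question; it only demonstrates, with made-up helper names, that a string comparison stops at the first zero byte of a 16-byte UUID while memcmp() compares all 16 bytes.

```c
#include <assert.h>
#include <string.h>

/* A raw UUID is 16 bytes of binary data, not a NUL-terminated string. */
#define SKETCH_UUID_SIZE 16

/* Compare every byte of the two UUIDs. */
static int sketch_uuid_equal_binary(const unsigned char *a,
                                    const unsigned char *b)
{
        return memcmp(a, b, SKETCH_UUID_SIZE) == 0;
}

/*
 * String-style comparison, shown only to illustrate the pitfall: it stops
 * as soon as both inputs contain a zero byte at the same position.
 */
static int sketch_uuid_equal_as_string(const unsigned char *a,
                                       const unsigned char *b)
{
        return strncmp((const char *)a, (const char *)b,
                       SKETCH_UUID_SIZE) == 0;
}
```

Two UUIDs that share their first bytes, contain a zero byte, and differ only afterwards compare as equal with the string-style helper but as different with the binary one.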
[Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
Hi, Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get a lot of the below warnings. Things seem to work fine (in fact it seems faster in general use than previously), but it's a lot of warning spew. [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under evasion is 100 us [ 1210.063144] [drm] Atomic update on pipe (A) took 152 us, max time under evasion is 100 us [ 1272.208727] [drm] Atomic update on pipe (A) took 213 us, max time under evasion is 100 us [ 1308.106266] [drm] Atomic update on pipe (A) took 194 us, max time under evasion is 100 us [ 1308.439572] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us [ 1371.905950] [drm] Atomic update on pipe (A) took 135 us, max time under evasion is 100 us [ 1373.891378] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us [ 1497.259572] [drm] Atomic update on pipe (A) took 199 us, max time under evasion is 100 us [ 1497.292922] [drm] Atomic update on pipe (A) took 178 us, max time under evasion is 100 us [ 1497.326313] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 1534.106959] [drm] Atomic update on pipe (A) took 223 us, max time under evasion is 100 us [ 1534.190331] [drm] Atomic update on pipe (A) took 180 us, max time under evasion is 100 us [ 1680.613275] [drm] Atomic update on pipe (A) took 101 us, max time under evasion is 100 us [ 1870.783352] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 2338.083752] [drm] Atomic update on pipe (A) took 225 us, max time under evasion is 100 us [ 2405.212252] [drm] Atomic update on pipe (A) took 114 us, max time under evasion is 100 us [ 2421.811125] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2426.344151] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us [ 2439.012088] [drm] Atomic update on pipe (A) took 143 us, max time under evasion is 100 us [ 
2446.011309] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us [ 2446.142622] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2446.542772] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us [ 2448.243922] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2450.042450] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2456.575226] [drm] Atomic update on pipe (A) took 131 us, max time under evasion is 100 us [ 2457.275176] [drm] Atomic update on pipe (A) took 115 us, max time under evasion is 100 us [ 2464.308098] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2569.418646] [drm] Atomic update on pipe (A) took 179 us, max time under evasion is 100 us [ 2572.302065] [drm] Atomic update on pipe (A) took 133 us, max time under evasion is 100 us [ 2589.933225] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2590.701810] [drm] Atomic update on pipe (A) took 175 us, max time under evasion is 100 us [ 2606.732899] [drm] Atomic update on pipe (A) took 130 us, max time under evasion is 100 us [ 2611.732710] [drm] Atomic update on pipe (A) took 147 us, max time under evasion is 100 us [ 2615.532819] [drm] Atomic update on pipe (A) took 145 us, max time under evasion is 100 us [ 2654.412509] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2657.012470] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2714.341971] [drm] Atomic update on pipe (A) took 144 us, max time under evasion is 100 us [ 2775.486168] [drm] Atomic update on pipe (A) took 138 us, max time under evasion is 100 us [ 2782.852360] [drm] Atomic update on pipe (A) took 113 us, max time under evasion is 100 us [ 2795.319781] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 2818.601093] [drm] Atomic update on pipe (A) 
took 160 us, max time under evasion is 100 us [ 2867.998524] [drm] Atomic update on pipe (A) took 167 us, max time under evasion is 100 us [ 2878.980535] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us [ 2945.607547] [drm] Atomic update on pipe (A) took 110 us, max time under evasion is 100 us [ 2957.606588] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=177768 end=177769) time 214 us, min 1431, max 1439, scanline start 1423, end 1442 [ 2958.609128] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2960.059591] [drm] Atomic update on pipe (A) took 186 us, max time under evasion is 100 us [ 2960.658177] [drm] Atomic update on pipe (A) took 181 us, max time under evasion is 100 us [ 3002.688632] [drm] Atomic update on pipe (A) took 210 us, max time under evasion is 100 us [ 3021.939015] [drm] Atomic update on pipe (A) took 140 us, max time under
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]
On Thu, May 04, 2017 at 04:32:34PM +0300, Joonas Lahtinen wrote:
> On Wed, 2017-05-03 at 12:37 +0100, Chris Wilson wrote:
> > Explicitly assign the default priority, and give it a name (macro).
> >
> > Signed-off-by: Chris Wilson
>
> > 	kref_init(&ctx->ref);
> > 	list_add_tail(&ctx->link, &dev_priv->context_list);
> > 	ctx->i915 = dev_priv;
> > +	ctx->priority = I915_PRIORITY_DFL;
>
> I915_PRIORITY_DEFAULT would work better.

On the one hand I have the symmetry with MIN, DFL, MAX, on the other
hand DFL is plain bizarre.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote: > A good default for garbage entries from the user is to follow the > default setting of the object (i.e. the PTE). Currently they use the > uncached entry, and now the only way to accidentally hit uncached > performance is via explicit use of the uncached MOCS or setting the > object to uncached. Note that these entries are currently undefined in > the ABI and we reserve the right to change them. We originally chose > uncached to eliminate any problem with reducing the caching level in > future, but the object is a much better definition of the minimum > caching level. > > Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS") > Signed-off-by: Chris Wilson> Cc: David Weinehall > Cc: Arkadiusz Hiler > Cc: Tvrtko Ursulin > Cc: sta...@vger.kernel.org LGTM, and passes our nightly msdk test case. Tested-by: David Weinehall Reviewed-by: David Weinehall > --- > drivers/gpu/drm/i915/intel_mocs.c | 39 > +++ > 1 file changed, 15 insertions(+), 24 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_mocs.c > b/drivers/gpu/drm/i915/intel_mocs.c > index 92e461c68385..e7a7781ca457 100644 > --- a/drivers/gpu/drm/i915/intel_mocs.c > +++ b/drivers/gpu/drm/i915/intel_mocs.c > @@ -85,10 +85,7 @@ struct drm_i915_mocs_table { > * > * Entries not part of the following tables are undefined as far as > * userspace is concerned and shouldn't be relied upon. For the time > - * being they will be implicitly initialized to the strictest caching > - * configuration (uncached) to guarantee forwards compatibility with > - * userspace programs written against more recent kernels providing > - * additional MOCS entries. > + * being they will be implicitly initialized to follow the PTE. 
> * > * NOTE: These tables MUST start with being uncached and the length > * MUST be less than 63 as the last two registers are reserved > @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs > *engine) > table.table[index].control_value); > > /* > - * Ok, now set the unused entries to uncached. These entries > + * Ok, now set the unused entries to follow the PTE. These entries >* are officially undefined and no contract for the contents >* and settings is given for these entries. > - * > - * Entry 0 in the table is uncached - so we are just writing > - * that value to all the used entries. >*/ > for (; index < GEN9_NUM_MOCS_ENTRIES; index++) > I915_WRITE(mocs_register(engine->id, index), > -table.table[0].control_value); > +table.table[I915_MOCS_PTE].control_value); > > return 0; > } > @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct > drm_i915_gem_request *req, > } > > /* > - * Ok, now set the unused entries to uncached. These entries > + * Ok, now set the unused entries to follow the PTE. These entries >* are officially undefined and no contract for the contents >* and settings is given for these entries. > - * > - * Entry 0 in the table is uncached - so we are just writing > - * that value to all the used entries. >*/ > for (; index < GEN9_NUM_MOCS_ENTRIES; index++) { > *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); > - *cs++ = table->table[0].control_value; > + *cs++ = table->table[I915_MOCS_PTE].control_value; > } > > *cs++ = MI_NOOP; > @@ -355,18 +346,17 @@ static int emit_mocs_l3cc_table(struct > drm_i915_gem_request *req, > if (table->size & 0x01) { > /* Odd table size - 1 left over */ > *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); > - *cs++ = l3cc_combine(table, 2 * i, 0); > + *cs++ = l3cc_combine(table, 2 * i, I915_MOCS_PTE); > i++; > } > > /* > - * Now set the rest of the table to uncached - use entry 0 as > - * this will be uncached. 
Leave the last pair uninitialised as > - * they are reserved by the hardware. > + * Now set the rest of the table to follow the PTE. > + * Leave the last pair as they are reserved by the hardware. >*/ > for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) { > *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); > - *cs++ = l3cc_combine(table, 0, 0); > + *cs++ = l3cc_combine(table, I915_MOCS_PTE, I915_MOCS_PTE); > } > > *cs++ = MI_NOOP; > @@ -402,17 +392,18 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private > *dev_priv) > > /* Odd table size - 1 left over */ > if (table.size &
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote: > On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote: > > Exploit the power-of-two ring size to compute the space across the > > wraparound using a mask rather than a if. Convert to unsigned integers > > so the operation is well defined. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 > > Signed-off-by: Chris Wilson> > Cc: Mika Kuoppala > > Reviewed-by: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++-- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 36 > > - > > 2 files changed, 34 insertions(+), 25 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 3ce1c87dec46..e7ef04cc071b 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -39,12 +39,16 @@ > > */ > > #define LEGACY_REQUEST_SIZE 200 > > > > -static int __intel_ring_space(int head, int tail, int size) > > +static unsigned int __intel_ring_space(unsigned int head, > > + unsigned int tail, > > + unsigned int size) > > { > > - int space = head - tail; > > - if (space <= 0) > > - space += size; > > - return space - I915_RING_FREE_SPACE; > > + /* > > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the > > +* same cacheline, the Head Pointer must not be greater than the Tail > > +* Pointer." > > +*/ > > + return (head - tail - CACHELINE_BYTES) & (size - 1); > > Btw, as you exploit power-of-two ring size here, maybe it is worth to repeat > > GEM_BUG_ON(!is_power_of_2(size)); > > to emphase this assumption in the code (not only in the commit message)? I did check we had an is_power_of_2() check in intel_engine_create_ring. Might be worth asserting here as well as there's a little disconnect between the function and ring->size. 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote: > On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote: > > Exploit the power-of-two ring size to compute the space across the > > wraparound using a mask rather than a if. Convert to unsigned integers > > so the operation is well defined. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 > > Signed-off-by: Chris Wilson> > Cc: Mika Kuoppala > > Reviewed-by: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++-- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 36 > > - > > 2 files changed, 34 insertions(+), 25 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 3ce1c87dec46..e7ef04cc071b 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -39,12 +39,16 @@ > > */ > > #define LEGACY_REQUEST_SIZE 200 > > > > -static int __intel_ring_space(int head, int tail, int size) > > +static unsigned int __intel_ring_space(unsigned int head, > > + unsigned int tail, > > + unsigned int size) > > { > > - int space = head - tail; > > - if (space <= 0) > > - space += size; > > - return space - I915_RING_FREE_SPACE; > > + /* > > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the > > +* same cacheline, the Head Pointer must not be greater than the Tail > > +* Pointer." > > +*/ > > + return (head - tail - CACHELINE_BYTES) & (size - 1); > > Btw, as you exploit power-of-two ring size here, maybe it is worth to repeat > > GEM_BUG_ON(!is_power_of_2(size)); > > to emphase this assumption in the code (not only in the commit message)? I've made the cardinal sin of changing it at the last moment, if I've broken everything I'm going to blame you :) Semi-pushed, looks like we're already back in conflict territory. 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> > On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > > Add a bunch of MOCS entries for gen 9 that were missing from
> > > > > intel_mocs.
> > > > > Some of these are used by media-sdk; if these entries are missing
> > > > > the default will instead be to do everything uncached.
> > > > >
> > > > > This patch improves media-sdk performance by up to 60%
> > > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > > testing, without regressing any other benchmarks.
> > > >
> > > > Hey David,
> > > >
> > > > I am testing some of the extended MOCS with Mesa and the differences I
> > > > see fit in the margins of statistical error.
> > > >
> > > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > > everything to UNCACHED - and I saw a severe performance drop.
> > > >
> > > > So here is the question it induced:
> > > >
> > > > Have you used the "closest neighbour" from entries available or did you
> > > > default to the UNCACHED ones? That could be the culprit.
> > > >
> > > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > > few synthetic cases - it will require much more fine-tuning and
> > > > benchmarking before any final conclusions.
> > >
> > > As I mentioned in the commit message, the improvements only manifest
> > > themselves for media-sdk workloads (and presumably other workloads
> > > that use the same hardware); if you see any performance regressions
> > > with these additional entries I'd be interested to know.
> >
> > But what is being counter suggested is that there is no reason for these
> > mocs entries. If the sdk is just using mocs registers without first
> > programming them outside of the kernel abi, then it will be hitting
> > uncached memory - and then the only benefit is from simply enabling
> > cached access. The kernel ABI is minimalist for a reason, and we want to
> > know why we should be adding tables that we need to maintain forever
> > (bonus points for making that a consistent interface for hardware for
> > years to come).
> > -Chris
>
> Thanks for rephrasing - that's exactly what I am concerned with.
>
> Did you just use the MediaSDK as it is - meaning that MOCS entries
> beyond the set of the 3 we have defined had been naively utilized?
>
> If that's the case it is probably the cause of the performance
> difference - everything beyond "the 3" means UNCACHED.
>
> Can you try changing MediaSDK to only use entries that are already in?
> How does the performance differ in that case?

We're benchmarking using upstream MediaSDK without changes,
since that's the only thing that's relevant. Customising benchmarks
to get better results isn't really an acceptable solution :)

Obviously fixing MediaSDK upstream is a different story,
in case one of the three pre-defined entries we have turns out
to be the best possible MOCS-settings for that workload.

Kind regards, David
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote:
> Exploit the power-of-two ring size to compute the space across the
> wraparound using a mask rather than an if. Convert to unsigned integers
> so the operation is well defined.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> Signed-off-by: Chris Wilson
> Cc: Mika Kuoppala
> Reviewed-by: Mika Kuoppala
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
>  2 files changed, 34 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3ce1c87dec46..e7ef04cc071b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -39,12 +39,16 @@
>   */
>  #define LEGACY_REQUEST_SIZE 200
>
> -static int __intel_ring_space(int head, int tail, int size)
> +static unsigned int __intel_ring_space(unsigned int head,
> +                                       unsigned int tail,
> +                                       unsigned int size)
>  {
> -        int space = head - tail;
> -        if (space <= 0)
> -                space += size;
> -        return space - I915_RING_FREE_SPACE;
> +        /*
> +         * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> +         * same cacheline, the Head Pointer must not be greater than the Tail
> +         * Pointer."
> +         */
> +        return (head - tail - CACHELINE_BYTES) & (size - 1);

Btw, as you exploit the power-of-two ring size here, maybe it is worth
repeating

        GEM_BUG_ON(!is_power_of_2(size));

to emphasise this assumption in the code (not only in the commit message)?

-Michal
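The mask trick and the suggested power-of-two check can be tried outside the kernel. What follows is only a userspace sketch, not the driver code: `ring_space_branch`, `ring_space_mask` and the local `is_power_of_2` are stand-in names chosen for illustration, `assert` stands in for `GEM_BUG_ON`, and `CACHELINE_BYTES` is taken as 64 as in the quoted header. As far as I can tell the two forms agree whenever the old, signed result is non-negative; they differ when the old form would have gone negative.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHELINE_BYTES 64u

/* Userspace stand-in for the kernel helper of the same name. */
static bool is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Old form from the patch: branch on wraparound, then reserve a cacheline. */
static int ring_space_branch(int head, int tail, int size)
{
	int space = head - tail;

	if (space <= 0)
		space += size;
	return space - (int)CACHELINE_BYTES;
}

/*
 * New form: unsigned subtraction wraps modulo 2^32, and because size is a
 * power of two, masking with (size - 1) reduces the result modulo size.
 */
static unsigned int ring_space_mask(unsigned int head, unsigned int tail,
				    unsigned int size)
{
	assert(is_power_of_2(size)); /* the assumption Michal asks to spell out */
	return (head - tail - CACHELINE_BYTES) & (size - 1);
}
```

With a 4096-byte ring, an empty ring (head == tail) reports 4032 bytes in both forms, one cacheline short of the full size.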
Re: [Intel-gfx] [PATCH 8/9] drm/i915: Stop inlining the execlists IRQ handler
Chris Wilson writes:

> As the handler is now quite complex, involving a few atomics, the cost
> of the function preamble is negligible in comparison and so we should
> leave the function out-of-line for better I$.
>
> Signed-off-by: Chris Wilson

Reviewed-by: Mika Kuoppala

> ---
>  drivers/gpu/drm/i915/i915_irq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 86ede88daaab..8f60c8045b3e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1353,7 +1353,7 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
>                  ivybridge_parity_error_irq_handler(dev_priv, gt_iir);
>  }
>
> -static __always_inline void
> +static void
>  gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  {
>          bool tasklet = false;
> --
> 2.11.0
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 4, 2017 at 4:21 AM, Andy Shevchenkowrote: > acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16 > bytes. Instead we convert them to use uuid_le type. At the same time we > convert current users. > > acpi_str_to_uuid() becomes useless after the conversion and it's safe to > get rid of it. > > The conversion fixes a potential bug in int340x_thermal as well since > we have to use memcmp() on binary data. > > Cc: Rafael J. Wysocki > Cc: Mika Westerberg > Cc: Borislav Petkov > Cc: Dan Williams > Cc: Amir Goldstein > Cc: Jarkko Sakkinen > Cc: Jani Nikula > Cc: Ben Skeggs > Cc: Benjamin Tissoires > Cc: Joerg Roedel > Cc: Adrian Hunter > Cc: Yisen Zhuang > Cc: Bjorn Helgaas > Cc: Zhang Rui > Cc: Felipe Balbi > Cc: Mathias Nyman > Cc: Heikki Krogerus > Cc: Liam Girdwood > Cc: Mark Brown > Signed-off-by: Andy Shevchenko For the drivers/pci parts: Acked-by: Bjorn Helgaas > --- > drivers/acpi/acpi_extlog.c | 10 +++--- > drivers/acpi/bus.c | 29 ++-- > drivers/acpi/nfit/core.c | 40 > +++--- > drivers/acpi/nfit/nfit.h | 3 +- > drivers/acpi/utils.c | 4 +-- > drivers/char/tpm/tpm_crb.c | 9 +++-- > drivers/char/tpm/tpm_ppi.c | 20 +-- > drivers/gpu/drm/i915/intel_acpi.c | 14 +++- > drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +-- > drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c | 9 +++-- > drivers/hid/i2c-hid/i2c-hid.c | 9 +++-- > drivers/iommu/dmar.c | 11 +++--- > drivers/mmc/host/sdhci-pci-core.c | 9 +++-- > drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 > drivers/pci/pci-acpi.c | 11 +++--- > drivers/pci/pci-label.c| 4 +-- > drivers/thermal/int340x_thermal/int3400_thermal.c | 8 ++--- > drivers/usb/dwc3/dwc3-pci.c| 6 ++-- > drivers/usb/host/xhci-pci.c| 9 +++-- > drivers/usb/misc/ucsi.c| 2 +- > drivers/usb/typec/typec_wcove.c| 4 +-- > include/acpi/acpi_bus.h| 9 ++--- > include/linux/acpi.h | 4 +-- > include/linux/pci-acpi.h | 2 +- > sound/soc/intel/skylake/skl-nhlt.c | 7 ++-- > tools/testing/nvdimm/test/iomap.c | 2 +- > 
tools/testing/nvdimm/test/nfit.c | 2 +- > 27 files changed, 116 insertions(+), 156 deletions(-) > > diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c > index 502ea4dc2080..69d6140b6afa 100644 > --- a/drivers/acpi/acpi_extlog.c > +++ b/drivers/acpi/acpi_extlog.c > @@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, > unsigned long val, > > static bool __init extlog_get_l1addr(void) > { > - u8 uuid[16]; > + uuid_le uuid; > acpi_handle handle; > union acpi_object *obj; > > - acpi_str_to_uuid(extlog_dsm_uuid, uuid); > - > + if (uuid_le_to_bin(extlog_dsm_uuid, )) > + return false; > if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", ))) > return false; > - if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << > EXTLOG_FN_ADDR)) > + if (!acpi_check_dsm(handle, , EXTLOG_DSM_REV, 1 << > EXTLOG_FN_ADDR)) > return false; > - obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV, > + obj = acpi_evaluate_dsm_typed(handle, , EXTLOG_DSM_REV, > EXTLOG_FN_ADDR, NULL, > ACPI_TYPE_INTEGER); > if (!obj) { > return false; > diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c > index 784bda663d16..e8130a4873e9 100644 > --- a/drivers/acpi/bus.c > +++ b/drivers/acpi/bus.c > @@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle, > pr_debug("\n"); > } > > -acpi_status acpi_str_to_uuid(char *str, u8 *uuid) > -{ > - int i; > - static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21, > - 24, 26, 28, 30, 32, 34}; > - > - if (strlen(str) != 36) > - return
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()
== Series Details ==

Series: series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()
URL   : https://patchwork.freedesktop.org/series/23958/
State : success

== Summary ==

Series 23958v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23958/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-snb-2600) fdo#100125
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (fi-byt-j1900) fdo#100652

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100652 https://bugs.freedesktop.org/show_bug.cgi?id=100652

fi-bdw-5557u     total:278  pass:267  dwarn:0  dfail:0  fail:0  skip:11  time:436s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8  dfail:0  fail:0  skip:14  time:429s
fi-bsw-n3050     total:278  pass:242  dwarn:0  dfail:0  fail:0  skip:36  time:576s
fi-bxt-j4205     total:278  pass:259  dwarn:0  dfail:0  fail:0  skip:19  time:506s
fi-bxt-t5700     total:278  pass:258  dwarn:0  dfail:0  fail:0  skip:20  time:568s
fi-byt-j1900     total:278  pass:254  dwarn:0  dfail:0  fail:0  skip:24  time:496s
fi-byt-n2820     total:278  pass:250  dwarn:0  dfail:0  fail:0  skip:28  time:483s
fi-elk-e7500     total:278  pass:221  dwarn:0  dfail:0  fail:0  skip:57  time:407s
fi-hsw-4770      total:278  pass:262  dwarn:0  dfail:0  fail:0  skip:16  time:416s
fi-hsw-4770r     total:278  pass:262  dwarn:0  dfail:0  fail:0  skip:16  time:402s
fi-ilk-650       total:278  pass:228  dwarn:0  dfail:0  fail:0  skip:50  time:414s
fi-ivb-3520m     total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:495s
fi-ivb-3770      total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:565s
fi-skl-6260u     total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:452s
fi-skl-6700hq    total:278  pass:261  dwarn:0  dfail:0  fail:0  skip:17  time:583s
fi-skl-6700k     total:278  pass:256  dwarn:4  dfail:0  fail:0  skip:18  time:461s
fi-skl-6770hq    total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:489s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0  dfail:0  fail:0  skip:13  time:429s
fi-snb-2520m     total:278  pass:250  dwarn:0  dfail:0  fail:0  skip:28  time:535s
fi-snb-2600      total:278  pass:248  dwarn:1  dfail:0  fail:0  skip:29  time:415s

1fbac016c8f2c9d4405111f3425f778d2ecdea62 drm-tip: 2017y-05m-04d-12h-52m-01s UTC integration manifest
f1c0df1 drm/i915: Micro-optimise hotpath through intel_ring_begin()
03ee0e5 drm/i915: Report the ring->space from intel_ring_update_space()
eca45ee drm/i915: Avoid the branch in computing intel_ring_space()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4622/
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]
On Wed, 2017-05-03 at 12:37 +0100, Chris Wilson wrote:
> Explicitly assign the default priority, and give it a name (macro).
>
> Signed-off-by: Chris Wilson
>          kref_init(>ref);
>          list_add_tail(>link, _priv->context_list);
>          ctx->i915 = dev_priv;
> +        ctx->priority = I915_PRIORITY_DFL;

I915_PRIORITY_DEFAULT would work better.

Reviewed-by: Joonas Lahtinen

Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
Re: [Intel-gfx] [PATCH 5/5] drm/vblank: Lock down vblank->hwmode more
On Wed, May 03, 2017 at 05:09:08PM +0300, Ville Syrjälä wrote:
> On Wed, May 03, 2017 at 09:26:38AM +0200, Daniel Vetter wrote:
> > In the previous patch we've implemented hwmode tracking a la i915 for
> > the vblank timestamp calculations. But that was just the basic
> > semantics, i915 has some nice sanity checks to make sure we keep
> > getting this right. Move them over too.
> >
> > Cc: Ville Syrjälä
> > Reviewed-by: Neil Armstrong
> > Signed-off-by: Daniel Vetter
> > ---
> >  drivers/gpu/drm/drm_irq.c            | 8 +++-
> >  drivers/gpu/drm/i915/i915_irq.c      | 10 ++
> >  drivers/gpu/drm/i915/intel_display.c | 11 ++-
> >  3 files changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index 89f0928b042a..942183a2aa3c 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -775,8 +775,10 @@ bool drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,
> >          /* If mode timing undefined, just return as no-op:
> >           * Happens during initial modesetting of a crtc.
> >           */
> > -        if (mode->crtc_clock == 0) {
> > +        if (WARN_ON(mode->crtc_clock == 0)) {
> >                  DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n", pipe);
> > +                WARN_ON(drm_drv_uses_atomic_modeset(dev));
>
> I would make these _ONCE() otherwise the machine might end up
> practically dead.

Will do.

> > +
> >                  return false;
> >          }
> >
> > @@ -1338,6 +1340,10 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
> >                  send_vblank_event(dev, e, seq, );
> >          }
> >          spin_unlock_irqrestore(>event_lock, irqflags);
> > +
> > +        /* Will be reset by the modeset helpers when re-enabling the crtc by
> > +         * calling drm_calc_timestamping_constants(). */
> > +        vblank->hwmode.crtc_clock = 0;
> >  }
> >  EXPORT_SYMBOL(drm_crtc_vblank_off);
>
> Shouldn't we do this in drm_crtc_vblank_reset() as well?
>
> Hmm. Except we call that after drm_calc_timestamping_constants(). I
> guess we should be able to move the reset() into
> intel_modeset_readout_hw_state(). And possibly move the vblank_on()
> call as well?

Yeah, it'd be nice to clean this stuff up some more, but there's also the
problem that legacy and new drivers call drm_calc_timestamping_constants
at opposite ends of the modeset sequence. Doing more here is a bunch more
work, maybe for the next patch series ...

I don't think we need to call it in _reset, at least at boot-up it should
be 0 already. And for s/r we already shut down the pipe on suspend, so
it's gone through this here.

With the _ONCE nit addressed (and the build breakage I've introduced in
this version fixed), ack from you on the entire series?

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
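To make Ville's concern concrete: a check that fires on every vblank timestamp query would flood the log, whereas a "once" variant reports the first hit and then stays quiet while still returning the condition. The following is a userspace sketch of that behaviour only. `warn_on_once` and `warnings_printed` are hypothetical names for illustration; the kernel's WARN_ON_ONCE() is a macro with a per-call-site static flag, which is replaced here by an explicit flag pointer to keep the example portable C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually emitted (for demonstration). */
static int warnings_printed;

/*
 * Report a condition at most once per flag, but always return the
 * condition so callers can still branch on it, as with WARN_ON().
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	assert(warned != NULL);

	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
		warnings_printed++;
	}
	return cond;
}
```

Calling this a thousand times with a true condition and the same flag prints a single warning, yet every call still returns true, so the early `return false` path in the patch would keep working.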
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
On Thu, 04 May 2017, Michal Wajdeczkowrote: > We are using some scratch registers in MMIO based send function. > Make their base and count flexible in preparation of upcoming > GuC firmware/hardware changes. While around, change cmd len > parameter verification from WARN_ON to GEM_BUG_ON as we don't > need this all the time. I'm not generally fond of caching the registers like this or adding _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here and there, but here it's hard to see the rationale because you do this in preparation for something that we you're not sharing. BR, Jani. > > v2: call out WARN/GEM_BUG change in the commit msg (Daniele) > > Signed-off-by: Michal Wajdeczko > Suggested-by: Daniele Ceraolo Spurio > Cc: Daniele Ceraolo Spurio > Cc: Joonas Lahtinen > Reviewed-by: Daniele Ceraolo Spurio > --- > drivers/gpu/drm/i915/intel_uc.c | 41 > ++--- > drivers/gpu/drm/i915/intel_uc.h | 7 +++ > 2 files changed, 41 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c > index 72f49e6..9d11c42 100644 > --- a/drivers/gpu/drm/i915/intel_uc.c > +++ b/drivers/gpu/drm/i915/intel_uc.c > @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv) > __intel_uc_fw_fini(_priv->huc.fw); > } > > +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i) > +{ > + GEM_BUG_ON(!guc->send_regs.base); > + GEM_BUG_ON(!guc->send_regs.count); > + GEM_BUG_ON(i >= guc->send_regs.count); > + > + return _MMIO(guc->send_regs.base + 4 * i); > +} > + > +static void guc_init_send_regs(struct intel_guc *guc) > +{ > + struct drm_i915_private *dev_priv = guc_to_i915(guc); > + enum forcewake_domains fw_domains = 0; > + u32 i; > + > + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0)); > + guc->send_regs.count = SOFT_SCRATCH_COUNT - 1; > + > + for (i = 0; i < guc->send_regs.count; i++) { > + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, > + guc_send_reg(guc, i), > + 
FW_REG_READ | FW_REG_WRITE); > + } > + guc->send_regs.fw_domains = fw_domains; > +} > + > static int guc_enable_communication(struct intel_guc *guc) > { > /* XXX: placeholder for alternate setup */ > + guc_init_send_regs(guc); > guc->send = intel_guc_send_mmio; > return 0; > } > @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > u32 *action, u32 len) > int i; > int ret; > > - if (WARN_ON(len < 1 || len > 15)) > - return -EINVAL; > + GEM_BUG_ON(!len); > + GEM_BUG_ON(len > guc->send_regs.count); > > mutex_lock(>send_mutex); > - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER); > + intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains); > > dev_priv->guc.action_count += 1; > dev_priv->guc.action_cmd = action[0]; > > for (i = 0; i < len; i++) > - I915_WRITE(SOFT_SCRATCH(i), action[i]); > + I915_WRITE(guc_send_reg(guc, i), action[i]); > > - POSTING_READ(SOFT_SCRATCH(i - 1)); > + POSTING_READ(guc_send_reg(guc, i - 1)); > > intel_guc_notify(guc); > > @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *action, u32 len) >* Fast commands should still complete in 10us. 
>*/ > ret = __intel_wait_for_register_fw(dev_priv, > -SOFT_SCRATCH(0), > +guc_send_reg(guc, 0), > INTEL_GUC_RECV_MASK, > INTEL_GUC_RECV_MASK, > 10, 10, ); > @@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *action, u32 len) > } > dev_priv->guc.action_status = status; > > - intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER); > + intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains); > mutex_unlock(>send_mutex); > > return ret; > diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h > index 097289b..a37a8cc 100644 > --- a/drivers/gpu/drm/i915/intel_uc.h > +++ b/drivers/gpu/drm/i915/intel_uc.h > @@ -205,6 +205,13 @@ struct intel_guc { > uint64_t submissions[I915_NUM_ENGINES]; > uint32_t last_seqno[I915_NUM_ENGINES]; > > + /* GuC's FW specific registers used in MMIO send */ > + struct { > + u32 base; > + u32 count; > + u32 fw_domains; /* enum forcewake_domains */ > + } send_regs; > + > /* To serialize the intel_guc_send
Re: [Intel-gfx] [PATCH 33/67] drm/i915: Configure DPLL's for Cannonlake
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote: > From: "Kahola, Mika"> > DPLL's are defined in DPCLKA_CFGCR0 register (0x6C200). Let's use these > definitions when computing dpll's for ddi ports. > > v2: (Rodrigo) Remove register that was defined in another patch with > fixed name and more bits. > > Signed-off-by: Kahola, Mika > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/intel_display.c | 20 +++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index 87d2822..4d0ae98 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -8850,6 +8850,22 @@ static int haswell_crtc_compute_clock(struct > intel_crtc *crtc, > return 0; > } > > +static void cannonlake_get_ddi_pll(struct drm_i915_private *dev_priv, > +enum port port, > +struct intel_crtc_state *pipe_config) > +{ > + enum intel_dpll_id id; > + u32 temp; > + > + temp = I915_READ(DPCLKA_CFGCR0) & DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port); > + id = temp >> (port * 2); Maybe use DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT which was defined in the previous patch? 
Also, might make sense to squash this with the next patch, but anyway, Reviewed-by: Ander Conselvan de Oliveira > + > + if (WARN_ON(id < SKL_DPLL0 || id > SKL_DPLL2)) > + return; > + > + pipe_config->shared_dpll = intel_get_shared_dpll_by_id(dev_priv, id); > +} > + > static void bxt_get_ddi_pll(struct drm_i915_private *dev_priv, > enum port port, > struct intel_crtc_state *pipe_config) > @@ -9037,7 +9053,9 @@ static void haswell_get_ddi_port_state(struct > intel_crtc *crtc, > > port = (tmp & TRANS_DDI_PORT_MASK) >> TRANS_DDI_PORT_SHIFT; > > - if (IS_GEN9_BC(dev_priv)) > + if (IS_CANNONLAKE(dev_priv)) > + cannonlake_get_ddi_pll(dev_priv, port, pipe_config); > + else if (IS_GEN9_BC(dev_priv)) > skylake_get_ddi_pll(dev_priv, port, pipe_config); > else if (IS_GEN9_LP(dev_priv)) > bxt_get_ddi_pll(dev_priv, port, pipe_config); ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
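Ander's suggestion amounts to pairing the mask macro with its matching shift macro instead of open-coding `port * 2`. A standalone sketch of that decode is below. The two macro bodies are copied from the DDI - PLL mapping patch as quoted on this list (with a `u` suffix added so the shifts are unsigned); `ddi_clk_sel` is a hypothetical helper name, and the `port < 16` bound is only my assumption to keep the 2-bit field inside a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Two bits of clock select per DDI port, as in the quoted i915_reg.h hunk. */
#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)  (3u << ((port) * 2))
#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port) * 2)

/* Extract the PLL selection for one port from a raw register value. */
static uint32_t ddi_clk_sel(uint32_t cfgcr0, unsigned int port)
{
	assert(port < 16); /* assumption: keep the shift within 32 bits */

	return (cfgcr0 & DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)) >>
	       DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port);
}
```

For a register value of 0x24, this yields selection 0 for port 0, 1 for port 1 and 2 for port 2.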
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, May 04, 2017 at 03:02:07PM +0200, Maarten Lankhorst wrote: > Op 04-05-17 om 14:44 schreef Ville Syrjälä: > > On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote: > >> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote: > >>> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu: > One of the steps for PLL (un)initialization is to (un)map > the correspondent DDI that is actually using that PLL. > > So, let's do this step following the places already stablished > and used so far, although spec put this as part of PLL > initialization sequences. > > v2: Use proper prefix on bits names as suggested by Ander. > v3: Add missed "~". Without that the logic was inverted > so we were disabling interrupts. > Credits-to: Clinton > Credits-to: Art > v4: Spec is getting updated to do DDI -> PLL mapping > and clock on in 2 separated reg writes. (Paulo) > Also update bits definitions to use space > (1 << 1) instead of (1<<1). (Paulo) > > Cc: Paulo Zanoni> Cc: Art Runyan > Cc: Clint Taylor > Cc: Ville Syrjälä > Cc: Kahola, Mika > Cc: Ander Conselvan De Oliveira m> > Signed-off-by: Rodrigo Vivi > Reviewed-by: Kahola, Mika > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/i915_reg.h | 9 + > drivers/gpu/drm/i915/intel_ddi.c | 23 --- > 2 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > b/drivers/gpu/drm/i915/i915_reg.h > index 3cfc65f..dcb8e21 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -8150,6 +8150,15 @@ enum { > #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, > _DPLL1_CFGCR1, _DPLL2_CFGCR1) > #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, > _DPLL1_CFGCR2, _DPLL2_CFGCR2) > > +/* > + * CNL Clocks > + */ > +#define DPCLKA_CFGCR0 _MMIO(0x6C200) > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port)(1 << ((port)+10)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << > ((port)*2)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2) > 
+#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << > ((port)*2)) > + > /* BXT display engine PLL */ > #define BXT_DE_PLL_CTL _MMIO(0x6d000) > #define BXT_DE_PLL_RATIO(x) (x) /* > {60,65,100} * 19.2MHz */ > diff --git a/drivers/gpu/drm/i915/intel_ddi.c > b/drivers/gpu/drm/i915/intel_ddi.c > index 0914ad9..2a901bf 100644 > --- a/drivers/gpu/drm/i915/intel_ddi.c > +++ b/drivers/gpu/drm/i915/intel_ddi.c > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct > intel_encoder *encoder, > { > struct drm_i915_private *dev_priv = to_i915(encoder- > > base.dev); > enum port port = intel_ddi_get_encoder_port(encoder); > +uint32_t val; > > if (WARN_ON(!pll)) > return; > > -if (IS_GEN9_BC(dev_priv)) { > -uint32_t val; > +if (IS_CANNONLAKE(dev_priv)) { > +/* Configure DPCLKA_CFGCR0 to map the DPLL to the > DDI. */ > +val = I915_READ(DPCLKA_CFGCR0); > +val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); > +I915_WRITE(DPCLKA_CFGCR0, val); > >>> A question to the Atomic Lords: don't we need some sort of locking > >>> around this register since it's used by all ports/clocks? I suppose > >>> dev_priv->dpll_lock would do... > >>> > >>> Maybe the same would apply for gen9_bc. > >> If there are modesets happening in parallel for different crtcs, then some > >> locking is needed. dpll_lock seems like the right call, that's what's used > >> to > >> avoid the same problem with the enable/disable hooks. > > If something is allowing modesets to commit in parallel then probably > > the whole world is on fire. Historically connection_mutex has been there > > to protect us, but not sure how that goes with nonblocking commits. I > > do hope there's still something there to prevents this... > > During nonblocking modesets we don't hold any locks. It's still possible > that we force serialization through some other means, for example grabbing > all crtc_states might force serialization previously. But I'm not sure this > is guaranteed to happen even for SKL. 
It might happen for when DDB > allocation or cdclk changes but
[Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
Exploit the power-of-two ring size to compute the space across the wraparound using a mask rather than a if. Convert to unsigned integers so the operation is well defined. References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 Signed-off-by: Chris WilsonCc: Mika Kuoppala Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 36 - 2 files changed, 34 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 3ce1c87dec46..e7ef04cc071b 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -39,12 +39,16 @@ */ #define LEGACY_REQUEST_SIZE 200 -static int __intel_ring_space(int head, int tail, int size) +static unsigned int __intel_ring_space(unsigned int head, + unsigned int tail, + unsigned int size) { - int space = head - tail; - if (space <= 0) - space += size; - return space - I915_RING_FREE_SPACE; + /* +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the +* same cacheline, the Head Pointer must not be greater than the Tail +* Pointer." +*/ + return (head - tail - CACHELINE_BYTES) & (size - 1); } void intel_ring_update_space(struct intel_ring *ring) @@ -1670,12 +1674,9 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes) GEM_BUG_ON(!req->reserved_space); list_for_each_entry(target, >request_list, ring_link) { - unsigned space; - /* Would completion of this request free enough space? 
*/ - space = __intel_ring_space(target->postfix, ring->emit, - ring->size); - if (space >= bytes) + if (bytes <= __intel_ring_space(target->postfix, + ring->emit, ring->size)) break; } @@ -1744,11 +1745,11 @@ u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) } GEM_BUG_ON(ring->emit > ring->size - bytes); + GEM_BUG_ON(ring->space < bytes); cs = ring->vaddr + ring->emit; GEM_DEBUG_EXEC(memset(cs, POISON_INUSE, bytes)); ring->emit += bytes; ring->space -= bytes; - GEM_BUG_ON(ring->space < 0); return cs; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 600713b29d79..650ab884d6c8 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -17,17 +17,6 @@ #define CACHELINE_BYTES 64 #define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t)) -/* - * Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use" - * Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use" - * Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use" - * - * "If the Ring Buffer Head Pointer and the Tail Pointer are on the same - * cacheline, the Head Pointer must not be greater than the Tail - * Pointer." - */ -#define I915_RING_FREE_SPACE 64 - struct intel_hw_status_page { struct i915_vma *vma; u32 *page_addr; @@ -145,9 +134,9 @@ struct intel_ring { u32 tail; u32 emit; - int space; - int size; - int effective_size; + u32 space; + u32 size; + u32 effective_size; }; struct i915_gem_context; @@ -548,6 +537,25 @@ assert_ring_tail_valid(const struct intel_ring *ring, unsigned int tail) */ GEM_BUG_ON(!IS_ALIGNED(tail, 8)); GEM_BUG_ON(tail >= ring->size); + + /* +* "Ring Buffer Use" +* Gen2 BSpec "1. 
Programming Environment" / 1.4.4.6 +* Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5 +* Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5 +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the +* same cacheline, the Head Pointer must not be greater than the Tail +* Pointer." +* +* We use ring->head as the last known location of the actual RING_HEAD, +* it may have advanced but in the worst case it is equally the same +* as ring->head and so we should never program RING_TAIL to advance +* into the same cacheline as ring->head. +*/ +#define cacheline(a) round_down(a, CACHELINE_BYTES) + GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) && + tail < ring->head); +#undef cacheline } static inline unsigned int -- 2.11.0 ___ Intel-gfx mailing list
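The new assertion in assert_ring_tail_valid() encodes the BSpec rule quoted in the patch: within one cacheline, the head must not be ahead of the tail. A small userspace sketch of the same predicate follows. `cacheline_of`, `tail_is_valid` and `assert_tail_valid` are illustrative names, not driver functions, and `assert` stands in for `GEM_BUG_ON`.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHELINE_BYTES 64u

/* Round an offset down to the start of its cacheline. */
static unsigned int cacheline_of(unsigned int offset)
{
	return offset & ~(CACHELINE_BYTES - 1);
}

/*
 * The tail is rejected only when it shares a cacheline with head and
 * sits before it; any other placement passes this particular check.
 */
static bool tail_is_valid(unsigned int tail, unsigned int head)
{
	return !(cacheline_of(tail) == cacheline_of(head) && tail < head);
}

/* Stand-in for the GEM_BUG_ON() form used in the patch. */
static void assert_tail_valid(unsigned int tail, unsigned int head)
{
	assert(tail_is_valid(tail, head));
}
```

For example, a tail of 72 with a head of 100 fails the check (same cacheline, tail behind head), while a tail of 56 with a head of 72 passes because the two offsets lie in different cachelines.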
[Intel-gfx] [CI 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Typically, there is space available within the ring and if not we have to wait (by definition a slow path). Rearrange the code to reduce the number of branches and stack size for the hotpath, accomodating a slight growth for the wait. v2: Fix the new assert that packets are not larger than the actual ring. v3: Make the parameters unsigned as well to make usage. Signed-off-by: Chris WilsonReviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/intel_ringbuffer.c | 67 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +- 2 files changed, 38 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 47f144b1e3fa..8b427a6151b2 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1655,7 +1655,8 @@ static int ring_request_alloc(struct drm_i915_gem_request *request) return 0; } -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) +static noinline int wait_for_space(struct drm_i915_gem_request *req, + unsigned int bytes) { struct intel_ring *ring = req->ring; struct drm_i915_gem_request *target; @@ -1700,52 +1701,56 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes) return 0; } -u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) +u32 *intel_ring_begin(struct drm_i915_gem_request *req, + unsigned int num_dwords) { struct intel_ring *ring = req->ring; - int remain_actual = ring->size - ring->emit; - int remain_usable = ring->effective_size - ring->emit; - int bytes = num_dwords * sizeof(u32); - int total_bytes, wait_bytes; - bool need_wrap = false; + const unsigned int remain_usable = ring->effective_size - ring->emit; + const unsigned int bytes = num_dwords * sizeof(u32); + unsigned int need_wrap = 0; + unsigned int total_bytes; u32 *cs; total_bytes = bytes + req->reserved_space; + GEM_BUG_ON(total_bytes > ring->effective_size); - if (unlikely(bytes > remain_usable)) { - /* -* Not enough space 
for the basic request. So need to flush -* out the remainder and then wait for base + reserved. -*/ - wait_bytes = remain_actual + total_bytes; - need_wrap = true; - } else if (unlikely(total_bytes > remain_usable)) { - /* -* The base request will fit but the reserved space -* falls off the end. So we don't need an immediate wrap -* and only need to effectively wait for the reserved -* size space from the start of ringbuffer. -*/ - wait_bytes = remain_actual + req->reserved_space; - } else { - /* No wrapping required, just waiting. */ - wait_bytes = total_bytes; + if (unlikely(total_bytes > remain_usable)) { + const int remain_actual = ring->size - ring->emit; + + if (bytes > remain_usable) { + /* +* Not enough space for the basic request. So need to +* flush out the remainder and then wait for +* base + reserved. +*/ + total_bytes += remain_actual; + need_wrap = remain_actual | 1; + } else { + /* +* The base request will fit but the reserved space +* falls off the end. So we don't need an immediate +* wrap and only need to effectively wait for the +* reserved size from the start of ringbuffer. +*/ + total_bytes = req->reserved_space + remain_actual; + } } - if (wait_bytes > ring->space) { - int ret = wait_for_space(req, wait_bytes); + if (unlikely(total_bytes > ring->space)) { + int ret = wait_for_space(req, total_bytes); if (unlikely(ret)) return ERR_PTR(ret); } if (unlikely(need_wrap)) { - GEM_BUG_ON(remain_actual > ring->space); - GEM_BUG_ON(ring->emit + remain_actual > ring->size); + need_wrap &= ~1; + GEM_BUG_ON(need_wrap > ring->space); + GEM_BUG_ON(ring->emit + need_wrap > ring->size); /* Fill the tail with MI_NOOP */ - memset(ring->vaddr + ring->emit, 0, remain_actual); + memset(ring->vaddr + ring->emit, 0, need_wrap); ring->emit = 0; - ring->space -= remain_actual; + ring->space -=
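The reworked space accounting in intel_ring_begin() can be followed in isolation. The sketch below mirrors only the arithmetic visible in the diff above; the waiting, the MI_NOOP fill and the pointer updates are left out. `struct ring_sketch`, `struct begin_plan` and `plan_ring_begin` are hypothetical names. My reading, which the patch does not state, is that `remain_actual | 1` keeps need_wrap non-zero even when the remainder is 0, and that the low bit is free because the offsets are aligned.

```c
#include <assert.h>

struct ring_sketch {
	unsigned int size;           /* full ring size in bytes */
	unsigned int effective_size; /* usable size, at most size */
	unsigned int emit;           /* current write offset */
};

struct begin_plan {
	unsigned int total_bytes; /* space that must be free before emitting */
	unsigned int wrap_bytes;  /* padding to emit before wrapping to 0 */
	int wrap;                 /* non-zero if an immediate wrap is needed */
};

static struct begin_plan plan_ring_begin(const struct ring_sketch *ring,
					 unsigned int bytes,
					 unsigned int reserved_space)
{
	const unsigned int remain_usable = ring->effective_size - ring->emit;
	unsigned int need_wrap = 0;
	unsigned int total_bytes = bytes + reserved_space;
	struct begin_plan plan;

	/* Mirrors GEM_BUG_ON(total_bytes > ring->effective_size). */
	assert(total_bytes <= ring->effective_size);

	if (total_bytes > remain_usable) {
		const unsigned int remain_actual = ring->size - ring->emit;

		if (bytes > remain_usable) {
			/* Request itself does not fit: pad out and wrap. */
			total_bytes += remain_actual;
			need_wrap = remain_actual | 1; /* low bit is the flag */
		} else {
			/* Only the reserved space falls off the end. */
			total_bytes = reserved_space + remain_actual;
		}
	}

	plan.total_bytes = total_bytes;
	plan.wrap = need_wrap != 0;
	plan.wrap_bytes = need_wrap & ~1u; /* strip the flag bit again */
	return plan;
}
```

With size 4096, effective_size 4032 and emit 4000, a 64-byte request with 16 reserved bytes needs a wrap with 96 bytes of padding and a total of 176 bytes; at emit 3968 the same request fits without wrapping but must wait for 144 bytes.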
[Intel-gfx] [CI 2/3] drm/i915: Report the ring->space from intel_ring_update_space()
Some callers immediately want to know the current ring->space after
calling intel_ring_update_space(), which we can freely provide via the
return parameter.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 12
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e7ef04cc071b..47f144b1e3fa 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -51,9 +51,14 @@ static unsigned int __intel_ring_space(unsigned int head,
 	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
-void intel_ring_update_space(struct intel_ring *ring)
+unsigned int intel_ring_update_space(struct intel_ring *ring)
 {
-	ring->space = __intel_ring_space(ring->head, ring->emit, ring->size);
+	unsigned int space;
+
+	space = __intel_ring_space(ring->head, ring->emit, ring->size);
+
+	ring->space = space;
+	return space;
 }
 
 static int
@@ -1658,8 +1663,7 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
 
 	lockdep_assert_held(&req->i915->drm.struct_mutex);
 
-	intel_ring_update_space(ring);
-	if (ring->space >= bytes)
+	if (intel_ring_update_space(ring) >= bytes)
 		return 0;
 
 	/*
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 650ab884d6c8..3e343b09eeb6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -486,7 +486,7 @@ int intel_ring_pin(struct intel_ring *ring,
 		   struct drm_i915_private *i915,
 		   unsigned int offset_bias);
 void intel_ring_reset(struct intel_ring *ring, u32 tail);
-void intel_ring_update_space(struct intel_ring *ring);
+unsigned int intel_ring_update_space(struct intel_ring *ring);
 void intel_ring_unpin(struct intel_ring *ring);
 void intel_ring_free(struct intel_ring *ring);
 
-- 
2.11.0
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
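For readers following along outside the kernel tree, the change above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions: `struct ring`, `ring_space()`, `ring_update_space()` and `ring_has_room()` are hypothetical stand-ins rather than the i915 functions, `CACHELINE_BYTES` is assumed to be 64, and the ring size is assumed to be a power of two so that the mask works.

```c
#include <assert.h>

#define CACHELINE_BYTES 64u /* assumed value for this sketch */

/* Hypothetical stand-in for the driver's ring bookkeeping. */
struct ring {
	unsigned int head;  /* where the consumer reads from */
	unsigned int emit;  /* where the producer writes next */
	unsigned int size;  /* total bytes, must be a power of two */
	unsigned int space; /* cached free space */
};

/* Free bytes between emit and head, keeping one cacheline in reserve. */
static unsigned int ring_space(unsigned int head, unsigned int tail,
			       unsigned int size)
{
	return (head - tail - CACHELINE_BYTES) & (size - 1);
}

/* Refresh the cached value and hand it back to the caller. */
static unsigned int ring_update_space(struct ring *ring)
{
	unsigned int space = ring_space(ring->head, ring->emit, ring->size);

	ring->space = space;
	return space;
}

/* Caller pattern from the patch: update and compare in one expression. */
static int ring_has_room(struct ring *ring, unsigned int bytes)
{
	return ring_update_space(ring) >= bytes;
}
```

Returning the value saves the caller a second load of `ring->space` right after the update, which is the whole point of the patch.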
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
On Thu, May 04, 2017 at 03:59:05PM +0300, Mika Kuoppala wrote:
> Chris Wilson writes:
>
> > On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson writes:
> >>
> >> > Typically, there is space available within the ring and if not we have
> >> > to wait (by definition a slow path). Rearrange the code to reduce the
> >> > number of branches and stack size for the hotpath, accommodating a slight
> >> > growth for the wait.
> >> >
> >> > v2: Fix the new assert that packets are not larger than the actual ring.
> >> >
> >> > Signed-off-by: Chris Wilson
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 +
> >> >  1 file changed, 33 insertions(+), 30 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > index c46e5439d379..53123c1cfcc5 100644
> >> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct drm_i915_gem_request *request)
> >> >  	return 0;
> >> >  }
> >> >
> >> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >> >  {
> >> >  	struct intel_ring *ring = req->ring;
> >> >  	struct drm_i915_gem_request *target;
> >> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> >> >  {
> >> >  	struct intel_ring *ring = req->ring;
> >> > -	int remain_actual = ring->size - ring->emit;
> >> > -	int remain_usable = ring->effective_size - ring->emit;
> >> > -	int bytes = num_dwords * sizeof(u32);
> >> > -	int total_bytes, wait_bytes;
> >> > -	bool need_wrap = false;
> >> > +	const unsigned int remain_usable = ring->effective_size - ring->emit;
> >> > +	const unsigned int bytes = num_dwords * sizeof(u32);
> >> > +	unsigned int need_wrap = 0;
> >> > +	unsigned int total_bytes;
> >> >  	u32 *cs;
> >> >
> >> >  	total_bytes = bytes + req->reserved_space;
> >> > +	GEM_BUG_ON(total_bytes > ring->effective_size);
> >> >
> >> > -	if (unlikely(bytes > remain_usable)) {
> >> > -		/*
> >> > -		 * Not enough space for the basic request. So need to flush
> >> > -		 * out the remainder and then wait for base + reserved.
> >> > -		 */
> >> > -		wait_bytes = remain_actual + total_bytes;
> >> > -		need_wrap = true;
> >> > -	} else if (unlikely(total_bytes > remain_usable)) {
> >> > -		/*
> >> > -		 * The base request will fit but the reserved space
> >> > -		 * falls off the end. So we don't need an immediate wrap
> >> > -		 * and only need to effectively wait for the reserved
> >> > -		 * size space from the start of ringbuffer.
> >> > -		 */
> >> > -		wait_bytes = remain_actual + req->reserved_space;
> >> > -	} else {
> >> > -		/* No wrapping required, just waiting. */
> >> > -		wait_bytes = total_bytes;
> >> > +	if (unlikely(total_bytes > remain_usable)) {
> >> > +		const int remain_actual = ring->size - ring->emit;
> >> > +
> >> > +		if (bytes > remain_usable) {
> >> > +			/*
> >> > +			 * Not enough space for the basic request. So need to
> >> > +			 * flush out the remainder and then wait for
> >> > +			 * base + reserved.
> >> > +			 */
> >> > +			total_bytes += remain_actual;
> >> > +			need_wrap = remain_actual | 1;
> >>
> >> Your remain_actual should never reach zero. So in here
> >> forcing the lowest bit on, and later off, seems superfluous.
> >
> > Why can't we fill up to the last byte with commands? remain_actual is
> > just (size - tail) and we don't force a wrap until emit crosses the
> > boundary (and not before). We hit remain_actual == 0 in practice.
> > -Chris
>
> My mistake, was thinking postwrap.
>
> num_dwords and second parameter to wait_for_space should be unsigned.

Your predictive algorithm is working fine though. Applied after your
suggestion from patch 1. Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
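The `remain_actual | 1` exchange above is easier to follow with the encoding pulled out on its own. The fragment below is a minimal sketch, not the driver code: `encode_wrap()`, `wrap_needed()` and `wrap_bytes()` are hypothetical helper names, and it assumes the tail length is always a multiple of four (dword-sized emission), which leaves bit 0 free to act as the "wrap needed" flag.

```c
#include <assert.h>

/*
 * Pack "a wrap is needed" and "how many tail bytes to fill" into one word.
 * Assumption for this sketch: remain_actual is a multiple of 4, so bit 0
 * is never part of the length and can carry the flag.
 */
static unsigned int encode_wrap(unsigned int remain_actual)
{
	return remain_actual | 1;
}

/* Non-zero whenever a wrap was requested, even for a zero-length tail. */
static unsigned int wrap_needed(unsigned int need_wrap)
{
	return need_wrap != 0;
}

/* Strip the flag bit again to recover the number of bytes to fill. */
static unsigned int wrap_bytes(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}
```

Without the forced low bit, a tail length of zero would be indistinguishable from "no wrap requested", which is the case Chris says is hit in practice.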
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On 04-05-17 at 14:44, Ville Syrjälä wrote:
> On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote:
>> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
>>> On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
>>>> One of the steps for PLL (un)initialization is to (un)map
>>>> the correspondent DDI that is actually using that PLL.
>>>>
>>>> So, let's do this step following the places already established
>>>> and used so far, although spec put this as part of PLL
>>>> initialization sequences.
>>>>
>>>> v2: Use proper prefix on bits names as suggested by Ander.
>>>> v3: Add missed "~". Without that the logic was inverted
>>>>     so we were disabling interrupts.
>>>>     Credits-to: Clinton
>>>>     Credits-to: Art
>>>> v4: Spec is getting updated to do DDI -> PLL mapping
>>>>     and clock on in 2 separated reg writes. (Paulo)
>>>>     Also update bits definitions to use space
>>>>     (1 << 1) instead of (1<<1). (Paulo)
>>>>
>>>> Cc: Paulo Zanoni
>>>> Cc: Art Runyan
>>>> Cc: Clint Taylor
>>>> Cc: Ville Syrjälä
>>>> Cc: Kahola, Mika
>>>> Cc: Ander Conselvan De Oliveira
>>>> Signed-off-by: Rodrigo Vivi
>>>> Reviewed-by: Kahola, Mika
>>>> Signed-off-by: Rodrigo Vivi
>>>> ---
>>>>  drivers/gpu/drm/i915/i915_reg.h  |  9 +
>>>>  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
>>>>  2 files changed, 29 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>>>> index 3cfc65f..dcb8e21 100644
>>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>>> @@ -8150,6 +8150,15 @@ enum {
>>>>  #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, _DPLL2_CFGCR1)
>>>>  #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, _DPLL2_CFGCR2)
>>>>
>>>> +/*
>>>> + * CNL Clocks
>>>> + */
>>>> +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
>>>> +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
>>>> +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2))
>>>> +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
>>>> +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2))
>>>> +
>>>>  /* BXT display engine PLL */
>>>>  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
>>>>  #define BXT_DE_PLL_RATIO(x) (x) /* {60,65,100} * 19.2MHz */
>>>> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
>>>> index 0914ad9..2a901bf 100644
>>>> --- a/drivers/gpu/drm/i915/intel_ddi.c
>>>> +++ b/drivers/gpu/drm/i915/intel_ddi.c
>>>> @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
>>>>  {
>>>>  	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
>>>>  	enum port port = intel_ddi_get_encoder_port(encoder);
>>>> +	uint32_t val;
>>>>
>>>>  	if (WARN_ON(!pll))
>>>>  		return;
>>>>
>>>> -	if (IS_GEN9_BC(dev_priv)) {
>>>> -		uint32_t val;
>>>> +	if (IS_CANNONLAKE(dev_priv)) {
>>>> +		/* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
>>>> +		val = I915_READ(DPCLKA_CFGCR0);
>>>> +		val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
>>>> +		I915_WRITE(DPCLKA_CFGCR0, val);
>>> A question to the Atomic Lords: don't we need some sort of locking
>>> around this register since it's used by all ports/clocks? I suppose
>>> dev_priv->dpll_lock would do...
>>>
>>> Maybe the same would apply for gen9_bc.
>> If there are modesets happening in parallel for different crtcs, then some
>> locking is needed. dpll_lock seems like the right call, that's what's used to
>> avoid the same problem with the enable/disable hooks.
> If something is allowing modesets to commit in parallel then probably
> the whole world is on fire. Historically connection_mutex has been there
> to protect us, but not sure how that goes with nonblocking commits. I
> do hope there's still something there to prevent this...

During nonblocking modesets we don't hold any locks. It's still possible
that we force serialization through some other means, for example grabbing
all crtc_states might force serialization previously. But I'm not sure this
is guaranteed to happen even for SKL. It might happen when DDB allocation
or cdclk changes, but there's no guarantee during modeset.

So quite likely you'll need locking here. :)

~Maarten
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
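To make the locking concern concrete, here is a rough userspace sketch of a read-modify-write on a register shared by all ports, serialized with a mutex. Everything in it is a stand-in: `fake_cfgcr0` models the shared register as a plain variable, `fake_dpll_lock` is a pthread mutex standing in for whatever lock the driver would actually take (the thread suggests `dpll_lock`, but that is not confirmed here), and `map_pll_to_port()` is a hypothetical helper.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Stand-in for a register shared by every port (hypothetical). */
static uint32_t fake_cfgcr0;
static pthread_mutex_t fake_dpll_lock = PTHREAD_MUTEX_INITIALIZER;

/* Two select bits per port, laid out as in the quoted macros. */
#define CLK_SEL_MASK(port) (3u << ((port) * 2))
#define CLK_SEL(pll, port) ((uint32_t)(pll) << ((port) * 2))

/*
 * Read-modify-write under the lock. Without it, two ports updating their
 * own fields at the same time could each write back a stale copy and
 * silently drop the other's change.
 */
static void map_pll_to_port(unsigned int pll, unsigned int port)
{
	uint32_t val;

	pthread_mutex_lock(&fake_dpll_lock);
	val = fake_cfgcr0;          /* read */
	val &= ~CLK_SEL_MASK(port); /* clear this port's field only */
	val |= CLK_SEL(pll, port);  /* select the PLL */
	fake_cfgcr0 = val;          /* write back */
	pthread_mutex_unlock(&fake_dpll_lock);
}
```

The sketch only shows the shape of the critical section; which lock is appropriate in the driver, and whether nonblocking commits are serialized elsewhere, is exactly what the thread leaves open.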
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Chris Wilson writes:

> On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
>> Chris Wilson writes:
>>
>> > Typically, there is space available within the ring and if not we have
>> > to wait (by definition a slow path). Rearrange the code to reduce the
>> > number of branches and stack size for the hotpath, accommodating a slight
>> > growth for the wait.
>> >
>> > v2: Fix the new assert that packets are not larger than the actual ring.
>> >
>> > Signed-off-by: Chris Wilson
>> > ---
>> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 +
>> >  1 file changed, 33 insertions(+), 30 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > index c46e5439d379..53123c1cfcc5 100644
>> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct drm_i915_gem_request *request)
>> >  	return 0;
>> >  }
>> >
>> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>> >  {
>> >  	struct intel_ring *ring = req->ring;
>> >  	struct drm_i915_gem_request *target;
>> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>> >  {
>> >  	struct intel_ring *ring = req->ring;
>> > -	int remain_actual = ring->size - ring->emit;
>> > -	int remain_usable = ring->effective_size - ring->emit;
>> > -	int bytes = num_dwords * sizeof(u32);
>> > -	int total_bytes, wait_bytes;
>> > -	bool need_wrap = false;
>> > +	const unsigned int remain_usable = ring->effective_size - ring->emit;
>> > +	const unsigned int bytes = num_dwords * sizeof(u32);
>> > +	unsigned int need_wrap = 0;
>> > +	unsigned int total_bytes;
>> >  	u32 *cs;
>> >
>> >  	total_bytes = bytes + req->reserved_space;
>> > +	GEM_BUG_ON(total_bytes > ring->effective_size);
>> >
>> > -	if (unlikely(bytes > remain_usable)) {
>> > -		/*
>> > -		 * Not enough space for the basic request. So need to flush
>> > -		 * out the remainder and then wait for base + reserved.
>> > -		 */
>> > -		wait_bytes = remain_actual + total_bytes;
>> > -		need_wrap = true;
>> > -	} else if (unlikely(total_bytes > remain_usable)) {
>> > -		/*
>> > -		 * The base request will fit but the reserved space
>> > -		 * falls off the end. So we don't need an immediate wrap
>> > -		 * and only need to effectively wait for the reserved
>> > -		 * size space from the start of ringbuffer.
>> > -		 */
>> > -		wait_bytes = remain_actual + req->reserved_space;
>> > -	} else {
>> > -		/* No wrapping required, just waiting. */
>> > -		wait_bytes = total_bytes;
>> > +	if (unlikely(total_bytes > remain_usable)) {
>> > +		const int remain_actual = ring->size - ring->emit;
>> > +
>> > +		if (bytes > remain_usable) {
>> > +			/*
>> > +			 * Not enough space for the basic request. So need to
>> > +			 * flush out the remainder and then wait for
>> > +			 * base + reserved.
>> > +			 */
>> > +			total_bytes += remain_actual;
>> > +			need_wrap = remain_actual | 1;
>>
>> Your remain_actual should never reach zero. So in here
>> forcing the lowest bit on, and later off, seems superfluous.
>
> Why can't we fill up to the last byte with commands? remain_actual is
> just (size - tail) and we don't force a wrap until emit crosses the
> boundary (and not before). We hit remain_actual == 0 in practice.
> -Chris

My mistake, was thinking postwrap.

num_dwords and second parameter to wait_for_space should be unsigned.

Reviewed-by: Mika Kuoppala
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> One of the steps for PLL (un)initialization is to (un)map
> the correspondent DDI that is actually using that PLL.
>
> So, let's do this step following the places already established
> and used so far, although spec put this as part of PLL
> initialization sequences.
>
> v2: Use proper prefix on bits names as suggested by Ander.
> v3: Add missed "~". Without that the logic was inverted
>     so we were disabling interrupts.
>     Credits-to: Clinton
>     Credits-to: Art
> v4: Spec is getting updated to do DDI -> PLL mapping
>     and clock on in 2 separated reg writes. (Paulo)
>     Also update bits definitions to use space
>     (1 << 1) instead of (1<<1). (Paulo)
>
> Cc: Paulo Zanoni
> Cc: Art Runyan
> Cc: Clint Taylor
> Cc: Ville Syrjälä
> Cc: Kahola, Mika
> Cc: Ander Conselvan De Oliveira
> Signed-off-by: Rodrigo Vivi
> Reviewed-by: Kahola, Mika
> Signed-off-by: Rodrigo Vivi
> ---
>  drivers/gpu/drm/i915/i915_reg.h  |  9 +
>  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
>  2 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 3cfc65f..dcb8e21 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -8150,6 +8150,15 @@ enum {
>  #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, _DPLL2_CFGCR1)
>  #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, _DPLL2_CFGCR2)
>
> +/*
> + * CNL Clocks
> + */
> +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
> +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
> +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2))
> +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
> +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2))
> +
>  /* BXT display engine PLL */
>  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
>  #define BXT_DE_PLL_RATIO(x) (x) /* {60,65,100} * 19.2MHz */
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> index 0914ad9..2a901bf 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
>  {
>  	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
>  	enum port port = intel_ddi_get_encoder_port(encoder);
> +	uint32_t val;
>
>  	if (WARN_ON(!pll))
>  		return;
>
> -	if (IS_GEN9_BC(dev_priv)) {
> -		uint32_t val;
> +	if (IS_CANNONLAKE(dev_priv)) {
> +		/* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
> +		val = I915_READ(DPCLKA_CFGCR0);
> +		val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> +		I915_WRITE(DPCLKA_CFGCR0, val);
>
> +		/*
> +		 * Configure DPCLKA_CFGCR0 to turn on the clock for the DDI.
> +		 * This step and the step before must be done with separate
> +		 * register writes.
> +		 */
> +		val = I915_READ(DPCLKA_CFGCR0);
> +		val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> +			 DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));

		val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);

? Or does clearing the clock select to zero have no effect here?

Ander

> +		I915_WRITE(DPCLKA_CFGCR0, val);
> +	} else if (IS_GEN9_BC(dev_priv)) {
>  		/* DDI -> PLL mapping */
>  		val = I915_READ(DPLL_CTRL2);
>
> @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct intel_encoder *intel_encoder,
>  	if (dig_port)
>  		intel_display_power_put(dev_priv, dig_port->ddi_io_power_domain);
>
> -	if (IS_GEN9_BC(dev_priv))
> +	if (IS_CANNONLAKE(dev_priv))
> +		I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) |
> +			   DPCLKA_CFGCR0_DDI_CLK_OFF(port));
> +	else if (IS_GEN9_BC(dev_priv))
>  		I915_WRITE(DPLL_CTRL2, (I915_READ(DPLL_CTRL2) |
>  					DPLL_CTRL2_DDI_CLK_OFF(port)));
>  	else if (INTEL_GEN(dev_priv) < 9)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
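Ander's question can be checked with plain bit arithmetic. The sketch below replays the two register steps on an ordinary integer; the macro layout is copied from the quoted patch, while the `step_*` helpers are hypothetical names made up for the illustration. It says nothing about how the hardware treats a zero select value, which is the open question in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout as in the quoted patch, with the register prefix dropped. */
#define DDI_CLK_OFF(port)      (1u << ((port) + 10))
#define DDI_CLK_SEL_MASK(port) (3u << ((port) * 2))
#define DDI_CLK_SEL(pll, port) ((uint32_t)(pll) << ((port) * 2))

/* Step 1 from the patch: map the PLL to the DDI. */
static uint32_t step_map(uint32_t reg, unsigned int pll, unsigned int port)
{
	return reg | DDI_CLK_SEL(pll, port);
}

/* Step 2 as posted: clears the clock-off bit and the select field. */
static uint32_t step_clock_on_as_posted(uint32_t reg, unsigned int port)
{
	return reg & ~(DDI_CLK_OFF(port) | DDI_CLK_SEL_MASK(port));
}

/* Step 2 with the select value OR-ed back in, as the reply suggests. */
static uint32_t step_clock_on_reselect(uint32_t reg, unsigned int pll,
				       unsigned int port)
{
	reg &= ~(DDI_CLK_OFF(port) | DDI_CLK_SEL_MASK(port));
	return reg | DDI_CLK_SEL(pll, port);
}
```

With the clock gated and PLL 2 mapped to port 1, the posted second step leaves the select field at zero, whereas the variant with the extra OR keeps the mapping written in step 1.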
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index cbf7763d8091..420d51b286ad 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1808,10 +1808,9 @@ IOMMU_INIT_POST(detect_intel_iommu);
>   * for Directed-IO Architecture Specifiction, Rev 2.2, Section 8.8
>   * "Remapping Hardware Unit Hot Plug".
>   */
> -static u8 dmar_hp_uuid[] = {
> -	/* 0000 */	0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
> -	/* 0008 */	0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
> -};
> +static uuid_le dmar_hp_uuid =
> +	UUID_LE(0xD8C1A3A6, 0xBE9B, 0x4C9B,
> +		0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF);
>
>  /*
>   * Currently there's only one revision and BIOS will not check the revision id,
> @@ -1824,7 +1823,7 @@ static u8 dmar_hp_uuid[] = {
>
>  static inline bool dmar_detect_dsm(acpi_handle handle, int func)
>  {
> -	return acpi_check_dsm(handle, dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << func);
> +	return acpi_check_dsm(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << func);
>  }
>
>  static int dmar_walk_dsm_resource(acpi_handle handle, int func,
> @@ -1843,7 +1842,7 @@ static int dmar_walk_dsm_resource(acpi_handle handle, int func,
>  	if (!dmar_detect_dsm(handle, func))
>  		return 0;
>
> -	obj = acpi_evaluate_dsm_typed(handle, dmar_hp_uuid, DMAR_DSM_REV_ID,
> +	obj = acpi_evaluate_dsm_typed(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID,
>  				      func, NULL, ACPI_TYPE_BUFFER);
>  	if (!obj)
>  		return -ENODEV;

DMAR part is

Acked-by: Joerg Roedel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
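A quick way to convince oneself that the conversion above preserves the byte layout is to expand the little-endian fields by hand. The fragment below is a userspace sketch: `struct my_uuid_le` and `make_uuid_le()` are hypothetical names that imitate what a little-endian UUID initializer does (first three fields stored least significant byte first, last eight bytes stored as given), and they are not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for a 16-byte little-endian UUID. */
struct my_uuid_le {
	uint8_t b[16];
};

/* Store a, b and c least significant byte first, then copy the tail. */
static struct my_uuid_le make_uuid_le(uint32_t a, uint16_t b, uint16_t c,
				      const uint8_t d[8])
{
	struct my_uuid_le u = {{
		a & 0xff, (a >> 8) & 0xff, (a >> 16) & 0xff, (a >> 24) & 0xff,
		b & 0xff, (b >> 8) & 0xff,
		c & 0xff, (c >> 8) & 0xff,
	}};

	memcpy(&u.b[8], d, 8);
	return u;
}

/* The raw byte array removed by the patch. */
static const uint8_t old_dmar_hp_uuid[16] = {
	0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
	0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
};

/* Returns 1 when the field form expands to the same 16 bytes. */
static int dmar_uuid_matches_old_array(void)
{
	static const uint8_t tail[8] = {
		0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
	};
	struct my_uuid_le u = make_uuid_le(0xD8C1A3A6, 0xBE9B, 0x4C9B, tail);

	return memcmp(u.b, old_dmar_hp_uuid, sizeof(u.b)) == 0;
}
```

The byte swap in the first three fields is why `0xA6, 0xA3, 0xC1, 0xD8` in the old array becomes `0xD8C1A3A6` in the new initializer.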
[Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
We are using some scratch registers in MMIO based send function.
Make their base and count flexible in preparation of upcoming
GuC firmware/hardware changes. While around, change cmd len
parameter verification from WARN_ON to GEM_BUG_ON as we don't
need this all the time.

v2: call out WARN/GEM_BUG change in the commit msg (Daniele)

Signed-off-by: Michal Wajdeczko
Suggested-by: Daniele Ceraolo Spurio
Cc: Daniele Ceraolo Spurio
Cc: Joonas Lahtinen
Reviewed-by: Daniele Ceraolo Spurio
---
 drivers/gpu/drm/i915/intel_uc.c | 41 ++---
 drivers/gpu/drm/i915/intel_uc.h |  7 +++
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 72f49e6..9d11c42 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv)
 	__intel_uc_fw_fini(&dev_priv->huc.fw);
 }
 
+static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
+{
+	GEM_BUG_ON(!guc->send_regs.base);
+	GEM_BUG_ON(!guc->send_regs.count);
+	GEM_BUG_ON(i >= guc->send_regs.count);
+
+	return _MMIO(guc->send_regs.base + 4 * i);
+}
+
+static void guc_init_send_regs(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	enum forcewake_domains fw_domains = 0;
+	u32 i;
+
+	guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
+	guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
+
+	for (i = 0; i < guc->send_regs.count; i++) {
+		fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
+					guc_send_reg(guc, i),
+					FW_REG_READ | FW_REG_WRITE);
+	}
+	guc->send_regs.fw_domains = fw_domains;
+}
+
 static int guc_enable_communication(struct intel_guc *guc)
 {
 	/* XXX: placeholder for alternate setup */
+	guc_init_send_regs(guc);
 	guc->send = intel_guc_send_mmio;
 	return 0;
 }
@@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	int i;
 	int ret;
 
-	if (WARN_ON(len < 1 || len > 15))
-		return -EINVAL;
+	GEM_BUG_ON(!len);
+	GEM_BUG_ON(len > guc->send_regs.count);
 
 	mutex_lock(&guc->send_mutex);
-	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER);
+	intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = action[0];
 
 	for (i = 0; i < len; i++)
-		I915_WRITE(SOFT_SCRATCH(i), action[i]);
+		I915_WRITE(guc_send_reg(guc, i), action[i]);
 
-	POSTING_READ(SOFT_SCRATCH(i - 1));
+	POSTING_READ(guc_send_reg(guc, i - 1));
 
 	intel_guc_notify(guc);
 
@@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	 * Fast commands should still complete in 10us.
 	 */
 	ret = __intel_wait_for_register_fw(dev_priv,
-					   SOFT_SCRATCH(0),
+					   guc_send_reg(guc, 0),
 					   INTEL_GUC_RECV_MASK,
 					   INTEL_GUC_RECV_MASK,
 					   10, 10, &status);
@@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
+	intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains);
 	mutex_unlock(&guc->send_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 097289b..a37a8cc 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -205,6 +205,13 @@ struct intel_guc {
 	uint64_t submissions[I915_NUM_ENGINES];
 	uint32_t last_seqno[I915_NUM_ENGINES];
 
+	/* GuC's FW specific registers used in MMIO send */
+	struct {
+		u32 base;
+		u32 count;
+		u32 fw_domains; /* enum forcewake_domains */
+	} send_regs;
+
 	/* To serialize the intel_guc_send actions */
 	struct mutex send_mutex;
 
-- 
2.7.4
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
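The base-plus-index addressing introduced by this patch boils down to a few lines of arithmetic. The following is a standalone sketch, not the driver code: `struct send_regs` and `send_reg_offset()` are hypothetical names, and the helper returns 0 for an invalid request so it can be exercised in userspace, whereas the patch itself uses `GEM_BUG_ON` assertions for those cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the base/count pair added to struct intel_guc. */
struct send_regs {
	uint32_t base;  /* MMIO offset of the first scratch register */
	uint32_t count; /* number of consecutive 32-bit registers */
};

/*
 * Offset of scratch register i, or 0 when the block is not initialised
 * or i is out of range. Registers are 4 bytes apart, hence 4 * i.
 */
static uint32_t send_reg_offset(const struct send_regs *regs, uint32_t i)
{
	if (!regs->base || !regs->count || i >= regs->count)
		return 0;

	return regs->base + 4 * i;
}
```

Keeping the base and count in data rather than hard-coding `SOFT_SCRATCH(i)` is what lets a later firmware or hardware revision point the same send path at a different register block.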
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote:
> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
> > On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> > > One of the steps for PLL (un)initialization is to (un)map
> > > the correspondent DDI that is actually using that PLL.
> > >
> > > So, let's do this step following the places already established
> > > and used so far, although spec put this as part of PLL
> > > initialization sequences.
> > >
> > > v2: Use proper prefix on bits names as suggested by Ander.
> > > v3: Add missed "~". Without that the logic was inverted
> > >     so we were disabling interrupts.
> > >     Credits-to: Clinton
> > >     Credits-to: Art
> > > v4: Spec is getting updated to do DDI -> PLL mapping
> > >     and clock on in 2 separated reg writes. (Paulo)
> > >     Also update bits definitions to use space
> > >     (1 << 1) instead of (1<<1). (Paulo)
> > >
> > > Cc: Paulo Zanoni
> > > Cc: Art Runyan
> > > Cc: Clint Taylor
> > > Cc: Ville Syrjälä
> > > Cc: Kahola, Mika
> > > Cc: Ander Conselvan De Oliveira
> > > Signed-off-by: Rodrigo Vivi
> > > Reviewed-by: Kahola, Mika
> > > Signed-off-by: Rodrigo Vivi
> > > ---
> > >  drivers/gpu/drm/i915/i915_reg.h  |  9 +
> > >  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
> > >  2 files changed, 29 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > > index 3cfc65f..dcb8e21 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -8150,6 +8150,15 @@ enum {
> > >  #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, _DPLL2_CFGCR1)
> > >  #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, _DPLL2_CFGCR2)
> > >
> > > +/*
> > > + * CNL Clocks
> > > + */
> > > +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
> > > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
> > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2))
> > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
> > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2))
> > > +
> > >  /* BXT display engine PLL */
> > >  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
> > >  #define BXT_DE_PLL_RATIO(x) (x) /* {60,65,100} * 19.2MHz */
> > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> > > index 0914ad9..2a901bf 100644
> > > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
> > >  {
> > >  	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
> > >  	enum port port = intel_ddi_get_encoder_port(encoder);
> > > +	uint32_t val;
> > >
> > >  	if (WARN_ON(!pll))
> > >  		return;
> > >
> > > -	if (IS_GEN9_BC(dev_priv)) {
> > > -		uint32_t val;
> > > +	if (IS_CANNONLAKE(dev_priv)) {
> > > +		/* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
> > > +		val = I915_READ(DPCLKA_CFGCR0);
> > > +		val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> > > +		I915_WRITE(DPCLKA_CFGCR0, val);
> >
> > A question to the Atomic Lords: don't we need some sort of locking
> > around this register since it's used by all ports/clocks? I suppose
> > dev_priv->dpll_lock would do...
> >
> > Maybe the same would apply for gen9_bc.
>
> If there are modesets happening in parallel for different crtcs, then some
> locking is needed. dpll_lock seems like the right call, that's what's used to
> avoid the same problem with the enable/disable hooks.

If something is allowing modesets to commit in parallel then probably
the whole world is on fire. Historically connection_mutex has been there
to protect us, but not sure how that goes with nonblocking commits. I
do hope there's still something there to prevent this...

>
> Btw, I think this patch shows why something like [1] might be a good idea.
>
> [1] https://patchwork.freedesktop.org/patch/113598/
>
> > >
> > > +		/*
> > > +		 * Configure DPCLKA_CFGCR0 to turn on the clock for the DDI.
> > > +		 * This step and the step before must be done with separate
> > > +		 * register writes.
> > > +		 */
> > > +		val = I915_READ(DPCLKA_CFGCR0);
> > > +		val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> > > +			 DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));
> > > +		I915_WRITE(DPCLKA_CFGCR0, val);
> > > +	} else if
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
> On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> > One of the steps for PLL (un)initialization is to (un)map
> > the correspondent DDI that is actually using that PLL.
> >
> > So, let's do this step following the places already established
> > and used so far, although spec put this as part of PLL
> > initialization sequences.
> >
> > v2: Use proper prefix on bits names as suggested by Ander.
> > v3: Add missed "~". Without that the logic was inverted
> >     so we were disabling interrupts.
> >     Credits-to: Clinton
> >     Credits-to: Art
> > v4: Spec is getting updated to do DDI -> PLL mapping
> >     and clock on in 2 separated reg writes. (Paulo)
> >     Also update bits definitions to use space
> >     (1 << 1) instead of (1<<1). (Paulo)
> >
> > Cc: Paulo Zanoni
> > Cc: Art Runyan
> > Cc: Clint Taylor
> > Cc: Ville Syrjälä
> > Cc: Kahola, Mika
> > Cc: Ander Conselvan De Oliveira
> > Signed-off-by: Rodrigo Vivi
> > Reviewed-by: Kahola, Mika
> > Signed-off-by: Rodrigo Vivi
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h  |  9 +
> >  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
> >  2 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 3cfc65f..dcb8e21 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -8150,6 +8150,15 @@ enum {
> >  #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, _DPLL2_CFGCR1)
> >  #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, _DPLL2_CFGCR2)
> >
> > +/*
> > + * CNL Clocks
> > + */
> > +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
> > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
> > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2))
> > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
> > +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2))
> > +
> >  /* BXT display engine PLL */
> >  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
> >  #define BXT_DE_PLL_RATIO(x) (x) /* {60,65,100} * 19.2MHz */
> > diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> > index 0914ad9..2a901bf 100644
> > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
> >  {
> >  	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
> >  	enum port port = intel_ddi_get_encoder_port(encoder);
> > +	uint32_t val;
> >
> >  	if (WARN_ON(!pll))
> >  		return;
> >
> > -	if (IS_GEN9_BC(dev_priv)) {
> > -		uint32_t val;
> > +	if (IS_CANNONLAKE(dev_priv)) {
> > +		/* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
> > +		val = I915_READ(DPCLKA_CFGCR0);
> > +		val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> > +		I915_WRITE(DPCLKA_CFGCR0, val);
>
> A question to the Atomic Lords: don't we need some sort of locking
> around this register since it's used by all ports/clocks? I suppose
> dev_priv->dpll_lock would do...
>
> Maybe the same would apply for gen9_bc.

If there are modesets happening in parallel for different crtcs, then some
locking is needed. dpll_lock seems like the right call, that's what's used to
avoid the same problem with the enable/disable hooks.

Btw, I think this patch shows why something like [1] might be a good idea.

[1] https://patchwork.freedesktop.org/patch/113598/

> >
> > +		/*
> > +		 * Configure DPCLKA_CFGCR0 to turn on the clock for the DDI.
> > +		 * This step and the step before must be done with separate
> > +		 * register writes.
> > +		 */
> > +		val = I915_READ(DPCLKA_CFGCR0);
> > +		val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> > +			 DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));
> > +		I915_WRITE(DPCLKA_CFGCR0, val);
> > +	} else if (IS_GEN9_BC(dev_priv)) {
> >  		/* DDI -> PLL mapping */
> >  		val = I915_READ(DPLL_CTRL2);
> >
> > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct intel_encoder *intel_encoder,
> >  	if (dig_port)
> >  		intel_display_power_put(dev_priv, dig_port->ddi_io_power_domain);
> >
> > -	if (IS_GEN9_BC(dev_priv))
> > +	if (IS_CANNONLAKE(dev_priv))
> > +		I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) |
> > +
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
> Chris Wilson writes:
>
> > Typically, there is space available within the ring and if not we have
> > to wait (by definition a slow path). Rearrange the code to reduce the
> > number of branches and stack size for the hotpath, accommodating a slight
> > growth for the wait.
> >
> > v2: Fix the new assert that packets are not larger than the actual ring.
> >
> > Signed-off-by: Chris Wilson
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 +
> >  1 file changed, 33 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index c46e5439d379..53123c1cfcc5 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct drm_i915_gem_request *request)
> >  	return 0;
> >  }
> >
> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >  {
> >  	struct intel_ring *ring = req->ring;
> >  	struct drm_i915_gem_request *target;
> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> >  {
> >  	struct intel_ring *ring = req->ring;
> > -	int remain_actual = ring->size - ring->emit;
> > -	int remain_usable = ring->effective_size - ring->emit;
> > -	int bytes = num_dwords * sizeof(u32);
> > -	int total_bytes, wait_bytes;
> > -	bool need_wrap = false;
> > +	const unsigned int remain_usable = ring->effective_size - ring->emit;
> > +	const unsigned int bytes = num_dwords * sizeof(u32);
> > +	unsigned int need_wrap = 0;
> > +	unsigned int total_bytes;
> >  	u32 *cs;
> >
> >  	total_bytes = bytes + req->reserved_space;
> > +	GEM_BUG_ON(total_bytes > ring->effective_size);
> >
> > -	if (unlikely(bytes > remain_usable)) {
> > -		/*
> > -		 * Not enough space for the basic request. So need to flush
> > -		 * out the remainder and then wait for base + reserved.
> > -		 */
> > -		wait_bytes = remain_actual + total_bytes;
> > -		need_wrap = true;
> > -	} else if (unlikely(total_bytes > remain_usable)) {
> > -		/*
> > -		 * The base request will fit but the reserved space
> > -		 * falls off the end. So we don't need an immediate wrap
> > -		 * and only need to effectively wait for the reserved
> > -		 * size space from the start of ringbuffer.
> > -		 */
> > -		wait_bytes = remain_actual + req->reserved_space;
> > -	} else {
> > -		/* No wrapping required, just waiting. */
> > -		wait_bytes = total_bytes;
> > +	if (unlikely(total_bytes > remain_usable)) {
> > +		const int remain_actual = ring->size - ring->emit;
> > +
> > +		if (bytes > remain_usable) {
> > +			/*
> > +			 * Not enough space for the basic request. So need to
> > +			 * flush out the remainder and then wait for
> > +			 * base + reserved.
> > +			 */
> > +			total_bytes += remain_actual;
> > +			need_wrap = remain_actual | 1;
>
> Your remain_actual should never reach zero. So in here
> forcing the lowest bit on, and later off, seems superfluous.

Why can't we fill up to the last byte with commands? remain_actual is
just (size - tail) and we don't force a wrap until emit crosses the
boundary (and not before). We hit remain_actual == 0 in practice.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
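The exchange above turns on how need_wrap doubles as both a flag and a byte count. The following is a minimal standalone sketch of that encoding, not the driver code: struct ring_model, struct ring_plan, plan_begin() and pad_bytes() are hypothetical names invented for illustration, and the sketch assumes that all offsets and sizes are multiples of 4 bytes (dwords), so the lowest bit is free to carry the flag.

```c
#include <assert.h>

/* Hypothetical stand-in for the few intel_ring fields the patch touches. */
struct ring_model {
	unsigned int size;           /* total bytes in the ring */
	unsigned int effective_size; /* bytes usable for commands */
	unsigned int emit;           /* current write offset (assumed dword aligned) */
};

/* Hypothetical result type: what the caller must wait for and pad. */
struct ring_plan {
	unsigned int wait_bytes; /* space that must be free before emitting */
	unsigned int need_wrap;  /* 0 = no wrap, otherwise (bytes to pad) | 1 */
};

/* Mirrors the branch structure of the patched intel_ring_begin(). */
static struct ring_plan plan_begin(const struct ring_model *ring,
				   unsigned int bytes, unsigned int reserved)
{
	const unsigned int remain_usable = ring->effective_size - ring->emit;
	struct ring_plan plan = { bytes + reserved, 0 };

	if (plan.wait_bytes > remain_usable) {
		const unsigned int remain_actual = ring->size - ring->emit;

		if (bytes > remain_usable) {
			/* The request itself does not fit: pad the tail and wrap. */
			plan.wait_bytes += remain_actual;
			/* Bit 0 keeps the flag non-zero even if remain_actual == 0. */
			plan.need_wrap = remain_actual | 1;
		} else {
			/* Only the reserved space falls off the end: no wrap yet. */
			plan.wait_bytes = reserved + remain_actual;
		}
	}

	return plan;
}

/* Strip the flag bit to recover the number of bytes to fill with no-ops. */
static unsigned int pad_bytes(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}
```

With emit sitting exactly at the end of the ring, remain_actual is 0; a plain "need_wrap = remain_actual" would then read as "no wrap needed", whereas "remain_actual | 1" still requests the wrap and pad_bytes() reports zero bytes of padding.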
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
>
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
>
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
>
> Cc: Rafael J. Wysocki
> Cc: Mika Westerberg
> Cc: Borislav Petkov
> Cc: Dan Williams
> Cc: Amir Goldstein
> Cc: Jarkko Sakkinen
> Cc: Jani Nikula
> Cc: Ben Skeggs
> Cc: Benjamin Tissoires
> Cc: Joerg Roedel
> Cc: Adrian Hunter
> Cc: Yisen Zhuang
> Cc: Bjorn Helgaas
> Cc: Zhang Rui
> Cc: Felipe Balbi
> Cc: Mathias Nyman
> Cc: Heikki Krogerus
> Cc: Liam Girdwood
> Cc: Mark Brown
> Signed-off-by: Andy Shevchenko

OK by me, FWIW:

Reviewed-by: Heikki Krogerus

Thanks,

-- 
heikki
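The remark about memcmp() on binary data can be illustrated with a small sketch. This is not the kernel code: struct uuid16 is a hypothetical stand-in for a 16-byte UUID type, both helper names are invented here, and the string-based variant is only an assumed example of the kind of comparison that goes wrong, since a raw UUID may legitimately contain zero bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical 16-byte binary UUID, used only for this illustration. */
struct uuid16 {
	unsigned char b[16];
};

/* Correct for binary data: always compares all 16 bytes. */
static bool uuid16_equal(const struct uuid16 *a, const struct uuid16 *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/*
 * Deliberately wrong: string comparison stops at the first zero byte,
 * so differences after an embedded 0x00 go unnoticed.
 */
static bool uuid16_equal_as_string(const struct uuid16 *a,
				   const struct uuid16 *b)
{
	return strncmp((const char *)a->b, (const char *)b->b,
		       sizeof(a->b)) == 0;
}
```

Two UUIDs that share a zero byte early on and differ only in a later byte compare as different with uuid16_equal(), but as equal with the string-based helper.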
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
== Series Details ==

Series: drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
URL   : https://patchwork.freedesktop.org/series/23955/
State : success

== Summary ==

Series 23955v1 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
https://patchwork.freedesktop.org/api/1.0/series/23955/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125
Test vgem_basic:
        Subgroup sysfs:
                incomplete -> PASS       (fi-snb-2600) fdo#100125

https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:426s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:548s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:479s
fi-elk-e7500     total:278  pass:221  dwarn:0   dfail:0   fail:0   skip:57  time:402s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:407s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:489s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:453s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:565s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:453s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:569s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:459s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:494s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:528s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:400s
fi-bsw-n3050 failed to collect. IGT log at Patchwork_4621/fi-bsw-n3050/igt.log

93dcb17f41bd2025c355f4e2aded42c0fc5a5c5d drm-tip: 2017y-05m-04d-10h-58m-24s UTC integration manifest
03c0c51 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4621/
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Chris Wilson writes:

> Typically, there is space available within the ring and if not we have
> to wait (by definition a slow path). Rearrange the code to reduce the
> number of branches and stack size for the hotpath, accommodating a slight
> growth for the wait.
>
> v2: Fix the new assert that packets are not larger than the actual ring.
>
> Signed-off-by: Chris Wilson
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 +
>  1 file changed, 33 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index c46e5439d379..53123c1cfcc5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct drm_i915_gem_request *request)
>  	return 0;
>  }
>
> -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> +static noinline int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>  {
>  	struct intel_ring *ring = req->ring;
>  	struct drm_i915_gem_request *target;
> @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  {
>  	struct intel_ring *ring = req->ring;
> -	int remain_actual = ring->size - ring->emit;
> -	int remain_usable = ring->effective_size - ring->emit;
> -	int bytes = num_dwords * sizeof(u32);
> -	int total_bytes, wait_bytes;
> -	bool need_wrap = false;
> +	const unsigned int remain_usable = ring->effective_size - ring->emit;
> +	const unsigned int bytes = num_dwords * sizeof(u32);
> +	unsigned int need_wrap = 0;
> +	unsigned int total_bytes;
>  	u32 *cs;
>
>  	total_bytes = bytes + req->reserved_space;
> +	GEM_BUG_ON(total_bytes > ring->effective_size);
>
> -	if (unlikely(bytes > remain_usable)) {
> -		/*
> -		 * Not enough space for the basic request. So need to flush
> -		 * out the remainder and then wait for base + reserved.
> -		 */
> -		wait_bytes = remain_actual + total_bytes;
> -		need_wrap = true;
> -	} else if (unlikely(total_bytes > remain_usable)) {
> -		/*
> -		 * The base request will fit but the reserved space
> -		 * falls off the end. So we don't need an immediate wrap
> -		 * and only need to effectively wait for the reserved
> -		 * size space from the start of ringbuffer.
> -		 */
> -		wait_bytes = remain_actual + req->reserved_space;
> -	} else {
> -		/* No wrapping required, just waiting. */
> -		wait_bytes = total_bytes;
> +	if (unlikely(total_bytes > remain_usable)) {
> +		const int remain_actual = ring->size - ring->emit;
> +
> +		if (bytes > remain_usable) {
> +			/*
> +			 * Not enough space for the basic request. So need to
> +			 * flush out the remainder and then wait for
> +			 * base + reserved.
> +			 */
> +			total_bytes += remain_actual;
> +			need_wrap = remain_actual | 1;

Your remain_actual should never reach zero. So in here
forcing the lowest bit on, and later off, seems superfluous.

-Mika

> +		} else {
> +			/*
> +			 * The base request will fit but the reserved space
> +			 * falls off the end. So we don't need an immediate
> +			 * wrap and only need to effectively wait for the
> +			 * reserved size from the start of ringbuffer.
> +			 */
> +			total_bytes = req->reserved_space + remain_actual;
> +		}
>  	}
>
> -	if (wait_bytes > ring->space) {
> -		int ret = wait_for_space(req, wait_bytes);
> +	if (unlikely(total_bytes > ring->space)) {
> +		int ret = wait_for_space(req, total_bytes);
>  		if (unlikely(ret))
>  			return ERR_PTR(ret);
>  	}
>
>  	if (unlikely(need_wrap)) {
> -		GEM_BUG_ON(remain_actual > ring->space);
> -		GEM_BUG_ON(ring->emit + remain_actual > ring->size);
> +		need_wrap &= ~1;
> +		GEM_BUG_ON(need_wrap > ring->space);
> +		GEM_BUG_ON(ring->emit + need_wrap > ring->size);
>
>  		/* Fill the tail with MI_NOOP */
> -		memset(ring->vaddr + ring->emit, 0, remain_actual);
> +		memset(ring->vaddr + ring->emit, 0, need_wrap);
>  		ring->emit = 0;
> -		ring->space -= remain_actual;
> +		ring->space -= need_wrap;
>  	}
>
>  	GEM_BUG_ON(ring->emit > ring->size - bytes);
> --
>
[Intel-gfx] [PATCH] drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
Replace the large comment about requiring the powerwell for
intel_uncore_arm_unclaimed_mmio_detection() by moving the arming of the
mmio error detection into the powerwell held for modesetting. Thereby
also accomplishing the goal of only arming the mmio detection after a
full modeset.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Cc: Daniel Vetter
Cc: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_display.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..14e12e46eda5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12912,8 +12912,16 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_hw_done(state);
 
-	if (intel_state->modeset)
+	if (intel_state->modeset) {
+		/* As one of the primary mmio accessors, KMS has a high
+		 * likelihood of triggering bugs in unclaimed access. After we
+		 * finish modesetting, see if an error has been flagged, and if
+		 * so enable debugging for the next modeset - and hope we catch
+		 * the culprit.
+		 */
+		intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
 		intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
+	}
 
 	mutex_lock(&dev->struct_mutex);
 	drm_atomic_helper_cleanup_planes(dev, state);
@@ -12923,19 +12931,6 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_state_put(state);
 
-	/* As one of the primary mmio accessors, KMS has a high likelihood
-	 * of triggering bugs in unclaimed access. After we finish
-	 * modesetting, see if an error has been flagged, and if so
-	 * enable debugging for the next modeset - and hope we catch
-	 * the culprit.
-	 *
-	 * XXX note that we assume display power is on at this point.
-	 * This might hold true now but we need to add pm helper to check
-	 * unclaimed only when the hardware is on, as atomic commits
-	 * can happen also when the device is completely off.
-	 */
-	intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
-
 	intel_atomic_helper_free_state(dev_priv);
 }
-- 
2.11.0
[Intel-gfx] [RFC 6/7] drm/i915: Kill off intel_crtc_active.
Use crtc->active directly instead. This is still not completely optimal
and needs fixing, but it's about as good as using intel_crtc_active.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/intel_display.c | 19 ---
 drivers/gpu/drm/i915/intel_drv.h     |  1 -
 drivers/gpu/drm/i915/intel_fbc.c     |  2 +-
 drivers/gpu/drm/i915/intel_pm.c      |  6 +++---
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c7d295a0895d..8538c0246015 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -948,25 +948,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state *crtc_state, int target_clock,
 				  target_clock, refclk, NULL, best_clock);
 }
 
-bool intel_crtc_active(struct intel_crtc *crtc)
-{
-	/* Be paranoid as we can arrive here with only partial
-	 * state retrieved from the hardware during setup.
-	 *
-	 * We can ditch the adjusted_mode.crtc_clock check as soon
-	 * as Haswell has gained clock readout/fastboot support.
-	 *
-	 * We can ditch the crtc->primary->fb check as soon as we can
-	 * properly reconstruct framebuffers.
-	 *
-	 * FIXME: The intel_crtc->active here should be switched to
-	 * crtc->state->active once we have proper CRTC states wired up
-	 * for atomic.
-	 */
-	return crtc->active && crtc->base.primary->state->fb &&
-		crtc->config->base.adjusted_mode.crtc_clock;
-}
-
 enum transcoder intel_pipe_to_cpu_transcoder(struct drm_i915_private *dev_priv,
 					     enum pipe pipe)
 {
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 62f690c7691e..dbe33b7bcf67 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1490,7 +1490,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state *crtc_state, int target_clock,
 			struct dpll *best_clock);
 int chv_calc_dpll_params(int refclk, struct dpll *pll_clock);
 
-bool intel_crtc_active(struct intel_crtc *crtc);
 void hsw_enable_ips(struct intel_crtc *crtc);
 void hsw_disable_ips(struct intel_crtc *crtc);
 enum intel_display_power_domain intel_port_to_power_domain(enum port port);
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index ded2add18b26..a93214d0388e 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -1282,7 +1282,7 @@ void intel_fbc_init_pipe_state(struct drm_i915_private *dev_priv)
 		return;
 
 	for_each_intel_crtc(&dev_priv->drm, crtc)
-		if (intel_crtc_active(crtc) &&
+		if (crtc->base.state->active &&
 		    crtc->base.primary->state->visible)
 			dev_priv->fbc.visible_pipes_mask |= (1 << crtc->pipe);
 }
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 969eb11ed5cd..bf2127a3f730 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -814,7 +814,7 @@ static struct intel_crtc *single_enabled_crtc(struct drm_i915_private *dev_priv)
 	struct intel_crtc *crtc, *enabled = NULL;
 
 	for_each_intel_crtc(&dev_priv->drm, crtc) {
-		if (intel_crtc_active(crtc)) {
+		if (crtc->active) {
 			if (enabled)
 				return NULL;
 			enabled = crtc;
@@ -2486,11 +2486,11 @@ static void i9xx_program_watermarks(struct drm_i915_private *dev_priv)
 
 	crtc = intel_get_crtc_for_plane(dev_priv, 0);
 	planea_wm = crtc->wm.active.i9xx.plane_wm;
-	if (intel_crtc_active(crtc))
+	if (crtc->active)
 		enabled = crtc;
 
 	crtc = intel_get_crtc_for_plane(dev_priv, 1);
-	if (intel_crtc_active(crtc)) {
+	if (crtc->active) {
 		if (enabled == NULL)
 			enabled = crtc;
 		else
-- 
2.9.3
[Intel-gfx] [RFC 5/7] drm/i915: Program gen4 watermarks atomically
We're already calculating the watermarks correctly, now we have to
program them too.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/intel_pm.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c5bdef6281f3..969eb11ed5cd 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2268,20 +2268,20 @@ static int i965_compute_pipe_wm(struct intel_crtc_state *crtc_state)
 	return 0;
 }
 
-static void i965_update_wm(struct intel_crtc *crtc)
+static void i965_program_watermarks(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	struct intel_crtc *crtc;
+	struct i9xx_wm_state *wm_state = NULL;
 	int srwm = 1;
 	int cursor_sr = 16;
 	bool cxsr_enabled = false;
 
-	crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal;
-
-	/* Calc sr entries for one plane configs */
 	crtc = single_enabled_crtc(dev_priv);
-	if (crtc && crtc->wm.active.i9xx.cxsr) {
-		struct i9xx_wm_state *wm_state = &crtc->wm.active.i9xx;
+	if (crtc)
+		wm_state = &crtc->wm.active.i9xx;
 
+	/* Calc sr entries for one plane configs */
+	if (wm_state && wm_state->cxsr) {
 		srwm = wm_state->sr.plane;
 		cursor_sr = wm_state->sr.cursor;
 
@@ -2571,8 +2571,10 @@ static void i9xx_initial_watermarks(struct intel_atomic_state *state,
 		pnv_program_watermarks(dev_priv);
 	else if (INTEL_INFO(dev_priv)->num_pipes == 1)
 		i845_program_watermarks(intel_crtc);
-	else
+	else if (INTEL_GEN(dev_priv) < 4)
 		i9xx_program_watermarks(dev_priv);
+	else
+		i965_program_watermarks(dev_priv);
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
@@ -2591,8 +2593,10 @@ static void i9xx_optimize_watermarks(struct intel_atomic_state *state,
 		pnv_program_watermarks(dev_priv);
 	else if (INTEL_INFO(dev_priv)->num_pipes == 1)
 		i845_program_watermarks(intel_crtc);
-	else
+	else if (INTEL_GEN(dev_priv) < 4)
 		i9xx_program_watermarks(dev_priv);
+	else
+		i965_program_watermarks(dev_priv);
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
@@ -8911,7 +8915,8 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
 		}
 	} else if (IS_GEN4(dev_priv)) {
 		dev_priv->display.compute_pipe_wm = i965_compute_pipe_wm;
-		dev_priv->display.update_wm = i965_update_wm;
+		dev_priv->display.initial_watermarks = i9xx_initial_watermarks;
+		dev_priv->display.optimize_watermarks = i9xx_optimize_watermarks;
 	} else if (IS_GEN3(dev_priv)) {
 		dev_priv->display.compute_pipe_wm = i9xx_compute_pipe_wm;
 		dev_priv->display.compute_intermediate_wm = i9xx_compute_intermediate_wm;
-- 
2.9.3
[Intel-gfx] [RFC 7/7] drm/i915: Rip out legacy watermark infrastructure
The legacy watermark infrastructure is now unused, so remove it. Signed-off-by: Maarten Lankhorst--- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/intel_atomic.c | 2 - drivers/gpu/drm/i915/intel_display.c | 75 ++-- drivers/gpu/drm/i915/intel_drv.h | 2 - drivers/gpu/drm/i915/intel_pm.c | 42 5 files changed, 3 insertions(+), 119 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7af4f908b2cd..46b317c991f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -637,7 +637,6 @@ struct drm_i915_display_funcs { void (*optimize_watermarks)(struct intel_atomic_state *state, struct intel_crtc_state *cstate); int (*compute_global_watermarks)(struct drm_atomic_state *state); - void (*update_wm)(struct intel_crtc *crtc); int (*modeset_calc_cdclk)(struct drm_atomic_state *state); /* Returns the active state of the crtc, and if the crtc is active, * fills out the pipe-config with the hw state. */ diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c index 87b1dd464eee..7a4acaa45edd 100644 --- a/drivers/gpu/drm/i915/intel_atomic.c +++ b/drivers/gpu/drm/i915/intel_atomic.c @@ -173,8 +173,6 @@ intel_crtc_duplicate_state(struct drm_crtc *crtc) crtc_state->update_pipe = false; crtc_state->disable_lp_wm = false; crtc_state->disable_cxsr = false; - crtc_state->update_wm_pre = false; - crtc_state->update_wm_post = false; crtc_state->fb_changed = false; crtc_state->fifo_changed = false; crtc_state->wm.need_postvbl_update = false; diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 8538c0246015..295e17d0f272 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -4958,9 +4958,6 @@ static void intel_post_plane_update(struct intel_crtc_state *old_crtc_state) intel_frontbuffer_flip(to_i915(crtc->base.dev), pipe_config->fb_bits); - if (pipe_config->update_wm_post && 
pipe_config->base.active) - intel_update_watermarks(crtc); - if (old_pri_state) { struct intel_plane_state *primary_state = to_intel_plane_state(primary->state); @@ -5050,8 +5047,6 @@ static void intel_pre_plane_update(struct intel_crtc_state *old_crtc_state, if (dev_priv->display.initial_watermarks != NULL) dev_priv->display.initial_watermarks(old_intel_state, pipe_config); - else if (pipe_config->update_wm_pre) - intel_update_watermarks(crtc); } static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned plane_mask) @@ -5737,8 +5732,6 @@ static void i9xx_crtc_enable(struct intel_crtc_state *pipe_config, if (dev_priv->display.initial_watermarks != NULL) dev_priv->display.initial_watermarks(old_intel_state, intel_crtc->config); - else - intel_update_watermarks(intel_crtc); intel_enable_pipe(intel_crtc); assert_vblank_disabled(crtc); @@ -5802,9 +5795,6 @@ static void i9xx_crtc_disable(struct intel_crtc_state *old_crtc_state, if (!IS_GEN2(dev_priv)) intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false); - - if (!dev_priv->display.initial_watermarks) - intel_update_watermarks(intel_crtc); } static void intel_crtc_disable_noatomic(struct drm_crtc *crtc) @@ -5863,7 +5853,6 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc) encoder->base.crtc = NULL; intel_fbc_disable(intel_crtc); - intel_update_watermarks(intel_crtc); intel_disable_shared_dpll(intel_crtc); domains = intel_crtc->enabled_power_domains; @@ -10738,40 +10727,6 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, } -/** - * intel_wm_need_update - Check whether watermarks need updating - * @plane: drm plane - * @state: new plane state - * - * Check current plane state versus the new one to determine whether - * watermarks need to be recalculated. - * - * Returns true or false. 
- */ -static bool intel_wm_need_update(struct drm_plane *plane, -struct drm_plane_state *state) -{ - struct intel_plane_state *new = to_intel_plane_state(state); - struct intel_plane_state *cur = to_intel_plane_state(plane->state); - - /* Update watermarks on tiling or size changes. */ - if (new->base.visible != cur->base.visible) - return true; - - if (!cur->base.fb || !new->base.fb) - return false; - - if (cur->base.fb->modifier !=
[Intel-gfx] [RFC 3/7] drm/i915: Convert pineview watermarks to atomic
Pineview seems to have different watermarks from the other platforms and are calculated separately. Signed-off-by: Maarten Lankhorst--- drivers/gpu/drm/i915/intel_drv.h | 3 +- drivers/gpu/drm/i915/intel_pm.c | 134 ++- 2 files changed, 92 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 73e74fc7383c..62f690c7691e 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -552,7 +552,8 @@ struct i9xx_wm_state { struct { uint16_t plane; - } sr; + uint16_t cursor; + } sr, hpll; }; struct intel_crtc_wm_state { diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index c39f63aff4a5..eb1bb8b3f9a6 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -824,13 +824,17 @@ static struct intel_crtc *single_enabled_crtc(struct drm_i915_private *dev_priv) return enabled; } -static void pineview_update_wm(struct intel_crtc *unused_crtc) +static int pnv_compute_pipe_wm(struct intel_crtc_state *crtc_state) { - struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev); - struct intel_crtc *crtc; + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct i9xx_wm_state *wm_state = _state->wm.i9xx.optimal; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); + struct intel_atomic_state *state = to_intel_atomic_state(crtc_state->base.state); + const struct drm_plane_state *primary_plane_state = NULL; const struct cxsr_latency *latency; - u32 reg; - unsigned int wm; + + memset(wm_state, 0, sizeof(*wm_state)); latency = intel_get_cxsr_latency(IS_PINEVIEW_G(dev_priv), dev_priv->is_ddr3, @@ -838,60 +842,90 @@ static void pineview_update_wm(struct intel_crtc *unused_crtc) dev_priv->mem_freq); if (!latency) { DRM_DEBUG_KMS("Unknown FSB/MEM found, disable CxSR\n"); - intel_set_memory_cxsr(dev_priv, false); - return; + + return 0; } - crtc = 
single_enabled_crtc(dev_priv); - if (crtc) { - const struct drm_display_mode *adjusted_mode = - >config->base.adjusted_mode; + if (crtc_state->base.plane_mask & BIT(drm_plane_index(>base))) + primary_plane_state = __drm_atomic_get_current_plane_state(>base, >base); + + if (primary_plane_state) { const struct drm_framebuffer *fb = - crtc->base.primary->state->fb; + primary_plane_state->fb; int cpp = fb->format->cpp[0]; - int clock = adjusted_mode->crtc_clock; + const struct drm_display_mode *adjusted_mode = + _state->base.adjusted_mode; + unsigned active_crtcs; + + if (state->modeset) + active_crtcs = state->active_crtcs; + else + active_crtcs = dev_priv->active_crtcs; + + wm_state->cxsr = active_crtcs == drm_crtc_mask(>base); + + wm_state->sr.plane = intel_calculate_wm(adjusted_mode->crtc_clock, + _display_wm, + pineview_display_wm.fifo_size, + cpp, latency->display_sr); + + wm_state->sr.cursor = intel_calculate_wm(adjusted_mode->crtc_clock, +_cursor_wm, + pineview_display_wm.fifo_size, +4, latency->cursor_sr); + + wm_state->hpll.plane = intel_calculate_wm(adjusted_mode->crtc_clock, + _display_hplloff_wm, + pineview_display_hplloff_wm.fifo_size, +cpp, latency->display_hpll_disable); + + wm_state->hpll.cursor = intel_calculate_wm(adjusted_mode->crtc_clock, + _cursor_hplloff_wm, + pineview_display_hplloff_wm.fifo_size, + 4, latency->cursor_hpll_disable); + + DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane %d, cursor SR size: %d\n", +
[Intel-gfx] [RFC 4/7] drm/i915: Calculate gen4 watermarks semiatomically.
Gen4 watermark is handled same as gen3-. Calculate the optimal watermarks atomically first, and program it in the legacy helper. Signed-off-by: Maarten Lankhorst--- drivers/gpu/drm/i915/intel_pm.c | 136 1 file changed, 95 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index eb1bb8b3f9a6..c5bdef6281f3 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2189,58 +2189,109 @@ static void vlv_optimize_watermarks(struct intel_atomic_state *state, mutex_unlock(_priv->wm.wm_mutex); } -static void i965_update_wm(struct intel_crtc *unused_crtc) +static int i965_compute_pipe_wm(struct intel_crtc_state *crtc_state) { - struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev); - struct intel_crtc *crtc; + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct intel_atomic_state *state = + to_intel_atomic_state(crtc_state->base.state); + struct i9xx_wm_state *wm_state = _state->wm.i9xx.optimal; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); + const struct drm_plane_state *primary_plane_state = NULL; + const struct drm_plane_state *cursor_plane_state = NULL; + + memset(wm_state, 0, sizeof(*wm_state)); + + if (crtc_state->base.plane_mask & BIT(drm_plane_index(>base))) + primary_plane_state = __drm_atomic_get_current_plane_state(>base, >base); + + if (crtc_state->base.plane_mask & BIT(drm_plane_index(crtc->base.cursor))) + cursor_plane_state = __drm_atomic_get_current_plane_state(>base, crtc->base.cursor); + + if (primary_plane_state) { + static const int sr_latency_ns = 12000; + const struct drm_display_mode *adjusted_mode = + _state->base.adjusted_mode; + unsigned active_crtcs; + unsigned long entries; + bool may_cxsr; + + if (state->modeset) + active_crtcs = state->active_crtcs; + else + active_crtcs = dev_priv->active_crtcs; + + may_cxsr = active_crtcs == 
drm_crtc_mask(>base); + + if (may_cxsr && intel_wm_plane_visible(crtc_state, to_intel_plane_state(primary_plane_state))) { + struct drm_framebuffer *fb = primary_plane_state->fb; + unsigned cpp = fb->format->cpp[0]; + + entries = intel_wm_method2(adjusted_mode->crtc_clock, + adjusted_mode->crtc_htotal, + crtc_state->pipe_src_w, cpp, + sr_latency_ns / 100); + entries = DIV_ROUND_UP(entries, I915_FIFO_LINE_SIZE); + if (entries < I965_FIFO_SIZE) + wm_state->sr.plane = I965_FIFO_SIZE - entries; + else + may_cxsr = false; + + DRM_DEBUG_KMS("self-refresh entries: %ld\n", entries); + } + + /* No need to use intel_wm_plane_visible here, since cursor. */ + if (may_cxsr && cursor_plane_state && crtc_state->base.active) { + entries = intel_wm_method2(adjusted_mode->crtc_clock, + adjusted_mode->crtc_htotal, + cursor_plane_state->crtc_w, 4, + sr_latency_ns / 100); + + entries = DIV_ROUND_UP(entries, + i965_cursor_wm_info.cacheline_size) + + i965_cursor_wm_info.guard_size; + + if (entries < i965_cursor_wm_info.fifo_size) + wm_state->sr.cursor = min(i965_cursor_wm_info.fifo_size - entries, + (unsigned long)(i965_cursor_wm_info.max_wm)); + else + may_cxsr = false; + } else if (may_cxsr) + wm_state->sr.cursor = 16; + + wm_state->cxsr = may_cxsr; + + DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane %d, cursor SR size: %d\n", + yesno(wm_state->cxsr), wm_state->sr.plane, wm_state->sr.cursor); + } + + return 0; +} + +static void i965_update_wm(struct intel_crtc *crtc) +{ + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); int srwm = 1; int cursor_sr = 16; - bool cxsr_enabled; + bool cxsr_enabled = false;
[Intel-gfx] [RFC 2/7] drm/i915: Program gen3- watermarks atomically
With the atomic watermark calculations in place, calculate intermediate
watermark values and update the watermarks atomically.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/i915_drv.h  |   5 ++
 drivers/gpu/drm/i915/intel_drv.h |   2 +-
 drivers/gpu/drm/i915/intel_pm.c  | 103 +--
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 91b945cd39f9..7af4f908b2cd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1793,6 +1793,10 @@ struct g4x_wm_values {
 	bool fbc_en;
 };
 
+struct i9xx_wm_values {
+	bool cxsr;
+};
+
 struct skl_ddb_entry {
 	uint16_t start, end;	/* in number of blocks, 'end' is exclusive */
 };
@@ -2422,6 +2426,7 @@ struct drm_i915_private {
 			struct skl_wm_values skl_hw;
 			struct vlv_wm_values vlv;
 			struct g4x_wm_values g4x;
+			struct i9xx_wm_values i9xx;
 		};
 
 		uint8_t max_level;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d9e49f2b3c22..73e74fc7383c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -600,7 +600,7 @@ struct intel_crtc_wm_state {
 			struct g4x_wm_state optimal;
 		} g4x;
 		struct {
-			struct i9xx_wm_state optimal;
+			struct i9xx_wm_state optimal, intermediate;
 		} i9xx;
 	};
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0c933cfad02c..c39f63aff4a5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -433,6 +433,8 @@ bool intel_set_memory_cxsr(struct drm_i915_private *dev_priv, bool enable)
 		dev_priv->wm.vlv.cxsr = enable;
 	else if (IS_G4X(dev_priv))
 		dev_priv->wm.g4x.cxsr = enable;
+	else if (INTEL_GEN(dev_priv) <= 4)
+		dev_priv->wm.i9xx.cxsr = enable;
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 
 	return ret;
@@ -2317,6 +2319,44 @@ static int i9xx_compute_pipe_wm(struct intel_crtc_state *crtc_state)
 	return 0;
 }
 
+static int i9xx_compute_intermediate_wm(struct drm_device *dev,
+					struct intel_crtc *intel_crtc,
+					struct intel_crtc_state *newstate)
+{
+	struct i9xx_wm_state *intermediate = &newstate->wm.i9xx.intermediate;
+	const struct drm_crtc_state *old_drm_state =
+		drm_atomic_get_old_crtc_state(newstate->base.state, &intel_crtc->base);
+	const struct i9xx_wm_state *old = &to_intel_crtc_state(old_drm_state)->wm.i9xx.optimal;
+	const struct i9xx_wm_state *optimal = &newstate->wm.i9xx.optimal;
+
+	/*
+	 * Start with the final, target watermarks, then combine with the
+	 * currently active watermarks to get values that are safe both before
+	 * and after the vblank.
+	 */
+	*intermediate = *optimal;
+	if (newstate->disable_cxsr)
+		intermediate->cxsr = false;
+
+	if (!newstate->base.active ||
+	    drm_atomic_crtc_needs_modeset(&newstate->base))
+		goto out;
+
+	intermediate->plane_wm = min(old->plane_wm, optimal->plane_wm);
+	intermediate->sr.plane = min(old->sr.plane, optimal->sr.plane);
+
+out:
+	/*
+	 * If our intermediate WM are identical to the final WM, then we can
+	 * omit the post-vblank programming; only update if it's different.
+	 */
+	if (newstate->base.active &&
+	    memcmp(intermediate, optimal, sizeof(*intermediate)) != 0)
+		newstate->wm.need_postvbl_update = true;
+
+	return 0;
+}
+
 void i9xx_wm_get_hw_state(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -2345,17 +2385,15 @@ void i9xx_wm_get_hw_state(struct drm_device *dev)
 	}
 }
 
-static void i9xx_update_wm(struct intel_crtc *crtc)
+static void i9xx_program_watermarks(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	struct intel_crtc *crtc;
 	uint32_t fwater_lo;
 	uint32_t fwater_hi;
 	int cwm, srwm = -1;
 	int planea_wm, planeb_wm;
 	struct intel_crtc *enabled = NULL;
 
-	crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal;
-
 	crtc = intel_get_crtc_for_plane(dev_priv, 0);
 	planea_wm = crtc->wm.active.i9xx.plane_wm;
 	if (intel_crtc_active(crtc))
@@ -2381,7 +2419,7 @@ static void i9xx_update_wm(struct intel_crtc *crtc)
 		cwm = 2;
 
 	/* Play safe and disable self-refresh before adjusting watermarks. */
-	intel_set_memory_cxsr(dev_priv, false);
+	_intel_set_memory_cxsr(dev_priv, false);
 
 	/* Calc sr entries for one plane configs
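For readers following the intermediate-watermark step in the patch above, here is a minimal standalone sketch of the same idea in plain C. It is an illustration only, not the driver code: `struct wm_state`, `min_u16()` and `compute_intermediate()` are hypothetical names, the CRTC state is reduced to three boolean parameters, and the final comparison is done field by field rather than with `memcmp()` so that struct padding cannot affect the result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the patch's struct i9xx_wm_state. */
struct wm_state {
	uint16_t plane_wm;	/* primary plane watermark */
	uint16_t sr_plane;	/* self-refresh watermark */
	bool cxsr;		/* self-refresh enabled */
};

static uint16_t min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/*
 * Start from the target (optimal) watermarks, then merge in the old ones
 * using the same min() combination as the patch, so that the programmed
 * values are meant to be safe both before and after the vblank.
 *
 * Returns true when the intermediate values differ from the optimal ones,
 * i.e. when a second, post-vblank update would still be required.
 */
static bool compute_intermediate(const struct wm_state *old,
				 const struct wm_state *optimal,
				 bool active, bool needs_modeset,
				 bool disable_cxsr,
				 struct wm_state *intermediate)
{
	*intermediate = *optimal;
	if (disable_cxsr)
		intermediate->cxsr = false;

	/* Only merge with the old state when the pipe stays active. */
	if (active && !needs_modeset) {
		intermediate->plane_wm = min_u16(old->plane_wm, optimal->plane_wm);
		intermediate->sr_plane = min_u16(old->sr_plane, optimal->sr_plane);
	}

	/* Field-wise comparison; the patch itself uses memcmp(). */
	return active &&
	       (intermediate->plane_wm != optimal->plane_wm ||
		intermediate->sr_plane != optimal->sr_plane ||
		intermediate->cxsr != optimal->cxsr);
}
```

With old values of `plane_wm = 10, sr_plane = 4` and optimal values of `plane_wm = 20, sr_plane = 2` on an active pipe without a modeset, the sketch yields an intermediate state of `plane_wm = 10, sr_plane = 2` and reports that a post-vblank update is needed.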
[Intel-gfx] [RFC 0/7] drm/i915: Convert gen4- watermarks to atomic.
I've only compile-tested this, and the series depends on Ville's gen4x
watermark conversion, so CI will fail to apply it.

Maarten Lankhorst (7):
  drm/i915: Calculate gen3- watermarks semi-atomically.
  drm/i915: Program gen3- watermarks atomically
  drm/i915: Convert pineview watermarks to atomic
  drm/i915: Calculate gen4 watermarks semiatomically.
  drm/i915: Program gen4 watermarks atomically
  drm/i915: Kill off intel_crtc_active.
  drm/i915: Rip out legacy watermark infrastructure

 drivers/gpu/drm/i915/i915_drv.h      |   6 +-
 drivers/gpu/drm/i915/intel_atomic.c  |   2 -
 drivers/gpu/drm/i915/intel_display.c |  97 +-
 drivers/gpu/drm/i915/intel_drv.h     |  18 +-
 drivers/gpu/drm/i915/intel_fbc.c     |   2 +-
 drivers/gpu/drm/i915/intel_pm.c      | 635 ++-
 6 files changed, 433 insertions(+), 327 deletions(-)

-- 
2.9.3

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 1/7] drm/i915: Calculate gen3- watermarks semi-atomically.
The gen3 watermark calculations are converted to atomic, but the wm
update calls are still done through the legacy functions.

This will make it easier to bisect things if they go wrong.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/intel_display.c |   3 +-
 drivers/gpu/drm/i915/intel_drv.h     |  14 +++
 drivers/gpu/drm/i915/intel_pm.c      | 231 +--
 3 files changed, 152 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4991ef2ac77d..c7d295a0895d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15518,7 +15518,8 @@ intel_modeset_setup_hw_state(struct drm_device *dev)
 		skl_wm_get_hw_state(dev);
 	} else if (HAS_PCH_SPLIT(dev_priv)) {
 		ilk_wm_get_hw_state(dev);
-	}
+	} else if (INTEL_GEN(dev_priv) <= 3 && !IS_PINEVIEW(dev_priv))
+		i9xx_wm_get_hw_state(dev);
 
 	for_each_intel_crtc(dev, crtc) {
 		u64 put_domains;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ae9173707959..d9e49f2b3c22 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -546,6 +546,15 @@ struct g4x_wm_state {
 	bool fbc_en;
 };
 
+struct i9xx_wm_state {
+	uint16_t plane_wm;
+	bool cxsr;
+
+	struct {
+		uint16_t plane;
+	} sr;
+};
+
 struct intel_crtc_wm_state {
 	union {
 		struct {
@@ -590,6 +599,9 @@ struct intel_crtc_wm_state {
 			/* optimal watermarks */
 			struct g4x_wm_state optimal;
 		} g4x;
+		struct {
+			struct i9xx_wm_state optimal;
+		} i9xx;
 	};
 
 	/*
@@ -828,6 +840,7 @@ struct intel_crtc {
 			struct intel_pipe_wm ilk;
 			struct vlv_wm_state vlv;
 			struct g4x_wm_state g4x;
+			struct i9xx_wm_state i9xx;
 		} active;
 	} wm;
 
@@ -1868,6 +1881,7 @@ void gen6_rps_boost(struct drm_i915_private *dev_priv,
 		    unsigned long submitted);
 void intel_queue_rps_boost_for_request(struct drm_i915_gem_request *req);
 void g4x_wm_get_hw_state(struct drm_device *dev);
+void i9xx_wm_get_hw_state(struct drm_device *dev);
 void vlv_wm_get_hw_state(struct drm_device *dev);
 void ilk_wm_get_hw_state(struct drm_device *dev);
 void skl_wm_get_hw_state(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d2cec3249e87..0c933cfad02c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2226,89 +2226,154 @@ static void i965_update_wm(struct intel_crtc *unused_crtc)
 
 #undef FW_WM
 
-static void i9xx_update_wm(struct intel_crtc *unused_crtc)
+static const struct intel_watermark_params *i9xx_get_wm_info(struct drm_i915_private *dev_priv,
+							     struct intel_crtc *crtc)
 {
-	struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
-	const struct intel_watermark_params *wm_info;
-	uint32_t fwater_lo;
-	uint32_t fwater_hi;
-	int cwm, srwm = 1;
-	int fifo_size;
-	int planea_wm, planeb_wm;
-	struct intel_crtc *crtc, *enabled = NULL;
+	struct intel_plane *plane = to_intel_plane(crtc->base.primary);
 
 	if (IS_I945GM(dev_priv))
-		wm_info = &i945_wm_info;
+		return &i945_wm_info;
 	else if (!IS_GEN2(dev_priv))
-		wm_info = &i915_wm_info;
+		return &i915_wm_info;
+	else if (plane->plane == PLANE_A)
+		return &i830_a_wm_info;
 	else
-		wm_info = &i830_a_wm_info;
+		return &i830_bc_wm_info;
+}
 
-	fifo_size = dev_priv->display.get_fifo_size(dev_priv, 0);
-	crtc = intel_get_crtc_for_plane(dev_priv, 0);
-	if (intel_crtc_active(crtc)) {
+static int i9xx_compute_pipe_wm(struct intel_crtc_state *crtc_state)
+{
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	struct intel_atomic_state *state =
+		to_intel_atomic_state(crtc_state->base.state);
+	struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal;
+	struct intel_plane *plane = to_intel_plane(crtc->base.primary);
+	const struct drm_plane_state *plane_state = NULL;
+	int fifo_size;
+	const struct intel_watermark_params *wm_info;
+
+	fifo_size = dev_priv->display.get_fifo_size(dev_priv, plane->plane);
+
+	wm_info = i9xx_get_wm_info(dev_priv, crtc);
+
+	wm_state->cxsr = false;
+	memset(&wm_state->sr, 0, sizeof(wm_state->sr));
+
+	if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->base)))
+
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 11:59:53AM +0100, Chris Wilson wrote:
> On Thu, May 04, 2017 at 10:09:57AM -, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915: Set all undefined MOCS entries to follow PTE
> > URL   : https://patchwork.freedesktop.org/series/23941/
> > State : success
> >
> > == Summary ==
> >
> > Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
> > https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/
>
> Pushed, thanks for the kick and the review.

Actually, no I didn't. That reply was intended for a different series,
sorry for the scare/noise.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre