Re: [Nouveau] [PATCH v4 1/1] drm: allow limiting the scatter list size.

2020-09-08 Thread Gerd Hoffmann
> > > The comments I've found suggest very much not ... Or is that all very
> > > old stuff only that no one cares about anymore?
> > 
> > I think these days it is possible to override dma_ops per device, which
> > in turn allows virtio to deal with the quirks without the rest of the
> > kernel knowing about these details.
> > 
> > I also think virtio-gpu can drop the virtio_has_dma_quirk() checks, just
> > use the dma api path unconditionally and depend on virtio core having
> > setup dma_ops in a way that it JustWorks[tm].  I'll look into that next.
> 
> The comment above vring_use_dma_api() suggests that this has not yet
> happened, that's why I'm asking.

Hmm, wading through the code, it seems this indeed hasn't happened yet, even though my
testing didn't show any issues.  Probably pure luck because devices and
cpus have the same memory view on x86.  Guess I need to try this on
ppc64 to see it actually failing ...

So dropping the virtio_has_dma_quirk() checks isn't going to fly.

Using dma_max_mapping_size() should be fine though.  It might use a
lower limit than needed for virtio, but it should not break things.
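
A minimal userspace sketch of the clamping idea discussed above, not the
actual patch: the names pick_max_segment(), segments_needed() and
SKETCH_PAGE_SIZE are hypothetical, and dma_limit merely stands in for
whatever dma_max_mapping_size() would report for the device. It illustrates
why a lower-than-needed limit is harmless: a smaller segment size only means
more scatterlist entries, never an invalid mapping.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/*
 * Pick the largest segment size to use. A dma_limit of 0 is treated as
 * "no limit known". The ceiling is the largest page-aligned value that
 * still fits in an unsigned int segment length.
 */
size_t pick_max_segment(size_t dma_limit)
{
	const size_t ceiling =
		(size_t)UINT_MAX & ~((size_t)SKETCH_PAGE_SIZE - 1);

	if (dma_limit == 0 || dma_limit > ceiling)
		return ceiling;
	return dma_limit;
}

/*
 * Number of segments needed for one contiguous buffer of len bytes.
 * max_seg must be non-zero. A smaller max_seg only raises this count.
 */
size_t segments_needed(size_t len, size_t max_seg)
{
	return len / max_seg + (len % max_seg != 0);
}
```

With a 256 KiB limit, for instance, a contiguous 1 MiB buffer is described
by four segments instead of one.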

take care,
  Gerd

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v4 1/1] drm: allow limiting the scatter list size.

2020-09-08 Thread Daniel Vetter
On Tue, Sep 08, 2020 at 07:48:58AM +0200, Gerd Hoffmann wrote:
> On Mon, Sep 07, 2020 at 03:53:02PM +0200, Daniel Vetter wrote:
> > On Mon, Sep 7, 2020 at 1:24 PM Gerd Hoffmann  wrote:
> > >
> > > Add drm_device argument to drm_prime_pages_to_sg(), so we can
> > > call dma_max_mapping_size() to figure the segment size limit
> > > and call into __sg_alloc_table_from_pages() with the correct
> > > limit.
> > >
> > > This fixes virtio-gpu with sev.  Possibly it'll fix other bugs
> > > too given that drm seems to totally ignore segment size limits
> > > so far ...
> > >
> > > v2: place max_segment in drm driver not gem object.
> > > v3: move max_segment next to the other gem fields.
> > > v4: just use dma_max_mapping_size().
> > >
> > > Signed-off-by: Gerd Hoffmann 
> > 
> > Uh, are you sure this works in all cases for virtio?
> 
> Sure, I've tested it ;)
> 
> > The comments I've found suggest very much not ... Or is that all very
> > old stuff only that no one cares about anymore?
> 
> I think these days it is possible to override dma_ops per device, which
> in turn allows virtio to deal with the quirks without the rest of the
> kernel knowing about these details.
> 
> I also think virtio-gpu can drop the virtio_has_dma_quirk() checks, just
> use the dma api path unconditionally and depend on virtio core having
> setup dma_ops in a way that it JustWorks[tm].  I'll look into that next.

The comment above vring_use_dma_api() suggests that this has not yet
happened, that's why I'm asking. If this has happened then I think it'd be
best if you remove that todo entry and update it, as part of the overall
series to add dma_max_mapping_size and remove the quirks.

Otherwise this all is a bit wtf material :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Nouveau] [PATCH v4 1/1] drm: allow limiting the scatter list size.

2020-09-08 Thread Daniel Vetter
On Tue, Sep 08, 2020 at 12:02:53PM +0200, Gerd Hoffmann wrote:
> > > > The comments I've found suggest very much not ... Or is that all very
> > > > old stuff only that no one cares about anymore?
> > > 
> > > I think these days it is possible to override dma_ops per device, which
> > > in turn allows virtio to deal with the quirks without the rest of the
> > > kernel knowing about these details.
> > > 
> > > I also think virtio-gpu can drop the virtio_has_dma_quirk() checks, just
> > > use the dma api path unconditionally and depend on virtio core having
> > > setup dma_ops in a way that it JustWorks[tm].  I'll look into that next.
> > 
> > The comment above vring_use_dma_api() suggests that this has not yet
> > happened, that's why I'm asking.
> 
> Hmm, wading through the code, it seems this indeed hasn't happened yet, even though my
> testing didn't show any issues.  Probably pure luck because devices and
> cpus have the same memory view on x86.  Guess I need to try this on
> ppc64 to see it actually failing ...
> 
> So dropping the virtio_has_dma_quirk() checks isn't going to fly.
> 
> Using dma_max_mapping_size() should be fine though.  It might use a
> lower limit than needed for virtio, but it should not break things.

Makes sense. On this patch here:

Reviewed-by: Daniel Vetter 

And I guess it would be good if virtio pushes a bit more towards using the
dma api abstraction fully so we can get rid of these hacks. Virtio feels
like a driver that really should be using dma-api and not dig around
behind it because "it makes stuff 0.5% faster" or so, since being
virtualized it's already not the king of speed anyway :-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Nouveau] [PATCH 4/5] drm_dp_cec: add plumbing in preparation for MST support

2020-09-08 Thread Hans Verkuil
On 01/09/2020 08:22, Sam McNally wrote:
> From: Hans Verkuil 
> 
> Signed-off-by: Hans Verkuil 
> [sa...@chromium.org:
>  - rebased
>  - removed polling-related changes
>  - moved the calls to drm_dp_cec_(un)set_edid() into the next patch
> ]
> Signed-off-by: Sam McNally 
> ---
> 
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  2 +-
>  drivers/gpu/drm/drm_dp_cec.c  | 22 ++-
>  drivers/gpu/drm/i915/display/intel_dp.c   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_connector.c   |  2 +-
>  include/drm/drm_dp_helper.h   |  6 +++--
>  5 files changed, 19 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 461fa4da0a34..6e7075893ec9 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -419,7 +419,7 @@ void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
>  
>   drm_dp_aux_init(&aconnector->dm_dp_aux.aux);
>   drm_dp_cec_register_connector(&aconnector->dm_dp_aux.aux,
> -   &aconnector->base);
> +   &aconnector->base, false);
>  
>   if (aconnector->base.connector_type == DRM_MODE_CONNECTOR_eDP)
>   return;
> diff --git a/drivers/gpu/drm/drm_dp_cec.c b/drivers/gpu/drm/drm_dp_cec.c
> index 3ab2609f9ec7..04ab7b88055c 100644
> --- a/drivers/gpu/drm/drm_dp_cec.c
> +++ b/drivers/gpu/drm/drm_dp_cec.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /*
>   * Unfortunately it turns out that we have a chicken-and-egg situation
> @@ -338,8 +339,6 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const struct edid *edid)
>   if (aux->cec.adap) {
>   if (aux->cec.adap->capabilities == cec_caps &&
>   aux->cec.adap->available_log_addrs == num_las) {
> - /* Unchanged, so just set the phys addr */
> - cec_s_phys_addr_from_edid(aux->cec.adap, edid);
>   goto unlock;
>   }
>   /*
> @@ -364,15 +363,16 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const struct edid *edid)
>   if (cec_register_adapter(aux->cec.adap, connector->dev->dev)) {
>   cec_delete_adapter(aux->cec.adap);
>   aux->cec.adap = NULL;
> - } else {
> - /*
> -  * Update the phys addr for the new CEC adapter. When called
> -  * from drm_dp_cec_register_connector() edid == NULL, so in
> -  * that case the phys addr is just invalidated.
> -  */
> - cec_s_phys_addr_from_edid(aux->cec.adap, edid);
>   }
>  unlock:
> + /*
> +  * Update the phys addr for the new CEC adapter. When called
> +  * from drm_dp_cec_register_connector() edid == NULL, so in
> +  * that case the phys addr is just invalidated.
> +  */

The comment is no longer in sync with the code: if EDID == NULL, then
nothing is done due to the edid check in the 'if' below.

> + if (aux->cec.adap && edid) {

I think this should just be: if (aux->cec.adap)

Also, the {} aren't necessary here.

> + cec_s_phys_addr_from_edid(aux->cec.adap, edid);
> + }
>   mutex_unlock(&aux->cec.lock);
>  }
>  EXPORT_SYMBOL(drm_dp_cec_set_edid);

Frankly, the changes to this function should be dropped completely, from
what I can see they are not necessary. It was done in my original patch
because of the way I handled mst, but you did it differently (and I think
better), so these changes are no longer needed.

I know I am actually commenting on my old patch, but that patch was from a
work-in-progress git branch and was never meant as a 'proper' patch.

However, what complicates matters is that after digging a bit more I discovered
that commit 732300154980 ("drm: Do not call drm_dp_cec_set_edid() while
registering DP connectors") changed drm_dp_cec_register_connector() so that it
no longer calls drm_dp_cec_set_edid(), but the comments there and in this
function were not updated. It would be nice if you can add a patch fixing these
outdated comments.

Regardless of that change in commit 732300154980, the edid pointer can still be
NULL and the existing behavior should be kept (i.e. create a CEC device, but
with an invalid physical address since there is no EDID for some reason).
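
To make the requested behavior concrete, here is a small userspace model.
It is only a sketch: struct fake_edid, struct fake_adapter and both
functions are hypothetical stand-ins, not the kernel's CEC or DRM types.
The one assumption carried over from the kernel is that an invalid CEC
physical address is 0xffff (CEC_PHYS_ADDR_INVALID). The point it shows is
that the update should be guarded only by the adapter pointer, so that a
NULL EDID still invalidates the address.

```c
#include <stddef.h>
#include <stdint.h>

#define PHYS_ADDR_INVALID 0xffffu /* mirrors CEC_PHYS_ADDR_INVALID */

/* Hypothetical, heavily simplified stand-ins for the real structures. */
struct fake_edid {
	uint16_t cec_phys_addr;
};

struct fake_adapter {
	uint16_t phys_addr;
};

/* Models the rule "no EDID means an invalid physical address". */
void set_phys_addr_from_edid(struct fake_adapter *adap,
			     const struct fake_edid *edid)
{
	adap->phys_addr = edid ? edid->cec_phys_addr : PHYS_ADDR_INVALID;
}

/*
 * Shape of the check asked for in the review: test only the adapter.
 * Also testing edid here would leave a stale address in place when the
 * EDID goes away.
 */
void update_phys_addr(struct fake_adapter *adap, const struct fake_edid *edid)
{
	if (adap)
		set_phys_addr_from_edid(adap, edid);
}
```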

Regards,

Hans

> @@ -418,6 +418,7 @@ EXPORT_SYMBOL(drm_dp_cec_unset_edid);
>   * drm_dp_cec_register_connector() - register a new connector
>   * @aux: DisplayPort AUX channel
>   * @connector: drm connector
> + * @is_mst: set to true if this is an MST branch
>   *
>   * A new connector was registered with associated CEC adapter name and
>   * CEC adapter parent device. After registering the name and parent
> @@ -425,12 +426,13 @@ EXPORT_SYMBOL(drm_dp_cec_unset_edid);
>   * CEC and to register a CEC adapter if 

Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting

2020-09-08 Thread Ben Skeggs
On Thu, 13 Aug 2020 at 06:50, Jeremy Cline  wrote:
>
> Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of
> new gp1xx temperature sensor") added support for reading finer-grain
> temperatures, but continued to report temperatures in 1 degree Celsius
> increments via nvkm_therm_temp_get().
>
> Rather than altering nvkm_therm_temp_get() to report finer-grain
> temperatures, which would be inconvenient for other users of the
> function, a second interface has been added to line up with hwmon's
> native unit of temperature.
Hey Jeremy,

Sorry this slipped past me until now.  I'm OK with adding support for
millidegree temperature reporting, but don't think we need to keep
both interfaces around and would rather see the existing code
converted to return millidegrees (even on GPUs that don't support it)
instead of degrees.
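
A rough userspace sketch of what a single millidegree interface could look
like, under stated assumptions: struct sensor and all function names are
hypothetical rather than the nvkm API, and the fine-grain sensor is assumed
to deliver a fixed-point reading in 1/256 degree steps, which may not match
the real hardware format.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a thermal sensor; not the nvkm structures. */
struct sensor {
	/* Returns millidegrees Celsius, or a negative errno value. */
	int (*temp_get)(const struct sensor *s);
	int raw; /* last raw reading, units depend on the sensor */
};

/* Sensor that only resolves whole degrees: scale up to millidegrees. */
int coarse_temp_get(const struct sensor *s)
{
	return s->raw * 1000;
}

/* Sensor whose raw value is assumed to be in 1/256 degree steps. */
int fine_temp_get(const struct sensor *s)
{
	return (int)(((long long)s->raw * 1000) / 256);
}

/* Single entry point: callers always get millidegrees or an error. */
int sensor_temp_get(const struct sensor *s)
{
	if (!s || !s->temp_get)
		return -ENODEV;
	return s->temp_get(s);
}
```

Callers that want whole degrees can divide by 1000 themselves, so a second
accessor is not needed in this model.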

Thanks!
Ben.

>
> Signed-off-by: Jeremy Cline 
> ---
>  .../drm/nouveau/include/nvkm/subdev/therm.h   | 18 +
>  drivers/gpu/drm/nouveau/nouveau_hwmon.c   |  4 +--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 16 
>  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  |  1 +
>  5 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index 62c34f98c930..7b9928dd001c 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -100,6 +100,24 @@ struct nvkm_therm {
>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
> +
> +/**
> + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees
> + * @therm: The thermal device to read from.
> + *
> + * This interface reports temperatures in units of millidegree Celsius to
> + * align with the hwmon API. Some cards may only be capable of reporting in
> + * units of Celsius, and those that report finer grain temperatures may not be
> + * capable of millidegree Celsius accuracy.
> + *
> + * For cases where millidegree temperature is too fine-grain, the
> + * nvkm_therm_temp_get() interface reports temperatures in one degree Celsius
> + * increments.
> + *
> + * Return: The temperature in millidegrees Celsius, or -ENODEV if temperature
> + * reporting is not supported.
> + */
> +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm);
>  int nvkm_therm_fan_sense(struct nvkm_therm *);
>  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
>  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> index 1c3104d20571..e96355f93ce5 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int channel, long *val)
> case hwmon_temp_input:
> if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> return -EINVAL;
> -   ret = nvkm_therm_temp_get(therm);
> -   *val = ret < 0 ? ret : (ret * 1000);
> +   ret = nvkm_therm_temp_millidegree_get(therm);
> +   *val = ret;
> break;
> case hwmon_temp_max:
> *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK)
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> index 4a4d1e224126..e655b32c78b8 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm)
> return -ENODEV;
>  }
>
> +int
> +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm)
> +{
> +   int ret = -ENODEV;
> +
> +   if (therm->func->temp_millidegree_get)
> +   return therm->func->temp_millidegree_get(therm);
> +
> +   if (therm->func->temp_get) {
> +   ret = therm->func->temp_get(therm);
> +   if (ret > 0)
> +   ret *= 1000;
> +   }
> +   return ret;
> +}
> +
>  static int
>  nvkm_therm_update_trip(struct nvkm_therm *therm)
>  {
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> index 9f0dea3f61dc..4c3c2895a3cb 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> @@ -24,7 +24,7 @@
>  #include "priv.h"
>
>  static int
> -gp100_temp_get(struct nvkm_therm *therm)
> +gp100_temp_get_raw(struct nvkm_therm *therm)
>  {
> struct nvkm_device *device = therm->subdev.device;
> struct nvkm_subdev *subdev = &therm->subdev;
> @@ -37,14 +37,35 @@ gp100_temp_get(struct nvkm_therm *therm)
>
> /* device valid */
> if (tsensor &