Re: [Nouveau] [PATCH 13/40] drm/nouveau/dispnv50/disp: Include header containing our prototypes

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:2599:1: warning: no previous 
> prototype for ‘nv50_display_create’ [-Wmissing-prototypes]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c 
> b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index 351f954989530..4905ed584ff48 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -68,6 +68,8 @@
>
>  #include 
>
> +#include "nv50_display.h"
> +
>  
> /**
>   * EVO channel
>   
> */
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [PATCH 14/40] drm/nouveau/nouveau_ioc32: File headers are not good candidates for kernel-doc

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:2: warning: Cannot understand  * 
> file mga_ioc32.c
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c 
> b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> index adf01ca9e035d..8ddf9b2325a42 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> @@ -1,4 +1,4 @@
> -/**
> +/*
>   * \file mga_ioc32.c
>   *
>   * 32-bit ioctl compatibility routines for the MGA DRM.
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>



Re: [Nouveau] [PATCH 16/40] drm/nouveau/nouveau_ioc32: Demote kernel-doc abuse to standard comment block

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'filp' not described in 'nouveau_compat_ioctl'
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'cmd' not described in 'nouveau_compat_ioctl'
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'arg' not described in 'nouveau_compat_ioctl'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c 
> b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> index 8ddf9b2325a42..2af3615c5205c 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> @@ -38,7 +38,7 @@
>
>  #include "nouveau_ioctl.h"
>
> -/**
> +/*
>   * Called whenever a 32-bit process running under a 64-bit kernel
>   * performs an ioctl on /dev/dri/card.
>   *
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: [Nouveau] [PATCH 15/40] drm/nouveau/nouveau_svm: Remove unused variable 'ret' from void function

2021-04-19 Thread Karol Herbst
On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_svm.c: In function ‘nouveau_pfns_map’:
>  drivers/gpu/drm/nouveau/nouveau_svm.c:810:6: warning: variable ‘ret’ set but 
> not used [-Wunused-but-set-variable]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_svm.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
> b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 1c3f890377d2c..26af6ee915368 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -811,7 +811,6 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct 
> mm_struct *mm,
>  unsigned long addr, u64 *pfns, unsigned long npages)
>  {
> struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
> -   int ret;
>
> args->p.addr = addr;
> args->p.size = npages << PAGE_SHIFT;
> @@ -819,8 +818,8 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct 
> mm_struct *mm,
> mutex_lock(>mutex);
>
> svmm->vmm->vmm.object.client->super = true;
> -   ret = nvif_object_ioctl(>vmm->vmm.object, args, sizeof(*args) +
> -   npages * sizeof(args->p.phys[0]), NULL);
> +   nvif_object_ioctl(>vmm->vmm.object, args, sizeof(*args) +
> + npages * sizeof(args->p.phys[0]), NULL);

yeah mhh.. I think this one is actually fine, but it might make sense
to still report something back, although in userspace we still don't
care as the CL API doesn't return any error.

> svmm->vmm->vmm.object.client->super = false;
>
> mutex_unlock(>mutex);
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [Nouveau] [PATCH 12/40] drm/nouveau/nv50_display: Remove superfluous prototype for local static functions

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following build error:
>
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:2530:1: error: conflicting types for 
> ‘nv50_display_fini’
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:36:6: note: previous declaration of 
> ‘nv50_display_fini’ was her
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:35:6: note: previous declaration of 
> ‘nv50_display_init’ was here
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:2581:1: error: static declaration of 
> ‘nv50_display_destroy’ follows non-static declaration
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:34:6: note: previous declaration of 
> ‘nv50_display_destroy’ was here
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nv50_display.h | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nv50_display.h 
> b/drivers/gpu/drm/nouveau/nv50_display.h
> index fbd3b15583bc8..2421401d12636 100644
> --- a/drivers/gpu/drm/nouveau/nv50_display.h
> +++ b/drivers/gpu/drm/nouveau/nv50_display.h
> @@ -31,7 +31,4 @@
>  #include "nouveau_reg.h"
>
>  int  nv50_display_create(struct drm_device *);
> -void nv50_display_destroy(struct drm_device *);
> -int  nv50_display_init(struct drm_device *);
> -void nv50_display_fini(struct drm_device *);
>  #endif /* __NV50_DISPLAY_H__ */
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [Nouveau] [PATCH 11/40] drm/nouveau/dispnv50/headc57d: Make local function 'headc57d_olut' static

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv50/headc57d.c:173:1: warning: no previous 
> prototype for ‘headc57d_olut’ [-Wmissing-prototypes]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Lyude Paul 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c 
> b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> index fd51527b56b83..bdcfd240d61c8 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> @@ -169,7 +169,7 @@ headc57d_olut_load(struct drm_color_lut *in, int size, 
> void __iomem *mem)
> writew(readw(mem - 4), mem + 4);
>  }
>
> -bool
> +static bool
>  headc57d_olut(struct nv50_head *head, struct nv50_head_atom *asyh, int size)
>  {
> if (size != 0 && size != 256 && size != 1024)
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [Nouveau] [PATCH 10/40] drm/nouveau/dispnv50/disp: Remove unused variable 'ret' from function returning void

2021-04-19 Thread Karol Herbst
On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv50/disp.c: In function ‘nv50_mstm_cleanup’:
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:1357:6: warning: variable ‘ret’ set 
> but not used [-Wunused-but-set-variable]
>

same comment here: we should really check if it's better to handle the
error and report it back that something failed or so..

> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/disp.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c 
> b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index 4801aafd9552b..351f954989530 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -1386,12 +1386,11 @@ nv50_mstm_cleanup(struct nv50_mstm *mstm)
>  {
> struct nouveau_drm *drm = nouveau_drm(mstm->outp->base.base.dev);
> struct drm_encoder *encoder;
> -   int ret;
>
> NV_ATOMIC(drm, "%s: mstm cleanup\n", mstm->outp->base.base.name);
> -   ret = drm_dp_check_act_status(>mgr);
> +   drm_dp_check_act_status(>mgr);
>
> -   ret = drm_dp_update_payload_part2(>mgr);
> +   drm_dp_update_payload_part2(>mgr);
>
> drm_for_each_encoder(encoder, mstm->outp->base.base.dev) {
> if (encoder->encoder_type == DRM_MODE_ENCODER_DPMST) {
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [PATCH 09/40] drm/nouveau/dispnv04/crtc: Demote non-conforming kernel-doc headers

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:38 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or 
> member 'crtc' not described in 'nv_crtc_mode_set_regs'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or 
> member 'mode' not described in 'nv_crtc_mode_set_regs'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'crtc' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'mode' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'adjusted_mode' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'x' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'y' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'old_fb' not described in 'nv_crtc_mode_set'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
> b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> index f9e962fd94d0d..f9a276ea5a9e0 100644
> --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> @@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *crtc, struct 
> drm_display_mode *mode)
> regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00;
>  }
>
> -/**
> +/*
>   * Sets up registers for the given mode/adjusted_mode pair.
>   *
>   * The clocks, CRTCs and outputs attached to this CRTC must be off.
> @@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc, struct 
> drm_framebuffer *old_fb)
> return ret;
>  }
>
> -/**
> +/*
>   * Sets up registers for the given mode/adjusted_mode pair.
>   *
>   * The clocks, CRTCs and outputs attached to this CRTC must be off.
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>



Re: [Nouveau] [PATCH 07/40] drm/nouveau/nouveau_bo: Remove unused variables 'dev'

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:37 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_bo.c: In function ‘nouveau_ttm_tt_populate’:
>  drivers/gpu/drm/nouveau/nouveau_bo.c:1228:17: warning: variable ‘dev’ set 
> but not used [-Wunused-but-set-variable]
>  drivers/gpu/drm/nouveau/nouveau_bo.c: In function 
> ‘nouveau_ttm_tt_unpopulate’:
>  drivers/gpu/drm/nouveau/nouveau_bo.c:1252:17: warning: variable ‘dev’ set 
> but not used [-Wunused-but-set-variable]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: Jeremy Kolb 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
> b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 3e09df0472ce4..37b3d2c10f5c5 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1255,7 +1255,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
>  {
> struct ttm_tt *ttm_dma = (void *)ttm;
> struct nouveau_drm *drm;
> -   struct device *dev;
> bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
>
> if (ttm_tt_is_populated(ttm))
> @@ -1268,7 +1267,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
> }
>
> drm = nouveau_bdev(bdev);
> -   dev = drm->dev->dev;
>
> return ttm_pool_alloc(>ttm.bdev.pool, ttm, ctx);
>  }
> @@ -1278,14 +1276,12 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev,
>   struct ttm_tt *ttm)
>  {
> struct nouveau_drm *drm;
> -   struct device *dev;
> bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
>
> if (slave)
> return;
>
> drm = nouveau_bdev(bdev);
> -   dev = drm->dev->dev;
>
> return ttm_pool_free(>ttm.bdev.pool, ttm);
>  }
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [Nouveau] [PATCH 05/40] drm/nouveau/nvkm/subdev/volt/gk20a: Demote non-conformant kernel-doc headers

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:37 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 'coef' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'temp' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 't_scale' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'coef' not described in 'gk20a_volt_get_cvb_t_voltage'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> index 8c2faa9645111..ccac88da88648 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> @@ -45,7 +45,7 @@ static const struct cvb_coef gk20a_cvb_coef[] = {
> /* 852 */ { 1608418, -21643, -269, 0,763,  -48},
>  };
>
> -/**
> +/*
>   * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0)
>   */
>  static inline int
> @@ -58,7 +58,7 @@ gk20a_volt_get_cvb_voltage(int speedo, int s_scale, const 
> struct cvb_coef *coef)
> return mv;
>  }
>
> -/**
> +/*
>   * cvb_t_mv =
>   * ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) +
>   * ((c3 * speedo / s_scale + c4 + c5 * T / t_scale) * T / t_scale)
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: [Nouveau] [PATCH 06/40] drm/nouveau/nvkm/engine/gr/gf100: Demote non-conformant kernel-doc header

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:37 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:992: warning: Function 
> parameter or member 'gr' not described in 'gf100_gr_wait_idle'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> index 397ff4fe9df89..69e6008f99196 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> @@ -982,7 +982,7 @@ gf100_gr_zbc_init(struct gf100_gr *gr)
> }
>  }
>
> -/**
> +/*
>   * Wait until GR goes idle. GR is considered idle if it is disabled by the
>   * MC (0x200) register, or GR is not busy and a context switch is not in
>   * progress.
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: [Nouveau] [PATCH 02/40] drm/nouveau/dispnv50/disp: Remove unused variable 'ret'

2021-04-19 Thread Karol Herbst
On Fri, Apr 16, 2021 at 4:37 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:1381:6: warning: variable ‘ret’ set 
> but not used [-Wunused-but-set-variable]
>

not a big fan of just ignoring return codes, I'd rather see it handled
somehow, unless somebody knowing more about the details here says it's
okay.

> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/disp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c 
> b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index 1c9c0cdf85dbc..4801aafd9552b 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -1410,10 +1410,9 @@ nv50_mstm_prepare(struct nv50_mstm *mstm)
>  {
> struct nouveau_drm *drm = nouveau_drm(mstm->outp->base.base.dev);
> struct drm_encoder *encoder;
> -   int ret;
>
> NV_ATOMIC(drm, "%s: mstm prepare\n", mstm->outp->base.base.name);
> -   ret = drm_dp_update_payload_part1(>mgr);
> +   drm_dp_update_payload_part1(>mgr);
>
> drm_for_each_encoder(encoder, mstm->outp->base.base.dev) {
> if (encoder->encoder_type == DRM_MODE_ENCODER_DPMST) {
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau



Re: [Nouveau] [PATCH 01/40] drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc

2021-04-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Apr 16, 2021 at 4:37 PM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:584: warning: Function 
> parameter or member 'init' not described in 'init_reserved'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:611: warning: Function 
> parameter or member 'init' not described in 'init_done'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:622: warning: Function 
> parameter or member 'init' not described in 'init_io_restrict_prog'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:659: warning: Function 
> parameter or member 'init' not described in 'init_repeat'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:685: warning: Function 
> parameter or member 'init' not described in 'init_io_restrict_pll'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:725: warning: Function 
> parameter or member 'init' not described in 'init_end_repeat'
>
> NB: Trimmed for brevity (lots of these!)
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  .../gpu/drm/nouveau/nvkm/subdev/bios/init.c   | 204 ++
>  1 file changed, 68 insertions(+), 136 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> index 9de74f41dcd2a..5a91dc4e5c8ec 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> @@ -575,9 +575,8 @@ init_tmds_reg(struct nvbios_init *init, u8 tmds)
>   * init opcode handlers
>   
> */
>
> -/**
> +/*
>   * init_reserved - stub for various unknown/unused single-byte opcodes
> - *
>   */
>  static void
>  init_reserved(struct nvbios_init *init)
> @@ -602,9 +601,8 @@ init_reserved(struct nvbios_init *init)
> init->offset += length;
>  }
>
> -/**
> +/*
>   * INIT_DONE - opcode 0x71
> - *
>   */
>  static void
>  init_done(struct nvbios_init *init)
> @@ -613,9 +611,8 @@ init_done(struct nvbios_init *init)
> init->offset = 0x;
>  }
>
> -/**
> +/*
>   * INIT_IO_RESTRICT_PROG - opcode 0x32
> - *
>   */
>  static void
>  init_io_restrict_prog(struct nvbios_init *init)
> @@ -650,9 +647,8 @@ init_io_restrict_prog(struct nvbios_init *init)
> trace("}]\n");
>  }
>
> -/**
> +/*
>   * INIT_REPEAT - opcode 0x33
> - *
>   */
>  static void
>  init_repeat(struct nvbios_init *init)
> @@ -676,9 +672,8 @@ init_repeat(struct nvbios_init *init)
> init->repeat = repeat;
>  }
>
> -/**
> +/*
>   * INIT_IO_RESTRICT_PLL - opcode 0x34
> - *
>   */
>  static void
>  init_io_restrict_pll(struct nvbios_init *init)
> @@ -716,9 +711,8 @@ init_io_restrict_pll(struct nvbios_init *init)
> trace("}]\n");
>  }
>
> -/**
> +/*
>   * INIT_END_REPEAT - opcode 0x36
> - *
>   */
>  static void
>  init_end_repeat(struct nvbios_init *init)
> @@ -732,9 +726,8 @@ init_end_repeat(struct nvbios_init *init)
> }
>  }
>
> -/**
> +/*
>   * INIT_COPY - opcode 0x37
> - *
>   */
>  static void
>  init_copy(struct nvbios_init *init)
> @@ -759,9 +752,8 @@ init_copy(struct nvbios_init *init)
> init_wrvgai(init, port, index, data);
>  }
>
> -/**
> +/*
>   * INIT_NOT - opcode 0x38
> - *
>   */
>  static void
>  init_not(struct nvbios_init *init)
> @@ -771,9 +763,8 @@ init_not(struct nvbios_init *init)
> init_exec_inv(init);
>  }
>
> -/**
> +/*
>   * INIT_IO_FLAG_CONDITION - opcode 0x39
> - *
>   */
>  static void
>  init_io_flag_condition(struct nvbios_init *init)
> @@ -788,9 +779,8 @@ init_io_flag_condition(struct nvbios_init *init)
> init_exec_set(init, false);
>  }
>
> -/**
> +/*
>   * INIT_GENERIC_CONDITION - opcode 0x3a
> - *
>   */
>  static void
>  init_generic_condition(struct nvbios_init *init)
> @@ -840,9 +830,8 @@ init_generic_condition(struct nvbios_init *init)
> }
>  }
>
> -/**
> +/*
>   * INIT_IO_MASK_OR - opcode 0x3b
> - *
>   */
>  static void
>  init_io_mask_or(struct nvbios_init *init)
> @@ -859,9 +848,8 @@ init_io_mask_or(struct nvbios_init *init)
> init_wrvgai(init, 0x03d4, index, data &= ~(1 << or));
>  }
>
> -/**
> +/*
>   * INIT_IO_OR - opcode 0x3c
> - *
>   */
>  static void
>  init_io_or(struct nvbios_init *init)
> @@ -878,9 +866,8 @@ init_i

Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails

2021-04-13 Thread Karol Herbst
On Tue, Apr 13, 2021 at 1:17 PM Roy Spliet  wrote:
>
> Op 13-04-2021 om 10:48 schreef Karol Herbst:
> > On Tue, Apr 13, 2021 at 10:24 AM Roy Spliet  wrote:
> >>
> >> Op 13-04-2021 om 01:10 schreef Karol Herbst:
> >>> On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet  wrote:
> >>>>
> >>>> Hello Aaron,
> >>>>
> >>>> Thanks for your insights. A follow-up query and some observations 
> >>>> in-line.
> >>>>
> >>>> Op 12-04-2021 om 20:06 schreef Aaron Plattner:
> >>>>> On 4/10/21 1:48 PM, Roy Spliet wrote:
> >>>>>> Op 10-04-2021 om 20:23 schreef Lukas Wunner:
> >>>>>>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote:
> >>>>>>>> Can I ask someone with more
> >>>>>>>> technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm
> >>>>>>>> about
> >>>>>>>> the possible challenges of nouveau taking matters into its own hand
> >>>>>>>> rather
> >>>>>>>> than keeping this PCI quirk around?
> >>>>>>>
> >>>>>>> It sounds to me like the HDA is not powered if no cable is plugged in.
> >>>>>>> What is reponsible then for powering it up or down, firmware code on
> >>>>>>> the GPU or in the host's BIOS?
> >>>>>>
> >>>>>> Sometimes the BIOS, but definitely unconditionally the PCI quirk code:
> >>>>>> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289
> >>>>>>
> >>>>>> (CC Aaron Plattner)
> >>>>>
> >>>>> My basic understanding is that the audio function stops responding
> >>>>> whenever the graphics function is powered off. So the requirement here
> >>>>> is that the audio driver can't try to talk to the audio function while
> >>>>> the graphics function is asleep, and must trigger a graphics function
> >>>>> wakeup before trying to communicate with the audio function.
> >>>>
> >>>> I believe that vgaswitcheroo takes care of this for us.
> >>>>
> >>>
> >>> yeah, and also: why would the driver want to do stuff? If the GPU is
> >>> turned off, there is no point in communicating with the audio device
> >>> anyway. The driver should do the initial probe and leave the device be
> >>> unless it's actively used. Also there is no such thing as "use the
> >>> audio function, but not the graphics one"
> >>>
> >>>>> I think
> >>>>> there are also requirements about the audio function needing to be awake
> >>>>> when the graphics driver is updating the ELD, but I'm not sure.
> >>>>>
> >>>
> >>> well, it's one physical device anyway, so technically the audio
> >>> function is powered on.
> >>>
> >>>>> This is harder on Windows because the audio driver lives in its own
> >>>>> little world doing its own thing but on Linux we can do better.
> >>>>>
> >>>>>>> Ideally, we should try to find out how to control HDA power from the
> >>>>>>> operating system rather than trying to cooperate with whatever 
> >>>>>>> firmware
> >>>>>>> is doing.  If we have that capability, the OS should power the HDA up
> >>>>>>> and down as it sees fit.
> >>>>>
> >>>>> After system boot, I don't think there's any firmware involved, but I'm
> >>>>> not super familiar with the low-level details and it's possible the
> >>>>> situation changed since I last looked at it.
> >>>>>
> >>>>> I think the problem with having nouveau write this quirk is that the
> >>>>> kernel will need to re-probe the PCI device to notice that it has
> >>>>> suddenly become a multi-function device with an audio function, and
> >>>>> hotplug the audio driver. I originally looked into trying to do that but
> >>>>> it was tricky because the PCI subsystem didn't really have a mechanism
> >>>>> for a single-function device to become a multi-function device on the
> >>>>> fly and it seemed easier to enable it early on during bus enumeration.
> &g

Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails

2021-04-13 Thread Karol Herbst
On Tue, Apr 13, 2021 at 10:24 AM Roy Spliet  wrote:
>
> Op 13-04-2021 om 01:10 schreef Karol Herbst:
> > On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet  wrote:
> >>
> >> Hello Aaron,
> >>
> >> Thanks for your insights. A follow-up query and some observations in-line.
> >>
> >> Op 12-04-2021 om 20:06 schreef Aaron Plattner:
> >>> On 4/10/21 1:48 PM, Roy Spliet wrote:
> >>>> Op 10-04-2021 om 20:23 schreef Lukas Wunner:
> >>>>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote:
> >>>>>> Can I ask someone with more
> >>>>>> technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm
> >>>>>> about
> >>>>>> the possible challenges of nouveau taking matters into its own hand
> >>>>>> rather
> >>>>>> than keeping this PCI quirk around?
> >>>>>
> >>>>> It sounds to me like the HDA is not powered if no cable is plugged in.
> >>>>> What is reponsible then for powering it up or down, firmware code on
> >>>>> the GPU or in the host's BIOS?
> >>>>
> >>>> Sometimes the BIOS, but definitely unconditionally the PCI quirk code:
> >>>> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289
> >>>>
> >>>> (CC Aaron Plattner)
> >>>
> >>> My basic understanding is that the audio function stops responding
> >>> whenever the graphics function is powered off. So the requirement here
> >>> is that the audio driver can't try to talk to the audio function while
> >>> the graphics function is asleep, and must trigger a graphics function
> >>> wakeup before trying to communicate with the audio function.
> >>
> >> I believe that vgaswitcheroo takes care of this for us.
> >>
> >
> > yeah, and also: why would the driver want to do stuff? If the GPU is
> > turned off, there is no point in communicating with the audio device
> > anyway. The driver should do the initial probe and leave the device be
> > unless it's actively used. Also there is no such thing as "use the
> > audio function, but not the graphics one"
> >
> >>> I think
> >>> there are also requirements about the audio function needing to be awake
> >>> when the graphics driver is updating the ELD, but I'm not sure.
> >>>
> >
> > well, it's one physical device anyway, so technically the audio
> > function is powered on.
> >
> >>> This is harder on Windows because the audio driver lives in its own
> >>> little world doing its own thing but on Linux we can do better.
> >>>
> >>>>> Ideally, we should try to find out how to control HDA power from the
> >>>>> operating system rather than trying to cooperate with whatever firmware
> >>>>> is doing.  If we have that capability, the OS should power the HDA up
> >>>>> and down as it sees fit.
> >>>
> >>> After system boot, I don't think there's any firmware involved, but I'm
> >>> not super familiar with the low-level details and it's possible the
> >>> situation changed since I last looked at it.
> >>>
> >>> I think the problem with having nouveau write this quirk is that the
> >>> kernel will need to re-probe the PCI device to notice that it has
> >>> suddenly become a multi-function device with an audio function, and
> >>> hotplug the audio driver. I originally looked into trying to do that but
> >>> it was tricky because the PCI subsystem didn't really have a mechanism
> >>> for a single-function device to become a multi-function device on the
> >>> fly and it seemed easier to enable it early on during bus enumeration.
> >>> That way the kernel sees both functions all the time without anything
> >>> else having to be special about this configuration.
> >
> > Well, we do have this pci/quirk.c thing, no? Nouveau does flip the
> > bit, but I am actually not sure if that's even doing something
> > anymore. Maybe in the runtime_resume case it's still relevant but not
> > sure _when_ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY is triggered, it does
> > seem to be called even in the runtime_resume case though.
> >
> >>
> >> Right, so for a little more context: a while ago I noticed that my
> >> laptop (lucky me, Asus K501UB) has a 940M with HDA but no codec. Seems
> >&g

Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails

2021-04-12 Thread Karol Herbst
On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet  wrote:
>
> Hello Aaron,
>
> Thanks for your insights. A follow-up query and some observations in-line.
>
> Op 12-04-2021 om 20:06 schreef Aaron Plattner:
> > On 4/10/21 1:48 PM, Roy Spliet wrote:
> >> Op 10-04-2021 om 20:23 schreef Lukas Wunner:
> >>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote:
>  Can I ask someone with more
>  technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm
>  about
>  the possible challenges of nouveau taking matters into its own hand
>  rather
>  than keeping this PCI quirk around?
> >>>
> >>> It sounds to me like the HDA is not powered if no cable is plugged in.
> >>> What is reponsible then for powering it up or down, firmware code on
> >>> the GPU or in the host's BIOS?
> >>
> >> Sometimes the BIOS, but definitely unconditionally the PCI quirk code:
> >> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289
> >>
> >> (CC Aaron Plattner)
> >
> > My basic understanding is that the audio function stops responding
> > whenever the graphics function is powered off. So the requirement here
> > is that the audio driver can't try to talk to the audio function while
> > the graphics function is asleep, and must trigger a graphics function
> > wakeup before trying to communicate with the audio function.
>
> I believe that vgaswitcheroo takes care of this for us.
>

yeah, and also: why would the driver want to do stuff? If the GPU is
turned off, there is no point in communicating with the audio device
anyway. The driver should do the initial probe and leave the device be
unless it's actively used. Also there is no such thing as "use the
audio function, but not the graphics one"

> > I think
> > there are also requirements about the audio function needing to be awake
> > when the graphics driver is updating the ELD, but I'm not sure.
> >

well, it's one physical device anyway, so technically the audio
function is powered on.

> > This is harder on Windows because the audio driver lives in its own
> > little world doing its own thing but on Linux we can do better.
> >
> >>> Ideally, we should try to find out how to control HDA power from the
> >>> operating system rather than trying to cooperate with whatever firmware
> >>> is doing.  If we have that capability, the OS should power the HDA up
> >>> and down as it sees fit.
> >
> > After system boot, I don't think there's any firmware involved, but I'm
> > not super familiar with the low-level details and it's possible the
> > situation changed since I last looked at it.
> >
> > I think the problem with having nouveau write this quirk is that the
> > kernel will need to re-probe the PCI device to notice that it has
> > suddenly become a multi-function device with an audio function, and
> > hotplug the audio driver. I originally looked into trying to do that but
> > it was tricky because the PCI subsystem didn't really have a mechanism
> > for a single-function device to become a multi-function device on the
> > fly and it seemed easier to enable it early on during bus enumeration.
> > That way the kernel sees both functions all the time without anything
> > else having to be special about this configuration.

Well, we do have this pci/quirk.c thing, no? Nouveau does flip the
bit, but I am actually not sure if that's even doing something
anymore. Maybe in the runtime_resume case it's still relevant but not
sure _when_ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY is triggered, it does
seem to be called even in the runtime_resume case though.

>
> Right, so for a little more context: a while ago I noticed that my
> laptop (lucky me, Asus K501UB) has a 940M with HDA but no codec. Seems
> legit, given how this GPU has no displays attached; they're all hooked
> up to the Intel integrated GPU. That threw off the snd_hda_intel
> mid-probe, and as a result didn't permit runpm, keeping the entire GPU,
> PCIe bus and thus the CPU package awake. A bit of hackerly later we
> decided to continue probing without a codec, and now my laptop is happy,
> but...
> A new problem popped up with several other NVIDIA GPUs that expose their
> HDA subdevice, but somehow its inaccessible. Relevant lines from a
> users' log:
>
> [3.031222] MXM: GUID detected in BIOS
> [3.031280] ACPI BIOS Error (bug): AE_AML_PACKAGE_LIMIT, Index
> (0x3) is beyond end of object (length 0x0) (20200925/exoparg2-393)
> [3.031352] ACPI Error: Aborting method \_SB.PCI0.GFX0._DSM due to
> previous error (AE_AML_PACKAGE_LIMIT) (20200925/psparse-529)
> [3.031419] ACPI: \_SB_.PCI0.GFX0: failed to evaluate _DSM (0x300b)
> [3.031424] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type
> mismatch - Found [Buffer], ACPI requires [Package] (20200925/nsarguments-61)
> [3.031619] pci :00:02.0: optimus capabilities: enabled, status
> dynamic power,
> [3.031667] ACPI BIOS Error (bug): AE_AML_PACKAGE_LIMIT, Index
> (0x3) is beyond end of object 

Re: [PATCH 14/19] drm/nouveau/dispnv50/headc57d: Make local function 'headc57d_olut' static

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv50/headc57d.c:173:1: warning: no previous 
> prototype for ‘headc57d_olut’ [-Wmissing-prototypes]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Lyude Paul 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c 
> b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> index fd51527b56b83..bdcfd240d61c8 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
> @@ -169,7 +169,7 @@ headc57d_olut_load(struct drm_color_lut *in, int size, 
> void __iomem *mem)
> writew(readw(mem - 4), mem + 4);
>  }
>
> -bool
> +static bool
>  headc57d_olut(struct nv50_head *head, struct nv50_head_atom *asyh, int size)
>  {
> if (size != 0 && size != 256 && size != 1024)
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reviewed-by: Karol Herbst 



Re: [PATCH 12/19] drm/nouveau/dispnv04/crtc: Demote non-conforming kernel-doc headers

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or 
> member 'crtc' not described in 'nv_crtc_mode_set_regs'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or 
> member 'mode' not described in 'nv_crtc_mode_set_regs'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'crtc' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'mode' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'adjusted_mode' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'x' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'y' not described in 'nv_crtc_mode_set'
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or 
> member 'old_fb' not described in 'nv_crtc_mode_set'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c 
> b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> index f9e962fd94d0d..f9a276ea5a9e0 100644
> --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
> @@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *crtc, struct 
> drm_display_mode *mode)
> regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00;
>  }
>
> -/**
> +/*
>   * Sets up registers for the given mode/adjusted_mode pair.
>   *
>   * The clocks, CRTCs and outputs attached to this CRTC must be off.
> @@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc, struct 
> drm_framebuffer *old_fb)
> return ret;
>  }
>
> -/**
> +/*
>   * Sets up registers for the given mode/adjusted_mode pair.
>   *
>   * The clocks, CRTCs and outputs attached to this CRTC must be off.
> --
> 2.27.0
>
> _______
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Reviewed-by: Karol Herbst 



Re: [PATCH 19/19] drm/nouveau/nouveau_ioc32: Demote kernel-doc abuse to standard comment block

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'filp' not described in 'nouveau_compat_ioctl'
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'cmd' not described in 'nouveau_compat_ioctl'
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or 
> member 'arg' not described in 'nouveau_compat_ioctl'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c 
> b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> index 8ddf9b2325a42..2af3615c5205c 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> @@ -38,7 +38,7 @@
>
>  #include "nouveau_ioctl.h"
>
> -/**
> +/*
>   * Called whenever a 32-bit process running under a 64-bit kernel
>   * performs an ioctl on /dev/dri/card.
>   *
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Reviewed-by: Karol Herbst 



Re: [PATCH 17/19] drm/nouveau/nouveau_ioc32: File headers are not good candidates for kernel-doc

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c:2: warning: Cannot understand  * 
> file mga_ioc32.c
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c 
> b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> index adf01ca9e035d..8ddf9b2325a42 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c
> @@ -1,4 +1,4 @@
> -/**
> +/*
>   * \file mga_ioc32.c
>   *
>   * 32-bit ioctl compatibility routines for the MGA DRM.
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Reviewed-by: Karol Herbst 



Re: [PATCH 10/19] drm/nouveau/nouveau_bo: Remove unused variables 'dev'

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nouveau_bo.c: In function ‘nouveau_ttm_tt_populate’:
>  drivers/gpu/drm/nouveau/nouveau_bo.c:1228:17: warning: variable ‘dev’ set 
> but not used [-Wunused-but-set-variable]
>  drivers/gpu/drm/nouveau/nouveau_bo.c: In function 
> ‘nouveau_ttm_tt_unpopulate’:
>  drivers/gpu/drm/nouveau/nouveau_bo.c:1252:17: warning: variable ‘dev’ set 
> but not used [-Wunused-but-set-variable]
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: Jeremy Kolb 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nouveau_bo.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
> b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index 281e9ed139895..913035ab85ec2 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1250,7 +1250,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
>  {
> struct ttm_tt *ttm_dma = (void *)ttm;
> struct nouveau_drm *drm;
> -   struct device *dev;
> bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
>
> if (ttm_tt_is_populated(ttm))
> @@ -1263,7 +1262,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev,
> }
>
> drm = nouveau_bdev(bdev);
> -   dev = drm->dev->dev;
>
> return ttm_pool_alloc(>ttm.bdev.pool, ttm, ctx);
>  }
> @@ -1273,14 +1271,12 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev,
>   struct ttm_tt *ttm)
>  {
> struct nouveau_drm *drm;
> -   struct device *dev;
> bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG);
>
> if (slave)
> return;
>
> drm = nouveau_bdev(bdev);
> -   dev = drm->dev->dev;
>
> return ttm_pool_free(>ttm.bdev.pool, ttm);
>  }
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reviewed-by: Karol Herbst 



Re: [PATCH 15/19] drm/nouveau/nv50_display: Remove superfluous prototype for local static functions

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following build error:
>
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:2530:1: error: conflicting types for 
> ‘nv50_display_fini’
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:36:6: note: previous declaration of 
> ‘nv50_display_fini’ was her
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:35:6: note: previous declaration of 
> ‘nv50_display_init’ was here
>  drivers/gpu/drm/nouveau/dispnv50/disp.c:2581:1: error: static declaration of 
> ‘nv50_display_destroy’ follows non-static declaration
>  In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71:
>  drivers/gpu/drm/nouveau/nv50_display.h:34:6: note: previous declaration of 
> ‘nv50_display_destroy’ was here
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nv50_display.h | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nv50_display.h 
> b/drivers/gpu/drm/nouveau/nv50_display.h
> index fbd3b15583bc8..2421401d12636 100644
> --- a/drivers/gpu/drm/nouveau/nv50_display.h
> +++ b/drivers/gpu/drm/nouveau/nv50_display.h
> @@ -31,7 +31,4 @@
>  #include "nouveau_reg.h"
>
>  int  nv50_display_create(struct drm_device *);
> -void nv50_display_destroy(struct drm_device *);
> -int  nv50_display_init(struct drm_device *);
> -void nv50_display_fini(struct drm_device *);
>  #endif /* __NV50_DISPLAY_H__ */
> --
> 2.27.0
>
> _______
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reviewed-by: Karol Herbst 



Re: [PATCH 05/19] drm/nouveau/nvkm/subdev/volt/gk20a: Demote non-conformant kernel-doc headers

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:24 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function 
> parameter or member 'coef' not described in 'gk20a_volt_get_cvb_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'temp' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 't_scale' not described in 'gk20a_volt_get_cvb_t_voltage'
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function 
> parameter or member 'coef' not described in 'gk20a_volt_get_cvb_t_voltage'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> index 8c2faa9645111..ccac88da88648 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c
> @@ -45,7 +45,7 @@ static const struct cvb_coef gk20a_cvb_coef[] = {
> /* 852 */ { 1608418, -21643, -269, 0,763,  -48},
>  };
>
> -/**
> +/*
>   * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0)
>   */
>  static inline int
> @@ -58,7 +58,7 @@ gk20a_volt_get_cvb_voltage(int speedo, int s_scale, const 
> struct cvb_coef *coef)
> return mv;
>  }
>
> -/**
> +/*
>   * cvb_t_mv =
>   * ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) +
>   * ((c3 * speedo / s_scale + c4 + c5 * T / t_scale) * T / t_scale)
> --
> 2.27.0
>
> _______
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Reviewed-by: Karol Herbst 



Re: [PATCH 09/19] drm/nouveau/nvkm/engine/gr/gf100: Demote non-conformant kernel-doc header

2021-03-19 Thread Karol Herbst
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:992: warning: Function 
> parameter or member 'gr' not described in 'gf100_gr_wait_idle'
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> index 397ff4fe9df89..69e6008f99196 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
> @@ -982,7 +982,7 @@ gf100_gr_zbc_init(struct gf100_gr *gr)
> }
>  }
>
> -/**
> +/*
>   * Wait until GR goes idle. GR is considered idle if it is disabled by the
>   * MC (0x200) register, or GR is not busy and a context switch is not in
>   * progress.
> --
> 2.27.0
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Reviewed-by: Karol Herbst 



Re: [PATCH 01/19] drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc

2021-03-19 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Mar 19, 2021 at 9:24 AM Lee Jones  wrote:
>
> Fixes the following W=1 kernel build warning(s):
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:584: warning: Function 
> parameter or member 'init' not described in 'init_reserved'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:611: warning: Function 
> parameter or member 'init' not described in 'init_done'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:622: warning: Function 
> parameter or member 'init' not described in 'init_io_restrict_prog'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:659: warning: Function 
> parameter or member 'init' not described in 'init_repeat'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:685: warning: Function 
> parameter or member 'init' not described in 'init_io_restrict_pll'
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:725: warning: Function 
> parameter or member 'init' not described in 'init_end_repeat'
>
> NB: Trimmed for brevity (lots of these!)
>
> Cc: Ben Skeggs 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---
>  .../gpu/drm/nouveau/nvkm/subdev/bios/init.c   | 204 ++
>  1 file changed, 68 insertions(+), 136 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> index 9de74f41dcd2a..5a91dc4e5c8ec 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
> @@ -575,9 +575,8 @@ init_tmds_reg(struct nvbios_init *init, u8 tmds)
>   * init opcode handlers
>   
> */
>
> -/**
> +/*
>   * init_reserved - stub for various unknown/unused single-byte opcodes
> - *
>   */
>  static void
>  init_reserved(struct nvbios_init *init)
> @@ -602,9 +601,8 @@ init_reserved(struct nvbios_init *init)
> init->offset += length;
>  }
>
> -/**
> +/*
>   * INIT_DONE - opcode 0x71
> - *
>   */
>  static void
>  init_done(struct nvbios_init *init)
> @@ -613,9 +611,8 @@ init_done(struct nvbios_init *init)
> init->offset = 0x;
>  }
>
> -/**
> +/*
>   * INIT_IO_RESTRICT_PROG - opcode 0x32
> - *
>   */
>  static void
>  init_io_restrict_prog(struct nvbios_init *init)
> @@ -650,9 +647,8 @@ init_io_restrict_prog(struct nvbios_init *init)
> trace("}]\n");
>  }
>
> -/**
> +/*
>   * INIT_REPEAT - opcode 0x33
> - *
>   */
>  static void
>  init_repeat(struct nvbios_init *init)
> @@ -676,9 +672,8 @@ init_repeat(struct nvbios_init *init)
> init->repeat = repeat;
>  }
>
> -/**
> +/*
>   * INIT_IO_RESTRICT_PLL - opcode 0x34
> - *
>   */
>  static void
>  init_io_restrict_pll(struct nvbios_init *init)
> @@ -716,9 +711,8 @@ init_io_restrict_pll(struct nvbios_init *init)
> trace("}]\n");
>  }
>
> -/**
> +/*
>   * INIT_END_REPEAT - opcode 0x36
> - *
>   */
>  static void
>  init_end_repeat(struct nvbios_init *init)
> @@ -732,9 +726,8 @@ init_end_repeat(struct nvbios_init *init)
> }
>  }
>
> -/**
> +/*
>   * INIT_COPY - opcode 0x37
> - *
>   */
>  static void
>  init_copy(struct nvbios_init *init)
> @@ -759,9 +752,8 @@ init_copy(struct nvbios_init *init)
> init_wrvgai(init, port, index, data);
>  }
>
> -/**
> +/*
>   * INIT_NOT - opcode 0x38
> - *
>   */
>  static void
>  init_not(struct nvbios_init *init)
> @@ -771,9 +763,8 @@ init_not(struct nvbios_init *init)
> init_exec_inv(init);
>  }
>
> -/**
> +/*
>   * INIT_IO_FLAG_CONDITION - opcode 0x39
> - *
>   */
>  static void
>  init_io_flag_condition(struct nvbios_init *init)
> @@ -788,9 +779,8 @@ init_io_flag_condition(struct nvbios_init *init)
> init_exec_set(init, false);
>  }
>
> -/**
> +/*
>   * INIT_GENERIC_CONDITION - opcode 0x3a
> - *
>   */
>  static void
>  init_generic_condition(struct nvbios_init *init)
> @@ -840,9 +830,8 @@ init_generic_condition(struct nvbios_init *init)
> }
>  }
>
> -/**
> +/*
>   * INIT_IO_MASK_OR - opcode 0x3b
> - *
>   */
>  static void
>  init_io_mask_or(struct nvbios_init *init)
> @@ -859,9 +848,8 @@ init_io_mask_or(struct nvbios_init *init)
> init_wrvgai(init, 0x03d4, index, data &= ~(1 << or));
>  }
>
> -/**
> +/*
>   * INIT_IO_OR - opcode 0x3c
> - *
>   */
>  static void
>  init_io_or(struct nvbios_init *init)
> @@ -878,9 +866,8 @@ init_i

Re: [Nouveau] nouveau regression post v5.8, still present in v5.10

2021-01-21 Thread Karol Herbst
fyi, there is a patch which solves a maybe related issue on your GPU,
mind giving it a try before we dig further?
https://gitlab.freedesktop.org/drm/nouveau/-/issues/14#note_767791

On Thu, Jan 21, 2021 at 3:33 AM Jamie Heilman
 wrote:
>
> Karol Herbst wrote:
> > On Wed, Jan 6, 2021 at 4:25 AM Jamie Heilman
> >  wrote:
> > >
> > > Jamie Heilman wrote:
> > > > Jamie Heilman wrote:
> > > > > Karol Herbst wrote:
> > > > > > do you think you'd be able to do a kernel bisect in order to 
> > > > > > pinpoint
> > > > > > the actual commit causing it? Thanks
> > > > >
> > > > > No.  I can't reproduce it reliably.  I if I could, bisection wouldn't
> > > > > be a problem but as I can't and as it can take weeks for the problem
> > > > > to occur there's essentially no chance.  I know it regressed roughly
> > > > > in 5.8-rc1 only because that's what I was running when the first event
> > > > > occured.
> > > >
> > > > er, 5.9.0-rc1 rather
> > >
> > > Actually ... I've found a way to reproduce this in hours intead of
> > > weeks, so I think I may be able to bisect it after all, it's something
> > > of a brute force approach and its probably doing horrible things to
> > > the backlight in my poor old monitor, but just running this:
> > >
> > > #!/bin/sh
> > > sleep 5
> > > while ! dmesg | tail | grep -q nouveau
> > > do
> > > xset dpms force off
> > > sleep 65
> > > xdotool mousemove 1024 1024 mousemove restore
> > > sleep 10
> > > done
> > >
> > > Does manage to trip the issue sooner than it would otherwise happen
> > > with natural usage.  Given that this is my primary workstation and I
> > > sort of need it functional during waking hours, it'll take me a bit,
> > > but I'll update folks when I have the error more dialed in.
> > >
> >
> > huh interesting. Kind of feels like a random thing still. But I think
> > in general you'd spend way too much time on this if you can't
> > reproduce within seconds/minutes and then it might not point out the
> > actual issue, because randomly the issue didn't appear and stuff.
> >
> > maybe you can tune it to have shorter pauses or something? I'd really
> > try to bring down the time per cycle.
>
> Well I'm confident enough, at this point, to say this bisects to
> 0a96099691c8 ("drm/nouveau/kms/nv50-: implement proper push buffer control 
> logic")
>
> Now... I wish I could say the bisection was straightforward and
> simple, but it wasn't thanks to still not having a reproducer really
> dialed in.  The above script doesn't work unless I've got some normal
> usage around it.  It certainly triggers the issue sooner than it
> otherwise would, but by itself it isn't enough.  I modified it
> somewhat in the hopes of capturing the rough idea of how many
> itterations it would take to trigger the problem by using:
>
> #!/bin/sh
> I=0
> trap 'echo;echo $I' 0
> trap 'exit' INT
> sleep 5
> while ! dmesg | tail | grep -q nouveau
> do
> I=$(($I + 1))
> xset dpms force off
> sleep 32
> xdotool mousemove 1 12 mousemove restore
> sleep 28
> done
>
> but ultimately that didn't really pan out the way I'd hoped.  I don't
> think the itterations have all that much to do with the condition in
> the end.  I wanted to try applying ca386aa7155a
> ("drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug") on
> top of the first bad commit becuase I sort of felt like when I was
> running -rc versions that things got a bit less chaotic after that
> commit landed, but it was just a gut feeling and I wanted to see if I
> could support it with metrics---but no, I can't really get consistent
> metrics even without that commit so I gave up, and decided to report
> what I've got so far.
>
>
>
> > > I'm using git bisect start -- drivers/gpu/drm include/drm include/video
> > > in an effort to make this go a bit quicker, let me know if you think
> > > that's a bad idea or I should add other paths.
> > >
> > > > > > On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman
> > > > > >  wrote:
> > > > > > >
> > > > > > > Something between v5.8 and v5.9 has resulted in periodically 
> > > > > > > losing video.
> > > > > > > Unfortunately, I can't reliably reproduce it, it seems to happen 
> > > > > > > every
> 

Re: [Nouveau] [PATCH] tracing: remove definition of DEBUG

2021-01-18 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Fri, Jan 15, 2021 at 11:51 PM  wrote:
>
> From: Tom Rix 
>
> Defining DEBUG should only be done in development.
> So remove DEBUG.
>
> Signed-off-by: Tom Rix 
> ---
>  kernel/trace/trace_mmiotrace.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/kernel/trace/trace_mmiotrace.c b/kernel/trace/trace_mmiotrace.c
> index 84582bf1ed5f..2c3c31791497 100644
> --- a/kernel/trace/trace_mmiotrace.c
> +++ b/kernel/trace/trace_mmiotrace.c
> @@ -5,8 +5,6 @@
>   * Copyright (C) 2008 Pekka Paalanen 
>   */
>
> -#define DEBUG 1
> -
>  #include 
>  #include 
>  #include 
> --
> 2.27.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



[PATCH] drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices

2021-01-18 Thread Karol Herbst
Fixes a crash when trying to create a channel on e.g. Turing GPUs when
NOUVEAU_SVM_INIT was called before.

Fixes: eeaf06ac1a558 ("drm/nouveau/svm: initial support for shared virtual 
memory")
Signed-off-by: Karol Herbst 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 4f69e4c3dafd..1c3f890377d2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -315,6 +315,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
struct drm_nouveau_svm_init *args = data;
int ret;
 
+   /* We need to fail if svm is disabled */
+   if (!cli->drm->svm)
+   return -ENOSYS;
+
/* Allocate tracking for SVM-enabled VMM. */
if (!(svmm = kzalloc(sizeof(*svmm), GFP_KERNEL)))
return -ENOMEM;
-- 
2.29.2



[PATCH] drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices

2021-01-18 Thread Karol Herbst
Fixes a crash when trying to create a channel on e.g. Turing GPUs when
NOUVEAU_SVM_INIT was called before.

Fixes: eeaf06ac1a558 ("drm/nouveau/svm: initial support for shared virtual 
memory")
Signed-off-by: Karol Herbst 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 4f69e4c3dafd..1c3f890377d2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -315,6 +315,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
struct drm_nouveau_svm_init *args = data;
int ret;
 
+   /* We need to fail if svm is disabled */
+   if (!cli->drm->svm)
+   return -ENOSYS;
+
/* Allocate tracking for SVM-enabled VMM. */
if (!(svmm = kzalloc(sizeof(*svmm), GFP_KERNEL)))
return -ENOMEM;
-- 
2.29.2



Re: [Nouveau] nouveau regression post v5.8, still present in v5.10

2021-01-06 Thread Karol Herbst
On Wed, Jan 6, 2021 at 4:25 AM Jamie Heilman
 wrote:
>
> Jamie Heilman wrote:
> > Jamie Heilman wrote:
> > > Karol Herbst wrote:
> > > > do you think you'd be able to do a kernel bisect in order to pinpoint
> > > > the actual commit causing it? Thanks
> > >
> > > No.  I can't reproduce it reliably.  I if I could, bisection wouldn't
> > > be a problem but as I can't and as it can take weeks for the problem
> > > to occur there's essentially no chance.  I know it regressed roughly
> > > in 5.8-rc1 only because that's what I was running when the first event
> > > occured.
> >
> > er, 5.9.0-rc1 rather
>
> Actually ... I've found a way to reproduce this in hours intead of
> weeks, so I think I may be able to bisect it after all, it's something
> of a brute force approach and its probably doing horrible things to
> the backlight in my poor old monitor, but just running this:
>
> #!/bin/sh
> sleep 5
> while ! dmesg | tail | grep -q nouveau
> do
> xset dpms force off
> sleep 65
> xdotool mousemove 1024 1024 mousemove restore
> sleep 10
> done
>
> Does manage to trip the issue sooner than it would otherwise happen
> with natural usage.  Given that this is my primary workstation and I
> sort of need it functional during waking hours, it'll take me a bit,
> but I'll update folks when I have the error more dialed in.
>

huh interesting. Kind of feels like a random thing still. But I think
in general you'd spend way too much time on this if you can't
reproduce within seconds/minutes and then it might not point out the
actual issue, because randomly the issue didn't appear and stuff.

maybe you can tune it to have shorter pauses or something? I'd really
try to bring down the time per cycle.

> I'm using git bisect start -- drivers/gpu/drm include/drm include/video
> in an effort to make this go a bit quicker, let me know if you think
> that's a bad idea or I should add other paths.
>
> > > > On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman
> > > >  wrote:
> > > > >
> > > > > Something between v5.8 and v5.9 has resulted in periodically losing 
> > > > > video.
> > > > > Unfortunately, I can't reliably reproduce it, it seems to happen every
> > > > > once in a long while---I can go weeks without an occurance, but it
> > > > > always seems to happen after my workstation has been idle long enough
> > > > > to screen blank and put the monitor to sleep.  I'm using a single
> > > > > display (Dell 2405FPW) connected via DVI, running X (Xorg 1.20.x from
> > > > > Debian sid).  I don't really do anything fancy, xterms, a browser or
> > > > > two, play the occasional video, but like I said, I can't reliably
> > > > > reproduce this.  I've had it happen about 11 times since August.
> > > > >
> > > > > lspci -vv output is:
> > > > >
> > > > > 01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 
> > > > > 290] (rev a1) (prog-if 00 [VGA controller])
> > > > > Subsystem: NVIDIA Corporation G86 [Quadro NVS 290]
> > > > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> > > > > ParErr- Stepping- SERR- FastB2B- DisINTx+
> > > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast 
> > > > > >TAbort- SERR-  > > > > Latency: 0, Cache Line Size: 64 bytes
> > > > > Interrupt: pin A routed to IRQ 28
> > > > > Region 0: Memory at fc00 (32-bit, non-prefetchable) 
> > > > > [size=16M]
> > > > > Region 1: Memory at d000 (64-bit, prefetchable) 
> > > > > [size=256M]
> > > > > Region 3: Memory at fa00 (64-bit, non-prefetchable) 
> > > > > [size=32M]
> > > > > Region 5: I/O ports at dc80 [size=128]
> > > > > Expansion ROM at 000c [disabled] [size=128K]
> > > > > Capabilities: [60] Power Management version 2
> > > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> > > > > PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > > > > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> > > > > Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
> > > > > Address: fee01004  Data: 4023
> > > > > Capabilities: [78] Express (v1) Endpoint, MSI 00
> > > > >  

Re: [Nouveau] nouveau regression post v5.8, still present in v5.10

2021-01-04 Thread Karol Herbst
do you think you'd be able to do a kernel bisect in order to pinpoint
the actual commit causing it? Thanks

On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman
 wrote:
>
> Something between v5.8 and v5.9 has resulted in periodically losing video.
> Unfortunately, I can't reliably reproduce it, it seems to happen every
> once in a long while---I can go weeks without an occurance, but it
> always seems to happen after my workstation has been idle long enough
> to screen blank and put the monitor to sleep.  I'm using a single
> display (Dell 2405FPW) connected via DVI, running X (Xorg 1.20.x from
> Debian sid).  I don't really do anything fancy, xterms, a browser or
> two, play the occasional video, but like I said, I can't reliably
> reproduce this.  I've had it happen about 11 times since August.
>
> lspci -vv output is:
>
> 01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] 
> (rev a1) (prog-if 00 [VGA controller])
> Subsystem: NVIDIA Corporation G86 [Quadro NVS 290]
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> SERR-  Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 28
> Region 0: Memory at fc00 (32-bit, non-prefetchable) [size=16M]
> Region 1: Memory at d000 (64-bit, prefetchable) [size=256M]
> Region 3: Memory at fa00 (64-bit, non-prefetchable) [size=32M]
> Region 5: I/O ports at dc80 [size=128]
> Expansion ROM at 000c [disabled] [size=128K]
> Capabilities: [60] Power Management version 2
> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Address: fee01004  Data: 4023
> Capabilities: [78] Express (v1) Endpoint, MSI 00
> DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s 
> <512ns, L1 <4us
> ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- 
> SlotPowerLimit 25.000W
> DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq-
> RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- 
> TransPend-
> LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit 
> Latency L0s <512ns, L1 <4us
> ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
> ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 2.5GT/s (ok), Width x16 (ok)
> TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> Capabilities: [100 v1] Virtual Channel
> Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
> Arb:Fixed- WRR32- WRR64- WRR128-
> Ctrl:   ArbSelect=Fixed
> Status: InProgress-
> VC0:Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> Arb:Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> Status: NegoPending- InProgress-
> Capabilities: [128 v1] Power Budgeting 
> Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 
> Len=024 
> Kernel driver in use: nouveau
>
> The last time this happened, this is what got logged:
>
> nouveau :01:00.0: disp: ERROR 5 [INVALID_STATE] 06 [] chid 1 mthd 0080 
> data 0001
> nouveau :01:00.0: disp: Base 1:
> nouveau :01:00.0: disp:0084: 
> nouveau :01:00.0: disp:0088: 
> nouveau :01:00.0: disp:008c: 
> nouveau :01:00.0: disp:0090: 
> nouveau :01:00.0: disp:0094: 
> nouveau :01:00.0: disp:00a0: 0060 -> 0070
> nouveau :01:00.0: disp:00a4:  -> f000
> nouveau :01:00.0: disp:00c0: 
> nouveau :01:00.0: disp:00c4: 
> nouveau :01:00.0: disp:00c8: 
> nouveau :01:00.0: disp:00cc: 
> nouveau :01:00.0: disp:00e0: 4000
> nouveau :01:00.0: disp:00e4: 
> nouveau :01:00.0: disp:00e8: 
> nouveau :01:00.0: disp:00ec: 
> nouveau :01:00.0: disp:00fc: 
> nouveau :01:00.0: disp:0100: fffe
> nouveau :01:00.0: disp:0104: 
> nouveau :01:00.0: disp:0110: 
> nouveau :01:00.0: disp:0114: 
> nouveau :01:00.0: disp: 

Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails

2021-01-04 Thread Karol Herbst
On Tue, Dec 22, 2020 at 3:50 AM Kai-Heng Feng
 wrote:
>
> On Tue, Dec 22, 2020 at 1:56 AM Ilia Mirkin  wrote:
> >
> > On Mon, Dec 21, 2020 at 11:33 AM Kai-Heng Feng
> >  wrote:
> > >
> > > [+Cc nouveau]
> > >
> > > On Fri, Dec 18, 2020 at 4:06 PM Takashi Iwai  wrote:
> > > [snip]
> > > > > Quite possibly the system doesn't power up HDA controller when there's
> > > > > no external monitor.
> > > > > So when it's connected to external monitor, it's still needed for 
> > > > > HDMI audio.
> > > > > Let me ask the user to confirm this.
> > > >
> > > > Yeah, it's the basic question whether the HD-audio is supposed to work
> > > > on this machine at all.  If yes, the current approach we take makes
> > > > less sense - instead we should rather make the HD-audio controller
> > > > working.
> > >
> > > Yea, confirmed that the Nvidia HDA works when HDMI is connected prior 
> > > boot.
> > >
> > > > > > - The second problem is that pci_enable_device() ignores the error
> > > > > >   returned from pci_set_power_state() if it's -EIO.  And the
> > > > > >   inaccessible access error returns -EIO, although it's rather a 
> > > > > > fatal
> > > > > >   problem.  So the driver believes as the PCI device gets enabled
> > > > > >   properly.
> > > > >
> > > > > This was introduced in 2005, by Alan's 11f3859b1e85 ("[PATCH] PCI: Fix
> > > > > regression in pci_enable_device_bars") to fix UHCI controller.
> > > > >
> > > > > >
> > > > > > - The third problem is that HD-audio driver blindly believes the
> > > > > >   codec_mask read from the register even if it's a read failure as I
> > > > > >   already showed.
> > > > >
> > > > > This approach has least regression risk.
> > > >
> > > > Yes, but it assumes that HD-audio is really non-existent.
> > >
> > > I really don't know any good approach to address this.
> > > On Windows, HDA PCI is "hidden" until HDMI cable is plugged, then the
> > > driver will flag the magic bit to make HDA audio appear on the PCI
> > > bus.
> > > IIRC the current approach is to make nouveau and device link work.
> >
> > I don't have the full context of this discussion, but the kernel
> > force-enables the HDA subfunction nowadays, irrespective of nouveau or
> > nvidia or whatever:
>
> That's the problem.
>
> The nvidia HDA controller on the affected system only gets its power
> after HDMI cable plugged, so the probe on boot fails.
>

it might be that the code to enable the sub function is a bit broken
:/ but it should work. Maybe the quirk_nvidia_hda function needs to be
called on more occasions? No idea.

> Kai-Heng
>
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/quirks.c?h=v5.10#n5267
> >
> > Cheers,
> >
> >   -ilia
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: [PATCH 000/141] Fix fall-through warnings for Clang

2020-11-26 Thread Karol Herbst
On Thu, Nov 26, 2020 at 4:28 PM Geert Uytterhoeven  wrote:
>
> Hi Miguel,
>
> On Thu, Nov 26, 2020 at 3:54 PM Miguel Ojeda
>  wrote:
> > On Wed, Nov 25, 2020 at 11:44 PM Edward Cree  wrote:
> > > To make the intent clear, you have to first be certain that you
> > >  understand the intent; otherwise by adding either a break or a
> > >  fallthrough to suppress the warning you are just destroying the
> > >  information that "the intent of this code is unknown".
> >
> > If you don't know what the intent of your own code is, then you
> > *already* have a problem in your hands.
>
> The maintainer is not necessarily the owner/author of the code, and
> thus may not know the intent of the code.
>
> > > or does it flag up code
> > >  that can be mindlessly "fixed" (in which case the warning is
> > >  worthless)?  Proponents in this thread seem to be trying to
> > >  have it both ways.
> >
> > A warning is not worthless just because you can mindlessly fix it.
> > There are many counterexamples, e.g. many
> > checkpatch/lint/lang-format/indentation warnings, functional ones like
> > the `if (a = b)` warning...
>
> BTW, you cannot mindlessly fix the latter, as you cannot know if
> "(a == b)" or "((a = b))" was intended, without understanding the code
> (and the (possibly unavailable) data sheet, and the hardware, ...).
>

to allow assignments in if statements was clearly a mistake and if you
need outside information to understand the code, your code is the
issue already.

> P.S. So far I've stayed out of this thread, as I like it if the compiler
>  flags possible mistakes.  After all I was the one fixing new
>  "may be used uninitialized" warnings thrown up by gcc-4.1, until
>  (a bit later than) support for that compiler was removed...
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>



Re: [Nouveau] [PATCH 2/3] drm/nouveau: manage nouveau_drm lifetime with devres

2020-11-06 Thread Karol Herbst
On Fri, Nov 6, 2020 at 3:17 AM Jeremy Cline  wrote:
>
> Make use of the devm_drm_dev_alloc() API to bind the lifetime of
> nouveau_drm structure to the drm_device. This is important because a
> reference to nouveau_drm is accessible from drm_device, which is
> provided to a number of DRM layer callbacks that can run after the
> deallocation of nouveau_drm currently occurs.
>
> Signed-off-by: Jeremy Cline 
> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 44 ---
>  drivers/gpu/drm/nouveau/nouveau_drv.h | 10 --
>  2 files changed, 26 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index bc6f51bf23b7..f750c25e92f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -30,9 +30,11 @@
>  #include 
>  #include 
>
> +#include 
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -532,13 +534,8 @@ nouveau_parent = {
>  static int
>  nouveau_drm_device_init(struct drm_device *dev)
>  {
> -   struct nouveau_drm *drm;
> int ret;
> -
> -   if (!(drm = kzalloc(sizeof(*drm), GFP_KERNEL)))
> -   return -ENOMEM;
> -   dev->dev_private = drm;
> -   drm->dev = dev;
> +   struct nouveau_drm *drm = nouveau_drm(dev);
>
> nvif_parent_ctor(_parent, >parent);
> drm->master.base.object.parent = >parent;
> @@ -620,7 +617,6 @@ nouveau_drm_device_init(struct drm_device *dev)
> nouveau_cli_fini(>master);
>  fail_alloc:
> nvif_parent_dtor(>parent);
> -   kfree(drm);
> return ret;
>  }
>
> @@ -654,7 +650,6 @@ nouveau_drm_device_fini(struct drm_device *dev)
> nouveau_cli_fini(>client);
> nouveau_cli_fini(>master);
> nvif_parent_dtor(>parent);
> -   kfree(drm);
>  }
>
>  /*
> @@ -720,6 +715,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>  {
> struct nvkm_device *device;
> struct drm_device *drm_dev;
> +   struct nouveau_drm *nv_dev;
> int ret;
>
> if (vga_switcheroo_client_probe_defer(pdev))
> @@ -750,15 +746,16 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
> if (nouveau_atomic)
> driver_pci.driver_features |= DRIVER_ATOMIC;
>
> -   drm_dev = drm_dev_alloc(_pci, >dev);
> -   if (IS_ERR(drm_dev)) {
> -   ret = PTR_ERR(drm_dev);
> +   nv_dev = devm_drm_dev_alloc(>dev, _stub, 
> typeof(*nv_dev), drm_dev);
> +   if (IS_ERR(nv_dev)) {
> +   ret = PTR_ERR(nv_dev);
> goto fail_nvkm;
> }
> +   drm_dev = nouveau_to_drm_dev(nv_dev);
>
> ret = pci_enable_device(pdev);
> if (ret)
> -   goto fail_drm;
> +   goto fail_nvkm;
>
> drm_dev->pdev = pdev;
> pci_set_drvdata(pdev, drm_dev);
> @@ -778,8 +775,6 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
> nouveau_drm_device_fini(drm_dev);
>  fail_pci:
> pci_disable_device(pdev);
> -fail_drm:
> -   drm_dev_put(drm_dev);

it sounded like that when using devm_drm_dev_alloc we still have an
initial refcount of 1, so at least in this regard nothing changed so I
am wondering why this change is necessary and if the reason is
unrelated it might make sense to move it into its own patch.

>  fail_nvkm:
> nvkm_device_del();
> return ret;
> @@ -799,7 +794,6 @@ nouveau_drm_device_remove(struct drm_device *dev)
> device = nvkm_device_find(client->device);
>
> nouveau_drm_device_fini(dev);
> -   drm_dev_put(dev);
> nvkm_device_del();
>  }
>
> @@ -1285,7 +1279,8 @@ nouveau_platform_device_create(const struct 
> nvkm_device_tegra_func *func,
>struct platform_device *pdev,
>struct nvkm_device **pdevice)
>  {
> -   struct drm_device *drm;
> +   struct nouveau_drm *nv_dev;
> +   struct drm_device *drm_dev;
> int err;
>
> err = nvkm_device_tegra_new(func, pdev, nouveau_config, nouveau_debug,
> @@ -1293,22 +1288,21 @@ nouveau_platform_device_create(const struct 
> nvkm_device_tegra_func *func,
> if (err)
> goto err_free;
>
> -   drm = drm_dev_alloc(_platform, >dev);
> -   if (IS_ERR(drm)) {
> -   err = PTR_ERR(drm);
> +   nv_dev = devm_drm_dev_alloc(>dev, _platform, 
> typeof(*nv_dev), drm_dev);
> +   if (IS_ERR(nv_dev)) {
> +   err = PTR_ERR(nv_dev);
> goto err_free;
> }
> +   drm_dev = nouveau_to_drm_dev(nv_dev);
>
> -   err = nouveau_drm_device_init(drm);
> +   err = nouveau_drm_device_init(drm_dev);
> if (err)
> -   goto err_put;
> +   goto err_free;
>
> -   platform_set_drvdata(pdev, drm);
> +   platform_set_drvdata(pdev, drm_dev);
>
> -   return drm;
> +   return drm_dev;
>
> -err_put:
> -   drm_dev_put(drm);
>  

Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-28 Thread Karol Herbst
I sent a patch to the mailing list and wanted to have some review on
that from at least Ben, but no idea if Ben already picked it and if
it's good enough for sending it to stable yet.



Re: [Nouveau] [PATCH RFC] drm/nouveau: fix memory leak in nvkm_iccsense_oneinit

2020-10-15 Thread Karol Herbst
On Thu, Oct 15, 2020 at 6:32 PM Karol Herbst  wrote:
>
> Ben, I think this is like the 5th patch tackling this issue, I think
> we should merge one of those.
>

maybe I just confused that with reports, but it seems to turn up quite
a bit and maybe I should have pushed more of it as well...

Anyway, the patch itself looks good.

Reviewed-by: Karol Herbst 

> On Thu, Oct 15, 2020 at 6:23 AM Keita Suzuki
>  wrote:
> >
> > struct pw_rail_t is allocated as an array in function nvios_iccsense_parse,
> > and stored to a struct member of local variable. However, the array is not
> > freed when the local variable becomes invalid, and the reference is not
> > passed on, leading to a memory leak.
> >
> > Fix this by freeing struct pw_rail_t when exiting nvkm_iccsense_oneinit.
> >
> > Signed-off-by: Keita Suzuki 
> > ---
> >  drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c 
> > b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> > index fecfa6afcf54..8ba8d8e3f52a 100644
> > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> > @@ -280,8 +280,10 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev)
> > }
> >
> > rail = kmalloc(sizeof(*rail), GFP_KERNEL);
> > -   if (!rail)
> > +   if (!rail) {
> > +   kfree(stbl.rail);
> > return -ENOMEM;
> > +   }
> >
> > rail->read = read;
> > rail->sensor = sensor;
> > @@ -291,6 +293,7 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev)
> > list_add_tail(>head, >rails);
> > }
> > }
> > +   kfree(stbl.rail);
> > return 0;
> >  }
> >
> > --
> > 2.17.1
> >
> > ___
> > Nouveau mailing list
> > nouv...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/nouveau
> >



Re: [Nouveau] [PATCH RFC] drm/nouveau: fix memory leak in nvkm_iccsense_oneinit

2020-10-15 Thread Karol Herbst
Ben, I think this is like the 5th patch tackling this issue, I think
we should merge one of those.

On Thu, Oct 15, 2020 at 6:23 AM Keita Suzuki
 wrote:
>
> struct pw_rail_t is allocated as an array in function nvios_iccsense_parse,
> and stored to a struct member of local variable. However, the array is not
> freed when the local variable becomes invalid, and the reference is not
> passed on, leading to a memory leak.
>
> Fix this by freeing struct pw_rail_t when exiting nvkm_iccsense_oneinit.
>
> Signed-off-by: Keita Suzuki 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> index fecfa6afcf54..8ba8d8e3f52a 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c
> @@ -280,8 +280,10 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev)
> }
>
> rail = kmalloc(sizeof(*rail), GFP_KERNEL);
> -   if (!rail)
> +   if (!rail) {
> +   kfree(stbl.rail);
> return -ENOMEM;
> +   }
>
> rail->read = read;
> rail->sensor = sensor;
> @@ -291,6 +293,7 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev)
> list_add_tail(>head, >rails);
> }
> }
> +   kfree(stbl.rail);
> return 0;
>  }
>
> --
> 2.17.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-09 Thread Karol Herbst
On Sat, Oct 10, 2020 at 12:23 AM Ilia Mirkin  wrote:
>
> On Fri, Oct 9, 2020 at 5:54 PM Karol Herbst  wrote:
> >
> > On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary  wrote:
> > >
> > > Hello,
> > > I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working:
> > > [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) 
> > > 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 
> > > 22:31:40 CEST 2020
> > > ...
> > > [   14.771464] nouveau :01:00.0: GPU not supported on big-endian
> > > [   14.771782] nouveau: probe of :01:00.0 failed with error -38
> > >
> > > big-endian? WTF? The machine is x86.
> > >
> >
> > mhh, we reworked the endianess checks a bit and apparently that broke
> > something... I will give it some thoughts, but could you be so kind
> > and create an mmiotrace under 5.9 with nouveau? You won't need to
> > start X or anything while doing it. Just enable the trace and modprobe
> > nouveau and collect the trace.
>
> Looks like nvkm_device_endianness unconditionally reads out 0x4. I
> don't think that reg is there pre-NV11. At least NV4, NV5, NV10 and
> maybe NV15 (which is logically pre-NV11) don't support big-endian
> mode. Not sure about NV1A, which was the IGP of the series and IIRC
> logically pre-NV11 as well (but clearly could only be used with x86
> chips, since it was part of the motherboard).
>
> Aha, it's documented in rnndb:
>
> https://github.com/envytools/envytools/blob/master/rnndb/bus/pmc.xml
> 
>

ohh, I should have checked there.. yeah, will write a fix for it then.
Before my patch we just always tried to switch it, but never threw an
error.

>   -ilia
>



Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian

2020-10-09 Thread Karol Herbst
On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary  wrote:
>
> Hello,
> I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working:
> [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) 
> 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 22:31:40 
> CEST 2020
> ...
> [   14.771464] nouveau :01:00.0: GPU not supported on big-endian
> [   14.771782] nouveau: probe of :01:00.0 failed with error -38
>
> big-endian? WTF? The machine is x86.
>

mhh, we reworked the endianess checks a bit and apparently that broke
something... I will give it some thoughts, but could you be so kind
and create an mmiotrace under 5.9 with nouveau? You won't need to
start X or anything while doing it. Just enable the trace and modprobe
nouveau and collect the trace.

> It works fine with Debian 5.7 kernel (5.7.10-1~bpo10+1):
> [0.00] Linux version 5.7.0-0.bpo.2-686 
> (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6), GNU ld 
> (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.7.10-1~bpo10+1 (2020-07-30)
> ...
> [   23.266196] nouveau :01:00.0: NVIDIA NV05 (20154000)
> [   23.288582] nouveau :01:00.0: bios: version 02.05.20.02.00
> [   23.288869] nouveau :01:00.0: bios: DCB table not found
> [   23.289595] nouveau :01:00.0: bios: DCB table not found
> [   23.289956] nouveau :01:00.0: bios: DCB table not found
> [   23.290015] nouveau :01:00.0: bios: DCB table not found
> [   23.290215] agpgart-intel :00:00.0: AGP 3.0 bridge
> [   23.290287] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
> back to 2.x
> [   23.290351] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
> [   23.290430] nouveau :01:00.0: putting AGP V2 device into 4x mode
> [   23.290565] agpgart-intel :00:00.0: AGP 3.0 bridge
> [   23.290627] agpgart-intel :00:00.0: bridge is in legacy mode, falling 
> back to 2.x
> [   23.290690] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode
> [   23.290768] nouveau :01:00.0: putting AGP V2 device into 4x mode
> [   23.290830] nouveau :01:00.0: tmr: unknown input clock freq
> [   23.293026] nouveau :01:00.0: fb: 32 MiB SDRAM
> [   23.301269] [TTM] Zone  kernel: Available graphics memory: 382728 KiB
> [   23.301327] [TTM] Initializing pool allocator
> [   23.301414] nouveau :01:00.0: DRM: VRAM: 31 MiB
> [   23.301465] nouveau :01:00.0: DRM: GART: 128 MiB
> [   23.301518] nouveau :01:00.0: DRM: BMP version 5.6
> [   23.301570] nouveau :01:00.0: DRM: No DCB data found in VBIOS
> [   23.303594] nouveau :01:00.0: DRM: MM: using M2MF for buffer copies
> [   23.303719] nouveau :01:00.0: bios: DCB table not found
> [   23.304904] nouveau :01:00.0: DRM: Saving VGA fonts
> [   23.349089] nouveau :01:00.0: DRM: No DCB data found in VBIOS
> [   23.349681] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   23.383066] nouveau :01:00.0: DRM: allocated 1280x1024 fb: 0x4000, bo 
> b10d2f17
> [   23.413903] fbcon: nouveaudrmfb (fb0) is primary device
> [   23.569851] Console: switching to colour frame buffer device 160x64
> [   23.571050] nouveau :01:00.0: fb0: nouveaudrmfb frame buffer device
>
>
> --
> Ondrej Zary
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



[PATCH 2/2] drm/nouveau/mem: guard against NULL pointer access in mem_del

2020-10-06 Thread Karol Herbst
other drivers seems to do something similar

Signed-off-by: Karol Herbst 
Cc: dri-devel 
Cc: Dave Airlie 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/nouveau/nouveau_mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c 
b/drivers/gpu/drm/nouveau/nouveau_mem.c
index b1bb542d31158..e5fae57fffbd1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.c
@@ -176,6 +176,8 @@ void
 nouveau_mem_del(struct ttm_mem_reg *reg)
 {
struct nouveau_mem *mem = nouveau_mem(reg);
+   if (!mem)
+   return;
nouveau_mem_fini(mem);
kfree(reg->mm_node);
reg->mm_node = NULL;
-- 
2.26.2



[PATCH 1/2] drm/nouveau/device: return error for unknown chipsets

2020-10-06 Thread Karol Herbst
Previously the code relied on device->pri to be NULL and to fail probing
later. We really should just return an error inside nvkm_device_ctor for
unsupported GPUs.

Fixes: 24d5ff40a732 ("drm/nouveau/device: rework mmio mapping code to get rid 
of second map")

Signed-off-by: Karol Herbst 
Cc: dann frazier 
Cc: dri-devel 
Cc: Dave Airlie 
Cc: sta...@vger.kernel.org
Reviewed-by: Jeremy Cline 
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c 
b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index 9f4ac2672cf2e..dcb70677d0acc 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -3149,6 +3149,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
case 0x168: device->chip = _chipset; break;
default:
nvdev_error(device, "unknown chipset (%08x)\n", boot0);
+   ret = -ENODEV;
goto done;
}
 
-- 
2.26.2



Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter

2020-09-17 Thread Karol Herbst
On Thu, Sep 17, 2020 at 4:11 PM Jeremy Cline  wrote:
>
> On Wed, Sep 16, 2020 at 10:03:22PM +0200, Karol Herbst wrote:
> > On Wed, Sep 16, 2020 at 10:01 PM Karol Herbst  wrote:
> > >
> > > On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline  wrote:
> > > >
> > > > The temp_get() function currently returns negative error numbers or a
> > > > temperature. However, the thermal sensors can (in theory) measure
> > > > negative temperatures. Some implementations of temp_get() correctly
> > > > clamp negative temperature readings to 0 so that users do not mistake
> > > > them for errors, but some, like gp100_temp_get(), do not.
> > > >
> > > > Rather than relying on implementations remembering to clamp values,
> > > > dedicate the function return value to error codes and accept a pointer
> > > > to an integer where the temperature reading should be stored.
> > > >
> > > > Signed-off-by: Jeremy Cline 
> > > > ---
> > > >  .../drm/nouveau/include/nvkm/subdev/therm.h   |  2 +-
> > > >  drivers/gpu/drm/nouveau/nouveau_hwmon.c   | 12 ++--
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 19 ++-
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c   | 11 ++-
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++-
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c  |  9 +++--
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c  |  9 +++--
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  | 17 +++--
> > > >  .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c  | 12 
> > > >  9 files changed, 62 insertions(+), 40 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> > > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > > > index 62c34f98c930..bfe9779216fc 100644
> > > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > > > @@ -99,7 +99,7 @@ struct nvkm_therm {
> > > > bool clkgating_enabled;
> > > >  };
> > > >
> > > > -int nvkm_therm_temp_get(struct nvkm_therm *);
> > > > +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp);
> > > >  int nvkm_therm_fan_sense(struct nvkm_therm *);
> > > >  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
> > > >  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> > > > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > > > index 1c3104d20571..aff6aa296678 100644
> > > > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > > > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > > > @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, 
> > > > int channel)
> > > > struct nouveau_drm *drm = nouveau_drm((struct drm_device 
> > > > *)data);
> > > > struct nvkm_therm *therm = nvxx_therm(>client.device);
> > > >
> > > > -   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < 
> > > > 0)
> > > > +   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, 
> > > > NULL) < 0)
> > > > return 0;
> > > >
> > > > switch (attr) {
> > > > @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > > > channel, long *val)
> > > > struct drm_device *drm_dev = dev_get_drvdata(dev);
> > > > struct nouveau_drm *drm = nouveau_drm(drm_dev);
> > > > struct nvkm_therm *therm = nvxx_therm(>client.device);
> > > > -   int ret;
> > > > +   int ret = 0, temp;
> > > >
> > > > if (!therm || !therm->attr_get)
> > > > return -EOPNOTSUPP;
> > > > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > > > channel, long *val)
> > > > case hwmon_temp_input:
> > > > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> > > > return -EINVAL;
> > > > -   ret = nvkm_therm_temp_get(therm);
> > > > -   *val = ret < 0 ? ret : (ret * 1000);
> > > > +  

Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter

2020-09-16 Thread Karol Herbst
On Wed, Sep 16, 2020 at 10:01 PM Karol Herbst  wrote:
>
> On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline  wrote:
> >
> > The temp_get() function currently returns negative error numbers or a
> > temperature. However, the thermal sensors can (in theory) measure
> > negative temperatures. Some implementations of temp_get() correctly
> > clamp negative temperature readings to 0 so that users do not mistake
> > them for errors, but some, like gp100_temp_get(), do not.
> >
> > Rather than relying on implementations remembering to clamp values,
> > dedicate the function return value to error codes and accept a pointer
> > to an integer where the temperature reading should be stored.
> >
> > Signed-off-by: Jeremy Cline 
> > ---
> >  .../drm/nouveau/include/nvkm/subdev/therm.h   |  2 +-
> >  drivers/gpu/drm/nouveau/nouveau_hwmon.c   | 12 ++--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 19 ++-
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c   | 11 ++-
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++-
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c  |  9 +++--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c  |  9 +++--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  | 17 +++--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c  | 12 
> >  9 files changed, 62 insertions(+), 40 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > index 62c34f98c930..bfe9779216fc 100644
> > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > @@ -99,7 +99,7 @@ struct nvkm_therm {
> > bool clkgating_enabled;
> >  };
> >
> > -int nvkm_therm_temp_get(struct nvkm_therm *);
> > +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp);
> >  int nvkm_therm_fan_sense(struct nvkm_therm *);
> >  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
> >  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > index 1c3104d20571..aff6aa296678 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, int 
> > channel)
> > struct nouveau_drm *drm = nouveau_drm((struct drm_device *)data);
> > struct nvkm_therm *therm = nvxx_therm(>client.device);
> >
> > -   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < 0)
> > +   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, NULL) 
> > < 0)
> > return 0;
> >
> > switch (attr) {
> > @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > channel, long *val)
> > struct drm_device *drm_dev = dev_get_drvdata(dev);
> > struct nouveau_drm *drm = nouveau_drm(drm_dev);
> > struct nvkm_therm *therm = nvxx_therm(>client.device);
> > -   int ret;
> > +   int ret = 0, temp;
> >
> > if (!therm || !therm->attr_get)
> > return -EOPNOTSUPP;
> > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > channel, long *val)
> > case hwmon_temp_input:
> > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> > return -EINVAL;
> > -   ret = nvkm_therm_temp_get(therm);
> > -   *val = ret < 0 ? ret : (ret * 1000);
> > +   ret = nvkm_therm_temp_get(therm, );
> > +   *val = temp * 1000;
> > break;
> > case hwmon_temp_max:
> > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK)
> > @@ -459,7 +459,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > channel, long *val)
> > return -EOPNOTSUPP;
> > }
> >
> > -   return 0;
> > +   return ret;
> >  }
> >
> >  static int
> > @@ -735,7 +735,7 @@ nouveau_hwmon_init(struct drm_device *dev)
> > hwmon->dev = dev;
> >
> > if (therm && therm->attr_get && therm->attr_set) {
> > -   if (nvkm_therm_temp_get(therm) >= 0)
> > +   if (nvkm_therm_temp_get(therm, NULL) >= 0)
> > 

Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter

2020-09-16 Thread Karol Herbst
On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline  wrote:
>
> The temp_get() function currently returns negative error numbers or a
> temperature. However, the thermal sensors can (in theory) measure
> negative temperatures. Some implementations of temp_get() correctly
> clamp negative temperature readings to 0 so that users do not mistake
> them for errors, but some, like gp100_temp_get(), do not.
>
> Rather than relying on implementations remembering to clamp values,
> dedicate the function return value to error codes and accept a pointer
> to an integer where the temperature reading should be stored.
>
> Signed-off-by: Jeremy Cline 
> ---
>  .../drm/nouveau/include/nvkm/subdev/therm.h   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_hwmon.c   | 12 ++--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 19 ++-
>  .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c   | 11 ++-
>  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++-
>  .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c  |  9 +++--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c  |  9 +++--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  | 17 +++--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c  | 12 
>  9 files changed, 62 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index 62c34f98c930..bfe9779216fc 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -99,7 +99,7 @@ struct nvkm_therm {
> bool clkgating_enabled;
>  };
>
> -int nvkm_therm_temp_get(struct nvkm_therm *);
> +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp);
>  int nvkm_therm_fan_sense(struct nvkm_therm *);
>  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
>  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> index 1c3104d20571..aff6aa296678 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, int 
> channel)
> struct nouveau_drm *drm = nouveau_drm((struct drm_device *)data);
> struct nvkm_therm *therm = nvxx_therm(>client.device);
>
> -   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < 0)
> +   if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, NULL) < 
> 0)
> return 0;
>
> switch (attr) {
> @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> channel, long *val)
> struct drm_device *drm_dev = dev_get_drvdata(dev);
> struct nouveau_drm *drm = nouveau_drm(drm_dev);
> struct nvkm_therm *therm = nvxx_therm(>client.device);
> -   int ret;
> +   int ret = 0, temp;
>
> if (!therm || !therm->attr_get)
> return -EOPNOTSUPP;
> @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> channel, long *val)
> case hwmon_temp_input:
> if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> return -EINVAL;
> -   ret = nvkm_therm_temp_get(therm);
> -   *val = ret < 0 ? ret : (ret * 1000);
> +   ret = nvkm_therm_temp_get(therm, );
> +   *val = temp * 1000;
> break;
> case hwmon_temp_max:
> *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK)
> @@ -459,7 +459,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> channel, long *val)
> return -EOPNOTSUPP;
> }
>
> -   return 0;
> +   return ret;
>  }
>
>  static int
> @@ -735,7 +735,7 @@ nouveau_hwmon_init(struct drm_device *dev)
> hwmon->dev = dev;
>
> if (therm && therm->attr_get && therm->attr_set) {
> -   if (nvkm_therm_temp_get(therm) >= 0)
> +   if (nvkm_therm_temp_get(therm, NULL) >= 0)
> special_groups[i++] = _auto_point_sensor_group;
> if (therm->fan_get && therm->fan_get(therm) >= 0)
> special_groups[i++] = _fan_sensor_group;
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> index 4a4d1e224126..e7ffea1512e6 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> @@ -27,10 +27,15 @@
>  #include 
>
>  int
> -nvkm_therm_temp_get(struct nvkm_therm *therm)
> +nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp)
>  {
> +   int ignored_reading;
> +
> +   if (temp == NULL)
> +   temp = _reading;
> +
> if (therm->func->temp_get)
> -   return therm->func->temp_get(therm);
> +   return therm->func->temp_get(therm, temp);

Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting

2020-09-09 Thread Karol Herbst
On Wed, Sep 9, 2020 at 6:06 AM Ben Skeggs  wrote:
>
> On Thu, 13 Aug 2020 at 06:50, Jeremy Cline  wrote:
> >
> > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of
> > new gp1xx temperature sensor") added support for reading finer-grain
> > temperatures, but continued to report temperatures in 1 degree Celsius
> > increments via nvkm_therm_temp_get().
> >
> > Rather than altering nvkm_therm_temp_get() to report finer-grain
> > temperatures, which would be inconvenient for other users of the
> > function, a second interface has been added to line up with hwmon's
> > native unit of temperature.
> Hey Jeremy,
>
> Sorry this slipped past me until now.  I'm OK with adding support for
> millidegree temperature reporting, but don't think we need to keep
> both interfaces around and would rather see the existing code
> converted to return millidegrees (even on GPUs that don't support it)
> instead of degrees.
>
> Thanks!
> Ben.
>

just a note as I was trying something like that in the past: we have a
lot of code which fetches the temperature and there are a lot of
places where we would then do a divide by 1000, so having a wrapper
function at least makes sense. If we want to keep the func pointers?
well.. I guess the increased CPU overhead wouldn't be as bad if all
sub classes do this static * 1000 as well either. I just think we
should have both interfaces in subdev/therm.h so we can just keep most
of the current code as is.

> >
> > Signed-off-by: Jeremy Cline 
> > ---
> >  .../drm/nouveau/include/nvkm/subdev/therm.h   | 18 +
> >  drivers/gpu/drm/nouveau/nouveau_hwmon.c   |  4 +--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 16 
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +--
> >  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  |  1 +
> >  5 files changed, 60 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > index 62c34f98c930..7b9928dd001c 100644
> > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> > @@ -100,6 +100,24 @@ struct nvkm_therm {
> >  };
> >
> >  int nvkm_therm_temp_get(struct nvkm_therm *);
> > +
> > +/**
> > + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees
> > + * @therm: The thermal device to read from.
> > + *
> > + * This interface reports temperatures in units of millidegree Celsius to
> > + * align with the hwmon API. Some cards may only be capable of reporting in
> > + * units of Celsius, and those that report finer grain temperatures may 
> > not be
> > + * capable of millidegree Celsius accuracy,
> > + *
> > + * For cases where millidegree temperature is too fine-grain, the
> > + * nvkm_therm_temp_get() interface reports temperatures in one degree 
> > Celsius
> > + * increments.
> > + *
> > + * Return: The temperature in millidegrees Celsius, or -ENODEV if 
> > temperature
> > + * reporting is not supported.
> > + */
> > +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm);
> >  int nvkm_therm_fan_sense(struct nvkm_therm *);
> >  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
> >  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > index 1c3104d20571..e96355f93ce5 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> > channel, long *val)
> > case hwmon_temp_input:
> > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> > return -EINVAL;
> > -   ret = nvkm_therm_temp_get(therm);
> > -   *val = ret < 0 ? ret : (ret * 1000);
> > +   ret = nvkm_therm_temp_millidegree_get(therm);
> > +   *val = ret;
> > break;
> > case hwmon_temp_max:
> > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK)
> > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c 
> > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> > index 4a4d1e224126..e655b32c78b8 100644
> > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> > @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm)
> > return -ENODEV;
> >  }
> >
> > +int
> > +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm)
> > +{
> > +   int ret = -ENODEV;
> > +
> > +   if (therm->func->temp_millidegree_get)
> > +   return therm->func->temp_millidegree_get(therm);
> > +
> > +   if (therm->func->temp_get) {
> > +   ret = therm->func->temp_get(therm);
> > +   if (ret > 0)
> > +   ret *= 1000;
> > +   }

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Karol Herbst
On Mon, Sep 7, 2020 at 10:58 PM Marc MERLIN  wrote:
>
> On Mon, Sep 07, 2020 at 09:14:03PM +0200, Karol Herbst wrote:
> > > - changes in the nouveau driver. Mika told me the PCIe regression
> > >   "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed
> > >   to be fixed in 5.8, but I still get a 4mn hang or so during boot and
> > >   with 5.8, removing the USB key, didn't help make the boot faster
> >
> > that's the root port the GPU is attached to, no? I saw that message on
> > the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does
> > seem to come from the root port.
>
> Hi Karol, thanks for your answer.
>
> 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core 
> Processor PCIe Controller (x16) (rev 0d)
> 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 
> 4000 Mobile / Max-Q] (rev a1)
>
> > Well, you'd also need it when attaching external displays.
>
> Indeed. I just don't need that on this laptop, but familiar with the not
> so seemless procedure to turn on both GPUs, and mirror the intel one into
> the nvidia one for external output.
>
> > > [   11.262985] nvidia-gpu :01:00.3: PME# enabled
> > > [   11.303060] nvidia-gpu :01:00.3: PME# disabled
> >
> > mhh, interesting. I heard some random comments that the Nvidia
> > USB-C/UCSI driver is a bit broken and can cause various issues. Mind
> > blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't
> > get loaded) and see if that helps?
>
> Right, this one:
> 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C 
> UCSI Controller (rev a1)
> Sure, I'll blacklist it. Ok, just did that, removed from initrd,
> rebooted, and it was no better.
>
> From initrd (before root gets mounted), I have this:
> nouveau  1961984  0
> mxm_wmi16384  1 nouveau
> hwmon  32768  1 nouveau
> ttm   102400  1 nouveau
> wmi32768  2 nouveau,mxm_wmi
>
> I still got a 2mn hang. and a nouveau probe error
> [  189.124530] nouveau: probe of :01:00.0 failed with error -12
>
>
> Here's what it looks like:
> [9.693230] hid: raw HID events driver (C) Jiri Kosina
> [9.694988] usbcore: registered new interface driver usbhid
> [9.694989] usbhid: USB HID core driver
> [9.696700] hid-generic 0003:1050:0200.0001: hiddev0,hidraw0: USB HID 
> v1.00 Device [Yubico Yubico Gnubby (gnubby1)] on usb-:00:14.0-2/input0
> [9.784456] Console: switching to colour frame buffer device 240x67
> [9.816297] i915 :00:02.0: fb0: i915drmfb frame buffer device
> [   25.087400] thunderbolt :06:00.0: saving config space at offset 0x0 
> (reading 0x15eb8086)
> [   25.087414] thunderbolt :06:00.0: saving config space at offset 0x4 
> (reading 0x100406)
> [   25.087419] thunderbolt :06:00.0: saving config space at offset 0x8 
> (reading 0x886)
> [   25.087424] thunderbolt :06:00.0: saving config space at offset 0xc 
> (reading 0x20)
> [   25.087430] thunderbolt :06:00.0: saving config space at offset 0x10 
> (reading 0xcc10)
> [   25.087435] thunderbolt :06:00.0: saving config space at offset 0x14 
> (reading 0xcc14)
> [   25.087440] thunderbolt :06:00.0: saving config space at offset 0x18 
> (reading 0x0)
> [   25.087445] thunderbolt :06:00.0: saving config space at offset 0x1c 
> (reading 0x0)
> [   25.087450] thunderbolt :06:00.0: saving config space at offset 0x20 
> (reading 0x0)
> [   25.087455] thunderbolt :06:00.0: saving config space at offset 0x24 
> (reading 0x0)
> [   25.087460] thunderbolt :06:00.0: saving config space at offset 0x28 
> (reading 0x0)
> [   25.087466] thunderbolt :06:00.0: saving config space at offset 0x2c 
> (reading 0x229b17aa)
> [   25.087471] thunderbolt :06:00.0: saving config space at offset 0x30 
> (reading 0x0)
> [   25.087476] thunderbolt :06:00.0: saving config space at offset 0x34 
> (reading 0x80)
> [   25.087481] thunderbolt :06:00.0: saving config space at offset 0x38 
> (reading 0x0)
> [   25.087486] thunderbolt :06:00.0: saving config space at offset 0x3c 
> (reading 0x1ff)
> [   25.087571] thunderbolt :06:00.0: PME# enabled
> [   25.105353] pcieport :05:00.0: saving config space at offset 0x0 
> (reading 0x15ea8086)
> [   25.105364] pcieport :05:00.0: saving config space at offset 0x4 
> (reading 0x100407)
> [   25.105370] pcieport :05:00.0: saving config space at offset 0x8 
> (reading 0x6040006)
> [   25.105375] pcieport :05:00.0: saving config space at offset 0xc 
> (reading 0x100

Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)

2020-09-07 Thread Karol Herbst
On Sun, Sep 6, 2020 at 8:52 PM Marc MERLIN  wrote:
>
> Ok, I have an update to this problem. I added the nouveau list because
> I can't quite tell if the issue is:
> - the PCIe changes that went in 5.6 I think (or 5.5?), referenced below
>
> - a new issue with thunderbold on thinkpad P73, that seems to be
>   triggered if I have a USB-C yubikey in the port. With 5.7, my issues
>   went away if I removed the USB key during boot, showing an interaction
>   between nouveau and thunderbolt
>
> - changes in the nouveau driver. Mika told me the PCIe regression
>   "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed
>   to be fixed in 5.8, but I still get a 4mn hang or so during boot and
>   with 5.8, removing the USB key, didn't help make the boot faster
>

that's the root port the GPU is attached to, no? I saw that message on
the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does
seem to come from the root port.

> I don't otherwise use the nvidia chip I so wish I didn't have, I only
> use intel graphics on that laptop, but I must apparently use the nouveau
> driver to manage the nouveau chip so that it's turned off and not
> burning 60W doing nothing.
>

Well, you'd also need it when attaching external displays.

> lspci is in the quoted message below, I won't copy it here again, but
> here's the nvidia bit:
> 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX 
> 4000 Mobile / Max-Q] (rev a1)
> 01:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
> 01:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev 
> a1)
> 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C 
> UCSI Controller (rev a1)
>
> Here are 5 boots, 4 on 5.8.5:
>
> dmesg.1_hang_but_no_warning.txt https://pastebin.com/Y5NaH08n
> Boot hung for quite a while, but no clear output
>
> dmesg.2_pme_spurious.txt https://pastebin.com/dX19aCpj
> [8.185808] nvidia-gpu :01:00.3: runtime IRQ mapping not provided by 
> arch
> [8.185989] nvidia-gpu :01:00.3: enabling device ( -> 0002)
> [8.188986] nvidia-gpu :01:00.3: enabling bus mastering
> [   11.936507] nvidia-gpu :01:00.3: PME# enabled
> [   11.975985] nvidia-gpu :01:00.3: PME# disabled
> [   11.976011] pcieport :00:01.0: PME: Spurious native interrupt!
>
> dmesg.3_usb_key_yanked.txt https://pastebin.com/m7QLnCZt
> I yanked the USB key during boot, that seemed to help unlock things with
> 5.7, but did not with 5.8. It's hung on a loop of:
> [   11.262854] nvidia-gpu :01:00.3: saving config space at offset 0x0 
> (reading 0x1ad910de)
> [   11.262863] nvidia-gpu :01:00.3: saving config space at offset 0x4 
> (reading 0x100406)
> [   11.262869] nvidia-gpu :01:00.3: saving config space at offset 0x8 
> (reading 0xc8000a1)
> [   11.262874] nvidia-gpu :01:00.3: saving config space at offset 0xc 
> (reading 0x80)
> [   11.262880] nvidia-gpu :01:00.3: saving config space at offset 0x10 
> (reading 0xce054000)
> [   11.262885] nvidia-gpu :01:00.3: saving config space at offset 0x14 
> (reading 0x0)
> [   11.262890] nvidia-gpu :01:00.3: saving config space at offset 0x18 
> (reading 0x0)
> [   11.262895] nvidia-gpu :01:00.3: saving config space at offset 0x1c 
> (reading 0x0)
> [   11.262900] nvidia-gpu :01:00.3: saving config space at offset 0x20 
> (reading 0x0)
> [   11.262906] nvidia-gpu :01:00.3: saving config space at offset 0x24 
> (reading 0x0)
> [   11.262911] nvidia-gpu :01:00.3: saving config space at offset 0x28 
> (reading 0x0)
> [   11.262916] nvidia-gpu :01:00.3: saving config space at offset 0x2c 
> (reading 0x229b17aa)
> [   11.262921] nvidia-gpu :01:00.3: saving config space at offset 0x30 
> (reading 0x0)
> [   11.262926] nvidia-gpu :01:00.3: saving config space at offset 0x34 
> (reading 0x68)
> [   11.262931] nvidia-gpu :01:00.3: saving config space at offset 0x38 
> (reading 0x0)
> [   11.262937] nvidia-gpu :01:00.3: saving config space at offset 0x3c 
> (reading 0x4ff)
> [   11.262985] nvidia-gpu :01:00.3: PME# enabled
> [   11.303060] nvidia-gpu :01:00.3: PME# disabled
>

mhh, interesting. I heard some random comments that the Nvidia
USB-C/UCSI driver is a bit broken and can cause various issues. Mind
blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't
get loaded) and see if that helps?

> dmesg.4_5.5_boot_fine.txt https://pastebin.com/WXgQTUYP
> reference boot with 4.5, it works fine, no issues
>
> dmesg.5_no_key_still_hang.txt https://pastebin.com/kcT8Ras0
> unfortunately, booting without the USB-C key in thunderbolt, did not
> allow this boot to be faster, it looks different though:
> [6.723454] pcieport :00:01.0: runtime IRQ mapping not provided by arch
> [6.723598] pcieport :00:01.0: PME: Signaling with IRQ 122
> [6.724011] pcieport :00:01.0: saving config space at offset 0x0 
> (reading 0x19018086)
> [6.724016] 

Re: [PATCH] drm/nouveau: Add fine-grain temperature reporting

2020-08-13 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Wed, Aug 12, 2020 at 10:50 PM Jeremy Cline  wrote:
>
> Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of
> new gp1xx temperature sensor") added support for reading finer-grain
> temperatures, but continued to report temperatures in 1 degree Celsius
> increments via nvkm_therm_temp_get().
>
> Rather than altering nvkm_therm_temp_get() to report finer-grain
> temperatures, which would be inconvenient for other users of the
> function, a second interface has been added to line up with hwmon's
> native unit of temperature.
>
> Signed-off-by: Jeremy Cline 
> ---
>  .../drm/nouveau/include/nvkm/subdev/therm.h   | 18 +
>  drivers/gpu/drm/nouveau/nouveau_hwmon.c   |  4 +--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/base.c  | 16 
>  .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +--
>  .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h  |  1 +
>  5 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index 62c34f98c930..7b9928dd001c 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -100,6 +100,24 @@ struct nvkm_therm {
>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
> +
> +/**
> + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees
> + * @therm: The thermal device to read from.
> + *
> + * This interface reports temperatures in units of millidegree Celsius to
> + * align with the hwmon API. Some cards may only be capable of reporting in
> + * units of Celsius, and those that report finer grain temperatures may not 
> be
> + * capable of millidegree Celsius accuracy,
> + *
> + * For cases where millidegree temperature is too fine-grain, the
> + * nvkm_therm_temp_get() interface reports temperatures in one degree Celsius
> + * increments.
> + *
> + * Return: The temperature in millidegrees Celsius, or -ENODEV if temperature
> + * reporting is not supported.
> + */
> +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm);
>  int nvkm_therm_fan_sense(struct nvkm_therm *);
>  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
>  void nvkm_therm_clkgate_init(struct nvkm_therm *,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c 
> b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> index 1c3104d20571..e96355f93ce5 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c
> @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int 
> channel, long *val)
> case hwmon_temp_input:
> if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON)
> return -EINVAL;
> -   ret = nvkm_therm_temp_get(therm);
> -   *val = ret < 0 ? ret : (ret * 1000);
> +   ret = nvkm_therm_temp_millidegree_get(therm);
> +   *val = ret;
> break;
> case hwmon_temp_max:
> *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK)
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> index 4a4d1e224126..e655b32c78b8 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c
> @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm)
> return -ENODEV;
>  }
>
> +int
> +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm)
> +{
> +   int ret = -ENODEV;
> +
> +   if (therm->func->temp_millidegree_get)
> +   return therm->func->temp_millidegree_get(therm);
> +
> +   if (therm->func->temp_get) {
> +   ret = therm->func->temp_get(therm);
> +   if (ret > 0)
> +   ret *= 1000;
> +   }
> +   return ret;
> +}
> +
>  static int
>  nvkm_therm_update_trip(struct nvkm_therm *therm)
>  {
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> index 9f0dea3f61dc..4c3c2895a3cb 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c
> @@ -24,7 +24,7 @@
>  #include "priv.h"
>
>  static int
> -gp100_temp_get(struct nvkm_therm *therm)
> +gp100_temp_get_raw(struct nvkm_therm *therm)
>  {
> struct nvkm_device *device = therm->subdev.device;
> struct nvkm_subdev *subdev = >s

Re: [Nouveau] [PATCH] drm/nouveau/acr: fix a coding style in nvkm_acr_lsfw_load_bl_inst_data_sig()

2020-08-03 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Mon, Aug 3, 2020 at 5:46 AM Jing Xiangfeng  wrote:
>
> This patch performs the following changes:
> 1. remove a redundant parentheses around the  nvkm_acr_lsfw_add() calls
> 2. do assignment before this if condition, it is more readable
>
> Signed-off-by: Jing Xiangfeng 
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
> index 07d1830126ab..5f6006418472 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
> @@ -191,7 +191,8 @@ nvkm_acr_lsfw_load_bl_inst_data_sig(struct nvkm_subdev 
> *subdev,
> u32 *bldata;
> int ret;
>
> -   if (IS_ERR((lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id
> +   lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id);
> +   if (IS_ERR(lsfw))
> return PTR_ERR(lsfw);
>
> ret = nvkm_firmware_load_name(subdev, path, "bl", ver, );
> --
> 2.17.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>



Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"

2020-07-23 Thread Karol Herbst
On Wed, Jul 22, 2020 at 11:25 AM Mika Westerberg
 wrote:
>
> On Tue, Jul 21, 2020 at 01:37:12PM -0500, Patrick Volkerding wrote:
> > On 7/21/20 10:27 AM, Mika Westerberg wrote:
> > > On Tue, Jul 21, 2020 at 11:01:55AM -0400, Lyude Paul wrote:
> > >> Sure thing. Also, feel free to let me know if you'd like access to one 
> > >> of the
> > >> systems we saw breaking with this patch - I'm fairly sure I've got one 
> > >> of them
> > >> locally at my apartment and don't mind setting up AMT/KVM/SSH
> > > Probably no need for remote access (thanks for the offer, though). I
> > > attached a test patch to the bug report:
> > >
> > >   https://bugzilla.kernel.org/show_bug.cgi?id=208597
> > >
> > > that tries to work it around (based on the ->pm_cap == 0). I wonder if
> > > anyone would have time to try it out.
> >
> >
> > Hi Mika,
> >
> > I can confirm that this patch applied to 5.4.52 fixes the issue with
> > hybrid graphics on the Thinkpad X1 Extreme gen2.
>
> Great, thanks for testing!
>

yeah, works on the P1G2 as well.



Re: [PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops

2020-07-20 Thread Karol Herbst
On Mon, Jul 20, 2020 at 3:19 AM Alex Hung  wrote:
>
> On 2020-07-19 1:50 p.m., Karol Herbst wrote:
> > On Fri, Jul 17, 2020 at 9:52 PM Alex Hung  wrote:
> >>
> >> On 2020-07-17 1:05 p.m., Karol Herbst wrote:
> >>> It's hard to figure out what systems are actually affected and right now I
> >>> don't see a good way of removing those...
> >>>
> >>> But I'd like to see thos getting removed and drivers fixed instead (which
> >>> happened at least for nouveau).
> >>>
> >>> And as mentioned before, I prefer people working on fixing issues instead
> >>> of spending time to add firmware level workarounds which are hard to know
> >>> to which systems they apply to, hard to remove and basically a big huge
> >>> pain to work with.> In the end I have no idea how to even figure out what 
> >>> systems are affected
> >>> and which not by this, so I have no idea how to even verify we can safely
> >>> remove this (which just means those are impossible to remove unless we 
> >>> risk
> >>> breaking systems, which again makes those supper annoying to deal with).
> >>>
> >>> Also from the comments it's hard to get what those bits really do. Are 
> >>> they
> >>> just preventing runtime pm or do the devices are powered down when 
> >>> booting?
> >>> I am sure it's the former, still...
> >>>
> >>> Please, don't do this again.
> >>>
> >>> For now, those workaround prevent power savings on systems those 
> >>> workaround
> >>> applies to, which might be any so those should get removed asap and if
> >>> new issues arrise removing those please do a proper bug report and we can
> >>> look into it and come up with a proper fix (and keep this patch out until
> >>> we resolve all of those).
> >>>
> >>> Signed-off-by: Karol Herbst 
> >>> CC: Alex Hung 
> >>> CC: "Rafael J. Wysocki" 
> >>> CC: Len Brown 
> >>> CC: Lyude Paul 
> >>> CC: linux-kernel@vger.kernel.org
> >>> CC: dri-de...@lists.freedesktop.org
> >>> CC: nouv...@lists.freedesktop.org
> >>> ---
> >>>  drivers/acpi/osi.c | 24 
> >>>  1 file changed, 24 deletions(-)
> >>>
> >>> diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c
> >>> index 9f68538091384..d4405e1ca9b97 100644
> >>> --- a/drivers/acpi/osi.c
> >>> +++ b/drivers/acpi/osi.c
> >>> @@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = 
> >>> {
> >>>   {"Processor Device", true},
> >>>   {"3.0 _SCP Extensions", true},
> >>>   {"Processor Aggregator Device", true},
> >>> - /*
> >>> -  * Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia 
> >>> graphics
> >>> -  * cards as RTD3 is not supported by drivers now.  Systems with 
> >>> NVidia
> >>> -  * cards will hang without RTD3 disabled.
> >>> -  *
> >>> -  * Once NVidia drivers officially support RTD3, this _OSI strings 
> >>> can
> >>> -  * be removed if both new and old graphics cards are supported.
> >>> -  */
> >>> - {"Linux-Dell-Video", true},
> >>> - /*
> >>> -  * Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's 
> >>> HDMI
> >>> -  * audio device which is turned off for power-saving in Windows OS.
> >>> -  * This power management feature observed on some Lenovo Thinkpad
> >>> -  * systems which will not be able to output audio via HDMI without
> >>> -  * a BIOS workaround.
> >>> -  */
> >>> - {"Linux-Lenovo-NV-HDMI-Audio", true},
> >>> - /*
> >>> -  * Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to
> >>> -  * output video directly to external monitors on HP Inc. mobile
> >>> -  * workstations as Nvidia and AMD VGA drivers provide limited
> >>> -  * hybrid graphics supports.
> >>> -  */
> >>> - {"Linux-HPI-Hybrid-Graphics", true},
> >>>  };
> >>>
> >>>  static u32 acpi_osi_handler(acpi_string interface, u32 supported)
> >>>
> >>
> >> The changes were discussed and tested a while ago, and no crashes were
> >> observed. Thanks for solving PM issues in nouveau.
> >>
> >> Acked-by: Alex Hung 
> >>
> >
> > By any chance, do you have a list of systems implementing those workarounds?
> >
>
> I don't keep a list but the workaround, in theory, should only apply to
> the systems with the specific nvidia hardware.
>
> I reminded OEMs and ODMs that these _OSI strings were temporary
> solutions, and highlighted we were going to remove them after our
> discussion last year. If they were paying attentions recent systems
> shouldn't have these _OSI strings.
>

Right.. but I am actually wondering because I never saw those strings
in the wild or not on the Dell and Lenovo systems I was testing on. So
I think we might want to ask the vendors themselves and verify on
those systems.

> --
> Cheers,
> Alex Hung
>



Re: [PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops

2020-07-19 Thread Karol Herbst
On Fri, Jul 17, 2020 at 9:52 PM Alex Hung  wrote:
>
> On 2020-07-17 1:05 p.m., Karol Herbst wrote:
> > It's hard to figure out what systems are actually affected and right now I
> > don't see a good way of removing those...
> >
> > But I'd like to see thos getting removed and drivers fixed instead (which
> > happened at least for nouveau).
> >
> > And as mentioned before, I prefer people working on fixing issues instead
> > of spending time to add firmware level workarounds which are hard to know
> > to which systems they apply to, hard to remove and basically a big huge
> > pain to work with.> In the end I have no idea how to even figure out what 
> > systems are affected
> > and which not by this, so I have no idea how to even verify we can safely
> > remove this (which just means those are impossible to remove unless we risk
> > breaking systems, which again makes those supper annoying to deal with).
> >
> > Also from the comments it's hard to get what those bits really do. Are they
> > just preventing runtime pm or do the devices are powered down when booting?
> > I am sure it's the former, still...
> >
> > Please, don't do this again.
> >
> > For now, those workaround prevent power savings on systems those workaround
> > applies to, which might be any so those should get removed asap and if
> > new issues arrise removing those please do a proper bug report and we can
> > look into it and come up with a proper fix (and keep this patch out until
> > we resolve all of those).
> >
> > Signed-off-by: Karol Herbst 
> > CC: Alex Hung 
> > CC: "Rafael J. Wysocki" 
> > CC: Len Brown 
> > CC: Lyude Paul 
> > CC: linux-kernel@vger.kernel.org
> > CC: dri-de...@lists.freedesktop.org
> > CC: nouv...@lists.freedesktop.org
> > ---
> >  drivers/acpi/osi.c | 24 
> >  1 file changed, 24 deletions(-)
> >
> > diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c
> > index 9f68538091384..d4405e1ca9b97 100644
> > --- a/drivers/acpi/osi.c
> > +++ b/drivers/acpi/osi.c
> > @@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = {
> >   {"Processor Device", true},
> >   {"3.0 _SCP Extensions", true},
> >   {"Processor Aggregator Device", true},
> > - /*
> > -  * Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia 
> > graphics
> > -  * cards as RTD3 is not supported by drivers now.  Systems with NVidia
> > -  * cards will hang without RTD3 disabled.
> > -  *
> > -  * Once NVidia drivers officially support RTD3, this _OSI strings can
> > -  * be removed if both new and old graphics cards are supported.
> > -  */
> > - {"Linux-Dell-Video", true},
> > - /*
> > -  * Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's 
> > HDMI
> > -  * audio device which is turned off for power-saving in Windows OS.
> > -  * This power management feature observed on some Lenovo Thinkpad
> > -  * systems which will not be able to output audio via HDMI without
> > -  * a BIOS workaround.
> > -  */
> > - {"Linux-Lenovo-NV-HDMI-Audio", true},
> > - /*
> > -  * Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to
> > -  * output video directly to external monitors on HP Inc. mobile
> > -  * workstations as Nvidia and AMD VGA drivers provide limited
> > -  * hybrid graphics supports.
> > -  */
> > - {"Linux-HPI-Hybrid-Graphics", true},
> >  };
> >
> >  static u32 acpi_osi_handler(acpi_string interface, u32 supported)
> >
>
> The changes were discussed and tested a while ago, and no crashes were
> observed. Thanks for solving PM issues in nouveau.
>
> Acked-by: Alex Hung 
>

By any chance, do you have a list of systems implementing those workarounds?



[PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops

2020-07-17 Thread Karol Herbst
It's hard to figure out what systems are actually affected and right now I
don't see a good way of removing those...

But I'd like to see thos getting removed and drivers fixed instead (which
happened at least for nouveau).

And as mentioned before, I prefer people working on fixing issues instead
of spending time to add firmware level workarounds which are hard to know
to which systems they apply to, hard to remove and basically a big huge
pain to work with.
In the end I have no idea how to even figure out what systems are affected
and which not by this, so I have no idea how to even verify we can safely
remove this (which just means those are impossible to remove unless we risk
breaking systems, which again makes those supper annoying to deal with).

Also from the comments it's hard to get what those bits really do. Are they
just preventing runtime pm or do the devices are powered down when booting?
I am sure it's the former, still...

Please, don't do this again.

For now, those workaround prevent power savings on systems those workaround
applies to, which might be any so those should get removed asap and if
new issues arrise removing those please do a proper bug report and we can
look into it and come up with a proper fix (and keep this patch out until
we resolve all of those).

Signed-off-by: Karol Herbst 
CC: Alex Hung 
CC: "Rafael J. Wysocki" 
CC: Len Brown 
CC: Lyude Paul 
CC: linux-kernel@vger.kernel.org
CC: dri-de...@lists.freedesktop.org
CC: nouv...@lists.freedesktop.org
---
 drivers/acpi/osi.c | 24 
 1 file changed, 24 deletions(-)

diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c
index 9f68538091384..d4405e1ca9b97 100644
--- a/drivers/acpi/osi.c
+++ b/drivers/acpi/osi.c
@@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = {
{"Processor Device", true},
{"3.0 _SCP Extensions", true},
{"Processor Aggregator Device", true},
-   /*
-* Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia graphics
-* cards as RTD3 is not supported by drivers now.  Systems with NVidia
-* cards will hang without RTD3 disabled.
-*
-* Once NVidia drivers officially support RTD3, this _OSI strings can
-* be removed if both new and old graphics cards are supported.
-*/
-   {"Linux-Dell-Video", true},
-   /*
-* Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's HDMI
-* audio device which is turned off for power-saving in Windows OS.
-* This power management feature observed on some Lenovo Thinkpad
-* systems which will not be able to output audio via HDMI without
-* a BIOS workaround.
-*/
-   {"Linux-Lenovo-NV-HDMI-Audio", true},
-   /*
-* Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to
-* output video directly to external monitors on HP Inc. mobile
-* workstations as Nvidia and AMD VGA drivers provide limited
-* hybrid graphics supports.
-*/
-   {"Linux-HPI-Hybrid-Graphics", true},
 };
 
 static u32 acpi_osi_handler(acpi_string interface, u32 supported)
-- 
2.26.2



Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"

2020-07-17 Thread Karol Herbst
Filed at https://bugzilla.kernel.org/show_bug.cgi?id=208597

oddly enough I wasn't able to reproduce it on my XPS 9560, will ping
once something breaks.

On Fri, Jul 17, 2020 at 2:43 AM Karol Herbst  wrote:
>
> On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas  wrote:
> >
> > [+cc Sasha -- stable kernel regression]
> > [+cc Patrick, Kai-Heng, LKML]
> >
> > On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote:
> > > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst  wrote:
> > > >
> > > > Hi everybody,
> > > >
> > > > with the mentioned commit Nouveau isn't able to load firmware onto the
> > > > GPU on one of my systems here. Even though the issue doesn't always
> > > > happen I am quite confident this is the commit breaking it.
> > > >
> > > > I am still digging into the issue and trying to figure out what
> > > > exactly breaks, but it shows up in different ways. Either we are not
> > > > able to boot the engines on the GPU or the GPU becomes unresponsive.
> > > > Btw, this is also a system where our runtime power management issue
> > > > shows up, so maybe there is indeed something funky with the bridge
> > > > controller.
> > > >
> > > > Just pinging you in case you have an idea on how this could break 
> > > > Nouveau
> > > >
> > > > most of the times it shows up like this:
> > > > nouveau :01:00.0: acr: AHESASC binary failed
> > > >
> > > > Sometimes it works at boot and fails at runtime resuming with random
> > > > faults. So I will be investigating a bit more, but yeah... I am super
> > > > sure the commit triggered this issue, no idea if it actually causes
> > > > it.
> > >
> > > so yeah.. I reverted that locally and never ran into issues again.
> > > Still valid on latest 5.7. So can we get this reverted or properly
> > > fixed? This breaks runtime pm for us on at least some hardware.
> >
> > Yeah, that stinks.  We had another similar report from Patrick:
> >
> >   
> > https://lore.kernel.org/r/caerspo5stek_my1dehwp7ahd0xop87+ohywktjbl7algdbx...@mail.gmail.com
> >
> > Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without
> > DLL Link Active train links in 100 ms"), which Patrick found was
> > backported to v5.4.49 as 828b192c57e8, and you found was backported to
> > v5.7.6 as afaff825e3a4.
> >
> > Oddly, Patrick reported that v5.7.7 worked correctly, even though it
> > still contains afaff825e3a4.
> >
> > I guess in the absence of any other clues we'll have to revert it.
> > I hate to do that because that means we'll have slow resume of
> > Thunderbolt-connected devices again, but that's better than having
> > GPUs completely broken.
> >
> > Could you and Patrick open bugzilla.kernel.org reports, attach dmesg
> > logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's
> > original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837
> > and to this thread?
> >
> > There must be a way to fix the slow resume problem without breaking
> > the GPUs.
> >
>
> I wouldn't be surprised if this is related to the Intel bridge we
> check against for Nouveau.. I still have to check on another laptop
> with the same bridge our workaround was required as well but wouldn't
> be surprised if it shows the same problem. Will get you the
> information from both systems tomorrow then.
>
> > Bjorn
> >



Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"

2020-07-16 Thread Karol Herbst
On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas  wrote:
>
> [+cc Sasha -- stable kernel regression]
> [+cc Patrick, Kai-Heng, LKML]
>
> On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote:
> > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst  wrote:
> > >
> > > Hi everybody,
> > >
> > > with the mentioned commit Nouveau isn't able to load firmware onto the
> > > GPU on one of my systems here. Even though the issue doesn't always
> > > happen I am quite confident this is the commit breaking it.
> > >
> > > I am still digging into the issue and trying to figure out what
> > > exactly breaks, but it shows up in different ways. Either we are not
> > > able to boot the engines on the GPU or the GPU becomes unresponsive.
> > > Btw, this is also a system where our runtime power management issue
> > > shows up, so maybe there is indeed something funky with the bridge
> > > controller.
> > >
> > > Just pinging you in case you have an idea on how this could break Nouveau
> > >
> > > most of the times it shows up like this:
> > > nouveau :01:00.0: acr: AHESASC binary failed
> > >
> > > Sometimes it works at boot and fails at runtime resuming with random
> > > faults. So I will be investigating a bit more, but yeah... I am super
> > > sure the commit triggered this issue, no idea if it actually causes
> > > it.
> >
> > so yeah.. I reverted that locally and never ran into issues again.
> > Still valid on latest 5.7. So can we get this reverted or properly
> > fixed? This breaks runtime pm for us on at least some hardware.
>
> Yeah, that stinks.  We had another similar report from Patrick:
>
>   
> https://lore.kernel.org/r/caerspo5stek_my1dehwp7ahd0xop87+ohywktjbl7algdbx...@mail.gmail.com
>
> Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without
> DLL Link Active train links in 100 ms"), which Patrick found was
> backported to v5.4.49 as 828b192c57e8, and you found was backported to
> v5.7.6 as afaff825e3a4.
>
> Oddly, Patrick reported that v5.7.7 worked correctly, even though it
> still contains afaff825e3a4.
>
> I guess in the absence of any other clues we'll have to revert it.
> I hate to do that because that means we'll have slow resume of
> Thunderbolt-connected devices again, but that's better than having
> GPUs completely broken.
>
> Could you and Patrick open bugzilla.kernel.org reports, attach dmesg
> logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's
> original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837
> and to this thread?
>
> There must be a way to fix the slow resume problem without breaking
> the GPUs.
>

I wouldn't be surprised if this is related to the Intel bridge we
check against for Nouveau.. I still have to check on another laptop
with the same bridge our workaround was required as well but wouldn't
be surprised if it shows the same problem. Will get you the
information from both systems tomorrow then.

> Bjorn
>



Re: [git pull] drm for 5.8-rc1

2020-07-01 Thread Karol Herbst
On Wed, Jul 1, 2020 at 7:37 PM James Jones  wrote:
>
> On 7/1/20 10:04 AM, Karol Herbst wrote:
> > On Wed, Jul 1, 2020 at 6:01 PM Daniel Vetter  wrote:
> >>
> >> On Wed, Jul 1, 2020 at 5:51 PM James Jones  wrote:
> >>>
> >>> On 7/1/20 4:24 AM, Karol Herbst wrote:
> >>>> On Wed, Jul 1, 2020 at 6:45 AM James Jones  wrote:
> >>>>>
> >>>>> This implies something is trying to use one of the old
> >>>>> DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without
> >>>>> first checking whether it is supported by the kernel.  I had tried to
> >>>>> force an Xorg+Mesa stack without my userspace patches to hit this error
> >>>>> when testing, but must have missed some permutation.  If the stalled
> >>>>> Mesa patches go in, this would stop happening of course, but those were
> >>>>> held up for a long time in review, and are now waiting on me to make
> >>>>> some modifications.
> >>>>>
> >>>>
> >>>> that's completely irrelevant. If a kernel change breaks userspace,
> >>>> it's a kernel bug.
> >>>
> >>> Agreed it is unacceptable to break userspace, but I don't think it's
> >>> irrelevant.  Perhaps the musings on pending userspace patches are.
> >>>
> >>> My intent here was to point out it appears at first glance that
> >>> something isn't behaving as expected in userspace, so fixing this would
> >>> likely require some sort of work-around for broken userspace rather than
> >>> straight-forward fixing of a bug in the kernel logic.  My intent was not
> >>> to shift blame to something besides my code & testing for the
> >>> regression, though I certainly see how it could be interpreted that way.
> >>>
> >>> Regardless, I'm looking in to it.
> >>
> >
> > I assume the MR you were talking about is
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3724 ?
>
> Correct.
>
> > I am
> > also aware of the tegra driver being broken on my jetson nano and I am
> > now curious if this MR could fix this bug as well... and sorry for the
> > harsh reply, I was just a annoyed by the fact that "everything
> > modifier related is just breaking things", first tegra and that nobody
> > is looking into fixing it and then apparently the userspace code being
> > quite broken as well :/
> >
> > Anyway, yeah I trust you guys on figuring out the keeping "broken"
> > userspace happy from a kernel side and maybe I can help out with
> > reviewing the mesa bits. I am just wondering if it could help with the
> > tegra situation giving me more reasons to look into it as this would
> > solve other issues I should be working on :)
>
> Not sure if you're claiming this, but if there's Tegra breakage
> attributable to this patch series, I'd love to hear more details there
> as well.  The Tegra patches did have backwards-compat code to handle the
> old modifiers, since Tegra was the only working use case I could find
> for them within the kernel itself.  However, the Tegra kernel patches
> are independent (and haven't even been reviewed yet to my knowledge), so
> Tegra shouldn't be affected at all given it uses TegraDRM rather than
> Nouveau's modesetting driver.
>
> If there are just general existing issues with modifier support on
> Tegra, let's take that to a smaller venue.  I probably won't be as much
> help there, but I can at least try to help get some eyes on it.
>

I am sure that your patches here have nothing to do with it, just
inside mesa since
https://gitlab.freedesktop.org/mesa/mesa/commit/c56fe4118a2dd991ff1b2a532c0f234eddd435e9
it's broken on the jetson nano and because it's so old I am not able
to tell if this is because of some kernel changes or because of the
modifier code inside mesa being slightly broken.

Maybe you have an idea, but Thierry knows about the issue, but I think
he never was able to reproduce it on any system.

> Thanks,
> -James
>
> >> If we do need to have a kernel workaround I'm happy to help out, I've
> >> done a bunch of these and occasionally it's good to get rather
> >> creative :-)
> >>
> >> Ideally we'd also push a minimal fix in userspace to all stable
> >> branches and make sure distros upgrade (might need releases if some
> >> distro is stuck on old horrors), so that we don't have to keep the
> >> hack in place for 10+ years or so. Definitely if the hack amounts to
> >> disabling modifiers on 

Re: [git pull] drm for 5.8-rc1

2020-07-01 Thread Karol Herbst
On Wed, Jul 1, 2020 at 6:01 PM Daniel Vetter  wrote:
>
> On Wed, Jul 1, 2020 at 5:51 PM James Jones  wrote:
> >
> > On 7/1/20 4:24 AM, Karol Herbst wrote:
> > > On Wed, Jul 1, 2020 at 6:45 AM James Jones  wrote:
> > >>
> > >> This implies something is trying to use one of the old
> > >> DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without
> > >> first checking whether it is supported by the kernel.  I had tried to
> > >> force an Xorg+Mesa stack without my userspace patches to hit this error
> > >> when testing, but must have missed some permutation.  If the stalled
> > >> Mesa patches go in, this would stop happening of course, but those were
> > >> held up for a long time in review, and are now waiting on me to make
> > >> some modifications.
> > >>
> > >
> > > that's completely irrelevant. If a kernel change breaks userspace,
> > > it's a kernel bug.
> >
> > Agreed it is unacceptable to break userspace, but I don't think it's
> > irrelevant.  Perhaps the musings on pending userspace patches are.
> >
> > My intent here was to point out it appears at first glance that
> > something isn't behaving as expected in userspace, so fixing this would
> > likely require some sort of work-around for broken userspace rather than
> > straight-forward fixing of a bug in the kernel logic.  My intent was not
> > to shift blame to something besides my code & testing for the
> > regression, though I certainly see how it could be interpreted that way.
> >
> > Regardless, I'm looking in to it.
>

I assume the MR you were talking about is
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3724 ? I am
also aware of the tegra driver being broken on my jetson nano and I am
now curious if this MR could fix this bug as well... and sorry for the
harsh reply, I was just a annoyed by the fact that "everything
modifier related is just breaking things", first tegra and that nobody
is looking into fixing it and then apparently the userspace code being
quite broken as well :/

Anyway, yeah I trust you guys on figuring out the keeping "broken"
userspace happy from a kernel side and maybe I can help out with
reviewing the mesa bits. I am just wondering if it could help with the
tegra situation giving me more reasons to look into it as this would
solve other issues I should be working on :)

> If we do need to have a kernel workaround I'm happy to help out, I've
> done a bunch of these and occasionally it's good to get rather
> creative :-)
>
> Ideally we'd also push a minimal fix in userspace to all stable
> branches and make sure distros upgrade (might need releases if some
> distro is stuck on old horrors), so that we don't have to keep the
> hack in place for 10+ years or so. Definitely if the hack amounts to
> disabling modifiers on nouveau, that would be kinda sad.
> -Daniel
>
> >
> > Thanks,
> > -James
> >
> > >> Are you using the modesetting driver in X?  If so, with glamor I
> > >> presume?  What version of Mesa?  Any distro patches?  Any non-default
> > >> xorg.conf options that would affect modesetting, your X driver if it
> > >> isn't modesetting, or glamour?
> > >>
> > >> Thanks,
> > >> -James
> > >>
> > >> On 6/30/20 4:08 PM, Kirill A. Shutemov wrote:
> > >>> On Tue, Jun 02, 2020 at 04:06:32PM +1000, Dave Airlie wrote:
> > >>>> James Jones (4):
> > >>> ...
> > >>>> drm/nouveau/kms: Support NVIDIA format modifiers
> > >>>
> > >>> This commit is the first one that breaks Xorg startup for my setup:
> > >>> GTX 1080 + Dell UP2414Q (4K DP MST monitor).
> > >>>
> > >>> I believe this is the crucial part of dmesg (full dmesg is attached):
> > >>>
> > >>> [   29.997140] [drm:nouveau_framebuffer_new] Unsupported modifier: 
> > >>> 0x314
> > >>> [   29.997143] [drm:drm_internal_framebuffer_create] could not create 
> > >>> framebuffer
> > >>> [   29.997145] [drm:drm_ioctl] pid=3393, ret = -22
> > >>>
> > >>> Any suggestions?
> > >>>
> > >> ___
> > >> dri-devel mailing list
> > >> dri-de...@lists.freedesktop.org
> > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >>
> > >
> > > ___
> > > dri-devel mailing list
> > > dri-de...@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > >
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
>



Re: [git pull] drm for 5.8-rc1

2020-07-01 Thread Karol Herbst
On Wed, Jul 1, 2020 at 6:45 AM James Jones  wrote:
>
> This implies something is trying to use one of the old
> DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without
> first checking whether it is supported by the kernel.  I had tried to
> force an Xorg+Mesa stack without my userspace patches to hit this error
> when testing, but must have missed some permutation.  If the stalled
> Mesa patches go in, this would stop happening of course, but those were
> held up for a long time in review, and are now waiting on me to make
> some modifications.
>

that's completely irrelevant. If a kernel change breaks userspace,
it's a kernel bug.

> Are you using the modesetting driver in X?  If so, with glamor I
> presume?  What version of Mesa?  Any distro patches?  Any non-default
> xorg.conf options that would affect modesetting, your X driver if it
> isn't modesetting, or glamour?
>
> Thanks,
> -James
>
> On 6/30/20 4:08 PM, Kirill A. Shutemov wrote:
> > On Tue, Jun 02, 2020 at 04:06:32PM +1000, Dave Airlie wrote:
> >> James Jones (4):
> > ...
> >>drm/nouveau/kms: Support NVIDIA format modifiers
> >
> > This commit is the first one that breaks Xorg startup for my setup:
> > GTX 1080 + Dell UP2414Q (4K DP MST monitor).
> >
> > I believe this is the crucial part of dmesg (full dmesg is attached):
> >
> > [   29.997140] [drm:nouveau_framebuffer_new] Unsupported modifier: 
> > 0x314
> > [   29.997143] [drm:drm_internal_framebuffer_create] could not create 
> > framebuffer
> > [   29.997145] [drm:drm_ioctl] pid=3393, ret = -22
> >
> > Any suggestions?
> >
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>



Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-22 Thread Karol Herbst
On Tue, Oct 22, 2019 at 2:45 PM Mika Westerberg
 wrote:
>
> On Tue, Oct 22, 2019 at 11:16:14AM +0200, Karol Herbst wrote:
> > I think there is something I totally forgot about:
> >
> > When there was never a driver bound to the GPU, and if runtime power
> > management gets enabled on that device, runtime suspend/resume works
> > as expected (I am not 100% sure on if that always works, but I will
> > recheck that).
>
> AFAIK, if there is no driver bound to the PCI device it is left to D0
> regardless of the runtime PM state which could explain why it works in
> that case (it is never put into D3hot).
>
> I looked at the acpidump you sent and there is one thing that may
> explain the differences between Windows and Linux. Not sure if you were
> aware of this already, though. The power resource PGOF() method has
> this:
>
>If (((OSYS <= 0x07D9) || ((OSYS == 0x07DF) && (_REV == 0x05 {
>   ...
>}
>

I think this is the fallback to some older method of runtime
suspending the device, and I think it will end up touching different
registers on the bridge controller which do not show the broken
behaviour.

You'll find references to following variables which all cause a link
to be powered down: Q0L2 (newest), P0L2, P0LD (oldest, I think).

Maybe I remember incorrectly and have to read the code again... okay,
the fallback path uses P0LD indeed. That's actually the only register
of those being documented by Intel afaik.

> If I read it right, the later condition tries to detect Linux which
> fails nowadays but if you have acpi_rev_override in the command line (or
> the machine is listed in acpi_rev_dmi_table) this check passes and does
> some magic which is not clear to me. There is similar in PGON() side
> which is used to turn the device back on.
>
> You can check what actually happens when _ON()/_OFF() is called by
> passing something like below to the kernel command line:
>
>   acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 
> acpi.trace_method_name=\_SB.PCI0.PEG0.PG00._ON acpi.trace_state=method
>
> (See also Documentation/firmware-guide/acpi/method-tracing.rst).
>
> Trace goes to system dmesg.

This sounds to be very helpful, I'll give it a try.


Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-21 Thread Karol Herbst
On Mon, Oct 21, 2019 at 4:09 PM Mika Westerberg
 wrote:
>
> On Mon, Oct 21, 2019 at 03:54:09PM +0200, Karol Herbst wrote:
> > > I really would like to provide you more information about such
> > > workaround but I'm not aware of any ;-) I have not seen any issues like
> > > this when D3cold is properly implemented in the platform.  That's why
> > > I'm bit skeptical that this has anything to do with specific Intel PCIe
> > > ports. More likely it is some power sequence in the _ON/_OFF() methods
> > > that is run differently on Windows.
> >
> > yeah.. maybe. I really don't know what's the actual root cause. I just
> > know that with this workaround it works perfectly fine on my and some
> > other systems it was tested on. Do you know who would be best to
> > approach to get proper documentation about those methods and what are
> > the actual prerequisites of those methods?
>
> Those should be documented in the ACPI spec. Chapter 7 should explain
> power resources and the device power methods in detail.

either I looked up the wrong spec or the documentation isn't really
saying much there.


Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Karol Herbst
On Wed, Oct 16, 2019 at 11:37 PM Bjorn Helgaas  wrote:
>
> [+cc linux-acpi]
>
> On Wed, Oct 16, 2019 at 09:18:32PM +0200, Karol Herbst wrote:
> > but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the
> > platform means of putting the device into D3cold, right? That's
> > actually what should still happen, just the D3hot step should be
> > skipped.
>
> If I understand correctly, when we put a device in D3cold on an ACPI
> system, we do something like this:
>
>   pci_set_power_state(D3cold)
> if (PCI_DEV_FLAGS_NO_D3)
>   return 0   <-- nothing at all if quirked
> pci_raw_set_power_state
>   pci_write_config_word(PCI_PM_CTRL, D3hot)  <-- set to D3hot
> __pci_complete_power_transition(D3cold)
>   pci_platform_power_transition(D3cold)
> platform_pci_set_power_state(D3cold)
>   acpi_pci_set_power_state(D3cold)
> acpi_device_set_power(ACPI_STATE_D3_COLD)
>   ...
> acpi_evaluate_object("_OFF") <-- set to D3cold
>
> I did not understand the connection with platform (ACPI) power
> management from your patch.  It sounds like you want this entire path
> except that you want to skip the PCI_PM_CTRL write?
>

exactly. I am running with this workaround for a while now and never
had any fails with it anymore. The GPU gets turned off correctly and I
see the same power savings, just that the GPU can be powered on again.

> That seems like something Rafael should weigh in on.  I don't know
> why we set the device to D3hot with PCI_PM_CTRL before using the ACPI
> methods, and I don't know what the effect of skipping that is.  It
> seems a little messy to slice out this tiny piece from the middle, but
> maybe it makes sense.
>

afaik when I was talking with others in the past about it, Windows is
doing that before using ACPI calls, but maybe they have some similar
workarounds for certain intel bridges as well? I am sure it affects
more than the one I am blacklisting here, but I rather want to check
each device before blacklisting all kabylake and sky lake bridges (as
those are the ones were this issue can be observed).

Sadly we had no luck getting any information about such workaround out
of Nvidia or Intel.

> > On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas  wrote:
> > >
> > > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:
> > > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher 
> > > > device
> > > > states.
> > > >
> > > > v2: convert to pci_dev quirk
> > > > put a proper technical explanation of the issue as a in-code comment
> > > > v3: disable it only for certain combinations of intel and nvidia 
> > > > hardware
> > > >
> > > > Signed-off-by: Karol Herbst 
> > > > Cc: Bjorn Helgaas 
> > > > Cc: Lyude Paul 
> > > > Cc: Rafael J. Wysocki 
> > > > Cc: Mika Westerberg 
> > > > Cc: linux-...@vger.kernel.org
> > > > Cc: linux...@vger.kernel.org
> > > > Cc: dri-de...@lists.freedesktop.org
> > > > Cc: nouv...@lists.freedesktop.org
> > > > ---
> > > >  drivers/pci/pci.c| 11 ++
> > > >  drivers/pci/quirks.c | 52 
> > > >  include/linux/pci.h  |  1 +
> > > >  3 files changed, 64 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index b97d9e10c9cc..8e056eb7e6ff 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct 
> > > > pci_dev *dev)
> > > >   return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> > > >  }
> > > >
> > > > +static inline bool parent_broken_child_pm(struct pci_dev *dev)
> > > > +{
> > > > + if (!dev->bus || !dev->bus->self)
> > > > + return false;
> > > > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
> > > > +}
> > > > +
> > > >  /**
> > > >   * pci_raw_set_power_state - Use PCI PM registers to set the power 
> > > > state of
> > > >   *given PCI device
> > > > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev 
> > > > *dev, pci_power_t state)
> > > >  || (state == PCI_D

Re: [RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges

2019-10-01 Thread Karol Herbst
On Tue, Oct 1, 2019 at 11:11 AM Mika Westerberg
 wrote:
>
> On Tue, Oct 01, 2019 at 10:56:39AM +0200, Karol Herbst wrote:
> > On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg
> >  wrote:
> > >
> > > On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote:
> > > > On Mon, Sep 30, 2019 at 6:30 PM Mika Westerberg
> > > >  wrote:
> > > > >
> > > > > On Mon, Sep 30, 2019 at 06:05:14PM +0200, Karol Herbst wrote:
> > > > > > still happens with your patch applied. The machine simply gets shut 
> > > > > > down.
> > > > > >
> > > > > > dmesg can be found here:
> > > > > > https://gist.githubusercontent.com/karolherbst/40eb091c7b7b33ef993525de660f1a3b/raw/2380e31f566e93e5ba7c87ef545420965d4c492c/gistfile1.txt
> > > > >
> > > > > Looking your dmesg:
> > > > >
> > > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: DCB version 4.1
> > > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: MM: using COPY for 
> > > > > buffer copies
> > > > > Sep 30 17:24:27 kernel: [drm] Initialized nouveau 1.3.1 20120801 for 
> > > > > :01:00.0 on minor 1
> > > > >
> > > > > I would assume it runtime suspends here. Then it wakes up because of 
> > > > > PCI
> > > > > access from userspace:
> > > > >
> > > > > Sep 30 17:24:42 kernel: pci_raw_set_power_state: 56 callbacks 
> > > > > suppressed
> > > > >
> > > > > and for some reason it does not get resumed properly. There are also 
> > > > > few
> > > > > warnings from ACPI that might be relevant:
> > > > >
> > > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument 
> > > > > #4 type mismatch - Found [Buffer], ACPI requires [Package] 
> > > > > (20190509/nsarguments-59)
> > > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: 
> > > > > Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] 
> > > > > (20190509/nsarguments-59)
> > > > >
> > > >
> > > > afaik this is the case for essentially every laptop out there.
> > >
> > > OK, so they are harmless?
> > >
> >
> > yes
> >
> > > > > This seems to be Dell XPS 9560 which I think has been around some time
> > > > > already so I wonder why we only see issues now. Has it ever worked for
> > > > > you or maybe there is a regression that causes it to happen now?
> > > >
> > > > oh, it's broken since forever, we just tried to get more information
> > > > from Nvidia if they know what this is all about, but we got nothing
> > > > useful.
> > > >
> > > > We were also hoping to find a reliable fix or workaround we could have
> > > > inside nouveau to fix that as I think nouveau is the only driver
> > > > actually hit by this issue, but nothing turned out to be reliable
> > > > enough.
> > >
> > > Can't you just block runtime PM from the nouveau driver until this is
> > > understood better? That can be done by calling pm_runtime_forbid() (or
> > > not calling pm_runtime_allow() in the driver). Or in case of PCI driver
> > > you just don't decrease the reference count when probe() ends.
> > >
> >
> > the thing is, it does work for a lot of laptops. We could only observe
> > this on kaby lake and skylake ones. Even on Cannon Lakes it seems to
> > work just fine.
>
> Can't you then limit it to those?
>
> I've experienced that Kabylake root ports can enter and exit in D3cold
> just fine because we do that for Thunderbolt for example. But that
> always requires help from ACPI. If the system is using non-standard ACPI
> methods for example that may require some tricks in the driver side.
>

yeah.. I am not quite sure what's actually the root cause. I was also
trying to use the same PCI registers ACPI is using to trigger this
issue on a normal desktop, no luck. Using the same registers does
trigger the issue (hence the script).

The script is essentially just doing what ACPI does, just skipping a lot.

> > > I think that would be much better than blocking any devices behind
> > > Kabylake PCIe root ports from entering D3 (I don't really think the
> > > problem is in the root ports itself but there is something we are
> > > missing when the NVIDIA GPU is put into D3cold or back from there).
> >
> > I highly doubt there is anything wrong with the GPU alone as we have
> > too many indications which tell us otherwise.
> >
> > Anyway, at this point I don't know where to look further for what's
> > actually wrong. And apparently it works on Windows, but I don't know
> > why and I have no idea what Windows does on such systems to make it
> > work reliably.
>
> By works you mean that Windows is able to put it into D3cold and back?
> If that's the case it may be that there is some ACPI magic that the
> Windows driver does and we of course are missing in Linux.

Afaik that's the case. We were talking with Nvidia about it, but they
are not aware of any issues generally. (on Windows, nor the hardware).
No idea if we can trust their statements though.

But yeah, it might be that on Windows they still do _DSM calls or
something... but until today, Nvidia didn't provide any documentation
to us for that.


Re: [RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges

2019-10-01 Thread Karol Herbst
On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg
 wrote:
>
> On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote:
> > On Mon, Sep 30, 2019 at 6:30 PM Mika Westerberg
> >  wrote:
> > >
> > > On Mon, Sep 30, 2019 at 06:05:14PM +0200, Karol Herbst wrote:
> > > > still happens with your patch applied. The machine simply gets shut 
> > > > down.
> > > >
> > > > dmesg can be found here:
> > > > https://gist.githubusercontent.com/karolherbst/40eb091c7b7b33ef993525de660f1a3b/raw/2380e31f566e93e5ba7c87ef545420965d4c492c/gistfile1.txt
> > >
> > > Looking your dmesg:
> > >
> > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: DCB version 4.1
> > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: MM: using COPY for 
> > > buffer copies
> > > Sep 30 17:24:27 kernel: [drm] Initialized nouveau 1.3.1 20120801 for 
> > > :01:00.0 on minor 1
> > >
> > > I would assume it runtime suspends here. Then it wakes up because of PCI
> > > access from userspace:
> > >
> > > Sep 30 17:24:42 kernel: pci_raw_set_power_state: 56 callbacks suppressed
> > >
> > > and for some reason it does not get resumed properly. There are also few
> > > warnings from ACPI that might be relevant:
> > >
> > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 
> > > type mismatch - Found [Buffer], ACPI requires [Package] 
> > > (20190509/nsarguments-59)
> > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument 
> > > #4 type mismatch - Found [Buffer], ACPI requires [Package] 
> > > (20190509/nsarguments-59)
> > >
> >
> > afaik this is the case for essentially every laptop out there.
>
> OK, so they are harmless?
>

yes

> > > This seems to be Dell XPS 9560 which I think has been around some time
> > > already so I wonder why we only see issues now. Has it ever worked for
> > > you or maybe there is a regression that causes it to happen now?
> >
> > oh, it's broken since forever, we just tried to get more information
> > from Nvidia if they know what this is all about, but we got nothing
> > useful.
> >
> > We were also hoping to find a reliable fix or workaround we could have
> > inside nouveau to fix that as I think nouveau is the only driver
> > actually hit by this issue, but nothing turned out to be reliable
> > enough.
>
> Can't you just block runtime PM from the nouveau driver until this is
> understood better? That can be done by calling pm_runtime_forbid() (or
> not calling pm_runtime_allow() in the driver). Or in case of PCI driver
> you just don't decrease the reference count when probe() ends.
>

the thing is, it does work for a lot of laptops. We could only observe
this on kaby lake and skylake ones. Even on Cannon Lakes it seems to
work just fine.

> I think that would be much better than blocking any devices behind
> Kabylake PCIe root ports from entering D3 (I don't really think the
> problem is in the root ports itself but there is something we are
> missing when the NVIDIA GPU is put into D3cold or back from there).

I highly doubt there is anything wrong with the GPU alone as we have
too many indications which tell us otherwise.

Anyway, at this point I don't know where to look further for what's
actually wrong. And apparently it works on Windows, but I don't know
why and I have no idea what Windows does on such systems to make it
work reliably.


Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"

2019-08-15 Thread Karol Herbst
On Thu, Aug 15, 2019 at 4:30 PM  wrote:
>
> > On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst  wrote:
> > >
> > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher 
> > wrote:
> > > >
> > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst  
> > > > wrote:
> > > > >
> > > > > On Thu, Aug 15, 2019 at 3:56 PM  wrote:
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: linux-acpi-ow...@vger.kernel.org  > ow...@vger.kernel.org> On
> > > > > > > Behalf Of Dave Airlie
> > > > > > > Sent: Wednesday, August 14, 2019 5:48 PM
> > > > > > > To: Karol Herbst
> > > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; 
> > > > > > > Alex Hung;
> > Ben
> > > > > > > Skeggs; Dave Airlie
> > > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM 
> > > > > > > _OSI
> > string to
> > > > > > > enable dGPU direct output"
> > > > > > >
> > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst 
> > wrote:
> > > > > > > >
> > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c.
> > > > > > > >
> > > > > > > > The original commit message didn't even make sense. AMD _does_
> > support it and
> > > > > > > > it works with Nouveau as well.
> > > > > > > >
> > > > > > > > Also what was the issue being solved here? No references to any 
> > > > > > > > bugs
> > and not
> > > > > > > > even explaining any issue at all isn't the way we do things.
> > > > > > > >
> > > > > > > > And even if it means a muxed design, then the fix is to make it 
> > > > > > > > work
> > inside the
> > > > > > > > driver, not adding some hacky workaround through ACPI tricks.
> > > > > > > >
> > > > > > > > And what out of tree drivers do or do not support we don't care 
> > > > > > > > one
> > bit anyway.
> > > > > > > >
> > > > > > >
> > > > > > > I think the reverts should be merged via Rafael's tree as the 
> > > > > > > original
> > > > > > > patches went in via there, and we should get them in asap.
> > > > > > >
> > > > > > > Acked-by: Dave Airlie 
> > > > > > > Dave.
> > > > > >
> > > > > > There are definitely going to be regressions on machines in the 
> > > > > > field with
> > the
> > > > > > in tree drivers by reverting this.  I think we should have an 
> > > > > > answer for all
> > of those
> > > > > > before this revert is accepted.
> > > > > >
> > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with 
> > > > > > partners to
> > collect
> > > > > > some information on the impact of reverting this.
> > > > > >
> > > > > > When this is used on a system with Intel+AMD the ASL configures AMD
> > GPU to use
> > > > > > "Hybrid Graphics" when on Windows and "Power Express" and
> > "Switchable Graphics"
> > > > > > when on Linux.
> > > > >
> > > > > and what's exactly the difference between those? And what's the actual
> > > > > issue here?
> > > >
> > > > Hybrid Graphics is the new "standard" way of handling these laptops.
> > > > It uses the standard _PR3 APCI method to handle dGPU power.  Support
> > > > for this was added to Linux relatively later compared to when the
> > > > laptops were launched.  "Power Express" used the other AMD specific
> > > > ATPX ACPI method to handle dGPU power.  The driver supports both so
> > > > either method will work.
> > > >
> > > > Alex
> > > >
> > >
> > > thanks for clarifying. But that still means that we won't need such
> > > workarounds for AMD users, right? amdgpu handles hybrid graphics just
> > > fine, right?
> >
> > Yeah it should, assuming you have a new enough kernel which supports
> > HG, which has been several years at this point IIRC.
> >
> > Alex
> >
>
> Can you define how new of a kernel is a new enough kernel?
>
> Looking on my side these problems were on new hardware (Precision 7740) and
> are checked as recently as start of this summer, w/ kernel 4.15.

That's not even a long term one. And it shouldn't get any fixes. I
just checked, last update was 16 months ago.

>
> We can arrange to have it checked again on 5.3rcX w/ the OSI disabled.

yeah, please do. If there are any issues, we (as in drm developers)
are happy to fix the issues inside the drivers.


Re: [PATCH 1/5] drm/nouveau: Check backlight IDs are >= 0, not > 0

2018-08-23 Thread Karol Herbst
Patches 1-5 are Reviewed-by: Karol Herbst 

I think it might be worth to test those patches on a system without
any backlight devices just to verify we don't break things, but the
code looked good already, so maybe we don't really need to test.

On Thu, Aug 23, 2018 at 3:21 AM, Lyude Paul  wrote:
> Remember, ida IDs start at 0, not 1!
>
> Signed-off-by: Lyude Paul 
> Cc: sta...@vger.kernel.org
> Cc: Jeffery Miller 
> Cc: Karol Herbst 
> ---
>  drivers/gpu/drm/nouveau/nouveau_backlight.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
> b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> index 408b955e5c39..6dd72bc32897 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> @@ -116,7 +116,7 @@ nv40_backlight_init(struct drm_connector *connector)
>_bl_ops, );
>
> if (IS_ERR(bd)) {
> -   if (bl_connector.id > 0)
> +   if (bl_connector.id >= 0)
> ida_simple_remove(_ida, bl_connector.id);
> return PTR_ERR(bd);
> }
> @@ -249,7 +249,7 @@ nv50_backlight_init(struct drm_connector *connector)
>nv_encoder, ops, );
>
> if (IS_ERR(bd)) {
> -   if (bl_connector.id > 0)
> +   if (bl_connector.id >= 0)
> ida_simple_remove(_ida, bl_connector.id);
> return PTR_ERR(bd);
> }
> --
> 2.17.1
>


Re: [PATCH 1/5] drm/nouveau: Check backlight IDs are >= 0, not > 0

2018-08-23 Thread Karol Herbst
Patches 1-5 are Reviewed-by: Karol Herbst 

I think it might be worth to test those patches on a system without
any backlight devices just to verify we don't break things, but the
code looked good already, so maybe we don't really need to test.

On Thu, Aug 23, 2018 at 3:21 AM, Lyude Paul  wrote:
> Remember, ida IDs start at 0, not 1!
>
> Signed-off-by: Lyude Paul 
> Cc: sta...@vger.kernel.org
> Cc: Jeffery Miller 
> Cc: Karol Herbst 
> ---
>  drivers/gpu/drm/nouveau/nouveau_backlight.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c 
> b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> index 408b955e5c39..6dd72bc32897 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
> @@ -116,7 +116,7 @@ nv40_backlight_init(struct drm_connector *connector)
>_bl_ops, );
>
> if (IS_ERR(bd)) {
> -   if (bl_connector.id > 0)
> +   if (bl_connector.id >= 0)
> ida_simple_remove(_ida, bl_connector.id);
> return PTR_ERR(bd);
> }
> @@ -249,7 +249,7 @@ nv50_backlight_init(struct drm_connector *connector)
>nv_encoder, ops, );
>
> if (IS_ERR(bd)) {
> -   if (bl_connector.id > 0)
> +   if (bl_connector.id >= 0)
> ida_simple_remove(_ida, bl_connector.id);
> return PTR_ERR(bd);
> }
> --
> 2.17.1
>


Re: [Nouveau] [RFC v2 1/4] drm/nouveau: Add support for basic clockgating on Kepler1

2018-01-26 Thread Karol Herbst
On Fri, Jan 26, 2018 at 4:35 AM, Lyude Paul  wrote:
> This adds support for enabling automatic clockgating on nvidia GPUs for
> Kepler1. While this is not technically a clockgating level, it does
> enable clockgating using the clockgating values initially set by the
> vbios (which should be safe to use).
>
> This introduces two therm helpers for controlling basic clockgating:
> nvkm_therm_clkgate_enable() - enables clockgating through
> CG_CTRL, done after initializing the GPU fully
> nvkm_therm_clkgate_fini() - prepares clockgating for suspend or
> driver unload
>
> As well, we add the nouveau kernel config parameter NvPmEnableGating,
> which can be toggled on or off in order to enable/disable clockgating.
> Since we've only had limited testing on this thus far, we disable this
> by default.
>
> A lot of this code was originally going to be based off of fermi;
> however it turns out that while Fermi's the first line of GPUs that
> introduced this kind of power saving, Fermi requires more fine tuned
> control of the CG_CTRL registers from the driver while reclocking that
> we don't entirely understand yet.
>
> For the simple parts we will be sharing with Fermi for certain however,
> we at least add those into a new subdev/therm/gf100.h header.
>
> Signed-off-by: Lyude Paul 
> ---
>  .../gpu/drm/nouveau/include/nvkm/subdev/therm.h|   5 +
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  |  17 +--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild   |   1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c   |  60 +++--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h  |  35 ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c  |   8 +-
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c  | 135 
> +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h  |  48 
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h   |  15 ++-
>  9 files changed, 303 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index b1ac47eb786e..240b19bb4667 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -85,17 +85,22 @@ struct nvkm_therm {
>
> int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type);
> int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int);
> +
> +   bool clkgating_enabled;
>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
>  int nvkm_therm_fan_sense(struct nvkm_therm *);
>  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
> +void nvkm_therm_clkgate_enable(struct nvkm_therm *);
> +void nvkm_therm_clkgate_fini(struct nvkm_therm *, bool);
>
>  int nv40_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> +int gk104_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm200_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gp100_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> index 08e77cd55e6e..74bd09b1c893 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> @@ -28,6 +28,7 @@
>  #include 
>
>  #include 
> +#include 
>
>  static DEFINE_MUTEX(nv_devices_mutex);
>  static LIST_HEAD(nv_devices);
> @@ -1682,7 +1683,7 @@ nve4_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top = gk104_top_new,
> .volt = gk104_volt_new,
> @@ -1721,7 +1722,7 @@ nve6_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top = gk104_top_new,
> .volt = gk104_volt_new,
> @@ -1760,7 +1761,7 @@ nve7_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top 

Re: [Nouveau] [RFC v2 1/4] drm/nouveau: Add support for basic clockgating on Kepler1

2018-01-26 Thread Karol Herbst
On Fri, Jan 26, 2018 at 4:35 AM, Lyude Paul  wrote:
> This adds support for enabling automatic clockgating on nvidia GPUs for
> Kepler1. While this is not technically a clockgating level, it does
> enable clockgating using the clockgating values initially set by the
> vbios (which should be safe to use).
>
> This introduces two therm helpers for controlling basic clockgating:
> nvkm_therm_clkgate_enable() - enables clockgating through
> CG_CTRL, done after initializing the GPU fully
> nvkm_therm_clkgate_fini() - prepares clockgating for suspend or
> driver unload
>
> As well, we add the nouveau kernel config parameter NvPmEnableGating,
> which can be toggled on or off in order to enable/disable clockgating.
> Since we've only had limited testing on this thus far, we disable this
> by default.
>
> A lot of this code was originally going to be based off of fermi;
> however it turns out that while Fermi's the first line of GPUs that
> introduced this kind of power saving, Fermi requires more fine tuned
> control of the CG_CTRL registers from the driver while reclocking that
> we don't entirely understand yet.
>
> For the simple parts we will be sharing with Fermi for certain however,
> we at least add those into a new subdev/therm/gf100.h header.
>
> Signed-off-by: Lyude Paul 
> ---
>  .../gpu/drm/nouveau/include/nvkm/subdev/therm.h|   5 +
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  |  17 +--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild   |   1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c   |  60 +++--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h  |  35 ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c  |   8 +-
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c  | 135 
> +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h  |  48 
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h   |  15 ++-
>  9 files changed, 303 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index b1ac47eb786e..240b19bb4667 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -85,17 +85,22 @@ struct nvkm_therm {
>
> int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type);
> int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int);
> +
> +   bool clkgating_enabled;
>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
>  int nvkm_therm_fan_sense(struct nvkm_therm *);
>  int nvkm_therm_cstate(struct nvkm_therm *, int, int);
> +void nvkm_therm_clkgate_enable(struct nvkm_therm *);
> +void nvkm_therm_clkgate_fini(struct nvkm_therm *, bool);
>
>  int nv40_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> +int gk104_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm200_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gp100_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> index 08e77cd55e6e..74bd09b1c893 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> @@ -28,6 +28,7 @@
>  #include 
>
>  #include 
> +#include 
>
>  static DEFINE_MUTEX(nv_devices_mutex);
>  static LIST_HEAD(nv_devices);
> @@ -1682,7 +1683,7 @@ nve4_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top = gk104_top_new,
> .volt = gk104_volt_new,
> @@ -1721,7 +1722,7 @@ nve6_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top = gk104_top_new,
> .volt = gk104_volt_new,
> @@ -1760,7 +1761,7 @@ nve7_chipset = {
> .mxm = nv50_mxm_new,
> .pci = gk104_pci_new,
> .pmu = gk104_pmu_new,
> -   .therm = gf119_therm_new,
> +   .therm = gk104_therm_new,
> .timer = nv41_timer_new,
> .top = gk104_top_new,
> .volt = 

Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon

2018-01-23 Thread Karol Herbst
Reviewed-by: Karol Herbst <kher...@redhat.com>

On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt <lui...@kernel.org> wrote:
> The trailing semicolon is an empty statement that does no operation.
> Removing it since it doesn't do anything.
>
> Signed-off-by: Luis de Bethencourt <lui...@kernel.org>
> ---
>
> Hi,
>
> After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches
> suggested I fix it treewide [0].
>
> Best regards
> Luis
>
>
> [0] 
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html
> [1] 
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> index e35d3e17cd7c..93946dcee319 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 
> size, bool ref)
> else
> block = (size >> page[i].shift) << 
> page[i].shift;
> } else {
> -   block = (size >> page[i].shift) << page[i].shift;;
> +   block = (size >> page[i].shift) << page[i].shift;
> }
>
> /* Perform operation. */
> --
> 2.15.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon

2018-01-23 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt  wrote:
> The trailing semicolon is an empty statement that does no operation.
> Removing it since it doesn't do anything.
>
> Signed-off-by: Luis de Bethencourt 
> ---
>
> Hi,
>
> After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches
> suggested I fix it treewide [0].
>
> Best regards
> Luis
>
>
> [0] 
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html
> [1] 
> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> index e35d3e17cd7c..93946dcee319 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 
> size, bool ref)
> else
> block = (size >> page[i].shift) << 
> page[i].shift;
> } else {
> -   block = (size >> page[i].shift) << page[i].shift;;
> +   block = (size >> page[i].shift) << page[i].shift;
> }
>
> /* Perform operation. */
> --
> 2.15.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-12-12 Thread Karol Herbst
Hi Peter,

the basic idea is to detect if a driver accesses a memory region
mapped through ioremap. This is super usefull for reverse engineering
closed source drivers like the Nvidia GPU driver.

So here is what it does:
1. on ioremap the entire memory region mapped is registered in the
mmiotracer and marked as not presen, which basically leads to page
faults on acces
2. mmiotrace is the registered page fault handler for those pages and
while handling the page (which basically means marking them as presen,
because they were never missing in the first place) it parses the
current instruction to detect if it was a read or write and writes
relevant information into a file. This includes address accessed,
value read/written, type of instruction
3. after single stepping, the page is marked as not present again
4. on unmap time, mmiotrace unregisteres those regions and marks them as present

this is more or less the basic idea.

And to answer your question how it is not completely broken: I don't
know. It works for us (more or less, we can't parse repeat
instructions as one example what does not work) and if we come across
issues we try to fix them on the way.

Anyway, this is a super useful tool to record and debug what a driver
is doing with hardware and helps tracking down a lot of this,
especially for Nouveau.

I hope that helps.

On Tue, Dec 12, 2017 at 3:04 PM, Ingo Molnar <mi...@kernel.org> wrote:
>
> * Peter Zijlstra <pet...@infradead.org> wrote:
>
>> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
>> > Commit-ID:  6d60ce384d1d5ca32b595244db4077a419acc687
>> > Gitweb: 
>> > https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
>> > Author: Karol Herbst <kher...@redhat.com>
>> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
>> > Committer:  Ingo Molnar <mi...@kernel.org>
>> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
>> >
>> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
>>
>> OK, let me hijack this thread since apparently people use and care about
>> mmiotrace.
>>
>> I was recently auditing the x86 tlb flushing and ran across this
>> 'thing'. Can someone please explain to me how this is supposed to work
>> and how its not completely broken?
>
> (I have Cc:-ed other gents as well.)
>
> Thanks,
>
> Ingo


Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-12-12 Thread Karol Herbst
Hi Peter,

the basic idea is to detect if a driver accesses a memory region
mapped through ioremap. This is super usefull for reverse engineering
closed source drivers like the Nvidia GPU driver.

So here is what it does:
1. on ioremap the entire memory region mapped is registered in the
mmiotracer and marked as not presen, which basically leads to page
faults on acces
2. mmiotrace is the registered page fault handler for those pages and
while handling the page (which basically means marking them as presen,
because they were never missing in the first place) it parses the
current instruction to detect if it was a read or write and writes
relevant information into a file. This includes address accessed,
value read/written, type of instruction
3. after single stepping, the page is marked as not present again
4. on unmap time, mmiotrace unregisteres those regions and marks them as present

this is more or less the basic idea.

And to answer your question how it is not completely broken: I don't
know. It works for us (more or less, we can't parse repeat
instructions as one example what does not work) and if we come across
issues we try to fix them on the way.

Anyway, this is a super useful tool to record and debug what a driver
is doing with hardware and helps tracking down a lot of this,
especially for Nouveau.

I hope that helps.

On Tue, Dec 12, 2017 at 3:04 PM, Ingo Molnar  wrote:
>
> * Peter Zijlstra  wrote:
>
>> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
>> > Commit-ID:  6d60ce384d1d5ca32b595244db4077a419acc687
>> > Gitweb: 
>> > https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
>> > Author: Karol Herbst 
>> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
>> > Committer:  Ingo Molnar 
>> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
>> >
>> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
>>
>> OK, let me hijack this thread since apparently people use and care about
>> mmiotrace.
>>
>> I was recently auditing the x86 tlb flushing and ran across this
>> 'thing'. Can someone please explain to me how this is supposed to work
>> and how its not completely broken?
>
> (I have Cc:-ed other gents as well.)
>
> Thanks,
>
> Ingo


[tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-12-12 Thread tip-bot for Karol Herbst
Commit-ID:  6d60ce384d1d5ca32b595244db4077a419acc687
Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
Author: Karol Herbst <kher...@redhat.com>
AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Mon, 11 Dec 2017 15:35:18 +0100

x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

If something calls ioremap() with an address not aligned to PAGE_SIZE, the
returned address might be not aligned as well. This led to a probe
registered on exactly the returned address, but the entire page was armed
for mmiotracing.

On calling iounmap() the address passed to unregister_kmmio_probe() was
PAGE_SIZE aligned by the caller leading to a complete freeze of the
machine.

We should always page align addresses while (un)registerung mappings,
because the mmiotracer works on top of pages, not mappings. We still keep
track of the probes based on their real addresses and lengths though,
because the mmiotrace still needs to know what are mapped memory regions.

Also move the call to mmiotrace_iounmap() prior page aligning the address,
so that all probes are unregistered properly, otherwise the kernel ends up
failing memory allocations randomly after disabling the mmiotracer.

Tested-by: Lyude <ly...@redhat.com>
Signed-off-by: Karol Herbst <kher...@redhat.com>
Acked-by: Pekka Paalanen <ppaala...@gmail.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: nouv...@lists.freedesktop.org
Link: http://lkml.kernel.org/r/20171127075139.4928-1-kher...@redhat.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/mm/ioremap.c |  4 ++--
 arch/x86/mm/kmmio.c   | 12 +++-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 6e4573b..c45b6ec 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -404,11 +404,11 @@ void iounmap(volatile void __iomem *addr)
return;
}
 
+   mmiotrace_iounmap(addr);
+
addr = (volatile void __iomem *)
(PAGE_MASK & (unsigned long __force)addr);
 
-   mmiotrace_iounmap(addr);
-
/* Use the vm area unlocked, assuming the caller
   ensures there isn't another iounmap for the same address
   in parallel. Reuse of the virtual address is prevented by
diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index c21c2ed..58477ec 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p)
unsigned long flags;
int ret = 0;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
unsigned int l;
pte_t *pte;
 
spin_lock_irqsave(_lock, flags);
-   if (get_kmmio_probe(p->addr)) {
+   if (get_kmmio_probe(addr)) {
ret = -EEXIST;
goto out;
}
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte) {
ret = -EINVAL;
goto out;
@@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p)
kmmio_count++;
list_add_rcu(>list, _probes);
while (size < size_lim) {
-   if (add_kmmio_fault_page(p->addr + size))
+   if (add_kmmio_fault_page(addr + size))
pr_err("Unable to set page fault.\n");
size += page_level_size(l);
}
@@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 {
unsigned long flags;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
struct kmmio_fault_page *release_list = NULL;
struct kmmio_delayed_release *drelease;
unsigned int l;
pte_t *pte;
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte)
return;
 
spin_lock_irqsave(_lock, flags);
while (size < size_lim) {
-   release_kmmio_fault_page(p->addr + size, _list);
+   release_kmmio_fault_page(addr + size, _list);
size += page_level_size(l);
}
list_del_rcu(>list);


[tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-12-12 Thread tip-bot for Karol Herbst
Commit-ID:  6d60ce384d1d5ca32b595244db4077a419acc687
Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
Author: Karol Herbst 
AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
Committer:  Ingo Molnar 
CommitDate: Mon, 11 Dec 2017 15:35:18 +0100

x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

If something calls ioremap() with an address not aligned to PAGE_SIZE, the
returned address might be not aligned as well. This led to a probe
registered on exactly the returned address, but the entire page was armed
for mmiotracing.

On calling iounmap() the address passed to unregister_kmmio_probe() was
PAGE_SIZE aligned by the caller leading to a complete freeze of the
machine.

We should always page align addresses while (un)registerung mappings,
because the mmiotracer works on top of pages, not mappings. We still keep
track of the probes based on their real addresses and lengths though,
because the mmiotrace still needs to know what are mapped memory regions.

Also move the call to mmiotrace_iounmap() prior page aligning the address,
so that all probes are unregistered properly, otherwise the kernel ends up
failing memory allocations randomly after disabling the mmiotracer.

Tested-by: Lyude 
Signed-off-by: Karol Herbst 
Acked-by: Pekka Paalanen 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Steven Rostedt 
Cc: Thomas Gleixner 
Cc: nouv...@lists.freedesktop.org
Link: http://lkml.kernel.org/r/20171127075139.4928-1-kher...@redhat.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/ioremap.c |  4 ++--
 arch/x86/mm/kmmio.c   | 12 +++-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 6e4573b..c45b6ec 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -404,11 +404,11 @@ void iounmap(volatile void __iomem *addr)
return;
}
 
+   mmiotrace_iounmap(addr);
+
addr = (volatile void __iomem *)
(PAGE_MASK & (unsigned long __force)addr);
 
-   mmiotrace_iounmap(addr);
-
/* Use the vm area unlocked, assuming the caller
   ensures there isn't another iounmap for the same address
   in parallel. Reuse of the virtual address is prevented by
diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index c21c2ed..58477ec 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p)
unsigned long flags;
int ret = 0;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
unsigned int l;
pte_t *pte;
 
spin_lock_irqsave(_lock, flags);
-   if (get_kmmio_probe(p->addr)) {
+   if (get_kmmio_probe(addr)) {
ret = -EEXIST;
goto out;
}
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte) {
ret = -EINVAL;
goto out;
@@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p)
kmmio_count++;
list_add_rcu(>list, _probes);
while (size < size_lim) {
-   if (add_kmmio_fault_page(p->addr + size))
+   if (add_kmmio_fault_page(addr + size))
pr_err("Unable to set page fault.\n");
size += page_level_size(l);
}
@@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 {
unsigned long flags;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
struct kmmio_fault_page *release_list = NULL;
struct kmmio_delayed_release *drelease;
unsigned int l;
pte_t *pte;
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte)
return;
 
spin_lock_irqsave(_lock, flags);
while (size < size_lim) {
-   release_kmmio_fault_page(p->addr + size, _list);
+   release_kmmio_fault_page(addr + size, _list);
size += page_level_size(l);
}
list_del_rcu(>list);


Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings

2017-12-06 Thread Karol Herbst
Reviewed-by: Karol Herbst <kher...@redhat.com>

On Thu, Nov 30, 2017 at 8:53 PM, Christoph Böhmwalder
<christ...@boehmwalder.at> wrote:
> The kbuild test bot complained about a new coccinelle warning nearby,
> which sparked a discussion about the assignment to 'memory' inside of
> the conditional expression.  See Link below for the original post.
>
> Fix the assignment to silence the coccinelle warning and also make the
> code look a little nicer.
>
> Link: https://lists.freedesktop.org/archives/nouveau/2017-November/029242.html
> Signed-off-by: Christoph Böhmwalder <christ...@boehmwalder.at>
> ---
>  drm/nouveau/nvkm/subdev/mmu/uvmm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drm/nouveau/nvkm/subdev/mmu/uvmm.c 
> b/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> index fa81d0c1..37b201b9 100644
> --- a/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> +++ b/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> @@ -106,7 +106,8 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvmm, void *argv, 
> u32 argc)
> } else
> return ret;
>
> -   if (IS_ERR((memory = nvkm_umem_search(client, handle {
> +   memory = nvkm_umem_search(client, handle);
> +   if (IS_ERR(memory)) {
> VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, 
> PTR_ERR(memory));
> return PTR_ERR(memory);
> }
> --
> 2.13.6
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings

2017-12-06 Thread Karol Herbst
Reviewed-by: Karol Herbst 

On Thu, Nov 30, 2017 at 8:53 PM, Christoph Böhmwalder
 wrote:
> The kbuild test bot complained about a new coccinelle warning nearby,
> which sparked a discussion about the assignment to 'memory' inside of
> the conditional expression.  See Link below for the original post.
>
> Fix the assignment to silence the coccinelle warning and also make the
> code look a little nicer.
>
> Link: https://lists.freedesktop.org/archives/nouveau/2017-November/029242.html
> Signed-off-by: Christoph Böhmwalder 
> ---
>  drm/nouveau/nvkm/subdev/mmu/uvmm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drm/nouveau/nvkm/subdev/mmu/uvmm.c 
> b/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> index fa81d0c1..37b201b9 100644
> --- a/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> +++ b/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> @@ -106,7 +106,8 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvmm, void *argv, 
> u32 argc)
> } else
> return ret;
>
> -   if (IS_ERR((memory = nvkm_umem_search(client, handle {
> +   memory = nvkm_umem_search(client, handle);
> +   if (IS_ERR(memory)) {
> VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, 
> PTR_ERR(memory));
> return PTR_ERR(memory);
> }
> --
> 2.13.6
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings (fwd)

2017-11-28 Thread Karol Herbst
Hi julia,

I think it would be better to extract the assignment out of the if
clause. Then you'll get something like this:

memory = ...;
if (IS_ERR(memory)) {

}

so, to answer your question: no, it isn't necessary.

On Tue, Nov 28, 2017 at 1:56 PM, Julia Lawall  wrote:
> This is a false positive, but I wonder if it is really necessary to put
> the assignment in the conditional test expression.
>
> julia
>
> -- Forwarded message --
> Date: Tue, 28 Nov 2017 13:23:36 +0800
> From: kbuild test robot 
> To: kbu...@01.org
> Cc: Julia Lawall 
> Subject: [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings
>
> CC: kbuild-...@01.org
> CC: linux-kernel@vger.kernel.org
> TO: Ben Skeggs 
> CC: David Airlie 
> CC: dri-de...@lists.freedesktop.org
> CC: nouv...@lists.freedesktop.org
> CC: linux-kernel@vger.kernel.org
>
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR 
> and PTR_ERR on line 110.
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR 
> and PTR_ERR on line 111.
>
>  PTR_ERR should access the value just tested by IS_ERR
>
> Semantic patch information:
>  There can be false positives in the patch case, where it is the call to
>  IS_ERR that is wrong.
>
> Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci
>
> Fixes: 920d2b5ef215 ("drm/nouveau/mmu: define user interfaces to mmu vmm 
> opertaions")
> Signed-off-by: Fengguang Wu 
> ---
>
> Please take the patch only if it's a positive warning. Thanks!
>
>  uvmm.c |5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> @@ -107,8 +107,9 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvm
> return ret;
>
> if (IS_ERR((memory = nvkm_umem_search(client, handle {
> -   VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, 
> PTR_ERR(memory));
> -   return PTR_ERR(memory);
> +   VMM_DEBUG(vmm, "memory %016llx %ld\n", handle,
> + PTR_ERR((memory = nvkm_umem_search(client, 
> handle;
> +   return PTR_ERR((memory = nvkm_umem_search(client, handle)));
> }
>
> mutex_lock(>mutex);
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings (fwd)

2017-11-28 Thread Karol Herbst
Hi julia,

I think it would be better to extract the assignment out of the if
clause. Then you'll get something like this:

memory = ...;
if (IS_ERR(memory)) {

}

so, to answer your question: no, it isn't necessary.

On Tue, Nov 28, 2017 at 1:56 PM, Julia Lawall  wrote:
> This is a false positive, but I wonder if it is really necessary to put
> the assignment in the conditional test expression.
>
> julia
>
> -- Forwarded message --
> Date: Tue, 28 Nov 2017 13:23:36 +0800
> From: kbuild test robot 
> To: kbu...@01.org
> Cc: Julia Lawall 
> Subject: [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings
>
> CC: kbuild-...@01.org
> CC: linux-kernel@vger.kernel.org
> TO: Ben Skeggs 
> CC: David Airlie 
> CC: dri-de...@lists.freedesktop.org
> CC: nouv...@lists.freedesktop.org
> CC: linux-kernel@vger.kernel.org
>
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR 
> and PTR_ERR on line 110.
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR 
> and PTR_ERR on line 111.
>
>  PTR_ERR should access the value just tested by IS_ERR
>
> Semantic patch information:
>  There can be false positives in the patch case, where it is the call to
>  IS_ERR that is wrong.
>
> Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci
>
> Fixes: 920d2b5ef215 ("drm/nouveau/mmu: define user interfaces to mmu vmm 
> opertaions")
> Signed-off-by: Fengguang Wu 
> ---
>
> Please take the patch only if it's a positive warning. Thanks!
>
>  uvmm.c |5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
> @@ -107,8 +107,9 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvm
> return ret;
>
> if (IS_ERR((memory = nvkm_umem_search(client, handle {
> -   VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, 
> PTR_ERR(memory));
> -   return PTR_ERR(memory);
> +   VMM_DEBUG(vmm, "memory %016llx %ld\n", handle,
> + PTR_ERR((memory = nvkm_umem_search(client, 
> handle;
> +   return PTR_ERR((memory = nvkm_umem_search(client, handle)));
> }
>
> mutex_lock(>mutex);
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


[PATCH] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-11-26 Thread Karol Herbst
If something calls ioremap with an address not aligned to PAGE_SIZE, the
returned address might be not aligned as well. This led to a probe
registered on exactly the returned address, but the entire page was armed
for mmiotracing.

On calling iounmap the address passed to unregister_kmmio_probe was
PAGE_SIZE aligned by the caller leading to a complete freeze of the
machine.

We should always page align addresses while (un)registerung mappings,
because the mmiotracer works on top of pages, not mappings. We still keep
track of the probes based on their real addresses and lengths though,
because the mmiotrace still needs to know what are mapped memory regions.

Also move the call to mmiotrace_iounmap prior page aligning the address,
so that all probes are unregistered properly, otherwise the kernel ends up
failing memory allocations randomly after disabling the mmiotracer.

Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Pekka Paalanen <ppaala...@gmail.com>
Cc: nouv...@lists.freedesktop.org
Cc: x...@kernel.org
Acked-by: Pekka Paalanen <ppaala...@gmail.com>
Tested-by: Lyude <ly...@redhat.com>
Signed-off-by: Karol Herbst <kher...@redhat.com>
---
 arch/x86/mm/ioremap.c |  4 ++--
 arch/x86/mm/kmmio.c   | 12 +++-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 34f0e1847dd6..5d4c358778dd 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -349,11 +349,11 @@ void iounmap(volatile void __iomem *addr)
return;
}
 
+   mmiotrace_iounmap(addr);
+
addr = (volatile void __iomem *)
(PAGE_MASK & (unsigned long __force)addr);
 
-   mmiotrace_iounmap(addr);
-
/* Use the vm area unlocked, assuming the caller
   ensures there isn't another iounmap for the same address
   in parallel. Reuse of the virtual address is prevented by
diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index c21c2ed04612..58477ec3d66d 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p)
unsigned long flags;
int ret = 0;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
unsigned int l;
pte_t *pte;
 
spin_lock_irqsave(_lock, flags);
-   if (get_kmmio_probe(p->addr)) {
+   if (get_kmmio_probe(addr)) {
ret = -EEXIST;
goto out;
}
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte) {
ret = -EINVAL;
goto out;
@@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p)
kmmio_count++;
list_add_rcu(>list, _probes);
while (size < size_lim) {
-   if (add_kmmio_fault_page(p->addr + size))
+   if (add_kmmio_fault_page(addr + size))
pr_err("Unable to set page fault.\n");
size += page_level_size(l);
}
@@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 {
unsigned long flags;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
struct kmmio_fault_page *release_list = NULL;
struct kmmio_delayed_release *drelease;
unsigned int l;
pte_t *pte;
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte)
return;
 
spin_lock_irqsave(_lock, flags);
while (size < size_lim) {
-   release_kmmio_fault_page(p->addr + size, _list);
+   release_kmmio_fault_page(addr + size, _list);
size += page_level_size(l);
}
list_del_rcu(>list);
-- 
2.14.3



[PATCH] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

2017-11-26 Thread Karol Herbst
If something calls ioremap with an address not aligned to PAGE_SIZE, the
returned address might be not aligned as well. This led to a probe
registered on exactly the returned address, but the entire page was armed
for mmiotracing.

On calling iounmap the address passed to unregister_kmmio_probe was
PAGE_SIZE aligned by the caller leading to a complete freeze of the
machine.

We should always page align addresses while (un)registerung mappings,
because the mmiotracer works on top of pages, not mappings. We still keep
track of the probes based on their real addresses and lengths though,
because the mmiotrace still needs to know what are mapped memory regions.

Also move the call to mmiotrace_iounmap prior page aligning the address,
so that all probes are unregistered properly, otherwise the kernel ends up
failing memory allocations randomly after disabling the mmiotracer.

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Pekka Paalanen 
Cc: nouv...@lists.freedesktop.org
Cc: x...@kernel.org
Acked-by: Pekka Paalanen 
Tested-by: Lyude 
Signed-off-by: Karol Herbst 
---
 arch/x86/mm/ioremap.c |  4 ++--
 arch/x86/mm/kmmio.c   | 12 +++-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 34f0e1847dd6..5d4c358778dd 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -349,11 +349,11 @@ void iounmap(volatile void __iomem *addr)
return;
}
 
+   mmiotrace_iounmap(addr);
+
addr = (volatile void __iomem *)
(PAGE_MASK & (unsigned long __force)addr);
 
-   mmiotrace_iounmap(addr);
-
/* Use the vm area unlocked, assuming the caller
   ensures there isn't another iounmap for the same address
   in parallel. Reuse of the virtual address is prevented by
diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index c21c2ed04612..58477ec3d66d 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p)
unsigned long flags;
int ret = 0;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
unsigned int l;
pte_t *pte;
 
spin_lock_irqsave(_lock, flags);
-   if (get_kmmio_probe(p->addr)) {
+   if (get_kmmio_probe(addr)) {
ret = -EEXIST;
goto out;
}
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte) {
ret = -EINVAL;
goto out;
@@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p)
kmmio_count++;
list_add_rcu(>list, _probes);
while (size < size_lim) {
-   if (add_kmmio_fault_page(p->addr + size))
+   if (add_kmmio_fault_page(addr + size))
pr_err("Unable to set page fault.\n");
size += page_level_size(l);
}
@@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 {
unsigned long flags;
unsigned long size = 0;
+   unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
struct kmmio_fault_page *release_list = NULL;
struct kmmio_delayed_release *drelease;
unsigned int l;
pte_t *pte;
 
-   pte = lookup_address(p->addr, );
+   pte = lookup_address(addr, );
if (!pte)
return;
 
spin_lock_irqsave(_lock, flags);
while (size < size_lim) {
-   release_kmmio_fault_page(p->addr + size, _list);
+   release_kmmio_fault_page(addr + size, _list);
size += page_level_size(l);
}
list_del_rcu(>list);
-- 
2.14.3



Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning

2017-09-06 Thread Karol Herbst
On Wed, Sep 6, 2017 at 10:11 PM, Arnd Bergmann <a...@arndb.de> wrote:
> On Wed, Sep 6, 2017 at 4:20 PM, Karol Herbst <karolher...@gmail.com> wrote:
>>> In this instance, I think using multiplication is more intuitive
>>> than '&&', so I'm adding a comparison to zero instead to shut up
>>> the warning. To further improve readability, I also make the
>>> error case indented and leave the normal case as the final 'return'
>>> statement.
>>>
>>
>> I think to make perfectly clear why this check is done, we simply
>> should precompute the denominator and do something like this:
>>
>> int denominator = M * P
>> if (denominator == 0)
>>return 0;
>> return sclk * N / denominator;
>>
>> but with a better name for "denominator".
>
> I don't know what M and P actually are in this function, so I couldn't
> come up with a much better name either, how about simply 'divisor'?
>

what about "MP"? M and P are simply dividers for the PLL configuration
and there are two of them. I think "P" is the post divider.

>> I even imagine, that there
>> are some macros in the kernel for dealing with something like this
>> already, so that we could do instead:
>>
>> return AWESOME_DIV_MACRO(sclk * N, M * P)
>
> I don't think that exists, the behavior for divide-by-zero really
> depends a lot on the context, and returning zero is probably
> often not a good solution.
>
>   Arnd


Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning

2017-09-06 Thread Karol Herbst
On Wed, Sep 6, 2017 at 10:11 PM, Arnd Bergmann  wrote:
> On Wed, Sep 6, 2017 at 4:20 PM, Karol Herbst  wrote:
>>> In this instance, I think using multiplication is more intuitive
>>> than '&&', so I'm adding a comparison to zero instead to shut up
>>> the warning. To further improve readability, I also make the
>>> error case indented and leave the normal case as the final 'return'
>>> statement.
>>>
>>
>> I think to make perfectly clear why this check is done, we simply
>> should precompute the denominator and do something like this:
>>
>> int denominator = M * P
>> if (denominator == 0)
>>return 0;
>> return sclk * N / denominator;
>>
>> but with a better name for "denominator".
>
> I don't know what M and P actually are in this function, so I couldn't
> come up with a much better name either, how about simply 'divisor'?
>

what about "MP"? M and P are simply dividers for the PLL configuration
and there are two of them. I think "P" is the post divider.

>> I even imagine, that there
>> are some macros in the kernel for dealing with something like this
>> already, so that we could do instead:
>>
>> return AWESOME_DIV_MACRO(sclk * N, M * P)
>
> I don't think that exists, the behavior for divide-by-zero really
> depends a lot on the context, and returning zero is probably
> often not a good solution.
>
>   Arnd


Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning

2017-09-06 Thread Karol Herbst
On Wed, Sep 6, 2017 at 3:56 PM, Arnd Bergmann  wrote:
> gcc thinks that interpreting a multiplication result as a bool
> is confusing:
>
> drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c: In function 'read_pll':
> drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:133:8: error: '*' in boolean 
> context, suggest '&&' instead [-Werror=int-in-bool-context]
>
> In this instance, I think using multiplication is more intuitive
> than '&&', so I'm adding a comparison to zero instead to shut up
> the warning. To further improve readability, I also make the
> error case indented and leave the normal case as the final 'return'
> statement.
>

I think to make perfectly clear why this check is done, we simply
should precompute the denominator and do something like this:

int denominator = M * P
if (denominator == 0)
   return 0;
return sclk * N / denominator;

but with a better name for "denominator". I even imagine, that there
are some macros in the kernel for dealing with something like this
already, so that we could do instead:

return AWESOME_DIV_MACRO(sclk * N, M * P)

> Fixes: 7632b30e4b8b ("drm/nouveau/clk: namespace + nvidia gpu names (no 
> binary change)")
> Signed-off-by: Arnd Bergmann 
> ---
> Originally submitted on July 14, but no reply. This is the same
> patch again. The warning is currently disabled in mainline, but
> I think we can turn it back on in the future, and this change here
> seems harmless.
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> index 96e0941c8edd..04b4f4ccf186 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> @@ -130,10 +130,10 @@ read_pll(struct gt215_clk *clk, int idx, u32 pll)
> sclk = read_clk(clk, 0x10 + idx, false);
> }
>
> -   if (M * P)
> -   return sclk * N / (M * P);
> +   if (M * P == 0)
> +   return 0;
>
> -   return 0;
> +   return sclk * N / (M * P);
>  }
>
>  static int
> --
> 2.9.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning

2017-09-06 Thread Karol Herbst
On Wed, Sep 6, 2017 at 3:56 PM, Arnd Bergmann  wrote:
> gcc thinks that interpreting a multiplication result as a bool
> is confusing:
>
> drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c: In function 'read_pll':
> drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:133:8: error: '*' in boolean 
> context, suggest '&&' instead [-Werror=int-in-bool-context]
>
> In this instance, I think using multiplication is more intuitive
> than '&&', so I'm adding a comparison to zero instead to shut up
> the warning. To further improve readability, I also make the
> error case indented and leave the normal case as the final 'return'
> statement.
>

I think to make perfectly clear why this check is done, we simply
should precompute the denominator and do something like this:

int denominator = M * P
if (denominator == 0)
   return 0;
return sclk * N / denominator;

but with a better name for "denominator". I even imagine, that there
are some macros in the kernel for dealing with something like this
already, so that we could do instead:

return AWESOME_DIV_MACRO(sclk * N, M * P)

> Fixes: 7632b30e4b8b ("drm/nouveau/clk: namespace + nvidia gpu names (no 
> binary change)")
> Signed-off-by: Arnd Bergmann 
> ---
> Originally submitted on July 14, but no reply. This is the same
> patch again. The warning is currently disabled in mainline, but
> I think we can turn it back on in the future, and this change here
> seems harmless.
> ---
>  drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> index 96e0941c8edd..04b4f4ccf186 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c
> @@ -130,10 +130,10 @@ read_pll(struct gt215_clk *clk, int idx, u32 pll)
> sclk = read_clk(clk, 0x10 + idx, false);
> }
>
> -   if (M * P)
> -   return sclk * N / (M * P);
> +   if (M * P == 0)
> +   return 0;
>
> -   return 0;
> +   return sclk * N / (M * P);
>  }
>
>  static int
> --
> 2.9.0
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335

2017-07-14 Thread Karol Herbst
Yeah, we shouldn't let the machine die. Are there more WARN_ON_ONCE
usage we could convert to WARN_ONCE?

Reviewed-By: Karol Herbst <karolher...@gmail.com>

On Fri, Jul 14, 2017 at 5:05 PM, Tobias Klausmann
<tobias.johannes.klausm...@mni.thm.de> wrote:
> On 7/14/17 3:41 PM, Mike Galbraith wrote:
>>
>> On Fri, 2017-07-14 at 15:36 +0200, Mike Galbraith wrote:
>>>
>>>   All DRM did was to slip a
>>> WARN_ON_ONCE() that nouveau triggers into a kernel module where such
>>> things no longer warn, they blow the box out of the water.
>>
>> BTW, turn that irksome WARN_ON_ONCE() in drivers/gpu/drm/drm_vblank.c
>> into a WARN_ONCE(), and all is peachy, you get the warning, box lives.
>>
>> ---
>>   drivers/gpu/drm/drm_vblank.c |3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> --- a/drivers/gpu/drm/drm_vblank.c
>> +++ b/drivers/gpu/drm/drm_vblank.c
>> @@ -605,7 +605,8 @@ bool drm_calc_vbltimestamp_from_scanoutp
>>  */
>> if (mode->crtc_clock == 0) {
>> DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n",
>> pipe);
>> -   WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev));
>> +   WARN_ONCE(drm_drv_uses_atomic_modeset(dev), "%s: report
>> me.\n",
>> + dev->driver->name);
>> return false;
>> }
>
>
>
> Hey,
>
> confirmed this helps saving the box, but we still have to find the root
> cause! Backtrace with the above fix applied (and the one which came in with
> the latest drm-fixes merge)!
>
>
> [1] https://hastebin.com/uyoqifijed.http
>
> Thanks,
>
> Tobias
>Reviewed-By: Karol Herbst <karolher...@gmail.com>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335

2017-07-14 Thread Karol Herbst
Yeah, we shouldn't let the machine die. Are there more WARN_ON_ONCE
usage we could convert to WARN_ONCE?

Reviewed-By: Karol Herbst 

On Fri, Jul 14, 2017 at 5:05 PM, Tobias Klausmann
 wrote:
> On 7/14/17 3:41 PM, Mike Galbraith wrote:
>>
>> On Fri, 2017-07-14 at 15:36 +0200, Mike Galbraith wrote:
>>>
>>>   All DRM did was to slip a
>>> WARN_ON_ONCE() that nouveau triggers into a kernel module where such
>>> things no longer warn, they blow the box out of the water.
>>
>> BTW, turn that irksome WARN_ON_ONCE() in drivers/gpu/drm/drm_vblank.c
>> into a WARN_ONCE(), and all is peachy, you get the warning, box lives.
>>
>> ---
>>   drivers/gpu/drm/drm_vblank.c |3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> --- a/drivers/gpu/drm/drm_vblank.c
>> +++ b/drivers/gpu/drm/drm_vblank.c
>> @@ -605,7 +605,8 @@ bool drm_calc_vbltimestamp_from_scanoutp
>>  */
>> if (mode->crtc_clock == 0) {
>> DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n",
>> pipe);
>> -   WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev));
>> +   WARN_ONCE(drm_drv_uses_atomic_modeset(dev), "%s: report
>> me.\n",
>> + dev->driver->name);
>> return false;
>> }
>
>
>
> Hey,
>
> confirmed this helps saving the box, but we still have to find the root
> cause! Backtrace with the above fix applied (and the one which came in with
> the latest drm-fixes merge)!
>
>
> [1] https://hastebin.com/uyoqifijed.http
>
> Thanks,
>
> Tobias
>Reviewed-By: Karol Herbst 
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [PATCH] drm/nouveau: Add support for clockgating on Fermi+

2017-04-25 Thread Karol Herbst
Hi Lyude,

thanks for the great work. Just a view comments inline.

2017-04-25 20:38 GMT+02:00 Lyude :
> This adds support for enabling automatic clockgating on nvidia GPUs for
> Fermi and later generations. This saves a little bit of power, bringing
> my fermi GPU's power consumption from ~28.3W on idle to ~27W, and my
> kepler's idle power consumption from ~23.6W to ~21.65W.
>
> Similar to how the nvidia driver seems to handle this, we enable
> clockgating for each engine that supports it after it's initialization.
>
> Signed-off-by: Lyude 
> ---
>  .../gpu/drm/nouveau/include/nvkm/subdev/therm.h|  4 ++
>  drivers/gpu/drm/nouveau/nvkm/core/engine.c | 20 +-
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  | 14 ++--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild   |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c   |  2 +
>  .../gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c| 49 ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c  | 77 
> ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c  |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm107.c  |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.c  |  2 +-
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h   | 10 +++
>  11 files changed, 175 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index b268b96..904aa56 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -84,6 +84,9 @@ struct nvkm_therm {
>
> int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type);
> int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int);
> +
> +   int  (*clkgate_engine)(struct nvkm_therm *, enum nvkm_devidx);
> +   void (*clkgate_set)(struct nvkm_therm *, int gate_idx, bool enable);

remove those and have a simple "nvkm_therm_clkgate_engine" function

This way you know that every user calls this function and don't have
to check for silly function pointers like you currently do in engine.c

>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
> @@ -94,6 +97,7 @@ int nv40_therm_new(struct nvkm_device *, int, struct 
> nvkm_therm **);
>  int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> +int gf100_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  #endif
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/engine.c 
> b/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> index b6c9169..473ad3e 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> @@ -26,6 +26,7 @@
>  #include 
>
>  #include 
> +#include 
>
>  bool
>  nvkm_engine_chsw_load(struct nvkm_engine *engine)
> @@ -86,6 +87,13 @@ static int
>  nvkm_engine_fini(struct nvkm_subdev *subdev, bool suspend)
>  {
> struct nvkm_engine *engine = nvkm_engine(subdev);
> +   struct nvkm_therm *therm = subdev->device->therm;
> +   int gate_idx;
> +
> +   gate_idx = therm->clkgate_engine(therm, subdev->index);
> +   if (gate_idx != -1)
> +   therm->clkgate_set(therm, gate_idx, false);
> +

move this code inside "nvkm_therm_clkgate_engine". Nobody outside
nvkm_therm should even care about the index.

> if (engine->func->fini)
> return engine->func->fini(engine, suspend);
> return 0;
> @@ -96,12 +104,13 @@ nvkm_engine_init(struct nvkm_subdev *subdev)
>  {
> struct nvkm_engine *engine = nvkm_engine(subdev);
> struct nvkm_fb *fb = subdev->device->fb;
> +   struct nvkm_therm *therm = subdev->device->therm;
> int ret = 0, i;
> s64 time;
>
> if (!engine->usecount) {
> nvkm_trace(subdev, "init skipped, engine has no users\n");
> -   return ret;
> +   goto finish;
> }
>
> if (engine->func->oneinit && !engine->subdev.oneinit) {
> @@ -123,6 +132,15 @@ nvkm_engine_init(struct nvkm_subdev *subdev)
>
> for (i = 0; fb && i < fb->tile.regions; i++)
> nvkm_engine_tile(engine, i);
> +
> +finish:
> +   if (!ret) {
> +   int gate_idx = therm->clkgate_engine(therm, subdev->index);
> +
> +   if (gate_idx != -1)
> +   therm->clkgate_set(therm, gate_idx, true);
> +   }
> +

same code as above. More code sharing!

> return ret;
>  }
>
> diff --git 

Re: [PATCH] drm/nouveau: Add support for clockgating on Fermi+

2017-04-25 Thread Karol Herbst
Hi Lyude,

thanks for the great work. Just a view comments inline.

2017-04-25 20:38 GMT+02:00 Lyude :
> This adds support for enabling automatic clockgating on nvidia GPUs for
> Fermi and later generations. This saves a little bit of power, bringing
> my fermi GPU's power consumption from ~28.3W on idle to ~27W, and my
> kepler's idle power consumption from ~23.6W to ~21.65W.
>
> Similar to how the nvidia driver seems to handle this, we enable
> clockgating for each engine that supports it after it's initialization.
>
> Signed-off-by: Lyude 
> ---
>  .../gpu/drm/nouveau/include/nvkm/subdev/therm.h|  4 ++
>  drivers/gpu/drm/nouveau/nvkm/core/engine.c | 20 +-
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  | 14 ++--
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild   |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c   |  2 +
>  .../gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c| 49 ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c  | 77 
> ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c  |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm107.c  |  2 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.c  |  2 +-
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h   | 10 +++
>  11 files changed, 175 insertions(+), 9 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c
>  create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c
>
> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> index b268b96..904aa56 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
> @@ -84,6 +84,9 @@ struct nvkm_therm {
>
> int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type);
> int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int);
> +
> +   int  (*clkgate_engine)(struct nvkm_therm *, enum nvkm_devidx);
> +   void (*clkgate_set)(struct nvkm_therm *, int gate_idx, bool enable);

remove those and have a simple "nvkm_therm_clkgate_engine" function

This way you know that every user calls this function and don't have
to check for silly function pointers like you currently do in engine.c

>  };
>
>  int nvkm_therm_temp_get(struct nvkm_therm *);
> @@ -94,6 +97,7 @@ int nv40_therm_new(struct nvkm_device *, int, struct 
> nvkm_therm **);
>  int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
> +int gf100_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
>  #endif
> diff --git a/drivers/gpu/drm/nouveau/nvkm/core/engine.c 
> b/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> index b6c9169..473ad3e 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/core/engine.c
> @@ -26,6 +26,7 @@
>  #include 
>
>  #include 
> +#include 
>
>  bool
>  nvkm_engine_chsw_load(struct nvkm_engine *engine)
> @@ -86,6 +87,13 @@ static int
>  nvkm_engine_fini(struct nvkm_subdev *subdev, bool suspend)
>  {
> struct nvkm_engine *engine = nvkm_engine(subdev);
> +   struct nvkm_therm *therm = subdev->device->therm;
> +   int gate_idx;
> +
> +   gate_idx = therm->clkgate_engine(therm, subdev->index);
> +   if (gate_idx != -1)
> +   therm->clkgate_set(therm, gate_idx, false);
> +

move this code inside "nvkm_therm_clkgate_engine". Nobody outside
nvkm_therm should even care about the index.

> if (engine->func->fini)
> return engine->func->fini(engine, suspend);
> return 0;
> @@ -96,12 +104,13 @@ nvkm_engine_init(struct nvkm_subdev *subdev)
>  {
> struct nvkm_engine *engine = nvkm_engine(subdev);
> struct nvkm_fb *fb = subdev->device->fb;
> +   struct nvkm_therm *therm = subdev->device->therm;
> int ret = 0, i;
> s64 time;
>
> if (!engine->usecount) {
> nvkm_trace(subdev, "init skipped, engine has no users\n");
> -   return ret;
> +   goto finish;
> }
>
> if (engine->func->oneinit && !engine->subdev.oneinit) {
> @@ -123,6 +132,15 @@ nvkm_engine_init(struct nvkm_subdev *subdev)
>
> for (i = 0; fb && i < fb->tile.regions; i++)
> nvkm_engine_tile(engine, i);
> +
> +finish:
> +   if (!ret) {
> +   int gate_idx = therm->clkgate_engine(therm, subdev->index);
> +
> +   if (gate_idx != -1)
> +   therm->clkgate_set(therm, gate_idx, true);
> +   }
> +

same code as above. More code sharing!

> return ret;
>  }
>
> diff --git 

Re: [Nouveau] [PATCH] drm/nouveau: fix unknown chipset for GTX 1060

2016-12-20 Thread Karol Herbst
we already have that included in 4.10

2016-12-12 12:26 GMT+01:00 Chris Chiu :
> Nouveau driver shows unknown chipset (136000a1) for GTX 1060, so it
> only gives VGA resolution on screen. Use the same chipset as nv134
> then it shows FullHD. This commit copies fields from nv134_chipset
> to nv136_chipset for GTX 1060.
>
> Signed-off-by: Chris Chiu 
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 29 
> +++
>  1 file changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> index 7218a06..7c6eece 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> @@ -2209,6 +2209,34 @@ nv134_chipset = {
> .fifo = gp100_fifo_new,
>  };
>
> +static const struct nvkm_device_chip
> +nv136_chipset = {
> +   .name = "GP104",
> +   .bar = gf100_bar_new,
> +   .bios = nvkm_bios_new,
> +   .bus = gf100_bus_new,
> +   .devinit = gm200_devinit_new,
> +   .fb = gp104_fb_new,
> +   .fuse = gm107_fuse_new,
> +   .gpio = gk104_gpio_new,
> +   .i2c = gm200_i2c_new,
> +   .ibus = gm200_ibus_new,
> +   .imem = nv50_instmem_new,
> +   .ltc = gp100_ltc_new,
> +   .mc = gp100_mc_new,
> +   .mmu = gf100_mmu_new,
> +   .pci = gp100_pci_new,
> +   .timer = gk20a_timer_new,
> +   .top = gk104_top_new,
> +   .ce[0] = gp104_ce_new,
> +   .ce[1] = gp104_ce_new,
> +   .ce[2] = gp104_ce_new,
> +   .ce[3] = gp104_ce_new,
> +   .disp = gp104_disp_new,
> +   .dma = gf119_dma_new,
> +   .fifo = gp100_fifo_new,
> +};
> +
>  static int
>  nvkm_device_event_ctor(struct nvkm_object *object, void *data, u32 size,
>struct nvkm_notify *notify)
> @@ -2644,6 +2672,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
> case 0x12b: device->chip = _chipset; break;
> case 0x130: device->chip = _chipset; break;
> case 0x134: device->chip = _chipset; break;
> +   case 0x136: device->chip = _chipset; break;
> default:
> nvdev_error(device, "unknown chipset (%08x)\n", 
> boot0);
> goto done;
> --
> 2.1.4
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: fix unknown chipset for GTX 1060

2016-12-20 Thread Karol Herbst
we already have that included in 4.10

2016-12-12 12:26 GMT+01:00 Chris Chiu :
> Nouveau driver shows unknown chipset (136000a1) for GTX 1060, so it
> only gives VGA resolution on screen. Use the same chipset as nv134
> then it shows FullHD. This commit copies fields from nv134_chipset
> to nv136_chipset for GTX 1060.
>
> Signed-off-by: Chris Chiu 
> ---
>  drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 29 
> +++
>  1 file changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c 
> b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> index 7218a06..7c6eece 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
> @@ -2209,6 +2209,34 @@ nv134_chipset = {
> .fifo = gp100_fifo_new,
>  };
>
> +static const struct nvkm_device_chip
> +nv136_chipset = {
> +   .name = "GP104",
> +   .bar = gf100_bar_new,
> +   .bios = nvkm_bios_new,
> +   .bus = gf100_bus_new,
> +   .devinit = gm200_devinit_new,
> +   .fb = gp104_fb_new,
> +   .fuse = gm107_fuse_new,
> +   .gpio = gk104_gpio_new,
> +   .i2c = gm200_i2c_new,
> +   .ibus = gm200_ibus_new,
> +   .imem = nv50_instmem_new,
> +   .ltc = gp100_ltc_new,
> +   .mc = gp100_mc_new,
> +   .mmu = gf100_mmu_new,
> +   .pci = gp100_pci_new,
> +   .timer = gk20a_timer_new,
> +   .top = gk104_top_new,
> +   .ce[0] = gp104_ce_new,
> +   .ce[1] = gp104_ce_new,
> +   .ce[2] = gp104_ce_new,
> +   .ce[3] = gp104_ce_new,
> +   .disp = gp104_disp_new,
> +   .dma = gf119_dma_new,
> +   .fifo = gp100_fifo_new,
> +};
> +
>  static int
>  nvkm_device_event_ctor(struct nvkm_object *object, void *data, u32 size,
>struct nvkm_notify *notify)
> @@ -2644,6 +2672,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
> case 0x12b: device->chip = _chipset; break;
> case 0x130: device->chip = _chipset; break;
> case 0x134: device->chip = _chipset; break;
> +   case 0x136: device->chip = _chipset; break;
> default:
> nvdev_error(device, "unknown chipset (%08x)\n", 
> boot0);
> goto done;
> --
> 2.1.4
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: mmiotracer hangs the system

2016-11-24 Thread Karol Herbst
sorry for that, but I forgot the patch

2016-11-19 11:56 GMT+01:00 Karol Herbst <karolher...@gmail.com>:
> this is odd, I found a bug related to nouveau (modprobe/bind doesn't
> return), but that isn't related to your issue at all or maybe it is
> exactly this, cause the binding of the device doesn't return and
> depending on the kind of driver, it would hang the system... yeah,
> maybe it is the same issue.
>
> anyway, could you try to trace with the attached patch? Maybe the
> additional output would help me to verify it. Currently I am working
> on the bugfix I mentioned above and this may also fix your issue. I
> was still able to get a working mmiotrace file, even if the dvice
> binding didn't finish. Is this the same for you? (try cat
> "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this
> contains anything usefull).
>
> This really looks like an odd issue, because the mmiotracer still
> behaves as expected.
>
> 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
>> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolher...@gmail.com> wrote:
>>> sorry for the delay fixing that bug. I got occupied with other things
>>> and didn't really got to the issue again, it is on my todo list as the
>>> next item though and I hope I will be able to get a fix ready this
>>> weekend. I think I might know where the issue is, but didn't confirm
>>> it yet.
>>
>> Thanks.I'm still using revert. Feel free to Cc me when you will have
>> some material to test.
>>
>>>
>>> Again, sorry for the delay.
>>>
>>> Karol
>>>
>>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolher...@gmail.com>:
>>>> Hi again,
>>>>
>>>> I was able to get a crash/freeze/something while unbinding/binding my
>>>> nvidia gpu from nouveau.
>>>>
>>>> Guess that means something is odd. I will investigate this more over
>>>> the weekend.
>>>>
>>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
>>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolher...@gmail.com> 
>>>>> wrote:
>>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
>>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolher...@gmail.com> 
>>>>>>> wrote:
>>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>>> while fixing other issues.
>>>>>>>
>>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>>
>>>>>>
>>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>>> to reproduce this issue?
>>>>>
>>>>> Try LPSS UART device(s)
>>>>>
>>>>>>
>>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>>> that works out here as well. It seems to be related to
>>>>>> unmapping-mapping cycles.
>>>>>
>>>>> That is the only thing I would think of.
>>>>>
>>>>>>
>>>>>> Because if this only happens with the pwm-lpss driver,
>>>>>
>>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>>> intel-lpss driver.
>>>>>
>>>>>> it may be
>>>>>> really troublesome to debug, because I don't really know the code that
>>>>>> well to be sure where the issue might be.
>>>>>>
>>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>>
>>>>>>> Here you are:
>>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>>
>>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>>> arms new region for mmiotracing and that's 

Re: mmiotracer hangs the system

2016-11-24 Thread Karol Herbst
sorry for that, but I forgot the patch

2016-11-19 11:56 GMT+01:00 Karol Herbst :
> this is odd, I found a bug related to nouveau (modprobe/bind doesn't
> return), but that isn't related to your issue at all or maybe it is
> exactly this, cause the binding of the device doesn't return and
> depending on the kind of driver, it would hang the system... yeah,
> maybe it is the same issue.
>
> anyway, could you try to trace with the attached patch? Maybe the
> additional output would help me to verify it. Currently I am working
> on the bugfix I mentioned above and this may also fix your issue. I
> was still able to get a working mmiotrace file, even if the dvice
> binding didn't finish. Is this the same for you? (try cat
> "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this
> contains anything usefull).
>
> This really looks like an odd issue, because the mmiotracer still
> behaves as expected.
>
> 2016-10-22 18:02 GMT+02:00 Andy Shevchenko :
>> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst  wrote:
>>> sorry for the delay fixing that bug. I got occupied with other things
>>> and didn't really got to the issue again, it is on my todo list as the
>>> next item though and I hope I will be able to get a fix ready this
>>> weekend. I think I might know where the issue is, but didn't confirm
>>> it yet.
>>
>> Thanks.I'm still using revert. Feel free to Cc me when you will have
>> some material to test.
>>
>>>
>>> Again, sorry for the delay.
>>>
>>> Karol
>>>
>>> 2016-08-19 22:46 GMT+02:00 Karol Herbst :
>>>> Hi again,
>>>>
>>>> I was able to get a crash/freeze/something while unbinding/binding my
>>>> nvidia gpu from nouveau.
>>>>
>>>> Guess that means something is odd. I will investigate this more over
>>>> the weekend.
>>>>
>>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko :
>>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst  
>>>>> wrote:
>>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko :
>>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst  
>>>>>>> wrote:
>>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>>> while fixing other issues.
>>>>>>>
>>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>>
>>>>>>
>>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>>> to reproduce this issue?
>>>>>
>>>>> Try LPSS UART device(s)
>>>>>
>>>>>>
>>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>>> that works out here as well. It seems to be related to
>>>>>> unmapping-mapping cycles.
>>>>>
>>>>> That is the only thing I would think of.
>>>>>
>>>>>>
>>>>>> Because if this only happens with the pwm-lpss driver,
>>>>>
>>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>>> intel-lpss driver.
>>>>>
>>>>>> it may be
>>>>>> really troublesome to debug, because I don't really know the code that
>>>>>> well to be sure where the issue might be.
>>>>>>
>>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>>
>>>>>>> Here you are:
>>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>>
>>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>>
>>>>>>> Maybe. First thing happened is iounmap().
>>>&

Re: mmiotracer hangs the system

2016-11-19 Thread Karol Herbst
this is odd, I found a bug related to nouveau (modprobe/bind doesn't
return), but that isn't related to your issue at all or maybe it is
exactly this, cause the binding of the device doesn't return and
depending on the kind of driver, it would hang the system... yeah,
maybe it is the same issue.

anyway, could you try to trace with the attached patch? Maybe the
additional output would help me to verify it. Currently I am working
on the bugfix I mentioned above and this may also fix your issue. I
was still able to get a working mmiotrace file, even if the dvice
binding didn't finish. Is this the same for you? (try cat
"/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this
contains anything usefull).

This really looks like an odd issue, because the mmiotracer still
behaves as expected.

2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolher...@gmail.com> wrote:
>> sorry for the delay fixing that bug. I got occupied with other things
>> and didn't really got to the issue again, it is on my todo list as the
>> next item though and I hope I will be able to get a fix ready this
>> weekend. I think I might know where the issue is, but didn't confirm
>> it yet.
>
> Thanks.I'm still using revert. Feel free to Cc me when you will have
> some material to test.
>
>>
>> Again, sorry for the delay.
>>
>> Karol
>>
>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolher...@gmail.com>:
>>> Hi again,
>>>
>>> I was able to get a crash/freeze/something while unbinding/binding my
>>> nvidia gpu from nouveau.
>>>
>>> Guess that means something is odd. I will investigate this more over
>>> the weekend.
>>>
>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolher...@gmail.com> 
>>>> wrote:
>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>:
>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolher...@gmail.com> 
>>>>>> wrote:
>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>> while fixing other issues.
>>>>>>
>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>
>>>>>
>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>> to reproduce this issue?
>>>>
>>>> Try LPSS UART device(s)
>>>>
>>>>>
>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>> that works out here as well. It seems to be related to
>>>>> unmapping-mapping cycles.
>>>>
>>>> That is the only thing I would think of.
>>>>
>>>>>
>>>>> Because if this only happens with the pwm-lpss driver,
>>>>
>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>> intel-lpss driver.
>>>>
>>>>> it may be
>>>>> really troublesome to debug, because I don't really know the code that
>>>>> well to be sure where the issue might be.
>>>>>
>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>
>>>>>> Here you are:
>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>
>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>
>>>>>> Maybe. First thing happened is iounmap().
>>>>
>>>>
>>>> --
>>>> With Best Regards,
>>>> Andy Shevchenko
>
>
>
> --
> With Best Regards,
> Andy Shevchenko


Re: mmiotracer hangs the system

2016-11-19 Thread Karol Herbst
this is odd, I found a bug related to nouveau (modprobe/bind doesn't
return), but that isn't related to your issue at all or maybe it is
exactly this, cause the binding of the device doesn't return and
depending on the kind of driver, it would hang the system... yeah,
maybe it is the same issue.

anyway, could you try to trace with the attached patch? Maybe the
additional output would help me to verify it. Currently I am working
on the bugfix I mentioned above and this may also fix your issue. I
was still able to get a working mmiotrace file, even if the dvice
binding didn't finish. Is this the same for you? (try cat
"/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this
contains anything usefull).

This really looks like an odd issue, because the mmiotracer still
behaves as expected.

2016-10-22 18:02 GMT+02:00 Andy Shevchenko :
> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst  wrote:
>> sorry for the delay fixing that bug. I got occupied with other things
>> and didn't really got to the issue again, it is on my todo list as the
>> next item though and I hope I will be able to get a fix ready this
>> weekend. I think I might know where the issue is, but didn't confirm
>> it yet.
>
> Thanks.I'm still using revert. Feel free to Cc me when you will have
> some material to test.
>
>>
>> Again, sorry for the delay.
>>
>> Karol
>>
>> 2016-08-19 22:46 GMT+02:00 Karol Herbst :
>>> Hi again,
>>>
>>> I was able to get a crash/freeze/something while unbinding/binding my
>>> nvidia gpu from nouveau.
>>>
>>> Guess that means something is odd. I will investigate this more over
>>> the weekend.
>>>
>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko :
>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst  
>>>> wrote:
>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko :
>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst  
>>>>>> wrote:
>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>> while fixing other issues.
>>>>>>
>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>
>>>>>
>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>> to reproduce this issue?
>>>>
>>>> Try LPSS UART device(s)
>>>>
>>>>>
>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>> that works out here as well. It seems to be related to
>>>>> unmapping-mapping cycles.
>>>>
>>>> That is the only thing I would think of.
>>>>
>>>>>
>>>>> Because if this only happens with the pwm-lpss driver,
>>>>
>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>> intel-lpss driver.
>>>>
>>>>> it may be
>>>>> really troublesome to debug, because I don't really know the code that
>>>>> well to be sure where the issue might be.
>>>>>
>>>>>> So, I would able to attach dmesg in case it would be helpful.
>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>
>>>>>> Here you are:
>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>
>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>
>>>>>> Maybe. First thing happened is iounmap().
>>>>
>>>>
>>>> --
>>>> With Best Regards,
>>>> Andy Shevchenko
>
>
>
> --
> With Best Regards,
> Andy Shevchenko


Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration

2016-11-08 Thread Karol Herbst
2016-11-08 17:12 GMT+01:00 Arnd Bergmann :
> On Tuesday, November 8, 2016 5:07:01 PM CET Lukas Wunner wrote:
>> On Tue, Nov 08, 2016 at 04:52:49PM +0100, Arnd Bergmann wrote:
>> > The underlying problem is that we already have a number of other
>> > symbols that either have "depends on LEDS_CLASS" or
>> > "select LEDS_CLASS". To clean that up properly, we should either
>> > make the symbol itself hidden and only select it from other drivers,
>> > or use "depends on LEDS_CLASS" everywhere.
>> >
>> > Another option is to use the IS_REACHABLE() macro instead of IS_ENABLED()
>> > in the header file, to stub out the calls into the new file, but
>> > that can be a bit confusing.
>>
>> Why don't you just add empty inline stubs for nouveau_led_init / _fini /
>> _suspend / _resume?
>>
>
> That's what I was suggesting:
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_led.h 
> b/drivers/gpu/drm/nouveau/nouveau_led.h
> index 9c9bb6ac938e..bc5f47cb516b 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_led.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_led.h
> @@ -35,21 +35,21 @@ struct nouveau_led {
> struct led_classdev led;
>  };
>
>  static inline struct nouveau_led *
>  nouveau_led(struct drm_device *dev)
>  {
> return nouveau_drm(dev)->led;
>  }
>
>  /* nouveau_led.c */
> -#if IS_ENABLED(CONFIG_LEDS_CLASS)
> +#if IS_REACHABLE(CONFIG_LEDS_CLASS)
>  int  nouveau_led_init(struct drm_device *dev);
>  void nouveau_led_suspend(struct drm_device *dev);
>  void nouveau_led_resume(struct drm_device *dev);
>  void nouveau_led_fini(struct drm_device *dev);
>  #else
>  static inline int  nouveau_led_init(struct drm_device *dev) { return 0; };
>  static inline void nouveau_led_suspend(struct drm_device *dev) { };
>  static inline void nouveau_led_resume(struct drm_device *dev) { };
>  static inline void nouveau_led_fini(struct drm_device *dev) { };
>  #endif
>
> The downside is that now the nouveau_led_init() just won't be called
> if CONFIG_LEDS_CLASS=m and CONFIG_DRM_NOUVEAU=y, which can be
> surprising to users.

yeah, it will. I guess it is fine to force LEDS to y if nouveau is set
to y. The thinks I absolutely dislike is:
1. auto hiding of features I _want_ to have, because I would have to
enable the dependencies first, which is like super annoying if there
are somewhere else
2. preventing me from enabling something, cause the dependency is missing.

We should clarify first if we actually want to enable those features
optionally, because there isn't much of a reason not to enable the
dependencies, except embedded systems. We have a lot more stuff where
we could add things like that: hwmon, debugfs, acpi, compat and maybe
there are even more things

>
> Arnd
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration

2016-11-08 Thread Karol Herbst
2016-11-08 17:12 GMT+01:00 Arnd Bergmann :
> On Tuesday, November 8, 2016 5:07:01 PM CET Lukas Wunner wrote:
>> On Tue, Nov 08, 2016 at 04:52:49PM +0100, Arnd Bergmann wrote:
>> > The underlying problem is that we already have a number of other
>> > symbols that either have "depends on LEDS_CLASS" or
>> > "select LEDS_CLASS". To clean that up properly, we should either
>> > make the symbol itself hidden and only select it from other drivers,
>> > or use "depends on LEDS_CLASS" everywhere.
>> >
>> > Another option is to use the IS_REACHABLE() macro instead of IS_ENABLED()
>> > in the header file, to stub out the calls into the new file, but
>> > that can be a bit confusing.
>>
>> Why don't you just add empty inline stubs for nouveau_led_init / _fini /
>> _suspend / _resume?
>>
>
> That's what I was suggesting:
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_led.h 
> b/drivers/gpu/drm/nouveau/nouveau_led.h
> index 9c9bb6ac938e..bc5f47cb516b 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_led.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_led.h
> @@ -35,21 +35,21 @@ struct nouveau_led {
> struct led_classdev led;
>  };
>
>  static inline struct nouveau_led *
>  nouveau_led(struct drm_device *dev)
>  {
> return nouveau_drm(dev)->led;
>  }
>
>  /* nouveau_led.c */
> -#if IS_ENABLED(CONFIG_LEDS_CLASS)
> +#if IS_REACHABLE(CONFIG_LEDS_CLASS)
>  int  nouveau_led_init(struct drm_device *dev);
>  void nouveau_led_suspend(struct drm_device *dev);
>  void nouveau_led_resume(struct drm_device *dev);
>  void nouveau_led_fini(struct drm_device *dev);
>  #else
>  static inline int  nouveau_led_init(struct drm_device *dev) { return 0; };
>  static inline void nouveau_led_suspend(struct drm_device *dev) { };
>  static inline void nouveau_led_resume(struct drm_device *dev) { };
>  static inline void nouveau_led_fini(struct drm_device *dev) { };
>  #endif
>
> The downside is that now the nouveau_led_init() just won't be called
> if CONFIG_LEDS_CLASS=m and CONFIG_DRM_NOUVEAU=y, which can be
> surprising to users.

yeah, it will. I guess it is fine to force LEDS to y if nouveau is set
to y. The thinks I absolutely dislike is:
1. auto hiding of features I _want_ to have, because I would have to
enable the dependencies first, which is like super annoying if there
are somewhere else
2. preventing me from enabling something, cause the dependency is missing.

We should clarify first if we actually want to enable those features
optionally, because there isn't much of a reason not to enable the
dependencies, except embedded systems. We have a lot more stuff where
we could add things like that: hwmon, debugfs, acpi, compat and maybe
there are even more things

>
> Arnd
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration

2016-11-08 Thread Karol Herbst
2016-11-08 16:46 GMT+01:00 Ilia Mirkin :
> On Tue, Nov 8, 2016 at 8:56 AM, Arnd Bergmann  wrote:
>> The newly introduced LED handling for nouveau fails to link when the
>> driver is built-in but the LED subsystem is a loadable module:
>>
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_suspend':
>> tvnv17.c:(.text.nouveau_do_suspend+0x10): undefined reference to 
>> `nouveau_led_suspend'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_resume':
>> tvnv17.c:(.text.nouveau_do_resume+0xf0): undefined reference to 
>> `nouveau_led_resume'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_unload':
>> tvnv17.c:(.text.nouveau_drm_unload+0x34): undefined reference to 
>> `nouveau_led_fini'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_load':
>> tvnv17.c:(.text.nouveau_drm_load+0x7d0): undefined reference to 
>> `nouveau_led_init'
>>
>> This adds a separate Kconfig symbol for the LED support that
>> correctly tracks the dependencies.
>>
>> Fixes: 8d021d71b324 ("drm/nouveau/drm/nouveau: add a LED driver for the 
>> NVIDIA logo")
>> Signed-off-by: Arnd Bergmann 
>> ---
>>  drivers/gpu/drm/nouveau/Kbuild| 2 +-
>>  drivers/gpu/drm/nouveau/Kconfig   | 8 
>>  drivers/gpu/drm/nouveau/nouveau_led.h | 2 +-
>>  3 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
>> index fde6e3656636..5e00e911daa6 100644
>> --- a/drivers/gpu/drm/nouveau/Kbuild
>> +++ b/drivers/gpu/drm/nouveau/Kbuild
>> @@ -22,7 +22,7 @@ nouveau-$(CONFIG_DEBUG_FS) += nouveau_debugfs.o
>>  nouveau-y += nouveau_drm.o
>>  nouveau-y += nouveau_hwmon.o
>>  nouveau-$(CONFIG_COMPAT) += nouveau_ioc32.o
>> -nouveau-$(CONFIG_LEDS_CLASS) += nouveau_led.o
>> +nouveau-$(CONFIG_DRM_NOUVEAU_LED) += nouveau_led.o
>>  nouveau-y += nouveau_nvif.o
>>  nouveau-$(CONFIG_NOUVEAU_PLATFORM_DRIVER) += nouveau_platform.o
>>  nouveau-y += nouveau_usif.o # userspace <-> nvif
>> diff --git a/drivers/gpu/drm/nouveau/Kconfig 
>> b/drivers/gpu/drm/nouveau/Kconfig
>> index 78631fb61adf..715cd6f4dc31 100644
>> --- a/drivers/gpu/drm/nouveau/Kconfig
>> +++ b/drivers/gpu/drm/nouveau/Kconfig
>> @@ -46,6 +46,14 @@ config NOUVEAU_DEBUG
>>   The paranoia and spam levels will add a lot of extra checks which
>>   may potentially slow down driver operation.
>>
>> +config DRM_NOUVEAU_LED
>> +   bool "Support for logo LED"
>> +   depends on DRM_NOUVEAU && LEDS_CLASS
>> +   depends on !(DRM_NOUVEAU=y && LEDS_CLASS=m)
>> +   help
>> + Say Y here to enabling controlling the brightness of the
>> + LED behind NVIDIA logo on certain Titan cards.
>
> This is a very odd restriction... could this be written as
>
> depends on DRM_NOUVEAU
> select LEDS_CLASS
>
> Or will that not flip the LEDS_CLASS from m to y in case
> DRM_NOUVEAU=y? If not, is there a way to cause that to happen?
>
> Separately, perhaps we should just drop this LEDS_CLASS select into
> DRM_NOUVEAU? We've tended to avoid adding tons of options.
>

well, that would mean that you always need the LEDS_CLASS and maybe on
a tegra system you don't want to, so the led stuff should stay
completely optional. Don't know though.

Alex, maybe you want to clarify which dependencies should stay
optional? If nobody on your side care, then we won't care as well and
only add switches if users actually request it.

> Cheers,
>
>   -ilia
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration

2016-11-08 Thread Karol Herbst
2016-11-08 16:46 GMT+01:00 Ilia Mirkin :
> On Tue, Nov 8, 2016 at 8:56 AM, Arnd Bergmann  wrote:
>> The newly introduced LED handling for nouveau fails to link when the
>> driver is built-in but the LED subsystem is a loadable module:
>>
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_suspend':
>> tvnv17.c:(.text.nouveau_do_suspend+0x10): undefined reference to 
>> `nouveau_led_suspend'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_resume':
>> tvnv17.c:(.text.nouveau_do_resume+0xf0): undefined reference to 
>> `nouveau_led_resume'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_unload':
>> tvnv17.c:(.text.nouveau_drm_unload+0x34): undefined reference to 
>> `nouveau_led_fini'
>> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_load':
>> tvnv17.c:(.text.nouveau_drm_load+0x7d0): undefined reference to 
>> `nouveau_led_init'
>>
>> This adds a separate Kconfig symbol for the LED support that
>> correctly tracks the dependencies.
>>
>> Fixes: 8d021d71b324 ("drm/nouveau/drm/nouveau: add a LED driver for the 
>> NVIDIA logo")
>> Signed-off-by: Arnd Bergmann 
>> ---
>>  drivers/gpu/drm/nouveau/Kbuild| 2 +-
>>  drivers/gpu/drm/nouveau/Kconfig   | 8 
>>  drivers/gpu/drm/nouveau/nouveau_led.h | 2 +-
>>  3 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
>> index fde6e3656636..5e00e911daa6 100644
>> --- a/drivers/gpu/drm/nouveau/Kbuild
>> +++ b/drivers/gpu/drm/nouveau/Kbuild
>> @@ -22,7 +22,7 @@ nouveau-$(CONFIG_DEBUG_FS) += nouveau_debugfs.o
>>  nouveau-y += nouveau_drm.o
>>  nouveau-y += nouveau_hwmon.o
>>  nouveau-$(CONFIG_COMPAT) += nouveau_ioc32.o
>> -nouveau-$(CONFIG_LEDS_CLASS) += nouveau_led.o
>> +nouveau-$(CONFIG_DRM_NOUVEAU_LED) += nouveau_led.o
>>  nouveau-y += nouveau_nvif.o
>>  nouveau-$(CONFIG_NOUVEAU_PLATFORM_DRIVER) += nouveau_platform.o
>>  nouveau-y += nouveau_usif.o # userspace <-> nvif
>> diff --git a/drivers/gpu/drm/nouveau/Kconfig 
>> b/drivers/gpu/drm/nouveau/Kconfig
>> index 78631fb61adf..715cd6f4dc31 100644
>> --- a/drivers/gpu/drm/nouveau/Kconfig
>> +++ b/drivers/gpu/drm/nouveau/Kconfig
>> @@ -46,6 +46,14 @@ config NOUVEAU_DEBUG
>>   The paranoia and spam levels will add a lot of extra checks which
>>   may potentially slow down driver operation.
>>
>> +config DRM_NOUVEAU_LED
>> +   bool "Support for logo LED"
>> +   depends on DRM_NOUVEAU && LEDS_CLASS
>> +   depends on !(DRM_NOUVEAU=y && LEDS_CLASS=m)
>> +   help
>> + Say Y here to enabling controlling the brightness of the
>> + LED behind NVIDIA logo on certain Titan cards.
>
> This is a very odd restriction... could this be written as
>
> depends on DRM_NOUVEAU
> select LEDS_CLASS
>
> Or will that not flip the LEDS_CLASS from m to y in case
> DRM_NOUVEAU=y? If not, is there a way to cause that to happen?
>
> Separately, perhaps we should just drop this LEDS_CLASS select into
> DRM_NOUVEAU? We've tended to avoid adding tons of options.
>

well, that would mean that you always need the LEDS_CLASS and maybe on
a tegra system you don't want to, so the led stuff should stay
completely optional. Don't know though.

Alex, maybe you want to clarify which dependencies should stay
optional? If nobody on your side care, then we won't care as well and
only add switches if users actually request it.

> Cheers,
>
>   -ilia
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] noveau: emergency shutdown handling is overcomplex and broken

2016-10-25 Thread Karol Herbst
Thanks for the pointer.

But I don't like this patch. If you find a bug, make a bug report or
just fix it if you know the fix already. Or write something in IRC. Or
write on the Mailing list as a general question or something else

But I really don't agree on doing it this way. You would have needed
like the same amount of time to actual fix the problem.

Anyway, for adding a printk:

struct nvkm_subdev *subdev = >subdev;
nvkm_error(subdev, "message");

2016-10-25 12:50 GMT+02:00 Pavel Machek :
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> index b9703c0..adb1deb 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> @@ -120,6 +120,11 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum 
> nvkm_therm_thrs thrs,
> struct work_struct *work;
>
> work = kmalloc(sizeof(*work), GFP_ATOMIC);
> +   /* FIXME:
> +  1) this is total overkill, orderly_poweroff() 
> already
> +  uses schedule_work internally
> +  2) it would  be good to at least printk what is 
> going on
> +   */
> if (work) {
> INIT_WORK(work, nv_poweroff_work);
> schedule_work(work);
>
> GFP_ATOMIC is not reliable. Plus, see the fixme.
>
> Best regards,
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>


Re: [Nouveau] noveau: emergency shutdown handling is overcomplex and broken

2016-10-25 Thread Karol Herbst
Thanks for the pointer.

But I don't like this patch. If you find a bug, make a bug report or
just fix it if you know the fix already. Or write something in IRC. Or
write on the Mailing list as a general question or something else

But I really don't agree on doing it this way. You would have needed
like the same amount of time to actual fix the problem.

Anyway, for adding a printk:

struct nvkm_subdev *subdev = >subdev;
nvkm_error(subdev, "message");

2016-10-25 12:50 GMT+02:00 Pavel Machek :
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> index b9703c0..adb1deb 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
> @@ -120,6 +120,11 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum 
> nvkm_therm_thrs thrs,
> struct work_struct *work;
>
> work = kmalloc(sizeof(*work), GFP_ATOMIC);
> +   /* FIXME:
> +  1) this is total overkill, orderly_poweroff() 
> already
> +  uses schedule_work internally
> +  2) it would  be good to at least printk what is 
> going on
> +   */
> if (work) {
> INIT_WORK(work, nv_poweroff_work);
> schedule_work(work);
>
> GFP_ATOMIC is not reliable. Plus, see the fixme.
>
> Best regards,
> Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
>


  1   2   >