Re: [Nouveau] [PATCH 13/40] drm/nouveau/dispnv50/disp: Include header containing our prototypes
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv50/disp.c:2599:1: warning: no previous > prototype for ‘nv50_display_create’ [-Wmissing-prototypes] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c > b/drivers/gpu/drm/nouveau/dispnv50/disp.c > index 351f954989530..4905ed584ff48 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c > @@ -68,6 +68,8 @@ > > #include > > +#include "nv50_display.h" > + > > /** > * EVO channel > > */ > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH 14/40] drm/nouveau/nouveau_ioc32: File headers are not good candidates for kernel-doc
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_ioc32.c:2: warning: Cannot understand * > file mga_ioc32.c > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > index adf01ca9e035d..8ddf9b2325a42 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > @@ -1,4 +1,4 @@ > -/** > +/* > * \file mga_ioc32.c > * > * 32-bit ioctl compatibility routines for the MGA DRM. > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel >
Re: [Nouveau] [PATCH 16/40] drm/nouveau/nouveau_ioc32: Demote kernel-doc abuse to standard comment block
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'filp' not described in 'nouveau_compat_ioctl' > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'cmd' not described in 'nouveau_compat_ioctl' > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'arg' not described in 'nouveau_compat_ioctl' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > index 8ddf9b2325a42..2af3615c5205c 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > @@ -38,7 +38,7 @@ > > #include "nouveau_ioctl.h" > > -/** > +/* > * Called whenever a 32-bit process running under a 64-bit kernel > * performs an ioctl on /dev/dri/card. > * > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [Nouveau] [PATCH 15/40] drm/nouveau/nouveau_svm: Remove unused variable 'ret' from void function
On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_svm.c: In function ‘nouveau_pfns_map’: > drivers/gpu/drm/nouveau/nouveau_svm.c:810:6: warning: variable ‘ret’ set but > not used [-Wunused-but-set-variable] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_svm.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c > b/drivers/gpu/drm/nouveau/nouveau_svm.c > index 1c3f890377d2c..26af6ee915368 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_svm.c > +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c > @@ -811,7 +811,6 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct > mm_struct *mm, > unsigned long addr, u64 *pfns, unsigned long npages) > { > struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns); > - int ret; > > args->p.addr = addr; > args->p.size = npages << PAGE_SHIFT; > @@ -819,8 +818,8 @@ nouveau_pfns_map(struct nouveau_svmm *svmm, struct > mm_struct *mm, > mutex_lock(>mutex); > > svmm->vmm->vmm.object.client->super = true; > - ret = nvif_object_ioctl(>vmm->vmm.object, args, sizeof(*args) + > - npages * sizeof(args->p.phys[0]), NULL); > + nvif_object_ioctl(>vmm->vmm.object, args, sizeof(*args) + > + npages * sizeof(args->p.phys[0]), NULL); yeah mhh.. I think this one is actually fine, but it might make sense to still report something back, although in userspace we still don't care as the CL API doesn't return any error. > svmm->vmm->vmm.object.client->super = false; > > mutex_unlock(>mutex); > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 12/40] drm/nouveau/nv50_display: Remove superfluous prototype for local static functions
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following build error: > > drivers/gpu/drm/nouveau/dispnv50/disp.c:2530:1: error: conflicting types for > ‘nv50_display_fini’ > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:36:6: note: previous declaration of > ‘nv50_display_fini’ was her > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:35:6: note: previous declaration of > ‘nv50_display_init’ was here > drivers/gpu/drm/nouveau/dispnv50/disp.c:2581:1: error: static declaration of > ‘nv50_display_destroy’ follows non-static declaration > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:34:6: note: previous declaration of > ‘nv50_display_destroy’ was here > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nv50_display.h | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nv50_display.h > b/drivers/gpu/drm/nouveau/nv50_display.h > index fbd3b15583bc8..2421401d12636 100644 > --- a/drivers/gpu/drm/nouveau/nv50_display.h > +++ b/drivers/gpu/drm/nouveau/nv50_display.h > @@ -31,7 +31,4 @@ > #include "nouveau_reg.h" > > int nv50_display_create(struct drm_device *); > -void nv50_display_destroy(struct drm_device *); > -int nv50_display_init(struct drm_device *); > -void nv50_display_fini(struct drm_device *); > #endif /* __NV50_DISPLAY_H__ */ > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 11/40] drm/nouveau/dispnv50/headc57d: Make local function 'headc57d_olut' static
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv50/headc57d.c:173:1: warning: no previous > prototype for ‘headc57d_olut’ [-Wmissing-prototypes] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: Lyude Paul > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > index fd51527b56b83..bdcfd240d61c8 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > @@ -169,7 +169,7 @@ headc57d_olut_load(struct drm_color_lut *in, int size, > void __iomem *mem) > writew(readw(mem - 4), mem + 4); > } > > -bool > +static bool > headc57d_olut(struct nv50_head *head, struct nv50_head_atom *asyh, int size) > { > if (size != 0 && size != 256 && size != 1024) > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 10/40] drm/nouveau/dispnv50/disp: Remove unused variable 'ret' from function returning void
On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv50/disp.c: In function ‘nv50_mstm_cleanup’: > drivers/gpu/drm/nouveau/dispnv50/disp.c:1357:6: warning: variable ‘ret’ set > but not used [-Wunused-but-set-variable] > same comment here: we should really check if it's better to handle the error and report it back that something failed or so.. > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c > b/drivers/gpu/drm/nouveau/dispnv50/disp.c > index 4801aafd9552b..351f954989530 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c > @@ -1386,12 +1386,11 @@ nv50_mstm_cleanup(struct nv50_mstm *mstm) > { > struct nouveau_drm *drm = nouveau_drm(mstm->outp->base.base.dev); > struct drm_encoder *encoder; > - int ret; > > NV_ATOMIC(drm, "%s: mstm cleanup\n", mstm->outp->base.base.name); > - ret = drm_dp_check_act_status(>mgr); > + drm_dp_check_act_status(>mgr); > > - ret = drm_dp_update_payload_part2(>mgr); > + drm_dp_update_payload_part2(>mgr); > > drm_for_each_encoder(encoder, mstm->outp->base.base.dev) { > if (encoder->encoder_type == DRM_MODE_ENCODER_DPMST) { > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH 09/40] drm/nouveau/dispnv04/crtc: Demote non-conforming kernel-doc headers
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:38 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or > member 'crtc' not described in 'nv_crtc_mode_set_regs' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or > member 'mode' not described in 'nv_crtc_mode_set_regs' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'crtc' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'mode' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'adjusted_mode' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'x' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'y' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'old_fb' not described in 'nv_crtc_mode_set' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c > b/drivers/gpu/drm/nouveau/dispnv04/crtc.c > index f9e962fd94d0d..f9a276ea5a9e0 100644 > --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c > +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c > @@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *crtc, struct > drm_display_mode *mode) > regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00; > } > > -/** > +/* > * Sets up registers for the given mode/adjusted_mode pair. > * > * The clocks, CRTCs and outputs attached to this CRTC must be off. > @@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc, struct > drm_framebuffer *old_fb) > return ret; > } > > -/** > +/* > * Sets up registers for the given mode/adjusted_mode pair. > * > * The clocks, CRTCs and outputs attached to this CRTC must be off. > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel >
Re: [Nouveau] [PATCH 07/40] drm/nouveau/nouveau_bo: Remove unused variables 'dev'
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:37 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_bo.c: In function ‘nouveau_ttm_tt_populate’: > drivers/gpu/drm/nouveau/nouveau_bo.c:1228:17: warning: variable ‘dev’ set > but not used [-Wunused-but-set-variable] > drivers/gpu/drm/nouveau/nouveau_bo.c: In function > ‘nouveau_ttm_tt_unpopulate’: > drivers/gpu/drm/nouveau/nouveau_bo.c:1252:17: warning: variable ‘dev’ set > but not used [-Wunused-but-set-variable] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: Sumit Semwal > Cc: "Christian König" > Cc: Jeremy Kolb > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Cc: linux-me...@vger.kernel.org > Cc: linaro-mm-...@lists.linaro.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_bo.c | 4 > 1 file changed, 4 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index 3e09df0472ce4..37b3d2c10f5c5 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -1255,7 +1255,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, > { > struct ttm_tt *ttm_dma = (void *)ttm; > struct nouveau_drm *drm; > - struct device *dev; > bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); > > if (ttm_tt_is_populated(ttm)) > @@ -1268,7 +1267,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, > } > > drm = nouveau_bdev(bdev); > - dev = drm->dev->dev; > > return ttm_pool_alloc(>ttm.bdev.pool, ttm, ctx); > } > @@ -1278,14 +1276,12 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev, > struct ttm_tt *ttm) > { > struct nouveau_drm *drm; > - struct device *dev; > bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); > > if (slave) > return; > > drm = nouveau_bdev(bdev); > - dev = drm->dev->dev; > > return ttm_pool_free(>ttm.bdev.pool, ttm); > } > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 05/40] drm/nouveau/nvkm/subdev/volt/gk20a: Demote non-conformant kernel-doc headers
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:37 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 'coef' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'temp' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 't_scale' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'coef' not described in 'gk20a_volt_get_cvb_t_voltage' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > index 8c2faa9645111..ccac88da88648 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > @@ -45,7 +45,7 @@ static const struct cvb_coef gk20a_cvb_coef[] = { > /* 852 */ { 1608418, -21643, -269, 0,763, -48}, > }; > > -/** > +/* > * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) > */ > static inline int > @@ -58,7 +58,7 @@ gk20a_volt_get_cvb_voltage(int speedo, int s_scale, const > struct cvb_coef *coef) > return mv; > } > > -/** > +/* > * cvb_t_mv = > * ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) + > * ((c3 * speedo / s_scale + c4 + c5 * T / t_scale) * T / t_scale) > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [Nouveau] [PATCH 06/40] drm/nouveau/nvkm/engine/gr/gf100: Demote non-conformant kernel-doc header
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:37 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:992: warning: Function > parameter or member 'gr' not described in 'gf100_gr_wait_idle' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > index 397ff4fe9df89..69e6008f99196 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > @@ -982,7 +982,7 @@ gf100_gr_zbc_init(struct gf100_gr *gr) > } > } > > -/** > +/* > * Wait until GR goes idle. GR is considered idle if it is disabled by the > * MC (0x200) register, or GR is not busy and a context switch is not in > * progress. > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [Nouveau] [PATCH 02/40] drm/nouveau/dispnv50/disp: Remove unused variable 'ret'
On Fri, Apr 16, 2021 at 4:37 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv50/disp.c:1381:6: warning: variable ‘ret’ set > but not used [-Wunused-but-set-variable] > not a big fan of just ignoring return codes, I'd rather see it handled somehow, unless somebody knowing more about the details here says it's okay. > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c > b/drivers/gpu/drm/nouveau/dispnv50/disp.c > index 1c9c0cdf85dbc..4801aafd9552b 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c > @@ -1410,10 +1410,9 @@ nv50_mstm_prepare(struct nv50_mstm *mstm) > { > struct nouveau_drm *drm = nouveau_drm(mstm->outp->base.base.dev); > struct drm_encoder *encoder; > - int ret; > > NV_ATOMIC(drm, "%s: mstm prepare\n", mstm->outp->base.base.name); > - ret = drm_dp_update_payload_part1(>mgr); > + drm_dp_update_payload_part1(>mgr); > > drm_for_each_encoder(encoder, mstm->outp->base.base.dev) { > if (encoder->encoder_type == DRM_MODE_ENCODER_DPMST) { > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 01/40] drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc
Reviewed-by: Karol Herbst On Fri, Apr 16, 2021 at 4:37 PM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:584: warning: Function > parameter or member 'init' not described in 'init_reserved' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:611: warning: Function > parameter or member 'init' not described in 'init_done' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:622: warning: Function > parameter or member 'init' not described in 'init_io_restrict_prog' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:659: warning: Function > parameter or member 'init' not described in 'init_repeat' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:685: warning: Function > parameter or member 'init' not described in 'init_io_restrict_pll' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:725: warning: Function > parameter or member 'init' not described in 'init_end_repeat' > > NB: Trimmed for brevity (lots of these!) > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > .../gpu/drm/nouveau/nvkm/subdev/bios/init.c | 204 ++ > 1 file changed, 68 insertions(+), 136 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > index 9de74f41dcd2a..5a91dc4e5c8ec 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > @@ -575,9 +575,8 @@ init_tmds_reg(struct nvbios_init *init, u8 tmds) > * init opcode handlers > > */ > > -/** > +/* > * init_reserved - stub for various unknown/unused single-byte opcodes > - * > */ > static void > init_reserved(struct nvbios_init *init) > @@ -602,9 +601,8 @@ init_reserved(struct nvbios_init *init) > init->offset += length; > } > > -/** > +/* > * INIT_DONE - opcode 0x71 > - * > */ > static void > init_done(struct nvbios_init *init) > @@ -613,9 +611,8 @@ init_done(struct nvbios_init *init) > init->offset = 0x; > } > > -/** > +/* > * INIT_IO_RESTRICT_PROG - opcode 0x32 > - * > */ > static void > init_io_restrict_prog(struct nvbios_init *init) > @@ -650,9 +647,8 @@ init_io_restrict_prog(struct nvbios_init *init) > trace("}]\n"); > } > > -/** > +/* > * INIT_REPEAT - opcode 0x33 > - * > */ > static void > init_repeat(struct nvbios_init *init) > @@ -676,9 +672,8 @@ init_repeat(struct nvbios_init *init) > init->repeat = repeat; > } > > -/** > +/* > * INIT_IO_RESTRICT_PLL - opcode 0x34 > - * > */ > static void > init_io_restrict_pll(struct nvbios_init *init) > @@ -716,9 +711,8 @@ init_io_restrict_pll(struct nvbios_init *init) > trace("}]\n"); > } > > -/** > +/* > * INIT_END_REPEAT - opcode 0x36 > - * > */ > static void > init_end_repeat(struct nvbios_init *init) > @@ -732,9 +726,8 @@ init_end_repeat(struct nvbios_init *init) > } > } > > -/** > +/* > * INIT_COPY - opcode 0x37 > - * > */ > static void > init_copy(struct nvbios_init *init) > @@ -759,9 +752,8 @@ init_copy(struct nvbios_init *init) > init_wrvgai(init, port, index, data); > } > > -/** > +/* > * INIT_NOT - opcode 0x38 > - * > */ > static void > init_not(struct nvbios_init *init) > @@ -771,9 +763,8 @@ init_not(struct nvbios_init *init) > init_exec_inv(init); > } > > -/** > +/* > * INIT_IO_FLAG_CONDITION - opcode 0x39 > - * > */ > static void > init_io_flag_condition(struct nvbios_init *init) > @@ -788,9 +779,8 @@ init_io_flag_condition(struct nvbios_init *init) > init_exec_set(init, false); > } > > -/** > +/* > * INIT_GENERIC_CONDITION - opcode 0x3a > - * > */ > static void > init_generic_condition(struct nvbios_init *init) > @@ -840,9 +830,8 @@ init_generic_condition(struct nvbios_init *init) > } > } > > -/** > +/* > * INIT_IO_MASK_OR - opcode 0x3b > - * > */ > static void > init_io_mask_or(struct nvbios_init *init) > @@ -859,9 +848,8 @@ init_io_mask_or(struct nvbios_init *init) > init_wrvgai(init, 0x03d4, index, data &= ~(1 << or)); > } > > -/** > +/* > * INIT_IO_OR - opcode 0x3c > - * > */ > static void > init_io_or(struct nvbios_init *init) > @@ -878,9 +866,8 @@ init_i
Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails
On Tue, Apr 13, 2021 at 1:17 PM Roy Spliet wrote: > > Op 13-04-2021 om 10:48 schreef Karol Herbst: > > On Tue, Apr 13, 2021 at 10:24 AM Roy Spliet wrote: > >> > >> Op 13-04-2021 om 01:10 schreef Karol Herbst: > >>> On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet wrote: > >>>> > >>>> Hello Aaron, > >>>> > >>>> Thanks for your insights. A follow-up query and some observations > >>>> in-line. > >>>> > >>>> Op 12-04-2021 om 20:06 schreef Aaron Plattner: > >>>>> On 4/10/21 1:48 PM, Roy Spliet wrote: > >>>>>> Op 10-04-2021 om 20:23 schreef Lukas Wunner: > >>>>>>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote: > >>>>>>>> Can I ask someone with more > >>>>>>>> technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm > >>>>>>>> about > >>>>>>>> the possible challenges of nouveau taking matters into its own hand > >>>>>>>> rather > >>>>>>>> than keeping this PCI quirk around? > >>>>>>> > >>>>>>> It sounds to me like the HDA is not powered if no cable is plugged in. > >>>>>>> What is reponsible then for powering it up or down, firmware code on > >>>>>>> the GPU or in the host's BIOS? > >>>>>> > >>>>>> Sometimes the BIOS, but definitely unconditionally the PCI quirk code: > >>>>>> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289 > >>>>>> > >>>>>> (CC Aaron Plattner) > >>>>> > >>>>> My basic understanding is that the audio function stops responding > >>>>> whenever the graphics function is powered off. So the requirement here > >>>>> is that the audio driver can't try to talk to the audio function while > >>>>> the graphics function is asleep, and must trigger a graphics function > >>>>> wakeup before trying to communicate with the audio function. > >>>> > >>>> I believe that vgaswitcheroo takes care of this for us. > >>>> > >>> > >>> yeah, and also: why would the driver want to do stuff? If the GPU is > >>> turned off, there is no point in communicating with the audio device > >>> anyway. The driver should do the initial probe and leave the device be > >>> unless it's actively used. Also there is no such thing as "use the > >>> audio function, but not the graphics one" > >>> > >>>>> I think > >>>>> there are also requirements about the audio function needing to be awake > >>>>> when the graphics driver is updating the ELD, but I'm not sure. > >>>>> > >>> > >>> well, it's one physical device anyway, so technically the audio > >>> function is powered on. > >>> > >>>>> This is harder on Windows because the audio driver lives in its own > >>>>> little world doing its own thing but on Linux we can do better. > >>>>> > >>>>>>> Ideally, we should try to find out how to control HDA power from the > >>>>>>> operating system rather than trying to cooperate with whatever > >>>>>>> firmware > >>>>>>> is doing. If we have that capability, the OS should power the HDA up > >>>>>>> and down as it sees fit. > >>>>> > >>>>> After system boot, I don't think there's any firmware involved, but I'm > >>>>> not super familiar with the low-level details and it's possible the > >>>>> situation changed since I last looked at it. > >>>>> > >>>>> I think the problem with having nouveau write this quirk is that the > >>>>> kernel will need to re-probe the PCI device to notice that it has > >>>>> suddenly become a multi-function device with an audio function, and > >>>>> hotplug the audio driver. I originally looked into trying to do that but > >>>>> it was tricky because the PCI subsystem didn't really have a mechanism > >>>>> for a single-function device to become a multi-function device on the > >>>>> fly and it seemed easier to enable it early on during bus enumeration. > &g
Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails
On Tue, Apr 13, 2021 at 10:24 AM Roy Spliet wrote: > > Op 13-04-2021 om 01:10 schreef Karol Herbst: > > On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet wrote: > >> > >> Hello Aaron, > >> > >> Thanks for your insights. A follow-up query and some observations in-line. > >> > >> Op 12-04-2021 om 20:06 schreef Aaron Plattner: > >>> On 4/10/21 1:48 PM, Roy Spliet wrote: > >>>> Op 10-04-2021 om 20:23 schreef Lukas Wunner: > >>>>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote: > >>>>>> Can I ask someone with more > >>>>>> technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm > >>>>>> about > >>>>>> the possible challenges of nouveau taking matters into its own hand > >>>>>> rather > >>>>>> than keeping this PCI quirk around? > >>>>> > >>>>> It sounds to me like the HDA is not powered if no cable is plugged in. > >>>>> What is reponsible then for powering it up or down, firmware code on > >>>>> the GPU or in the host's BIOS? > >>>> > >>>> Sometimes the BIOS, but definitely unconditionally the PCI quirk code: > >>>> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289 > >>>> > >>>> (CC Aaron Plattner) > >>> > >>> My basic understanding is that the audio function stops responding > >>> whenever the graphics function is powered off. So the requirement here > >>> is that the audio driver can't try to talk to the audio function while > >>> the graphics function is asleep, and must trigger a graphics function > >>> wakeup before trying to communicate with the audio function. > >> > >> I believe that vgaswitcheroo takes care of this for us. > >> > > > > yeah, and also: why would the driver want to do stuff? If the GPU is > > turned off, there is no point in communicating with the audio device > > anyway. The driver should do the initial probe and leave the device be > > unless it's actively used. Also there is no such thing as "use the > > audio function, but not the graphics one" > > > >>> I think > >>> there are also requirements about the audio function needing to be awake > >>> when the graphics driver is updating the ELD, but I'm not sure. > >>> > > > > well, it's one physical device anyway, so technically the audio > > function is powered on. > > > >>> This is harder on Windows because the audio driver lives in its own > >>> little world doing its own thing but on Linux we can do better. > >>> > >>>>> Ideally, we should try to find out how to control HDA power from the > >>>>> operating system rather than trying to cooperate with whatever firmware > >>>>> is doing. If we have that capability, the OS should power the HDA up > >>>>> and down as it sees fit. > >>> > >>> After system boot, I don't think there's any firmware involved, but I'm > >>> not super familiar with the low-level details and it's possible the > >>> situation changed since I last looked at it. > >>> > >>> I think the problem with having nouveau write this quirk is that the > >>> kernel will need to re-probe the PCI device to notice that it has > >>> suddenly become a multi-function device with an audio function, and > >>> hotplug the audio driver. I originally looked into trying to do that but > >>> it was tricky because the PCI subsystem didn't really have a mechanism > >>> for a single-function device to become a multi-function device on the > >>> fly and it seemed easier to enable it early on during bus enumeration. > >>> That way the kernel sees both functions all the time without anything > >>> else having to be special about this configuration. > > > > Well, we do have this pci/quirk.c thing, no? Nouveau does flip the > > bit, but I am actually not sure if that's even doing something > > anymore. Maybe in the runtime_resume case it's still relevant but not > > sure _when_ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY is triggered, it does > > seem to be called even in the runtime_resume case though. > > > >> > >> Right, so for a little more context: a while ago I noticed that my > >> laptop (lucky me, Asus K501UB) has a 940M with HDA but no codec. Seems > >&g
Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails
On Mon, Apr 12, 2021 at 9:36 PM Roy Spliet wrote: > > Hello Aaron, > > Thanks for your insights. A follow-up query and some observations in-line. > > Op 12-04-2021 om 20:06 schreef Aaron Plattner: > > On 4/10/21 1:48 PM, Roy Spliet wrote: > >> Op 10-04-2021 om 20:23 schreef Lukas Wunner: > >>> On Sat, Apr 10, 2021 at 04:51:27PM +0100, Roy Spliet wrote: > Can I ask someone with more > technical knowledge of snd_hda_intel and vgaswitcheroo to brainstorm > about > the possible challenges of nouveau taking matters into its own hand > rather > than keeping this PCI quirk around? > >>> > >>> It sounds to me like the HDA is not powered if no cable is plugged in. > >>> What is reponsible then for powering it up or down, firmware code on > >>> the GPU or in the host's BIOS? > >> > >> Sometimes the BIOS, but definitely unconditionally the PCI quirk code: > >> https://github.com/torvalds/linux/blob/master/drivers/pci/quirks.c#L5289 > >> > >> (CC Aaron Plattner) > > > > My basic understanding is that the audio function stops responding > > whenever the graphics function is powered off. So the requirement here > > is that the audio driver can't try to talk to the audio function while > > the graphics function is asleep, and must trigger a graphics function > > wakeup before trying to communicate with the audio function. > > I believe that vgaswitcheroo takes care of this for us. > yeah, and also: why would the driver want to do stuff? If the GPU is turned off, there is no point in communicating with the audio device anyway. The driver should do the initial probe and leave the device be unless it's actively used. Also there is no such thing as "use the audio function, but not the graphics one" > > I think > > there are also requirements about the audio function needing to be awake > > when the graphics driver is updating the ELD, but I'm not sure. > > well, it's one physical device anyway, so technically the audio function is powered on. > > This is harder on Windows because the audio driver lives in its own > > little world doing its own thing but on Linux we can do better. > > > >>> Ideally, we should try to find out how to control HDA power from the > >>> operating system rather than trying to cooperate with whatever firmware > >>> is doing. If we have that capability, the OS should power the HDA up > >>> and down as it sees fit. > > > > After system boot, I don't think there's any firmware involved, but I'm > > not super familiar with the low-level details and it's possible the > > situation changed since I last looked at it. > > > > I think the problem with having nouveau write this quirk is that the > > kernel will need to re-probe the PCI device to notice that it has > > suddenly become a multi-function device with an audio function, and > > hotplug the audio driver. I originally looked into trying to do that but > > it was tricky because the PCI subsystem didn't really have a mechanism > > for a single-function device to become a multi-function device on the > > fly and it seemed easier to enable it early on during bus enumeration. > > That way the kernel sees both functions all the time without anything > > else having to be special about this configuration. Well, we do have this pci/quirk.c thing, no? Nouveau does flip the bit, but I am actually not sure if that's even doing something anymore. Maybe in the runtime_resume case it's still relevant but not sure _when_ DECLARE_PCI_FIXUP_CLASS_RESUME_EARLY is triggered, it does seem to be called even in the runtime_resume case though. > > Right, so for a little more context: a while ago I noticed that my > laptop (lucky me, Asus K501UB) has a 940M with HDA but no codec. Seems > legit, given how this GPU has no displays attached; they're all hooked > up to the Intel integrated GPU. That threw off the snd_hda_intel > mid-probe, and as a result didn't permit runpm, keeping the entire GPU, > PCIe bus and thus the CPU package awake. A bit of hackerly later we > decided to continue probing without a codec, and now my laptop is happy, > but... > A new problem popped up with several other NVIDIA GPUs that expose their > HDA subdevice, but somehow its inaccessible. Relevant lines from a > users' log: > > [3.031222] MXM: GUID detected in BIOS > [3.031280] ACPI BIOS Error (bug): AE_AML_PACKAGE_LIMIT, Index > (0x3) is beyond end of object (length 0x0) (20200925/exoparg2-393) > [3.031352] ACPI Error: Aborting method \_SB.PCI0.GFX0._DSM due to > previous error (AE_AML_PACKAGE_LIMIT) (20200925/psparse-529) > [3.031419] ACPI: \_SB_.PCI0.GFX0: failed to evaluate _DSM (0x300b) > [3.031424] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type > mismatch - Found [Buffer], ACPI requires [Package] (20200925/nsarguments-61) > [3.031619] pci :00:02.0: optimus capabilities: enabled, status > dynamic power, > [3.031667] ACPI BIOS Error (bug): AE_AML_PACKAGE_LIMIT, Index > (0x3) is beyond end of object
Re: [PATCH 14/19] drm/nouveau/dispnv50/headc57d: Make local function 'headc57d_olut' static
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv50/headc57d.c:173:1: warning: no previous > prototype for ‘headc57d_olut’ [-Wmissing-prototypes] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: Lyude Paul > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > index fd51527b56b83..bdcfd240d61c8 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c > @@ -169,7 +169,7 @@ headc57d_olut_load(struct drm_color_lut *in, int size, > void __iomem *mem) > writew(readw(mem - 4), mem + 4); > } > > -bool > +static bool > headc57d_olut(struct nv50_head *head, struct nv50_head_atom *asyh, int size) > { > if (size != 0 && size != 256 && size != 1024) > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel Reviewed-by: Karol Herbst
Re: [PATCH 12/19] drm/nouveau/dispnv04/crtc: Demote non-conforming kernel-doc headers
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or > member 'crtc' not described in 'nv_crtc_mode_set_regs' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:462: warning: Function parameter or > member 'mode' not described in 'nv_crtc_mode_set_regs' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'crtc' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'mode' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'adjusted_mode' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'x' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'y' not described in 'nv_crtc_mode_set' > drivers/gpu/drm/nouveau/dispnv04/crtc.c:640: warning: Function parameter or > member 'old_fb' not described in 'nv_crtc_mode_set' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c > b/drivers/gpu/drm/nouveau/dispnv04/crtc.c > index f9e962fd94d0d..f9a276ea5a9e0 100644 > --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c > +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c > @@ -449,7 +449,7 @@ nv_crtc_mode_set_vga(struct drm_crtc *crtc, struct > drm_display_mode *mode) > regp->Attribute[NV_CIO_AR_CSEL_INDEX] = 0x00; > } > > -/** > +/* > * Sets up registers for the given mode/adjusted_mode pair. > * > * The clocks, CRTCs and outputs attached to this CRTC must be off. > @@ -625,7 +625,7 @@ nv_crtc_swap_fbs(struct drm_crtc *crtc, struct > drm_framebuffer *old_fb) > return ret; > } > > -/** > +/* > * Sets up registers for the given mode/adjusted_mode pair. > * > * The clocks, CRTCs and outputs attached to this CRTC must be off. > -- > 2.27.0 > > _______ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel > Reviewed-by: Karol Herbst
Re: [PATCH 19/19] drm/nouveau/nouveau_ioc32: Demote kernel-doc abuse to standard comment block
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'filp' not described in 'nouveau_compat_ioctl' > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'cmd' not described in 'nouveau_compat_ioctl' > drivers/gpu/drm/nouveau/nouveau_ioc32.c:52: warning: Function parameter or > member 'arg' not described in 'nouveau_compat_ioctl' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > index 8ddf9b2325a42..2af3615c5205c 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > @@ -38,7 +38,7 @@ > > #include "nouveau_ioctl.h" > > -/** > +/* > * Called whenever a 32-bit process running under a 64-bit kernel > * performs an ioctl on /dev/dri/card. > * > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel > Reviewed-by: Karol Herbst
Re: [PATCH 17/19] drm/nouveau/nouveau_ioc32: File headers are not good candidates for kernel-doc
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_ioc32.c:2: warning: Cannot understand * > file mga_ioc32.c > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_ioc32.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > index adf01ca9e035d..8ddf9b2325a42 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_ioc32.c > +++ b/drivers/gpu/drm/nouveau/nouveau_ioc32.c > @@ -1,4 +1,4 @@ > -/** > +/* > * \file mga_ioc32.c > * > * 32-bit ioctl compatibility routines for the MGA DRM. > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel > Reviewed-by: Karol Herbst
Re: [PATCH 10/19] drm/nouveau/nouveau_bo: Remove unused variables 'dev'
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nouveau_bo.c: In function ‘nouveau_ttm_tt_populate’: > drivers/gpu/drm/nouveau/nouveau_bo.c:1228:17: warning: variable ‘dev’ set > but not used [-Wunused-but-set-variable] > drivers/gpu/drm/nouveau/nouveau_bo.c: In function > ‘nouveau_ttm_tt_unpopulate’: > drivers/gpu/drm/nouveau/nouveau_bo.c:1252:17: warning: variable ‘dev’ set > but not used [-Wunused-but-set-variable] > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: Sumit Semwal > Cc: "Christian König" > Cc: Jeremy Kolb > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Cc: linux-me...@vger.kernel.org > Cc: linaro-mm-...@lists.linaro.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nouveau_bo.c | 4 > 1 file changed, 4 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index 281e9ed139895..913035ab85ec2 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -1250,7 +1250,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, > { > struct ttm_tt *ttm_dma = (void *)ttm; > struct nouveau_drm *drm; > - struct device *dev; > bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); > > if (ttm_tt_is_populated(ttm)) > @@ -1263,7 +1262,6 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, > } > > drm = nouveau_bdev(bdev); > - dev = drm->dev->dev; > > return ttm_pool_alloc(>ttm.bdev.pool, ttm, ctx); > } > @@ -1273,14 +1271,12 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev, > struct ttm_tt *ttm) > { > struct nouveau_drm *drm; > - struct device *dev; > bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); > > if (slave) > return; > > drm = nouveau_bdev(bdev); > - dev = drm->dev->dev; > > return ttm_pool_free(>ttm.bdev.pool, ttm); > } > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel Reviewed-by: Karol Herbst
Re: [PATCH 15/19] drm/nouveau/nv50_display: Remove superfluous prototype for local static functions
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following build error: > > drivers/gpu/drm/nouveau/dispnv50/disp.c:2530:1: error: conflicting types for > ‘nv50_display_fini’ > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:36:6: note: previous declaration of > ‘nv50_display_fini’ was her > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:35:6: note: previous declaration of > ‘nv50_display_init’ was here > drivers/gpu/drm/nouveau/dispnv50/disp.c:2581:1: error: static declaration of > ‘nv50_display_destroy’ follows non-static declaration > In file included from drivers/gpu/drm/nouveau/dispnv50/disp.c:71: > drivers/gpu/drm/nouveau/nv50_display.h:34:6: note: previous declaration of > ‘nv50_display_destroy’ was here > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nv50_display.h | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nv50_display.h > b/drivers/gpu/drm/nouveau/nv50_display.h > index fbd3b15583bc8..2421401d12636 100644 > --- a/drivers/gpu/drm/nouveau/nv50_display.h > +++ b/drivers/gpu/drm/nouveau/nv50_display.h > @@ -31,7 +31,4 @@ > #include "nouveau_reg.h" > > int nv50_display_create(struct drm_device *); > -void nv50_display_destroy(struct drm_device *); > -int nv50_display_init(struct drm_device *); > -void nv50_display_fini(struct drm_device *); > #endif /* __NV50_DISPLAY_H__ */ > -- > 2.27.0 > > _______ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel Reviewed-by: Karol Herbst
Re: [PATCH 05/19] drm/nouveau/nvkm/subdev/volt/gk20a: Demote non-conformant kernel-doc headers
On Fri, Mar 19, 2021 at 9:24 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:53: warning: Function > parameter or member 'coef' not described in 'gk20a_volt_get_cvb_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'speedo' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'temp' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 's_scale' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 't_scale' not described in 'gk20a_volt_get_cvb_t_voltage' > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c:69: warning: Function > parameter or member 'coef' not described in 'gk20a_volt_get_cvb_t_voltage' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > index 8c2faa9645111..ccac88da88648 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gk20a.c > @@ -45,7 +45,7 @@ static const struct cvb_coef gk20a_cvb_coef[] = { > /* 852 */ { 1608418, -21643, -269, 0,763, -48}, > }; > > -/** > +/* > * cvb_mv = ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) > */ > static inline int > @@ -58,7 +58,7 @@ gk20a_volt_get_cvb_voltage(int speedo, int s_scale, const > struct cvb_coef *coef) > return mv; > } > > -/** > +/* > * cvb_t_mv = > * ((c2 * speedo / s_scale + c1) * speedo / s_scale + c0) + > * ((c3 * speedo / s_scale + c4 + c5 * T / t_scale) * T / t_scale) > -- > 2.27.0 > > _______ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel > Reviewed-by: Karol Herbst
Re: [PATCH 09/19] drm/nouveau/nvkm/engine/gr/gf100: Demote non-conformant kernel-doc header
On Fri, Mar 19, 2021 at 9:25 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:992: warning: Function > parameter or member 'gr' not described in 'gf100_gr_wait_idle' > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > index 397ff4fe9df89..69e6008f99196 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c > @@ -982,7 +982,7 @@ gf100_gr_zbc_init(struct gf100_gr *gr) > } > } > > -/** > +/* > * Wait until GR goes idle. GR is considered idle if it is disabled by the > * MC (0x200) register, or GR is not busy and a context switch is not in > * progress. > -- > 2.27.0 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel > Reviewed-by: Karol Herbst
Re: [PATCH 01/19] drm/nouveau/nvkm/subdev/bios/init: Demote obvious abuse of kernel-doc
Reviewed-by: Karol Herbst On Fri, Mar 19, 2021 at 9:24 AM Lee Jones wrote: > > Fixes the following W=1 kernel build warning(s): > > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:584: warning: Function > parameter or member 'init' not described in 'init_reserved' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:611: warning: Function > parameter or member 'init' not described in 'init_done' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:622: warning: Function > parameter or member 'init' not described in 'init_io_restrict_prog' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:659: warning: Function > parameter or member 'init' not described in 'init_repeat' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:685: warning: Function > parameter or member 'init' not described in 'init_io_restrict_pll' > drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c:725: warning: Function > parameter or member 'init' not described in 'init_end_repeat' > > NB: Trimmed for brevity (lots of these!) > > Cc: Ben Skeggs > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Lee Jones > --- > .../gpu/drm/nouveau/nvkm/subdev/bios/init.c | 204 ++ > 1 file changed, 68 insertions(+), 136 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > index 9de74f41dcd2a..5a91dc4e5c8ec 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c > @@ -575,9 +575,8 @@ init_tmds_reg(struct nvbios_init *init, u8 tmds) > * init opcode handlers > > */ > > -/** > +/* > * init_reserved - stub for various unknown/unused single-byte opcodes > - * > */ > static void > init_reserved(struct nvbios_init *init) > @@ -602,9 +601,8 @@ init_reserved(struct nvbios_init *init) > init->offset += length; > } > > -/** > +/* > * INIT_DONE - opcode 0x71 > - * > */ > static void > init_done(struct nvbios_init *init) > @@ -613,9 +611,8 @@ init_done(struct nvbios_init *init) > init->offset = 0x; > } > > -/** > +/* > * INIT_IO_RESTRICT_PROG - opcode 0x32 > - * > */ > static void > init_io_restrict_prog(struct nvbios_init *init) > @@ -650,9 +647,8 @@ init_io_restrict_prog(struct nvbios_init *init) > trace("}]\n"); > } > > -/** > +/* > * INIT_REPEAT - opcode 0x33 > - * > */ > static void > init_repeat(struct nvbios_init *init) > @@ -676,9 +672,8 @@ init_repeat(struct nvbios_init *init) > init->repeat = repeat; > } > > -/** > +/* > * INIT_IO_RESTRICT_PLL - opcode 0x34 > - * > */ > static void > init_io_restrict_pll(struct nvbios_init *init) > @@ -716,9 +711,8 @@ init_io_restrict_pll(struct nvbios_init *init) > trace("}]\n"); > } > > -/** > +/* > * INIT_END_REPEAT - opcode 0x36 > - * > */ > static void > init_end_repeat(struct nvbios_init *init) > @@ -732,9 +726,8 @@ init_end_repeat(struct nvbios_init *init) > } > } > > -/** > +/* > * INIT_COPY - opcode 0x37 > - * > */ > static void > init_copy(struct nvbios_init *init) > @@ -759,9 +752,8 @@ init_copy(struct nvbios_init *init) > init_wrvgai(init, port, index, data); > } > > -/** > +/* > * INIT_NOT - opcode 0x38 > - * > */ > static void > init_not(struct nvbios_init *init) > @@ -771,9 +763,8 @@ init_not(struct nvbios_init *init) > init_exec_inv(init); > } > > -/** > +/* > * INIT_IO_FLAG_CONDITION - opcode 0x39 > - * > */ > static void > init_io_flag_condition(struct nvbios_init *init) > @@ -788,9 +779,8 @@ init_io_flag_condition(struct nvbios_init *init) > init_exec_set(init, false); > } > > -/** > +/* > * INIT_GENERIC_CONDITION - opcode 0x3a > - * > */ > static void > init_generic_condition(struct nvbios_init *init) > @@ -840,9 +830,8 @@ init_generic_condition(struct nvbios_init *init) > } > } > > -/** > +/* > * INIT_IO_MASK_OR - opcode 0x3b > - * > */ > static void > init_io_mask_or(struct nvbios_init *init) > @@ -859,9 +848,8 @@ init_io_mask_or(struct nvbios_init *init) > init_wrvgai(init, 0x03d4, index, data &= ~(1 << or)); > } > > -/** > +/* > * INIT_IO_OR - opcode 0x3c > - * > */ > static void > init_io_or(struct nvbios_init *init) > @@ -878,9 +866,8 @@ init_i
Re: [Nouveau] nouveau regression post v5.8, still present in v5.10
fyi, there is a patch which solves a maybe related issue on your GPU, mind giving it a try before we dig further? https://gitlab.freedesktop.org/drm/nouveau/-/issues/14#note_767791 On Thu, Jan 21, 2021 at 3:33 AM Jamie Heilman wrote: > > Karol Herbst wrote: > > On Wed, Jan 6, 2021 at 4:25 AM Jamie Heilman > > wrote: > > > > > > Jamie Heilman wrote: > > > > Jamie Heilman wrote: > > > > > Karol Herbst wrote: > > > > > > do you think you'd be able to do a kernel bisect in order to > > > > > > pinpoint > > > > > > the actual commit causing it? Thanks > > > > > > > > > > No. I can't reproduce it reliably. I if I could, bisection wouldn't > > > > > be a problem but as I can't and as it can take weeks for the problem > > > > > to occur there's essentially no chance. I know it regressed roughly > > > > > in 5.8-rc1 only because that's what I was running when the first event > > > > > occured. > > > > > > > > er, 5.9.0-rc1 rather > > > > > > Actually ... I've found a way to reproduce this in hours intead of > > > weeks, so I think I may be able to bisect it after all, it's something > > > of a brute force approach and its probably doing horrible things to > > > the backlight in my poor old monitor, but just running this: > > > > > > #!/bin/sh > > > sleep 5 > > > while ! dmesg | tail | grep -q nouveau > > > do > > > xset dpms force off > > > sleep 65 > > > xdotool mousemove 1024 1024 mousemove restore > > > sleep 10 > > > done > > > > > > Does manage to trip the issue sooner than it would otherwise happen > > > with natural usage. Given that this is my primary workstation and I > > > sort of need it functional during waking hours, it'll take me a bit, > > > but I'll update folks when I have the error more dialed in. > > > > > > > huh interesting. Kind of feels like a random thing still. But I think > > in general you'd spend way too much time on this if you can't > > reproduce within seconds/minutes and then it might not point out the > > actual issue, because randomly the issue didn't appear and stuff. > > > > maybe you can tune it to have shorter pauses or something? I'd really > > try to bring down the time per cycle. > > Well I'm confident enough, at this point, to say this bisects to > 0a96099691c8 ("drm/nouveau/kms/nv50-: implement proper push buffer control > logic") > > Now... I wish I could say the bisection was straightforward and > simple, but it wasn't thanks to still not having a reproducer really > dialed in. The above script doesn't work unless I've got some normal > usage around it. It certainly triggers the issue sooner than it > otherwise would, but by itself it isn't enough. I modified it > somewhat in the hopes of capturing the rough idea of how many > itterations it would take to trigger the problem by using: > > #!/bin/sh > I=0 > trap 'echo;echo $I' 0 > trap 'exit' INT > sleep 5 > while ! dmesg | tail | grep -q nouveau > do > I=$(($I + 1)) > xset dpms force off > sleep 32 > xdotool mousemove 1 12 mousemove restore > sleep 28 > done > > but ultimately that didn't really pan out the way I'd hoped. I don't > think the itterations have all that much to do with the condition in > the end. I wanted to try applying ca386aa7155a > ("drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bug") on > top of the first bad commit becuase I sort of felt like when I was > running -rc versions that things got a bit less chaotic after that > commit landed, but it was just a gut feeling and I wanted to see if I > could support it with metrics---but no, I can't really get consistent > metrics even without that commit so I gave up, and decided to report > what I've got so far. > > > > > > I'm using git bisect start -- drivers/gpu/drm include/drm include/video > > > in an effort to make this go a bit quicker, let me know if you think > > > that's a bad idea or I should add other paths. > > > > > > > > > On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman > > > > > > wrote: > > > > > > > > > > > > > > Something between v5.8 and v5.9 has resulted in periodically > > > > > > > losing video. > > > > > > > Unfortunately, I can't reliably reproduce it, it seems to happen > > > > > > > every >
Re: [Nouveau] [PATCH] tracing: remove definition of DEBUG
Reviewed-by: Karol Herbst On Fri, Jan 15, 2021 at 11:51 PM wrote: > > From: Tom Rix > > Defining DEBUG should only be done in development. > So remove DEBUG. > > Signed-off-by: Tom Rix > --- > kernel/trace/trace_mmiotrace.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/kernel/trace/trace_mmiotrace.c b/kernel/trace/trace_mmiotrace.c > index 84582bf1ed5f..2c3c31791497 100644 > --- a/kernel/trace/trace_mmiotrace.c > +++ b/kernel/trace/trace_mmiotrace.c > @@ -5,8 +5,6 @@ > * Copyright (C) 2008 Pekka Paalanen > */ > > -#define DEBUG 1 > - > #include > #include > #include > -- > 2.27.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
[PATCH] drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices
Fixes a crash when trying to create a channel on e.g. Turing GPUs when NOUVEAU_SVM_INIT was called before. Fixes: eeaf06ac1a558 ("drm/nouveau/svm: initial support for shared virtual memory") Signed-off-by: Karol Herbst --- drivers/gpu/drm/nouveau/nouveau_svm.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 4f69e4c3dafd..1c3f890377d2 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -315,6 +315,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data, struct drm_nouveau_svm_init *args = data; int ret; + /* We need to fail if svm is disabled */ + if (!cli->drm->svm) + return -ENOSYS; + /* Allocate tracking for SVM-enabled VMM. */ if (!(svmm = kzalloc(sizeof(*svmm), GFP_KERNEL))) return -ENOMEM; -- 2.29.2
[PATCH] drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices
Fixes a crash when trying to create a channel on e.g. Turing GPUs when NOUVEAU_SVM_INIT was called before. Fixes: eeaf06ac1a558 ("drm/nouveau/svm: initial support for shared virtual memory") Signed-off-by: Karol Herbst --- drivers/gpu/drm/nouveau/nouveau_svm.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 4f69e4c3dafd..1c3f890377d2 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -315,6 +315,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data, struct drm_nouveau_svm_init *args = data; int ret; + /* We need to fail if svm is disabled */ + if (!cli->drm->svm) + return -ENOSYS; + /* Allocate tracking for SVM-enabled VMM. */ if (!(svmm = kzalloc(sizeof(*svmm), GFP_KERNEL))) return -ENOMEM; -- 2.29.2
Re: [Nouveau] nouveau regression post v5.8, still present in v5.10
On Wed, Jan 6, 2021 at 4:25 AM Jamie Heilman wrote: > > Jamie Heilman wrote: > > Jamie Heilman wrote: > > > Karol Herbst wrote: > > > > do you think you'd be able to do a kernel bisect in order to pinpoint > > > > the actual commit causing it? Thanks > > > > > > No. I can't reproduce it reliably. I if I could, bisection wouldn't > > > be a problem but as I can't and as it can take weeks for the problem > > > to occur there's essentially no chance. I know it regressed roughly > > > in 5.8-rc1 only because that's what I was running when the first event > > > occured. > > > > er, 5.9.0-rc1 rather > > Actually ... I've found a way to reproduce this in hours intead of > weeks, so I think I may be able to bisect it after all, it's something > of a brute force approach and its probably doing horrible things to > the backlight in my poor old monitor, but just running this: > > #!/bin/sh > sleep 5 > while ! dmesg | tail | grep -q nouveau > do > xset dpms force off > sleep 65 > xdotool mousemove 1024 1024 mousemove restore > sleep 10 > done > > Does manage to trip the issue sooner than it would otherwise happen > with natural usage. Given that this is my primary workstation and I > sort of need it functional during waking hours, it'll take me a bit, > but I'll update folks when I have the error more dialed in. > huh interesting. Kind of feels like a random thing still. But I think in general you'd spend way too much time on this if you can't reproduce within seconds/minutes and then it might not point out the actual issue, because randomly the issue didn't appear and stuff. maybe you can tune it to have shorter pauses or something? I'd really try to bring down the time per cycle. > I'm using git bisect start -- drivers/gpu/drm include/drm include/video > in an effort to make this go a bit quicker, let me know if you think > that's a bad idea or I should add other paths. > > > > > On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman > > > > wrote: > > > > > > > > > > Something between v5.8 and v5.9 has resulted in periodically losing > > > > > video. > > > > > Unfortunately, I can't reliably reproduce it, it seems to happen every > > > > > once in a long while---I can go weeks without an occurance, but it > > > > > always seems to happen after my workstation has been idle long enough > > > > > to screen blank and put the monitor to sleep. I'm using a single > > > > > display (Dell 2405FPW) connected via DVI, running X (Xorg 1.20.x from > > > > > Debian sid). I don't really do anything fancy, xterms, a browser or > > > > > two, play the occasional video, but like I said, I can't reliably > > > > > reproduce this. I've had it happen about 11 times since August. > > > > > > > > > > lspci -vv output is: > > > > > > > > > > 01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS > > > > > 290] (rev a1) (prog-if 00 [VGA controller]) > > > > > Subsystem: NVIDIA Corporation G86 [Quadro NVS 290] > > > > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- > > > > > ParErr- Stepping- SERR- FastB2B- DisINTx+ > > > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast > > > > > >TAbort- SERR- > > > > Latency: 0, Cache Line Size: 64 bytes > > > > > Interrupt: pin A routed to IRQ 28 > > > > > Region 0: Memory at fc00 (32-bit, non-prefetchable) > > > > > [size=16M] > > > > > Region 1: Memory at d000 (64-bit, prefetchable) > > > > > [size=256M] > > > > > Region 3: Memory at fa00 (64-bit, non-prefetchable) > > > > > [size=32M] > > > > > Region 5: I/O ports at dc80 [size=128] > > > > > Expansion ROM at 000c [disabled] [size=128K] > > > > > Capabilities: [60] Power Management version 2 > > > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA > > > > > PME(D0-,D1-,D2-,D3hot-,D3cold-) > > > > > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > > > > > Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+ > > > > > Address: fee01004 Data: 4023 > > > > > Capabilities: [78] Express (v1) Endpoint, MSI 00 > > > > >
Re: [Nouveau] nouveau regression post v5.8, still present in v5.10
do you think you'd be able to do a kernel bisect in order to pinpoint the actual commit causing it? Thanks On Sun, Dec 27, 2020 at 8:16 PM Jamie Heilman wrote: > > Something between v5.8 and v5.9 has resulted in periodically losing video. > Unfortunately, I can't reliably reproduce it, it seems to happen every > once in a long while---I can go weeks without an occurance, but it > always seems to happen after my workstation has been idle long enough > to screen blank and put the monitor to sleep. I'm using a single > display (Dell 2405FPW) connected via DVI, running X (Xorg 1.20.x from > Debian sid). I don't really do anything fancy, xterms, a browser or > two, play the occasional video, but like I said, I can't reliably > reproduce this. I've had it happen about 11 times since August. > > lspci -vv output is: > > 01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] > (rev a1) (prog-if 00 [VGA controller]) > Subsystem: NVIDIA Corporation G86 [Quadro NVS 290] > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- > Stepping- SERR- FastB2B- DisINTx+ > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- > SERR- Latency: 0, Cache Line Size: 64 bytes > Interrupt: pin A routed to IRQ 28 > Region 0: Memory at fc00 (32-bit, non-prefetchable) [size=16M] > Region 1: Memory at d000 (64-bit, prefetchable) [size=256M] > Region 3: Memory at fa00 (64-bit, non-prefetchable) [size=32M] > Region 5: I/O ports at dc80 [size=128] > Expansion ROM at 000c [disabled] [size=128K] > Capabilities: [60] Power Management version 2 > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA > PME(D0-,D1-,D2-,D3hot-,D3cold-) > Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- > Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+ > Address: fee01004 Data: 4023 > Capabilities: [78] Express (v1) Endpoint, MSI 00 > DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s > <512ns, L1 <4us > ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- > SlotPowerLimit 25.000W > DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq- > RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ > MaxPayload 128 bytes, MaxReadReq 512 bytes > DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- > TransPend- > LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit > Latency L0s <512ns, L1 <4us > ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp- > LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- > LnkSta: Speed 2.5GT/s (ok), Width x16 (ok) > TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- > Capabilities: [100 v1] Virtual Channel > Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 > Arb:Fixed- WRR32- WRR64- WRR128- > Ctrl: ArbSelect=Fixed > Status: InProgress- > VC0:Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- > Arb:Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- > Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01 > Status: NegoPending- InProgress- > Capabilities: [128 v1] Power Budgeting > Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 > Len=024 > Kernel driver in use: nouveau > > The last time this happened, this is what got logged: > > nouveau :01:00.0: disp: ERROR 5 [INVALID_STATE] 06 [] chid 1 mthd 0080 > data 0001 > nouveau :01:00.0: disp: Base 1: > nouveau :01:00.0: disp:0084: > nouveau :01:00.0: disp:0088: > nouveau :01:00.0: disp:008c: > nouveau :01:00.0: disp:0090: > nouveau :01:00.0: disp:0094: > nouveau :01:00.0: disp:00a0: 0060 -> 0070 > nouveau :01:00.0: disp:00a4: -> f000 > nouveau :01:00.0: disp:00c0: > nouveau :01:00.0: disp:00c4: > nouveau :01:00.0: disp:00c8: > nouveau :01:00.0: disp:00cc: > nouveau :01:00.0: disp:00e0: 4000 > nouveau :01:00.0: disp:00e4: > nouveau :01:00.0: disp:00e8: > nouveau :01:00.0: disp:00ec: > nouveau :01:00.0: disp:00fc: > nouveau :01:00.0: disp:0100: fffe > nouveau :01:00.0: disp:0104: > nouveau :01:00.0: disp:0110: > nouveau :01:00.0: disp:0114: > nouveau :01:00.0: disp:
Re: [Nouveau] [PATCH v2] ALSA: hda: Continue to probe when codec probe fails
On Tue, Dec 22, 2020 at 3:50 AM Kai-Heng Feng wrote: > > On Tue, Dec 22, 2020 at 1:56 AM Ilia Mirkin wrote: > > > > On Mon, Dec 21, 2020 at 11:33 AM Kai-Heng Feng > > wrote: > > > > > > [+Cc nouveau] > > > > > > On Fri, Dec 18, 2020 at 4:06 PM Takashi Iwai wrote: > > > [snip] > > > > > Quite possibly the system doesn't power up HDA controller when there's > > > > > no external monitor. > > > > > So when it's connected to external monitor, it's still needed for > > > > > HDMI audio. > > > > > Let me ask the user to confirm this. > > > > > > > > Yeah, it's the basic question whether the HD-audio is supposed to work > > > > on this machine at all. If yes, the current approach we take makes > > > > less sense - instead we should rather make the HD-audio controller > > > > working. > > > > > > Yea, confirmed that the Nvidia HDA works when HDMI is connected prior > > > boot. > > > > > > > > > - The second problem is that pci_enable_device() ignores the error > > > > > > returned from pci_set_power_state() if it's -EIO. And the > > > > > > inaccessible access error returns -EIO, although it's rather a > > > > > > fatal > > > > > > problem. So the driver believes as the PCI device gets enabled > > > > > > properly. > > > > > > > > > > This was introduced in 2005, by Alan's 11f3859b1e85 ("[PATCH] PCI: Fix > > > > > regression in pci_enable_device_bars") to fix UHCI controller. > > > > > > > > > > > > > > > > > - The third problem is that HD-audio driver blindly believes the > > > > > > codec_mask read from the register even if it's a read failure as I > > > > > > already showed. > > > > > > > > > > This approach has least regression risk. > > > > > > > > Yes, but it assumes that HD-audio is really non-existent. > > > > > > I really don't know any good approach to address this. > > > On Windows, HDA PCI is "hidden" until HDMI cable is plugged, then the > > > driver will flag the magic bit to make HDA audio appear on the PCI > > > bus. > > > IIRC the current approach is to make nouveau and device link work. > > > > I don't have the full context of this discussion, but the kernel > > force-enables the HDA subfunction nowadays, irrespective of nouveau or > > nvidia or whatever: > > That's the problem. > > The nvidia HDA controller on the affected system only gets its power > after HDMI cable plugged, so the probe on boot fails. > it might be that the code to enable the sub function is a bit broken :/ but it should work. Maybe the quirk_nvidia_hda function needs to be called on more occasions? No idea. > Kai-Heng > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/quirks.c?h=v5.10#n5267 > > > > Cheers, > > > > -ilia > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [PATCH 000/141] Fix fall-through warnings for Clang
On Thu, Nov 26, 2020 at 4:28 PM Geert Uytterhoeven wrote: > > Hi Miguel, > > On Thu, Nov 26, 2020 at 3:54 PM Miguel Ojeda > wrote: > > On Wed, Nov 25, 2020 at 11:44 PM Edward Cree wrote: > > > To make the intent clear, you have to first be certain that you > > > understand the intent; otherwise by adding either a break or a > > > fallthrough to suppress the warning you are just destroying the > > > information that "the intent of this code is unknown". > > > > If you don't know what the intent of your own code is, then you > > *already* have a problem in your hands. > > The maintainer is not necessarily the owner/author of the code, and > thus may not know the intent of the code. > > > > or does it flag up code > > > that can be mindlessly "fixed" (in which case the warning is > > > worthless)? Proponents in this thread seem to be trying to > > > have it both ways. > > > > A warning is not worthless just because you can mindlessly fix it. > > There are many counterexamples, e.g. many > > checkpatch/lint/lang-format/indentation warnings, functional ones like > > the `if (a = b)` warning... > > BTW, you cannot mindlessly fix the latter, as you cannot know if > "(a == b)" or "((a = b))" was intended, without understanding the code > (and the (possibly unavailable) data sheet, and the hardware, ...). > to allow assignments in if statements was clearly a mistake and if you need outside information to understand the code, your code is the issue already. > P.S. So far I've stayed out of this thread, as I like it if the compiler > flags possible mistakes. After all I was the one fixing new > "may be used uninitialized" warnings thrown up by gcc-4.1, until > (a bit later than) support for that compiler was removed... > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel >
Re: [Nouveau] [PATCH 2/3] drm/nouveau: manage nouveau_drm lifetime with devres
On Fri, Nov 6, 2020 at 3:17 AM Jeremy Cline wrote: > > Make use of the devm_drm_dev_alloc() API to bind the lifetime of > nouveau_drm structure to the drm_device. This is important because a > reference to nouveau_drm is accessible from drm_device, which is > provided to a number of DRM layer callbacks that can run after the > deallocation of nouveau_drm currently occurs. > > Signed-off-by: Jeremy Cline > --- > drivers/gpu/drm/nouveau/nouveau_drm.c | 44 --- > drivers/gpu/drm/nouveau/nouveau_drv.h | 10 -- > 2 files changed, 26 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c > b/drivers/gpu/drm/nouveau/nouveau_drm.c > index bc6f51bf23b7..f750c25e92f9 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c > @@ -30,9 +30,11 @@ > #include > #include > > +#include > #include > #include > #include > +#include > > #include > #include > @@ -532,13 +534,8 @@ nouveau_parent = { > static int > nouveau_drm_device_init(struct drm_device *dev) > { > - struct nouveau_drm *drm; > int ret; > - > - if (!(drm = kzalloc(sizeof(*drm), GFP_KERNEL))) > - return -ENOMEM; > - dev->dev_private = drm; > - drm->dev = dev; > + struct nouveau_drm *drm = nouveau_drm(dev); > > nvif_parent_ctor(_parent, >parent); > drm->master.base.object.parent = >parent; > @@ -620,7 +617,6 @@ nouveau_drm_device_init(struct drm_device *dev) > nouveau_cli_fini(>master); > fail_alloc: > nvif_parent_dtor(>parent); > - kfree(drm); > return ret; > } > > @@ -654,7 +650,6 @@ nouveau_drm_device_fini(struct drm_device *dev) > nouveau_cli_fini(>client); > nouveau_cli_fini(>master); > nvif_parent_dtor(>parent); > - kfree(drm); > } > > /* > @@ -720,6 +715,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev, > { > struct nvkm_device *device; > struct drm_device *drm_dev; > + struct nouveau_drm *nv_dev; > int ret; > > if (vga_switcheroo_client_probe_defer(pdev)) > @@ -750,15 +746,16 @@ static int nouveau_drm_probe(struct pci_dev *pdev, > if (nouveau_atomic) > driver_pci.driver_features |= DRIVER_ATOMIC; > > - drm_dev = drm_dev_alloc(_pci, >dev); > - if (IS_ERR(drm_dev)) { > - ret = PTR_ERR(drm_dev); > + nv_dev = devm_drm_dev_alloc(>dev, _stub, > typeof(*nv_dev), drm_dev); > + if (IS_ERR(nv_dev)) { > + ret = PTR_ERR(nv_dev); > goto fail_nvkm; > } > + drm_dev = nouveau_to_drm_dev(nv_dev); > > ret = pci_enable_device(pdev); > if (ret) > - goto fail_drm; > + goto fail_nvkm; > > drm_dev->pdev = pdev; > pci_set_drvdata(pdev, drm_dev); > @@ -778,8 +775,6 @@ static int nouveau_drm_probe(struct pci_dev *pdev, > nouveau_drm_device_fini(drm_dev); > fail_pci: > pci_disable_device(pdev); > -fail_drm: > - drm_dev_put(drm_dev); it sounded like that when using devm_drm_dev_alloc we still have an initial refcount of 1, so at least in this regard nothing changed so I am wondering why this change is necessary and if the reason is unrelated it might make sense to move it into its own patch. > fail_nvkm: > nvkm_device_del(); > return ret; > @@ -799,7 +794,6 @@ nouveau_drm_device_remove(struct drm_device *dev) > device = nvkm_device_find(client->device); > > nouveau_drm_device_fini(dev); > - drm_dev_put(dev); > nvkm_device_del(); > } > > @@ -1285,7 +1279,8 @@ nouveau_platform_device_create(const struct > nvkm_device_tegra_func *func, >struct platform_device *pdev, >struct nvkm_device **pdevice) > { > - struct drm_device *drm; > + struct nouveau_drm *nv_dev; > + struct drm_device *drm_dev; > int err; > > err = nvkm_device_tegra_new(func, pdev, nouveau_config, nouveau_debug, > @@ -1293,22 +1288,21 @@ nouveau_platform_device_create(const struct > nvkm_device_tegra_func *func, > if (err) > goto err_free; > > - drm = drm_dev_alloc(_platform, >dev); > - if (IS_ERR(drm)) { > - err = PTR_ERR(drm); > + nv_dev = devm_drm_dev_alloc(>dev, _platform, > typeof(*nv_dev), drm_dev); > + if (IS_ERR(nv_dev)) { > + err = PTR_ERR(nv_dev); > goto err_free; > } > + drm_dev = nouveau_to_drm_dev(nv_dev); > > - err = nouveau_drm_device_init(drm); > + err = nouveau_drm_device_init(drm_dev); > if (err) > - goto err_put; > + goto err_free; > > - platform_set_drvdata(pdev, drm); > + platform_set_drvdata(pdev, drm_dev); > > - return drm; > + return drm_dev; > > -err_put: > - drm_dev_put(drm); >
Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian
I sent a patch to the mailing list and wanted to have some review on that from at least Ben, but no idea if Ben already picked it and if it's good enough for sending it to stable yet.
Re: [Nouveau] [PATCH RFC] drm/nouveau: fix memory leak in nvkm_iccsense_oneinit
On Thu, Oct 15, 2020 at 6:32 PM Karol Herbst wrote: > > Ben, I think this is like the 5th patch tackling this issue, I think > we should merge one of those. > maybe I just confused that with reports, but it seems to turn up quite a bit and maybe I should have pushed more of it as well... Anyway, the patch itself looks good. Reviewed-by: Karol Herbst > On Thu, Oct 15, 2020 at 6:23 AM Keita Suzuki > wrote: > > > > struct pw_rail_t is allocated as an array in function nvios_iccsense_parse, > > and stored to a struct member of local variable. However, the array is not > > freed when the local variable becomes invalid, and the reference is not > > passed on, leading to a memory leak. > > > > Fix this by freeing struct pw_rail_t when exiting nvkm_iccsense_oneinit. > > > > Signed-off-by: Keita Suzuki > > --- > > drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c | 5 - > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > > index fecfa6afcf54..8ba8d8e3f52a 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > > @@ -280,8 +280,10 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev) > > } > > > > rail = kmalloc(sizeof(*rail), GFP_KERNEL); > > - if (!rail) > > + if (!rail) { > > + kfree(stbl.rail); > > return -ENOMEM; > > + } > > > > rail->read = read; > > rail->sensor = sensor; > > @@ -291,6 +293,7 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev) > > list_add_tail(>head, >rails); > > } > > } > > + kfree(stbl.rail); > > return 0; > > } > > > > -- > > 2.17.1 > > > > ___ > > Nouveau mailing list > > nouv...@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/nouveau > >
Re: [Nouveau] [PATCH RFC] drm/nouveau: fix memory leak in nvkm_iccsense_oneinit
Ben, I think this is like the 5th patch tackling this issue, I think we should merge one of those. On Thu, Oct 15, 2020 at 6:23 AM Keita Suzuki wrote: > > struct pw_rail_t is allocated as an array in function nvios_iccsense_parse, > and stored to a struct member of local variable. However, the array is not > freed when the local variable becomes invalid, and the reference is not > passed on, leading to a memory leak. > > Fix this by freeing struct pw_rail_t when exiting nvkm_iccsense_oneinit. > > Signed-off-by: Keita Suzuki > --- > drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > index fecfa6afcf54..8ba8d8e3f52a 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c > @@ -280,8 +280,10 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev) > } > > rail = kmalloc(sizeof(*rail), GFP_KERNEL); > - if (!rail) > + if (!rail) { > + kfree(stbl.rail); > return -ENOMEM; > + } > > rail->read = read; > rail->sensor = sensor; > @@ -291,6 +293,7 @@ nvkm_iccsense_oneinit(struct nvkm_subdev *subdev) > list_add_tail(>head, >rails); > } > } > + kfree(stbl.rail); > return 0; > } > > -- > 2.17.1 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian
On Sat, Oct 10, 2020 at 12:23 AM Ilia Mirkin wrote: > > On Fri, Oct 9, 2020 at 5:54 PM Karol Herbst wrote: > > > > On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary wrote: > > > > > > Hello, > > > I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working: > > > [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) > > > 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 > > > 22:31:40 CEST 2020 > > > ... > > > [ 14.771464] nouveau :01:00.0: GPU not supported on big-endian > > > [ 14.771782] nouveau: probe of :01:00.0 failed with error -38 > > > > > > big-endian? WTF? The machine is x86. > > > > > > > mhh, we reworked the endianess checks a bit and apparently that broke > > something... I will give it some thoughts, but could you be so kind > > and create an mmiotrace under 5.9 with nouveau? You won't need to > > start X or anything while doing it. Just enable the trace and modprobe > > nouveau and collect the trace. > > Looks like nvkm_device_endianness unconditionally reads out 0x4. I > don't think that reg is there pre-NV11. At least NV4, NV5, NV10 and > maybe NV15 (which is logically pre-NV11) don't support big-endian > mode. Not sure about NV1A, which was the IGP of the series and IIRC > logically pre-NV11 as well (but clearly could only be used with x86 > chips, since it was part of the motherboard). > > Aha, it's documented in rnndb: > > https://github.com/envytools/envytools/blob/master/rnndb/bus/pmc.xml > > ohh, I should have checked there.. yeah, will write a fix for it then. Before my patch we just always tried to switch it, but never threw an error. > -ilia >
Re: [Nouveau] nouveau broken on Riva TNT2 in 5.9.0-rc8: GPU not supported on big-endian
On Fri, Oct 9, 2020 at 11:35 PM Ondrej Zary wrote: > > Hello, > I'm testing 5.9.0-rc8 and found that Riva TNT2 stopped working: > [0.00] Linux version 5.9.0-rc8+ (zary@gsql) (gcc (Debian 8.3.0-6) > 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #326 SMP Fri Oct 9 22:31:40 > CEST 2020 > ... > [ 14.771464] nouveau :01:00.0: GPU not supported on big-endian > [ 14.771782] nouveau: probe of :01:00.0 failed with error -38 > > big-endian? WTF? The machine is x86. > mhh, we reworked the endianess checks a bit and apparently that broke something... I will give it some thoughts, but could you be so kind and create an mmiotrace under 5.9 with nouveau? You won't need to start X or anything while doing it. Just enable the trace and modprobe nouveau and collect the trace. > It works fine with Debian 5.7 kernel (5.7.10-1~bpo10+1): > [0.00] Linux version 5.7.0-0.bpo.2-686 > (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6), GNU ld > (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.7.10-1~bpo10+1 (2020-07-30) > ... > [ 23.266196] nouveau :01:00.0: NVIDIA NV05 (20154000) > [ 23.288582] nouveau :01:00.0: bios: version 02.05.20.02.00 > [ 23.288869] nouveau :01:00.0: bios: DCB table not found > [ 23.289595] nouveau :01:00.0: bios: DCB table not found > [ 23.289956] nouveau :01:00.0: bios: DCB table not found > [ 23.290015] nouveau :01:00.0: bios: DCB table not found > [ 23.290215] agpgart-intel :00:00.0: AGP 3.0 bridge > [ 23.290287] agpgart-intel :00:00.0: bridge is in legacy mode, falling > back to 2.x > [ 23.290351] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode > [ 23.290430] nouveau :01:00.0: putting AGP V2 device into 4x mode > [ 23.290565] agpgart-intel :00:00.0: AGP 3.0 bridge > [ 23.290627] agpgart-intel :00:00.0: bridge is in legacy mode, falling > back to 2.x > [ 23.290690] agpgart-intel :00:00.0: putting AGP V2 device into 4x mode > [ 23.290768] nouveau :01:00.0: putting AGP V2 device into 4x mode > [ 23.290830] nouveau :01:00.0: tmr: unknown input clock freq > [ 23.293026] nouveau :01:00.0: fb: 32 MiB SDRAM > [ 23.301269] [TTM] Zone kernel: Available graphics memory: 382728 KiB > [ 23.301327] [TTM] Initializing pool allocator > [ 23.301414] nouveau :01:00.0: DRM: VRAM: 31 MiB > [ 23.301465] nouveau :01:00.0: DRM: GART: 128 MiB > [ 23.301518] nouveau :01:00.0: DRM: BMP version 5.6 > [ 23.301570] nouveau :01:00.0: DRM: No DCB data found in VBIOS > [ 23.303594] nouveau :01:00.0: DRM: MM: using M2MF for buffer copies > [ 23.303719] nouveau :01:00.0: bios: DCB table not found > [ 23.304904] nouveau :01:00.0: DRM: Saving VGA fonts > [ 23.349089] nouveau :01:00.0: DRM: No DCB data found in VBIOS > [ 23.349681] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). > [ 23.383066] nouveau :01:00.0: DRM: allocated 1280x1024 fb: 0x4000, bo > b10d2f17 > [ 23.413903] fbcon: nouveaudrmfb (fb0) is primary device > [ 23.569851] Console: switching to colour frame buffer device 160x64 > [ 23.571050] nouveau :01:00.0: fb0: nouveaudrmfb frame buffer device > > > -- > Ondrej Zary > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
[PATCH 2/2] drm/nouveau/mem: guard against NULL pointer access in mem_del
other drivers seems to do something similar Signed-off-by: Karol Herbst Cc: dri-devel Cc: Dave Airlie Cc: sta...@vger.kernel.org --- drivers/gpu/drm/nouveau/nouveau_mem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c b/drivers/gpu/drm/nouveau/nouveau_mem.c index b1bb542d31158..e5fae57fffbd1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_mem.c +++ b/drivers/gpu/drm/nouveau/nouveau_mem.c @@ -176,6 +176,8 @@ void nouveau_mem_del(struct ttm_mem_reg *reg) { struct nouveau_mem *mem = nouveau_mem(reg); + if (!mem) + return; nouveau_mem_fini(mem); kfree(reg->mm_node); reg->mm_node = NULL; -- 2.26.2
[PATCH 1/2] drm/nouveau/device: return error for unknown chipsets
Previously the code relied on device->pri to be NULL and to fail probing later. We really should just return an error inside nvkm_device_ctor for unsupported GPUs. Fixes: 24d5ff40a732 ("drm/nouveau/device: rework mmio mapping code to get rid of second map") Signed-off-by: Karol Herbst Cc: dann frazier Cc: dri-devel Cc: Dave Airlie Cc: sta...@vger.kernel.org Reviewed-by: Jeremy Cline --- drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index 9f4ac2672cf2e..dcb70677d0acc 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -3149,6 +3149,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func, case 0x168: device->chip = _chipset; break; default: nvdev_error(device, "unknown chipset (%08x)\n", boot0); + ret = -ENODEV; goto done; } -- 2.26.2
Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter
On Thu, Sep 17, 2020 at 4:11 PM Jeremy Cline wrote: > > On Wed, Sep 16, 2020 at 10:03:22PM +0200, Karol Herbst wrote: > > On Wed, Sep 16, 2020 at 10:01 PM Karol Herbst wrote: > > > > > > On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline wrote: > > > > > > > > The temp_get() function currently returns negative error numbers or a > > > > temperature. However, the thermal sensors can (in theory) measure > > > > negative temperatures. Some implementations of temp_get() correctly > > > > clamp negative temperature readings to 0 so that users do not mistake > > > > them for errors, but some, like gp100_temp_get(), do not. > > > > > > > > Rather than relying on implementations remembering to clamp values, > > > > dedicate the function return value to error codes and accept a pointer > > > > to an integer where the temperature reading should be stored. > > > > > > > > Signed-off-by: Jeremy Cline > > > > --- > > > > .../drm/nouveau/include/nvkm/subdev/therm.h | 2 +- > > > > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 12 ++-- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 19 ++- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c | 11 ++- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c | 9 +++-- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c | 9 +++-- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 17 +++-- > > > > .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c | 12 > > > > 9 files changed, 62 insertions(+), 40 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > > > index 62c34f98c930..bfe9779216fc 100644 > > > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > > > @@ -99,7 +99,7 @@ struct nvkm_therm { > > > > bool clkgating_enabled; > > > > }; > > > > > > > > -int nvkm_therm_temp_get(struct nvkm_therm *); > > > > +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp); > > > > int nvkm_therm_fan_sense(struct nvkm_therm *); > > > > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > > > > void nvkm_therm_clkgate_init(struct nvkm_therm *, > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > > > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > > > index 1c3104d20571..aff6aa296678 100644 > > > > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > > > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > > > @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, > > > > int channel) > > > > struct nouveau_drm *drm = nouveau_drm((struct drm_device > > > > *)data); > > > > struct nvkm_therm *therm = nvxx_therm(>client.device); > > > > > > > > - if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < > > > > 0) > > > > + if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, > > > > NULL) < 0) > > > > return 0; > > > > > > > > switch (attr) { > > > > @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > > > channel, long *val) > > > > struct drm_device *drm_dev = dev_get_drvdata(dev); > > > > struct nouveau_drm *drm = nouveau_drm(drm_dev); > > > > struct nvkm_therm *therm = nvxx_therm(>client.device); > > > > - int ret; > > > > + int ret = 0, temp; > > > > > > > > if (!therm || !therm->attr_get) > > > > return -EOPNOTSUPP; > > > > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > > > channel, long *val) > > > > case hwmon_temp_input: > > > > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > > > > return -EINVAL; > > > > - ret = nvkm_therm_temp_get(therm); > > > > - *val = ret < 0 ? ret : (ret * 1000); > > > > +
Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter
On Wed, Sep 16, 2020 at 10:01 PM Karol Herbst wrote: > > On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline wrote: > > > > The temp_get() function currently returns negative error numbers or a > > temperature. However, the thermal sensors can (in theory) measure > > negative temperatures. Some implementations of temp_get() correctly > > clamp negative temperature readings to 0 so that users do not mistake > > them for errors, but some, like gp100_temp_get(), do not. > > > > Rather than relying on implementations remembering to clamp values, > > dedicate the function return value to error codes and accept a pointer > > to an integer where the temperature reading should be stored. > > > > Signed-off-by: Jeremy Cline > > --- > > .../drm/nouveau/include/nvkm/subdev/therm.h | 2 +- > > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 12 ++-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 19 ++- > > .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c | 11 ++- > > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++- > > .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c | 9 +++-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c | 9 +++-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 17 +++-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c | 12 > > 9 files changed, 62 insertions(+), 40 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > index 62c34f98c930..bfe9779216fc 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > @@ -99,7 +99,7 @@ struct nvkm_therm { > > bool clkgating_enabled; > > }; > > > > -int nvkm_therm_temp_get(struct nvkm_therm *); > > +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp); > > int nvkm_therm_fan_sense(struct nvkm_therm *); > > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > > void nvkm_therm_clkgate_init(struct nvkm_therm *, > > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > index 1c3104d20571..aff6aa296678 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, int > > channel) > > struct nouveau_drm *drm = nouveau_drm((struct drm_device *)data); > > struct nvkm_therm *therm = nvxx_therm(>client.device); > > > > - if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < 0) > > + if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, NULL) > > < 0) > > return 0; > > > > switch (attr) { > > @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > channel, long *val) > > struct drm_device *drm_dev = dev_get_drvdata(dev); > > struct nouveau_drm *drm = nouveau_drm(drm_dev); > > struct nvkm_therm *therm = nvxx_therm(>client.device); > > - int ret; > > + int ret = 0, temp; > > > > if (!therm || !therm->attr_get) > > return -EOPNOTSUPP; > > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > channel, long *val) > > case hwmon_temp_input: > > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > > return -EINVAL; > > - ret = nvkm_therm_temp_get(therm); > > - *val = ret < 0 ? ret : (ret * 1000); > > + ret = nvkm_therm_temp_get(therm, ); > > + *val = temp * 1000; > > break; > > case hwmon_temp_max: > > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > > @@ -459,7 +459,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > channel, long *val) > > return -EOPNOTSUPP; > > } > > > > - return 0; > > + return ret; > > } > > > > static int > > @@ -735,7 +735,7 @@ nouveau_hwmon_init(struct drm_device *dev) > > hwmon->dev = dev; > > > > if (therm && therm->attr_get && therm->attr_set) { > > - if (nvkm_therm_temp_get(therm) >= 0) > > + if (nvkm_therm_temp_get(therm, NULL) >= 0) > >
Re: [PATCH v2 1/2] drm/nouveau: return temperatures in temp_get() via parameter
On Wed, Sep 16, 2020 at 9:47 PM Jeremy Cline wrote: > > The temp_get() function currently returns negative error numbers or a > temperature. However, the thermal sensors can (in theory) measure > negative temperatures. Some implementations of temp_get() correctly > clamp negative temperature readings to 0 so that users do not mistake > them for errors, but some, like gp100_temp_get(), do not. > > Rather than relying on implementations remembering to clamp values, > dedicate the function return value to error codes and accept a pointer > to an integer where the temperature reading should be stored. > > Signed-off-by: Jeremy Cline > --- > .../drm/nouveau/include/nvkm/subdev/therm.h | 2 +- > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 12 ++-- > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 19 ++- > .../gpu/drm/nouveau/nvkm/subdev/therm/g84.c | 11 ++- > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 11 ++- > .../gpu/drm/nouveau/nvkm/subdev/therm/nv40.c | 9 +++-- > .../gpu/drm/nouveau/nvkm/subdev/therm/nv50.c | 9 +++-- > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 17 +++-- > .../gpu/drm/nouveau/nvkm/subdev/therm/temp.c | 12 > 9 files changed, 62 insertions(+), 40 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index 62c34f98c930..bfe9779216fc 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -99,7 +99,7 @@ struct nvkm_therm { > bool clkgating_enabled; > }; > > -int nvkm_therm_temp_get(struct nvkm_therm *); > +int nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp); > int nvkm_therm_fan_sense(struct nvkm_therm *); > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > void nvkm_therm_clkgate_init(struct nvkm_therm *, > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > index 1c3104d20571..aff6aa296678 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > @@ -325,7 +325,7 @@ nouveau_temp_is_visible(const void *data, u32 attr, int > channel) > struct nouveau_drm *drm = nouveau_drm((struct drm_device *)data); > struct nvkm_therm *therm = nvxx_therm(>client.device); > > - if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm) < 0) > + if (!therm || !therm->attr_get || nvkm_therm_temp_get(therm, NULL) < > 0) > return 0; > > switch (attr) { > @@ -419,7 +419,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int > channel, long *val) > struct drm_device *drm_dev = dev_get_drvdata(dev); > struct nouveau_drm *drm = nouveau_drm(drm_dev); > struct nvkm_therm *therm = nvxx_therm(>client.device); > - int ret; > + int ret = 0, temp; > > if (!therm || !therm->attr_get) > return -EOPNOTSUPP; > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > channel, long *val) > case hwmon_temp_input: > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > return -EINVAL; > - ret = nvkm_therm_temp_get(therm); > - *val = ret < 0 ? ret : (ret * 1000); > + ret = nvkm_therm_temp_get(therm, ); > + *val = temp * 1000; > break; > case hwmon_temp_max: > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > @@ -459,7 +459,7 @@ nouveau_temp_read(struct device *dev, u32 attr, int > channel, long *val) > return -EOPNOTSUPP; > } > > - return 0; > + return ret; > } > > static int > @@ -735,7 +735,7 @@ nouveau_hwmon_init(struct drm_device *dev) > hwmon->dev = dev; > > if (therm && therm->attr_get && therm->attr_set) { > - if (nvkm_therm_temp_get(therm) >= 0) > + if (nvkm_therm_temp_get(therm, NULL) >= 0) > special_groups[i++] = _auto_point_sensor_group; > if (therm->fan_get && therm->fan_get(therm) >= 0) > special_groups[i++] = _fan_sensor_group; > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > index 4a4d1e224126..e7ffea1512e6 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > @@ -27,10 +27,15 @@ > #include > > int > -nvkm_therm_temp_get(struct nvkm_therm *therm) > +nvkm_therm_temp_get(struct nvkm_therm *therm, int *temp) > { > + int ignored_reading; > + > + if (temp == NULL) > + temp = _reading; > + > if (therm->func->temp_get) > - return therm->func->temp_get(therm); > + return therm->func->temp_get(therm, temp);
Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting
On Wed, Sep 9, 2020 at 6:06 AM Ben Skeggs wrote: > > On Thu, 13 Aug 2020 at 06:50, Jeremy Cline wrote: > > > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > > new gp1xx temperature sensor") added support for reading finer-grain > > temperatures, but continued to report temperatures in 1 degree Celsius > > increments via nvkm_therm_temp_get(). > > > > Rather than altering nvkm_therm_temp_get() to report finer-grain > > temperatures, which would be inconvenient for other users of the > > function, a second interface has been added to line up with hwmon's > > native unit of temperature. > Hey Jeremy, > > Sorry this slipped past me until now. I'm OK with adding support for > millidegree temperature reporting, but don't think we need to keep > both interfaces around and would rather see the existing code > converted to return millidegrees (even on GPUs that don't support it) > instead of degrees. > > Thanks! > Ben. > just a note as I was trying something like that in the past: we have a lot of code which fetches the temperature and there are a lot of places where we would then do a divide by 1000, so having a wrapper function at least makes sense. If we want to keep the func pointers? well.. I guess the increased CPU overhead wouldn't be as bad if all sub classes do this static * 1000 as well either. I just think we should have both interfaces in subdev/therm.h so we can just keep most of the current code as is. > > > > Signed-off-by: Jeremy Cline > > --- > > .../drm/nouveau/include/nvkm/subdev/therm.h | 18 + > > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 4 +-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 16 > > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 1 + > > 5 files changed, 60 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > index 62c34f98c930..7b9928dd001c 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > @@ -100,6 +100,24 @@ struct nvkm_therm { > > }; > > > > int nvkm_therm_temp_get(struct nvkm_therm *); > > + > > +/** > > + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees > > + * @therm: The thermal device to read from. > > + * > > + * This interface reports temperatures in units of millidegree Celsius to > > + * align with the hwmon API. Some cards may only be capable of reporting in > > + * units of Celsius, and those that report finer grain temperatures may > > not be > > + * capable of millidegree Celsius accuracy, > > + * > > + * For cases where millidegree temperature is too fine-grain, the > > + * nvkm_therm_temp_get() interface reports temperatures in one degree > > Celsius > > + * increments. > > + * > > + * Return: The temperature in millidegrees Celsius, or -ENODEV if > > temperature > > + * reporting is not supported. > > + */ > > +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm); > > int nvkm_therm_fan_sense(struct nvkm_therm *); > > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > > void nvkm_therm_clkgate_init(struct nvkm_therm *, > > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > index 1c3104d20571..e96355f93ce5 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > channel, long *val) > > case hwmon_temp_input: > > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > > return -EINVAL; > > - ret = nvkm_therm_temp_get(therm); > > - *val = ret < 0 ? ret : (ret * 1000); > > + ret = nvkm_therm_temp_millidegree_get(therm); > > + *val = ret; > > break; > > case hwmon_temp_max: > > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > index 4a4d1e224126..e655b32c78b8 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm) > > return -ENODEV; > > } > > > > +int > > +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm) > > +{ > > + int ret = -ENODEV; > > + > > + if (therm->func->temp_millidegree_get) > > + return therm->func->temp_millidegree_get(therm); > > + > > + if (therm->func->temp_get) { > > + ret = therm->func->temp_get(therm); > > + if (ret > 0) > > + ret *= 1000; > > + }
Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)
On Mon, Sep 7, 2020 at 10:58 PM Marc MERLIN wrote: > > On Mon, Sep 07, 2020 at 09:14:03PM +0200, Karol Herbst wrote: > > > - changes in the nouveau driver. Mika told me the PCIe regression > > > "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed > > > to be fixed in 5.8, but I still get a 4mn hang or so during boot and > > > with 5.8, removing the USB key, didn't help make the boot faster > > > > that's the root port the GPU is attached to, no? I saw that message on > > the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does > > seem to come from the root port. > > Hi Karol, thanks for your answer. > > 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core > Processor PCIe Controller (x16) (rev 0d) > 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX > 4000 Mobile / Max-Q] (rev a1) > > > Well, you'd also need it when attaching external displays. > > Indeed. I just don't need that on this laptop, but familiar with the not > so seemless procedure to turn on both GPUs, and mirror the intel one into > the nvidia one for external output. > > > > [ 11.262985] nvidia-gpu :01:00.3: PME# enabled > > > [ 11.303060] nvidia-gpu :01:00.3: PME# disabled > > > > mhh, interesting. I heard some random comments that the Nvidia > > USB-C/UCSI driver is a bit broken and can cause various issues. Mind > > blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't > > get loaded) and see if that helps? > > Right, this one: > 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C > UCSI Controller (rev a1) > Sure, I'll blacklist it. Ok, just did that, removed from initrd, > rebooted, and it was no better. > > From initrd (before root gets mounted), I have this: > nouveau 1961984 0 > mxm_wmi16384 1 nouveau > hwmon 32768 1 nouveau > ttm 102400 1 nouveau > wmi32768 2 nouveau,mxm_wmi > > I still got a 2mn hang. and a nouveau probe error > [ 189.124530] nouveau: probe of :01:00.0 failed with error -12 > > > Here's what it looks like: > [9.693230] hid: raw HID events driver (C) Jiri Kosina > [9.694988] usbcore: registered new interface driver usbhid > [9.694989] usbhid: USB HID core driver > [9.696700] hid-generic 0003:1050:0200.0001: hiddev0,hidraw0: USB HID > v1.00 Device [Yubico Yubico Gnubby (gnubby1)] on usb-:00:14.0-2/input0 > [9.784456] Console: switching to colour frame buffer device 240x67 > [9.816297] i915 :00:02.0: fb0: i915drmfb frame buffer device > [ 25.087400] thunderbolt :06:00.0: saving config space at offset 0x0 > (reading 0x15eb8086) > [ 25.087414] thunderbolt :06:00.0: saving config space at offset 0x4 > (reading 0x100406) > [ 25.087419] thunderbolt :06:00.0: saving config space at offset 0x8 > (reading 0x886) > [ 25.087424] thunderbolt :06:00.0: saving config space at offset 0xc > (reading 0x20) > [ 25.087430] thunderbolt :06:00.0: saving config space at offset 0x10 > (reading 0xcc10) > [ 25.087435] thunderbolt :06:00.0: saving config space at offset 0x14 > (reading 0xcc14) > [ 25.087440] thunderbolt :06:00.0: saving config space at offset 0x18 > (reading 0x0) > [ 25.087445] thunderbolt :06:00.0: saving config space at offset 0x1c > (reading 0x0) > [ 25.087450] thunderbolt :06:00.0: saving config space at offset 0x20 > (reading 0x0) > [ 25.087455] thunderbolt :06:00.0: saving config space at offset 0x24 > (reading 0x0) > [ 25.087460] thunderbolt :06:00.0: saving config space at offset 0x28 > (reading 0x0) > [ 25.087466] thunderbolt :06:00.0: saving config space at offset 0x2c > (reading 0x229b17aa) > [ 25.087471] thunderbolt :06:00.0: saving config space at offset 0x30 > (reading 0x0) > [ 25.087476] thunderbolt :06:00.0: saving config space at offset 0x34 > (reading 0x80) > [ 25.087481] thunderbolt :06:00.0: saving config space at offset 0x38 > (reading 0x0) > [ 25.087486] thunderbolt :06:00.0: saving config space at offset 0x3c > (reading 0x1ff) > [ 25.087571] thunderbolt :06:00.0: PME# enabled > [ 25.105353] pcieport :05:00.0: saving config space at offset 0x0 > (reading 0x15ea8086) > [ 25.105364] pcieport :05:00.0: saving config space at offset 0x4 > (reading 0x100407) > [ 25.105370] pcieport :05:00.0: saving config space at offset 0x8 > (reading 0x6040006) > [ 25.105375] pcieport :05:00.0: saving config space at offset 0xc > (reading 0x100
Re: [Nouveau] pcieport 0000:00:01.0: PME: Spurious native interrupt (nvidia with nouveau and thunderbolt on thinkpad P73)
On Sun, Sep 6, 2020 at 8:52 PM Marc MERLIN wrote: > > Ok, I have an update to this problem. I added the nouveau list because > I can't quite tell if the issue is: > - the PCIe changes that went in 5.6 I think (or 5.5?), referenced below > > - a new issue with thunderbold on thinkpad P73, that seems to be > triggered if I have a USB-C yubikey in the port. With 5.7, my issues > went away if I removed the USB key during boot, showing an interaction > between nouveau and thunderbolt > > - changes in the nouveau driver. Mika told me the PCIe regression > "pcieport :00:01.0: PME: Spurious native interrupt!" is supposed > to be fixed in 5.8, but I still get a 4mn hang or so during boot and > with 5.8, removing the USB key, didn't help make the boot faster > that's the root port the GPU is attached to, no? I saw that message on the Thinkpad P1G2 when runtime resuming the Nvidia GPU, but it does seem to come from the root port. > I don't otherwise use the nvidia chip I so wish I didn't have, I only > use intel graphics on that laptop, but I must apparently use the nouveau > driver to manage the nouveau chip so that it's turned off and not > burning 60W doing nothing. > Well, you'd also need it when attaching external displays. > lspci is in the quoted message below, I won't copy it here again, but > here's the nvidia bit: > 01:00.0 VGA compatible controller: NVIDIA Corporation TU104GLM [Quadro RTX > 4000 Mobile / Max-Q] (rev a1) > 01:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1) > 01:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev > a1) > 01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C > UCSI Controller (rev a1) > > Here are 5 boots, 4 on 5.8.5: > > dmesg.1_hang_but_no_warning.txt https://pastebin.com/Y5NaH08n > Boot hung for quite a while, but no clear output > > dmesg.2_pme_spurious.txt https://pastebin.com/dX19aCpj > [8.185808] nvidia-gpu :01:00.3: runtime IRQ mapping not provided by > arch > [8.185989] nvidia-gpu :01:00.3: enabling device ( -> 0002) > [8.188986] nvidia-gpu :01:00.3: enabling bus mastering > [ 11.936507] nvidia-gpu :01:00.3: PME# enabled > [ 11.975985] nvidia-gpu :01:00.3: PME# disabled > [ 11.976011] pcieport :00:01.0: PME: Spurious native interrupt! > > dmesg.3_usb_key_yanked.txt https://pastebin.com/m7QLnCZt > I yanked the USB key during boot, that seemed to help unlock things with > 5.7, but did not with 5.8. It's hung on a loop of: > [ 11.262854] nvidia-gpu :01:00.3: saving config space at offset 0x0 > (reading 0x1ad910de) > [ 11.262863] nvidia-gpu :01:00.3: saving config space at offset 0x4 > (reading 0x100406) > [ 11.262869] nvidia-gpu :01:00.3: saving config space at offset 0x8 > (reading 0xc8000a1) > [ 11.262874] nvidia-gpu :01:00.3: saving config space at offset 0xc > (reading 0x80) > [ 11.262880] nvidia-gpu :01:00.3: saving config space at offset 0x10 > (reading 0xce054000) > [ 11.262885] nvidia-gpu :01:00.3: saving config space at offset 0x14 > (reading 0x0) > [ 11.262890] nvidia-gpu :01:00.3: saving config space at offset 0x18 > (reading 0x0) > [ 11.262895] nvidia-gpu :01:00.3: saving config space at offset 0x1c > (reading 0x0) > [ 11.262900] nvidia-gpu :01:00.3: saving config space at offset 0x20 > (reading 0x0) > [ 11.262906] nvidia-gpu :01:00.3: saving config space at offset 0x24 > (reading 0x0) > [ 11.262911] nvidia-gpu :01:00.3: saving config space at offset 0x28 > (reading 0x0) > [ 11.262916] nvidia-gpu :01:00.3: saving config space at offset 0x2c > (reading 0x229b17aa) > [ 11.262921] nvidia-gpu :01:00.3: saving config space at offset 0x30 > (reading 0x0) > [ 11.262926] nvidia-gpu :01:00.3: saving config space at offset 0x34 > (reading 0x68) > [ 11.262931] nvidia-gpu :01:00.3: saving config space at offset 0x38 > (reading 0x0) > [ 11.262937] nvidia-gpu :01:00.3: saving config space at offset 0x3c > (reading 0x4ff) > [ 11.262985] nvidia-gpu :01:00.3: PME# enabled > [ 11.303060] nvidia-gpu :01:00.3: PME# disabled > mhh, interesting. I heard some random comments that the Nvidia USB-C/UCSI driver is a bit broken and can cause various issues. Mind blacklisting i2c-nvidia-gpu and typec_nvidia (and verify they don't get loaded) and see if that helps? > dmesg.4_5.5_boot_fine.txt https://pastebin.com/WXgQTUYP > reference boot with 4.5, it works fine, no issues > > dmesg.5_no_key_still_hang.txt https://pastebin.com/kcT8Ras0 > unfortunately, booting without the USB-C key in thunderbolt, did not > allow this boot to be faster, it looks different though: > [6.723454] pcieport :00:01.0: runtime IRQ mapping not provided by arch > [6.723598] pcieport :00:01.0: PME: Signaling with IRQ 122 > [6.724011] pcieport :00:01.0: saving config space at offset 0x0 > (reading 0x19018086) > [6.724016]
Re: [PATCH] drm/nouveau: Add fine-grain temperature reporting
Reviewed-by: Karol Herbst On Wed, Aug 12, 2020 at 10:50 PM Jeremy Cline wrote: > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > new gp1xx temperature sensor") added support for reading finer-grain > temperatures, but continued to report temperatures in 1 degree Celsius > increments via nvkm_therm_temp_get(). > > Rather than altering nvkm_therm_temp_get() to report finer-grain > temperatures, which would be inconvenient for other users of the > function, a second interface has been added to line up with hwmon's > native unit of temperature. > > Signed-off-by: Jeremy Cline > --- > .../drm/nouveau/include/nvkm/subdev/therm.h | 18 + > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 4 +-- > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 16 > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +-- > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 1 + > 5 files changed, 60 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index 62c34f98c930..7b9928dd001c 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -100,6 +100,24 @@ struct nvkm_therm { > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > + > +/** > + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees > + * @therm: The thermal device to read from. > + * > + * This interface reports temperatures in units of millidegree Celsius to > + * align with the hwmon API. Some cards may only be capable of reporting in > + * units of Celsius, and those that report finer grain temperatures may not > be > + * capable of millidegree Celsius accuracy, > + * > + * For cases where millidegree temperature is too fine-grain, the > + * nvkm_therm_temp_get() interface reports temperatures in one degree Celsius > + * increments. > + * > + * Return: The temperature in millidegrees Celsius, or -ENODEV if temperature > + * reporting is not supported. > + */ > +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm); > int nvkm_therm_fan_sense(struct nvkm_therm *); > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > void nvkm_therm_clkgate_init(struct nvkm_therm *, > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > index 1c3104d20571..e96355f93ce5 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > channel, long *val) > case hwmon_temp_input: > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > return -EINVAL; > - ret = nvkm_therm_temp_get(therm); > - *val = ret < 0 ? ret : (ret * 1000); > + ret = nvkm_therm_temp_millidegree_get(therm); > + *val = ret; > break; > case hwmon_temp_max: > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > index 4a4d1e224126..e655b32c78b8 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm) > return -ENODEV; > } > > +int > +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm) > +{ > + int ret = -ENODEV; > + > + if (therm->func->temp_millidegree_get) > + return therm->func->temp_millidegree_get(therm); > + > + if (therm->func->temp_get) { > + ret = therm->func->temp_get(therm); > + if (ret > 0) > + ret *= 1000; > + } > + return ret; > +} > + > static int > nvkm_therm_update_trip(struct nvkm_therm *therm) > { > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > index 9f0dea3f61dc..4c3c2895a3cb 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > @@ -24,7 +24,7 @@ > #include "priv.h" > > static int > -gp100_temp_get(struct nvkm_therm *therm) > +gp100_temp_get_raw(struct nvkm_therm *therm) > { > struct nvkm_device *device = therm->subdev.device; > struct nvkm_subdev *subdev = >s
Re: [Nouveau] [PATCH] drm/nouveau/acr: fix a coding style in nvkm_acr_lsfw_load_bl_inst_data_sig()
Reviewed-by: Karol Herbst On Mon, Aug 3, 2020 at 5:46 AM Jing Xiangfeng wrote: > > This patch performs the following changes: > 1. remove a redundant parentheses around the nvkm_acr_lsfw_add() calls > 2. do assignment before this if condition, it is more readable > > Signed-off-by: Jing Xiangfeng > --- > drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c > index 07d1830126ab..5f6006418472 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c > @@ -191,7 +191,8 @@ nvkm_acr_lsfw_load_bl_inst_data_sig(struct nvkm_subdev > *subdev, > u32 *bldata; > int ret; > > - if (IS_ERR((lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id > + lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id); > + if (IS_ERR(lsfw)) > return PTR_ERR(lsfw); > > ret = nvkm_firmware_load_name(subdev, path, "bl", ver, ); > -- > 2.17.1 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
On Wed, Jul 22, 2020 at 11:25 AM Mika Westerberg wrote: > > On Tue, Jul 21, 2020 at 01:37:12PM -0500, Patrick Volkerding wrote: > > On 7/21/20 10:27 AM, Mika Westerberg wrote: > > > On Tue, Jul 21, 2020 at 11:01:55AM -0400, Lyude Paul wrote: > > >> Sure thing. Also, feel free to let me know if you'd like access to one > > >> of the > > >> systems we saw breaking with this patch - I'm fairly sure I've got one > > >> of them > > >> locally at my apartment and don't mind setting up AMT/KVM/SSH > > > Probably no need for remote access (thanks for the offer, though). I > > > attached a test patch to the bug report: > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=208597 > > > > > > that tries to work it around (based on the ->pm_cap == 0). I wonder if > > > anyone would have time to try it out. > > > > > > Hi Mika, > > > > I can confirm that this patch applied to 5.4.52 fixes the issue with > > hybrid graphics on the Thinkpad X1 Extreme gen2. > > Great, thanks for testing! > yeah, works on the P1G2 as well.
Re: [PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops
On Mon, Jul 20, 2020 at 3:19 AM Alex Hung wrote: > > On 2020-07-19 1:50 p.m., Karol Herbst wrote: > > On Fri, Jul 17, 2020 at 9:52 PM Alex Hung wrote: > >> > >> On 2020-07-17 1:05 p.m., Karol Herbst wrote: > >>> It's hard to figure out what systems are actually affected and right now I > >>> don't see a good way of removing those... > >>> > >>> But I'd like to see thos getting removed and drivers fixed instead (which > >>> happened at least for nouveau). > >>> > >>> And as mentioned before, I prefer people working on fixing issues instead > >>> of spending time to add firmware level workarounds which are hard to know > >>> to which systems they apply to, hard to remove and basically a big huge > >>> pain to work with.> In the end I have no idea how to even figure out what > >>> systems are affected > >>> and which not by this, so I have no idea how to even verify we can safely > >>> remove this (which just means those are impossible to remove unless we > >>> risk > >>> breaking systems, which again makes those supper annoying to deal with). > >>> > >>> Also from the comments it's hard to get what those bits really do. Are > >>> they > >>> just preventing runtime pm or do the devices are powered down when > >>> booting? > >>> I am sure it's the former, still... > >>> > >>> Please, don't do this again. > >>> > >>> For now, those workaround prevent power savings on systems those > >>> workaround > >>> applies to, which might be any so those should get removed asap and if > >>> new issues arrise removing those please do a proper bug report and we can > >>> look into it and come up with a proper fix (and keep this patch out until > >>> we resolve all of those). > >>> > >>> Signed-off-by: Karol Herbst > >>> CC: Alex Hung > >>> CC: "Rafael J. Wysocki" > >>> CC: Len Brown > >>> CC: Lyude Paul > >>> CC: linux-kernel@vger.kernel.org > >>> CC: dri-de...@lists.freedesktop.org > >>> CC: nouv...@lists.freedesktop.org > >>> --- > >>> drivers/acpi/osi.c | 24 > >>> 1 file changed, 24 deletions(-) > >>> > >>> diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c > >>> index 9f68538091384..d4405e1ca9b97 100644 > >>> --- a/drivers/acpi/osi.c > >>> +++ b/drivers/acpi/osi.c > >>> @@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = > >>> { > >>> {"Processor Device", true}, > >>> {"3.0 _SCP Extensions", true}, > >>> {"Processor Aggregator Device", true}, > >>> - /* > >>> - * Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia > >>> graphics > >>> - * cards as RTD3 is not supported by drivers now. Systems with > >>> NVidia > >>> - * cards will hang without RTD3 disabled. > >>> - * > >>> - * Once NVidia drivers officially support RTD3, this _OSI strings > >>> can > >>> - * be removed if both new and old graphics cards are supported. > >>> - */ > >>> - {"Linux-Dell-Video", true}, > >>> - /* > >>> - * Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's > >>> HDMI > >>> - * audio device which is turned off for power-saving in Windows OS. > >>> - * This power management feature observed on some Lenovo Thinkpad > >>> - * systems which will not be able to output audio via HDMI without > >>> - * a BIOS workaround. > >>> - */ > >>> - {"Linux-Lenovo-NV-HDMI-Audio", true}, > >>> - /* > >>> - * Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to > >>> - * output video directly to external monitors on HP Inc. mobile > >>> - * workstations as Nvidia and AMD VGA drivers provide limited > >>> - * hybrid graphics supports. > >>> - */ > >>> - {"Linux-HPI-Hybrid-Graphics", true}, > >>> }; > >>> > >>> static u32 acpi_osi_handler(acpi_string interface, u32 supported) > >>> > >> > >> The changes were discussed and tested a while ago, and no crashes were > >> observed. Thanks for solving PM issues in nouveau. > >> > >> Acked-by: Alex Hung > >> > > > > By any chance, do you have a list of systems implementing those workarounds? > > > > I don't keep a list but the workaround, in theory, should only apply to > the systems with the specific nvidia hardware. > > I reminded OEMs and ODMs that these _OSI strings were temporary > solutions, and highlighted we were going to remove them after our > discussion last year. If they were paying attentions recent systems > shouldn't have these _OSI strings. > Right.. but I am actually wondering because I never saw those strings in the wild or not on the Dell and Lenovo systems I was testing on. So I think we might want to ask the vendors themselves and verify on those systems. > -- > Cheers, > Alex Hung >
Re: [PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops
On Fri, Jul 17, 2020 at 9:52 PM Alex Hung wrote: > > On 2020-07-17 1:05 p.m., Karol Herbst wrote: > > It's hard to figure out what systems are actually affected and right now I > > don't see a good way of removing those... > > > > But I'd like to see thos getting removed and drivers fixed instead (which > > happened at least for nouveau). > > > > And as mentioned before, I prefer people working on fixing issues instead > > of spending time to add firmware level workarounds which are hard to know > > to which systems they apply to, hard to remove and basically a big huge > > pain to work with.> In the end I have no idea how to even figure out what > > systems are affected > > and which not by this, so I have no idea how to even verify we can safely > > remove this (which just means those are impossible to remove unless we risk > > breaking systems, which again makes those supper annoying to deal with). > > > > Also from the comments it's hard to get what those bits really do. Are they > > just preventing runtime pm or do the devices are powered down when booting? > > I am sure it's the former, still... > > > > Please, don't do this again. > > > > For now, those workaround prevent power savings on systems those workaround > > applies to, which might be any so those should get removed asap and if > > new issues arrise removing those please do a proper bug report and we can > > look into it and come up with a proper fix (and keep this patch out until > > we resolve all of those). > > > > Signed-off-by: Karol Herbst > > CC: Alex Hung > > CC: "Rafael J. Wysocki" > > CC: Len Brown > > CC: Lyude Paul > > CC: linux-kernel@vger.kernel.org > > CC: dri-de...@lists.freedesktop.org > > CC: nouv...@lists.freedesktop.org > > --- > > drivers/acpi/osi.c | 24 > > 1 file changed, 24 deletions(-) > > > > diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c > > index 9f68538091384..d4405e1ca9b97 100644 > > --- a/drivers/acpi/osi.c > > +++ b/drivers/acpi/osi.c > > @@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = { > > {"Processor Device", true}, > > {"3.0 _SCP Extensions", true}, > > {"Processor Aggregator Device", true}, > > - /* > > - * Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia > > graphics > > - * cards as RTD3 is not supported by drivers now. Systems with NVidia > > - * cards will hang without RTD3 disabled. > > - * > > - * Once NVidia drivers officially support RTD3, this _OSI strings can > > - * be removed if both new and old graphics cards are supported. > > - */ > > - {"Linux-Dell-Video", true}, > > - /* > > - * Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's > > HDMI > > - * audio device which is turned off for power-saving in Windows OS. > > - * This power management feature observed on some Lenovo Thinkpad > > - * systems which will not be able to output audio via HDMI without > > - * a BIOS workaround. > > - */ > > - {"Linux-Lenovo-NV-HDMI-Audio", true}, > > - /* > > - * Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to > > - * output video directly to external monitors on HP Inc. mobile > > - * workstations as Nvidia and AMD VGA drivers provide limited > > - * hybrid graphics supports. > > - */ > > - {"Linux-HPI-Hybrid-Graphics", true}, > > }; > > > > static u32 acpi_osi_handler(acpi_string interface, u32 supported) > > > > The changes were discussed and tested a while ago, and no crashes were > observed. Thanks for solving PM issues in nouveau. > > Acked-by: Alex Hung > By any chance, do you have a list of systems implementing those workarounds?
[PATCH] RFC: ACPI / OSI: remove workarounds for hybrid graphics laptops
It's hard to figure out what systems are actually affected and right now I don't see a good way of removing those... But I'd like to see thos getting removed and drivers fixed instead (which happened at least for nouveau). And as mentioned before, I prefer people working on fixing issues instead of spending time to add firmware level workarounds which are hard to know to which systems they apply to, hard to remove and basically a big huge pain to work with. In the end I have no idea how to even figure out what systems are affected and which not by this, so I have no idea how to even verify we can safely remove this (which just means those are impossible to remove unless we risk breaking systems, which again makes those supper annoying to deal with). Also from the comments it's hard to get what those bits really do. Are they just preventing runtime pm or do the devices are powered down when booting? I am sure it's the former, still... Please, don't do this again. For now, those workaround prevent power savings on systems those workaround applies to, which might be any so those should get removed asap and if new issues arrise removing those please do a proper bug report and we can look into it and come up with a proper fix (and keep this patch out until we resolve all of those). Signed-off-by: Karol Herbst CC: Alex Hung CC: "Rafael J. Wysocki" CC: Len Brown CC: Lyude Paul CC: linux-kernel@vger.kernel.org CC: dri-de...@lists.freedesktop.org CC: nouv...@lists.freedesktop.org --- drivers/acpi/osi.c | 24 1 file changed, 24 deletions(-) diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c index 9f68538091384..d4405e1ca9b97 100644 --- a/drivers/acpi/osi.c +++ b/drivers/acpi/osi.c @@ -44,30 +44,6 @@ osi_setup_entries[OSI_STRING_ENTRIES_MAX] __initdata = { {"Processor Device", true}, {"3.0 _SCP Extensions", true}, {"Processor Aggregator Device", true}, - /* -* Linux-Dell-Video is used by BIOS to disable RTD3 for NVidia graphics -* cards as RTD3 is not supported by drivers now. Systems with NVidia -* cards will hang without RTD3 disabled. -* -* Once NVidia drivers officially support RTD3, this _OSI strings can -* be removed if both new and old graphics cards are supported. -*/ - {"Linux-Dell-Video", true}, - /* -* Linux-Lenovo-NV-HDMI-Audio is used by BIOS to power on NVidia's HDMI -* audio device which is turned off for power-saving in Windows OS. -* This power management feature observed on some Lenovo Thinkpad -* systems which will not be able to output audio via HDMI without -* a BIOS workaround. -*/ - {"Linux-Lenovo-NV-HDMI-Audio", true}, - /* -* Linux-HPI-Hybrid-Graphics is used by BIOS to enable dGPU to -* output video directly to external monitors on HP Inc. mobile -* workstations as Nvidia and AMD VGA drivers provide limited -* hybrid graphics supports. -*/ - {"Linux-HPI-Hybrid-Graphics", true}, }; static u32 acpi_osi_handler(acpi_string interface, u32 supported) -- 2.26.2
Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
Filed at https://bugzilla.kernel.org/show_bug.cgi?id=208597 oddly enough I wasn't able to reproduce it on my XPS 9560, will ping once something breaks. On Fri, Jul 17, 2020 at 2:43 AM Karol Herbst wrote: > > On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas wrote: > > > > [+cc Sasha -- stable kernel regression] > > [+cc Patrick, Kai-Heng, LKML] > > > > On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote: > > > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst wrote: > > > > > > > > Hi everybody, > > > > > > > > with the mentioned commit Nouveau isn't able to load firmware onto the > > > > GPU on one of my systems here. Even though the issue doesn't always > > > > happen I am quite confident this is the commit breaking it. > > > > > > > > I am still digging into the issue and trying to figure out what > > > > exactly breaks, but it shows up in different ways. Either we are not > > > > able to boot the engines on the GPU or the GPU becomes unresponsive. > > > > Btw, this is also a system where our runtime power management issue > > > > shows up, so maybe there is indeed something funky with the bridge > > > > controller. > > > > > > > > Just pinging you in case you have an idea on how this could break > > > > Nouveau > > > > > > > > most of the times it shows up like this: > > > > nouveau :01:00.0: acr: AHESASC binary failed > > > > > > > > Sometimes it works at boot and fails at runtime resuming with random > > > > faults. So I will be investigating a bit more, but yeah... I am super > > > > sure the commit triggered this issue, no idea if it actually causes > > > > it. > > > > > > so yeah.. I reverted that locally and never ran into issues again. > > > Still valid on latest 5.7. So can we get this reverted or properly > > > fixed? This breaks runtime pm for us on at least some hardware. > > > > Yeah, that stinks. We had another similar report from Patrick: > > > > > > https://lore.kernel.org/r/caerspo5stek_my1dehwp7ahd0xop87+ohywktjbl7algdbx...@mail.gmail.com > > > > Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without > > DLL Link Active train links in 100 ms"), which Patrick found was > > backported to v5.4.49 as 828b192c57e8, and you found was backported to > > v5.7.6 as afaff825e3a4. > > > > Oddly, Patrick reported that v5.7.7 worked correctly, even though it > > still contains afaff825e3a4. > > > > I guess in the absence of any other clues we'll have to revert it. > > I hate to do that because that means we'll have slow resume of > > Thunderbolt-connected devices again, but that's better than having > > GPUs completely broken. > > > > Could you and Patrick open bugzilla.kernel.org reports, attach dmesg > > logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's > > original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837 > > and to this thread? > > > > There must be a way to fix the slow resume problem without breaking > > the GPUs. > > > > I wouldn't be surprised if this is related to the Intel bridge we > check against for Nouveau.. I still have to check on another laptop > with the same bridge our workaround was required as well but wouldn't > be surprised if it shows the same problem. Will get you the > information from both systems tomorrow then. > > > Bjorn > >
Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas wrote: > > [+cc Sasha -- stable kernel regression] > [+cc Patrick, Kai-Heng, LKML] > > On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote: > > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst wrote: > > > > > > Hi everybody, > > > > > > with the mentioned commit Nouveau isn't able to load firmware onto the > > > GPU on one of my systems here. Even though the issue doesn't always > > > happen I am quite confident this is the commit breaking it. > > > > > > I am still digging into the issue and trying to figure out what > > > exactly breaks, but it shows up in different ways. Either we are not > > > able to boot the engines on the GPU or the GPU becomes unresponsive. > > > Btw, this is also a system where our runtime power management issue > > > shows up, so maybe there is indeed something funky with the bridge > > > controller. > > > > > > Just pinging you in case you have an idea on how this could break Nouveau > > > > > > most of the times it shows up like this: > > > nouveau :01:00.0: acr: AHESASC binary failed > > > > > > Sometimes it works at boot and fails at runtime resuming with random > > > faults. So I will be investigating a bit more, but yeah... I am super > > > sure the commit triggered this issue, no idea if it actually causes > > > it. > > > > so yeah.. I reverted that locally and never ran into issues again. > > Still valid on latest 5.7. So can we get this reverted or properly > > fixed? This breaks runtime pm for us on at least some hardware. > > Yeah, that stinks. We had another similar report from Patrick: > > > https://lore.kernel.org/r/caerspo5stek_my1dehwp7ahd0xop87+ohywktjbl7algdbx...@mail.gmail.com > > Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without > DLL Link Active train links in 100 ms"), which Patrick found was > backported to v5.4.49 as 828b192c57e8, and you found was backported to > v5.7.6 as afaff825e3a4. > > Oddly, Patrick reported that v5.7.7 worked correctly, even though it > still contains afaff825e3a4. > > I guess in the absence of any other clues we'll have to revert it. > I hate to do that because that means we'll have slow resume of > Thunderbolt-connected devices again, but that's better than having > GPUs completely broken. > > Could you and Patrick open bugzilla.kernel.org reports, attach dmesg > logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's > original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837 > and to this thread? > > There must be a way to fix the slow resume problem without breaking > the GPUs. > I wouldn't be surprised if this is related to the Intel bridge we check against for Nouveau.. I still have to check on another laptop with the same bridge our workaround was required as well but wouldn't be surprised if it shows the same problem. Will get you the information from both systems tomorrow then. > Bjorn >
Re: [git pull] drm for 5.8-rc1
On Wed, Jul 1, 2020 at 7:37 PM James Jones wrote: > > On 7/1/20 10:04 AM, Karol Herbst wrote: > > On Wed, Jul 1, 2020 at 6:01 PM Daniel Vetter wrote: > >> > >> On Wed, Jul 1, 2020 at 5:51 PM James Jones wrote: > >>> > >>> On 7/1/20 4:24 AM, Karol Herbst wrote: > >>>> On Wed, Jul 1, 2020 at 6:45 AM James Jones wrote: > >>>>> > >>>>> This implies something is trying to use one of the old > >>>>> DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without > >>>>> first checking whether it is supported by the kernel. I had tried to > >>>>> force an Xorg+Mesa stack without my userspace patches to hit this error > >>>>> when testing, but must have missed some permutation. If the stalled > >>>>> Mesa patches go in, this would stop happening of course, but those were > >>>>> held up for a long time in review, and are now waiting on me to make > >>>>> some modifications. > >>>>> > >>>> > >>>> that's completely irrelevant. If a kernel change breaks userspace, > >>>> it's a kernel bug. > >>> > >>> Agreed it is unacceptable to break userspace, but I don't think it's > >>> irrelevant. Perhaps the musings on pending userspace patches are. > >>> > >>> My intent here was to point out it appears at first glance that > >>> something isn't behaving as expected in userspace, so fixing this would > >>> likely require some sort of work-around for broken userspace rather than > >>> straight-forward fixing of a bug in the kernel logic. My intent was not > >>> to shift blame to something besides my code & testing for the > >>> regression, though I certainly see how it could be interpreted that way. > >>> > >>> Regardless, I'm looking in to it. > >> > > > > I assume the MR you were talking about is > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3724 ? > > Correct. > > > I am > > also aware of the tegra driver being broken on my jetson nano and I am > > now curious if this MR could fix this bug as well... and sorry for the > > harsh reply, I was just a annoyed by the fact that "everything > > modifier related is just breaking things", first tegra and that nobody > > is looking into fixing it and then apparently the userspace code being > > quite broken as well :/ > > > > Anyway, yeah I trust you guys on figuring out the keeping "broken" > > userspace happy from a kernel side and maybe I can help out with > > reviewing the mesa bits. I am just wondering if it could help with the > > tegra situation giving me more reasons to look into it as this would > > solve other issues I should be working on :) > > Not sure if you're claiming this, but if there's Tegra breakage > attributable to this patch series, I'd love to hear more details there > as well. The Tegra patches did have backwards-compat code to handle the > old modifiers, since Tegra was the only working use case I could find > for them within the kernel itself. However, the Tegra kernel patches > are independent (and haven't even been reviewed yet to my knowledge), so > Tegra shouldn't be affected at all given it uses TegraDRM rather than > Nouveau's modesetting driver. > > If there are just general existing issues with modifier support on > Tegra, let's take that to a smaller venue. I probably won't be as much > help there, but I can at least try to help get some eyes on it. > I am sure that your patches here have nothing to do with it, just inside mesa since https://gitlab.freedesktop.org/mesa/mesa/commit/c56fe4118a2dd991ff1b2a532c0f234eddd435e9 it's broken on the jetson nano and because it's so old I am not able to tell if this is because of some kernel changes or because of the modifier code inside mesa being slightly broken. Maybe you have an idea, but Thierry knows about the issue, but I think he never was able to reproduce it on any system. > Thanks, > -James > > >> If we do need to have a kernel workaround I'm happy to help out, I've > >> done a bunch of these and occasionally it's good to get rather > >> creative :-) > >> > >> Ideally we'd also push a minimal fix in userspace to all stable > >> branches and make sure distros upgrade (might need releases if some > >> distro is stuck on old horrors), so that we don't have to keep the > >> hack in place for 10+ years or so. Definitely if the hack amounts to > >> disabling modifiers on
Re: [git pull] drm for 5.8-rc1
On Wed, Jul 1, 2020 at 6:01 PM Daniel Vetter wrote: > > On Wed, Jul 1, 2020 at 5:51 PM James Jones wrote: > > > > On 7/1/20 4:24 AM, Karol Herbst wrote: > > > On Wed, Jul 1, 2020 at 6:45 AM James Jones wrote: > > >> > > >> This implies something is trying to use one of the old > > >> DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without > > >> first checking whether it is supported by the kernel. I had tried to > > >> force an Xorg+Mesa stack without my userspace patches to hit this error > > >> when testing, but must have missed some permutation. If the stalled > > >> Mesa patches go in, this would stop happening of course, but those were > > >> held up for a long time in review, and are now waiting on me to make > > >> some modifications. > > >> > > > > > > that's completely irrelevant. If a kernel change breaks userspace, > > > it's a kernel bug. > > > > Agreed it is unacceptable to break userspace, but I don't think it's > > irrelevant. Perhaps the musings on pending userspace patches are. > > > > My intent here was to point out it appears at first glance that > > something isn't behaving as expected in userspace, so fixing this would > > likely require some sort of work-around for broken userspace rather than > > straight-forward fixing of a bug in the kernel logic. My intent was not > > to shift blame to something besides my code & testing for the > > regression, though I certainly see how it could be interpreted that way. > > > > Regardless, I'm looking in to it. > I assume the MR you were talking about is https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3724 ? I am also aware of the tegra driver being broken on my jetson nano and I am now curious if this MR could fix this bug as well... and sorry for the harsh reply, I was just a annoyed by the fact that "everything modifier related is just breaking things", first tegra and that nobody is looking into fixing it and then apparently the userspace code being quite broken as well :/ Anyway, yeah I trust you guys on figuring out the keeping "broken" userspace happy from a kernel side and maybe I can help out with reviewing the mesa bits. I am just wondering if it could help with the tegra situation giving me more reasons to look into it as this would solve other issues I should be working on :) > If we do need to have a kernel workaround I'm happy to help out, I've > done a bunch of these and occasionally it's good to get rather > creative :-) > > Ideally we'd also push a minimal fix in userspace to all stable > branches and make sure distros upgrade (might need releases if some > distro is stuck on old horrors), so that we don't have to keep the > hack in place for 10+ years or so. Definitely if the hack amounts to > disabling modifiers on nouveau, that would be kinda sad. > -Daniel > > > > > Thanks, > > -James > > > > >> Are you using the modesetting driver in X? If so, with glamor I > > >> presume? What version of Mesa? Any distro patches? Any non-default > > >> xorg.conf options that would affect modesetting, your X driver if it > > >> isn't modesetting, or glamour? > > >> > > >> Thanks, > > >> -James > > >> > > >> On 6/30/20 4:08 PM, Kirill A. Shutemov wrote: > > >>> On Tue, Jun 02, 2020 at 04:06:32PM +1000, Dave Airlie wrote: > > >>>> James Jones (4): > > >>> ... > > >>>> drm/nouveau/kms: Support NVIDIA format modifiers > > >>> > > >>> This commit is the first one that breaks Xorg startup for my setup: > > >>> GTX 1080 + Dell UP2414Q (4K DP MST monitor). > > >>> > > >>> I believe this is the crucial part of dmesg (full dmesg is attached): > > >>> > > >>> [ 29.997140] [drm:nouveau_framebuffer_new] Unsupported modifier: > > >>> 0x314 > > >>> [ 29.997143] [drm:drm_internal_framebuffer_create] could not create > > >>> framebuffer > > >>> [ 29.997145] [drm:drm_ioctl] pid=3393, ret = -22 > > >>> > > >>> Any suggestions? > > >>> > > >> ___ > > >> dri-devel mailing list > > >> dri-de...@lists.freedesktop.org > > >> https://lists.freedesktop.org/mailman/listinfo/dri-devel > > >> > > > > > > ___ > > > dri-devel mailing list > > > dri-de...@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch >
Re: [git pull] drm for 5.8-rc1
On Wed, Jul 1, 2020 at 6:45 AM James Jones wrote: > > This implies something is trying to use one of the old > DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifiers with DRM-KMS without > first checking whether it is supported by the kernel. I had tried to > force an Xorg+Mesa stack without my userspace patches to hit this error > when testing, but must have missed some permutation. If the stalled > Mesa patches go in, this would stop happening of course, but those were > held up for a long time in review, and are now waiting on me to make > some modifications. > that's completely irrelevant. If a kernel change breaks userspace, it's a kernel bug. > Are you using the modesetting driver in X? If so, with glamor I > presume? What version of Mesa? Any distro patches? Any non-default > xorg.conf options that would affect modesetting, your X driver if it > isn't modesetting, or glamour? > > Thanks, > -James > > On 6/30/20 4:08 PM, Kirill A. Shutemov wrote: > > On Tue, Jun 02, 2020 at 04:06:32PM +1000, Dave Airlie wrote: > >> James Jones (4): > > ... > >>drm/nouveau/kms: Support NVIDIA format modifiers > > > > This commit is the first one that breaks Xorg startup for my setup: > > GTX 1080 + Dell UP2414Q (4K DP MST monitor). > > > > I believe this is the crucial part of dmesg (full dmesg is attached): > > > > [ 29.997140] [drm:nouveau_framebuffer_new] Unsupported modifier: > > 0x314 > > [ 29.997143] [drm:drm_internal_framebuffer_create] could not create > > framebuffer > > [ 29.997145] [drm:drm_ioctl] pid=3393, ret = -22 > > > > Any suggestions? > > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel >
Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
On Tue, Oct 22, 2019 at 2:45 PM Mika Westerberg wrote: > > On Tue, Oct 22, 2019 at 11:16:14AM +0200, Karol Herbst wrote: > > I think there is something I totally forgot about: > > > > When there was never a driver bound to the GPU, and if runtime power > > management gets enabled on that device, runtime suspend/resume works > > as expected (I am not 100% sure on if that always works, but I will > > recheck that). > > AFAIK, if there is no driver bound to the PCI device it is left to D0 > regardless of the runtime PM state which could explain why it works in > that case (it is never put into D3hot). > > I looked at the acpidump you sent and there is one thing that may > explain the differences between Windows and Linux. Not sure if you were > aware of this already, though. The power resource PGOF() method has > this: > >If (((OSYS <= 0x07D9) || ((OSYS == 0x07DF) && (_REV == 0x05 { > ... >} > I think this is the fallback to some older method of runtime suspending the device, and I think it will end up touching different registers on the bridge controller which do not show the broken behaviour. You'll find references to following variables which all cause a link to be powered down: Q0L2 (newest), P0L2, P0LD (oldest, I think). Maybe I remember incorrectly and have to read the code again... okay, the fallback path uses P0LD indeed. That's actually the only register of those being documented by Intel afaik. > If I read it right, the later condition tries to detect Linux which > fails nowadays but if you have acpi_rev_override in the command line (or > the machine is listed in acpi_rev_dmi_table) this check passes and does > some magic which is not clear to me. There is similar in PGON() side > which is used to turn the device back on. > > You can check what actually happens when _ON()/_OFF() is called by > passing something like below to the kernel command line: > > acpi.trace_debug_layer=0x80 acpi.trace_debug_level=0x10 > acpi.trace_method_name=\_SB.PCI0.PEG0.PG00._ON acpi.trace_state=method > > (See also Documentation/firmware-guide/acpi/method-tracing.rst). > > Trace goes to system dmesg. This sounds to be very helpful, I'll give it a try.
Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
On Mon, Oct 21, 2019 at 4:09 PM Mika Westerberg wrote: > > On Mon, Oct 21, 2019 at 03:54:09PM +0200, Karol Herbst wrote: > > > I really would like to provide you more information about such > > > workaround but I'm not aware of any ;-) I have not seen any issues like > > > this when D3cold is properly implemented in the platform. That's why > > > I'm bit skeptical that this has anything to do with specific Intel PCIe > > > ports. More likely it is some power sequence in the _ON/_OFF() methods > > > that is run differently on Windows. > > > > yeah.. maybe. I really don't know what's the actual root cause. I just > > know that with this workaround it works perfectly fine on my and some > > other systems it was tested on. Do you know who would be best to > > approach to get proper documentation about those methods and what are > > the actual prerequisites of those methods? > > Those should be documented in the ACPI spec. Chapter 7 should explain > power resources and the device power methods in detail. either I looked up the wrong spec or the documentation isn't really saying much there.
Re: [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
On Wed, Oct 16, 2019 at 11:37 PM Bjorn Helgaas wrote: > > [+cc linux-acpi] > > On Wed, Oct 16, 2019 at 09:18:32PM +0200, Karol Herbst wrote: > > but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the > > platform means of putting the device into D3cold, right? That's > > actually what should still happen, just the D3hot step should be > > skipped. > > If I understand correctly, when we put a device in D3cold on an ACPI > system, we do something like this: > > pci_set_power_state(D3cold) > if (PCI_DEV_FLAGS_NO_D3) > return 0 <-- nothing at all if quirked > pci_raw_set_power_state > pci_write_config_word(PCI_PM_CTRL, D3hot) <-- set to D3hot > __pci_complete_power_transition(D3cold) > pci_platform_power_transition(D3cold) > platform_pci_set_power_state(D3cold) > acpi_pci_set_power_state(D3cold) > acpi_device_set_power(ACPI_STATE_D3_COLD) > ... > acpi_evaluate_object("_OFF") <-- set to D3cold > > I did not understand the connection with platform (ACPI) power > management from your patch. It sounds like you want this entire path > except that you want to skip the PCI_PM_CTRL write? > exactly. I am running with this workaround for a while now and never had any fails with it anymore. The GPU gets turned off correctly and I see the same power savings, just that the GPU can be powered on again. > That seems like something Rafael should weigh in on. I don't know > why we set the device to D3hot with PCI_PM_CTRL before using the ACPI > methods, and I don't know what the effect of skipping that is. It > seems a little messy to slice out this tiny piece from the middle, but > maybe it makes sense. > afaik when I was talking with others in the past about it, Windows is doing that before using ACPI calls, but maybe they have some similar workarounds for certain intel bridges as well? I am sure it affects more than the one I am blacklisting here, but I rather want to check each device before blacklisting all kabylake and sky lake bridges (as those are the ones were this issue can be observed). Sadly we had no luck getting any information about such workaround out of Nvidia or Intel. > > On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas wrote: > > > > > > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote: > > > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher > > > > device > > > > states. > > > > > > > > v2: convert to pci_dev quirk > > > > put a proper technical explanation of the issue as a in-code comment > > > > v3: disable it only for certain combinations of intel and nvidia > > > > hardware > > > > > > > > Signed-off-by: Karol Herbst > > > > Cc: Bjorn Helgaas > > > > Cc: Lyude Paul > > > > Cc: Rafael J. Wysocki > > > > Cc: Mika Westerberg > > > > Cc: linux-...@vger.kernel.org > > > > Cc: linux...@vger.kernel.org > > > > Cc: dri-de...@lists.freedesktop.org > > > > Cc: nouv...@lists.freedesktop.org > > > > --- > > > > drivers/pci/pci.c| 11 ++ > > > > drivers/pci/quirks.c | 52 > > > > include/linux/pci.h | 1 + > > > > 3 files changed, 64 insertions(+) > > > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > > index b97d9e10c9cc..8e056eb7e6ff 100644 > > > > --- a/drivers/pci/pci.c > > > > +++ b/drivers/pci/pci.c > > > > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct > > > > pci_dev *dev) > > > > return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false; > > > > } > > > > > > > > +static inline bool parent_broken_child_pm(struct pci_dev *dev) > > > > +{ > > > > + if (!dev->bus || !dev->bus->self) > > > > + return false; > > > > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm; > > > > +} > > > > + > > > > /** > > > > * pci_raw_set_power_state - Use PCI PM registers to set the power > > > > state of > > > > *given PCI device > > > > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev > > > > *dev, pci_power_t state) > > > > || (state == PCI_D
Re: [RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges
On Tue, Oct 1, 2019 at 11:11 AM Mika Westerberg wrote: > > On Tue, Oct 01, 2019 at 10:56:39AM +0200, Karol Herbst wrote: > > On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg > > wrote: > > > > > > On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote: > > > > On Mon, Sep 30, 2019 at 6:30 PM Mika Westerberg > > > > wrote: > > > > > > > > > > On Mon, Sep 30, 2019 at 06:05:14PM +0200, Karol Herbst wrote: > > > > > > still happens with your patch applied. The machine simply gets shut > > > > > > down. > > > > > > > > > > > > dmesg can be found here: > > > > > > https://gist.githubusercontent.com/karolherbst/40eb091c7b7b33ef993525de660f1a3b/raw/2380e31f566e93e5ba7c87ef545420965d4c492c/gistfile1.txt > > > > > > > > > > Looking your dmesg: > > > > > > > > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: DCB version 4.1 > > > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: MM: using COPY for > > > > > buffer copies > > > > > Sep 30 17:24:27 kernel: [drm] Initialized nouveau 1.3.1 20120801 for > > > > > :01:00.0 on minor 1 > > > > > > > > > > I would assume it runtime suspends here. Then it wakes up because of > > > > > PCI > > > > > access from userspace: > > > > > > > > > > Sep 30 17:24:42 kernel: pci_raw_set_power_state: 56 callbacks > > > > > suppressed > > > > > > > > > > and for some reason it does not get resumed properly. There are also > > > > > few > > > > > warnings from ACPI that might be relevant: > > > > > > > > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument > > > > > #4 type mismatch - Found [Buffer], ACPI requires [Package] > > > > > (20190509/nsarguments-59) > > > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: > > > > > Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] > > > > > (20190509/nsarguments-59) > > > > > > > > > > > > > afaik this is the case for essentially every laptop out there. > > > > > > OK, so they are harmless? > > > > > > > yes > > > > > > > This seems to be Dell XPS 9560 which I think has been around some time > > > > > already so I wonder why we only see issues now. Has it ever worked for > > > > > you or maybe there is a regression that causes it to happen now? > > > > > > > > oh, it's broken since forever, we just tried to get more information > > > > from Nvidia if they know what this is all about, but we got nothing > > > > useful. > > > > > > > > We were also hoping to find a reliable fix or workaround we could have > > > > inside nouveau to fix that as I think nouveau is the only driver > > > > actually hit by this issue, but nothing turned out to be reliable > > > > enough. > > > > > > Can't you just block runtime PM from the nouveau driver until this is > > > understood better? That can be done by calling pm_runtime_forbid() (or > > > not calling pm_runtime_allow() in the driver). Or in case of PCI driver > > > you just don't decrease the reference count when probe() ends. > > > > > > > the thing is, it does work for a lot of laptops. We could only observe > > this on kaby lake and skylake ones. Even on Cannon Lakes it seems to > > work just fine. > > Can't you then limit it to those? > > I've experienced that Kabylake root ports can enter and exit in D3cold > just fine because we do that for Thunderbolt for example. But that > always requires help from ACPI. If the system is using non-standard ACPI > methods for example that may require some tricks in the driver side. > yeah.. I am not quite sure what's actually the root cause. I was also trying to use the same PCI registers ACPI is using to trigger this issue on a normal desktop, no luck. Using the same registers does trigger the issue (hence the script). The script is essentially just doing what ACPI does, just skipping a lot. > > > I think that would be much better than blocking any devices behind > > > Kabylake PCIe root ports from entering D3 (I don't really think the > > > problem is in the root ports itself but there is something we are > > > missing when the NVIDIA GPU is put into D3cold or back from there). > > > > I highly doubt there is anything wrong with the GPU alone as we have > > too many indications which tell us otherwise. > > > > Anyway, at this point I don't know where to look further for what's > > actually wrong. And apparently it works on Windows, but I don't know > > why and I have no idea what Windows does on such systems to make it > > work reliably. > > By works you mean that Windows is able to put it into D3cold and back? > If that's the case it may be that there is some ACPI magic that the > Windows driver does and we of course are missing in Linux. Afaik that's the case. We were talking with Nvidia about it, but they are not aware of any issues generally. (on Windows, nor the hardware). No idea if we can trust their statements though. But yeah, it might be that on Windows they still do _DSM calls or something... but until today, Nvidia didn't provide any documentation to us for that.
Re: [RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges
On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg wrote: > > On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote: > > On Mon, Sep 30, 2019 at 6:30 PM Mika Westerberg > > wrote: > > > > > > On Mon, Sep 30, 2019 at 06:05:14PM +0200, Karol Herbst wrote: > > > > still happens with your patch applied. The machine simply gets shut > > > > down. > > > > > > > > dmesg can be found here: > > > > https://gist.githubusercontent.com/karolherbst/40eb091c7b7b33ef993525de660f1a3b/raw/2380e31f566e93e5ba7c87ef545420965d4c492c/gistfile1.txt > > > > > > Looking your dmesg: > > > > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: DCB version 4.1 > > > Sep 30 17:24:27 kernel: nouveau :01:00.0: DRM: MM: using COPY for > > > buffer copies > > > Sep 30 17:24:27 kernel: [drm] Initialized nouveau 1.3.1 20120801 for > > > :01:00.0 on minor 1 > > > > > > I would assume it runtime suspends here. Then it wakes up because of PCI > > > access from userspace: > > > > > > Sep 30 17:24:42 kernel: pci_raw_set_power_state: 56 callbacks suppressed > > > > > > and for some reason it does not get resumed properly. There are also few > > > warnings from ACPI that might be relevant: > > > > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 > > > type mismatch - Found [Buffer], ACPI requires [Package] > > > (20190509/nsarguments-59) > > > Sep 30 17:24:27 kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument > > > #4 type mismatch - Found [Buffer], ACPI requires [Package] > > > (20190509/nsarguments-59) > > > > > > > afaik this is the case for essentially every laptop out there. > > OK, so they are harmless? > yes > > > This seems to be Dell XPS 9560 which I think has been around some time > > > already so I wonder why we only see issues now. Has it ever worked for > > > you or maybe there is a regression that causes it to happen now? > > > > oh, it's broken since forever, we just tried to get more information > > from Nvidia if they know what this is all about, but we got nothing > > useful. > > > > We were also hoping to find a reliable fix or workaround we could have > > inside nouveau to fix that as I think nouveau is the only driver > > actually hit by this issue, but nothing turned out to be reliable > > enough. > > Can't you just block runtime PM from the nouveau driver until this is > understood better? That can be done by calling pm_runtime_forbid() (or > not calling pm_runtime_allow() in the driver). Or in case of PCI driver > you just don't decrease the reference count when probe() ends. > the thing is, it does work for a lot of laptops. We could only observe this on kaby lake and skylake ones. Even on Cannon Lakes it seems to work just fine. > I think that would be much better than blocking any devices behind > Kabylake PCIe root ports from entering D3 (I don't really think the > problem is in the root ports itself but there is something we are > missing when the NVIDIA GPU is put into D3cold or back from there). I highly doubt there is anything wrong with the GPU alone as we have too many indications which tell us otherwise. Anyway, at this point I don't know where to look further for what's actually wrong. And apparently it works on Windows, but I don't know why and I have no idea what Windows does on such systems to make it work reliably.
Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM _OSI string to enable dGPU direct output"
On Thu, Aug 15, 2019 at 4:30 PM wrote: > > > On Thu, Aug 15, 2019 at 10:15 AM Karol Herbst wrote: > > > > > > On Thu, Aug 15, 2019 at 4:13 PM Alex Deucher > > wrote: > > > > > > > > On Thu, Aug 15, 2019 at 10:04 AM Karol Herbst > > > > wrote: > > > > > > > > > > On Thu, Aug 15, 2019 at 3:56 PM wrote: > > > > > > > > > > > > > -Original Message- > > > > > > > From: linux-acpi-ow...@vger.kernel.org > ow...@vger.kernel.org> On > > > > > > > Behalf Of Dave Airlie > > > > > > > Sent: Wednesday, August 14, 2019 5:48 PM > > > > > > > To: Karol Herbst > > > > > > > Cc: LKML; Linux ACPI; dri-devel; nouveau; Rafael J . Wysocki; > > > > > > > Alex Hung; > > Ben > > > > > > > Skeggs; Dave Airlie > > > > > > > Subject: Re: [Nouveau] [PATCH 1/7] Revert "ACPI / OSI: Add OEM > > > > > > > _OSI > > string to > > > > > > > enable dGPU direct output" > > > > > > > > > > > > > > On Thu, 15 Aug 2019 at 07:31, Karol Herbst > > wrote: > > > > > > > > > > > > > > > > This reverts commit 28586a51eea666d5531bcaef2f68e4abbd87242c. > > > > > > > > > > > > > > > > The original commit message didn't even make sense. AMD _does_ > > support it and > > > > > > > > it works with Nouveau as well. > > > > > > > > > > > > > > > > Also what was the issue being solved here? No references to any > > > > > > > > bugs > > and not > > > > > > > > even explaining any issue at all isn't the way we do things. > > > > > > > > > > > > > > > > And even if it means a muxed design, then the fix is to make it > > > > > > > > work > > inside the > > > > > > > > driver, not adding some hacky workaround through ACPI tricks. > > > > > > > > > > > > > > > > And what out of tree drivers do or do not support we don't care > > > > > > > > one > > bit anyway. > > > > > > > > > > > > > > > > > > > > > > I think the reverts should be merged via Rafael's tree as the > > > > > > > original > > > > > > > patches went in via there, and we should get them in asap. > > > > > > > > > > > > > > Acked-by: Dave Airlie > > > > > > > Dave. > > > > > > > > > > > > There are definitely going to be regressions on machines in the > > > > > > field with > > the > > > > > > in tree drivers by reverting this. I think we should have an > > > > > > answer for all > > of those > > > > > > before this revert is accepted. > > > > > > > > > > > > Regarding systems with Intel+NVIDIA, we'll have to work with > > > > > > partners to > > collect > > > > > > some information on the impact of reverting this. > > > > > > > > > > > > When this is used on a system with Intel+AMD the ASL configures AMD > > GPU to use > > > > > > "Hybrid Graphics" when on Windows and "Power Express" and > > "Switchable Graphics" > > > > > > when on Linux. > > > > > > > > > > and what's exactly the difference between those? And what's the actual > > > > > issue here? > > > > > > > > Hybrid Graphics is the new "standard" way of handling these laptops. > > > > It uses the standard _PR3 APCI method to handle dGPU power. Support > > > > for this was added to Linux relatively later compared to when the > > > > laptops were launched. "Power Express" used the other AMD specific > > > > ATPX ACPI method to handle dGPU power. The driver supports both so > > > > either method will work. > > > > > > > > Alex > > > > > > > > > > thanks for clarifying. But that still means that we won't need such > > > workarounds for AMD users, right? amdgpu handles hybrid graphics just > > > fine, right? > > > > Yeah it should, assuming you have a new enough kernel which supports > > HG, which has been several years at this point IIRC. > > > > Alex > > > > Can you define how new of a kernel is a new enough kernel? > > Looking on my side these problems were on new hardware (Precision 7740) and > are checked as recently as start of this summer, w/ kernel 4.15. That's not even a long term one. And it shouldn't get any fixes. I just checked, last update was 16 months ago. > > We can arrange to have it checked again on 5.3rcX w/ the OSI disabled. yeah, please do. If there are any issues, we (as in drm developers) are happy to fix the issues inside the drivers.
Re: [PATCH 1/5] drm/nouveau: Check backlight IDs are >= 0, not > 0
Patches 1-5 are Reviewed-by: Karol Herbst I think it might be worth to test those patches on a system without any backlight devices just to verify we don't break things, but the code looked good already, so maybe we don't really need to test. On Thu, Aug 23, 2018 at 3:21 AM, Lyude Paul wrote: > Remember, ida IDs start at 0, not 1! > > Signed-off-by: Lyude Paul > Cc: sta...@vger.kernel.org > Cc: Jeffery Miller > Cc: Karol Herbst > --- > drivers/gpu/drm/nouveau/nouveau_backlight.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c > b/drivers/gpu/drm/nouveau/nouveau_backlight.c > index 408b955e5c39..6dd72bc32897 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c > +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c > @@ -116,7 +116,7 @@ nv40_backlight_init(struct drm_connector *connector) >_bl_ops, ); > > if (IS_ERR(bd)) { > - if (bl_connector.id > 0) > + if (bl_connector.id >= 0) > ida_simple_remove(_ida, bl_connector.id); > return PTR_ERR(bd); > } > @@ -249,7 +249,7 @@ nv50_backlight_init(struct drm_connector *connector) >nv_encoder, ops, ); > > if (IS_ERR(bd)) { > - if (bl_connector.id > 0) > + if (bl_connector.id >= 0) > ida_simple_remove(_ida, bl_connector.id); > return PTR_ERR(bd); > } > -- > 2.17.1 >
Re: [PATCH 1/5] drm/nouveau: Check backlight IDs are >= 0, not > 0
Patches 1-5 are Reviewed-by: Karol Herbst I think it might be worth to test those patches on a system without any backlight devices just to verify we don't break things, but the code looked good already, so maybe we don't really need to test. On Thu, Aug 23, 2018 at 3:21 AM, Lyude Paul wrote: > Remember, ida IDs start at 0, not 1! > > Signed-off-by: Lyude Paul > Cc: sta...@vger.kernel.org > Cc: Jeffery Miller > Cc: Karol Herbst > --- > drivers/gpu/drm/nouveau/nouveau_backlight.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c > b/drivers/gpu/drm/nouveau/nouveau_backlight.c > index 408b955e5c39..6dd72bc32897 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c > +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c > @@ -116,7 +116,7 @@ nv40_backlight_init(struct drm_connector *connector) >_bl_ops, ); > > if (IS_ERR(bd)) { > - if (bl_connector.id > 0) > + if (bl_connector.id >= 0) > ida_simple_remove(_ida, bl_connector.id); > return PTR_ERR(bd); > } > @@ -249,7 +249,7 @@ nv50_backlight_init(struct drm_connector *connector) >nv_encoder, ops, ); > > if (IS_ERR(bd)) { > - if (bl_connector.id > 0) > + if (bl_connector.id >= 0) > ida_simple_remove(_ida, bl_connector.id); > return PTR_ERR(bd); > } > -- > 2.17.1 >
Re: [Nouveau] [RFC v2 1/4] drm/nouveau: Add support for basic clockgating on Kepler1
On Fri, Jan 26, 2018 at 4:35 AM, Lyude Paulwrote: > This adds support for enabling automatic clockgating on nvidia GPUs for > Kepler1. While this is not technically a clockgating level, it does > enable clockgating using the clockgating values initially set by the > vbios (which should be safe to use). > > This introduces two therm helpers for controlling basic clockgating: > nvkm_therm_clkgate_enable() - enables clockgating through > CG_CTRL, done after initializing the GPU fully > nvkm_therm_clkgate_fini() - prepares clockgating for suspend or > driver unload > > As well, we add the nouveau kernel config parameter NvPmEnableGating, > which can be toggled on or off in order to enable/disable clockgating. > Since we've only had limited testing on this thus far, we disable this > by default. > > A lot of this code was originally going to be based off of fermi; > however it turns out that while Fermi's the first line of GPUs that > introduced this kind of power saving, Fermi requires more fine tuned > control of the CG_CTRL registers from the driver while reclocking that > we don't entirely understand yet. > > For the simple parts we will be sharing with Fermi for certain however, > we at least add those into a new subdev/therm/gf100.h header. > > Signed-off-by: Lyude Paul > --- > .../gpu/drm/nouveau/include/nvkm/subdev/therm.h| 5 + > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 17 +-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild | 1 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c | 60 +++-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h | 35 ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c | 8 +- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c | 135 > + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h | 48 > drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 15 ++- > 9 files changed, 303 insertions(+), 21 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index b1ac47eb786e..240b19bb4667 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -85,17 +85,22 @@ struct nvkm_therm { > > int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type); > int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int); > + > + bool clkgating_enabled; > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > int nvkm_therm_fan_sense(struct nvkm_therm *); > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > +void nvkm_therm_clkgate_enable(struct nvkm_therm *); > +void nvkm_therm_clkgate_fini(struct nvkm_therm *, bool); > > int nv40_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > +int gk104_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm200_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gp100_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > index 08e77cd55e6e..74bd09b1c893 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > @@ -28,6 +28,7 @@ > #include > > #include > +#include > > static DEFINE_MUTEX(nv_devices_mutex); > static LIST_HEAD(nv_devices); > @@ -1682,7 +1683,7 @@ nve4_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top = gk104_top_new, > .volt = gk104_volt_new, > @@ -1721,7 +1722,7 @@ nve6_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top = gk104_top_new, > .volt = gk104_volt_new, > @@ -1760,7 +1761,7 @@ nve7_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top
Re: [Nouveau] [RFC v2 1/4] drm/nouveau: Add support for basic clockgating on Kepler1
On Fri, Jan 26, 2018 at 4:35 AM, Lyude Paul wrote: > This adds support for enabling automatic clockgating on nvidia GPUs for > Kepler1. While this is not technically a clockgating level, it does > enable clockgating using the clockgating values initially set by the > vbios (which should be safe to use). > > This introduces two therm helpers for controlling basic clockgating: > nvkm_therm_clkgate_enable() - enables clockgating through > CG_CTRL, done after initializing the GPU fully > nvkm_therm_clkgate_fini() - prepares clockgating for suspend or > driver unload > > As well, we add the nouveau kernel config parameter NvPmEnableGating, > which can be toggled on or off in order to enable/disable clockgating. > Since we've only had limited testing on this thus far, we disable this > by default. > > A lot of this code was originally going to be based off of fermi; > however it turns out that while Fermi's the first line of GPUs that > introduced this kind of power saving, Fermi requires more fine tuned > control of the CG_CTRL registers from the driver while reclocking that > we don't entirely understand yet. > > For the simple parts we will be sharing with Fermi for certain however, > we at least add those into a new subdev/therm/gf100.h header. > > Signed-off-by: Lyude Paul > --- > .../gpu/drm/nouveau/include/nvkm/subdev/therm.h| 5 + > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 17 +-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild | 1 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c | 60 +++-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h | 35 ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c | 8 +- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c | 135 > + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h | 48 > drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 15 ++- > 9 files changed, 303 insertions(+), 21 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index b1ac47eb786e..240b19bb4667 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -85,17 +85,22 @@ struct nvkm_therm { > > int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type); > int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int); > + > + bool clkgating_enabled; > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > int nvkm_therm_fan_sense(struct nvkm_therm *); > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > +void nvkm_therm_clkgate_enable(struct nvkm_therm *); > +void nvkm_therm_clkgate_fini(struct nvkm_therm *, bool); > > int nv40_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > +int gk104_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm200_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gp100_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > index 08e77cd55e6e..74bd09b1c893 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > @@ -28,6 +28,7 @@ > #include > > #include > +#include > > static DEFINE_MUTEX(nv_devices_mutex); > static LIST_HEAD(nv_devices); > @@ -1682,7 +1683,7 @@ nve4_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top = gk104_top_new, > .volt = gk104_volt_new, > @@ -1721,7 +1722,7 @@ nve6_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top = gk104_top_new, > .volt = gk104_volt_new, > @@ -1760,7 +1761,7 @@ nve7_chipset = { > .mxm = nv50_mxm_new, > .pci = gk104_pci_new, > .pmu = gk104_pmu_new, > - .therm = gf119_therm_new, > + .therm = gk104_therm_new, > .timer = nv41_timer_new, > .top = gk104_top_new, > .volt =
Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon
Reviewed-by: Karol Herbst <kher...@redhat.com> On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt <lui...@kernel.org> wrote: > The trailing semicolon is an empty statement that does no operation. > Removing it since it doesn't do anything. > > Signed-off-by: Luis de Bethencourt <lui...@kernel.org> > --- > > Hi, > > After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches > suggested I fix it treewide [0]. > > Best regards > Luis > > > [0] > http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html > [1] > http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > index e35d3e17cd7c..93946dcee319 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 > size, bool ref) > else > block = (size >> page[i].shift) << > page[i].shift; > } else { > - block = (size >> page[i].shift) << page[i].shift;; > + block = (size >> page[i].shift) << page[i].shift; > } > > /* Perform operation. */ > -- > 2.15.1 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon
Reviewed-by: Karol Herbst On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt wrote: > The trailing semicolon is an empty statement that does no operation. > Removing it since it doesn't do anything. > > Signed-off-by: Luis de Bethencourt > --- > > Hi, > > After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches > suggested I fix it treewide [0]. > > Best regards > Luis > > > [0] > http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html > [1] > http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > index e35d3e17cd7c..93946dcee319 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c > @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 > size, bool ref) > else > block = (size >> page[i].shift) << > page[i].shift; > } else { > - block = (size >> page[i].shift) << page[i].shift;; > + block = (size >> page[i].shift) << page[i].shift; > } > > /* Perform operation. */ > -- > 2.15.1 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
Hi Peter, the basic idea is to detect if a driver accesses a memory region mapped through ioremap. This is super usefull for reverse engineering closed source drivers like the Nvidia GPU driver. So here is what it does: 1. on ioremap the entire memory region mapped is registered in the mmiotracer and marked as not presen, which basically leads to page faults on acces 2. mmiotrace is the registered page fault handler for those pages and while handling the page (which basically means marking them as presen, because they were never missing in the first place) it parses the current instruction to detect if it was a read or write and writes relevant information into a file. This includes address accessed, value read/written, type of instruction 3. after single stepping, the page is marked as not present again 4. on unmap time, mmiotrace unregisteres those regions and marks them as present this is more or less the basic idea. And to answer your question how it is not completely broken: I don't know. It works for us (more or less, we can't parse repeat instructions as one example what does not work) and if we come across issues we try to fix them on the way. Anyway, this is a super useful tool to record and debug what a driver is doing with hardware and helps tracking down a lot of this, especially for Nouveau. I hope that helps. On Tue, Dec 12, 2017 at 3:04 PM, Ingo Molnar <mi...@kernel.org> wrote: > > * Peter Zijlstra <pet...@infradead.org> wrote: > >> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote: >> > Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687 >> > Gitweb: >> > https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687 >> > Author: Karol Herbst <kher...@redhat.com> >> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100 >> > Committer: Ingo Molnar <mi...@kernel.org> >> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100 >> > >> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses >> >> OK, let me hijack this thread since apparently people use and care about >> mmiotrace. >> >> I was recently auditing the x86 tlb flushing and ran across this >> 'thing'. Can someone please explain to me how this is supposed to work >> and how its not completely broken? > > (I have Cc:-ed other gents as well.) > > Thanks, > > Ingo
Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
Hi Peter, the basic idea is to detect if a driver accesses a memory region mapped through ioremap. This is super usefull for reverse engineering closed source drivers like the Nvidia GPU driver. So here is what it does: 1. on ioremap the entire memory region mapped is registered in the mmiotracer and marked as not presen, which basically leads to page faults on acces 2. mmiotrace is the registered page fault handler for those pages and while handling the page (which basically means marking them as presen, because they were never missing in the first place) it parses the current instruction to detect if it was a read or write and writes relevant information into a file. This includes address accessed, value read/written, type of instruction 3. after single stepping, the page is marked as not present again 4. on unmap time, mmiotrace unregisteres those regions and marks them as present this is more or less the basic idea. And to answer your question how it is not completely broken: I don't know. It works for us (more or less, we can't parse repeat instructions as one example what does not work) and if we come across issues we try to fix them on the way. Anyway, this is a super useful tool to record and debug what a driver is doing with hardware and helps tracking down a lot of this, especially for Nouveau. I hope that helps. On Tue, Dec 12, 2017 at 3:04 PM, Ingo Molnar wrote: > > * Peter Zijlstra wrote: > >> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote: >> > Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687 >> > Gitweb: >> > https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687 >> > Author: Karol Herbst >> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100 >> > Committer: Ingo Molnar >> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100 >> > >> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses >> >> OK, let me hijack this thread since apparently people use and care about >> mmiotrace. >> >> I was recently auditing the x86 tlb flushing and ran across this >> 'thing'. Can someone please explain to me how this is supposed to work >> and how its not completely broken? > > (I have Cc:-ed other gents as well.) > > Thanks, > > Ingo
[tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687 Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687 Author: Karol Herbst <kher...@redhat.com> AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Mon, 11 Dec 2017 15:35:18 +0100 x86/mm/kmmio: Fix mmiotrace for page unaligned addresses If something calls ioremap() with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap() the address passed to unregister_kmmio_probe() was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap() prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Tested-by: Lyude <ly...@redhat.com> Signed-off-by: Karol Herbst <kher...@redhat.com> Acked-by: Pekka Paalanen <ppaala...@gmail.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Steven Rostedt <rost...@goodmis.org> Cc: Thomas Gleixner <t...@linutronix.de> Cc: nouv...@lists.freedesktop.org Link: http://lkml.kernel.org/r/20171127075139.4928-1-kher...@redhat.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/kmmio.c | 12 +++- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 6e4573b..c45b6ec 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -404,11 +404,11 @@ void iounmap(volatile void __iomem *addr) return; } + mmiotrace_iounmap(addr); + addr = (volatile void __iomem *) (PAGE_MASK & (unsigned long __force)addr); - mmiotrace_iounmap(addr); - /* Use the vm area unlocked, assuming the caller ensures there isn't another iounmap for the same address in parallel. Reuse of the virtual address is prevented by diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index c21c2ed..58477ec 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p) unsigned long flags; int ret = 0; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); unsigned int l; pte_t *pte; spin_lock_irqsave(_lock, flags); - if (get_kmmio_probe(p->addr)) { + if (get_kmmio_probe(addr)) { ret = -EEXIST; goto out; } - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) { ret = -EINVAL; goto out; @@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p) kmmio_count++; list_add_rcu(>list, _probes); while (size < size_lim) { - if (add_kmmio_fault_page(p->addr + size)) + if (add_kmmio_fault_page(addr + size)) pr_err("Unable to set page fault.\n"); size += page_level_size(l); } @@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p) { unsigned long flags; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); struct kmmio_fault_page *release_list = NULL; struct kmmio_delayed_release *drelease; unsigned int l; pte_t *pte; - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) return; spin_lock_irqsave(_lock, flags); while (size < size_lim) { - release_kmmio_fault_page(p->addr + size, _list); + release_kmmio_fault_page(addr + size, _list); size += page_level_size(l); } list_del_rcu(>list);
[tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687 Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687 Author: Karol Herbst AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100 Committer: Ingo Molnar CommitDate: Mon, 11 Dec 2017 15:35:18 +0100 x86/mm/kmmio: Fix mmiotrace for page unaligned addresses If something calls ioremap() with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap() the address passed to unregister_kmmio_probe() was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap() prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Tested-by: Lyude Signed-off-by: Karol Herbst Acked-by: Pekka Paalanen Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner Cc: nouv...@lists.freedesktop.org Link: http://lkml.kernel.org/r/20171127075139.4928-1-kher...@redhat.com Signed-off-by: Ingo Molnar --- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/kmmio.c | 12 +++- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 6e4573b..c45b6ec 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -404,11 +404,11 @@ void iounmap(volatile void __iomem *addr) return; } + mmiotrace_iounmap(addr); + addr = (volatile void __iomem *) (PAGE_MASK & (unsigned long __force)addr); - mmiotrace_iounmap(addr); - /* Use the vm area unlocked, assuming the caller ensures there isn't another iounmap for the same address in parallel. Reuse of the virtual address is prevented by diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index c21c2ed..58477ec 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p) unsigned long flags; int ret = 0; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); unsigned int l; pte_t *pte; spin_lock_irqsave(_lock, flags); - if (get_kmmio_probe(p->addr)) { + if (get_kmmio_probe(addr)) { ret = -EEXIST; goto out; } - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) { ret = -EINVAL; goto out; @@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p) kmmio_count++; list_add_rcu(>list, _probes); while (size < size_lim) { - if (add_kmmio_fault_page(p->addr + size)) + if (add_kmmio_fault_page(addr + size)) pr_err("Unable to set page fault.\n"); size += page_level_size(l); } @@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p) { unsigned long flags; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); struct kmmio_fault_page *release_list = NULL; struct kmmio_delayed_release *drelease; unsigned int l; pte_t *pte; - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) return; spin_lock_irqsave(_lock, flags); while (size < size_lim) { - release_kmmio_fault_page(p->addr + size, _list); + release_kmmio_fault_page(addr + size, _list); size += page_level_size(l); } list_del_rcu(>list);
Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings
Reviewed-by: Karol Herbst <kher...@redhat.com> On Thu, Nov 30, 2017 at 8:53 PM, Christoph Böhmwalder <christ...@boehmwalder.at> wrote: > The kbuild test bot complained about a new coccinelle warning nearby, > which sparked a discussion about the assignment to 'memory' inside of > the conditional expression. See Link below for the original post. > > Fix the assignment to silence the coccinelle warning and also make the > code look a little nicer. > > Link: https://lists.freedesktop.org/archives/nouveau/2017-November/029242.html > Signed-off-by: Christoph Böhmwalder <christ...@boehmwalder.at> > --- > drm/nouveau/nvkm/subdev/mmu/uvmm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drm/nouveau/nvkm/subdev/mmu/uvmm.c > b/drm/nouveau/nvkm/subdev/mmu/uvmm.c > index fa81d0c1..37b201b9 100644 > --- a/drm/nouveau/nvkm/subdev/mmu/uvmm.c > +++ b/drm/nouveau/nvkm/subdev/mmu/uvmm.c > @@ -106,7 +106,8 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvmm, void *argv, > u32 argc) > } else > return ret; > > - if (IS_ERR((memory = nvkm_umem_search(client, handle { > + memory = nvkm_umem_search(client, handle); > + if (IS_ERR(memory)) { > VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > PTR_ERR(memory)); > return PTR_ERR(memory); > } > -- > 2.13.6 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings
Reviewed-by: Karol Herbst On Thu, Nov 30, 2017 at 8:53 PM, Christoph Böhmwalder wrote: > The kbuild test bot complained about a new coccinelle warning nearby, > which sparked a discussion about the assignment to 'memory' inside of > the conditional expression. See Link below for the original post. > > Fix the assignment to silence the coccinelle warning and also make the > code look a little nicer. > > Link: https://lists.freedesktop.org/archives/nouveau/2017-November/029242.html > Signed-off-by: Christoph Böhmwalder > --- > drm/nouveau/nvkm/subdev/mmu/uvmm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drm/nouveau/nvkm/subdev/mmu/uvmm.c > b/drm/nouveau/nvkm/subdev/mmu/uvmm.c > index fa81d0c1..37b201b9 100644 > --- a/drm/nouveau/nvkm/subdev/mmu/uvmm.c > +++ b/drm/nouveau/nvkm/subdev/mmu/uvmm.c > @@ -106,7 +106,8 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvmm, void *argv, > u32 argc) > } else > return ret; > > - if (IS_ERR((memory = nvkm_umem_search(client, handle { > + memory = nvkm_umem_search(client, handle); > + if (IS_ERR(memory)) { > VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > PTR_ERR(memory)); > return PTR_ERR(memory); > } > -- > 2.13.6 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings (fwd)
Hi julia, I think it would be better to extract the assignment out of the if clause. Then you'll get something like this: memory = ...; if (IS_ERR(memory)) { } so, to answer your question: no, it isn't necessary. On Tue, Nov 28, 2017 at 1:56 PM, Julia Lawallwrote: > This is a false positive, but I wonder if it is really necessary to put > the assignment in the conditional test expression. > > julia > > -- Forwarded message -- > Date: Tue, 28 Nov 2017 13:23:36 +0800 > From: kbuild test robot > To: kbu...@01.org > Cc: Julia Lawall > Subject: [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings > > CC: kbuild-...@01.org > CC: linux-kernel@vger.kernel.org > TO: Ben Skeggs > CC: David Airlie > CC: dri-de...@lists.freedesktop.org > CC: nouv...@lists.freedesktop.org > CC: linux-kernel@vger.kernel.org > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR > and PTR_ERR on line 110. > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR > and PTR_ERR on line 111. > > PTR_ERR should access the value just tested by IS_ERR > > Semantic patch information: > There can be false positives in the patch case, where it is the call to > IS_ERR that is wrong. > > Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci > > Fixes: 920d2b5ef215 ("drm/nouveau/mmu: define user interfaces to mmu vmm > opertaions") > Signed-off-by: Fengguang Wu > --- > > Please take the patch only if it's a positive warning. Thanks! > > uvmm.c |5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c > @@ -107,8 +107,9 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvm > return ret; > > if (IS_ERR((memory = nvkm_umem_search(client, handle { > - VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > PTR_ERR(memory)); > - return PTR_ERR(memory); > + VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > + PTR_ERR((memory = nvkm_umem_search(client, > handle; > + return PTR_ERR((memory = nvkm_umem_search(client, handle))); > } > > mutex_lock(>mutex); > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings (fwd)
Hi julia, I think it would be better to extract the assignment out of the if clause. Then you'll get something like this: memory = ...; if (IS_ERR(memory)) { } so, to answer your question: no, it isn't necessary. On Tue, Nov 28, 2017 at 1:56 PM, Julia Lawall wrote: > This is a false positive, but I wonder if it is really necessary to put > the assignment in the conditional test expression. > > julia > > -- Forwarded message -- > Date: Tue, 28 Nov 2017 13:23:36 +0800 > From: kbuild test robot > To: kbu...@01.org > Cc: Julia Lawall > Subject: [PATCH] drm/nouveau/mmu: fix odd_ptr_err.cocci warnings > > CC: kbuild-...@01.org > CC: linux-kernel@vger.kernel.org > TO: Ben Skeggs > CC: David Airlie > CC: dri-de...@lists.freedesktop.org > CC: nouv...@lists.freedesktop.org > CC: linux-kernel@vger.kernel.org > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR > and PTR_ERR on line 110. > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c:109:5-11: inconsistent IS_ERR > and PTR_ERR on line 111. > > PTR_ERR should access the value just tested by IS_ERR > > Semantic patch information: > There can be false positives in the patch case, where it is the call to > IS_ERR that is wrong. > > Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci > > Fixes: 920d2b5ef215 ("drm/nouveau/mmu: define user interfaces to mmu vmm > opertaions") > Signed-off-by: Fengguang Wu > --- > > Please take the patch only if it's a positive warning. Thanks! > > uvmm.c |5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c > @@ -107,8 +107,9 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvm > return ret; > > if (IS_ERR((memory = nvkm_umem_search(client, handle { > - VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > PTR_ERR(memory)); > - return PTR_ERR(memory); > + VMM_DEBUG(vmm, "memory %016llx %ld\n", handle, > + PTR_ERR((memory = nvkm_umem_search(client, > handle; > + return PTR_ERR((memory = nvkm_umem_search(client, handle))); > } > > mutex_lock(>mutex); > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
[PATCH] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
If something calls ioremap with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap the address passed to unregister_kmmio_probe was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Cc: Steven Rostedt <rost...@goodmis.org> Cc: Ingo Molnar <mi...@kernel.org> Cc: Thomas Gleixner <t...@linutronix.de> Cc: Pekka Paalanen <ppaala...@gmail.com> Cc: nouv...@lists.freedesktop.org Cc: x...@kernel.org Acked-by: Pekka Paalanen <ppaala...@gmail.com> Tested-by: Lyude <ly...@redhat.com> Signed-off-by: Karol Herbst <kher...@redhat.com> --- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/kmmio.c | 12 +++- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 34f0e1847dd6..5d4c358778dd 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -349,11 +349,11 @@ void iounmap(volatile void __iomem *addr) return; } + mmiotrace_iounmap(addr); + addr = (volatile void __iomem *) (PAGE_MASK & (unsigned long __force)addr); - mmiotrace_iounmap(addr); - /* Use the vm area unlocked, assuming the caller ensures there isn't another iounmap for the same address in parallel. Reuse of the virtual address is prevented by diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index c21c2ed04612..58477ec3d66d 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p) unsigned long flags; int ret = 0; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); unsigned int l; pte_t *pte; spin_lock_irqsave(_lock, flags); - if (get_kmmio_probe(p->addr)) { + if (get_kmmio_probe(addr)) { ret = -EEXIST; goto out; } - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) { ret = -EINVAL; goto out; @@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p) kmmio_count++; list_add_rcu(>list, _probes); while (size < size_lim) { - if (add_kmmio_fault_page(p->addr + size)) + if (add_kmmio_fault_page(addr + size)) pr_err("Unable to set page fault.\n"); size += page_level_size(l); } @@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p) { unsigned long flags; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); struct kmmio_fault_page *release_list = NULL; struct kmmio_delayed_release *drelease; unsigned int l; pte_t *pte; - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) return; spin_lock_irqsave(_lock, flags); while (size < size_lim) { - release_kmmio_fault_page(p->addr + size, _list); + release_kmmio_fault_page(addr + size, _list); size += page_level_size(l); } list_del_rcu(>list); -- 2.14.3
[PATCH] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
If something calls ioremap with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap the address passed to unregister_kmmio_probe was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Cc: Steven Rostedt Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Pekka Paalanen Cc: nouv...@lists.freedesktop.org Cc: x...@kernel.org Acked-by: Pekka Paalanen Tested-by: Lyude Signed-off-by: Karol Herbst --- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/kmmio.c | 12 +++- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 34f0e1847dd6..5d4c358778dd 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -349,11 +349,11 @@ void iounmap(volatile void __iomem *addr) return; } + mmiotrace_iounmap(addr); + addr = (volatile void __iomem *) (PAGE_MASK & (unsigned long __force)addr); - mmiotrace_iounmap(addr); - /* Use the vm area unlocked, assuming the caller ensures there isn't another iounmap for the same address in parallel. Reuse of the virtual address is prevented by diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index c21c2ed04612..58477ec3d66d 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p) unsigned long flags; int ret = 0; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); unsigned int l; pte_t *pte; spin_lock_irqsave(_lock, flags); - if (get_kmmio_probe(p->addr)) { + if (get_kmmio_probe(addr)) { ret = -EEXIST; goto out; } - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) { ret = -EINVAL; goto out; @@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p) kmmio_count++; list_add_rcu(>list, _probes); while (size < size_lim) { - if (add_kmmio_fault_page(p->addr + size)) + if (add_kmmio_fault_page(addr + size)) pr_err("Unable to set page fault.\n"); size += page_level_size(l); } @@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p) { unsigned long flags; unsigned long size = 0; + unsigned long addr = p->addr & PAGE_MASK; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); struct kmmio_fault_page *release_list = NULL; struct kmmio_delayed_release *drelease; unsigned int l; pte_t *pte; - pte = lookup_address(p->addr, ); + pte = lookup_address(addr, ); if (!pte) return; spin_lock_irqsave(_lock, flags); while (size < size_lim) { - release_kmmio_fault_page(p->addr + size, _list); + release_kmmio_fault_page(addr + size, _list); size += page_level_size(l); } list_del_rcu(>list); -- 2.14.3
Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning
On Wed, Sep 6, 2017 at 10:11 PM, Arnd Bergmann <a...@arndb.de> wrote: > On Wed, Sep 6, 2017 at 4:20 PM, Karol Herbst <karolher...@gmail.com> wrote: >>> In this instance, I think using multiplication is more intuitive >>> than '&&', so I'm adding a comparison to zero instead to shut up >>> the warning. To further improve readability, I also make the >>> error case indented and leave the normal case as the final 'return' >>> statement. >>> >> >> I think to make perfectly clear why this check is done, we simply >> should precompute the denominator and do something like this: >> >> int denominator = M * P >> if (denominator == 0) >>return 0; >> return sclk * N / denominator; >> >> but with a better name for "denominator". > > I don't know what M and P actually are in this function, so I couldn't > come up with a much better name either, how about simply 'divisor'? > what about "MP"? M and P are simply dividers for the PLL configuration and there are two of them. I think "P" is the post divider. >> I even imagine, that there >> are some macros in the kernel for dealing with something like this >> already, so that we could do instead: >> >> return AWESOME_DIV_MACRO(sclk * N, M * P) > > I don't think that exists, the behavior for divide-by-zero really > depends a lot on the context, and returning zero is probably > often not a good solution. > > Arnd
Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning
On Wed, Sep 6, 2017 at 10:11 PM, Arnd Bergmann wrote: > On Wed, Sep 6, 2017 at 4:20 PM, Karol Herbst wrote: >>> In this instance, I think using multiplication is more intuitive >>> than '&&', so I'm adding a comparison to zero instead to shut up >>> the warning. To further improve readability, I also make the >>> error case indented and leave the normal case as the final 'return' >>> statement. >>> >> >> I think to make perfectly clear why this check is done, we simply >> should precompute the denominator and do something like this: >> >> int denominator = M * P >> if (denominator == 0) >>return 0; >> return sclk * N / denominator; >> >> but with a better name for "denominator". > > I don't know what M and P actually are in this function, so I couldn't > come up with a much better name either, how about simply 'divisor'? > what about "MP"? M and P are simply dividers for the PLL configuration and there are two of them. I think "P" is the post divider. >> I even imagine, that there >> are some macros in the kernel for dealing with something like this >> already, so that we could do instead: >> >> return AWESOME_DIV_MACRO(sclk * N, M * P) > > I don't think that exists, the behavior for divide-by-zero really > depends a lot on the context, and returning zero is probably > often not a good solution. > > Arnd
Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning
On Wed, Sep 6, 2017 at 3:56 PM, Arnd Bergmannwrote: > gcc thinks that interpreting a multiplication result as a bool > is confusing: > > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c: In function 'read_pll': > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:133:8: error: '*' in boolean > context, suggest '&&' instead [-Werror=int-in-bool-context] > > In this instance, I think using multiplication is more intuitive > than '&&', so I'm adding a comparison to zero instead to shut up > the warning. To further improve readability, I also make the > error case indented and leave the normal case as the final 'return' > statement. > I think to make perfectly clear why this check is done, we simply should precompute the denominator and do something like this: int denominator = M * P if (denominator == 0) return 0; return sclk * N / denominator; but with a better name for "denominator". I even imagine, that there are some macros in the kernel for dealing with something like this already, so that we could do instead: return AWESOME_DIV_MACRO(sclk * N, M * P) > Fixes: 7632b30e4b8b ("drm/nouveau/clk: namespace + nvidia gpu names (no > binary change)") > Signed-off-by: Arnd Bergmann > --- > Originally submitted on July 14, but no reply. This is the same > patch again. The warning is currently disabled in mainline, but > I think we can turn it back on in the future, and this change here > seems harmless. > --- > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > index 96e0941c8edd..04b4f4ccf186 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > @@ -130,10 +130,10 @@ read_pll(struct gt215_clk *clk, int idx, u32 pll) > sclk = read_clk(clk, 0x10 + idx, false); > } > > - if (M * P) > - return sclk * N / (M * P); > + if (M * P == 0) > + return 0; > > - return 0; > + return sclk * N / (M * P); > } > > static int > -- > 2.9.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] [RESEND] drm/nouveau/clk: fix gcc-7 -Wint-in-bool-context warning
On Wed, Sep 6, 2017 at 3:56 PM, Arnd Bergmann wrote: > gcc thinks that interpreting a multiplication result as a bool > is confusing: > > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c: In function 'read_pll': > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c:133:8: error: '*' in boolean > context, suggest '&&' instead [-Werror=int-in-bool-context] > > In this instance, I think using multiplication is more intuitive > than '&&', so I'm adding a comparison to zero instead to shut up > the warning. To further improve readability, I also make the > error case indented and leave the normal case as the final 'return' > statement. > I think to make perfectly clear why this check is done, we simply should precompute the denominator and do something like this: int denominator = M * P if (denominator == 0) return 0; return sclk * N / denominator; but with a better name for "denominator". I even imagine, that there are some macros in the kernel for dealing with something like this already, so that we could do instead: return AWESOME_DIV_MACRO(sclk * N, M * P) > Fixes: 7632b30e4b8b ("drm/nouveau/clk: namespace + nvidia gpu names (no > binary change)") > Signed-off-by: Arnd Bergmann > --- > Originally submitted on July 14, but no reply. This is the same > patch again. The warning is currently disabled in mainline, but > I think we can turn it back on in the future, and this change here > seems harmless. > --- > drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > index 96e0941c8edd..04b4f4ccf186 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/gt215.c > @@ -130,10 +130,10 @@ read_pll(struct gt215_clk *clk, int idx, u32 pll) > sclk = read_clk(clk, 0x10 + idx, false); > } > > - if (M * P) > - return sclk * N / (M * P); > + if (M * P == 0) > + return 0; > > - return 0; > + return sclk * N / (M * P); > } > > static int > -- > 2.9.0 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335
Yeah, we shouldn't let the machine die. Are there more WARN_ON_ONCE usage we could convert to WARN_ONCE? Reviewed-By: Karol Herbst <karolher...@gmail.com> On Fri, Jul 14, 2017 at 5:05 PM, Tobias Klausmann <tobias.johannes.klausm...@mni.thm.de> wrote: > On 7/14/17 3:41 PM, Mike Galbraith wrote: >> >> On Fri, 2017-07-14 at 15:36 +0200, Mike Galbraith wrote: >>> >>> All DRM did was to slip a >>> WARN_ON_ONCE() that nouveau triggers into a kernel module where such >>> things no longer warn, they blow the box out of the water. >> >> BTW, turn that irksome WARN_ON_ONCE() in drivers/gpu/drm/drm_vblank.c >> into a WARN_ONCE(), and all is peachy, you get the warning, box lives. >> >> --- >> drivers/gpu/drm/drm_vblank.c |3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> --- a/drivers/gpu/drm/drm_vblank.c >> +++ b/drivers/gpu/drm/drm_vblank.c >> @@ -605,7 +605,8 @@ bool drm_calc_vbltimestamp_from_scanoutp >> */ >> if (mode->crtc_clock == 0) { >> DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n", >> pipe); >> - WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev)); >> + WARN_ONCE(drm_drv_uses_atomic_modeset(dev), "%s: report >> me.\n", >> + dev->driver->name); >> return false; >> } > > > > Hey, > > confirmed this helps saving the box, but we still have to find the root > cause! Backtrace with the above fix applied (and the one which came in with > the latest drm-fixes merge)! > > > [1] https://hastebin.com/uyoqifijed.http > > Thanks, > > Tobias >Reviewed-By: Karol Herbst <karolher...@gmail.com> > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [regression drm/noveau] suspend to ram -> BOOM: exception RIP: drm_calc_vbltimestamp_from_scanoutpos+335
Yeah, we shouldn't let the machine die. Are there more WARN_ON_ONCE usage we could convert to WARN_ONCE? Reviewed-By: Karol Herbst On Fri, Jul 14, 2017 at 5:05 PM, Tobias Klausmann wrote: > On 7/14/17 3:41 PM, Mike Galbraith wrote: >> >> On Fri, 2017-07-14 at 15:36 +0200, Mike Galbraith wrote: >>> >>> All DRM did was to slip a >>> WARN_ON_ONCE() that nouveau triggers into a kernel module where such >>> things no longer warn, they blow the box out of the water. >> >> BTW, turn that irksome WARN_ON_ONCE() in drivers/gpu/drm/drm_vblank.c >> into a WARN_ONCE(), and all is peachy, you get the warning, box lives. >> >> --- >> drivers/gpu/drm/drm_vblank.c |3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> --- a/drivers/gpu/drm/drm_vblank.c >> +++ b/drivers/gpu/drm/drm_vblank.c >> @@ -605,7 +605,8 @@ bool drm_calc_vbltimestamp_from_scanoutp >> */ >> if (mode->crtc_clock == 0) { >> DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n", >> pipe); >> - WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev)); >> + WARN_ONCE(drm_drv_uses_atomic_modeset(dev), "%s: report >> me.\n", >> + dev->driver->name); >> return false; >> } > > > > Hey, > > confirmed this helps saving the box, but we still have to find the root > cause! Backtrace with the above fix applied (and the one which came in with > the latest drm-fixes merge)! > > > [1] https://hastebin.com/uyoqifijed.http > > Thanks, > > Tobias >Reviewed-By: Karol Herbst > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH] drm/nouveau: Add support for clockgating on Fermi+
Hi Lyude, thanks for the great work. Just a view comments inline. 2017-04-25 20:38 GMT+02:00 Lyude: > This adds support for enabling automatic clockgating on nvidia GPUs for > Fermi and later generations. This saves a little bit of power, bringing > my fermi GPU's power consumption from ~28.3W on idle to ~27W, and my > kepler's idle power consumption from ~23.6W to ~21.65W. > > Similar to how the nvidia driver seems to handle this, we enable > clockgating for each engine that supports it after it's initialization. > > Signed-off-by: Lyude > --- > .../gpu/drm/nouveau/include/nvkm/subdev/therm.h| 4 ++ > drivers/gpu/drm/nouveau/nvkm/core/engine.c | 20 +- > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 14 ++-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c | 2 + > .../gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c| 49 ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c | 77 > ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm107.c | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.c | 2 +- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 10 +++ > 11 files changed, 175 insertions(+), 9 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index b268b96..904aa56 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -84,6 +84,9 @@ struct nvkm_therm { > > int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type); > int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int); > + > + int (*clkgate_engine)(struct nvkm_therm *, enum nvkm_devidx); > + void (*clkgate_set)(struct nvkm_therm *, int gate_idx, bool enable); remove those and have a simple "nvkm_therm_clkgate_engine" function This way you know that every user calls this function and don't have to check for silly function pointers like you currently do in engine.c > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > @@ -94,6 +97,7 @@ int nv40_therm_new(struct nvkm_device *, int, struct > nvkm_therm **); > int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > +int gf100_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > #endif > diff --git a/drivers/gpu/drm/nouveau/nvkm/core/engine.c > b/drivers/gpu/drm/nouveau/nvkm/core/engine.c > index b6c9169..473ad3e 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/core/engine.c > +++ b/drivers/gpu/drm/nouveau/nvkm/core/engine.c > @@ -26,6 +26,7 @@ > #include > > #include > +#include > > bool > nvkm_engine_chsw_load(struct nvkm_engine *engine) > @@ -86,6 +87,13 @@ static int > nvkm_engine_fini(struct nvkm_subdev *subdev, bool suspend) > { > struct nvkm_engine *engine = nvkm_engine(subdev); > + struct nvkm_therm *therm = subdev->device->therm; > + int gate_idx; > + > + gate_idx = therm->clkgate_engine(therm, subdev->index); > + if (gate_idx != -1) > + therm->clkgate_set(therm, gate_idx, false); > + move this code inside "nvkm_therm_clkgate_engine". Nobody outside nvkm_therm should even care about the index. > if (engine->func->fini) > return engine->func->fini(engine, suspend); > return 0; > @@ -96,12 +104,13 @@ nvkm_engine_init(struct nvkm_subdev *subdev) > { > struct nvkm_engine *engine = nvkm_engine(subdev); > struct nvkm_fb *fb = subdev->device->fb; > + struct nvkm_therm *therm = subdev->device->therm; > int ret = 0, i; > s64 time; > > if (!engine->usecount) { > nvkm_trace(subdev, "init skipped, engine has no users\n"); > - return ret; > + goto finish; > } > > if (engine->func->oneinit && !engine->subdev.oneinit) { > @@ -123,6 +132,15 @@ nvkm_engine_init(struct nvkm_subdev *subdev) > > for (i = 0; fb && i < fb->tile.regions; i++) > nvkm_engine_tile(engine, i); > + > +finish: > + if (!ret) { > + int gate_idx = therm->clkgate_engine(therm, subdev->index); > + > + if (gate_idx != -1) > + therm->clkgate_set(therm, gate_idx, true); > + } > + same code as above. More code sharing! > return ret; > } > > diff --git
Re: [PATCH] drm/nouveau: Add support for clockgating on Fermi+
Hi Lyude, thanks for the great work. Just a view comments inline. 2017-04-25 20:38 GMT+02:00 Lyude : > This adds support for enabling automatic clockgating on nvidia GPUs for > Fermi and later generations. This saves a little bit of power, bringing > my fermi GPU's power consumption from ~28.3W on idle to ~27W, and my > kepler's idle power consumption from ~23.6W to ~21.65W. > > Similar to how the nvidia driver seems to handle this, we enable > clockgating for each engine that supports it after it's initialization. > > Signed-off-by: Lyude > --- > .../gpu/drm/nouveau/include/nvkm/subdev/therm.h| 4 ++ > drivers/gpu/drm/nouveau/nvkm/core/engine.c | 20 +- > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 14 ++-- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c | 2 + > .../gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c| 49 ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c | 77 > ++ > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm107.c | 2 + > drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.c | 2 +- > drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 10 +++ > 11 files changed, 175 insertions(+), 9 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/clkgate.c > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index b268b96..904aa56 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -84,6 +84,9 @@ struct nvkm_therm { > > int (*attr_get)(struct nvkm_therm *, enum nvkm_therm_attr_type); > int (*attr_set)(struct nvkm_therm *, enum nvkm_therm_attr_type, int); > + > + int (*clkgate_engine)(struct nvkm_therm *, enum nvkm_devidx); > + void (*clkgate_set)(struct nvkm_therm *, int gate_idx, bool enable); remove those and have a simple "nvkm_therm_clkgate_engine" function This way you know that every user calls this function and don't have to check for silly function pointers like you currently do in engine.c > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > @@ -94,6 +97,7 @@ int nv40_therm_new(struct nvkm_device *, int, struct > nvkm_therm **); > int nv50_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > +int gf100_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **); > #endif > diff --git a/drivers/gpu/drm/nouveau/nvkm/core/engine.c > b/drivers/gpu/drm/nouveau/nvkm/core/engine.c > index b6c9169..473ad3e 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/core/engine.c > +++ b/drivers/gpu/drm/nouveau/nvkm/core/engine.c > @@ -26,6 +26,7 @@ > #include > > #include > +#include > > bool > nvkm_engine_chsw_load(struct nvkm_engine *engine) > @@ -86,6 +87,13 @@ static int > nvkm_engine_fini(struct nvkm_subdev *subdev, bool suspend) > { > struct nvkm_engine *engine = nvkm_engine(subdev); > + struct nvkm_therm *therm = subdev->device->therm; > + int gate_idx; > + > + gate_idx = therm->clkgate_engine(therm, subdev->index); > + if (gate_idx != -1) > + therm->clkgate_set(therm, gate_idx, false); > + move this code inside "nvkm_therm_clkgate_engine". Nobody outside nvkm_therm should even care about the index. > if (engine->func->fini) > return engine->func->fini(engine, suspend); > return 0; > @@ -96,12 +104,13 @@ nvkm_engine_init(struct nvkm_subdev *subdev) > { > struct nvkm_engine *engine = nvkm_engine(subdev); > struct nvkm_fb *fb = subdev->device->fb; > + struct nvkm_therm *therm = subdev->device->therm; > int ret = 0, i; > s64 time; > > if (!engine->usecount) { > nvkm_trace(subdev, "init skipped, engine has no users\n"); > - return ret; > + goto finish; > } > > if (engine->func->oneinit && !engine->subdev.oneinit) { > @@ -123,6 +132,15 @@ nvkm_engine_init(struct nvkm_subdev *subdev) > > for (i = 0; fb && i < fb->tile.regions; i++) > nvkm_engine_tile(engine, i); > + > +finish: > + if (!ret) { > + int gate_idx = therm->clkgate_engine(therm, subdev->index); > + > + if (gate_idx != -1) > + therm->clkgate_set(therm, gate_idx, true); > + } > + same code as above. More code sharing! > return ret; > } > > diff --git
Re: [Nouveau] [PATCH] drm/nouveau: fix unknown chipset for GTX 1060
we already have that included in 4.10 2016-12-12 12:26 GMT+01:00 Chris Chiu: > Nouveau driver shows unknown chipset (136000a1) for GTX 1060, so it > only gives VGA resolution on screen. Use the same chipset as nv134 > then it shows FullHD. This commit copies fields from nv134_chipset > to nv136_chipset for GTX 1060. > > Signed-off-by: Chris Chiu > --- > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 29 > +++ > 1 file changed, 29 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > index 7218a06..7c6eece 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > @@ -2209,6 +2209,34 @@ nv134_chipset = { > .fifo = gp100_fifo_new, > }; > > +static const struct nvkm_device_chip > +nv136_chipset = { > + .name = "GP104", > + .bar = gf100_bar_new, > + .bios = nvkm_bios_new, > + .bus = gf100_bus_new, > + .devinit = gm200_devinit_new, > + .fb = gp104_fb_new, > + .fuse = gm107_fuse_new, > + .gpio = gk104_gpio_new, > + .i2c = gm200_i2c_new, > + .ibus = gm200_ibus_new, > + .imem = nv50_instmem_new, > + .ltc = gp100_ltc_new, > + .mc = gp100_mc_new, > + .mmu = gf100_mmu_new, > + .pci = gp100_pci_new, > + .timer = gk20a_timer_new, > + .top = gk104_top_new, > + .ce[0] = gp104_ce_new, > + .ce[1] = gp104_ce_new, > + .ce[2] = gp104_ce_new, > + .ce[3] = gp104_ce_new, > + .disp = gp104_disp_new, > + .dma = gf119_dma_new, > + .fifo = gp100_fifo_new, > +}; > + > static int > nvkm_device_event_ctor(struct nvkm_object *object, void *data, u32 size, >struct nvkm_notify *notify) > @@ -2644,6 +2672,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func, > case 0x12b: device->chip = _chipset; break; > case 0x130: device->chip = _chipset; break; > case 0x134: device->chip = _chipset; break; > + case 0x136: device->chip = _chipset; break; > default: > nvdev_error(device, "unknown chipset (%08x)\n", > boot0); > goto done; > -- > 2.1.4 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau: fix unknown chipset for GTX 1060
we already have that included in 4.10 2016-12-12 12:26 GMT+01:00 Chris Chiu : > Nouveau driver shows unknown chipset (136000a1) for GTX 1060, so it > only gives VGA resolution on screen. Use the same chipset as nv134 > then it shows FullHD. This commit copies fields from nv134_chipset > to nv136_chipset for GTX 1060. > > Signed-off-by: Chris Chiu > --- > drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 29 > +++ > 1 file changed, 29 insertions(+) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > index 7218a06..7c6eece 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > @@ -2209,6 +2209,34 @@ nv134_chipset = { > .fifo = gp100_fifo_new, > }; > > +static const struct nvkm_device_chip > +nv136_chipset = { > + .name = "GP104", > + .bar = gf100_bar_new, > + .bios = nvkm_bios_new, > + .bus = gf100_bus_new, > + .devinit = gm200_devinit_new, > + .fb = gp104_fb_new, > + .fuse = gm107_fuse_new, > + .gpio = gk104_gpio_new, > + .i2c = gm200_i2c_new, > + .ibus = gm200_ibus_new, > + .imem = nv50_instmem_new, > + .ltc = gp100_ltc_new, > + .mc = gp100_mc_new, > + .mmu = gf100_mmu_new, > + .pci = gp100_pci_new, > + .timer = gk20a_timer_new, > + .top = gk104_top_new, > + .ce[0] = gp104_ce_new, > + .ce[1] = gp104_ce_new, > + .ce[2] = gp104_ce_new, > + .ce[3] = gp104_ce_new, > + .disp = gp104_disp_new, > + .dma = gf119_dma_new, > + .fifo = gp100_fifo_new, > +}; > + > static int > nvkm_device_event_ctor(struct nvkm_object *object, void *data, u32 size, >struct nvkm_notify *notify) > @@ -2644,6 +2672,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func, > case 0x12b: device->chip = _chipset; break; > case 0x130: device->chip = _chipset; break; > case 0x134: device->chip = _chipset; break; > + case 0x136: device->chip = _chipset; break; > default: > nvdev_error(device, "unknown chipset (%08x)\n", > boot0); > goto done; > -- > 2.1.4 > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: mmiotracer hangs the system
sorry for that, but I forgot the patch 2016-11-19 11:56 GMT+01:00 Karol Herbst <karolher...@gmail.com>: > this is odd, I found a bug related to nouveau (modprobe/bind doesn't > return), but that isn't related to your issue at all or maybe it is > exactly this, cause the binding of the device doesn't return and > depending on the kind of driver, it would hang the system... yeah, > maybe it is the same issue. > > anyway, could you try to trace with the attached patch? Maybe the > additional output would help me to verify it. Currently I am working > on the bugfix I mentioned above and this may also fix your issue. I > was still able to get a working mmiotrace file, even if the dvice > binding didn't finish. Is this the same for you? (try cat > "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this > contains anything usefull). > > This really looks like an odd issue, because the mmiotracer still > behaves as expected. > > 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: >> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolher...@gmail.com> wrote: >>> sorry for the delay fixing that bug. I got occupied with other things >>> and didn't really got to the issue again, it is on my todo list as the >>> next item though and I hope I will be able to get a fix ready this >>> weekend. I think I might know where the issue is, but didn't confirm >>> it yet. >> >> Thanks.I'm still using revert. Feel free to Cc me when you will have >> some material to test. >> >>> >>> Again, sorry for the delay. >>> >>> Karol >>> >>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolher...@gmail.com>: >>>> Hi again, >>>> >>>> I was able to get a crash/freeze/something while unbinding/binding my >>>> nvidia gpu from nouveau. >>>> >>>> Guess that means something is odd. I will investigate this more over >>>> the weekend. >>>> >>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: >>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolher...@gmail.com> >>>>> wrote: >>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: >>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolher...@gmail.com> >>>>>>> wrote: >>>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>>> while fixing other issues. >>>>>>> >>>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>>> works for me. Issue is reproducible in my case 100%. >>>>>>> >>>>>> >>>>>> Is there something I could do with a "normal" haswell desktop system >>>>>> to reproduce this issue? >>>>> >>>>> Try LPSS UART device(s) >>>>> >>>>>> >>>>>> I'll try to play around the next days a bit and maybe I find something >>>>>> that works out here as well. It seems to be related to >>>>>> unmapping-mapping cycles. >>>>> >>>>> That is the only thing I would think of. >>>>> >>>>>> >>>>>> Because if this only happens with the pwm-lpss driver, >>>>> >>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>>> intel-lpss driver. >>>>> >>>>>> it may be >>>>>> really troublesome to debug, because I don't really know the code that >>>>>> well to be sure where the issue might be. >>>>>> >>>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>>> Otherwise tell me exact instructions how to debug the issue. >>>>>>> >>>>>>> Here you are: >>>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>>> >>>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>>> really do much, because the code in the current state looks sane to >>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>>> arms new region for mmiotracing and that's
Re: mmiotracer hangs the system
sorry for that, but I forgot the patch 2016-11-19 11:56 GMT+01:00 Karol Herbst : > this is odd, I found a bug related to nouveau (modprobe/bind doesn't > return), but that isn't related to your issue at all or maybe it is > exactly this, cause the binding of the device doesn't return and > depending on the kind of driver, it would hang the system... yeah, > maybe it is the same issue. > > anyway, could you try to trace with the attached patch? Maybe the > additional output would help me to verify it. Currently I am working > on the bugfix I mentioned above and this may also fix your issue. I > was still able to get a working mmiotrace file, even if the dvice > binding didn't finish. Is this the same for you? (try cat > "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this > contains anything usefull). > > This really looks like an odd issue, because the mmiotracer still > behaves as expected. > > 2016-10-22 18:02 GMT+02:00 Andy Shevchenko : >> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst wrote: >>> sorry for the delay fixing that bug. I got occupied with other things >>> and didn't really got to the issue again, it is on my todo list as the >>> next item though and I hope I will be able to get a fix ready this >>> weekend. I think I might know where the issue is, but didn't confirm >>> it yet. >> >> Thanks.I'm still using revert. Feel free to Cc me when you will have >> some material to test. >> >>> >>> Again, sorry for the delay. >>> >>> Karol >>> >>> 2016-08-19 22:46 GMT+02:00 Karol Herbst : >>>> Hi again, >>>> >>>> I was able to get a crash/freeze/something while unbinding/binding my >>>> nvidia gpu from nouveau. >>>> >>>> Guess that means something is odd. I will investigate this more over >>>> the weekend. >>>> >>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko : >>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst >>>>> wrote: >>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko : >>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst >>>>>>> wrote: >>>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>>> while fixing other issues. >>>>>>> >>>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>>> works for me. Issue is reproducible in my case 100%. >>>>>>> >>>>>> >>>>>> Is there something I could do with a "normal" haswell desktop system >>>>>> to reproduce this issue? >>>>> >>>>> Try LPSS UART device(s) >>>>> >>>>>> >>>>>> I'll try to play around the next days a bit and maybe I find something >>>>>> that works out here as well. It seems to be related to >>>>>> unmapping-mapping cycles. >>>>> >>>>> That is the only thing I would think of. >>>>> >>>>>> >>>>>> Because if this only happens with the pwm-lpss driver, >>>>> >>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>>> intel-lpss driver. >>>>> >>>>>> it may be >>>>>> really troublesome to debug, because I don't really know the code that >>>>>> well to be sure where the issue might be. >>>>>> >>>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>>> Otherwise tell me exact instructions how to debug the issue. >>>>>>> >>>>>>> Here you are: >>>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>>> >>>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>>> really do much, because the code in the current state looks sane to >>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>>>> >>>>>>> Maybe. First thing happened is iounmap(). >>>&
Re: mmiotracer hangs the system
this is odd, I found a bug related to nouveau (modprobe/bind doesn't return), but that isn't related to your issue at all or maybe it is exactly this, cause the binding of the device doesn't return and depending on the kind of driver, it would hang the system... yeah, maybe it is the same issue. anyway, could you try to trace with the attached patch? Maybe the additional output would help me to verify it. Currently I am working on the bugfix I mentioned above and this may also fix your issue. I was still able to get a working mmiotrace file, even if the dvice binding didn't finish. Is this the same for you? (try cat "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this contains anything usefull). This really looks like an odd issue, because the mmiotracer still behaves as expected. 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: > On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolher...@gmail.com> wrote: >> sorry for the delay fixing that bug. I got occupied with other things >> and didn't really got to the issue again, it is on my todo list as the >> next item though and I hope I will be able to get a fix ready this >> weekend. I think I might know where the issue is, but didn't confirm >> it yet. > > Thanks.I'm still using revert. Feel free to Cc me when you will have > some material to test. > >> >> Again, sorry for the delay. >> >> Karol >> >> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolher...@gmail.com>: >>> Hi again, >>> >>> I was able to get a crash/freeze/something while unbinding/binding my >>> nvidia gpu from nouveau. >>> >>> Guess that means something is odd. I will investigate this more over >>> the weekend. >>> >>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: >>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolher...@gmail.com> >>>> wrote: >>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevche...@gmail.com>: >>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolher...@gmail.com> >>>>>> wrote: >>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>> while fixing other issues. >>>>>> >>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>> works for me. Issue is reproducible in my case 100%. >>>>>> >>>>> >>>>> Is there something I could do with a "normal" haswell desktop system >>>>> to reproduce this issue? >>>> >>>> Try LPSS UART device(s) >>>> >>>>> >>>>> I'll try to play around the next days a bit and maybe I find something >>>>> that works out here as well. It seems to be related to >>>>> unmapping-mapping cycles. >>>> >>>> That is the only thing I would think of. >>>> >>>>> >>>>> Because if this only happens with the pwm-lpss driver, >>>> >>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>> intel-lpss driver. >>>> >>>>> it may be >>>>> really troublesome to debug, because I don't really know the code that >>>>> well to be sure where the issue might be. >>>>> >>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>> Otherwise tell me exact instructions how to debug the issue. >>>>>> >>>>>> Here you are: >>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>> >>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>> really do much, because the code in the current state looks sane to >>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>>> >>>>>> Maybe. First thing happened is iounmap(). >>>> >>>> >>>> -- >>>> With Best Regards, >>>> Andy Shevchenko > > > > -- > With Best Regards, > Andy Shevchenko
Re: mmiotracer hangs the system
this is odd, I found a bug related to nouveau (modprobe/bind doesn't return), but that isn't related to your issue at all or maybe it is exactly this, cause the binding of the device doesn't return and depending on the kind of driver, it would hang the system... yeah, maybe it is the same issue. anyway, could you try to trace with the attached patch? Maybe the additional output would help me to verify it. Currently I am working on the bugfix I mentioned above and this may also fix your issue. I was still able to get a working mmiotrace file, even if the dvice binding didn't finish. Is this the same for you? (try cat "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this contains anything usefull). This really looks like an odd issue, because the mmiotracer still behaves as expected. 2016-10-22 18:02 GMT+02:00 Andy Shevchenko : > On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst wrote: >> sorry for the delay fixing that bug. I got occupied with other things >> and didn't really got to the issue again, it is on my todo list as the >> next item though and I hope I will be able to get a fix ready this >> weekend. I think I might know where the issue is, but didn't confirm >> it yet. > > Thanks.I'm still using revert. Feel free to Cc me when you will have > some material to test. > >> >> Again, sorry for the delay. >> >> Karol >> >> 2016-08-19 22:46 GMT+02:00 Karol Herbst : >>> Hi again, >>> >>> I was able to get a crash/freeze/something while unbinding/binding my >>> nvidia gpu from nouveau. >>> >>> Guess that means something is odd. I will investigate this more over >>> the weekend. >>> >>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko : >>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst >>>> wrote: >>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko : >>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst >>>>>> wrote: >>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>> while fixing other issues. >>>>>> >>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>> works for me. Issue is reproducible in my case 100%. >>>>>> >>>>> >>>>> Is there something I could do with a "normal" haswell desktop system >>>>> to reproduce this issue? >>>> >>>> Try LPSS UART device(s) >>>> >>>>> >>>>> I'll try to play around the next days a bit and maybe I find something >>>>> that works out here as well. It seems to be related to >>>>> unmapping-mapping cycles. >>>> >>>> That is the only thing I would think of. >>>> >>>>> >>>>> Because if this only happens with the pwm-lpss driver, >>>> >>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>> intel-lpss driver. >>>> >>>>> it may be >>>>> really troublesome to debug, because I don't really know the code that >>>>> well to be sure where the issue might be. >>>>> >>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>> Otherwise tell me exact instructions how to debug the issue. >>>>>> >>>>>> Here you are: >>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>> >>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>> really do much, because the code in the current state looks sane to >>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>>> >>>>>> Maybe. First thing happened is iounmap(). >>>> >>>> >>>> -- >>>> With Best Regards, >>>> Andy Shevchenko > > > > -- > With Best Regards, > Andy Shevchenko
Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration
2016-11-08 17:12 GMT+01:00 Arnd Bergmann: > On Tuesday, November 8, 2016 5:07:01 PM CET Lukas Wunner wrote: >> On Tue, Nov 08, 2016 at 04:52:49PM +0100, Arnd Bergmann wrote: >> > The underlying problem is that we already have a number of other >> > symbols that either have "depends on LEDS_CLASS" or >> > "select LEDS_CLASS". To clean that up properly, we should either >> > make the symbol itself hidden and only select it from other drivers, >> > or use "depends on LEDS_CLASS" everywhere. >> > >> > Another option is to use the IS_REACHABLE() macro instead of IS_ENABLED() >> > in the header file, to stub out the calls into the new file, but >> > that can be a bit confusing. >> >> Why don't you just add empty inline stubs for nouveau_led_init / _fini / >> _suspend / _resume? >> > > That's what I was suggesting: > > diff --git a/drivers/gpu/drm/nouveau/nouveau_led.h > b/drivers/gpu/drm/nouveau/nouveau_led.h > index 9c9bb6ac938e..bc5f47cb516b 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_led.h > +++ b/drivers/gpu/drm/nouveau/nouveau_led.h > @@ -35,21 +35,21 @@ struct nouveau_led { > struct led_classdev led; > }; > > static inline struct nouveau_led * > nouveau_led(struct drm_device *dev) > { > return nouveau_drm(dev)->led; > } > > /* nouveau_led.c */ > -#if IS_ENABLED(CONFIG_LEDS_CLASS) > +#if IS_REACHABLE(CONFIG_LEDS_CLASS) > int nouveau_led_init(struct drm_device *dev); > void nouveau_led_suspend(struct drm_device *dev); > void nouveau_led_resume(struct drm_device *dev); > void nouveau_led_fini(struct drm_device *dev); > #else > static inline int nouveau_led_init(struct drm_device *dev) { return 0; }; > static inline void nouveau_led_suspend(struct drm_device *dev) { }; > static inline void nouveau_led_resume(struct drm_device *dev) { }; > static inline void nouveau_led_fini(struct drm_device *dev) { }; > #endif > > The downside is that now the nouveau_led_init() just won't be called > if CONFIG_LEDS_CLASS=m and CONFIG_DRM_NOUVEAU=y, which can be > surprising to users. yeah, it will. I guess it is fine to force LEDS to y if nouveau is set to y. The thinks I absolutely dislike is: 1. auto hiding of features I _want_ to have, because I would have to enable the dependencies first, which is like super annoying if there are somewhere else 2. preventing me from enabling something, cause the dependency is missing. We should clarify first if we actually want to enable those features optionally, because there isn't much of a reason not to enable the dependencies, except embedded systems. We have a lot more stuff where we could add things like that: hwmon, debugfs, acpi, compat and maybe there are even more things > > Arnd > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration
2016-11-08 17:12 GMT+01:00 Arnd Bergmann : > On Tuesday, November 8, 2016 5:07:01 PM CET Lukas Wunner wrote: >> On Tue, Nov 08, 2016 at 04:52:49PM +0100, Arnd Bergmann wrote: >> > The underlying problem is that we already have a number of other >> > symbols that either have "depends on LEDS_CLASS" or >> > "select LEDS_CLASS". To clean that up properly, we should either >> > make the symbol itself hidden and only select it from other drivers, >> > or use "depends on LEDS_CLASS" everywhere. >> > >> > Another option is to use the IS_REACHABLE() macro instead of IS_ENABLED() >> > in the header file, to stub out the calls into the new file, but >> > that can be a bit confusing. >> >> Why don't you just add empty inline stubs for nouveau_led_init / _fini / >> _suspend / _resume? >> > > That's what I was suggesting: > > diff --git a/drivers/gpu/drm/nouveau/nouveau_led.h > b/drivers/gpu/drm/nouveau/nouveau_led.h > index 9c9bb6ac938e..bc5f47cb516b 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_led.h > +++ b/drivers/gpu/drm/nouveau/nouveau_led.h > @@ -35,21 +35,21 @@ struct nouveau_led { > struct led_classdev led; > }; > > static inline struct nouveau_led * > nouveau_led(struct drm_device *dev) > { > return nouveau_drm(dev)->led; > } > > /* nouveau_led.c */ > -#if IS_ENABLED(CONFIG_LEDS_CLASS) > +#if IS_REACHABLE(CONFIG_LEDS_CLASS) > int nouveau_led_init(struct drm_device *dev); > void nouveau_led_suspend(struct drm_device *dev); > void nouveau_led_resume(struct drm_device *dev); > void nouveau_led_fini(struct drm_device *dev); > #else > static inline int nouveau_led_init(struct drm_device *dev) { return 0; }; > static inline void nouveau_led_suspend(struct drm_device *dev) { }; > static inline void nouveau_led_resume(struct drm_device *dev) { }; > static inline void nouveau_led_fini(struct drm_device *dev) { }; > #endif > > The downside is that now the nouveau_led_init() just won't be called > if CONFIG_LEDS_CLASS=m and CONFIG_DRM_NOUVEAU=y, which can be > surprising to users. yeah, it will. I guess it is fine to force LEDS to y if nouveau is set to y. The thinks I absolutely dislike is: 1. auto hiding of features I _want_ to have, because I would have to enable the dependencies first, which is like super annoying if there are somewhere else 2. preventing me from enabling something, cause the dependency is missing. We should clarify first if we actually want to enable those features optionally, because there isn't much of a reason not to enable the dependencies, except embedded systems. We have a lot more stuff where we could add things like that: hwmon, debugfs, acpi, compat and maybe there are even more things > > Arnd > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration
2016-11-08 16:46 GMT+01:00 Ilia Mirkin: > On Tue, Nov 8, 2016 at 8:56 AM, Arnd Bergmann wrote: >> The newly introduced LED handling for nouveau fails to link when the >> driver is built-in but the LED subsystem is a loadable module: >> >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_suspend': >> tvnv17.c:(.text.nouveau_do_suspend+0x10): undefined reference to >> `nouveau_led_suspend' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_resume': >> tvnv17.c:(.text.nouveau_do_resume+0xf0): undefined reference to >> `nouveau_led_resume' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_unload': >> tvnv17.c:(.text.nouveau_drm_unload+0x34): undefined reference to >> `nouveau_led_fini' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_load': >> tvnv17.c:(.text.nouveau_drm_load+0x7d0): undefined reference to >> `nouveau_led_init' >> >> This adds a separate Kconfig symbol for the LED support that >> correctly tracks the dependencies. >> >> Fixes: 8d021d71b324 ("drm/nouveau/drm/nouveau: add a LED driver for the >> NVIDIA logo") >> Signed-off-by: Arnd Bergmann >> --- >> drivers/gpu/drm/nouveau/Kbuild| 2 +- >> drivers/gpu/drm/nouveau/Kconfig | 8 >> drivers/gpu/drm/nouveau/nouveau_led.h | 2 +- >> 3 files changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild >> index fde6e3656636..5e00e911daa6 100644 >> --- a/drivers/gpu/drm/nouveau/Kbuild >> +++ b/drivers/gpu/drm/nouveau/Kbuild >> @@ -22,7 +22,7 @@ nouveau-$(CONFIG_DEBUG_FS) += nouveau_debugfs.o >> nouveau-y += nouveau_drm.o >> nouveau-y += nouveau_hwmon.o >> nouveau-$(CONFIG_COMPAT) += nouveau_ioc32.o >> -nouveau-$(CONFIG_LEDS_CLASS) += nouveau_led.o >> +nouveau-$(CONFIG_DRM_NOUVEAU_LED) += nouveau_led.o >> nouveau-y += nouveau_nvif.o >> nouveau-$(CONFIG_NOUVEAU_PLATFORM_DRIVER) += nouveau_platform.o >> nouveau-y += nouveau_usif.o # userspace <-> nvif >> diff --git a/drivers/gpu/drm/nouveau/Kconfig >> b/drivers/gpu/drm/nouveau/Kconfig >> index 78631fb61adf..715cd6f4dc31 100644 >> --- a/drivers/gpu/drm/nouveau/Kconfig >> +++ b/drivers/gpu/drm/nouveau/Kconfig >> @@ -46,6 +46,14 @@ config NOUVEAU_DEBUG >> The paranoia and spam levels will add a lot of extra checks which >> may potentially slow down driver operation. >> >> +config DRM_NOUVEAU_LED >> + bool "Support for logo LED" >> + depends on DRM_NOUVEAU && LEDS_CLASS >> + depends on !(DRM_NOUVEAU=y && LEDS_CLASS=m) >> + help >> + Say Y here to enabling controlling the brightness of the >> + LED behind NVIDIA logo on certain Titan cards. > > This is a very odd restriction... could this be written as > > depends on DRM_NOUVEAU > select LEDS_CLASS > > Or will that not flip the LEDS_CLASS from m to y in case > DRM_NOUVEAU=y? If not, is there a way to cause that to happen? > > Separately, perhaps we should just drop this LEDS_CLASS select into > DRM_NOUVEAU? We've tended to avoid adding tons of options. > well, that would mean that you always need the LEDS_CLASS and maybe on a tegra system you don't want to, so the led stuff should stay completely optional. Don't know though. Alex, maybe you want to clarify which dependencies should stay optional? If nobody on your side care, then we won't care as well and only add switches if users actually request it. > Cheers, > > -ilia > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH] drm/nouveau: fix LEDS_CLASS=m configuration
2016-11-08 16:46 GMT+01:00 Ilia Mirkin : > On Tue, Nov 8, 2016 at 8:56 AM, Arnd Bergmann wrote: >> The newly introduced LED handling for nouveau fails to link when the >> driver is built-in but the LED subsystem is a loadable module: >> >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_suspend': >> tvnv17.c:(.text.nouveau_do_suspend+0x10): undefined reference to >> `nouveau_led_suspend' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_do_resume': >> tvnv17.c:(.text.nouveau_do_resume+0xf0): undefined reference to >> `nouveau_led_resume' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_unload': >> tvnv17.c:(.text.nouveau_drm_unload+0x34): undefined reference to >> `nouveau_led_fini' >> drivers/gpu/drm/nouveau/nouveau.o: In function `nouveau_drm_load': >> tvnv17.c:(.text.nouveau_drm_load+0x7d0): undefined reference to >> `nouveau_led_init' >> >> This adds a separate Kconfig symbol for the LED support that >> correctly tracks the dependencies. >> >> Fixes: 8d021d71b324 ("drm/nouveau/drm/nouveau: add a LED driver for the >> NVIDIA logo") >> Signed-off-by: Arnd Bergmann >> --- >> drivers/gpu/drm/nouveau/Kbuild| 2 +- >> drivers/gpu/drm/nouveau/Kconfig | 8 >> drivers/gpu/drm/nouveau/nouveau_led.h | 2 +- >> 3 files changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild >> index fde6e3656636..5e00e911daa6 100644 >> --- a/drivers/gpu/drm/nouveau/Kbuild >> +++ b/drivers/gpu/drm/nouveau/Kbuild >> @@ -22,7 +22,7 @@ nouveau-$(CONFIG_DEBUG_FS) += nouveau_debugfs.o >> nouveau-y += nouveau_drm.o >> nouveau-y += nouveau_hwmon.o >> nouveau-$(CONFIG_COMPAT) += nouveau_ioc32.o >> -nouveau-$(CONFIG_LEDS_CLASS) += nouveau_led.o >> +nouveau-$(CONFIG_DRM_NOUVEAU_LED) += nouveau_led.o >> nouveau-y += nouveau_nvif.o >> nouveau-$(CONFIG_NOUVEAU_PLATFORM_DRIVER) += nouveau_platform.o >> nouveau-y += nouveau_usif.o # userspace <-> nvif >> diff --git a/drivers/gpu/drm/nouveau/Kconfig >> b/drivers/gpu/drm/nouveau/Kconfig >> index 78631fb61adf..715cd6f4dc31 100644 >> --- a/drivers/gpu/drm/nouveau/Kconfig >> +++ b/drivers/gpu/drm/nouveau/Kconfig >> @@ -46,6 +46,14 @@ config NOUVEAU_DEBUG >> The paranoia and spam levels will add a lot of extra checks which >> may potentially slow down driver operation. >> >> +config DRM_NOUVEAU_LED >> + bool "Support for logo LED" >> + depends on DRM_NOUVEAU && LEDS_CLASS >> + depends on !(DRM_NOUVEAU=y && LEDS_CLASS=m) >> + help >> + Say Y here to enabling controlling the brightness of the >> + LED behind NVIDIA logo on certain Titan cards. > > This is a very odd restriction... could this be written as > > depends on DRM_NOUVEAU > select LEDS_CLASS > > Or will that not flip the LEDS_CLASS from m to y in case > DRM_NOUVEAU=y? If not, is there a way to cause that to happen? > > Separately, perhaps we should just drop this LEDS_CLASS select into > DRM_NOUVEAU? We've tended to avoid adding tons of options. > well, that would mean that you always need the LEDS_CLASS and maybe on a tegra system you don't want to, so the led stuff should stay completely optional. Don't know though. Alex, maybe you want to clarify which dependencies should stay optional? If nobody on your side care, then we won't care as well and only add switches if users actually request it. > Cheers, > > -ilia > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] noveau: emergency shutdown handling is overcomplex and broken
Thanks for the pointer. But I don't like this patch. If you find a bug, make a bug report or just fix it if you know the fix already. Or write something in IRC. Or write on the Mailing list as a general question or something else But I really don't agree on doing it this way. You would have needed like the same amount of time to actual fix the problem. Anyway, for adding a printk: struct nvkm_subdev *subdev = >subdev; nvkm_error(subdev, "message"); 2016-10-25 12:50 GMT+02:00 Pavel Machek: > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > index b9703c0..adb1deb 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > @@ -120,6 +120,11 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum > nvkm_therm_thrs thrs, > struct work_struct *work; > > work = kmalloc(sizeof(*work), GFP_ATOMIC); > + /* FIXME: > + 1) this is total overkill, orderly_poweroff() > already > + uses schedule_work internally > + 2) it would be good to at least printk what is > going on > + */ > if (work) { > INIT_WORK(work, nv_poweroff_work); > schedule_work(work); > > GFP_ATOMIC is not reliable. Plus, see the fixme. > > Best regards, > Pavel > > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) > http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >
Re: [Nouveau] noveau: emergency shutdown handling is overcomplex and broken
Thanks for the pointer. But I don't like this patch. If you find a bug, make a bug report or just fix it if you know the fix already. Or write something in IRC. Or write on the Mailing list as a general question or something else But I really don't agree on doing it this way. You would have needed like the same amount of time to actual fix the problem. Anyway, for adding a printk: struct nvkm_subdev *subdev = >subdev; nvkm_error(subdev, "message"); 2016-10-25 12:50 GMT+02:00 Pavel Machek : > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > index b9703c0..adb1deb 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c > @@ -120,6 +120,11 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum > nvkm_therm_thrs thrs, > struct work_struct *work; > > work = kmalloc(sizeof(*work), GFP_ATOMIC); > + /* FIXME: > + 1) this is total overkill, orderly_poweroff() > already > + uses schedule_work internally > + 2) it would be good to at least printk what is > going on > + */ > if (work) { > INIT_WORK(work, nv_poweroff_work); > schedule_work(work); > > GFP_ATOMIC is not reliable. Plus, see the fixme. > > Best regards, > Pavel > > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) > http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html > > ___ > Nouveau mailing list > nouv...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau >