[PATCH] drm/nouveau: Fixup gk20a instobj hierarchy
From: Thierry Reding

Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") uses container_of() to cast from struct nvkm_memory to struct nvkm_instobj, assuming that all instance objects are derived from struct nvkm_instobj. For the gk20a family that's not the case and they are derived from struct nvkm_memory instead. This causes some subtle data corruption (nvkm_instobj.preserve ends up mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents suspend/resume from working.

Fix this by making struct gk20a_instobj derive from struct nvkm_instobj instead.

Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend")
Reported-by: Jonathan Hunter
Signed-off-by: Thierry Reding
---
Note that this was probably subtly wrong before the above-mentioned commit already, but I don't think we've seen any reports that would indicate any actual failures related to this before. So I think it's good enough to apply this fix for v6.7. The next closest thing would be commit d8e83994aaf6 ("drm/nouveau/imem: improve management of instance memory"), but that's 8 years old (Linux v4.3)...
---
 .../drm/nouveau/nvkm/subdev/instmem/gk20a.c   | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
index 1b811d6972a1..201022ae9214 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c
@@ -49,14 +49,14 @@
 #include
 
 struct gk20a_instobj {
-	struct nvkm_memory memory;
+	struct nvkm_instobj base;
 	struct nvkm_mm_node *mn;
 	struct gk20a_instmem *imem;
 
 	/* CPU mapping */
 	u32 *vaddr;
 };
-#define gk20a_instobj(p) container_of((p), struct gk20a_instobj, memory)
+#define gk20a_instobj(p) container_of((p), struct gk20a_instobj, base.memory)
 
 /*
  * Used for objects allocated using the DMA API
@@ -148,7 +148,7 @@ gk20a_instobj_iommu_recycle_vaddr(struct gk20a_instobj_iommu *obj)
 	list_del(&obj->vaddr_node);
 	vunmap(obj->base.vaddr);
 	obj->base.vaddr = NULL;
-	imem->vaddr_use -= nvkm_memory_size(&obj->base.memory);
+	imem->vaddr_use -= nvkm_memory_size(&obj->base.base.memory);
 	nvkm_debug(&imem->base.subdev, "vaddr used: %x/%x\n", imem->vaddr_use,
 		   imem->vaddr_max);
 }
@@ -283,7 +283,7 @@ gk20a_instobj_map(struct nvkm_memory *memory, u64 offset, struct nvkm_vmm *vmm,
 {
 	struct gk20a_instobj *node = gk20a_instobj(memory);
 	struct nvkm_vmm_map map = {
-		.memory = &node->memory,
+		.memory = &node->base.memory,
 		.offset = offset,
 		.mem = node->mn,
 	};
@@ -391,8 +391,8 @@ gk20a_instobj_ctor_dma(struct gk20a_instmem *imem, u32 npages, u32 align,
 		return -ENOMEM;
 	*_node = &node->base;
 
-	nvkm_memory_ctor(&gk20a_instobj_func_dma, &node->base.memory);
-	node->base.memory.ptrs = &gk20a_instobj_ptrs;
+	nvkm_memory_ctor(&gk20a_instobj_func_dma, &node->base.base.memory);
+	node->base.base.memory.ptrs = &gk20a_instobj_ptrs;
 
 	node->base.vaddr = dma_alloc_attrs(dev, npages << PAGE_SHIFT,
 					   &node->handle, GFP_KERNEL,
@@ -438,8 +438,8 @@ gk20a_instobj_ctor_iommu(struct gk20a_instmem *imem, u32 npages, u32 align,
 	*_node = &node->base;
 	node->dma_addrs = (void *)(node->pages + npages);
 
-	nvkm_memory_ctor(&gk20a_instobj_func_iommu, &node->base.memory);
-	node->base.memory.ptrs = &gk20a_instobj_ptrs;
+	nvkm_memory_ctor(&gk20a_instobj_func_iommu, &node->base.base.memory);
+	node->base.base.memory.ptrs = &gk20a_instobj_ptrs;
 
 	/* Allocate backing memory */
 	for (i = 0; i < npages; i++) {
@@ -533,7 +533,7 @@ gk20a_instobj_new(struct nvkm_instmem *base, u32 size, u32 align, bool zero,
 	else
 		ret = gk20a_instobj_ctor_dma(imem, size >> PAGE_SHIFT,
 					     align, &node);
-	*pmemory = node ? &node->base.memory : NULL;
+	*pmemory = node ? &node->base.memory : NULL;
 	if (ret)
 		return ret;
 
-- 
2.43.0
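To illustrate why the derivation matters for the patch above, here is a minimal user-space sketch. The structures are hypothetical, trimmed-down stand-ins (not the real nvkm definitions), and container_of() is a simplified re-implementation. It shows that with the old layout a generic "preserve" field would alias the driver's vaddr pointer, while with the nested base both casts stay consistent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down versions of the structures involved. */
struct memory {
	void *ptrs;
};

struct instobj {
	struct memory memory;
	int preserve;		/* field the generic code writes to */
};

/* Old layout: derives from struct memory directly. */
struct old_gk20a_obj {
	struct memory memory;
	unsigned int *vaddr;
};

/* New layout: derives from struct instobj. */
struct new_gk20a_obj {
	struct instobj base;
	unsigned int *vaddr;
};

/* Generic code: assumes every memory object is embedded in an instobj. */
static struct instobj *to_instobj(struct memory *m)
{
	return container_of(m, struct instobj, memory);
}

/* Driver code: walks from the innermost member back to the outer object. */
static struct new_gk20a_obj *to_gk20a(struct memory *m)
{
	return container_of(m, struct new_gk20a_obj, base.memory);
}

/* Returns 1 if instobj.preserve would alias vaddr in the old layout. */
static int old_layout_overlaps(void)
{
	return offsetof(struct instobj, preserve) ==
	       offsetof(struct old_gk20a_obj, vaddr);
}

/* Returns 1 if writing preserve through the generic cast leaves vaddr intact. */
static int preserve_keeps_vaddr(void)
{
	unsigned int word = 0;
	struct new_gk20a_obj obj = { .vaddr = &word };
	struct memory *m = &obj.base.memory;

	to_instobj(m)->preserve = 1;
	assert(to_gk20a(m) == &obj);
	return obj.vaddr == &word && obj.base.preserve == 1;
}
```

The overlap check only compares offsets, so it never actually performs the invalid cast that the original bug amounts to.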
Re: [PATCH 08/10] iommu/tegra: Use tegra_dev_iommu_get_stream_id() in the remaining places
On Wed, Nov 29, 2023 at 03:26:03PM -0400, Jason Gunthorpe wrote:
> On Wed, Nov 29, 2023 at 05:23:13PM +0100, Thierry Reding wrote:
> > > diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
> > > index 533f85a4b2bdb7..3e4fbe94dd666e 100644
> > > --- a/drivers/memory/tegra/tegra186.c
> > > +++ b/drivers/memory/tegra/tegra186.c
> > > @@ -111,21 +111,21 @@ static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
> > >  static int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
> > >  {
> > >  #if IS_ENABLED(CONFIG_IOMMU_API)
> > > -	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > >  	struct of_phandle_args args;
> > >  	unsigned int i, index = 0;
> > > +	u32 sid;
> > >
> > > +	WARN_ON(!tegra_dev_iommu_get_stream_id(dev, &sid));
> >
> > I know the code previously didn't check for any errors, but we may want
> > to do so now. If tegra_dev_iommu_get_stream_id() ever fails we may end
> > up writing some undefined value into the override register.
>
> My assumption was it never fails otherwise this probably already
> doesn't work?

I guess the point I was trying to make is that previously we would not have written anything to the stream ID register and so ignoring the error here might end up writing to a register that previously we would not have written to.

Looking at the current code more closely I see now that the reason why we wouldn't have written to the register is because we would've crashed before. So I think this is okay.

> > I'm also unsure if WARN_ON() is appropriate here. I vaguely recall that
> > ->probe_device() was called for all devices on the bus and not all of
> > them may have been associated with the IOMMU. Not all of them may in
> > fact access memory in the first place.
>
> So you are thinking that of_parse_phandle_with_args() is a NOP
> sometimes so it will tolerate the failure?
>
> Seems like the best thing to do is just continue to ignore it then?

Yeah, exactly. It would've just skipped over everything, basically.

> > Perhaps I'm misremembering and the IOMMU core now takes care of only
> > calling this when fwspec is indeed valid?
>
> Can't advise, I have no idea what tegra_mc_ops is for :)

In a nutshell, it's a hook that allows us to configure the memory controller when a device is attached to the IOMMU. The memory controller contains a set of registers that specify which memory client uses which stream ID by default. For some devices this can be overridden (which is where tegra_dev_iommu_get_stream_id() comes into play in those drivers) and for other devices we can't override, which is when the memory controller defaults come into play.

Anyway, I took a closer look at this and ran some tests. Turns out that tegra186_mc_probe_device() really only gets called for devices that have their fwspec properly initialized anyway, so I don't think there's anything special we need to do here.

Strictly from a static analysis point of view I suppose we could now have a situation where sid is uninitialized when the call to tegra_dev_iommu_get_stream_id() fails and so using it in the loop is not correct, theoretically, but I think that's just not a case that we'll ever hit in practice.

So either way is fine with me. I have a slight preference for just returning 0 in case tegra_dev_iommu_get_stream_id() fails, because it's simple to do and avoids any of these (theoretical) ambiguities. So whichever way you decide:

Reviewed-by: Thierry Reding
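The "just return 0" preference from this reply can be sketched in user space as follows. Everything here is a mock: fake_dev, fake_get_stream_id() and fake_probe_device() are hypothetical names that only mirror the shape of the discussion (a lookup that returns true on success and a probe hook that must not use an uninitialized sid); they are not the Tegra driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: has_fwspec models whether IOMMU data is attached. */
struct fake_dev {
	bool has_fwspec;
	uint32_t id;
};

/* Mock lookup: returns true and fills *sid only when the data exists. */
static bool fake_get_stream_id(const struct fake_dev *dev, uint32_t *sid)
{
	assert(sid != NULL);
	if (!dev->has_fwspec)
		return false;
	*sid = dev->id & 0xffff;	/* masking is an assumption of this mock */
	return true;
}

static unsigned int override_writes;	/* counts simulated register writes */

/* Mock probe hook: bail out early instead of using an unset sid. */
static int fake_probe_device(const struct fake_dev *dev)
{
	uint32_t sid;

	if (!fake_get_stream_id(dev, &sid))
		return 0;	/* nothing to configure, not an error */

	(void)sid;		/* a real hook would program sid here */
	override_writes++;
	return 0;
}
```

With the early return, the "sid may be uninitialized" path flagged by static analysis simply cannot reach the code that would consume sid.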
Re: [Nouveau] [PATCH 08/10] iommu/tegra: Use tegra_dev_iommu_get_stream_id() in the remaining places
On Tue, Nov 28, 2023 at 08:48:04PM -0400, Jason Gunthorpe wrote:
> This API was defined to formalize the access to internal iommu details on
> some Tegra SOCs, but a few callers got missed. Add them.
>
> The helper already masks by 0xffff so remove this code from the callers.
>
> Suggested-by: Thierry Reding
> Signed-off-by: Jason Gunthorpe
> ---
>  drivers/dma/tegra186-gpc-dma.c                  |  8 +++-----
>  drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c |  7 ++-----
>  drivers/memory/tegra/tegra186.c                 | 12 ++++++------
>  3 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/dma/tegra186-gpc-dma.c b/drivers/dma/tegra186-gpc-dma.c
> index fa4d4142a68a21..88547a23825b18 100644
> --- a/drivers/dma/tegra186-gpc-dma.c
> +++ b/drivers/dma/tegra186-gpc-dma.c
> @@ -1348,8 +1348,8 @@ static int tegra_dma_program_sid(struct tegra_dma_channel *tdc, int stream_id)
>  static int tegra_dma_probe(struct platform_device *pdev)
>  {
>  	const struct tegra_dma_chip_data *cdata = NULL;
> -	struct iommu_fwspec *iommu_spec;
> -	unsigned int stream_id, i;
> +	unsigned int i;
> +	u32 stream_id;
>  	struct tegra_dma *tdma;
>  	int ret;
>
> @@ -1378,12 +1378,10 @@ static int tegra_dma_probe(struct platform_device *pdev)
>
>  	tdma->dma_dev.dev = &pdev->dev;
>
> -	iommu_spec = dev_iommu_fwspec_get(&pdev->dev);
> -	if (!iommu_spec) {
> +	if (!tegra_dev_iommu_get_stream_id(&pdev->dev, &stream_id)) {
>  		dev_err(&pdev->dev, "Missing iommu stream-id\n");
>  		return -EINVAL;
>  	}
> -	stream_id = iommu_spec->ids[0] & 0xffff;
>
>  	ret = device_property_read_u32(&pdev->dev, "dma-channel-mask",
>  				       &tdma->chan_mask);
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
> index e7e8fdf3adab7a..b40fd1dbb21617 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c
> @@ -28,16 +28,13 @@ static void
>  gp10b_ltc_init(struct nvkm_ltc *ltc)
>  {
>  	struct nvkm_device *device = ltc->subdev.device;
> -	struct iommu_fwspec *spec;
> +	u32 sid;
>
>  	nvkm_wr32(device, 0x17e27c, ltc->ltc_nr);
>  	nvkm_wr32(device, 0x17e000, ltc->ltc_nr);
>  	nvkm_wr32(device, 0x100800, ltc->ltc_nr);
>
> -	spec = dev_iommu_fwspec_get(device->dev);
> -	if (spec) {
> -		u32 sid = spec->ids[0] & 0xffff;
> -
> +	if (tegra_dev_iommu_get_stream_id(device->dev, &sid)) {
>  		/* stream ID */
>  		nvkm_wr32(device, 0x160000, sid << 2);

We could probably also remove the comment now since the function and variable names make it obvious what's being written here.

>  	}
> diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
> index 533f85a4b2bdb7..3e4fbe94dd666e 100644
> --- a/drivers/memory/tegra/tegra186.c
> +++ b/drivers/memory/tegra/tegra186.c
> @@ -111,21 +111,21 @@ static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
>  static int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
>  {
>  #if IS_ENABLED(CONFIG_IOMMU_API)
> -	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>  	struct of_phandle_args args;
>  	unsigned int i, index = 0;
> +	u32 sid;
>
> +	WARN_ON(!tegra_dev_iommu_get_stream_id(dev, &sid));

I know the code previously didn't check for any errors, but we may want to do so now. If tegra_dev_iommu_get_stream_id() ever fails we may end up writing some undefined value into the override register.

I'm also unsure if WARN_ON() is appropriate here. I vaguely recall that ->probe_device() was called for all devices on the bus and not all of them may have been associated with the IOMMU. Not all of them may in fact access memory in the first place.

Perhaps I'm misremembering and the IOMMU core now takes care of only calling this when fwspec is indeed valid?

Thierry
Re: [Nouveau] [PATCH v3 04/27] drm: Don't test for IRQ support in VBLANK ioctls
On Thu, Jun 24, 2021 at 11:07:57AM +0200, Thomas Zimmermann wrote:
> Hi
>
> On 24.06.21 at 10:51, Jani Nikula wrote:
> > On Thu, 24 Jun 2021, Thomas Zimmermann wrote:
> > > Hi
> > >
> > > On 24.06.21 at 10:06, Jani Nikula wrote:
> > > > On Thu, 24 Jun 2021, Thomas Zimmermann wrote:
> > > > > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > > > > index 3417e1ac7918..10fe16bafcb6 100644
> > > > > --- a/drivers/gpu/drm/drm_vblank.c
> > > > > +++ b/drivers/gpu/drm/drm_vblank.c
> > > > > @@ -1748,8 +1748,16 @@ int drm_wait_vblank_ioctl(struct drm_device *dev, void *data,
> > > > >  	unsigned int pipe_index;
> > > > >  	unsigned int flags, pipe, high_pipe;
> > > > > -	if (!dev->irq_enabled)
> > > > > -		return -EOPNOTSUPP;
> > > > > +#if defined(CONFIG_DRM_LEGACY)
> > > > > +	if (unlikely(drm_core_check_feature(dev, DRIVER_LEGACY))) {
> > > > > +		if (!dev->irq_enabled)
> > > > > +			return -EOPNOTSUPP;
> > > > > +	} else /* if DRIVER_MODESET */
> > > > > +#endif
> > > > > +	{
> > > > > +		if (!drm_dev_has_vblank(dev))
> > > > > +			return -EOPNOTSUPP;
> > > > > +	}
> > > >
> > > > Sheesh I hate this kind of inline #ifdefs.
> > > >
> > > > Two alternate suggestions that I believe should be just as efficient:
> > >
> > > Or how about:
> > >
> > > static bool drm_wait_vblank_supported(struct drm_device *dev)
> > > {
> > > #if defined(CONFIG_DRM_LEGACY)
> > > 	if (unlikely(drm_core_check_feature(dev, DRIVER_LEGACY)))
> > > 		return dev->irq_enabled;
> > > #endif
> > > 	return drm_dev_has_vblank(dev);
> > > }
> > >
> > > ?
> > >
> > > It's inline, but still readable.
> >
> > It's definitely better than the original, but it's unclear to me why
> > you'd prefer this over option 2) below. I guess the only reason I can
> > think of is emphasizing the conditional compilation. However,
> > IS_ENABLED() is widely used in this manner specifically to avoid inline
> > #if, and the compiler optimizes it away.
>
> It's simply more readable to me as the condition is simpler. But option 2 is
> also ok.

Perhaps do something like this, then:

	if (IS_ENABLED(CONFIG_DRM_LEGACY)) {
		if (unlikely(drm_core_check_feature(dev, DRIVER_LEGACY)))
			return dev->irq_enabled;
	}

	return drm_dev_has_vblank(dev);

That's just about as readable as the variant involving the preprocessor but has all the benefits of not using the preprocessor.

Thierry
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
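A rough user-space illustration of the IS_ENABLED()-style variant suggested above. The kernel's real IS_ENABLED() macro is more elaborate; here FAKE_DRM_LEGACY is a hypothetical 0/1 constant standing in for it, and struct fake_drm is a made-up stand-in for the device. The point is only that both branches are always parsed and type-checked by the compiler, while the disabled one can be dropped as dead code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical build-time switch; 0 models CONFIG_DRM_LEGACY being disabled. */
#ifndef FAKE_DRM_LEGACY
#define FAKE_DRM_LEGACY 0
#endif

/* Made-up stand-in for the fields the check looks at. */
struct fake_drm {
	bool legacy;		/* driver advertises the legacy feature */
	bool irq_enabled;
	bool has_vblank;
};

static bool fake_wait_vblank_supported(const struct fake_drm *dev)
{
	assert(dev != NULL);

	/* Plain C condition: still compiled when the option is off. */
	if (FAKE_DRM_LEGACY) {
		if (dev->legacy)
			return dev->irq_enabled;
	}

	return dev->has_vblank;
}
```

Building the same file with -DFAKE_DRM_LEGACY=1 exercises the legacy branch without touching the source, which is the main practical benefit over an inline #if block.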
Re: [Nouveau] [PATCH v3 21/27] drm/tegra: Don't set struct drm_device.irq_enabled
On Thu, Jun 24, 2021 at 09:29:10AM +0200, Thomas Zimmermann wrote:
> The field drm_device.irq_enabled is only used by legacy drivers
> with userspace modesetting. Don't set it in tegra.
>
> Signed-off-by: Thomas Zimmermann
> Reviewed-by: Laurent Pinchart
> Acked-by: Daniel Vetter
> ---
>  drivers/gpu/drm/tegra/drm.c | 7 -------
>  1 file changed, 7 deletions(-)

Acked-by: Thierry Reding
Re: [Nouveau] [PATCH] nouveau/gem: fix user-after-free in nouveau_gem_new
On Mon, May 17, 2021 at 09:32:44AM -0400, Jeremy Cline wrote:
> On Mon, May 17, 2021 at 11:19:02AM +0200, Thierry Reding wrote:
> > On Mon, May 17, 2021 at 10:56:29AM +0200, Thierry Reding wrote:
> > > On Tue, May 11, 2021 at 06:35:53PM +0200, Karol Herbst wrote:
> > > > If ttm_bo_init fails it will already call ttm_bo_put, so we don't have to
> > > > do it through nouveau_bo_ref.
> > > >
> > > > ==
> > > > BUG: KFENCE: use-after-free write in ttm_bo_put+0x11/0x40 [ttm]
> > > >
> > > > Use-after-free write at 0x4dc4663c (in kfence-#44):
> > > >  ttm_bo_put+0x11/0x40 [ttm]
> > > >  nouveau_gem_new+0xc1/0xf0 [nouveau]
> > > >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> > > >  drm_ioctl_kernel+0xb2/0x100 [drm]
> > > >  drm_ioctl+0x215/0x390 [drm]
> > > >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> > > >  __x64_sys_ioctl+0x83/0xb0
> > > >  do_syscall_64+0x33/0x40
> > > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >
> > > > kfence-#44 [0xc0593b31-0x2e74122b, size=792, cache=kmalloc-1k] allocated by task 2657:
> > > >  nouveau_bo_alloc+0x63/0x4c0 [nouveau]
> > > >  nouveau_gem_new+0x38/0xf0 [nouveau]
> > > >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> > > >  drm_ioctl_kernel+0xb2/0x100 [drm]
> > > >  drm_ioctl+0x215/0x390 [drm]
> > > >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> > > >  __x64_sys_ioctl+0x83/0xb0
> > > >  do_syscall_64+0x33/0x40
> > > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >
> > > > freed by task 2657:
> > > >  ttm_bo_release+0x1cc/0x300 [ttm]
> > > >  ttm_bo_init_reserved+0x2ec/0x300 [ttm]
> > > >  ttm_bo_init+0x5e/0xd0 [ttm]
> > > >  nouveau_bo_init+0xaf/0xc0 [nouveau]
> > > >  nouveau_gem_new+0x7f/0xf0 [nouveau]
> > > >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> > > >  drm_ioctl_kernel+0xb2/0x100 [drm]
> > > >  drm_ioctl+0x215/0x390 [drm]
> > > >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> > > >  __x64_sys_ioctl+0x83/0xb0
> > > >  do_syscall_64+0x33/0x40
> > > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >
> > > > Fixes: 019cbd4a4feb3 "drm/nouveau: Initialize GEM object before TTM object"
> > > > Cc: Thierry Reding
> > > > Signed-off-by: Karol Herbst
> > > > ---
> > > >  drivers/gpu/drm/nouveau/nouveau_gem.c | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > index c88cbb85f101..1165ff990fb5 100644
> > > > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > > @@ -212,7 +212,6 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
> > > >
> > > >  	ret = nouveau_bo_init(nvbo, size, align, domain, NULL, NULL);
> > > >  	if (ret) {
> > > > -		nouveau_bo_ref(NULL, &nvbo);
> > > >  		return ret;
> > > >  	}
> > >
> > > Looking at the surrounding code, I wonder if I just managed to jumble
> > > the cleanup paths for drm_gem_object_init() and nouveau_bo_init(). If
> > > drm_gem_object_init() fails, I don't think it's necessary (though it
> > > also doesn't look harmful) to call drm_gem_object_release().
> > >
> > > However, if nouveau_bo_init() fails, then I think we'd still need to
> > > call drm_gem_object_release(), to make sure to undo the effects of
> > > drm_gem_object_init().
> > >
> > > So I wonder if we need something like this instead:
> > >
> > > --- >8 ---
> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > index c88cbb85f101..9b6055116f30 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > > @@ -205,14 +205,13 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
> > >  	 * to the caller, instead of a normal nouveau_bo ttm reference. */
> > >  	ret = drm_gem_objec
Re: [Nouveau] [PATCH] nouveau/gem: fix user-after-free in nouveau_gem_new
On Mon, May 17, 2021 at 10:56:29AM +0200, Thierry Reding wrote:
> On Tue, May 11, 2021 at 06:35:53PM +0200, Karol Herbst wrote:
> > If ttm_bo_init fails it will already call ttm_bo_put, so we don't have to
> > do it through nouveau_bo_ref.
> >
> > ==
> > BUG: KFENCE: use-after-free write in ttm_bo_put+0x11/0x40 [ttm]
> >
> > Use-after-free write at 0x4dc4663c (in kfence-#44):
> >  ttm_bo_put+0x11/0x40 [ttm]
> >  nouveau_gem_new+0xc1/0xf0 [nouveau]
> >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> >  drm_ioctl_kernel+0xb2/0x100 [drm]
> >  drm_ioctl+0x215/0x390 [drm]
> >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> >  __x64_sys_ioctl+0x83/0xb0
> >  do_syscall_64+0x33/0x40
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > kfence-#44 [0xc0593b31-0x2e74122b, size=792, cache=kmalloc-1k] allocated by task 2657:
> >  nouveau_bo_alloc+0x63/0x4c0 [nouveau]
> >  nouveau_gem_new+0x38/0xf0 [nouveau]
> >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> >  drm_ioctl_kernel+0xb2/0x100 [drm]
> >  drm_ioctl+0x215/0x390 [drm]
> >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> >  __x64_sys_ioctl+0x83/0xb0
> >  do_syscall_64+0x33/0x40
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > freed by task 2657:
> >  ttm_bo_release+0x1cc/0x300 [ttm]
> >  ttm_bo_init_reserved+0x2ec/0x300 [ttm]
> >  ttm_bo_init+0x5e/0xd0 [ttm]
> >  nouveau_bo_init+0xaf/0xc0 [nouveau]
> >  nouveau_gem_new+0x7f/0xf0 [nouveau]
> >  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
> >  drm_ioctl_kernel+0xb2/0x100 [drm]
> >  drm_ioctl+0x215/0x390 [drm]
> >  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
> >  __x64_sys_ioctl+0x83/0xb0
> >  do_syscall_64+0x33/0x40
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Fixes: 019cbd4a4feb3 "drm/nouveau: Initialize GEM object before TTM object"
> > Cc: Thierry Reding
> > Signed-off-by: Karol Herbst
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_gem.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > index c88cbb85f101..1165ff990fb5 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> > @@ -212,7 +212,6 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
> >
> >  	ret = nouveau_bo_init(nvbo, size, align, domain, NULL, NULL);
> >  	if (ret) {
> > -		nouveau_bo_ref(NULL, &nvbo);
> >  		return ret;
> >  	}
>
> Looking at the surrounding code, I wonder if I just managed to jumble
> the cleanup paths for drm_gem_object_init() and nouveau_bo_init(). If
> drm_gem_object_init() fails, I don't think it's necessary (though it
> also doesn't look harmful) to call drm_gem_object_release().
>
> However, if nouveau_bo_init() fails, then I think we'd still need to
> call drm_gem_object_release(), to make sure to undo the effects of
> drm_gem_object_init().
>
> So I wonder if we need something like this instead:
>
> --- >8 ---
> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> index c88cbb85f101..9b6055116f30 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> @@ -205,14 +205,13 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
>  	 * to the caller, instead of a normal nouveau_bo ttm reference. */
>  	ret = drm_gem_object_init(drm->dev, &nvbo->bo.base, size);
>  	if (ret) {
> -		drm_gem_object_release(&nvbo->bo.base);
>  		kfree(nvbo);
>  		return ret;
>  	}
>
>  	ret = nouveau_bo_init(nvbo, size, align, domain, NULL, NULL);
>  	if (ret) {
> -		nouveau_bo_ref(NULL, &nvbo);
> +		drm_gem_object_release(&nvbo->bo.base);
>  		return ret;
>  	}
>
> --- >8 ---
>
> Thierry

Adding Jeremy for visibility.

Thierry
Re: [Nouveau] [PATCH] nouveau/gem: fix user-after-free in nouveau_gem_new
On Tue, May 11, 2021 at 06:35:53PM +0200, Karol Herbst wrote:
> If ttm_bo_init fails it will already call ttm_bo_put, so we don't have to
> do it through nouveau_bo_ref.
>
> ==
> BUG: KFENCE: use-after-free write in ttm_bo_put+0x11/0x40 [ttm]
>
> Use-after-free write at 0x4dc4663c (in kfence-#44):
>  ttm_bo_put+0x11/0x40 [ttm]
>  nouveau_gem_new+0xc1/0xf0 [nouveau]
>  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
>  drm_ioctl_kernel+0xb2/0x100 [drm]
>  drm_ioctl+0x215/0x390 [drm]
>  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
>  __x64_sys_ioctl+0x83/0xb0
>  do_syscall_64+0x33/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> kfence-#44 [0xc0593b31-0x2e74122b, size=792, cache=kmalloc-1k] allocated by task 2657:
>  nouveau_bo_alloc+0x63/0x4c0 [nouveau]
>  nouveau_gem_new+0x38/0xf0 [nouveau]
>  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
>  drm_ioctl_kernel+0xb2/0x100 [drm]
>  drm_ioctl+0x215/0x390 [drm]
>  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
>  __x64_sys_ioctl+0x83/0xb0
>  do_syscall_64+0x33/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> freed by task 2657:
>  ttm_bo_release+0x1cc/0x300 [ttm]
>  ttm_bo_init_reserved+0x2ec/0x300 [ttm]
>  ttm_bo_init+0x5e/0xd0 [ttm]
>  nouveau_bo_init+0xaf/0xc0 [nouveau]
>  nouveau_gem_new+0x7f/0xf0 [nouveau]
>  nouveau_gem_ioctl_new+0x53/0xf0 [nouveau]
>  drm_ioctl_kernel+0xb2/0x100 [drm]
>  drm_ioctl+0x215/0x390 [drm]
>  nouveau_drm_ioctl+0x55/0xa0 [nouveau]
>  __x64_sys_ioctl+0x83/0xb0
>  do_syscall_64+0x33/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Fixes: 019cbd4a4feb3 "drm/nouveau: Initialize GEM object before TTM object"
> Cc: Thierry Reding
> Signed-off-by: Karol Herbst
> ---
>  drivers/gpu/drm/nouveau/nouveau_gem.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> index c88cbb85f101..1165ff990fb5 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> @@ -212,7 +212,6 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
>
>  	ret = nouveau_bo_init(nvbo, size, align, domain, NULL, NULL);
>  	if (ret) {
> -		nouveau_bo_ref(NULL, &nvbo);
>  		return ret;
>  	}

Looking at the surrounding code, I wonder if I just managed to jumble the cleanup paths for drm_gem_object_init() and nouveau_bo_init(). If drm_gem_object_init() fails, I don't think it's necessary (though it also doesn't look harmful) to call drm_gem_object_release().

However, if nouveau_bo_init() fails, then I think we'd still need to call drm_gem_object_release(), to make sure to undo the effects of drm_gem_object_init().

So I wonder if we need something like this instead:

--- >8 ---
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index c88cbb85f101..9b6055116f30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -205,14 +205,13 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
 	 * to the caller, instead of a normal nouveau_bo ttm reference. */
 	ret = drm_gem_object_init(drm->dev, &nvbo->bo.base, size);
 	if (ret) {
-		drm_gem_object_release(&nvbo->bo.base);
 		kfree(nvbo);
 		return ret;
 	}
 
 	ret = nouveau_bo_init(nvbo, size, align, domain, NULL, NULL);
 	if (ret) {
-		nouveau_bo_ref(NULL, &nvbo);
+		drm_gem_object_release(&nvbo->bo.base);
 		return ret;
 	}
 
--- >8 ---

Thierry
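The ownership rule stated in Karol's commit message (the failing init call has already dropped the object, so the caller must not drop it again) can be sketched in user space as below. All names are hypothetical stand-ins, not nouveau, GEM or TTM functions. The mock also assumes that the destroy path undoes the first-stage setup; this thread does not settle whether that holds for the real driver, so treat that part purely as an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with two initialization stages. */
struct fake_bo {
	int stage_a_ready;
};

static int live_objects;	/* objects allocated and not yet freed */
static int stage_a_active;	/* stage-A setups not yet undone */

static struct fake_bo *fake_bo_alloc(void)
{
	struct fake_bo *bo = calloc(1, sizeof(*bo));

	if (bo)
		live_objects++;
	return bo;
}

/* Stage A: on failure nothing was set up, so the caller only frees. */
static int fake_stage_a_init(struct fake_bo *bo, int fail)
{
	if (fail)
		return -1;
	bo->stage_a_ready = 1;
	stage_a_active++;
	return 0;
}

/* Destroy path: assumed (in this mock) to undo stage A and free the object. */
static void fake_bo_destroy(struct fake_bo *bo)
{
	assert(live_objects > 0);
	if (bo->stage_a_ready)
		stage_a_active--;
	free(bo);
	live_objects--;
}

/* Stage B: on failure it destroys the object itself before returning. */
static int fake_stage_b_init(struct fake_bo *bo, int fail)
{
	if (fail) {
		fake_bo_destroy(bo);
		return -1;
	}
	return 0;
}

static int fake_bo_new(struct fake_bo **out, int fail_a, int fail_b)
{
	struct fake_bo *bo = fake_bo_alloc();
	int ret;

	*out = NULL;
	if (!bo)
		return -1;

	ret = fake_stage_a_init(bo, fail_a);
	if (ret) {
		free(bo);
		live_objects--;
		return ret;
	}

	ret = fake_stage_b_init(bo, fail_b);
	if (ret)
		return ret;	/* bo is already gone: do not touch it here */

	*out = bo;
	return 0;
}
```

Touching bo after a stage-B failure in this mock would be the same class of bug that KFENCE reported above: a write to memory the callee has already freed.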
Re: [Nouveau] [PATCH v3 03/20] drm/dp: Move i2c init to drm_dp_aux_init, add __must_check and fini
On Thu, Apr 22, 2021 at 01:18:09PM -0400, Lyude Paul wrote:
> On Tue, 2021-04-20 at 02:16 +0300, Ville Syrjälä wrote:
> >
> > The init vs. register split is intentional. Registering the thing
> > and allowing userspace access to it before the rest of the driver
> > is ready isn't particularly great. For a while now we've tried to
> > move towards an architecture where the driver is fully initialized
> > before anything gets exposed to userspace.
>
> Yeah-thank you for pointing this out. Thierry - do you think there's an
> alternate solution we could go with in Tegra to fix the get_device() issue
> that wouldn't require us trying to expose the i2c adapter early?

I suppose we could do it in a hackish way that grabs a reference to the I2C adapter only upon registration. We can't do that for the regular I2C DDC case where the I2C controller is an external one because by the time we get to registration it could've gone again. This would make both code paths asymmetric, so I'd prefer not to do it. Perhaps it could serve as a stop-gap solution until something better is in place, though.

Thierry
Re: [Nouveau] [PATCH v3 03/20] drm/dp: Move i2c init to drm_dp_aux_init, add __must_check and fini
On Fri, Apr 23, 2021 at 12:11:06AM -0400, Lyude Paul wrote:
> On Thu, 2021-04-22 at 18:33 -0400, Lyude Paul wrote:
> > OK - talked with Ville a bit on this and did some of my own research, I
> > actually think that moving i2c to drm_dp_aux_init() is the right decision
> > for the time being. The reasoning behind this being that as shown by my
> > previous work of fixing drivers that call drm_dp_aux_register() too early -
> > it seems like there's already been drivers that have been working just fine
> > with setting up the i2c device before DRM registration.
> >
> > In the future, it'd probably be better if we can split up i2c_add_adapter()
> > into an init and register function - but we'll have to talk with the i2c
> > maintainers to see if this is acceptable w/ them
>
> Actually - I think adding the ability to refcount dp aux adapters might be a
> better solution so I'm going to try that!

I'm curious: how is a DP AUX adapter reference count going to solve the issue of potentially registering devices too early (i.e. before the DRM device is registered)? Is it because registering too early could cause a reference count problem if somebody gets a hold of the DP AUX adapter before the parent DRM device is around?

Thierry
Re: [Nouveau] [PATCH v3 03/20] drm/dp: Move i2c init to drm_dp_aux_init, add __must_check and fini
On Thu, Apr 22, 2021 at 06:33:44PM -0400, Lyude Paul wrote:
> OK - talked with Ville a bit on this and did some of my own research, I
> actually think that moving i2c to drm_dp_aux_init() is the right decision for
> the time being. The reasoning behind this being that as shown by my previous
> work of fixing drivers that call drm_dp_aux_register() too early - it seems
> like there's already been drivers that have been working just fine with
> setting up the i2c device before DRM registration.
>
> In the future, it'd probably be better if we can split up i2c_add_adapter()
> into an init and register function - but we'll have to talk with the i2c
> maintainers to see if this is acceptable w/ them

Yeah, that sounds like a better long-term solution. We could leave i2c_add_adapter() in place, since it's already half-way split up into some initialization code and i2c_register_adapter(), so it shouldn't be all that difficult to split out an i2c_init_adapter() so that outside users can do the split setup.

Thierry
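A toy user-space model of the init/register split discussed in this subthread. The fake_adapter type and its functions are invented for illustration; in particular this is not the I2C core API, and i2c_init_adapter() is only a proposal in the message above. The sketch shows the property being asked for: after the init step other code can safely take a reference, while visibility to the outside world only starts at the register step.

```c
#include <assert.h>
#include <stddef.h>

/* Invented adapter state for illustration only. */
struct fake_adapter {
	int initialized;	/* internal state set up, references allowed */
	int registered;		/* visible to the outside world */
	int refcount;
};

/* Init step: makes the object safe to reference, but not yet visible. */
static void fake_adapter_init(struct fake_adapter *a)
{
	assert(a != NULL);
	a->initialized = 1;
	a->registered = 0;
	a->refcount = 1;
}

/* Taking a reference fails until the init step has run. */
static int fake_adapter_get(struct fake_adapter *a)
{
	if (!a->initialized)
		return -1;
	a->refcount++;
	return 0;
}

/* Register step: exposes an already initialized adapter. */
static int fake_adapter_register(struct fake_adapter *a)
{
	if (!a->initialized)
		return -1;
	a->registered = 1;
	return 0;
}
```

In this model a consumer can hold a reference between the two steps without the adapter being exposed, which is the ordering the Tegra case described in this thread needs.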
Re: [Nouveau] [PATCH 03/30] drm/tegra: Don't register DP AUX channels before connectors
On Fri, Feb 19, 2021 at 04:52:59PM -0500, Lyude Paul wrote:
> As pointed out by the documentation for drm_dp_aux_register(),
> drm_dp_aux_init() should be used in situations where the AUX channel for a
> display driver can potentially be registered before its respective DRM
> driver. This is the case with Tegra, since the DP aux channel exists as a
> platform device instead of being a grandchild of the DRM device.
>
> Since we're about to add a backpointer to a DP AUX channel's respective DRM
> device, let's fix this so that we don't potentially allow userspace to use
> the AUX channel before we've associated it with its DRM connector.
>
> Signed-off-by: Lyude Paul
> ---
>  drivers/gpu/drm/tegra/dpaux.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c
> index 105fb9cdbb3b..ea56c6ec25e4 100644
> --- a/drivers/gpu/drm/tegra/dpaux.c
> +++ b/drivers/gpu/drm/tegra/dpaux.c
> @@ -534,9 +534,7 @@ static int tegra_dpaux_probe(struct platform_device *pdev)
>  	dpaux->aux.transfer = tegra_dpaux_transfer;
>  	dpaux->aux.dev = &pdev->dev;
>
> -	err = drm_dp_aux_register(&dpaux->aux);
> -	if (err < 0)
> -		return err;
> +	drm_dp_aux_init(&dpaux->aux);

I just noticed that this change causes an error on some setups that I haven't seen before. The problem is that the SOR driver tries to grab a reference to the I2C device to make sure it doesn't go away while it has a pointer to it. However, since now the I2C adapter hasn't been registered yet, I get this:

[ 15.013969] kobject: '(null)' (5c903e43): is not initialized, yet kobject_get() is being called.

I recall that you wanted to make this change so that a backpointer to the DRM device could be added (I think that's patch 15 of the series), but I didn't see that patch get merged, so it's a bit difficult to try and fix this up.

Has the situation changed? Do we no longer need the backpointer? If we still want it, what's the plan for merging the change? Should I work under the assumption that the patch will make it in sometime and try to fix this on top of that?

I'm thinking that perhaps we can move the I2C adapter registration into drm_dp_aux_init() since that's independent of the DRM device. It would also make a bit more sense from the Tegra driver's point of view where all devices would be created during the ->probe() path, and only during the ->init() path would the connection between DRM device and DRM DP AUX device be established.

Thierry
Re: [Nouveau] [PATCH v2 05/20] drm/dp: Add backpointer to drm_device in drm_dp_aux
On Fri, Mar 26, 2021 at 04:37:52PM -0400, Lyude Paul wrote:
> This is something that we've wanted for a while now: the ability to
> actually look up the respective drm_device for a given drm_dp_aux struct.
> This will also allow us to transition over to using the drm_dbg_*() helpers
> for debug message printing, as we'll finally have a drm_device to reference
> for doing so.
>
> Note that there is one limitation with this - because some DP AUX adapters
> exist as platform devices which are initialized independently of their
> respective DRM devices, one cannot rely on drm_dp_aux->drm_dev to always be
> non-NULL until drm_dp_aux_register() has been called. We make sure to point
> this out in the documentation for struct drm_dp_aux.
>
> Signed-off-by: Lyude Paul
> ---
[...]
>  drivers/gpu/drm/tegra/dpaux.c | 1 +
[...]

Acked-by: Thierry Reding
Re: [Nouveau] [PATCH v2 02/20] drm/tegra: Don't register DP AUX channels before connectors
On Fri, Mar 26, 2021 at 04:37:49PM -0400, Lyude Paul wrote:
> As pointed out by the documentation for drm_dp_aux_register(),
> drm_dp_aux_init() should be used in situations where the AUX channel for a
> display driver can potentially be registered before its respective DRM
> driver. This is the case with Tegra, since the DP aux channel exists as a
> platform device instead of being a grandchild of the DRM device.
>
> Since we're about to add a backpointer to a DP AUX channel's respective DRM
> device, let's fix this so that we don't potentially allow userspace to use
> the AUX channel before we've associated it with its DRM connector.
>
> Signed-off-by: Lyude Paul
> ---
>  drivers/gpu/drm/tegra/dpaux.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)

Acked-by: Thierry Reding
Re: [Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
On Wed, Sep 23, 2020 at 05:21:24PM +0200, Daniel Vetter wrote:
> On Wed, Sep 23, 2020 at 11:18:53AM +0200, Thierry Reding wrote:
> > On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> > > From: Thierry Reding
> > >
> > > Hi,
> > >
> > > This series implements a new IOCTL to submit push buffers that can
> > > optionally return a sync FD or sync object to userspace. This is useful
> > > in cases where userspace wants to synchronize operations between the GPU
> > > and another driver (such as KMS for display). Among other things this
> > > allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
> > >
> > > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> > > sync objects to be passed rather than only sync FDs. It also allows any
> > > number of sync FDs/objects to be passed in or emitted. I think those are
> > > useful features, but I left them in a separate patch in case everybody
> > > else thinks that this won't be needed. If we decide to merge the new ABI
> > > then patch 4 should be squashed into patch 3.
> > >
> > > The corresponding userspace changes can be found here:
> > >
> > > libdrm:
> > > https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
> > > mesa:
> > > https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
> > >
> > > I've verified that this works with kmscube's --atomic mode and Weston.
> >
> > Hi Ben,
> >
> > any thoughts on this series? I realize that this is somewhat suboptimal
> > because we're effectively adding a duplicate of the existing IOCTL with
> > only the "minor" extension of adding sync FDs/objects, but at the same
> > time I don't have any good ideas on what else to add to make this more
> > appealing or if you have any plans of your own to address this in the
> > future.
>
> drm core automatically zero-extends ioctl structs both ways, so if all you
> do is add more stuff to the top level ioctl struct at the bottom, there's
> no need to duplicate any code. At least as long as you guarantee that 0 ==
> old behaviour for both in and out parameters.

But that only works if the structure size remains fixed, right? In this
case, however, we have to extend the structure with additional fields,
so the size is going to change and therefore the IOCTL number will also
change.

Thierry
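For readers following the size question above, here is a minimal userspace sketch of how the argument size ends up in a Linux ioctl number. The struct layouts, the 'd' magic and the 0x41 index are made up for illustration and are not the Nouveau ABI. It only shows that appending a field changes the full number while the command index stays the same; whether the DRM core then treats both as the same command is the point under discussion and is not demonstrated here.

```c
#include <linux/ioctl.h>
#include <stdint.h>

/* Hypothetical argument layouts: v2 is v1 plus one trailing field. */
struct demo_submit_v1 {
	uint32_t channel;
	uint32_t nr_push;
};

struct demo_submit_v2 {
	uint32_t channel;
	uint32_t nr_push;
	uint64_t fence;		/* new field appended at the end */
};

/* Same magic and same command index; only the argument type differs. */
#define DEMO_SUBMIT_V1	_IOWR('d', 0x41, struct demo_submit_v1)
#define DEMO_SUBMIT_V2	_IOWR('d', 0x41, struct demo_submit_v2)

/* Returns 1 if two ioctl numbers share magic and index, ignoring size. */
static int demo_same_command(unsigned long a, unsigned long b)
{
	return _IOC_TYPE(a) == _IOC_TYPE(b) && _IOC_NR(a) == _IOC_NR(b);
}

/* Returns the argument size encoded in an ioctl number. */
static unsigned long demo_encoded_size(unsigned long cmd)
{
	return _IOC_SIZE(cmd);
}
```

With these definitions the two full numbers differ, but demo_same_command() reports a match, so a dispatcher that keys on the index alone could in principle accept either size.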
Re: [Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> From: Thierry Reding
>
> Hi,
>
> This series implements a new IOCTL to submit push buffers that can
> optionally return a sync FD or sync object to userspace. This is useful
> in cases where userspace wants to synchronize operations between the GPU
> and another driver (such as KMS for display). Among other things this
> allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
>
> Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> sync objects to be passed rather than only sync FDs. It also allows any
> number of sync FDs/objects to be passed in or emitted. I think those are
> useful features, but I left them in a separate patch in case everybody
> else thinks that this won't be needed. If we decide to merge the new ABI
> then patch 4 should be squashed into patch 3.
>
> The corresponding userspace changes can be found here:
>
> libdrm:
> https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
> mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
>
> I've verified that this works with kmscube's --atomic mode and Weston.

Hi Ben,

any thoughts on this series? I realize that this is somewhat suboptimal
because we're effectively adding a duplicate of the existing IOCTL with
only the "minor" extension of adding sync FDs/objects, but at the same
time I don't have any good ideas on what else to add to make this more
appealing or if you have any plans of your own to address this in the
future.

Thierry
Re: [Nouveau] [PATCH v2 14/21] drm/tegra: Introduce GEM object functions
On Tue, Sep 15, 2020 at 04:59:51PM +0200, Thomas Zimmermann wrote:
> GEM object functions deprecate several similar callback interfaces in
> struct drm_driver. This patch replaces the per-driver callbacks with
> per-instance callbacks in tegra.
>
> Signed-off-by: Thomas Zimmermann
> ---
>  drivers/gpu/drm/tegra/drm.c | 4
>  drivers/gpu/drm/tegra/gem.c | 8
>  2 files changed, 8 insertions(+), 4 deletions(-)

Acked-by: Thierry Reding
[Nouveau] [PATCH 5/6] drm/nouveau: Support DMA fence arrays
From: Thierry Reding A DMA fence can be composed of multiple fences in an array. Support this in the Nouveau driver by iteratively synchronizing to each DMA fence in the array. Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nouveau_fence.c | 31 ++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index 8530c2684832..c0849e09279c 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -24,6 +24,7 @@ * */ +#include #include #include #include @@ -338,9 +339,9 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr) return 0; } -int -nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan, - bool intr) +static int +__nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan, +bool intr) { struct nouveau_fence_chan *fctx = chan->fence; struct nouveau_channel *prev = NULL; @@ -363,6 +364,30 @@ nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan, return ret; } +int +nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan, + bool intr) +{ + int ret = 0; + + if (dma_fence_is_array(fence)) { + struct dma_fence_array *array = to_dma_fence_array(fence); + unsigned int i; + + for (i = 0; i < array->num_fences; i++) { + struct dma_fence *f = array->fences[i]; + + ret = __nouveau_fence_sync(f, chan, intr); + if (ret < 0) + break; + } + } else { + ret = __nouveau_fence_sync(fence, chan, intr); + } + + return ret; +} + struct nouveau_fence * nouveau_fence_ref(struct nouveau_fence *fence) { -- 2.28.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits
From: Thierry Reding These are useful in cases where only a fence is to be created to wait for existing jobs in the command stream. Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nouveau_gem.c | 197 +- 1 file changed, 99 insertions(+), 98 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index b3ece731e4e1..c70a045d7141 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -816,9 +816,9 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev, struct nouveau_abi16_chan *temp; struct nouveau_drm *drm = nouveau_drm(dev); struct drm_nouveau_gem_pushbuf *req = >base; - struct drm_nouveau_gem_pushbuf_push *push; struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL; - struct drm_nouveau_gem_pushbuf_bo *bo; + struct drm_nouveau_gem_pushbuf_push *push = NULL; + struct drm_nouveau_gem_pushbuf_bo *bo = NULL; struct drm_nouveau_gem_fence *fences = NULL; struct nouveau_channel *chan = NULL; struct validate_op op; @@ -850,8 +850,6 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev, req->vram_available = drm->gem.vram_available; req->gart_available = drm->gem.gart_available; - if (unlikely(req->nr_push == 0)) - goto out_next; if (unlikely(req->nr_push > NOUVEAU_GEM_MAX_PUSH)) { NV_PRINTK(err, cli, "pushbuf push count exceeds limit: %d max %d\n", @@ -871,33 +869,35 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev, return nouveau_abi16_put(abi16, -EINVAL); } - push = u_memcpya(req->push, req->nr_push, sizeof(*push)); - if (IS_ERR(push)) - return nouveau_abi16_put(abi16, PTR_ERR(push)); + if (req->nr_push > 0) { + push = u_memcpya(req->push, req->nr_push, sizeof(*push)); + if (IS_ERR(push)) + return nouveau_abi16_put(abi16, PTR_ERR(push)); - bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo)); - if (IS_ERR(bo)) { - u_free(push); - return nouveau_abi16_put(abi16, PTR_ERR(bo)); - } + bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo)); + if (IS_ERR(bo)) { + 
u_free(push); + return nouveau_abi16_put(abi16, PTR_ERR(bo)); + } - /* Ensure all push buffers are on validate list */ - for (i = 0; i < req->nr_push; i++) { - if (push[i].bo_index >= req->nr_buffers) { - NV_PRINTK(err, cli, "push %d buffer not in list\n", i); - ret = -EINVAL; - goto out_prevalid; + /* Ensure all push buffers are on validate list */ + for (i = 0; i < req->nr_push; i++) { + if (push[i].bo_index >= req->nr_buffers) { + NV_PRINTK(err, cli, "push %d buffer not in list\n", i); + ret = -EINVAL; + goto out_prevalid; + } } - } - /* Validate buffer list */ + /* Validate buffer list */ revalidate: - ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo, - req->nr_buffers, , _reloc); - if (ret) { - if (ret != -ERESTARTSYS) - NV_PRINTK(err, cli, "validate: %d\n", ret); - goto out_prevalid; + ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo, + req->nr_buffers, , _reloc); + if (ret) { + if (ret != -ERESTARTSYS) + NV_PRINTK(err, cli, "validate: %d\n", ret); + goto out_prevalid; + } } if (request->num_fences > 0) { @@ -915,89 +915,89 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev, } /* Apply any relocations that are required */ - if (do_reloc) { - if (!reloc) { - validate_fini(, chan, NULL, bo); - reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc)); - if (IS_ERR(reloc)) { - ret = PTR_ERR(reloc); - goto out_prevalid; - } + if (req->nr_push > 0) { + if (do_reloc) { + if (!reloc) { + validate_fini(, chan, NULL, bo); + reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc)); +
[Nouveau] [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs
From: Thierry Reding Extends the new NOUVEAU_GEM_PUSHBUF2 IOCTL to accept and emit one or more sync FDs and/or DRM native sync objects. Signed-off-by: Thierry Reding --- Note: If acceptable, this should be merged into the previous patch that adds the new IOCTL. drivers/gpu/drm/nouveau/nouveau_gem.c | 180 ++ include/uapi/drm/nouveau_drm.h| 21 ++- 2 files changed, 167 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index 039f244c0a00..b3ece731e4e1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -26,6 +26,7 @@ #include #include +#include #include "nouveau_drv.h" #include "nouveau_dma.h" @@ -680,12 +681,137 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, return ret; } +static int nouveau_channel_wait_fence(struct nouveau_channel *channel, + struct drm_file *file_priv, + struct drm_nouveau_gem_fence *f) +{ + struct dma_fence *fence; + + if (f->flags & NOUVEAU_GEM_FENCE_FD) { + fence = sync_file_get_fence(f->handle); + if (!fence) + return -ENOENT; + } else { + struct drm_syncobj *syncobj; + + syncobj = drm_syncobj_find(file_priv, f->handle); + if (!syncobj) + return -ENOENT; + + fence = drm_syncobj_fence_get(syncobj); + drm_syncobj_put(syncobj); + } + + return nouveau_fence_sync(fence, channel, true); +} + +static int nouveau_channel_wait_fences(struct nouveau_channel *channel, + struct drm_file *file_priv, + struct drm_nouveau_gem_fence *fences, + unsigned int num_fences) +{ + unsigned int i; + int ret; + + for (i = 0; i < num_fences; i++) { + if (fences[i].flags & NOUVEAU_GEM_FENCE_WAIT) { + ret = nouveau_channel_wait_fence(channel, file_priv, +[i]); + if (ret < 0) + return ret; + } + } + + return 0; +} + +static struct nouveau_fence * +nouveau_channel_emit_fence(struct nouveau_channel *channel, + struct drm_file *file_priv, + struct drm_nouveau_gem_fence *f) +{ + struct nouveau_fence *fence; + int ret; + + ret = 
nouveau_fence_new(channel, false, ); + if (ret < 0) + return ERR_PTR(ret); + + if (f->flags & NOUVEAU_GEM_FENCE_FD) { + struct sync_file *file; + int fd; + + fd = get_unused_fd_flags(O_CLOEXEC); + if (fd < 0) { + ret = fd; + goto put; + } + + file = sync_file_create(>base); + if (!file) { + put_unused_fd(fd); + ret = -ENOMEM; + goto put; + } + + fd_install(fd, file->file); + f->handle = fd; + } else { + struct drm_syncobj *syncobj; + + ret = drm_syncobj_create(, 0, >base); + if (ret < 0) + goto put; + + ret = drm_syncobj_get_handle(file_priv, syncobj, >handle); + drm_syncobj_put(syncobj); + } + +put: + nouveau_fence_unref(); + return ERR_PTR(ret); +} + +static struct nouveau_fence * +nouveau_channel_emit_fences(struct nouveau_channel *channel, + struct drm_file *file_priv, + struct drm_nouveau_gem_fence *fences, + unsigned int num_fences) +{ + struct nouveau_fence *fence = NULL, *f; + unsigned int i; + int ret; + + for (i = 0; i < num_fences; i++) { + if (fences[i].flags & NOUVEAU_GEM_FENCE_EMIT) { + f = nouveau_channel_emit_fence(channel, file_priv, + [i]); + if (IS_ERR(f)) + return f; + + if (!fence) + fence = f; + } + } + + if (!fence) { + ret = nouveau_fence_new(channel, false, ); + if (ret) + fence = ERR_PTR(ret); + } else { + nouveau_fence_ref(fence); + } + + return fence; +} + static int __nouveau_gem_ioctl_pushbuf(struct drm_device *dev, struct drm_nouveau_gem_pushbuf2 *request,
[Nouveau] [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff
From: Thierry Reding Add a new NOUVEAU_GEM_PUSHBUF2 IOCTL that accepts and emits a sync fence FD from/to userspace if requested by the corresponding flags. Based heavily on work by Lauri Peltonen Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nouveau_drm.c | 1 + drivers/gpu/drm/nouveau/nouveau_gem.c | 79 +-- drivers/gpu/drm/nouveau/nouveau_gem.h | 2 + include/uapi/drm/nouveau_drm.h| 14 + 4 files changed, 92 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 22d246acc5e5..c9cb2648a28b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -1140,6 +1140,7 @@ nouveau_ioctls[] = { DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_FINI, nouveau_gem_ioctl_cpu_fini, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_PUSHBUF2, nouveau_gem_ioctl_pushbuf2, DRM_RENDER_ALLOW), }; long diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index 590e4c1d2e8a..039f244c0a00 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -24,6 +24,9 @@ * */ +#include +#include + #include "nouveau_drv.h" #include "nouveau_dma.h" #include "nouveau_fence.h" @@ -677,24 +680,30 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, return ret; } -int -nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, - struct drm_file *file_priv) +static int +__nouveau_gem_ioctl_pushbuf(struct drm_device *dev, + struct drm_nouveau_gem_pushbuf2 *request, + struct drm_file *file_priv) { struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv); struct nouveau_cli *cli = nouveau_cli(file_priv); struct nouveau_abi16_chan *temp; struct nouveau_drm *drm = nouveau_drm(dev); - struct drm_nouveau_gem_pushbuf *req = data; + struct 
drm_nouveau_gem_pushbuf *req = >base; struct drm_nouveau_gem_pushbuf_push *push; struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL; struct drm_nouveau_gem_pushbuf_bo *bo; struct nouveau_channel *chan = NULL; struct validate_op op; struct nouveau_fence *fence = NULL; + struct dma_fence *prefence = NULL; int i, j, ret = 0; bool do_reloc = false, sync = false; + /* check for unrecognized flags */ + if (request->flags & ~NOUVEAU_GEM_PUSHBUF_FLAGS) + return -EINVAL; + if (unlikely(!abi16)) return -ENOMEM; @@ -764,6 +773,15 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, goto out_prevalid; } + if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_WAIT) { + prefence = sync_file_get_fence(request->fence); + if (prefence) { + ret = nouveau_fence_sync(prefence, chan, true); + if (ret < 0) + goto out; + } + } + /* Apply any relocations that are required */ if (do_reloc) { if (!reloc) { @@ -865,7 +883,30 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, } } + if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_EMIT) { + struct sync_file *file; + int fd; + + fd = get_unused_fd_flags(O_CLOEXEC); + if (fd < 0) { + ret = fd; + goto out; + } + + file = sync_file_create(>base); + if (!file) { + put_unused_fd(fd); + goto out; + } + + fd_install(fd, file->file); + request->fence = fd; + } + out: + if (prefence) + dma_fence_put(prefence); + validate_fini(, chan, fence, bo); nouveau_fence_unref(); @@ -906,6 +947,27 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, return nouveau_abi16_put(abi16, ret); } +int +nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_nouveau_gem_pushbuf *request = data; + struct drm_nouveau_gem_pushbuf2 req; + int ret; + + memset(, 0, sizeof(req)); + memcpy(, request, sizeof(*request)); + + ret = __nouveau_gem_ioctl_pushbuf(dev, , file_priv); + + request->gart_available = req.base.gart_available; + request->vram_available = req.base.vram_available; + request->suffix1
[Nouveau] [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync()
From: Thierry Reding Turn nouveau_fence_sync() into a low-level helper that adds fence waits to the channel command stream. The new nouveau_bo_sync() helper replaces the previous nouveau_fence_sync() implementation. It passes each of the buffer object's fences to nouveau_fence_sync() in turn. This provides more fine-grained control over fences which is needed by subsequent patches for sync fd support. Heavily based on work by Lauri Peltonen . Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/dispnv04/crtc.c | 4 +- drivers/gpu/drm/nouveau/nouveau_bo.c| 38 +- drivers/gpu/drm/nouveau/nouveau_bo.h| 2 + drivers/gpu/drm/nouveau/nouveau_fence.c | 68 + drivers/gpu/drm/nouveau/nouveau_fence.h | 2 +- drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- 6 files changed, 57 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c index 6416b6907aeb..4af702d0d6bf 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c +++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c @@ -1117,7 +1117,7 @@ nv04_page_flip_emit(struct nouveau_channel *chan, spin_unlock_irqrestore(>event_lock, flags); /* Synchronize with the old framebuffer */ - ret = nouveau_fence_sync(old_bo, chan, false, false); + ret = nouveau_bo_sync(old_bo, chan, false, false); if (ret) goto fail; @@ -1183,7 +1183,7 @@ nv04_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb, goto fail_unpin; /* synchronise rendering channel with the kernel's channel */ - ret = nouveau_fence_sync(new_bo, chan, false, true); + ret = nouveau_bo_sync(new_bo, chan, false, true); if (ret) { ttm_bo_unreserve(_bo->bo); goto fail_unpin; diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 9140387f30dc..25ceabfa741c 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -574,6 +574,42 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo) PAGE_SIZE, DMA_FROM_DEVICE); } +int +nouveau_bo_sync(struct 
nouveau_bo *nvbo, struct nouveau_channel *chan, + bool exclusive, bool intr) +{ + struct dma_resv *resv = nvbo->bo.base.resv; + struct dma_resv_list *fobj; + struct dma_fence *fence; + int ret = 0, i; + + if (!exclusive) { + ret = dma_resv_reserve_shared(resv, 1); + if (ret < 0) + return ret; + } + + fobj = dma_resv_get_list(resv); + fence = dma_resv_get_excl(resv); + + if (fence && (!exclusive || !fobj || !fobj->shared_count)) + return nouveau_fence_sync(fence, chan, intr); + + if (!exclusive || !fobj) + return ret; + + for (i = 0; i < fobj->shared_count && !ret; ++i) { + fence = rcu_dereference_protected(fobj->shared[i], + dma_resv_held(resv)); + + ret = nouveau_fence_sync(fence, chan, intr); + if (ret < 0) + break; + } + + return ret; +} + int nouveau_bo_validate(struct nouveau_bo *nvbo, bool interruptible, bool no_wait_gpu) @@ -717,7 +753,7 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict, bool intr, } mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING); - ret = nouveau_fence_sync(nouveau_bo(bo), chan, true, intr); + ret = nouveau_bo_sync(nouveau_bo(bo), chan, true, intr); if (ret == 0) { ret = drm->ttm.move(chan, bo, >mem, new_reg); if (ret == 0) { diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h index aecb7481df0d..93d1706619a1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.h +++ b/drivers/gpu/drm/nouveau/nouveau_bo.h @@ -96,6 +96,8 @@ int nouveau_bo_validate(struct nouveau_bo *, bool interruptible, bool no_wait_gpu); void nouveau_bo_sync_for_device(struct nouveau_bo *nvbo); void nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo); +int nouveau_bo_sync(struct nouveau_bo *nvbo, struct nouveau_channel *channel, + bool exclusive, bool intr); /* TODO: submit equivalent to TTM generic API upstream? 
*/ static inline void __iomem * diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index e5dcbf67de7e..8e7550553584 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -339,66 +339,26 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr) } int -nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool exclusive, bool intr) +nouveau_fence_sync(s
[Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
From: Thierry Reding

Hi,

This series implements a new IOCTL to submit push buffers that can
optionally return a sync FD or sync object to userspace. This is useful
in cases where userspace wants to synchronize operations between the GPU
and another driver (such as KMS for display). Among other things this
allows extensions such as eglDupNativeFenceFDANDROID to be implemented.

Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
sync objects to be passed rather than only sync FDs. It also allows any
number of sync FDs/objects to be passed in or emitted. I think those are
useful features, but I left them in a separate patch in case everybody
else thinks that this won't be needed. If we decide to merge the new ABI
then patch 4 should be squashed into patch 3.

The corresponding userspace changes can be found here:

  libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
  mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/

I've verified that this works with kmscube's --atomic mode and Weston.

Thierry

Thierry Reding (6):
  drm/nouveau: Split nouveau_fence_sync()
  drm/nouveau: Add nouveau_fence_ref()
  drm/nouveau: Support fence FDs at kickoff
  drm/nouveau: Support sync FDs and syncobjs
  drm/nouveau: Support DMA fence arrays
  drm/nouveau: Allow zero pushbuffer submits

 drivers/gpu/drm/nouveau/dispnv04/crtc.c |   4 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c    |  38 ++-
 drivers/gpu/drm/nouveau/nouveau_bo.h    |   2 +
 drivers/gpu/drm/nouveau/nouveau_drm.c   |   1 +
 drivers/gpu/drm/nouveau/nouveau_fence.c |  90 +++---
 drivers/gpu/drm/nouveau/nouveau_fence.h |   3 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   | 402 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.h   |   2 +
 include/uapi/drm/nouveau_drm.h          |  23 ++
 9 files changed, 410 insertions(+), 155 deletions(-)

-- 
2.28.0
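To make the userspace side of the cover letter a bit more concrete, here is a small sketch of how a client might wait on a returned sync FD. It assumes the FD behaves like other sync_file FDs, which become readable (POLLIN) once the underlying fence has signalled; demo_sync_wait() is a hypothetical helper name and not part of libdrm or of this series.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: block until fd becomes readable.
 * Returns 0 when readable, -ETIME on timeout, a negative errno otherwise.
 */
static int demo_sync_wait(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* Retry if the call was interrupted before the timeout expired. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && (errno == EINTR || errno == EAGAIN));

	if (ret < 0)
		return -errno;
	if (ret == 0)
		return -ETIME;
	if (pfd.revents & (POLLERR | POLLNVAL))
		return -EINVAL;

	return 0;
}
```

The same call works on any pollable descriptor, so it can be tried out with a pipe before a sync FD from the new IOCTL is available.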
[Nouveau] [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref()
From: Thierry Reding

This is a simple wrapper that increments the reference count of the
backing DMA fence.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 9 +
 drivers/gpu/drm/nouveau/nouveau_fence.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 8e7550553584..8530c2684832 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -363,6 +363,15 @@ nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
 	return ret;
 }
 
+struct nouveau_fence *
+nouveau_fence_ref(struct nouveau_fence *fence)
+{
+	if (fence)
+		dma_fence_get(&fence->base);
+
+	return fence;
+}
+
 void
 nouveau_fence_unref(struct nouveau_fence **pfence)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 76cbf0c27a30..b8afd4b06445 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -19,6 +19,7 @@ struct nouveau_fence {
 
 int  nouveau_fence_new(struct nouveau_channel *, bool sysmem,
 		       struct nouveau_fence **);
+struct nouveau_fence *nouveau_fence_ref(struct nouveau_fence *);
 void nouveau_fence_unref(struct nouveau_fence **);
 
 int  nouveau_fence_emit(struct nouveau_fence *, struct nouveau_channel *);
-- 
2.28.0
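As a rough illustration of the pattern, the following is a plain C model of a NULL-tolerant "ref" helper paired with an "unref" that clears the caller's pointer. The demo_fence type and its non-atomic counter are stand-ins invented for this sketch; the kernel uses struct dma_fence with proper kref handling, so this only models the calling convention.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted fence (not thread-safe). */
struct demo_fence {
	int refcount;
};

/* Take a reference if fence is non-NULL and hand the pointer back. */
static struct demo_fence *demo_fence_ref(struct demo_fence *fence)
{
	if (fence)
		fence->refcount++;

	return fence;
}

/* Drop a reference, if any, and clear the caller's pointer. */
static void demo_fence_unref(struct demo_fence **pfence)
{
	if (*pfence)
		(*pfence)->refcount--;

	*pfence = NULL;
}
```

Returning the pointer allows the helper to be used directly in an assignment, for example `other = demo_fence_ref(fence);`, without a separate NULL check at the call site.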
[Nouveau] [PATCH] drm/nouveau: gr/gk20a: Use firmware version 0
From: Thierry Reding

Tegra firmware doesn't actually use any version numbers and passing -1
causes the existing firmware binaries not to be found. Use version 0 to
find the correct files.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
index ec330d791d15..e56880f3e3bd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
@@ -352,7 +352,7 @@ gk20a_gr_load(struct gf100_gr *gr, int ver, const struct gf100_gr_fwif *fwif)
 
 static const struct gf100_gr_fwif
 gk20a_gr_fwif[] = {
-	{ -1, gk20a_gr_load, &gk20a_gr },
+	{ 0, gk20a_gr_load, &gk20a_gr },
 	{}
 };
 
-- 
2.24.1
[Nouveau] [PATCH] drm/nouveau: gp10b: Use gp100_grctx and gp100_gr_zbc
From: Thierry Reding gp10b doesn't have all the registers that gp102_gr_zbc wants to access, which causes IBUS MMIO faults to occur. Avoid this by using the gp100 variants of grctx and gr_zbc. Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h | 1 + drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c | 2 +- drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c | 4 ++-- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h index fafdd0bbea9b..c4b2e6346684 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h @@ -245,6 +245,7 @@ void gp100_gr_init_fecs_exceptions(struct gf100_gr *); void gp100_gr_init_shader_exceptions(struct gf100_gr *, int, int); void gp100_gr_zbc_clear_color(struct gf100_gr *, int); void gp100_gr_zbc_clear_depth(struct gf100_gr *, int); +extern const struct gf100_gr_func_zbc gp100_gr_zbc; void gp102_gr_init_swdx_pes_mask(struct gf100_gr *); extern const struct gf100_gr_func_zbc gp102_gr_zbc; diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c index 9d0521ce309a..ef16fee61327 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp100.c @@ -62,7 +62,7 @@ gp100_gr_zbc_clear_depth(struct gf100_gr *gr, int zbc) gr->zbc_depth[zbc].format << ((znum % 4) * 7)); } -static const struct gf100_gr_func_zbc +const struct gf100_gr_func_zbc gp100_gr_zbc = { .clear_color = gp100_gr_zbc_clear_color, .clear_depth = gp100_gr_zbc_clear_depth, diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c index 303dceddd4a8..0b375b2587d2 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gp10b.c @@ -48,8 +48,8 @@ gp10b_gr = { .gpc_nr = 1, .tpc_nr = 2, .ppc_nr = 1, - .grctx = _grctx, - 
.zbc = _gr_zbc, + .grctx = _grctx, + .zbc = _gr_zbc, .sclass = { { -1, -1, FERMI_TWOD_A }, { -1, -1, KEPLER_INLINE_TO_MEMORY_B }, -- 2.24.1 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH] drm/nouveau: gm20b, gp10b: Fix Falcon bootstrapping
From: Thierry Reding

The low-level Falcon bootstrapping callbacks are expected to return 0 on
success or a negative error code on failure. However, the implementation
on Tegra returns the ID or mask of the Falcons that were bootstrapped on
success, thus breaking the calling code, which treats this as failure.

Fix this by making sure we only return 0 or a negative error code, just
like the code for discrete GPUs does.

Fixes: 86ce2a71539c ("drm/nouveau/flcn/cmdq: move command generation to subdevs")
Signed-off-by: Thierry Reding
---
Note that this is against Ben's tree, which should only hit linux-next
tomorrow, so most people should not be hitting this yet.

 drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c | 9 +++++++--
 drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c | 9 +++++++--
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
index 6d5a13e4a857..82571032a07d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
@@ -52,8 +52,13 @@ gm20b_pmu_acr_bootstrap_falcon(struct nvkm_falcon *falcon,
 	ret = nvkm_falcon_cmdq_send(pmu->hpq, &cmd.cmd.hdr,
 				    gm20b_pmu_acr_bootstrap_falcon_cb,
 				    &pmu->subdev, msecs_to_jiffies(1000));
-	if (ret >= 0 && ret != cmd.falcon_id)
-		ret = -EIO;
+	if (ret >= 0) {
+		if (ret != cmd.falcon_id)
+			ret = -EIO;
+		else
+			ret = 0;
+	}
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c
index 39c86bc56310..5b81c7320479 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp10b.c
@@ -52,8 +52,13 @@ gp10b_pmu_acr_bootstrap_multiple_falcons(struct nvkm_falcon *falcon, u32 mask)
 	ret = nvkm_falcon_cmdq_send(pmu->hpq, &cmd.cmd.hdr,
 				    gp10b_pmu_acr_bootstrap_multiple_falcons_cb,
 				    &pmu->subdev, msecs_to_jiffies(1000));
-	if (ret >= 0 && ret != cmd.falcon_mask)
-		ret = -EIO;
+	if (ret >= 0) {
+		if (ret != cmd.falcon_mask)
+			ret = -EIO;
+		else
+			ret = 0;
+	}
+
 	return ret;
 }
 
-- 
2.24.1

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
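The return-value convention that the Falcon bootstrapping patch restores can be sketched in plain C. This is a stand-alone illustration, not driver code: normalize_bootstrap_result() is a hypothetical helper name, and "raw" stands in for whatever the command-queue send call hands back.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of nouveau): map the raw result of a
 * bootstrap command onto the "0 or negative errno" convention.
 *
 *   raw < 0          the send failed, pass the error through unchanged
 *   raw == expected  the firmware acknowledged the requested ID or mask
 *   anything else    the firmware acknowledged something else: -EIO
 */
int normalize_bootstrap_result(int raw, int expected)
{
	if (raw < 0)
		return raw;

	return raw == expected ? 0 : -EIO;
}
```

With this shape a caller only ever has to test for a non-zero return, which is what the calling code described in the commit message appears to rely on.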
Re: [Nouveau] [PATCH v3 0/9] drm/nouveau: Various fixes for GP10B
On Tue, Dec 10, 2019 at 06:15:30PM +1000, Ben Skeggs wrote: > On Mon, 9 Dec 2019 at 22:00, Thierry Reding wrote: > > > > From: Thierry Reding > > > > Hi Ben, > > > > here's a revised subset of the patches I had sent out a couple of weeks > > ago. I've reworked the BAR2 accesses in the way that you had suggested, > > which at least for GP10B turned out to be fairly trivial to do. I have > > not looked in detail at this for GV11B yet, but a cursory look showed > > that BAR2 is accessed in more places, so the equivalent for GV11B might > > be a bit more involved. > > > > Other than that, not a lot has changed since then. I've added a couple > > of precursory patches to add IOMMU helper dummies for the case where > > IOMMU is disabled (as suggested by Ben Dooks). > > > > Joerg has given an Acked-by on the first two patches, so I think it'd be > > easiest if you picked those up into the Nouveau tree because of the > > build dependency of subsequent patches on them. > I've merged all the patches in my tree, after fixing a small build > issue on !TEGRA in the WPR config readout patch. Thanks for taking care of that. I'm going to need to add a non-Tegra configuration to my build scripts and make sure I run those. On a related note: have you ever considered submitting the Nouveau tree for linux-next? That'd be very convenient for people like me working on multiple patch series at the same time. In fact I've got another set of patches against Nouveau that I want to send out after you've merged these. Technically I would need to rebase them on your tree (since there may be dependencies on this set), but that means I need to pull in both your tree and linux-next if I want to keep up to date on all fronts and test all patches in my local tree at the same time. I'm not sure how well that would fit into your workflow. It's typically not more effort than setting up a permanent branch that you can push to whenever there's something that's ready for broader consumption. 
Beyond the initial setup (which is really not more complicated than sending Stephen an email with a URL and the branch name), it's really quite simple and goes a long way to get broad testing early on. And it's especially handy to catch potential conflicts with cross-subsystem changes like the IOMMU patches in this series. Thierry > > Thierry Reding (9): > > iommu: Document iommu_fwspec::flags field > > iommu: Add dummy dev_iommu_fwspec_get() helper > > drm/nouveau: fault: Add support for GP10B > > drm/nouveau: tegra: Do not try to disable PCI device > > drm/nouveau: tegra: Avoid pulsing reset twice > > drm/nouveau: tegra: Set clock rate if not set > > drm/nouveau: secboot: Read WPR configuration from GPU registers > > drm/nouveau: gp10b: Add custom L2 cache implementation > > drm/nouveau: gp10b: Use correct copy engine > > > > .../drm/nouveau/include/nvkm/subdev/fault.h | 1 + > > .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + > > drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +- > > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 6 +- > > .../drm/nouveau/nvkm/engine/device/tegra.c| 24 -- > > .../gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 1 + > > .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- > > .../gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 17 ++-- > > .../gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c | 53 > > .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + > > .../gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 10 +++ > > .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + > > .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 65 +++ > > .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + > > .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- > > .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- > > .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- > > include/linux/iommu.h | 47 ++- > > 18 files changed, 249 insertions(+), 72 deletions(-) > > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c > > create mode 100644 
drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > > -- > > 2.23.0 > > > > ___ > > Nouveau mailing list > > Nouveau@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/nouveau signature.asc Description: PGP signature ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH v3 8/9] drm/nouveau: gp10b: Add custom L2 cache implementation
From: Thierry Reding There are extra registers that need to be programmed to make the level 2 cache work on GP10B, such as the stream ID register that is used when an SMMU is used to translate memory addresses. Signed-off-by: Thierry Reding --- Changes in v2: - remove IOMMU_API protection to increase compile coverage - relies on dummy dev_iommu_fwspec_get() helper .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 65 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + 5 files changed, 70 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h index 644d527c3b96..d76f60d7d29a 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h @@ -40,4 +40,5 @@ int gm107_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gm200_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp102_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); +int gp10b_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index b061df138142..231ec0073af3 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2380,7 +2380,7 @@ nv13b_chipset = { .fuse = gm107_fuse_new, .ibus = gp10b_ibus_new, .imem = gk20a_instmem_new, - .ltc = gp102_ltc_new, + .ltc = gp10b_ltc_new, .mc = gp10b_mc_new, .mmu = gp10b_mmu_new, .secboot = gp10b_secboot_new, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild index 
2b6d36ea7067..728d75010847 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild @@ -6,3 +6,4 @@ nvkm-y += nvkm/subdev/ltc/gm107.o nvkm-y += nvkm/subdev/ltc/gm200.o nvkm-y += nvkm/subdev/ltc/gp100.o nvkm-y += nvkm/subdev/ltc/gp102.o +nvkm-y += nvkm/subdev/ltc/gp10b.o diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c new file mode 100644 index ..c0063c7caa50 --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c @@ -0,0 +1,65 @@ +/* + * Copyright (c) 2019 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: Thierry Reding + */ + +#include "priv.h" + +static void +gp10b_ltc_init(struct nvkm_ltc *ltc) +{ + struct nvkm_device *device = ltc->subdev.device; + struct iommu_fwspec *spec; + + nvkm_wr32(device, 0x17e27c, ltc->ltc_nr); + nvkm_wr32(device, 0x17e000, ltc->ltc_nr); + nvkm_wr32(device, 0x100800, ltc->ltc_nr); + + spec = dev_iommu_fwspec_get(device->dev); + if (spec) { + u32 sid = spec->ids[0] & 0x; + + /* stream ID */ + nvkm_wr32(device, 0x16, sid << 2); + } +} + +static const struct nvkm_ltc_func +gp10b_ltc = { + .oneinit = gp100_ltc_oneinit, + .init = gp10b_ltc_init, + .intr = gp100_ltc_intr, + .cbc_clear = gm107_ltc_cbc_clear, + .cbc_wait = gm107_ltc_cbc_wait, + .zbc = 16, + .zbc_clear_color = gm107_ltc_zbc_clear_color, + .zbc_clear_depth = gm107_ltc_zbc_clear_depth, + .zbc_clear_stencil = gp102_ltc_zbc_clear_stencil, + .invalidate = gf100_ltc_invalidate, + .flush = gf100_ltc_flush, +}; + +int +gp10b_ltc_new(str
[Nouveau] [PATCH v3 9/9] drm/nouveau: gp10b: Use correct copy engine
From: Thierry Reding

gp10b uses the new engine enumeration mechanism introduced in the Pascal
architecture. As a result, the copy engine, which used to be at index 2
for prior Tegra GPU instantiations, has now moved to index 0. Fix up the
index and also use the gp100 variant of the copy engine class because on
gp10b the PASCAL_DMA_COPY_B class is not supported.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index 231ec0073af3..eba450e689b2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2387,7 +2387,7 @@ nv13b_chipset = {
 	.pmu = gm20b_pmu_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[2] = gp102_ce_new,
+	.ce[0] = gp100_ce_new,
 	.dma = gf119_dma_new,
 	.fifo = gp10b_fifo_new,
 	.gr = gp10b_gr_new,
-- 
2.23.0
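The one-line change in the copy engine patch relies on indexed designated initializers: only the named array slot is populated and every other slot stays NULL. A minimal sketch of that mechanism follows, with made-up names (chipset_sketch, gp100_ce_sketch, count_copy_engines) and a constructor signature that is far simpler than the real nvkm one.

```c
#include <stddef.h>

/* Hypothetical constructor type; the real nvkm constructors take more arguments. */
typedef int (*engine_ctor)(void);

int gp100_ce_sketch(void)
{
	return 100;
}

struct chipset_sketch {
	const char *name;
	engine_ctor ce[3];	/* one slot per copy-engine index */
};

/* Only index 0 is named, so ce[1] and ce[2] are implicitly NULL. */
const struct chipset_sketch gp10b_sketch = {
	.name = "GP10B",
	.ce[0] = gp100_ce_sketch,
};

/* Count how many copy-engine slots have a constructor attached. */
int count_copy_engines(const struct chipset_sketch *chip)
{
	int n = 0;

	for (size_t i = 0; i < sizeof(chip->ce) / sizeof(chip->ce[0]); i++)
		if (chip->ce[i])
			n++;

	return n;
}
```

Moving the initializer from one index to another therefore changes which engine instance gets created without touching any other table entry.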
[Nouveau] [PATCH v3 7/9] drm/nouveau: secboot: Read WPR configuration from GPU registers
From: Thierry Reding The GPUs found on Tegra SoCs have registers that can be used to read the WPR configuration. Use these registers instead of reaching into the memory controller's register space to read the same information. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- 3 files changed, 53 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h index 62c5e162099a..280b1448df88 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h @@ -41,6 +41,6 @@ int gm200_secboot_run_blob(struct nvkm_secboot *, struct nvkm_gpuobj *, struct nvkm_falcon *); /* Tegra-only */ -int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *, u32); +int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c index df8b919dcf09..f8a543122219 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c @@ -23,39 +23,65 @@ #include "acr.h" #include "gm200.h" -#define TEGRA210_MC_BASE 0x70019000 - #ifdef CONFIG_ARCH_TEGRA -#define MC_SECURITY_CARVEOUT2_CFG0 0xc58 -#define MC_SECURITY_CARVEOUT2_BOM_00xc5c -#define MC_SECURITY_CARVEOUT2_BOM_HI_0 0xc60 -#define MC_SECURITY_CARVEOUT2_SIZE_128K0xc64 -#define TEGRA_MC_SECURITY_CARVEOUT_CFG_LOCKED (1 << 1) /** * gm20b_secboot_tegra_read_wpr() - read the WPR registers on Tegra * - * On dGPU, we can manage the WPR region ourselves, but on Tegra the WPR region - * is reserved from system memory by the bootloader and irreversibly locked. - * This function reads the address and size of the pre-configured WPR region. 
+ * On dGPU, we can manage the WPR region ourselves, but on Tegra this region + * is allocated from system memory by the secure firmware. The region is then + * marked as a "secure carveout" and irreversibly locked. Furthermore, the WPR + * secure carveout is also configured to be sent to the GPU via a dedicated + * serial bus between the memory controller and the GPU. The GPU requests this + * information upon leaving reset and exposes it through a FIFO register at + * offset 0x100cd4. + * + * The FIFO register's lower 4 bits can be used to set the read index into the + * FIFO. After each read of the FIFO register, the read index is incremented. + * + * Indices 2 and 3 contain the lower and upper addresses of the WPR. These are + * stored in units of 256 B. The WPR is inclusive of both addresses. + * + * Unfortunately, for some reason the WPR info register doesn't contain the + * correct values for the secure carveout. It seems like the upper address is + * always too small by 128 KiB - 1. Given that the secure carvout size in the + * memory controller configuration is specified in units of 128 KiB, it's + * possible that the computation of the upper address of the WPR is wrong and + * causes this difference. 
*/ int -gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb, u32 mc_base) +gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb) { + struct nvkm_device *device = gsb->base.subdev.device; struct nvkm_secboot *sb = >base; - void __iomem *mc; - u32 cfg; + u64 base, limit; + u32 value; - mc = ioremap(mc_base, 0xd00); - if (!mc) { - nvkm_error(>subdev, "Cannot map Tegra MC registers\n"); - return -ENOMEM; - } - sb->wpr_addr = ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_0) | - ((u64)ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_HI_0) << 32); - sb->wpr_size = ioread32_native(mc + MC_SECURITY_CARVEOUT2_SIZE_128K) - << 17; - cfg = ioread32_native(mc + MC_SECURITY_CARVEOUT2_CFG0); - iounmap(mc); + /* set WPR info register to point at WPR base address register */ + value = nvkm_rd32(device, 0x100cd4); + value &= ~0xf; + value |= 0x2; + nvkm_wr32(device, 0x100cd4, value); + + /* read base address */ + value = nvkm_rd32(device, 0x100cd4); + base = (u64)(value >> 4) << 12; + + /* read limit */ + value = nvkm_rd32(device, 0x100cd4); + limit = (u64)(value >> 4) << 12; + + /* +* The upper address of the WPR seems to be computed wrongly and is +* actually SZ_128K - 1 bytes lower than it should be. Adjust the +* value accordingly. +*/ + limit += SZ_128K - 1; + + sb->wpr_size = limit - ba
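The address arithmetic in the WPR patch can be checked in isolation. The sketch below only mirrors the shifts and the SZ_128K - 1 adjustment visible in the diff; the function names are hypothetical, and the inclusive size computation is an assumption based on the comment that the WPR is inclusive of both addresses, since the corresponding line of the patch is cut off in this archive.

```c
#include <stdint.h>

#define SZ_128K_SKETCH 0x20000u

/*
 * Decode one read of the WPR info register. The low 4 bits hold the
 * FIFO read index; the shifts below are equivalent to masking those
 * bits off and multiplying the rest by 256.
 */
uint64_t wpr_decode_addr(uint32_t value)
{
	return (uint64_t)(value >> 4) << 12;
}

/* Apply the adjustment from the patch: the reported limit is 128 KiB - 1 too low. */
uint64_t wpr_fixup_limit(uint64_t limit)
{
	return limit + SZ_128K_SKETCH - 1;
}

/* Assumption: base and limit are both inclusive, hence the "+ 1". */
uint64_t wpr_size_inclusive(uint64_t base, uint64_t limit)
{
	return limit - base + 1;
}
```

For example, a register value of 0x00800002 decodes to 0x80000000; if the limit register decodes to the same address, the adjusted limit is 0x8001ffff and the inclusive size comes out as exactly 128 KiB.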
[Nouveau] [PATCH v3 0/9] drm/nouveau: Various fixes for GP10B
From: Thierry Reding Hi Ben, here's a revised subset of the patches I had sent out a couple of weeks ago. I've reworked the BAR2 accesses in the way that you had suggested, which at least for GP10B turned out to be fairly trivial to do. I have not looked in detail at this for GV11B yet, but a cursory look showed that BAR2 is accessed in more places, so the equivalent for GV11B might be a bit more involved. Other than that, not a lot has changed since then. I've added a couple of precursory patches to add IOMMU helper dummies for the case where IOMMU is disabled (as suggested by Ben Dooks). Joerg has given an Acked-by on the first two patches, so I think it'd be easiest if you picked those up into the Nouveau tree because of the build dependency of subsequent patches on them. Thierry Thierry Reding (9): iommu: Document iommu_fwspec::flags field iommu: Add dummy dev_iommu_fwspec_get() helper drm/nouveau: fault: Add support for GP10B drm/nouveau: tegra: Do not try to disable PCI device drm/nouveau: tegra: Avoid pulsing reset twice drm/nouveau: tegra: Set clock rate if not set drm/nouveau: secboot: Read WPR configuration from GPU registers drm/nouveau: gp10b: Add custom L2 cache implementation drm/nouveau: gp10b: Use correct copy engine .../drm/nouveau/include/nvkm/subdev/fault.h | 1 + .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +- .../gpu/drm/nouveau/nvkm/engine/device/base.c | 6 +- .../drm/nouveau/nvkm/engine/device/tegra.c| 24 -- .../gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 17 ++-- .../gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c | 53 .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 10 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 65 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + 
.../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- include/linux/iommu.h | 47 ++- 18 files changed, 249 insertions(+), 72 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH v3 4/9] drm/nouveau: tegra: Do not try to disable PCI device
From: Thierry Reding

When Nouveau is instantiated on top of a platform device, the dev->pdev
field will be NULL and calling pci_disable_device() will crash. Move the
PCI disabling code to the PCI specific driver removal code.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 2cd83849600f..b65ae817eabf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -715,7 +715,6 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
 void
 nouveau_drm_device_remove(struct drm_device *dev)
 {
-	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nvkm_client *client;
 	struct nvkm_device *device;
@@ -727,7 +726,6 @@ nouveau_drm_device_remove(struct drm_device *dev)
 	device = nvkm_device_find(client->device);
 
 	nouveau_drm_device_fini(dev);
-	pci_disable_device(pdev);
 	drm_dev_put(dev);
 	nvkm_device_del(&device);
 }
@@ -738,6 +736,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
 	nouveau_drm_device_remove(dev);
+	pci_disable_device(pdev);
 }
 
 static int
-- 
2.23.0
[Nouveau] [PATCH v3 5/9] drm/nouveau: tegra: Avoid pulsing reset twice
From: Thierry Reding

When the GPU powergate is controlled by a generic power domain provider,
the reset will automatically be asserted and deasserted as part of the
power-ungating procedure.

On some Jetson TX2 boards, doing an additional assert and deassert of
the GPU outside of the power-ungate procedure can cause the GPU to go
into a bad state where the memory interface can no longer access system
memory.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 0e372a190d3f..747a775121cf 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -52,18 +52,18 @@ nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 	clk_set_rate(tdev->clk_pwr, 204000000);
 	udelay(10);
 
-	reset_control_assert(tdev->rst);
-	udelay(10);
-
 	if (!tdev->pdev->dev.pm_domain) {
+		reset_control_assert(tdev->rst);
+		udelay(10);
+
 		ret = tegra_powergate_remove_clamping(TEGRA_POWERGATE_3D);
 		if (ret)
 			goto err_clamp;
 		udelay(10);
-	}
 
-	reset_control_deassert(tdev->rst);
-	udelay(10);
+		reset_control_deassert(tdev->rst);
+		udelay(10);
+	}
 
 	return 0;
 
-- 
2.23.0
[Nouveau] [PATCH v3 6/9] drm/nouveau: tegra: Set clock rate if not set
From: Thierry Reding

If the GPU clock has not had a rate set, initialize it to the maximum
clock rate to make sure it does run.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 747a775121cf..d0d52c1d4aee 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -279,6 +279,7 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		      struct nvkm_device **pdevice)
 {
 	struct nvkm_device_tegra *tdev;
+	unsigned long rate;
 	int ret;
 
 	if (!(tdev = kzalloc(sizeof(*tdev), GFP_KERNEL)))
@@ -307,6 +308,17 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		goto free;
 	}
 
+	rate = clk_get_rate(tdev->clk);
+	if (rate == 0) {
+		ret = clk_set_rate(tdev->clk, ULONG_MAX);
+		if (ret < 0)
+			goto free;
+
+		rate = clk_get_rate(tdev->clk);
+
+		dev_dbg(&pdev->dev, "GPU clock set to %lu\n", rate);
+	}
+
 	if (func->require_ref_clk)
 		tdev->clk_ref = devm_clk_get(&pdev->dev, "ref");
 	if (IS_ERR(tdev->clk_ref)) {
-- 
2.23.0
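The "request ULONG_MAX and read back what you got" idiom from the clock rate patch can be tried outside the kernel with a toy clock model. Everything below is hypothetical (clk_sketch and the *_sketch functions are not the kernel clk API), and the clamping to max_rate is an assumption about how a clock provider would treat a request it cannot satisfy.

```c
#include <limits.h>

/* Toy clock: the current rate plus the highest rate the hardware supports. */
struct clk_sketch {
	unsigned long rate;
	unsigned long max_rate;
};

unsigned long clk_get_rate_sketch(const struct clk_sketch *clk)
{
	return clk->rate;
}

/* Assumed behaviour: a request above max_rate is clamped to max_rate. */
int clk_set_rate_sketch(struct clk_sketch *clk, unsigned long rate)
{
	clk->rate = rate > clk->max_rate ? clk->max_rate : rate;
	return 0;
}

/* If no rate was ever set, ask for the maximum; return the resulting rate. */
unsigned long ensure_clock_running(struct clk_sketch *clk)
{
	if (clk_get_rate_sketch(clk) == 0)
		clk_set_rate_sketch(clk, ULONG_MAX);

	return clk_get_rate_sketch(clk);
}
```

A clock that already has a rate is left alone, which matches the "if not set" condition in the subject line.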
[Nouveau] [PATCH v3 1/9] iommu: Document iommu_fwspec::flags field
From: Thierry Reding

When this field was added in commit 5702ee24182f ("ACPI/IORT: Check ATS
capability in root complex nodes"), the kerneldoc comment wasn't updated
at the same time.

Acked-by: Joerg Roedel
Signed-off-by: Thierry Reding
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f2223cbb5fd5..216e919875ea 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -570,6 +570,7 @@ struct iommu_group *fsl_mc_device_group(struct device *dev);
  * @ops: ops for this device's IOMMU
  * @iommu_fwnode: firmware handle for this device's IOMMU
  * @iommu_priv: IOMMU driver private data for this device
+ * @flags: IOMMU flags associated with this device
  * @num_ids: number of associated device IDs
  * @ids: IDs which this device may present to the IOMMU
  */
-- 
2.23.0
[Nouveau] [PATCH v3 3/9] drm/nouveau: fault: Add support for GP10B
From: Thierry Reding There is no BAR2 on GP10B and there is no need to map through BAR2 because all memory is shared between the GPU and the CPU. Add a custom implementation of the fault sub-device that uses nvkm_memory_addr() instead of nvkm_memory_bar2() to return the address of a pinned fault buffer. Signed-off-by: Thierry Reding --- .../drm/nouveau/include/nvkm/subdev/fault.h | 1 + .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 17 -- .../gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c | 53 +++ .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 10 8 files changed, 80 insertions(+), 7 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h index 97322f95b3ee..a513c16ab105 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h @@ -31,6 +31,7 @@ struct nvkm_fault_data { }; int gp100_fault_new(struct nvkm_device *, int, struct nvkm_fault **); +int gp10b_fault_new(struct nvkm_device *, int, struct nvkm_fault **); int gv100_fault_new(struct nvkm_device *, int, struct nvkm_fault **); int tu102_fault_new(struct nvkm_device *, int, struct nvkm_fault **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index c3c7159f3411..b061df138142 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2375,7 +2375,7 @@ nv13b_chipset = { .name = "GP10B", .bar = gm20b_bar_new, .bus = gf100_bus_new, - .fault = gp100_fault_new, + .fault = gp10b_fault_new, .fb = gp10b_fb_new, .fuse = gm107_fuse_new, .ibus = gp10b_ibus_new, diff --git 
a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild index 53b9d638f2c8..d65ec719f153 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild @@ -2,5 +2,6 @@ nvkm-y += nvkm/subdev/fault/base.o nvkm-y += nvkm/subdev/fault/user.o nvkm-y += nvkm/subdev/fault/gp100.o +nvkm-y += nvkm/subdev/fault/gp10b.o nvkm-y += nvkm/subdev/fault/gv100.o nvkm-y += nvkm/subdev/fault/tu102.o diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c index ca251560d3e0..1c4b852b26c3 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c @@ -108,7 +108,7 @@ nvkm_fault_oneinit_buffer(struct nvkm_fault *fault, int id) return ret; /* Pin fault buffer in BAR2. */ - buffer->addr = nvkm_memory_bar2(buffer->mem); + buffer->addr = fault->func->buffer.pin(buffer); if (buffer->addr == ~0ULL) return -EFAULT; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c index 4f3c4e091117..f6b189cc4330 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c @@ -21,25 +21,26 @@ */ #include "priv.h" +#include #include #include -static void +void gp100_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable) { struct nvkm_device *device = buffer->fault->subdev.device; nvkm_mc_intr_mask(device, NVKM_SUBDEV_FAULT, enable); } -static void +void gp100_fault_buffer_fini(struct nvkm_fault_buffer *buffer) { struct nvkm_device *device = buffer->fault->subdev.device; nvkm_mask(device, 0x002a70, 0x0001, 0x); } -static void +void gp100_fault_buffer_init(struct nvkm_fault_buffer *buffer) { struct nvkm_device *device = buffer->fault->subdev.device; @@ -48,7 +49,12 @@ gp100_fault_buffer_init(struct nvkm_fault_buffer *buffer) nvkm_mask(device, 0x002a70, 0x0001, 0x0001); } 
-static void +u64 gp100_fault_buffer_pin(struct nvkm_fault_buffer *buffer) +{ + return nvkm_memory_bar2(buffer->mem); +} + +void gp100_fault_buffer_info(struct nvkm_fault_buffer *buffer) { buffer->entries = nvkm_rd32(buffer->fault->subdev.device, 0x002a78); @@ -56,7 +62,7 @@ gp100_fault_buffer_info(struct nvkm_fault_buffer *buffer) buffer->put = 0x002a80; } -static void +void gp100_fault_intr(struct nvkm_fault *fault) { nvkm_event_send(>event, 1, 0, NULL, 0); @@ -68,6 +74,7 @@ gp100_fault = { .buffer.nr = 1, .buffer.entry_size = 32, .buffer.info = gp100_fault_buffe
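The fault patch replaces a hard-coded BAR2 lookup with a per-chip "pin" callback. A reduced sketch of that indirection is shown here; all types and names (buffer_sketch, fault_func_sketch, pin_via_bar2, pin_direct, buffer_oneinit_sketch) are invented for illustration, and the direct variant merely stands in for what the commit message says nvkm_memory_addr() is used for.

```c
#include <errno.h>
#include <stdint.h>

/* Toy buffer: one address as seen through BAR2, one plain memory address. */
struct buffer_sketch {
	uint64_t bar2_addr;	/* ~0ULL when there is no BAR2 mapping */
	uint64_t mem_addr;
};

/* Per-chip function table with a single hook for pinning the buffer. */
struct fault_func_sketch {
	uint64_t (*pin)(const struct buffer_sketch *buffer);
};

uint64_t pin_via_bar2(const struct buffer_sketch *buffer)
{
	return buffer->bar2_addr;
}

uint64_t pin_direct(const struct buffer_sketch *buffer)
{
	return buffer->mem_addr;
}

const struct fault_func_sketch gp100_like = { .pin = pin_via_bar2 };
const struct fault_func_sketch gp10b_like = { .pin = pin_direct };

/* Common code no longer knows about BAR2; it only calls the hook. */
int buffer_oneinit_sketch(const struct fault_func_sketch *func,
			  const struct buffer_sketch *buffer)
{
	uint64_t addr = func->pin(buffer);

	return addr == ~0ULL ? -EFAULT : 0;
}
```

On a chip without BAR2 the BAR2-based hook fails the ~0ULL check, while the direct hook succeeds with the same buffer, which is the situation the commit message describes for GP10B.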
[Nouveau] [PATCH v3 2/9] iommu: Add dummy dev_iommu_fwspec_get() helper
From: Thierry Reding This dummy implementation is useful to avoid a dependency on the IOMMU_API Kconfig symbol in drivers that can optionally use the IOMMU API. In order to fully use this, also move the struct iommu_fwspec definition out of the IOMMU_API protected region. Acked-by: Joerg Roedel Signed-off-by: Thierry Reding --- Changes in v3: - remove duplicate struct iommu_fwspec definition include/linux/iommu.h | 48 +++ 1 file changed, 26 insertions(+), 22 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 216e919875ea..bb28453bb09c 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -190,6 +190,27 @@ struct iommu_sva_ops { iommu_mm_exit_handler_t mm_exit; }; +/** + * struct iommu_fwspec - per-device IOMMU instance data + * @ops: ops for this device's IOMMU + * @iommu_fwnode: firmware handle for this device's IOMMU + * @iommu_priv: IOMMU driver private data for this device + * @flags: IOMMU flags associated with this device + * @num_ids: number of associated device IDs + * @ids: IDs which this device may present to the IOMMU + */ +struct iommu_fwspec { + const struct iommu_ops *ops; + struct fwnode_handle*iommu_fwnode; + void*iommu_priv; + u32 flags; + unsigned intnum_ids; + u32 ids[1]; +}; + +/* ATS is supported */ +#define IOMMU_FWSPEC_PCI_RC_ATS(1 << 0) + #ifdef CONFIG_IOMMU_API /** @@ -565,27 +586,6 @@ extern struct iommu_group *generic_device_group(struct device *dev); /* FSL-MC device grouping function */ struct iommu_group *fsl_mc_device_group(struct device *dev); -/** - * struct iommu_fwspec - per-device IOMMU instance data - * @ops: ops for this device's IOMMU - * @iommu_fwnode: firmware handle for this device's IOMMU - * @iommu_priv: IOMMU driver private data for this device - * @flags: IOMMU flags associated with this device - * @num_ids: number of associated device IDs - * @ids: IDs which this device may present to the IOMMU - */ -struct iommu_fwspec { - const struct iommu_ops *ops; - struct 
fwnode_handle*iommu_fwnode; - void*iommu_priv; - u32 flags; - unsigned intnum_ids; - u32 ids[1]; -}; - -/* ATS is supported */ -#define IOMMU_FWSPEC_PCI_RC_ATS(1 << 0) - /** * struct iommu_sva - handle to a device-mm bond */ @@ -634,7 +634,6 @@ int iommu_sva_get_pasid(struct iommu_sva *handle); struct iommu_ops {}; struct iommu_group {}; -struct iommu_fwspec {}; struct iommu_device {}; struct iommu_fault_param {}; struct iommu_iotlb_gather {}; @@ -980,6 +979,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode) return NULL; } +static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev) +{ + return NULL; +} + static inline bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat) { -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
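The dummy-helper pattern from the IOMMU patch, a static inline stub that returns NULL when the feature is configured out, can be reduced to a few lines. The sketch uses invented names throughout (HAVE_IOMMU_SKETCH, device_sketch, fwspec_sketch, fwspec_get_sketch, stream_id_or_default) and is only meant to show why the caller needs no #ifdef of its own.

```c
#include <stddef.h>

/* The structure is defined unconditionally so callers can always name its fields. */
struct fwspec_sketch {
	unsigned int num_ids;
	unsigned int ids[1];
};

struct device_sketch {
	struct fwspec_sketch *fwspec;
};

#ifdef HAVE_IOMMU_SKETCH
/* Real lookup, only built when the (hypothetical) feature is enabled. */
static inline struct fwspec_sketch *fwspec_get_sketch(struct device_sketch *dev)
{
	return dev->fwspec;
}
#else
/* Dummy: behaves as if no device ever had firmware-described IOMMU data. */
static inline struct fwspec_sketch *fwspec_get_sketch(struct device_sketch *dev)
{
	(void)dev;
	return NULL;
}
#endif

/* Caller compiles identically in both configurations. */
int stream_id_or_default(struct device_sketch *dev, int fallback)
{
	struct fwspec_sketch *spec = fwspec_get_sketch(dev);

	return spec ? (int)spec->ids[0] : fallback;
}
```

Because the stub returns NULL, the caller's existing NULL check doubles as the "feature disabled" path, which is how the GP10B L2 cache patch in this series can drop its IOMMU_API protection.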
Re: [Nouveau] [PATCH v2 0/9] drm/nouveau: Various fixes for GP10B
On Sat, Nov 02, 2019 at 06:56:28PM +0100, Thierry Reding wrote: > From: Thierry Reding > > Hi Ben, > > here's a revised subset of the patches I had sent out a couple of weeks > ago. I've reworked the BAR2 accesses in the way that you had suggested, > which at least for GP10B turned out to be fairly trivial to do. I have > not looked in detail at this for GV11B yet, but a cursory look showed > that BAR2 is accessed in more places, so the equivalent for GV11B might > be a bit more involved. > > Other than that, not a lot has changed since then. I've added a couple > of precursory patches to add IOMMU helper dummies for the case where > IOMMU is disabled (as suggested by Ben Dooks). > > Joerg, it'd be great if you could give an Acked-by on those two patches > so that Ben can pick them both up into the Nouveau tree. Alternatively I > can put them both into a stable branch and send a pull request to both > of you. Or yet another alternative would be for Joerg to apply them now > and Ben to wait for v5.5-rc1 until he picks up the rest. All of those > work for me. Hi Joerg, Ben, do you guys have any further comments on this series? I've got an updated patch to silence the warning that the kbuild robot flagged, so if there are no other comments I can send a final v3 of the series. Thierry signature.asc Description: PGP signature ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/3] drm/nouveau/kms/nv50-: Call outp_atomic_check_view() before handling PBN
On Fri, Nov 15, 2019 at 04:07:18PM -0500, Lyude Paul wrote: > Since nv50_outp_atomic_check_view() can set crtc_state->mode_changed, we > probably should be calling it before handling any PBN changes. Just a > precaution. > > Signed-off-by: Lyude Paul > Fixes: 232c9eec417a ("drm/nouveau: Use atomic VCPI helpers for MST") > Cc: Ben Skeggs > Cc: Daniel Vetter > Cc: David Airlie > Cc: Jerry Zuo > Cc: Harry Wentland > Cc: Juston Li > Cc: Sean Paul > Cc: Laurent Pinchart > Cc: # v5.1+ > --- > drivers/gpu/drm/nouveau/dispnv50/disp.c | 44 ++--- > 1 file changed, 24 insertions(+), 20 deletions(-) Looks reasonable: Reviewed-by: Thierry Reding > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c > b/drivers/gpu/drm/nouveau/dispnv50/disp.c > index 549486f1d937..6327aaf37c08 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c > @@ -770,32 +770,36 @@ nv50_msto_atomic_check(struct drm_encoder *encoder, > struct nv50_mstm *mstm = mstc->mstm; > struct nv50_head_atom *asyh = nv50_head_atom(crtc_state); > int slots; > + int ret; > > - if (crtc_state->mode_changed || crtc_state->connectors_changed) { > - /* > - * When restoring duplicated states, we need to make sure that > - * the bw remains the same and avoid recalculating it, as the > - * connector's bpc may have changed after the state was > - * duplicated > - */ > - if (!state->duplicated) { > - const int bpp = connector->display_info.bpc * 3; > - const int clock = crtc_state->adjusted_mode.clock; > + ret = nv50_outp_atomic_check_view(encoder, crtc_state, conn_state, > + mstc->native); > + if (ret) > + return ret; > > - asyh->dp.pbn = drm_dp_calc_pbn_mode(clock, bpp); > - } > + if (!crtc_state->mode_changed && !crtc_state->connectors_changed) > + return 0; > > - slots = drm_dp_atomic_find_vcpi_slots(state, >mgr, > - mstc->port, > - asyh->dp.pbn); > - if (slots < 0) > - return slots; > + /* > + * When restoring duplicated states, we need to make sure that the bw > + * remains 
the same and avoid recalculating it, as the connector's bpc > + * may have changed after the state was duplicated > + */ > + if (!state->duplicated) { > + const int bpp = connector->display_info.bpc * 3; > + const int clock = crtc_state->adjusted_mode.clock; > > - asyh->dp.tu = slots; > + asyh->dp.pbn = drm_dp_calc_pbn_mode(clock, bpp); > } > > - return nv50_outp_atomic_check_view(encoder, crtc_state, conn_state, > -mstc->native); > + slots = drm_dp_atomic_find_vcpi_slots(state, >mgr, mstc->port, > + asyh->dp.pbn); > + if (slots < 0) > + return slots; > + > + asyh->dp.tu = slots; > + > + return 0; > } > > static void > -- > 2.21.0 > > ___ > Nouveau mailing list > Nouveau@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/nouveau signature.asc Description: PGP signature ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 2/3] drm/nouveau/kms/nv50-: Store the bpc we're using in nv50_head_atom
On Fri, Nov 15, 2019 at 04:07:19PM -0500, Lyude Paul wrote: > In order to be able to use bpc values that are different from what the > connector reports, we want to be able to store the bpc value we decide > on using for an atomic state in nv50_head_atom and refer to that instead > of simply using the value that the connector reports throughout the > whole atomic check phase and commit phase. This will let us (eventually) > implement the max bpc connector property, and will also be needed for > limiting the bpc we use on MST displays to 8 in the next commit. > > Signed-off-by: Lyude Paul > Fixes: 232c9eec417a ("drm/nouveau: Use atomic VCPI helpers for MST") > Cc: Ben Skeggs > Cc: Daniel Vetter > Cc: David Airlie > Cc: Jerry Zuo > Cc: Harry Wentland > Cc: Juston Li > Cc: Sean Paul > Cc: Laurent Pinchart > Cc: # v5.1+ > --- > drivers/gpu/drm/nouveau/dispnv50/atom.h | 1 + > drivers/gpu/drm/nouveau/dispnv50/disp.c | 57 ++--- > drivers/gpu/drm/nouveau/dispnv50/head.c | 5 +-- > 3 files changed, 36 insertions(+), 27 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h > b/drivers/gpu/drm/nouveau/dispnv50/atom.h > index 43df86c38f58..24f7700768da 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/atom.h > +++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h > @@ -114,6 +114,7 @@ struct nv50_head_atom { > u8 nhsync:1; > u8 nvsync:1; > u8 depth:4; > + u8 bpc; > } or; > > /* Currently only used for MST */ > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c > b/drivers/gpu/drm/nouveau/dispnv50/disp.c > index 6327aaf37c08..93665aecce57 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c > @@ -353,10 +353,20 @@ nv50_outp_atomic_check(struct drm_encoder *encoder, > struct drm_crtc_state *crtc_state, > struct drm_connector_state *conn_state) > { > - struct nouveau_connector *nv_connector = > - nouveau_connector(conn_state->connector); > - return nv50_outp_atomic_check_view(encoder, crtc_state, conn_state, > 
-nv_connector->native_mode); > + struct drm_connector *connector = conn_state->connector; > + struct nouveau_connector *nv_connector = nouveau_connector(connector); > + struct nv50_head_atom *asyh = nv50_head_atom(crtc_state); > + int ret; > + > + ret = nv50_outp_atomic_check_view(encoder, crtc_state, conn_state, > + nv_connector->native_mode); > + if (ret) > + return ret; > + > + if (crtc_state->mode_changed || crtc_state->connectors_changed) > + asyh->or.bpc = connector->display_info.bpc; > + > + return 0; > } > > > /** > @@ -786,10 +796,10 @@ nv50_msto_atomic_check(struct drm_encoder *encoder, >* may have changed after the state was duplicated >*/ > if (!state->duplicated) { > - const int bpp = connector->display_info.bpc * 3; > const int clock = crtc_state->adjusted_mode.clock; > > - asyh->dp.pbn = drm_dp_calc_pbn_mode(clock, bpp); > + asyh->or.bpc = connector->display_info.bpc; > + asyh->dp.pbn = drm_dp_calc_pbn_mode(clock, asyh->or.bpc * 3); > } > > slots = drm_dp_atomic_find_vcpi_slots(state, >mgr, mstc->port, > @@ -802,6 +812,17 @@ nv50_msto_atomic_check(struct drm_encoder *encoder, > return 0; > } > > +static u8 > +nv50_dp_bpc_to_depth(unsigned int bpc) > +{ > + switch (bpc) { > + case 6: return 0x2; > + case 8: return 0x5; > + case 10: /* fall-through */ > + default: return 0x6; > + } This is obviously just refactored from the code below, so this is probably fine for now. But what about BPC > 10? The OR here seems to be very similar to what's used on Tegra where the same values are used in the SOR_STATE1 register, see: drivers/gpu/drm/tegra/sor.h There are additional values for 12 and 16 BPC (see the definitions for SOR_STATE_ASY_PIXELDEPTH_BPP_*). With the above anything higher than 10 BPC will be treated the same and likely lead to wrong results. So I think either a WARN for the "default" case or additional cases for the other values would be good to have. 
Like I said, if this even is problematic (and given that userspace does not really support > 8 BPC yet, it probably isn't) it's a preexisting problem, so can be done in a different patch. Other than that this
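The review suggestion above (flag the "default" case instead of silently treating every unknown BPC like 10) can be sketched in plain C. This is a user-space illustration under stated assumptions, not the driver code: `bpc_to_depth()` and its `unknown` out-parameter are hypothetical stand-ins for `nv50_dp_bpc_to_depth()` and a `WARN()`. Only the 6/8/10 encodings come from the quoted patch; no encodings for 12 or 16 BPC are assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of nv50_dp_bpc_to_depth(). The 6/8/10
 * encodings are taken from the quoted patch. Any other value is reported
 * through *unknown (standing in for a WARN()) and falls back to the
 * 10 bpc encoding, which matches the behaviour of the quoted code.
 */
static uint8_t bpc_to_depth(unsigned int bpc, bool *unknown)
{
	*unknown = false;

	switch (bpc) {
	case 6:
		return 0x2;
	case 8:
		return 0x5;
	case 10:
		return 0x6;
	default:
		*unknown = true; /* the driver could WARN() here */
		return 0x6;
	}
}
```

With this shape, a caller passing 12 still gets the 10 bpc encoding but can see that the value was not recognised, which is the minimum the review asks for.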
Re: [Nouveau] [PATCH] RFC: drm/nouveau: Make BAR1 support optional
On Fri, Nov 08, 2019 at 05:02:07PM +0100, Thierry Reding wrote: > From: Thierry Reding > > The purpose of BAR1 is primarily to make memory accesses coherent. > However, some GPUs do not have BAR1 functionality. For example, the > GV11B found on the Xavier SoC is DMA coherent and therefore doesn't > need BAR1. > > Implement a variant of FIFO channels that work without a mapping of > instance memory through BAR1. > > XXX ensure memory barriers are in place for writes > > Signed-off-by: Thierry Reding > --- > Hi Ben, > > I'm sending this a bit out of context (it's part of the larger series to > enable basic GV11B support) because I wanted to get some early feedback > from you on this. > > For a bit of background: GV11B as it turns out doesn't really have BAR1 > anymore. The SoC has a coherency fabric which means that basically the > GPU's system memory accesses are already coherent and hence we no longer > need to go via BAR1 to ensure that. Functionally the same can be done by > just writing to the memory via the CPU's virtual mapping. > > So this patch implement basically a second variant of the FIFO channel > which, instead of taking a physical address and then ioremapping that, > takes a struct nvkm_memory object. This seems to work, though since I'm > not really doing much yet (firmware loading fails, etc.) I wouldn't call > this "done" just yet. > > In fact there are a few things that I'm not happy about. For example I > think we'll eventually need to have barriers to ensure that the CPU > write buffers are flushed, etc. It also seems like most users of the > FIFO channel object will just go and map its buffer once and then only > access it via the virtual mapping only, without going through the > ->rd32()/->wr32() callbacks nor unmapping via ->unmap(). That means we > effectively don't have a good point where we could emit the memory > barriers. 
> > I see two possibilities here: 1) make all accesses go through the > accessors or 2) guard each series of accesses with a pair of nvkm_map() > and nvkm_done() calls. Both of those would mean that all code paths need > to be carefully audited. Actually it looks like this is already working if I return 0 as the address from the ->unmap() callback. That seems to result in the ->wr32() and ->rd32() callbacks getting called instead of the callers trying to directly dereference the address, which obviously they now can't. So this seems like it could give me exactly what I need to make this work. Again, this seems to get me past probe, but I see only a single write at this point, so that's not saying much. Thierry > > One other thing I'm wondering is if it's okay to put all of this into > the gk104_fifo implementation. I think the result of parameterizing on > device->bar is pretty neat. Also, it seems like most of the rest of the > code would have to be duplicated, or a lot of the gk104_*() function > exported to a new implementation. So I'm not sure that it's really worth > it. 
> > Thierry > > .../drm/nouveau/include/nvkm/engine/fifo.h| 7 +- > .../gpu/drm/nouveau/nvkm/engine/fifo/chan.c | 157 -- > .../gpu/drm/nouveau/nvkm/engine/fifo/chan.h | 6 + > .../gpu/drm/nouveau/nvkm/engine/fifo/gk104.c | 29 +++- > 4 files changed, 180 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > index 4bd6e1e7c413..c0fb545efb2b 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > @@ -25,7 +25,12 @@ struct nvkm_fifo_chan { > struct nvkm_gpuobj *inst; > struct nvkm_gpuobj *push; > struct nvkm_vmm *vmm; > - void __iomem *user; > + > + union { > + struct nvkm_memory *mem; > + void __iomem *user; > + }; > + > u64 addr; > u32 size; > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c > b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c > index d83485385934..f47bc96bbb6d 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/chan.c > @@ -310,7 +310,7 @@ nvkm_fifo_chan_init(struct nvkm_object *object) > } > > static void * > -nvkm_fifo_chan_dtor(struct nvkm_object *object) > +__nvkm_fifo_chan_dtor(struct nvkm_object *object) > { > struct nvkm_fifo_chan *chan = nvkm_fifo_chan(object); > struct nvkm_fifo *fifo = chan->fifo; > @@ -324,9 +324,6 @@ nvkm_fifo_chan_dtor(struct nvkm_object *object) > } > spin_unlock_irqrestore(>lock, flags); > > - if (chan->us
[Nouveau] [PATCH v2 4/9] drm/nouveau: tegra: Do not try to disable PCI device
From: Thierry Reding

When Nouveau is instantiated on top of a platform device, the dev->pdev
field will be NULL and calling pci_disable_device() will crash. Move the
PCI disabling code to the PCI specific driver removal code.

Reviewed-by: Lyude Paul
Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 2cd83849600f..b65ae817eabf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -715,7 +715,6 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
 void
 nouveau_drm_device_remove(struct drm_device *dev)
 {
-	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nvkm_client *client;
 	struct nvkm_device *device;
@@ -727,7 +726,6 @@ nouveau_drm_device_remove(struct drm_device *dev)
 	device = nvkm_device_find(client->device);
 
 	nouveau_drm_device_fini(dev);
-	pci_disable_device(pdev);
 	drm_dev_put(dev);
 	nvkm_device_del(&device);
 }
@@ -738,6 +736,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
 	nouveau_drm_device_remove(dev);
+	pci_disable_device(pdev);
 }
 
 static int
--
2.23.0
[Nouveau] [PATCH v2 5/9] drm/nouveau: tegra: Avoid pulsing reset twice
From: Thierry Reding

When the GPU powergate is controlled by a generic power domain provider,
the reset will automatically be asserted and deasserted as part of the
power-ungating procedure.

On some Jetson TX2 boards, doing an additional assert and deassert of
the GPU outside of the power-ungate procedure can cause the GPU to go
into a bad state where the memory interface can no longer access system
memory.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 0e372a190d3f..747a775121cf 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -52,18 +52,18 @@ nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 	clk_set_rate(tdev->clk_pwr, 204000000);
 	udelay(10);
 
-	reset_control_assert(tdev->rst);
-	udelay(10);
-
 	if (!tdev->pdev->dev.pm_domain) {
+		reset_control_assert(tdev->rst);
+		udelay(10);
+
 		ret = tegra_powergate_remove_clamping(TEGRA_POWERGATE_3D);
 		if (ret)
 			goto err_clamp;
 		udelay(10);
-	}
 
-	reset_control_deassert(tdev->rst);
-	udelay(10);
+		reset_control_deassert(tdev->rst);
+		udelay(10);
+	}
 
 	return 0;
 
--
2.23.0
[Nouveau] [PATCH v2 7/9] drm/nouveau: secboot: Read WPR configuration from GPU registers
From: Thierry Reding The GPUs found on Tegra SoCs have registers that can be used to read the WPR configuration. Use these registers instead of reaching into the memory controller's register space to read the same information. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- 3 files changed, 53 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h index 62c5e162099a..280b1448df88 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h @@ -41,6 +41,6 @@ int gm200_secboot_run_blob(struct nvkm_secboot *, struct nvkm_gpuobj *, struct nvkm_falcon *); /* Tegra-only */ -int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *, u32); +int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c index df8b919dcf09..f8a543122219 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c @@ -23,39 +23,65 @@ #include "acr.h" #include "gm200.h" -#define TEGRA210_MC_BASE 0x70019000 - #ifdef CONFIG_ARCH_TEGRA -#define MC_SECURITY_CARVEOUT2_CFG0 0xc58 -#define MC_SECURITY_CARVEOUT2_BOM_00xc5c -#define MC_SECURITY_CARVEOUT2_BOM_HI_0 0xc60 -#define MC_SECURITY_CARVEOUT2_SIZE_128K0xc64 -#define TEGRA_MC_SECURITY_CARVEOUT_CFG_LOCKED (1 << 1) /** * gm20b_secboot_tegra_read_wpr() - read the WPR registers on Tegra * - * On dGPU, we can manage the WPR region ourselves, but on Tegra the WPR region - * is reserved from system memory by the bootloader and irreversibly locked. - * This function reads the address and size of the pre-configured WPR region. 
+ * On dGPU, we can manage the WPR region ourselves, but on Tegra this region + * is allocated from system memory by the secure firmware. The region is then + * marked as a "secure carveout" and irreversibly locked. Furthermore, the WPR + * secure carveout is also configured to be sent to the GPU via a dedicated + * serial bus between the memory controller and the GPU. The GPU requests this + * information upon leaving reset and exposes it through a FIFO register at + * offset 0x100cd4. + * + * The FIFO register's lower 4 bits can be used to set the read index into the + * FIFO. After each read of the FIFO register, the read index is incremented. + * + * Indices 2 and 3 contain the lower and upper addresses of the WPR. These are + * stored in units of 256 B. The WPR is inclusive of both addresses. + * + * Unfortunately, for some reason the WPR info register doesn't contain the + * correct values for the secure carveout. It seems like the upper address is + * always too small by 128 KiB - 1. Given that the secure carvout size in the + * memory controller configuration is specified in units of 128 KiB, it's + * possible that the computation of the upper address of the WPR is wrong and + * causes this difference. 
  */
 int
-gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb, u32 mc_base)
+gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb)
 {
+	struct nvkm_device *device = gsb->base.subdev.device;
 	struct nvkm_secboot *sb = &gsb->base;
-	void __iomem *mc;
-	u32 cfg;
+	u64 base, limit;
+	u32 value;
 
-	mc = ioremap(mc_base, 0xd00);
-	if (!mc) {
-		nvkm_error(&sb->subdev, "Cannot map Tegra MC registers\n");
-		return -ENOMEM;
-	}
-	sb->wpr_addr = ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_0) |
-	      ((u64)ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_HI_0) << 32);
-	sb->wpr_size = ioread32_native(mc + MC_SECURITY_CARVEOUT2_SIZE_128K)
-		<< 17;
-	cfg = ioread32_native(mc + MC_SECURITY_CARVEOUT2_CFG0);
-	iounmap(mc);
+	/* set WPR info register to point at WPR base address register */
+	value = nvkm_rd32(device, 0x100cd4);
+	value &= ~0xf;
+	value |= 0x2;
+	nvkm_wr32(device, 0x100cd4, value);
+
+	/* read base address */
+	value = nvkm_rd32(device, 0x100cd4);
+	base = (u64)(value >> 4) << 12;
+
+	/* read limit */
+	value = nvkm_rd32(device, 0x100cd4);
+	limit = (u64)(value >> 4) << 12;
+
+	/*
+	 * The upper address of the WPR seems to be computed wrongly and is
+	 * actually SZ_128K - 1 bytes lower than it should be. Adjust the
+	 * value accordingly.
+	 */
+	limit += SZ_128K - 1;
+
+	sb->wpr_size = limit - ba
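The address arithmetic described in the patch comment can be worked through in a small user-space sketch. Everything here is an assumption made for illustration: `wpr_info_to_addr()`, `wpr_decode()` and `struct wpr_region` are hypothetical names, the two input words model consecutive reads of the register at 0x100cd4 (indices 2 and 3), and the `+ 1` in the size follows from the comment's statement that the WPR is inclusive of both addresses, since the patch text is truncated before that line is complete.

```c
#include <stdint.h>

#define SZ_128K 0x20000ULL

/* Drop the 4-bit FIFO index and scale the remaining bits by 256 bytes. */
static uint64_t wpr_info_to_addr(uint32_t value)
{
	return (uint64_t)(value >> 4) << 12;
}

struct wpr_region {
	uint64_t addr;
	uint64_t size;
};

/*
 * base_word and limit_word stand in for two consecutive reads of the WPR
 * info register. The limit is bumped by SZ_128K - 1 as in the patch; the
 * "+ 1" assumes an inclusive limit, which the patch comment states but
 * the truncated code does not show.
 */
static struct wpr_region wpr_decode(uint32_t base_word, uint32_t limit_word)
{
	uint64_t base = wpr_info_to_addr(base_word);
	uint64_t limit = wpr_info_to_addr(limit_word) + (SZ_128K - 1);
	struct wpr_region wpr = { .addr = base, .size = limit - base + 1 };

	return wpr;
}
```

For example, a base word of 0x800002 decodes to 0x80000000, and a limit word of 0x800e03 decodes to 0x800e0000, which becomes 0x800fffff after the adjustment, giving a 1 MiB region under these assumptions.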
[Nouveau] [PATCH v2 8/9] drm/nouveau: gp10b: Add custom L2 cache implementation
From: Thierry Reding There are extra registers that need to be programmed to make the level 2 cache work on GP10B, such as the stream ID register that is used when an SMMU is used to translate memory addresses. Signed-off-by: Thierry Reding --- Changes in v2: - remove IOMMU_API protection to increase compile coverage - relies on dummy dev_iommu_fwspec_get() helper .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 65 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + 5 files changed, 70 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h index 644d527c3b96..d76f60d7d29a 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h @@ -40,4 +40,5 @@ int gm107_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gm200_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp102_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); +int gp10b_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index b061df138142..231ec0073af3 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2380,7 +2380,7 @@ nv13b_chipset = { .fuse = gm107_fuse_new, .ibus = gp10b_ibus_new, .imem = gk20a_instmem_new, - .ltc = gp102_ltc_new, + .ltc = gp10b_ltc_new, .mc = gp10b_mc_new, .mmu = gp10b_mmu_new, .secboot = gp10b_secboot_new, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild index 
2b6d36ea7067..728d75010847 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild @@ -6,3 +6,4 @@ nvkm-y += nvkm/subdev/ltc/gm107.o nvkm-y += nvkm/subdev/ltc/gm200.o nvkm-y += nvkm/subdev/ltc/gp100.o nvkm-y += nvkm/subdev/ltc/gp102.o +nvkm-y += nvkm/subdev/ltc/gp10b.o diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c new file mode 100644 index ..c0063c7caa50 --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c @@ -0,0 +1,65 @@ +/* + * Copyright (c) 2019 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ *
+ * Authors: Thierry Reding
+ */
+
+#include "priv.h"
+
+static void
+gp10b_ltc_init(struct nvkm_ltc *ltc)
+{
+	struct nvkm_device *device = ltc->subdev.device;
+	struct iommu_fwspec *spec;
+
+	nvkm_wr32(device, 0x17e27c, ltc->ltc_nr);
+	nvkm_wr32(device, 0x17e000, ltc->ltc_nr);
+	nvkm_wr32(device, 0x100800, ltc->ltc_nr);
+
+	spec = dev_iommu_fwspec_get(device->dev);
+	if (spec) {
+		u32 sid = spec->ids[0] & 0xffff;
+
+		/* stream ID */
+		nvkm_wr32(device, 0x160000, sid << 2);
+	}
+}
+
+static const struct nvkm_ltc_func
+gp10b_ltc = {
+	.oneinit = gp100_ltc_oneinit,
+	.init = gp10b_ltc_init,
+	.intr = gp100_ltc_intr,
+	.cbc_clear = gm107_ltc_cbc_clear,
+	.cbc_wait = gm107_ltc_cbc_wait,
+	.zbc = 16,
+	.zbc_clear_color = gm107_ltc_zbc_clear_color,
+	.zbc_clear_depth = gm107_ltc_zbc_clear_depth,
+	.zbc_clear_stencil = gp102_ltc_zbc_clear_stencil,
+	.invalidate = gf100_ltc_invalidate,
+	.flush = gf100_ltc_flush,
+};
+
+int
+gp10b_ltc_new(str
[Nouveau] [PATCH v2 6/9] drm/nouveau: tegra: Set clock rate if not set
From: Thierry Reding

If the GPU clock has not had a rate set, initialize it to the maximum
clock rate to make sure it does run.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 747a775121cf..d0d52c1d4aee 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -279,6 +279,7 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		      struct nvkm_device **pdevice)
 {
 	struct nvkm_device_tegra *tdev;
+	unsigned long rate;
 	int ret;
 
 	if (!(tdev = kzalloc(sizeof(*tdev), GFP_KERNEL)))
@@ -307,6 +308,17 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		goto free;
 	}
 
+	rate = clk_get_rate(tdev->clk);
+	if (rate == 0) {
+		ret = clk_set_rate(tdev->clk, ULONG_MAX);
+		if (ret < 0)
+			goto free;
+
+		rate = clk_get_rate(tdev->clk);
+
+		dev_dbg(&pdev->dev, "GPU clock set to %lu\n", rate);
+	}
+
 	if (func->require_ref_clk)
 		tdev->clk_ref = devm_clk_get(&pdev->dev, "ref");
 	if (IS_ERR(tdev->clk_ref)) {
--
2.23.0
[Nouveau] [PATCH v2 3/9] drm/nouveau: fault: Add support for GP10B
From: Thierry Reding There is no BAR2 on GP10B and there is no need to map through BAR2 because all memory is shared between the GPU and the CPU. Add a custom implementation of the fault sub-device that uses nvkm_memory_addr() instead of nvkm_memory_bar2() to return the address of a pinned fault buffer. Signed-off-by: Thierry Reding --- .../drm/nouveau/include/nvkm/subdev/fault.h | 1 + .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 17 -- .../gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c | 53 +++ .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 10 8 files changed, 80 insertions(+), 7 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h index 97322f95b3ee..a513c16ab105 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h @@ -31,6 +31,7 @@ struct nvkm_fault_data { }; int gp100_fault_new(struct nvkm_device *, int, struct nvkm_fault **); +int gp10b_fault_new(struct nvkm_device *, int, struct nvkm_fault **); int gv100_fault_new(struct nvkm_device *, int, struct nvkm_fault **); int tu102_fault_new(struct nvkm_device *, int, struct nvkm_fault **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index c3c7159f3411..b061df138142 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2375,7 +2375,7 @@ nv13b_chipset = { .name = "GP10B", .bar = gm20b_bar_new, .bus = gf100_bus_new, - .fault = gp100_fault_new, + .fault = gp10b_fault_new, .fb = gp10b_fb_new, .fuse = gm107_fuse_new, .ibus = gp10b_ibus_new, diff --git 
a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild index 53b9d638f2c8..d65ec719f153 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild @@ -2,5 +2,6 @@ nvkm-y += nvkm/subdev/fault/base.o nvkm-y += nvkm/subdev/fault/user.o nvkm-y += nvkm/subdev/fault/gp100.o +nvkm-y += nvkm/subdev/fault/gp10b.o nvkm-y += nvkm/subdev/fault/gv100.o nvkm-y += nvkm/subdev/fault/tu102.o diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c index ca251560d3e0..1c4b852b26c3 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c @@ -108,7 +108,7 @@ nvkm_fault_oneinit_buffer(struct nvkm_fault *fault, int id) return ret; /* Pin fault buffer in BAR2. */ - buffer->addr = nvkm_memory_bar2(buffer->mem); + buffer->addr = fault->func->buffer.pin(buffer); if (buffer->addr == ~0ULL) return -EFAULT; diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c index 4f3c4e091117..f6b189cc4330 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c @@ -21,25 +21,26 @@ */ #include "priv.h" +#include #include #include -static void +void gp100_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable) { struct nvkm_device *device = buffer->fault->subdev.device; nvkm_mc_intr_mask(device, NVKM_SUBDEV_FAULT, enable); } -static void +void gp100_fault_buffer_fini(struct nvkm_fault_buffer *buffer) { struct nvkm_device *device = buffer->fault->subdev.device; nvkm_mask(device, 0x002a70, 0x0001, 0x); } -static void +void gp100_fault_buffer_init(struct nvkm_fault_buffer *buffer) { struct nvkm_device *device = buffer->fault->subdev.device; @@ -48,7 +49,12 @@ gp100_fault_buffer_init(struct nvkm_fault_buffer *buffer) nvkm_mask(device, 0x002a70, 0x0001, 0x0001); } 
-static void +u64 gp100_fault_buffer_pin(struct nvkm_fault_buffer *buffer) +{ + return nvkm_memory_bar2(buffer->mem); +} + +void gp100_fault_buffer_info(struct nvkm_fault_buffer *buffer) { buffer->entries = nvkm_rd32(buffer->fault->subdev.device, 0x002a78); @@ -56,7 +62,7 @@ gp100_fault_buffer_info(struct nvkm_fault_buffer *buffer) buffer->put = 0x002a80; } -static void +void gp100_fault_intr(struct nvkm_fault *fault) { nvkm_event_send(>event, 1, 0, NULL, 0); @@ -68,6 +74,7 @@ gp100_fault = { .buffer.nr = 1, .buffer.entry_size = 32, .buffer.info = gp100_fault_buffe
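The new `buffer.pin` callback lets each chip decide how a fault buffer's address is obtained. A minimal user-space model of that dispatch is sketched below. All types and names here (`struct mem`, `struct fault_func`, `struct fault_buffer`, `gp100_pin()`, `gp10b_pin()`, `buffer_oneinit()`) are simplified, hypothetical stand-ins for the nvkm structures, and the two address fields merely model what `nvkm_memory_bar2()` and `nvkm_memory_addr()` would return.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the nvkm types; only the shape matters. */
struct mem {
	uint64_t bar2_addr; /* models the result of nvkm_memory_bar2() */
	uint64_t phys_addr; /* models the result of nvkm_memory_addr() */
};

struct fault_buffer;

struct fault_func {
	uint64_t (*pin)(struct fault_buffer *buffer);
};

struct fault_buffer {
	const struct fault_func *func;
	struct mem *mem;
	uint64_t addr;
};

/* dGPU variant: the buffer is reached through its BAR2 mapping. */
static uint64_t gp100_pin(struct fault_buffer *buffer)
{
	return buffer->mem->bar2_addr;
}

/* Tegra variant: no BAR2, memory is shared, so use the address directly. */
static uint64_t gp10b_pin(struct fault_buffer *buffer)
{
	return buffer->mem->phys_addr;
}

/* Common code no longer cares which variant it is talking to. */
static int buffer_oneinit(struct fault_buffer *buffer)
{
	buffer->addr = buffer->func->pin(buffer);
	if (buffer->addr == ~0ULL)
		return -1; /* the driver returns -EFAULT here */

	return 0;
}
```

The point of the indirection is that the common `oneinit` path keeps its `~0ULL` failure check unchanged while the chip-specific code picks the address source.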
[Nouveau] [PATCH v2 9/9] drm/nouveau: gp10b: Use correct copy engine
From: Thierry Reding

gp10b uses the new engine enumeration mechanism introduced in the Pascal
architecture. As a result, the copy engine, which used to be at index 2
for prior Tegra GPU instantiations, has now moved to index 0. Fix up the
index and also use the gp100 variant of the copy engine class because on
gp10b the PASCAL_DMA_COPY_B class is not supported.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index 231ec0073af3..eba450e689b2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2387,7 +2387,7 @@ nv13b_chipset = {
 	.pmu = gm20b_pmu_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[2] = gp102_ce_new,
+	.ce[0] = gp100_ce_new,
 	.dma = gf119_dma_new,
 	.fifo = gp10b_fifo_new,
 	.gr = gp10b_gr_new,
--
2.23.0
[Nouveau] [PATCH v2 2/9] iommu: Add dummy dev_iommu_fwspec_get() helper
From: Thierry Reding This dummy implementation is useful to avoid a dependency on the IOMMU_API Kconfig symbol in drivers that can optionally use the IOMMU API. In order to fully use this, also move the struct iommu_fwspec definition out of the IOMMU_API protected region. Suggested-by: Ben Dooks Signed-off-by: Thierry Reding --- include/linux/iommu.h | 47 --- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 7bf038b371b8..b092e73b2c86 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -190,6 +190,27 @@ struct iommu_sva_ops { iommu_mm_exit_handler_t mm_exit; }; +/** + * struct iommu_fwspec - per-device IOMMU instance data + * @ops: ops for this device's IOMMU + * @iommu_fwnode: firmware handle for this device's IOMMU + * @iommu_priv: IOMMU driver private data for this device + * @flags: IOMMU flags associated with this device + * @num_ids: number of associated device IDs + * @ids: IDs which this device may present to the IOMMU + */ +struct iommu_fwspec { + const struct iommu_ops *ops; + struct fwnode_handle *iommu_fwnode; + void *iommu_priv; + u32 flags; + unsigned int num_ids; + u32 ids[1]; +}; + +/* ATS is supported */ +#define IOMMU_FWSPEC_PCI_RC_ATS (1 << 0) + #ifdef CONFIG_IOMMU_API /** @@ -565,27 +586,6 @@ extern struct iommu_group *generic_device_group(struct device *dev); /* FSL-MC device grouping function */ struct iommu_group *fsl_mc_device_group(struct device *dev); -/** - * struct iommu_fwspec - per-device IOMMU instance data - * @ops: ops for this device's IOMMU - * @iommu_fwnode: firmware handle for this device's IOMMU - * @iommu_priv: IOMMU driver private data for this device - * @flags: IOMMU flags associated with this device - * @num_ids: number of associated device IDs - * @ids: IDs which this device may present to the IOMMU - */ -struct iommu_fwspec { - const struct iommu_ops *ops; - struct fwnode_handle *iommu_fwnode; - void *iommu_priv; - u32 flags; - unsigned int num_ids; 
- u32 ids[1]; -}; - -/* ATS is supported */ -#define IOMMU_FWSPEC_PCI_RC_ATS (1 << 0) - /** * struct iommu_sva - handle to a device-mm bond */ @@ -980,6 +980,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode) return NULL; } +static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev) +{ + return NULL; +} + static inline bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat) { -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
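For readers unfamiliar with the stub pattern this patch relies on, here is a minimal userspace sketch. It is not the kernel code: struct device, struct iommu_fwspec and driver_uses_iommu() are simplified stand-ins invented for illustration, and CONFIG_IOMMU_API is assumed to be undefined so that the dummy path is the one compiled. The structure is declared outside the #ifdef on purpose, which mirrors why the patch moves struct iommu_fwspec out of the protected region.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for illustration). */
struct iommu_fwspec {
	unsigned int num_ids;
};

struct device {
	struct iommu_fwspec *fwspec;
};

#ifdef CONFIG_IOMMU_API
/* Real variant: hand back whatever the IOMMU core attached to the device. */
static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
	return dev->fwspec;
}
#else
/* Dummy variant: with the IOMMU API compiled out, no device has a fwspec. */
static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
	(void)dev;
	return NULL;
}
#endif

/* Hypothetical caller: builds unchanged in both configurations. */
static bool driver_uses_iommu(struct device *dev)
{
	struct iommu_fwspec *spec = dev_iommu_fwspec_get(dev);

	return spec != NULL && spec->num_ids > 0;
}
```

The caller needs no #ifdef of its own; it only has to cope with a NULL return.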
[Nouveau] [PATCH v2 0/9] drm/nouveau: Various fixes for GP10B
From: Thierry Reding Hi Ben, here's a revised subset of the patches I had sent out a couple of weeks ago. I've reworked the BAR2 accesses in the way that you had suggested, which at least for GP10B turned out to be fairly trivial to do. I have not looked in detail at this for GV11B yet, but a cursory look showed that BAR2 is accessed in more places, so the equivalent for GV11B might be a bit more involved. Other than that, not a lot has changed since then. I've added a couple of precursory patches to add IOMMU helper dummies for the case where IOMMU is disabled (as suggested by Ben Dooks). Joerg, it'd be great if you could give an Acked-by on those two patches so that Ben can pick them both up into the Nouveau tree. Alternatively I can put them both into a stable branch and send a pull request to both of you. Or yet another alternative would be for Joerg to apply them now and Ben to wait for v5.5-rc1 until he picks up the rest. All of those work for me. Thierry Thierry Reding (9): iommu: Document iommu_fwspec::flags field iommu: Add dummy dev_iommu_fwspec_get() helper drm/nouveau: fault: Add support for GP10B drm/nouveau: tegra: Do not try to disable PCI device drm/nouveau: tegra: Avoid pulsing reset twice drm/nouveau: tegra: Set clock rate if not set drm/nouveau: secboot: Read WPR configuration from GPU registers drm/nouveau: gp10b: Add custom L2 cache implementation drm/nouveau: gp10b: Use correct copy engine .../drm/nouveau/include/nvkm/subdev/fault.h | 1 + .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +- .../gpu/drm/nouveau/nvkm/engine/device/base.c | 6 +- .../drm/nouveau/nvkm/engine/device/tegra.c| 24 -- .../gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 17 ++-- .../gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c | 53 .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + .../gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 10 
+++ .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 65 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- include/linux/iommu.h | 46 ++- 18 files changed, 249 insertions(+), 71 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp10b.c create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH v2 1/9] iommu: Document iommu_fwspec::flags field
From: Thierry Reding When this field was added in commit 5702ee24182f ("ACPI/IORT: Check ATS capability in root complex nodes"), the kerneldoc comment wasn't updated at the same time. Signed-off-by: Thierry Reding --- include/linux/iommu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index e28e80dea141..7bf038b371b8 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -570,6 +570,7 @@ struct iommu_group *fsl_mc_device_group(struct device *dev); * @ops: ops for this device's IOMMU * @iommu_fwnode: firmware handle for this device's IOMMU * @iommu_priv: IOMMU driver private data for this device + * @flags: IOMMU flags associated with this device * @num_ids: number of associated device IDs * @ids: IDs which this device may present to the IOMMU */ -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 3/6] drm/nouveau: Remove bogus gk20a aperture callback
On Tue, Sep 17, 2019 at 11:02:54AM +0200, Thierry Reding wrote: > On Tue, Sep 17, 2019 at 01:43:13PM +1000, Ben Skeggs wrote: > > On Tue, 17 Sep 2019 at 01:18, Thierry Reding > > wrote: > > > > > > From: Thierry Reding > > > > > > The gk20a (as well as all subsequent Tegra instantiations of the GPU) do > > > in fact use the same apertures as regular GPUs. Prior to gv11b there are > > > no checks in hardware for the aperture, so we get away with setting VRAM > > > as the aperture for buffers that are actually in system memory. > > Can GK20A take comptags with aperture set to system memory? For some > > reason I can recall, I was under the impression PTEs needed to be > > pointed at "vidmem" (despite them actually accessing system memory > > anyway) on Tegra parts for compression to work? I could be mistaken > > though. > > I don't think GK20A supports comptags at all. I think this wasn't > introduced until GM20B. nvgpu has some "gk20a" code to flush comptags, > but that's only used on GM20B and later. > > Anyway, my understanding is that on all of GK20A, GM20B and GP10B the > aperture field is completely ignored. I think that also goes for > comptags. nvgpu in particular never requests comptags to be allocated > from vidmem. See: > > > https://nv-tegra.nvidia.com/gitweb/?p=linux-nvgpu.git;a=blob;f=drivers/gpu/nvgpu/os/linux/ltc.c;h=baeb20b2e539cc6cb70667ce168603546678dc73;hb=2081ce686bfd4deb461b4130df424d592000ff88#l30 > > There are two callers of that, both passing "false" for the vidmem_alloc > parameter, so comptags always do end up in system memory for Tegra. > > That said, I'll go confirm that with one of our experts and get back to > you. So it looks like you're right and indeed GK20A and later (up to, but not including, GV11B that is) comptags do indeed need to be mapped as vidmem even if they reside in sysmem. Conceptually I think what we want to do is decide about the aperture at allocation time. 
So if we allocate comptags we would need to force the aperture to be VRAM on the iGPUs where necessary, even if they are really allocated in system memory. Ultimately the end result is the same of course, but I think this way around is a better representation of this particular hardware quirk and allows us to keep the unification that this patch series achieves. But I'll have to look into this and see what I can come up with. Thierry > > Thierry > > > > > Ben. > > > > > > > > Signed-off-by: Thierry Reding > > > --- > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 - > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c | 10 -- > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c | 4 ++-- > > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c | 2 +- > > > 4 files changed, 3 insertions(+), 14 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > > index fb3a9e8bb9cd..9862f44ac8b5 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > > @@ -212,7 +212,6 @@ void gf100_vmm_flush(struct nvkm_vmm *, int); > > > void gf100_vmm_invalidate(struct nvkm_vmm *, u32 type); > > > void gf100_vmm_invalidate_pdb(struct nvkm_vmm *, u64 addr); > > > > > > -int gk20a_vmm_aper(enum nvkm_memory_target); > > > int gk20a_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map > > > *); > > > > > > int gm200_vmm_new_(const struct nvkm_vmm_func *, const struct > > > nvkm_vmm_func *, > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > > index 16d7bf727292..999b953505b3 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > > @@ -25,16 +25,6 @@ > > > > > > #include > > > > > > -int > > > -gk20a_vmm_aper(enum nvkm_memory_target target) > > > -{ > > > - 
switch (target) { > > > - case NVKM_MEM_TARGET_NCOH: return 0; > > > - default: > > > - return -EINVAL; > > > - } > > > -} > > > - > > > int > > > gk20a_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc, > > > struct nvkm_vmm_map *map) > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.
Re: [Nouveau] [PATCH 0/2] drm/nouveau: Two more fixes
On Tue, Sep 17, 2019 at 04:07:54PM +1000, Ben Skeggs wrote: > On Tue, 17 Sep 2019 at 00:36, Thierry Reding wrote: > > > > From: Thierry Reding > > > > Hi Ben, > > > > I messed up the ordering of patches in my tree a bit, so these two fixes > > got separated from the others. I don't consider these particularly > > urgent because the crash that the first one fixes only happens on gp10b > > which we don't enable by default yet and the second patch fixes a crash > > that only happens on module unload (or driver unbind, more accurately), > > which isn't a terribly common thing to do. > > > > I'll be sending out fixes shortly to make the GP10B work more properly > > on a wider range of Jetson TX2 devices and enable it by default. > > > > One thing to mention is that I'm not exactly sure if the first patch is > > the right thing to do. I haven't seen any issues after that change, but > > I'm also not exactly sure I understand what BAR2 is used for, so I don't > > know if I would've even covered those code paths (other than the one > > causing the crash at probe time) in my tests. > BAR2 on dGPUs is used to map kernel-level GPU objects in VRAM so they > can be accessed by the driver. It's pretty much a smaller version of > BAR1, but intended for a different purpose. > > On dGPUs, there's a couple of places (fault buffer address, and fault > method buffer on volta) where the GPU wants PRI regs to be poked with > an offset within BAR2 rather than an aperture+offset combination. I'm > not 100% sure what Tegra parts do here, but presumably if it's working > for you, they're happy to just accept a system memory address instead. > > I guess this would be the right thing to do here in that situation. > The more obvious (from a "reading the code" POV) thing to do would be > to write Tegra-specific versions of the functions that use > nvkm_memory_bar2() to perform this mapping, and use nvkm_memory_addr() > instead but I'm not sure if we need/want to go to that effort. 
It's > conceivable it could be required at some point. Yeah, that sounds slightly more correct. I'll look into it and see if I can come up with something. Thierry > > Ben. > > > > > It'd be great to get Lyude's feedback on the second patch, since that > > call to pci_disable_device() was rather oddly placed and I'm not sure if > > that was essential for things to work or whether the slightly different > > point in time where it's called after this patch is also okay. It looks > > to me like it should work fine, but I don't currently have a way to test > > this on desktop GPUs. > > > > Thierry > > > > Thierry Reding (2): > > drm/nouveau: tegra: Fix NULL pointer dereference > > drm/nouveau: tegra: Do not try to disable PCI device > > > > drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +- > > .../drm/nouveau/nvkm/subdev/instmem/gk20a.c | 30 +++ > > 2 files changed, 31 insertions(+), 2 deletions(-) > > > > -- > > 2.23.0 > > > > ___ > > Nouveau mailing list > > Nouveau@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/nouveau signature.asc Description: PGP signature ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 2/6] drm/nouveau: fault: Widen engine field
On Tue, Sep 17, 2019 at 01:48:20PM +1000, Ben Skeggs wrote: > On Tue, 17 Sep 2019 at 01:18, Thierry Reding wrote: > > > > From: Thierry Reding > > > > The engine field in the FIFO fault information registers is actually 9 > > bits wide. > Looks like this is true for fault buffer parsing too. Yes, I'll add that in v2. Thierry > > > > Signed-off-by: Thierry Reding > > --- > > drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > index b5e32295237b..28306c5f6651 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > @@ -137,8 +137,8 @@ gv100_fault_intr_fault(struct nvkm_fault *fault) > > info.addr = ((u64)addrhi << 32) | addrlo; > > info.inst = ((u64)insthi << 32) | (info0 & 0xfffff000); > > info.time = 0; > > - info.engine = (info0 & 0x000000ff); > > info.aperture = (info0 & 0x00000c00) >> 10; > > + info.engine = (info0 & 0x000001ff); > > info.valid = (info1 & 0x80000000) >> 31; > > info.gpc = (info1 & 0x1f000000) >> 24; > > info.hub = (info1 & 0x00100000) >> 20; > > -- > > 2.23.0 > > > > ___ > > Nouveau mailing list > > Nouveau@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/nouveau
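To make the field widths concrete, the following userspace sketch decodes the two fields of the first fault information word that this sub-thread touches. It assumes only what the patches state: the engine ID occupies the low 9 bits and the aperture sits in bits 11:10. struct fault_fields and decode_info0() are hypothetical names, not nouveau code, and the sketch stores the engine ID in a 16-bit field because nine bits do not fit in eight.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields (not a nouveau structure). */
struct fault_fields {
	uint16_t engine;   /* 9-bit engine ID, bits 8:0 */
	uint8_t  aperture; /* 2-bit aperture, bits 11:10 */
};

static struct fault_fields decode_info0(uint32_t info0)
{
	struct fault_fields f;

	/* A mask of 0x000000ff would silently drop bit 8 of the engine ID. */
	f.engine   = info0 & 0x000001ff;
	f.aperture = (info0 & 0x00000c00) >> 10;
	return f;
}
```

For a raw value of 0x00000d23 this yields engine 0x123 and aperture 3, whereas the old 8-bit mask would have reported the engine as 0x23.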
Re: [Nouveau] [PATCH 1/6] drm/nouveau: fault: Store aperture in fault information
On Tue, Sep 17, 2019 at 01:47:25PM +1000, Ben Skeggs wrote: > On Tue, 17 Sep 2019 at 01:18, Thierry Reding wrote: > > > > From: Thierry Reding > > > > The fault information register contains data about the aperture that > > caused the failure. This can be useful in debugging aperture related > > programming bugs. > Should this be parsed for fault buffer entries too? Yes, it probably should. Will fix that in v2. Thanks, Thierry > > > > > Signed-off-by: Thierry Reding > > --- > > drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h | 1 + > > drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c| 3 ++- > > drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 1 + > > 3 files changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h > > index 97322f95b3ee..1cc862bc1122 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h > > @@ -21,6 +21,7 @@ struct nvkm_fault_data { > > u64 addr; > > u64 inst; > > u64 time; > > + u8 aperture; > > u8 engine; > > u8 valid; > > u8gpc; > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c > > b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c > > index 5d4b695cab8e..81cbe1cc4804 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c > > @@ -519,9 +519,10 @@ gk104_fifo_fault(struct nvkm_fifo *base, struct > > nvkm_fault_data *info) > > chan = nvkm_fifo_chan_inst_locked(>base, info->inst); > > > > nvkm_error(subdev, > > - "fault %02x [%s] at %016llx engine %02x [%s] client %02x > > " > > + "fault %02x [%s] at %016llx aperture %02x engine %02x > > [%s] client %02x " > >"[%s%s] reason %02x [%s] on channel %d [%010llx %s]\n", > >info->access, ea ? ea->name : "", info->addr, > > + info->aperture, > >info->engine, ee ? ee->name : en, > >info->client, ct, ec ? 
ec->name : "", > >info->reason, er ? er->name : "", chan ? chan->chid : -1, > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > index 6747f09c2dc3..b5e32295237b 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c > > @@ -138,6 +138,7 @@ gv100_fault_intr_fault(struct nvkm_fault *fault) > > info.inst = ((u64)insthi << 32) | (info0 & 0xfffff000); > > info.time = 0; > > info.engine = (info0 & 0x000000ff); > > + info.aperture = (info0 & 0x00000c00) >> 10; > > info.valid = (info1 & 0x80000000) >> 31; > > info.gpc = (info1 & 0x1f000000) >> 24; > > info.hub = (info1 & 0x00100000) >> 20; > > -- > > 2.23.0 > > > > ___ > > Nouveau mailing list > > Nouveau@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 3/6] drm/nouveau: Remove bogus gk20a aperture callback
On Tue, Sep 17, 2019 at 01:43:13PM +1000, Ben Skeggs wrote: > On Tue, 17 Sep 2019 at 01:18, Thierry Reding wrote: > > > > From: Thierry Reding > > > > The gk20a (as well as all subsequent Tegra instantiations of the GPU) do > > in fact use the same apertures as regular GPUs. Prior to gv11b there are > > no checks in hardware for the aperture, so we get away with setting VRAM > > as the aperture for buffers that are actually in system memory. > Can GK20A take comptags with aperture set to system memory? For some > reason I can recall, I was under the impression PTEs needed to be > pointed at "vidmem" (despite them actually accessing system memory > anyway) on Tegra parts for compression to work? I could be mistaken > though. I don't think GK20A supports comptags at all. I think this wasn't introduced until GM20B. nvgpu has some "gk20a" code to flush comptags, but that's only used on GM20B and later. Anyway, my understanding is that on all of GK20A, GM20B and GP10B the aperture field is completely ignored. I think that also goes for comptags. nvgpu in particular never requests comptags to be allocated from vidmem. See: https://nv-tegra.nvidia.com/gitweb/?p=linux-nvgpu.git;a=blob;f=drivers/gpu/nvgpu/os/linux/ltc.c;h=baeb20b2e539cc6cb70667ce168603546678dc73;hb=2081ce686bfd4deb461b4130df424d592000ff88#l30 There are two callers of that, both passing "false" for the vidmem_alloc parameter, so comptags always do end up in system memory for Tegra. That said, I'll go confirm that with one of our experts and get back to you. Thierry > > Ben. 
> > > > > Signed-off-by: Thierry Reding > > --- > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 - > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c | 10 -- > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c | 4 ++-- > > drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c | 2 +- > > 4 files changed, 3 insertions(+), 14 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > index fb3a9e8bb9cd..9862f44ac8b5 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h > > @@ -212,7 +212,6 @@ void gf100_vmm_flush(struct nvkm_vmm *, int); > > void gf100_vmm_invalidate(struct nvkm_vmm *, u32 type); > > void gf100_vmm_invalidate_pdb(struct nvkm_vmm *, u64 addr); > > > > -int gk20a_vmm_aper(enum nvkm_memory_target); > > int gk20a_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *); > > > > int gm200_vmm_new_(const struct nvkm_vmm_func *, const struct > > nvkm_vmm_func *, > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > index 16d7bf727292..999b953505b3 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c > > @@ -25,16 +25,6 @@ > > > > #include > > > > -int > > -gk20a_vmm_aper(enum nvkm_memory_target target) > > -{ > > - switch (target) { > > - case NVKM_MEM_TARGET_NCOH: return 0; > > - default: > > - return -EINVAL; > > - } > > -} > > - > > int > > gk20a_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc, > > struct nvkm_vmm_map *map) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c > > index 7a6066d886cd..f5d7819c4a40 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c > > @@ -25,7 +25,7 @@ static const 
struct nvkm_vmm_func > > gm20b_vmm_17 = { > > .join = gm200_vmm_join, > > .part = gf100_vmm_part, > > - .aper = gk20a_vmm_aper, > > + .aper = gf100_vmm_aper, > > .valid = gk20a_vmm_valid, > > .flush = gf100_vmm_flush, > > .invalidate_pdb = gf100_vmm_invalidate_pdb, > > @@ -41,7 +41,7 @@ static const struct nvkm_vmm_func > > gm20b_vmm_16 = { > > .join = gm200_vmm_join, > > .part = gf100_vmm_part, > > - .aper = gk20a_vmm_aper, > > + .aper = gf100_vmm_aper, > > .valid = gk20a_vmm_valid, > > .flush = gf100_vmm_flush, > > .invalidate_pdb = gf100_vmm_invalidate_pdb, > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp1
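The difference between the two callbacks is easier to see side by side. The sketch below reimplements the removed gk20a_vmm_aper() from the diff and contrasts it with a mapping in the style of gf100_vmm_aper(). The gf100 values (0 for VRAM, 2 for coherent system memory, 3 for non-coherent system memory) are not shown in this thread; they are my recollection of the nouveau source and should be treated as an assumption. The enum and function names are local stand-ins, not the driver's own.

```c
#include <errno.h>

/* Stand-in for enum nvkm_memory_target (names shortened for the sketch). */
enum mem_target {
	MEM_TARGET_VRAM,
	MEM_TARGET_HOST, /* coherent system memory */
	MEM_TARGET_NCOH, /* non-coherent system memory */
	MEM_TARGET_INST, /* any other target */
};

/* Removed callback: system memory is reported as aperture 0, i.e. VRAM. */
static int gk20a_style_aper(enum mem_target target)
{
	switch (target) {
	case MEM_TARGET_NCOH: return 0;
	default:
		return -EINVAL;
	}
}

/* Replacement in the style of gf100_vmm_aper(); values are an assumption. */
static int gf100_style_aper(enum mem_target target)
{
	switch (target) {
	case MEM_TARGET_VRAM: return 0;
	case MEM_TARGET_HOST: return 2;
	case MEM_TARGET_NCOH: return 3;
	default:
		return -EINVAL;
	}
}
```

This is the behaviour change behind Ben's comptag question: after the patch, a non-coherent system-memory buffer is no longer labelled as VRAM.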
Re: [Nouveau] [PATCH 03/11] drm/nouveau: secboot: Read WPR configuration from GPU registers
On Tue, Sep 17, 2019 at 01:49:57PM +1000, Ben Skeggs wrote: > On Tue, 17 Sep 2019 at 01:04, Thierry Reding wrote: > > > > From: Thierry Reding > > > > The GPUs found on Tegra SoCs have registers that can be used to read the > > WPR configuration. Use these registers instead of reaching into the > > memory controller's register space to read the same information. > > > > Signed-off-by: Thierry Reding > > --- > > .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- > > .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- > > .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- > > 3 files changed, 53 insertions(+), 34 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h > > b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h > > index 62c5e162099a..280b1448df88 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h > > @@ -41,6 +41,6 @@ int gm200_secboot_run_blob(struct nvkm_secboot *, struct > > nvkm_gpuobj *, > >struct nvkm_falcon *); > > > > /* Tegra-only */ > > -int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *, u32); > > +int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *); > > > > #endif > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c > > index df8b919dcf09..f8a543122219 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c > > @@ -23,39 +23,65 @@ > > #include "acr.h" > > #include "gm200.h" > > > > -#define TEGRA210_MC_BASE 0x70019000 > > - > > #ifdef CONFIG_ARCH_TEGRA > > -#define MC_SECURITY_CARVEOUT2_CFG0 0xc58 > > -#define MC_SECURITY_CARVEOUT2_BOM_00xc5c > > -#define MC_SECURITY_CARVEOUT2_BOM_HI_0 0xc60 > > -#define MC_SECURITY_CARVEOUT2_SIZE_128K0xc64 > > -#define TEGRA_MC_SECURITY_CARVEOUT_CFG_LOCKED (1 << 1) > > /** > > * gm20b_secboot_tegra_read_wpr() - read the WPR registers on 
Tegra > > * > > - * On dGPU, we can manage the WPR region ourselves, but on Tegra the WPR > > region > > - * is reserved from system memory by the bootloader and irreversibly > > locked. > > - * This function reads the address and size of the pre-configured WPR > > region. > > + * On dGPU, we can manage the WPR region ourselves, but on Tegra this > > region > > + * is allocated from system memory by the secure firmware. The region is > > then > > + * marked as a "secure carveout" and irreversibly locked. Furthermore, the > > WPR > > + * secure carveout is also configured to be sent to the GPU via a dedicated > > + * serial bus between the memory controller and the GPU. The GPU requests > > this > > + * information upon leaving reset and exposes it through a FIFO register at > > + * offset 0x100cd4. > > + * > > + * The FIFO register's lower 4 bits can be used to set the read index into > > the > > + * FIFO. After each read of the FIFO register, the read index is > > incremented. > > + * > > + * Indices 2 and 3 contain the lower and upper addresses of the WPR. These > > are > > + * stored in units of 256 B. The WPR is inclusive of both addresses. > > + * > > + * Unfortunately, for some reason the WPR info register doesn't contain the > > + * correct values for the secure carveout. It seems like the upper address > > is > > + * always too small by 128 KiB - 1. Given that the secure carvout size in > > the > > + * memory controller configuration is specified in units of 128 KiB, it's > > + * possible that the computation of the upper address of the WPR is wrong > > and > > + * causes this difference. 
> > */ > > int > > -gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb, u32 mc_base) > > +gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb) > > { > > + struct nvkm_device *device = gsb->base.subdev.device; > > struct nvkm_secboot *sb = >base; > > - void __iomem *mc; > > - u32 cfg; > > + u64 base, limit; > > + u32 value; > > > > - mc = ioremap(mc_base, 0xd00); > > - if (!mc) { > > - nvkm_error(>subdev, "Cannot map Tegra MC registers\n"); > > - return -
Re: [Nouveau] [PATCH 08/11] drm/nouveau: tegra: Skip IOMMU initialization if already attached
On Mon, Sep 16, 2019 at 05:15:25PM +0100, Robin Murphy wrote: > On 16/09/2019 16:57, Thierry Reding wrote: > > On Mon, Sep 16, 2019 at 04:29:18PM +0100, Robin Murphy wrote: > > > Hi Thierry, > > > > > > On 16/09/2019 16:04, Thierry Reding wrote: > > > > From: Thierry Reding > > > > > > > > If the GPU is already attached to an IOMMU, don't detach it and setup an > > > > explicit IOMMU domain. Since Nouveau can now properly handle the case of > > > > the DMA API being backed by an IOMMU, just continue using the DMA API. > > > > > > > > Signed-off-by: Thierry Reding > > > > --- > > > >.../drm/nouveau/nvkm/engine/device/tegra.c| 19 > > > > +++ > > > >1 file changed, 7 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > > > b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > > > index d0d52c1d4aee..fc652aaa41c7 100644 > > > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > > > @@ -23,10 +23,6 @@ > > > >#ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER > > > >#include "priv.h" > > > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > > > > -#include > > > > -#endif > > > > - > > > >static int > > > >nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev) > > > >{ > > > > @@ -109,14 +105,13 @@ nvkm_device_tegra_probe_iommu(struct > > > > nvkm_device_tegra *tdev) > > > > unsigned long pgsize_bitmap; > > > > int ret; > > > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > > > > - if (dev->archdata.mapping) { > > > > - struct dma_iommu_mapping *mapping = > > > > to_dma_iommu_mapping(dev); > > > > - > > > > - arm_iommu_detach_device(dev); > > > > - arm_iommu_release_mapping(mapping); > > > > - } > > > > -#endif > > > > + /* > > > > +* Skip explicit IOMMU initialization if the GPU is already > > > > attached > > > > +* to an IOMMU domain. This can happen if the DMA API is backed > > > > by an > > > > +* IOMMU. 
> > > > +*/ > > > > + if (iommu_get_domain_for_dev(dev)) > > > > + return; > > > > > > Beware of "iommu.passthrough=1" - you could get a valid default domain > > > here > > > yet still have direct/SWIOTLB DMA ops. I guess you probably want to > > > double-check the domain type as well. > > > > Good point. An earlier version of this patch had an additional check for > > IOMMU_DOMAIN_DMA, but then that failed on 32-bit ARM because there the > > DMA API can also use IOMMU_DOMAIN_UNMANAGED type domains. Checking for > > IOMMU_DOMAIN_IDENTITY should be safe, though. That doesn't seem to > > appear in arch/arm, arch/arm64 or drivers/iommu/dma-iommu.c. > > Right, "domain && domain->type != IOMMU_DOMAIN_IDENTITY" should be > sufficient to answer "is the DMA layer managing my address space for me?" > unless and until some massive API change happens (which I certainly don't > foresee). Might be a good idea to roll that up into a function to have a standard way for drivers to check for this rather than open-coding the same condition everywhere (and maybe get things wrong). As an additional advantage, if that massive API change ever does happen we don't have to go and update all the callers. Something like this perhaps? static inline bool iommu_managed(struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); return domain && domain->type != IOMMU_DOMAIN_IDENTITY; } Thierry
Re: [Nouveau] [PATCH 08/11] drm/nouveau: tegra: Skip IOMMU initialization if already attached
On Mon, Sep 16, 2019 at 04:29:18PM +0100, Robin Murphy wrote: > Hi Thierry, > > On 16/09/2019 16:04, Thierry Reding wrote: > > From: Thierry Reding > > > > If the GPU is already attached to an IOMMU, don't detach it and setup an > > explicit IOMMU domain. Since Nouveau can now properly handle the case of > > the DMA API being backed by an IOMMU, just continue using the DMA API. > > > > Signed-off-by: Thierry Reding > > --- > > .../drm/nouveau/nvkm/engine/device/tegra.c| 19 +++ > > 1 file changed, 7 insertions(+), 12 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > index d0d52c1d4aee..fc652aaa41c7 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c > > @@ -23,10 +23,6 @@ > > #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER > > #include "priv.h" > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > > -#include > > -#endif > > - > > static int > > nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev) > > { > > @@ -109,14 +105,13 @@ nvkm_device_tegra_probe_iommu(struct > > nvkm_device_tegra *tdev) > > unsigned long pgsize_bitmap; > > int ret; > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > > - if (dev->archdata.mapping) { > > - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); > > - > > - arm_iommu_detach_device(dev); > > - arm_iommu_release_mapping(mapping); > > - } > > -#endif > > + /* > > +* Skip explicit IOMMU initialization if the GPU is already attached > > +* to an IOMMU domain. This can happen if the DMA API is backed by an > > +* IOMMU. > > +*/ > > + if (iommu_get_domain_for_dev(dev)) > > + return; > > Beware of "iommu.passthrough=1" - you could get a valid default domain here > yet still have direct/SWIOTLB DMA ops. I guess you probably want to > double-check the domain type as well. Good point. 
An earlier version of this patch had an additional check for IOMMU_DOMAIN_DMA, but then that failed on 32-bit ARM because there the DMA API can also use IOMMU_DOMAIN_UNMANAGED type domains. Checking for IOMMU_DOMAIN_IDENTITY should be safe, though. That doesn't seem to appear in arch/arm, arch/arm64 or drivers/iommu/dma-iommu.c. Thierry
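The domain-type check discussed in this thread can be modelled outside the kernel. The sketch below uses stand-in types: enum domain_type, struct iommu_domain and dma_layer_manages() are hypothetical and only mirror the names used in the discussion. It encodes the rule that a device counts as managed by the DMA layer when it has a domain at all and that domain is not an identity (passthrough) domain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's domain types (assumption: names only). */
enum domain_type {
	DOMAIN_IDENTITY,  /* passthrough, e.g. booted with iommu.passthrough=1 */
	DOMAIN_UNMANAGED, /* what the DMA API may use on 32-bit ARM */
	DOMAIN_DMA,       /* default DMA API domain elsewhere */
};

struct iommu_domain {
	enum domain_type type;
};

/* Hypothetical helper: is the DMA layer translating for this device? */
static bool dma_layer_manages(const struct iommu_domain *domain)
{
	return domain != NULL && domain->type != DOMAIN_IDENTITY;
}
```

Testing against the identity type rather than for the DMA type is what keeps the 32-bit ARM case working, since an unmanaged domain still counts as managed here.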
Re: [Nouveau] [PATCH 04/11] drm/nouveau: gp10b: Add custom L2 cache implementation
On Mon, Sep 16, 2019 at 05:49:46PM +0200, Thierry Reding wrote: > On Mon, Sep 16, 2019 at 04:35:30PM +0100, Ben Dooks wrote: > > On 16/09/2019 16:04, Thierry Reding wrote: > > > From: Thierry Reding > > > > > > There are extra registers that need to be programmed to make the level 2 > > > cache work on GP10B, such as the stream ID register that is used when an > > > SMMU is used to translate memory addresses. > > > > > > Signed-off-by: Thierry Reding > > > --- > > > .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + > > > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- > > > .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + > > > .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 69 +++ > > > .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + > > > 5 files changed, 74 insertions(+), 1 deletion(-) > > > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > > index 644d527c3b96..d76f60d7d29a 100644 > > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > > @@ -40,4 +40,5 @@ int gm107_ltc_new(struct nvkm_device *, int, struct > > > nvkm_ltc **); > > > int gm200_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > > int gp100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > > int gp102_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > > +int gp10b_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > > #endif > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > > index c3c7159f3411..d2d6d5f4028a 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > > @@ -2380,7 +2380,7 @@ nv13b_chipset = { > > > .fuse = gm107_fuse_new, > > > .ibus = gp10b_ibus_new, > > > .imem = 
gk20a_instmem_new, > > > - .ltc = gp102_ltc_new, > > > + .ltc = gp10b_ltc_new, > > > .mc = gp10b_mc_new, > > > .mmu = gp10b_mmu_new, > > > .secboot = gp10b_secboot_new, > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > > b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > > index 2b6d36ea7067..728d75010847 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > > @@ -6,3 +6,4 @@ nvkm-y += nvkm/subdev/ltc/gm107.o > > > nvkm-y += nvkm/subdev/ltc/gm200.o > > > nvkm-y += nvkm/subdev/ltc/gp100.o > > > nvkm-y += nvkm/subdev/ltc/gp102.o > > > +nvkm-y += nvkm/subdev/ltc/gp10b.o > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > new file mode 100644 > > > index ..4d27c6ea1552 > > > --- /dev/null > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > @@ -0,0 +1,69 @@ > > > +/* > > > + * Copyright (c) 2019 NVIDIA Corporation. > > > + * > > > + * Permission is hereby granted, free of charge, to any person obtaining > > > a > > > + * copy of this software and associated documentation files (the > > > "Software"), > > > + * to deal in the Software without restriction, including without > > > limitation > > > + * the rights to use, copy, modify, merge, publish, distribute, > > > sublicense, > > > + * and/or sell copies of the Software, and to permit persons to whom the > > > + * Software is furnished to do so, subject to the following conditions: > > > + * > > > + * The above copyright notice and this permission notice shall be > > > included in > > > + * all copies or substantial portions of the Software. > > > + * > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > > > EXPRESS OR > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > > > MERCHANTABILITY, > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT > > > SHALL > > > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAM
Re: [Nouveau] [PATCH 04/11] drm/nouveau: gp10b: Add custom L2 cache implementation
On Mon, Sep 16, 2019 at 04:35:30PM +0100, Ben Dooks wrote: > On 16/09/2019 16:04, Thierry Reding wrote: > > From: Thierry Reding > > > > There are extra registers that need to be programmed to make the level 2 > > cache work on GP10B, such as the stream ID register that is used when an > > SMMU is used to translate memory addresses. > > > > Signed-off-by: Thierry Reding > > --- > > .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + > > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- > > .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + > > .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 69 +++ > > .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + > > 5 files changed, 74 insertions(+), 1 deletion(-) > > create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > index 644d527c3b96..d76f60d7d29a 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h > > @@ -40,4 +40,5 @@ int gm107_ltc_new(struct nvkm_device *, int, struct > > nvkm_ltc **); > > int gm200_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > int gp100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > int gp102_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > +int gp10b_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); > > #endif > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > index c3c7159f3411..d2d6d5f4028a 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c > > @@ -2380,7 +2380,7 @@ nv13b_chipset = { > > .fuse = gm107_fuse_new, > > .ibus = gp10b_ibus_new, > > .imem = gk20a_instmem_new, > > - .ltc = gp102_ltc_new, > > + .ltc = gp10b_ltc_new, > > .mc = gp10b_mc_new, > > .mmu = gp10b_mmu_new, > > .secboot = 
gp10b_secboot_new, > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > index 2b6d36ea7067..728d75010847 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild > > @@ -6,3 +6,4 @@ nvkm-y += nvkm/subdev/ltc/gm107.o > > nvkm-y += nvkm/subdev/ltc/gm200.o > > nvkm-y += nvkm/subdev/ltc/gp100.o > > nvkm-y += nvkm/subdev/ltc/gp102.o > > +nvkm-y += nvkm/subdev/ltc/gp10b.o > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > new file mode 100644 > > index ..4d27c6ea1552 > > --- /dev/null > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c > > @@ -0,0 +1,69 @@ > > +/* > > + * Copyright (c) 2019 NVIDIA Corporation. > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the > > "Software"), > > + * to deal in the Software without restriction, including without > > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > > + * and/or sell copies of the Software, and to permit persons to whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice shall be included > > in > > + * all copies or substantial portions of the Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > > OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL > > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > > + * OTHER DEALINGS IN THE SOFTWARE. > > + * > > + * Authors: Thierry Reding > > + */ > > + > > +#include "priv.h" > > + > > +static void > > +gp10b_ltc_init(struct nvkm_ltc *ltc) > > +{ > > + struct nvkm_device *device = ltc->subdev.device; > > +#ifdef CONFIG_IOMMU_API > > + struct iommu_fwspec *spec;
[Nouveau] [PATCH 0/6] drm/nouveau: Preparatory work for GV11B support
From: Thierry Reding

Hi Ben,

these are a couple of patches that are in preparation for adding GV11B
support. The fundamental issue that these are trying to solve is that
the GV11B is the first Tegra incarnation of the GPU where the aperture
really matters. All prior generations would accept any of them.

For dGPUs we usually allocate memory in VRAM, so the default aperture
(0) is correct. However, on Tegra the buffers are allocated in system
memory, and since the GPU actually cares about the aperture, we need to
ensure that the aperture field is written in all the necessary places.

This series of patches does three things: the first two patches make it
easier to debug aperture related faults by actually reading the aperture
information from the fault information registers. The second patch is
actually only a small cleanup. Patches 3-5 unify the aperture values.
All generations have the same definitions for these, so there's little
use in separating them out into callbacks. Finally, patch 6 writes the
aperture field in the places where required.

I've used these patches to test my initial support for GV11B. This is
enough to get me through the driver probe without any faults, but I have
not made much progress on secboot support yet, so I can't use the GV11B
to do anything very interesting yet.

I should also note that this is completely untested on dGPU because I
don't currently have a way of testing them. I'm working on that, but in
the meantime it'd be great if somebody could give this set a quick spin
on a dGPU to confirm that these don't break.

Thierry

Thierry Reding (6):
  drm/nouveau: fault: Store aperture in fault information
  drm/nouveau: fault: Widen engine field
  drm/nouveau: Remove bogus gk20a aperture callback
  drm/nouveau: Implement nvkm_memory_aperture()
  drm/nouveau: Remove unused nvkm_vmm_func->aper() implementations
  drm/nouveau: Program aperture field where necessary

 .../drm/nouveau/include/nvkm/core/memory.h    | 28 +++
 .../drm/nouveau/include/nvkm/subdev/fault.h   |  1 +
 .../gpu/drm/nouveau/nvkm/engine/fifo/gk104.c  |  3 +-
 .../nouveau/nvkm/engine/fifo/gpfifogk104.c    |  7 +++--
 .../nouveau/nvkm/engine/fifo/gpfifogv100.c    |  5 ++--
 .../gpu/drm/nouveau/nvkm/engine/fifo/gv100.c  |  7 -
 .../gpu/drm/nouveau/nvkm/subdev/bar/gf100.c   |  3 +-
 .../gpu/drm/nouveau/nvkm/subdev/fault/gv100.c |  3 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |  3 --
 .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c    | 21 ++
 .../drm/nouveau/nvkm/subdev/mmu/vmmgk104.c    |  2 --
 .../drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c    | 12
 .../drm/nouveau/nvkm/subdev/mmu/vmmgm200.c    |  2 --
 .../drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c    |  2 --
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |  8 ++
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c    |  1 -
 .../drm/nouveau/nvkm/subdev/mmu/vmmgv100.c    |  1 -
 .../drm/nouveau/nvkm/subdev/mmu/vmmtu102.c    |  1 -
 18 files changed, 55 insertions(+), 55 deletions(-)

-- 
2.23.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
[Nouveau] [PATCH 6/6] drm/nouveau: Program aperture field where necessary
From: Thierry Reding Some registers and instance block entries need the aperture to be programmed correctly. This is important on recent Tegra GPUs where the GPU actually checks the value of this field and faults if an invalid aperture is programmed. For example GV11B no longer supports VRAM and all memory is already allocated from system (coherent or non-coherent), so make sure to also program the right aperture. Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c | 7 +-- drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogv100.c | 5 +++-- drivers/gpu/drm/nouveau/nvkm/engine/fifo/gv100.c | 7 ++- drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c| 3 ++- 4 files changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c index 728a1edbf98c..843ebb41dbc6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c @@ -201,6 +201,7 @@ gk104_fifo_gpfifo_fini(struct nvkm_fifo_chan *base) void gk104_fifo_gpfifo_init(struct nvkm_fifo_chan *base) { + u32 aperture = nvkm_memory_aperture(base->inst->memory) << 28; struct gk104_fifo_chan *chan = gk104_fifo_chan(base); struct gk104_fifo *fifo = chan->fifo; struct nvkm_device *device = fifo->base.engine.subdev.device; @@ -208,7 +209,7 @@ gk104_fifo_gpfifo_init(struct nvkm_fifo_chan *base) u32 coff = chan->base.chid * 8; nvkm_mask(device, 0x84 + coff, 0x000f, chan->runl << 16); - nvkm_wr32(device, 0x80 + coff, 0x8000 | addr); + nvkm_wr32(device, 0x80 + coff, 0x8000 | aperture | addr); if (list_empty(>head) && !chan->killed) { gk104_fifo_runlist_insert(fifo, chan); @@ -250,6 +251,7 @@ gk104_fifo_gpfifo_new_(struct gk104_fifo *fifo, u64 *runlists, u16 *chid, unsigned long engm; u64 subdevs = 0; u64 usermem; + u32 target; if (!vmm || runlist < 0 || runlist >= fifo->runlist_nr) return -EINVAL; @@ -303,10 +305,11 @@ 
gk104_fifo_gpfifo_new_(struct gk104_fifo *fifo, u64 *runlists, u16 *chid, nvkm_wo32(fifo->user.mem, usermem + i, 0x); nvkm_done(fifo->user.mem); usermem = nvkm_memory_addr(fifo->user.mem) + usermem; + target = nvkm_memory_aperture(fifo->user.mem); /* RAMFC */ nvkm_kmap(chan->base.inst); - nvkm_wo32(chan->base.inst, 0x08, lower_32_bits(usermem)); + nvkm_wo32(chan->base.inst, 0x08, lower_32_bits(usermem) | target); nvkm_wo32(chan->base.inst, 0x0c, upper_32_bits(usermem)); nvkm_wo32(chan->base.inst, 0x10, 0xface); nvkm_wo32(chan->base.inst, 0x30, 0xf902); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogv100.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogv100.c index a7462cf59d65..97d084ffcfd5 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogv100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogv100.c @@ -132,7 +132,7 @@ gv100_fifo_gpfifo_new_(const struct nvkm_fifo_chan_func *func, unsigned long engm; u64 subdevs = 0; u64 usermem, mthd; - u32 size; + u32 size, target; if (!vmm || runlist < 0 || runlist >= fifo->runlist_nr) return -EINVAL; @@ -183,6 +183,7 @@ gv100_fifo_gpfifo_new_(const struct nvkm_fifo_chan_func *func, nvkm_wo32(fifo->user.mem, usermem + i, 0x); nvkm_done(fifo->user.mem); usermem = nvkm_memory_addr(fifo->user.mem) + usermem; + target = nvkm_memory_target(fifo->user.mem); /* Allocate fault method buffer (magics come from nvgpu). 
*/ size = nvkm_rd32(device, 0x104028); /* NV_PCE_PCE_MAP */ @@ -200,7 +201,7 @@ gv100_fifo_gpfifo_new_(const struct nvkm_fifo_chan_func *func, /* RAMFC */ nvkm_kmap(chan->base.inst); - nvkm_wo32(chan->base.inst, 0x008, lower_32_bits(usermem)); + nvkm_wo32(chan->base.inst, 0x008, lower_32_bits(usermem) | target); nvkm_wo32(chan->base.inst, 0x00c, upper_32_bits(usermem)); nvkm_wo32(chan->base.inst, 0x010, 0xface); nvkm_wo32(chan->base.inst, 0x030, 0x7902); diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gv100.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gv100.c index 6ee1bb32a071..449f669f43b0 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gv100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gv100.c @@ -32,11 +32,16 @@ void gv100_fifo_runlist_chan(struct gk104_fifo_chan *chan, struct nvkm_memory *memory, u32 offset) { + struct nvkm_memory *instmem = chan->base.inst->memory; struct nvkm_memory *usermem = cha
[Nouveau] [PATCH 5/6] drm/nouveau: Remove unused nvkm_vmm_func->aper() implementations
From: Thierry Reding These implementations are now all unused. Remove them. Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 2 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 14 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c | 2 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c | 2 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c | 2 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c | 2 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 1 - drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c | 1 - drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c | 1 - drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.c | 1 - 10 files changed, 28 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h index 9862f44ac8b5..767870c2d24c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h @@ -140,7 +140,6 @@ struct nvkm_vmm_func { int (*join)(struct nvkm_vmm *, struct nvkm_memory *inst); void (*part)(struct nvkm_vmm *, struct nvkm_memory *inst); - int (*aper)(enum nvkm_memory_target); int (*valid)(struct nvkm_vmm *, void *argv, u32 argc, struct nvkm_vmm_map *); void (*flush)(struct nvkm_vmm *, int depth); @@ -206,7 +205,6 @@ int gf100_vmm_new_(const struct nvkm_vmm_func *, const struct nvkm_vmm_func *, int gf100_vmm_join_(struct nvkm_vmm *, struct nvkm_memory *, u64 base); int gf100_vmm_join(struct nvkm_vmm *, struct nvkm_memory *); void gf100_vmm_part(struct nvkm_vmm *, struct nvkm_memory *); -int gf100_vmm_aper(enum nvkm_memory_target); int gf100_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *); void gf100_vmm_flush(struct nvkm_vmm *, int); void gf100_vmm_invalidate(struct nvkm_vmm *, u32 type); diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c index ffa64c0d3eda..ccf5a92d7b54 100644 --- 
a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c @@ -318,18 +318,6 @@ gf100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc, return 0; } -int -gf100_vmm_aper(enum nvkm_memory_target target) -{ - switch (target) { - case NVKM_MEM_TARGET_VRAM: return 0; - case NVKM_MEM_TARGET_HOST: return 2; - case NVKM_MEM_TARGET_NCOH: return 3; - default: - return -EINVAL; - } -} - void gf100_vmm_part(struct nvkm_vmm *vmm, struct nvkm_memory *inst) { @@ -370,7 +358,6 @@ static const struct nvkm_vmm_func gf100_vmm_17 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gf100_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, @@ -385,7 +372,6 @@ static const struct nvkm_vmm_func gf100_vmm_16 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gf100_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c index 0b59c01fd146..8efd147fa930 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c @@ -68,7 +68,6 @@ static const struct nvkm_vmm_func gk104_vmm_17 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gf100_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, @@ -83,7 +82,6 @@ static const struct nvkm_vmm_func gk104_vmm_16 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gf100_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c index 999b953505b3..774b6fe9d4a9 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c +++ 
b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c @@ -45,7 +45,6 @@ static const struct nvkm_vmm_func gk20a_vmm_17 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gk20a_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, @@ -60,7 +59,6 @@ static const struct nvkm_vmm_func gk20a_vmm_16 = { .join = gf100_vmm_join, .part = gf100_vmm_part, - .aper = gf100_vmm_aper, .valid = gk20a_vmm_valid, .flush = gf100_vmm_flush, .invalidate_pdb = gf100_vmm_invalidate_pdb, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c b/drivers/gpu/drm
[Nouveau] [PATCH 4/6] drm/nouveau: Implement nvkm_memory_aperture()
From: Thierry Reding

The aperture of a buffer is always specific to where its memory was
allocated from. Furthermore, the encoding of the aperture is always the
same, regardless of GPU generation. Implement the memory target to
aperture conversion in one central place and make the aperture
independent of the VMM.

Note that we no longer return a negative error code for unsupported
apertures. First, this should never happen to begin with and is a
programming error, which is why we have a WARN already. Second, the
standard aperture (0, VRAM) should be correct for the vast majority of
memory objects. Lastly, the aperture also needs to be programmed into
many registers and instance blocks. Having to check for error codes at
every step of the way would make this very unwieldy. If in any case
there is ever a problem with the aperture being wrong, let us rely on
the WARN to tell us about it.

Signed-off-by: Thierry Reding
---
 .../drm/nouveau/include/nvkm/core/memory.h    | 28 +++
 .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c    |  7 ++---
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |  7 ++---
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h b/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
index b23bf6109f2d..29c60fbed167 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/core/memory.h
@@ -64,6 +64,34 @@ void nvkm_memory_tags_put(struct nvkm_memory *, struct nvkm_device *,
 #define nvkm_memory_map(p,o,vm,va,av,ac) \
 	(p)->func->map((p),(o),(vm),(va),(av),(ac))
 
+static inline u32
+nvkm_memory_aperture(struct nvkm_memory *mem)
+{
+	enum nvkm_memory_target target = nvkm_memory_target(mem);
+
+	switch (target) {
+	case NVKM_MEM_TARGET_VRAM:
+		return 0;
+
+	case NVKM_MEM_TARGET_HOST:
+		return 2;
+
+	case NVKM_MEM_TARGET_NCOH:
+		return 3;
+
+	default:
+		break;
+	}
+
+	/*
+	 * This is invalid, so warn about this loudly. However, return 0 to
+	 * avoid writing garbage into registers. 0 is the VRAM aperture and
+	 * might still work in most cases.
+	 */
+	WARN(1, "invalid memory target: %d\n", target);
+	return 0;
+}
+
 /* accessor macros - kmap()/done() must bracket use of the other accessor
  * macros to guarantee correct behaviour across all chipsets
  */
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
index ab6424faf84c..ffa64c0d3eda 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
@@ -248,8 +248,9 @@ gf100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 	struct nvkm_device *device = vmm->mmu->subdev.device;
 	struct nvkm_memory *memory = map->memory;
 	u8 kind, priv, ro, vol;
-	int kindn, aper, ret = -ENOSYS;
+	int kindn, ret = -ENOSYS;
 	const u8 *kindm;
+	u32 aper;
 
 	map->next = (1 << page->shift) >> 8;
 	map->type = map->ctag = 0;
@@ -270,9 +271,7 @@ gf100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 		return ret;
 	}
 
-	aper = vmm->func->aper(target);
-	if (WARN_ON(aper < 0))
-		return aper;
+	aper = nvkm_memory_aperture(map->memory);
 
 	kindm = vmm->mmu->func->kind(vmm->mmu, );
 	if (kind >= kindn || kindm[kind] == 0xff) {
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
index b4f519768d5e..4a1a658328e5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
@@ -321,8 +321,9 @@ gp100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 	struct nvkm_device *device = vmm->mmu->subdev.device;
 	struct nvkm_memory *memory = map->memory;
 	u8 kind, priv, ro, vol;
-	int kindn, aper, ret = -ENOSYS;
+	int kindn, ret = -ENOSYS;
 	const u8 *kindm;
+	u32 aper;
 
 	map->next = (1ULL << page->shift) >> 4;
 	map->type = 0;
@@ -343,9 +344,7 @@ gp100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 		return ret;
 	}
 
-	aper = vmm->func->aper(target);
-	if (WARN_ON(aper < 0))
-		return aper;
+	aper = nvkm_memory_aperture(map->memory);
 
 	kindm = vmm->mmu->func->kind(vmm->mmu, );
 	if (kind >= kindn || kindm[kind] == 0xff) {
-- 
2.23.0
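
As a rough illustration of the conversion that this patch centralizes,
here is a minimal user-space sketch. The enum, the function names and
the CHAN_FLAG bit are hypothetical stand-ins and not the kernel's
definitions. Only the 0/2/3 encodings and the shift by 28 (used in
patch 6/6 of this series) are taken from the patches themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's enum nvkm_memory_target. */
enum mem_target {
	MEM_TARGET_VRAM,
	MEM_TARGET_HOST,
	MEM_TARGET_NCOH,
	MEM_TARGET_UNKNOWN,
};

/* Aperture encodings from the patch: VRAM = 0, HOST = 2, NCOH = 3. */
static uint32_t target_to_aperture(enum mem_target target)
{
	switch (target) {
	case MEM_TARGET_VRAM:
		return 0;
	case MEM_TARGET_HOST:
		return 2;
	case MEM_TARGET_NCOH:
		return 3;
	default:
		/* The kernel helper WARNs here; this sketch only falls back. */
		return 0;
	}
}

/* Assumed placeholder for a flag in bit 31; not a real register define. */
#define CHAN_FLAG 0x80000000u

/* Fold the aperture into a 32-bit word at bit 28, next to an address. */
static uint32_t encode_inst_word(enum mem_target target, uint32_t addr)
{
	uint32_t aperture = target_to_aperture(target) << 28;

	/* The address must not spill into the aperture field. */
	assert((addr >> 28) == 0);
	return CHAN_FLAG | aperture | addr;
}
```

Because the fallback returns 0 instead of an error, callers such as
encode_inst_word() can use the result directly without an error path,
which is the trade-off the commit message argues for.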
[Nouveau] [PATCH 1/6] drm/nouveau: fault: Store aperture in fault information
From: Thierry Reding

The fault information register contains data about the aperture that
caused the failure. This can be useful in debugging aperture related
programming bugs.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h | 1 +
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c    | 3 ++-
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c   | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
index 97322f95b3ee..1cc862bc1122 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
@@ -21,6 +21,7 @@ struct nvkm_fault_data {
 	u64 addr;
 	u64 inst;
 	u64 time;
+	u8 aperture;
 	u8 engine;
 	u8 valid;
 	u8 gpc;
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
index 5d4b695cab8e..81cbe1cc4804 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c
@@ -519,9 +519,10 @@ gk104_fifo_fault(struct nvkm_fifo *base, struct nvkm_fault_data *info)
 	chan = nvkm_fifo_chan_inst_locked(>base, info->inst);
 
 	nvkm_error(subdev,
-		   "fault %02x [%s] at %016llx engine %02x [%s] client %02x "
+		   "fault %02x [%s] at %016llx aperture %02x engine %02x [%s] client %02x "
 		   "[%s%s] reason %02x [%s] on channel %d [%010llx %s]\n",
 		   info->access, ea ? ea->name : "", info->addr,
+		   info->aperture,
 		   info->engine, ee ? ee->name : en,
 		   info->client, ct, ec ? ec->name : "",
 		   info->reason, er ? er->name : "", chan ? chan->chid : -1,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
index 6747f09c2dc3..b5e32295237b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
@@ -138,6 +138,7 @@ gv100_fault_intr_fault(struct nvkm_fault *fault)
 	info.inst = ((u64)insthi << 32) | (info0 & 0xf000);
 	info.time = 0;
 	info.engine = (info0 & 0x00ff);
+	info.aperture = (info0 & 0x0c00) >> 10;
 	info.valid = (info1 & 0x8000) >> 31;
 	info.gpc = (info1 & 0x1f00) >> 24;
 	info.hub = (info1 & 0x0010) >> 20;
-- 
2.23.0
[Nouveau] [PATCH 3/6] drm/nouveau: Remove bogus gk20a aperture callback
From: Thierry Reding

The gk20a (as well as all subsequent Tegra instantiations of the GPU) do
in fact use the same apertures as regular GPUs. Prior to gv11b there are
no checks in hardware for the aperture, so we get away with setting VRAM
as the aperture for buffers that are actually in system memory.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h      |  1 -
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c | 10 --
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c |  4 ++--
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c |  2 +-
 4 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
index fb3a9e8bb9cd..9862f44ac8b5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
@@ -212,7 +212,6 @@ void gf100_vmm_flush(struct nvkm_vmm *, int);
 void gf100_vmm_invalidate(struct nvkm_vmm *, u32 type);
 void gf100_vmm_invalidate_pdb(struct nvkm_vmm *, u64 addr);
 
-int gk20a_vmm_aper(enum nvkm_memory_target);
 int gk20a_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
 
 int gm200_vmm_new_(const struct nvkm_vmm_func *, const struct nvkm_vmm_func *,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
index 16d7bf727292..999b953505b3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
@@ -25,16 +25,6 @@
 
 #include
 
-int
-gk20a_vmm_aper(enum nvkm_memory_target target)
-{
-	switch (target) {
-	case NVKM_MEM_TARGET_NCOH: return 0;
-	default:
-		return -EINVAL;
-	}
-}
-
 int
 gk20a_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 		struct nvkm_vmm_map *map)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
index 7a6066d886cd..f5d7819c4a40 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
@@ -25,7 +25,7 @@ static const struct nvkm_vmm_func
 gm20b_vmm_17 = {
 	.join = gm200_vmm_join,
 	.part = gf100_vmm_part,
-	.aper = gk20a_vmm_aper,
+	.aper = gf100_vmm_aper,
 	.valid = gk20a_vmm_valid,
 	.flush = gf100_vmm_flush,
 	.invalidate_pdb = gf100_vmm_invalidate_pdb,
@@ -41,7 +41,7 @@ static const struct nvkm_vmm_func
 gm20b_vmm_16 = {
 	.join = gm200_vmm_join,
 	.part = gf100_vmm_part,
-	.aper = gk20a_vmm_aper,
+	.aper = gf100_vmm_aper,
 	.valid = gk20a_vmm_valid,
 	.flush = gf100_vmm_flush,
 	.invalidate_pdb = gf100_vmm_invalidate_pdb,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
index 180c8f006e32..ffe84ea2f7d9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
@@ -43,7 +43,7 @@ static const struct nvkm_vmm_func
 gp10b_vmm = {
 	.join = gp100_vmm_join,
 	.part = gf100_vmm_part,
-	.aper = gk20a_vmm_aper,
+	.aper = gf100_vmm_aper,
 	.valid = gp10b_vmm_valid,
 	.flush = gp100_vmm_flush,
 	.mthd = gp100_vmm_mthd,
-- 
2.23.0
[Nouveau] [PATCH 2/6] drm/nouveau: fault: Widen engine field
From: Thierry Reding

The engine field in the FIFO fault information registers is actually 9
bits wide.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
index b5e32295237b..28306c5f6651 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
@@ -137,8 +137,8 @@ gv100_fault_intr_fault(struct nvkm_fault *fault)
 	info.addr = ((u64)addrhi << 32) | addrlo;
 	info.inst = ((u64)insthi << 32) | (info0 & 0xf000);
 	info.time = 0;
-	info.engine = (info0 & 0x00ff);
 	info.aperture = (info0 & 0x0c00) >> 10;
+	info.engine = (info0 & 0x01ff);
 	info.valid = (info1 & 0x8000) >> 31;
 	info.gpc = (info1 & 0x1f00) >> 24;
 	info.hub = (info1 & 0x0010) >> 20;
-- 
2.23.0
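
The two masks touched by patches 1/6 and 2/6 can be exercised in
isolation. This is a minimal sketch, assuming a hypothetical struct and
function name; only the mask values (0x1ff for the engine, 0xc00 shifted
right by 10 for the aperture) come from the diffs in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two fields decoded from info0. */
struct fault_fields {
	uint16_t engine;   /* 9-bit field, bits 0-8 */
	uint8_t aperture;  /* 2-bit field, bits 10-11 */
};

/* Extract the engine and aperture fields from the raw info0 word. */
static struct fault_fields decode_info0(uint32_t info0)
{
	struct fault_fields f;

	f.engine = (uint16_t)(info0 & 0x1ff);
	f.aperture = (uint8_t)((info0 & 0xc00) >> 10);

	/* A two-bit field can never exceed 3. */
	assert(f.aperture <= 3);
	return f;
}
```

The sketch stores the engine in a 16-bit field because a 9-bit value
does not fit in the u8 that struct nvkm_fault_data declares for it in
the context of patch 1/6. Whether the series widens that member
elsewhere is not visible in the diffs quoted here.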
[Nouveau] [PATCH 11/11] arm64: tegra: Enable SMMU for GPU on Tegra186
From: Thierry Reding

The GPU has a connection to the ARM SMMU found on Tegra186, which can be
used to support large pages. Make sure the GPU is attached to the SMMU
to take advantage of its capabilities.

Signed-off-by: Thierry Reding
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 47cd831fcf44..171fd4dfa58d 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1172,6 +1172,7 @@
 		status = "disabled";
 
 		power-domains = < TEGRA186_POWER_DOMAIN_GPU>;
+		iommus = < TEGRA186_SID_GPU>;
 	};
 
 	sysram@3000 {
-- 
2.23.0
[Nouveau] [PATCH 10/11] arm64: tegra: Enable GPU on Jetson TX2
From: Alexandre Courbot

Enable the GPU node for the Jetson TX2 board.

Signed-off-by: Alexandre Courbot
Signed-off-by: Thierry Reding
---
 arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts | 4
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts b/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
index bdace01561ba..6f7c7c4c5c29 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra186-p2771-.dts
@@ -276,6 +276,10 @@
 		};
 	};
 
+	gpu@1700 {
+		status = "okay";
+	};
+
 	gpio-keys {
 		compatible = "gpio-keys";
 
-- 
2.23.0
[Nouveau] [PATCH 07/11] drm/nouveau: gk20a: Implement custom MMU class
From: Thierry Reding

The GPU integrated in NVIDIA Tegra SoCs is connected to system memory
via two paths: one direct path to the memory controller and another path
that goes through a system MMU first. It's not typically necessary to go
through the system MMU because the GPU's MMU can already map buffers so
that they appear contiguous to the GPU.

However, in order to support big pages, the system MMU has to be used to
combine multiple small pages into one virtually contiguous chunk so that
the GPU can then treat that as a single big page.

In order to prepare for big page support, implement a custom MMU class
that takes care of setting the IOMMU bit when writing page tables, where
appropriate.

This is also necessary to make sure that Nouveau works correctly on
Tegra devices where the GPU is connected to a system MMU and that IOMMU
is used to back the DMA API. Currently Nouveau assumes that the DMA API
is never backed by an IOMMU, so accesses to DMA-mapped buffers fault
when this assumption suddenly no longer holds.

One situation where this can happen is on 32-bit Tegra SoCs where the
ARM architecture code automatically attaches the GPU with a DMA/IOMMU
domain. This is currently worked around by detaching the GPU from the
IOMMU domain at probe time. However, with Tegra186 and later this can
now also happen, but unfortunately no mechanism exists to detach from
the domain in the 64-bit ARM architecture code.

Using this Tegra-specific MMU class ensures that DMA-mapped buffers are
properly mapped (with the IOMMU bit set) if the DMA API is backed by an
IOMMU domain.

Signed-off-by: Thierry Reding --- .../gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c | 50 ++- .../gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h | 44 .../gpu/drm/nouveau/nvkm/subdev/mmu/gm20b.c | 6 ++- .../gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c | 4 +- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 + .../drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c| 22 +++- .../drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c| 4 +- .../drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c| 20 +++- 8 files changed, 142 insertions(+), 9 deletions(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c index ac74965a60d4..d9a5e05b7dc7 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.c @@ -19,11 +19,59 @@ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. */ + +#include "gk20a.h" #include "mem.h" #include "vmm.h" +#include #include +static void +gk20a_mmu_ctor(const struct nvkm_mmu_func *func, struct nvkm_device *device, + int index, struct gk20a_mmu *mmu) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(device->dev); + struct nvkm_device_tegra *tegra = device->func->tegra(device); + + nvkm_mmu_ctor(func, device, index, >base); + + /* +* If the DMA API is backed by an IOMMU, make sure the IOMMU bit is +* set for all buffer accesses. If the IOMMU is explicitly used, it +* is only used for instance blocks and the MMU doesn't care, since +* buffer objects are only mapped through the MMU, not through the +* IOMMU. +* +* Big page support could be implemented using explicit IOMMU usage, +* but the DMA API already provides that for free, so we don't worry +* about it for now. 
+*/ + if (domain && !tegra->iommu.domain) { + mmu->iommu_mask = BIT_ULL(tegra->func->iommu_bit); + nvkm_debug(>base.subdev, "IOMMU mask: %llx\n", + mmu->iommu_mask); + } +} + +int +gk20a_mmu_new_(const struct nvkm_mmu_func *func, struct nvkm_device *device, + int index, struct nvkm_mmu **pmmu) +{ + struct gk20a_mmu *mmu; + + mmu = kzalloc(sizeof(*mmu), GFP_KERNEL); + if (!mmu) + return -ENOMEM; + + gk20a_mmu_ctor(func, device, index, mmu); + + if (pmmu) + *pmmu = >base; + + return 0; +} + static const struct nvkm_mmu_func gk20a_mmu = { .dma_bits = 40, @@ -37,5 +85,5 @@ gk20a_mmu = { int gk20a_mmu_new(struct nvkm_device *device, int index, struct nvkm_mmu **pmmu) { - return nvkm_mmu_new_(_mmu, device, index, pmmu); + return gk20a_mmu_new_(_mmu, device, index, pmmu); } diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h new file mode 100644 index ..bb81fc62509c --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gk20a.h @@ -0,0 +1,44 @@ +/* + * Copyright (c) 2019 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtain
[Nouveau] [PATCH 09/11] drm/nouveau: tegra: Fall back to 32-bit DMA mask without IOMMU
From: Thierry Reding The GPU can usually address more than 32-bit, even without being attached to an IOMMU. However, if the GPU is not attached to an IOMMU, it's likely that there is no IOMMU in the system, in which case any buffers allocated by Nouveau will likely end up in a region of memory that cannot be accessed by host1x. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/engine/device/tegra.c| 111 +++--- 1 file changed, 70 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c index fc652aaa41c7..221238a2cf53 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c @@ -97,7 +97,7 @@ nvkm_device_tegra_power_down(struct nvkm_device_tegra *tdev) return 0; } -static void +static int nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev) { #if IS_ENABLED(CONFIG_IOMMU_API) @@ -111,47 +111,65 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev) * IOMMU. */ if (iommu_get_domain_for_dev(dev)) - return; + return -ENODEV; if (!tdev->func->iommu_bit) - return; + return -ENODEV; + + if (!iommu_present(_bus_type)) + return -ENODEV; mutex_init(>iommu.mutex); - if (iommu_present(_bus_type)) { - tdev->iommu.domain = iommu_domain_alloc(_bus_type); - if (!tdev->iommu.domain) - goto error; + tdev->iommu.domain = iommu_domain_alloc(_bus_type); + if (!tdev->iommu.domain) + return -ENOMEM; - /* -* A IOMMU is only usable if it supports page sizes smaller -* or equal to the system's PAGE_SIZE, with a preference if -* both are equal. 
-*/ - pgsize_bitmap = tdev->iommu.domain->ops->pgsize_bitmap; - if (pgsize_bitmap & PAGE_SIZE) { - tdev->iommu.pgshift = PAGE_SHIFT; - } else { - tdev->iommu.pgshift = fls(pgsize_bitmap & ~PAGE_MASK); - if (tdev->iommu.pgshift == 0) { - dev_warn(dev, "unsupported IOMMU page size\n"); - goto free_domain; - } - tdev->iommu.pgshift -= 1; + /* +* An IOMMU is only usable if it supports page sizes smaller or equal +* to the system's PAGE_SIZE, with a preference if both are equal. +*/ + pgsize_bitmap = tdev->iommu.domain->ops->pgsize_bitmap; + if (pgsize_bitmap & PAGE_SIZE) { + tdev->iommu.pgshift = PAGE_SHIFT; + } else { + tdev->iommu.pgshift = fls(pgsize_bitmap & ~PAGE_MASK); + if (tdev->iommu.pgshift == 0) { + dev_warn(dev, "unsupported IOMMU page size\n"); + ret = -ENOTSUPP; + goto free_domain; } - ret = iommu_attach_device(tdev->iommu.domain, dev); - if (ret) - goto free_domain; + tdev->iommu.pgshift -= 1; + } - ret = nvkm_mm_init(>iommu.mm, 0, 0, - (1ULL << tdev->func->iommu_bit) >> - tdev->iommu.pgshift, 1); - if (ret) - goto detach_device; + ret = iommu_attach_device(tdev->iommu.domain, dev); + if (ret) { + dev_warn(dev, "failed to attach to IOMMU: %d\n", ret); + goto free_domain; + } + + ret = nvkm_mm_init(>iommu.mm, 0, 0, + (1ULL << tdev->func->iommu_bit) >> + tdev->iommu.pgshift, 1); + if (ret) { + dev_warn(dev, "failed to initialize IOVA space: %d\n", ret); + goto detach_device; + } + + /* +* The IOMMU bit defines the upper limit of the GPU-addressable space. +*/ + ret = dma_set_mask(dev, DMA_BIT_MASK(tdev->func->iommu_bit)); + if (ret) { + dev_warn(dev, "failed to set DMA mask: %d\n", ret); + goto fini_mm; } - return; + return 0; + +fini_mm: + nvkm_mm_fini(>iommu.mm); detach_device: iommu_detach_device(tdev->iommu.domain, dev); @@ -159,10 +177,15 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev) free_domain: iommu_domain_free(tdev->iommu.domain); -error: + /* reset these so that the DMA API code paths are executed */ tdev->iommu.domain = NULL;
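As an aside on the page-size selection in the hunk above, here is a minimal userspace sketch (not the driver code) of how the page shift falls out of pgsize_bitmap. The names fls_ul() and pick_iommu_pgshift() are made up for illustration, and fls_ul() only stands in for the kernel's fls(). The one point it tries to show is that ~PAGE_MASK equals PAGE_SIZE - 1, so the mask keeps only the IOMMU page sizes smaller than the system page size, and fls() then picks the largest of those.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's fls(): returns the 1-based index of
 * the highest set bit, or 0 if no bit is set.
 */
static int fls_ul(unsigned long x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}

	return n;
}

/*
 * Hypothetical helper mirroring the selection logic in the patch: prefer an
 * IOMMU page size equal to the system page size, otherwise fall back to the
 * largest supported size below it. Returns -1 if no such size exists.
 */
static int pick_iommu_pgshift(unsigned long pgsize_bitmap,
			      unsigned int page_shift)
{
	unsigned long page_size;
	int shift;

	assert(page_shift < 8 * sizeof(unsigned long));
	page_size = 1UL << page_shift;

	if (pgsize_bitmap & page_size)
		return (int)page_shift;

	/* ~PAGE_MASK == PAGE_SIZE - 1: keep only sizes below PAGE_SIZE */
	shift = fls_ul(pgsize_bitmap & (page_size - 1));
	if (shift == 0)
		return -1;

	return shift - 1;
}
```

With a 64 KiB system page size and an IOMMU that supports 4 KiB and 2 MiB pages, this picks a shift of 12. If the IOMMU only supports sizes larger than the system page size, it reports failure, which corresponds to the "unsupported IOMMU page size" warning in the patch.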
[Nouveau] [PATCH 05/11] drm/nouveau: gp10b: Use correct copy engine
From: Thierry Reding

gp10b uses the new engine enumeration mechanism introduced in the Pascal
architecture. As a result, the copy engine, which used to be at index 2
for prior Tegra GPU instantiations, has now moved to index 0. Fix up the
index and also use the gp100 variant of the copy engine class because on
gp10b the PASCAL_DMA_COPY_B class is not supported.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index d2d6d5f4028a..99d3fa3fad89 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2387,7 +2387,7 @@ nv13b_chipset = {
 	.pmu = gm20b_pmu_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[2] = gp102_ce_new,
+	.ce[0] = gp100_ce_new,
 	.dma = gf119_dma_new,
 	.fifo = gp10b_fifo_new,
 	.gr = gp10b_gr_new,
--
2.23.0
[Nouveau] [PATCH 04/11] drm/nouveau: gp10b: Add custom L2 cache implementation
From: Thierry Reding There are extra registers that need to be programmed to make the level 2 cache work on GP10B, such as the stream ID register that is used when an SMMU is used to translate memory addresses. Signed-off-by: Thierry Reding --- .../gpu/drm/nouveau/include/nvkm/subdev/ltc.h | 1 + .../gpu/drm/nouveau/nvkm/engine/device/base.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild| 1 + .../gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c | 69 +++ .../gpu/drm/nouveau/nvkm/subdev/ltc/priv.h| 2 + 5 files changed, 74 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h index 644d527c3b96..d76f60d7d29a 100644 --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/ltc.h @@ -40,4 +40,5 @@ int gm107_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gm200_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); int gp102_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); +int gp10b_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c index c3c7159f3411..d2d6d5f4028a 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c @@ -2380,7 +2380,7 @@ nv13b_chipset = { .fuse = gm107_fuse_new, .ibus = gp10b_ibus_new, .imem = gk20a_instmem_new, - .ltc = gp102_ltc_new, + .ltc = gp10b_ltc_new, .mc = gp10b_mc_new, .mmu = gp10b_mmu_new, .secboot = gp10b_secboot_new, diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild index 2b6d36ea7067..728d75010847 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/Kbuild @@ 
-6,3 +6,4 @@ nvkm-y += nvkm/subdev/ltc/gm107.o nvkm-y += nvkm/subdev/ltc/gm200.o nvkm-y += nvkm/subdev/ltc/gp100.o nvkm-y += nvkm/subdev/ltc/gp102.o +nvkm-y += nvkm/subdev/ltc/gp10b.o diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c new file mode 100644 index ..4d27c6ea1552 --- /dev/null +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/ltc/gp10b.c @@ -0,0 +1,69 @@ +/* + * Copyright (c) 2019 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: Thierry Reding + */ + +#include "priv.h" + +static void +gp10b_ltc_init(struct nvkm_ltc *ltc) +{ + struct nvkm_device *device = ltc->subdev.device; +#ifdef CONFIG_IOMMU_API + struct iommu_fwspec *spec; +#endif + + nvkm_wr32(device, 0x17e27c, ltc->ltc_nr); + nvkm_wr32(device, 0x17e000, ltc->ltc_nr); + nvkm_wr32(device, 0x100800, ltc->ltc_nr); + +#ifdef CONFIG_IOMMU_API + spec = dev_iommu_fwspec_get(device->dev); + if (spec) { + u32 sid = spec->ids[0] & 0x; + + /* stream ID */ + nvkm_wr32(device, 0x16, sid << 2); + } +#endif +} + +static const struct nvkm_ltc_func +gp10b_ltc = { + .oneinit = gp100_ltc_oneinit, + .init = gp10b_ltc_init, + .intr = gp100_ltc_intr, + .cbc_clear = gm107_ltc_cbc_clear, + .cbc_wait = gm107_ltc_cbc_wait, + .zbc = 16, + .zbc_clear_color = gm107_ltc_zbc_clear_color, + .zbc_clear_depth = gm107_ltc_zbc_clear_depth, + .zbc_clear_stencil = gp102_ltc_zbc_clear_stencil, + .invalidate = gf100_ltc_invalidate, + .flush = gf100_ltc_flush, +}; + +int +gp10b_ltc_new(struct nvkm_device *device, int index, struct nvkm_ltc **pltc) +{ +
[Nouveau] [PATCH 08/11] drm/nouveau: tegra: Skip IOMMU initialization if already attached
From: Thierry Reding

If the GPU is already attached to an IOMMU, don't detach it and set up
an explicit IOMMU domain. Since Nouveau can now properly handle the case
of the DMA API being backed by an IOMMU, just continue using the DMA
API.

Signed-off-by: Thierry Reding
---
 .../drm/nouveau/nvkm/engine/device/tegra.c| 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index d0d52c1d4aee..fc652aaa41c7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -23,10 +23,6 @@
 #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
 #include "priv.h"
 
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
-#include 
-#endif
-
 static int
 nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 {
@@ -109,14 +105,13 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
 	unsigned long pgsize_bitmap;
 	int ret;
 
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
-	if (dev->archdata.mapping) {
-		struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
-
-		arm_iommu_detach_device(dev);
-		arm_iommu_release_mapping(mapping);
-	}
-#endif
+	/*
+	 * Skip explicit IOMMU initialization if the GPU is already attached
+	 * to an IOMMU domain. This can happen if the DMA API is backed by an
+	 * IOMMU.
+	 */
+	if (iommu_get_domain_for_dev(dev))
+		return;
 
 	if (!tdev->func->iommu_bit)
 		return;
--
2.23.0
[Nouveau] [PATCH 06/11] drm/nouveau: gk20a: Set IOMMU bit for DMA API if appropriate
From: Thierry Reding Detect if the DMA API is backed by an IOMMU and set the IOMMU bit if so. This is needed to make sure IOMMU addresses are properly translated even if the explicit IOMMU API is not used. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/subdev/instmem/gk20a.c | 35 +-- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c index b0493f8df1fe..1120a2a7d5f1 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c @@ -100,12 +100,14 @@ struct gk20a_instmem { unsigned int vaddr_max; struct list_head vaddr_lru; + /* IOMMU mapping */ + unsigned int page_shift; + u64 iommu_mask; + /* Only used if IOMMU if present */ struct mutex *mm_mutex; struct nvkm_mm *mm; struct iommu_domain *domain; - unsigned long iommu_pgshift; - u16 iommu_bit; /* Only used by DMA API */ unsigned long attrs; @@ -357,12 +359,12 @@ gk20a_instobj_dtor_iommu(struct nvkm_memory *memory) mutex_unlock(>lock); /* clear IOMMU bit to unmap pages */ - r->offset &= ~BIT(imem->iommu_bit - imem->iommu_pgshift); + r->offset &= ~imem->iommu_mask; /* Unmap pages from GPU address space and free them */ for (i = 0; i < node->base.mn->length; i++) { iommu_unmap(imem->domain, - (r->offset + i) << imem->iommu_pgshift, PAGE_SIZE); + (r->offset + i) << imem->page_shift, PAGE_SIZE); dma_unmap_page(dev, node->dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL); __free_page(node->pages[i]); @@ -440,7 +442,7 @@ gk20a_instobj_ctor_dma(struct gk20a_instmem *imem, u32 npages, u32 align, /* present memory for being mapped using small pages */ node->r.type = 12; - node->r.offset = node->handle >> 12; + node->r.offset = imem->iommu_mask | node->handle >> 12; node->r.length = (npages << PAGE_SHIFT) >> 12; node->base.mn = >r; @@ -493,7 +495,7 @@ gk20a_instobj_ctor_iommu(struct gk20a_instmem *imem, u32 npages, u32 align, 
mutex_lock(imem->mm_mutex); /* Reserve area from GPU address space */ ret = nvkm_mm_head(imem->mm, 0, 1, npages, npages, - align >> imem->iommu_pgshift, ); + align >> imem->page_shift, ); mutex_unlock(imem->mm_mutex); if (ret) { nvkm_error(subdev, "IOMMU space is full!\n"); @@ -502,7 +504,7 @@ gk20a_instobj_ctor_iommu(struct gk20a_instmem *imem, u32 npages, u32 align, /* Map into GPU address space */ for (i = 0; i < npages; i++) { - u32 offset = (r->offset + i) << imem->iommu_pgshift; + u32 offset = (r->offset + i) << imem->page_shift; ret = iommu_map(imem->domain, offset, node->dma_addrs[i], PAGE_SIZE, IOMMU_READ | IOMMU_WRITE); @@ -518,7 +520,7 @@ gk20a_instobj_ctor_iommu(struct gk20a_instmem *imem, u32 npages, u32 align, } /* IOMMU bit tells that an address is to be resolved through the IOMMU */ - r->offset |= BIT(imem->iommu_bit - imem->iommu_pgshift); + r->offset |= imem->iommu_mask; node->base.mn = r; return 0; @@ -619,11 +621,12 @@ gk20a_instmem_new(struct nvkm_device *device, int index, imem->mm_mutex = >iommu.mutex; imem->mm = >iommu.mm; imem->domain = tdev->iommu.domain; - imem->iommu_pgshift = tdev->iommu.pgshift; - imem->iommu_bit = tdev->func->iommu_bit; + imem->page_shift = tdev->iommu.pgshift; nvkm_info(>base.subdev, "using IOMMU\n"); } else { + imem->page_shift = PAGE_SHIFT; + imem->attrs = DMA_ATTR_NON_CONSISTENT | DMA_ATTR_WEAK_ORDERING | DMA_ATTR_WRITE_COMBINE; @@ -631,5 +634,17 @@ gk20a_instmem_new(struct nvkm_device *device, int index, nvkm_info(>base.subdev, "using DMA API\n"); } + /* +* The IOMMU mask needs to be set if an IOMMU is used explicitly (via +* direct IOMMU API usage) or implicitly (via the DMA API). In both +* cases the device will have been attached to an IOMMU domain. +*/ + if (iommu_get_domain_for_dev(device->dev)) { + imem->iommu_mask = BIT_ULL(tdev->func->iommu_bit
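The patch above replaces the open-coded BIT(imem->iommu_bit - imem->iommu_pgshift) with a precomputed imem->iommu_mask. The exact expression that initializes the mask is cut off in this archived copy, so the following is only a userspace sketch that follows the removed lines: the mask is taken to be the IOMMU bit expressed in page units. The helper names and the value 34 for the IOMMU bit are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/*
 * Hypothetical helper: the IOMMU bit expressed in units of pages, as in the
 * removed expression BIT(imem->iommu_bit - imem->iommu_pgshift).
 */
static uint64_t iommu_mask_pages(unsigned int iommu_bit,
				 unsigned int page_shift)
{
	assert(iommu_bit >= page_shift && iommu_bit < 64);

	return BIT_ULL(iommu_bit - page_shift);
}

/* Mark a page offset as "to be resolved through the IOMMU". */
static uint64_t tag_offset(uint64_t offset, uint64_t mask)
{
	return offset | mask;
}

/* Clear the IOMMU bit again, as done before unmapping pages. */
static uint64_t untag_offset(uint64_t offset, uint64_t mask)
{
	return offset & ~mask;
}
```

With an assumed IOMMU bit of 34 and 4 KiB pages, the mask is bit 22 of the page offset. Shifting a tagged offset left by the page shift yields an address with bit 34 set, which is how the GPU is told to route the access through the IOMMU.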
[Nouveau] [PATCH 02/11] drm/nouveau: tegra: Set clock rate if not set
From: Thierry Reding

If the GPU clock has not had a rate set, initialize it to the maximum
clock rate to make sure it does run.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 747a775121cf..d0d52c1d4aee 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -279,6 +279,7 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		      struct nvkm_device **pdevice)
 {
 	struct nvkm_device_tegra *tdev;
+	unsigned long rate;
 	int ret;
 
 	if (!(tdev = kzalloc(sizeof(*tdev), GFP_KERNEL)))
@@ -307,6 +308,17 @@ nvkm_device_tegra_new(const struct nvkm_device_tegra_func *func,
 		goto free;
 	}
 
+	rate = clk_get_rate(tdev->clk);
+	if (rate == 0) {
+		ret = clk_set_rate(tdev->clk, ULONG_MAX);
+		if (ret < 0)
+			goto free;
+
+		rate = clk_get_rate(tdev->clk);
+
+		dev_dbg(>dev, "GPU clock set to %lu\n", rate);
+	}
+
 	if (func->require_ref_clk)
 		tdev->clk_ref = devm_clk_get(>dev, "ref");
 	if (IS_ERR(tdev->clk_ref)) {
--
2.23.0
[Nouveau] [PATCH 01/11] drm/nouveau: tegra: Avoid pulsing reset twice
From: Thierry Reding

When the GPU powergate is controlled by a generic power domain provider,
the reset will automatically be asserted and deasserted as part of the
power-ungating procedure.

On some Jetson TX2 boards, doing an additional assert and deassert of
the GPU outside of the power-ungate procedure can cause the GPU to go
into a bad state where the memory interface can no longer access system
memory.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index 0e372a190d3f..747a775121cf 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -52,18 +52,18 @@ nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
 	clk_set_rate(tdev->clk_pwr, 20400);
 	udelay(10);
 
-	reset_control_assert(tdev->rst);
-	udelay(10);
-
 	if (!tdev->pdev->dev.pm_domain) {
+		reset_control_assert(tdev->rst);
+		udelay(10);
+
 		ret = tegra_powergate_remove_clamping(TEGRA_POWERGATE_3D);
 		if (ret)
 			goto err_clamp;
 		udelay(10);
-	}
 
-	reset_control_deassert(tdev->rst);
-	udelay(10);
+		reset_control_deassert(tdev->rst);
+		udelay(10);
+	}
 
 	return 0;
 
--
2.23.0
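To make the control-flow change in the patch above easier to see, here is a small userspace model of it. This is only a sketch under stated assumptions: struct fake_gpu, record() and power_up_sketch() are invented for illustration and do not exist in the driver. The model logs which reset and clamp operations the driver itself would issue and leaves out the clock setup and delays.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the GPU state relevant to the reset sequence. */
struct fake_gpu {
	bool has_pm_domain;	/* a generic power domain manages the GPU */
	char log[64];		/* operations issued by the driver itself */
};

static void record(struct fake_gpu *gpu, const char *op)
{
	assert(strlen(gpu->log) + strlen(op) + 1 <= sizeof(gpu->log));
	strcat(gpu->log, op);
}

/*
 * After the patch, the driver only touches the reset line when it also has
 * to remove the clamps itself, i.e. when no power domain does it for us.
 */
static int power_up_sketch(struct fake_gpu *gpu)
{
	if (!gpu->has_pm_domain) {
		record(gpu, "assert,");
		record(gpu, "unclamp,");
		record(gpu, "deassert,");
	}

	return 0;
}
```

With a power domain present the log stays empty, so the only reset pulse is the one issued by the power domain provider during ungating. Without one, the driver performs the full assert, unclamp, deassert sequence as before.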
[Nouveau] [PATCH 03/11] drm/nouveau: secboot: Read WPR configuration from GPU registers
From: Thierry Reding The GPUs found on Tegra SoCs have registers that can be used to read the WPR configuration. Use these registers instead of reaching into the memory controller's register space to read the same information. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/subdev/secboot/gm200.h | 2 +- .../drm/nouveau/nvkm/subdev/secboot/gm20b.c | 81 --- .../drm/nouveau/nvkm/subdev/secboot/gp10b.c | 4 +- 3 files changed, 53 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h index 62c5e162099a..280b1448df88 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm200.h @@ -41,6 +41,6 @@ int gm200_secboot_run_blob(struct nvkm_secboot *, struct nvkm_gpuobj *, struct nvkm_falcon *); /* Tegra-only */ -int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *, u32); +int gm20b_secboot_tegra_read_wpr(struct gm200_secboot *); #endif diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c index df8b919dcf09..f8a543122219 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c @@ -23,39 +23,65 @@ #include "acr.h" #include "gm200.h" -#define TEGRA210_MC_BASE 0x70019000 - #ifdef CONFIG_ARCH_TEGRA -#define MC_SECURITY_CARVEOUT2_CFG0 0xc58 -#define MC_SECURITY_CARVEOUT2_BOM_00xc5c -#define MC_SECURITY_CARVEOUT2_BOM_HI_0 0xc60 -#define MC_SECURITY_CARVEOUT2_SIZE_128K0xc64 -#define TEGRA_MC_SECURITY_CARVEOUT_CFG_LOCKED (1 << 1) /** * gm20b_secboot_tegra_read_wpr() - read the WPR registers on Tegra * - * On dGPU, we can manage the WPR region ourselves, but on Tegra the WPR region - * is reserved from system memory by the bootloader and irreversibly locked. - * This function reads the address and size of the pre-configured WPR region. 
+ * On dGPU, we can manage the WPR region ourselves, but on Tegra this region + * is allocated from system memory by the secure firmware. The region is then + * marked as a "secure carveout" and irreversibly locked. Furthermore, the WPR + * secure carveout is also configured to be sent to the GPU via a dedicated + * serial bus between the memory controller and the GPU. The GPU requests this + * information upon leaving reset and exposes it through a FIFO register at + * offset 0x100cd4. + * + * The FIFO register's lower 4 bits can be used to set the read index into the + * FIFO. After each read of the FIFO register, the read index is incremented. + * + * Indices 2 and 3 contain the lower and upper addresses of the WPR. These are + * stored in units of 256 B. The WPR is inclusive of both addresses. + * + * Unfortunately, for some reason the WPR info register doesn't contain the + * correct values for the secure carveout. It seems like the upper address is + * always too small by 128 KiB - 1. Given that the secure carvout size in the + * memory controller configuration is specified in units of 128 KiB, it's + * possible that the computation of the upper address of the WPR is wrong and + * causes this difference. 
*/ int -gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb, u32 mc_base) +gm20b_secboot_tegra_read_wpr(struct gm200_secboot *gsb) { + struct nvkm_device *device = gsb->base.subdev.device; struct nvkm_secboot *sb = >base; - void __iomem *mc; - u32 cfg; + u64 base, limit; + u32 value; - mc = ioremap(mc_base, 0xd00); - if (!mc) { - nvkm_error(>subdev, "Cannot map Tegra MC registers\n"); - return -ENOMEM; - } - sb->wpr_addr = ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_0) | - ((u64)ioread32_native(mc + MC_SECURITY_CARVEOUT2_BOM_HI_0) << 32); - sb->wpr_size = ioread32_native(mc + MC_SECURITY_CARVEOUT2_SIZE_128K) - << 17; - cfg = ioread32_native(mc + MC_SECURITY_CARVEOUT2_CFG0); - iounmap(mc); + /* set WPR info register to point at WPR base address register */ + value = nvkm_rd32(device, 0x100cd4); + value &= ~0xf; + value |= 0x2; + nvkm_wr32(device, 0x100cd4, value); + + /* read base address */ + value = nvkm_rd32(device, 0x100cd4); + base = (u64)(value >> 4) << 12; + + /* read limit */ + value = nvkm_rd32(device, 0x100cd4); + limit = (u64)(value >> 4) << 12; + + /* +* The upper address of the WPR seems to be computed wrongly and is +* actually SZ_128K - 1 bytes lower than it should be. Adjust the +* value accordingly. +*/ + limit += SZ_128K - 1; + + sb->wpr_size = limit - ba
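The arithmetic in the hunk above can be checked in isolation. The following is a userspace sketch, not the driver code: the function names are invented, the register values used to exercise it are made-up examples, and the final size computation (which is cut off in this archived copy) is assumed to be inclusive of both addresses because the patch comment says the WPR is. Note that (value >> 4) << 12 is the same as masking off the low 4 index bits and multiplying by 256.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_128K 0x20000ULL

/*
 * Decode an address from a raw read of the WPR info FIFO register. The low
 * 4 bits hold the read index and are discarded.
 */
static uint64_t wpr_decode(uint32_t value)
{
	return (uint64_t)(value >> 4) << 12;
}

/*
 * Apply the workaround from the patch: the reported upper address appears to
 * be SZ_128K - 1 bytes too low.
 */
static uint64_t wpr_fixup_limit(uint64_t limit)
{
	return limit + SZ_128K - 1;
}

/* Assumption: size of a region that includes both base and limit. */
static uint64_t wpr_size_inclusive(uint64_t base, uint64_t limit)
{
	assert(limit >= base);

	return limit - base + 1;
}
```

For example, raw reads of 0x00800002 and 0x00800e03 decode to 0x80000000 and 0x800e0000. After the fixup the limit becomes 0x800fffff, which gives a 1 MiB region under the inclusive-size assumption.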
[Nouveau] [PATCH 2/2] drm/nouveau: tegra: Do not try to disable PCI device
From: Thierry Reding

When Nouveau is instantiated on top of a platform device, the dev->pdev
field will be NULL and calling pci_disable_device() will crash. Move the
PCI disabling code to the PCI specific driver removal code.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 2cd83849600f..b65ae817eabf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -715,7 +715,6 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
 void
 nouveau_drm_device_remove(struct drm_device *dev)
 {
-	struct pci_dev *pdev = dev->pdev;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nvkm_client *client;
 	struct nvkm_device *device;
@@ -727,7 +726,6 @@ nouveau_drm_device_remove(struct drm_device *dev)
 	device = nvkm_device_find(client->device);
 
 	nouveau_drm_device_fini(dev);
-	pci_disable_device(pdev);
 	drm_dev_put(dev);
 	nvkm_device_del();
 }
@@ -738,6 +736,7 @@ nouveau_drm_remove(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
 	nouveau_drm_device_remove(dev);
+	pci_disable_device(pdev);
 }
 
 static int
--
2.23.0
[Nouveau] [PATCH 1/2] drm/nouveau: tegra: Fix NULL pointer dereference
From: Thierry Reding Fill in BAR2 callbacks for instance memory. There's no BAR2 on Tegra GPUs, but buffers are all in system memory anyway, so just return the plain address. Signed-off-by: Thierry Reding --- .../drm/nouveau/nvkm/subdev/instmem/gk20a.c | 30 +++ 1 file changed, 30 insertions(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c index 985f2990ab0d..b0493f8df1fe 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c @@ -261,6 +261,34 @@ gk20a_instobj_release_iommu(struct nvkm_memory *memory) nvkm_ltc_invalidate(ltc); } +static u64 +gk20a_instobj_bar2_dma(struct nvkm_memory *memory) +{ + struct gk20a_instobj_dma *iobj = gk20a_instobj_dma(memory); + u64 addr = ~0ULL; + + if (gk20a_instobj_acquire_dma(>base.memory)) + addr = gk20a_instobj_addr(>base.memory); + + gk20a_instobj_release_dma(>base.memory); + + return addr; +} + +static u64 +gk20a_instobj_bar2_iommu(struct nvkm_memory *memory) +{ + struct gk20a_instobj_iommu *iobj = gk20a_instobj_iommu(memory); + u64 addr = ~0ULL; + + if (gk20a_instobj_acquire_iommu(>base.memory)) + addr = gk20a_instobj_addr(>base.memory); + + gk20a_instobj_release_iommu(>base.memory); + + return addr; +} + static u32 gk20a_instobj_rd32(struct nvkm_memory *memory, u64 offset) { @@ -353,6 +381,7 @@ static const struct nvkm_memory_func gk20a_instobj_func_dma = { .dtor = gk20a_instobj_dtor_dma, .target = gk20a_instobj_target, + .bar2 = gk20a_instobj_bar2_dma, .page = gk20a_instobj_page, .addr = gk20a_instobj_addr, .size = gk20a_instobj_size, @@ -365,6 +394,7 @@ static const struct nvkm_memory_func gk20a_instobj_func_iommu = { .dtor = gk20a_instobj_dtor_iommu, .target = gk20a_instobj_target, + .bar2 = gk20a_instobj_bar2_iommu, .page = gk20a_instobj_page, .addr = gk20a_instobj_addr, .size = gk20a_instobj_size, -- 2.23.0 ___ Nouveau mailing list Nouveau@lists.freedesktop.org 
[Nouveau] [PATCH 0/2] drm/nouveau: Two more fixes
From: Thierry Reding

Hi Ben,

I messed up the ordering of patches in my tree a bit, so these two fixes
got separated from the others. I don't consider these particularly
urgent because the crash that the first one fixes only happens on gp10b,
which we don't enable by default yet, and the second patch fixes a crash
that only happens on module unload (or driver unbind, more accurately),
which isn't a terribly common thing to do.

I'll be sending out fixes shortly to make the GP10B work properly on a
wider range of Jetson TX2 devices and enable it by default.

One thing to mention is that I'm not exactly sure if the first patch is
the right thing to do. I haven't seen any issues after that change, but
I'm also not exactly sure I understand what BAR2 is used for, so I don't
know if I would've even covered those code paths (other than the one
causing the crash at probe time) in my tests.

It'd be great to get Lyude's feedback on the second patch, since that
call to pci_disable_device() was rather oddly placed and I'm not sure if
that was essential for things to work or whether the slightly different
point in time where it's called after this patch is also okay. It looks
to me like it should work fine, but I don't currently have a way to test
this on desktop GPUs.

Thierry

Thierry Reding (2):
  drm/nouveau: tegra: Fix NULL pointer dereference
  drm/nouveau: tegra: Do not try to disable PCI device

 drivers/gpu/drm/nouveau/nouveau_drm.c | 3 +-
 .../drm/nouveau/nvkm/subdev/instmem/gk20a.c | 30 +++
 2 files changed, 31 insertions(+), 2 deletions(-)

--
2.23.0
[Nouveau] [PATCH 0/4] drm/nouveau: Miscellaneous fixes
From: Thierry Reding

Hi Ben,

these are fixes for a couple of issues that I've been running into when
testing on various Tegra boards.

The first two patches fix up issues in the fix that I had sent out
earlier to fix the regression introduced in drm-misc-next. The first one
is critical because it avoids a BUG_ON as reported by Ilia, while the
second is less critical, but restores the locking correctness (at least
to the best of my knowledge).

Patch 3 is something that I think was also caused by the reservation
object rework and is kind of a continuation of my earlier attempt to fix
the VMA node sharing breakage. The current ordering between TTM and GEM
teardown is causing a DEBUG_LOCKS_WARN_ON() because GEM cleanup already
freed a mutex that TTM teardown will still want to use.

Lastly, patch 4 is quite uncritical, but it's a one-line change that
avoids an ugly (but harmless) external memory address decode error on
Tegra210 and later. It seems that for some reason clearing this register
will cause a DMA operation to be started by the GPU. I've verified that
it's tied to exactly that register write by modifying the value written
to the register, and stalling for a couple of seconds after the register
write. The address decode error reflects the value written into this
register exactly and it always happens a couple of milliseconds after
this write.

Thierry

Thierry Reding (4):
  drm/nouveau: Fix fallout from reservation object rework
  drm/nouveau: prime: Extend DMA reservation object lock
  drm/nouveau: Fix ordering between TTM and GEM release
  drm/nouveau: gm20b: Avoid BAR1 teardown during init

 drivers/gpu/drm/nouveau/nouveau_bo.c | 26 +++---
 drivers/gpu/drm/nouveau/nouveau_bo.h | 4 +--
 drivers/gpu/drm/nouveau/nouveau_gem.c | 7 ++---
 drivers/gpu/drm/nouveau/nouveau_prime.c | 27 ---
 .../gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c | 1 -
 5 files changed, 39 insertions(+), 26 deletions(-)

--
2.23.0
[Nouveau] [PATCH 2/4] drm/nouveau: prime: Extend DMA reservation object lock
From: Thierry Reding

Prior to commit 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object"), the reservation object was locked across all of the buffer object creation. After splitting nouveau_bo_new() into separate nouveau_bo_alloc() and nouveau_bo_init() functions, the reservation object is passed to the latter, so the lock needs to be held across that function as well.

Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object")
Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_prime.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index 656c334ee7d9..bae6a3eccee0 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -60,6 +60,7 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev,
 							 struct sg_table *sg)
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
+	struct drm_gem_object *obj;
 	struct nouveau_bo *nvbo;
 	struct dma_resv *robj = attach->dmabuf->resv;
 	u64 size = attach->dmabuf->size;
@@ -71,9 +72,10 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev,
 
 	dma_resv_lock(robj, NULL);
 	nvbo = nouveau_bo_alloc(&drm->client, &size, &align, flags, 0, 0);
-	dma_resv_unlock(robj);
-	if (IS_ERR(nvbo))
-		return ERR_CAST(nvbo);
+	if (IS_ERR(nvbo)) {
+		obj = ERR_CAST(nvbo);
+		goto unlock;
+	}
 
 	nvbo->valid_domains = NOUVEAU_GEM_DOMAIN_GART;
 
@@ -82,16 +84,22 @@ struct drm_gem_object *nouveau_gem_prime_import_sg_table(struct drm_device *dev,
 	ret = drm_gem_object_init(dev, &nvbo->bo.base, size);
 	if (ret) {
 		nouveau_bo_ref(NULL, &nvbo);
-		return ERR_PTR(-ENOMEM);
+		obj = ERR_PTR(-ENOMEM);
+		goto unlock;
 	}
 
 	ret = nouveau_bo_init(nvbo, size, align, flags, sg, robj);
 	if (ret) {
 		nouveau_bo_ref(NULL, &nvbo);
-		return ERR_PTR(ret);
+		obj = ERR_PTR(ret);
+		goto unlock;
 	}
 
-	return &nvbo->bo.base;
+	obj = &nvbo->bo.base;
+
+unlock:
+	dma_resv_unlock(robj);
+	return obj;
 }
 
 int nouveau_gem_prime_pin(struct drm_gem_object *obj)
-- 
2.23.0
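As a rough userspace illustration of the pattern this patch adopts, the sketch below keeps a lock held across both the allocation and the initialization step and funnels every exit through a single unlock label. Everything in it is hypothetical: demo_lock is a plain counter standing in for the reservation lock, and demo_buf and demo_buf_import() are made-up names, not nouveau or DRM interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lock: a counter that is 1 while "held", 0 otherwise. */
struct demo_lock {
	int held;
};

/* Hypothetical buffer object; not a nouveau type. */
struct demo_buf {
	size_t size;
	int initialized;
};

static void demo_lock_acquire(struct demo_lock *lock) { lock->held++; }
static void demo_lock_release(struct demo_lock *lock) { lock->held--; }

/*
 * Allocate and initialize a buffer with the lock held for both steps.
 * Returns 0 on success or a negative errno value. Every path leaves
 * through the "unlock" label, so the lock is dropped exactly once.
 */
static int demo_buf_import(struct demo_lock *lock, size_t size,
			   struct demo_buf **out)
{
	struct demo_buf *buf;
	int ret = 0;

	*out = NULL;
	demo_lock_acquire(lock);

	if (size == 0) {
		ret = -EINVAL;
		goto unlock;
	}

	/* Step 1: allocation. */
	buf = calloc(1, sizeof(*buf));
	if (!buf) {
		ret = -ENOMEM;
		goto unlock;
	}

	/* Step 2: initialization, still under the lock. */
	buf->size = size;
	buf->initialized = 1;
	*out = buf;

unlock:
	demo_lock_release(lock);
	return ret;
}
```

A caller would pass a zero-initialized struct demo_lock and check the return value; lock.held should be back at 0 after both the error and the success path.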
[Nouveau] [PATCH 3/4] drm/nouveau: Fix ordering between TTM and GEM release
From: Thierry Reding

When the last reference to a TTM BO is dropped, ttm_bo_release() will acquire the DMA reservation object's wound/wait mutex while trying to clean up (ttm_bo_cleanup_refs_or_queue() via ttm_bo_release()). It is therefore essential that drm_gem_object_release() be called after the TTM BO has been uninitialized, otherwise drm_gem_object_release() has already destroyed the wound/wait mutex (via dma_resv_fini()).

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 10 --
 drivers/gpu/drm/nouveau/nouveau_gem.c | 4 
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index e7803dca32c5..f8015e0318d7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -136,10 +136,16 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
 	struct drm_device *dev = drm->dev;
 	struct nouveau_bo *nvbo = nouveau_bo(bo);
 
-	if (unlikely(nvbo->bo.base.filp))
-		DRM_ERROR("bo %p still attached to GEM object\n", bo);
 	WARN_ON(nvbo->pin_refcnt > 0);
 	nv10_bo_put_tile_region(dev, nvbo->tile, NULL);
+
+	/*
+	 * If nouveau_bo_new() allocated this buffer, the GEM object was never
+	 * initialized, so don't attempt to release it.
+	 */
+	if (bo->base.dev)
+		drm_gem_object_release(&bo->base);
+
 	kfree(nvbo);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 1bdffd714456..1324c19f4e5c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -51,10 +51,6 @@ nouveau_gem_object_del(struct drm_gem_object *gem)
 	if (gem->import_attach)
 		drm_prime_gem_destroy(gem, nvbo->bo.sg);
 
-	drm_gem_object_release(gem);
-
-	/* reset filp so nouveau_bo_del_ttm() can test for it */
-	gem->filp = NULL;
 	ttm_bo_put(&nvbo->bo);
 
 	pm_runtime_mark_last_busy(dev);
-- 
2.23.0
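The ordering constraint described in this patch can be modeled with a small userspace sketch: one teardown step destroys an embedded lock, the other still needs it. All names below (demo_resv, demo_bo, demo_gem_release(), demo_ttm_cleanup()) are hypothetical stand-ins chosen for illustration, not the real GEM or TTM functions, and the "lock" is just a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reservation object and its lock. */
struct demo_resv {
	bool alive;	/* false once the lock has been destroyed */
};

/* Hypothetical buffer object embedding the reservation object. */
struct demo_bo {
	struct demo_resv resv;
	int lock_misuse;	/* counts uses of a destroyed lock */
};

static void demo_bo_setup(struct demo_bo *bo)
{
	bo->resv.alive = true;
	bo->lock_misuse = 0;
}

/* Stands in for the GEM-side release: destroys the embedded lock. */
static void demo_gem_release(struct demo_bo *bo)
{
	bo->resv.alive = false;
}

/* Stands in for the TTM-side cleanup: needs the lock to still exist. */
static void demo_ttm_cleanup(struct demo_bo *bo)
{
	if (!bo->resv.alive)
		bo->lock_misuse++;
}

/* Ordering after the patch: lock user first, lock destroyer last. */
static int demo_teardown_ttm_first(struct demo_bo *bo)
{
	demo_ttm_cleanup(bo);
	demo_gem_release(bo);
	return bo->lock_misuse;
}

/* Ordering before the patch: the lock is gone when cleanup needs it. */
static int demo_teardown_gem_first(struct demo_bo *bo)
{
	demo_gem_release(bo);
	demo_ttm_cleanup(bo);
	return bo->lock_misuse;
}
```

In this model demo_teardown_ttm_first() reports no misuse, while demo_teardown_gem_first() records one use of a destroyed lock, which is the situation the DEBUG_LOCKS_WARN_ON() mentioned in the cover letter points at.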
[Nouveau] [PATCH 4/4] drm/nouveau: gm20b: Avoid BAR1 teardown during init
From: Thierry Reding

Writing the 0x1704 (BUS_BAR1_BLOCK) register causes the GPU to probe the memory region at the programmed address. The result is an address decode error in the external memory controller because address 0, which is what is written to the register, is not designated as accessible to devices.

Avoid triggering DMA from the GPU by removing teardown of the BAR1.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c
index 950bff1955ad..1ed6170891c4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gm20b.c
@@ -26,7 +26,6 @@ gm20b_bar_func = {
 	.dtor = gf100_bar_dtor,
 	.oneinit = gf100_bar_oneinit,
 	.bar1.init = gf100_bar_bar1_init,
-	.bar1.fini = gf100_bar_bar1_fini,
 	.bar1.wait = gm107_bar_bar1_wait,
 	.bar1.vmm = gf100_bar_bar1_vmm,
 	.flush = g84_bar_flush,
-- 
2.23.0
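Dropping an entry from a function table like this only works if the caller treats a missing hook as "nothing to do". The sketch below shows that general pattern in plain C. It assumes the core checks the pointer before calling it; whether nvkm does so for bar1.fini is not shown in this patch, and all demo_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical device state; counts how often teardown really ran. */
struct demo_bar {
	int fini_calls;
};

/* Hypothetical function table with an optional fini hook. */
struct demo_bar_func {
	void (*init)(struct demo_bar *bar);
	void (*fini)(struct demo_bar *bar);	/* optional, may be NULL */
};

static void demo_bar_init(struct demo_bar *bar) { (void)bar; }
static void demo_bar_fini(struct demo_bar *bar) { bar->fini_calls++; }

/* Variant that tears the mapping down. */
static const struct demo_bar_func demo_func_with_fini = {
	.init = demo_bar_init,
	.fini = demo_bar_fini,
};

/* Variant that leaves .fini unset, so it is implicitly NULL. */
static const struct demo_bar_func demo_func_without_fini = {
	.init = demo_bar_init,
};

/* Core code (assumed behavior): call the hook only if one was provided. */
static void demo_bar_teardown(const struct demo_bar_func *func,
			      struct demo_bar *bar)
{
	if (func->fini)
		func->fini(bar);
}
```

With demo_func_without_fini the teardown call is a no-op, so no register write (and, in the case described above, no stray DMA) would be triggered by it.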
[Nouveau] [PATCH 1/4] drm/nouveau: Fix fallout from reservation object rework
From: Thierry Reding

Commit 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") introduced a subtle change in how the buffer allocation size is handled. Prior to that change, the size would get aligned to at least a page, whereas after that change a non-page-aligned size would get passed through unmodified. This ultimately causes a BUG_ON() to trigger in drm_gem_private_object_init() and crashes the system.

Fix this by restoring the code that aligns the allocation size.

Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object")
Reported-by: Ilia Mirkin
Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 16 +---
 drivers/gpu/drm/nouveau/nouveau_bo.h | 4 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.c | 3 ++-
 drivers/gpu/drm/nouveau/nouveau_prime.c | 7 ---
 4 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index e918b437af17..e7803dca32c5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -186,8 +186,8 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags,
 }
 
 struct nouveau_bo *
-nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
-		 u32 tile_flags)
+nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 flags,
+		 u32 tile_mode, u32 tile_flags)
 {
 	struct nouveau_drm *drm = cli->drm;
 	struct nouveau_bo *nvbo;
@@ -195,8 +195,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 	struct nvif_vmm *vmm = cli->svm.cli ? &cli->svm.vmm : &cli->vmm.vmm;
 	int i, pi = -1;
 
-	if (!size) {
-		NV_WARN(drm, "skipped size %016llx\n", size);
+	if (!*size) {
+		NV_WARN(drm, "skipped size %016llx\n", *size);
 		return ERR_PTR(-EINVAL);
 	}
 
@@ -266,7 +266,7 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 			pi = i;
 
 		/* Stop once the buffer is larger than the current page size. */
-		if (size >= 1ULL << vmm->page[i].shift)
+		if (*size >= 1ULL << vmm->page[i].shift)
 			break;
 	}
 
@@ -281,6 +281,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 	}
 	nvbo->page = vmm->page[pi].shift;
 
+	nouveau_bo_fixup_align(nvbo, flags, align, size);
+
 	return nvbo;
 }
 
@@ -294,7 +296,6 @@ nouveau_bo_init(struct nouveau_bo *nvbo, u64 size, int align, u32 flags,
 
 	acc_size = ttm_bo_dma_acc_size(nvbo->bo.bdev, size, sizeof(*nvbo));
 
-	nouveau_bo_fixup_align(nvbo, flags, &align, &size);
 	nvbo->bo.mem.num_pages = size >> PAGE_SHIFT;
 	nouveau_bo_placement_set(nvbo, flags, 0);
 
@@ -318,7 +319,8 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align,
 	struct nouveau_bo *nvbo;
 	int ret;
 
-	nvbo = nouveau_bo_alloc(cli, size, flags, tile_mode, tile_flags);
+	nvbo = nouveau_bo_alloc(cli, &size, &align, flags, tile_mode,
+				tile_flags);
 	if (IS_ERR(nvbo))
 		return PTR_ERR(nvbo);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h
index 62930d834fba..38f9d8350963 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -71,8 +71,8 @@ nouveau_bo_ref(struct nouveau_bo *ref, struct nouveau_bo **pnvbo)
 extern struct ttm_bo_driver nouveau_bo_driver;
 
 void nouveau_bo_move_init(struct nouveau_drm *);
-struct nouveau_bo *nouveau_bo_alloc(struct nouveau_cli *, u64 size, u32 flags,
-				    u32 tile_mode, u32 tile_flags);
+struct nouveau_bo *nouveau_bo_alloc(struct nouveau_cli *, u64 *size, int *align,
+				    u32 flags, u32 tile_mode, u32 tile_flags);
 int nouveau_bo_init(struct nouveau_bo *, u64 size, int align, u32 flags,
 		    struct sg_table *sg, struct dma_resv *robj);
 int nouveau_bo_new(struct nouveau_cli *, u64 size, int align, u32 flags,
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index c2bfc0591909..1bdffd714456 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -188,7 +188,8 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
 	if (domain & NOUVEAU_GEM_DOMAIN_COHERENT)
 		flags |= TTM_PL_FLAG_UNCACHED;
 
-	nvbo = nouveau_bo_alloc(cli, size, flags, tile_mode, tile_flags);
+	nvbo = nouveau_bo_alloc(cli, &size, &align, flags, tile_mode,
+				tile_flags);
 	if (IS_ERR(nvbo))
 		return PTR_ERR(nvbo);
 
diff --git a/drivers/gpu/drm/nouveau/
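The reason size and align become pointers in this patch is that the allocation step has to hand the fixed-up values back to its caller, which then uses them for every later step. A minimal userspace sketch of that idea follows; the 4096-byte page size and all demo_* names are assumptions made for illustration, and only the size (not the alignment) is modeled.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Round value up to the next multiple of "multiple". */
static uint64_t demo_round_up(uint64_t value, uint64_t multiple)
{
	return (value + multiple - 1) / multiple * multiple;
}

/*
 * Allocation step: validates the size and fixes it up in place, so the
 * caller sees the page-aligned value afterwards.
 */
static int demo_alloc(uint64_t *size)
{
	if (!*size)
		return -EINVAL;

	*size = demo_round_up(*size, DEMO_PAGE_SIZE);
	return 0;
}

/*
 * Initialization step: takes the size by value and insists that it is
 * already page-aligned, similar in spirit to the check that tripped the
 * BUG_ON() described above.
 */
static int demo_init(uint64_t size, uint64_t *num_pages)
{
	if (size % DEMO_PAGE_SIZE)
		return -EINVAL;

	*num_pages = size / DEMO_PAGE_SIZE;
	return 0;
}

/* Combined helper: the fixed-up size flows from alloc into init. */
static int demo_new(uint64_t size, uint64_t *num_pages)
{
	int ret = demo_alloc(&size);

	if (ret)
		return ret;

	return demo_init(size, num_pages);
}
```

Calling demo_init() directly with an unaligned size fails, whereas going through demo_new() succeeds because the size was rounded up first.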
Re: [Nouveau] [Intel-gfx] [PATCH v6 08/17] drm/ttm: use gem vma_node
On Sat, Sep 07, 2019 at 09:58:46PM -0400, Ilia Mirkin wrote:
> On Wed, Aug 21, 2019 at 7:55 AM Thierry Reding wrote:
> >
> > On Wed, Aug 21, 2019 at 04:33:58PM +1000, Ben Skeggs wrote:
> > > On Wed, 14 Aug 2019 at 20:14, Gerd Hoffmann wrote:
> > > >
> > > > Hi,
> > > >
> > > > > > Changing the order doesn't look hard. Patch attached (untested,
> > > > > > have no test hardware). But maybe I missed some detail ...
> > > > >
> > > > > I came up with something very similar by splitting up nouveau_bo_new()
> > > > > into allocation and initialization steps, so that when necessary the
> > > > > GEM object can be initialized in between. I think that's slightly more
> > > > > flexible and easier to understand than a boolean flag.
> > > >
> > > > Yes, that should work too.
> > > >
> > > > Acked-by: Gerd Hoffmann
> > > Acked-by: Ben Skeggs
> >
> > Thanks guys, applied to drm-misc-next.
>
> Hi Thierry,
>
> Initial investigations suggest that this commit currently in drm-next
>
> commit 019cbd4a4feb3aa3a917d78e7110e3011bbff6d5
> Author: Thierry Reding
> Date: Wed Aug 14 11:00:48 2019 +0200
>
>     drm/nouveau: Initialize GEM object before TTM object
>
> breaks nouveau userspace which tries to allocate GEM objects with a
> non-page-aligned size. Previously nouveau_gem_new would just call
> nouveau_bo_init which would call nouveau_bo_fixup_align before
> initializing the GEM object. With this change, it is done after. What
> do you think -- OK to just move that bit of logic into the new
> nouveau_bo_alloc() (and make size/align be pointers so that they can
> be fixed up?)

Hi Ilia,

sorry, got side-tracked earlier and forgot to send this out. I'll turn this into a proper patch, but if you manage to find the time to test this while I work out the userspace issues that are preventing me from testing this more thoroughly, that'd be great.

Thierry

--- >8 ---
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index e918b437af17..7d5ede756711 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -186,8 +186,8 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags,
 }
 
 struct nouveau_bo *
-nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
-		 u32 tile_flags)
+nouveau_bo_alloc(struct nouveau_cli *cli, u64 *size, int *align, u32 flags,
+		 u32 tile_mode, u32 tile_flags)
 {
 	struct nouveau_drm *drm = cli->drm;
 	struct nouveau_bo *nvbo;
@@ -195,8 +195,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 	struct nvif_vmm *vmm = cli->svm.cli ? &cli->svm.vmm : &cli->vmm.vmm;
 	int i, pi = -1;
 
-	if (!size) {
-		NV_WARN(drm, "skipped size %016llx\n", size);
+	if (!*size) {
+		NV_WARN(drm, "skipped size %016llx\n", *size);
 		return ERR_PTR(-EINVAL);
 	}
 
@@ -266,7 +266,7 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 			pi = i;
 
 		/* Stop once the buffer is larger than the current page size. */
-		if (size >= 1ULL << vmm->page[i].shift)
+		if (*size >= 1ULL << vmm->page[i].shift)
 			break;
 	}
 
@@ -281,6 +281,8 @@ nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode,
 	}
 	nvbo->page = vmm->page[pi].shift;
 
+	nouveau_bo_fixup_align(nvbo, flags, align, size);
+
 	return nvbo;
 }
 
@@ -292,12 +294,11 @@ nouveau_bo_init(struct nouveau_bo *nvbo, u64 size, int align, u32 flags,
 	size_t acc_size;
 	int ret;
 
-	acc_size = ttm_bo_dma_acc_size(nvbo->bo.bdev, size, sizeof(*nvbo));
-
-	nouveau_bo_fixup_align(nvbo, flags, &align, &size);
 	nvbo->bo.mem.num_pages = size >> PAGE_SHIFT;
 	nouveau_bo_placement_set(nvbo, flags, 0);
 
+	acc_size = ttm_bo_dma_acc_size(nvbo->bo.bdev, size, sizeof(*nvbo));
+
 	ret = ttm_bo_init(nvbo->bo.bdev, &nvbo->bo, size, type,
 			  &nvbo->placement, align >> PAGE_SHIFT, false,
 			  acc_size, sg, robj, nouveau_bo_del_ttm);
@@ -318,7 +319,8 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align,
 	struct nouveau_bo *nvbo;
 	int ret;
 
-	nvbo = nouveau_bo_alloc(cli, size, flags, tile_mode, tile_flags);
+	nvbo = nouveau_bo_alloc(cli, &size, &align, flags, tile_mode,
+				tile_flags);
 	if (IS_ERR(nvbo))
Re: [Nouveau] [Intel-gfx] [PATCH v6 08/17] drm/ttm: use gem vma_node
On Wed, Aug 21, 2019 at 04:33:58PM +1000, Ben Skeggs wrote:
> On Wed, 14 Aug 2019 at 20:14, Gerd Hoffmann wrote:
> >
> > Hi,
> >
> > > > Changing the order doesn't look hard. Patch attached (untested, have no
> > > > test hardware). But maybe I missed some detail ...
> > >
> > > I came up with something very similar by splitting up nouveau_bo_new()
> > > into allocation and initialization steps, so that when necessary the GEM
> > > object can be initialized in between. I think that's slightly more
> > > flexible and easier to understand than a boolean flag.
> >
> > Yes, that should work too.
> >
> > Acked-by: Gerd Hoffmann
> Acked-by: Ben Skeggs

Thanks guys, applied to drm-misc-next.

Thierry
Re: [Nouveau] [Intel-gfx] [PATCH v6 08/17] drm/ttm: use gem vma_node
On Mon, Aug 05, 2019 at 04:01:10PM +0200, Gerd Hoffmann wrote: > Drop vma_node from ttm_buffer_object, use the gem struct > (base.vma_node) instead. > > Signed-off-by: Gerd Hoffmann > Reviewed-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +- > drivers/gpu/drm/qxl/qxl_object.h | 2 +- > drivers/gpu/drm/radeon/radeon_object.h | 2 +- > drivers/gpu/drm/virtio/virtgpu_drv.h | 2 +- > include/drm/ttm/ttm_bo_api.h | 4 > drivers/gpu/drm/drm_gem_vram_helper.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_display.c | 2 +- > drivers/gpu/drm/nouveau/nouveau_gem.c | 2 +- > drivers/gpu/drm/ttm/ttm_bo.c | 8 > drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +- > drivers/gpu/drm/ttm/ttm_bo_vm.c| 9 + > drivers/gpu/drm/virtio/virtgpu_prime.c | 3 --- > drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 4 ++-- > drivers/gpu/drm/vmwgfx/vmwgfx_surface.c| 4 ++-- > 14 files changed, 21 insertions(+), 27 deletions(-) Hi Gerd, I've been seeing a regression on Nouveau with recent linux-next releases and git bisect points at this commit as the first bad one. If I revert it (there's a tiny conflict with a patch that was merged subsequently), things are back to normal. I think the reason for this issue is that Nouveau doesn't use GEM objects for all buffer objects, and even when it uses GEM objects, the code will not initialize the GEM object until after the buffer objects and the backing TTM objects have been created. I tried to fix that by making sure drm_gem_object_init() gets called by Nouveau before ttm_bo_init(), but the changes are fairly involved and I was unable to get the GEM reference counting right. I can look into the proper fix some more, but it might be worth reverting this patch for now to get Nouveau working again. 
Thierry > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > index 645a189d365c..113fb2feb437 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > @@ -191,7 +191,7 @@ static inline unsigned > amdgpu_bo_gpu_page_alignment(struct amdgpu_bo *bo) > */ > static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo *bo) > { > - return drm_vma_node_offset_addr(>tbo.vma_node); > + return drm_vma_node_offset_addr(>tbo.base.vma_node); > } > > /** > diff --git a/drivers/gpu/drm/qxl/qxl_object.h > b/drivers/gpu/drm/qxl/qxl_object.h > index b812d4ae9d0d..8ae54ba7857c 100644 > --- a/drivers/gpu/drm/qxl/qxl_object.h > +++ b/drivers/gpu/drm/qxl/qxl_object.h > @@ -60,7 +60,7 @@ static inline unsigned long qxl_bo_size(struct qxl_bo *bo) > > static inline u64 qxl_bo_mmap_offset(struct qxl_bo *bo) > { > - return drm_vma_node_offset_addr(>tbo.vma_node); > + return drm_vma_node_offset_addr(>tbo.base.vma_node); > } > > static inline int qxl_bo_wait(struct qxl_bo *bo, u32 *mem_type, > diff --git a/drivers/gpu/drm/radeon/radeon_object.h > b/drivers/gpu/drm/radeon/radeon_object.h > index 9ffd8215d38a..e5554bf9140e 100644 > --- a/drivers/gpu/drm/radeon/radeon_object.h > +++ b/drivers/gpu/drm/radeon/radeon_object.h > @@ -116,7 +116,7 @@ static inline unsigned > radeon_bo_gpu_page_alignment(struct radeon_bo *bo) > */ > static inline u64 radeon_bo_mmap_offset(struct radeon_bo *bo) > { > - return drm_vma_node_offset_addr(>tbo.vma_node); > + return drm_vma_node_offset_addr(>tbo.base.vma_node); > } > > extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type, > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h > b/drivers/gpu/drm/virtio/virtgpu_drv.h > index f4ecea6054ba..e28829661724 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_drv.h > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h > @@ -396,7 +396,7 @@ static inline void virtio_gpu_object_unref(struct > virtio_gpu_object **bo) > > static 
inline u64 virtio_gpu_object_mmap_offset(struct virtio_gpu_object *bo) > { > - return drm_vma_node_offset_addr(>tbo.vma_node); > + return drm_vma_node_offset_addr(>tbo.base.vma_node); > } > > static inline int virtio_gpu_object_reserve(struct virtio_gpu_object *bo, > diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h > index fa050f0328ab..7ffc50a3303d 100644 > --- a/include/drm/ttm/ttm_bo_api.h > +++ b/include/drm/ttm/ttm_bo_api.h > @@ -152,7 +152,6 @@ struct ttm_tt; > * @ddestroy: List head for the delayed destroy list. > * @swap: List head for swap LRU list. > * @moving: Fence set when BO is moving > - * @vma_node: Address space manager node. > * @offset: The current GPU offset, which can have different meanings > * depending on the memory type. For SYSTEM type memory, it should be 0. > * @cur_placement: Hint of current placement. > @@ -219,9 +218,6 @@ struct ttm_buffer_object { >*/ > > struct dma_fence *moving; > - > - struct drm_vma_offset_node vma_node; > - > unsigned
Re: [Nouveau] [Intel-gfx] [PATCH v6 08/17] drm/ttm: use gem vma_node
On Wed, Aug 14, 2019 at 07:58:27AM +0200, Gerd Hoffmann wrote:
> > Hi Gerd,
> >
> > I've been seeing a regression on Nouveau with recent linux-next releases
> > and git bisect points at this commit as the first bad one. If I revert
> > it (there's a tiny conflict with a patch that was merged subsequently),
> > things are back to normal.
> >
> > I think the reason for this issue is that Nouveau doesn't use GEM
> > objects for all buffer objects,
>
> That shouldn't be a problem ...
>
> > and even when it uses GEM objects, the
> > code will not initialize the GEM object until after the buffer objects
> > and the backing TTM objects have been created.
>
> ... but the initialization order is.
>
> ttm_bo_uses_embedded_gem_object() assumes gem gets initialized first.
>
> drm_gem_object_init() init calling drm_vma_node_reset() again is
> probably the root cause for the breakage.
>
> > I tried to fix that by making sure drm_gem_object_init() gets called by
> > Nouveau before ttm_bo_init(), but the changes are fairly involved and I
> > was unable to get the GEM reference counting right. I can look into the
> > proper fix some more, but it might be worth reverting this patch for
> > now to get Nouveau working again.
>
> Changing the order doesn't look hard. Patch attached (untested, have no
> test hardware). But maybe I missed some detail ...
>
> The other patch attached works around the issue with a flag, to avoid
> drm_vma_node_reset() being called twice.

I came up with something very similar by splitting up nouveau_bo_new() into allocation and initialization steps, so that when necessary the GEM object can be initialized in between. I think that's slightly more flexible and easier to understand than a boolean flag.
Thierry From a1130a6affcb7c00133e89f3e498cb6757f5bb51 Mon Sep 17 00:00:00 2001 From: Thierry Reding Date: Wed, 14 Aug 2019 11:00:48 +0200 Subject: [PATCH] drm/nouveau: Initialize GEM object before TTM object TTM assumes that drivers initialize the embedded GEM object before calling the ttm_bo_init() function. This is not currently the case in the Nouveau driver. Fix this by splitting up nouveau_bo_new() into nouveau_bo_alloc() and nouveau_bo_init() so that the GEM can be initialized before TTM BO initialization when necessary. Fixes: b96f3e7c8069 ("drm/ttm: use gem vma_node") Signed-off-by: Thierry Reding --- drivers/gpu/drm/nouveau/nouveau_bo.c| 69 - drivers/gpu/drm/nouveau/nouveau_bo.h| 4 ++ drivers/gpu/drm/nouveau/nouveau_gem.c | 29 ++- drivers/gpu/drm/nouveau/nouveau_prime.c | 16 -- 4 files changed, 77 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index 99e391be9370..b3d3e07de1af 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -185,31 +185,24 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags, *size = roundup_64(*size, PAGE_SIZE); } -int -nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align, - uint32_t flags, uint32_t tile_mode, uint32_t tile_flags, - struct sg_table *sg, struct reservation_object *robj, - struct nouveau_bo **pnvbo) +struct nouveau_bo * +nouveau_bo_alloc(struct nouveau_cli *cli, u64 size, u32 flags, u32 tile_mode, +u32 tile_flags) { struct nouveau_drm *drm = cli->drm; struct nouveau_bo *nvbo; struct nvif_mmu *mmu = >mmu; struct nvif_vmm *vmm = cli->svm.cli ? 
>svm.vmm : >vmm.vmm; - size_t acc_size; - int type = ttm_bo_type_device; - int ret, i, pi = -1; + int i, pi = -1; if (!size) { NV_WARN(drm, "skipped size %016llx\n", size); - return -EINVAL; + return ERR_PTR(-EINVAL); } - if (sg) - type = ttm_bo_type_sg; - nvbo = kzalloc(sizeof(struct nouveau_bo), GFP_KERNEL); if (!nvbo) - return -ENOMEM; + return ERR_PTR(-ENOMEM); INIT_LIST_HEAD(>head); INIT_LIST_HEAD(>entry); INIT_LIST_HEAD(>vma_list); @@ -231,7 +224,7 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align, nvbo->kind = (tile_flags & 0xff00) >> 8; if (!nvif_mmu_kind_valid(mmu, nvbo->kind)) { kfree(nvbo); - return -EINVAL; + return ERR_PTR(-EINVAL); } nvbo->comp = mmu->kind[nvbo->kind] != nvbo->kind; @@ -241,7 +234,7 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align, nvbo->comp = (tile_flags & 0x0003) >> 16; if (!nvi
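The patch above changes the allocation helper from returning an int to returning a pointer that may carry an error code (the kernel's ERR_PTR() convention). For readers unfamiliar with it, here is a userspace approximation of that convention together with an alloc/init split. The demo_* helpers are re-implementations written for this sketch, not the kernel macros themselves, and demo_obj is a hypothetical type.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095	/* error codes occupy the top 4095 addresses */

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(intptr_t error)
{
	return (void *)error;
}

/* Decode the errno value from an error pointer. */
static intptr_t demo_ptr_err(const void *ptr)
{
	return (intptr_t)ptr;
}

/* True if ptr lies in the range reserved for encoded errors. */
static int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Hypothetical object; not a nouveau type. */
struct demo_obj {
	size_t size;
	int ready;
};

/* Allocation step: returns a valid object or an encoded error. */
static struct demo_obj *demo_obj_alloc(size_t size)
{
	struct demo_obj *obj;

	if (!size)
		return demo_err_ptr(-EINVAL);

	obj = calloc(1, sizeof(*obj));
	if (!obj)
		return demo_err_ptr(-ENOMEM);

	obj->size = size;
	return obj;
}

/* Initialization step: a caller may do extra setup before calling this. */
static int demo_obj_init(struct demo_obj *obj)
{
	obj->ready = 1;
	return 0;
}

/* Convenience wrapper that performs both steps back to back. */
static int demo_obj_new(size_t size, struct demo_obj **out)
{
	struct demo_obj *obj = demo_obj_alloc(size);
	int ret;

	if (demo_is_err(obj))
		return (int)demo_ptr_err(obj);

	ret = demo_obj_init(obj);
	if (ret) {
		free(obj);
		return ret;
	}

	*out = obj;
	return 0;
}
```

The point of the split is that a caller needing work between the two steps (in the patch, initializing the GEM object) can call the alloc and init helpers separately, while simple callers keep using the combined wrapper.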
Re: [Nouveau] [PATCH 26/26] drm/: Don't set FBINFO_(FLAG_)DEFAULT
On Thu, Jan 24, 2019 at 05:58:31PM +0100, Daniel Vetter wrote:
> It's 0.
>
> Signed-off-by: Daniel Vetter
> Cc: Inki Dae
> Cc: Joonyoung Shim
> Cc: Seung-Woo Kim
> Cc: Kyungmin Park
> Cc: Kukjin Kim
> Cc: Krzysztof Kozlowski
> Cc: Patrik Jakobsson
> Cc: Ben Skeggs
> Cc: Sandy Huang
> Cc: "Heiko Stübner"
> Cc: Thierry Reding
> Cc: Jonathan Hunter
> Cc: Hans de Goede
> Cc: Greg Kroah-Hartman
> Cc: Daniel Vetter
> Cc: Bartlomiej Zolnierkiewicz
> Cc: Alexander Kapshuk
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: nouveau@lists.freedesktop.org
> Cc: linux-rockc...@lists.infradead.org
> Cc: linux-te...@vger.kernel.org
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 1 -
>  drivers/gpu/drm/gma500/framebuffer.c | 1 -
>  drivers/gpu/drm/nouveau/nouveau_fbcon.c | 4 ++--
>  drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c | 1 -
>  drivers/gpu/drm/tegra/fb.c | 1 -
>  5 files changed, 2 insertions(+), 6 deletions(-)

Acked-by: Thierry Reding
Re: [Nouveau] [PATCH 7/7] drm: Split out drm_probe_helper.h
/gpu/drm/rockchip/rockchip_drm_fbdev.c | 2 +- > drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 2 +- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 2 +- > drivers/gpu/drm/rockchip/rockchip_lvds.c | 2 +- > drivers/gpu/drm/rockchip/rockchip_rgb.c | 2 +- > drivers/gpu/drm/sti/sti_crtc.c| 2 +- > drivers/gpu/drm/sti/sti_drv.c | 2 +- > drivers/gpu/drm/sti/sti_dvo.c | 2 +- > drivers/gpu/drm/sti/sti_hda.c | 2 +- > drivers/gpu/drm/sti/sti_hdmi.c| 2 +- > drivers/gpu/drm/sti/sti_tvout.c | 2 +- > drivers/gpu/drm/stm/drv.c | 2 +- > drivers/gpu/drm/stm/ltdc.c| 2 +- > drivers/gpu/drm/sun4i/sun4i_backend.c | 2 +- > drivers/gpu/drm/sun4i/sun4i_crtc.c| 2 +- > drivers/gpu/drm/sun4i/sun4i_drv.c | 2 +- > drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c| 2 +- > drivers/gpu/drm/sun4i/sun4i_lvds.c| 2 +- > drivers/gpu/drm/sun4i/sun4i_rgb.c | 2 +- > drivers/gpu/drm/sun4i/sun4i_tcon.c| 2 +- > drivers/gpu/drm/sun4i/sun4i_tv.c | 2 +- > drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c| 2 +- > drivers/gpu/drm/sun4i/sun8i_dw_hdmi.c | 2 +- > drivers/gpu/drm/sun4i/sun8i_mixer.c | 2 +- > drivers/gpu/drm/sun4i/sun8i_ui_layer.c| 2 +- > drivers/gpu/drm/sun4i/sun8i_vi_layer.c| 2 +- > drivers/gpu/drm/tegra/drm.h | 2 +- > drivers/gpu/drm/tegra/hdmi.c | 2 +- > drivers/gpu/drm/tegra/hub.c | 2 +- > drivers/gpu/drm/tinydrm/core/tinydrm-core.c | 2 +- > drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c | 2 +- > drivers/gpu/drm/tve200/tve200_drv.c | 2 +- > drivers/gpu/drm/udl/udl_connector.c | 1 + > drivers/gpu/drm/udl/udl_drv.c | 1 + > drivers/gpu/drm/udl/udl_main.c| 1 + > drivers/gpu/drm/vc4/vc4_crtc.c| 2 +- > drivers/gpu/drm/vc4/vc4_dpi.c | 2 +- > drivers/gpu/drm/vc4/vc4_dsi.c | 2 +- > drivers/gpu/drm/vc4/vc4_hdmi.c| 2 +- > drivers/gpu/drm/vc4/vc4_kms.c | 2 +- > drivers/gpu/drm/vc4/vc4_txp.c | 2 +- > drivers/gpu/drm/vc4/vc4_vec.c | 2 +- > drivers/gpu/drm/virtio/virtgpu_display.c | 2 +- > drivers/gpu/drm/virtio/virtgpu_drv.h | 2 +- > drivers/gpu/drm/vkms/vkms_crtc.c | 2 +- > drivers/gpu/drm/vkms/vkms_drv.c | 2 +- > 
drivers/gpu/drm/vkms/vkms_output.c| 2 +- > drivers/gpu/drm/vmwgfx/vmwgfx_kms.h | 2 +- > drivers/gpu/drm/xen/xen_drm_front.c | 2 +- > drivers/gpu/drm/xen/xen_drm_front_conn.c | 2 +- > drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- > drivers/gpu/drm/xen/xen_drm_front_kms.c | 2 +- > drivers/gpu/drm/zte/zx_drm_drv.c | 2 +- > drivers/gpu/drm/zte/zx_hdmi.c | 2 +- > drivers/gpu/drm/zte/zx_tvenc.c| 2 +- > drivers/gpu/drm/zte/zx_vga.c | 2 +- > drivers/gpu/drm/zte/zx_vou.c | 2 +- > drivers/staging/vboxvideo/vbox_irq.c | 2 +- > drivers/staging/vboxvideo/vbox_mode.c | 2 +- > include/drm/drm_crtc_helper.h | 16 -- > include/drm/drm_probe_helper.h| 50 +++ > 208 files changed, 256 insertions(+), 200 deletions(-) > create mode 100644 include/drm/drm_probe_helper.h Looks good to me: Acked-by: Thierry Reding signature.asc Description: PGP signature ___ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] next/master boot bisection: Oops in nouveau driver on jetson-tk1
On Mon, Dec 10, 2018 at 02:25:59PM +, Mark Brown wrote:
> On Mon, Dec 10, 2018 at 10:00:08AM +, Guillaume Tucker wrote:
> > On 08/12/2018 00:08, Lyude Paul wrote:
> > > uh
> > > didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
> > > nouveau_drm_device_init()"
> >
> > Yes here's the fix from Thierry:
> >
> > https://patchwork.freedesktop.org/patch/263587/
> >
> > and I can confirm that it does fix the Oops when applied on top
> > of next-20181206 (what I used for the bisection last week):
> >
> > http://lava.baylibre.com:10080/scheduler/job/71109
> >
> > However the fix doesn't appear to have been applied in any
> > upstream tree yet.
>
> This has been broken for a considerable time now with no response from
> Ben - is there some other path we can use to get the fix merged?

I suppose we could go directly via Dave. But Ben's usually pretty responsive, so he probably just missed it. Let me ping him on IRC, maybe that'll get his attention.

Thierry
Re: [Nouveau] TK1: DRM, Nouveau and VIC
On Mon, Dec 10, 2018 at 03:20:19PM +, Marcel Ziswiler wrote:
> Hi Thierry
>
> On Mon, 2018-12-10 at 12:00 +0100, Thierry Reding wrote:
> > On Mon, Dec 10, 2018 at 11:21:47AM +0100, Thierry Reding wrote:
> > > On Sat, Dec 08, 2018 at 02:54:45PM +, Marcel Ziswiler wrote:
> > > > Hi Thierry et al.
> > > >
> > > > I noticed that since commit 3dde5a2342cd ("ARM: tegra: Add VIC on
> > > > Tegra124") graphics on Apalis TK1 is broken. During boot it fails
> > > > loading the vic firmware:
> > > >
> > > > [1.595824] tegra-vic 5434.vic: Direct firmware load for
> > > > nvidia/tegra124/vic03_ucode.bin failed with error -2
> > > > [1.606140] tegra-vic: probe of 5434.vic failed with error
> > > > -2
> > > >
> > > > Subsequently Tegra HDMI seems to fail completely:
> > > >
> > > > [2.379860] tegra-hdmi 5428.hdmi: failed to get PLL
> > > > regulator
> > > >
> > > > And finally, Nouveau even crashes:
> > > >
> > > > [8.241115] nouveau 5700.gpu: Linked as a consumer to
> > > > regulator.31
> > > > [8.247889] nouveau 5700.gpu: NVIDIA GK20A (0ea000a1)
> > > > [8.253396] nouveau 5700.gpu: imem: using IOMMU
> > > > [8.270210] Unable to handle kernel NULL pointer dereference
> > > > at
> > > > virtual address 006c
> > > > [8.278340] pgd = (ptrval)
> > > > [8.281250] [006c] *pgd=
> > > > [8.284944] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> > > > [8.290260] Modules linked in: nouveau(+) ttm
> > > > [8.294625] CPU: 2 PID: 203 Comm: systemd-udevd Not tainted
> > > > 4.20.0-
> > > > rc5-next-20181207-8-g85b0f8e25f86-dirty #110
> > > > [8.305055] Hardware name: NVIDIA Tegra SoC (Flattened Device
> > > > Tree)
> > > > [8.311331] PC is at drm_plane_register_all+0x18/0x50
> > > > [8.316373] LR is at drm_modeset_register_all+0xc/0x70
> > > > [8.321513] pc : []lr : []psr:
> > > > a0060013
> > > > [8.327768] sp : ed527c70 ip : ecc43ec0 fp :
> > > > [8.332993] r10: 0016 r9 : ecc43e80 r8 :
> > > > [8.338209] r7 : bf182c80 r6 : r5 : ed61b24c r4 :
> > > > fffc
> > > > [8.344735] r3 : 0002f000 r2 : r1 : 2e124000 r0 :
> > > > ed61b000
> > > > [8.351260] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA
> > > > ARM Segment none
> > > > [8.358383] Control: 10c5387d Table: ad64c06a DAC: 0051
> > > > [8.364127] Process systemd-udevd (pid: 203, stack limit =
> > > > 0x(ptrval))
> > > > [8.370654] Stack: (0xed527c70 to 0xed528000)
> > > > [8.375004] 7c60: ed61b000
> > > > ed61b000 c0564cc8
> > > > [8.383177] 7c80: ed61b000 c054b5b8 0001
> > > > 0001
> > > > [8.391355] 7ca0: ed527cc0 c0f08c48 ed61b000
> > > > bf180c5c bf0dc900
> > > > [8.399531] 7cc0: eda29208 5dfe844b ee9f2a10
> > > > bf180c5c c05a9328
> > > > [8.407695] 7ce0: c1006828 ee9f2a10 c100682c
> > > > c05a744c ee9f2a10 bf180c5c
> > > > [8.415871] 7d00: ee9f2a44 c05a77a8 c0f08c48 bf182980
> > > > c05a769c eefd14d0 c05a77a8
> > > > [8.424048] 7d20: ee9f2a10 bf180c5c ee9f2a44 c05a77a8
> > > > c0f08c48 bf182980
> > > > [8.432226] 7d40: c05a7884 ee9ebfb4 c0f08c48 bf180c5c
> > > > c05a5790 ee88135c
> > > > [8.440405] 7d60: ee9ebfb4 5dfe844b c0f71168 bf180c5c ee379e80
> > > > c0f71168 c05a692c
> > > > [8.448570] 7d80: bf15dc00 bf180ac8 e000 bf180c5c bf180ac8
> > > > e000 bf1aa000 c05a84a0
> > > > [8.456746] 7da0: bf182b80 bf180ac8 e000 bf1aa170 c0fbd220
> > > > c0f08c48 e000 c0102ed0
> > > > [8.464924] 7dc0: ed53f4c0 006000c0 c01b3d98 000c 6113
> > > > bf182980 0040 c02592d0
> > > > [8.473102] 7de0: eda60200 2e124000 ee80 006000c0 006000c0
> > > > c01b3d98 000c c025a8cc
> > > > [8.481281] 7e00: c024ce54 a113 bf182980 5dfe844b bf18
Re: [Nouveau] TK1: DRM, Nouveau and VIC
On Mon, Dec 10, 2018 at 11:21:47AM +0100, Thierry Reding wrote:
> On Sat, Dec 08, 2018 at 02:54:45PM +, Marcel Ziswiler wrote:
> > Hi Thierry et al.
> >
> > I noticed that since commit 3dde5a2342cd ("ARM: tegra: Add VIC on
> > Tegra124") graphics on Apalis TK1 is broken. During boot it fails
> > loading the vic firmware:
> >
> > [1.595824] tegra-vic 5434.vic: Direct firmware load for
> > nvidia/tegra124/vic03_ucode.bin failed with error -2
> > [1.606140] tegra-vic: probe of 5434.vic failed with error -2
> >
> > Subsequently Tegra HDMI seems to fail completely:
> >
> > [2.379860] tegra-hdmi 5428.hdmi: failed to get PLL regulator
> >
> > And finally, Nouveau even crashes:
> >
> > [8.241115] nouveau 5700.gpu: Linked as a consumer to
> > regulator.31
> > [8.247889] nouveau 5700.gpu: NVIDIA GK20A (0ea000a1)
> > [8.253396] nouveau 5700.gpu: imem: using IOMMU
> > [8.270210] Unable to handle kernel NULL pointer dereference at
> > virtual address 006c
> > [8.278340] pgd = (ptrval)
> > [8.281250] [006c] *pgd=
> > [8.284944] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
> > [8.290260] Modules linked in: nouveau(+) ttm
> > [8.294625] CPU: 2 PID: 203 Comm: systemd-udevd Not tainted 4.20.0-
> > rc5-next-20181207-8-g85b0f8e25f86-dirty #110
> > [8.305055] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > [8.311331] PC is at drm_plane_register_all+0x18/0x50
> > [8.316373] LR is at drm_modeset_register_all+0xc/0x70
> > [8.321513] pc : []lr : []psr: a0060013
> > [8.327768] sp : ed527c70 ip : ecc43ec0 fp :
> > [8.332993] r10: 0016 r9 : ecc43e80 r8 :
> > [8.338209] r7 : bf182c80 r6 : r5 : ed61b24c r4 :
> > fffc
> > [8.344735] r3 : 0002f000 r2 : r1 : 2e124000 r0 :
> > ed61b000
> > [8.351260] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA
> > ARM Segment none
> > [8.358383] Control: 10c5387d Table: ad64c06a DAC: 0051
> > [8.364127] Process systemd-udevd (pid: 203, stack limit =
> > 0x(ptrval))
> > [8.370654] Stack: (0xed527c70 to 0xed528000)
> > [8.375004] 7c60: ed61b000
> > ed61b000 c0564cc8
> > [8.383177] 7c80: ed61b000 c054b5b8 0001
> > 0001
> > [8.391355] 7ca0: ed527cc0 c0f08c48 ed61b000
> > bf180c5c bf0dc900
> > [8.399531] 7cc0: eda29208 5dfe844b ee9f2a10
> > bf180c5c c05a9328
> > [8.407695] 7ce0: c1006828 ee9f2a10 c100682c
> > c05a744c ee9f2a10 bf180c5c
> > [8.415871] 7d00: ee9f2a44 c05a77a8 c0f08c48 bf182980
> > c05a769c eefd14d0 c05a77a8
> > [8.424048] 7d20: ee9f2a10 bf180c5c ee9f2a44 c05a77a8
> > c0f08c48 bf182980
> > [8.432226] 7d40: c05a7884 ee9ebfb4 c0f08c48 bf180c5c
> > c05a5790 ee88135c
> > [8.440405] 7d60: ee9ebfb4 5dfe844b c0f71168 bf180c5c ee379e80
> > c0f71168 c05a692c
> > [8.448570] 7d80: bf15dc00 bf180ac8 e000 bf180c5c bf180ac8
> > e000 bf1aa000 c05a84a0
> > [8.456746] 7da0: bf182b80 bf180ac8 e000 bf1aa170 c0fbd220
> > c0f08c48 e000 c0102ed0
> > [8.464924] 7dc0: ed53f4c0 006000c0 c01b3d98 000c 6113
> > bf182980 0040 c02592d0
> > [8.473102] 7de0: eda60200 2e124000 ee80 006000c0 006000c0
> > c01b3d98 000c c025a8cc
> > [8.481281] 7e00: c024ce54 a113 bf182980 5dfe844b bf182980
> > 0002 ed53f4c0 0002
> > [8.489459] 7e20: eceba000 c01b3dd4 c0f08c48 bf182980
> > ed527f40 0002 eceb9fc0
> > [8.497625] 7e40: 0002 c01b61a4 bf18298c 7fff bf182980
> > c01b2f88 c01b279c
> > [8.505800] 7e60: bf1829c8 bf182a80 bf182b6c bf182ab0 c0b03ab0
> > c0d58964 c0ca726c c0ca7278
> > [8.513978] 7e80: c0ca72d0 c0f08c48 c02654a0
> > e000 bf00
> > [8.522157] 7ea0:
> > 6e72656b 6c65
> > [8.530336] 7ec0:
> >
> > [8.538502] 7ee0:
> > 5dfe844b 7fff c0f08c48
> > [8.546677] 7f00: 000f b6f761cc c0101204 ed526000
> > 017b 004a3270 c01b66a4
> > [8.554855] 7f20: 7fff 00
Re: [Nouveau] next/master boot: 142 boots: 2 failed, 130 passed with 7 offline, 3 conflicts (next-20181129)
On Thu, Nov 29, 2018 at 11:44:13AM +, Mark Brown wrote:
> On Thu, Nov 29, 2018 at 03:23:59AM -0800, kernelci.org bot wrote:
>
> Today's -next crashes on Jetson TK1 when the Nouveau module is loaded if
> it can't find firmware:
>
> [7.617291] nouveau 5700.gpu: Linked as a consumer to regulator.33
> [7.624037] nouveau 5700.gpu: NVIDIA GK20A (0ea000a1)
> [7.629880] nouveau 5700.gpu: imem: using IOMMU
> [7.635013] nouveau 5700.gpu: Direct firmware load for
> nvidia/gk20a/fecs_inst.bin failed with error -2
> [7.644726] nouveau 5700.gpu: Direct firmware load for
> nouveau/nvea_fuc409c failed with error -2
> [7.653960] nouveau 5700.gpu: Direct firmware load for nouveau/fuc409c
> failed with error -2
> [7.662916] nouveau 5700.gpu: gr: failed to load fuc409c
> [7.669694] Unable to handle kernel NULL pointer dereference at virtual
> address 006c
>
> This has been there for ~30 days but obscured by other issues. Full
> log and other info can be found at:
>
> https://kernelci.org/boot/id/5bffa74b59b5148342fddd64/

This looks like the same issue that I was seeing a couple of weeks ago.
There's a fix for this here:

http://patchwork.ozlabs.org/patch/993812/

Not sure if Ben's picked that up yet, though.

Thierry

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [PATCH 1/4] drm/edid: Pass connector to AVI infoframe functions
On Tue, Nov 20, 2018 at 06:13:42PM +0200, Ville Syrjala wrote:
> From: Ville Syrjälä
>
> Make life easier for drivers by simply passing the connector
> to drm_hdmi_avi_infoframe_from_display_mode() and
> drm_hdmi_avi_infoframe_quant_range(). That way drivers don't
> need to worry about is_hdmi2_sink mess.
>
> Cc: Alex Deucher
> Cc: "Christian König"
> Cc: "David (ChunMing) Zhou"
> Cc: Archit Taneja
> Cc: Andrzej Hajda
> Cc: Laurent Pinchart
> Cc: Inki Dae
> Cc: Joonyoung Shim
> Cc: Seung-Woo Kim
> Cc: Kyungmin Park
> Cc: Russell King
> Cc: CK Hu
> Cc: Philipp Zabel
> Cc: Rob Clark
> Cc: Ben Skeggs
> Cc: Tomi Valkeinen
> Cc: Sandy Huang
> Cc: "Heiko Stübner"
> Cc: Benjamin Gaignard
> Cc: Vincent Abriou
> Cc: Thierry Reding
> Cc: Eric Anholt
> Cc: Shawn Guo
> Cc: Ilia Mirkin
> Cc: amd-...@lists.freedesktop.org
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> Cc: nouveau@lists.freedesktop.org
> Cc: linux-te...@vger.kernel.org
> Signed-off-by: Ville Syrjälä
> ---
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c    |  2 +-
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c     |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c     |  2 +-
>  drivers/gpu/drm/bridge/analogix-anx78xx.c |  5 ++--
>  drivers/gpu/drm/bridge/sii902x.c          |  3 ++-
>  drivers/gpu/drm/bridge/sil-sii8620.c      |  3 +--
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c |  3 ++-
>  drivers/gpu/drm/drm_edid.c                | 33 ++-
>  drivers/gpu/drm/exynos/exynos_hdmi.c      |  3 ++-
>  drivers/gpu/drm/i2c/tda998x_drv.c         |  3 ++-
>  drivers/gpu/drm/i915/intel_hdmi.c         | 14 +-
>  drivers/gpu/drm/i915/intel_lspcon.c       | 15 ++-
>  drivers/gpu/drm/i915/intel_sdvo.c         | 10 ---
>  drivers/gpu/drm/mediatek/mtk_hdmi.c       |  3 ++-
>  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c    |  3 ++-
>  drivers/gpu/drm/nouveau/dispnv50/disp.c   |  7 +++--
>  drivers/gpu/drm/omapdrm/omap_encoder.c    |  5 ++--
>  drivers/gpu/drm/radeon/radeon_audio.c     |  2 +-
>  drivers/gpu/drm/rockchip/inno_hdmi.c      |  4 ++-
>  drivers/gpu/drm/sti/sti_hdmi.c            |  3 ++-
>  drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c    |  3 ++-
>  drivers/gpu/drm/tegra/hdmi.c              |  3 ++-
>  drivers/gpu/drm/tegra/sor.c               |  3 ++-
>  drivers/gpu/drm/vc4/vc4_hdmi.c            | 11 +---
>  drivers/gpu/drm/zte/zx_hdmi.c             |  4 ++-
>  include/drm/drm_edid.h                    |  8 +++---
>  27 files changed, 94 insertions(+), 66 deletions(-)

That's actually a lot nicer:

Acked-by: Thierry Reding
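The interface change described in the quoted commit message can be pictured with a small standalone C sketch. Everything in it is hypothetical: struct fake_connector, struct fake_infoframe and both fake_fill_*() helpers are stand-ins invented for illustration, not the DRM structures or the code in drm_edid.c. It only shows the shape of the change, assuming the sink capability can be read from the connector: the caller hands over the connector and the helper derives the flag itself, rather than every driver computing it and passing it in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real DRM structures. */
struct fake_connector {
	bool sink_is_hdmi2;   /* capability the helper can look up itself */
};

struct fake_infoframe {
	int vic;              /* video identification code stored in the frame */
};

/* Old shape: every caller works out the sink flag and passes it in. */
void fake_fill_old(struct fake_infoframe *frame, int vic, bool is_hdmi2_sink)
{
	assert(frame != NULL);
	/* In this sketch, codes above 64 are only sent to HDMI 2.0 sinks. */
	frame->vic = (vic > 64 && !is_hdmi2_sink) ? 0 : vic;
}

/* New shape: the caller passes the connector; the helper derives the flag. */
void fake_fill_new(struct fake_infoframe *frame,
		   const struct fake_connector *connector, int vic)
{
	/* Assumption of this sketch: no connector means a pre-2.0 sink. */
	bool is_hdmi2_sink = connector != NULL && connector->sink_is_hdmi2;

	fake_fill_old(frame, vic, is_hdmi2_sink);
}
```

With this shape a driver only needs the connector pointer it already holds, and the decision about the sink lives in one place instead of in each caller.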
[Nouveau] [PATCH] drm/nouveau: tegra: Call nouveau_drm_device_init()
From: Thierry Reding

As part of commit cfea88a4d866 ("drm/nouveau: Start using new drm_dev
initialization helpers"), the initialization of the Nouveau DRM device
was reworked and along the way the platform driver initialization was
left incomplete. Add a call to nouveau_drm_device_init() to make sure
all of the structures are properly initialized.

Signed-off-by: Thierry Reding
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 2b2baf6e0e0d..d2928d43f29a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -1171,10 +1171,16 @@ nouveau_platform_device_create(const struct nvkm_device_tegra_func *func,
 		goto err_free;
 	}
 
+	err = nouveau_drm_device_init(drm);
+	if (err)
+		goto err_put;
+
 	platform_set_drvdata(pdev, drm);
 
 	return drm;
 
+err_put:
+	drm_dev_put(drm);
 err_free:
 	nvkm_device_del(pdevice);
 
-- 
2.19.1
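The error handling added by the patch above follows the usual kernel pattern of stacked goto labels, where a failure at a later step unwinds everything set up before it in reverse order. The following is a minimal standalone sketch of that pattern only. struct fake_state, enum fake_fail and fake_create() are hypothetical names made up for illustration; they do not correspond to nouveau or DRM functions, and the three steps merely mirror the order seen in the diff (low-level device, DRM device, late init).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; not a real nouveau type. */
struct fake_state {
	int core_alive;   /* 1 while the low-level device exists */
	int drm_alive;    /* 1 while a reference to the DRM device is held */
	int initialized;  /* 1 once the late init step has completed */
};

/* Selects which step of the sketch should fail. */
enum fake_fail { FAIL_NONE, FAIL_ALLOC, FAIL_INIT };

int fake_create(struct fake_state *s, enum fake_fail fail)
{
	int err;

	assert(s != NULL);

	s->core_alive = 1;              /* step 1: create the low-level device */

	if (fail == FAIL_ALLOC) {       /* step 2: allocate the DRM device */
		err = -ENOMEM;
		goto err_free;
	}
	s->drm_alive = 1;

	if (fail == FAIL_INIT) {        /* step 3: the late init call */
		err = -EINVAL;
		goto err_put;
	}
	s->initialized = 1;

	return 0;

err_put:
	s->drm_alive = 0;               /* drop the reference taken in step 2 */
err_free:                               /* err_put falls through to here */
	s->core_alive = 0;              /* tear down step 1 */
	return err;
}
```

Because err_put sits directly above err_free, a failure in the last step releases the DRM reference and then falls through to free the low-level device, while an earlier failure skips the release it does not need.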