Re: [PATCH][next] drm/nouveau/fifo/gk104: remove redundant variable ret

2024-01-22 Thread Dan Carpenter
On Tue, Jan 23, 2024 at 12:04:23AM +0100, Danilo Krummrich wrote:
> On 1/16/24 13:31, Dan Carpenter wrote:
> > On Tue, Jan 16, 2024 at 11:16:09AM +, Colin Ian King wrote:
> > > The variable ret is being assigned a value but it isn't being
> > > read afterwards. The assignment is redundant and so ret can be
> > > removed.
> > > 
> > > Cleans up clang scan build warning:
> > > warning: Although the value stored to 'ret' is used in the enclosing
> > > expression, the value is never actually read from 'ret'
> > > [deadcode.DeadStores]
> > > 
> > > Signed-off-by: Colin Ian King 
> > > ---
> > >   drivers/gpu/drm/nouveau/nvif/fifo.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/nouveau/nvif/fifo.c 
> > > b/drivers/gpu/drm/nouveau/nvif/fifo.c
> > > index a463289962b2..e96de14ce87e 100644
> > > --- a/drivers/gpu/drm/nouveau/nvif/fifo.c
> > > +++ b/drivers/gpu/drm/nouveau/nvif/fifo.c
> > > @@ -73,9 +73,9 @@ u64
> > >   nvif_fifo_runlist(struct nvif_device *device, u64 engine)
> > >   {
> > >   u64 runm = 0;
> > > - int ret, i;
> > > + int i;
> > > - if ((ret = nvif_fifo_runlists(device)))
> > > + if (nvif_fifo_runlists(device))
> > >   return runm;
> > 
> > Could we return a literal zero here?  Otherwise, I'm surprised this
> > doesn't trigger a static checker warning.
> 
> Why do you think so? Conditionally, runm is used later on as well. I don't
> think the checker should complain about keeping the value single source.
> 
> If you agree, want to offer your RB?

If you look at v6.7 then probably 300 patches were from static
analysis.  The syzbot gets credit for 63 bugs and those bugs are more
important because those are real life bugs.  But static analysis is a
huge in terms of just quantity.

One of the most common bugs that static checkers complain about is
missing error codes.  It's a super common bug.  Returning success
instead of failure almost always results in NULL dereference or a use
after free or some kind of crash.  Fortunately, error paths seldom
affect real life users.

My published Smatch checks only complain about:

if (ret)
return ret;

if (failure)
return ret;

I have a different check that I haven't published but I wish that I
could which looks like:

if (!ret)
return ret;

Here is a bug that check found recently.
https://lore.kernel.org/all/9c81251b-bc87-4ca3-bb86-843dc85e5145@moroto.mountain/

I have a different unpublished check for every time ret is zero and we
do:
return ret;

But I only review those warnings for specific code.  Perhaps, I could
make a warning for:

if (failure)
return ret;

I'm sure I tried this in the past and it had more false positives than
when we have an "if (ret) return ret;" like in the first example, but I
can't remember.  I could experiment with that a bit...

To me, if "return ret;" and "return 0;" are the same, then "return 0;"
is obviously more clear and looks more intentional.  When I was looking
at the code here, I had to consider the context.  Especially when the
patch was dealing with the "ret" variable it seemed suspicous.  But
"return 0;" is unamibuous.

I don't have a problem with this patch, it's correct.  But I really do
think that "return 0;" is clearer than "return ret;"

regards,
dan carpenter



[PATCH] nouveau: rip out fence irq allow/block sequences.

2024-01-22 Thread Dave Airlie
From: Dave Airlie 

fences are signalled on nvidia hw using non-stall interrupts.

non-stall interrupts are not latched from my reading.

When nouveau emits a fence, it requests a NON_STALL signalling,
but it only calls the interface to allow the non-stall irq to happen
after it has already emitted the fence. A recent change
eacabb546271 ("nouveau: push event block/allowing out of the fence context")
made this worse by pushing out the fence allow/block to a workqueue.

However I can't see how this could ever work great, since when
enable signalling is called, the semaphore has already been emitted
to the ring, and the hw could already have tried to set the bits,
but it's been masked off. Changing the allowed mask later won't make
the interrupt get called again.

For now rip all of this out.

This fixes a bunch of stalls seen running VK CTS sync tests.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 77 +
 drivers/gpu/drm/nouveau/nouveau_fence.h |  2 -
 2 files changed, 16 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 5057d976fa57..d6d50cdccf75 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -50,24 +50,14 @@ nouveau_fctx(struct nouveau_fence *fence)
return container_of(fence->base.lock, struct nouveau_fence_chan, lock);
 }
 
-static int
+static void
 nouveau_fence_signal(struct nouveau_fence *fence)
 {
-   int drop = 0;
-
dma_fence_signal_locked(>base);
list_del(>head);
rcu_assign_pointer(fence->channel, NULL);
 
-   if (test_bit(DMA_FENCE_FLAG_USER_BITS, >base.flags)) {
-   struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
-
-   if (atomic_dec_and_test(>notify_ref))
-   drop = 1;
-   }
-
dma_fence_put(>base);
-   return drop;
 }
 
 static struct nouveau_fence *
@@ -93,8 +83,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, 
int error)
if (error)
dma_fence_set_error(>base, error);
 
-   if (nouveau_fence_signal(fence))
-   nvif_event_block(>event);
+   nouveau_fence_signal(fence);
}
fctx->killed = 1;
spin_unlock_irqrestore(>lock, flags);
@@ -103,8 +92,8 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, 
int error)
 void
 nouveau_fence_context_del(struct nouveau_fence_chan *fctx)
 {
-   cancel_work_sync(>allow_block_work);
nouveau_fence_context_kill(fctx, 0);
+   nvif_event_block(>event);
nvif_event_dtor(>event);
fctx->dead = 1;
 
@@ -127,11 +116,10 @@ nouveau_fence_context_free(struct nouveau_fence_chan 
*fctx)
kref_put(>fence_ref, nouveau_fence_context_put);
 }
 
-static int
+static void
 nouveau_fence_update(struct nouveau_channel *chan, struct nouveau_fence_chan 
*fctx)
 {
struct nouveau_fence *fence;
-   int drop = 0;
u32 seq = fctx->read(chan);
 
while (!list_empty(>pending)) {
@@ -140,10 +128,8 @@ nouveau_fence_update(struct nouveau_channel *chan, struct 
nouveau_fence_chan *fc
if ((int)(seq - fence->base.seqno) < 0)
break;
 
-   drop |= nouveau_fence_signal(fence);
+   nouveau_fence_signal(fence);
}
-
-   return drop;
 }
 
 static int
@@ -160,26 +146,13 @@ nouveau_fence_wait_uevent_handler(struct nvif_event 
*event, void *repv, u32 repc
 
fence = list_entry(fctx->pending.next, typeof(*fence), head);
chan = rcu_dereference_protected(fence->channel, 
lockdep_is_held(>lock));
-   if (nouveau_fence_update(chan, fctx))
-   ret = NVIF_EVENT_DROP;
+   nouveau_fence_update(chan, fctx);
}
spin_unlock_irqrestore(>lock, flags);
 
return ret;
 }
 
-static void
-nouveau_fence_work_allow_block(struct work_struct *work)
-{
-   struct nouveau_fence_chan *fctx = container_of(work, struct 
nouveau_fence_chan,
-  allow_block_work);
-
-   if (atomic_read(>notify_ref) == 0)
-   nvif_event_block(>event);
-   else
-   nvif_event_allow(>event);
-}
-
 void
 nouveau_fence_context_new(struct nouveau_channel *chan, struct 
nouveau_fence_chan *fctx)
 {
@@ -191,7 +164,6 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
struct nouveau_fence_cha
} args;
int ret;
 
-   INIT_WORK(>allow_block_work, nouveau_fence_work_allow_block);
INIT_LIST_HEAD(>flip);
INIT_LIST_HEAD(>pending);
spin_lock_init(>lock);
@@ -216,6 +188,12 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
struct nouveau_fence_cha
  , sizeof(args), >event);
 
WARN_ON(ret);
+
+   /*
+* Always allow non-stall 

[PATCH 16/82] drm/nouveau/mmu: Refactor intentional wrap-around calculation

2024-01-22 Thread Kees Cook
In an effort to separate intentional arithmetic wrap-around from
unexpected wrap-around, we need to refactor places that depend on this
kind of math. One of the most common code patterns of this is:

VAR + value < VAR

Notably, this is considered "undefined behavior" for signed and pointer
types, which the kernel works around by using the -fno-strict-overflow
option in the build[1] (which used to just be -fwrapv). Regardless, we
want to get the kernel source to the position where we can meaningfully
instrument arithmetic wrap-around conditions and catch them when they
are unexpected, regardless of whether they are signed[2], unsigned[3],
or pointer[4] types.

Refactor open-coded unsigned wrap-around addition test to use
check_add_overflow(), retaining the result for later usage (which removes
the redundant open-coded addition). This paves the way to enabling the
wrap-around sanitizers in the future.

Link: https://git.kernel.org/linus/68df3755e383e6fecf2354a67b08f92f18536594 [1]
Link: https://github.com/KSPP/linux/issues/26 [2]
Link: https://github.com/KSPP/linux/issues/27 [3]
Link: https://github.com/KSPP/linux/issues/344 [4]
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Ben Skeggs 
Cc: Dave Airlie 
Cc: Julia Lawall 
Cc: Jiang Jian 
Cc: dri-de...@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Kees Cook 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 9c97800fe037..6ca1a82ccbc1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -1149,13 +1149,15 @@ nvkm_vmm_ctor(const struct nvkm_vmm_func *func, struct 
nvkm_mmu *mmu,
vmm->root = RB_ROOT;
 
if (managed) {
+   u64 sum;
+
/* Address-space will be managed by the client for the most
 * part, except for a specified area where NVKM allocations
 * are allowed to be placed.
 */
vmm->start = 0;
vmm->limit = 1ULL << bits;
-   if (addr + size < addr || addr + size > vmm->limit)
+   if (check_add_overflow(addr, size, ) || sum > vmm->limit)
return -EINVAL;
 
/* Client-managed area before the NVKM-managed area. */
@@ -1174,7 +1176,7 @@ nvkm_vmm_ctor(const struct nvkm_vmm_func *func, struct 
nvkm_mmu *mmu,
}
 
/* Client-managed area after the NVKM-managed area. */
-   addr = addr + size;
+   addr = sum;
size = vmm->limit - addr;
if (size && (ret = nvkm_vmm_ctor_managed(vmm, addr, size)))
return ret;
-- 
2.34.1



[PATCH 48/82] drm/nouveau/mmu: Refactor intentional wrap-around test

2024-01-22 Thread Kees Cook
In an effort to separate intentional arithmetic wrap-around from
unexpected wrap-around, we need to refactor places that depend on this
kind of math. One of the most common code patterns of this is:

VAR + value < VAR

Notably, this is considered "undefined behavior" for signed and pointer
types, which the kernel works around by using the -fno-strict-overflow
option in the build[1] (which used to just be -fwrapv). Regardless, we
want to get the kernel source to the position where we can meaningfully
instrument arithmetic wrap-around conditions and catch them when they
are unexpected, regardless of whether they are signed[2], unsigned[3],
or pointer[4] types.

Refactor open-coded wrap-around addition test to use add_would_overflow().
This paves the way to enabling the wrap-around sanitizers in the future.

Link: https://git.kernel.org/linus/68df3755e383e6fecf2354a67b08f92f18536594 [1]
Link: https://github.com/KSPP/linux/issues/26 [2]
Link: https://github.com/KSPP/linux/issues/27 [3]
Link: https://github.com/KSPP/linux/issues/344 [4]
Cc: Karol Herbst 
Cc: Lyude Paul 
Cc: Danilo Krummrich 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Ben Skeggs 
Cc: dri-de...@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Kees Cook 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 6ca1a82ccbc1..87c0903be9a7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -1291,7 +1291,7 @@ nvkm_vmm_pfn_map(struct nvkm_vmm *vmm, u8 shift, u64 
addr, u64 size, u64 *pfn)
 
if (!page->shift || !IS_ALIGNED(addr, 1ULL << shift) ||
!IS_ALIGNED(size, 1ULL << shift) ||
-   addr + size < addr || addr + size > vmm->limit) {
+   add_would_overflow(addr, size) || addr + size > vmm->limit) {
VMM_DEBUG(vmm, "paged map %d %d %016llx %016llx\n",
  shift, page->shift, addr, size);
return -EINVAL;
-- 
2.34.1



Re: [PATCH][next] drm/nouveau/fifo/gk104: remove redundant variable ret

2024-01-22 Thread Danilo Krummrich

On 1/16/24 13:31, Dan Carpenter wrote:

On Tue, Jan 16, 2024 at 11:16:09AM +, Colin Ian King wrote:

The variable ret is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so ret can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King 
---
  drivers/gpu/drm/nouveau/nvif/fifo.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/fifo.c 
b/drivers/gpu/drm/nouveau/nvif/fifo.c
index a463289962b2..e96de14ce87e 100644
--- a/drivers/gpu/drm/nouveau/nvif/fifo.c
+++ b/drivers/gpu/drm/nouveau/nvif/fifo.c
@@ -73,9 +73,9 @@ u64
  nvif_fifo_runlist(struct nvif_device *device, u64 engine)
  {
u64 runm = 0;
-   int ret, i;
+   int i;
  
-	if ((ret = nvif_fifo_runlists(device)))

+   if (nvif_fifo_runlists(device))
return runm;


Could we return a literal zero here?  Otherwise, I'm surprised this
doesn't trigger a static checker warning.


Why do you think so? Conditionally, runm is used later on as well. I don't
think the checker should complain about keeping the value single source.

If you agree, want to offer your RB?

- Danilo



regards,
dan carpenter





Re: [PATCH] mm: Remove double faults once write a device pfn

2024-01-22 Thread Christian König

Am 22.01.24 um 04:32 schrieb Xianrong Zhou:

The vmf_insert_pfn_prot could cause unnecessary double faults
on a device pfn. Because currently the vmf_insert_pfn_prot does
not make the pfn writable so the pte entry is normally read-only
or dirty catching.


What? How do you got to this conclusion?


The first fault only sets up the pte entry which actually is
dirty catching. And the second immediate fault to the pfn due
to first dirty catching when the cpu re-execute the store
instruction.


It could be that this is done to work around some hw behavior, but not 
because of dirty catching.



Normally if the drivers call vmf_insert_pfn_prot and also supply
'pfn_mkwrite' callback within vm_operations_struct which requires
the pte to be dirty catching then the vmf_insert_pfn_prot and the
double fault are reasonable. It is not a problem.


Well, as far as I can see that behavior absolutely doesn't make sense.

When pfn_mkwrite is requested then the driver should use PAGE_COPY, 
which is exactly what VMWGFX (the only driver using dirty tracking) is 
doing.


Everybody else uses PAGE_SHARED which should make the pte writeable 
immediately.


Regards,
Christian.



However the most of drivers calling vmf_insert_pfn_prot do not
supply the 'pfn_mkwrite' callback so that the second fault is
unnecessary.

So just like vmf_insert_mixed and vmf_insert_mixed_mkwrite pair,
we should also supply vmf_insert_pfn_mkwrite for drivers as well.

Signed-off-by: Xianrong Zhou 
---
  arch/x86/entry/vdso/vma.c  |  3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c|  2 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  2 +-
  drivers/gpu/drm/nouveau/nouveau_gem.c  |  2 +-
  drivers/gpu/drm/radeon/radeon_gem.c|  2 +-
  drivers/gpu/drm/ttm/ttm_bo_vm.c|  8 +---
  drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c |  8 +---
  include/drm/ttm/ttm_bo.h   |  3 ++-
  include/linux/mm.h |  2 +-
  mm/memory.c| 14 +++---
  10 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 7645730dc228..dd2431c2975f 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -185,7 +185,8 @@ static vm_fault_t vvar_fault(const struct 
vm_special_mapping *sm,
if (pvti && vclock_was_used(VDSO_CLOCKMODE_PVCLOCK)) {
return vmf_insert_pfn_prot(vma, vmf->address,
__pa(pvti) >> PAGE_SHIFT,
-   pgprot_decrypted(vma->vm_page_prot));
+   pgprot_decrypted(vma->vm_page_prot),
+   true);
}
} else if (sym_offset == image->sym_hvclock_page) {
pfn = hv_get_tsc_pfn();
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 49a5f1c73b3e..adcb20d9e624 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -64,7 +64,7 @@ static vm_fault_t amdgpu_gem_fault(struct vm_fault *vmf)
}
  
  		ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,

-  TTM_BO_VM_NUM_PREFAULT);
+  TTM_BO_VM_NUM_PREFAULT, true);
  
  		drm_dev_exit(idx);

} else {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 9227f8146a58..c6f13ae6c308 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1114,7 +1114,7 @@ static vm_fault_t vm_fault_ttm(struct vm_fault *vmf)
  
  	if (drm_dev_enter(dev, )) {

ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
-  TTM_BO_VM_NUM_PREFAULT);
+  TTM_BO_VM_NUM_PREFAULT, true);
drm_dev_exit(idx);
} else {
ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot);
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 49c2bcbef129..7e1453762ec9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -56,7 +56,7 @@ static vm_fault_t nouveau_ttm_fault(struct vm_fault *vmf)
  
  	nouveau_bo_del_io_reserve_lru(bo);

prot = vm_get_page_prot(vma->vm_flags);
-   ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT);
+   ret = ttm_bo_vm_fault_reserved(vmf, prot, TTM_BO_VM_NUM_PREFAULT, true);
nouveau_bo_add_io_reserve_lru(bo);
if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT))
return ret;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 3fec3acdaf28..b21cf00ae162 100644
---