Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-29 Thread David Hildenbrand
On 29.06.22 10:31, Tong Zhang wrote:
> 
> 
> On Wed, Jun 29, 2022 at 12:29 AM David Hildenbrand  > wrote:
> 
> On 06.05.22 18:31, Tong Zhang wrote:
> > assert(dbs->acb) is meant to check the return value of io_func per
> > documented in commit 6bee44ea34 ("dma: the passed io_func does not
> > return NULL"). However, there is a chance that after calling
> > aio_context_release(dbs->ctx); the dma_blk_cb function is called
> before
> > the assertion and dbs->acb is set to NULL again at line 121. Thus when
> > we run assert at line 181 it will fail.
> >
> >   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> >
> > Reported-by: Francisco Londono  >
> > Signed-off-by: Tong Zhang  >
> > ---
> >  softmmu/dma-helpers.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> > index 7820fec54c..cb81017928 100644
> > --- a/softmmu/dma-helpers.c
> > +++ b/softmmu/dma-helpers.c
> > @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
> >      aio_context_acquire(dbs->ctx);
> >      dbs->acb = dbs->io_func(dbs->offset, >iov,
> >                              dma_blk_cb, dbs, dbs->io_func_opaque);
> > -    aio_context_release(dbs->ctx);
> >      assert(dbs->acb);
> > +    aio_context_release(dbs->ctx);
> >  }
> > 
> >  static void dma_aio_cancel(BlockAIOCB *acb)
> 
> Please don't resend patches if the previous submission came to the
> conclusion that it's unclear how this should help.
> 
> 
> https://lkml.kernel.org/r/cajsp0qw396ry_g8ls1mncdzcov5gamury+xv+s8zmcdq03o...@mail.gmail.com
> 
> 
> 
> 
> I *still* don't understand the interaction between the lock and the
> assertion and so far nobody was able to clarify.
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
> hello
> 
> This message is sent way before the discussion 

Oh, I'm sorry. I was mislead by the reply from Laurent :)

BTW, do we now have an understanding why that patch helps and if it
applies to upstream?

-- 
Thanks,

David / dhildenb




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-29 Thread Tong Zhang
On Wed, Jun 29, 2022 at 12:29 AM David Hildenbrand  wrote:

> On 06.05.22 18:31, Tong Zhang wrote:
> > assert(dbs->acb) is meant to check the return value of io_func per
> > documented in commit 6bee44ea34 ("dma: the passed io_func does not
> > return NULL"). However, there is a chance that after calling
> > aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> > the assertion and dbs->acb is set to NULL again at line 121. Thus when
> > we run assert at line 181 it will fail.
> >
> >   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> >
> > Reported-by: Francisco Londono 
> > Signed-off-by: Tong Zhang 
> > ---
> >  softmmu/dma-helpers.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> > index 7820fec54c..cb81017928 100644
> > --- a/softmmu/dma-helpers.c
> > +++ b/softmmu/dma-helpers.c
> > @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
> >  aio_context_acquire(dbs->ctx);
> >  dbs->acb = dbs->io_func(dbs->offset, >iov,
> >  dma_blk_cb, dbs, dbs->io_func_opaque);
> > -aio_context_release(dbs->ctx);
> >  assert(dbs->acb);
> > +aio_context_release(dbs->ctx);
> >  }
> >
> >  static void dma_aio_cancel(BlockAIOCB *acb)
>
> Please don't resend patches if the previous submission came to the
> conclusion that it's unclear how this should help.
>
>
> https://lkml.kernel.org/r/cajsp0qw396ry_g8ls1mncdzcov5gamury+xv+s8zmcdq03o...@mail.gmail.com
>
>
> I *still* don't understand the interaction between the lock and the
> assertion and so far nobody was able to clarify.
>
> --
> Thanks,
>
> David / dhildenb
>
hello

This message is sent way before the discussion


>


Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-29 Thread David Hildenbrand
On 06.05.22 18:31, Tong Zhang wrote:
> assert(dbs->acb) is meant to check the return value of io_func per
> documented in commit 6bee44ea34 ("dma: the passed io_func does not
> return NULL"). However, there is a chance that after calling
> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> the assertion and dbs->acb is set to NULL again at line 121. Thus when
> we run assert at line 181 it will fail.
> 
>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> 
> Reported-by: Francisco Londono 
> Signed-off-by: Tong Zhang 
> ---
>  softmmu/dma-helpers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> index 7820fec54c..cb81017928 100644
> --- a/softmmu/dma-helpers.c
> +++ b/softmmu/dma-helpers.c
> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
>  aio_context_acquire(dbs->ctx);
>  dbs->acb = dbs->io_func(dbs->offset, >iov,
>  dma_blk_cb, dbs, dbs->io_func_opaque);
> -aio_context_release(dbs->ctx);
>  assert(dbs->acb);
> +aio_context_release(dbs->ctx);
>  }
>  
>  static void dma_aio_cancel(BlockAIOCB *acb)

Please don't resend patches if the previous submission came to the
conclusion that it's unclear how this should help.

https://lkml.kernel.org/r/cajsp0qw396ry_g8ls1mncdzcov5gamury+xv+s8zmcdq03o...@mail.gmail.com


I *still* don't understand the interaction between the lock and the
assertion and so far nobody was able to clarify.

-- 
Thanks,

David / dhildenb




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-28 Thread Laurent Vivier

Le 06/05/2022 à 18:31, Tong Zhang a écrit :

assert(dbs->acb) is meant to check the return value of io_func per
documented in commit 6bee44ea34 ("dma: the passed io_func does not
return NULL"). However, there is a chance that after calling
aio_context_release(dbs->ctx); the dma_blk_cb function is called before
the assertion and dbs->acb is set to NULL again at line 121. Thus when
we run assert at line 181 it will fail.

   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.

Reported-by: Francisco Londono 
Signed-off-by: Tong Zhang 
---
  softmmu/dma-helpers.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7820fec54c..cb81017928 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
  aio_context_acquire(dbs->ctx);
  dbs->acb = dbs->io_func(dbs->offset, >iov,
  dma_blk_cb, dbs, dbs->io_func_opaque);
-aio_context_release(dbs->ctx);
  assert(dbs->acb);
+aio_context_release(dbs->ctx);
  }
  
  static void dma_aio_cancel(BlockAIOCB *acb)


Fixes: 1919631e6b55 ("block: explicitly acquire aiocontext in bottom halves that 
need it")
Reviewed-by: Laurent Vivier 




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread Stefan Hajnoczi
On Thu, Jun 2, 2022, 02:04 Tong Zhang  wrote:
>
> Hi Stefan,
>
> On Wed, Jun 1, 2022 at 6:56 AM Stefan Hajnoczi  wrote:
> >
> > > > This patch makes sense to me. Can you rephrase your concern?
> > >
> > > The locking is around dbs->io_func().
> > >
> > > aio_context_acquire(dbs->ctx);
> > > dbs->acb = dbs->io_func()
> > > aio_context_release(dbs->ctx);
> > >
> > >
> > > So where exactly would the lock that's now still held stop someone from
> > > modifying dbs->acb = NULL at the beginning of the function, which seems
> > > to be not protected by that lock?
> > >
> > > Maybe I'm missing some locking magic due to the lock being a recursive 
> > > lock.
> >
> > Tong Zhang: Can you share a backtrace of all threads when the
> > assertion failure occurs?
> >
> Sorry I couldn't get the trace now -- but I can tell that we have some
> internal code uses
> this dma related code and will grab dbs->ctx lock in another thread
> and could overwrite dbs->acb.
>
> From my understanding, one of the reasons that the lock is required
> here is to protect dbs->acb,
> we could not reliably test io_func()'s return value after releasing
> the lock here.
>
> Since this code affects our internal code base and I did not reproduce
> on master branch,
> feel free to ignore it.

If this patch is unnecessary on qemu.git/master it raises the question
whether aio_context_acquire/release() should be removed from
dma_blk_cb(). It was added by:

commit 1919631e6b5562e474690853eca3c35610201e16
Author: Paolo Bonzini 
Date:   Mon Feb 13 14:52:31 2017 +0100

block: explicitly acquire aiocontext in bottom halves that need it

Paolo: Is dma_blk_cb() called without the AioContext lock (by
virtio-scsi or any code path with IOThreads)? David pointed out that
if that's the case then the dbs->acb is accessed without the lock at
the start of dma_blk_cb().

Stefan



Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread Tong Zhang
Hi Stefan,

On Wed, Jun 1, 2022 at 6:56 AM Stefan Hajnoczi  wrote:
>
> > > This patch makes sense to me. Can you rephrase your concern?
> >
> > The locking is around dbs->io_func().
> >
> > aio_context_acquire(dbs->ctx);
> > dbs->acb = dbs->io_func()
> > aio_context_release(dbs->ctx);
> >
> >
> > So where exactly would the lock that's now still held stop someone from
> > modifying dbs->acb = NULL at the beginning of the function, which seems
> > to be not protected by that lock?
> >
> > Maybe I'm missing some locking magic due to the lock being a recursive lock.
>
> Tong Zhang: Can you share a backtrace of all threads when the
> assertion failure occurs?
>
Sorry I couldn't get the trace now -- but I can tell that we have some
internal code uses
this dma related code and will grab dbs->ctx lock in another thread
and could overwrite dbs->acb.

>From my understanding, one of the reasons that the lock is required
here is to protect dbs->acb,
we could not reliably test io_func()'s return value after releasing
the lock here.

Since this code affects our internal code base and I did not reproduce
on master branch,
feel free to ignore it.

- Tong

> Stefan



Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread Stefan Hajnoczi
On Wed, 1 Jun 2022 at 14:29, David Hildenbrand  wrote:
>
> On 01.06.22 15:24, Stefan Hajnoczi wrote:
> > On Wed, Jun 01, 2022 at 10:00:50AM +0200, David Hildenbrand wrote:
> >> On 01.06.22 02:20, Tong Zhang wrote:
> >>> Hi David,
> >>>
> >>> On Mon, May 30, 2022 at 9:19 AM David Hildenbrand  
> >>> wrote:
> 
>  On 27.04.22 22:51, Tong Zhang wrote:
> > assert(dbs->acb) is meant to check the return value of io_func per
> > documented in commit 6bee44ea34 ("dma: the passed io_func does not
> > return NULL"). However, there is a chance that after calling
> > aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> > the assertion and dbs->acb is set to NULL again at line 121. Thus when
> > we run assert at line 181 it will fail.
> >
> >   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> >
> > Reported-by: Francisco Londono 
> > Signed-off-by: Tong Zhang 
> > ---
> >  softmmu/dma-helpers.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> > index 7820fec54c..cb81017928 100644
> > --- a/softmmu/dma-helpers.c
> > +++ b/softmmu/dma-helpers.c
> > @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
> >  aio_context_acquire(dbs->ctx);
> >  dbs->acb = dbs->io_func(dbs->offset, >iov,
> >  dma_blk_cb, dbs, dbs->io_func_opaque);
> > -aio_context_release(dbs->ctx);
> >  assert(dbs->acb);
> > +aio_context_release(dbs->ctx);
> >  }
> >
> >  static void dma_aio_cancel(BlockAIOCB *acb)
> 
>  I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
>  run after you reshuffled the code?
> 
> >>>
> >>> IMO if the assert is to test whether io_func returns a non-NULL value
> >>> shouldn't it be immediately after calling io_func.
> >>> Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
> >>>   > dma: the passed io_func does not return NULL
> >>
> >> Yes, but I just don't see how it would fix the assertion you document in
> >> the patch description. The locking change to fix the assertion doesn't
> >> make any sense to me, and most probably I am missing something important :)
> >
> > The other thread will invoke dma_blk_cb(), which modifies dbs->acb, when
> > it can take the lock. Therefore dbs->acb may contain a value different
> > from our io_func()'s return value by the time we perform the assertion
> > check (that's the race).
> >
> > This patch makes sense to me. Can you rephrase your concern?
>
> The locking is around dbs->io_func().
>
> aio_context_acquire(dbs->ctx);
> dbs->acb = dbs->io_func()
> aio_context_release(dbs->ctx);
>
>
> So where exactly would the lock that's now still held stop someone from
> modifying dbs->acb = NULL at the beginning of the function, which seems
> to be not protected by that lock?
>
> Maybe I'm missing some locking magic due to the lock being a recursive lock.

Tong Zhang: Can you share a backtrace of all threads when the
assertion failure occurs?

Stefan



Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread David Hildenbrand
On 01.06.22 15:24, Stefan Hajnoczi wrote:
> On Wed, Jun 01, 2022 at 10:00:50AM +0200, David Hildenbrand wrote:
>> On 01.06.22 02:20, Tong Zhang wrote:
>>> Hi David,
>>>
>>> On Mon, May 30, 2022 at 9:19 AM David Hildenbrand  wrote:

 On 27.04.22 22:51, Tong Zhang wrote:
> assert(dbs->acb) is meant to check the return value of io_func per
> documented in commit 6bee44ea34 ("dma: the passed io_func does not
> return NULL"). However, there is a chance that after calling
> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> the assertion and dbs->acb is set to NULL again at line 121. Thus when
> we run assert at line 181 it will fail.
>
>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
>
> Reported-by: Francisco Londono 
> Signed-off-by: Tong Zhang 
> ---
>  softmmu/dma-helpers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> index 7820fec54c..cb81017928 100644
> --- a/softmmu/dma-helpers.c
> +++ b/softmmu/dma-helpers.c
> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
>  aio_context_acquire(dbs->ctx);
>  dbs->acb = dbs->io_func(dbs->offset, >iov,
>  dma_blk_cb, dbs, dbs->io_func_opaque);
> -aio_context_release(dbs->ctx);
>  assert(dbs->acb);
> +aio_context_release(dbs->ctx);
>  }
>
>  static void dma_aio_cancel(BlockAIOCB *acb)

 I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
 run after you reshuffled the code?

>>>
>>> IMO if the assert is to test whether io_func returns a non-NULL value
>>> shouldn't it be immediately after calling io_func.
>>> Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
>>>   > dma: the passed io_func does not return NULL
>>
>> Yes, but I just don't see how it would fix the assertion you document in
>> the patch description. The locking change to fix the assertion doesn't
>> make any sense to me, and most probably I am missing something important :)
> 
> The other thread will invoke dma_blk_cb(), which modifies dbs->acb, when
> it can take the lock. Therefore dbs->acb may contain a value different
> from our io_func()'s return value by the time we perform the assertion
> check (that's the race).
> 
> This patch makes sense to me. Can you rephrase your concern?

The locking is around dbs->io_func().

aio_context_acquire(dbs->ctx);
dbs->acb = dbs->io_func()
aio_context_release(dbs->ctx);


So where exactly would the lock that's now still held stop someone from
modifying dbs->acb = NULL at the beginning of the function, which seems
to be not protected by that lock?

Maybe I'm missing some locking magic due to the lock being a recursive lock.

-- 
Thanks,

David / dhildenb




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread Stefan Hajnoczi
On Wed, Jun 01, 2022 at 10:00:50AM +0200, David Hildenbrand wrote:
> On 01.06.22 02:20, Tong Zhang wrote:
> > Hi David,
> > 
> > On Mon, May 30, 2022 at 9:19 AM David Hildenbrand  wrote:
> >>
> >> On 27.04.22 22:51, Tong Zhang wrote:
> >>> assert(dbs->acb) is meant to check the return value of io_func per
> >>> documented in commit 6bee44ea34 ("dma: the passed io_func does not
> >>> return NULL"). However, there is a chance that after calling
> >>> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> >>> the assertion and dbs->acb is set to NULL again at line 121. Thus when
> >>> we run assert at line 181 it will fail.
> >>>
> >>>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> >>>
> >>> Reported-by: Francisco Londono 
> >>> Signed-off-by: Tong Zhang 
> >>> ---
> >>>  softmmu/dma-helpers.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> >>> index 7820fec54c..cb81017928 100644
> >>> --- a/softmmu/dma-helpers.c
> >>> +++ b/softmmu/dma-helpers.c
> >>> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
> >>>  aio_context_acquire(dbs->ctx);
> >>>  dbs->acb = dbs->io_func(dbs->offset, >iov,
> >>>  dma_blk_cb, dbs, dbs->io_func_opaque);
> >>> -aio_context_release(dbs->ctx);
> >>>  assert(dbs->acb);
> >>> +aio_context_release(dbs->ctx);
> >>>  }
> >>>
> >>>  static void dma_aio_cancel(BlockAIOCB *acb)
> >>
> >> I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
> >> run after you reshuffled the code?
> >>
> > 
> > IMO if the assert is to test whether io_func returns a non-NULL value
> > shouldn't it be immediately after calling io_func.
> > Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
> >   > dma: the passed io_func does not return NULL
> 
> Yes, but I just don't see how it would fix the assertion you document in
> the patch description. The locking change to fix the assertion doesn't
> make any sense to me, and most probably I am missing something important :)

The other thread will invoke dma_blk_cb(), which modifies dbs->acb, when
it can take the lock. Therefore dbs->acb may contain a value different
from our io_func()'s return value by the time we perform the assertion
check (that's the race).

This patch makes sense to me. Can you rephrase your concern?

Stefan


signature.asc
Description: PGP signature


Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-06-01 Thread David Hildenbrand
On 01.06.22 02:20, Tong Zhang wrote:
> Hi David,
> 
> On Mon, May 30, 2022 at 9:19 AM David Hildenbrand  wrote:
>>
>> On 27.04.22 22:51, Tong Zhang wrote:
>>> assert(dbs->acb) is meant to check the return value of io_func per
>>> documented in commit 6bee44ea34 ("dma: the passed io_func does not
>>> return NULL"). However, there is a chance that after calling
>>> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
>>> the assertion and dbs->acb is set to NULL again at line 121. Thus when
>>> we run assert at line 181 it will fail.
>>>
>>>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
>>>
>>> Reported-by: Francisco Londono 
>>> Signed-off-by: Tong Zhang 
>>> ---
>>>  softmmu/dma-helpers.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
>>> index 7820fec54c..cb81017928 100644
>>> --- a/softmmu/dma-helpers.c
>>> +++ b/softmmu/dma-helpers.c
>>> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
>>>  aio_context_acquire(dbs->ctx);
>>>  dbs->acb = dbs->io_func(dbs->offset, >iov,
>>>  dma_blk_cb, dbs, dbs->io_func_opaque);
>>> -aio_context_release(dbs->ctx);
>>>  assert(dbs->acb);
>>> +aio_context_release(dbs->ctx);
>>>  }
>>>
>>>  static void dma_aio_cancel(BlockAIOCB *acb)
>>
>> I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
>> run after you reshuffled the code?
>>
> 
> IMO if the assert is to test whether io_func returns a non-NULL value
> shouldn't it be immediately after calling io_func.
> Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
>   > dma: the passed io_func does not return NULL

Yes, but I just don't see how it would fix the assertion you document in
the patch description. The locking change to fix the assertion doesn't
make any sense to me, and most probably I am missing something important :)

-- 
Thanks,

David / dhildenb




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-05-31 Thread Tong Zhang
Hi David,

On Mon, May 30, 2022 at 9:19 AM David Hildenbrand  wrote:
>
> On 27.04.22 22:51, Tong Zhang wrote:
> > assert(dbs->acb) is meant to check the return value of io_func per
> > documented in commit 6bee44ea34 ("dma: the passed io_func does not
> > return NULL"). However, there is a chance that after calling
> > aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> > the assertion and dbs->acb is set to NULL again at line 121. Thus when
> > we run assert at line 181 it will fail.
> >
> >   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> >
> > Reported-by: Francisco Londono 
> > Signed-off-by: Tong Zhang 
> > ---
> >  softmmu/dma-helpers.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> > index 7820fec54c..cb81017928 100644
> > --- a/softmmu/dma-helpers.c
> > +++ b/softmmu/dma-helpers.c
> > @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
> >  aio_context_acquire(dbs->ctx);
> >  dbs->acb = dbs->io_func(dbs->offset, >iov,
> >  dma_blk_cb, dbs, dbs->io_func_opaque);
> > -aio_context_release(dbs->ctx);
> >  assert(dbs->acb);
> > +aio_context_release(dbs->ctx);
> >  }
> >
> >  static void dma_aio_cancel(BlockAIOCB *acb)
>
> I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
> run after you reshuffled the code?
>

IMO if the assert is to test whether io_func returns a non-NULL value
shouldn't it be immediately after calling io_func.
Also... as suggested by commit 6bee44ea346aed24e12d525daf10542d695508db
  > dma: the passed io_func does not return NULL

Thanks,
- Tong

> After all, acquire/release is only around the dbs->io_func() call, so I
> don't immediately see how it interacts with re-entrance?
>
> --
> Thanks,
>
> David / dhildenb
>



Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-05-30 Thread David Hildenbrand
On 27.04.22 22:51, Tong Zhang wrote:
> assert(dbs->acb) is meant to check the return value of io_func per
> documented in commit 6bee44ea34 ("dma: the passed io_func does not
> return NULL"). However, there is a chance that after calling
> aio_context_release(dbs->ctx); the dma_blk_cb function is called before
> the assertion and dbs->acb is set to NULL again at line 121. Thus when
> we run assert at line 181 it will fail.
> 
>   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.
> 
> Reported-by: Francisco Londono 
> Signed-off-by: Tong Zhang 
> ---
>  softmmu/dma-helpers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
> index 7820fec54c..cb81017928 100644
> --- a/softmmu/dma-helpers.c
> +++ b/softmmu/dma-helpers.c
> @@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
>  aio_context_acquire(dbs->ctx);
>  dbs->acb = dbs->io_func(dbs->offset, >iov,
>  dma_blk_cb, dbs, dbs->io_func_opaque);
> -aio_context_release(dbs->ctx);
>  assert(dbs->acb);
> +aio_context_release(dbs->ctx);
>  }
>  
>  static void dma_aio_cancel(BlockAIOCB *acb)

I'm fairly new to that code, but I wonder what prevents dma_blk_cb() to
run after you reshuffled the code?

After all, acquire/release is only around the dbs->io_func() call, so I
don't immediately see how it interacts with re-entrance?

-- 
Thanks,

David / dhildenb




Re: [RESEND PATCH] hw/dma: fix crash caused by race condition

2022-05-30 Thread Philippe Mathieu-Daudé via

+Emanuele / Alexander / Stefan

On 27/4/22 22:51, Tong Zhang wrote:

assert(dbs->acb) is meant to check the return value of io_func per
documented in commit 6bee44ea34 ("dma: the passed io_func does not
return NULL"). However, there is a chance that after calling
aio_context_release(dbs->ctx); the dma_blk_cb function is called before
the assertion and dbs->acb is set to NULL again at line 121. Thus when
we run assert at line 181 it will fail.

   softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.

Reported-by: Francisco Londono 
Signed-off-by: Tong Zhang 
---
  softmmu/dma-helpers.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7820fec54c..cb81017928 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
  aio_context_acquire(dbs->ctx);
  dbs->acb = dbs->io_func(dbs->offset, >iov,
  dma_blk_cb, dbs, dbs->io_func_opaque);
-aio_context_release(dbs->ctx);
  assert(dbs->acb);
+aio_context_release(dbs->ctx);
  }
  
  static void dma_aio_cancel(BlockAIOCB *acb)





[RESEND PATCH] hw/dma: fix crash caused by race condition

2022-05-06 Thread Tong Zhang
assert(dbs->acb) is meant to check the return value of io_func per
documented in commit 6bee44ea34 ("dma: the passed io_func does not
return NULL"). However, there is a chance that after calling
aio_context_release(dbs->ctx); the dma_blk_cb function is called before
the assertion and dbs->acb is set to NULL again at line 121. Thus when
we run assert at line 181 it will fail.

  softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.

Reported-by: Francisco Londono 
Signed-off-by: Tong Zhang 
---
 softmmu/dma-helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7820fec54c..cb81017928 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
 aio_context_acquire(dbs->ctx);
 dbs->acb = dbs->io_func(dbs->offset, >iov,
 dma_blk_cb, dbs, dbs->io_func_opaque);
-aio_context_release(dbs->ctx);
 assert(dbs->acb);
+aio_context_release(dbs->ctx);
 }
 
 static void dma_aio_cancel(BlockAIOCB *acb)
-- 
2.25.1



[RESEND PATCH] hw/dma: fix crash caused by race condition

2022-04-27 Thread Tong Zhang
assert(dbs->acb) is meant to check the return value of io_func per
documented in commit 6bee44ea34 ("dma: the passed io_func does not
return NULL"). However, there is a chance that after calling
aio_context_release(dbs->ctx); the dma_blk_cb function is called before
the assertion and dbs->acb is set to NULL again at line 121. Thus when
we run assert at line 181 it will fail.

  softmmu/dma-helpers.c:181: dma_blk_cb: Assertion `dbs->acb' failed.

Reported-by: Francisco Londono 
Signed-off-by: Tong Zhang 
---
 softmmu/dma-helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c
index 7820fec54c..cb81017928 100644
--- a/softmmu/dma-helpers.c
+++ b/softmmu/dma-helpers.c
@@ -177,8 +177,8 @@ static void dma_blk_cb(void *opaque, int ret)
 aio_context_acquire(dbs->ctx);
 dbs->acb = dbs->io_func(dbs->offset, >iov,
 dma_blk_cb, dbs, dbs->io_func_opaque);
-aio_context_release(dbs->ctx);
 assert(dbs->acb);
+aio_context_release(dbs->ctx);
 }
 
 static void dma_aio_cancel(BlockAIOCB *acb)
-- 
2.25.1