> From: Kevin Wolf [mailto:kw...@redhat.com]
> Am 21.02.2019 um 12:05 hat Pavel Dovgalyuk geschrieben:
> > Replay is capable of recording normal BH events, but sometimes
> > there are single use callbacks scheduled with aio_bh_schedule_oneshot
> > function. This patch enables recording and replaying such callbacks.
> > Block layer uses these events for calling the completion function.
> > Replaying these calls makes the execution deterministic.
> >
> > Signed-off-by: Pavel Dovgalyuk <pavel.dovga...@ispras.ru>
> >
> > --
> >
> > v6:
> >  - moved stub function to the separate file for fixing linux-user build
> > v10:
> >  - replaced all block layer aio_bh_schedule_oneshot calls
> This still doesn't catch all instances, e.g. everything that goes
> through aio_co_schedule() is missing.

It seems that everything else is synchronized by the blkreplay driver,
which is mandatory when using block devices in rr mode.

> But I fully expect this to get broken anyway all the time because nobody
> understands which function to use, and if it works for your special case
> now and we'll fix other stuff as you encounter it, maybe that's good
> enough for you.

This problem exists in every subsystem, and it is acceptable for now,
while record/replay is not yet mature and not familiar to other developers.
When virtual devices are updated, developers may miss a correct
loadvm/savevm implementation. For example, loading the audio device state
may shift the phase of the output signal. Nobody will notice that bug
in the migration process, but it shows up when we use record/replay.

We can't cover everything with record/replay tests. Most of the new bugs
show up only in complex configurations after billions of executed
instructions.
But once this feature is available out of the box, we'll
at least get more smoke testing.

> 
> > @@ -1349,8 +1351,8 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
> >
> >      acb->has_returned = true;
> >      if (acb->rwco.ret != NOT_DONE) {
> > -        aio_bh_schedule_oneshot(blk_get_aio_context(blk),
> > -                                blk_aio_complete_bh, acb);
> > +        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
> > +                                         blk_aio_complete_bh, acb);
> >      }
> 
> This, and a few other places that you convert, are in fast paths and add
> some calls that are unnecessary for non-replay cases.

I don't think this can cause a noticeable slowdown, but we can run
the tests if you want.
We have a test suite that performs disk-intensive computation.
It was created to measure the effect of running BH callbacks through
the virtual timer infrastructure.

> I wonder if we could make replay optional in ./configure and then make
> replay_bh_schedule_oneshot_event() a static inline function that can get
> optimised away at compile time if the feature is disabled.

It is coupled with icount. However, some icount calls also lie on
the fast paths and are completely useless when icount is not enabled.
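
For illustration only, the compile-time approach suggested above could look
roughly like the sketch below. Everything in it is hypothetical:
CONFIG_REPLAY_SKETCH and all sketch_* names are made up for this example and
are not QEMU identifiers, and the scheduler stand-in simply runs the callback
immediately instead of using a real AioContext.

```c
#include <stdbool.h>

/* Hypothetical build-time switch; a real build would set it from ./configure. */
#ifndef CONFIG_REPLAY_SKETCH
#define CONFIG_REPLAY_SKETCH 0
#endif

typedef void SketchBHFunc(void *opaque);

/* Counts how often the plain (non-replay) scheduling path was taken. */
static int sketch_direct_calls;

/* Stand-in for the normal one-shot BH scheduler: runs the callback at once. */
static void sketch_schedule_oneshot(SketchBHFunc *cb, void *opaque)
{
    sketch_direct_calls++;
    cb(opaque);
}

#if CONFIG_REPLAY_SKETCH
/* Feature compiled in: check at run time whether recording is active. */
static bool sketch_replay_active;
static int sketch_logged_events;

static void sketch_replay_schedule(SketchBHFunc *cb, void *opaque)
{
    if (sketch_replay_active) {
        /* A real implementation would put the event into the replay log. */
        sketch_logged_events++;
    }
    sketch_schedule_oneshot(cb, opaque);
}
#else
/* Feature compiled out: the wrapper is a trivial static inline forwarder,
 * so the compiler can reduce it to a direct call on the fast path. */
static inline void sketch_replay_schedule(SketchBHFunc *cb, void *opaque)
{
    sketch_schedule_oneshot(cb, opaque);
}
#endif

/* Example callback: increments the int that opaque points to. */
static void sketch_count_cb(void *opaque)
{
    (*(int *)opaque)++;
}
```

A caller would write sketch_replay_schedule(sketch_count_cb, &counter) in
both configurations; only the body behind the wrapper changes with the
build option.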

Pavel Dovgalyuk

