> From: Kevin Wolf [mailto:kw...@redhat.com]
> On 21.02.2019 at 12:05, Pavel Dovgalyuk wrote:
> > Replay is capable of recording normal BH events, but sometimes
> > there are single use callbacks scheduled with aio_bh_schedule_oneshot
> > function. This patch enables recording and replaying such callbacks.
> > Block layer uses these events for calling the completion function.
> > Replaying these calls makes the execution deterministic.
> >
> > Signed-off-by: Pavel Dovgalyuk <pavel.dovga...@ispras.ru>
> >
> > --
> >
> > v6:
> >  - moved stub function to the separate file for fixing linux-user build
> > v10:
> >  - replaced all block layer aio_bh_schedule_oneshot calls
>
> This still doesn't catch all instances, e.g. everything that goes
> through aio_co_schedule() is missing.

It seems that everything else is synchronized with the blkreplay driver,
which is mandatory when using block devices in rr mode.

> But I fully expect this to get broken anyway all the time because nobody
> understands which function to use, and if it works for your special case
> now and we'll fix other stuff as you encounter it, maybe that's good
> enough for you.

This problem exists in every subsystem, and that is acceptable for now,
while record/replay is not yet mature and is unfamiliar to other
developers.

When virtual devices are updated, developers may miss a correct
loadvm/savevm implementation. For example, loading the audio device state
may shift the phase of the output signal. Nobody will notice that bug in
the migration process, but it shows up when we use record/replay.

We can't cover everything with record/replay tests. Most of the new bugs
show up in complex configurations after billions of executed
instructions. But once this feature is available out of the box, we'll at
least get more smoke testing.

> > @@ -1349,8 +1351,8 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
> >
> >      acb->has_returned = true;
> >      if (acb->rwco.ret != NOT_DONE) {
> > -        aio_bh_schedule_oneshot(blk_get_aio_context(blk),
> > -                                blk_aio_complete_bh, acb);
> > +        replay_bh_schedule_oneshot_event(blk_get_aio_context(blk),
> > +                                         blk_aio_complete_bh, acb);
> >      }
>
> This, and a few other places that you convert, are in fast paths and add
> some calls that are unnecessary for non-replay cases.

I don't think this can cause a noticeable slowdown, but we can run the
tests if you want. We have a test suite that performs disk-intensive
computation. It was created to measure the effect of running BH callbacks
through the virtual timer infrastructure.

> I wonder if we could make replay optional in ./configure and then make
> replay_bh_schedule_oneshot_event() a static inline function that can get
> optimised away at compile time if the feature is disabled.

Replay is coupled with icount. However, some icount calls also lie on the
fast paths and are completely useless when icount is not enabled.

Pavel Dovgalyuk
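
For illustration, the compile-time approach suggested above could look
roughly like the sketch below. It is not the code from the patch:
CONFIG_REPLAY_SKETCH, mock_bh_schedule_oneshot(), the BHFunc typedef and
the _sketch suffix are hypothetical stand-ins, and the AioContext
argument is omitted so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's bottom-half callback type. */
typedef void BHFunc(void *opaque);

/* Counts calls that go straight to the (mock) scheduler. */
static int direct_calls;

/* Stand-in for aio_bh_schedule_oneshot(); here it simply runs the callback. */
static void mock_bh_schedule_oneshot(BHFunc *cb, void *opaque)
{
    direct_calls++;
    cb(opaque);
}

#ifdef CONFIG_REPLAY_SKETCH /* hypothetical ./configure switch */
static bool replay_events_enabled;
static int replay_calls;

/* Feature built in: decide at run time whether to route through replay. */
static void replay_bh_schedule_oneshot_event_sketch(BHFunc *cb, void *opaque)
{
    if (replay_events_enabled) {
        replay_calls++; /* real code would add the event to the replay queue */
        cb(opaque);
    } else {
        mock_bh_schedule_oneshot(cb, opaque);
    }
}
#else
/* Feature compiled out: a trivial wrapper the compiler can inline away. */
static inline void replay_bh_schedule_oneshot_event_sketch(BHFunc *cb,
                                                           void *opaque)
{
    mock_bh_schedule_oneshot(cb, opaque);
}
#endif

/* Example completion callback: sets the flag passed in through opaque. */
static void mark_done(void *opaque)
{
    *(bool *)opaque = true;
}

/* Usage example: in both build variants the callback ends up being run. */
void sketch_demo(void)
{
    bool done = false;
    replay_bh_schedule_oneshot_event_sketch(mark_done, &done);
    assert(done);
}
```

With the switch undefined, call sites keep a single name while the
wrapper reduces to the direct scheduling call, so the fast path would pay
nothing for replay. Whether this is practical depends on how tightly
replay is tied to icount, as noted above.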