Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
On 03/12/2018 12:07 PM, Kevin Wolf wrote: > Am 12.03.2018 um 16:41 hat John Snow geschrieben: >> On 03/12/2018 11:28 AM, Kevin Wolf wrote: >>> Am 10.03.2018 um 09:27 hat John Snow geschrieben: Add a new state that specifically demarcates when we begin to permanently demolish a job after it has performed all work. This makes the transition explicit in the STM table and highlights conditions under which a job may be demolished. Alongside this state, add a new helper command "block_job_decommission", which transitions to the NULL state and puts down our implicit reference. This separates instances in the code for "block_job_unref" which merely undo a matching "block_job_ref" with instances intended to initiate the full destruction of the object. This decommission action also sets a number of fields to make sure that block internals or external users that are holding a reference to a job to see when it "finishes" are convinced that the job object is "done." This is necessary, for instance, to do a block_job_cancel_sync on a created object which will not make any progress. Now, all jobs must go through block_job_decommission prior to being freed, giving us start-to-finish state machine coverage for jobs. Transitions: Created -> Null: Early failure event before the job is started Concluded -> Null: Standard transition. Verbs: None. This should not ever be visible to the monitor. +-+ |UNDEFINED| +--+--+ | +--v+ +-+CREATED+--+ | +--++ | || | | +--v+ +--+ | +-+RUNNING<->PAUSED| | | +--+-+--+ +--+ | || | | || +--+ | ||| | | +--v--+ +---+ | | +-+READY<--->STANDBY| | | | +--+--+ +---+ | | ||| | +--v-+ +--v--+ | | |ABORTING+--->CONCLUDED<-+ | ++ +--+--+| | | +--v-+ | |NULL<-+ ++ Signed-off-by: John Snow >>> +static void block_job_decommission(BlockJob *job) +{ +assert(job); +job->completed = true; +job->busy = false; +job->paused = false; +job->deferred_to_main_loop = true; >>> >>> Why do we set all of these fields now? I don't see the use of it, and >>> overwriting fields here potentially makes debugging harder. >>> >>> Especially for deferred_to_main_loop I might expect an assert() that it >>> already is true, but shouldn't setting it always be done while actually >>> deferring to the main loop? >>> >>> Can we turn all of these assignments into asserts or are there some that >>> actually aren't already guaranteed, but that we want anyway? >>> +block_job_state_transition(job, BLOCK_JOB_STATUS_NULL); +block_job_unref(job); +} >>> >>> Kevin >>> >> >> Gonna be real honest; we probably only need to set maybe one field >> (job->completed = true) but it was late and I started hitting things >> with big hammers. >> >> The problem is that if jobs do not look "done" to functions like >> finish_sync, they will loop forever trying to make progress on a job >> that doesn't do anything. >> >> I set a bunch of fields here more as a semantic statement than a >> necessity, to be really really honest. ("Well, the job definitely has >> these properties if it made it here, so let's update these fields to be >> correct and the rest of the code will hopefully Do The Right Thing.") > > So essentially, we want this to be assert(), but currently that breaks > for some reasons and we can't figure out why before the freeze? > Nah, I knew exactly why it broke. > I guess that's fair enough, but then it would be good to use the freeze > period to find the offenders and actually turn it into assertions. > > Kevin > I appear to be horridly confused, and you haven't seen the intermediate mess that caused my confusion. A veritable maelstrom of confusion. Mr Babbage would not be able to rightly comprehend, &c. Let's give this another shot. I added that code at a time when my local branch was not calling block_job_completed, because I declared in v4's STM that a pre-created job "shall not pass go, and shall not collect $200" -- that CREATED jobs should either go to RUNNING or NULL. The discovery here is that directly decommissioning a created job actually breaks finish
Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
Am 12.03.2018 um 16:41 hat John Snow geschrieben: > On 03/12/2018 11:28 AM, Kevin Wolf wrote: > > Am 10.03.2018 um 09:27 hat John Snow geschrieben: > >> Add a new state that specifically demarcates when we begin to permanently > >> demolish a job after it has performed all work. This makes the transition > >> explicit in the STM table and highlights conditions under which a job may > >> be demolished. > >> > >> Alongside this state, add a new helper command "block_job_decommission", > >> which transitions to the NULL state and puts down our implicit reference. > >> This separates instances in the code for "block_job_unref" which merely > >> undo a matching "block_job_ref" with instances intended to initiate the > >> full destruction of the object. > >> > >> This decommission action also sets a number of fields to make sure that > >> block internals or external users that are holding a reference to a job > >> to see when it "finishes" are convinced that the job object is "done." > >> This is necessary, for instance, to do a block_job_cancel_sync on a > >> created object which will not make any progress. > >> > >> Now, all jobs must go through block_job_decommission prior to being > >> freed, giving us start-to-finish state machine coverage for jobs. > >> > >> > >> Transitions: > >> Created -> Null: Early failure event before the job is started > >> Concluded -> Null: Standard transition. > >> > >> Verbs: > >> None. This should not ever be visible to the monitor. > >> > >> +-+ > >> |UNDEFINED| > >> +--+--+ > >> | > >> +--v+ > >>+-+CREATED+--+ > >>| +--++ | > >>|| | > >>| +--v+ +--+ | > >>+-+RUNNING<->PAUSED| | > >>| +--+-+--+ +--+ | > >>|| | | > >>|| +--+ | > >>||| | > >>| +--v--+ +---+ | | > >>+-+READY<--->STANDBY| | | > >>| +--+--+ +---+ | | > >>||| | > >> +--v-+ +--v--+ | | > >> |ABORTING+--->CONCLUDED<-+ | > >> ++ +--+--+| > >> | | > >> +--v-+ | > >> |NULL<-+ > >> ++ > >> > >> Signed-off-by: John Snow > > > >> +static void block_job_decommission(BlockJob *job) > >> +{ > >> +assert(job); > >> +job->completed = true; > >> +job->busy = false; > >> +job->paused = false; > >> +job->deferred_to_main_loop = true; > > > > Why do we set all of these fields now? I don't see the use of it, and > > overwriting fields here potentially makes debugging harder. > > > > Especially for deferred_to_main_loop I might expect an assert() that it > > already is true, but shouldn't setting it always be done while actually > > deferring to the main loop? > > > > Can we turn all of these assignments into asserts or are there some that > > actually aren't already guaranteed, but that we want anyway? > > > >> +block_job_state_transition(job, BLOCK_JOB_STATUS_NULL); > >> +block_job_unref(job); > >> +} > > > > Kevin > > > > Gonna be real honest; we probably only need to set maybe one field > (job->completed = true) but it was late and I started hitting things > with big hammers. > > The problem is that if jobs do not look "done" to functions like > finish_sync, they will loop forever trying to make progress on a job > that doesn't do anything. > > I set a bunch of fields here more as a semantic statement than a > necessity, to be really really honest. ("Well, the job definitely has > these properties if it made it here, so let's update these fields to be > correct and the rest of the code will hopefully Do The Right Thing.") So essentially, we want this to be assert(), but currently that breaks for some reasons and we can't figure out why before the freeze? I guess that's fair enough, but then it would be good to use the freeze period to find the offenders and actually turn it into assertions. Kevin
Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
On 03/12/2018 11:28 AM, Kevin Wolf wrote: > Am 10.03.2018 um 09:27 hat John Snow geschrieben: >> Add a new state that specifically demarcates when we begin to permanently >> demolish a job after it has performed all work. This makes the transition >> explicit in the STM table and highlights conditions under which a job may >> be demolished. >> >> Alongside this state, add a new helper command "block_job_decommission", >> which transitions to the NULL state and puts down our implicit reference. >> This separates instances in the code for "block_job_unref" which merely >> undo a matching "block_job_ref" with instances intended to initiate the >> full destruction of the object. >> >> This decommission action also sets a number of fields to make sure that >> block internals or external users that are holding a reference to a job >> to see when it "finishes" are convinced that the job object is "done." >> This is necessary, for instance, to do a block_job_cancel_sync on a >> created object which will not make any progress. >> >> Now, all jobs must go through block_job_decommission prior to being >> freed, giving us start-to-finish state machine coverage for jobs. >> >> >> Transitions: >> Created -> Null: Early failure event before the job is started >> Concluded -> Null: Standard transition. >> >> Verbs: >> None. This should not ever be visible to the monitor. >> >> +-+ >> |UNDEFINED| >> +--+--+ >> | >> +--v+ >>+-+CREATED+--+ >>| +--++ | >>|| | >>| +--v+ +--+ | >>+-+RUNNING<->PAUSED| | >>| +--+-+--+ +--+ | >>|| | | >>|| +--+ | >>||| | >>| +--v--+ +---+ | | >>+-+READY<--->STANDBY| | | >>| +--+--+ +---+ | | >>||| | >> +--v-+ +--v--+ | | >> |ABORTING+--->CONCLUDED<-+ | >> ++ +--+--+| >> | | >> +--v-+ | >> |NULL<-+ >> ++ >> >> Signed-off-by: John Snow > >> +static void block_job_decommission(BlockJob *job) >> +{ >> +assert(job); >> +job->completed = true; >> +job->busy = false; >> +job->paused = false; >> +job->deferred_to_main_loop = true; > > Why do we set all of these fields now? I don't see the use of it, and > overwriting fields here potentially makes debugging harder. > > Especially for deferred_to_main_loop I might expect an assert() that it > already is true, but shouldn't setting it always be done while actually > deferring to the main loop? > > Can we turn all of these assignments into asserts or are there some that > actually aren't already guaranteed, but that we want anyway? > >> +block_job_state_transition(job, BLOCK_JOB_STATUS_NULL); >> +block_job_unref(job); >> +} > > Kevin > Gonna be real honest; we probably only need to set maybe one field (job->completed = true) but it was late and I started hitting things with big hammers. The problem is that if jobs do not look "done" to functions like finish_sync, they will loop forever trying to make progress on a job that doesn't do anything. I set a bunch of fields here more as a semantic statement than a necessity, to be really really honest. ("Well, the job definitely has these properties if it made it here, so let's update these fields to be correct and the rest of the code will hopefully Do The Right Thing.")
Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state
Am 10.03.2018 um 09:27 hat John Snow geschrieben: > Add a new state that specifically demarcates when we begin to permanently > demolish a job after it has performed all work. This makes the transition > explicit in the STM table and highlights conditions under which a job may > be demolished. > > Alongside this state, add a new helper command "block_job_decommission", > which transitions to the NULL state and puts down our implicit reference. > This separates instances in the code for "block_job_unref" which merely > undo a matching "block_job_ref" with instances intended to initiate the > full destruction of the object. > > This decommission action also sets a number of fields to make sure that > block internals or external users that are holding a reference to a job > to see when it "finishes" are convinced that the job object is "done." > This is necessary, for instance, to do a block_job_cancel_sync on a > created object which will not make any progress. > > Now, all jobs must go through block_job_decommission prior to being > freed, giving us start-to-finish state machine coverage for jobs. > > > Transitions: > Created -> Null: Early failure event before the job is started > Concluded -> Null: Standard transition. > > Verbs: > None. This should not ever be visible to the monitor. > > +-+ > |UNDEFINED| > +--+--+ > | > +--v+ >+-+CREATED+--+ >| +--++ | >|| | >| +--v+ +--+ | >+-+RUNNING<->PAUSED| | >| +--+-+--+ +--+ | >|| | | >|| +--+ | >||| | >| +--v--+ +---+ | | >+-+READY<--->STANDBY| | | >| +--+--+ +---+ | | >||| | > +--v-+ +--v--+ | | > |ABORTING+--->CONCLUDED<-+ | > ++ +--+--+| > | | > +--v-+ | > |NULL<-+ > ++ > > Signed-off-by: John Snow > +static void block_job_decommission(BlockJob *job) > +{ > +assert(job); > +job->completed = true; > +job->busy = false; > +job->paused = false; > +job->deferred_to_main_loop = true; Why do we set all of these fields now? I don't see the use of it, and overwriting fields here potentially makes debugging harder. Especially for deferred_to_main_loop I might expect an assert() that it already is true, but shouldn't setting it always be done while actually deferring to the main loop? Can we turn all of these assignments into asserts or are there some that actually aren't already guaranteed, but that we want anyway? > +block_job_state_transition(job, BLOCK_JOB_STATUS_NULL); > +block_job_unref(job); > +} Kevin