Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state

2018-03-12 Thread John Snow


On 03/12/2018 12:07 PM, Kevin Wolf wrote:
> On 12.03.2018 at 16:41, John Snow wrote:
>> On 03/12/2018 11:28 AM, Kevin Wolf wrote:
>>> On 10.03.2018 at 09:27, John Snow wrote:
>>>> Add a new state that specifically demarcates when we begin to permanently
>>>> demolish a job after it has performed all work. This makes the transition
>>>> explicit in the STM table and highlights conditions under which a job may
>>>> be demolished.
>>>>
>>>> Alongside this state, add a new helper command "block_job_decommission",
>>>> which transitions to the NULL state and puts down our implicit reference.
>>>> This separates instances in the code for "block_job_unref" which merely
>>>> undo a matching "block_job_ref" with instances intended to initiate the
>>>> full destruction of the object.
>>>>
>>>> This decommission action also sets a number of fields to make sure that
>>>> block internals or external users that are holding a reference to a job
>>>> to see when it "finishes" are convinced that the job object is "done."
>>>> This is necessary, for instance, to do a block_job_cancel_sync on a
>>>> created object which will not make any progress.
>>>>
>>>> Now, all jobs must go through block_job_decommission prior to being
>>>> freed, giving us start-to-finish state machine coverage for jobs.
>>>>
>>>>
>>>> Transitions:
>>>> Created   -> Null: Early failure event before the job is started
>>>> Concluded -> Null: Standard transition.
>>>>
>>>> Verbs:
>>>> None. This should not ever be visible to the monitor.
>>>>
>>>>              +---------+
>>>>              |UNDEFINED|
>>>>              +--+------+
>>>>                 |
>>>>              +--v----+
>>>>    +---------+CREATED+------------------+
>>>>    |         +--+----+                  |
>>>>    |            |                       |
>>>>    |         +--v----+     +------+     |
>>>>    +---------+RUNNING<----->PAUSED|     |
>>>>    |         +--+-+--+     +------+     |
>>>>    |            | |                     |
>>>>    |            | +------------------+  |
>>>>    |            |                    |  |
>>>>    |         +--v--+       +-------+ |  |
>>>>    +---------+READY<------->STANDBY| |  |
>>>>    |         +--+--+       +-------+ |  |
>>>>    |            |                    |  |
>>>> +--v-----+   +--v------+             |  |
>>>> |ABORTING+--->CONCLUDED<-------------+  |
>>>> +--------+   +--+------+                |
>>>>                 |                       |
>>>>              +--v-+                     |
>>>>              |NULL<---------------------+
>>>>              +----+
>>>>
>>>> Signed-off-by: John Snow
>>>
>>>> +static void block_job_decommission(BlockJob *job)
>>>> +{
>>>> +    assert(job);
>>>> +    job->completed = true;
>>>> +    job->busy = false;
>>>> +    job->paused = false;
>>>> +    job->deferred_to_main_loop = true;
>>>
>>> Why do we set all of these fields now? I don't see the use of it, and
>>> overwriting fields here potentially makes debugging harder.
>>>
>>> Especially for deferred_to_main_loop I might expect an assert() that it
>>> already is true, but shouldn't setting it always be done while actually
>>> deferring to the main loop?
>>>
>>> Can we turn all of these assignments into asserts or are there some that
>>> actually aren't already guaranteed, but that we want anyway?
>>>
>>>> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
>>>> +    block_job_unref(job);
>>>> +}
>>>
>>> Kevin
>>>
>>
>> Gonna be real honest; we probably only need to set maybe one field
>> (job->completed = true) but it was late and I started hitting things
>> with big hammers.
>>
>> The problem is that if jobs do not look "done" to functions like
>> finish_sync, they will loop forever trying to make progress on a job
>> that doesn't do anything.
>>
>> I set a bunch of fields here more as a semantic statement than a
>> necessity, to be really really honest. ("Well, the job definitely has
>> these properties if it made it here, so let's update these fields to be
>> correct and the rest of the code will hopefully Do The Right Thing.")
> 
> So essentially, we want this to be assert(), but currently that breaks
> for some reasons and we can't figure out why before the freeze?
> 

Nah, I knew exactly why it broke.

> I guess that's fair enough, but then it would be good to use the freeze
> period to find the offenders and actually turn it into assertions.
> 
> Kevin
> 

I appear to be horridly confused, and you haven't seen the intermediate
mess that caused my confusion. A veritable maelstrom of confusion. Mr
Babbage would not be able to rightly comprehend, &c.

Let's give this another shot.

I added that code at a time when my local branch was not calling
block_job_completed, because I declared in v4's STM that a pre-created
job "shall not pass go, and shall not collect $200" -- that CREATED jobs
should either go to RUNNING or NULL.
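
A rule like "CREATED may only move to certain states" can be written as a
table-driven check. What follows is a minimal sketch, not the code from the
patch: the enum names, the `allowed` table and both helpers are hypothetical,
and the edges are simply read off the v5 diagram quoted in this thread (which
also lets CREATED reach ABORTING, unlike the stricter v4 rule described
here).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state names mirroring the diagram; not QEMU's real enum. */
typedef enum {
    JOB_UNDEFINED, JOB_CREATED, JOB_RUNNING, JOB_PAUSED, JOB_READY,
    JOB_STANDBY, JOB_ABORTING, JOB_CONCLUDED, JOB_NULL, JOB_STATE_MAX
} JobState;

/* allowed[from][to] is true when the diagram draws an edge from -> to. */
static const bool allowed[JOB_STATE_MAX][JOB_STATE_MAX] = {
    [JOB_UNDEFINED] = { [JOB_CREATED] = true },
    [JOB_CREATED]   = { [JOB_RUNNING] = true, [JOB_ABORTING] = true,
                        [JOB_NULL] = true },
    [JOB_RUNNING]   = { [JOB_PAUSED] = true, [JOB_READY] = true,
                        [JOB_ABORTING] = true, [JOB_CONCLUDED] = true },
    [JOB_PAUSED]    = { [JOB_RUNNING] = true },
    [JOB_READY]     = { [JOB_STANDBY] = true, [JOB_ABORTING] = true,
                        [JOB_CONCLUDED] = true },
    [JOB_STANDBY]   = { [JOB_READY] = true },
    [JOB_ABORTING]  = { [JOB_CONCLUDED] = true },
    [JOB_CONCLUDED] = { [JOB_NULL] = true },
    /* JOB_NULL has no outgoing edges: it is terminal. */
};

/* Pure lookup: is the move from -> to drawn in the diagram? */
static bool job_transition_ok(JobState from, JobState to)
{
    return allowed[from][to];
}

/* Change state, aborting the program on a move the table does not allow. */
static void job_set_state(JobState *state, JobState to)
{
    assert(job_transition_ok(*state, to));
    *state = to;
}
```

With such a table, job_transition_ok(JOB_CREATED, JOB_NULL) is true while
job_transition_ok(JOB_RUNNING, JOB_NULL) is false, so a running job has to
pass through CONCLUDED before it can be torn down.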

The discovery here is that directly decommissioning a created job
actually breaks finish

Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state

2018-03-12 Thread Kevin Wolf
On 12.03.2018 at 16:41, John Snow wrote:
> On 03/12/2018 11:28 AM, Kevin Wolf wrote:
> > On 10.03.2018 at 09:27, John Snow wrote:
> >> Add a new state that specifically demarcates when we begin to permanently
> >> demolish a job after it has performed all work. This makes the transition
> >> explicit in the STM table and highlights conditions under which a job may
> >> be demolished.
> >>
> >> Alongside this state, add a new helper command "block_job_decommission",
> >> which transitions to the NULL state and puts down our implicit reference.
> >> This separates instances in the code for "block_job_unref" which merely
> >> undo a matching "block_job_ref" with instances intended to initiate the
> >> full destruction of the object.
> >>
> >> This decommission action also sets a number of fields to make sure that
> >> block internals or external users that are holding a reference to a job
> >> to see when it "finishes" are convinced that the job object is "done."
> >> This is necessary, for instance, to do a block_job_cancel_sync on a
> >> created object which will not make any progress.
> >>
> >> Now, all jobs must go through block_job_decommission prior to being
> >> freed, giving us start-to-finish state machine coverage for jobs.
> >>
> >>
> >> Transitions:
> >> Created   -> Null: Early failure event before the job is started
> >> Concluded -> Null: Standard transition.
> >>
> >> Verbs:
> >> None. This should not ever be visible to the monitor.
> >>
> >>              +---------+
> >>              |UNDEFINED|
> >>              +--+------+
> >>                 |
> >>              +--v----+
> >>    +---------+CREATED+------------------+
> >>    |         +--+----+                  |
> >>    |            |                       |
> >>    |         +--v----+     +------+     |
> >>    +---------+RUNNING<----->PAUSED|     |
> >>    |         +--+-+--+     +------+     |
> >>    |            | |                     |
> >>    |            | +------------------+  |
> >>    |            |                    |  |
> >>    |         +--v--+       +-------+ |  |
> >>    +---------+READY<------->STANDBY| |  |
> >>    |         +--+--+       +-------+ |  |
> >>    |            |                    |  |
> >> +--v-----+   +--v------+             |  |
> >> |ABORTING+--->CONCLUDED<-------------+  |
> >> +--------+   +--+------+                |
> >>                 |                       |
> >>              +--v-+                     |
> >>              |NULL<---------------------+
> >>              +----+
> >>
> >> Signed-off-by: John Snow 
> > 
> >> +static void block_job_decommission(BlockJob *job)
> >> +{
> >> +    assert(job);
> >> +    job->completed = true;
> >> +    job->busy = false;
> >> +    job->paused = false;
> >> +    job->deferred_to_main_loop = true;
> > 
> > Why do we set all of these fields now? I don't see the use of it, and
> > overwriting fields here potentially makes debugging harder.
> > 
> > Especially for deferred_to_main_loop I might expect an assert() that it
> > already is true, but shouldn't setting it always be done while actually
> > deferring to the main loop?
> > 
> > Can we turn all of these assignments into asserts or are there some that
> > actually aren't already guaranteed, but that we want anyway?
> > 
> >> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
> >> +    block_job_unref(job);
> >> +}
> > 
> > Kevin
> > 
> 
> Gonna be real honest; we probably only need to set maybe one field
> (job->completed = true) but it was late and I started hitting things
> with big hammers.
> 
> The problem is that if jobs do not look "done" to functions like
> finish_sync, they will loop forever trying to make progress on a job
> that doesn't do anything.
> 
> I set a bunch of fields here more as a semantic statement than a
> necessity, to be really really honest. ("Well, the job definitely has
> these properties if it made it here, so let's update these fields to be
> correct and the rest of the code will hopefully Do The Right Thing.")

So essentially, we want this to be assert(), but currently that breaks
for some reasons and we can't figure out why before the freeze?

I guess that's fair enough, but then it would be good to use the freeze
period to find the offenders and actually turn it into assertions.

Kevin



Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state

2018-03-12 Thread John Snow


On 03/12/2018 11:28 AM, Kevin Wolf wrote:
> On 10.03.2018 at 09:27, John Snow wrote:
>> Add a new state that specifically demarcates when we begin to permanently
>> demolish a job after it has performed all work. This makes the transition
>> explicit in the STM table and highlights conditions under which a job may
>> be demolished.
>>
>> Alongside this state, add a new helper command "block_job_decommission",
>> which transitions to the NULL state and puts down our implicit reference.
>> This separates instances in the code for "block_job_unref" which merely
>> undo a matching "block_job_ref" with instances intended to initiate the
>> full destruction of the object.
>>
>> This decommission action also sets a number of fields to make sure that
>> block internals or external users that are holding a reference to a job
>> to see when it "finishes" are convinced that the job object is "done."
>> This is necessary, for instance, to do a block_job_cancel_sync on a
>> created object which will not make any progress.
>>
>> Now, all jobs must go through block_job_decommission prior to being
>> freed, giving us start-to-finish state machine coverage for jobs.
>>
>>
>> Transitions:
>> Created   -> Null: Early failure event before the job is started
>> Concluded -> Null: Standard transition.
>>
>> Verbs:
>> None. This should not ever be visible to the monitor.
>>
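>>              +---------+
>>              |UNDEFINED|
>>              +--+------+
>>                 |
>>              +--v----+
>>    +---------+CREATED+------------------+
>>    |         +--+----+                  |
>>    |            |                       |
>>    |         +--v----+     +------+     |
>>    +---------+RUNNING<----->PAUSED|     |
>>    |         +--+-+--+     +------+     |
>>    |            | |                     |
>>    |            | +------------------+  |
>>    |            |                    |  |
>>    |         +--v--+       +-------+ |  |
>>    +---------+READY<------->STANDBY| |  |
>>    |         +--+--+       +-------+ |  |
>>    |            |                    |  |
>> +--v-----+   +--v------+             |  |
>> |ABORTING+--->CONCLUDED<-------------+  |
>> +--------+   +--+------+                |
>>                 |                       |
>>              +--v-+                     |
>>              |NULL<---------------------+
>>              +----+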
>>
>> Signed-off-by: John Snow 
> 
>> +static void block_job_decommission(BlockJob *job)
>> +{
>> +    assert(job);
>> +    job->completed = true;
>> +    job->busy = false;
>> +    job->paused = false;
>> +    job->deferred_to_main_loop = true;
> 
> Why do we set all of these fields now? I don't see the use of it, and
> overwriting fields here potentially makes debugging harder.
> 
> Especially for deferred_to_main_loop I might expect an assert() that it
> already is true, but shouldn't setting it always be done while actually
> deferring to the main loop?
> 
> Can we turn all of these assignments into asserts or are there some that
> actually aren't already guaranteed, but that we want anyway?
> 
>> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
>> +    block_job_unref(job);
>> +}
> 
> Kevin
> 

Gonna be real honest; we probably only need to set maybe one field
(job->completed = true) but it was late and I started hitting things
with big hammers.

The problem is that if jobs do not look "done" to functions like
finish_sync, they will loop forever trying to make progress on a job
that doesn't do anything.

I set a bunch of fields here more as a semantic statement than a
necessity, to be really really honest. ("Well, the job definitely has
these properties if it made it here, so let's update these fields to be
correct and the rest of the code will hopefully Do The Right Thing.")
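
The "loop forever" problem can be illustrated with a toy model. This is a
sketch under stated assumptions, not QEMU code: ToyJob, toy_poll,
toy_finish_sync and toy_decommission are hypothetical names, and the
max_polls bound exists only so that the demonstration terminates where a
real wait loop would spin indefinitely.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block job; only the fields the demo needs. */
typedef struct ToyJob {
    bool started;     /* false for a job that was created but never run */
    bool completed;   /* what the waiter looks at to decide it is done */
    int  work_left;   /* remaining units of work */
} ToyJob;

/* One polling step: only a started job makes any progress. */
static void toy_poll(ToyJob *job)
{
    if (job->started && job->work_left > 0 && --job->work_left == 0) {
        job->completed = true;
    }
}

/*
 * Wait until the job looks done. Returns the number of polls used, or -1
 * if the job still did not look done after max_polls steps (the case that
 * would be an endless loop without the bound).
 */
static int toy_finish_sync(ToyJob *job, int max_polls)
{
    int polls = 0;

    while (!job->completed) {
        if (polls == max_polls) {
            return -1;
        }
        toy_poll(job);
        polls++;
    }
    return polls;
}

/* Mark the job as done so that waiters stop polling it. */
static void toy_decommission(ToyJob *job)
{
    job->completed = true;
}
```

A started job with three units of work finishes after three polls. A job
that was never started exhausts any bound and yields -1, until
toy_decommission marks it completed, after which the wait returns at once.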



Re: [Qemu-devel] [PATCH v5 10/21] blockjobs: add NULL state

2018-03-12 Thread Kevin Wolf
On 10.03.2018 at 09:27, John Snow wrote:
> Add a new state that specifically demarcates when we begin to permanently
> demolish a job after it has performed all work. This makes the transition
> explicit in the STM table and highlights conditions under which a job may
> be demolished.
> 
> Alongside this state, add a new helper command "block_job_decommission",
> which transitions to the NULL state and puts down our implicit reference.
> This separates instances in the code for "block_job_unref" which merely
> undo a matching "block_job_ref" with instances intended to initiate the
> full destruction of the object.
> 
> This decommission action also sets a number of fields to make sure that
> block internals or external users that are holding a reference to a job
> to see when it "finishes" are convinced that the job object is "done."
> This is necessary, for instance, to do a block_job_cancel_sync on a
> created object which will not make any progress.
> 
> Now, all jobs must go through block_job_decommission prior to being
> freed, giving us start-to-finish state machine coverage for jobs.
> 
> 
> Transitions:
> Created   -> Null: Early failure event before the job is started
> Concluded -> Null: Standard transition.
> 
> Verbs:
> None. This should not ever be visible to the monitor.
> 
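>              +---------+
>              |UNDEFINED|
>              +--+------+
>                 |
>              +--v----+
>    +---------+CREATED+------------------+
>    |         +--+----+                  |
>    |            |                       |
>    |         +--v----+     +------+     |
>    +---------+RUNNING<----->PAUSED|     |
>    |         +--+-+--+     +------+     |
>    |            | |                     |
>    |            | +------------------+  |
>    |            |                    |  |
>    |         +--v--+       +-------+ |  |
>    +---------+READY<------->STANDBY| |  |
>    |         +--+--+       +-------+ |  |
>    |            |                    |  |
> +--v-----+   +--v------+             |  |
> |ABORTING+--->CONCLUDED<-------------+  |
> +--------+   +--+------+                |
>                 |                       |
>              +--v-+                     |
>              |NULL<---------------------+
>              +----+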
> 
> Signed-off-by: John Snow 

> +static void block_job_decommission(BlockJob *job)
> +{
> +    assert(job);
> +    job->completed = true;
> +    job->busy = false;
> +    job->paused = false;
> +    job->deferred_to_main_loop = true;

Why do we set all of these fields now? I don't see the use of it, and
overwriting fields here potentially makes debugging harder.

Especially for deferred_to_main_loop I might expect an assert() that it
already is true, but shouldn't setting it always be done while actually
deferring to the main loop?

Can we turn all of these assignments into asserts or are there some that
actually aren't already guaranteed, but that we want anyway?

> +    block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
> +    block_job_unref(job);
> +}

Kevin
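
For illustration, the assertion-only variant asked about above might look
roughly like the following. This is a self-contained sketch, not a proposed
patch: SketchJob, its status constants and the plain refcnt decrement are
hypothetical stand-ins for BlockJob, block_job_state_transition and
block_job_unref, and whether each assertion really holds on every path into
the function is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values; not QEMU's BLOCK_JOB_STATUS_* constants. */
enum { SKETCH_STATUS_CONCLUDED = 1, SKETCH_STATUS_NULL = 2 };

/* Hypothetical stand-in for BlockJob with only the fields discussed. */
typedef struct SketchJob {
    bool completed;
    bool busy;
    bool paused;
    bool deferred_to_main_loop;
    int  refcnt;
    int  status;
} SketchJob;

/*
 * Decommission that checks its preconditions instead of overwriting the
 * fields: a caller that arrives in an unexpected state trips an assertion
 * rather than having the evidence silently erased.
 */
static void sketch_decommission_checked(SketchJob *job)
{
    assert(job);
    assert(job->completed);
    assert(!job->busy);
    assert(!job->paused);
    assert(job->deferred_to_main_loop);

    job->status = SKETCH_STATUS_NULL;   /* stands in for the state change */
    job->refcnt--;                      /* stands in for dropping the ref */
}
```

Note that assert() compiles to nothing when NDEBUG is defined, so checks of
this kind document and test invariants but do not enforce them in such
builds.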