Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Dario Faggioli
On Fri, 2016-06-24 at 13:25 +, Wu, Feng wrote:
> > 
> > Then, in this case, the reason why we are sure that all the pcpus are
> > executing the body of the tasklet is indeed the structure of
> > stop_machine_run() and stopmachine_action() themselves, which are
> > built to make sure of that,
> Thanks for the reply; I am sorry, but I don't quite understand the
> above comment. In my understanding, the tasklet has higher priority, so
> stopmachine_action(), as the body of the tasklet, preempts vCPU3.
> Is this the case?
> 
It is the case. What I was trying to say is that, even if tasklets did
not have higher priority than regular vCPUs (as long as there were some
mechanism ensuring that tasklets themselves get run and do not starve),
things would still work.

It's not that important; it just seemed you wanted to summarize in
order to better understand the situation, and I thought it was worth
pointing this out.

Just that. :-)

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Wu, Feng
> > Thanks for your reply. Yes, I think this is the point. Here the
> > descheduling of vCPU3 happens, and the reason we will choose the
> > tasklet as the next running unit for sure (not another vCPU or vCPU3
> > itself) is that the tasklet overrides all other choices, as stated
> > in csched_schedule() below, right?
> >
> Exactly, tasklets preempt the running vcpu and take precedence over
> runnable vcpus.
> 
> Then, in this case, the reason why we are sure that all the pcpus are
> executing the body of the tasklet is indeed the structure of
> stop_machine_run() and stopmachine_action() themselves, which are built
> to make sure of that,

Thanks for the reply; I am sorry, but I don't quite understand the
above comment. In my understanding, the tasklet has higher priority, so
stopmachine_action(), as the body of the tasklet, preempts vCPU3.
Is this the case?

Thanks,
Feng

> rather than just the fact that tasklets are
> higher priority.
> 
> But yes, this is what happens.
> 
Thanks for the clarification!

Thanks,
Feng



Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Dario Faggioli
On Fri, 2016-06-24 at 07:59 +, Wu, Feng wrote:
> > -Original Message-
> > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > So, vCPU 3 was running, but then someone called stop_machine_run(),
> > which causes the descheduling of vCPU 3, and the execution of the
> > stopmachine tasklet.
> Thanks for your reply. Yes, I think this is the point. Here the
> descheduling of vCPU3 happens, and the reason we will choose the
> tasklet as the next running unit for sure (not another vCPU or vCPU3
> itself) is that the tasklet overrides all other choices, as stated in
> csched_schedule() below, right?
> 
Exactly, tasklets preempt the running vcpu and take precedence over
runnable vcpus.

Then, in this case, the reason why we are sure that all the pcpus are
executing the body of the tasklet is indeed the structure of
stop_machine_run() and stopmachine_action() themselves, which are built
to make sure of that, rather than just the fact that tasklets are
higher priority.

But yes, this is what happens.
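
For completeness, this is (roughly) how schedule() computes the
tasklet_work_scheduled flag it hands down to csched_schedule(). This is
a from-memory sketch, so do double check xen/common/schedule.c:

static void schedule(void)
{
    unsigned long *tasklet_work = &this_cpu(tasklet_work_to_do);
    bool_t tasklet_work_scheduled = 0;
    ...
    /* Update tasklet scheduling status. */
    switch ( *tasklet_work )
    {
    case TASKLET_enqueued:
        set_bit(_TASKLET_scheduled, tasklet_work);
        /* fall through: work is queued, so tell the scheduler */
    case TASKLET_enqueued|TASKLET_scheduled:
        tasklet_work_scheduled = 1;
        break;
    case TASKLET_scheduled:
        clear_bit(_TASKLET_scheduled, tasklet_work);
    case 0:
        tasklet_work_scheduled = 0;
        break;
    default:
        BUG();
    }
    ...
    next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
    ...
}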

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Friday, June 24, 2016 3:23 PM
> To: Wu, Feng <feng...@intel.com>; Jan Beulich <jbeul...@suse.com>
> Cc: Tian, Kevin <kevin.t...@intel.com>; k...@xen.org;
> george.dun...@eu.citrix.com; andrew.coop...@citrix.com; xen-
> de...@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and
> per-cpu blocking list
> 
> On Fri, 2016-06-24 at 06:11 +, Wu, Feng wrote:
> > > -Original Message-
> > > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > > No, because we call cpu_disable_scheduler() from __cpu_disable()
> > > only when system state is SYS_STATE_suspend already, and hence we
> > > take the 'then' branch of the 'if', which never returns an error.
> > Thanks for the elaboration. I find that __cpu_disable() can be called
> > with the system state not being SYS_STATE_suspend. Here is my experiment:
> >
> > 1. Launch a guest and pin vCPU 3 to pCPU 3 (which makes the
> > experiment simpler)
> > 2. offline pCPU 3 via "xen-hptool cpu-offline 3"
> >
> Ah, yes, of course. I should have included cpu-hot(un)plug in my
> analysis in the first place, sorry for not doing so.
> 
> > The call path of the above steps is:
> > arch_do_sysctl()
> >   -> cpu_down_helper()
> > -> cpu_down()
> >   -> take_cpu_down()
> > -> __cpu_disable()
> >   -> cpu_disable_scheduler() (enter the 'else'  part)
> >
> Right, and the important part is this one:
> 
> > int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
> > {
> >     ...
> >
> >     /* point 1 */
> >
> >     for_each_cpu ( i, &allbutself )
> >         tasklet_schedule_on_cpu(&per_cpu(stopmachine_tasklet, i), i);
> >
> >     /* point 2 */
> >     ...
> > }
> >
> > at point 1 above,
> > vCPU3->runstate.state: 0, vCPU3->is_running: 1
> > while at point 2 above:
> > vCPU3->runstate.state: 1, vCPU3->is_running: 0
> >
> This is exactly as you describe: cpu hotplug is done in stop-machine
> context. Check the comment close to the definition of stop_machine_run():
> 
> /**
>  * stop_machine_run: freeze the machine on all CPUs and run this function
>  * @fn: the function to run
>  * @data: the data ptr for the @fn()
>  * @cpu: the cpu to run @fn() on (or all, if @cpu == NR_CPUS).
>  *
>  * Description: This causes every other cpu to enter a safe point, with
>  * each of which disables interrupts, and finally interrupts are disabled
>  * on the current CPU.  The result is that none is holding a spinlock
>  * or inside any other preempt-disabled region when @fn() runs.
>  *
>  * This can be thought of as a very heavy write lock, equivalent to
>  * grabbing every spinlock in the kernel. */
> 
> As you discovered yourself, this is achieved by forcing the execution
> of a tasklet on all the pcpus, which includes pCPU 3 of your example.
> 
> So, vCPU 3 was running, but then someone called stop_machine_run(),
> which causes the descheduling of vCPU 3, and the execution of the
> stopmachine tasklet.

Thanks for your reply. Yes, I think this is the point. Here the
descheduling of vCPU3 happens, and the reason we will choose the
tasklet as the next running unit for sure (not another vCPU or vCPU3
itself) is that the tasklet overrides all other choices, as stated in
csched_schedule() below, right?

static struct task_slice
csched_schedule(
    const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
{
    ...

    snext = __runq_elem(runq->next);
    ret.migrated = 0;

    /* Tasklet work (which runs in idle VCPU context) overrides all else. */
    if ( tasklet_work_scheduled )
    {
        TRACE_0D(TRC_CSCHED_SCHED_TASKLET);
        snext = CSCHED_VCPU(idle_vcpu[cpu]);
        snext->pri = CSCHED_PRI_TS_BOOST;
    }

    ...
}

Thanks,
Feng


Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Dario Faggioli
On Fri, 2016-06-24 at 06:11 +, Wu, Feng wrote:
> > -Original Message-
> > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > No, because we call cpu_disable_scheduler() from __cpu_disable()
> > only when system state is SYS_STATE_suspend already, and hence we
> > take the 'then' branch of the 'if', which never returns an error.
> Thanks for the elaboration. I find that __cpu_disable() can be called
> with the system state not being SYS_STATE_suspend. Here is my experiment:
> 
> 1. Launch a guest and pin vCPU 3 to pCPU 3 (which makes the
> experiment simpler)
> 2. offline pCPU 3 via "xen-hptool cpu-offline 3"
> 
Ah, yes, of course. I should have included cpu-hot(un)plug in my
analysis in the first place, sorry for not doing so.

> The call path of the above steps is:
> arch_do_sysctl()
>   -> cpu_down_helper()
> -> cpu_down()
>   -> take_cpu_down()
> -> __cpu_disable()
>   -> cpu_disable_scheduler() (enter the 'else'  part)
> 
Right, and the important part is this one:

> int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
> {
>     ...
>
>     /* point 1 */
>
>     for_each_cpu ( i, &allbutself )
>         tasklet_schedule_on_cpu(&per_cpu(stopmachine_tasklet, i), i);
>
>     /* point 2 */
>     ...
> }
> 
> at point 1 above, 
> vCPU3->runstate.state: 0, vCPU3->is_running: 1
> while at point 2 above:
> vCPU3->runstate.state: 1, vCPU3->is_running: 0
> 
This is exactly as you describe: cpu hotplug is done in stop-machine
context. Check the comment close to the definition of stop_machine_run():

/**
 * stop_machine_run: freeze the machine on all CPUs and run this function
 * @fn: the function to run
 * @data: the data ptr for the @fn()
 * @cpu: the cpu to run @fn() on (or all, if @cpu == NR_CPUS).
 *
 * Description: This causes every other cpu to enter a safe point, with
 * each of which disables interrupts, and finally interrupts are disabled
 * on the current CPU.  The result is that none is holding a spinlock
 * or inside any other preempt-disabled region when @fn() runs.
 *
 * This can be thought of as a very heavy write lock, equivalent to
 * grabbing every spinlock in the kernel. */

As you discovered yourself, this is achieved by forcing the execution
of a tasklet on all the pcpus, which includes pCPU 3 of your example.

So, vCPU 3 was running, but then someone called stop_machine_run(),
which causes the descheduling of vCPU 3, and the execution of the
stopmachine tasklet.

That's why you find is_running to be 0, and that's why we never return
-EAGAIN.
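
For the details, here is a simplified sketch of the rendezvous the
stopmachine tasklet implements (abridged and from memory, so take the
exact state names and bookkeeping with a grain of salt; see
xen/common/stop_machine.c for the real thing):

static void stopmachine_action(void *data)
{
    enum stopmachine_state state = STOPMACHINE_START;

    while ( state != STOPMACHINE_EXIT )
    {
        /* Wait for the initiating cpu to advance the global state. */
        while ( stopmachine_data.state == state )
            cpu_relax();
        state = stopmachine_data.state;

        switch ( state )
        {
        case STOPMACHINE_DISABLE_IRQ:
            /* The "safe point": no interrupts on this cpu. */
            local_irq_disable();
            break;
        case STOPMACHINE_INVOKE:
            /* Only the designated cpu actually runs fn(). */
            if ( stopmachine_data.fn_cpu == smp_processor_id() )
                stopmachine_data.fn_result =
                    stopmachine_data.fn(stopmachine_data.fn_data);
            break;
        default:
            break;
        }

        /* Tell the initiator this cpu has reached the new state. */
        atomic_inc(&stopmachine_data.done);
    }

    local_irq_enable();
}

The key point for your experiment is not the code itself, but that this
body runs in tasklet (i.e., idle vCPU) context: by the time it executes
on pCPU 3, vCPU 3 has necessarily been descheduled.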

> I tested it many times and got the same result. I am not sure whether
> the vcpu state transition just happens to occur here by chance or not.
> If the transition doesn't happen and is_running is still 1 when we get
> to vcpu_migrate() in cpu_disable_scheduler() in the above case, would
> it be a problem?
>
I'm not sure what you mean here (in particular with "just happen to
occur"). If you're wondering whether the fact that vCPU 3 gets
descheduled happens by chance or by design, it's indeed the latter, and
so, no, we don't have a problem with this code path.

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-24 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Thursday, June 23, 2016 11:12 PM
> To: Wu, Feng <feng...@intel.com>; Jan Beulich <jbeul...@suse.com>
> Cc: Tian, Kevin <kevin.t...@intel.com>; k...@xen.org;
> george.dun...@eu.citrix.com; andrew.coop...@citrix.com; xen-
> de...@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and
> per-cpu blocking list
> 
> On Thu, 2016-06-23 at 12:33 +, Wu, Feng wrote:
> > > -Original Message-
> > > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > >
> > > It goes through all the vcpus of all domains, and does not check or
> > > care whether they are running, runnable or blocked.
> > >
> > > Let's look at this in some more details. So, let's assume that
> > > processor 5 is going away, and that you have the following vcpus
> > > around:
> > >
> > >  d0v0 : v->processor = 5, running on cpu 5
> > >  d0v1 : v->processor = 4, running on cpu 4
> > >  d1v0 : v->processor = 5, runnable but not running
> > >  d2v3 : v->processor = 5, blocked
> > >
> > > for d0v0, we do:
> > >   cpu_disable_scheduler(5)
> > >     set_bit(_VPF_migrating, &d0v0->pause_flags);
> > >     vcpu_sleep_nosync(d0v0);
> > >       SCHED_OP(sleep, d0v0);
> > >         csched_vcpu_sleep(d0v0)
> > >           cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
> > >     vcpu_migrate(d0v0);
> > >       if ( v->is_running || ...) // assume v->is_running is true
> > >         return
> > Hi Dario, after reading this mail again, I have another question;
> > could you please help me out?
> >
> > In the above code flow, we return from vcpu_migrate(d0v0) because
> > v->is_running == 1; after vcpu_migrate() returns, we check:
> >
> > if ( v->processor == cpu )
> >     ret = -EAGAIN;
> >
> > In my understanding, in the above case 'v->processor' is likely equal
> > to 'cpu', hence we return -EAGAIN. However, in __cpu_disable(), there
> > is a check as below:
> >
> > if ( cpu_disable_scheduler(cpu) )
> >     BUG();
> >
> Right. But, as the comment inside cpu_disable_scheduler() itself says,
> we only return -EAGAIN in case we are calling cpu_disable_scheduler()
> for removing a pCPU from a cpupool.
> 
> In that case, we do not use __cpu_disable(), and hence we can safely
> return an error value. In that case, in fact, the caller of
> cpu_disable_scheduler() is cpupool_unassign_cpu_helper(), which does
> what's necessary to deal with such an error.
> 
> > Might we hit the BUG() in the above case?
> >
> No, because we call cpu_disable_scheduler() from __cpu_disable() only
> when system state is SYS_STATE_suspend already, and hence we take the
> 'then' branch of the 'if', which never returns an error.

Thanks for the elaboration. I find that __cpu_disable() can be called
with the system state not being SYS_STATE_suspend. Here is my experiment:

1. Launch a guest and pin vCPU 3 to pCPU 3 (which makes the experiment simpler)
2. offline pCPU 3 via "xen-hptool cpu-offline 3"

The call path of the above steps is:
arch_do_sysctl()
  -> cpu_down_helper()
-> cpu_down()
  -> take_cpu_down()
-> __cpu_disable()
  -> cpu_disable_scheduler() (enter the 'else'  part)

I ran an infinite loop on guest vCPU3 (hence pCPU3) to increase the
chance of hitting 'ret = -EAGAIN' in this function, and I thought it
would return -EAGAIN if the vCPU is running. However, I tested it many
times and it didn't happen, and I found that after vcpu_migrate()
returns, vCPU3->runstate.state is 1 and vCPU3->is_running is 0, which
means the vCPU is currently not running. Then I went back along the
call path and found that 'v->runstate.state' gets changed during the
call to stop_machine_run(). Here are the findings:

int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
{
    ...

    /* point 1 */

    for_each_cpu ( i, &allbutself )
        tasklet_schedule_on_cpu(&per_cpu(stopmachine_tasklet, i), i);

    /* point 2 */
    ...
}

at point 1 above, 
vCPU3->runstate.state: 0, vCPU3->is_running: 1
while at point 2 above:
vCPU3->runstate.state: 1, vCPU3->is_running: 0
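
(For reference, the runstate values above are the ones from
xen/include/public/vcpu.h:

#define RUNSTATE_running  0  /* currently scheduled on a pCPU */
#define RUNSTATE_runnable 1  /* runnable, waiting to be scheduled */
#define RUNSTATE_blocked  2  /* blocked, awaiting an event */
#define RUNSTATE_offline  3  /* neither runnable nor blocked */

so the transition between point 1 and point 2 is running -> runnable.)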

I tested it many times and got the same result. I am not sure whether
the vcpu state transition just happens to occur here by chance or not.
If the transition doesn't happen and is_running is still 1 when we get
to vcpu_migrate() in cpu_disable_scheduler() in the above case, would
it be a problem?
Thanks a lot!

Thanks,
Feng

> 
> Regards,
> Dario
> --
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-23 Thread Dario Faggioli
On Thu, 2016-06-23 at 12:33 +, Wu, Feng wrote:
> > -Original Message-
> > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > 
> > It goes through all the vcpus of all domains, and does not check or
> > care whether they are running, runnable or blocked.
> > 
> > Let's look at this in some more details. So, let's assume that
> > processor 5 is going away, and that you have the following vcpus
> > around:
> > 
> >  d0v0 : v->processor = 5, running on cpu 5
> >  d0v1 : v->processor = 4, running on cpu 4
> >  d1v0 : v->processor = 5, runnable but not running
> >  d2v3 : v->processor = 5, blocked
> > 
> > for d0v0, we do:
> >   cpu_disable_scheduler(5)
> >     set_bit(_VPF_migrating, &d0v0->pause_flags);
> >     vcpu_sleep_nosync(d0v0);
> >       SCHED_OP(sleep, d0v0);
> >         csched_vcpu_sleep(d0v0)
> >           cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
> >     vcpu_migrate(d0v0);
> >       if ( v->is_running || ...) // assume v->is_running is true
> >         return
> Hi Dario, after reading this mail again, I have another question;
> could you please help me out?
> 
> In the above code flow, we return from vcpu_migrate(d0v0) because
> v->is_running == 1; after vcpu_migrate() returns, we check:
> 
> if ( v->processor == cpu )
>     ret = -EAGAIN;
> 
> In my understanding, in the above case 'v->processor' is likely equal
> to 'cpu', hence we return -EAGAIN. However, in __cpu_disable(), there
> is a check as below:
> 
> if ( cpu_disable_scheduler(cpu) )
>     BUG();
> 
Right. But, as the comment inside cpu_disable_scheduler() itself says,
we only return -EAGAIN in case we are calling cpu_disable_scheduler()
for removing a pCPU from a cpupool.

In that case, we do not use __cpu_disable(), and hence we can safely
return an error value. In that case, in fact, the caller of
cpu_disable_scheduler() is cpupool_unassign_cpu_helper(), which does
what's necessary to deal with such an error.

> Might we hit the BUG() in the above case? 
>
No, because we call cpu_disable_scheduler() from __cpu_disable() only
when system state is SYS_STATE_suspend already, and hence we take the
'then' branch of the 'if', which never returns an error.
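
For reference, the shape of the logic I'm describing, as a simplified
sketch (not the verbatim source; see cpu_disable_scheduler() in
xen/common/schedule.c):

int cpu_disable_scheduler(unsigned int cpu)
{
    int ret = 0;

    /* ... for each vcpu v of each domain in cpu's cpupool ... */
    {
        if ( system_state == SYS_STATE_suspend )
        {
            /* 'then' branch: just record where v should come back
             * after resume; this can never fail. */
        }
        else
        {
            set_bit(_VPF_migrating, &v->pause_flags);
            vcpu_sleep_nosync(v);
            vcpu_migrate(v);

            /*
             * The vCPU did not move away: report -EAGAIN. The only
             * caller that can see this is cpupool_unassign_cpu_helper(),
             * which knows how to retry; __cpu_disable() would BUG().
             */
            if ( v->processor == cpu )
                ret = -EAGAIN;
        }
    }

    return ret;
}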

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-06-23 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Tuesday, May 24, 2016 10:02 PM
> To: Wu, Feng ; Jan Beulich 
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> k...@xen.org
> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> On Tue, 2016-05-24 at 10:07 +, Wu, Feng wrote:
> > > See, for instance, cpu_disable_scheduler() in schedule.c. What we
> > > do is
> > > go over all the vcpus of all domains of either the system or the
> > > cpupool, and force the ones that we found with v->processor set to
> > > the
> > > pCPU that is going down, to perform migration (system_state will be
> > > different than SYS_STATE_suspend, and we hence take the 'else'
> > > branch).
> > >
> > > Considering that the pCPU is no longer part of the relevant
> > > bitmask-s
> > > during the migration, the vCPUs will figure out that they just
> > > can't
> > > stay there, and move somewhere else.
> >
> > Thanks a lot for the elaboration, it is really helpful.
> >
> NP :-)
> 
> > > Note, however, that this happens for running and runnable vCPUs.
> >
> > I don't quite understand this; do you mean cpu_disable_scheduler()
> > only handles running and runnable vCPUs? I tried to find some hints
> > in the code, but I didn't get it. Could you please give some more
> > information about this?
> >
> It goes through all the vcpus of all domains, and does not check or
> care whether they are running, runnable or blocked.
> 
> Let's look at this in some more details. So, let's assume that
> processor 5 is going away, and that you have the following vcpus
> around:
> 
>  d0v0 : v->processor = 5, running on cpu 5
>  d0v1 : v->processor = 4, running on cpu 4
>  d1v0 : v->processor = 5, runnable but not running
>  d2v3 : v->processor = 5, blocked
> 
> for d0v0, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d0v0->pause_flags);
>     vcpu_sleep_nosync(d0v0);
>       SCHED_OP(sleep, d0v0);
>         csched_vcpu_sleep(d0v0)
>           cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
>     vcpu_migrate(d0v0);
>       if ( v->is_running || ...) // assume v->is_running is true
>         return

Hi Dario, after reading this mail again, I have another question;
could you please help me out?

In the above code flow, we return from vcpu_migrate(d0v0) because
v->is_running == 1; after vcpu_migrate() returns, we check:

if ( v->processor == cpu )
    ret = -EAGAIN;

In my understanding, in the above case 'v->processor' is likely equal
to 'cpu', hence we return -EAGAIN. However, in __cpu_disable(), there
is a check as below:

if ( cpu_disable_scheduler(cpu) )
    BUG();

Might we hit the BUG() in the above case? Or am I missing something?
Thanks a lot!

Thanks,
Feng

>     ...
>     ... <--- scheduling occurs on processor 5
>     ...
>     context_saved(d0v0)
>       vcpu_migrate(d0v0);
>           //is_running is 0, so _VPF_migrating gets cleared
>         vcpu_move_locked(d0v0, new_cpu);
>         vcpu_wake(d0v0);
>           SCHED_OP(wake, d0v0);
>             csched_vcpu_wake(d0v0)
>               __runq_insert(d0v0);
>               __runq_tickle(d0v0);
> 
> for d0v1, we do:
>   cpu_disable_scheduler(5)
>     if ( d0v1->processor != 5 )
>       continue
> 
> for d1v0, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d1v0->pause_flags);
>     vcpu_sleep_nosync(d1v0);
>       SCHED_OP(sleep, d1v0);
>         csched_vcpu_sleep(d1v0)
>           __runq_remove(d1v0);
>     vcpu_migrate(d1v0);
>       if ( d1v0->is_running ||
>            !test_and_clear_bit(_VPF_migrating, &d1v0->pause_flags)
>           // false, but clears the _VPF_migrating flag
>       vcpu_move_locked(d1v0, new_cpu);
>       vcpu_wake(v);
>         SCHED_OP(wake, d1v0);
>           csched_vcpu_wake(d1v0)
>             __runq_insert(d1v0);
>             __runq_tickle(d1v0);
> 
> for d2v3, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d2v3->pause_flags);
>     vcpu_sleep_nosync(d2v3);
>       SCHED_OP(sleep, d2v3);
>         csched_vcpu_sleep(d2v3)
> [1]       // Nothing!
>     vcpu_migrate(d2v3);
>       if ( d2v3->is_running ||
>            !test_and_clear_bit(_VPF_migrating, &d2v3->pause_flags)
>           // false, but clears the _VPF_migrating flag
> [*]   vcpu_move_locked(d2v3, new_cpu);
>       vcpu_wake(d2v3);
> [2]     // Nothing!
> 
> > > If a
> > > vCPU is blocked, there is nothing to do, and in fact, nothing
> > > happens
> > > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case).
> >
> > What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> > are NOP '?
> >
> See [1] and [2] above.
> 
> > > For
> > > those vCPUs, as soon as they wake up, they'll figure out that their
> > > own
> > > v->processor is not there any longer, and will move somewhere else.
> >
> > So basically, when 

Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-25 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Tuesday, May 24, 2016 10:47 PM
> To: Wu, Feng ; Jan Beulich 
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> k...@xen.org
> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> On Tue, 2016-05-24 at 13:33 +, Wu, Feng wrote:
> > > From: Wu, Feng
> > > > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > > >
> > > > If a
> > > > vCPU is blocked, there is nothing to do, and in fact, nothing
> > > > happens
> > > > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case).
> > > What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> > > are NOP '?
> >
> > I think I understand what you meant above now.
> >
> Right. In any case, see the email I've just sent, with a detailed
> breakdown of the situation and of what actually happens.
> 
> > Do you think the following idea makes sense?
> >
> > When a pCPU is unplugged, we can just remove the vcpus on the
> > associated per-cpu blocking list, then we can choose another online
> > cpu, set the right NDST value, and put the vCPU on the new per-cpu list?
> >
> Well, this does make sense to me, but the point is how you do that. I
> mean, how do you get to execute some PI adjustment code from
> cpu-teardown code?
> 
> Right now, for what seemed to be necessary until now, we have
> arch_vcpu_block(). Do we need more? If yes, where?
> 
> From looking only at schedule.c, we already have arch_move_irqs(), can
> we take advantage of it?

I think we can add the logic in vmx_cpu_dead(). I already have a draft
patch, and I will send it out to you guys for review later!
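
Just to show the direction I'm thinking of, a rough sketch (names and
locking are simplified and partly made up; the real patch will also
need to take the destination list's lock and cope with the x2APIC
destination format):

static void vmx_pi_list_fixup(unsigned int cpu)
{
    unsigned int new_cpu = cpumask_any(&cpu_online_map);
    spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
    struct arch_vmx_struct *vmx, *tmp;

    spin_lock(lock);
    list_for_each_entry_safe ( vmx, tmp, blocked_vcpus, pi_blocking.list )
    {
        /* Unhook from the dying pCPU's list, redirect the wakeup
         * notification to new_cpu, and rehook onto new_cpu's list. */
        list_del(&vmx->pi_blocking.list);
        write_atomic(&vmx->pi_desc.ndst, cpu_physical_id(new_cpu));
        list_add_tail(&vmx->pi_blocking.list,
                      &per_cpu(vmx_pi_blocking, new_cpu).list);
    }
    spin_unlock(lock);
}

vmx_cpu_dead(cpu) would then call this for the CPU going away.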

Thanks,
Feng

> 
> Regards,
> Dario
> --
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-25 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Tuesday, May 24, 2016 10:02 PM
> To: Wu, Feng ; Jan Beulich 
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> k...@xen.org
> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> On Tue, 2016-05-24 at 10:07 +, Wu, Feng wrote:
> > > See, for instance, cpu_disable_scheduler() in schedule.c. What we
> > > do is
> > > go over all the vcpus of all domains of either the system or the
> > > cpupool, and force the ones that we found with v->processor set to
> > > the
> > > pCPU that is going down, to perform migration (system_state will be
> > > different than SYS_STATE_suspend, and we hence take the 'else'
> > > branch).
> > >
> > > Considering that the pCPU is no longer part of the relevant
> > > bitmask-s
> > > during the migration, the vCPUs will figure out that they just
> > > can't
> > > stay there, and move somewhere else.
> >
> > Thanks a lot for the elaboration, it is really helpful.
> >
> NP :-)
> 
> > > Note, however, that this happens for running and runnable vCPUs.
> >
> > I don't quite understand this; do you mean cpu_disable_scheduler()
> > only handles running and runnable vCPUs? I tried to find some hints
> > in the code, but I didn't get it. Could you please give some more
> > information about this?
> >
> It goes through all the vcpus of all domains, and does not check or
> care whether they are running, runnable or blocked.
> 
> Let's look at this in some more details. So, let's assume that
> processor 5 is going away, and that you have the following vcpus
> around:
> 
>  d0v0 : v->processor = 5, running on cpu 5
>  d0v1 : v->processor = 4, running on cpu 4
>  d1v0 : v->processor = 5, runnable but not running
>  d2v3 : v->processor = 5, blocked
> 
> for d0v0, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d0v0->pause_flags);
>     vcpu_sleep_nosync(d0v0);
>       SCHED_OP(sleep, d0v0);
>         csched_vcpu_sleep(d0v0)
>           cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
>     vcpu_migrate(d0v0);
>       if ( v->is_running || ...) // assume v->is_running is true
>         return
>     ...
>     ... <--- scheduling occurs on processor 5
>     ...
>     context_saved(d0v0)
>       vcpu_migrate(d0v0);
>           //is_running is 0, so _VPF_migrating gets cleared
>         vcpu_move_locked(d0v0, new_cpu);
>         vcpu_wake(d0v0);
>           SCHED_OP(wake, d0v0);
>             csched_vcpu_wake(d0v0)
>               __runq_insert(d0v0);
>               __runq_tickle(d0v0);
> 
> for d0v1, we do:
>   cpu_disable_scheduler(5)
>     if ( d0v1->processor != 5 )
>       continue
> 
> for d1v0, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d1v0->pause_flags);
>     vcpu_sleep_nosync(d1v0);
>       SCHED_OP(sleep, d1v0);
>         csched_vcpu_sleep(d1v0)
>           __runq_remove(d1v0);
>     vcpu_migrate(d1v0);
>       if ( d1v0->is_running ||
>            !test_and_clear_bit(_VPF_migrating, &d1v0->pause_flags)
>           // false, but clears the _VPF_migrating flag
>       vcpu_move_locked(d1v0, new_cpu);
>       vcpu_wake(v);
>         SCHED_OP(wake, d1v0);
>           csched_vcpu_wake(d1v0)
>             __runq_insert(d1v0);
>             __runq_tickle(d1v0);
> 
> for d2v3, we do:
>   cpu_disable_scheduler(5)
>     set_bit(_VPF_migrating, &d2v3->pause_flags);
>     vcpu_sleep_nosync(d2v3);
>       SCHED_OP(sleep, d2v3);
>         csched_vcpu_sleep(d2v3)
> [1]       // Nothing!
>     vcpu_migrate(d2v3);
>       if ( d2v3->is_running ||
>            !test_and_clear_bit(_VPF_migrating, &d2v3->pause_flags)
>           // false, but clears the _VPF_migrating flag
> [*]   vcpu_move_locked(d2v3, new_cpu);
>       vcpu_wake(d2v3);
> [2]     // Nothing!
> 
> > > If a
> > > vCPU is blocked, there is nothing to do, and in fact, nothing
> > > happens
> > > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case).
> >
> > What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> > are NOP '?
> >
> See [1] and [2] above.
> 
> > > For
> > > those vCPUs, as soon as they wake up, they'll figure out that their
> > > own
> > > v->processor is not there any longer, and will move somewhere else.
> >
> So basically, when a vCPU is blocked, removing 'v->processor' has no
> impact on the blocked vcpu. When the vCPU is woken up, it will find
> another pCPU to run on, since the original 'v->processor' is down and
> no longer in the cpu bitmask, right?
> >
> Yes, that was my point.
> 
> _However_, as you can see at [*] above, it must be noted that even
> those vcpus that blocked while running on a certain processor (5 in the
> example), indeed have a chance to have their
> v->processor changed to something that is still online (something
> different than 5), as a consequence of 

Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-24 Thread Dario Faggioli
On Tue, 2016-05-24 at 13:33 +, Wu, Feng wrote:
> > From: Wu, Feng
> > > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > > 
> > > If a
> > > vCPU is blocker, there is nothing to do, and in fact, nothing
> > > happens
> > > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case).
> > What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> > are NOP '?
>
> I think I understand what you meant above now.
> 
Right. In any case, see the email I've just sent, with a detailed
breakdown of the situation and of what actually happens.

> Do you think the following idea makes sense?
> 
> When a pCPU is unplugged, we can just remove the vcpus on the
> associated per-cpu blocking list, then we can choose another online
> cpu, set the right NDST value, and put the vCPU on the new per-cpu list?
>
Well, this does make sense to me, but the point is how you do that. I
mean, how do you get to execute some PI adjustment code from
cpu-teardown code?

Right now, for what seemed to be necessary until now, we have
arch_vcpu_block(). Do we need more? If yes, where?

From looking only at schedule.c, we already have arch_move_irqs(), can
we take advantage of it?

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-24 Thread Dario Faggioli
On Tue, 2016-05-24 at 10:07 +, Wu, Feng wrote:
> > See, for instance, cpu_disable_scheduler() in schedule.c. What we
> > do is
> > go over all the vcpus of all domains of either the system or the
> > cpupool, and force the ones that we found with v->processor set to
> > the
> > pCPU that is going down, to perform migration (system_state will be
> > different than SYS_STATE_suspend, and we hence take the 'else'
> > branch).
> > 
> > Considering that the pCPU is no longer part of the relevant
> > bitmask-s
> > during the migration, the vCPUs will figure out that they just
> > can't
> > stay there, and move somewhere else.
>
> Thanks a lot for the elaboration, it is really helpful.
> 
NP :-)

> > Note, however, that this happens for running and runnable vCPUs. 
>
> I don't quite understand this; do you mean cpu_disable_scheduler()
> only handles running and runnable vCPUs? I tried to find some hints
> in the code, but I didn't get it. Could you please give some more
> information about this?
> 
It goes through all the vcpus of all domains, and does not check or
care whether they are running, runnable or blocked.

Let's look at this in some more details. So, let's assume that
processor 5 is going away, and that you have the following vcpus
around:

 d0v0 : v->processor = 5, running on cpu 5
 d0v1 : v->processor = 4, running on cpu 4
 d1v0 : v->processor = 5, runnable but not running
 d2v3 : v->processor = 5, blocked

for d0v0, we do:
  cpu_disable_scheduler(5)
    set_bit(_VPF_migrating, &d0v0->pause_flags);
    vcpu_sleep_nosync(d0v0);
      SCHED_OP(sleep, d0v0);
        csched_vcpu_sleep(d0v0)
          cpu_raise_softirq(5, SCHEDULE_SOFTIRQ);
    vcpu_migrate(d0v0);
      if ( v->is_running || ...) // assume v->is_running is true
        return
    ...
    ... <--- scheduling occurs on processor 5
    ...
    context_saved(d0v0)
      vcpu_migrate(d0v0);
          //is_running is 0, so _VPF_migrating gets cleared
        vcpu_move_locked(d0v0, new_cpu);
        vcpu_wake(d0v0);
          SCHED_OP(wake, d0v0);
            csched_vcpu_wake(d0v0)
              __runq_insert(d0v0);
              __runq_tickle(d0v0);

for d0v1, we do:
  cpu_disable_scheduler(5)
    if ( d0v1->processor != 5 )
      continue

for d1v0, we do:
  cpu_disable_scheduler(5)
    set_bit(_VPF_migrating, &d1v0->pause_flags);
    vcpu_sleep_nosync(d1v0);
      SCHED_OP(sleep, d1v0);
        csched_vcpu_sleep(d1v0)
          __runq_remove(d1v0);
    vcpu_migrate(d1v0);
      if ( d1v0->is_running ||
           !test_and_clear_bit(_VPF_migrating, &d1v0->pause_flags)
          // false, but clears the _VPF_migrating flag
      vcpu_move_locked(d1v0, new_cpu);
      vcpu_wake(v);
        SCHED_OP(wake, d1v0);
          csched_vcpu_wake(d1v0)
            __runq_insert(d1v0);
            __runq_tickle(d1v0);

for d2v3, we do:
  cpu_disable_scheduler(5)
    set_bit(_VPF_migrating, &d2v3->pause_flags);
    vcpu_sleep_nosync(d2v3);
      SCHED_OP(sleep, d2v3);
        csched_vcpu_sleep(d2v3)
[1]       // Nothing!
    vcpu_migrate(d2v3);
      if ( d2v3->is_running ||
           !test_and_clear_bit(_VPF_migrating, &d2v3->pause_flags)
          // false, but clears the _VPF_migrating flag
[*]   vcpu_move_locked(d2v3, new_cpu);
      vcpu_wake(d2v3);
[2]     // Nothing!
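
The check that gates all of the above (and that produces [*]) is the
one at the top of vcpu_migrate(); roughly this, simplified from
xen/common/schedule.c:

static void vcpu_migrate(struct vcpu *v)
{
    ...
    /*
     * If v is still running, or nobody (us included) has set
     * _VPF_migrating, bail out. For a running vCPU, the migration
     * is then retried from context_saved(), as in the d0v0 case.
     */
    if ( v->is_running ||
         !test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
    {
        /* ... drop the scheduler lock(s) ... */
        return;
    }

    vcpu_move_locked(v, new_cpu);
    /* ... drop the scheduler lock(s) ... */
    vcpu_wake(v);
}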

> > If a
> > vCPU is blocked, there is nothing to do, and in fact, nothing
> > happens
> > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case). 
>
> What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> are NOP '?
> 
See [1] and [2] above.

> > For
> > those vCPUs, as soon as they wake up, they'll figure out that their
> > own
> > v->processor is not there any longer, and will move somewhere else.
>
> So basically, when a vCPU is blocked, removing 'v->processor' has no
> impact on the blocked vcpu. When the vCPU is woken up, it will find
> another pCPU to run on, since the original 'v->processor' is down and
> no longer in the cpu bitmask, right?
> 
Yes, that was my point.

_However_, as you can see at [*] above, it must be noted that even
those vcpus that blocked while running on a certain processor (5 in the
example), indeed have a chance to have their
v->processor changed to something that is still online (something
different than 5), as a consequence of that processor going away.

Whether this is useful/enough for you, I can't tell right now, off the
top of my head.

> > > > But this is not an issue in the non-pCPU-hotplug scenario.
> > > > 
> > It's probably an issue even if you remove a cpu from a cpupool (and
> > even a more "interesting" one, if you also manage to add it to
> > another
> > pool, in the meantime) isn't it?
>
> Yes, things become more complex in that case 
> 
Well, but can you confirm that we also have an issue there, and test
and report what happens if you move a cpu from pool A to pool B, while
it still has vcpus from a domain that stays in pool A?

If there's transient suboptimal behavior, well, we probably can live

Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-24 Thread Wu, Feng


> -Original Message-
> From: Wu, Feng
> Sent: Tuesday, May 24, 2016 6:08 PM
> To: Dario Faggioli ; Jan Beulich 
> 
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> k...@xen.org; Wu, Feng 
> Subject: RE: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> 
> 
> > -Original Message-
> > From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> > Sent: Monday, May 23, 2016 8:39 PM
> > To: Jan Beulich ; Wu, Feng 
> > Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> > ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> > k...@xen.org
> > Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> > blocking list
> >
> > On Mon, 2016-05-23 at 02:51 -0600, Jan Beulich wrote:
> > > > > > On 23.05.16 at 10:44,  wrote:
> > > > >
> > > > > vCPU-s currently having their v->processor set to the pCPU being
> > > > > hot removed would simply get migrated elsewhere. If that's not
> > > > > accompanied by respective PI blocking list adjustments, then I
> > > > > can
> > > > > only refer you back to the original series' review process, where
> > > > > it was (iirc) more than once that it was pointed out that getting
> > > > > out
> > > > of sync v->processor and the PI list a vCPU sits on may cause
> > > > > issues. And if there is an issue here, I'm not sure what a good
> > > > > solution would be.
> > > > Okay, here is my understanding about the blocking and vCPU
> > > > migration things, if vCPU is blocking and its v->processor is pCPU
> > > > 0,
> > > > at some point, pCPU0 is removed from the system, it will also
> > > > migrate the vCPU to another pCPU, say, pCPU1 as the response
> > > > to pCPU0 being unplugged, hence, v->processor = pCPU1, right?
> > >
> > > Yes.
> > >
> > See, for instance, cpu_disable_scheduler() in schedule.c. What we do is
> > go over all the vcpus of all domains of either the system or the
> > cpupool, and force the ones that we found with v->processor set to the
> > pCPU that is going down, to perform migration (system_state will be
> > different than SYS_STATE_suspend, and we hence take the 'else' branch).
> >
> > Considering that the pCPU is no longer part of the relevant bitmask-s
> > during the migration, the vCPUs will figure out that they just can't
> > stay there, and move somewhere else.
> 
> Thanks a lot for the elaboration, it is really helpful.
> 
> >
> > Note, however, that this happens for running and runnable vCPUs.
> 
> I don't quite understand this; do you mean cpu_disable_scheduler()
> only handles running and runnable vCPUs? I tried to find some hints
> in the code, but I didn't get it. Could you please give some more
> information about this?
> 
> > If a
> > vCPU is blocked, there is nothing to do, and in fact, nothing happens
> > (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case).
> 
> What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
> are NOP '?

I think I understand what you meant above now.

Do you think the following idea makes sense?

When a pCPU is unplugged, we can just remove the vcpus on the
associated per-cpu blocking list, then choose another online cpu,
set the right NDST value, and put the vCPU on the new per-cpu list.
Meanwhile, we may need some special treatment for the case where
vmx_vcpu_block() runs concurrently with cpu hotplug.

Thanks,
Feng

> 
> > For
> > those vCPUs, as soon as they wake up, they'll figure out that their own
> > v->processor is not there any longer, and will move somewhere else.
> 
> So basically, when a vCPU is blocked, removing 'v->processor' has no
> impact on the blocked vcpu. When the vCPU is woken up, it will find
> another pCPU to run on, since the original 'v->processor' is down and
> no longer in the cpu bitmask, right?
> 
> >
> > > > If that is the case, I think the current code may have issues in
> > > > the
> > > > above case, since the 'NDST' field in the PI descriptor is still
> > > > pCPU0 even though it has been removed from the system. I need to
> > > > consider how to handle this case.
> > >
> > Exactly. I don't think you can do that, for instance, from within
> > cpu_disable_scheduler(), or anything that gets called from there, as it
> > basically deals with running vCPUs, while you're interested in blocked
> > ones.
> >
> > I wonder whether you can register your own notifier, and, inside it,
> > scan the blocked list and fixup things... (just the first idea that
> > came up in my mind)
> >
> > > But this is not an issue in the non-pCPU-hotplug scenario.
> > > >
> > It's probably an issue even if you remove a cpu from a cpupool (and
> > even a more "interesting" one, if you also manage to add it to another
> > pool, in the meantime) isn't it?
> 
> Yes, things become 

Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-24 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Monday, May 23, 2016 8:39 PM
> To: Jan Beulich ; Wu, Feng 
> Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin
> ; xen-devel@lists.xen.org; konrad.w...@oracle.com;
> k...@xen.org
> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> On Mon, 2016-05-23 at 02:51 -0600, Jan Beulich wrote:
> > > > > On 23.05.16 at 10:44,  wrote:
> > > >
> > > > vCPU-s currently having their v->processor set to the pCPU being
> > > > hot removed would simply get migrated elsewhere. If that's not
> > > > accompanied by respective PI blocking list adjustments, then I
> > > > can
> > > > only refer you back to the original series' review process, where
> > > > it was (iirc) more than once that it was pointed out that getting
> > > > out
> > > > of sync v->processor and the PI list a vCPU sits on may cause
> > > > issues. And if there is an issue here, I'm not sure what a good
> > > > solution would be.
> > > Okay, here is my understanding about the blocking and vCPU
> > > migration things, if vCPU is blocking and its v->processor is pCPU
> > > 0,
> > > at some point, pCPU0 is removed from the system, it will also
> > > migrate the vCPU to another pCPU, say, pCPU1 as the response
> > > to pCPU0 being unplugged, hence, v->processor = pCPU1, right?
> >
> > Yes.
> >
> See, for instance, cpu_disable_scheduler() in schedule.c. What we do is
> go over all the vcpus of all domains of either the system or the
> cpupool, and force the ones that we found with v->processor set to the
> pCPU that is going down, to perform migration (system_state will be
> different than SYS_STATE_suspend, and we hence take the 'else' branch).
> 
> Considering that the pCPU is no longer part of the relevant bitmask-s
> during the migration, the vCPUs will figure out that they just can't
> stay there, and move somewhere else.

Thanks a lot for the elaboration, it is really helpful.

> 
> Note, however, that this happens for running and runnable vCPUs. 

I don't quite understand this; do you mean cpu_disable_scheduler()
only handles running and runnable vCPUs? I tried to find some hints
in the code, but I didn't get it. Could you please give some more
information about this?

> If a
> vCPU is blocked, there is nothing to do, and in fact, nothing happens
> (as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case). 

What do you mean by saying ' vcpu_sleep_nosync() and vcpu_wake()
are NOP '?

> For
> those vCPUs, as soon as they wake up, they'll figure out that their own
> v->processor is not there any longer, and will move somewhere else.

So basically, when a vCPU is blocked, removing 'v->processor' has no
impact on the blocked vcpu. When the vCPU is woken up, it will find
another pCPU to run on, since the original 'v->processor' is down and
no longer in the cpu bitmask, right?

> 
> > > If that is the case, I think the current code may have issues in
> > > the
> > > above case, since the 'NDST' field in the PI descriptor is still
> > > pCPU0 even though it has been removed from the system. I need to
> > > consider how to handle this case.
> >
> Exactly. I don't think you can do that, for instance, from within
> cpu_disable_scheduler(), or anything that gets called from there, as it
> basically deals with running vCPUs, while you're interested in blocked
> ones.
> 
> I wonder whether you can register your own notifier, and, inside it,
> scan the blocked list and fixup things... (just the first idea that
> came up in my mind)
> 
> > > But this is not an issue in the non-pCPU-hotplug scenario.
> > >
> It's probably an issue even if you remove a cpu from a cpupool (and
> even a more "interesting" one, if you also manage to add it to another
> pool, in the meantime) isn't it?

Yes, things become more complex in that case 

Thanks,
Feng

> 
> Regards,
> Dario
> --
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-23 Thread Dario Faggioli
On Mon, 2016-05-23 at 02:51 -0600, Jan Beulich wrote:
> > > > On 23.05.16 at 10:44,  wrote:
> > > 
> > > vCPU-s currently having their v->processor set to the pCPU being
> > > hot removed would simply get migrated elsewhere. If that's not
> > > accompanied by respective PI blocking list adjustments, then I
> > > can
> > > only refer you back to the original series' review process, where
> > > it was (iirc) more than once that it was pointed out that getting
> > > out
> > > of sync v->processor and the PI list a vCPU sits on may cause
> > > issues. And if there is an issue here, I'm not sure what a good
> > > solution would be.
> > Okay, here is my understanding about the blocking and vCPU
> > migration things, if vCPU is blocking and its v->processor is pCPU
> > 0,
> > at some point, pCPU0 is removed from the system, it will also
> > migrate the vCPU to another pCPU, say, pCPU1 as the response
> > to pCPU0 being unplugged, hence, v->processor = pCPU1, right?
>
> Yes.
> 
See, for instance, cpu_disable_scheduler() in schedule.c. What we do is
go over all the vcpus of all domains of either the system or the
cpupool, and force the ones that we found with v->processor set to the
pCPU that is going down, to perform migration (system_state will be
different than SYS_STATE_suspend, and we hence take the 'else' branch).

Considering that the pCPU is no longer part of the relevant bitmask-s
during the migration, the vCPUs will figure out that they just can't
stay there, and move somewhere else.

Note, however, that this happens for running and runnable vCPUs. If a
vCPU is blocked, there is nothing to do, and in fact, nothing happens
(as vcpu_sleep_nosync() and vcpu_wake() are NOP in that case). For
those vCPUs, as soon as they wake up, they'll figure out that their own
v->processor is not there any longer, and will move somewhere else.

> > If that is the case, I think the current code may have issues in
> > the
> > above case, since the 'NDST' field in the PI descriptor is still
> > pCPU0 even though it has been removed from the system. I need to
> > consider how to handle this case.
>
Exactly. I don't think you can do that, for instance, from within
cpu_disable_scheduler(), or anything that gets called from there, as it
basically deals with running vCPUs, while you're interested in blocked
ones.

I wonder whether you can register your own notifier, and, inside it,
scan the blocked list and fixup things... (just the first idea that
came up in my mind)
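
Something like this, I mean (a sketch of the usual Xen cpu-notifier
pattern; the callback body and the names are made up):

static int vmx_pi_cpu_callback(
    struct notifier_block *nfb, unsigned long action, void *hcpu)
{
    unsigned int cpu = (unsigned long)hcpu;

    switch ( action )
    {
    case CPU_DEAD:
        /* Scan cpu's PI blocking list here and fix up NDST. */
        break;
    default:
        break;
    }

    return NOTIFY_DONE;
}

static struct notifier_block vmx_pi_cpu_nfb = {
    .notifier_call = vmx_pi_cpu_callback
};

/* ... register_cpu_notifier(&vmx_pi_cpu_nfb); from an initcall ... */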

> > But this is not an issue in the non-pCPU-hotplug scenario.
> > 
It's probably an issue even if you remove a cpu from a cpupool (and
even a more "interesting" one, if you also manage to add it to another
pool, in the meantime) isn't it?

Regards,
Dario
-- 
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-23 Thread Jan Beulich
>>> On 23.05.16 at 10:44,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Monday, May 23, 2016 4:09 PM
>> >>> On 20.05.16 at 12:46,  wrote:
>> > If this is the case, it can address part of my concern. Another
>> > concern is
>> > if a vCPU is blocking on a pCPU, then the pCPU is going to be unplugged,
>> > what does
>> > Xen do for the blocking vCPU?
>> 
>> A pCPU is never blocking "on a pCPU", it is always just blocking.
> 
> I think you meant "vCPU is never "

Oh, indeed.

>> vCPU-s currently having their v->processor set to the pCPU being
>> hot removed would simply get migrated elsewhere. If that's not
>> accompanied by respective PI blocking list adjustments, then I can
>> only refer you back to the original series' review process, where
>> it was (iirc) more than once that it was pointed out that getting out
>> of sync v->processor and the PI list a vCPU sits on may cause
>> issues. And if there is an issue here, I'm not sure what a good
>> solution would be.
> 
> Okay, here is my understanding about the blocking and vCPU
> migration things, if vCPU is blocking and its v->processor is pCPU 0,
> at some point, pCPU0 is removed from the system, it will also
> migrate the vCPU to another pCPU, say, pCPU1 as the response
> to pCPU0 being unplugged, hence, v->processor = pCPU1, right?

Yes.

Jan

> If that is the case, I think the current code may have issues in the
> above case, since the 'NDST' field in the PI descriptor is still pCPU0
> even though it has been removed from the system. I need to consider how
> to handle this case. But this is not an issue in the non-pCPU-hotplug
> scenario.
> 
> Thanks,
> Feng





Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-23 Thread Wu, Feng


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Monday, May 23, 2016 4:09 PM
> To: Wu, Feng 
> Cc: andrew.coop...@citrix.com; dario.faggi...@citrix.com;
> george.dun...@eu.citrix.com; Tian, Kevin ; xen-
> de...@lists.xen.org; konrad.w...@oracle.com; k...@xen.org
> Subject: RE: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> >>> On 20.05.16 at 12:46,  wrote:
> 
> >
> >> -Original Message-
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: Friday, May 20, 2016 6:27 PM
> >> To: Wu, Feng 
> >> Cc: andrew.coop...@citrix.com; dario.faggi...@citrix.com;
> >> george.dun...@eu.citrix.com; Tian, Kevin ; xen-
> >> de...@lists.xen.org; konrad.w...@oracle.com; k...@xen.org
> >> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> >> blocking list
> >>
> >> >>> On 20.05.16 at 10:53,  wrote:
> >> > I still have two opens, which need comments/suggestions from you guys.
> >> > - What should we do for the per-cpu blocking list during vcpu hotplug?
> >>
> >> What do you mean with vcpu hotplug? vcpus never get removed
> >> from a VM (from hypervisor perspective), and all the guest may
> >> ever use need to be created before the guest starts.
> >
> > Thanks for the reply, Jan. First of all, I am not familiar with vcpu
> > hotplug/pcpu hotplug, and that is why I list them as two opens here.
> > When I wrote "vcpu hotplug", I was planning to refer to the following
> > case:
> > 1. use 'xl vcpu-set ${dom_id} ${vcpu_num}', where ${vcpu_num} is less
> > than the current number of online vcpus.
> > 2. In the guest, use 'echo 0 > /sys/devices/system/cpu/cpuX/online' to
> > take the vcpus offline.
> >
> > Yes, I know the maximum number of vcpus is allocated for the guest,
> > but I am not quite sure whether the above operations will affect the
> > blocking list. If you can elaborate a bit more on what the hypervisor
> > will do for the above operations, that would be very helpful.
> 
> The hypervisor just offlines the vCPU (in response to the guest
> invoking VCPUOP_down).
> 
> > such as, will the vcpu's state be changed, etc.? And if the vcpu is
> > currently in the blocking state while we try to offline it, it will
> > still remain on the blocking list, right? Will this cause some
> > problems?
> 
> Since _VPF_down is just one of many possible pause flags, I don't
> see what problems you're suspecting. But maybe I'm overlooking
> something?

Thanks for the explanation. If the hypervisor just offlines the vCPU,
I think there is nothing special to do here, since the current logic
can handle the state transition well.

> 
> >> > - What should we do for the per-cpu blocking list during pcpu hotplug?
> >>
> >> I think it would help if you said what issue you see.
> >
> > I didn't see any issues related to this; this open just came to my
> > mind when I wrote this patch to fix the bug. As mentioned above, when
> > the pCPU is removed, what is the status of its blocking list? But
> > thinking a bit more about this, I feel it should work this way:
> > before the pCPU is removed, all the vCPUs running on it have been
> > moved to another pCPU, right?
> 
> Yes.
> 
> > If this is the case, it can address part of my concern. Another
> > concern is: if a vCPU is blocking on a pCPU, and then the pCPU is
> > going to be unplugged, what does Xen do for the blocking vCPU?
> 
> A pCPU is never blocking "on a pCPU", it is always just blocking.

I think you meant "vCPU is never "

> vCPU-s currently having their v->processor set to the pCPU being
> hot removed would simply get migrated elsewhere. If that's not
> accompanied by respective PI blocking list adjustments, then I can
> only refer you back to the original series' review process, where
> it was (iirc) more than once that it was pointed out that getting out
> of sync v->processor and the PI list a vCPU sits on may cause
> issues. And if there is an issue here, I'm not sure what a good
> solution would be.

Okay, here is my understanding about the blocking and vCPU migration
things: if a vCPU is blocking and its v->processor is pCPU0, and at
some point pCPU0 is removed from the system, Xen will also migrate the
vCPU to another pCPU, say pCPU1, in response to pCPU0 being unplugged;
hence v->processor = pCPU1, right?

If that is the case, I think the current code may have issues in the
above case, since the 'NDST' field in the PI descriptor is still pCPU0
even though it has been removed from the system. I need to consider how
to handle this case. But this is not an issue in the non-pCPU-hotplug
scenario.
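
(For reference, this is the descriptor I am talking about, abridged
from Xen's xen/include/asm-x86/hvm/vmx/vmcs.h:

struct pi_desc {
    u64 pir[4];          /* posted-interrupt requests, one bit/vector */
    union {
        struct {
            u16 on     : 1,  /* outstanding notification */
                sn     : 1,  /* suppress notification */
                rsvd_1 : 14;
            u8  nv;          /* notification vector */
            u8  rsvd_2;
            u32 ndst;        /* notification destination: the APIC ID
                              * of the pCPU to poke on posting */
        };
        u64 control;
    };
    u32 rsvd[6];
} __attribute__ ((aligned (64)));

so 'NDST' should always name a pCPU that is actually online.)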

Thanks,
Feng

> 
> Jan




Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-23 Thread Jan Beulich
>>> On 20.05.16 at 12:46,  wrote:

> 
>> -Original Message-
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Friday, May 20, 2016 6:27 PM
>> To: Wu, Feng 
>> Cc: andrew.coop...@citrix.com; dario.faggi...@citrix.com;
>> george.dun...@eu.citrix.com; Tian, Kevin ; xen-
>> de...@lists.xen.org; konrad.w...@oracle.com; k...@xen.org 
>> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
>> blocking list
>> 
>> >>> On 20.05.16 at 10:53,  wrote:
>> > I still have two opens, which need comments/suggestions from you guys.
>> > - What should we do for the per-cpu blocking list during vcpu hotplug?
>> 
>> What do you mean by vcpu hotplug? vcpus never get removed
>> from a VM (from the hypervisor's perspective), and all the vcpus the
>> guest may ever use need to be created before the guest starts.
> 
> Thanks for the reply, Jan. First of all, I am not familiar with vcpu
> hotplug/pcpu hotplug, and that is why I list them as two opens here.
> When I wrote "vcpu hotplug", I was planning to refer to the following
> case:
> 1. use 'xl vcpu-set ${dom_id} ${vcpu_num}', where ${vcpu_num} is less
> than the current number of online vcpus.
> 2. In the guest, use 'echo 0 > /sys/devices/system/cpu/cpuX/online' to
> take the vcpus offline.
> 
> Yes, I know the maximum vcpus are allocated for the guest, but I am not
> quite sure whether the above operations will affect the blocking list.
> If you can elaborate a bit more on what the hypervisor will do for the
> above operations, that would be very helpful.

The hypervisor just offlines the vCPU (in response to the guest
invoking VCPUOP_down).

> For example, will the vcpu's state be changed, etc.? And if the vcpu is
> currently in the blocking state when we try to offline it, it will
> still remain in the blocking list, right? Will this cause any problems?

Since _VPF_down is just one of many possible pause flags, I don't
see what problems you're suspecting. But maybe I'm overlooking
something?

>> > - What should we do for the per-cpu blocking list during pcpu hotplug?
>> 
>> I think it would help if you said what issue you see.
> 
> I didn't see any issues related to this; this open just came to mind
> when I wrote this patch to fix the bug. As mentioned above, when the
> pCPU is removed, what is the status of its blocking list? But thinking
> a bit more about this, I feel it should work this way: before the pCPU
> is removed, all the vCPUs running on it have been moved to another
> pCPU, right?

Yes.

> If this is the case, it can address part of my concern. Another
> concern is: if a vCPU is blocking on a pCPU and that pCPU is going to
> be unplugged, what does Xen do for the blocking vCPU?

A pCPU is never blocking "on a pCPU", it is always just blocking.
vCPU-s currently having their v->processor set to the pCPU being
hot-removed would simply get migrated elsewhere. If that's not
accompanied by the respective PI blocking list adjustments, then I can
only refer you back to the original series' review process, where it
was (iirc) pointed out more than once that letting v->processor and
the PI list a vCPU sits on get out of sync may cause issues. And if
there is an issue here, I'm not sure what a good solution would be.

Jan




Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-20 Thread Wu, Feng


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Friday, May 20, 2016 6:27 PM
> To: Wu, Feng 
> Cc: andrew.coop...@citrix.com; dario.faggi...@citrix.com;
> george.dun...@eu.citrix.com; Tian, Kevin ; xen-
> de...@lists.xen.org; konrad.w...@oracle.com; k...@xen.org
> Subject: Re: [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu
> blocking list
> 
> >>> On 20.05.16 at 10:53,  wrote:
> > I still have two opens, which need comments/suggestions from you guys.
> > - What should we do for the per-cpu blocking list during vcpu hotplug?
> 
> What do you mean by vcpu hotplug? vcpus never get removed
> from a VM (from the hypervisor's perspective), and all the vcpus the
> guest may ever use need to be created before the guest starts.

Thanks for the reply, Jan. First of all, I am not familiar with vcpu
hotplug/pcpu hotplug, and that is why I list them as two opens here.
When I wrote "vcpu hotplug", I was planning to refer to the following
case:
1. use 'xl vcpu-set ${dom_id} ${vcpu_num}', where ${vcpu_num} is less
than the current number of online vcpus.
2. In the guest, use 'echo 0 > /sys/devices/system/cpu/cpuX/online' to
take the vcpus offline.

Yes, I know the maximum vcpus are allocated for the guest, but I am not
quite sure whether the above operations will affect the blocking list.
If you can elaborate a bit more on what the hypervisor will do for the
above operations, that would be very helpful. For example, will the
vcpu's state be changed, etc.? And if the vcpu is currently in the
blocking state when we try to offline it, it will still remain in the
blocking list, right? Will this cause any problems?

> 
> > - What should we do for the per-cpu blocking list during pcpu hotplug?
> 
> I think it would help if you said what issue you see.

I didn't see any issues related to this; this open just came to mind
when I wrote this patch to fix the bug. As mentioned above, when the
pCPU is removed, what is the status of its blocking list? But thinking
a bit more about this, I feel it should work this way: before the pCPU
is removed, all the vCPUs running on it have been moved to another
pCPU, right? If this is the case, it can address part of my concern.
Another concern is: if a vCPU is blocking on a pCPU and that pCPU is
going to be unplugged, what does Xen do for the blocking vCPU?

Thanks,
Feng

> 
> Jan




Re: [Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-20 Thread Jan Beulich
>>> On 20.05.16 at 10:53,  wrote:
> I still have two opens, which need comments/suggestions from you guys.
> - What should we do for the per-cpu blocking list during vcpu hotplug?

What do you mean by vcpu hotplug? vcpus never get removed
from a VM (from the hypervisor's perspective), and all the vcpus the
guest may ever use need to be created before the guest starts.

> - What should we do for the per-cpu blocking list during pcpu hotplug?

I think it would help if you said what issue you see.

Jan




[Xen-devel] [PATCH 0/3] VMX: Properly handle pi descriptor and per-cpu blocking list

2016-05-20 Thread Feng Wu
The current VT-d PI related code may operate incorrectly in the following
scenarios:
- When the last assigned device is detached from the domain, all
the PI related hooks are removed. However, the vCPU can still be
blocked, switched to another pCPU, etc., all without PI being aware
of it. The next time we attach another device to the domain, which
makes the PI related hooks available again, the status of the PI
descriptor is no longer correct. Besides that, the blocking vCPU
may still remain in the per-cpu blocking list in this case.
- After the domain is destroyed, the blocking vCPU may also
remain in the per-cpu blocking list.

This series fixes the above issues.
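
As a very rough illustration of the direction the last patch takes (a
sketch only; the function name is made up and the details differ from
the actual patch, while the pi_blocking list/lock fields are the ones
introduced by the original PI series):

static void pi_blocking_cleanup_sketch(struct vcpu *v)
{
    unsigned long flags;
    spinlock_t *lock = v->arch.hvm_vmx.pi_blocking.lock;

    /* The vCPU never blocked, or has already been cleaned up. */
    if ( lock == NULL )
        return;

    spin_lock_irqsave(lock, flags);
    /* Re-check under the lock: the wakeup handler may have raced us. */
    if ( v->arch.hvm_vmx.pi_blocking.lock == lock )
    {
        list_del(&v->arch.hvm_vmx.pi_blocking.list);
        v->arch.hvm_vmx.pi_blocking.lock = NULL;
    }
    spin_unlock_irqrestore(lock, flags);
}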

I still have two opens, which need comments/suggestions from you guys.
- What should we do for the per-cpu blocking list during vcpu hotplug?
- What should we do for the per-cpu blocking list during pcpu hotplug?

Feng Wu (3):
  VMX: Properly adjust the status of pi descriptor
  VMX: Make hook pi_do_resume always available
  VMX: Remove the vcpu from the per-cpu blocking list after domain
termination

 xen/arch/x86/hvm/vmx/vmx.c | 65 +++---
 xen/include/asm-x86/hvm/vmx/vmcs.h |  1 +
 2 files changed, 61 insertions(+), 5 deletions(-)

-- 
2.1.0

