On Tue, May 24, 2011, Tian, Kevin wrote about "RE: [PATCH 08/31] nVMX: Fix
local_vcpus_link handling":
> > +static inline void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
> > +{
> > + vmcs_clear(loaded_vmcs->vmcs);
> > + loaded_vmcs->cpu = -1;
> > + loaded_vmcs->launched = 0;
> > +}
> > +
>
> call it vmcs_init instead since you now remove original vmcs_init invocation,
> which more reflect the necessity of adding VMCLEAR for a new vmcs?
The best name for this function, I think, would have been loaded_vmcs_clear,
because this function isn't necessarily used to "init" - it's also called to
VMCLEAR an old vmcs (and flush its content back to memory) - in that sense
it is definitely not a "vmcs_init".
Unfortunately, I already have a whole chain of functions with this name :(
the existing loaded_vmcs_clear() does an IPI to the CPU which has this VMCS
loaded, and causes it to run __loaded_vmcs_clear(), which in turn calls
the above loaded_vmcs_init(). I wish I could call all three functions
loaded_vmcs_clear(), but of course I can't. If anyone reading this has a good
suggestion on how to name these three functions, please let me know.
> > +static void __loaded_vmcs_clear(void *arg)
> > {
> > - struct vcpu_vmx *vmx = arg;
> > + struct loaded_vmcs *loaded_vmcs = arg;
> > int cpu = raw_smp_processor_id();
> >
> > - if (vmx->vcpu.cpu == cpu)
> > - vmcs_clear(vmx->vmcs);
> > - if (per_cpu(current_vmcs, cpu) == vmx->vmcs)
> > + if (loaded_vmcs->cpu != cpu)
> > + return; /* cpu migration can race with cpu offline */
>
> what do you mean by "cpu migration" here? why does 'cpu offline'
> matter here regarding to the cpu change?
__loaded_vmcs_clear() is typically called in one of two cases: "cpu migration"
means that a guest that used to run on one CPU, and had its VMCS loaded there,
suddenly needs to run on a different CPU, so we need to clear the VMCS on
the old CPU. "cpu offline" means that we want to take a certain CPU offline,
and before we do that we should VMCLEAR all VMCSs which were loaded on it.
The (vmx->cpu.cpu != cpu) case in __loaded_vmcs_clear should ideally never
happen: In the cpu offline path, we only call it for the loaded_vmcss which
we know for sure are loaded on the current cpu. In the cpu migration path,
loaded_vmcs_clear runs __loaded_vmcs_clear on the right CPU, which ensures that
equality.
But, there can be a race condition (this was actually explained to me a while
back by Avi - I never seen this happening in practice): Imagine that cpu
migration calls loaded_vmcs_clear, which tells the old cpu (via IPI) to
VMCLEAR this vmcs. But before that old CPU gets a chance to act on that IPI,
a decision is made to take it offline, and all loaded_vmcs loaded on it
(including the one in question) are cleared. When that CPU acts on this IPI,
it notices that vmx->cpu.cpu==-1, i.e., != cpu, so it doesn't need to do
anything (in the new version of the code, I made this more explicit, by
returning immediately in this case).
At least this is the theory. As I said, I didn't see this problem in practice
(unsurprising, since I never offlined any CPU). Maybe Avi or someone else can
comment more about this (vmx->cpu.cpu == cpu) check, which existed before
my patch - in __vcpu_clear().
> > +static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
> > +{
> > + if (!loaded_vmcs->vmcs)
> > + return;
> > + loaded_vmcs_clear(loaded_vmcs);
> > + free_vmcs(loaded_vmcs->vmcs);
> > + loaded_vmcs->vmcs = NULL;
> > +}
>
> prefer to not do cleanup work through loaded_vmcs since it's just a pointer
> to a loaded vmcs structure. Though you can carefully arrange the nested
> vmcs cleanup happening before it, it's not very clean and a bit error prone
> simply from this function itself. It's clearer to directly cleanup vmcs01, and
> if you want an assertion could be added to make sure loaded_vmcs doesn't
> point to any stale vmcs02 structure after nested cleanup step.
I'm afraid I didn't understand what you meant here... Basically, this
free_loaded_vmcs() is just a shortcut for loaded_vmcs_clear() and free_vmcs(),
as doing both is needed in 3 places: nested_free_vmcs02,
nested_free_all_saved_vmcss, vmx_free_vcpu. The same function is needed
for both vmcs01 and vmcs02 VMCSs - in both cases when we don't need them any
more we need to VMCLEAR them and then free the VMCS memory. Note that this
function does *not* free the loaded_vmcs structure itself.
What's wrong with this?
Would you prefer that I remove this function and explictly call
loaded_vmcs_clear() and then free_vmcs() in all three places?
Thanks,
Nadav.
--
Nadav Har'El | Tuesday, May 24 2011, 20 Iyyar 5771
[email protected] |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Linux: Because a PC is a terrible thing
http://nadav.harel.org.il |to waste.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html