On Mon, Oct 13, 2014 at 05:52:38AM -0300, Marcelo Tosatti wrote:
> On Fri, Oct 10, 2014 at 04:09:29PM +0300, Gleb Natapov wrote:
> > On Wed, Oct 08, 2014 at 04:22:31PM -0300, Marcelo Tosatti wrote:
> > > > >
> > > > > Argh, let's try again:
> > > > >
> > > > > skip_pinned = true
> > > > > ------------------
> > > > >
> > > > > mark page dirty, keep spte intact
> > > > >
> > > > > called from get dirty log path.
> > > > >
> > > > > skip_pinned = false
> > > > > -------------------
> > > > > reload remote mmu
> > > > > destroy pinned spte.
> > > > >
> > > > > called from: dirty log enablement, rmap write protect (unused for
> > > > > pinned sptes)
> > > > >
> > > > >
> > > > > Note this behaviour is your suggestion:
> > > >
> > > > Yes, I remember that, and I thought we would not need this skip_pinned
> > > > at all. For the rmap write protect case there shouldn't be any pinned
> > > > pages, but why does dirty log enablement set skip_pinned to false? Why
> > > > not mark pinned pages as dirty just like you do in the get dirty log path?
> > >
> > > Because if it's a large spte, it must be nuked (or marked read-only,
> > > which, for pinned sptes, is not possible).
> > >
> > If a large page has one small page pinned inside it, its spte will
> > be marked as pinned, correct?
>
> Correct.
>
> > We did nuke large ptes here until very
> > recently (c126d94f2c90ed9d), but we cannot drop a pte here anyway without
> > kicking all vcpus out of guest mode. Do you need the additional skip_pinned
> > parameter, though? Why not check whether the spte is large instead?
>
> Nuke only if large spte is found? Can do that, instead.
>
> > So why not have a per-slot pinned page list (Xiao suggested the same) and do:
>
> The interface is per-vcpu (that is registration of pinned pages is
> performed on a per-vcpu basis).
>
PEBS is per cpu, but that does not mean that pinning should be per cpu; it
can be done globally with ref counting.
> > spte_write_protect() {
> >     if (is_pinned(spte)) {
> >         if (large(spte))
> >             // cannot drop while vcpus are running
> >             mmu_reload_pinned_vcpus();
> >         else
> >             return false;
> >     }
> >     ...
> > }
> >
> > get_dirty_log() {
> >     for_each(pinned pages i)
> >         mark_dirty(i);
> > }
>
> That is effectively the same as what this patchset does, except that the
> spte pinned bit is checked at spte_write_protect, instead of looping over
> the pinned page list. I fail to see a huge advantage there.
>
I think spte_write_protect is a strange place to mark pages dirty, but
otherwise yes, the effect is the same, so definitely not a huge difference.
If a global pinned list is a PITA in your opinion, leave it as is.
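
For reference, a rough userspace sketch of the global (ref-counted) pinned
list idea discussed above, with the get-dirty-log path walking the list.
Every name here (struct pin_table, pin_page, unpin_page, mark_pinned_dirty,
MAX_PINNED) is hypothetical and made up for illustration; none of it is KVM
code, and a real version would need locking and a proper per-slot structure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PINNED 16	/* arbitrary limit, for the sketch only */

/* One entry per pinned guest frame, shared by all vcpus that pinned it. */
struct pinned_page {
	uint64_t gfn;
	unsigned int refcount;
};

struct pin_table {
	struct pinned_page pages[MAX_PINNED];
	size_t used;
};

static struct pinned_page *pin_find(struct pin_table *t, uint64_t gfn)
{
	for (size_t i = 0; i < t->used; i++)
		if (t->pages[i].gfn == gfn)
			return &t->pages[i];
	return NULL;
}

static bool page_is_pinned(struct pin_table *t, uint64_t gfn)
{
	return pin_find(t, gfn) != NULL;
}

/* A second vcpu pinning the same gfn only bumps the refcount. */
static bool pin_page(struct pin_table *t, uint64_t gfn)
{
	struct pinned_page *p = pin_find(t, gfn);

	if (p) {
		p->refcount++;
		return true;
	}
	if (t->used == MAX_PINNED)
		return false;
	t->pages[t->used].gfn = gfn;
	t->pages[t->used].refcount = 1;
	t->used++;
	return true;
}

/* The entry goes away only when the last user unpins it. */
static bool unpin_page(struct pin_table *t, uint64_t gfn)
{
	struct pinned_page *p = pin_find(t, gfn);

	if (!p)
		return false;
	if (--p->refcount == 0)
		*p = t->pages[--t->used];	/* move last entry into the hole */
	return true;
}

/* get-dirty-log path: report every pinned page in the slot as dirty. */
static void mark_pinned_dirty(const struct pin_table *t, uint64_t base_gfn,
			      uint64_t *dirty_bitmap, size_t npages)
{
	for (size_t i = 0; i < t->used; i++) {
		uint64_t gfn = t->pages[i].gfn;

		if (gfn < base_gfn || gfn - base_gfn >= npages)
			continue;	/* not in this slot */
		dirty_bitmap[(gfn - base_gfn) / 64] |=
			1ull << ((gfn - base_gfn) % 64);
	}
}
```

With this shape the write-protect path never has to mark anything dirty
itself; the cost is one extra list walk per dirty log request.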
> I'll drop the skip_pinned parameter and use is_large_pte check instead.
>
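
To make the agreed behaviour concrete, here is a minimal standalone model of
a write-protect helper without the skip_pinned parameter, using a large-spte
check instead. The bit positions, the reload counter and the "zero the spte
to nuke it" step are all assumptions of this sketch, not the real KVM spte
layout or the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define SPTE_PRESENT	(1ull << 0)
#define SPTE_WRITABLE	(1ull << 1)
#define SPTE_LARGE	(1ull << 7)
#define SPTE_PINNED	(1ull << 62)

/* Counts simulated requests to kick vcpus with pinned pages out of guest mode. */
static int reload_requests;

static void mmu_reload_pinned_vcpus(void)
{
	reload_requests++;
}

/* Returns true if the spte changed and the caller must flush TLBs. */
static bool spte_write_protect(uint64_t *sptep)
{
	uint64_t spte = *sptep;

	if (!(spte & SPTE_WRITABLE))
		return false;

	if (spte & SPTE_PINNED) {
		/*
		 * Small pinned spte: keep it intact. The page is reported
		 * dirty from the get-dirty-log path instead.
		 */
		if (!(spte & SPTE_LARGE))
			return false;

		/*
		 * Large pinned spte: it cannot be made read-only and cannot
		 * be dropped while vcpus are running, so request a reload
		 * first and then nuke it (modelled here as clearing it).
		 */
		mmu_reload_pinned_vcpus();
		*sptep = 0;
		return true;
	}

	/* Ordinary spte: just clear the writable bit. */
	*sptep = spte & ~SPTE_WRITABLE;
	return true;
}
```

The only pinned case that needs the reload is the large one, which is why a
separate skip_pinned flag adds nothing over the is-large check.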
Thanks,
--
Gleb.