1) We use another interface that consumes pages instead of PFNs, like a
    vm_insert_pages_pgprot() we would be adding.

    Is there any strong requirement for inserting non-refcounted PFNs?

Yes, there is a strong requirement to insert non-refcounted PFNs.

We had a lot of trouble with KVM people trying to grab a reference to those 
pages even if the VMA had the VM_PFNMAP flag set.

Yes, KVM ignored (and maybe still does) VM_PFNMAP to some degree, which is rather nasty.


2) We add another interface that consumes PFNs, but explicitly states
    that it is only for ordinary system RAM, and that the user is
    required to update the direct map.

    We could sanity-check the direct map in debug kernels.

I would rather see vmf_insert_pfn_prot() fixed instead.

That function was explicitly added to insert the PFN with the given attributes 
and as far as I can see all users of that function expect exactly that.

It's all a bit tricky :(



3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
    system RAM differently.


There is also the option for a mixture between 1 and 2, where we get pages, but 
we map them non-refcounted in a VM_PFNMAP.

In general, having pages makes it easier to assert that they are likely 
ordinary system RAM pages, and that the interface is not getting abused for 
something else.

Well, that's exactly the use case here, and it is not abusive at all as far as 
I can see.

What drivers want is to insert a PFN with a certain set of caching attributes 
regardless of whether it's system memory or iomem. That's why 
vmf_insert_pfn_prot() was created in the first place.

I mean, the use case of "allocate pages from the buddy and fix up the linear map" sounds perfectly reasonable to me. Absolutely no reason to get PAT involved. Nobody else should be messing with that memory after all.

As soon as we are talking about other memory ranges (iomem) that are not from the buddy, it gets weird to bypass PAT, and the question I am asking myself is when that is okay and when it is not.


That drivers need to call set_pages_wc/uc() for the linear mapping on x86 
manually is correct, and checking that is clearly a good idea for debug kernels.

I'll have to think about this a bit: assuming only vmf_insert_pfn() calls pfnmap_setup_cachemode_pfn() but vmf_insert_pfn_prot() doesn't, how could we sanity-check whether somebody is doing something against the will of PAT?

--
Cheers

David / dhildenb
