Yes, that can absolutely happen. But for iomem we would have an explicit call
to ioremap(), ioremap_wc() or ioremap_cache() before anybody would map
anything into userspace page tables.

But thinking more about it I just had an OMFG moment! Is it possible that the 
PAT currently already has a problem with that?

We had customer projects where BARs of different PCIe devices ended up on 
different physical addresses after a hot remove/re-add.

Is it possible that the PAT keeps enforcing certain caching attributes for a
physical address? For example, because a driver doesn't clean up properly
on hot remove?
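
To make the suspicion concrete, here is a minimal userspace toy model, not kernel code. All names (toy_reserve(), toy_free(), the TOY_* modes) are made up for illustration, and I am not claiming the real memtype tracking behaves exactly like this. It only shows the failure mode I have in mind: a reservation that is never freed keeps blocking a different caching mode on the same physical range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_mode { TOY_WB, TOY_WC, TOY_UC };

struct toy_res {
	uint64_t start, end;	/* physical range [start, end) */
	enum toy_mode mode;
	bool used;
};

#define TOY_MAX 8
static struct toy_res toy_table[TOY_MAX];

/* Returns 0 on success, -1 on a conflicting overlap, -2 if the table is full. */
static int toy_reserve(uint64_t start, uint64_t end, enum toy_mode mode)
{
	size_t i;

	assert(start < end);

	/* Any overlapping entry with a different mode blocks the request. */
	for (i = 0; i < TOY_MAX; i++) {
		const struct toy_res *r = &toy_table[i];

		if (r->used && start < r->end && r->start < end && r->mode != mode)
			return -1;
	}

	/* No conflict: record the reservation in the first free slot. */
	for (i = 0; i < TOY_MAX; i++) {
		if (!toy_table[i].used) {
			toy_table[i] = (struct toy_res){ start, end, mode, true };
			return 0;
		}
	}
	return -2;
}

/* Drops every entry that exactly matches [start, end). */
static void toy_free(uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < TOY_MAX; i++) {
		if (toy_table[i].used && toy_table[i].start == start &&
		    toy_table[i].end == end)
			toy_table[i].used = false;
	}
}
```

In this model, a driver that reserves a range as WC and goes away without calling toy_free() leaves an entry behind, so a different device whose BAR later lands on an overlapping address cannot get UC there until the stale entry is dropped.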

If yes, then that would explain a massive number of problems we had with hot
add/remove.

The code is a mess, so if a driver messed up, likely everything is possible.

TBH, the more I look at this all, the more WTF moments I am having.


What I am currently wondering is: assume we get a
pfnmap_setup_cachemode_pfn() call and we could reliably identify whether
there was a previous registration, then we could do

(a) No previous registration: don't modify pgprot. Hopefully the driver
       knows what it is doing. Maybe we can add sanity checks that the
       direct map was already updated etc.
(b) A previous registration: modify pgprot like we do today.

That would work for me.
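
The (a)/(b) split could be sketched roughly as follows. This is a standalone model, not a patch: the enum, struct prev_registration and pick_cache_mode() are hypothetical names, and it assumes the "was there a previous registration?" lookup can be made reliable, which is exactly the open question.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the page cache modes; not the kernel's enum. */
enum cm { CM_WB, CM_WC, CM_UC_MINUS, CM_UC };

/* Result of a (hypothetical) lookup for an earlier registration of the PFN. */
struct prev_registration {
	bool found;	/* was the range registered before? */
	enum cm mode;	/* mode recorded at registration, valid if found */
};

/*
 * (a) No previous registration: leave the caller's choice untouched.
 * (b) Previous registration: the recorded mode wins, as it does today.
 */
static enum cm pick_cache_mode(enum cm requested, struct prev_registration prev)
{
	if (!prev.found)
		return requested;
	return prev.mode;
}
```

The sanity checks mentioned in (a), such as verifying that the direct map was already updated, would sit in the !prev.found branch; they are left out here because what exactly they could check is not settled.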

System RAM is the problem. I wonder how many of these registrations we
really get and if we could just store them in the same tree as !system
RAM instead of abusing page flags.

commit 9542ada803198e6eba29d3289abb39ea82047b92
Author: Suresh Siddha <suresh.b.sid...@intel.com>
Date:   Wed Sep 24 08:53:33 2008 -0700

     x86: track memtype for RAM in page struct

     Track the memtype for RAM pages in page struct instead of using the
     memtype list. This avoids the explosion in the number of entries in
     memtype list (of the order of 20,000 with AGP) and makes the PAT
     tracking simpler.

     We are using PG_arch_1 bit in page->flags.

     We still use the memtype list for non RAM pages.


I do wonder if that explosion is still an issue today.

Yes it is. That is exactly the issue I'm working on here.

It's just that AGP was replaced by internal GPU MMUs over time and so we don't 
use the old AGP code any more but just call get_free_pages() (or similar) 
directly.

Okay, I thought I was slowly understanding how it works, but then I stumbled over the set_memory_uc() / set_memory_wc() implementation and now I am *all confused*.

I mean, that does perform a PAT reservation.

But when is that reservation ever freed again? :/

How can set_memory_wc() followed by set_memory_uc() possibly work? I am pretty sure I am missing a piece of the puzzle.

I think you mentioned that set_memory_uc() is avoided by drivers because of the highmem mess, but what are drivers using instead to modify the direct map?

--
Cheers

David / dhildenb
