ChangeSet 1.2231.1.176, 2005/03/28 20:05:21-08:00, [EMAIL PROTECTED]

        [PATCH] arch hook for notifying changes in PTE protection bits
        
         Recently on IA-64, we found an issue where stale data could be used
         by applications.  The sequence of operations, which includes a few
         mprotect() calls from user space (glibc), goes like this (a minimal
         userspace sketch follows the list):
        
         1- The text region of an executable is mmapped using
            PROT_READ|PROT_EXEC.  As a result, a shared page is allocated
            to the user.
        
         2- The user then requests that the text region be mprotected with
            PROT_READ|PROT_WRITE.  The kernel removes the execute permission
            and leaves the read permission on the text region.
        
         3- A subsequent write by the user results in a page fault and
            eventually a COW break.  The user gets a new private copy of
            the page.  At this point the kernel marks the new page for a
            deferred flush.
        
         4- The user then requests that the text region be mprotected back
            to PROT_READ|PROT_EXEC.  The mprotect support code in the kernel
            flushes the caches, updates the PTEs and then flushes the TLBs.
            However, after updating the PTEs with the new permissions, we
            don't let the arch-specific code know about the new mappings
            (through an update_mmu_cache-like routine).  IA-64 typically
            uses update_mmu_cache to check the deferred-flush flag (set in
            step 3) and maintain cache coherency lazily (the local I and D
            caches on IA-64 are incoherent).
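
         A minimal userspace sketch of the failing sequence (the file name,
         mapping length, and bytes written are illustrative assumptions, and
         error handling is omitted; in the real case the mapping is the
         executable's own text and glibc issues the mprotect() calls):

            #include <fcntl.h>
            #include <string.h>
            #include <sys/mman.h>
            #include <unistd.h>

            int main(void)
            {
                    size_t len = getpagesize();
                    int fd = open("/tmp/text.bin", O_RDONLY); /* assumed test file */

                    /* 1: map "text" with PROT_READ|PROT_EXEC -> shared page */
                    char *p = mmap(NULL, len, PROT_READ | PROT_EXEC,
                                   MAP_PRIVATE, fd, 0);

                    /* 2: drop exec, add write */
                    mprotect(p, len, PROT_READ | PROT_WRITE);

                    /* 3: the write faults; the COW break gives a private page
                     *    which gets marked for a deferred i-cache flush */
                    memset(p, 0x0f, 16);

                    /* 4: restore PROT_READ|PROT_EXEC; without a notification
                     *    to the arch code here, the i-cache may still hold the
                     *    pre-COW contents when this page is next executed */
                    mprotect(p, len, PROT_READ | PROT_EXEC);

                    munmap(p, len);
                    close(fd);
                    return 0;
            }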
        
        DavidM suggested that we would need to add a hook in the function
        change_pte_range() in mm/mprotect.c.  This lets the architecture
        specific code look at the new PTEs and decide whether it needs to
        update any other architectural/kernel state based on the updated
        (new permissions) PTE values.
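
        In context, the hook is a no-op unless an architecture opts in;
        roughly (condensed from the asm-generic and mm/mprotect.c hunks
        below):

            /* include/asm-generic/pgtable.h: no cost for architectures
             * that don't need the notification */
            #ifndef __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
            #define lazy_mmu_prot_update(pte)       do { } while (0)
            #endif

            /* mm/mprotect.c, change_pte_range(): notify the arch after the
             * PTE with the new permissions is in place */
            ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
            set_pte_at(mm, addr, pte, ptent);
            lazy_mmu_prot_update(ptent);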
        
        We have added a new hook, lazy_mmu_prot_update(pte_t), that gets
        called when the protection bits in PTEs change.  This hook gives
        arch-specific code an opportunity to do whatever is needed.  On
        IA-64 it will be used to lazily make the I and D caches coherent.
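
        The pattern a hypothetical architecture <arch> follows to opt in
        looks roughly like the sketch below; the pte_exec() test, PG_arch_1
        bookkeeping and range flush mirror the pre-existing IA-64
        update_mmu_cache() body that this patch renames, so treat it as an
        illustration rather than the exact committed code:

            /* include/asm-<arch>/pgtable.h */
            #define __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
            extern void lazy_mmu_prot_update (pte_t pte);

            /* arch/<arch>/mm/init.c */
            void
            lazy_mmu_prot_update (pte_t pte)
            {
                    unsigned long addr;
                    struct page *page;

                    if (!pte_exec(pte))
                            return;         /* not an executable page */

                    page = pte_page(pte);
                    addr = (unsigned long) page_address(page);

                    if (test_bit(PG_arch_1, &page->flags))
                            return;         /* i-cache already coherent */

                    flush_icache_range(addr, addr + PAGE_SIZE);
                    set_bit(PG_arch_1, &page->flags);   /* mark page clean */
            }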
        
        Signed-off-by: David Mosberger <[EMAIL PROTECTED]> 
        Signed-off-by: Rohit Seth <[EMAIL PROTECTED]>
        Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
        Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>



 Documentation/cachetlb.txt      |    5 +++++
 arch/ia64/hp/common/sba_iommu.c |    2 +-
 arch/ia64/lib/swiotlb.c         |    2 +-
 arch/ia64/mm/init.c             |    3 +--
 include/asm-generic/pgtable.h   |    4 ++++
 include/asm-ia64/pgtable.h      |    5 ++++-
 mm/memory.c                     |    6 ++++++
 mm/mprotect.c                   |    5 +++--
 8 files changed, 25 insertions(+), 7 deletions(-)


diff -Nru a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
--- a/Documentation/cachetlb.txt        2005-03-28 21:45:18 -08:00
+++ b/Documentation/cachetlb.txt        2005-03-28 21:45:18 -08:00
@@ -142,6 +142,11 @@
        The ia64 sn2 platform is one example of a platform
        that uses this interface.
 
+8) void lazy_mmu_prot_update(pte_t pte)
+       This interface is called whenever the protection on
+       any user PTE changes.  This interface provides a notification
+       to architecture specific code to take appropriate action.
+
 
 Next, we have the cache flushing interfaces.  In general, when Linux
 is changing an existing virtual-->physical mapping to a new value,
diff -Nru a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
--- a/arch/ia64/hp/common/sba_iommu.c   2005-03-28 21:45:18 -08:00
+++ b/arch/ia64/hp/common/sba_iommu.c   2005-03-28 21:45:18 -08:00
@@ -762,7 +762,7 @@
 #ifdef ENABLE_MARK_CLEAN
 /**
  * Since DMA is i-cache coherent, any (complete) pages that were written via
- * DMA can be marked as "clean" so that update_mmu_cache() doesn't have to
+ * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
  * flush them when they get mapped into an executable vm-area.
  */
 static void
diff -Nru a/arch/ia64/lib/swiotlb.c b/arch/ia64/lib/swiotlb.c
--- a/arch/ia64/lib/swiotlb.c   2005-03-28 21:45:18 -08:00
+++ b/arch/ia64/lib/swiotlb.c   2005-03-28 21:45:18 -08:00
@@ -444,7 +444,7 @@
 
 /*
  * Since DMA is i-cache coherent, any (complete) pages that were written via
- * DMA can be marked as "clean" so that update_mmu_cache() doesn't have to
+ * DMA can be marked as "clean" so that lazy_mmu_prot_update() doesn't have to
  * flush them when they get mapped into an executable vm-area.
  */
 static void
diff -Nru a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
--- a/arch/ia64/mm/init.c       2005-03-28 21:45:18 -08:00
+++ b/arch/ia64/mm/init.c       2005-03-28 21:45:18 -08:00
@@ -76,7 +76,7 @@
 }
 
 void
-update_mmu_cache (struct vm_area_struct *vma, unsigned long vaddr, pte_t pte)
+lazy_mmu_prot_update (pte_t pte)
 {
        unsigned long addr;
        struct page *page;
@@ -85,7 +85,6 @@
                return;                         /* not an executable page... */
 
        page = pte_page(pte);
-       /* don't use VADDR: it may not be mapped on this CPU (or may have just been flushed): */
        addr = (unsigned long) page_address(page);
 
        if (test_bit(PG_arch_1, &page->flags))
diff -Nru a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
--- a/include/asm-generic/pgtable.h     2005-03-28 21:45:18 -08:00
+++ b/include/asm-generic/pgtable.h     2005-03-28 21:45:18 -08:00
@@ -135,6 +135,10 @@
 #define pgd_offset_gate(mm, addr)      pgd_offset(mm, addr)
 #endif
 
+#ifndef __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
+#define lazy_mmu_prot_update(pte)      do { } while (0)
+#endif
+
 /*
  * When walking page tables, get the address of the next boundary, or
  * the end address of the range if that comes earlier.  Although end might
diff -Nru a/include/asm-ia64/pgtable.h b/include/asm-ia64/pgtable.h
--- a/include/asm-ia64/pgtable.h        2005-03-28 21:45:18 -08:00
+++ b/include/asm-ia64/pgtable.h        2005-03-28 21:45:18 -08:00
@@ -411,6 +411,8 @@
        return pte_val(a) == pte_val(b);
 }
 
+#define update_mmu_cache(vma, address, pte) do { } while (0)
+
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 extern void paging_init (void);
 
@@ -479,7 +481,7 @@
  * information.  However, we use this routine to take care of any (delayed) i-cache
  * flushing that may be necessary.
  */
-extern void update_mmu_cache (struct vm_area_struct *vma, unsigned long vaddr, pte_t pte);
+extern void lazy_mmu_prot_update (pte_t pte);
 
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 /*
@@ -557,6 +559,7 @@
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 #define __HAVE_ARCH_PTE_SAME
 #define __HAVE_ARCH_PGD_OFFSET_GATE
+#define __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
 
 /*
  * Override for pgd_addr_end() to deal with the virtual address space holes
diff -Nru a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c       2005-03-28 21:45:18 -08:00
+++ b/mm/memory.c       2005-03-28 21:45:18 -08:00
@@ -1134,6 +1134,7 @@
                              vma);
        ptep_establish(vma, address, page_table, entry);
        update_mmu_cache(vma, address, entry);
+       lazy_mmu_prot_update(entry);
 }
 
 /*
@@ -1186,6 +1187,7 @@
                                              vma);
                        ptep_set_access_flags(vma, address, page_table, entry, 1);
                        update_mmu_cache(vma, address, entry);
+                       lazy_mmu_prot_update(entry);
                        pte_unmap(page_table);
                        spin_unlock(&mm->page_table_lock);
                        return VM_FAULT_MINOR;
@@ -1648,6 +1650,7 @@
 
        /* No need to invalidate - it was non-present before */
        update_mmu_cache(vma, address, pte);
+       lazy_mmu_prot_update(pte);
        pte_unmap(page_table);
        spin_unlock(&mm->page_table_lock);
 out:
@@ -1705,6 +1708,7 @@
 
        /* No need to invalidate - it was non-present before */
        update_mmu_cache(vma, addr, entry);
+       lazy_mmu_prot_update(entry);
        spin_unlock(&mm->page_table_lock);
 out:
        return VM_FAULT_MINOR;
@@ -1830,6 +1834,7 @@
 
        /* no need to invalidate: a not-present page shouldn't be cached */
        update_mmu_cache(vma, address, entry);
+       lazy_mmu_prot_update(entry);
        spin_unlock(&mm->page_table_lock);
 out:
        return ret;
@@ -1924,6 +1929,7 @@
        entry = pte_mkyoung(entry);
        ptep_set_access_flags(vma, address, pte, entry, write_access);
        update_mmu_cache(vma, address, entry);
+       lazy_mmu_prot_update(entry);
        pte_unmap(pte);
        spin_unlock(&mm->page_table_lock);
        return VM_FAULT_MINOR;
diff -Nru a/mm/mprotect.c b/mm/mprotect.c
--- a/mm/mprotect.c     2005-03-28 21:45:18 -08:00
+++ b/mm/mprotect.c     2005-03-28 21:45:18 -08:00
@@ -39,8 +39,9 @@
                         * bits by wiping the pte and then setting the new pte
                         * into place.
                         */
-                       ptent = ptep_get_and_clear(mm, addr, pte);
-                       set_pte_at(mm, addr, pte, pte_modify(ptent, newprot));
+                       ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
+                       set_pte_at(mm, addr, pte, ptent);
+                       lazy_mmu_prot_update(ptent);
                }
        } while (pte++, addr += PAGE_SIZE, addr != end);
        pte_unmap(pte - 1);