Re: [patch 3/3] arch_rebalance_pgtables call
On Thu, 2007-11-15 at 09:07 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2007-11-14 at 12:49 +0100, Martin Schwidefsky wrote:
> >
> > They all either have an arch override, call get_unmapped_area again or
> > are not relevant. So it should be possible to do the upgrade in
> > arch_get_unmapped_area. I still have my doubts though, all future uses
> > of the get_unmapped_area pointer have to be checked and I feel it is
> > easier to understand to do the upgrade / rebalance of the page table at
> > the end of get_unmapped_area where every caller of mmap is guaranteed to
> > pass through.
>
> Well, if something does what you are worried about, then it would be
> broken on powerpc as well (among others). We have various constraints on
> the address space layout that must be handled by our arch g_u_a (or our
> hugetlb one).

Ok, I rearranged the dynamic page tables code (and fixed the bug that
31 bit processes had a 3 level page table instead of 2). It is working
fine with s390 specific versions of arch_get_unmapped_area and
arch_get_unmapped_area_topdown which do the page table upgrade. Which
means we can drop the arch_rebalance_pgtables-call.patch from -mm again.

--
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 3/3] arch_rebalance_pgtables call
On Wed, 2007-11-14 at 12:49 +0100, Martin Schwidefsky wrote:
>
> They all either have an arch override, call get_unmapped_area again or
> are not relevant. So it should be possible to do the upgrade in
> arch_get_unmapped_area. I still have my doubts though, all future uses
> of the get_unmapped_area pointer have to be checked and I feel it is
> easier to understand to do the upgrade / rebalance of the page table at
> the end of get_unmapped_area where every caller of mmap is guaranteed to
> pass through.

Well, if something does what you are worried about, then it would be
broken on powerpc as well (among others). We have various constraints on
the address space layout that must be handled by our arch g_u_a (or our
hugetlb one).

Ben.
Re: [patch 3/3] arch_rebalance_pgtables call
On Wed, 2007-11-14 at 21:06 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2007-11-14 at 10:26 +0100, Martin Schwidefsky wrote:
> > That patch allows processes to have different number of page table
> > levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
> > have 3 levels (4TB) and really big 64 bit processes can have 4 levels
> > (8PB). The downgrade of a page table to use less levels than the parent
> > process is done in arch_pick_mmap_layout. The upgrade is done by using
> > the arch_rebalance_pgtables call. I've considered using the
> > arch_get_unmapped_area but got scared by the indirection in
> > get_unmapped_area:
> >
> > 	get_area = current->mm->get_unmapped_area;
> > 	if (file && file->f_op && file->f_op->get_unmapped_area)
> > 		get_area = file->f_op->get_unmapped_area;
> > 	addr = get_area(file, addr, len, pgoff, flags);
>
> Don't be, it's really only hugetlb and other arch specific stuff that
> hook in here on platforms with an MMU (It's also used by /dev/mem etc...
> for mmu-less platforms but you don't care).

I find 8 places where a get_unmapped_area function pointer is used:

ipc/shm.c: shm_get_unmapped_area / shm_file_operations
drivers/char/mem.c: get_unmapped_area_mem / mem_fops & kmem_fops
drivers/video/fbmem.c: get_fb_unmapped_area / fb_fops
drivers/pci/proc.c: get_pci_unmapped_area / proc_bus_pci_operations
fs/hugetlbfs/inode.c: hugetlb_get_unmapped_area / hugetlbfs_file_operations
fs/bad_inode.c: bad_file_get_unmapped_area / bad_file_ops
fs/ramfs/file-nommu.c: ramfs_nommu_get_unmapped_area / ramfs_file_operations
arch/powerpc/platforms/cell/spufs/file.c: spufs_get_unmapped_area / spufs_mem_fops

They all either have an arch override, call get_unmapped_area again or
are not relevant. So it should be possible to do the upgrade in
arch_get_unmapped_area.

I still have my doubts though, all future uses of the get_unmapped_area
pointer have to be checked and I feel it is easier to understand to do
the upgrade / rebalance of the page table at the end of get_unmapped_area
where every caller of mmap is guaranteed to pass through.

--
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.
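The argument above, that a hook placed at the end of get_unmapped_area is reached no matter which function pointer was selected, can be sketched in plain userspace C. This is only an illustration: every `demo_*` type and function is a hypothetical stand-in invented here, the argument list is shortened to (addr, len), and none of it is the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; names are invented. */
typedef unsigned long (*demo_get_area_fn)(unsigned long addr, unsigned long len);

struct demo_file_ops { demo_get_area_fn get_unmapped_area; };
struct demo_file     { const struct demo_file_ops *f_op; };
struct demo_mm       { demo_get_area_fn get_unmapped_area; };

/* Counts how often the trailing hook ran. */
static int demo_rebalance_calls;

/* Stand-in for the hook: records the call and returns addr unchanged. */
static unsigned long demo_rebalance(unsigned long addr, unsigned long len)
{
	(void)len;
	demo_rebalance_calls++;
	return addr;
}

/* Stand-in for the per-mm (architecture) address picker. */
static unsigned long demo_arch_get_area(unsigned long addr, unsigned long len)
{
	(void)len;
	return addr ? addr : 0x10000UL;
}

/* Stand-in for a file-specific picker, e.g. one supplied by a driver. */
static unsigned long demo_driver_get_area(unsigned long addr, unsigned long len)
{
	(void)addr;
	(void)len;
	return 0x40000000UL;
}

/* Same selection logic as the snippet quoted in the mail, then the hook. */
static unsigned long demo_get_unmapped_area(struct demo_mm *mm,
					    struct demo_file *file,
					    unsigned long addr,
					    unsigned long len)
{
	demo_get_area_fn get_area;

	assert(mm && mm->get_unmapped_area);
	get_area = mm->get_unmapped_area;
	if (file && file->f_op && file->f_op->get_unmapped_area)
		get_area = file->f_op->get_unmapped_area;
	addr = get_area(addr, len);

	/* Every path ends here, whichever picker was chosen. */
	return demo_rebalance(addr, len);
}
```

Calling `demo_get_unmapped_area` once without a file and once with a file whose `f_op` supplies its own picker runs `demo_rebalance` both times. That is the property the mail is after when it prefers the hook at the end of get_unmapped_area over auditing each picker.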
Re: [patch 3/3] arch_rebalance_pgtables call
On Wed, 2007-11-14 at 10:26 +0100, Martin Schwidefsky wrote:
> That patch allows processes to have different number of page table
> levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
> have 3 levels (4TB) and really big 64 bit processes can have 4 levels
> (8PB). The downgrade of a page table to use less levels than the parent
> process is done in arch_pick_mmap_layout. The upgrade is done by using
> the arch_rebalance_pgtables call. I've considered using the
> arch_get_unmapped_area but got scared by the indirection in
> get_unmapped_area:
>
> 	get_area = current->mm->get_unmapped_area;
> 	if (file && file->f_op && file->f_op->get_unmapped_area)
> 		get_area = file->f_op->get_unmapped_area;
> 	addr = get_area(file, addr, len, pgoff, flags);

Don't be, it's really only hugetlb and other arch specific stuff that
hook in here on platforms with an MMU (It's also used by /dev/mem etc...
for mmu-less platforms but you don't care).

Ben.
Re: [patch 3/3] arch_rebalance_pgtables call
On Tue, 2007-11-13 at 23:33 +1100, Nick Piggin wrote:
> On Tuesday 13 November 2007 01:30, [EMAIL PROTECTED] wrote:
> > From: Martin Schwidefsky <[EMAIL PROTECTED]>
> >
> > In order to change the layout of the page tables after an mmap has
> > crossed the address space limit of the current page table layout an
> > architecture hook in get_unmapped_area is needed. The arguments
> > are the address of the new mapping and the length of it.
>
> Can you comment what this is supposed to be for somewhere?

This hook is going to be used by the dynamic page table patch for s390:
http://marc.info/?l=linux-mm&m=119333667710539&w=2

That patch allows processes to have different number of page table
levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
have 3 levels (4TB) and really big 64 bit processes can have 4 levels
(8PB). The downgrade of a page table to use less levels than the parent
process is done in arch_pick_mmap_layout. The upgrade is done by using
the arch_rebalance_pgtables call. I've considered using the
arch_get_unmapped_area but got scared by the indirection in
get_unmapped_area:

	get_area = current->mm->get_unmapped_area;
	if (file && file->f_op && file->f_op->get_unmapped_area)
		get_area = file->f_op->get_unmapped_area;
	addr = get_area(file, addr, len, pgoff, flags);

--
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.
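The level limits given above (2 levels up to 2GB, 3 levels up to 4TB, 4 levels up to 8PB) imply a simple rule for when an upgrade is due. The following is a minimal userspace sketch of that rule only, assuming the limits are exactly 2^31, 2^42 and 2^53 bytes. The helper names `levels_needed` and `upgrade_levels` are hypothetical and are not taken from the s390 patch.

```c
#include <assert.h>

/* Address-space limits quoted in the mail: 2 GB, 4 TB and 8 PB. */
#define LIMIT_2_LEVELS (1ULL << 31)
#define LIMIT_3_LEVELS (1ULL << 42)
#define LIMIT_4_LEVELS (1ULL << 53)

/*
 * Hypothetical helper: smallest number of page table levels whose limit
 * covers the mapping [addr, addr + len). Returns 0 if the mapping is
 * empty or does not fit even with 4 levels.
 */
static int levels_needed(unsigned long long addr, unsigned long long len)
{
	unsigned long long end;

	if (len == 0 || addr >= LIMIT_4_LEVELS || len > LIMIT_4_LEVELS - addr)
		return 0;
	end = addr + len;
	if (end <= LIMIT_2_LEVELS)
		return 2;
	if (end <= LIMIT_3_LEVELS)
		return 3;
	return 4;
}

/*
 * Hypothetical helper: the upgrade path only ever adds levels, so the
 * result is never below the current level count. Returns 0 on failure.
 */
static int upgrade_levels(int current_levels, unsigned long long addr,
			  unsigned long long len)
{
	int needed;

	assert(current_levels >= 2 && current_levels <= 4);
	needed = levels_needed(addr, len);
	if (needed == 0)
		return 0;
	return needed > current_levels ? needed : current_levels;
}
```

For example, a mapping that ends just above 2^31 in a 2 level process would call for 3 levels, while a small mapping in a 3 level process leaves the level count alone, since the downgrade is described as happening elsewhere (in arch_pick_mmap_layout).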
Re: [patch 3/3] arch_rebalance_pgtables call
On Tuesday 13 November 2007 01:30, [EMAIL PROTECTED] wrote:
> From: Martin Schwidefsky <[EMAIL PROTECTED]>
>
> In order to change the layout of the page tables after an mmap has
> crossed the address space limit of the current page table layout an
> architecture hook in get_unmapped_area is needed. The arguments
> are the address of the new mapping and the length of it.

Can you comment what this is supposed to be for somewhere?

> Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
> ---
>
>  mm/mmap.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/mm/mmap.c
> ===================================================================
> --- linux-2.6.orig/mm/mmap.c
> +++ linux-2.6/mm/mmap.c
> @@ -36,6 +36,10 @@
>  #define arch_mmap_check(addr, len, flags)	(0)
>  #endif
>
> +#ifndef arch_rebalance_pgtables
> +#define arch_rebalance_pgtables(addr, len)	(addr)
> +#endif
> +
>  static void unmap_region(struct mm_struct *mm,
>  		struct vm_area_struct *vma, struct vm_area_struct *prev,
>  		unsigned long start, unsigned long end);
> @@ -1436,7 +1440,7 @@ get_unmapped_area(struct file *file, uns
>  	if (addr & ~PAGE_MASK)
>  		return -EINVAL;
>
> -	return addr;
> +	return arch_rebalance_pgtables(addr, len);
>  }
>
>  EXPORT_SYMBOL(get_unmapped_area);
[patch 3/3] arch_rebalance_pgtables call
From: Martin Schwidefsky <[EMAIL PROTECTED]>

In order to change the layout of the page tables after an mmap has
crossed the address space limit of the current page table layout an
architecture hook in get_unmapped_area is needed. The arguments
are the address of the new mapping and the length of it.

Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 mm/mmap.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/mmap.c
===================================================================
--- linux-2.6.orig/mm/mmap.c
+++ linux-2.6/mm/mmap.c
@@ -36,6 +36,10 @@
 #define arch_mmap_check(addr, len, flags)	(0)
 #endif
 
+#ifndef arch_rebalance_pgtables
+#define arch_rebalance_pgtables(addr, len)	(addr)
+#endif
+
 static void unmap_region(struct mm_struct *mm,
 		struct vm_area_struct *vma, struct vm_area_struct *prev,
 		unsigned long start, unsigned long end);
@@ -1436,7 +1440,7 @@ get_unmapped_area(struct file *file, uns
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
 
-	return addr;
+	return arch_rebalance_pgtables(addr, len);
 }
 
 EXPORT_SYMBOL(get_unmapped_area);

--
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.
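The `#ifndef` in the patch is a compile-time default: an architecture that defines arch_rebalance_pgtables before mm/mmap.c is compiled replaces the no-op. The sketch below imitates both cases in a single userspace file, using `#undef` to switch between them, which is an artifact of the demo and not how the kernel would do it. `demo_arch_rebalance`, `DEMO_LIMIT` and the counter are hypothetical names invented for illustration.

```c
#include <assert.h>

/* Generic fallback, as in the patch: without an architecture definition
 * the hook simply evaluates to the address it was given. */
#ifndef arch_rebalance_pgtables
#define arch_rebalance_pgtables(addr, len)	(addr)
#endif

/* Caller built against the generic fallback. */
static unsigned long generic_result(unsigned long addr, unsigned long len)
{
	(void)len;	/* the default macro ignores len */
	return arch_rebalance_pgtables(addr, len);
}

/* Hypothetical architecture side: count mappings that end above 2 GB. */
#define DEMO_LIMIT (1UL << 31)
static unsigned long demo_upgrades;

static unsigned long demo_arch_rebalance(unsigned long addr, unsigned long len)
{
	assert(len != 0);
	if (addr + len > DEMO_LIMIT)
		demo_upgrades++;	/* a real hook would grow the page table here */
	return addr;
}

/* In the kernel this definition would come from an architecture header
 * seen before the #ifndef; the #undef exists only for this one-file demo. */
#undef arch_rebalance_pgtables
#define arch_rebalance_pgtables(addr, len)	demo_arch_rebalance(addr, len)

/* Caller built against the architecture override. */
static unsigned long arch_result(unsigned long addr, unsigned long len)
{
	return arch_rebalance_pgtables(addr, len);
}
```

Both callers return the address unchanged. Only `arch_result` has a side effect, and only for a mapping that crosses `DEMO_LIMIT`.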