Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-15 Thread Martin Schwidefsky
On Thu, 2007-11-15 at 09:07 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2007-11-14 at 12:49 +0100, Martin Schwidefsky wrote:
> > 
> > They all either have an arch override, call get_unmapped_area again or
> > are not relevant. So it should be possible to do the upgrade in
> > arch_get_unmapped_area. I still have my doubts though, all future uses
> > of the get_unmapped_area pointer have to be checked and I feel it is
> > easier to understand to do the upgrade / rebalance of the page table at
> > the end of get_unmapped_area where every caller of mmap is guaranteed to
> > pass through.
> 
> Well, if something does what you are worried about, then it would be
> broken on powerpc as well (among others). We have various constraints on
> the address space layout that must be handled by our arch g_u_a (or our
> hugetlb one).

Ok, I rearranged the dynamic page tables code (and fixed the bug that 31
bit processes had a 3 level page table instead of 2). It is working fine
with s390 specific versions of arch_get_unmapped_area and
arch_get_unmapped_area_topdown which do the page table upgrade. 

Which means we can drop the arch_rebalance_pgtables-call.patch from -mm
again.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-14 Thread Benjamin Herrenschmidt

On Wed, 2007-11-14 at 12:49 +0100, Martin Schwidefsky wrote:
> 
> They all either have an arch override, call get_unmapped_area again or
> are not relevant. So it should be possible to do the upgrade in
> arch_get_unmapped_area. I still have my doubts though, all future uses
> of the get_unmapped_area pointer have to be checked and I feel it is
> easier to understand to do the upgrade / rebalance of the page table at
> the end of get_unmapped_area where every caller of mmap is guaranteed to
> pass through.

Well, if something does what you are worried about, then it would be
broken on powerpc as well (among others). We have various constraints on
the address space layout that must be handled by our arch g_u_a (or our
hugetlb one).

Ben.




Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-14 Thread Martin Schwidefsky
On Wed, 2007-11-14 at 21:06 +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2007-11-14 at 10:26 +0100, Martin Schwidefsky wrote:
> > That patch allows processes to have different number of page table
> > levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
> > have 3 levels (4TB) and really big 64 bit processes can have 4 levels
> > (8PB). The downgrade of a page table to use less levels than the parent
> > process is done in arch_pick_mmap_layout. The upgrade is done by using
> > the arch_rebalance_pgtables call. I've considered using the
> > arch_get_unmapped_area but got scared by the indirection in
> > get_unmapped_area:
> > 
> > get_area = current->mm->get_unmapped_area;
> > if (file && file->f_op && file->f_op->get_unmapped_area)
> > 	get_area = file->f_op->get_unmapped_area;
> > addr = get_area(file, addr, len, pgoff, flags);
> 
> Don't be, it's really only hugetlb and other arch specific stuff that
> hook in here on platforms with an MMU (It's also used by /dev/mem etc...
> for mmu-less platforms but you don't care).

I find 8 places where a get_unmapped_area function pointer is used:
ipc/shm.c: shm_get_unmapped_area / shm_file_operations
drivers/char/mem.c: get_unmapped_area_mem / mem_fops & kmem_fops
drivers/video/fbmem.c: get_fb_unmapped_area / fb_fops
drivers/pci/proc.c: get_pci_unmapped_area / proc_bus_pci_operations
fs/hugetlbfs/inode.c: hugetlb_get_unmapped_area / hugetlbfs_file_operations
fs/bad_inode.c: bad_file_get_unmapped_area / bad_file_ops
fs/ramfs/file-nommu.c: ramfs_nommu_get_unmapped_area / ramfs_file_operations
arch/powerpc/platforms/cell/spufs/file.c:
    spufs_get_unmapped_area / spufs_mem_fops

They all either have an arch override, call get_unmapped_area again or
are not relevant. So it should be possible to do the upgrade in
arch_get_unmapped_area. I still have my doubts though, all future uses
of the get_unmapped_area pointer have to be checked and I feel it is
easier to understand to do the upgrade / rebalance of the page table at
the end of get_unmapped_area where every caller of mmap is guaranteed to
pass through.
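
To make the trade-off concrete, here is a small userspace model of the
dispatch. It is a sketch only: the struct layouts, model_get_unmapped_area
and the two counters are invented for illustration and are not the kernel's
definitions. A handler supplied by the file replaces the mm default
entirely, so an upgrade placed in the arch default is skipped unless that
handler calls back into it, whereas a hook placed after the dispatch runs
on every path.

```c
#include <stddef.h>

typedef unsigned long (*get_area_fn)(unsigned long addr, unsigned long len);

/* Hypothetical stand-ins for file->f_op and current->mm. */
struct model_file { get_area_fn get_unmapped_area; };
struct model_mm   { get_area_fn get_unmapped_area; };

static int upgrades_in_arch;	/* upgrades done inside the arch default */
static int upgrades_at_end;	/* upgrades done by the end-of-function hook */

/* Arch default: choose an address and do the upgrade here. */
static unsigned long arch_default(unsigned long addr, unsigned long len)
{
	(void)len;
	upgrades_in_arch++;
	return addr;
}

/* A driver handler that picks its own address and never calls the default. */
static unsigned long driver_handler(unsigned long addr, unsigned long len)
{
	(void)len;
	return addr;
}

static unsigned long model_get_unmapped_area(struct model_mm *mm,
					     struct model_file *file,
					     unsigned long addr,
					     unsigned long len)
{
	get_area_fn get_area = mm->get_unmapped_area;

	/* The file-supplied handler, if any, overrides the mm default. */
	if (file && file->get_unmapped_area)
		get_area = file->get_unmapped_area;
	addr = get_area(addr, len);

	upgrades_at_end++;	/* where an end-of-function hook would sit */
	return addr;
}
```

Calling the model once without a file and once with a file that uses
driver_handler leaves upgrades_in_arch at 1 and upgrades_at_end at 2, which
is the gap the audit of the eight handlers above is meant to rule out.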

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-14 Thread Benjamin Herrenschmidt

On Wed, 2007-11-14 at 10:26 +0100, Martin Schwidefsky wrote:
> That patch allows processes to have different number of page table
> levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
> have 3 levels (4TB) and really big 64 bit processes can have 4 levels
> (8PB). The downgrade of a page table to use less levels than the parent
> process is done in arch_pick_mmap_layout. The upgrade is done by using
> the arch_rebalance_pgtables call. I've considered using the
> arch_get_unmapped_area but got scared by the indirection in
> get_unmapped_area:
> 
> get_area = current->mm->get_unmapped_area;
> if (file && file->f_op && file->f_op->get_unmapped_area)
> 	get_area = file->f_op->get_unmapped_area;
> addr = get_area(file, addr, len, pgoff, flags);

Don't be, it's really only hugetlb and other arch specific stuff that
hook in here on platforms with an MMU (It's also used by /dev/mem etc...
for mmu-less platforms but you don't care).

Ben.




Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-14 Thread Martin Schwidefsky

On Tue, 2007-11-13 at 23:33 +1100, Nick Piggin wrote:
> On Tuesday 13 November 2007 01:30, [EMAIL PROTECTED] wrote:
> > From: Martin Schwidefsky <[EMAIL PROTECTED]>
> >
> > In order to change the layout of the page tables after an mmap has
> > crossed the address space limit of the current page table layout, an
> > architecture hook in get_unmapped_area is needed. The arguments
> > are the address of the new mapping and the length of it.
> 
> Can you comment what this is supposed to be for somewhere?

This hook is going to be used by the dynamic page table patch for s390:
http://marc.info/?l=linux-mm&m=119333667710539&w=2

That patch allows processes to have different number of page table
levels, 31 bit processes have 2 levels (2GB), normal 64 bit processes
have 3 levels (4TB) and really big 64 bit processes can have 4 levels
(8PB). The downgrade of a page table to use less levels than the parent
process is done in arch_pick_mmap_layout. The upgrade is done by using
the arch_rebalance_pgtables call. I've considered using the
arch_get_unmapped_area but got scared by the indirection in
get_unmapped_area:

get_area = current->mm->get_unmapped_area;
if (file && file->f_op && file->f_op->get_unmapped_area)
	get_area = file->f_op->get_unmapped_area;
addr = get_area(file, addr, len, pgoff, flags);
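
As a rough illustration of the limits quoted above (this is not code from
the patch), the number of levels a mapping needs can be derived from its
end address. The sketch assumes that two levels cover 31 address bits and
that each further level adds 11 bits, which matches the 2GB / 4TB / 8PB
figures. The names table_limit and levels_needed are hypothetical.

```c
#include <stdint.h>

/* Address-space limit covered by a page table with `levels` levels,
 * assuming 2 levels cover 2^31 bytes and each further level adds
 * 11 bits: 2 -> 2GB, 3 -> 4TB, 4 -> 8PB. */
static uint64_t table_limit(int levels)
{
	return (uint64_t)1 << (31 + 11 * (levels - 2));
}

/* Smallest number of levels (2..4) whose limit covers [addr, addr + len).
 * Returns -1 if the range wraps around or does not fit in 4 levels. */
static int levels_needed(uint64_t addr, uint64_t len)
{
	uint64_t end = addr + len;
	int levels;

	if (end < addr)		/* addr + len overflowed */
		return -1;
	for (levels = 2; levels <= 4; levels++)
		if (end <= table_limit(levels))
			return levels;
	return -1;
}
```

Under these assumptions a mapping that ends exactly at 2GB still fits in
two levels, while one that starts at 2GB needs three, so an upgrade would
be due the moment an mmap result crosses the current limit.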

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.




Re: [patch 3/3] arch_rebalance_pgtables call

2007-11-13 Thread Nick Piggin
On Tuesday 13 November 2007 01:30, [EMAIL PROTECTED] wrote:
> From: Martin Schwidefsky <[EMAIL PROTECTED]>
>
> In order to change the layout of the page tables after an mmap has
> crossed the address space limit of the current page table layout, an
> architecture hook in get_unmapped_area is needed. The arguments
> are the address of the new mapping and the length of it.

Can you comment what this is supposed to be for somewhere?


> Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
> ---
>
>  mm/mmap.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/mm/mmap.c
> ===================================================================
> --- linux-2.6.orig/mm/mmap.c
> +++ linux-2.6/mm/mmap.c
> @@ -36,6 +36,10 @@
>  #define arch_mmap_check(addr, len, flags)	(0)
>  #endif
>
> +#ifndef arch_rebalance_pgtables
> +#define arch_rebalance_pgtables(addr, len)	(addr)
> +#endif
> +
>  static void unmap_region(struct mm_struct *mm,
>  		struct vm_area_struct *vma, struct vm_area_struct *prev,
>  		unsigned long start, unsigned long end);
> @@ -1436,7 +1440,7 @@ get_unmapped_area(struct file *file, uns
>  	if (addr & ~PAGE_MASK)
>  		return -EINVAL;
>
> -	return addr;
> +	return arch_rebalance_pgtables(addr, len);
>  }
>
>  EXPORT_SYMBOL(get_unmapped_area);


[patch 3/3] arch_rebalance_pgtables call

2007-11-12 Thread schwidefsky
From: Martin Schwidefsky <[EMAIL PROTECTED]>

In order to change the layout of the page tables after an mmap has
crossed the address space limit of the current page table layout, an
architecture hook in get_unmapped_area is needed. The arguments
are the address of the new mapping and the length of it.

Cc: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Martin Schwidefsky <[EMAIL PROTECTED]>
---

 mm/mmap.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/mmap.c
===================================================================
--- linux-2.6.orig/mm/mmap.c
+++ linux-2.6/mm/mmap.c
@@ -36,6 +36,10 @@
 #define arch_mmap_check(addr, len, flags)	(0)
 #endif
 
+#ifndef arch_rebalance_pgtables
+#define arch_rebalance_pgtables(addr, len)	(addr)
+#endif
+
 static void unmap_region(struct mm_struct *mm,
 		struct vm_area_struct *vma, struct vm_area_struct *prev,
 		unsigned long start, unsigned long end);
@@ -1436,7 +1440,7 @@ get_unmapped_area(struct file *file, uns
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
 
-	return addr;
+	return arch_rebalance_pgtables(addr, len);
 }
 
 EXPORT_SYMBOL(get_unmapped_area);
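
For readers unfamiliar with the pattern: the #ifndef default means an
architecture can supply its own definition ahead of this point and the
generic code picks it up, while every other architecture gets a no-op that
returns the address unchanged. A minimal userspace sketch follows. The
function my_arch_rebalance, the counter and finish_lookup are hypothetical
stand-ins, not part of this patch or of any architecture's code.

```c
/* Hypothetical arch side: would normally live in an arch header. */
static unsigned long rebalance_calls;

static unsigned long my_arch_rebalance(unsigned long addr, unsigned long len)
{
	(void)len;
	rebalance_calls++;	/* a real arch would grow the page table here */
	return addr;
}
#define arch_rebalance_pgtables(addr, len)	my_arch_rebalance(addr, len)

/* Generic side: fall back to a no-op if the arch defined nothing. */
#ifndef arch_rebalance_pgtables
#define arch_rebalance_pgtables(addr, len)	(addr)
#endif

/* Stand-in for the tail of get_unmapped_area. */
static unsigned long finish_lookup(unsigned long addr, unsigned long len)
{
	return arch_rebalance_pgtables(addr, len);
}
```

With the arch definition removed, finish_lookup compiles to a plain
"return addr", so architectures that do not need the hook pay nothing.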

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


