Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread William Lee Irwin III
Adam Litke wrote:
>> We didn't want to bloat the size of the vm_ops struct for all of its
>> users.

On Thu, Mar 22, 2007 at 10:02:07AM +1100, Nick Piggin wrote:
> But vmas are surely far more numerous than vm_ops, aren't they?

It should be clarified that the pointer to the operations structure
in once-per-mmap() vmas is a bigger overhead than once-per-driver
function pointers in the vm_ops structure.
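
A rough sketch of the arithmetic behind this point, in userspace C. The
helper names and the counts passed to them are hypothetical and chosen only
for illustration; nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for one pagetable operation. */
typedef int (*pt_callback_t)(void);

/* Hypothetical ops table; only its pointer size matters below. */
struct ops_table {
	pt_callback_t fault;
	pt_callback_t copy_vma;
};

/* Option A: every vma carries one extra pointer to a separate ops table. */
static size_t cost_pointer_per_vma(size_t nr_vmas)
{
	return nr_vmas * sizeof(struct ops_table *);
}

/* Option B: the callbacks are added to each (roughly per-driver) vm_ops. */
static size_t cost_fields_in_vm_ops(size_t nr_vm_ops, size_t nr_callbacks)
{
	return nr_vm_ops * nr_callbacks * sizeof(pt_callback_t);
}
```

With made-up figures such as 100000 vmas against 50 vm_ops tables of 8
callbacks each, option A costs far more memory than option B, which is the
direction of the comparison stated above.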


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Nick Piggin

Christoph Hellwig wrote:
> On Wed, Mar 21, 2007 at 10:17:40AM -0500, Adam Litke wrote:
>>> Also, it is going to be hugepage-only, isn't it? So should the naming be
>>> changed to reflect that? And #ifdef it...
>>
>> They are doing some interesting things on Cell that could take advantage
>> of this.
>
> That would be new to me.  What we need on Cell is fixing up the
> get_unmapped_area mess which Ben is working on now.
>
> And let me once again repeat that I don't like this at all.  I'll
> rather have a few ugly ifdefs in strategic places than a big object
> oriented mess like this with just a single user.

I think I agree that we'd need more than one user for this.

--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Nick Piggin

Adam Litke wrote:
> On Wed, 2007-03-21 at 15:18 +1100, Nick Piggin wrote:
>> Adam Litke wrote:
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 60e0e4a..7089323 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -98,6 +98,7 @@ struct vm_area_struct {
>>>
>>> /* Function pointers to deal with this struct. */
>>> struct vm_operations_struct * vm_ops;
>>> +   const struct pagetable_operations_struct * pagetable_ops;
>>>
>>> /* Information about our backing store: */
>>> unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
>>
>> Can you remind me why this isn't in vm_ops?
>
> We didn't want to bloat the size of the vm_ops struct for all of its
> users.

But vmas are surely far more numerous than vm_ops, aren't they?

--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Christoph Hellwig
On Wed, Mar 21, 2007 at 10:17:40AM -0500, Adam Litke wrote:
> > Also, it is going to be hugepage-only, isn't it? So should the naming be
> > changed to reflect that? And #ifdef it...
> 
> They are doing some interesting things on Cell that could take advantage
> of this.

That would be new to me.  What we need on Cell is fixing up the
get_unmapped_area mess which Ben is working on now.

And let me once again repeat that I don't like this at all.  I'll
rather have a few ugly ifdefs in strategic places than a big object
oriented mess like this with just a single user.



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Adam Litke
On Wed, 2007-03-21 at 15:18 +1100, Nick Piggin wrote:
> Adam Litke wrote:
> > Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> > ---
> > 
> >  include/linux/mm.h |   25 +
> >  1 files changed, 25 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 60e0e4a..7089323 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -98,6 +98,7 @@ struct vm_area_struct {
> >  
> > /* Function pointers to deal with this struct. */
> > struct vm_operations_struct * vm_ops;
> > +   const struct pagetable_operations_struct * pagetable_ops;
> >  
> > /* Information about our backing store: */
> > unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
> 
> Can you remind me why this isn't in vm_ops?

We didn't want to bloat the size of the vm_ops struct for all of its
users.

> Also, it is going to be hugepage-only, isn't it? So should the naming be
> changed to reflect that? And #ifdef it...

They are doing some interesting things on Cell that could take advantage
of this.

> > @@ -218,6 +219,30 @@ struct vm_operations_struct {
> >  };
> >  
> >  struct mmu_gather;
> > +
> > +struct pagetable_operations_struct {
> > +   int (*fault)(struct mm_struct *mm,
> > +   struct vm_area_struct *vma,
> > +   unsigned long address, int write_access);
> 
> I got dibs on fault ;)
> 
> My callback is a sanitised one that basically abstracts the details of the
> virtual memory mapping away, so it is usable by drivers and filesystems.
> 
> You actually want to bypass the normal fault handling because it doesn't
> know how to deal with your virtual memory mapping. Hmm, the best suggestion
> I can come up with is handle_mm_fault... unless you can think of a better
> name for me to use.

How about I use handle_pte_fault?

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Arjan van de Ven
On Wed, 2007-03-21 at 09:50 -0500, Adam Litke wrote:
> On Tue, 2007-03-20 at 16:24 -0700, Dave Hansen wrote:
> > On Mon, 2007-03-19 at 13:05 -0700, Adam Litke wrote:
> > > 
> > > +#define has_pt_op(vma, op) \
> > > +   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
> > > +#define pt_op(vma, call) \
> > > +   ((vma)->pagetable_ops->call) 
> > 
> > Can you get rid of these macros?  I think they make it a wee bit harder
> > to read.  My brain doesn't properly parse the foo(arg)(bar) syntax.  
> > 
> > +   if (has_pt_op(vma, copy_vma))
> > +   return pt_op(vma, copy_vma)(dst_mm, src_mm, vma);
> > 
> > +   if (vma->pagetable_ops && vma->pagetable_ops->copy_vma)
> > +   return vma->pagetable_ops->copy_vma(dst_mm, src_mm, vma);
> > 
> > I guess it does lead to some longish lines.  Does it start looking
> > really nasty?
> 
> Yeah, it starts to look pretty bad.  Some of these calls are in code
> that is already indented several times.

can we just make sure these things are never NULL in the first place?
would obsolete a lot of the checks, which are also runtime overhead as
well!
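
One way this could look, as a minimal userspace sketch: every vma gets a
default operations table at setup time, so callers can invoke the callback
without a NULL test. All struct, function and table names below are
hypothetical simplifications, not the patch's or the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct vma;

/* Hypothetical, cut-down operations table with a single callback. */
struct pt_ops {
	int (*copy_vma)(struct vma *vma);
};

struct vma {
	const struct pt_ops *pagetable_ops;	/* never NULL after vma_init() */
};

/* Default implementation standing in for the normal page table path. */
static int default_copy_vma(struct vma *vma) { (void)vma; return 0; }

/* Special implementation standing in for e.g. a hugepage mapping. */
static int huge_copy_vma(struct vma *vma) { (void)vma; return 1; }

static const struct pt_ops default_pt_ops = { .copy_vma = default_copy_vma };
static const struct pt_ops huge_pt_ops = { .copy_vma = huge_copy_vma };

/* Every vma is given a fully populated table when it is set up. */
static void vma_init(struct vma *vma, int is_huge)
{
	vma->pagetable_ops = is_huge ? &huge_pt_ops : &default_pt_ops;
}

/* The caller needs no has_pt_op()-style check. */
static int copy_one_vma(struct vma *vma)
{
	return vma->pagetable_ops->copy_vma(vma);
}
```

The trade-off in this sketch is that the common path swaps a predictable
branch for an indirect call, so whether it is actually cheaper would need
measuring.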
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Adam Litke
On Tue, 2007-03-20 at 16:24 -0700, Dave Hansen wrote:
> On Mon, 2007-03-19 at 13:05 -0700, Adam Litke wrote:
> > 
> > +#define has_pt_op(vma, op) \
> > +   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
> > +#define pt_op(vma, call) \
> > +   ((vma)->pagetable_ops->call) 
> 
> Can you get rid of these macros?  I think they make it a wee bit harder
> to read.  My brain doesn't properly parse the foo(arg)(bar) syntax.  
> 
> +   if (has_pt_op(vma, copy_vma))
> +   return pt_op(vma, copy_vma)(dst_mm, src_mm, vma);
> 
> +   if (vma->pagetable_ops && vma->pagetable_ops->copy_vma)
> +   return vma->pagetable_ops->copy_vma(dst_mm, src_mm, vma);
> 
> I guess it does lead to some longish lines.  Does it start looking
> really nasty?

Yeah, it starts to look pretty bad.  Some of these calls are in code
that is already indented several times.

> If you're going to have them, it might just be best to put a single
> unlikely() around the macro definitions themselves to keep anybody from
> having to open-code it for any of the users.  

It should be pretty easy to wrap has_pt_op() with an unlikely().  Good
suggestion.
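
A minimal sketch of what that wrapping might look like, written as userspace
C. The unlikely() stand-in uses the GCC/Clang __builtin_expect builtin; the
cut-down structs, special_copy() and copy_vma_range() are hypothetical and
exist only so the macros can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

struct vma;

/* Hypothetical, cut-down operations table. */
struct pt_ops {
	int (*copy_vma)(struct vma *vma);
};

struct vma {
	const struct pt_ops *pagetable_ops;	/* NULL for ordinary mappings */
};

/* The hint lives in the macro, so no caller has to open-code it. */
#define has_pt_op(vma, op) \
	unlikely((vma)->pagetable_ops && (vma)->pagetable_ops->op)
#define pt_op(vma, call) \
	((vma)->pagetable_ops->call)

/* Hypothetical special-case callback. */
static int special_copy(struct vma *vma) { (void)vma; return 42; }

static int copy_vma_range(struct vma *vma)
{
	if (has_pt_op(vma, copy_vma))
		return pt_op(vma, copy_vma)(vma);
	return 0;	/* normal page table path */
}
```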

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread William Lee Irwin III
William Lee Irwin III wrote:
>> I'm tied up elsewhere so I won't get to it in a timely fashion. Maybe
>> in a few weeks I can start up on the first two of the bunch.

On Wed, Mar 21, 2007 at 05:51:23PM +1100, Nick Piggin wrote:
> Care to give us a hint? :)

The first is something DISM-like. I've not made up my mind on the
second, but the shopping catalogue of feature requests I've done
nothing about for some time that want this is long.


William Lee Irwin III wrote:
>> Not much of a VM translation; it's just a lookup through the
>> software mocked-up structures on everything save i386, x86_64, and
>> some m68k where they're the same thing only with hardware walkers
>> (ISTR ia64's being firmware a la Alpha despite the "HPW" name,
>> though I could be wrong)

On Wed, Mar 21, 2007 at 05:51:23PM +1100, Nick Piggin wrote:
> Well the vma+pagetables *are* our VM translation data structure. It is
> a good data structure. The Gelato/UNSW guys experimenting with changing
> this have basically said they haven't yet got anything that beats it.
> I would be opposed to anything that bypasses that unless a) it is not
> applicable to the VM as a whole, and b) it is really worth it
> (hugepages was a reasonable exception).

Maybe anticipating the conventional Linux approach to this wasn't as
difficult as I supposed. ;)


William Lee Irwin III wrote:
>> reliant on them. The drivers/etc. could just as easily use helper
>> functions to carry out the lookup, thereby accomplishing the
>> unification. There's nothing particularly fundamental about a pte
>> lookup.

On Wed, Mar 21, 2007 at 05:51:23PM +1100, Nick Piggin wrote:
> Yeah you could, but it looks back to front to me.
> The VM tells the filesystem that the machine took a fault at virtual
> address X, then the filesystem asks the VM what pgoff that is, then
> tells the VM to install the corresponding page to vaddr X.
> With my ->fault, the VM asks the filesystem to give the page that
> corresponds to vaddr X, then installs it into that vaddr.

I'm aware of what is now done and the minor modification accomplished
by your ->fault(). Maybe I've even written something like this before
that I never posted. It's obvious what I'm on about and that my
thoughts here are too divergent to fly. Others should chime in with
more Linux-native ideas about what's to be done here.


William Lee Irwin III wrote:
>> Normal arches that do software TLB refill could just as easily
>> consult the radix trees dangled off struct address_space or any old
>> data structure floating around the kernel with enough information to
>> translate user virtual addresses to the physical addresses they need to
>> fill the TLB with, and there are other kernels that literally do things
>> like that.

On Wed, Mar 21, 2007 at 05:51:23PM +1100, Nick Piggin wrote:
> Sure it *could* be done, but it may not be very nice, given Linux's
> design. And you definitely need _something_ other than just the
> pagecache radix-tree, because the VM needs to know who maps the page.
> So if, for your backing store, you use a small hash table and evict old
> entries like powerpc, you'll constantly be faulting in and out pages
> from the VM's high level view of the address space. That isn't a really
> cheap operation. It takes at least:
[long list of locking operations snipped]
> Compared to our current page table walk which is just a single locked
> op + barrier for the spinlock + radix tree walk.
> If you had a very large hash table (ia64 long mode, maybe?), then you
> may have slightly fewer high level faults, but range based operations
> are going to take a whole lot of cache misses, aren't they? Especially
> for small processes.
> Not that I wouldn't be happy to be proven wrong, but I don't think it
> should be something that sneaks in under these pagetable operations.
> IMO.

I'll presume that was not for my benefit; if so, it was superfluous.

The example I gave was to show how far things could diverge from Linux'
conventions. Every single locking operation cited for Linux didn't
apply to the kernel I was thinking of due to its lockless pagecache
analogue, its lack of a direct equivalent of struct page, and its use
of different lifetime-bounding protocols from reference counting.
Things like page replacement didn't rely on things that would disturb
all that. It all worked out quite well for that kernel. So not only
can it be done other ways, but those ways are indeed efficient.

It should be clear from the above that retrofitting Linux to do similar
is effectively impossible. (Well, if you think you can pull off removing
struct page in favor of no direct equivalent and bounding the lifetimes
of page-sized chunks of memory by shooting down all references using
knowledge of who could possibly be hanging onto them in Linux, feel
free to attempt such a retrofit, and I'll send you a case of Scotch
whisky if you can get it to boot and run a major database benchmark
without crashing regardless of whether it's merged.)

In any event, let's not talk too much at cross-purposes. I'm deferring
to

Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-21 Thread Nick Piggin

Nick Piggin wrote:
> Yeah you could, but it looks back to front to me.
>
> The VM tells the filesystem that the machine took a fault at virtual
> address X, then the filesystem asks the VM what pgoff that is, then
> tells the VM to install the corresponding page to vaddr X.
>
> With my ->fault, the VM asks the filesystem to give the page that
> corresponds to vaddr X, then installs it into that vaddr.

Err, sorry, that's what the current ->nopage does. It is then still
up to the filesystem to do the vaddr to pgoff conversion.

My fault patches of course just ask the filesystem for the page at
a given pgoff.
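
For reference, the vaddr to pgoff conversion mentioned above is usually
written along the following lines. This is a userspace sketch with a
cut-down, hypothetical struct vma; the 4 KiB page size (PAGE_SHIFT of 12) is
an assumption, and vaddr_to_pgoff() is an illustrative name.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, cut-down vma holding only the fields the conversion needs. */
struct vma {
	unsigned long vm_start;	/* first virtual address of the mapping */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};

/* Page offset within the backing file for a faulting virtual address. */
static unsigned long vaddr_to_pgoff(const struct vma *vma,
				    unsigned long address)
{
	return ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}
```

For a mapping that starts at 0x10000000 with vm_pgoff 5, an address 0x3abc
bytes into the mapping falls in the fourth page of the mapping, so the
result is page offset 8.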

--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread Nick Piggin

William Lee Irwin III wrote:

William Lee Irwin III wrote:


ISTR potential ppc64 users coming out of the woodwork for something I
didn't recognize the name of, but I may be confusing that with your
patch. I can implement additional users (and useful ones at that)
needing this in particular if desired.



On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:


Yes I would be interested in seeing useful additional users of this
that cannot use our regular virtual memory, before making it a general
thing.
I just don't want to see proliferation of these things, if possible.



I'm tied up elsewhere so I won't get to it in a timely fashion. Maybe
in a few weeks I can start up on the first two of the bunch.


Care to give us a hint? :)



> William Lee Irwin III wrote:
>>> Two fault handling methods callbacks raise an eyebrow over here at least.
>>> I was vaguely hoping for unification of the fault handling callbacks.
>
> On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:
>> I don't know if it would be so clean to do that as they are at different
>> levels.
>> Adam's fault is before the VM translation (and bypasses it), and mine is
>> after.
>
> Not much of a VM translation; it's just a lookup through the software
> mocked-up structures on everything save i386, x86_64, and some m68k where
> they're the same thing only with hardware walkers (ISTR ia64's being
> firmware a la Alpha despite the "HPW" name, though I could be wrong)


Well the vma+pagetables *are* our VM translation data structure. It is
a good data structure. The Gelato/UNSW guys experimenting with changing
this have basically said they haven't yet got anything that beats it.

I would be opposed to anything that bypasses that unless a) it is not
applicable to the VM as a whole, and b) it is really worth it
(hugepages was a reasonable exception).



> reliant on them. The drivers/etc. could just as easily use helper
> functions to carry out the lookup, thereby accomplishing the
> unification. There's nothing particularly fundamental about a pte
> lookup.


Yeah you could, but it looks back to front to me.

The VM tells the filesystem that the machine took a fault at virtual
address X, then the filesystem asks the VM what pgoff that is, then
tells the VM to install the corresponding page to vaddr X.

With my ->fault, the VM asks the filesystem to give the page that
corresponds to vaddr X, then installs it into that vaddr.
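
The two orderings contrasted above can be sketched in userspace C. This is a toy model under stated assumptions, not kernel code: only `vm_start` and `vm_pgoff` mirror real field names, while `installed[]`, `file_pages[]`, `vm_addr_to_pgoff()`, `vm_install()`, `fs_find_page()`, `fs_raw_fault()` and `vm_fault()` are hypothetical names made up for illustration. Having the VM translate the address to a page offset before calling the filesystem is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define NR_SLOTS   8

struct page { unsigned long pgoff; };

/* Toy mapping; installed[] stands in for the page tables. */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_pgoff;
	struct page *installed[NR_SLOTS];
};

static struct page file_pages[NR_SLOTS];	/* toy backing file */

/* VM side: translate a faulting address to a file page offset. */
static unsigned long vm_addr_to_pgoff(struct vm_area_struct *vma,
				      unsigned long addr)
{
	return ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* VM side: record the page at the faulting address. */
static void vm_install(struct vm_area_struct *vma, unsigned long addr,
		       struct page *page)
{
	vma->installed[(addr - vma->vm_start) >> PAGE_SHIFT] = page;
}

/* Filesystem side: find the page backing a given file offset. */
static struct page *fs_find_page(unsigned long pgoff)
{
	assert(pgoff < NR_SLOTS);
	file_pages[pgoff].pgoff = pgoff;
	return &file_pages[pgoff];
}

/* Ordering 1: the filesystem receives the raw fault and drives the VM. */
static void fs_raw_fault(struct vm_area_struct *vma, unsigned long addr)
{
	unsigned long pgoff = vm_addr_to_pgoff(vma, addr); /* fs asks the VM */

	vm_install(vma, addr, fs_find_page(pgoff));	   /* fs tells the VM */
}

/* Ordering 2: the VM asks the filesystem for a page and installs it itself. */
static void vm_fault(struct vm_area_struct *vma, unsigned long addr)
{
	struct page *page = fs_find_page(vm_addr_to_pgoff(vma, addr));

	vm_install(vma, addr, page);
}
```

Both functions end up installing the same page; the difference is only which side owns the translation and installation steps, which is the layering point being argued.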



> Normal arches that do software TLB refill could just as easily
> consult the radix trees dangled off struct address_space or any old
> data structure floating around the kernel with enough information to
> translate user virtual addresses to the physical addresses they need to
> fill the TLB with, and there are other kernels that literally do things
> like that.


Sure it *could* be done, but it may not be very nice, given Linux's
design. And you definitely need _something_ other than just the
pagecache radix-tree, because the VM needs to know who maps the page.

So if, for your backing store, you use a small hash table and evict old
entries like powerpc, you'll constantly be faulting in and out pages
from the VM's high level view of the address space. That isn't a really
cheap operation. It takes at least:

read_lock_irq(mapping->tree_lock);
radix_tree_lookup()
read_unlock_irq(mapping->tree_lock);
lock_page()
atomic_add(page->_count)
atomic_add(page->_mapcount)
unlock_page()

atomic_add_negative(page->_mapcount)
atomic_dec_and_test(page->_count)

Compared to our current page table walk which is just a single locked
op + barrier for the spinlock + radix tree walk.


If you had a very large hash table (ia64 long mode, maybe?), then you
may have slightly fewer high level faults, but range based operations
are going to take a whole lot of cache misses, aren't they? Especially
for small processes.

Not that I wouldn't be happy to be proven wrong, but I don't think it
should be something that sneaks in under these pagetable operations.
IMO.

--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread William Lee Irwin III
William Lee Irwin III wrote:
>> ISTR potential ppc64 users coming out of the woodwork for something I
>> didn't recognize the name of, but I may be confusing that with your
>> patch. I can implement additional users (and useful ones at that)
>> needing this in particular if desired.

On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:
> Yes I would be interested in seeing useful additional users of this
> that cannot use our regular virtual memory, before making it a general
> thing.
> I just don't want to see proliferation of these things, if possible.

I'm tied up elsewhere so I won't get to it in a timely fashion. Maybe
in a few weeks I can start up on the first two of the bunch.


William Lee Irwin III wrote:
>> Two fault handling methods callbacks raise an eyebrow over here at least.
>> I was vaguely hoping for unification of the fault handling callbacks.

On Wed, Mar 21, 2007 at 04:07:43PM +1100, Nick Piggin wrote:
> I don't know if it would be so clean to do that as they are at different 
> levels.
> Adam's fault is before the VM translation (and bypasses it), and mine is 
> after.

Not much of a VM translation; it's just a lookup through the software
mocked-up structures on everything save i386, x86_64, and some m68k where
they're the same thing only with hardware walkers (ISTR ia64's being
firmware a la Alpha despite the "HPW" name, though I could be wrong)
reliant on them. The drivers/etc. could just as easily use helper
functions to carry out the lookup, thereby accomplishing the
unification. There's nothing particularly fundamental about a pte
lookup. Normal arches that do software TLB refill could just as easily
consult the radix trees dangled off struct address_space or any old
data structure floating around the kernel with enough information to
translate user virtual addresses to the physical addresses they need to
fill the TLB with, and there are other kernels that literally do things
like that.

Basically, drop in to the ->fault() callback with no attempt at a pte
lookup. The drivers using the standard pagetable format can call helper
functions to do all the gruntwork surrounding that for them. Then the
more sophisticated drivers can do the necessary work by hand.
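
A minimal userspace sketch of that arrangement, assuming a single unified callback. Every name here (`fault_fn`, `standard_pte_fault()`, `ordinary_fault()`, `custom_fault()`, `handle_fault()`, the `FAULT_*` codes and the toy `ptes[]` array) is hypothetical and invented for illustration; none of it is taken from the patch series.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PTES 4

struct vm_area_struct;

/* Hypothetical unified callback, entered before any pte lookup. */
typedef int (*fault_fn)(struct vm_area_struct *vma, unsigned long address);

struct vm_area_struct {
	fault_fn fault;
	unsigned long ptes[NR_PTES];	/* toy "standard format" page table */
	unsigned long private_map;	/* toy driver-private translation */
};

enum { FAULT_MINOR = 0, FAULT_FILLED = 1, FAULT_CUSTOM = 2 };

/* Helper for users of the standard format: does the pte gruntwork. */
static int standard_pte_fault(struct vm_area_struct *vma, unsigned long address)
{
	unsigned long idx = address % NR_PTES;

	if (vma->ptes[idx])
		return FAULT_MINOR;	/* translation already present */
	vma->ptes[idx] = address | 1;	/* install a toy entry */
	return FAULT_FILLED;
}

/* An ordinary driver simply delegates to the helper. */
static int ordinary_fault(struct vm_area_struct *vma, unsigned long address)
{
	return standard_pte_fault(vma, address);
}

/* A more sophisticated driver does the work by hand, bypassing ptes[]. */
static int custom_fault(struct vm_area_struct *vma, unsigned long address)
{
	vma->private_map = address;
	return FAULT_CUSTOM;
}

/* Core entry point: no pte lookup here, only dispatch. */
static int handle_fault(struct vm_area_struct *vma, unsigned long address)
{
	assert(vma != NULL && vma->fault != NULL);
	return vma->fault(vma, address);
}
```

The design choice on display is that the core never touches the page table itself; whether that is acceptable for the common fault path is exactly what the rest of the thread disputes.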

But others should really be consulted on this point. My notions in/around
this area tend to be outside the mainstream. I can anticipate that the
two ->fault() functions will look strange to people, but not what
alternatives would be most idiomatic to mainstream Linux conventions.


-- wli


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread Nick Piggin

William Lee Irwin III wrote:

> Adam Litke wrote:
>>>  struct vm_operations_struct * vm_ops;
>>> +   const struct pagetable_operations_struct * pagetable_ops;
>
> On Wed, Mar 21, 2007 at 03:18:30PM +1100, Nick Piggin wrote:
>> Can you remind me why this isn't in vm_ops?
>> Also, it is going to be hugepage-only, isn't it? So should the naming be
>> changed to reflect that? And #ifdef it...
>
> ISTR potential ppc64 users coming out of the woodwork for something I
> didn't recognize the name of, but I may be confusing that with your
> patch. I can implement additional users (and useful ones at that)
> needing this in particular if desired.


Yes I would be interested in seeing useful additional users of this
that cannot use our regular virtual memory, before making it a general
thing.

I just don't want to see proliferation of these things, if possible.


> Adam Litke wrote:
>>> +struct pagetable_operations_struct {
>>> +   int (*fault)(struct mm_struct *mm,
>
> On Wed, Mar 21, 2007 at 03:18:30PM +1100, Nick Piggin wrote:
>> I got dibs on fault ;)
>> My callback is a sanitised one that basically abstracts the details of the
>> virtual memory mapping away, so it is usable by drivers and filesystems.
>> You actually want to bypass the normal fault handling because it doesn't
>> know how to deal with your virtual memory mapping. Hmm, the best suggestion
>> I can come up with is handle_mm_fault... unless you can think of a better
>> name for me to use.
>
> Two fault handling methods callbacks raise an eyebrow over here at least.
> I was vaguely hoping for unification of the fault handling callbacks.


I don't know if it would be so clean to do that as they are at different levels.
Adam's fault is before the VM translation (and bypasses it), and mine is after.

--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread William Lee Irwin III
Adam Litke wrote:
>>  struct vm_operations_struct * vm_ops;
>> +const struct pagetable_operations_struct * pagetable_ops;

On Wed, Mar 21, 2007 at 03:18:30PM +1100, Nick Piggin wrote:
> Can you remind me why this isn't in vm_ops?
> Also, it is going to be hugepage-only, isn't it? So should the naming be
> changed to reflect that? And #ifdef it...

ISTR potential ppc64 users coming out of the woodwork for something I
didn't recognize the name of, but I may be confusing that with your
patch. I can implement additional users (and useful ones at that)
needing this in particular if desired.


Adam Litke wrote:
>> +struct pagetable_operations_struct {
>> +int (*fault)(struct mm_struct *mm,

On Wed, Mar 21, 2007 at 03:18:30PM +1100, Nick Piggin wrote:
> I got dibs on fault ;)
> My callback is a sanitised one that basically abstracts the details of the
> virtual memory mapping away, so it is usable by drivers and filesystems.
> You actually want to bypass the normal fault handling because it doesn't
> know how to deal with your virtual memory mapping. Hmm, the best suggestion
> I can come up with is handle_mm_fault... unless you can think of a better
> name for me to use.

Two fault handling methods callbacks raise an eyebrow over here at least.
I was vaguely hoping for unification of the fault handling callbacks.


-- wli


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread Nick Piggin

Adam Litke wrote:

> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> ---
>
>  include/linux/mm.h |   25 +
>  1 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 60e0e4a..7089323 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -98,6 +98,7 @@ struct vm_area_struct {
>  
>  	/* Function pointers to deal with this struct. */
>  	struct vm_operations_struct * vm_ops;
> +   const struct pagetable_operations_struct * pagetable_ops;
>  
>  	/* Information about our backing store: */
>  	unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE


Can you remind me why this isn't in vm_ops?

Also, it is going to be hugepage-only, isn't it? So should the naming be
changed to reflect that? And #ifdef it...


> @@ -218,6 +219,30 @@ struct vm_operations_struct {
>  };
>  
>  struct mmu_gather;
> +
> +struct pagetable_operations_struct {
> +   int (*fault)(struct mm_struct *mm,
> +   struct vm_area_struct *vma,
> +   unsigned long address, int write_access);


I got dibs on fault ;)

My callback is a sanitised one that basically abstracts the details of the
virtual memory mapping away, so it is usable by drivers and filesystems.

You actually want to bypass the normal fault handling because it doesn't
know how to deal with your virtual memory mapping. Hmm, the best suggestion
I can come up with is handle_mm_fault... unless you can think of a better
name for me to use.


> +   int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
> +   struct vm_area_struct *vma);
> +   int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
> +   struct page **pages, struct vm_area_struct **vmas,
> +   unsigned long *position, int *length, int i);
> +   void (*change_protection)(struct vm_area_struct *vma,
> +   unsigned long address, unsigned long end, pgprot_t newprot);
> +   unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
> +   unsigned long address, unsigned long end, long *zap_work);
> +   void (*free_pgtable_range)(struct mmu_gather **tlb,
> +   unsigned long addr, unsigned long end,
> +   unsigned long floor, unsigned long ceiling);
> +};
> +
> +#define has_pt_op(vma, op) \
> +   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
> +#define pt_op(vma, call) \
> +   ((vma)->pagetable_ops->call)
> +
>  struct inode;
>  
>  #define page_private(page)		((page)->private)


--


--
SUSE Labs, Novell Inc.


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-20 Thread Dave Hansen
On Mon, 2007-03-19 at 13:05 -0700, Adam Litke wrote:
> 
> +#define has_pt_op(vma, op) \
> +   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
> +#define pt_op(vma, call) \
> +   ((vma)->pagetable_ops->call) 

Can you get rid of these macros?  I think they make it a wee bit harder
to read.  My brain doesn't properly parse the foo(arg)(bar) syntax.  

+   if (has_pt_op(vma, copy_vma))
+   return pt_op(vma, copy_vma)(dst_mm, src_mm, vma);

+   if (vma->pagetable_ops && vma->pagetable_ops->copy_vma)
+   return vma->pagetable_ops->copy_vma(dst_mm, src_mm, vma);

I guess it does lead to some longish lines.  Does it start looking
really nasty?

If you're going to have them, it might just be best to put a single
unlikely() around the macro definitions themselves to keep anybody from
having to open-code it for any of the users.  

-- Dave
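
The last suggestion above, folding a single unlikely() into the macro definitions, might look roughly like this. It is a userspace sketch rather than kernel code: `unlikely()` is re-created with the GCC/Clang `__builtin_expect` builtin, the structures are cut down to the one `copy_vma` hook, and `special_copy_vma()` and `copy_range()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's branch hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Cut-down stand-ins for the kernel structures; only copy_vma is modelled. */
struct mm_struct { int id; };
struct vm_area_struct;

struct pagetable_operations_struct {
	int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
			struct vm_area_struct *vma);
};

struct vm_area_struct {
	const struct pagetable_operations_struct *pagetable_ops;
};

/* The hint lives in the macro, so no call site has to open-code it. */
#define has_pt_op(vma, op) \
	unlikely((vma)->pagetable_ops && (vma)->pagetable_ops->op)
#define pt_op(vma, call) \
	((vma)->pagetable_ops->call)

/* Hypothetical override, e.g. for a hugepage mapping. */
static int special_copy_vma(struct mm_struct *dst, struct mm_struct *src,
			    struct vm_area_struct *vma)
{
	(void)dst;
	(void)src;
	(void)vma;
	return 1;
}

/* Hypothetical caller: take the override if present, else the generic path. */
static int copy_range(struct mm_struct *dst, struct mm_struct *src,
		      struct vm_area_struct *vma)
{
	assert(vma != NULL);
	if (has_pt_op(vma, copy_vma))
		return pt_op(vma, copy_vma)(dst, src, vma);
	return 0;	/* generic path */
}
```

With the hint inside `has_pt_op()`, the common case (no `pagetable_ops` at all) is marked as the expected branch at every call site without any extra annotation there.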



[PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-03-19 Thread Adam Litke

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
---

 include/linux/mm.h |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 60e0e4a..7089323 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -98,6 +98,7 @@ struct vm_area_struct {
 
/* Function pointers to deal with this struct. */
struct vm_operations_struct * vm_ops;
+   const struct pagetable_operations_struct * pagetable_ops;
 
/* Information about our backing store: */
unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
@@ -218,6 +219,30 @@ struct vm_operations_struct {
 };
 
 struct mmu_gather;
+
+struct pagetable_operations_struct {
+   int (*fault)(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long address, int write_access);
+   int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
+   struct vm_area_struct *vma);
+   int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
+   struct page **pages, struct vm_area_struct **vmas,
+   unsigned long *position, int *length, int i);
+   void (*change_protection)(struct vm_area_struct *vma,
+   unsigned long address, unsigned long end, pgprot_t newprot);
+   unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
+   unsigned long address, unsigned long end, long *zap_work);
+   void (*free_pgtable_range)(struct mmu_gather **tlb,
+   unsigned long addr, unsigned long end,
+   unsigned long floor, unsigned long ceiling);
+};
+
+#define has_pt_op(vma, op) \
+   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
+#define pt_op(vma, call) \
+   ((vma)->pagetable_ops->call)
+
 struct inode;
 
 #define page_private(page) ((page)->private)


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-20 Thread Mel Gorman
On (19/02/07 22:29), Christoph Hellwig didst pronounce:
> On Mon, Feb 19, 2007 at 10:31:34AM -0800, Adam Litke wrote:
> > Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> > ---
> > 
> >  include/linux/mm.h |   25 +
> >  1 files changed, 25 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 2d2c08d..a2fa66d 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -98,6 +98,7 @@ struct vm_area_struct {
> >  
> > /* Function pointers to deal with this struct. */
> > struct vm_operations_struct * vm_ops;
> > +   struct pagetable_operations_struct * pagetable_ops;
> >  
> > /* Information about our backing store: */
> > unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
> > @@ -218,6 +219,30 @@ struct vm_operations_struct {
> >  };
> >  
> >  struct mmu_gather;
> > +
> > +struct pagetable_operations_struct {
> > +   int (*fault)(struct mm_struct *mm,
> > +   struct vm_area_struct *vma,
> > +   unsigned long address, int write_access);
> > +   int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
> > +   struct vm_area_struct *vma);
> > +   int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
> > +   struct page **pages, struct vm_area_struct **vmas,
> > +   unsigned long *position, int *length, int i);
> > +   void (*change_protection)(struct vm_area_struct *vma,
> > +   unsigned long address, unsigned long end, pgprot_t newprot);
> > +   unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
> > +   unsigned long address, unsigned long end, long *zap_work);
> > +   void (*free_pgtable_range)(struct mmu_gather **tlb,
> > +   unsigned long addr, unsigned long end,
> > +   unsigned long floor, unsigned long ceiling);
> > +};
> 
> I don't think adding another operation vector is a good idea.  But I'd
> rather extend the vma operations vector to deal with all necessary
> bits instead of adding a second one.

Well, there are a lot of users of vm_operations_struct that have no interest in
the operations in pagetable_operations_struct. Expanding vm_operations_struct
would increase the size of all VMAs by more than is necessary.

Also, having the pagetable ops in vm_operations_struct might lead device
drivers to believe they should be doing something entertaining there. In
reality, we would only want drivers playing with pagetable_operations when
they really know what they are doing and why.  Having pagetable_ops set
is, like having VM_HUGETLB set, a strong sign that something unusual is
going on, and one that is fairly easy to check for.

I prefer the additional struct to extending VMAs anyway.

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-19 Thread Christoph Hellwig
On Mon, Feb 19, 2007 at 10:31:34AM -0800, Adam Litke wrote:
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> ---
> 
>  include/linux/mm.h |   25 +
>  1 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2d2c08d..a2fa66d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -98,6 +98,7 @@ struct vm_area_struct {
>  
>   /* Function pointers to deal with this struct. */
>   struct vm_operations_struct * vm_ops;
> + struct pagetable_operations_struct * pagetable_ops;
>  
>   /* Information about our backing store: */
>   unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
> @@ -218,6 +219,30 @@ struct vm_operations_struct {
>  };
>  
>  struct mmu_gather;
> +
> +struct pagetable_operations_struct {
> + int (*fault)(struct mm_struct *mm,
> + struct vm_area_struct *vma,
> + unsigned long address, int write_access);
> + int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
> + struct vm_area_struct *vma);
> + int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
> + struct page **pages, struct vm_area_struct **vmas,
> + unsigned long *position, int *length, int i);
> + void (*change_protection)(struct vm_area_struct *vma,
> + unsigned long address, unsigned long end, pgprot_t newprot);
> + unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
> + unsigned long address, unsigned long end, long *zap_work);
> + void (*free_pgtable_range)(struct mmu_gather **tlb,
> + unsigned long addr, unsigned long end,
> + unsigned long floor, unsigned long ceiling);
> +};

I don't think adding another operation vector is a good idea.  But I'd
rather extend the vma operations vector to deal with all necessary
bits instead of adding a second one.



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-19 Thread William Lee Irwin III
On Mon, Feb 19, 2007 at 10:31:34AM -0800, Adam Litke wrote:
> +struct pagetable_operations_struct {
> + int (*fault)(struct mm_struct *mm,
> + struct vm_area_struct *vma,
> + unsigned long address, int write_access);
> + int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
> + struct vm_area_struct *vma);
> + int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
> + struct page **pages, struct vm_area_struct **vmas,
> + unsigned long *position, int *length, int i);
> + void (*change_protection)(struct vm_area_struct *vma,
> + unsigned long address, unsigned long end, pgprot_t newprot);
> + unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
> + unsigned long address, unsigned long end, long *zap_work);
> + void (*free_pgtable_range)(struct mmu_gather **tlb,
> + unsigned long addr, unsigned long end,
> + unsigned long floor, unsigned long ceiling);
> +};

I very very strongly approve of the approach this operations structure
entails.


-- wli


Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-19 Thread Adam Litke
On Mon, 2007-02-19 at 19:41 +0100, Arjan van de Ven wrote:
> On Mon, 2007-02-19 at 10:31 -0800, Adam Litke wrote:
> > Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> > ---
> > 
> >  include/linux/mm.h |   25 +
> >  1 files changed, 25 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 2d2c08d..a2fa66d 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -98,6 +98,7 @@ struct vm_area_struct {
> >  
> > /* Function pointers to deal with this struct. */
> > struct vm_operations_struct * vm_ops;
> > +   struct pagetable_operations_struct * pagetable_ops;
> >  
> 
> please make it at least const, those things have no business ever being
> written to right? And by making them const the compiler helps catch
> that, and as bonus the data gets moved to rodata so that it won't share
> cachelines with anything that gets dirty

Yep I agree.  Changed.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center



Re: [PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-19 Thread Arjan van de Ven
On Mon, 2007-02-19 at 10:31 -0800, Adam Litke wrote:
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
> ---
> 
>  include/linux/mm.h |   25 +
>  1 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2d2c08d..a2fa66d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -98,6 +98,7 @@ struct vm_area_struct {
>  
>   /* Function pointers to deal with this struct. */
>   struct vm_operations_struct * vm_ops;
> + struct pagetable_operations_struct * pagetable_ops;
>  

please make it at least const, those things have no business ever being
written to right? And by making them const the compiler helps catch
that, and as bonus the data gets moved to rodata so that it won't share
cachelines with anything that gets dirty
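
A minimal userspace sketch of the const suggestion, assuming nothing about
the final kernel code: demo_ops, demo_table, demo_vma and demo_call are
hypothetical names used only for illustration. With the pointer declared
const, a write through it is rejected at compile time, and toolchains
typically place a static const table in a read-only section.

```c
/* Hypothetical operations table, standing in for a struct of function pointers. */
struct demo_ops {
	int (*scale)(int value);
};

static int demo_double(int value)
{
	return 2 * value;
}

/* Declared const: the table itself can never be modified after initialization. */
static const struct demo_ops demo_table = {
	.scale = demo_double,
};

/* Hypothetical per-mapping object holding a pointer-to-const table. */
struct demo_vma {
	const struct demo_ops *ops;
};

static int demo_call(const struct demo_vma *vma, int value)
{
	/*
	 * Reading through the pointer is fine. A write such as
	 *     vma->ops->scale = demo_double;
	 * would not compile, because the pointed-to object is const.
	 */
	return vma->ops->scale(value);
}
```

The pointer in demo_vma can still be re-aimed at a different table; only
the table contents are protected, which matches the request above.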



[PATCH 1/7] Introduce the pagetable_operations and associated helper macros.

2007-02-19 Thread Adam Litke

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
---

 include/linux/mm.h |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d2c08d..a2fa66d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -98,6 +98,7 @@ struct vm_area_struct {
 
/* Function pointers to deal with this struct. */
struct vm_operations_struct * vm_ops;
+   struct pagetable_operations_struct * pagetable_ops;
 
/* Information about our backing store: */
unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
@@ -218,6 +219,30 @@ struct vm_operations_struct {
 };
 
 struct mmu_gather;
+
+struct pagetable_operations_struct {
+   int (*fault)(struct mm_struct *mm,
+   struct vm_area_struct *vma,
+   unsigned long address, int write_access);
+   int (*copy_vma)(struct mm_struct *dst, struct mm_struct *src,
+   struct vm_area_struct *vma);
+   int (*pin_pages)(struct mm_struct *mm, struct vm_area_struct *vma,
+   struct page **pages, struct vm_area_struct **vmas,
+   unsigned long *position, int *length, int i);
+   void (*change_protection)(struct vm_area_struct *vma,
+   unsigned long address, unsigned long end, pgprot_t newprot);
+   unsigned long (*unmap_page_range)(struct vm_area_struct *vma,
+   unsigned long address, unsigned long end, long *zap_work);
+   void (*free_pgtable_range)(struct mmu_gather **tlb,
+   unsigned long addr, unsigned long end,
+   unsigned long floor, unsigned long ceiling);
+};
+
+#define has_pt_op(vma, op) \
+   ((vma)->pagetable_ops && (vma)->pagetable_ops->op)
+#define pt_op(vma, call) \
+   ((vma)->pagetable_ops->call)
+
 struct inode;
 
 #define page_private(page) ((page)->private)
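
For readers following the thread, here is a hedged userspace sketch of how
a call site might use the two helper macros. The macros are copied from
the patch above; everything else (the reduced struct definitions,
demo_special_fault, demo_pt_ops, demo_handle_fault and the return values)
is an assumption made for illustration and is not taken from the rest of
the series.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types (assumptions for illustration). */
struct mm_struct;
struct vm_area_struct;

/* Reduced to a single hook; the patch defines six. */
struct pagetable_operations_struct {
	int (*fault)(struct mm_struct *mm, struct vm_area_struct *vma,
		     unsigned long address, int write_access);
};

/* Only the field the macros touch. */
struct vm_area_struct {
	const struct pagetable_operations_struct *pagetable_ops;
};

/* Helper macros as posted in the patch. */
#define has_pt_op(vma, op) \
	((vma)->pagetable_ops && (vma)->pagetable_ops->op)
#define pt_op(vma, call) \
	((vma)->pagetable_ops->call)

/* Hypothetical override; returns 1 so the caller can tell it ran. */
static int demo_special_fault(struct mm_struct *mm, struct vm_area_struct *vma,
			      unsigned long address, int write_access)
{
	(void)mm; (void)vma; (void)address; (void)write_access;
	return 1;
}

static const struct pagetable_operations_struct demo_pt_ops = {
	.fault = demo_special_fault,
};

/* Hypothetical call site: use the override if one is set, else the generic path. */
static int demo_handle_fault(struct mm_struct *mm, struct vm_area_struct *vma,
			     unsigned long address, int write_access)
{
	if (has_pt_op(vma, fault))
		return pt_op(vma, fault)(mm, vma, address, write_access);
	return 0; /* stands in for the generic fault path */
}
```

Note that has_pt_op() checks both the table pointer and the individual
hook, so a vma with no table, or a table that leaves a hook NULL, falls
through to the generic path; pt_op() performs no check and is only safe
after has_pt_op() has succeeded.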

