Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-11-01 Thread Jaya Kumar
On Oct 30, 2007 11:47 AM, Hugh Dickins <[EMAIL PROTECTED]> wrote:
>
> I don't understand why you suggested an anon_vma, nor why Jaya is
> suggesting a private list.  All vmas mapping /dev/fb0 will be kept
> in the prio_tree rooted in its struct address_space (__vma_link_file
> in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
> very tree.  The missing part is just the setting of page->mapping to
> point to that struct address_space (and clearing it before finally
> freeing the pages), and the setting of page->index as you described.
> Isn't it?

Oops, sorry that I missed that. Now I understand. I think:

page->mapping = vma->vm_file->f_mapping
page->index = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff

at nopage time and then before the driver vfrees, I'll clear mapping
for all those pages.

Thanks,
jaya
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-11-01 Thread Jaya Kumar
On Oct 30, 2007 11:47 AM, Hugh Dickins [EMAIL PROTECTED] wrote:

 I don't understand why you suggested an anon_vma, nor why Jaya is
 suggesting a private list.  All vmas mapping /dev/fb0 will be kept
 in the prio_tree rooted in its struct address_space (__vma_link_file
 in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
 very tree.  The missing part is just the setting of page-mapping to
 point to that struct address_space (and clearing it before finally
 freeing the pages), and the setting of page-index as you described.
 Isn't it?

Oops, sorry that I missed that. Now I understand. I think:

page-mapping = vma-vm_file-f_mapping
page-index = ((addr - vma-vm_start)  PAGE_SHIFT) + vma-vm_pgoff

at nopage time and then before the driver vfrees, I'll clear mapping
for all those pages.

Thanks,
jaya
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Hugh Dickins
On Tue, 30 Oct 2007, Peter Zijlstra wrote:
> On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
...
> > - defio mmap adds this vma to private list (equivalent of
> > address_space or anon_vma)
...
> > - foreach vma { foreach page { page_mkclean_one(page, vma) }
> 
> Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
> user-space address for the page in this vma using page->index and the
> formula from the last email, this address is then used to walk the page
> tables and obtain a pte.

I don't understand why you suggested an anon_vma, nor why Jaya is
suggesting a private list.  All vmas mapping /dev/fb0 will be kept
in the prio_tree rooted in its struct address_space (__vma_link_file
in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
very tree.  The missing part is just the setting of page->mapping to
point to that struct address_space (and clearing it before finally
freeing the pages), and the setting of page->index as you described.
Isn't it?

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 15:47 +, Hugh Dickins wrote:
> On Tue, 30 Oct 2007, Peter Zijlstra wrote:
> > On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
> 
> > > - defio mmap adds this vma to private list (equivalent of
> > > address_space or anon_vma)
> 
> > > - foreach vma { foreach page { page_mkclean_one(page, vma) }
> > 
> > Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
> > user-space address for the page in this vma using page->index and the
> > formula from the last email, this address is then used to walk the page
> > tables and obtain a pte.
> 
> I don't understand why you suggested an anon_vma, nor why Jaya is
> suggesting a private list.  All vmas mapping /dev/fb0 will be kept
> in the prio_tree rooted in its struct address_space (__vma_link_file
> in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
> very tree.  The missing part is just the setting of page->mapping to
> point to that struct address_space (and clearing it before finally
> freeing the pages), and the setting of page->index as you described.
> Isn't it?

Hmm, there is a thought. I had not considered that mapping a chardev
would have that effect.

I'd have to have a look at the actual code, but yeah, that might very
well work out. How silly of me.

Thanks!


signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
> On 10/30/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > So page->index does what you want it to, identify which part of the
> > framebuffer this particular page belongs to.
> 
> Ok. I'm attempting to walk the code sequence. Here's what I think:
> 
> - driver loads
> - driver vmalloc()s its fb
> - this creates the necessary pte entries

well, one set thereof, the kernel mappings, which for this purpose are
the least interesting.

> then...
> - app mmap(/dev/fb0)
> - vma is created
> - defio mmap adds this vma to private list (equivalent of
> address_space or anon_vma)

> - app touches base + pixel(128,128) = base + 16k
> - page fault
> - defio nopage gets called
> - defio nopage does vmalloc_to_page(base+16k)

this installs a user space page table entry for your page; this is the
interesting one as it carries the user-dirty state.

> - that finds the correct struct page corresponding to that vaddr.
> page->index has not been set by anyone so far, right?
> * ah... i see, you are suggesting that this is where I could set the
> index since i know the offset i want it to represent. right?

Not quite, you would set that right after vmallocing, just set an
increasing page->index starting with 0 for the first page.

Then ensure your vma->vm_pgoff is 0 (which should be the case since
userspace will most likely mmap the whole thing, and if not it still
gets what it expects).

> - defio mkwrite get called. defio adds page to its list. schedules delayed 
> work
> - app keeps writing the page
> - delayed work occurs
> - foreach vma { foreach page { page_mkclean_one(page, vma) }

Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
user-space address for the page in this vma using page->index and the
formula from the last email, this address is then used to walk the page
tables and obtain a pte.

This will be the user-space pte installed by your nopfn handler. Not the
kernel vmap pte resulting from the vmalloc() call.

> - cycle repeats...




signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Jaya Kumar
On 10/30/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> So page->index does what you want it to, identify which part of the
> framebuffer this particular page belongs to.

Ok. I'm attempting to walk the code sequence. Here's what I think:

- driver loads
- driver vmalloc()s its fb
- this creates the necessary pte entries
then...
- app mmap(/dev/fb0)
- vma is created
- defio mmap adds this vma to private list (equivalent of
address_space or anon_vma)
- app touches base + pixel(128,128) = base + 16k
- page fault
- defio nopage gets called
- defio nopage does vmalloc_to_page(base+16k)
- that finds the correct struct page corresponding to that vaddr.
page->index has not been set by anyone so far, right?
* ah... i see, you are suggesting that this is where I could set the
index since i know the offset i want it to represent. right?
- defio mkwrite get called. defio adds page to its list. schedules delayed work
- app keeps writing the page
- delayed work occurs
- foreach vma { foreach page { page_mkclean_one(page, vma) }
- cycle repeats...

Thanks,
jaya
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 12:39 +, Hugh Dickins wrote:
> On Tue, 30 Oct 2007, Stefani Seibold wrote:
> > 
> > the question is how can i get all pte's from a vmalloc'ed memory. Due to
> > the zeroed mapping pointer i dont see how to do this?
> 
> The mapping pointer is zeroed because you've done nothing to set it.
> Below is how I answered you a week ago.  But this is new territory
> (extending page_mkclean to work on more than just pagecache pages),
> I'm still unsure what would be the safest way to do it.

Quite, I think manual usage of page_mkclean_one() on the vma gotten from
mmap() along with properly setting page->index is the simplest solution
to make work.

Making page_mkclean(struct page *) work for remap_pfn/vmalloc_range()
style mmaps would require extending rmap to work with those, which
includes setting page->mapping to point to a anon_vma like object.

But that sounds like a lot of work, and I'm not sure its worth the
overhead, because so far all users of remap_pfn/vmalloc_range() have
survived without.




signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Hugh Dickins
On Tue, 30 Oct 2007, Stefani Seibold wrote:
> 
> the question is how can i get all pte's from a vmalloc'ed memory. Due to
> the zeroed mapping pointer i dont see how to do this?

The mapping pointer is zeroed because you've done nothing to set it.
Below is how I answered you a week ago.  But this is new territory
(extending page_mkclean to work on more than just pagecache pages),
I'm still unsure what would be the safest way to do it.

On Mon, 22 Oct 2007, Stefani Seibold wrote:
> 
> i have a problem with vmalloc() and vm_ops.page_mkwrite().
> 
> ReadOnly access works, but on a write access the VM will
> endless invoke the vm_ops.page_mkwrite() handler.
> 
> I tracked down the problem to the
>   struct page.mapping pointer,
> which is NULL.
> 
> The problem original occurs with the fb_defio driver 
> (driver/video/fb_defio.c).
> This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
> pages,
> which will be in an extra thread handled, to perform the IO and clean and
> write protect all pages with page_clean().

Interesting.  You need to ask Jaya (CC'ed) since he's the one
who put that code into fb_defio.c, exported page_mkclean, and
should have tested it.

> 
> I am not sure if the is a feature of the new rmap code or a bug.

page_mkclean was written in the belief that it was being used on
pagecache pages.  I'm not sure how deeply engrained that belief is.

If it can easily and safely be used on something else, that may be
nice: though there's a danger we'll keep breaking and re-breaking
it if there's only one driver using it in an unusual way.  CC'ed
Peter since he's the one who most needs to be aware of this.

> 
> Is there an way to get a similar functionality? Currently, i have no
> idea
> how to get the ptep from a page alloced with vmalloc().

A pagecache page would have page->mapping initialized to point to
the struct address_space of the vma, and page->index to the offset
(in PAGE_SIZE units): see mm/filemap.c:add_to_page_cache.  Without
page->mapping set, page_mkclean_file won't be able to find the vmas
in which the page might appear; and without page->index set, it
won't be able to find where the page should be in those vmas.

If such a driver does not put its pages into the page cache (the
safer course? I'm unsure), then it needs to set page->mapping and
page->index appropriately (and reset page->mapping to NULL before
freeing).

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Stefani Seibold
Hi,

the question is how can i get all pte's from a vmalloc'ed memory. Due to
the zeroed mapping pointer i dont see how to do this?


Am Dienstag, den 30.10.2007, 10:56 +0100 schrieb Peter Zijlstra:
> On Mon, 2007-10-29 at 21:22 -0400, Jaya Kumar wrote:
> > On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > >
> > > [ also, remap_vmalloc_range() suffers similar issues, only file and anon
> > >   have proper rmap ]
> > >
> > > I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
> > > we could assist drivers in maintaining and using vma lists.
> > >
> > > I think page_mkclean_one() would work if you'd manually set page->index
> > > and iterate the vmas yourself. Although atm I'm not sure of anything so
> > > don't pin me on it.
> > 
> > :-) If it's anybody's fault, it's mine for not testing properly. My bad.
> > 
> > In the case of defio, I think it's no trouble to build a list of vmas
> > at mmap time and then to iterate through them when it's ready for
> > mkclean time as you suggested. I don't fully understand page->index
> > yet. I had thought it was only used by swap cache or file map.
> > 
> > On an unrelated note, I was looking for somewhere to stuff a 16 bit
> > offset (so that I have a cheap way to know which struct page
> > corresponds to which framebuffer block or offset) for another driver.
> > I had thought page->index was it but I think I am wrong now.
> Yeah, page->index is used along with vma->vmpgoff and vma->vm_start to
> determine the address of the page in the given vma:
> 
>   address = vma->vm_start + ((page->index - vma->vm_pgoff) << PAGE_SHIFT);
> 
> and from that address the pte can be found by walking the vma->vm_mm
> page tables.
> 
> So page->index does what you want it to, identify which part of the
> framebuffer this particular page belongs to.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Mon, 2007-10-29 at 21:22 -0400, Jaya Kumar wrote:
> On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> >
> > [ also, remap_vmalloc_range() suffers similar issues, only file and anon
> >   have proper rmap ]
> >
> > I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
> > we could assist drivers in maintaining and using vma lists.
> >
> > I think page_mkclean_one() would work if you'd manually set page->index
> > and iterate the vmas yourself. Although atm I'm not sure of anything so
> > don't pin me on it.
> 
> :-) If it's anybody's fault, it's mine for not testing properly. My bad.
> 
> In the case of defio, I think it's no trouble to build a list of vmas
> at mmap time and then to iterate through them when it's ready for
> mkclean time as you suggested. I don't fully understand page->index
> yet. I had thought it was only used by swap cache or file map.
> 
> On an unrelated note, I was looking for somewhere to stuff a 16 bit
> offset (so that I have a cheap way to know which struct page
> corresponds to which framebuffer block or offset) for another driver.
> I had thought page->index was it but I think I am wrong now.

Yeah, page->index is used along with vma->vmpgoff and vma->vm_start to
determine the address of the page in the given vma:

  address = vma->vm_start + ((page->index - vma->vm_pgoff) << PAGE_SHIFT);

and from that address the pte can be found by walking the vma->vm_mm
page tables.

So page->index does what you want it to, identify which part of the
framebuffer this particular page belongs to.


signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Mon, 2007-10-29 at 21:22 -0400, Jaya Kumar wrote:
 On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
 
  [ also, remap_vmalloc_range() suffers similar issues, only file and anon
have proper rmap ]
 
  I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
  we could assist drivers in maintaining and using vma lists.
 
  I think page_mkclean_one() would work if you'd manually set page-index
  and iterate the vmas yourself. Although atm I'm not sure of anything so
  don't pin me on it.
 
 :-) If it's anybody's fault, it's mine for not testing properly. My bad.
 
 In the case of defio, I think it's no trouble to build a list of vmas
 at mmap time and then to iterate through them when it's ready for
 mkclean time as you suggested. I don't fully understand page-index
 yet. I had thought it was only used by swap cache or file map.
 
 On an unrelated note, I was looking for somewhere to stuff a 16 bit
 offset (so that I have a cheap way to know which struct page
 corresponds to which framebuffer block or offset) for another driver.
 I had thought page-index was it but I think I am wrong now.

Yeah, page-index is used along with vma-vmpgoff and vma-vm_start to
determine the address of the page in the given vma:

  address = vma-vm_start + ((page-index - vma-vm_pgoff)  PAGE_SHIFT);

and from that address the pte can be found by walking the vma-vm_mm
page tables.

So page-index does what you want it to, identify which part of the
framebuffer this particular page belongs to.


signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Stefani Seibold
Hi,

the question is how can i get all pte's from a vmalloc'ed memory. Due to
the zeroed mapping pointer i dont see how to do this?


Am Dienstag, den 30.10.2007, 10:56 +0100 schrieb Peter Zijlstra:
 On Mon, 2007-10-29 at 21:22 -0400, Jaya Kumar wrote:
  On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
  
   [ also, remap_vmalloc_range() suffers similar issues, only file and anon
 have proper rmap ]
  
   I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
   we could assist drivers in maintaining and using vma lists.
  
   I think page_mkclean_one() would work if you'd manually set page-index
   and iterate the vmas yourself. Although atm I'm not sure of anything so
   don't pin me on it.
  
  :-) If it's anybody's fault, it's mine for not testing properly. My bad.
  
  In the case of defio, I think it's no trouble to build a list of vmas
  at mmap time and then to iterate through them when it's ready for
  mkclean time as you suggested. I don't fully understand page-index
  yet. I had thought it was only used by swap cache or file map.
  
  On an unrelated note, I was looking for somewhere to stuff a 16 bit
  offset (so that I have a cheap way to know which struct page
  corresponds to which framebuffer block or offset) for another driver.
  I had thought page-index was it but I think I am wrong now.
 Yeah, page-index is used along with vma-vmpgoff and vma-vm_start to
 determine the address of the page in the given vma:
 
   address = vma-vm_start + ((page-index - vma-vm_pgoff)  PAGE_SHIFT);
 
 and from that address the pte can be found by walking the vma-vm_mm
 page tables.
 
 So page-index does what you want it to, identify which part of the
 framebuffer this particular page belongs to.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Hugh Dickins
On Tue, 30 Oct 2007, Stefani Seibold wrote:
 
 the question is how can i get all pte's from a vmalloc'ed memory. Due to
 the zeroed mapping pointer i dont see how to do this?

The mapping pointer is zeroed because you've done nothing to set it.
Below is how I answered you a week ago.  But this is new territory
(extending page_mkclean to work on more than just pagecache pages),
I'm still unsure what would be the safest way to do it.

On Mon, 22 Oct 2007, Stefani Seibold wrote:
 
 i have a problem with vmalloc() and vm_ops.page_mkwrite().
 
 ReadOnly access works, but on a write access the VM will
 endless invoke the vm_ops.page_mkwrite() handler.
 
 I tracked down the problem to the
   struct page.mapping pointer,
 which is NULL.
 
 The problem original occurs with the fb_defio driver 
 (driver/video/fb_defio.c).
 This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
 pages,
 which will be in an extra thread handled, to perform the IO and clean and
 write protect all pages with page_clean().

Interesting.  You need to ask Jaya (CC'ed) since he's the one
who put that code into fb_defio.c, exported page_mkclean, and
should have tested it.

 
 I am not sure if the is a feature of the new rmap code or a bug.

page_mkclean was written in the belief that it was being used on
pagecache pages.  I'm not sure how deeply engrained that belief is.

If it can easily and safely be used on something else, that may be
nice: though there's a danger we'll keep breaking and re-breaking
it if there's only one driver using it in an unusual way.  CC'ed
Peter since he's the one who most needs to be aware of this.

 
 Is there an way to get a similar functionality? Currently, i have no
 idea
 how to get the ptep from a page alloced with vmalloc().

A pagecache page would have page-mapping initialized to point to
the struct address_space of the vma, and page-index to the offset
(in PAGE_SIZE units): see mm/filemap.c:add_to_page_cache.  Without
page-mapping set, page_mkclean_file won't be able to find the vmas
in which the page might appear; and without page-index set, it
won't be able to find where the page should be in those vmas.

If such a driver does not put its pages into the page cache (the
safer course? I'm unsure), then it needs to set page-mapping and
page-index appropriately (and reset page-mapping to NULL before
freeing).

Hugh
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 12:39 +, Hugh Dickins wrote:
 On Tue, 30 Oct 2007, Stefani Seibold wrote:
  
  the question is how can i get all pte's from a vmalloc'ed memory. Due to
  the zeroed mapping pointer i dont see how to do this?
 
 The mapping pointer is zeroed because you've done nothing to set it.
 Below is how I answered you a week ago.  But this is new territory
 (extending page_mkclean to work on more than just pagecache pages),
 I'm still unsure what would be the safest way to do it.

Quite, I think manual usage of page_mkclean_one() on the vma gotten from
mmap() along with properly setting page-index is the simplest solution
to make work.

Making page_mkclean(struct page *) work for remap_pfn/vmalloc_range()
style mmaps would require extending rmap to work with those, which
includes setting page-mapping to point to a anon_vma like object.

But that sounds like a lot of work, and I'm not sure its worth the
overhead, because so far all users of remap_pfn/vmalloc_range() have
survived without.




signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Jaya Kumar
On 10/30/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
 So page-index does what you want it to, identify which part of the
 framebuffer this particular page belongs to.

Ok. I'm attempting to walk the code sequence. Here's what I think:

- driver loads
- driver vmalloc()s its fb
- this creates the necessary pte entries
then...
- app mmap(/dev/fb0)
- vma is created
- defio mmap adds this vma to private list (equivalent of
address_space or anon_vma)
- app touches base + pixel(128,128) = base + 16k
- page fault
- defio nopage gets called
- defio nopage does vmalloc_to_page(base+16k)
- that finds the correct struct page corresponding to that vaddr.
page-index has not been set by anyone so far, right?
* ah... i see, you are suggesting that this is where I could set the
index since i know the offset i want it to represent. right?
- defio mkwrite get called. defio adds page to its list. schedules delayed work
- app keeps writing the page
- delayed work occurs
- foreach vma { foreach page { page_mkclean_one(page, vma) }
- cycle repeats...

Thanks,
jaya
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
 On 10/30/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
  So page-index does what you want it to, identify which part of the
  framebuffer this particular page belongs to.
 
 Ok. I'm attempting to walk the code sequence. Here's what I think:
 
 - driver loads
 - driver vmalloc()s its fb
 - this creates the necessary pte entries

well, one set thereof, the kernel mappings, which for this purpose are
the least interesting.

 then...
 - app mmap(/dev/fb0)
 - vma is created
 - defio mmap adds this vma to private list (equivalent of
 address_space or anon_vma)

 - app touches base + pixel(128,128) = base + 16k
 - page fault
 - defio nopage gets called
 - defio nopage does vmalloc_to_page(base+16k)

this installs a user space page table entry for your page; this is the
interesting one as it carries the user-dirty state.

 - that finds the correct struct page corresponding to that vaddr.
 page-index has not been set by anyone so far, right?
 * ah... i see, you are suggesting that this is where I could set the
 index since i know the offset i want it to represent. right?

Not quite, you would set that right after vmallocing, just set an
increasing page-index starting with 0 for the first page.

Then ensure your vma-vm_pgoff is 0 (which should be the case since
userspace will most likely mmap the whole thing, and if not it still
gets what it expects).

 - defio mkwrite get called. defio adds page to its list. schedules delayed 
 work
 - app keeps writing the page
 - delayed work occurs
 - foreach vma { foreach page { page_mkclean_one(page, vma) }

Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
user-space address for the page in this vma using page-index and the
formula from the last email, this address is then used to walk the page
tables and obtain a pte.

This will be the user-space pte installed by your nopfn handler. Not the
kernel vmap pte resulting from the vmalloc() call.

 - cycle repeats...




signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Hugh Dickins
On Tue, 30 Oct 2007, Peter Zijlstra wrote:
 On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
...
  - defio mmap adds this vma to private list (equivalent of
  address_space or anon_vma)
...
  - foreach vma { foreach page { page_mkclean_one(page, vma) }
 
 Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
 user-space address for the page in this vma using page-index and the
 formula from the last email, this address is then used to walk the page
 tables and obtain a pte.

I don't understand why you suggested an anon_vma, nor why Jaya is
suggesting a private list.  All vmas mapping /dev/fb0 will be kept
in the prio_tree rooted in its struct address_space (__vma_link_file
in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
very tree.  The missing part is just the setting of page-mapping to
point to that struct address_space (and clearing it before finally
freeing the pages), and the setting of page-index as you described.
Isn't it?

Hugh
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-30 Thread Peter Zijlstra
On Tue, 2007-10-30 at 15:47 +, Hugh Dickins wrote:
 On Tue, 30 Oct 2007, Peter Zijlstra wrote:
  On Tue, 2007-10-30 at 09:16 -0400, Jaya Kumar wrote:
 
   - defio mmap adds this vma to private list (equivalent of
   address_space or anon_vma)
 
   - foreach vma { foreach page { page_mkclean_one(page, vma) }
  
  Yeah, page_mkclean_one(page, vma) will use vma_address() to obtain an
  user-space address for the page in this vma using page-index and the
  formula from the last email, this address is then used to walk the page
  tables and obtain a pte.
 
 I don't understand why you suggested an anon_vma, nor why Jaya is
 suggesting a private list.  All vmas mapping /dev/fb0 will be kept
 in the prio_tree rooted in its struct address_space (__vma_link_file
 in mm/mmap.c).  And page_mkclean gets page_mkclean_file to walk that
 very tree.  The missing part is just the setting of page-mapping to
 point to that struct address_space (and clearing it before finally
 freeing the pages), and the setting of page-index as you described.
 Isn't it?

Hmm, there is a thought. I had not considered that mapping a chardev
would have that effect.

I'd have to have a look at the actual code, but yeah, that might very
well work out. How silly of me.

Thanks!


signature.asc
Description: This is a digitally signed message part


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
>
> [ also, remap_vmalloc_range() suffers similar issues, only file and anon
>   have proper rmap ]
>
> I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
> we could assist drivers in maintaining and using vma lists.
>
> I think page_mkclean_one() would work if you'd manually set page->index
> and iterate the vmas yourself. Although atm I'm not sure of anything so
> don't pin me on it.

:-) If it's anybody's fault, it's mine for not testing properly. My bad.

In the case of defio, I think it's no trouble to build a list of vmas
at mmap time and then to iterate through them when it's ready for
mkclean time as you suggested. I don't fully understand page->index
yet. I had thought it was only used by swap cache or file map.

On an unrelated note, I was looking for somewhere to stuff a 16 bit
offset (so that I have a cheap way to know which struct page
corresponds to which framebuffer block or offset) for another driver.
I had thought page->index was it but I think I am wrong now.

Thanks,
jaya
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra

On Mon, 2007-10-29 at 19:17 +0100, Peter Zijlstra wrote:
> On Mon, 2007-10-29 at 13:51 -0400, Jaya Kumar wrote:
> > On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > > On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> > > > An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> > > >
> > > > I understood from the thread that PeterZ is looking into page_mkclean
> > > > changes which I guess went into 2.6.23. I'm also happy to help in any
> > > > way if the way we're doing fb_defio needs to change.
> > >
> > > OK, seems I can't read. Or at least, I missed a large part of the
> > > problem.
> > >
> > > page_mkclean() hasn't changed, it was ->page_mkwrite() that changed. And
> > > looking at the fb_defio code, I'm not sure I understand how its
> > > page_mkclean() use could ever have worked.
> > >
> > > The proposed patch [1] only fixes the issue of ->page_mkwrite() on
> > > vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
> > > what I can make of it.
> > >
> > > Jaya, could you shed some light on this? I presume you had your display
> > > working.
> > >
> > 
> > I thought I had it working. I saw the display update after each
> > mmap/write sequence to the framebuffer. I need to check if there's an
> > munmap or anything else going on in between write sequences that would
> > cause it to behave like page_mkclean was working.
> > 
> > Is it correct to assume that page_mkclean should mark the pages
> > read-only so that the next write would again trigger mkwrite?
> 
> Well, yes, that is the intended behaviour.
> 
> >  Even if the page was from a vmalloc_to_page()?
> 
> That is the crux, I only ever implemented it for file pages.

Hmm, so these vmalloc pages are mapped into user-space with
remap_pfn_range(), which doesn't have any form of rmap. That is, given a
pfn there is no way to obtain all ptes for it. So the interface to
page_mkclean() could never work for these (as it only provides a struct
page *).

[ also, remap_vmalloc_range() suffers similar issues, only file and anon
  have proper rmap ]

I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
we could assist drivers in maintaining and using vma lists.

I think page_mkclean_one() would work if you'd manually set page->index
and iterate the vmas yourself. Although atm I'm not sure of anything so
don't pin me on it.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 13:51 -0400, Jaya Kumar wrote:
> On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> > > An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> > >
> > > I understood from the thread that PeterZ is looking into page_mkclean
> > > changes which I guess went into 2.6.23. I'm also happy to help in any
> > > way if the way we're doing fb_defio needs to change.
> >
> > OK, seems I can't read. Or at least, I missed a large part of the
> > problem.
> >
> > page_mkclean() hasn't changed, it was ->page_mkwrite() that changed. And
> > looking at the fb_defio code, I'm not sure I understand how its
> > page_mkclean() use could ever have worked.
> >
> > The proposed patch [1] only fixes the issue of ->page_mkwrite() on
> > vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
> > what I can make of it.
> >
> > Jaya, could you shed some light on this? I presume you had your display
> > working.
> >
> 
> I thought I had it working. I saw the display update after each
> mmap/write sequence to the framebuffer. I need to check if there's an
> munmap or anything else going on in between write sequences that would
> cause it to behave like page_mkclean was working.
> 
> Is it correct to assume that page_mkclean should mark the pages
> read-only so that the next write would again trigger mkwrite?

Well, yes, that is the intended behaviour.

>  Even if the page was from a vmalloc_to_page()?

That is the crux, I only ever implemented it for file pages.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> > An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> >
> > I understood from the thread that PeterZ is looking into page_mkclean
> > changes which I guess went into 2.6.23. I'm also happy to help in any
> > way if the way we're doing fb_defio needs to change.
>
> OK, seems I can't read. Or at least, I missed a large part of the
> problem.
>
> page_mkclean() hasn't changed, it was ->page_mkwrite() that changed. And
> looking at the fb_defio code, I'm not sure I understand how its
> page_mkclean() use could ever have worked.
>
> The proposed patch [1] only fixes the issue of ->page_mkwrite() on
> vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
> what I can make of it.
>
> Jaya, could you shed some light on this? I presume you had your display
> working.
>

I thought I had it working. I saw the display update after each
mmap/write sequence to the framebuffer. I need to check if there's an
munmap or anything else going on in between write sequences that would
cause it to behave like page_mkclean was working.

Is it correct to assume that page_mkclean should mark the pages
read-only so that the next write would again trigger mkwrite? Even if
the page was from a vmalloc_to_page()?

Thanks,
jaya
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> On 10/29/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold <[EMAIL PROTECTED]> 
> > wrote:
> > >
> > > The problem original occurs with the fb_defio driver 
> > > (driver/video/fb_defio.c).
> > > This driver use the vm_ops.page_mkwrite() handler for tracking the 
> > > modified pages,
> > > which will be in an extra thread handled, to perform the IO and clean and
> > > write protect all pages with page_clean().
> > >

> An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> 
> I understood from the thread that PeterZ is looking into page_mkclean
> changes which I guess went into 2.6.23. I'm also happy to help in any
> way if the way we're doing fb_defio needs to change.

OK, seems I can't read. Or at least, I missed a large part of the
problem.

page_mkclean() hasn't changed, it was ->page_mkwrite() that changed. And
looking at the fb_defio code, I'm not sure I understand how its
page_mkclean() use could ever have worked.

The proposed patch [1] only fixes the issue of ->page_mkwrite() on
vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
what I can make of it.

Jaya, could you shed some light on this? I presume you had your display
working.


[1] which I will clean up and resend after this issue is cleared up -
and preferably tested by someone who has this hardware.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Nick Piggin
On Monday 29 October 2007 23:35, Peter Zijlstra wrote:
> On Mon, 2007-10-29 at 11:11 +0100, Peter Zijlstra wrote:
> > On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> > > On 10/29/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > > On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold 
<[EMAIL PROTECTED]> wrote:
> > > > > The problem original occurs with the fb_defio driver
> > > > > (driver/video/fb_defio.c). This driver use the
> > > > > vm_ops.page_mkwrite() handler for tracking the modified pages,
> > > > > which will be in an extra thread handled, to perform the IO and
> > > > > clean and write protect all pages with page_clean().
> > >
> > > Hi,
> > >
> > > An aside, I just tested that deferred IO works fine on
> > > 2.6.22.10/pxa255.
> > >
> > > I understood from the thread that PeterZ is looking into page_mkclean
> > > changes which I guess went into 2.6.23. I'm also happy to help in any
> > > way if the way we're doing fb_defio needs to change.
> >
> > Yeah, its the truncate race stuff introduced by Nick in
> >   d0217ac04ca6591841e5665f518e38064f4e65bd
> >
> > I'm a bit at a loss on how to go around fixing this. One ugly idea I had
> > was to check page->mapping before going into page_mkwrite() and when
> > that is null, don't bother with the truncate check.
>
> Something like this


I think it's a fine minimal patch. Maybe add a comment to say exactly
what we're doing here (pagecache generally just uses !mapping to test
for truncate).

Otherwise, Acked-by: Nick Piggin <[EMAIL PROTECTED]>, thanks!

> ---
>  mm/memory.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/mm/memory.c
> ===
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -2300,6 +2300,8 @@ static int __do_fault(struct mm_struct *
>* to become writable
>*/
>   if (vma->vm_ops->page_mkwrite) {
> + struct address_space *mapping = page->mapping;
> +
>   unlock_page(page);
>   if (vma->vm_ops->page_mkwrite(vma, page) < 0) {
>   ret = VM_FAULT_SIGBUS;
> @@ -2314,7 +2316,7 @@ static int __do_fault(struct mm_struct *
>* reworking page_mkwrite locking API, which
>* is better done later.
>*/
> - if (!page->mapping) {
> + if (mapping != page->mapping) {
>   ret = 0;
>   anon = 1; /* no anon but release 
> vmf.page */
>   goto out;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 11:11 +0100, Peter Zijlstra wrote:
> On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> > On 10/29/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold <[EMAIL PROTECTED]> 
> > > wrote:
> > > >
> > > > The problem original occurs with the fb_defio driver 
> > > > (driver/video/fb_defio.c).
> > > > This driver use the vm_ops.page_mkwrite() handler for tracking the 
> > > > modified pages,
> > > > which will be in an extra thread handled, to perform the IO and clean 
> > > > and
> > > > write protect all pages with page_clean().
> > > >
> > 
> > Hi,
> > 
> > An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> > 
> > I understood from the thread that PeterZ is looking into page_mkclean
> > changes which I guess went into 2.6.23. I'm also happy to help in any
> > way if the way we're doing fb_defio needs to change.
> 
> Yeah, its the truncate race stuff introduced by Nick in
>   d0217ac04ca6591841e5665f518e38064f4e65bd
> 
> I'm a bit at a loss on how to go around fixing this. One ugly idea I had
> was to check page->mapping before going into page_mkwrite() and when
> that is null, don't bother with the truncate check.

Something like this

---
 mm/memory.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/memory.c
===
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -2300,6 +2300,8 @@ static int __do_fault(struct mm_struct *
 * to become writable
 */
if (vma->vm_ops->page_mkwrite) {
+   struct address_space *mapping = page->mapping;
+
unlock_page(page);
if (vma->vm_ops->page_mkwrite(vma, page) < 0) {
ret = VM_FAULT_SIGBUS;
@@ -2314,7 +2316,7 @@ static int __do_fault(struct mm_struct *
 * reworking page_mkwrite locking API, which
 * is better done later.
 */
-   if (!page->mapping) {
+   if (mapping != page->mapping) {
ret = 0;
anon = 1; /* no anon but release 
vmf.page */
goto out;


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
> On 10/29/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold <[EMAIL PROTECTED]> 
> > wrote:
> > >
> > > The problem original occurs with the fb_defio driver 
> > > (driver/video/fb_defio.c).
> > > This driver use the vm_ops.page_mkwrite() handler for tracking the 
> > > modified pages,
> > > which will be in an extra thread handled, to perform the IO and clean and
> > > write protect all pages with page_clean().
> > >
> 
> Hi,
> 
> An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
> 
> I understood from the thread that PeterZ is looking into page_mkclean
> changes which I guess went into 2.6.23. I'm also happy to help in any
> way if the way we're doing fb_defio needs to change.

Yeah, its the truncate race stuff introduced by Nick in
  d0217ac04ca6591841e5665f518e38064f4e65bd

I'm a bit at a loss on how to go around fixing this. One ugly idea I had
was to check page->mapping before going into page_mkwrite() and when
that is null, don't bother with the truncate check.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold <[EMAIL PROTECTED]> wrote:
> >
> > The problem original occurs with the fb_defio driver 
> > (driver/video/fb_defio.c).
> > This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
> > pages,
> > which will be in an extra thread handled, to perform the IO and clean and
> > write protect all pages with page_clean().
> >

Hi,

An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.

I understood from the thread that PeterZ is looking into page_mkclean
changes which I guess went into 2.6.23. I'm also happy to help in any
way if the way we're doing fb_defio needs to change.

Thanks,
jaya
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Andrew Morton
On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> i have a problem with vmalloc() and vm_ops.page_mkwrite().
> 
> ReadOnly access works, but on a write access the VM will
> endless invoke the vm_ops.page_mkwrite() handler.
> 
> I tracked down the problem to the
>   struct page.mapping pointer,
> which is NULL.
> 
> The problem original occurs with the fb_defio driver 
> (driver/video/fb_defio.c).
> This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
> pages,
> which will be in an extra thread handled, to perform the IO and clean and
> write protect all pages with page_clean().
> 
> I am not sure if the is a feature of the new rmap code or a bug.
> 
> Is there an way to get a similar functionality? Currently, i have no
> idea
> how to get the ptep from a page alloced with vmalloc().
> 
> Greetings,
> Stefani
> 
> Here is a small sample driver:
> 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> #define   DEVICE_NAME "mydrv"
> #define   DEVICE_MAJOR240
> 
> static u8 *mydrv_memory;
> static const u_long mydrv_memory_size=1024*1024; // 1 Megabyte
> 
> #ifdef MODULE
> static u_long vmas=0;
> #endif
> 
> static void mydrv_vma_open(struct vm_area_struct *vma)
> {
> #ifdef MODULE
>   if (vmas++==0)
>   try_module_get(THIS_MODULE);
> #endif
> }
> 
> static void mydrv_vma_close(struct vm_area_struct *vma)
> {
> #ifdef MODULE
>   if (--vmas==0)
>   module_put(THIS_MODULE);
> #endif
> }
> 
> struct page *mydrv_vma_nopage(struct vm_area_struct *vma,unsigned long
> address,int *type)
> {
>   unsigned long offset;
>   struct page *page;
> 
>   offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
>   printk("--> mydrv_vma_nopage:%lu(%08lx)\n",offset,offset);
> 
>   if (offset >= mydrv_memory_size)
>   return NOPAGE_SIGBUS;
> 
>   page = vmalloc_to_page(mydrv_memory + offset);
>   if (!page)
>   return NOPAGE_OOM;
> 
>   get_page(page);
>   if (type)
>   *type = VM_FAULT_MINOR;
>   return page;
> }
> 
> int mydrv_mkwrite(struct vm_area_struct *vma,struct page *page)
> {
>   printk("--> mydrv_mkwrite:%p\n",page->mapping);
> 
>   return 0;
> }
> 
> struct vm_operations_struct mydrv_vm_ops = {
>   .open = mydrv_vma_open,
>   .close =mydrv_vma_close,
>   .nopage =   mydrv_vma_nopage,
>   .page_mkwrite = mydrv_mkwrite,
> };
> 
> static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
> {
>   vma->vm_ops = _vm_ops;
>   vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
>   vma->vm_private_data = 0;
>   mydrv_vma_open(vma);
>   return 0;
> }
> 
> int mydrv_open(struct inode *inode, struct file *file)
> {
>   int minor;
> 
>   minor=MINOR(file->f_dentry->d_inode->i_rdev);
>   printk("--> mydrv_open called for minor: %d\n", minor);
> 
>   if (minor>0) {
>   printk("--> mydrv_open minor %d failed\n",minor);
>   return -ENODEV;
>   }
> 
>   printk("--> mydrv_open o.K.\n");
>   return 0;
> }
> 
> 
> int mydrv_close(struct inode *inode, struct file *file)
> {
>   int minor;
> 
>   minor=MINOR(file->f_dentry->d_inode->i_rdev);
>   printk("--> mydrv_close for minor: %d\n", minor);
>   return 0;
> }
> 
> struct file_operations mydrv_fops = {
>   .owner=THIS_MODULE,
>   .mmap=mydrv_mmap,
>   .open=mydrv_open,
>   .release=mydrv_close,
> };
> 
> int init_module(void)
> {
>   printk("--> init_module called\n");
> 
>   if (!(mydrv_memory = vmalloc(mydrv_memory_size)))
>   return -ENOMEM;
> 
>   if (register_chrdev(DEVICE_MAJOR, DEVICE_NAME, _fops) < 0) {
>   printk("--> Error registering driver.\n");
>   return -ENODEV;
>   }
> 
>   memset(mydrv_memory, 0, mydrv_memory_size);
>   printk("--> init_module done\n");
>   return 0;
> }
> 
> 
> void cleanup_module(void)
> {
>   printk("--> cleanup_module called\n");
>   unregister_chrdev(DEVICE_MAJOR,DEVICE_NAME);
>   vfree(mydrv_memory);
>   printk("--> cleanup_module done\n");
> }
> 
> MODULE_LICENSE("GPL");
> MODULE_AUTHOR("Stefan Seibold <[EMAIL PROTECTED]>");
> MODULE_DESCRIPTION("Test Driver");
> 

(cc's added)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Andrew Morton
On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold [EMAIL PROTECTED] wrote:

 Hi,
 
 i have a problem with vmalloc() and vm_ops.page_mkwrite().
 
 ReadOnly access works, but on a write access the VM will
 endless invoke the vm_ops.page_mkwrite() handler.
 
 I tracked down the problem to the
   struct page.mapping pointer,
 which is NULL.
 
 The problem original occurs with the fb_defio driver 
 (driver/video/fb_defio.c).
 This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
 pages,
 which will be in an extra thread handled, to perform the IO and clean and
 write protect all pages with page_clean().
 
 I am not sure if the is a feature of the new rmap code or a bug.
 
 Is there an way to get a similar functionality? Currently, i have no
 idea
 how to get the ptep from a page alloced with vmalloc().
 
 Greetings,
 Stefani
 
 Here is a small sample driver:
 
 #include linux/module.h
 #include linux/errno.h
 #include linux/string.h
 #include linux/mm.h
 #include linux/vmalloc.h
 #include linux/fs.h
 
 #define   DEVICE_NAME mydrv
 #define   DEVICE_MAJOR240
 
 static u8 *mydrv_memory;
 static const u_long mydrv_memory_size=1024*1024; // 1 Megabyte
 
 #ifdef MODULE
 static u_long vmas=0;
 #endif
 
 static void mydrv_vma_open(struct vm_area_struct *vma)
 {
 #ifdef MODULE
   if (vmas++==0)
   try_module_get(THIS_MODULE);
 #endif
 }
 
 static void mydrv_vma_close(struct vm_area_struct *vma)
 {
 #ifdef MODULE
   if (--vmas==0)
   module_put(THIS_MODULE);
 #endif
 }
 
 struct page *mydrv_vma_nopage(struct vm_area_struct *vma,unsigned long
 address,int *type)
 {
   unsigned long offset;
   struct page *page;
 
   offset = (address - vma-vm_start) + (vma-vm_pgoff  PAGE_SHIFT);
   printk(-- mydrv_vma_nopage:%lu(%08lx)\n,offset,offset);
 
   if (offset = mydrv_memory_size)
   return NOPAGE_SIGBUS;
 
   page = vmalloc_to_page(mydrv_memory + offset);
   if (!page)
   return NOPAGE_OOM;
 
   get_page(page);
   if (type)
   *type = VM_FAULT_MINOR;
   return page;
 }
 
 int mydrv_mkwrite(struct vm_area_struct *vma,struct page *page)
 {
   printk(-- mydrv_mkwrite:%p\n,page-mapping);
 
   return 0;
 }
 
 struct vm_operations_struct mydrv_vm_ops = {
   .open = mydrv_vma_open,
   .close =mydrv_vma_close,
   .nopage =   mydrv_vma_nopage,
   .page_mkwrite = mydrv_mkwrite,
 };
 
 static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
 {
   vma-vm_ops = mydrv_vm_ops;
   vma-vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
   vma-vm_private_data = 0;
   mydrv_vma_open(vma);
   return 0;
 }
 
 int mydrv_open(struct inode *inode, struct file *file)
 {
   int minor;
 
   minor=MINOR(file-f_dentry-d_inode-i_rdev);
   printk(-- mydrv_open called for minor: %d\n, minor);
 
   if (minor0) {
   printk(-- mydrv_open minor %d failed\n,minor);
   return -ENODEV;
   }
 
   printk(-- mydrv_open o.K.\n);
   return 0;
 }
 
 
 int mydrv_close(struct inode *inode, struct file *file)
 {
   int minor;
 
   minor=MINOR(file-f_dentry-d_inode-i_rdev);
   printk(-- mydrv_close for minor: %d\n, minor);
   return 0;
 }
 
 struct file_operations mydrv_fops = {
   .owner=THIS_MODULE,
   .mmap=mydrv_mmap,
   .open=mydrv_open,
   .release=mydrv_close,
 };
 
 int init_module(void)
 {
   printk(-- init_module called\n);
 
   if (!(mydrv_memory = vmalloc(mydrv_memory_size)))
   return -ENOMEM;
 
   if (register_chrdev(DEVICE_MAJOR, DEVICE_NAME, mydrv_fops)  0) {
   printk(-- Error registering driver.\n);
   return -ENODEV;
   }
 
   memset(mydrv_memory, 0, mydrv_memory_size);
   printk(-- init_module done\n);
   return 0;
 }
 
 
 void cleanup_module(void)
 {
   printk(-- cleanup_module called\n);
   unregister_chrdev(DEVICE_MAJOR,DEVICE_NAME);
   vfree(mydrv_memory);
   printk(-- cleanup_module done\n);
 }
 
 MODULE_LICENSE(GPL);
 MODULE_AUTHOR(Stefan Seibold [EMAIL PROTECTED]);
 MODULE_DESCRIPTION(Test Driver);
 

(cc's added)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Andrew Morton [EMAIL PROTECTED] wrote:
 On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold [EMAIL PROTECTED] wrote:
 
  The problem original occurs with the fb_defio driver 
  (driver/video/fb_defio.c).
  This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
  pages,
  which will be in an extra thread handled, to perform the IO and clean and
  write protect all pages with page_clean().
 

Hi,

An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.

I understood from the thread that PeterZ is looking into page_mkclean
changes which I guess went into 2.6.23. I'm also happy to help in any
way if the way we're doing fb_defio needs to change.

Thanks,
jaya
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
 On 10/29/07, Andrew Morton [EMAIL PROTECTED] wrote:
  On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold [EMAIL PROTECTED] 
  wrote:
  
   The problem original occurs with the fb_defio driver 
   (driver/video/fb_defio.c).
   This driver use the vm_ops.page_mkwrite() handler for tracking the 
   modified pages,
   which will be in an extra thread handled, to perform the IO and clean and
   write protect all pages with page_clean().
  
 
 Hi,
 
 An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
 
 I understood from the thread that PeterZ is looking into page_mkclean
 changes which I guess went into 2.6.23. I'm also happy to help in any
 way if the way we're doing fb_defio needs to change.

Yeah, its the truncate race stuff introduced by Nick in
  d0217ac04ca6591841e5665f518e38064f4e65bd

I'm a bit at a loss on how to go around fixing this. One ugly idea I had
was to check page-mapping before going into page_mkwrite() and when
that is null, don't bother with the truncate check.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 11:11 +0100, Peter Zijlstra wrote:
 On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
  On 10/29/07, Andrew Morton [EMAIL PROTECTED] wrote:
   On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold [EMAIL PROTECTED] 
   wrote:
   
The problem original occurs with the fb_defio driver 
(driver/video/fb_defio.c).
This driver use the vm_ops.page_mkwrite() handler for tracking the 
modified pages,
which will be in an extra thread handled, to perform the IO and clean 
and
write protect all pages with page_clean().
   
  
  Hi,
  
  An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
  
  I understood from the thread that PeterZ is looking into page_mkclean
  changes which I guess went into 2.6.23. I'm also happy to help in any
  way if the way we're doing fb_defio needs to change.
 
 Yeah, its the truncate race stuff introduced by Nick in
   d0217ac04ca6591841e5665f518e38064f4e65bd
 
 I'm a bit at a loss on how to go around fixing this. One ugly idea I had
 was to check page-mapping before going into page_mkwrite() and when
 that is null, don't bother with the truncate check.

Something like this

---
 mm/memory.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/memory.c
===
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -2300,6 +2300,8 @@ static int __do_fault(struct mm_struct *
 * to become writable
 */
if (vma-vm_ops-page_mkwrite) {
+   struct address_space *mapping = page-mapping;
+
unlock_page(page);
if (vma-vm_ops-page_mkwrite(vma, page)  0) {
ret = VM_FAULT_SIGBUS;
@@ -2314,7 +2316,7 @@ static int __do_fault(struct mm_struct *
 * reworking page_mkwrite locking API, which
 * is better done later.
 */
-   if (!page-mapping) {
+   if (mapping != page-mapping) {
ret = 0;
anon = 1; /* no anon but release 
vmf.page */
goto out;


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Nick Piggin
On Monday 29 October 2007 23:35, Peter Zijlstra wrote:
 On Mon, 2007-10-29 at 11:11 +0100, Peter Zijlstra wrote:
  On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
   On 10/29/07, Andrew Morton [EMAIL PROTECTED] wrote:
On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold 
[EMAIL PROTECTED] wrote:
 The problem original occurs with the fb_defio driver
 (driver/video/fb_defio.c). This driver use the
 vm_ops.page_mkwrite() handler for tracking the modified pages,
 which will be in an extra thread handled, to perform the IO and
 clean and write protect all pages with page_clean().
  
   Hi,
  
   An aside, I just tested that deferred IO works fine on
   2.6.22.10/pxa255.
  
   I understood from the thread that PeterZ is looking into page_mkclean
   changes which I guess went into 2.6.23. I'm also happy to help in any
   way if the way we're doing fb_defio needs to change.
 
  Yeah, its the truncate race stuff introduced by Nick in
d0217ac04ca6591841e5665f518e38064f4e65bd
 
  I'm a bit at a loss on how to go around fixing this. One ugly idea I had
  was to check page-mapping before going into page_mkwrite() and when
  that is null, don't bother with the truncate check.

 Something like this


I think it's a fine minimal patch. Maybe add a comment to say exactly
what we're doing here (pagecache generally just uses !mapping to test
for truncate).

Otherwise, Acked-by: Nick Piggin [EMAIL PROTECTED], thanks!

 ---
  mm/memory.c |4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

 Index: linux-2.6/mm/memory.c
 ===
 --- linux-2.6.orig/mm/memory.c
 +++ linux-2.6/mm/memory.c
 @@ -2300,6 +2300,8 @@ static int __do_fault(struct mm_struct *
* to become writable
*/
   if (vma-vm_ops-page_mkwrite) {
 + struct address_space *mapping = page-mapping;
 +
   unlock_page(page);
   if (vma-vm_ops-page_mkwrite(vma, page)  0) {
   ret = VM_FAULT_SIGBUS;
 @@ -2314,7 +2316,7 @@ static int __do_fault(struct mm_struct *
* reworking page_mkwrite locking API, which
* is better done later.
*/
 - if (!page-mapping) {
 + if (mapping != page-mapping) {
   ret = 0;
   anon = 1; /* no anon but release 
 vmf.page */
   goto out;
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
 On 10/29/07, Andrew Morton [EMAIL PROTECTED] wrote:
  On Mon, 22 Oct 2007 16:40:57 +0200 Stefani Seibold [EMAIL PROTECTED] 
  wrote:
  
   The problem original occurs with the fb_defio driver 
   (driver/video/fb_defio.c).
   This driver use the vm_ops.page_mkwrite() handler for tracking the 
   modified pages,
   which will be in an extra thread handled, to perform the IO and clean and
   write protect all pages with page_clean().
  

 An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
 
 I understood from the thread that PeterZ is looking into page_mkclean
 changes which I guess went into 2.6.23. I'm also happy to help in any
 way if the way we're doing fb_defio needs to change.

OK, seems I can't read. Or at least, I missed a large part of the
problem.

page_mkclean() hasn't changed, it was -page_mkwrite() that changed. And
looking at the fb_defio code, I'm not sure I understand how its
page_mkclean() use could ever have worked.

The proposed patch [1] only fixes the issue of -page_mkwrite() on
vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
what I can make of it.

Jaya, could you shed some light on this? I presume you had your display
working.


[1] which I will clean up and resend after this issue is cleared up -
and preferably tested by someone who has this hardware.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra
On Mon, 2007-10-29 at 13:51 -0400, Jaya Kumar wrote:
 On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
  On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
   An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
  
   I understood from the thread that PeterZ is looking into page_mkclean
   changes which I guess went into 2.6.23. I'm also happy to help in any
   way if the way we're doing fb_defio needs to change.
 
  OK, seems I can't read. Or at least, I missed a large part of the
  problem.
 
  page_mkclean() hasn't changed, it was -page_mkwrite() that changed. And
  looking at the fb_defio code, I'm not sure I understand how its
  page_mkclean() use could ever have worked.
 
  The proposed patch [1] only fixes the issue of -page_mkwrite() on
  vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
  what I can make of it.
 
  Jaya, could you shed some light on this? I presume you had your display
  working.
 
 
 I thought I had it working. I saw the display update after each
 mmap/write sequence to the framebuffer. I need to check if there's an
 munmap or anything else going on in between write sequences that would
 cause it to behave like page_mkclean was working.
 
 Is it correct to assume that page_mkclean should mark the pages
 read-only so that the next write would again trigger mkwrite?

Well, yes, that is the intended behaviour.

  Even if the page was from a vmalloc_to_page()?

That is the crux, I only ever implemented it for file pages.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
 On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
  An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
 
  I understood from the thread that PeterZ is looking into page_mkclean
  changes which I guess went into 2.6.23. I'm also happy to help in any
  way if the way we're doing fb_defio needs to change.

 OK, seems I can't read. Or at least, I missed a large part of the
 problem.

 page_mkclean() hasn't changed, it was -page_mkwrite() that changed. And
 looking at the fb_defio code, I'm not sure I understand how its
 page_mkclean() use could ever have worked.

 The proposed patch [1] only fixes the issue of -page_mkwrite() on
 vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
 what I can make of it.

 Jaya, could you shed some light on this? I presume you had your display
 working.


I thought I had it working. I saw the display update after each
mmap/write sequence to the framebuffer. I need to check if there's an
munmap or anything else going on in between write sequences that would
cause it to behave like page_mkclean was working.

Is it correct to assume that page_mkclean should mark the pages
read-only so that the next write would again trigger mkwrite? Even if
the page was from a vmalloc_to_page()?

Thanks,
jaya
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Peter Zijlstra

On Mon, 2007-10-29 at 19:17 +0100, Peter Zijlstra wrote:
 On Mon, 2007-10-29 at 13:51 -0400, Jaya Kumar wrote:
  On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:
   On Mon, 2007-10-29 at 01:17 -0700, Jaya Kumar wrote:
An aside, I just tested that deferred IO works fine on 2.6.22.10/pxa255.
   
I understood from the thread that PeterZ is looking into page_mkclean
changes which I guess went into 2.6.23. I'm also happy to help in any
way if the way we're doing fb_defio needs to change.
  
   OK, seems I can't read. Or at least, I missed a large part of the
   problem.
  
   page_mkclean() hasn't changed, it was -page_mkwrite() that changed. And
   looking at the fb_defio code, I'm not sure I understand how its
   page_mkclean() use could ever have worked.
  
   The proposed patch [1] only fixes the issue of -page_mkwrite() on
   vmalloc()'ed memory. Not page_mkclean(), and that has never worked from
   what I can make of it.
  
   Jaya, could you shed some light on this? I presume you had your display
   working.
  
  
  I thought I had it working. I saw the display update after each
  mmap/write sequence to the framebuffer. I need to check if there's an
  munmap or anything else going on in between write sequences that would
  cause it to behave like page_mkclean was working.
  
  Is it correct to assume that page_mkclean should mark the pages
  read-only so that the next write would again trigger mkwrite?
 
 Well, yes, that is the intended behaviour.
 
   Even if the page was from a vmalloc_to_page()?
 
 That is the crux, I only ever implemented it for file pages.

Hmm, so these vmalloc pages are mapped into user-space with
remap_pfn_range(), which doesn't have any form of rmap. That is, given a
pfn there is no way to obtain all ptes for it. So the interface to
page_mkclean() could never work for these (as it only provides a struct
page *).

[ also, remap_vmalloc_range() suffers similar issues, only file and anon
  have proper rmap ]

I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
we could assist drivers in maintaining and using vma lists.

I think page_mkclean_one() would work if you'd manually set page-index
and iterate the vmas yourself. Although atm I'm not sure of anything so
don't pin me on it.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-29 Thread Jaya Kumar
On 10/29/07, Peter Zijlstra [EMAIL PROTECTED] wrote:

 [ also, remap_vmalloc_range() suffers similar issues, only file and anon
   have proper rmap ]

 I'm not sure we want full rmap for remap_pfn/vmalloc_range, but perhaps
 we could assist drivers in maintaining and using vma lists.

 I think page_mkclean_one() would work if you'd manually set page-index
 and iterate the vmas yourself. Although atm I'm not sure of anything so
 don't pin me on it.

:-) If it's anybody's fault, it's mine for not testing properly. My bad.

In the case of defio, I think it's no trouble to build a list of vmas
at mmap time and then to iterate through them when it's ready for
mkclean time as you suggested. I don't fully understand page-index
yet. I had thought it was only used by swap cache or file map.

On an unrelated note, I was looking for somewhere to stuff a 16 bit
offset (so that I have a cheap way to know which struct page
corresponds to which framebuffer block or offset) for another driver.
I had thought page-index was it but I think I am wrong now.

Thanks,
jaya
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-22 Thread Stefani Seibold
Hi,

i have a problem with vmalloc() and vm_ops.page_mkwrite().

ReadOnly access works, but on a write access the VM will
endless invoke the vm_ops.page_mkwrite() handler.

I tracked down the problem to the
struct page.mapping pointer,
which is NULL.

The problem original occurs with the fb_defio driver (driver/video/fb_defio.c).
This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
pages,
which will be in an extra thread handled, to perform the IO and clean and
write protect all pages with page_clean().

I am not sure if the is a feature of the new rmap code or a bug.

Is there an way to get a similar functionality? Currently, i have no
idea
how to get the ptep from a page alloced with vmalloc().

Greetings,
Stefani

Here is a small sample driver:

#include 
#include 
#include 
#include 
#include 
#include 

#define DEVICE_NAME "mydrv"
#define DEVICE_MAJOR240

static u8 *mydrv_memory;
static const u_long mydrv_memory_size=1024*1024; // 1 Megabyte

#ifdef MODULE
static u_long vmas=0;
#endif

static void mydrv_vma_open(struct vm_area_struct *vma)
{
#ifdef MODULE
if (vmas++==0)
try_module_get(THIS_MODULE);
#endif
}

static void mydrv_vma_close(struct vm_area_struct *vma)
{
#ifdef MODULE
if (--vmas==0)
module_put(THIS_MODULE);
#endif
}

struct page *mydrv_vma_nopage(struct vm_area_struct *vma,unsigned long
address,int *type)
{
unsigned long offset;
struct page *page;

offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
printk("--> mydrv_vma_nopage:%lu(%08lx)\n",offset,offset);

if (offset >= mydrv_memory_size)
return NOPAGE_SIGBUS;

page = vmalloc_to_page(mydrv_memory + offset);
if (!page)
return NOPAGE_OOM;

get_page(page);
if (type)
*type = VM_FAULT_MINOR;
return page;
}

int mydrv_mkwrite(struct vm_area_struct *vma,struct page *page)
{
printk("--> mydrv_mkwrite:%p\n",page->mapping);

return 0;
}

struct vm_operations_struct mydrv_vm_ops = {
.open = mydrv_vma_open,
.close =mydrv_vma_close,
.nopage =   mydrv_vma_nopage,
.page_mkwrite = mydrv_mkwrite,
};

static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
{
vma->vm_ops = _vm_ops;
vma->vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
vma->vm_private_data = 0;
mydrv_vma_open(vma);
return 0;
}

int mydrv_open(struct inode *inode, struct file *file)
{
int minor;

minor=MINOR(file->f_dentry->d_inode->i_rdev);
printk("--> mydrv_open called for minor: %d\n", minor);

if (minor>0) {
printk("--> mydrv_open minor %d failed\n",minor);
return -ENODEV;
}

printk("--> mydrv_open o.K.\n");
return 0;
}


int mydrv_close(struct inode *inode, struct file *file)
{
int minor;

minor=MINOR(file->f_dentry->d_inode->i_rdev);
printk("--> mydrv_close for minor: %d\n", minor);
return 0;
}

struct file_operations mydrv_fops = {
.owner=THIS_MODULE,
.mmap=mydrv_mmap,
.open=mydrv_open,
.release=mydrv_close,
};

int init_module(void)
{
printk("--> init_module called\n");

if (!(mydrv_memory = vmalloc(mydrv_memory_size)))
return -ENOMEM;

if (register_chrdev(DEVICE_MAJOR, DEVICE_NAME, _fops) < 0) {
printk("--> Error registering driver.\n");
return -ENODEV;
}

memset(mydrv_memory, 0, mydrv_memory_size);
printk("--> init_module done\n");
return 0;
}


void cleanup_module(void)
{
printk("--> cleanup_module called\n");
unregister_chrdev(DEVICE_MAJOR,DEVICE_NAME);
vfree(mydrv_memory);
printk("--> cleanup_module done\n");
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Stefan Seibold <[EMAIL PROTECTED]>");
MODULE_DESCRIPTION("Test Driver");


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


vm_ops.page_mkwrite() fails with vmalloc on 2.6.23

2007-10-22 Thread Stefani Seibold
Hi,

i have a problem with vmalloc() and vm_ops.page_mkwrite().

ReadOnly access works, but on a write access the VM will
endless invoke the vm_ops.page_mkwrite() handler.

I tracked down the problem to the
struct page.mapping pointer,
which is NULL.

The problem original occurs with the fb_defio driver (driver/video/fb_defio.c).
This driver use the vm_ops.page_mkwrite() handler for tracking the modified 
pages,
which will be in an extra thread handled, to perform the IO and clean and
write protect all pages with page_clean().

I am not sure if the is a feature of the new rmap code or a bug.

Is there an way to get a similar functionality? Currently, i have no
idea
how to get the ptep from a page alloced with vmalloc().

Greetings,
Stefani

Here is a small sample driver:

#include linux/module.h
#include linux/errno.h
#include linux/string.h
#include linux/mm.h
#include linux/vmalloc.h
#include linux/fs.h

#define DEVICE_NAME mydrv
#define DEVICE_MAJOR240

static u8 *mydrv_memory;
static const u_long mydrv_memory_size=1024*1024; // 1 Megabyte

#ifdef MODULE
static u_long vmas=0;
#endif

static void mydrv_vma_open(struct vm_area_struct *vma)
{
#ifdef MODULE
if (vmas++==0)
try_module_get(THIS_MODULE);
#endif
}

static void mydrv_vma_close(struct vm_area_struct *vma)
{
#ifdef MODULE
if (--vmas==0)
module_put(THIS_MODULE);
#endif
}

struct page *mydrv_vma_nopage(struct vm_area_struct *vma,unsigned long
address,int *type)
{
unsigned long offset;
struct page *page;

offset = (address - vma-vm_start) + (vma-vm_pgoff  PAGE_SHIFT);
printk(-- mydrv_vma_nopage:%lu(%08lx)\n,offset,offset);

if (offset = mydrv_memory_size)
return NOPAGE_SIGBUS;

page = vmalloc_to_page(mydrv_memory + offset);
if (!page)
return NOPAGE_OOM;

get_page(page);
if (type)
*type = VM_FAULT_MINOR;
return page;
}

int mydrv_mkwrite(struct vm_area_struct *vma,struct page *page)
{
printk(-- mydrv_mkwrite:%p\n,page-mapping);

return 0;
}

struct vm_operations_struct mydrv_vm_ops = {
.open = mydrv_vma_open,
.close =mydrv_vma_close,
.nopage =   mydrv_vma_nopage,
.page_mkwrite = mydrv_mkwrite,
};

static int mydrv_mmap(struct file *file, struct vm_area_struct *vma)
{
vma-vm_ops = mydrv_vm_ops;
vma-vm_flags |= ( VM_IO | VM_RESERVED | VM_DONTEXPAND );
vma-vm_private_data = 0;
mydrv_vma_open(vma);
return 0;
}

int mydrv_open(struct inode *inode, struct file *file)
{
int minor;

minor=MINOR(file-f_dentry-d_inode-i_rdev);
printk(-- mydrv_open called for minor: %d\n, minor);

if (minor0) {
printk(-- mydrv_open minor %d failed\n,minor);
return -ENODEV;
}

printk(-- mydrv_open o.K.\n);
return 0;
}


int mydrv_close(struct inode *inode, struct file *file)
{
int minor;

minor=MINOR(file-f_dentry-d_inode-i_rdev);
printk(-- mydrv_close for minor: %d\n, minor);
return 0;
}

struct file_operations mydrv_fops = {
.owner=THIS_MODULE,
.mmap=mydrv_mmap,
.open=mydrv_open,
.release=mydrv_close,
};

int init_module(void)
{
printk(-- init_module called\n);

if (!(mydrv_memory = vmalloc(mydrv_memory_size)))
return -ENOMEM;

if (register_chrdev(DEVICE_MAJOR, DEVICE_NAME, mydrv_fops)  0) {
printk(-- Error registering driver.\n);
return -ENODEV;
}

memset(mydrv_memory, 0, mydrv_memory_size);
printk(-- init_module done\n);
return 0;
}


void cleanup_module(void)
{
printk(-- cleanup_module called\n);
unregister_chrdev(DEVICE_MAJOR,DEVICE_NAME);
vfree(mydrv_memory);
printk(-- cleanup_module done\n);
}

MODULE_LICENSE(GPL);
MODULE_AUTHOR(Stefan Seibold [EMAIL PROTECTED]);
MODULE_DESCRIPTION(Test Driver);


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/