Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:11:41PM +1100, Nick Piggin wrote: > On Wednesday 20 February 2008 14:00, Robin Holt wrote: > > On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: > > > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: > > > > > Also, how do you resolve the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Wednesday 20 February 2008 14:00, Robin Holt wrote: > On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: > > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: > > > Also, how do you resolve the case where you are not allowed to sleep? > > > I would have thought either

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: > > You can't sleep inside rcu_read_lock()! > > > > I must say that for a patch that is up to v8 or whatever and is > > posted twice a week to such a big cc list, it is

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: > You can't sleep inside rcu_read_lock()! > > I must say that for a patch that is up to v8 or whatever and is > posted twice a week to such a big cc list, it is kind of slack to > not even test it and expect other people to review it.

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: > The invalidation of address ranges in a mm_struct needs to be > performed when pages are removed or permissions etc change. > > If invalidate_range_begin() is called with locks held then we > pass a flag into invalidate_range() to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote: > As far as sleeping inside callbacks goes... I think there are big > problems with the patch (the sleeping patch and the external rmap > patch). I don't think it is workable in its current state. Either > we have to make some big

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: > The invalidation of address ranges in a mm_struct needs to be > performed when pages are removed or permissions etc change. > > If invalidate_range_begin() is called with locks held then we > pass a flag into invalidate_range() to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote: As far as sleeping inside callbacks goes... I think there are big problems with the patch (the sleeping patch and the external rmap patch). I don't think it is workable in its current state. Either we have to make some big changes to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: You can't sleep inside rcu_read_lock()! I must say that for a patch that is up to v8 or whatever and is posted twice a week to such a big cc list, it is kind of slack to not even test it and expect other people to review it. Well,

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Wednesday 20 February 2008 14:00, Robin Holt wrote: On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: Also, how do you resolve the case where you are not allowed to sleep? I would have thought either you have

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: You can't sleep inside rcu_read_lock()! I must say that for a patch that is up to v8 or whatever and is posted twice a week to such a big cc list, it is kind of

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:11:41PM +1100, Nick Piggin wrote: On Wednesday 20 February 2008 14:00, Robin Holt wrote: On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: Also, how do you resolve the case where you

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-16 Thread Christoph Lameter
On Fri, 15 Feb 2008, Andrew Morton wrote: > On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]> > wrote: > > > The invalidation of address ranges in a mm_struct needs to be > > performed when pages are removed or permissions etc change. > > hm. Do they? Why? If I'm in

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-16 Thread Christoph Lameter
On Fri, 15 Feb 2008, Andrew Morton wrote: On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. hm. Do they? Why? If I'm in the process

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]> wrote: > The invalidation of address ranges in a mm_struct needs to be > performed when pages are removed or permissions etc change. hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of memory out to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of memory out to

[patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-14 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate that no sleeping is possible. Locks are only held for truncate

[patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-08 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate that no sleeping is possible. Locks are only held for truncate

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:51:26PM -0800, Christoph Lameter wrote: > True. hlist_del_init ok? That would allow to check the driver that the > mmu_notifier is already linked in using !hlist_unhashed(). Driver then > needs to properly initialize the mmu_notifier list with INIT_HLIST_NODE(). A
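
The hlist_del_init() / hlist_unhashed() / INIT_HLIST_NODE() idea quoted above can be tried out in userspace. What follows is a minimal sketch, not kernel code: the hlist helpers are re-implemented here to mirror the semantics of the kernel's list helpers, and `struct notifier_sketch` is a hypothetical stand-in for a driver's mmu_notifier. It only illustrates how an initialized node lets a driver ask "am I already linked in?" and why a second unlink is harmless.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copies of the kernel's hlist types, for illustration only. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical stand-in for a driver's notifier object. */
struct notifier_sketch { struct hlist_node hlist; };

/* Mark a node as "not on any list". */
static void INIT_HLIST_NODE(struct hlist_node *n)
{
    n->next = NULL;
    n->pprev = NULL;
}

/* A node is unhashed (unlinked) when its back pointer is NULL. */
static int hlist_unhashed(const struct hlist_node *n)
{
    return !n->pprev;
}

/* Link a node at the front of the list. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink a node and reinitialize it, so hlist_unhashed() is true again. */
static void hlist_del_init(struct hlist_node *n)
{
    if (!hlist_unhashed(n)) {
        *n->pprev = n->next;
        if (n->next)
            n->next->pprev = n->pprev;
        INIT_HLIST_NODE(n);
    }
}
```

With these helpers, a driver-side check such as `!hlist_unhashed(&mn->hlist)` reports whether the notifier is currently registered, provided the node was initialized with INIT_HLIST_NODE() before first use.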

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote: > Well the GRU uses follow_page() instead of get_user_pages. Performance is > a major issue for the GRU. GRU is an external TLB, we have to allocate RAM instead but we do it through the regular userland paging mechanism.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. GRU is an external TLB, we have to allocate RAM instead but we do it through the regular userland paging mechanism.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:51:26PM -0800, Christoph Lameter wrote: True. hlist_del_init ok? That would allow to check the driver that the mmu_notifier is already linked in using !hlist_unhashed(). Driver then needs to properly initialize the mmu_notifier list with INIT_HLIST_NODE(). A driver

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: > On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: > > hlist_for_each_entry_safe_rcu(mn, n, t, > > &mm->mmu_notifier.head, hlist) { > >

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: > hlist_for_each_entry_safe_rcu(mn, n, t, > &mm->mmu_notifier.head, hlist) { > hlist_del_rcu(&mn->hlist);

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: > > Well the GRU uses follow_page() instead of get_user_pages. Performance is > > a major issue for the GRU. > > Worse, the GRU takes its TLB faults from within an interrupt so we > use follow_page to prevent going to sleep. That said, I think we > could

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
> Well the GRU uses follow_page() instead of get_user_pages. Performance is > a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could probably use follow_page() with FOLL_GET set to

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
Patch to 1. Remove sync on notifier_release. Must be called when only a single process remains. 2. Add invalidate_range_start/end. This should allow safe removal of ranges of external ptes without having to resort to a callback for every individual page. This must be able to nest so
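
One way to picture nestable invalidate_range_start/end calls is a depth counter that the driver's fault path consults. The sketch below is an assumption made for illustration, not the code from the patch: `struct ext_mmu_sketch` and the `sketch_*` functions are hypothetical names, and the single-threaded userspace code leaves out the locking a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-mm driver state; not taken from the patch. */
struct ext_mmu_sketch {
    int invalidate_depth;  /* range invalidations currently in flight */
    int ext_ptes;          /* external ptes currently installed */
};

/* Start of a range invalidation: drop external ptes and block new ones. */
static void sketch_range_start(struct ext_mmu_sketch *s)
{
    s->invalidate_depth++;
    s->ext_ptes = 0;  /* simplification: drops everything, not just the range */
}

/* End of a range invalidation: every nested start needs a matching end. */
static void sketch_range_end(struct ext_mmu_sketch *s)
{
    assert(s->invalidate_depth > 0);
    s->invalidate_depth--;
}

/* Driver fault path: install a pte only when no invalidation is running. */
static bool sketch_fault(struct ext_mmu_sketch *s)
{
    if (s->invalidate_depth > 0)
        return false;  /* caller is expected to retry after the matching end */
    s->ext_ptes++;
    return true;
}
```

Because the counter only returns to zero after the outermost end call, overlapping or nested invalidations keep the fault path blocked until all of them have finished.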

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: > On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: > > How we offload that? Before the scan of the rmaps we do not have the > > mmstruct. So we'd need another notifier_rmap_callback. > > My assumption is that that "int lock" exists

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: > How we offload that? Before the scan of the rmaps we do not have the > mmstruct. So we'd need another notifier_rmap_callback. My assumption is that that "int lock" exists just because unmap_mapping_range_vma exists. If I'm

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: > > - void (*invalidate_range)(struct mmu_notifier *mn, > > + void (*invalidate_range_begin)(struct mmu_notifier *mn, > > struct mm_struct *mm, > > -unsigned long start, unsigned long end, >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: > Then we have > > invalidate_range_start(mm) > > and > > invalidate_range_finish(mm, start, end) > > in addition to the invalidate rmap_notifier? > > --- > include/linux/mmu_notifier.h |7 +-- > 1 file changed, 5

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: > On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > > XPMEM requires with invalidate_range (sleepy) + > > before_invalidate_range (sleepy). invalidate_all should also be called > > before_release (both sleepy). > > > > It sounds

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: > > Seems that we cannot rely on the invalidate_ranges for correctness at all? > > We need to have invalidate_page() always. invalidate_range() is only an > > optimization. > > > > I don't understand your point "an optimization". How would

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote: > On Wed, 30 Jan 2008, Jack Steiner wrote: > > > I see what you mean. I need to review to mail to see why this changed > > but in the original discussions with Christoph, the invalidate_range > > callouts were suppose to be made

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > XPMEM requires with invalidate_range (sleepy) + > before_invalidate_range (sleepy). invalidate_all should also be called > before_release (both sleepy). > > It sounds we need full overlap of information provided by > invalidate_page and

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: > I see what you mean. I need to review to mail to see why this changed > but in the original discussions with Christoph, the invalidate_range > callouts were suppose to be made BEFORE the pages were put on the freelist. Seems that we cannot rely on the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: > I think I need to straighten this discussion out in my head a little bit. > Am I correct in assuming Andrea's original patch set did not have any SMP > race conditions for KVM? If so, then we need to start looking at how to > implement Christoph's and my

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:30:09AM -0600, Robin Holt wrote: > I don't think I saw the answer to my original question. I assume your > original patch, extended in a way similar to what Christoph has done, > can be made to work to cover both the KVM and GRU (Jack's) case. Yes, I think so. >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote: > On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: ... > > The three issues we need to simultaneously solve is revoking the remote > > page table/tlb information while still in a sleepable context and not > > having the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: > > Robin, if you don't mind, could you please post or upload somewhere > > your GPLv2 code that registers itself in Christoph's V2 notifiers? Or > > is it top secret? I wouldn't mind to have a look so I can better > > understand what's

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
> Robin, if you don't mind, could you please post or upload somewhere > your GPLv2 code that registers itself in Christoph's V2 notifiers? Or > is it top secret? I wouldn't mind to have a look so I can better > understand what's the exact reason you're sleeping besides attempting > GFP_KERNEL

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 02:37:20PM +0100, Andrea Arcangeli wrote: > On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: > > On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: > > > On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > > > > > > > invalidate_range after

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: > On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: > > On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > > > > > invalidate_range after populate allows access to memory for which ptes > > > > were zapped and the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Peter Zijlstra
On Wed, 2008-01-30 at 01:59 +0100, Andrea Arcangeli wrote: > On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote: > > That is only partially true. pte are created wronly in order to track > > dirty state these days. The first write will lead to a fault that switches > > the pte to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Peter Zijlstra
On Wed, 2008-01-30 at 01:59 +0100, Andrea Arcangeli wrote: On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote: That is only partially true. pte are created wronly in order to track dirty state these days. The first write will lead to a fault that switches the pte to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 02:37:20PM +0100, Andrea Arcangeli wrote: On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount was

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact reason you're sleeping besides attempting GFP_KERNEL

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: ... The three issues we need to simultaneously solve is revoking the remote page table/tlb information while still in a sleepable context and not having the remote

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:30:09AM -0600, Robin Holt wrote: I don't think I saw the answer to my original question. I assume your original patch, extended in a way similar to what Christoph has done, can be made to work to cover both the KVM and GRU (Jack's) case. Yes, I think so. XPMEM,

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: I think I need to straighten this discussion out in my head a little bit. Am I correct in assuming Andrea's original patch set did not have any SMP race conditions for KVM? If so, then we need to start looking at how to implement Christoph's and my

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: XPMEM requires with invalidate_range (sleepy) + before_invalidate_range (sleepy). invalidate_all should also be called before_release (both sleepy). It sounds we need full overlap of information provided by invalidate_page and invalidate_range

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Jack Steiner wrote: I see what you mean. I need to review to mail to see why this changed but in the original discussions with Christoph, the invalidate_range callouts were suppose to be made BEFORE

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: Seems that we cannot rely on the invalidate_ranges for correctness at all? We need to have invalidate_page() always. invalidate_range() is only an optimization. I don't understand your point "an optimization". How would invalidate_range as

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: XPMEM requires with invalidate_range (sleepy) + before_invalidate_range (sleepy). invalidate_all should also be called before_release (both sleepy). It sounds we need full

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: Then we have invalidate_range_start(mm) and invalidate_range_finish(mm, start, end) in addition to the invalidate rmap_notifier? --- include/linux/mmu_notifier.h |7 +-- 1 file changed, 5 insertions(+),

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: - void (*invalidate_range)(struct mmu_notifier *mn, + void (*invalidate_range_begin)(struct mmu_notifier *mn, struct mm_struct *mm, -unsigned long start, unsigned long end,

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: How we offload that? Before the scan of the rmaps we do not have the mmstruct. So we'd need another notifier_rmap_callback. My assumption is that that "int lock" exists just because unmap_mapping_range_vma exists. If I'm right

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: How we offload that? Before the scan of the rmaps we do not have the mmstruct. So we'd need another notifier_rmap_callback. My assumption is that that "int lock" exists just

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could probably use follow_page() with FOLL_GET set to

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, &mm->mmu_notifier.head, hlist) { hlist_del_rcu(&mn->hlist);

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, &mm->mmu_notifier.head, hlist) {

[patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. Most of the VM address space changes can use the range invalidate callback. invalidate_range() is generally called with mmap_sem held but no spinlocks are active. If

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote: > That is only partially true. pte are created wronly in order to track > dirty state these days. The first write will lead to a fault that switches > the pte to writable. When the page undergoes writeback the page again >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Jack Steiner wrote: > > That is true for your implementation and to address Robin's issues. Jack: > > Is that true for the GRU? > > I'm not sure I understand the question. The GRU never (currently) takes > a reference on a page. It has no mechanism for tracking pages that >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > A user space spinlock plays into this??? That is irrelevant to the kernel. > > And we are discussing "your" placement of the invalidate_range not mine. > > With "my" code, invalidate_range wasn't placed there at all, my > modification to

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Jack Steiner
On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: > On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > > > invalidate_range after populate allows access to memory for which ptes > > > were zapped and the refcount was released. > > > > The last refcount is released by the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote: > > get_user_pages, regular linux writes don't fault unless it's > > explicitly writeprotect, which is mandatory in a few archs, x86 not). > > actually get_user_pages doesn't fault

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: > > invalidate_range after populate allows access to memory for which ptes > > were zapped and the refcount was released. > > The last refcount is released by the invalidate_range itself. That is true for your implementation and to address Robin's

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote: > get_user_pages, regular linux writes don't fault unless it's > explicitly writeprotect, which is mandatory in a few archs, x86 not). actually get_user_pages doesn't fault either but it calls into set_page_dirty, however

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:39:00PM -0800, Christoph Lameter wrote: > If it does not run in write mode then concurrent faults are permissible > while we remap pages. Weird. Maybe we better handle this like individual > page operations? Put the invalidate_page back into zap_pte. But then there >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:55:56PM -0800, Christoph Lameter wrote: > On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > > But now I think there may be an issue with a third thread that may > > show unsafe the removal of invalidate_page from ptep_clear_flush. > > > > A third thread writing to a

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > But now I think there may be an issue with a third thread that may > show unsafe the removal of invalidate_page from ptep_clear_flush. > > A third thread writing to a page through the linux-pte and the guest > VM writing to the same page through the

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > hmm, "there" where? When I said it was taken in readonly mode I meant > for the quoted code (it would be at the top if it wasn't cut), so I > quote below again: > > > > + mmu_notifier(invalidate_range, mm, address, > > > +

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote: > On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > > > We invalidate the range *after* populating it? Isnt it okay to establish > > > references while populate_range() runs? > > > > It's not ok because that function can very

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:35:58PM -0800, Christoph Lameter wrote: > On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > > > It seems to be okay to invalidate range if you hold mmap_sem writably. In > > > that case no additional faults can happen that would create new ptes. > > > > In that place

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > We invalidate the range *after* populating it? Isnt it okay to establish > > references while populate_range() runs? > > It's not ok because that function can very well overwrite existing and > present ptes (it's actually the nonlinear common

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote: > On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > > diff --git a/mm/fremap.c b/mm/fremap.c > > --- a/mm/fremap.c > > +++ b/mm/fremap.c > > @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns > >

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > It seems to be okay to invalidate range if you hold mmap_sem writably. In > > that case no additional faults can happen that would create new ptes. > > In that place the mmap_sem is taken but in readonly mode. I never rely > on the mmap_sem in

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote: > I am not sure. AFAICT you wrote that code. Actually I didn't need to change a single line in do_wp_page because ptep_clear_flush was already doing everything transparently for me. This was the memory.c part of my last patch I

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > diff --git a/mm/fremap.c b/mm/fremap.c > --- a/mm/fremap.c > +++ b/mm/fremap.c > @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns > spin_unlock(&mapping->i_mmap_lock); > } > > + err = populate_range(mm, vma, start, size,

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: > > + mmu_notifier(invalidate_range, mm, address, > > + address + PAGE_SIZE - 1, 0); > > page_table = pte_offset_map_lock(mm, pmd, address, &ptl); > > if (likely(pte_same(*page_table, orig_pte))) { > > if

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
Christoph, the below patch should fix the current leak of the pinned pages. I hope the page-pin that should be dropped by the invalidate_range op, is enough to prevent the "physical page" mapped on that "mm+address" to change before invalidate_range returns. If that would ever happen, there would

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote: > Index: linux-2.6/mm/fremap.c > === > --- linux-2.6.orig/mm/fremap.c 2008-01-25 19:31:05.0 -0800 > +++ linux-2.6/mm/fremap.c 2008-01-25

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote: Index: linux-2.6/mm/fremap.c === --- linux-2.6.orig/mm/fremap.c 2008-01-25 19:31:05.0 -0800 +++ linux-2.6/mm/fremap.c 2008-01-25

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote: I am not sure. AFAICT you wrote that code. Actually I didn't need to change a single line in do_wp_page because ptep_clear_flush was already doing everything transparently for me. This was the memory.c part of my last patch I

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:35:58PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: It seems to be okay to invalidate range if you hold mmap_sem writably. In that case no additional faults can happen that would create new ptes. In that place the mmap_sem

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: It seems to be okay to invalidate range if you hold mmap_sem writably. In that case no additional faults can happen that would create new ptes. In that place the mmap_sem is taken but in readonly mode. I never rely on the mmap_sem in the mmu

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: + mmu_notifier(invalidate_range, mm, address, + address + PAGE_SIZE - 1, 0); page_table = pte_offset_map_lock(mm, pmd, address, &ptl); if (likely(pte_same(*page_table, orig_pte))) { if (old_page)

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
Christoph, the below patch should fix the current leak of the pinned pages. I hope the page-pin that should be dropped by the invalidate_range op, is enough to prevent the physical page mapped on that mm+address to change before invalidate_range returns. If that would ever happen, there would be a

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns spin_unlock(&mapping->i_mmap_lock); } + err = populate_range(mm, vma, start, size,

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: We invalidate the range *after* populating it? Isnt it okay to establish references while populate_range() runs? It's not ok because that function can very well overwrite

Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: hmm, there where? When I said it was taken in readonly mode I meant for the quoted code (it would be at the top if it wasn't cut), so I quote below again: + mmu_notifier(invalidate_range, mm, address, + address +
