[kvm-devel] mm notifier: Notifications when pages are unmapped.

2008-05-16 Thread Christoph Lameter
asymmetry in invalidate_page_sync() (at that time called rmap notifier) and we are reintroducing that now in a light weight order to be able to defer freeing until after the rmap spinlocks have been dropped. Jack tested this with the GRU. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- fs

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Christoph Lameter
On Thu, 15 May 2008, Nick Piggin wrote: Oh, I get that confused because of the mixed up naming conventions there: unmap_page_range should actually be called zap_page_range. But at any rate, yes we can easily zap pagetables without holding mmap_sem. How is that synchronized with code that

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-14 Thread Christoph Lameter
On Wed, 14 May 2008, Linus Torvalds wrote: One thing to realize is that most of the time (read: pretty much *always*) when we have the problem of wanting to sleep inside a spinlock, the solution is actually to just move the sleeping to outside the lock, and then have something else that
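
The pattern described in this message (move the sleeping to outside the lock) can be sketched in user-space C. This is only an illustration under stated assumptions: `struct pool`, `pool_drain()` and `slow_release()` are hypothetical names that do not appear in the thread, and a C11 `atomic_flag` stands in for a kernel spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space analogy; none of these names come from the thread. */
struct item {
    struct item *next;
    int released;               /* set by the slow step below */
};

struct pool {
    atomic_flag lock;           /* stands in for a spinlock */
    struct item *head;          /* list protected by the lock */
};

static void pool_lock(struct pool *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;                       /* spin until the flag is clear */
}

static void pool_unlock(struct pool *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Slow step that is assumed to be unsafe under a spinlock (it might sleep). */
static void slow_release(struct item *it)
{
    it->released = 1;
}

/*
 * Detach the whole list while holding the lock, then run the slow step
 * on the private batch after the lock has been dropped.
 * Returns the number of items processed.
 */
static int pool_drain(struct pool *p)
{
    struct item *batch;
    int n = 0;

    pool_lock(p);
    batch = p->head;
    p->head = NULL;
    pool_unlock(p);

    while (batch) {
        struct item *next = batch->next;
        slow_release(batch);    /* runs without the lock held */
        batch = next;
        n++;
    }
    return n;
}
```

The lock is held only for the pointer swap, so the time spent under it does not depend on how long `slow_release()` takes.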

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Christoph Lameter
On Wed, 7 May 2008, Linus Torvalds wrote: The code that can take many locks, will have to get the global lock *and* order the types, but that's still trivial. It's something like spin_lock(global_lock); for (vma = mm->mmap; vma; vma = vma->vm_next) { if

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Christoph Lameter
On Wed, 7 May 2008, Linus Torvalds wrote: On Wed, 7 May 2008, Christoph Lameter wrote: Multiple vmas may share the same mapping or refer to the same anonymous vma. The above code will deadlock since we may take some locks multiple times. Ok, so that actually _is_ a problem

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Christoph Lameter
On Wed, 7 May 2008, Linus Torvalds wrote: and you're now done. You have your mm_lock() (which still needs to be renamed - it should be a mmu_notifier_lock() or something like that), but you don't need the insane sorting. At most you apparently need a way to recognize duplicates (so that
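
The idea of recognizing duplicates instead of sorting can be sketched as follows. This is a simplified illustration, not the code from the patchset: `struct shared_lock`, `struct region`, `lock_all()` and `unlock_all()` are hypothetical names, the `marked` field is an assumed way of detecting a lock that was already taken, and the real lock acquisition is reduced to a counter. The caller is assumed to hold a global lock that serializes all multi-lock takers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lock shared by several mappings. */
struct shared_lock {
    int marked;                 /* 1 once lock_all() has taken it */
    int acquisitions;           /* how many times it was taken */
};

/* Hypothetical stand-in for a vma; several may point at one lock. */
struct region {
    struct region *next;
    struct shared_lock *lock;
};

/*
 * Take every distinct lock reachable from the list exactly once.
 * A lock that is already marked is a duplicate and is skipped, so no
 * lock is taken twice. Returns the number of locks taken.
 */
static int lock_all(struct region *head)
{
    int taken = 0;

    for (struct region *r = head; r; r = r->next) {
        if (r->lock->marked)
            continue;           /* shared with an earlier region */
        r->lock->marked = 1;
        r->lock->acquisitions++;    /* the real lock call would go here */
        taken++;
    }
    return taken;
}

/* Clear the marks again; clearing an already cleared mark is harmless. */
static void unlock_all(struct region *head)
{
    for (struct region *r = head; r; r = r->next)
        r->lock->marked = 0;
}
```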

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Christoph Lameter
On Thu, 8 May 2008, Andrea Arcangeli wrote: to the sort function to break the loop. After that we remove the 512 vma cap and mm_lock is free to run as long as it wants like /dev/urandom, nobody can care less how long it will run before returning as long as it reacts to signals. Look Linus

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-28 Thread Christoph Lameter
On Sun, 27 Apr 2008, Andrea Arcangeli wrote: Talking about post 2.6.26: the refcount with rcu in the anon-vma conversion seems unnecessary and may explain part of the AIM slowdown too. The rest looks ok and probably we should switch the code to a compile-time decision between rwlock and rwsem

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: I know you rather want to see KVM development stalled for more months than to get a partial solution now that already covers KVM and GRU with the same API that XPMEM will also use later. It's very unfair on your side to pretend to stall other

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: Implement unregister but it's not reliable, only ->release is reliable. Why is there still the hlist stuff being used for the mmu notifier list? And why is this still unsafe? There are cases in which you do not take the reverse map locks or mmap_sem

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote: I guess I have to prepare another patchset then? If you want to embarrass yourself three time in a row go ahead ;). I thought two failed takeovers was enough. Takeover? I'd

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: The only way to avoid failing because of vmalloc space shortage or oom, would be to provide a O(N*N) fallback. But one that can't be interrupted by sigkill! sigkill interruption was ok in #v12 because we didn't rely on mmu_notifier_unregister to

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: will go in -mm in time for 2.6.26. Let's put it this way, if I fail to merge mmu-notifier-core into 2.6.26 I'll voluntarily give up my entire patchset and leave maintainership to you so you move 1/N to N/N and remove mm_lock-sem patch (everything

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: On Wed, Apr 23, 2008 at 11:09:35AM -0700, Christoph Lameter wrote: Why is there still the hlist stuff being used for the mmu notifier list? And why is this still unsafe? What's the problem with hlist, it saves 8 bytes for each mm_struct, you

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-23 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: Yes, there's really no risk of races in this area after introducing mm_lock, any place that mangles over ptes and doesn't hold any of the three locks is buggy anyway. I appreciate the audit work (I also did it and couldn't find bugs but the more

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-22 Thread Christoph Lameter
Thanks for adding most of my enhancements. But 1. There is no real need for invalidate_page(). Can be done with invalidate_start/end. Needlessly complicates the API. One of the objections by Andrew was that there were multiple callbacks that perform similar functions. 2.

Re: [kvm-devel] [PATCH 02 of 12] Fix ia64 compilation failure because of common code include bug

2008-04-22 Thread Christoph Lameter
Looks like this is not complete. There are numerous .h files missing which means that various structs are undefined (fs.h and rmap.h are needed f.e.) which leads to surprises when dereferencing fields of these struct. It seems that mm_types.h is expected to be included only in certain

Re: [kvm-devel] [PATCH 03 of 12] get_task_mm should not succeed if mmput() is running and has reduced

2008-04-22 Thread Christoph Lameter
Missing signoff by you. - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2.

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-22 Thread Christoph Lameter
Reverts a part of an earlier patch. Why isn't this merged into 1 of 12?

Re: [kvm-devel] [PATCH 05 of 12] Move the tlb flushing into free_pgtables. The conversion of the locks

2008-04-22 Thread Christoph Lameter
Why are the subjects all screwed up? They are the first line of the description instead of the subject line of my patches.

Re: [kvm-devel] [PATCH 10 of 12] Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock

2008-04-22 Thread Christoph Lameter
Doing the right patch ordering would have avoided this patch and allow better review.

Re: [kvm-devel] [PATCH 00 of 12] mmu notifier #v13

2008-04-22 Thread Christoph Lameter
On Tue, 22 Apr 2008, Andrea Arcangeli wrote: My patch order and API backward compatible extension over the patchset is done to allow 2.6.26 to fully support KVM/GRU and 2.6.27 to support XPMEM as well. KVM/GRU won't notice any difference once the support for XPMEM is added, but even if the

Re: [kvm-devel] [PATCH 00 of 12] mmu notifier #v13

2008-04-22 Thread Christoph Lameter
On Tue, 22 Apr 2008, Robin Holt wrote: putting it back into your patch/agreeing to it remaining in Andrea's patch? If not, I think we can put this issue aside until Andrew gets out of the merge window and can decide it. Either way, the patches become much more similar with this in. One

Re: [kvm-devel] [PATCH 03 of 12] get_task_mm should not succeed if mmput() is running and has reduced

2008-04-22 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: On Tue, Apr 22, 2008 at 01:23:16PM -0700, Christoph Lameter wrote: Missing signoff by you. I thought I had to signoff if I contributed with anything that could resemble copyright? Given I only merged that patch, I can add an Acked-by if you like

Re: [kvm-devel] [PATCH 04 of 12] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-22 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: On Tue, Apr 22, 2008 at 01:24:21PM -0700, Christoph Lameter wrote: Reverts a part of an earlier patch. Why isn't this merged into 1 of 12? To give zero regression risk to 1/12 when MMU_NOTIFIER=y or =n and the mmu notifiers aren't registered

Re: [kvm-devel] [PATCH 10 of 12] Convert mm_lock to use semaphores after i_mmap_lock and anon_vma_lock

2008-04-22 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: The right patch ordering isn't necessarily the one that reduces the total number of lines in the patchsets. The mmu-notifier-core is already converged and can go in. The rest isn't converged at all... nearly nobody commented on the other part (the

Re: [kvm-devel] [PATCH 01 of 12] Core of mmu notifiers

2008-04-22 Thread Christoph Lameter
On Wed, 23 Apr 2008, Andrea Arcangeli wrote: I'll send an update in any case to Andrew way before Saturday so hopefully we'll finally get mmu-notifiers-core merged before next week. Also I'm not updating my mmu-notifier-core patch anymore except for strict bugfixes so don't worry about any

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Christoph Lameter
On Thu, 17 Apr 2008, Andrea Arcangeli wrote: Also note, EMM isn't using the clean hlist_del, it's implementing list by hand (with zero runtime gain) so all the debugging may not be existent in EMM, so if it's really a mm_lock race, and it only triggers with mmu notifiers and not with EMM, it

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote: I don't think this lock mechanism is completely working. I have gotten a few failures trying to dereference 0x100100 which appears to be LIST_POISON1. How does xpmem unregistering of notifiers work?

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote: On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote: On Wed, 16 Apr 2008, Robin Holt wrote: I don't think this lock mechanism is completely working. I have gotten a few failures trying to dereference 0x100100 which appears

Re: [kvm-devel] [PATCH 2 of 9] Core of mmu notifiers

2008-04-14 Thread Christoph Lameter
On Tue, 8 Apr 2008, Andrea Arcangeli wrote: + /* + * Called when nobody can register any more notifier in the mm + * and after the mn notifier has been disarmed already. + */ + void (*release)(struct mmu_notifier *mn, + struct mm_struct *mm);

Re: [kvm-devel] [PATCH 3 of 9] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-14 Thread Christoph Lameter
Not sure why this patch is not merged into 2 of 9. Same comment as last round.

Re: [kvm-devel] [PATCH 2 of 9] Core of mmu notifiers

2008-04-14 Thread Christoph Lameter
Where is the documentation on locking that you wanted to provide?

Re: [kvm-devel] [PATCH 0 of 9] mmu notifier #v12

2008-04-14 Thread Christoph Lameter
On Tue, 8 Apr 2008, Andrea Arcangeli wrote: The difference with #v11 is a different implementation of mm_lock that guarantees handling signals in O(N). It's also more lowlatency friendly. Ok. So the rest of the issues remains unaddressed? I am glad that we finally settled on the locking. But

Re: [kvm-devel] [patch 02/10] emm: notifier logic

2008-04-08 Thread Christoph Lameter
It may also be useful to allow invalidate_start() to fail in some contexts (try_to_unmap f.e., maybe if a certain flag is passed). This may allow the device to get out of tight situations (pending I/O f.e. or time out if there is no response for network communications). But then that

Re: [kvm-devel] [patch 02/10] emm: notifier logic

2008-04-07 Thread Christoph Lameter
On Mon, 7 Apr 2008, Andrea Arcangeli wrote: My mm_lock solution makes all rcu serialization an unnecessary overhead so you should remove it like I already did in #v11. If it wasn't the case, then mm_lock wouldn't be a definitive fix for the race. There still could be junk in the

Re: [kvm-devel] [PATCH] mmu notifier #v11

2008-04-06 Thread Christoph Lameter
On Sat, 5 Apr 2008, Andrea Arcangeli wrote: In short when working with single pages it's a waste to block the secondary-mmu page fault, because it's zero cost to invalidate_page before put_page. Not even GRU need to do that. That depends on what the notifier is being used for. Some

Re: [kvm-devel] [patch 02/10] emm: notifier logic

2008-04-06 Thread Christoph Lameter
On Sat, 5 Apr 2008, Andrea Arcangeli wrote: + rcu_assign_pointer(mm->emm_notifier, e); + mm_unlock(mm); My mm_lock solution makes all rcu serialization an unnecessary overhead so you should remove it like I already did in #v11. If it wasn't the case, then mm_lock wouldn't be a

Re: [kvm-devel] [PATCH] mmu notifier #v11

2008-04-04 Thread Christoph Lameter
I am always the guy doing the cleanup after Andrea it seems. Sigh. Here is the mm_lock/mm_unlock logic separated out for easier review. Adds some comments. Still objectionable is the multiple ways of invalidating pages in #v11. Callout now has similar locking to emm. From: Christoph Lameter

[kvm-devel] [patch 04/10] emm: Convert i_mmap_lock to i_mmap_sem

2008-04-04 Thread Christoph Lameter
]. This slightly increases Aim9 performance results on an 8p. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- arch/x86/mm/hugetlbpage.c |4 ++-- fs/hugetlbfs/inode.c |4 ++-- fs/inode.c|2 +- include/linux

[kvm-devel] [patch 01/10] emm: mm_lock: Lock a process against reclaim

2008-04-04 Thread Christoph Lameter
Provide a way to lock an mm_struct against reclaim (try_to_unmap etc). This is necessary for the invalidate notifier approaches so that they can reliably add and remove a notifier. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include

[kvm-devel] [patch 02/10] emm: notifier logic

2008-04-04 Thread Christoph Lameter
semantics for emm_referenced (thanks Andrea) - Call mm_lock/mm_unlock to protect against registration races. Acked-by: Paul E. McKenney [EMAIL PROTECTED] Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm_types.h |3 + include/linux/rmap.h | 50

[kvm-devel] [patch 06/10] emm: Convert anon_vma lock to rw_sem and refcount

2008-04-04 Thread Christoph Lameter
by 10-15%. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/rmap.h | 20 --- mm/migrate.c | 26 ++--- mm/mmap.c| 28 +- mm/rmap.c| 53

[kvm-devel] [patch 10/10] xpmem: Simple example

2008-04-04 Thread Christoph Lameter
session. Paste as many times as you like. Each pass will increment the value one additional time. When you are tired, hit enter in the first window. You should see the same value printed from A1 as you most recently received from A2. Signed-off-by: Christoph Lameter [EMAIL PROTECTED

[kvm-devel] [patch 00/10] [RFC] EMM Notifier V3

2008-04-04 Thread Christoph Lameter
V2-V3: - Fix rcu issues - Fix emm_referenced handling - Use Andrea's mm_lock/unlock to prevent registration races. - Keep simple API since there does not seem to be a need to add additional callbacks (mm_lock does not require callbacks like emm_start/stop that I envisioned). - Reduce CC list

[kvm-devel] [patch 03/10] emm: Move tlb flushing into free_pgtables

2008-04-04 Thread Christoph Lameter
this patch. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm.h |4 ++-- mm/memory.c| 14 ++ mm/mmap.c |6 +++--- 3 files changed, 15 insertions(+), 9 deletions(-) Index: linux-2.6/include/linux/mm.h

[kvm-devel] [patch 07/10] xpmem: This patch exports zap_page_range as it is needed by XPMEM.

2008-04-04 Thread Christoph Lameter
XPMEM would have used sys_madvise() except that madvise_dontneed() returns an -EINVAL if VM_PFNMAP is set, which is always true for the pages XPMEM imports from other partitions and is also true for uncached pages allocated locally via the mspec allocator. XPMEM needs zap_page_range()

[kvm-devel] [patch 08/10] xpmem: Locking rules for taking multiple mmap_sem locks.

2008-04-04 Thread Christoph Lameter
This patch adds a lock ordering rule to avoid a potential deadlock when multiple mmap_sems need to be locked. Signed-off-by: Dean Nelson [EMAIL PROTECTED] --- mm/filemap.c |3 +++ 1 file changed, 3 insertions(+) Index: linux-2.6/mm/filemap.c

Re: [kvm-devel] EMM: Fixup return value handling of emm_notify()

2008-04-03 Thread Christoph Lameter
On Thu, 3 Apr 2008, Peter Zijlstra wrote: It seems to me that common code can be shared using functions? No need to stuff everything into a single function. We have method vectors all over the kernel, we could do a_ops as a single callback too, but we don't. FWIW I prefer separate methods.

Re: [kvm-devel] EMM: disable other notifiers before register and unregister

2008-04-03 Thread Christoph Lameter
On Thu, 3 Apr 2008, Andrea Arcangeli wrote: My attempt to fix this once and for all is to walk all vmas of the mm inside mmu_notifier_register and take all anon_vma locks and i_mmap_locks in virtual address order in a row. It's ok to take those inside the mmap_sem. Supposedly if anybody will

Re: [kvm-devel] [patch 1/9] EMM Notifier: The notifier calls

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: There are much bigger issues besides the rcu safety in this patch, proper aging of the secondary mmu through access bits set by hardware is unfixable with this model (you would need to do age |= e->callback), which is the proof of why this isn't

Re: [kvm-devel] [patch 5/9] Convert anon_vma lock to rw_sem and refcount

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: On Tue, Apr 01, 2008 at 01:55:36PM -0700, Christoph Lameter wrote: This results in f.e. the Aim9 brk performance test to go down by 10-15%. I guess it's more likely because of overscheduling for small critical sections, did you count

[kvm-devel] EMM: Fixup return value handling of emm_notify()

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Christoph Lameter wrote: Here f.e. We can add a special emm_age() function that iterates differently and does the | for you. Well maybe not really necessary. How about this fix? It's likely a problem to stop callbacks if one callback returned an error. Subject: EMM
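
A function that iterates over all callbacks and ORs their results, as suggested in this message, might look like the sketch below. It is not the emm_age() from the thread: `struct notifier`, `notify_age()` and the two sample callbacks are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical notifier list node; not the EMM structure from the patch. */
struct notifier {
    struct notifier *next;
    int (*callback)(struct notifier *n, unsigned long address);
};

/* Sample callback: this subsystem did not reference the address. */
static int cb_not_referenced(struct notifier *n, unsigned long address)
{
    (void)n;
    (void)address;
    return 0;
}

/* Sample callback: this subsystem did reference the address. */
static int cb_referenced(struct notifier *n, unsigned long address)
{
    (void)n;
    (void)address;
    return 1;
}

/*
 * Call every notifier and OR the results together. The walk does not
 * stop early, so every subsystem sees the event regardless of what an
 * earlier callback returned.
 */
static int notify_age(struct notifier *head, unsigned long address)
{
    int age = 0;

    for (struct notifier *n = head; n; n = n->next)
        age |= n->callback(n, address);
    return age;
}
```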

[kvm-devel] EMM: Require single threadedness for registration.

2008-04-02 Thread Christoph Lameter
only a single thread. That even allows to avoid the use of rcu. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- mm/rmap.c | 46 +- 1 file changed, 37 insertions(+), 9 deletions(-) Index: linux-2.6/mm/rmap.c

Re: [kvm-devel] [patch 1/9] EMM Notifier: The notifier calls

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: Hmmm... Okay that is one solution that would just require a BUG_ON in the registration methods. Perhaps you didn't notice that this solution can't work if you call range_begin/end not in the current context and try_to_unmap_cluster does

Re: [kvm-devel] [patch 5/9] Convert anon_vma lock to rw_sem and refcount

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: paging), hence the slowdown. What you benchmarked is the write side, which is also the fast path when the system is heavily CPU bound. I've to say aim is a great benchmark to test this regression. I am a bit surprised that brk performance is that

Re: [kvm-devel] EMM: Fixup return value handling of emm_notify()

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: but anyway it's silly to be hardwired to such an interface that worst of all requires switch statements instead of proper pointer to functions and a fixed set of parameters and retval semantics for all methods. The EMM API with a single callback is

Re: [kvm-devel] [PATCH 2 of 8] Moves all mmu notifier methods outside the PT lock (first and not last

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: diff --git a/mm/memory.c b/mm/memory.c --- a/mm/memory.c +++ b/mm/memory.c @@ -1626,9 +1626,10 @@ */ page_table = pte_offset_map_lock(mm, pmd, address,

Re: [kvm-devel] EMM: Require single threadedness for registration.

2008-04-02 Thread Christoph Lameter
On Thu, 3 Apr 2008, Andrea Arcangeli wrote: That would work for #v10 if I remove the invalidate_range_start from try_to_unmap_cluster, it can't work for EMM because you've emm_invalidate_start firing anywhere outside the context of the current task (even regular rmap code, not just nonlinear

Re: [kvm-devel] [PATCH 1 of 8] Core of mmu notifiers

2008-04-02 Thread Christoph Lameter
On Wed, 2 Apr 2008, Andrea Arcangeli wrote: + void (*invalidate_page)(struct mmu_notifier *mn, + struct mm_struct *mm, + unsigned long address); + + void (*invalidate_range_start)(struct mmu_notifier *mn, +

Re: [kvm-devel] [patch 1/9] EMM Notifier: The notifier calls

2008-04-02 Thread Christoph Lameter
On Thu, 3 Apr 2008, Andrea Arcangeli wrote: I said try_to_unmap_cluster, not get_user_pages. CPU0 CPU1 try_to_unmap_cluster: emm_invalidate_start in EMM (or mmu_notifier_invalidate_range_start in #v10) walking the list by hand in EMM (or with

Re: [kvm-devel] [PATCH 1 of 8] Core of mmu notifiers

2008-04-02 Thread Christoph Lameter
Thinking about this adventurous locking some more: I think you are misunderstanding what a seqlock is. It is *not* a spinlock. The critical read section with the reading of a version before and after allows you access to a certain version of memory how it is or was some time ago (caching
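
The read side described in this message (read a version before and after the critical section, and retry if it changed) can be sketched with C11 atomics. This is a user-space illustration, not the kernel seqlock implementation: `struct seq_pair`, `pair_write()` and `pair_read_sum()` are hypothetical names, and a single writer is assumed.

```c
#include <stdatomic.h>

/* Hypothetical data guarded by a sequence counter. */
struct seq_pair {
    atomic_uint seq;            /* even: stable, odd: write in progress */
    atomic_int a;
    atomic_int b;               /* the writer keeps a == b */
};

/* Single writer assumed: bump the counter around the update. */
static void pair_write(struct seq_pair *p, int v)
{
    atomic_fetch_add(&p->seq, 1);   /* counter becomes odd */
    atomic_store(&p->a, v);
    atomic_store(&p->b, v);
    atomic_fetch_add(&p->seq, 1);   /* counter becomes even again */
}

/*
 * Reader: takes no lock and never excludes the writer. It only detects
 * afterwards whether the values it read belong to one version, and
 * retries if a write overlapped the read.
 */
static int pair_read_sum(struct seq_pair *p)
{
    unsigned start;
    int a, b;

    do {
        start = atomic_load(&p->seq);
        a = atomic_load(&p->a);
        b = atomic_load(&p->b);
    } while ((start & 1u) || atomic_load(&p->seq) != start);

    return a + b;
}
```

The reader only learns after the fact that its snapshot was consistent; nothing prevents the data from changing right after the loop exits, which is the difference from a spinlock that the message points at.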

[kvm-devel] EMM: disable other notifiers before register and unregister

2008-04-02 Thread Christoph Lameter
for unregistering. If we can get all subsystems to stop then we can also reliably unregister a subsystem. So provide that callback. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/rmap.h | 10 +++--- mm/rmap.c| 30 ++ 2 files changed, 37

[kvm-devel] [patch 5/9] Convert anon_vma lock to rw_sem and refcount

2008-04-01 Thread Christoph Lameter
by 10-15%. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/rmap.h | 20 --- mm/migrate.c | 26 ++--- mm/mmap.c|4 +-- mm/rmap.c| 53 +-- 4 files changed

[kvm-devel] [patch 6/9] This patch exports zap_page_range as it is needed by XPMEM.

2008-04-01 Thread Christoph Lameter
XPMEM would have used sys_madvise() except that madvise_dontneed() returns an -EINVAL if VM_PFNMAP is set, which is always true for the pages XPMEM imports from other partitions and is also true for uncached pages allocated locally via the mspec allocator. XPMEM needs zap_page_range()

[kvm-devel] [patch 1/9] EMM Notifier: The notifier calls

2008-04-01 Thread Christoph Lameter
will be necessary to this patchset. V1-V2: - page_referenced_one: Do not increment reference count if it is already != 0. - Use rcu_assign_pointer and rcu_derefence_pointer instead of putting in our own barriers. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm_types.h

[kvm-devel] [patch 7/9] Locking rules for taking multiple mmap_sem locks.

2008-04-01 Thread Christoph Lameter
This patch adds a lock ordering rule to avoid a potential deadlock when multiple mmap_sems need to be locked. Signed-off-by: Dean Nelson [EMAIL PROTECTED] --- mm/filemap.c |3 +++ 1 file changed, 3 insertions(+) Index: linux-2.6/mm/filemap.c

[kvm-devel] [patch 2/9] Move tlb flushing into free_pgtables

2008-04-01 Thread Christoph Lameter
this patch. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm.h |4 ++-- mm/memory.c| 14 ++ mm/mmap.c |6 +++--- 3 files changed, 15 insertions(+), 9 deletions(-) Index: linux-2.6/include/linux/mm.h

[kvm-devel] [patch 0/9] [RFC] EMM Notifier V2

2008-04-01 Thread Christoph Lameter
[Note that I will be giving talks next week at the OpenFabrics Forum and at the Linux Collab Summit in Austin on memory pinning etc. It would be great if I could get some feedback on the approach then] V1-V2: - Additional optimizations in the VM - Convert vm spinlocks to rw sems. - Add XPMEM

[kvm-devel] [patch 9/9] XPMEM: Simple example

2008-04-01 Thread Christoph Lameter
session. Paste as many times as you like. Each pass will increment the value one additional time. When you are tired, hit enter in the first window. You should see the same value printed from A1 as you most recently received from A2. Signed-off-by: Christoph Lameter [EMAIL PROTECTED

Re: [kvm-devel] [PATCH] 4/4 i_mmap_lock spinlock2rwsem (#v9 was 1/4)

2008-03-19 Thread Christoph Lameter
You need this patch to address the issues (that I already mentioned when I sent the patch to you). New EMM notifier patch with sleeping coming soon. From: Christoph Lameter [EMAIL PROTECTED] Subject: Move tlb flushing into free_pgtables Move the tlb flushing into free_pgtables. The conversion

Re: [kvm-devel] [PATCH] 2/4 move all invalidate_page outside of PT lock (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: This below simple patch invalidates the invalidate_page part, the next patch will invalidate the RCU part, and btw in a way that doesn't forbid unregistering the mmu notifiers at runtime (like your brand new EMM does). Sounds good. The reason I

Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: This combines the non-sleep-capable RCU locking of #v9 with a seqlock so the mmu notifier fast path will require zero cacheline writes/bouncing while still providing mmu_notifier_unregister and allowing to schedule inside the mmu notifier methods.

Re: [kvm-devel] [PATCH] 4/4 i_mmap_lock spinlock2rwsem (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: I didn't look into this but it shows how it would be risky to make this change in .25. It's a bit strange that the bugcheck triggers Yes this was never intended for .25. I think we need to split this into a couple of patches. One needs to get rid of

Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: In the meantime I've also been thinking that we could need the write_seqlock in mmu_notifier_register, to know when to restart the loop if somebody does a mmu_notifier_register; synchronize_rcu(). Otherwise there's no way to be sure the mmu

Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: PS. this problem I pointed out of _end possibly called before _begin is the same for #v9 and EMM V1 as far as I can tell. Hmmm.. We could just push that on the driver saying that it has to tolerate it. Otherwise how can we solve this?

Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter
On Fri, 7 Mar 2008, Andrea Arcangeli wrote: This is a replacement for the previously posted 3/4, one of the pieces to allow the mmu notifier methods to sleep. Looks good. That is what we talked about last week. What guarantees now that we see the cacheline referenced after the cacheline that

[kvm-devel] Notifier for Externally Mapped Memory (EMM) V1

2008-03-05 Thread Christoph Lameter
will be necessary to this patchset. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm_types.h |3 + include/linux/rmap.h | 50 ++ kernel/fork.c|3 + mm/Kconfig |5 +++ mm/filemap_xip.c |4

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Nick Piggin wrote: Then put it into the arch code for TLB invalidation. Paravirt ops gives good examples on how to do that. Put what into arch code? The mmu notifier code. What about a completely different approach... XPmem runs over NUMAlink, right? Why not

Re: [kvm-devel] [RFC] Notifier for Externally Mapped Memory (EMM)

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Andrea Arcangeli wrote: When working with single pages it's more efficient and preferable to call invalidate_page and only later release the VM reference on the page. But as you pointed out before that path is a slow path anyways. It's rarely taken. Having a single

Re: [kvm-devel] [RFC] Notifier for Externally Mapped Memory (EMM)

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Andrea Arcangeli wrote: I once ripped invalidate_page while working on #v8 but then I reintroduced it because I thought reducing the total number of hooks was beneficial to the core linux VM (even if only a microoptimization, I sure agree about that, but it's trivial to

Re: [kvm-devel] [RFC] Notifier for Externally Mapped Memory (EMM)

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Peter Zijlstra wrote: On Tue, 2008-03-04 at 14:35 -0800, Christoph Lameter wrote: RCU means that the callbacks occur in an atomic context. Not really, if it requires moving the VM locks to sleepable locks under a .config option, I think it's also fair to require

Re: [kvm-devel] [PATCH] mmu notifiers #v8

2008-03-03 Thread Christoph Lameter
On Mon, 3 Mar 2008, Nick Piggin wrote: I'm still not completely happy with this. I had a very quick look at the GRU driver, but I don't see why it can't be implemented more like the regular TLB model, and have TLB insertions depend on the linux pte, and do invalidates _after_ restricting

Re: [kvm-devel] [PATCH] mmu notifiers #v8

2008-03-03 Thread Christoph Lameter
On Mon, 3 Mar 2008, Nick Piggin wrote: It is going to be really easy to add more weird and wonderful notifiers later that deviate from our standard TLB model. It would be much harder to remove them. So I really want to see everyone conform to this model first. Numbers and comparisons can be

Re: [kvm-devel] [PATCH] mmu notifiers #v8

2008-03-03 Thread Christoph Lameter
On Mon, 3 Mar 2008, Nick Piggin wrote: Move definition of struct mmu_notifier and struct mmu_notifier_ops under CONFIG_MMU_NOTIFIER to ensure they don't get dereferenced when they don't make sense. The callbacks take a mmu_notifier parameter. So how does this compile for !MMU_NOTIFIER?

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-03 Thread Christoph Lameter
On Mon, 3 Mar 2008, Nick Piggin wrote: Your skeleton is just registering notifiers and saying /* you fill the hard part in */ If somebody needs a skeleton in order just to register the notifiers, then almost by definition they are unqualified to write the hard part ;) It's also providing

[kvm-devel] [RFC] Notifier for Externally Mapped Memory (EMM)

2008-03-03 Thread Christoph Lameter
(f.e. KVM/GRU). If the rmap traversal spinlocks are converted to semaphores then all callbacks will be performed in a nonatomic context. Callouts can stay where they are. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- include/linux/mm_types.h |3 + include/linux/rmap.h | 51

[kvm-devel] [Early draft] Conversion of i_mmap_lock to semaphore

2008-03-03 Thread Christoph Lameter
during rmap traversal for files in a non atomic context. A rw style lock allows concurrent walking of the reverse map. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- arch/x86/mm/hugetlbpage.c |4 ++-- fs/hugetlbfs/inode.c |4 ++-- fs/inode.c|2 +- include

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote: I thought you wanted to get rid of the sync via pte lock? Sure. _notify is happening inside the pt lock by coincidence, to reduce the changes to mm/* as long as the mmu notifiers

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote: And thus the device driver may stop receiving data on a UP system? It will never get the ack. Not sure to follow, sorry. My idea was: post the invalidate in the mmio

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: Agreed. I just thought xpmem needed an invalidate-by-page, but I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this regard. That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote: That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are better than mutexes. Rik and Lee saw some performance improvements

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Christoph Lameter
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote: RDMA works across a network and I would assume that it needs confirmation that a connection has been torn down before pages can be unmapped. Depends on the latency

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Christoph Lameter
On Wed, 27 Feb 2008, Andrea Arcangeli wrote: What Christoph needs to do when he's back from vacations to support sleepable mmu notifiers is to add a CONFIG_XPMEM config option that will switch the i_mmap_lock from a semaphore to a mutex (any other change to this patch will be minor compared to

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Christoph Lameter
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: This is not going to work even if the mutex would work as easily as you think since the patch here still does an rcu_lock/unlock around a callback. See underlined. Mutex is not acceptable for performance reasons. I think we can just drop the

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Christoph Lameter
On Wed, 27 Feb 2008, Andrea Arcangeli wrote: +struct mmu_notifier_head { + struct hlist_head head; + spinlock_t lock; +}; Still think that the lock here is not of too much use and can be easily replaced by mmap_sem. +#define mmu_notifier(function, mm, args...)

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote: What about invalidate_page()? That would just spin waiting an ack (just like the smp-tlb-flushing invalidates in numa already does). And thus the device driver may stop

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: Also re the _notify variants: The binding to pte_clear_flush_young etc will become a problem for notifiers that want to sleep because pte_clear_flush is usually called with the pte lock held. See f.e. try_to_unmap_one, page_mkclean_one etc.

Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 05:17:33PM -0600, Jack Steiner wrote: I disagree. The location of the callout IS a performance issue. In simple comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X increase in the number of TLB

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Tue, 19 Feb 2008, Andrea Arcangeli wrote: Yes, that's why I kept maintaining my patch and I posted the last revision to Andrew. I use pte/tlb locking of the core VM, it's unintrusive and obviously safe. Furthermore it can be extended with Christoph's stuff in a 100% backwards compatible
