asymmetry in invalidate_page_sync() (at that time called the rmap notifier)
and we are reintroducing that now in a lightweight form to be able to defer
freeing until after the rmap spinlocks have been dropped.
Jack tested this with the GRU.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
fs
On Thu, 15 May 2008, Nick Piggin wrote:
Oh, I get that confused because of the mixed up naming conventions
there: unmap_page_range should actually be called zap_page_range. But
at any rate, yes we can easily zap pagetables without holding mmap_sem.
How is that synchronized with code that
On Wed, 14 May 2008, Linus Torvalds wrote:
One thing to realize is that most of the time (read: pretty much *always*)
when we have the problem of wanting to sleep inside a spinlock, the
solution is actually to just move the sleeping to outside the lock, and
then have something else that
On Wed, 7 May 2008, Linus Torvalds wrote:
The code that can take many locks, will have to get the global lock *and*
order the types, but that's still trivial. It's something like
spin_lock(global_lock);
for (vma = mm->mmap; vma; vma = vma->vm_next) {
if
On Wed, 7 May 2008, Linus Torvalds wrote:
On Wed, 7 May 2008, Christoph Lameter wrote:
Multiple vmas may share the same mapping or refer to the same anonymous
vma. The above code will deadlock since we may take some locks multiple
times.
Ok, so that actually _is_ a problem
On Wed, 7 May 2008, Linus Torvalds wrote:
and you're now done. You have your mm_lock() (which still needs to be
renamed - it should be a mmu_notifier_lock() or something like that),
but you don't need the insane sorting. At most you apparently need a way
to recognize duplicates (so that
On Thu, 8 May 2008, Andrea Arcangeli wrote:
to the sort function to break the loop. After that we remove the 512
vma cap and mm_lock is free to run as long as it wants like
/dev/urandom, nobody can care less how long it will run before
returning as long as it reacts to signals.
Look Linus
On Sun, 27 Apr 2008, Andrea Arcangeli wrote:
Talking about post 2.6.26: the refcount with rcu in the anon-vma
conversion seems unnecessary and may explain part of the AIM slowdown
too. The rest looks ok and probably we should switch the code to a
compile-time decision between rwlock and rwsem
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
I know you rather want to see KVM development stalled for more months
than to get a partial solution now that already covers KVM and GRU
with the same API that XPMEM will also use later. It's very unfair on
your side to pretend to stall other
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
Implement unregister but it's not reliable, only ->release is reliable.
Why is there still the hlist stuff being used for the mmu notifier list?
And why is this still unsafe?
There are cases in which you do not take the reverse map locks or mmap_sem
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
On Tue, Apr 22, 2008 at 04:20:35PM -0700, Christoph Lameter wrote:
I guess I have to prepare another patchset then?
If you want to embarrass yourself three times in a row go ahead ;). I
thought two failed takeovers was enough.
Takeover? I'd
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
The only way to avoid failing because of vmalloc space shortage or
oom, would be to provide a O(N*N) fallback. But one that can't be
interrupted by sigkill! sigkill interruption was ok in #v12 because we
didn't rely on mmu_notifier_unregister to
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
will go in -mm in time for 2.6.26. Let's put it this way, if I fail to
merge mmu-notifier-core into 2.6.26 I'll voluntarily give up my entire
patchset and leave maintainership to you so you move 1/N to N/N and
remove mm_lock-sem patch (everything
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
On Wed, Apr 23, 2008 at 11:09:35AM -0700, Christoph Lameter wrote:
Why is there still the hlist stuff being used for the mmu notifier list?
And why is this still unsafe?
What's the problem with hlist, it saves 8 bytes for each mm_struct,
you
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
Yes, there's really no risk of races in this area after introducing
mm_lock, any place that mangles over ptes and doesn't hold any of the
three locks is buggy anyway. I appreciate the audit work (I also did
it and couldn't find bugs but the more
Thanks for adding most of my enhancements. But
1. There is no real need for invalidate_page(). Can be done with
invalidate_start/end. Needlessly complicates the API. One
of the objections by Andrew was that there mere multiple
callbacks that perform similar functions.
2.
Looks like this is not complete. There are numerous .h files missing which
means that various structs are undefined (fs.h and rmap.h are needed
f.e.) which leads to surprises when dereferencing fields of these struct.
It seems that mm_types.h is expected to be included only in certain
Missing signoff by you.
-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
Reverts a part of an earlier patch. Why isn't this merged into 1 of 12?
Why are the subjects all screwed up? They are the first line of the
description instead of the subject line of my patches.
Doing the right patch ordering would have avoided this patch and allow
better review.
On Tue, 22 Apr 2008, Andrea Arcangeli wrote:
My patch order and API backward compatible extension over the patchset
is done to allow 2.6.26 to fully support KVM/GRU and 2.6.27 to support
XPMEM as well. KVM/GRU won't notice any difference once the support
for XPMEM is added, but even if the
On Tue, 22 Apr 2008, Robin Holt wrote:
putting it back into your patch/agreeing to it remaining in Andrea's
patch? If not, I think we can put this issue aside until Andrew gets
out of the merge window and can decide it. Either way, the patches
become much more similar with this in.
One
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
On Tue, Apr 22, 2008 at 01:23:16PM -0700, Christoph Lameter wrote:
Missing signoff by you.
I thought I had to sign off if I contributed with anything that could
resemble copyright? Given I only merged that patch, I can add an
Acked-by if you like
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
On Tue, Apr 22, 2008 at 01:24:21PM -0700, Christoph Lameter wrote:
Reverts a part of an earlier patch. Why isn't this merged into 1 of 12?
To give zero regression risk to 1/12 when MMU_NOTIFIER=y or =n and the
mmu notifiers aren't registered
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
The right patch ordering isn't necessarily the one that reduces the
total number of lines in the patchsets. The mmu-notifier-core is
already converged and can go in. The rest isn't converged at
all... nearly nobody commented on the other part (the
On Wed, 23 Apr 2008, Andrea Arcangeli wrote:
I'll send an update in any case to Andrew way before Saturday so
hopefully we'll finally get mmu-notifiers-core merged before next
week. Also I'm not updating my mmu-notifier-core patch anymore except
for strict bugfixes so don't worry about any
On Thu, 17 Apr 2008, Andrea Arcangeli wrote:
Also note, EMM isn't using the clean hlist_del, it's implementing list
by hand (with zero runtime gain) so all the debugging may not exist
in EMM, so if it's really a mm_lock race, and it only
triggers with mmu notifiers and not with EMM, it
On Wed, 16 Apr 2008, Robin Holt wrote:
I don't think this lock mechanism is completely working. I have
gotten a few failures trying to dereference 0x100100 which appears to
be LIST_POISON1.
How does xpmem unregistering of notifiers work?
On Wed, 16 Apr 2008, Robin Holt wrote:
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
On Wed, 16 Apr 2008, Robin Holt wrote:
I don't think this lock mechanism is completely working. I have
gotten a few failures trying to dereference 0x100100 which appears
On Tue, 8 Apr 2008, Andrea Arcangeli wrote:
+ /*
+ * Called when nobody can register any more notifier in the mm
+ * and after the mn notifier has been disarmed already.
+ */
+ void (*release)(struct mmu_notifier *mn,
+ struct mm_struct *mm);
Not sure why this patch is not merged into 2 of 9. Same comment as last
round.
Where is the documentation on locking that you wanted to provide?
On Tue, 8 Apr 2008, Andrea Arcangeli wrote:
The difference with #v11 is a different implementation of mm_lock that
guarantees handling signals in O(N). It's also more lowlatency friendly.
Ok. So the rest of the issues remains unaddressed? I am glad that we
finally settled on the locking. But
It may also be useful to allow invalidate_start() to fail in some contexts
(try_to_unmap f.e., maybe if a certain flag is passed). This may allow the
device to get out of tight situations (pending I/O f.e. or time out if
there is no response for network communications). But then that
On Mon, 7 Apr 2008, Andrea Arcangeli wrote:
My mm_lock solution makes all rcu serialization an unnecessary
overhead so you should remove it like I already did in #v11. If it
wasn't the case, then mm_lock wouldn't be a definitive fix for the
race.
There still could be junk in the
On Sat, 5 Apr 2008, Andrea Arcangeli wrote:
In short when working with single pages it's a waste to block the
secondary-mmu page fault, because it's zero cost to invalidate_page
before put_page. Not even GRU need to do that.
That depends on what the notifier is being used for. Some
On Sat, 5 Apr 2008, Andrea Arcangeli wrote:
+ rcu_assign_pointer(mm->emm_notifier, e);
+ mm_unlock(mm);
My mm_lock solution makes all rcu serialization an unnecessary
overhead so you should remove it like I already did in #v11. If it
wasn't the case, then mm_lock wouldn't be a
I am always the guy doing the cleanup after Andrea it seems. Sigh.
Here is the mm_lock/mm_unlock logic separated out for easier review.
Adds some comments. Still objectionable is the multiple ways of
invalidating pages in #v11. Callout now has similar locking to emm.
From: Christoph Lameter
].
This slightly increases Aim9 performance results on an 8p.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
arch/x86/mm/hugetlbpage.c |  4 ++--
fs/hugetlbfs/inode.c      |  4 ++--
fs/inode.c                |  2 +-
include/linux
Provide a way to lock an mm_struct against reclaim (try_to_unmap
etc). This is necessary for the invalidate notifier approaches so
that they can reliably add and remove a notifier.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include
semantics for emm_referenced
(thanks Andrea)
- Call mm_lock/mm_unlock to protect against registration races.
Acked-by: Paul E. McKenney [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm_types.h |  3 +
include/linux/rmap.h     | 50
by 10-15%.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/rmap.h | 20 ---
mm/migrate.c         | 26 ++---
mm/mmap.c            | 28 +-
mm/rmap.c            | 53
session. Paste as many times as you like. Each pass will
increment the value one additional time. When you are tired, hit enter
in the first window. You should see the same value printed from A1 as
you most recently received from A2.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED
V2-V3:
- Fix rcu issues
- Fix emm_referenced handling
- Use Andrea's mm_lock/unlock to prevent registration races.
- Keep simple API since there does not seem to be a need to add additional
callbacks (mm_lock does not require callbacks like emm_start/stop that
I envisioned).
- Reduce CC list
this patch.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm.h |  4 ++--
mm/memory.c        | 14 ++
mm/mmap.c          |  6 +++---
3 files changed, 15 insertions(+), 9 deletions(-)
Index: linux-2.6/include/linux/mm.h
XPMEM would have used sys_madvise() except that madvise_dontneed()
returns an -EINVAL if VM_PFNMAP is set, which is always true for the pages
XPMEM imports from other partitions and is also true for uncached pages
allocated locally via the mspec allocator. XPMEM needs zap_page_range()
This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sems need to be locked.
Signed-off-by: Dean Nelson [EMAIL PROTECTED]
---
mm/filemap.c | 3 +++
1 file changed, 3 insertions(+)
Index: linux-2.6/mm/filemap.c
On Thu, 3 Apr 2008, Peter Zijlstra wrote:
It seems to me that common code can be shared using functions? No need
to stuff everything into a single function. We have method vectors all
over the kernel, we could do a_ops as a single callback too, but we
dont.
FWIW I prefer separate methods.
On Thu, 3 Apr 2008, Andrea Arcangeli wrote:
My attempt to fix this once and for all is to walk all vmas of the
mm inside mmu_notifier_register and take all anon_vma locks and
i_mmap_locks in virtual address order in a row. It's ok to take those
inside the mmap_sem. Supposedly if anybody will
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
There are much bigger issues besides the rcu safety in this patch,
proper aging of the secondary mmu through access bits set by hardware
is unfixable with this model (you would need to do age |=
e->callback), which is the proof of why this isn't
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
On Tue, Apr 01, 2008 at 01:55:36PM -0700, Christoph Lameter wrote:
This results in f.e. the Aim9 brk performance test going down by 10-15%.
I guess it's more likely because of overscheduling for small critical
sections, did you count
On Wed, 2 Apr 2008, Christoph Lameter wrote:
Here f.e. We can add a special emm_age() function that iterates
differently and does the | for you.
Well maybe not really necessary. How about this fix? Its likely a problem
to stop callbacks if one callback returned an error.
Subject: EMM
only a single
thread. That even allows to avoid the use of rcu.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
mm/rmap.c | 46 +-
1 file changed, 37 insertions(+), 9 deletions(-)
Index: linux-2.6/mm/rmap.c
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
Hmmm... Okay that is one solution that would just require a BUG_ON in the
registration methods.
Perhaps you didn't notice that this solution can't work if you call
range_begin/end not in the current context and try_to_unmap_cluster
does
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
paging), hence the slowdown. What you benchmarked is the write side,
which is also the fast path when the system is heavily CPU bound. I've
to say aim is a great benchmark to test this regression.
I am a bit surprised that brk performance is that
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
but anyway it's silly to be hardwired to such an interface that worst
of all requires switch statements instead of proper pointer to
functions and a fixed set of parameters and retval semantics for all
methods.
The EMM API with a single callback is
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1626,9 +1626,10 @@
*/
page_table = pte_offset_map_lock(mm, pmd, address,
On Thu, 3 Apr 2008, Andrea Arcangeli wrote:
That would work for #v10 if I remove the invalidate_range_start from
try_to_unmap_cluster, it can't work for EMM because you've
emm_invalidate_start firing anywhere outside the context of the
current task (even regular rmap code, not just nonlinear
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
+ void (*invalidate_page)(struct mmu_notifier *mn,
+ struct mm_struct *mm,
+ unsigned long address);
+
+ void (*invalidate_range_start)(struct mmu_notifier *mn,
+
On Thu, 3 Apr 2008, Andrea Arcangeli wrote:
I said try_to_unmap_cluster, not get_user_pages.
CPU0                              CPU1
try_to_unmap_cluster:
emm_invalidate_start in EMM (or mmu_notifier_invalidate_range_start in #v10)
walking the list by hand in EMM (or with
Thinking about this adventurous locking some more: I think you are
misunderstanding what a seqlock is. It is *not* a spinlock.
The critical read section with the reading of a version before and after
allows you access to a certain version of memory how it is or was some
time ago (caching
for unregistering. If we can get all subsystems
to stop then we can also reliably unregister a subsystem. So
provide that callback.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/rmap.h | 10 +++---
mm/rmap.c            | 30 ++
2 files changed, 37
by 10-15%.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/rmap.h | 20 ---
mm/migrate.c         | 26 ++---
mm/mmap.c            |  4 +--
mm/rmap.c            | 53 +--
4 files changed
XPMEM would have used sys_madvise() except that madvise_dontneed()
returns an -EINVAL if VM_PFNMAP is set, which is always true for the pages
XPMEM imports from other partitions and is also true for uncached pages
allocated locally via the mspec allocator. XPMEM needs zap_page_range()
will be necessary
to this patchset.
V1-V2:
- page_referenced_one: Do not increment reference count if it is already
!= 0.
- Use rcu_assign_pointer and rcu_dereference instead of putting in our
own barriers.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm_types.h
This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sems need to be locked.
Signed-off-by: Dean Nelson [EMAIL PROTECTED]
---
mm/filemap.c | 3 +++
1 file changed, 3 insertions(+)
Index: linux-2.6/mm/filemap.c
this patch.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm.h |  4 ++--
mm/memory.c        | 14 ++
mm/mmap.c          |  6 +++---
3 files changed, 15 insertions(+), 9 deletions(-)
Index: linux-2.6/include/linux/mm.h
[Note that I will be giving talks next week at the OpenFabrics Forum
and at the Linux Collab Summit in Austin on memory pinning etc. It would
be great if I could get some feedback on the approach then]
V1-V2:
- Additional optimizations in the VM
- Convert vm spinlocks to rw sems.
- Add XPMEM
session. Paste as many times as you like. Each pass will
increment the value one additional time. When you are tired, hit enter
in the first window. You should see the same value printed from A1 as
you most recently received from A2.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED
You need this patch to address the issues (that I already mentioned when I
sent the patch to you). New EMM notifier patch with sleeping coming soon.
From: Christoph Lameter [EMAIL PROTECTED]
Subject: Move tlb flushing into free_pgtables
Move the tlb flushing into free_pgtables. The conversion
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
This below simple patch invalidates the invalidate_page part, the
next patch will invalidate the RCU part, and btw in a way that doesn't
forbid unregistering the mmu notifiers at runtime (like your brand new
EMM does).
Sounds good.
The reason I
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
This combines the non-sleep-capable RCU locking of #v9 with a seqlock
so the mmu notifier fast path will require zero cacheline
writes/bouncing while still providing mmu_notifier_unregister and
allowing to schedule inside the mmu notifier methods.
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
I didn't look into this but it shows how it would be risky to make
this change in .25. It's a bit strange that the bugcheck triggers
Yes this was never intended for .25. I think we need to split this into a
copule of patches. One needs to get rid of
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
In the meantime I've also been thinking that we could need the
write_seqlock in mmu_notifier_register, to know when to restart the
loop if somebody does a mmu_notifier_register;
synchronize_rcu(). Otherwise there's no way to be sure the mmu
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
PS. this problem I pointed out of _end possibly called before _begin
is the same for #v9 and EMM V1 as far as I can tell.
Hmmm.. We could just push that on the driver saying that is has to
tolerate it. Otherwise how can we solve this?
On Fri, 7 Mar 2008, Andrea Arcangeli wrote:
This is a replacement for the previously posted 3/4, one of the pieces
to allow the mmu notifier methods to sleep.
Looks good. That is what we talked about last week. What guarantees now
that we see the cacheline referenced after the cacheline that
will be necessary
to this patchset.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm_types.h |  3 +
include/linux/rmap.h     | 50 ++
kernel/fork.c            |  3 +
mm/Kconfig               |  5 +++
mm/filemap_xip.c         |  4
On Tue, 4 Mar 2008, Nick Piggin wrote:
Then put it into the arch code for TLB invalidation. Paravirt ops gives
good examples on how to do that.
Put what into arch code?
The mmu notifier code.
What about a completely different approach... XPmem runs over NUMAlink,
right? Why not
On Tue, 4 Mar 2008, Andrea Arcangeli wrote:
When working with single pages it's more efficient and preferable to
call invalidate_page and only later release the VM reference on the
page.
But as you pointed out before that path is a slow path anyways. Its rarely
taken. Having a single
On Tue, 4 Mar 2008, Andrea Arcangeli wrote:
I once ripped invalidate_page while working on #v8 but then I
reintroduced it because I thought reducing the total number of hooks
was beneficial to the core linux VM (even if only a
microoptimization, I sure agree about that, but it's trivial to
On Tue, 4 Mar 2008, Peter Zijlstra wrote:
On Tue, 2008-03-04 at 14:35 -0800, Christoph Lameter wrote:
RCU means that the callbacks occur in an atomic context.
Not really, if it requires moving the VM locks to sleepable locks under
a .config option, I think its also fair to require
On Mon, 3 Mar 2008, Nick Piggin wrote:
I'm still not completely happy with this. I had a very quick look
at the GRU driver, but I don't see why it can't be implemented
more like the regular TLB model, and have TLB insertions depend on
the linux pte, and do invalidates _after_ restricting
On Mon, 3 Mar 2008, Nick Piggin wrote:
It is going to be really easy to add more weird and wonderful notifiers
later that deviate from our standard TLB model. It would be much harder to
remove them. So I really want to see everyone conform to this model first.
Numbers and comparisons can be
On Mon, 3 Mar 2008, Nick Piggin wrote:
Move definition of struct mmu_notifier and struct mmu_notifier_ops under
CONFIG_MMU_NOTIFIER to ensure they don't get dereferenced when they
don't make sense.
The callbacks take a mmu_notifier parameter. So how does this compile for
!MMU_NOTIFIER?
On Mon, 3 Mar 2008, Nick Piggin wrote:
Your skeleton is just registering notifiers and saying
/* you fill the hard part in */
If somebody needs a skeleton in order just to register the notifiers,
then almost by definition they are unqualified to write the hard
part ;)
Its also providing
(f.e. KVM/GRU).
If the rmap traversal spinlocks are converted to semaphores then all
callbacks will be performed in a nonatomic context. Callouts can stay
where they are.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
include/linux/mm_types.h |  3 +
include/linux/rmap.h     | 51
during rmap traversal
for files in a non atomic context. A rw style lock allows concurrent
walking of the reverse map.
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
arch/x86/mm/hugetlbpage.c |  4 ++--
fs/hugetlbfs/inode.c      |  4 ++--
fs/inode.c                |  2 +-
include
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote:
I thought you wanted to get rid of the sync via pte lock?
Sure. _notify is happening inside the pt lock by coincidence, to
reduce the changes to mm/* as long as the mmu notifiers
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
And thus the device driver may stop receiving data on a UP system? It will
never get the ack.
Not sure I follow, sorry.
My idea was:
post the invalidate in the mmio
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
Agreed. I just thought xpmem needed an invalidate-by-page, but
I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this
regard.
That means we need both the anon_vma locks and the i_mmap_lock to become
semaphores. I think semaphores are
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
That means we need both the anon_vma locks and the i_mmap_lock to become
semaphores. I think semaphores are better than mutexes. Rik and Lee saw
some performance improvements
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote:
RDMA works across a network and I would assume that it needs confirmation
that a connection has been torn down before pages can be unmapped.
Depends on the latency
On Wed, 27 Feb 2008, Andrea Arcangeli wrote:
What Christoph need to do when he's back from vacations to support
sleepable mmu notifiers is to add a CONFIG_XPMEM config option that
will switch the i_mmap_lock from a semaphore to a mutex (any other
change to this patch will be minor compared to
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
This is not going to work even if the mutex would work as easily as you
think since the patch here still does an rcu_lock/unlock around a callback.
See underlined.
Mutex is not acceptable for performance reasons. I think we can just drop
the
On Wed, 27 Feb 2008, Andrea Arcangeli wrote:
+struct mmu_notifier_head {
+ struct hlist_head head;
+ spinlock_t lock;
+};
Still think that the lock here is not of much use and can be easily
replaced by mmap_sem.
+#define mmu_notifier(function, mm, args...)
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote:
What about invalidate_page()?
That would just spin waiting an ack (just like the smp-tlb-flushing
invalidates in numa already does).
And thus the device driver may stop
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
Also re the _notify variants: The binding to pte_clear_flush_young etc
will become a problem for notifiers that want to sleep because
pte_clear_flush is usually called with the pte lock held. See f.e.
try_to_unmap_one, page_mkclean_one etc.
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Thu, Feb 28, 2008 at 05:17:33PM -0600, Jack Steiner wrote:
I disagree. The location of the callout IS a performance issue. In simple
comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X
increase in the number of TLB
On Tue, 19 Feb 2008, Andrea Arcangeli wrote:
Yes, that's why I kept maintaining my patch and I posted the last
revision to Andrew. I use pte/tlb locking of the core VM, it's
unintrusive and obviously safe. Furthermore it can be extended with
Christoph's stuff in a 100% backwards compatible