This allows compiling the external module against linux.git (fastcall
has finally become the default and only choice).
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/kernel/external-module-compat.h b/kernel/external-module-compat.h
index 5611c12..52b745c 100644
--- a/kernel
On Fri, Feb 15, 2008 at 07:37:36PM -0800, Andrew Morton wrote:
The | is obviously deliberate. But no explanation is provided telling us
why we still call the callback if ptep_clear_flush_young() said the page
was recently referenced. People who read your code will want to understand
this.
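A user-space sketch of why the `|` is deliberate (all names here are illustrative stand-ins, not the kernel's functions): the linux pte and the secondary-MMU spte keep independent accessed bits, so the notifier callback must run, and its result be OR-ed in, even when the pte alone already reported the page young; otherwise the spte's accessed bit would never get cleared and later reference scans would see stale youngness.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated accessed bits: one in the linux pte, one in the spte. */
static bool pte_young_flag, spte_young_flag;

static bool ptep_clear_flush_young_sim(void)
{
    bool y = pte_young_flag;
    pte_young_flag = false;   /* clear + flush the linux pte's bit */
    return y;
}

static bool mmu_notifier_clear_flush_young_sim(void)
{
    bool y = spte_young_flag;
    spte_young_flag = false;  /* clear the secondary MMU's bit too */
    return y;
}

static int page_referenced_one_sim(void)
{
    int young = ptep_clear_flush_young_sim();
    /* Deliberately unconditional: even if the linux pte already said
     * "referenced", the spte accessed bit must still be cleared. */
    young |= mmu_notifier_clear_flush_young_sim();
    return young;
}
```

The page counts as referenced if either side saw it referenced, and both bits end up cleared regardless.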
. The race
can materialize if the linux pte is zapped after get_user_pages
returns but before the page is mapped by the spte and tracked by
rmap. The invalidate_ calls can also likely be optimized further but
it's not a fast path so it's not urgent.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
On Sat, Feb 16, 2008 at 03:08:17AM -0800, Andrew Morton wrote:
On Sat, 16 Feb 2008 11:48:27 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:
+void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
+ struct mm_struct *mm
On Sat, Feb 16, 2008 at 05:51:38AM -0600, Robin Holt wrote:
I am doing this in xpmem with a stack-based structure in the function
calling get_user_pages. That structure describes the start and
end address of the range we are doing the get_user_pages on. If an
invalidate_range_begin comes in
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote:
As far as sleeping inside callbacks goes... I think there are big
problems with the patch (the sleeping patch and the external rmap
patch). I don't think it is workable in its current state. Either
we have to make some big changes to
On Tue, Feb 19, 2008 at 09:43:57AM +0100, Nick Piggin wrote:
are rather similar. However I have tried to make a point of minimising the
impact on the core mm/. I don't see why we need to invalidate or flush
I also tried hard to minimise the impact on the core mm/, I also
argued with Christoph
On Tue, Feb 19, 2008 at 07:46:10PM +1100, Nick Piggin wrote:
On Sunday 17 February 2008 06:22, Christoph Lameter wrote:
On Fri, 15 Feb 2008, Andrew Morton wrote:
flush_cache_page(vma, address, pte_pfn(*pte));
entry = ptep_clear_flush(vma, address, pte);
On Tue, Feb 19, 2008 at 11:59:23PM +0100, Nick Piggin wrote:
That's why I don't understand the need for the pairs: it should be
done like this.
Yes, except it can't be done like this for xpmem.
OK, I didn't see the invalidate_pages call...
See the last patch I posted to Andrew, you've
On Wed, Feb 20, 2008 at 12:11:57AM +0100, Nick Piggin wrote:
Sorry, I realise I still didn't get this through my head yet (and also
have not seen your patch recently). So I don't know exactly what you
are doing...
The last version was posted here:
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
You can't sleep inside rcu_read_lock()!
I must say that for a patch that is up to v8 or whatever and is
posted twice a week to such a big cc list, it is kind of slack to
not even test it and expect other people to review it.
Well,
. I doubt xpmem fits inside a
CONFIG_MMU_NOTIFIER anymore, or we'll all run a bit slower because of
it. It's really a call of how much we want to optimize the MMU
notifier, by keeping things like RCU for the registration.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/include/asm
A 2.6.25-rc based kernel spawned an oops in mmdrop when kvm quit,
which reminded me of this:
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/kernel/external-module-compat.h b/kernel/external-module-compat.h
index 20ef841..fd3cb1d 100644
--- a/kernel/external-module-compat.h
+++ b
,
without requiring a page pin).
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 41962e7..e1287ab 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
tristate "Kernel-based Virtual Machine"
On Wed, Feb 20, 2008 at 05:33:13AM -0600, Robin Holt wrote:
But won't that other subsystem cause us to have two separate callouts
that do equivalent things and therefore force a removal of this and go
back to what Christoph has currently proposed?
The point is that a new kind of notifier that
On Wed, Feb 20, 2008 at 06:24:24AM -0600, Robin Holt wrote:
We do not need to do any allocation in the messaging layer, all
structures used for messaging are allocated at module load time.
The allocation discussions we had early on were about trying to
rearrange your notifiers to allow a
On Wed, Feb 20, 2008 at 08:41:55AM -0600, Robin Holt wrote:
On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
XPMEM simply can't use RCU for the registration locking if it wants to
schedule inside the mmu notifier calls. So I guess it's better to add
Whoa
On Thu, Feb 21, 2008 at 05:54:30AM +0100, Nick Piggin wrote:
will send you incremental changes that can be discussed more easily
that way (nothing major, mainly style and minor things).
I don't need to say you're very welcome ;).
I agree: your coherent, non-sleeping mmu notifiers are pretty
on the below to be optimal for GRU/KVM and
trivially extendible once a CONFIG_XPMEM will be added. So this first
part can go in now I think.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
diff --git a/include/linux/mm_types.h b/include/linux
Same as before but on the one hand ported to the #v7 API and on the other
hand ported to latest kvm.git.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 41962e7..e1287ab 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
Hi Izik, kvm-devel,
Just wanted to remind that if we'll converge on #v7, the ksm code
in replace_page will have to call ptep_clear_flush_notify too (just
like do_wp_page).
On Wed, Feb 27, 2008 at 03:06:10PM -0800, Christoph Lameter wrote:
Ok so it somehow works slowly with GRU and you are happy with it. What
As far as GRU is concerned, performance is the same as with your patch
(Jack can confirm).
about the RDMA folks etc etc?
If RDMA/IB folks needed to block
On Wed, Feb 27, 2008 at 02:23:29PM -0800, Christoph Lameter wrote:
How would that work? You rely on the pte locking. Thus calls are all in an
I don't rely on the pte locking in #v7, exactly to satisfy GRU
(so far purely theoretical) performance complaints.
atomic context. I think we need a
On Wed, Feb 27, 2008 at 02:35:59PM -0800, Christoph Lameter wrote:
Could you be specific? This refers to page migration? Hmmm... Guess we
If the reader schedules, synchronize_rcu will return on the other
cpu and the objects in the list will be freed and overwritten, and
when the task is
On Wed, Feb 27, 2008 at 04:08:07PM -0800, Christoph Lameter wrote:
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
If RDMA/IB folks needed to block in invalidate_range, I guess they
need to do so on top of tmpfs too, and that never worked with your
patch anyway.
How about blocking
On Wed, Feb 27, 2008 at 02:39:46PM -0800, Christoph Lameter wrote:
On Wed, 20 Feb 2008, Andrea Arcangeli wrote:
Well, xpmem requirements are complex. As a side effect of the
simplicity of my approach, my patch is 100% safe since #v1. Now it
also works for GRU and it cluster invalidates
On Wed, Feb 27, 2008 at 02:43:41PM -0800, Christoph Lameter wrote:
Nope. unmap_mapping_range is already handled by the range callbacks.
But they're called with atomic=1 on anything but anonymous memory. I
understood Andrew asked to remove the atomic param and to allow
sleeping for all kind of
On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote:
Erm. This would also be needed by RDMA etc.
The only RDMA I know is Quadrics, and Quadrics apparently doesn't need
to schedule inside the invalidate methods AFAIK, so I doubt the above
is true. It'd be interesting to know if IB is
On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote:
RDMA works across a network and I would assume that it needs confirmation
that a connection has been torn down before pages can be unmapped.
Depends on the latency of the network, for example with page pinning
it can even try
On Thu, Feb 28, 2008 at 11:48:10AM -0800, Christoph Lameter wrote:
make it work after the VM locking will be altered (for example the
^^^
CONFIG_XPMEM should also switch the mmu_register/unregister locking
On Thu, Feb 28, 2008 at 05:17:33PM -0600, Jack Steiner wrote:
I disagree. The location of the callout IS a performance issue. In simple
comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X
increase in the number of TLB purges being issued to the GRU. TLB flushing
Are you
On Thu, Feb 28, 2008 at 03:05:30PM -0800, Christoph Lameter wrote:
Still think that the lock here is not of too much use and can be easily
replaced by mmap_sem.
I can use the mmap_sem.
+#define mmu_notifier(function, mm, args...) \
+ do {
On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote:
What about invalidate_page()?
That would just spin waiting for an ack (just like the smp-tlb-flushing
invalidates in numa already do).
Thinking more about this, we could also parallelize it with an
invalidate_page_before/end. If
On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
And thus the device driver may stop receiving data on a UP system? It will
never get the ack.
Not sure to follow, sorry.
My idea was:
post the invalidate in the mmio region of the device
smp_call_function()
while
On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote:
I thought you wanted to get rid of the sync via pte lock?
Sure. _notify is happening inside the pt lock by coincidence, to
reduce the changes to mm/* as long as the mmu notifiers aren't
sleep capable.
What changes to do_wp_page
On Fri, Feb 29, 2008 at 11:55:17AM -0800, Christoph Lameter wrote:
post the invalidate in the mmio region of the device
smp_call_function()
while (mmio device wait-bitflag is on);
So the device driver on UP can only operate through interrupts? If you are
hogging the only cpu
On Fri, Feb 29, 2008 at 01:34:34PM -0800, Christoph Lameter wrote:
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
That means we need both the anon_vma locks and the i_mmap_lock to become
semaphores. I think semaphores
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
That means we need both the anon_vma locks and the i_mmap_lock to become
semaphores. I think semaphores are better than mutexes. Rik and Lee saw
some performance improvements because list can be traversed in parallel
when
On Fri, Feb 29, 2008 at 02:12:57PM -0800, Christoph Lameter wrote:
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
AFAICT The rw semaphore fastpath is similar in performance to a rw
spinlock.
read side is taken in the slow path.
Slowpath meaning VM slowpath or lock slow path? Its
to linux-mm in a separate thread).
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/mm/rmap.c b/mm/rmap.c
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -274,7 +274,7 @@ static int page_referenced_one(struct pa
unsigned long address;
pte_t *pte;
spinlock_t *ptl;
- int
in .26. The brainer part of the VM work to do to make it sleep
capable is pretty much orthogonal with this patch.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include
On Mon, Mar 03, 2008 at 04:29:34AM +0100, Nick Piggin wrote:
to something I prefer. Others may not, but I'll post them for debate
anyway.
Sure, thanks!
I didn't drop invalidate_page, because invalidate_range_begin/end
would be slower for usages like KVM/GRU (we don't need a begin/end
On Mon, Mar 03, 2008 at 02:10:17PM +0100, Nick Piggin wrote:
Is this just a GRU problem? Can't we just require them to take a ref
on the page (IIRC Jack said GRU could be changed to more like a TLB
model).
Yes, it's just a GRU problem, it tries to optimize performance by
calling follow_page
On Mon, Mar 03, 2008 at 11:01:22AM -0800, Christoph Lameter wrote:
API still has rcu issues and the example given for making things sleepable
is only working for the aging callback. The most important callback is for
try_to_unmap and page_mkclean. This means the API is still not generic
and at the
same time deferring _end after the whole tlb_gather page freeing is
reducing the number of invalidates.
.26 will allow all the methods to sleep by following the roadmap
described in the #v8 patch.
KVM so far is swapping fine on top of this.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Notably the registration now requires the mmap_sem in write mode.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 41962e7..e1287ab 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
Hello Izik,
On Tue, Mar 04, 2008 at 02:44:07AM +0200, Izik Eidus wrote:
I wrote to you about this before (I didn't get an answer for this so I write
Ouch I must have lost your previous comment with a too-fast pgdown in
the full quoting of the patch sorry.
again)
with large pages support i think
On Mon, Mar 03, 2008 at 11:31:15PM -0800, Christoph Lameter wrote:
@@ -446,6 +450,8 @@ static int page_mkclean_one(struct page
if (address == -EFAULT)
goto out;
+ /* rmap lock held */
+ emm_notify(mm, emm_invalidate_start, address, address + PAGE_SIZE);
On Tue, Mar 04, 2008 at 11:00:31AM -0800, Christoph Lameter wrote:
But as you pointed out before that path is a slow path anyways. Its rarely
It's a slow path but I don't see why you think two hooks are better
than one, when only one is necessary.
I once ripped invalidate_page while working on
) is to
decrease the non obviously safe mangling over mm/* during .25. The
below patch is simple, but not as obviously safe as
s/ptep_clear_flush/ptep_clear_flush_notify/.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
to
keep mmu_notifier_unregister.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -10,6 +10,7 @@
#include <linux/rbtree.h>
#include <linux/rwsem.h>
#include <linux
This is a rediff of Christoph's plain i_mmap_lock2rwsem patch on top
of #v9 1/4 + 2/4 + 3/4 (hence this is called 4/4). This is mostly to
show that after 3/4, any patch that plugs on the EMM patchset will
plug nicely on top of my MMU notifier patchset too.
The patch triggers bug checks here in
On Fri, Mar 07, 2008 at 05:52:42PM +0100, Peter Zijlstra wrote:
hlist_del_rcu(&mn->hlist)
+ rcu_read_unlock();
kfree(mn);
young |= mn->ops->clear_flush_young(mn, mm, address);
*BANG*
My objective was to allow mmu_notifier_register/unregister to be
On Fri, Mar 07, 2008 at 07:01:35PM +0100, Peter Zijlstra wrote:
The reason Christoph can do without RCU is because he doesn't allow
unregister, and as soon as you drop that you'll end up with something
Not sure to follow, what do you mean he doesn't allow? We'll also
have to rip unregister
On Wed, Mar 05, 2008 at 04:22:11PM -0800, Christoph Lameter wrote:
+ if (e->callback) {
+ x = e->callback(e, mm, op, start, end);
+ if (x)
+ return x;
[..]
+
+ if (emm_notify(mm, emm_referenced, address,
On Fri, Mar 07, 2008 at 07:45:52PM +0100, Andrea Arcangeli wrote:
On Fri, Mar 07, 2008 at 07:01:35PM +0100, Peter Zijlstra wrote:
The reason Christoph can do without RCU is because he doesn't allow
unregister, and as soon as you drop that you'll end up with something
Not sure to follow
On Thu, Mar 20, 2008 at 02:09:15PM +0200, Avi Kivity wrote:
Marcelo Tosatti wrote:
Add an ioctl to zap all mappings to a given gfn. This allows userspace
to remove the QEMU process mappings and the page without causing
inconsistency.
I'm thinking of committing rmap_nuke() to kvm.git,
On Fri, Mar 21, 2008 at 10:37:00AM -0300, Marcelo Tosatti wrote:
This is not the final put_page().
Remote TLB's are flushed here, after rmap_remove:
+ if (nuked)
+ kvm_flush_remote_tlbs(kvm);
This ioctl is called before zap_page_range() is executed through
On Fri, Mar 21, 2008 at 06:23:41PM -0300, Marcelo Tosatti wrote:
If there are any active shadow mappings to a page there is a guarantee
that there is a valid linux pte mapping pointing at it. So page_count ==
^^ was
1 + nr_sptes.
Yes.
So the theoretical race you're talking
On Mon, Mar 24, 2008 at 07:54:27AM +0100, Andrea Arcangeli wrote:
I'd more accurately describe the race as this:
CPU0 CPU1
spte = rmap_next(kvm, rmapp, NULL);
while (spte) {
BUG_ON(!spte
On Wed, Mar 26, 2008 at 02:51:28PM -0300, Marcelo Tosatti wrote:
Nope. If a physical CPU has page translations cached it _must_ be
running in the context of a qemu thread (does not matter if its in
userspace or executing guest code). The bit corresponding to such CPU's
will be set in
On Wed, Mar 26, 2008 at 05:02:53PM +0200, Avi Kivity wrote:
Andrea notes that freeing the page before flushing the tlb is a race, as the
guest can sneak in one last write before the tlb is flushed, writing to a
page that may belong to someone else.
Fix by reversing the order of freeing and
On Wed, Mar 26, 2008 at 08:22:31PM +0100, Andrea Arcangeli wrote:
what happens if invalidate_page runs after rmap_remove is returned
(the spte isn't visible anymore by the rmap code and in turn by
invalidate_page) but before the set_shadow_pte(nonpresent) runs.
Thinking some more the mmu_lock
On Thu, Mar 27, 2008 at 10:11:42AM +0200, Avi Kivity wrote:
Erm I don't think this means what you think it means. This is the
kernel/user communication area, used to pass exit data to userspace. It's
not the memslot vma.
Yep... only the kvm_vm_vm_ops can run gfn_to_page, and I assume that
On Thu, Mar 27, 2008 at 03:56:56PM +0200, Avi Kivity wrote:
That's not good. We need to support the older userspace, for a while yet.
Why is there a problem? IIRC it's just anonymous memory.
Problem is that for it to be unmapped __do_fault must call
page_add_new_anon_rmap on it. Even anon
safe spte=nonpresent; tlbflush; put_page
ordering, then it'll always be safe, but it'll be slower as there will
be more tlb flushes than needed.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index caa9f94..5343216 100644
--- a/arch/x86/kvm/x86
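The safe teardown ordering discussed above can be sketched in user space (the step names mirror the discussion; `zap_spte` and the log are illustrative, not kvm's code): make the spte non-present first, flush remote TLBs second, and only then drop the page reference. Freeing the page before the flush would let the guest sneak in one last write through a stale TLB entry into a page that may already belong to someone else.

```c
#include <assert.h>

/* Record the teardown steps and verify the safe order:
 * spte = nonpresent; tlb flush; put_page. */
enum step { SPTE_NONPRESENT, TLB_FLUSH, PUT_PAGE };
static enum step steps[3];
static int nsteps;

static void record(enum step s)
{
    steps[nsteps++] = s;
}

static void zap_spte(void)
{
    record(SPTE_NONPRESENT); /* guest can no longer fault it back in */
    record(TLB_FLUSH);       /* no cpu still holds a stale translation */
    record(PUT_PAGE);        /* only now may the page really be freed */
}

static int order_is_safe(void)
{
    return nsteps == 3 &&
           steps[0] == SPTE_NONPRESENT &&
           steps[1] == TLB_FLUSH &&
           steps[2] == PUT_PAGE;
}
```

As the thread notes, always enforcing this ordering is safe but costs extra tlb flushes compared with schemes that batch them.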
This is crashing my lowmem reserved RAM patch at boot. This is causing
GFP_DMA allocations at boot for no good reason. It crashes in my case
because there's no ram below 16M available to linux. Are you sure this
is needed at all, for sure if there's any bug this isn't the right fix.
Please
On Fri, Mar 28, 2008 at 03:01:13PM +0100, Andrea Arcangeli wrote:
@@ -271,8 +292,12 @@ int __kvm_set_memory_region(struct kvm *kvm,
r = -EINVAL;
/* General sanity checks */
+ if (mem->userspace_addr & (PAGE_SIZE - 1))
+ goto out;
if (mem->memory_size
On Mon, Mar 31, 2008 at 09:35:00AM +0300, Avi Kivity wrote:
This can be done by taking mmu_lock in _begin and releasing it in _end,
unless there's a lock dependency issue.
The main problem is if we want to be able to co-exist with XPMEM methods
registered in the same notifier chain for the same MM
Hello,
These three patches (one against host kernel, one against kvm.git, one
against kvm-userland.git) forces KVM to map all RAM mapped in the
virtualized e820 map provided to the guest with gfn = hfn. In turn
it's now possible to give direct hardware access to the guest, all DMA
will work fine
array be
allocated with holes corresponding to the holes generated in the e820
map, simply bad_page will be returned gracefully without risk, as if
this patch wasn't applied to kvm.git.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm
don't use pci-passthrough, but then
pci passthrough will randomly memory corrupt the host.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/bios/rombios.c b/bios/rombios.c
index 318de57..f93a6c6 100644
--- a/bios/rombios.c
+++ b/bios/rombios.c
@@ -4251,6 +4251,7 @@ int15_function32
-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1107,8 +1107,36 @@ config CRASH_DUMP
(CONFIG_RELOCATABLE=y).
For more details see Documentation/kdump/kdump.txt
+config RESERVE_PHYSICAL_START
tell, this will stop the regression with isa dma
operations at boot for 99% of blkdev/memory combinations out there and
I guess this fixes the setups with 4G of ram and 32bit pci cards as
well (this also retains symmetry with the 32bit code).
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff
On Tue, Apr 01, 2008 at 10:20:49AM -0500, Anthony Liguori wrote:
Which is apparently entirely unnecessary as we already have
/sys/bus/pci/.../region. It's just a matter of checking if a vma is VM_IO
and then dealing with the subsequent reference counting issues as Avi
points out.
Do you
On Tue, Apr 01, 2008 at 10:22:51PM +0300, Avi Kivity wrote:
It's just something we discussed, not code.
Yes, the pfn_valid check should skip all refcounting for mmio regions
without a struct page. But gfn_to_page can't work without a struct
page, so some change will be needed there. With the
On Tue, Apr 01, 2008 at 01:55:32PM -0700, Christoph Lameter wrote:
+/* Perform a callback */
+int __emm_notify(struct mm_struct *mm, enum emm_operation op,
+ unsigned long start, unsigned long end)
+{
+ struct emm_notifier *e = rcu_dereference(mm->emm_notifier);
+ int x;
On Wed, Apr 02, 2008 at 07:32:35AM +0300, Avi Kivity wrote:
It ought to work. gfn_to_hfn() (old gfn_to_page) will still need to take a
refcount if possible.
This reminds me that with mmu notifiers we could implement gfn_to_hfn
with just follow_page and skip the refcounting on the struct page.
On Wed, Apr 02, 2008 at 12:50:50PM +0300, Avi Kivity wrote:
Isn't it faster though? We don't need to pull in the cacheline containing
the struct page anymore.
Exactly, not only that, get_user_pages is likely a bit slower than we
need for just kvm pte lookup. GRU uses follow_page directly
On Wed, Apr 02, 2008 at 02:16:41PM +0300, Avi Kivity wrote:
Ugh, there's still mark_page_accessed() and SetPageDirty().
btw, like PG_dirty is only set if the spte is writeable,
mark_page_accessed should only run if the accessed bit is set in the
spte. It doesn't matter now as nobody could
On Wed, Apr 02, 2008 at 01:50:19PM +0200, Andrea Arcangeli wrote:
if (pfn_valid(pfn)) {
page = pfn_to_page(pfn);
if (!PageReserved(page)) {
BUG_ON(page_count(page) != 1);
if (is_writeable_pte(*spte
On Tue, Apr 01, 2008 at 01:55:36PM -0700, Christoph Lameter wrote:
This results in f.e. the Aim9 brk performance test going down by 10-15%.
I guess it's more likely because of overscheduling for small critical
sections, did you count the total number of context switches? I
guess there will
On Wed, Apr 02, 2008 at 12:03:50PM -0700, Christoph Lameter wrote:
+ /*
+ * Callback may return a positive value to indicate a count
+ * or a negative error code. We keep the first error code
+ * but
On Wed, Apr 02, 2008 at 10:59:50AM -0700, Christoph Lameter wrote:
Did I see #v10? Could you start a new subject when you post please? Do
not respond to some old message otherwise the threading will be wrong.
I wasn't clear enough, #v10 was in the works... I was thinking about
the last two
On Wed, Apr 02, 2008 at 11:15:26AM -0700, Christoph Lameter wrote:
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
On Tue, Apr 01, 2008 at 01:55:36PM -0700, Christoph Lameter wrote:
This results in f.e. the Aim9 brk performance test going down by
10-15%.
I guess it's more likely
# HG changeset patch
# User Andrea Arcangeli [EMAIL PROTECTED]
# Date 1207159010 -7200
# Node ID fe00cb9deeb31467396370c835cb808f4b85209a
# Parent a406c0cc686d0ca94a4d890d661cdfa48cfba09f
Moves all mmu notifier methods outside the PT lock (first and not last
step to make them sleep capable
# HG changeset patch
# User Andrea Arcangeli [EMAIL PROTECTED]
# Date 1207159011 -7200
# Node ID 3c3787c496cab1fc590ba3f97e7904bdfaab5375
# Parent d880c227ddf345f5d577839d36d150c37b653bfd
The conversion to a rwsem allows callbacks during rmap traversal
for files in a non atomic context. A rw
Hello,
this is the mmu notifier #v10. Patches 1 and 2 are the only difference between
this and EMM V2. The rest is the same as with Christoph's patches.
I think maximum priority should be given to merging patches 1 and 2 into -mm
and ASAP into mainline.
Patches from 3 to 8 can go in -mm for testing
# HG changeset patch
# User Andrea Arcangeli [EMAIL PROTECTED]
# Date 1207159055 -7200
# Node ID 316e5b1e4bf388ef0198c91b3067ed1e4171d7f6
# Parent 3c3787c496cab1fc590ba3f97e7904bdfaab5375
We no longer abort unmapping in unmap_vmas because we can reschedule while
unmapping since we are holding
# HG changeset patch
# User Andrea Arcangeli [EMAIL PROTECTED]
# Date 1207159059 -7200
# Node ID f3f119118b0abd9c4624263ef388dc7230d937fe
# Parent 31fc23193bd039cc595fba1ca149a9715f7d0fb2
This patch adds a lock ordering rule to avoid a potential deadlock when
multiple mmap_sems need to be locked
# HG changeset patch
# User Andrea Arcangeli [EMAIL PROTECTED]
# Date 1207159010 -7200
# Node ID d880c227ddf345f5d577839d36d150c37b653bfd
# Parent fe00cb9deeb31467396370c835cb808f4b85209a
Move the tlb flushing into free_pgtables. The conversion of the locks
taken for reverse map scanning would
On Wed, Apr 02, 2008 at 02:54:52PM -0700, Christoph Lameter wrote:
On Wed, 2 Apr 2008, Andrea Arcangeli wrote:
Hmmm... Okay that is one solution that would just require a BUG_ON in the
registration methods.
Perhaps you didn't notice that this solution can't work if you call
On Wed, Apr 02, 2008 at 02:56:25PM -0700, Christoph Lameter wrote:
I am a bit surprised that brk performance is that important. There may be
I think it's not brk but fork that is being slowed down, did you
oprofile? AIM forks a lot... The write side fast path generating the
overscheduling I
On Wed, Apr 02, 2008 at 03:06:19PM -0700, Christoph Lameter wrote:
On Thu, 3 Apr 2008, Andrea Arcangeli wrote:
That would work for #v10 if I remove the invalidate_range_start from
try_to_unmap_cluster, it can't work for EMM because you've
emm_invalidate_start firing anywhere outside
On Wed, Apr 02, 2008 at 03:34:01PM -0700, Christoph Lameter wrote:
Still two methods ...
Yes, the invalidate_page is called with the core VM holding a
reference on the page _after_ the tlb flush. The invalidate_end is
called after the page has been freed already and after the tlb
flush. They've
On Thu, Apr 03, 2008 at 12:40:46PM +0200, Peter Zijlstra wrote:
It seems to me that common code can be shared using functions? No need
FWIW I prefer separate methods.
kvm patch using mmu notifiers shares 99% of the code too between the
two different methods implemented indeed. Code sharing is
On Wed, Apr 02, 2008 at 06:24:15PM -0700, Christoph Lameter wrote:
Ok lets forget about the single threaded thing to solve the registration
races. As Andrea pointed out this still has issues with other subscribed
subsystems (and also try_to_unmap). We could do something like what
On Thu, Apr 03, 2008 at 12:20:41PM -0700, Christoph Lameter wrote:
On Thu, 3 Apr 2008, Andrea Arcangeli wrote:
My attempt to fix this once and for all is to walk all vmas of the
mm inside mmu_notifier_register and take all anon_vma locks and
i_mmap_locks in virtual address order in a row
of this one.
Andrew, can you apply this to -mm?
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
diff --git a/include/linux/mm.h b/include/linux/mm.h
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
On Fri, Apr 04, 2008 at 03:06:18PM -0700, Christoph Lameter wrote:
Adds some comments. Still objectionable is the multiple ways of
invalidating pages in #v11. Callout now has similar locking to emm.
range_begin exists because range_end is called after the page has
already been freed.