On Tue, Aug 30, 2005 at 05:56:33PM +0100, Alan Cox wrote:
I doubt there is anything needed that can't be done in sh and nc here.
Catching boots can be done by adding one to a boot number and sending
that as well. How does suspend to disk handle uptime - if the uptime
stops then sending the
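The boot-counter idea above is simple enough to sketch; here is a hypothetical version in Python (Alan's suggestion was plain sh and nc, and the file name here is illustrative, not part of any real klive protocol):

```python
from pathlib import Path

def next_boot_number(path="boot_count"):
    """Increment and return a persistent boot counter (illustrative sketch).

    Sending this number along with the uptime lets the collector tell
    reboots apart even when uptime alone is ambiguous (e.g. across
    suspend to disk, where the uptime clock may stop).
    """
    p = Path(path)
    n = int(p.read_text()) if p.exists() else 0
    p.write_text(str(n + 1))
    return n + 1
```

The counter file survives reboots, so each boot sends a strictly increasing number.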
On Tue, Aug 30, 2005 at 10:08:38AM -0700, Wilkerson, Bryan P wrote:
they're work, I'm not sure I'd trust or use the data unless it was
somehow authenticated.
I doubt many testers would be willing to register on yet another website
just for this. So I doubt adding authentication is a good
On Tue, Aug 30, 2005 at 06:11:26PM -0400, Bill Davidsen wrote:
the system, like load. A week running while I was on vacation doesn't
test much, a week running on a loaded server tests other things.
btw, I thought about adding the load average too but it wasn't really
interesting, since
On Wed, Aug 31, 2005 at 12:14:23PM -0700, [EMAIL PROTECTED] wrote:
Do you want to try to handle version skew ? All kernels built
from GIT trees look like 2.6.13 until Linus releases 2.6.14-rc1.
Possible approaches (requiring changes to the kernel Makefile).
1) Use the SHA1 of HEAD to provide
On Wed, Aug 31, 2005 at 04:28:59PM +0200, Sven Ladegast wrote:
Why not generate a unique system ID at the compilation stage of the kernel
if the appropriate kernel option is enabled? This needn't have anything to
do with klive...just a unique kernel-ID or something like that.
I could also store
On Wed, Aug 31, 2005 at 08:20:51PM +0200, Pavel Machek wrote:
Well, you could remove everything that is not valid kernel text from
backtrace.
What if the corruption wrote the ssh key inside the kernel text?
As suggested before, I suspect the only way would be to make it
optional.
Oh and
On Wed, Aug 31, 2005 at 08:32:00PM +0200, Pavel Machek wrote:
I'd say ignore suspend. Machines using it are probably not connected
to the network, anyway, and it stresses the system quite a lot.
Currently even if you're not connected to the network it's fine. As long
as you connect sometime. If a
exit_code 0 signal 0
The seccomp_test.py completed successfully, thank you for testing.
Thanks.
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff -r 1df7bfbb783f arch/i386/kernel/ptrace.c
--- a/arch/i386/kernel/ptrace.c Fri Sep 2 09:01:35 2005
+++ b/arch/i386/kernel/ptrace.c Mon Sep 5 05
On Mon, Jan 29, 2007 at 03:08:44PM +0100, Andrea Gelmini wrote:
On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote:
On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote:
Hi,
I can't do the test 'till next week.
Thanks a lot for your time,
Gelma
Have you
On Sun, Jan 28, 2007 at 06:03:08PM +0100, Denis Vlasenko wrote:
I still don't see much difference between O_SYNC and O_DIRECT write
semantics.
O_DIRECT is about avoiding the copy_user between cache and userland,
when working with devices that run faster than RAM (think >=100M/sec,
quite standard
On Tue, Jan 30, 2007 at 01:50:41PM -0500, Phillip Susi wrote:
It should return the number of bytes successfully written before the
error, giving you the location of the first error. Using smaller
individual writes (preferably issued in parallel) also allows the
problem spot to be
On Tue, Jan 30, 2007 at 08:57:20PM +0100, Andrea Arcangeli wrote:
Please try yourself, it's simple enough:
time dd if=/dev/hda of=/dev/null bs=16M count=100
time dd if=/dev/hda of=/dev/null bs=16M count=100 iflag=sync
sorry, reading won't help much to exercise sync
On Tue, Jan 30, 2007 at 06:07:14PM -0500, Phillip Susi wrote:
It most certainly matters where the error happened because "you are
screwed" is not an acceptable outcome in a mission critical application.
An I/O error is not an acceptable outcome in a mission critical app,
all mission critical
On Thu, Feb 01, 2007 at 12:20:59PM +0100, Andi Kleen wrote:
I think a better way to do this would be to define a new
CLOCK_THREAD_MONOTONOUS
(or better name) timer for clock_gettime().
[and my currently stalled vdso patches that implement clock_gettime
as a vsyscall]
Then also an
On Thu, Feb 01, 2007 at 01:02:41PM +0100, Andi Kleen wrote:
I don't think so because having per process state in a vsyscall
is quite costly. You would need to allocate at least one more
page to each process, which I think would be excessive.
You would need one page per cpu and to check a
Hello everyone,
This is a long thread about O_DIRECT, surprisingly without a single
bug report in it; that's a good sign that O_DIRECT is starting to work
well in 2.6 too ;)
On Fri, Jan 12, 2007 at 02:47:48PM -0800, Andrew Morton wrote:
On Fri, 12 Jan 2007 15:35:09 -0700
Erik Andersen [EMAIL
On Tue, Jan 23, 2007 at 01:10:46AM +0100, Niki Hammler wrote:
Dear Linux Developers/Enthusiasts,
For a course at my university I'm implementing parts of an operating
system where I get most ideas from the Linux Kernel (which I like very
much). One book I gain information from is [1].
On Tue, Jan 23, 2007 at 07:01:33AM +0530, Balbir Singh wrote:
This makes me wonder if it makes sense to split up the LRU into page
cache LRU and mapped pages LRU. I see two benefits
1. Currently based on swappiness, we might walk an entire list
searching for page cache pages or mapped
On Mon, Dec 11, 2006 at 01:17:25PM -0800, dean gaudet wrote:
rdtscp doesn't solve anything extra [..]
[..] lsl-based vgetcpu is relatively slow
Well, if you accept running slow there's nothing to solve in the first
place indeed.
If nothing else rdtscp should avoid the mess of restarting a
On Mon, Dec 11, 2006 at 03:15:44PM -0800, dean gaudet wrote:
rdtscp gets you 2 of the 5 values you need to compute the time. anything
can happen between when you do the rdtscp and do the other 3 reads: the
computation is (((tsc-A)*B)>>N)+C where N is a constant, and A, B, C are
per-cpu
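The per-cpu computation dean describes can be written out explicitly; a sketch with made-up constants (in the real vsyscall A, B and C are per-cpu values published by the kernel, and the hard part is that anything can happen between the rdtscp and the other reads):

```python
def cycles_to_time(tsc, A, B, C, N):
    """Convert a TSC reading to time, per the formula in the mail:
    time = (((tsc - A) * B) >> N) + C
    A: per-cpu tsc offset, B: per-cpu scale factor,
    C: per-cpu base time, N: constant shift.
    """
    return (((tsc - A) * B) >> N) + C
```

rdtscp returns the cpu id atomically with the tsc, but A, B and C still have to be read separately, which is where a migration or frequency change can sneak in.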
On Wed, Aug 31, 2005 at 09:47:01PM +0200, Andrea Arcangeli wrote:
I'm thinking to add optional aggregations for (\d+)\.(\d+)\.(\d+)\D and
for different archs. So you can watch ia64 only or 2.6.13 only etc...
The -tiger-smp/-generic-up makes life harder indeed ;).
I now implemented some basic
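A sketch of the aggregation match mentioned above (the regex is the one from the mail; the helper name is illustrative):

```python
import re

def version_group(release):
    """Extract the (major, minor, patch) triple from a kernel release
    string such as '2.6.13-tiger-smp', so stats can be aggregated per
    kernel version regardless of local suffixes like -tiger-smp."""
    m = re.match(r'(\d+)\.(\d+)\.(\d+)\D', release + '\n')
    return tuple(int(x) for x in m.groups()) if m else None
```

The appended newline guarantees the trailing `\D` can match even for a bare "2.6.13" string.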
On Tue, Sep 06, 2005 at 12:05:07AM +0200, Marc Giger wrote:
Hi Andrea
Two little details:
The following line does not print what you expect on
alpha's:
MHZ = int(re.search(r' (\d+)\.?\d?',
os.popen("grep -i mhz /proc/cpuinfo | head -n
1").read()).group(1))
Thanks
argument passing in
mm/huge_memory.c
Both:
Reviewed-by: Andrea Arcangeli aarca...@redhat.com
Steve Capper (1):
mm: Introduce HAVE_ARCH_TRANSPARENT_HUGEPAGE
This was already introduced by the s390 THP support which I reviewed a
few days ago, and it's already included in -mm, so it can
it would be more correct if __GFP_MOVABLE were cleared, like
(GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE because this page isn't
really movable (it's only reclaimable).
The xchg vs cmpxchg locking also looks good.
Reviewed-by: Andrea Arcangeli aarca...@redhat.com
Thanks,
Andrea
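The flag arithmetic suggested above is plain bit masking; a toy illustration with made-up flag values (the real constants live in the kernel's gfp.h and differ from these):

```python
# Made-up values for illustration only; not the real kernel constants.
__GFP_MOVABLE = 1 << 3
__GFP_ZERO = 1 << 8
GFP_TRANSHUGE = (1 << 1) | (1 << 2) | __GFP_MOVABLE  # pretend composite

# The suggested mask: request zeroing but clear the movable bit,
# because the page is only reclaimable, not movable.
flags = (GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE

assert flags & __GFP_ZERO            # zeroing still requested
assert not (flags & __GFP_MOVABLE)   # movable bit cleared
```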
Hi Kirill,
On Thu, Sep 13, 2012 at 08:37:58PM +0300, Kirill A. Shutemov wrote:
On Thu, Sep 13, 2012 at 07:16:13PM +0200, Andrea Arcangeli wrote:
Hi Kirill,
On Wed, Sep 12, 2012 at 01:07:53PM +0300, Kirill A. Shutemov wrote:
- hpage = alloc_pages(GFP_TRANSHUGE | __GFP_ZERO
Hi Michel,
On Tue, Sep 04, 2012 at 02:20:52AM -0700, Michel Lespinasse wrote:
This change fixes an anon_vma locking issue in the following situation:
- vma has no anon_vma
- next has an anon_vma
- vma is being shrunk / next is being expanded, due to an mprotect call
We need to take next's
On Tue, Sep 04, 2012 at 02:53:47PM -0700, Michel Lespinasse wrote:
I think the minimal fix would actually be:
if (vma->anon_vma && (importer || start != vma->vm_start)) {
anon_vma = vma->anon_vma;
+ else if (next->anon_vma && adjust_next)
+ anon_vma =
Hi Andrew and Martin,
On Fri, Aug 31, 2012 at 12:47:02PM -0700, Andrew Morton wrote:
On Fri, 31 Aug 2012 09:07:57 +0200
Martin Schwidefsky schwidef...@de.ibm.com wrote:
I grabbed them all. Patches 1-3 look sane to me and I cheerfully
didn't read the s390 changes at all. Hopefully
Hi Gerald,
On Wed, Aug 29, 2012 at 05:32:58PM +0200, Gerald Schaefer wrote:
+#ifndef __HAVE_ARCH_PGTABLE_DEPOSIT
+extern void pgtable_deposit(struct mm_struct *mm, pgtable_t pgtable);
+#endif
One minor nitpick on the naming of the two functions: considering that
those are global exports, that
Hi Kirill,
On Thu, Aug 16, 2012 at 06:15:53PM +0300, Kirill A. Shutemov wrote:
for (i = 0; i < pages_per_huge_page;
i++, p = mem_map_next(p, page, i)) {
It may be more optimal to avoid a multiplication/shiftleft before the
add, and to do:
for (i = 0, vaddr = haddr; i
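The suggestion is classic strength reduction: step vaddr by PAGE_SIZE on each iteration instead of recomputing haddr + i * PAGE_SIZE. A sketch of the two equivalent forms (function names are illustrative):

```python
PAGE_SIZE = 4096

def addrs_multiply(haddr, n):
    # One multiply (or shift) per iteration before the add.
    return [haddr + i * PAGE_SIZE for i in range(n)]

def addrs_stride(haddr, n):
    # Only an add per iteration, as the review suggests.
    out, vaddr = [], haddr
    for _ in range(n):
        out.append(vaddr)
        vaddr += PAGE_SIZE
    return out
```

Both produce the same page-aligned addresses; the second form trades the per-iteration multiply for a running accumulator.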
On Thu, Aug 16, 2012 at 07:43:56PM +0300, Kirill A. Shutemov wrote:
Hm.. I think with static_key we can avoid cache overhead here. I'll try.
Could you elaborate on the static_key? Is it some sort of self
modifying code?
Thanks, for review. Could you take a look at huge zero page patchset? ;)
On Thu, Aug 09, 2012 at 12:08:18PM +0300, Kirill A. Shutemov wrote:
+static void __split_huge_zero_page_pmd(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address)
+{
+ pgtable_t pgtable;
+ pmd_t _pmd;
+ unsigned long haddr = address & HPAGE_PMD_MASK;
+ struct
Hi Andrew,
On Thu, Aug 16, 2012 at 12:20:23PM -0700, Andrew Morton wrote:
That's a pretty big improvement for a rather fake test case. I wonder
how much benefit we'd see with real workloads?
The same discussion happened about the zero page in general and
there's no easy answer. I seem to
On Thu, Aug 09, 2012 at 12:08:17PM +0300, Kirill A. Shutemov wrote:
From: Kirill A. Shutemov kirill.shute...@linux.intel.com
It's required to implement huge zero pmd splitting.
This isn't bisectable with the next one, it'd fail on wfg 0-DAY kernel
build testing backend, however this is
On Thu, Aug 16, 2012 at 09:37:25PM +0300, Kirill A. Shutemov wrote:
On Thu, Aug 16, 2012 at 08:29:44PM +0200, Andrea Arcangeli wrote:
On Thu, Aug 16, 2012 at 07:43:56PM +0300, Kirill A. Shutemov wrote:
Hm.. I think with static_key we can avoid cache overhead here. I'll try.
Could you
Hi everyone,
On Tue, Jul 31, 2012 at 09:12:04PM +0200, Peter Zijlstra wrote:
Hi all,
After having had a talk with Rik about all this NUMA nonsense where he
proposed
the scheme implemented in the next to last patch, I came up with a related
means of doing the home-node selection.
I've
Hi,
On Tue, Jul 31, 2012 at 09:12:06PM +0200, Peter Zijlstra wrote:
Since the NUMA_INTERLEAVE_HIT statistic is useless on its own; it wants
to be compared to either a total of interleave allocations or to a miss
count, remove it.
Fixing it would be possible, but since we've gone years
On Tue, Jul 31, 2012 at 09:12:08PM +0200, Peter Zijlstra wrote:
If we marked a THP with our special PROT_NONE protections, ensure we
don't lose them over a split.
Collapse seems to always allocate a new (huge) page which should
already end up on the new target node so losing protections
On Tue, Jul 31, 2012 at 09:12:09PM +0200, Peter Zijlstra wrote:
+static bool pte_prot_none(struct vm_area_struct *vma, pte_t pte)
+{
+ /*
+ * If we have the normal vma->vm_page_prot protections we're not a
+ * 'special' PROT_NONE page.
+ *
+ * This means we cannot get
On Tue, Jul 31, 2012 at 09:12:11PM +0200, Peter Zijlstra wrote:
From: Lee Schermerhorn lee.schermerh...@hp.com
This patch augments the MPOL_MF_LAZY feature by adding a NOOP
policy to mbind(). When the NOOP policy is used with the 'MOVE'
and 'LAZY' flags, mbind() [check_range()] will walk the
On Tue, Jul 31, 2012 at 09:12:14PM +0200, Peter Zijlstra wrote:
+#ifdef CONFIG_NUMA
/*
- * Do fancy stuff...
+ * For NUMA systems we use the special PROT_NONE maps to drive
+ * lazy page migration, see MPOL_MF_LAZY and related.
*/
+ page = vm_normal_page(vma,
On Tue, Jul 31, 2012 at 09:12:19PM +0200, Peter Zijlstra wrote:
@@ -2699,6 +2705,29 @@ select_task_rq_fair(struct task_struct *
}
rcu_read_lock();
+ if (sched_feat_numa(NUMA_BIAS) && node != -1) {
+ int node_cpu;
+
+ node_cpu =
On Tue, Jul 31, 2012 at 09:12:22PM +0200, Peter Zijlstra wrote:
Implement a per-task memory placement scheme for 'big' tasks (as per
the last patch). It relies on a regular PROT_NONE 'migration' fault to
scan the memory space of the process and uses a two stage migration
scheme to reduce the
59af0d4348eb07087097e310f60422b994dd3a2c Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli aarca...@redhat.com
Date: Tue, 21 Aug 2012 19:32:23 +0200
Subject: [PATCH] thp: make pmd_present more accurate
In many places !pmd_present has been converted to pmd_none. For pmds
that's equivalent and pmd_none is quicker so using pmd_none
This resets all per-thread and per-process statistics across exec
syscalls or after kernel threads detach from the mm. The past
statistical NUMA information is unlikely to be relevant for the future
in these cases.
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca
and make it tunable with
sysfs too.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/huge_memory.c | 33 +++--
1 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 08fd33c..a65590f 100644
--- a/mm
This is needed to make sure the tail pages are also queued into the
migration queues of knuma_migrated across a transparent hugepage
split.
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/huge_memory.c |2 ++
1 files changed, 2 insertions
allocated
page_autonuma of 32 bytes per page (only allocated if booted on NUMA
hardware, unless noautonuma is passed as parameter to the kernel at
boot). Yet another later patch introduces the autonuma_list and
reduces the size of the page_autonuma from 32 to 12 bytes.
Signed-off-by: Andrea
-by: Andrea Arcangeli aarca...@redhat.com
---
kernel/sched/fair.c | 11 +++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 42a88fa..677b99e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2794,6 +2794,17
, but it reduces some
initial thrashing in case of NUMA false sharing.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma_flags.h | 20
mm/autonuma.c |7 +--
2 files changed, 25 insertions(+), 2 deletions(-)
diff --git
faults
to start. All other actions follow after that. If knuma_scand doesn't
run, AutoNUMA is fully bypassed. If knuma_scand is stopped, soon all
other AutoNUMA gears will settle down too.
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
kernel/fork.c
and task_autonuma_cpu will always return true in
that case.
Includes fixes from Hillf Danton dhi...@gmail.com.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
kernel/sched/fair.c | 71 ++
1 files changed, 59 insertions(+), 12 deletions(-)
diff
Define the two data structures that collect the per-process (in the
mm) and per-thread (in the task_struct) statistical information that
are the input of the CPU follow memory algorithms in the NUMA
scheduler.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma_types.h
set_pmd_at() will also be used for the knuma_scand/pmd = 1 (default)
mode even when TRANSPARENT_HUGEPAGE=n. Make it available so the build
won't fail.
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/x86/include/asm/paravirt.h |2 --
1 files
it on NUMA
hardware. So the non NUMA hardware only pays the memory of a pointer
in the kernel stack (which remains NULL at all times in that case).
If the kernel is compiled with CONFIG_AUTONUMA=n, not even the pointer
is allocated on the kernel stack of course.
Signed-off-by: Andrea Arcangeli aarca
Until THP native migration is implemented it's safer to boost
khugepaged scanning rate because all memory migration are splitting
the hugepages. So the regular rate of scanning becomes too low when
lots of memory is migrated.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm
Add the config options to allow building the kernel with AutoNUMA.
If CONFIG_AUTONUMA_DEFAULT_ENABLED is =y, then
/sys/kernel/mm/autonuma/enabled will be equal to 1, and AutoNUMA will
be enabled automatically at boot.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/Kconfig
When pages are collapsed try to keep the last_nid information from one
of the original pages.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/huge_memory.c | 14 ++
1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index
is booted on real
NUMA hardware and noautonuma is not passed as a parameter to the
kernel.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma.h | 18 +++-
include/linux/autonuma_types.h | 55 +
include/linux/mm_types.h | 26
include/linux
and cleanups from Hillf Danton dhi...@gmail.com.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma_sched.h | 50
include/linux/mm_types.h |5 +
include/linux/sched.h |3 +
kernel/sched/core.c|1 +
kernel/sched/fair.c
it does nothing at
all.
Changelog from alpha11 to alpha13:
o autonuma_balance optimization (take the fast path when process is in
the preferred NUMA node)
TODO:
o THP native migration (orthogonal and also needed for
cpuset/migrate_pages(2)/numa/sched).
Andrea Arcangeli (35):
autonuma
Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/x86/mm/gup.c | 13 -
1 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index dd74e46..02c5ec5 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
for now).
This means the max RAM configuration fully supported by AutoNUMA
becomes AUTONUMA_LIST_MAX_PFN_OFFSET multiplied by 32767 nodes
multiplied by the PAGE_SIZE (assume 4096 here, but for some archs it's
bigger).
4096*32767*(0xffffffff-3) >> (10*5) = 511 PetaBytes.
Signed-off-by: Andrea Arcangeli
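Taking the constants above as PAGE_SIZE = 4096, 32767 nodes and a max pfn offset of 0xffffffff - 3 (the hex constant doesn't survive the archive cleanly, so treat that value as an assumption), the 511 PetaBytes figure can be checked directly:

```python
PAGE_SIZE = 4096                  # assumed, per the mail's caveat
MAX_NODES = 32767
MAX_PFN_OFFSET = 0xffffffff - 3   # assumed reconstruction of the constant

total_bytes = PAGE_SIZE * MAX_NODES * MAX_PFN_OFFSET
petabytes = total_bytes >> (10 * 5)   # shift by 50: bytes -> PiB
# matches the 511 PetaBytes quoted in the changelog
```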
This is where the numa hinting page faults are detected and are passed
over to the AutoNUMA core logic.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/huge_mm.h |2 ++
mm/huge_memory.c| 18 ++
mm/memory.c | 31
the memory in a round robin fashion from
all remote nodes to the daemon's local node.
The head that belongs to the local node that knuma_migrated runs on
must remain empty for now and is not being used.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/mmzone.h | 18
Link the AutoNUMA core and scheduler object files in the kernel if
CONFIG_AUTONUMA=y.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
kernel/sched/Makefile |1 +
mm/Makefile |1 +
2 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/kernel/sched/Makefile b
is_vma_temporary_stack() is needed by mm/autonuma.c too, and without
this the build breaks with CONFIG_TRANSPARENT_HUGEPAGE=n.
Reported-by: Petr Holasek phola...@redhat.com
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/huge_mm.h
established by ioremap,
never on pmds so there's no risk of collision with Xen.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/x86/include/asm/pgtable_types.h | 28
1 files changed, 28 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm
These flags are the ones tweaked through sysfs, they control the
behavior of autonuma, from enabling disabling it, to selecting various
runtime options.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma_flags.h | 129
1 files
...@gmail.com.
Math documentation on autonuma_last_nid in the header of
last_nid_set() reworked from sched-numa code by Peter Zijlstra
a.p.zijls...@chello.nl.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
Signed-off-by: Hillf Danton dhi...@gmail.com
---
mm/autonuma.c | 1619
* Page migration is yet to be observed/verified
Signed-off-by: Vaidyanathan Srinivasan sva...@linux.vnet.ibm.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/powerpc/include/asm/pgtable.h| 48 -
arch/powerpc/include/asm/pte-hash64-64k.h |4
Remove the sysfs entry /sys/kernel/mm/autonuma/knuma_scand/pmd and
force the knuma_scand pmd mode off if
CONFIG_HAVE_ARCH_AUTONUMA_SCAN_PMD is not set by the architecture.
Enable AutoNUMA for PPC64.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/Kconfig |3 +++
arch
This function makes it easy to bind the per-node knuma_migrated
threads to their respective NUMA nodes. Those threads take memory from
the other nodes (in round robin with an incoming queue for each remote
node) and they move that memory to their local node.
Signed-off-by: Andrea Arcangeli aarca
Header that defines the generic AutoNUMA specific functions.
All functions are defined unconditionally, but are only linked into
the kernel if CONFIG_AUTONUMA=y. When CONFIG_AUTONUMA=n, their call
sites are optimized away at build time (or the kernel wouldn't link).
Signed-off-by: Andrea
hardware the memory cost is reduced to one pointer per mm.
To get rid of the pointer in each mm, the kernel can be compiled
with CONFIG_AUTONUMA=n.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
kernel/fork.c |7 +++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git
are never used.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma_flags.h | 25 ++---
mm/autonuma.c | 25 +
2 files changed, 47 insertions(+), 3 deletions(-)
diff --git a/include/linux/autonuma_flags.h b
-by: Andrea Arcangeli aarca...@redhat.com
---
mm/mempolicy.c | 12 ++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index bd92431..19a8f72 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1951,10 +1951,18 @@ retry_cpuset
Reduce the autonuma_migrate_head array entries from MAX_NUMNODES to
num_possible_nodes() or zero if autonuma is not possible.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/x86/mm/numa.c |6 --
arch/x86/mm/numa_32.c |3 ++-
include/linux
When pages are freed abort any pending migration. If knuma_migrated
arrives first it will notice because get_page_unless_zero would fail.
You can safely ignore the #ifdef because a later patch (page_autonuma)
clears it.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/page_alloc.c
Debug tweak.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
include/linux/autonuma.h | 19 +++
mm/page_alloc.c |3 ++-
2 files changed, 21 insertions(+), 1 deletions(-)
diff --git a/include/linux/autonuma.h b/include/linux/autonuma.h
index 1d87ecc
pages to
migrate queues. They are extremely quick, absolutely non-blocking and
do not allocate memory.
The generic implementation is used when CONFIG_AUTONUMA=n.
Acked-by: Rik van Riel r...@redhat.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
arch/x86/include/asm/pgtable.h | 65
On Wed, Aug 22, 2012 at 02:03:41PM +0800, Xiao Guangrong wrote:
On 08/21/2012 11:06 PM, Andrea Arcangeli wrote:
CPU0                        CPU1
oldpage[1] == 0 (both guest & host)
oldpage[0] = 1
trigger do_wp_page
We always do ptep_clear_flush before
On Wed, Aug 22, 2012 at 11:51:17AM +0800, Xiao Guangrong wrote:
Hmm, in KSM code, i found this code in replace_page:
set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
It is possible to establish a writable pte, no?
Hugh already answered this thanks. Further details on the
Hi Andrew,
On Wed, Aug 22, 2012 at 12:15:35PM -0700, Andrew Morton wrote:
On Wed, 22 Aug 2012 18:29:55 +0200
Andrea Arcangeli aarca...@redhat.com wrote:
On Wed, Aug 22, 2012 at 02:03:41PM +0800, Xiao Guangrong wrote:
On 08/21/2012 11:06 PM, Andrea Arcangeli wrote:
CPU0
On Wed, Aug 22, 2012 at 12:58:05PM -0700, Andrew Morton wrote:
If you can suggest some text I'll type it in right now.
Ok ;), I tried below:
This is safe to start by updating the secondary MMUs, because the
relevant primary MMU pte invalidate must have already happened with a
ptep_clear_flush
Hi Andi,
On Wed, Aug 22, 2012 at 01:19:04PM -0700, Andi Kleen wrote:
Andrea Arcangeli aarca...@redhat.com writes:
+/*
+ * In this function we build a temporal CPU_node-page relation by
+ * using a two-stage autonuma_last_nid filter to remove short/unlikely
+ * relations
On Wed, Aug 22, 2012 at 11:40:48PM +0200, Ingo Molnar wrote:
* Rik van Riel r...@redhat.com wrote:
On 08/22/2012 10:58 AM, Andrea Arcangeli wrote:
Hello everyone,
Before the Kernel Summit, I think it's good idea to post a new
AutoNUMA24 and to go through a new review cycle
On Thu, Aug 23, 2012 at 08:01:47AM +1000, Benjamin Herrenschmidt wrote:
On Wed, 2012-08-22 at 16:59 +0200, Andrea Arcangeli wrote:
diff --git a/arch/powerpc/include/asm/pgtable.h
b/arch/powerpc/include/asm/pgtable.h
index 2e0e411..5f03079 100644
--- a/arch/powerpc/include/asm/pgtable.h
Hi Andi,
On Thu, Aug 23, 2012 at 12:37:33AM +0200, Andi Kleen wrote:
This comment seems quite accurate to me (btw I taken it from
sched-numa rewrite with minor changes).
I had expected it to describe the next function. If it's a strategic
overview maybe it should be somewhere else.
Hi Benjamin,
On Thu, Aug 23, 2012 at 08:56:34AM +1000, Benjamin Herrenschmidt wrote:
What I mean here is that it's fine as a proof of concept ;-) I don't
like it being in a series aimed at upstream...
We can try to flush out the issues, but as it is, the patch isn't
upstreamable imho.
Well
Hi Benjamin,
On Thu, Aug 23, 2012 at 03:11:00PM +1000, Benjamin Herrenschmidt wrote:
Basically PROT_NONE turns into _PAGE_PRESENT without _PAGE_USER for us.
Maybe the simplest is to implement pte_numa as !_PAGE_USER too. No
need to clear the _PAGE_PRESENT bit and to alter pte_present() if
ad51771a2c3fa697fa0267edda23b48d0b85f023 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli aarca...@redhat.com
Date: Fri, 3 Aug 2012 21:10:44 +0200
Subject: [PATCH] thp: document barrier() in wrprotect THP fault path
Inline doc.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/memory.c |6 ++
1 files
On Tue, Jul 24, 2012 at 02:51:05PM -0700, Hugh Dickins wrote:
Since then, I think THP has made the rules more complicated; but I
believe Andrea paid a great deal of attention to that kind of issue.
There were many issues, one unexpected was
1a5a9906d4e8d1976b701f889d8f35d54b928f25.
Keep in
On Sat, Aug 04, 2012 at 03:02:45PM -0700, Paul E. McKenney wrote:
OK, I'll bite. ;-)
:))
The most sane way for this to happen is with feedback-driven techniques
involving profiling, similar to what is done for basic-block reordering
or branch prediction. The idea is that you compile the
On Fri, Aug 17, 2012 at 11:12:33AM +0300, Kirill A. Shutemov wrote:
I've used do_huge_pmd_wp_page_fallback() as template for my code.
What's difference between these two code paths?
Why is do_huge_pmd_wp_page_fallback() safe?
Good point. do_huge_pmd_wp_page_fallback works only on the current
Hi,
On Wed, Aug 08, 2012 at 02:43:34PM -0400, Rik van Riel wrote:
While the sched-numa code is relatively small and clean, the
current version does not seem to offer a significant
performance improvement over not having it, and in one of
the tests performance actually regresses vs. mainline.
(set_pte_at_notify must always run under the PT lock of course).
How about this:
=
From 160a0b1b2be9bf96c45b30d9423f8196ecebe351 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli aarca...@redhat.com
Date: Tue, 21 Aug 2012 16:48:11 +0200
Subject: [PATCH] mmu_notifier: fix race in set_pte_at_notify usage
On Mon, Feb 04, 2008 at 11:09:01AM -0800, Christoph Lameter wrote:
On Sun, 3 Feb 2008, Andrea Arcangeli wrote:
Right but that pin requires taking a refcount which we cannot do.
GRU can use my patch without the pin. XPMEM obviously can't use my
patch as my invalidate_page[s] are under
On Mon, Feb 04, 2008 at 10:11:24PM -0800, Christoph Lameter wrote:
Zero problems only if you find having a single callout for every page
acceptable. So the invalidate_range in your patch is only working
invalidate_pages is only a further optimization that was
straightforward in some places
On Tue, Feb 05, 2008 at 10:17:41AM -0800, Christoph Lameter wrote:
The other approach will not have any remote ptes at that point. Why would
there be a coherency issue?
It never happens that two threads writes to two different physical
pages by working on the same process virtual address. This