[PATCH 3/6] sched,numa: preparations for complex topology placement

2014-10-17 Thread riel
From: Rik van Riel Preparatory patch for adding NUMA placement on systems with complex NUMA topology. Also fix a potential divide by zero in group_weight() Signed-off-by: Rik van Riel Tested-by: Chegu Vinod --- kernel/sched/fair.c | 57 ++--- 1
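The divide-by-zero fix mentioned here comes down to bailing out of the weight calculation before a group has accumulated any faults. A minimal sketch, assuming the pre-series shape of group_weight() in kernel/sched/fair.c:

    /* sketch: ng->total_faults is legitimately zero for a young group */
    static inline unsigned long group_weight(struct task_struct *p, int nid)
    {
            struct numa_group *ng = p->numa_group;

            if (!ng || !ng->total_faults)
                    return 0;       /* avoid the divide by zero */

            return 1000 * group_faults(p, nid) / ng->total_faults;
    }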

[PATCH 0/6] sched,numa: weigh nearby nodes for task placement on complex NUMA topologies (v2)

2014-10-17 Thread riel
This patch set integrates two algorithms I have previously tested, one for glueless mesh NUMA topologies, where NUMA nodes communicate with far-away nodes through intermediary nodes, and backplane topologies, where communication with far-away NUMA nodes happens through backplane controllers (which

[PATCH 2/6] sched,numa: classify the NUMA topology of a system

2014-10-17 Thread riel
From: Rik van Riel Smaller NUMA systems tend to have all NUMA nodes directly connected to each other. This includes the degenerate case of a system with just one node, ie. a non-NUMA system. Larger systems can have two kinds of NUMA topology, which affects how tasks and memory should be placed
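The classification described here reduces to a three-way enum; a sketch of its shape, using the names from the series (the detection code that walks node distances is omitted):

    /* how the series categorizes NUMA interconnect layouts */
    enum numa_topology_type {
            NUMA_DIRECT,            /* all nodes directly connected */
            NUMA_GLUELESS_MESH,     /* far nodes reached via intermediary nodes */
            NUMA_BACKPLANE,         /* far nodes reached via backplane controllers */
    };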

[PATCH 4/6] sched,numa: calculate node scores in complex NUMA topologies

2014-10-17 Thread riel
From: Rik van Riel In order to do task placement on systems with complex NUMA topologies, it is necessary to count the faults on nodes nearby the node that is being examined for a potential move. In case of a system with a backplane interconnect, we are dealing with groups of NUMA nodes; each
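A rough illustration of the idea, not the posted code: faults on nodes closer than some cutoff contribute to a node's score, scaled down with distance (task_faults() as in kernel/sched/fair.c; the linear scaling is an assumption for illustration):

    /* illustrative only: nearby nodes' faults raise nid's score,
     * with closer nodes weighted more heavily */
    static unsigned long nearby_node_score(struct task_struct *p, int nid,
                                           int max_dist)
    {
            unsigned long score = 0;
            int node;

            for_each_online_node(node) {
                    int dist = node_distance(nid, node);

                    if (dist >= max_dist)
                            continue;

                    score += task_faults(p, node) * (max_dist - dist) / max_dist;
            }
            return score;
    }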

[RFC PATCH 11/11] nohz,kvm,time: teach account_process_tick about guest time

2015-06-24 Thread riel
From: Rik van Riel When tick based accounting is run from a remote CPU, it is actually possible to encounter a task with PF_VCPU set. Make sure to account those as guest time. Signed-off-by: Rik van Riel --- kernel/sched/cputime.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff
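The change is essentially one extra branch in the tick accounting path; a hedged sketch of the relevant fragment of account_process_tick() (the real function also handles system and idle time):

    /* with remote sampling, the observed task may be running
     * KVM guest code, indicated by PF_VCPU */
    if (user_tick && (p->flags & PF_VCPU))
            account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
    else if (user_tick)
            account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);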

[RFC PATCH 08/11] nohz,timer: have housekeeper call account_process_tick for nohz cpus

2015-06-24 Thread riel
From: Rik van Riel Have the housekeeper CPU call account_process_tick to do tick based accounting for remote nohz_full CPUs. Signed-off-by: Rik van Riel --- kernel/time/timer.c | 28 1 file changed, 28 insertions(+) diff --git a/kernel/time/timer.c b/kernel/time
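A sketch of the driver loop the description implies; the helper name is hypothetical, and sampling the remote CPU's user/kernel state is elided:

    /* hypothetical sketch: run from the housekeeping CPU's own tick */
    static void housekeeping_account_remote_ticks(void)
    {
            int cpu;

            for_each_cpu(cpu, tick_nohz_full_mask) {
                    struct task_struct *p = cpu_curr(cpu);

                    /* user/kernel/guest state of the remote CPU elided */
                    account_process_tick(p, 0);
            }
    }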

[RFC PATCH 03/11] time,nohz: add cpu parameter to irqtime_account_process_tick

2015-06-24 Thread riel
From: Rik van Riel Add a cpu parameter to irqtime_account_process_tick, to specify what cpu to run the statistics for. In order for this to actually work on a different cpu, all the functions called by irqtime_account_process_tick need to be able to handle working for another CPU. Signed-off


[RFC PATCH 01/11] nohz,time: make account_process_tick work on the task's CPU

2015-06-24 Thread riel
From: Rik van Riel Teach account_process_tick to work on the CPU of the task specified in the function argument. This allows us to do remote tick based sampling of a nohz_full cpu from a housekeeping CPU. Signed-off-by: Rik van Riel --- kernel/sched/cputime.c | 8 +++- 1 file changed, 7
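The core of the change is deriving the CPU from the task instead of assuming the local CPU; a sketch, with steal_account_process_tick() already taking a cpu argument as in patch 04/11:

    void account_process_tick(struct task_struct *p, int user_tick)
    {
            int cpu = task_cpu(p);  /* previously implied smp_processor_id() */

            if (steal_account_process_tick(cpu))
                    return;
            /* remainder of the accounting unchanged, now keyed on cpu */
    }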

[RFC INCOMPLETE] tick based timekeeping from a housekeeping CPU

2015-06-24 Thread riel
This series seems to make basic tick based time sampling from a housekeeping CPU work, allowing us to have tick based accounting on a nohz_full CPU, and no longer doing vtime accounting on those CPUs. It still needs a major cleanup, and steal time accounting and irq accounting are still missing.

[RFC PATCH 04/11] time,nohz: add cpu parameter to steal_account_process_tick

2015-06-24 Thread riel
From: Rik van Riel Add a cpu parameter to steal_account_process_tick, so it can be used to do CPU time accounting for another CPU. Signed-off-by: Rik van Riel --- kernel/sched/cputime.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/sched/cputime.c b

[RFC PATCH 09/11] nohz,time: add tick_accounting_remote macro

2015-06-24 Thread riel
From: Rik van Riel With the introduction of remote tick based sampling, we now have three ways of gathering time statistics: - local tick based sampling - vtime accounting (used natively on some architectures) - remote tick based sampling On a system with remote tick based sampling

[RFC PATCH 02/11] time,nohz: rename vtime_accounting_enabled to tick_accounting_disabled

2015-06-24 Thread riel
From: Rik van Riel Rename vtime_accounting_enabled to tick_accounting_disabled, because it can mean either that vtime accounting is enabled, or that the system is doing tick based sampling from a housekeeping CPU for nohz_full CPUs. Signed-off-by: Rik van Riel --- include/linux

[RFC PATCH 05/11] time,nohz: add cpu parameter to account_steal_time

2015-06-24 Thread riel
From: Rik van Riel Simple transformation to allow tick based sampling from a remote cpu. Additional changes may be needed to actually acquire the steal time info for remote cpus from the host/hypervisor. Signed-off-by: Rik van Riel --- include/linux/kernel_stat.h | 2 +- kernel/sched

[RFC PATCH 06/11] time,nohz: add cpu parameter to account_idle_time

2015-06-24 Thread riel
From: Rik van Riel Simple transformation to allow account_idle_time to account the idle time for another CPU. Signed-off-by: Rik van Riel --- arch/ia64/kernel/time.c | 2 +- arch/powerpc/kernel/time.c | 2 +- arch/s390/kernel/idle.c | 2 +- include/linux/kernel_stat.h | 2

[RFC PATCH 07/11] nohz,timer: designate timer housekeeping cpu

2015-06-24 Thread riel
From: Rik van Riel The timer housekeeping CPU can do tick based sampling for remote CPUs. For now this is the first CPU in the housekeeping_mask. Eventually we could move to having one timer housekeeping cpu per socket, if needed. Signed-off-by: Rik van Riel --- include/linux/tick.h | 9
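As described, the choice is simply the first CPU in housekeeping_mask; a sketch with a hypothetical helper name:

    /* hypothetical helper: which CPU does remote tick accounting */
    static inline int timer_housekeeping_cpu(void)
    {
            return cpumask_first(housekeeping_mask);
    }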

[RFC PATCH 10/11] nohz,kvm,time: skip vtime accounting at kernel entry & exit

2015-06-24 Thread riel
From: Rik van Riel When timer statistics are sampled from a remote CPU, vtime calculations at the kernel/user and kernel/guest boundary are no longer necessary. Skip them. Signed-off-by: Rik van Riel --- include/linux/context_tracking.h | 4 ++-- kernel/context_tracking.c | 6 -- 2

[PATCH 1/2] show isolated cpus in sysfs

2015-04-24 Thread riel
From: Rik van Riel After system bootup, there is no totally reliable way to see which CPUs are isolated, because the kernel may modify the CPUs specified on the isolcpus= kernel command line option. Export the CPU list that actually got isolated in sysfs, specifically in the file /sys/devices
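The sysfs side is a one-line show handler printing the cpumask; a hedged sketch in the style of drivers/base/cpu.c:

    /* expose cpu_isolated_map as /sys/devices/system/cpu/isolated */
    static ssize_t print_cpus_isolated(struct device *dev,
                                       struct device_attribute *attr, char *buf)
    {
            return scnprintf(buf, PAGE_SIZE, "%*pbl\n",
                             cpumask_pr_args(cpu_isolated_map));
    }
    static DEVICE_ATTR(isolated, 0444, print_cpus_isolated, NULL);

Userspace then simply reads the file, e.g. cat /sys/devices/system/cpu/isolated.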

[PATCH 0/2 resend] show isolated & nohz_full cpus in sysfs

2015-04-24 Thread riel
Currently there is no good way to get the isolated and nohz_full CPUs at runtime, because the kernel may have changed the CPUs specified on the commandline (when specifying all CPUs as isolated, or CPUs that do not exist, ...) This series adds two files to /sys/devices/system/cpu, which can be

[PATCH 2/2] show nohz_full cpus in sysfs

2015-04-24 Thread riel
From: Rik van Riel Currently there is no way to query which CPUs are in nohz_full mode from userspace. Export the CPU list running in nohz_full mode in sysfs, specifically in the file /sys/devices/system/cpu/nohz_full. This can be used by system management tools like libvirt, openstack

[PATCH 0/3] reduce nohz_full syscall overhead by 10%

2015-04-30 Thread riel
Profiling reveals that a lot of the overhead from the nohz_full accounting seems to come not from the accounting itself, but from disabling and re-enabling interrupts. This patch series removes the interrupt disabling & re-enabling from __acct_update_integrals, which is called on both syscall

[PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry

2015-04-30 Thread riel
From: Rik van Riel On syscall entry with nohz_full on, we enable interrupts, call user_exit, disable interrupts, do something, re-enable interrupts, and go on our merry way. Profiling shows that a large amount of the nohz_full overhead comes from the extraneous disabling and re-enabling

[PATCH 2/3] remove local_irq_save from __acct_update_integrals

2015-04-30 Thread riel
From: Rik van Riel The function __acct_update_integrals() is called both from irq context and task context. This creates a race where irq context can advance tsk->acct_timexpd to a value larger than time, leading to a negative value, which causes a divide error. See commit 6d5b5acca9e5

[PATCH 1/3] reduce indentation in __acct_update_integrals

2015-04-30 Thread riel
From: Peter Zijlstra Reduce indentation in __acct_update_integrals. Cc: Andy Lutomirski Cc: Frederic Weisbecker Cc: Peter Zijlstra Cc: Heiko Carstens Cc: Thomas Gleixner Signed-off-by: Peter Zijlstra Signed-off-by: Rik van Riel --- kernel/tsacct.c | 34

[PATCH RFC 1/5] sched,numa: build table of node hop distance

2014-10-08 Thread riel
From: Rik van Riel In order to more efficiently figure out where to place workloads that span multiple NUMA nodes, it makes sense to estimate how many hops away nodes are from each other. Also add some comments to sched_init_numa. Signed-off-by: Rik van Riel Suggested-by: Peter Zijlstra
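An illustrative (not the posted) way to turn SLIT-style node_distance() values into hop estimates, assuming distance grows roughly linearly with hop count:

    /* illustrative only: estimate hops between nodes from ACPI SLIT
     * distances; LOCAL_DISTANCE/REMOTE_DISTANCE from linux/topology.h */
    static int node_hops_estimate(int a, int b)
    {
            int dist = node_distance(a, b);

            if (dist <= LOCAL_DISTANCE)
                    return 0;

            return DIV_ROUND_UP(dist - LOCAL_DISTANCE,
                                REMOTE_DISTANCE - LOCAL_DISTANCE);
    }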

[PATCH RFC 0/5] sched,numa: task placement with complex NUMA topologies

2014-10-08 Thread riel
This patch set integrates two algorithms I have previously tested, one for glueless mesh NUMA topologies, where NUMA nodes communicate with far-away nodes through intermediary nodes, and backplane topologies, where communication with far-away NUMA nodes happens through backplane controllers (which

[PATCH RFC 2/5] sched,numa: classify the NUMA topology of a system

2014-10-08 Thread riel
From: Rik van Riel Smaller NUMA systems tend to have all NUMA nodes directly connected to each other. This includes the degenerate case of a system with just one node, ie. a non-NUMA system. Larger systems can have two kinds of NUMA topology, which affects how tasks and memory should be placed

[PATCH RFC 3/5] sched,numa: preparations for complex topology placement

2014-10-08 Thread riel
From: Rik van Riel Preparatory patch for adding NUMA placement on systems with complex NUMA topology. Also fix a potential divide by zero in group_weight() Signed-off-by: Rik van Riel --- include/linux/topology.h | 1 + kernel/sched/core.c | 2 +- kernel/sched/fair.c | 57

[PATCH RFC 4/5] sched,numa: calculate node scores in complex NUMA topologies

2014-10-08 Thread riel
From: Rik van Riel In order to do task placement on systems with complex NUMA topologies, it is necessary to count the faults on nodes nearby the node that is being examined for a potential move. In case of a system with a backplane interconnect, we are dealing with groups of NUMA nodes; each

[PATCH RFC 5/5] sched,numa: find the preferred nid with complex NUMA topology

2014-10-08 Thread riel
From: Rik van Riel On systems with complex NUMA topologies, the node scoring is adjusted to allow workloads to converge on nodes that are near each other. The way a task group's preferred nid is determined needs to be adjusted, in order for the preferred_nid to be consistent with group_weight

[PATCH 2/8] x86, fpu: unlazy_fpu: don't do __thread_fpu_end() if use_eager_fpu()

2015-02-06 Thread riel
Signed-off-by: Rik van Riel --- arch/x86/kernel/i387.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c index c3b92c0975cd..8e070a6c30e5 100644 --- a/arch/x86/kernel/i387.c +++ b/arch/x86/kernel/i387.c @@ -120,8 +120,12 @

[PATCH 8/8] x86,fpu: also check fpu_lazy_restore when use_eager_fpu

2015-02-06 Thread riel
From: Rik van Riel With Oleg's patch "x86, fpu: don't abuse FPU in kernel threads if use_eager_fpu()", kernel threads no longer have an FPU state even on systems with use_eager_fpu(). That in turn means that a task may still have its FPU state loaded in the FPU registers, if the tas

[PATCH 7/8] x86,fpu: use disable_task_lazy_fpu_restore helper

2015-02-06 Thread riel
From: Rik van Riel Replace magic assignments of fpu.last_cpu = ~0 with more explicit disable_task_lazy_fpu_restore calls. This also fixes the lazy FPU restore disabling in drop_fpu, which only really works when !use_eager_fpu(). This is fine for now, because fpu_lazy_restore() is only used

[PATCH 0/8] x86,fpu: various small FPU cleanups and optimizations

2015-02-06 Thread riel
This includes the three patches by Oleg that are not in -tip yet, and five more by myself. I believe the changes to my patches address all the comments by reviewers on the previous version.

[PATCH 5/8] x86,fpu: introduce task_disable_lazy_fpu_restore helper

2015-02-06 Thread riel
From: Rik van Riel Currently there are a few magic assignments sprinkled through the code that disable lazy FPU state restoring, some more effective than others, and all equally mystifying. It would be easier to have a helper to explicitly disable lazy FPU state restoring for a task. Signed
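The helper itself is tiny; the point is replacing a magic assignment with a name. Per the description, it amounts to:

    /* make "this task's FPU state is not live on any CPU" explicit,
     * instead of open-coding fpu.last_cpu = ~0 */
    static inline void task_disable_lazy_fpu_restore(struct task_struct *tsk)
    {
            tsk->thread.fpu.last_cpu = ~0;
    }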

[PATCH 6/8] x86,fpu: use an explicit if/else in switch_fpu_prepare

2015-02-06 Thread riel
From: Rik van Riel Use an explicit if/else branch after __save_init_fpu(old) in switch_fpu_prepare. This makes substituting the assignment with a call to task_disable_lazy_fpu() in the next patch easier to review. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu-internal.h | 5

[PATCH 3/8] x86, fpu: kill save_init_fpu(), change math_error() to use unlazy_fpu()

2015-02-06 Thread riel
(). Signed-off-by: Oleg Nesterov Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu-internal.h | 18 -- arch/x86/kernel/traps.c | 2 +- 2 files changed, 1 insertion(+), 19 deletions(-) diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu

[PATCH 1/8] x86, fpu: unlazy_fpu: don't reset thread.fpu_counter

2015-02-06 Thread riel
hread_has_fpu(). Signed-off-by: Oleg Nesterov Signed-off-by: Rik van Riel --- arch/x86/kernel/i387.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c index 47348653503a..c3b92c0975cd 100644 --- a/arch/x86/kernel/i387.c +++ b/ar

[PATCH 4/8] x86,fpu: move lazy restore functions up a few lines

2015-02-06 Thread riel
From: Rik van Riel We need another lazy restore related function, that will be called from a function that is above where the lazy restore functions are now. It would be nice to keep all three functions grouped together. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu-internal.h | 36

[PATCH -v3 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest

2015-02-09 Thread riel
When running a KVM guest on a system with NOHZ_FULL enabled, and the KVM guest running with idle=poll mode, we still get wakeups of the rcuos/N threads. This problem has already been solved for user space by telling the RCU subsystem that the CPU is in an extended quiescent state while running

[PATCH 1/6] rcu,nohz: add state parameter to context_tracking_user_enter/exit

2015-02-09 Thread riel
From: Rik van Riel Add the expected ctx_state as a parameter to context_tracking_user_enter and context_tracking_user_exit, allowing the same functions to not just track kernel <-> user space switching, but also kernel <-> guest transitions. Catalin, Will: this patch and the next o

[PATCH 4/6] nohz,kvm: export context_tracking_user_enter/exit

2015-02-09 Thread riel
From: Rik van Riel Export context_tracking_user_enter/exit so it can be used by KVM. Signed-off-by: Rik van Riel --- kernel/context_tracking.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index 2d94147c07b2..8c5f2e939eee 100644

[PATCH 6/6] nohz: add stub context_tracking_is_enabled

2015-02-09 Thread riel
From: Rik van Riel With code elsewhere doing something conditional on whether or not context tracking is enabled, we want a stub function that tells us context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is not set. Signed-off-by: Rik van Riel --- include/linux
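The stub follows the usual config-dependent pattern; a sketch of both halves:

    #ifdef CONFIG_CONTEXT_TRACKING
    extern struct static_key context_tracking_enabled;

    static inline bool context_tracking_is_enabled(void)
    {
            return static_key_false(&context_tracking_enabled);
    }
    #else
    static inline bool context_tracking_is_enabled(void) { return false; }
    #endif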

[PATCH 2/6] rcu,nohz: rename context_tracking_enter & _exit

2015-02-09 Thread riel
From: Rik van Riel Rename context_tracking_user_enter & context_tracking_user_exit to just context_tracking_enter & context_tracking_exit, since it will be used to track guest state, too. This also breaks ARM. The rest of the series does not look like it impacts ARM. Cc: will.dea...@ar

[PATCH 3/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER

2015-02-09 Thread riel
From: Rik van Riel Only run vtime_user_enter, vtime_user_exit, and the user enter & exit trace points when we are entering or exiting user state, respectively. The RCU code only distinguishes between "idle" and "not idle or kernel". There should be no need to add an a

[PATCH 5/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest

2015-02-09 Thread riel
From: Rik van Riel The host kernel is not doing anything while the CPU is executing a KVM guest VCPU, so it can be marked as being in an extended quiescent state, identical to that used when running user space code. The only exception to that rule is when the host handles an interrupt, which
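Conceptually, entering the guest is treated like entering user space as far as RCU is concerned. A hedged sketch of the guest-entry side, using IN_GUEST as the ctx_state value this series introduces:

    /* sketch: guest execution is an RCU extended quiescent state,
     * just like user space */
    static inline void guest_enter_sketch(void)
    {
            vtime_guest_enter(current);
            context_tracking_enter(IN_GUEST);
    }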

[PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest

2015-02-10 Thread riel
When running a KVM guest on a system with NOHZ_FULL enabled, and the KVM guest running with idle=poll mode, we still get wakeups of the rcuos/N threads. This problem has already been solved for user space by telling the RCU subsystem that the CPU is in an extended quiescent state while running

[PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions

2015-02-10 Thread riel
From: Rik van Riel These wrapper functions allow architecture code (eg. ARM) to keep calling context_tracking_user_enter & context_tracking_user_exit the same way it always has, without error prone tricks like duplicate defines of argument values in assembly code. Signed-off-by: Rik van
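The wrappers keep the old entry points alive for assembly callers; a sketch, with IN_USER as the ctx_state value used by this series:

    /* old entry points become thin wrappers, so ARM assembly can keep
     * calling them without encoding the ctx_state argument */
    void context_tracking_user_enter(void)
    {
            context_tracking_enter(IN_USER);
    }

    void context_tracking_user_exit(void)
    {
            context_tracking_exit(IN_USER);
    }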

[PATCH 3/6] nohz: add stub context_tracking_is_enabled

2015-02-10 Thread riel
From: Rik van Riel With code elsewhere doing something conditional on whether or not context tracking is enabled, we want a stub function that tells us context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is not set. Signed-off-by: Rik van Riel --- include/linux

[PATCH 4/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER

2015-02-10 Thread riel
From: Rik van Riel Only run vtime_user_enter, vtime_user_exit, and the user enter & exit trace points when we are entering or exiting user state, respectively. The KVM code in guest_enter and guest_exit already takes care of calling vtime_guest_enter and vtime_guest_exit, respectively. The

[PATCH 2/6] rcu,nohz: add state parameter to context_tracking_enter/exit

2015-02-10 Thread riel
From: Rik van Riel Add the expected ctx_state as a parameter to context_tracking_enter and context_tracking_exit, allowing the same functions to not just track kernel <-> user space switching, but also kernel <-> guest transitions. Signed-off-by: Rik van Riel --- i

[PATCH 5/6] nohz,kvm: export context_tracking_user_enter/exit

2015-02-10 Thread riel
From: Rik van Riel Export context_tracking_user_enter/exit so it can be used by KVM. Signed-off-by: Rik van Riel --- kernel/context_tracking.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index 0e4e318d5ea4..5bdf1a342ab3 100644

[PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest

2015-02-10 Thread riel
From: Rik van Riel The host kernel is not doing anything while the CPU is executing a KVM guest VCPU, so it can be marked as being in an extended quiescent state, identical to that used when running user space code. The only exception to that rule is when the host handles an interrupt, which

[PATCH 0/3] cleanups to the disable lazy fpu restore code

2015-01-30 Thread riel
These go on top of Oleg's patches from yesterday. The mechanism to disable lazy FPU restore is inscrutable in several places, and dubious at best in one. These patches make things explicit.

[PATCH 1/3] x86,fpu: move lazy restore functions up a few lines

2015-01-30 Thread riel
From: Rik van Riel We need another lazy restore related function, that will be called from a function that is above where the lazy restore functions are now. It would be nice to keep all three functions grouped together. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu-internal.h | 36

[PATCH 3/3] x86,fpu: use disable_task_lazy_fpu_restore helper

2015-01-30 Thread riel
From: Rik van Riel Replace magic assignments of fpu.last_cpu = ~0 with more explicit disable_task_lazy_fpu_restore calls. This also fixes the lazy FPU restore disabling in drop_fpu, which only really works when !use_eager_fpu(). This is fine for now, because fpu_lazy_restore() is only used

[PATCH 2/3] x86,fpu: introduce task_disable_lazy_fpu_restore helper

2015-01-30 Thread riel
From: Rik van Riel Currently there are a few magic assignments sprinkled through the code that disable lazy FPU state restoring, some more effective than others, and all equally mystifying. It would be easier to have a helper to explicitly disable lazy FPU state restoring for a task. Signed

[PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

2015-02-23 Thread riel
From: Rik van Riel The previous patch makes it so the code skips over isolcpus when building scheduler load balancing domains. This makes it hard to see for a user which of the CPUs in a cpuset are participating in load balancing, and which ones are isolated cpus. Add a cpuset.isolcpus file

[PATCH 2/2] cpusets,isolcpus: add file to show isolated cpus in cpuset

2015-02-25 Thread riel
From: Rik van Riel The previous patch makes it so the code skips over isolcpus when building scheduler load balancing domains. This makes it hard to see for a user which of the CPUs in a cpuset are participating in load balancing, and which ones are isolated cpus. Add a cpuset.isolcpus file

[PATCH 1/2] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-02-25 Thread riel
From: Rik van Riel Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously
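Mechanically, this comes down to masking cpu_isolated_map out of the cpumasks used when building scheduler domains; a hedged fragment in the spirit of generate_sched_domains() in kernel/cpuset.c:

    /* drop isolcpus from the domain spanning all load-balanced CPUs,
     * so they never participate in load balancing */
    cpumask_andnot(doms[0], top_cpuset.effective_cpus, cpu_isolated_map);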

[PATCH -v2 0/2] cpusets,isolcpus: resolve conflict between cpusets and isolcpus

2015-02-25 Thread riel
-v2 addresses the conflict David Rientjes spotted between my previous patches and commit e8e6d97c9b ("cpuset: use %*pb[l] to print bitmaps including cpumasks and nodemasks") Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel

[PATCH v4 RESEND 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-03-09 Thread riel
Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously, simply creating a

[PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus

2015-03-09 Thread riel
From: Rik van Riel Document the subtly changed relationship between cpusets and isolcpus. Turns out the old documentation did not match the code... Signed-off-by: Rik van Riel Suggested-by: Peter Zijlstra --- Documentation/cgroups/cpusets.txt | 10 -- 1 file changed, 8 insertions

[PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset

2015-03-09 Thread riel
From: Rik van Riel The previous patch makes it so the code skips over isolcpus when building scheduler load balancing domains. This makes it hard to see for a user which of the CPUs in a cpuset are participating in load balancing, and which ones are isolated cpus. Add a cpuset.isolcpus file

[PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-03-09 Thread riel
From: Rik van Riel Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously

[PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler

2015-03-09 Thread riel
From: Rik van Riel Needed by the next patch. Also makes cpu_isolated_map present when compiled without SMP and/or with CONFIG_NR_CPUS=1, like the other cpu masks. At some point we may want to clean things up so cpumasks do not exist in UP kernels. Maybe something for the CONFIG_TINY crowd. Cc

[PATCH v4 0/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-03-03 Thread riel
Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously, simply creating a

[PATCH 4/4] cpuset,isolcpus: document relationship between cpusets & isolcpus

2015-03-03 Thread riel
From: Rik van Riel Document the subtly changed relationship between cpusets and isolcpus. Turns out the old documentation did not match the code... Signed-off-by: Rik van Riel Suggested-by: Peter Zijlstra --- Documentation/cgroups/cpusets.txt | 10 -- 1 file changed, 8 insertions

[PATCH 1/4] sched,isolcpu: make cpu_isolated_map visible outside scheduler

2015-03-03 Thread riel
From: Rik van Riel Needed by the next patch. Also makes cpu_isolated_map present when compiled without SMP and/or with CONFIG_NR_CPUS=1, like the other cpu masks. At some point we may want to clean things up so cpumasks do not exist in UP kernels. Maybe something for the CONFIG_TINY crowd. Cc

[PATCH 3/4] cpusets,isolcpus: add file to show isolated cpus in cpuset

2015-03-03 Thread riel
From: Rik van Riel The previous patch makes it so the code skips over isolcpus when building scheduler load balancing domains. This makes it hard to see for a user which of the CPUs in a cpuset are participating in load balancing, and which ones are isolated cpus. Add a cpuset.isolcpus file

[PATCH 2/4] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-03-03 Thread riel
From: Rik van Riel Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously

[PATCH 5/5] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest

2015-02-10 Thread riel
From: Rik van Riel The host kernel is not doing anything while the CPU is executing a KVM guest VCPU, so it can be marked as being in an extended quiescent state, identical to that used when running user space code. The only exception to that rule is when the host handles an interrupt, which

[PATCH 3/5] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER

2015-02-10 Thread riel
From: Rik van Riel Only run vtime_user_enter, vtime_user_exit, and the user enter & exit trace points when we are entering or exiting user state, respectively. The KVM code in guest_enter and guest_exit already takes care of calling vtime_guest_enter and vtime_guest_exit, respectively. The

[PATCH -v5 0/5] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest

2015-02-10 Thread riel
When running a KVM guest on a system with NOHZ_FULL enabled, and the KVM guest running with idle=poll mode, we still get wakeups of the rcuos/N threads. This problem has already been solved for user space by telling the RCU subsystem that the CPU is in an extended quiescent state while running

[PATCH 1/5] context_tracking: generalize context tracking APIs to support user and guest

2015-02-10 Thread riel
From: Rik van Riel Split out the mechanism from context_tracking_user_enter and context_tracking_user_exit into context_tracking_enter and context_tracking_exit. Leave the old functions in order to avoid breaking ARM, which calls these functions from assembler code, and cannot easily use C enum

[PATCH 4/5] nohz,kvm: export context_tracking_user_enter/exit

2015-02-10 Thread riel
From: Rik van Riel Export context_tracking_user_enter/exit so it can be used by KVM. Signed-off-by: Rik van Riel --- kernel/context_tracking.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index 0e4e318d5ea4..5bdf1a342ab3 100644

[PATCH 2/5] nohz: add stub context_tracking_is_enabled

2015-02-10 Thread riel
From: Rik van Riel With code elsewhere doing something conditional on whether or not context tracking is enabled, we want a stub function that tells us context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is not set. Signed-off-by: Rik van Riel --- include/linux

[PATCH 0/2] cpusets,isolcpus: resolve conflict between cpusets and isolcpus

2015-02-23 Thread riel
Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously, simply creating a

[PATCH 1/2] cpusets,isolcpus: exclude isolcpus from load balancing in cpusets

2015-02-23 Thread riel
From: Rik van Riel Ensure that cpus specified with the isolcpus= boot commandline option stay outside of the load balancing in the kernel scheduler. Operations like load balancing can introduce unwanted latencies, which is exactly what the isolcpus= commandline is there to prevent. Previously

[PATCH 7/9] x86/fpu: rename lazy restore functions to "register state valid"

2016-10-04 Thread riel
From: Rik van Riel Name the functions after the state they track, rather than the function they currently enable. This should make it more obvious when we use the fpu_register_state_valid function for something else in the future. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu

[PATCH 3/9] x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction

2016-10-04 Thread riel
From: Andy Lutomirski Now that lazy mode is gone, we don't need to distinguish which xfeatures require eager mode. Signed-off-by: Rik van Riel Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/fpu/xstate.h | 14 +- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git

[PATCH 9/9] x86/fpu: split old & new fpu code paths

2016-10-04 Thread riel
From: Rik van Riel Now that CR0.TS is no longer being manipulated, we can simplify switch_fpu_prepare by no longer nesting the handling of the new fpu inside the two branches for the old FPU. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu/internal.h | 22 -- 1

[PATCH 8/9] x86/fpu: remove __fpregs_(de)activate

2016-10-04 Thread riel
From: Rik van Riel Now that fpregs_activate and fpregs_deactivate do nothing except call the double underscored versions of themselves, we can get rid of the double underscore version. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu/internal.h | 25 +++-- 1 file

[PATCH 5/9] x86/fpu: remove fpu.counter

2016-10-04 Thread riel
From: Rik van Riel With the lazy FPU code gone, we no longer use the counter field in struct fpu for anything. Get rid of it. Signed-off-by: Rik van Riel --- arch/x86/include/asm/fpu/internal.h | 3 --- arch/x86/include/asm/fpu/types.h | 11 --- arch/x86/include/asm/trace/fpu.h

[PATCH 6/9] x86/fpu,kvm: remove kvm vcpu->fpu_counter

2016-10-04 Thread riel
From: Rik van Riel With the removal of the lazy FPU code, this field is no longer used. Get rid of it. Signed-off-by: Rik van Riel --- arch/x86/kvm/x86.c | 4 +--- include/linux/kvm_host.h | 1 - 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86

[PATCH 0/9] x86/fpu: remove lazy FPU mode & various FPU cleanups

2016-10-04 Thread riel
This series removes lazy FPU mode, and cleans up various bits and pieces around the FPU code. I have run this through a basic floating point test that involves about 1.5 billion context switches and 45 minutes of swapping at 250MB/second. This seems to tease out bugs fairly well, though I would

[PATCH 4/9] x86/fpu: Remove use_eager_fpu()

2016-10-04 Thread riel
From: Andy Lutomirski This removes all the obvious code paths that depend on lazy FPU mode. It shouldn't change the generated code at all. Signed-off-by: Rik van Riel Signed-off-by: Andy Lutomirski --- arch/x86/crypto/crc32c-intel_glue.c | 17 - arch/x86/include/asm/fpu

[PATCH 1/9] x86/crypto: Remove X86_FEATURE_EAGER_FPU ifdef from the crc32c code

2016-10-04 Thread riel
From: Rik van Riel From: Andy Lutomirski The crypto code was checking both use_eager_fpu() and defined(X86_FEATURE_EAGER_FPU). The latter was nonsensical, so remove it. This will avoid breakage when we remove X86_FEATURE_EAGER_FPU. Signed-off-by: Rik van Riel Signed-off-by: A

[PATCH 2/9] x86/fpu: Hard-disable lazy fpu mode

2016-10-04 Thread riel
"return true" and all of the FPU mode selection machinery is removed. Signed-off-by: Rik van Riel Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/cpufeatures.h | 2 +- arch/x86/include/asm/fpu/internal.h | 2 +- arch/x86/kernel/fpu/init.c | 91 ++

[PATCH 3/5] x86: ascii armor the x86_64 boot init stack canary

2017-05-24 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Signed-off-by: Rik van Riel

[PATCH 4/5] arm64: ascii armor the arm64 boot init stack canary

2017-05-24 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Signed-off-by: Rik van Riel

[PATCH 2/5] fork,random: use get_random_canary to set tsk->stack_canary

2017-05-24 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Signed-off-by: Rik van Riel

[PATCH 1/5] random,stackprotect: introduce get_random_canary function

2017-05-24 Thread riel
From: Rik van Riel Introduce the get_random_canary function, which provides a random unsigned long canary value with the first byte zeroed out on 64 bit architectures, in order to mitigate non-terminated C string overflows. The null byte both prevents C string functions from reading the canary
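Per the description, the helper masks the first byte off a random long on 64-bit systems; a sketch:

    /* zero the first (lowest-addressed) byte of the canary,
     * leaving 56 random bits on 64 bit */
    #ifdef CONFIG_64BIT
    # ifdef __LITTLE_ENDIAN
    #  define CANARY_MASK 0xffffffffffffff00UL
    # else /* big endian, 64 bits: */
    #  define CANARY_MASK 0x00ffffffffffffffUL
    # endif
    #else /* 32 bits: */
    # define CANARY_MASK 0xffffffffUL
    #endif

    static inline unsigned long get_random_canary(void)
    {
            return get_random_long() & CANARY_MASK;
    }

Patch 2/5 then uses it at fork time: tsk->stack_canary = get_random_canary();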

[PATCH v2 0/5] stackprotector: ascii armor the stack canary

2017-05-24 Thread riel
Zero out the first byte of the stack canary value on 64 bit systems, in order to mitigate unterminated C string overflows. The null byte both prevents C string functions from reading the canary, and from writing it if the canary value were guessed or obtained through some other means.

[PATCH 5/5] sh64: ascii armor the sh64 boot init stack canary

2017-05-24 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Signed-off-by: Rik van Riel

[PATCH 2/5] fork,random: use get_random_canary to set tsk->stack_canary

2017-05-19 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and PaX/grsecurity. Signed-off-by: Rik van Riel --- kernel/fork.c

[PATCH 3/5] x86: ascii armor the x86_64 boot init stack canary

2017-05-19 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and PaX/grsecurity. Signed-off-by: Rik van Riel --- arch/x86/include

[PATCH 1/5] random,stackprotect: introduce get_random_canary function

2017-05-19 Thread riel
From: Rik van Riel Introduce the get_random_canary function, which provides a random unsigned long canary value with the first byte zeroed out on 64 bit architectures, in order to mitigate non-terminated C string overflows. Inspired by the "ascii armor" code in the old execshie

[PATCH 4/5] arm64: ascii armor the arm64 boot init stack canary

2017-05-19 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and PaX/grsecurity. Signed-off-by: Rik van Riel --- arch/arm64

[PATCH 5/5] sh64: ascii armor the sh64 boot init stack canary

2017-05-19 Thread riel
From: Rik van Riel Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and PaX/grsecurity. Signed-off-by: Rik van Riel --- arch/sh/include

stackprotector: ascii armor the stack canary

2017-05-19 Thread riel
Zero out the first byte of the stack canary value on 64 bit systems, in order to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if an attacker somehow guessed or obtained the canary value. Inspired by execshield ascii-armor and PaX/grsecurity.
