[PATCH -tip] locking/mutexes: Avoid bogus wakeups after lock stealing

2014-08-13 Thread Davidlohr Bueso
The mutex lock-stealing functionality allows another task to
skip its turn in the wait-queue and atomically acquire the lock.
This is fine and a nice optimization. However, when releasing
the mutex, we always wake up the next task in FIFO order. When
the lock has been stolen, this wakes a task only for it to
immediately realize it cannot acquire the lock and go back to
sleep. While in practice this window is quite small, this is not
about performance or avoiding the wait_lock: avoiding bogus
wakeups is simply the right thing to do.

In order to deal with the race when potentially missing the
unlock slowpath (details in the comments), we pessimistically set
the lock to have waiters. The downside of this is that the task
that now stole the lock would always have to acquire the mutex in
its slowpath (as mutex_try_to_acquire() would never succeed).
However, since this path is rarely called, the cost is really
never noticed.
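
The check described above can be modelled in user space. This is only a sketch using C11 atomics: `toy_mutex`, `toy_unlock_should_wake()` and the integer owner id are hypothetical stand-ins, not the kernel's types, and it assumes the convention that count is 1 when unlocked, 0 when locked and negative when locked with possible waiters.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct mutex.
 * count: 1 = unlocked, 0 = locked, negative = locked with possible waiters.
 * owner: 0 means "no owner", any other value is a task id.
 */
struct toy_mutex {
	atomic_int count;
	atomic_int owner;
};

/*
 * Model of the check added to the unlock slowpath. Returns true if the
 * caller should go on to wake the first waiter, false if the wakeup is
 * skipped because another task took the lock after the owner field was
 * cleared.
 */
static bool toy_unlock_should_wake(struct toy_mutex *lock)
{
	if (atomic_load(&lock->owner) != 0) {
		/* Mark "has waiters" so the new owner's unlock hits the slowpath. */
		atomic_store(&lock->count, -1);
		return false;
	}
	return true;
}
```

With no owner set the function reports that a wakeup is needed; once an owner is present it returns false and leaves count at -1, which is what forces the later unlock through the slowpath.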

Signed-off-by: Davidlohr Bueso 
---
Original thread: https://lkml.org/lkml/2014/8/8/37

 kernel/locking/mutex.c | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dadbf88..4570611 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -383,12 +383,26 @@ done:
 
return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   struct task_struct *owner = ACCESS_ONCE(lock->owner);
+
+   return owner != NULL;
+}
+
 #else
+
 static bool mutex_optimistic_spin(struct mutex *lock,
  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   return false;
+}
 #endif
 
 __visible __used noinline
@@ -715,6 +729,35 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 {
unsigned long flags;
 
+/*
+ * Skipping the mutex_has_owner() check when DEBUG allows us to
+ * avoid taking the wait_lock in order to not call mutex_release()
+ * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
+ * a hung task when another one enters the lock's slowpath in
+ * mutex_lock().
+ */
+#ifndef CONFIG_DEBUG_MUTEXES
+   /*
+* Abort the wakeup operation if there is another mutex owner, as the
+* lock was stolen. mutex_unlock() should have cleared the owner field
+* before calling this function. If that field is now set, another task
+* must have acquired the mutex. Note that this is a very tiny window.
+*/
+   if (unlikely(mutex_has_owner(lock))) {
+   /*
+* Unconditionally set the lock to have waiters. Otherwise
+* we can race with another task that grabbed the mutex via
+* optimistic spinning and sets the lock to 0. When done,
+* the unlock logic never reaches the slowpath, thus never
+* waking the next task in the queue.
+* Furthermore, this is safe as we've already acknowledged
+* the fact that the lock was stolen and now a new owner
+* exists.
+*/
+   atomic_set(&lock->count, -1);
+   return;
+   }
+#endif
/*
 * As a performance measurement, release the lock before doing other
 * wakeup related duties to follow. This allows other tasks to acquire
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ARM: dts: Add sdio0 and sdio1 to the rk3288

2014-08-13 Thread Addy Ke
Signed-off-by: Addy Ke 
---
 arch/arm/boot/dts/rk3288.dtsi | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index 7a9173d..a440869 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -120,6 +120,32 @@
//fifo-depth = <0x100>;
};
 
+   sdio0: dwmmc@ff0d {
+   compatible = "rockchip,rk3288-dw-mshc";
+   reg = <0xff0d 0x4000>;
+   interrupts = ;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   clocks = < HCLK_SDIO0>, < SCLK_SDIO0>;
+   clock-names = "biu", "ciu";
+
+   fifo-depth = <0x100>;
+   };
+
+   sdio1: dwmmc@ff0e {
+   compatible = "rockchip,rk3288-dw-mshc";
+   reg = <0xff0e 0x4000>;
+   interrupts = ;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   clocks = < HCLK_SDIO1>, < SCLK_SDIO1>;
+   clock-names = "biu", "ciu";
+
+   fifo-depth = <0x100>;
+   };
+
emmc: dwmmc@ff0f {
compatible = "rockchip,rk3288-dw-mshc";
reg = <0xff0f 0x4000>;
@@ -589,6 +615,56 @@
};
};
 
+   sdio0 {
+   sdio0_clk: sdio0-clk {
+   rockchip,pins = <4 25 RK_FUNC_1 _pull_none>;
+   };
+
+   sdio0_cmd: sdio0-cmd {
+   rockchip,pins = <4 24 RK_FUNC_1 _pull_up>;
+   };
+
+   sdio0_pwr: sdio0-pwr {
+   rockchip,pins = <4 28 RK_FUNC_1 _pull_up>;
+   };
+
+   sdio0_bus1: sdio0-bus1 {
+   rockchip,pins = <4 20 RK_FUNC_1 _pull_up>;
+   };
+
+   sdio0_bus4: sdio0-bus4 {
+   rockchip,pins = <4 20 RK_FUNC_1 _pull_up>,
+   <4 21 RK_FUNC_1 _pull_up>,
+   <4 22 RK_FUNC_1 _pull_up>,
+   <4 23 RK_FUNC_1 _pull_up>;
+   };
+   };
+
+   sdio1 {
+   sdio1_clk: sdio1-clk {
+   rockchip,pins = <4 7 RK_FUNC_4 _pull_none>;
+   };
+
+   sdio1_cmd: sdio1-cmd {
+   rockchip,pins = <4 6 RK_FUNC_4 _pull_up>;
+   };
+
+   sdio1_pwr: sdio1-pwr {
+   rockchip,pins = <4 9 RK_FUNC_4 _pull_up>;
+   };
+
+   sdio1_bus1: sdio1-bus1 {
+   rockchip,pins = <4 24 RK_FUNC_4 _pull_up>;
+   };
+
+   sdio1_bus4: sdio1-bus4 {
+   rockchip,pins = <3 24 RK_FUNC_4 _pull_up>,
+   <3 25 RK_FUNC_4 _pull_up>,
+   <3 26 RK_FUNC_4 _pull_up>,
+   <3 27 RK_FUNC_4 _pull_up>;
+   };
+   };
+
emmc {
emmc_clk: emmc-clk {
rockchip,pins = <3 18 RK_FUNC_2 _pull_none>;
-- 
1.8.3.2




[PATCH v6 5/7] x86,random: Add an x86 implementation of arch_rng_init

2014-08-13 Thread Andy Lutomirski
This does the same thing as the generic implementation, except
that it logs how many bits of each type it collected.  I want to
know whether the initial seeding is working and, if so, whether
the RNG is fast enough.

(I know that hpa assures me that the hardware RNG is more than
 fast enough, but I'd still like a direct way to verify this.)

Arguably, arch_get_random_seed could be removed now: I'm having some
trouble imagining a sensible non-architecture-specific use of it
that wouldn't be better served by arch_rng_init.
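
The per-source logging can be illustrated outside the kernel. The following is a sketch only: `format_seed_log()` is a hypothetical helper, and it uses `snprintf()` into a caller-supplied buffer where the patch uses `sprintf()` and `pr_info()`. Each source appends ", N bits from X", and the leading ", " is skipped when the final line is produced.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical user-space helper mirroring the message assembly.
 * Writes "<log_prefix> with <sources>" into out and returns the
 * snprintf() result, or returns 0 without touching out if no source
 * contributed any bits.
 */
static int format_seed_log(char *out, size_t outlen, const char *log_prefix,
			   int rdseed_bits, int rdrand_bits)
{
	char buf[128] = "";
	size_t used = 0;

	if (rdseed_bits)
		used += snprintf(buf + used, sizeof(buf) - used,
				 ", %d bits from RDSEED", rdseed_bits);
	if (rdrand_bits)
		used += snprintf(buf + used, sizeof(buf) - used,
				 ", %d bits from RDRAND", rdrand_bits);
	if (!buf[0])
		return 0;

	/* buf + 2 skips the leading ", " separator. */
	return snprintf(out, outlen, "%s with %s", log_prefix, buf + 2);
}
```

Building the list with a uniform ", " prefix keeps every branch identical; only the final print needs to know about the separator.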

Signed-off-by: Andy Lutomirski 
---
 arch/x86/include/asm/archrandom.h |  6 +
 arch/x86/kernel/Makefile  |  2 ++
 arch/x86/kernel/archrandom.c  | 51 +++
 3 files changed, 59 insertions(+)
 create mode 100644 arch/x86/kernel/archrandom.c

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 69f1366..5611c21 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -117,6 +117,12 @@ GET_SEED(arch_get_random_seed_int, unsigned int, RDSEED_INT, ASM_NOP4);
 #define arch_has_random()  static_cpu_has(X86_FEATURE_RDRAND)
 #define arch_has_random_seed() static_cpu_has(X86_FEATURE_RDSEED)
 
+#define __HAVE_ARCH_RNG_INIT
+extern void arch_rng_init(void *ctx,
+ void (*seed)(void *ctx, u32 data),
+ int bits_per_source,
+ const char *log_prefix);
+
 #else
 
 static inline int rdrand_long(unsigned long *v)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 047f9ff..0718bae 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -92,6 +92,8 @@ obj-$(CONFIG_PARAVIRT)+= paravirt.o paravirt_patch_$(BITS).o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)   += pvclock.o
 
+obj-$(CONFIG_ARCH_RANDOM)  += archrandom.o
+
 obj-$(CONFIG_PCSPKR_PLATFORM)  += pcspeaker.o
 
 obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o
diff --git a/arch/x86/kernel/archrandom.c b/arch/x86/kernel/archrandom.c
new file mode 100644
index 000..e8d2ffb
--- /dev/null
+++ b/arch/x86/kernel/archrandom.c
@@ -0,0 +1,51 @@
+/*
+ * This file is part of the Linux kernel.
+ *
+ * Copyright (c) 2014 Andy Lutomirski
+ * Authors: Andy Lutomirski 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+void arch_rng_init(void *ctx,
+  void (*seed)(void *ctx, u32 data),
+  int bits_per_source,
+  const char *log_prefix)
+{
+   int i;
+   int rdseed_bits = 0, rdrand_bits = 0;
+   char buf[128] = "";
+   char *msgptr = buf;
+
+   for (i = 0; i < bits_per_source; i += 8 * sizeof(long)) {
+   unsigned long rv;
+
+   if (arch_get_random_seed_long(&rv))
+   rdseed_bits += 8 * sizeof(rv);
+   else if (arch_get_random_long(&rv))
+   rdrand_bits += 8 * sizeof(rv);
+   else
+   continue;   /* Don't waste time mixing. */
+
+   seed(ctx, (u32)rv);
+#if BITS_PER_LONG > 32
+   seed(ctx, (u32)(rv >> 32));
+#endif
+   }
+
+   if (rdseed_bits)
+   msgptr += sprintf(msgptr, ", %d bits from RDSEED", rdseed_bits);
+   if (rdrand_bits)
+   msgptr += sprintf(msgptr, ", %d bits from RDRAND", rdrand_bits);
+   if (buf[0])
+   pr_info("%s with %s\n", log_prefix, buf + 2);
+}
-- 
1.9.3



[PATCH v6 6/7] x86,random,kvm: Use KVM_GET_RNG_SEED in arch_rng_init

2014-08-13 Thread Andy Lutomirski
This is a straightforward implementation: for each bit of internal
RNG state, request one bit from KVM_GET_RNG_SEED.  This is done even
if RDSEED/RDRAND worked, since KVM_GET_RNG_SEED is likely to provide
cryptographically secure output even if the CPU's RNG is weak or
compromised.

Acked-by: Paolo Bonzini 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/Kconfig |  4 
 arch/x86/include/asm/kvm_guest.h |  9 +
 arch/x86/kernel/archrandom.c | 25 -
 arch/x86/kernel/kvm.c| 10 ++
 4 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d24887b..ad87278 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -594,6 +594,7 @@ config KVM_GUEST
bool "KVM Guest support (including kvmclock)"
depends on PARAVIRT
select PARAVIRT_CLOCK
+   select ARCH_RANDOM
default y
---help---
  This option enables various optimizations for running under the KVM
@@ -1508,6 +1509,9 @@ config ARCH_RANDOM
  If supported, this is a high bandwidth, cryptographically
  secure hardware random number generator.
 
+ This also enables paravirt RNGs such as KVM's if the relevant
+ PV guest support is enabled.
+
 config X86_SMAP
def_bool y
prompt "Supervisor Mode Access Prevention" if EXPERT
diff --git a/arch/x86/include/asm/kvm_guest.h b/arch/x86/include/asm/kvm_guest.h
index a92b176..8c4dbd5 100644
--- a/arch/x86/include/asm/kvm_guest.h
+++ b/arch/x86/include/asm/kvm_guest.h
@@ -3,4 +3,13 @@
 
 int kvm_setup_vsyscall_timeinfo(void);
 
+#if defined(CONFIG_KVM_GUEST) && defined(CONFIG_ARCH_RANDOM)
+extern bool kvm_get_rng_seed(u64 *rv);
+#else
+static inline bool kvm_get_rng_seed(u64 *rv)
+{
+   return false;
+}
+#endif
+
 #endif /* _ASM_X86_KVM_GUEST_H */
diff --git a/arch/x86/kernel/archrandom.c b/arch/x86/kernel/archrandom.c
index e8d2ffb..adbaa25 100644
--- a/arch/x86/kernel/archrandom.c
+++ b/arch/x86/kernel/archrandom.c
@@ -15,6 +15,7 @@
  */
 
 #include 
+#include 
 
 void arch_rng_init(void *ctx,
   void (*seed)(void *ctx, u32 data),
@@ -22,7 +23,7 @@ void arch_rng_init(void *ctx,
   const char *log_prefix)
 {
int i;
-   int rdseed_bits = 0, rdrand_bits = 0;
+   int rdseed_bits = 0, rdrand_bits = 0, kvm_bits = 0;
char buf[128] = "";
char *msgptr = buf;
 
@@ -42,10 +43,32 @@ void arch_rng_init(void *ctx,
 #endif
}
 
+   /*
+* Use KVM_GET_RNG_SEED regardless of whether the CPU RNG
+* worked, since it incorporates entropy unavailable to the CPU,
+* and we shouldn't trust the hardware RNG more than we need to.
+* We request enough bits for the entire internal RNG state,
+* because there's no good reason not to.
+*/
+   for (i = 0; i < bits_per_source; i += 64) {
+   u64 rv;
+
+   if (kvm_get_rng_seed(&rv)) {
+   seed(ctx, (u32)rv);
+   seed(ctx, (u32)(rv >> 32));
+   kvm_bits += 8 * sizeof(rv);
+   } else {
+   break;  /* If it fails once, it will keep failing. */
+   }
+   }
+
if (rdseed_bits)
msgptr += sprintf(msgptr, ", %d bits from RDSEED", rdseed_bits);
if (rdrand_bits)
msgptr += sprintf(msgptr, ", %d bits from RDRAND", rdrand_bits);
+   if (kvm_bits)
+   msgptr += sprintf(msgptr, ", %d bits from KVM_GET_RNG_BITS",
+ kvm_bits);
if (buf[0])
pr_info("%s with %s\n", log_prefix, buf + 2);
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3dd8e2c..bd8783a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -416,6 +416,16 @@ void kvm_disable_steal_time(void)
wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+bool kvm_get_rng_seed(u64 *v)
+{
+   /*
+* Allow migration from a hypervisor with the GET_RNG_SEED
+* feature to a hypervisor without it.
+*/
+   return (kvm_para_has_feature(KVM_FEATURE_GET_RNG_SEED) &&
+   rdmsrl_safe(MSR_KVM_GET_RNG_SEED, v) == 0);
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
-- 
1.9.3



[PATCH v6 7/7] x86,kaslr: Use MSR_KVM_GET_RNG_SEED for KASLR if available

2014-08-13 Thread Andy Lutomirski
It's considerably better than any of the alternatives on KVM.

Rather than reinventing all of the cpu feature query code, this fixes
native_cpuid to work in PIC objects.

I haven't combined it with boot/cpuflags.c's cpuid implementation:
including asm/processor.h from boot/cpuflags.c results in a flood of
unrelated errors, and fixing it might be messy.

Reviewed-by: Kees Cook 
Acked-by: Paolo Bonzini 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/boot/compressed/aslr.c  | 27 +++
 arch/x86/include/asm/processor.h | 21 ++---
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index fc6091a..8583f0e 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include 
+
 #include 
 #include 
 #include 
@@ -15,6 +17,22 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
+static bool kvm_para_has_feature(unsigned int feature)
+{
+   u32 kvm_base;
+   u32 features;
+
+   if (!has_cpuflag(X86_FEATURE_HYPERVISOR))
+   return false;
+
+   kvm_base = hypervisor_cpuid_base("KVMKVMKVM\0\0\0", KVM_CPUID_FEATURES);
+   if (!kvm_base)
+   return false;
+
+   features = cpuid_eax(kvm_base | KVM_CPUID_FEATURES);
+   return features & (1UL << feature);
+}
+
 #define I8254_PORT_CONTROL 0x43
 #define I8254_PORT_COUNTER0 0x40
 #define I8254_CMD_READBACK 0xC0
@@ -81,6 +99,15 @@ static unsigned long get_random_long(void)
}
}
 
+   if (kvm_para_has_feature(KVM_FEATURE_GET_RNG_SEED)) {
+   u64 seed;
+
+   debug_putstr(" MSR_KVM_GET_RNG_SEED");
+   rdmsrl(MSR_KVM_GET_RNG_SEED, seed);
+   random ^= (unsigned long)seed;
+   use_i8254 = false;
+   }
+
if (has_cpuflag(X86_FEATURE_TSC)) {
debug_putstr(" RDTSC");
rdtscll(raw);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4ea023..6096f3c 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,10 +189,25 @@ static inline int have_cpuid_p(void)
 static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
unsigned int *ecx, unsigned int *edx)
 {
-   /* ecx is often an input as well as an output. */
-   asm volatile("cpuid"
+   /*
+* This function can be used from the boot code, so it needs
+* to avoid using EBX in constraints in PIC mode.
+*
+* ecx is often an input as well as an output.
+*/
+   asm volatile(".ifnc %%ebx,%1 ; .ifnc %%rbx,%1   \n\t"
+"movl  %%ebx,%1\n\t"
+".endif ; .endif   \n\t"
+"cpuid \n\t"
+".ifnc %%ebx,%1 ; .ifnc %%rbx,%1   \n\t"
+"xchgl %%ebx,%1\n\t"
+".endif ; .endif"
: "=a" (*eax),
- "=b" (*ebx),
+#if defined(__i386__) && defined(__PIC__)
+ "=r" (*ebx),  /* gcc won't let us use ebx */
+#else
+ "=b" (*ebx),  /* ebx is okay */
+#endif
  "=c" (*ecx),
  "=d" (*edx)
: "0" (*eax), "2" (*ecx)
-- 
1.9.3



[PATCH v6 1/7] random: Add and use arch_rng_init

2014-08-13 Thread Andy Lutomirski
Currently, init_std_data contains its own logic for using arch
random sources.  This replaces that logic with a generic function
arch_rng_init that allows arch code to supply its own logic.  The
default implementation tries arch_get_random_seed_long and
arch_get_random_long individually.

The only functional change here is that random_get_entropy() is used
unconditionally instead of being used only when the arch sources
fail.  This may add a tiny amount of security.
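
The callback-based shape of the hook can be sketched in user space. Everything prefixed `toy_` below is hypothetical and exists only for illustration; in particular `toy_fixed_source()` is deterministic and is not a random source. The sketch pulls 64-bit words from a source and hands each one to the seed callback as two 32-bit halves, which is how a 64-bit `unsigned long` is split in the generic implementation.

```c
#include <stdbool.h>
#include <stdint.h>

typedef void (*toy_seed_fn)(void *ctx, uint32_t data);

/* Hypothetical source: returns true and fills *v on success. */
typedef bool (*toy_source_fn)(uint64_t *v);

/*
 * Request bits_per_source bits from source in 64-bit steps and pass each
 * word to seed as two 32-bit halves. Returns the number of bits delivered.
 */
static int toy_rng_init(void *ctx, toy_seed_fn seed, toy_source_fn source,
			int bits_per_source)
{
	int i, bits = 0;

	for (i = 0; i < bits_per_source; i += 64) {
		uint64_t rv;

		if (!source(&rv))
			continue;	/* nothing to mix this round */

		seed(ctx, (uint32_t)rv);
		seed(ctx, (uint32_t)(rv >> 32));
		bits += 64;
	}
	return bits;
}

/* Example context: a running XOR plus a count of seed() calls. */
struct toy_pool {
	uint32_t mix;
	int calls;
};

static void toy_pool_seed(void *ctx, uint32_t data)
{
	struct toy_pool *p = ctx;

	p->mix ^= data;
	p->calls++;
}

/* Deterministic demonstration source. NOT random. */
static bool toy_fixed_source(uint64_t *v)
{
	*v = 0x1122334455667788ULL;
	return true;
}
```

Passing the pool as an opaque `ctx` pointer keeps the arch code unaware of the pool's layout, which is the reason the hook takes a callback rather than a `struct entropy_store *` directly.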

Acked-by: Theodore Ts'o 
Signed-off-by: Andy Lutomirski 
---
 drivers/char/random.c  | 14 +++---
 include/linux/random.h | 40 
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 71529e1..7673e60 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1246,6 +1246,10 @@ void get_random_bytes_arch(void *buf, int nbytes)
 }
 EXPORT_SYMBOL(get_random_bytes_arch);
 
+static void seed_entropy_store(void *ctx, u32 data)
+{
+   mix_pool_bytes((struct entropy_store *)ctx, &data, sizeof(data), NULL);
+}
 
 /*
  * init_std_data - initialize pool with system data
@@ -1261,15 +1265,19 @@ static void init_std_data(struct entropy_store *r)
int i;
ktime_t now = ktime_get_real();
unsigned long rv;
+   char log_prefix[128];
 
r->last_pulled = jiffies;
mix_pool_bytes(r, &now, sizeof(now), NULL);
for (i = r->poolinfo->poolbytes; i > 0; i -= sizeof(rv)) {
-   if (!arch_get_random_seed_long(&rv) &&
-   !arch_get_random_long(&rv))
-   rv = random_get_entropy();
+   rv = random_get_entropy();
mix_pool_bytes(r, &rv, sizeof(rv), NULL);
}
+
+   sprintf(log_prefix, "random: seeded %s pool", r->name);
+   arch_rng_init(r, seed_entropy_store, 8 * r->poolinfo->poolbytes,
+ log_prefix);
+
mix_pool_bytes(r, utsname(), sizeof(*(utsname())), NULL);
 }
 
diff --git a/include/linux/random.h b/include/linux/random.h
index 57fbbff..c8d692e 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -106,6 +106,46 @@ static inline int arch_has_random_seed(void)
 }
 #endif
 
+#ifndef __HAVE_ARCH_RNG_INIT
+
+/**
+ * arch_rng_init() - get architectural rng seed data
+ * @ctx: context for the seed function
+ * @seed: function to call for each u32 obtained
+ * @bits_per_source: number of bits from each source to try to use
+ * @log_prefix: beginning of log output (may be NULL)
+ *
+ * Synchronously load some architectural entropy or other best-effort
+ * random seed data.  An arch-specific implementation should be no worse
+ * than this generic implementation.  If the arch code does something
+ * interesting, it may log something of the form "log_prefix with
+ * 8 bits of stuff".
+ *
+ * No arch-specific implementation should be any worse than the generic
+ * implementation.
+ */
+static inline void arch_rng_init(void *ctx,
+void (*seed)(void *ctx, u32 data),
+int bits_per_source,
+const char *log_prefix)
+{
+   int i;
+
+   for (i = 0; i < bits_per_source; i += 8 * sizeof(long)) {
+   unsigned long rv;
+
+   if (arch_get_random_seed_long(&rv) ||
+   arch_get_random_long(&rv)) {
+   seed(ctx, (u32)rv);
+#if BITS_PER_LONG > 32
+   seed(ctx, (u32)(rv >> 32));
+#endif
+   }
+   }
+}
+
+#endif /* __HAVE_ARCH_RNG_INIT */
+
 /* Pseudo random number generator from numerical recipes. */
 static inline u32 next_pseudo_random32(u32 seed)
 {
-- 
1.9.3



[PATCH v6 4/7] x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit

2014-08-13 Thread Andy Lutomirski
This adds a simple interface to allow a guest to request 64 bits of
host nonblocking entropy.  This is independent of virtio-rng for a
couple of reasons:

 - It's intended to be usable during early boot, when a trivial
   synchronous interface is needed.

 - virtio-rng gives blocking entropy, and making guest boot wait for
   the host's /dev/random will cause problems.

MSR_KVM_GET_RNG_SEED is intended to provide 64 bits of best-effort
cryptographically secure data for use as a seed.  It provides no
guarantee that the result contains any actual entropy.
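
A guest is expected to test the feature bit before touching the MSR. The sketch below shows only that bit test; `toy_kvm_has_feature()` is a hypothetical helper, `features` stands for the EAX value read from the KVM feature CPUID leaf, and obtaining that value is outside the scope of the example. The bit number 8 is the one this patch assigns to KVM_FEATURE_GET_RNG_SEED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number assigned to KVM_FEATURE_GET_RNG_SEED by this patch. */
#define TOY_KVM_FEATURE_GET_RNG_SEED 8

/*
 * Return true if bit `feature` is set in `features`.
 * Bit numbers of 32 or more are treated as "not present".
 */
static bool toy_kvm_has_feature(uint32_t features, unsigned int feature)
{
	if (feature >= 32)
		return false;
	return (features >> feature) & 1u;
}
```

Shifting the feature word down and masking with 1 yields exactly 0 or 1, so the result converts to `bool` without depending on the width of the mask.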

Acked-by: Paolo Bonzini 
Signed-off-by: Andy Lutomirski 
---
 Documentation/virtual/kvm/cpuid.txt  | 3 +++
 arch/x86/include/uapi/asm/kvm_para.h | 2 ++
 arch/x86/kvm/cpuid.c | 3 ++-
 arch/x86/kvm/x86.c   | 4 
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..0ab043b 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,9 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest checks this feature bit
||   || before enabling paravirtualized
||   || spinlock support.
 --
+KVM_FEATURE_GET_RNG_SEED   || 8 || host provides rng seed data via
+   ||   || MSR_KVM_GET_RNG_SEED.
+--
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side
||   || per-cpu warps are expected in
||   || kvmclock.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..e2eaf93 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #define KVM_FEATURE_STEAL_TIME 5
 #define KVM_FEATURE_PV_EOI 6
 #define KVM_FEATURE_PV_UNHALT  7
+#define KVM_FEATURE_GET_RNG_SEED   8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -40,6 +41,7 @@
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
 #define MSR_KVM_PV_EOI_EN  0x4b564d04
+#define MSR_KVM_GET_RNG_SEED 0x4b564d05
 
 struct kvm_steal_time {
__u64 steal;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 38a0afe..40d6763 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -479,7 +479,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 (1 << KVM_FEATURE_ASYNC_PF) |
 (1 << KVM_FEATURE_PV_EOI) |
 (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
-(1 << KVM_FEATURE_PV_UNHALT);
+(1 << KVM_FEATURE_PV_UNHALT) |
+(1 << KVM_FEATURE_GET_RNG_SEED);
 
if (sched_info_on())
entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ef432f8..695b682 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define CREATE_TRACE_POINTS
@@ -2480,6 +2481,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
case MSR_KVM_PV_EOI_EN:
data = vcpu->arch.pv_eoi.msr_val;
break;
+   case MSR_KVM_GET_RNG_SEED:
+   get_random_bytes(&data, sizeof(data));
+   break;
case MSR_IA32_P5_MC_ADDR:
case MSR_IA32_P5_MC_TYPE:
case MSR_IA32_MCG_CAP:
-- 
1.9.3



[PATCH v6 2/7] random, timekeeping: Collect timekeeping entropy in the timekeeping code

2014-08-13 Thread Andy Lutomirski
Currently, init_std_data calls ktime_get_real().  This imposes
awkward constraints on when init_std_data can be called, and
init_std_data is unlikely to collect the full unpredictable data
available to the timekeeping code, especially after resume.

Remove this code from random.c and add the appropriate
add_device_randomness calls to timekeeping.c instead.

Cc: John Stultz 
Signed-off-by: Andy Lutomirski 
---
 drivers/char/random.c |  2 --
 kernel/time/timekeeping.c | 11 +++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 7673e60..8dc3e3a 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1263,12 +1263,10 @@ static void seed_entropy_store(void *ctx, u32 data)
 static void init_std_data(struct entropy_store *r)
 {
int i;
-   ktime_t now = ktime_get_real();
unsigned long rv;
char log_prefix[128];
 
r->last_pulled = jiffies;
-   mix_pool_bytes(r, &now, sizeof(now), NULL);
for (i = r->poolinfo->poolbytes; i > 0; i -= sizeof(rv)) {
rv = random_get_entropy();
mix_pool_bytes(r, &rv, sizeof(rv), NULL);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 32d8d6a..9609db9 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "tick-internal.h"
 #include "ntp_internal.h"
@@ -835,6 +836,9 @@ void __init timekeeping_init(void)
memcpy(&shadow_timekeeper, &timekeeper, sizeof(timekeeper));
 
write_seqcount_end(&timekeeper_seq);
+
+   add_device_randomness(tk, sizeof(tk));
+
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 }
 
@@ -976,6 +980,13 @@ static void timekeeping_resume(void)
timekeeping_suspended = 0;
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
write_seqcount_end(&timekeeper_seq);
+
+   /*
+* The timekeeping state has a decent chance of differing
+* between resumptions of the same image.
+*/
+   add_device_randomness(tk, sizeof(tk));
+
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 
touch_softlockup_watchdog();
-- 
1.9.3



[PATCH v6 3/7] random: Reseed pools on resume

2014-08-13 Thread Andy Lutomirski
After a suspend/resume cycle, and especially after hibernating, we
should assume that the random pools might have leaked.  To minimize
the risk this poses, try to collect fresh architectural entropy on
resume.
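
The resume hook follows the usual "table of callbacks" pattern. The sketch below is a user-space miniature under stated assumptions: every `toy_` name is hypothetical, the list is a plain singly linked list with no locking, and the resume callback only counts how often a reseed would have been attempted.

```c
#include <stddef.h>

/* Hypothetical miniature of a resume-callback table entry. */
struct toy_syscore_ops {
	void (*resume)(void);
	struct toy_syscore_ops *next;
};

static struct toy_syscore_ops *toy_ops_list;
static int toy_reseed_count;

/* Add an entry to the head of the list (no locking in this sketch). */
static void toy_register_syscore_ops(struct toy_syscore_ops *ops)
{
	ops->next = toy_ops_list;
	toy_ops_list = ops;
}

/* Simulate one resume: run every registered resume callback. */
static void toy_syscore_resume(void)
{
	struct toy_syscore_ops *ops;

	for (ops = toy_ops_list; ops; ops = ops->next)
		if (ops->resume)
			ops->resume();
}

/* Stand-in for random_resume(): only counts reseed attempts. */
static void toy_random_resume(void)
{
	toy_reseed_count++;
}

static struct toy_syscore_ops toy_random_ops = {
	.resume = toy_random_resume,
};
```

Registering `toy_random_ops` once and calling `toy_syscore_resume()` for each simulated resume runs the reseed stand-in once per resume, matching the one-time registration in `rand_initialize()`.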

Signed-off-by: Andy Lutomirski 
---
 drivers/char/random.c | 26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 8dc3e3a..0811ad4 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -257,6 +257,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1279,6 +1280,26 @@ static void init_std_data(struct entropy_store *r)
mix_pool_bytes(r, utsname(), sizeof(*(utsname())), NULL);
 }
 
+static void init_all_pools(void)
+{
+   init_std_data(&input_pool);
+   init_std_data(&blocking_pool);
+   init_std_data(&nonblocking_pool);
+}
+
+static void random_resume(void)
+{
+   /*
+* After resume (and especially after hibernation / kexec resume),
+* make a best-effort attempt to collect fresh entropy.
+*/
+   init_all_pools();
+}
+
+static struct syscore_ops random_syscore_ops = {
+   .resume = random_resume,
+};
+
 /*
  * Note that setup_arch() may call add_device_randomness()
  * long before we get here. This allows seeding of the pools
@@ -1291,9 +1312,8 @@ static void init_std_data(struct entropy_store *r)
  */
 static int rand_initialize(void)
 {
-   init_std_data(&input_pool);
-   init_std_data(&blocking_pool);
-   init_std_data(&nonblocking_pool);
+   init_all_pools();
+   register_syscore_ops(&random_syscore_ops);
return 0;
 }
 early_initcall(rand_initialize);
-- 
1.9.3
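
The registration pattern in this patch can be modelled in userspace roughly
as follows. This is only a sketch: every demo_ name is a hypothetical
stand-in (for register_syscore_ops(), init_all_pools() and the resume path),
and the "pools" are reduced to a counter so that the control flow is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a syscore-style ops table. */
struct demo_syscore_ops {
	void (*resume)(void);
	struct demo_syscore_ops *next;
};

static struct demo_syscore_ops *demo_ops_list;
static int demo_pool_seed_count;	/* how often the pools were (re)seeded */

/* Push an ops table onto the global list. */
static void demo_register_syscore_ops(struct demo_syscore_ops *ops)
{
	ops->next = demo_ops_list;
	demo_ops_list = ops;
}

/* Stand-in for init_all_pools(): just count the reseed. */
static void demo_init_all_pools(void)
{
	demo_pool_seed_count++;
}

static void demo_random_resume(void)
{
	demo_init_all_pools();
}

static struct demo_syscore_ops demo_random_ops = {
	.resume = demo_random_resume,
};

/* Mirrors rand_initialize(): seed once, then register the resume hook. */
static int demo_rand_initialize(void)
{
	demo_init_all_pools();
	demo_register_syscore_ops(&demo_random_ops);
	return 0;
}

/* Simulated resume path: run every registered resume callback. */
static void demo_syscore_resume(void)
{
	for (struct demo_syscore_ops *ops = demo_ops_list; ops; ops = ops->next)
		if (ops->resume)
			ops->resume();
}
```

Calling demo_rand_initialize() once and demo_syscore_resume() once leaves
the counter at 2, i.e. one seeding at boot and one reseed on resume.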



[PATCH v6 0/7] random,x86,kvm: Rework arch RNG seeds and get some from kvm

2014-08-13 Thread Andy Lutomirski
This introduces and uses a very simple synchronous mechanism to get
/dev/urandom-style bits appropriate for initial KVM PV guest RNG
seeding.

It also re-works the way that architectural random data is fed into
random.c's pools.  Timekeeping randomness now comes directly from
the timekeeping core rather than being pulled in from init_std_data,
and timekeeping randomness is added both on boot and on resume.  I
added a new arch hook called arch_rng_init.  The default
implementation is more or less the same as the current code, except
that random_get_entropy is now called unconditionally.  We now also
call init_std_data on resume.

x86 gets a custom arch_rng_init.  It will use KVM_GET_RNG_SEED if
available, and, if it does anything, it will log the number of bits
collected from each available architectural source.  If more
paravirt seed sources show up, it will be a natural place to add
them.

I sent the corresponding kvm-unit-tests and qemu changes separately.

Changes from v5:
 - Moved the generic changes to the beginning.
 - Renamed arch_get_rng_seed to arch_rng_init.
 - The timekeeping change is new.
 - random.c registers a syscore callback to reseed on resume.

Changes from v4:
 - Got rid of the RDRAND behavior change.  If this series is accepted,
   I may resend it separately, but I think it's an unrelated issue.
 - Fix up the changelog entries -- I misunderstood how the old code
   worked.
 - Avoid lots of failed attempts to use KVM_GET_RNG_SEED if it's not
   available.

Changes from v3:
 - Other than KASLR, the guest pieces are completely rewritten.
   Patches 2-4 have essentially nothing in common with v2.

Changes from v2:
 - Bisection fix (patch 2 had a misplaced brace).  The final state is
   identical to that of v2.
 - Improve the 0/5 description a little bit.

Changes from v1:
 - Split patches 2 and 3
 - Log all arch sources in init_std_data
 - Fix the 32-bit kaslr build

Andy Lutomirski (7):
  random: Add and use arch_rng_init
  random, timekeeping: Collect timekeeping entropy in the timekeeping
code
  random: Reseed pools on resume
  x86,kvm: Add MSR_KVM_GET_RNG_SEED and a matching feature bit
  x86,random: Add an x86 implementation of arch_rng_init
  x86,random,kvm: Use KVM_GET_RNG_SEED in arch_rng_init
  x86,kaslr: Use MSR_KVM_GET_RNG_SEED for KASLR if available

 Documentation/virtual/kvm/cpuid.txt  |  3 ++
 arch/x86/Kconfig |  4 ++
 arch/x86/boot/compressed/aslr.c  | 27 +
 arch/x86/include/asm/archrandom.h|  6 +++
 arch/x86/include/asm/kvm_guest.h |  9 +
 arch/x86/include/asm/processor.h | 21 --
 arch/x86/include/uapi/asm/kvm_para.h |  2 +
 arch/x86/kernel/Makefile |  2 +
 arch/x86/kernel/archrandom.c | 74 
 arch/x86/kernel/kvm.c| 10 +
 arch/x86/kvm/cpuid.c |  3 +-
 arch/x86/kvm/x86.c   |  4 ++
 drivers/char/random.c| 42 
 include/linux/random.h   | 40 +++
 kernel/time/timekeeping.c| 11 ++
 15 files changed, 246 insertions(+), 12 deletions(-)
 create mode 100644 arch/x86/kernel/archrandom.c

-- 
1.9.3



Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio generation number

2014-08-13 Thread Xiao Guangrong
On 08/13/2014 05:18 AM, David Matlack wrote:
> On Mon, Aug 11, 2014 at 10:02 PM, Xiao Guangrong
>  wrote:
>> @@ -722,9 +719,10 @@ static struct kvm_memslots *install_new_memslots(struct 
>> kvm *kvm,
>>  {
>> struct kvm_memslots *old_memslots = kvm->memslots;
>>
> 
> I think you want
> 
>   slots->generation = old_memslots->generation;
> 
> here.
> 
> On the KVM_MR_DELETE path, install_new_memslots is called twice so this
> patch introduces a short window of time where the generation number
> actually decreases.

Yes, indeed. Thank you for pointing it out, will update.



Re: [PATCH 11/14] ARM: brcmstb: delete unneeded test before of_node_put

2014-08-13 Thread Julia Lawall


On Wed, 13 Aug 2014, Brian Norris wrote:

> Hi Julia,
> 
> On Fri, Aug 08, 2014 at 12:07:52PM +0200, Julia Lawall wrote:
> > From: Julia Lawall 
> > 
> > Simplify the error path to avoid calling of_node_put when it is not needed.
> > 
> > The semantic patch that finds this problem is as follows:
> > (http://coccinelle.lip6.fr/)
> > 
> > // <smpl>
> > @@
> > expression e;
> > @@
> > 
> > -if (e)
> >of_node_put(e);
> > // </smpl>
> > 
> > Signed-off-by: Julia Lawall 
> > 
> > ---
> >  arch/arm/mach-bcm/platsmp-brcmstb.c |   14 ++
> 
> This file is being dropped temporarily, for rework/resubmission at a
> later time:
> 
>   https://lkml.org/lkml/2014/8/13/617
> 
> But thanks for the patch. I'll take it into account in the future. A few
> comments below.
> 
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/arm/mach-bcm/platsmp-brcmstb.c 
> > b/arch/arm/mach-bcm/platsmp-brcmstb.c
> > index af780e9..c515ea1 100644
> > --- a/arch/arm/mach-bcm/platsmp-brcmstb.c
> > +++ b/arch/arm/mach-bcm/platsmp-brcmstb.c
> > @@ -227,7 +227,7 @@ static int __init setup_hifcpubiuctrl_regs(struct 
> > device_node *np)
> > if (!syscon_np) {
> > pr_err("can't find phandle %s\n", name);
> > rc = -EINVAL;
> > -   goto cleanup;
> > +   goto out;
> > }
> >  
> > cpubiuctrl_block = of_iomap(syscon_np, 0);
> > @@ -256,9 +256,8 @@ static int __init setup_hifcpubiuctrl_regs(struct 
> > device_node *np)
> > }
> >  
> >  cleanup:
> > -   if (syscon_np)
> > -   of_node_put(syscon_np);
> > -
> > +   of_node_put(syscon_np);
> > +out:
> 
> Is there a good reason for this new label? I thought part of the point
> of this semantic patch is that the previous line (of_node_put()) is a
> no-op for NULL arguments.

Personally, I prefer code to only be executed if it needs to be.  It is 
helpful from a program analysis point of view, and I think it helps 
someone trying to understand the code.

That is, when I am trying to understand some unknown code, I may look at 
the cleanup code and try to figure out why each piece of it is executed.  
If some of it is statically known to be irrelevant, it is confusing.

But if you think the other way around, and would rather have just one label 
that contains anything that might ever be useful, then I guess that is a 
reasonable point of view as well.

julia


> > return rc;
> >  }
> >  
> > @@ -274,7 +273,7 @@ static int __init setup_hifcont_regs(struct device_node 
> > *np)
> > if (!syscon_np) {
> > pr_err("can't find phandle %s\n", name);
> > rc = -EINVAL;
> > -   goto cleanup;
> > +   goto out;
> > }
> >  
> > hif_cont_block = of_iomap(syscon_np, 0);
> > @@ -288,9 +287,8 @@ static int __init setup_hifcont_regs(struct device_node 
> > *np)
> > hif_cont_reg = 0;
> >  
> >  cleanup:
> > -   if (syscon_np)
> > -   of_node_put(syscon_np);
> > -
> > +   of_node_put(syscon_np);
> > +out:
> 
> Ditto.
> 
> > return rc;
> >  }
> >  
> > 
> 
> Brian
> 
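
The two styles discussed in this thread can be compared with a small
userspace sketch. It is not the brcmstb code: demo_node, demo_node_get() and
demo_node_put() are hypothetical stand-ins, with demo_node_put() modelled as a
no-op for NULL, as Brian describes of_node_put(). A call counter makes the
difference between the two error paths observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_node {
	int refcount;
};

static int demo_put_calls;	/* counts every call, including NULL ones */

/* Take a reference; returns NULL when the lookup "fails". */
static struct demo_node *demo_node_get(struct demo_node *candidate)
{
	if (candidate)
		candidate->refcount++;
	return candidate;
}

/* Drop a reference; tolerates NULL. */
static void demo_node_put(struct demo_node *np)
{
	demo_put_calls++;
	if (np)
		np->refcount--;
}

/* Two labels: the put is skipped when there is nothing to put. */
static int demo_setup_two_labels(struct demo_node *candidate, int map_ok)
{
	struct demo_node *np = demo_node_get(candidate);
	int rc = 0;

	if (!np) {
		rc = -EINVAL;
		goto out;
	}
	if (!map_ok) {
		rc = -ENOMEM;
		goto cleanup;
	}
cleanup:
	demo_node_put(np);
out:
	return rc;
}

/* One label: the put always runs and relies on NULL being harmless. */
static int demo_setup_one_label(struct demo_node *candidate, int map_ok)
{
	struct demo_node *np = demo_node_get(candidate);
	int rc = 0;

	if (!np) {
		rc = -EINVAL;
		goto cleanup;
	}
	if (!map_ok)
		rc = -ENOMEM;
cleanup:
	demo_node_put(np);
	return rc;
}
```

Both variants return the same codes and leave the reference count balanced.
They differ only in whether the put is executed on the failed-lookup path,
which is the point of disagreement above.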


[PATCH] ARM: compressed/head.S: use addruart properly

2014-08-13 Thread kpark3469
From: Sahara 

This patch fixes a compile error in compressed/head.S when DEBUG
is defined. Since the addruart macro accepts 3 params (rp, rv, and tmp),
the loadsp macro also needs to be fixed. Otherwise you get the following
error messages:
Error: ARM register expected -- `mov ,#(5<<1)'
Error: shift expression expected -- `add r3,r3,'

Signed-off-by: Sahara 
---
 arch/arm/boot/compressed/head.S |   22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 413fd94..9f1a6cd 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -24,19 +24,19 @@
 #if defined(CONFIG_DEBUG_ICEDCC)
 
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7)
-   .macro  loadsp, rb, tmp
+   .macro  loadsp, rp, rv, tmp
.endm
.macro  writeb, ch, rb
mcr p14, 0, \ch, c0, c5, 0
.endm
 #elif defined(CONFIG_CPU_XSCALE)
-   .macro  loadsp, rb, tmp
+   .macro  loadsp, rp, rv, tmp
.endm
.macro  writeb, ch, rb
mcr p14, 0, \ch, c8, c0, 0
.endm
 #else
-   .macro  loadsp, rb, tmp
+   .macro  loadsp, rp, rv, tmp
.endm
.macro  writeb, ch, rb
mcr p14, 0, \ch, c1, c0, 0
@@ -52,17 +52,17 @@
.endm
 
 #if defined(CONFIG_ARCH_SA1100)
-   .macro  loadsp, rb, tmp
-   mov \rb, #0x8000@ physical base address
+   .macro  loadsp, rp, rv, tmp
+   mov \rp, #0x8000@ physical base address
 #ifdef CONFIG_DEBUG_LL_SER3
-   add \rb, \rb, #0x0005   @ Ser3
+   add \rp, \rp, #0x0005   @ Ser3
 #else
-   add \rb, \rb, #0x0001   @ Ser1
+   add \rp, \rp, #0x0001   @ Ser1
 #endif
.endm
 #else
-   .macro  loadsp, rb, tmp
-   addruart \rb, \tmp
+   .macro  loadsp, rp, rv, tmp
+   addruart \rp, \rv, \tmp
.endm
 #endif
 #endif
@@ -1209,7 +1209,7 @@ phex: adr r3, phexbuf
b   1b
 
 @ puts corrupts {r0, r1, r2, r3}
-puts:  loadsp  r3, r1
+puts:  loadsp  r3, r2, r1
 1: ldrbr2, [r0], #1
teq r2, #0
moveq   pc, lr
@@ -1225,9 +1225,9 @@ puts: loadsp  r3, r1
mov pc, lr
 @ putc corrupts {r0, r1, r2, r3}
 putc:
+   loadsp  r3, r2, r1
mov r2, r0
mov r0, #0
-   loadsp  r3, r1
b   2b
 
 @ memdump corrupts {r0, r1, r2, r3, r10, r11, r12, lr}
-- 
1.7.9.5



linux-next: Tree for Aug 14

2014-08-13 Thread Stephen Rothwell
Hi all,

Please do not add code intended for v3.18 until after v3.17-rc1 is
released.

Changes since 20140813:

The tip tree gained a conflict against the pci-current tree.

Non-merge commits (relative to Linus' tree): 1233
 1229 files changed, 34781 insertions(+), 18188 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (this fails its final link) and i386, sparc, sparc64 and arm
defconfig.

Below is a summary of the state of the merge.

I am currently merging 220 trees (counting Linus' and 30 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (dc1cc8513312 Merge tag 'xfs-for-linus-3.17-rc1' of 
git://oss.sgi.com/xfs/xfs)
Merging fixes/master (23cf8d3ca0fd powerpc: Fix "attempt to move .org 
backwards" error)
Merging kbuild-current/rc-fixes (dd5a6752ae7d firmware: Create directories for 
external firmware)
Merging arc-current/for-curr (89ca3b881987 Linux 3.15-rc4)
Merging arm-current/fixes (e57e41931134 ARM: wire up memfd_create syscall)
Merging m68k-current/for-linus (9117710a5997 m68k/sun3: Remove define statement 
no longer needed)
Merging metag-fixes/fixes (ffe6902b66aa asm-generic: remove _STK_LIM_MAX)
Merging mips-fixes/mips-fixes (08a9c3c9afcf MIPS: OCTEON: make 
get_system_type() thread-safe)
Merging powerpc-merge/merge (396a34340cdf powerpc: Fix endianness of 
flash_block_list in rtas_flash)
Merging sparc/master (8bccf5b31318 sparc64: Fix pcr_ops initialization and 
usage bugs.)
Merging net/master (61dac43ee6be Merge branch 'bcmgenet')
Merging ipsec/master (a0e5ef53aac8 xfrm: Fix installation of AH IPsec SAs)
Merging sound-current/for-linus (e24aa0a4c5ac ALSA: hda/ca0132 - Don't try 
loading firmware at resume when already failed)
Merging pci-current/for-linus (9baa3c34ac4e PCI: Remove DEFINE_PCI_DEVICE_TABLE 
macro use)
Merging wireless/master (2da2c5854ed7 net: wireless: ipw2x00: ipw2200.c: 
Cleaning up missing null-terminate after strncpy call)
Merging driver-core.current/driver-core-linus (c489d98c8c81 Merge branch 
'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm)
Merging tty.current/tty-linus (c489d98c8c81 Merge branch 'for-linus' of 
git://ftp.arm.linux.org.uk/~rmk/linux-arm)
Merging usb.current/usb-linus (c489d98c8c81 Merge branch 'for-linus' of 
git://ftp.arm.linux.org.uk/~rmk/linux-arm)
Merging usb-gadget-fixes/fixes (a8a85b01d185 usb: musb/cppi41: call 
musb_ep_select() before accessing an endpoint's CSR)
CONFLICT (content): Merge conflict in drivers/usb/musb/musb_host.c
Merging usb-serial-fixes/usb-linus (19583ca584d6 Linux 3.16)
Merging staging.current/staging-linus (c309bfa9b481 Merge tag 
'for-linus-20140808' of git://git.infradead.org/linux-mtd)
Merging char-misc.current/char-misc-linus (c489d98c8c81 Merge branch 
'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm)
Merging input-current/for-linus (a6b48699ae50 Input: joystick - use get_cycles 
on ARMv8)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" 
stripe)
Merging crypto-current/master (ce5481d01f67 crypto: drbg - fix failure of 
generating multiple of 2**16 bytes)
Merging ide/master (a53dae49b2fe ide: use module_platform_driver())
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging devicetree-current/devicetree/merge (5a12a597a862 arm: Add devicetree 
fixup machine function)
Merging rr-fixes/fixes (79465d2fd48e module: remove warning about waiting 
module removal.)
Merging vfio-fixes/for-linus (239a87020b26 Merge branch 
'for-joerg/arm-smmu/fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into for-linus)
Merg

[TRIVIAL PATCH] Documentation: Fix null_blk parameter irq_mode to irqmode

2014-08-13 Thread Fam Zheng
To match the real module parameter name we implemented.

Signed-off-by: Fam Zheng 
---
 Documentation/block/null_blk.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/block/null_blk.txt b/Documentation/block/null_blk.txt
index b2830b4..2f6c6ff 100644
--- a/Documentation/block/null_blk.txt
+++ b/Documentation/block/null_blk.txt
@@ -42,7 +42,7 @@ nr_devices=[Number of devices]: Default: 2
   Number of block devices instantiated. They are instantiated as /dev/nullb0,
   etc.
 
-irq_mode=[0-2]: Default: 1-Soft-irq
+irqmode=[0-2]: Default: 1-Soft-irq
   The completion mode used for completing IOs to the block-layer.
 
   0: None.
@@ -53,7 +53,7 @@ irq_mode=[0-2]: Default: 1-Soft-irq
  completion.
 
 completion_nsec=[ns]: Default: 10.000ns
-  Combined with irq_mode=2 (timer). The time each completion event must wait.
+  Combined with irqmode=2 (timer). The time each completion event must wait.
 
 submit_queues=[0..nr_cpus]:
   The number of submission queues attached to the device driver. If unset, it
-- 
2.0.3



Re: [PATCH] ARM: shmobile: defconfig: enable initrd and atag dtb compat

2014-08-13 Thread Simon Horman
On Wed, Aug 13, 2014 at 05:07:15PM -0700, Kevin Hilman wrote:
> Enable ATAG_DTB_COMPAT support for passing commandline from bootloader.
> Enable initrd support.
> 
> Signed-off-by: Kevin Hilman 
> ---
> With this, I can finally integrate the kzm9d into the board farm because I use
> initramfs for the primary rootfs, and set the command-line from u-boot.
> 
> Tested on next-20140813

Thanks, I will queue this up for v3.18 and push it to next
once v3.17-rc1 has been released.

> 
>  arch/arm/configs/shmobile_defconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/configs/shmobile_defconfig 
> b/arch/arm/configs/shmobile_defconfig
> index 3b136144cc83..d655f088ac58 100644
> --- a/arch/arm/configs/shmobile_defconfig
> +++ b/arch/arm/configs/shmobile_defconfig
> @@ -3,6 +3,7 @@ CONFIG_NO_HZ=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_LOG_BUF_SHIFT=16
> +CONFIG_BLK_DEV_INITRD=y
>  CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>  CONFIG_SYSCTL_SYSCALL=y
>  CONFIG_EMBEDDED=y
> @@ -33,6 +34,7 @@ CONFIG_HIGHMEM=y
>  CONFIG_ZBOOT_ROM_TEXT=0x0
>  CONFIG_ZBOOT_ROM_BSS=0x0
>  CONFIG_ARM_APPENDED_DTB=y
> +CONFIG_ARM_ATAG_DTB_COMPAT=y
>  CONFIG_KEXEC=y
>  CONFIG_VFP=y
>  CONFIG_NEON=y
> -- 
> 1.9.2
> 


Re: [PATCH 0/2] new APIs to allocate buffer-cache for superblock in non-movable area

2014-08-13 Thread Gioh Kim


On 2014-08-14 at 2:12 PM, Gioh Kim wrote:
> Hello,
> 
> This patchset tries to solve the problem that the long-lasting page caches
> of the ext4 superblock and of the superblock's journaling data disturb page
> migration.
> 
> I've been testing the CMA feature on my ARM-based platform
> and found that two page caches cannot be migrated.
> They are the page caches of the ext4 superblock and of its journaling data.
> 
> Currently ext4 reads the superblock with sb_bread(), which allocates the page
> from the movable area. The problem is that ext4 holds the page until
> it is unmounted. If the root filesystem is ext4, the page can never be
> migrated.
> The journaling data for the superblock cannot be migrated either.
> 
> I introduce a new API for allocating page cache from the non-movable area.
> It is useful for ext4/ext3 and others that want to hold page cache for a long 
> time.
> 
> I have 3 patches:
> 
> 1. Patch 1/3: introduce a new API that creates page cache from the non-movable area
> 2. Patch 2/3: have ext4 use the new API to read the superblock
> 3. Patch 3/3: have jbd/jbd2 use the new API for journaling of the superblock
> 
> This patchset is based on linux-next-20140814.
> 
> Thanks a lot.
> 
> Gioh Kim (3):
>   fs/buffer.c: allocate buffer cache from non-movable area
>   ext4: allocate buffer-cache for superblock in non-movable area
>   jbd-jbd2-allocate-buffer-cache-for-superblock-inode-.patch
> 
>   fs/buffer.c |   63 ++
>   fs/ext4/super.c |6 +--
>   fs/jbd/journal.c|2 -
>   fs/jbd2/journal.c   |2 -
>   include/linux/buffer_head.h |   10 +
>   5 files changed, 71 insertions(+), 12 deletions(-)
> 

Sorry, I forgot to mention that this is the 2nd version of 
https://lkml.org/lkml/2014/7/22/34.
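
For readers skimming the series, the core of patch 1/3 is how the
allocation mask is built in grow_dev_page(). A minimal userspace sketch of
that logic follows. The DEMO_ flag values and all function names are
hypothetical and chosen only for illustration; they are not the kernel's
definitions.

```c
#include <assert.h>

typedef unsigned int demo_gfp_t;

/* Hypothetical flag values, for illustration only. */
#define DEMO_GFP_FS	 0x1u
#define DEMO_GFP_MOVABLE 0x2u
#define DEMO_GFP_WAIT	 0x4u

/*
 * Mirrors the mask computation in the patched grow_dev_page():
 * strip the FS bit, then add the movable bit only when the caller asks.
 */
static demo_gfp_t demo_page_gfp(demo_gfp_t mapping_mask, demo_gfp_t movable_mask)
{
	demo_gfp_t gfp_mask = mapping_mask & ~DEMO_GFP_FS;

	if (movable_mask & DEMO_GFP_MOVABLE)
		gfp_mask |= DEMO_GFP_MOVABLE;
	return gfp_mask;
}

/* Existing behaviour: buffer pages are marked movable. */
static demo_gfp_t demo_getblk_gfp(demo_gfp_t mapping_mask)
{
	return demo_page_gfp(mapping_mask, DEMO_GFP_MOVABLE);
}

/* New variant: the movable bit is simply not added. */
static demo_gfp_t demo_getblk_nonmovable_gfp(demo_gfp_t mapping_mask)
{
	return demo_page_gfp(mapping_mask, 0);
}
```

Note that, as in the patch, passing 0 only skips adding the movable bit. The
sketch does not clear a movable bit that is already present in the mapping's
own mask.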




Re: [PATCH 0/4] module: add support for unsafe, tainting parameters

2014-08-13 Thread Daniel Vetter
On Wed, Aug 13, 2014 at 10:25 PM, Rusty Russell  wrote:
> Jani Nikula  writes:
>> This is a generic version of Daniel's patch [1] letting us have unsafe
>> module parameters (experimental, debugging, testing, etc.) that taint
>> the kernel when set. Quoting Daniel,
>
> OK, I think the idea is fine, but we'll probably only want this for
> a few types (eg. int and bool).  So for the moment I prefer a more
> naive approach.
>
> Does this work for you?

Can you please discuss this with yourself from a few months back?
We've done the general version since you suggested that just doing it
for int is a bit lame ;-) And I actually agreed so asked Jani to look
into that.

http://mid.mail-archive.com/87r46gywul.fsf@rustcorp.com.au

"If this is a good idea, you can write a macro module_param_unsafe_named
which is a general wrapper."

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
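
To make the idea under discussion concrete, here is a rough userspace model
of "unsafe" parameters that taint when set. It is only a sketch under
assumptions: struct demo_param, demo_param_set() and the demo_tainted flag are
invented for the example and do not reflect the actual module_param
implementation or the macros proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter descriptor with an "unsafe" marker. */
struct demo_param {
	const char *name;
	int *value;
	bool unsafe;
};

static bool demo_tainted;	/* stands in for the kernel taint state */

/*
 * Set a parameter by name. Setting a parameter marked unsafe also
 * sets the taint flag. Returns 0 on success, -1 if the name is unknown.
 */
static int demo_param_set(struct demo_param *params, size_t n,
			  const char *name, int val)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(params[i].name, name) != 0)
			continue;
		if (params[i].unsafe)
			demo_tainted = true;
		*params[i].value = val;
		return 0;
	}
	return -1;
}
```

Because the marker lives in the descriptor rather than in a per-type setter,
the same check covers every parameter type, which is roughly the argument for
the general version over an int-only one.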


RE: [PATCH] Hyperv: Trigger DHCP renew after host hibernation

2014-08-13 Thread Dexuan Cui
> -Original Message-
> From: Dan Williams
> > > e.g., on a bare metal host with Ubuntu 14.04, when I plug the RJ45 cable
> > > out of the network card and then plug the cable back into the network card
> > > quickly -- in ~3 seconds, networkd doesn't trigger DHCP renew request: in
> > > /var/log/syslog, we see
> > > Aug 12 11:07:07 decui-lin NetworkManager[828]:  (eth0): carrier
> now OFF (device state 100, deferring action for 4 seconds)
> > > Aug 12 11:07:07 decui-lin kernel: [  246.975453] e1000e: eth0 NIC Link is
> Down
> > > Aug 12 11:07:10 decui-lin NetworkManager[828]:  (eth0): carrier
> now ON (device state 100)
> > > Aug 12 11:07:10 decui-lin kernel: [  250.028533] e1000e: eth0 NIC Link is
> Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
> > >
> > > It looks there is a delay of 4s.
> > > I'm going to find out if there is a configurable parameter for this.
> >
> > Just to avoid any confusion: you are referring to "networkd" (and so
> > did my comments), but the above logs are from "NetworkManager".
> 
> And yes, NM does have a 4-second delay before processing a carrier down
> event, and NM currently does not renew DHCP on link changes, but that's
> certainly something we can/should change.
> 
> Dan

Hi Tom, Dan,
Thanks a lot for the clarification!
So the 4s delay comes from NM rather than networkd.

Thanks,
-- Dexuan

Re: [PATCH 1/2] fs/buffer.c: allocate buffer cache from non-movable area

2014-08-13 Thread Gioh Kim
I'm sorry for a typo in the title.
It is 1/3, not 1/2.


Re: [PATCH 0/2] new APIs to allocate buffer-cache for superblock in non-movable area

2014-08-13 Thread Gioh Kim

I'm sorry for a typo in the title.
It is 0/3, not 0/2.


[PATCH 3/3] jbd/jbd2: allocate buffer-cache for superblock inode in non-movable area

2014-08-13 Thread Gioh Kim
A long-lasting buffer-cache can disturb page migration, so
it must be allocated from the non-movable area.

journal_init_inode() creates a buffer-cache for superblock journaling.
The superblock exists until system shutdown, so the buffer-cache
for the superblock also exists for a long time
and can disturb page migration.

This patch makes the buffer-cache be allocated from the non-movable area
so that it does not disturb page migration.

Signed-off-by: Gioh Kim 
---
 fs/jbd/journal.c  |2 +-
 fs/jbd2/journal.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index 06fe11e..2ed8aa2 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -886,7 +886,7 @@ journal_t * journal_init_inode (struct inode *inode)
goto out_err;
}

-   bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
+   bh = __getblk_nonmovable(journal->j_dev, blocknr, journal->j_blocksize);
if (!bh) {
printk(KERN_ERR
   "%s: Cannot get buffer for journal superblock\n",
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 67b8e30..a618e49 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1237,7 +1237,7 @@ journal_t * jbd2_journal_init_inode (struct inode *inode)
goto out_err;
}

-   bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
+   bh = __getblk_nonmovable(journal->j_dev, blocknr, journal->j_blocksize);
if (!bh) {
printk(KERN_ERR
   "%s: Cannot get buffer for journal superblock\n",
--
1.7.9.5



[PATCH 2/3] ext4: allocate buffer-cache for superblock in non-movable area

2014-08-13 Thread Gioh Kim
The buffer-cache for the superblock disturbs page migration
because it is not released until unmount.
It must therefore be allocated from the non-movable area.

Signed-off-by: Gioh Kim 
---
 fs/ext4/super.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 32b43ad..0630a88 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3435,7 +3435,7 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)
logical_sb_block = sb_block;
}

-   if (!(bh = sb_bread(sb, logical_sb_block))) {
+   if (!(bh = sb_bread_nonmovable(sb, logical_sb_block))) {
ext4_msg(sb, KERN_ERR, "unable to read superblock");
goto out_fail;
}
@@ -3645,7 +3645,7 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)
brelse(bh);
logical_sb_block = sb_block * EXT4_MIN_BLOCK_SIZE;
offset = do_div(logical_sb_block, blocksize);
-   bh = sb_bread(sb, logical_sb_block);
+   bh = sb_bread_nonmovable(sb, logical_sb_block);
if (!bh) {
ext4_msg(sb, KERN_ERR,
   "Can't read superblock on 2nd try");
@@ -3867,7 +3867,7 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)

for (i = 0; i < db_count; i++) {
block = descriptor_loc(sb, logical_sb_block, i);
-   sbi->s_group_desc[i] = sb_bread(sb, block);
+   sbi->s_group_desc[i] = sb_bread_nonmovable(sb, block);
if (!sbi->s_group_desc[i]) {
ext4_msg(sb, KERN_ERR,
   "can't read group descriptor %d", i);
--
1.7.9.5



[PATCH 1/2] fs/buffer.c: allocate buffer cache from non-movable area

2014-08-13 Thread Gioh Kim
A buffer cache is normally allocated from the movable area
because it is referenced for a while and released soon.
But some filesystems hold a buffer cache for a long time,
which can disturb page migration.

This patch introduces a new API to allocate buffer cache from the
non-movable area.

Signed-off-by: Gioh Kim 
---
 fs/buffer.c |   63 ++-
 include/linux/buffer_head.h |   10 +++
 2 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 8f05111..7ef658f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -993,7 +993,7 @@ init_page_buffers(struct page *page, struct block_device 
*bdev,
  */
 static int
 grow_dev_page(struct block_device *bdev, sector_t block,
-   pgoff_t index, int size, int sizebits)
+ pgoff_t index, int size, int sizebits, gfp_t movable_mask)
 {
struct inode *inode = bdev->bd_inode;
struct page *page;
@@ -1003,7 +1003,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
gfp_t gfp_mask;

gfp_mask = mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS;
-   gfp_mask |= __GFP_MOVABLE;
+   if (movable_mask & __GFP_MOVABLE)
+   gfp_mask |= __GFP_MOVABLE;
/*
 * XXX: __getblk_slow() can not really deal with failure and
 * will endlessly loop on improvised global reclaim.  Prefer
@@ -1058,7 +1059,8 @@ failed:
  * that page was dirty, the buffers are set dirty also.
  */
 static int
-grow_buffers(struct block_device *bdev, sector_t block, int size)
+grow_buffers(struct block_device *bdev, sector_t block,
+int size, gfp_t movable_mask)
 {
pgoff_t index;
int sizebits;
@@ -1085,11 +1087,12 @@ grow_buffers(struct block_device *bdev, sector_t block, 
int size)
}

/* Create a page with the proper size buffers.. */
-   return grow_dev_page(bdev, block, index, size, sizebits);
+   return grow_dev_page(bdev, block, index, size, sizebits, movable_mask);
 }

 static struct buffer_head *
-__getblk_slow(struct block_device *bdev, sector_t block, int size)
+__getblk_slow(struct block_device *bdev, sector_t block,
+ int size, gfp_t movable_mask)
 {
/* Size must be multiple of hard sectorsize */
if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
@@ -,7 +1114,7 @@ __getblk_slow(struct block_device *bdev, sector_t block, 
int size)
if (bh)
return bh;

-   ret = grow_buffers(bdev, block, size);
+   ret = grow_buffers(bdev, block, size, movable_mask);
if (ret < 0)
return NULL;
if (ret == 0)
@@ -1385,11 +1388,34 @@ __getblk(struct block_device *bdev, sector_t block, 
unsigned size)

might_sleep();
if (bh == NULL)
-   bh = __getblk_slow(bdev, block, size);
+   bh = __getblk_slow(bdev, block, size, __GFP_MOVABLE);
return bh;
 }
 EXPORT_SYMBOL(__getblk);

+ /*
+ * __getblk_nonmovable will locate (and, if necessary, create) the buffer_head
+ * which corresponds to the passed block_device, block and size. The
+ * returned buffer has its reference count incremented.
+ *
+ * The page cache is allocated from non-movable area
+ * not to prevent page migration.
+ *
+ * __getblk_nonmovable() will lock up the machine
+ * if grow_dev_page's try_to_free_buffers() attempt is failing. FIXME, perhaps?
+ */
+struct buffer_head *
+__getblk_nonmovable(struct block_device *bdev, sector_t block, unsigned size)
+{
+   struct buffer_head *bh = __find_get_block(bdev, block, size);
+
+   might_sleep();
+   if (bh == NULL)
+   bh = __getblk_slow(bdev, block, size, 0);
+   return bh;
+}
+EXPORT_SYMBOL(__getblk_nonmovable);
+
 /*
  * Do async read-ahead on a buffer..
  */
@@ -1410,6 +1436,7 @@ EXPORT_SYMBOL(__breadahead);
  *  @size: size (in bytes) to read
  *
  *  Reads a specified block, and returns buffer head that contains it.
+ *  The page cache is allocated from movable area so that it can be migrated.
  *  It returns NULL if the block was unreadable.
  */
 struct buffer_head *
@@ -1423,6 +1450,28 @@ __bread(struct block_device *bdev, sector_t block, unsigned size)
 }
 EXPORT_SYMBOL(__bread);

+/**
+ *  __bread_nonmovable() - reads a specified block and returns the bh
+ *  @bdev: the block_device to read from
+ *  @block: number of block
+ *  @size: size (in bytes) to read
+ *
+ *  Reads a specified block, and returns buffer head that contains it.
+ *  The page cache is allocated from non-movable area
+ *  not to prevent page migration.
+ *  It returns NULL if the block was unreadable.
+ */
+struct buffer_head *
+__bread_nonmovable(struct block_device *bdev, sector_t block, unsigned size)
+{
+   struct buffer_head *bh = __getblk_slow(bdev, block, size, 0);
+
+   if (likely(bh) && !buffer_uptodate(bh))
+   bh = __bread_slow(bh);
+   return bh;
+}

Re: [PATCH v5 0/5] random,x86,kvm: Rework arch RNG seeds and get some from kvm

2014-08-13 Thread Andy Lutomirski
On Wed, Aug 13, 2014 at 7:41 PM, H. Peter Anvin  wrote:
> On 08/13/2014 11:44 AM, H. Peter Anvin wrote:
>> On 08/13/2014 11:33 AM, Andy Lutomirski wrote:
>>>
>>> As for doing arch_random_init after clone/migration, I think we'll
>>> need another KVM extension for that, since, AFAIK, we don't actually
>>> get notified that we were cloned or migrated.  That will be
>>> nontrivial.  Maybe we can figure that out at KS, too.
>>>
>>
>> We don't need a reset when migrated (although it might be a good idea
>> under some circumstances, i.e. if the pools might somehow have gotten
>> exposed) but definitely when cloned.
>>
>
> But yes, we need a notification.  For obvious reasons there is no
> suspend event (one can snapshot a running VM) but we need to be notified
> upon wakeup, *or* we need to give KVM a way to update the necessary state.

This could presumably use the interrupt mechanism on virtio-rng if
we're willing to depend on having host support for virtio-rng.

v6 (coming in a few minutes) will at least get it right when the
kernel goes through the resume path (i.e. not KVM/QEMU suspend, and
maybe not S0ix either).

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] PC, KVM, CMA: Fix regression caused by wrong get_order() use

2014-08-13 Thread Aneesh Kumar K.V
Alexey Kardashevskiy  writes:

> fc95ca7284bc54953165cba76c3228bd2cdb9591 claims that there is no
> functional change but this is not true as it calls get_order() (which
> takes bytes) where it should have called ilog2() and the kernel stops
> on VM_BUG_ON().
>
> This replaces get_order() with order_base_2() (round-up version of ilog2).
>
> Suggested-by: Paul Mackerras 
> Cc: Alexander Graf 
> Cc: Aneesh Kumar K.V 
> Cc: Joonsoo Kim 
> Cc: Benjamin Herrenschmidt 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: Aneesh Kumar K.V 

> ---
>
> Changes:
> v2:
> * s/ilog2/order_base_2/
> * removed cc:  as I got wrong impression that v3.16 is
> broken
>
> ---
>  arch/powerpc/kvm/book3s_hv_builtin.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
> index 329d7fd..b9615ba 100644
> --- a/arch/powerpc/kvm/book3s_hv_builtin.c
> +++ b/arch/powerpc/kvm/book3s_hv_builtin.c
> @@ -101,7 +101,7 @@ struct kvm_rma_info *kvm_alloc_rma()
>   ri = kmalloc(sizeof(struct kvm_rma_info), GFP_KERNEL);
>   if (!ri)
>   return NULL;
> - page = cma_alloc(kvm_cma, kvm_rma_pages, get_order(kvm_rma_pages));
> + page = cma_alloc(kvm_cma, kvm_rma_pages, order_base_2(kvm_rma_pages));
>   if (!page)
>   goto err_out;
>   atomic_set(&ri->use_count, 1);
> @@ -135,12 +135,12 @@ struct page *kvm_alloc_hpt(unsigned long nr_pages)
>  {
>   unsigned long align_pages = HPT_ALIGN_PAGES;
>
> - VM_BUG_ON(get_order(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
> + VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
>
>   /* Old CPUs require HPT aligned on a multiple of its size */
>   if (!cpu_has_feature(CPU_FTR_ARCH_206))
>   align_pages = nr_pages;
> - return cma_alloc(kvm_cma, nr_pages, get_order(align_pages));
> + return cma_alloc(kvm_cma, nr_pages, order_base_2(align_pages));
>  }
>  EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
>
> -- 
> 2.0.0



[PATCH 0/2] new APIs to allocate buffer-cache for superblock in non-movable area

2014-08-13 Thread Gioh Kim
Hello,

This patch set tries to solve a problem: the long-lived page cache pages
holding the ext4 superblock and the journal superblock get in the way of
page migration.

I've been testing the CMA feature on my ARM-based platform
and found that two page cache pages cannot be migrated.
They are the pages caching the superblock of the ext4 filesystem and its
journaling data.

Currently ext4 reads the superblock with sb_bread(), which allocates the page
from the movable area. The problem is that ext4 holds the page until
the filesystem is unmounted. If the root filesystem is ext4, the page can
never be migrated. The journaling data for the superblock cannot be migrated
either.

I introduce a new API for allocating page cache from the non-movable area.
It is useful for ext4/ext3 and other filesystems that want to hold page cache
for a long time.

I have 3 patches:

1. Patch 1/3: introduce a new API that creates page cache from the non-movable area
2. Patch 2/3: have ext4 use the new API to read the superblock
3. Patch 3/3: have jbd/jbd2 use the new API for journaling of the superblock

This patchset is based on linux-next-20140814.

Thanks a lot.

Gioh Kim (3):
 fs/buffer.c: allocate buffer cache from non-movable area
 ext4: allocate buffer-cache for superblock in non-movable area
 jbd-jbd2-allocate-buffer-cache-for-superblock-inode-.patch

 fs/buffer.c |   63 ++
 fs/ext4/super.c |6 +--
 fs/jbd/journal.c|2 -
 fs/jbd2/journal.c   |2 -
 include/linux/buffer_head.h |   10 +
 5 files changed, 71 insertions(+), 12 deletions(-)



Re: [PATCH/RFC v4 00/21] LED / flash API integration

2014-08-13 Thread Sakari Ailus
Hi Jacek,

On Thu, Aug 07, 2014 at 10:21:14AM +0200, Jacek Anaszewski wrote:
> On 08/06/2014 08:53 AM, Sakari Ailus wrote:
> >Hi Jacek,
> >
> >On Fri, Jul 11, 2014 at 04:04:03PM +0200, Jacek Anaszewski wrote:
> >...
> >>1) Who should register V4L2 Flash sub-device?
> >>
> >>LED Flash Class devices, after introduction of the Flash Manager,
> >>are not tightly coupled with any media controller. They are maintained
> >>by the Flash Manager and made available for dynamic assignment to
> >>any media system they are connected to through multiplexing devices.
> >>
> >>In the proposed rough solution, when support for V4L2 Flash sub-devices
> >>is enabled, there is a v4l2_device created for them to register in.
> >>This however implies that V4L2 Flash device will not be available
> >>in any media controller, which calls its existence into question.
> >>
> >>Therefore I'd like to consult possible ways of solving this issue.
> >>The option I see is implementing a mechanism for moving V4L2 Flash
> >>sub-devices between media controllers. A V4L2 Flash sub-device
> >>would initially be assigned to one media system in the relevant
> >>device tree binding, but it could be dynamically reassigned to
> >>the other one. However I'm not sure if media controller design
> >>is prepared for dynamic modifications of its graph and how many
> >>modifications in the existing drivers this solution would require.
> >
> >Do you have a use case where you would need to strobe a flash from multiple
> >media devices at different times, or is this entirely theoretical? Typically
> >flash controllers are connected to a single source of hardware strobe (if
> >there's one) since the flash LEDs are in fact mounted next to a specific
> >camera sensor.
> 
> I took into account such arrangements in response to your message
> [1], where you were considering configurations like "one flash but
> two cameras" and "one camera and two flashes". You also called for
> proposing a generic solution.
> 
> One flash and two (or more) cameras case is easily conceivable -
> You even mentioned stereo cameras. One camera and many flashes
> arrangement might be useful in case of some professional devices which
> might be designed so that they would be able to apply different scene
> lighting. I haven't heard about such devices, but as you said
> such a configuration isn't unthinkable.
> 
> >If this is a real issue the way to solve it would be to have a single media
> >device instead of many.
> 
> I was considering adding media device, that would be a representation
> of a flash manager, gathering all the registered flashes. Nonetheless,
> finally I came to conclusion that a v4l2-device alone should suffice,
> just to provide a Flash Manager representation allowing for
> v4l2-flash sub-devices to register in.
> All the features provided by the media device are useless in case
> of a set of V4L2 Flash sub-devices. They couldn't have any linkage
> in such a device. The only benefit from having media device gathering
> V4L2 Flash devices would be possibility of listing them.

Not quite so. The flash is associated to the sensor (and lens) using the
group ID in the Media controller. The user space doesn't need to "know" this
association.

More complex use cases such as the above may need extensions to the Media
controller API.

> >>2) Consequences of locking the Flash Manager during flash strobe
> >>
> >>In case a LED Flash Class device depends on muxes involved in
> >>routing the other LED Flash Class device's strobe signals,
> >>the Flash Manager must be locked for the time of strobing
> >>to prevent reconfiguration of the strobe signal routing
> >>by the other device.
> >
> >I wouldn't be concerned of this in particular. It's more important we do
> >actually have V4L2 flash API supported by LED flash drivers and that they do
> >implement the API correctly.
> >
> >It's at least debatable whether you should try to prevent user space from
> >doing silly things or not. With complex devices it may relatively easily
> >lead to wreaking havoc with actual use cases which we certainly do not want.
> >
> >In this case, if you just prevent changing the routing (do you have a use
> >case for it?) while strobing, someone else could still change the routing
> >just before you strobe.
> 
> Originally I started implementing this so that strobe signal routing
> was altered upon setting strobe source. With such an implementation
> the use case would be as follows:
> 
> 1. Process 1 sets strobe source to external
> 2. Process 2 sets strobe source to software
> 3. Process 1 strobes the flash, unaware that strobe source setting has
>been changed
> 
> To avoid such problems I changed the implementation so that the
> routing is set in the led_flash_manager_setup_strobe function called
> from led_set_flash_strobe and led_set_external_strobe functions.
> led_flash_manager_setup_strobe sets strobe signal routing
> and strobes the flash under lock and holds it for the flash timeout
> 

Re: [GIT PULL] hwmon updates for 3.17 (part 2)

2014-08-13 Thread Guenter Roeck
On Mon, Aug 11, 2014 at 10:49:57AM -0700, Guenter Roeck wrote:
> Hi Linus,
> 
> Please pull hwmon updates for Linux 3.17 from signed tag:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-for-linus
> 
Hi Linus,

Did this get lost, am I being impatient, or is there a problem with it?

Thanks,
Guenter

> Thanks,
> Guenter
> --
> 
> The following changes since commit f4d7eac4007793ca11fd1ab68d91ce7aa762:
> 
>   Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media (2014-08-05 16:36:30 -0700)
> 
> are available in the git repository at:
> 
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git tags/hwmon-for-linus
> 
> for you to fetch changes up to 6ddd855c13bcd324a2d9756134e6b57c659cfde6:
> 
>   hwmon: (tmp103) Remove duplicate test for I2C_FUNC_SMBUS_BYTE_DATA functionality (2014-08-10 07:51:42 -0700)
> 
> 
> Several bug fixes in various drivers, plus a minor cleanup
> in the tmp103 driver.
> 
> 
> Axel Lin (14):
>   hwmon: (lm92) Prevent overflow problem when writing large limits
>   hwmon: (ads1015) Fix out-of-bounds array access
>   hwmon: (dme1737) Prevent overflow problem when writing large limits
>   hwmon: (hih6130) Fix missing hih6130->write_length setting
>   hwmon: (adm1025) Fix vrm write operation
>   hwmon: (adm1026) Fix vrm write operation
>   hwmon: (asb100) Fix vrm write operation
>   hwmon: (lm87) Fix vrm write operation
>   hwmon: (pc87360) Fix vrm write operation
>   hwmon: (vt1211) Fix vrm write operation
>   hwmon: (w83627hf) Fix vrm write operation
>   hwmon: (w83791d) Fix vrm write operation
>   hwmon: (w83793) Fix vrm write operation
>   hwmon: (tmp103) Remove duplicate test for I2C_FUNC_SMBUS_BYTE_DATA functionality
> 
> Guenter Roeck (1):
>   hwmon: (emc6w201) Fix temperature limit range
> 
>  drivers/hwmon/adm1025.c  |  3 +++
>  drivers/hwmon/adm1026.c  |  3 +++
>  drivers/hwmon/ads1015.c  |  2 ++
>  drivers/hwmon/asb100.c   |  4 
>  drivers/hwmon/dme1737.c  | 33 ++---
>  drivers/hwmon/emc6w201.c |  4 ++--
>  drivers/hwmon/hih6130.c  |  3 +++
>  drivers/hwmon/lm87.c |  4 
>  drivers/hwmon/lm92.c | 13 ++---
>  drivers/hwmon/pc87360.c  |  3 +++
>  drivers/hwmon/tmp103.c   |  7 ---
>  drivers/hwmon/vt1211.c   |  3 +++
>  drivers/hwmon/w83627hf.c |  3 +++
>  drivers/hwmon/w83791d.c  |  3 +++
>  drivers/hwmon/w83793.c   |  3 +++
>  15 files changed, 60 insertions(+), 31 deletions(-)






[PATCH v2] PC, KVM, CMA: Fix regression caused by wrong get_order() use

2014-08-13 Thread Alexey Kardashevskiy
fc95ca7284bc54953165cba76c3228bd2cdb9591 claims that there is no
functional change but this is not true as it calls get_order() (which
takes bytes) where it should have called ilog2() and the kernel stops
on VM_BUG_ON().

This replaces get_order() with order_base_2() (round-up version of ilog2).

Suggested-by: Paul Mackerras 
Cc: Alexander Graf 
Cc: Aneesh Kumar K.V 
Cc: Joonsoo Kim 
Cc: Benjamin Herrenschmidt 
Signed-off-by: Alexey Kardashevskiy 
---

Changes:
v2:
* s/ilog2/order_base_2/
* removed cc:  as I got wrong impression that v3.16 is
broken

---
 arch/powerpc/kvm/book3s_hv_builtin.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 329d7fd..b9615ba 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -101,7 +101,7 @@ struct kvm_rma_info *kvm_alloc_rma()
ri = kmalloc(sizeof(struct kvm_rma_info), GFP_KERNEL);
if (!ri)
return NULL;
-   page = cma_alloc(kvm_cma, kvm_rma_pages, get_order(kvm_rma_pages));
+   page = cma_alloc(kvm_cma, kvm_rma_pages, order_base_2(kvm_rma_pages));
if (!page)
goto err_out;
atomic_set(&ri->use_count, 1);
@@ -135,12 +135,12 @@ struct page *kvm_alloc_hpt(unsigned long nr_pages)
 {
unsigned long align_pages = HPT_ALIGN_PAGES;
 
-   VM_BUG_ON(get_order(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
+   VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
 
/* Old CPUs require HPT aligned on a multiple of its size */
if (!cpu_has_feature(CPU_FTR_ARCH_206))
align_pages = nr_pages;
-   return cma_alloc(kvm_cma, nr_pages, get_order(align_pages));
+   return cma_alloc(kvm_cma, nr_pages, order_base_2(align_pages));
 }
 EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
 
-- 
2.0.0



Re: [PATCH 4/5] perf, x86: Add INST_RETIRED.ALL workarounds

2014-08-13 Thread Stephane Eranian
On Thu, Aug 14, 2014 at 3:17 AM, Andi Kleen  wrote:
> From: Andi Kleen 
>
> On Broadwell INST_RETIRED.ALL cannot be used with any period
> that doesn't have the lowest 6 bits cleared. And the period
> should not be smaller than 128.
>
If you have frequency mode enabled, then I suspect this works okay.
It may be that the kernel will keep trying to set a period lower than 128
but you will correct it.

I am more worried about the case where the user sets a fixed period
with some of the bottom 6 bits set.  The app thinks it is sampling at
X occurrences per sample, when it is in fact at X - 63 (worst case).
I think this would be okay, if there was a trace of this in the sampling
buffer, i.e., PERF_SAMPLE_PERIOD. But I recall that perf record
does not request this flag when using a fixed period. So no way
of knowing what the actual period was, at least when using the
perf tool. I understand also that for this event, 64 occurrences
may not matter as much, maybe except if you use some filters
such as cmask.


> Add a new callback to enforce this, and set it for Broadwell.
>
> This is erratum BDM57 and BDM11.
>
> v2: Use correct event name in description. Use EVENT() macro.
> Signed-off-by: Andi Kleen 
> ---
>  arch/x86/kernel/cpu/perf_event.c   |  3 +++
>  arch/x86/kernel/cpu/perf_event.h   |  1 +
>  arch/x86/kernel/cpu/perf_event_intel.c | 19 +++
>  3 files changed, 23 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 0adc5e3..a0adf58 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -980,6 +980,9 @@ int x86_perf_event_set_period(struct perf_event *event)
> if (left > x86_pmu.max_period)
> left = x86_pmu.max_period;
>
> +   if (x86_pmu.limit_period)
> +   left = x86_pmu.limit_period(event, left);
> +
> per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
>
> /*
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index de81627..a46e391 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -456,6 +456,7 @@ struct x86_pmu {
> struct x86_pmu_quirk *quirks;
> int perfctr_second_write;
> boollate_ack;
> +   unsigned(*limit_period)(struct perf_event *event, unsigned l);
>
> /*
>  * sysfs attrs
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 4bfb0ec..66260e1 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -2034,6 +2034,24 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
> return c;
>  }
>
> +/*
> + * Broadwell:
> + * The INST_RETIRED.ALL period always needs to have lowest
> + * 6bits cleared (BDM57). It shall not use a period smaller
> + * than 100 (BDM11). We combine the two to enforce
> + * a min-period of 128.
> + */
> +static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
> +{
> +   if ((event->hw.config & 0x) ==
> +   X86_CONFIG(.event=0xc0, .umask=0x01)) {
> +   if (left < 128)
> +   left = 128;
> +   left &= ~0x3fu;
> +   }
> +   return left;
> +}
> +
>  PMU_FORMAT_ATTR(event, "config:0-7");
>  PMU_FORMAT_ATTR(umask, "config:8-15"   );
>  PMU_FORMAT_ATTR(edge,  "config:18" );
> @@ -2711,6 +2729,7 @@ __init int intel_pmu_init(void)
> x86_pmu.hw_config = hsw_hw_config;
> x86_pmu.get_event_constraints = hsw_get_event_constraints;
> x86_pmu.cpu_events = hsw_events_attrs;
> +   x86_pmu.limit_period = bdw_limit_period;
> pr_cont("Broadwell events, ");
> break;
>
> --
> 1.9.3
>


[LKP] [writeback] 952648324b9: -24.9% blogbench.write_score

2014-08-13 Thread kernel test robot
FYI, we applied your patches and noticed the below changes on

commit 952648324b969f3fc22d3a2a78f4715c0bf43d7f ("writeback: Per-sb dirty tracking")

test case: lkp-sb02/blogbench/1HDD-ext4

df3be46bdbab23e           952648324b969f3fc22d3a2a7
---------------           -------------------------
       834 ± 3%   -24.9%         627 ± 3%  TOTAL blogbench.write_score
      6043 ± 3%   -74.0%        1571 ± 2%  TOTAL slabinfo.jbd2_journal_head.active_objs
       174 ± 3%   -66.9%          57 ± 3%  TOTAL slabinfo.jbd2_journal_head.num_slabs
       174 ± 3%   -66.9%          57 ± 3%  TOTAL slabinfo.jbd2_journal_head.active_slabs
      6293 ± 3%   -66.8%        2091 ± 2%  TOTAL slabinfo.jbd2_journal_head.num_objs
    171025 ±32%   +78.6%      305412 ±28%  TOTAL cpuidle.C1-SNB.time
   1013700 ±36%   +80.6%     1830682 ±17%  TOTAL cpuidle.C1E-SNB.time
      0.11 ±27%   +71.7%        0.18 ±16%  TOTAL turbostat.%c1
   1129386 ± 9%   +35.6%     1531413 ± 4%  TOTAL meminfo.MemFree
   1128767 ± 9%   +35.5%     1529806 ± 4%  TOTAL vmstat.memory.free
    282835 ± 9%   +35.3%      382773 ± 4%  TOTAL proc-vmstat.nr_free_pages
       740 ±12%   +46.3%        1083 ±13%  TOTAL cpuidle.C1E-SNB.usage
    891706 ± 8%   -24.9%      669997 ± 1%  TOTAL proc-vmstat.pgactivate
      1528 ± 6%   -24.7%        1151 ± 1%  TOTAL proc-vmstat.nr_writeback
       327 ±16%   +33.5%         437 ±10%  TOTAL cpuidle.C1-SNB.usage
      6074 ± 8%   -24.6%        4581 ± 2%  TOTAL meminfo.Writeback
     13104 ± 4%   -17.6%       10794 ± 3%  TOTAL slabinfo.buffer_head.num_slabs
     13104 ± 4%   -17.6%       10794 ± 3%  TOTAL slabinfo.buffer_head.active_slabs
    511069 ± 4%   -17.6%      421003 ± 3%  TOTAL slabinfo.buffer_head.num_objs
    510881 ± 4%   -17.6%      421000 ± 3%  TOTAL slabinfo.buffer_head.active_objs
   3500553 ± 2%   -20.3%     2788421 ± 3%  TOTAL proc-vmstat.pgpgout
   1722412 ± 3%   -16.4%     1439377 ± 3%  TOTAL meminfo.Active(file)
   1752356 ± 3%   -16.2%     1468171 ± 3%  TOTAL meminfo.Active
    430248 ± 3%   -16.3%      359905 ± 3%  TOTAL proc-vmstat.nr_active_file
    767908 ± 2%   -19.2%      620130 ± 1%  TOTAL proc-vmstat.nr_written
     97504 ± 4%   -15.5%       82412 ± 3%  TOTAL slabinfo.ext4_inode_cache.active_objs
     97544 ± 4%   -15.5%       82457 ± 3%  TOTAL slabinfo.ext4_inode_cache.num_objs
      6096 ± 4%   -15.5%        5153 ± 3%  TOTAL slabinfo.ext4_inode_cache.num_slabs
      6096 ± 4%   -15.5%        5153 ± 3%  TOTAL slabinfo.ext4_inode_cache.active_slabs
       963 ± 4%   -15.2%         816 ± 3%  TOTAL slabinfo.ext4_extent_status.num_slabs
       963 ± 4%   -15.2%         816 ± 3%  TOTAL slabinfo.ext4_extent_status.active_slabs
     98282 ± 4%   -15.2%       83303 ± 3%  TOTAL slabinfo.ext4_extent_status.num_objs
    100050 ± 4%   -14.8%       85198 ± 3%  TOTAL slabinfo.shared_policy_node.active_objs
    100058 ± 4%   -14.8%       85218 ± 3%  TOTAL slabinfo.shared_policy_node.num_objs
   2435104 ± 3%   -14.7%     2077577 ± 2%  TOTAL meminfo.Cached
      1176 ± 4%   -14.8%        1002 ± 3%  TOTAL slabinfo.shared_policy_node.active_slabs
      1176 ± 4%   -14.8%        1002 ± 3%  TOTAL slabinfo.shared_policy_node.num_slabs
   2435823 ± 3%   -14.6%     2079169 ± 2%  TOTAL vmstat.memory.cache
    616019 ± 3%   -14.6%      526020 ± 2%  TOTAL proc-vmstat.nr_file_pages
      3342 ± 4%   -14.7%        2852 ± 3%  TOTAL slabinfo.radix_tree_node.active_slabs
      3342 ± 4%   -14.7%        2852 ± 3%  TOTAL slabinfo.radix_tree_node.num_slabs
     93612 ± 4%   -14.7%       79877 ± 3%  TOTAL slabinfo.radix_tree_node.num_objs
     93574 ± 4%   -14.7%       79836 ± 3%  TOTAL slabinfo.radix_tree_node.active_objs
     30729 ± 4%   -14.9%       26163 ± 3%  TOTAL meminfo.Buffers
    253276 ± 3%   -14.5%      216563 ± 3%  TOTAL meminfo.SReclaimable
     30737 ± 4%   -14.8%       26187 ± 3%  TOTAL vmstat.memory.buff
     97430 ± 3%   -14.5%       83280 ± 3%  TOTAL slabinfo.ext4_extent_status.active_objs
     63268 ± 3%   -14.4%       54149 ± 3%  TOTAL proc-vmstat.nr_slab_reclaimable
    287129 ± 3%   -13.6%      248074 ± 2%  TOTAL meminfo.Slab
    141193 ± 3%   -12.2%      124025 ± 2%  TOTAL slabinfo.dentry.num_objs
      6723 ± 3%   -12.2%        5905 ± 2%  TOTAL slabinfo.dentry.active_slabs
      6723 ± 3%   -12.2%        5905 ± 2%  TOTAL slabinfo.dentry.num_slabs
    141096 ± 3%   -12.2%      123952 ± 2%  TOTAL slabinfo.dentry.active_objs
    129818 ± 3%   -11.6%      114766 ± 2%  TOTAL slabinfo.Acpi-State.num_objs
      2544 ± 3%   -11.6%        2249 ± 2%  TOTAL slabinfo.Acpi-State.active_slabs
      2544 ± 3%   -11.6%        2249 ± 2%  TOTAL slabinfo.Acpi-State.num_slabs
    129753 ± 3%   -11.6%      114696 ± 2%  TOTAL slabinfo.Acpi-State.active_objs
    969880 ± 1%   -14.1%      832982 ± 1%  TOTAL proc-vmstat.nr_dirtied
     37692 ± 1%   +13.3%       42704 ± 1%  TOTAL softirqs.BLOCK
    726033 ± 3%    -9.0%      660369 ± 3%  TOTAL meminfo.Inactive(file)
   

Re: [PATCH/RFC v4 06/21] leds: add API for setting torch brightness

2014-08-13 Thread Sakari Ailus
Bryan and Richard,

Your opinion would be much appreciated on a question Jacek and I were
pondering. Please see below.

On Thu, Aug 07, 2014 at 03:12:09PM +0200, Jacek Anaszewski wrote:
> Hi Sakari,
> 
> On 08/04/2014 02:50 PM, Sakari Ailus wrote:
> >Hi Jacek,
> >
> >Thank you for your continued efforts on this!
> >
> >On Mon, Aug 04, 2014 at 02:35:26PM +0200, Jacek Anaszewski wrote:
> >>On 07/16/2014 11:54 PM, Sakari Ailus wrote:
> >>>Hi Jacek,
> >>>
> >>>Jacek Anaszewski wrote:
> >>>...
> diff --git a/include/linux/leds.h b/include/linux/leds.h
> index 1a130cc..9bea9e6 100644
> --- a/include/linux/leds.h
> +++ b/include/linux/leds.h
> @@ -44,11 +44,21 @@ struct led_classdev {
>   #define LED_BLINK_ONESHOT_STOP(1 << 18)
>   #define LED_BLINK_INVERT(1 << 19)
>   #define LED_SYSFS_LOCK(1 << 20)
> +#define LED_DEV_CAP_TORCH(1 << 21)
> 
>   /* Set LED brightness level */
>   /* Must not sleep, use a workqueue if needed */
>   void(*brightness_set)(struct led_classdev *led_cdev,
> enum led_brightness brightness);
> +/*
> + * Set LED brightness immediately - it is required for flash led
> + * devices as they require setting torch brightness to have immediate
> + * effect. brightness_set op cannot be used for this purpose because
> + * the led drivers schedule a work queue task in it to allow for
> + * being called from led-triggers, i.e. from the timer irq context.
> + */
> >>>
> >>>Do we need to classify actual devices based on this? I think it's rather
> >>>a different API behaviour between the LED and the V4L2 APIs.
> >>>
> >>>On devices that are slow to control, the behaviour should be asynchronous
> >>>over the LED API and synchronous when accessed through the V4L2 API. How
> >>>about implementing the work queue, as I have suggested, in the
> >>>framework, so that individual drivers don't need to care about this and
> >>>just implement the synchronous variant of this op? A flag could be added
> >>>to distinguish devices that are fast so that the work queue isn't needed.
> >>>
> >>>It'd be nice to avoid individual drivers having to implement multiple
> >>>ops to do the same thing, just for differing user space interfaces.
> >>>
> >>
> >>It is not only the matter of a device controller speed. If a flash
> >>device is to be made accessible from the LED subsystem, then it
> >>should be also compatible with led-triggers. Some of led-triggers
> >>call brightness_set op from the timer irq context and thus no
> >>locking in the callback can occur. This requirement cannot be
> >>met e.g. if an i2c bus is to be used. This is probably the primary
> >>reason for scheduling work queue tasks in brightness_set op.
> >>
> >>Having the above in mind, setting a brightness in a work queue
> >>task must be possible for all LED Class Flash drivers, regardless
> >>whether related devices have fast or slow controller.
> >>
> >>Let's recap the cost of possible solutions then:
> >>
> >>1) Moving the work queues to the LED framework
> >>
> >>   - it would probably require extending led_set_brightness and
> >> __led_set_brightness functions by a parameter indicating whether it
> >> should call brightness_set op in the work queue task or directly;
> >>   - all existing triggers would have to be updated accordingly;
> >>   - work queues would have to be removed from all the LED drivers;
> >>
> >>2) adding led_set_torch_brightness API
> >>
> >>   - no modifications in existing drivers and triggers would be required
> >>   - instead, only the modifications from the discussed patch would
> >> be required
> >>
> >>Solution 1 looks cleaner but requires much more modifications.
> >
> >How about a combination of the two, i.e. option 1 with the old op remaining
> >there for compatibility with the old drivers (with a comment telling it's
> >deprecated)?
> >
> >This way new drivers will benefit from having to implement this just once,
> >and modifications to the existing drivers could be left for later.
> 
> It's OK for me, but the opinion from the LED side guys is needed here
> as well.

Ping.

> >The downside is that any old drivers wouldn't get V4L2 flash API but that's
> >entirely acceptable in my opinion since these would hardly be needed in use
> >cases that would benefit from V4L2 flash API.
> 
> In the version 4 of the patch set I changed the implementation, so that
> a flash led driver must call led_classdev_flash_register to get
> registered as a LED Flash Class device and v4l2_flash_init to get
> V4L2 Flash API. In effect old drivers will have no chance to get V4L2
> Flash API either way.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk

Re: [PATCH/RFC v4 15/21] media: Add registration helpers for V4L2 flash

2014-08-13 Thread Sakari Ailus
Hi Jacek,

On Mon, Aug 11, 2014 at 03:27:22PM +0200, Jacek Anaszewski wrote:

...

> diff --git a/include/media/v4l2-flash.h b/include/media/v4l2-flash.h
> new file mode 100644
> index 000..effa46b
> --- /dev/null
> +++ b/include/media/v4l2-flash.h
> @@ -0,0 +1,137 @@
> +/*
> + * V4L2 Flash LED sub-device registration helpers.
> + *
> + *   Copyright (C) 2014 Samsung Electronics Co., Ltd
> + *   Author: Jacek Anaszewski 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation."
> + */
> +
> +#ifndef _V4L2_FLASH_H
> +#define _V4L2_FLASH_H
> +
> +#include 
> +#include 
> +#include le
> +#include 
> +#include 
> +
> +struct led_classdev_flash;
> +struct led_classdev;
> +enum led_brightness;
> +
> +struct v4l2_flash_ops {
> + int (*torch_brightness_set)(struct led_classdev *led_cdev,
> + enum led_brightness brightness);
> + int (*torch_brightness_update)(struct led_classdev *led_cdev);
> + int (*flash_brightness_set)(struct led_classdev_flash *flash,
> + u32 brightness);
> + int (*flash_brightness_update)(struct led_classdev_flash *flash);
> + int (*strobe_set)(struct led_classdev_flash *flash, bool state);
> + int (*strobe_get)(struct led_classdev_flash *flash, bool *state);
> + int (*timeout_set)(struct led_classdev_flash *flash, u32 timeout);
> + int (*indicator_brightness_set)(struct led_classdev_flash *flash,
> + u32 brightness);
> + int (*indicator_brightness_update)(struct led_classdev_flash *flash);
> + int (*external_strobe_set)(struct led_classdev_flash *flash,
> + bool enable);
> + int (*fault_get)(struct led_classdev_flash *flash, u32 *fault);
> + void (*sysfs_lock)(struct led_classdev *led_cdev);
> + void (*sysfs_unlock)(struct led_classdev *led_cdev);
> >>>
> >>>These functions are not driver specific and there's going to be just one
> >>>implementation (I suppose). Could you refresh my memory regarding why
> >>>the LED framework functions aren't called directly?
> >>
> >>These ops are required to make possible building led-class-flash as
> >>a kernel module.
> >
> >Assuming you'd use the actual implementation directly, what would be the
> >dependencies? I don't think the LED flash framework has any callbacks
> >towards the V4L2 (LED) flash framework, does it? Please correct my
> >understanding if I'm missing something. In Makefile format, assume all
> >targets are .PHONY:
> >
> >led-flash-api: led-api
> >
> >v4l2-flash: led-flash-api
> >
> >driver: led-flash-api v4l2-flash
> 
> LED Class Flash driver gains V4L2 Flash API when
> CONFIG_V4L2_FLASH_LED_CLASS is defined. This is accomplished in
> the probe function by either calling v4l2_flash_init function
> or the macro of this name, when the CONFIG_V4L2_FLASH_LED_CLASS
> macro isn't defined.
> 
> If the v4l2-flash.c was to call the LED API directly, then the
> led-class-flash module symbols would have to be available at
> v4l2-flash.o linking time.

Is this an issue? EXPORT_SYMBOL_GPL() for the relevant symbols should be
enough.

> This requirement cannot be met if the led-class-flash is built
> as a module.
> 
> Use of function pointers in the v4l2-flash.c allows to compile it
> into the kernel and enables the possibility of adding the V4L2 Flash
> support conditionally, during driver probing.

I'd simply decide this during kernel compilation time. If you want
something, just enable it. v4l2_flash_init() is called directly by the
driver in any case, so unless that is also called through a wrapper the
driver is still directly dependent on it.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
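
The two designs argued over in the message above can be contrasted in a small user-space sketch. Every name in it (flash_led, flash_ops, flash_strobe_set and the two consumer functions) is a hypothetical stand-in, not the kernel's LED or V4L2 interface. Variant A reaches the provider only through a table of function pointers, so the consumer holds no link-time reference to the provider's symbols. Variant B names the provider symbol directly, which in a kernel build would presumably require the symbol to be exported (for instance with EXPORT_SYMBOL_GPL(), as the reply suggests) and the dependency to be settled at configuration time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a flash LED object; not the kernel struct. */
struct flash_led {
	bool strobing;
};

/* Hypothetical provider function, standing in for an LED flash API call. */
static int flash_strobe_set(struct flash_led *led, bool state)
{
	led->strobing = state;
	return 0;
}

/* Variant A: the consumer only knows a table of function pointers. */
struct flash_ops {
	int (*strobe_set)(struct flash_led *led, bool state);
};

static int consumer_strobe_via_ops(const struct flash_ops *ops,
				   struct flash_led *led, bool state)
{
	if (!ops || !ops->strobe_set)
		return -1;	/* no provider registered */
	return ops->strobe_set(led, state);
}

/* Variant B: the consumer names the provider symbol directly. */
static int consumer_strobe_direct(struct flash_led *led, bool state)
{
	return flash_strobe_set(led, state);
}
```

Variant A costs one extra indirection and a NULL check per call; variant B is simpler but ties the two pieces together when they are linked.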


Re: [PATCH 2/3] FS-Cache: Reduce cookie ref count if submit fails.

2014-08-13 Thread NeilBrown
On Wed, 13 Aug 2014 12:58:21 -0400 Milosz Tanski  wrote:

> I've been seeing issues with disposing cookies under vma pressure. The symptom
> is that the refcount gets out of sync. In this case we fail to decrement the
> refcount if submit fails. I found this while auditing the error in and around
> cookie operations.
> 
> Signed-off-by: Milosz Tanski 
> ---
>  fs/fscache/object.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/fscache/object.c b/fs/fscache/object.c
> index d3b4539..e1eb1f5 100644
> --- a/fs/fscache/object.c
> +++ b/fs/fscache/object.c
> @@ -982,6 +982,8 @@ nomem:
>  submit_op_failed:
>   clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
>   spin_unlock(&cookie->lock);
> + if (__fscache_unuse_cookie(cookie))
> + __fscache_wake_unused_cookie(cookie);
>   kfree(op);
>   _leave(" [EIO]");
>   return transit_to(KILL_OBJECT);

Should this simply be
  + fscache_unuse_cookie(cookie);

it does both the "unuse" and the "wake".

Otherwise they all look good to me.

NeilBrown
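
The point of the suggestion above, one helper that performs both the "unuse" and the "wake", can be modelled in user space. This is only a sketch under assumptions: the struct, its fields and all three function names are hypothetical and do not reproduce the FS-Cache implementation. It shows why a combined helper is attractive: the caller cannot drop the reference and then forget the wake-up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cookie model; field names are illustrative only. */
struct cookie {
	atomic_int n_active;	/* number of active users */
	bool woken;		/* set when a waiter would be woken */
};

/* Drop one use; returns true when the count has just reached zero. */
static bool cookie_unuse_raw(struct cookie *c)
{
	/* atomic_fetch_sub() returns the value held before the subtraction. */
	return atomic_fetch_sub(&c->n_active, 1) == 1;
}

/* Stand-in for waking whoever waits for the cookie to become unused. */
static void cookie_wake_unused(struct cookie *c)
{
	c->woken = true;
}

/* Combined helper: callers cannot forget the wake-up half. */
static void cookie_unuse(struct cookie *c)
{
	if (cookie_unuse_raw(c))
		cookie_wake_unused(c);
}
```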




Re: [PATCH] cpufreq, store_scaling_governor requires policy->rwsem to be held for duration of changing governors [v2]

2014-08-13 Thread Viresh Kumar
On 13 August 2014 15:28, Prarit Bhargava  wrote:
> Anywhere from 2-4 sockets, 8 - 240 cpus (depending on # of sockets), x86 arch.

That's what. We know that it does happen on multi cluster systems
and I was reproducing it on a single cluster one. i.e. all CPUs share
clock line.


Re: [PATCH 1/2] async: async device driver probing

2014-08-13 Thread Greg KH
On Wed, Aug 13, 2014 at 03:10:44PM -0700, Dmitry Torokhov wrote:
> On Thu, Aug 14, 2014 at 06:02:23AM +0800, Greg KH wrote:
> > On Wed, Aug 13, 2014 at 10:20:33AM -0700, Dmitry Torokhov wrote:
> > > Hi Greg,
> > > 
> > > On Sat, Feb 08, 2014 at 10:27:29AM -0800, Greg KH wrote:
> > 
> > February?  That's an old thread to dig up...
> 
> Well, yes ;)
> 
> > 
> > 
> > > > On Sat, Feb 08, 2014 at 07:05:38PM +0800, fal...@meizu.com wrote:
> > > > > From: Wu Zhangjin 
> > > > > 
> > > > > [*Note*: NOT applicable, only for comments.]
> > > > > 
> > > > > > To async the slow driver probing function of some devices, the device
> > > > > > probing support is modified to support async scheduling.
> > > > > > 
> > > > > > In order to async your driver probing function, please set the
> > > > > > async_probe flag to 1, and to make sure one asynced probing is executed
> > > > > > before a specified point, please call async_synchronize_full() at that
> > > > > > point.
> > > > > 
> > > > > Usage:
> > > > > 
> > > > >   static struct i2c_driver test_driver = {
> > > > >   .driver = {
> > > > >   .name   = TEST_DEV_NAME,
> > > > >   .owner  = THIS_MODULE,
> > > > >   +   .async_probe = 1,
> > > > >   },
> > > > 
> > > > > Why is this needed? We have deferred probing and the container stuff, so
> > > > > what problem is this solving?
> > > 
> > > Deferred probing only helps if resources are not ready yet, but
> > > > sometimes we have a slow(ish) device whose initialization stalls probing
> > > the rest of the system. For example a touchpad can take up to a second
> > > to calibrate itself after reset. One could try scheduling reset
> > > asynchronously, or try to offload it to open(), but that is not always
> > > best:
> > > 
> > > 1. Manual async: what to do when reset fails? Ideally we do not want to
> > > leave driver half-way bound with device not operable, but much rather
> > > signal the rest of the system that binding of the device failed and
> > > release all resources.
> > > 
> > > 2. Offload to open: the same issue as with manually doing async reset,
> > > plus sometimes we do not know all parameters that we should create input
> > > device with until after we reset physical device and queried it for
> > > capabilities.
> > > 
> > > Marking a driver to tell device core to execute probe asynchronously [at
> > > boot time] seems like a very appealing feature from driver author POV.
> > > 
> > > What is the container stuff you mention?
> > 
> > drivers/base/container.c
> 
> I am not sure how that would help in the scenario described above, as it
> seems ACPI specific...

It is?  I thought the drm people were using it on ARM.  I really don't
know, sorry, don't have the chance to check it right now...

greg k-h
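
The ordering and failure handling discussed in this thread can be illustrated with a deliberately simplified, single-threaded model. It is a sketch under assumptions, not the driver core: probes are queued instead of run inline, a later synchronization point runs them all, and a device whose probe fails is left cleanly unbound. Real asynchronous probing would run the queued work concurrently; this model only captures the "queue now, guarantee completion at a chosen point" contract. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical device model: probe() may be slow and may fail. */
struct device {
	int (*probe)(struct device *dev);
	bool bound;		/* true only after a successful probe */
};

static struct device *pending[MAX_PENDING];
static size_t n_pending;

/* Queue the probe instead of running it inline; returns 0 on success. */
static int schedule_probe(struct device *dev)
{
	if (n_pending == MAX_PENDING)
		return -1;
	pending[n_pending++] = dev;
	return 0;
}

/* Run every queued probe and return how many failed. After this returns,
 * all probes scheduled earlier are known to have finished. */
static int synchronize_probes(void)
{
	int failed = 0;

	for (size_t i = 0; i < n_pending; i++) {
		struct device *dev = pending[i];

		dev->bound = (dev->probe(dev) == 0);
		if (!dev->bound)
			failed++;	/* device stays unbound, not half-bound */
	}
	n_pending = 0;
	return failed;
}

/* Two trivial probe functions used to exercise the model. */
static int probe_ok(struct device *dev)   { (void)dev; return 0; }
static int probe_fail(struct device *dev) { (void)dev; return -1; }
```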


Re: Linux 3.16.1

2014-08-13 Thread Greg KH

diff --git a/Makefile b/Makefile
index d0901b46b4bf..87663a2d1d10 100644
--- a/Makefile
+++ b/Makefile
@@ -1,8 +1,8 @@
 VERSION = 3
 PATCHLEVEL = 16
-SUBLEVEL = 0
+SUBLEVEL = 1
 EXTRAVERSION =
-NAME = Shuffling Zombie Juror
+NAME = Museum of Fishiegoodies
 
 # *DOCUMENTATION*
 # To see a list of typical targets execute "make help"
diff --git a/arch/sparc/include/asm/tlbflush_64.h b/arch/sparc/include/asm/tlbflush_64.h
index 816d8202fa0a..dea1cfa2122b 100644
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -34,6 +34,8 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 {
 }
 
+void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+
 #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 
 void flush_tlb_pending(void);
@@ -48,11 +50,6 @@ void __flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 #ifndef CONFIG_SMP
 
-#define flush_tlb_kernel_range(start,end) \
-do {   flush_tsb_kernel_range(start,end); \
-   __flush_tlb_kernel_range(start,end); \
-} while (0)
-
 static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
 {
__flush_tlb_page(CTX_HWBITS(mm->context), vaddr);
@@ -63,11 +60,6 @@ static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vad
 void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr);
 
-#define flush_tlb_kernel_range(start, end) \
-do {   flush_tsb_kernel_range(start,end); \
-   smp_flush_tlb_kernel_range(start, end); \
-} while (0)
-
 #define global_flush_tlb_page(mm, vaddr) \
smp_flush_tlb_page(mm, vaddr)
 
diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index e01d75d40329..66dacd56bb10 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -1336,7 +1336,7 @@ int ldc_connect(struct ldc_channel *lp)
if (!(lp->flags & LDC_FLAG_ALLOCED_QUEUES) ||
!(lp->flags & LDC_FLAG_REGISTERED_QUEUES) ||
lp->hs_state != LDC_HS_OPEN)
-   err = -EINVAL;
+   err = ((lp->hs_state > LDC_HS_OPEN) ? 0 : -EINVAL);
else
err = start_handshake(lp);
 
diff --git a/arch/sparc/math-emu/math_32.c b/arch/sparc/math-emu/math_32.c
index aa4d55b0bdf0..5ce8f2f64604 100644
--- a/arch/sparc/math-emu/math_32.c
+++ b/arch/sparc/math-emu/math_32.c
@@ -499,7 +499,7 @@ static int do_one_mathemu(u32 insn, unsigned long *pfsr, unsigned long *fregs)
case 0: fsr = *pfsr;
if (IR == -1) IR = 2;
/* fcc is always fcc0 */
-   fsr &= ~0xc00; fsr |= (IR << 10); break;
+   fsr &= ~0xc00; fsr |= (IR << 10);
*pfsr = fsr;
break;
case 1: rd->s = IR; break;
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 16b58ff11e65..2cfb0f25e0ed 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -351,6 +351,10 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *
 
mm = vma->vm_mm;
 
+   /* Don't insert a non-valid PTE into the TSB, we'll deadlock.  */
+   if (!pte_accessible(mm, pte))
+   return;
+
spin_lock_irqsave(&mm->context.lock, flags);
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
@@ -2619,6 +2623,10 @@ void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 
pte = pmd_val(entry);
 
+   /* Don't insert a non-valid PMD into the TSB, we'll deadlock.  */
+   if (!(pte & _PAGE_VALID))
+   return;
+
/* We are fabricating 8MB pages using 4MB real hw pages.  */
pte |= (addr & (1UL << REAL_HPAGE_SHIFT));
 
@@ -2699,3 +2707,26 @@ void hugetlb_setup(struct pt_regs *regs)
}
 }
 #endif
+
+#ifdef CONFIG_SMP
+#define do_flush_tlb_kernel_range  smp_flush_tlb_kernel_range
+#else
+#define do_flush_tlb_kernel_range  __flush_tlb_kernel_range
+#endif
+
+void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+   if (start < HI_OBP_ADDRESS && end > LOW_OBP_ADDRESS) {
+   if (start < LOW_OBP_ADDRESS) {
+   flush_tsb_kernel_range(start, LOW_OBP_ADDRESS);
+   do_flush_tlb_kernel_range(start, LOW_OBP_ADDRESS);
+   }
+   if (end > HI_OBP_ADDRESS) {
+   flush_tsb_kernel_range(end, HI_OBP_ADDRESS);
+   do_flush_tlb_kernel_range(end, HI_OBP_ADDRESS);
+   }
+   } else {
+   flush_tsb_kernel_range(start, end);
+   do_flush_tlb_kernel_range(start, end);
+   }
+}
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 8afa579e7c40..a3dd5dc64f4c 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7830,17 +7830,18 @@ 

Linux 3.16.1

2014-08-13 Thread Greg KH
I'm announcing the release of the 3.16.1 kernel.

All users of the 3.16 kernel series must upgrade.

The updated 3.16.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.16.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|4 ++--
 arch/sparc/include/asm/tlbflush_64.h|   12 ++--
 arch/sparc/kernel/ldc.c |2 +-
 arch/sparc/math-emu/math_32.c   |2 +-
 arch/sparc/mm/init_64.c |   31 +++
 drivers/net/ethernet/broadcom/tg3.c |   22 --
 drivers/net/ethernet/brocade/bna/bnad.c |2 +-
 drivers/net/macvlan.c   |1 +
 drivers/net/phy/mdio_bus.c  |1 -
 drivers/sbus/char/bbc_envctrl.c |6 ++
 drivers/sbus/char/bbc_i2c.c |   11 ---
 drivers/tty/serial/sunsab.c |9 +
 include/net/ip_tunnels.h|1 +
 lib/iovec.c |4 
 net/batman-adv/fragmentation.c  |   10 +++---
 net/core/skbuff.c   |2 +-
 net/ipv4/ip_tunnel.c|   29 ++---
 net/ipv4/tcp_vegas.c|3 ++-
 net/ipv4/tcp_veno.c |2 +-
 net/sctp/output.c   |2 +-
 20 files changed, 109 insertions(+), 47 deletions(-)

Andrey Utkin (1):
  arch/sparc/math-emu/math_32.c: drop stray break operator

Christoph Paasch (2):
  tcp: Fix integer-overflows in TCP veno
  tcp: Fix integer-overflow in TCP vegas

Christopher Alexander Tobias Schulze (2):
  bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
  sunsab: Fix detection of BREAK on sunsab serial console

David S. Miller (2):
  sparc64: Do not insert non-valid PTEs into the TSB hash table.
  sparc64: Guard against flushing openfirmware mappings.

Dmitry Popov (1):
  ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"

Eric Dumazet (1):
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()

Fabio Estevam (1):
  Revert "net: phy: Set the driver when registering an MDIO bus device"

Greg Kroah-Hartman (1):
  Linux 3.16.1

Ivan Vecera (1):
  bna: fix performance regression

Prashant Sreedharan (1):
  tg3: Modify tg3_tso_bug() to handle multiple TX rings

Sasha Levin (1):
  iovec: make sure the caller actually wants anything in memcpy_fromiovecend

Sowmini Varadhan (1):
  sparc64: ldc_connect() should not return EINVAL when handshake is in progress.

Sven Eckelmann (1):
  batman-adv: Fix out-of-order fragmentation support

Vlad Yasevich (2):
  macvlan: Initialize vlan_features to turn on offload support.
  net: Correctly set segment mac_len in skb_segment().
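
The "sparc64: Guard against flushing openfirmware mappings." entry above corresponds to the new flush_tlb_kernel_range() in arch/sparc/mm/init_64.c, which avoids flushing addresses between LOW_OBP_ADDRESS and HI_OBP_ADDRESS. The general technique, splitting a range around a reserved window, can be sketched in user-space C. This is an illustration under assumptions, not the kernel code: the window bounds below are made-up values, struct range and split_around_reserved() are hypothetical, and every sub-range is expressed as (lower bound, upper bound).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical window bounds; the real LOW/HI_OBP_ADDRESS values differ. */
#define RESERVED_LOW  0x1000UL
#define RESERVED_HIGH 0x2000UL

struct range {
	unsigned long start, end;	/* half-open: [start, end) */
};

/* Split [start, end) into the parts lying outside the reserved window.
 * Writes up to two sub-ranges to out[] and returns how many it wrote. */
static size_t split_around_reserved(unsigned long start, unsigned long end,
				    struct range out[2])
{
	size_t n = 0;

	if (start < RESERVED_HIGH && end > RESERVED_LOW) {
		/* Overlap: keep only what sticks out on either side. */
		if (start < RESERVED_LOW)
			out[n++] = (struct range){ start, RESERVED_LOW };
		if (end > RESERVED_HIGH)
			out[n++] = (struct range){ RESERVED_HIGH, end };
	} else {
		/* No overlap: the whole range is safe to process. */
		out[n++] = (struct range){ start, end };
	}
	return n;
}
```

A caller would then invoke its flush routine once per returned sub-range; a request that lies entirely inside the window yields no sub-ranges at all.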





Re: Linux 3.15.10

2014-08-13 Thread Greg KH
diff --git a/Makefile b/Makefile
index 25b85aba1e2e..76b75f7b8485 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 15
-SUBLEVEL = 9
+SUBLEVEL = 10
 EXTRAVERSION =
 NAME = Double Funky Skunk
 
diff --git a/arch/sparc/include/asm/tlbflush_64.h b/arch/sparc/include/asm/tlbflush_64.h
index 3c3c89f52643..7f9bab26a499 100644
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -34,6 +34,8 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 {
 }
 
+void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+
 #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 
 extern void flush_tlb_pending(void);
@@ -48,11 +50,6 @@ extern void __flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 #ifndef CONFIG_SMP
 
-#define flush_tlb_kernel_range(start,end) \
-do {   flush_tsb_kernel_range(start,end); \
-   __flush_tlb_kernel_range(start,end); \
-} while (0)
-
 static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
 {
__flush_tlb_page(CTX_HWBITS(mm->context), vaddr);
@@ -63,11 +60,6 @@ static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vad
 extern void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr);
 
-#define flush_tlb_kernel_range(start, end) \
-do {   flush_tsb_kernel_range(start,end); \
-   smp_flush_tlb_kernel_range(start, end); \
-} while (0)
-
 #define global_flush_tlb_page(mm, vaddr) \
smp_flush_tlb_page(mm, vaddr)
 
diff --git a/arch/sparc/include/uapi/asm/unistd.h b/arch/sparc/include/uapi/asm/unistd.h
index b73274fb961a..42f2bca1d338 100644
--- a/arch/sparc/include/uapi/asm/unistd.h
+++ b/arch/sparc/include/uapi/asm/unistd.h
@@ -410,8 +410,9 @@
 #define __NR_finit_module  342
 #define __NR_sched_setattr 343
 #define __NR_sched_getattr 344
+#define __NR_renameat2 345
 
-#define NR_syscalls345
+#define NR_syscalls346
 
 /* Bitmask values returned from kern_features system call.  */
 #define KERN_FEATURE_MIXED_MODE_STACK  0x0001
diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index e01d75d40329..66dacd56bb10 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -1336,7 +1336,7 @@ int ldc_connect(struct ldc_channel *lp)
if (!(lp->flags & LDC_FLAG_ALLOCED_QUEUES) ||
!(lp->flags & LDC_FLAG_REGISTERED_QUEUES) ||
lp->hs_state != LDC_HS_OPEN)
-   err = -EINVAL;
+   err = ((lp->hs_state > LDC_HS_OPEN) ? 0 : -EINVAL);
else
err = start_handshake(lp);
 
diff --git a/arch/sparc/kernel/sys32.S b/arch/sparc/kernel/sys32.S
index d066eb18650c..f834224208ed 100644
--- a/arch/sparc/kernel/sys32.S
+++ b/arch/sparc/kernel/sys32.S
@@ -48,6 +48,7 @@ SIGN1(sys32_futex, compat_sys_futex, %o1)
 SIGN1(sys32_recvfrom, compat_sys_recvfrom, %o0)
 SIGN1(sys32_recvmsg, compat_sys_recvmsg, %o0)
 SIGN1(sys32_sendmsg, compat_sys_sendmsg, %o0)
+SIGN2(sys32_renameat2, sys_renameat2, %o0, %o2)
 
.globl  sys32_mmap2
 sys32_mmap2:
diff --git a/arch/sparc/kernel/systbls_32.S b/arch/sparc/kernel/systbls_32.S
index 151ace8766cc..85fe9b1087cd 100644
--- a/arch/sparc/kernel/systbls_32.S
+++ b/arch/sparc/kernel/systbls_32.S
@@ -86,3 +86,4 @@ sys_call_table:
 /*330*/.long sys_fanotify_mark, sys_prlimit64, sys_name_to_handle_at, sys_open_by_handle_at, sys_clock_adjtime
 /*335*/.long sys_syncfs, sys_sendmmsg, sys_setns, sys_process_vm_readv, sys_process_vm_writev
 /*340*/.long sys_ni_syscall, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr
+/*345*/.long sys_renameat2
diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S
index 4bd4e2bb26cf..33ecba2826ea 100644
--- a/arch/sparc/kernel/systbls_64.S
+++ b/arch/sparc/kernel/systbls_64.S
@@ -87,6 +87,7 @@ sys_call_table32:
 /*330*/.word compat_sys_fanotify_mark, sys_prlimit64, sys_name_to_handle_at, compat_sys_open_by_handle_at, compat_sys_clock_adjtime
.word sys_syncfs, compat_sys_sendmmsg, sys_setns, compat_sys_process_vm_readv, compat_sys_process_vm_writev
 /*340*/.word sys_kern_features, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr
+   .word sys32_renameat2
 
 #endif /* CONFIG_COMPAT */
 
@@ -165,3 +166,4 @@ sys_call_table:
 /*330*/.word sys_fanotify_mark, sys_prlimit64, sys_name_to_handle_at, sys_open_by_handle_at, sys_clock_adjtime
.word sys_syncfs, sys_sendmmsg, sys_setns, sys_process_vm_readv, sys_process_vm_writev
 /*340*/.word sys_kern_features, sys_kcmp, sys_finit_module, sys_sched_setattr, sys_sched_getattr
+   .word sys_renameat2
diff --git a/arch/sparc/math-emu/math_32.c b/arch/sparc/math-emu/math_32.c
index aa4d55b0bdf0..5ce8f2f64604 100644
--- a/arch/sparc/math-emu/math_32.c
+++ 

Linux 3.15.10

2014-08-13 Thread Greg KH
I'm announcing the release of the 3.15.10 kernel.

Note, this is the LAST 3.15.y kernel release, it is now end-of-life,
please move to 3.16.y now.

All users of the 3.15 kernel series must upgrade.

The updated 3.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.15.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|2 
 arch/sparc/include/asm/tlbflush_64.h|   12 
 arch/sparc/include/uapi/asm/unistd.h|3 -
 arch/sparc/kernel/ldc.c |2 
 arch/sparc/kernel/sys32.S   |1 
 arch/sparc/kernel/systbls_32.S  |1 
 arch/sparc/kernel/systbls_64.S  |2 
 arch/sparc/math-emu/math_32.c   |2 
 arch/sparc/mm/init_64.c |   31 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h |1 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |9 +++
 drivers/net/ethernet/broadcom/genet/bcmgenet.c  |5 +
 drivers/net/ethernet/brocade/bna/bnad.c |2 
 drivers/net/macvlan.c   |1 
 drivers/net/phy/phy_device.c|2 
 drivers/net/ppp/pptp.c  |2 
 drivers/sbus/char/bbc_envctrl.c |6 ++
 drivers/sbus/char/bbc_i2c.c |   11 +++-
 drivers/tty/serial/sunsab.c |9 +++
 fs/xfs/xfs_log.h|   19 ---
 fs/xfs/xfs_log_cil.c|7 +-
 include/net/inetpeer.h  |   16 +-
 include/net/ip.h|   31 +--
 include/net/ip_tunnels.h|1 
 include/net/ipv6.h  |2 
 include/net/secure_seq.h|2 
 net/batman-adv/fragmentation.c  |   10 ++-
 net/compat.c|9 +--
 net/core/iovec.c|   10 ++-
 net/core/secure_seq.c   |   25 -
 net/core/skbuff.c   |2 
 net/ipv4/igmp.c |4 -
 net/ipv4/inetpeer.c |   18 --
 net/ipv4/ip_output.c|7 +-
 net/ipv4/ip_tunnel.c|   29 ++-
 net/ipv4/ip_tunnel_core.c   |2 
 net/ipv4/ipmr.c |2 
 net/ipv4/raw.c  |2 
 net/ipv4/route.c|   63 ++--
 net/ipv4/tcp_vegas.c|3 -
 net/ipv4/tcp_veno.c |2 
 net/ipv4/xfrm4_mode_tunnel.c|2 
 net/ipv6/ip6_output.c   |   14 +
 net/ipv6/output_core.c  |   25 -
 net/netfilter/ipvs/ip_vs_xmit.c |2 
 net/sctp/associola.c|1 
 net/sctp/output.c   |2 
 net/xfrm/xfrm_policy.c  |2 
 net/xfrm/xfrm_user.c|7 +-
 49 files changed, 229 insertions(+), 196 deletions(-)

Andrey Ryabinin (1):
  net: sendmsg: fix NULL pointer dereference

Andrey Utkin (1):
  arch/sparc/math-emu/math_32.c: drop stray break operator

Christoph Paasch (2):
  tcp: Fix integer-overflows in TCP veno
  tcp: Fix integer-overflow in TCP vegas

Christopher Alexander Tobias Schulze (2):
  bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
  sunsab: Fix detection of BREAK on sunsab serial console

Daniel Borkmann (1):
  net: sctp: inherit auth_capable on INIT collisions

Dave Chinner (1):
  xfs: log vector rounding leaks log space

David S. Miller (3):
  sparc: Hook up renameat2 syscall.
  sparc64: Do not insert non-valid PTEs into the TSB hash table.
  sparc64: Guard against flushing openfirmware mappings.

Dmitry Kravkov (1):
  bnx2x: fix crash during TSO tunneling

Dmitry Popov (1):
  ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"

Eric Dumazet (3):
  inetpeer: get rid of ip_id_count
  ip: make IP identifiers less predictable
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()

Florian Fainelli (2):
  net: bcmgenet: correctly pad short packets
  net: phy: re-apply PHY fixups during phy_register_device

Greg Kroah-Hartman (1):
  Linux 3.15.10

Ivan Vecera (1):
  bna: fix performance regression

Sasha Levin (1):
  iovec: make sure the caller actually wants anything in memcpy_fromiovecend

Sowmini Varadhan (1):
  sparc64: ldc_connect() should not return EINVAL 

Re: Linux 3.14.17

2014-08-13 Thread Greg KH

diff --git a/Makefile b/Makefile
index 8b22e24a2d8e..12aac0325888 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 14
-SUBLEVEL = 16
+SUBLEVEL = 17
 EXTRAVERSION =
 NAME = Remembering Coco
 
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 0f9e94537eee..1a49ffdf9da9 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -24,7 +24,8 @@
 
 /* The kernel image occupies 0x400 to 0x600 (4MB --> 96MB).
  * The page copy blockops can use 0x600 to 0x800.
- * The TSB is mapped in the 0x800 to 0xa00 range.
+ * The 8K TSB is mapped in the 0x800 to 0x840 range.
+ * The 4M TSB is mapped in the 0x840 to 0x880 range.
  * The PROM resides in an area spanning 0xf000 to 0x1.
  * The vmalloc area spans 0x1 to 0x2.
  * Since modules need to be in the lowest 32-bits of the address space,
@@ -33,7 +34,8 @@
  * 0x4.
  */
 #defineTLBTEMP_BASE_AC(0x0600,UL)
-#defineTSBMAP_BASE _AC(0x0800,UL)
+#defineTSBMAP_8K_BASE  _AC(0x0800,UL)
+#defineTSBMAP_4M_BASE  _AC(0x0840,UL)
 #define MODULES_VADDR  _AC(0x1000,UL)
 #define MODULES_LEN_AC(0xe000,UL)
 #define MODULES_END_AC(0xf000,UL)
@@ -71,6 +73,23 @@
 
 #include 
 
+extern unsigned long sparc64_valid_addr_bitmap[];
+
+/* Needs to be defined here and not in linux/mm.h, as it is arch dependent */
+static inline bool __kern_addr_valid(unsigned long paddr)
+{
+   if ((paddr >> MAX_PHYS_ADDRESS_BITS) != 0UL)
+   return false;
+   return test_bit(paddr >> ILOG2_4MB, sparc64_valid_addr_bitmap);
+}
+
+static inline bool kern_addr_valid(unsigned long addr)
+{
+   unsigned long paddr = __pa(addr);
+
+   return __kern_addr_valid(paddr);
+}
+
 /* Entries per page directory level. */
 #define PTRS_PER_PTE   (1UL << (PAGE_SHIFT-3))
 #define PTRS_PER_PMD   (1UL << PMD_BITS)
@@ -79,9 +98,12 @@
 /* Kernel has a separate 44bit address space. */
 #define FIRST_USER_ADDRESS 0
 
-#define pte_ERROR(e)   __builtin_trap()
-#define pmd_ERROR(e)   __builtin_trap()
-#define pgd_ERROR(e)   __builtin_trap()
+#define pmd_ERROR(e)   \
+   pr_err("%s:%d: bad pmd %p(%016lx) seen at (%pS)\n", \
+  __FILE__, __LINE__, &(e), pmd_val(e), __builtin_return_address(0))
+#define pgd_ERROR(e)   \
+   pr_err("%s:%d: bad pgd %p(%016lx) seen at (%pS)\n", \
+  __FILE__, __LINE__, &(e), pgd_val(e), __builtin_return_address(0))
 
 #endif /* !(__ASSEMBLY__) */
 
@@ -258,8 +280,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t prot)
 {
unsigned long mask, tmp;
 
-   /* SUN4U: 0x600307ffecb8 (negated == 0x9ffcf8001347)
-* SUN4V: 0x30ffee17 (negated == 0xcf0011e8)
+   /* SUN4U: 0x630107ffec38 (negated == 0x9cfef80013c7)
+* SUN4V: 0x33ffee07 (negated == 0xcc0011f8)
 *
 * Even if we use negation tricks the result is still a 6
 * instruction sequence, so don't try to play fancy and just
@@ -289,10 +311,10 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t prot)
"   .previous\n"
: "=r" (mask), "=r" (tmp)
: "i" (_PAGE_PADDR_4U | _PAGE_MODIFIED_4U | _PAGE_ACCESSED_4U |
-  _PAGE_CP_4U | _PAGE_CV_4U | _PAGE_E_4U | _PAGE_PRESENT_4U |
+  _PAGE_CP_4U | _PAGE_CV_4U | _PAGE_E_4U |
   _PAGE_SPECIAL | _PAGE_PMD_HUGE | _PAGE_SZALL_4U),
  "i" (_PAGE_PADDR_4V | _PAGE_MODIFIED_4V | _PAGE_ACCESSED_4V |
-  _PAGE_CP_4V | _PAGE_CV_4V | _PAGE_E_4V | _PAGE_PRESENT_4V |
+  _PAGE_CP_4V | _PAGE_CV_4V | _PAGE_E_4V |
   _PAGE_SPECIAL | _PAGE_PMD_HUGE | _PAGE_SZALL_4V));
 
return __pte((pte_val(pte) & mask) | (pgprot_val(prot) & ~mask));
@@ -633,7 +655,7 @@ static inline unsigned long pmd_large(pmd_t pmd)
 {
pte_t pte = __pte(pmd_val(pmd));
 
-   return (pte_val(pte) & _PAGE_PMD_HUGE) && pte_present(pte);
+   return pte_val(pte) & _PAGE_PMD_HUGE;
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -719,20 +741,6 @@ static inline pmd_t pmd_mkwrite(pmd_t pmd)
return __pmd(pte_val(pte));
 }
 
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-   unsigned long mask;
-
-   if (tlb_type == hypervisor)
-   mask = _PAGE_PRESENT_4V;
-   else
-   mask = _PAGE_PRESENT_4U;
-
-   pmd_val(pmd) &= ~mask;
-
-   return pmd;
-}
-
 static inline pmd_t pmd_mksplitting(pmd_t pmd)
 {
pte_t pte = __pte(pmd_val(pmd));
@@ -757,6 +765,20 @@ static inline int pmd_present(pmd_t pmd)
 
 #define pmd_none(pmd)  (!pmd_val(pmd))
 
+/* 

Linux 3.10.53

2014-08-13 Thread Greg KH
I'm announcing the release of the 3.10.53 kernel.

All users of the 3.10 kernel series must upgrade.

The updated 3.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.10.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|2 
 arch/sparc/include/asm/pgtable_64.h |6 -
 arch/sparc/include/asm/tlbflush_64.h|   12 --
 arch/sparc/kernel/ldc.c |2 
 arch/sparc/kernel/smp_64.c  |6 -
 arch/sparc/kernel/sys32.S   |2 
 arch/sparc/kernel/unaligned_64.c|   12 ++
 arch/sparc/lib/NG2memcpy.S  |1 
 arch/sparc/math-emu/math_32.c   |2 
 arch/sparc/mm/fault_64.c|  102 
 arch/sparc/mm/init_64.c |   27 ++
 arch/sparc/mm/tsb.c |   14 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h |1 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |9 ++
 drivers/net/macvlan.c   |1 
 drivers/net/ppp/pptp.c  |2 
 drivers/net/vxlan.c |2 
 drivers/sbus/char/bbc_envctrl.c |6 +
 drivers/sbus/char/bbc_i2c.c |   11 +-
 drivers/tty/serial/sunsab.c |9 ++
 include/net/inetpeer.h  |   16 ---
 include/net/ip.h|   31 +++
 include/net/ipv6.h  |   11 +-
 include/net/secure_seq.h|2 
 net/compat.c|9 +-
 net/core/iovec.c|   10 +-
 net/core/secure_seq.c   |   25 -
 net/core/skbuff.c   |2 
 net/ipv4/igmp.c |4 
 net/ipv4/inetpeer.c |   18 
 net/ipv4/ip_output.c|7 -
 net/ipv4/ip_tunnel.c|2 
 net/ipv4/ipmr.c |2 
 net/ipv4/raw.c  |2 
 net/ipv4/route.c|   69 ++--
 net/ipv4/tcp_vegas.c|3 
 net/ipv4/tcp_veno.c |2 
 net/ipv4/xfrm4_mode_tunnel.c|2 
 net/ipv6/ip6_output.c   |   17 
 net/ipv6/output_core.c  |   23 -
 net/ipv6/sit.c  |2 
 net/netfilter/ipvs/ip_vs_xmit.c |2 
 net/sctp/associola.c|1 
 net/sctp/output.c   |2 
 44 files changed, 269 insertions(+), 224 deletions(-)

Andrey Ryabinin (1):
  net: sendmsg: fix NULL pointer dereference

Andrey Utkin (1):
  arch/sparc/math-emu/math_32.c: drop stray break operator

Christoph Paasch (2):
  tcp: Fix integer-overflows in TCP veno
  tcp: Fix integer-overflow in TCP vegas

Christopher Alexander Tobias Schulze (2):
  bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
  sunsab: Fix detection of BREAK on sunsab serial console

Daniel Borkmann (1):
  net: sctp: inherit auth_capable on INIT collisions

David S. Miller (8):
  sparc64: Fix argument sign extension for compat_sys_futex().
  sparc64: Handle 32-bit tasks properly in compute_effective_address().
  sparc64: Fix top-level fault handling bugs.
  sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.
  sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
  sparc64: Add membar to Niagara2 memcpy code.
  sparc64: Do not insert non-valid PTEs into the TSB hash table.
  sparc64: Guard against flushing openfirmware mappings.

Dmitry Kravkov (1):
  bnx2x: fix crash during TSO tunneling

Eric Dumazet (3):
  inetpeer: get rid of ip_id_count
  ip: make IP identifiers less predictable
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()

Greg Kroah-Hartman (1):
  Linux 3.10.53

Kirill Tkhai (1):
  sparc64: Make itc_sync_lock raw

Sasha Levin (1):
  iovec: make sure the caller actually wants anything in memcpy_fromiovecend

Sowmini Varadhan (1):
  sparc64: ldc_connect() should not return EINVAL when handshake is in progress.

Vlad Yasevich (2):
  macvlan: Initialize vlan_features to turn on offload support.
  net: Correctly set segment mac_len in skb_segment().





Linux 3.14.17

2014-08-13 Thread Greg KH
I'm announcing the release of the 3.14.17 kernel.

All users of the 3.14 kernel series must upgrade.

The updated 3.14.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.14.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile|2 
 arch/sparc/include/asm/pgtable_64.h |   89 -
 arch/sparc/include/asm/tlbflush_64.h|   12 --
 arch/sparc/include/asm/tsb.h|3 
 arch/sparc/kernel/head_64.S |4 
 arch/sparc/kernel/ktlb.S|2 
 arch/sparc/kernel/ldc.c |2 
 arch/sparc/kernel/smp_64.c  |6 -
 arch/sparc/kernel/sys32.S   |2 
 arch/sparc/kernel/unaligned_64.c|   12 ++
 arch/sparc/lib/NG2memcpy.S  |1 
 arch/sparc/math-emu/math_32.c   |2 
 arch/sparc/mm/fault_64.c|   98 
 arch/sparc/mm/gup.c |2 
 arch/sparc/mm/init_64.c |   43 +-
 arch/sparc/mm/tlb.c |   26 --
 arch/sparc/mm/tsb.c |   14 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h |1 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |9 ++
 drivers/net/ethernet/brocade/bna/bnad.c |2 
 drivers/net/macvlan.c   |1 
 drivers/net/phy/phy_device.c|2 
 drivers/net/ppp/pptp.c  |2 
 drivers/sbus/char/bbc_envctrl.c |6 +
 drivers/sbus/char/bbc_i2c.c |   11 +-
 drivers/tty/serial/sunsab.c |9 ++
 fs/xfs/xfs_log.h|   19 +++-
 fs/xfs/xfs_log_cil.c|7 -
 include/net/inetpeer.h  |   16 ---
 include/net/ip.h|   31 +++
 include/net/ip_tunnels.h|1 
 include/net/ipv6.h  |2 
 include/net/secure_seq.h|2 
 net/batman-adv/fragmentation.c  |   10 +-
 net/compat.c|9 +-
 net/core/iovec.c|   10 +-
 net/core/secure_seq.c   |   25 --
 net/core/skbuff.c   |2 
 net/ipv4/igmp.c |4 
 net/ipv4/inetpeer.c |   18 
 net/ipv4/ip_output.c|7 -
 net/ipv4/ip_tunnel.c|   29 ---
 net/ipv4/ip_tunnel_core.c   |2 
 net/ipv4/ipmr.c |2 
 net/ipv4/raw.c  |2 
 net/ipv4/route.c|   63 +--
 net/ipv4/tcp_vegas.c|3 
 net/ipv4/tcp_veno.c |2 
 net/ipv4/xfrm4_mode_tunnel.c|2 
 net/ipv6/ip6_output.c   |   14 +++
 net/ipv6/output_core.c  |   23 -
 net/netfilter/ipvs/ip_vs_xmit.c |2 
 net/sctp/associola.c|1 
 net/sctp/output.c   |2 
 net/xfrm/xfrm_user.c|7 -
 55 files changed, 378 insertions(+), 302 deletions(-)

Andrey Ryabinin (1):
  net: sendmsg: fix NULL pointer dereference

Andrey Utkin (1):
  arch/sparc/math-emu/math_32.c: drop stray break operator

Christoph Paasch (2):
  tcp: Fix integer-overflows in TCP veno
  tcp: Fix integer-overflow in TCP vegas

Christopher Alexander Tobias Schulze (2):
  bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
  sunsab: Fix detection of BREAK on sunsab serial console

Daniel Borkmann (1):
  net: sctp: inherit auth_capable on INIT collisions

Dave Chinner (1):
  xfs: log vector rounding leaks log space

David S. Miller (17):
  sparc64: Fix argument sign extension for compat_sys_futex().
  sparc64: Fix executable bit testing in set_pmd_at() paths.
  sparc64: Fix huge PMD invalidation.
  sparc64: Fix bugs in get_user_pages_fast() wrt. THP.
  sparc64: Fix hex values in comment above pte_modify().
  sparc64: Don't use _PAGE_PRESENT in pte_modify() mask.
  sparc64: Handle 32-bit tasks properly in compute_effective_address().
  sparc64: Fix top-level fault handling bugs.
  sparc64: Fix range check in kern_addr_valid().
  sparc64: Use 'ILOG2_4MB' instead of constant '22'.
  sparc64: Add basic validations to {pud,pmd}_bad().
  

Re: Linux 3.10.53

2014-08-13 Thread Greg KH
diff --git a/Makefile b/Makefile
index b94f00938acc..2ac415a7e937 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 10
-SUBLEVEL = 52
+SUBLEVEL = 53
 EXTRAVERSION =
 NAME = TOSSUG Baby Fish
 
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index dfb0019bf05b..6663604a902a 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -24,7 +24,8 @@
 
 /* The kernel image occupies 0x400 to 0x600 (4MB --> 96MB).
  * The page copy blockops can use 0x600 to 0x800.
- * The TSB is mapped in the 0x800 to 0xa00 range.
+ * The 8K TSB is mapped in the 0x800 to 0x840 range.
+ * The 4M TSB is mapped in the 0x840 to 0x880 range.
  * The PROM resides in an area spanning 0xf000 to 0x1.
  * The vmalloc area spans 0x1 to 0x2.
  * Since modules need to be in the lowest 32-bits of the address space,
@@ -33,7 +34,8 @@
  * 0x4.
  */
 #define TLBTEMP_BASE   _AC(0x0600,UL)
-#define TSBMAP_BASE    _AC(0x0800,UL)
+#define TSBMAP_8K_BASE _AC(0x0800,UL)
+#define TSBMAP_4M_BASE _AC(0x0840,UL)
 #define MODULES_VADDR  _AC(0x1000,UL)
 #define MODULES_LEN    _AC(0xe000,UL)
 #define MODULES_END    _AC(0xf000,UL)
diff --git a/arch/sparc/include/asm/tlbflush_64.h b/arch/sparc/include/asm/tlbflush_64.h
index f0d6a9700f4c..1a4bb971e06d 100644
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -35,6 +35,8 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 {
 }
 
+void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+
 #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 
 extern void flush_tlb_pending(void);
@@ -49,11 +51,6 @@ extern void __flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 #ifndef CONFIG_SMP
 
-#define flush_tlb_kernel_range(start,end) \
-do {   flush_tsb_kernel_range(start,end); \
-   __flush_tlb_kernel_range(start,end); \
-} while (0)
-
 static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
 {
__flush_tlb_page(CTX_HWBITS(mm->context), vaddr);
@@ -64,11 +61,6 @@ static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vad
 extern void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr);
 
-#define flush_tlb_kernel_range(start, end) \
-do {   flush_tsb_kernel_range(start,end); \
-   smp_flush_tlb_kernel_range(start, end); \
-} while (0)
-
 #define global_flush_tlb_page(mm, vaddr) \
smp_flush_tlb_page(mm, vaddr)
 
diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index 54df554b82d9..fa4c900a0d1f 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -1336,7 +1336,7 @@ int ldc_connect(struct ldc_channel *lp)
if (!(lp->flags & LDC_FLAG_ALLOCED_QUEUES) ||
!(lp->flags & LDC_FLAG_REGISTERED_QUEUES) ||
lp->hs_state != LDC_HS_OPEN)
-   err = -EINVAL;
+   err = ((lp->hs_state > LDC_HS_OPEN) ? 0 : -EINVAL);
else
err = start_handshake(lp);
 
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index 77539eda928c..8565ecd7d48a 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -150,7 +150,7 @@ void cpu_panic(void)
 #define NUM_ROUNDS 64  /* magic value */
 #define NUM_ITERS  5   /* likewise */
 
-static DEFINE_SPINLOCK(itc_sync_lock);
+static DEFINE_RAW_SPINLOCK(itc_sync_lock);
 static unsigned long go[SLAVE + 1];
 
 #define DEBUG_TICK_SYNC 0
@@ -258,7 +258,7 @@ static void smp_synchronize_one_tick(int cpu)
go[MASTER] = 0;
membar_safe("#StoreLoad");
 
-   spin_lock_irqsave(&itc_sync_lock, flags);
+   raw_spin_lock_irqsave(&itc_sync_lock, flags);
{
for (i = 0; i < NUM_ROUNDS*NUM_ITERS; i++) {
while (!go[MASTER])
@@ -269,7 +269,7 @@ static void smp_synchronize_one_tick(int cpu)
membar_safe("#StoreLoad");
}
}
-   spin_unlock_irqrestore(&itc_sync_lock, flags);
+   raw_spin_unlock_irqrestore(&itc_sync_lock, flags);
 }
 
 #if defined(CONFIG_SUN_LDOMS) && defined(CONFIG_HOTPLUG_CPU)
diff --git a/arch/sparc/kernel/sys32.S b/arch/sparc/kernel/sys32.S
index f7c72b6efc27..d066eb18650c 100644
--- a/arch/sparc/kernel/sys32.S
+++ b/arch/sparc/kernel/sys32.S
@@ -44,7 +44,7 @@ SIGN1(sys32_timer_settime, compat_sys_timer_settime, %o1)
 SIGN1(sys32_io_submit, compat_sys_io_submit, %o1)
 SIGN1(sys32_mq_open, compat_sys_mq_open, %o1)
 SIGN1(sys32_select, compat_sys_select, %o0)
-SIGN3(sys32_futex, compat_sys_futex, %o1, %o2, %o5)
+SIGN1(sys32_futex, compat_sys_futex, %o1)
 SIGN1(sys32_recvfrom, 

Re: Linux 3.4.103

2014-08-13 Thread Greg KH

diff --git a/Makefile b/Makefile
index dd03fa5777a0..36f0913bd1d6 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 102
+SUBLEVEL = 103
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 656de8bc0ed6..cd24caf1732d 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -24,7 +24,8 @@
 
 /* The kernel image occupies 0x400 to 0x600 (4MB --> 96MB).
  * The page copy blockops can use 0x600 to 0x800.
- * The TSB is mapped in the 0x800 to 0xa00 range.
+ * The 8K TSB is mapped in the 0x800 to 0x840 range.
+ * The 4M TSB is mapped in the 0x840 to 0x880 range.
  * The PROM resides in an area spanning 0xf000 to 0x1.
  * The vmalloc area spans 0x1 to 0x2.
  * Since modules need to be in the lowest 32-bits of the address space,
@@ -33,7 +34,8 @@
  * 0x4.
  */
 #define TLBTEMP_BASE   _AC(0x0600,UL)
-#define TSBMAP_BASE    _AC(0x0800,UL)
+#define TSBMAP_8K_BASE _AC(0x0800,UL)
+#define TSBMAP_4M_BASE _AC(0x0840,UL)
 #define MODULES_VADDR  _AC(0x1000,UL)
 #define MODULES_LEN    _AC(0xe000,UL)
 #define MODULES_END    _AC(0xf000,UL)
diff --git a/arch/sparc/include/asm/tlbflush_64.h b/arch/sparc/include/asm/tlbflush_64.h
index f0d6a9700f4c..1a4bb971e06d 100644
--- a/arch/sparc/include/asm/tlbflush_64.h
+++ b/arch/sparc/include/asm/tlbflush_64.h
@@ -35,6 +35,8 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 {
 }
 
+void flush_tlb_kernel_range(unsigned long start, unsigned long end);
+
 #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 
 extern void flush_tlb_pending(void);
@@ -49,11 +51,6 @@ extern void __flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 #ifndef CONFIG_SMP
 
-#define flush_tlb_kernel_range(start,end) \
-do {   flush_tsb_kernel_range(start,end); \
-   __flush_tlb_kernel_range(start,end); \
-} while (0)
-
 static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr)
 {
__flush_tlb_page(CTX_HWBITS(mm->context), vaddr);
@@ -64,11 +61,6 @@ static inline void global_flush_tlb_page(struct mm_struct *mm, unsigned long vad
 extern void smp_flush_tlb_kernel_range(unsigned long start, unsigned long end);
 extern void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr);
 
-#define flush_tlb_kernel_range(start, end) \
-do {   flush_tsb_kernel_range(start,end); \
-   smp_flush_tlb_kernel_range(start, end); \
-} while (0)
-
 #define global_flush_tlb_page(mm, vaddr) \
smp_flush_tlb_page(mm, vaddr)
 
diff --git a/arch/sparc/kernel/ldc.c b/arch/sparc/kernel/ldc.c
index 435e406fdec3..1beaf60a5f78 100644
--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -1339,7 +1339,7 @@ int ldc_connect(struct ldc_channel *lp)
if (!(lp->flags & LDC_FLAG_ALLOCED_QUEUES) ||
!(lp->flags & LDC_FLAG_REGISTERED_QUEUES) ||
lp->hs_state != LDC_HS_OPEN)
-   err = -EINVAL;
+   err = ((lp->hs_state > LDC_HS_OPEN) ? 0 : -EINVAL);
else
err = start_handshake(lp);
 
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index bb2886ac1df3..bbf7e955cb19 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -151,7 +151,7 @@ void cpu_panic(void)
 #define NUM_ROUNDS 64  /* magic value */
 #define NUM_ITERS  5   /* likewise */
 
-static DEFINE_SPINLOCK(itc_sync_lock);
+static DEFINE_RAW_SPINLOCK(itc_sync_lock);
 static unsigned long go[SLAVE + 1];
 
 #define DEBUG_TICK_SYNC 0
@@ -259,7 +259,7 @@ static void smp_synchronize_one_tick(int cpu)
go[MASTER] = 0;
membar_safe("#StoreLoad");
 
-   spin_lock_irqsave(&itc_sync_lock, flags);
+   raw_spin_lock_irqsave(&itc_sync_lock, flags);
{
for (i = 0; i < NUM_ROUNDS*NUM_ITERS; i++) {
while (!go[MASTER])
@@ -270,7 +270,7 @@ static void smp_synchronize_one_tick(int cpu)
membar_safe("#StoreLoad");
}
}
-   spin_unlock_irqrestore(&itc_sync_lock, flags);
+   raw_spin_unlock_irqrestore(&itc_sync_lock, flags);
 }
 
 #if defined(CONFIG_SUN_LDOMS) && defined(CONFIG_HOTPLUG_CPU)
diff --git a/arch/sparc/kernel/sys32.S b/arch/sparc/kernel/sys32.S
index d97f3eb72e06..085c60fd4b6b 100644
--- a/arch/sparc/kernel/sys32.S
+++ b/arch/sparc/kernel/sys32.S
@@ -87,7 +87,7 @@ SIGN1(sys32_io_submit, compat_sys_io_submit, %o1)
 SIGN1(sys32_mq_open, compat_sys_mq_open, %o1)
 SIGN1(sys32_select, compat_sys_select, %o0)
 SIGN1(sys32_mkdir, sys_mkdir, %o1)
-SIGN3(sys32_futex, compat_sys_futex, %o1, %o2, %o5)
+SIGN1(sys32_futex, compat_sys_futex, %o1)
 SIGN1(sys32_sysfs, compat_sys_sysfs, 

Linux 3.4.103

2014-08-13 Thread Greg KH
I'm announcing the release of the 3.4.103 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile                             |   2
 arch/sparc/include/asm/pgtable_64.h  |   6 +-
 arch/sparc/include/asm/tlbflush_64.h |  12
 arch/sparc/kernel/ldc.c              |   2
 arch/sparc/kernel/smp_64.c           |   6 +-
 arch/sparc/kernel/sys32.S            |   2
 arch/sparc/kernel/unaligned_64.c     |  12 +++-
 arch/sparc/lib/NG2memcpy.S           |   1
 arch/sparc/math-emu/math_32.c        |   2
 arch/sparc/mm/fault_64.c             | 102 ++-
 arch/sparc/mm/init_64.c              |  27 +
 arch/sparc/mm/tsb.c                  |  14
 drivers/net/macvlan.c                |   1
 drivers/net/ppp/pptp.c               |   2
 drivers/sbus/char/bbc_envctrl.c      |   6 ++
 drivers/sbus/char/bbc_i2c.c          |  11 ++-
 drivers/tty/serial/sunsab.c          |   9 +++
 include/net/inetpeer.h               |  14
 include/net/ip.h                     |  31 --
 include/net/ipip.h                   |   2
 include/net/ipv6.h                   |   9 ++-
 include/net/secure_seq.h             |   2
 net/compat.c                         |   9 +--
 net/core/iovec.c                     |  10 ++-
 net/core/secure_seq.c                |  23 ---
 net/core/skbuff.c                    |   2
 net/ipv4/igmp.c                      |   4 -
 net/ipv4/inetpeer.c                  |  18 --
 net/ipv4/ip_output.c                 |   7 +-
 net/ipv4/ipmr.c                      |   2
 net/ipv4/raw.c                       |   2
 net/ipv4/route.c                     |  78 +++---
 net/ipv4/tcp_vegas.c                 |   3 -
 net/ipv4/tcp_veno.c                  |   2
 net/ipv4/xfrm4_mode_tunnel.c         |   2
 net/ipv6/ip6_output.c                |  25 +++-
 net/netfilter/ipvs/ip_vs_xmit.c      |   2
 net/sctp/associola.c                 |   1
 net/sctp/output.c                    |   2
 39 files changed, 250 insertions(+), 217 deletions(-)

Andrey Ryabinin (1):
  net: sendmsg: fix NULL pointer dereference

Andrey Utkin (1):
  arch/sparc/math-emu/math_32.c: drop stray break operator

Christoph Paasch (2):
  tcp: Fix integer-overflows in TCP veno
  tcp: Fix integer-overflow in TCP vegas

Christopher Alexander Tobias Schulze (2):
  bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
  sunsab: Fix detection of BREAK on sunsab serial console

Daniel Borkmann (1):
  net: sctp: inherit auth_capable on INIT collisions

David S. Miller (8):
  sparc64: Fix argument sign extension for compat_sys_futex().
  sparc64: Handle 32-bit tasks properly in compute_effective_address().
  sparc64: Fix top-level fault handling bugs.
  sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.
  sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
  sparc64: Add membar to Niagara2 memcpy code.
  sparc64: Do not insert non-valid PTEs into the TSB hash table.
  sparc64: Guard against flushing openfirmware mappings.

Eric Dumazet (3):
  inetpeer: get rid of ip_id_count
  ip: make IP identifiers less predictable
  sctp: fix possible seqlock seadlock in sctp_packet_transmit()

Greg Kroah-Hartman (1):
  Linux 3.4.103

Kirill Tkhai (1):
  sparc64: Make itc_sync_lock raw

Sasha Levin (1):
  iovec: make sure the caller actually wants anything in memcpy_fromiovecend

Sowmini Varadhan (1):
  sparc64: ldc_connect() should not return EINVAL when handshake is in progress.

Vlad Yasevich (2):
  macvlan: Initialize vlan_features to turn on offload support.
  net: Correctly set segment mac_len in skb_segment().





Re: [PATCH] ARM: dts: Add mmc0 and mmc1 aliases for rk3288

2014-08-13 Thread addy ke
> Addy,
> 
> On Wed, Aug 13, 2014 at 6:57 PM, Addy  wrote:
> 
>> I think maybe it is suitable as follows:
>> mmc0 = 
>> mmc1 = 
>> mmc2 = 
>> mmc3 = 
> 
> Right, except the only ones that have landed in Heiko's tree are sdmmc
> and emmc, so we can't do sdio0 and sdio1 yet.  You could post support
> for sdio0 and sdio1?

yes, I will post it today.
> 
> Also: it's really handy if emmc is 0.  See below: I don't think it's
> great to use the ID to find the sysconfig registers.
> 
> 
>> So we can get ctrl_id:
>> ctrl_id = of_alias_get_id(host->dev->of_node, "mshc");
> 
> Somehow I hadn't realized that was there.  I guess we could use that
> too.  I'd vote to remove that and use the standard "mmc" numbering
> (and get some momentum to land those patches).  If you want I'll
> repost using the mshc stuff, though.
> 
> 
>> and can get offset of registers:
>> offset = 0x200 + ctrl_id * 8 + 4 * drive_or_sample
> 
> I thought the plan was to actually implement the phase stuff as a clock 
> driver.
> 
> ...even if we didn't, I'd rather not rely on ID like this to find the
> right address.  It's really non-obvious.
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Usage of _PAGE_PCD et al in i915 driver

2014-08-13 Thread Juergen Gross

On 08/13/2014 05:07 PM, Jesse Barnes wrote:
> On Fri, 8 Aug 2014 15:14:15 +0200
> Daniel Vetter wrote:
>
>> Adding relevant mailing lists.
>>
>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross wrote:
>>> I'm just about to create a patch for full PAT support in the Linux
>>> kernel, including Xen. For this purpose I introduce a translation
>>> between cache modes and pte bits.
>>>
>>> Scanning the kernel sources for usage of the cache mode bits in the
>>> pte I discovered drivers/gpu/drm/i915/i915_gem_gtt.h is using
>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
>>> to create ptes not for usage by the main processor, but for the
>>> graphics processor. Is this true? In this case I'd suggest to define
>>> i915-specific macros instead of using the x86 ones.
>>
>> Yeah, those are gpu specific PAT tables, but the hw engineers
>> specifically designed this to match, and we've tried to follow the cpu
>> side to match it. Especially in the future that will be somewhat
>> important, since we want to fully share the entire address space
>> between cpu and gpu on the next platform. Jesse is working on that.
>
> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
> defines makes sense.  I suppose with your work you'll move them and
> make them a bit more opaque?  If so, we'll still want a way to get at
> them directly, or access your mapping functions for generating PTE bits
> for the GPU MMU.


Using the mapping functions I'm introducing should work, if the MMU has
an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
on the x86 processor (be aware that Xen uses a different MSR_IA32_CR_PAT
setting than the Linux kernel).

Juergen
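
A minimal sketch of the kind of translation discussed above, assuming the
usual x86 layout for 4K pages where the PAT slot index is formed from the
PAT, PCD and PWT bits. The PTE_* names and both helper functions are
hypothetical and are not taken from the patch series; which cache mode a slot
selects depends on how MSR_IA32_CR_PAT is programmed, which is why Xen and a
native kernel may need different tables.

```c
#include <stdint.h>

/* Architectural x86 cache-control bits in a 4K page table entry. */
#define PTE_PWT (UINT64_C(1) << 3)
#define PTE_PCD (UINT64_C(1) << 4)
#define PTE_PAT (UINT64_C(1) << 7)

/* Hypothetical helper: PAT slot (0..7, encoded PAT:PCD:PWT) -> PTE bits. */
static uint64_t pat_index_to_pte_bits(unsigned int slot)
{
    uint64_t bits = 0;

    if (slot & 1)
        bits |= PTE_PWT;
    if (slot & 2)
        bits |= PTE_PCD;
    if (slot & 4)
        bits |= PTE_PAT;
    return bits;
}

/* Hypothetical inverse: recover the PAT slot from the bits of a PTE. */
static unsigned int pte_bits_to_pat_index(uint64_t pte)
{
    return ((pte & PTE_PWT) ? 1u : 0u) |
           ((pte & PTE_PCD) ? 2u : 0u) |
           ((pte & PTE_PAT) ? 4u : 0u);
}
```

A GPU MMU could reuse such helpers only if its PAT table is programmed with
the same slot assignment as the CPU side, which is the condition stated in the
reply above.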



Re: [PATCH] ARM: dts: Add mmc0 and mmc1 aliases for rk3288

2014-08-13 Thread Doug Anderson
Addy,

On Wed, Aug 13, 2014 at 6:57 PM, Addy  wrote:

> I think maybe it is suitable as follows:
> mmc0 = 
> mmc1 = 
> mmc2 = 
> mmc3 = 

Right, except the only ones that have landed in Heiko's tree are sdmmc
and emmc, so we can't do sdio0 and sdio1 yet.  You could post support
for sdio0 and sdio1?

Also: it's really handy if emmc is 0.  See below: I don't think it's
great to use the ID to find the sysconfig registers.


> So we can get ctrl_id:
> ctrl_id = of_alias_get_id(host->dev->of_node, "mshc");

Somehow I hadn't realized that was there.  I guess we could use that
too.  I'd vote to remove that and use the standard "mmc" numbering
(and get some momentum to land those patches).  If you want I'll
repost using the mshc stuff, though.


> and can get offset of registers:
> offset = 0x200 + ctrl_id * 8 + 4 * drive_or_sample

I thought the plan was to actually implement the phase stuff as a clock driver.

...even if we didn't, I'd rather not rely on ID like this to find the
right address.  It's really non-obvious.
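
For reference, the arithmetic in the quoted proposal can be sketched as
follows. This only illustrates the formula being objected to; the function
name and the enum encoding (0 for drive, 1 for sample) are assumptions, since
the thread does not say which value of drive_or_sample means which.

```c
#include <stdint.h>

/* Assumed encoding of the "drive_or_sample" term; not stated in the thread. */
enum phase_kind { PHASE_DRIVE = 0, PHASE_SAMPLE = 1 };

/*
 * Hypothetical helper for the quoted formula:
 *   offset = 0x200 + ctrl_id * 8 + 4 * drive_or_sample
 * ctrl_id would come from of_alias_get_id(), so the result silently
 * changes whenever the alias numbering changes.
 */
static uint32_t phase_reg_offset(unsigned int ctrl_id, enum phase_kind kind)
{
    return 0x200u + ctrl_id * 8u + 4u * (uint32_t)kind;
}
```

With this scheme, swapping two aliases in the device tree moves a controller
to a different register pair without any change in the driver, which is the
non-obvious coupling described above.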


Re: [PATCH 19/19] Documentation: ACPI for ARM64

2014-08-13 Thread Hanjun Guo
On 2014-8-14 7:41, Rafael J. Wysocki wrote:
> On Tuesday, August 12, 2014 07:23:47 PM Catalin Marinas wrote:
>> On Mon, Jul 28, 2014 at 07:27:52PM +0100, Olof Johansson wrote:
>>> On Mon, Jul 28, 2014 at 10:00 AM, Mark Rutland  wrote:
>>>> On Mon, Jul 28, 2014 at 05:27:50PM +0100, Olof Johansson wrote:
>>>>> On Mon, Jul 28, 2014 at 11:07:50AM +0200, Arnd Bergmann wrote:
>>>>>> On Saturday 26 July 2014 19:34:48 Olof Johansson wrote:
>>>>>>> On Thu, Jul 24, 2014 at 6:00 AM, Hanjun Guo wrote:
>>>>>>>> +Relationship with Device Tree
>>>>>>>> +-----------------------------
>>>>>>>> +
>>>>>>>> +ACPI support in drivers and subsystems for ARMv8 should never be mutually
>>>>>>>> +exclusive with DT support at compile time.
>>>>>>>> +
>>>>>>>> +At boot time the kernel will only use one description method depending on
>>>>>>>> +parameters passed from the bootloader.
>>>>>>>
>>>>>>> Possibly overridden by kernel bootargs. And as debated for quite a
>>>>>>> while earlier this year, acpi should still default to off -- if a DT
>>>>>>> and ACPI are both passed in, DT should at this time be given priority.
>>>>>>
>>>>>> I think this would be harder to do with the way that ACPI is passed in
>>>>>> to the kernel. IIRC, you always have a minimal DT information based on
>>>>>> the ARM64 boot protocol, but in the case of ACPI, this contains pointers
>>>>>> to the ACPI tables, which are then used for populating the Linux platform
>>>>>> devices (unless acpi=disabled is set), while the other contents of the
>>>>>> DTB may be present but we skip the of_platform_populate state.
>>>>>
>>>>> How can it be harder to do? If you support acpi=off, then you should support
>>>>> acpi=on.
>>>>>
>>>>> Another alternative would be to have an early fixup that stubs out
>>>>> the acpi properties from the DTB unless there's an 'acpi' or 'acpi=on'
>>>>> argument on the cmdline. Not quite as tidy a solution, though.
>>>>
>>>> I don't follow:
>>>>
>>>> If you want to disable ACPI, you can pass acpi=off. This is your
>>>> workaround for broken ACPI (and only if you happen to have written your
>>>> own DTB, because many/most ACPI systems WILL NOT have a DTB to begin
>>>> with).
>>>
>>> All ACPI should be assumed broken at this time, so acpi=off _must_ be
>>> the default.
>>
>> (catching up with emails after holiday and I may have missed some of
>> your arguments)
>>
>> If we consider ACPI unusable on ARM but we still want to start merging
>> patches, we should rather make the config option depend on BROKEN
>> (though if it is that unusable that no real platform can use it, I would
>> rather not merge it at all at this stage).
> 
> I agree here.
> 
> I would recommend creating a separate branch for that living outside of the
> mainline kernel and merging it when there are real users.

Real users will be coming soon: we have already tested this patch set on real
hardware (the ARM64 Juno platform), and I think ARM64 server chips and platforms
will show up before 3.18 is released.

For this patch set, DT is the first-class citizen for now:

a) We can always turn CONFIG_ACPI off in Kconfig and use DT only;

b) Even if we set CONFIG_ACPI=y, we can still use DT as normal:

  - Pass a DT blob without (valid) ACPI tables (just as we boot the kernel now):
    ACPI will be disabled at a very early stage and the FDT will still be
    unflattened, so DT booting will not break.

  - Pass acpi=off to disable ACPI and use DT even if we got valid
    ACPI tables (in the v1 patch set);

So it is safe for people who want to use DT, and it does not change any behavior
of DT booting except for some extra if (acpi_disabled) tests.

Thanks
Hanjun
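
The selection rules listed in this message can be summarised as a small
decision function. This is a user-space sketch under stated assumptions, not
kernel code: the struct, enum and function names are hypothetical, and it
models only the behaviour described for the v1 patch set, not the stricter
"ACPI off by default" policy argued for earlier in the thread.

```c
#include <stdbool.h>

/* Which hardware description the kernel ends up using. */
enum hw_desc { HW_DESC_DT, HW_DESC_ACPI };

/* Hypothetical summary of the boot-time inputs named in the message. */
struct boot_inputs {
    bool config_acpi;        /* kernel built with CONFIG_ACPI=y */
    bool cmdline_acpi_off;   /* "acpi=off" passed on the command line */
    bool valid_acpi_tables;  /* firmware handed over usable ACPI tables */
};

/* Case (a), then the two sub-cases of (b); ACPI is used only if all pass. */
static enum hw_desc choose_hw_desc(const struct boot_inputs *in)
{
    if (!in->config_acpi)
        return HW_DESC_DT;      /* (a) ACPI support not built in */
    if (in->cmdline_acpi_off)
        return HW_DESC_DT;      /* (b) explicitly disabled by the user */
    if (!in->valid_acpi_tables)
        return HW_DESC_DT;      /* (b) no usable tables: fall back to the FDT */
    return HW_DESC_ACPI;
}
```

Every early-out lands on DT, which is the sense in which DT booting is left
undisturbed.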


Re: linux-next: manual merge of the tip tree with the pci-current tree

2014-08-13 Thread Stephen Rothwell
Hi all,

On Thu, 14 Aug 2014 13:10:31 +1000 Stephen Rothwell wrote:
>
> I fixed it up (I used the latter version of this file and applied the
> following merge fix patch) and can carry the fix as necessary (no
> action is required).

Apart from making sure Linus is informed when these things hit his tree ...

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH] ext4: include journal blocks of internal journal in df overhead calcs

2014-08-13 Thread Eric Sandeen
On 8/13/14, 6:37 AM, Chin-Tsung Cheng wrote:
> The journal blocks of external journal device should not
> be counted as overhead.
> 
> Signed-off-by: Chin-Tsung Cheng 

Yep, I added this and didn't consider external journals, oops.  

Agree with Darrick that whitespace (and parens) aren't ideal...

Is this a shorter test?

if (sbi->s_journal && !sbi->journal_bdev) {
overhead += EXT4_NUM_B2C(sbi, sbi->s_journal->j_maxlen);


*sbi gets kzalloced and I *think* journal_bdev is only filled in for
external journals...

ext3_statfs probably needs the same treatment, it unconditionally does:


/* Add the journal blocks as well */
overhead += sbi->s_journal->j_maxlen;

-Eric

> ---
>  fs/ext4/super.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 32b43ad..03b2f62 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3316,9 +3316,10 @@ int ext4_calculate_overhead(struct super_block *sb)
>   memset(buf, 0, PAGE_SIZE);
>   cond_resched();
>   }
> - /* Add the journal blocks as well */
> - if (sbi->s_journal)
> - overhead += EXT4_NUM_B2C(sbi, sbi->s_journal->j_maxlen);
> + /* Add the internal journal blocks as well */
> + if ((sbi->s_journal) &&
> + (sbi->s_journal->j_fs_dev == sbi->s_journal->j_dev))
> + overhead += EXT4_NUM_B2C(sbi, sbi->s_journal->j_maxlen);
>  
>   sbi->s_overhead = overhead;
>   smp_wmb();
> 
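
The shorter test suggested in this reply can be tried out in isolation with
mock types. Everything prefixed mock_ below is a stand-in for the real ext4
structures, and the sketch relies on the same unverified assumption made in
the reply, namely that journal_bdev is only set for external journals.

```c
#include <stddef.h>

/* Stand-ins for the kernel structures; only the fields used here. */
struct mock_journal {
    unsigned long j_maxlen;        /* journal length in blocks */
};

struct mock_sb_info {
    struct mock_journal *s_journal; /* NULL when there is no journal */
    void *journal_bdev;             /* assumed non-NULL only if external */
    unsigned int s_cluster_bits;    /* log2(blocks per cluster) */
};

/* Round a block count up to whole clusters, in the spirit of EXT4_NUM_B2C. */
static unsigned long blocks_to_clusters(const struct mock_sb_info *sbi,
                                        unsigned long blks)
{
    unsigned long ratio = 1UL << sbi->s_cluster_bits;

    return (blks + ratio - 1) >> sbi->s_cluster_bits;
}

/* Journal contribution to the overhead: internal journals only. */
static unsigned long journal_overhead(const struct mock_sb_info *sbi)
{
    if (sbi->s_journal && !sbi->journal_bdev)
        return blocks_to_clusters(sbi, sbi->s_journal->j_maxlen);
    return 0;
}
```

An external journal then adds nothing to the overhead, and a filesystem
without a journal is handled by the same NULL check.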



linux-next: manual merge of the tip tree with the pci-current tree

2014-08-13 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tip tree got a conflict in
arch/x86/kernel/cpu/perf_event_intel_uncore.c between commit
9baa3c34ac4e ("PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use") from the
pci-current tree and commits 92807ffdf32c ("perf/x86/uncore: Move
NHM/SNB/IVB specific code to seperate file") and 8268fdfc45b7
("perf/x86/uncore: Move SNB/IVB-EP specific code to seperate file")
from the tip tree.

I fixed it up (I used the latter version of this file and applied the
following merge fix patch) and can carry the fix as necessary (no
action is required).

From: Stephen Rothwell 
Date: Thu, 14 Aug 2014 13:07:52 +1000
Subject: [PATCH] perf/x86/uncore: fix for DEFINE_PCI_DEVICE_TABLE removal

Signed-off-by: Stephen Rothwell 
---
 arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c   | 6 +++---
 arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
index 6e7811f3ea73..e0e934c8ee77 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
@@ -447,7 +447,7 @@ static struct intel_uncore_type *snb_pci_uncores[] = {
NULL,
 };
 
-static DEFINE_PCI_DEVICE_TABLE(snb_uncore_pci_ids) = {
+static const struct pci_device_id snb_uncore_pci_ids[] = {
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SNB_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
@@ -455,7 +455,7 @@ static DEFINE_PCI_DEVICE_TABLE(snb_uncore_pci_ids) = {
{ /* end: all zeroes */ },
 };
 
-static DEFINE_PCI_DEVICE_TABLE(ivb_uncore_pci_ids) = {
+static const struct pci_device_id ivb_uncore_pci_ids[] = {
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
@@ -463,7 +463,7 @@ static DEFINE_PCI_DEVICE_TABLE(ivb_uncore_pci_ids) = {
{ /* end: all zeroes */ },
 };
 
-static DEFINE_PCI_DEVICE_TABLE(hsw_uncore_pci_ids) = {
+static const struct pci_device_id hsw_uncore_pci_ids[] = {
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_HSW_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c b/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
index d3e9c55d984a..6606ed05d311 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
@@ -960,7 +960,7 @@ static struct intel_uncore_type *snbep_pci_uncores[] = {
NULL,
 };
 
-static DEFINE_PCI_DEVICE_TABLE(snbep_uncore_pci_ids) = {
+static const struct pci_device_id snbep_uncore_pci_ids[] = {
{ /* Home Agent */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_HA),
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_HA, 0),
@@ -1542,7 +1542,7 @@ static struct intel_uncore_type *ivbep_pci_uncores[] = {
NULL,
 };
 
-static DEFINE_PCI_DEVICE_TABLE(ivbep_uncore_pci_ids) = {
+static const struct pci_device_id ivbep_uncore_pci_ids[] = {
{ /* Home Agent 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe30),
.driver_data = UNCORE_PCI_DEV_DATA(IVBEP_PCI_UNCORE_HA, 0),
-- 
2.1.0.rc1

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
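
The fix-up above is a purely textual substitution. The sketch below shows why,
using a mock struct and a mock macro; it assumes the removed kernel macro
simply expanded to a const array declaration, and the IDs are placeholder
values rather than real Intel device IDs.

```c
#include <stddef.h>

/* Stand-in for struct pci_device_id with only the fields needed here. */
struct mock_pci_device_id {
    unsigned int vendor;
    unsigned int device;
    unsigned long driver_data;
};

/* Assumed shape of the removed helper macro (mock version). */
#define MOCK_DEFINE_PCI_DEVICE_TABLE(_table) \
    const struct mock_pci_device_id _table[]

/* Old spelling, as on the "-" lines of the patch. */
static MOCK_DEFINE_PCI_DEVICE_TABLE(old_style_ids) = {
    { 0x1234, 0x5678, 0 },      /* placeholder IDs */
    { 0, 0, 0 },                /* end: all zeroes */
};

/* New spelling, as on the "+" lines of the patch. */
static const struct mock_pci_device_id new_style_ids[] = {
    { 0x1234, 0x5678, 0 },      /* placeholder IDs */
    { 0, 0, 0 },                /* end: all zeroes */
};

/* Count entries up to the all-zero terminator. */
static size_t count_ids(const struct mock_pci_device_id *ids)
{
    size_t n = 0;

    while (ids[n].vendor || ids[n].device)
        n++;
    return n;
}
```

Both tables have the same type and size, so code that walks them is unaffected
by the change in spelling.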




RE: [f2fs-dev] [PATCH 08/13] f2fs: do checkpoint at f2fs_put_super

2014-08-13 Thread Chao Yu
Hi Jaegeuk,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Wednesday, August 13, 2014 3:49 AM
> To: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 08/13] f2fs: do checkpoint at f2fs_put_super
> 
> The generic_shutdown_super calls sync_filesystem, evict_inode, and then
> f2fs_put_super. In f2fs_evict_inode, we remain some dirty inode information
> so we should release them at f2fs_put_super.
> 
> But normally, it's more reasonable to set its superblock as dirty when
> evict_inode is called.

After applying this patch, when we mount and then umount an f2fs image without
modifying it, we will write a checkpoint in f2fs_put_super even though there is
no dirty data left to sync, which is unnecessary.

The "some dirty inode information" you mentioned is the inode information in
->ino_root of sbi, right?

Regards,
Yu

> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/inode.c | 1 +
>  fs/f2fs/super.c | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> index 2c3..eeaf5aa 100644
> --- a/fs/f2fs/inode.c
> +++ b/fs/f2fs/inode.c
> @@ -303,6 +303,7 @@ no_delete:
>   add_dirty_inode(sbi, inode->i_ino, APPEND_INO);
>   if (is_inode_flag_set(F2FS_I(inode), FI_UPDATE_WRITE))
>   add_dirty_inode(sbi, inode->i_ino, UPDATE_INO);
> + F2FS_SET_SB_DIRT(sbi);
>  out_clear:
>   clear_inode(inode);
>  }
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 633315a..60e3554 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -432,7 +432,7 @@ static void f2fs_put_super(struct super_block *sb)
>   stop_gc_thread(sbi);
> 
>   /* We don't need to do checkpoint when it's clean */
> - if (sbi->s_dirty && get_pages(sbi, F2FS_DIRTY_NODES))
> + if (sbi->s_dirty)
>   write_checkpoint(sbi, true);
> 
>   iput(sbi->node_inode);
> --
> 1.8.5.2 (Apple Git-48)
> 
> 
> --
> ___
> Linux-f2fs-devel mailing list
> linux-f2fs-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel



Re: [PATCH 1/2 V4] irqchip: gic: Add supports for ARM GICv2m MSI(-X)

2014-08-13 Thread Jingoo Han
On Thursday, August 14, 2014 12:01 AM, Suravee Suthikulpanit wrote:
> 
> From: Suravee Suthikulpanit 
> 
> ARM GICv2m specification extends GICv2 to support MSI(-X) with
> a new set of register frame. This patch introduces support for
> the non-secure GICv2m register frame. Currently, GICV2m is available
> in certain version of GIC-400.
> 
> The patch introduces a new property in ARM gic binding, the v2m subnode.
> It is optional.

Hi Suravee Suthikulpanit,

I added some minor comments.

> 
> Signed-off-by: Suravee Suthikulpanit 
> Cc: Mark Rutland 
> Cc: Marc Zyngier 
> Cc: Jason Cooper 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> ---
>  Documentation/devicetree/bindings/arm/gic.txt |  32 
>  drivers/irqchip/Kconfig   |   7 +
>  drivers/irqchip/Makefile  |   1 +
>  drivers/irqchip/irq-gic-v2m.c | 215 ++
>  drivers/irqchip/irq-gic.c |  75 +
>  drivers/irqchip/irq-gic.h |  48 ++
>  6 files changed, 348 insertions(+), 30 deletions(-)
>  create mode 100644 drivers/irqchip/irq-gic-v2m.c
>  create mode 100644 drivers/irqchip/irq-gic.h
> 
> diff --git a/Documentation/devicetree/bindings/arm/gic.txt
> b/Documentation/devicetree/bindings/arm/gic.txt
> index 5573c08..8a64179 100644
> --- a/Documentation/devicetree/bindings/arm/gic.txt
> +++ b/Documentation/devicetree/bindings/arm/gic.txt
> @@ -95,3 +95,35 @@ Example:
> <0x2c006000 0x2000>;
>   interrupts = <1 9 0xf04>;
>   };
> +
> +

Please remove the unnecessary line.

> +* GICv2m extension for MSI/MSI-x support (Optional)
> +
> +Certain revision of GIC-400 supports MSI/MSI-x via V2M register frame.
> +This is enabled by specifying v2m sub-node.
> +
> +Required properties:
> +
> +- msi-controller : Identifies the node as an MSI controller.
> +
> +- reg : GICv2m MSI interface register base and size
> +
> +Example:
> +
> + interrupt-controller@e1101000 {
> + compatible = "arm,gic-400";
> + #interrupt-cells = <3>;
> + #address-cells = <2>;
> + #size-cells = <2>;
> + interrupt-controller;
> + interrupts = <1 8 0xf04>;
> + ranges = <0 0 0 0xe110 0 0x10>;
> + reg = <0x0 0xe111 0 0x01000>,
> +   <0x0 0xe112f000 0 0x02000>,
> +   <0x0 0xe114 0 0x1>,
> +   <0x0 0xe116 0 0x1>;
> + v2m {
> + msi-controller;
> + reg = <0x0 0x8 0 0x1000>;
> + };
> + };
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 4e230e7..9aa5edc 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -7,6 +7,13 @@ config ARM_GIC
>   select IRQ_DOMAIN
>   select MULTI_IRQ_HANDLER
> 
> +config ARM_GIC_V2M
> + bool
> + select IRQ_DOMAIN
> + select MULTI_IRQ_HANDLER
> + depends on ARM_GIC
> + depends on PCI && PCI_MSI
> +
>  config GIC_NON_BANKED
>   bool
> 
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 73052ba..3bda951 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -17,6 +17,7 @@ obj-$(CONFIG_ARCH_SUNXI)+= irq-sun4i.o
>  obj-$(CONFIG_ARCH_SUNXI) += irq-sunxi-nmi.o
>  obj-$(CONFIG_ARCH_SPEAR3XX)  += spear-shirq.o
>  obj-$(CONFIG_ARM_GIC)+= irq-gic.o irq-gic-common.o
> +obj-$(CONFIG_ARM_GIC_V2M)+= irq-gic-v2m.o
>  obj-$(CONFIG_ARM_GIC_V3) += irq-gic-v3.o irq-gic-common.o
>  obj-$(CONFIG_ARM_NVIC)   += irq-nvic.o
>  obj-$(CONFIG_ARM_VIC)+= irq-vic.o
> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> new file mode 100644
> index 000..1ac0ace
> --- /dev/null
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -0,0 +1,215 @@
> +/*
> + * ARM GIC v2m MSI(-X) support
> + * Support for Message Signalelled Interrupts for systems that

s/Signalelled/Signaled

> + * implement ARM Generic Interrupt Controller: GICv2m.
> + *
> + * Copyright (C) 2014 Advanced Micro Devices, Inc.
> + * Authors: Suravee Suthikulpanit 
> + *  Harish Kasiviswanathan 
> + *  Brandon Anderson 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Please re-order these headers alphabetically.
It improves readability.

> +
> +#include "irqchip.h"
> +#include "irq-gic.h"
> +
> +/*
> +* MSI_TYPER:
> +* [31:26] Reserved
> +* [25:16] lowest SPI assigned to MSI
> +* [15:10] Reserved
> +* [9:0]   Numer of SPIs assigned to MSI
> +*/
> 
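
For reference, the MSI_TYPER layout described in the comment above can be
decoded with a shift and two masks. This is only a minimal sketch, not code
from the patch; the macro and function names are hypothetical, and only the
bit positions come from the quoted comment.

```c
#include <stdint.h>

/* Hypothetical names; the field positions follow the MSI_TYPER comment. */
#define V2M_TYPER_BASE_SHIFT	16
#define V2M_TYPER_BASE_MASK	0x3ffu	/* bits [25:16], 10 bits wide */
#define V2M_TYPER_NUM_MASK	0x3ffu	/* bits [9:0], 10 bits wide */

/* Lowest SPI assigned to MSI. */
static inline uint32_t v2m_typer_base_spi(uint32_t typer)
{
	return (typer >> V2M_TYPER_BASE_SHIFT) & V2M_TYPER_BASE_MASK;
}

/* Number of SPIs assigned to MSI. */
static inline uint32_t v2m_typer_num_spi(uint32_t typer)
{
	return typer & V2M_TYPER_NUM_MASK;
}
```

Masking both fields means the reserved bits are ignored whatever the
hardware reports in them.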

Re: [PATCH v14 3/8] sparc: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread David Miller
From: Minchan Kim 
Date: Thu, 14 Aug 2014 10:53:27 +0900

> MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent
> overwrite of the contents since MADV_FREE syscall is called for
> THP page.
> 
> This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
> support.
> 
> Cc: "David S. Miller" 
> Cc: sparcli...@vger.kernel.org
> Signed-off-by: Minchan Kim 

Acked-by: David S. Miller 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH v2 RESEND 04/23] bfa: Use pci_enable_msix_exact() instead of pci_enable_msix()

2014-08-13 Thread Anil Gurumurthy
Hi Alexander,
  I believe I acked the series already; if not, I am acking all the bfa patches
in this series.

Thanks,
Anil

-----Original Message-----
From: Alexander Gordeev [mailto:agord...@redhat.com] 
Sent: 12 August 2014 16:20
To: Anil Gurumurthy
Cc: linux-kernel; Anil Gurumurthy; Vijaya Mohan Guvva; linux-scsi; linux-pci; 
Sudarsana Kalluru
Subject: Re: [PATCH v2 RESEND 04/23] bfa: Use pci_enable_msix_exact() instead 
of pci_enable_msix()

On Mon, Aug 11, 2014 at 11:02:56AM +, Anil Gurumurthy wrote:
> Acked-by: Anil Gurumurthy 

Many thanks, Anil!

Does your Ack apply to this patch only, or to all three 'bfa' patches in this
series?

Thanks!

> -----Original Message-----
> From: Alexander Gordeev [mailto:agord...@redhat.com]
> Sent: 11 August 2014 13:09
> To: linux-kernel
> Cc: Anil Gurumurthy; Vijaya Mohan Guvva; linux-scsi; linux-pci; Anil 
> Gurumurthy; Sudarsana Kalluru
> Subject: Re: [PATCH v2 RESEND 04/23] bfa: Use pci_enable_msix_exact() 
> instead of pci_enable_msix()
> 
> On Wed, Jul 16, 2014 at 08:05:08PM +0200, Alexander Gordeev wrote:
> > As result of deprecation of MSI-X/MSI enablement functions
> > pci_enable_msix() and pci_enable_msi_block() all drivers using these 
> > two interfaces need to be updated to use the new
> > pci_enable_msi_range()  or pci_enable_msi_exact() and
> > pci_enable_msix_range() or pci_enable_msix_exact() interfaces.
> 
> Anil, Sudarsana,
> 
> Could you please review bfa patches in this series?
> 
> Thanks!
> 
> > Signed-off-by: Alexander Gordeev 
> > Cc: Anil Gurumurthy 
> > Cc: Vijaya Mohan Guvva 
> > Cc: linux-s...@vger.kernel.org
> > Cc: linux-...@vger.kernel.org
> > Acked-by: Anil Gurumurthy 
> > ---
> >  drivers/scsi/bfa/bfad.c |   20 ++--
> >  1 files changed, 6 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/scsi/bfa/bfad.c b/drivers/scsi/bfa/bfad.c index
> > c18279f..e90a374 100644
> > --- a/drivers/scsi/bfa/bfad.c
> > +++ b/drivers/scsi/bfa/bfad.c
> > @@ -1234,29 +1234,21 @@ bfad_setup_intr(struct bfad_s *bfad)
> > if ((bfa_asic_id_ctc(pdev->device) && !msix_disable_ct) ||
> >(bfa_asic_id_cb(pdev->device) && !msix_disable_cb)) {
> >  
> > -   error = pci_enable_msix(bfad->pcidev, msix_entries, bfad->nvec);
> > +   error = pci_enable_msix_exact(bfad->pcidev,
> > + msix_entries, bfad->nvec);
> > /* In CT1 & CT2, try to allocate just one vector */
> > -   if (error > 0 && bfa_asic_id_ctc(pdev->device)) {
> > +   if (error == -ENOSPC && bfa_asic_id_ctc(pdev->device)) {
> > printk(KERN_WARNING "bfa %s: trying one msix "
> >"vector failed to allocate %d[%d]\n",
> >bfad->pci_name, bfad->nvec, error);
> > bfad->nvec = 1;
> > -   error = pci_enable_msix(bfad->pcidev,
> > -   msix_entries, bfad->nvec);
> > +   error = pci_enable_msix_exact(bfad->pcidev,
> > + msix_entries, 1);
> > }
> >  
> > -   /*
> > -* Only error number of vector is available.
> > -* We don't have a mechanism to map multiple
> > -* interrupts into one vector, so even if we
> > -* can try to request less vectors, we don't
> > -* know how to associate interrupt events to
> > -*  vectors. Linux doesn't duplicate vectors
> > -* in the MSIX table for this case.
> > -*/
> > if (error) {
> > printk(KERN_WARNING "bfad%d: "
> > -  "pci_enable_msix failed (%d), "
> > +  "pci_enable_msix_exact failed (%d), "
> >"use line based.\n",
> > bfad->inst_no, error);
> > goto line_based;
> > --
> > 1.7.7.6
> > 

--
Regards,
Alexander Gordeev
agord...@redhat.com


Re: [PATCH to be tested] serial: msm_serial: add missing sysrq handling

2014-08-13 Thread Frank Rowand
On 8/13/2014 7:33 PM, Frank Rowand wrote:
> On 8/12/2014 5:23 PM, Stephen Boyd wrote:
>> On 08/06/14 17:16, Frank Rowand wrote:

< snip >
 
> The patches you sent are a little hard to read since they modify further code
> that my patch modified.  So I have redone your patches, as if my patch was
> not previously applied.  Hopefully I did not make any mistakes there.  I will
> reply to this email with each of your redone patches.

Stephen's patch alternative number 2:

---
 drivers/tty/serial/msm_serial.c |   41 +++-
 1 file changed, 28 insertions(+), 13 deletions(-)

Index: b/drivers/tty/serial/msm_serial.c
===
--- a/drivers/tty/serial/msm_serial.c
+++ b/drivers/tty/serial/msm_serial.c
@@ -125,25 +125,40 @@ static void handle_rx_dm(struct uart_por
port->icount.rx += count;
 
while (count > 0) {
-   unsigned int c;
+   unsigned char buf[4];
+   int sysrq, r_count, i;
 
sr = msm_read(port, UART_SR);
if ((sr & UART_SR_RX_READY) == 0) {
msm_port->old_snap_state -= count;
break;
}
-   c = msm_read(port, UARTDM_RF);
-   if (sr & UART_SR_RX_BREAK) {
-   port->icount.brk++;
-   if (uart_handle_break(port))
-   continue;
-   } else if (sr & UART_SR_PAR_FRAME_ERR)
-   port->icount.frame++;
-
-   /* TODO: handle sysrq */
-   tty_insert_flip_string(tport, (char *)&c,
-  (count > 4) ? 4 : count);
-   count -= 4;
+   readsl(port->membase + UARTDM_RF, buf, 1);
+
+   r_count = min_t(int, count, sizeof(buf));
+
+   for (i = 0; i < r_count; i++) {
+   char flag = TTY_NORMAL;
+
+   if (sr & UART_SR_RX_BREAK) {
+   if (buf[i] == 0) {
+   port->icount.brk++;
+   flag = TTY_BREAK;
+   if (uart_handle_break(port))
+   continue;
+   }
+   }
+
+   if (!(port->read_status_mask & UART_SR_RX_BREAK))
+   flag = TTY_NORMAL;
+
+   spin_unlock(&port->lock);
+   sysrq = uart_handle_sysrq_char(port, buf[i]);
+   spin_lock(&port->lock);
+   if (!sysrq)
+   tty_insert_flip_char(tport, buf[i], flag);
+   }
+   count -= r_count;
}
 
spin_unlock(&port->lock);


Re: [PATCH to be tested] serial: msm_serial: add missing sysrq handling

2014-08-13 Thread Frank Rowand
On 8/13/2014 7:33 PM, Frank Rowand wrote:
> On 8/12/2014 5:23 PM, Stephen Boyd wrote:
>> On 08/06/14 17:16, Frank Rowand wrote:

< snip >

> The patches you sent are a little hard to read since they modify further code
> that my patch modified.  So I have redone your patches, as if my patch was
> not previously applied.  Hopefully I did not make any mistakes there.  I will
> reply to this email with each of your redone patches.

< snip >

Stephen's patch alternative number 1:


---
 drivers/tty/serial/msm_serial.c |   44 
 drivers/tty/serial/msm_serial.h |8 +++
 2 files changed, 39 insertions(+), 13 deletions(-)

Index: b/drivers/tty/serial/msm_serial.c
===
--- a/drivers/tty/serial/msm_serial.c
+++ b/drivers/tty/serial/msm_serial.c
@@ -125,25 +125,28 @@ static void handle_rx_dm(struct uart_por
port->icount.rx += count;
 
while (count > 0) {
-   unsigned int c;
+   unsigned char buf[4];
+   unsigned char *p = buf;
+   int sysrq, r_count;
 
sr = msm_read(port, UART_SR);
if ((sr & UART_SR_RX_READY) == 0) {
msm_port->old_snap_state -= count;
break;
}
-   c = msm_read(port, UARTDM_RF);
-   if (sr & UART_SR_RX_BREAK) {
-   port->icount.brk++;
-   if (uart_handle_break(port))
-   continue;
-   } else if (sr & UART_SR_PAR_FRAME_ERR)
-   port->icount.frame++;
+   readsl(port->membase + UARTDM_RF, p, 1);
 
-   /* TODO: handle sysrq */
-   tty_insert_flip_string(tport, (char *)&c,
-  (count > 4) ? 4 : count);
-   count -= 4;
+   spin_unlock(&port->lock);
+   sysrq = uart_handle_sysrq_char(port, buf[0]);
+   spin_lock(&port->lock);
+   if (sysrq) {
+   p++;
+   count--;
+   }
+   r_count = min_t(int, count, sizeof(buf) - sysrq);
+   if (r_count)
+   tty_insert_flip_string(tport, p, r_count);
+   count -= r_count;
}
 
spin_unlock(&port->lock);
@@ -285,6 +288,17 @@ static irqreturn_t msm_irq(int irq, void
misr = msm_read(port, UART_MISR);
msm_write(port, 0, UART_IMR); /* disable interrupt */
 
+   if (misr & UART_IMR_RXBREAK_END) {
+   uart_handle_break(port);
+   port->icount.brk++;
+   msm_write(port, UART_CR_CMD_RESET_RXBREAK_END, UART_CR);
+   }
+
+   if (misr & UART_IMR_PAR_FRAME_ERR) {
+   port->icount.frame++;
+   msm_write(port, UART_CR_CMD_RESET_PAR_FRAME_ERR, UART_CR);
+   }
+
if (misr & (UART_IMR_RXLEV | UART_IMR_RXSTALE)) {
if (msm_port->is_uartdm)
handle_rx_dm(port, misr);
@@ -491,7 +505,8 @@ static int msm_startup(struct uart_port
 
/* turn on RX and CTS interrupts */
msm_port->imr = UART_IMR_RXLEV | UART_IMR_RXSTALE |
-   UART_IMR_CURRENT_CTS;
+   UART_IMR_CURRENT_CTS | UART_IMR_RXBREAK_END |
+   UART_IMR_PAR_FRAME_ERR;
 
if (msm_port->is_uartdm) {
msm_write(port, 0xFF, UARTDM_DMRX);
@@ -566,6 +581,9 @@ static void msm_set_termios(struct uart_
else
mr |= UART_MR2_STOP_BIT_LEN_ONE;
 
+   mr |= UART_MR2_RX_BREAK_ZERO_CHAR_OFF;
+   mr |= UART_MR2_RX_ERROR_CHAR_OFF;
+
/* set parity, bits per char, and stop bit */
msm_write(port, mr, UART_MR2);
 
Index: b/drivers/tty/serial/msm_serial.h
===
--- a/drivers/tty/serial/msm_serial.h
+++ b/drivers/tty/serial/msm_serial.h
@@ -24,6 +24,8 @@
 #define UART_MR1_CTS_CTL   (1 << 6)
 
 #define UART_MR2   0x0004
+#define UART_MR2_RX_ERROR_CHAR_OFF (1 << 9)
+#define UART_MR2_RX_BREAK_ZERO_CHAR_OFF(1 << 8)
 #define UART_MR2_ERROR_MODE(1 << 6)
 #define UART_MR2_BITS_PER_CHAR 0x30
 #define UART_MR2_BITS_PER_CHAR_5   (0x0 << 4)
@@ -65,6 +67,9 @@
 #define UART_CR_TX_ENABLE  (1 << 2)
 #define UART_CR_RX_DISABLE (1 << 1)
 #define UART_CR_RX_ENABLE  (1 << 0)
+#define UART_CR_CMD_RESET_RXBREAK_START((1 << 11) | (2 << 4))
+#define UART_CR_CMD_RESET_RXBREAK_END  ((1 << 11) | (3 << 4))
+#define UART_CR_CMD_RESET_PAR_FRAME_ERR((1 << 11) | (4 << 4))
 
 #define UART_IMR   0x0014
 #define UART_IMR_TXLEV (1 << 0)
@@ -72,6 +77,9 @@
 #define UART_IMR_RXLEV (1 << 4)
 #define UART_IMR_DELTA_CTS (1 << 5)
 #define UART_IMR_CURRENT_CTS   (1 << 6)
+#define 

Re: [PATCH v5 0/5] random,x86,kvm: Rework arch RNG seeds and get some from kvm

2014-08-13 Thread H. Peter Anvin
On 08/13/2014 11:44 AM, H. Peter Anvin wrote:
> On 08/13/2014 11:33 AM, Andy Lutomirski wrote:
>>
>> As for doing arch_random_init after clone/migration, I think we'll
>> need another KVM extension for that, since, AFAIK, we don't actually
>> get notified that we were cloned or migrated.  That will be
>> nontrivial.  Maybe we can figure that out at KS, too.
>>
> 
> We don't need a reset when migrated (although it might be a good idea
> under some circumstances, i.e. if the pools might somehow have gotten
> exposed) but definitely when cloned.
> 

But yes, we need a notification.  For obvious reasons there is no
suspend event (one can snapshot a running VM) but we need to be notified
upon wakeup, *or* we need to give KVM a way to update the necessary state.

-hpa




Re: [PATCH to be tested] serial: msm_serial: add missing sysrq handling

2014-08-13 Thread Frank Rowand
On 8/12/2014 5:23 PM, Stephen Boyd wrote:
> On 08/06/14 17:16, Frank Rowand wrote:
>> Stephen,
>>
>> Can you test this patch on v 1.3 hardware?  It works on my v 1.4.
>>
>> If you use kdmx2, the way to send a break is '~B'.  The previous
>> key pressed must be  for the '~' escape to be recognized.
>>
>> Thanks!
>>
>> -Frank
>>
>>
>>
>> From: Frank Rowand 
>>
>> Add missing sysrq handling to msm_serial.
>>
>> Signed-off-by: Frank Rowand 
>>
>> ---
> 
> It works but I have questions.
> 
>>  drivers/tty/serial/msm_serial.c |   26 +-
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> Index: b/drivers/tty/serial/msm_serial.c
>> ===
>> --- a/drivers/tty/serial/msm_serial.c
>> +++ b/drivers/tty/serial/msm_serial.c
>> @@ -126,6 +126,8 @@ static void handle_rx_dm(struct uart_por
>>  
>>  while (count > 0) {
>>  unsigned int c;
>> +unsigned char *cp;
>> +int res;
>>  
>>  sr = msm_read(port, UART_SR);
>>  if ((sr & UART_SR_RX_READY) == 0) {
>> @@ -135,15 +137,29 @@ static void handle_rx_dm(struct uart_por
>>  c = msm_read(port, UARTDM_RF);
>>  if (sr & UART_SR_RX_BREAK) {
>>  port->icount.brk++;
>> -if (uart_handle_break(port))
>> +if (uart_handle_break(port)) {
>> +count -= 1;
>>  continue;
>> +}
> 
> This looks wrong. If it's a break then I think the fifo takes in a break
> character indicated by all zeros. We could possibly have 3 other
> characters after it in the fifo, or maybe 2 characters in front of it,
> or it could be 30 characters in. We can change this behavior by setting
> a bit in the MR2 register so that the all zero character doesn't enter
> the fifo. The same goes for the parity and frame error conditions, we
> can drop those characters too.
> 
> I asked the designers how we're supposed to deal with a break in the
> middle of the fifo and they recommended using the start/stop rx break
> interrupts. Unfortunately, I don't think we can rely on the interrupts
> being distinct so that might not work (but see attached patch). There's
> also a break interrupt that triggers on the start and stop rx break
> events. I don't know how this is useful either because we might get two
> interrupts or we might get one.
> 
> So perhaps we need to scan through the 4 characters for a zero character
> when the SR_RX_BREAK bit it set? The second diff does that.
> 
> Add another twist with port->read_status_mask. I guess that's there to
> make it so that break characters still enter the tty layer? Is there any
> documentation on this stuff? I'm lost on what the tty layer expects.

< snip >

Yep, the whole break in the middle of a fifo is interesting.  If I understand
correctly, there is not enough information to determine where in the byte
stream the break actually occurred.  If interrupts were not disabled for any
length of time, then it was probably after all of the characters in the fifo.
But I don't like to depend on winning races.  As you noted, finding a \0 in
the fifo is likely the location in the byte stream where the break occurred.
But \0 is also valid data.
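
To make the ambiguity concrete, here is a minimal user-space sketch of the
scan for a \0 within one 4-byte fifo word.  It is not code from either
patch; the function name and the -1 return convention are assumptions made
for illustration.  A hit only marks a likely break position, since \0 can
also be ordinary data.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the first zero byte among
 * the first n bytes read from the fifo, or -1 if there is none.
 */
static int first_zero_byte(const unsigned char *buf, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (buf[i] == 0)
			return (int)i;
	}
	return -1;
}
```

A caller would only consult the result when the break status bit is set,
and would still have to accept that the zero it found may be data.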

I read through the two alternate patches that you attached and read some tty
layer stuff.  Your patches look like better code than my original code, and
also leave less cruft in the input stream when stress tested.  One stress
test that I have not attempted to create is to hold off processing the break
until after more characters have entered the fifo.

The patches you sent are a little hard to read since they modify further code
that my patch modified.  So I have redone your patches, as if my patch was
not previously applied.  Hopefully I did not make any mistakes there.  I will
reply to this email with each of your redone patches.

I do not have a strong preference between the two alternatives you provided,
without digging deeper into the tty layer, which I won't have time for in
the next week.  Both alternatives improve the break handling (leave less
cruft in the input stream when stress tested than before applying the
patches).  Both alternatives support sysrq in my testing.

If you do not want to submit either of your alternatives, I can dig into
this again in a couple weeks.

Thanks!

-Frank



[PATCH 00/13] perf: Replace strerror with strerror_r for thread-safety

2014-08-13 Thread Masami Hiramatsu
Hi,

(Sorry, please ignore the previous series; this is the complete one.)

Here is a series to get rid of the thread-unsafe strerror() from
perf tools. Of course, there may be other thread-unsafe functions,
so this is just one step forward. :)

This introduces the STRERR_BUFSIZE (=128) macro for allocating a local
buffer, but some strerror_r() calls don't use it. If there is
already a local buffer on the stack that is bigger than
STRERR_BUFSIZE, I used that as strerror_r()'s buffer.
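
As a rough illustration of the pattern (not code from the series), the
sketch below formats an error string into a caller-provided buffer. The
patches pass strerror_r()'s return value straight to "%s", which relies on
the GNU variant that returns char *; the POSIX (XSI) variant returns int
instead. The errno_str() macro and the from_xsi(), from_gnu() and
format_error() helpers are hypothetical names that cope with either variant
via C11 _Generic; only STRERR_BUFSIZE and its value come from the series.

```c
#define _POSIX_C_SOURCE 200809L	/* make strerror_r() visible in strict modes */
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define STRERR_BUFSIZE 128	/* same value as in the series */

/* XSI strerror_r() returns 0 on success and fills buf. */
static inline const char *from_xsi(int rc, const char *buf)
{
	return rc == 0 ? buf : "Unknown error";
}

/* GNU strerror_r() returns the message, which may or may not be buf. */
static inline const char *from_gnu(const char *msg, const char *buf)
{
	(void)buf;
	return msg;
}

/*
 * Hypothetical wrapper: pick the helper matching strerror_r()'s return
 * type. The _Generic controlling expression is not evaluated, so
 * strerror_r() runs once; buf is expanded twice and must be free of
 * side effects.
 */
#define errno_str(err, buf, len)					\
	_Generic(strerror_r((err), (buf), (len)),			\
		 int: from_xsi,						\
		 char *: from_gnu)(strerror_r((err), (buf), (len)), (buf))

/* Write "<what>: <reason>" into out using only stack buffers. */
static int format_error(char *out, size_t outlen, const char *what, int err)
{
	char sbuf[STRERR_BUFSIZE];

	return snprintf(out, outlen, "%s: %s", what,
			errno_str(err, sbuf, sizeof(sbuf)));
}
```

Because every caller supplies its own sbuf, two threads reporting errors at
the same time never share the static storage that strerror() may use.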

By the way, while doing this cleanup, I found the code somewhat
confusing. Besides standard (f)printf, perf currently has 3 ways to
output messages: pr_XXX, ui__XXX and the warning/error functions.
Are there any differences among those APIs? What are the expected
use cases for them?
For example, kvm_live_open_events() in builtin-kvm.c uses a plain printf
where ui__error seems to be intended, because the error output next to it
uses that. However, other parts use pr_XXX too. It seems inconsistent.

Thank you,

---

Masami Hiramatsu (13):
  perf probe: Don't use strerror if strlist__add failed
  perf: Use strerror_r instead of strerror
  perf probe: Make error messages thread-safe
  perf/util: Replace strerror with strerror_r for thread-safety
  perf top: Use strerror_r instead of strerror
  perf trace: Use strerror_r instead of strerror
  perf record: Use strerror_r instead of strerror
  perf test: Use strerror_r instead of strerror
  perf sched: Use strerror_r instead of strerror
  perf buildid-cache: Use strerror_r instead of strerror
  perf kvm: Use strerror_r instead of strerror
  perf help: Use strerror_r instead of strerror
  perf stat: Use strerror_r instead of strerror


 tools/perf/builtin-buildid-cache.c|7 +++---
 tools/perf/builtin-help.c |   20 ++---
 tools/perf/builtin-kvm.c  |7 --
 tools/perf/builtin-probe.c|5 +++-
 tools/perf/builtin-record.c   |7 +++---
 tools/perf/builtin-sched.c|4 +++
 tools/perf/builtin-stat.c |2 +-
 tools/perf/builtin-top.c  |2 +-
 tools/perf/builtin-trace.c|6 +++--
 tools/perf/perf.c |   10 ++---
 tools/perf/tests/builtin-test.c   |4 +++
 tools/perf/tests/mmap-basic.c |7 +++---
 tools/perf/tests/open-syscall-all-cpus.c  |5 +++-
 tools/perf/tests/open-syscall-tp-fields.c |7 --
 tools/perf/tests/open-syscall.c   |3 ++-
 tools/perf/tests/perf-record.c|   13 ---
 tools/perf/tests/rdpmc.c  |6 +++--
 tools/perf/tests/sw-clock.c   |6 +++--
 tools/perf/tests/task-exit.c  |6 +++--
 tools/perf/util/cloexec.c |6 +++--
 tools/perf/util/data.c|8 +--
 tools/perf/util/debug.h   |3 +++
 tools/perf/util/dso.c |8 +--
 tools/perf/util/evlist.c  |2 +-
 tools/perf/util/evsel.c   |7 --
 tools/perf/util/parse-events.c|5 +++-
 tools/perf/util/probe-event.c |   34 -
 tools/perf/util/probe-finder.c|7 --
 tools/perf/util/run-command.c |9 ++--
 tools/perf/util/util.c|5 +++-
 30 files changed, 150 insertions(+), 71 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



[PATCH 02/13] perf: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages for
thread-safety. This also introduces the STRERR_BUFSIZE macro
as the default size of the message buffer for strerror_r.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/perf.c   |   10 +++---
 tools/perf/util/debug.h |3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 2282d41..452a847 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -313,6 +313,7 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
int status;
struct stat st;
const char *prefix;
+   char sbuf[STRERR_BUFSIZE];
 
prefix = NULL;
if (p->option & RUN_SETUP)
@@ -343,7 +344,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
status = 1;
/* Check for ENOSPC and EIO errors.. */
if (fflush(stdout)) {
-   fprintf(stderr, "write failure on standard output: %s", 
strerror(errno));
+   fprintf(stderr, "write failure on standard output: %s",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
if (ferror(stdout)) {
@@ -351,7 +353,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
goto out;
}
if (fclose(stdout)) {
-   fprintf(stderr, "close failed on standard output: %s", 
strerror(errno));
+   fprintf(stderr, "close failed on standard output: %s",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
status = 0;
@@ -466,6 +469,7 @@ void pthread__unblock_sigwinch(void)
 int main(int argc, const char **argv)
 {
const char *cmd;
+   char sbuf[STRERR_BUFSIZE];
 
/* The page_size is placed in util object. */
page_size = sysconf(_SC_PAGE_SIZE);
@@ -561,7 +565,7 @@ int main(int argc, const char **argv)
}
 
fprintf(stderr, "Failed to run command '%s': %s\n",
-   cmd, strerror(errno));
+   cmd, strerror_r(errno, sbuf, sizeof(sbuf)));
 out:
return 1;
 }
diff --git a/tools/perf/util/debug.h b/tools/perf/util/debug.h
index 6944ea3..be264d6 100644
--- a/tools/perf/util/debug.h
+++ b/tools/perf/util/debug.h
@@ -3,6 +3,7 @@
 #define __PERF_DEBUG_H
 
 #include 
+#include 
 #include "event.h"
 #include "../ui/helpline.h"
 #include "../ui/progress.h"
@@ -36,6 +37,8 @@ extern int debug_ordered_events;
 #define pr_oe_time(t, fmt, ...)  pr_time_N(1, debug_ordered_events, t, 
pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_oe_time2(t, fmt, ...) pr_time_N(2, debug_ordered_events, t, 
pr_fmt(fmt), ##__VA_ARGS__)
 
+#define STRERR_BUFSIZE 128 /* For the buffer size of strerror_r */
+
 int dump_printf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
 void trace_event(union perf_event *event);
 



[PATCH 01/13] perf probe: Don't use strerror if strlist__add failed

2014-08-13 Thread Masami Hiramatsu
Since strlist__add doesn't involve any IO, the failure
reason must be ENOMEM or EINVAL. Moreover, this is just a
debug message, so we don't need to show the error string.

Also, if get_probe_trace_command_rawlist() returns NULL,
it doesn't mean the rawlist is empty; it means there was an error.
So the caller must return -ENOMEM for the error.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/probe-event.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 784ea42..66799c6 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1876,7 +1876,7 @@ static struct strlist 
*get_probe_trace_command_rawlist(int fd)
p[idx] = '\0';
ret = strlist__add(sl, buf);
if (ret < 0) {
-   pr_debug("strlist__add failed: %s\n", strerror(-ret));
+   pr_debug("strlist__add failed (%d)\n", ret);
strlist__delete(sl);
return NULL;
}
@@ -1935,7 +1935,7 @@ static int __show_perf_probe_events(int fd, bool 
is_kprobe)
 
rawlist = get_probe_trace_command_rawlist(fd);
if (!rawlist)
-   return -ENOENT;
+   return -ENOMEM;
 
strlist__for_each(ent, rawlist) {
ret = parse_probe_trace_command(ent->s, &tev);
@@ -2002,6 +2002,8 @@ static struct strlist *get_probe_trace_event_names(int 
fd, bool include_group)
 
memset(&tev, 0, sizeof(tev));
rawlist = get_probe_trace_command_rawlist(fd);
+   if (!rawlist)
+   return NULL;
sl = strlist__new(true, NULL);
strlist__for_each(ent, rawlist) {
ret = parse_probe_trace_command(ent->s, &tev);



[PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT

2014-08-13 Thread Chuansheng Liu
We found that sometimes, even after PM_QOS is set back to DEFAULT,
the CPU stays stuck at C0 for 2-3s: it does not select a new, suitable
C-state immediately after receiving the IPI.

A simplified model of the code is:
{
pm_qos_update_request(_qos, C1 - 1);
< == Here keep all cores at C0
...;
pm_qos_update_request(_qos, PM_QOS_DEFAULT_VALUE);
< == Here some cores still stuck at C0 for 2-3s
}

The reason is that when pm_qos goes back to DEFAULT, an IPI is sent to
wake up the core, but when the core is in the polling idle state, the IPI
cannot break the polling loop.

So in the IPI callback, when the idle task is currently running, we need
to forcibly set the reschedule bit to break the polling loop. For the other,
non-polling idle states, the IPI breaks them directly, and setting the
reschedule bit does them no harm either.
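
The effect can be modelled in user space as follows. This is only a sketch
under assumptions, not kernel code: need_resched_flag, smp_callback_model()
and poll_idle_model() are hypothetical stand-ins for the reschedule bit, the
IPI callback and the polling idle loop, and the spin limit exists only so
that the model terminates.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the idle task's reschedule bit (starts out false). */
static atomic_bool need_resched_flag;

/* Models the IPI callback after the fix: it raises the flag that
 * the polling loop watches. */
static void smp_callback_model(void)
{
	atomic_store(&need_resched_flag, true);
}

/*
 * Models polling idle: spin until the flag is set or max_spins is
 * reached. Returns the number of spins performed.
 */
static unsigned long poll_idle_model(unsigned long max_spins)
{
	unsigned long spins = 0;

	while (!atomic_load(&need_resched_flag) && spins < max_spins)
		spins++;
	return spins;
}
```

Without a call to smp_callback_model() the loop runs to its limit; once the
flag is set it exits at once, which corresponds to the idle loop leaving the
poll state and selecting a C-state again.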

With this fix, we saved about 30mV power on our Android platform.

Signed-off-by: Chuansheng Liu 
---
 drivers/cpuidle/cpuidle.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index ee9df5e..9e28a13 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -532,7 +532,13 @@ EXPORT_SYMBOL_GPL(cpuidle_register);
 
 static void smp_callback(void *v)
 {
-   /* we already woke the CPU up, nothing more to do */
+   /* we already woke the CPU up, and when the corresponding
+* CPU is at polling idle state, we need to set the sched
+* bit to trigger reselect the new suitable C-state, it
+* will be helpful for power.
+   */
+   if (is_idle_task(current))
+   set_tsk_need_resched(current);
 }
 
 /*
-- 
1.7.9.5



[PATCH 03/13] perf probe: Make error messages thread-safe

2014-08-13 Thread Masami Hiramatsu
To make error messages thread-safe, this replaces strerror with
strerror_r for warnings, and just shows the return value instead
of using strerror for debug messages.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-probe.c |5 -
 tools/perf/util/probe-event.c  |   28 +++-
 tools/perf/util/probe-finder.c |7 +--
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index c63fa29..347729e 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -290,8 +290,11 @@ static void cleanup_params(void)
 
 static void pr_err_with_code(const char *msg, int err)
 {
+   char sbuf[STRERR_BUFSIZE];
+
pr_err("%s", msg);
-   pr_debug(" Reason: %s (Code: %d)", strerror(-err), err);
+   pr_debug(" Reason: %s (Code: %d)",
+strerror_r(-err, sbuf, sizeof(sbuf)), err);
pr_err("\n");
 }
 
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 66799c6..e685ef4 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -565,7 +565,7 @@ static int get_real_path(const char *raw_path, const char 
*comp_dir,
 
 static int __show_one_line(FILE *fp, int l, bool skip, bool show_num)
 {
-   char buf[LINEBUF_SIZE];
+   char buf[LINEBUF_SIZE], sbuf[STRERR_BUFSIZE];
const char *color = show_num ? "" : PERF_COLOR_BLUE;
const char *prefix = NULL;
 
@@ -585,7 +585,8 @@ static int __show_one_line(FILE *fp, int l, bool skip, bool 
show_num)
return 1;
 error:
if (ferror(fp)) {
-   pr_warning("File read error: %s\n", strerror(errno));
+   pr_warning("File read error: %s\n",
+  strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
return 0;
@@ -618,6 +619,7 @@ static int __show_line_range(struct line_range *lr, const 
char *module)
FILE *fp;
int ret;
char *tmp;
+   char sbuf[STRERR_BUFSIZE];
 
/* Search a line range */
dinfo = open_debuginfo(module);
@@ -656,7 +658,7 @@ static int __show_line_range(struct line_range *lr, const 
char *module)
fp = fopen(lr->path, "r");
if (fp == NULL) {
pr_warning("Failed to open %s: %s\n", lr->path,
-  strerror(errno));
+  strerror_r(errno, sbuf, sizeof(sbuf)));
return -errno;
}
/* Skip to starting line number */
@@ -1405,8 +1407,7 @@ int synthesize_perf_probe_arg(struct perf_probe_arg *pa, 
char *buf, size_t len)
 
return tmp - buf;
 error:
-   pr_debug("Failed to synthesize perf probe argument: %s\n",
-strerror(-ret));
+   pr_debug("Failed to synthesize perf probe argument: %d\n", ret);
return ret;
 }
 
@@ -1455,8 +1456,7 @@ static char *synthesize_perf_probe_point(struct 
perf_probe_point *pp)
 
return buf;
 error:
-   pr_debug("Failed to synthesize perf probe point: %s\n",
-strerror(-ret));
+   pr_debug("Failed to synthesize perf probe point: %d\n", ret);
free(buf);
return NULL;
 }
@@ -1782,7 +1782,7 @@ static void clear_probe_trace_event(struct 
probe_trace_event *tev)
 
 static void print_open_warning(int err, bool is_kprobe)
 {
-   char sbuf[128];
+   char sbuf[STRERR_BUFSIZE];
 
if (err == -ENOENT) {
const char *config;
@@ -1812,7 +1812,7 @@ static void print_both_open_warning(int kerr, int uerr)
pr_warning("Please rebuild kernel with CONFIG_KPROBE_EVENTS "
   "or/and CONFIG_UPROBE_EVENTS.\n");
else {
-   char sbuf[128];
+   char sbuf[STRERR_BUFSIZE];
pr_warning("Failed to open kprobe events: %s.\n",
   strerror_r(-kerr, sbuf, sizeof(sbuf)));
pr_warning("Failed to open uprobe events: %s.\n",
@@ -2033,6 +2033,7 @@ static int write_probe_trace_event(int fd, struct 
probe_trace_event *tev)
 {
int ret = 0;
char *buf = synthesize_probe_trace_command(tev);
+   char sbuf[STRERR_BUFSIZE];
 
if (!buf) {
pr_debug("Failed to synthesize probe trace event.\n");
@@ -2044,7 +2045,7 @@ static int write_probe_trace_event(int fd, struct 
probe_trace_event *tev)
ret = write(fd, buf, strlen(buf));
if (ret <= 0)
pr_warning("Failed to write event: %s\n",
-  strerror(errno));
+  strerror_r(errno, sbuf, sizeof(sbuf)));
}
free(buf);
return ret;
@@ -2058,7 +2059,7 @@ static int get_new_event_name(char *buf, size_t len, 
const char *base,
/* Try no suffix */
ret = e_snprintf(buf, len, "%s", base);
if (ret < 0) {
-   pr_debug("snprintf() failed: %s\n", strerror(-ret));
+   

[PATCH 04/13] perf/util: Replace strerror with strerror_r for thread-safety

2014-08-13 Thread Masami Hiramatsu
Replace all strerror() calls with strerror_r() in util to make
the perf lib thread-safe.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/cloexec.c  |6 --
 tools/perf/util/data.c |8 ++--
 tools/perf/util/dso.c  |8 ++--
 tools/perf/util/evlist.c   |2 +-
 tools/perf/util/evsel.c|7 +--
 tools/perf/util/parse-events.c |5 -
 tools/perf/util/run-command.c  |9 +++--
 tools/perf/util/util.c |5 +++--
 8 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 4945aa5..47b78b3 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -3,6 +3,7 @@
 #include "../perf.h"
 #include "cloexec.h"
 #include "asm/bug.h"
+#include "debug.h"
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
@@ -18,6 +19,7 @@ static int perf_flag_probe(void)
int err;
int cpu;
pid_t pid = -1;
+   char sbuf[STRERR_BUFSIZE];
 
cpu = sched_getcpu();
if (cpu < 0)
@@ -42,7 +44,7 @@ static int perf_flag_probe(void)
 
WARN_ONCE(err != EINVAL && err != EBUSY,
  "perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with 
unexpected error %d (%s)\n",
- err, strerror(err));
+ err, strerror_r(err, sbuf, sizeof(sbuf)));
 
/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
@@ -50,7 +52,7 @@ static int perf_flag_probe(void)
 
if (WARN_ONCE(fd < 0 && err != EBUSY,
  "perf_event_open(..., 0) failed unexpectedly with error 
%d (%s)\n",
- err, strerror(err)))
+ err, strerror_r(err, sbuf, sizeof(sbuf))))
return -1;
 
close(fd);
diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 29d720c..1921942f 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -50,12 +50,14 @@ static int open_file_read(struct perf_data_file *file)
 {
struct stat st;
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
fd = open(file->path, O_RDONLY);
if (fd < 0) {
int err = errno;
 
-   pr_err("failed to open %s: %s", file->path, strerror(err));
+   pr_err("failed to open %s: %s", file->path,
+   strerror_r(err, sbuf, sizeof(sbuf)));
if (err == ENOENT && !strcmp(file->path, "perf.data"))
pr_err("  (try 'perf record' first)");
pr_err("\n");
@@ -88,6 +90,7 @@ static int open_file_read(struct perf_data_file *file)
 static int open_file_write(struct perf_data_file *file)
 {
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
if (check_backup(file))
return -1;
@@ -95,7 +98,8 @@ static int open_file_write(struct perf_data_file *file)
fd = open(file->path, O_CREAT|O_RDWR|O_TRUNC, S_IRUSR|S_IWUSR);
 
if (fd < 0)
-   pr_err("failed to open %s : %s\n", file->path, strerror(errno));
+   pr_err("failed to open %s : %s\n", file->path,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
 
return fd;
 }
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index bdafd30..55e39dc 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -162,13 +162,15 @@ static void close_first_dso(void);
 static int do_open(char *name)
 {
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
do {
fd = open(name, O_RDONLY);
if (fd >= 0)
return fd;
 
-   pr_debug("dso open failed, mmap: %s\n", strerror(errno));
+   pr_debug("dso open failed, mmap: %s\n",
+strerror_r(errno, sbuf, sizeof(sbuf)));
if (!dso__data_open_cnt || errno != EMFILE)
break;
 
@@ -530,10 +532,12 @@ static ssize_t cached_read(struct dso *dso, u64 offset, 
u8 *data, ssize_t size)
 static int data_file_size(struct dso *dso)
 {
struct stat st;
+   char sbuf[STRERR_BUFSIZE];
 
if (!dso->data.file_size) {
if (fstat(dso->data.fd, &st)) {
-   pr_err("dso mmap failed, fstat: %s\n", strerror(errno));
+   pr_err("dso mmap failed, fstat: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
dso->data.file_size = st.st_size;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5dcd28c..a3e28b4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1295,7 +1295,7 @@ int perf_evlist__strerror_open(struct perf_evlist *evlist 
__maybe_unused,
   int err, char *buf, size_t size)
 {
int printed, value;
-   char sbuf[128], *emsg = strerror_r(err, sbuf, sizeof(sbuf));
+   char 

[PATCH 06/13] perf trace: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error message
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-trace.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d080b9c..a9e96ff 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1750,7 +1750,7 @@ static int trace__sys_exit(struct trace *trace, struct 
perf_evsel *evsel,
 signed_print:
fprintf(trace->output, ") = %d", ret);
} else if (ret < 0 && sc->fmt->errmsg) {
-   char bf[256];
+   char bf[STRERR_BUFSIZE];
const char *emsg = strerror_r(-ret, bf, sizeof(bf)),
   *e = audit_errno_to_name(-ret);
 
@@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
int err = -1, i;
unsigned long before;
const bool forks = argc > 0;
+   char sbuf[STRERR_BUFSIZE];
 
trace->live = true;
 
@@ -2105,7 +2106,8 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
 
err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false);
if (err < 0) {
-   fprintf(trace->output, "Couldn't mmap the events: %s\n", 
strerror(errno));
+   fprintf(trace->output, "Couldn't mmap the events: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 07/13] perf record: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-record.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4db670d..87e28a4 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -161,7 +161,7 @@ try_again:
 
if (perf_evlist__apply_filters(evlist)) {
error("failed to set filter with %d (%s)\n", errno,
-   strerror(errno));
+   strerror_r(errno, msg, sizeof(msg)));
rc = -1;
goto out;
}
@@ -175,7 +175,8 @@ try_again:
   "(current value: %u)\n", opts->mmap_pages);
rc = -errno;
} else {
-   pr_err("failed to mmap with %d (%s)\n", errno, 
strerror(errno));
+   pr_err("failed to mmap with %d (%s)\n", errno,
+   strerror_r(errno, msg, sizeof(msg)));
rc = -errno;
}
goto out;
@@ -480,7 +481,7 @@ static int __cmd_record(struct record *rec, int argc, const 
char **argv)
}
 
if (forks && workload_exec_errno) {
-   char msg[512];
+   char msg[STRERR_BUFSIZE];
const char *emsg = strerror_r(workload_exec_errno, msg, 
sizeof(msg));
pr_err("Workload failed: %s\n", emsg);
err = -1;



[PATCH 08/13] perf test: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/tests/builtin-test.c   |4 +++-
 tools/perf/tests/mmap-basic.c |7 ---
 tools/perf/tests/open-syscall-all-cpus.c  |5 +++--
 tools/perf/tests/open-syscall-tp-fields.c |7 +--
 tools/perf/tests/open-syscall.c   |3 ++-
 tools/perf/tests/perf-record.c|   13 +
 tools/perf/tests/rdpmc.c  |6 --
 tools/perf/tests/sw-clock.c   |6 --
 tools/perf/tests/task-exit.c  |6 --
 9 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index c6796d2..9948136 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -185,9 +185,11 @@ static bool perf_test__matches(int curr, int argc, const 
char *argv[])
 static int run_test(struct test *test)
 {
int status, err = -1, child = fork();
+   char sbuf[STRERR_BUFSIZE];
 
if (child < 0) {
-   pr_err("failed to fork test: %s\n", strerror(errno));
+   pr_err("failed to fork test: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
 
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 1422634..9b9622a 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -31,6 +31,7 @@ int test__basic_mmap(void)
unsigned int nr_events[nsyscalls],
 expected_nr_events[nsyscalls], i, j;
struct perf_evsel *evsels[nsyscalls], *evsel;
+   char sbuf[STRERR_BUFSIZE];
 
threads = thread_map__new(-1, getpid(), UINT_MAX);
if (threads == NULL) {
@@ -49,7 +50,7 @@ int test__basic_mmap(void)
sched_setaffinity(0, sizeof(cpu_set), &cpu_set);
if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-cpus->map[0], strerror(errno));
+cpus->map[0], strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_free_cpus;
}
 
@@ -79,7 +80,7 @@ int test__basic_mmap(void)
if (perf_evsel__open(evsels[i], cpus, threads) < 0) {
pr_debug("failed to open counter: %s, "
 "tweak 
/proc/sys/kernel/perf_event_paranoid?\n",
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 
@@ -89,7 +90,7 @@ int test__basic_mmap(void)
 
if (perf_evlist__mmap(evlist, 128, true) < 0) {
pr_debug("failed to mmap events: %d (%s)\n", errno,
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 
diff --git a/tools/perf/tests/open-syscall-all-cpus.c 
b/tools/perf/tests/open-syscall-all-cpus.c
index 5fecdbd..8fa82d1 100644
--- a/tools/perf/tests/open-syscall-all-cpus.c
+++ b/tools/perf/tests/open-syscall-all-cpus.c
@@ -12,6 +12,7 @@ int test__open_syscall_event_on_all_cpus(void)
unsigned int nr_open_calls = 111, i;
cpu_set_t cpu_set;
struct thread_map *threads = thread_map__new(-1, getpid(), UINT_MAX);
+   char sbuf[STRERR_BUFSIZE];
 
if (threads == NULL) {
pr_debug("thread_map__new\n");
@@ -35,7 +36,7 @@ int test__open_syscall_event_on_all_cpus(void)
if (perf_evsel__open(evsel, cpus, threads) < 0) {
pr_debug("failed to open counter: %s, "
 "tweak /proc/sys/kernel/perf_event_paranoid?\n",
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_evsel_delete;
}
 
@@ -56,7 +57,7 @@ int test__open_syscall_event_on_all_cpus(void)
if (sched_setaffinity(0, sizeof(cpu_set), &cpu_set) < 0) {
pr_debug("sched_setaffinity() failed on CPU %d: %s ",
 cpus->map[cpu],
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_close_fd;
}
for (i = 0; i < ncalls; ++i) {
diff --git a/tools/perf/tests/open-syscall-tp-fields.c 
b/tools/perf/tests/open-syscall-tp-fields.c
index 0785b64..922bdb6 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -22,6 +22,7 @@ int test__syscall_open_tp_fields(void)
struct perf_evlist *evlist = perf_evlist__new();
struct perf_evsel *evsel;
int err = -1, i, nr_events = 0, nr_polls = 0;
+   char sbuf[STRERR_BUFSIZE];
 
if (evlist == NULL) {

Re: [RESEND 0/5] PCIe, AER: Misc cleanup

2014-08-13 Thread Chen, Gong
On Wed, Aug 13, 2014 at 07:52:45AM -0600, Bjorn Helgaas wrote:
> I haven't responded because I've been on vacation for the past three
> weeks.  If there's no change in the patches themselves, and if they
> are still in http://patchwork.ozlabs.org/project/linux-pci/list, the
> only thing reposting them does is make more work for me.
> 
There is one difference in Patch 1: I added more explanation in
the comments, as Boris suggested.




[PATCH 09/13] perf sched: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error message
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-sched.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index f5874a2..9c9287f 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -428,6 +428,7 @@ static u64 get_cpu_usage_nsec_parent(void)
 static int self_open_counters(void)
 {
struct perf_event_attr attr;
+   char sbuf[STRERR_BUFSIZE];
int fd;
 
memset(&attr, 0, sizeof(attr));
@@ -440,7 +441,8 @@ static int self_open_counters(void)
 
if (fd < 0)
pr_err("Error: sys_perf_event_open() syscall returned "
-  "with %d (%s)\n", fd, strerror(errno));
+  "with %d (%s)\n", fd,
+  strerror_r(errno, sbuf, sizeof(sbuf)));
return fd;
 }
 



Re: (ltc-kernel 9923) [PATCH 0/8] perf: Replace strerror with strerror_r for thread-safety

2014-08-13 Thread Masami Hiramatsu
Oops, this is not a complete series... I'll resend the same series.


(2014/08/14 11:20), Masami Hiramatsu wrote:
> Hi,
> 
> Here is a series to get rid of the thread-unsafe strerror() from
> perf tools. Of course, there may be other thread-unsafe functions,
> so this goes just one step forward. :)
> 
> This introduces the STRERR_BUFSIZE (=128) macro for allocating a local
> buffer, but some strerror_r() calls don't use it. If there is
> already a local buffer on the stack that is bigger than
> STRERR_BUFSIZE, I use it as strerror_r()'s buffer.
> 
> By the way, while doing this cleanup, I found some confusing spots
> in the code. Currently perf has 3 ways to output messages besides
> standard (f)printf: pr_XXX, ui__XXX and the warning/error functions.
> Are there any differences among those APIs? What are the expected
> use cases for them?
> For example, a plain printf is used in kvm_live_open_events@builtin-kvm.c,
> but it seems it should be ui__error, because the error output next to it
> uses that. However, other parts use pr_XXX too. It seems inconsistent.
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (8):
>   perf probe: Don't use strerror if strlist__add failed
>   perf: Use strerror_r instead of strerror
>   perf probe: Make error messages thread-safe
>   perf/util: Replace strerror with strerror_r for thread-safety
>   perf top: Use strerror_r instead of strerror
>   perf trace: Use strerror_r instead of strerror
>   perf record: Use strerror_r instead of strerror
>   perf test: Use strerror_r instead of strerror
> 
> 
>  tools/perf/builtin-probe.c|5 +++-
>  tools/perf/builtin-record.c   |7 +++---
>  tools/perf/builtin-top.c  |2 +-
>  tools/perf/builtin-trace.c|6 +++--
>  tools/perf/perf.c |   10 ++---
>  tools/perf/tests/builtin-test.c   |4 +++
>  tools/perf/tests/mmap-basic.c |7 +++---
>  tools/perf/tests/open-syscall-all-cpus.c  |5 +++-
>  tools/perf/tests/open-syscall-tp-fields.c |7 --
>  tools/perf/tests/open-syscall.c   |3 ++-
>  tools/perf/tests/perf-record.c|   13 ---
>  tools/perf/tests/rdpmc.c  |6 +++--
>  tools/perf/tests/sw-clock.c   |6 +++--
>  tools/perf/tests/task-exit.c  |6 +++--
>  tools/perf/util/cloexec.c |6 +++--
>  tools/perf/util/data.c|8 +--
>  tools/perf/util/debug.h   |3 +++
>  tools/perf/util/dso.c |8 +--
>  tools/perf/util/evlist.c  |2 +-
>  tools/perf/util/evsel.c   |7 --
>  tools/perf/util/parse-events.c|5 +++-
>  tools/perf/util/probe-event.c |   34 
> -
>  tools/perf/util/probe-finder.c|7 --
>  tools/perf/util/run-command.c |9 ++--
>  tools/perf/util/util.c|5 +++-
>  25 files changed, 121 insertions(+), 60 deletions(-)
> 
> --
> Masami HIRAMATSU
> Software Platform Research Dept. Linux Technology Research Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu...@hitachi.com
> 
> 


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com




[PATCH 13/13] perf stat: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error message
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-stat.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 3e80aa1..5fe0edb 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -593,7 +593,7 @@ static int __run_perf_stat(int argc, const char **argv)
 
if (perf_evlist__apply_filters(evsel_list)) {
error("failed to set filter with %d (%s)\n", errno,
-   strerror(errno));
+   strerror_r(errno, msg, sizeof(msg)));
return -1;
}
 



[PATCH 12/13] perf help: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-help.c |   20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 0384d93..25d2062 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -103,6 +103,8 @@ static int check_emacsclient_version(void)
 
 static void exec_woman_emacs(const char *path, const char *page)
 {
+   char sbuf[STRERR_BUFSIZE];
+
if (!check_emacsclient_version()) {
/* This works only with emacsclient version >= 22. */
struct strbuf man_page = STRBUF_INIT;
@@ -111,16 +113,19 @@ static void exec_woman_emacs(const char *path, const char 
*page)
path = "emacsclient";
strbuf_addf(&man_page, "(woman \"%s\")", page);
execlp(path, "emacsclient", "-e", man_page.buf, NULL);
-   warning("failed to exec '%s': %s", path, strerror(errno));
+   warning("failed to exec '%s': %s", path,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
}
 }
 
 static void exec_man_konqueror(const char *path, const char *page)
 {
const char *display = getenv("DISPLAY");
+
if (display && *display) {
struct strbuf man_page = STRBUF_INIT;
const char *filename = "kfmclient";
+   char sbuf[STRERR_BUFSIZE];
 
/* It's simpler to launch konqueror using kfmclient. */
if (path) {
@@ -139,24 +144,31 @@ static void exec_man_konqueror(const char *path, const 
char *page)
path = "kfmclient";
strbuf_addf(&man_page, "man:%s(1)", page);
execlp(path, filename, "newTab", man_page.buf, NULL);
-   warning("failed to exec '%s': %s", path, strerror(errno));
+   warning("failed to exec '%s': %s", path,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
}
 }
 
 static void exec_man_man(const char *path, const char *page)
 {
+   char sbuf[STRERR_BUFSIZE];
+
if (!path)
path = "man";
execlp(path, "man", page, NULL);
-   warning("failed to exec '%s': %s", path, strerror(errno));
+   warning("failed to exec '%s': %s", path,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
 }
 
 static void exec_man_cmd(const char *cmd, const char *page)
 {
struct strbuf shell_cmd = STRBUF_INIT;
+   char sbuf[STRERR_BUFSIZE];
+
strbuf_addf(&shell_cmd, "%s %s", cmd, page);
execl("/bin/sh", "sh", "-c", shell_cmd.buf, NULL);
-   warning("failed to exec '%s': %s", cmd, strerror(errno));
+   warning("failed to exec '%s': %s", cmd,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
 }
 
 static void add_man_viewer(const char *name)



[PATCH 11/13] perf kvm: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-kvm.c |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 14d03ed..1a4ef9c 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -990,6 +990,7 @@ static int kvm_live_open_events(struct perf_kvm_stat *kvm)
int err, rc = -1;
struct perf_evsel *pos;
struct perf_evlist *evlist = kvm->evlist;
+   char sbuf[STRERR_BUFSIZE];
 
perf_evlist__config(evlist, &kvm->opts);
 
@@ -1026,12 +1027,14 @@ static int kvm_live_open_events(struct perf_kvm_stat 
*kvm)
 
err = perf_evlist__open(evlist);
if (err < 0) {
-   printf("Couldn't create the events: %s\n", strerror(errno));
+   printf("Couldn't create the events: %s\n",
+  strerror_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
 
if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages, false) < 0) {
-   ui__error("Failed to mmap the events: %s\n", strerror(errno));
+   ui__error("Failed to mmap the events: %s\n",
+ strerror_r(errno, sbuf, sizeof(sbuf)));
perf_evlist__close(evlist);
goto out;
}



[PATCH 05/13] perf top: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error message
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-top.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 87a6615..a77ff6c 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -899,7 +899,7 @@ try_again:
 
if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
-   errno, strerror(errno));
+   errno, strerror_r(errno, msg, sizeof(msg)));
goto out_err;
}
 



[PATCH 10/13] perf buildid-cache: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-buildid-cache.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-buildid-cache.c 
b/tools/perf/builtin-buildid-cache.c
index ac5838e..7038575 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -291,6 +291,7 @@ int cmd_buildid_cache(int argc, const char **argv,
   *missing_filename = NULL,
   *update_name_list_str = NULL,
   *kcore_filename;
+   char sbuf[STRERR_BUFSIZE];
 
struct perf_data_file file = {
.mode  = PERF_DATA_MODE_READ,
@@ -347,7 +348,7 @@ int cmd_buildid_cache(int argc, const char **argv,
continue;
}
pr_warning("Couldn't add %s: %s\n",
-  pos->s, strerror(errno));
+  pos->s, strerror_r(errno, 
sbuf, sizeof(sbuf)));
}
 
strlist__delete(list);
@@ -365,7 +366,7 @@ int cmd_buildid_cache(int argc, const char **argv,
continue;
}
pr_warning("Couldn't remove %s: %s\n",
-  pos->s, strerror(errno));
+  pos->s, strerror_r(errno, 
sbuf, sizeof(sbuf)));
}
 
strlist__delete(list);
@@ -386,7 +387,7 @@ int cmd_buildid_cache(int argc, const char **argv,
continue;
}
pr_warning("Couldn't update %s: %s\n",
-  pos->s, strerror(errno));
+  pos->s, strerror_r(errno, 
sbuf, sizeof(sbuf)));
}
 
strlist__delete(list);



[PATCH 2/8] perf: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages for
thread-safety. This also introduces the STRERR_BUFSIZE macro
as the default size of the message buffer for strerror_r.
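
For readers outside the perf tree, the pattern can be sketched as below.
This is only an illustration under stated assumptions, not code from the
patch: errno_to_str(), msg_from_gnu(), msg_from_xsi() and format_errno()
are hypothetical names. The patches use the GNU strerror_r(), which
returns the message pointer; the sketch uses C11 _Generic so that it also
builds where the XSI variant (returning int) is the one declared.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* ask for strerror_r() in strict ISO C modes */
#endif
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define STRERR_BUFSIZE 128 /* same value the patch adds to util/debug.h */

/* GNU strerror_r() returns the message pointer, which may not be buf. */
static const char *msg_from_gnu(char *ret, char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return ret;
}

/* XSI strerror_r() returns 0 on success and writes the message into buf. */
static const char *msg_from_xsi(int ret, char *buf, size_t len)
{
	if (ret != 0)
		snprintf(buf, len, "Unknown error");
	return buf;
}

/*
 * Hypothetical wrapper: pick the right handler from the return type of
 * whichever strerror_r() is declared. The _Generic controlling expression
 * is not evaluated, so strerror_r() is called exactly once.
 */
#define errno_to_str(err, buf, len)                      \
	_Generic(strerror_r((err), (buf), (len)),        \
		 char *: msg_from_gnu,                   \
		 int: msg_from_xsi)(strerror_r((err), (buf), (len)), (buf), (len))

/* Hypothetical caller: the buffer lives on this thread's stack. */
static int format_errno(char *out, size_t outlen, const char *prefix, int err)
{
	char sbuf[STRERR_BUFSIZE];

	return snprintf(out, outlen, "%s: %s", prefix,
			errno_to_str(err, sbuf, sizeof(sbuf)));
}
```

Because sbuf is a local array, two threads formatting errors at the same
time never share storage, which is the property strerror() does not
guarantee.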

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/perf.c   |   10 +++---
 tools/perf/util/debug.h |3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 2282d41..452a847 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -313,6 +313,7 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
int status;
struct stat st;
const char *prefix;
+   char sbuf[STRERR_BUFSIZE];
 
prefix = NULL;
if (p->option & RUN_SETUP)
@@ -343,7 +344,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
status = 1;
/* Check for ENOSPC and EIO errors.. */
if (fflush(stdout)) {
-   fprintf(stderr, "write failure on standard output: %s", 
strerror(errno));
+   fprintf(stderr, "write failure on standard output: %s",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
if (ferror(stdout)) {
@@ -351,7 +353,8 @@ static int run_builtin(struct cmd_struct *p, int argc, 
const char **argv)
goto out;
}
if (fclose(stdout)) {
-   fprintf(stderr, "close failed on standard output: %s", 
strerror(errno));
+   fprintf(stderr, "close failed on standard output: %s",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out;
}
status = 0;
@@ -466,6 +469,7 @@ void pthread__unblock_sigwinch(void)
 int main(int argc, const char **argv)
 {
const char *cmd;
+   char sbuf[STRERR_BUFSIZE];
 
/* The page_size is placed in util object. */
page_size = sysconf(_SC_PAGE_SIZE);
@@ -561,7 +565,7 @@ int main(int argc, const char **argv)
}
 
fprintf(stderr, "Failed to run command '%s': %s\n",
-   cmd, strerror(errno));
+   cmd, strerror_r(errno, sbuf, sizeof(sbuf)));
 out:
return 1;
 }
diff --git a/tools/perf/util/debug.h b/tools/perf/util/debug.h
index 6944ea3..be264d6 100644
--- a/tools/perf/util/debug.h
+++ b/tools/perf/util/debug.h
@@ -3,6 +3,7 @@
 #define __PERF_DEBUG_H
 
 #include 
+#include 
 #include "event.h"
 #include "../ui/helpline.h"
 #include "../ui/progress.h"
@@ -36,6 +37,8 @@ extern int debug_ordered_events;
 #define pr_oe_time(t, fmt, ...)  pr_time_N(1, debug_ordered_events, t, 
pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_oe_time2(t, fmt, ...) pr_time_N(2, debug_ordered_events, t, 
pr_fmt(fmt), ##__VA_ARGS__)
 
+#define STRERR_BUFSIZE 128 /* For the buffer size of strerror_r */
+
 int dump_printf(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
 void trace_event(union perf_event *event);
 



[PATCH 4/8] perf/util: Replace strerror with strerror_r for thread-safety

2014-08-13 Thread Masami Hiramatsu
Replace all strerror() calls with strerror_r() in util to make
the perf lib thread-safe.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/cloexec.c  |6 --
 tools/perf/util/data.c |8 ++--
 tools/perf/util/dso.c  |8 ++--
 tools/perf/util/evlist.c   |2 +-
 tools/perf/util/evsel.c|7 +--
 tools/perf/util/parse-events.c |5 -
 tools/perf/util/run-command.c  |9 +++--
 tools/perf/util/util.c |5 +++--
 8 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 4945aa5..47b78b3 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -3,6 +3,7 @@
 #include "../perf.h"
 #include "cloexec.h"
 #include "asm/bug.h"
+#include "debug.h"
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
@@ -18,6 +19,7 @@ static int perf_flag_probe(void)
int err;
int cpu;
pid_t pid = -1;
+   char sbuf[STRERR_BUFSIZE];
 
cpu = sched_getcpu();
if (cpu < 0)
@@ -42,7 +44,7 @@ static int perf_flag_probe(void)
 
WARN_ONCE(err != EINVAL && err != EBUSY,
  "perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with 
unexpected error %d (%s)\n",
- err, strerror(err));
+ err, strerror_r(err, sbuf, sizeof(sbuf)));
 
/* not supported, confirm error related to PERF_FLAG_FD_CLOEXEC */
fd = sys_perf_event_open(&attr, pid, cpu, -1, 0);
@@ -50,7 +52,7 @@ static int perf_flag_probe(void)
 
if (WARN_ONCE(fd < 0 && err != EBUSY,
  "perf_event_open(..., 0) failed unexpectedly with error 
%d (%s)\n",
- err, strerror(err)))
+ err, strerror_r(err, sbuf, sizeof(sbuf))))
return -1;
 
close(fd);
diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 29d720c..1921942f 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -50,12 +50,14 @@ static int open_file_read(struct perf_data_file *file)
 {
struct stat st;
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
fd = open(file->path, O_RDONLY);
if (fd < 0) {
int err = errno;
 
-   pr_err("failed to open %s: %s", file->path, strerror(err));
+   pr_err("failed to open %s: %s", file->path,
+   strerror_r(err, sbuf, sizeof(sbuf)));
if (err == ENOENT && !strcmp(file->path, "perf.data"))
pr_err("  (try 'perf record' first)");
pr_err("\n");
@@ -88,6 +90,7 @@ static int open_file_read(struct perf_data_file *file)
 static int open_file_write(struct perf_data_file *file)
 {
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
if (check_backup(file))
return -1;
@@ -95,7 +98,8 @@ static int open_file_write(struct perf_data_file *file)
fd = open(file->path, O_CREAT|O_RDWR|O_TRUNC, S_IRUSR|S_IWUSR);
 
if (fd < 0)
-   pr_err("failed to open %s : %s\n", file->path, strerror(errno));
+   pr_err("failed to open %s : %s\n", file->path,
+   strerror_r(errno, sbuf, sizeof(sbuf)));
 
return fd;
 }
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index bdafd30..55e39dc 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -162,13 +162,15 @@ static void close_first_dso(void);
 static int do_open(char *name)
 {
int fd;
+   char sbuf[STRERR_BUFSIZE];
 
do {
fd = open(name, O_RDONLY);
if (fd >= 0)
return fd;
 
-   pr_debug("dso open failed, mmap: %s\n", strerror(errno));
+   pr_debug("dso open failed, mmap: %s\n",
+strerror_r(errno, sbuf, sizeof(sbuf)));
if (!dso__data_open_cnt || errno != EMFILE)
break;
 
@@ -530,10 +532,12 @@ static ssize_t cached_read(struct dso *dso, u64 offset, 
u8 *data, ssize_t size)
 static int data_file_size(struct dso *dso)
 {
struct stat st;
+   char sbuf[STRERR_BUFSIZE];
 
if (!dso->data.file_size) {
if (fstat(dso->data.fd, &st)) {
-   pr_err("dso mmap failed, fstat: %s\n", strerror(errno));
+   pr_err("dso mmap failed, fstat: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
dso->data.file_size = st.st_size;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5dcd28c..a3e28b4 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1295,7 +1295,7 @@ int perf_evlist__strerror_open(struct perf_evlist *evlist 
__maybe_unused,
   int err, char *buf, size_t size)
 {
int printed, value;
-   char sbuf[128], *emsg = strerror_r(err, sbuf, sizeof(sbuf));
+   char 

[PATCH 1/8] perf probe: Don't use strerror if strlist__add failed

2014-08-13 Thread Masami Hiramatsu
Since strlist__add doesn't involve any I/O, the failure reason
must be ENOMEM or EINVAL. Moreover, this is just a debug
message, so we don't need to show the error string.

Also, if get_probe_trace_command_rawlist() returns NULL, it
doesn't mean the rawlist is empty; it means there was an error.
So the caller must return -ENOMEM in that case.
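
The distinction between "empty list" and "NULL means error" can be
sketched as follows. This is only an illustration: struct name_list,
name_list_load() and count_names() are hypothetical stand-ins, not
perf's strlist API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string list; this is not perf's struct strlist. */
struct name_list {
	size_t nr;
	char **names;
};

static void name_list_free(struct name_list *nl)
{
	if (!nl)
		return;
	for (size_t i = 0; i < nl->nr; i++)
		free(nl->names[i]);
	free(nl->names);
	free(nl);
}

/* Returns a (possibly empty) list, or NULL when an allocation fails. */
static struct name_list *name_list_load(const char *const *src, size_t n)
{
	struct name_list *nl;

	assert(n == 0 || src != NULL);
	nl = calloc(1, sizeof(*nl));
	if (!nl)
		return NULL;
	if (n) {
		nl->names = calloc(n, sizeof(*nl->names));
		if (!nl->names) {
			free(nl);
			return NULL;
		}
	}
	for (size_t i = 0; i < n; i++) {
		size_t len = strlen(src[i]) + 1;
		char *copy = malloc(len);

		if (!copy) {
			name_list_free(nl);
			return NULL;
		}
		memcpy(copy, src[i], len);
		nl->names[nl->nr++] = copy;
	}
	return nl;
}

/* Caller: a NULL list is reported as an error, never as "no entries". */
static int count_names(const char *const *src, size_t n)
{
	struct name_list *nl = name_list_load(src, n);
	int ret;

	if (!nl)
		return -ENOMEM;
	ret = (int)nl->nr;
	name_list_free(nl);
	return ret;
}
```

With this shape, an empty input yields 0 and only an allocation
failure yields -ENOMEM, so the two cases stay distinguishable.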

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/util/probe-event.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 784ea42..66799c6 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1876,7 +1876,7 @@ static struct strlist 
*get_probe_trace_command_rawlist(int fd)
p[idx] = '\0';
ret = strlist__add(sl, buf);
if (ret < 0) {
-   pr_debug("strlist__add failed: %s\n", strerror(-ret));
+   pr_debug("strlist__add failed (%d)\n", ret);
strlist__delete(sl);
return NULL;
}
@@ -1935,7 +1935,7 @@ static int __show_perf_probe_events(int fd, bool 
is_kprobe)
 
rawlist = get_probe_trace_command_rawlist(fd);
if (!rawlist)
-   return -ENOENT;
+   return -ENOMEM;
 
strlist__for_each(ent, rawlist) {
ret = parse_probe_trace_command(ent->s, );
@@ -2002,6 +2002,8 @@ static struct strlist *get_probe_trace_event_names(int 
fd, bool include_group)
 
memset(, 0, sizeof(tev));
rawlist = get_probe_trace_command_rawlist(fd);
+   if (!rawlist)
+   return NULL;
sl = strlist__new(true, NULL);
strlist__for_each(ent, rawlist) {
ret = parse_probe_trace_command(ent->s, );

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 8/8] perf test: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/tests/builtin-test.c   |4 +++-
 tools/perf/tests/mmap-basic.c |7 ---
 tools/perf/tests/open-syscall-all-cpus.c  |5 +++--
 tools/perf/tests/open-syscall-tp-fields.c |7 +--
 tools/perf/tests/open-syscall.c   |3 ++-
 tools/perf/tests/perf-record.c|   13 +
 tools/perf/tests/rdpmc.c  |6 --
 tools/perf/tests/sw-clock.c   |6 --
 tools/perf/tests/task-exit.c  |6 --
 9 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index c6796d2..9948136 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -185,9 +185,11 @@ static bool perf_test__matches(int curr, int argc, const 
char *argv[])
 static int run_test(struct test *test)
 {
int status, err = -1, child = fork();
+   char sbuf[STRERR_BUFSIZE];
 
if (child < 0) {
-   pr_err("failed to fork test: %s\n", strerror(errno));
+   pr_err("failed to fork test: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
 
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 1422634..9b9622a 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -31,6 +31,7 @@ int test__basic_mmap(void)
unsigned int nr_events[nsyscalls],
 expected_nr_events[nsyscalls], i, j;
struct perf_evsel *evsels[nsyscalls], *evsel;
+   char sbuf[STRERR_BUFSIZE];
 
threads = thread_map__new(-1, getpid(), UINT_MAX);
if (threads == NULL) {
@@ -49,7 +50,7 @@ int test__basic_mmap(void)
sched_setaffinity(0, sizeof(cpu_set), _set);
if (sched_setaffinity(0, sizeof(cpu_set), _set) < 0) {
pr_debug("sched_setaffinity() failed on CPU %d: %s ",
-cpus->map[0], strerror(errno));
+cpus->map[0], strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_free_cpus;
}
 
@@ -79,7 +80,7 @@ int test__basic_mmap(void)
if (perf_evsel__open(evsels[i], cpus, threads) < 0) {
pr_debug("failed to open counter: %s, "
 "tweak 
/proc/sys/kernel/perf_event_paranoid?\n",
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 
@@ -89,7 +90,7 @@ int test__basic_mmap(void)
 
if (perf_evlist__mmap(evlist, 128, true) < 0) {
pr_debug("failed to mmap events: %d (%s)\n", errno,
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 
diff --git a/tools/perf/tests/open-syscall-all-cpus.c 
b/tools/perf/tests/open-syscall-all-cpus.c
index 5fecdbd..8fa82d1 100644
--- a/tools/perf/tests/open-syscall-all-cpus.c
+++ b/tools/perf/tests/open-syscall-all-cpus.c
@@ -12,6 +12,7 @@ int test__open_syscall_event_on_all_cpus(void)
unsigned int nr_open_calls = 111, i;
cpu_set_t cpu_set;
struct thread_map *threads = thread_map__new(-1, getpid(), UINT_MAX);
+   char sbuf[STRERR_BUFSIZE];
 
if (threads == NULL) {
pr_debug("thread_map__new\n");
@@ -35,7 +36,7 @@ int test__open_syscall_event_on_all_cpus(void)
if (perf_evsel__open(evsel, cpus, threads) < 0) {
pr_debug("failed to open counter: %s, "
 "tweak /proc/sys/kernel/perf_event_paranoid?\n",
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_evsel_delete;
}
 
@@ -56,7 +57,7 @@ int test__open_syscall_event_on_all_cpus(void)
if (sched_setaffinity(0, sizeof(cpu_set), _set) < 0) {
pr_debug("sched_setaffinity() failed on CPU %d: %s ",
 cpus->map[cpu],
-strerror(errno));
+strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_close_fd;
}
for (i = 0; i < ncalls; ++i) {
diff --git a/tools/perf/tests/open-syscall-tp-fields.c 
b/tools/perf/tests/open-syscall-tp-fields.c
index 0785b64..922bdb6 100644
--- a/tools/perf/tests/open-syscall-tp-fields.c
+++ b/tools/perf/tests/open-syscall-tp-fields.c
@@ -22,6 +22,7 @@ int test__syscall_open_tp_fields(void)
struct perf_evlist *evlist = perf_evlist__new();
struct perf_evsel *evsel;
int err = -1, i, nr_events = 0, nr_polls = 0;
+   char sbuf[STRERR_BUFSIZE];
 
if (evlist == NULL) {

[PATCH 6/8] perf trace: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-trace.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index d080b9c..a9e96ff 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1750,7 +1750,7 @@ static int trace__sys_exit(struct trace *trace, struct 
perf_evsel *evsel,
 signed_print:
fprintf(trace->output, ") = %d", ret);
} else if (ret < 0 && sc->fmt->errmsg) {
-   char bf[256];
+   char bf[STRERR_BUFSIZE];
const char *emsg = strerror_r(-ret, bf, sizeof(bf)),
   *e = audit_errno_to_name(-ret);
 
@@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
int err = -1, i;
unsigned long before;
const bool forks = argc > 0;
+   char sbuf[STRERR_BUFSIZE];
 
trace->live = true;
 
@@ -2105,7 +2106,8 @@ static int trace__run(struct trace *trace, int argc, 
const char **argv)
 
err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false);
if (err < 0) {
-   fprintf(trace->output, "Couldn't mmap the events: %s\n", 
strerror(errno));
+   fprintf(trace->output, "Couldn't mmap the events: %s\n",
+   strerror_r(errno, sbuf, sizeof(sbuf)));
goto out_delete_evlist;
}
 



[PATCH 5/8] perf top: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error message
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-top.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 87a6615..a77ff6c 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -899,7 +899,7 @@ try_again:
 
if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
ui__error("Failed to mmap with %d (%s)\n",
-   errno, strerror(errno));
+   errno, strerror_r(errno, msg, sizeof(msg)));
goto out_err;
}
 



[PATCH 7/8] perf record: Use strerror_r instead of strerror

2014-08-13 Thread Masami Hiramatsu
Use strerror_r instead of strerror in error messages
for thread-safety.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-record.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4db670d..87e28a4 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -161,7 +161,7 @@ try_again:
 
if (perf_evlist__apply_filters(evlist)) {
error("failed to set filter with %d (%s)\n", errno,
-   strerror(errno));
+   strerror_r(errno, msg, sizeof(msg)));
rc = -1;
goto out;
}
@@ -175,7 +175,8 @@ try_again:
   "(current value: %u)\n", opts->mmap_pages);
rc = -errno;
} else {
-   pr_err("failed to mmap with %d (%s)\n", errno, 
strerror(errno));
+   pr_err("failed to mmap with %d (%s)\n", errno,
+   strerror_r(errno, msg, sizeof(msg)));
rc = -errno;
}
goto out;
@@ -480,7 +481,7 @@ static int __cmd_record(struct record *rec, int argc, const 
char **argv)
}
 
if (forks && workload_exec_errno) {
-   char msg[512];
+   char msg[STRERR_BUFSIZE];
const char *emsg = strerror_r(workload_exec_errno, msg, 
sizeof(msg));
pr_err("Workload failed: %s\n", emsg);
err = -1;



[PATCH 3/8] perf probe: Make error messages thread-safe

2014-08-13 Thread Masami Hiramatsu
To make error messages thread-safe, this replaces strerror with
strerror_r for warnings, and just shows the return value instead
of using strerror for debug messages.

Signed-off-by: Masami Hiramatsu 
---
 tools/perf/builtin-probe.c |5 -
 tools/perf/util/probe-event.c  |   28 +++-
 tools/perf/util/probe-finder.c |7 +--
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index c63fa29..347729e 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -290,8 +290,11 @@ static void cleanup_params(void)
 
 static void pr_err_with_code(const char *msg, int err)
 {
+   char sbuf[STRERR_BUFSIZE];
+
pr_err("%s", msg);
-   pr_debug(" Reason: %s (Code: %d)", strerror(-err), err);
+   pr_debug(" Reason: %s (Code: %d)",
+strerror_r(-err, sbuf, sizeof(sbuf)), err);
pr_err("\n");
 }
 
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 66799c6..e685ef4 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -565,7 +565,7 @@ static int get_real_path(const char *raw_path, const char 
*comp_dir,
 
 static int __show_one_line(FILE *fp, int l, bool skip, bool show_num)
 {
-   char buf[LINEBUF_SIZE];
+   char buf[LINEBUF_SIZE], sbuf[STRERR_BUFSIZE];
const char *color = show_num ? "" : PERF_COLOR_BLUE;
const char *prefix = NULL;
 
@@ -585,7 +585,8 @@ static int __show_one_line(FILE *fp, int l, bool skip, bool 
show_num)
return 1;
 error:
if (ferror(fp)) {
-   pr_warning("File read error: %s\n", strerror(errno));
+   pr_warning("File read error: %s\n",
+  strerror_r(errno, sbuf, sizeof(sbuf)));
return -1;
}
return 0;
@@ -618,6 +619,7 @@ static int __show_line_range(struct line_range *lr, const 
char *module)
FILE *fp;
int ret;
char *tmp;
+   char sbuf[STRERR_BUFSIZE];
 
/* Search a line range */
dinfo = open_debuginfo(module);
@@ -656,7 +658,7 @@ static int __show_line_range(struct line_range *lr, const 
char *module)
fp = fopen(lr->path, "r");
if (fp == NULL) {
pr_warning("Failed to open %s: %s\n", lr->path,
-  strerror(errno));
+  strerror_r(errno, sbuf, sizeof(sbuf)));
return -errno;
}
/* Skip to starting line number */
@@ -1405,8 +1407,7 @@ int synthesize_perf_probe_arg(struct perf_probe_arg *pa, 
char *buf, size_t len)
 
return tmp - buf;
 error:
-   pr_debug("Failed to synthesize perf probe argument: %s\n",
-strerror(-ret));
+   pr_debug("Failed to synthesize perf probe argument: %d\n", ret);
return ret;
 }
 
@@ -1455,8 +1456,7 @@ static char *synthesize_perf_probe_point(struct 
perf_probe_point *pp)
 
return buf;
 error:
-   pr_debug("Failed to synthesize perf probe point: %s\n",
-strerror(-ret));
+   pr_debug("Failed to synthesize perf probe point: %d\n", ret);
free(buf);
return NULL;
 }
@@ -1782,7 +1782,7 @@ static void clear_probe_trace_event(struct 
probe_trace_event *tev)
 
 static void print_open_warning(int err, bool is_kprobe)
 {
-   char sbuf[128];
+   char sbuf[STRERR_BUFSIZE];
 
if (err == -ENOENT) {
const char *config;
@@ -1812,7 +1812,7 @@ static void print_both_open_warning(int kerr, int uerr)
pr_warning("Please rebuild kernel with CONFIG_KPROBE_EVENTS "
   "or/and CONFIG_UPROBE_EVENTS.\n");
else {
-   char sbuf[128];
+   char sbuf[STRERR_BUFSIZE];
pr_warning("Failed to open kprobe events: %s.\n",
   strerror_r(-kerr, sbuf, sizeof(sbuf)));
pr_warning("Failed to open uprobe events: %s.\n",
@@ -2033,6 +2033,7 @@ static int write_probe_trace_event(int fd, struct 
probe_trace_event *tev)
 {
int ret = 0;
char *buf = synthesize_probe_trace_command(tev);
+   char sbuf[STRERR_BUFSIZE];
 
if (!buf) {
pr_debug("Failed to synthesize probe trace event.\n");
@@ -2044,7 +2045,7 @@ static int write_probe_trace_event(int fd, struct 
probe_trace_event *tev)
ret = write(fd, buf, strlen(buf));
if (ret <= 0)
pr_warning("Failed to write event: %s\n",
-  strerror(errno));
+  strerror_r(errno, sbuf, sizeof(sbuf)));
}
free(buf);
return ret;
@@ -2058,7 +2059,7 @@ static int get_new_event_name(char *buf, size_t len, 
const char *base,
/* Try no suffix */
ret = e_snprintf(buf, len, "%s", base);
if (ret < 0) {
-   pr_debug("snprintf() failed: %s\n", strerror(-ret));
+   

[PATCH 0/8] perf: Replace strerror with strerror_r for thread-safety

2014-08-13 Thread Masami Hiramatsu
Hi,

Here is a series to get rid of the thread-unsafe strerror() from
perf tools. Of course, there may be other thread-unsafe functions,
so this is just one step forward. :)

This introduces a STRERR_BUFSIZE (=128) macro for allocating a local
buffer, but some strerror_r() calls don't use it. If there is
already a local buffer on the stack and it is bigger than
STRERR_BUFSIZE, I used that buffer for strerror_r() instead.
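
A minimal sketch of the pattern, for illustration only. errno_msg()
and format_open_error() are hypothetical helpers, not perf functions.
The call sites in this series rely on the GNU flavour of strerror_r(),
which returns a char *; the sketch also copes with the XSI flavour,
which returns an int, so that it builds either way.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define STRERR_BUFSIZE 128	/* value quoted in the cover letter */

/*
 * errno_msg() is a hypothetical helper, not a perf function.
 * It hides the difference between the two strerror_r() flavours.
 */
static const char *errno_msg(int err, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
	/* GNU flavour: returns the message, which may or may not live in buf. */
	return strerror_r(err, buf, len);
#else
	/* XSI flavour: returns 0 on success and fills buf. */
	if (strerror_r(err, buf, len) != 0)
		snprintf(buf, len, "Unknown error %d", err);
	return buf;
#endif
}

/* Format "failed to open <path>: <reason>" using only a caller-local buffer. */
static int format_open_error(char *out, size_t outlen, const char *path, int err)
{
	char sbuf[STRERR_BUFSIZE];

	return snprintf(out, outlen, "failed to open %s: %s",
			path, errno_msg(err, sbuf, sizeof(sbuf)));
}
```

Because the message buffer lives on the caller's stack, two threads
formatting errors at the same time never share storage, which is the
point of moving away from strerror().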

By the way, while doing this cleanup, I found some confusing points
in the code. Currently perf has 3 ways to output messages besides
standard (f)printf: pr_XXX, ui__XXX and the warning/error functions.
Are there any differences among those APIs? What are the expected
use cases for them?
For example, there is a plain printf in kvm_live_open_events@builtin-kvm.c,
but it seems it should be ui__error, because the error output next to it
uses that. However, other parts use pr_XXX too. It seems inconsistent.

Thank you,

---

Masami Hiramatsu (8):
  perf probe: Don't use strerror if strlist__add failed
  perf: Use strerror_r instead of strerror
  perf probe: Make error messages thread-safe
  perf/util: Replace strerror with strerror_r for thread-safety
  perf top: Use strerror_r instead of strerror
  perf trace: Use strerror_r instead of strerror
  perf record: Use strerror_r instead of strerror
  perf test: Use strerror_r instead of strerror


 tools/perf/builtin-probe.c|5 +++-
 tools/perf/builtin-record.c   |7 +++---
 tools/perf/builtin-top.c  |2 +-
 tools/perf/builtin-trace.c|6 +++--
 tools/perf/perf.c |   10 ++---
 tools/perf/tests/builtin-test.c   |4 +++
 tools/perf/tests/mmap-basic.c |7 +++---
 tools/perf/tests/open-syscall-all-cpus.c  |5 +++-
 tools/perf/tests/open-syscall-tp-fields.c |7 --
 tools/perf/tests/open-syscall.c   |3 ++-
 tools/perf/tests/perf-record.c|   13 ---
 tools/perf/tests/rdpmc.c  |6 +++--
 tools/perf/tests/sw-clock.c   |6 +++--
 tools/perf/tests/task-exit.c  |6 +++--
 tools/perf/util/cloexec.c |6 +++--
 tools/perf/util/data.c|8 +--
 tools/perf/util/debug.h   |3 +++
 tools/perf/util/dso.c |8 +--
 tools/perf/util/evlist.c  |2 +-
 tools/perf/util/evsel.c   |7 --
 tools/perf/util/parse-events.c|5 +++-
 tools/perf/util/probe-event.c |   34 -
 tools/perf/util/probe-finder.c|7 --
 tools/perf/util/run-command.c |9 ++--
 tools/perf/util/util.c|5 +++-
 25 files changed, 121 insertions(+), 60 deletions(-)

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com



Linux 3.16 boot hangs at initramfs scripts when mounting filesystems

2014-08-13 Thread Petri Gynther
Hi linux-kernel and linux-mips:

I'm trying to boot Linux 3.16 on Broadcom BMIPS5000-based platform.
However, 90% of the time, the system hangs at initramfs scripts at one
of the mount commands. I'm also seeing this same issue when booting
Linux 3.15.

Please see the boot log below. At 2.583000, MMC partitions seem to be
detected correctly. At 2.809000 initramfs scripts start to run. The
scripts attempt to run the following commands:

strace mount -t devtmpfs none /dev
strace mount -t proc none /proc
strace mount -t sysfs none /sys
strace mkdir /dev/pts /dev/shm
strace mount -t devpts none /dev/pts
strace mount -t tmpfs none /dev/shm
strace mount -t debugfs none /sys/kernel/debug
strace mount -o ro -t squashfs /dev/mmcblk0p19 /rootfs

[0.00] Linux version 3.16.0 ... (gcc version 4.5.3 (Broadcom
stbgcc-4.5.3-1.3) ) #4 SMP Wed Aug 13 18:13:44 PDT 2014
[0.00] Fetching vars from bootloader... found 10 vars.
[0.00] Options: moca=1 sata=0 pcie=0 usb=1
[0.00] Using 1024 MB + 0 MB RAM (from CFE)
[0.00] bootconsole [early0] enabled
[0.00] CPU0 revision is: 00025a11 (Broadcom BMIPS5000)
[0.00] FPU revision is: 00130001
[0.00] Determined physical RAM map:
[0.00]  memory: 1000 @  (usable)
[0.00]  memory: 3000 @ 2000 (usable)
[0.00] bmem: adding 109 MB LINUX region at 18 MB (0x06d0a000@0x012f6000)
[0.00] bmem: adding 128 MB RESERVED region at 128 MB
(0x0800@0x0800)
[0.00] bmem: adding 320 MB RESERVED region at 512 MB
(0x1400@0x2000)
[0.00] bmem: adding 448 MB LINUX region at 832 MB
(0x1c00@0x3400)
[0.00] Initrd not found or empty - disabling initrd
[0.00] Zone ranges:
[0.00]   Normal   [mem 0x-0x4fff]
[0.00]   HighMem  empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x-0x0fff]
[0.00]   node   0: [mem 0x2000-0x4fff]
[0.00] On node 0 totalpages: 262144
[0.00]   Normal zone: 2048 pages used for memmap
[0.00]   Normal zone: 0 pages reserved
[0.00]   Normal zone: 262144 pages, LIFO batch:31
[0.00] Primary instruction cache 32kB, physically tagged,
4-way, linesize 64 bytes.
[0.00] Primary data cache 32kB, 4-way, linesize 32 bytes.
[0.00] MIPS secondary cache 256kB, 8-way, linesize 128 bytes.
[0.00] PERCPU: Embedded 9 pages/cpu @81b1a000 s12416 r8192 d16256 u36864
[0.00] pcpu-alloc: s12416 r8192 d16256 u36864 alloc=9*4096
[0.00] pcpu-alloc: [0] 0 [0] 1
[0.00] Built 1 zonelists in Zone order, mobility grouping on.
Total pages: 260096
[0.00] Kernel command line: splashmem=0x0/4m@memc1
cfe_ver=3.53.24 display=on rootfstype=squashfs partitionver=2
root=rootfs1 debug bmem=128m@128m bmem=320m@512m log_buf_len=8388608
[0.00] Switched partition from version 0 to version 2.
[0.00] log_buf_len: 8388608
[0.00] early log buf free: 13880(84%)
[0.00] PID hash table entries: 4096 (order: 2, 16384 bytes)
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Memory: 553056K/1048576K available (7245K kernel code,
434K rwdata, 1944K rodata, 1688K init, 8087K bss, 495520K reserved, 0K
highmem)
[0.00] Hierarchical RCU implementation.
[0.00] NR_IRQS:160
[0.00] Measuring MIPS counter frequency...
[0.00] Detected MIPS clock frequency: 1305 MHz (163.136 MHz counter)
[0.001000] Console: colour dummy device 80x25
[0.002000] Lock dependency validator: Copyright (c) 2006 Red Hat,
Inc., Ingo Molnar
[0.003000] ... MAX_LOCKDEP_SUBCLASSES:  8
[0.004000] ... MAX_LOCK_DEPTH:  48
[0.005000] ... MAX_LOCKDEP_KEYS:8191
[0.006000] ... CLASSHASH_SIZE:  4096
[0.007000] ... MAX_LOCKDEP_ENTRIES: 32768
[0.008000] ... MAX_LOCKDEP_CHAINS:  65536
[0.009000] ... CHAINHASH_SIZE:  32768
[0.01]  memory used by lock dependency info: 5167 kB
[0.011000]  per task-struct memory footprint: 1152 bytes
[0.018000] Calibrating delay loop... 864.25 BogoMIPS (lpj=432128)
[0.029000] pid_max: default: 32768 minimum: 301
[0.032000] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.033000] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
[0.04] ftrace: allocating 21366 entries in 42 pages
[0.087000] SMP: Booting CPU1...
[0.088000] CPU1 revision is: 00025a11 (Broadcom BMIPS5000)
[0.088000] FPU revision is: 00130001
[0.088000] Primary instruction cache 32kB, physically tagged,
4-way, linesize 64 bytes.
[0.088000] Primary data cache 32kB, 4-way, linesize 32 bytes.
[0.088000] MIPS secondary cache 256kB, 8-way, linesize 128 bytes.
[0.098000] SMP: CPU1 is 

Re: [PATCH] ARM: dts: Add mmc0 and mmc1 aliases for rk3288

2014-08-13 Thread Addy

> It's convenient (and less confusing to people reading logs) if the
> eMMC port on rk3288 is consistenly marked with mmc0 and the sdmmc port
> on rk3288 is consistently marked with mmc1.  Add the appropriate
> aliases.
>
> These aliases only actually do something if a patch like
> (https://patchwork.kernel.org/patch/3925551/) lands, but they don't
> hurt even before that patch.
>
> Signed-off-by: Doug Anderson 
> Reviewed-by: Sonny Rao 
> ---
>  arch/arm/boot/dts/rk3288.dtsi | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
> index 36be7bb..0b54b0d 100644
> --- a/arch/arm/boot/dts/rk3288.dtsi
> +++ b/arch/arm/boot/dts/rk3288.dtsi
> @@ -29,6 +29,8 @@
>   i2c3 = 
>   i2c4 = 
>   i2c5 = 
> + mmc0 = 
> + mmc1 = 
There are 8 registers that can be configured for clock tuning (see chapter 3,
page 133):
sdmmc: CRU_SDMMC_CON0(offset: 0x200)
CRU_SDMMC_CON1(offset: 0x204)
sdio0: CRU_SDMMC_CON2(offset: 0x208)
CRU_SDMMC_CON3(offset: 0x20c)
sdio1: CRU_SDMMC_CON4(offset: 0x210)
CRU_SDMMC_CON5(offset: 0x214)
emmc: CRU_SDMMC_CON6(offset: 0x218)
CRU_SDMMC_CON7(offset: 0x21c)

I think the following would be more suitable:
mmc0 = 
mmc1 = 
mmc2 = 
mmc3 = 

So we can get ctrl_id:
ctrl_id = of_alias_get_id(host->dev->of_node, "mshc");

and then get the register offset:
offset = 0x200 + ctrl_id * 8 + 4 * drive_or_sample
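
As a sanity check of the formula, here is a small sketch in C. The
function name and the assumption that ctrl_id runs 0..3 in the order
sdmmc, sdio0, sdio1, emmc are mine, chosen to match the register list
above; which of the two registers in a pair is "drive" and which is
"sample" is not stated here, so the selector is left as 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

#define CRU_SDMMC_CON_BASE 0x200u	/* offset of CRU_SDMMC_CON0 */

/*
 * Hypothetical helper: ctrl_id is 0..3, drive_or_sample is 0 or 1.
 * Each controller owns two consecutive 4-byte registers.
 */
static uint32_t cru_sdmmc_con_offset(unsigned int ctrl_id,
				     unsigned int drive_or_sample)
{
	assert(ctrl_id < 4 && drive_or_sample < 2);
	return CRU_SDMMC_CON_BASE + ctrl_id * 8u + 4u * drive_or_sample;
}
```

With that ordering, (0, 0) gives 0x200 and (3, 1) gives 0x21c, which
lines up with CRU_SDMMC_CON0 and CRU_SDMMC_CON7 in the list.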

>   serial0 = 
>   serial1 = 
>   serial2 = 




Re: [PATCH RFC] time,signal: protect resource use statistics with seqlock

2014-08-13 Thread Rik van Riel
On 08/13/2014 08:43 PM, Frederic Weisbecker wrote:
> On Wed, Aug 13, 2014 at 05:03:24PM -0400, Rik van Riel wrote:
>> --- a/kernel/time/posix-cpu-timers.c
>> +++ b/kernel/time/posix-cpu-timers.c
>> @@ -272,22 +272,8 @@ static int posix_cpu_clock_get_task(struct task_struct *tsk,
>>  		if (same_thread_group(tsk, current))
>>  			err = cpu_clock_sample(which_clock, tsk, );
>>  	} else {
>> -		unsigned long flags;
>> -		struct sighand_struct *sighand;
>> -
>> -		/*
>> -		 * while_each_thread() is not yet entirely RCU safe,
>> -		 * keep locking the group while sampling process
>> -		 * clock for now.
>> -		 */
>> -		sighand = lock_task_sighand(tsk, );
>> -		if (!sighand)
>> -			return err;
>> -
>>  		if (tsk == current || thread_group_leader(tsk))
>>  			err = cpu_clock_sample_group(which_clock, tsk, );
>> -
>> -		unlock_task_sighand(tsk, );
>>  	}
> 
> I'm worried about such lockless solution based on RCU or read
> seqcount because we lose the guarantee that an update is
> immediately visible by all subsequent readers.
> 
> Say CPU 0 updates the thread time and both CPU 1 and CPU 2 right
> after that call clock_gettime(), with the spinlock we were
> guaranteed to see the new update. Now with a pure seqlock read
> approach, we guarantee a read sequence coherency but we don't
> guarantee the freshest update result.
> 
> So that looks like a source of non monotonic results.

Which update are you worried about, specifically?

The seq_write_lock to update the usage stat in p->signal will lock out
the seqlock read side used to check those results.

Is there another kind of thing read by cpu_clock_sample_group that you
believe is not excluded by the seq_lock?
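
To make the read-side behaviour concrete, here is a userspace sketch
of the sequence-counter pattern in C11 atomics. It is not the kernel's
seqlock API: struct usage_stats, usage_add() and usage_read() are
hypothetical, and the writer is assumed to be serialized by some other
lock, as a seqlock's write side would be.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the usage statistics kept in p->signal. */
struct usage_stats {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic uint64_t utime;
	_Atomic uint64_t stime;
};

/* Writer side; assumes writers are already serialized against each other. */
static void usage_add(struct usage_stats *s, uint64_t utime, uint64_t stime)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint64_t u = atomic_load_explicit(&s->utime, memory_order_relaxed);
	uint64_t t = atomic_load_explicit(&s->stime, memory_order_relaxed);

	/* Make the count odd so concurrent readers know to retry. */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&s->utime, u + utime, memory_order_relaxed);
	atomic_store_explicit(&s->stime, t + stime, memory_order_relaxed);

	/* Even again: the update is complete and published. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until a whole snapshot was read with no writer active. */
static void usage_read(struct usage_stats *s, uint64_t *utime, uint64_t *stime)
{
	unsigned int begin, end;

	assert(s && utime && stime);
	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		*utime = atomic_load_explicit(&s->utime, memory_order_relaxed);
		*stime = atomic_load_explicit(&s->stime, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1u) || begin != end);
}
```

A reader that overlaps an update sees either an odd count or a changed
count and loops, so it never returns a half-updated pair.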


- -- 
All rights reversed


[PATCH v14 4/8] powerpc: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: linuxppc-...@lists.ozlabs.org
Cc: Aneesh Kumar K.V 
Reviewed-by: Aneesh Kumar K.V 
Signed-off-by: Minchan Kim 
---
 arch/powerpc/include/asm/pgtable-ppc64.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h 
b/arch/powerpc/include/asm/pgtable-ppc64.h
index eb9261024f51..c9a4bbe8e179 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -468,9 +468,11 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 
 #define pmd_pfn(pmd)   pte_pfn(pmd_pte(pmd))
 #define pmd_young(pmd) pte_young(pmd_pte(pmd))
+#define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
 #define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
 #define pmd_wrprotect(pmd) pte_pmd(pte_wrprotect(pmd_pte(pmd)))
 #define pmd_mkdirty(pmd)   pte_pmd(pte_mkdirty(pmd_pte(pmd)))
+#define pmd_mkclean(pmd)   pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)   pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)   pte_pmd(pte_mkwrite(pmd_pte(pmd)))
 
-- 
2.0.0



[PATCH v14 8/8] mm: Don't split THP page when syscall is called

2014-08-13 Thread Minchan Kim
We don't need to split a THP page when the MADV_FREE syscall is
called. The split can be done when the VM decides to really free
the page, so we can avoid unnecessary THP splits.

Cc: Kirill A. Shutemov 
Cc: Andrea Arcangeli 
Acked-by: Kirill A. Shutemov 
Signed-off-by: Minchan Kim 
---
 include/linux/huge_mm.h |  4 
 mm/huge_memory.c| 35 +++
 mm/madvise.c| 21 -
 mm/rmap.c   |  8 ++--
 mm/vmscan.c | 28 ++--
 5 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 63579cb8d3dc..25a961256d9f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -19,6 +19,9 @@ extern struct page *follow_trans_huge_pmd(struct 
vm_area_struct *vma,
  unsigned long addr,
  pmd_t *pmd,
  unsigned int flags);
+extern int madvise_free_huge_pmd(struct mmu_gather *tlb,
+   struct vm_area_struct *vma,
+   pmd_t *pmd, unsigned long addr);
 extern int zap_huge_pmd(struct mmu_gather *tlb,
struct vm_area_struct *vma,
pmd_t *pmd, unsigned long addr);
@@ -56,6 +59,7 @@ extern pmd_t *page_check_address_pmd(struct page *page,
 unsigned long address,
 enum page_check_address_pmd_flag flag,
 spinlock_t **ptl);
+extern int pmd_freeable(pmd_t pmd);
 
 #define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
 #define HPAGE_PMD_NR (1mmap_sem)) {
+   pr_err("%s: mmap_sem is unlocked! addr=0x%lx 
end=0x%lx vma->vm_start=0x%lx vma->vm_end=0x%lx\n",
+   __func__, addr, end,
+   vma->vm_start,
+   vma->vm_end);
+   BUG();
+   }
+#endif
+   split_huge_page_pmd(vma, addr, pmd);
+   } else if (!madvise_free_huge_pmd(tlb, vma, pmd, addr))
+   goto next;
+   /* fall through */
+   }
 
-   split_huge_page_pmd(vma, addr, pmd);
if (pmd_trans_unstable(pmd))
return 0;
 
@@ -316,6 +334,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long 
addr,
}
arch_leave_lazy_mmu_mode();
pte_unmap_unlock(pte - 1, ptl);
+next:
cond_resched();
return 0;
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index 04c181133890..9c407576ff8e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -704,9 +704,13 @@ static int page_referenced_one(struct page *page, struct 
vm_area_struct *vma,
referenced++;
 
/*
-* In this implmentation, MADV_FREE doesn't support THP free
+* Use pmd_freeable instead of raw pmd_dirty because in some
+* of architecture, pmd_dirty is not defined unless
+* CONFIG_TRANSPARNTE_HUGE is enabled
 */
-   dirty++;
+   if (!pmd_freeable(*pmd))
+   dirty++;
+
spin_unlock(ptl);
} else {

[PATCH v14 1/8] mm: support madvise(MADV_FREE)

2014-08-13 Thread Minchan Kim
Linux doesn't have the ability to free pages lazily, while other OSes
already support it via madvise(MADV_FREE).

The gain is clear: the kernel can discard freed pages rather than
swapping them out or triggering OOM if memory pressure happens.

Without memory pressure, freed pages are reused by userspace
without any additional overhead (e.g., page fault + allocation
+ zeroing).

It works as follows.

When the madvise syscall is called, the VM clears the dirty bit of
the ptes in the range. If memory pressure happens, the VM checks the
dirty bit in the page table, and if it is still "clean", the page is
a "lazyfree" page, so the VM can discard it instead of swapping it out.
If there was a store to the page before the VM picked it for reclaim,
the dirty bit is set, so the VM swaps the page out instead of
discarding it.
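
From userspace, the intended usage could look like the sketch below.
It assumes kernel and libc headers that define MADV_FREE; the helper
names release_lazily() and lazy_free_roundtrip() are hypothetical, and
the fallback to MADV_DONTNEED is my addition for systems without the
new flag.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and madvise() */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hand a range back to the kernel, lazily if the flag is available. */
static int release_lazily(void *p, size_t len)
{
#ifdef MADV_FREE
	if (madvise(p, len, MADV_FREE) == 0)
		return 0;
#endif
	/* Fallback: drop the pages eagerly. */
	return madvise(p, len, MADV_DONTNEED);
}

/*
 * Map len bytes, dirty them, release them, then reuse the range.
 * Returns 0 on success and -1 on failure.
 */
static int lazy_free_roundtrip(size_t len)
{
	unsigned char *p;
	int ret = -1;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xaa, len);		/* the pages are now dirty */

	if (release_lazily(p, len) != 0)
		goto out;

	/*
	 * Until the next store, a load may see the old bytes or zeroes.
	 * A store marks the page dirty again, so reclaim will keep it.
	 */
	memset(p, 0x55, len);
	ret = (p[0] == 0x55 && p[len - 1] == 0x55) ? 0 : -1;
out:
	munmap(p, len);
	return ret;
}
```

An allocator would call something like release_lazily() on free and
simply write to the range again on reuse, which is the fast path the
benchmark below exercises.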

At first, the heavy users would be general allocators (e.g., jemalloc and
tcmalloc, and hopefully glibc will support it too); jemalloc/tcmalloc
already support the feature on other OSes (e.g., FreeBSD).

barrios@blaptop:~/benchmark/ebizzy$ lscpu
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):4
On-line CPU(s) list:   0-3
Thread(s) per core:2
Core(s) per socket:2
Socket(s): 1
NUMA node(s):  1
Vendor ID: GenuineIntel
CPU family:6
Model: 42
Stepping:  7
CPU MHz:   2801.000
BogoMIPS:  5581.64
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  4096K
NUMA node0 CPU(s): 0-3

ebizzy benchmark(./ebizzy -S 10 -n 512)

             vanilla-jemalloc     MADV_free-jemalloc

1 thread
records:     10                   10
avg:         7682.10              15306.10
std:         62.35(0.81%)         347.99(2.27%)
max:         7770.00              15622.00
min:         7598.00              14772.00

2 thread
records:     10                   10
avg:         12747.50             24171.00
std:         792.06(6.21%)        895.18(3.70%)
max:         13337.00             26023.00
min:         10535.00             23152.00

4 thread
records:     10                   10
avg:         16474.60             33717.90
std:         1496.45(9.08%)       2008.97(5.96%)
max:         17877.00             35958.00
min:         12224.00             29565.00

8 thread
records:     10                   10
avg:         16778.50             33308.10
std:         825.53(4.92%)        1668.30(5.01%)
max:         17543.00             36010.00
min:         14576.00             29577.00

16 thread
records:     10                   10
avg:         20614.40             35516.30
std:         602.95(2.92%)        1283.65(3.61%)
max:         21753.00             37178.00
min:         19605.00             33217.00

32 thread
records:     10                   10
avg:         22771.70             36018.50
std:         598.94(2.63%)        1046.76(2.91%)
max:         24035.00             37266.00
min:         22108.00             34149.00

In summary, MADV_FREE is about 2 times faster than MADV_DONTNEED
(about 2x up to 8 threads, 1.6x to 1.7x at 16 and 32 threads).
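
The per-thread-count ratio behind that summary can be recomputed from
the avg rows of the table above. The sketch below only restates that
arithmetic; struct ebizzy_row, ebizzy_avg and speedup() are names made
up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* avg values copied from the table above. */
struct ebizzy_row {
	int threads;
	double vanilla;   /* vanilla-jemalloc avg */
	double madv_free; /* MADV_free-jemalloc avg */
};

static const struct ebizzy_row ebizzy_avg[] = {
	{  1,  7682.10, 15306.10 },
	{  2, 12747.50, 24171.00 },
	{  4, 16474.60, 33717.90 },
	{  8, 16778.50, 33308.10 },
	{ 16, 20614.40, 35516.30 },
	{ 32, 22771.70, 36018.50 },
};

/* Ratio of the MADV_FREE average to the vanilla average for one row. */
static inline double speedup(const struct ebizzy_row *r)
{
	assert(r != NULL && r->vanilla > 0.0);
	return r->madv_free / r->vanilla;
}
```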

Cc: Michael Kerrisk 
Cc: Linux API 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: KOSAKI Motohiro 
Cc: Mel Gorman 
Cc: Jason Evans 
Cc: Kirill A. Shutemov 
Cc: Zhang Yanfei 
Cc: Rik van Riel 
Acked-by: Kirill A. Shutemov 
Acked-by: Zhang Yanfei 
Acked-by: Rik van Riel 
Signed-off-by: Minchan Kim 
---
 include/linux/rmap.h   |   9 ++-
 include/linux/vm_event_item.h  |   1 +
 include/uapi/asm-generic/mman-common.h |   1 +
 mm/madvise.c   | 140 +
 mm/rmap.c  |  42 +-
 mm/vmscan.c|  40 --
 mm/vmstat.c|   1 +
 7 files changed, 222 insertions(+), 12 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index be574506e6a9..0ba377b97a38 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -75,6 +75,7 @@ enum ttu_flags {
TTU_UNMAP = 1,  /* unmap mode */
TTU_MIGRATION = 2,  /* migration mode */
TTU_MUNLOCK = 4,/* munlock mode */
+   TTU_FREE = 8,   /* free mode */
 
TTU_IGNORE_MLOCK = (1 << 8),/* ignore mlock */
TTU_IGNORE_ACCESS = (1 << 9),   /* don't age */
@@ -181,7 +182,8 @@ static inline void page_dup_rmap(struct page *page)
  * Called from mm/vmscan.c to handle paging out
  */
 int page_referenced(struct page *, int is_locked,
-   struct mem_cgroup *memcg, unsigned long *vm_flags);
+   struct mem_cgroup *memcg, unsigned long *vm_flags,
+   int *is_dirty);
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
@@ -260,9 +262,12 @@ int rmap_walk(struct page *page, struct rmap_walk_control *rwc);
 
 

[PATCH v14 2/8] x86: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.
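
To make the role of the two helpers concrete, here is a standalone
sketch of the same bit manipulation outside the kernel. toy_pmd_t and
the toy_ functions are hypothetical stand-ins for pmd_t, pmd_dirty and
pmd_mkclean; the only fact borrowed from x86 is that the dirty flag is
bit 6 of the entry.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's pmd_t: just the raw entry value. */
typedef struct {
	uint64_t val;
} toy_pmd_t;

/* On x86 the dirty flag is bit 6 of a page table entry. */
#define TOY_PAGE_DIRTY (UINT64_C(1) << 6)

/* Nonzero if the entry has been written to since it was last cleaned. */
static inline int toy_pmd_dirty(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PAGE_DIRTY) != 0;
}

/* Return a copy of the entry with only the dirty flag cleared. */
static inline toy_pmd_t toy_pmd_mkclean(toy_pmd_t pmd)
{
	toy_pmd_t clean = { pmd.val & ~TOY_PAGE_DIRTY };

	/* Every other bit of the entry must be left untouched. */
	assert((clean.val | TOY_PAGE_DIRTY) == (pmd.val | TOY_PAGE_DIRTY));
	return clean;
}
```

MADV_FREE would clear the flag with the mkclean helper, and reclaim
would later test it with the dirty helper to see whether the huge page
was written to in between.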

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Zhang Yanfei 
Cc: Kirill A. Shutemov 
Acked-by: Zhang Yanfei 
Acked-by: Kirill A. Shutemov 
Signed-off-by: Minchan Kim 
---
 arch/x86/include/asm/pgtable.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 0ec056012618..329865799653 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -104,6 +104,11 @@ static inline int pmd_young(pmd_t pmd)
return pmd_flags(pmd) & _PAGE_ACCESSED;
 }
 
+static inline int pmd_dirty(pmd_t pmd)
+{
+   return pmd_flags(pmd) & _PAGE_DIRTY;
+}
+
 static inline int pte_write(pte_t pte)
 {
return pte_flags(pte) & _PAGE_RW;
@@ -267,6 +272,11 @@ static inline pmd_t pmd_mkold(pmd_t pmd)
return pmd_clear_flags(pmd, _PAGE_ACCESSED);
 }
 
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+   return pmd_clear_flags(pmd, _PAGE_DIRTY);
+}
+
 static inline pmd_t pmd_wrprotect(pmd_t pmd)
 {
return pmd_clear_flags(pmd, _PAGE_RW);
-- 
2.0.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v14 3/8] sparc: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.

Cc: "David S. Miller" 
Cc: sparcli...@vger.kernel.org
Signed-off-by: Minchan Kim 
---
 arch/sparc/include/asm/pgtable_64.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 3770bf5c6e1b..b80a309d7e00 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -666,6 +666,13 @@ static inline unsigned long pmd_young(pmd_t pmd)
return pte_young(pte);
 }
 
+static inline int pmd_dirty(pmd_t pmd)
+{
+   pte_t pte = __pte(pmd_val(pmd));
+
+   return pte_dirty(pte);
+}
+
 static inline unsigned long pmd_write(pmd_t pmd)
 {
pte_t pte = __pte(pmd_val(pmd));
@@ -723,6 +730,15 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
return __pmd(pte_val(pte));
 }
 
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+   pte_t pte = __pte(pmd_val(pmd));
+
+   pte = pte_mkclean(pte);
+
+   return __pmd(pte_val(pte));
+}
+
 static inline pmd_t pmd_mkyoung(pmd_t pmd)
 {
pte_t pte = __pte(pmd_val(pmd));
-- 
2.0.0



[PATCH v14 7/8] arm64: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.

Cc: Will Deacon 
Cc: Russell King 
Cc: linux-arm-ker...@lists.infradead.org
Cc: Steve Capper 
Cc: Catalin Marinas 
Acked-by: Steve Capper 
Acked-by: Catalin Marinas 
Signed-off-by: Minchan Kim 
---
 arch/arm64/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index ffe1ba0506d1..efb1b2fc4d39 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -259,10 +259,12 @@ static inline pmd_t pte_pmd(pte_t pte)
 #endif
 
 #define pmd_young(pmd) pte_young(pmd_pte(pmd))
+#define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
 #define pmd_wrprotect(pmd) pte_pmd(pte_wrprotect(pmd_pte(pmd)))
 #define pmd_mksplitting(pmd)   pte_pmd(pte_mkspecial(pmd_pte(pmd)))
 #define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)   pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mkclean(pmd)   pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkdirty(pmd)   pte_pmd(pte_mkdirty(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)   pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mknotpresent(pmd)  (__pmd(pmd_val(pmd) & ~PMD_TYPE_MASK))
-- 
2.0.0



[PATCH v14 6/8] arm: add pmd_mkclean for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it.

This patch adds pmd_mkclean for THP page MADV_FREE support.

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Russell King 
Cc: linux-arm-ker...@lists.infradead.org
Cc: Steve Capper 
Acked-by: Steve Capper 
Signed-off-by: Minchan Kim 
---
 arch/arm/include/asm/pgtable-3level.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 06e0bc0f8b00..bc913a065270 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -234,6 +234,7 @@ PMD_BIT_FUNC(mkold, &= ~PMD_SECT_AF);
 PMD_BIT_FUNC(mksplitting, |= L_PMD_SECT_SPLITTING);
 PMD_BIT_FUNC(mkwrite,   &= ~L_PMD_SECT_RDONLY);
 PMD_BIT_FUNC(mkdirty,   |= L_PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mkclean,   &= ~L_PMD_SECT_DIRTY);
 PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 #define pmd_mkhuge(pmd)(__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
-- 
2.0.0



[PATCH v14 5/8] s390: add pmd_[dirty|mkclean] for THP

2014-08-13 Thread Minchan Kim
MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the
contents of a THP page have been overwritten since the MADV_FREE
syscall was called on it. For s390 pmds, however, only the referenced
bit is available because there is no free bit left in the pmd entry
for a software dirty bit. So, as suggested by Martin, this patch adds
a dummy pmd_dirty which always returns true.

A proper solution is expected to be found in the future.
http://marc.info/?l=linux-api&m=140440328820808&w=2

Cc: Martin Schwidefsky 
Cc: Heiko Carstens 
Cc: Dominik Dingel 
Cc: Christian Borntraeger 
Cc: linux-s...@vger.kernel.org
Cc: Gerald Schaefer 
Acked-by: Gerald Schaefer 
Signed-off-by: Minchan Kim 
---
 arch/s390/include/asm/pgtable.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index b76317c1f3eb..ad4c855b026d 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1612,6 +1612,18 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
return pmd;
 }
 
+static inline int pmd_dirty(pmd_t pmd)
+{
+   /* No dirty bit in the segment table entry */
+   return 1;
+}
+
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+   /* No dirty bit in the segment table entry */
+   return pmd;
+}
+
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp)
-- 
2.0.0


