[PATCH 2/2] powerpc/powernv: Enable POWER8 doorbell IPIs

2014-06-01 Thread Michael Neuling
This patch enables POWER8 doorbell IPIs on powernv.

Since doorbells can only IPI within a core, we test to see when we can use
doorbells and if not we fall back to XICS.  This also enables hypervisor
doorbells to wake us up from nap/sleep via the LPCR PECEDH bit.

Based on tests by Anton, the best case IPI latency between two threads dropped
from 894ns to 512ns.
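
As a rough illustration of the fallback rule described above (not the kernel
code itself), the choice between a doorbell and XICS can be sketched in plain
C. All demo_* names are hypothetical, and the assumption that the threads of a
core are numbered contiguously is mine; the patch relies on cpu_sibling_mask()
instead.

```c
#include <stdbool.h>

enum demo_ipi_path { DEMO_IPI_DOORBELL, DEMO_IPI_XICS };

/* Hypothetical topology helper: assumes the hardware threads of one core
 * are numbered contiguously, which the patch itself does not state. */
static int demo_core_of(int cpu, int threads_per_core)
{
	return cpu / threads_per_core;
}

/* Use a doorbell only when the CPU supports it and the target is a sibling
 * thread on the same core; otherwise fall back to XICS. */
static enum demo_ipi_path demo_pick_ipi_path(int src_cpu, int dst_cpu,
					     int threads_per_core,
					     bool has_dbell)
{
	if (has_dbell &&
	    demo_core_of(src_cpu, threads_per_core) ==
	    demo_core_of(dst_cpu, threads_per_core))
		return DEMO_IPI_DOORBELL;

	return DEMO_IPI_XICS;
}
```

With 8 threads per core in this model, CPU 0 would reach CPU 7 by doorbell but
CPU 8 through XICS.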

Signed-off-by: Michael Neuling 
---
 arch/powerpc/kernel/cpu_setup_power.S | 2 ++
 arch/powerpc/platforms/powernv/smp.c  | 6 ++++++
 arch/powerpc/sysdev/xics/icp-native.c | 9 ++++++++-
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 1557e7c..4673353 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -56,6 +56,7 @@ _GLOBAL(__setup_cpu_power8)
li  r0,0
mtspr   SPRN_LPID,r0
mfspr   r3,SPRN_LPCR
+   ori r3, r3, LPCR_PECEDH
bl  __init_LPCR
bl  __init_HFSCR
bl  __init_tlb_power8
@@ -74,6 +75,7 @@ _GLOBAL(__restore_cpu_power8)
li  r0,0
mtspr   SPRN_LPID,r0
mfspr   r3,SPRN_LPCR
+   ori r3, r3, LPCR_PECEDH
bl  __init_LPCR
bl  __init_HFSCR
bl  __init_tlb_power8
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 0062a43..5fcfcf4 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "powernv.h"
 
@@ -46,6 +47,11 @@ static void pnv_smp_setup_cpu(int cpu)
 {
if (cpu != boot_cpuid)
xics_setup_cpu();
+
+#ifdef CONFIG_PPC_DOORBELL
+   if (cpu_has_feature(CPU_FTR_DBELL))
+   doorbell_setup_this_cpu();
+#endif
 }
 
 int pnv_smp_kick_cpu(int nr)
diff --git a/arch/powerpc/sysdev/xics/icp-native.c b/arch/powerpc/sysdev/xics/icp-native.c
index 9dee470..de8d948 100644
--- a/arch/powerpc/sysdev/xics/icp-native.c
+++ b/arch/powerpc/sysdev/xics/icp-native.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct icp_ipl {
union {
@@ -145,7 +146,13 @@ static unsigned int icp_native_get_irq(void)
 static void icp_native_cause_ipi(int cpu, unsigned long data)
 {
kvmppc_set_host_ipi(cpu, 1);
-   icp_native_set_qirr(cpu, IPI_PRIORITY);
+#ifdef CONFIG_PPC_DOORBELL
+   if (cpu_has_feature(CPU_FTR_DBELL) &&
+   (cpumask_test_cpu(cpu, cpu_sibling_mask(smp_processor_id()))))
+   doorbell_cause_ipi(cpu, data);
+   else
+#endif
+   icp_native_set_qirr(cpu, IPI_PRIORITY);
 }
 
 void xics_wake_cpu(int cpu)
-- 
1.9.1

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/2] powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry

2014-06-01 Thread Michael Neuling
Currently when entering fastsleep we clear all LPCR PECE bits.

This patch changes it to only clear the decrementer bit (ie. PECE1), which is
the only bit we really need to clear here.  This is needed if we want to set
other wakeup causes like the PECEDH bit so we can use hypervisor doorbells on
powernv.
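
To make the difference concrete, here is a small stand-alone sketch of the two
masks. The DEMO_* bit positions are placeholders chosen for illustration, not
the real LPCR layout from asm/reg.h, and the helper names are made up.

```c
#include <stdint.h>

/* Placeholder bit positions for illustration only; the real values are
 * defined by the kernel headers and are not reproduced here. */
#define DEMO_LPCR_MER	(UINT64_C(1) << 11)
#define DEMO_LPCR_PECE2	(UINT64_C(1) << 12)
#define DEMO_LPCR_PECE1	(UINT64_C(1) << 13)	/* decrementer wakeup */
#define DEMO_LPCR_PECE0	(UINT64_C(1) << 14)	/* external interrupt wakeup */
#define DEMO_LPCR_PECE	(DEMO_LPCR_PECE0 | DEMO_LPCR_PECE1 | DEMO_LPCR_PECE2)

/* Before the patch: MER and every wakeup-enable bit are cleared. */
static uint64_t demo_fastsleep_lpcr_old(uint64_t lpcr)
{
	return lpcr & ~(DEMO_LPCR_MER | DEMO_LPCR_PECE);
}

/* After the patch: MER and only the decrementer wakeup bit are cleared,
 * so other wakeup causes that were already enabled stay enabled. */
static uint64_t demo_fastsleep_lpcr_new(uint64_t lpcr)
{
	return lpcr & ~(DEMO_LPCR_MER | DEMO_LPCR_PECE1);
}
```

Starting from a value with all four demo bits set, the old helper returns 0
while the new one keeps DEMO_LPCR_PECE0 and DEMO_LPCR_PECE2.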

Signed-off-by: Michael Neuling 
---
 drivers/cpuidle/cpuidle-powernv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 719f6fb..7f7798e 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -73,7 +73,7 @@ static int fastsleep_loop(struct cpuidle_device *dev,
return index;
 
new_lpcr = old_lpcr;
-   new_lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */
+   new_lpcr &= ~(LPCR_MER | LPCR_PECE1); /* lpcr[mer] must be 0 */
 
/* exit powersave upon external interrupt, but not decrementer
 * interrupt.
-- 
1.9.1


[git pull] Please pull powerpc.git merge branch

2014-06-01 Thread Benjamin Herrenschmidt
Hi Linus !

Here's just one trivial patch to wire up sys_renameat2 which I
seem to have completely missed so far. (My test build scripts fwd me
warnings but miss the ones generated for missing syscalls).

Cheers,
Ben.

The following changes since commit 011e4b02f1da156ac7fea28a9da878f3c23af739:

  powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode (2014-05-28 13:24:26 +1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge

for you to fetch changes up to 8212f58a9b151d842fa60a70f354e43c61fad839:

  powerpc: Wire renameat2() syscall (2014-06-02 09:24:27 +1000)


Benjamin Herrenschmidt (1):
  powerpc: Wire renameat2() syscall

 arch/powerpc/include/asm/systbl.h  | 1 +
 arch/powerpc/include/asm/unistd.h  | 2 +-
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)



[PATCHv7 2/6] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()

2014-06-01 Thread Yann Droneaud
The get_unused_fd() macro allocates a file descriptor with default
flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be
used by default so that file descriptors do not leak across exec().

Instead of the get_unused_fd() macro, the get_unused_fd_flags()
function (or anon_inode_getfd()) should be used with flags given by
userspace. If that is not possible, flags should be set to O_CLOEXEC
to give userspace a safe default behavior.

In a further patch, get_unused_fd() will be removed so that new code
starts using get_unused_fd_flags() (or anon_inode_getfd()) with correct
flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve current behavior for existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem
basis, and, if possible, set to O_CLOEXEC.

Link: http://lkml.kernel.org/r/cover.1401630396.git.ydrone...@opteya.com
Cc: Al Viro 
Cc: Andrew Morton 
Signed-off-by: Yann Droneaud 
---
 arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf99cd7..51effcec30d8 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
int ret;
struct file *filp;
 
-   ret = get_unused_fd();
+   ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;
 
@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
int ret;
struct file *filp;
 
-   ret = get_unused_fd();
+   ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;
 
-- 
1.9.3


[PATCHv7 0/6] Getting rid of get_unused_fd() / enable close-on-exec

2014-06-01 Thread Yann Droneaud
TL;DR;

These are mostly obvious patches, easy to review, easy to apply:
you want them in the kernel. Now. And get_unused_fd() deserves to
die. Act today !

Hi,

Please find the seventh revision of my patchset to remove the get_unused_fd()
macro in order to help subsystems use get_unused_fd_flags() (or
anon_inode_getfd()) with flags either provided by userspace or
set to O_CLOEXEC by default where appropriate.

Without the get_unused_fd() macro, more subsystems are likely to use
get_unused_fd_flags() (or anon_inode_getfd()) and be taught to
provide an API that lets userspace choose the opening flags
of the file descriptor.

Not allowing userspace to provide the "open" flags, or not using O_CLOEXEC
by default, should be considered bad practice from a security point of view:
in most cases O_CLOEXEC must be used so that file descriptors do not leak
across exec().

Not allowing userspace to atomically set the close-on-exec flag, and not
using O_CLOEXEC, should be avoided in order to protect multi-threaded
programs from the race condition that occurs when they try to set the
close-on-exec flag using fcntl(fd, F_SETFD, FD_CLOEXEC) after opening the
file descriptor.

Example:

    int fd;
    int ret;

    fd = open(filename, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /*
     * window opened for another thread to call fork(),
     * then the new process can call exec() at any time
     * and the file descriptor would be inherited
     */

    ret = fcntl(fd, F_SETFD, FD_CLOEXEC);
    if (ret < 0) {
        perror("fcntl()");
        close(fd);
        return -1;
    }

vs.:

    int fd;

    fd = open(filename, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        perror("open");
        return -1;
    }

Using O_CLOEXEC by default when flags are not (eg. cannot be) provided
by userspace is the safest bet, as it allows userspace to choose if, when
and where the file descriptor is going to be inherited across exec():
userspace is free to call fcntl() to remove the FD_CLOEXEC flag in the child
process that will call exec().

Unfortunately, O_CLOEXEC cannot be made the default for most existing system
calls as it's not the default behavior for POSIX / Unix. Readers interested
in this issue can have a look at the article "Ghosts of Unix past, part 2:
Conflated designs" [1] by Neil Brown.

FAQ:

- Why does one want close-on-exec ?

Setting the close-on-exec flag on a file descriptor ensures it won't be
inherited silently by a child, child of child, etc. when executing another
program.

If the file descriptor is not closed, some kernel resources can stay locked
until the last process holding the open file descriptor exits.

If the file descriptor is not closed, this can lead to a security issue,
eg. making resources available to a less privileged program, allowing
information leaks and/or denial of service.

- Why does one need atomic close-on-exec ?

Even if it's possible to set the close-on-exec flag through a call to fcntl()
as shown previously, doing so introduces a race condition in a multi-threaded
process, where one thread calls fork() / exec() while another thread
is between the calls to open() and fcntl().

Additionally, using close-on-exec frees the programmer from manually tracking
which file descriptors to close in a child before calling exec():
in a program using multiple third-party libraries, it's difficult to know
which file descriptors must be closed.
AFAIK, while there are atexit() and pthread_atfork(), there's no atexec()
userspace function in libc to allow libraries to register a handler in
order to close their open file descriptors before exec().

- Why default to close-on-exec ?

Some kernel interfaces don't allow userspace to pass an O_CLOEXEC-like
flag when creating a new file descriptor.

In such cases, if possible (see below), O_CLOEXEC must be made
the default so that userspace doesn't have to call fcntl()
which, as demonstrated previously, is open to a race condition in
multi-threaded programs.

- How to choose between flag 0 or O_CLOEXEC in call to
  get_unused_fd_flags() (or anon_inode_getfd()) ?

Short: Will it break existing applications ? Will it break the kernel ABI ?

       If the answer is no, use O_CLOEXEC.
       If the answer is yes, use 0.

Long: If userspace expects to retrieve a file descriptor with plain
  old Unix(tm) semantics, O_CLOEXEC must not be the default, as it
  could break some applications expecting that the file descriptor
  will be inherited across exec().

  But for some subsystems, such as InfiniBand, KVM, VFIO, it makes
  no sense to have file descriptors inherited across exec() since
  those are tied to resources that will vanish when another program
  replaces the current one by means of exec(), so it's safe to use
  O_CLOEXEC in such cases.

  For others, like XFS, the file descriptor is retrieved by one
  program and will be used by a different program, executed as a
  child. In this case, setting O_CLOEXEC would break existing
  applications which do not expect to have to call fcntl(fd,