[PATCH v2 13/13] powerpc/configs: Enable secure guest support in pseries and ppc64 defconfigs

2019-07-12 Thread Thiago Jung Bauermann
From: Ryan Grimm Enables running as a secure guest in platforms with an Ultravisor. Signed-off-by: Ryan Grimm Signed-off-by: Ram Pai Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/configs/ppc64_defconfig | 1 + arch/powerpc/configs/pseries_defconfig | 1 + 2 files changed, 2 inserti

[PATCH v2 12/13] powerpc/pseries/svm: Force SWIOTLB for secure guests

2019-07-12 Thread Thiago Jung Bauermann
From: Anshuman Khandual SWIOTLB checks range of incoming CPU addresses to be bounced and sees if the device can access it through its DMA window without requiring bouncing. In such cases it just chooses to skip bouncing. But for cases like secure guests on powerpc platform all addresses need to b

[PATCH v2 11/13] powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests

2019-07-12 Thread Thiago Jung Bauermann
Secure guest memory is inaccessible to devices so regular DMA isn't possible. In that case set devices' dma_map_ops to NULL so that the generic DMA code path will use SWIOTLB and DMA to bounce buffers. Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/platforms/pseries/iommu.c | 6 +- 1

[PATCH v2 10/13] powerpc/pseries/svm: Disable doorbells in SVM guests

2019-07-12 Thread Thiago Jung Bauermann
From: Sukadev Bhattiprolu Normally, the HV emulates some instructions like MSGSNDP, MSGCLRP from a KVM guest. To emulate the instructions, it must first read the instruction from the guest's memory and decode its parameters. However for a secure guest (aka SVM), the page containing the instructi

[PATCH v2 09/13] powerpc/pseries/svm: Export guest SVM status to user space via sysfs

2019-07-12 Thread Thiago Jung Bauermann
From: Ryan Grimm User space might want to know it's running in a secure VM. It can't do a mfmsr because mfmsr is a privileged instruction. The solution here is to create a cpu attribute: /sys/devices/system/cpu/svm which will read 0 or 1 based on the S bit of the guest's CPU 0. Signed-off-by
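As a rough illustration of how user space might consume the attribute described above, the following C sketch reads a 0/1 flag from a sysfs-style file. Only the path /sys/devices/system/cpu/svm comes from the message; the helper name read_svm_flag and its -1 "unknown" return value are assumptions made for this example.

```c
#include <stdio.h>

/* Path named in the patch description. */
#define SVM_SYSFS_PATH "/sys/devices/system/cpu/svm"

/*
 * Hypothetical helper (not from the patch): returns 1 if the file at
 * `path` reads as 1, 0 if it reads as 0, and -1 if the file is missing
 * or holds anything else (for example on a kernel without the attribute).
 */
int read_svm_flag(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	return (value == 0 || value == 1) ? value : -1;
}
```

A caller would typically invoke read_svm_flag(SVM_SYSFS_PATH) and treat -1 as "cannot tell", since older kernels will not expose the file.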

[PATCH v2 08/13] powerpc/pseries/svm: Unshare all pages before kexecing a new kernel

2019-07-12 Thread Thiago Jung Bauermann
From: Ram Pai A new kernel deserves a clean slate. Any pages shared with the hypervisor are unshared before invoking the new kernel. However there are exceptions. If the new kernel is invoked to dump the current kernel, or if there is an explicit request to preserve the state of the current kernel,

[PATCH v2 07/13] powerpc/pseries/svm: Use shared memory for Debug Trace Log (DTL)

2019-07-12 Thread Thiago Jung Bauermann
From: Anshuman Khandual Secure guests need to share the DTL buffers with the hypervisor. To that end, use a kmem_cache constructor which converts the underlying buddy allocated SLUB cache pages into shared memory. Signed-off-by: Anshuman Khandual Signed-off-by: Thiago Jung Bauermann --- arch/

[PATCH v2 06/13] powerpc/pseries/svm: Use shared memory for LPPACA structures

2019-07-12 Thread Thiago Jung Bauermann
From: Anshuman Khandual LPPACA structures need to be shared with the host. Hence they need to be in shared memory. Instead of allocating individual chunks of memory for a given structure from memblock, a contiguous chunk of memory is allocated and then converted into shared memory. Subsequent all

[PATCH v2 05/13] powerpc/pseries: Add and use LPPACA_SIZE constant

2019-07-12 Thread Thiago Jung Bauermann
Helps document what the hard-coded number means. Also take the opportunity to fix an #endif comment. Suggested-by: Alexey Kardashevskiy Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/kernel/paca.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/power

[PATCH v2 04/13] powerpc/pseries/svm: Add helpers for UV_SHARE_PAGE and UV_UNSHARE_PAGE

2019-07-12 Thread Thiago Jung Bauermann
From: Ram Pai These functions are used when the guest wants to grant the hypervisor access to certain pages. Signed-off-by: Ram Pai Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/include/asm/ultravisor-api.h | 2 ++ arch/powerpc/include/asm/ultravisor.h | 15 +++ 2 fil

[PATCH v2 03/13] powerpc/prom_init: Add the ESM call to prom_init

2019-07-12 Thread Thiago Jung Bauermann
From: Ram Pai Make the Enter-Secure-Mode (ESM) ultravisor call to switch the VM to secure mode. Add "svm=" command line option to turn on switching to secure mode. Signed-off-by: Ram Pai [ andmike: Generate an RTAS os-term hcall when the ESM ucall fails. ] Signed-off-by: Michael Anderson [ bau

[RFC PATCH v2 02/13] powerpc: Add support for adding an ESM blob to the zImage wrapper

2019-07-12 Thread Thiago Jung Bauermann
From: Benjamin Herrenschmidt For secure VMs, the signing tool will create a ticket called the "ESM blob" for the Enter Secure Mode ultravisor call with the signatures of the kernel and initrd among other things. This adds support to the wrapper script for adding that blob via the "-e" option to

[PATCH v2 01/13] powerpc/pseries: Introduce option to build secure virtual machines

2019-07-12 Thread Thiago Jung Bauermann
Introduce CONFIG_PPC_SVM to control support for secure guests and include Ultravisor-related helpers when it is selected. Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/include/asm/ultravisor.h | 2 +- arch/powerpc/kernel/Makefile | 4 +++- arch/powerpc/platforms/pseries/Kconf

[PATCH v2 00/13] Secure Virtual Machine Enablement

2019-07-12 Thread Thiago Jung Bauermann
Hello, The main change in this version was to rebase on top of cleanup series I just posted: https://lore.kernel.org/linuxppc-dev/20190713044554.28719-1-bauer...@linux.ibm.com/ In addition to the patches above, this patch series applies on top of v4 of Claudio Carvalho's "kvmppc: Paravirtualize

Re: [PATCH 0/3] Remove x86-specific code from generic headers

2019-07-12 Thread Thiago Jung Bauermann
I forgot to mark this series as v2 when generating the patches. Sorry for the confusion. -- Thiago Jung Bauermann IBM Linux Technology Center

[PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-12 Thread Thiago Jung Bauermann
Secure Memory Encryption is an x86-specific feature, so it shouldn't appear in generic kernel code. In DMA mapping code, Christoph Hellwig mentioned that "There is no reason why we should have a special debug printk just for one specific reason why there is a requirement for a large DMA mask.", so

[PATCH 3/3] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-12 Thread Thiago Jung Bauermann
Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't appear in generic kernel code because it forces non-x86 architectures to define the sev_active() function, which doesn't make a lot of sense. To solve this problem, add an x86 elfcorehdr_read() function to override the gen

[PATCH 1/3] x86, s390: Move ARCH_HAS_MEM_ENCRYPT definition to arch/Kconfig

2019-07-12 Thread Thiago Jung Bauermann
powerpc is also going to use this feature, so put it in a generic location. Signed-off-by: Thiago Jung Bauermann Reviewed-by: Thomas Gleixner --- arch/Kconfig | 3 +++ arch/s390/Kconfig | 3 --- arch/x86/Kconfig | 4 +--- 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/arch

[PATCH 0/3] Remove x86-specific code from generic headers

2019-07-12 Thread Thiago Jung Bauermann
Hello, This version mostly changes patch 2/3, removing dma_check_mask() from kernel/dma/mapping.c as suggested by Christoph Hellwig, and also adapting s390's . There's also a small change in patch 1/3 as mentioned in the changelog below. Patch 3/3 may or may not need to change s390 code depending

[GIT PULL] Please pull powerpc/linux.git powerpc-5.3-1 tag

2019-07-12 Thread Michael Ellerman
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Linus, Please pull powerpc updates for 5.3. A bit of a small batch for us, just due to me not getting the time to review things. Only one conflict that I'm aware of, in our pgtable.h, resolution is simply to take both sides. cheers The followin

Re: [PATCH 1/3] KVM: PPC: Book3S HV: Always save guest pmu for guest capable of nesting

2019-07-12 Thread Michael Ellerman
Suraj Jitindar Singh writes: > The performance monitoring unit (PMU) registers are saved on guest exit > when the guest has set the pmcregs_in_use flag in its lppaca, if it > exists, or unconditionally if it doesn't. If a nested guest is being > run then the hypervisor doesn't, and in most cases c

[PATCH] powerpc: remove meaningless KBUILD_ARFLAGS addition

2019-07-12 Thread Masahiro Yamada
The KBUILD_ARFLAGS addition in arch/powerpc/Makefile has never worked in a useful way because it is always overridden by the following code in the top Makefile: # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D) The code in the top Makefile was added in 2011

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 04:00:26PM +0100, Al Viro wrote: > On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote: > > > if (flags & LOOKUP_BENEATH) { > > nd->root = nd->path; > > if (!(flags & LOOKUP_RCU)) > > path_get(&nd->root); > > e

Re: [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs

2019-07-12 Thread Benjamin Herrenschmidt
On Fri, 2019-07-12 at 19:37 +1000, Alexey Kardashevskiy wrote: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/irq.c#n614 > > If so, then in order to do EOI, I'll need the desc which is gone, or > I am missing the point? All you need is drop the loc

Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-12 Thread Thiago Jung Bauermann
[ Cc'ing Tom Lendacky which I forgot to do earlier. Sorry about that. ] Hello Christoph, Christoph Hellwig writes: > Honestly I think this code should go away without any replacement. > There is no reason why we should have a special debug printk just > for one specific reason why there is a

Re: [PATCH 1/3] x86/Kconfig: Move ARCH_HAS_MEM_ENCRYPT to arch/Kconfig

2019-07-12 Thread Thiago Jung Bauermann
Hello Thomas, Thanks for quickly reviewing the patches. Thomas Gleixner writes: > On Fri, 12 Jul 2019, Thiago Jung Bauermann wrote: > >> powerpc and s390 are going to use this feature as well, so put it in a >> generic location. >> >> Signed-off-by: Thiago Jung Bauermann > > Reviewed-by: Th

Re: [PATCH 3/3] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-12 Thread Thiago Jung Bauermann
[ Cc'ing Tom Lendacky which I forgot to do earlier. Sorry about that. ] Hello Halil, Thanks for the quick review. Halil Pasic writes: > On Fri, 12 Jul 2019 02:36:31 -0300 > Thiago Jung Bauermann wrote: > >> Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't >> appea

Re: [PATCH v4 1/8] KVM: PPC: Ultravisor: Introduce the MSR_S bit

2019-07-12 Thread Claudio Carvalho
On 7/11/19 9:57 PM, Nicholas Piggin wrote: > Claudio Carvalho's on June 29, 2019 6:08 am: >> From: Sukadev Bhattiprolu >> >> The ultravisor processor mode is introduced in POWER platforms that >> support the Protected Execution Facility (PEF). Ultravisor is higher >> privileged than hypervisor

Re: [PATCH 01/12] Documentation: move architectures together

2019-07-12 Thread Jonathan Corbet
On Fri, 12 Jul 2019 10:20:07 +0800 Alex Shi wrote: > There are many different archs in Documentation/ dir, it's better to > move them together in 'Documentation/arch' which follows from kernel source. So this seems certain to collide badly with Mauro's RST-conversion monster patch set. More to

Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-12 Thread Thomas Gleixner
On Fri, 12 Jul 2019, Thiago Jung Bauermann wrote: > diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h > index b310a9c18113..f2e399fb626b 100644 > --- a/include/linux/mem_encrypt.h > +++ b/include/linux/mem_encrypt.h > @@ -21,23 +21,11 @@ > > #else/* !CONFIG_ARCH_HAS_

Re: [PATCH 1/3] x86/Kconfig: Move ARCH_HAS_MEM_ENCRYPT to arch/Kconfig

2019-07-12 Thread Thomas Gleixner
On Fri, 12 Jul 2019, Thiago Jung Bauermann wrote: > powerpc and s390 are going to use this feature as well, so put it in a > generic location. > > Signed-off-by: Thiago Jung Bauermann Reviewed-by: Thomas Gleixner

[Bug 204125] FTBFS on ppc64 big endian and gcc9 because of -mcall-aixdesc and missing __linux__

2019-07-12 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=204125 --- Comment #8 from Daniel Kolesa (li...@octaforge.org) --- Using this patch on my machines now: https://gist.github.com/q66/625cbec5d7317829a302773f89533b51 seems to work well -- You are receiving this mail because: You are watching the assigne

Re: [PATCH] treewide: Rename rcu_dereference_raw_notrace to _check

2019-07-12 Thread Joel Fernandes
On Fri, Jul 12, 2019 at 08:01:07AM -0700, Paul E. McKenney wrote: > On Thu, Jul 11, 2019 at 04:45:41PM -0400, Joel Fernandes (Google) wrote: > > The rcu_dereference_raw_notrace() API name is confusing. > > It is equivalent to rcu_dereference_raw() except that it also does > > sparse pointer checkin

Re: [PATCH v9 00/10] namei: openat2(2) path resolution restrictions

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Al Viro wrote: > On Sun, Jul 07, 2019 at 12:57:27AM +1000, Aleksa Sarai wrote: > > Patch changelog: > > v9: > > * Replace resolveat(2) with openat2(2). [Linus] > > * Output a warning to dmesg if may_open_magiclink() is violated. > > * Add an openat2(O_CREAT) testcase.

Re: [PATCH v3] powerpc/setup_64: fix -Wempty-body warnings

2019-07-12 Thread Qian Cai
Ping. On Fri, 2019-06-28 at 10:03 -0400, Qian Cai wrote: > At the beginning of setup_64.c, it has, > >   #ifdef DEBUG >   #define DBG(fmt...) udbg_printf(fmt) >   #else >   #define DBG(fmt...) >   #endif > > where DBG() could be compiled away, and generate warnings, > > arch/powerpc/kernel/setu
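The warning arises because a disabled DBG() leaves `if (cond) DBG(...);` with an empty body. The truncated preview does not show which fix the patch uses, so the following is only a sketch of one common approach: give the disabled macro a `do { } while (0)` body. The names DBG_SKETCH and check_value are hypothetical, and fprintf stands in for udbg_printf so the example builds in user space.

```c
#include <stdio.h>

/* Define DEBUG before this point to turn the messages on. */
#ifdef DEBUG
#define DBG_SKETCH(...) fprintf(stderr, __VA_ARGS__)
#else
/*
 * An empty statement wrapped in do/while: `if (x) DBG_SKETCH("...");`
 * still has a real statement as its body, so -Wempty-body has
 * nothing to flag.
 */
#define DBG_SKETCH(...) do { } while (0)
#endif

/* Hypothetical caller: returns 1 when `value` is negative, else 0. */
int check_value(int value)
{
	if (value < 0)
		DBG_SKETCH("negative value: %d\n", value);

	return value < 0;
}
```

The do/while form also keeps the macro safe inside unbraced if/else chains, which is why it is a frequent choice for statement-like macros.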

Re: [PATCH] powerpc/powernv: fix a W=1 compilation warning

2019-07-12 Thread Qian Cai
Ping. On Wed, 2019-05-22 at 12:09 -0400, Qian Cai wrote: > The commit b575c731fe58 ("powerpc/powernv/npu: Add set/unset window > helpers") called pnv_npu_set_window() in a void function > pnv_npu_dma_set_32(), but the return code from pnv_npu_set_window() has > no use there as all the error loggin

Re: [PATCH kernel v4 0/4 repost] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-07-12 Thread Christoph Hellwig
On Fri, Jul 12, 2019 at 07:45:05PM +1000, Alexey Kardashevskiy wrote: > This is an attempt to allow DMA masks between 32..59 which are not large > enough to use either a PHB3 bypass mode or a sketchy bypass. Depending > on the max order, up to 40 is usually available. Can you elaborate what you ma

Re: [PATCH kernel v4 4/4] powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pages

2019-07-12 Thread Christoph Hellwig
> -extern struct iommu_table *iommu_init_table(struct iommu_table * tbl, > - int nid); > +extern struct iommu_table *iommu_init_table_res(struct iommu_table *tbl, > + int nid, unsigned long res_start, unsigned long res_end); > +#define iommu_init_

Re: [PATCH kernel v4 2/4] powerpc/iommu: Allow bypass-only for DMA

2019-07-12 Thread Christoph Hellwig
> This skips the 32bit DMA setup check if the bypass is can be selected. That sentence does not parse. I think you need to drop the "can be" based on the actual patch.

Re: [PATCH v9 00/10] namei: openat2(2) path resolution restrictions

2019-07-12 Thread Al Viro
On Sun, Jul 07, 2019 at 12:57:27AM +1000, Aleksa Sarai wrote: > Patch changelog: > v9: > * Replace resolveat(2) with openat2(2). [Linus] > * Output a warning to dmesg if may_open_magiclink() is violated. > * Add an openat2(O_CREAT) testcase. One general note for the future, BTW: for

Re: [PATCH 3/3] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-12 Thread Christoph Hellwig
On Fri, Jul 12, 2019 at 04:51:53PM +0200, Halil Pasic wrote: > Thank you very much! I will have another look, but it seems to me, > without further measures taken, this would break protected virtualization > support on s390. The effect of the change for s390 is that > force_dma_unencrypted() will alwa

Re: [PATCH] treewide: Rename rcu_dereference_raw_notrace to _check

2019-07-12 Thread Paul E. McKenney
On Thu, Jul 11, 2019 at 04:45:41PM -0400, Joel Fernandes (Google) wrote: > The rcu_dereference_raw_notrace() API name is confusing. > It is equivalent to rcu_dereference_raw() except that it also does > sparse pointer checking. > > There are only a few users of rcu_dereference_raw_notrace(). This

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 02:25:53PM +0100, Al Viro wrote: > if (flags & LOOKUP_BENEATH) { > nd->root = nd->path; > if (!(flags & LOOKUP_RCU)) > path_get(&nd->root); > else > nd->root_seq = nd->seq; BTW, thi

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Michal Hocko
On Fri 12-07-19 15:37:30, Will Deacon wrote: > Hi all, > > On Fri, Jul 12, 2019 at 02:12:23PM +0200, Michal Hocko wrote: > > On Fri 12-07-19 10:56:47, Hoan Tran OS wrote: > > [...] > > > It would be good if we can enable it by-default. Otherwise, let arch > > > enables it by them-self. Do you hav

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Will Deacon
Hi all, On Fri, Jul 12, 2019 at 02:12:23PM +0200, Michal Hocko wrote: > On Fri 12-07-19 10:56:47, Hoan Tran OS wrote: > [...] > > It would be good if we can enable it by-default. Otherwise, let arch > > enables it by them-self. Do you have any suggestions? > > I can hardly make any suggestions w

Re: [PATCH 3/3] fs/core/vmcore: Move sev_active() reference to x86 arch code

2019-07-12 Thread Christoph Hellwig
On Fri, Jul 12, 2019 at 03:09:12PM +0200, Halil Pasic wrote: > This is the implementation for the guys that don't > have ARCH_HAS_MEM_ENCRYPT. > > Means sev_active() may not be used in such code after this > patch. What about > > static inline bool force_dma_unencrypted(void) > { > retur

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 01:55:52PM +0100, Al Viro wrote: > On Fri, Jul 12, 2019 at 01:39:24PM +0100, Al Viro wrote: > > On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote: > > > > > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata > > > > > *nd, unsigned flags) >

Re: [PATCH v9 01/10] namei: obey trailing magic-link DAC permissions

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 10:20:17PM +1000, Aleksa Sarai wrote: > On 2019-07-12, Al Viro wrote: > > On Sun, Jul 07, 2019 at 12:57:28AM +1000, Aleksa Sarai wrote: > > > @@ -514,7 +516,14 @@ static void set_nameidata(struct nameidata *p, int > > > dfd, struct filename *name) > > > p->stack = p->int

Re: [PATCH] powerpc: mm: Limit rma_size to 1TB when running without HV mode

2019-07-12 Thread Michael Ellerman
Suraj Jitindar Singh writes: > The virtual real mode addressing (VRMA) mechanism is used when a > partition is using HPT (Hash Page Table) translation and performs > real mode accesses (MSR[IR|DR] = 0) in non-hypervisor mode. In this > mode effective address bits 0:23 are treated as zero (i.e. the
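The 1TB figure in the subject follows from the bit arithmetic in the quoted description. Power ISA numbers bits from the most significant end, so treating bits 0:23 of a 64-bit effective address as zero leaves 40 low-order bits, and 2^40 bytes is 1 TiB. The sketch below spells that out; the macro and function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Bits 0:23 (IBM numbering, bit 0 is the most significant) are ignored. */
#define VRMA_IGNORED_BITS 24
#define VRMA_ADDR_BITS    (64 - VRMA_IGNORED_BITS)          /* 40 */
#define VRMA_LIMIT        (UINT64_C(1) << VRMA_ADDR_BITS)   /* 1 TiB */

/* Hypothetical helper: the part of an effective address that survives. */
uint64_t vrma_effective(uint64_t ea)
{
	return ea & (VRMA_LIMIT - 1);
}

/* Hypothetical helper: clamp a proposed RMA size to the addressable limit. */
uint64_t clamp_rma_size(uint64_t rma_size)
{
	return rma_size < VRMA_LIMIT ? rma_size : VRMA_LIMIT;
}
```

Any address at or above VRMA_LIMIT would alias a lower one after masking, which is why a larger rma_size could not be used safely in this mode.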

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 01:39:24PM +0100, Al Viro wrote: > On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote: > > > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata > > > > *nd, unsigned flags) > > > > s = ERR_PTR(error); > > > >

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Al Viro
On Fri, Jul 12, 2019 at 08:57:45PM +1000, Aleksa Sarai wrote: > > > @@ -2350,9 +2400,11 @@ static const char *path_init(struct nameidata *nd, > > > unsigned flags) > > > s = ERR_PTR(error); > > > return s; > > > } > > > - error = dirfd_path_init(nd); > > > - if (unli

Re: [PATCH v2] powerpc/book3s/mm: Update Oops message to print the correct translation in use

2019-07-12 Thread Christophe Leroy
On 12/07/2019 at 14:22, Michael Ellerman wrote: Christophe Leroy writes: On 12/07/2019 at 08:25, Michael Ellerman wrote: "Aneesh Kumar K.V" writes: ... diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 11caa0291254..b181d6860f28 100644 --- a/arch/powerpc/ke

Re: [PATCH v2] powerpc/book3s/mm: Update Oops message to print the correct translation in use

2019-07-12 Thread Michael Ellerman
Christophe Leroy writes: > On 12/07/2019 at 08:25, Michael Ellerman wrote: >> "Aneesh Kumar K.V" writes: ... >>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c >>> index 11caa0291254..b181d6860f28 100644 >>> --- a/arch/powerpc/kernel/traps.c >>> +++ b/arch/powerpc/kernel

Re: [PATCH v9 01/10] namei: obey trailing magic-link DAC permissions

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Al Viro wrote: > On Sun, Jul 07, 2019 at 12:57:28AM +1000, Aleksa Sarai wrote: > > @@ -514,7 +516,14 @@ static void set_nameidata(struct nameidata *p, int > > dfd, struct filename *name) > > p->stack = p->internal; > > p->dfd = dfd; > > p->name = name; > > - p->total_

Re: [PATCH v9 04/10] namei: split out nd->dfd handling to dirfd_path_init

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Aleksa Sarai wrote: > On 2019-07-12, Al Viro wrote: > > On Sun, Jul 07, 2019 at 12:57:31AM +1000, Aleksa Sarai wrote: > > > Previously, path_init's handling of *at(dfd, ...) was only done once, > > > but with LOOKUP_BENEATH (and LOOKUP_IN_ROOT) we have to parse the > > > initial nd

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Michal Hocko
On Fri 12-07-19 10:56:47, Hoan Tran OS wrote: [...] > It would be good if we can enable it by-default. Otherwise, let arch > enables it by them-self. Do you have any suggestions? I can hardly make any suggestions when it is not really clear _why_ you want to remove this config option in the first

Re: [PATCH v9 04/10] namei: split out nd->dfd handling to dirfd_path_init

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Al Viro wrote: > On Sun, Jul 07, 2019 at 12:57:31AM +1000, Aleksa Sarai wrote: > > Previously, path_init's handling of *at(dfd, ...) was only done once, > > but with LOOKUP_BENEATH (and LOOKUP_IN_ROOT) we have to parse the > > initial nd->path at different times (before or after abs

Re: [PATCH v9 05/10] namei: O_BENEATH-style path resolution flags

2019-07-12 Thread Aleksa Sarai
On 2019-07-12, Al Viro wrote: > On Sun, Jul 07, 2019 at 12:57:32AM +1000, Aleksa Sarai wrote: > > @@ -1442,8 +1464,11 @@ static int follow_dotdot_rcu(struct nameidata *nd) > > struct inode *inode = nd->inode; > > > > while (1) { > > - if (path_equal(&nd->path, &nd->root)) > > +

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Hoan Tran OS
Hi, On 7/12/19 2:02 PM, Michal Hocko wrote: > On Thu 11-07-19 23:25:44, Hoan Tran OS wrote: >> In NUMA layout which nodes have memory ranges that span across other nodes, >> the mm driver can detect the memory node id incorrectly. >> >> For example, with layout below >> Node 0 address: 0

Re: [PATCH V2] mm/ioremap: Probe platform for p4d huge map support

2019-07-12 Thread Anshuman Khandual
On 07/12/2019 12:37 PM, Michael Ellerman wrote: > Anshuman Khandual writes: >> On 07/03/2019 04:36 AM, Andrew Morton wrote: >>> On Fri, 28 Jun 2019 10:50:31 +0530 Anshuman Khandual >>> wrote: >>> Finishing up what the commit c2febafc67734a ("mm: convert generic code to 5-level pagin

[PATCH kernel v4 3/4] powerpc/powernv/ioda2: Allocate TCE table levels on demand for default DMA window

2019-07-12 Thread Alexey Kardashevskiy
We allocate only the first level of multilevel TCE tables for KVM already (alloc_userspace_copy==true), and the rest is allocated on demand. This is not enabled though for bare metal. This removes the KVM limitation (implicit, via the alloc_userspace_copy parameter) and always allocates just the f

[PATCH kernel v4 2/4] powerpc/iommu: Allow bypass-only for DMA

2019-07-12 Thread Alexey Kardashevskiy
POWER8 and newer support a bypass mode which maps all host memory to PCI buses so an IOMMU table is not always required. However if we fail to create such a table, the DMA setup fails and the kernel does not boot. This skips the 32bit DMA setup check if the bypass is can be selected. Signed-off-b

[PATCH kernel v4 4/4] powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pages

2019-07-12 Thread Alexey Kardashevskiy
At the moment we create a small window only for 32bit devices, the window maps 0..2GB of the PCI space only. For other devices we either use a sketchy bypass or hardware bypass but the former can only work if the amount of RAM is no bigger than the device's DMA mask and the latter requires devices
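The condition mentioned above, that a bypass only works when the amount of RAM is no bigger than the device's DMA mask, can be sketched as a simple comparison. This is a simplified model, not code from the patch: it assumes RAM is contiguous and starts at address 0, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; handles bits >= 64 without overflow. */
static inline uint64_t dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Hypothetical check: can a device with a `bits`-wide DMA mask address
 * every byte of RAM in [0, ram_bytes)?
 */
bool ram_fits_dma_mask(uint64_t ram_bytes, unsigned int bits)
{
	if (ram_bytes == 0)
		return true;

	/* The highest RAM address must not exceed the mask. */
	return (ram_bytes - 1) <= dma_bit_mask(bits);
}
```

Under this model a device with a 40-bit mask can reach all memory on a machine with up to 1 TiB of RAM, while a 32-bit device is limited to 4 GiB.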

[PATCH kernel v4 1/4] powerpc/powernv/ioda: Fix race in TCE level allocation

2019-07-12 Thread Alexey Kardashevskiy
pnv_tce() returns a pointer to a TCE entry and originally a TCE table would be pre-allocated. For the default case of 2GB window the table needs only a single level and that is fine. However if more levels are requested, it is possible to get a race when 2 threads want a pointer to a TCE entry from
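The preview is cut off before it describes the fix, so the following is only a generic sketch of how on-demand allocation of a table level can be made safe when two threads race: allocate speculatively, install the pointer with a compare-and-swap, and let the loser free its copy. It uses C11 atomics in user space; get_or_alloc_level and LEVEL_ENTRIES are hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define LEVEL_ENTRIES 512   /* illustrative size, not from the patch */

/*
 * Hypothetical helper: return the next-level table behind `slot`,
 * allocating it on first use. If two threads race, exactly one
 * compare-and-swap succeeds; the loser frees its copy and uses the
 * winner's table. Returns NULL only if allocation fails.
 */
uint64_t *get_or_alloc_level(_Atomic(uint64_t *) *slot)
{
	uint64_t *cur = atomic_load(slot);
	uint64_t *fresh;

	if (cur)
		return cur;

	fresh = calloc(LEVEL_ENTRIES, sizeof(*fresh));
	if (!fresh)
		return NULL;

	/* On failure, `cur` is updated to the pointer another thread installed. */
	if (atomic_compare_exchange_strong(slot, &cur, fresh))
		return fresh;

	free(fresh);
	return cur;
}
```

The design choice here is to tolerate a wasted allocation in the rare racing case instead of taking a lock on every lookup.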

[PATCH kernel v4 0/4 repost] powerpc/ioda2: Yet another attempt to allow DMA masks between 32 and 59

2019-07-12 Thread Alexey Kardashevskiy
This is an attempt to allow DMA masks between 32..59 which are not large enough to use either a PHB3 bypass mode or a sketchy bypass. Depending on the max order, up to 40 is usually available. Changelogs are in the patches. This is based on sha1 a2b6f26c264e Christophe Leroy "powerpc/module64: U

Re: [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs

2019-07-12 Thread Alexey Kardashevskiy
On 12/07/2019 18:29, Benjamin Herrenschmidt wrote: On Fri, 2019-07-12 at 18:20 +1000, Alexey Kardashevskiy wrote: diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c index 082c7e1c20f0..65742e280337 100644 --- a/arch/powerpc/sysdev/xive/common.c +++ b/arch/pow

Re: [PATCH V2] mm/ioremap: Probe platform for p4d huge map support

2019-07-12 Thread Stephen Rothwell
Hi all, On Fri, 12 Jul 2019 17:07:48 +1000 Michael Ellerman wrote: > > The return value of arch_ioremap_p4d_supported() is stored in the > variable ioremap_p4d_capable which is then returned by > ioremap_p4d_enabled(). > > That is used by ioremap_try_huge_p4d() called from ioremap_p4d_range() >

Re: [RFC PATCH kernel] powerpc/xive: Drop deregistered irqs

2019-07-12 Thread Benjamin Herrenschmidt
On Fri, 2019-07-12 at 18:20 +1000, Alexey Kardashevskiy wrote: > > diff --git a/arch/powerpc/sysdev/xive/common.c > b/arch/powerpc/sysdev/xive/common.c > index 082c7e1c20f0..65742e280337 100644 > --- a/arch/powerpc/sysdev/xive/common.c > +++ b/arch/powerpc/sysdev/xive/common.c > @@ -148,8 +148,12

[RFC PATCH kernel] powerpc/xive: Drop deregistered irqs

2019-07-12 Thread Alexey Kardashevskiy
There is a race between releasing an irq on one cpu and fetching it from XIVE on another cpu as there does not seem to be any locking between these, probably because xive_irq_chip::irq_shutdown() is supposed to remove the irq from all queues in the system which it does not do. As a result, when su

Re: [PATCH 2/3] DMA mapping: Move SME handling to x86-specific files

2019-07-12 Thread Christoph Hellwig
Honestly I think this code should go away without any replacement. There is no reason why we should have a special debug printk just for one specific reason why there is a requirement for a large DMA mask.

Re: [PATCH V2] mm/ioremap: Probe platform for p4d huge map support

2019-07-12 Thread Michael Ellerman
Anshuman Khandual writes: > On 07/03/2019 04:36 AM, Andrew Morton wrote: >> On Fri, 28 Jun 2019 10:50:31 +0530 Anshuman Khandual >> wrote: >> >>> Finishing up what the commit c2febafc67734a ("mm: convert generic code to >>> 5-level paging") started out while levelling up P4D huge mapping suppor

[RFC v4 3/3] cpuidle-powernv : Recompute the idle-state timeouts when state usage is enabled/disabled

2019-07-12 Thread Abhishek Goel
The disable callback can be used to compute timeout for other states whenever a state is enabled or disabled. We store the computed timeout in "timeout" defined in cpuidle state structure. So, we compute timeout only when some state is enabled or disabled and not every time in the fast idle path. We

[RFC v4 2/3] cpuidle : Add callback whenever a state usage is enabled/disabled

2019-07-12 Thread Abhishek Goel
To force wakeup a cpu, we need to compute the timeout in the fast idle path as a state may be enabled or disabled but there did not exist a feedback to driver when a state is enabled or disabled. This patch adds a callback whenever a state_usage records a store for disable attribute. Signed-off-by

[PATCH v4 1/3] cpuidle-powernv : forced wakeup for stop states

2019-07-12 Thread Abhishek Goel
Currently, the cpuidle governors determine what idle state an idling CPU should enter into based on heuristics that depend on the idle history on that CPU. Given that no predictive heuristic is perfect, there are cases where the governor predicts a shallow idle state, hoping that the CPU will be bus

[PATCH v4 0/3] Forced-wakeup for stop states on Powernv

2019-07-12 Thread Abhishek Goel
Currently, the cpuidle governors determine what idle state an idling CPU should enter into based on heuristics that depend on the idle history on that CPU. Given that no predictive heuristic is perfect, there are cases where the governor predicts a shallow idle state, hoping that the CPU will be bus

Re: [PATCH v2 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2019-07-12 Thread Michal Hocko
On Thu 11-07-19 23:25:44, Hoan Tran OS wrote: > In NUMA layout which nodes have memory ranges that span across other nodes, > the mm driver can detect the memory node id incorrectly. > > For example, with layout below > Node 0 address: > Node 1 address: > >

Re: [PATCH v3 3/3] powerpc/module64: Use symbolic instructions names.

2019-07-12 Thread Michael Ellerman
Christophe Leroy writes: > On 08/07/2019 at 02:56, Michael Ellerman wrote: >> Christophe Leroy writes: >>> To increase readability/maintainability, replace hard coded >>> instructions values by symbolic names. >>> >>> Signed-off-by: Christophe Leroy >>> --- >>> v3: fixed warning by adding () i