Re: [PATCH 1/1 v4] drivers/nvme: default to 4k device page size

2015-11-06 Thread Nishanth Aravamudan
On 05.11.2015 [11:58:39 -0800], Christoph Hellwig wrote: > Looks fine, > > Reviewed-by: Christoph Hellwig > > ... but I doubt we'll ever bother updating it. Most architectures > with larger page sizes also have iommus and would need different settings > for different iommus vs direct mapping for

Re: [PATCH 1/1 v4] drivers/nvme: default to 4k device page size

2015-11-05 Thread Nishanth Aravamudan
On 05.11.2015 [11:58:39 -0800], Christoph Hellwig wrote: > Looks fine, > > Reviewed-by: Christoph Hellwig > > ... but I doubt we'll ever bother updating it. Most architectures > with larger page sizes also have iommus and would need different settings > for different iommus vs direct mapping for

[PATCH 1/1 v4] drivers/nvme: default to 4k device page size

2015-11-05 Thread Nishanth Aravamudan
On 03.11.2015 [13:46:25 +], Keith Busch wrote: > On Tue, Nov 03, 2015 at 05:18:24AM -0800, Christoph Hellwig wrote: > > On Fri, Oct 30, 2015 at 02:35:11PM -0700, Nishanth Aravamudan wrote: > > > diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c >

Re: [PATCH 1/1 v3] drivers/nvme: default to 4k device page size

2015-10-30 Thread Nishanth Aravamudan
On 30.10.2015 [21:48:48 +], Keith Busch wrote: > On Fri, Oct 30, 2015 at 02:35:11PM -0700, Nishanth Aravamudan wrote: > > Given that it's 4K just about everywhere by default (and sort of > > implicitly expected to be, I guess), I think I'd prefer we default to >

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-30 Thread Nishanth Aravamudan
On 29.10.2015 [18:49:55 -0700], David Miller wrote: > From: Nishanth Aravamudan > Date: Thu, 29 Oct 2015 08:57:01 -0700 > > > So, would that imply changing just the NVMe driver code rather than > > adding the dma_page_shift API at all? What about > > architectures

[PATCH 1/1 v3] drivers/nvme: default to 4k device page size

2015-10-30 Thread Nishanth Aravamudan
On 29.10.2015 [17:20:43 +], Busch, Keith wrote: > On Thu, Oct 29, 2015 at 08:57:01AM -0700, Nishanth Aravamudan wrote: > > On 29.10.2015 [04:55:36 -0700], Christoph Hellwig wrote: > > > We had a quick chat about this issue and I think we simply should > > > default to

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-29 Thread Nishanth Aravamudan
On 29.10.2015 [04:55:36 -0700], Christoph Hellwig wrote: > On Wed, Oct 28, 2015 at 01:59:23PM +, Busch, Keith wrote: > > The "new" interface for all the other architectures is the same as the > > old one we've been using for the last 5 years. > > > > I welcome x86 maintainer feedback to confir

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Nishanth Aravamudan
On 28.10.2015 [11:20:05 +0900], Benjamin Herrenschmidt wrote: > On Tue, 2015-10-27 at 18:54 -0700, Nishanth Aravamudan wrote: > > > > In "bypass" mode, what TCE size is used? Is it guaranteed to be 4K? > > None :-) The TCEs are completely bypassed. You get a N:M

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Nishanth Aravamudan
On 28.10.2015 [12:00:20 +1100], Alexey Kardashevskiy wrote: > On 10/28/2015 09:27 AM, Nishanth Aravamudan wrote: > >On 27.10.2015 [17:02:16 +1100], Alexey Kardashevskiy wrote: > >>On 10/24/2015 07:57 AM, Nishanth Aravamudan wrote: > >>>On Power, the kernel's pa

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-27 Thread Nishanth Aravamudan
On 27.10.2015 [17:53:22 -0700], David Miller wrote: > From: Nishanth Aravamudan > Date: Tue, 27 Oct 2015 15:20:10 -0700 > > > Well, looks like I should spin up a v4 anyways for the powerpc changes. > > So, to make sure I understand your point, should I make the generic >

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-27 Thread Nishanth Aravamudan
On 28.10.2015 [09:57:48 +1100], Julian Calaby wrote: > Hi Nishanth, > > On Wed, Oct 28, 2015 at 9:20 AM, Nishanth Aravamudan > wrote: > > On 26.10.2015 [18:27:46 -0700], David Miller wrote: > >> From: Nishanth Aravamudan > >> Date: Fri, 23 Oct 2015 13:54:2

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Nishanth Aravamudan
On 27.10.2015 [17:02:16 +1100], Alexey Kardashevskiy wrote: > On 10/24/2015 07:57 AM, Nishanth Aravamudan wrote: > >On Power, the kernel's page size can differ from the IOMMU's page size, > >so we need to override the generic implementation, which always returns > >

Re: [PATCH 4/7 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-27 Thread Nishanth Aravamudan
On 27.10.2015 [16:56:10 +1100], Alexey Kardashevskiy wrote: > On 10/24/2015 07:59 AM, Nishanth Aravamudan wrote: > >When DDW (Dynamic DMA Windows) are present for a device, we have stored > >the TCE (Translation Control Entry) size in a special device tree > >property. Check i

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-27 Thread Nishanth Aravamudan
On 26.10.2015 [18:27:46 -0700], David Miller wrote: > From: Nishanth Aravamudan > Date: Fri, 23 Oct 2015 13:54:20 -0700 > > > 1) add a generic dma_get_page_shift implementation that just returns > > PAGE_SHIFT > > I won't object to this patch series, but if I had

Re: [PATCH 5/7] [RFC PATCH 5/7] sparc: rename kernel/iommu_common.h -> include/asm/iommu_common.h

2015-10-23 Thread Nishanth Aravamudan
[Apologies for the subject line, should just have the [RFC PATCH 5/7]] On 23.10.2015 [14:00:08 -0700], Nishanth Aravamudan wrote: > In order to cleanly expose the desired IOMMU page shift via the new > dma_get_page_shift API, we need to have the sparc constants available in > a mor

[PATCH 7/7 v2] drivers/nvme: default to the IOMMU page size

2015-10-23 Thread Nishanth Aravamudan
ge size for the default device page size, rather than the kernel's page size. With this patch, a NVMe device survives our internal hardware exerciser; the kernel BUGs within a few seconds without the patch. Signed-off-by: Nishanth Aravamudan --- v1 -> v2: Based upon feedback from Chris

[RFC PATCH 6/7] sparc/dma-mapping: override dma_get_page_shift

2015-10-23 Thread Nishanth Aravamudan
On sparc, the kernel's page size differs from the IOMMU's page size, so override the generic implementation, which always returns the kernel's page size, and return IOMMU_PAGE_SHIFT instead. Signed-off-by: Nishanth Aravamudan --- I know very little about sparc, so please cor

[PATCH 5/7] [RFC PATCH 5/7] sparc: rename kernel/iommu_common.h -> include/asm/iommu_common.h

2015-10-23 Thread Nishanth Aravamudan
In order to cleanly expose the desired IOMMU page shift via the new dma_get_page_shift API, we need to have the sparc constants available in a more typical location. There should be no functional impact to this move, but it is untested. Signed-off-by: Nishanth Aravamudan --- arch/sparc/include

[PATCH 4/7 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-23 Thread Nishanth Aravamudan
oking the value up in struct iommu_table. If we don't find an iommu_table, fall back to the kernel's page size. Signed-off-by: Nishanth Aravamudan --- arch/powerpc/platforms/pseries/iommu.c | 36 ++ 1 file changed, 36 insertions(+) diff --git a/arch/po

[PATCH 3/7 v2] powerpc/dma: implement per-platform dma_get_page_shift

2015-10-23 Thread Nishanth Aravamudan
. DDW is a pseries-specific feature, so allow platforms to override the implementation of dma_get_page_shift if desired. Signed-off-by: Nishanth Aravamudan --- arch/powerpc/include/asm/machdep.h | 3 ++- arch/powerpc/kernel/dma.c | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-) diff

Re: [PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-23 Thread Nishanth Aravamudan
[Sorry, subject should have been 0/7!] On 23.10.2015 [13:54:20 -0700], Nishanth Aravamudan wrote: > We received a bug report recently when DDW (64-bit direct DMA on Power) > is not enabled for NVMe devices. In that case, we fall back to 32-bit > DMA via the IOMMU, which is always done vi

[PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-23 Thread Nishanth Aravamudan
otherwise. Signed-off-by: Nishanth Aravamudan --- arch/powerpc/include/asm/dma-mapping.h | 3 +++ arch/powerpc/kernel/dma.c | 9 + 2 files changed, 12 insertions(+) diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index 7f522c0..

[PATCH 1/7 v3] dma-mapping: add generic dma_get_page_shift API

2015-10-23 Thread Nishanth Aravamudan
Drivers like NVMe need to be able to determine the page size used for DMA transfers. Add a new API that defaults to return PAGE_SHIFT on all architectures. Signed-off-by: Nishanth Aravamudan --- v1 -> v2: Based upon feedback from Christoph Hellwig, implement the IOMMU page size lookup a

[PATCH 0/5 v3] Fix NVMe driver support on Power with 32-bit DMA

2015-10-23 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PR
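For readers skimming the listing, the mismatch this cover letter describes can be illustrated with a small user-space sketch. This is not code from the series: the 64K kernel page size, the macro names, and both helper functions are assumptions made up for illustration. The only outside fact relied on is that NVMe encodes its memory page size as 2^(12 + MPS).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration: a 64K-page kernel and 4K IOMMU TCEs. */
#define EXAMPLE_KERNEL_PAGE_SHIFT 16u
#define EXAMPLE_IOMMU_PAGE_SHIFT  12u

/* True if dma_addr is aligned to a (1 << shift)-byte boundary. */
static bool dma_addr_is_aligned(uint64_t dma_addr, unsigned int shift)
{
	uint64_t mask = (UINT64_C(1) << shift) - 1;

	return (dma_addr & mask) == 0;
}

/* NVMe encodes the memory page size as 2^(12 + MPS); return the MPS value. */
static unsigned int nvme_mps_for_shift(unsigned int page_shift)
{
	return page_shift - 12;
}
```

An address such as 0x3000 is a valid 4K-aligned mapping but is not 64K-aligned, so a device page size derived from the kernel's page size would not match what a 4K-granular IOMMU hands back; a 4K device page size (MPS = 0) sidesteps that.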

Re: [PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-19 Thread Nishanth Aravamudan
On 15.10.2015 [15:52:19 -0700], Nishanth Aravamudan wrote: > On 14.10.2015 [08:42:51 -0700], Christoph Hellwig wrote: > > Hi Nishanth, > > > > sorry for the late reply. > > > > > > On Power, since it's technically variable, we'd need a function.

Re: [PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-15 Thread Nishanth Aravamudan
On 14.10.2015 [08:42:51 -0700], Christoph Hellwig wrote: > Hi Nishanth, > > sorry for the late reply. > > > > On Power, since it's technically variable, we'd need a function. So are > > > you suggesting define'ing it to a function just on Power and leaving it > > > a constant elsewhere? > > > >

Re: [PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-14 Thread Nishanth Aravamudan
Hi Christoph, On 12.10.2015 [14:06:51 -0700], Nishanth Aravamudan wrote: > On 06.10.2015 [02:51:36 -0700], Christoph Hellwig wrote: > > Do we need a function here or can we just have a IOMMU_PAGE_SHIFT define > > with an #ifndef in common code? > > On Power, since it's

Re: [PATCH 1/2] powerpc/iommu: expose IOMMU page shift

2015-10-12 Thread Nishanth Aravamudan
On 12.10.2015 [09:03:52 -0700], Nishanth Aravamudan wrote: > On 06.10.2015 [14:19:43 +1100], David Gibson wrote: > > On Fri, Oct 02, 2015 at 10:18:00AM -0700, Nishanth Aravamudan wrote: > > > We will leverage this macro in the NVMe driver, which needs to know the > > >

Re: [PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-12 Thread Nishanth Aravamudan
On 06.10.2015 [02:51:36 -0700], Christoph Hellwig wrote: > Do we need a function here or can we just have a IOMMU_PAGE_SHIFT define > with an #ifndef in common code? On Power, since it's technically variable, we'd need a function. So are you suggesting define'ing it to a function just on Power and

Re: [PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-12 Thread Nishanth Aravamudan
On 06.10.2015 [02:51:36 -0700], Christoph Hellwig wrote: > Do we need a function here or can we just have a IOMMU_PAGE_SHIFT define > with an #ifndef in common code? I suppose we could do that -- I wasn't sure if the macro would be palatable. > Also not all architectures use dma-mapping-common.h

Re: [PATCH 1/2] powerpc/iommu: expose IOMMU page shift

2015-10-12 Thread Nishanth Aravamudan
On 06.10.2015 [14:19:43 +1100], David Gibson wrote: > On Fri, Oct 02, 2015 at 10:18:00AM -0700, Nishanth Aravamudan wrote: > > We will leverage this macro in the NVMe driver, which needs to know the > > configured IOMMU page shift to properly configure its device's page >

Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
On 03.10.2015 [07:35:09 +1000], Benjamin Herrenschmidt wrote: > On Fri, 2015-10-02 at 14:04 -0700, Nishanth Aravamudan wrote: > > Right, I did start with your advice and tried that approach, but it > > turned out I was wrong about the actual issue at the time. The problem >

Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
On 03.10.2015 [06:51:06 +1000], Benjamin Herrenschmidt wrote: > On Fri, 2015-10-02 at 13:09 -0700, Nishanth Aravamudan wrote: > > > 1) add a generic dma_get_page_shift implementation that just returns > > PAGE_SHIFT > > So you chose to return the granularity of the iomm

[PATCH 5/5 v2] drivers/nvme: default to the IOMMU page size

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PR

[PATCH 4/5 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
oking the value up in struct iommu_table. If we don't find an iommu_table, fall back to the kernel's page size. Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index 0946b98..1bf6471 100644 --- a/arch/po

[PATCH 3/5 v2] powerpc/dma: implement per-platform dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
. DDW is a pseries-specific feature, so allow platforms to override the implementation of dma_get_page_shift if desired. Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index cab6753..5c372e3 100644 --- a/arch/powerpc/include

[PATCH 2/5 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
otherwise. Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index 7f522c0..c5638f4 100644 --- a/arch/powerpc/include/asm/dma-mapping.h +++ b/arch/powerpc/include/asm/dma-mapping.h @@ -125,6 +125,9 @@ static inline v

[PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-02 Thread Nishanth Aravamudan
Drivers like NVMe need to be able to determine the page size used for DMA transfers. Add a new API that defaults to return PAGE_SHIFT on all architectures. Signed-off-by: Nishanth Aravamudan diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h index

[PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the P

Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Nishanth Aravamudan
On 02.10.2015 [10:25:44 -0700], Christoph Hellwig wrote: > Hi Nishanth, > > please expose this value through the generic DMA API instead of adding > architecture specific hacks to drivers. Ok, I'm happy to do that instead -- what I struggled with is that I don't have enough knowledge of the vario

[PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Nishanth Aravamudan
survives our internal hardware exerciser; the kernel BUGs within a few seconds without the patch. Signed-off-by: Nishanth Aravamudan diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c index 7920c27..969a95e 100644 --- a/drivers/block/nvme-core.c +++ b/drivers/block/nvme-core

[PATCH 1/2] powerpc/iommu: expose IOMMU page shift

2015-10-02 Thread Nishanth Aravamudan
We will leverage this macro in the NVMe driver, which needs to know the configured IOMMU page shift to properly configure its device's page size. Signed-off-by: Nishanth Aravamudan --- Given this is available, it seems reasonable to expose -- and it doesn't really make sense to make

[PATCH 0/2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PR

Re: [PATCH RFC 3/5] powerpc:numa create 1:1 mappaing between chipid and nid

2015-09-28 Thread Nishanth Aravamudan
On 27.09.2015 [23:59:11 +0530], Raghavendra K T wrote: > Once we have made the distinction between nid and chipid > create a 1:1 mapping between them. This makes compacting the > nids easy later. > > No functionality change. > > Signed-off-by: Raghavendra K T > --- > arch/powerpc/mm/numa.c | 36

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-28 Thread Nishanth Aravamudan
On 27.09.2015 [23:59:08 +0530], Raghavendra K T wrote: > Problem description: > Powerpc has sparse node numbering, i.e. on a 4 node system nodes are > numbered (possibly) as 0,1,16,17. At a lower level, we map the chipid > got from device tree is naturally mapped (directly) to nid. chipid is a OPA

Re: [PATCH RFC 4/5] powerpc:numa Add helper functions to maintain chipid to nid mapping

2015-09-28 Thread Nishanth Aravamudan
On 27.09.2015 [23:59:12 +0530], Raghavendra K T wrote: > Create arrays that maps serial nids and sparse chipids. > > Note: My original idea had only two arrays of chipid to nid map. Final > code is inspired by driver/acpi/numa.c that maps a proximity node with > a logical node by Takayoshi Kochi ,

Re: [PATCH RFC 3/5] powerpc:numa create 1:1 mappaing between chipid and nid

2015-09-28 Thread Nishanth Aravamudan
On 27.09.2015 [23:59:11 +0530], Raghavendra K T wrote: > Once we have made the distinction between nid and chipid > create a 1:1 mapping between them. This makes compacting the > nids easy later. Didn't the previous patch just do the opposite of... > @@ -286,7 +308,7 @@ int of_node_to_nid(struct

Re: [PATCH RFC 2/5] powerpc:numa Rename functions referring to nid as chipid

2015-09-28 Thread Nishanth Aravamudan
On 27.09.2015 [23:59:10 +0530], Raghavendra K T wrote: > There is no change in the functionality > > Signed-off-by: Raghavendra K T > --- > arch/powerpc/mm/numa.c | 42 +- > 1 file changed, 21 insertions(+), 21 deletions(-) > > diff --git a/arch/powerpc/mm

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-28 Thread Nishanth Aravamudan
On 28.09.2015 [13:44:42 +0300], Denis Kirjanov wrote: > On 9/27/15, Raghavendra K T wrote: > > Problem description: > > Powerpc has sparse node numbering, i.e. on a 4 node system nodes are > > numbered (possibly) as 0,1,16,17. At a lower level, we map the chipid > > got from device tree is natural

Re: [RFC] powerpc/hugetlb: Add warning message when gpage allocation request fails

2015-09-14 Thread Nishanth Aravamudan
On 14.09.2015 [18:59:25 +0530], Aneesh Kumar K.V wrote: > Anshuman Khandual writes: > > > When a 16GB huge page is requested on POWER platform through kernel command > > line interface, it silently fails because of the lack of any gigantic pages > > on the system which the platform should have co

Re: [PATCH v2] powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-07 Thread Nishanth Aravamudan
On 07.09.2015 [19:19:09 +1000], Michael Ellerman wrote: > On Fri, 2015-09-04 at 11:22 -0700, Nishanth Aravamudan wrote: > > The 32-bit TCE table initialization relies on the DMA window having a > > size equal to a power of 2 (and checks for it explicitly). But > > crashkern
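The power-of-2 requirement quoted above can be sketched in a few lines of user-space C. This is only an illustration, not the pci-ioda code: the function name is hypothetical, and the check mirrors the common bit trick that the kernel's is_power_of_2() helper also uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* A non-zero value is a power of 2 exactly when it has a single bit set. */
static bool size_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

As a made-up example, a 2 GiB window passes this check, while 2 GiB minus a 768 MiB reservation (1280 MiB) does not, which is the general shape of the failure a non-power-of-2 size would trigger.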

[PATCH v2] powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-04 Thread Nishanth Aravamudan
controller) are successfully initialized. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan Cc: sta...@vger.kernel

Re: powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-04 Thread Nishanth Aravamudan
On 04.09.2015 [20:01:22 +0200], Jan Stancek wrote: > On Fri, Sep 04, 2015 at 09:59:38AM -0700, Nishanth Aravamudan wrote: > > The 32-bit TCE table initialization relies on the DMA window having a > > size equal to a power of 2 (and checks for it explicitly). But > > crashkern

[PATCH] powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=

2015-09-04 Thread Nishanth Aravamudan
controller) are successfully initialized. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan Cc: sta...@vger.kernel

Re: [PATCH v2] powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel

2015-09-03 Thread Nishanth Aravamudan
On 03.09.2015 [19:58:53 +1000], Michael Ellerman wrote: > On Wed, 2015-09-02 at 08:39 -0700, Nishanth Aravamudan wrote: > > On 02.09.2015 [19:00:31 +1000], Alexey Kardashevskiy wrote: > > > On 09/02/2015 11:11 AM, Nishanth Aravamudan wrote: > > > >diff --git a/arch

[PATCH v2] powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel

2015-09-02 Thread Nishanth Aravamudan
On 02.09.2015 [19:00:31 +1000], Alexey Kardashevskiy wrote: > On 09/02/2015 11:11 AM, Nishanth Aravamudan wrote: > >When attempting to kdump with the 4.2 kernel, we see for each PCI > >device: > > > > pci 0003:01 : [PE# 000] Assign DMA32 space > > pci 0003

[PATCH] powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel

2015-09-01 Thread Nishanth Aravamudan
nd kdump succeeds. The problem was seen on a Firestone machine originally. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powern

Re: [PATCH] openvswitch: make for_each_node loops work with sparse numa systems

2015-07-21 Thread Nishanth Aravamudan
On 21.07.2015 [11:30:58 -0500], Chris J Arges wrote: > On Tue, Jul 21, 2015 at 09:24:18AM -0700, Nishanth Aravamudan wrote: > > On 21.07.2015 [10:32:34 -0500], Chris J Arges wrote: > > > Some architectures like POWER can have a NUMA node_possible_map that > > > contains

Re: [PATCH v2] openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes

2015-07-21 Thread Nishanth Aravamudan
tch node_online_map on boot. > Fixes: 3af229f2071f5b5cb31664be6109561fbe19c861 > > Signed-off-by: Chris J Arges Acked-by: Nishanth Aravamudan > --- > net/openvswitch/flow_table.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/openvswitch/flow_tab

Re: [PATCH] openvswitch: make for_each_node loops work with sparse numa systems

2015-07-21 Thread Nishanth Aravamudan
On 21.07.2015 [10:32:34 -0500], Chris J Arges wrote: > Some architectures like POWER can have a NUMA node_possible_map that > contains sparse entries. This causes memory corruption with openvswitch > since it allocates flow_cache with a multiple of num_possible_nodes() and Couldn't this also be fi
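The sparse-node problem quoted above can be illustrated with a small user-space sketch. It is not the openvswitch code: the bitmap representation and both function names are assumptions for illustration, and the node ids reuse the 0,1,16,17 numbering quoted elsewhere in this listing.

```c
#include <stdint.h>

/* Bit n set means node id n is possible (ids 0, 1, 16 and 17 here). */
#define EXAMPLE_POSSIBLE_MAP \
	((UINT64_C(1) << 0) | (UINT64_C(1) << 1) | \
	 (UINT64_C(1) << 16) | (UINT64_C(1) << 17))

/* Number of set bits: what a num_possible_nodes()-style count reports. */
static unsigned int count_possible_nodes(uint64_t map)
{
	unsigned int count = 0;

	for (; map != 0; map &= map - 1)
		count++;
	return count;
}

/* Highest set bit + 1: the array size needed when indexing by node id. */
static unsigned int node_id_limit(uint64_t map)
{
	unsigned int limit = 0;

	while (map != 0) {
		limit++;
		map >>= 1;
	}
	return limit;
}
```

With this map the count is 4 but the largest node id is 17, so an array sized by the count and indexed by node id would be overrun; sizing by the id limit (the role nr_node_ids plays in the v2 patch title above) avoids that.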

Re: [RFC PATCH 1/2] powerpc/numa: fix cpu_to_node() usage during boot

2015-07-15 Thread Nishanth Aravamudan
On 15.07.2015 [16:35:16 -0400], Tejun Heo wrote: > Hello, > > On Thu, Jul 02, 2015 at 04:02:02PM -0700, Nishanth Aravamudan wrote: > > we currently emit at boot: > > > > [0.00] pcpu-alloc: [0] 0 1 2 3 [0] 4 5 6 7 > > > > After this commit, we

Re: [PATCH 5/6] [RFC] crypto/testmgr: add null test for 842 algorithm

2015-07-13 Thread Nishanth Aravamudan
On 13.07.2015 [17:05:36 -0700], Nishanth Aravamudan wrote: > On 04.07.2015 [15:24:53 +0800], Herbert Xu wrote: > > On Thu, Jul 02, 2015 at 03:41:19PM -0700, Nishanth Aravamudan wrote: > > > Currently, when the nx-842-pseries driver loads, the following message > > > i

Re: [PATCH 5/6] [RFC] crypto/testmgr: add null test for 842 algorithm

2015-07-13 Thread Nishanth Aravamudan
On 04.07.2015 [15:24:53 +0800], Herbert Xu wrote: > On Thu, Jul 02, 2015 at 03:41:19PM -0700, Nishanth Aravamudan wrote: > > Currently, when the nx-842-pseries driver loads, the following message > > is emitted: > > > > alg: No test for 842 (842-nx) > > > >

Re: [RFC PATCH 1/2] powerpc/numa: fix cpu_to_node() usage during boot

2015-07-10 Thread Nishanth Aravamudan
On 08.07.2015 [18:22:09 -0700], David Rientjes wrote: > On Thu, 2 Jul 2015, Nishanth Aravamudan wrote: > > > Much like on x86, now that powerpc is using USE_PERCPU_NUMA_NODE_ID, we > > have an ordering issue during boot with early calls to cpu_to_node(). > > The value ret

Re: [RFC,1/2] powerpc/numa: fix cpu_to_node() usage during boot

2015-07-10 Thread Nishanth Aravamudan
On 08.07.2015 [16:16:23 -0700], Nishanth Aravamudan wrote: > On 08.07.2015 [14:00:56 +1000], Michael Ellerman wrote: > > On Thu, 2015-02-07 at 23:02:02 UTC, Nishanth Aravamudan wrote: > > > Much like on x86, now that powerpc is using USE_PERCPU_NUMA_NODE_ID, we > > > ha

Re: [RFC,1/2] powerpc/numa: fix cpu_to_node() usage during boot

2015-07-08 Thread Nishanth Aravamudan
On 08.07.2015 [14:00:56 +1000], Michael Ellerman wrote: > On Thu, 2015-02-07 at 23:02:02 UTC, Nishanth Aravamudan wrote: > > Much like on x86, now that powerpc is using USE_PERCPU_NUMA_NODE_ID, we > > have an ordering issue during boot with early calls to cpu_to_node(). > >

Re: [PATCH 6/6] nx-842-platform: if NX842 platform drivers are not modules, don't try to load them

2015-07-06 Thread Nishanth Aravamudan
On 06.07.2015 [16:13:07 +0800], Herbert Xu wrote: > On Thu, Jul 02, 2015 at 03:42:26PM -0700, Nishanth Aravamudan wrote: > > Based off the CONFIG_SPU_FS_MODULE code, only attempt to load platform > > modules if the nx-842 pseries/powernv drivers are built as modules. >

[PATCH v2] crypto/nx-842-{powerpc,pseries}: reduce chattiness of platform drivers

2015-07-06 Thread Nishanth Aravamudan
On 03.07.2015 [11:30:32 +1000], Michael Ellerman wrote: > On Thu, 2015-07-02 at 15:40 -0700, Nishanth Aravamudan wrote: > > While we never would successfully load on the wrong machine type, there > > is extra output by default regardless of machine type. > > > > For

[RFC PATCH 2/2] powerpc/smp: use early_cpu_to_node() instead of direct references to numa_cpu_lookup_table

2015-07-02 Thread Nishanth Aravamudan
A simple move to a wrapper function to numa_cpu_lookup_table, now that power has the early_cpu_to_node() API. Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index ec9ec20..7bf333b 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc

[RFC PATCH 1/2] powerpc/numa: fix cpu_to_node() usage during boot

2015-07-02 Thread Nishanth Aravamudan
c: [0] 0 1 2 3 [1] 4 5 6 7 Signed-off-by: Nishanth Aravamudan diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h index 5f1048e..f2c4c89 100644 --- a/arch/powerpc/include/asm/topology.h +++ b/arch/powerpc/include/asm/topology.h @@ -39,6 +39,8 @@ static inlin

[PATCH 6/6] nx-842-platform: if NX842 platform drivers are not modules, don't try to load them

2015-07-02 Thread Nishanth Aravamudan
platform driver. Signed-off-by: Nishanth Aravamudan Cc: Dan Streetman Cc: Herbert Xu Cc: "David S. Miller" Cc: linux-cry...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org --- drivers/crypto/nx/nx-842-platform.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) di

[PATCH 5/6] [RFC] crypto/testmgr: add null test for 842 algorithm

2015-07-02 Thread Nishanth Aravamudan
Currently, when the nx-842-pseries driver loads, the following message is emitted: alg: No test for 842 (842-nx) It seems like the simplest way to fix this message (other than adding a proper test) is to just insert the null test into the list in the testmgr. Signed-off-by: Nishanth Aravamudan

[PATCH 4/6] crypto/nx-842-{powerpc,pseries}: only load on the appropriate machine type

2015-07-02 Thread Nishanth Aravamudan
never be found. Similar pseries messages are printed on powernv. Signed-off-by: Nishanth Aravamudan --- drivers/crypto/nx/nx-842-powernv.c | 6 ++ drivers/crypto/nx/nx-842-pseries.c | 6 ++ drivers/crypto/nx/nx-842.h | 1 + 3 files changed, 13 insertions(+) diff --git a/drivers

[PATCH 3/6] nx-842-pseries: do not emit extra output if status is disabled

2015-07-02 Thread Nishanth Aravamudan
en that 'disabled' is a valid state, and we are going to emit that the device is disabled, only print out a non-'okay' status if it is not 'disabled'. Signed-off-by: Nishanth Aravamudan --- drivers/crypto/nx/nx-842-pseries.c | 8 +++- 1 file changed, 7 in

[PATCH 2/6] nx-842-pseries: rename nx842_{init,exit} to nx842_pseries_{init,exit}

2015-07-02 Thread Nishanth Aravamudan
While there is no technical reason that both nx-842.c and nx-842-pseries.c cannot have the same name for the init/exit functions, it is a bit confusing with initcall_debug. Rename the pseries-specific functions appropriately. Signed-off-by: Nishanth Aravamudan --- drivers/crypto/nx/nx-842

[PATCH 1/6] crypto/nx-842-pseries: nx842_OF_upd_status should return ENODEV if device is not 'okay'

2015-07-02 Thread Nishanth Aravamudan
extra error in that case. It seems like the proper return code of a disabled device is ENODEV. Signed-off-by: Nishanth Aravamudan --- drivers/crypto/nx/nx-842-pseries.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/nx/nx-842-pseries.c b/drivers/crypto/nx/nx

[PATCH 0/6] drivers/nx-842: reduce verbosity of logging

2015-07-02 Thread Nishanth Aravamudan
Currently, on a LPAR with the nx-842 device disabled, the following messages are emitted: nx_compress: no nx842 driver found. [1] Registering IBM Power 842 compression driver nx_compress_pseries ibm,compression-v1: nx842_OF_upd_status: status 'disabled' is not 'okay' nx_compress_pseries ibm,compr

Re: powerpc,numa: Memory hotplug to memory-less nodes ?

2015-06-25 Thread Nishanth Aravamudan
On 24.06.2015 [07:13:36 -0500], Nathan Fontenot wrote: > On 06/23/2015 11:01 PM, Bharata B Rao wrote: > > So will it be correct to say that memory hotplug to memory-less node > > isn't supported by PowerPC kernel ? Should I enforce the same in QEMU > > for PowerKVM ? > > > > I'm not sure if that i

Re: [PATCH kernel] powerpc/powernv/ioda2: Add devices only from buses which belong to PE

2015-06-12 Thread Nishanth Aravamudan
On 12.06.2015 [16:47:03 +1000], Gavin Shan wrote: > On Fri, Jun 12, 2015 at 04:19:17PM +1000, Alexey Kardashevskiy wrote: > >The existing code puts all devices from a root PE to the same IOMMU group. > >However it is a possible situation when subordinate buses belong to > >separate PEs, in this cas

Re: [PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-05-08 Thread Nishanth Aravamudan
On 08.05.2015 [15:47:26 -0700], Andrew Morton wrote: > On Wed, 06 May 2015 11:28:12 +0200 Vlastimil Babka wrote: > > > On 05/06/2015 12:09 AM, Nishanth Aravamudan wrote: > > > On 03.04.2015 [10:45:56 -0700], Nishanth Aravamudan wrote: > > >>> What I find somew

Re: [PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-05-05 Thread Nishanth Aravamudan
On 03.04.2015 [10:45:56 -0700], Nishanth Aravamudan wrote: > On 03.04.2015 [09:57:35 +0200], Vlastimil Babka wrote: > > On 03/31/2015 11:48 AM, Michal Hocko wrote: > > >On Fri 27-03-15 15:23:50, Nishanth Aravamudan wrote: > > >>On 27.03.2015 [13:17:59 -0700], Dave

Re: Topology updates and NUMA-level sched domains

2015-04-10 Thread Nishanth Aravamudan
On 10.04.2015 [10:31:53 +0200], Peter Zijlstra wrote: > On Thu, Apr 09, 2015 at 03:29:56PM -0700, Nishanth Aravamudan wrote: > > > No, that's very much not the same. Even if it were dealing with hotplug > > > it would still assume the cpu to return to the same node. >

Re: Topology updates and NUMA-level sched domains

2015-04-10 Thread Nishanth Aravamudan
On 10.04.2015 [11:08:10 +0200], Peter Zijlstra wrote: > On Fri, Apr 10, 2015 at 10:31:53AM +0200, Peter Zijlstra wrote: > > Please, step back, look at what you're doing and ask yourself, will any > > sane person want to use this? Can they use this? > > > > If so, start by describing the desired us

Re: [PATCH] of: return NUMA_NO_NODE from fallback of_node_to_nid()

2015-04-10 Thread Nishanth Aravamudan
On 10.04.2015 [14:37:19 +0300], Konstantin Khlebnikov wrote: > On 10.04.2015 01:58, Nishanth Aravamudan wrote: > >On 09.04.2015 [07:27:28 +0300], Konstantin Khlebnikov wrote: > >>On Thu, Apr 9, 2015 at 2:07 AM, Nishanth Aravamudan > >> wrote: > >>>On

Re: [PATCH] of: return NUMA_NO_NODE from fallback of_node_to_nid()

2015-04-08 Thread Nishanth Aravamudan
On 08.04.2015 [20:04:04 +0300], Konstantin Khlebnikov wrote: > On 08.04.2015 19:59, Konstantin Khlebnikov wrote: > >Node 0 might be offline as well as any other numa node, > >in this case kernel cannot handle memory allocation and crashes. Isn't the bug that numa_node_id() returned an offline node

Re: Topology updates and NUMA-level sched domains

2015-04-07 Thread Nishanth Aravamudan
On 07.04.2015 [12:21:47 +0200], Peter Zijlstra wrote: > On Mon, Apr 06, 2015 at 02:45:58PM -0700, Nishanth Aravamudan wrote: > > Hi Peter, > > > > As you are very aware, I think, power has some odd NUMA topologies (and > > changes to those topologies) at run-time

Topology updates and NUMA-level sched domains

2015-04-06 Thread Nishanth Aravamudan
Hi Peter, As you are very aware, I think, power has some odd NUMA topologies (and changes to those topologies) at run-time. In particular, we can see a topology at boot: Node 0: all cpus Node 7: no cpus Then we get a notification from the hypervisor that a core (or two) have moved from node

Re: [PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-04-03 Thread Nishanth Aravamudan
On 03.04.2015 [20:24:45 +0200], Michal Hocko wrote: > On Fri 03-04-15 10:43:57, Nishanth Aravamudan wrote: > > On 31.03.2015 [11:48:29 +0200], Michal Hocko wrote: > [...] > > > I would expect kswapd would be looping endlessly because the zone > > > wouldn't be

Re: [PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-04-03 Thread Nishanth Aravamudan
On 03.04.2015 [09:57:35 +0200], Vlastimil Babka wrote: > On 03/31/2015 11:48 AM, Michal Hocko wrote: > >On Fri 27-03-15 15:23:50, Nishanth Aravamudan wrote: > >>On 27.03.2015 [13:17:59 -0700], Dave Hansen wrote: > >>>On 03/27/2015 12:28 PM, Nishanth Aravamudan w

Re: [PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-04-03 Thread Nishanth Aravamudan
On 31.03.2015 [11:48:29 +0200], Michal Hocko wrote: > On Fri 27-03-15 15:23:50, Nishanth Aravamudan wrote: > > On 27.03.2015 [13:17:59 -0700], Dave Hansen wrote: > > > On 03/27/2015 12:28 PM, Nishanth Aravamudan wrote: > > > > @@ -2585,7 +2585,7 @@ static bool pfm

[PATCH v2] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable pages

2015-03-27 Thread Nishanth Aravamudan
On 27.03.2015 [13:17:59 -0700], Dave Hansen wrote: > On 03/27/2015 12:28 PM, Nishanth Aravamudan wrote: > > @@ -2585,7 +2585,7 @@ static bool pfmemalloc_watermark_ok(pg_data_t *pgdat) > > > > for (i = 0; i <= ZONE_NORMAL; i++) { > >

Re: [PATCH] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable zones

2015-03-27 Thread Nishanth Aravamudan
[ Sorry, typo'd anton's address ] On 27.03.2015 [12:28:50 -0700], Nishanth Aravamudan wrote: > Based upon 675becce15 ("mm: vmscan: do not throttle based on pfmemalloc > reserves if node has no ZONE_NORMAL") from Mel. > > We have a system with the following t

[PATCH] mm: vmscan: do not throttle based on pfmemalloc reserves if node has no reclaimable zones

2015-03-27 Thread Nishanth Aravamudan
ge, the afore-mentioned 16M hugepage allocation succeeds and correctly round-robins between Nodes 1 and 3. Signed-off-by: Nishanth Aravamudan diff --git a/mm/vmscan.c b/mm/vmscan.c index dcd90c8..033c2b7 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2585,7 +2585,7 @@ static bool pfmemalloc

Re: new decimal conversion - seeking testers

2015-03-12 Thread Nishanth Aravamudan
On 13.03.2015 [00:09:19 +0100], Rasmus Villemoes wrote: > Hi, > > I've proposed a new implementation of decimal conversion for > lib/vsprintf.c; see > . > Benchmarking so far shows 25-50% (depending on distribution of input > number

[PATCH v3] powerpc/numa: set node_possible_map to only node_online_map during boot

2015-03-10 Thread Nishanth Aravamudan
On 10.03.2015 [10:55:05 +1100], Michael Ellerman wrote: > On Thu, 2015-03-05 at 21:27 -0800, Nishanth Aravamudan wrote: > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > > index 0257a7d659ef..0c1716cd271f 100644 > > --- a/arch/powerpc/mm/numa.c > > +

[PATCH v2] powerpc/numa: set node_possible_map to only node_online_map during boot

2015-03-05 Thread Nishanth Aravamudan
On 05.03.2015 [15:29:00 -0800], David Rientjes wrote: > On Thu, 5 Mar 2015, Nishanth Aravamudan wrote: > > > So if we compare to x86: > > > > arch/x86/mm/numa.c::numa_init(): > > > > nodes_clear(numa_nodes_parsed); > > nodes_clear(n

Re: [RFC PATCH] powerpc/numa: reset node_possible_map to only node_online_map

2015-03-05 Thread Nishanth Aravamudan
On 05.03.2015 [17:13:08 -0500], Tejun Heo wrote: > On Thu, Mar 05, 2015 at 10:05:49AM -0800, Nishanth Aravamudan wrote: > > While looking at this, I noticed that nr_node_ids is actually a > > misnomer, it seems. It's not the number, but the maximum_node_id, as > > with sp

Re: [RFC PATCH] powerpc/numa: reset node_possible_map to only node_online_map

2015-03-05 Thread Nishanth Aravamudan
On 05.03.2015 [17:08:04 -0500], Tejun Heo wrote: > Hello, > > On Thu, Mar 05, 2015 at 01:58:27PM -0800, David Rientjes wrote: > > I'm not sure why this is being proposed as a powerpc patch and now a patch > > for mem_cgroup_css_alloc(). In other words, why do we have to allocate > > for all pos

Re: [RFC PATCH] powerpc/numa: reset node_possible_map to only node_online_map

2015-03-05 Thread Nishanth Aravamudan
On 05.03.2015 [13:58:27 -0800], David Rientjes wrote: > On Fri, 6 Mar 2015, Michael Ellerman wrote: > > > > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > > > > index 0257a7d659ef..24de29b3651b 100644 > > > > --- a/arch/powerpc/mm/numa.c > > > > +++ b/arch/powerpc/mm/numa.c > > >

Re: [RFC PATCH] powerpc/numa: reset node_possible_map to only node_online_map

2015-03-05 Thread Nishanth Aravamudan
On 06.03.2015 [08:48:52 +1100], Michael Ellerman wrote: > On Thu, 2015-03-05 at 13:16 -0800, David Rientjes wrote: > > On Thu, 5 Mar 2015, Nishanth Aravamudan wrote: > > > > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > > > index 0257a7d659ef
