Re: [PATCH v11 2/4] perf,kvm/{x86,s390}: Remove const from kvm_events_tp
On 1/27/16 11:33 PM, Hemant Kumar wrote:

This patch removes the "const" qualifier from kvm_events_tp declaration
to account for the fact that some architectures may need to update this
variable dynamically. For instance, powerpc will need to update this
variable dynamically depending on the machine type.

Signed-off-by: Hemant Kumar
---

Acked-by: David Ahern

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v11 1/4] perf,kvm/{x86,s390}: Remove dependency on uapi/kvm_perf.h
On 1/27/16 11:33 PM, Hemant Kumar wrote:

It's better to remove the dependency on uapi/kvm_perf.h to allow dynamic
discovery of kvm events (if it's needed). To do this, some extern variables
have been introduced with which we can keep the generic functions generic.

Signed-off-by: Hemant Kumar
Acked-by: Alexander Yarygin

Acked-by: David Ahern
Re: [PATCH v9 3/6] arm64/arm, numa, dt: adding numa dt binding implementation for arm64 platforms.
On Thu, Jan 28, 2016 at 11:38 PM, Will Deacon wrote:
> On Thu, Jan 28, 2016 at 10:42:17PM +0530, Ganapatrao Kulkarni wrote:
>> On Thu, Jan 28, 2016 at 8:09 PM, Will Deacon wrote:
>> > On Tue, Jan 26, 2016 at 02:36:04PM -0600, Bjorn Helgaas wrote:
>> >> Subject is "arm64/arm, numa, dt: adding ..." What is the significance
>> >> of the "arm" part? The other patches only mention "arm64".
>> >>
>> >> General comment: the code below has little, if anything, that is
>> >> actually arm64-specific. Maybe this is the first DT-based NUMA
>> >> platform? I don't see other similar code for other arches, so maybe
>> >> it's too early to try to generalize it, but we should try to avoid
>> >> adding duplicates of this code if/when other arches do show up.
>> >
>> > Having it in the core code would allow us to share it with arch/arm/
>> > fairly straightforwardly.
>> This binding can be used for arm too.
>> however at this moment it is the need of arm64 platforms.
>> can we please keep this to arm64 as it's too early to try to
>> generalize it(as Bjorn suggested)
>> I prefer to keep it as it is, otherwise ok.
>> Please suggest.
>
> My suggestions time and time again on the NUMA patches from you have
> consistently been around consolidation of existing code, or moving things
> that aren't architecture-specific out of the architecture code.

thanks, i shall move this out to drivers/of

>
> Will

thanks
Ganapat
Re: [RFC PATCH v3 1/5] PCI: Add support for enforcing all MMIO BARs to be page aligned
On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote: > When vfio passthrough a PCI device of which MMIO BARs > are smaller than PAGE_SIZE, guest will not handle the > mmio accesses to the BARs which leads to mmio emulations > in host. > > This is because vfio will not allow to passthrough one > BAR's mmio page which may be shared with other BARs. > > To solve this performance issue, this patch adds a kernel > parameter "pci=resource_page_aligned=on" to enforce > the alignment of all MMIO BARs to be at least PAGE_SIZE, > so that one BAR's mmio page would not be shared with other > BARs. We can also disable it through kernel parameter > "pci=resource_page_aligned=off". > > For the default value of the parameter, we think it should be > arch-independent, so we add a macro > HAVE_PCI_DEFAULT_RESOURCES_PAGE_ALIGNED to change it. And we > define this macro to enable this parameter by default on PPC64 > platform which can easily hit this performance issue because > its PAGE_SIZE is 64KB. > > Note that the kernel parameter won't works if kernel doesn't do > resources reallocation. And where do you account for this so that we know whether it's really in effect? > Signed-off-by: Yongji Xie> --- > Documentation/kernel-parameters.txt |5 + > arch/powerpc/include/asm/pci.h | 11 +++ > drivers/pci/pci.c | 35 > +++ > drivers/pci/pci.h |8 +++- > include/linux/pci.h |4 > 5 files changed, 62 insertions(+), 1 deletion(-) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 742f69d..3f2a7c9 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -2857,6 +2857,11 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > PAGE_SIZE is used as alignment. > PCI-PCI bridge can be specified, if resource > windows need to be expanded. 
> + resource_page_aligned= Enable/disable enforcing the alignment > + of all PCI devices' memory resources to be > + at least PAGE_SIZE if resources reallocation > + is done by kernel. > + Format: { "on" | "off" } > ecrc= Enable/disable PCIe ECRC (transaction layer > end-to-end CRC checking). > bios: Use BIOS/firmware settings. This is the > diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h > index 3453bd8..2d2b3ef 100644 > --- a/arch/powerpc/include/asm/pci.h > +++ b/arch/powerpc/include/asm/pci.h > @@ -136,6 +136,17 @@ extern pgprot_t pci_phys_mem_access_prot(struct file > *file, > unsigned long pfn, > unsigned long size, > pgprot_t prot); > +#ifdef CONFIG_PPC64 > + > +/* For PPC64, We enforce all PCI MMIO BARs to be page aligned > + * by default. This would be helpful to improve performance > + * when we passthrough a PCI device of which BARs are smaller > + * than PAGE_SIZE(64KB). And we can use kernel parameter > + * "pci=resource_page_aligned=off" to disable it. > + */ > +#define HAVE_PCI_DEFAULT_RESOURCES_PAGE_ALIGNED 1 > + > +#endif > > #define HAVE_ARCH_PCI_RESOURCE_TO_USER > extern void pci_resource_to_user(const struct pci_dev *dev, int bar, > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 314db8c..7b21238 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -99,6 +99,9 @@ u8 pci_cache_line_size; > */ > unsigned int pcibios_max_latency = 255; > > +bool pci_resources_page_aligned = > + IS_ENABLED(HAVE_PCI_DEFAULT_RESOURCES_PAGE_ALIGNED); I don't think this is proper use of IS_ENABLED, which seems to be targeted at CONFIG_ type options. You could define this as that in an arch Kconfig. > + > /* If set, the PCIe ARI capability will not be used. 
*/ > static bool pcie_ari_disabled; > > @@ -4746,6 +4749,35 @@ static ssize_t pci_resource_alignment_store(struct > bus_type *bus, > BUS_ATTR(resource_alignment, 0644, pci_resource_alignment_show, > pci_resource_alignment_store); > > +static void pci_resources_get_page_aligned(char *str) > +{ > + if (!strncmp(str, "off", 3)) > + pci_resources_page_aligned = false; > + else if (!strncmp(str, "on", 2)) > + pci_resources_page_aligned = true; > +} "get"? > + > +/* > + * This function checks whether PCI BARs' mmio page will be shared > + * with other BARs. > + */ > +bool pci_resources_share_page(struct pci_dev *dev, int resno) > +{ > + struct resource *res = dev->resource + resno; > + > + if (resource_size(res) >=
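For readers following the thread, here is a rough userspace sketch of the two ideas in the quoted patch: parsing an "on"/"off" value, and deciding whether a BAR might share a page with a neighbour. This is not the patch's (truncated) pci_resources_share_page(); struct bar_range, parse_page_aligned() and bar_may_share_page() are hypothetical names made up for illustration, and the parser uses strcmp() rather than the patch's strncmp() prefix match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct resource (inclusive end). */
struct bar_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Parse the value of a "resource_page_aligned=" style option.
 * Unknown values leave the current setting untouched.
 */
static bool parse_page_aligned(const char *str, bool current)
{
	assert(str != NULL);
	if (strcmp(str, "off") == 0)
		return false;
	if (strcmp(str, "on") == 0)
		return true;
	return current;
}

/*
 * True if the BAR does not fully own every page it touches, i.e. it is
 * smaller than a page, or its start or size is not page aligned.
 */
static bool bar_may_share_page(const struct bar_range *bar, uint64_t page_size)
{
	uint64_t size = bar->end - bar->start + 1;

	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (size < page_size)
		return true;
	return (bar->start & (page_size - 1)) != 0 ||
	       (size & (page_size - 1)) != 0;
}
```

With a 64KB PAGE_SIZE, a 4KB BAR is reported as possibly sharing its page, which is the PPC64 case the commit message describes; with 4KB pages the same BAR owns its page outright.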
[no subject]
--- Begin Message ---
On Thu, 28 Jan 2016, Christian Borntraeger wrote:

> Indeed, I only touched the identity mapping and dump stack.
> The question is do we really want to change free_init_pages as well?
> The unmapping during runtime causes significant overhead, but the
> unmapping after init imposes almost no runtime overhead. Of course,
> things get fishy now as what is enabled and what not.
>
> Kconfig after my patch "mm/debug_pagealloc: Ask users for default setting of
> debug_pagealloc"
> (in mm) now states
> snip
> By default this option will have a small overhead, e.g. by not
> allowing the kernel mapping to be backed by large pages on some
> architectures. Even bigger overhead comes when the debugging is
> enabled by DEBUG_PAGEALLOC_ENABLE_DEFAULT or the debug_pagealloc
> command line parameter.
> snip
>
> So I am tempted to NOT change free_init_pages, but the x86 maintainers
> can certainly decide differently. Ingo, Thomas, H. Peter, please advise.
>

I'm sorry, but I thought the discussion of the previous version of the
patchset led to deciding that all CONFIG_DEBUG_PAGEALLOC behavior would be
controlled by being enabled on the commandline and checked with
debug_pagealloc_enabled().

I don't think we should have a CONFIG_DEBUG_PAGEALLOC that does some stuff
and then a commandline parameter or CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT
to enable more stuff. It should either be all enabled by the commandline
(or config option) or split into a separate entity.
CONFIG_DEBUG_PAGEALLOC_LIGHT and CONFIG_DEBUG_PAGEALLOC would be fine, but
the current state is very confusing about what is being done and what
isn't.

It also wouldn't hurt to enumerate what is enabled and what isn't enabled
in the Kconfig entry.
--- End Message ---
Re: [RFC PATCH v3 3/5] PCI: Add host bridge attribute to indicate filtering of MSIs is supported
On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote:
> MSI-X tables are not allowed to be mmapped in vfio-pci
> driver in case that user get to touch this directly.
> This will cause some performance issues when PCI
> adapters have critical registers in the same page as
> the MSI-X table.
>
> However, some kind of PCI host bridge such as IODA bridge
> on Power support filtering of MSIs, which can ensure that a
> given pci device can only shoot the MSIs assigned for it.
> So we think it's safe to expose the MSI-X table to userspace
> if filtering of MSIs is supported because the exposed MSI-X
> table can't be used to do harm to other memory space.
>
> To support this case, this patch adds a pci_host_bridge
> attribute to indicate if this PCI host bridge supports
> filtering of MSIs.
>
> Signed-off-by: Yongji Xie
> ---
>  drivers/pci/host-bridge.c |6 ++
>  include/linux/pci.h |3 +++
>  2 files changed, 9 insertions(+)
>
> diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c
> index 5f4a2e0..c029267 100644
> --- a/drivers/pci/host-bridge.c
> +++ b/drivers/pci/host-bridge.c
> @@ -96,3 +96,9 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res,
>  	res->end = region->end + offset;
>  }
>  EXPORT_SYMBOL(pcibios_bus_to_resource);
> +
> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev)
> +{
> +	return pci_find_host_bridge(pdev->bus)->msi_filtered;
> +}
> +EXPORT_SYMBOL_GPL(pci_host_bridge_msi_filtered_enabled);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b640d65..b952b78 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -412,6 +412,7 @@ struct pci_host_bridge {
>  	void (*release_fn)(struct pci_host_bridge *);
>  	void *release_data;
>  	unsigned int ignore_reset_delay:1;	/* for entire hierarchy */
> +	unsigned int msi_filtered:1;	/* support filtering of MSIs */
>  	/* Resource alignment requirements */
>  	resource_size_t (*align_resource)(struct pci_dev *dev,
>  			const struct resource *res,
> @@ -430,6 +431,8 @@ void pci_set_host_bridge_release(struct pci_host_bridge *bridge,
>
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge);
>
> +bool pci_host_bridge_msi_filtered_enabled(struct pci_dev *pdev);
> +
>  /*
>   * The first PCI_BRIDGE_RESOURCE_NUM PCI bus resources (those that correspond
>   * to P2P or CardBus bridge windows) go in a table. Additional ones (for

Don't we already have a flag for this in the IOMMU space?

enum iommu_cap {
	IOMMU_CAP_CACHE_COHERENCY,	/* IOMMU can enforce cache coherent DMA
					   transactions */
--->	IOMMU_CAP_INTR_REMAP,		/* IOMMU supports interrupt isolation */
	IOMMU_CAP_NOEXEC,		/* IOMMU_NOEXEC flag */
};
Re: [RFC PATCH v3 5/5] vfio-pci: Allow to mmap MSI-X table if host bridge supports filtering of MSIs
On Fri, 2016-01-15 at 15:06 +0800, Yongji Xie wrote:
> Current vfio-pci implementation disallows to mmap MSI-X
> table in case that user get to touch this directly.
>
> But we should allow to mmap these MSI-X tables if the PCI
> host bridge supports filtering of MSIs.
>
> Signed-off-by: Yongji Xie
> ---
>  drivers/vfio/pci/vfio_pci.c |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 11fd0f0..4d68f6a 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -555,7 +555,8 @@ static long vfio_pci_ioctl(void *device_data,
>  			    IORESOURCE_MEM && !pci_resources_share_page(pdev,
>  			    info.index)) {
>  				info.flags |= VFIO_REGION_INFO_FLAG_MMAP;
> -				if (info.index == vdev->msix_bar) {
> +				if (!pci_host_bridge_msi_filtered_enabled(pdev) &&
> +				    info.index == vdev->msix_bar) {
>  					ret = msix_sparse_mmap_cap(vdev, );
>  					if (ret)
>  						return ret;
> @@ -967,7 +968,8 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  	if (phys_len < PAGE_SIZE || req_start + req_len > phys_len)
>  		return -EINVAL;
>
> -	if (index == vdev->msix_bar) {
> +	if (!pci_host_bridge_msi_filtered_enabled(pdev) &&
> +	    index == vdev->msix_bar) {
>  		/*
>  		 * Disallow mmaps overlapping the MSI-X table; users don't
>  		 * get to touch this directly. We could find somewhere

What about read()/write() access, why would we allow mmap() but not
those?
Re: [PATCH] documentation: Add disclaimer
Paul E. McKenney wrote:

> Good point! Would you be willing to add a Signed-off-by so I
> can take the combined change, assuming Peter and Will are good
> with it?

Sure!

David
[PATCH 29/31] Add debugger entry points for POWERPC
This patch series adds an export which can be set by system debuggers to
direct the hard lockup and soft lockup detector to trigger a breakpoint
exception and enter a debugger if one is active. It is assumed that if
someone sets this variable, then a breakpoint handler of some sort will
be actively loaded or registered via the notify die handler chain.

This addition is extremely useful for debugging hard and soft lockups
real time and quickly from a console debugger.

Signed-off-by: Jeffrey Merkey
---
 arch/powerpc/include/asm/kdebug.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/kdebug.h b/arch/powerpc/include/asm/kdebug.h
index ae6d206..54f5ca8 100644
--- a/arch/powerpc/include/asm/kdebug.h
+++ b/arch/powerpc/include/asm/kdebug.h
@@ -11,5 +11,10 @@ enum die_val {
 	DIE_SSTEP,
 };

+static inline void arch_breakpoint(void)
+{
+	asm(".long 0x7d821008"); /* twge r2, r2 */
+}
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_KDEBUG_H */
--
1.8.3.1
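As a sanity check on the hard-coded word, the comment "twge r2, r2" can be confirmed by splitting 0x7d821008 into the fields of the Power ISA X-form trap word instruction. The sketch below is a standalone userspace decoder; struct tw_fields and decode_tw() are names invented for this illustration.

```c
#include <assert.h>   /* only needed when adding checks around decode_tw() */
#include <stdint.h>

/* Fields of a Power ISA X-form "tw TO,RA,RB" instruction word. */
struct tw_fields {
	unsigned opcode; /* primary opcode, 31 for tw */
	unsigned to;     /* trap condition mask */
	unsigned ra;     /* first register operand */
	unsigned rb;     /* second register operand */
	unsigned xo;     /* extended opcode, 4 for tw */
};

/* Split a 32-bit instruction word into its tw fields. */
static struct tw_fields decode_tw(uint32_t insn)
{
	struct tw_fields f;

	f.opcode = (insn >> 26) & 0x3f;
	f.to     = (insn >> 21) & 0x1f;
	f.ra     = (insn >> 16) & 0x1f;
	f.rb     = (insn >> 11) & 0x1f;
	f.xo     = (insn >> 1) & 0x3ff;
	return f;
}
```

Decoding 0x7d821008 gives opcode 31, TO 12, RA 2, RB 2 and extended opcode 4. TO 12 selects "greater than or equal", and since r2 always equals itself the trap fires unconditionally.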
Re: [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
>
> Reported-by: Anton Blanchard
> Signed-off-by: Shilpasri G Bhat
> ---
> Changes from v6:
> - Minor changes to move the code 'cpumask_copy()' after 'core_to_chip_map'
>   is allocated.
> - Move 'kfree(chips)' to a separate patch.

See, you weren't that bad :)

Just that you missed saying that individual patches contain
version-log in cover-letter :)

Acked-by: Viresh Kumar

--
viresh
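The caching idea in the commit message can be sketched in userspace roughly as follows. Everything here is an assumption for illustration: slow_cpu_to_chip_id() stands in for the real device-tree walk, the topology (8 threads per core, 64 CPUs per chip) is made up, and only the name core_to_chip_map is borrowed from the patch description.

```c
#include <assert.h>
#include <stdlib.h>

#define THREADS_PER_CORE 8    /* assumption for this sketch */

static int slow_lookups;      /* counts calls to the slow path */

/* Hypothetical stand-in for cpu_to_chip_id(): pretend it is expensive. */
static int slow_cpu_to_chip_id(int cpu)
{
	slow_lookups++;
	return cpu / 64;          /* made-up topology: 64 CPUs per chip */
}

static int *core_to_chip_map;

/* Fill the cache once, one entry per core. Returns 0 on success. */
static int init_chip_map(int nr_cpus)
{
	int nr_cores, core;

	assert(nr_cpus > 0 && nr_cpus % THREADS_PER_CORE == 0);
	nr_cores = nr_cpus / THREADS_PER_CORE;

	core_to_chip_map = calloc(nr_cores, sizeof(*core_to_chip_map));
	if (!core_to_chip_map)
		return -1;

	for (core = 0; core < nr_cores; core++)
		core_to_chip_map[core] =
			slow_cpu_to_chip_id(core * THREADS_PER_CORE);
	return 0;
}

/* Hot-path lookup: a plain array read, no tree walk and no lock. */
static int cached_chip_id(int cpu)
{
	return core_to_chip_map[cpu / THREADS_PER_CORE];
}
```

The slow lookup runs once per core at init time; later queries for any thread of that core cost a single array read.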
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
Hi Shilpa,

A minor nit.

On Thu, Jan 28, 2016 at 12:55:41PM +0530, Shilpasri G Bhat wrote:
[..snip..]
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_reasons/
> +Date:		Jan 2016
> +Contact:	Linux kernel mailing list
> +		Linux for PowerPC mailing list
> +Description:	CPU Frequency throttle reason stat for the chip
> +
> +		This directory contains throttle reason files. Each file gives
> +		the total number of times the max frequency is throttled, except
> +		for 'unthrottle_count', which gives the total number of times
> +		the max frequency is unthrottled after being throttled. Below
> +		are the reason attributes.
> +
> +		cpu_over_temperature: Throttled due to cpu over temperature
> +
> +		occ_reset: Throttled due to reset of OCC
> +
> +		over_current: Throttled due to over current

Overcurrent is a single word. No need of the extra space.

You could fix that and add my Reviewed-by.

--
Thanks and Regards
gautham.
[PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
HugeTLB page at the PMD level. It enables the kernel configuration option
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
hugepage_migration_supported() during migration. After the recent changes
to the PTE format, HugeTLB page migration happens successfully.

Signed-off-by: Anshuman Khandual
---
 arch/powerpc/Kconfig | 4
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e4824fd..65d52a0 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
 config ARCH_HAS_DMA_SET_COHERENT_MASK
 	bool

+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	def_bool y
+	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
+
 config PPC
 	bool
 	default y
--
2.1.0
Re: [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit
On Thu, Jan 28, 2016 at 12:55:36PM +0530, Shilpasri G Bhat wrote:
> This will free the dynamically allocated memory of'chips' on
> module exit.
>
> Signed-off-by: Shilpasri G Bhat

Reviewed-by: Gautham R. Shenoy

--
Thanks and Regards
gautham.
Re: [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver
On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some machines may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints. And it also fixes couple of bugs reported in the driver.
>
> - Patch [1] fixes a memory leak bug
> - Patch [2] fixes the cpu hot-plug bug in powernv_cpufreq_work_fn().
> - Patch [3] solves a bug in powernv_cpufreq_throttle_check(), which
>   calls in to cpu_to_chip_id() in hot path which reads DT every time
>   to find the chip id.
> - Patches [4] to [6] will add a perf trace point
>   "power:powernv_throttle" and sysfs throttle counter stats in
>   /sys/devices/system/cpu/cpufreq/chipN.
>
> Changes from v6:
> - Changes wrt comments from Balbir Singh and Viresh Kumar.

Who cares about these names in version-log ?? You have completely
missed what should have been present here. This is version log and
that's what should be present here :)

And because of that, I have to
- search for your earlier version in my mailbox
- Read all my comments
- Haven't read what Balbir have said

See ..

--
viresh
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index b683e8e..dea4620 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>  		- WriteBack: data is written only to the cache line and
>  			     the modified cache line is written to main
>  			     memory only when it is replaced
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats

What about the chip directory? Shouldn't that be documented? And
shouldn't that mention that this is just for powerpc?

And before that, I don't think that you are doing this properly. I am
sorry that I never came to a point where I could review it, and you
continued with it, version after version.

But, I really have strong objections to the way this is done. And you
are making things more complex than they are.

So, these stats are per-policy, right?

Then why aren't they added on the policy->kobj instead, just like
cpufreq-stats? And maybe inside cpufreq-stats folder only?

That will solve many complexities you have in place here and will look
sane as well.

Right now, you have stats at two places, cpu/cpufreq/chip/ and
cpu/cpuX/cpufreq/stats/, which doesn't look wise and adds to
confusion.

What do you say?

--
viresh
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
On 28-01-16, 15:06, Shilpasri G Bhat wrote:
> No these stats are not per-policy. They are per-chip. The throttle event is
> common for all cores in the chip.

How do you define a chip? And how is it different than the group of
CPUs represented by the policy?

--
viresh
[PATCH 2/2] selftests/powerpc: Add memory page migration tests
This adds two tests for memory page migration. One for normal page migration which works for both 4K or 64K base page size kernel and the other one is for 16MB huge page migration which will work both 4K or 64K base page sized 16MB huge pages as and when we support huge page migration. Signed-off-by: Anshuman Khandual--- Changes in V3: - Minor changes to the code for considering skipped pages - Enabled HugeTLB test in the script as it works now Changes in V2: - Changed the script to accommodate review comments from Michael - Disabled huge page migration test till it is supported on POWER Sample test result == Test HugeTLB vs THP test: hugetlb_vs_thp tags: git_version:v4.5-rc1-30-gda30491 success: hugetlb_vs_thp [PASS] . Test subpage protection . test: subpage_prot_anon tags: git_version:v4.5-rc1-30-gda30491 allocated malloc block of 0x400 bytes at 0x0x3fff8072 testing malloc block... success: subpage_prot_anon test: subpage_prot_file tags: git_version:v4.5-rc1-30-gda30491 allocated tempfile for 0x1 bytes at 0x0x3fff8472 testing file map... success: subpage_prot_file [PASS] ... Test normal page migration ... test: page_migration tags: git_version:v4.5-rc1-30-gda30491 Running on base page size 64K 64 moved 0 skipped 0 failed 1024 moved 0 skipped 0 failed 3328 moved 768 skipped 0 failed 4352 moved 3840 skipped 0 failed 8448 moved 7936 skipped 0 failed 16640 moved 16128 skipped 0 failed success: page_migration [PASS] . Test huge page migration . 
test: hugepage_migration tags: git_version:v4.5-rc1-30-gda30491 Running on base page size 64K 1 moved 0 skipped 0 failed 16 moved 0 skipped 0 failed 32 moved 0 skipped 0 failed success: hugepage_migration [PASS] tools/testing/selftests/powerpc/mm/Makefile| 14 +- .../selftests/powerpc/mm/hugepage-migration.c | 30 +++ tools/testing/selftests/powerpc/mm/migration.h | 204 + .../testing/selftests/powerpc/mm/page-migration.c | 33 tools/testing/selftests/powerpc/mm/run_mmtests | 104 +++ 5 files changed, 380 insertions(+), 5 deletions(-) create mode 100644 tools/testing/selftests/powerpc/mm/hugepage-migration.c create mode 100644 tools/testing/selftests/powerpc/mm/migration.h create mode 100644 tools/testing/selftests/powerpc/mm/page-migration.c create mode 100755 tools/testing/selftests/powerpc/mm/run_mmtests diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile index ee179e2..c482614 100644 --- a/tools/testing/selftests/powerpc/mm/Makefile +++ b/tools/testing/selftests/powerpc/mm/Makefile @@ -1,12 +1,16 @@ noarg: $(MAKE) -C ../ -TEST_PROGS := hugetlb_vs_thp_test subpage_prot -TEST_FILES := tempfile +TEST_PROGS := run_mmtests +TEST_FILES := hugetlb_vs_thp_test +TEST_FILES += subpage_prot +TEST_FILES += tempfile +TEST_FILES += hugepage-migration +TEST_FILES += page-migration -all: $(TEST_PROGS) $(TEST_FILES) +all: $(TEST_FILES) -$(TEST_PROGS): ../harness.c +$(TEST_FILES): ../harness.c include ../../lib.mk @@ -14,4 +18,4 @@ tempfile: dd if=/dev/zero of=tempfile bs=64k count=1 clean: - rm -f $(TEST_PROGS) tempfile + rm -f $(TEST_FILES) diff --git a/tools/testing/selftests/powerpc/mm/hugepage-migration.c b/tools/testing/selftests/powerpc/mm/hugepage-migration.c new file mode 100644 index 000..b60bc10 --- /dev/null +++ b/tools/testing/selftests/powerpc/mm/hugepage-migration.c @@ -0,0 +1,30 @@ +/* + * Copyright (C) 2015, Anshuman Khandual, IBM Corporation. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + */ +#include "migration.h" + +static int hugepage_migration(void) +{ + int ret = 0; + + if ((unsigned long)getpagesize() == 0x1000) + printf("Running on base page size 4K\n"); + + if ((unsigned long)getpagesize() == 0x1) + printf("Running on base page size 64K\n"); + + ret = test_huge_migration(16 * MEM_MB); + ret = test_huge_migration(256 * MEM_MB); + ret = test_huge_migration(512 * MEM_MB); + + return ret; +} + +int main(void) +{ + return test_harness(hugepage_migration, "hugepage_migration"); +} diff --git a/tools/testing/selftests/powerpc/mm/migration.h b/tools/testing/selftests/powerpc/mm/migration.h new file mode 100644 index 000..fe35849 --- /dev/null +++ b/tools/testing/selftests/powerpc/mm/migration.h @@ -0,0 +1,204 @@ +/* + * Copyright (C) 2015, Anshuman Khandual, IBM Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as
Re: [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
On Thu, Jan 28, 2016 at 12:55:38PM +0530, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
>
> Reported-by: Anton Blanchard
> Signed-off-by: Shilpasri G Bhat

Reviewed-by: Gautham R. Shenoy

--
Thanks and Regards
gautham.
Re: [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event
On Thu, Jan 28, 2016 at 12:55:40PM +0530, Shilpasri G Bhat wrote:
> Currently we use printk message to notify the throttle event. But this
> can flood the console if the cpu is throttled frequently. So replace the
> printk with the tracepoint to notify the throttle event. And also events
> like throttle below nominal frequency and OCC_RESET are reduced to
> pr_warn/pr_warn_once as pointed by MFG to not mark them as critical
> messages. This patch adds 'throttle_reason' to struct chip to store the
> throttle reason.
>
> Signed-off-by: Shilpasri G Bhat

Reviewed-by: Gautham R. Shenoy

--
Thanks and Regards
gautham.
Re: [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event
On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> Currently we use printk message to notify the throttle event. But this
> can flood the console if the cpu is throttled frequently. So replace the
> printk with the tracepoint to notify the throttle event. And also events
> like throttle below nominal frequency and OCC_RESET are reduced to
> pr_warn/pr_warn_once as pointed by MFG to not mark them as critical
> messages. This patch adds 'throttle_reason' to struct chip to store the
> throttle reason.
>
> Signed-off-by: Shilpasri G Bhat
> ---
> Changes from v6:
> - Rename struct chip member 'throt_reason' to 'throttle_reason'

Acked-by: Viresh Kumar

--
viresh
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
Hi Viresh,

On 01/28/2016 02:10 PM, Viresh Kumar wrote:
> On 28-01-16, 12:55, Shilpasri G Bhat wrote:
>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> index b683e8e..dea4620 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>>  		- WriteBack: data is written only to the cache line and
>>  			     the modified cache line is written to main
>>  			     memory only when it is replaced
>> +
>> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats
>
> What about the chip directory ? Shouldn't that be documented? And
> shouldn't that mention that this is just for powerpc ?
>
> And before that, I don't think that you are doing this properly. I am
> sorry that I never came to a point where I could review it, and you
> continued with it, version after version.
>
> But, I really have strong objections to the way this is done. And you
> are making things more complex than they are.
>
> So, these stats are per-policy, right ?

First of all, sorry about the version log.

No, these stats are not per-policy. They are per-chip. The throttle event
is common for all cores in the chip.

>
> Then why aren't they added on the policy->kobj instead, just like
> cpufreq-stats? And maybe inside cpufreq-stats folder only?
>
> That will solve many complexities you have in place here and will look
> sane as well.
>
> Right now, you have stats as two places, cpu/cpufreq/chip/ and
> cpu/cpuX/cpufreq/stats/, which doesn't look wise and adds to
> confusion.
>
> What do you say?
>

Yes, I agree that it will be much cleaner with policy->kobj. But using
policy->kobj will result in multiple copies of the throttle-chip stats
being exported, one for each policy in the chip. And moving it to
cpu/cpuX/cpufreq/stats/ will add a dependency on CONFIG_CPU_FREQ_STAT.

We want throttle attributes to be either in cpu/cpufreq or
cpu/cpuX/cpufreq. If multiple copies are not an issue, then I will move
it to cpu/cpuX/cpufreq.

Thanks and Regards,
Shilpa
Re: [PATCH v3 2/3] x86: query dynamic DEBUG_PAGEALLOC setting
On 01/27/2016 11:17 PM, David Rientjes wrote:
> On Wed, 27 Jan 2016, Christian Borntraeger wrote:
>
>> We can use debug_pagealloc_enabled() to check if we can map
>> the identity mapping with 2MB pages. We can also add the state
>> into the dump_stack output.
>>
>> The patch does not touch the code for the 1GB pages, which ignored
>> CONFIG_DEBUG_PAGEALLOC. Do we need to fence this as well?
>>
>> Signed-off-by: Christian Borntraeger
>> Reviewed-by: Thomas Gleixner
>> ---
>>  arch/x86/kernel/dumpstack.c | 5 ++---
>>  arch/x86/mm/init.c | 7 ---
>>  arch/x86/mm/pageattr.c | 14 --
>>  3 files changed, 10 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
>> index 9c30acf..32e5699 100644
>> --- a/arch/x86/kernel/dumpstack.c
>> +++ b/arch/x86/kernel/dumpstack.c
>> @@ -265,9 +265,8 @@ int __die(const char *str, struct pt_regs *regs, long err)
>>  #ifdef CONFIG_SMP
>>  	printk("SMP ");
>>  #endif
>> -#ifdef CONFIG_DEBUG_PAGEALLOC
>> -	printk("DEBUG_PAGEALLOC ");
>> -#endif
>> +	if (debug_pagealloc_enabled())
>> +		printk("DEBUG_PAGEALLOC ");
>>  #ifdef CONFIG_KASAN
>>  	printk("KASAN");
>>  #endif
>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>> index 493f541..39823fd 100644
>> --- a/arch/x86/mm/init.c
>> +++ b/arch/x86/mm/init.c
>> @@ -150,13 +150,14 @@ static int page_size_mask;
>>
>>  static void __init probe_page_size_mask(void)
>>  {
>> -#if !defined(CONFIG_DEBUG_PAGEALLOC) && !defined(CONFIG_KMEMCHECK)
>> +#if !defined(CONFIG_KMEMCHECK)
>>  	/*
>> -	 * For CONFIG_DEBUG_PAGEALLOC, identity mapping will use small pages.
>> +	 * For CONFIG_KMEMCHECK or pagealloc debugging, identity mapping will
>> +	 * use small pages.
>>  	 * This will simplify cpa(), which otherwise needs to support splitting
>>  	 * large pages into small in interrupt context, etc.
>>  	 */
>> -	if (cpu_has_pse)
>> +	if (cpu_has_pse && !debug_pagealloc_enabled())
>>  		page_size_mask |= 1 << PG_LEVEL_2M;
>>  #endif
>>
>
> I would have thought free_init_pages() would be modified to use
> debug_pagealloc_enabled() as well?

Indeed, I only touched the identity mapping and dump stack.
The question is: do we really want to change free_init_pages as well?
The unmapping during runtime causes significant overhead, but the
unmapping after init imposes almost no runtime overhead. Of course,
things get fishy now as to what is enabled and what is not.

Kconfig after my patch "mm/debug_pagealloc: Ask users for default setting of debug_pagealloc"
(in mm) now states

snip
By default this option will have a small overhead, e.g. by not
allowing the kernel mapping to be backed by large pages on some
architectures. Even bigger overhead comes when the debugging is
enabled by DEBUG_PAGEALLOC_ENABLE_DEFAULT or the debug_pagealloc
command line parameter.
snip

So I am tempted to NOT change free_init_pages, but the x86 maintainers
can certainly decide differently. Ingo, Thomas, H. Peter, please advise.
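For readers following the thread, here is a minimal user-space sketch of the idea in the quoted patch: a compile-time #ifdef is replaced by a runtime query, so one kernel image can decide at boot whether 2MB identity mappings are allowed. Everything prefixed with sketch_, the pagealloc_debug variable, and the enum values are hypothetical stand-ins chosen for illustration; only the shape of the condition is taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's page-level constants. */
enum { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

/* Hypothetical runtime switch; in the kernel this state would be set from
 * the Kconfig default or the debug_pagealloc command line parameter. */
static bool pagealloc_debug;

/* Stand-in for debug_pagealloc_enabled(): a runtime query, not an #ifdef. */
static bool sketch_debug_pagealloc_enabled(void)
{
	return pagealloc_debug;
}

/* Allow 2MB mappings only if the CPU supports them and page-alloc
 * debugging is switched off at runtime. */
static unsigned int sketch_probe_page_size_mask(bool cpu_has_pse)
{
	unsigned int mask = 0;

	if (cpu_has_pse && !sketch_debug_pagealloc_enabled())
		mask |= 1u << PG_LEVEL_2M;

	return mask;
}
```

With the flag clear and PSE available, the mask contains the 2MB bit; once the flag is set, the same binary falls back to small pages.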
Re: [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit
On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> This will free the dynamically allocated memory of'chips' on
> module exit.

It has a 'space' issue before 'chips', but I don't really care much
about that, so you aren't required to resend unless you have to send
a v8 for something else.

> Signed-off-by: Shilpasri G Bhat
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 547890f..53f980b 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -612,6 +612,7 @@ static void __exit powernv_cpufreq_exit(void)
>  	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
>  	opal_message_notifier_unregister(OPAL_MSG_OCC,
>  					 &powernv_cpufreq_opal_nb);
> +	kfree(chips);
>  	cpufreq_unregister_driver(&powernv_cpufreq_driver);
>  }
>  module_exit(powernv_cpufreq_exit);

Acked-by: Viresh Kumar

--
viresh
Re: [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver
On Thu, Jan 28, 2016 at 6:25 PM, Shilpasri G Bhat wrote:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some machines may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints. And it also fixes couple of bugs reported in the driver.
>
> - Patch [1] fixes a memory leak bug
> - Patch [2] fixes the cpu hot-plug bug in powernv_cpufreq_work_fn().
> - Patch [3] solves a bug in powernv_cpufreq_throttle_check(), which
>   calls in to cpu_to_chip_id() in hot path which reads DT every time
>   to find the chip id.
> - Patches [4] to [6] will add a perf trace point
>   "power:powernv_throttle" and sysfs throttle counter stats in
>   /sys/devices/system/cpu/cpufreq/chipN.
>

Looks good to me. You've got the reviews and acks you need.

Balbir Singh
RE: [PATCH] B4860qds/B4420qds: Updates to device trees for B4860 for DSP clusters and their L2 caches
Please ignore this mail. Will send another revision.

Regards
Ashish

-----Original Message-----
From: Ashish Kumar [mailto:ashish.ku...@nxp.com]
Sent: Thursday, January 28, 2016 1:23 PM
To: Scott Wood; linuxppc-dev@lists.ozlabs.org
Cc: Ashish Kumar ; Shaveta Leekha
Subject: [PATCH] B4860qds/B4420qds: Updates to device trees for B4860 for DSP clusters and their L2 caches
Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
On 01/28/2016 02:41 PM, Anshuman Khandual wrote:
> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
> HugeTLB page at the PMD level. It enables the kernel configuration option

As mentioned above, it works only for 16MB HugeTLB page migration on 64K
base pages, implemented right on the PMD as a single PTE, but not for the
16MB HugeTLB page on 4K base pages implemented through the huge page directory.
As the generic VM migrate code does not look into the page table structure
when initiating migration of 16MB pages on 4K, it will just fail without
stating the reason, which might sometimes be confusing. I can edit the commit
message to capture this point if needed.
Re: [PATCH v6 1/9] ppc64 (le): prepare for -mprofile-kernel
On Thu, Jan 28, 2016 at 03:26:59PM +1100, Michael Ellerman wrote:
>
> That raises an interesting question, how does it work *without*
> DYNAMIC_FTRACE?
>
> It looks like you haven't updated that version of _mcount at all? Or maybe I'm
> missing an #ifdef somewhere?

You didn't, I did. I haven't considered that combination.

> It doesn't look like that will work right with the -mprofile-kernel ABI. And
> indeed it doesn't boot.

The lean _mcount should handle it and boot, had I not misplaced it in
the #ifdefs, but then of course profiling wouldn't work.

> So we'll need to work that out. I guess the minimum would be to disable
> -mprofile-kernel if DYNAMIC_FTRACE is disabled.

I feel that supporting all combinations of ABIv1/ABIv2, FTRACE/DYNAMIC_FTRACE,
-p/-mprofile-kernel will get us into #ifdef hell, and at least one kernel
developer will go insane. That will probably be the one porting this to ppc64be
(ABIv1).

> Frankly I think we'd be happy to *only* support DYNAMIC_FTRACE, but the generic
> code doesn't let us do that at the moment.

Seconded.
I'll have a look at the Kconfigs.

> But it's better than the previous version which didn't boot :)

That's your fault, you picked the wrong compiler ;-)

> Also ftracetest fails at step 8:
> ...
> [8] ftrace - function graph filters with stack tracer
> Unable to handle kernel paging request for data at address 0xd33d7f70
[...]
> That doesn't happen without your series applied, though that doesn't 100% mean
> it's your bug. I haven't had time to dig any deeper.

Will check as well...

	Torsten
[PATCH] B4860qds/B4420qds: Updates to device trees for B4860 for DSP clusters and their L2 caches
B4860 has 1 PPC core cluster and 3 DSP core clusters. Similarly B4420 has 1 PPC
core cluster and 1 DSP core cluster. Each DSP core cluster consists of 2 SC3900
cores and a shared L2 cache.

1. Add DSP clusters for B4420
2. Reorganized the L2 cache nodes such that they now appear in only the soc
   specific dtsi files (b4860si-post.dtsi and b4420si-post.dtsi). Earlier they
   were shown partly in common b4si-post.dtsi and si specific
   b4860si-post.dtsi files.

Signed-off-by: Ashish Kumar
Signed-off-by: Shaveta Leekha
---
 arch/powerpc/boot/dts/fsl/b4420si-post.dtsi |8
 arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi | 23
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi | 18 +
 arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi | 52 +++
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi|5 ---
 5 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
index 86161ae..c0fe250 100644
--- a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi
@@ -102,5 +102,13 @@
 	L2: l2-cache-controller@c2 {
 		compatible = "fsl,b4420-l2-cache-controller";
+		reg = <0xc2 0x1000>;
+		next-level-cache = <>;
+	};
+/* Following is DSP L2 cache*/
+	L2_2: l2-cache-controller@c6 {
+		compatible = "fsl,b4420-l2-cache-controller";
+		reg = <0xc6 0x1000>;
+		next-level-cache = <>;
 	};
 };
diff --git a/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi b/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi
index 338af7e..5fec4ea 100644
--- a/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi
@@ -76,4 +76,27 @@
 			fsl,portid-mapping = <0x8000>;
 		};
 	};
+
+	dsp-clusters {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		dsp-cluster0 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "fsl,sc3900-cluster";
+			reg = <0>;
+
+			dsp0: dsp@0 {
+				compatible = "fsl,sc3900";
+				reg = <0>;
+				next-level-cache = <&L2_2>;
+			};
+			dsp1: dsp@1 {
+				compatible = "fsl,sc3900";
+				reg = <1>;
+				next-level-cache = <&L2_2>;
+			};
+		};
+	};
 };
diff --git a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
index f35e9e0..19679d3 100644
--- a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
@@ -204,5 +204,23 @@
 	L2: l2-cache-controller@c2 {
 		compatible = "fsl,b4860-l2-cache-controller";
+		reg = <0xc2 0x1000>;
+		next-level-cache = <>;
+	};
+/* Following are DSP L2 cache */
+	L2_2: l2-cache-controller@c6 {
+		compatible = "fsl,b4860-l2-cache-controller";
+		reg = <0xc6 0x1000>;
+		next-level-cache = <>;
+	};
+	L2_3: l2-cache-controller@ca {
+		compatible = "fsl,b4860-l2-cache-controller";
+		reg = <0xca 0x1000>;
+		next-level-cache = <>;
+	};
+	L2_4: l2-cache-controller@ce {
+		compatible = "fsl,b4860-l2-cache-controller";
+		reg = <0xce 0x1000>;
+		next-level-cache = <>;
 	};
 };
diff --git a/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi
index 1948f73..2e5dcb6 100644
--- a/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi
@@ -90,4 +90,56 @@
 			fsl,portid-mapping = <0x8000>;
 		};
 	};
+	dsp-clusters {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		dsp-cluster0 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "fsl,sc3900-cluster";
+			reg = <0>;
+			dsp0: dsp@0 {
+				compatible = "fsl,sc3900";
+				reg = <0>;
+				next-level-cache = <&L2_2>;
+			};
+			dsp1: dsp@1 {
+				compatible = "fsl,sc3900";
+				reg = <1>;
+				next-level-cache = <&L2_2>;
+			};
+		};
+		dsp-cluster1 {
+
Re: [PATCH v6 0/9] ftrace with regs + live patching for ppc64 LE (ABI v2)
On Thu, Jan 28, 2016 at 02:31:58PM +1100, Michael Ellerman wrote:
>
> Looking at GCC history it looks like the fix is in 4.9.0 and anything later.

Good. But 4.8.5 has a buggy -mprofile-kernel, and there will be no 4.8.6. Bad.

> But a version check doesn't work with patched distro/vendor toolchains. So we
> probably need some sort of runtime check.

Agreed.

/bin/echo -e '#include \nnotrace int func() { return 0; }' | gcc -D__KERNEL__ -Iinclude -p -mprofile-kernel -x c -O2 - -S -o - | grep mcount

should be empty. If it yields "bl _mcount" your compiler is buggy.
I haven't looked at the kernel's "autoconf" yet, but it's probably capable of
testing this.

	Torsten
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
On 01/28/2016 03:11 PM, Viresh Kumar wrote:
> On 28-01-16, 15:06, Shilpasri G Bhat wrote:
>> No these stats are not per-policy. They are per-chip. The throttle event is
>> common for all cores in the chip.
>
> How do you define a chip? And how is it different than the group of
> CPUs represented by the policy ?
>

A chip is a group of policies.

Hmm, yes, I see your point. We already maintain frequency stats per-policy.
We might as well export the throttle stats per-policy too, with each policy
pointing to the per-chip data.
Re: [RFC PATCH v3 0/5] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table on PPC64 platform
Ping...

Alex, any comment?

Regards,
Yongji Xie

On 2016/1/15 15:06, Yongji Xie wrote:

The current vfio-pci implementation disallows mmapping sub-page
(size < PAGE_SIZE) MMIO BARs and the MSI-X table. This is because a
sub-page BAR's MMIO page may be shared with other BARs, and the MSI-X
table should not be accessed directly from the guest for security reasons.

But these restrictions easily cause performance issues for MMIO accesses
in the guest when vfio passes through sub-page BARs or BARs containing the
MSI-X table on the PPC64 platform. This is because PAGE_SIZE is 64KB by
default on PPC64, so the big page may easily hit the sub-page MMIO BARs'
unmapping and cause the unmapping of the MMIO page in which the MSI-X table
is located, which leads to MMIO emulation in the host.

For sub-page MMIO BARs' unmapping, this patchset adds a kernel parameter
for the PCI resource allocator to enforce the alignment of all MMIO BARs to
be at least PAGE_SIZE, and enables it by default on the PPC64 platform so
that a sub-page BAR's MMIO page will not be shared with other BARs. Then we
can mmap sub-page MMIO BARs in the vfio-pci driver with this parameter
enabled.

For the MSI-X table's unmapping, we think the MSI-X table is safe to access
directly from userspace if the PCI host bridge supports filtering of MSIs,
which can ensure that a given PCI device can only shoot the MSIs assigned
to it. So we add a pci_host_bridge attribute to indicate whether this PCI
host bridge supports filtering of MSIs. Then we can mmap the MSI-X table
with this attribute set.

With this patchset applied, we can get almost 100% improvement in
performance for MMIO accesses when we pass through sub-page BARs to the
guest in our test.

The two vfio related patches (patch 2 and patch 5) are based on the
proposed patchset[1].

Changelog v3:
- Rebase on new linux kernel mainline with the patchset[1] applied.
- Add a function to check whether a PCI BAR's MMIO page is shared with
  other BARs.
- Add a host bridge attribute to indicate that the PCI host bridge supports
  filtering of MSIs.
- Use the new host bridge attribute to check if the MSI-X table can be
  mmapped instead of CONFIG_EEH.
- Remove Kconfig option VFIO_PCI_MMAP_MSIX

Changelog v2:
- Rebase on v4.4-rc6 with the patchset[1] applied.
- Use a kernel parameter to enforce all MMIO BARs to be page aligned in
  PCI core code instead of doing it in PPC64 arch code.
- Remove flags: VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED
  VFIO_DEVICE_FLAGS_PCI_MSIX_MMAP
- Add a Kconfig option to support mmapping the MSI-X table.

[1] https://lkml.org/lkml/2015/11/23/748

Yongji Xie (5):
  PCI: Add support for enforcing all MMIO BARs to be page aligned
  vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive
  PCI: Add host bridge attribute to indicate filtering of MSIs is supported
  powerpc/powernv/pci-ioda: Enable msi_filtered bit for any IODA host bridge
  vfio-pci: Allow to mmap MSI-X table if host bridge supports filtering of MSIs

 Documentation/kernel-parameters.txt |5 +
 arch/powerpc/include/asm/pci.h| 11 +
 arch/powerpc/platforms/powernv/pci-ioda.c |6 +
 drivers/pci/host-bridge.c |6 +
 drivers/pci/pci.c | 35 +
 drivers/pci/pci.h |8 ++-
 drivers/vfio/pci/vfio_pci.c | 13 ---
 include/linux/pci.h |7 ++
 8 files changed, 87 insertions(+), 4 deletions(-)
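To illustrate the sub-page BAR problem described in the cover letter, here is a small user-space sketch. It is not code from the patchset: the sketch_ function names, the example addresses, and the rule "align to at least one page" are assumptions made for illustration. It checks whether two BARs touch the same page for a given page size, and computes the minimum alignment that would keep a small BAR alone in its page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round an address down to its page boundary (page_size is a power of two). */
static uint64_t sketch_page_floor(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}

/* True if the two BARs (non-zero sizes) touch at least one common page. */
static bool sketch_bars_share_page(uint64_t a_start, uint64_t a_size,
				   uint64_t b_start, uint64_t b_size,
				   uint64_t page_size)
{
	uint64_t a_first = sketch_page_floor(a_start, page_size);
	uint64_t a_last = sketch_page_floor(a_start + a_size - 1, page_size);
	uint64_t b_first = sketch_page_floor(b_start, page_size);
	uint64_t b_last = sketch_page_floor(b_start + b_size - 1, page_size);

	return a_first <= b_last && b_first <= a_last;
}

/* Alignment that keeps a BAR alone in its page(s): at least one page. */
static uint64_t sketch_min_bar_alignment(uint64_t bar_size, uint64_t page_size)
{
	return bar_size < page_size ? page_size : bar_size;
}
```

With 64KB pages, two adjacent 4KB BARs land in the same page and so cannot be mmapped independently; placing each on its own 64KB boundary removes the sharing.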
Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
On 28-01-16, 12:55, Shilpasri G Bhat wrote: > Create sysfs attributes to export throttle information in > /sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as > follows: > > 1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies > This gives the throttle stats for each of the available frequencies. > The throttle stat of a frequency is the total number of times the max > frequency is reduced to that frequency. > # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies > 4023000 0 > 399 0 > 3956000 1 > 3923000 0 > 389 0 > 3857000 2 > 3823000 0 > 379 0 > 3757000 2 > 3724000 1 > 369 1 > ... > > 2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons > This directory contains throttle reason files. Each file gives the > total number of times the max frequency is throttled, except for > 'unthrottle_count', which gives the total number of times the max > frequency is unthrottled after being throttled. > # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons > # cat cpu_over_temperature > 7 > # cat occ_reset > 0 > # cat over_current > 0 > # cat power_cap > 0 > # cat power_supply_failure > 0 > # cat unthrottle_count > 7 Wouldn't it be better to keep a two dimensional table here, something like: trans_table ? You can have "reasons" in the vertical dimension and frequencies in the horizontal one? So, that user can see which frequencies got throttled and why? 
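A rough user-space sketch of the two-dimensional layout suggested above, with throttle reasons as rows and frequencies as columns. The table dimensions, the sketch_ names, and the output format are invented for illustration and are not taken from the driver.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sizes; the real driver derives these from OCC and pstates. */
enum { SKETCH_NR_REASONS = 3, SKETCH_NR_FREQS = 4 };

/* One counter per (reason, frequency) pair. */
static unsigned int sketch_throttle_table[SKETCH_NR_REASONS][SKETCH_NR_FREQS];

/* Count one throttle event: max frequency dropped to freq_idx for reason. */
static void sketch_record_throttle(int reason, int freq_idx)
{
	sketch_throttle_table[reason][freq_idx]++;
}

/* Format one row (one reason) as space-separated counters plus newline. */
static int sketch_format_row(char *buf, size_t len, int reason)
{
	int f, n = 0;

	for (f = 0; f < SKETCH_NR_FREQS && (size_t)n < len; f++)
		n += snprintf(buf + n, len - n, "%u%c",
			      sketch_throttle_table[reason][f],
			      f == SKETCH_NR_FREQS - 1 ? '\n' : ' ');
	return n;
}
```

A reader of such a table can see both which frequencies were hit and why, which the two separate files in the patch cannot express.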
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu > b/Documentation/ABI/testing/sysfs-devices-system-cpu > index b683e8e..dea4620 100644 > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu > @@ -271,3 +271,48 @@ Description: Parameters for the CPU cache attributes > - WriteBack: data is written only to the cache line and >the modified cache line is written to main >memory only when it is replaced > + > +What:/sys/devices/system/cpu/cpufreq/chip*/throttle_stats You need documentation for chip*/ as well.. And how can a user know which policies or CPUs belong to a chip? > diff --git a/drivers/cpufreq/powernv-cpufreq.c > b/drivers/cpufreq/powernv-cpufreq.c > static struct chip { > unsigned int id; > bool throttled; > @@ -62,6 +72,11 @@ static struct chip { > u8 throttle_reason; > cpumask_t mask; > struct work_struct throttle; > + int throttle_turbo; > + int throttle_nominal; > + int reason[OCC_MAX_REASON]; > + int *pstate_stat; > + struct kobject *kobj; > } *chips; > > static int nr_chips; > @@ -196,6 +211,126 @@ static struct freq_attr *powernv_cpu_freq_attr[] = { > NULL, > }; > > +static inline int get_chip_index(unsigned int id) > +{ > + int i; > + > + for (i = 0; i < nr_chips; i++) > + if (chips[i].id == id) > + return i; > + > + return -EINVAL; > +} > + > +static inline int get_chip_index_from_kobj(struct kobject *kobj) > +{ > + int ret, id; > + int len = strlen("chip"); > + > + ret = kstrtoint(kobj->name + len, 0, ); That doesn't look nice. What about keeping the kobject in the 'struct chip' and using container of here? You can register the kobject with: kobject_init_and_add(). 
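A minimal user-space sketch of the container_of approach suggested above, assuming the kobject is embedded in the chip structure rather than stored as a pointer. The sketch_ types and macro are simplified stand-ins for the kernel's struct kobject and container_of(); they only show how the chip can be recovered from the kobject without parsing its name.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct kobject. */
struct sketch_kobject {
	const char *name;
};

/* Same idea as the kernel's container_of(): step back from a member
 * pointer to the start of the structure that contains it. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical chip structure with the kobject embedded by value. */
struct sketch_chip {
	unsigned int id;
	struct sketch_kobject kobj;
};

/* Recover the chip from the kobject handed to a sysfs show() callback. */
static struct sketch_chip *sketch_to_chip(struct sketch_kobject *kobj)
{
	return sketch_container_of(kobj, struct sketch_chip, kobj);
}
```

This removes the need for the kstrtoint() call on the kobject name and for the lookup loop over all chips.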
> + if (ret) > + return ret; > + > + ret = get_chip_index(id); > + if (ret < 0) > + pr_warn_once("%s Matching chip-id not found %d\n", __func__, > + id); > + return ret; > +} > + > +static ssize_t throttle_freq_show(struct kobject *kobj, > + struct kobj_attribute *attr, char *buf) > +{ > + int i, count = 0, id; > + > + id = get_chip_index_from_kobj(kobj); > + if (id < 0) > + return id; > + > + for (i = 0; i < powernv_pstate_info.nr_pstates; i++) > + count += sprintf([count], "%d %d\n", > +powernv_freqs[i].frequency, > +chips[id].pstate_stat[i]); > + > + return count; > +} > + > +static struct kobj_attribute attr_throttle_frequencies = > +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL); Use DEVICE_ATTR_RO macro ? > @@ -583,12 +736,38 @@ static int init_chip_info(void) > goto free_chip_map; > > for (i = 0; i < nr_chips; i++) { > + char name[10]; > + > chips[i].id = chip[i]; > cpumask_copy([i].mask, cpumask_of_node(chip[i])); > INIT_WORK([i].throttle, powernv_cpufreq_work_fn); > + chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates, > + sizeof(int), GFP_KERNEL); > + if (!chips[i].pstate_stat) > + goto free; > + > + sprintf(name, "chip%d", chips[i].id);
Re: [PATCH v9 3/6] arm64/arm, numa, dt: adding numa dt binding implementation for arm64 platforms.
On Thu, Jan 28, 2016 at 10:42:17PM +0530, Ganapatrao Kulkarni wrote:
> On Thu, Jan 28, 2016 at 8:09 PM, Will Deacon wrote:
> > On Tue, Jan 26, 2016 at 02:36:04PM -0600, Bjorn Helgaas wrote:
> >> Subject is "arm64/arm, numa, dt: adding ..." What is the significance
> >> of the "arm" part? The other patches only mention "arm64".
> >>
> >> General comment: the code below has little, if anything, that is
> >> actually arm64-specific. Maybe this is the first DT-based NUMA
> >> platform? I don't see other similar code for other arches, so maybe
> >> it's too early to try to generalize it, but we should try to avoid
> >> adding duplicates of this code if/when other arches do show up.
> >
> > Having it in the core code would allow us to share it with arch/arm/
> > fairly straightforwardly.
> This binding can be used for arm too.
> however at this moment it is the need of arm64 platforms.
> can we please keep this to arm64 as it's too early to try to
> generalize it(as Bjorn suggested)
> I prefer to keep it as it is, otherwise ok.
> Please suggest.

My suggestions time and time again on the NUMA patches from you have
consistently been around consolidation of existing code, or moving things
that aren't architecture-specific out of the architecture code.

Will
Re: powerpc/mm: Fixup _HPAGE_CHG_MASK
On Wed, 2016-27-01 at 06:34:20 UTC, "Aneesh Kumar K.V" wrote:
> This got wrongly updated by 7aa9a23c69eae5bfba4f1f92c58d89edc334c8ae
> ("powerpc, thp: remove infrastructure for handling splitting PMDs")
> during the last merge. Fix this up
>
> Signed-off-by: Aneesh Kumar K.V

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/2d19fc639516dc7b4184450b31

cheers
Re: [PATCH 3/3] param: convert some "on"/"off" users to strtobool
On Thu, 2016-01-28 at 06:17 -0800, Kees Cook wrote:

> This changes several users of manual "on"/"off" parsing to use strtobool.

You should probably point out that it's a slight behaviour change for some
users. ie. parameters that previously *only* worked with "on"/"off", can now
also take 0/1/y/n etc.

But I don't think that's a show stopper.

> Signed-off-by: Kees Cook
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s...@vger.kernel.org
> ---
>  arch/powerpc/kernel/rtasd.c | 10 +++---
>  arch/powerpc/platforms/pseries/hotplug-cpu.c | 11 +++

Acked-by: Michael Ellerman (powerpc)

cheers
Re: [PATCH v9 3/6] arm64/arm, numa, dt: adding numa dt binding implementation for arm64 platforms.
On Tue, Jan 26, 2016 at 02:36:04PM -0600, Bjorn Helgaas wrote:
> Subject is "arm64/arm, numa, dt: adding ..." What is the significance
> of the "arm" part? The other patches only mention "arm64".
>
> General comment: the code below has little, if anything, that is
> actually arm64-specific. Maybe this is the first DT-based NUMA
> platform? I don't see other similar code for other arches, so maybe
> it's too early to try to generalize it, but we should try to avoid
> adding duplicates of this code if/when other arches do show up.

Having it in the core code would allow us to share it with arch/arm/
fairly straightforwardly.

Will
[PATCH 2/3] lib: add "on" and "off" to strtobool
Several places in the kernel expect to use "on" and "off" for their
boolean signifiers, so add them to strtobool.

Signed-off-by: Kees Cook
Cc: Rasmus Villemoes
Cc: Daniel Borkmann
---
 lib/string.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 0323c0d5629a..091570708db7 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -635,12 +635,15 @@ EXPORT_SYMBOL(sysfs_streq);
  * @s: input string
  * @res: result
  *
- * This routine returns 0 iff the first character is one of 'Yy1Nn0'.
- * Otherwise it will return -EINVAL. Value pointed to by res is
- * updated upon finding a match.
+ * This routine returns 0 iff the first character is one of 'Yy1Nn0', or
+ * [oO][NnFf] for "on" and "off". Otherwise it will return -EINVAL. Value
+ * pointed to by res is updated upon finding a match.
  */
 int strtobool(const char *s, bool *res)
 {
+	if (!s)
+		return -EINVAL;
+
 	switch (s[0]) {
 	case 'y':
 	case 'Y':
@@ -652,6 +655,21 @@ int strtobool(const char *s, bool *res)
 	case '0':
 		*res = false;
 		break;
+	case 'o':
+	case 'O':
+		switch (s[1]) {
+		case 'n':
+		case 'N':
+			*res = true;
+			break;
+		case 'f':
+		case 'F':
+			*res = false;
+			break;
+		default:
+			return -EINVAL;
+		}
+		break;
 	default:
 		return -EINVAL;
 	}
--
2.6.3
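For experimenting outside the kernel, the parsing rule in this patch can be mirrored in a small user-space function. This is a sketch, not the kernel's lib/string.c: the name sketch_strtobool is invented, and the 'y'/'1' and 'n'/'0' cases are filled in from the routine's own comment ('Yy1Nn0') because the diff only shows part of them.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space mirror of the patched parsing rule: accept a leading
 * 'Yy1' or 'Nn0', or "on"/"off" judged by the first two characters. */
static int sketch_strtobool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		break;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		break;
	case 'o':
	case 'O':
		/* Only the second character distinguishes "on" from "off". */
		switch (s[1]) {
		case 'n':
		case 'N':
			*res = true;
			break;
		case 'f':
		case 'F':
			*res = false;
			break;
		default:
			return -EINVAL;
		}
		break;
	default:
		return -EINVAL;
	}
	return 0;
}
```

Because only the first two characters are inspected, a string such as "offline" parses as false in this sketch, while "ok" is rejected with -EINVAL and leaves the result untouched.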
[PATCH 3/3] param: convert some "on"/"off" users to strtobool
This changes several users of manual "on"/"off" parsing to use strtobool. Signed-off-by: Kees CookCc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s...@vger.kernel.org --- arch/powerpc/kernel/rtasd.c | 10 +++--- arch/powerpc/platforms/pseries/hotplug-cpu.c | 11 +++ arch/s390/kernel/time.c | 8 ++-- arch/s390/kernel/topology.c | 8 +++- arch/x86/kernel/aperture_64.c| 13 +++-- include/linux/tick.h | 2 +- kernel/time/hrtimer.c| 11 +++ kernel/time/tick-sched.c | 11 +++ 8 files changed, 21 insertions(+), 53 deletions(-) diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c index 5a2c049c1c61..984e67e91ba3 100644 --- a/arch/powerpc/kernel/rtasd.c +++ b/arch/powerpc/kernel/rtasd.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -49,7 +50,7 @@ static unsigned int rtas_error_log_buffer_max; static unsigned int event_scan; static unsigned int rtas_event_scan_rate; -static int full_rtas_msgs = 0; +static bool full_rtas_msgs; /* Stop logging to nvram after first fatal error */ static int logging_enabled; /* Until we initialize everything, @@ -592,11 +593,6 @@ __setup("surveillance=", surveillance_setup); static int __init rtasmsgs_setup(char *str) { - if (strcmp(str, "on") == 0) - full_rtas_msgs = 1; - else if (strcmp(str, "off") == 0) - full_rtas_msgs = 0; - - return 1; + return strtobool(str, _rtas_msgs); } __setup("rtasmsgs=", rtasmsgs_setup); diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 32274f72fe3f..bb333e9fd77a 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -47,20 +48,14 @@ static DEFINE_PER_CPU(enum cpu_state_vals, current_state) = CPU_STATE_OFFLINE; static enum cpu_state_vals default_offline_state = CPU_STATE_OFFLINE; -static int cede_offline_enabled __read_mostly = 1; +static bool cede_offline_enabled 
__read_mostly = true; /* * Enable/disable cede_offline when available. */ static int __init setup_cede_offline(char *str) { - if (!strcmp(str, "off")) - cede_offline_enabled = 0; - else if (!strcmp(str, "on")) - cede_offline_enabled = 1; - else - return 0; - return 1; + return strtobool(str, _offline_enabled); } __setup("cede_offline=", setup_cede_offline); diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c index 99f84ac31307..afc7fc9684ba 100644 --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -1433,7 +1433,7 @@ device_initcall(etr_init_sysfs); /* * Server Time Protocol (STP) code. */ -static int stp_online; +static bool stp_online; static struct stp_sstpi stp_info; static void *stp_page; @@ -1444,11 +1444,7 @@ static struct timer_list stp_timer; static int __init early_parse_stp(char *p) { - if (strncmp(p, "off", 3) == 0) - stp_online = 0; - else if (strncmp(p, "on", 2) == 0) - stp_online = 1; - return 0; + return strtobool(p, _online); } early_param("stp", early_parse_stp); diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c index 40b8102fdadb..10e388216307 100644 --- a/arch/s390/kernel/topology.c +++ b/arch/s390/kernel/topology.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -37,7 +38,7 @@ static void set_topology_timer(void); static void topology_work_fn(struct work_struct *work); static struct sysinfo_15_1_x *tl_info; -static int topology_enabled = 1; +static bool topology_enabled = true; static DECLARE_WORK(topology_work, topology_work_fn); /* @@ -444,10 +445,7 @@ static const struct cpumask *cpu_book_mask(int cpu) static int __init early_parse_topology(char *p) { - if (strncmp(p, "off", 3)) - return 0; - topology_enabled = 0; - return 0; + return strtobool(p, _enabled); } early_param("topology", early_parse_topology); diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c index 6e85f713641d..6608b00a516a 100644 --- a/arch/x86/kernel/aperture_64.c +++ 
b/arch/x86/kernel/aperture_64.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -227,19 +228,11 @@ static u32 __init search_agp_bridge(u32 *order, int *valid_agp) return 0; } -static int gart_fix_e820 __initdata = 1; +static bool gart_fix_e820 __initdata = true; static int __init parse_gart_mem(char *p) { - if (!p) - return -EINVAL; - - if (!strncmp(p, "off", 3)) -
Re: powerpc/mm: Allow user space to map rtas_rmo_buf
On Thu, 2016-21-01 at 16:15:31 UTC, Vasant Hegde wrote:
> With commit 90a545e9 (restrict /dev/mem to idle io memory ranges) mapping
> rtas_rmo_buf from user space is failing. Hence we are not able to make
> RTAS syscall.
>
> This patch calls page_is_rtas_user_buf before calling iomem_is_exclusive
> in devmem_is_allowed(). This will allow user space to map rtas_rmo_buf
> and we are able to make RTAS syscall.
>
> Reported-by: Bharata B Rao
> CC: Dan Williams
> CC: Nathan Fontenot
> CC: Michael Ellerman
> Signed-off-by: Vasant Hegde
> Acked-by: Dan Williams

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/e256caa7d0515e301f8c8c6e7d

cheers
Re: powerpc/eeh: Fix PE location code
On Wed, 2015-02-12 at 05:25:32 UTC, Gavin Shan wrote:
> In eeh_pe_loc_get(), the PE location code is retrieved from the
> "ibm,loc-code" property of the device node for the bridge of the
> PE's primary bus. It's not correct because the property indicates
> the parent PE's location code.
>
> This reads the correct PE location code from "ibm,io-base-loc-code"
> or "ibm,slot-location-code" property of PE parent bus's device node.
>
> Signed-off-by: Gavin Shan
> Tested-by: Russell Currey

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/7e56f627768da4e6480986b514

cheers
Re: powerpc/perf: Remove PPMU_HAS_SSLOT flag for Power8
On Mon, 2016-25-01 at 08:33:46 UTC, Madhavan Srinivasan wrote:
> Commit: 7a7868326d77 introduced PPMU_HAS_SSLOT flag to
> remove assumption of MMCRA[SLOT] with respect to
> PPMU_ALT_SIPR flag. Commit 7a7868326d77's message also
> specifies that Power8 does not support MMCRA[SLOT].
> But still PPMU_HAS_SSLOT flag managed to get into
> Power8 code. Patch to remove the same from Power8 flags.
>
> Signed-off-by: Madhavan Srinivasan

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/370f06c88528b3988fe24a372c

cheers
[PATCH] arch/PPC:B4860qds/B4420qds: Updates to device trees for B4860 for DSP clusters and their L2 caches
B4860 has 1 PPC core cluster and 3 DSP core clusters. Similarly B4420 has 1 PPC core cluster and 1 DSP core cluster. Each DSP core cluster consists of 2 SC3900 cores and a shared L2 cache. Add DSP clusters for B4420 The L2 cache nodes such that they now appear in only the soc specific dtsi files(b4860si-post.dtsi and b4420si-post.dtsi). Signed-off-by: Ashish KumarSigned-off-by: Shaveta Leekha --- arch/powerpc/boot/dts/fsl/b4420si-post.dtsi |7 +++- arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi | 23 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi | 20 ++- arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi | 52 +++ 4 files changed, 100 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi index f996cce..cc70adb 100644 --- a/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/b4420si-post.dtsi @@ -91,7 +91,12 @@ L2_1: l2-cache-controller@c2 { compatible = "fsl,b4420-l2-cache-controller"; - reg = <0xc2 0x4>; + reg = <0xc2 0x1000>; + next-level-cache = <>; + }; + L2_2: l2-cache-controller@c6 { + compatible = "fsl,b4420-l2-cache-controller"; + reg = <0xc6 0x1000>; next-level-cache = <>; }; }; diff --git a/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi b/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi index bc3bf93..87c2712 100644 --- a/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi +++ b/arch/powerpc/boot/dts/fsl/b4420si-pre.dtsi @@ -81,4 +81,27 @@ fsl,portid-mapping = <0x8000>; }; }; + + dsp-clusters { + #address-cells = <1>; + #size-cells = <0>; + + dsp-cluster0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,sc3900-cluster"; + reg = <0>; + + dsp0: dsp@0 { + compatible = "fsl,sc3900"; + reg = <0>; + next-level-cache = <_2>; + }; + dsp1: dsp@1 { + compatible = "fsl,sc3900"; + reg = <1>; + next-level-cache = <_2>; + }; + }; + }; }; diff --git a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi index 8687198..833d483 100644 --- 
a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi @@ -278,7 +278,25 @@ L2_1: l2-cache-controller@c2 { compatible = "fsl,b4860-l2-cache-controller"; - reg = <0xc2 0x4>; + reg = <0xc2 0x1000>; + next-level-cache = <>; + }; + + L2_2: l2-cache-controller@c6 { + compatible = "fsl,b4860-l2-cache-controller"; + reg = <0xc6 0x1000>; + next-level-cache = <>; + }; + + L2_3: l2-cache-controller@ca { + compatible = "fsl,b4860-l2-cache-controller"; + reg = <0xca 0x1000>; + next-level-cache = <>; + }; + + L2_4: l2-cache-controller@ce { + compatible = "fsl,b4860-l2-cache-controller"; + reg = <0xce 0x1000>; next-level-cache = <>; }; }; diff --git a/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi index 8797ce1..a45800d 100644 --- a/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi +++ b/arch/powerpc/boot/dts/fsl/b4860si-pre.dtsi @@ -100,4 +100,56 @@ fsl,portid-mapping = <0x8000>; }; }; + dsp-clusters { + #address-cells = <1>; + #size-cells = <0>; + dsp-cluster0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,sc3900-cluster"; + reg = <0>; + dsp0: dsp@0 { + compatible = "fsl,sc3900"; + reg = <0>; + next-level-cache = <_2>; + }; + dsp1: dsp@1 { + compatible = "fsl,sc3900"; + reg = <1>; + next-level-cache = <_2>; + }; + }; + dsp-cluster1 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,sc3900-cluster"; +
[PATCH 1/3] lib: fix callers of strtobool to use char array
Some callers of strtobool were passing a pointer to unterminated strings. This fixes the issue and consolidates some logic in cifs. Signed-off-by: Kees CookCc: Amitkumar Karwar Cc: Nishant Sarmukadam Cc: Kalle Valo Cc: Steve French Cc: linux-c...@vger.kernel.org --- drivers/net/wireless/marvell/mwifiex/debugfs.c | 6 +- fs/cifs/cifs_debug.c | 106 - fs/cifs/cifs_debug.h | 2 +- fs/cifs/cifsfs.c | 6 +- fs/cifs/cifsglob.h | 4 +- 5 files changed, 58 insertions(+), 66 deletions(-) diff --git a/drivers/net/wireless/marvell/mwifiex/debugfs.c b/drivers/net/wireless/marvell/mwifiex/debugfs.c index 0b9c580af988..76af60899c69 100644 --- a/drivers/net/wireless/marvell/mwifiex/debugfs.c +++ b/drivers/net/wireless/marvell/mwifiex/debugfs.c @@ -880,13 +880,13 @@ mwifiex_reset_write(struct file *file, { struct mwifiex_private *priv = file->private_data; struct mwifiex_adapter *adapter = priv->adapter; - char cmd; + char cmd[2] = { '\0' }; bool result; - if (copy_from_user(, ubuf, sizeof(cmd))) + if (copy_from_user(cmd, ubuf, sizeof(char))) return -EFAULT; - if (strtobool(, )) + if (strtobool(cmd, )) return -EINVAL; if (!result) diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c index 50b268483302..2f7ffcc9e364 100644 --- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -251,11 +251,29 @@ static const struct file_operations cifs_debug_data_proc_fops = { .release= single_release, }; +static int get_user_bool(const char __user *buffer, bool *store) +{ + char c[2] = { '\0' }; + bool bv; + int rc; + + rc = get_user(c[0], buffer); + if (rc) + return rc; + + rc = strtobool(c, ); + if (rc) + return rc; + + *store = bv; + + return 0; +} + #ifdef CONFIG_CIFS_STATS static ssize_t cifs_stats_proc_write(struct file *file, const char __user *buffer, size_t count, loff_t *ppos) { - char c; bool bv; int rc; struct list_head *tmp1, *tmp2, *tmp3; @@ -263,34 +281,32 @@ static ssize_t cifs_stats_proc_write(struct file *file, struct cifs_ses *ses; struct cifs_tcon *tcon; - rc = get_user(c, 
buffer); + rc = get_user_bool(buffer, ); if (rc) return rc; - if (strtobool(, ) == 0) { #ifdef CONFIG_CIFS_STATS2 - atomic_set(, 0); - atomic_set(, 0); + atomic_set(, 0); + atomic_set(, 0); #endif /* CONFIG_CIFS_STATS2 */ - spin_lock(_tcp_ses_lock); - list_for_each(tmp1, _tcp_ses_list) { - server = list_entry(tmp1, struct TCP_Server_Info, - tcp_ses_list); - list_for_each(tmp2, >smb_ses_list) { - ses = list_entry(tmp2, struct cifs_ses, -smb_ses_list); - list_for_each(tmp3, >tcon_list) { - tcon = list_entry(tmp3, - struct cifs_tcon, - tcon_list); - atomic_set(>num_smbs_sent, 0); - if (server->ops->clear_stats) - server->ops->clear_stats(tcon); - } + spin_lock(_tcp_ses_lock); + list_for_each(tmp1, _tcp_ses_list) { + server = list_entry(tmp1, struct TCP_Server_Info, + tcp_ses_list); + list_for_each(tmp2, >smb_ses_list) { + ses = list_entry(tmp2, struct cifs_ses, +smb_ses_list); + list_for_each(tmp3, >tcon_list) { + tcon = list_entry(tmp3, + struct cifs_tcon, + tcon_list); + atomic_set(>num_smbs_sent, 0); + if (server->ops->clear_stats) + server->ops->clear_stats(tcon); } } - spin_unlock(_tcp_ses_lock); } + spin_unlock(_tcp_ses_lock); return count; } @@ -433,17 +449,17 @@ static int cifsFYI_proc_open(struct inode *inode, struct file *file) static ssize_t cifsFYI_proc_write(struct file *file, const char __user *buffer, size_t count, loff_t *ppos) { -
[PATCH 0/3] lib: add "on" and "off" to strtobool
This consolidates logic for handling "on"/"off" parsing for bools into the existing strtobool function. This requires making sure callers are passing NUL-terminated strings.

-Kees
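To illustrate why the NUL termination matters, here is a minimal userspace sketch of a parser that accepts "on"/"off" in addition to y/n/1/0. It is not the kernel's implementation; sketch_strtobool() is a hypothetical name used only for this example. The point is that recognising "on" versus "off" means looking at the second character, so a caller that passes the address of a lone char would have the parser read past it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace sketch only; sketch_strtobool() is a hypothetical name and
 * this is not the kernel's lib/ code.
 *
 * Returns 0 and stores the parsed value in *res, or -EINVAL if the
 * input is not recognised. Only the first one or two characters are
 * examined, but s[1] IS examined for "on"/"off", so s must be
 * NUL-terminated.
 */
static int sketch_strtobool(const char *s, bool *res)
{
	if (!s || !res)
		return -EINVAL;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		/* "on" or "off": needs the second character */
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		default:
			break;
		}
		break;
	default:
		break;
	}

	return -EINVAL;
}
```

With a two-byte buffer such as char c[2] = { '\0' }, filling only c[0] from user space keeps c[1] as the terminator, so a bare "o" is rejected with -EINVAL rather than being matched against whatever happens to follow it in memory.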
Re: [PATCH 3/3] param: convert some "on"/"off" users to strtobool
On Thu, Jan 28, 2016 at 06:17:07AM -0800, Kees Cook wrote:
> This changes several users of manual "on"/"off" parsing to use strtobool.
>
> Signed-off-by: Kees Cook
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s...@vger.kernel.org
> ---
>  arch/powerpc/kernel/rtasd.c                  | 10 +++---
>  arch/powerpc/platforms/pseries/hotplug-cpu.c | 11 +++
>  arch/s390/kernel/time.c                      |  8 ++--
>  arch/s390/kernel/topology.c                  |  8 +++-
>  arch/x86/kernel/aperture_64.c                | 13 +++--
>  include/linux/tick.h                         |  2 +-
>  kernel/time/hrtimer.c                        | 11 +++
>  kernel/time/tick-sched.c                     | 11 +++
>  8 files changed, 21 insertions(+), 53 deletions(-)

For the s390 bits:

Acked-by: Heiko Carstens
Re: [PATCH v9 3/6] arm64/arm, numa, dt: adding numa dt binding implementation for arm64 platforms.
Hi Will,

On Thu, Jan 28, 2016 at 8:09 PM, Will Deacon wrote:
> On Tue, Jan 26, 2016 at 02:36:04PM -0600, Bjorn Helgaas wrote:
>> Subject is "arm64/arm, numa, dt: adding ..." What is the significance
>> of the "arm" part? The other patches only mention "arm64".
>>
>> General comment: the code below has little, if anything, that is
>> actually arm64-specific. Maybe this is the first DT-based NUMA
>> platform? I don't see other similar code for other arches, so maybe
>> it's too early to try to generalize it, but we should try to avoid
>> adding duplicates of this code if/when other arches do show up.
>
> Having it in the core code would allow us to share it with arch/arm/
> fairly straightforwardly.

This binding can be used for arm too. However, at the moment only arm64
platforms need it. Can we please keep this to arm64, as it's too early to
try to generalize it (as Bjorn suggested)? I prefer to keep it as it is;
otherwise ok. Please suggest.

>
> Will

thanks
Ganapat
Re: [PATCH 1/2] powerpc/mm: Enable HugeTLB page migration
Anshuman Khandual writes:

> This enables HugeTLB page migration for PPC64_BOOK3S systems which implement
> HugeTLB page at the PMD level. It enables the kernel configuration option
> CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION by default which turns on the function
> hugepage_migration_supported() during migration. After the recent changes
> to the PTE format, HugeTLB page migration happens successfully.
>
> Signed-off-by: Anshuman Khandual
> ---
>  arch/powerpc/Kconfig | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e4824fd..65d52a0 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -82,6 +82,10 @@ config GENERIC_HWEIGHT
>  config ARCH_HAS_DMA_SET_COHERENT_MASK
>  	bool
>
> +config ARCH_ENABLE_HUGEPAGE_MIGRATION
> +	def_bool y
> +	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
> +
>  config PPC
>  	bool
>  	default y

Are you sure this is all that is needed? We will get a FOLL_GET with hugetlb
migration and our follow_huge_addr will BUG_ON on that. Look at
e66f17ff71772b209eed39de35aaa99ba819c93d ("mm/hugetlb: take page table lock
in follow_huge_pmd()").

Again, this doesn't work with 4K page size. So if you are taking this route,
we will need that restriction here. I would suggest we switch 64K page size
hugetlb to generic hugetlb and then do hugetlb migration on top of that.

Till you help me understand why that FOLL_GET issue is not valid for powerpc,
NAK.

-aneesh
[RFCv2 0/9] PAPR hash page table resizing (guest side)
Here's a second prototype of the guest side work for runtime resizing
of the hash page table in PAPR guests.

This is now feature complete. It implements the resizing, advertises
it with CAS, and will automatically invoke it to maintain a good HPT
size when memory is hot-added or hot-removed.

Patches 1-5 are standalone prerequisite cleanups that I'll be pushing
concurrently.

David Gibson (9):
  memblock: Don't mark memblock_phys_mem_size() as __init
  arch/powerpc: Clean up error handling for htab_remove_mapping
  arch/powerpc: Handle removing maybe-present bolted HPTEs
  arch/powerpc: Clean up memory hotplug failure paths
  arch/powerpc: Split hash page table sizing heuristic into a helper
  pseries: Add hypercall wrappers for hash page table resizing
  pseries: Add support for hash table resizing
  pseries: Advertise HPT resizing support via CAS
  pseries: Automatically resize HPT for memory hot add/remove

 arch/powerpc/include/asm/firmware.h       |   5 +-
 arch/powerpc/include/asm/hvcall.h         |   2 +
 arch/powerpc/include/asm/machdep.h        |   3 +-
 arch/powerpc/include/asm/mmu-hash64.h     |   3 +
 arch/powerpc/include/asm/plpar_wrappers.h |  12 +++
 arch/powerpc/include/asm/prom.h           |   1 +
 arch/powerpc/include/asm/sparsemem.h      |   1 +
 arch/powerpc/kernel/prom_init.c           |   2 +-
 arch/powerpc/mm/hash_utils_64.c           | 121 --
 arch/powerpc/mm/init_64.c                 |  47
 arch/powerpc/mm/mem.c                     |  14 +++-
 arch/powerpc/platforms/pseries/firmware.c |   1 +
 arch/powerpc/platforms/pseries/lpar.c     | 117 -
 mm/memblock.c                             |   2 +-
 14 files changed, 281 insertions(+), 50 deletions(-)

-- 
2.5.0
[RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init
At the moment memblock_phys_mem_size() is marked as __init, and so is
discarded after boot. This is different from most of the memblock
functions which are marked __init_memblock, and are only discarded after
boot if memory hotplug is not configured.

To allow for upcoming code which will need memblock_phys_mem_size() in
the hotplug path, change it from __init to __init_memblock.

Signed-off-by: David Gibson
---
 mm/memblock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index d2ed81e..dd79899 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1448,7 +1448,7 @@ void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)
  * Remaining API functions
  */

-phys_addr_t __init memblock_phys_mem_size(void)
+phys_addr_t __init_memblock memblock_phys_mem_size(void)
 {
 	return memblock.memory.total_size;
 }
-- 
2.5.0
[RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping
Currently, the only error that htab_remove_mapping() can report is -EINVAL,
if removal of bolted HPTEs isn't implemented for this platform. We make
a few clean ups to the handling of this:

 * EINVAL isn't really the right code - there's nothing wrong with the
   function's arguments - use ENODEV instead
 * We were also printing a warning message, but that's a decision better
   left up to the callers, so remove it
 * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
   error, making the warning message irrelevant, so no change is needed
   there.
 * The other caller is remove_section_mapping(). This is called in the
   memory hot remove path at a point after vmemmap_remove_mapping() so
   if hpte_removebolted isn't implemented, we'd expect to have already
   BUG()ed anyway. Put a WARN_ON() here, in lieu of a printk() since this
   really shouldn't be happening.

Signed-off-by: David Gibson
---
 arch/powerpc/mm/hash_utils_64.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index ba59d59..9f7d727 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -273,11 +273,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;

-	if (!ppc_md.hpte_removebolted) {
-		printk(KERN_WARNING "Platform doesn't implement "
-		       "hpte_removebolted\n");
-		return -EINVAL;
-	}
+	if (!ppc_md.hpte_removebolted)
+		return -ENODEV;

 	for (vaddr = vstart; vaddr < vend; vaddr += step)
 		ppc_md.hpte_removebolted(vaddr, psize, ssize);
@@ -641,8 +638,10 @@ int create_section_mapping(unsigned long start, unsigned long end)

 int remove_section_mapping(unsigned long start, unsigned long end)
 {
-	return htab_remove_mapping(start, end, mmu_linear_psize,
-				   mmu_kernel_ssize);
+	int rc = htab_remove_mapping(start, end, mmu_linear_psize,
+				     mmu_kernel_ssize);
+	WARN_ON(rc < 0);
+	return rc;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */

-- 
2.5.0
[RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
At the moment the hpte_removebolted callback in ppc_md returns void and will BUG_ON() if the hpte it's asked to remove doesn't exist in the first place. This is awkward for the case of cleaning up a mapping which was partially made before failing. So, we add a return value to hpte_removebolted, and have it return ENOENT in the case that the HPTE to remove didn't exist in the first place. In the (sole) caller, we propagate errors in hpte_removebolted to its caller to handle. However, we handle ENOENT specially, continuing to complete the unmapping over the specified range before returning the error to the caller. This means that htab_remove_mapping() will work sanely on a partially present mapping, removing any HPTEs which are present, while also returning ENOENT to its caller in case it's important there. There are two callers of htab_remove_mapping(): - In remove_section_mapping() we already WARN_ON() any error return, which is reasonable - in this case the mapping should be fully present - In vmemmap_remove_mapping() we BUG_ON() any error. We change that to just a WARN_ON() in the case of ENOENT, since failing to remove a mapping that wasn't there in the first place probably shouldn't be fatal. 
Signed-off-by: David Gibson--- arch/powerpc/include/asm/machdep.h| 2 +- arch/powerpc/mm/hash_utils_64.c | 10 +++--- arch/powerpc/mm/init_64.c | 9 + arch/powerpc/platforms/pseries/lpar.c | 7 +-- 4 files changed, 18 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index 3f191f5..a7d3f66 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -54,7 +54,7 @@ struct machdep_calls { int psize, int apsize, int ssize); long(*hpte_remove)(unsigned long hpte_group); - void(*hpte_removebolted)(unsigned long ea, + long(*hpte_removebolted)(unsigned long ea, int psize, int ssize); void(*flush_hash_range)(unsigned long number, int local); void(*hugepage_invalidate)(unsigned long vsid, diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 9f7d727..0737eae 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend, { unsigned long vaddr; unsigned int step, shift; + int rc = 0; shift = mmu_psize_defs[psize].shift; step = 1 << shift; @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend, if (!ppc_md.hpte_removebolted) return -ENODEV; - for (vaddr = vstart; vaddr < vend; vaddr += step) - ppc_md.hpte_removebolted(vaddr, psize, ssize); + for (vaddr = vstart; vaddr < vend; vaddr += step) { + rc = ppc_md.hpte_removebolted(vaddr, psize, ssize); + if ((rc < 0) && (rc != -ENOENT)) + return rc; + } - return 0; + return rc; } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c index 379a6a9..baa1a23 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start, static void vmemmap_remove_mapping(unsigned long start, unsigned long page_size) { - int mapped = 
htab_remove_mapping(start, start + page_size, -mmu_vmemmap_psize, -mmu_kernel_ssize); - BUG_ON(mapped < 0); + int rc = htab_remove_mapping(start, start + page_size, +mmu_vmemmap_psize, +mmu_kernel_ssize); + BUG_ON((rc < 0) && (rc != -ENOENT)); + WARN_ON(rc == -ENOENT); } #endif diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c index 477290a..92d472d 100644 --- a/arch/powerpc/platforms/pseries/lpar.c +++ b/arch/powerpc/platforms/pseries/lpar.c @@ -505,7 +505,7 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid, } #endif -static void pSeries_lpar_hpte_removebolted(unsigned long ea, +static long pSeries_lpar_hpte_removebolted(unsigned long ea, int psize, int ssize) { unsigned long vpn; @@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea, vpn = hpt_vpn(ea, vsid, ssize); slot = pSeries_lpar_hpte_find(vpn, psize, ssize); - BUG_ON(slot == -1); + if (slot == -1) + return -ENOENT;
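The error-handling rule described in the commit message (keep going on ENOENT, stop on anything else, and still tell the caller about the ENOENT) can be sketched in plain userspace C as follows. All names here (remove_fn, sketch_remove_range, demo_remove) are hypothetical and exist only for this illustration. Note that this sketch keeps a separate "sticky" result so that an -ENOENT from an early page is not overwritten by a later success; as far as I can tell the loop as posted returns whatever the last iteration produced, so the two may differ when only an earlier page is missing.

```c
#include <errno.h>

/* Hypothetical per-page removal hook: 0 on success, negative errno on failure. */
typedef long (*remove_fn)(unsigned long vaddr);

/*
 * Walk [vstart, vend) in units of step, removing each page.
 * -ENOENT is remembered but does not stop the walk; any other
 * error aborts immediately and is returned as-is.
 */
static int sketch_remove_range(unsigned long vstart, unsigned long vend,
			       unsigned long step, remove_fn remove_one)
{
	unsigned long vaddr;
	int ret = 0;

	if (!remove_one)
		return -ENODEV;
	if (step == 0)
		return -EINVAL;

	for (vaddr = vstart; vaddr < vend; vaddr += step) {
		long rc = remove_one(vaddr);

		if (rc == -ENOENT) {
			ret = -ENOENT;	/* sticky: report it at the end */
			continue;
		}
		if (rc < 0)
			return (int)rc;	/* hard failure: stop here */
	}

	return ret;
}

/* Demo hook: page 0x2000 was never mapped, page 0x5000 hits a hard error. */
static long demo_remove(unsigned long vaddr)
{
	if (vaddr == 0x2000)
		return -ENOENT;
	if (vaddr == 0x5000)
		return -EIO;
	return 0;
}
```

With the demo hook, removing 0x0..0x4000 walks all four pages and reports -ENOENT, while removing 0x4000..0x8000 stops at 0x5000 with -EIO.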
[RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
This makes a number of cleanups to handling of mapping failures during memory hotplug on Power: For errors creating the linear mapping for the hot-added region: * This is now reported with EFAULT which is more appropriate than the previous EINVAL (the failure is unlikely to be related to the function's parameters) * An error in this path now prints a warning message, rather than just silently failing to add the extra memory. * Previously a failure here could result in the region being partially mapped. We now clean up any partial mapping before failing. For errors creating the vmemmap for the hot-added region: * This is now reported with EFAULT instead of causing a BUG() - this could happen for external reason (e.g. full hash table) so it's better to handle this non-fatally * An error message is also printed, so the failure won't be silent * As above a failure could cause a partially mapped region, we now clean this up. Signed-off-by: David Gibson--- arch/powerpc/mm/hash_utils_64.c | 13 ++--- arch/powerpc/mm/init_64.c | 38 ++ arch/powerpc/mm/mem.c | 10 -- 3 files changed, 44 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 0737eae..e88a86e 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -635,9 +635,16 @@ static unsigned long __init htab_get_table_size(void) #ifdef CONFIG_MEMORY_HOTPLUG int create_section_mapping(unsigned long start, unsigned long end) { - return htab_bolt_mapping(start, end, __pa(start), -pgprot_val(PAGE_KERNEL), mmu_linear_psize, -mmu_kernel_ssize); + int rc = htab_bolt_mapping(start, end, __pa(start), + pgprot_val(PAGE_KERNEL), mmu_linear_psize, + mmu_kernel_ssize); + + if (rc < 0) { + int rc2 = htab_remove_mapping(start, end, mmu_linear_psize, + mmu_kernel_ssize); + BUG_ON(rc2 && (rc2 != -ENOENT)); + } + return rc; } int remove_section_mapping(unsigned long start, unsigned long end) diff --git a/arch/powerpc/mm/init_64.c 
b/arch/powerpc/mm/init_64.c index baa1a23..fbc9448 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -188,9 +188,9 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size) */ #ifdef CONFIG_PPC_BOOK3E -static void __meminit vmemmap_create_mapping(unsigned long start, -unsigned long page_size, -unsigned long phys) +static int __meminit vmemmap_create_mapping(unsigned long start, + unsigned long page_size, + unsigned long phys) { /* Create a PTE encoding without page size */ unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED | @@ -208,6 +208,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start, */ for (i = 0; i < page_size; i += PAGE_SIZE) BUG_ON(map_kernel_page(start + i, phys, flags)); + + return 0; } #ifdef CONFIG_MEMORY_HOTPLUG @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start, } #endif #else /* CONFIG_PPC_BOOK3E */ -static void __meminit vmemmap_create_mapping(unsigned long start, -unsigned long page_size, -unsigned long phys) +static int __meminit vmemmap_create_mapping(unsigned long start, + unsigned long page_size, + unsigned long phys) { - int mapped = htab_bolt_mapping(start, start + page_size, phys, - pgprot_val(PAGE_KERNEL), - mmu_vmemmap_psize, - mmu_kernel_ssize); - BUG_ON(mapped < 0); + int rc = htab_bolt_mapping(start, start + page_size, phys, + pgprot_val(PAGE_KERNEL), + mmu_vmemmap_psize, mmu_kernel_ssize); + if (rc < 0) { + int rc2 = htab_remove_mapping(start, start + page_size, + mmu_vmemmap_psize, + mmu_kernel_ssize); + BUG_ON(rc2 && (rc2 != -ENOENT)); + } + return rc; } #ifdef CONFIG_MEMORY_HOTPLUG @@ -304,6 +311,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node) for (; start < end; start += page_size) { void *p; + int rc; if
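The "clean up any partial mapping before failing" pattern from this patch can be sketched in userspace C along these lines. The page table here is just a bool array and every name (demo_mapped, demo_fail_at, demo_map_page, demo_unmap_page, sketch_create_mapping) is made up for the example; it is only meant to show the shape of the rollback, not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_PAGES 8

/* Toy "page table": which of the DEMO_PAGES slots are currently mapped. */
static bool demo_mapped[DEMO_PAGES];
/* Index at which demo_map_page() should fail, or -1 for "never". */
static int demo_fail_at = -1;

static int demo_map_page(int idx)
{
	if (idx == demo_fail_at)
		return -EFAULT;		/* e.g. the hash table is full */
	demo_mapped[idx] = true;
	return 0;
}

static int demo_unmap_page(int idx)
{
	if (!demo_mapped[idx])
		return -ENOENT;		/* nothing there to remove */
	demo_mapped[idx] = false;
	return 0;
}

/*
 * Map pages [first, last). If any page fails, unmap the whole range
 * again so no partial mapping is left behind, then return the original
 * error. -ENOENT during the rollback is expected (pages past the
 * failure point were never mapped) and is ignored; the kernel patch
 * BUG_ON()s on any other rollback error, which cannot occur in this toy.
 */
static int sketch_create_mapping(int first, int last)
{
	int i, rc = 0;

	if (first < 0 || last > DEMO_PAGES || first > last)
		return -EINVAL;

	for (i = first; i < last; i++) {
		rc = demo_map_page(i);
		if (rc < 0)
			break;
	}

	if (rc < 0) {
		for (i = first; i < last; i++)
			(void)demo_unmap_page(i);
	}

	return rc;
}
```

Setting demo_fail_at to 3 and mapping pages 0..6 returns -EFAULT and leaves pages 0 to 2 unmapped again, which is the property the hot-add path relies on.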
[RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
htab_get_table_size() either retrieves the size of the hash page table (HPT)
from the device tree - if the HPT size is determined by firmware - or uses a
heuristic to determine a good size based on RAM size if the kernel is
responsible for allocating the HPT.

To support a PAPR extension allowing resizing of the HPT, we're going to want
the memory size -> HPT size logic elsewhere, so split it out into a helper
function.

Signed-off-by: David Gibson
---
 arch/powerpc/include/asm/mmu-hash64.h |  3 +++
 arch/powerpc/mm/hash_utils_64.c       | 30 +-
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 7352d3f..cf070fd 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
 	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
 	return get_vsid(context, ea, ssize);
 }
+
+unsigned htab_shift_for_mem_size(unsigned long mem_size);
+
 #endif /* __ASSEMBLY__ */

 #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index e88a86e..d63f7dc 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
 	return 0;
 }

-static unsigned long __init htab_get_table_size(void)
+unsigned htab_shift_for_mem_size(unsigned long mem_size)
 {
-	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
+	unsigned memshift = __ilog2(mem_size);
+	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
+	unsigned pteg_shift;
+
+	/* round mem_size up to next power of 2 */
+	if ((1UL << memshift) < mem_size)
+		memshift += 1;
+
+	/* aim for 2 pages / pteg */
+	pteg_shift = memshift - (pshift + 1);
+
+	return max(pteg_shift + 7, 18U);
+}

+static unsigned long __init htab_get_table_size(void)
+{
 	/* If hash size isn't already provided by the platform, we try to
 	 * retrieve it from the device-tree. If it's not there neither, we
 	 * calculate it now based on the total RAM size
@@ -619,17 +633,7 @@ static unsigned long __init htab_get_table_size(void)
 	if (ppc64_pft_size)
 		return 1UL << ppc64_pft_size;

-	/* round mem_size up to next power of 2 */
-	mem_size = memblock_phys_mem_size();
-	rnd_mem_size = 1UL << __ilog2(mem_size);
-	if (rnd_mem_size < mem_size)
-		rnd_mem_size <<= 1;
-
-	/* # pages / 2 */
-	psize = mmu_psize_defs[mmu_virtual_psize].shift;
-	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
-
-	return pteg_count << 7;
+	return 1UL << htab_shift_for_mem_size(memblock_phys_mem_size());
 }

 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
2.5.0
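For anyone who wants to check the sizing arithmetic by hand, here is a userspace sketch of the same heuristic. The names sketch_ilog2() and sketch_htab_shift() are hypothetical, the base page shift is passed in as a parameter rather than read from mmu_psize_defs[], and the guard for very small memory sizes is my own addition to avoid unsigned wrap-around in the toy version.

```c
#include <stdint.h>

/* Floor of log2(x) for x > 0; stands in for the kernel's __ilog2(). */
static unsigned sketch_ilog2(uint64_t x)
{
	unsigned r = 0;

	while (x >>= 1)
		r++;
	return r;
}

/*
 * Return log2 of the HPT size in bytes for a given amount of RAM.
 * mem_size is in bytes, pshift is log2 of the base page size
 * (12 for 4K pages, 16 for 64K pages).
 */
static unsigned sketch_htab_shift(uint64_t mem_size, unsigned pshift)
{
	unsigned memshift = sketch_ilog2(mem_size);
	unsigned shift;

	/* round mem_size up to the next power of 2 */
	if ((UINT64_C(1) << memshift) < mem_size)
		memshift += 1;

	/* assumption of this sketch: clamp tiny sizes instead of wrapping */
	if (memshift < pshift + 1)
		return 18;

	/* 2 pages per PTEG, 128 (2^7) bytes per PTEG */
	shift = (memshift - (pshift + 1)) + 7;

	/* never go below 2^18 bytes (2^11 PTEGs) */
	return shift > 18 ? shift : 18;
}
```

As a worked example, 1 GiB of RAM with 4K pages is 2^18 pages, so 2^17 PTEGs at 128 bytes each, giving a shift of 24 (a 16 MiB HPT). One byte more than 1 GiB rounds up to 2 GiB and gives 25; with 64K pages the same 1 GiB gives 20.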
[RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
This adds the hypercall numbers and wrapper functions for the hash page table resizing hypercalls. These are experimental "platform specific" values for now, until we have a formal PAPR update. It also adds a new firmware feature flag to track the presence of the HPT resizing calls. Signed-off-by: David Gibson --- arch/powerpc/include/asm/firmware.h | 5 +++-- arch/powerpc/include/asm/hvcall.h | 2 ++ arch/powerpc/include/asm/plpar_wrappers.h | 12 arch/powerpc/platforms/pseries/firmware.c | 1 + 4 files changed, 18 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index b062924..32435d2 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -42,7 +42,7 @@ #define FW_FEATURE_SPLPAR ASM_CONST(0x0010) #define FW_FEATURE_LPAR ASM_CONST(0x0040) #define FW_FEATURE_PS3_LV1 ASM_CONST(0x0080) -/* Free ASM_CONST(0x0100) */ +#define FW_FEATURE_HPT_RESIZE ASM_CONST(0x0100) #define FW_FEATURE_CMO ASM_CONST(0x0200) #define FW_FEATURE_VPHN ASM_CONST(0x0400) #define FW_FEATURE_XCMO ASM_CONST(0x0800) @@ -66,7 +66,8 @@ enum { FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR | FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO | FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY | - FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN, + FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | + FW_FEATURE_HPT_RESIZE, FW_FEATURE_PSERIES_ALWAYS = 0, FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, FW_FEATURE_POWERNV_ALWAYS = 0, diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index e3b54dd..195e080 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -293,6 +293,8 @@ /* Platform specific hcalls, used by KVM */ #define H_RTAS 0xf000 +#define H_RESIZE_HPT_PREPARE 0xf003 +#define H_RESIZE_HPT_COMMIT 0xf004 /* "Platform specific hcalls", provided by PHYP */ #define H_GET_24X7_CATALOG_PAGE 0xF078 diff --git 
a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h index 1b39424..b7ee6d9 100644 --- a/arch/powerpc/include/asm/plpar_wrappers.h +++ b/arch/powerpc/include/asm/plpar_wrappers.h @@ -242,6 +242,18 @@ static inline long plpar_pte_protect(unsigned long flags, unsigned long ptex, return plpar_hcall_norets(H_PROTECT, flags, ptex, avpn); } +static inline long plpar_resize_hpt_prepare(unsigned long flags, + unsigned long shift) +{ + return plpar_hcall_norets(H_RESIZE_HPT_PREPARE, flags, shift); +} + +static inline long plpar_resize_hpt_commit(unsigned long flags, + unsigned long shift) +{ + return plpar_hcall_norets(H_RESIZE_HPT_COMMIT, flags, shift); +} + static inline long plpar_tce_get(unsigned long liobn, unsigned long ioba, unsigned long *tce_ret) { diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c index 8c80588..7b287be 100644 --- a/arch/powerpc/platforms/pseries/firmware.c +++ b/arch/powerpc/platforms/pseries/firmware.c @@ -63,6 +63,7 @@ hypertas_fw_features_table[] = { {FW_FEATURE_VPHN, "hcall-vphn"}, {FW_FEATURE_SET_MODE, "hcall-set-mode"}, {FW_FEATURE_BEST_ENERGY,"hcall-best-energy-1*"}, + {FW_FEATURE_HPT_RESIZE, "hcall-hpt-resize"}, }; /* Build up the firmware features bitmask using the contents of -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[RFCv2 7/9] pseries: Add support for hash table resizing
This adds support for using experimental hypercalls to change the size of
the main hash page table while running as a PAPR guest. For now these
hypercalls are only in experimental qemu versions.

The interface has two parts: first H_RESIZE_HPT_PREPARE is used to allocate
and prepare the new hash table. This may be slow, but can be done
asynchronously. Then, H_RESIZE_HPT_COMMIT is used to switch to the new
hash table. This requires that no CPUs be concurrently updating the HPT,
and so must be run under stop_machine().

This also adds a debugfs file which can be used to manually control HPT
resizing for testing purposes.

Signed-off-by: David Gibson
---
 arch/powerpc/include/asm/machdep.h    |   1 +
 arch/powerpc/mm/hash_utils_64.c       |  28 +
 arch/powerpc/platforms/pseries/lpar.c | 110 ++
 3 files changed, 139 insertions(+)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index a7d3f66..532d795 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -61,6 +61,7 @@ struct machdep_calls {
 					       unsigned long addr,
 					       unsigned char *hpte_slot_array,
 					       int psize, int ssize, int local);
+	int (*resize_hpt)(unsigned long shift);
 	/*
 	 * Special for kexec.
 	 * To be called in real mode with interrupts disabled. No locks are
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index d63f7dc..882e409 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1578,3 +1579,30 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	/* Finally limit subsequent allocations */
 	memblock_set_current_limit(ppc64_rma_size);
 }
+
+static int ppc64_pft_size_get(void *data, u64 *val)
+{
+	*val = ppc64_pft_size;
+	return 0;
+}
+
+static int ppc64_pft_size_set(void *data, u64 val)
+{
+	if (!ppc_md.resize_hpt)
+		return -ENODEV;
+	return ppc_md.resize_hpt(val);
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_ppc64_pft_size,
+			ppc64_pft_size_get, ppc64_pft_size_set, "%llu\n");
+
+static int __init hash64_debugfs(void)
+{
+	if (!debugfs_create_file("pft-size", 0600, powerpc_debugfs_root,
+				 NULL, &fops_ppc64_pft_size)) {
+		pr_err("lpar: unable to create ppc64_pft_size debugsfs file\n");
+	}
+
+	return 0;
+}
+machine_device_initcall(pseries, hash64_debugfs);
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 92d472d..ebf02e7 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -27,6 +27,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -603,6 +605,113 @@ static int __init disable_bulk_remove(char *str)
 
 __setup("bulk_remove=", disable_bulk_remove);
 
+#define HPT_RESIZE_TIMEOUT	1 /* ms */
+
+struct hpt_resize_state {
+	unsigned long shift;
+	int commit_rc;
+};
+
+static int pseries_lpar_resize_hpt_commit(void *data)
+{
+	struct hpt_resize_state *state = data;
+
+	state->commit_rc = plpar_resize_hpt_commit(0, state->shift);
+	if (state->commit_rc != H_SUCCESS)
+		return -EIO;
+
+	/* Hypervisor has transitioned the HTAB, update our globals */
+	ppc64_pft_size = state->shift;
+	htab_size_bytes = 1UL << ppc64_pft_size;
+	htab_hash_mask = (htab_size_bytes >> 7) - 1;
+
+	return 0;
+}
+
+/* Must be called in user context */
+static int pseries_lpar_resize_hpt(unsigned long shift)
+{
+	struct hpt_resize_state state = {
+		.shift = shift,
+		.commit_rc = H_FUNCTION,
+	};
+	unsigned int delay, total_delay = 0;
+	int rc;
+	ktime_t t0, t1, t2;
+
+	might_sleep();
+
+	if (!firmware_has_feature(FW_FEATURE_HPT_RESIZE))
+		return -ENODEV;
+
+	printk(KERN_INFO "lpar: Attempting to resize HPT to shift %lu\n",
+	       shift);
+
+	t0 = ktime_get();
+
+	rc = plpar_resize_hpt_prepare(0, shift);
+	while (H_IS_LONG_BUSY(rc)) {
+		delay = get_longbusy_msecs(rc);
+		total_delay += delay;
+		if (total_delay > HPT_RESIZE_TIMEOUT) {
+			/* prepare call with shift==0 cancels an
+			 * in-progress resize */
+			rc = plpar_resize_hpt_prepare(0, 0);
+			if (rc != H_SUCCESS)
+				printk(KERN_WARNING
+				       "lpar: Unexpected error %d cancelling timed out HPT resize\n",
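The globals updated in the commit callback follow directly from the shift. As a small userspace sketch of that arithmetic (the struct and function names here are hypothetical, not kernel symbols): the table size is 2^shift bytes, and the hash mask is the number of 128-byte groups minus one, assuming the usual hash MMU layout of eight 16-byte entries per group, which is what the `>> 7` in the patch corresponds to.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative container for the two values derived from the shift. */
struct hpt_geometry {
	uint64_t size_bytes;	/* total HPT size in bytes */
	uint64_t hash_mask;	/* number of 128-byte groups, minus one */
};

/*
 * Hypothetical helper mirroring the arithmetic in the commit callback:
 *   htab_size_bytes = 1UL << shift;
 *   htab_hash_mask  = (htab_size_bytes >> 7) - 1;
 */
static struct hpt_geometry hpt_geometry_for_shift(unsigned int shift)
{
	struct hpt_geometry g;

	/* Need at least one full group, and must not overflow 64 bits. */
	assert(shift >= 7 && shift < 64);

	g.size_bytes = UINT64_C(1) << shift;
	g.hash_mask = (g.size_bytes >> 7) - 1;
	return g;
}
```

For example, a shift of 18 gives a 256 KiB table with 2048 groups, so the mask is 2047; each increment of the shift doubles both the size and the number of groups.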
[RFCv2 8/9] pseries: Advertise HPT resizing support via CAS
The hypervisor needs to know a guest is capable of using the HPT resizing
PAPR extension in order to take full advantage of it for memory hotplug.

If the hypervisor knows the guest is HPT resize aware, it can size the
initial HPT based on the initial guest RAM size, relying on the guest to
resize the HPT when more memory is hot-added. Without this, the hypervisor
must size the HPT for the maximum possible guest RAM, which can lead to a
huge waste of space if the guest never actually expands to that maximum
size.

This patch advertises the guest's support for HPT resizing via the
ibm,client-architecture-support OF interface. Obviously, the actual
encoding in the CAS vector is tentative until the extension is officially
incorporated into PAPR. For now we use bit 0 of (previously unused) byte 8
of option vector 5.

Signed-off-by: David Gibson
---
 arch/powerpc/include/asm/prom.h | 1 +
 arch/powerpc/kernel/prom_init.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..ef08208 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -151,6 +151,7 @@ struct of_drconf_cell {
 #define OV5_XCMO		0x0440	/* Page Coalescing */
 #define OV5_TYPE1_AFFINITY	0x0580	/* Type 1 NUMA affinity */
 #define OV5_PRRN		0x0540	/* Platform Resource Reassignment */
+#define OV5_HPT_RESIZE		0x0880	/* Hash Page Table resizing */
 #define OV5_PFO_HW_RNG		0x0E80	/* PFO Random Number Generator */
 #define OV5_PFO_HW_842		0x0E40	/* PFO Compression Accelerator */
 #define OV5_PFO_HW_ENCR		0x0E20	/* PFO Encryption Accelerator */
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index da51925..c6feafb 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -713,7 +713,7 @@ unsigned char ibm_architecture_vec[] = {
 	OV5_FEAT(OV5_TYPE1_AFFINITY) | OV5_FEAT(OV5_PRRN),
 	0,
 	0,
-	0,
+	OV5_FEAT(OV5_HPT_RESIZE),
 	/* WARNING: The offset of the "number of cores" field below
 	 * must match by the macro below. Update the definition if
 	 * the structure layout changes.
--
2.5.0
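To make the "bit 0 of byte 8" encoding concrete, here is a small userspace sketch of how a constant like 0x0880 splits into a byte index and a bit mask. The helper names are hypothetical, and the buffer is assumed to hold only the data bytes of option vector 5 (no length prefix). Bit 0 follows the big-endian bit numbering used by PAPR, so it is the most significant bit of the byte, mask 0x80.

```c
#include <assert.h>
#include <stddef.h>

/* Value copied from the patch: high byte = byte index, low byte = mask. */
#define OV5_HPT_RESIZE	0x0880

/* Hypothetical helpers for splitting an OV5_* constant. */
static unsigned int ov5_byte_index(unsigned int opt)
{
	return opt >> 8;
}

static unsigned char ov5_bit_mask(unsigned int opt)
{
	return (unsigned char)(opt & 0xff);
}

/*
 * Return 1 if the option is set in the given vector 5 data bytes,
 * 0 if it is clear or the vector is too short to contain it.
 */
static int ov5_is_set(const unsigned char *vec5, size_t len, unsigned int opt)
{
	unsigned int idx = ov5_byte_index(opt);

	assert(vec5 != NULL);
	if (idx >= len)
		return 0;
	return (vec5[idx] & ov5_bit_mask(opt)) != 0;
}
```

A hypervisor-side check written this way would treat a guest as resize-aware only when byte 8 of the negotiated vector has its top bit set; a shorter vector from an older guest simply reads as "not supported".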
[RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove
We've now implemented code in the pseries platform to use the new PAPR
interface to allow resizing the hash page table (HPT) at runtime.

This patch uses that interface to automatically attempt to resize the HPT
when memory is hot added or removed. This tries to always keep the HPT at
a reasonable size for our current memory size.

Signed-off-by: David Gibson
---
 arch/powerpc/include/asm/sparsemem.h |  1 +
 arch/powerpc/mm/hash_utils_64.c      | 29 +
 arch/powerpc/mm/mem.c                |  4
 3 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index f6fc0ee..737335c 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -16,6 +16,7 @@
 #endif /* CONFIG_SPARSEMEM */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+extern void resize_hpt_for_hotplug(unsigned long new_mem_size);
 extern int create_section_mapping(unsigned long start, unsigned long end);
 extern int remove_section_mapping(unsigned long start, unsigned long end);
 #ifdef CONFIG_NUMA
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 882e409..18cc851 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -638,6 +638,35 @@ static unsigned long __init htab_get_table_size(void)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+void resize_hpt_for_hotplug(unsigned long new_mem_size)
+{
+	unsigned target_hpt_shift;
+
+	if (!ppc_md.resize_hpt)
+		return;
+
+	target_hpt_shift = htab_shift_for_mem_size(new_mem_size);
+
+	/*
+	 * To avoid lots of HPT resizes if memory size is fluctuating
+	 * across a boundary, we deliberately have some hysterisis
+	 * here: we immediately increase the HPT size if the target
+	 * shift exceeds the current shift, but we won't attempt to
+	 * reduce unless the target shift is at least 2 below the
+	 * current shift
+	 */
+	if ((target_hpt_shift > ppc64_pft_size)
+	    || (target_hpt_shift < (ppc64_pft_size - 1))) {
+		int rc;
+
+		rc = ppc_md.resize_hpt(target_hpt_shift);
+		if (rc)
+			printk(KERN_WARNING
+			       "Unable to resize hash page table to target order %d: %d\n",
+			       target_hpt_shift, rc);
+	}
+}
+
 int create_section_mapping(unsigned long start, unsigned long end)
 {
 	int rc = htab_bolt_mapping(start, end, __pa(start),
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8ffc1e2..e77f36c 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -121,6 +121,8 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int rc;
 
+	resize_hpt_for_hotplug(memblock_phys_mem_size());
+
 	pgdata = NODE_DATA(nid);
 
 	start = (unsigned long)__va(start);
@@ -161,6 +163,8 @@ int arch_remove_memory(u64 start, u64 size)
 	 */
 	vm_unmap_aliases();
 
+	resize_hpt_for_hotplug(memblock_phys_mem_size());
+
 	return ret;
 }
 #endif
--
2.5.0
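The hysteresis rule in resize_hpt_for_hotplug() can be checked in isolation. The following is a minimal userspace sketch of just the decision, using a hypothetical helper name and plain signed integers for both shifts: grow as soon as the target exceeds the current shift, but shrink only when the target is at least 2 below it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper reproducing the condition from the patch:
 *   (target > current) || (target < current - 1)
 * A target exactly one below the current shift is deliberately
 * ignored, so memory sizes hovering around a boundary do not cause
 * repeated resizes.
 */
static bool hpt_resize_wanted(int target_shift, int current_shift)
{
	assert(target_shift >= 0 && current_shift >= 0);

	return target_shift > current_shift ||
	       target_shift < current_shift - 1;
}
```

With a current shift of 24, a target of 25 triggers an immediate grow, targets of 24 and 23 leave the table alone, and a target of 22 or lower triggers a shrink.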