Re: PA6T little endian mode: ppc64le kernels possible?
On 10 November 2018 at 8:20PM, Olof Johansson wrote:
>
>
> On Sat, Nov 10, 2018 at 00:55 Christian Zigotzky wrote:
>
> Hi All,
>
> I read the following:
>
> The part that is debatably the heart of the chip is a PowerPC core called
> the PA6T, a full blown 64-bit PPC with an FPU, VMX extensions and
> hypervisor support. It fully conforms to the PowerPC 2.04 architecture
> spec, and can operate in both big and little-endian modes.
>
> ---
>
> Is it possible to boot a ppc64le kernel on the PA6T?
>
>
> Nobody has done the work to get the kernel to work in little endian on
> these platforms. I wouldn't expect it to work without quite a bit of
> effort.
>
>
> -Olof

Thank you for your answer.
Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected
On Thu, 2018-11-08 at 23:06 +, alex_gagn...@dellteam.com wrote:
> On 11/08/2018 04:51 PM, Greg KH wrote:
> > On Thu, Nov 08, 2018 at 10:49:08PM +, alex_gagn...@dellteam.com wrote:
> > > In the case that we're trying to fix, this code executing is a result of
> > > the device being gone, so we can guarantee race-free operation. I agree
> > > that there is a race, in the general case. As far as checking the result
> > > for all F's, that's not an option when firmware crashes the system as a
> > > result of the mmio read/write. It's never pretty when firmware gets
> > > involved.
> >
> > If you have firmware that crashes the system when you try to read from a
> > PCI device that was hot-removed, that is broken firmware and needs to be
> > fixed. The kernel can not work around that as again, you will never win
> > that race.
>
> But it's not the firmware that crashes. It's linux as a result of a
> fatal error message from the firmware. And we can't fix that because FFS
> handling requires that the system reboots [1].

Do we know the exact circumstances that result in firmware requesting a
reboot? If it happens on any PCIe error I don't see what we can do to
prevent that beyond masking UEs entirely (are we even allowed to do that
on FFS systems?).

> If we're going to say that we don't want to support FFS because it's a
> separate code path, and different flow, that's fine. I am myself, not a
> fan of FFS. But if we're going to continue supporting it, I think we'll
> continue to have to resolve these sort of unintended consequences.
>
> Alex
>
> [1] ACPI 6.2, 18.1 - Hardware Errors and Error Sources
Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected
On Fri, 2018-11-09 at 08:11 +0100, Lukas Wunner wrote:
> On Thu, Nov 08, 2018 at 02:09:17PM -0600, Bjorn Helgaas wrote:
> > +	/*
> > +	 * If an MMIO read from the device returns ~0 data, that data may
> > +	 * be valid, or it may indicate a bus error. If config space is
> > +	 * readable, assume it's valid data; otherwise, assume a bus error.
> > +	 */
> > +	if (val == ~0) {
> > +		pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > +		if (id == ~0)
> > +			pci_dev_set_disconnected(dev, NULL);
> > +	}
>
> This isn't safe unfortunately because "all ones" may occur for other
> reasons besides disconnectedness. E.g. on an Uncorrectable Error,
> the device may likewise respond with all ones, but revert to valid
> responses if the error can be recovered through a Secondary Bus Reset.
> In such a case, marking the device disconnected would be inappropriate.

I don't really see why we're trying to make a distinction between
recoverable errors and disconnected devices at this stage. In either
case we should assume the device is broken and shouldn't be accessed
until we perform some kind of recovery action. Bjorn's MMIO wrappers
are more-or-less an opt-in software emulation of the freeze-MMIO-on-error
behaviour that the EEH mechanism provides on IBM hardware so I think it
makes sense. It also has the nice side effect of giving driver writers
an error injection mechanism so they might actually test how their
drivers deal with errors.

> Accessing a device in D3cold would be another example where all ones
> is returned both from mmio and config space despite the device still
> being present and future accesses having a chance to succeed.

Is doing an MMIO to a device in D3cold (or hot) ever a valid thing to do?

> In fact, in v2 of Keith's patches adding pci_dev_set_disconnected()
> he attempted the same as what you're doing here and caused issues
> for me with devices in D3cold:
>
> https://spinics.net/lists/linux-pci/msg54337.html
>
>
> > One thing I'm uncomfortable with is that [...]. Another is that the
> > only place we call pci_dev_set_disconnected() is in pciehp and acpiphp,
> > so the only "disconnected" case we catch is if hotplug happens to be
> > involved.
>
> Yes, that's because the hotplug drivers are the only ones who can
> identify removal authoritatively and unambiguously. They *know*
> when the device is gone and don't have to resort to heuristics
> such as all ones. (ISTR that dpc also marks devices disconnected.)

The heuristics shouldn't be used to work out when the device is gone,
rather they should be used to work out when we need to check on the
device. One of the grosser bits of EEH support is a hook in readl() and
friends that checks for a 0xFF response. If it finds one, it looks at
the EEH state and will start the recovery process if the device is
marked as frozen. (don't look at the code. you won't like what you find)

> > sprinkling pci_dev_is_disconnected() around feels ad hoc
> > instead of systematic, in the sense that I don't know how we convince
> > ourselves that this (and only this) is the correct place to put it.
>
> We need to add documentation for driver authors how to deal with
> surprise removal. Briefly:
>
> * If (pdev->error_state == pci_channel_io_perm_failure), the device
>   is definitely gone and any further device access can be skipped.
>   Otherwise presence of the device is likely, but not guaranteed.
>
> * If a device access can significantly delay device removal due to
>   Completion Timeouts, or can cause an infinite loop, MCE or crash,
>   do check pdev->error_state before carrying out the device access.
>
> * Always be prepared that a device access may fail due to surprise
>   removal, do not blindly trust mmio or config space reads or
>   assume success of writes.

Completely agree. We really need better documentation of what drivers
should be doing.

Oliver
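
A rough illustration of the idea discussed in this message: an all-ones read is a prompt to check on the device, not proof that it is gone, and a device already known to be in an error state should not be touched at all. This is a minimal user-space sketch in C, not kernel code. struct mock_dev, enum chan_state and checked_read32() are hypothetical names invented for the example, and both the MMIO read and the config-space vendor ID probe are simulated by plain struct fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-device error state. */
enum chan_state { CHAN_NORMAL, CHAN_FROZEN, CHAN_PERM_FAILURE };

/* Hypothetical mock device; not the real struct pci_dev. */
struct mock_dev {
	enum chan_state error_state;
	uint32_t mmio_value;	/* what a simulated MMIO read returns */
	uint32_t vendor_id;	/* what a simulated config read returns */
	int mmio_reads;		/* counts simulated MMIO accesses */
};

static uint32_t mock_mmio_read(struct mock_dev *d)
{
	d->mmio_reads++;
	return d->mmio_value;
}

/*
 * Returns true and stores the value in *out if the read can be trusted.
 * Returns false if the device is in an error state or just entered one.
 */
bool checked_read32(struct mock_dev *d, uint32_t *out)
{
	uint32_t val;

	/* Known-bad device: skip the access entirely. */
	if (d->error_state != CHAN_NORMAL)
		return false;

	val = mock_mmio_read(d);

	/*
	 * All ones may be valid data. Treat it only as a hint to check on
	 * the device; if the vendor ID also reads all ones, assume the
	 * device needs recovery (it is not necessarily removed).
	 */
	if (val == UINT32_MAX && d->vendor_id == UINT32_MAX) {
		d->error_state = CHAN_FROZEN;
		return false;
	}

	*out = val;
	return true;
}
```

In this sketch a failed probe marks the device frozen rather than disconnected, which leaves the decision about removal to whoever can determine it authoritatively, as argued in the quoted reply.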
[PATCH v2] powerpc/32: Avoid unsupported flags with clang
When building for ppc32 with clang these flags are unsupported:

  -ffixed-r2 and -mmultiple

llvm's lib/Target/PowerPC/PPCRegisterInfo.cpp marks r2 as reserved when
building for SVR4ABI and !ppc64:

  // The SVR4 ABI reserves r2 and r13
  if (Subtarget.isSVR4ABI()) {
    // We only reserve r2 if we need to use the TOC pointer. If we have no
    // explicit uses of the TOC pointer (meaning we're a leaf function with
    // no constant-pool loads, etc.) and we have no potential uses inside an
    // inline asm block, then we can treat r2 has an ordinary callee-saved
    // register.
    const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
    if (!TM.isPPC64() || FuncInfo->usesTOCBasePtr() || MF.hasInlineAsm())
      markSuperRegs(Reserved, PPC::R2);  // System-reserved register
    markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register
  }

This means we can safely omit -ffixed-r2 when building for 32-bit
targets.

The -mmultiple/-mno-multiple flags are not supported by clang, so
platforms that might support multiple miss out on using multiple word
instructions.

We wrap these flags in cc-option so that when Clang gains support the
kernel will be able to use these flags.

Clang 8 can then build a ppc44x_defconfig which boots in Qemu:

  make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- ppc44x_defconfig
  ./scripts/config -e CONFIG_DEVTMPFS -d DEVTMPFS_MOUNT
  make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu-

  qemu-system-ppc -M bamboo \
   -kernel arch/powerpc/boot/zImage \
   -dtb arch/powerpc/boot/dts/bamboo.dtb \
   -initrd ~/ppc32-440-rootfs.cpio \
   -nographic -serial stdio -monitor pty -append "console=ttyS0"

Link: https://github.com/ClangBuiltLinux/linux/issues/261
Link: https://bugs.llvm.org/show_bug.cgi?id=39556
Link: https://bugs.llvm.org/show_bug.cgi?id=39555
Signed-off-by: Joel Stanley
---
 arch/powerpc/Makefile | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 8a2ce14d68d0..4685671dfb4f 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -152,7 +152,14 @@ endif
 CFLAGS-$(CONFIG_PPC64)	+= $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc))
 CFLAGS-$(CONFIG_PPC64)	+= $(call cc-option,-mno-pointers-to-nested-functions)
 
-CFLAGS-$(CONFIG_PPC32)	:= -ffixed-r2 $(MULTIPLEWORD)
+# Clang unconditionally reserves r2 on ppc32 and does not support the flag
+# https://bugs.llvm.org/show_bug.cgi?id=39555
+CFLAGS-$(CONFIG_PPC32)	:= $(call cc-option, -ffixed-r2)
+
+# Clang doesn't support -mmultiple / -mno-multiple
+# https://bugs.llvm.org/show_bug.cgi?id=39556
+CFLAGS-$(CONFIG_PPC32)	+= $(call cc-option, $(MULTIPLEWORD))
+
 CFLAGS-$(CONFIG_PPC32)	+= $(call cc-option,-mno-readonly-in-sdata)
 
 ifdef CONFIG_PPC_BOOK3S_64
-- 
2.19.1
Re: [PATCH kernel 3/3] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
On 12/11/2018 15:23, David Gibson wrote: > On Mon, Nov 12, 2018 at 01:36:45PM +1100, Alexey Kardashevskiy wrote: >> >> >> On 12/11/2018 12:08, David Gibson wrote: >>> On Fri, Oct 19, 2018 at 11:53:53AM +1100, Alexey Kardashevskiy wrote: On 19/10/2018 05:05, Alex Williamson wrote: > On Thu, 18 Oct 2018 10:37:46 -0700 > Piotr Jaroszynski wrote: > >> On 10/18/18 9:55 AM, Alex Williamson wrote: >>> On Thu, 18 Oct 2018 11:31:33 +1100 >>> Alexey Kardashevskiy wrote: >>> On 18/10/2018 08:52, Alex Williamson wrote: > On Wed, 17 Oct 2018 12:19:20 +1100 > Alexey Kardashevskiy wrote: > >> On 17/10/2018 06:08, Alex Williamson wrote: >>> On Mon, 15 Oct 2018 20:42:33 +1100 >>> Alexey Kardashevskiy wrote: + + if (pdev->vendor == PCI_VENDOR_ID_IBM && + pdev->device == 0x04ea) { + ret = vfio_pci_ibm_npu2_init(vdev); + if (ret) { + dev_warn(&vdev->pdev->dev, + "Failed to setup NVIDIA NV2 ATSD region\n"); + goto disable_exit; } >>> >>> So the NPU is also actually owned by vfio-pci and assigned to the >>> VM? >> >> Yes. On a running system it looks like: >> >> 0007:00:00.0 Bridge: IBM Device 04ea (rev 01) >> 0007:00:00.1 Bridge: IBM Device 04ea (rev 01) >> 0007:00:01.0 Bridge: IBM Device 04ea (rev 01) >> 0007:00:01.1 Bridge: IBM Device 04ea (rev 01) >> 0007:00:02.0 Bridge: IBM Device 04ea (rev 01) >> 0007:00:02.1 Bridge: IBM Device 04ea (rev 01) >> 0035:00:00.0 PCI bridge: IBM Device 04c1 >> 0035:01:00.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) >> 0035:02:04.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) >> 0035:02:05.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) >> 0035:02:0d.0 PCI bridge: PLX Technology, Inc. 
Device 8725 (rev ca) >> 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 >> SXM2] >> (rev a1 >> 0035:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 >> SXM2] >> (rev a1) >> 0035:05:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 >> SXM2] >> (rev a1) >> >> One "IBM Device" bridge represents one NVLink2, i.e. a piece of NPU. >> They all and 3 GPUs go to the same IOMMU group and get passed >> through to >> a guest. >> >> The entire NPU does not have representation via sysfs as a whole >> though. > > So the NPU is a bridge, but it uses a normal header type so vfio-pci > will bind to it? An NPU is a NVLink bridge, it is not PCI in any sense. We (the host powerpc firmware known as "skiboot" or "opal") have chosen to emulate a virtual bridge per 1 NVLink on the firmware level. So for each physical NPU there are 6 virtual bridges. So the NVIDIA driver does not need to know much about NPUs. > And the ATSD register that we need on it is not > accessible through these PCI representations of the sub-pieces of the > NPU? Thanks, No, only via the device tree. The skiboot puts the ATSD register address to the PHB's DT property called 'ibm,mmio-atsd' of these virtual bridges. >>> >>> Ok, so the NPU is essential a virtual device already, mostly just a >>> stub. But it seems that each NPU is associated to a specific GPU, how >>> is that association done? In the use case here it seems like it's just >>> a vehicle to provide this ibm,mmio-atsd property to guest DT and the tgt >>> routing information to the GPU. So if both of those were attached to >>> the GPU, there'd be no purpose in assigning the NPU other than it's in >>> the same IOMMU group with a type 0 header, so something needs to be >>> done with it. If it's a virtual device, perhaps it could have a type 1 >>> header so vfio wouldn't care about it, then we would only assign the >>> GPU with these extra properties, which seems easier for management >>> tools and users. 
If the guest driver needs a visible NPU device, QEMU >>> could possibly emulate one to make the GPU association work >>> automatically. Maybe this isn't really a problem, but I wonder if >>> you've looked up the management stack to see what tools need to know to >>> assign these NPU devices and whether specific configurations are >>> required to make the NPU to GPU association work. Thanks, >> >> I'm not that familiar with how this was originally set up,
Re: [PATCH kernel 3/3] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
On Mon, Nov 12, 2018 at 01:36:45PM +1100, Alexey Kardashevskiy wrote: > > > On 12/11/2018 12:08, David Gibson wrote: > > On Fri, Oct 19, 2018 at 11:53:53AM +1100, Alexey Kardashevskiy wrote: > >> > >> > >> On 19/10/2018 05:05, Alex Williamson wrote: > >>> On Thu, 18 Oct 2018 10:37:46 -0700 > >>> Piotr Jaroszynski wrote: > >>> > On 10/18/18 9:55 AM, Alex Williamson wrote: > > On Thu, 18 Oct 2018 11:31:33 +1100 > > Alexey Kardashevskiy wrote: > > > >> On 18/10/2018 08:52, Alex Williamson wrote: > >>> On Wed, 17 Oct 2018 12:19:20 +1100 > >>> Alexey Kardashevskiy wrote: > >>> > On 17/10/2018 06:08, Alex Williamson wrote: > > On Mon, 15 Oct 2018 20:42:33 +1100 > > Alexey Kardashevskiy wrote: > >> + > >> + if (pdev->vendor == PCI_VENDOR_ID_IBM && > >> + pdev->device == 0x04ea) { > >> + ret = vfio_pci_ibm_npu2_init(vdev); > >> + if (ret) { > >> + dev_warn(&vdev->pdev->dev, > >> + "Failed to setup NVIDIA NV2 > >> ATSD region\n"); > >> + goto disable_exit; > >>} > > > > So the NPU is also actually owned by vfio-pci and assigned to the > > VM? > > Yes. On a running system it looks like: > > 0007:00:00.0 Bridge: IBM Device 04ea (rev 01) > 0007:00:00.1 Bridge: IBM Device 04ea (rev 01) > 0007:00:01.0 Bridge: IBM Device 04ea (rev 01) > 0007:00:01.1 Bridge: IBM Device 04ea (rev 01) > 0007:00:02.0 Bridge: IBM Device 04ea (rev 01) > 0007:00:02.1 Bridge: IBM Device 04ea (rev 01) > 0035:00:00.0 PCI bridge: IBM Device 04c1 > 0035:01:00.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > 0035:02:04.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > 0035:02:05.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > 0035:02:0d.0 PCI bridge: PLX Technology, Inc. 
Device 8725 (rev ca) > 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > SXM2] > (rev a1 > 0035:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > SXM2] > (rev a1) > 0035:05:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > SXM2] > (rev a1) > > One "IBM Device" bridge represents one NVLink2, i.e. a piece of NPU. > They all and 3 GPUs go to the same IOMMU group and get passed > through to > a guest. > > The entire NPU does not have representation via sysfs as a whole > though. > >>> > >>> So the NPU is a bridge, but it uses a normal header type so vfio-pci > >>> will bind to it? > >> > >> An NPU is a NVLink bridge, it is not PCI in any sense. We (the host > >> powerpc firmware known as "skiboot" or "opal") have chosen to emulate a > >> virtual bridge per 1 NVLink on the firmware level. So for each physical > >> NPU there are 6 virtual bridges. So the NVIDIA driver does not need to > >> know much about NPUs. > >> > >>> And the ATSD register that we need on it is not > >>> accessible through these PCI representations of the sub-pieces of the > >>> NPU? Thanks, > >> > >> No, only via the device tree. The skiboot puts the ATSD register > >> address > >> to the PHB's DT property called 'ibm,mmio-atsd' of these virtual > >> bridges. > > > > Ok, so the NPU is essential a virtual device already, mostly just a > > stub. But it seems that each NPU is associated to a specific GPU, how > > is that association done? In the use case here it seems like it's just > > a vehicle to provide this ibm,mmio-atsd property to guest DT and the tgt > > routing information to the GPU. So if both of those were attached to > > the GPU, there'd be no purpose in assigning the NPU other than it's in > > the same IOMMU group with a type 0 header, so something needs to be > > done with it. 
If it's a virtual device, perhaps it could have a type 1 > > header so vfio wouldn't care about it, then we would only assign the > > GPU with these extra properties, which seems easier for management > > tools and users. If the guest driver needs a visible NPU device, QEMU > > could possibly emulate one to make the GPU association work > > automatically. Maybe this isn't really a problem, but I wonder if > > you've looked up the management stack to see what tools need to know to > > assign these NPU devices and whether specific configurations are > > required to make the NPU to GPU association work. Thanks, > > I'm not that familiar with how this was originally set up, but note that > Alexey is just making it
Re: [PATCH v2 2/2] kbuild: consolidate Clang compiler flags
On Mon, 12 Nov 2018 at 13:59, Masahiro Yamada wrote: > > On Mon, Nov 12, 2018 at 10:05 AM Michael Ellerman wrote: > > > > Masahiro Yamada writes: > > > On Sat, Nov 10, 2018 at 3:35 AM Greg Hackmann > > > wrote: > > >> > > >> On 11/09/2018 10:29 AM, Nick Desaulniers wrote: > > >> > On Mon, Nov 5, 2018 at 7:05 PM Masahiro Yamada > > >> > wrote: > > >> >> > > >> >> Collect basic Clang options such as --target, --prefix, > > >> >> --gcc-toolchain, > > >> >> -no-integrated-as into a single variable CLANG_FLAGS so that it can be > > >> >> easily reused in other parts of Makefile. > > >> >> > > >> >> Signed-off-by: Masahiro Yamada > > >> >> --- > > >> >> > > >> >> Changes in v2: > > >> >> - Use := flavor instead of = because $(CLANG_FLAGS) is expanded > > >> >> soon anyway > > >> >> > > >> >> Makefile | 13 ++--- > > >> >> 1 file changed, 6 insertions(+), 7 deletions(-) > > >> >> > > >> >> diff --git a/Makefile b/Makefile > > >> >> index da11700..e173a73 100644 > > >> >> --- a/Makefile > > >> >> +++ b/Makefile > > >> >> @@ -487,18 +487,17 @@ endif > > >> >> > > >> >> ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) > > >> >> ifneq ($(CROSS_COMPILE),) > > >> >> -CLANG_TARGET := --target=$(notdir $(CROSS_COMPILE:%-=%)) > > >> >> +CLANG_FLAGS:= --target=$(notdir $(CROSS_COMPILE:%-=%)) > > >> >> GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD))) > > >> >> -CLANG_PREFIX := --prefix=$(GCC_TOOLCHAIN_DIR) > > >> >> +CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR) > > >> >> GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..) 
> > >> >> endif > > >> >> ifneq ($(GCC_TOOLCHAIN),) > > >> >> -CLANG_GCC_TC := --gcc-toolchain=$(GCC_TOOLCHAIN) > > >> >> +CLANG_FLAGS+= --gcc-toolchain=$(GCC_TOOLCHAIN) > > >> >> endif > > >> >> -KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) > > >> >> -KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) > > >> >> -KBUILD_CFLAGS += -no-integrated-as > > >> >> -KBUILD_AFLAGS += -no-integrated-as > > >> >> +CLANG_FLAGS+= -no-integrated-as > > >> >> +KBUILD_CFLAGS += $(CLANG_FLAGS) > > >> >> +KBUILD_AFLAGS += $(CLANG_FLAGS) > > >> >> endif > > >> >> > > >> >> RETPOLINE_CFLAGS_GCC := -mindirect-branch=thunk-extern > > >> >> -mindirect-branch-register > > >> >> -- > > >> >> 2.7.4 > > >> >> > > >> > > > >> > Thanks for this patch, Masahiro, it's a good simplification. > > >> > Reviewed-by: Nick Desaulniers > > >> > Tested-by: Nick Desaulniers > > >> > > > >> > Would you mind waiting for a tested-by from Stefan, and maybe an ack > > >> > from Greg (added to cc)? > > >> > > > >> > > >> Acked-by: Greg Hackmann > > > > > > > > > Thanks for your review! > > > > > > > > > So, how to organize this series, and Joel's one together? > > > > > > I'd like Joel to use this series as a base for his work. > > > (https://lore.kernel.org/patchwork/patch/1006696/) > > > > > > It will be much cleaner. > > > > > > > > > Shall I merge all the patches to kbuild tree, or > > > maybe will they go through powerpc tree? > > > > Joel's changes are fairly small so you may as well merge them along with > > the rest of the series, if that's OK with you and Joel. > > > OK, I will. > > > Joel, > If you send v2, I will merge it to kbuild tree. Thanks, I've done that now. Cheers, Joel
Re: [PATCH 02/17] x86: Add support for ZSTD-compressed kernel
* Adam Borowski wrote:

> From: Nick Terrell
>
> Integrates the ZSTD decompression code to the x86 pre-boot code.
>
> Zstandard requires slightly more memory during the kernel decompression
> on x86 (192 KB vs 64 KB), and the memory usage is independent of the
> window size.
>
> Zstandard requires memory proportional to the window size used during
> compression for decompressing the ramdisk image, since streaming mode is
> used. Newer versions of zstd (1.3.2+) list the window size of a file
> with `zstd -lv <file>'. The absolute maximum amount of memory required
> is just over 8 MB.
>
> Signed-off-by: Nick Terrell
> ---
>  Documentation/x86/boot.txt        | 6 +++---
>  arch/x86/Kconfig                  | 1 +
>  arch/x86/boot/compressed/Makefile | 5 ++++-
>  arch/x86/boot/compressed/misc.c   | 4 ++++
>  arch/x86/boot/header.S            | 8 +++++++-
>  arch/x86/include/asm/boot.h       | 6 ++++--
>  6 files changed, 23 insertions(+), 7 deletions(-)

Acked-by: Ingo Molnar

> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
> index 4c881c850125..af2efb256527 100644
> --- a/arch/x86/boot/header.S
> +++ b/arch/x86/boot/header.S
> @@ -526,8 +526,14 @@ pref_address:.quad LOAD_PHYSICAL_ADDR
> # preferred load addr
>  # the size-dependent part now grows so fast.
>  #
>  # extra_bytes = (uncompressed_size >> 8) + 65536
> +#
> +# ZSTD compressed data grows by at most 3 bytes per 128K, and only has a 22
> +# byte fixed overhead but has a maximum block size of 128K, so it needs a
> +# larger margin.
> +#
> +# extra_bytes = (uncompressed_size >> 8) + 131072
>
> -#define ZO_z_extra_bytes ((ZO_z_output_len >> 8) + 65536)
> +#define ZO_z_extra_bytes ((ZO_z_output_len >> 8) + 131072)

This change would also affect other decompressors, not just ZSTD,
correct? Might want to split this change out into a separate
preparatory patch to allow it to be bisected to, or at least mention it
in the changelog more explicitly?

Thanks,

	Ingo
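
To make the size of the change in the quoted hunk concrete, here is a small sketch in C that evaluates the extra_bytes formula with the old and new constants. The function name z_extra_bytes() and the macros MARGIN_BEFORE and MARGIN_AFTER are made up for this example (the patch itself uses the ZO_z_extra_bytes preprocessor define), and the 32 MiB image size mentioned below is an arbitrary assumption.

```c
#include <stdint.h>

/* Fixed margins from the quoted hunk: before and after the patch. */
#define MARGIN_BEFORE 65536u
#define MARGIN_AFTER  131072u

/*
 * extra_bytes = (uncompressed_size >> 8) + margin
 *
 * The size-dependent term is unchanged by the patch; only the fixed
 * margin grows, so the result grows by 64 KiB for every image size.
 */
uint64_t z_extra_bytes(uint64_t output_len, uint64_t margin)
{
	return (output_len >> 8) + margin;
}
```

For a hypothetical 32 MiB uncompressed image (33554432 bytes), the size-dependent term is 131072, so the total goes from 196608 to 262144 bytes. Because the define is not conditional on the compressor, the extra 65536 bytes would apply whichever decompressor is selected, which is the point raised in the review comment.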
[PATCH v2 2/2] powerpc/boot: Set target when cross-compiling for clang
Clang needs to be told which target it is building for when cross
compiling.

Link: https://github.com/ClangBuiltLinux/linux/issues/259
Signed-off-by: Joel Stanley
---
 arch/powerpc/boot/Makefile | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 39354365f54a..111f97b1ccec 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -55,6 +55,11 @@ BOOTAFLAGS	:= -D__ASSEMBLY__ $(BOOTCFLAGS) -traditional -nostdinc
 
 BOOTARFLAGS	:= -cr$(KBUILD_ARFLAGS)
 
+ifdef CONFIG_CC_IS_CLANG
+BOOTCFLAGS	+= $(CLANG_FLAGS)
+BOOTAFLAGS	+= $(CLANG_FLAGS)
+endif
+
 ifdef CONFIG_DEBUG_INFO
 BOOTCFLAGS	+= -g
 endif
-- 
2.19.1
[PATCH v2 1/2] Makefile: Export clang toolchain variables
The powerpc makefile will use these in its boot wrapper.

Signed-off-by: Joel Stanley
---
 Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Makefile b/Makefile
index 09278330282d..840efe6eb54c 100644
--- a/Makefile
+++ b/Makefile
@@ -495,6 +495,7 @@ endif
 ifneq ($(GCC_TOOLCHAIN),)
 CLANG_FLAGS	+= --gcc-toolchain=$(GCC_TOOLCHAIN)
 endif
+export CLANG_FLAGS
 CLANG_FLAGS	+= -no-integrated-as
 KBUILD_CFLAGS	+= $(CLANG_FLAGS)
 KBUILD_AFLAGS	+= $(CLANG_FLAGS)
-- 
2.19.1
[PATCH v2 0/2] powerpc/boot: Fix cross compiling with clang
v2: rebase on "kbuild: consolidate Clang compiler flags"

These patches allow clang to cross-compile the powerpc boot wrapper.
The boot wrapper constructs its own compiler flags as it may not be
built for the same arch as the kernel.

The powerpc64le kernel builds natively with clang and with this patch it
can cross compile too.

Joel Stanley (2):
  Makefile: Export clang toolchain variables
  powerpc/boot: Set target when cross-compiling for clang

 Makefile                   | 1 +
 arch/powerpc/boot/Makefile | 5 +++++
 2 files changed, 6 insertions(+)

-- 
2.19.1
Re: [RFC] mm: Replace all open encodings for NUMA_NO_NODE
On 11/12/2018 09:40 AM, Anshuman Khandual wrote:
>
>
> On 11/12/2018 09:27 AM, Joseph Qi wrote:
>> For ocfs2 part, node means host in the cluster, not NUMA node.
>>
>
> Does not -1 indicate an invalid node which can never be present ?
>

My bad, got it wrong. Seems like this has nothing to do with NUMA nodes
at all. Will drop the changes from ocfs2.
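
A small sketch in C of the distinction being made in this exchange. The kernel defines NUMA_NO_NODE as (-1); the local define below only mirrors that so the example builds in user space. Everything else (struct mock_dev_node, node_or_fallback(), CLUSTER_NODE_INVALID, cluster_node_valid()) is a hypothetical name invented for illustration and is not taken from the kernel or from ocfs2.

```c
#include <stdbool.h>

/* Mirrors the kernel's definition so this example is self-contained. */
#define NUMA_NO_NODE (-1)

/*
 * Hypothetical sentinel for "no such cluster host". It happens to have
 * the same value, but it describes a different concept, so it should
 * not be spelled NUMA_NO_NODE.
 */
#define CLUSTER_NODE_INVALID (-1)

/* Hypothetical device record carrying a NUMA node id. */
struct mock_dev_node {
	int numa_node;
};

/* Open-coded "== -1" replaced by the named macro. */
int node_or_fallback(const struct mock_dev_node *dev, int fallback)
{
	return dev->numa_node == NUMA_NO_NODE ? fallback : dev->numa_node;
}

/* A cluster "node" is a host, so it keeps its own sentinel. */
bool cluster_node_valid(int cluster_node)
{
	return cluster_node != CLUSTER_NODE_INVALID;
}
```

The two comparisons compile to the same thing, which is why a plain grep for "node == -1" cannot tell them apart and the ocfs2 hits had to be dropped by hand.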
Re: [RFC] mm: Replace all open encodings for NUMA_NO_NODE
On 11/12/2018 09:27 AM, Joseph Qi wrote:
> For ocfs2 part, node means host in the cluster, not NUMA node.
>

Does not -1 indicate an invalid node which can never be present ?
Re: [RFC] mm: Replace all open encodings for NUMA_NO_NODE
For ocfs2 part, node means host in the cluster, not NUMA node. Thanks, Joseph On 18/11/12 10:41, Anshuman Khandual wrote: > At present there are multiple places where invalid node number is encoded > as -1. Even though implicitly understood it is always better to have macros > in there. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places redirecting them to a common definition. > > Signed-off-by: Anshuman Khandual > --- > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc64le etc with their default config which might not have > compiled tested all driver related changes. I will appreciate folks giving > this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. 
git grep "node = -1" > > arch/alpha/include/asm/topology.h | 2 +- > arch/ia64/kernel/numa.c | 2 +- > arch/ia64/mm/discontig.c | 6 +++--- > arch/ia64/sn/kernel/io_common.c | 2 +- > arch/powerpc/include/asm/pci-bridge.h | 2 +- > arch/powerpc/kernel/paca.c| 2 +- > arch/powerpc/kernel/pci-common.c | 2 +- > arch/powerpc/mm/numa.c| 14 +++--- > arch/powerpc/platforms/powernv/memtrace.c | 4 ++-- > arch/sparc/kernel/auxio_32.c | 2 +- > arch/sparc/kernel/pci_fire.c | 2 +- > arch/sparc/kernel/pci_schizo.c| 2 +- > arch/sparc/kernel/pcic.c | 6 +++--- > arch/sparc/kernel/psycho_common.c | 2 +- > arch/sparc/kernel/sbus.c | 2 +- > arch/sparc/mm/init_64.c | 6 +++--- > arch/sparc/prom/init_32.c | 2 +- > arch/sparc/prom/init_64.c | 4 ++-- > arch/sparc/prom/tree_32.c | 12 ++-- > arch/sparc/prom/tree_64.c | 18 +- > arch/x86/include/asm/pci.h| 2 +- > arch/x86/kernel/apic/x2apic_uv_x.c| 6 +++--- > arch/x86/kernel/smpboot.c | 2 +- > arch/x86/platform/olpc/olpc_dt.c | 16 > drivers/block/mtip32xx/mtip32xx.c | 4 ++-- > drivers/dma/dmaengine.c | 3 ++- > drivers/infiniband/hw/hfi1/affinity.c | 2 +- > drivers/infiniband/hw/hfi1/init.c | 2 +- > drivers/iommu/dmar.c | 4 ++-- > drivers/iommu/intel-iommu.c | 2 +- > drivers/media/pci/ivtv/ivtvfb.c | 2 +- > drivers/media/platform/vivid/vivid-osd.c | 2 +- > drivers/misc/sgi-xp/xpc_uv.c | 2 +- > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- > drivers/video/fbdev/mmp/fb/mmpfb.c| 2 +- > drivers/video/fbdev/pxa168fb.c| 2 +- > drivers/video/fbdev/w100fb.c | 2 +- > fs/ocfs2/dlm/dlmcommon.h | 2 +- > fs/ocfs2/dlm/dlmdomain.c | 10 +- > fs/ocfs2/dlm/dlmmaster.c | 2 +- > fs/ocfs2/dlm/dlmrecovery.c| 2 +- > fs/ocfs2/stack_user.c | 6 +++--- > init/init_task.c | 2 +- > kernel/kthread.c | 2 +- > kernel/sched/fair.c | 15 --- > lib/cpumask.c | 2 +- > mm/huge_memory.c | 12 ++-- > mm/hugetlb.c | 2 +- > mm/ksm.c | 2 +- > mm/memory.c | 6 +++--- > mm/memory_hotplug.c | 12 ++-- > mm/mempolicy.c| 2 +- > mm/page_alloc.c | 4 ++-- > mm/page_ext.c | 2 +- > 
net/core/pktgen.c | 2 +- > net/qrtr/qrtr.c | 2 +- > tools/perf/bench/numa.c | 6 +++--- > 57 files changed, 125 insertions(+), 123 deletions(-) > > diff --git a/arch/alpha/include/asm/topology.h > b/arch/alpha/include/asm/topology.h > index e6e13a8..f6dc89c 100644 > --- a/arch/alpha/include/asm/topology.h > +++ b/arch/alpha/include/asm/topology.h > @@ -29,7 +29,7 @@ static const struct cpumask *cpumask_of_node(int node) > { > int cpu; > > - if (node ==
Re: [PATCH v2 2/2] kbuild: consolidate Clang compiler flags
On Mon, Nov 12, 2018 at 10:05 AM Michael Ellerman wrote:
>
> Masahiro Yamada writes:
> > On Sat, Nov 10, 2018 at 3:35 AM Greg Hackmann wrote:
> >>
> >> On 11/09/2018 10:29 AM, Nick Desaulniers wrote:
> >> > On Mon, Nov 5, 2018 at 7:05 PM Masahiro Yamada
> >> > wrote:
> >> >>
> >> >> Collect basic Clang options such as --target, --prefix, --gcc-toolchain,
> >> >> -no-integrated-as into a single variable CLANG_FLAGS so that it can be
> >> >> easily reused in other parts of Makefile.
> >> >>
> >> >> Signed-off-by: Masahiro Yamada
> >> >> ---
> >> >>
> >> >> Changes in v2:
> >> >>   - Use := flavor instead of = because $(CLANG_FLAGS) is expanded soon
> >> >> anyway
> >> >>
> >> >>  Makefile | 13 ++++++-------
> >> >>  1 file changed, 6 insertions(+), 7 deletions(-)
> >> >>
> >> >> diff --git a/Makefile b/Makefile
> >> >> index da11700..e173a73 100644
> >> >> --- a/Makefile
> >> >> +++ b/Makefile
> >> >> @@ -487,18 +487,17 @@ endif
> >> >>
> >> >>  ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
> >> >>  ifneq ($(CROSS_COMPILE),)
> >> >> -CLANG_TARGET := --target=$(notdir $(CROSS_COMPILE:%-=%))
> >> >> +CLANG_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
> >> >>  GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD)))
> >> >> -CLANG_PREFIX := --prefix=$(GCC_TOOLCHAIN_DIR)
> >> >> +CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
> >> >>  GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
> >> >>  endif
> >> >>  ifneq ($(GCC_TOOLCHAIN),)
> >> >> -CLANG_GCC_TC := --gcc-toolchain=$(GCC_TOOLCHAIN)
> >> >> +CLANG_FLAGS += --gcc-toolchain=$(GCC_TOOLCHAIN)
> >> >>  endif
> >> >> -KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX)
> >> >> -KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX)
> >> >> -KBUILD_CFLAGS += -no-integrated-as
> >> >> -KBUILD_AFLAGS += -no-integrated-as
> >> >> +CLANG_FLAGS += -no-integrated-as
> >> >> +KBUILD_CFLAGS += $(CLANG_FLAGS)
> >> >> +KBUILD_AFLAGS += $(CLANG_FLAGS)
> >> >>  endif
> >> >>
> >> >>  RETPOLINE_CFLAGS_GCC := -mindirect-branch=thunk-extern
> >> >> -mindirect-branch-register
> >> >> --
> >> >> 2.7.4
> >> >>
> >> >
> >> > Thanks for this patch, Masahiro, it's a good simplification.
> >> > Reviewed-by: Nick Desaulniers
> >> > Tested-by: Nick Desaulniers
> >> >
> >> > Would you mind waiting for a tested-by from Stefan, and maybe an ack
> >> > from Greg (added to cc)?
> >> >
> >>
> >> Acked-by: Greg Hackmann
> >
> >
> > Thanks for your review!
> >
> >
> > So, how to organize this series, and Joel's one together?
> >
> > I'd like Joel to use this series as a base for his work.
> > (https://lore.kernel.org/patchwork/patch/1006696/)
> >
> > It will be much cleaner.
> >
> >
> > Shall I merge all the patches to kbuild tree, or
> > maybe will they go through powerpc tree?
>
> Joel's changes are fairly small so you may as well merge them along with
> the rest of the series, if that's OK with you and Joel.

OK, I will.

Joel,
If you send v2, I will merge it to kbuild tree.

Thanks.

-- 
Best Regards
Masahiro Yamada
Re: [PATCH 2/2] x86, powerpc: remove -funit-at-a-time compiler option entirely
* Masahiro Yamada wrote: > GCC 4.6 manual says: > > -funit-at-a-time > This option is left for compatibility reasons. -funit-at-a-time has > no effect, while -fno-unit-at-a-time implies -fno-toplevel-reorder > and -fno-section-anchors. > Enabled by default. > > Signed-off-by: Masahiro Yamada > --- > > arch/powerpc/Makefile | 4 > arch/x86/Makefile | 4 > arch/x86/Makefile.um | 5 - > 3 files changed, 13 deletions(-) > > diff --git a/arch/x86/Makefile b/arch/x86/Makefile > index 88398fd..3508049 100644 > --- a/arch/x86/Makefile > +++ b/arch/x86/Makefile > @@ -130,10 +130,6 @@ else > > KBUILD_CFLAGS += -mno-red-zone > KBUILD_CFLAGS += -mcmodel=kernel > - > -# -funit-at-a-time shrinks the kernel .text considerably > -# unfortunately it makes reading oopses harder. > -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time) > endif > > ifdef CONFIG_X86_X32 > diff --git a/arch/x86/Makefile.um b/arch/x86/Makefile.um > index 577976b..1db7913 100644 > --- a/arch/x86/Makefile.um > +++ b/arch/x86/Makefile.um > @@ -26,9 +26,6 @@ cflags-y += $(call cc-option,-mpreferred-stack-boundary=2) > # an unresolved reference. > cflags-y += -ffreestanding > > -# gcc 4.3.0 needs -funit-at-a-time for extern inline functions. > -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time) > - > KBUILD_CFLAGS += $(cflags-y) > > else > @@ -50,6 +47,4 @@ ELF_FORMAT := elf64-x86-64 > LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib64 > LINK-y += -m64 > > -# Do unit-at-a-time unconditionally on x86_64, following the host > -KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time) > endif Acked-by: Ingo Molnar Thanks, Ingo
[RFC] mm: Replace all open encodings for NUMA_NO_NODE
At present there are multiple places where an invalid node number is
encoded as -1. Even though this is implicitly understood, it is always
better to use a macro. Replace these open encodings of an invalid node
number with the global macro NUMA_NO_NODE. This helps remove NUMA-related
assumptions like 'invalid node' from various places, redirecting them to
a common definition.

Signed-off-by: Anshuman Khandual
---
Build tested this with multiple cross compiler options like alpha, sparc,
arm64, x86, powerpc64le etc. with their default configs, which might not
have compile-tested all driver-related changes. I would appreciate folks
giving this a test in their respective build environments.

All these places for replacement were found by running the following
grep patterns on the entire kernel code. Please let me know if this might
have missed some instances. This might also have replaced some false
positives. I would appreciate suggestions, input and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

 arch/alpha/include/asm/topology.h | 2 +-
 arch/ia64/kernel/numa.c | 2 +-
 arch/ia64/mm/discontig.c | 6 +++---
 arch/ia64/sn/kernel/io_common.c | 2 +-
 arch/powerpc/include/asm/pci-bridge.h | 2 +-
 arch/powerpc/kernel/paca.c | 2 +-
 arch/powerpc/kernel/pci-common.c | 2 +-
 arch/powerpc/mm/numa.c | 14 +++---
 arch/powerpc/platforms/powernv/memtrace.c | 4 ++--
 arch/sparc/kernel/auxio_32.c | 2 +-
 arch/sparc/kernel/pci_fire.c | 2 +-
 arch/sparc/kernel/pci_schizo.c | 2 +-
 arch/sparc/kernel/pcic.c | 6 +++---
 arch/sparc/kernel/psycho_common.c | 2 +-
 arch/sparc/kernel/sbus.c | 2 +-
 arch/sparc/mm/init_64.c | 6 +++---
 arch/sparc/prom/init_32.c | 2 +-
 arch/sparc/prom/init_64.c | 4 ++--
 arch/sparc/prom/tree_32.c | 12 ++--
 arch/sparc/prom/tree_64.c | 18 +-
 arch/x86/include/asm/pci.h | 2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c | 6 +++---
 arch/x86/kernel/smpboot.c | 2 +-
 arch/x86/platform/olpc/olpc_dt.c | 16
 drivers/block/mtip32xx/mtip32xx.c | 4 ++--
 drivers/dma/dmaengine.c | 3 ++-
 drivers/infiniband/hw/hfi1/affinity.c | 2 +-
 drivers/infiniband/hw/hfi1/init.c | 2 +-
 drivers/iommu/dmar.c | 4 ++--
 drivers/iommu/intel-iommu.c | 2 +-
 drivers/media/pci/ivtv/ivtvfb.c | 2 +-
 drivers/media/platform/vivid/vivid-osd.c | 2 +-
 drivers/misc/sgi-xp/xpc_uv.c | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++--
 drivers/video/fbdev/mmp/fb/mmpfb.c | 2 +-
 drivers/video/fbdev/pxa168fb.c | 2 +-
 drivers/video/fbdev/w100fb.c | 2 +-
 fs/ocfs2/dlm/dlmcommon.h | 2 +-
 fs/ocfs2/dlm/dlmdomain.c | 10 +-
 fs/ocfs2/dlm/dlmmaster.c | 2 +-
 fs/ocfs2/dlm/dlmrecovery.c | 2 +-
 fs/ocfs2/stack_user.c | 6 +++---
 init/init_task.c | 2 +-
 kernel/kthread.c | 2 +-
 kernel/sched/fair.c | 15 ---
 lib/cpumask.c | 2 +-
 mm/huge_memory.c | 12 ++--
 mm/hugetlb.c | 2 +-
 mm/ksm.c | 2 +-
 mm/memory.c | 6 +++---
 mm/memory_hotplug.c | 12 ++--
 mm/mempolicy.c | 2 +-
 mm/page_alloc.c | 4 ++--
 mm/page_ext.c | 2 +-
 net/core/pktgen.c | 2 +-
 net/qrtr/qrtr.c | 2 +-
 tools/perf/bench/numa.c | 6 +++---
 57 files changed, 125 insertions(+), 123 deletions(-)

diff --git a/arch/alpha/include/asm/topology.h b/arch/alpha/include/asm/topology.h
index e6e13a8..f6dc89c 100644
--- a/arch/alpha/include/asm/topology.h
+++ b/arch/alpha/include/asm/topology.h
@@ -29,7 +29,7 @@ static const struct cpumask *cpumask_of_node(int node)
 {
 	int cpu;
 
-	if (node == -1)
+	if (node == NUMA_NO_NODE)
 		return cpu_all_mask;
 
 	cpumask_clear(&node_to_cpumask_map[node]);
diff --git a/arch/ia64/kernel/numa.c b/arch/ia64/kernel/numa.c
index 92c3762..1315da6 100644
--- a/arch/ia64/kernel/numa.c
+++ b/arch/ia64/kernel/numa.c
@@ -74,7 +74,7 @@ void
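As a rough userspace illustration of the replacement pattern in the RFC above (not code from the patch), the sketch below compares against a named macro instead of a literal -1. The value of NUMA_NO_NODE is assumed to be -1, which is what the patch implies by substituting one for the other; resolve_node() is a hypothetical helper invented for this example.

```c
/* Assumed value: the patch swaps literal -1 for this macro, so it is taken to be -1. */
#define NUMA_NO_NODE (-1)

/*
 * Hypothetical helper, not a kernel function: return the requested node id,
 * or the fallback when the caller passed "no node".
 */
int resolve_node(int requested_nid, int fallback_nid)
{
	/* Named sentinel instead of an open-coded "== -1". */
	if (requested_nid == NUMA_NO_NODE)
		return fallback_nid;
	return requested_nid;
}
```

The behaviour is identical to the open-coded comparison; the only gain is that the meaning of the sentinel is spelled out at each call site and defined in one place.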
[PATCH v2] selftests/powerpc: Fix wild_bctr test to work on ppc64
The selftest I recently added to test branching to an out-of-bounds NIP
doesn't work on 64-bit big endian. It does fail, but not in the right
way. That is, it SEGVs trying to load from the opd at BAD_NIP, but it
never gets as far as branching to BAD_NIP.

To fix it we need to create an opd which is reachable but which holds
the bad address.

Fixes: b7683fc66eba ("selftests/powerpc: Add a test of wild bctr")
Signed-off-by: Michael Ellerman
---
 tools/testing/selftests/powerpc/mm/wild_bctr.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

v2: Use _CALL_AIXDESC as suggested by Segher.

diff --git a/tools/testing/selftests/powerpc/mm/wild_bctr.c b/tools/testing/selftests/powerpc/mm/wild_bctr.c
index 1b0e9e9a2ddc..90469a9e49d4 100644
--- a/tools/testing/selftests/powerpc/mm/wild_bctr.c
+++ b/tools/testing/selftests/powerpc/mm/wild_bctr.c
@@ -105,6 +105,20 @@ static void dump_regs(void)
 	}
 }
 
+#ifdef _CALL_AIXDESC
+struct opd {
+	unsigned long ip;
+	unsigned long toc;
+	unsigned long env;
+};
+static struct opd bad_opd = {
+	.ip = BAD_NIP,
+};
+#define BAD_FUNC (&bad_opd)
+#else
+#define BAD_FUNC BAD_NIP
+#endif
+
 int test_wild_bctr(void)
 {
 	int (*func_ptr)(void);
@@ -133,7 +147,7 @@ int test_wild_bctr(void)
 
 		poison_regs();
 
-		func_ptr = (int (*)(void))BAD_NIP;
+		func_ptr = (int (*)(void))BAD_FUNC;
 		func_ptr();
 
 		FAIL_IF(1); /* we didn't segv? */
--
2.17.2
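The difference between the two cases in the patch above can be modelled in plain C. This is only a sketch of the idea, not the selftest itself: EXAMPLE_BAD_NIP is a hypothetical stand-in for BAD_NIP, and the two helper functions are invented here to show where the branch target comes from under each calling convention.

```c
#include <stdint.h>

/* Hypothetical stand-in for BAD_NIP; the real constant is defined in wild_bctr.c. */
#define EXAMPLE_BAD_NIP UINT64_C(0x788c545a18000000)

/* Same three-field layout as the struct opd added by the patch. */
struct opd_sketch {
	uint64_t ip;	/* address the CPU actually branches to */
	uint64_t toc;
	uint64_t env;
};

/* A descriptor in ordinary, readable memory that holds a bad entry address. */
struct opd_sketch bad_opd_sketch = { .ip = EXAMPLE_BAD_NIP };

/*
 * Descriptor-based ABI (_CALL_AIXDESC): the function pointer is the address
 * of a descriptor, and the branch target is loaded from its ip field.
 */
uint64_t target_with_descriptors(const struct opd_sketch *func_ptr)
{
	return func_ptr->ip;
}

/* ABI without descriptors: the function pointer value is the branch target. */
uint64_t target_without_descriptors(uint64_t func_ptr)
{
	return func_ptr;
}
```

With descriptors, using the bad address itself as the function pointer would make the load of the ip field fault before any branch happens, which matches the failure described in the commit message. Pointing at a readable descriptor whose ip field holds the bad address lets the load succeed, so the fault occurs on the branch instead.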
[PATCH 1/2] um: remove -fno-unit-at-a-time workaround for pre-4.0 GCC
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.

'$(call cc-option,-fno-unit-at-a-time)' is now dead code since
'$(cc-version) -lt 0400' is always false.

Signed-off-by: Masahiro Yamada
---

 arch/x86/Makefile.um | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/Makefile.um b/arch/x86/Makefile.um
index 91085a0..577976b 100644
--- a/arch/x86/Makefile.um
+++ b/arch/x86/Makefile.um
@@ -26,12 +26,8 @@ cflags-y += $(call cc-option,-mpreferred-stack-boundary=2)
 # an unresolved reference.
 cflags-y += -ffreestanding
 
-# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
-# a lot more stack due to the lack of sharing of stacklots. Also, gcc
-# 4.3.0 needs -funit-at-a-time for extern inline functions.
-KBUILD_CFLAGS += $(shell if [ $(cc-version) -lt 0400 ] ; then \
-			echo $(call cc-option,-fno-unit-at-a-time); \
-			else echo $(call cc-option,-funit-at-a-time); fi ;)
+# gcc 4.3.0 needs -funit-at-a-time for extern inline functions.
+KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
 
 KBUILD_CFLAGS += $(cflags-y)
 
--
2.7.4
[PATCH 2/2] x86, powerpc: remove -funit-at-a-time compiler option entirely
GCC 4.6 manual says:

  -funit-at-a-time
      This option is left for compatibility reasons. -funit-at-a-time has
      no effect, while -fno-unit-at-a-time implies -fno-toplevel-reorder
      and -fno-section-anchors.
      Enabled by default.

Signed-off-by: Masahiro Yamada
---

 arch/powerpc/Makefile | 4
 arch/x86/Makefile | 4
 arch/x86/Makefile.um | 5 -
 3 files changed, 13 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 8a2ce14..854199c 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -228,10 +228,6 @@ KBUILD_CFLAGS += $(call cc-option,-mno-vsx)
 KBUILD_CFLAGS += $(call cc-option,-mno-spe)
 KBUILD_CFLAGS += $(call cc-option,-mspe=no)
 
-# Enable unit-at-a-time mode when possible. It shrinks the
-# kernel considerably.
-KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
-
 # FIXME: the module load should be taught about the additional relocs
 # generated by this.
 # revert to pre-gcc-4.4 behaviour of .eh_frame
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 88398fd..3508049 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -130,10 +130,6 @@ else
 
 KBUILD_CFLAGS += -mno-red-zone
 KBUILD_CFLAGS += -mcmodel=kernel
-
-# -funit-at-a-time shrinks the kernel .text considerably
-# unfortunately it makes reading oopses harder.
-KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
 endif
 
 ifdef CONFIG_X86_X32
diff --git a/arch/x86/Makefile.um b/arch/x86/Makefile.um
index 577976b..1db7913 100644
--- a/arch/x86/Makefile.um
+++ b/arch/x86/Makefile.um
@@ -26,9 +26,6 @@ cflags-y += $(call cc-option,-mpreferred-stack-boundary=2)
 # an unresolved reference.
 cflags-y += -ffreestanding
 
-# gcc 4.3.0 needs -funit-at-a-time for extern inline functions.
-KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
-
 KBUILD_CFLAGS += $(cflags-y)
 
 else
@@ -50,6 +47,4 @@ ELF_FORMAT := elf64-x86-64
 LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib64
 LINK-y += -m64
 
-# Do unit-at-a-time unconditionally on x86_64, following the host
-KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
 endif
--
2.7.4
[PATCH 0/2] Remove -fno-unit-at-a-time and -funit-at-a-time compiler flags entirely
1/2: remove dead code, which is logically obvious because the minimum
     GCC version is now 4.6

2/2: according to the GCC 4.6 manual, -funit-at-a-time no longer has
     any effect

I hope this series can be applied through the x86 tree.

Masahiro Yamada (2):
  um: remove -fno-unit-at-a-time workaround for pre-4.0 GCC
  x86, powerpc: remove -funit-at-a-time compiler option entirely

 arch/powerpc/Makefile | 4
 arch/x86/Makefile | 4
 arch/x86/Makefile.um | 9 -
 3 files changed, 17 deletions(-)

--
2.7.4
Re: [PATCH kernel 3/3] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
On 12/11/2018 12:08, David Gibson wrote: > On Fri, Oct 19, 2018 at 11:53:53AM +1100, Alexey Kardashevskiy wrote: >> >> >> On 19/10/2018 05:05, Alex Williamson wrote: >>> On Thu, 18 Oct 2018 10:37:46 -0700 >>> Piotr Jaroszynski wrote: >>> On 10/18/18 9:55 AM, Alex Williamson wrote: > On Thu, 18 Oct 2018 11:31:33 +1100 > Alexey Kardashevskiy wrote: > >> On 18/10/2018 08:52, Alex Williamson wrote: >>> On Wed, 17 Oct 2018 12:19:20 +1100 >>> Alexey Kardashevskiy wrote: >>> On 17/10/2018 06:08, Alex Williamson wrote: > On Mon, 15 Oct 2018 20:42:33 +1100 > Alexey Kardashevskiy wrote: >> + >> +if (pdev->vendor == PCI_VENDOR_ID_IBM && >> +pdev->device == 0x04ea) { >> +ret = vfio_pci_ibm_npu2_init(vdev); >> +if (ret) { >> +dev_warn(&vdev->pdev->dev, >> +"Failed to setup NVIDIA NV2 >> ATSD region\n"); >> +goto disable_exit; >> } > > So the NPU is also actually owned by vfio-pci and assigned to the VM? > Yes. On a running system it looks like: 0007:00:00.0 Bridge: IBM Device 04ea (rev 01) 0007:00:00.1 Bridge: IBM Device 04ea (rev 01) 0007:00:01.0 Bridge: IBM Device 04ea (rev 01) 0007:00:01.1 Bridge: IBM Device 04ea (rev 01) 0007:00:02.0 Bridge: IBM Device 04ea (rev 01) 0007:00:02.1 Bridge: IBM Device 04ea (rev 01) 0035:00:00.0 PCI bridge: IBM Device 04c1 0035:01:00.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) 0035:02:04.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) 0035:02:05.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) 0035:02:0d.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1 0035:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1) 0035:05:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1) One "IBM Device" bridge represents one NVLink2, i.e. a piece of NPU. They all and 3 GPUs go to the same IOMMU group and get passed through to a guest. The entire NPU does not have representation via sysfs as a whole though. 
>>> >>> So the NPU is a bridge, but it uses a normal header type so vfio-pci >>> will bind to it? >> >> An NPU is a NVLink bridge, it is not PCI in any sense. We (the host >> powerpc firmware known as "skiboot" or "opal") have chosen to emulate a >> virtual bridge per 1 NVLink on the firmware level. So for each physical >> NPU there are 6 virtual bridges. So the NVIDIA driver does not need to >> know much about NPUs. >> >>> And the ATSD register that we need on it is not >>> accessible through these PCI representations of the sub-pieces of the >>> NPU? Thanks, >> >> No, only via the device tree. The skiboot puts the ATSD register address >> to the PHB's DT property called 'ibm,mmio-atsd' of these virtual >> bridges. > > Ok, so the NPU is essential a virtual device already, mostly just a > stub. But it seems that each NPU is associated to a specific GPU, how > is that association done? In the use case here it seems like it's just > a vehicle to provide this ibm,mmio-atsd property to guest DT and the tgt > routing information to the GPU. So if both of those were attached to > the GPU, there'd be no purpose in assigning the NPU other than it's in > the same IOMMU group with a type 0 header, so something needs to be > done with it. If it's a virtual device, perhaps it could have a type 1 > header so vfio wouldn't care about it, then we would only assign the > GPU with these extra properties, which seems easier for management > tools and users. If the guest driver needs a visible NPU device, QEMU > could possibly emulate one to make the GPU association work > automatically. Maybe this isn't really a problem, but I wonder if > you've looked up the management stack to see what tools need to know to > assign these NPU devices and whether specific configurations are > required to make the NPU to GPU association work. Thanks, I'm not that familiar with how this was originally set up, but note that Alexey is just making it work exactly like baremetal does. 
The baremetal GPU driver works as-is in the VM and expects the same properties in the device-tree. Obviously it doesn't have to be that way, but there is value in keeping it identical. Another probably big
[PATCH 3.16 153/366] powerpc/e500mc: Set assembler machine type to e500mc
3.16.61-rc1 review patch. If anyone has any objections, please let me know.

--

From: Michael Jeanson

commit 69a8405999aa1c489de4b8d349468f0c2b83f093 upstream.

In binutils 2.26 a new opcode for the "wait" instruction was added for the
POWER9 and has precedence over the one specific to the e500mc. Commit
ebf714ff3756 ("powerpc/e500mc: Add support for the wait instruction in
e500_idle") uses this instruction specifically on the e500mc to work
around an erratum.

This results in an invalid instruction in idle_e500 when we build for the
e500mc on binutils >= 2.26 with the default assembler machine type.

Since multiplatform between e500 and non-e500 is not supported, set the
assembler machine type globally when CONFIG_PPC_E500MC=y.

Signed-off-by: Michael Jeanson
Reviewed-by: Mathieu Desnoyers
CC: Benjamin Herrenschmidt
CC: Paul Mackerras
CC: Michael Ellerman
CC: Kumar Gala
CC: Vakul Garg
CC: Scott Wood
CC: Mathieu Desnoyers
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-ker...@vger.kernel.org
Signed-off-by: Michael Ellerman
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings
---
 arch/powerpc/Makefile | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -205,6 +205,7 @@ endif
 cpu-as-$(CONFIG_4xx) += -Wa,-m405
 cpu-as-$(CONFIG_ALTIVEC) += -Wa,-maltivec
 cpu-as-$(CONFIG_E200) += -Wa,-me200
+cpu-as-$(CONFIG_PPC_E500MC) += $(call as-option,-Wa$(comma)-me500mc)
 
 KBUILD_AFLAGS += $(cpu-as-y)
 KBUILD_CFLAGS += $(cpu-as-y)
Re: [PATCH kernel 3/3] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
On Fri, Oct 19, 2018 at 11:53:53AM +1100, Alexey Kardashevskiy wrote: > > > On 19/10/2018 05:05, Alex Williamson wrote: > > On Thu, 18 Oct 2018 10:37:46 -0700 > > Piotr Jaroszynski wrote: > > > >> On 10/18/18 9:55 AM, Alex Williamson wrote: > >>> On Thu, 18 Oct 2018 11:31:33 +1100 > >>> Alexey Kardashevskiy wrote: > >>> > On 18/10/2018 08:52, Alex Williamson wrote: > > On Wed, 17 Oct 2018 12:19:20 +1100 > > Alexey Kardashevskiy wrote: > > > >> On 17/10/2018 06:08, Alex Williamson wrote: > >>> On Mon, 15 Oct 2018 20:42:33 +1100 > >>> Alexey Kardashevskiy wrote: > + > +if (pdev->vendor == PCI_VENDOR_ID_IBM && > +pdev->device == 0x04ea) { > +ret = vfio_pci_ibm_npu2_init(vdev); > +if (ret) { > +dev_warn(&vdev->pdev->dev, > +"Failed to setup NVIDIA NV2 > ATSD region\n"); > +goto disable_exit; > } > >>> > >>> So the NPU is also actually owned by vfio-pci and assigned to the VM? > >>> > >> > >> Yes. On a running system it looks like: > >> > >> 0007:00:00.0 Bridge: IBM Device 04ea (rev 01) > >> 0007:00:00.1 Bridge: IBM Device 04ea (rev 01) > >> 0007:00:01.0 Bridge: IBM Device 04ea (rev 01) > >> 0007:00:01.1 Bridge: IBM Device 04ea (rev 01) > >> 0007:00:02.0 Bridge: IBM Device 04ea (rev 01) > >> 0007:00:02.1 Bridge: IBM Device 04ea (rev 01) > >> 0035:00:00.0 PCI bridge: IBM Device 04c1 > >> 0035:01:00.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > >> 0035:02:04.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > >> 0035:02:05.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > >> 0035:02:0d.0 PCI bridge: PLX Technology, Inc. Device 8725 (rev ca) > >> 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > >> SXM2] > >> (rev a1 > >> 0035:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > >> SXM2] > >> (rev a1) > >> 0035:05:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 > >> SXM2] > >> (rev a1) > >> > >> One "IBM Device" bridge represents one NVLink2, i.e. a piece of NPU. 
> >> They all and 3 GPUs go to the same IOMMU group and get passed through > >> to > >> a guest. > >> > >> The entire NPU does not have representation via sysfs as a whole > >> though. > > > > So the NPU is a bridge, but it uses a normal header type so vfio-pci > > will bind to it? > > An NPU is a NVLink bridge, it is not PCI in any sense. We (the host > powerpc firmware known as "skiboot" or "opal") have chosen to emulate a > virtual bridge per 1 NVLink on the firmware level. So for each physical > NPU there are 6 virtual bridges. So the NVIDIA driver does not need to > know much about NPUs. > > > And the ATSD register that we need on it is not > > accessible through these PCI representations of the sub-pieces of the > > NPU? Thanks, > > No, only via the device tree. The skiboot puts the ATSD register address > to the PHB's DT property called 'ibm,mmio-atsd' of these virtual > bridges. > >>> > >>> Ok, so the NPU is essential a virtual device already, mostly just a > >>> stub. But it seems that each NPU is associated to a specific GPU, how > >>> is that association done? In the use case here it seems like it's just > >>> a vehicle to provide this ibm,mmio-atsd property to guest DT and the tgt > >>> routing information to the GPU. So if both of those were attached to > >>> the GPU, there'd be no purpose in assigning the NPU other than it's in > >>> the same IOMMU group with a type 0 header, so something needs to be > >>> done with it. If it's a virtual device, perhaps it could have a type 1 > >>> header so vfio wouldn't care about it, then we would only assign the > >>> GPU with these extra properties, which seems easier for management > >>> tools and users. If the guest driver needs a visible NPU device, QEMU > >>> could possibly emulate one to make the GPU association work > >>> automatically. 
Maybe this isn't really a problem, but I wonder if > >>> you've looked up the management stack to see what tools need to know to > >>> assign these NPU devices and whether specific configurations are > >>> required to make the NPU to GPU association work. Thanks, > >> > >> I'm not that familiar with how this was originally set up, but note that > >> Alexey is just making it work exactly like baremetal does. The baremetal > >> GPU driver works as-is in the VM and expects the same properties in the > >> device-tree. Obviously it doesn't have to be that way, but there is > >> value in keeping it identical. > >> > >> Another probably bigger point is that the NPU device also implemen
Re: [PATCH v2 2/2] kbuild: consolidate Clang compiler flags
Masahiro Yamada writes: > On Sat, Nov 10, 2018 at 3:35 AM Greg Hackmann wrote: >> >> On 11/09/2018 10:29 AM, Nick Desaulniers wrote: >> > On Mon, Nov 5, 2018 at 7:05 PM Masahiro Yamada >> > wrote: >> >> >> >> Collect basic Clang options such as --target, --prefix, --gcc-toolchain, >> >> -no-integrated-as into a single variable CLANG_FLAGS so that it can be >> >> easily reused in other parts of Makefile. >> >> >> >> Signed-off-by: Masahiro Yamada >> >> --- >> >> >> >> Changes in v2: >> >> - Use := flavor instead of = because $(CLANG_FLAGS) is expanded soon >> >> anyway >> >> >> >> Makefile | 13 ++--- >> >> 1 file changed, 6 insertions(+), 7 deletions(-) >> >> >> >> diff --git a/Makefile b/Makefile >> >> index da11700..e173a73 100644 >> >> --- a/Makefile >> >> +++ b/Makefile >> >> @@ -487,18 +487,17 @@ endif >> >> >> >> ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) >> >> ifneq ($(CROSS_COMPILE),) >> >> -CLANG_TARGET := --target=$(notdir $(CROSS_COMPILE:%-=%)) >> >> +CLANG_FLAGS:= --target=$(notdir $(CROSS_COMPILE:%-=%)) >> >> GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD))) >> >> -CLANG_PREFIX := --prefix=$(GCC_TOOLCHAIN_DIR) >> >> +CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR) >> >> GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..) >> >> endif >> >> ifneq ($(GCC_TOOLCHAIN),) >> >> -CLANG_GCC_TC := --gcc-toolchain=$(GCC_TOOLCHAIN) >> >> +CLANG_FLAGS+= --gcc-toolchain=$(GCC_TOOLCHAIN) >> >> endif >> >> -KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) >> >> -KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) >> >> -KBUILD_CFLAGS += -no-integrated-as >> >> -KBUILD_AFLAGS += -no-integrated-as >> >> +CLANG_FLAGS+= -no-integrated-as >> >> +KBUILD_CFLAGS += $(CLANG_FLAGS) >> >> +KBUILD_AFLAGS += $(CLANG_FLAGS) >> >> endif >> >> >> >> RETPOLINE_CFLAGS_GCC := -mindirect-branch=thunk-extern >> >> -mindirect-branch-register >> >> -- >> >> 2.7.4 >> >> >> > >> > Thanks for this patch, Masahiro, it's a good simplification. 
>> > Reviewed-by: Nick Desaulniers >> > Tested-by: Nick Desaulniers >> > >> > Would you mind waiting for a tested-by from Stefan, and maybe an ack >> > from Greg (added to cc)? >> > >> >> Acked-by: Greg Hackmann > > > Thanks for your review! > > > So, how to organize this series, and Joel's one together? > > I'd like Joel to use this series as a base for his work. > (https://lore.kernel.org/patchwork/patch/1006696/) > > It will be much cleaner. > > > Shall I merge all the patches to kbuild tree, or > maybe will they go through powerpc tree? Joel's changes are fairly small so you may as well merge them along with the rest of the series, if that's OK with you and Joel. cheers
[PATCH tip/core/rcu 08/41] powerpc: Convert hugepd_free() to use call_rcu()
Now that call_rcu()'s callback is not invoked until after all
preempt-disable regions of code have completed (in addition to explicitly
marked RCU read-side critical sections), call_rcu() can be used in place
of call_rcu_sched(). This commit therefore makes that change.

Signed-off-by: Paul E. McKenney
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc:
---
 arch/powerpc/mm/hugetlbpage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 8cf035e68378..4c01e9a01a74 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -289,7 +289,7 @@ static void hugepd_free(struct mmu_gather *tlb, void *hugepte)
 
 	(*batchp)->ptes[(*batchp)->index++] = hugepte;
 	if ((*batchp)->index == HUGEPD_FREELIST_SIZE) {
-		call_rcu_sched(&(*batchp)->rcu, hugepd_free_rcu_callback);
+		call_rcu(&(*batchp)->rcu, hugepd_free_rcu_callback);
 		*batchp = NULL;
 	}
 	put_cpu_var(hugepd_freelist_cur);
--
2.17.1
Re: [PATCH v2 2/2] kbuild: consolidate Clang compiler flags
On Sat, Nov 10, 2018 at 3:35 AM Greg Hackmann wrote: > > On 11/09/2018 10:29 AM, Nick Desaulniers wrote: > > On Mon, Nov 5, 2018 at 7:05 PM Masahiro Yamada > > wrote: > >> > >> Collect basic Clang options such as --target, --prefix, --gcc-toolchain, > >> -no-integrated-as into a single variable CLANG_FLAGS so that it can be > >> easily reused in other parts of Makefile. > >> > >> Signed-off-by: Masahiro Yamada > >> --- > >> > >> Changes in v2: > >> - Use := flavor instead of = because $(CLANG_FLAGS) is expanded soon > >> anyway > >> > >> Makefile | 13 ++--- > >> 1 file changed, 6 insertions(+), 7 deletions(-) > >> > >> diff --git a/Makefile b/Makefile > >> index da11700..e173a73 100644 > >> --- a/Makefile > >> +++ b/Makefile > >> @@ -487,18 +487,17 @@ endif > >> > >> ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) > >> ifneq ($(CROSS_COMPILE),) > >> -CLANG_TARGET := --target=$(notdir $(CROSS_COMPILE:%-=%)) > >> +CLANG_FLAGS:= --target=$(notdir $(CROSS_COMPILE:%-=%)) > >> GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD))) > >> -CLANG_PREFIX := --prefix=$(GCC_TOOLCHAIN_DIR) > >> +CLANG_FLAGS+= --prefix=$(GCC_TOOLCHAIN_DIR) > >> GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..) > >> endif > >> ifneq ($(GCC_TOOLCHAIN),) > >> -CLANG_GCC_TC := --gcc-toolchain=$(GCC_TOOLCHAIN) > >> +CLANG_FLAGS+= --gcc-toolchain=$(GCC_TOOLCHAIN) > >> endif > >> -KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) > >> -KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC) $(CLANG_PREFIX) > >> -KBUILD_CFLAGS += -no-integrated-as > >> -KBUILD_AFLAGS += -no-integrated-as > >> +CLANG_FLAGS+= -no-integrated-as > >> +KBUILD_CFLAGS += $(CLANG_FLAGS) > >> +KBUILD_AFLAGS += $(CLANG_FLAGS) > >> endif > >> > >> RETPOLINE_CFLAGS_GCC := -mindirect-branch=thunk-extern > >> -mindirect-branch-register > >> -- > >> 2.7.4 > >> > > > > Thanks for this patch, Masahiro, it's a good simplification. 
> > Reviewed-by: Nick Desaulniers > > Tested-by: Nick Desaulniers > > > > Would you mind waiting for a tested-by from Stefan, and maybe an ack > > from Greg (added to cc)? > > > > Acked-by: Greg Hackmann Thanks for your review! So, how to organize this series, and Joel's one together? I'd like Joel to use this series as a base for his work. (https://lore.kernel.org/patchwork/patch/1006696/) It will be much cleaner. Shall I merge all the patches to kbuild tree, or maybe will they go through powerpc tree? -- Best Regards Masahiro Yamada
[PATCH] powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call
Arch code should use tracehook_*() helpers, as documented in
include/linux/tracehook.h.

Signed-off-by: Elvira Khabirova
---
 arch/powerpc/kernel/ptrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index afb819f4ca68..f73f8ea402e9 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -3266,7 +3266,7 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 	user_exit();
 
 	if (test_thread_flag(TIF_SYSCALL_EMU)) {
-		ptrace_report_syscall(regs);
+		tracehook_report_syscall_entry(regs);
 		/*
 		 * Returning -1 will skip the syscall execution. We want to
 		 * avoid clobbering any register also, thus, not 'gotoing'
--
2.19.1