Re: [RFC v6 00/10] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table
On 2016/4/27 0:40, Alex Williamson wrote:
On Mon, 25 Apr 2016 18:05:53 +0800 Yongji Xie wrote:

Hi Alex,

Any comment?

TBH, I shuffled this to the bottom of the review pile because you're depending on a patch series for ARM MSI mapping that's still very much in flux. You've really got 3 or 4 separate patch series here that should be separated so they can be sent as non-RFC and you can start making progress. For instance, patches 1-4 are PCI-core enabling PAGE_SIZE aligned BARs, patch 5 discovers PAGE_SIZE aligned BARs and enables mmapping them through vfio. Now that you're using shadow resources to attempt to reserve the remainder of the page in patch 5, doesn't that make it independent of patches 1-4? These could be sent as separate series in parallel. Patches 6-9 are another separate series, but here you start to depend on the changes happening with ARM MSI mapping to determine whether we have real interrupt isolation. Once that gets settled, patch 10 becomes a much less controversial follow-on patch.

Thanks,
Alex

That's a really good idea! Thank you!

Regards,
Yongji

On 2016/4/18 18:53, Yongji Xie wrote:

The current vfio-pci implementation disallows mmapping sub-page (size < PAGE_SIZE) MMIO BARs and the MSI-X table. This is because a sub-page BAR's MMIO page may be shared with other BARs, and the MSI-X table should not be accessed directly from the guest for security reasons.

But this easily causes performance issues for MMIO accesses in the guest when vfio passes through sub-page BARs or BARs containing the MSI-X table on the PPC64 platform. PAGE_SIZE is 64KB by default on PPC64, so such a big page easily covers a sub-page MMIO BAR or the MMIO page in which the MSI-X table is located. That page is then left unmapped, which leads to MMIO emulation in the host.

To allow mmapping sub-page MMIO BARs, this patchset modifies the resource_alignment kernel parameter to enforce the alignment of all MMIO BARs to be at least PAGE_SIZE, so that a sub-page BAR's MMIO page will not be shared with other BARs. We also add shadow resources to the vfio device and put them into the holes of MMIO pages, in case a hot-added device's BARs would otherwise be assigned into the holes. Then we can mmap sub-page MMIO BARs safely.

To allow mmapping the MSI-X table, we think the MSI-X table is safe to access directly from userspace if the hardware supports interrupt remapping, which can ensure that a given PCI device can only shoot the MSIs assigned to it. But the implementation of this capability is arch-independent. To have a universal way to test this capability on the PCI side for different archs, we introduce a new bus_flags bit, PCI_BUS_FLAGS_MSI_REMAP.

With this patchset applied, we get almost a 100% performance improvement for small-block (4k) random reads when we pass through an FC HBA containing sub-page BARs and MSI-X BARs to a guest on PPC64 in our test.

Patch 8 is based on the proposed patchset[2].

Changelog v6:
- Rebase on vfio/next with patchset[2] applied
- Fix some bugs of v5
- Add three patches to make PCI_BUS_FLAGS_MSI_REMAP a universal flag to test IRQ remapping

Changelog v5:
- Rebase on vfio/next
- Change the order of patch 1,2,3
- Move the warning "resource_alignment will not work with PCI_PROBE_ONLY set" from documentation to kernel log
- Remove IORESOURCE_WINDOW
- Add description for parameter "resize"
- Add PCIBIOS_MIN_ALIGNMENT to force all MMIO BARs to get minimum alignment
- Add shadow resources to make sure a sub-page BAR's MMIO page will not be shared with hot-added BARs
- Add a new bit to pci_bus_flags to indicate the capability of interrupt remapping on PPC64
- Remove IOMMU_CAP_INTR_REMAP on PPC64
- Add a property msi_remap to vfio_pci_device to cache the capability of interrupt remapping

Changelog v4:
- Rebase on v4.5-rc6 with patchset[1] applied
- Remove resource_page_aligned kernel parameter
- Fix some problems with resource_alignment kernel parameter
- Modify resource_alignment kernel parameter to support multiple devices
- Remove host bridge attribute: msi_filtered
- Use IOMMU_CAP_INTR_REMAP to check if MSI-X table can be mmapped
- Add IOMMU_CAP_INTR_REMAP for IODA host bridge on PPC64 platform

Changelog v3:
- Rebase on new linux kernel mainline with the patchset[1] applied
- Add a function to check whether a PCI BAR's MMIO page is shared with other BARs
- Add a host bridge attribute to indicate that the PCI host bridge supports filtering of MSIs
- Use the new host bridge attribute to check if MSI-X table can be mmapped instead of CONFIG_EEH
- Remove Kconfig option VFIO_PCI_MMAP_MSIX

Changelog v2:
- Rebase on v4.4-rc6 with the patchset[1] applied
- Use kernel parameter to enforce all MMIO BARs to be page aligned in PCI core code instead of doing it in PPC64 arch code
- Remove flags: VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED

[1] http://www.spinics.net/lists/kvm/msg127812.html
[2] http://www.spinics.net/lists/kvm/msg130256.html

Yongji Xie
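The alignment argument in the cover letter comes down to simple address arithmetic. The following is a minimal userspace sketch of it, not code from the patchset: the helper names bar_start_is_page_aligned() and bar_tail_padding() are hypothetical, and the 64KB page size in the usage note is taken from the PPC64 default mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a BAR begins on a page boundary.
 * page_size must be a power of two. */
bool bar_start_is_page_aligned(uint64_t start, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (start & (page_size - 1)) == 0;
}

/* Hypothetical helper: number of bytes between the end of a BAR and
 * the end of its last page. This is the "hole" that would have to be
 * reserved so that no other BAR can be placed in the same page. */
uint64_t bar_tail_padding(uint64_t start, uint64_t size, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t end = start + size;
	uint64_t aligned_end = (end + page_size - 1) & ~(page_size - 1);

	return aligned_end - end;
}
```

With a 64KB page, a 4KB BAR placed at a page-aligned address leaves 60KB of the page unused; a BAR that fills whole pages leaves none.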
Re: [PATCH v2] Documentation: fix common spelling mistakes
On 04/26/16 16:41, Kees Cook wrote:
> This fixes several spelling mistakes in the Documentation/ tree, which
> are caught by checkpatch.pl's spell checking.
>
> Signed-off-by: Kees Cook

Acked-by: Randy Dunlap

Thanks.

> ---
>  Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu   | 11 ++-
>  .../ABI/testing/sysfs-bus-event_source-devices-hv_24x7    |  2 +-
>  Documentation/ABI/testing/sysfs-driver-hid-picolcd        |  2 +-
>  Documentation/ABI/testing/sysfs-firmware-acpi             |  2 +-
>  Documentation/DocBook/media/v4l/controls.xml              |  2 +-
>  Documentation/DocBook/media/v4l/dev-raw-vbi.xml           |  2 +-
>  Documentation/DocBook/media/v4l/vidioc-g-selection.xml    |  2 +-
>  Documentation/RCU/RTFP.txt                                |  6 +++---
>  Documentation/arm/SA1100/Assabet                          |  2 +-
>  Documentation/devicetree/bindings/mfd/arizona.txt         |  2 +-
>  Documentation/filesystems/cifs/README                     |  2 +-
>  Documentation/filesystems/pohmelfs/design_notes.txt       |  2 +-
>  Documentation/filesystems/qnx6.txt                        |  2 +-
>  Documentation/firmware_class/README                       |  2 +-
>  Documentation/hwmon/abituguru                             |  2 +-
>  Documentation/infiniband/ipoib.txt                        |  2 +-
>  Documentation/networking/altera_tse.txt                   |  2 +-
>  Documentation/networking/can.txt                          |  2 +-
>  Documentation/scsi/bfa.txt                                |  2 +-
>  Documentation/timers/hrtimers.txt                         |  2 +-
>  Documentation/video4linux/README.cx88                     |  2 +-
>  Documentation/video4linux/bttv/Sound-FAQ                  |  2 +-
>  Documentation/vm/hugetlbpage.txt                          |  2 +-
>  23 files changed, 30 insertions(+), 29 deletions(-)

--
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] Documentation: fix common spelling mistakes
This fixes several spelling mistakes in the Documentation/ tree, which are caught by checkpatch.pl's spell checking. Signed-off-by: Kees Cook--- Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu | 11 ++- .../ABI/testing/sysfs-bus-event_source-devices-hv_24x7| 2 +- Documentation/ABI/testing/sysfs-driver-hid-picolcd| 2 +- Documentation/ABI/testing/sysfs-firmware-acpi | 2 +- Documentation/DocBook/media/v4l/controls.xml | 2 +- Documentation/DocBook/media/v4l/dev-raw-vbi.xml | 2 +- Documentation/DocBook/media/v4l/vidioc-g-selection.xml| 2 +- Documentation/RCU/RTFP.txt| 6 +++--- Documentation/arm/SA1100/Assabet | 2 +- Documentation/devicetree/bindings/mfd/arizona.txt | 2 +- Documentation/filesystems/cifs/README | 2 +- Documentation/filesystems/pohmelfs/design_notes.txt | 2 +- Documentation/filesystems/qnx6.txt| 2 +- Documentation/firmware_class/README | 2 +- Documentation/hwmon/abituguru | 2 +- Documentation/infiniband/ipoib.txt| 2 +- Documentation/networking/altera_tse.txt | 2 +- Documentation/networking/can.txt | 2 +- Documentation/scsi/bfa.txt| 2 +- Documentation/timers/hrtimers.txt | 2 +- Documentation/video4linux/README.cx88 | 2 +- Documentation/video4linux/bttv/Sound-FAQ | 2 +- Documentation/vm/hugetlbpage.txt | 2 +- 23 files changed, 30 insertions(+), 29 deletions(-) diff --git a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu index f1e02a98bd9d..99fda67fce18 100644 --- a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu +++ b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu @@ -3,9 +3,10 @@ Date: Mai 2012 Contact: Stefan Achatz Description: The mouse can store 5 profiles which can be switched by the press of a button. A profile is split into general settings and - button settings. buttons holds informations about button layout. - When written, this file lets one write the respective profile - buttons to the mouse. The data has to be 47 bytes long. + button settings. 
The buttons variable holds information about + button layout. When written, this file lets one write the + respective profile buttons to the mouse. The data has to be + 47 bytes long. The mouse will reject invalid data. Which profile to write is determined by the profile number contained in the data. @@ -26,8 +27,8 @@ Date: Mai 2012 Contact: Stefan Achatz Description: The mouse can store 5 profiles which can be switched by the press of a button. A profile is split into general settings and - button settings. profile holds informations like resolution, sensitivity - and light effects. + button settings. A profile holds information like resolution, + sensitivity and light effects. When written, this file lets one write the respective profile settings back to the mouse. The data has to be 43 bytes long. The mouse will reject invalid data. diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 index f893337570c1..ec27c6c9e737 100644 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 @@ -4,7 +4,7 @@ Contact:Linux on PowerPC Developer List Description: Provides access to the binary "24x7 catalog" provided by the hypervisor on POWER7 and 8 systems. This catalog lists events - avaliable from the powerpc "hv_24x7" pmu. Its format is + available from the powerpc "hv_24x7" pmu. Its format is documented here: https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h diff --git a/Documentation/ABI/testing/sysfs-driver-hid-picolcd b/Documentation/ABI/testing/sysfs-driver-hid-picolcd index 08579e7e1e89..98fd81ad76a1 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-picolcd +++ b/Documentation/ABI/testing/sysfs-driver-hid-picolcd @@ -39,5 +39,5 @@ Description: Make it possible to
Re: [PATCH] Documentation: fix common spelling mistakes
On 04/26/16 16:28, Kees Cook wrote: > This fixes several spelling mistakes in the Documentation/ tree, which > are caught by checkpatch.pl's spell checking. > > Signed-off-by: Kees Cook> --- > Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu | 4 ++-- > Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 | 2 +- > Documentation/ABI/testing/sysfs-driver-hid-picolcd | 2 +- > Documentation/ABI/testing/sysfs-firmware-acpi| 2 +- > Documentation/DocBook/media/v4l/controls.xml | 2 +- > Documentation/DocBook/media/v4l/dev-raw-vbi.xml | 2 +- > Documentation/DocBook/media/v4l/vidioc-g-selection.xml | 2 +- > Documentation/RCU/RTFP.txt | 6 +++--- > Documentation/arm/SA1100/Assabet | 2 +- > Documentation/devicetree/bindings/mfd/arizona.txt| 2 +- > Documentation/filesystems/cifs/README| 2 +- > Documentation/filesystems/pohmelfs/design_notes.txt | 2 +- > Documentation/filesystems/qnx6.txt | 2 +- > Documentation/firmware_class/README | 2 +- > Documentation/hwmon/abituguru| 2 +- > Documentation/infiniband/ipoib.txt | 2 +- > Documentation/networking/altera_tse.txt | 2 +- > Documentation/networking/can.txt | 2 +- > Documentation/scsi/bfa.txt | 2 +- > Documentation/timers/hrtimers.txt| 2 +- > Documentation/video4linux/README.cx88| 2 +- > Documentation/video4linux/bttv/Sound-FAQ | 2 +- > Documentation/vm/hugetlbpage.txt | 2 +- > 23 files changed, 26 insertions(+), 26 deletions(-) > > diff --git a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu > b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu > index f1e02a98bd9d..846c3d5b6d8c 100644 > --- a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu > +++ b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu > @@ -3,7 +3,7 @@ Date: Mai 2012 > Contact: Stefan Achatz > Description: The mouse can store 5 profiles which can be switched by the > press of a button. A profile is split into general settings and > - button settings. buttons holds informations about button layout. > + button settings. 
buttons holds information about button layout. hold > When written, this file lets one write the respective profile > buttons to the mouse. The data has to be 47 bytes long. > The mouse will reject invalid data. > @@ -26,7 +26,7 @@ Date: Mai 2012 > Contact: Stefan Achatz > Description: The mouse can store 5 profiles which can be switched by the > press of a button. A profile is split into general settings and > - button settings. profile holds informations like resolution, > sensitivity > + button settings. profile holds information like resolution, > sensitivity profiles hold or A profile holds > and light effects. > When written, this file lets one write the respective profile > settings back to the mouse. The data has to be 43 bytes long. > diff --git a/Documentation/firmware_class/README > b/Documentation/firmware_class/README > index 71f86859d7d8..434e5db25fc0 100644 > --- a/Documentation/firmware_class/README > +++ b/Documentation/firmware_class/README > @@ -20,7 +20,7 @@ > > 1), kernel(driver): > - calls request_firmware(_entry, $FIRMWARE, device) > - - kernel searchs the fimware image with name $FIRMWARE directly > + - kernel searches the fimware image with name $FIRMWARE directly firmware > in the below search path of root filesystem: > User customized search path by module parameter 'path'[1] > "/lib/firmware/updates/" UTS_RELEASE, -- ~Randy -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Documentation: fix common spelling mistakes
On Tue, Apr 26, 2016 at 4:34 PM, Randy Dunlapwrote: > On 04/26/16 16:28, Kees Cook wrote: >> This fixes several spelling mistakes in the Documentation/ tree, which >> are caught by checkpatch.pl's spell checking. >> >> Signed-off-by: Kees Cook >> --- >> Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu | 4 ++-- >> Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 | 2 +- >> Documentation/ABI/testing/sysfs-driver-hid-picolcd | 2 +- >> Documentation/ABI/testing/sysfs-firmware-acpi| 2 +- >> Documentation/DocBook/media/v4l/controls.xml | 2 +- >> Documentation/DocBook/media/v4l/dev-raw-vbi.xml | 2 +- >> Documentation/DocBook/media/v4l/vidioc-g-selection.xml | 2 +- >> Documentation/RCU/RTFP.txt | 6 +++--- >> Documentation/arm/SA1100/Assabet | 2 +- >> Documentation/devicetree/bindings/mfd/arizona.txt| 2 +- >> Documentation/filesystems/cifs/README| 2 +- >> Documentation/filesystems/pohmelfs/design_notes.txt | 2 +- >> Documentation/filesystems/qnx6.txt | 2 +- >> Documentation/firmware_class/README | 2 +- >> Documentation/hwmon/abituguru| 2 +- >> Documentation/infiniband/ipoib.txt | 2 +- >> Documentation/networking/altera_tse.txt | 2 +- >> Documentation/networking/can.txt | 2 +- >> Documentation/scsi/bfa.txt | 2 +- >> Documentation/timers/hrtimers.txt| 2 +- >> Documentation/video4linux/README.cx88| 2 +- >> Documentation/video4linux/bttv/Sound-FAQ | 2 +- >> Documentation/vm/hugetlbpage.txt | 2 +- >> 23 files changed, 26 insertions(+), 26 deletions(-) >> >> diff --git a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu >> b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu >> index f1e02a98bd9d..846c3d5b6d8c 100644 >> --- a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu >> +++ b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu >> @@ -3,7 +3,7 @@ Date: Mai 2012 >> Contact: Stefan Achatz >> Description: The mouse can store 5 profiles which can be switched by the >> press of a button. 
A profile is split into general settings and >> - button settings. buttons holds informations about button >> layout. >> + button settings. buttons holds information about button layout. > > hold > >> When written, this file lets one write the respective profile >> buttons to the mouse. The data has to be 47 bytes long. >> The mouse will reject invalid data. >> @@ -26,7 +26,7 @@ Date: Mai 2012 >> Contact: Stefan Achatz >> Description: The mouse can store 5 profiles which can be switched by the >> press of a button. A profile is split into general settings and >> - button settings. profile holds informations like resolution, >> sensitivity >> + button settings. profile holds information like resolution, >> sensitivity > > profiles hold > or > A profile holds Yeah, that bugged me too. I'll fix this (and I think there's a "button" leading a sentence in there too..) > >> and light effects. >> When written, this file lets one write the respective profile >> settings back to the mouse. The data has to be 43 bytes long. > > >> diff --git a/Documentation/firmware_class/README >> b/Documentation/firmware_class/README >> index 71f86859d7d8..434e5db25fc0 100644 >> --- a/Documentation/firmware_class/README >> +++ b/Documentation/firmware_class/README >> @@ -20,7 +20,7 @@ >> >> 1), kernel(driver): >> - calls request_firmware(_entry, $FIRMWARE, device) >> - - kernel searchs the fimware image with name $FIRMWARE directly >> + - kernel searches the fimware image with name $FIRMWARE directly > > firmware Hah. I should add that to the checkpatch misspellings. ;) -Kees > >> in the below search path of root filesystem: >> User customized search path by module parameter 'path'[1] >> "/lib/firmware/updates/" UTS_RELEASE, > > -- > ~Randy -- Kees Cook Chrome OS & Brillo Security -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at
[PATCH] Documentation: fix common spelling mistakes
This fixes several spelling mistakes in the Documentation/ tree, which are caught by checkpatch.pl's spell checking. Signed-off-by: Kees Cook--- Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu | 4 ++-- Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 | 2 +- Documentation/ABI/testing/sysfs-driver-hid-picolcd | 2 +- Documentation/ABI/testing/sysfs-firmware-acpi| 2 +- Documentation/DocBook/media/v4l/controls.xml | 2 +- Documentation/DocBook/media/v4l/dev-raw-vbi.xml | 2 +- Documentation/DocBook/media/v4l/vidioc-g-selection.xml | 2 +- Documentation/RCU/RTFP.txt | 6 +++--- Documentation/arm/SA1100/Assabet | 2 +- Documentation/devicetree/bindings/mfd/arizona.txt| 2 +- Documentation/filesystems/cifs/README| 2 +- Documentation/filesystems/pohmelfs/design_notes.txt | 2 +- Documentation/filesystems/qnx6.txt | 2 +- Documentation/firmware_class/README | 2 +- Documentation/hwmon/abituguru| 2 +- Documentation/infiniband/ipoib.txt | 2 +- Documentation/networking/altera_tse.txt | 2 +- Documentation/networking/can.txt | 2 +- Documentation/scsi/bfa.txt | 2 +- Documentation/timers/hrtimers.txt| 2 +- Documentation/video4linux/README.cx88| 2 +- Documentation/video4linux/bttv/Sound-FAQ | 2 +- Documentation/vm/hugetlbpage.txt | 2 +- 23 files changed, 26 insertions(+), 26 deletions(-) diff --git a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu index f1e02a98bd9d..846c3d5b6d8c 100644 --- a/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu +++ b/Documentation/ABI/obsolete/sysfs-driver-hid-roccat-savu @@ -3,7 +3,7 @@ Date: Mai 2012 Contact: Stefan Achatz Description: The mouse can store 5 profiles which can be switched by the press of a button. A profile is split into general settings and - button settings. buttons holds informations about button layout. + button settings. buttons holds information about button layout. 
When written, this file lets one write the respective profile buttons to the mouse. The data has to be 47 bytes long. The mouse will reject invalid data. @@ -26,7 +26,7 @@ Date: Mai 2012 Contact: Stefan Achatz Description: The mouse can store 5 profiles which can be switched by the press of a button. A profile is split into general settings and - button settings. profile holds informations like resolution, sensitivity + button settings. profile holds information like resolution, sensitivity and light effects. When written, this file lets one write the respective profile settings back to the mouse. The data has to be 43 bytes long. diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 index f893337570c1..ec27c6c9e737 100644 --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 @@ -4,7 +4,7 @@ Contact:Linux on PowerPC Developer List Description: Provides access to the binary "24x7 catalog" provided by the hypervisor on POWER7 and 8 systems. This catalog lists events - avaliable from the powerpc "hv_24x7" pmu. Its format is + available from the powerpc "hv_24x7" pmu. Its format is documented here: https://raw.githubusercontent.com/jmesmon/catalog-24x7/master/hv-24x7-catalog.h diff --git a/Documentation/ABI/testing/sysfs-driver-hid-picolcd b/Documentation/ABI/testing/sysfs-driver-hid-picolcd index 08579e7e1e89..98fd81ad76a1 100644 --- a/Documentation/ABI/testing/sysfs-driver-hid-picolcd +++ b/Documentation/ABI/testing/sysfs-driver-hid-picolcd @@ -39,5 +39,5 @@ Description: Make it possible to adjust defio refresh rate. Note: As device can barely do 2 complete refreshes a second it only makes sense to adjust this value if only one or two tiles get changed and it's not appropriate to expect the application - to flush it's tiny changes explicitely at higher than default
[RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME
When Secure Memory Encryption (SME) is enabled, the physical address
space is reduced. Adjust the x86_phys_bits value to reflect this
reduction.

Signed-off-by: Tom Lendacky
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6bfa36d..b49e7fc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include
@@ -722,6 +723,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
+		c->x86_phys_bits -= sme_get_me_loss();
 		c->x86_capability[CPUID_8000_0008_EBX] = ebx;
 	}
 #ifdef CONFIG_X86_32
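The adjustment in this patch can be sketched in isolation. The field layout (physical bits in EAX[7:0], virtual bits in EAX[15:8]) is taken from the hunk above; the function names are hypothetical, and me_loss stands in for whatever sme_get_me_loss() returns, which this message does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address bits, as extracted in the hunk above (EAX[15:8]). */
uint8_t sketch_virt_bits(uint32_t eax)
{
	return (eax >> 8) & 0xff;
}

/* Physical address bits (EAX[7:0]) minus the bits given up to memory
 * encryption. me_loss is an assumed input here; in the patch it comes
 * from sme_get_me_loss(). */
uint8_t sketch_phys_bits(uint32_t eax, uint8_t me_loss)
{
	uint8_t phys = eax & 0xff;

	assert(me_loss <= phys);
	return phys - me_loss;
}
```

For example, a CPU reporting 48 physical bits that gives up one bit to encryption would end up with 47 usable physical address bits.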
[RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory
This adds support to be able to either encrypt or decrypt data during
the early stages of booting the kernel. This does not change the memory
encryption attribute - it is used for ensuring that data present in
either an encrypted or un-encrypted memory area is in the proper state
(for example the initrd will have been loaded by the boot loader and
will not be encrypted, but the memory that it resides in is marked as
encrypted).

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/mem_encrypt.h |   15 ++
 arch/x86/mm/mem_encrypt.c          |   89
 2 files changed, 104 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 9f3e762..2785493 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,6 +23,11 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+void __init sme_early_mem_enc(resource_size_t paddr,
+			      unsigned long size);
+void __init sme_early_mem_dec(resource_size_t paddr,
+			      unsigned long size);
+
 void __init sme_early_init(void);
 
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
@@ -39,6 +44,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline void __init sme_early_mem_enc(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
+static inline void __init sme_early_mem_dec(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
 static inline void __init sme_early_init(void)
 {
 }
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 00eb705..5f19ede 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,6 +14,95 @@
 #include
 #include
 
+#include
+#include
+
+/* Buffer used for early in-place encryption by BSP, no locking needed */
+static char me_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for non-encrypted write-protected memory */
+		src = early_memremap_dec_wp(paddr, len);
+
+		/* Create a mapping for encrypted memory */
+		dst = early_memremap_enc(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the encryption,
+		 * then encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
+
+void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for encrypted write-protected memory */
+		src = early_memremap_enc_wp(paddr, len);
+
+		/* Create a mapping for non-encrypted memory */
+		dst = early_memremap_dec(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the decryption,
+		 * then un-encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
 
 void __init sme_early_init(void)
 {
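The loop structure used by both functions (copy at most one buffer's worth out through one view, then back in through the other) can be illustrated in userspace. This is only an analogue under stated assumptions: there are no early mappings here, an XOR with a key byte stands in for the change of encryption state that the hardware applies through the differently-attributed mappings, and the names bounce_transform() and CHUNK_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 4096	/* stand-in for PAGE_SIZE */

/* Single bounce buffer, reused for every chunk. */
static uint8_t bounce[CHUNK_SIZE];

/* Transform a region in place, at most one chunk at a time, by copying
 * it into the bounce buffer and writing it back. The XOR is only a
 * placeholder for the real encrypt/decrypt effect of the mappings. */
void bounce_transform(uint8_t *mem, size_t size, uint8_t key)
{
	assert(mem != NULL || size == 0);

	while (size) {
		size_t len = size < CHUNK_SIZE ? size : CHUNK_SIZE;

		memcpy(bounce, mem, len);	/* read through the "source" view */
		for (size_t i = 0; i < len; i++)
			bounce[i] ^= key;
		memcpy(mem, bounce, len);	/* write through the "destination" view */

		mem += len;
		size -= len;
	}
}
```

Applying the transform twice with the same key restores the original contents, which mirrors how sme_early_mem_enc() and sme_early_mem_dec() are meant to undo each other.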
[RFC PATCH v1 13/18] x86: DMA support for memory encryption
Since DMA addresses will effectively look like 48-bit addresses when the memory encryption mask is set, SWIOTLB is needed if the DMA mask of the device performing the DMA does not support 48-bits. SWIOTLB will be initialized to create un-encrypted bounce buffers for use by these devices. Signed-off-by: Tom Lendacky--- arch/x86/include/asm/dma-mapping.h |5 ++- arch/x86/include/asm/mem_encrypt.h |5 +++ arch/x86/kernel/pci-dma.c | 11 -- arch/x86/kernel/pci-nommu.c|2 + arch/x86/kernel/pci-swiotlb.c |8 +++-- arch/x86/mm/mem_encrypt.c | 21 include/linux/swiotlb.h|1 + init/main.c|6 +++ lib/swiotlb.c | 64 9 files changed, 106 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/dma-mapping.h b/arch/x86/include/asm/dma-mapping.h index 3a27b93..33a4f6d 100644 --- a/arch/x86/include/asm/dma-mapping.h +++ b/arch/x86/include/asm/dma-mapping.h @@ -13,6 +13,7 @@ #include #include #include +#include #ifdef CONFIG_ISA # define ISA_DMA_BIT_MASK DMA_BIT_MASK(24) @@ -70,12 +71,12 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size) static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) { - return paddr; + return paddr | sme_me_mask; } static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t daddr) { - return daddr; + return daddr & ~sme_me_mask; } #endif /* CONFIG_X86_DMA_REMAP */ diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 42868f5..d17d8cf 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -37,6 +37,11 @@ void __init *sme_early_memremap(resource_size_t paddr, void __init sme_early_init(void); /* Architecture __weak replacement functions */ +void __init mem_encrypt_init(void); + +unsigned long swiotlb_get_me_mask(void); +void swiotlb_set_mem_dec(void *vaddr, unsigned long size); + void __init *efi_me_early_memremap(resource_size_t paddr, unsigned long size); diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c 
index 6ba014c..bd1daae 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -92,9 +92,12 @@ again: /* CMA can be used only in the context which permits sleeping */ if (gfpflags_allow_blocking(flag)) { page = dma_alloc_from_contiguous(dev, count, get_order(size)); - if (page && page_to_phys(page) + size > dma_mask) { - dma_release_from_contiguous(dev, page, count); - page = NULL; + if (page) { + addr = phys_to_dma(dev, page_to_phys(page)); + if (addr + size > dma_mask) { + dma_release_from_contiguous(dev, page, count); + page = NULL; + } } } /* fallback */ @@ -103,7 +106,7 @@ again: if (!page) return NULL; - addr = page_to_phys(page); + addr = phys_to_dma(dev, page_to_phys(page)); if (addr + size > dma_mask) { __free_pages(page, get_order(size)); diff --git a/arch/x86/kernel/pci-nommu.c b/arch/x86/kernel/pci-nommu.c index da15918..ca2b820 100644 --- a/arch/x86/kernel/pci-nommu.c +++ b/arch/x86/kernel/pci-nommu.c @@ -30,7 +30,7 @@ static dma_addr_t nommu_map_page(struct device *dev, struct page *page, enum dma_data_direction dir, struct dma_attrs *attrs) { - dma_addr_t bus = page_to_phys(page) + offset; + dma_addr_t bus = phys_to_dma(dev, page_to_phys(page)) + offset; WARN_ON(size == 0); if (!check_addr("map_single", dev, bus, size)) return DMA_ERROR_CODE; diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c index 7c577a1..0ae083d 100644 --- a/arch/x86/kernel/pci-swiotlb.c +++ b/arch/x86/kernel/pci-swiotlb.c @@ -12,6 +12,8 @@ #include #include #include +#include + int swiotlb __read_mostly; void *x86_swiotlb_alloc_coherent(struct device *hwdev, size_t size, @@ -64,13 +66,15 @@ static struct dma_map_ops swiotlb_dma_ops = { * pci_swiotlb_detect_override - set swiotlb to 1 if necessary * * This returns non-zero if we are forced to use swiotlb (by the boot - * option). + * option). 
If memory encryption is enabled then swiotlb will be set + * to 1 so that bounce buffers are allocated and used for devices that + * do not support the addressing range required for the encryption mask. */ int __init pci_swiotlb_detect_override(void) { int use_swiotlb = swiotlb | swiotlb_force; - if (swiotlb_force) + if (swiotlb_force || sme_me_mask)
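The address translation and the bounce-buffer decision in this patch can be sketched as plain arithmetic. The OR/AND-NOT translation and the "addr + size > dma_mask" comparison follow the hunks above; the function names are hypothetical, and placing the encryption mask at bit 47 is an assumption for illustration only (the message says only that DMA addresses "look like 48-bit addresses").

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Physical to DMA address: set the encryption mask bit(s). */
uint64_t sketch_phys_to_dma(uint64_t paddr, uint64_t me_mask)
{
	assert((paddr & me_mask) == 0);
	return paddr | me_mask;
}

/* DMA to physical address: clear the encryption mask bit(s). */
uint64_t sketch_dma_to_phys(uint64_t daddr, uint64_t me_mask)
{
	return daddr & ~me_mask;
}

/* True if the device cannot address the buffer once the mask is
 * applied, in which case an un-encrypted bounce buffer is needed. */
bool sketch_needs_bounce(uint64_t paddr, uint64_t size,
			 uint64_t me_mask, uint64_t dma_mask)
{
	return sketch_phys_to_dma(paddr, me_mask) + size > dma_mask;
}
```

A device limited to 32-bit DMA would need bouncing for every buffer once the mask is set, while a device with a full 48-bit DMA mask would not.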
[RFC PATCH v1 04/18] x86: Add the Secure Memory Encryption cpu feature
Update the cpu features to include identifying and reporting on the
Secure Memory Encryption feature.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/cpufeature.h  |    1 +
 arch/x86/include/asm/cpufeatures.h |    5 -
 arch/x86/kernel/cpu/scattered.c    |    1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 07c942d..e27e352 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -27,6 +27,7 @@ enum cpuid_leafs
 	CPUID_6_EAX,
 	CPUID_8000_000A_EDX,
 	CPUID_7_ECX,
+	CPUID_8000_001F_EAX,
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 47b5056..4aea205 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS	17	/* N 32-bit words worth of info */
+#define NCAPINTS	18	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -282,6 +282,9 @@
 #define X86_FEATURE_PKU		(16*32+ 3) /* Protection Keys for Userspace */
 #define X86_FEATURE_OSPKE	(16*32+ 4) /* OS Protection Keys Enable */
 
+/* AMD SME Feature Identification, CPUID level 0x8000001f (eax), word 17 */
+#define X86_FEATURE_SME		(17*32+ 0) /* Secure Memory Encryption support */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 8cb57df..d86d9a5 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -37,6 +37,7 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 		{ X86_FEATURE_HW_PSTATE,	CR_EDX, 7, 0x80000007, 0 },
 		{ X86_FEATURE_CPB,		CR_EDX, 9, 0x80000007, 0 },
 		{ X86_FEATURE_PROC_FEEDBACK,	CR_EDX,11, 0x80000007, 0 },
+		{ X86_FEATURE_SME,		CR_EAX, 0, 0x8000001f, 0 },
 		{ 0, 0, 0, 0, 0 }
 	};
 
-- 
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
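For readers new to the cpufeatures layout: the patch defines
X86_FEATURE_SME as (17*32+ 0), which is why NCAPINTS has to grow from 17
to 18. The following is only a user-space sketch of that word/bit
encoding. The names NCAPINTS_SKETCH, FEATURE_SME_SKETCH, feature_word(),
feature_bit() and feature_present() are made up for illustration and are
not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: mirrors the (word*32 + bit) encoding in the patch. */
#define NCAPINTS_SKETCH		18
#define FEATURE_SME_SKETCH	(17 * 32 + 0)

/* Index of the 32-bit capability word that holds a feature. */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static inline unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test a feature in a capability array of NCAPINTS_SKETCH 32-bit words. */
static inline int feature_present(const uint32_t *caps, unsigned int feature)
{
	assert(feature_word(feature) < NCAPINTS_SKETCH);
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

With 17 words the highest valid index is 16, so a feature in word 17
would index past the end of the array. That is the reason the define and
the new word have to be added together.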
[RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
The EFI tables are not encrypted and need to be accessed as such. Be sure to memmap them without the encryption attribute set. For EFI support that lives outside of the arch/x86 tree, create a routine that uses the __weak attribute so that it can be overridden by an architecture specific routine. When freeing boot services related memory, since it has been mapped as un-encrypted, be sure to change the mapping to encrypted for future use. Signed-off-by: Tom Lendacky--- arch/x86/include/asm/cacheflush.h |3 + arch/x86/include/asm/mem_encrypt.h | 22 +++ arch/x86/kernel/setup.c|6 +-- arch/x86/mm/mem_encrypt.c | 56 +++ arch/x86/mm/pageattr.c | 75 arch/x86/platform/efi/efi.c| 26 +++- arch/x86/platform/efi/efi_64.c |9 +++- arch/x86/platform/efi/quirks.c | 12 +- drivers/firmware/efi/efi.c | 18 +++-- drivers/firmware/efi/esrt.c| 12 +++--- include/linux/efi.h|3 + 11 files changed, 212 insertions(+), 30 deletions(-) diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h index 61518cf..bfb08e5 100644 --- a/arch/x86/include/asm/cacheflush.h +++ b/arch/x86/include/asm/cacheflush.h @@ -13,6 +13,7 @@ * Executability : eXeutable, NoteXecutable * Read/Write: ReadOnly, ReadWrite * Presence : NotPresent + * Encryption: ENCrypted, DECrypted * * Within a category, the attributes are mutually exclusive. 
* @@ -48,6 +49,8 @@ int set_memory_ro(unsigned long addr, int numpages); int set_memory_rw(unsigned long addr, int numpages); int set_memory_np(unsigned long addr, int numpages); int set_memory_4k(unsigned long addr, int numpages); +int set_memory_enc(unsigned long addr, int numpages); +int set_memory_dec(unsigned long addr, int numpages); int set_memory_array_uc(unsigned long *addr, int addrinarray); int set_memory_array_wc(unsigned long *addr, int addrinarray); diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 2785493..42868f5 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -23,13 +23,23 @@ extern unsigned long sme_me_mask; u8 sme_get_me_loss(void); +int sme_set_mem_enc(void *vaddr, unsigned long size); +int sme_set_mem_dec(void *vaddr, unsigned long size); + void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size); void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size); +void __init *sme_early_memremap(resource_size_t paddr, + unsigned long size); + void __init sme_early_init(void); +/* Architecture __weak replacement functions */ +void __init *efi_me_early_memremap(resource_size_t paddr, + unsigned long size); + #define __sme_pa(x)(__pa((x)) | sme_me_mask) #define __sme_pa_nodebug(x)(__pa_nodebug((x)) | sme_me_mask) @@ -44,6 +54,16 @@ static inline u8 sme_get_me_loss(void) return 0; } +static inline int sme_set_mem_enc(void *vaddr, unsigned long size) +{ + return 0; +} + +static inline int sme_set_mem_dec(void *vaddr, unsigned long size) +{ + return 0; +} + static inline void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size) { @@ -63,6 +83,8 @@ static inline void __init sme_early_init(void) #define __sme_va __va +#define sme_early_memremap early_memremap + #endif /* CONFIG_AMD_MEM_ENCRYPT */ #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1d29cf9..2e460fb 100644 --- 
a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -424,7 +424,7 @@ static void __init parse_setup_data(void) while (pa_data) { u32 data_len, data_type; - data = early_memremap(pa_data, sizeof(*data)); + data = sme_early_memremap(pa_data, sizeof(*data)); data_len = data->len + sizeof(struct setup_data); data_type = data->type; pa_next = data->next; @@ -457,7 +457,7 @@ static void __init e820_reserve_setup_data(void) return; while (pa_data) { - data = early_memremap(pa_data, sizeof(*data)); + data = sme_early_memremap(pa_data, sizeof(*data)); e820_update_range(pa_data, sizeof(*data)+data->len, E820_RAM, E820_RESERVED_KERN); pa_data = data->next; @@ -477,7 +477,7 @@ static void __init memblock_x86_reserve_range_setup_data(void) pa_data = boot_params.hdr.setup_data; while (pa_data) { - data = early_memremap(pa_data, sizeof(*data)); + data = sme_early_memremap(pa_data,
[RFC PATCH v1 12/18] x86: Access device tree in the clear
The device tree is not encrypted and needs to be accessed as such. Be sure
to memmap it without the encryption mask set.

Signed-off-by: Tom Lendacky
---
 arch/x86/kernel/devicetree.c |    6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index 3fe45f8..ff11f7a 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 __initdata u64 initial_dtb;
 char __initdata cmd_line[COMMAND_LINE_SIZE];
@@ -276,11 +277,12 @@ static void __init x86_flattree_get_config(void)
 
 	map_len = max(PAGE_SIZE - (initial_dtb & ~PAGE_MASK), (u64)128);
 
-	initial_boot_params = dt = early_memremap(initial_dtb, map_len);
+	initial_boot_params = dt = sme_early_memremap(initial_dtb, map_len);
 	size = of_get_flat_dt_size();
 	if (map_len < size) {
 		early_memunmap(dt, map_len);
-		initial_boot_params = dt = early_memremap(initial_dtb, size);
+		initial_boot_params = dt = sme_early_memremap(initial_dtb,
+							      size);
 		map_len = size;
 	}
 
[RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs
Add to the early_memmap support to be able to specify encrypted and un-encrypted mappings with and without write-protection. The use of write-protection is necessary when encrypting data "in place". The write-protect attribute is considered cacheable for loads, but not stores. This implies that the hardware will never give the core a dirty line with this memtype. Signed-off-by: Tom Lendacky--- arch/x86/include/asm/fixmap.h|9 + arch/x86/include/asm/pgtable_types.h |8 arch/x86/mm/ioremap.c| 28 include/asm-generic/early_ioremap.h |2 ++ mm/early_ioremap.c | 15 +++ 5 files changed, 62 insertions(+) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 83e91f0..4d41878 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -160,6 +160,15 @@ static inline void __set_fixmap(enum fixed_addresses idx, */ #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE +void __init *early_memremap_enc(resource_size_t phys_addr, + unsigned long size); +void __init *early_memremap_enc_wp(resource_size_t phys_addr, + unsigned long size); +void __init *early_memremap_dec(resource_size_t phys_addr, + unsigned long size); +void __init *early_memremap_dec_wp(resource_size_t phys_addr, + unsigned long size); + #include #define __late_set_fixmap(idx, phys, flags) __set_fixmap(idx, phys, flags) diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index fda7877..6291248 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -154,6 +154,7 @@ enum page_cache_mode { #define _PAGE_CACHE_MASK (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT) #define _PAGE_NOCACHE (cachemode2protval(_PAGE_CACHE_MODE_UC)) +#define _PAGE_CACHE_WP (cachemode2protval(_PAGE_CACHE_MODE_WP)) #define PAGE_NONE __pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED) #define PAGE_SHARED__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \ @@ -182,6 +183,7 @@ enum page_cache_mode { #define __PAGE_KERNEL_VVAR (__PAGE_KERNEL_RO | 
_PAGE_USER) #define __PAGE_KERNEL_LARGE(__PAGE_KERNEL | _PAGE_PSE) #define __PAGE_KERNEL_LARGE_EXEC (__PAGE_KERNEL_EXEC | _PAGE_PSE) +#define __PAGE_KERNEL_WP (__PAGE_KERNEL | _PAGE_CACHE_WP) #define __PAGE_KERNEL_IO (__PAGE_KERNEL) #define __PAGE_KERNEL_IO_NOCACHE (__PAGE_KERNEL_NOCACHE) @@ -196,6 +198,12 @@ enum page_cache_mode { #define _KERNPG_TABLE (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |\ _PAGE_DIRTY | _PAGE_ENC) +#define __PAGE_KERNEL_ENC (__PAGE_KERNEL | _PAGE_ENC) +#define __PAGE_KERNEL_ENC_WP (__PAGE_KERNEL_WP | _PAGE_ENC) + +#define __PAGE_KERNEL_DEC (__PAGE_KERNEL) +#define __PAGE_KERNEL_DEC_WP (__PAGE_KERNEL_WP) + #define PAGE_KERNEL__pgprot(__PAGE_KERNEL | _PAGE_ENC) #define PAGE_KERNEL_RO __pgprot(__PAGE_KERNEL_RO | _PAGE_ENC) #define PAGE_KERNEL_EXEC __pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC) diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 77dadf5..14c7ed5 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -420,6 +420,34 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK)); } +/* Remap memory with encryption */ +void __init *early_memremap_enc(resource_size_t phys_addr, + unsigned long size) +{ + return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC); +} + +/* Remap memory with encryption and write-protected */ +void __init *early_memremap_enc_wp(resource_size_t phys_addr, + unsigned long size) +{ + return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC_WP); +} + +/* Remap memory without encryption */ +void __init *early_memremap_dec(resource_size_t phys_addr, + unsigned long size) +{ + return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC); +} + +/* Remap memory without encryption and write-protected */ +void __init *early_memremap_dec_wp(resource_size_t phys_addr, + unsigned long size) +{ + return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC_WP); +} + static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __page_aligned_bss; 
static inline pmd_t * __init early_ioremap_pmd(unsigned long addr) diff --git a/include/asm-generic/early_ioremap.h b/include/asm-generic/early_ioremap.h index 734ad4d..2edef8d 100644 --- a/include/asm-generic/early_ioremap.h +++ b/include/asm-generic/early_ioremap.h @@ -13,6 +13,8 @@ extern void *early_memremap(resource_size_t phys_addr,
[RFC PATCH v1 11/18] x86: Decrypt trampoline area if memory encryption is active
When Secure Memory Encryption is enabled, the trampoline area must not
be encrypted. A cpu running in real mode will not be able to decrypt
memory that has been encrypted because it will not be able to use addresses
with the memory encryption mask.

Signed-off-by: Tom Lendacky
---
 arch/x86/realmode/init.c |    9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 0b7a63d..85b145c 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 
 struct real_mode_header *real_mode_header;
 u32 *trampoline_cr4_features;
@@ -113,6 +114,14 @@ static int __init set_real_mode_permissions(void)
 	unsigned long text_start =
 		(unsigned long) __va(real_mode_header->text_start);
 
+	/*
+	 * If memory encryption is active, the trampoline area will need to
+	 * be in non-encrypted memory in order to bring up other processors
+	 * successfully.
+	 */
+	sme_early_mem_dec(__pa(base), size);
+	sme_set_mem_dec(base, size);
+
 	set_memory_nx((unsigned long) base, size >> PAGE_SHIFT);
 	set_memory_ro((unsigned long) base, ro_size >> PAGE_SHIFT);
 	set_memory_x((unsigned long) text_start, text_size >> PAGE_SHIFT);
[RFC PATCH v1 08/18] x86: Add support for early encryption/decryption of memory
This adds support to be able to either encrypt or decrypt data during
the early stages of booting the kernel. This does not change the memory
encryption attribute - it is used for ensuring that data present in
either an encrypted or un-encrypted memory area is in the proper state
(for example the initrd will have been loaded by the boot loader and
will not be encrypted, but the memory that it resides in is marked as
encrypted).

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/mem_encrypt.h |   15 ++
 arch/x86/mm/mem_encrypt.c          |   89
 2 files changed, 104 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 9f3e762..2785493 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,6 +23,11 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+void __init sme_early_mem_enc(resource_size_t paddr,
+			      unsigned long size);
+void __init sme_early_mem_dec(resource_size_t paddr,
+			      unsigned long size);
+
 void __init sme_early_init(void);
 
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
@@ -39,6 +44,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline void __init sme_early_mem_enc(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
+static inline void __init sme_early_mem_dec(resource_size_t paddr,
+					    unsigned long size)
+{
+}
+
 static inline void __init sme_early_init(void)
 {
 }
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 00eb705..5f19ede 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -14,6 +14,95 @@
 #include
 #include
+#include
+#include
+
+/* Buffer used for early in-place encryption by BSP, no locking needed */
+static char me_early_buffer[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+void __init sme_early_mem_enc(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for non-encrypted write-protected memory */
+		src = early_memremap_dec_wp(paddr, len);
+
+		/* Create a mapping for encrypted memory */
+		dst = early_memremap_enc(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the encryption,
+		 * then encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
+
+void __init sme_early_mem_dec(resource_size_t paddr, unsigned long size)
+{
+	void *src, *dst;
+	size_t len;
+
+	if (!sme_me_mask)
+		return;
+
+	local_flush_tlb();
+	wbinvd();
+
+	/*
+	 * There are limited number of early mapping slots, so map (at most)
+	 * one page at time.
+	 */
+	while (size) {
+		len = min_t(size_t, sizeof(me_early_buffer), size);
+
+		/* Create a mapping for encrypted write-protected memory */
+		src = early_memremap_enc_wp(paddr, len);
+
+		/* Create a mapping for non-encrypted memory */
+		dst = early_memremap_dec(paddr, len);
+
+		/*
+		 * If a mapping can't be obtained to perform the decryption,
+		 * then un-encrypted access to that area will end up causing
+		 * a crash.
+		 */
+		BUG_ON(!src || !dst);
+
+		memcpy(me_early_buffer, src, len);
+		memcpy(dst, me_early_buffer, len);
+
+		early_memunmap(dst, len);
+		early_memunmap(src, len);
+
+		paddr += len;
+		size -= len;
+	}
+}
 
 void __init sme_early_init(void)
 {
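The loop structure in sme_early_mem_enc()/sme_early_mem_dec() may be
easier to follow outside the kernel. Below is a rough user-space sketch
of the same "copy out through one view, copy back through the other, at
most one page per pass" pattern. Everything prefixed sk_ is hypothetical,
and the XOR transform is only a stand-in so that the result can be
checked. It says nothing about how the hardware actually encrypts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_CHUNK_SIZE 4096	/* stands in for PAGE_SIZE */

/* Single bounce buffer, mirroring me_early_buffer in the patch. */
static unsigned char sk_bounce[SK_CHUNK_SIZE];

/*
 * Hypothetical stand-in for what happens when data is written through a
 * mapping with a different encryption attribute. XOR is used only
 * because it is easy to verify; it is not how SME works.
 */
static void sk_toy_transform(unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		p[i] ^= 0x5a;
}

/* Convert a region in place, one chunk at a time; returns chunks processed. */
static size_t sk_convert_in_place(unsigned char *mem, size_t size)
{
	size_t chunks = 0;

	assert(mem != NULL);

	while (size) {
		size_t len = size < sizeof(sk_bounce) ? size : sizeof(sk_bounce);

		memcpy(sk_bounce, mem, len);		/* read via the source view */
		sk_toy_transform(sk_bounce, len);	/* models the destination view */
		memcpy(mem, sk_bounce, len);		/* write back to the same place */

		mem += len;
		size -= len;
		chunks++;
	}

	return chunks;
}
```

The bounce buffer is what makes the in-place conversion safe here: the
source and destination refer to the same bytes, so the data is staged
before it is written back.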
[RFC PATCH v1 14/18] iommu/amd: AMD IOMMU support for memory encryption
Add support to the AMD IOMMU driver to set the memory encryption mask if
memory encryption is enabled.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/mem_encrypt.h |    2 ++
 arch/x86/mm/mem_encrypt.c          |    5 +
 drivers/iommu/amd_iommu.c          |   10 ++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index d17d8cf..55163e4 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -39,6 +39,8 @@ void __init sme_early_init(void);
 /* Architecture __weak replacement functions */
 void __init mem_encrypt_init(void);
 
+unsigned long amd_iommu_get_me_mask(void);
+
 unsigned long swiotlb_get_me_mask(void);
 void swiotlb_set_mem_dec(void *vaddr, unsigned long size);
 
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 594dc65..6efceb8 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -179,6 +179,11 @@ void __init mem_encrypt_init(void)
 	swiotlb_clear_encryption();
 }
 
+unsigned long amd_iommu_get_me_mask(void)
+{
+	return sme_me_mask;
+}
+
 unsigned long swiotlb_get_me_mask(void)
 {
 	return sme_me_mask;
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 5efadad..5dc8f52 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -156,6 +156,15 @@ struct dma_ops_domain {
 	struct aperture_range *aperture[APERTURE_MAX_RANGES];
 };
 
+/*
+ * Support for memory encryption. If memory encryption is supported, then an
+ * override to this function will be provided.
+ */
+unsigned long __weak amd_iommu_get_me_mask(void)
+{
+	return 0;
+}
+
 /****************************************************************************
  *
  * Helper functions
@@ -2612,6 +2621,7 @@ static dma_addr_t __map_single(struct device *dev,
 	if (address == DMA_ERROR_CODE)
 		goto out;
 
+	paddr |= amd_iommu_get_me_mask();
 	start = address;
 	for (i = 0; i < pages; ++i) {
 		ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
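To illustrate the __weak default used above, here is a small stand-alone
sketch, assuming a GCC or Clang toolchain for the weak attribute. The
names sk_get_me_mask(), sk_map_pages() and SK_PAGE_SIZE are hypothetical,
and the per-page increment of the physical address is my assumption about
the rest of the mapping loop, which the hunk does not show.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096u	/* stands in for PAGE_SIZE */

/* Weak default: used unless another object file supplies a strong definition. */
__attribute__((weak)) uint64_t sk_get_me_mask(void)
{
	return 0;
}

/* Record the physical address that would be mapped for each page. */
static void sk_map_pages(uint64_t paddr, size_t pages, uint64_t *out)
{
	/* Apply the mask once, before walking the pages, as the patch does. */
	paddr |= sk_get_me_mask();

	for (size_t i = 0; i < pages; i++) {
		out[i] = paddr;
		paddr += SK_PAGE_SIZE;	/* assumed step; not shown in the hunk */
	}
}
```

Built on its own, the weak default returns 0 and the addresses pass
through unchanged. Linking in a strong sk_get_me_mask() would change the
result without touching the caller, which seems to be the point of the
override in the patch.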
[RFC PATCH v1 17/18] x86/kvm: Enable Secure Memory Encryption of nested page tables
Update the KVM support to include the memory encryption mask when creating and using nested page tables. Signed-off-by: Tom Lendacky--- arch/x86/include/asm/kvm_host.h |2 +- arch/x86/kvm/mmu.c |7 +-- arch/x86/kvm/vmx.c |2 +- arch/x86/kvm/x86.c |3 ++- 4 files changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b7e3944..75f1e30 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1012,7 +1012,7 @@ void kvm_mmu_setup(struct kvm_vcpu *vcpu); void kvm_mmu_init_vm(struct kvm *kvm); void kvm_mmu_uninit_vm(struct kvm *kvm); void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, - u64 dirty_mask, u64 nx_mask, u64 x_mask); + u64 dirty_mask, u64 nx_mask, u64 x_mask, u64 me_mask); void kvm_mmu_reset_context(struct kvm_vcpu *vcpu); void kvm_mmu_slot_remove_write_access(struct kvm *kvm, diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 4c6972f..5c7d939 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -121,7 +121,7 @@ module_param(dbg, bool, 0644); * PT32_LEVEL_BITS))) - 1)) #define PT64_PERM_MASK (PT_PRESENT_MASK | PT_WRITABLE_MASK | shadow_user_mask \ - | shadow_x_mask | shadow_nx_mask) + | shadow_x_mask | shadow_nx_mask | shadow_me_mask) #define ACC_EXEC_MASK1 #define ACC_WRITE_MASK PT_WRITABLE_MASK @@ -175,6 +175,7 @@ static u64 __read_mostly shadow_user_mask; static u64 __read_mostly shadow_accessed_mask; static u64 __read_mostly shadow_dirty_mask; static u64 __read_mostly shadow_mmio_mask; +static u64 __read_mostly shadow_me_mask; static void mmu_spte_set(u64 *sptep, u64 spte); static void mmu_free_roots(struct kvm_vcpu *vcpu); @@ -282,13 +283,14 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte) } void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, - u64 dirty_mask, u64 nx_mask, u64 x_mask) + u64 dirty_mask, u64 nx_mask, u64 x_mask, u64 me_mask) { shadow_user_mask = user_mask; shadow_accessed_mask = accessed_mask; 
shadow_dirty_mask = dirty_mask; shadow_nx_mask = nx_mask; shadow_x_mask = x_mask; + shadow_me_mask = me_mask; } EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes); @@ -2549,6 +2551,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, pte_access &= ~ACC_WRITE_MASK; spte |= (u64)pfn << PAGE_SHIFT; + spte |= shadow_me_mask; if (pte_access & ACC_WRITE_MASK) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index d5908bd..5d8eb4b 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -6351,7 +6351,7 @@ static __init int hardware_setup(void) kvm_mmu_set_mask_ptes(0ull, (enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull, (enable_ept_ad_bits) ? VMX_EPT_DIRTY_BIT : 0ull, - 0ull, VMX_EPT_EXECUTABLE_MASK); + 0ull, VMX_EPT_EXECUTABLE_MASK, 0ull); ept_set_mmio_spte_mask(); kvm_enable_tdp(); } else diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 12f33e6..9432e27 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -67,6 +67,7 @@ #include #include #include +#include #define MAX_IO_MSRS 256 #define KVM_MAX_MCE_BANKS 32 @@ -5859,7 +5860,7 @@ int kvm_arch_init(void *opaque) kvm_x86_ops = ops; kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK, - PT_DIRTY_MASK, PT64_NX_MASK, 0); + PT_DIRTY_MASK, PT64_NX_MASK, 0, sme_me_mask); kvm_timer_init(); -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH v1 05/18] x86: Handle reduction in physical address size with SME
When Secure Memory Encryption (SME) is enabled, the physical address
space is reduced. Adjust the x86_phys_bits value to reflect this
reduction.

Signed-off-by: Tom Lendacky
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6bfa36d..b49e7fc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include
@@ -722,6 +723,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
+		c->x86_phys_bits -= sme_get_me_loss();
 		c->x86_capability[CPUID_8000_0008_EBX] = ebx;
 	}
 #ifdef CONFIG_X86_32
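As a worked example of the adjustment in get_cpu_cap(), the sketch below
decodes the two address-size fields from an EAX value and subtracts the
reduction. The struct and function names are hypothetical, and the
numbers used in the note that follows are illustrative inputs, not values
read from any particular processor.

```c
#include <stdint.h>

/* Hypothetical container for the two decoded fields. */
struct sk_addr_bits {
	unsigned int virt_bits;
	unsigned int phys_bits;
};

/* Decode the address sizes from EAX and apply the SME reduction. */
static struct sk_addr_bits sk_decode_addr_bits(uint32_t eax,
					       unsigned int me_loss)
{
	struct sk_addr_bits bits;

	bits.virt_bits = (eax >> 8) & 0xff;
	bits.phys_bits = eax & 0xff;
	bits.phys_bits -= me_loss;	/* bits no longer usable for addressing */

	return bits;
}
```

For instance, an EAX of 0x3030 decodes to 48 virtual and 48 physical
bits. With a reduction of 5, the reported physical size becomes 43 bits
while the virtual size is untouched.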
[RFC PATCH v1 16/18] x86: Do not specify encrypted memory for VGA mapping
Since the VGA memory needs to be accessed unencrypted be sure that the
memory encryption mask is not set for the VGA range being mapped.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/vga.h |   13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/vga.h b/arch/x86/include/asm/vga.h
index c4b9dc2..55fe164 100644
--- a/arch/x86/include/asm/vga.h
+++ b/arch/x86/include/asm/vga.h
@@ -7,12 +7,25 @@
 #ifndef _ASM_X86_VGA_H
 #define _ASM_X86_VGA_H
 
+#include
+
 /*
  *	On the PC, we can just recalculate addresses and then
  *	access the videoram directly without any black magic.
+ *	To support memory encryption however, we need to access
+ *	the videoram as un-encrypted memory.
  */
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#define VGA_MAP_MEM(x, s)					\
+({								\
+	unsigned long start = (unsigned long)phys_to_virt(x);	\
+	sme_set_mem_dec((void *)start, s);			\
+	start;							\
+})
+#else
 #define VGA_MAP_MEM(x, s) (unsigned long)phys_to_virt(x)
+#endif
 
 #define vga_readb(x) (*(x))
 #define vga_writeb(x, y) (*(y) = (x))
[RFC PATCH v1 00/18] x86: Secure Memory Encryption (AMD)
This RFC patch series provides support for AMD's new Secure Memory
Encryption (SME) feature.

SME can be used to mark individual pages of memory as encrypted through the
page tables. A page of memory that is marked encrypted will be automatically
decrypted when read from DRAM and will be automatically encrypted when
written to DRAM. Details on SME can be found in the links below.

The SME feature is identified through a CPUID function and enabled through
the SYSCFG MSR. Once enabled, page table entries will determine how the
memory is accessed. If a page table entry has the memory encryption mask set,
then that memory will be accessed as encrypted memory. The memory encryption
mask (as well as other related information) is determined from settings
returned through the same CPUID function that identifies the presence of the
feature.

The approach that this patch series takes is to encrypt everything possible,
starting early in the boot, where the kernel itself is encrypted. Using the
page table macros, the encryption mask can be incorporated into all page
table entries and page allocations. By updating the protection map, userspace
allocations are also marked encrypted. Certain data must be accounted for
as having been placed in memory before SME was enabled (EFI, initrd, etc.)
and accessed accordingly.

This patch series is a precursor to another AMD processor feature called
Secure Encrypted Virtualization (SEV). The support for SEV will build upon
the SME support and will be submitted later. Details on SEV can be found
in the links below.

The following links provide additional detail:

AMD Memory Encryption whitepaper:
   http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf

AMD64 Architecture Programmer's Manual:
   http://support.amd.com/TechDocs/24593.pdf
   SME is section 7.10
   SEV is section 15.34

This patch series is based off of the master branch of tip.
Commit 8d54fcebd9b3 ("Merge branch 'x86/urgent'") --- Tom Lendacky (18): x86: Set the write-protect cache mode for AMD processors x86: Secure Memory Encryption (SME) build enablement x86: Secure Memory Encryption (SME) support x86: Add the Secure Memory Encryption cpu feature x86: Handle reduction in physical address size with SME x86: Provide general kernel support for memory encryption x86: Extend the early_memmap support with additional attrs x86: Add support for early encryption/decryption of memory x86: Insure that memory areas are encrypted when possible x86/efi: Access EFI related tables in the clear x86: Decrypt trampoline area if memory encryption is active x86: Access device tree in the clear x86: DMA support for memory encryption iommu/amd: AMD IOMMU support for memory encryption x86: Enable memory encryption on the APs x86: Do not specify encrypted memory for VGA mapping x86/kvm: Enable Secure Memory Encryption of nested page tables x86: Add support to turn on Secure Memory Encryption Documentation/kernel-parameters.txt |3 arch/x86/Kconfig |9 + arch/x86/include/asm/cacheflush.h|3 arch/x86/include/asm/cpufeature.h|1 arch/x86/include/asm/cpufeatures.h |5 arch/x86/include/asm/dma-mapping.h |5 arch/x86/include/asm/fixmap.h| 16 ++ arch/x86/include/asm/kvm_host.h |2 arch/x86/include/asm/mem_encrypt.h | 99 ++ arch/x86/include/asm/msr-index.h |2 arch/x86/include/asm/pgtable_types.h | 49 +++-- arch/x86/include/asm/processor.h |3 arch/x86/include/asm/realmode.h | 12 + arch/x86/include/asm/vga.h | 13 + arch/x86/kernel/Makefile |2 arch/x86/kernel/asm-offsets.c|2 arch/x86/kernel/cpu/common.c |2 arch/x86/kernel/cpu/scattered.c |1 arch/x86/kernel/devicetree.c |6 - arch/x86/kernel/espfix_64.c |2 arch/x86/kernel/head64.c | 100 +- arch/x86/kernel/head_64.S| 42 +++- arch/x86/kernel/machine_kexec_64.c |2 arch/x86/kernel/mem_encrypt.S| 343 ++ arch/x86/kernel/pci-dma.c| 11 + arch/x86/kernel/pci-nommu.c |2 arch/x86/kernel/pci-swiotlb.c|8 + arch/x86/kernel/setup.c | 14 + 
arch/x86/kernel/x8664_ksyms_64.c |6 + arch/x86/kvm/mmu.c |7 - arch/x86/kvm/vmx.c |2 arch/x86/kvm/x86.c |3 arch/x86/mm/Makefile |1 arch/x86/mm/fault.c |5 arch/x86/mm/ioremap.c| 31 +++ arch/x86/mm/kasan_init_64.c |4 arch/x86/mm/mem_encrypt.c| 201 arch/x86/mm/pageattr.c | 78 arch/x86/mm/pat.c| 11 +
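To make the "encryption mask in the page table entry" idea from the cover
letter concrete, here is a minimal sketch of setting, clearing and testing
such a mask on a 64-bit entry. The sk_ helpers are hypothetical and take
the mask as a parameter. In the series itself the mask lives in
sme_me_mask and is folded in through macros such as __sme_pa().

```c
#include <stdint.h>

/* Mark an entry (or physical address) as referring to encrypted memory. */
static inline uint64_t sk_entry_set_enc(uint64_t entry, uint64_t me_mask)
{
	return entry | me_mask;
}

/* Strip the encryption mask so the memory is accessed unencrypted. */
static inline uint64_t sk_entry_clear_enc(uint64_t entry, uint64_t me_mask)
{
	return entry & ~me_mask;
}

/* Non-zero if encryption is active and the entry carries the mask. */
static inline int sk_entry_is_enc(uint64_t entry, uint64_t me_mask)
{
	return me_mask != 0 && (entry & me_mask) == me_mask;
}
```

When the mask is 0 (SME not enabled) all three helpers degrade to no-ops,
which matches how the series lets the same code run with and without the
feature.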
[RFC PATCH v1 18/18] x86: Add support to turn on Secure Memory Encryption
This patch adds the support to check for and enable SME when available
on the processor and when the mem_encrypt=on command line option is set.
This consists of setting the encryption mask, calculating the number of
physical bits of addressing lost and encrypting the kernel "in place."

Signed-off-by: Tom Lendacky
---
 Documentation/kernel-parameters.txt |    3
 arch/x86/kernel/asm-offsets.c       |    2
 arch/x86/kernel/mem_encrypt.S       |  306 +++
 3 files changed, 311 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 8ba7f82..0a2678a 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2210,6 +2210,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			memory contents and reserves bad memory
 			regions that are detected.
 
+	mem_encrypt=on	[X86_64] Enable memory encryption on processors
+			that support this feature.
+
 	meye.*=		[HW] Set MotionEye Camera parameters
 			See Documentation/video4linux/meye.txt.
 
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 5c04246..a0f76de 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -82,6 +82,8 @@ void common(void) {
 	OFFSET(BP_kernel_alignment, boot_params, hdr.kernel_alignment);
 	OFFSET(BP_pref_address, boot_params, hdr.pref_address);
 	OFFSET(BP_code32_start, boot_params, hdr.code32_start);
+	OFFSET(BP_cmd_line_ptr, boot_params, hdr.cmd_line_ptr);
+	OFFSET(BP_ext_cmd_line_ptr, boot_params, ext_cmd_line_ptr);
 
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
diff --git a/arch/x86/kernel/mem_encrypt.S b/arch/x86/kernel/mem_encrypt.S
index f2e0536..4d3326d 100644
--- a/arch/x86/kernel/mem_encrypt.S
+++ b/arch/x86/kernel/mem_encrypt.S
@@ -12,13 +12,236 @@
 
 #include
 
+#include
+#include
+#include
+#include
+#include
+
 	.text
 	.code64
 ENTRY(sme_enable)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/* Check for AMD processor */
+	xorl	%eax, %eax
+	cpuid
+	cmpl	$0x68747541, %ebx	# AuthenticAMD
+	jne	.Lno_mem_encrypt
+	cmpl	$0x69746e65, %edx
+	jne	.Lno_mem_encrypt
+	cmpl	$0x444d4163, %ecx
+	jne	.Lno_mem_encrypt
+
+	/* Check for memory encryption leaf */
+	movl	$0x80000000, %eax
+	cpuid
+	cmpl	$0x8000001f, %eax
+	jb	.Lno_mem_encrypt
+
+	/*
+	 * Check for memory encryption feature:
+	 *   CPUID Fn8000_001F[EAX] - Bit 0
+	 */
+	movl	$0x8000001f, %eax
+	cpuid
+	bt	$0, %eax
+	jnc	.Lno_mem_encrypt
+
+	/* Check for the mem_encrypt=on command line option */
+	push	%rsi			/* Save RSI (real_mode_data) */
+	movl	BP_ext_cmd_line_ptr(%rsi), %ecx
+	shlq	$32, %rcx
+	movl	BP_cmd_line_ptr(%rsi), %edi
+	addq	%rcx, %rdi
+	leaq	mem_encrypt_enable_option(%rip), %rsi
+	call	cmdline_find_option_bool
+	pop	%rsi			/* Restore RSI (real_mode_data) */
+	testl	%eax, %eax
+	jz	.Lno_mem_encrypt
+
+	/*
+	 * Get memory encryption information:
+	 *   CPUID Fn8000_001F[EBX] - Bits 5:0
+	 *   Pagetable bit position used to indicate encryption
+	 */
+	movl	%ebx, %ecx
+	andl	$0x3f, %ecx
+	jz	.Lno_mem_encrypt
+	bts	%ecx, sme_me_mask(%rip)
+	shrl	$6, %ebx
+
+	/*
+	 * Get memory encryption information:
+	 *   CPUID Fn8000_001F[EBX] - Bits 11:6
+	 *   Reduction in physical address space (in bits) when enabled
+	 */
+	movl	%ebx, %ecx
+	andl	$0x3f, %ecx
+	movb	%cl, sme_me_loss(%rip)
+
+	/*
+	 * Enable memory encryption through the SYSCFG MSR
+	 */
+	movl	$MSR_K8_SYSCFG, %ecx
+	rdmsr
+	bt	$MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
+	jc	.Lmem_encrypt_exit
+	bts	$MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
+	wrmsr
+	jmp	.Lmem_encrypt_exit
+
+.Lno_mem_encrypt:
+	/* Did not get enabled, clear settings */
+	movq	$0, sme_me_mask(%rip)
+	movb	$0, sme_me_loss(%rip)
+
+.Lmem_encrypt_exit:
+#endif	/* CONFIG_AMD_MEM_ENCRYPT */
+
 	ret
 ENDPROC(sme_enable)
 
 ENTRY(sme_encrypt_kernel)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	cmpq	$0, sme_me_mask(%rip)
+	jz	.Lencrypt_exit
+
+	/*
+	 * Encrypt the kernel.
+	 * Pagetables for performing kernel encryption:
+	 *   0x00 - 0x00 will map just the memory occupied by
+	 * the kernel as encrypted memory
+	 *   0x80 - 0x80
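For anyone who would rather read the EBX decoding in C than in assembly,
the two "Get memory encryption information" steps in sme_enable can be
sketched roughly as below. struct sk_sme_info and sk_decode_sme_ebx() are
hypothetical names. The field layout (bits 5:0 for the pagetable bit
position, bits 11:6 for the address-space reduction) is taken from the
comments in the patch.

```c
#include <stdint.h>

/* Hypothetical container for the two values sme_enable derives from EBX. */
struct sk_sme_info {
	uint64_t me_mask;	/* single bit set at the pagetable bit position */
	uint8_t  me_loss;	/* physical address bits lost when enabled */
};

/* Decode CPUID Fn8000_001F[EBX] as described in the patch comments. */
static struct sk_sme_info sk_decode_sme_ebx(uint32_t ebx)
{
	struct sk_sme_info info = { 0, 0 };
	unsigned int bit = ebx & 0x3f;		/* bits 5:0 */

	/* The assembly bails out to .Lno_mem_encrypt when this field is 0. */
	if (bit == 0)
		return info;

	info.me_mask = 1ULL << bit;
	info.me_loss = (ebx >> 6) & 0x3f;	/* bits 11:6 */

	return info;
}
```

A zero bit position leaves both values cleared, matching the
.Lno_mem_encrypt path, so callers only need to test me_mask to know
whether encryption is in use.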
[RFC PATCH v1 09/18] x86: Insure that memory areas are encrypted when possible
Encrypt memory areas in place when possible (e.g. zero page, etc.) so
that special handling isn't needed afterwards.

Signed-off-by: Tom Lendacky
---
 arch/x86/kernel/head64.c |   90 +++---
 arch/x86/kernel/setup.c  |    8
 2 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 3516f9f..ac3a2bf 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -47,12 +47,12 @@ static void __init reset_early_page_tables(void)
 }
 
 /* Create a new PMD entry */
-int __init early_make_pgtable(unsigned long address)
+static int __init __early_make_pgtable(unsigned long address, pmdval_t pmd)
 {
 	unsigned long physaddr = address - __PAGE_OFFSET;
 	pgdval_t pgd, *pgd_p;
 	pudval_t pud, *pud_p;
-	pmdval_t pmd, *pmd_p;
+	pmdval_t *pmd_p;
 
 	/* Invalid address or early pgt is done ? */
 	if (physaddr >= MAXMEM || read_cr3() != __sme_pa_nodebug(early_level4_pgt))
@@ -94,12 +94,92 @@ again:
 		memset(pmd_p, 0, sizeof(*pmd_p) * PTRS_PER_PMD);
 		*pud_p = (pudval_t)pmd_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
 	}
-	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
 	pmd_p[pmd_index(address)] = pmd;
 
 	return 0;
 }
 
+int __init early_make_pgtable(unsigned long address)
+{
+	unsigned long physaddr = address - __PAGE_OFFSET;
+	pmdval_t pmd;
+
+	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
+
+	return __early_make_pgtable(address, pmd);
+}
+
+static void __init create_unencrypted_mapping(void *address, unsigned long size)
+{
+	unsigned long physaddr = (unsigned long)address - __PAGE_OFFSET;
+	pmdval_t pmd_flags, pmd;
+
+	if (!sme_me_mask)
+		return;
+
+	/* Clear the encryption mask from the early_pmd_flags */
+	pmd_flags = early_pmd_flags & ~sme_me_mask;
+
+	do {
+		pmd = (physaddr & PMD_MASK) + pmd_flags;
+		__early_make_pgtable((unsigned long)address, pmd);
+
+		address += PMD_SIZE;
+		physaddr += PMD_SIZE;
+		size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
+	} while (size);
+}
+
+static void __init __clear_mapping(unsigned long address)
+{
+	unsigned long physaddr = address - __PAGE_OFFSET;
+	pgdval_t pgd, *pgd_p;
+	pudval_t pud, *pud_p;
+	pmdval_t *pmd_p;
+
+	/* Invalid address or early pgt is done ? */
+	if (physaddr >= MAXMEM ||
+	    read_cr3() != __sme_pa_nodebug(early_level4_pgt))
+		return;
+
+	pgd_p = &early_level4_pgt[pgd_index(address)].pgd;
+	pgd = *pgd_p;
+
+	if (!pgd)
+		return;
+
+	/*
+	 * The use of __START_KERNEL_map rather than __PAGE_OFFSET here matches
+	 * __early_make_pgtable where the entry was created.
+	 */
+	pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+	pud_p += pud_index(address);
+	pud = *pud_p;
+
+	if (!pud)
+		return;
+
+	pmd_p = (pmdval_t *)((pud & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+	pmd_p[pmd_index(address)] = 0;
+}
+
+static void __init clear_mapping(void *address, unsigned long size)
+{
+	do {
+		__clear_mapping((unsigned long)address);
+
+		address += PMD_SIZE;
+		size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
+	} while (size);
+}
+
+static void __init sme_memcpy(void *dst, void *src, unsigned long size)
+{
+	create_unencrypted_mapping(src, size);
+	memcpy(dst, src, size);
+	clear_mapping(src, size);
+}
+
 /* Don't add a printk in there. printk relies on the PDA which is not initialized
    yet. */
 static void __init clear_bss(void)
@@ -122,12 +202,12 @@ static void __init copy_bootdata(char *real_mode_data)
 	char * command_line;
 	unsigned long cmd_line_ptr;
 
-	memcpy(&boot_params, real_mode_data, sizeof boot_params);
+	sme_memcpy(&boot_params, real_mode_data, sizeof boot_params);
 	sanitize_boot_params(&boot_params);
 	cmd_line_ptr = get_cmd_line_ptr();
 	if (cmd_line_ptr) {
 		command_line = __va(cmd_line_ptr);
-		memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
+		sme_memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
 	}
 }
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2367ae0..1d29cf9 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -113,6 +113,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -375,6 +376,13 @@ static void __init reserve_initrd(void)
 	    !ramdisk_image || !ramdisk_size)
 		return;		/* No initrd provided by bootloader */
 
+	/*
+	 * This memory is marked
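
For readers following the PMD-stepping loop used by create_unencrypted_mapping() and clear_mapping() in the patch above, here is a minimal user-space sketch of just the loop arithmetic. It is not kernel code: pmd_entries_touched() is a hypothetical helper name, and PMD_SHIFT is assumed to be 21 (2 MiB PMD mappings on x86-64). It only counts how many times the loop body would run for a given size.

```c
#include <assert.h>

#define PMD_SHIFT 21                    /* assumption: 2 MiB PMD mappings */
#define PMD_SIZE  (1UL << PMD_SHIFT)

/*
 * Hypothetical helper: counts how many PMD entries the do/while loop
 * from the patch would touch for a copy of 'size' bytes.
 */
static unsigned long pmd_entries_touched(unsigned long size)
{
	unsigned long n = 0;

	do {
		n++;	/* stands in for __early_make_pgtable() or __clear_mapping() */
		size = (size < PMD_SIZE) ? 0 : size - PMD_SIZE;
	} while (size);

	/* A do/while body always runs at least once, even for size == 0. */
	assert(n >= 1);
	return n;
}
```

Under these assumptions, a copy the size of the command line or boot_params touches one entry, and the count only considers the size, not where the start address sits inside its 2 MiB region.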
[RFC PATCH v1 15/18] x86: Enable memory encryption on the APs
Add support to set the memory encryption enable flag on the APs during
realmode initialization. When an AP is started it checks this flag, and
if set, enables memory encryption on its core.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/msr-index.h     |    2 ++
 arch/x86/include/asm/realmode.h      |   12
 arch/x86/realmode/init.c             |    4
 arch/x86/realmode/rm/trampoline_64.S |   14 ++
 4 files changed, 32 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 94555b4..b73182b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -349,6 +349,8 @@
 #define MSR_K8_TOP_MEM1			0xc001001a
 #define MSR_K8_TOP_MEM2			0xc001001d
 #define MSR_K8_SYSCFG			0xc0010010
+#define MSR_K8_SYSCFG_MEM_ENCRYPT_BIT	23
+#define MSR_K8_SYSCFG_MEM_ENCRYPT	(1ULL << MSR_K8_SYSCFG_MEM_ENCRYPT_BIT)
 #define MSR_K8_INT_PENDING_MSG		0xc0010055
 /* C1E active bits in int pending message */
 #define K8_INTP_C1E_ACTIVE_MASK		0x18000000
diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
index 9c6b890..e24d2ec 100644
--- a/arch/x86/include/asm/realmode.h
+++ b/arch/x86/include/asm/realmode.h
@@ -1,6 +1,15 @@
 #ifndef _ARCH_X86_REALMODE_H
 #define _ARCH_X86_REALMODE_H
 
+/*
+ * Flag bit definitions for use with the flags field of the trampoline header
+ * when configured for X86_64
+ */
+#define TH_FLAGS_MEM_ENCRYPT_BIT	0
+#define TH_FLAGS_MEM_ENCRYPT		(1ULL << TH_FLAGS_MEM_ENCRYPT_BIT)
+
+#ifndef __ASSEMBLY__
+
 #include
 #include
 
@@ -38,6 +47,7 @@ struct trampoline_header {
 	u64 start;
 	u64 efer;
 	u32 cr4;
+	u32 flags;
 #endif
 };
 
@@ -61,4 +71,6 @@ extern unsigned char secondary_startup_64[];
 void reserve_real_mode(void);
 void setup_real_mode(void);
 
+#endif /* __ASSEMBLY__ */
+
 #endif /* _ARCH_X86_REALMODE_H */
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 85b145c..657532b 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -84,6 +84,10 @@ void __init setup_real_mode(void)
 	trampoline_cr4_features = &trampoline_header->cr4;
 	*trampoline_cr4_features = __read_cr4();
 
+	trampoline_header->flags = 0;
+	if (sme_me_mask)
+		trampoline_header->flags |= TH_FLAGS_MEM_ENCRYPT;
+
 	trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
 	trampoline_pgd[0] = init_level4_pgt[pgd_index(__PAGE_OFFSET)].pgd;
 	trampoline_pgd[511] = init_level4_pgt[511].pgd;
diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
index dac7b20..8d84167 100644
--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "realmode.h"
 
 	.text
@@ -109,6 +110,18 @@ ENTRY(startup_32)
 	movl	$(X86_CR0_PG | X86_CR0_WP | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 
+	# Check for and enable memory encryption support
+	movl	pa_tr_flags, %eax
+	bt	$TH_FLAGS_MEM_ENCRYPT_BIT, pa_tr_flags
+	jnc	.Ldone
+	movl	$MSR_K8_SYSCFG, %ecx
+	rdmsr
+	bt	$MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
+	jc	.Ldone
+	bts	$MSR_K8_SYSCFG_MEM_ENCRYPT_BIT, %eax
+	wrmsr
+.Ldone:
+
 	/*
 	 * At this point we're in long mode but in 32bit compatibility mode
 	 * with EFER.LME = 1, CS.L = 0, CS.D = 1 (and in turn
@@ -147,6 +160,7 @@ GLOBAL(trampoline_header)
 	tr_start:		.space	8
 	GLOBAL(tr_efer)		.space	8
 	GLOBAL(tr_cr4)		.space	4
+	GLOBAL(tr_flags)	.space	4
 END(trampoline_header)
 
 #include "trampoline_common.S"
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH v1 03/18] x86: Secure Memory Encryption (SME) support
Provide support for Secure Memory Encryption (SME). This initial support
defines the memory encryption mask as a variable for quick access and an
accessor for retrieving the number of physical addressing bits lost if
SME is enabled.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/mem_encrypt.h |   37
 arch/x86/kernel/Makefile           |    2 ++
 arch/x86/kernel/mem_encrypt.S      |   29
 arch/x86/kernel/x8664_ksyms_64.c   |    6 ++
 4 files changed, 74 insertions(+)
 create mode 100644 arch/x86/include/asm/mem_encrypt.h
 create mode 100644 arch/x86/kernel/mem_encrypt.S

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
new file mode 100644
index 0000000..747fc52
--- /dev/null
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -0,0 +1,37 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __X86_MEM_ENCRYPT_H__
+#define __X86_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+
+extern unsigned long sme_me_mask;
+
+u8 sme_get_me_loss(void);
+
+#else	/* !CONFIG_AMD_MEM_ENCRYPT */
+
+#define sme_me_mask		0UL
+
+static inline u8 sme_get_me_loss(void)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_AMD_MEM_ENCRYPT */
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9abf855..11536d9 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -126,6 +126,8 @@ obj-$(CONFIG_EFI)			+= sysfb_efi.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
 obj-$(CONFIG_TRACING)			+= tracepoint.o
 
+obj-y					+= mem_encrypt.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/mem_encrypt.S b/arch/x86/kernel/mem_encrypt.S
new file mode 100644
index 0000000..ef7f325
--- /dev/null
+++ b/arch/x86/kernel/mem_encrypt.S
@@ -0,0 +1,29 @@
+/*
+ * AMD Memory Encryption Support
+ *
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ *
+ * Author: Tom Lendacky
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include
+
+	.text
+	.code64
+ENTRY(sme_get_me_loss)
+	xor	%rax, %rax
+	mov	sme_me_loss(%rip), %al
+	ret
+ENDPROC(sme_get_me_loss)
+
+	.data
+	.align 16
+ENTRY(sme_me_mask)
+	.quad	0x0000000000000000
+sme_me_loss:
+	.byte	0x00
+	.align	8
diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index cd05942..72cb689 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_FUNCTION_TRACER
 /* mcount and __fentry__ are defined in assembly */
@@ -79,3 +80,8 @@ EXPORT_SYMBOL(native_load_gs_index);
 EXPORT_SYMBOL(___preempt_schedule);
 EXPORT_SYMBOL(___preempt_schedule_notrace);
 #endif
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+EXPORT_SYMBOL_GPL(sme_me_mask);
+EXPORT_SYMBOL_GPL(sme_get_me_loss);
+#endif
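
To make the relationship between the two values introduced here more concrete, the following user-space sketch decodes a raw CPUID Fn8000_001F[EBX] value into a mask and a loss count. It assumes the field layout from AMD's documentation (bits 5:0 give the page-table bit position used for encryption, bits 11:6 give the reduction in physical address bits); struct sme_info and sme_decode_ebx() are hypothetical names that do not appear in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values the patch keeps in variables. */
struct sme_info {
	uint64_t me_mask;	/* analogue of sme_me_mask */
	uint8_t  me_loss;	/* analogue of the value sme_get_me_loss() returns */
};

/* Decode a raw CPUID Fn8000_001F[EBX] value (assumed field layout). */
static struct sme_info sme_decode_ebx(uint32_t ebx)
{
	struct sme_info info;
	unsigned int c_bit = ebx & 0x3f;	/* bits 5:0 */

	assert(c_bit < 64);			/* keeps the shift well defined */
	info.me_mask = 1ULL << c_bit;
	info.me_loss = (ebx >> 6) & 0x3f;	/* bits 11:6 */
	return info;
}
```

For example, an EBX value of 0x16f would decode to encryption bit 47 and five lost physical address bits under these assumptions.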
[RFC PATCH v1 06/18] x86: Provide general kernel support for memory encryption
Adding general kernel support for memory encryption includes:
- Modify and create some page table macros to include the Secure Memory
  Encryption (SME) memory encryption mask
- Update kernel boot support to call an SME routine that checks for and
  sets the SME capability (the SME routine will grow later and for now
  is just a stub routine)
- Update kernel boot support to call an SME routine that encrypts the
  kernel (the SME routine will grow later and for now is just a stub
  routine)
- Provide an SME initialization routine to update the protection map with
  the memory encryption mask so that it is used by default

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/fixmap.h        |    7 ++
 arch/x86/include/asm/mem_encrypt.h   |   18 +++
 arch/x86/include/asm/pgtable_types.h |   41 ++---
 arch/x86/include/asm/processor.h     |    3 ++
 arch/x86/kernel/espfix_64.c          |    2 +-
 arch/x86/kernel/head64.c             |   10 ++--
 arch/x86/kernel/head_64.S            |   42 ++
 arch/x86/kernel/machine_kexec_64.c   |    2 +-
 arch/x86/kernel/mem_encrypt.S        |    8 ++
 arch/x86/mm/Makefile                 |    1 +
 arch/x86/mm/fault.c                  |    5 ++--
 arch/x86/mm/ioremap.c                |    3 ++
 arch/x86/mm/kasan_init_64.c          |    4 ++-
 arch/x86/mm/mem_encrypt.c            |   30
 arch/x86/mm/pageattr.c               |    3 ++
 15 files changed, 145 insertions(+), 34 deletions(-)
 create mode 100644 arch/x86/mm/mem_encrypt.c

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 8554f96..83e91f0 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -153,6 +153,13 @@ static inline void __set_fixmap(enum fixed_addresses idx,
 }
 #endif
 
+/*
+ * Fixmap settings used with memory encryption
+ *   - FIXMAP_PAGE_NOCACHE is used for MMIO so make sure the memory
+ *     encryption mask is not part of the page attributes
+ */
+#define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE
+
 #include
 
 #define __late_set_fixmap(idx, phys, flags) __set_fixmap(idx, phys, flags)
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 747fc52..9f3e762 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -15,12 +15,21 @@
 
 #ifndef __ASSEMBLY__
 
+#include
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 
 extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+void __init sme_early_init(void);
+
+#define __sme_pa(x)		(__pa((x)) | sme_me_mask)
+#define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
+
+#define __sme_va(x)		(__va((x) & ~sme_me_mask))
+
 #else	/* !CONFIG_AMD_MEM_ENCRYPT */
 
 #define sme_me_mask		0UL
@@ -30,6 +39,15 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline void __init sme_early_init(void)
+{
+}
+
+#define __sme_pa		__pa
+#define __sme_pa_nodebug	__pa_nodebug
+
+#define __sme_va		__va
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 7b5efe2..fda7877 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -3,6 +3,7 @@
 
 #include
 #include
+#include
 
 #define FIRST_USER_ADDRESS	0UL
 
@@ -115,9 +116,9 @@
 
 #define _PAGE_PROTNONE	(_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)
 
-#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
+#define __PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |	\
 			 _PAGE_ACCESSED | _PAGE_DIRTY)
-#define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
+#define __KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
 			 _PAGE_DIRTY)
 
 /*
@@ -185,18 +186,30 @@ enum page_cache_mode {
 #define __PAGE_KERNEL_IO		(__PAGE_KERNEL)
 #define __PAGE_KERNEL_IO_NOCACHE	(__PAGE_KERNEL_NOCACHE)
 
-#define PAGE_KERNEL			__pgprot(__PAGE_KERNEL)
-#define PAGE_KERNEL_RO			__pgprot(__PAGE_KERNEL_RO)
-#define PAGE_KERNEL_EXEC		__pgprot(__PAGE_KERNEL_EXEC)
-#define PAGE_KERNEL_RX			__pgprot(__PAGE_KERNEL_RX)
-#define PAGE_KERNEL_NOCACHE		__pgprot(__PAGE_KERNEL_NOCACHE)
-#define PAGE_KERNEL_LARGE		__pgprot(__PAGE_KERNEL_LARGE)
-#define PAGE_KERNEL_LARGE_EXEC		__pgprot(__PAGE_KERNEL_LARGE_EXEC)
-#define PAGE_KERNEL_VSYSCALL		__pgprot(__PAGE_KERNEL_VSYSCALL)
-#define PAGE_KERNEL_VVAR		__pgprot(__PAGE_KERNEL_VVAR)
-
-#define PAGE_KERNEL_IO			__pgprot(__PAGE_KERNEL_IO)
-#define PAGE_KERNEL_IO_NOCACHE		__pgprot(__PAGE_KERNEL_IO_NOCACHE)
+#ifndef __ASSEMBLY__
+
+#define _PAGE_ENC	sme_me_mask
+
+/* Redefine
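
As an illustration of what the __sme_pa() and __sme_va() macros in this patch do to an address, here is a user-space sketch of the mask arithmetic alone. The real macros wrap __pa() and __va(); those translations are left out, and sme_pa() and sme_strip() are hypothetical stand-ins operating on plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical analogue of __sme_pa(): tag a physical address as encrypted. */
static uint64_t sme_pa(uint64_t pa, uint64_t me_mask)
{
	/* The encryption bit is expected to lie above any usable address bit. */
	assert((pa & me_mask) == 0);
	return pa | me_mask;
}

/* Hypothetical analogue of the masking step in __sme_va(): drop the tag. */
static uint64_t sme_strip(uint64_t pa_enc, uint64_t me_mask)
{
	return pa_enc & ~me_mask;
}
```

With a zero mask both helpers return their input unchanged, which mirrors how the !CONFIG_AMD_MEM_ENCRYPT definitions collapse to plain __pa and __va.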
[RFC PATCH v1 02/18] x86: Secure Memory Encryption (SME) build enablement
Provide the Kconfig support to build the SME support in the kernel.

Signed-off-by: Tom Lendacky
---
 arch/x86/Kconfig |    9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7bb1574..13249b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1356,6 +1356,15 @@ config X86_DIRECT_GBPAGES
 	  supports them), so don't confuse the user by printing
 	  that we have them enabled.
 
+config AMD_MEM_ENCRYPT
+	bool "Secure Memory Encryption support for AMD"
+	depends on X86_64 && CPU_SUP_AMD
+	---help---
+	  Say yes to enable the encryption of system memory. This requires
+	  an AMD processor that supports Secure Memory Encryption (SME).
+	  The encryption of system memory is disabled by default but can be
+	  enabled with the mem_encrypt=on command line option.
+
 # Common NUMA Features
 config NUMA
 	bool "Numa Memory Allocation and Scheduler Support"
[RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear
The EFI tables are not encrypted and need to be accessed as such. Be sure
to memmap them without the encryption attribute set. For EFI support that
lives outside of the arch/x86 tree, create a routine that uses the __weak
attribute so that it can be overridden by an architecture specific routine.

When freeing boot services related memory, since it has been mapped as
un-encrypted, be sure to change the mapping to encrypted for future use.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/cacheflush.h  |    3 +
 arch/x86/include/asm/mem_encrypt.h |   22 +++
 arch/x86/kernel/setup.c            |    6 +--
 arch/x86/mm/mem_encrypt.c          |   56 +++
 arch/x86/mm/pageattr.c             |   75
 arch/x86/platform/efi/efi.c        |   26 +++-
 arch/x86/platform/efi/efi_64.c     |    9 +++-
 arch/x86/platform/efi/quirks.c     |   12 +-
 drivers/firmware/efi/efi.c         |   18 +++--
 drivers/firmware/efi/esrt.c        |   12 +++---
 include/linux/efi.h                |    3 +
 11 files changed, 212 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 61518cf..bfb08e5 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -13,6 +13,7 @@
  * Executability  : eXeutable, NoteXecutable
  * Read/Write     : ReadOnly, ReadWrite
  * Presence       : NotPresent
+ * Encryption     : ENCrypted, DECrypted
  *
  * Within a category, the attributes are mutually exclusive.
  *
@@ -48,6 +49,8 @@ int set_memory_ro(unsigned long addr, int numpages);
 int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_np(unsigned long addr, int numpages);
 int set_memory_4k(unsigned long addr, int numpages);
+int set_memory_enc(unsigned long addr, int numpages);
+int set_memory_dec(unsigned long addr, int numpages);
 
 int set_memory_array_uc(unsigned long *addr, int addrinarray);
 int set_memory_array_wc(unsigned long *addr, int addrinarray);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2785493..42868f5 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -23,13 +23,23 @@ extern unsigned long sme_me_mask;
 
 u8 sme_get_me_loss(void);
 
+int sme_set_mem_enc(void *vaddr, unsigned long size);
+int sme_set_mem_dec(void *vaddr, unsigned long size);
+
 void __init sme_early_mem_enc(resource_size_t paddr,
 			      unsigned long size);
 void __init sme_early_mem_dec(resource_size_t paddr,
 			      unsigned long size);
 
+void __init *sme_early_memremap(resource_size_t paddr,
+				unsigned long size);
+
 void __init sme_early_init(void);
 
+/* Architecture __weak replacement functions */
+void __init *efi_me_early_memremap(resource_size_t paddr,
+				   unsigned long size);
+
 #define __sme_pa(x)		(__pa((x)) | sme_me_mask)
 #define __sme_pa_nodebug(x)	(__pa_nodebug((x)) | sme_me_mask)
 
@@ -44,6 +54,16 @@ static inline u8 sme_get_me_loss(void)
 	return 0;
 }
 
+static inline int sme_set_mem_enc(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
+static inline int sme_set_mem_dec(void *vaddr, unsigned long size)
+{
+	return 0;
+}
+
 static inline void __init sme_early_mem_enc(resource_size_t paddr,
 					    unsigned long size)
 {
@@ -63,6 +83,8 @@ static inline void __init sme_early_init(void)
 
 #define __sme_va		__va
 
+#define sme_early_memremap	early_memremap
+
 #endif	/* CONFIG_AMD_MEM_ENCRYPT */
 
 #endif	/* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1d29cf9..2e460fb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -424,7 +424,7 @@ static void __init parse_setup_data(void)
 	while (pa_data) {
 		u32 data_len, data_type;
 
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		data_len = data->len + sizeof(struct setup_data);
 		data_type = data->type;
 		pa_next = data->next;
@@ -457,7 +457,7 @@ static void __init e820_reserve_setup_data(void)
 		return;
 
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data, sizeof(*data));
 		e820_update_range(pa_data, sizeof(*data)+data->len,
 			 E820_RAM, E820_RESERVED_KERN);
 		pa_data = data->next;
@@ -477,7 +477,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 
 	pa_data = boot_params.hdr.setup_data;
 	while (pa_data) {
-		data = early_memremap(pa_data, sizeof(*data));
+		data = sme_early_memremap(pa_data,
[RFC PATCH v1 07/18] x86: Extend the early_memmap support with additional attrs
Add to the early_memmap support to be able to specify encrypted and
un-encrypted mappings with and without write-protection. The use of
write-protection is necessary when encrypting data "in place". The
write-protect attribute is considered cacheable for loads, but not
stores. This implies that the hardware will never give the core a
dirty line with this memtype.

Signed-off-by: Tom Lendacky
---
 arch/x86/include/asm/fixmap.h        |    9 +
 arch/x86/include/asm/pgtable_types.h |    8
 arch/x86/mm/ioremap.c                |   28
 include/asm-generic/early_ioremap.h  |    2 ++
 mm/early_ioremap.c                   |   15 +++
 5 files changed, 62 insertions(+)

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 83e91f0..4d41878 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -160,6 +160,15 @@ static inline void __set_fixmap(enum fixed_addresses idx,
  */
 #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_IO_NOCACHE
 
+void __init *early_memremap_enc(resource_size_t phys_addr,
+				unsigned long size);
+void __init *early_memremap_enc_wp(resource_size_t phys_addr,
+				   unsigned long size);
+void __init *early_memremap_dec(resource_size_t phys_addr,
+				unsigned long size);
+void __init *early_memremap_dec_wp(resource_size_t phys_addr,
+				   unsigned long size);
+
 #include
 
 #define __late_set_fixmap(idx, phys, flags) __set_fixmap(idx, phys, flags)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index fda7877..6291248 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -154,6 +154,7 @@ enum page_cache_mode {
 
 #define _PAGE_CACHE_MASK	(_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 #define _PAGE_NOCACHE		(cachemode2protval(_PAGE_CACHE_MODE_UC))
+#define _PAGE_CACHE_WP		(cachemode2protval(_PAGE_CACHE_MODE_WP))
 
 #define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED)
 #define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \
@@ -182,6 +183,7 @@ enum page_cache_mode {
 #define __PAGE_KERNEL_VVAR		(__PAGE_KERNEL_RO | _PAGE_USER)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_EXEC	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
+#define __PAGE_KERNEL_WP		(__PAGE_KERNEL | _PAGE_CACHE_WP)
 
 #define __PAGE_KERNEL_IO		(__PAGE_KERNEL)
 #define __PAGE_KERNEL_IO_NOCACHE	(__PAGE_KERNEL_NOCACHE)
@@ -196,6 +198,12 @@ enum page_cache_mode {
 #define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED |	\
 			 _PAGE_DIRTY | _PAGE_ENC)
 
+#define __PAGE_KERNEL_ENC	(__PAGE_KERNEL | _PAGE_ENC)
+#define __PAGE_KERNEL_ENC_WP	(__PAGE_KERNEL_WP | _PAGE_ENC)
+
+#define __PAGE_KERNEL_DEC	(__PAGE_KERNEL)
+#define __PAGE_KERNEL_DEC_WP	(__PAGE_KERNEL_WP)
+
 #define PAGE_KERNEL		__pgprot(__PAGE_KERNEL | _PAGE_ENC)
 #define PAGE_KERNEL_RO		__pgprot(__PAGE_KERNEL_RO | _PAGE_ENC)
 #define PAGE_KERNEL_EXEC	__pgprot(__PAGE_KERNEL_EXEC | _PAGE_ENC)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 77dadf5..14c7ed5 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -420,6 +420,34 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
 	iounmap((void __iomem *)((unsigned long)addr & PAGE_MASK));
 }
 
+/* Remap memory with encryption */
+void __init *early_memremap_enc(resource_size_t phys_addr,
+				unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC);
+}
+
+/* Remap memory with encryption and write-protected */
+void __init *early_memremap_enc_wp(resource_size_t phys_addr,
+				   unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_ENC_WP);
+}
+
+/* Remap memory without encryption */
+void __init *early_memremap_dec(resource_size_t phys_addr,
+				unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC);
+}
+
+/* Remap memory without encryption and write-protected */
+void __init *early_memremap_dec_wp(resource_size_t phys_addr,
+				   unsigned long size)
+{
+	return early_memremap_prot(phys_addr, size, __PAGE_KERNEL_DEC_WP);
+}
+
 static pte_t bm_pte[PAGE_SIZE/sizeof(pte_t)] __page_aligned_bss;
 
 static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
diff --git a/include/asm-generic/early_ioremap.h b/include/asm-generic/early_ioremap.h
index 734ad4d..2edef8d 100644
--- a/include/asm-generic/early_ioremap.h
+++ b/include/asm-generic/early_ioremap.h
@@ -13,6 +13,8 @@ extern void *early_memremap(resource_size_t phys_addr,
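
The four early_memremap_*() variants in this patch differ only in which protection value they pass down. A user-space sketch of how those four values are composed may help; the DEMO_* bit values below are made up for illustration and are not the real x86 page attribute encodings (the kernel derives the write-protect bits through cachemode2protval()), and early_prot() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_KERNEL	0x163ULL	/* hypothetical base attribute bits */
#define DEMO_CACHE_WP		0x080ULL	/* hypothetical write-protect cache bits */

/*
 * Hypothetical composition of the four protection values:
 *   ENC    = base | mask        ENC_WP = base | wp | mask
 *   DEC    = base               DEC_WP = base | wp
 */
static uint64_t early_prot(int encrypted, int write_protect, uint64_t me_mask)
{
	uint64_t prot = DEMO_PAGE_KERNEL;

	/* The encryption bit must not collide with the attribute bits. */
	assert((me_mask & (DEMO_PAGE_KERNEL | DEMO_CACHE_WP)) == 0);

	if (write_protect)
		prot |= DEMO_CACHE_WP;
	if (encrypted)
		prot |= me_mask;
	return prot;
}
```

In this model the "dec" variants simply leave the mask out, matching __PAGE_KERNEL_DEC being defined as plain __PAGE_KERNEL in the patch.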
Re: [PATCH 0/6] Intel Secure Guard Extensions
On Tue 2016-04-26 21:59:52, One Thousand Gnomes wrote:
> > But... that will mean that my ssh will need to be SGX-aware, and that
> > I will not be able to switch to AMD machine in future. ... or to other
> > Intel machine for that matter, right?
>
> I'm not privy to AMD's CPU design plans.
>
> However I think for the ssl/ssh case you'd use the same interfaces
> currently available for plugging in TPMs and dongles. It's a solved
> problem in the crypto libraries.
>
> > What new syscalls would be needed for ssh to get all this support?
>
> I don't see why you'd need new syscalls.

So the kernel will implement a few selected crypto algorithms, similar to
what a TPM would provide, using SGX, and then userspace no longer needs
to know about SGX?

Ok, I guess that's simple.

It also means it is boring, and the multiuser-game-of-the-day will not
be able to protect the (plain text) password from a cold boot
attack. Nor will emacs be able to protect the in-memory copy of my diary
from a cold boot attack.

So I guess yes, some new syscalls would be nice :-).

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH 0/6] Intel Secure Guard Extensions
> But... that will mean that my ssh will need to be SGX-aware, and that
> I will not be able to switch to AMD machine in future. ... or to other
> Intel machine for that matter, right?

I'm not privy to AMD's CPU design plans.

However I think for the ssl/ssh case you'd use the same interfaces
currently available for plugging in TPMs and dongles. It's a solved
problem in the crypto libraries.

> What new syscalls would be needed for ssh to get all this support?

I don't see why you'd need new syscalls.

> Ookay... I guess I can get a fake Replay Protected Memory block, which
> will confirm that write happened and not do anything from China, but

It's not quite that simple because there are keys and a counter involved
but I am sure it is doable.

> And, again, it means that quite complex new kernel-user interface will
> be needed, right?

Why? For user space we have perfectly good existing system calls, for
kernel space we have existing interfaces to the crypto and key layers for
modules to use.

Alan
Re: [PATCH 0/6] Intel Secure Guard Extensions
> Replay Protected Memory Block. It's a device that allows someone to
> write to it and confirm that the write happened and the old contents
> is no longer available. You could use it to implement an enclave that
> checks a password for your disk but only allows you to try a certain
> number of times.

rpmb is found in a load of hardware today notably MMC/SD cards. Android
phones often use it to store sensitive system data.

Alan
Re: [PATCH 0/6] Intel Secure Guard Extensions
On Tue, Apr 26, 2016 at 12:41 PM, Pavel Machek wrote:
> On Tue 2016-04-26 12:05:48, Andy Lutomirski wrote:
>> On Tue, Apr 26, 2016 at 12:00 PM, Pavel Machek wrote:
>> > On Mon 2016-04-25 20:34:07, Jarkko Sakkinen wrote:
>> >> Intel(R) SGX is a set of CPU instructions that can be used by
>> >> applications to set aside private regions of code and data. The code
>> >> outside the enclave is disallowed to access the memory inside the
>> >> enclave by the CPU access control.
>> >>
>> >> The firmware uses PRMRR registers to reserve an area of physical memory
>> >> called Enclave Page Cache (EPC). There is a hardware unit in the
>> >> processor called Memory Encryption Engine. The MEE encrypts and decrypts
>> >> the EPC pages as they enter and leave the processor package.
>> >
>> > What are non-evil use cases for this?
>>
>> Storing your ssh private key encrypted such that even someone who
>> completely compromises your system can't get the actual private key
>
> Well, if someone gets root on my system, he can get my ssh private
> key right?
>
> So, you can use this to prevent "cold boot" attacks? (You know,
> stealing machine, liquid nitrogen, moving DIMMs to different machine
> to read them?) Ok. That's non-evil.

Preventing cold boot attacks is really just icing on the cake. The
real point of this is to allow you to run an "enclave". An SGX
enclave has unencrypted code but gets access to a key that only it can
access. It could use that key to unwrap your ssh private key and sign
with it without ever revealing the unwrapped key. No one, not even
root, can read enclave memory once the enclave is initialized and gets
access to its personalized key. The point of the memory encryption
engine is to prevent even cold boot attacks from being used to read
enclave memory.

This could probably be used for evil, but I think the evil uses are
outweighed by the good uses.

>
> Is there reason not to enable this for whole RAM if the hw can do it?

The HW can't, at least not in the current implementation. Also, the
metadata has considerable overhead (no clue whether there's a
performance hit, but there's certainly a memory usage hit).

>
>> out. Using this in conjunction with an RPMB device to make it Rather
>> Difficult (tm) for third parties to decrypt your disk even if your
>> password has low entropy. There are plenty more.
>
> I'm not sure what RPMB is, but I don't think you can make it too hard
> to decrypt my disk if my password has low entropy. ... And I don't see
> how encrypting RAM helps there.

Replay Protected Memory Block. It's a device that allows someone to
write to it and confirm that the write happened and the old contents
is no longer available. You could use it to implement an enclave that
checks a password for your disk but only allows you to try a certain
number of times.

There are some hints in the whitepapers that such a mechanism might be
present on existing Skylake chipsets. I'm not really sure.

>
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures)
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

-- 
Andy Lutomirski
AMA Capital Management, LLC
[PATCH locking 4/4] locktorture: Simplify torture_runnable computation
This commit replaces a #ifdef with IS_ENABLED(), saving five lines.

Signed-off-by: Paul E. McKenney
---
 kernel/locking/locktorture.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index d066a50dc87e..f8c5af52a131 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -75,12 +75,7 @@ struct lock_stress_stats {
 	long n_lock_acquired;
 };
 
-#if defined(MODULE)
-#define LOCKTORTURE_RUNNABLE_INIT 1
-#else
-#define LOCKTORTURE_RUNNABLE_INIT 0
-#endif
-int torture_runnable = LOCKTORTURE_RUNNABLE_INIT;
+int torture_runnable = IS_ENABLED(MODULE);
 module_param(torture_runnable, int, 0444);
 MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init");
 
-- 
2.5.2
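For readers wondering how a single expression can stand in for the
#if/#else block, here is a minimal userspace sketch of the preprocessor
trick that IS_ENABLED() is built on. It is not the kernel's code: every
DEMO_/demo_ name is hypothetical, and the real macros in
include/linux/kconfig.h additionally check the CONFIG_FOO_MODULE form. The
sketch only relies on the fact that a symbol passed as -DMODULE is defined
as 1.

```c
/*
 * Userspace re-creation of the "is this macro defined as 1?" trick.
 * All DEMO_/demo_ names are made up for this sketch.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_(x)
#define demo_is_defined_(value) demo_is_defined__(DEMO_ARG_PLACEHOLDER_##value)
/*
 * If the symbol was defined as 1, arg1_or_junk expands to "0," and the
 * call becomes demo_take_second(0, 1, 0, 0), which yields 1.
 * Otherwise it stays a single junk token and the call becomes
 * demo_take_second(junk 1, 0, 0), which yields 0.
 */
#define demo_is_defined__(arg1_or_junk) demo_take_second(arg1_or_junk 1, 0, 0)

#define DEMO_FEATURE_ON 1	/* plays the role of -DMODULE */
/* DEMO_FEATURE_OFF is deliberately left undefined. */

/* Both initializers are compile-time constants, as in the patch. */
int demo_runnable_on = demo_is_defined(DEMO_FEATURE_ON);
int demo_runnable_off = demo_is_defined(DEMO_FEATURE_OFF);
```

Because the result is an ordinary constant expression, it can be used
directly as a static initializer, which is what lets the patch fold six
preprocessor lines into the existing declaration.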
[PATCH locking 2/4] documentation: State purpose of memory-barriers.txt
From: David Howells

There has been some confusion about the purpose of memory-barriers.txt,
so this commit adds a statement of purpose.

Signed-off-by: David Howells
Signed-off-by: Paul E. McKenney
---
 Documentation/memory-barriers.txt | 16
 1 file changed, 16 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index fb2dd35a823a..8b11e54238bf 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -19,6 +19,22 @@ in case of any doubt (and there are many) please ask.
 To repeat, this document is not a specification of what Linux expects from
 hardware.
 
+The purpose of this document is twofold:
+
+ (1) to specify the minimum functionality that one can rely on for any
+     particular barrier, and
+
+ (2) to provide a guide as to how to use the barriers that are available.
+
+Note that an architecture can provide more than the minimum requirement
+for any particular barrier, but if the architecure provides less than
+that, that architecture is incorrect.
+
+Note also that it is possible that a barrier may be a no-op for an
+architecture because the way that arch works renders an explicit barrier
+unnecessary in that case.
+
+
 CONTENTS

-- 
2.5.2
[PATCH locking 3/4] documentation: ACQUIRE applies to loads, RELEASE applies to stores
From: Will Deacon

For compound atomics performing both a load and a store operation, make
it clear that _acquire and _release variants refer only to the load and
store portions of compound atomic. For example, xchg_acquire is an xchg
operation where the load takes on ACQUIRE semantics.

Cc: Paul E. McKenney
Cc: Peter Zijlstra
Signed-off-by: Will Deacon
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Paul E. McKenney
---
 Documentation/memory-barriers.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 8b11e54238bf..147ae8ec836f 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -498,6 +498,11 @@ And a couple of implicit varieties:
      This means that ACQUIRE acts as a minimal "acquire" operation and
      RELEASE acts as a minimal "release" operation.
 
+A subset of the atomic operations described in atomic_ops.txt have ACQUIRE
+and RELEASE variants in addition to fully-ordered and relaxed (no barrier
+semantics) definitions. For compound atomics performing both a load and a
+store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
+only to the store portion of the operation.
 
 Memory barriers are only required where there's a possibility of interaction
 between two CPUs or between a CPU and a device. If it can be guaranteed that

-- 
2.5.2
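As a rough illustration of the rule added above, here is a small
test-and-set lock written with C11 <stdatomic.h>. This is a userspace
analogy only, not the kernel's xchg_acquire() or smp_store_release(); the
demo_ names are invented for the sketch. In C11, as in the text above, an
exchange tagged memory_order_acquire gives acquire ordering to its load
half and leaves its store half relaxed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for this sketch: 0 = free, 1 = held. */
struct demo_lock {
	atomic_int locked;
};

/* Try to take the lock once; returns true on success. */
bool demo_trylock(struct demo_lock *l)
{
	/*
	 * The exchange is a compound operation: it loads the old value
	 * and stores 1. The acquire ordering attaches to the load, so
	 * accesses after a successful trylock cannot be reordered
	 * before it. The store half carries no ordering of its own.
	 */
	return atomic_exchange_explicit(&l->locked, 1,
					memory_order_acquire) == 0;
}

/* Drop the lock with a plain store that has release ordering. */
void demo_unlock(struct demo_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The pairing is the usual one: the release store in demo_unlock() makes the
critical section visible to whichever thread's acquire exchange next
observes the 0.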
[PATCH locking 1/4] documentation: Add disclaimer
From: Peter Zijlstra

It appears people are reading this document as a requirements list for
building hardware. This is not the intent of this document. Nor is it
particularly suited for this purpose.

The primary purpose of this document is our collective attempt to define
a set of primitives that (hopefully) allow us to write correct code on
the myriad of SMP platforms Linux supports.

It's a definite work in progress as our understanding of these platforms,
and memory ordering in general, progresses.

Nor does being mentioned in this document mean we think it's a
particularly good idea; the data dependency barrier required by Alpha
being a prime example. Yes we have it, no you're insane to require it
when building new hardware.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Paul E. McKenney
---
 Documentation/memory-barriers.txt | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a9454b1c73bd..fb2dd35a823a 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -4,8 +4,24 @@
 
 By: David Howells
     Paul E. McKenney
+    Will Deacon
+    Peter Zijlstra
 
-Contents:
+==
+DISCLAIMER
+==
+
+This document is not a specification; it is intentionally (for the sake of
+brevity) and unintentionally (due to being human) incomplete. This document is
+meant as a guide to using the various memory barriers provided by Linux, but
+in case of any doubt (and there are many) please ask.
+
+To repeat, this document is not a specification of what Linux expects from
+hardware.
+
+
+CONTENTS
+
 
  (*) Abstract memory access model.

-- 
2.5.2
[PATCH locking 0/4] locktorture and memory-barriers.txt updates
Hello!

This series contains a few memory-barriers.txt updates and a locktorture
cleanup:

1.	Add a disclaimer to memory-barriers.txt, courtesy of Peter Zijlstra.

2.	Explicitly state the purpose of memory-barriers.txt, courtesy of
	David Howells.

3.	Explicitly state that ACQUIRE applies to loads and that RELEASE
	applies to stores, courtesy of Will Deacon.

4.	Simplify torture_runnable computation in locktorture, replacing
	a multiline #ifdef with an IS_ENABLED() that fits into an existing
	line.

							Thanx, Paul

 Documentation/memory-barriers.txt | 39 +-
 kernel/locking/locktorture.c      |  7 --
 2 files changed, 39 insertions(+), 7 deletions(-)
Re: [PATCH 20/25] arm64:ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it
On Wed, Apr 06, 2016 at 01:08:42AM +0300, Yury Norov wrote:
> +/* Using non-compat syscalls where necessary */
> +#define compat_sys_fadvise64_64 sys_fadvise64_64
> +#define compat_sys_fallocate sys_fallocate
> +#define compat_sys_ftruncate64 sys_ftruncate
> +#define compat_sys_lookup_dcookie sys_lookup_dcookie
> +#define compat_sys_pread64 sys_pread64
> +#define compat_sys_pwrite64 sys_pwrite64
> +#define compat_sys_readahead sys_readahead
> +#define compat_sys_shmat sys_shmat

Why don't we use compat_sys_shmat? Is it because of COMPAT_SHMLBA?

> +#define compat_sys_sync_file_range sys_sync_file_range
> +#define compat_sys_truncate64 sys_truncate
> +#define sys_llseek sys_lseek
> +#define sys_mmap2 sys_mmap

Nitpick: there are some whitespace inconsistencies above (just convert
all spaces to tabs).

I think you should also update Documentation/arm64/ilp32.txt to include
the list above.

> +
> +#include
> +
> +#undef __SYSCALL
> +#undef __SC_COMP
> +#undef __SC_WRAP
> +#undef __SC_3264
> +#undef __SC_COMP_3264

Minor detail: do we actually need to undef all these? Maybe we can get
away with just defining __SYSCALL_COMPAT at the top of the file.

> +
> +#define __SYSCALL_COMPAT
> +#define __SYSCALL(nr, sym) [nr] = sym,
> +#define __SC_WRAP(nr, sym) [nr] = compat_##sym,
> +
> +/*
> + * The sys_call_ilp32_table array must be 4K aligned to be accessible from
> + * kernel/entry.S.
> + */
> +void *sys_call_ilp32_table[__NR_syscalls] __aligned(4096) = {
> +	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
> +#include
> +};

-- 
Catalin
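The table construction quoted above (fill every slot with a default, then
let per-syscall entries override it) can be sketched in userspace as
follows. This is only an illustration of the initializer pattern: the
DEMO_/demo_ names, the table size and the return values are all made up,
and the `[first ... last]` range designator is a GNU C extension (accepted
by GCC and Clang), not ISO C.

```c
#define DEMO_NR_CALLS 8	/* hypothetical table size */

typedef long (*demo_call_t)(void);

/* Stand-in for sys_ni_syscall; 38 is ENOSYS on Linux. */
long demo_ni_call(void) { return -38; }
long demo_call_getpid(void) { return 1234; }
long demo_call_sync(void) { return 0; }

/* Same shape as the quoted __SYSCALL(nr, sym) macro. */
#define DEMO_CALL(nr, sym) [nr] = sym,

const demo_call_t demo_call_table[DEMO_NR_CALLS] = {
	/* Range designator: every slot starts out as "not implemented". */
	[0 ... DEMO_NR_CALLS - 1] = demo_ni_call,
	/* Later designators override the default for individual slots. */
	DEMO_CALL(2, demo_call_getpid)
	DEMO_CALL(5, demo_call_sync)
};

/* Bounds-checked dispatch, mirroring the "upper syscall limit" test. */
long demo_dispatch(unsigned int nr)
{
	if (nr >= DEMO_NR_CALLS)
		return demo_ni_call();
	return demo_call_table[nr]();
}
```

Compilers may warn about the overridden initializers (for example GCC's
-Woverride-init); in this pattern the override is intentional.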
Re: [RFC v6 00/10] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table
On Mon, 25 Apr 2016 18:05:53 +0800 Yongji Xiewrote: > Hi Alex, > > Any comment? TBH, I shuffled this to the bottom of the review pile because you're depending on a patch series for ARM MSI mapping that's still very much in flux. You've really got 3 or 4 separate patch series here that should be separated so they can be sent as non-RFC and you can start making progress. For instance, patches 1-4 are PCI-core enabling PAGE_SIZE aligned BARs, patch 5 discovers PAGE_SIZE aligned BARs and enables mmapping them through vfio. Now that you're using shadow resources to attempt to reserve the remainder of the page in patch 5, doesn't that make it independent of patches 1-4? These could be sent as separate series in parallel. Patches 6-9 are another separate series, but here you start to depend on the changes happening with ARM MSI mapping to determine whether we have real interrupt isolation. Once that gets settled, patch 10 becomes a much less controversial follow-on patch. Thanks, Alex > On 2016/4/18 18:53, Yongji Xie wrote: > > Current vfio-pci implementation disallows to mmap > > sub-page(size < PAGE_SIZE) MMIO BARs and MSI-X table. This is because > > sub-page BARs' mmio page may be shared with other BARs and MSI-X table > > should not be accessed directly from the guest for security reasons. > > > > But it will easily cause some performance issues for mmio accesses > > in guest when vfio passthrough sub-page BARs or BARs containing MSI-X > > table on PPC64 platform. This is because PAGE_SIZE is 64KB by default > > on PPC64 platform and the big page may easily hit the sub-page MMIO > > BARs' unmmapping and cause the unmmaping of the mmio page which > > MSI-X table locate in, which lead to mmio emulation in host. > > > > For sub-page MMIO BARs' unmmapping, this patchset modifies > > resource_alignment kernel parameter to enforce the alignment of all > > MMIO BARs to be at least PAGE_SZIE so that sub-page BAR's mmio page > > will not be shared with other BARs. 
And we also add shadow resources > > to the vfio device and put them into the holes of mmio pages in case > > that hot-add device's BARs are assigned into the holes. Then we can > > mmap sub-page MMIO BARs safely. > > > > For MSI-X table's unmmapping, we think MSI-X table is safe to access > > directly from userspace if hardware supports the capability of > > interrupt remapping which can ensure that a given pci device can > > only shoot the MSIs assigned for it. But the implenmentation of > > this capability is arch-independent. To have a universal way > > to test this capability on PCI side for different archs, we introduce > > a new bus_flags PCI_BUS_FLAGS_MSI_REMAP. > > > > With this patchset applied, we can get almost 100% improvement on > > performance for small block 4k random read when we passthrough a FC > > HBA containing sub-page BARs and MSI-X BARs to guest on PPC64 in > > our test. > > > > The patch 8 are based on the proposed patchset[2]. > > > > Changelog v6: > > - Rebase on vfio/next with patchset[2] applied > > - Fix some bugs of v5 > > - Add three patches to make PCI_BUS_FLAGS_MSI_REMAP as > >a universal flag to test IRQ remapping > > > > Changelog v5: > > - Rebase on vfio/next > > - Change the order of patch 1,2,3 > > - Move the warning "resource_alignment will not work with > >PCI_PROBE_ONLY set" from documentation to kernel log > > - Remove IORESOURCE_WINDOW > > - Add description for parameter "resize" > > - Add PCIBIOS_MIN_ALIGNMENT to force all MMIO BARs to > >get minimum alignment > > - Add shadow resources to make sure sub-page BAR's mmio > >page will not be shared with hot-add BARs. > > - Add a new bit to pci_bus_flags to indicate the capbility > >of interrupt remapping on PPC64 > > - Remove IOMMU_CAP_INTR_REMAP on PPC64 > > - Add a property msi_remap to vfio_pci_device to cache the > >capbility of interrupt remapping > > > > Changelog v4: > > - Rebase on v4.5-rc6 with patchset[1] applied. 
> > - Remove resource_page_aligned kernel parameter > > - Fix some problems with resource_alignment kernel parameter > > - Modify resource_alignment kernel parameter to support multiple > >devices. > > - Remove host bridge attribute: msi_filtered > > - Use IOMMU_CAP_INTR_REMAP to check if MSI-X table can be mmapped > > - Add IOMMU_CAP_INTR_REMAP for IODA host bridge on PPC64 platform > > > > Changelog v3: > > - Rebase on new linux kernel mainline with the patchset[1] applied. > > - Add a function to check whether PCI BARs'mmio page is shared with > >other BARs. > > - Add a host bridge attribute to indicate PCI host bridge support > >filtering of MSIs. > > - Use the new host bridge attribute to check if MSI-X table can > >be mmapped instead of CONFIG_EEH. > > - Remove Kconfig option VFIO_PCI_MMAP_MSIX > > > > Changelog v2: > > - Rebase on v4.4-rc6 with the patchset[1] applied. > > - Use kernel parameter to enforce all MMIO BARs to be page aligned > >on PCI core
Re: [PATCH v2 0/2] moves samples out of Documentation directory
On Tue, 26 Apr 2016 12:28:42 +0200, Hans Verkuil wrote:

> On 04/26/2016 11:59 AM, Jonathan Corbet wrote:
> > On Mon, 25 Apr 2016 18:03:07 +0200
> > Arnd Bergmann wrote:
> >
> >> As suggested by Nicolas Pitre, here is a resend of two patches to
> >> move the kernel modules from Documentation/*/ to samples/*/.
> >>
> >> With Nico's changes in place, it's no longer necessary to do this,
> >> but it seems like a good idea anyway for consistency.
> >> Not sure who would be the best person to pick up the patches, I'd
> >> probably either the Documentation or the kbuild maintainers.
> >>
> >>	Arnd
> >>
> >> [PATCH v2 1/2] samples: connector: from Documentation to samples
> >> [PATCH v2 2/2] samples: v4l: from Documentation to samples directory
> >
> > I can take them through the docs tree.
> >
> > Hans [added], are you OK with moving v4l2-pci-skeleton.c over to
> > the samples directory?
>
> Yes, that's fine. For the record:
>
> Acked-by: Hans Verkuil

Acked-by: Mauro Carvalho Chehab

>
> Regards,
>
>	Hans

-- 
Thanks,
Mauro
Re: [PATCH v2 0/2] moves samples out of Documentation directory
On 04/26/2016 11:59 AM, Jonathan Corbet wrote:
> On Mon, 25 Apr 2016 18:03:07 +0200
> Arnd Bergmann wrote:
>
>> As suggested by Nicolas Pitre, here is a resend of two patches to
>> move the kernel modules from Documentation/*/ to samples/*/.
>>
>> With Nico's changes in place, it's no longer necessary to do this,
>> but it seems like a good idea anyway for consistency.
>> Not sure who would be the best person to pick up the patches, I'd
>> probably either the Documentation or the kbuild maintainers.
>>
>>	Arnd
>>
>> [PATCH v2 1/2] samples: connector: from Documentation to samples
>> [PATCH v2 2/2] samples: v4l: from Documentation to samples directory
>
> I can take them through the docs tree.
>
> Hans [added], are you OK with moving v4l2-pci-skeleton.c over to
> the samples directory?

Yes, that's fine. For the record:

Acked-by: Hans Verkuil

Regards,

	Hans
Re: [PATCH] Mark "Out of Date" addresses as undeliverable
On 2016/4/26 17:52, Jonathan Corbet wrote:
> On Thu, 21 Apr 2016 23:04:26 +0800
> Zhigang Gao wrote:
>
>> Chinese maintainer for help. Contact the Chinese maintainer, if this
>> translation is outdated or there is problem with translation.
>>
>> -Chinese maintainer: Zhang Le
>> +Chinese maintainer: Zhang Le
>
> So this makes me a little uncomfortable...the document now says to
> contact somebody who cannot be contacted. It's a promise of help that is
> empty. If Zhang Le has truly vanished, and nobody else is willing to
> fill in, I think it would be better to simply delete this text.
>
>> - Li Zefan
>> - Wang Chen
>> + Li Zefan
>> + Wang Chen
>
> Zefan, at least, is trivially findable at his new address (copied);
> Zefan, I assume you would like things updated here? Which address would
> you like to use?
>

Yeah, I'm still actively working in the kernel and in the open-source
community. :)

lize...@huawei.com. I use l...@kernel.org for maintaining v3.4.y only.
Re: [PATCH 20/25] arm64:ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it
On Mon, Apr 25, 2016 at 09:47:40PM +0300, Yury Norov wrote:
> On Mon, Apr 25, 2016 at 09:19:13PM +0300, Yury Norov wrote:
> > On Mon, Apr 25, 2016 at 06:26:56PM +0100, Catalin Marinas wrote:
> > > On Wed, Apr 06, 2016 at 01:08:42AM +0300, Yury Norov wrote:
> > > > --- a/arch/arm64/kernel/entry.S
> > > > +++ b/arch/arm64/kernel/entry.S
> > > > @@ -715,9 +715,13 @@ ENDPROC(ret_from_fork)
> > > >   */
> > > >  	.align	6
> > > >  el0_svc:
> > > > -	adrp	stbl, sys_call_table		// load syscall table pointer
> > > >  	uxtw	scno, w8			// syscall number in w8
> > > >  	mov	sc_nr, #__NR_syscalls
> > > > +#ifdef CONFIG_ARM64_ILP32
> > > > +	ldr	x16, [tsk, #TI_FLAGS]
> > > > +	tbnz	x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
> > > > +#endif
> > >
> > > There is another ldr x16, [tsk, #TI_FLAGS] load further down in the
> > > el0_svc_naked block. We should rework these a bit to avoid loading the
> > > same location twice unnecessarily. E.g. move the ldr x16 just before
> > > el0_svc_naked and branch one line after in case of the ILP32 syscall.
> > >
> >
> > Yes, I think we can refactor it. Thanks for the catch.
>
> Now it's better, I think
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index cf4d1ae..21312bb 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -715,16 +715,22 @@ ENDPROC(ret_from_fork)
>   */
>  	.align	6
>  el0_svc:
> -	adrp	stbl, sys_call_table		// load syscall table pointer
>  	uxtw	scno, w8			// syscall number in w8
>  	mov	sc_nr, #__NR_syscalls
> +	ldr	x16, [tsk, #TI_FLAGS]

You can move this higher up for interlocking reasons (though these days
CPUs do a lot of speculative loads).

> +#ifdef CONFIG_ARM64_ILP32
> +	tbz	x16, #TIF_32BIT_AARCH64, el0_lp64_svc // We are using ILP32

// We are *not* using ILP32

> +	adrp	stbl, sys_call_ilp32_table	// load ilp32 syscall table pointer
> +	b	el0_svc_naked
> +el0_lp64_svc:
> +#endif
> +	adrp	stbl, sys_call_table		// load syscall table pointer

You can avoid the branches by using csel, something like this:

	ldr	x16, [tsk, #TI_FLAGS]
	adrp	stbl, sys_call_table
	...
#ifdef CONFIG_ARM64_ILP32
	adrp	x17, sys_call_ilp32_table
	tst	x16, #_TIF_32BIT_AARCH64
	csel	stbl, stbl, x17, eq
#endif
el0_svc_naked:
	...

>  el0_svc_naked:					// compat entry point
>  	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
>  	enable_dbg_and_irq
>  	ct_user_exit 1
>
> -	ldr	x16, [tsk, #TI_FLAGS]		// check for syscall hooks
> -	tst	x16, #_TIF_SYSCALL_WORK
> +	tst	x16, #_TIF_SYSCALL_WORK		// check for syscall hooks
>  	b.ne	__sys_trace
>  	cmp	scno, sc_nr			// check upper syscall limit
>  	b.hs	ni_sys

There is el0_svc_compat branching to el0_svc_naked and it won't have x16
set anymore. So you need to add an ldr x16 to el0_svc_compat as well.

-- 
Catalin
Re: [PATCH] Changed the path from to the incorrect drivers/char/sysrq.c to drivers/tty/sysrq.c
On Fri, 22 Apr 2016 21:17:23 +0200
René Nyffenegger wrote:

> This is my first patch submission. Please let me know if I have made a
> mistake anywhere.

Thank you for improving the documentation! Unfortunately, the patch was
corrupted by your mail client and does not apply. Could I please ask you
to have a look at Documentation/email-clients.txt for information on how
to send patches so that they arrive intact at the other end? A good
approach is to email the patch to yourself and ensure that the result
applies before trying again.

Thanks,

jon
Re: [PATCH v2 0/2] moves samples out of Documentation directory
On Mon, 25 Apr 2016 18:03:07 +0200
Arnd Bergmann wrote:

> As suggested by Nicolas Pitre, here is a resend of two patches to
> move the kernel modules from Documentation/*/ to samples/*/.
>
> With Nico's changes in place, it's no longer necessary to do this,
> but it seems like a good idea anyway for consistency.
> Not sure who would be the best person to pick up the patches, I'd
> probably either the Documentation or the kbuild maintainers.
>
>	Arnd
>
> [PATCH v2 1/2] samples: connector: from Documentation to samples
> [PATCH v2 2/2] samples: v4l: from Documentation to samples directory

I can take them through the docs tree.

Hans [added], are you OK with moving v4l2-pci-skeleton.c over to
the samples directory?

Thanks,

jon
Re: [PATCH] Mark "Out of Date" addresses as undeliverable
On Thu, 21 Apr 2016 23:04:26 +0800
Zhigang Gao wrote:

> Chinese maintainer for help. Contact the Chinese maintainer, if this
> translation is outdated or there is problem with translation.
>
> -Chinese maintainer: Zhang Le
> +Chinese maintainer: Zhang Le

So this makes me a little uncomfortable...the document now says to
contact somebody who cannot be contacted. It's a promise of help that is
empty. If Zhang Le has truly vanished, and nobody else is willing to
fill in, I think it would be better to simply delete this text.

> - Li Zefan
> - Wang Chen
> + Li Zefan
> + Wang Chen

Zefan, at least, is trivially findable at his new address (copied);
Zefan, I assume you would like things updated here? Which address would
you like to use?

Thanks,

jon
Re: [RFC v6 04/10] PCI: Add support for enforcing all MMIO BARs to be page aligned
On 2016/4/26 13:41, Alexey Kardashevskiy wrote:
> On 04/18/2016 08:56 PM, Yongji Xie wrote:
>> When vfio passes through a PCI device whose MMIO BARs are smaller
>> than PAGE_SIZE, the guest will not handle the MMIO accesses to the
>> BARs, which leads to MMIO emulation in the host. This is because
>> vfio will not allow passing through a BAR's MMIO page that may be
>> shared with other BARs. Otherwise, there would be a backdoor that a
>> guest could use to access the BARs of another guest.
>>
>> To solve this issue, this patch modifies resource_alignment to
>> support a syntax where multiple devices get the same alignment. So
>> we can use something like
>> "pci=resource_alignment=*:*:*.*:noresize" to enforce the alignment
>> of all MMIO BARs to be at least PAGE_SIZE, so that one BAR's MMIO
>> page is not shared with other BARs.
>>
>> We also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
>> automatically on the PPC64 platform, which can easily hit this issue
>> because its PAGE_SIZE is 64KB.
>>
>> Signed-off-by: Yongji Xie
>> ---
>>  Documentation/kernel-parameters.txt |  2 ++
>>  arch/powerpc/include/asm/pci.h      |  2 ++
>>  drivers/pci/pci.c                   | 64 +--
>>  3 files changed, 57 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index d8b29ab..542be4a 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -2918,6 +2918,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>                  aligned memory resources.
>>                  If <order of align> is not specified,
>>                  PAGE_SIZE is used as alignment.
>> +                <domain>, <bus>, <slot> and <func> can be set to
>> +                "*" which means match all values.
>>                  PCI-PCI bridge can be specified, if resource
>>                  windows need to be expanded.
>>                  noresize: Don't change the resources' sizes when
>> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
>> index 6f8065a..78f230f 100644
>> --- a/arch/powerpc/include/asm/pci.h
>> +++ b/arch/powerpc/include/asm/pci.h
>> @@ -30,6 +30,8 @@
>>  #define PCIBIOS_MIN_IO          0x1000
>>  #define PCIBIOS_MIN_MEM         0x1000
>>
>> +#define PCIBIOS_MIN_ALIGNMENT PAGE_SIZE
>> +
>>  struct pci_dev;
>>
>>  /* Values for the `which' argument to sys_pciconfig_iobase syscall. */
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 7564ccc..0381c28 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4605,7 +4605,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>          int seg, bus, slot, func, align_order, count;
>>          resource_size_t align = 0;
>>          char *p;
>> +        bool invalid = false;
>>
>> +#ifdef PCIBIOS_MIN_ALIGNMENT
>> +        align = PCIBIOS_MIN_ALIGNMENT;
>> +        *resize = false;
>> +#endif
>>          spin_lock(&resource_alignment_lock);
>>          p = resource_alignment_param;
>>          while (*p) {
>> @@ -4622,16 +4627,49 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>                  } else {
>>                          align_order = -1;
>>                  }
>> -                if (sscanf(p, "%x:%x:%x.%x%n",
>> -                        &seg, &bus, &slot, &func, &count) != 4) {
>
> I'd replace the above lines with:
>
> char segstr[5] = "*", busstr[3] = "*";
> char slotstr[3] = "*", funstr[2] = "*";
>
> if (sscanf(p, "%4[^:]:%2[^:]:%2[^.].%1s%n",
>            &segstr, &busstr, &slotstr, &funstr, &count) != 4) {

It seems the current implementation of sscanf() in the kernel is not able to support the syntax: "%4[^:]:%2[^:]:%2[^.]".

Thanks,
Yongji

> and add some wrapper like:
>
> static bool glob_match_hex(char const *pat, int val)
> {
>         char valstr[5]; /* 5 should be enough for PCI */
>         snprintf(valstr, sizeof(valstr) - 1, "%4x", val);
>         return glob_match(pat, valstr);
> }
>
> and then use glob_match_hex() (or make a wrapper like above on top of
> fnmatch()), this would enable better mask handling. If anyone finds
> this useful (which I am not sure about).
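For readers following the parsing discussion, here is a minimal userspace sketch of the wildcard approach the patch takes: each field of "[seg:]bus:slot.func" is either "*" or a hex number, and "*" is recorded as -1. It only uses the "%x%n" conversion, so it does not rely on the "%[" scansets discussed above. The names FIELD_ANY, struct pci_match, parse_field(), parse_spec() and spec_matches() are hypothetical and do not come from the patch, and unlike the kernel parameter parser this sketch accepts a single specifier with no alignment prefix or separators.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical wildcard marker, mirroring the -1 used in the patch. */
#define FIELD_ANY (-1)

/* Hypothetical container for one parsed "[seg:]bus:slot.func" specifier. */
struct pci_match {
	int seg, bus, slot, func;
};

/*
 * Parse one field that is either "*" or a hex number.
 * Returns the number of characters consumed, or 0 on a parse error.
 */
static int parse_field(const char *p, int *val)
{
	unsigned int v;
	int count = 0;

	if (p[0] == '*') {
		*val = FIELD_ANY;
		return 1;
	}
	/* Reject signs and whitespace that "%x" would otherwise accept. */
	if (!isxdigit((unsigned char)p[0]))
		return 0;
	if (sscanf(p, "%x%n", &v, &count) != 1)
		return 0;
	*val = (int)v;
	return count;
}

/* Parse "seg:bus:slot.func" or the short form "bus:slot.func". */
static bool parse_spec(const char *p, struct pci_match *m)
{
	int a, b, c, n;

	n = parse_field(p, &a);
	if (!n || p[n] != ':')
		return false;
	p += n + 1;

	n = parse_field(p, &b);
	if (!n)
		return false;
	p += n;

	if (*p == '.') {
		/* Short form: the segment defaults to 0. */
		m->seg = 0;
		m->bus = a;
		m->slot = b;
	} else if (*p == ':') {
		p++;
		n = parse_field(p, &c);
		if (!n || p[n] != '.')
			return false;
		p += n;
		m->seg = a;
		m->bus = b;
		m->slot = c;
	} else {
		return false;
	}

	p++; /* skip the '.' before the function number */
	n = parse_field(p, &m->func);
	if (!n)
		return false;
	return p[n] == '\0';
}

/* A field matches when the pattern is a wildcard or equals the value. */
static bool field_ok(int pat, int val)
{
	return pat == FIELD_ANY || pat == val;
}

static bool spec_matches(const struct pci_match *m,
			 int seg, int bus, int slot, int func)
{
	return field_ok(m->seg, seg) && field_ok(m->bus, bus) &&
	       field_ok(m->slot, slot) && field_ok(m->func, func);
}
```

With this sketch, parse_spec("*:*:*.*", &m) yields a pattern that spec_matches() accepts for every device, while parse_spec("01:*.0", &m) matches function 0 of any slot on bus 01 in segment 0.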
>> +                if (p[0] == '*' && p[1] == ':') {
>> +                        seg = -1;
>> +                        count = 1;
>> +                } else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
>> +                                p[count] != ':') {
>> +                        invalid = true;
>> +                        break;
>> +                }
>> +                p += count + 1;
>> +                if (*p == '*') {
>> +                        bus = -1;
>> +                        count = 1;
>> +                } else if (sscanf(p, "%x%n", &bus, &count) != 1) {
>> +                        invalid = true;
>> +                        break;
>> +                }
>> +                p += count;
>> +                if (*p == '.') {
>> +                        slot = bus;
>> +                        bus = seg;
>>                          seg = 0;
>> -                        if (sscanf(p, "%x:%x.%x%n",
>> -                                &bus, &slot, &func, &count) != 3) {
>> -                                /* Invalid format */
>> -                                printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
>> -                                        p);
>> +                        p++;
>> +                } else if (*p == ':') {
>> +                        p++;
>> +                        if (p[0] == '*' && p[1] == '.') {
>> +                                slot = -1;
>> +                                count = 1;
>> +                        } else if (sscanf(p, "%x%n", &slot, &count) != 1 ||
>> +                                        p[count] != '.') {
>> +                                invalid = true;