On 25/11/2020 11:51, Jan Kiszka wrote:
> On 25.11.20 10:15, Andrea Bastoni wrote:
>> On 25/11/2020 07:07, Jan Kiszka wrote:
>>> On 23.11.20 21:45, Andrea Bastoni wrote:
>>>> Hello Jan, hi all,
>>>>
>>>> As discussed a couple of weeks ago, here an updated version of the 
>>>> "cache-coloring" for Jailhouse.
>>>>
>>>> A short recap on what's coloring (see Documentation/cache-coloring.md):
>>>>> Cache coloring is a software technique that permits LLC partitioning,
>>>>> therefore eliminating mutual core interference, and thus guaranteeing more
>>>>> predictable performances for memory accesses.
>>>>
>>>> In this version of the patch, we've tried to keep the changes in the 
>>>> hypervisor core minimal. Also, since coloring is something that should be 
>>>> transparent to inmates/VM (coloring belongs to the IPA/PA hypervisor 
>>>> translations), the Linux driver has been only minimally modified.
>>>>
>>>> Coloring API
>>>> ------------
>>>>
>>>> The patch adds a small architecture API layer that implements coloring. 
>>>> The API is fully implemented only on arm64.  arm32 could be also a 
>>>> possible target for coloring, x86 already has CAT. Despite being 
>>>> implemented by only one architecture, having a small common API layer 
>>>> integrates IMHO better than other solutions in the main code (without 
>>>> adding conditionally compiled code).
>>>>
>>>> The API is:
>>>>     arch_color_map_memory_region()   // cell_create
>>>>         Color-maps the memory of the cell. Coloring requires multiple 
>>>> passes to keep a contiguous VA, while "punching holes" in the PA according 
>>>> to the allocated colors.
>>>>
>>>>     arch_color_unmap_memory_region() // cell_destroy
>>>>         The opposite of create: search and give back each piece of colored 
>>>> PA.
>>>>
>>>>     arch_color_remap_to_root()       // cell_load
>>>>         Given that we don't want to modify the driver (i.e., we don't want 
>>>> the driver to do non-contiguous color-ioremap) we let the root cell 
>>>> believe that the IPAs it is ioremapping are contiguous, but in reality the 
>>>> PA are non-contiguous (colored). This is currently done by using a very 
>>>> high VA (IPA) address as "base" for the load-remap-to-root-cell operation. 
>>>> This value is configuration dependent and can be specified in the root 
>>>> cell to avoid (unlikely) conflicts.
>>>>
>>>>     arch_color_unmap_from_root()     // cell_start
>>>>         Returns the mapping setup by the cell_load to the root cell.
>>>>
>>>> Two APIs are implemented only by arm64:
>>>>     arm_color_dcache_flush_memory_region()  // range flushing
>>>>         Also flushing should be done in a "colored way".
>>>>
>>>>     arm_color_init()                        // hook for the initialization 
>>>> of coloring
>>>>         Reads the parameters (size of the LLC way, base offset for the 
>>>> load-remapping operation) needed by coloring. There's also some debugging 
>>>> code to autodetect the cache-geometry and determine the size of the way.
>>>>
>>>> Configuration
>>>> -------------
>>>>
>>>> From the configuration point of view, a colored memory region is a normal 
>>>> memory region with a color attached. The color is expressed directly in 
>>>> the memory region as color-bitmask and the flag JAILHOUSE_MEM_COLORED is 
>>>> used to identify such a region. Contiguous bits in the bitmask identify a 
>>>> color (region) to be used for the memory region. The size of the mapping 
>>>> doesn't change due to coloring.
>>>>
>>>> For example, with 16 colors, a full way size is selected by 0xffff, while 
>>>> 1/4 of the way size is selected by 0x000f.
>>>>
>>>> (Note: we also experimented with a different version that defined separate 
>>>> "normal" and "colored" memory regions. The approach presented in this 
>>>> version is more robust.)
>>>>
>>>> Main coloring "loop"
>>>> --------------------
>>>>
>>>> The coloring APIs basically boil down to a loop that appropriately selects 
>>>> --according to the color-- the physical addresses where to perform a 
>>>> selected operation. The selection is done efficiently with bit operations 
>>>> on the bitmask.
>>>>
>>>> An alternative approach is to hook coloring deep into the mapping 
>>>> functionalities of the hypervisor. Basically, all low level mapping 
>>>> functions support coloring, and the non-colored case becomes the "special 
>>>> case."
>>>> This approach was not followed because to be implemented "cleanly", all 
>>>> the architecture mapping APIs have to be extended to support coloring, but 
>>>> only one-architecture then really implement it.
>>>>
>>>> Instead, the implemented approach to have an additional "coloring" API 
>>>> integrates more nicely. Additionally, coloring will be hopefully only a 
>>>> transitory techniques (some years?) that could be eventually replaced by 
>>>> hardware-based techniques (e.g., MPAM). When this happens, removing the 
>>>> additional coloring API requires less rework than the other approach.
>>>>
>>>>
>>>> Debugging Code
>>>> --------------
>>>>
>>>> The patches also provide a "debugging-only" code that can be activated by 
>>>> defining CONFIG_DEBUG (it sounded more Jailhouse-oriented than NDEBUG). In 
>>>> debugging mode, the coloring provides an autodetection for arm64 cache 
>>>> geometry.
>>>> Additionally, I've added an assert() function that can be used at runtime 
>>>> in debugging only contexts (could be used independently from coloring).
>>>>
>>>>
>>>> Feedback and comments welcome.
>>>>
>>>> Best,
>>>> Andrea
>>>>
>>>> Andrea Bastoni (12):
>>>>   arm-common: bitops: add most-significant-bit operation
>>>>   hypervisor, driver, archs: add basic empty infrastructure for coloring
>>>>   hypervisor: arm64: add cache coloring implementation
>>>>   hypervisor configuration: hook autodetection for coloring_way_size
>>>>   configs: arm64: add example configuration for colored ZCU102 inmate
>>>>   hypervisor: protect inclusion of control.h
>>>>   hypervisor, driver: add platform information to configure coloring
>>>>     params
>>>>   configs: arm64: hook-in coloring parameters for ZCU102
>>>>   hypervisor: provide runtime assert() helper for DEBUG only
>>>>   hypervisor: provide implementation for __assert_fail() and group
>>>>     panic-related functions
>>>>   hypervisor: coloring: use assert instead of "BUG"
>>>>   hypervisor: coloring: make cache autodetection debug-only
>>>>
>>>> Luca Miccio (2):
>>>>   Documentation: add description and usage of cache coloring support
>>>>   pyjailhouse: add support for colored regions
>>>>
>>>>  Documentation/cache-coloring.md               | 198 ++++++++++++++++++
>>>>  configs/arm64/zynqmp-zcu102-inmate-demo-col.c |  75 +++++++
>>>>  configs/arm64/zynqmp-zcu102.c                 |   6 +
>>>>  driver/cell.c                                 |  10 +-
>>>>  driver/cell.h                                 |   1 +
>>>>  driver/main.c                                 |  12 ++
>>>>  hypervisor/Makefile                           |   2 +-
>>>>  hypervisor/arch/arm-common/control.c          |   1 +
>>>>  .../arch/arm-common/include/asm/bitops.h      |  10 +
>>>>  .../arch/arm-common/include/asm/dcaches.h     |   8 +
>>>>  hypervisor/arch/arm-common/mmu_cell.c         |  52 +++--
>>>>  hypervisor/arch/arm/control.c                 |   1 +
>>>>  hypervisor/arch/arm/include/asm/coloring.h    |  59 ++++++
>>>>  hypervisor/arch/arm/traps.c                   |   1 +
>>>>  hypervisor/arch/arm64/Kbuild                  |   5 +
>>>>  hypervisor/arch/arm64/cache_layout.c          | 148 +++++++++++++
>>>>  hypervisor/arch/arm64/coloring.c              | 184 ++++++++++++++++
>>>>  hypervisor/arch/arm64/control.c               |   1 +
>>>>  .../arch/arm64/include/asm/cache_layout.h     |  21 ++
>>>>  hypervisor/arch/arm64/include/asm/coloring.h  | 137 ++++++++++++
>>>>  hypervisor/arch/arm64/setup.c                 |   3 +
>>>>  hypervisor/arch/arm64/traps.c                 |   1 +
>>>>  hypervisor/arch/x86/amd_iommu.c               |   1 +
>>>>  hypervisor/arch/x86/control.c                 |   1 +
>>>>  hypervisor/arch/x86/efifb.c                   |   1 +
>>>>  hypervisor/arch/x86/include/asm/coloring.h    |  45 ++++
>>>>  hypervisor/arch/x86/ioapic.c                  |   1 +
>>>>  hypervisor/arch/x86/svm.c                     |   1 +
>>>>  hypervisor/arch/x86/vmx.c                     |   1 +
>>>>  hypervisor/control.c                          | 111 ++++------
>>>>  hypervisor/include/jailhouse/assert.h         |  37 ++++
>>>>  hypervisor/include/jailhouse/control.h        |  34 +--
>>>>  hypervisor/include/jailhouse/panic.h          |  40 ++++
>>>>  hypervisor/include/jailhouse/printk.h         |   4 +
>>>>  hypervisor/panic.c                            |  86 ++++++++
>>>>  hypervisor/pci.c                              |   1 +
>>>>  hypervisor/printk.c                           |   1 +
>>>>  hypervisor/uart.c                             |   1 +
>>>>  include/jailhouse/cell-config.h               |  11 +
>>>>  pyjailhouse/config_parser.py                  |   9 +-
>>>>  40 files changed, 1201 insertions(+), 121 deletions(-)
>>>>  create mode 100644 Documentation/cache-coloring.md
>>>>  create mode 100644 configs/arm64/zynqmp-zcu102-inmate-demo-col.c
>>>>  create mode 100644 hypervisor/arch/arm/include/asm/coloring.h
>>>>  create mode 100644 hypervisor/arch/arm64/cache_layout.c
>>>>  create mode 100644 hypervisor/arch/arm64/coloring.c
>>>>  create mode 100644 hypervisor/arch/arm64/include/asm/cache_layout.h
>>>>  create mode 100644 hypervisor/arch/arm64/include/asm/coloring.h
>>>>  create mode 100644 hypervisor/arch/x86/include/asm/coloring.h
>>>>  create mode 100644 hypervisor/include/jailhouse/assert.h
>>>>  create mode 100644 hypervisor/include/jailhouse/panic.h
>>>>  create mode 100644 hypervisor/panic.c
>>>>
>>>
>>> Thanks for the update! I assume it's functional and can be used in
>>> tests? Asking as there seems to be a lot of debug code and comments.
>>
>> Yes, it is functional and it can be used for testing. I assumed this wasn't 
>> the
>> final version, and the debug code + commented out code are still there in the
>> places where it can be useful to take a look to understand how the machinery 
>> is
>> working.
>>
>> I'm also interested in your point of view regarding having a CONFIG_DEBUG 
>> option
>> in Jailhouse. Basically a "debugging/development-enabled" version (with 
>> assert()
>> and extra verbose output), and a "production" one.
>>
>>> Scanning through it, I still find it very arm64-centric, rather than
>>
>> It is. As mentioned, I don't know if it is meaningful to have a x86 version 
>> of
>> the coloring, and for how long there will be the need to have coloring in
>> software. So, the patches try to keep the changes in both hypervisor and 
>> driver
>> to a minimum.
> 
> x86 might not be meaningful for modern CPUs with CAT, but it is another
> test case. But I'm also thinking of RISC-V and what else might come along.

Coloring in software on x86 might not be trivial. But yes, RISC-V could be a
test case, and maybe there will be some hardware support that can be exploited
there...

>>> generically addressing the topic as I pointed out in previous reviews.
>>> And keeping modifications of the driver minimal is not necessarily a
>>> sign we are on the same page already.
>>
>> We've experimented also with a version that does "more" in the driver. I 
>> think
>> that more changes in the driver only add complexity. In the end the 
>> hypervisor
>> have to distinguish anyway between colored and non-colored mapping, and
>> scattering the load logic between driver and hypervisor has more maintenance
>> effort than a single clean configuration parameter.
> 
> You are possibly right, it's just hard for me to follow this as there is
> no code or more concrete design to prove that argument.

I've synched with Luca and Marco, and I've pushed here

https://gitlab.com/bastoni/jailhouse/-/commits/wip/design/coloring-driver/

some relevant commits that show how a possible driver-integrated solution could
look like.

Note that the "jailhouse_ioremap_col()" function in driver/main.c basically
replicates the same coloring loop that should also be provided by the hypervisor
to do colored-mapping and how the logic about the "next_colored()" should be
"moved-up" to be shared between HV and driver.

This non-clean separation, and the fact that the IPA/PA mapping should not
belong to the driver, but only to the HV are among the reasons why we moved to
the solution presented in the patch series.

>>> For testing purposes, please also include a QEMU configuration change in
>>> the future. Helps validating things and playing the functional changes.
>>
>> OK.

Please see the attached patch.

Best,
Andrea

>>> The patch series should further more be structured in way that a static,
>>> config-defined way-size can be used before any arch-specific
>>> auto-detection logic is added.
>>
>> Sure. As said, I assumed that discussion was needed, but I think it's better 
>> to
>> discuss with a draft of how the code will look like.
>>
> 
> On the one hand, discussion can be less effort than coding if everyone
> can follow the arguments and this leads to a common view. On the other
> hand, code has the advantage of make an even stronger argument as
> everyone can see the results and compare, and can even do that in the
> future when that topic pops up again for some reason.
> 
> Jan
> 

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/6af26a83-8be5-0317-92c6-ffcdf636375a%40tum.de.
From 2aa03d21d4f4fcb5086be809904c3a6c7af36c31 Mon Sep 17 00:00:00 2001
From: Andrea Bastoni <[email protected]>
Date: Wed, 25 Nov 2020 15:30:14 +0100
Subject: [PATCH 1/2] configs: arm64: add coloring example for qemu-arm64

---
 configs/arm64/qemu-arm64-inmate-demo-col.c | 134 +++++++++++++++++++++
 configs/arm64/qemu-arm64.c                 |   6 +
 2 files changed, 140 insertions(+)
 create mode 100644 configs/arm64/qemu-arm64-inmate-demo-col.c

diff --git a/configs/arm64/qemu-arm64-inmate-demo-col.c b/configs/arm64/qemu-arm64-inmate-demo-col.c
new file mode 100644
index 00000000..45bdf568
--- /dev/null
+++ b/configs/arm64/qemu-arm64-inmate-demo-col.c
@@ -0,0 +1,134 @@
+/*
+ * Jailhouse, a Linux-based partitioning hypervisor
+ *
+ * Configuration for demo inmate on QEMU arm64 virtual target
+ * 1 CPU, 64K RAM, 1 serial port
+ *
+ * Copyright (c) Siemens AG, 2017
+ *
+ * Authors:
+ *  Jan Kiszka <[email protected]>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include <jailhouse/types.h>
+#include <jailhouse/cell-config.h>
+
+struct {
+	struct jailhouse_cell_desc cell;
+	__u64 cpus[1];
+	struct jailhouse_memory mem_regions[8];
+	struct jailhouse_irqchip irqchips[1];
+	struct jailhouse_pci_device pci_devices[1];
+} __attribute__((packed)) config = {
+	.cell = {
+		.signature = JAILHOUSE_CELL_DESC_SIGNATURE,
+		.revision = JAILHOUSE_CONFIG_REVISION,
+		.name = "inmate-demo",
+		.flags = JAILHOUSE_CELL_PASSIVE_COMMREG,
+
+		.cpu_set_size = sizeof(config.cpus),
+		.num_memory_regions = ARRAY_SIZE(config.mem_regions),
+		.num_irqchips = ARRAY_SIZE(config.irqchips),
+		.num_pci_devices = ARRAY_SIZE(config.pci_devices),
+
+		.vpci_irq_base = 144-32,
+
+		.console = {
+			.address = 0x09000000,
+			.type = JAILHOUSE_CON_TYPE_PL011,
+			.flags = JAILHOUSE_CON_ACCESS_MMIO |
+				 JAILHOUSE_CON_REGDIST_4,
+		},
+	},
+
+	.cpus = {
+		0b0010,
+	},
+
+	.mem_regions = {
+		/* IVSHMEM shared memory regions (demo) */
+		{
+			.phys_start = 0x7faf0000,
+			.virt_start = 0x7faf0000,
+			.size = 0x1000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_ROOTSHARED,
+		},
+		{
+			.phys_start = 0x7faf1000,
+			.virt_start = 0x7faf1000,
+			.size = 0x9000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
+				JAILHOUSE_MEM_ROOTSHARED,
+		},
+		{
+			.phys_start = 0x7fafa000,
+			.virt_start = 0x7fafa000,
+			.size = 0x2000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_ROOTSHARED,
+		},
+		{
+			.phys_start = 0x7fafc000,
+			.virt_start = 0x7fafc000,
+			.size = 0x2000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
+				JAILHOUSE_MEM_ROOTSHARED,
+		},
+		{
+			.phys_start = 0x7fafe000,
+			.virt_start = 0x7fafe000,
+			.size = 0x2000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_ROOTSHARED,
+		},
+		/* UART */ {
+			.phys_start = 0x09000000,
+			.virt_start = 0x09000000,
+			.size = 0x1000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
+				JAILHOUSE_MEM_IO | JAILHOUSE_MEM_ROOTSHARED,
+		},
+		/* RAM */ {
+			.phys_start = 0x7fa00000,
+			.virt_start = 0,
+			.size = 0x00010000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
+				JAILHOUSE_MEM_EXECUTE | JAILHOUSE_MEM_LOADABLE |
+				JAILHOUSE_MEM_COLORED,
+			.colors=0x000f,
+		},
+		/* communication region */ {
+			.virt_start = 0x80000000,
+			.size = 0x00001000,
+			.flags = JAILHOUSE_MEM_READ | JAILHOUSE_MEM_WRITE |
+				JAILHOUSE_MEM_COMM_REGION,
+		},
+	},
+
+	.irqchips = {
+		/* GIC */ {
+			.address = 0x08000000,
+			.pin_base = 32,
+			.pin_bitmap = {
+				0,
+				0,
+				0,
+				(1 << (144 - 128))
+			},
+		},
+	},
+
+	.pci_devices = {
+		{ /* IVSHMEM (demo) */
+			.type = JAILHOUSE_PCI_TYPE_IVSHMEM,
+			.domain = 1,
+			.bdf = 0 << 3,
+			.bar_mask = JAILHOUSE_IVSHMEM_BAR_MASK_INTX,
+			.shmem_regions_start = 0,
+			.shmem_dev_id = 1,
+			.shmem_peers = 3,
+			.shmem_protocol = JAILHOUSE_SHMEM_PROTO_UNDEFINED,
+		},
+	},
+};
diff --git a/configs/arm64/qemu-arm64.c b/configs/arm64/qemu-arm64.c
index 2e9fcde6..f064c1b0 100644
--- a/configs/arm64/qemu-arm64.c
+++ b/configs/arm64/qemu-arm64.c
@@ -44,6 +44,12 @@ struct {
 			.pci_mmconfig_end_bus = 0,
 			.pci_is_virtual = 1,
 			.pci_domain = 1,
+			.color = {
+				/* in debug mode, the way_size is autodetected
+				 * if it is not specified */
+				/* .way_size = 0x20000, */
+				.root_map_offset = 0x0C000000000,
+			},
 			.arm = {
 				.gic_version = 3,
 				.gicd_base = 0x08000000,
-- 
2.29.2

Reply via email to