date:20240102

On 12/1/23 20:27, Volodymyr Babchuk wrote:
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 43216429d9..7c84cee5d1 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -598,6 +675,18 @@ static void cf_check rom_write(
>  rom->addr = val & PCI_ROM_ADDRESS_MASK;
>  }
>  
> +static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
> +unsigned int i)
> +{
> +char str[32];
> +
> +snprintf(str, sizeof(str), "%pp:BAR%u", >sbdf, i);
> +
> +bar->mem = rangeset_new(pdev->domain, str, RANGESETF_no_print);
> +
> +return !bar->mem ? -ENOMEM : 0;
> +}
> +
>  static int cf_check init_bars(struct pci_dev *pdev)
>  {
>  uint16_t cmd;
> @@ -679,6 +768,10 @@ static int cf_check init_bars(struct pci_dev *pdev)
>  else
>  bars[i].type = VPCI_BAR_MEM32;
>  
> +rc = bar_add_rangeset(pdev, [i], i);
> +if ( rc )
> +goto fail;
> +
>  rc = pci_size_mem_bar(pdev->sbdf, reg, , ,
>(i == num_bars - 1) ? PCI_BAR_LAST : 0);
>  if ( rc < 0 )
> @@ -728,6 +821,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
> rom_reg, 4, rom);
>  if ( rc )
>  rom->type = VPCI_BAR_EMPTY;
> +else
> +{
> +rc = bar_add_rangeset(pdev, rom, i);

Although purely cosmetic, it looks like this should use num_bars, not i.

> +if ( rc )
> +goto fail;
> +}
>  }
>  }
>  else

Re: [PATCH v11 07/17] vpci/header: implement guest BAR register handlers

On 12/21/23 10:43, Roger Pau Monné wrote:
> On Sat, Dec 02, 2023 at 01:27:04AM +, Volodymyr Babchuk wrote:
>> From: Oleksandr Andrushchenko 
>>
>> Add relevant vpci register handlers when assigning PCI device to a domain
>> and remove those when de-assigning. This allows having different
>> handlers for different domains, e.g. hwdom and other guests.
>>
>> Emulate guest BAR register values: this allows creating a guest view
>> of the registers and emulates size and properties probe as it is done
>> during PCI device enumeration by the guest.
>>
>> All empty, IO and ROM BARs for guests are emulated by returning 0 on
>> reads and ignoring writes: this BARs are special with this respect as
>> their lower bits have special meaning, so returning default ~0 on read
>> may confuse guest OS.
>>
>> Signed-off-by: Oleksandr Andrushchenko 
>> Signed-off-by: Volodymyr Babchuk 
> 
> Just a couple of nits.
> 
> Reviewed-by: Roger Pau Monné 

Thanks!

> 
>> ---
>> In v11:
>> - Access guest_addr after adjusting for MEM64_HI bar in
>> guest_bar_write()
>> - guest bar handlers renamed and now  _mem_ part to denote
>> that they are handling only memory BARs
>> - refuse to update guest BAR address if BAR is enabled
>> In v10:
>> - ull -> ULL to be MISRA-compatbile
>> - Use PAGE_OFFSET() instead of combining with ~PAGE_MASK
>> - Set type of empty bars to VPCI_BAR_EMPTY
>> In v9:
>> - factored-out "fail" label introduction in init_bars()
>> - replaced #ifdef CONFIG_X86 with IS_ENABLED()
>> - do not pass bars[i] to empty_bar_read() handler
>> - store guest's BAR address instead of guests BAR register view
>> Since v6:
>> - unify the writing of the PCI_COMMAND register on the
>>   error path into a label
>> - do not introduce bar_ignore_access helper and open code
>> - s/guest_bar_ignore_read/empty_bar_read
>> - update error message in guest_bar_write
>> - only setup empty_bar_read for IO if !x86
>> Since v5:
>> - make sure that the guest set address has the same page offset
>>   as the physical address on the host
>> - remove guest_rom_{read|write} as those just implement the default
>>   behaviour of the registers not being handled
>> - adjusted comment for struct vpci.addr field
>> - add guest handlers for BARs which are not handled and will otherwise
>>   return ~0 on read and ignore writes. The BARs are special with this
>>   respect as their lower bits have special meaning, so returning ~0
>>   doesn't seem to be right
>> Since v4:
>> - updated commit message
>> - s/guest_addr/guest_reg
>> Since v3:
>> - squashed two patches: dynamic add/remove handlers and guest BAR
>>   handler implementation
>> - fix guest BAR read of the high part of a 64bit BAR (Roger)
>> - add error handling to vpci_assign_device
>> - s/dom%pd/%pd
>> - blank line before return
>> Since v2:
>> - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
>>   has been eliminated from being built on x86
>> Since v1:
>>  - constify struct pci_dev where possible
>>  - do not open code is_system_domain()
>>  - simplify some code3. simplify
>>  - use gdprintk + error code instead of gprintk
>>  - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
>>so these do not get compiled for x86
>>  - removed unneeded is_system_domain check
>>  - re-work guest read/write to be much simpler and do more work on write
>>than read which is expected to be called more frequently
>>  - removed one too obvious comment
>> ---
>>  xen/drivers/vpci/header.c | 135 +-
>>  xen/include/xen/vpci.h|   3 +
>>  2 files changed, 122 insertions(+), 16 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index e6a1d58c42..43216429d9 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -477,6 +477,75 @@ static void cf_check bar_write(
>>  pci_conf_write32(pdev->sbdf, reg, val);
>>  }
>>  
>> +static void cf_check guest_mem_bar_write(const struct pci_dev *pdev,
>> + unsigned int reg, uint32_t val,
>> + void *data)
>> +{
>> +struct vpci_bar *bar = data;
>> +bool hi = false;
>> +uint64_t guest_addr;
>> +
>> +if ( bar->type == VPCI_BAR_MEM64_HI )
>> +{
>> +ASSERT(reg > PCI_BASE_ADDRESS_0);
>> +bar--;
>> +hi = true;
>> +}
>> +else
>> +{
>> +val &= PCI_BASE_ADDRESS_MEM_MASK;
>> +}
>> +
>> +guest_addr = bar->guest_addr;
>> +guest_addr &= ~(0xULL << (hi ? 32 : 0));
>> +guest_addr |= (uint64_t)val << (hi ? 32 : 0);
>> +
>> +/* Allow guest to size BAR correctly */
>> +guest_addr &= ~(bar->size - 1);
>> +
>> +/*
>> + * Xen only cares whether the BAR is mapped into the p2m, so allow BAR
>> + * writes as long as the BAR is not mapped into the p2m.
>> + */
>> +if ( bar->enabled )
>> +{
>> +/* If the value written is the current one avoid printing a

Re: XEN virtio question

2024-01-02 Thread Stefano Stabellini

Hi Peng,

Let me CC Vikram who might have seen this.


On Tue, 2 Jan 2024, Peng Fan wrote:
> All,
> 
> I am trying to setup xen virtio on NXP i.MX9, but when build xen tools,
> I always met qemu build error, such as:
> 
> In file included from ../qemu-xen-dir-remote/hw/xen/xen-operations.c:16:
> /home/Freenix/work/sw-stash/xen/upstream/tools/qemu-xen-dir-remote/
> include/hw/xen/xen_native.h:5:2: error: #error In Xen native files, 
> include xen_native.h before other Xen headers
> 5 | #error In Xen native files, include xen_native.h before other Xen 
> headers
>   |  ^
> 
> Anyone met this?
> 
> BTW, do u build qemu together with xen or build standalone qemu out of
> xen repo? Does the default qemu-system-i386 built out works well
> for virtio trying?
> 
> Thanks,
> Peng.
>

Re: [BUG]i2c_hid_acpi broken with 4.17.2 on Framework Laptop 13 AMD

2024-01-02 Thread Sébastien Chaumat

>  output of gpioinfo
>
> kernel alone :
>
> line   5: unnamed input active-low consumer=interrupt
> line  84: unnamed input active-low consumer=interrupt
>
> xen:
>
> line   5: unnamed input active-low
> line  84: unnamed input active-low
>
> xen with skipping IRQ7 double init :
>
> line   5: unnamed input active-low consumer=interrupt
> line  84: unnamed input active-low
>
>
> So definitely progressing.
>

Checking /sys/kernel/irq/7

kernel alone :
 actions: pinctrl_amd
 chip_name: IR-IO-APIC
 hwirq: 7
 name: fasteoi
 per_cpu_count: 0,0,0,0,0,20,0,0,0,0,0,0,0,0,0,0
 type: level
 wakeup: enabled

xen skipping IRQ7 double init :

actions: pinctrl_amd
 chip_name: xen-pirq
 hwirq:
 name: ioapic-level
 per_cpu_count: 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
 type: edge
 wakeup: disabled

So the skip of IRQ7 in pci_xen_initial_domain() sets the correct handler
 (IIUC xen uses the ioapic-level and handles the eoi separately), but not
the correct type (still edge).
I guess this may explains the results above.

Sébastien

Re: [PATCH v2 15/29] tools/libs/light: add backend type for 9pfs PV devices

2024-01-02 Thread Nick Rosbrook

On Wed, Dec 6, 2023 at 1:53 PM Jason Andryuk  wrote:
>
> On Wed, Dec 6, 2023 at 12:44 PM Nick Rosbrook  wrote:
> >
> > On Wed, Dec 6, 2023 at 9:36 AM Jason Andryuk  wrote:
> > > FYI, these IDL changes will require golang binding regeneration.
> > > (Maybe we shouldn't have generated code checked in...)
> >
> > The generated code needs to be checked in for it to work as a go module.
>
> I don't follow. The build system generates the *.gen.go binding files
> if they are missing.  They can then be used, installed or packaged.
> Why do they need to be checked into the git repo?  Checked in, they
> have a tendency to go stale.
>

That's not how go modules are typically consumed. E.g., the Debian
packages with go source code are only there to satisfy Build-Depends
of other packages. One can use locally installed go modules as
overrides at build-time, but normally go modules are fetched via git
(internally by the go tooling) according to what is specified in a
project's go.mod file. See https://go.dev/blog/using-go-modules for
more.

-Nick

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

2024-01-02 Thread Julien Grall


Hi Jan,

On 14/12/2023 10:35, Jan Beulich wrote:

On 14.12.2023 11:14, Julien Grall wrote:

On 14/12/2023 10:10, Jan Beulich wrote:

On 11.12.2023 13:23, Julien Grall wrote:

--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
   int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
   int __read_mostly nr_ioapics;
   
+/*

+ * The logic to check if the timer is working is expensive. So allow
+ * the admin to bypass it if they know their platform doesn't have
+ * a buggy timer.
+ */
+static bool __initdata pit_irq_works;
+boolean_param("pit-irq-works", pit_irq_works);
+
   /*
* Rough estimation of how many shared IRQs there are, can
* be changed anytime.
@@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
   {
   unsigned long t1, flags;
   
+if ( pit_irq_works )

+return 1;


When the check is placed here, what exactly use of the option means is
system dependent. I consider this somewhat risky, so I'd prefer if the
check was put on the "normal" path in check_timer(). That way it'll
affect only the one case which we can generally consider "known good",
but not the cases where the virtual wire setups are being probed. I.e.


By "known good", do you mean the following:

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index c89fbed8d675..c39d39ee951a 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1960,7 +1959,8 @@ static void __init check_timer(void)
  * Ok, does IRQ0 through the IOAPIC work?
  */
 unmask_IO_APIC_irq(irq_to_desc(0));
-if (timer_irq_works()) {
+if (pit_irq_works || timer_irq_works()) {
+printk("== pirq_irq_works %d =\n", pit_irq_works);
 local_irq_restore(flags);
 return;
 }



I am not against restricting when we allow skipping the timer check. But
in that case, I wonder why Linux is doing it differently?


Sadly Linux'es git history doesn't go back far enough (begins only at past
2.6.11), so I can't (easily) find the patch (and description) for the x86-64
change. The later i386 change is justified mainly by paravirt needs, so
isn't applicable here. I wouldn't therefore exclude that my point above
wasn't even taken into consideration. Furthermore their command line option
is "no_timer_check", which to me firmly says "don't check" without regard to
whether the source (PIT) is actually okay. That's different with the option
name you (imo validly) chose.


Just to note that the name was suggested by Roger. I have to admit that 
I didn't check if this made sense for the existing placement.


Anyway, I tested the change on the HW where I wanted to skip the timer 
check. And I can confirm this is still skipping the timer check.


So I will send a new version with the diff above and some updated comments.

Cheers,

--
Julien Grall

Re: [PATCH v11 05/17] vpci: add hooks for PCI device assign/de-assign

On 12/21/23 10:21, Roger Pau Monné wrote:
> On Sat, Dec 02, 2023 at 01:27:03AM +, Volodymyr Babchuk wrote:
>> From: Oleksandr Andrushchenko 
>>
>> When a PCI device gets assigned/de-assigned we need to
>> initialize/de-initialize vPCI state for the device.
>>
>> Also, rename vpci_add_handlers() to vpci_assign_device() and
>> vpci_remove_device() to vpci_deassign_device() to better reflect role
>> of the functions.
>>
>> Signed-off-by: Oleksandr Andrushchenko 
>> Signed-off-by: Volodymyr Babchuk 
> 
> Reviewed-by: Roger Pau Monné 

Thanks!

> 
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index d743d96a10..75cfb532ee 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -25,11 +25,11 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>>static vpci_register_init_t *const x##_entry  \
>> __used_section(".data.vpci." p) = x
>>  
>> -/* Add vPCI handlers to device. */
>> -int __must_check vpci_add_handlers(struct pci_dev *pdev);
>> +/* Assign vPCI to device by adding handlers to device. */
> 
> Nit: the comment would likely benefit from removing the last device
> before the full stop.

Will fix

> 
>> +int __must_check vpci_assign_device(struct pci_dev *pdev);
> 
> Thanks, Roger.

Re: [PATCH v11 05/17] vpci: add hooks for PCI device assign/de-assign

On 12/22/23 03:52, Jan Beulich wrote:
> On 02.12.2023 02:27, Volodymyr Babchuk wrote:
>> @@ -886,6 +890,10 @@ static int deassign_device(struct domain *d, uint16_t 
>> seg, uint8_t bus,
>>  
>>  pdev->fault.count = 0;
>>  
>> +write_lock(>pci_lock);
>> +ret = vpci_assign_device(pdev);
>> +write_unlock(>pci_lock);
>> +
>>   out:
>>  if ( ret )
>>  printk(XENLOG_G_ERR "%pd: deassign (%pp) failed (%d)\n",
> 
> Considering the function we're in, I think this "assign" deserves a comment.

OK, we will add a comment along the lines of:

/* Re-assign back to hardware_domain */

> It's necessary for hwdom only aiui, i.e. particularly not for the more
> typical case of putting the device in quarantine?

Right, since dom_io doesn't have vPCI

> 
> Jan

Re: [PATCH v11 03/17] vpci: use per-domain PCI lock to protect vpci structure

On 12/21/23 09:05, Roger Pau Monné wrote:
> On Sat, Dec 02, 2023 at 01:27:03AM +, Volodymyr Babchuk wrote:
>> From: Oleksandr Andrushchenko 
>>
>> Use a previously introduced per-domain read/write lock to check
>> whether vpci is present, so we are sure there are no accesses to the
>> contents of the vpci struct if not. This lock can be used (and in a
>> few cases is used right away) so that vpci removal can be performed
>> while holding the lock in write mode. Previously such removal could
>> race with vpci_read for example.
>>
>> When taking both d->pci_lock and pdev->vpci->lock, they should be
>> taken in this exact order: d->pci_lock then pdev->vpci->lock to avoid
>> possible deadlock situations.
>>
>> 1. Per-domain's pci_rwlock is used to protect pdev->vpci structure
>> from being removed.
>>
>> 2. Writing the command register and ROM BAR register may trigger
>> modify_bars to run, which in turn may access multiple pdevs while
>> checking for the existing BAR's overlap. The overlapping check, if
>> done under the read lock, requires vpci->lock to be acquired on both
>> devices being compared, which may produce a deadlock. It is not
>> possible to upgrade read lock to write lock in such a case. So, in
>> order to prevent the deadlock, use d->pci_lock instead. To prevent
>> deadlock while locking both hwdom->pci_lock and dom_xen->pci_lock,
>> always lock hwdom first.
> 
> FWIW (I think it was also mentioned in the previous version) for
> devices assigned to dom_xen taking the hwdom lock should be enough.
> There are no other accesses to such devices that don't originate from
> hwdom, and it's not possible to degassing devices from dom_xen.

OK, we will re-word the commit message and make the change in vpci_{read,write}.

> 
>>
>> All other code, which doesn't lead to pdev->vpci destruction and does
>> not access multiple pdevs at the same time, can still use a
>> combination of the read lock and pdev->vpci->lock.
>>
>> 3. Drop const qualifier where the new rwlock is used and this is
>> appropriate.
>>
>> 4. Do not call process_pending_softirqs with any locks held. For that
>> unlock prior the call and re-acquire the locks after. After
>> re-acquiring the lock there is no need to check if pdev->vpci exists:
>>  - in apply_map because of the context it is called (no race condition
>>possible)
>>  - for MSI/MSI-X debug code because it is called at the end of
>>pdev->vpci access and no further access to pdev->vpci is made
>>
>> 5. Use d->pci_lock around for_each_pdev and pci_get_pdev_by_domain
>> while accessing pdevs in vpci code.
>>
>> 6. We are removing multiple ASSERT(pcidevs_locked()) instances because
>> they are too strict now: they should be corrected to
>> ASSERT(pcidevs_locked() || rw_is_locked(>pci_lock)), but problem is
>> that mentioned instances does not have access to the domain
>> pointer and it is not feasible to pass a domain pointer to a function
>> just for debugging purposes.
>>
>> Suggested-by: Roger Pau Monné 
>> Suggested-by: Jan Beulich 
>> Signed-off-by: Oleksandr Andrushchenko 
>> Signed-off-by: Volodymyr Babchuk 
>>
>> ---
>> Changes in v11:
>>  - Fixed commit message regarding possible spinlocks
>>  - Removed parameter from allocate_and_map_msi_pirq(), which was added
>>  in the prev version. Now we are taking pcidevs_lock in
>>  physdev_map_pirq()
>>  - Returned ASSERT to pci_enable_msi
>>  - Fixed case when we took read lock instead of write one
>>  - Fixed label indentation
>>
>> Changes in v10:
>>  - Moved printk pas locked area
>>  - Returned back ASSERTs
>>  - Added new parameter to allocate_and_map_msi_pirq() so it knows if
>>  it should take the global pci lock
>>  - Added comment about possible improvement in vpci_write
>>  - Changed ASSERT(rw_is_locked()) to rw_is_write_locked() in
>>appropriate places
>>  - Renamed release_domain_locks() to release_domain_write_locks()
>>  - moved domain_done label in vpci_dump_msi() to correct place
>> Changes in v9:
>>  - extended locked region to protect vpci_remove_device and
>>vpci_add_handlers() calls
>>  - vpci_write() takes lock in the write mode to protect
>>potential call to modify_bars()
>>  - renamed lock releasing function
>>  - removed ASSERT()s from msi code
>>  - added trylock in vpci_dump_msi
>>
>> Changes in v8:
>>  - changed d->vpci_lock to d->pci_lock
>>  - introducing d->pci_lock in a separate patch
>>  - extended locked region in vpci_process_pending
>>  - removed pcidevs_lockis vpci_dump_msi()
>>  - removed some changes as they are not needed with
>>the new locking scheme
>>  - added handling for hwdom && dom_xen case
>> ---
>>  xen/arch/x86/hvm/vmsi.c   | 22 +++
>>  xen/arch/x86/hvm/vmx/vmx.c|  2 --
>>  xen/arch/x86/irq.c|  8 +++---
>>  xen/arch/x86/msi.c| 10 ++-
>>  xen/arch/x86/physdev.c|  2 ++
>>  xen/drivers/passthrough/pci.c |  9 +++---
>>  xen/drivers/vpci/header.c | 18 
>>  xen/drivers/vpci/msi.c|

Re: Xen 4.19 release schedule proposal

2024-01-02 Thread Oleksii

Something went wrong, and this email wasn't sent as a separate topic,
so I re-sent it.

Sorry for any inconvenience.

~ Oleksii
On Tue, 2024-01-02 at 18:59 +0200, Oleksii wrote:
> Dear community,
> 
> Wishing you a Happy New Year!
> 
> I'd like to propose the release schedule for Xen 4.19.
> 
> Based on the previous release schedules [1] and [2], it seems the
> next
> release date should be on Wednesday, July 10, 2024:
> 
> ** Proposed option: Wed Jul 10, 2024 **
> (+9 months from Xen 4.18 release)
> 
> - Last posting date  Fri Apr 26, 2024
> 
> Patches adding new features are expected to be posted to the mailing
> list by this date, although perhaps not in their final version.
> 
> - Feature freeze Fri May 17, 2024 (+3 weeks from Last
> posting date)
> 
> Patches adding new features should be committed by this date.
> Straightforward bugfixes may continue to be accepted by maintainers.
> 
> - Code freeze    Fri May 31, 2024 (+2 weeks from Feature
> freeze)
> 
> Bugfixes only.
> 
> - Hard code freeze   Fri Jun 21, 2024 (+3 weeks from Code
> freeze)
> 
> Bugfixes for serious bugs (including regressions), and low-risk fixes
> only.
> 
> - Final commits  Fri Jul 05, 2024 (+2 weeks from Hard
> code
> freeze)
> 
> Branch off staging-4.19.
> 
> - Release    Wed Jul 10, 2024
> 
> If there are no objections, we will stick to the proposed schedule.
> 
> One more thing I'd like to discuss is whether to add a file
> (RELEASE.md) with the release schedule to the source code or update
> an
> existing one (xen-release-management.pandoc). I think it will help to
> find the final release schedule for the nearest release. For example,
> for the previous release, there are a lot of emails with proposed
> schedules, polls of Xen release schedules, and I found the final
> release schedule in just one of the replies (but probably I missed
> something).
> 
> I'll be happy to hear any comments from you. Thanks in advance.
> 
> Best regards,
>  Oleksii
> 
> [1]
> https://lore.kernel.org/xen-devel/as8pr08mb7991424a3167c70a9b29530c92...@as8pr08mb7991.eurprd08.prod.outlook.com/
> [2]
> https://lists.xen.org/archives/html/xen-devel/2018-07/msg02240.html

Xen 4.19 release schedule proposal

2024-01-02 Thread Oleksii

Dear community,

Wishing you a Happy New Year!

I'd like to propose the release schedule for Xen 4.19.

Based on the previous release schedules [1] and [2], it seems the next
release date should be on Wednesday, July 10, 2024:

** Proposed option: Wed Jul 10, 2024 **
(+9 months from Xen 4.18 release)

- Last posting date Fri Apr 26, 2024

Patches adding new features are expected to be posted to the mailing
list by this date, although perhaps not in their final version.

- Feature freeze Fri May 17, 2024 (+3 weeks from Last
posting date)

Patches adding new features should be committed by this date.
Straightforward bugfixes may continue to be accepted by maintainers.

- Code freezeFri May 31, 2024 (+2 weeks from Feature
freeze)

Bugfixes only.

- Hard code freeze Fri Jun 21, 2024 (+3 weeks from Code
freeze)

Bugfixes for serious bugs (including regressions), and low-risk fixes
only.

- Final commits Fri Jul 05, 2024 (+2 weeks from Hard code
freeze)

Branch off staging-4.19.

- ReleaseWed Jul 10, 2024

If there are no objections, we will stick to the proposed schedule.

One more thing I'd like to discuss is whether to add a file
(RELEASE.md) with the release schedule to the source code or update an
existing one (xen-release-management.pandoc). I think it will help to
find the final release schedule for the nearest release. For example,
for the previous release, there are a lot of emails with proposed
schedules, polls of Xen release schedules, and I found the final
release schedule in just one of the replies (but probably I missed
something).

I'll be happy to hear any comments from you. Thanks in advance.

Best regards,
Oleksii

[1]
https://lore.kernel.org/xen-devel/as8pr08mb7991424a3167c70a9b29530c92...@as8pr08mb7991.eurprd08.prod.outlook.com/
[2] https://lists.xen.org/archives/html/xen-devel/2018-07/msg02240.html

Xen 4.19 release schedule proposal

2024-01-02 Thread Oleksii

Dear community,

Wishing you a Happy New Year!

I'd like to propose the release schedule for Xen 4.19.

Based on the previous release schedules [1] and [2], it seems the next
release date should be on Wednesday, July 10, 2024:

** Proposed option: Wed Jul 10, 2024 **
(+9 months from Xen 4.18 release)

- Last posting date Fri Apr 26, 2024

Patches adding new features are expected to be posted to the mailing
list by this date, although perhaps not in their final version.

- Feature freeze Fri May 17, 2024 (+3 weeks from Last
posting date)

Patches adding new features should be committed by this date.
Straightforward bugfixes may continue to be accepted by maintainers.

- Code freezeFri May 31, 2024 (+2 weeks from Feature
freeze)

Bugfixes only.

- Hard code freeze Fri Jun 21, 2024 (+3 weeks from Code
freeze)

Bugfixes for serious bugs (including regressions), and low-risk fixes
only.

- Final commits Fri Jul 05, 2024 (+2 weeks from Hard code
freeze)

Branch off staging-4.19.

- ReleaseWed Jul 10, 2024

If there are no objections, we will stick to the proposed schedule.

I'll be happy to hear any comments from you. Thanks in advance.

Best regards,
Oleksii

[1]
https://lore.kernel.org/xen-devel/as8pr08mb7991424a3167c70a9b29530c92...@as8pr08mb7991.eurprd08.prod.outlook.com/
[2] https://lists.xen.org/archives/html/xen-devel/2018-07/msg02240.html

Re: [PATCH v3 5/5] Rename "QEMU global mutex" to "BQL" in comments and docs

2024-01-02 Thread Cédric Le Goater


On 1/2/24 16:35, Stefan Hajnoczi wrote:

The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.

Re: [PATCH v3 4/5] Replace "iothread lock" with "BQL" in comments

2024-01-02 Thread Cédric Le Goater


On 1/2/24 16:35, Stefan Hajnoczi wrote:

The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 



Reviewed-by: Cédric Le Goater 

Thanks,

C.

[PATCH v3 1/5] system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()

The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().

The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.

The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)

There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Paul Durrant 
Acked-by: Fabiano Rosas 
Acked-by: David Woodhouse 
Reviewed-by: Cédric Le Goater 
Acked-by: Peter Xu 
Acked-by: Eric Farman 
Reviewed-by: Harsh Prateek Bora 
Acked-by: Hyman Huang 
---
 include/block/aio-wait.h |   2 +-
 include/qemu/main-loop.h |  39 +
 include/qemu/thread.h|   2 +-
 accel/accel-blocker.c|  10 +--
 accel/dummy-cpus.c   |   8 +-
 accel/hvf/hvf-accel-ops.c|   4 +-
 accel/kvm/kvm-accel-ops.c|   4 +-
 accel/kvm/kvm-all.c  |  22 ++---
 accel/tcg/cpu-exec.c |  26 +++---
 accel/tcg/cputlb.c   |  16 ++--
 accel/tcg/tcg-accel-ops-icount.c |   4 +-
 accel/tcg/tcg-accel-ops-mttcg.c  |  12 +--
 accel/tcg/tcg-accel-ops-rr.c |  14 ++--
 accel/tcg/tcg-accel-ops.c|   2 +-
 accel/tcg/translate-all.c|   2 +-
 cpu-common.c |   4 +-
 dump/dump.c  |   4 +-
 hw/core/cpu-common.c |   6 +-
 hw/i386/intel_iommu.c|   6 +-
 hw/i386/kvm/xen_evtchn.c |  16 ++--
 hw/i386/kvm/xen_overlay.c|   2 +-
 hw/i386/kvm/xen_xenstore.c   |   2 +-
 hw/intc/arm_gicv3_cpuif.c|   2 +-
 hw/intc/s390_flic.c  |  18 ++--
 hw/misc/edu.c|   4 +-
 hw/misc/imx6_src.c   |   2 +-
 hw/misc/imx7_src.c   |   2 +-
 hw/net/xen_nic.c |   8 +-
 hw/ppc/pegasos2.c|   2 +-
 hw/ppc/ppc.c |   4 +-
 hw/ppc/spapr.c   |   2 +-
 hw/ppc/spapr_rng.c   |   4 +-
 hw/ppc/spapr_softmmu.c   |   4 +-
 hw/remote/mpqemu-link.c  |  20 ++---
 hw/remote/vfio-user-obj.c|   2 +-
 hw/s390x/s390-skeys.c|   2 +-
 migration/block-dirty-bitmap.c   |   4 +-
 migration/block.c|  16 ++--
 migration/colo.c |  60 +++---
 migration/dirtyrate.c|  12 +--
 migration/migration.c|  52 ++--
 migration/ram.c  |  12 +--
 replay/replay-internal.c |   2 +-
 semihosting/console.c|   8 +-
 stubs/iothread-lock.c|   6 +-
 system/cpu-throttle.c|   4 +-
 system/cpus.c|  51 ++--
 system/dirtylimit.c  |   4 +-
 system/memory.c  |   2 +-
 system/physmem.c |   8 +-
 system/runstate.c|   2 +-
 system/watchpoint.c  |   4 +-
 target/arm/arm-powerctl.c|  14 ++--
 target/arm/helper.c  |   4 +-
 target/arm/hvf/hvf.c |   8 +-
 target/arm/kvm.c |   8 +-
 target/arm/ptw.c |   6 +-
 target/arm/tcg/helper-a64.c  |   8 +-
 target/arm/tcg/m_helper.c|   6 +-
 target/arm/tcg/op_helper.c   |  24 +++---
 target/arm/tcg/psci.c|   2 +-
 target/hppa/int_helper.c |   8 +-
 target/i386/hvf/hvf.c|   6 +-
 target/i386/kvm/hyperv.c |   4 +-
 target/i386/kvm/kvm.c|  28 +++
 target/i386/kvm/xen-emu.c|  14 ++--
 target/i386/nvmm/nvmm-accel-ops.c|   4 +-
 target/i386/nvmm/nvmm-all.c  |  20 ++---
 target/i386/tcg/sysemu/fpu_helper.c  |   6 +-
 target/i386/tcg/sysemu/misc_helper.c |   4 +-
 target/i386/whpx/whpx-accel-ops.c|   4 +-
 target/i386/whpx/whpx-all.c  |  24 +++---
 target/loongarch/csr_helper.c|   4 +-
 target/mips/kvm.c|   4 +-
 target/mips/tcg/sysemu/cp0_helper.c  |   4 +-
 target/openrisc/sys_helper.c |  16 ++--
 target/ppc/excp_helper.c |  12 +--
 target/ppc/kvm.c |   4 +-
 target/ppc/misc_helper.c |   8 +-
 target/ppc/timebase_helper.c

[PATCH v3 5/5] Rename "QEMU global mutex" to "BQL" in comments and docs

The term "QEMU global mutex" is identical to the more widely used Big
QEMU Lock ("BQL"). Update the code comments and documentation to use
"BQL" instead of "QEMU global mutex".

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
---
 docs/devel/multi-thread-tcg.rst   |  7 +++
 docs/devel/qapi-code-gen.rst  |  2 +-
 docs/devel/replay.rst |  2 +-
 docs/devel/multiple-iothreads.txt | 14 +++---
 include/block/blockjob.h  |  6 +++---
 include/io/task.h |  2 +-
 include/qemu/coroutine-core.h |  2 +-
 include/qemu/coroutine.h  |  2 +-
 hw/block/dataplane/virtio-blk.c   |  8 
 hw/block/virtio-blk.c |  2 +-
 hw/scsi/virtio-scsi-dataplane.c   |  6 +++---
 net/tap.c |  2 +-
 12 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/docs/devel/multi-thread-tcg.rst b/docs/devel/multi-thread-tcg.rst
index c9541a7b20..7302c3bf53 100644
--- a/docs/devel/multi-thread-tcg.rst
+++ b/docs/devel/multi-thread-tcg.rst
@@ -226,10 +226,9 @@ instruction. This could be a future optimisation.
 Emulated hardware state
 ---
 
-Currently thanks to KVM work any access to IO memory is automatically
-protected by the global iothread mutex, also known as the BQL (Big
-QEMU Lock). Any IO region that doesn't use global mutex is expected to
-do its own locking.
+Currently thanks to KVM work any access to IO memory is automatically protected
+by the BQL (Big QEMU Lock). Any IO region that doesn't use the BQL is expected
+to do its own locking.
 
 However IO memory isn't the only way emulated hardware state can be
 modified. Some architectures have model specific registers that
diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst
index 7f78183cd4..ea8228518c 100644
--- a/docs/devel/qapi-code-gen.rst
+++ b/docs/devel/qapi-code-gen.rst
@@ -594,7 +594,7 @@ blocking the guest and other background operations.
 Coroutine safety can be hard to prove, similar to thread safety.  Common
 pitfalls are:
 
-- The global mutex isn't held across ``qemu_coroutine_yield()``, so
+- The BQL isn't held across ``qemu_coroutine_yield()``, so
   operations that used to assume that they execute atomically may have
   to be more careful to protect against changes in the global state.
 
diff --git a/docs/devel/replay.rst b/docs/devel/replay.rst
index 0244be8b9c..effd856f0c 100644
--- a/docs/devel/replay.rst
+++ b/docs/devel/replay.rst
@@ -184,7 +184,7 @@ modes.
 Reading and writing requests are created by CPU thread of QEMU. Later these
 requests proceed to block layer which creates "bottom halves". Bottom
 halves consist of callback and its parameters. They are processed when
-main loop locks the global mutex. These locks are not synchronized with
+main loop locks the BQL. These locks are not synchronized with
 replaying process because main loop also processes the events that do not
 affect the virtual machine state (like user interaction with monitor).
 
diff --git a/docs/devel/multiple-iothreads.txt 
b/docs/devel/multiple-iothreads.txt
index 4865196bde..de85767b12 100644
--- a/docs/devel/multiple-iothreads.txt
+++ b/docs/devel/multiple-iothreads.txt
@@ -5,7 +5,7 @@ the COPYING file in the top-level directory.
 
 
 This document explains the IOThread feature and how to write code that runs
-outside the QEMU global mutex.
+outside the BQL.
 
 The main loop and IOThreads
 ---
@@ -29,13 +29,13 @@ scalability bottleneck on hosts with many CPUs.  Work can 
be spread across
 several IOThreads instead of just one main loop.  When set up correctly this
 can improve I/O latency and reduce jitter seen by the guest.
 
-The main loop is also deeply associated with the QEMU global mutex, which is a
-scalability bottleneck in itself.  vCPU threads and the main loop use the QEMU
-global mutex to serialize execution of QEMU code.  This mutex is necessary
-because a lot of QEMU's code historically was not thread-safe.
+The main loop is also deeply associated with the BQL, which is a
+scalability bottleneck in itself.  vCPU threads and the main loop use the BQL
+to serialize execution of QEMU code.  This mutex is necessary because a lot of
+QEMU's code historically was not thread-safe.
 
 The fact that all I/O processing is done in a single main loop and that the
-QEMU global mutex is contended by all vCPU threads and the main loop explain
+BQL is contended by all vCPU threads and the main loop explain
 why it is desirable to place work into IOThreads.
 
 The experimental virtio-blk data-plane implementation has been benchmarked and
@@ -66,7 +66,7 @@ There are several old APIs that use the main loop AioContext:
 
 Since they implicitly work on the main loop they cannot be used in code that
 runs in an IOThread.  They might cause a crash or deadlock if called from an
-IOThread since the QEMU global mutex is not held.
+IOThread since the BQL is not held.

[PATCH v3 2/5] qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to BQL_LOCK_GUARD

The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead, it is already widely used and unambiguous.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Paul Durrant 
Acked-by: David Woodhouse 
Reviewed-by: Cédric Le Goater 
Acked-by: Ilya Leoshkevich 
---
 include/qemu/main-loop.h  | 19 +--
 hw/i386/kvm/xen_evtchn.c  | 14 +++---
 hw/i386/kvm/xen_gnttab.c  |  2 +-
 hw/mips/mips_int.c|  2 +-
 hw/ppc/ppc.c  |  2 +-
 target/i386/kvm/xen-emu.c |  2 +-
 target/ppc/excp_helper.c  |  2 +-
 target/ppc/helper_regs.c  |  2 +-
 target/riscv/cpu_helper.c |  4 ++--
 9 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 72ebc0cb3a..c26ad2a029 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -343,33 +343,32 @@ void bql_lock_impl(const char *file, int line);
 void bql_unlock(void);
 
 /**
- * QEMU_IOTHREAD_LOCK_GUARD
+ * BQL_LOCK_GUARD
  *
  * Wrap a block of code in a conditional bql_{lock,unlock}.
  */
-typedef struct IOThreadLockAuto IOThreadLockAuto;
+typedef struct BQLLockAuto BQLLockAuto;
 
-static inline IOThreadLockAuto *qemu_iothread_auto_lock(const char *file,
-int line)
+static inline BQLLockAuto *bql_auto_lock(const char *file, int line)
 {
 if (bql_locked()) {
 return NULL;
 }
 bql_lock_impl(file, line);
 /* Anything non-NULL causes the cleanup function to be called */
-return (IOThreadLockAuto *)(uintptr_t)1;
+return (BQLLockAuto *)(uintptr_t)1;
 }
 
-static inline void qemu_iothread_auto_unlock(IOThreadLockAuto *l)
+static inline void bql_auto_unlock(BQLLockAuto *l)
 {
 bql_unlock();
 }
 
-G_DEFINE_AUTOPTR_CLEANUP_FUNC(IOThreadLockAuto, qemu_iothread_auto_unlock)
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, bql_auto_unlock)
 
-#define QEMU_IOTHREAD_LOCK_GUARD() \
-g_autoptr(IOThreadLockAuto) _iothread_lock_auto __attribute__((unused)) \
-= qemu_iothread_auto_lock(__FILE__, __LINE__)
+#define BQL_LOCK_GUARD() \
+g_autoptr(BQLLockAuto) _bql_lock_auto __attribute__((unused)) \
+= bql_auto_lock(__FILE__, __LINE__)
 
 /*
  * qemu_cond_wait_iothread: Wait on condition for the main loop mutex
diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index d7d15cfaf7..bd077eda6d 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1127,7 +1127,7 @@ int xen_evtchn_reset_op(struct evtchn_reset *reset)
 return -ESRCH;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 return xen_evtchn_soft_reset();
 }
 
@@ -1145,7 +1145,7 @@ int xen_evtchn_close_op(struct evtchn_close *close)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 qemu_mutex_lock(>port_lock);
 
 ret = close_port(s, close->port, _kvm_routes);
@@ -1272,7 +1272,7 @@ int xen_evtchn_bind_pirq_op(struct evtchn_bind_pirq *pirq)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 
 if (s->pirq[pirq->pirq].port) {
 return -EBUSY;
@@ -1824,7 +1824,7 @@ int xen_physdev_map_pirq(struct physdev_map_pirq *map)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(>port_lock);
 
 if (map->domid != DOMID_SELF && map->domid != xen_domid) {
@@ -1884,7 +1884,7 @@ int xen_physdev_unmap_pirq(struct physdev_unmap_pirq 
*unmap)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 qemu_mutex_lock(>port_lock);
 
 if (!pirq_inuse(s, pirq)) {
@@ -1924,7 +1924,7 @@ int xen_physdev_eoi_pirq(struct physdev_eoi *eoi)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(>port_lock);
 
 if (!pirq_inuse(s, pirq)) {
@@ -1956,7 +1956,7 @@ int xen_physdev_query_pirq(struct 
physdev_irq_status_query *query)
 return -ENOTSUP;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(>port_lock);
 
 if (!pirq_inuse(s, pirq)) {
diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 0a24f53f20..d9477ae927 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -176,7 +176,7 @@ int xen_gnttab_map_page(uint64_t idx, uint64_t gfn)
 return -EINVAL;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 QEMU_LOCK_GUARD(>gnt_lock);
 
 xen_overlay_do_map_page(>gnt_aliases[idx], gpa);
diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c
index 6c32e466a3..eef2fd2cd1 100644
--- a/hw/mips/mips_int.c
+++ b/hw/mips/mips_int.c
@@ -36,7 +36,7 @@ static void cpu_mips_irq_request(void *opaque, int irq, int 
level)
 return;
 }
 
-QEMU_IOTHREAD_LOCK_GUARD();
+BQL_LOCK_GUARD();
 
 if (level) {
 env->CP0_Cause |= 1 << (irq + CP0Ca_IP);
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index b6581c16fc..7387b5b677 100644

[PATCH v3 4/5] Replace "iothread lock" with "BQL" in comments

The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Philippe Mathieu-Daudé 
---
 docs/devel/reset.rst |  2 +-
 hw/display/qxl.h |  2 +-
 include/exec/cpu-common.h|  2 +-
 include/exec/memory.h|  4 ++--
 include/exec/ramblock.h  |  2 +-
 include/migration/register.h |  8 
 target/arm/internals.h   |  4 ++--
 accel/tcg/cputlb.c   |  4 ++--
 accel/tcg/tcg-accel-ops-icount.c |  2 +-
 hw/remote/mpqemu-link.c  |  2 +-
 migration/block-dirty-bitmap.c   | 10 +-
 migration/block.c| 22 +++---
 migration/colo.c |  2 +-
 migration/migration.c|  2 +-
 migration/ram.c  |  4 ++--
 system/physmem.c |  6 +++---
 target/arm/helper.c  |  2 +-
 ui/spice-core.c  |  2 +-
 util/rcu.c   |  2 +-
 audio/coreaudio.m|  4 ++--
 ui/cocoa.m   |  6 +++---
 21 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/docs/devel/reset.rst b/docs/devel/reset.rst
index 38ed1790f7..d4e79718ba 100644
--- a/docs/devel/reset.rst
+++ b/docs/devel/reset.rst
@@ -19,7 +19,7 @@ Triggering reset
 
 This section documents the APIs which "users" of a resettable object should use
 to control it. All resettable control functions must be called while holding
-the iothread lock.
+the BQL.
 
 You can apply a reset to an object using ``resettable_assert_reset()``. You 
need
 to call ``resettable_release_reset()`` to release the object from reset. To
diff --git a/hw/display/qxl.h b/hw/display/qxl.h
index fdac14edad..e0a85a5ca4 100644
--- a/hw/display/qxl.h
+++ b/hw/display/qxl.h
@@ -159,7 +159,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(PCIQXLDevice, PCI_QXL)
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 41115d8919..fef3138d29 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -92,7 +92,7 @@ RAMBlock *qemu_ram_block_by_name(const char *name);
  *
  * By the time this function returns, the returned pointer is not protected
  * by RCU anymore.  If the caller is not within an RCU critical section and
- * does not hold the iothread lock, it must have other means of protecting the
+ * does not hold the BQL, it must have other means of protecting the
  * pointer, such as a reference to the memory region that owns the RAMBlock.
  */
 RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index f172e82ac9..991ab8c6e8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1962,7 +1962,7 @@ int memory_region_get_fd(MemoryRegion *mr);
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
@@ -1979,7 +1979,7 @@ MemoryRegion *memory_region_from_host(void *ptr, 
ram_addr_t *offset);
  *
  * Use with care; by the time this function returns, the returned pointer is
  * not protected by RCU anymore.  If the caller is not within an RCU critical
- * section and does not hold the iothread lock, it must have other means of
+ * section and does not hold the BQL, it must have other means of
  * protecting the pointer, such as a reference to the region that includes
  * the incoming ram_addr_t.
  *
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 69c6a53902..3eb79723c6 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -34,7 +34,7 @@ struct RAMBlock {
 ram_addr_t max_length;
 void (*resized)(const char*, uint64_t length, void *host);
 uint32_t flags;
-/* Protected by iothread lock.  */
+/* Protected by the BQL.  */
 char idstr[256];
 /* RCU-enabled, writes protected by the ramlist lock */
 QLIST_ENTRY(RAMBlock) next;
diff --git a/include/migration/register.h b/include/migration/register.h
index fed1d04a3c..9ab1f79512 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -17,7 +17,7 @@
 #include "hw/vmstate-if.h"
 
 typedef struct SaveVMHandlers {
-/* This runs inside the iothread lock.

[PATCH v3 3/5] qemu/main-loop: rename qemu_cond_wait_iothread() to qemu_cond_wait_bql()

The name "iothread" is overloaded. Use the term Big QEMU Lock (BQL)
instead, it is already widely used and unambiguous.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/qemu/main-loop.h  | 10 +-
 accel/tcg/tcg-accel-ops-rr.c  |  4 ++--
 hw/display/virtio-gpu.c   |  2 +-
 hw/ppc/spapr_events.c |  2 +-
 system/cpu-throttle.c |  2 +-
 system/cpus.c |  4 ++--
 target/i386/nvmm/nvmm-accel-ops.c |  2 +-
 target/i386/whpx/whpx-accel-ops.c |  2 +-
 8 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index c26ad2a029..5764db157c 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -371,17 +371,17 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(BQLLockAuto, 
bql_auto_unlock)
 = bql_auto_lock(__FILE__, __LINE__)
 
 /*
- * qemu_cond_wait_iothread: Wait on condition for the main loop mutex
+ * qemu_cond_wait_bql: Wait on condition for the Big QEMU Lock (BQL)
  *
- * This function atomically releases the main loop mutex and causes
+ * This function atomically releases the Big QEMU Lock (BQL) and causes
  * the calling thread to block on the condition.
  */
-void qemu_cond_wait_iothread(QemuCond *cond);
+void qemu_cond_wait_bql(QemuCond *cond);
 
 /*
- * qemu_cond_timedwait_iothread: like the previous, but with timeout
+ * qemu_cond_timedwait_bql: like the previous, but with timeout
  */
-void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
+void qemu_cond_timedwait_bql(QemuCond *cond, int ms);
 
 /* internal interfaces */
 
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index c4ea372a3f..5794e5a9ce 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -111,7 +111,7 @@ static void rr_wait_io_event(void)
 
 while (all_cpu_threads_idle()) {
 rr_stop_kick_timer();
-qemu_cond_wait_iothread(first_cpu->halt_cond);
+qemu_cond_wait_bql(first_cpu->halt_cond);
 }
 
 rr_start_kick_timer();
@@ -198,7 +198,7 @@ static void *rr_cpu_thread_fn(void *arg)
 
 /* wait for initial kick-off after machine start */
 while (first_cpu->stopped) {
-qemu_cond_wait_iothread(first_cpu->halt_cond);
+qemu_cond_wait_bql(first_cpu->halt_cond);
 
 /* process any pending work */
 CPU_FOREACH(cpu) {
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index b016d3bac8..67c5be1a4e 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1512,7 +1512,7 @@ void virtio_gpu_reset(VirtIODevice *vdev)
 g->reset_finished = false;
 qemu_bh_schedule(g->reset_bh);
 while (!g->reset_finished) {
-qemu_cond_wait_iothread(>reset_cond);
+qemu_cond_wait_bql(>reset_cond);
 }
 } else {
 virtio_gpu_reset_bh(g);
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index deb4641505..cb0587 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -899,7 +899,7 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
 }
 return;
 }
-qemu_cond_wait_iothread(>fwnmi_machine_check_interlock_cond);
+qemu_cond_wait_bql(>fwnmi_machine_check_interlock_cond);
 if (spapr->fwnmi_machine_check_addr == -1) {
 /*
  * If the machine was reset while waiting for the interlock,
diff --git a/system/cpu-throttle.c b/system/cpu-throttle.c
index 786a9a5639..c951a6c65e 100644
--- a/system/cpu-throttle.c
+++ b/system/cpu-throttle.c
@@ -54,7 +54,7 @@ static void cpu_throttle_thread(CPUState *cpu, 
run_on_cpu_data opaque)
 endtime_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + sleeptime_ns;
 while (sleeptime_ns > 0 && !cpu->stop) {
 if (sleeptime_ns > SCALE_MS) {
-qemu_cond_timedwait_iothread(cpu->halt_cond,
+qemu_cond_timedwait_bql(cpu->halt_cond,
  sleeptime_ns / SCALE_MS);
 } else {
 bql_unlock();
diff --git a/system/cpus.c b/system/cpus.c
index 9b68dc9c7c..c8e2772b5f 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -514,12 +514,12 @@ void bql_unlock(void)
 qemu_mutex_unlock();
 }
 
-void qemu_cond_wait_iothread(QemuCond *cond)
+void qemu_cond_wait_bql(QemuCond *cond)
 {
 qemu_cond_wait(cond, );
 }
 
-void qemu_cond_timedwait_iothread(QemuCond *cond, int ms)
+void qemu_cond_timedwait_bql(QemuCond *cond, int ms)
 {
 qemu_cond_timedwait(cond, , ms);
 }
diff --git a/target/i386/nvmm/nvmm-accel-ops.c 
b/target/i386/nvmm/nvmm-accel-ops.c
index f9d5e9a37a..6b2bfd9b9c 100644
--- a/target/i386/nvmm/nvmm-accel-ops.c
+++ b/target/i386/nvmm/nvmm-accel-ops.c
@@ -48,7 +48,7 @@ static void *qemu_nvmm_cpu_thread_fn(void *arg)
 }
 }
 while (cpu_thread_is_idle(cpu)) {
-qemu_cond_wait_iothread(cpu->halt_cond);
+

[PATCH v3 0/5] Make Big QEMU Lock naming consistent

v3:
- Rebase
- Define bql_lock() macro on a single line [Akihiko Odaki]
v2:
- Rename APIs bql_*() [PeterX]
- Spell out "Big QEMU Lock (BQL)" in doc comments [PeterX]
- Rename "iolock" variables in hw/remote/mpqemu-link.c [Harsh]
- Fix bql_auto_lock() indentation in Patch 2 [Ilya]
- "with BQL taken" -> "with the BQL taken" [Philippe]
- "under BQL" -> "under the BQL" [Philippe]

The Big QEMU Lock ("BQL") has two other names: "iothread lock" and "QEMU global
mutex". The term "iothread lock" is easily confused with the unrelated --object
iothread (iothread.c).

This series updates the code and documentation to consistently use "BQL". This
makes the code easier to understand.

Stefan Hajnoczi (5):
  system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()
  qemu/main-loop: rename QEMU_IOTHREAD_LOCK_GUARD to BQL_LOCK_GUARD
  qemu/main-loop: rename qemu_cond_wait_iothread() to
qemu_cond_wait_bql()
  Replace "iothread lock" with "BQL" in comments
  Rename "QEMU global mutex" to "BQL" in comments and docs

 docs/devel/multi-thread-tcg.rst  |   7 +-
 docs/devel/qapi-code-gen.rst |   2 +-
 docs/devel/replay.rst|   2 +-
 docs/devel/reset.rst |   2 +-
 docs/devel/multiple-iothreads.txt|  14 ++--
 hw/display/qxl.h |   2 +-
 include/block/aio-wait.h |   2 +-
 include/block/blockjob.h |   6 +-
 include/exec/cpu-common.h|   2 +-
 include/exec/memory.h|   4 +-
 include/exec/ramblock.h  |   2 +-
 include/io/task.h|   2 +-
 include/migration/register.h |   8 +-
 include/qemu/coroutine-core.h|   2 +-
 include/qemu/coroutine.h |   2 +-
 include/qemu/main-loop.h |  68 ---
 include/qemu/thread.h|   2 +-
 target/arm/internals.h   |   4 +-
 accel/accel-blocker.c|  10 +--
 accel/dummy-cpus.c   |   8 +-
 accel/hvf/hvf-accel-ops.c|   4 +-
 accel/kvm/kvm-accel-ops.c|   4 +-
 accel/kvm/kvm-all.c  |  22 ++---
 accel/tcg/cpu-exec.c |  26 +++---
 accel/tcg/cputlb.c   |  20 ++---
 accel/tcg/tcg-accel-ops-icount.c |   6 +-
 accel/tcg/tcg-accel-ops-mttcg.c  |  12 +--
 accel/tcg/tcg-accel-ops-rr.c |  18 ++--
 accel/tcg/tcg-accel-ops.c|   2 +-
 accel/tcg/translate-all.c|   2 +-
 cpu-common.c |   4 +-
 dump/dump.c  |   4 +-
 hw/block/dataplane/virtio-blk.c  |   8 +-
 hw/block/virtio-blk.c|   2 +-
 hw/core/cpu-common.c |   6 +-
 hw/display/virtio-gpu.c  |   2 +-
 hw/i386/intel_iommu.c|   6 +-
 hw/i386/kvm/xen_evtchn.c |  30 +++
 hw/i386/kvm/xen_gnttab.c |   2 +-
 hw/i386/kvm/xen_overlay.c|   2 +-
 hw/i386/kvm/xen_xenstore.c   |   2 +-
 hw/intc/arm_gicv3_cpuif.c|   2 +-
 hw/intc/s390_flic.c  |  18 ++--
 hw/mips/mips_int.c   |   2 +-
 hw/misc/edu.c|   4 +-
 hw/misc/imx6_src.c   |   2 +-
 hw/misc/imx7_src.c   |   2 +-
 hw/net/xen_nic.c |   8 +-
 hw/ppc/pegasos2.c|   2 +-
 hw/ppc/ppc.c |   6 +-
 hw/ppc/spapr.c   |   2 +-
 hw/ppc/spapr_events.c|   2 +-
 hw/ppc/spapr_rng.c   |   4 +-
 hw/ppc/spapr_softmmu.c   |   4 +-
 hw/remote/mpqemu-link.c  |  22 ++---
 hw/remote/vfio-user-obj.c|   2 +-
 hw/s390x/s390-skeys.c|   2 +-
 hw/scsi/virtio-scsi-dataplane.c  |   6 +-
 migration/block-dirty-bitmap.c   |  14 ++--
 migration/block.c|  38 -
 migration/colo.c |  62 +++---
 migration/dirtyrate.c|  12 +--
 migration/migration.c|  54 ++--
 migration/ram.c  |  16 ++--
 net/tap.c|   2 +-
 replay/replay-internal.c |   2 +-
 semihosting/console.c|   8 +-
 stubs/iothread-lock.c|   6 +-
 system/cpu-throttle.c|   6 +-
 system/cpus.c|  55 +++--
 system/dirtylimit.c  |   4 +-
 system/memory.c  |   2 +-
 system/physmem.c |  14 ++--
 system/runstate.c|   2 +-
 system/watchpoint.c  |   4 +-
 target/arm/arm-powerctl.c|  14 ++--
 target/arm/helper.c  |   6 +-
 target/arm/hvf/hvf.c |   8 +-
 target/arm/kvm.c |   8 +-
 target/arm/ptw.c |   6 +-
 target/arm/tcg/helper-a64.c  |   8 +-
 target/arm/tcg/m_helper.c|   6 +-
 target/arm/tcg/op_helper.c

Re: [PATCH v13 01/35] x86/cpufeatures,opcode,msr: Add the WRMSRNS instruction support

2024-01-02 Thread Borislav Petkov

On Tue, Dec 05, 2023 at 02:49:50AM -0800, Xin Li wrote:

> Subject: Re: [PATCH v13 01/35] x86/cpufeatures,opcode,msr: Add the WRMSRNS 
> instruction support

Or simply "x86/fred: Add ... "

Other than that,

Acked-by: Borislav Petkov (AMD) 

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH v2 1/5] system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()

On Wed, Dec 13, 2023 at 03:37:00PM +0900, Akihiko Odaki wrote:
> On 2023/12/13 0:39, Stefan Hajnoczi wrote:
> > @@ -312,58 +312,58 @@ bool qemu_in_main_thread(void);
> >   } while (0)
> >   /**
> > - * qemu_mutex_lock_iothread: Lock the main loop mutex.
> > + * bql_lock: Lock the Big QEMU Lock (BQL).
> >*
> > - * This function locks the main loop mutex.  The mutex is taken by
> > + * This function locks the Big QEMU Lock (BQL).  The lock is taken by
> >* main() in vl.c and always taken except while waiting on
> > - * external events (such as with select).  The mutex should be taken
> > + * external events (such as with select).  The lock should be taken
> >* by threads other than the main loop thread when calling
> >* qemu_bh_new(), qemu_set_fd_handler() and basically all other
> >* functions documented in this file.
> >*
> > - * NOTE: tools currently are single-threaded and qemu_mutex_lock_iothread
> > + * NOTE: tools currently are single-threaded and bql_lock
> >* is a no-op there.
> >*/
> > -#define qemu_mutex_lock_iothread()  \
> > -qemu_mutex_lock_iothread_impl(__FILE__, __LINE__)
> > -void qemu_mutex_lock_iothread_impl(const char *file, int line);
> > +#define bql_lock()  \
> > +bql_lock_impl(__FILE__, __LINE__)
> 
> This line break is no longer necessary.

Will fix in v3.

Stefan


signature.asc
Description: PGP signature

Re: Possible bug in Xen

2024-01-02 Thread Joe Tretter


Happy New Year! I think it's time I send you an update:
I have spent a considerable amount of time trying to get the latest 
firmware working, but I find a way that does the job.
When I replaced the file with yours, the machine directly restarts on 
application of the firmware.
All ways I have tried either result in the firmware not being considered 
or the machine rebooting on application.
The machine is dual boot with Archlinux and on Archlinux, I have the 
latest firmware running.
I have reached out to the Qubes people for help getting the latest 
firmware working. I'll let you know when I succeed with that.


~Joe

On 12/19/23 11:00, Andrew Cooper wrote:

On 19/12/2023 4:28 pm, Joe Tretter wrote:

On 12/19/23 10:05, Andrew Cooper wrote:

Is it always the same test which fails, or is it random?

Which test fails seems to be random (see attached screenshot).

Looking athttps://github.com/Tarsnap/scrypt  it's only a trivial piece
of userspace crypto.

The fact that running multiple instances makes it fail more easily
points towards some kind of register handling issue, but the fact that
it repros only under Xen, and even with eager-fpu (which isn't the
default on AMD, sadly), is weird.

Looking at the scrypt source, it has alternative routines for the AESNI
and SHANI instruction groups.  However, because it's a Zen1, we don't
have a useful way of filtering visible for PV dom0 userspace.

First of all, can you get the exact CPU model and microcode version.
`head /proc/cpuinfo` will be enough.  But while you're at it, can you
include `xl dmesg` too just in case there's something obvious showing up
there too.


I have attachted text files with the (full) cpuinfo and the dmesg.

microcode    : 0x8001129

That's 0x08001129 when not rendered brokenly, and the up-to-date version
is 0x08001138 (which itself dates from 2019).

If you can, get a firmware update.  Given that it's a Dell, there's a
good chance it's already on LFVS/fwupd.  This is definitely the
preferred option.

If not, and you're willing to experiment in definitely unsupported
territory, then move /lib/firmware/amd-ucode/microcode_amd_fam17h.bin
sideways in dom0, and replace it with the attached SummitRidge-08001138
file (it's important to still be named microcode_amd_fam17h.bin in the
end), then rebuild the initrd and reboot.

You already have ucode=scan on Xen's command line, so after the reboot
you should see some messages about updating microcode too.

Irritatingly, AMD don't put client microcode into linux-firmware, but
there are various collections of blobs found in the wild online.  I've
picked the one which I think is right for your CPU, and packaged it it
appropriately for Xen.


Anyway, I'm not sure if this will fix anything, but life is too short to
be debugging stuff like this on out-of-date firmware/ucode.

~Andrew

Re: [PATCH v2 1/1] virgl: Implement resource_query_layout

Hi

On Thu, Dec 21, 2023 at 3:36 PM Julia Zhang  wrote:
>
> From: Daniel Stone 
>
> A new ioctl to shuttle information between host and guest about the
> actual buffer allocation, which can be used for interop between GL and
> Vulkan when supporting standard window systems.
>

The command hasn't been added to the kernel yet. The function is not
in the virgl library either.

> Signed-off-by: Daniel Stone 
> Co-developed-by: Julia Zhang 
> Signed-off-by: Julia Zhang 
> ---
>  hw/display/virtio-gpu-base.c|  4 +++
>  hw/display/virtio-gpu-virgl.c   | 40 +
>  include/hw/virtio/virtio-gpu-bswap.h|  7 
>  include/standard-headers/linux/virtio_gpu.h | 30 
>  meson.build |  4 +++
>  5 files changed, 85 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
> index 6bcee3882f..09b37f015d 100644
> --- a/hw/display/virtio-gpu-base.c
> +++ b/hw/display/virtio-gpu-base.c
> @@ -240,6 +240,10 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, 
> uint64_t features,
>  features |= (1 << VIRTIO_GPU_F_RESOURCE_UUID);
>  }
>
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +features |= (1 << VIRTIO_GPU_F_RESOURCE_QUERY_LAYOUT);
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT */
> +
>  return features;
>  }
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index c1e7c6d0c6..b331232fee 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -813,6 +813,40 @@ static void virgl_cmd_set_scanout_blob(VirtIOGPU *g,
>
>  #endif /* HAVE_VIRGL_RESOURCE_BLOB */
>
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +static void virgl_cmd_resource_query_layout(VirtIOGPU *g,
> +   struct virtio_gpu_ctrl_command 
> *cmd)
> +{
> +struct virtio_gpu_resource_query_layout qlayout;
> +struct virtio_gpu_resp_resource_layout resp;
> +struct virgl_renderer_resource_layout rlayout;
> +int ret;
> +int i;
> +VIRTIO_GPU_FILL_CMD(qlayout);
> +virtio_gpu_resource_query_layout_bswap();
> +
> +ret = virgl_renderer_resource_query_layout(qlayout.resource_id, ,
> +  qlayout.width, qlayout.height,
> +  qlayout.format, qlayout.bind);
> +if (ret != 0) {
> +qemu_log_mask(LOG_GUEST_ERROR, "%s: resource %d is not 
> externally-allocated\n",
> +  __func__, qlayout.resource_id);
> +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> +return;
> +}
> +
> +memset(, 0, sizeof(resp));
> +resp.hdr.type = VIRTIO_GPU_RESP_OK_RESOURCE_LAYOUT;
> +resp.num_planes = rlayout.num_planes;
> +resp.modifier = rlayout.modifier;
> +for (i = 0; i < resp.num_planes; i++) {
> +resp.planes[i].offset = rlayout.planes[i].offset;
> +resp.planes[i].stride = rlayout.planes[i].stride;
> +}
> +virtio_gpu_ctrl_response(g, cmd, , sizeof(resp));
> +}
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT */
> +
>  void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>struct virtio_gpu_ctrl_command *cmd)
>  {
> @@ -896,6 +930,12 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>  virgl_cmd_set_scanout_blob(g, cmd);
>  break;
>  #endif /* HAVE_VIRGL_RESOURCE_BLOB */
> +
> +#ifdef HAVE_VIRGL_RESOURCE_QUERY_LAYOUT
> +case VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT:
> +   virgl_cmd_resource_query_layout(g, cmd);
> +   break;
> +#endif /* HAVE_VIRGL_RESOURCE_QUERY_LAYOUT*/
>  default:
>  cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
>  break;
> diff --git a/include/hw/virtio/virtio-gpu-bswap.h 
> b/include/hw/virtio/virtio-gpu-bswap.h
> index dd1975e2d4..dea8cf6fd3 100644
> --- a/include/hw/virtio/virtio-gpu-bswap.h
> +++ b/include/hw/virtio/virtio-gpu-bswap.h
> @@ -92,4 +92,11 @@ virtio_gpu_scanout_blob_bswap(struct 
> virtio_gpu_set_scanout_blob *ssb)
>  le32_to_cpus(>offsets[3]);
>  }
>
> +static inline void
> +virtio_gpu_resource_query_layout_bswap(struct 
> virtio_gpu_resource_query_layout *rql)
> +{
> +virtio_gpu_ctrl_hdr_bswap(>hdr);
> +le32_to_cpus(>resource_id);
> +}
> +
>  #endif
> diff --git a/include/standard-headers/linux/virtio_gpu.h 
> b/include/standard-headers/linux/virtio_gpu.h
> index c621389f3d..c9a2f58237 100644
> --- a/include/standard-headers/linux/virtio_gpu.h
> +++ b/include/standard-headers/linux/virtio_gpu.h
> @@ -65,6 +65,11 @@
>   */
>  #define VIRTIO_GPU_F_CONTEXT_INIT4
>
> +/*
> + * VIRTIO_GPU_CMD_RESOURCE_QUERY_LAYOUT
> + */
> +#define VIRTIO_GPU_F_RESOURCE_QUERY_LAYOUT 5
> +
>  enum virtio_gpu_ctrl_type {
> VIRTIO_GPU_UNDEFINED = 0,
>
> @@ -95,6 +100,7 @@ enum virtio_gpu_ctrl_type {
> VIRTIO_GPU_CMD_SUBMIT_3D,
> VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB,
> VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB,
> +

Re: [PATCH v6 10/11] virtio-gpu: Initialize Venus

Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Request Venus when initializing VirGL.
>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Remove the unstable API flags check because virglrenderer is already 1.0.
> - Squash the render server flag support into "Initialize Venus".
>
>  hw/display/virtio-gpu-virgl.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index f35a751824..c523a6717a 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -964,6 +964,10 @@ int virtio_gpu_virgl_init(VirtIOGPU *g)
>  }
>  #endif
>
> +#ifdef VIRGL_RENDERER_VENUS
> +flags |= VIRGL_RENDERER_VENUS | VIRGL_RENDERER_RENDER_SERVER;
> +#endif
> +

I wonder if it's a good idea to initialize venus by default. It
doesn't seem to require vulkan during initialization, but this may
evolve. Make it optional?

>  ret = virgl_renderer_init(g, flags, _gpu_3d_cbs);
>  if (ret != 0) {
>  error_report("virgl could not be initialized: %d", ret);
> --
> 2.25.1
>


-- 
Marc-André Lureau

Re: [PATCH v6 08/11] virtio-gpu: Resource UUID

Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Enable resource UUID feature and implement command resource assign UUID.
> This is done by introducing a hash table to map resource IDs to their
> UUIDs.

I agree with Akihiko, what about putting QemuUUID in struct
virtio_gpu_simple_resource?

(I also doubt about the hash table usefulness, but I don't know
how/why the UUID is used)

>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Set resource uuid as option.
> - Implement optional subsection of vmstate_virtio_gpu_resource_uuid_state
>   or virtio live migration.
> - Use g_int_hash/g_int_equal instead of the default.
> - Move virtio_vgpu_simple_resource initialization in the earlier new patch
>   "virtio-gpu: Introduce virgl_gpu_resource structure"
>
>  hw/display/trace-events|   1 +
>  hw/display/virtio-gpu-base.c   |   4 ++
>  hw/display/virtio-gpu-virgl.c  |   3 +
>  hw/display/virtio-gpu.c| 119 +
>  include/hw/virtio/virtio-gpu.h |   7 ++
>  5 files changed, 134 insertions(+)
>
> diff --git a/hw/display/trace-events b/hw/display/trace-events
> index 2336a0ca15..54d6894c59 100644
> --- a/hw/display/trace-events
> +++ b/hw/display/trace-events
> @@ -41,6 +41,7 @@ virtio_gpu_cmd_res_create_blob(uint32_t res, uint64_t size) 
> "res 0x%x, size %" P
>  virtio_gpu_cmd_res_unref(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_back_attach(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_back_detach(uint32_t res) "res 0x%x"
> +virtio_gpu_cmd_res_assign_uuid(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_toh_2d(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_toh_3d(uint32_t res) "res 0x%x"
>  virtio_gpu_cmd_res_xfer_fromh_3d(uint32_t res) "res 0x%x"
> diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
> index 37af256219..6bcee3882f 100644
> --- a/hw/display/virtio-gpu-base.c
> +++ b/hw/display/virtio-gpu-base.c
> @@ -236,6 +236,10 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, 
> uint64_t features,
>  features |= (1 << VIRTIO_GPU_F_CONTEXT_INIT);
>  }
>
> +if (virtio_gpu_resource_uuid_enabled(g->conf)) {
> +features |= (1 << VIRTIO_GPU_F_RESOURCE_UUID);
> +}
> +
>  return features;
>  }
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 5a3a292f79..be9da6e780 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -777,6 +777,9 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
>  /* TODO add security */
>  virgl_cmd_ctx_detach_resource(g, cmd);
>  break;
> +case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
> +virtio_gpu_resource_assign_uuid(g, cmd);
> +break;
>  case VIRTIO_GPU_CMD_GET_CAPSET_INFO:
>  virgl_cmd_get_capset_info(g, cmd);
>  break;
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 8189c392dc..466debb256 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -958,6 +958,37 @@ virtio_gpu_resource_detach_backing(VirtIOGPU *g,
>  virtio_gpu_cleanup_mapping(g, res);
>  }
>
> +void virtio_gpu_resource_assign_uuid(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> +struct virtio_gpu_simple_resource *res;
> +struct virtio_gpu_resource_assign_uuid assign;
> +struct virtio_gpu_resp_resource_uuid resp;
> +QemuUUID *uuid;
> +
> +VIRTIO_GPU_FILL_CMD(assign);
> +virtio_gpu_bswap_32(, sizeof(assign));
> +trace_virtio_gpu_cmd_res_assign_uuid(assign.resource_id);
> +
> +res = virtio_gpu_find_check_resource(g, assign.resource_id, false, 
> __func__, >error);
> +if (!res) {
> +return;
> +}
> +
> +memset(, 0, sizeof(resp));
> +resp.hdr.type = VIRTIO_GPU_RESP_OK_RESOURCE_UUID;
> +
> +uuid = g_hash_table_lookup(g->resource_uuids, _id);
> +if (!uuid) {
> +uuid = g_new(QemuUUID, 1);
> +qemu_uuid_generate(uuid);
> +g_hash_table_insert(g->resource_uuids, _id, uuid);
> +}
> +
> +memcpy(resp.uuid, uuid, sizeof(QemuUUID));
> +virtio_gpu_ctrl_response(g, cmd, , sizeof(resp));
> +}
> +
>  void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
> struct virtio_gpu_ctrl_command *cmd)
>  {
> @@ -1006,6 +1037,9 @@ void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
>  case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
>  virtio_gpu_resource_detach_backing(g, cmd);
>  break;
> +case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
> +virtio_gpu_resource_assign_uuid(g, cmd);
> +break;
>  default:
>  cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
>  break;
> @@ -1400,6 +1434,57 @@ static int virtio_gpu_blob_load(QEMUFile *f, void 
> *opaque, size_t size,
>  return 0;
>  }
>
> +static int virtio_gpu_resource_uuid_save(QEMUFile *f, void *opaque, size_t 
> size,
> +

Re: [PATCH v6 07/11] virtio-gpu: Handle resource blob commands

Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> From: Antonio Caggiano 
>
> Support BLOB resources creation, mapping and unmapping by calling the
> new stable virglrenderer 0.10 interface. Only enabled when available and
> via the blob config. E.g. -device virtio-vga-gl,blob=true
>
> Signed-off-by: Antonio Caggiano 
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Xenia Ragiadakou 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Use new struct virgl_gpu_resource.
> - Unmap, unref and destroy the resource only after the memory region
>   has been completely removed.
> - In unref check whether the resource is still mapped.
> - In unmap_blob check whether the resource has been already unmapped.
> - Fix coding style
>
>  hw/display/virtio-gpu-virgl.c | 274 +-
>  hw/display/virtio-gpu.c   |   4 +-
>  meson.build   |   4 +
>  3 files changed, 276 insertions(+), 6 deletions(-)

Could you split this patch to introduce the new resource
ref/get/put/destroy functions first, before adding the blob commands.
Please explain the rationale of the changes, why they are safe or
equivalent to current code. I'd also suggest documentation and better
naming for the functions, or inlined code as appropriate, as it's
confusing to understand together what should be used and when.

>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index faab374336..5a3a292f79 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -17,6 +17,7 @@
>  #include "trace.h"
>  #include "hw/virtio/virtio.h"
>  #include "hw/virtio/virtio-gpu.h"
> +#include "hw/virtio/virtio-gpu-bswap.h"
>
>  #include "ui/egl-helpers.h"
>
> @@ -24,8 +25,62 @@
>
>  struct virgl_gpu_resource {
>  struct virtio_gpu_simple_resource res;
> +uint32_t ref;
> +VirtIOGPU *g;
> +
> +#ifdef HAVE_VIRGL_RESOURCE_BLOB
> +/* only blob resource needs this region to be mapped as guest mmio */
> +MemoryRegion *region;
> +#endif
>  };
>
> +static void vres_get_ref(struct virgl_gpu_resource *vres)
> +{
> +uint32_t ref;
> +
> +ref = qatomic_fetch_inc(>ref);
> +g_assert(ref < INT_MAX);
> +}
> +
> +static void virgl_resource_destroy(struct virgl_gpu_resource *vres)
> +{
> +struct virtio_gpu_simple_resource *res;
> +VirtIOGPU *g;
> +
> +if (!vres) {
> +return;
> +}
> +
> +g = vres->g;
> +res = >res;
> +QTAILQ_REMOVE(>reslist, res, next);
> +virtio_gpu_cleanup_mapping(g, res);
> +g_free(vres);
> +}
> +
> +static void virgl_resource_unref(struct virgl_gpu_resource *vres)
> +{
> +struct virtio_gpu_simple_resource *res;
> +
> +if (!vres) {
> +return;
> +}
> +
> +res = >res;
> +virgl_renderer_resource_detach_iov(res->resource_id, NULL, NULL);
> +virgl_renderer_resource_unref(res->resource_id);
> +}
> +
> +static void vres_put_ref(struct virgl_gpu_resource *vres)
> +{
> +g_assert(vres->ref > 0);
> +
> +if (qatomic_fetch_dec(>ref) == 1) {
> +virgl_resource_unref(vres);
> +virgl_resource_destroy(vres);
> +}
> +}
> +
>  static struct virgl_gpu_resource *
>  virgl_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
>  {
> @@ -59,6 +114,8 @@ static void virgl_cmd_create_resource_2d(VirtIOGPU *g,
> c2d.width, c2d.height);
>
>  vres = g_new0(struct virgl_gpu_resource, 1);
> +vres_get_ref(vres);
> +vres->g = g;
>  vres->res.width = c2d.width;
>  vres->res.height = c2d.height;
>  vres->res.format = c2d.format;
> @@ -91,6 +148,8 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
> c3d.width, c3d.height, c3d.depth);
>
>  vres = g_new0(struct virgl_gpu_resource, 1);
> +vres_get_ref(vres);
> +vres->g = g;
>  vres->res.width = c3d.width;
>  vres->res.height = c3d.height;
>  vres->res.format = c3d.format;
> @@ -126,12 +185,21 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
>  return;
>  }
>
> -virgl_renderer_resource_detach_iov(unref.resource_id, NULL, NULL);
> -virgl_renderer_resource_unref(unref.resource_id);
> +#ifdef HAVE_VIRGL_RESOURCE_BLOB
> +if (vres->region) {
> +VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
> +MemoryRegion *mr = vres->region;
> +
> +warn_report("%s: blob resource %d not unmapped",
> +__func__, unref.resource_id);
> +vres->region = NULL;
> +memory_region_set_enabled(mr, false);
> +memory_region_del_subregion(>hostmem, mr);
> +object_unparent(OBJECT(mr));
> +}
> +#endif /* HAVE_VIRGL_RESOURCE_BLOB */
>
> -QTAILQ_REMOVE(>reslist, >res, next);
> -virtio_gpu_cleanup_mapping(g, >res);
> -g_free(vres);
> +vres_put_ref(vres);
>  }
>
>  static void virgl_cmd_context_create(VirtIOGPU *g,
> @@ -470,6 +538,191 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
>  g_free(resp);
>  }

[xen-unstable test] 184243: tolerable FAIL

2024-01-02 Thread osstest service owner

flight 184243 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184243/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-raw 13 guest-startfail pass in 184241

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail in 184241 like 
184239
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail in 184241 never pass
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184241
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 184241
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 184241
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 184241
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184241
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 184241
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 184241
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 184241
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184241
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 184241
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 184241
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  49818cde637b5ec20383e46b71f93b2e7d867686
baseline version:
 xen  49818cde637b5ec20383e46b71f93b2e7d867686

Last test of basis   184243  2024-01-02 01:55:39 Z0 days
Testing same since  (not found) 0 attempts

jobs:

Re: [PATCH v6 05/11] virtio-gpu: Introduce virgl_gpu_resource structure

Hi

On Tue, Dec 19, 2023 at 11:55 AM Huang Rui  wrote:
>
> Introduce a new virgl_gpu_resource data structure and helper functions
> for virgl. It's used to add new member which is specific for virgl in
> following patches of blob memory support.
>
> Signed-off-by: Huang Rui 
> ---
>
> New patch:
> - Introduce new struct virgl_gpu_resource to store virgl specific members.
> - Move resource initialization from path "virtio-gpu: Resource UUID" here.
> - Remove error handling of g_new0, because glib will abort() on OOM.
> - Set iov and iov_cnt in struct virtio_gpu_simple_resource for all types
>   of resources.
>
>  hw/display/virtio-gpu-virgl.c | 84 ++-
>  1 file changed, 64 insertions(+), 20 deletions(-)
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 5bbc8071b2..faab374336 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -22,6 +22,23 @@
>
>  #include 
>
> +struct virgl_gpu_resource {
> +struct virtio_gpu_simple_resource res;
> +};
> +
> +static struct virgl_gpu_resource *
> +virgl_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
> +{
> +struct virtio_gpu_simple_resource *res;
> +
> +res = virtio_gpu_find_resource(g, resource_id);
> +if (!res) {
> +return NULL;
> +}
> +
> +return container_of(res, struct virgl_gpu_resource, res);
> +}
> +
>  #if VIRGL_RENDERER_CALLBACKS_VERSION >= 4
>  static void *
>  virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
> @@ -35,11 +52,19 @@ static void virgl_cmd_create_resource_2d(VirtIOGPU *g,
>  {
>  struct virtio_gpu_resource_create_2d c2d;
>  struct virgl_renderer_resource_create_args args;
> +struct virgl_gpu_resource *vres;
>
>  VIRTIO_GPU_FILL_CMD(c2d);
>  trace_virtio_gpu_cmd_res_create_2d(c2d.resource_id, c2d.format,
> c2d.width, c2d.height);
>

It should check the resource doesn't already exist (similar to 2d code)

> +vres = g_new0(struct virgl_gpu_resource, 1);
> +vres->res.width = c2d.width;
> +vres->res.height = c2d.height;
> +vres->res.format = c2d.format;
> +vres->res.resource_id = c2d.resource_id;
> +QTAILQ_INSERT_HEAD(>reslist, >res, next);
> +
>  args.handle = c2d.resource_id;
>  args.target = 2;
>  args.format = c2d.format;
> @@ -59,11 +84,19 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
>  {
>  struct virtio_gpu_resource_create_3d c3d;
>  struct virgl_renderer_resource_create_args args;
> +struct virgl_gpu_resource *vres;
>
>  VIRTIO_GPU_FILL_CMD(c3d);
>  trace_virtio_gpu_cmd_res_create_3d(c3d.resource_id, c3d.format,
> c3d.width, c3d.height, c3d.depth);
>

same

> +vres = g_new0(struct virgl_gpu_resource, 1);
> +vres->res.width = c3d.width;
> +vres->res.height = c3d.height;
> +vres->res.format = c3d.format;
> +vres->res.resource_id = c3d.resource_id;
> +QTAILQ_INSERT_HEAD(>reslist, >res, next);
> +
>  args.handle = c3d.resource_id;
>  args.target = c3d.target;
>  args.format = c3d.format;
> @@ -82,19 +115,23 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
>   struct virtio_gpu_ctrl_command *cmd)
>  {
>  struct virtio_gpu_resource_unref unref;
> -struct iovec *res_iovs = NULL;
> -int num_iovs = 0;
> +struct virgl_gpu_resource *vres;
>
>  VIRTIO_GPU_FILL_CMD(unref);
>  trace_virtio_gpu_cmd_res_unref(unref.resource_id);
>
> -virgl_renderer_resource_detach_iov(unref.resource_id,
> -   _iovs,
> -   _iovs);
> -if (res_iovs != NULL && num_iovs != 0) {
> -virtio_gpu_cleanup_mapping_iov(g, res_iovs, num_iovs);
> +vres = virgl_gpu_find_resource(g, unref.resource_id);
> +if (!vres) {
> +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> +return;
>  }
> +
> +virgl_renderer_resource_detach_iov(unref.resource_id, NULL, NULL);
>  virgl_renderer_resource_unref(unref.resource_id);
> +
> +QTAILQ_REMOVE(>reslist, >res, next);
> +virtio_gpu_cleanup_mapping(g, >res);
> +g_free(vres);
>  }
>
>  static void virgl_cmd_context_create(VirtIOGPU *g,
> @@ -310,44 +347,51 @@ static void virgl_resource_attach_backing(VirtIOGPU *g,
>struct virtio_gpu_ctrl_command 
> *cmd)
>  {
>  struct virtio_gpu_resource_attach_backing att_rb;
> -struct iovec *res_iovs;
> -uint32_t res_niov;
> +struct virgl_gpu_resource *vres;
>  int ret;
>
>  VIRTIO_GPU_FILL_CMD(att_rb);
>  trace_virtio_gpu_cmd_res_back_attach(att_rb.resource_id);
>
> +vres = virgl_gpu_find_resource(g, att_rb.resource_id);
> +if (!vres) {
> +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> +return;
> +}
> +
>  ret = virtio_gpu_create_mapping_iov(g, att_rb.nr_entries,

Re: [PATCH v6 04/11] virtio-gpu: Don't require udmabuf when blobs and virgl are enabled

On Tue, Dec 19, 2023 at 11:54 AM Huang Rui  wrote:
>
> From: Dmitry Osipenko 
>
> The udmabuf usage is mandatory when virgl is disabled and blobs feature
> enabled in the Qemu machine configuration. If virgl and blobs are enabled,
> then udmabuf requirement is optional. Since udmabuf isn't widely supported
> by a popular Linux distros today, let's relax the udmabuf requirement for
> blobs=on,virgl=on. Now, a full-featured virtio-gpu acceleration is
> available to Qemu users without a need to have udmabuf available in the
> system.
>
> Reviewed-by: Antonio Caggiano 
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Huang Rui 

Reviewed-by: Marc-André Lureau 

> ---
>
> No change in v6.
>
>  hw/display/virtio-gpu.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index 8b2f4c6be3..4c3ec9d0ea 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -1443,6 +1443,7 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error 
> **errp)
>
>  if (virtio_gpu_blob_enabled(g->parent_obj.conf)) {
>  if (!virtio_gpu_rutabaga_enabled(g->parent_obj.conf) &&
> +!virtio_gpu_virgl_enabled(g->parent_obj.conf) &&
>  !virtio_gpu_have_udmabuf()) {
>  error_setg(errp, "need rutabaga or udmabuf for blob resources");
>  return;
> --
> 2.25.1
>


-- 
Marc-André Lureau

Re: [PATCH v6 03/11] virtio-gpu: Support context init feature with virglrenderer

Hi

On Tue, Dec 19, 2023 at 11:54 AM Huang Rui  wrote:
>
> Patch "virtio-gpu: CONTEXT_INIT feature" has added the context_init
> feature flags.
> We would like to enable the feature with virglrenderer, so add to create
> virgl renderer context with flags using context_id when valid.
>
> Originally-by: Antonio Caggiano 
> Signed-off-by: Huang Rui 
> ---
>
> Changes in v6:
> - Handle the case while context_init is disabled.
> - Enable context_init by default.
>
>  hw/display/virtio-gpu-virgl.c | 13 +++--
>  hw/display/virtio-gpu.c   |  4 
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 8bb7a2c21f..5bbc8071b2 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
>  trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
>  cc.debug_name);
>
> -virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
> -  cc.debug_name);
> +#ifdef HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS
> +if (cc.context_init && 
> virtio_gpu_context_init_enabled(g->parent_obj.conf)) {
> +virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
> + cc.context_init,
> + cc.nlen,
> + cc.debug_name);
> +return;
> +}
> +#endif
> +
> +virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen, cc.debug_name);
>  }
>
>  static void virgl_cmd_context_destroy(VirtIOGPU *g,
> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> index b016d3bac8..8b2f4c6be3 100644
> --- a/hw/display/virtio-gpu.c
> +++ b/hw/display/virtio-gpu.c
> @@ -1619,6 +1619,10 @@ static Property virtio_gpu_properties[] = {
>  DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
>  VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
>  DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
> +#ifdef HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS
> +DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
> +VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, true),
> +#endif

Does it make sense to make this configurable? If the context to be
created asked for a capset id but the one created doesn't respect it,
what's the outcome?

It looks like it should instead set cmd->error.

-- 
Marc-André Lureau

Re: [PATCH v6 01/11] linux-headers: Update to kernel headers to add venus capset

Hi

On Thu, Dec 21, 2023 at 10:55 AM Akihiko Odaki  wrote:
>
> On 2023/12/19 23:14, Peter Maydell wrote:
> > On Tue, 19 Dec 2023 at 13:49, Huang Rui  wrote:
> >>
> >> On Tue, Dec 19, 2023 at 08:20:22PM +0800, Akihiko Odaki wrote:
> >>> On 2023/12/19 16:53, Huang Rui wrote:
>  Sync up kernel headers to update venus macro till they are merged into
>  mainline.
> >>>
> >>> Thanks for sorting things out with the kernel and spec.
> >>>
> 
>  Signed-off-by: Huang Rui 
>  ---
> 
>  Changes in v6:
>  - Venus capset is applied in kernel, so update it in qemu for future use.
> 
>  https://lore.kernel.org/lkml/b79dcf75-c9e8-490e-644f-3b97d95f7...@collabora.com/
>  https://cgit.freedesktop.org/drm-misc/commit/?id=216d86b9a430f3280e5b631c51e6fd1a7774cfa0
> >>> Please include the link to the upstream commit in the commit message.
> >>
> >> So far, it's in drm maintainers' branch not in kernel mainline yet. Do I
> >> need to wait it to be merged into kernel mainline?
> >
> > For an RFC patchset, no. For patches to be merged into QEMU
> > the headers change must be in the kernel mainline, and the
> > QEMU commit that updates our copy of the headers must be a
> > full-sync done with scripts/update-linux-headers.sh, not a
> > manual edit.
>
> Apparently the kernel change is unlikely to be merged to mainline before
> QEMU 9.0 so we need to come up with some idea to deal with this.
>
> The release of Linux 6.7 is near and the development of 6.8 will start
> soon. So it was nice if the change made into 6.8, but unfortunately it
> missed the *probably last* drm-misc tree pull request for 6.8:
> https://lore.kernel.org/all/aqpn5miejmkks7pbcfex7b6u63uwsruywxsnr3x5ljs45qatin@nbkkej2elk46/
>
> It will still get into linux-next so we may retrieve headers from
> linux-next. Or we may add the definition to
> hw/display/virtio-gpu-virgl.c; the duplicate definition warning will
> tell when the definition becomes no longer necessary. I'm fine with
> either option.

The second option seems better to me, as it can avoid pulling unwanted changes.

thanks

-- 
Marc-André Lureau

[PATCH v5 13/13] xen/arm: add cache coloring support for Xen

This commit adds the cache coloring support for Xen own physical space.

It extends the implementation of setup_pagetables() to make use of Xen
cache coloring configuration. Page tables construction is essentially the
same except for the fact that PTEs point to a new temporary mapped,
physically colored space.

The temporary mapping is also used to relocate Xen to the new physical
space starting at the address taken from the old get_xen_paddr() function
which is brought back for the occasion.
The temporary mapping is finally converted to a mapping of the "old"
(meaning the original physical space) Xen code, so that the boot CPU can
actually address the variables and functions used by secondary CPUs until
they enable the MMU. This happens when the boot CPU needs to bring up other
CPUs (psci.c and smpboot.c) and when the TTBR value is passed to them
(prepare_secondary_mm()).

Finally, since the alternative framework needs to remap the Xen text and
inittext sections, this operation must be done in a coloring-aware way.
The function xen_remap_colored() is introduced for that.

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- FIXME: consider_modules copy pasted since it got moved
v4:
- removed set_value_for_secondary() because it was wrongly cleaning cache
- relocate_xen() now calls switch_ttbr_id()
---
 xen/arch/arm/alternative.c  |   9 +-
 xen/arch/arm/arm64/mmu/head.S   |  48 +++
 xen/arch/arm/arm64/mmu/mm.c |  26 +++-
 xen/arch/arm/include/asm/llc-coloring.h |  16 +++
 xen/arch/arm/include/asm/mm.h   |   7 +-
 xen/arch/arm/llc-coloring.c |  44 +++
 xen/arch/arm/mmu/setup.c|  82 +++-
 xen/arch/arm/mmu/smpboot.c  |  11 +-
 xen/arch/arm/psci.c |   9 +-
 xen/arch/arm/setup.c| 165 +++-
 xen/arch/arm/smpboot.c  |   9 +-
 11 files changed, 406 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 016e66978b..54cbc2afad 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -209,8 +210,12 @@ void __init apply_alternatives_all(void)
  * The text and inittext section are read-only. So re-map Xen to
  * be able to patch the code.
  */
-xenmap = __vmap(_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
-VMAP_DEFAULT);
+if ( llc_coloring_enabled )
+xenmap = xen_remap_colored(xen_mfn, xen_size);
+else
+xenmap = __vmap(_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
+VMAP_DEFAULT);
+
 /* Re-mapping Xen is not expected to fail during boot. */
 BUG_ON(!xenmap);
 
diff --git a/xen/arch/arm/arm64/mmu/head.S b/xen/arch/arm/arm64/mmu/head.S
index 10774f30e4..6f0cc72897 100644
--- a/xen/arch/arm/arm64/mmu/head.S
+++ b/xen/arch/arm/arm64/mmu/head.S
@@ -419,6 +419,54 @@ fail:   PRINT("- Boot failed -\r\n")
 b 1b
 ENDPROC(fail)
 
+/* Copy Xen to new location and switch TTBR
+ * x0ttbr
+ * x1source address
+ * x2destination address
+ * x3length
+ *
+ * Source and destination must be word aligned, length is rounded up
+ * to a 16 byte boundary.
+ *
+ * MUST BE VERY CAREFUL when saving things to RAM over the copy */
+ENTRY(relocate_xen)
+/* Copy 16 bytes at a time using:
+ *   x9: counter
+ *   x10: data
+ *   x11: data
+ *   x12: source
+ *   x13: destination
+ */
+mov x9, x3
+mov x12, x1
+mov x13, x2
+
+1:  ldp x10, x11, [x12], #16
+stp x10, x11, [x13], #16
+
+subsx9, x9, #16
+bgt 1b
+
+/* Flush destination from dcache using:
+ * x9: counter
+ * x10: step
+ * x11: vaddr
+ */
+dsb   sy/* So the CPU issues all writes to the range */
+
+mov   x9, x3
+ldr   x10, =dcache_line_bytes /* x10 := step */
+ldr   x10, [x10]
+mov   x11, x2
+
+1:  dccvac, x11
+
+add   x11, x11, x10
+subs  x9, x9, x10
+bgt   1b
+
+b switch_ttbr_id
+
 /*
  * Switch TTBR
  *
diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
index d2651c9486..5a26d64ab7 100644
--- a/xen/arch/arm/arm64/mmu/mm.c
+++ b/xen/arch/arm/arm64/mmu/mm.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -125,27 +126,44 @@ void update_identity_mapping(bool enable)
 }
 
 extern void switch_ttbr_id(uint64_t ttbr);
+extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
 
 typedef void (switch_ttbr_fn)(uint64_t ttbr);
+typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t 
len);
 
 void __init switch_ttbr(uint64_t ttbr)
 {
-vaddr_t id_addr =

[PATCH v5 12/13] xen/arm: add Xen cache colors command line parameter

From: Luca Miccio 

This commit adds a new command line parameter to configure Xen cache colors.
These colors can be dumped with the cache coloring info debug-key.

By default, Xen uses the first color.
Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness.

More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partitions a cache memory that is already private, i.e.
underutilize it. The default amount of Xen colors is thus set to one.

Signed-off-by: Luca Miccio 
Signed-off-by: Marco Solieri 
Signed-off-by: Carlo Nonato 
---
 docs/misc/xen-command-line.pandoc | 10 ++
 xen/arch/arm/llc-coloring.c   | 29 +
 2 files changed, 39 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 163fe7bcb1..f983f22796 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2877,6 +2877,16 @@ mode.
 **WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
 The latter takes precedence if both are set.**
 
+### xen-llc-colors (arm64)
+> `= List of [  | - ]`
+
+> Default: `0: the lowermost color`
+
+Specify Xen LLC color configuration. This options is available only when
+`CONFIG_LLC_COLORING` is enabled.
+Two colors are most likely needed on platforms where private caches are
+physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
+
 ### xenheap_megabytes (arm32)
 > `= `
 
diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
index 526129cc43..99ea69ad39 100644
--- a/xen/arch/arm/llc-coloring.c
+++ b/xen/arch/arm/llc-coloring.c
@@ -18,6 +18,9 @@
 #include 
 #include 
 
+#define XEN_DEFAULT_COLOR   0
+#define XEN_DEFAULT_NUM_COLORS  1
+
 bool __ro_after_init llc_coloring_enabled;
 boolean_param("llc-coloring", llc_coloring_enabled);
 
@@ -30,6 +33,9 @@ static unsigned int __ro_after_init nr_colors = 
CONFIG_NR_LLC_COLORS;
 static unsigned int __ro_after_init dom0_colors[CONFIG_NR_LLC_COLORS];
 static unsigned int __ro_after_init dom0_num_colors;
 
+static unsigned int __ro_after_init xen_colors[CONFIG_NR_LLC_COLORS];
+static unsigned int __ro_after_init xen_num_colors;
+
 #define mfn_color_mask  (nr_colors - 1)
 #define mfn_to_color(mfn)   (mfn_x(mfn) & mfn_color_mask)
 
@@ -87,6 +93,12 @@ static int __init parse_dom0_colors(const char *s)
 }
 custom_param("dom0-llc-colors", parse_dom0_colors);
 
+static int __init parse_xen_colors(const char *s)
+{
+return parse_color_config(s, xen_colors, _num_colors);
+}
+custom_param("xen-llc-colors", parse_xen_colors);
+
 /* Return the LLC way size by probing the hardware */
 static unsigned int __init get_llc_way_size(void)
 {
@@ -161,6 +173,8 @@ static void dump_coloring_info(unsigned char key)
 printk("'%c' pressed -> dumping LLC coloring general info\n", key);
 printk("LLC way size: %u KiB\n", llc_way_size >> 10);
 printk("Number of LLC colors supported: %u\n", nr_colors);
+printk("Xen has %u LLC colors: ", xen_num_colors);
+print_colors(xen_colors, xen_num_colors);
 }
 
 static bool check_colors(unsigned int *colors, unsigned int num_colors)
@@ -217,6 +231,21 @@ bool __init llc_coloring_init(void)
 return false;
 }
 
+if ( !xen_num_colors )
+{
+printk(XENLOG_WARNING
+   "Xen LLC color config not found. Using default color: %u\n",
+   XEN_DEFAULT_COLOR);
+xen_colors[0] = XEN_DEFAULT_COLOR;
+xen_num_colors = XEN_DEFAULT_NUM_COLORS;
+}
+
+if ( !check_colors(xen_colors, xen_num_colors) )
+{
+printk(XENLOG_ERR "Bad LLC color config for Xen\n");
+return false;
+}
+
 register_keyhandler('K', dump_coloring_info, "dump LLC coloring info", 1);
 
 return true;
-- 
2.34.1

[PATCH v5 01/13] xen/common: add cache coloring common code

This commit adds the Last Level Cache (LLC) coloring common header, Kconfig
options and functions. Since this is an arch specific feature, actual
implementation is postponed to later patches and Kconfig options are placed
under xen/arch.

LLC colors are a property of the domain, so the domain struct has to be
extended.

Based on original work from: Luca Miccio 

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- used - instead of _ for filenames
- removed domain_create_llc_colored()
- removed stub functions
- coloring domain fields are now #ifdef protected
v4:
- Kconfig options moved to xen/arch
- removed range for CONFIG_NR_LLC_COLORS
- added "llc_coloring_enabled" global to later implement the boot-time
  switch
- added domain_create_llc_colored() to be able to pass colors
- added is_domain_llc_colored() macro
---
 xen/arch/Kconfig   | 16 
 xen/common/Kconfig |  3 +++
 xen/common/domain.c|  4 +++
 xen/common/keyhandler.c|  4 +++
 xen/include/xen/llc-coloring.h | 46 ++
 xen/include/xen/sched.h|  5 
 6 files changed, 78 insertions(+)
 create mode 100644 xen/include/xen/llc-coloring.h

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index 67ba38f32f..aad7e9da38 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -31,3 +31,19 @@ config NR_NUMA_NODES
  associated with multiple-nodes management. It is the upper bound of
  the number of NUMA nodes that the scheduler, memory allocation and
  other NUMA-aware components can handle.
+
+config LLC_COLORING
+   bool "Last Level Cache (LLC) coloring" if EXPERT
+   depends on HAS_LLC_COLORING
+
+config NR_LLC_COLORS
+   int "Maximum number of LLC colors"
+   default 128
+   depends on LLC_COLORING
+   help
+ Controls the build-time size of various arrays associated with LLC
+ coloring. Refer to cache coloring documentation for how to compute the
+ number of colors supported by the platform. This is only an upper
+ bound. The runtime value is autocomputed or manually set via cmdline.
+ The default value corresponds to an 8 MiB 16-ways LLC, which should be
+ more than what needed in the general case.
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 310ad4229c..e383f09d97 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -71,6 +71,9 @@ config HAS_IOPORTS
 config HAS_KEXEC
bool
 
+config HAS_LLC_COLORING
+   bool
+
 config HAS_PMAP
bool
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index f6f5574996..491585b0bb 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1144,6 +1145,9 @@ static void cf_check complete_domain_destroy(struct 
rcu_head *head)
 struct vcpu *v;
 int i;
 
+if ( is_domain_llc_colored(d) )
+domain_llc_coloring_free(d);
+
 /*
  * Flush all state for the vCPU previously having run on the current CPU.
  * This is in particular relevant for x86 HVM ones on VMX, so that this
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 99a2d72a02..27c2d324d8 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -307,6 +308,9 @@ static void cf_check dump_domains(unsigned char key)
 
 arch_dump_domain_info(d);
 
+if ( is_domain_llc_colored(d) )
+domain_dump_llc_colors(d);
+
 rangeset_domain_printk(d);
 
 dump_pageframe_info(d);
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
new file mode 100644
index 00..cedd97d4b5
--- /dev/null
+++ b/xen/include/xen/llc-coloring.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Last Level Cache (LLC) coloring common header
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ *
+ * Authors:
+ *Carlo Nonato 
+ */
+#ifndef __COLORING_H__
+#define __COLORING_H__
+
+#include 
+#include 
+
+#ifdef CONFIG_HAS_LLC_COLORING
+
+#include 
+
+#ifdef CONFIG_LLC_COLORING
+extern bool llc_coloring_enabled;
+#define llc_coloring_enabled (llc_coloring_enabled)
+#endif
+
+#endif
+
+#ifndef llc_coloring_enabled
+#define llc_coloring_enabled (false)
+#endif
+
+#define is_domain_llc_colored(d) (llc_coloring_enabled)
+
+void domain_llc_coloring_free(struct domain *d);
+void domain_dump_llc_colors(struct domain *d);
+
+#endif /* __COLORING_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 9da91e0e62..dae7fab673 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -626,6 +626,11 @@ struct domain
 
 /* Holding CDF_* constant. Internal flags for domain creation. */

[PATCH v5 10/13] xen/arm: use domain memory to allocate p2m page tables

Cache colored domains can benefit from having p2m page tables allocated
with the same coloring schema so that isolation can be achieved also for
those kind of memory accesses.
In order to do that, the domain struct is passed to the allocator and the
MEMF_no_owner flag is used.

This will be useful also when NUMA will be supported on Arm.

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- new patch
---
 xen/arch/arm/mmu/p2m.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 41fcca011c..d02a478cb8 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -32,7 +32,7 @@ static struct page_info *p2m_alloc_page(struct domain *d)
  */
 if ( is_hardware_domain(d) )
 {
-pg = alloc_domheap_page(NULL, 0);
+pg = alloc_domheap_page(d, MEMF_no_owner);
 if ( pg == NULL )
 printk(XENLOG_G_ERR "Failed to allocate P2M pages for hwdom.\n");
 }
@@ -81,7 +81,7 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, 
bool *preempted)
 if ( d->arch.paging.p2m_total_pages < pages )
 {
 /* Need to allocate more memory from domheap */
-pg = alloc_domheap_page(NULL, 0);
+pg = alloc_domheap_page(d, MEMF_no_owner);
 if ( pg == NULL )
 {
 printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
-- 
2.34.1

[PATCH v5 07/13] xen/page_alloc: introduce init_free_page_fields() helper

Introduce a new helper to initialize fields that have different uses for
free pages.

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- new patch
---
 xen/common/page_alloc.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index d874525916..9ee3981bb5 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -928,6 +928,13 @@ static struct page_info *get_free_buddy(unsigned int 
zone_lo,
 }
 }
 
+/* Initialise fields which have other uses for free pages. */
+static void init_free_page_fields(struct page_info *pg)
+{
+pg->u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
+page_set_owner(pg, NULL);
+}
+
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
 unsigned int zone_lo, unsigned int zone_hi,
@@ -1036,10 +1043,7 @@ static struct page_info *alloc_heap_pages(
 accumulate_tlbflush(_tlbflush, [i],
 _timestamp);
 
-/* Initialise fields which have other uses for free pages. */
-pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
-page_set_owner([i], NULL);
-
+init_free_page_fields([i]);
 }
 
 spin_unlock(_lock);
@@ -2802,9 +2806,7 @@ static bool prepare_staticmem_pages(struct page_info *pg, 
unsigned long nr_mfns,
  * to PGC_state_inuse.
  */
 pg[i].count_info = PGC_static | PGC_state_inuse;
-/* Initialise fields which have other uses for free pages. */
-pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
-page_set_owner([i], NULL);
+init_free_page_fields([i]);
 }
 
 spin_unlock(_lock);
-- 
2.34.1

[PATCH v5 03/13] xen/arm: add Dom0 cache coloring support

This commit allows the user to set the cache coloring configuration for
Dom0 via a command line parameter.
Since cache coloring and static memory are incompatible, direct mapping
Dom0 isn't possible when coloring is enabled.

A common configuration syntax for cache colors is also introduced.

Based on original work from: Luca Miccio 

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- Carlo Nonato as the new author
- moved dom0 colors parsing (parse_colors()) in this patch
- added dom0_set_llc_colors() to set dom0 colors after creation
- moved color allocation and checking in this patch
- error handling when allocating color arrays
- FIXME: copy pasted allocate_memory() cause it got moved
v4:
- dom0 colors are dynamically allocated as for any other domain
  (colors are duplicated in dom0_colors and in the new array, but logic
  is simpler)
---
 docs/misc/arm/cache-coloring.rst|  29 ++
 docs/misc/xen-command-line.pandoc   |   9 ++
 xen/arch/arm/domain_build.c |  60 ++-
 xen/arch/arm/include/asm/llc-coloring.h |   1 +
 xen/arch/arm/llc-coloring.c | 128 
 5 files changed, 224 insertions(+), 3 deletions(-)

diff --git a/docs/misc/arm/cache-coloring.rst b/docs/misc/arm/cache-coloring.rst
index eabf8f5d1b..acf82c3df8 100644
--- a/docs/misc/arm/cache-coloring.rst
+++ b/docs/misc/arm/cache-coloring.rst
@@ -84,6 +84,35 @@ More specific documentation is available at 
`docs/misc/xen-command-line.pandoc`.
 +--+---+
 | ``llc-way-size`` | set the LLC way size  |
 +--+---+
+| ``dom0-llc-colors``  | Dom0 color configuration  |
++--+---+
+
+Colors selection format
+***
+
+Regardless of the memory pool that has to be colored (Xen, Dom0/DomUs),
+the color selection can be expressed using the same syntax. In particular a
+comma-separated list of colors or ranges of colors is used.
+Ranges are hyphen-separated intervals (such as `0-4`) and are inclusive on both
+sides.
+
+Note that:
+
+- no spaces are allowed between values.
+- no overlapping ranges or duplicated colors are allowed.
+- values must be written in ascending order.
+
+Examples:
+
++---+-+
+| **Configuration** | **Actual selection**|
++---+-+
+| 1-2,5-8   | [1, 2, 5, 6, 7, 8]  |
++---+-+
+| 4-8,10,11,12  | [4, 5, 6, 7, 8, 10, 11, 12] |
++---+-+
+| 0 | [0] |
++---+-+
 
 Known issues and limitations
 
diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index 22d2d5b6cf..51f6adf035 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
 
 Specify a list of IO ports to be excluded from dom0 access.
 
+### dom0-llc-colors (arm64)
+> `= List of [  | - ]`
+
+> Default: `All available LLC colors`
+
+Specify dom0 LLC color configuration. This options is available only when
+`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
+colors are chosen and the user is warned on Xen serial console.
+
 ### dom0_max_vcpus
 
 Either:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 6945b9755d..482c059bfa 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -414,7 +415,7 @@ static void __init allocate_memory_11(struct domain *d,
 }
 }
 
-#ifdef CONFIG_DOM0LESS_BOOT
+#if defined(CONFIG_DOM0LESS_BOOT) || defined(CONFIG_LLC_COLORING)
 bool __init allocate_bank_memory(struct domain *d, struct kernel_info *kinfo,
  gfn_t sgfn, paddr_t tot_size)
 {
@@ -478,6 +479,49 @@ bool __init allocate_bank_memory(struct domain *d, struct 
kernel_info *kinfo,
 }
 #endif
 
+static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
+{
+unsigned int i;
+paddr_t bank_size;
+
+printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
+   /* Don't want format this as PRIpaddr (16 digit hex) */
+   (unsigned long)(kinfo->unassigned_mem >> 20), d);
+
+kinfo->mem.nr_banks = 0;
+bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
+if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
+   bank_size) )
+goto fail;
+
+bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
+if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
+   bank_size) )
+

[PATCH v5 04/13] xen: extend domctl interface for cache coloring

This commit updates the domctl interface to allow the user to set cache
coloring configurations from the toolstack.
It also implements the functionality for arm64.

Based on original work from: Luca Miccio 

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- added a new hypercall to set colors
- uint for the guest handle
v4:
- updated XEN_DOMCTL_INTERFACE_VERSION
---
 xen/arch/arm/llc-coloring.c| 17 +
 xen/common/domctl.c| 11 +++
 xen/include/public/domctl.h| 10 +-
 xen/include/xen/llc-coloring.h |  3 +++
 4 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
index 5ce58aba70..a08614ec36 100644
--- a/xen/arch/arm/llc-coloring.c
+++ b/xen/arch/arm/llc-coloring.c
@@ -9,6 +9,7 @@
  *Carlo Nonato 
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -278,6 +279,22 @@ int dom0_set_llc_colors(struct domain *d)
 return domain_check_colors(d);
 }
 
+int domain_set_llc_colors_domctl(struct domain *d,
+ const struct xen_domctl_set_llc_colors 
*config)
+{
+if ( d->num_llc_colors )
+return -EEXIST;
+
+if ( domain_alloc_colors(d, config->num_llc_colors) )
+return -ENOMEM;
+
+if ( copy_from_guest(d->llc_colors, config->llc_colors,
+ config->num_llc_colors) )
+return -EFAULT;
+
+return domain_check_colors(d);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f5a71ee5f7..b6867d0602 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -858,6 +859,16 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
u_domctl)
 __HYPERVISOR_domctl, "h", u_domctl);
 break;
 
+case XEN_DOMCTL_set_llc_colors:
+if ( !llc_coloring_enabled )
+break;
+
+ret = domain_set_llc_colors_domctl(d, >u.set_llc_colors);
+if ( ret == -EEXIST )
+printk(XENLOG_ERR
+   "Can't set LLC colors on an already created domain\n");
+break;
+
 default:
 ret = arch_do_domctl(op, d, u_domctl);
 break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index a33f9ec32b..2b12069294 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -21,7 +21,7 @@
 #include "hvm/save.h"
 #include "memory.h"
 
-#define XEN_DOMCTL_INTERFACE_VERSION 0x0016
+#define XEN_DOMCTL_INTERFACE_VERSION 0x0017
 
 /*
  * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
@@ -1190,6 +1190,12 @@ struct xen_domctl_vmtrace_op {
 typedef struct xen_domctl_vmtrace_op xen_domctl_vmtrace_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_vmtrace_op_t);
 
+struct xen_domctl_set_llc_colors {
+/* IN LLC coloring parameters */
+unsigned int num_llc_colors;
+XEN_GUEST_HANDLE_64(uint) llc_colors;
+};
+
 struct xen_domctl {
 uint32_t cmd;
 #define XEN_DOMCTL_createdomain   1
@@ -1277,6 +1283,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vmtrace_op84
 #define XEN_DOMCTL_get_paging_mempool_size   85
 #define XEN_DOMCTL_set_paging_mempool_size   86
+#define XEN_DOMCTL_set_llc_colors87
 #define XEN_DOMCTL_gdbsx_guestmemio1000
 #define XEN_DOMCTL_gdbsx_pausevcpu 1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu   1002
@@ -1339,6 +1346,7 @@ struct xen_domctl {
 struct xen_domctl_vuart_op  vuart_op;
 struct xen_domctl_vmtrace_opvmtrace_op;
 struct xen_domctl_paging_mempoolpaging_mempool;
+struct xen_domctl_set_llc_colorsset_llc_colors;
 uint8_t pad[128];
 } u;
 };
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index cedd97d4b5..fa2edc8ad8 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -33,6 +33,9 @@ extern bool llc_coloring_enabled;
 void domain_llc_coloring_free(struct domain *d);
 void domain_dump_llc_colors(struct domain *d);
 
+int domain_set_llc_colors_domctl(struct domain *d,
+ const struct xen_domctl_set_llc_colors 
*config);
+
 #endif /* __COLORING_H__ */
 
 /*
-- 
2.34.1

[PATCH v5 09/13] xen: add cache coloring allocator for domains

This commit adds a new memory page allocator that implements the cache
coloring mechanism. The allocation algorithm enforces equal frequency
distribution of cache partitions, following the coloring configuration of a
domain. This allows an even utilization of cache sets for every domain.

Pages are stored in a color-indexed array of lists. Those lists are filled
by a simple init function which computes the color of each page.
When a domain requests a page, the allocator extract the page from the list
with the maximum number of free pages between those that the domain can
access, given its coloring configuration.

The allocator can only handle requests of order-0 pages. This allows for
easier implementation and since cache coloring targets only embedded systems,
it's assumed not to be a major problem.

The buddy allocator must coexist with the colored one because the Xen heap
isn't colored. For this reason a new Kconfig option and a command line
parameter are added to let the user set the amount of memory reserved for
the buddy allocator. Even when cache coloring is enabled, this memory
isn't managed by the colored allocator.

Colored heap information is dumped in the dump_heap() debug-key function.

Based on original work from: Luca Miccio 

Signed-off-by: Marco Solieri 
Signed-off-by: Carlo Nonato 
---
v5:
- Carlo Nonato as the new author
- the colored allocator balances color usage for each domain and it searches
  linearly only in the number of colors (FIXME removed)
- addedd scrub functionality
- removed stub functions (still requires some macro definition)
- addr_to_color turned to mfn_to_color for easier operations
- removed BUG_ON in init_color_heap_pages() in favor of panic()
- only non empty page lists are logged in dump_color_heap()
v4:
- moved colored allocator code after buddy allocator because it now has
  some dependencies on buddy functions
- buddy_alloc_size is now used only by the colored allocator
- fixed a bug that allowed the buddy to merge pages when they were colored
- free_color_heap_page() now calls mark_page_free()
- free_color_heap_page() uses of the frametable array for faster searches
- added FIXME comment for the linear search in free_color_heap_page()
- removed alloc_color_domheap_page() to let the colored allocator exploit
  some more buddy allocator code
- alloc_color_heap_page() now allocs min address pages first
- reduced the mess in end_boot_allocator(): use the first loop for
  init_color_heap_pages()
- fixed page_list_add_prev() (list.h) since it was doing the opposite of
  what it was supposed to do
- fixed page_list_add_prev() (non list.h) to check also for next existence
- removed unused page_list_add_next()
- moved p2m code in another patch
---
 docs/misc/arm/cache-coloring.rst  |  37 ++
 docs/misc/xen-command-line.pandoc |  14 +++
 xen/arch/Kconfig  |  12 ++
 xen/arch/arm/include/asm/mm.h |   5 +
 xen/arch/arm/llc-coloring.c   |  13 ++
 xen/common/page_alloc.c   | 193 +-
 xen/include/xen/llc-coloring.h|   4 +
 7 files changed, 273 insertions(+), 5 deletions(-)

diff --git a/docs/misc/arm/cache-coloring.rst b/docs/misc/arm/cache-coloring.rst
index ae1dd8f4af..c5582340da 100644
--- a/docs/misc/arm/cache-coloring.rst
+++ b/docs/misc/arm/cache-coloring.rst
@@ -9,6 +9,9 @@ To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
 If needed, change the maximum number of colors with
 ``CONFIG_NR_LLC_COLORS=``.
 
+If needed, change the buddy allocator reserved size with
+``CONFIG_BUDDY_ALLOCATOR_SIZE=``.
+
 Compile Xen and the toolstack and then configure it via
 `Command line parameters`_. For DomUs follow `DomUs configuration`_.
 
@@ -86,6 +89,8 @@ More specific documentation is available at 
`docs/misc/xen-command-line.pandoc`.
 +--+---+
 | ``dom0-llc-colors``  | Dom0 color configuration  |
 +--+---+
+| ``buddy-alloc-size`` | Buddy allocator reserved size |
++--+---+
 
 Colors selection format
 ***
@@ -160,6 +165,17 @@ DomUs colors can be set via Device Tree, also for Dom0less 
configurations
 **Note:** If no color configuration is provided for a domain, the default one,
 which corresponds to all available colors, is used instead.
 
+Colored allocator and buddy allocator
+*
+
+The colored allocator distributes pages based on color configurations of
+domains so that each domains only gets pages of its own colors.
+The colored allocator is meant as an alternative to the buddy allocator because
+its allocation policy is by definition incompatible with the generic one. Since
+the Xen heap is not colored yet, we need to support the coexistence of the two
+allocators and some memory must be left for the buddy one. Buddy memory
+reservation is configured via Kconfig or via command-line.
+
 Known issues

[PATCH v5 08/13] xen/page_alloc: introduce preserved page flags macro

PGC_static and PGC_extra are flags that needs to be preserved when assigning
a page. Define a new macro that groups those flags and use it instead of
or'ing every time.

The new macro is used also in free_heap_pages() allowing future commits to
extended it with other flags that must stop merging, as it now works for
PGC_static. PGC_extra is no harm here since it's only ever being set on
allocated pages.

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- new patch
---
 xen/common/page_alloc.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 9ee3981bb5..3bf3120287 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -158,6 +158,8 @@
 #define PGC_static 0
 #endif
 
+#define preserved_flags (PGC_extra | PGC_static)
+
 #ifndef PGT_TYPE_INFO_INITIALIZER
 #define PGT_TYPE_INFO_INITIALIZER 0
 #endif
@@ -1504,7 +1506,7 @@ static void free_heap_pages(
 /* Merge with predecessor block? */
 if ( !mfn_valid(page_to_mfn(predecessor)) ||
  !page_state_is(predecessor, free) ||
- (predecessor->count_info & PGC_static) ||
+ (predecessor->count_info & preserved_flags) ||
  (PFN_ORDER(predecessor) != order) ||
  (page_to_nid(predecessor) != node) )
 break;
@@ -1528,7 +1530,7 @@ static void free_heap_pages(
 /* Merge with successor block? */
 if ( !mfn_valid(page_to_mfn(successor)) ||
  !page_state_is(successor, free) ||
- (successor->count_info & PGC_static) ||
+ (successor->count_info & preserved_flags) ||
  (PFN_ORDER(successor) != order) ||
  (page_to_nid(successor) != node) )
 break;
@@ -2365,7 +2367,7 @@ int assign_pages(
 
 for ( i = 0; i < nr; i++ )
 {
-ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+ASSERT(!(pg[i].count_info & ~preserved_flags));
 if ( pg[i].count_info & PGC_extra )
 extra_pages++;
 }
@@ -2425,7 +2427,7 @@ int assign_pages(
 page_set_owner([i], d);
 smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
 pg[i].count_info =
-(pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+(pg[i].count_info & preserved_flags) | PGC_allocated | 1;
 
 page_list_add_tail([i], page_to_list(d, [i]));
 }
-- 
2.34.1

[PATCH v5 11/13] Revert "xen/arm: Remove unused BOOT_RELOC_VIRT_START"

This reverts commit 0c18fb76323bfb13615b6f13c98767face2d8097 (not clean).

This is not a clean revert since the rework of the memory layout, but it is
sufficiently similar to a clean one.
The only difference is that the BOOT_RELOC_VIRT_START must match the new
layout.

Cache coloring support for Xen needs to relocate Xen code and data in a new
colored physical space. The BOOT_RELOC_VIRT_START will be used as the virtual
base address for a temporary mapping to this new space.

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
 xen/arch/arm/include/asm/mmu/layout.h | 2 ++
 xen/arch/arm/mmu/setup.c  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/xen/arch/arm/include/asm/mmu/layout.h 
b/xen/arch/arm/include/asm/mmu/layout.h
index eac7eef885..30031f74d9 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -74,6 +74,8 @@
 #define BOOT_FDT_VIRT_START (FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE)
 #define BOOT_FDT_VIRT_SIZE  _AT(vaddr_t, MB(4))
 
+#define BOOT_RELOC_VIRT_START   (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
+
 #ifdef CONFIG_LIVEPATCH
 #define LIVEPATCH_VMAP_START(BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
 #define LIVEPATCH_VMAP_SIZE_AT(vaddr_t, MB(2))
diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c
index d5264e51bc..37b6d230ad 100644
--- a/xen/arch/arm/mmu/setup.c
+++ b/xen/arch/arm/mmu/setup.c
@@ -69,6 +69,7 @@ static void __init __maybe_unused build_assertions(void)
 /* 2MB aligned regions */
 BUILD_BUG_ON(XEN_VIRT_START & ~SECOND_MASK);
 BUILD_BUG_ON(FIXMAP_ADDR(0) & ~SECOND_MASK);
+BUILD_BUG_ON(BOOT_RELOC_VIRT_START & ~SECOND_MASK);
 /* 1GB aligned regions */
 #ifdef CONFIG_ARM_32
 BUILD_BUG_ON(XENHEAP_VIRT_START & ~FIRST_MASK);
-- 
2.34.1

[PATCH v5 06/13] xen/arm: add support for cache coloring configuration via device-tree

This commit adds the "llc-colors" Device Tree attribute that can be used
for DomUs and Dom0less color configurations. The syntax is the same used
for every color config.

Based on original work from: Luca Miccio 

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- static-mem check has been moved in a previous patch
- added domain_set_llc_colors_from_str() to set colors after domain creation
---
 docs/misc/arm/cache-coloring.rst| 48 -
 docs/misc/arm/device-tree/booting.txt   |  4 +++
 xen/arch/arm/dom0less-build.c   | 13 +++
 xen/arch/arm/include/asm/llc-coloring.h |  1 +
 xen/arch/arm/llc-coloring.c | 17 +
 5 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/docs/misc/arm/cache-coloring.rst b/docs/misc/arm/cache-coloring.rst
index acf82c3df8..ae1dd8f4af 100644
--- a/docs/misc/arm/cache-coloring.rst
+++ b/docs/misc/arm/cache-coloring.rst
@@ -10,7 +10,7 @@ If needed, change the maximum number of colors with
 ``CONFIG_NR_LLC_COLORS=``.
 
 Compile Xen and the toolstack and then configure it via
-`Command line parameters`_.
+`Command line parameters`_. For DomUs follow `DomUs configuration`_.
 
 Background
 **
@@ -114,6 +114,52 @@ Examples:
 | 0 | [0] |
 +---+-+
 
+DomUs configuration
+***
+
+DomUs colors can be set via Device Tree, also for Dom0less configurations
+(documentation at `docs/misc/arm/device-tree/booting.txt`) using the
+``llc-colors`` option. For example:
+
+::
+
+xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G 
dom0_max_vcpus=1 sched=null llc-coloring=on llc-way-size=64K 
dom0-llc-colors=2-6";
+xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen 
root=/dev/ram0"
+
+dom0 {
+compatible = "xen,linux-zimage" "xen,multiboot-module";
+reg = <0x0 0x100 0x0 15858176>;
+};
+
+dom0-ramdisk {
+compatible = "xen,linux-initrd" "xen,multiboot-module";
+reg = <0x0 0x200 0x0 20638062>;
+};
+
+domU0 {
+#address-cells = <0x1>;
+#size-cells = <0x1>;
+compatible = "xen,domain";
+memory = <0x0 0x4>;
+llc-colors = "4-8,10,11,12";
+cpus = <0x1>;
+vpl011 = <0x1>;
+
+module@200 {
+compatible = "multiboot,kernel", "multiboot,module";
+reg = <0x200 0xff>;
+bootargs = "console=ttyAMA0";
+};
+
+module@3000 {
+compatible = "multiboot,ramdisk", "multiboot,module";
+reg = <0x300 0xff>;
+};
+};
+
+**Note:** If no color configuration is provided for a domain, the default one,
+which corresponds to all available colors, is used instead.
+
 Known issues and limitations
 
 
diff --git a/docs/misc/arm/device-tree/booting.txt 
b/docs/misc/arm/device-tree/booting.txt
index bbd955e9c2..e9f9862e9c 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -162,6 +162,10 @@ with the following properties:
 
 An integer specifying the number of vcpus to allocate to the guest.
 
+- llc-colors
+A string specifying the LLC color configuration for the guest.
+Refer to "docs/misc/arm/cache_coloring.rst" for syntax.
+
 - vpl011
 
 An empty property to enable/disable a virtual pl011 for the guest to
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 1142f7f74a..eb39f5291f 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -850,6 +850,7 @@ void __init create_domUs(void)
 struct dt_device_node *node;
 const struct dt_device_node *cpupool_node,
 *chosen = dt_find_node_by_path("/chosen");
+const char *llc_colors_str = NULL;
 
 BUG_ON(chosen == NULL);
 dt_for_each_child_node(chosen, node)
@@ -993,6 +994,13 @@ void __init create_domUs(void)
 #endif
 }
 
+dt_property_read_string(node, "llc-colors", _colors_str);
+if ( llc_coloring_enabled && !llc_colors_str )
+panic("'llc-colors' is required when LLC coloring is enabled\n");
+else if ( !llc_coloring_enabled && llc_colors_str)
+printk(XENLOG_WARNING
+   "'llc-colors' found, but LLC coloring is disabled\n");
+
 /*
  * The variable max_init_domid is initialized with zero, so here it's
  * very important to use the pre-increment operator to call
@@ -1003,6 +1011,11 @@ void __init create_domUs(void)
 panic("Error creating domain %s (rc = %ld)\n",
   dt_node_name(node), PTR_ERR(d));
 
+if ( llc_coloring_enabled &&
+ (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
+panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
+  dt_node_name(node), rc);
+

[PATCH v5 00/13] Arm cache coloring

Shared caches in multi-core CPU architectures represent a problem for
predictability of memory access latency. This jeopardizes applicability
of many Arm platform in real-time critical and mixed-criticality
scenarios. We introduce support for cache partitioning with page
coloring, a transparent software technique that enables isolation
between domains and Xen, and thus avoids cache interference.

When creating a domain, a simple syntax (e.g. `0-3` or `4-11`) allows
the user to define assignments of cache partitions ids, called colors,
where assigning different colors guarantees no mutual eviction on cache
will ever happen. This instructs the Xen memory allocator to provide
the i-th color assignee only with pages that maps to color i, i.e. that
are indexed in the i-th cache partition.

The proposed implementation supports the dom0less feature.
The proposed implementation doesn't support the static-mem feature.
The solution has been tested in several scenarios, including Xilinx Zynq
MPSoCs.

In this patch series there are two major unacceptable workarounds for which
I want to ask you for comments:
 - #3: allocate_memory() has been moved in dom0less_build.c, so I just copied
 it back to make it compile.
 - #13: consider_modules() has been moved to arm32 only. Again I just copied it.

Carlo Nonato (12):
  xen/common: add cache coloring common code
  xen/arm: add cache coloring initialization
  xen/arm: add Dom0 cache coloring support
  xen: extend domctl interface for cache coloring
  tools: add support for cache coloring configuration
  xen/arm: add support for cache coloring configuration via device-tree
  xen/page_alloc: introduce init_free_page_fields() helper
  xen/page_alloc: introduce preserved page flags macro
  xen: add cache coloring allocator for domains
  xen/arm: use domain memory to allocate p2m page tables
  Revert "xen/arm: Remove unused BOOT_RELOC_VIRT_START"
  xen/arm: add cache coloring support for Xen

Luca Miccio (1):
  xen/arm: add Xen cache colors command line parameter

 docs/man/xl.cfg.5.pod.in|  10 +
 docs/misc/arm/cache-coloring.rst| 209 
 docs/misc/arm/device-tree/booting.txt   |   4 +
 docs/misc/xen-command-line.pandoc   |  61 
 tools/include/libxl.h   |   5 +
 tools/include/xenctrl.h |   9 +
 tools/libs/ctrl/xc_domain.c |  34 ++
 tools/libs/light/libxl_create.c |   9 +
 tools/libs/light/libxl_types.idl|   1 +
 tools/xl/xl_parse.c |  38 ++-
 xen/arch/Kconfig|  28 ++
 xen/arch/arm/Kconfig|   1 +
 xen/arch/arm/Makefile   |   1 +
 xen/arch/arm/alternative.c  |   9 +-
 xen/arch/arm/arm64/mmu/head.S   |  48 +++
 xen/arch/arm/arm64/mmu/mm.c |  26 +-
 xen/arch/arm/dom0less-build.c   |  19 ++
 xen/arch/arm/domain_build.c |  60 +++-
 xen/arch/arm/include/asm/llc-coloring.h |  46 +++
 xen/arch/arm/include/asm/mm.h   |  12 +-
 xen/arch/arm/include/asm/mmu/layout.h   |   2 +
 xen/arch/arm/include/asm/processor.h|  16 +
 xen/arch/arm/llc-coloring.c | 409 
 xen/arch/arm/mmu/p2m.c  |   4 +-
 xen/arch/arm/mmu/setup.c|  83 -
 xen/arch/arm/mmu/smpboot.c  |  11 +-
 xen/arch/arm/psci.c |   9 +-
 xen/arch/arm/setup.c| 172 +-
 xen/arch/arm/smpboot.c  |   9 +-
 xen/common/Kconfig  |   3 +
 xen/common/domain.c |   4 +
 xen/common/domctl.c |  11 +
 xen/common/keyhandler.c |   4 +
 xen/common/page_alloc.c | 217 -
 xen/include/public/domctl.h |  10 +-
 xen/include/xen/llc-coloring.h  |  53 +++
 xen/include/xen/sched.h |   5 +
 37 files changed, 1610 insertions(+), 42 deletions(-)
 create mode 100644 docs/misc/arm/cache-coloring.rst
 create mode 100644 xen/arch/arm/include/asm/llc-coloring.h
 create mode 100644 xen/arch/arm/llc-coloring.c
 create mode 100644 xen/include/xen/llc-coloring.h

-- 
2.34.1

[PATCH v5 02/13] xen/arm: add cache coloring initialization

This commit implements functions declared in the LLC coloring common header
for arm64 and adds documentation. It also adds two command line options: a
runtime switch for the cache coloring feature and the LLC way size
parameter.

The feature init function consists of an auto probing of the cache layout
necessary to retrieve the LLC way size which is used to compute the number
of platform colors. It also adds a debug-key to dump general cache coloring
info.

Static memory allocation and cache coloring are incompatible because static
memory can't be guaranteed to use only colors assigned to the domain.
Panic during domUs creation when both are enabled.

Based on original work from: Luca Miccio 

Signed-off-by: Carlo Nonato 
Signed-off-by: Marco Solieri 
---
v5:
- used - instead of _ for filenames
- moved static-mem check in this patch
- moved dom0 colors parsing in next patch
- moved color allocation and configuration in next patch
- moved check_colors() in next patch
- colors are now printed in short form
v4:
- added "llc-coloring" cmdline option for the boot-time switch
- dom0 colors are now checked during domain init as for any other domain
- fixed processor.h masks bit width
- check for overflow in parse_color_config()
- check_colors() now checks also that colors are sorted and unique
---
 docs/misc/arm/cache-coloring.rst|  97 ++
 docs/misc/xen-command-line.pandoc   |  28 +
 xen/arch/arm/Kconfig|   1 +
 xen/arch/arm/Makefile   |   1 +
 xen/arch/arm/dom0less-build.c   |   6 +
 xen/arch/arm/include/asm/llc-coloring.h |  28 +
 xen/arch/arm/include/asm/processor.h|  16 +++
 xen/arch/arm/llc-coloring.c | 161 
 xen/arch/arm/setup.c|   7 ++
 9 files changed, 345 insertions(+)
 create mode 100644 docs/misc/arm/cache-coloring.rst
 create mode 100644 xen/arch/arm/include/asm/llc-coloring.h
 create mode 100644 xen/arch/arm/llc-coloring.c

diff --git a/docs/misc/arm/cache-coloring.rst b/docs/misc/arm/cache-coloring.rst
new file mode 100644
index 00..eabf8f5d1b
--- /dev/null
+++ b/docs/misc/arm/cache-coloring.rst
@@ -0,0 +1,97 @@
+Xen cache coloring user guide
+=
+
+The cache coloring support in Xen allows to reserve Last Level Cache (LLC)
+partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
+
+To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
+
+If needed, change the maximum number of colors with
+``CONFIG_NR_LLC_COLORS=``.
+
+Compile Xen and the toolstack and then configure it via
+`Command line parameters`_.
+
+Background
+**
+
+Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
+to each core (hence using multiple cache units), while the last level is shared
+among all of them. Such configuration implies that memory operations on one
+core (e.g. running a DomU) are able to generate interference on another core
+(e.g .hosting another DomU). Cache coloring allows eliminating this
+mutual interference, and thus guaranteeing higher and more predictable
+performances for memory accesses.
+The key concept underlying cache coloring is a fragmentation of the memory
+space into a set of sub-spaces called colors that are mapped to disjoint cache
+partitions. Technically, the whole memory space is first divided into a number
+of subsequent regions. Then each region is in turn divided into a number of
+subsequent sub-colors. The generic i-th color is then obtained by all the
+i-th sub-colors in each region.
+
+::
+
+Region jRegion j+1
+.   
+. . .
+.   .
+_ _ ___ _ _ _ _
+| | | | | | |
+| c_0 | c_1 | | c_n | c_0 | c_1 |
+   _ _ _|_|_|_ _ _|_|_|_|_ _ _
+:   :
+:   :... ... .
+:color 0
+:... ... .
+:
+  . . ..:
+
+There are two pragmatic lesson to be learnt.
+
+1. If one wants to avoid cache interference between two domains, different
+   colors needs to be used for their memory.
+
+2. Color assignment must privilege contiguity in the partitioning. E.g.,
+   assigning colors (0,1) to domain I  and (2,3) to domain  J is better than
+   assigning colors (0,2) to I and (1,3) to J.
+
+How to compute the number of colors
+***
+
+To compute the number of available colors for a specific platform, the size of
+an LLC way and the page size used by Xen must be known. The first parameter can
+be found in

[PATCH v5 05/13] tools: add support for cache coloring configuration