On Fri, Aug 9, 2019 at 8:57 PM Adam Williamson
wrote:
>
> Hey folks! I'm starting a new thread for this to trim the recipient
> list a bit and include devel@ and coreos@.
>
> The Story So Far: there is a Fedora release criterion which requires
> Fedora to boot on Xen:
>
> "The release must boot
flight 139860 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139860/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-xl-xsm 11 debian-fixup fail REGR. vs. 139832
test-amd64-amd64-xl
flight 139862 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139862/
Failures and problems with tests :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-i386-xl-qemuu-win10-i386 broken in 139835
Tests
Further looking into this, it seems the problem occurs on this section
of the grub.cfg:
### BEGIN /etc/grub.d/08_fallback_counting ###
insmod increment
# Check if boot_counter exists and boot_success=0 to activate this behaviour.
if [ -n "${boot_counter}" -a "${boot_success}" = "0" ]; then
#
Hey folks! I'm starting a new thread for this to trim the recipient
list a bit and include devel@ and coreos@.
The Story So Far: there is a Fedora release criterion which requires
Fedora to boot on Xen:
"The release must boot successfully as Xen DomU with releases providing
a functional,
On Fri, 9 Aug 2019, 23:21 Stefano Stabellini,
wrote:
> On Wed, 7 Aug 2019, Julien Grall wrote:
> > Hi Stefano,
> >
> > On 06/08/2019 22:49, Stefano Stabellini wrote:
> > > As we parse the device tree in Xen, keep track of the reserved-memory
> > > regions as they need special treatment
flight 139861 freebsd-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139861/
Perfect :-)
All tests in this flight passed as required
version targeted for testing:
freebsd d3d0bd153cf3e76effd2e9e8c66a847d1c5defe3
baseline version:
freebsd
On Thu, 8 Aug 2019, Volodymyr Babchuk wrote:
> Hi Stefano,
>
> Stefano Stabellini writes:
>
> > Don't allow reserved-memory regions to be remapped into any guests,
> > until reserved-memory regions are properly supported in Xen. For now,
> > do not call iomem_permit_access for them.
> >
> >
flight 139856 linux-4.19 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139856/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
build-armhf-pvops 6 kernel-build fail REGR. vs. 129313
Tests which did not
On Wed, 7 Aug 2019, Julien Grall wrote:
> Hi Stefano,
>
> On 06/08/2019 22:49, Stefano Stabellini wrote:
> > As we parse the device tree in Xen, keep track of the reserved-memory
> > regions as they need special treatment (follow-up patches will make use
> > of the stored information.)
> >
> >
On Wed, 7 Aug 2019, Julien Grall wrote:
> Hi Stefano,
>
> On 06/08/2019 22:49, Stefano Stabellini wrote:
> > Add new parameters to device_tree_for_each_node: node, depth,
> > address_cells, size_cells.
>
> address_cells (resp. size_cells) are named address_cells_p (resp.
> size_cells_p) in the
flight 139879 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139879/
Failures :-/ but no regressions.
Tests which did not succeed, but are not blocking:
test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
test-arm64-arm64-xl-xsm
On Wed, 7 Aug 2019, Julien Grall wrote:
> Hi Stefano,
>
> On 06/08/2019 22:49, Stefano Stabellini wrote:
> > Change the signature of process_memory_node to match
> > device_tree_node_func. Thanks to this change, the next patch will be
> > able to use device_tree_for_each_node to call
On Wed, 7 Aug 2019, Julien Grall wrote:
> On 06/08/2019 22:49, Stefano Stabellini wrote:
> > static void __init process_multiboot_node(const void *fdt, int node,
> > const char *name,
> > u32 address_cells,
flight 139859 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139859/
Perfect :-)
All tests in this flight passed as required
version targeted for testing:
ovmf 4b1b7c1913092d73d689d8086dcfa579c0217dc8
baseline version:
ovmf
On Wed, 7 Aug 2019, Julien Grall wrote:
> Hi Stefano,
>
> On 06/08/2019 22:49, Stefano Stabellini wrote:
> > Improve early_print_info to also print the banks saved in
> > bootinfo.reserved_mem. Print them right after RESVD, increasing the same
> > index.
> >
> > Signed-off-by: Stefano Stabellini
Hi Stefano,
On 8/9/19 12:12 AM, Stefano Stabellini wrote:
We don't have a clear way to know how many virtual SPIs we need for the
boot domains. Introduce a new option under xen,domain to specify the
number of SPIs to allocate for the domain.
The property is optional, when absent, we'll use the
Hi Stefano,
On 8/9/19 12:12 AM, Stefano Stabellini wrote:
Detect "multiboot,device-tree" compatible nodes. Add them to the bootmod
array as BOOTMOD_GUEST_DTB. In kernel_probe, find the right
BOOTMOD_GUEST_DTB and store a pointer to it in dtb_bootmodule.
Signed-off-by: Stefano Stabellini
---
Hi Stefano,
On 8/9/19 12:12 AM, Stefano Stabellini wrote:
Scan the user provided dtb fragment at boot. For each device node, map
memory to guests, and route interrupts and set up the iommu.
The iommu is set up by passing the node of the device to assign on the
host device tree. The path is
Hi Stefano,
On 8/9/19 12:12 AM, Stefano Stabellini wrote:
Read the dtb fragment corresponding to a passthrough device from memory
at the location referred to by the "multiboot,dtb" compatible node.
Copy the fragment to the guest dtb.
Add a dtb_bootmodule field to struct kernel_info to find
Stefano,
Stefano Stabellini writes:
> We don't have a clear way to know how many virtual SPIs we need for the
> boot domains. Introduce a new option under xen,domain to specify the
> number of SPIs to allocate for the domain.
>
> The property is optional, when absent, we'll use the physical
flight 139850 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139850/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-migrupgrade 11 xen-boot/dst_host fail REGR. vs. 139714
Tests which did
Hi, Julien
What will actually happen if the transaction fails again? For instance,
if the IOVA was not mapped. Will you receive the interrupt again?
If so, are you going to make the flush again and again until the guest
is killed?
This is a good question. I think, if
Hi Stefano,
I figured out I may want to read the docs before looking at the code :).
On 09/08/2019 00:12, Stefano Stabellini wrote:
Signed-off-by: Stefano Stabellini
---
Changes in v3:
- add nr_spis
- change description of interrupts and interrupt-parent
Changes in v2:
- device tree
Hi Stewart,
On 09/08/2019 19:34, Stewart Hildebrand wrote:
On Friday, August 9, 2019 2:24 PM, Stefano Stabellini
On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
Here is Jeff's initial patch for the issue.
I committed Julien's patch for now,
Great! Thanks!
but if we need to make any
On Friday, August 9, 2019 2:24 PM, Stefano Stabellini
>On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>> Here is Jeff's initial patch for the issue.
>
>I committed Julien's patch for now,
Great! Thanks!
>but if we need to make any changes
>or decide for a better alternative, we can always revert
Stefano Stabellini writes:
> Detect "multiboot,device-tree" compatible nodes. Add them to the bootmod
> array as BOOTMOD_GUEST_DTB. In kernel_probe, find the right
> BOOTMOD_GUEST_DTB and store a pointer to it in dtb_bootmodule.
>
> Signed-off-by: Stefano Stabellini
>
> ---
> Changes in v2:
>
On Fri, 9 Aug 2019, Dario Faggioli wrote:
> On Wed, 2019-08-07 at 11:22 -0700, Stefano Stabellini wrote:
> > Hi Dario, George,
> >
> > Dom0less with sched=null is broken on staging, it simply hangs soon
> > after Xen is finished loading things. My impression is that vcpus are
> > not actually
Stefano Stabellini writes:
> Scan the user provided dtb fragment at boot. For each device node, map
> memory to guests, and route interrupts and set up the iommu.
>
> The iommu is set up by passing the node of the device to assign on the
> host device tree. The path is specified in the device
On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 9:39 AM, Jan Beulich wrote:
> >On 09.08.2019 14:14, Julien Grall wrote:
> >> Combining of buddies happens only such that the resulting larger buddy
> >> is still order-aligned. To cross a zone boundary while merging, the
>
On Friday, August 9, 2019 9:39 AM, Jan Beulich wrote:
>On 09.08.2019 14:14, Julien Grall wrote:
>> Combining of buddies happens only such that the resulting larger buddy
>> is still order-aligned. To cross a zone boundary while merging, the
>> implication is that both the buddy [0, 2^n-1] and the
Hi, Julien
On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko
Introduce a separate file to keep various helpers which could be used
by more than one IOMMU driver in order not to duplicate code.
The first candidates to be moved to the new file are SMMU driver's
Hi Stefano,
On 09/08/2019 00:12, Stefano Stabellini wrote:
Move the interrupt handling code out of handle_device to a new function
so that it can be reused for dom0less VMs later.
Signed-off-by: Stefano Stabellini
---
Changes in v3:
- add patch
The diff is hard to read but I just moved the
On Wed, 2019-08-07 at 11:22 -0700, Stefano Stabellini wrote:
> Hi Dario, George,
>
> Dom0less with sched=null is broken on staging, it simply hangs soon
> after Xen is finished loading things. My impression is that vcpus are
> not actually started. I did a git bisection and it pointed to:
>
>
>
> On Wed 07-08-19 19:36:37, Ira Weiny wrote:
> > On Wed, Aug 07, 2019 at 10:46:49AM +0200, Michal Hocko wrote:
> > > > So I think your debug option and my suggested renaming serve a bit
> > > > different purposes (and thus both make sense). If you do the
> > > > renaming, you can just grep to
On Fri, 9 Aug 2019, Julien Grall wrote:
> Combining of buddies happens only such that the resulting larger buddy
> is still order-aligned. To cross a zone boundary while merging, the
> implication is that both the buddy [0, 2^n-1] and the buddy
> [2^n, 2^(n+1)-1] are free.
>
> Ideally we want to
flight 139869 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139869/
Failures :-/ but no regressions.
Tests which did not succeed, but are not blocking:
test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
test-arm64-arm64-xl-xsm
Hi Stefano,
Stefano Stabellini writes:
> Read the dtb fragment corresponding to a passthrough device from memory
> at the location referred to by the "multiboot,dtb" compatible node.
>
> Copy the fragment to the guest dtb.
>
> Add a dtb_bootmodule field to struct kernel_info to find the dtb
>
Hi all,
Following the discussion we had at the Developer Summit (see
https://wiki.xenproject.org/wiki/Design_Sessions_2019#Community_Issues_.2F_Improvements_-_Communication.2C_Code_of_Conduct.2C_etc.
for notes) I put together a draft for the Code of Conduct which can be found
here as well as
Hi Oleksandr,
On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko
Introduce a separate file to keep various helpers which could be used
by more than one IOMMU driver in order not to duplicate code.
The first candidates to be moved to the new file are SMMU driver's
Hello Stefano,
Stefano Stabellini writes:
> Move the interrupt handling code out of handle_device to a new function
> so that it can be reused for dom0less VMs later.
>
> Signed-off-by: Stefano Stabellini
> ---
> Changes in v3:
> - add patch
>
> The diff is hard to read but I just moved the
Hi,
On 07/08/2019 01:23, Stefano Stabellini wrote:
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 2674caa005..063523c7f7 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -920,6 +920,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
unsigned
Hi Stefano,
On 07/08/2019 01:23, Stefano Stabellini wrote:
Add a new memory policy option for the iomem parameter.
Possible values are:
- arm_dev_nGnRE, Device-nGnRE, the default on Arm
- arm_mem_WB, WB cachable memory
- default
Store the parameter in a new field in libxl_iomem_range.
Pass
flight 139853 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/139853/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 139829
Tests which did
Intel Core/Xeon CPUs have two registers per architectural segment register, to
allow for sufficient speculation to cover a typical context switch (one write
to each segment). Unfortunately, these CPUs speculate over a faulting
descriptor load, and for a period of time, operate with the stale
On 09.08.2019 17:36, Andrew Cooper wrote:
On 09/08/2019 13:50, Jan Beulich wrote:
On 09.08.2019 14:39, Andrew Cooper wrote:
Xen, being 64bit only these days, has no use for a 32bit Ring 0 code
segment.
Delete __HYPERVISOR_CS32 and remove it from the GDTs. Also delete
__HYPERVISOR_CS64 and
On 09.08.2019 16:59, Andrew Cooper wrote:
On 09/08/2019 13:32, Jan Beulich wrote:
Signed-off-by: Jan Beulich
---
TBD: Especially with how the previous patch now works I'm unconvinced of
the utility of the linker script alignment check. It in particular
doesn't check the property
Commit fd35f32b4b ("tools/x86emul: Use struct cpuid_policy in the
userspace test harnesses") didn't account for the dependencies of
cpuid-autogen.h to potentially change between incremental builds. In
particular the harness has a "run" goal which is supposed to be usable
independently of the rest
On 09/08/2019 13:50, Jan Beulich wrote:
> On 09.08.2019 14:39, Andrew Cooper wrote:
>> Xen, being 64bit only these days, has no use for a 32bit Ring 0 code
>> segment.
>>
>> Delete __HYPERVISOR_CS32 and remove it from the GDTs. Also delete
>> __HYPERVISOR_CS64 and use __HYPERVISOR_CS uniformly.
>
On 09/08/2019 14:18, Jan Beulich wrote:
> On 09.08.2019 15:07, Andrew Cooper wrote:
>> On 09/08/2019 13:43, Jan Beulich wrote:
>>> On 09.08.2019 14:19, Andrew Cooper wrote:
On 09/08/2019 11:40, Jan Beulich wrote:
> --- /dev/null
> +++ b/xen/arch/x86/desc.c
> @@ -0,0 +1,109 @@
On 09/08/2019 16:17, Andrew Cooper wrote:
> The _struct suffix on tss_struct is quite redundant. Rename it to tss64 to
> mirror the existing tss32 structure we have in HVM's Task Switch logic.
>
> The per-cpu name having an init_ prefix is also wrong. There is exactly one
> TSS for each CPU,
The _struct suffix on tss_struct is quite redundant. Rename it to tss64 to
mirror the existing tss32 structure we have in HVM's Task Switch logic.
The per-cpu name having an init_ prefix is also wrong. There is exactly one
TSS for each CPU, which is used for the lifetime of the system. Drop
Today there are two distinct scenarios for vcpu_create(): either for
creation of idle-domain vcpus (vcpuid == processor) or for creation of
"normal" domain vcpus (including dom0), where the caller selects the
initial processor on a round-robin scheme of the allowed processors
(allowed being based
From: David Woodhouse
Ditch the bootsym() access from C code for the variables populated by
16-bit boot code. As well as being cleaner this also paves the way for
not having the 16-bit boot code in low memory for no-real-mode or EFI
loader boots at all.
Signed-off-by: David Woodhouse
---
From: David Woodhouse
Where booted from EFI or with no-real-mode, there is no need to stomp
on low memory with the 16-bit boot code. Instead, just go straight to
trampoline_protmode_entry() at its physical location within the Xen
image.
For now, the boot code (including the EFI loader path) still
From: David Woodhouse
We appear to have implemented a memcpy() in the low-memory trampoline
which we then call into from __start_xen(), for no adequately defined
reason.
Kill it with fire.
Signed-off-by: David Woodhouse
Acked-by: Andrew Cooper
---
v2: Minor fixups from Andrew.
From: David Woodhouse
As a first step toward using the low-memory trampoline only when necessary
for a legacy boot without no-real-mode, clean up the relocations into
three separate groups.
• bootsym() is now used only at boot time when no-real-mode isn't set.
• bootdatasym() is for
From: David Woodhouse
In preparation for splitting the boot and permanent trampolines from
each other. Some of these will change back, but most are boot so do the
plain search/replace that way first, then a subsequent patch will extract
the permanent trampoline code.
Signed-off-by: David
From: David Woodhouse
If the no-real-mode flag is set, don't go there at all. This is a prelude
to not even putting it there in the first place.
Signed-off-by: David Woodhouse
---
xen/arch/x86/boot/head.S | 10 ++
xen/arch/x86/boot/trampoline.S | 4
2 files changed, 10
Some cleanups for the boot path, originally inspired by an attempt to
avoid scribbling on arbitrarily-chosen low memory.
In the no-real-mode case we don't need to bounce through low memory at
all; we can run the 32-bit trampoline in-place in the Xen image.
The variables containing information
On 09/08/2019 13:32, Jan Beulich wrote:
> Signed-off-by: Jan Beulich
> ---
> TBD: Especially with how the previous patch now works I'm unconvinced of
> the utility of the linker script alignment check. It in particular
> doesn't check the property we're after in this patch, i.e. the
In order to be able to move cpus to cpupools with core scheduling
active it is mandatory to merge multiple cpus into one scheduling
resource or to split a scheduling resource with multiple cpus in it
into multiple scheduling resources. This in turn requires modifying
the cpu <-> scheduling
Switch credit2 scheduler completely from vcpu to sched_unit usage.
As we are touching lots of lines remove some white space at the end of
the line, too.
Signed-off-by: Juergen Gross
---
xen/common/sched_credit2.c | 820 ++---
1 file changed, 403
Add a scheduling granularity enum ("cpu", "core", "socket") for
specification of the scheduling granularity. Initially it is set to
"cpu"; this can be modified by the new boot parameter (x86 only)
"sched-gran".
According to the selected granularity sched_granularity is set after
all cpus are
cpupool_domain_cpumask() is used by scheduling to select cpus or to
iterate over cpus. In order to support scheduling units spanning
multiple cpus let cpupool_domain_cpumask() return a cpumask with only
one bit set per scheduling resource.
Signed-off-by: Juergen Gross
---
xen/common/cpupool.c
With core scheduling active it is necessary to move multiple cpus at
the same time to or from a cpupool in order to avoid split scheduling
resources in between.
Signed-off-by: Juergen Gross
---
V1: new patch
---
xen/common/cpupool.c | 100 +
On- and offlining cpus with core scheduling is rather complicated as
the cpus are taken on- or offline one by one, but scheduling wants them
rather to be handled per core.
As the future plan is to be able to select scheduling granularity per
cpupool prepare that by storing the granularity in
Having a pointer to struct cpupool in struct sched_resource instead
of per cpu is enough.
Signed-off-by: Juergen Gross
---
V1: new patch
---
xen/common/cpupool.c | 4 +---
xen/common/sched_credit.c | 2 +-
xen/common/sched_rt.c | 2 +-
xen/common/schedule.c | 8
When core or socket scheduling are active enabling or disabling smt is
not possible as that would require a major host reconfiguration.
Add a bool sched_disable_smt_switching which will be set for core or
socket scheduling.
Signed-off-by: Juergen Gross
---
V1:
- new patch
V2:
- EBUSY as return
Switch null scheduler completely from vcpu to sched_unit usage.
Signed-off-by: Juergen Gross
---
xen/common/sched_null.c | 333
1 file changed, 165 insertions(+), 168 deletions(-)
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
In order to make it easy to iterate over sched_unit elements of a
domain, build a single linked list and add an iterator for it. The new
list is guarded by the same mechanisms as the vcpu linked list as it
is modified only via vcpu_create() or vcpu_destroy().
For completeness add another iterator
When entering deep sleep states all domains are paused resulting in
all cpus only running idle vcpus. This enables us to stop scheduling
completely in order to avoid synchronization problems with core
scheduling when individual cpus are offlined.
Disabling the scheduler is done by replacing the
Rename vcpu_schedule_[un]lock[_irq]() to unit_schedule_[un]lock[_irq]()
and let it take a sched_unit pointer instead of a vcpu pointer as
parameter.
Signed-off-by: Juergen Gross
---
xen/common/sched_credit.c | 17
xen/common/sched_credit2.c | 40
With core scheduling active schedule_cpu_[add/rm]() has to cope with
different scheduling granularity: a cpu not in any cpupool is subject
to granularity 1 (cpu scheduling), while a cpu in a cpupool might be
in a scheduling resource with more than one cpu.
Handle that by having arrays of old/new
Switch rt scheduler completely from vcpu to sched_unit usage.
Signed-off-by: Juergen Gross
---
xen/common/sched_rt.c | 356 --
1 file changed, 174 insertions(+), 182 deletions(-)
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index
Instead of letting schedule_cpu_switch() handle moving cpus from and
to cpupools, split it into schedule_cpu_add() and schedule_cpu_rm().
This will allow us to drop allocating/freeing scheduler data for free
cpus as the idle scheduler doesn't need such data.
Signed-off-by: Juergen Gross
---
V1:
Especially in the do_schedule() functions of the different schedulers
using smp_processor_id() for the local cpu number is correct only if
the sched_unit is a single vcpu. As soon as larger sched_units are
used most uses should be replaced by the cpu number of the local
sched_resource instead.
Today the vcpu runstate of a new scheduled vcpu is always set to
"running" even if at that time vcpu_runnable() is already returning
false due to a race (e.g. with pausing the vcpu).
With core scheduling this can no longer work as not all vcpus of a
schedule unit have to be "running" when being
Having a pointer to struct scheduler in struct sched_resource instead
of per cpu is enough.
Signed-off-by: Juergen Gross
---
V1: new patch
---
xen/common/sched_credit.c | 18 +++---
xen/common/sched_credit2.c | 3 ++-
xen/common/schedule.c | 15 +++
This prepares making the different schedulers vcpu agnostic.
Note that some scheduler specific accessor functions are misnamed after
this patch. This will be corrected in later patches.
Signed-off-by: Juergen Gross
---
xen/common/sched_arinc653.c | 4 ++--
xen/common/sched_credit.c | 6
Switch credit scheduler completely from vcpu to sched_unit usage.
Signed-off-by: Juergen Gross
---
xen/common/sched_credit.c | 503 +++---
1 file changed, 250 insertions(+), 253 deletions(-)
diff --git a/xen/common/sched_credit.c
In order to prepare for multiple vcpus per schedule unit move struct
task_slice in schedule() from the local stack into struct sched_unit
of the currently running unit. To make access easier for the single
schedulers add the pointer of the currently running unit as a parameter
of do_schedule().
The credit scheduler calls vcpu_pause_nosync() and vcpu_unpause()
today. Add sched_unit_pause_nosync() and sched_unit_unpause() to
perform the same operations on scheduler units instead.
Signed-off-by: Juergen Gross
---
xen/common/sched_credit.c | 6 +++---
xen/include/xen/sched-if.h | 10
Use sched_units instead of vcpus in schedule(). This includes the
introduction of sched_unit_runstate_change() as a replacement of
vcpu_runstate_change() in schedule().
Signed-off-by: Juergen Gross
---
xen/common/schedule.c | 68 +--
1 file
Affinities are scheduler specific attributes, they should be per
scheduling unit. So move all affinity related fields in struct vcpu
to struct sched_unit. While at it switch affinity related functions in
sched-if.h to use a pointer to sched_unit instead of vcpu as parameter.
The affinity_broken
When switching sched units synchronize all vcpus of the new unit to be
scheduled at the same time.
A variable sched_granularity is added which holds the number of vcpus
per schedule unit.
As tasklets require scheduling the idle unit, it is required to set the
tasklet_work_scheduled parameter of
This prepares support of larger scheduling granularities, e.g. core
scheduling.
While at it move sched_has_urgent_vcpu() from include/asm-x86/cpuidle.h
into schedule.c removing the need for including sched-if.h in
cpuidle.h and multiple other C sources.
Signed-off-by: Juergen Gross
Acked-by:
Rename the scheduler related perf counters from vcpu* to unit* where
appropriate.
Signed-off-by: Juergen Gross
---
xen/common/sched_credit.c| 32
xen/common/sched_credit2.c | 18 +-
xen/common/sched_null.c | 18 +-
Where appropriate switch from for_each_vcpu() to for_each_sched_unit()
in order to prepare core scheduling.
As it is beneficial once here and for sure in future add a
unit_scheduler() helper and let vcpu_scheduler() use it.
Signed-off-by: Juergen Gross
---
V2:
- handle affinity_broken correctly
sched_move_irqs() should work on a sched_unit as that is the unit
moved between cpus.
Rename the current function to vcpu_move_irqs() as it is still needed
in schedule().
Signed-off-by: Juergen Gross
---
xen/common/schedule.c | 18 +-
1 file changed, 13 insertions(+), 5
With a scheduling granularity greater than 1 multiple vcpus share the
same struct sched_unit. Support that.
Setting the initial processor must be done carefully: we can't use
sched_set_res() as that relies on for_each_sched_unit_vcpu() which in
turn needs the vcpu already as a member of the
vcpu_wake() and vcpu_sleep() need to be made core scheduling aware:
they might need to switch a single vcpu of an already scheduled unit
between running and not running.
Especially when vcpu_sleep() for a vcpu is being called by a vcpu of
the same scheduling unit special care must be taken in
Add counters to struct sched_unit summing up runstates of associated
vcpus.
Signed-off-by: Juergen Gross
---
RFC V2: add counters for each possible runstate
---
xen/common/schedule.c | 5 +
xen/include/xen/sched.h | 2 ++
2 files changed, 7 insertions(+)
diff --git
In order to prepare core- and socket-scheduling use a new struct
sched_unit instead of struct vcpu for interfaces of the different
schedulers.
Rename the per-scheduler functions insert_vcpu and remove_vcpu to
insert_unit and remove_unit to reflect the change of the parameter.
In the schedulers
Instead of returning a physical cpu number let pick_cpu() return a
scheduler resource instead. Rename pick_cpu() to pick_resource() to
reflect that change.
Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
xen/common/sched_arinc653.c | 12 ++--
xen/common/sched_credit.c
When scheduling a unit with multiple vcpus there is no guarantee all
vcpus are available (e.g. above maxvcpus or vcpu offline). Fall back to
idle vcpu of the current cpu in that case. This requires storing the
correct schedule_unit pointer in the idle vcpu as long as it is used as
fallback vcpu.
In preparation of core scheduling let the percpu pointer
schedule_data.curr point to a struct sched_unit instead of the related
vcpu. At the same time rename the per-vcpu scheduler specific structs
to per-unit ones.
Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
Add support for core- and socket-scheduling in the Xen hypervisor.
Via boot parameter sched-gran=core (or sched-gran=socket)
it is possible to change the scheduling granularity from cpu (the
default) to either whole cores or even sockets.
All logical cpus (threads) of the core or socket are
Add the following helpers using a sched_unit as input instead of a
vcpu:
- is_idle_unit() similar to is_idle_vcpu()
- is_unit_online() similar to is_vcpu_online()
- unit_runnable() like vcpu_runnable()
- sched_set_res() to set the current processor of a unit
- sched_unit_cpu() to get the current
Add a percpu variable holding the index of the cpu in the current
sched_resource structure. This index is used to get the correct vcpu
of a sched_unit on a specific cpu.
For now this index will be zero for all cpus, but with core scheduling
it will be possible to have higher values, too.
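
To illustrate the per-cpu index described above, here is a minimal sketch in C. It is not the code from the patch: all names (toy_vcpu, toy_unit, toy_resource, toy_res_index, toy_unit_vcpu_on, MAX_GRAN) are hypothetical, and the real series keeps the index in a percpu variable rather than searching for it. The sketch only shows the idea that a cpu's position within its scheduling resource selects the matching vcpu of the unit, and that with cpu granularity the index is always zero.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GRAN 8 /* hypothetical upper bound on cpus per resource */

struct toy_vcpu {
    int vcpu_id;
};

/* A scheduling unit: one vcpu slot per cpu of the resource it runs on. */
struct toy_unit {
    unsigned int nr_vcpus;
    struct toy_vcpu *vcpus[MAX_GRAN];
};

/* A scheduling resource: the physical cpus that are scheduled together. */
struct toy_resource {
    unsigned int nr_cpus;
    unsigned int cpus[MAX_GRAN];
};

/* Index of cpu within res, or res->nr_cpus if cpu is not a member. */
static unsigned int toy_res_index(const struct toy_resource *res,
                                  unsigned int cpu)
{
    unsigned int i;

    assert(res->nr_cpus <= MAX_GRAN);
    for (i = 0; i < res->nr_cpus; i++)
        if (res->cpus[i] == cpu)
            return i;
    return res->nr_cpus;
}

/*
 * The vcpu of unit that belongs on cpu, or NULL if the unit has no vcpu
 * for that slot (a caller could then fall back to an idle vcpu).
 */
static struct toy_vcpu *toy_unit_vcpu_on(const struct toy_unit *unit,
                                         const struct toy_resource *res,
                                         unsigned int cpu)
{
    unsigned int idx = toy_res_index(res, cpu);

    if (idx >= res->nr_cpus || idx >= unit->nr_vcpus)
        return NULL;
    return unit->vcpus[idx];
}
```

With a resource made of cpus 4 and 5 and a unit holding two vcpus, cpu 4 maps to the first vcpu and cpu 5 to the second. A resource with a single cpu always yields index 0, which matches the "zero for all cpus" case before core scheduling is enabled.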