On 4/16/2021 5:04 AM, Vitaly Kuznetsov wrote:
Vineeth Pillai writes:
+#if IS_ENABLED(CONFIG_HYPERV)
+static void hv_init_vmcb(struct vmcb *vmcb)
+{
+ struct hv_enlightenments *hve = &vmcb->hv_enlightenments;
+
+ if (npt_enabled &&
+ ms_hyperv.nest
On 4/16/2021 4:58 AM, Vitaly Kuznetsov wrote:
+
+#if IS_ENABLED(CONFIG_HYPERV)
+struct __packed hv_enlightenments {
+ struct __packed hv_enlightenments_control {
+ u32 nested_flush_hypercall:1;
+ u32 msr_bitmap:1;
+ u32 enlightened_npt_tlb: 1;
On 4/16/2021 4:36 AM, Vitaly Kuznetsov wrote:
struct kvm_vm_stat {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 58fa8c029867..614b4448a028 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
I still think that using arch/x86/kvm/hyperv.[ch] for
It may make sense to expand this a bit as it is probably unclear how the
change is related to SVM.
Something like:
HYPERV_CPUID_NESTED_FEATURES CPUID leaf can be present on both Intel and
AMD Hyper-V guests. Previously, the code was using
HV_X64_ENLIGHTENED_VMCS_RECOMMENDED feature bit to
support")
Signed-off-by: Vineeth Pillai
---
arch/x86/kvm/svm/svm.c | 48 ++
1 file changed, 48 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f59f03b5c722..cff01256c47e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b
e MSR bitmaps for changes. Instead, the L1
hypervisor must invalidate the corresponding clean field after making
changes to one of the MSR bitmaps."
Enable this for SVM.
Related VMX changes:
commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
Signed-of
Enable remote TLB flush for SVM.
Signed-off-by: Vineeth Pillai
---
arch/x86/kvm/svm/svm.c | 37 +
1 file changed, 37 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2ad1f55c88d0..de141d5ae5fb 100644
--- a/arch/x86/kvm/svm
ace or HvFlushGuestPhysicalAddressList)
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/hyperv-tlfs.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 606f5cc579b2..005bf14d0449 100644
--- a/arch/x86/include/
Currently the remote TLB flush logic is specific to VMX.
Move it to a common place so that SVM can use it as well.
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/kvm_host.h | 14 +
arch/x86/kvm/hyperv.c | 87 +
arch/x86/kvm/hyperv.h
Detect nested features exposed by Hyper-V if SVM is enabled.
Signed-off-by: Vineeth Pillai
---
arch/x86/kernel/cpu/mshyperv.c | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3546d3e21787
or review comments.
---
Vineeth Pillai (7):
hyperv: Detect Nested virtualization support for SVM
hyperv: SVM enlightened TLB flush support flag
KVM: x86: hyper-v: Move the remote TLB flush logic out of vmx
KVM: SVM: hyper-v: Nested enlightenments in VMCB
KVM: SVM: hyper-v: Remote TLB flush
Add Hyper-V specific fields in VMCB to support SVM enlightenments.
Also a small refactoring of VMCB clean bits handling.
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/svm.h | 24 +++-
arch/x86/kvm/svm/svm.c | 8
arch/x86/kvm/svm/svm.h | 30
On 4/8/21 11:44 AM, Paolo Bonzini wrote:
On 07/04/21 16:41, Vineeth Pillai wrote:
+#define VMCB_ALL_CLEAN_MASK (__CLEAN_MASK | (1U << VMCB_HV_NESTED_ENLIGHTENMENTS))
+#else
+#define VMCB_ALL_CLEAN_MASK __CLEAN_MASK
+#endif
I think this should depend on whether KVM is running
On 4/8/21 11:24 AM, Sean Christopherson wrote:
Technically, you can use normal memory accesses, so long as software guarantees
the VMCS isn't resident in the VMCS cache and knows the field offsets for the
underlying CPU. The lack of an architecturally defined layout is the biggest
issue,
On 4/7/21 3:56 PM, Michael Kelley wrote:
From: Vineeth Pillai Sent: Wednesday, April 7, 2021 7:41 AM
Bit 22 of HYPERV_CPUID_FEATURES.EDX is specific to SVM and specifies
support for enlightened TLB flush. With this enligtenment enabled,
s/enligtenment/enlightenment/
Thanks for catching
On 4/7/21 6:48 PM, Sean Christopherson wrote:
On Wed, Apr 07, 2021, Michael Kelley wrote:
+ pr_info("Hyper-V nested_features: 0x%x\n",
Nit: Most other similar lines put the colon in a different place:
pr_info("Hyper-V: nested features 0x%x\n",
One of these
On 4/8/21 7:18 AM, Vitaly Kuznetsov wrote:
enable_gif(svm);
@@ -3967,6 +3999,9 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root,
svm->vmcb->control.nested_cr3 = cr3;
vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
+ if
On 4/8/21 7:14 AM, Vitaly Kuznetsov wrote:
+ /*
+  * Two Dimensional paging CR3
+  * EPTP for Intel
+  * nCR3 for AMD
+  */
+ u64 tdp_pointer;
};
'struct kvm_vcpu_hv' is only allocated when we emulate Hyper-V in KVM
(run Windows/Hyper-V guests on top of KVM).
Hi Vitaly,
On 4/8/21 7:06 AM, Vitaly Kuznetsov wrote:
- if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
+ /*
+  * AMD does not need enlightened VMCS as VMCB is already a
+  * datastructure in memory.
Well, VMCS is also a structure in memory, isn't it? It's
support")
Signed-off-by: Vineeth Pillai
---
arch/x86/kvm/svm/svm.c | 48 ++
1 file changed, 48 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3562a247b7e8..c6d3f3a7c986 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b
e MSR bitmaps for changes. Instead, the L1
hypervisor must invalidate the corresponding clean field after making
changes to one of the MSR bitmaps."
Enable this for SVM.
Related VMX changes:
commit ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
Signed-of
Enable remote TLB flush for SVM.
Signed-off-by: Vineeth Pillai
---
arch/x86/kvm/svm/svm.c | 35 +++
1 file changed, 35 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index baee91c1e936..6287cab61f15 100644
--- a/arch/x86/kvm/svm/svm.c
Add Hyper-V specific fields in VMCB to support SVM enlightenments.
Also a small refactoring of VMCB clean bits handling.
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/svm.h | 24 +++-
arch/x86/kvm/svm/svm.h | 30 --
2 files changed
Currently the remote TLB flush logic is specific to VMX.
Move it to a common place so that SVM can use it as well.
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/kvm_host.h | 15 +
arch/x86/kvm/hyperv.c | 89 ++
arch/x86/kvm/hyperv.h
ace or HvFlushGuestPhysicalAddressList)
Signed-off-by: Vineeth Pillai
---
arch/x86/include/asm/hyperv-tlfs.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 606f5cc579b2..005bf14d0449 100644
--- a/arch/x86/include/
tition assist page."
L2 Windows boot time was measured with and without the patch. Time was
measured from power on to the login screen and averaged over 5
consecutive trials:
Without the patch: 42 seconds
With the patch: 29 seconds
--
Vineeth Pillai (7):
hyperv: Detect Nested virt
Detect nested features exposed by Hyper-V if SVM is enabled.
Signed-off-by: Vineeth Pillai
---
arch/x86/kernel/cpu/mshyperv.c | 10 +-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3546d3e21787
Hi Balbir,
On 11/22/20 6:44 AM, Balbir Singh wrote:
This seems cumbersome, is there no way to track the min_vruntime via
rq->core->min_vruntime?
Do you mean to have a core wide min_vruntime? We had a
similar approach from v3 to v7 and it had major issues which
broke the assumptions of cfs.
Hi Balbir,
On 11/20/20 5:15 AM, Singh, Balbir wrote:
On 18/11/20 10:19 am, Joel Fernandes (Google) wrote:
From: Peter Zijlstra
pick_next_entity() is passed curr == NULL during core-scheduling. Due to
this, if the rbtree is empty, the 'left' variable is set to NULL within
the function. This
On 10/24/20 7:10 AM, Vineeth Pillai wrote:
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 93a3b874077d..4cae5ac48b60 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4428,12 +4428,14 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr
Hi Aubrey,
On 10/23/20 10:48 PM, Li, Aubrey wrote:
2. Do you see the issue in v7? Not much if at all has changed in this
part of the code from v7 -> v8 but could be something in the newer
kernel.
IIRC, I can run uperf successfully on v7.
I'm on tip/master 2d3e8c9424c9 (origin/master)
+
+bool cfs_prio_less(struct task_struct *a, struct task_struct *b)
+{
+ struct sched_entity *se_a = &a->se, *se_b = &b->se;
+ struct cfs_rq *cfs_rq_a, *cfs_rq_b;
+ u64 vruntime_a, vruntime_b;
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+ while (!is_same_tg(se_a, se_b)) {
+
core_fi_time; /* Last forced idle TS */
struct task_struct *core_pick;
unsigned int core_enabled;
unsigned int core_sched_seq;
vineeth@vinstation340u:~/WS/kernel_test/linux-qemu/out$ git show > log
vineeth@vinstation340u:~/WS/kernel_te
On 9/3/20 12:34 AM, Joel Fernandes wrote:
Indeed! For at least two reasons, IMO:
1) what Thomas is saying already. I.e., even on a CPU which has HT but
is not affected by any of the (known!) speculation issues, one may want
to use Core Scheduling _as_a_feature_. For instance, for avoiding
Hi Joel,
On 9/1/20 1:30 PM, Joel Fernandes wrote:
I think we can come here when hotplug thread is scheduled during online, but
mask is not yet updated. Probably can add it with this comment as well.
I don't see how that is possible. Because the cpuhp threads run during the
CPU onlining
Hi Joel,
On 9/1/20 1:10 AM, Joel Fernandes wrote:
3. The 'Rescheduling siblings' loop of pick_next_task() is quite fragile. It
calls various functions on rq->core_pick which could very well be NULL because:
An online sibling might have gone offline before a task could be picked for it,
or it
On 8/29/20 3:47 AM, pet...@infradead.org wrote:
During hotplug stress test, we have noticed that while a sibling is in
pick_next_task, another sibling can go offline or come online. What
we have observed is smt_mask get updated underneath us even if
we hold the lock. From reading the code,
On 8/28/20 5:25 PM, Peter Zijlstra wrote:
The only purpose of this loop seems to be to find if we have a forceidle;
surely we can avoid that by storing this during the pick.
The idea was to kick each cpu that was force idle. But now, thinking
about it, we just need to kick one as it will pick
On 8/28/20 4:55 PM, Peter Zijlstra wrote:
On Fri, Aug 28, 2020 at 03:51:09PM -0400, Julien Desfossez wrote:
+ if (is_idle_task(rq_i->core_pick) && rq_i->nr_running)
+ rq_i->core_forceidle = true;
Did you mean: rq_i->core_pick == rq_i->idle ?
On 8/28/20 4:51 PM, Peter Zijlstra wrote:
cpumask_weight() is fairly expensive, esp. for something that should
'never' happen.
What exactly is the race here?
We'll update the cpu_smt_mask() fairly early in secondary bringup, but
where does it become a problem?
The moment the new thread
Hi Alex,
>
> As discussed during Linux Plumbers, here is a small repo with test
> scripts and applications that I've used to look at core scheduling
> unfairness:
>
>https://github.com/agraf/schedgaps
>
Thanks for sharing :).
> Please let me know if it's unclear how to use it or if you see
> Let me know your thoughts and looking forward to a good LPC MC discussion!
>
Nice write up Joel, thanks for taking time to compile this with great detail!
After going through the details of the interface proposal using cgroup v2
controllers, and based on our discussion offline, I would like to note
Typo "packaet".
No need to resend. I can fix this while committing this patch.
Thanks Wei.
.
Signed-off-by: Vineeth Pillai
---
v2:
- s/pr_warn/pr_warn_once/
---
drivers/hv/hv_util.c | 19 ---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 1f86e8d9b018..a4e8d96513c2 100644
--- a/drivers/hv
If for any reason, host timesync messages were not processed by
the guest, hv_ptp_gettime() returns a stale value and the
caller (clock_gettime, PTP ioctl etc) has no means to know this
now. Return an error so that the caller knows about this.
Signed-off-by: Vineeth Pillai
---
v2:
- Fix
Hi Michael,
> > + pr_warn("TimeSync IC pkt recv failed (Err: %d)\n",
> > + ret);
>
> Let's use pr_warn_once().
>
> If there's a packet at the head of the ring buffer that specifies a bogus
> length,
> we could take the error path. But the bad
Hi Michael,
> > +const u64 HOST_TIMESYNC_DELAY_THRESH = 600 * NSEC_PER_SEC;
>
> Kernel test robot has already complained that this should be static,
> and about the potential overflow based on the types of the constants in
> the right side expression. I didn't check the details, but I suspect
.
Signed-off-by: Vineeth Pillai
---
drivers/hv/hv_util.c | 19 ---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 1357861fd8ae..c0491b727fd5 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -387,10
If for any reason, host timesync messages were not processed by
the guest, hv_ptp_gettime() returns a stale value and the
caller (clock_gettime, PTP ioctl etc) has no means to know this
now. Return an error so that the caller knows about this.
Signed-off-by: Vineeth Pillai
---
drivers/hv
On 20/07/26 06:49AM, Vineeth Pillai wrote:
>
>
> Sixth iteration of the Core-Scheduling feature.
>
I am no longer with DigitalOcean. Kindly use this email address for all
future responses.
Thanks,
Vineeth
On Sun, Jul 19, 2020 at 11:54 PM Joel Fernandes wrote:
>
> These ifdeffery and checkpatch / command line parameter issues were added by
> Vineeth before he sent out my patch. I'll let him comment on these, agreed
> they all need fixing!
>
Will fix this in the next iteration. Regarding the __setup
>
> I'm confused, how doesn't this break the invariant above?
>
> That is, all CPUs must at all times agree on the value of rq_lockp(),
> and I'm not seeing how that is true with the above changes.
>
While fixing the crash in cpu online/offline, I was focusing on
maintaining the invariance
of all
> > Please try this and see how it compares with the vanilla v2. I think its
> > time for a v3 now and we shall be posting it soon after some more
> > testing and benchmarking.
>
> Is there any potential change between pre v3 and v3? I prefer working
> based on v3 so that everyone are on the same
> > The following patch improved my test cases.
> > Welcome any comments.
> >
>
> This is certainly better than violating the point of the core scheduler :)
>
> If I'm understanding this right what will happen in this case is instead
> of using the idle process selected by the sibling we do the
On Sun, Mar 24, 2019 at 11:30 AM Alex Xu (Hello71) wrote:
>
> I get this BUG in 5.1-rc1 sometimes when powering off the machine. I
> suspect my setup erroneously executes two swapoff+cryptsetup close
> operations simultaneously, so a race condition is triggered.
>
> I am using a single swap on a
On Wed, Jan 2, 2019 at 2:43 PM Hugh Dickins wrote:
>
> Wrong. Without heavier locking that would add unwelcome overhead to
> common paths, we shall "always" need the retry logic. It does not
> come into play very often, but here are two examples of why it's
> needed (if I thought longer, I
On Tue, Jan 1, 2019 at 11:16 PM Hugh Dickins wrote:
> One more fix on top of what I sent yesterday: once I delved into
> the retries, I found that the major cause of exceeding MAX_RETRIES
> was the way the retry code neatly avoided retrying the last part of
> its work. With this fix in, I have