[tip:x86/hyperv] x86/hyper-v: Fix wrong merge conflict resolution
Commit-ID: be0e16ce7c3bf9855f1ef5ae46cf889e1784ddea Gitweb: https://git.kernel.org/tip/be0e16ce7c3bf9855f1ef5ae46cf889e1784ddea Author: K. Y. Srinivasan AuthorDate: Fri, 20 Jul 2018 03:50:09 + Committer: Thomas Gleixner CommitDate: Fri, 20 Jul 2018 06:56:23 +0200 x86/hyper-v: Fix wrong merge conflict resolution When the mapping betwween the Linux notion of CPU ID to the hypervisor's notion of CPU ID is not initialized, IPI must fall back to the non-enlightened path. The recent merge of upstream changes into the hyperv branch resolved a merge conflict wronly by returning success in that case, which results in the IPI not being sent at all. Fix it up. Fixes: 8f63e9230dec ("Merge branch 'x86/urgent' into x86/hyperv") Reported-by: Michael Kelley Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Cc: gre...@linuxfoundation.org Cc: de...@linuxdriverproject.org Cc: o...@aepfle.de Cc: a...@canonical.com Cc: jasow...@redhat.com Cc: h...@zytor.com Cc: sthem...@microsoft.com Cc: michael.h.kel...@microsoft.com Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180720035009.3995-1-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index 0c3c9f8fee77..5b0f613428c2 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -168,7 +168,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); if (vcpu == VP_INVAL) - return true; + return false; /* * This particular version of the IPI hypercall can
[tip:x86/hyperv] x86/hyper-v: Fix wrong merge conflict resolution
Commit-ID: be0e16ce7c3bf9855f1ef5ae46cf889e1784ddea Gitweb: https://git.kernel.org/tip/be0e16ce7c3bf9855f1ef5ae46cf889e1784ddea Author: K. Y. Srinivasan AuthorDate: Fri, 20 Jul 2018 03:50:09 + Committer: Thomas Gleixner CommitDate: Fri, 20 Jul 2018 06:56:23 +0200 x86/hyper-v: Fix wrong merge conflict resolution When the mapping betwween the Linux notion of CPU ID to the hypervisor's notion of CPU ID is not initialized, IPI must fall back to the non-enlightened path. The recent merge of upstream changes into the hyperv branch resolved a merge conflict wronly by returning success in that case, which results in the IPI not being sent at all. Fix it up. Fixes: 8f63e9230dec ("Merge branch 'x86/urgent' into x86/hyperv") Reported-by: Michael Kelley Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Cc: gre...@linuxfoundation.org Cc: de...@linuxdriverproject.org Cc: o...@aepfle.de Cc: a...@canonical.com Cc: jasow...@redhat.com Cc: h...@zytor.com Cc: sthem...@microsoft.com Cc: michael.h.kel...@microsoft.com Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180720035009.3995-1-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index 0c3c9f8fee77..5b0f613428c2 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -168,7 +168,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); if (vcpu == VP_INVAL) - return true; + return false; /* * This particular version of the IPI hypercall can
[tip:x86/hyperv] x86/hyper-v: Fix the circular dependency in IPI enlightenment
Commit-ID: 1268ed0c474a5c8f165ef386f3310521b5e00e27 Gitweb: https://git.kernel.org/tip/1268ed0c474a5c8f165ef386f3310521b5e00e27 Author: K. Y. Srinivasan AuthorDate: Tue, 3 Jul 2018 16:01:55 -0700 Committer: Thomas Gleixner CommitDate: Fri, 6 Jul 2018 12:32:59 +0200 x86/hyper-v: Fix the circular dependency in IPI enlightenment The IPI hypercalls depend on being able to map the Linux notion of CPU ID to the hypervisor's notion of the CPU ID. The array hv_vp_index[] provides this mapping. Code for populating this array depends on the IPI functionality. Break this circular dependency. [ tglx: Use a proper define instead of '-1' with a u32 variable as pointed out by Vitaly ] Fixes: 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments") Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Tested-by: Michael Kelley Cc: gre...@linuxfoundation.org Cc: de...@linuxdriverproject.org Cc: o...@aepfle.de Cc: a...@canonical.com Cc: jasow...@redhat.com Cc: h...@zytor.com Cc: sthem...@microsoft.com Cc: michael.h.kel...@microsoft.com Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180703230155.15160-1-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 5 + arch/x86/hyperv/hv_init.c | 5 - arch/x86/include/asm/mshyperv.h | 5 - 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index f68855499391..402338365651 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -114,6 +114,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); } + if (nr_bank < 0) + goto ipi_mask_ex_done; if (!nr_bank) ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; @@ -158,6 +160,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); + if (vcpu == VP_INVAL) + goto ipi_mask_done; + /* * This particular version of the IPI hypercall can * only target upto 64 CPUs. diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 4c431e1c1eff..1ff420217298 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -265,7 +265,7 @@ void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; - int cpuhp; + int cpuhp, i; if (x86_hyper_type != X86_HYPER_MS_HYPERV) return; @@ -293,6 +293,9 @@ void __init hyperv_init(void) if (!hv_vp_index) return; + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = VP_INVAL; + hv_vp_assist_page = kcalloc(num_possible_cpus(), sizeof(*hv_vp_assist_page), GFP_KERNEL); if (!hv_vp_assist_page) { diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 3cd14311edfa..5a7375ed5f7c 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -9,6 +9,8 @@ #include #include +#define VP_INVAL U32_MAX + struct ms_hyperv_info { u32 features; u32 misc_features; @@ -20,7 +22,6 @@ struct ms_hyperv_info { extern struct ms_hyperv_info ms_hyperv; - /* * Generate the guest ID. */ @@ -281,6 +282,8 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, */ for_each_cpu(cpu, cpus) { vcpu = hv_cpu_number_to_vp_number(cpu); + if (vcpu == VP_INVAL) + return -1; vcpu_bank = vcpu / 64; vcpu_offset = vcpu % 64; __set_bit(vcpu_offset, (unsigned long *)
[tip:x86/hyperv] x86/hyper-v: Fix the circular dependency in IPI enlightenment
Commit-ID: 1268ed0c474a5c8f165ef386f3310521b5e00e27 Gitweb: https://git.kernel.org/tip/1268ed0c474a5c8f165ef386f3310521b5e00e27 Author: K. Y. Srinivasan AuthorDate: Tue, 3 Jul 2018 16:01:55 -0700 Committer: Thomas Gleixner CommitDate: Fri, 6 Jul 2018 12:32:59 +0200 x86/hyper-v: Fix the circular dependency in IPI enlightenment The IPI hypercalls depend on being able to map the Linux notion of CPU ID to the hypervisor's notion of the CPU ID. The array hv_vp_index[] provides this mapping. Code for populating this array depends on the IPI functionality. Break this circular dependency. [ tglx: Use a proper define instead of '-1' with a u32 variable as pointed out by Vitaly ] Fixes: 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments") Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Tested-by: Michael Kelley Cc: gre...@linuxfoundation.org Cc: de...@linuxdriverproject.org Cc: o...@aepfle.de Cc: a...@canonical.com Cc: jasow...@redhat.com Cc: h...@zytor.com Cc: sthem...@microsoft.com Cc: michael.h.kel...@microsoft.com Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180703230155.15160-1-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 5 + arch/x86/hyperv/hv_init.c | 5 - arch/x86/include/asm/mshyperv.h | 5 - 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index f68855499391..402338365651 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -114,6 +114,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); } + if (nr_bank < 0) + goto ipi_mask_ex_done; if (!nr_bank) ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; @@ -158,6 +160,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); + if (vcpu == VP_INVAL) + goto ipi_mask_done; + /* * This particular version of the IPI hypercall can * only target upto 64 CPUs. diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 4c431e1c1eff..1ff420217298 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -265,7 +265,7 @@ void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; - int cpuhp; + int cpuhp, i; if (x86_hyper_type != X86_HYPER_MS_HYPERV) return; @@ -293,6 +293,9 @@ void __init hyperv_init(void) if (!hv_vp_index) return; + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = VP_INVAL; + hv_vp_assist_page = kcalloc(num_possible_cpus(), sizeof(*hv_vp_assist_page), GFP_KERNEL); if (!hv_vp_assist_page) { diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 3cd14311edfa..5a7375ed5f7c 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -9,6 +9,8 @@ #include #include +#define VP_INVAL U32_MAX + struct ms_hyperv_info { u32 features; u32 misc_features; @@ -20,7 +22,6 @@ struct ms_hyperv_info { extern struct ms_hyperv_info ms_hyperv; - /* * Generate the guest ID. */ @@ -281,6 +282,8 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, */ for_each_cpu(cpu, cpus) { vcpu = hv_cpu_number_to_vp_number(cpu); + if (vcpu == VP_INVAL) + return -1; vcpu_bank = vcpu / 64; vcpu_offset = vcpu % 64; __set_bit(vcpu_offset, (unsigned long *)
[tip:x86/urgent] x86/hyper-v: Fix the circular dependency in IPI enlightenment
Commit-ID: 5e6f19db2deca9e7eaa378447c77616b35693399 Gitweb: https://git.kernel.org/tip/5e6f19db2deca9e7eaa378447c77616b35693399 Author: K. Y. Srinivasan AuthorDate: Tue, 3 Jul 2018 16:01:55 -0700 Committer: Ingo Molnar CommitDate: Wed, 4 Jul 2018 10:50:03 +0200 x86/hyper-v: Fix the circular dependency in IPI enlightenment The IPI hypercalls depend on being able to map the Linux notion of CPU ID to the hypervisor's notion of the CPU ID. The array hv_vp_index[] provides this mapping. Code for populating this array depends on the IPI functionality. Break this circular dependency. Tested-by: Michael Kelley Signed-off-by: K. Y. Srinivasan Acked-by: Thomas Gleixner Cc: Linus Torvalds Cc: michael.h.kel...@microsoft.com Cc: Peter Zijlstra Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: vkuzn...@redhat.com Fixes: 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments") Link: http://lkml.kernel.org/r/20180703230155.15160-1-...@linuxonhyperv.com Signed-off-by: Ingo Molnar --- arch/x86/hyperv/hv_apic.c | 5 + arch/x86/hyperv/hv_init.c | 5 - arch/x86/include/asm/mshyperv.h | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index f68855499391..63d7c196739f 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -114,6 +114,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); } + if (nr_bank == -1) + goto ipi_mask_ex_done; if (!nr_bank) ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; @@ -158,6 +160,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); + if (vcpu == -1) + goto ipi_mask_done; + /* * This particular version of the IPI hypercall can * only target upto 64 CPUs. diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 4c431e1c1eff..04159893702e 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -265,7 +265,7 @@ void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; - int cpuhp; + int cpuhp, i; if (x86_hyper_type != X86_HYPER_MS_HYPERV) return; @@ -293,6 +293,9 @@ void __init hyperv_init(void) if (!hv_vp_index) return; + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = -1; + hv_vp_assist_page = kcalloc(num_possible_cpus(), sizeof(*hv_vp_assist_page), GFP_KERNEL); if (!hv_vp_assist_page) { diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 3cd14311edfa..dee3f7347253 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -281,6 +281,8 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, */ for_each_cpu(cpu, cpus) { vcpu = hv_cpu_number_to_vp_number(cpu); + if (vcpu == -1) + return -1; vcpu_bank = vcpu / 64; vcpu_offset = vcpu % 64; __set_bit(vcpu_offset, (unsigned long *)
[tip:x86/urgent] x86/hyper-v: Fix the circular dependency in IPI enlightenment
Commit-ID: 5e6f19db2deca9e7eaa378447c77616b35693399 Gitweb: https://git.kernel.org/tip/5e6f19db2deca9e7eaa378447c77616b35693399 Author: K. Y. Srinivasan AuthorDate: Tue, 3 Jul 2018 16:01:55 -0700 Committer: Ingo Molnar CommitDate: Wed, 4 Jul 2018 10:50:03 +0200 x86/hyper-v: Fix the circular dependency in IPI enlightenment The IPI hypercalls depend on being able to map the Linux notion of CPU ID to the hypervisor's notion of the CPU ID. The array hv_vp_index[] provides this mapping. Code for populating this array depends on the IPI functionality. Break this circular dependency. Tested-by: Michael Kelley Signed-off-by: K. Y. Srinivasan Acked-by: Thomas Gleixner Cc: Linus Torvalds Cc: michael.h.kel...@microsoft.com Cc: Peter Zijlstra Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: vkuzn...@redhat.com Fixes: 68bb7bfb7985 ("X86/Hyper-V: Enable IPI enlightenments") Link: http://lkml.kernel.org/r/20180703230155.15160-1-...@linuxonhyperv.com Signed-off-by: Ingo Molnar --- arch/x86/hyperv/hv_apic.c | 5 + arch/x86/hyperv/hv_init.c | 5 - arch/x86/include/asm/mshyperv.h | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index f68855499391..63d7c196739f 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -114,6 +114,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); } + if (nr_bank == -1) + goto ipi_mask_ex_done; if (!nr_bank) ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; @@ -158,6 +160,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) for_each_cpu(cur_cpu, mask) { vcpu = hv_cpu_number_to_vp_number(cur_cpu); + if (vcpu == -1) + goto ipi_mask_done; + /* * This particular version of the IPI hypercall can * only target upto 64 CPUs. diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 4c431e1c1eff..04159893702e 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -265,7 +265,7 @@ void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; - int cpuhp; + int cpuhp, i; if (x86_hyper_type != X86_HYPER_MS_HYPERV) return; @@ -293,6 +293,9 @@ void __init hyperv_init(void) if (!hv_vp_index) return; + for (i = 0; i < num_possible_cpus(); i++) + hv_vp_index[i] = -1; + hv_vp_assist_page = kcalloc(num_possible_cpus(), sizeof(*hv_vp_assist_page), GFP_KERNEL); if (!hv_vp_assist_page) { diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 3cd14311edfa..dee3f7347253 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -281,6 +281,8 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, */ for_each_cpu(cpu, cpus) { vcpu = hv_cpu_number_to_vp_number(cpu); + if (vcpu == -1) + return -1; vcpu_bank = vcpu / 64; vcpu_offset = vcpu % 64; __set_bit(vcpu_offset, (unsigned long *)
[tip:x86/hyperv] X86/Hyper-V: Consolidate the allocation of the hypercall input page
Commit-ID: 9a2d78e291a7dea0ae4b4a06ce6bbbe4f1ab7c13 Gitweb: https://git.kernel.org/tip/9a2d78e291a7dea0ae4b4a06ce6bbbe4f1ab7c13 Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Wed, 16 May 2018 14:53:34 -0700 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Sat, 19 May 2018 13:23:18 +0200 X86/Hyper-V: Consolidate the allocation of the hypercall input page Consolidate the allocation of the hypercall input page. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Signed-off-by: Thomas Gleixner <t...@linutronix.de> Reviewed-by: Michael Kelley <mikel...@microsoft.com> Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-5-...@linuxonhyperv.com --- arch/x86/hyperv/hv_init.c | 2 -- arch/x86/hyperv/mmu.c | 30 ++ arch/x86/include/asm/mshyperv.h | 1 - 3 files changed, 6 insertions(+), 27 deletions(-) diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 6bc90d68ac8b..4c431e1c1eff 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -324,8 +324,6 @@ void __init hyperv_init(void) hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg); wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - hyper_alloc_mmu(); - hv_apic_init(); /* diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index c9cd28f0bae4..5f053d7d1bd9 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -32,9 +32,6 @@ struct hv_flush_pcpu_ex { /* Each gva in gva_list encodes up to 4096 pages to flush */ #define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE) -static struct hv_flush_pcpu __percpu **pcpu_flush; - -static struct hv_flush_pcpu_ex __percpu **pcpu_flush_ex; /* * Fills in gva_list starting from offset. Returns the number of items added. @@ -77,7 +74,7 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus, trace_hyperv_mmu_flush_tlb_others(cpus, info); - if (!pcpu_flush || !hv_hypercall_pg) + if (!hv_hypercall_pg) goto do_native; if (cpumask_empty(cpus)) @@ -85,10 +82,8 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus, local_irq_save(flags); - flush_pcpu = this_cpu_ptr(pcpu_flush); - - if (unlikely(!*flush_pcpu)) - *flush_pcpu = page_address(alloc_page(GFP_ATOMIC)); + flush_pcpu = (struct hv_flush_pcpu **) +this_cpu_ptr(hyperv_pcpu_input_arg); flush = *flush_pcpu; @@ -164,7 +159,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, trace_hyperv_mmu_flush_tlb_others(cpus, info); - if (!pcpu_flush_ex || !hv_hypercall_pg) + if (!hv_hypercall_pg) goto do_native; if (cpumask_empty(cpus)) @@ -172,10 +167,8 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, local_irq_save(flags); - flush_pcpu = this_cpu_ptr(pcpu_flush_ex); - - if (unlikely(!*flush_pcpu)) - *flush_pcpu = page_address(alloc_page(GFP_ATOMIC)); + flush_pcpu = (struct hv_flush_pcpu_ex **) +this_cpu_ptr(hyperv_pcpu_input_arg); flush = *flush_pcpu; @@ -257,14 +250,3 @@ void hyperv_setup_mmu_ops(void) pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others_ex; } } - -void hyper_alloc_mmu(void) -{ - if (!(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED)) - return; - - if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) - pcpu_flush = alloc_percpu(struct hv_flush_pcpu *); - else - pcpu_flush_ex = alloc_percpu(struct hv_flush_pcpu_ex *); -} diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 0ee82519957b..9aaa493f5756 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -294,7 +294,6 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, void __init hyperv_init(void); void hyperv_setup_mmu_ops(void); -void hyper_alloc_mmu(void); void hyperv_report_panic(struct pt_regs *regs, long err); bool hv_is_hyperv_initialized(void); void hyperv_cleanup(void);
[tip:x86/hyperv] X86/Hyper-V: Consolidate the allocation of the hypercall input page
Commit-ID: 9a2d78e291a7dea0ae4b4a06ce6bbbe4f1ab7c13 Gitweb: https://git.kernel.org/tip/9a2d78e291a7dea0ae4b4a06ce6bbbe4f1ab7c13 Author: K. Y. Srinivasan AuthorDate: Wed, 16 May 2018 14:53:34 -0700 Committer: Thomas Gleixner CommitDate: Sat, 19 May 2018 13:23:18 +0200 X86/Hyper-V: Consolidate the allocation of the hypercall input page Consolidate the allocation of the hypercall input page. Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Reviewed-by: Michael Kelley Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-5-...@linuxonhyperv.com --- arch/x86/hyperv/hv_init.c | 2 -- arch/x86/hyperv/mmu.c | 30 ++ arch/x86/include/asm/mshyperv.h | 1 - 3 files changed, 6 insertions(+), 27 deletions(-) diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 6bc90d68ac8b..4c431e1c1eff 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -324,8 +324,6 @@ void __init hyperv_init(void) hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg); wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - hyper_alloc_mmu(); - hv_apic_init(); /* diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index c9cd28f0bae4..5f053d7d1bd9 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -32,9 +32,6 @@ struct hv_flush_pcpu_ex { /* Each gva in gva_list encodes up to 4096 pages to flush */ #define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE) -static struct hv_flush_pcpu __percpu **pcpu_flush; - -static struct hv_flush_pcpu_ex __percpu **pcpu_flush_ex; /* * Fills in gva_list starting from offset. Returns the number of items added. @@ -77,7 +74,7 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus, trace_hyperv_mmu_flush_tlb_others(cpus, info); - if (!pcpu_flush || !hv_hypercall_pg) + if (!hv_hypercall_pg) goto do_native; if (cpumask_empty(cpus)) @@ -85,10 +82,8 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus, local_irq_save(flags); - flush_pcpu = this_cpu_ptr(pcpu_flush); - - if (unlikely(!*flush_pcpu)) - *flush_pcpu = page_address(alloc_page(GFP_ATOMIC)); + flush_pcpu = (struct hv_flush_pcpu **) +this_cpu_ptr(hyperv_pcpu_input_arg); flush = *flush_pcpu; @@ -164,7 +159,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, trace_hyperv_mmu_flush_tlb_others(cpus, info); - if (!pcpu_flush_ex || !hv_hypercall_pg) + if (!hv_hypercall_pg) goto do_native; if (cpumask_empty(cpus)) @@ -172,10 +167,8 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, local_irq_save(flags); - flush_pcpu = this_cpu_ptr(pcpu_flush_ex); - - if (unlikely(!*flush_pcpu)) - *flush_pcpu = page_address(alloc_page(GFP_ATOMIC)); + flush_pcpu = (struct hv_flush_pcpu_ex **) +this_cpu_ptr(hyperv_pcpu_input_arg); flush = *flush_pcpu; @@ -257,14 +250,3 @@ void hyperv_setup_mmu_ops(void) pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others_ex; } } - -void hyper_alloc_mmu(void) -{ - if (!(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED)) - return; - - if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) - pcpu_flush = alloc_percpu(struct hv_flush_pcpu *); - else - pcpu_flush_ex = alloc_percpu(struct hv_flush_pcpu_ex *); -} diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 0ee82519957b..9aaa493f5756 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -294,7 +294,6 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset, void __init hyperv_init(void); void hyperv_setup_mmu_ops(void); -void hyper_alloc_mmu(void); void hyperv_report_panic(struct pt_regs *regs, long err); bool hv_is_hyperv_initialized(void); void hyperv_cleanup(void);
[tip:x86/hyperv] X86/Hyper-V: Consolidate code for converting cpumask to vpset
Commit-ID: 800b8f03fdc8d66885ff03de531285526a4ca0d4 Gitweb: https://git.kernel.org/tip/800b8f03fdc8d66885ff03de531285526a4ca0d4 Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Wed, 16 May 2018 14:53:33 -0700 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Sat, 19 May 2018 13:23:18 +0200 X86/Hyper-V: Consolidate code for converting cpumask to vpset Consolidate code for converting cpumask to vpset. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Signed-off-by: Thomas Gleixner <t...@linutronix.de> Reviewed-by: Michael Kelley <mikel...@microsoft.com> Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-4-...@linuxonhyperv.com --- arch/x86/hyperv/mmu.c | 43 ++- 1 file changed, 2 insertions(+), 41 deletions(-) diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index adee39a7a3f2..c9cd28f0bae4 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -25,11 +25,7 @@ struct hv_flush_pcpu { struct hv_flush_pcpu_ex { u64 address_space; u64 flags; - struct { - u64 format; - u64 valid_bank_mask; - u64 bank_contents[]; - } hv_vp_set; + struct hv_vpset hv_vp_set; u64 gva_list[]; }; @@ -70,41 +66,6 @@ static inline int fill_gva_list(u64 gva_list[], int offset, return gva_n - offset; } -/* Return the number of banks in the resulting vp_set */ -static inline int cpumask_to_vp_set(struct hv_flush_pcpu_ex *flush, - const struct cpumask *cpus) -{ - int cpu, vcpu, vcpu_bank, vcpu_offset, nr_bank = 1; - - /* valid_bank_mask can represent up to 64 banks */ - if (hv_max_vp_index / 64 >= 64) - return 0; - - /* -* Clear all banks up to the maximum possible bank as hv_flush_pcpu_ex -* structs are not cleared between calls, we risk flushing unneeded -* vCPUs otherwise. -*/ - for (vcpu_bank = 0; vcpu_bank <= hv_max_vp_index / 64; vcpu_bank++) - flush->hv_vp_set.bank_contents[vcpu_bank] = 0; - - /* -* Some banks may end up being empty but this is acceptable. -*/ - for_each_cpu(cpu, cpus) { - vcpu = hv_cpu_number_to_vp_number(cpu); - vcpu_bank = vcpu / 64; - vcpu_offset = vcpu % 64; - __set_bit(vcpu_offset, (unsigned long *) - >hv_vp_set.bank_contents[vcpu_bank]); - if (vcpu_bank >= nr_bank) - nr_bank = vcpu_bank + 1; - } - flush->hv_vp_set.valid_bank_mask = GENMASK_ULL(nr_bank - 1, 0); - - return nr_bank; -} - static void hyperv_flush_tlb_others(const struct cpumask *cpus, const struct flush_tlb_info *info) { @@ -240,7 +201,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, if (!cpumask_equal(cpus, cpu_present_mask)) { flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K; - nr_bank = cpumask_to_vp_set(flush, cpus); + nr_bank = cpumask_to_vpset(&(flush->hv_vp_set), cpus); } if (!nr_bank) {
[tip:x86/hyperv] X86/Hyper-V: Consolidate code for converting cpumask to vpset
Commit-ID: 800b8f03fdc8d66885ff03de531285526a4ca0d4 Gitweb: https://git.kernel.org/tip/800b8f03fdc8d66885ff03de531285526a4ca0d4 Author: K. Y. Srinivasan AuthorDate: Wed, 16 May 2018 14:53:33 -0700 Committer: Thomas Gleixner CommitDate: Sat, 19 May 2018 13:23:18 +0200 X86/Hyper-V: Consolidate code for converting cpumask to vpset Consolidate code for converting cpumask to vpset. Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Reviewed-by: Michael Kelley Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-4-...@linuxonhyperv.com --- arch/x86/hyperv/mmu.c | 43 ++- 1 file changed, 2 insertions(+), 41 deletions(-) diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index adee39a7a3f2..c9cd28f0bae4 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -25,11 +25,7 @@ struct hv_flush_pcpu { struct hv_flush_pcpu_ex { u64 address_space; u64 flags; - struct { - u64 format; - u64 valid_bank_mask; - u64 bank_contents[]; - } hv_vp_set; + struct hv_vpset hv_vp_set; u64 gva_list[]; }; @@ -70,41 +66,6 @@ static inline int fill_gva_list(u64 gva_list[], int offset, return gva_n - offset; } -/* Return the number of banks in the resulting vp_set */ -static inline int cpumask_to_vp_set(struct hv_flush_pcpu_ex *flush, - const struct cpumask *cpus) -{ - int cpu, vcpu, vcpu_bank, vcpu_offset, nr_bank = 1; - - /* valid_bank_mask can represent up to 64 banks */ - if (hv_max_vp_index / 64 >= 64) - return 0; - - /* -* Clear all banks up to the maximum possible bank as hv_flush_pcpu_ex -* structs are not cleared between calls, we risk flushing unneeded -* vCPUs otherwise. -*/ - for (vcpu_bank = 0; vcpu_bank <= hv_max_vp_index / 64; vcpu_bank++) - flush->hv_vp_set.bank_contents[vcpu_bank] = 0; - - /* -* Some banks may end up being empty but this is acceptable. -*/ - for_each_cpu(cpu, cpus) { - vcpu = hv_cpu_number_to_vp_number(cpu); - vcpu_bank = vcpu / 64; - vcpu_offset = vcpu % 64; - __set_bit(vcpu_offset, (unsigned long *) - >hv_vp_set.bank_contents[vcpu_bank]); - if (vcpu_bank >= nr_bank) - nr_bank = vcpu_bank + 1; - } - flush->hv_vp_set.valid_bank_mask = GENMASK_ULL(nr_bank - 1, 0); - - return nr_bank; -} - static void hyperv_flush_tlb_others(const struct cpumask *cpus, const struct flush_tlb_info *info) { @@ -240,7 +201,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, if (!cpumask_equal(cpus, cpu_present_mask)) { flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K; - nr_bank = cpumask_to_vp_set(flush, cpus); + nr_bank = cpumask_to_vpset(&(flush->hv_vp_set), cpus); } if (!nr_bank) {
[tip:x86/hyperv] X86/Hyper-V: Enable IPI enlightenments
Commit-ID: 68bb7bfb7985df2bd15c2dc975cb68b7a901488a Gitweb: https://git.kernel.org/tip/68bb7bfb7985df2bd15c2dc975cb68b7a901488a Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Wed, 16 May 2018 14:53:31 -0700 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enable IPI enlightenments Hyper-V supports hypercalls to implement IPI; use them. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Signed-off-by: Thomas Gleixner <t...@linutronix.de> Reviewed-by: Michael Kelley <mikel...@microsoft.com> Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-2-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 117 + arch/x86/hyperv/hv_init.c | 27 + arch/x86/include/asm/hyperv-tlfs.h | 15 + arch/x86/include/asm/mshyperv.h| 1 + 4 files changed, 160 insertions(+) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index ca20e31d311c..3e0de61f1a7c 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -33,6 +33,8 @@ #ifdef CONFIG_X86_64 #if IS_ENABLED(CONFIG_HYPERV) +static struct apic orig_apic; + static u64 hv_apic_icr_read(void) { u64 reg_val; @@ -88,8 +90,123 @@ static void hv_apic_eoi_write(u32 reg, u32 val) wrmsr(HV_X64_MSR_EOI, val, 0); } +/* + * IPI implementation on Hyper-V. + */ +static bool __send_ipi_mask(const struct cpumask *mask, int vector) +{ + int cur_cpu, vcpu; + struct ipi_arg_non_ex **arg; + struct ipi_arg_non_ex *ipi_arg; + int ret = 1; + unsigned long flags; + + if (cpumask_empty(mask)) + return true; + + if (!hv_hypercall_pg) + return false; + + if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR)) + return false; + + local_irq_save(flags); + arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); + + ipi_arg = *arg; + if (unlikely(!ipi_arg)) + goto ipi_mask_done; + + ipi_arg->vector = vector; + ipi_arg->reserved = 0; + ipi_arg->cpu_mask = 0; + + for_each_cpu(cur_cpu, mask) { + vcpu = hv_cpu_number_to_vp_number(cur_cpu); + /* +* This particular version of the IPI hypercall can +* only target upto 64 CPUs. +*/ + if (vcpu >= 64) + goto ipi_mask_done; + + __set_bit(vcpu, (unsigned long *)_arg->cpu_mask); + } + + ret = hv_do_hypercall(HVCALL_SEND_IPI, ipi_arg, NULL); + +ipi_mask_done: + local_irq_restore(flags); + return ((ret == 0) ? true : false); +} + +static bool __send_ipi_one(int cpu, int vector) +{ + struct cpumask mask = CPU_MASK_NONE; + + cpumask_set_cpu(cpu, ); + return __send_ipi_mask(, vector); +} + +static void hv_send_ipi(int cpu, int vector) +{ + if (!__send_ipi_one(cpu, vector)) + orig_apic.send_IPI(cpu, vector); +} + +static void hv_send_ipi_mask(const struct cpumask *mask, int vector) +{ + if (!__send_ipi_mask(mask, vector)) + orig_apic.send_IPI_mask(mask, vector); +} + +static void hv_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) +{ + unsigned int this_cpu = smp_processor_id(); + struct cpumask new_mask; + const struct cpumask *local_mask; + + cpumask_copy(_mask, mask); + cpumask_clear_cpu(this_cpu, _mask); + local_mask = _mask; + if (!__send_ipi_mask(local_mask, vector)) + orig_apic.send_IPI_mask_allbutself(mask, vector); +} + +static void hv_send_ipi_allbutself(int vector) +{ + hv_send_ipi_mask_allbutself(cpu_online_mask, vector); +} + +static void hv_send_ipi_all(int vector) +{ + if (!__send_ipi_mask(cpu_online_mask, vector)) + orig_apic.send_IPI_all(vector); +} + +static void hv_send_ipi_self(int vector) +{ + if (!__send_ipi_one(smp_processor_id(), vector)) + orig_apic.send_IPI_self(vector); +} + void __init hv_apic_init(void) { + if (ms_hyperv.hints & HV_X64_CLUSTER_IPI_RECOMMENDED) { + pr_info("Hyper-V: Using IPI hypercalls\n"); + /* +* Set the IPI entry points. +*/ + orig_apic = *apic; + + apic->send_IPI = hv_send_ipi; + apic->send_IPI_mask = hv_send_ipi_mask; + apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself; + apic->send_IPI_allbutself = hv_send_ipi_allbutself; + apic->send_I
[tip:x86/hyperv] X86/Hyper-V: Enhanced IPI enlightenment
Commit-ID: 366f03b0cf90ef55f063d4a54cf62b0ac9b6da9d Gitweb: https://git.kernel.org/tip/366f03b0cf90ef55f063d4a54cf62b0ac9b6da9d Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Wed, 16 May 2018 14:53:32 -0700 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enhanced IPI enlightenment Support enhanced IPI enlightenments (to target more than 64 CPUs). Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Signed-off-by: Thomas Gleixner <t...@linutronix.de> Reviewed-by: Michael Kelley <mikel...@microsoft.com> Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-3-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 42 +- arch/x86/hyperv/mmu.c | 2 +- arch/x86/include/asm/hyperv-tlfs.h | 15 +- arch/x86/include/asm/mshyperv.h| 33 ++ 4 files changed, 89 insertions(+), 3 deletions(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index 3e0de61f1a7c..192b6ad6a361 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -93,6 +93,40 @@ static void hv_apic_eoi_write(u32 reg, u32 val) /* * IPI implementation on Hyper-V. */ +static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) +{ + struct ipi_arg_ex **arg; + struct ipi_arg_ex *ipi_arg; + unsigned long flags; + int nr_bank = 0; + int ret = 1; + + local_irq_save(flags); + arg = (struct ipi_arg_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); + + ipi_arg = *arg; + if (unlikely(!ipi_arg)) + goto ipi_mask_ex_done; + + ipi_arg->vector = vector; + ipi_arg->reserved = 0; + ipi_arg->vp_set.valid_bank_mask = 0; + + if (!cpumask_equal(mask, cpu_present_mask)) { + ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; + nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); + } + if (!nr_bank) + ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; + + ret = hv_do_rep_hypercall(HVCALL_SEND_IPI_EX, 0, nr_bank, + ipi_arg, NULL); + +ipi_mask_ex_done: + local_irq_restore(flags); + return ((ret == 0) ? true : false); +} + static bool __send_ipi_mask(const struct cpumask *mask, int vector) { int cur_cpu, vcpu; @@ -110,6 +144,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR)) return false; + if ((ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) + return __send_ipi_mask_ex(mask, vector); + local_irq_save(flags); arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); @@ -193,7 +230,10 @@ static void hv_send_ipi_self(int vector) void __init hv_apic_init(void) { if (ms_hyperv.hints & HV_X64_CLUSTER_IPI_RECOMMENDED) { - pr_info("Hyper-V: Using IPI hypercalls\n"); + if ((ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) + pr_info("Hyper-V: Using ext hypercalls for IPI\n"); + else + pr_info("Hyper-V: Using IPI hypercalls\n"); /* * Set the IPI entry points. */ diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index 56c9ebac946f..adee39a7a3f2 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -239,7 +239,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, flush->hv_vp_set.valid_bank_mask = 0; if (!cpumask_equal(cpus, cpu_present_mask)) { - flush->hv_vp_set.format = HV_GENERIC_SET_SPARCE_4K; + flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vp_set(flush, cpus); } diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h index 332e786d4deb..3bfa92c2793c 100644 --- a/arch/x86/include/asm/hyperv-tlfs.h +++ b/arch/x86/include/asm/hyperv-tlfs.h @@ -344,6 +344,7 @@ struct hv_tsc_emulation_status { #define HVCALL_SEND_IPI0x000b #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX 0x0013 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX 0x0014 +#define HVCALL_SEND_IPI_EX 0x0015 #define HVCALL_POST_MESSAGE0x005c #define HVCALL_SIGNAL_EVENT0x005d @@ -369,7 +370,7 @@ struct hv_tsc_emulation_status { #define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3) enum HV_GENERI
[tip:x86/hyperv] X86/Hyper-V: Enable IPI enlightenments
Commit-ID: 68bb7bfb7985df2bd15c2dc975cb68b7a901488a Gitweb: https://git.kernel.org/tip/68bb7bfb7985df2bd15c2dc975cb68b7a901488a Author: K. Y. Srinivasan AuthorDate: Wed, 16 May 2018 14:53:31 -0700 Committer: Thomas Gleixner CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enable IPI enlightenments Hyper-V supports hypercalls to implement IPI; use them. Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Reviewed-by: Michael Kelley Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-2-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 117 + arch/x86/hyperv/hv_init.c | 27 + arch/x86/include/asm/hyperv-tlfs.h | 15 + arch/x86/include/asm/mshyperv.h| 1 + 4 files changed, 160 insertions(+) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index ca20e31d311c..3e0de61f1a7c 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -33,6 +33,8 @@ #ifdef CONFIG_X86_64 #if IS_ENABLED(CONFIG_HYPERV) +static struct apic orig_apic; + static u64 hv_apic_icr_read(void) { u64 reg_val; @@ -88,8 +90,123 @@ static void hv_apic_eoi_write(u32 reg, u32 val) wrmsr(HV_X64_MSR_EOI, val, 0); } +/* + * IPI implementation on Hyper-V. + */ +static bool __send_ipi_mask(const struct cpumask *mask, int vector) +{ + int cur_cpu, vcpu; + struct ipi_arg_non_ex **arg; + struct ipi_arg_non_ex *ipi_arg; + int ret = 1; + unsigned long flags; + + if (cpumask_empty(mask)) + return true; + + if (!hv_hypercall_pg) + return false; + + if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR)) + return false; + + local_irq_save(flags); + arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); + + ipi_arg = *arg; + if (unlikely(!ipi_arg)) + goto ipi_mask_done; + + ipi_arg->vector = vector; + ipi_arg->reserved = 0; + ipi_arg->cpu_mask = 0; + + for_each_cpu(cur_cpu, mask) { + vcpu = hv_cpu_number_to_vp_number(cur_cpu); + /* +* This particular version of the IPI hypercall can +* only target upto 64 CPUs. +*/ + if (vcpu >= 64) + goto ipi_mask_done; + + __set_bit(vcpu, (unsigned long *)_arg->cpu_mask); + } + + ret = hv_do_hypercall(HVCALL_SEND_IPI, ipi_arg, NULL); + +ipi_mask_done: + local_irq_restore(flags); + return ((ret == 0) ? true : false); +} + +static bool __send_ipi_one(int cpu, int vector) +{ + struct cpumask mask = CPU_MASK_NONE; + + cpumask_set_cpu(cpu, ); + return __send_ipi_mask(, vector); +} + +static void hv_send_ipi(int cpu, int vector) +{ + if (!__send_ipi_one(cpu, vector)) + orig_apic.send_IPI(cpu, vector); +} + +static void hv_send_ipi_mask(const struct cpumask *mask, int vector) +{ + if (!__send_ipi_mask(mask, vector)) + orig_apic.send_IPI_mask(mask, vector); +} + +static void hv_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) +{ + unsigned int this_cpu = smp_processor_id(); + struct cpumask new_mask; + const struct cpumask *local_mask; + + cpumask_copy(_mask, mask); + cpumask_clear_cpu(this_cpu, _mask); + local_mask = _mask; + if (!__send_ipi_mask(local_mask, vector)) + orig_apic.send_IPI_mask_allbutself(mask, vector); +} + +static void hv_send_ipi_allbutself(int vector) +{ + hv_send_ipi_mask_allbutself(cpu_online_mask, vector); +} + +static void hv_send_ipi_all(int vector) +{ + if (!__send_ipi_mask(cpu_online_mask, vector)) + orig_apic.send_IPI_all(vector); +} + +static void hv_send_ipi_self(int vector) +{ + if (!__send_ipi_one(smp_processor_id(), vector)) + orig_apic.send_IPI_self(vector); +} + void __init hv_apic_init(void) { + if (ms_hyperv.hints & HV_X64_CLUSTER_IPI_RECOMMENDED) { + pr_info("Hyper-V: Using IPI hypercalls\n"); + /* +* Set the IPI entry points. +*/ + orig_apic = *apic; + + apic->send_IPI = hv_send_ipi; + apic->send_IPI_mask = hv_send_ipi_mask; + apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself; + apic->send_IPI_allbutself = hv_send_ipi_allbutself; + apic->send_IPI_all = hv_send_ipi_all; + apic->send_IPI_self = hv_send_ipi_self; + } + if (ms_hyperv.hints & HV_X64_APIC_ACCESS_R
[tip:x86/hyperv] X86/Hyper-V: Enhanced IPI enlightenment
Commit-ID: 366f03b0cf90ef55f063d4a54cf62b0ac9b6da9d Gitweb: https://git.kernel.org/tip/366f03b0cf90ef55f063d4a54cf62b0ac9b6da9d Author: K. Y. Srinivasan AuthorDate: Wed, 16 May 2018 14:53:32 -0700 Committer: Thomas Gleixner CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enhanced IPI enlightenment Support enhanced IPI enlightenments (to target more than 64 CPUs). Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Reviewed-by: Michael Kelley Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-3-...@linuxonhyperv.com --- arch/x86/hyperv/hv_apic.c | 42 +- arch/x86/hyperv/mmu.c | 2 +- arch/x86/include/asm/hyperv-tlfs.h | 15 +- arch/x86/include/asm/mshyperv.h| 33 ++ 4 files changed, 89 insertions(+), 3 deletions(-) diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c index 3e0de61f1a7c..192b6ad6a361 100644 --- a/arch/x86/hyperv/hv_apic.c +++ b/arch/x86/hyperv/hv_apic.c @@ -93,6 +93,40 @@ static void hv_apic_eoi_write(u32 reg, u32 val) /* * IPI implementation on Hyper-V. */ +static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector) +{ + struct ipi_arg_ex **arg; + struct ipi_arg_ex *ipi_arg; + unsigned long flags; + int nr_bank = 0; + int ret = 1; + + local_irq_save(flags); + arg = (struct ipi_arg_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); + + ipi_arg = *arg; + if (unlikely(!ipi_arg)) + goto ipi_mask_ex_done; + + ipi_arg->vector = vector; + ipi_arg->reserved = 0; + ipi_arg->vp_set.valid_bank_mask = 0; + + if (!cpumask_equal(mask, cpu_present_mask)) { + ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K; + nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask); + } + if (!nr_bank) + ipi_arg->vp_set.format = HV_GENERIC_SET_ALL; + + ret = hv_do_rep_hypercall(HVCALL_SEND_IPI_EX, 0, nr_bank, + ipi_arg, NULL); + +ipi_mask_ex_done: + local_irq_restore(flags); + return ((ret == 0) ? true : false); +} + static bool __send_ipi_mask(const struct cpumask *mask, int vector) { int cur_cpu, vcpu; @@ -110,6 +144,9 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector) if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR)) return false; + if ((ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) + return __send_ipi_mask_ex(mask, vector); + local_irq_save(flags); arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg); @@ -193,7 +230,10 @@ static void hv_send_ipi_self(int vector) void __init hv_apic_init(void) { if (ms_hyperv.hints & HV_X64_CLUSTER_IPI_RECOMMENDED) { - pr_info("Hyper-V: Using IPI hypercalls\n"); + if ((ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) + pr_info("Hyper-V: Using ext hypercalls for IPI\n"); + else + pr_info("Hyper-V: Using IPI hypercalls\n"); /* * Set the IPI entry points. */ diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c index 56c9ebac946f..adee39a7a3f2 100644 --- a/arch/x86/hyperv/mmu.c +++ b/arch/x86/hyperv/mmu.c @@ -239,7 +239,7 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus, flush->hv_vp_set.valid_bank_mask = 0; if (!cpumask_equal(cpus, cpu_present_mask)) { - flush->hv_vp_set.format = HV_GENERIC_SET_SPARCE_4K; + flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K; nr_bank = cpumask_to_vp_set(flush, cpus); } diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h index 332e786d4deb..3bfa92c2793c 100644 --- a/arch/x86/include/asm/hyperv-tlfs.h +++ b/arch/x86/include/asm/hyperv-tlfs.h @@ -344,6 +344,7 @@ struct hv_tsc_emulation_status { #define HVCALL_SEND_IPI0x000b #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX 0x0013 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX 0x0014 +#define HVCALL_SEND_IPI_EX 0x0015 #define HVCALL_POST_MESSAGE0x005c #define HVCALL_SIGNAL_EVENT0x005d @@ -369,7 +370,7 @@ struct hv_tsc_emulation_status { #define HV_FLUSH_USE_EXTENDED_RANGE_FORMAT BIT(3) enum HV_GENERIC_SET_FORMAT { - HV_GENERIC_SET_SPARCE_4K, + HV_GENERIC_SET_SPARSE_4K, HV_GENERIC_SET_ALL, }; @@ -721,4
[tip:x86/hyperv] X86/Hyper-V: Enlighten APIC access
Commit-ID: 6b48cb5f8347bc0153ff1d7b075db92e6723ffdb Gitweb: https://git.kernel.org/tip/6b48cb5f8347bc0153ff1d7b075db92e6723ffdb Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Wed, 16 May 2018 14:53:30 -0700 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enlighten APIC access Hyper-V supports MSR based APIC access; implement the enlightenment. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Signed-off-by: Thomas Gleixner <t...@linutronix.de> Reviewed-by: Michael Kelley <mikel...@microsoft.com> Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-1-...@linuxonhyperv.com --- arch/x86/hyperv/Makefile| 2 +- arch/x86/hyperv/hv_apic.c | 104 arch/x86/hyperv/hv_init.c | 5 +- arch/x86/include/asm/mshyperv.h | 4 +- 4 files changed, 112 insertions(+), 3 deletions(-) diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile index 367a8203cfcf..00ce4df01a09 100644 --- a/arch/x86/hyperv/Makefile +++ b/arch/x86/hyperv/Makefile @@ -1 +1 @@ -obj-y := hv_init.o mmu.o +obj-y := hv_init.o mmu.o hv_apic.o diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c new file mode 100644 index ..ca20e31d311c --- /dev/null +++ b/arch/x86/hyperv/hv_apic.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Hyper-V specific APIC code. + * + * Copyright (C) 2018, Microsoft, Inc. + * + * Author : K. Y. Srinivasan <k...@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_X86_64 +#if IS_ENABLED(CONFIG_HYPERV) + +static u64 hv_apic_icr_read(void) +{ + u64 reg_val; + + rdmsrl(HV_X64_MSR_ICR, reg_val); + return reg_val; +} + +static void hv_apic_icr_write(u32 low, u32 id) +{ + u64 reg_val; + + reg_val = SET_APIC_DEST_FIELD(id); + reg_val = reg_val << 32; + reg_val |= low; + + wrmsrl(HV_X64_MSR_ICR, reg_val); +} + +static u32 hv_apic_read(u32 reg) +{ + u32 reg_val, hi; + + switch (reg) { + case APIC_EOI: + rdmsr(HV_X64_MSR_EOI, reg_val, hi); + return reg_val; + case APIC_TASKPRI: + rdmsr(HV_X64_MSR_TPR, reg_val, hi); + return reg_val; + + default: + return native_apic_mem_read(reg); + } +} + +static void hv_apic_write(u32 reg, u32 val) +{ + switch (reg) { + case APIC_EOI: + wrmsr(HV_X64_MSR_EOI, val, 0); + break; + case APIC_TASKPRI: + wrmsr(HV_X64_MSR_TPR, val, 0); + break; + default: + native_apic_mem_write(reg, val); + } +} + +static void hv_apic_eoi_write(u32 reg, u32 val) +{ + wrmsr(HV_X64_MSR_EOI, val, 0); +} + +void __init hv_apic_init(void) +{ + if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { + pr_info("Hyper-V: Using MSR based APIC access\n"); + apic_set_eoi_write(hv_apic_eoi_write); + apic->read = hv_apic_read; + apic->write = hv_apic_write; + apic->icr_write = hv_apic_icr_write; + apic->icr_read = hv_apic_icr_read; + } +} + +#endif /* CONFIG_HYPERV */ +#endif /* CONFIG_X86_64 */ diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index cfecc2272f2d..71e50fc2b7ef 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -242,8 +242,9 @@ static int hv_cpu_die(unsigned int cpu) * * 1. Setup the hypercall page. * 2. Register Hyper-V specific clocksource. + * 3. Setup Hyper-V specific APIC entry points. */ -void hyperv_init(void) +void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; @@ -298,6 +299,8 @@ void hyperv_init(void) hyper_alloc_mmu(); + hv_apic_init(); + /* * Register Hyper-V specific clocksource. */ diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index b90e79610cf7..162977b82e2e 100
[tip:x86/hyperv] X86/Hyper-V: Enlighten APIC access
Commit-ID: 6b48cb5f8347bc0153ff1d7b075db92e6723ffdb Gitweb: https://git.kernel.org/tip/6b48cb5f8347bc0153ff1d7b075db92e6723ffdb Author: K. Y. Srinivasan AuthorDate: Wed, 16 May 2018 14:53:30 -0700 Committer: Thomas Gleixner CommitDate: Sat, 19 May 2018 13:23:17 +0200 X86/Hyper-V: Enlighten APIC access Hyper-V supports MSR based APIC access; implement the enlightenment. Signed-off-by: K. Y. Srinivasan Signed-off-by: Thomas Gleixner Reviewed-by: Michael Kelley Cc: o...@aepfle.de Cc: sthem...@microsoft.com Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: michael.h.kel...@microsoft.com Cc: h...@zytor.com Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: vkuzn...@redhat.com Link: https://lkml.kernel.org/r/20180516215334.6547-1-...@linuxonhyperv.com --- arch/x86/hyperv/Makefile| 2 +- arch/x86/hyperv/hv_apic.c | 104 arch/x86/hyperv/hv_init.c | 5 +- arch/x86/include/asm/mshyperv.h | 4 +- 4 files changed, 112 insertions(+), 3 deletions(-) diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile index 367a8203cfcf..00ce4df01a09 100644 --- a/arch/x86/hyperv/Makefile +++ b/arch/x86/hyperv/Makefile @@ -1 +1 @@ -obj-y := hv_init.o mmu.o +obj-y := hv_init.o mmu.o hv_apic.o diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c new file mode 100644 index ..ca20e31d311c --- /dev/null +++ b/arch/x86/hyperv/hv_apic.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Hyper-V specific APIC code. + * + * Copyright (C) 2018, Microsoft, Inc. + * + * Author : K. Y. Srinivasan + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_X86_64 +#if IS_ENABLED(CONFIG_HYPERV) + +static u64 hv_apic_icr_read(void) +{ + u64 reg_val; + + rdmsrl(HV_X64_MSR_ICR, reg_val); + return reg_val; +} + +static void hv_apic_icr_write(u32 low, u32 id) +{ + u64 reg_val; + + reg_val = SET_APIC_DEST_FIELD(id); + reg_val = reg_val << 32; + reg_val |= low; + + wrmsrl(HV_X64_MSR_ICR, reg_val); +} + +static u32 hv_apic_read(u32 reg) +{ + u32 reg_val, hi; + + switch (reg) { + case APIC_EOI: + rdmsr(HV_X64_MSR_EOI, reg_val, hi); + return reg_val; + case APIC_TASKPRI: + rdmsr(HV_X64_MSR_TPR, reg_val, hi); + return reg_val; + + default: + return native_apic_mem_read(reg); + } +} + +static void hv_apic_write(u32 reg, u32 val) +{ + switch (reg) { + case APIC_EOI: + wrmsr(HV_X64_MSR_EOI, val, 0); + break; + case APIC_TASKPRI: + wrmsr(HV_X64_MSR_TPR, val, 0); + break; + default: + native_apic_mem_write(reg, val); + } +} + +static void hv_apic_eoi_write(u32 reg, u32 val) +{ + wrmsr(HV_X64_MSR_EOI, val, 0); +} + +void __init hv_apic_init(void) +{ + if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { + pr_info("Hyper-V: Using MSR based APIC access\n"); + apic_set_eoi_write(hv_apic_eoi_write); + apic->read = hv_apic_read; + apic->write = hv_apic_write; + apic->icr_write = hv_apic_icr_write; + apic->icr_read = hv_apic_icr_read; + } +} + +#endif /* CONFIG_HYPERV */ +#endif /* CONFIG_X86_64 */ diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index cfecc2272f2d..71e50fc2b7ef 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -242,8 +242,9 @@ static int hv_cpu_die(unsigned int cpu) * * 1. Setup the hypercall page. * 2. Register Hyper-V specific clocksource. + * 3. Setup Hyper-V specific APIC entry points. */ -void hyperv_init(void) +void __init hyperv_init(void) { u64 guest_id, required_msrs; union hv_x64_msr_hypercall_contents hypercall_msr; @@ -298,6 +299,8 @@ void hyperv_init(void) hyper_alloc_mmu(); + hv_apic_init(); + /* * Register Hyper-V specific clocksource. */ diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index b90e79610cf7..162977b82e2e 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -258,7 +258,7 @@ static inline int hv_cpu_number_to_vp_number(int cpu_number)
[tip:x86/urgent] x86/hyper-V: Allocate the IDT entry early in boot
Commit-ID: 213ff44ae4eb5224010166db2f851e4eea068268 Gitweb: http://git.kernel.org/tip/213ff44ae4eb5224010166db2f851e4eea068268 Author: K. Y. Srinivasan <k...@microsoft.com> AuthorDate: Fri, 8 Sep 2017 16:15:57 -0700 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Wed, 13 Sep 2017 11:02:26 +0200 x86/hyper-V: Allocate the IDT entry early in boot Allocate the hypervisor callback IDT entry early in the boot sequence. The previous code would allocate the entry as part of registering the handler when the vmbus driver loaded, and this caused a problem for the IDT cleanup that Thomas is working on for v4.15. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Thomas Gleixner <t...@linutronix.de> Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: o...@aepfle.de Link: http://lkml.kernel.org/r/20170908231557.2419-1-...@exchange.microsoft.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/kernel/cpu/mshyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 3b3f713..236324e8 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -59,8 +59,6 @@ void hyperv_vector_handler(struct pt_regs *regs) void hv_setup_vmbus_irq(void (*handler)(void)) { vmbus_handler = handler; - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); } void hv_remove_vmbus_irq(void) @@ -251,6 +249,8 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = hyperv_init; hyperv_setup_mmu_ops(); + /* Setup the IDT for hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); #endif }
[tip:x86/urgent] x86/hyper-V: Allocate the IDT entry early in boot
Commit-ID: 213ff44ae4eb5224010166db2f851e4eea068268 Gitweb: http://git.kernel.org/tip/213ff44ae4eb5224010166db2f851e4eea068268 Author: K. Y. Srinivasan AuthorDate: Fri, 8 Sep 2017 16:15:57 -0700 Committer: Ingo Molnar CommitDate: Wed, 13 Sep 2017 11:02:26 +0200 x86/hyper-V: Allocate the IDT entry early in boot Allocate the hypervisor callback IDT entry early in the boot sequence. The previous code would allocate the entry as part of registering the handler when the vmbus driver loaded, and this caused a problem for the IDT cleanup that Thomas is working on for v4.15. Signed-off-by: K. Y. Srinivasan Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: a...@canonical.com Cc: de...@linuxdriverproject.org Cc: gre...@linuxfoundation.org Cc: jasow...@redhat.com Cc: o...@aepfle.de Link: http://lkml.kernel.org/r/20170908231557.2419-1-...@exchange.microsoft.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/mshyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 3b3f713..236324e8 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -59,8 +59,6 @@ void hyperv_vector_handler(struct pt_regs *regs) void hv_setup_vmbus_irq(void (*handler)(void)) { vmbus_handler = handler; - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); } void hv_remove_vmbus_irq(void) @@ -251,6 +249,8 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = hyperv_init; hyperv_setup_mmu_ops(); + /* Setup the IDT for hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); #endif }
[PATCH 1/1] x86/hyper-V: Allocate the IDT entry early in boot
Allocate the hypervisor callback IDT entry early in the boot sequence. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- arch/x86/kernel/cpu/mshyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 3b3f713e15e5..236324e83a3a 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -59,8 +59,6 @@ void hyperv_vector_handler(struct pt_regs *regs) void hv_setup_vmbus_irq(void (*handler)(void)) { vmbus_handler = handler; - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); } void hv_remove_vmbus_irq(void) @@ -251,6 +249,8 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = hyperv_init; hyperv_setup_mmu_ops(); + /* Setup the IDT for hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); #endif } -- 2.14.1
[PATCH 1/1] x86/hyper-V: Allocate the IDT entry early in boot
Allocate the hypervisor callback IDT entry early in the boot sequence. Signed-off-by: K. Y. Srinivasan --- arch/x86/kernel/cpu/mshyperv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 3b3f713e15e5..236324e83a3a 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -59,8 +59,6 @@ void hyperv_vector_handler(struct pt_regs *regs) void hv_setup_vmbus_irq(void (*handler)(void)) { vmbus_handler = handler; - /* Setup the IDT for hypervisor callback */ - alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); } void hv_remove_vmbus_irq(void) @@ -251,6 +249,8 @@ static void __init ms_hyperv_init_platform(void) */ x86_platform.apic_post_init = hyperv_init; hyperv_setup_mmu_ops(); + /* Setup the IDT for hypervisor callback */ + alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector); #endif } -- 2.14.1
[PATCH 00/18] Drivers: hv: vmbus: Restructure architecture specific code
The current Hyper-V code under drivers/hv has bunch of X86 specific code. Restructure the code and move al architecture specific code to the appropriate files. As I was working on this restructuring, Roman Kagan <rka...@virtuozzo.com> has submitted patches to restructure the Hyper-V header files to address a different need - to share the definitions across all Hyper-V drivers including QEMU based drivers. Roman and I will coordinate our work. K. Y. Srinivasan (18): Drivers: hv: vmbus: Move the definition of hv_x64_msr_hypercall_contents Drivers: hv: vmbus: Move the definition of generate_guest_id() Drivers: hv vmbus: Move Hypercall page setup out of common code Drivers: hv: vmbus: Move Hypercall invocation code out of common code Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code Drivers: hv: vmbus: Move the extracting of Hypervisor version information Drivers: hv: vmbus: Move the crash notification function Drivers: hv: vmbus: Move the check for hypercall page setup Drivers: hv: vmbus: Move the code to signal end of message Drivers: hv: vmbus: Restructure the clockevents code Drivers: hv: util: Use hv_get_current_tick() to get current tick Drivers: hv: vmbus: Get rid of an unsused variable Drivers: hv: vmbus: Define APIs to manipulate the message page Drivers: hv: vmbus: Define APIs to manipulate the event page Drivers: hv: vmbus: Define APIs to manipulate the synthetic interrupt controller Drivers: hv: vmbus: Define an API to retrieve virtual processor index Drivers: hv: vmbus: Define an APIs to manage interrupt state Drivers: hv: vmbus: Cleanup hyperv_vmbus.h arch/x86/Kbuild|3 + arch/x86/hyperv/Makefile |1 + arch/x86/hyperv/hv_init.c | 251 ++ arch/x86/include/asm/mshyperv.h| 147 ++ arch/x86/include/uapi/asm/hyperv.h |8 + arch/x86/kernel/cpu/mshyperv.c | 50 --- drivers/hv/channel_mgmt.c |1 + drivers/hv/connection.c|7 +- drivers/hv/hv.c| 296 drivers/hv/hv_util.c |3 +- drivers/hv/hyperv_vmbus.h | 291 +--- drivers/hv/vmbus_drv.c | 25 --- 12 files changed, 472 insertions(+), 611 deletions(-) create mode 100644 arch/x86/hyperv/Makefile create mode 100644 arch/x86/hyperv/hv_init.c -- 1.7.4.1
[PATCH 00/18] Drivers: hv: vmbus: Restructure architecture specific code
The current Hyper-V code under drivers/hv has bunch of X86 specific code. Restructure the code and move al architecture specific code to the appropriate files. As I was working on this restructuring, Roman Kagan has submitted patches to restructure the Hyper-V header files to address a different need - to share the definitions across all Hyper-V drivers including QEMU based drivers. Roman and I will coordinate our work. K. Y. Srinivasan (18): Drivers: hv: vmbus: Move the definition of hv_x64_msr_hypercall_contents Drivers: hv: vmbus: Move the definition of generate_guest_id() Drivers: hv vmbus: Move Hypercall page setup out of common code Drivers: hv: vmbus: Move Hypercall invocation code out of common code Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code Drivers: hv: vmbus: Move the extracting of Hypervisor version information Drivers: hv: vmbus: Move the crash notification function Drivers: hv: vmbus: Move the check for hypercall page setup Drivers: hv: vmbus: Move the code to signal end of message Drivers: hv: vmbus: Restructure the clockevents code Drivers: hv: util: Use hv_get_current_tick() to get current tick Drivers: hv: vmbus: Get rid of an unsused variable Drivers: hv: vmbus: Define APIs to manipulate the message page Drivers: hv: vmbus: Define APIs to manipulate the event page Drivers: hv: vmbus: Define APIs to manipulate the synthetic interrupt controller Drivers: hv: vmbus: Define an API to retrieve virtual processor index Drivers: hv: vmbus: Define an APIs to manage interrupt state Drivers: hv: vmbus: Cleanup hyperv_vmbus.h arch/x86/Kbuild|3 + arch/x86/hyperv/Makefile |1 + arch/x86/hyperv/hv_init.c | 251 ++ arch/x86/include/asm/mshyperv.h| 147 ++ arch/x86/include/uapi/asm/hyperv.h |8 + arch/x86/kernel/cpu/mshyperv.c | 50 --- drivers/hv/channel_mgmt.c |1 + drivers/hv/connection.c|7 +- drivers/hv/hv.c| 296 drivers/hv/hv_util.c |3 +- drivers/hv/hyperv_vmbus.h | 291 +--- drivers/hv/vmbus_drv.c | 25 --- 12 files changed, 472 insertions(+), 611 deletions(-) create mode 100644 arch/x86/hyperv/Makefile create mode 100644 arch/x86/hyperv/hv_init.c -- 1.7.4.1
[PATCH V3 00/14] Drivers: hv: Some miscellaneous fixes and enhancements
Some miscellaneous fixes and enhancements. V2: Fixed a build issue reported by Greg. Only Patch # 13 is affected. V3: Address Stephen's comment. Only patch 3 is affected. Alex Ng (6): Drivers: hv: utils: Fix the mapping between host version and protocol to use Drivers: hv: balloon: Disable hot add when CONFIG_MEMORY_HOTPLUG is not set Drivers: hv: balloon: Add logging for dynamic memory operations Drivers: hv: vss: Improve log messages. Drivers: hv: vss: Operation timeouts should match host expectation Drivers: hv: balloon: Fix info request to show max page count K. Y. Srinivasan (3): Drivers: hv: vmbus: Base host signaling strictly on the ring state Drivers: hv: vmbus: On write cleanup the logic to interrupt the host Drivers: hv: vmbus: On the read path cleanup the logic to interrupt the host Vitaly Kuznetsov (2): Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2) Drivers: hv: utils: reduce HV_UTIL_NEGO_TIMEOUT timeout Weibing Zhang (3): tools: hv: remove unnecessary link flag tools: hv: fix a compile warning in snprintf tools: hv: remove unnecessary header files and netlink related code drivers/hv/channel.c | 93 ++-- drivers/hv/channel_mgmt.c |2 - drivers/hv/hv_balloon.c| 44 ++-- drivers/hv/hv_snapshot.c | 33 drivers/hv/hv_util.c |9 +++- drivers/hv/hyperv_vmbus.h | 12 +++--- drivers/hv/ring_buffer.c | 44 - include/linux/hyperv.h | 45 - tools/hv/Makefile |3 +- tools/hv/hv_fcopy_daemon.c |7 --- tools/hv/hv_kvp_daemon.c |9 + 11 files changed, 133 insertions(+), 168 deletions(-) -- 1.7.4.1
[PATCH V3 00/14] Drivers: hv: Some miscellaneous fixes and enhancements
Some miscellaneous fixes and enhancements. V2: Fixed a build issue reported by Greg. Only Patch # 13 is affected. V3: Address Stephen's comment. Only patch 3 is affected. Alex Ng (6): Drivers: hv: utils: Fix the mapping between host version and protocol to use Drivers: hv: balloon: Disable hot add when CONFIG_MEMORY_HOTPLUG is not set Drivers: hv: balloon: Add logging for dynamic memory operations Drivers: hv: vss: Improve log messages. Drivers: hv: vss: Operation timeouts should match host expectation Drivers: hv: balloon: Fix info request to show max page count K. Y. Srinivasan (3): Drivers: hv: vmbus: Base host signaling strictly on the ring state Drivers: hv: vmbus: On write cleanup the logic to interrupt the host Drivers: hv: vmbus: On the read path cleanup the logic to interrupt the host Vitaly Kuznetsov (2): Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2) Drivers: hv: utils: reduce HV_UTIL_NEGO_TIMEOUT timeout Weibing Zhang (3): tools: hv: remove unnecessary link flag tools: hv: fix a compile warning in snprintf tools: hv: remove unnecessary header files and netlink related code drivers/hv/channel.c | 93 ++-- drivers/hv/channel_mgmt.c |2 - drivers/hv/hv_balloon.c| 44 ++-- drivers/hv/hv_snapshot.c | 33 drivers/hv/hv_util.c |9 +++- drivers/hv/hyperv_vmbus.h | 12 +++--- drivers/hv/ring_buffer.c | 44 - include/linux/hyperv.h | 45 - tools/hv/Makefile |3 +- tools/hv/hv_fcopy_daemon.c |7 --- tools/hv/hv_kvp_daemon.c |9 + 11 files changed, 133 insertions(+), 168 deletions(-) -- 1.7.4.1
[PATCH 1/1] Drivers: hv: vmbus: fix the race when querying & updating the percpu list
From: Dexuan Cui <de...@microsoft.com> There is a rare race when we remove an entry from the global list hv_context.percpu_list[cpu] in hv_process_channel_removal() -> percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() -> process_chn_event() -> pcpu_relid2channel() is trying to query the list, we can get the kernel fault. Similarly, we also have the issue in the code path: vmbus_process_offer() -> percpu_channel_enq(). We can resolve the issue by disabling the tasklet when updating the list. The patch also moves vmbus_release_relid() to a later place where the channel has been removed from the per-cpu and the global lists. Reported-by: Rolf Neugebauer <rolf.neugeba...@docker.com> Signed-off-by: Dexuan Cui <de...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c |6 ++ drivers/hv/channel_mgmt.c | 32 include/linux/hyperv.h|3 +++ 3 files changed, 33 insertions(+), 8 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 9a88c63..e47d37d 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -505,7 +505,6 @@ static void reset_channel_cb(void *arg) static int vmbus_close_internal(struct vmbus_channel *channel) { struct vmbus_channel_close_channel *msg; - struct tasklet_struct *tasklet; int ret; /* @@ -517,8 +516,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) * To resolve the race, we can serialize them by disabling the * tasklet when the latter is running here. */ - tasklet = hv_context.event_dpc[channel->target_cpu]; - tasklet_disable(tasklet); + hv_event_tasklet_disable(channel); /* * In case a device driver's probe() fails (e.g., @@ -584,7 +582,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) get_order(channel->ringbuffer_pagecount * PAGE_SIZE)); out: - tasklet_enable(tasklet); + hv_event_tasklet_enable(channel); return ret; } diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index b6c1211..8818b92 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -21,6 +21,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include +#include #include #include #include @@ -303,16 +304,32 @@ static void vmbus_release_relid(u32 relid) vmbus_post_msg(, sizeof(struct vmbus_channel_relid_released)); } +void hv_event_tasklet_disable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_disable(tasklet); +} + +void hv_event_tasklet_enable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_enable(tasklet); + + /* In case there is any pending event */ + tasklet_schedule(tasklet); +} + void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) { unsigned long flags; struct vmbus_channel *primary_channel; - vmbus_release_relid(relid); - BUG_ON(!channel->rescind); BUG_ON(!mutex_is_locked(_connection.channel_mutex)); + hv_event_tasklet_disable(channel); if (channel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(channel->target_cpu, @@ -321,6 +338,7 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) percpu_channel_deq(channel); put_cpu(); } + hv_event_tasklet_enable(channel); if (channel->primary_channel == NULL) { list_del(>listentry); @@ -341,6 +359,8 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) cpumask_clear_cpu(channel->target_cpu, _channel->alloced_cpus_in_node); + vmbus_release_relid(relid); + free_channel(channel); } @@ -409,6 +429,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) init_vp_index(newchannel, dev_type); + hv_event_tasklet_disable(newchannel); if (newchannel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(newchannel->target_cpu, @@ -418,6 +439,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) percpu_channel_enq(newchannel); put_cpu(); } + hv_event_tasklet_enable(newchannel); /* * This state is used to indicate a successful open @@ -463,12 +485,11 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) return; err_deq_chan: - vmbus_release_relid(newchannel->offermsg.child_relid); - mutex_lock(_connection.c
[PATCH 1/1] Drivers: hv: vmbus: fix the race when querying & updating the percpu list
From: Dexuan Cui There is a rare race when we remove an entry from the global list hv_context.percpu_list[cpu] in hv_process_channel_removal() -> percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() -> process_chn_event() -> pcpu_relid2channel() is trying to query the list, we can get the kernel fault. Similarly, we also have the issue in the code path: vmbus_process_offer() -> percpu_channel_enq(). We can resolve the issue by disabling the tasklet when updating the list. The patch also moves vmbus_release_relid() to a later place where the channel has been removed from the per-cpu and the global lists. Reported-by: Rolf Neugebauer Signed-off-by: Dexuan Cui Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c |6 ++ drivers/hv/channel_mgmt.c | 32 include/linux/hyperv.h|3 +++ 3 files changed, 33 insertions(+), 8 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 9a88c63..e47d37d 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -505,7 +505,6 @@ static void reset_channel_cb(void *arg) static int vmbus_close_internal(struct vmbus_channel *channel) { struct vmbus_channel_close_channel *msg; - struct tasklet_struct *tasklet; int ret; /* @@ -517,8 +516,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) * To resolve the race, we can serialize them by disabling the * tasklet when the latter is running here. */ - tasklet = hv_context.event_dpc[channel->target_cpu]; - tasklet_disable(tasklet); + hv_event_tasklet_disable(channel); /* * In case a device driver's probe() fails (e.g., @@ -584,7 +582,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) get_order(channel->ringbuffer_pagecount * PAGE_SIZE)); out: - tasklet_enable(tasklet); + hv_event_tasklet_enable(channel); return ret; } diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index b6c1211..8818b92 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -21,6 +21,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include +#include #include #include #include @@ -303,16 +304,32 @@ static void vmbus_release_relid(u32 relid) vmbus_post_msg(, sizeof(struct vmbus_channel_relid_released)); } +void hv_event_tasklet_disable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_disable(tasklet); +} + +void hv_event_tasklet_enable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_enable(tasklet); + + /* In case there is any pending event */ + tasklet_schedule(tasklet); +} + void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) { unsigned long flags; struct vmbus_channel *primary_channel; - vmbus_release_relid(relid); - BUG_ON(!channel->rescind); BUG_ON(!mutex_is_locked(_connection.channel_mutex)); + hv_event_tasklet_disable(channel); if (channel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(channel->target_cpu, @@ -321,6 +338,7 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) percpu_channel_deq(channel); put_cpu(); } + hv_event_tasklet_enable(channel); if (channel->primary_channel == NULL) { list_del(>listentry); @@ -341,6 +359,8 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) cpumask_clear_cpu(channel->target_cpu, _channel->alloced_cpus_in_node); + vmbus_release_relid(relid); + free_channel(channel); } @@ -409,6 +429,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) init_vp_index(newchannel, dev_type); + hv_event_tasklet_disable(newchannel); if (newchannel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(newchannel->target_cpu, @@ -418,6 +439,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) percpu_channel_enq(newchannel); put_cpu(); } + hv_event_tasklet_enable(newchannel); /* * This state is used to indicate a successful open @@ -463,12 +485,11 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) return; err_deq_chan: - vmbus_release_relid(newchannel->offermsg.child_relid); - mutex_lock(_connection.channel_mutex); list_del(>listentry); mutex_unlock(_connection.channel_mutex); + hv_event_task
[PATCH net-next] netvsc: Use the new in-place consumption APIs in the rx path
Use the new APIs for eliminating a copy on the receive path. These new APIs also help in minimizing the number of memory barriers we end up issuing (in the ringbuffer code) since we can better control when we want to expose the ring state to the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reviewed-by: Haiyang Zhang <haiya...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Tested-by: Simon Xiao <six...@microsoft.com> --- drivers/net/hyperv/netvsc.c | 88 +-- 1 files changed, 59 insertions(+), 29 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 719cb35..8cd4c19 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -1141,6 +1141,39 @@ static inline void netvsc_receive_inband(struct hv_device *hdev, } } +static void netvsc_process_raw_pkt(struct hv_device *device, + struct vmbus_channel *channel, + struct netvsc_device *net_device, + struct net_device *ndev, + u64 request_id, + struct vmpacket_descriptor *desc) +{ + struct nvsp_message *nvmsg; + + nvmsg = (struct nvsp_message *)((unsigned long) + desc + (desc->offset8 << 3)); + + switch (desc->type) { + case VM_PKT_COMP: + netvsc_send_completion(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_USING_XFER_PAGES: + netvsc_receive(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_INBAND: + netvsc_receive_inband(device, net_device, nvmsg); + break; + + default: + netdev_err(ndev, "unhandled packet type %d, tid %llx\n", + desc->type, request_id); + break; + } +} + + void netvsc_channel_cb(void *context) { int ret; @@ -1153,7 +1186,7 @@ void netvsc_channel_cb(void *context) unsigned char *buffer; int bufferlen = NETVSC_PACKET_SIZE; struct net_device *ndev; - struct nvsp_message *nvmsg; + bool need_to_commit = false; if (channel->primary_channel != NULL) device = channel->primary_channel->device_obj; @@ -1167,39 +1200,36 @@ void netvsc_channel_cb(void *context) buffer = get_per_channel_state(channel); do { + desc = get_next_pkt_raw(channel); + if (desc != NULL) { + netvsc_process_raw_pkt(device, + channel, + net_device, + ndev, + desc->trans_id, + desc); + + put_pkt_raw(channel, desc); + need_to_commit = true; + continue; + } + if (need_to_commit) { + need_to_commit = false; + commit_rd_index(channel); + } + ret = vmbus_recvpacket_raw(channel, buffer, bufferlen, _recvd, _id); if (ret == 0) { if (bytes_recvd > 0) { desc = (struct vmpacket_descriptor *)buffer; - nvmsg = (struct nvsp_message *)((unsigned long) -desc + (desc->offset8 << 3)); - switch (desc->type) { - case VM_PKT_COMP: - netvsc_send_completion(net_device, - channel, - device, desc); - break; - - case VM_PKT_DATA_USING_XFER_PAGES: - netvsc_receive(net_device, channel, - device, desc); - break; - - case VM_PKT_DATA_INBAND: - netvsc_receive_inband(device, - net_device, - nvmsg); - break; - - default: - netdev_err(ndev, - "unhandled packet type %d, " - "ti
[PATCH net-next] netvsc: Use the new in-place consumption APIs in the rx path
Use the new APIs for eliminating a copy on the receive path. These new APIs also help in minimizing the number of memory barriers we end up issuing (in the ringbuffer code) since we can better control when we want to expose the ring state to the host. Signed-off-by: K. Y. Srinivasan Reviewed-by: Haiyang Zhang Tested-by: Dexuan Cui Tested-by: Simon Xiao --- drivers/net/hyperv/netvsc.c | 88 +-- 1 files changed, 59 insertions(+), 29 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 719cb35..8cd4c19 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -1141,6 +1141,39 @@ static inline void netvsc_receive_inband(struct hv_device *hdev, } } +static void netvsc_process_raw_pkt(struct hv_device *device, + struct vmbus_channel *channel, + struct netvsc_device *net_device, + struct net_device *ndev, + u64 request_id, + struct vmpacket_descriptor *desc) +{ + struct nvsp_message *nvmsg; + + nvmsg = (struct nvsp_message *)((unsigned long) + desc + (desc->offset8 << 3)); + + switch (desc->type) { + case VM_PKT_COMP: + netvsc_send_completion(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_USING_XFER_PAGES: + netvsc_receive(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_INBAND: + netvsc_receive_inband(device, net_device, nvmsg); + break; + + default: + netdev_err(ndev, "unhandled packet type %d, tid %llx\n", + desc->type, request_id); + break; + } +} + + void netvsc_channel_cb(void *context) { int ret; @@ -1153,7 +1186,7 @@ void netvsc_channel_cb(void *context) unsigned char *buffer; int bufferlen = NETVSC_PACKET_SIZE; struct net_device *ndev; - struct nvsp_message *nvmsg; + bool need_to_commit = false; if (channel->primary_channel != NULL) device = channel->primary_channel->device_obj; @@ -1167,39 +1200,36 @@ void netvsc_channel_cb(void *context) buffer = get_per_channel_state(channel); do { + desc = get_next_pkt_raw(channel); + if (desc != NULL) { + netvsc_process_raw_pkt(device, + channel, + net_device, + ndev, + desc->trans_id, + desc); + + put_pkt_raw(channel, desc); + need_to_commit = true; + continue; + } + if (need_to_commit) { + need_to_commit = false; + commit_rd_index(channel); + } + ret = vmbus_recvpacket_raw(channel, buffer, bufferlen, _recvd, _id); if (ret == 0) { if (bytes_recvd > 0) { desc = (struct vmpacket_descriptor *)buffer; - nvmsg = (struct nvsp_message *)((unsigned long) -desc + (desc->offset8 << 3)); - switch (desc->type) { - case VM_PKT_COMP: - netvsc_send_completion(net_device, - channel, - device, desc); - break; - - case VM_PKT_DATA_USING_XFER_PAGES: - netvsc_receive(net_device, channel, - device, desc); - break; - - case VM_PKT_DATA_INBAND: - netvsc_receive_inband(device, - net_device, - nvmsg); - break; - - default: - netdev_err(ndev, - "unhandled packet type %d, " - "tid %llx len %d\n", - desc->type, request_id, -
[PATCH 1/2] Drivers: hv: get rid of timeout in vmbus_open()
From: Vitaly Kuznetsov <vkuzn...@redhat.com> vmbus_teardown_gpadl() can result in infinite wait when it is called on 5 second timeout in vmbus_open(). The issue is caused by the fact that gpadl teardown operation won't ever succeed for an opened channel and the timeout isn't always enough. As a guest, we can always trust the host to respond to our request (and there is nothing we can do if it doesn't). Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c |7 +-- 1 files changed, 1 insertions(+), 6 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index a68830c..9a88c63 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -73,7 +73,6 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size, void *in, *out; unsigned long flags; int ret, err = 0; - unsigned long t; struct page *page; spin_lock_irqsave(>lock, flags); @@ -183,11 +182,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size, goto error1; } - t = wait_for_completion_timeout(_info->waitevent, 5*HZ); - if (t == 0) { - err = -ETIMEDOUT; - goto error1; - } + wait_for_completion(_info->waitevent); spin_lock_irqsave(_connection.channelmsg_lock, flags); list_del(_info->msglistentry); -- 1.7.4.1
[PATCH 1/2] Drivers: hv: get rid of timeout in vmbus_open()
From: Vitaly Kuznetsov vmbus_teardown_gpadl() can result in infinite wait when it is called on 5 second timeout in vmbus_open(). The issue is caused by the fact that gpadl teardown operation won't ever succeed for an opened channel and the timeout isn't always enough. As a guest, we can always trust the host to respond to our request (and there is nothing we can do if it doesn't). Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c |7 +-- 1 files changed, 1 insertions(+), 6 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index a68830c..9a88c63 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -73,7 +73,6 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size, void *in, *out; unsigned long flags; int ret, err = 0; - unsigned long t; struct page *page; spin_lock_irqsave(>lock, flags); @@ -183,11 +182,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size, goto error1; } - t = wait_for_completion_timeout(_info->waitevent, 5*HZ); - if (t == 0) { - err = -ETIMEDOUT; - goto error1; - } + wait_for_completion(_info->waitevent); spin_lock_irqsave(_connection.channelmsg_lock, flags); list_del(_info->msglistentry); -- 1.7.4.1
[PATCH 2/2] Drivers: hv: utils: fix a race on userspace daemons registration
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Background: userspace daemons registration protocol for Hyper-V utilities drivers has two steps: 1) daemon writes its own version to kernel 2) kernel reads it and replies with module version at this point we consider the handshake procedure being completed and we do hv_poll_channel() transitioning the utility device to HVUTIL_READY state. At this point we're ready to handle messages from kernel. When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a single buffer for outgoing message. hvutil_transport_send() puts to this buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT to all consequent calls. Host<->guest protocol guarantees there is no more than one request at a time and we will not get new requests till we reply to the previous one so this single message buffer is enough. Now to the race. When we finish negotiation procedure and send kernel module version to userspace with hvutil_transport_send() it goes into the above mentioned buffer and if the daemon is slow enough to read it from there we can get a collision when a request from the host comes, we won't be able to put anything to the buffer so the request will be lost. To solve the issue we need to know when the negotiation is really done (when the version message is read by the daemon) and transition to HVUTIL_READY state after this happens. Implement a callback on read to support this. Old style netlink communication is not affected by the change, we don't really know when these messages are delivered but we don't have a single message buffer there. Reported-by: Barry Davis <barry_da...@stormagic.com> Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_fcopy.c | 14 ++ drivers/hv/hv_kvp.c | 27 --- drivers/hv/hv_snapshot.c| 16 +++- drivers/hv/hv_utils_transport.c | 15 ++- drivers/hv/hv_utils_transport.h |4 +++- 5 files changed, 54 insertions(+), 22 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index 23c7079..8b2ba98 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -83,6 +83,12 @@ static void fcopy_timeout_func(struct work_struct *dummy) hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); } +static void fcopy_register_done(void) +{ + pr_debug("FCP: userspace daemon registered\n"); + hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); +} + static int fcopy_handle_handshake(u32 version) { u32 our_ver = FCOPY_CURRENT_VERSION; @@ -94,7 +100,8 @@ static int fcopy_handle_handshake(u32 version) break; case FCOPY_VERSION_1: /* Daemon expects us to reply with our own version */ - if (hvutil_transport_send(hvt, _ver, sizeof(our_ver))) + if (hvutil_transport_send(hvt, _ver, sizeof(our_ver), + fcopy_register_done)) return -EFAULT; dm_reg_value = version; break; @@ -107,8 +114,7 @@ static int fcopy_handle_handshake(u32 version) */ return -EINVAL; } - pr_debug("FCP: userspace daemon ver. %d registered\n", version); - hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); + pr_debug("FCP: userspace daemon ver. %d connected\n", version); return 0; } @@ -161,7 +167,7 @@ static void fcopy_send_data(struct work_struct *dummy) } fcopy_transaction.state = HVUTIL_USERSPACE_REQ; - rc = hvutil_transport_send(hvt, out_src, out_len); + rc = hvutil_transport_send(hvt, out_src, out_len, NULL); if (rc) { pr_debug("FCP: failed to communicate to the daemon: %d\n", rc); if (cancel_delayed_work_sync(_timeout_work)) { diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index cb1a916..5e1fdc8 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -102,6 +102,17 @@ static void kvp_poll_wrapper(void *channel) hv_kvp_onchannelcallback(channel); } +static void kvp_register_done(void) +{ + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + pr_debug("KVP: userspace daemon registered\n"); + cancel_delayed_work_sync(_host_handshake_work); + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); +} + static void kvp_register(int reg_value) { @@ -116,7 +127,8 @@ kvp_register(int reg_value) kvp_msg->kvp_hdr.operation = reg_value; strcpy(version, HV_DRV_VERSION); - hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg)); + hvutil_transport_send(hvt,
[PATCH 2/2] Drivers: hv: utils: fix a race on userspace daemons registration
From: Vitaly Kuznetsov Background: userspace daemons registration protocol for Hyper-V utilities drivers has two steps: 1) daemon writes its own version to kernel 2) kernel reads it and replies with module version at this point we consider the handshake procedure being completed and we do hv_poll_channel() transitioning the utility device to HVUTIL_READY state. At this point we're ready to handle messages from kernel. When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a single buffer for outgoing message. hvutil_transport_send() puts to this buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT to all consequent calls. Host<->guest protocol guarantees there is no more than one request at a time and we will not get new requests till we reply to the previous one so this single message buffer is enough. Now to the race. When we finish negotiation procedure and send kernel module version to userspace with hvutil_transport_send() it goes into the above mentioned buffer and if the daemon is slow enough to read it from there we can get a collision when a request from the host comes, we won't be able to put anything to the buffer so the request will be lost. To solve the issue we need to know when the negotiation is really done (when the version message is read by the daemon) and transition to HVUTIL_READY state after this happens. Implement a callback on read to support this. Old style netlink communication is not affected by the change, we don't really know when these messages are delivered but we don't have a single message buffer there. Reported-by: Barry Davis Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_fcopy.c | 14 ++ drivers/hv/hv_kvp.c | 27 --- drivers/hv/hv_snapshot.c| 16 +++- drivers/hv/hv_utils_transport.c | 15 ++- drivers/hv/hv_utils_transport.h |4 +++- 5 files changed, 54 insertions(+), 22 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index 23c7079..8b2ba98 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -83,6 +83,12 @@ static void fcopy_timeout_func(struct work_struct *dummy) hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); } +static void fcopy_register_done(void) +{ + pr_debug("FCP: userspace daemon registered\n"); + hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); +} + static int fcopy_handle_handshake(u32 version) { u32 our_ver = FCOPY_CURRENT_VERSION; @@ -94,7 +100,8 @@ static int fcopy_handle_handshake(u32 version) break; case FCOPY_VERSION_1: /* Daemon expects us to reply with our own version */ - if (hvutil_transport_send(hvt, _ver, sizeof(our_ver))) + if (hvutil_transport_send(hvt, _ver, sizeof(our_ver), + fcopy_register_done)) return -EFAULT; dm_reg_value = version; break; @@ -107,8 +114,7 @@ static int fcopy_handle_handshake(u32 version) */ return -EINVAL; } - pr_debug("FCP: userspace daemon ver. %d registered\n", version); - hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); + pr_debug("FCP: userspace daemon ver. %d connected\n", version); return 0; } @@ -161,7 +167,7 @@ static void fcopy_send_data(struct work_struct *dummy) } fcopy_transaction.state = HVUTIL_USERSPACE_REQ; - rc = hvutil_transport_send(hvt, out_src, out_len); + rc = hvutil_transport_send(hvt, out_src, out_len, NULL); if (rc) { pr_debug("FCP: failed to communicate to the daemon: %d\n", rc); if (cancel_delayed_work_sync(_timeout_work)) { diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index cb1a916..5e1fdc8 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -102,6 +102,17 @@ static void kvp_poll_wrapper(void *channel) hv_kvp_onchannelcallback(channel); } +static void kvp_register_done(void) +{ + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + pr_debug("KVP: userspace daemon registered\n"); + cancel_delayed_work_sync(_host_handshake_work); + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); +} + static void kvp_register(int reg_value) { @@ -116,7 +127,8 @@ kvp_register(int reg_value) kvp_msg->kvp_hdr.operation = reg_value; strcpy(version, HV_DRV_VERSION); - hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg)); + hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg), + kvp_register_done); kf
[PATCH 0/2] Drivers: hv: Fix some issues in both vmbus as well as util drivers
Fix a race in the registration of user space daemons. Also get rid of timeout in vmbus_open() as timing out here would make it impossible to rollback correctly. Vitaly Kuznetsov (2): Drivers: hv: get rid of timeout in vmbus_open() Drivers: hv: utils: fix a race on userspace daemons registration drivers/hv/channel.c|7 +-- drivers/hv/hv_fcopy.c | 14 ++ drivers/hv/hv_kvp.c | 27 --- drivers/hv/hv_snapshot.c| 16 +++- drivers/hv/hv_utils_transport.c | 15 ++- drivers/hv/hv_utils_transport.h |4 +++- 6 files changed, 55 insertions(+), 28 deletions(-) -- 1.7.4.1
[PATCH 0/2] Drivers: hv: Fix some issues in both vmbus as well as util drivers
Fix a race in the registration of user space daemons. Also get rid of timeout in vmbus_open() as timing out here would make it impossible to rollback correctly. Vitaly Kuznetsov (2): Drivers: hv: get rid of timeout in vmbus_open() Drivers: hv: utils: fix a race on userspace daemons registration drivers/hv/channel.c|7 +-- drivers/hv/hv_fcopy.c | 14 ++ drivers/hv/hv_kvp.c | 27 --- drivers/hv/hv_snapshot.c| 16 +++- drivers/hv/hv_utils_transport.c | 15 ++- drivers/hv/hv_utils_transport.h |4 +++- 6 files changed, 55 insertions(+), 28 deletions(-) -- 1.7.4.1
[PATCH 3/3] Drivers: hv: don't leak memory in vmbus_establish_gpadl()
From: Vitaly Kuznetsov <vkuzn...@redhat.com> In some cases create_gpadl_header() allocates submessages but we never free them. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 2b109e8..a68830c 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -388,7 +388,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, struct vmbus_channel_gpadl_header *gpadlmsg; struct vmbus_channel_gpadl_body *gpadl_body; struct vmbus_channel_msginfo *msginfo = NULL; - struct vmbus_channel_msginfo *submsginfo; + struct vmbus_channel_msginfo *submsginfo, *tmp; struct list_head *curr; u32 next_gpadl_handle; unsigned long flags; @@ -445,6 +445,10 @@ cleanup: spin_lock_irqsave(_connection.channelmsg_lock, flags); list_del(>msglistentry); spin_unlock_irqrestore(_connection.channelmsg_lock, flags); + list_for_each_entry_safe(submsginfo, tmp, >submsglist, +msglistentry) { + kfree(submsginfo); + } kfree(msginfo); return ret; -- 1.7.4.1
[PATCH 3/3] Drivers: hv: don't leak memory in vmbus_establish_gpadl()
From: Vitaly Kuznetsov In some cases create_gpadl_header() allocates submessages but we never free them. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 2b109e8..a68830c 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -388,7 +388,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, struct vmbus_channel_gpadl_header *gpadlmsg; struct vmbus_channel_gpadl_body *gpadl_body; struct vmbus_channel_msginfo *msginfo = NULL; - struct vmbus_channel_msginfo *submsginfo; + struct vmbus_channel_msginfo *submsginfo, *tmp; struct list_head *curr; u32 next_gpadl_handle; unsigned long flags; @@ -445,6 +445,10 @@ cleanup: spin_lock_irqsave(_connection.channelmsg_lock, flags); list_del(>msglistentry); spin_unlock_irqrestore(_connection.channelmsg_lock, flags); + list_for_each_entry_safe(submsginfo, tmp, >submsglist, +msglistentry) { + kfree(submsginfo); + } kfree(msginfo); return ret; -- 1.7.4.1
[PATCH 1/3] Drivers: hv: avoid vfree() on crash
From: Vitaly Kuznetsov <vkuzn...@redhat.com> When we crash from NMI context (e.g. after NMI injection from host when 'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit kernel BUG at mm/vmalloc.c:1530! as vfree() is denied. While the issue could be solved with in_nmi() check instead I opted for skipping vfree on all sorts of crashes to reduce the amount of work which can cause consequent crashes. We don't really need to free anything on crash. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv.c |8 +--- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|8 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index a1c086b..60dbd6c 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -278,7 +278,7 @@ cleanup: * * This routine is called normally during driver unloading or exiting. */ -void hv_cleanup(void) +void hv_cleanup(bool crash) { union hv_x64_msr_hypercall_contents hypercall_msr; @@ -288,7 +288,8 @@ void hv_cleanup(void) if (hv_context.hypercall_page) { hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - vfree(hv_context.hypercall_page); + if (!crash) + vfree(hv_context.hypercall_page); hv_context.hypercall_page = NULL; } @@ -308,7 +309,8 @@ void hv_cleanup(void) hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64); - vfree(hv_context.tsc_page); + if (!crash) + vfree(hv_context.tsc_page); hv_context.tsc_page = NULL; } #endif diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 718b5c7..dfa9fac 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -495,7 +495,7 @@ struct hv_ring_buffer_debug_info { extern int hv_init(void); -extern void hv_cleanup(void); +extern void hv_cleanup(bool crash); extern int hv_post_message(union hv_connection_id connection_id, enum hv_message_type message_type, diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 952f20f..d11690e 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -871,7 +871,7 @@ err_alloc: bus_unregister(_bus); err_cleanup: - hv_cleanup(); + hv_cleanup(false); return ret; } @@ -1323,7 +1323,7 @@ static void hv_kexec_handler(void) vmbus_initiate_unload(false); for_each_online_cpu(cpu) smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); - hv_cleanup(); + hv_cleanup(false); }; static void hv_crash_handler(struct pt_regs *regs) @@ -1335,7 +1335,7 @@ static void hv_crash_handler(struct pt_regs *regs) * for kdump. */ hv_synic_cleanup(NULL); - hv_cleanup(); + hv_cleanup(true); }; static int __init hv_acpi_init(void) @@ -1395,7 +1395,7 @@ static void __exit vmbus_exit(void) _panic_block); } bus_unregister(_bus); - hv_cleanup(); + hv_cleanup(false); for_each_online_cpu(cpu) { tasklet_kill(hv_context.event_dpc[cpu]); smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); -- 1.7.4.1
[PATCH 1/3] Drivers: hv: avoid vfree() on crash
From: Vitaly Kuznetsov When we crash from NMI context (e.g. after NMI injection from host when 'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit kernel BUG at mm/vmalloc.c:1530! as vfree() is denied. While the issue could be solved with in_nmi() check instead I opted for skipping vfree on all sorts of crashes to reduce the amount of work which can cause consequent crashes. We don't really need to free anything on crash. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv.c |8 +--- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|8 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index a1c086b..60dbd6c 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -278,7 +278,7 @@ cleanup: * * This routine is called normally during driver unloading or exiting. */ -void hv_cleanup(void) +void hv_cleanup(bool crash) { union hv_x64_msr_hypercall_contents hypercall_msr; @@ -288,7 +288,8 @@ void hv_cleanup(void) if (hv_context.hypercall_page) { hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - vfree(hv_context.hypercall_page); + if (!crash) + vfree(hv_context.hypercall_page); hv_context.hypercall_page = NULL; } @@ -308,7 +309,8 @@ void hv_cleanup(void) hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64); - vfree(hv_context.tsc_page); + if (!crash) + vfree(hv_context.tsc_page); hv_context.tsc_page = NULL; } #endif diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 718b5c7..dfa9fac 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -495,7 +495,7 @@ struct hv_ring_buffer_debug_info { extern int hv_init(void); -extern void hv_cleanup(void); +extern void hv_cleanup(bool crash); extern int hv_post_message(union hv_connection_id connection_id, enum hv_message_type message_type, diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 952f20f..d11690e 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -871,7 +871,7 @@ err_alloc: bus_unregister(_bus); err_cleanup: - hv_cleanup(); + hv_cleanup(false); return ret; } @@ -1323,7 +1323,7 @@ static void hv_kexec_handler(void) vmbus_initiate_unload(false); for_each_online_cpu(cpu) smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); - hv_cleanup(); + hv_cleanup(false); }; static void hv_crash_handler(struct pt_regs *regs) @@ -1335,7 +1335,7 @@ static void hv_crash_handler(struct pt_regs *regs) * for kdump. */ hv_synic_cleanup(NULL); - hv_cleanup(); + hv_cleanup(true); }; static int __init hv_acpi_init(void) @@ -1395,7 +1395,7 @@ static void __exit vmbus_exit(void) _panic_block); } bus_unregister(_bus); - hv_cleanup(); + hv_cleanup(false); for_each_online_cpu(cpu) { tasklet_kill(hv_context.event_dpc[cpu]); smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); -- 1.7.4.1
[PATCH 2/3] Drivers: hv: get rid of redundant messagecount in create_gpadl_header()
From: Vitaly Kuznetsov <vkuzn...@redhat.com> We use messagecount only once in vmbus_establish_gpadl() to check if it is safe to iterate through the submsglist. We can just initialize the list header in all cases in create_gpadl_header() instead. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c | 38 -- 1 files changed, 16 insertions(+), 22 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 56dd261..2b109e8 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -238,8 +238,7 @@ EXPORT_SYMBOL_GPL(vmbus_send_tl_connect_request); * create_gpadl_header - Creates a gpadl for the specified buffer */ static int create_gpadl_header(void *kbuffer, u32 size, -struct vmbus_channel_msginfo **msginfo, -u32 *messagecount) + struct vmbus_channel_msginfo **msginfo) { int i; int pagecount; @@ -283,7 +282,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, gpadl_header->range[0].pfn_array[i] = slow_virt_to_phys( kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; pfnsum = pfncount; pfnleft = pagecount - pfncount; @@ -323,7 +321,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, } msgbody->msgsize = msgsize; - (*messagecount)++; gpadl_body = (struct vmbus_channel_gpadl_body *)msgbody->msg; @@ -352,6 +349,8 @@ static int create_gpadl_header(void *kbuffer, u32 size, msgheader = kzalloc(msgsize, GFP_KERNEL); if (msgheader == NULL) goto nomem; + + INIT_LIST_HEAD(>submsglist); msgheader->msgsize = msgsize; gpadl_header = (struct vmbus_channel_gpadl_header *) @@ -366,7 +365,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; } return 0; @@ -391,7 +389,6 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, struct vmbus_channel_gpadl_body *gpadl_body; struct vmbus_channel_msginfo *msginfo = NULL; struct vmbus_channel_msginfo *submsginfo; - u32 msgcount; struct list_head *curr; u32 next_gpadl_handle; unsigned long flags; @@ -400,7 +397,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, next_gpadl_handle = (atomic_inc_return(_connection.next_gpadl_handle) - 1); - ret = create_gpadl_header(kbuffer, size, , ); + ret = create_gpadl_header(kbuffer, size, ); if (ret) return ret; @@ -423,24 +420,21 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, if (ret != 0) goto cleanup; - if (msgcount > 1) { - list_for_each(curr, >submsglist) { + list_for_each(curr, >submsglist) { + submsginfo = (struct vmbus_channel_msginfo *)curr; + gpadl_body = + (struct vmbus_channel_gpadl_body *)submsginfo->msg; - submsginfo = (struct vmbus_channel_msginfo *)curr; - gpadl_body = -(struct vmbus_channel_gpadl_body *)submsginfo->msg; + gpadl_body->header.msgtype = + CHANNELMSG_GPADL_BODY; + gpadl_body->gpadl = next_gpadl_handle; - gpadl_body->header.msgtype = - CHANNELMSG_GPADL_BODY; - gpadl_body->gpadl = next_gpadl_handle; + ret = vmbus_post_msg(gpadl_body, +submsginfo->msgsize - +sizeof(*submsginfo)); + if (ret != 0) + goto cleanup; - ret = vmbus_post_msg(gpadl_body, - submsginfo->msgsize - - sizeof(*submsginfo)); - if (ret != 0) - goto cleanup; - - } } wait_for_completion(>waitevent); -- 1.7.4.1
[PATCH 2/3] Drivers: hv: get rid of redundant messagecount in create_gpadl_header()
From: Vitaly Kuznetsov We use messagecount only once in vmbus_establish_gpadl() to check if it is safe to iterate through the submsglist. We can just initialize the list header in all cases in create_gpadl_header() instead. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel.c | 38 -- 1 files changed, 16 insertions(+), 22 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 56dd261..2b109e8 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -238,8 +238,7 @@ EXPORT_SYMBOL_GPL(vmbus_send_tl_connect_request); * create_gpadl_header - Creates a gpadl for the specified buffer */ static int create_gpadl_header(void *kbuffer, u32 size, -struct vmbus_channel_msginfo **msginfo, -u32 *messagecount) + struct vmbus_channel_msginfo **msginfo) { int i; int pagecount; @@ -283,7 +282,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, gpadl_header->range[0].pfn_array[i] = slow_virt_to_phys( kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; pfnsum = pfncount; pfnleft = pagecount - pfncount; @@ -323,7 +321,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, } msgbody->msgsize = msgsize; - (*messagecount)++; gpadl_body = (struct vmbus_channel_gpadl_body *)msgbody->msg; @@ -352,6 +349,8 @@ static int create_gpadl_header(void *kbuffer, u32 size, msgheader = kzalloc(msgsize, GFP_KERNEL); if (msgheader == NULL) goto nomem; + + INIT_LIST_HEAD(>submsglist); msgheader->msgsize = msgsize; gpadl_header = (struct vmbus_channel_gpadl_header *) @@ -366,7 +365,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; } return 0; @@ -391,7 +389,6 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, struct vmbus_channel_gpadl_body *gpadl_body; struct vmbus_channel_msginfo *msginfo = NULL; struct vmbus_channel_msginfo *submsginfo; - u32 msgcount; struct list_head *curr; u32 next_gpadl_handle; unsigned long flags; @@ -400,7 +397,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, next_gpadl_handle = (atomic_inc_return(_connection.next_gpadl_handle) - 1); - ret = create_gpadl_header(kbuffer, size, , ); + ret = create_gpadl_header(kbuffer, size, ); if (ret) return ret; @@ -423,24 +420,21 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, if (ret != 0) goto cleanup; - if (msgcount > 1) { - list_for_each(curr, >submsglist) { + list_for_each(curr, >submsglist) { + submsginfo = (struct vmbus_channel_msginfo *)curr; + gpadl_body = + (struct vmbus_channel_gpadl_body *)submsginfo->msg; - submsginfo = (struct vmbus_channel_msginfo *)curr; - gpadl_body = -(struct vmbus_channel_gpadl_body *)submsginfo->msg; + gpadl_body->header.msgtype = + CHANNELMSG_GPADL_BODY; + gpadl_body->gpadl = next_gpadl_handle; - gpadl_body->header.msgtype = - CHANNELMSG_GPADL_BODY; - gpadl_body->gpadl = next_gpadl_handle; + ret = vmbus_post_msg(gpadl_body, +submsginfo->msgsize - +sizeof(*submsginfo)); + if (ret != 0) + goto cleanup; - ret = vmbus_post_msg(gpadl_body, - submsginfo->msgsize - - sizeof(*submsginfo)); - if (ret != 0) - goto cleanup; - - } } wait_for_completion(>waitevent); -- 1.7.4.1
[PATCH 0/3] Drivers: hv: vmbus: Some miscellaneous fixes
Some miscellaneous fixes. Vitaly Kuznetsov (3): Drivers: hv: avoid vfree() on crash Drivers: hv: get rid of redundant messagecount in create_gpadl_header() Drivers: hv: don't leak memory in vmbus_establish_gpadl() drivers/hv/channel.c | 44 +--- drivers/hv/hv.c |8 +--- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|8 4 files changed, 31 insertions(+), 31 deletions(-) -- 1.7.4.1
[PATCH 0/3] Drivers: hv: vmbus: Some miscellaneous fixes
Some miscellaneous fixes. Vitaly Kuznetsov (3): Drivers: hv: avoid vfree() on crash Drivers: hv: get rid of redundant messagecount in create_gpadl_header() Drivers: hv: don't leak memory in vmbus_establish_gpadl() drivers/hv/channel.c | 44 +--- drivers/hv/hv.c |8 +--- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|8 4 files changed, 31 insertions(+), 31 deletions(-) -- 1.7.4.1
[PATCH RESEND 5/5] tools: hv: lsvmbus: add pci pass-through UUID
From: Vitaly Kuznetsov <vkuzn...@redhat.com> lsvmbus keeps its own copy of all VMBus UUIDs, add PCIe pass-through device there to not report 'Unknown' for such devices. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- tools/hv/lsvmbus |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/tools/hv/lsvmbus b/tools/hv/lsvmbus index 162a378..e8fecd6 100644 --- a/tools/hv/lsvmbus +++ b/tools/hv/lsvmbus @@ -35,6 +35,7 @@ vmbus_dev_dict = { '{ba6163d9-04a1-4d29-b605-72e2ffb1dc7f}' : 'Synthetic SCSI Controller', '{2f9bcc4a-0069-4af3-b76b-6fd0be528cda}' : 'Synthetic fiber channel adapter', '{8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501}' : 'Synthetic RDMA adapter', + '{44c4f61d--4400-9d52-802e27ede19f}' : 'PCI Express pass-through', '{276aacf4-ac15-426c-98dd-7521ad3f01fe}' : '[Reserved system device]', '{f8e65716-3cb3-4a06-9a60-1889c5cccab5}' : '[Reserved system device]', '{3375baf4-9e15-4b30-b765-67acb10d607b}' : '[Reserved system device]', -- 1.7.4.1
[PATCH RESEND 3/5] Drivers: hv: balloon: don't crash when memory is added in non-sorted order
From: Vitaly Kuznetsov <vkuzn...@redhat.com> When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_balloon.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b853b4b..43af913 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -714,7 +714,7 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; /* * If the current hot add-request extends beyond @@ -768,7 +768,7 @@ static unsigned long handle_pg_range(unsigned long pg_start, * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; old_covered_state = has->covered_end_pfn; -- 1.7.4.1
[PATCH RESEND 5/5] tools: hv: lsvmbus: add pci pass-through UUID
From: Vitaly Kuznetsov lsvmbus keeps its own copy of all VMBus UUIDs, add PCIe pass-through device there to not report 'Unknown' for such devices. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- tools/hv/lsvmbus |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/tools/hv/lsvmbus b/tools/hv/lsvmbus index 162a378..e8fecd6 100644 --- a/tools/hv/lsvmbus +++ b/tools/hv/lsvmbus @@ -35,6 +35,7 @@ vmbus_dev_dict = { '{ba6163d9-04a1-4d29-b605-72e2ffb1dc7f}' : 'Synthetic SCSI Controller', '{2f9bcc4a-0069-4af3-b76b-6fd0be528cda}' : 'Synthetic fiber channel adapter', '{8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501}' : 'Synthetic RDMA adapter', + '{44c4f61d--4400-9d52-802e27ede19f}' : 'PCI Express pass-through', '{276aacf4-ac15-426c-98dd-7521ad3f01fe}' : '[Reserved system device]', '{f8e65716-3cb3-4a06-9a60-1889c5cccab5}' : '[Reserved system device]', '{3375baf4-9e15-4b30-b765-67acb10d607b}' : '[Reserved system device]', -- 1.7.4.1
[PATCH RESEND 3/5] Drivers: hv: balloon: don't crash when memory is added in non-sorted order
From: Vitaly Kuznetsov When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b853b4b..43af913 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -714,7 +714,7 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; /* * If the current hot add-request extends beyond @@ -768,7 +768,7 @@ static unsigned long handle_pg_range(unsigned long pg_start, * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; old_covered_state = has->covered_end_pfn; -- 1.7.4.1
[PATCH RESEND 4/5] Drivers: hv: balloon: reset host_specified_ha_region
From: Vitaly Kuznetsov <vkuzn...@redhat.com> We set host_specified_ha_region = true on certain request but this is a global state which stays 'true' forever. We need to reset it when we receive a request where ha_region is not specified. I did not see any real issues, the bug was found by code inspection. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_balloon.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 43af913..df35fb7 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1400,6 +1400,7 @@ static void balloon_onchannelcallback(void *context) * This is a normal hot-add request specifying * hot-add memory. */ + dm->host_specified_ha_region = false; ha_pg_range = _msg->range; dm->ha_wrk.ha_page_range = *ha_pg_range; dm->ha_wrk.ha_region_range.page_range = 0; -- 1.7.4.1
[PATCH RESEND 1/5] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + cancel_delayed_work_sync(_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(_host_handshake_work); cancel_delayed_work_sync(_timeout_work); cancel_work_sync(_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5c586f..e5203e4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is set by CPUID(HVCPUID_VERSION_FEATURES). */ -- 1.7.4.1
[PATCH RESEND 2/5] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. +*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *msg) +sta
[PATCH RESEND 4/5] Drivers: hv: balloon: reset host_specified_ha_region
From: Vitaly Kuznetsov We set host_specified_ha_region = true on certain request but this is a global state which stays 'true' forever. We need to reset it when we receive a request where ha_region is not specified. I did not see any real issues, the bug was found by code inspection. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_balloon.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 43af913..df35fb7 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1400,6 +1400,7 @@ static void balloon_onchannelcallback(void *context) * This is a normal hot-add request specifying * hot-add memory. */ + dm->host_specified_ha_region = false; ha_pg_range = _msg->range; dm->ha_wrk.ha_page_range = *ha_pg_range; dm->ha_wrk.ha_region_range.page_range = 0; -- 1.7.4.1
[PATCH RESEND 1/5] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + cancel_delayed_work_sync(_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(_host_handshake_work); cancel_delayed_work_sync(_timeout_work); cancel_work_sync(_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5c586f..e5203e4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is set by CPUID(HVCPUID_VERSION_FEATURES). */ -- 1.7.4.1
[PATCH RESEND 2/5] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. +*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *msg) +static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg
[PATCH RESEND 0/5] Drivers: hv: Some miscellaneous fixes
Some miscellaneous fixes. All these patches are being resent. Vitaly Kuznetsov (5): Drivers: hv: kvp: fix IP Failover Drivers: hv: vmbus: handle various crash scenarios Drivers: hv: balloon: don't crash when memory is added in non-sorted order Drivers: hv: balloon: reset host_specified_ha_region tools: hv: lsvmbus: add pci pass-through UUID drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hv_balloon.c |5 ++- drivers/hv/hv_kvp.c | 31 drivers/hv/hyperv_vmbus.h | 21 ++- drivers/hv/vmbus_drv.c|7 +++-- tools/hv/lsvmbus |1 + 6 files changed, 101 insertions(+), 22 deletions(-) -- 1.7.4.1
[PATCH RESEND 0/5] Drivers: hv: Some miscellaneous fixes
Some miscellaneous fixes. All these patches are being resent. Vitaly Kuznetsov (5): Drivers: hv: kvp: fix IP Failover Drivers: hv: vmbus: handle various crash scenarios Drivers: hv: balloon: don't crash when memory is added in non-sorted order Drivers: hv: balloon: reset host_specified_ha_region tools: hv: lsvmbus: add pci pass-through UUID drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hv_balloon.c |5 ++- drivers/hv/hv_kvp.c | 31 drivers/hv/hyperv_vmbus.h | 21 ++- drivers/hv/vmbus_drv.c|7 +++-- tools/hv/lsvmbus |1 + 6 files changed, 101 insertions(+), 22 deletions(-) -- 1.7.4.1
[PATCH net-next V5 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> V4: Addressed kbuild errors reported by: kbuild test robot <l...@intel.com> V5: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx He
[PATCH net-next V5 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V5: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V5 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan --- V2: Addressed most of the comments from Alexander Duyck and Rustad, Mark D . V3: Addressed additional comments from Alexander Duyck V4: Addressed kbuild errors reported by: kbuild test robot V5: Addressed additional comments from Alexander Duyck drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +2067,10 @@ static void ixgbevf_negotiate_api(struct ixgbev
[PATCH net-next V5 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan --- V5: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V5 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V5 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V4 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V4: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V4 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan --- V4: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V4 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> V4: Addressed kbuild errors reported by: kbuild test robot <l...@intel.com> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +20
[PATCH net-next V4 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan --- V2: Addressed most of the comments from Alexander Duyck and Rustad, Mark D . V3: Addressed additional comments from Alexander Duyck V4: Addressed kbuild errors reported by: kbuild test robot drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +2067,10 @@ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter) spin_lock_bh(>mbx_lock); while (api[idx] !=
[PATCH net-next V4 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V4 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V3 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: No change from V1. V3: No change from V2. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V3 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan --- V2: No change from V1. V3: No change from V2. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V3 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V3 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +2067,10 @@ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter) spin_l
[PATCH net-next V3 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1
[PATCH net-next V3 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan V2: Addressed most of the comments from Alexander Duyck and Rustad, Mark D . V3: Addressed additional comments from Alexander Duyck --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +2067,10 @@ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter) spin_lock_bh(>mbx_lock); while (api[idx] != ixgbe_mbox_api_unknown) { - err = ixgbevf_negotiate_a
[PATCH net-next V2 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: No change from V1. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V2 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 5 files changed, 242 insertions(+), 4 deletions(-) -- 1.7.4.1
[PATCH net-next V2 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 5 files changed, 242 insertions(+), 4 deletions(-) -- 1.7.4.1
[PATCH net-next V2 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan --- V2: No change from V1. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V2 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 5 files changed, 242 insertions(+), 4 deletions(-) -- 1.7.4.1
[PATCH net-next V2 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 5 files changed, 242 insertions(+), 4 deletions(-) -- 1.7.4.1
[PATCH net-next V2 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 4 files changed, 237 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c761d80 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.c b/drivers/net/ethernet/intel/ixgbevf/mbx.c index dc68fea..298a0da 100644 --- a/drivers/net/ethernet/intel/ixgbevf/mbx.c +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.c @@ -346,3 +346,15 @@ const struct ixgbe_mbx_operations ixgbevf_mbx_ops = { .check_for_rst = ixgbevf_check_for_rst_vf, }; +/** + * Mailbox operations when running on Hyper-V. + * On Hyper-V, PF/VF communiction is not through the + * hardware mailbox; this communication is through + * a software mediated path. + * Most mail box operations are noop while running on + * Hyper-V. + */ +const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops = { + .init_params= ixgbevf_init_mbx_params_vf, + .check_for_rst = ixgbevf_check_for_rst_vf, +}; diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c index 4d613a4..1ec13c1 100644 ---
[PATCH net-next V2 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan --- V2: Addressed most of the comments from Alexander Duyck and Rustad, Mark D . drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 4 files changed, 237 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c761d80 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.c b/drivers/net/ethernet/intel/ixgbevf/mbx.c index dc68fea..298a0da 100644 --- a/drivers/net/ethernet/intel/ixgbevf/mbx.c +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.c @@ -346,3 +346,15 @@ const struct ixgbe_mbx_operations ixgbevf_mbx_ops = { .check_for_rst = ixgbevf_check_for_rst_vf, }; +/** + * Mailbox operations when running on Hyper-V. + * On Hyper-V, PF/VF communiction is not through the + * hardware mailbox; this communication is through + * a software mediated path. + * Most mail box operations are noop while running on + * Hyper-V. + */ +const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops = { + .init_params= ixgbevf_init_mbx_params_vf, + .check_for_rst = ixgbevf_check_for_rst_vf, +}; diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c index 4d613a4..1ec13c1 100644 --- a/drivers/net/ethernet/intel/ixgbevf/vf.c +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c @@ -2
Drivers/hv
Greg, Some time back I had sent a buch of patches for Hyper-V drivers. Are they still in the queue or should I resend them. Regards, K. Y
Drivers/hv
Greg, Some time back I had sent a buch of patches for Hyper-V drivers. Are they still in the queue or should I resend them. Regards, K. Y
[PATCH net-next 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 201 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..f8d2a0b 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..4a0ffac 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -49,6 +49,7 @@ #include #include #include +#include #include "ixgbevf.h" @@ -62,10 +63,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +83,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1809,12 +1818,13 @@ static int ixgbevf_vlan_rx_add_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* add VID to filter table */ - err = hw->mac.ops.set_vfta(hw, vid, 0, true); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, true); spin_unlock_bh(>mbx_lock); @@ -1835,12 +1845,13 @@ static int ixgbevf_vlan_rx_kill_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* remove VID from filter table */ - err = hw->mac.ops.set_vfta(hw, vid, 0, false); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, false); spin_unlock_bh(>mbx_lock); @@ -1873,14 +1884,16 @@ static int ixgbevf_write_uc_addr_list(struct net_device *netdev) struct netdev_hw_addr *ha; netdev_for_each_uc_addr(ha, netdev) { - hw->mac.ops.set_uc_addr(hw, ++coun
[PATCH net-next 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different ID when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan --- drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV0x1530 +#define IXGBE_DEV_ID_X550_VF_HV0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 201 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..f8d2a0b 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..4a0ffac 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -49,6 +49,7 @@ #include #include #include +#include #include "ixgbevf.h" @@ -62,10 +63,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +83,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1809,12 +1818,13 @@ static int ixgbevf_vlan_rx_add_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* add VID to filter table */ - err = hw->mac.ops.set_vfta(hw, vid, 0, true); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, true); spin_unlock_bh(>mbx_lock); @@ -1835,12 +1845,13 @@ static int ixgbevf_vlan_rx_kill_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* remove VID from filter table */ - err = hw->mac.ops.set_vfta(hw, vid, 0, false); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, false); spin_unlock_bh(>mbx_lock); @@ -1873,14 +1884,16 @@ static int ixgbevf_write_uc_addr_list(struct net_device *netdev) struct netdev_hw_addr *ha; netdev_for_each_uc_addr(ha, netdev) { - hw->mac.ops.set_uc_addr(hw, ++count, ha-
[PATCH net-next 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 206 insertions(+), 18 deletions(-) -- 1.7.4.1
[PATCH net-next 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 206 insertions(+), 18 deletions(-) -- 1.7.4.1
[PATCH net-next 1/1] hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to the guest has an associated synthetic interface that shares the MAC address with the VF instance. Typically these are bonded together to support live migration. By default, the host delivers all the incoming packets on the synthetic interface. Once the VF is up, we need to explicitly switch the data path on the host to divert traffic onto the VF interface. Even after switching the data path, broadcast and multicast packets are always delivered on the synthetic interface and these will have to be injected back onto the VF interface (if VF is up). This patch implements the necessary support in netvsc to support Linux VF drivers. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reviewed-by: Haiyang Zhang <haiya...@microsoft.com> --- drivers/net/hyperv/hyperv_net.h | 14 ++ drivers/net/hyperv/netvsc.c | 29 drivers/net/hyperv/netvsc_drv.c | 312 +--- drivers/net/hyperv/rndis_filter.c |6 + 4 files changed, 335 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 8b3bd8e..6700a4d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -202,6 +202,8 @@ int rndis_filter_receive(struct hv_device *dev, int rndis_filter_set_packet_filter(struct rndis_device *dev, u32 new_filter); int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac); +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf); + #define NVSP_INVALID_PROTOCOL_VERSION ((u32)0x) #define NVSP_PROTOCOL_VERSION_12 @@ -641,6 +643,12 @@ struct netvsc_reconfig { u32 event; }; +struct garp_wrk { + struct work_struct dwrk; + struct net_device *netdev; + struct netvsc_device *netvsc_dev; +}; + /* The context of the netvsc device */ struct net_device_context { /* point back to our device context */ @@ -656,6 +664,7 @@ struct net_device_context { struct work_struct work; u32 msg_enable; /* debug level */ + struct garp_wrk gwrk; struct netvsc_stats __percpu *tx_stats; struct netvsc_stats __percpu *rx_stats; @@ -730,6 +739,11 @@ struct netvsc_device { u32 vf_alloc; /* Serial number of the VF to team with */ u32 vf_serial; + atomic_t open_cnt; + /* State to manage the associated VF interface. */ + bool vf_inject; + struct net_device *vf_netdev; + atomic_t vf_use_cnt; }; /* NdisInitialize message */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ec313fc..eddce3c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -33,6 +33,30 @@ #include "hyperv_net.h" +/* + * Switch the data path from the synthetic interface to the VF + * interface. + */ +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf) +{ + struct nvsp_message *init_pkt = _dev->channel_init_pkt; + struct hv_device *dev = nv_dev->dev; + + memset(init_pkt, 0, sizeof(struct nvsp_message)); + init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH; + if (vf) + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_VF; + else + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_SYNTHETIC; + + vmbus_sendpacket(dev->channel, init_pkt, + sizeof(struct nvsp_message), + (unsigned long)init_pkt, + VM_PKT_DATA_INBAND, 0); +} + static struct netvsc_device *alloc_net_device(struct hv_device *device) { @@ -52,11 +76,16 @@ static struct netvsc_device *alloc_net_device(struct hv_device *device) init_waitqueue_head(_device->wait_drain); net_device->start_remove = false; net_device->destroy = false; + atomic_set(_device->open_cnt, 0); + atomic_set(_device->vf_use_cnt, 0); net_device->dev = device; net_device->ndev = ndev; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; + net_device->vf_netdev = NULL; + net_device->vf_inject = false; + hv_set_drvdata(device, net_device); return net_device; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index b8121eb..bfdb568a 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -610,42 +610,24 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj, schedule_delayed_work(_ctx->dwork, 0); } -/* - * netvsc_recv_callback - Callback when we receive a packet from the - * "wire" on the specified device. - */ -int netvsc_recv_callback(struct hv_devi
[PATCH net-next 1/1] hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to the guest has an associated synthetic interface that shares the MAC address with the VF instance. Typically these are bonded together to support live migration. By default, the host delivers all the incoming packets on the synthetic interface. Once the VF is up, we need to explicitly switch the data path on the host to divert traffic onto the VF interface. Even after switching the data path, broadcast and multicast packets are always delivered on the synthetic interface and these will have to be injected back onto the VF interface (if VF is up). This patch implements the necessary support in netvsc to support Linux VF drivers. Signed-off-by: K. Y. Srinivasan Reviewed-by: Haiyang Zhang --- drivers/net/hyperv/hyperv_net.h | 14 ++ drivers/net/hyperv/netvsc.c | 29 drivers/net/hyperv/netvsc_drv.c | 312 +--- drivers/net/hyperv/rndis_filter.c |6 + 4 files changed, 335 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 8b3bd8e..6700a4d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -202,6 +202,8 @@ int rndis_filter_receive(struct hv_device *dev, int rndis_filter_set_packet_filter(struct rndis_device *dev, u32 new_filter); int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac); +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf); + #define NVSP_INVALID_PROTOCOL_VERSION ((u32)0x) #define NVSP_PROTOCOL_VERSION_12 @@ -641,6 +643,12 @@ struct netvsc_reconfig { u32 event; }; +struct garp_wrk { + struct work_struct dwrk; + struct net_device *netdev; + struct netvsc_device *netvsc_dev; +}; + /* The context of the netvsc device */ struct net_device_context { /* point back to our device context */ @@ -656,6 +664,7 @@ struct net_device_context { struct work_struct work; u32 msg_enable; /* debug level */ + struct garp_wrk gwrk; struct netvsc_stats __percpu *tx_stats; struct netvsc_stats __percpu *rx_stats; @@ -730,6 +739,11 @@ struct netvsc_device { u32 vf_alloc; /* Serial number of the VF to team with */ u32 vf_serial; + atomic_t open_cnt; + /* State to manage the associated VF interface. */ + bool vf_inject; + struct net_device *vf_netdev; + atomic_t vf_use_cnt; }; /* NdisInitialize message */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ec313fc..eddce3c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -33,6 +33,30 @@ #include "hyperv_net.h" +/* + * Switch the data path from the synthetic interface to the VF + * interface. + */ +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf) +{ + struct nvsp_message *init_pkt = _dev->channel_init_pkt; + struct hv_device *dev = nv_dev->dev; + + memset(init_pkt, 0, sizeof(struct nvsp_message)); + init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH; + if (vf) + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_VF; + else + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_SYNTHETIC; + + vmbus_sendpacket(dev->channel, init_pkt, + sizeof(struct nvsp_message), + (unsigned long)init_pkt, + VM_PKT_DATA_INBAND, 0); +} + static struct netvsc_device *alloc_net_device(struct hv_device *device) { @@ -52,11 +76,16 @@ static struct netvsc_device *alloc_net_device(struct hv_device *device) init_waitqueue_head(_device->wait_drain); net_device->start_remove = false; net_device->destroy = false; + atomic_set(_device->open_cnt, 0); + atomic_set(_device->vf_use_cnt, 0); net_device->dev = device; net_device->ndev = ndev; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; + net_device->vf_netdev = NULL; + net_device->vf_inject = false; + hv_set_drvdata(device, net_device); return net_device; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index b8121eb..bfdb568a 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -610,42 +610,24 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj, schedule_delayed_work(_ctx->dwork, 0); } -/* - * netvsc_recv_callback - Callback when we receive a packet from the - * "wire" on the specified device. - */ -int netvsc_recv_callback(struct hv_device *device_obj, + +static struct sk_buff *netvsc_alloc_recv_skb(struct net_de
[PATCH 1/1] hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to the guest has an associated synthetic interface that shares the MAC address with the VF instance. Typically these are bonded together to support live migration. By default, the host delivers all the incoming packets on the synthetic interface. Once the VF is up, we need to explicitly switch the data path on the host to divert traffic onto the VF interface. Even after switching the data path, broadcast and multicast packets are always delivered on the synthetic interface and these will have to be injected back onto the VF interface (if VF is up). This patch implements the necessary support in netvsc to support Linux VF drivers. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/net/hyperv/hyperv_net.h | 14 ++ drivers/net/hyperv/netvsc.c | 29 drivers/net/hyperv/netvsc_drv.c | 309 +++ 3 files changed, 326 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 8b3bd8e..6700a4d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -202,6 +202,8 @@ int rndis_filter_receive(struct hv_device *dev, int rndis_filter_set_packet_filter(struct rndis_device *dev, u32 new_filter); int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac); +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf); + #define NVSP_INVALID_PROTOCOL_VERSION ((u32)0x) #define NVSP_PROTOCOL_VERSION_12 @@ -641,6 +643,12 @@ struct netvsc_reconfig { u32 event; }; +struct garp_wrk { + struct work_struct dwrk; + struct net_device *netdev; + struct netvsc_device *netvsc_dev; +}; + /* The context of the netvsc device */ struct net_device_context { /* point back to our device context */ @@ -656,6 +664,7 @@ struct net_device_context { struct work_struct work; u32 msg_enable; /* debug level */ + struct garp_wrk gwrk; struct netvsc_stats __percpu *tx_stats; struct netvsc_stats __percpu *rx_stats; @@ -730,6 +739,11 @@ struct netvsc_device { u32 vf_alloc; /* Serial number of the VF to team with */ u32 vf_serial; + atomic_t open_cnt; + /* State to manage the associated VF interface. */ + bool vf_inject; + struct net_device *vf_netdev; + atomic_t vf_use_cnt; }; /* NdisInitialize message */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ec313fc..eddce3c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -33,6 +33,30 @@ #include "hyperv_net.h" +/* + * Switch the data path from the synthetic interface to the VF + * interface. + */ +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf) +{ + struct nvsp_message *init_pkt = _dev->channel_init_pkt; + struct hv_device *dev = nv_dev->dev; + + memset(init_pkt, 0, sizeof(struct nvsp_message)); + init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH; + if (vf) + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_VF; + else + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_SYNTHETIC; + + vmbus_sendpacket(dev->channel, init_pkt, + sizeof(struct nvsp_message), + (unsigned long)init_pkt, + VM_PKT_DATA_INBAND, 0); +} + static struct netvsc_device *alloc_net_device(struct hv_device *device) { @@ -52,11 +76,16 @@ static struct netvsc_device *alloc_net_device(struct hv_device *device) init_waitqueue_head(_device->wait_drain); net_device->start_remove = false; net_device->destroy = false; + atomic_set(_device->open_cnt, 0); + atomic_set(_device->vf_use_cnt, 0); net_device->dev = device; net_device->ndev = ndev; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; + net_device->vf_netdev = NULL; + net_device->vf_inject = false; + hv_set_drvdata(device, net_device); return net_device; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index b8121eb..5669589 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -610,42 +610,24 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj, schedule_delayed_work(_ctx->dwork, 0); } -/* - * netvsc_recv_callback - Callback when we receive a packet from the - * "wire" on the specified device. - */ -int netvsc_recv_callback(struct hv_device *device_obj, + +static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net,
[PATCH 1/1] hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to the guest has an associated synthetic interface that shares the MAC address with the VF instance. Typically these are bonded together to support live migration. By default, the host delivers all the incoming packets on the synthetic interface. Once the VF is up, we need to explicitly switch the data path on the host to divert traffic onto the VF interface. Even after switching the data path, broadcast and multicast packets are always delivered on the synthetic interface and these will have to be injected back onto the VF interface (if VF is up). This patch implements the necessary support in netvsc to support Linux VF drivers. Signed-off-by: K. Y. Srinivasan --- drivers/net/hyperv/hyperv_net.h | 14 ++ drivers/net/hyperv/netvsc.c | 29 drivers/net/hyperv/netvsc_drv.c | 309 +++ 3 files changed, 326 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 8b3bd8e..6700a4d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -202,6 +202,8 @@ int rndis_filter_receive(struct hv_device *dev, int rndis_filter_set_packet_filter(struct rndis_device *dev, u32 new_filter); int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac); +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf); + #define NVSP_INVALID_PROTOCOL_VERSION ((u32)0x) #define NVSP_PROTOCOL_VERSION_12 @@ -641,6 +643,12 @@ struct netvsc_reconfig { u32 event; }; +struct garp_wrk { + struct work_struct dwrk; + struct net_device *netdev; + struct netvsc_device *netvsc_dev; +}; + /* The context of the netvsc device */ struct net_device_context { /* point back to our device context */ @@ -656,6 +664,7 @@ struct net_device_context { struct work_struct work; u32 msg_enable; /* debug level */ + struct garp_wrk gwrk; struct netvsc_stats __percpu *tx_stats; struct netvsc_stats __percpu *rx_stats; @@ -730,6 +739,11 @@ struct netvsc_device { u32 vf_alloc; /* Serial number of the VF to team with */ u32 vf_serial; + atomic_t open_cnt; + /* State to manage the associated VF interface. */ + bool vf_inject; + struct net_device *vf_netdev; + atomic_t vf_use_cnt; }; /* NdisInitialize message */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ec313fc..eddce3c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -33,6 +33,30 @@ #include "hyperv_net.h" +/* + * Switch the data path from the synthetic interface to the VF + * interface. + */ +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf) +{ + struct nvsp_message *init_pkt = _dev->channel_init_pkt; + struct hv_device *dev = nv_dev->dev; + + memset(init_pkt, 0, sizeof(struct nvsp_message)); + init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH; + if (vf) + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_VF; + else + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_SYNTHETIC; + + vmbus_sendpacket(dev->channel, init_pkt, + sizeof(struct nvsp_message), + (unsigned long)init_pkt, + VM_PKT_DATA_INBAND, 0); +} + static struct netvsc_device *alloc_net_device(struct hv_device *device) { @@ -52,11 +76,16 @@ static struct netvsc_device *alloc_net_device(struct hv_device *device) init_waitqueue_head(_device->wait_drain); net_device->start_remove = false; net_device->destroy = false; + atomic_set(_device->open_cnt, 0); + atomic_set(_device->vf_use_cnt, 0); net_device->dev = device; net_device->ndev = ndev; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; + net_device->vf_netdev = NULL; + net_device->vf_inject = false; + hv_set_drvdata(device, net_device); return net_device; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index b8121eb..5669589 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -610,42 +610,24 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj, schedule_delayed_work(_ctx->dwork, 0); } -/* - * netvsc_recv_callback - Callback when we receive a packet from the - * "wire" on the specified device. - */ -int netvsc_recv_callback(struct hv_device *device_obj, + +static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net, st
[PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time and as part of processing each packet we potentially may signal the host (if it is waiting for room to produce a packet). These APIs help batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run on average I see about 20X reduction in checks to signal the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |1 + include/linux/hyperv.h | 86 ++ 2 files changed, 87 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index dd255c9..fe586bf 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, u32 next_read_location) { ring_info->ring_buffer->read_index = next_read_location; + ring_info->priv_read_index = next_read_location; } /* Get the size of the ring buffer. */ diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 6797a30..b10954a 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -126,6 +126,8 @@ struct hv_ring_buffer_info { u32 ring_datasize; /* < ring_size */ u32 ring_data_startoffset; + u32 priv_write_index; + u32 priv_read_index; }; /* @@ -1420,4 +1422,88 @@ static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) return false; } +/* + * An API to support in-place processing of incoming VMBUS packets. + */ +#define VMBUS_PKT_TRAILER 8 + +static inline struct vmpacket_descriptor * +get_next_pkt_raw(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + u32 read_loc = ring_info->priv_read_index; + void *ring_buffer = hv_get_ring_buffer(ring_info); + struct vmpacket_descriptor *cur_desc; + u32 packetlen; + u32 dsize = ring_info->ring_datasize; + u32 delta = read_loc - ring_info->ring_buffer->read_index; + u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta); + + if (bytes_avail_toread < sizeof(struct vmpacket_descriptor)) + return NULL; + + if ((read_loc + sizeof(*cur_desc)) > dsize) + return NULL; + + cur_desc = ring_buffer + read_loc; + packetlen = cur_desc->len8 << 3; + + /* +* If the packet under consideration is wrapping around, +* return failure. +*/ + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1)) + return NULL; + + return cur_desc; +} + +/* + * A helper function to step through packets "in-place" + * This API is to be called after each successful call + * get_next_pkt_raw(). + */ +static inline void put_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor *desc) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + u32 read_loc = ring_info->priv_read_index; + u32 packetlen = desc->len8 << 3; + u32 dsize = ring_info->ring_datasize; + + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize) + BUG(); + /* +* Include the packet trailer. +*/ + ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER; +} + +/* + * This call commits the read index and potentially signals the host. + * Here is the pattern for using the "in-place" consumption APIs: + * + * while (get_next_pkt_raw() { + * process the packet "in-place"; + * put_pkt_raw(); + * } + * if (packets processed in place) + * commit_rd_index(); + */ +static inline void commit_rd_index(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + /* +* Make sure all reads are done before we update the read index since +* the writer may start writing to the read area once the read index +* is updated. +*/ + virt_rmb(); + ring_info->ring_buffer->read_index = ring_info->priv_read_index; + + if (hv_need_to_signal_on_read(ring_info)) + vmbus_set_event(channel); +} + + #endif /* _HYPERV_H */ -- 1.7.4.1
[PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time and as part of processing each packet we potentially may signal the host (if it is waiting for room to produce a packet). These APIs help batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run on average I see about 20X reduction in checks to signal the host. Signed-off-by: K. Y. Srinivasan --- drivers/hv/ring_buffer.c |1 + include/linux/hyperv.h | 86 ++ 2 files changed, 87 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index dd255c9..fe586bf 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, u32 next_read_location) { ring_info->ring_buffer->read_index = next_read_location; + ring_info->priv_read_index = next_read_location; } /* Get the size of the ring buffer. */ diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 6797a30..b10954a 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -126,6 +126,8 @@ struct hv_ring_buffer_info { u32 ring_datasize; /* < ring_size */ u32 ring_data_startoffset; + u32 priv_write_index; + u32 priv_read_index; }; /* @@ -1420,4 +1422,88 @@ static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) return false; } +/* + * An API to support in-place processing of incoming VMBUS packets. + */ +#define VMBUS_PKT_TRAILER 8 + +static inline struct vmpacket_descriptor * +get_next_pkt_raw(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + u32 read_loc = ring_info->priv_read_index; + void *ring_buffer = hv_get_ring_buffer(ring_info); + struct vmpacket_descriptor *cur_desc; + u32 packetlen; + u32 dsize = ring_info->ring_datasize; + u32 delta = read_loc - ring_info->ring_buffer->read_index; + u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta); + + if (bytes_avail_toread < sizeof(struct vmpacket_descriptor)) + return NULL; + + if ((read_loc + sizeof(*cur_desc)) > dsize) + return NULL; + + cur_desc = ring_buffer + read_loc; + packetlen = cur_desc->len8 << 3; + + /* +* If the packet under consideration is wrapping around, +* return failure. +*/ + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1)) + return NULL; + + return cur_desc; +} + +/* + * A helper function to step through packets "in-place" + * This API is to be called after each successful call + * get_next_pkt_raw(). + */ +static inline void put_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor *desc) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + u32 read_loc = ring_info->priv_read_index; + u32 packetlen = desc->len8 << 3; + u32 dsize = ring_info->ring_datasize; + + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize) + BUG(); + /* +* Include the packet trailer. +*/ + ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER; +} + +/* + * This call commits the read index and potentially signals the host. + * Here is the pattern for using the "in-place" consumption APIs: + * + * while (get_next_pkt_raw() { + * process the packet "in-place"; + * put_pkt_raw(); + * } + * if (packets processed in place) + * commit_rd_index(); + */ +static inline void commit_rd_index(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = >inbound; + /* +* Make sure all reads are done before we update the read index since +* the writer may start writing to the read area once the read index +* is updated. +*/ + virt_rmb(); + ring_info->ring_buffer->read_index = ring_info->priv_read_index; + + if (hv_need_to_signal_on_read(ring_info)) + vmbus_set_event(channel); +} + + #endif /* _HYPERV_H */ -- 1.7.4.1
[PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov <[mailto:vkuzn...@redhat.com]> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. +*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *ms
[PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov <[mailto:vkuzn...@redhat.com]> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. +*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *msg) +static inline void vmbus_signal_eom(struct hv_mes
[PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
In preparation for implementing APIs for in-place consumption of VMBUS packets, movve some ring buffer functionality into hyperv.h Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 55 -- include/linux/hyperv.h | 54 + 2 files changed, 54 insertions(+), 55 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 8f518af..dd255c9 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -84,52 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) return false; } -/* - * To optimize the flow management on the send-side, - * when the sender is blocked because of lack of - * sufficient space in the ring buffer, potential the - * consumer of the ring buffer can signal the producer. - * This is controlled by the following parameters: - * - * 1. pending_send_sz: This is the size in bytes that the - *producer is trying to send. - * 2. The feature bit feat_pending_send_sz set to indicate if - *the consumer of the ring will signal when the ring - *state transitions from being full to a state where - *there is room for the producer to send the pending packet. - */ - -static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) -{ - u32 cur_write_sz; - u32 pending_sz; - - /* -* Issue a full memory barrier before making the signaling decision. -* Here is the reason for having this barrier: -* If the reading of the pend_sz (in this function) -* were to be reordered and read before we commit the new read -* index (in the calling function) we could -* have a problem. If the host were to set the pending_sz after we -* have sampled pending_sz and go to sleep before we commit the -* read index, we could miss sending the interrupt. Issue a full -* memory barrier to address this. -*/ - virt_mb(); - - pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); - /* If the other end is not blocked on write don't bother. */ - if (pending_sz == 0) - return false; - - cur_write_sz = hv_get_bytes_to_write(rbi); - - if (cur_write_sz >= pending_sz) - return true; - - return false; -} - /* Get the next write location for the specified ring buffer. */ static inline u32 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info) @@ -180,15 +134,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, ring_info->ring_buffer->read_index = next_read_location; } - -/* Get the start of the ring buffer. */ -static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) -{ - return (void *)ring_info->ring_buffer->buffer; -} - - /* Get the size of the ring buffer. */ static inline u32 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 4adeb6e..6797a30 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1366,4 +1366,58 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); void vmbus_set_event(struct vmbus_channel *channel); + +/* Get the start of the ring buffer. */ +static inline void * +hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +{ + return (void *)ring_info->ring_buffer->buffer; +} + +/* + * To optimize the flow management on the send-side, + * when the sender is blocked because of lack of + * sufficient space in the ring buffer, potential the + * consumer of the ring buffer can signal the producer. + * This is controlled by the following parameters: + * + * 1. pending_send_sz: This is the size in bytes that the + *producer is trying to send. + * 2. The feature bit feat_pending_send_sz set to indicate if + *the consumer of the ring will signal when the ring + *state transitions from being full to a state where + *there is room for the producer to send the pending packet. + */ + +static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) +{ + u32 cur_write_sz; + u32 pending_sz; + + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address
[PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API
In preparation for moving some ring buffer functionality out of the vmbus driver, export the API for signaling the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/connection.c |1 + drivers/hv/hyperv_vmbus.h |2 -- include/linux/hyperv.h|1 + 3 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index d02f137..fcf8a02 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -495,3 +495,4 @@ void vmbus_set_event(struct vmbus_channel *channel) hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL); } +EXPORT_SYMBOL_GPL(vmbus_set_event); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 8b07f9c..e5203e4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -672,8 +672,6 @@ void vmbus_disconnect(void); int vmbus_post_msg(void *buffer, size_t buflen); -void vmbus_set_event(struct vmbus_channel *channel); - void vmbus_on_event(unsigned long data); void vmbus_on_msg_dpc(unsigned long data); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index a6b053c..4adeb6e 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1365,4 +1365,5 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); +void vmbus_set_event(struct vmbus_channel *channel); #endif /* _HYPERV_H */ -- 1.7.4.1
[PATCH 1/8] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + cancel_delayed_work_sync(_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(_host_handshake_work); cancel_delayed_work_sync(_timeout_work); cancel_work_sync(_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 12321b9..8b07f9c 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is set by CPUID(HVCPUID_VERSION_FEATURES). */ -- 1.7.4.1
[PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
In preparation for implementing APIs for in-place consumption of VMBUS packets, movve some ring buffer functionality into hyperv.h Signed-off-by: K. Y. Srinivasan --- drivers/hv/ring_buffer.c | 55 -- include/linux/hyperv.h | 54 + 2 files changed, 54 insertions(+), 55 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 8f518af..dd255c9 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -84,52 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) return false; } -/* - * To optimize the flow management on the send-side, - * when the sender is blocked because of lack of - * sufficient space in the ring buffer, potential the - * consumer of the ring buffer can signal the producer. - * This is controlled by the following parameters: - * - * 1. pending_send_sz: This is the size in bytes that the - *producer is trying to send. - * 2. The feature bit feat_pending_send_sz set to indicate if - *the consumer of the ring will signal when the ring - *state transitions from being full to a state where - *there is room for the producer to send the pending packet. - */ - -static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) -{ - u32 cur_write_sz; - u32 pending_sz; - - /* -* Issue a full memory barrier before making the signaling decision. -* Here is the reason for having this barrier: -* If the reading of the pend_sz (in this function) -* were to be reordered and read before we commit the new read -* index (in the calling function) we could -* have a problem. If the host were to set the pending_sz after we -* have sampled pending_sz and go to sleep before we commit the -* read index, we could miss sending the interrupt. Issue a full -* memory barrier to address this. -*/ - virt_mb(); - - pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); - /* If the other end is not blocked on write don't bother. */ - if (pending_sz == 0) - return false; - - cur_write_sz = hv_get_bytes_to_write(rbi); - - if (cur_write_sz >= pending_sz) - return true; - - return false; -} - /* Get the next write location for the specified ring buffer. */ static inline u32 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info) @@ -180,15 +134,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, ring_info->ring_buffer->read_index = next_read_location; } - -/* Get the start of the ring buffer. */ -static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) -{ - return (void *)ring_info->ring_buffer->buffer; -} - - /* Get the size of the ring buffer. */ static inline u32 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 4adeb6e..6797a30 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1366,4 +1366,58 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); void vmbus_set_event(struct vmbus_channel *channel); + +/* Get the start of the ring buffer. */ +static inline void * +hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +{ + return (void *)ring_info->ring_buffer->buffer; +} + +/* + * To optimize the flow management on the send-side, + * when the sender is blocked because of lack of + * sufficient space in the ring buffer, potential the + * consumer of the ring buffer can signal the producer. + * This is controlled by the following parameters: + * + * 1. pending_send_sz: This is the size in bytes that the + *producer is trying to send. + * 2. The feature bit feat_pending_send_sz set to indicate if + *the consumer of the ring will signal when the ring + *state transitions from being full to a state where + *there is room for the producer to send the pending packet. + */ + +static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) +{ + u32 cur_write_sz; + u32 pending_sz; + + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address this. +*/ + virt