[PATCH 1/1] x86/hyper-V: Allocate the IDT entry early in boot
Allocate the hypervisor callback IDT entry early in the boot sequence.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 arch/x86/kernel/cpu/mshyperv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3b3f713e15e5..236324e83a3a 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -59,8 +59,6 @@ void hyperv_vector_handler(struct pt_regs *regs)
 void hv_setup_vmbus_irq(void (*handler)(void))
 {
 	vmbus_handler = handler;
-	/* Setup the IDT for hypervisor callback */
-	alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector);
 }
 
 void hv_remove_vmbus_irq(void)
@@ -251,6 +249,8 @@ static void __init ms_hyperv_init_platform(void)
 	 */
 	x86_platform.apic_post_init = hyperv_init;
 	hyperv_setup_mmu_ops();
+	/* Setup the IDT for hypervisor callback */
+	alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, hyperv_callback_vector);
 #endif
 }
-- 
2.14.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 00/18] Drivers: hv: vmbus: Restructure architecture specific code
The current Hyper-V code under drivers/hv has a bunch of x86-specific code.
Restructure the code and move all architecture-specific code to the
appropriate files.

As I was working on this restructuring, Roman Kagan <rka...@virtuozzo.com>
submitted patches to restructure the Hyper-V header files to address a
different need: sharing the definitions across all Hyper-V drivers,
including QEMU-based drivers. Roman and I will coordinate our work.

K. Y. Srinivasan (18):
  Drivers: hv: vmbus: Move the definition of hv_x64_msr_hypercall_contents
  Drivers: hv: vmbus: Move the definition of generate_guest_id()
  Drivers: hv vmbus: Move Hypercall page setup out of common code
  Drivers: hv: vmbus: Move Hypercall invocation code out of common code
  Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code
  Drivers: hv: vmbus: Move the extracting of Hypervisor version information
  Drivers: hv: vmbus: Move the crash notification function
  Drivers: hv: vmbus: Move the check for hypercall page setup
  Drivers: hv: vmbus: Move the code to signal end of message
  Drivers: hv: vmbus: Restructure the clockevents code
  Drivers: hv: util: Use hv_get_current_tick() to get current tick
  Drivers: hv: vmbus: Get rid of an unsused variable
  Drivers: hv: vmbus: Define APIs to manipulate the message page
  Drivers: hv: vmbus: Define APIs to manipulate the event page
  Drivers: hv: vmbus: Define APIs to manipulate the synthetic interrupt controller
  Drivers: hv: vmbus: Define an API to retrieve virtual processor index
  Drivers: hv: vmbus: Define an APIs to manage interrupt state
  Drivers: hv: vmbus: Cleanup hyperv_vmbus.h

 arch/x86/Kbuild                    |    3 +
 arch/x86/hyperv/Makefile           |    1 +
 arch/x86/hyperv/hv_init.c          |  251 ++
 arch/x86/include/asm/mshyperv.h    |  147 ++
 arch/x86/include/uapi/asm/hyperv.h |    8 +
 arch/x86/kernel/cpu/mshyperv.c     |   50 ---
 drivers/hv/channel_mgmt.c          |    1 +
 drivers/hv/connection.c            |    7 +-
 drivers/hv/hv.c                    |  296
 drivers/hv/hv_util.c               |    3 +-
 drivers/hv/hyperv_vmbus.h          |  291 +---
 drivers/hv/vmbus_drv.c             |   25 ---
 12 files changed, 472 insertions(+), 611 deletions(-)
 create mode 100644 arch/x86/hyperv/Makefile
 create mode 100644 arch/x86/hyperv/hv_init.c

-- 
1.7.4.1
[PATCH V3 00/14] Drivers: hv: Some miscellaneous fixes and enhancements
Some miscellaneous fixes and enhancements.

V2: Fixed a build issue reported by Greg. Only patch 13 is affected.
V3: Address Stephen's comment. Only patch 3 is affected.

Alex Ng (6):
  Drivers: hv: utils: Fix the mapping between host version and protocol to use
  Drivers: hv: balloon: Disable hot add when CONFIG_MEMORY_HOTPLUG is not set
  Drivers: hv: balloon: Add logging for dynamic memory operations
  Drivers: hv: vss: Improve log messages.
  Drivers: hv: vss: Operation timeouts should match host expectation
  Drivers: hv: balloon: Fix info request to show max page count

K. Y. Srinivasan (3):
  Drivers: hv: vmbus: Base host signaling strictly on the ring state
  Drivers: hv: vmbus: On write cleanup the logic to interrupt the host
  Drivers: hv: vmbus: On the read path cleanup the logic to interrupt the host

Vitaly Kuznetsov (2):
  Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)
  Drivers: hv: utils: reduce HV_UTIL_NEGO_TIMEOUT timeout

Weibing Zhang (3):
  tools: hv: remove unnecessary link flag
  tools: hv: fix a compile warning in snprintf
  tools: hv: remove unnecessary header files and netlink related code

 drivers/hv/channel.c       |   93 ++--
 drivers/hv/channel_mgmt.c  |    2 -
 drivers/hv/hv_balloon.c    |   44 ++--
 drivers/hv/hv_snapshot.c   |   33
 drivers/hv/hv_util.c       |    9 +++-
 drivers/hv/hyperv_vmbus.h  |   12 +++---
 drivers/hv/ring_buffer.c   |   44 -
 include/linux/hyperv.h     |   45 -
 tools/hv/Makefile          |    3 +-
 tools/hv/hv_fcopy_daemon.c |    7 ---
 tools/hv/hv_kvp_daemon.c   |    9 +
 11 files changed, 133 insertions(+), 168 deletions(-)

-- 
1.7.4.1
[PATCH 1/1] Drivers: hv: vmbus: fix the race when querying & updating the percpu list
From: Dexuan Cui <de...@microsoft.com> There is a rare race when we remove an entry from the global list hv_context.percpu_list[cpu] in hv_process_channel_removal() -> percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() -> process_chn_event() -> pcpu_relid2channel() is trying to query the list, we can get the kernel fault. Similarly, we also have the issue in the code path: vmbus_process_offer() -> percpu_channel_enq(). We can resolve the issue by disabling the tasklet when updating the list. The patch also moves vmbus_release_relid() to a later place where the channel has been removed from the per-cpu and the global lists. Reported-by: Rolf Neugebauer <rolf.neugeba...@docker.com> Signed-off-by: Dexuan Cui <de...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c |6 ++ drivers/hv/channel_mgmt.c | 32 include/linux/hyperv.h|3 +++ 3 files changed, 33 insertions(+), 8 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 9a88c63..e47d37d 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -505,7 +505,6 @@ static void reset_channel_cb(void *arg) static int vmbus_close_internal(struct vmbus_channel *channel) { struct vmbus_channel_close_channel *msg; - struct tasklet_struct *tasklet; int ret; /* @@ -517,8 +516,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) * To resolve the race, we can serialize them by disabling the * tasklet when the latter is running here. 
*/ - tasklet = hv_context.event_dpc[channel->target_cpu]; - tasklet_disable(tasklet); + hv_event_tasklet_disable(channel); /* * In case a device driver's probe() fails (e.g., @@ -584,7 +582,7 @@ static int vmbus_close_internal(struct vmbus_channel *channel) get_order(channel->ringbuffer_pagecount * PAGE_SIZE)); out: - tasklet_enable(tasklet); + hv_event_tasklet_enable(channel); return ret; } diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index b6c1211..8818b92 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -21,6 +21,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include +#include #include #include #include @@ -303,16 +304,32 @@ static void vmbus_release_relid(u32 relid) vmbus_post_msg(, sizeof(struct vmbus_channel_relid_released)); } +void hv_event_tasklet_disable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_disable(tasklet); +} + +void hv_event_tasklet_enable(struct vmbus_channel *channel) +{ + struct tasklet_struct *tasklet; + tasklet = hv_context.event_dpc[channel->target_cpu]; + tasklet_enable(tasklet); + + /* In case there is any pending event */ + tasklet_schedule(tasklet); +} + void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) { unsigned long flags; struct vmbus_channel *primary_channel; - vmbus_release_relid(relid); - BUG_ON(!channel->rescind); BUG_ON(!mutex_is_locked(_connection.channel_mutex)); + hv_event_tasklet_disable(channel); if (channel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(channel->target_cpu, @@ -321,6 +338,7 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) percpu_channel_deq(channel); put_cpu(); } + hv_event_tasklet_enable(channel); if (channel->primary_channel == NULL) { list_del(>listentry); @@ -341,6 +359,8 @@ void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid) cpumask_clear_cpu(channel->target_cpu, 
_channel->alloced_cpus_in_node); + vmbus_release_relid(relid); + free_channel(channel); } @@ -409,6 +429,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) init_vp_index(newchannel, dev_type); + hv_event_tasklet_disable(newchannel); if (newchannel->target_cpu != get_cpu()) { put_cpu(); smp_call_function_single(newchannel->target_cpu, @@ -418,6 +439,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) percpu_channel_enq(newchannel); put_cpu(); } + hv_event_tasklet_enable(newchannel); /* * This state is used to indicate a successful open @@ -463,12 +485,11 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) return; err_deq_chan: - vmbus_release_relid(newchannel->offermsg.child_relid); - mutex_lock(_connection.c
[PATCH net-next] netvsc: Use the new in-place consumption APIs in the rx path
Use the new APIs for eliminating a copy on the receive path. These new APIs also help in minimizing the number of memory barriers we end up issuing (in the ringbuffer code) since we can better control when we want to expose the ring state to the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reviewed-by: Haiyang Zhang <haiya...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Tested-by: Simon Xiao <six...@microsoft.com> --- drivers/net/hyperv/netvsc.c | 88 +-- 1 files changed, 59 insertions(+), 29 deletions(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 719cb35..8cd4c19 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -1141,6 +1141,39 @@ static inline void netvsc_receive_inband(struct hv_device *hdev, } } +static void netvsc_process_raw_pkt(struct hv_device *device, + struct vmbus_channel *channel, + struct netvsc_device *net_device, + struct net_device *ndev, + u64 request_id, + struct vmpacket_descriptor *desc) +{ + struct nvsp_message *nvmsg; + + nvmsg = (struct nvsp_message *)((unsigned long) + desc + (desc->offset8 << 3)); + + switch (desc->type) { + case VM_PKT_COMP: + netvsc_send_completion(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_USING_XFER_PAGES: + netvsc_receive(net_device, channel, device, desc); + break; + + case VM_PKT_DATA_INBAND: + netvsc_receive_inband(device, net_device, nvmsg); + break; + + default: + netdev_err(ndev, "unhandled packet type %d, tid %llx\n", + desc->type, request_id); + break; + } +} + + void netvsc_channel_cb(void *context) { int ret; @@ -1153,7 +1186,7 @@ void netvsc_channel_cb(void *context) unsigned char *buffer; int bufferlen = NETVSC_PACKET_SIZE; struct net_device *ndev; - struct nvsp_message *nvmsg; + bool need_to_commit = false; if (channel->primary_channel != NULL) device = channel->primary_channel->device_obj; @@ -1167,39 +1200,36 @@ void netvsc_channel_cb(void *context) buffer = get_per_channel_state(channel); do 
{ + desc = get_next_pkt_raw(channel); + if (desc != NULL) { + netvsc_process_raw_pkt(device, + channel, + net_device, + ndev, + desc->trans_id, + desc); + + put_pkt_raw(channel, desc); + need_to_commit = true; + continue; + } + if (need_to_commit) { + need_to_commit = false; + commit_rd_index(channel); + } + ret = vmbus_recvpacket_raw(channel, buffer, bufferlen, _recvd, _id); if (ret == 0) { if (bytes_recvd > 0) { desc = (struct vmpacket_descriptor *)buffer; - nvmsg = (struct nvsp_message *)((unsigned long) -desc + (desc->offset8 << 3)); - switch (desc->type) { - case VM_PKT_COMP: - netvsc_send_completion(net_device, - channel, - device, desc); - break; - - case VM_PKT_DATA_USING_XFER_PAGES: - netvsc_receive(net_device, channel, - device, desc); - break; - - case VM_PKT_DATA_INBAND: - netvsc_receive_inband(device, - net_device, - nvmsg); - break; - - default: - netdev_err(ndev, - "unhandled packet type %d, " - "ti
[PATCH 2/2] Drivers: hv: utils: fix a race on userspace daemons registration
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Background: userspace daemons registration protocol for Hyper-V utilities drivers has two steps: 1) daemon writes its own version to kernel 2) kernel reads it and replies with module version at this point we consider the handshake procedure being completed and we do hv_poll_channel() transitioning the utility device to HVUTIL_READY state. At this point we're ready to handle messages from kernel. When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a single buffer for outgoing message. hvutil_transport_send() puts to this buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT to all consequent calls. Host<->guest protocol guarantees there is no more than one request at a time and we will not get new requests till we reply to the previous one so this single message buffer is enough. Now to the race. When we finish negotiation procedure and send kernel module version to userspace with hvutil_transport_send() it goes into the above mentioned buffer and if the daemon is slow enough to read it from there we can get a collision when a request from the host comes, we won't be able to put anything to the buffer so the request will be lost. To solve the issue we need to know when the negotiation is really done (when the version message is read by the daemon) and transition to HVUTIL_READY state after this happens. Implement a callback on read to support this. Old style netlink communication is not affected by the change, we don't really know when these messages are delivered but we don't have a single message buffer there. Reported-by: Barry Davis <barry_da...@stormagic.com> Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/hv_fcopy.c | 14 ++ drivers/hv/hv_kvp.c | 27 --- drivers/hv/hv_snapshot.c| 16 +++- drivers/hv/hv_utils_transport.c | 15 ++- drivers/hv/hv_utils_transport.h |4 +++- 5 files changed, 54 insertions(+), 22 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index 23c7079..8b2ba98 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -83,6 +83,12 @@ static void fcopy_timeout_func(struct work_struct *dummy) hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); } +static void fcopy_register_done(void) +{ + pr_debug("FCP: userspace daemon registered\n"); + hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); +} + static int fcopy_handle_handshake(u32 version) { u32 our_ver = FCOPY_CURRENT_VERSION; @@ -94,7 +100,8 @@ static int fcopy_handle_handshake(u32 version) break; case FCOPY_VERSION_1: /* Daemon expects us to reply with our own version */ - if (hvutil_transport_send(hvt, _ver, sizeof(our_ver))) + if (hvutil_transport_send(hvt, _ver, sizeof(our_ver), + fcopy_register_done)) return -EFAULT; dm_reg_value = version; break; @@ -107,8 +114,7 @@ static int fcopy_handle_handshake(u32 version) */ return -EINVAL; } - pr_debug("FCP: userspace daemon ver. %d registered\n", version); - hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper); + pr_debug("FCP: userspace daemon ver. 
%d connected\n", version); return 0; } @@ -161,7 +167,7 @@ static void fcopy_send_data(struct work_struct *dummy) } fcopy_transaction.state = HVUTIL_USERSPACE_REQ; - rc = hvutil_transport_send(hvt, out_src, out_len); + rc = hvutil_transport_send(hvt, out_src, out_len, NULL); if (rc) { pr_debug("FCP: failed to communicate to the daemon: %d\n", rc); if (cancel_delayed_work_sync(_timeout_work)) { diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index cb1a916..5e1fdc8 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -102,6 +102,17 @@ static void kvp_poll_wrapper(void *channel) hv_kvp_onchannelcallback(channel); } +static void kvp_register_done(void) +{ + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. +*/ + pr_debug("KVP: userspace daemon registered\n"); + cancel_delayed_work_sync(_host_handshake_work); + hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); +} + static void kvp_register(int reg_value) { @@ -116,7 +127,8 @@ kvp_register(int reg_value) kvp_msg->kvp_hdr.operation = reg_value; strcpy(version, HV_DRV_VERSION); - hvutil_transport_send(hvt, kvp_msg, sizeof(*kvp_msg)); + hvutil_transport_send(hvt,
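As a rough illustration of the read-callback idea in the commit message above, here is a minimal user-space sketch in C. It is not the driver code: struct transport, transport_send(), transport_read(), register_done() and the ready flag are hypothetical stand-ins for the chardev transport, hvutil_transport_send(), hvt_op_read(), the per-driver *_register_done() helpers and the HVUTIL_READY state. It assumes a single-slot buffer that rejects a second send with -EFAULT until the first message has been read, as the message describes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical single-slot transport, loosely modelled on CHARDEV mode. */
struct transport {
	char buf[BUF_SIZE];
	size_t len;			/* 0 means the slot is free */
	void (*on_read)(void);		/* run once the daemon consumes buf */
};

static bool ready;			/* stands in for HVUTIL_READY */

/* Hypothetical callback: only now is the handshake really finished. */
static void register_done(void)
{
	ready = true;
}

/* Queue one message; fails while the previous one is still unread. */
static int transport_send(struct transport *t, const void *msg, size_t len,
			  void (*on_read)(void))
{
	if (len == 0 || len > BUF_SIZE)
		return -EINVAL;
	if (t->len != 0)
		return -EFAULT;
	memcpy(t->buf, msg, len);
	t->len = len;
	t->on_read = on_read;
	return 0;
}

/* Daemon side: consume the message, then fire the callback (if any). */
static size_t transport_read(struct transport *t, void *out, size_t cap)
{
	size_t n = t->len < cap ? t->len : cap;
	void (*cb)(void) = t->on_read;

	memcpy(out, t->buf, n);
	t->len = 0;
	t->on_read = NULL;
	if (cb)
		cb();
	return n;
}
```

In this sketch the ready flag flips only inside transport_read(), so a host request cannot be queued on top of the unread version reply. Regular requests would pass NULL as the callback, matching the NULL argument added to the fcopy_send_data() call in the diff.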
[PATCH 1/2] Drivers: hv: get rid of timeout in vmbus_open()
From: Vitaly Kuznetsov <vkuzn...@redhat.com>

vmbus_teardown_gpadl() can result in an infinite wait when it is called
after the 5 second timeout in vmbus_open() expires. The issue is caused by
the fact that the gpadl teardown operation won't ever succeed for an opened
channel, and the timeout isn't always enough. As a guest, we can always
trust the host to respond to our request (and there is nothing we can do if
it doesn't).

Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/channel.c |    7 +--
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index a68830c..9a88c63 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -73,7 +73,6 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 	void *in, *out;
 	unsigned long flags;
 	int ret, err = 0;
-	unsigned long t;
 	struct page *page;
 
 	spin_lock_irqsave(&newchannel->lock, flags);
@@ -183,11 +182,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 		goto error1;
 	}
 
-	t = wait_for_completion_timeout(&open_info->waitevent, 5*HZ);
-	if (t == 0) {
-		err = -ETIMEDOUT;
-		goto error1;
-	}
+	wait_for_completion(&open_info->waitevent);
 
 	spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
 	list_del(&open_info->msglistentry);
-- 
1.7.4.1
[PATCH 0/2] Drivers: hv: Fix some issues in both vmbus as well as util drivers
Fix a race in the registration of user space daemons. Also get rid of the
timeout in vmbus_open(), as timing out here would make it impossible to
roll back correctly.

Vitaly Kuznetsov (2):
  Drivers: hv: get rid of timeout in vmbus_open()
  Drivers: hv: utils: fix a race on userspace daemons registration

 drivers/hv/channel.c            |    7 +--
 drivers/hv/hv_fcopy.c           |   14 ++
 drivers/hv/hv_kvp.c             |   27 ---
 drivers/hv/hv_snapshot.c        |   16 +++-
 drivers/hv/hv_utils_transport.c |   15 ++-
 drivers/hv/hv_utils_transport.h |    4 +++-
 6 files changed, 55 insertions(+), 28 deletions(-)

-- 
1.7.4.1
[PATCH 2/3] Drivers: hv: get rid of redundant messagecount in create_gpadl_header()
From: Vitaly Kuznetsov <vkuzn...@redhat.com> We use messagecount only once in vmbus_establish_gpadl() to check if it is safe to iterate through the submsglist. We can just initialize the list header in all cases in create_gpadl_header() instead. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel.c | 38 -- 1 files changed, 16 insertions(+), 22 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index 56dd261..2b109e8 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -238,8 +238,7 @@ EXPORT_SYMBOL_GPL(vmbus_send_tl_connect_request); * create_gpadl_header - Creates a gpadl for the specified buffer */ static int create_gpadl_header(void *kbuffer, u32 size, -struct vmbus_channel_msginfo **msginfo, -u32 *messagecount) + struct vmbus_channel_msginfo **msginfo) { int i; int pagecount; @@ -283,7 +282,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, gpadl_header->range[0].pfn_array[i] = slow_virt_to_phys( kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; pfnsum = pfncount; pfnleft = pagecount - pfncount; @@ -323,7 +321,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, } msgbody->msgsize = msgsize; - (*messagecount)++; gpadl_body = (struct vmbus_channel_gpadl_body *)msgbody->msg; @@ -352,6 +349,8 @@ static int create_gpadl_header(void *kbuffer, u32 size, msgheader = kzalloc(msgsize, GFP_KERNEL); if (msgheader == NULL) goto nomem; + + INIT_LIST_HEAD(>submsglist); msgheader->msgsize = msgsize; gpadl_header = (struct vmbus_channel_gpadl_header *) @@ -366,7 +365,6 @@ static int create_gpadl_header(void *kbuffer, u32 size, kbuffer + PAGE_SIZE * i) >> PAGE_SHIFT; *msginfo = msgheader; - *messagecount = 1; } return 0; @@ -391,7 +389,6 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, struct vmbus_channel_gpadl_body *gpadl_body; struct vmbus_channel_msginfo *msginfo = NULL; struct 
vmbus_channel_msginfo *submsginfo; - u32 msgcount; struct list_head *curr; u32 next_gpadl_handle; unsigned long flags; @@ -400,7 +397,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, next_gpadl_handle = (atomic_inc_return(_connection.next_gpadl_handle) - 1); - ret = create_gpadl_header(kbuffer, size, , ); + ret = create_gpadl_header(kbuffer, size, ); if (ret) return ret; @@ -423,24 +420,21 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer, if (ret != 0) goto cleanup; - if (msgcount > 1) { - list_for_each(curr, >submsglist) { + list_for_each(curr, >submsglist) { + submsginfo = (struct vmbus_channel_msginfo *)curr; + gpadl_body = + (struct vmbus_channel_gpadl_body *)submsginfo->msg; - submsginfo = (struct vmbus_channel_msginfo *)curr; - gpadl_body = -(struct vmbus_channel_gpadl_body *)submsginfo->msg; + gpadl_body->header.msgtype = + CHANNELMSG_GPADL_BODY; + gpadl_body->gpadl = next_gpadl_handle; - gpadl_body->header.msgtype = - CHANNELMSG_GPADL_BODY; - gpadl_body->gpadl = next_gpadl_handle; + ret = vmbus_post_msg(gpadl_body, +submsginfo->msgsize - +sizeof(*submsginfo)); + if (ret != 0) + goto cleanup; - ret = vmbus_post_msg(gpadl_body, - submsginfo->msgsize - - sizeof(*submsginfo)); - if (ret != 0) - goto cleanup; - - } } wait_for_completion(>waitevent); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 3/3] Drivers: hv: don't leak memory in vmbus_establish_gpadl()
From: Vitaly Kuznetsov <vkuzn...@redhat.com>

In some cases create_gpadl_header() allocates submessages but we never free
them.

Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/channel.c |    6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 2b109e8..a68830c 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -388,7 +388,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, void *kbuffer,
 	struct vmbus_channel_gpadl_header *gpadlmsg;
 	struct vmbus_channel_gpadl_body *gpadl_body;
 	struct vmbus_channel_msginfo *msginfo = NULL;
-	struct vmbus_channel_msginfo *submsginfo;
+	struct vmbus_channel_msginfo *submsginfo, *tmp;
 	struct list_head *curr;
 	u32 next_gpadl_handle;
 	unsigned long flags;
@@ -445,6 +445,10 @@ cleanup:
 	spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
 	list_del(&msginfo->msglistentry);
 	spin_unlock_irqrestore(&vmbus_connection.channelmsg_lock, flags);
+	list_for_each_entry_safe(submsginfo, tmp, &msginfo->submsglist,
+				 msglistentry) {
+		kfree(submsginfo);
+	}
 
 	kfree(msginfo);
 	return ret;
-- 
1.7.4.1
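The cleanup pattern in this patch (free every submessage, then the header) can be sketched in plain user-space C. This is only an illustration under stated assumptions: struct submsg, struct msginfo, msginfo_create() and msginfo_destroy() are hypothetical, and a hand-rolled singly linked list replaces the kernel's struct list_head. The point it tries to show is why a "safe" iteration is needed: the next pointer has to be saved before the current node is freed.

```c
#include <stdlib.h>

/* Hypothetical, simplified message node; the driver uses struct list_head. */
struct submsg {
	struct submsg *next;
	int payload;
};

struct msginfo {
	struct submsg *submsgs;	/* head of the sub-message list, may be NULL */
};

/* Allocate a header with up to n sub-messages; NULL if the header fails. */
static struct msginfo *msginfo_create(int n)
{
	struct msginfo *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	for (int i = 0; i < n; i++) {
		struct submsg *s = calloc(1, sizeof(*s));

		if (!s)
			break;	/* caller still frees what was allocated */
		s->payload = i;
		s->next = m->submsgs;
		m->submsgs = s;
	}
	return m;
}

/*
 * Free every sub-message, then the header. The next pointer is saved
 * before each free(), which mirrors what a _safe list iterator does.
 * Returns the number of sub-messages freed.
 */
static int msginfo_destroy(struct msginfo *m)
{
	int freed = 0;
	struct submsg *s, *tmp;

	if (!m)
		return 0;
	for (s = m->submsgs; s; s = tmp) {
		tmp = s->next;
		free(s);
		freed++;
	}
	free(m);
	return freed;
}
```

Freeing only the header, as the code did before the patch, would be the equivalent of calling free(m) without the loop, which leaks every node still linked from it.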
[PATCH 1/3] Drivers: hv: avoid vfree() on crash
From: Vitaly Kuznetsov <vkuzn...@redhat.com> When we crash from NMI context (e.g. after NMI injection from host when 'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit kernel BUG at mm/vmalloc.c:1530! as vfree() is denied. While the issue could be solved with in_nmi() check instead I opted for skipping vfree on all sorts of crashes to reduce the amount of work which can cause consequent crashes. We don't really need to free anything on crash. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv.c |8 +--- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|8 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index a1c086b..60dbd6c 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -278,7 +278,7 @@ cleanup: * * This routine is called normally during driver unloading or exiting. */ -void hv_cleanup(void) +void hv_cleanup(bool crash) { union hv_x64_msr_hypercall_contents hypercall_msr; @@ -288,7 +288,8 @@ void hv_cleanup(void) if (hv_context.hypercall_page) { hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - vfree(hv_context.hypercall_page); + if (!crash) + vfree(hv_context.hypercall_page); hv_context.hypercall_page = NULL; } @@ -308,7 +309,8 @@ void hv_cleanup(void) hypercall_msr.as_uint64 = 0; wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64); - vfree(hv_context.tsc_page); + if (!crash) + vfree(hv_context.tsc_page); hv_context.tsc_page = NULL; } #endif diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 718b5c7..dfa9fac 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -495,7 +495,7 @@ struct hv_ring_buffer_debug_info { extern int hv_init(void); -extern void hv_cleanup(void); +extern void hv_cleanup(bool crash); extern int hv_post_message(union hv_connection_id connection_id, enum hv_message_type message_type, diff --git a/drivers/hv/vmbus_drv.c 
b/drivers/hv/vmbus_drv.c index 952f20f..d11690e 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -871,7 +871,7 @@ err_alloc: bus_unregister(_bus); err_cleanup: - hv_cleanup(); + hv_cleanup(false); return ret; } @@ -1323,7 +1323,7 @@ static void hv_kexec_handler(void) vmbus_initiate_unload(false); for_each_online_cpu(cpu) smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); - hv_cleanup(); + hv_cleanup(false); }; static void hv_crash_handler(struct pt_regs *regs) @@ -1335,7 +1335,7 @@ static void hv_crash_handler(struct pt_regs *regs) * for kdump. */ hv_synic_cleanup(NULL); - hv_cleanup(); + hv_cleanup(true); }; static int __init hv_acpi_init(void) @@ -1395,7 +1395,7 @@ static void __exit vmbus_exit(void) _panic_block); } bus_unregister(_bus); - hv_cleanup(); + hv_cleanup(false); for_each_online_cpu(cpu) { tasklet_kill(hv_context.event_dpc[cpu]); smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
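The shape of the change above, a cleanup routine that takes a crash flag and skips releasing memory on the crash path, can be sketched in user-space C. The names struct ctx and ctx_cleanup() are hypothetical, and free() merely stands in for vfree(); the MSR writes done by the real hv_cleanup() are left out.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two pages tracked in hv_context. */
struct ctx {
	void *hypercall_page;
	void *tsc_page;
};

/*
 * Tear down the context. On the crash path the memory is deliberately
 * not released; only the pointers are cleared. Returns how many buffers
 * were actually freed.
 */
static int ctx_cleanup(struct ctx *c, bool crash)
{
	int freed = 0;

	if (c->hypercall_page) {
		if (!crash) {
			free(c->hypercall_page);
			freed++;
		}
		c->hypercall_page = NULL;
	}
	if (c->tsc_page) {
		if (!crash) {
			free(c->tsc_page);
			freed++;
		}
		c->tsc_page = NULL;
	}
	return freed;
}
```

As in the patch, normal unload and kexec callers would pass false, and only the crash handler would pass true.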
[PATCH 0/3] Drivers: hv: vmbus: Some miscellaneous fixes
Some miscellaneous fixes.

Vitaly Kuznetsov (3):
  Drivers: hv: avoid vfree() on crash
  Drivers: hv: get rid of redundant messagecount in create_gpadl_header()
  Drivers: hv: don't leak memory in vmbus_establish_gpadl()

 drivers/hv/channel.c      |   44 +---
 drivers/hv/hv.c           |    8 +---
 drivers/hv/hyperv_vmbus.h |    2 +-
 drivers/hv/vmbus_drv.c    |    8
 4 files changed, 31 insertions(+), 31 deletions(-)

-- 
1.7.4.1
[PATCH RESEND 1/5] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. 
+*/ + cancel_delayed_work_sync(_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(_host_handshake_work); cancel_delayed_work_sync(_timeout_work); cancel_work_sync(_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5c586f..e5203e4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is s
[PATCH RESEND 3/5] Drivers: hv: balloon: don't crash when memory is added in non-sorted order
From: Vitaly Kuznetsov <vkuzn...@redhat.com> When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_balloon.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index b853b4b..43af913 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -714,7 +714,7 @@ static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; /* * If the current hot add-request extends beyond @@ -768,7 +768,7 @@ static unsigned long handle_pg_range(unsigned long pg_start, * If the pfn range we are dealing with is not in the current * "hot add block", move on. */ - if ((start_pfn >= has->end_pfn)) + if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn) continue; old_covered_state = has->covered_end_pfn; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
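A minimal user-space sketch of why the stricter check matters, assuming two regions that were hot-added out of order. The struct, the helper names (old_skip, new_skip, find_region) and the PFN values are illustrative and are not taken from hv_balloon.c; only the two comparison expressions mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the driver's hot-add region state. */
struct ha_region {
	unsigned long start_pfn; /* first PFN covered by the region */
	unsigned long end_pfn;   /* one past the last PFN of the region */
};

/* Old check: correct only if regions are visited in ascending order. */
bool old_skip(const struct ha_region *has, unsigned long start_pfn)
{
	return start_pfn >= has->end_pfn;
}

/* New check: skip unless start_pfn lies inside [start_pfn, end_pfn). */
bool new_skip(const struct ha_region *has, unsigned long start_pfn)
{
	return start_pfn < has->start_pfn || start_pfn >= has->end_pfn;
}

/* Return the index of the first region that is not skipped, or -1. */
int find_region(const struct ha_region *regions, size_t n,
		unsigned long start_pfn,
		bool (*skip)(const struct ha_region *, unsigned long))
{
	assert(regions != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!skip(&regions[i], start_pfn))
			return (int)i;
	}
	return -1;
}

/* Regions in the order a host might add them: the higher range first. */
const struct ha_region unsorted_regions[] = {
	{ 0x20000, 0x28000 },
	{ 0x10000, 0x18000 },
};
```

For PFN 0x10400, find_region() with old_skip stops at the first entry even though that region does not contain the PFN, while new_skip moves on to the second entry, which does.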
[PATCH RESEND 5/5] tools: hv: lsvmbus: add pci pass-through UUID
From: Vitaly Kuznetsov <vkuzn...@redhat.com> lsvmbus keeps its own copy of all VMBus UUIDs; add the PCIe pass-through device there so that such devices are not reported as 'Unknown'. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- tools/hv/lsvmbus |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/tools/hv/lsvmbus b/tools/hv/lsvmbus index 162a378..e8fecd6 100644 --- a/tools/hv/lsvmbus +++ b/tools/hv/lsvmbus @@ -35,6 +35,7 @@ vmbus_dev_dict = { '{ba6163d9-04a1-4d29-b605-72e2ffb1dc7f}' : 'Synthetic SCSI Controller', '{2f9bcc4a-0069-4af3-b76b-6fd0be528cda}' : 'Synthetic fiber channel adapter', '{8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501}' : 'Synthetic RDMA adapter', + '{44c4f61d-4444-4400-9d52-802e27ede19f}' : 'PCI Express pass-through', '{276aacf4-ac15-426c-98dd-7521ad3f01fe}' : '[Reserved system device]', '{f8e65716-3cb3-4a06-9a60-1889c5cccab5}' : '[Reserved system device]', '{3375baf4-9e15-4b30-b765-67acb10d607b}' : '[Reserved system device]', -- 1.7.4.1
[PATCH RESEND 2/5] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Kdump keeps biting. It turns out that CHANNELMSG_UNLOAD_RESPONSE is always delivered either to the CPU which was used for initial contact or to CPU0, depending on the host version. vmbus_wait_for_unload() doesn't account for the fact that when we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message, so our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case the interrupt handler is still alive we'll get the confirmation we need. 2) Read the message pages of all CPUs, as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered. We can race with a still-alive interrupt handler doing the same, so add cmpxchg() to vmbus_signal_eom() to not lose the CHANNELMSG_UNLOAD_RESPONSE message. 3) Clean up message pages on all CPUs. This is required (at least for the current CPU, as we're clearing CPU0 messages now, but we may want to bring up additional CPUs on crash) as new messages won't be delivered until we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y.
Srinivasan <k...@microsoft.com> --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. 
+*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(&vmbus_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(&vmbus_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *msg) +sta
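The reason for adding cmpxchg() to vmbus_signal_eom() can be illustrated outside the kernel with C11 atomics. This is a sketch under assumptions, not the driver code: struct msg_slot, MSG_NONE and signal_eom() are hypothetical stand-ins, and atomic_compare_exchange_strong() plays the role of the kernel's cmpxchg(). The point is that a slot is freed only if it still holds the message type the caller actually processed, so a message that arrived in the meantime is not wiped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_NONE 0u /* stands in for HVMSG_NONE */

/* Hypothetical, simplified message slot; the real struct hv_message differs. */
struct msg_slot {
	_Atomic uint32_t message_type;
};

/*
 * Free the slot only if it still holds the message type we processed.
 * Returns true when this caller freed the slot, and false when the slot
 * content changed underneath us (for example because a still-alive
 * interrupt handler already consumed the message).
 */
bool signal_eom(struct msg_slot *slot, uint32_t old_type)
{
	uint32_t expected = old_type;

	assert(slot != NULL);
	return atomic_compare_exchange_strong(&slot->message_type,
					      &expected, MSG_NONE);
}
```

A caller that loses the race gets false back and leaves the slot untouched, which is the property the commit message relies on to avoid losing CHANNELMSG_UNLOAD_RESPONSE.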
[PATCH RESEND 4/5] Drivers: hv: balloon: reset host_specified_ha_region
From: Vitaly Kuznetsov <vkuzn...@redhat.com> We set host_specified_ha_region = true on certain requests, but this is global state which stays 'true' forever. We need to reset it when we receive a request where ha_region is not specified. I did not see any real issues; the bug was found by code inspection. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_balloon.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 43af913..df35fb7 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1400,6 +1400,7 @@ static void balloon_onchannelcallback(void *context) * This is a normal hot-add request specifying * hot-add memory. */ + dm->host_specified_ha_region = false; ha_pg_range = &ha_msg->range; dm->ha_wrk.ha_page_range = *ha_pg_range; dm->ha_wrk.ha_region_range.page_range = 0; -- 1.7.4.1
[PATCH RESEND 0/5] Drivers: hv: Some miscellaneous fixes
Some miscellaneous fixes. All these patches are being resent. Vitaly Kuznetsov (5): Drivers: hv: kvp: fix IP Failover Drivers: hv: vmbus: handle various crash scenarios Drivers: hv: balloon: don't crash when memory is added in non-sorted order Drivers: hv: balloon: reset host_specified_ha_region tools: hv: lsvmbus: add pci pass-through UUID drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hv_balloon.c |5 ++- drivers/hv/hv_kvp.c | 31 drivers/hv/hyperv_vmbus.h | 21 ++- drivers/hv/vmbus_drv.c|7 +++-- tools/hv/lsvmbus |1 + 6 files changed, 101 insertions(+), 22 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next V5 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different device IDs when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V5: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV 0x1530 +#define IXGBE_DEV_ID_X550_VF_HV 0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK 7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
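As a small illustration of how the new IDs could be told apart from the existing ones, here is a sketch of a lookup helper. The four macro values are copied from the defines.h hunk in this patch; the function is_hyperv_vf_id() is hypothetical and is not part of the driver, which selects its behaviour through the PCI device table instead.

```c
#include <assert.h> /* for callers that want to assert on the result */
#include <stdbool.h>
#include <stdint.h>

/* Device IDs presented on Hyper-V (values from the defines.h hunk). */
#define IXGBE_DEV_ID_82599_VF_HV    0x152E
#define IXGBE_DEV_ID_X540_VF_HV     0x1530
#define IXGBE_DEV_ID_X550_VF_HV     0x1564
#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9

/* Hypothetical helper: true if dev_id is one of the Hyper-V VF IDs. */
bool is_hyperv_vf_id(uint16_t dev_id)
{
	switch (dev_id) {
	case IXGBE_DEV_ID_82599_VF_HV:
	case IXGBE_DEV_ID_X540_VF_HV:
	case IXGBE_DEV_ID_X550_VF_HV:
	case IXGBE_DEV_ID_X550EM_X_VF_HV:
		return true;
	default:
		return false;
	}
}
```

The neighbouring non-Hyper-V IDs visible in the same hunk, 0x1565 and 0x15A8, are rejected by this helper.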
[PATCH net-next V5 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is via a software-mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> V4: Addressed kbuild errors reported by: kbuild test robot <l...@intel.com> V5: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6
+505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = &ixgbevf_82599_vf_info, - [board_X540_vf] = &ixgbevf_X540_vf_info, - [board_X550_vf] = &ixgbevf_X550_vf_info, - [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_82599_vf] = &ixgbevf_82599_vf_info, + [board_82599_vf_hv] = &ixgbevf_82599_vf_hv_info, + [board_X540_vf] = &ixgbevf_X540_vf_info, + [board_X540_vf_hv] = &ixgbevf_X540_vf_hv_info, + [board_X550_vf] = &ixgbevf_X550_vf_info, + [board_X550_vf_hv] = &ixgbevf_X550_vf_hv_info, + [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = &ixgbevf_X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void
ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx He
[PATCH net-next V5 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next V4 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different device IDs when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V4: No change from V1 drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV 0x1530 +#define IXGBE_DEV_ID_X550_VF_HV 0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK 7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V4 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is via a software-mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> V4: Addressed kbuild errors reported by: kbuild test robot <l...@intel.com> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void
ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = &ixgbevf_82599_vf_info, - [board_X540_vf] = &ixgbevf_X540_vf_info, - [board_X550_vf] = &ixgbevf_X550_vf_info, - [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_82599_vf] = &ixgbevf_82599_vf_info, + [board_82599_vf_hv] = &ixgbevf_82599_vf_hv_info, + [board_X540_vf] = &ixgbevf_X540_vf_info, + [board_X540_vf_hv] = &ixgbevf_X540_vf_hv_info, + [board_X550_vf] = &ixgbevf_X550_vf_info, + [board_X550_vf_hv] = &ixgbevf_X550_vf_hv_info, + [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = &ixgbevf_X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter)
ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ - ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +20
[PATCH net-next V4 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next V3 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different device IDs when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: No change from V1. V3: No change from V2. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV 0x1530 +#define IXGBE_DEV_ID_X550_VF_HV 0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK 7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V3 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 271 insertions(+), 7 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next V3 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is via a software-mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. V3: Addressed additional comments from Alexander Duyck <alexander.du...@gmail.com> --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 31 +++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 216 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 266 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct
ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c4bb480 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = &ixgbevf_82599_vf_info, - [board_X540_vf] = &ixgbevf_X540_vf_info, - [board_X550_vf] = &ixgbevf_X550_vf_info, - [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_82599_vf] = &ixgbevf_82599_vf_info, + [board_82599_vf_hv] = &ixgbevf_82599_vf_hv_info, + [board_X540_vf] = &ixgbevf_X540_vf_info, + [board_X540_vf_hv] = &ixgbevf_X540_vf_hv_info, + [board_X550_vf] = &ixgbevf_X550_vf_info, + [board_X550_vf_hv] = &ixgbevf_X550_vf_hv_info, + [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = &ixgbevf_X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1795,7 +1803,10 @@ static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter) ixgbevf_setup_vfmrqc(adapter); /* notify the PF of our intent to use this size of frame */ -
ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + if (!ixgbevf_on_hyperv(hw)) + ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); + else + ixgbevf_hv_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN); /* Setup the HW Rx Head and Tail Descriptor Pointers and * the Base and Length of the Rx Descriptor Ring @@ -2056,7 +2067,10 @@ static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter) spin_l
[PATCH net-next V2 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different device IDs when running on Hyper-V. Add the device IDs presented while running on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: No change from V1. drivers/net/ethernet/intel/ixgbevf/defines.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h index 5843458..1306a0d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -33,6 +33,11 @@ #define IXGBE_DEV_ID_X550_VF 0x1565 #define IXGBE_DEV_ID_X550EM_X_VF 0x15A8 +#define IXGBE_DEV_ID_82599_VF_HV 0x152E +#define IXGBE_DEV_ID_X540_VF_HV 0x1530 +#define IXGBE_DEV_ID_X550_VF_HV 0x1564 +#define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9 + #define IXGBE_VF_IRQ_CLEAR_MASK 7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 -- 1.7.4.1
[PATCH net-next V2 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is via a software-mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- V2: Addressed most of the comments from Alexander Duyck <alexander.du...@gmail.com> and Rustad, Mark D <mark.d.rus...@intel.com>. drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 4 files changed, 237 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..3296d27 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; @@ -494,6 +505,7 @@ void ixgbevf_free_rx_resources(struct ixgbevf_ring *); void ixgbevf_free_tx_resources(struct ixgbevf_ring *); void ixgbevf_update_stats(struct ixgbevf_adapter *adapter); int ethtool_ioctl(struct ifreq *ifr); +bool ixgbevf_on_hyperv(struct ixgbe_hw *hw); extern void
ixgbevf_write_eitr(struct ixgbevf_q_vector *q_vector); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..c761d80 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -62,10 +62,14 @@ static char ixgbevf_copyright[] = "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = &ixgbevf_82599_vf_info, - [board_X540_vf] = &ixgbevf_X540_vf_info, - [board_X550_vf] = &ixgbevf_X550_vf_info, - [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_82599_vf] = &ixgbevf_82599_vf_info, + [board_82599_vf_hv] = &ixgbevf_82599_vf_hv_info, + [board_X540_vf] = &ixgbevf_X540_vf_info, + [board_X540_vf_hv] = &ixgbevf_X540_vf_hv_info, + [board_X550_vf] = &ixgbevf_X550_vf_info, + [board_X550_vf_hv] = &ixgbevf_X550_vf_hv_info, + [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = &ixgbevf_X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +82,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; diff --git a/drivers/net/ethernet/intel/ixgbevf/mbx.c b/drivers/net/ethernet/intel/ixgbevf/mbx.c index dc68fea..298a0da 100644 --- a/drivers/net/ethernet/intel/ixgbevf/mbx.c +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.c @@ -346,3 +346,15 @@ const struct ixgbe_mbx_operations ixgbevf_mbx_ops = { .check_for_rst
= ixgbevf_check_for_rst_vf, }; +/** + * Mailbox operations when running on Hyper-V. + * On Hyper-V, PF/VF communiction is not through the + * hardware mailbox; this communication is through + * a software mediated path. + * Most mail box operations are noop while running on + * Hyper-V. + */ +const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops = { + .init_params= ixgbevf_init_mbx_params_vf, + .check_for_rst = ixgbevf_check_for_rst_vf, +}; diff --git a/drivers/net/ethernet/intel/ixgbevf/vf.c b/drivers/net/ethernet/intel/ixgbevf/vf.c index 4d613a4..1ec13c1 100644 ---
[PATCH net-next V2 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 12 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 16 ++- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 201 + 5 files changed, 242 insertions(+), 4 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Drivers/hv
Greg,

Some time back I had sent a bunch of patches for Hyper-V drivers. Are they still in the queue, or should I resend them?

Regards,
K. Y
[PATCH net-next 2/2] intel: ixgbevf: Support Windows hosts (Hyper-V)
On Hyper-V, the VF/PF communication is a via software mediated path as opposed to the hardware mailbox. Make the necessary adjustments to support Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 5 files changed, 201 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h index 5ac60ee..f8d2a0b 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -460,9 +460,13 @@ enum ixbgevf_state_t { enum ixgbevf_boards { board_82599_vf, + board_82599_vf_hv, board_X540_vf, + board_X540_vf_hv, board_X550_vf, + board_X550_vf_hv, board_X550EM_x_vf, + board_X550EM_x_vf_hv, }; enum ixgbevf_xcast_modes { @@ -477,6 +481,13 @@ extern const struct ixgbevf_info ixgbevf_X550_vf_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_info; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; + +extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops; + /* needed by ethtool.c */ extern const char ixgbevf_driver_name[]; extern const char ixgbevf_driver_version[]; diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 007cbe0..4a0ffac 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -49,6 +49,7 @@ #include #include #include +#include #include "ixgbevf.h" @@ -62,10 +63,14 @@ static char ixgbevf_copyright[] 
= "Copyright (c) 2009 - 2015 Intel Corporation."; static const struct ixgbevf_info *ixgbevf_info_tbl[] = { - [board_82599_vf] = _82599_vf_info, - [board_X540_vf] = _X540_vf_info, - [board_X550_vf] = _X550_vf_info, - [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_82599_vf]= _82599_vf_info, + [board_82599_vf_hv] = _82599_vf_hv_info, + [board_X540_vf] = _X540_vf_info, + [board_X540_vf_hv] = _X540_vf_hv_info, + [board_X550_vf] = _X550_vf_info, + [board_X550_vf_hv] = _X550_vf_hv_info, + [board_X550EM_x_vf] = _X550EM_x_vf_info, + [board_X550EM_x_vf_hv] = _X550EM_x_vf_hv_info, }; /* ixgbevf_pci_tbl - PCI Device ID Table @@ -78,9 +83,13 @@ static const struct ixgbevf_info *ixgbevf_info_tbl[] = { */ static const struct pci_device_id ixgbevf_pci_tbl[] = { {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF), board_82599_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_82599_VF_HV), board_82599_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF), board_X540_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X540_VF_HV), board_X540_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF), board_X550_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550_VF_HV), board_X550_vf_hv }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, /* required last entry */ {0, } }; @@ -1809,12 +1818,13 @@ static int ixgbevf_vlan_rx_add_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* add VID to filter table */ - err = hw->mac.ops.set_vfta(hw, vid, 0, true); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, true); spin_unlock_bh(>mbx_lock); @@ -1835,12 +1845,13 @@ static int ixgbevf_vlan_rx_kill_vid(struct net_device *netdev, { struct ixgbevf_adapter *adapter = netdev_priv(netdev); struct ixgbe_hw *hw = >hw; - int err; + int err = 0; spin_lock_bh(>mbx_lock); /* remove VID from filter table */ - err = 
hw->mac.ops.set_vfta(hw, vid, 0, false); + if (hw->mac.ops.set_vfta) + err = hw->mac.ops.set_vfta(hw, vid, 0, false); spin_unlock_bh(>mbx_lock); @@ -1873,14 +1884,16 @@ static int ixgbevf_write_uc_addr_list(struct net_device *netdev) struct netdev_hw_addr *ha; netdev_for_each_uc_addr(ha, netdev) { - hw->mac.ops.set_uc_addr(hw, ++coun
[PATCH net-next 1/2] ethernet: intel: Add the device ID's presented while running on Hyper-V
Intel SR-IOV cards present different device IDs when running on Hyper-V. Add the device IDs presented while running on Hyper-V.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/net/ethernet/intel/ixgbevf/defines.h |    5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/defines.h b/drivers/net/ethernet/intel/ixgbevf/defines.h
index 5843458..1306a0d 100644
--- a/drivers/net/ethernet/intel/ixgbevf/defines.h
+++ b/drivers/net/ethernet/intel/ixgbevf/defines.h
@@ -33,6 +33,11 @@
 #define IXGBE_DEV_ID_X550_VF		0x1565
 #define IXGBE_DEV_ID_X550EM_X_VF	0x15A8
 
+#define IXGBE_DEV_ID_82599_VF_HV	0x152E
+#define IXGBE_DEV_ID_X540_VF_HV		0x1530
+#define IXGBE_DEV_ID_X550_VF_HV		0x1564
+#define IXGBE_DEV_ID_X550EM_X_VF_HV	0x15A9
+
 #define IXGBE_VF_IRQ_CLEAR_MASK	7
 #define IXGBE_VF_MAX_TX_QUEUES	8
 #define IXGBE_VF_MAX_RX_QUEUES	8
-- 
1.7.4.1
[PATCH net-next 0/2] ethernet: intel: Support Hyper-V hosts
Make adjustments to the Intel 10G VF driver to support running on Hyper-V hosts. K. Y. Srinivasan (2): ethernet: intel: Add the device ID's presented while running on Hyper-V intel: ixgbevf: Support Windows hosts (Hyper-V) drivers/net/ethernet/intel/ixgbevf/defines.h |5 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 11 ++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 56 ++--- drivers/net/ethernet/intel/ixgbevf/mbx.c | 12 ++ drivers/net/ethernet/intel/ixgbevf/vf.c | 138 + drivers/net/ethernet/intel/ixgbevf/vf.h |2 + 6 files changed, 206 insertions(+), 18 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH net-next 1/1] hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to the guest has an associated synthetic interface that shares the MAC address with the VF instance. Typically these are bonded together to support live migration. By default, the host delivers all the incoming packets on the synthetic interface. Once the VF is up, we need to explicitly switch the data path on the host to divert traffic onto the VF interface. Even after switching the data path, broadcast and multicast packets are always delivered on the synthetic interface and these will have to be injected back onto the VF interface (if VF is up). This patch implements the necessary support in netvsc to support Linux VF drivers. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reviewed-by: Haiyang Zhang <haiya...@microsoft.com> --- drivers/net/hyperv/hyperv_net.h | 14 ++ drivers/net/hyperv/netvsc.c | 29 drivers/net/hyperv/netvsc_drv.c | 312 +--- drivers/net/hyperv/rndis_filter.c |6 + 4 files changed, 335 insertions(+), 26 deletions(-) diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index 8b3bd8e..6700a4d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -202,6 +202,8 @@ int rndis_filter_receive(struct hv_device *dev, int rndis_filter_set_packet_filter(struct rndis_device *dev, u32 new_filter); int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac); +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf); + #define NVSP_INVALID_PROTOCOL_VERSION ((u32)0x) #define NVSP_PROTOCOL_VERSION_12 @@ -641,6 +643,12 @@ struct netvsc_reconfig { u32 event; }; +struct garp_wrk { + struct work_struct dwrk; + struct net_device *netdev; + struct netvsc_device *netvsc_dev; +}; + /* The context of the netvsc device */ struct net_device_context { /* point back to our device context */ @@ -656,6 +664,7 @@ struct net_device_context { struct work_struct work; u32 msg_enable; /* debug level */ + struct garp_wrk gwrk; struct 
netvsc_stats __percpu *tx_stats; struct netvsc_stats __percpu *rx_stats; @@ -730,6 +739,11 @@ struct netvsc_device { u32 vf_alloc; /* Serial number of the VF to team with */ u32 vf_serial; + atomic_t open_cnt; + /* State to manage the associated VF interface. */ + bool vf_inject; + struct net_device *vf_netdev; + atomic_t vf_use_cnt; }; /* NdisInitialize message */ diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ec313fc..eddce3c 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -33,6 +33,30 @@ #include "hyperv_net.h" +/* + * Switch the data path from the synthetic interface to the VF + * interface. + */ +void netvsc_switch_datapath(struct netvsc_device *nv_dev, bool vf) +{ + struct nvsp_message *init_pkt = _dev->channel_init_pkt; + struct hv_device *dev = nv_dev->dev; + + memset(init_pkt, 0, sizeof(struct nvsp_message)); + init_pkt->hdr.msg_type = NVSP_MSG4_TYPE_SWITCH_DATA_PATH; + if (vf) + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_VF; + else + init_pkt->msg.v4_msg.active_dp.active_datapath = + NVSP_DATAPATH_SYNTHETIC; + + vmbus_sendpacket(dev->channel, init_pkt, + sizeof(struct nvsp_message), + (unsigned long)init_pkt, + VM_PKT_DATA_INBAND, 0); +} + static struct netvsc_device *alloc_net_device(struct hv_device *device) { @@ -52,11 +76,16 @@ static struct netvsc_device *alloc_net_device(struct hv_device *device) init_waitqueue_head(_device->wait_drain); net_device->start_remove = false; net_device->destroy = false; + atomic_set(_device->open_cnt, 0); + atomic_set(_device->vf_use_cnt, 0); net_device->dev = device; net_device->ndev = ndev; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; + net_device->vf_netdev = NULL; + net_device->vf_inject = false; + hv_set_drvdata(device, net_device); return net_device; } diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index b8121eb..bfdb568a 100644 --- 
a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -610,42 +610,24 @@ void netvsc_linkstatus_callback(struct hv_device *device_obj, schedule_delayed_work(_ctx->dwork, 0); } -/* - * netvsc_recv_callback - Callback when we receive a packet from the - * "wire" on the specified device. - */ -int netvsc_recv_callback(struct hv_devi
[PATCH 2/8] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
Introduce separate functions for estimating how much can be read from and written to the ring buffer.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/ring_buffer.c |   25 -
 include/linux/hyperv.h   |   27 +++
 2 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index a40a73a..544362c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -38,8 +38,6 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi)
 
 u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 {
-	u32 read;
-	u32 write;
 
 	rbi->ring_buffer->interrupt_mask = 0;
 	mb();
@@ -49,9 +47,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 	 * If it is not, we raced and we need to process new
 	 * incoming messages.
 	 */
-	hv_get_ringbuffer_availbytes(rbi, &read, &write);
-
-	return read;
+	return hv_get_bytes_to_read(rbi);
 }
 
 /*
@@ -106,9 +102,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 {
 	u32 cur_write_sz;
-	u32 r_size;
-	u32 write_loc;
-	u32 read_loc = rbi->ring_buffer->read_index;
 	u32 pending_sz;
 
 	/*
@@ -125,14 +118,11 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	mb();
 
 	pending_sz = rbi->ring_buffer->pending_send_sz;
-	write_loc = rbi->ring_buffer->write_index;
 	/* If the other end is not blocked on write don't bother. */
 	if (pending_sz == 0)
 		return false;
 
-	r_size = rbi->ring_datasize;
-	cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) :
-			read_loc - write_loc;
+	cur_write_sz = hv_get_bytes_to_write(rbi);
 
 	if (cur_write_sz >= pending_sz)
 		return true;
@@ -332,7 +322,6 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 {
 	int i = 0;
 	u32 bytes_avail_towrite;
-	u32 bytes_avail_toread;
 	u32 totalbytes_towrite = 0;
 
 	u32 next_write_location;
@@ -348,9 +337,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 	if (lock)
 		spin_lock_irqsave(&outring_info->ring_lock, flags);
 
-	hv_get_ringbuffer_availbytes(outring_info,
-				&bytes_avail_toread,
-				&bytes_avail_towrite);
+	bytes_avail_towrite = hv_get_bytes_to_write(outring_info);
 
 	/*
 	 * If there is only room for the packet, assume it is full.
@@ -401,7 +388,6 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 		       void *buffer, u32 buflen, u32 *buffer_actual_len,
 		       u64 *requestid, bool *signal, bool raw)
 {
-	u32 bytes_avail_towrite;
 	u32 bytes_avail_toread;
 	u32 next_read_location = 0;
 	u64 prev_indices = 0;
@@ -417,10 +403,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 	*buffer_actual_len = 0;
 	*requestid = 0;
 
-	hv_get_ringbuffer_availbytes(inring_info,
-				&bytes_avail_toread,
-				&bytes_avail_towrite);
-
+	bytes_avail_toread = hv_get_bytes_to_read(inring_info);
 	/* Make sure there is something to read */
 	if (bytes_avail_toread < sizeof(desc)) {
 		/*
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index ecd81c3..a6b053c 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -151,6 +151,33 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
 	*read = dsize - *write;
 }
 
+static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi)
+{
+	u32 read_loc, write_loc, dsize, read;
+
+	dsize = rbi->ring_datasize;
+	read_loc = rbi->ring_buffer->read_index;
+	write_loc = READ_ONCE(rbi->ring_buffer->write_index);
+
+	read = write_loc >= read_loc ? (write_loc - read_loc) :
+		(dsize - read_loc) + write_loc;
+
+	return read;
+}
+
+static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi)
+{
+	u32 read_loc, write_loc, dsize, write;
+
+	dsize = rbi->ring_datasize;
+	read_loc = READ_ONCE(rbi->ring_buffer->read_index);
+	write_loc = rbi->ring_buffer->write_index;
+
+	write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
+		read_loc - write_loc;
+	return write;
+}
+
 /*
  * VMBUS version is 32 bit entity broken up into
  * two 16 bit quantities: major_number. minor_number.
-- 
1.7.4.1
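The index arithmetic in the two new helpers can be checked outside the kernel. The following is only a minimal userspace sketch: `struct ring_model` and the `ring_bytes_to_*` names are hypothetical stand-ins for `hv_ring_buffer_info` and the helpers in the patch, and the `READ_ONCE()` annotations are omitted because nothing here runs concurrently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring state used by the helpers above. */
struct ring_model {
	uint32_t read_index;
	uint32_t write_index;
	uint32_t datasize;	/* size of the data area in bytes */
};

/* Bytes the reader may consume: distance from read_index to write_index. */
static uint32_t ring_bytes_to_read(const struct ring_model *r)
{
	assert(r->read_index < r->datasize && r->write_index < r->datasize);
	return r->write_index >= r->read_index ?
		r->write_index - r->read_index :
		(r->datasize - r->read_index) + r->write_index;
}

/* Bytes the writer may produce: everything not currently readable. */
static uint32_t ring_bytes_to_write(const struct ring_model *r)
{
	assert(r->read_index < r->datasize && r->write_index < r->datasize);
	return r->write_index >= r->read_index ?
		r->datasize - (r->write_index - r->read_index) :
		r->read_index - r->write_index;
}
```

In this model the two results always add up to `datasize`, and equal indices mean an empty ring (zero bytes to read, `datasize` bytes to write).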
[PATCH 1/8] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. 
+*/ + cancel_delayed_work_sync(_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(_host_handshake_work); cancel_delayed_work_sync(_timeout_work); cancel_work_sync(_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 12321b9..8b07f9c 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is s
[PATCH 8/8] Drivers: hv: vmbus: handle various crash scenarios
From: Vitaly Kuznetsov <[mailto:vkuzn...@redhat.com]> Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/channel_mgmt.c | 58 +--- drivers/hv/hyperv_vmbus.h | 16 +++- drivers/hv/vmbus_drv.c|7 +++-- 3 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 38b682ba..b6c1211 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -597,27 +597,55 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type) static void vmbus_wait_for_unload(void) { - int cpu = smp_processor_id(); - void *page_addr = hv_context.synic_message_page[cpu]; - struct hv_message *msg = (struct hv_message *)page_addr + - VMBUS_MESSAGE_SINT; + int cpu; + void *page_addr; + struct hv_message *msg; struct vmbus_channel_message_header *hdr; - bool unloaded = false; + u32 message_type; + /* +* CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was +* used for initial contact or to CPU0 depending on host version. When +* we're crashing on a different CPU let's hope that IRQ handler on +* the cpu which receives CHANNELMSG_UNLOAD_RESPONSE is still +* functional and vmbus_unload_response() will complete +* vmbus_connection.unload_event. If not, the last thing we can do is +* read message pages for all CPUs directly. 
+*/ while (1) { - if (READ_ONCE(msg->header.message_type) == HVMSG_NONE) { - mdelay(10); - continue; - } + if (completion_done(_connection.unload_event)) + break; - hdr = (struct vmbus_channel_message_header *)msg->u.payload; - if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) - unloaded = true; + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + + VMBUS_MESSAGE_SINT; - vmbus_signal_eom(msg); + message_type = READ_ONCE(msg->header.message_type); + if (message_type == HVMSG_NONE) + continue; - if (unloaded) - break; + hdr = (struct vmbus_channel_message_header *) + msg->u.payload; + + if (hdr->msgtype == CHANNELMSG_UNLOAD_RESPONSE) + complete(_connection.unload_event); + + vmbus_signal_eom(msg, message_type); + } + + mdelay(10); + } + + /* +* We're crashing and already got the UNLOAD_RESPONSE, cleanup all +* maybe-pending messages on all CPUs to be able to receive new +* messages after we reconnect. +*/ + for_each_online_cpu(cpu) { + page_addr = hv_context.synic_message_page[cpu]; + msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; + msg->header.message_type = HVMSG_NONE; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index e5203e4..718b5c7 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -625,9 +625,21 @@ extern struct vmbus_channel_message_table_entry channel_message_table[CHANNELMSG_COUNT]; /* Free the message slot and signal end-of-message if required */ -static inline void vmbus_signal_eom(struct hv_message *ms
[PATCH 4/8] Drivers: hv: vmbus: Use the new virt_xx barrier code
Use the virt_xx barriers that have been defined for use in virtual machines.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/ring_buffer.c |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 6ea1b55..8f518af 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -33,14 +33,14 @@
 void hv_begin_read(struct hv_ring_buffer_info *rbi)
 {
 	rbi->ring_buffer->interrupt_mask = 1;
-	mb();
+	virt_mb();
 }
 
 u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 {
 
 	rbi->ring_buffer->interrupt_mask = 0;
-	mb();
+	virt_mb();
 
 	/*
 	 * Now check to see if the ring buffer is still empty.
@@ -68,12 +68,12 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi)
 
 static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi)
 {
-	mb();
+	virt_mb();
 	if (READ_ONCE(rbi->ring_buffer->interrupt_mask))
 		return false;
 
 	/* check interrupt_mask before read_index */
-	rmb();
+	virt_rmb();
 	/*
 	 * This is the only case we need to signal when the
 	 * ring transitions from being empty to non-empty.
@@ -115,7 +115,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	 * read index, we could miss sending the interrupt. Issue a full
 	 * memory barrier to address this.
 	 */
-	mb();
+	virt_mb();
 
 	pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz);
 	/* If the other end is not blocked on write don't bother. */
@@ -371,7 +371,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
 			     sizeof(u64));
 
 	/* Issue a full memory barrier before updating the write index */
-	virt_mb();
+	mb();
+	virt_mb();
 
 	/* Now, update the write location */
 	hv_set_next_write_location(outring_info, next_write_location);
@@ -447,7 +447,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 	 * the writer may start writing to the read area once the read index
 	 * is updated.
 	 */
-	mb();
+	virt_mb();
 
 	/* Update the read index */
 	hv_set_next_read_location(inring_info, next_read_location);
-- 
1.7.4.1
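The barrier placement touched by this patch (fill the data, issue a full barrier, then publish the index) can be modeled in userspace. What follows is a rough sketch under stated assumptions, not the driver code: `virt_mb()` is a kernel-only macro, so C11 `atomic_thread_fence(memory_order_seq_cst)` is used here as an analogue, and `struct model_ring` with its two functions is a hypothetical single-producer, single-consumer byte ring.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_RING_SIZE 64u

/* Hypothetical single-producer/single-consumer byte ring. */
struct model_ring {
	uint8_t data[MODEL_RING_SIZE];
	_Atomic uint32_t write_index;
	_Atomic uint32_t read_index;
};

/* Producer side: returns 0 on success, -1 if there is not enough room. */
static int model_ring_write(struct model_ring *r, const void *buf, uint32_t len)
{
	const uint8_t *src = buf;
	uint32_t w = atomic_load_explicit(&r->write_index, memory_order_relaxed);
	uint32_t rd = atomic_load_explicit(&r->read_index, memory_order_acquire);
	uint32_t room = w >= rd ? MODEL_RING_SIZE - (w - rd) : rd - w;

	/* Never fill the ring completely, so equal indices always mean "empty". */
	if (room <= len)
		return -1;

	for (uint32_t i = 0; i < len; i++)
		r->data[(w + i) % MODEL_RING_SIZE] = src[i];

	/* Full fence before publishing: the data must be visible before the index. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&r->write_index, (w + len) % MODEL_RING_SIZE,
			      memory_order_relaxed);
	return 0;
}

/* Consumer side: copies up to maxlen bytes out and returns the count. */
static uint32_t model_ring_read(struct model_ring *r, void *buf, uint32_t maxlen)
{
	uint8_t *dst = buf;
	uint32_t rd = atomic_load_explicit(&r->read_index, memory_order_relaxed);
	uint32_t w = atomic_load_explicit(&r->write_index, memory_order_acquire);
	uint32_t avail = w >= rd ? w - rd : (MODEL_RING_SIZE - rd) + w;
	uint32_t n = avail < maxlen ? avail : maxlen;

	for (uint32_t i = 0; i < n; i++)
		dst[i] = r->data[(rd + i) % MODEL_RING_SIZE];

	/* Finish all reads before the producer is allowed to reuse the space. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&r->read_index, (rd + n) % MODEL_RING_SIZE,
			      memory_order_relaxed);
	return n;
}
```

The sketch only mirrors the ordering; it does not model the interrupt mask or host signaling that the real ring buffer code also handles.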
[PATCH 7/8] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time, and as part of processing each packet we may signal the host (if it is waiting for room to produce a packet). These APIs enable batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run I see on average about a 20X reduction in checks to signal the host.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/ring_buffer.c |    1 +
 include/linux/hyperv.h   |   86 ++
 2 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index dd255c9..fe586bf 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info,
 		    u32 next_read_location)
 {
 	ring_info->ring_buffer->read_index = next_read_location;
+	ring_info->priv_read_index = next_read_location;
 }
 
 /* Get the size of the ring buffer. */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6797a30..b10954a 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -126,6 +126,8 @@ struct hv_ring_buffer_info {
 
 	u32 ring_datasize;		/* < ring_size */
 	u32 ring_data_startoffset;
+	u32 priv_write_index;
+	u32 priv_read_index;
 };
 
 /*
@@ -1420,4 +1422,88 @@ static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi)
 	return false;
 }
 
+/*
+ * An API to support in-place processing of incoming VMBUS packets.
+ */
+#define VMBUS_PKT_TRAILER	8
+
+static inline struct vmpacket_descriptor *
+get_next_pkt_raw(struct vmbus_channel *channel)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	u32 read_loc = ring_info->priv_read_index;
+	void *ring_buffer = hv_get_ring_buffer(ring_info);
+	struct vmpacket_descriptor *cur_desc;
+	u32 packetlen;
+	u32 dsize = ring_info->ring_datasize;
+	u32 delta = read_loc - ring_info->ring_buffer->read_index;
+	u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta);
+
+	if (bytes_avail_toread < sizeof(struct vmpacket_descriptor))
+		return NULL;
+
+	if ((read_loc + sizeof(*cur_desc)) > dsize)
+		return NULL;
+
+	cur_desc = ring_buffer + read_loc;
+	packetlen = cur_desc->len8 << 3;
+
+	/*
+	 * If the packet under consideration is wrapping around,
+	 * return failure.
+	 */
+	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1))
+		return NULL;
+
+	return cur_desc;
+}
+
+/*
+ * A helper function to step through packets "in-place"
+ * This API is to be called after each successful call
+ * get_next_pkt_raw().
+ */
+static inline void put_pkt_raw(struct vmbus_channel *channel,
+				struct vmpacket_descriptor *desc)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	u32 read_loc = ring_info->priv_read_index;
+	u32 packetlen = desc->len8 << 3;
+	u32 dsize = ring_info->ring_datasize;
+
+	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize)
+		BUG();
+	/*
+	 * Include the packet trailer.
+	 */
+	ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
+}
+
+/*
+ * This call commits the read index and potentially signals the host.
+ * Here is the pattern for using the "in-place" consumption APIs:
+ *
+ * while (get_next_pkt_raw()) {
+ *	process the packet "in-place";
+ *	put_pkt_raw();
+ * }
+ * if (packets processed in place)
+ *	commit_rd_index();
+ */
+static inline void commit_rd_index(struct vmbus_channel *channel)
+{
+	struct hv_ring_buffer_info *ring_info = &channel->inbound;
+	/*
+	 * Make sure all reads are done before we update the read index since
+	 * the writer may start writing to the read area once the read index
+	 * is updated.
+	 */
+	virt_rmb();
+	ring_info->ring_buffer->read_index = ring_info->priv_read_index;
+
+	if (hv_need_to_signal_on_read(ring_info))
+		vmbus_set_event(channel);
+}
+
+
 #endif /* _HYPERV_H */
-- 
1.7.4.1
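The get/put/commit pattern described in the comment above can be exercised in a small userspace model. This is a sketch under assumptions, not the vmbus implementation: every `model_*` name and the 8-byte `struct model_hdr` layout are hypothetical, the memory barrier and host signaling are left out, and the check that the whole packet fits in the available bytes is an extra defensive test that the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PKT_TRAILER 8u

/* Hypothetical packet header: total packet length in 8-byte units. */
struct model_hdr {
	uint16_t len8;
	uint16_t reserved[3];
};

/* Hypothetical inbound ring with a committed and a private read cursor. */
struct model_inring {
	uint8_t *data;
	uint32_t datasize;
	uint32_t read_index;		/* committed; what the producer sees */
	uint32_t write_index;
	uint32_t priv_read_index;	/* private cursor for in-place walking */
};

static uint32_t model_bytes_to_read(const struct model_inring *r)
{
	return r->write_index >= r->read_index ?
		r->write_index - r->read_index :
		(r->datasize - r->read_index) + r->write_index;
}

/* Return a pointer to the next packet inside the ring, or NULL if none. */
static const uint8_t *model_get_next_pkt_raw(const struct model_inring *r,
					      uint32_t *pktlen)
{
	struct model_hdr hdr;
	uint32_t loc = r->priv_read_index;
	uint32_t delta = loc - r->read_index;
	uint32_t avail = model_bytes_to_read(r) - delta;
	uint32_t len;

	if (avail < sizeof(hdr) || loc + sizeof(hdr) > r->datasize)
		return NULL;

	memcpy(&hdr, r->data + loc, sizeof(hdr));
	len = (uint32_t)hdr.len8 << 3;

	/* Extra check (not in the patch): the whole packet must be available. */
	if (len + MODEL_PKT_TRAILER > avail)
		return NULL;
	/* As in the patch, a packet that wraps around is not handled in place. */
	if (loc + len + MODEL_PKT_TRAILER > r->datasize - 1)
		return NULL;

	*pktlen = len;
	return r->data + loc;
}

/* Step the private cursor past the packet and its trailer. */
static void model_put_pkt_raw(struct model_inring *r, uint32_t pktlen)
{
	r->priv_read_index += pktlen + MODEL_PKT_TRAILER;
}

/* Publish the private cursor; barrier and host signaling are omitted here. */
static void model_commit_rd_index(struct model_inring *r)
{
	r->read_index = r->priv_read_index;
}

/* Walk all packets in place, summing payload bytes; commit once at the end. */
static unsigned int model_drain(struct model_inring *r, uint64_t *payload_sum)
{
	const uint8_t *pkt;
	uint32_t len;
	unsigned int n = 0;

	while ((pkt = model_get_next_pkt_raw(r, &len)) != NULL) {
		for (uint32_t i = sizeof(struct model_hdr); i < len; i++)
			*payload_sum += pkt[i];
		model_put_pkt_raw(r, len);
		n++;
	}
	if (n)
		model_commit_rd_index(r);
	return n;
}
```

The point of the split is visible in `model_drain()`: the producer-visible `read_index` is written once per batch instead of once per packet, which is where the reduction in signaling checks mentioned in the commit message would come from.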
[PATCH 5/8] Drivers: hv: vmbus: Export the vmbus_set_event() API
In preparation for moving some ring buffer functionality out of the vmbus driver, export the API for signaling the host.

Signed-off-by: K. Y. Srinivasan <k...@microsoft.com>
---
 drivers/hv/connection.c   |    1 +
 drivers/hv/hyperv_vmbus.h |    2 --
 include/linux/hyperv.h    |    1 +
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index d02f137..fcf8a02 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -495,3 +495,4 @@ void vmbus_set_event(struct vmbus_channel *channel)
 
 	hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL);
 }
+EXPORT_SYMBOL_GPL(vmbus_set_event);
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 8b07f9c..e5203e4 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -672,8 +672,6 @@ void vmbus_disconnect(void);
 
 int vmbus_post_msg(void *buffer, size_t buflen);
 
-void vmbus_set_event(struct vmbus_channel *channel);
-
 void vmbus_on_event(unsigned long data);
 void vmbus_on_msg_dpc(unsigned long data);
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index a6b053c..4adeb6e 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1365,4 +1365,5 @@ extern __u32 vmbus_proto_version;
 
 int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id,
 				  const uuid_le *shv_host_servie_id);
+void vmbus_set_event(struct vmbus_channel *channel);
 #endif /* _HYPERV_H */
-- 
1.7.4.1
[PATCH 6/8] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
In preparation for implementing APIs for in-place consumption of VMBUS packets, move some ring buffer functionality into hyperv.h. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 55 -- include/linux/hyperv.h | 54 + 2 files changed, 54 insertions(+), 55 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 8f518af..dd255c9 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -84,52 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) return false; } -/* - * To optimize the flow management on the send-side, - * when the sender is blocked because of lack of - * sufficient space in the ring buffer, potential the - * consumer of the ring buffer can signal the producer. - * This is controlled by the following parameters: - * - * 1. pending_send_sz: This is the size in bytes that the - *producer is trying to send. - * 2. The feature bit feat_pending_send_sz set to indicate if - *the consumer of the ring will signal when the ring - *state transitions from being full to a state where - *there is room for the producer to send the pending packet. - */ - -static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) -{ - u32 cur_write_sz; - u32 pending_sz; - - /* -* Issue a full memory barrier before making the signaling decision. -* Here is the reason for having this barrier: -* If the reading of the pend_sz (in this function) -* were to be reordered and read before we commit the new read -* index (in the calling function) we could -* have a problem. If the host were to set the pending_sz after we -* have sampled pending_sz and go to sleep before we commit the -* read index, we could miss sending the interrupt. Issue a full -* memory barrier to address this. -*/ - virt_mb(); - - pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); - /* If the other end is not blocked on write don't bother. 
*/ - if (pending_sz == 0) - return false; - - cur_write_sz = hv_get_bytes_to_write(rbi); - - if (cur_write_sz >= pending_sz) - return true; - - return false; -} - /* Get the next write location for the specified ring buffer. */ static inline u32 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info) @@ -180,15 +134,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, ring_info->ring_buffer->read_index = next_read_location; } - -/* Get the start of the ring buffer. */ -static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) -{ - return (void *)ring_info->ring_buffer->buffer; -} - - /* Get the size of the ring buffer. */ static inline u32 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 4adeb6e..6797a30 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1366,4 +1366,58 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); void vmbus_set_event(struct vmbus_channel *channel); + +/* Get the start of the ring buffer. */ +static inline void * +hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +{ + return (void *)ring_info->ring_buffer->buffer; +} + +/* + * To optimize the flow management on the send-side, + * when the sender is blocked because of lack of + * sufficient space in the ring buffer, potential the + * consumer of the ring buffer can signal the producer. + * This is controlled by the following parameters: + * + * 1. pending_send_sz: This is the size in bytes that the + *producer is trying to send. + * 2. The feature bit feat_pending_send_sz set to indicate if + *the consumer of the ring will signal when the ring + *state transitions from being full to a state where + *there is room for the producer to send the pending packet. 
+ */ + +static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) +{ + u32 cur_write_sz; + u32 pending_sz; + + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address
[PATCH 3/8] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
Use the READ_ONCE macro to access variables that can change asynchronously. This is the recommended mechanism for dealing with "unsafe" compiler optimizations. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 544362c..6ea1b55 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -69,7 +69,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { mb(); - if (rbi->ring_buffer->interrupt_mask) + if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ @@ -78,7 +78,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * This is the only case we need to signal when the * ring transitions from being empty to non-empty. */ - if (old_write == rbi->ring_buffer->read_index) + if (old_write == READ_ONCE(rbi->ring_buffer->read_index)) return true; return false; @@ -117,7 +117,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) */ mb(); - pending_sz = rbi->ring_buffer->pending_send_sz; + pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; -- 1.7.4.1
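For readers following along, here is a minimal user-space sketch of what the READ_ONCE() conversion buys. It is not the kernel implementation: READ_ONCE_U32() is a simplified, u32-only stand-in for the kernel macro, struct shared_ring is a hypothetical subset of the shared ring-buffer header, and the mb() calls from the real function are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE() (assumption: the real
 * macro handles more types). The volatile cast makes the compiler emit
 * exactly one load here instead of reusing or re-reading a cached value.
 */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Hypothetical subset of the ring-buffer header shared with the host. */
struct shared_ring {
    uint32_t write_index;
    uint32_t read_index;
    uint32_t interrupt_mask;
};

/* Same shape as hv_need_to_signal(), without the memory barriers. */
static bool need_to_signal(uint32_t old_write, struct shared_ring *ring)
{
    /* The other side may flip interrupt_mask at any time. */
    if (READ_ONCE_U32(ring->interrupt_mask))
        return false;

    /* Signal only when the ring goes from empty to non-empty. */
    return old_write == READ_ONCE_U32(ring->read_index);
}
```

The point of the cast is only to pin down the number of loads; it does not order this access against other memory operations, which is why the kernel code keeps its barriers.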
[PATCH 0/8] Drivers: hv: Miscellaneous vmbus and util driver fixes
Clean up the ring buffer code and implement APIs for "in place" consumption. This patchset also includes some other miscellaneous fixes.

K. Y. Srinivasan (6):
  Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
  Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
  Drivers: hv: vmbus: Use the new virt_xx barrier code
  Drivers: hv: vmbus: Export the vmbus_set_event() API
  Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
  Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets

Vitaly Kuznetsov (2):
  Drivers: hv: kvp: fix IP Failover
  Drivers: hv: vmbus: handle various crash scenarios

 drivers/hv/channel_mgmt.c |   58
 drivers/hv/connection.c   |    1 +
 drivers/hv/hv_kvp.c       |   31
 drivers/hv/hyperv_vmbus.h |   23 +-
 drivers/hv/ring_buffer.c  |   95 +++--
 drivers/hv/vmbus_drv.c    |    7 +-
 include/linux/hyperv.h    |  168 +
 7 files changed, 278 insertions(+), 105 deletions(-)

--
1.7.4.1
[PATCH 1/1] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signaling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signaling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signaling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this issue. Also, issue a full memory barrier before making the signaling decision to correctly deal with potential reordering of the write (read index) followed by the read of pending_sz. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c | 26 -- 1 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..a40a73a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,15 +103,29 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) *there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; - u32 write_loc = rbi->ring_buffer->write_index; + u32 write_loc; u32 read_loc = rbi->ring_buffer->read_index; - u32 pending_sz = rbi->ring_buffer->pending_send_sz; + u32 pending_sz; + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. 
If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address this. +*/ + mb(); + + pending_sz = rbi->ring_buffer->pending_send_sz; + write_loc = rbi->ring_buffer->write_index; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; @@ -120,7 +134,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +469,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
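To make the failure mode in the commit message concrete, here is a small user-space sketch of the signaling predicate before and after this patch. The names need_signal_old() and need_signal_new() are illustrative only, not kernel symbols, the sizes are passed in as plain arguments instead of being read from the shared ring, and the full memory barrier that the patch adds is deliberately not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Old predicate: also requires that the free space sampled before the
 * read (prev_write_sz) was too small for the pending packet.
 */
static bool need_signal_old(uint32_t prev_write_sz, uint32_t cur_write_sz,
                            uint32_t pending_sz)
{
    if (pending_sz == 0)    /* producer is not blocked */
        return false;
    return prev_write_sz < pending_sz && cur_write_sz >= pending_sz;
}

/* New predicate: only the free space after the read matters. */
static bool need_signal_new(uint32_t cur_write_sz, uint32_t pending_sz)
{
    if (pending_sz == 0)    /* producer is not blocked */
        return false;
    return cur_write_sz >= pending_sz;
}
```

Suppose the reader sampled 512 free bytes, the host then filled the ring, set pending_send_sz to 256 and went to sleep, and the read finally left 640 bytes free. need_signal_old(512, 640, 256) declines to signal because the stale sample was already larger than the pending size, while need_signal_new(640, 256) signals. That is the missed interrupt described above.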
[PATCH 4/7] drivers:hv: Reverse order of resources in hyperv_mmio
From: Jake Oshins <ja...@microsoft.com> A patch later in this series allocates child nodes in this resource tree. For that to work, this tree needs to be sorted in ascending order. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 60553c1..1ce47d0 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1049,7 +1049,6 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) new_res->end = end; /* -* Stick ranges from higher in address space at the front of the list. * If two ranges are adjacent, merge them. */ do { @@ -1070,7 +1069,7 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) break; } - if ((*old_res)->end < new_res->start) { + if ((*old_res)->start > new_res->end) { new_res->sibling = *old_res; if (prev_res) (*prev_res)->sibling = new_res; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
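As an illustration of the ordering this patch establishes, the following user-space sketch inserts ranges into a singly linked sibling list in ascending address order and merges a new range into a directly adjacent one. struct range and insert_ascending() are hypothetical names; the kernel code operates on struct resource inside vmbus_walk_resources() and frees the merged node, which this sketch leaves to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct resource fields used here. */
struct range {
    uint64_t start;
    uint64_t end;           /* inclusive */
    struct range *sibling;  /* next range, higher in address space */
};

/*
 * Insert new_res so that the list stays sorted by ascending address.
 * Returns true if new_res was linked into the list, false if it was
 * merged into an adjacent range (the caller then owns new_res again).
 */
static bool insert_ascending(struct range **head, struct range *new_res)
{
    struct range **link = head;

    while (*link) {
        struct range *old = *link;

        if (old->end + 1 == new_res->start) {   /* new follows old */
            old->end = new_res->end;
            return false;
        }
        if (new_res->end + 1 == old->start) {   /* new precedes old */
            old->start = new_res->start;
            return false;
        }
        if (old->start > new_res->end)          /* found insert point */
            break;
        link = &old->sibling;
    }

    new_res->sibling = *link;
    *link = new_res;
    return true;
}
```

Like the loop in the patch, this merges with at most one neighbour per insertion, so two existing ranges that become adjacent through a merge are not coalesced further.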
[PATCH 3/7] drivers:hv: Use new vmbus_mmio_free() from client drivers.
From: Jake Oshins <ja...@microsoft.com> This patch modifies all the callers of vmbus_mmio_allocate() to call vmbus_mmio_free() instead of release_mem_region(). Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/pci/host/pci-hyperv.c | 14 +++--- drivers/video/fbdev/hyperv_fb.c |4 ++-- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index ed651ba..f2559b6 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1795,14 +1795,14 @@ static void hv_pci_free_bridge_windows(struct hv_pcibus_device *hbus) if (hbus->low_mmio_space && hbus->low_mmio_res) { hbus->low_mmio_res->flags |= IORESOURCE_BUSY; - release_mem_region(hbus->low_mmio_res->start, - resource_size(hbus->low_mmio_res)); + vmbus_free_mmio(hbus->low_mmio_res->start, + resource_size(hbus->low_mmio_res)); } if (hbus->high_mmio_space && hbus->high_mmio_res) { hbus->high_mmio_res->flags |= IORESOURCE_BUSY; - release_mem_region(hbus->high_mmio_res->start, - resource_size(hbus->high_mmio_res)); + vmbus_free_mmio(hbus->high_mmio_res->start, + resource_size(hbus->high_mmio_res)); } } @@ -1880,8 +1880,8 @@ static int hv_pci_allocate_bridge_windows(struct hv_pcibus_device *hbus) release_low_mmio: if (hbus->low_mmio_res) { - release_mem_region(hbus->low_mmio_res->start, - resource_size(hbus->low_mmio_res)); + vmbus_free_mmio(hbus->low_mmio_res->start, + resource_size(hbus->low_mmio_res)); } return ret; @@ -1924,7 +1924,7 @@ static int hv_allocate_config_window(struct hv_pcibus_device *hbus) static void hv_free_config_window(struct hv_pcibus_device *hbus) { - release_mem_region(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH); + vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH); } /** diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c index e2451bd..2fd49b2 100644 --- 
a/drivers/video/fbdev/hyperv_fb.c +++ b/drivers/video/fbdev/hyperv_fb.c @@ -743,7 +743,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info) err3: iounmap(fb_virt); err2: - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; err1: if (!gen2vm) @@ -758,7 +758,7 @@ static void hvfb_putmem(struct fb_info *info) struct hvfb_par *par = info->par; iounmap(info->screen_base); - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; } -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 5/7] drivers:hv: Track allocations of children of hv_vmbus in private resource tree
From: Jake Oshins <ja...@microsoft.com> This patch changes vmbus_allocate_mmio() and vmbus_free_mmio() so that when child paravirtual devices allocate memory-mapped I/O space, they allocate it privately from a resource tree pointed at by hyperv_mmio and also from the public resource tree iomem_resource. This allows the region to be marked as "busy" in the private tree, but a "bridge window" in the public tree, guaranteeing that no two bridge windows will overlap each other while also allowing the PCI device children of the bridge windows to overlap that window. One might conclude that this belongs in the pnp layer, rather than in this driver. Rafael Wysocki, the maintainer of the pnp layer, has previously asked that we not modify the pnp layer as it is considered deprecated. This patch is thus essentially a workaround. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c | 22 +- 1 files changed, 21 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 1ce47d0..dfc6149 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1128,7 +1128,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t size, resource_size_t align, bool fb_overlap_ok) { - struct resource *iter; + struct resource *iter, *shadow; resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); @@ -1170,12 +1170,22 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, start = (local_min + align - 1) & ~(align - 1); for (; start + size - 1 <= local_max; start += align) { + shadow = __request_region(iter, start, + size, + NULL, + IORESOURCE_BUSY); + if (!shadow) + continue; + *new = request_mem_region_exclusive(start, size, dev_n); if (*new) { + 
shadow->name = (char *)*new; retval = 0; goto exit; } + + __release_region(iter, start, size); } } } @@ -1196,7 +1206,17 @@ EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); */ void vmbus_free_mmio(resource_size_t start, resource_size_t size) { + struct resource *iter; + + down(&hyperv_mmio_lock); + for (iter = hyperv_mmio; iter; iter = iter->sibling) { + if ((iter->start >= start + size) || (iter->end <= start)) + continue; + + __release_region(iter, start, size); + } release_mem_region(start, size); + up(&hyperv_mmio_lock); } EXPORT_SYMBOL_GPL(vmbus_free_mmio); -- 1.7.4.1
[PATCH 7/7] drivers:hv: Separate out frame buffer logic when picking MMIO range
From: Jake Oshins <ja...@microsoft.com> Simplify the logic that picks MMIO ranges by pulling out the logic related to trying to lay frame buffer claim on top of where the firmware placed the frame buffer. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c | 80 +--- 1 files changed, 35 insertions(+), 45 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index eaa5c3b..a29a6c0 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1162,64 +1162,54 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, bool fb_overlap_ok) { struct resource *iter, *shadow; - resource_size_t range_min, range_max, start, local_min, local_max; + resource_size_t range_min, range_max, start; const char *dev_n = dev_name(&device_obj->device); - u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); - int i, retval; + int retval; retval = -ENXIO; down(&hyperv_mmio_lock); + /* +* If overlaps with frame buffers are allowed, then first attempt to +* make the allocation from within the reserved region. Because it +* is already reserved, no shadow allocation is necessary. +*/ + if (fb_overlap_ok && fb_mmio && !(min > fb_mmio->end) && + !(max < fb_mmio->start)) { + + range_min = fb_mmio->start; + range_max = fb_mmio->end; + start = (range_min + align - 1) & ~(align - 1); + for (; start + size - 1 <= range_max; start += align) { + *new = request_mem_region_exclusive(start, size, dev_n); + if (*new) { + retval = 0; + goto exit; + } + } + } + for (iter = hyperv_mmio; iter; iter = iter->sibling) { if ((iter->start >= max) || (iter->end <= min)) continue; range_min = iter->start; range_max = iter->end; - - /* If this range overlaps the frame buffer, split it into - two tries. 
*/ - for (i = 0; i < 2; i++) { - local_min = range_min; - local_max = range_max; - if (fb_overlap_ok || (range_min >= fb_end) || - (range_max <= screen_info.lfb_base)) { - i++; - } else { - if ((range_min <= screen_info.lfb_base) && - (range_max >= screen_info.lfb_base)) { - /* -* The frame buffer is in this window, -* so trim this into the part that -* preceeds the frame buffer. -*/ - local_max = screen_info.lfb_base - 1; - range_min = fb_end; - } else { - range_min = fb_end; - continue; - } + start = (range_min + align - 1) & ~(align - 1); + for (; start + size - 1 <= range_max; start += align) { + shadow = __request_region(iter, start, size, NULL, + IORESOURCE_BUSY); + if (!shadow) + continue; + + *new = request_mem_region_exclusive(start, size, dev_n); + if (*new) { + shadow->name = (char *)*new; + retval = 0; + goto exit; } - start = (local_min + align - 1) & ~(align - 1); - for (; start + size - 1 <= local_max; start += align) { - shadow = __request_region(iter, start, - size, - NULL, - IORESOURCE_BUSY); - if (!shadow) - continue; - - *new = request_mem_region_exclusive(start, size, - dev_n); -
[PATCH 6/7] drivers:hv: Record MMIO range in use by frame buffer
From: Jake Oshins <ja...@microsoft.com> Later in the boot sequence, we need to figure out which memory ranges can be given out to various paravirtual drivers. The hyperv_fb driver should, ideally, be placed right on top of the frame buffer, without some other device getting plopped on top of this range in the meantime. Recording this now allows that to be guaranteed. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c | 37 - 1 files changed, 36 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index dfc6149..eaa5c3b 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -41,6 +41,7 @@ #include #include #include +#include #include "hyperv_vmbus.h" static struct acpi_device *hv_acpi_dev; @@ -101,6 +102,8 @@ static struct notifier_block hyperv_panic_block = { .notifier_call = hyperv_panic_event, }; +static const char *fb_mmio_name = "fb_range"; +static struct resource *fb_mmio; struct resource *hyperv_mmio; DEFINE_SEMAPHORE(hyperv_mmio_lock); @@ -1091,6 +1094,12 @@ static int vmbus_acpi_remove(struct acpi_device *device) struct resource *next_res; if (hyperv_mmio) { + if (fb_mmio) { + __release_region(hyperv_mmio, fb_mmio->start, +resource_size(fb_mmio)); + fb_mmio = NULL; + } + for (cur_res = hyperv_mmio; cur_res; cur_res = next_res) { next_res = cur_res->sibling; kfree(cur_res); @@ -1100,6 +1109,30 @@ static int vmbus_acpi_remove(struct acpi_device *device) return 0; } +static void vmbus_reserve_fb(void) +{ + int size; + /* +* Make a claim for the frame buffer in the resource tree under the +* first node, which will be the one below 4GB. The length seems to +* be underreported, particularly in a Generation 1 VM. So start out +* reserving a larger area and make it smaller until it succeeds. 
+*/ + + if (screen_info.lfb_base) { + if (efi_enabled(EFI_BOOT)) + size = max_t(__u32, screen_info.lfb_size, 0x80); + else + size = max_t(__u32, screen_info.lfb_size, 0x400); + + for (; !fb_mmio && (size >= 0x10); size >>= 1) { + fb_mmio = __request_region(hyperv_mmio, + screen_info.lfb_base, size, + fb_mmio_name, 0); + } + } +} + /** * vmbus_allocate_mmio() - Pick a memory-mapped I/O range. * @new: If successful, supplied a pointer to the @@ -1261,8 +1294,10 @@ static int vmbus_acpi_add(struct acpi_device *device) if (ACPI_FAILURE(result)) continue; - if (hyperv_mmio) + if (hyperv_mmio) { + vmbus_reserve_fb(); break; + } } ret_val = 0; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
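The "start large and shrink until it succeeds" strategy in vmbus_reserve_fb() can be sketched in user space as follows. Everything here is an assumption made for illustration: reserve_shrinking(), the reserve_fn callback and the demo predicate fits_below_limit() are hypothetical, the callback stands in for __request_region(), and the sizes used are example values rather than the constants from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback standing in for __request_region(). */
typedef bool (*reserve_fn)(uint64_t base, uint64_t size);

/*
 * Try to reserve [base, base + size), halving size after each failure.
 * Returns the size that was reserved, or 0 if every attempt down to
 * min_size failed.
 */
static uint64_t reserve_shrinking(uint64_t base, uint64_t size,
                                  uint64_t min_size, reserve_fn try_reserve)
{
    for (; size > 0 && size >= min_size; size >>= 1) {
        if (try_reserve(base, size))
            return size;
    }
    return 0;
}

/* Demo predicate: a request succeeds if it ends at or below 16 MiB. */
static bool fits_below_limit(uint64_t base, uint64_t size)
{
    const uint64_t limit = 0x1000000;   /* example window end */

    return base + size <= limit;
}
```

With a base of 12 MiB, an initial size of 64 MiB and a 1 MiB floor, the attempts at 64, 32, 16 and 8 MiB fail and the 4 MiB attempt succeeds, so the reservation ends up as large as the window allows even though the first guess was far too big.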
[PATCH 2/7] drivers:hv: Make a function to free mmio regions through vmbus
From: Jake Oshins <ja...@microsoft.com> This patch introduces a function that reverses everything done by vmbus_allocate_mmio(). Existing code just called release_mem_region(). Future patches in this series require a more complex sequence of actions, so this function is introduced to wrap those actions. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c | 15 +++ include/linux/hyperv.h |2 +- 2 files changed, 16 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 799518b..60553c1 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1188,6 +1188,21 @@ exit: EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); /** + * vmbus_free_mmio() - Free a memory-mapped I/O range. + * @start: Base address of region to release. + * @size: Size of the range to be allocated + * + * This function releases anything requested by + * vmbus_mmio_allocate(). + */ +void vmbus_free_mmio(resource_size_t start, resource_size_t size) +{ + release_mem_region(start, size); + +} +EXPORT_SYMBOL_GPL(vmbus_free_mmio); + +/** * vmbus_cpu_number_to_vp_number() - Map CPU to VP. * @cpu_number: CPU number in Linux terms * diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index aa0fadc..ecd81c3 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1091,7 +1091,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t min, resource_size_t max, resource_size_t size, resource_size_t align, bool fb_overlap_ok); - +void vmbus_free_mmio(resource_size_t start, resource_size_t size); int vmbus_cpu_number_to_vp_number(int cpu_number); u64 hv_do_hypercall(u64 control, void *input, void *output); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/7] drivers:hv: Lock access to hyperv_mmio resource tree
From: Jake Oshins <ja...@microsoft.com> In existing code, this tree of resources is created in single-threaded code and never modified after it is created, and thus needs no locking. This patch introduces a semaphore for tree access, as other patches in this series introduce run-time modifications of this resource tree which can happen on multiple threads. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- Greg, please apply this to the 4.6 tree. drivers/hv/vmbus_drv.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 64713ff..799518b 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -102,6 +102,7 @@ static struct notifier_block hyperv_panic_block = { }; struct resource *hyperv_mmio; +DEFINE_SEMAPHORE(hyperv_mmio_lock); static int vmbus_exists(void) { @@ -1132,7 +1133,10 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); - int i; + int i, retval; + + retval = -ENXIO; + down(&hyperv_mmio_lock); for (iter = hyperv_mmio; iter; iter = iter->sibling) { if ((iter->start >= max) || (iter->end <= min)) @@ -1169,13 +1173,17 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, for (; start + size - 1 <= local_max; start += align) { *new = request_mem_region_exclusive(start, size, dev_n); - if (*new) - return 0; + if (*new) { + retval = 0; + goto exit; + } } } } - return -ENXIO; +exit: + up(&hyperv_mmio_lock); + return retval; } EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); -- 1.7.4.1
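The loop that this patch places under hyperv_mmio_lock steps through aligned candidate addresses until one can be claimed. A minimal user-space sketch of that scan is shown below, with the locking omitted. The names align_up(), find_aligned_range() and the claim_fn callback are hypothetical, the callback stands in for request_mem_region_exclusive(), region_is_free() is a demo predicate with one hard-coded busy region, and align is assumed to be a power of two, as the mask arithmetic in the patch requires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round value up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Hypothetical callback standing in for request_mem_region_exclusive(). */
typedef bool (*claim_fn)(uint64_t start, uint64_t size);

/*
 * Scan [range_min, range_max] for the first aligned start address whose
 * size-byte region can be claimed. Returns 0 and stores the address in
 * *out on success, -1 if no candidate fits.
 */
static int find_aligned_range(uint64_t range_min, uint64_t range_max,
                              uint64_t size, uint64_t align,
                              claim_fn try_claim, uint64_t *out)
{
    uint64_t start;

    for (start = align_up(range_min, align);
         start + size - 1 <= range_max; start += align) {
        if (try_claim(start, size)) {
            *out = start;
            return 0;
        }
    }
    return -1;
}

/* Demo predicate: everything is free except 0x1000..0x2fff. */
static bool region_is_free(uint64_t start, uint64_t size)
{
    const uint64_t busy_start = 0x1000, busy_end = 0x2fff;

    return !(start <= busy_end && start + size - 1 >= busy_start);
}
```

In the kernel function the same scan has several exits, which is why the patch funnels them through a single exit label so that the semaphore is released exactly once on every path.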
[PATCH 0/7] drivers: hv: Ensure that bridge windows don't overlap
Greg, please apply this set to the 4.6 tree.

Hyper-V VMs expose paravirtual drivers through a mechanism called VMBus, which is managed by hv_vmbus.ko. For each paravirtual service instance, this driver exposes a new child device. Some of these child devices need memory address space, into which Hyper-V will map things like the virtual video frame buffer. This memory-mapped address space is chosen by the guest OS, not the hypervisor.

This is difficult to map onto the Linux pnp layer, as the code in the pnp layer to choose MMIO space keys off of bus type and it doesn't know anything about VMBus. The maintainers of the pnp layer have asked that we not offer patches to it that make it understand VMBus, but that we rather find ways of using the code in its current state. So hv_vmbus.ko exports a function, vmbus_allocate_mmio(), for choosing the address space for any child driver that needs this facility.

The recently introduced PCI front-end driver for Hyper-V VMs (pci-hyperv.ko) uses vmbus_allocate_mmio() for choosing the region of memory space into which real PCI Express devices are mapped. The regions allocated are made to look like root PCI bus bridge windows to the PCI driver, reusing all the code in the PCI driver for the rest of PCI device management.

The problem is that these bridge windows are marked in such a way that devices can still allocate from the memory space spanned by them, and this means that if two different PCI buses are created in the VM, each with devices under them, they may allocate the same memory space, leading to PCI Base Address Registers that overlap.

This patch series fixes the problem by tracking allocations to child devices in a separate resource tree, marking them such that the bridge windows can't overlap. The main memory resource tree, iomem_resource, contains resources properly marked as bridge windows, allowing their children to overlap with them.
Jake Oshins (7):
  drivers:hv: Lock access to hyperv_mmio resource tree
  drivers:hv: Make a function to free mmio regions through vmbus
  drivers:hv: Use new vmbus_mmio_free() from client drivers.
  drivers:hv: Reverse order of resources in hyperv_mmio
  drivers:hv: Track allocations of children of hv_vmbus in private resource tree
  drivers:hv: Record MMIO range in use by frame buffer
  drivers:hv: Separate out frame buffer logic when picking MMIO range

 drivers/hv/vmbus_drv.c          |  143 --
 drivers/pci/host/pci-hyperv.c   |   14 ++--
 drivers/video/fbdev/hyperv_fb.c |    4 +-
 include/linux/hyperv.h          |    2 +-
 4 files changed, 115 insertions(+), 48 deletions(-)

--
1.7.4.1
[PATCH 2/6] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
Use the READ_ONCE macro to access variables that can change asynchronously. This is the recommended mechanism for dealing with "unsafe" compiler optimizations. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 544362c..6ea1b55 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -69,7 +69,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { mb(); - if (rbi->ring_buffer->interrupt_mask) + if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ @@ -78,7 +78,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * This is the only case we need to signal when the * ring transitions from being empty to non-empty. */ - if (old_write == rbi->ring_buffer->read_index) + if (old_write == READ_ONCE(rbi->ring_buffer->read_index)) return true; return false; @@ -117,7 +117,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) */ mb(); - pending_sz = rbi->ring_buffer->pending_send_sz; + pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; -- 1.7.4.1
[PATCH 1/6] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
Introduce separate functions for estimating how much can be read from and written to the ring buffer. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 25 - include/linux/hyperv.h | 27 +++ 2 files changed, 31 insertions(+), 21 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index a40a73a..544362c 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -38,8 +38,6 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) u32 hv_end_read(struct hv_ring_buffer_info *rbi) { - u32 read; - u32 write; rbi->ring_buffer->interrupt_mask = 0; mb(); @@ -49,9 +47,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) * If it is not, we raced and we need to process new * incoming messages. */ - hv_get_ringbuffer_availbytes(rbi, &read, &write); - - return read; + return hv_get_bytes_to_read(rbi); } /* @@ -106,9 +102,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; - u32 r_size; - u32 write_loc; - u32 read_loc = rbi->ring_buffer->read_index; u32 pending_sz; /* @@ -125,14 +118,11 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) mb(); pending_sz = rbi->ring_buffer->pending_send_sz; - write_loc = rbi->ring_buffer->write_index; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; - r_size = rbi->ring_datasize; - cur_write_sz = write_loc >= read_loc ? 
r_size - (write_loc - read_loc) : - read_loc - write_loc; + cur_write_sz = hv_get_bytes_to_write(rbi); if (cur_write_sz >= pending_sz) return true; @@ -332,7 +322,6 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, { int i = 0; u32 bytes_avail_towrite; - u32 bytes_avail_toread; u32 totalbytes_towrite = 0; u32 next_write_location; @@ -348,9 +337,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, if (lock) spin_lock_irqsave(_info->ring_lock, flags); - hv_get_ringbuffer_availbytes(outring_info, - _avail_toread, - _avail_towrite); + bytes_avail_towrite = hv_get_bytes_to_write(outring_info); /* * If there is only room for the packet, assume it is full. @@ -401,7 +388,6 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, void *buffer, u32 buflen, u32 *buffer_actual_len, u64 *requestid, bool *signal, bool raw) { - u32 bytes_avail_towrite; u32 bytes_avail_toread; u32 next_read_location = 0; u64 prev_indices = 0; @@ -417,10 +403,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, *buffer_actual_len = 0; *requestid = 0; - hv_get_ringbuffer_availbytes(inring_info, - _avail_toread, - _avail_towrite); - + bytes_avail_toread = hv_get_bytes_to_read(inring_info); /* Make sure there is something to read */ if (bytes_avail_toread < sizeof(desc)) { /* diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index ecd81c3..a6b053c 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -151,6 +151,33 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi, *read = dsize - *write; } +static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, read; + + dsize = rbi->ring_datasize; + read_loc = rbi->ring_buffer->read_index; + write_loc = READ_ONCE(rbi->ring_buffer->write_index); + + read = write_loc >= read_loc ? 
(write_loc - read_loc) : + (dsize - read_loc) + write_loc; + + return read; +} + +static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, write; + + dsize = rbi->ring_datasize; + read_loc = READ_ONCE(rbi->ring_buffer->read_index); + write_loc = rbi->ring_buffer->write_index; + + write = write_loc >= read_loc ? dsize - (write_loc - read_loc) : + read_loc - write_loc; + return write; +} + /* * VMBUS version is 32 bit entity broken up into * two 16 bit quantities: major_number. minor_number. -- 1.7.4.1
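The index arithmetic of the two new helpers can be checked in isolation. The sketch below restates it as plain userspace functions; the names bytes_to_read() and bytes_to_write() and the explicit parameters are assumptions made for illustration, and the READ_ONCE() accesses of the kernel helpers are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes the consumer can read: distance from read_loc forward to write_loc. */
static uint32_t bytes_to_read(uint32_t read_loc, uint32_t write_loc,
			      uint32_t dsize)
{
	assert(read_loc < dsize && write_loc < dsize);

	return write_loc >= read_loc ? write_loc - read_loc
				     : (dsize - read_loc) + write_loc;
}

/* Bytes the producer can write: everything that is not waiting to be read. */
static uint32_t bytes_to_write(uint32_t read_loc, uint32_t write_loc,
			       uint32_t dsize)
{
	assert(read_loc < dsize && write_loc < dsize);

	return write_loc >= read_loc ? dsize - (write_loc - read_loc)
				     : read_loc - write_loc;
}
```

With dsize = 4096, read_loc = 4000 and write_loc = 100 (the writer has wrapped), this gives 196 readable and 3900 writable bytes. The two values always add up to dsize, and an empty ring (read_loc == write_loc) reports dsize bytes as writable.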
[PATCH 6/6] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time and as part of processing each packet we potentially may signal the host (if it is waiting for room to produce a packet). These APIs help batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run on average I see about 20X reduction in checks to signal the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |1 + include/linux/hyperv.h | 86 ++ 2 files changed, 87 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index dd255c9..fe586bf 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, u32 next_read_location) { ring_info->ring_buffer->read_index = next_read_location; + ring_info->priv_read_index = next_read_location; } /* Get the size of the ring buffer. */ diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 6797a30..b10954a 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -126,6 +126,8 @@ struct hv_ring_buffer_info { u32 ring_datasize; /* < ring_size */ u32 ring_data_startoffset; + u32 priv_write_index; + u32 priv_read_index; }; /* @@ -1420,4 +1422,88 @@ static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) return false; } +/* + * An API to support in-place processing of incoming VMBUS packets. 
+ */ +#define VMBUS_PKT_TRAILER 8 + +static inline struct vmpacket_descriptor * +get_next_pkt_raw(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->priv_read_index; + void *ring_buffer = hv_get_ring_buffer(ring_info); + struct vmpacket_descriptor *cur_desc; + u32 packetlen; + u32 dsize = ring_info->ring_datasize; + u32 delta = read_loc - ring_info->ring_buffer->read_index; + u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta); + + if (bytes_avail_toread < sizeof(struct vmpacket_descriptor)) + return NULL; + + if ((read_loc + sizeof(*cur_desc)) > dsize) + return NULL; + + cur_desc = ring_buffer + read_loc; + packetlen = cur_desc->len8 << 3; + + /* +* If the packet under consideration is wrapping around, +* return failure. +*/ + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1)) + return NULL; + + return cur_desc; +} + +/* + * A helper function to step through packets "in-place" + * This API is to be called after each successful call + * get_next_pkt_raw(). + */ +static inline void put_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor *desc) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->priv_read_index; + u32 packetlen = desc->len8 << 3; + u32 dsize = ring_info->ring_datasize; + + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize) + BUG(); + /* +* Include the packet trailer. +*/ + ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER; +} + +/* + * This call commits the read index and potentially signals the host. 
+ * Here is the pattern for using the "in-place" consumption APIs: + * + * while (get_next_pkt_raw() { + * process the packet "in-place"; + * put_pkt_raw(); + * } + * if (packets processed in place) + * commit_rd_index(); + */ +static inline void commit_rd_index(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + /* +* Make sure all reads are done before we update the read index since +* the writer may start writing to the read area once the read index +* is updated. +*/ + virt_rmb(); + ring_info->ring_buffer->read_index = ring_info->priv_read_index; + + if (hv_need_to_signal_on_read(ring_info)) + vmbus_set_event(channel); +} + + #endif /* _HYPERV_H */ -- 1.7.4.1
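The usage pattern described in the comment above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel API: struct ring, struct pkt_desc, get_next_pkt(), put_pkt() and drain() are hypothetical names, the descriptor carries only a length field, and the barrier and host signaling done by commit_rd_index() in the patch are reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PKT_TRAILER 8u /* mirrors VMBUS_PKT_TRAILER */

/* Hypothetical, simplified descriptor: total packet length in 8-byte units. */
struct pkt_desc {
	uint16_t len8;
};

/* Hypothetical userspace ring; not the kernel's hv_ring_buffer_info. */
struct ring {
	uint8_t *data;
	uint32_t dsize;
	uint32_t read_index;      /* committed index, visible to the producer */
	uint32_t write_index;
	uint32_t priv_read_index; /* private cursor used while reading in place */
};

static uint32_t bytes_to_read(const struct ring *r)
{
	return r->write_index >= r->read_index
		? r->write_index - r->read_index
		: (r->dsize - r->read_index) + r->write_index;
}

static uint32_t pkt_len(const struct ring *r, uint32_t loc)
{
	struct pkt_desc d;

	memcpy(&d, r->data + loc, sizeof(d)); /* avoids unaligned access */
	return (uint32_t)d.len8 << 3;
}

/* Return the offset of the next packet that does not wrap, or -1. */
static long get_next_pkt(const struct ring *r)
{
	uint32_t loc = r->priv_read_index;
	uint32_t avail = bytes_to_read(r) - (loc - r->read_index);

	if (avail < sizeof(struct pkt_desc))
		return -1;
	if (loc + sizeof(struct pkt_desc) > r->dsize)
		return -1;
	if (loc + pkt_len(r, loc) + PKT_TRAILER > r->dsize - 1)
		return -1; /* packet wraps around: not handled in place */

	return (long)loc;
}

/* Step the private cursor past the packet and its trailer. */
static void put_pkt(struct ring *r, uint32_t loc)
{
	assert(loc == r->priv_read_index);
	r->priv_read_index += pkt_len(r, loc) + PKT_TRAILER;
}

/* Publish the private cursor; the kernel version also issues a barrier
 * and may signal the host at this point. */
static void commit_rd_index(struct ring *r)
{
	r->read_index = r->priv_read_index;
}

/* Consume every packet that can be handled in place; return the count. */
static unsigned int drain(struct ring *r)
{
	unsigned int n = 0;
	long loc;

	while ((loc = get_next_pkt(r)) >= 0) {
		/* process the packet at r->data + loc "in place" here */
		put_pkt(r, (uint32_t)loc);
		n++;
	}
	if (n)
		commit_rd_index(r);

	return n;
}
```

The point of the split is that the shared read index is written once per batch rather than once per packet, which is where the reduction in signaling checks mentioned in the commit message comes from.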
[PATCH 5/6] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
In preparation for implementing APIs for in-place consumption of VMBUS packets, move some ring buffer functionality into hyperv.h. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 55 -- include/linux/hyperv.h | 54 + 2 files changed, 54 insertions(+), 55 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 8f518af..dd255c9 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -84,52 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) return false; } -/* - * To optimize the flow management on the send-side, - * when the sender is blocked because of lack of - * sufficient space in the ring buffer, potential the - * consumer of the ring buffer can signal the producer. - * This is controlled by the following parameters: - * - * 1. pending_send_sz: This is the size in bytes that the - *producer is trying to send. - * 2. The feature bit feat_pending_send_sz set to indicate if - *the consumer of the ring will signal when the ring - *state transitions from being full to a state where - *there is room for the producer to send the pending packet. - */ - -static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) -{ - u32 cur_write_sz; - u32 pending_sz; - - /* -* Issue a full memory barrier before making the signaling decision. -* Here is the reason for having this barrier: -* If the reading of the pend_sz (in this function) -* were to be reordered and read before we commit the new read -* index (in the calling function) we could -* have a problem. If the host were to set the pending_sz after we -* have sampled pending_sz and go to sleep before we commit the -* read index, we could miss sending the interrupt. Issue a full -* memory barrier to address this. -*/ - virt_mb(); - - pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); - /* If the other end is not blocked on write don't bother. 
*/ - if (pending_sz == 0) - return false; - - cur_write_sz = hv_get_bytes_to_write(rbi); - - if (cur_write_sz >= pending_sz) - return true; - - return false; -} - /* Get the next write location for the specified ring buffer. */ static inline u32 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info) @@ -180,15 +134,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, ring_info->ring_buffer->read_index = next_read_location; } - -/* Get the start of the ring buffer. */ -static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) -{ - return (void *)ring_info->ring_buffer->buffer; -} - - /* Get the size of the ring buffer. */ static inline u32 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 4adeb6e..6797a30 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1366,4 +1366,58 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); void vmbus_set_event(struct vmbus_channel *channel); + +/* Get the start of the ring buffer. */ +static inline void * +hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +{ + return (void *)ring_info->ring_buffer->buffer; +} + +/* + * To optimize the flow management on the send-side, + * when the sender is blocked because of lack of + * sufficient space in the ring buffer, potential the + * consumer of the ring buffer can signal the producer. + * This is controlled by the following parameters: + * + * 1. pending_send_sz: This is the size in bytes that the + *producer is trying to send. + * 2. The feature bit feat_pending_send_sz set to indicate if + *the consumer of the ring will signal when the ring + *state transitions from being full to a state where + *there is room for the producer to send the pending packet. 
+ */ + +static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) +{ + u32 cur_write_sz; + u32 pending_sz; + + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address
[PATCH 0/6] Drivers: hv: vmbus: Cleanup the ring buffer code
Cleanup and fix a bug in the ring buffer code. Also implement APIs for in place consumption of received packets. K. Y. Srinivasan (6): Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile Drivers: hv: vmbus: Use the new virt_xx barrier code Drivers: hv: vmbus: Export the vmbus_set_event() API Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets drivers/hv/connection.c |1 + drivers/hv/hyperv_vmbus.h |2 - drivers/hv/ring_buffer.c | 95 +++-- include/linux/hyperv.h| 168 + 4 files changed, 181 insertions(+), 85 deletions(-) -- 1.7.4.1
[PATCH 4/6] Drivers: hv: vmbus: Export the vmbus_set_event() API
In preparation for moving some ring buffer functionality out of the vmbus driver, export the API for signaling the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/connection.c |1 + drivers/hv/hyperv_vmbus.h |2 -- include/linux/hyperv.h|1 + 3 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index d02f137..fcf8a02 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -495,3 +495,4 @@ void vmbus_set_event(struct vmbus_channel *channel) hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL); } +EXPORT_SYMBOL_GPL(vmbus_set_event); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 8b07f9c..e5203e4 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -672,8 +672,6 @@ void vmbus_disconnect(void); int vmbus_post_msg(void *buffer, size_t buflen); -void vmbus_set_event(struct vmbus_channel *channel); - void vmbus_on_event(unsigned long data); void vmbus_on_msg_dpc(unsigned long data); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index a6b053c..4adeb6e 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1365,4 +1365,5 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); +void vmbus_set_event(struct vmbus_channel *channel); #endif /* _HYPERV_H */ -- 1.7.4.1
[PATCH 3/6] Drivers: hv: vmbus: Use the new virt_xx barrier code
Use the virt_xx barriers that have been defined for use in virtual machines. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 6ea1b55..8f518af 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -33,14 +33,14 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 1; - mb(); + virt_mb(); } u32 hv_end_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 0; - mb(); + virt_mb(); /* * Now check to see if the ring buffer is still empty. @@ -68,12 +68,12 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { - mb(); + virt_mb(); if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ - rmb(); + virt_rmb(); /* * This is the only case we need to signal when the * ring transitions from being empty to non-empty. @@ -115,7 +115,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) * read index, we could miss sending the interrupt. Issue a full * memory barrier to address this. */ - mb(); + virt_mb(); pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ @@ -371,7 +371,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, sizeof(u64)); /* Issue a full memory barrier before updating the write index */ - mb(); + virt_mb(); /* Now, update the write location */ hv_set_next_write_location(outring_info, next_write_location); @@ -447,7 +447,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, * the writer may start writing to the read area once the read index * is updated. 
*/ - mb(); + virt_mb(); /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); -- 1.7.4.1
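The ordering that the barrier in hv_ringbuffer_write() enforces (payload first, index second) can be approximated in portable C11. This is only an analogy under stated assumptions: struct mini_ring and mini_ring_write() are hypothetical, atomic_thread_fence(memory_order_seq_cst) stands in for virt_mb() and is not claimed to be equivalent to it, and wrap-around is not handled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature ring: a flat buffer plus a published write index. */
struct mini_ring {
	uint8_t data[64];
	_Atomic uint32_t write_index;
};

/*
 * Copy the payload, then publish the new write index. The full fence
 * plays the role that virt_mb() plays in the patch: the payload stores
 * must be visible before the index store that advertises them.
 * Returns 0 on success, -1 if the payload does not fit (no wrap-around).
 */
static int mini_ring_write(struct mini_ring *r, const void *buf, uint32_t len)
{
	uint32_t w;

	assert(r != NULL && buf != NULL);

	w = atomic_load_explicit(&r->write_index, memory_order_relaxed);
	if (len > sizeof(r->data) - w)
		return -1;

	memcpy(r->data + w, buf, len);
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&r->write_index, w + len, memory_order_relaxed);

	return 0;
}
```

A reader that observes the new index without the fence in place could still see stale payload bytes, which is the failure the barrier guards against.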
[PATCH V2 1/1] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signaling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signaling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signaling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this issue. Also, issue a full memory barrier before making the signaling decision to correctly deal with potential reordering of the write (read index) followed by the read of pending_sz. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c | 26 -- 1 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..a40a73a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,15 +103,29 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) *there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; - u32 write_loc = rbi->ring_buffer->write_index; + u32 write_loc; u32 read_loc = rbi->ring_buffer->read_index; - u32 pending_sz = rbi->ring_buffer->pending_send_sz; + u32 pending_sz; + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. 
If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address this. +*/ + mb(); + + pending_sz = rbi->ring_buffer->pending_send_sz; + write_loc = rbi->ring_buffer->write_index; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; @@ -120,7 +134,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +469,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } -- 1.7.4.1
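The effect of dropping the prev_write_sz check can be seen by comparing the two decision rules as pure functions. The sketch below is a hypothetical restatement for illustration (signal_old(), signal_new() and the numbers in stale_sample_example() are assumptions, not taken from a real trace), and it leaves out the memory barrier that the patch also adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pre-patch rule: also requires that the ring looked too full for the
 * pending packet when the available space was sampled before the read. */
static bool signal_old(uint32_t prev_write_sz, uint32_t cur_write_sz,
		       uint32_t pending_sz)
{
	if (pending_sz == 0)
		return false;

	return prev_write_sz < pending_sz && cur_write_sz >= pending_sz;
}

/* Post-patch rule: only the space available after the read matters. */
static bool signal_new(uint32_t cur_write_sz, uint32_t pending_sz)
{
	if (pending_sz == 0)
		return false;

	return cur_write_sz >= pending_sz;
}

/*
 * Stale-sample scenario: the guest sampled 512 free bytes, the host then
 * filled the ring and blocked waiting for 256 bytes. After the read, 600
 * bytes are free. The old rule trusts the stale 512 and stays silent;
 * the new rule signals the host.
 */
static void stale_sample_example(void)
{
	assert(!signal_old(512, 600, 256));
	assert(signal_new(600, 256));
}
```

In the stale case the old rule never wakes the blocked producer, which matches the hang described in the commit message.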
[PATCH 1/1] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signaling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signaling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signaling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this issue. Also, issue a full memory barrier before making the signaling decision to correctly deal with potential reordering of the write (read index) followed by the read of pending_sz. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c | 20 1 files changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..e00b632 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,8 +103,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) *there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; @@ -112,6 +111,19 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, u32 read_loc = rbi->ring_buffer->read_index; u32 pending_sz = rbi->ring_buffer->pending_send_sz; + /* +* Issue a full memory barrier before making the signaling decision. +* Here is the reason for having this barrier: +* If the reading of the pend_sz (in this function) +* were to be reordered and read before we commit the new read +* index (in the calling function) we could +* have a problem. 
If the host were to set the pending_sz after we +* have sampled pending_sz and go to sleep before we commit the +* read index, we could miss sending the interrupt. Issue a full +* memory barrier to address this. +*/ + mb(); + /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; @@ -120,7 +132,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +467,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } -- 1.7.4.1
[PATCH 1/6] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to other hosts, and a feature called 'Failover TCP/IP' lets replicas be given a different IP. When such a guest starts, the Hyper-V host sends it a KVP_OP_SET_IP_INFO message as soon as we finish the negotiation procedure. The problem is that this can happen (and it actually happens) before the userspace daemon connects, in which case we reply to the message with HV_E_FAIL. As the host does not repeat the message, we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message until the userspace daemon is connected. We can't wait too long, as there is a host-side timeout (about 75 seconds), and if we fail to reply within this time frame the whole KVP service becomes inactive. The solution is not ideal: if it takes the userspace daemon more than 60 seconds to connect, IP Failover will still fail, but I don't see a better solution with our current separation between kernel and userspace parts. The other two modules (VSS and FCOPY) don't require such a delay, so leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 31 +++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 36 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..cb1a916 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. 
+*/ + cancel_delayed_work_sync(&kvp_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } @@ -708,6 +738,7 @@ hv_kvp_init(struct hv_util_service *srv) void hv_kvp_deinit(void) { kvp_transaction.state = HVUTIL_DEVICE_DYING; + cancel_delayed_work_sync(&kvp_host_handshake_work); cancel_delayed_work_sync(&kvp_timeout_work); cancel_work_sync(&kvp_sendkey_work); hvutil_transport_destroy(hvt); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 12321b9..8b07f9c 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is s
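The deferral logic added to hv_kvp_onchannelcallback() amounts to a small state machine, which the following sketch models in isolation. It is a simplification under stated assumptions: struct kvp_model, enum cb_action and on_host_message() are hypothetical, the daemon_ready flag stands in for the kvp_transaction.state comparison, and the delayed work and the actual negotiation reply are reduced to return codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum nego_state { NEGO_NOT_STARTED, NEGO_IN_PROGRESS, NEGO_FINISHED };

/* What the callback decides to do with the host's message. */
enum cb_action {
	CB_DEFER,   /* schedule delayed work and leave the message unanswered */
	CB_PROCESS  /* handle the message and reply to the host now */
};

/* Hypothetical model of the state consulted by the callback. */
struct kvp_model {
	enum nego_state nego;
	bool daemon_ready;
};

/*
 * First call with no daemon connected: defer once. Any later call (the
 * delayed work firing after the timeout, or the daemon registering and
 * polling the channel) goes ahead and finishes the negotiation.
 */
static enum cb_action on_host_message(struct kvp_model *m)
{
	assert(m != NULL);

	if (m->nego == NEGO_NOT_STARTED && !m->daemon_ready) {
		m->nego = NEGO_IN_PROGRESS;
		return CB_DEFER;
	}

	m->nego = NEGO_FINISHED;
	return CB_PROCESS;
}
```

Because the state leaves NEGO_NOT_STARTED on the first call, the reply is postponed at most once, which keeps the total delay bounded by HV_UTIL_NEGO_TIMEOUT.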
[PATCH 3/6] hv: Lock access to hyperv_mmio resource tree
From: Jake Oshins <ja...@microsoft.com> In existing code, this tree of resources is created in single-threaded code and never modified after it is created, and thus needs no locking. This patch introduces a semaphore for tree access, as other patches in this series introduce run-time modifications of this resource tree which can happen on multiple threads. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 44e95a4..60553c1 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -102,6 +102,7 @@ static struct notifier_block hyperv_panic_block = { }; struct resource *hyperv_mmio; +DEFINE_SEMAPHORE(hyperv_mmio_lock); static int vmbus_exists(void) { @@ -1132,7 +1133,10 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); - int i; + int i, retval; + + retval = -ENXIO; + down(&hyperv_mmio_lock); for (iter = hyperv_mmio; iter; iter = iter->sibling) { if ((iter->start >= max) || (iter->end <= min)) @@ -1169,13 +1173,17 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, for (; start + size - 1 <= local_max; start += align) { *new = request_mem_region_exclusive(start, size, dev_n); - if (*new) - return 0; + if (*new) { + retval = 0; + goto exit; + } } } } - return -ENXIO; +exit: + up(&hyperv_mmio_lock); + return retval; } EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); -- 1.7.4.1
[PATCH 6/6] hv: Track allocations of children of hv_vmbus in private resource tree
From: Jake Oshins <ja...@microsoft.com> This patch changes vmbus_allocate_mmio() and vmbus_free_mmio() so that when child paravirtual devices allocate memory-mapped I/O space, they allocate it privately from a resource tree pointed at by hyperv_mmio and also from the public resource tree iomem_resource. This allows the region to be marked as "busy" in the private tree, but a "bridge window" in the public tree, guaranteeing that no two bridge windows will overlap each other while also allowing the PCI device children of the bridge windows to overlap that window. One might conclude that this belongs in the pnp layer, rather than in this driver. Rafael Wysocki, the maintainer of the pnp layer, has previously asked that we not modify the pnp layer as it is considered deprecated. This patch is thus essentially a workaround. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 22 +- 1 files changed, 21 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 1ce47d0..dfc6149 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1128,7 +1128,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t size, resource_size_t align, bool fb_overlap_ok) { - struct resource *iter; + struct resource *iter, *shadow; resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); @@ -1170,12 +1170,22 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, start = (local_min + align - 1) & ~(align - 1); for (; start + size - 1 <= local_max; start += align) { + shadow = __request_region(iter, start, + size, + NULL, + IORESOURCE_BUSY); + if (!shadow) + continue; + *new = request_mem_region_exclusive(start, size, dev_n); if (*new) { + shadow->name = (char *)*new; retval = 0; goto 
exit; } + + __release_region(iter, start, size); } } } @@ -1196,7 +1206,17 @@ EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); */ void vmbus_free_mmio(resource_size_t start, resource_size_t size) { + struct resource *iter; + + down(&hyperv_mmio_lock); + for (iter = hyperv_mmio; iter; iter = iter->sibling) { + if ((iter->start >= start + size) || (iter->end <= start)) + continue; + + __release_region(iter, start, size); + } release_mem_region(start, size); + up(&hyperv_mmio_lock); } EXPORT_SYMBOL_GPL(vmbus_free_mmio); -- 1.7.4.1
[PATCH 4/6] hv: Use new vmbus_mmio_free() from client drivers.
From: Jake Oshins <ja...@microsoft.com> This patch modifies all the callers of vmbus_allocate_mmio() to call vmbus_free_mmio() instead of release_mem_region(). Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/pci/host/pci-hyperv.c | 14 +++--- drivers/video/fbdev/hyperv_fb.c |4 ++-- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c index ed651ba..f2559b6 100644 --- a/drivers/pci/host/pci-hyperv.c +++ b/drivers/pci/host/pci-hyperv.c @@ -1795,14 +1795,14 @@ static void hv_pci_free_bridge_windows(struct hv_pcibus_device *hbus) if (hbus->low_mmio_space && hbus->low_mmio_res) { hbus->low_mmio_res->flags |= IORESOURCE_BUSY; - release_mem_region(hbus->low_mmio_res->start, - resource_size(hbus->low_mmio_res)); + vmbus_free_mmio(hbus->low_mmio_res->start, + resource_size(hbus->low_mmio_res)); } if (hbus->high_mmio_space && hbus->high_mmio_res) { hbus->high_mmio_res->flags |= IORESOURCE_BUSY; - release_mem_region(hbus->high_mmio_res->start, - resource_size(hbus->high_mmio_res)); + vmbus_free_mmio(hbus->high_mmio_res->start, + resource_size(hbus->high_mmio_res)); } } @@ -1880,8 +1880,8 @@ static int hv_pci_allocate_bridge_windows(struct hv_pcibus_device *hbus) release_low_mmio: if (hbus->low_mmio_res) { - release_mem_region(hbus->low_mmio_res->start, - resource_size(hbus->low_mmio_res)); + vmbus_free_mmio(hbus->low_mmio_res->start, + resource_size(hbus->low_mmio_res)); } return ret; @@ -1924,7 +1924,7 @@ static int hv_allocate_config_window(struct hv_pcibus_device *hbus) static void hv_free_config_window(struct hv_pcibus_device *hbus) { - release_mem_region(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH); + vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH); } /** diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c index e2451bd..2fd49b2 100644 --- a/drivers/video/fbdev/hyperv_fb.c +++ 
b/drivers/video/fbdev/hyperv_fb.c @@ -743,7 +743,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info) err3: iounmap(fb_virt); err2: - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; err1: if (!gen2vm) @@ -758,7 +758,7 @@ static void hvfb_putmem(struct fb_info *info) struct hvfb_par *par = info->par; iounmap(info->screen_base); - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; } -- 1.7.4.1
[PATCH 5/6] hv: Reverse order of resources in hyperv_mmio
From: Jake Oshins <ja...@microsoft.com> A patch later in this series allocates child nodes in this resource tree. For that to work, this tree needs to be sorted in ascending order. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 60553c1..1ce47d0 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1049,7 +1049,6 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) new_res->end = end; /* -* Stick ranges from higher in address space at the front of the list. * If two ranges are adjacent, merge them. */ do { @@ -1070,7 +1069,7 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) break; } - if ((*old_res)->end < new_res->start) { + if ((*old_res)->start > new_res->end) { new_res->sibling = *old_res; if (prev_res) (*prev_res)->sibling = new_res; -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
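The ascending-order insertion this patch switches to can be sketched in userspace C as follows. This is only an illustration, not the kernel code: `struct range` and `range_insert()` are hypothetical names, overlapping ranges are assumed not to occur, and, as in the patched loop, a merge only extends the first adjacent neighbour it meets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a resource node; not struct resource. */
struct range {
	uint64_t start;
	uint64_t end;           /* inclusive */
	struct range *sibling;  /* next range, at a higher address */
};

/*
 * Insert [start, end] so that the list stays sorted in ascending
 * order. If the new range is directly adjacent to an existing one,
 * extend that entry instead of allocating a new node.
 * Returns 0 on success, -1 if allocation fails.
 */
static int range_insert(struct range **head, uint64_t start, uint64_t end)
{
	struct range **link = head;
	struct range *node;

	assert(start <= end);

	for (; *link; link = &(*link)->sibling) {
		struct range *cur = *link;

		if (cur->end + 1 == start) {  /* new range follows cur */
			cur->end = end;
			return 0;
		}
		if (end + 1 == cur->start) {  /* new range precedes cur */
			cur->start = start;
			return 0;
		}
		if (cur->start > end)         /* first entry above new range */
			break;
	}

	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->start = start;
	node->end = end;
	node->sibling = *link;  /* link in front of the higher entry */
	*link = node;
	return 0;
}
```

The comparison `cur->start > end` plays the role of the condition the patch changes: the walk stops at the first entry that lies entirely above the new range, so lower addresses stay at the front of the list.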
[PATCH 2/6] hv: Make a function to free mmio regions through vmbus
From: Jake Oshins <ja...@microsoft.com> This patch introduces a function that reverses everything done by vmbus_allocate_mmio(). Existing code just called release_mem_region(). Future patches in this series require a more complex sequence of actions, so this function is introduced to wrap those actions. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 15 +++ include/linux/hyperv.h |2 +- 2 files changed, 16 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 64713ff..44e95a4 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1180,6 +1180,21 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); /** + * vmbus_free_mmio() - Free a memory-mapped I/O range. + * @start: Base address of region to release. + * @size: Size of the range to be allocated + * + * This function releases anything requested by + * vmbus_mmio_allocate(). + */ +void vmbus_free_mmio(resource_size_t start, resource_size_t size) +{ + release_mem_region(start, size); + +} +EXPORT_SYMBOL_GPL(vmbus_free_mmio); + +/** * vmbus_cpu_number_to_vp_number() - Map CPU to VP. * @cpu_number: CPU number in Linux terms * diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index aa0fadc..ecd81c3 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1091,7 +1091,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t min, resource_size_t max, resource_size_t size, resource_size_t align, bool fb_overlap_ok); - +void vmbus_free_mmio(resource_size_t start, resource_size_t size); int vmbus_cpu_number_to_vp_number(int cpu_number); u64 hv_do_hypercall(u64 control, void *input, void *output); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/6] Drivers: hv: vmbus: Cleanup and mmio management.
Cleanup and mmio management. Also included is a patch to fix an issue in KVP. Jake Oshins (5): hv: Make a function to free mmio regions through vmbus hv: Lock access to hyperv_mmio resource tree hv: Use new vmbus_mmio_free() from client drivers. hv: Reverse order of resources in hyperv_mmio hv: Track allocations of children of hv_vmbus in private resource tree Vitaly Kuznetsov (1): Drivers: hv: kvp: fix IP Failover drivers/hv/hv_kvp.c | 31 + drivers/hv/hyperv_vmbus.h |5 +++ drivers/hv/vmbus_drv.c | 56 ++- drivers/pci/host/pci-hyperv.c | 14 +- drivers/video/fbdev/hyperv_fb.c |4 +- include/linux/hyperv.h |2 +- 6 files changed, 95 insertions(+), 17 deletions(-) -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/7] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
Introduce separate functions for estimating how much can be read from and written to the ring buffer. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 24 include/linux/hyperv.h | 27 +++ 2 files changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 085003a..902375b 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -38,8 +38,6 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) u32 hv_end_read(struct hv_ring_buffer_info *rbi) { - u32 read; - u32 write; rbi->ring_buffer->interrupt_mask = 0; mb(); @@ -49,9 +47,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) * If it is not, we raced and we need to process new * incoming messages. */ - hv_get_ringbuffer_availbytes(rbi, &read, &write); - - return read; + return hv_get_bytes_to_read(rbi); } /* @@ -106,18 +102,13 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; - u32 r_size; - u32 write_loc = rbi->ring_buffer->write_index; - u32 read_loc = rbi->ring_buffer->read_index; u32 pending_sz = rbi->ring_buffer->pending_send_sz; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; - r_size = rbi->ring_datasize; - cur_write_sz = write_loc >= read_loc ? 
r_size - (write_loc - read_loc) : - read_loc - write_loc; + cur_write_sz = hv_get_bytes_to_write(rbi); if (cur_write_sz >= pending_sz) return true; @@ -317,7 +308,6 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, { int i = 0; u32 bytes_avail_towrite; - u32 bytes_avail_toread; u32 totalbytes_towrite = 0; u32 next_write_location; @@ -333,9 +323,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, if (lock) spin_lock_irqsave(&outring_info->ring_lock, flags); - hv_get_ringbuffer_availbytes(outring_info, - &bytes_avail_toread, - &bytes_avail_towrite); + bytes_avail_towrite = hv_get_bytes_to_write(outring_info); /* * If there is only room for the packet, assume it is full. @@ -386,7 +374,6 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, void *buffer, u32 buflen, u32 *buffer_actual_len, u64 *requestid, bool *signal, bool raw) { - u32 bytes_avail_towrite; u32 bytes_avail_toread; u32 next_read_location = 0; u64 prev_indices = 0; @@ -402,10 +389,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, *buffer_actual_len = 0; *requestid = 0; - hv_get_ringbuffer_availbytes(inring_info, - &bytes_avail_toread, - &bytes_avail_towrite); - + bytes_avail_toread = hv_get_bytes_to_read(inring_info); /* Make sure there is something to read */ if (bytes_avail_toread < sizeof(desc)) { /* diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index ecd81c3..a6b053c 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -151,6 +151,33 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi, *read = dsize - *write; } +static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, read; + + dsize = rbi->ring_datasize; + read_loc = rbi->ring_buffer->read_index; + write_loc = READ_ONCE(rbi->ring_buffer->write_index); + + read = write_loc >= read_loc ? 
(write_loc - read_loc) : + (dsize - read_loc) + write_loc; + + return read; +} + +static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, write; + + dsize = rbi->ring_datasize; + read_loc = READ_ONCE(rbi->ring_buffer->read_index); + write_loc = rbi->ring_buffer->write_index; + + write = write_loc >= read_loc ? dsize - (write_loc - read_loc) : + read_loc - write_loc; + return write; +} + /* * VMBUS version is 32 bit entity broken up into * two 16 bit quantities: major_number. minor_number. -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
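The index arithmetic in the two new helpers can be checked outside the kernel with a small userspace sketch. The struct and function names below are hypothetical stand-ins, and the `READ_ONCE()` accesses from the patch are deliberately left out, so this shows only the wrap-around calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ring state; not the kernel struct. */
struct ring_state {
	uint32_t read_index;   /* next byte the consumer will read */
	uint32_t write_index;  /* next byte the producer will write */
	uint32_t datasize;     /* size of the data area in bytes */
};

/* Bytes readable: distance from read_index forward to write_index. */
static uint32_t bytes_to_read(const struct ring_state *r)
{
	assert(r->read_index < r->datasize);
	assert(r->write_index < r->datasize);

	return r->write_index >= r->read_index
		? r->write_index - r->read_index
		: (r->datasize - r->read_index) + r->write_index;
}

/* Bytes writable: the part of the data area not covered by the above. */
static uint32_t bytes_to_write(const struct ring_state *r)
{
	assert(r->read_index < r->datasize);
	assert(r->write_index < r->datasize);

	return r->write_index >= r->read_index
		? r->datasize - (r->write_index - r->read_index)
		: r->read_index - r->write_index;
}
```

For any valid pair of indices the two results add up to `datasize`, which is why an empty ring (`read_index == write_index`) reports the whole data area as writable.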
[PATCH 4/7] Drivers: hv: vmbus: Use the new virt_xx barrier code
Use the virt_xx barriers that have been defined for use in virtual machines. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 67dc245..c2c2b2e 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -33,14 +33,14 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 1; - mb(); + virt_mb(); } u32 hv_end_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 0; - mb(); + virt_mb(); /* * Now check to see if the ring buffer is still empty. @@ -68,12 +68,12 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { - mb(); + virt_mb(); if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ - rmb(); + virt_rmb(); /* * This is the only case we need to signal when the * ring transitions from being empty to non-empty. @@ -104,7 +104,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) u32 cur_write_sz; u32 pending_sz; - mb(); + virt_mb(); pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) @@ -359,7 +359,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, sizeof(u64)); /* Issue a full memory barrier before updating the write index */ - mb(); + virt_mb(); /* Now, update the write location */ hv_set_next_write_location(outring_info, next_write_location); @@ -435,7 +435,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, * the writer may start writing to the read area once the read index * is updated. 
*/ - mb(); + virt_mb(); /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/7] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
Use the READ_ONCE() macro to access variables that can change asynchronously. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |7 --- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 902375b..2919395 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -69,7 +69,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { mb(); - if (rbi->ring_buffer->interrupt_mask) + if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ @@ -78,7 +78,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * This is the only case we need to signal when the * ring transitions from being empty to non-empty. */ - if (old_write == rbi->ring_buffer->read_index) + if (old_write == READ_ONCE(rbi->ring_buffer->read_index)) return true; return false; @@ -102,8 +102,9 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; - u32 pending_sz = rbi->ring_buffer->pending_send_sz; + u32 pending_sz; + pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; -- 1.7.4.1
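The idea behind this change can be illustrated in userspace with a simplified stand-in for the kernel macro. `READ_ONCE_U32()`, `struct shared_ring` and `should_signal()` are hypothetical names for this sketch; the real `READ_ONCE()` handles more types and cases. The point is that the shared field is loaded exactly once into a local, so both comparisons see the same snapshot even if the other side changes the field in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(), for 32-bit
 * values only: the volatile access forces a single real load.
 */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Hypothetical shared state that another party may update at any time. */
struct shared_ring {
	uint32_t pending_send_sz;
};

/* Decide whether the blocked producer should be signalled. */
static bool should_signal(struct shared_ring *s, uint32_t free_bytes)
{
	/* One load; every later check uses this local snapshot. */
	uint32_t pending = READ_ONCE_U32(s->pending_send_sz);

	if (pending == 0)
		return false;   /* nobody is waiting for room */

	return free_bytes >= pending;
}
```

A volatile load on its own does not order this access against other memory operations; ordering is what the barriers elsewhere in this series are for.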
[PATCH 7/7] Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets
Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time and as part of processing each packet we potentially may signal the host (if it is waiting for room to produce a packet). These APIs help batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run on average I see about 20X reduction in checks to signal the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |1 + include/linux/hyperv.h | 86 ++ 2 files changed, 87 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 253311b..a2a38ab 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -132,6 +132,7 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, u32 next_read_location) { ring_info->ring_buffer->read_index = next_read_location; + ring_info->priv_read_index = next_read_location; } diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 8fc9b09..3fadbaf 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -126,6 +126,8 @@ struct hv_ring_buffer_info { u32 ring_datasize; /* < ring_size */ u32 ring_data_startoffset; + u32 priv_write_index; + u32 priv_read_index; }; /* @@ -1408,4 +1410,88 @@ static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) return false; } +/* + * An API to support in-place processing of incoming VMBUS packets. 
+ */ +#define VMBUS_PKT_TRAILER 8 + +static inline struct vmpacket_descriptor * +get_next_pkt_raw(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->priv_read_index; + void *ring_buffer = hv_get_ring_buffer(ring_info); + struct vmpacket_descriptor *cur_desc; + u32 packetlen; + u32 dsize = ring_info->ring_datasize; + u32 delta = read_loc - ring_info->ring_buffer->read_index; + u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta); + + if (bytes_avail_toread < sizeof(struct vmpacket_descriptor)) + return NULL; + + if ((read_loc + sizeof(*cur_desc)) > dsize) + return NULL; + + cur_desc = ring_buffer + read_loc; + packetlen = cur_desc->len8 << 3; + + /* +* If the packet under consideration is wrapping around, +* return failure. +*/ + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1)) + return NULL; + + return cur_desc; +} + +/* + * A helper function to step through packets "in-place" + * This API is to be called after each successful call + * get_next_pkt_raw(). + */ +static inline void put_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor *desc) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->priv_read_index; + u32 packetlen = desc->len8 << 3; + u32 dsize = ring_info->ring_datasize; + + if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize) + BUG(); + /* +* Include the packet trailer. +*/ + ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER; +} + +/* + * This call commits the read index and potentially signals the host. 
+ * Here is the pattern for using the "in-place" consumption APIs: + * + * while (get_next_pkt_raw() { + * process the packet "in-place"; + * put_pkt_raw(); + * } + * if (packets processed in place) + * commit_rd_index(); + */ +static inline void commit_rd_index(struct vmbus_channel *channel) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + /* +* Make sure all reads are done before we update the read index since +* the writer may start writing to the read area once the read index +* is updated. +*/ + virt_rmb(); + ring_info->ring_buffer->read_index = ring_info->priv_read_index; + + if (hv_need_to_signal_on_read(ring_info)) + vmbus_set_event(channel); +} + + #endif /* _HYPERV_H */ -- 1.7.4.1
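The consumption pattern described in the comment above (step a private read index per packet, publish it once at the end) can be sketched in userspace roughly as follows. Everything here is an assumption made for illustration: `struct ring`, `next_pkt()`, `put_pkt()`, `commit()`, `drain()` and `push_pkt()` are hypothetical names, the packet header is reduced to a 16-bit length in 8-byte units, and wrap-around, memory barriers and host signalling are all left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE   256u
#define HDR_SIZE    8u  /* hypothetical header size for this sketch */
#define PKT_TRAILER 8u  /* trailer bytes per packet, as in the patch */

/* Hypothetical userspace ring; not the kernel's hv_ring_buffer_info. */
struct ring {
	uint8_t  data[RING_SIZE];
	uint32_t read_index;       /* published to the producer */
	uint32_t write_index;      /* advanced by the producer */
	uint32_t priv_read_index;  /* consumer-private cursor */
};

/* Packet length in bytes; the first 16 bits hold it in 8-byte units. */
static uint32_t pkt_len(const uint8_t *pkt)
{
	uint16_t len8;

	memcpy(&len8, pkt, sizeof(len8));
	return (uint32_t)len8 << 3;
}

/* Helper standing in for the producer: append one packet. */
static void push_pkt(struct ring *r, uint16_t len8)
{
	uint32_t total = ((uint32_t)len8 << 3) + PKT_TRAILER;

	assert(((uint32_t)len8 << 3) >= HDR_SIZE);
	assert(r->write_index + total <= RING_SIZE);
	memcpy(&r->data[r->write_index], &len8, sizeof(len8));
	r->write_index += total;
}

/* Return the next complete packet at the private cursor, or NULL. */
static const uint8_t *next_pkt(const struct ring *r)
{
	uint32_t loc = r->priv_read_index;
	uint32_t avail;

	assert(r->read_index <= loc && loc <= r->write_index);
	avail = r->write_index - loc;
	if (avail < HDR_SIZE)
		return NULL;
	if (pkt_len(&r->data[loc]) + PKT_TRAILER > avail)
		return NULL;  /* packet not fully written yet */
	return &r->data[loc];
}

/* Step the private cursor past one packet and its trailer. */
static void put_pkt(struct ring *r, const uint8_t *pkt)
{
	r->priv_read_index += pkt_len(pkt) + PKT_TRAILER;
}

/* Publish the private cursor; the kernel code orders this with a barrier. */
static void commit(struct ring *r)
{
	r->read_index = r->priv_read_index;
}

/* Consume every complete packet in place, then commit once. */
static unsigned int drain(struct ring *r)
{
	const uint8_t *pkt;
	unsigned int n = 0;

	while ((pkt = next_pkt(r)) != NULL) {
		/* a real consumer would process *pkt here, without copying */
		put_pkt(r, pkt);
		n++;
	}
	if (n)
		commit(r);
	return n;
}
```

Because `read_index` only moves in `commit()`, the producer cannot reuse the space of a packet that is still being processed, and the decision whether to signal is made once per batch instead of once per packet.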
[PATCH 6/7] Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h
In preparation for implementing APIs for in-place consumption of VMBUS packets, move some ring buffer functionality into hyperv.h. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 42 -- include/linux/hyperv.h | 42 ++ 2 files changed, 42 insertions(+), 42 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index c2c2b2e..253311b 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -84,40 +84,6 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) return false; } -/* - * To optimize the flow management on the send-side, - * when the sender is blocked because of lack of - * sufficient space in the ring buffer, potential the - * consumer of the ring buffer can signal the producer. - * This is controlled by the following parameters: - * - * 1. pending_send_sz: This is the size in bytes that the - *producer is trying to send. - * 2. The feature bit feat_pending_send_sz set to indicate if - *the consumer of the ring will signal when the ring - *state transitions from being full to a state where - *there is room for the producer to send the pending packet. - */ - -static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) -{ - u32 cur_write_sz; - u32 pending_sz; - - virt_mb(); - pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); - /* If the other end is not blocked on write don't bother. */ - if (pending_sz == 0) - return false; - - cur_write_sz = hv_get_bytes_to_write(rbi); - - if (cur_write_sz >= pending_sz) - return true; - - return false; -} - /* Get the next write location for the specified ring buffer. */ static inline u32 hv_get_next_write_location(struct hv_ring_buffer_info *ring_info) @@ -169,14 +135,6 @@ hv_set_next_read_location(struct hv_ring_buffer_info *ring_info, } -/* Get the start of the ring buffer. 
*/ -static inline void * -hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) -{ - return (void *)ring_info->ring_buffer->buffer; -} - - /* Get the size of the ring buffer. */ static inline u32 hv_get_ring_buffersize(struct hv_ring_buffer_info *ring_info) diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 4adeb6e..8fc9b09 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1366,4 +1366,46 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); void vmbus_set_event(struct vmbus_channel *channel); + +/* Get the start of the ring buffer. */ +static inline void * +hv_get_ring_buffer(struct hv_ring_buffer_info *ring_info) +{ + return (void *)ring_info->ring_buffer->buffer; +} + +/* + * To optimize the flow management on the send-side, + * when the sender is blocked because of lack of + * sufficient space in the ring buffer, potential the + * consumer of the ring buffer can signal the producer. + * This is controlled by the following parameters: + * + * 1. pending_send_sz: This is the size in bytes that the + *producer is trying to send. + * 2. The feature bit feat_pending_send_sz set to indicate if + *the consumer of the ring will signal when the ring + *state transitions from being full to a state where + *there is room for the producer to send the pending packet. + */ + +static inline bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) +{ + u32 cur_write_sz; + u32 pending_sz; + + virt_mb(); + pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); + /* If the other end is not blocked on write don't bother. */ + if (pending_sz == 0) + return false; + + cur_write_sz = hv_get_bytes_to_write(rbi); + + if (cur_write_sz >= pending_sz) + return true; + + return false; +} + #endif /* _HYPERV_H */ -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 5/7] Drivers: hv: vmbus: Export the vmbus_set_event() API
In preparation for moving some ring buffer functionality out of the vmbus driver, export the API for signaling the host. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/connection.c |1 + drivers/hv/hyperv_vmbus.h |2 -- include/linux/hyperv.h|1 + 3 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index d02f137..fcf8a02 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -495,3 +495,4 @@ void vmbus_set_event(struct vmbus_channel *channel) hv_do_hypercall(HVCALL_SIGNAL_EVENT, channel->sig_event, NULL); } +EXPORT_SYMBOL_GPL(vmbus_set_event); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index 28e9df9..8cbd630 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -678,8 +678,6 @@ void vmbus_disconnect(void); int vmbus_post_msg(void *buffer, size_t buflen); -void vmbus_set_event(struct vmbus_channel *channel); - void vmbus_on_event(unsigned long data); void vmbus_on_msg_dpc(unsigned long data); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index a6b053c..4adeb6e 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1365,4 +1365,5 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); +void vmbus_set_event(struct vmbus_channel *channel); #endif /* _HYPERV_H */ -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/7] Drivers: hv: vmbus: Cleanup the ring buffer code
Clean up and fix a bug in the ring buffer code. Also implement APIs for in-place consumption of received packets. K. Y. Srinivasan (7): Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile Drivers: hv: vmbus: Fix a bug in hv_need_to_signal_on_read() Drivers: hv: vmbus: Use the new virt_xx barrier code Drivers: hv: vmbus: Export the vmbus_set_event() API Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets drivers/hv/connection.c |1 + drivers/hv/hyperv_vmbus.h |2 - drivers/hv/ring_buffer.c | 79 --- include/linux/hyperv.h| 156 + 4 files changed, 169 insertions(+), 69 deletions(-) -- 1.7.4.1
[PATCH 3/7] Drivers: hv: vmbus: Fix a bug in hv_need_to_signal_on_read()
We need to issue a full memory barrier prior to making a signalling decision. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Cc: sta...@vger.kernel.org --- drivers/hv/ring_buffer.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 2919395..67dc245 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -104,6 +104,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) u32 cur_write_sz; u32 pending_sz; + mb(); pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) -- 1.7.4.1
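The ordering this fix is after (publish the new read index first, only then look at `pending_send_sz`) is a store followed by a load, which weaker barriers do not keep in order. A rough userspace analogy using C11 atomics is sketched below; `struct shared` and `consume_and_check()` are hypothetical, and `atomic_thread_fence(memory_order_seq_cst)` is only an analogue of the kernel's `mb()`, not the same primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state shared between consumer and producer. */
struct shared {
	_Atomic uint32_t read_index;       /* written by the consumer */
	_Atomic uint32_t pending_send_sz;  /* written by the producer */
};

/*
 * Consumer side: publish the new read index, then decide whether a
 * blocked producer needs a wake-up. The full fence keeps the load of
 * pending_send_sz from being satisfied before the store to read_index
 * is visible; otherwise both sides could miss each other's update and
 * the producer would stay blocked.
 */
static bool consume_and_check(struct shared *s, uint32_t new_read_index,
			      uint32_t free_bytes)
{
	uint32_t pending;

	atomic_store_explicit(&s->read_index, new_read_index,
			      memory_order_relaxed);

	/* Full barrier: orders the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	pending = atomic_load_explicit(&s->pending_send_sz,
				       memory_order_relaxed);
	if (pending == 0)
		return false;  /* producer is not waiting */

	return free_bytes >= pending;
}
```

The producer would need the mirror-image sequence (store `pending_send_sz`, full fence, load `read_index`) for the handshake to be safe in both directions.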
[PATCH 1/5] Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer
Introduce separate functions for estimating how much can be read from and written to the ring buffer. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 24 include/linux/hyperv.h | 27 +++ 2 files changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 085003a..902375b 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -38,8 +38,6 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) u32 hv_end_read(struct hv_ring_buffer_info *rbi) { - u32 read; - u32 write; rbi->ring_buffer->interrupt_mask = 0; mb(); @@ -49,9 +47,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) * If it is not, we raced and we need to process new * incoming messages. */ - hv_get_ringbuffer_availbytes(rbi, &read, &write); - - return read; + return hv_get_bytes_to_read(rbi); } /* @@ -106,18 +102,13 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; - u32 r_size; - u32 write_loc = rbi->ring_buffer->write_index; - u32 read_loc = rbi->ring_buffer->read_index; u32 pending_sz = rbi->ring_buffer->pending_send_sz; /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; - r_size = rbi->ring_datasize; - cur_write_sz = write_loc >= read_loc ? 
r_size - (write_loc - read_loc) : - read_loc - write_loc; + cur_write_sz = hv_get_bytes_to_write(rbi); if (cur_write_sz >= pending_sz) return true; @@ -317,7 +308,6 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, { int i = 0; u32 bytes_avail_towrite; - u32 bytes_avail_toread; u32 totalbytes_towrite = 0; u32 next_write_location; @@ -333,9 +323,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, if (lock) spin_lock_irqsave(&outring_info->ring_lock, flags); - hv_get_ringbuffer_availbytes(outring_info, - &bytes_avail_toread, - &bytes_avail_towrite); + bytes_avail_towrite = hv_get_bytes_to_write(outring_info); /* * If there is only room for the packet, assume it is full. @@ -386,7 +374,6 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, void *buffer, u32 buflen, u32 *buffer_actual_len, u64 *requestid, bool *signal, bool raw) { - u32 bytes_avail_towrite; u32 bytes_avail_toread; u32 next_read_location = 0; u64 prev_indices = 0; @@ -402,10 +389,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, *buffer_actual_len = 0; *requestid = 0; - hv_get_ringbuffer_availbytes(inring_info, - &bytes_avail_toread, - &bytes_avail_towrite); - + bytes_avail_toread = hv_get_bytes_to_read(inring_info); /* Make sure there is something to read */ if (bytes_avail_toread < sizeof(desc)) { /* diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index ecd81c3..a6b053c 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -151,6 +151,33 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi, *read = dsize - *write; } +static inline u32 hv_get_bytes_to_read(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, read; + + dsize = rbi->ring_datasize; + read_loc = rbi->ring_buffer->read_index; + write_loc = READ_ONCE(rbi->ring_buffer->write_index); + + read = write_loc >= read_loc ? 
(write_loc - read_loc) : + (dsize - read_loc) + write_loc; + + return read; +} + +static inline u32 hv_get_bytes_to_write(struct hv_ring_buffer_info *rbi) +{ + u32 read_loc, write_loc, dsize, write; + + dsize = rbi->ring_datasize; + read_loc = READ_ONCE(rbi->ring_buffer->read_index); + write_loc = rbi->ring_buffer->write_index; + + write = write_loc >= read_loc ? dsize - (write_loc - read_loc) : + read_loc - write_loc; + return write; +} + /* * VMBUS version is 32 bit entity broken up into * two 16 bit quantities: major_number. minor_number. -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 5/5] Drivers: hv: vmbus: Implement copy-free read APIs
Implement copy-free read APIs. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 55 ++ include/linux/hyperv.h |6 + 2 files changed, 61 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index c2c2b2e..c80e1f3 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -444,3 +444,58 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, return ret; } + +/* + * In-place read functions. + */ +bool get_next_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor **desc) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->ring_buffer->read_index; + void *ring_buffer = hv_get_ring_buffer(ring_info); + struct vmpacket_descriptor *cur_desc; + u32 packetlen; + u32 dsize = ring_info->ring_datasize; + u32 bytes_avail_toread = hv_get_bytes_to_read(ring_info); + + if (bytes_avail_toread < sizeof(struct vmpacket_descriptor)) + return false; + + if ((read_loc + sizeof(*desc)) > dsize) + return false; + + cur_desc = ring_buffer + read_loc; + packetlen = cur_desc->len8 << 3; + + if ((read_loc + packetlen + 8) > (dsize - 1)) + return false; + + *desc = cur_desc; + return true; +} +EXPORT_SYMBOL_GPL(get_next_pkt_raw); + +void put_pkt_raw(struct vmbus_channel *channel, +struct vmpacket_descriptor *desc) +{ + struct hv_ring_buffer_info *ring_info = &channel->inbound; + u32 read_loc = ring_info->ring_buffer->read_index; + u32 packetlen = desc->len8 << 3; + u32 dsize = ring_info->ring_datasize; + + if ((read_loc + packetlen + 8) > dsize) + BUG(); + + /* +* Make sure all reads are done before we update the read index since +* the writer may start writing to the read area once the read index +* is updated. 
+*/ + virt_mb(); + ring_info->ring_buffer->read_index += packetlen + 8; + + if (hv_need_to_signal_on_read(ring_info)) + vmbus_set_event(channel); +} +EXPORT_SYMBOL_GPL(put_pkt_raw); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index a6b053c..455f3f0 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1035,6 +1035,12 @@ extern int vmbus_recvpacket_raw(struct vmbus_channel *channel, u32 *buffer_actual_len, u64 *requestid); +bool get_next_pkt_raw(struct vmbus_channel *channel, + struct vmpacket_descriptor **desc); + +void put_pkt_raw(struct vmbus_channel *channel, +struct vmpacket_descriptor *desc); + extern void vmbus_ontimer(unsigned long data); -- 1.7.4.1 ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/5] Drivers: hv: vmbus
Clean up the Hyper-V ring buffer code. Also implement APIs for supporting copy-free operations on the read side. K. Y. Srinivasan (5): Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile Drivers: hv: vmbus: Fix a bug in hv_need_to_signal_on_read() Drivers: hv: vmbus: Use the new virt_xx barrier code Drivers: hv: vmbus: Implement copy-free read APIs drivers/hv/ring_buffer.c | 99 - include/linux/hyperv.h | 33 +++ 2 files changed, 103 insertions(+), 29 deletions(-) -- 1.7.4.1
[PATCH 2/5] Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile
Use the READ_ONCE() macro to access variables that can change asynchronously. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c |7 --- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 902375b..2919395 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -69,7 +69,7 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { mb(); - if (rbi->ring_buffer->interrupt_mask) + if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ @@ -78,7 +78,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) * This is the only case we need to signal when the * ring transitions from being empty to non-empty. */ - if (old_write == rbi->ring_buffer->read_index) + if (old_write == READ_ONCE(rbi->ring_buffer->read_index)) return true; return false; @@ -102,8 +102,9 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; - u32 pending_sz = rbi->ring_buffer->pending_send_sz; + u32 pending_sz; + pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) return false; -- 1.7.4.1
[PATCH 3/5] Drivers: hv: vmbus: Fix a bug in hv_need_to_signal_on_read()
We need to issue a full memory barrier prior to making a signalling decision. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Cc: sta...@vger.kernel.org --- drivers/hv/ring_buffer.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 2919395..67dc245 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -104,6 +104,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) u32 cur_write_sz; u32 pending_sz; + mb(); pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) -- 1.7.4.1
[PATCH 4/5] Drivers: hv: vmbus: Use the new virt_xx barrier code
Use the virt_xx barriers that have been defined for use in virtual machines. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/ring_buffer.c | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 67dc245..c2c2b2e 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -33,14 +33,14 @@ void hv_begin_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 1; - mb(); + virt_mb(); } u32 hv_end_read(struct hv_ring_buffer_info *rbi) { rbi->ring_buffer->interrupt_mask = 0; - mb(); + virt_mb(); /* * Now check to see if the ring buffer is still empty. @@ -68,12 +68,12 @@ u32 hv_end_read(struct hv_ring_buffer_info *rbi) static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) { - mb(); + virt_mb(); if (READ_ONCE(rbi->ring_buffer->interrupt_mask)) return false; /* check interrupt_mask before read_index */ - rmb(); + virt_rmb(); /* * This is the only case we need to signal when the * ring transitions from being empty to non-empty. @@ -104,7 +104,7 @@ static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) u32 cur_write_sz; u32 pending_sz; - mb(); + virt_mb(); pending_sz = READ_ONCE(rbi->ring_buffer->pending_send_sz); /* If the other end is not blocked on write don't bother. */ if (pending_sz == 0) @@ -359,7 +359,7 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info, sizeof(u64)); /* Issue a full memory barrier before updating the write index */ - mb(); + virt_mb(); /* Now, update the write location */ hv_set_next_write_location(outring_info, next_write_location); @@ -435,7 +435,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, * the writer may start writing to the read area once the read index * is updated. 
*/ - mb(); + virt_mb(); /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); -- 1.7.4.1
[PATCH 1/1] scsi: storvsc: Support manual scan of FC hosts on Hyper-V
The default user scan function associated with FC (fc_user_scan) is not suitable for FC hosts on Hyper-V since we don't have an rport associated with an FC host on Hyper-V. Set it to NULL so we can support manual scan of FC targets on Hyper-V. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Long Li <lon...@microsoft.com> Reviewed-by: Long Li <lon...@microsoft.com> --- drivers/scsi/storvsc_drv.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 132b168..8aec590 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1776,6 +1776,12 @@ static int __init storvsc_drv_init(void) * Install Hyper-V specific timeout handler. */ fc_transport_template->eh_timed_out = storvsc_eh_timed_out; + /* +* The default user scan function associated with FC (fc_user_scan) +* is not suitable for FC hosts on Hyper-V. Set it to NULL so we can +* support manual scan of FC targets on Hyper-V. +*/ + fc_transport_template->user_scan = NULL; #endif ret = vmbus_driver_register(&storvsc_drv); -- 1.7.4.1
[PATCH 1/1] Drivers: hv: vmbus: Fix a bug in hv_need_to_signal_on_read()
On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signalling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signalling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signalling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arsen...@microsoft.com> for pointing out this bug. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Tested-by: Dexuan Cui <de...@microsoft.com> Cc: <sta...@vger.kernel.org> --- drivers/hv/ring_buffer.c |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 5613e2b..085003a 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -103,8 +103,7 @@ static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi) *there is room for the producer to send the pending packet. */ -static bool hv_need_to_signal_on_read(u32 prev_write_sz, - struct hv_ring_buffer_info *rbi) +static bool hv_need_to_signal_on_read(struct hv_ring_buffer_info *rbi) { u32 cur_write_sz; u32 r_size; @@ -120,7 +119,7 @@ static bool hv_need_to_signal_on_read(u32 prev_write_sz, cur_write_sz = write_loc >= read_loc ? 
r_size - (write_loc - read_loc) : read_loc - write_loc; - if ((prev_write_sz < pending_sz) && (cur_write_sz >= pending_sz)) + if (cur_write_sz >= pending_sz) return true; return false; @@ -455,7 +454,7 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info, /* Update the read index */ hv_set_next_read_location(inring_info, next_read_location); - *signal = hv_need_to_signal_on_read(bytes_avail_towrite, inring_info); + *signal = hv_need_to_signal_on_read(inring_info); return ret; } -- 1.7.4.1
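The signalling rule after this fix can be sketched in plain C as below. The free-space formula is the one visible in the diff context; the early return for pending_sz == 0 is taken from the context of the READ_ONCE() patch earlier in this thread. The function names, the flat parameter list, and the absence of barriers are assumptions for illustration, not the driver's actual interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Free space in a circular buffer, as computed in the diff context. */
static uint32_t bytes_avail_to_write_sketch(uint32_t read_loc,
					     uint32_t write_loc,
					     uint32_t ring_size)
{
	return write_loc >= read_loc ?
		ring_size - (write_loc - read_loc) :
		read_loc - write_loc;
}

/*
 * Decide whether the reader should signal the producer once a read has
 * completed. Only the space available now matters; the space available
 * before the read is intentionally not consulted, since it can be stale.
 */
static bool need_to_signal_on_read_sketch(uint32_t pending_sz,
					  uint32_t read_loc,
					  uint32_t write_loc,
					  uint32_t ring_size)
{
	/* The producer is not blocked on a write: nothing to do. */
	if (pending_sz == 0)
		return false;

	return bytes_avail_to_write_sketch(read_loc, write_loc,
					   ring_size) >= pending_sz;
}
```

Dropping the old `prev_write_sz < pending_sz` term can cause an occasional extra signal, which is harmless, whereas a missed signal leaves the producer blocked.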
[PATCH 1/6] Drivers: hv: kvp: fix IP Failover
From: Vitaly Kuznetsov <vkuzn...@redhat.com> Hyper-V VMs can be replicated to other hosts, and there is a feature to set a different IP for replicas; it is called 'Failover TCP/IP'. When such a guest starts, the Hyper-V host sends it a KVP_OP_SET_IP_INFO message as soon as we finish the negotiation procedure. The problem is that this can happen (and it actually does happen) before the userspace daemon connects, and we reply with HV_E_FAIL to the message. As there are no retries, we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message until the userspace daemon is connected. We can't wait too long, as there is a host-side timeout (ca. 75 seconds), and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal: if it takes the userspace daemon more than 60 seconds to connect, IP Failover will still fail, but I don't see a solution with our current separation between kernel and userspace parts. The other two modules (VSS and FCOPY) don't require such a delay; leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Signed-off-by: K. Y. 
Srinivasan <k...@microsoft.com> --- drivers/hv/hv_kvp.c | 30 ++ drivers/hv/hyperv_vmbus.h |5 + 2 files changed, 35 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index 9b9b370..0d3fcd6 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -78,9 +78,11 @@ static void kvp_send_key(struct work_struct *dummy); static void kvp_respond_to_host(struct hv_kvp_msg *msg, int error); static void kvp_timeout_func(struct work_struct *dummy); +static void kvp_host_handshake_func(struct work_struct *dummy); static void kvp_register(int); static DECLARE_DELAYED_WORK(kvp_timeout_work, kvp_timeout_func); +static DECLARE_DELAYED_WORK(kvp_host_handshake_work, kvp_host_handshake_func); static DECLARE_WORK(kvp_sendkey_work, kvp_send_key); static const char kvp_devname[] = "vmbus/hv_kvp"; @@ -130,6 +132,11 @@ static void kvp_timeout_func(struct work_struct *dummy) hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); } +static void kvp_host_handshake_func(struct work_struct *dummy) +{ + hv_poll_channel(kvp_transaction.recv_channel, hv_kvp_onchannelcallback); +} + static int kvp_handle_handshake(struct hv_kvp_msg *msg) { switch (msg->kvp_hdr.operation) { @@ -154,6 +161,12 @@ static int kvp_handle_handshake(struct hv_kvp_msg *msg) pr_debug("KVP: userspace daemon ver. %d registered\n", KVP_OP_REGISTER); kvp_register(dm_reg_value); + + /* +* If we're still negotiating with the host cancel the timeout +* work to not poll the channel twice. 
+*/ + cancel_delayed_work_sync(&kvp_host_handshake_work); hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper); return 0; @@ -594,7 +607,22 @@ void hv_kvp_onchannelcallback(void *context) struct icmsg_negotiate *negop = NULL; int util_fw_version; int kvp_srv_version; + static enum {NEGO_NOT_STARTED, +NEGO_IN_PROGRESS, +NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED; + if (host_negotiatied == NEGO_NOT_STARTED && + kvp_transaction.state < HVUTIL_READY) { + /* +* If userspace daemon is not connected and host is asking +* us to negotiate we need to delay to not lose messages. +* This is important for Failover IP setting. +*/ + host_negotiatied = NEGO_IN_PROGRESS; + schedule_delayed_work(&kvp_host_handshake_work, + HV_UTIL_NEGO_TIMEOUT * HZ); + return; + } if (kvp_transaction.state > HVUTIL_READY) return; @@ -672,6 +700,8 @@ void hv_kvp_onchannelcallback(void *context) vmbus_sendpacket(channel, recv_buffer, recvlen, requestid, VM_PKT_DATA_INBAND, 0); + + host_negotiatied = NEGO_FINISHED; } } diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index a64b176..28e9df9 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -36,6 +36,11 @@ #define HV_UTIL_TIMEOUT 30 /* + * Timeout for guest-host handshake for services. + */ +#define HV_UTIL_NEGO_TIMEOUT 60 + +/* * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent * is set by CPUID(HVCPUID_VERSION_FEATURES). */ -- 1.7.4.1
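The deferral logic in the KVP patch above amounts to a three-state machine. The sketch below models it in isolation, assuming a boolean daemon_ready in place of the kvp_transaction.state comparison and a counter in place of schedule_delayed_work(); all names ending in _sketch are hypothetical, and the real callback does considerably more work per message.

```c
#include <stdbool.h>

enum nego_state_sketch { NEGO_NOT_STARTED, NEGO_IN_PROGRESS, NEGO_FINISHED };
enum cb_action_sketch { CB_DEFERRED, CB_HANDLED };

struct kvp_sketch {
	enum nego_state_sketch nego;
	bool daemon_ready;	/* stands in for state >= HVUTIL_READY */
	int deferrals;		/* times the delayed work was scheduled */
};

/*
 * Model of the channel callback entry: the first negotiation request that
 * arrives before the daemon is connected is postponed exactly once. Any
 * later invocation (timer expiry or daemon registration) replies to it.
 */
static enum cb_action_sketch on_channel_callback_sketch(struct kvp_sketch *k)
{
	if (k->nego == NEGO_NOT_STARTED && !k->daemon_ready) {
		k->nego = NEGO_IN_PROGRESS;
		k->deferrals++;	/* schedule_delayed_work() in the patch */
		return CB_DEFERRED;
	}

	/* The negotiation reply would be sent here. */
	k->nego = NEGO_FINISHED;
	return CB_HANDLED;
}
```

Because the state leaves NEGO_NOT_STARTED on the first deferral, the callback re-run from the delayed work cannot defer again, which keeps the total wait bounded by HV_UTIL_NEGO_TIMEOUT.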
[PATCH 3/6] hv: Lock access to hyperv_mmio resource tree
From: Jake Oshins <ja...@microsoft.com> In existing code, this tree of resources is created in single-threaded code and never modified after it is created, and thus needs no locking. This patch introduces a semaphore for tree access, as other patches in this series introduce run-time modifications of this resource tree which can happen on multiple threads. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 16 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 44e95a4..60553c1 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -102,6 +102,7 @@ static struct notifier_block hyperv_panic_block = { }; struct resource *hyperv_mmio; +DEFINE_SEMAPHORE(hyperv_mmio_lock); static int vmbus_exists(void) { @@ -1132,7 +1133,10 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); - int i; + int i, retval; + + retval = -ENXIO; + down(&hyperv_mmio_lock); for (iter = hyperv_mmio; iter; iter = iter->sibling) { if ((iter->start >= max) || (iter->end <= min)) @@ -1169,13 +1173,17 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, for (; start + size - 1 <= local_max; start += align) { *new = request_mem_region_exclusive(start, size, dev_n); - if (*new) - return 0; + if (*new) { + retval = 0; + goto exit; + } } } } - return -ENXIO; +exit: + up(&hyperv_mmio_lock); + return retval; } EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); -- 1.7.4.1
[PATCH 6/6] hv: Track allocations of children of hv_vmbus in private resource tree
From: Jake Oshins <ja...@microsoft.com> This patch changes vmbus_allocate_mmio() and vmbus_free_mmio() so that when child paravirtual devices allocate memory-mapped I/O space, they allocate it privately from a resource tree pointed at by hyperv_mmio and also by the public resource tree iomem_resource. This allows the region to be marked as "busy" in the private tree, but a "bridge window" in the public tree, guaranteeing that no two bridge windows will overlap each other while also allowing the PCI device children of the bridge windows to overlap that window. One might conclude that this belongs in the pnp layer, rather than in this driver. Rafael Wysocki, the maintainer of the pnp layer, has previously asked that we not modify the pnp layer as it is considered deprecated. This patch is thus essentially a workaround. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 22 +- 1 files changed, 21 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 1ce47d0..dfc6149 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1128,7 +1128,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t size, resource_size_t align, bool fb_overlap_ok) { - struct resource *iter; + struct resource *iter, *shadow; resource_size_t range_min, range_max, start, local_min, local_max; const char *dev_n = dev_name(&device_obj->device); u32 fb_end = screen_info.lfb_base + (screen_info.lfb_size << 1); @@ -1170,12 +1170,22 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, start = (local_min + align - 1) & ~(align - 1); for (; start + size - 1 <= local_max; start += align) { + shadow = __request_region(iter, start, + size, + NULL, + IORESOURCE_BUSY); + if (!shadow) + continue; + *new = request_mem_region_exclusive(start, size, dev_n); if (*new) { + shadow->name = (char *)*new; retval = 0; goto 
exit; + } + + __release_region(iter, start, size); } } } @@ -1196,7 +1206,17 @@ EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); */ void vmbus_free_mmio(resource_size_t start, resource_size_t size) { + struct resource *iter; + + down(&hyperv_mmio_lock); + for (iter = hyperv_mmio; iter; iter = iter->sibling) { + if ((iter->start >= start + size) || (iter->end <= start)) + continue; + + __release_region(iter, start, size); + } release_mem_region(start, size); + up(&hyperv_mmio_lock); } EXPORT_SYMBOL_GPL(vmbus_free_mmio); -- 1.7.4.1
[PATCH 0/6] Drivers: hv: vmbus: Fix mmio management
Fix mmio management for vmbus devices. Also included is a patch to fix an issue in KVP. Jake Oshins (5): hv: Make a function to free mmio regions through vmbus hv: Lock access to hyperv_mmio resource tree hv: Use new vmbus_mmio_free() from client drivers. hv: Reverse order of resources in hyperv_mmio hv: Track allocations of children of hv_vmbus in private resource tree Vitaly Kuznetsov (1): Drivers: hv: kvp: fix IP Failover drivers/hv/hv_kvp.c | 30 + drivers/hv/hyperv_vmbus.h |5 +++ drivers/hv/vmbus_drv.c | 56 ++- drivers/video/fbdev/hyperv_fb.c |4 +- include/linux/hyperv.h |2 +- 5 files changed, 87 insertions(+), 10 deletions(-) -- 1.7.4.1
[PATCH 2/6] hv: Make a function to free mmio regions through vmbus
From: Jake Oshins <ja...@microsoft.com> This patch introduces a function that reverses everything done by vmbus_allocate_mmio(). Existing code just called release_mem_region(). Future patches in this series require a more complex sequence of actions, so this function is introduced to wrap those actions. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c | 15 +++ include/linux/hyperv.h |2 +- 2 files changed, 16 insertions(+), 1 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 64713ff..44e95a4 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1180,6 +1180,21 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, EXPORT_SYMBOL_GPL(vmbus_allocate_mmio); /** + * vmbus_free_mmio() - Free a memory-mapped I/O range. + * @start: Base address of region to release. + * @size: Size of the range to be allocated + * + * This function releases anything requested by + * vmbus_mmio_allocate(). + */ +void vmbus_free_mmio(resource_size_t start, resource_size_t size) +{ + release_mem_region(start, size); + +} +EXPORT_SYMBOL_GPL(vmbus_free_mmio); + +/** * vmbus_cpu_number_to_vp_number() - Map CPU to VP. * @cpu_number: CPU number in Linux terms * diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index aa0fadc..ecd81c3 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1091,7 +1091,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj, resource_size_t min, resource_size_t max, resource_size_t size, resource_size_t align, bool fb_overlap_ok); - +void vmbus_free_mmio(resource_size_t start, resource_size_t size); int vmbus_cpu_number_to_vp_number(int cpu_number); u64 hv_do_hypercall(u64 control, void *input, void *output); -- 1.7.4.1
[PATCH 5/6] hv: Reverse order of resources in hyperv_mmio
From: Jake Oshins <ja...@microsoft.com> A patch later in this series allocates child nodes in this resource tree. For that to work, this tree needs to be sorted in ascending order. Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/vmbus_drv.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 60553c1..1ce47d0 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1049,7 +1049,6 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) new_res->end = end; /* -* Stick ranges from higher in address space at the front of the list. * If two ranges are adjacent, merge them. */ do { @@ -1070,7 +1069,7 @@ static acpi_status vmbus_walk_resources(struct acpi_resource *res, void *ctx) break; } - if ((*old_res)->end < new_res->start) { + if ((*old_res)->start > new_res->end) { new_res->sibling = *old_res; if (prev_res) (*prev_res)->sibling = new_res; -- 1.7.4.1
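The effect of the changed comparison above can be illustrated with a small standalone list insert: a new range is linked in front of the first existing range that starts above the new range's end, which keeps the sibling list in ascending address order. The struct range_sketch type and insert_ascending_sketch() are assumptions for this example; the merging of adjacent ranges done by vmbus_walk_resources() is omitted and overlapping ranges are not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced form of struct resource: bounds plus sibling link. */
struct range_sketch {
	uint64_t start;
	uint64_t end;
	struct range_sketch *sibling;
};

/*
 * Insert new_res so that the list stays sorted by ascending address.
 * The walk stops at the first node with start > new_res->end, the same
 * test the patch introduces.
 */
static void insert_ascending_sketch(struct range_sketch **head,
				    struct range_sketch *new_res)
{
	struct range_sketch **link = head;

	while (*link && (*link)->start <= new_res->end)
		link = &(*link)->sibling;

	new_res->sibling = *link;
	*link = new_res;
}
```

Walking with a pointer to the link field avoids the separate prev_res bookkeeping that the driver code carries, at the cost of looking slightly less like the original.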
[PATCH 4/6] hv: Use new vmbus_mmio_free() from client drivers.
From: Jake Oshins <ja...@microsoft.com> This patch modifies all the callers of vmbus_allocate_mmio() to call vmbus_free_mmio() instead of release_mem_region(). Signed-off-by: Jake Oshins <ja...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/video/fbdev/hyperv_fb.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c index e2451bd..2fd49b2 100644 --- a/drivers/video/fbdev/hyperv_fb.c +++ b/drivers/video/fbdev/hyperv_fb.c @@ -743,7 +743,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info) err3: iounmap(fb_virt); err2: - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; err1: if (!gen2vm) @@ -758,7 +758,7 @@ static void hvfb_putmem(struct fb_info *info) struct hvfb_par *par = info->par; iounmap(info->screen_base); - release_mem_region(par->mem->start, screen_fb_size); + vmbus_free_mmio(par->mem->start, screen_fb_size); par->mem = NULL; } -- 1.7.4.1
[PATCH v4 1/1] scsi: storvsc: Fix a build issue reported by kbuild test robot
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 03c21cb775a313f1ff19be59c5d02df3e3526471 commit: dac582417bc449b1f7f572d3f1dd9d23eec15cc9 storvsc: Properly support Fibre Channel devices date: 3 weeks ago config: x86_64-randconfig-s3-01281016 (attached as .config) reproduce: git checkout dac582417bc449b1f7f572d3f1dd9d23eec15cc9 # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): drivers/built-in.o: In function `storvsc_remove': >> storvsc_drv.c:(.text+0x213af7): undefined reference to `fc_remove_host' drivers/built-in.o: In function `storvsc_drv_init': >> storvsc_drv.c:(.init.text+0xcbcc): undefined reference to `fc_attach_transport' >> storvsc_drv.c:(.init.text+0xcc06): undefined reference to `fc_release_transport' drivers/built-in.o: In function `storvsc_drv_exit': >> storvsc_drv.c:(.exit.text+0x123c): undefined reference to `fc_release_transport' With this commit, the storvsc driver depends on FC attributes. Make this dependency explicit. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reported-by: Fengguang Wu <fengguang...@intel.com> --- v1 - v4: Incorporated suggestions from James Bottomley <james.bottom...@hansenpartnership.com> drivers/scsi/Kconfig |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 64eed87..166de0c 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -594,6 +594,7 @@ config XEN_SCSI_FRONTEND config HYPERV_STORAGE tristate "Microsoft Hyper-V virtual storage driver" depends on SCSI && HYPERV + depends on m || SCSI_FC_ATTRS != m default HYPERV help Select this option to enable the Hyper-V virtual storage driver. 
-- 1.7.4.1
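The dependency line `depends on m || SCSI_FC_ATTRS != m` in the v4 patch above can be read through Kconfig's tristate rules, where n < m < y, `||` takes the larger value, a comparison yields y or n, and `depends on` sets an upper bound on the symbol. The sketch below evaluates that expression in C. The enum and helper names are made up for this example, and it assumes module support is enabled so that the constant m evaluates to m.

```c
/* Kconfig tristate values in increasing order of strength. */
enum tristate_sketch { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig '||' evaluates to the larger of its operands. */
static enum tristate_sketch tri_or_sketch(enum tristate_sketch a,
					  enum tristate_sketch b)
{
	return a > b ? a : b;
}

/* Kconfig '!=' evaluates to y when the values differ, n otherwise. */
static enum tristate_sketch tri_ne_sketch(enum tristate_sketch a,
					  enum tristate_sketch b)
{
	return a != b ? TRI_Y : TRI_N;
}

/*
 * Upper bound that "depends on m || SCSI_FC_ATTRS != m" places on
 * HYPERV_STORAGE for a given SCSI_FC_ATTRS setting.
 */
static enum tristate_sketch hyperv_storage_limit_sketch(
		enum tristate_sketch scsi_fc_attrs)
{
	return tri_or_sketch(TRI_M, tri_ne_sketch(scsi_fc_attrs, TRI_M));
}
```

Under these rules the only restricted case is SCSI_FC_ATTRS=m, where HYPERV_STORAGE is capped at m; this is the combination behind the undefined references in the robot's report, since a built-in driver cannot link against symbols that live in a module.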
[PATCH 1/1] scsi: storvsc: Fix a build issue reported by kbuild test robot
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 03c21cb775a313f1ff19be59c5d02df3e3526471 commit: dac582417bc449b1f7f572d3f1dd9d23eec15cc9 storvsc: Properly support Fibre Channel devices date: 3 weeks ago config: x86_64-randconfig-s3-01281016 (attached as .config) reproduce: git checkout dac582417bc449b1f7f572d3f1dd9d23eec15cc9 # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): drivers/built-in.o: In function `storvsc_remove': >> storvsc_drv.c:(.text+0x213af7): undefined reference to `fc_remove_host' drivers/built-in.o: In function `storvsc_drv_init': >> storvsc_drv.c:(.init.text+0xcbcc): undefined reference to `fc_attach_transport' >> storvsc_drv.c:(.init.text+0xcc06): undefined reference to `fc_release_transport' drivers/built-in.o: In function `storvsc_drv_exit': >> storvsc_drv.c:(.exit.text+0x123c): undefined reference to `fc_release_transport' With this commit, the storvsc driver depends on FC attributes. Make this dependency explicit. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reported-by: Fengguang Wu <fengguang...@intel.com> --- drivers/scsi/Kconfig |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 64eed87..24365c3 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -594,6 +594,7 @@ config XEN_SCSI_FRONTEND config HYPERV_STORAGE tristate "Microsoft Hyper-V virtual storage driver" depends on SCSI && HYPERV + depends on SCSI_FC_ATTRS default HYPERV help Select this option to enable the Hyper-V virtual storage driver. -- 1.7.4.1
[PATCH 2/8] Drivers: hv: vmbus: avoid wait_for_completion() on crash
From: Vitaly Kuznetsov <vkuzn...@redhat.com> wait_for_completion() may sleep and it enables interrupts, which is something we really want to avoid on crashes because interrupt handlers can cause other crashes. Switch to the recently introduced vmbus_wait_for_unload(), which busy-waits instead. Reported-by: Radim Krcmar <rkrc...@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com> Reviewed-by: Radim Krčmář <rkrc...@redhat.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/channel_mgmt.c |4 ++-- drivers/hv/connection.c |2 +- drivers/hv/hyperv_vmbus.h |2 +- drivers/hv/vmbus_drv.c|4 ++-- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index b40f429..f70e352 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -641,7 +641,7 @@ static void vmbus_unload_response(struct vmbus_channel_message_header *hdr) complete(&vmbus_connection.unload_event); } -void vmbus_initiate_unload(void) +void vmbus_initiate_unload(bool crash) { struct vmbus_channel_message_header hdr; @@ -658,7 +658,7 @@ void vmbus_initiate_unload(void) * vmbus_initiate_unload() is also called on crash and the crash can be * happening in an interrupt context, where scheduling is impossible. */ - if (!in_interrupt()) + if (!crash) wait_for_completion(&vmbus_connection.unload_event); else vmbus_wait_for_unload(); diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index fa86b2c..3b6dc00 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -236,7 +236,7 @@ void vmbus_disconnect(void) /* * First send the unload request to the host. 
*/ - vmbus_initiate_unload(); + vmbus_initiate_unload(false); if (vmbus_connection.work_queue) { drain_workqueue(vmbus_connection.work_queue); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index b9ea7f5..b0299da 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -663,7 +663,7 @@ void hv_vss_onchannelcallback(void *); int hv_fcopy_init(struct hv_util_service *); void hv_fcopy_deinit(void); void hv_fcopy_onchannelcallback(void *); -void vmbus_initiate_unload(void); +void vmbus_initiate_unload(bool crash); static inline void hv_poll_channel(struct vmbus_channel *channel, void (*cb)(void *)) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 30ea8ad..c8f1671 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1291,7 +1291,7 @@ static void hv_kexec_handler(void) int cpu; hv_synic_clockevents_cleanup(); - vmbus_initiate_unload(); + vmbus_initiate_unload(false); for_each_online_cpu(cpu) smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1); hv_cleanup(); @@ -1299,7 +1299,7 @@ static void hv_kexec_handler(void) static void hv_crash_handler(struct pt_regs *regs) { - vmbus_initiate_unload(); + vmbus_initiate_unload(true); /* * In crash handler we can't schedule synic cleanup for all CPUs, * doing the cleanup for current CPU only. This should be sufficient -- 1.7.4.1
[PATCH 8/8] Drivers: hv: vmbus: Support kexec on ws2012 r2 and above
From: Alex Ng <ale...@microsoft.com> WS2012 R2 and above hosts can support kexec in that they can support reconnecting to the host (as would be needed in the kexec path) on any CPU. Enable this. Pre-WS2012 R2 hosts don't have this ability and consequently cannot support kexec. Signed-off-by: Alex Ng <ale...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/connection.c | 10 +- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c index 3b6dc00..d02f137 100644 --- a/drivers/hv/connection.c +++ b/drivers/hv/connection.c @@ -88,8 +88,16 @@ static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, * This has been the behavior pre-win8. This is not * perf issue and having all channel messages delivered on CPU 0 * would be ok. +* For post win8 hosts, we support receiving channel messagges on +* all the CPUs. This is needed for kexec to work correctly where +* the CPU attempting to connect may not be CPU 0. */ - msg->target_vcpu = 0; + if (version >= VERSION_WIN8_1) { + msg->target_vcpu = hv_context.vp_index[get_cpu()]; + put_cpu(); + } else { + msg->target_vcpu = 0; + } /* * Add to list before we send the request since we may -- 1.7.4.1
[PATCH 6/8] Drivers: hv: utils: Remove util transport handler from list if registration fails
From: Alex Ng <ale...@microsoft.com> If util transport fails to initialize for any reason, the list of transport handlers may become corrupted due to freeing the transport handler without removing it from the list. Fix this by cleaning it up from the list. Signed-off-by: Alex Ng <ale...@microsoft.com> Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv_utils_transport.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/hv/hv_utils_transport.c b/drivers/hv/hv_utils_transport.c index 4f42c0e..9a9983f 100644 --- a/drivers/hv/hv_utils_transport.c +++ b/drivers/hv/hv_utils_transport.c @@ -310,6 +310,9 @@ struct hvutil_transport *hvutil_transport_init(const char *name, return hvt; err_free_hvt: + spin_lock(&hvt_list_lock); + list_del(&hvt->list); + spin_unlock(&hvt_list_lock); kfree(hvt); return NULL; } -- 1.7.4.1
[PATCH 5/8] Drivers: hv: util: Pass the channel information during the init call
Pass the channel information to the util drivers that need to defer reading the channel while they are processing a request. This would address the following issue reported by Vitaly: Commit 3cace4a61610 ("Drivers: hv: utils: run polling callback always in interrupt context") removed direct *_transaction.state = HVUTIL_READY assignments from *_handle_handshake() functions introducing the following race: if a userspace daemon connects before we get first non-negotiation request from the server hv_poll_channel() won't set transaction state to HVUTIL_READY as (!channel) condition will fail, we set it to non-NULL on the first real request from the server. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> Reported-by: Vitaly Kuznetsov <vkuzn...@redhat.com> --- drivers/hv/hv_fcopy.c|2 +- drivers/hv/hv_kvp.c |2 +- drivers/hv/hv_snapshot.c |2 +- drivers/hv/hv_util.c |1 + include/linux/hyperv.h |1 + 5 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c index c37a71e..23c7079 100644 --- a/drivers/hv/hv_fcopy.c +++ b/drivers/hv/hv_fcopy.c @@ -251,7 +251,6 @@ void hv_fcopy_onchannelcallback(void *context) */ fcopy_transaction.recv_len = recvlen; - fcopy_transaction.recv_channel = channel; fcopy_transaction.recv_req_id = requestid; fcopy_transaction.fcopy_msg = fcopy_msg; @@ -317,6 +316,7 @@ static void fcopy_on_reset(void) int hv_fcopy_init(struct hv_util_service *srv) { recv_buffer = srv->recv_buffer; + fcopy_transaction.recv_channel = srv->channel; /* * When this driver loads, the user level daemon that diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c index d4ab81b..9b9b370 100644 --- a/drivers/hv/hv_kvp.c +++ b/drivers/hv/hv_kvp.c @@ -639,7 +639,6 @@ void hv_kvp_onchannelcallback(void *context) */ kvp_transaction.recv_len = recvlen; - kvp_transaction.recv_channel = channel; kvp_transaction.recv_req_id = requestid; kvp_transaction.kvp_msg = kvp_msg; @@ -688,6 +687,7 @@ int hv_kvp_init(struct hv_util_service *srv) 
{ recv_buffer = srv->recv_buffer; + kvp_transaction.recv_channel = srv->channel; /* * When this driver loads, the user level daemon that diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c index 67def4a..3fba14e 100644 --- a/drivers/hv/hv_snapshot.c +++ b/drivers/hv/hv_snapshot.c @@ -263,7 +263,6 @@ void hv_vss_onchannelcallback(void *context) */ vss_transaction.recv_len = recvlen; - vss_transaction.recv_channel = channel; vss_transaction.recv_req_id = requestid; vss_transaction.msg = (struct hv_vss_msg *)vss_msg; @@ -337,6 +336,7 @@ hv_vss_init(struct hv_util_service *srv) return -ENOTSUPP; } recv_buffer = srv->recv_buffer; + vss_transaction.recv_channel = srv->channel; /* * When this driver loads, the user level daemon that diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index 7994ec2..d5acaa2 100644 --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -322,6 +322,7 @@ static int util_probe(struct hv_device *dev, srv->recv_buffer = kmalloc(PAGE_SIZE * 4, GFP_KERNEL); if (!srv->recv_buffer) return -ENOMEM; + srv->channel = dev->channel; if (srv->util_init) { ret = srv->util_init(srv); if (ret) { diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index d23dab0..aa0fadc 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1251,6 +1251,7 @@ u64 hv_do_hypercall(u64 control, void *input, void *output); struct hv_util_service { u8 *recv_buffer; + void *channel; void (*util_cb)(void *); int (*util_init)(struct hv_util_service *); void (*util_deinit)(void); -- 1.7.4.1
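For readers following the race described in the changelog, here is a minimal userspace sketch of it. This is not the kernel code: every `model_*` and `MODEL_*` name is hypothetical, and `model_poll_channel()` is only a simplified stand-in for the behaviour the changelog attributes to hv_poll_channel() (it does nothing when no channel has been recorded). The two init helpers contrast recording the channel on the first real request with recording it at init time, as the patch does via srv->channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these names are hypothetical, not kernel symbols. */
enum model_state { MODEL_DEVICE_INIT, MODEL_READY };

struct model_channel {
	int id;
};

struct model_transaction {
	enum model_state state;
	struct model_channel *recv_channel;
};

/*
 * Simplified stand-in for the poll helper: with no channel recorded it
 * returns early, so the state never advances to MODEL_READY.
 */
static bool model_poll_channel(struct model_transaction *t)
{
	if (!t->recv_channel)
		return false;
	t->state = MODEL_READY;
	return true;
}

/* Old behaviour: the channel is only recorded by the first real request. */
static void model_init_old(struct model_transaction *t)
{
	t->state = MODEL_DEVICE_INIT;
	t->recv_channel = NULL;
}

/* Patched behaviour: the channel is recorded at init time. */
static void model_init_new(struct model_transaction *t,
			   struct model_channel *ch)
{
	t->state = MODEL_DEVICE_INIT;
	t->recv_channel = ch;
}
```

In the old model, a daemon that connects before the first request leaves the transaction stuck in MODEL_DEVICE_INIT; in the patched model, the poll succeeds immediately because the channel is already known.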
[PATCH 7/8] Drivers: hv: vmbus: Support handling messages on multiple CPUs
Starting with Windows 2012 R2, message interrupts can be delivered on any VCPU in the guest. Support this functionality. Signed-off-by: K. Y. Srinivasan <k...@microsoft.com> --- drivers/hv/hv.c | 10 ++ drivers/hv/hyperv_vmbus.h |4 +++- drivers/hv/vmbus_drv.c| 10 -- 3 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c index ccb335f..a1c086b 100644 --- a/drivers/hv/hv.c +++ b/drivers/hv/hv.c @@ -204,6 +204,8 @@ int hv_init(void) sizeof(int) * NR_CPUS); memset(hv_context.event_dpc, 0, sizeof(void *) * NR_CPUS); + memset(hv_context.msg_dpc, 0, + sizeof(void *) * NR_CPUS); memset(hv_context.clk_evt, 0, sizeof(void *) * NR_CPUS); @@ -415,6 +417,13 @@ int hv_synic_alloc(void) } tasklet_init(hv_context.event_dpc[cpu], vmbus_on_event, cpu); + hv_context.msg_dpc[cpu] = kmalloc(size, GFP_ATOMIC); + if (hv_context.msg_dpc[cpu] == NULL) { + pr_err("Unable to allocate event dpc\n"); + goto err; + } + tasklet_init(hv_context.msg_dpc[cpu], vmbus_on_msg_dpc, cpu); + hv_context.clk_evt[cpu] = kzalloc(ced_size, GFP_ATOMIC); if (hv_context.clk_evt[cpu] == NULL) { pr_err("Unable to allocate clock event device\n"); @@ -456,6 +465,7 @@ err: static void hv_synic_free_cpu(int cpu) { kfree(hv_context.event_dpc[cpu]); + kfree(hv_context.msg_dpc[cpu]); kfree(hv_context.clk_evt[cpu]); if (hv_context.synic_event_page[cpu]) free_page((unsigned long)hv_context.synic_event_page[cpu]); diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index cada56a..a64b176 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h @@ -449,10 +449,11 @@ struct hv_context { u32 vp_index[NR_CPUS]; /* * Starting with win8, we can take channel interrupts on any CPU; -* we will manage the tasklet that handles events on a per CPU +* we will manage the tasklet that handles events messages on a per CPU * basis. 
*/ struct tasklet_struct *event_dpc[NR_CPUS]; + struct tasklet_struct *msg_dpc[NR_CPUS]; /* * To optimize the mapping of relid to channel, maintain * per-cpu list of the channels based on their CPU affinity. @@ -675,6 +676,7 @@ int vmbus_post_msg(void *buffer, size_t buflen); void vmbus_set_event(struct vmbus_channel *channel); void vmbus_on_event(unsigned long data); +void vmbus_on_msg_dpc(unsigned long data); int hv_kvp_init(struct hv_util_service *); void hv_kvp_deinit(void); diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 6cd12f1..64713ff 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -45,7 +45,6 @@ static struct acpi_device *hv_acpi_dev; -static struct tasklet_struct msg_dpc; static struct completion probe_event; @@ -712,7 +711,7 @@ static void hv_process_timer_expiration(struct hv_message *msg, int cpu) vmbus_signal_eom(msg); } -static void vmbus_on_msg_dpc(unsigned long data) +void vmbus_on_msg_dpc(unsigned long data) { int cpu = smp_processor_id(); void *page_addr = hv_context.synic_message_page[cpu]; @@ -800,7 +799,7 @@ static void vmbus_isr(void) if (msg->header.message_type == HVMSG_TIMER_EXPIRED) hv_process_timer_expiration(msg, cpu); else - tasklet_schedule(&msg_dpc); + tasklet_schedule(hv_context.msg_dpc[cpu]); } } @@ -824,8 +823,6 @@ static int vmbus_bus_init(void) return ret; } - tasklet_init(&msg_dpc, vmbus_on_msg_dpc, 0); - ret = bus_register(&hv_bus); if (ret) goto err_cleanup; @@ -1321,7 +1318,8 @@ static void __exit vmbus_exit(void) hv_synic_clockevents_cleanup(); vmbus_disconnect(); hv_remove_vmbus_irq(); - tasklet_kill(&msg_dpc); + for_each_online_cpu(cpu) + tasklet_kill(hv_context.msg_dpc[cpu]); vmbus_free_channels(); if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { unregister_die_notifier(&hyperv_die_block); -- 1.7.4.1
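The change from one global msg_dpc to a per-CPU array can be pictured with the following userspace sketch. It is only an illustration under assumptions: `struct model_tasklet`, `MODEL_NR_CPUS` and all `model_*` functions are hypothetical stand-ins, and the "scheduling" here is a plain flag rather than the kernel tasklet machinery. What it shows is the shape of the patch: each CPU gets its own deferred-work slot, and the interrupt path marks the slot of the CPU that took the interrupt.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for a tasklet: a handler, its argument, a flag. */
struct model_tasklet {
	void (*func)(unsigned long);
	unsigned long data;
	int scheduled;
};

static struct model_tasklet model_msg_dpc[MODEL_NR_CPUS];
static int model_handled[MODEL_NR_CPUS];

/* Message handler: counts how many times it ran for a given CPU. */
static void model_on_msg_dpc(unsigned long cpu)
{
	model_handled[cpu]++;
}

/* One slot per CPU, set up in the per-CPU allocation path. */
static void model_synic_alloc(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_msg_dpc[cpu].func = model_on_msg_dpc;
		model_msg_dpc[cpu].data = (unsigned long)cpu;
		model_msg_dpc[cpu].scheduled = 0;
		model_handled[cpu] = 0;
	}
}

/* Interrupt path: schedule the slot of the CPU that took the interrupt. */
static void model_isr(int cpu)
{
	model_msg_dpc[cpu].scheduled = 1;
}

/* Run whatever was scheduled, one CPU slot at a time. */
static void model_run_pending(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!model_msg_dpc[cpu].scheduled)
			continue;
		model_msg_dpc[cpu].scheduled = 0;
		model_msg_dpc[cpu].func(model_msg_dpc[cpu].data);
	}
}
```

With a single global slot, a message interrupt arriving on CPU 2 would be funnelled through the same tasklet as every other CPU; in this model it only touches slot 2, which mirrors tasklet_schedule(hv_context.msg_dpc[cpu]) in the patch.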