Re: [PATCH v3] Add Mergeable receive buffer support to vhost_net
Thanks! There's some whitespace damage, are you sending with your new sendmail setup? It seems to have worked for qemu patches ... Yes, I saw some line wraps in what I received, but I checked the original draft to be sure and they weren't there. Possibly from the relay; Sigh. @@ -167,8 +166,15 @@ static void handle_tx(struct vhost_net * /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock-ops-sendmsg(NULL, sock, msg, len); if (unlikely(err 0)) { - vhost_discard_vq_desc(vq); - tx_poll_start(net, sock); + if (err == -EAGAIN) { +vhost_discard_desc(vq, 1); +tx_poll_start(net, sock); + } else { +vq_err(vq, sendmsg: errno %d\n, -err); +/* drop packet; do not discard/resend */ +vhost_add_used_and_signal(net-dev, vq, head, + 0); vhost does not currently has a consistent error handling strategy: if we drop packets, need to think which other errors should cause packet drops. I prefer to just call vq_err for now, and have us look at handling segfaults etc in a consistent way separately. I had to add this to avoid an infinite loop when I wrote a bad packet on the socket. I agree error handling needs a better look, but retrying a bad packet continuously while dumping in the log is what it was doing when I hit an error before this code. Isn't this better, at least until a second look? +} + I wonder whether it makes sense to check skb_queue_empty(sk-sk_receive_queue) outside the lock, to reduce the cost of this call on an empty queue (we know that it happens at least once each time we exit the loop on rx)? I was looking at alternatives to adding the lock in the first place, but I found I couldn't measure a difference in the cost with and without the lock. + int retries = 0; size_t len, total_len = 0; - int err; + int err, headcount, datalen; size_t hdr_size; struct socket *sock = rcu_dereference(vq-private_data); if (!sock || skb_queue_empty(sock-sk-sk_receive_queue)) @@ -222,31 +242,25 @@ static void handle_rx(struct vhost_net * vq_log = unlikely(vhost_has_feature(net-dev, VHOST_F_LOG_ALL)) ? vq-log : NULL; - for (;;) { - head = vhost_get_vq_desc(net-dev, vq, vq-iov, -ARRAY_SIZE(vq-iov), -out, in, -vq_log, log); + while ((datalen = vhost_head_len(sock-sk))) { + headcount = vhost_get_desc_n(vq, vq-heads, datalen, in, +vq_log, log); This looks like a bug, I think we need to pass datalen + header size to vhost_get_desc_n. Not sure how we know the header size that backend will use though. Maybe just look at our features. Yes; we have hdr_size, so I can add it here. It'll be 0 for the cases where the backend and guest both have vnet header (either the regular or larger mergeable buffers one), but should be added in for future raw socket support. /* OK, now we need to know about added descriptors. */ - if (head == vq-num) { - if (unlikely(vhost_enable_notify(vq))) { + if (!headcount) { + if (retries == 0 unlikely(vhost_enable_notify(vq))) { /* They have slipped one in as we were * doing that: check again. */ vhost_disable_notify(vq); +retries++; continue; } Hmm. The reason we have the code at all, as the comment says, is because guest could have added more buffers between the time we read last index and the time we enabled notification. So if we just break like this the race still exists. We could remember the last head value we observed, and have vhost_enable_notify check against this value? This is to prevent a spin loop in the case where we have some buffers available, but not enough for the current packet (ie, this is the replacement code for the rxmaxheadcount business). If they actually added something new, retrying once should see it, but what vhost_enable_notify() returns non-zero on is not new buffers but rather not empty. In the case mentioned, we aren't empty, so vhost_enable_notify() returns nonzero every time, but the guest hasn't given us enough buffers to proceed, so we continuously retry; this code breaks the spin loop until we've really got new buffers from the guest. Need to think about it. Another concern here is that on retries vhost_get_desc_n is doing extra work, rescanning the same descriptor again and again. Not sure how common this is, might be worthwhile to add a TODO to consider this at least. I had a printk in there to test the code and with the retries counter, it happens when we fill the ring (once, because of the retries checks), and then proceeds as desired when the guest gives us more buffers. Without the check, it spews until we
[GIT PULL] vhost-net fix for 2.6.34-rc3
David, The following tree includes a patch fixing an issue with vhost-net in 2.6.34-rc3. Please pull for 2.6.34. Thanks! The following changes since commit 2eaa9cfdf33b8d7fb7aff27792192e0019ae8fc6: Linux 2.6.34-rc3 (2010-03-30 09:24:39 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost Jeff Dike (1): vhost-net: fix vq_memory_access_ok error checking drivers/vhost/vhost.c |4 1 files changed, 4 insertions(+), 0 deletions(-) ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH v3] Add Mergeable receive buffer support to vhost_net
On Wed, Apr 07, 2010 at 10:37:17AM -0700, David Stevens wrote: Thanks! There's some whitespace damage, are you sending with your new sendmail setup? It seems to have worked for qemu patches ... Yes, I saw some line wraps in what I received, but I checked the original draft to be sure and they weren't there. Possibly from the relay; Sigh. @@ -167,8 +166,15 @@ static void handle_tx(struct vhost_net * /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock-ops-sendmsg(NULL, sock, msg, len); if (unlikely(err 0)) { - vhost_discard_vq_desc(vq); - tx_poll_start(net, sock); + if (err == -EAGAIN) { +vhost_discard_desc(vq, 1); +tx_poll_start(net, sock); + } else { +vq_err(vq, sendmsg: errno %d\n, -err); +/* drop packet; do not discard/resend */ +vhost_add_used_and_signal(net-dev, vq, head, + 0); vhost does not currently has a consistent error handling strategy: if we drop packets, need to think which other errors should cause packet drops. I prefer to just call vq_err for now, and have us look at handling segfaults etc in a consistent way separately. I had to add this to avoid an infinite loop when I wrote a bad packet on the socket. I agree error handling needs a better look, but retrying a bad packet continuously while dumping in the log is what it was doing when I hit an error before this code. Isn't this better, at least until a second look? Hmm, what do you mean 'continuously'? Don't we only try again on next kick? +} + I wonder whether it makes sense to check skb_queue_empty(sk-sk_receive_queue) outside the lock, to reduce the cost of this call on an empty queue (we know that it happens at least once each time we exit the loop on rx)? I was looking at alternatives to adding the lock in the first place, but I found I couldn't measure a difference in the cost with and without the lock. + int retries = 0; size_t len, total_len = 0; - int err; + int err, headcount, datalen; size_t hdr_size; struct socket *sock = rcu_dereference(vq-private_data); if (!sock || skb_queue_empty(sock-sk-sk_receive_queue)) @@ -222,31 +242,25 @@ static void handle_rx(struct vhost_net * vq_log = unlikely(vhost_has_feature(net-dev, VHOST_F_LOG_ALL)) ? vq-log : NULL; - for (;;) { - head = vhost_get_vq_desc(net-dev, vq, vq-iov, -ARRAY_SIZE(vq-iov), -out, in, -vq_log, log); + while ((datalen = vhost_head_len(sock-sk))) { + headcount = vhost_get_desc_n(vq, vq-heads, datalen, in, +vq_log, log); This looks like a bug, I think we need to pass datalen + header size to vhost_get_desc_n. Not sure how we know the header size that backend will use though. Maybe just look at our features. Yes; we have hdr_size, so I can add it here. It'll be 0 for the cases where the backend and guest both have vnet header (either the regular or larger mergeable buffers one), but should be added in for future raw socket support. So hdr_size is the wrong thing to add then. We need to add non-zero value for tap now. /* OK, now we need to know about added descriptors. */ - if (head == vq-num) { - if (unlikely(vhost_enable_notify(vq))) { + if (!headcount) { + if (retries == 0 unlikely(vhost_enable_notify(vq))) { /* They have slipped one in as we were * doing that: check again. */ vhost_disable_notify(vq); +retries++; continue; } Hmm. The reason we have the code at all, as the comment says, is because guest could have added more buffers between the time we read last index and the time we enabled notification. So if we just break like this the race still exists. We could remember the last head value we observed, and have vhost_enable_notify check against this value? This is to prevent a spin loop in the case where we have some buffers available, but not enough for the current packet (ie, this is the replacement code for the rxmaxheadcount business). If they actually added something new, retrying once should see it, but what vhost_enable_notify() returns non-zero on is not new buffers but rather not empty. In the case mentioned, we aren't empty, so vhost_enable_notify() returns nonzero every time, but the guest hasn't given us enough buffers to proceed, so we continuously retry; this code breaks the spin loop until we've really got new buffers from the guest. What about the race I point out above? It seems to potentially cause a deadlock. Need to think about it. Another concern here is that on retries
Re: A clocksource driver for HyperV
On 4/5/2010 at 5:36 PM, in message 4bba57fb.2000...@goop.org , Jeremy Fitzhardinge jer...@goop.org wrote: On 04/05/2010 01:30 PM, Ky Srinivasan wrote: +static cycle_t read_hv_clock(struct clocksource *arg) +{ +cycle_t current_tick; +/* + * Read the partition counter to get the current tick count. This count + * is set to 0 when the partition is created and is incremented in + * 100 nanosecond units. + */ +rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); +return current_tick; +} + +static struct clocksource clocksource_hyperv = { +.name = hyperv_clocksource, Seems like a redundantly long name; any use of this string is going to be in a context where it is obviously a clocksource. How about just hyperv +.rating = 400, /* use this when running on Hyperv*/ +.read = read_hv_clock, +.mask = CLOCKSOURCE_MASK(64), +.shift = HV_CLOCK_SHIFT, +}; + +static struct dmi_system_id __initconst +hv_timesource_dmi_table[] __maybe_unused = { +{ +.ident = Hyper-V, +.matches = { +DMI_MATCH(DMI_SYS_VENDOR, Microsoft Corporation), +DMI_MATCH(DMI_PRODUCT_NAME, Virtual Machine), +DMI_MATCH(DMI_BOARD_NAME, Virtual Machine), +}, +}, +{ }, +}; +MODULE_DEVICE_TABLE(dmi, hv_timesource_dmi_table); So you use the DMI signatures to determine whether the module is needed, but cpuid to work out if the feature is present? + +static struct pci_device_id __initconst +hv_timesource_pci_table[] __maybe_unused = { +{ PCI_DEVICE(0x1414, 0x5353) }, /* VGA compatible controller */ +{ 0 } +}; +MODULE_DEVICE_TABLE(pci, hv_timesource_pci_table); And/or PCI? Seems a bit... ad-hoc? Is this the official way to determine the presence of Hyper-V? The presence of HyperV and our ability to use the partition-wide counter obviously is checked via probing the cpuid leaves. The DMI/PCI signatures are used in auto-loading these modules. + + +static int __init hv_detect_hyperv(void) This looks generally useful. Should it be hidden away in the clocksource driver, or in some common hyper-v code? Do other hyper-v drivers have versions of this? Good point. Right now, I can think of multiple drivers replicating this code. We could include hyperV detection code in cpu/hypervisor.c . I will spin up a patch for doing that shortly. +{ +u32 eax, ebx, ecx, edx; +static char hyp_signature[20]; 20? static? + +cpuid(1,eax,ebx,ecx,edx); +if (!(ecx HV_HYPERVISOR_PRESENT_BIT)) { +printk(KERN_WARNING +Not on a Hypervisor\n); This just looks like noise, especially since it doesn't identify what is generating the message. And if you compile this code in as =y (non-modular) then it will complain every boot. +return 1; +} +cpuid(HV_CPUID_SIGNATURE,eax,ebx,ecx,edx); +*(u32 *)(hyp_signature + 0) = ebx; +*(u32 *)(hyp_signature + 4) = ecx; +*(u32 *)(hyp_signature + 8) = edx; +hyp_signature[12] = 0; + +if ((eax HV_CPUID_MIN) || (strcmp(Microsoft Hv, hyp_signature))) { memcmp, surely? +printk(KERN_WARNING +Not on HyperV; signature %s, eax %x\n, +hyp_signature, eax); +return 1; +} +/* + * Extract the features, recommendations etc. + */ +cpuid(HV_CPUID_FEATURES,eax,ebx,ecx,edx); +if (!(eax 0x10)) { +printk(KERN_WARNING HyperV Time Ref Counter not available!\n); +return 1; +} + +cpuid(HV_CPUID_RECOMMENDATIONS,eax,ebx,ecx,edx); +printk(KERN_INFO HyperV recommendations: %x\n, eax); +printk(KERN_INFO HyperV spin count: %x\n, ebx); +return 0; +} + + +static int __init init_hv_clocksource(void) +{ +if (hv_detect_hyperv()) +return -ENODEV; +/* + * The time ref counter in HyperV is in 100ns units. + * The definition of mult is: + * mult/2^shift = ns/cyc = 100 + * mult = (100 shift) + */ +clocksource_hyperv.mult = (100 HV_CLOCK_SHIFT); Why not initialize this in the structure? It's just 10022 isn't it? +printk(KERN_INFO Registering HyperV clock source\n); +return clocksource_register(clocksource_hyperv); +} + +module_init(init_hv_clocksource); +MODULE_DESCRIPTION(HyperV based clocksource); +MODULE_AUTHOR(K. Y. Srinivasan ksriniva...@novell.com ); +MODULE_LICENSE(GPL); Index: linux/drivers/staging/hv/Makefile === --- linux.orig/drivers/staging/hv/Makefile 2010-04-05 13:02:06.0 -0600 +++ linux/drivers/staging/hv/Makefile2010-04-05 13:02:13.0 -0600 @@ -1,4 +1,4 @@ -obj-$(CONFIG_HYPERV)
Re: A clocksource driver for HyperV
On 4/5/2010 at 5:36 PM, in message 4bba57fb.2000...@goop.org , Jeremy Fitzhardinge jer...@goop.org wrote: On 04/05/2010 01:30 PM, Ky Srinivasan wrote: +static cycle_t read_hv_clock(struct clocksource *arg) +{ +cycle_t current_tick; +/* + * Read the partition counter to get the current tick count. This count + * is set to 0 when the partition is created and is incremented in + * 100 nanosecond units. + */ +rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick); +return current_tick; +} + +static struct clocksource clocksource_hyperv = { +.name = hyperv_clocksource, Seems like a redundantly long name; any use of this string is going to be in a context where it is obviously a clocksource. How about just hyperv +.rating = 400, /* use this when running on Hyperv*/ +.read = read_hv_clock, +.mask = CLOCKSOURCE_MASK(64), +.shift = HV_CLOCK_SHIFT, +}; + +static struct dmi_system_id __initconst +hv_timesource_dmi_table[] __maybe_unused = { +{ +.ident = Hyper-V, +.matches = { +DMI_MATCH(DMI_SYS_VENDOR, Microsoft Corporation), +DMI_MATCH(DMI_PRODUCT_NAME, Virtual Machine), +DMI_MATCH(DMI_BOARD_NAME, Virtual Machine), +}, +}, +{ }, +}; +MODULE_DEVICE_TABLE(dmi, hv_timesource_dmi_table); So you use the DMI signatures to determine whether the module is needed, but cpuid to work out if the feature is present? + +static struct pci_device_id __initconst +hv_timesource_pci_table[] __maybe_unused = { +{ PCI_DEVICE(0x1414, 0x5353) }, /* VGA compatible controller */ +{ 0 } +}; +MODULE_DEVICE_TABLE(pci, hv_timesource_pci_table); And/or PCI? Seems a bit... ad-hoc? Is this the official way to determine the presence of Hyper-V? The presence of HyperV and our ability to use the partition-wide counter obviously is checked via probing the cpuid leaves. The DMI/PCI signatures are used in auto-loading these modules. + + +static int __init hv_detect_hyperv(void) This looks generally useful. Should it be hidden away in the clocksource driver, or in some common hyper-v code? Do other hyper-v drivers have versions of this? Good point. Right now, I can think of multiple drivers replicating this code. We could include hyperV detection code in cpu/hypervisor.c . I will spin up a patch for doing that shortly. +{ +u32 eax, ebx, ecx, edx; +static char hyp_signature[20]; 20? static? + +cpuid(1,eax,ebx,ecx,edx); +if (!(ecx HV_HYPERVISOR_PRESENT_BIT)) { +printk(KERN_WARNING +Not on a Hypervisor\n); This just looks like noise, especially since it doesn't identify what is generating the message. And if you compile this code in as =y (non-modular) then it will complain every boot. +return 1; +} +cpuid(HV_CPUID_SIGNATURE,eax,ebx,ecx,edx); +*(u32 *)(hyp_signature + 0) = ebx; +*(u32 *)(hyp_signature + 4) = ecx; +*(u32 *)(hyp_signature + 8) = edx; +hyp_signature[12] = 0; + +if ((eax HV_CPUID_MIN) || (strcmp(Microsoft Hv, hyp_signature))) { memcmp, surely? +printk(KERN_WARNING +Not on HyperV; signature %s, eax %x\n, +hyp_signature, eax); +return 1; +} +/* + * Extract the features, recommendations etc. + */ +cpuid(HV_CPUID_FEATURES,eax,ebx,ecx,edx); +if (!(eax 0x10)) { +printk(KERN_WARNING HyperV Time Ref Counter not available!\n); +return 1; +} + +cpuid(HV_CPUID_RECOMMENDATIONS,eax,ebx,ecx,edx); +printk(KERN_INFO HyperV recommendations: %x\n, eax); +printk(KERN_INFO HyperV spin count: %x\n, ebx); +return 0; +} + + +static int __init init_hv_clocksource(void) +{ +if (hv_detect_hyperv()) +return -ENODEV; +/* + * The time ref counter in HyperV is in 100ns units. + * The definition of mult is: + * mult/2^shift = ns/cyc = 100 + * mult = (100 shift) + */ +clocksource_hyperv.mult = (100 HV_CLOCK_SHIFT); Why not initialize this in the structure? It's just 10022 isn't it? +printk(KERN_INFO Registering HyperV clock source\n); +return clocksource_register(clocksource_hyperv); +} + +module_init(init_hv_clocksource); +MODULE_DESCRIPTION(HyperV based clocksource); +MODULE_AUTHOR(K. Y. Srinivasan ksriniva...@novell.com ); +MODULE_LICENSE(GPL); Index: linux/drivers/staging/hv/Makefile === --- linux.orig/drivers/staging/hv/Makefile 2010-04-05 13:02:06.0 -0600 +++ linux/drivers/staging/hv/Makefile2010-04-05 13:02:13.0 -0600 @@ -1,4 +1,4 @@ -obj-$(CONFIG_HYPERV)
Re: A clocksource driver for HyperV
On 04/07/2010 11:20 AM, Ky Srinivasan wrote: Jeremy, thank you for your comments. I am attaching the next version of this patch that addresses the comments I have gotten thus far. It looks like you overlooked a couple of comments. +} + +static struct clocksource hyperv_cs = { + .name = hyperv_clocksource, + .rating = 400, /* use this when running on Hyperv*/ + .read = read_hv_clock, Did you not want to call it hyperv for some reason? +static int __init hv_detect_hyperv(void) +{ + u32 eax, ebx, ecx, edx; + char hyp_signature[20]; Why 20? You're only putting 12 bytes into it. + + cpuid(1,eax,ebx,ecx,edx); + + if (!(ecx HV_HYPERVISOR_PRESENT_BIT)) + return 1; + + cpuid(HV_CPUID_SIGNATURE,eax,ebx,ecx,edx); + *(u32 *)(hyp_signature + 0) = ebx; + *(u32 *)(hyp_signature + 4) = ecx; + *(u32 *)(hyp_signature + 8) = edx; + + if ((eax HV_CPUID_MIN) || + (memcmp(Microsoft Hv, hyp_signature, 12))) { + printk(KERN_WARNING + Not on HyperV; signature %s, eax %x\n, + hyp_signature, eax); The message is pointless. There's nothing to error or warn about booting under some other hypervisor. Thanks, J ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [GIT PULL] vhost-net fix for 2.6.34-rc3
From: Michael S. Tsirkin m...@redhat.com Date: Wed, 7 Apr 2010 20:35:02 +0300 David, The following tree includes a patch fixing an issue with vhost-net in 2.6.34-rc3. Please pull for 2.6.34. Pulled, thanks Michael. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
[PULL] virtio console fixes
The following changes since commit 48de8cb7847d040c8892701c1ff3c55eff1f46b4: Linus Torvalds (1): Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/.../tip/linux-2.6-tip are available in the git repository at: ssh://master.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus.git master Amit Shah (1): MAINTAINERS: Put the virtio-console entry in correct alphabetical order Anton Blanchard (1): hvc_console: Fix race between hvc_close and hvc_remove François Diakhaté (1): virtio: console: Fix early_put_chars usage Michael S. Tsirkin (1): virtio: disable multiport console support. Rusty Russell (1): virtio: console makes incorrect assumption about virtio API MAINTAINERS| 13 drivers/char/hvc_console.c |4 -- drivers/char/virtio_console.c | 65 +--- include/linux/virtio_console.h | 23 -- 4 files changed, 54 insertions(+), 51 deletions(-) commit 9a82446bd269b130a9ac270e720e65c3843d4d0c Author: Amit Shah amit.s...@redhat.com Date: Tue Mar 23 18:23:09 2010 +0530 MAINTAINERS: Put the virtio-console entry in correct alphabetical order Move around the entry for virtio-console to keep the file sorted. Signed-off-by: Amit Shah amit.s...@redhat.com Signed-off-by: Rusty Russell ru...@rustcorp.com.au MAINTAINERS | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) commit 162a689a13ed61c0752726edb75427b2cd4186c1 Author: François Diakhaté fdi...@gmail.com Date: Tue Mar 23 18:23:15 2010 +0530 virtio: console: Fix early_put_chars usage Currently early_put_chars is not used by virtio_console because it can only be used once a port has been found, at which point it's too late because it is no longer needed. This patch should fix it. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Amit Shah amit.s...@redhat.com Signed-off-by: Rusty Russell ru...@rustcorp.com.au drivers/char/virtio_console.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) commit 9ff4cfab82d27e9fda72315f911bbaa9516e04bc Author: Rusty Russell ru...@rustcorp.com.au Date: Thu Apr 8 09:46:16 2010 -0600 virtio: console makes incorrect assumption about virtio API The get_buf() API sets the second arg to the number of bytes *written* by the other side; in this case it should be zero as these are output buffers. lguest gets this right (obviously kvm's console doesn't), resulting in continual buildup of console writes. Signed-off-by: Rusty Russell ru...@rustcorp.com.au Acked-by: Amit Shah amit.s...@redhat.com drivers/char/virtio_console.c | 10 +++--- 1 files changed, 3 insertions(+), 7 deletions(-) commit b7a413015d2986edf020fba765c906cc9cbcbfc9 Author: Michael S. Tsirkin m...@redhat.com Date: Wed Mar 31 21:56:42 2010 +0300 virtio: disable multiport console support. Move MULTIPORT feature and related config changes out of exported headers, and disable the feature at runtime. At this point, it seems less risky to keep code around until we can enable it than rip it out completely. Signed-off-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Rusty Russell ru...@rustcorp.com.au drivers/char/virtio_console.c | 49 +-- include/linux/virtio_console.h | 23 -- 2 files changed, 41 insertions(+), 31 deletions(-) commit 320718ee074acce5ffced6506cb51af1388942aa Author: Anton Blanchard an...@samba.org Date: Tue Apr 6 21:42:38 2010 +1000 hvc_console: Fix race between hvc_close and hvc_remove I don't claim to understand the tty layer, but it seems like hvc_open and hvc_close should be balanced in their kref reference counting. Right now we get a kref every call to hvc_open: if (hp-count++ 0) { tty_kref_get(tty); - here spin_unlock_irqrestore(hp-lock, flags); hvc_kick(); return 0; } /* else count == 0 */ tty-driver_data = hp; hp-tty = tty_kref_get(tty); -- or here if hp-count was 0 But hvc_close has: tty_kref_get(tty); if (--hp-count == 0) { ... /* Put the ref obtained in hvc_open() */ tty_kref_put(tty); ... } tty_kref_put(tty); Since the outside kref get/put balance we only do a single kref_put when count reaches 0. The patch below changes things to call tty_kref_put once for every hvc_close call, and with that my machine boots fine. Signed-off-by: Anton Blanchard an...@samba.org Acked-by: Amit Shah amit.s...@redhat.com Signed-off-by: Rusty Russell ru...@rustcorp.com.au