Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On Wed, 5 Nov 2014 15:46:25 + Daniel Silverstone dsilv...@digital-scurf.org wrote: Sadly I can't look at the exact firmware of the device, but I can tell you it was based on ST Electronics' virtual COM port example in their *old* STM32 development kits. From some experience with this at work, I have to say that that sample code had (circa 2010) some serious issues, and we ended up doing a lot of work to make it meet requirements on a project. I'm not that surprised it needs a quirk ... Richard -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On 5 Nov 2014, Johan Hovold told this: On Wed, Nov 05, 2014 at 03:14:49PM +, Nix wrote: What if you physically disconnect and reconnect the device instead, or simply That fixes it, in fact it's the only way to fix it once it's broken by this bug. I didn't mean whether it would get the device working again, but rather whether you could kill it this way. Never seen it fail after a physical disconnection. Ok. And it has to include an enumeration, since just opening and closing it (restarting the daemon) repeatedly seems to work? Well, it's more accurate to say that restarting the daemon doesn't make it fail, but won't make it start working if it's already gone nuts either. It normally copes fine with the daemon stopping and starting, yes (and the daemon copes fine with keys being connected and disconnected). Yeah, it seems your device firmware has crashed. It stops responding to control requests. Not surprising: I was fairly sure we were provoking a key firmware crash or something like that. This is a device with no support for flow control at all, after all, so I'm not terribly surprised that trying to do flow control confuses it. Only weird thing is that it has been coping with those calls for a long time. Only the slightly changed timings seems to have triggered this issue. Yeah. I get the strong impression from Daniel that the USB side of this was done by taking something that worked on the kernel of the day, adding the key-specific stuff to it, and making sure that it still worked. i.e. this was not a from-spec reimplementation, it was using an existing (old) stack. If that stack makes iffy assumptions, so will the entropy key. Look at it? Only Daniel Silverstone (Cc:ed) can do that. The only copy of the firmware I have is baked into the sealed key. :) As his email noted, no he can't :) but you do now have a link to the thing it was based on. Ah, ok. I guess we need a new quirk then. There's already a quirk in the driver to suppress error from when trying to set the control lines. But that doesn't help this device, which happily accepts the requests and then dies at random times. Yeah. Strange. (I thought it was it's 'right after', but you seem to have disproved that. It's only 'sometime later'.) Could you try two more things (too make sure line control is really the culprit): 1. If you clear HUPCL in ekeyd so that the lines are never lowered, does that fix the stability issue? Definitely not. I got a hang after the first reboot out of an afflicted kernel, when using a HUPCLless ekeyd. Weird. (I guess they're lowered on reboot anyway?) 2. Can you verify that the patch below works? Did I use the correct VID/PID? Could you provide lsusb -v output (for the capability flags) as well? The VID/PID are right. The patch seems to work. I tested against the usb testing tree directly, since that was easier than adjusting it to apply against 3.16. Sixteen reboots, no failures, looks to be fixed. lsusb output: Bus 002 Device 003: ID 20df:0001 Simtec Electronics Entropy Key [UDEKEY01] Device Descriptor: bLength18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass2 Communications bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize064 idVendor 0x20df Simtec Electronics idProduct 0x0001 Entropy Key [UDEKEY01] bcdDevice2.00 iManufacturer 1 Simtec Electronics iProduct2 Entropy Key iSerial 3 M/9tBjBLNzE2RSFD bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 67 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xc0 Self Powered MaxPower 76mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC Call Management: bmCapabilities 0x00 bDataInterface 1 CDC ACM: bmCapabilities 0x00 CDC Union: bMasterInterface0 bSlaveInterface 1 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes3 Transfer TypeInterrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 255
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On Thu, Nov 06, 2014 at 01:49:13PM +, Nix wrote: On 5 Nov 2014, Johan Hovold told this: On Wed, Nov 05, 2014 at 03:14:49PM +, Nix wrote: Could you try two more things (too make sure line control is really the culprit): 1. If you clear HUPCL in ekeyd so that the lines are never lowered, does that fix the stability issue? Definitely not. I got a hang after the first reboot out of an afflicted kernel, when using a HUPCLless ekeyd. Weird. (I guess they're lowered on reboot anyway?) It's actually only the timings related to the control-lines being raised on open that has changed, so this would seem consistent with that. 2. Can you verify that the patch below works? Did I use the correct VID/PID? Could you provide lsusb -v output (for the capability flags) as well? The VID/PID are right. The patch seems to work. I tested against the usb testing tree directly, since that was easier than adjusting it to apply against 3.16. Sixteen reboots, no failures, looks to be fixed. Great, thanks for testing. Note that the patch is against the usb-linus branch of the usb tree, OK. (I presume this is gregkh's tree, not yours?) Yes, Greg's tree, could have been more clear about that. which also contains another fix which could affect this device (set_termios will now be called an extra time on every open). You also need this one, which have not yet been applied: http://marc.info/?l=linux-usbm=141520959813505w=2 It had been applied by the time I tested this :) Greg is fast. :) Thanks for tracking this down. That bisect cannot have been fun given the low failure rate (sometimes only one in ten reboots?). Johan -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On 6 Nov 2014, Johan Hovold said: On Thu, Nov 06, 2014 at 01:49:13PM +, Nix wrote: On 5 Nov 2014, Johan Hovold told this: On Wed, Nov 05, 2014 at 03:14:49PM +, Nix wrote: Could you try two more things (too make sure line control is really the culprit): 1. If you clear HUPCL in ekeyd so that the lines are never lowered, does that fix the stability issue? Definitely not. I got a hang after the first reboot out of an afflicted kernel, when using a HUPCLless ekeyd. Weird. (I guess they're lowered on reboot anyway?) It's actually only the timings related to the control-lines being raised on open that has changed, so this would seem consistent with that. Urgh. No wonder it was intermittent. Thanks for tracking this down. That bisect cannot have been fun given the low failure rate (sometimes only one in ten reboots?). It often failed quite fast, but yes, the negative case was hard to prove: I had to rewind twice. I counted reboots because I'm a flaming aspie pedant. 173 reboots that took... thank goodness it replicated on the machine of mine that's fastest to reboot! -- NULL (void) -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On Fri, Oct 31, 2014 at 04:44:46PM +, Nix wrote: Sorry for the delay: illness and work-related release time flurries. On 24 Oct 2014, Johan Hovold told this: [ +CC: linux-usb ] On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote: On 22 Oct 2014, Johan Hovold outgrape: On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote: On 14 Oct 2014, Johan Hovold verbalised: On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote: I have checked: this code is being executed against a symlink that points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's succeeding on the kernel I'm running now, but of course that's 3.16.5 with this commit reverted...) You could verify that by enabling debugging in the cdc-acm driver and making sure that the corresponding control messages are indeed sent on close. I have a debugging dump at http://www.esperi.org.uk/~nix/temporary/cdc-acm.log; it's fairly What kernel were you using here? The log seems to suggest that it was generated with the commit in question reverted. Look now :) (the previous log is still there in cdc-acm-reverted.log.) Unfortunately, it seems the logs are incomplete. There are lots of entries missing (e.g. acm_tty_install when opening, but also some acm_submit_read_urb), some of which were there in your first log. OK. What we have now in http://www.esperi.org.uk/~nix/temporary/cdc-acm.log is a log from the pristine upstream 3.16.6 kernel with cdc-acm debugging turned on and the acm_tty_write - count stuff in acm_tty_write() disabled: I've increased the dmesg buffer size so the top isn't being cut off any more. It took a lot of boots for it to fail this time: about a dozen. The log contains the boot that failed and the one before. (To my uneducated eye, the initial traffic to/from the key looks *very* different in the second boot. Something is clearly wrong by this point, but that's not much of a surprise!) The log appears incomplete again, this time it seems the last part is completely missing (the device is never closed for example). The device still seems to be responding. This contains two boots -- one on which the USB key worked, and the next (with an identical kernel) with which the key was broken. (I'm not sure whether this problem happens at startup or shutdown time, so it seemed best to provide both.) That's a good idea. Is it only after reboot you've seen the device fail? So far. What if you physically disconnect and reconnect the device instead, or simply That fixes it, in fact it's the only way to fix it once it's broken by this bug. I didn't mean whether it would get the device working again, but rather whether you could kill it this way. repeatedly open and close it? Hm. Good idea. Same here, but the below test was also informative. ... no, that doesn't help. Additional log from that shows a lot of what looks like error returns: Oct 31 16:38:03 fold kern debug: : [ 168.135213] cdc_acm 2-1:1.0: acm_tty_close Oct 31 16:38:08 fold kern debug: : [ 173.130531] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x0, len 0x0, result -110 Yeah, it seems your device firmware has crashed. It stops responding to control requests. Oct 31 16:38:08 fold kern debug: : [ 173.130557] cdc_acm 2-1:1.0: acm_port_shutdown Oct 31 16:38:08 fold kern debug: : [ 173.131475] cdc_acm 2-1:1.0: acm_ctrl_irq - urb shutting down with status: -2 Oct 31 16:38:08 fold kern debug: : [ 173.132474] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.132519] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.133473] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.133510] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.134471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.134507] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.135471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.135509] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.136472] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.136507] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.137471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.137517] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.138471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.138520] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.139469] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug:
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On 5 Nov 2014, Johan Hovold stated: On Fri, Oct 31, 2014 at 04:44:46PM +, Nix wrote: Sorry for the delay: illness and work-related release time flurries. On 24 Oct 2014, Johan Hovold told this: [ +CC: linux-usb ] On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote: On 22 Oct 2014, Johan Hovold outgrape: On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote: On 14 Oct 2014, Johan Hovold verbalised: On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote: I have checked: this code is being executed against a symlink that points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's succeeding on the kernel I'm running now, but of course that's 3.16.5 with this commit reverted...) You could verify that by enabling debugging in the cdc-acm driver and making sure that the corresponding control messages are indeed sent on close. I have a debugging dump at http://www.esperi.org.uk/~nix/temporary/cdc-acm.log; it's fairly What kernel were you using here? The log seems to suggest that it was generated with the commit in question reverted. Look now :) (the previous log is still there in cdc-acm-reverted.log.) Unfortunately, it seems the logs are incomplete. There are lots of entries missing (e.g. acm_tty_install when opening, but also some acm_submit_read_urb), some of which were there in your first log. OK. What we have now in http://www.esperi.org.uk/~nix/temporary/cdc-acm.log is a log from the pristine upstream 3.16.6 kernel with cdc-acm debugging turned on and the acm_tty_write - count stuff in acm_tty_write() disabled: I've increased the dmesg buffer size so the top isn't being cut off any more. It took a lot of boots for it to fail this time: about a dozen. The log contains the boot that failed and the one before. (To my uneducated eye, the initial traffic to/from the key looks *very* different in the second boot. Something is clearly wrong by this point, but that's not much of a surprise!) The log appears incomplete again, this time it seems the last part is completely missing (the device is never closed for example). The device still seems to be responding. Nope -- by the time I clipped the log, the device was definitely nonresponsive. I've appended the remaining log, just in case. This is the same as the snapshot I added to the email itself last time: a close-and-open as I tried restarting the daemon, and another close as part of system shutdown. What if you physically disconnect and reconnect the device instead, or simply That fixes it, in fact it's the only way to fix it once it's broken by this bug. I didn't mean whether it would get the device working again, but rather whether you could kill it this way. Never seen it fail after a physical disconnection. ... no, that doesn't help. Additional log from that shows a lot of what looks like error returns: Oct 31 16:38:03 fold kern debug: : [ 168.135213] cdc_acm 2-1:1.0: acm_tty_close Oct 31 16:38:08 fold kern debug: : [ 173.130531] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x0, len 0x0, result -110 Yeah, it seems your device firmware has crashed. It stops responding to control requests. Not surprising: I was fairly sure we were provoking a key firmware crash or something like that. This is a device with no support for flow control at all, after all, so I'm not terribly surprised that trying to do flow control confuses it. The above is all normal, but Oct 31 16:38:08 fold kern debug: : [ 173.161489] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x3, len 0x0, result -62 here's another timeout. It's dead. Again, not surprising. Did you get anywhere with trying to look at the device firmware? Look at it? Only Daniel Silverstone (Cc:ed) can do that. The only copy of the firmware I have is baked into the sealed key. :) -- NULL (void) -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On Wed, Nov 05, 2014 at 15:14:49 +, Nix wrote: Did you get anywhere with trying to look at the device firmware? Look at it? Only Daniel Silverstone (Cc:ed) can do that. The only copy of the firmware I have is baked into the sealed key. :) Sadly I can't look at the exact firmware of the device, but I can tell you it was based on ST Electronics' virtual COM port example in their *old* STM32 development kits. Fortunately I do have a record of that example code here: https://git.gitano.org.uk/stm32-virtcomport.git/ I'm not a USB guru though. D. -- Daniel Silverstone http://www.digital-scurf.org/ PGP mail accepted and encouraged.Key Id: 3CCE BABE 206C 3B69 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On Wed, Nov 05, 2014 at 03:14:49PM +, Nix wrote: On 5 Nov 2014, Johan Hovold stated: On Fri, Oct 31, 2014 at 04:44:46PM +, Nix wrote: Sorry for the delay: illness and work-related release time flurries. On 24 Oct 2014, Johan Hovold told this: The log appears incomplete again, this time it seems the last part is completely missing (the device is never closed for example). The device still seems to be responding. Nope -- by the time I clipped the log, the device was definitely nonresponsive. I've appended the remaining log, just in case. This is the same as the snapshot I added to the email itself last time: a close-and-open as I tried restarting the daemon, and another close as part of system shutdown. Ok, yeah, there's the crash. What if you physically disconnect and reconnect the device instead, or simply That fixes it, in fact it's the only way to fix it once it's broken by this bug. I didn't mean whether it would get the device working again, but rather whether you could kill it this way. Never seen it fail after a physical disconnection. Ok. And it has to include an enumeration, since just opening and closing it (restarting the daemon) repeatedly seems to work? ... no, that doesn't help. Additional log from that shows a lot of what looks like error returns: Oct 31 16:38:03 fold kern debug: : [ 168.135213] cdc_acm 2-1:1.0: acm_tty_close Oct 31 16:38:08 fold kern debug: : [ 173.130531] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x0, len 0x0, result -110 Yeah, it seems your device firmware has crashed. It stops responding to control requests. Not surprising: I was fairly sure we were provoking a key firmware crash or something like that. This is a device with no support for flow control at all, after all, so I'm not terribly surprised that trying to do flow control confuses it. Only weird thing is that it has been coping with those calls for a long time. Only the slightly changed timings seems to have triggered this issue. The above is all normal, but Oct 31 16:38:08 fold kern debug: : [ 173.161489] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x3, len 0x0, result -62 here's another timeout. It's dead. Again, not surprising. Did you get anywhere with trying to look at the device firmware? Look at it? Only Daniel Silverstone (Cc:ed) can do that. The only copy of the firmware I have is baked into the sealed key. :) Ah, ok. I guess we need a new quirk then. There's already a quirk in the driver to suppress error from when trying to set the control lines. But that doesn't help this device, which happily accepts the requests and then dies at random times. Could you try two more things (too make sure line control is really the culprit): 1. If you clear HUPCL in ekeyd so that the lines are never lowered, does that fix the stability issue? 2. Can you verify that the patch below works? Did I use the correct VID/PID? Could you provide lsusb -v output (for the capability flags) as well? Note that the patch is against the usb-linus branch of the usb tree, which also contains another fix which could affect this device (set_termios will now be called an extra time on every open). You also need this one, which have not yet been applied: http://marc.info/?l=linux-usbm=141520959813505w=2 Thanks, Johan From 76abc8a7eda9ea1978ee3527c773210c28332317 Mon Sep 17 00:00:00 2001 From: Johan Hovold jo...@kernel.org Date: Wed, 5 Nov 2014 18:51:54 +0100 Subject: [PATCH] USB: cdc-acm: add quirk for control-line state requests Add new quirk for devices that cannot handle control-line state requests. Note that we currently send these requests to all devices, regardless of whether they claim to support it, but that errors are only logged if support is claimed. Since commit 0943d8ead30e (USB: cdc-acm: use tty-port dtr_rts), which only changed the timings for these requests slightly, this has been reported to cause occasional firmware crashes on Simtec Electronics Entropy Key devices after re-enumeration. Enable the quirk for this device. Reported-by: Nix n...@esperi.org.uk Cc: stable sta...@vger.kernel.org # v3.16 Signed-off-by: Johan Hovold jo...@kernel.org --- drivers/usb/class/cdc-acm.c | 14 -- drivers/usb/class/cdc-acm.h | 2 ++ 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c index 9d6495424b06..077d58ac3dcb 100644 --- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -148,8 +148,15 @@ static int acm_ctrl_msg(struct acm *acm, int request, int value, /* devices aren't required to support these requests. * the cdc acm descriptor tells whether they do... */ -#define acm_set_control(acm, control) \ - acm_ctrl_msg(acm, USB_CDC_REQ_SET_CONTROL_LINE_STATE, control, NULL, 0) +static inline int acm_set_control(struct acm
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
Sorry for the delay: illness and work-related release time flurries. On 24 Oct 2014, Johan Hovold told this: [ +CC: linux-usb ] On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote: On 22 Oct 2014, Johan Hovold outgrape: On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote: On 14 Oct 2014, Johan Hovold verbalised: On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote: I have checked: this code is being executed against a symlink that points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's succeeding on the kernel I'm running now, but of course that's 3.16.5 with this commit reverted...) You could verify that by enabling debugging in the cdc-acm driver and making sure that the corresponding control messages are indeed sent on close. I have a debugging dump at http://www.esperi.org.uk/~nix/temporary/cdc-acm.log; it's fairly What kernel were you using here? The log seems to suggest that it was generated with the commit in question reverted. Look now :) (the previous log is still there in cdc-acm-reverted.log.) Unfortunately, it seems the logs are incomplete. There are lots of entries missing (e.g. acm_tty_install when opening, but also some acm_submit_read_urb), some of which were there in your first log. OK. What we have now in http://www.esperi.org.uk/~nix/temporary/cdc-acm.log is a log from the pristine upstream 3.16.6 kernel with cdc-acm debugging turned on and the acm_tty_write - count stuff in acm_tty_write() disabled: I've increased the dmesg buffer size so the top isn't being cut off any more. It took a lot of boots for it to fail this time: about a dozen. The log contains the boot that failed and the one before. (To my uneducated eye, the initial traffic to/from the key looks *very* different in the second boot. Something is clearly wrong by this point, but that's not much of a surprise!) This contains two boots -- one on which the USB key worked, and the next (with an identical kernel) with which the key was broken. (I'm not sure whether this problem happens at startup or shutdown time, so it seemed best to provide both.) That's a good idea. Is it only after reboot you've seen the device fail? So far. What if you physically disconnect and reconnect the device instead, or simply That fixes it, in fact it's the only way to fix it once it's broken by this bug. repeatedly open and close it? Hm. Good idea. ... no, that doesn't help. Additional log from that shows a lot of what looks like error returns: Oct 31 16:38:03 fold kern debug: : [ 168.135213] cdc_acm 2-1:1.0: acm_tty_close Oct 31 16:38:08 fold kern debug: : [ 173.130531] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x0, len 0x0, result -110 Oct 31 16:38:08 fold kern debug: : [ 173.130557] cdc_acm 2-1:1.0: acm_port_shutdown Oct 31 16:38:08 fold kern debug: : [ 173.131475] cdc_acm 2-1:1.0: acm_ctrl_irq - urb shutting down with status: -2 Oct 31 16:38:08 fold kern debug: : [ 173.132474] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.132519] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.133473] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.133510] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.134471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.134507] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.135471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.135509] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.136472] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.136507] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.137471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.137517] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.138471] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.138520] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.139469] cdc_acm 2-1:1.1: acm_write_bulk - len 0/1, status -2 Oct 31 16:38:08 fold kern debug: : [ 173.139515] cdc_acm 2-1:1.1: acm_softint Oct 31 16:38:08 fold kern debug: : [ 173.140470] cdc_acm 2-1:1.1: acm_read_bulk_callback - urb 0, len 0 Oct 31 16:38:08 fold kern debug: : [ 173.140491] cdc_acm 2-1:1.1: acm_read_bulk_callback - non-zero urb status: -2 Oct 31 16:38:08 fold kern debug: : [ 173.141469] cdc_acm 2-1:1.1: acm_read_bulk_callback - urb 1, len 0 Oct 31 16:38:08 fold kern debug: : [ 173.141489] cdc_acm 2-1:1.1: acm_read_bulk_callback - non-zero urb status: -2 Oct 31 16:38:08 fold kern debug: : [ 173.142469] cdc_acm 2-1:1.1:
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
[ +CC: linux-usb ] On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote: On 22 Oct 2014, Johan Hovold outgrape: On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote: On 14 Oct 2014, Johan Hovold verbalised: On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote: I have checked: this code is being executed against a symlink that points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's succeeding on the kernel I'm running now, but of course that's 3.16.5 with this commit reverted...) You could verify that by enabling debugging in the cdc-acm driver and making sure that the corresponding control messages are indeed sent on close. I have a debugging dump at http://www.esperi.org.uk/~nix/temporary/cdc-acm.log; it's fairly What kernel were you using here? The log seems to suggest that it was generated with the commit in question reverted. Look now :) (the previous log is still there in cdc-acm-reverted.log.) Unfortunately, it seems the logs are incomplete. There are lots of entries missing (e.g. acm_tty_install when opening, but also some acm_submit_read_urb), some of which were there in your first log. Also you can disable the acm_tty_write - count output, which isn't really useful here. This contains two boots -- one on which the USB key worked, and the next (with an identical kernel) with which the key was broken. (I'm not sure whether this problem happens at startup or shutdown time, so it seemed best to provide both.) That's a good idea. Is it only after reboot you've seen the device fail? What if you physically disconnect and reconnect the device instead, or simply repeatedly open and close it? Johan -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16
On 24 Oct 2014, Johan Hovold told this: [ +CC: linux-usb ] On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote: On 22 Oct 2014, Johan Hovold outgrape: On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote: On 14 Oct 2014, Johan Hovold verbalised: On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote: I have checked: this code is being executed against a symlink that points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's succeeding on the kernel I'm running now, but of course that's 3.16.5 with this commit reverted...) You could verify that by enabling debugging in the cdc-acm driver and making sure that the corresponding control messages are indeed sent on close. I have a debugging dump at http://www.esperi.org.uk/~nix/temporary/cdc-acm.log; it's fairly What kernel were you using here? The log seems to suggest that it was generated with the commit in question reverted. Look now :) (the previous log is still there in cdc-acm-reverted.log.) Unfortunately, it seems the logs are incomplete. There are lots of entries missing (e.g. acm_tty_install when opening, but also some acm_submit_read_urb), some of which were there in your first log. Curious. It's a straight cut-and-paste from the syslog. Maybe the kmsg buffer overflowed, but I start the ekey daemon *after* I start syslogd, so that seems unlikely. I'll have another look. Is it only after reboot you've seen the device fail? Yes (though sometimes after reboot of an unaffected kernel into an affected one! i.e. sometimes the first boot into an affected kernel is broken). What if you physically disconnect and reconnect the device instead, That makes it work. or simply repeatedly open and close it? IIRC -- but I'll have to check this tomorrow when I look at the log problem, so don't take it as gospel -- that doesn't affect it: I can stop and restart the daemon repeatedly and, if it wasn't working before, it's not working afterwards: if it was working before, it'll be working afterwards. -- NULL (void) -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html