Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-30 Thread A Sun
On 3/30/2017 3:12 AM, Sean Young wrote:
> On Wed, Mar 29, 2017 at 06:04:58PM -0400, A Sun wrote:
>> On 3/29/2017 5:06 PM, Sean Young wrote:
>> 
>>>
>>> Anyway, you're right and this patch looks ok. It would be nice to have the
>>> tx case handled too though.
>>>
>>> Thanks
>>> Sean
>>>
>>
>> Thanks; I'm looking at handling the tx case. If I can figure out the 
>> details, I'll post a new patch proposal separate, and likely dependent, on 
>> this one.
>>
>> My main obstacle at the moment, is I'm looking for a way to get mceusb 
>> device to respond with a USB TX error halt/stall (rather than the typical 
>> ACK and NAK) on a TX endpoint, in order to test halt/stall error detection 
>> and recovery for TX. ..A Sun
> 
> If you send IR, the drivers send a usb packet. However, the kernel will
> sleep for however long the IR is in ir_lirc_transmit_ir, so your other option
> is to set the transmit carrier repeatedly instead. You'd have to set the
> carrier to a different value every time.
> 
> 
> {
>   int fd, carrier;
> 
>   fd = open("/dev/lirc0", O_RDWR);
>   carrier = 38000;
>   for (;;) {
>   ioctl(fd, LIRC_SET_SEND_CARRIER, );
>   if (++carrier >= 4)
>   carrier = 38000;
>   }
> }
> 
> 
> Sean
> 

Thanks, this is good to know, for testing where multiple TX I/Os are pending 
prior to fault. 

I found a way to set up the TX -EPIPE fault administratively:

retval = usb_control_msg(ir->usbdev, usb_sndctrlpipe(ir->usbdev, 0),
USB_REQ_SET_FEATURE, USB_RECIP_ENDPOINT,
USB_ENDPOINT_HALT, usb_pipeendpoint(ir->pipe_out),
NULL, 0, USB_CTRL_SET_TIMEOUT);
dev_dbg(ir->dev, "set halt retval, %d", retval);

and have replications now for TX error and lock -out. Error occurs for every 
TX. No message flooding otherwise, as we expect. The RX side remains working.

...
[  249.986174] mceusb 1-1.2:1.0: requesting 38000 HZ carrier
[  249.986210] mceusb 1-1.2:1.0: send request called (size=0x4)
[  249.986256] mceusb 1-1.2:1.0: send request complete (res=0)
[  249.986403] mceusb 1-1.2:1.0: Error: request urb status = -32 (TX HALT)
[  249.999885] mceusb 1-1.2:1.0: send request called (size=0x3)
[  249.29] mceusb 1-1.2:1.0: send request complete (res=0)
[  250.13] mceusb 1-1.2:1.0: Error: request urb status = -32 (TX HALT)
[  250.019830] mceusb 1-1.2:1.0: send request called (size=0x21)
[  250.019868] mceusb 1-1.2:1.0: send request complete (res=0)
[  250.020007] mceusb 1-1.2:1.0: Error: request urb status = -32 (TX HALT)
...



Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-30 Thread Sean Young
On Wed, Mar 29, 2017 at 06:04:58PM -0400, A Sun wrote:
> On 3/29/2017 5:06 PM, Sean Young wrote:
> 
> > 
> > Anyway, you're right and this patch looks ok. It would be nice to have the
> > tx case handled too though.
> > 
> > Thanks
> > Sean
> > 
> 
> Thanks; I'm looking at handling the tx case. If I can figure out the details, 
> I'll post a new patch proposal separate, and likely dependent, on this one.
> 
> My main obstacle at the moment, is I'm looking for a way to get mceusb device 
> to respond with a USB TX error halt/stall (rather than the typical ACK and 
> NAK) on a TX endpoint, in order to test halt/stall error detection and 
> recovery for TX. ..A Sun

If you send IR, the drivers send a usb packet. However, the kernel will
sleep for however long the IR is in ir_lirc_transmit_ir, so your other option
is to set the transmit carrier repeatedly instead. You'd have to set the
carrier to a different value every time.


{
int fd, carrier;

fd = open("/dev/lirc0", O_RDWR);
carrier = 38000;
for (;;) {
ioctl(fd, LIRC_SET_SEND_CARRIER, );
if (++carrier >= 4)
carrier = 38000;
}
}


Sean


Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-29 Thread A Sun
On 3/29/2017 5:06 PM, Sean Young wrote:

> 
> Anyway, you're right and this patch looks ok. It would be nice to have the
> tx case handled too though.
> 
> Thanks
> Sean
> 

Thanks; I'm looking at handling the tx case. If I can figure out the details, 
I'll post a new patch proposal separate, and likely dependent, on this one.

My main obstacle at the moment, is I'm looking for a way to get mceusb device 
to respond with a USB TX error halt/stall (rather than the typical ACK and NAK) 
on a TX endpoint, in order to test halt/stall error detection and recovery for 
TX. ..A Sun



Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-29 Thread Sean Young
On Tue, Mar 28, 2017 at 09:40:05PM -0400, A Sun wrote:
> On 3/28/2017 4:25 PM, Sean Young wrote:
> 
> >  
> >> The unused EVENT_TX_HALT and the apparently extra _kevent functions and 
> >> kevent_flags are necessary for a later:
> >> [PATCH] [media] mceusb: TX -EPIPE lockup fix
> >> ...not yet written, transmit side equivalent bug. I respectfully recommend 
> >> keeping these hooks in place.
> > 
> > Have you observed this happening?
> > 
> 
> Not yet for my Infrared Transceiver device only. USB halt/stall errors 
> apparently are not USB device specific, and can occur with both TX and RX 
> according to the Linux Urb errors documentation. Calling usb_clear_halt() is 
> required for both cases to recover and restore operation of the stalled 
> endpoint.

Yes, I did not realise this, but you're right, they can happen in both
cases.

> The TX -EPIPE error is already separated out by code in the mceusb driver, 
> but with no error recovery handling.
> In mce_async_callback()
> case -EPIPE:
> default:
> dev_err(ir->dev, "Error: request urb status = %d", 
> urb->status);
> break;
> }
> 
> I believe I can trigger the condition by stress test flooding or otherwise 
> misusing the device's TX end-point in the driver (like the RX case), but I 
> haven't put much work into that yet.

The problem in the rx case is that in the EPIPE error case, the urb simply
resubmitted from the urb callback. In the tx case this does not happen
so I wouldn't expect a flood. Also after initialisation, no commands are
sent to the mceusb device except for IR transmit or tx carrier.

> > Speaking of which, how do you reproduce the original -EPIPE issue? I've
> > tried to reproduce on my raspberry pi 3 with a very similar mceusb
> > device, but I haven't had any luck.
> > 
> 
> Reproduction of the RX -EPIPE is "hard" to get. I haven't found a consistent 
> way to reproduce, other than inconsistently by:
>Bug trigger appears to be normal, but heavy, IR receiver use.
> In particular, punch up a bunch of buttons at random on a remote control (I 
> used a Sony TV remote) and include some long button presses in the process. 
> Do for say about 1 minute at the time. The bug won't trigger for idle or 
> occasional IR receiver activity.
> Some mceusb devices may be more susceptible to the problem than others.
> 
> > What's the lsusb -vv for this device?
> 
> Here it is, on my raspberry pi 3:
> 
> ...
> Bus 001 Device 006: ID 2304:0225 Pinnacle Systems, Inc. Remote Kit Infrared 
> Transceiver
> Device Descriptor:
>   bLength18
>   bDescriptorType 1
>   bcdUSB   2.00
>   bDeviceClass0 (Defined at Interface level)
>   bDeviceSubClass 0
>   bDeviceProtocol 0
>   bMaxPacketSize0 8
>   idVendor   0x2304 Pinnacle Systems, Inc.
>   idProduct  0x0225 Remote Kit Infrared Transceiver
>   bcdDevice0.01
>   iManufacturer   1 Pinnacle Systems
>   iProduct2 PCTV Remote USB
>   iSerial 5 7FFF
>   bNumConfigurations  1
>   Configuration Descriptor:
> bLength 9
> bDescriptorType 2
> wTotalLength   32
> bNumInterfaces  1
> bConfigurationValue 1
> iConfiguration  3 StandardConfiguration
> bmAttributes 0xa0
>   (Bus Powered)
>   Remote Wakeup
> MaxPower  100mA
> Interface Descriptor:
>   bLength 9
>   bDescriptorType 4
>   bInterfaceNumber0
>   bAlternateSetting   0
>   bNumEndpoints   2
>   bInterfaceClass   255 Vendor Specific Class
>   bInterfaceSubClass  0
>   bInterfaceProtocol  0
>   iInterface  4 StandardInterface
>   Endpoint Descriptor:
> bLength 7
> bDescriptorType 5
> bEndpointAddress 0x81  EP 1 IN
> bmAttributes2
>   Transfer TypeBulk
>   Synch Type   None
>   Usage Type   Data
> wMaxPacketSize 0x0040  1x 64 bytes
> bInterval  10
>   Endpoint Descriptor:
> bLength 7
> bDescriptorType 5
> bEndpointAddress 0x02  EP 2 OUT
> bmAttributes2
>   Transfer TypeBulk
>   Synch Type   None
>   Usage Type   Data
> wMaxPacketSize 0x0040  1x 64 bytes
> bInterval  10
> Device Status: 0x
>   (Bus Powered)
> ...
> 
> The most recent bug replication I got was while testing the fix methodology 
> for [patch 1/3]. The fault, when it occurs, is between IR data blocks during 
> receive.
> 
> ...
> Mar 22 12:16:35 raspberrypi kernel: [ 4863.489177] mceusb 1-1.2:1.0: rx data: 
> 90 7f 7f 7f 7f 7f 7f 7f 7f 7f 

Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-28 Thread A Sun
On 3/28/2017 4:25 PM, Sean Young wrote:

>  
>> The unused EVENT_TX_HALT and the apparently extra _kevent functions and 
>> kevent_flags are necessary for a later:
>> [PATCH] [media] mceusb: TX -EPIPE lockup fix
>> ...not yet written, transmit side equivalent bug. I respectfully recommend 
>> keeping these hooks in place.
> 
> Have you observed this happening?
> 

Not yet for my Infrared Transceiver device only. USB halt/stall errors 
apparently are not USB device specific, and can occur with both TX and RX 
according to the Linux Urb errors documentation. Calling usb_clear_halt() is 
required for both cases to recover and restore operation of the stalled 
endpoint.

The TX -EPIPE error is already separated out by code in the mceusb driver, but 
with no error recovery handling.
In mce_async_callback()
case -EPIPE:
default:
dev_err(ir->dev, "Error: request urb status = %d", urb->status);
break;
}

I believe I can trigger the condition by stress test flooding or otherwise 
misusing the device's TX end-point in the driver (like the RX case), but I 
haven't put much work into that yet.

> Speaking of which, how do you reproduce the original -EPIPE issue? I've
> tried to reproduce on my raspberry pi 3 with a very similar mceusb
> device, but I haven't had any luck.
> 

Reproduction of the RX -EPIPE is "hard" to get. I haven't found a consistent 
way to reproduce, other than inconsistently by:
   Bug trigger appears to be normal, but heavy, IR receiver use.
In particular, punch up a bunch of buttons at random on a remote control (I 
used a Sony TV remote) and include some long button presses in the process. Do 
for say about 1 minute at the time. The bug won't trigger for idle or 
occasional IR receiver activity.
Some mceusb devices may be more susceptible to the problem than others.

> What's the lsusb -vv for this device?

Here it is, on my raspberry pi 3:

...
Bus 001 Device 006: ID 2304:0225 Pinnacle Systems, Inc. Remote Kit Infrared 
Transceiver
Device Descriptor:
  bLength18
  bDescriptorType 1
  bcdUSB   2.00
  bDeviceClass0 (Defined at Interface level)
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 8
  idVendor   0x2304 Pinnacle Systems, Inc.
  idProduct  0x0225 Remote Kit Infrared Transceiver
  bcdDevice0.01
  iManufacturer   1 Pinnacle Systems
  iProduct2 PCTV Remote USB
  iSerial 5 7FFF
  bNumConfigurations  1
  Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength   32
bNumInterfaces  1
bConfigurationValue 1
iConfiguration  3 StandardConfiguration
bmAttributes 0xa0
  (Bus Powered)
  Remote Wakeup
MaxPower  100mA
Interface Descriptor:
  bLength 9
  bDescriptorType 4
  bInterfaceNumber0
  bAlternateSetting   0
  bNumEndpoints   2
  bInterfaceClass   255 Vendor Specific Class
  bInterfaceSubClass  0
  bInterfaceProtocol  0
  iInterface  4 StandardInterface
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81  EP 1 IN
bmAttributes2
  Transfer TypeBulk
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0040  1x 64 bytes
bInterval  10
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02  EP 2 OUT
bmAttributes2
  Transfer TypeBulk
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0040  1x 64 bytes
bInterval  10
Device Status: 0x
  (Bus Powered)
...

The most recent bug replication I got was while testing the fix methodology for 
[patch 1/3]. The fault, when it occurs, is between IR data blocks during 
receive.

...
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489177] mceusb 1-1.2:1.0: rx data: 
90 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f 7f (length=17)
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489184] mceusb 1-1.2:1.0: Raw IR 
data, 16 pulse/space samples
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489189] mceusb 1-1.2:1.0: Storing 
space with duration 635
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489195] mceusb 1-1.2:1.0: Storing 
space with duration 635
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489199] mceusb 1-1.2:1.0: Storing 
space with duration 635
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489203] mceusb 1-1.2:1.0: Storing 
space with duration 635
Mar 22 12:16:35 raspberrypi kernel: [ 4863.489207] mceusb 1-1.2:1.0: Storing 
space with 

Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-28 Thread Sean Young
On Mon, Mar 27, 2017 at 04:18:33AM -0400, A Sun wrote:
> On 3/26/2017 4:31 PM, Sean Young wrote:
> > On Sun, Mar 26, 2017 at 02:28:08PM -0400, A Sun wrote:
> >> commit 
> >> https://github.com/asunxx/linux/commit/17fe3b51f4ad5202a876ea4c92b5d99d4e166823
> >> Author: A Sun 
> >> Date:   Sun, 26 Mar 2017 13:24:18 -0400 
> > 
> > Please don't include this.
> > 
> >>
> ...
> >> mceusb 1-1.2:1.0: 2 tx ports (0x1 cabled) and 2 rx sensors (0x1 active)
> > 
> > It would be nice to have this tested against a mainline kernel. I thought
> > that was entirely possible on raspberry pis nowadays.
> ...
> >> +  /* kevent support */
> >> +  struct work_struct kevent;
> > 
> > kevent is not a descriptive name. How about something like clear_halt?
> > 
> >> +  unsigned long kevent_flags;
> >> +# define EVENT_TX_HALT0
> >> +# define EVENT_RX_HALT1
> > 
> > EVENT_TX_HALT is never used, so kevent_flags is only ever set to 1. The
> > entire field can be dropped.
> > 
> ...
> >> +  if (!schedule_work(>kevent)) {
> >> +  dev_err(ir->dev, "kevent %d may have been dropped", kevent);
> >> +  } else {
> >> +  dev_dbg(ir->dev, "kevent %d scheduled", kevent);
> >> +  }
> >> +}
> > 
> > Again name is not very descriptive.
> > 
> ...
> >> +  dev_err(ir->dev, "Error: urb status = %d (RX HALT)",
> >> +  urb->status);
> >> +  mceusb_defer_kevent(ir, EVENT_RX_HALT);
> > 
> > Here you could simply call schedule_work(). Note that EPIPE might also
> > be returned for device disconnect for some host controllers.
> > 
> >> +  return;
> ...
> >> +  int status;
> >> +
> >> +  if (test_bit(EVENT_RX_HALT, >kevent_flags)) {
> > 
> > If condition can go.
> > 
> >> +  usb_unlink_urb(ir->urb_in);
> >> +  status = usb_clear_halt(ir->usbdev, ir->pipe_in);
> 
> Hi Sean,
> 
> Thanks again for looking at this. This patch is based on similar error and 
> recovery, with the USB ethernet driver usbnet (usbnet.c, usbnet.h).
> 
> In usbnet, they call "kevent" (kernel device event?) any kind of hardware 
> state change or event in interrupt context that requires invoking 
> non-interrupt code to handle. I'm not sure what else I should name it. 
> Possible kevent-s are not limited to situations needing usb_clear_halt(). 
> From usbnet:
>  69 #   define EVENT_TX_HALT0
>  70 #   define EVENT_RX_HALT1
>  71 #   define EVENT_RX_MEMORY  2
>  72 #   define EVENT_STS_SPLIT  3
>  73 #   define EVENT_LINK_RESET 4
>  74 #   define EVENT_RX_PAUSED  5
>  75 #   define EVENT_DEV_ASLEEP 6
>  76 #   define EVENT_DEV_OPEN   7
>  77 #   define EVENT_DEVICE_REPORT_IDLE 8
>  78 #   define EVENT_NO_RUNTIME_PM  9
>  79 #   define EVENT_RX_KILL10
>  80 #   define EVENT_LINK_CHANGE11
>  81 #   define EVENT_SET_RX_MODE12
> So far, the first two are appearing applicable for mceusb.

So far, only EVENT_RX_HALT is relevant for mceusb. It is likely that, this
will the be only one we will ever need in which case all this kevent kind
of business is completely unnecessary for a simple usb driver, rather
than usb networking driver infrastructure.
 
> The unused EVENT_TX_HALT and the apparently extra _kevent functions and 
> kevent_flags are necessary for a later:
> [PATCH] [media] mceusb: TX -EPIPE lockup fix
> ...not yet written, transmit side equivalent bug. I respectfully recommend 
> keeping these hooks in place.

Have you observed this happening?

Speaking of which, how do you reproduce the original -EPIPE issue? I've
tried to reproduce on my raspberry pi 3 with a very similar mceusb
device, but I haven't had any luck.

What's the lsusb -vv for this device?

> For now, I think the transmit side EPIPE bug fix is less critical, since the 
> TX bug avoids hanging the host/kernel, but would still cause lockup of the 
> device.
> 
> In case of RX EPIPE on disconnect, the fix is still safe. Recovery attempt 
> should fail (in usb_clear_halt() or usb_submit_urb()) and abort without 
> further retry, and the recovery handler itself gets shutdown in 
> mceusb_dev_disconnect().


Thanks
Sean


Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-27 Thread A Sun
On 3/26/2017 4:31 PM, Sean Young wrote:
> On Sun, Mar 26, 2017 at 02:28:08PM -0400, A Sun wrote:
>> commit 
>> https://github.com/asunxx/linux/commit/17fe3b51f4ad5202a876ea4c92b5d99d4e166823
>> Author: A Sun 
>> Date:   Sun, 26 Mar 2017 13:24:18 -0400 
> 
> Please don't include this.
> 
>>
...
>> mceusb 1-1.2:1.0: 2 tx ports (0x1 cabled) and 2 rx sensors (0x1 active)
> 
> It would be nice to have this tested against a mainline kernel. I thought
> that was entirely possible on raspberry pis nowadays.
...
>> +/* kevent support */
>> +struct work_struct kevent;
> 
> kevent is not a descriptive name. How about something like clear_halt?
> 
>> +unsigned long kevent_flags;
>> +#   define EVENT_TX_HALT0
>> +#   define EVENT_RX_HALT1
> 
> EVENT_TX_HALT is never used, so kevent_flags is only ever set to 1. The
> entire field can be dropped.
> 
...
>> +if (!schedule_work(>kevent)) {
>> +dev_err(ir->dev, "kevent %d may have been dropped", kevent);
>> +} else {
>> +dev_dbg(ir->dev, "kevent %d scheduled", kevent);
>> +}
>> +}
> 
> Again name is not very descriptive.
> 
...
>> +dev_err(ir->dev, "Error: urb status = %d (RX HALT)",
>> +urb->status);
>> +mceusb_defer_kevent(ir, EVENT_RX_HALT);
> 
> Here you could simply call schedule_work(). Note that EPIPE might also
> be returned for device disconnect for some host controllers.
> 
>> +return;
...
>> +int status;
>> +
>> +if (test_bit(EVENT_RX_HALT, >kevent_flags)) {
> 
> If condition can go.
> 
>> +usb_unlink_urb(ir->urb_in);
>> +status = usb_clear_halt(ir->usbdev, ir->pipe_in);

Hi Sean,

Thanks again for looking at this. This patch is based on similar error and 
recovery, with the USB ethernet driver usbnet (usbnet.c, usbnet.h).

In usbnet, they call "kevent" (kernel device event?) any kind of hardware state 
change or event in interrupt context that requires invoking non-interrupt code 
to handle. I'm not sure what else I should name it. Possible kevent-s are not 
limited to situations needing usb_clear_halt(). From usbnet:
 69 #   define EVENT_TX_HALT0
 70 #   define EVENT_RX_HALT1
 71 #   define EVENT_RX_MEMORY  2
 72 #   define EVENT_STS_SPLIT  3
 73 #   define EVENT_LINK_RESET 4
 74 #   define EVENT_RX_PAUSED  5
 75 #   define EVENT_DEV_ASLEEP 6
 76 #   define EVENT_DEV_OPEN   7
 77 #   define EVENT_DEVICE_REPORT_IDLE 8
 78 #   define EVENT_NO_RUNTIME_PM  9
 79 #   define EVENT_RX_KILL10
 80 #   define EVENT_LINK_CHANGE11
 81 #   define EVENT_SET_RX_MODE12
So far, the first two are appearing applicable for mceusb.

The unused EVENT_TX_HALT and the apparently extra _kevent functions and 
kevent_flags are necessary for a later:
[PATCH] [media] mceusb: TX -EPIPE lockup fix
...not yet written, transmit side equivalent bug. I respectfully recommend 
keeping these hooks in place.

For now, I think the transmit side EPIPE bug fix is less critical, since the TX 
bug avoids hanging the host/kernel, but would still cause lockup of the device.

In case of RX EPIPE on disconnect, the fix is still safe. Recovery attempt 
should fail (in usb_clear_halt() or usb_submit_urb()) and abort without further 
retry, and the recovery handler itself gets shutdown in mceusb_dev_disconnect().

Please let me know how to proceed. Thanks. ..A Sun


Re: [PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-26 Thread Sean Young
On Sun, Mar 26, 2017 at 02:28:08PM -0400, A Sun wrote:
> commit 
> https://github.com/asunxx/linux/commit/17fe3b51f4ad5202a876ea4c92b5d99d4e166823
> Author: A Sun 
> Date:   Sun, 26 Mar 2017 13:24:18 -0400 

Please don't include this.

> 
> Bug:
> 
> RX -EPIPE failure with infinite loop and flooding of
> Mar 20 22:24:56 raspberrypi kernel: [ 2851.966506] mceusb 1-1.2:1.0: Error: 
> urb status = -32
> log message at 8000 messages per second.
> Bug trigger appears to be normal, but heavy, IR receiver use.
> Driver and Linux host become unusable after error.
> Also seen at https://sourceforge.net/p/lirc/mailman/message/34886165/
> 
> Fix:
> 
> Message reports RX usb halt (stall) condition requiring usb_clear_halt() call 
> in non-interrupt context to recover.
> Add driver workqueue call to perform this recovery based on method in use for 
> the usbnet device driver.
> 
> Tested with:
> 
> Linux raspberrypi 4.4.50-v7+ #970 SMP Mon Feb 20 19:18:29 GMT 2017 armv7l 
> GNU/Linux
> mceusb 1-1.2:1.0: Registered Pinnacle Systems PCTV Remote USB with mce 
> emulator interface version 1
> mceusb 1-1.2:1.0: 2 tx ports (0x1 cabled) and 2 rx sensors (0x1 active)

It would be nice to have this tested against a mainline kernel. I thought
that was entirely possible on raspberry pis nowadays.
> 
> Signed-off-by: A Sun 
> ---
>  drivers/media/rc/mceusb.c | 67 
> ++-
>  1 file changed, 66 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c
> index 238d8ea..7b6f9e5 100644
> --- a/drivers/media/rc/mceusb.c
> +++ b/drivers/media/rc/mceusb.c
> @@ -36,12 +36,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  
> -#define DRIVER_VERSION   "1.92"
> +#define DRIVER_VERSION   "1.93"
>  #define DRIVER_AUTHOR"Jarod Wilson "
>  #define DRIVER_DESC  "Windows Media Center Ed. eHome Infrared Transceiver " \
>   "device driver"
> @@ -417,6 +418,7 @@ struct mceusb_dev {
>   /* usb */
>   struct usb_device *usbdev;
>   struct urb *urb_in;
> + unsigned int pipe_in;
>   struct usb_endpoint_descriptor *usb_ep_out;
>  
>   /* buffers and dma */
> @@ -454,6 +456,12 @@ struct mceusb_dev {
>   u8 num_rxports; /* number of receive sensors */
>   u8 txports_cabled;  /* bitmask of transmitters with cable */
>   u8 rxports_active;  /* bitmask of active receive sensors */
> +
> + /* kevent support */
> + struct work_struct kevent;

kevent is not a descriptive name. How about something like clear_halt?

> + unsigned long kevent_flags;
> +#define EVENT_TX_HALT0
> +#define EVENT_RX_HALT1

EVENT_TX_HALT is never used, so kevent_flags is only ever set to 1. The
entire field can be dropped.

>  };
>  
>  /* MCE Device Command Strings, generally a port and command pair */
> @@ -1025,6 +1033,23 @@ static void mceusb_process_ir_data(struct mceusb_dev 
> *ir, int buf_len)
>   }
>  }
>  
> +/*
> + * Workqueue task dispatcher
> + * for work that can't be done in interrupt handlers
> + * (mceusb_dev_recv() and mce_async_callback()) nor tasklets.
> + * Invokes mceusb_deferred_kevent() for recovering from
> + * error events specified by the kevent bit field.
> + */
> +static void mceusb_defer_kevent(struct mceusb_dev *ir, int kevent)
> +{
> + set_bit(kevent, >kevent_flags);
> + if (!schedule_work(>kevent)) {
> + dev_err(ir->dev, "kevent %d may have been dropped", kevent);
> + } else {
> + dev_dbg(ir->dev, "kevent %d scheduled", kevent);
> + }
> +}

Again name is not very descriptive.

> +
>  static void mceusb_dev_recv(struct urb *urb)
>  {
>   struct mceusb_dev *ir;
> @@ -1052,6 +1077,11 @@ static void mceusb_dev_recv(struct urb *urb)
>   return;
>  
>   case -EPIPE:
> + dev_err(ir->dev, "Error: urb status = %d (RX HALT)",
> + urb->status);
> + mceusb_defer_kevent(ir, EVENT_RX_HALT);

Here you could simply call schedule_work(). Note that EPIPE might also
be returned for device disconnect for some host controllers.

> + return;
> +
>   default:
>   dev_err(ir->dev, "Error: urb status = %d", urb->status);
>   break;
> @@ -1170,6 +1200,37 @@ static void mceusb_flash_led(struct mceusb_dev *ir)
>   mce_async_out(ir, FLASH_LED, sizeof(FLASH_LED));
>  }
>  
> +/*
> + * Workqueue function
> + * for resetting or recovering device after occurrence of error events
> + * specified in ir->kevent bit field.
> + * Function runs (via schedule_work()) in non-interrupt context, for
> + * calls here (such as usb_clear_halt()) requiring non-interrupt context.
> + */
> +static void mceusb_deferred_kevent(struct work_struct *work)
> +{
> + struct mceusb_dev *ir =
> + container_of(work, 

[PATCH 1/3] [media] mceusb: RX -EPIPE (urb status = -32) lockup failure fix

2017-03-26 Thread A Sun
commit 
https://github.com/asunxx/linux/commit/17fe3b51f4ad5202a876ea4c92b5d99d4e166823
Author: A Sun 
Date:   Sun, 26 Mar 2017 13:24:18 -0400 

Bug:

RX -EPIPE failure with infinite loop and flooding of
Mar 20 22:24:56 raspberrypi kernel: [ 2851.966506] mceusb 1-1.2:1.0: Error: urb 
status = -32
log message at 8000 messages per second.
Bug trigger appears to be normal, but heavy, IR receiver use.
Driver and Linux host become unusable after error.
Also seen at https://sourceforge.net/p/lirc/mailman/message/34886165/

Fix:

Message reports RX usb halt (stall) condition requiring usb_clear_halt() call 
in non-interrupt context to recover.
Add driver workqueue call to perform this recovery based on method in use for 
the usbnet device driver.

Tested with:

Linux raspberrypi 4.4.50-v7+ #970 SMP Mon Feb 20 19:18:29 GMT 2017 armv7l 
GNU/Linux
mceusb 1-1.2:1.0: Registered Pinnacle Systems PCTV Remote USB with mce emulator 
interface version 1
mceusb 1-1.2:1.0: 2 tx ports (0x1 cabled) and 2 rx sensors (0x1 active)

Signed-off-by: A Sun 
---
 drivers/media/rc/mceusb.c | 67 ++-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/drivers/media/rc/mceusb.c b/drivers/media/rc/mceusb.c
index 238d8ea..7b6f9e5 100644
--- a/drivers/media/rc/mceusb.c
+++ b/drivers/media/rc/mceusb.c
@@ -36,12 +36,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
-#define DRIVER_VERSION "1.92"
+#define DRIVER_VERSION "1.93"
 #define DRIVER_AUTHOR  "Jarod Wilson "
 #define DRIVER_DESC"Windows Media Center Ed. eHome Infrared Transceiver " \
"device driver"
@@ -417,6 +418,7 @@ struct mceusb_dev {
/* usb */
struct usb_device *usbdev;
struct urb *urb_in;
+   unsigned int pipe_in;
struct usb_endpoint_descriptor *usb_ep_out;
 
/* buffers and dma */
@@ -454,6 +456,12 @@ struct mceusb_dev {
u8 num_rxports; /* number of receive sensors */
u8 txports_cabled;  /* bitmask of transmitters with cable */
u8 rxports_active;  /* bitmask of active receive sensors */
+
+   /* kevent support */
+   struct work_struct kevent;
+   unsigned long kevent_flags;
+#  define EVENT_TX_HALT0
+#  define EVENT_RX_HALT1
 };
 
 /* MCE Device Command Strings, generally a port and command pair */
@@ -1025,6 +1033,23 @@ static void mceusb_process_ir_data(struct mceusb_dev 
*ir, int buf_len)
}
 }
 
+/*
+ * Workqueue task dispatcher
+ * for work that can't be done in interrupt handlers
+ * (mceusb_dev_recv() and mce_async_callback()) nor tasklets.
+ * Invokes mceusb_deferred_kevent() for recovering from
+ * error events specified by the kevent bit field.
+ */
+static void mceusb_defer_kevent(struct mceusb_dev *ir, int kevent)
+{
+   set_bit(kevent, >kevent_flags);
+   if (!schedule_work(>kevent)) {
+   dev_err(ir->dev, "kevent %d may have been dropped", kevent);
+   } else {
+   dev_dbg(ir->dev, "kevent %d scheduled", kevent);
+   }
+}
+
 static void mceusb_dev_recv(struct urb *urb)
 {
struct mceusb_dev *ir;
@@ -1052,6 +1077,11 @@ static void mceusb_dev_recv(struct urb *urb)
return;
 
case -EPIPE:
+   dev_err(ir->dev, "Error: urb status = %d (RX HALT)",
+   urb->status);
+   mceusb_defer_kevent(ir, EVENT_RX_HALT);
+   return;
+
default:
dev_err(ir->dev, "Error: urb status = %d", urb->status);
break;
@@ -1170,6 +1200,37 @@ static void mceusb_flash_led(struct mceusb_dev *ir)
mce_async_out(ir, FLASH_LED, sizeof(FLASH_LED));
 }
 
+/*
+ * Workqueue function
+ * for resetting or recovering device after occurrence of error events
+ * specified in ir->kevent bit field.
+ * Function runs (via schedule_work()) in non-interrupt context, for
+ * calls here (such as usb_clear_halt()) requiring non-interrupt context.
+ */
+static void mceusb_deferred_kevent(struct work_struct *work)
+{
+   struct mceusb_dev *ir =
+   container_of(work, struct mceusb_dev, kevent);
+   int status;
+
+   if (test_bit(EVENT_RX_HALT, >kevent_flags)) {
+   usb_unlink_urb(ir->urb_in);
+   status = usb_clear_halt(ir->usbdev, ir->pipe_in);
+   if (status < 0) {
+   dev_err(ir->dev, "rx clear halt error %d",
+   status);
+   return;
+   }
+   clear_bit(EVENT_RX_HALT, >kevent_flags);
+   status = usb_submit_urb(ir->urb_in, GFP_KERNEL);
+   if (status < 0) {
+   dev_err(ir->dev, "rx unhalt submit urb error %d",
+   status);
+   return;
+   }
+   }
+}
+
 static struct rc_dev