The driver enforces a strict one-to-one relationship between the
received RESPONSE_AVAILABLE notifications and messages read from
the device. At the same time, it will cancel the interrupt URB
when there is no client holding the character device open.

Many devices do not cope well with this behaviour.  They maintain
a FIFO queue of messages, and send notifications on a best effort
basis.  Messages are queued regardless of whether the notification
is successful or not. So if the driver loses a single notification,
which can easily happen when the interrupt URB is cancelled, then
the device and driver become out-of-sync. New messages end up
at the end of the queue, while the associated notification makes
the driver read only the first message from the queue.

This state is permanent from a user's point of view. There is no
way to flush the device queue without resetting the device or
using another driver.

The problem is easy to hit with current QMI and MBIM command line
tools, which typically close the character device after seeing
the reply they expect. Any pending unsolicited messages from the
device will then trigger the driver bug.

Fix by always reading all queued messages from the device when
the notification URB is first submitted.  This is expected to
end with an -EPIPE status when there are no more pending
messages, so demote the printk associated with -EPIPE to debug
level.

Cc: <sta...@vger.kernel.org>
Signed-off-by: Bjørn Mork <bj...@mork.no>
---

Hello,

this fixes a long-standing problem which has become increasingly
annoying with each new generation of MBIM and QMI modems. They
simply do not expect the strict notification scheme we try to
enforce.

For a while I've tried to convince users to force the /dev/cdc-wdmX
device open, using tricks like "cat >/dev/cdc-wdm0". But semantics
like that are of course not acceptable.  Opening and closing the
char device should never cause the permanent host/device communication
failure it currently does. To make this worse, the only reset tool
available to users is typically unplugging and replugging. And
even that can be difficult if it is a laptop internal modem.
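
To illustrate the failure mode, here is a minimal userspace sketch.
It is not driver code: struct device, struct driver, device_send(),
driver_read(), driver_flush() and run_scenario() are all hypothetical
names made up for this model, and the "flush" simply empties the FIFO
where the real device would be read until it returns -EPIPE.

```c
#include <assert.h>

#define QLEN 8

/* Hypothetical device model: a FIFO of message ids. */
struct device {
	int queue[QLEN];
	int head, tail;
};

/* Hypothetical driver model, loosely following the real field names. */
struct driver {
	int listening;   /* interrupt URB submitted, i.e. chardev is open */
	int resp_count;  /* notifications received but not yet read */
};

/* The device always queues; the notification is only best effort. */
static void device_send(struct device *dev, struct driver *drv, int msg)
{
	assert(dev->tail < QLEN);
	dev->queue[dev->tail++] = msg;
	if (drv->listening)
		drv->resp_count++;   /* otherwise the notification is lost */
}

/* One notification permits exactly one read from the head of the FIFO. */
static int driver_read(struct device *dev, struct driver *drv)
{
	if (!drv->resp_count || dev->head == dev->tail)
		return -1;
	drv->resp_count--;
	return dev->queue[dev->head++];
}

/* Model of the fix: drain whatever is queued when the device is opened. */
static void driver_flush(struct device *dev)
{
	dev->head = dev->tail;
}

/* Returns the message the client reads after closing and reopening. */
int run_scenario(int flush_on_open)
{
	struct device dev = { .head = 0, .tail = 0 };
	struct driver drv = { .listening = 1, .resp_count = 0 };

	device_send(&dev, &drv, 1);        /* reply to the first request */
	(void)driver_read(&dev, &drv);     /* client reads it ...        */
	drv.listening = 0;                 /* ... and closes the device  */

	device_send(&dev, &drv, 2);        /* unsolicited, notification lost */

	drv.listening = 1;                 /* client reopens the device  */
	if (flush_on_open)
		driver_flush(&dev);

	device_send(&dev, &drv, 3);        /* reply to the second request */
	return driver_read(&dev, &drv);
}
```

In this model run_scenario(0) returns 2, i.e. the client gets the
stale unsolicited message instead of its reply, and every later read
stays one message behind.  run_scenario(1) returns 3.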

Please backport this to stable as well.  The problem hits common
OpenWrt tools like umbim really hard.


Bjørn


 drivers/usb/class/cdc-wdm.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 61ea87917433..f70972be2ee9 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -178,7 +178,7 @@ static void wdm_in_callback(struct urb *urb)
                                "nonzero urb status received: -ESHUTDOWN");
                        goto skip_error;
                case -EPIPE:
-                       dev_err(&desc->intf->dev,
+                       dev_dbg(&desc->intf->dev,
                                "nonzero urb status received: -EPIPE\n");
                        break;
                default:
@@ -200,6 +200,19 @@ static void wdm_in_callback(struct urb *urb)
                        desc->reslength = length;
                }
        }
+
+       /*
+        * If desc->resp_count is unset, then the urb was submitted
+        * without a prior notification.  If the device returned any
+        * data, then this implies that it had messages queued without
+        * notifying us.  Continue reading until that queue is flushed.
+        */
+       if (!desc->resp_count && length) {
+               dev_dbg(&desc->intf->dev, "got %d bytes without notification\n", length);
+               set_bit(WDM_RESPONDING, &desc->flags);
+               usb_submit_urb(desc->response, GFP_ATOMIC);
+       }
+
 skip_error:
        wake_up(&desc->wait);
 
@@ -647,6 +660,16 @@ static int wdm_open(struct inode *inode, struct file *file)
                        dev_err(&desc->intf->dev,
                                "Error submitting int urb - %d\n", rv);
                        rv = usb_translate_errors(rv);
+               } else {
+                       /*
+                        * Some devices keep pending messages queued
+                        * without resending notifications.  We must
+                        * flush the message queue before we can
+                        * assume a one-to-one relationship between
+                        * notifications and messages in the queue
+                        */
+                       set_bit(WDM_RESPONDING, &desc->flags);
+                       rv = usb_submit_urb(desc->response, GFP_KERNEL);
                }
        } else {
                rv = 0;
-- 
2.1.4
