Hi,

after some debugging I have found bug(s) in handling of QHs related to EHCI low speed split interrupt transfers.

Attached patch seems to solve problem described below (non-working USB keyboard attached to the same port where was USB disk previously).

Please try it - maybe it solves also reported keyboard problem on fuloong/loongson.

BR,
Ales

I forgot one more thing - if You want to try repeat USBMS/USB keyboard
problem described below, You need to access the connected USB disk (e.g.
by "ls" command) before removing it.

Hi Vladimir,

I have some additional information - results of some today's experiments.
The main important thing related to USB keyboard:

USB keyboard doesn't work if it is connected as first device to the same
non-root hub port, where was connected USBMS device previously!
If the keyboard is disconnected and connected again to the same port, it
works.
this behavior is systematical, it can be repeated at any time (at least
on my PC).

What is very interesting, USB keyboard is listed by command "usb".
And it doesn't matter if previously connected USBMS device was HIGH of
FULL device.

I.e., according these test results, even if the USB keybord is not
working, it is connected to port and addressed properly and its control
pipe is working, but device itself is not working - maybe there is
something wrong in "high level" driver (usb_keyboard) - ?


There is also another bad thing - after some number of
connecting/disconnecting of the keyboard (at least about 20 and more)
the GRUB crashes - more properly, it reboots PC, i.e. there is probably
also some memory leak...

BR,
Ales


1.
Some of my USBMS devices have problems to work properly. It seems to be
some regression because they worked well on some older revisions... :-(
I did not make any investigation it this direction yet, as this problem
is probably not related to latest changes (fix of root ports) - so I
ignore them for now.

2.
Sometimes some devices are not recognized (not working) in the case when
they are connected before USB "starting" time (before the moment when
USB modules/drivers are loaded) or when hub with this device is
connected.
Additionally, it seems to happen only if device is connected via hub(s),
not directly into root port - at least it was behavior during my tests
(but I did only few tests on root ports, I focused my tests to USB
keyboard connected via hub(s) etc.).
It looks like, in some rare cases, usbhub.c maybe miss some non-root hub
port change(s). Unfortunately I had no enough time to try debug these
situations.

So I went through Your changes in usbhub.c and other USB related files.
Unfortunately, I did not found reason of the problem mentioned above in
point 2. yet.
But I found some another points to discuss:

a)
I found my old mistake related to variable "pending_reset".
Meaning of this variable is to avoid concurrent reset on devices which
are connected to the same controller HW instance.
This variable is stored in wrong place - it should be located not in
"struct grub_usb_controller_dev" but in "struct grub_*hci" (i.e.
grub_uhci, grub_ohci etc.).
In fact, its current location is not totally bad - it is also working,
but it can slow down handling of USB devices (mainly in USB "starting"
phase) in case when there are more controllers of the same type.


b)
There is missing waiting for device stable power in case when device is
connected to ROOT hub later than in USB "starting" time.
This could possibly lead to wrong device reset and malfunction.

I.e. the "first half" of "grub_usb_poll_devices" should be little bit
changed, it is not correct to call "attach_root_port" immediately when
"detect_dev" detected device connection - it should be done e.g. in
similar way as in "grub_usb_controller_dev_register" (or maybe better in
some another, "background" way, like You did for non-root hub - to
prevent unwanted delay in execution of another GRUB parts).


c)
I thought about this old code:

"...
poll_nonroot_hub (grub_usb_device_t dev)
{
   grub_usb_err_t err;
   unsigned i;
   grub_uint8_t changed;
   grub_size_t actual, len;

   if (!dev->hub_transfer)
     return;
..."

I think, as the possible "error recovery", the better than current
immediate return could be to try to call "grub_usb_bulk_read_background"
to schedule new background transfer for this hub before return - ?


d)
Cosmetic thing:
It will be fine to rename declaration
static struct grub_usb_hub *hubs;
to
static struct grub_usb_hub *root_hubs;
to be more self explanative... :-)


What do You think about the points a)-d) ?

BR,
Ales

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel

--- ./grub/grub-core/bus/usb/ehci.c	2013-05-12 23:15:17.857567000 +0200
+++ ./grub2/grub-core/bus/usb/ehci.c	2013-07-18 17:07:04.378692409 +0200
@@ -670,7 +670,7 @@ grub_ehci_pci_iter (grub_pci_device_t de
     grub_cpu_to_le32 (GRUB_EHCI_TERMINATE);
   e->tdfree_virt = e->td_virt;
   /* Set Terminate in first QH, which is used in framelist */
-  e->qh_virt[0].qh_hptr = grub_cpu_to_le32 (GRUB_EHCI_TERMINATE);
+  e->qh_virt[0].qh_hptr = grub_cpu_to_le32 (GRUB_EHCI_TERMINATE | GRUB_EHCI_HPTR_TYPE_QH);
   e->qh_virt[0].td_overlay.next_td = grub_cpu_to_le32 (GRUB_EHCI_TERMINATE);
   e->qh_virt[0].td_overlay.alt_next_td =
     grub_cpu_to_le32 (GRUB_EHCI_TERMINATE);
@@ -977,6 +977,10 @@ grub_ehci_find_qh (struct grub_ehci *e,
   int i;
   grub_ehci_qh_t qh = e->qh_virt;
   grub_ehci_qh_t head;
+  grub_uint32_t qh_phys;
+  grub_uint32_t qh_terminate =
+    GRUB_EHCI_TERMINATE | GRUB_EHCI_HPTR_TYPE_QH;
+  grub_ehci_qh_t qh_iter;
 
   /* Prepare part of EP Characteristic to find existing QH */
   target = ((transfer->endpoint << GRUB_EHCI_EP_NUM_OFF) |
@@ -984,21 +988,58 @@ grub_ehci_find_qh (struct grub_ehci *e,
   target = grub_cpu_to_le32 (target);
   mask = grub_cpu_to_le32 (GRUB_EHCI_TARGET_MASK);
 
-  /* First try to find existing QH with proper target */
-  for (i = 2; i < GRUB_EHCI_N_QH; i++)	/* We ignore zero and first QH */
+  /* low speed interrupt transfers are linked to the periodic */
+  /* schedule, everything else to the asynchronous schedule */
+  if (transfer->dev->speed == GRUB_USB_SPEED_LOW
+      && transfer->type != GRUB_USB_TRANSACTION_TYPE_CONTROL)
+    head = &qh[0];
+  else
+    head = &qh[1];
+
+  /* First try to find existing QH with proper target in proper list */
+  qh_phys = grub_le_to_cpu32( head->qh_hptr );
+  if (qh_phys != qh_terminate)
+    qh_iter = grub_dma_phys2virt ( qh_phys & GRUB_EHCI_QHTDPTR_MASK,
+      e->qh_chunk );
+  else
+    qh_iter = NULL;
+
+  for (
+    i = 0;
+    (qh_phys != qh_terminate) && (qh_iter != NULL) &&
+    (qh_iter != head) && (i < GRUB_EHCI_N_QH);
+    i++ )
     {
-      if (!qh[i].ep_char)
-	break;			/* Found first not-allocated QH, finish */
-      if (target == (qh[i].ep_char & mask))
+      if (target == (qh_iter->ep_char & mask))
 	{		
 	  /* Found proper existing (and linked) QH, do setup of QH */
-	  grub_dprintf ("ehci", "find_qh: found, i=%d, QH=%p\n",
-			i, &qh[i]);
-	  grub_ehci_setup_qh (&qh[i], transfer);
-	  return &qh[i];
+	  grub_dprintf ("ehci", "find_qh: found, QH=%p\n", qh_iter);
+	  grub_ehci_setup_qh (qh_iter, transfer);
+	  return qh_iter;
 	}
+
+      qh_phys = grub_le_to_cpu32( qh_iter->qh_hptr );
+      if (qh_phys != qh_terminate)
+        qh_iter = grub_dma_phys2virt ( qh_phys & GRUB_EHCI_QHTDPTR_MASK,
+	  e->qh_chunk );
+      else
+        qh_iter = NULL;
+    }
+
+  /* variable "i" should be never equal to GRUB_EHCI_N_QH here */
+  if (i >= GRUB_EHCI_N_QH)
+    { /* Something very bad happened in QH list(s) ! */
+      grub_dprintf ("ehci", "find_qh: Mismatch in QH list! head=%p\n",
+        head);
     }
-  /* QH with target_addr does not exist, we have to add it */
+
+  /* QH with target_addr does not exist, we have to find and add it */
+  for (i = 2; i < GRUB_EHCI_N_QH; i++) /* We ignore zero and first QH */
+    {
+      if (!qh[i].ep_char)
+	break;	             /* Found first not-allocated QH, finish */
+    }
+
   /* Have we any free QH in array ? */
   if (i >= GRUB_EHCI_N_QH)	/* No. */
     {
@@ -1013,14 +1054,6 @@ grub_ehci_find_qh (struct grub_ehci *e,
   /* We should preset new QH and link it into AL */
   grub_ehci_setup_qh (&qh[i], transfer);
 
-  /* low speed interrupt transfers are linked to the periodic
-   * scheudle, everything else to the asynchronous schedule */
-  if (transfer->dev->speed == GRUB_USB_SPEED_LOW
-      && transfer->type != GRUB_USB_TRANSACTION_TYPE_CONTROL)
-    head = &qh[0];
-  else
-    head = &qh[1];
-
   /* Linking - this new (last) QH will copy the QH from the head QH */
   qh[i].qh_hptr = head->qh_hptr;
   /* Linking - the head QH will point to this new QH */
@@ -1538,17 +1571,20 @@ grub_ehci_cancel_transfer (grub_usb_cont
   int i;
   grub_uint64_t maxtime;
   grub_uint32_t qh_phys;
+  grub_uint32_t interrupt =
+    cdata->qh_virt->ep_cap & GRUB_EHCI_SMASK_MASK;
 
   /* QH can be active and should be de-activated and halted */
 
   grub_dprintf ("ehci", "cancel_transfer: begin\n");
 
-  /* First check if EHCI is running and AL is enabled and if not,
-   * there is no problem... */
-  if (((grub_ehci_oper_read32 (e, GRUB_EHCI_STATUS)
-	& GRUB_EHCI_ST_HC_HALTED) != 0) ||
-      ((grub_ehci_oper_read32 (e, GRUB_EHCI_STATUS)
-	& (GRUB_EHCI_ST_AS_STATUS | GRUB_EHCI_ST_PS_STATUS)) == 0))
+  /* First check if EHCI is running - if not, there is no problem */
+  /* to cancel any transfer. Or, if transfer is asynchronous, check */
+  /* if AL is enabled - if not, transfer can be canceled also. */
+  if (((grub_ehci_oper_read32 (e, GRUB_EHCI_STATUS) &
+      GRUB_EHCI_ST_HC_HALTED) != 0) ||
+    (!interrupt && ((grub_ehci_oper_read32 (e, GRUB_EHCI_STATUS) &
+      (GRUB_EHCI_ST_AS_STATUS | GRUB_EHCI_ST_PS_STATUS)) == 0)))
     {
       grub_ehci_pre_finish_transfer (transfer);
       grub_ehci_free_tds (e, cdata->td_first_virt, transfer, &actual);
@@ -1558,13 +1594,14 @@ grub_ehci_cancel_transfer (grub_usb_cont
       return GRUB_USB_ERR_NONE;
     }
 
-  /* EHCI and AL are running. What to do?
-   * Try to Halt QH via de-scheduling QH. */
+  /* EHCI and (AL or SL) are running. What to do? */
+  /* Try to Halt QH via de-scheduling QH. */
   /* Find index of previous QH */
   qh_phys = grub_dma_virt2phys(cdata->qh_virt, e->qh_chunk);
   for (i = 0; i < GRUB_EHCI_N_QH; i++)
     {
-      if ((e->qh_virt[i].qh_hptr & GRUB_EHCI_QHTDPTR_MASK) == qh_phys)
+      if ((grub_le_to_cpu32(e->qh_virt[i].qh_hptr)
+        & GRUB_EHCI_QHTDPTR_MASK) == qh_phys)
         break;
     }
   if (i == GRUB_EHCI_N_QH)
@@ -1618,24 +1655,12 @@ grub_ehci_cancel_transfer (grub_usb_cont
   grub_ehci_free_tds (e, cdata->td_first_virt, transfer, &actual);
   grub_ehci_free_td (e, cdata->td_alt_virt);
 
-  /* FIXME Putting the QH back on the list should work, but for some
-   * strange reason doing that will affect other QHs on the periodic
-   * list.  So free the QH instead of putting it back on the list
-   * which does seem to work, but I would like to know why. */
-
-#if 0
-  /* Finaly we should return QH back to the AL... */
-  e->qh_virt[i].qh_hptr =
-    grub_cpu_to_le32 (grub_dma_virt2phys
-		      (cdata->qh_virt, e->qh_chunk));
-#else
-  /* Free the QH */
+  /* "Free" the QH - link it to itself */
   cdata->qh_virt->ep_char = 0;
   cdata->qh_virt->qh_hptr =
     grub_cpu_to_le32 ((grub_dma_virt2phys (cdata->qh_virt,
                                            e->qh_chunk)
                        & GRUB_EHCI_POINTER_MASK) | GRUB_EHCI_HPTR_TYPE_QH);
-#endif
 
   grub_free (cdata);
 
_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel

Reply via email to