Re: aarch64 panic() at do_el1h_sync+0x21c
On Wed, Feb 03, 2021 at 11:22:53PM +0100, Mark Kettenis wrote:
> The actual panic message would be helpful...

Doh. If it ain't there, then I fudged the copy-paste.

I can try to reproduce it...

--
Best Regards
Edd Barrett

http://www.theunixzoo.co.uk
Re: aarch64 panic() at do_el1h_sync+0x21c
> Date: Wed, 3 Feb 2021 21:59:11 + > From: Edd Barrett > > Hi, > > Yesterday I kicked off a dpb(1) run on a Raspberry Pi 4. When I came > back later, the following crash was on the serial console. > > I don't know what triggers it, I'm afraid. > > I can try things, if anyone has ideas. The actual panic message would be helpful... > ``` > Stopped at panic+0x158:mov w0, w20 > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > *520912 45342 550x10 00K sh > 271552 57609 55 0x2 01 perl > 246287 17436 55 0x2 03 cc > 297670 78204 55 0x2 02 c++ > db_enter() at panic+0x154 > panic() at do_el1h_sync+0x21c > do_el0_sync() at handle_el1h_sync+0x6c > handle_el1h_sync() at pmap_copy_page+0x98 > pmap_copy_page() at pmap_copy_page+0x98 > pmap_copy_page() at uvm_fault_upper+0x144 > uvm_fault_upper() at uvm_fault+0x100 > https://www.openbsd.org/ddb.html describes the minimum info required in bug > reports. Insufficient info makes it difficult to find and fix bugs. > ddb{0}> trace > db_enter() at panic+0x154 > panic() at do_el1h_sync+0x21c > do_el0_sync() at handle_el1h_sync+0x6c > handle_el1h_sync() at pmap_copy_page+0x98 > pmap_copy_page() at pmap_copy_page+0x98 > pmap_copy_page() at uvm_fault_upper+0x144 > uvm_fault_upper() at uvm_fault+0x100 > uvm_fault() at udata_abort+0x12c > udata_abort() at do_el0_sync+0x13c > do_el0_sync() at handle_el0_sync+0x74 > handle_el0_sync() at 0x1a5de72398 > --- trap --- > ddb{0}> machine ddbcpu 1 > Stopped at ampintc_ipi_ddb+0x1c: ldr x15, [sp,#16] > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > ddb{1}> trace > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > 
svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > do_el0_sync() at handle_el0_sync+0x74 > handle_el0_sync() at 0xbe346549c > --- trap --- > ddb{1}> machine ddbcpu 2 > Stopped at ampintc_ipi_ddb+0x1c: ldr x15, [sp,#16] > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > ddb{2}> trace > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > do_el0_sync() at handle_el0_sync+0x74 > handle_el0_sync() at 0x49c136c84 > --- trap --- > ddb{2}> machine ddbcpu 3 > Stopped at ampintc_ipi_ddb+0x1c: ldr x15, [sp,#16] > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > ddb{3}> trace > db_enter() at ampintc_ipi_ddb+0x18 > ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 > ampintc_irq_handler() at arm_cpu_intr+0x30 > arm_cpu_intr() at handle_el1h_irq+0x6c > handle_el1h_irq() at svc_handler+0x1e8 > svc_handler() at svc_handler+0x1e8 > svc_handler() at do_el0_sync+0xf4 > do_el0_sync() at handle_el0_sync+0x74 > handle_el0_sync() at 0x42749a2f4 > --- trap --- > ddb{0}> ps >PID TID PPIDUID S FLAGS WAIT COMMAND > *45342 520912 37571 55 70x10sh > 57609 271552 21291 55 7 0x2perl > 17436 246287 35320 55 7 0x2cc > 35320 135299 39411 55 30x100088 sigsusp sh > 78204 297670 17054 55 7 0x2c++ > 17054 63352 35973 55 30x82 wait perl > 35973 95003 91433 55 30x10008a sigsusp sh > 10647 
494721 94245 0 30x91 nanoslp perl > 91433 191032 8978 55 30x82 wait gmake > 8978 33444 94837 55 30x100088 sigsusp sh > 37571 49113 82738 55 30x10008a sigsusp sh > 82738 287706 6798 55 30x10008a sigsusp sh > 6798 273621 90412 55 30x10008a sigsusp make > 90412 356203 30174 55 30x10008a sigsusp make > 30174 496473 88029 55 30x10008a sigsusp sh > 88029 445633 29311 55 30x10008a sigsusp make > 29311
aarch64 panic() at do_el1h_sync+0x21c
Hi, Yesterday I kicked off a dpb(1) run on a Raspberry Pi 4. When I came back later, the following crash was on the serial console. I don't know what triggers it, I'm afraid. I can try things, if anyone has ideas. ``` Stopped at panic+0x158:mov w0, w20 TIDPIDUID PRFLAGS PFLAGS CPU COMMAND *520912 45342 550x10 00K sh 271552 57609 55 0x2 01 perl 246287 17436 55 0x2 03 cc 297670 78204 55 0x2 02 c++ db_enter() at panic+0x154 panic() at do_el1h_sync+0x21c do_el0_sync() at handle_el1h_sync+0x6c handle_el1h_sync() at pmap_copy_page+0x98 pmap_copy_page() at pmap_copy_page+0x98 pmap_copy_page() at uvm_fault_upper+0x144 uvm_fault_upper() at uvm_fault+0x100 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{0}> trace db_enter() at panic+0x154 panic() at do_el1h_sync+0x21c do_el0_sync() at handle_el1h_sync+0x6c handle_el1h_sync() at pmap_copy_page+0x98 pmap_copy_page() at pmap_copy_page+0x98 pmap_copy_page() at uvm_fault_upper+0x144 uvm_fault_upper() at uvm_fault+0x100 uvm_fault() at udata_abort+0x12c udata_abort() at do_el0_sync+0x13c do_el0_sync() at handle_el0_sync+0x74 handle_el0_sync() at 0x1a5de72398 --- trap --- ddb{0}> machine ddbcpu 1 Stopped at ampintc_ipi_ddb+0x1c: ldr x15, [sp,#16] db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 ddb{1}> trace db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 do_el0_sync() at handle_el0_sync+0x74 handle_el0_sync() at 0xbe346549c --- trap --- ddb{1}> machine ddbcpu 2 Stopped at ampintc_ipi_ddb+0x1c: 
ldr x15, [sp,#16] db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 ddb{2}> trace db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 do_el0_sync() at handle_el0_sync+0x74 handle_el0_sync() at 0x49c136c84 --- trap --- ddb{2}> machine ddbcpu 3 Stopped at ampintc_ipi_ddb+0x1c: ldr x15, [sp,#16] db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 ddb{3}> trace db_enter() at ampintc_ipi_ddb+0x18 ampintc_ipi_ddb() at ampintc_irq_handler+0x1c4 ampintc_irq_handler() at arm_cpu_intr+0x30 arm_cpu_intr() at handle_el1h_irq+0x6c handle_el1h_irq() at svc_handler+0x1e8 svc_handler() at svc_handler+0x1e8 svc_handler() at do_el0_sync+0xf4 do_el0_sync() at handle_el0_sync+0x74 handle_el0_sync() at 0x42749a2f4 --- trap --- ddb{0}> ps PID TID PPIDUID S FLAGS WAIT COMMAND *45342 520912 37571 55 70x10sh 57609 271552 21291 55 7 0x2perl 17436 246287 35320 55 7 0x2cc 35320 135299 39411 55 30x100088 sigsusp sh 78204 297670 17054 55 7 0x2c++ 17054 63352 35973 55 30x82 wait perl 35973 95003 91433 55 30x10008a sigsusp sh 10647 494721 94245 0 30x91 nanoslp perl 91433 191032 8978 55 30x82 wait gmake 8978 33444 94837 55 30x100088 sigsusp sh 37571 49113 82738 55 30x10008a sigsusp sh 82738 287706 6798 55 30x10008a sigsusp sh 6798 273621 90412 55 30x10008a sigsusp make 90412 356203 30174 55 30x10008a sigsusp make 30174 496473 88029 55 30x10008a 
sigsusp sh 88029 445633 29311 55 30x10008a sigsusp make 29311 152572 18553 55 30x10008a sigsusp sh 18553 202605 94245 55 30x10008a sigsusp make 94837 40263 98003 55 30x10008a sigsusp sh 98003 309681 6187 55 30x82 wait gmake 6187 390103 12422 55 30x100088 sigsusp sh 12422 86000 52828 55 3
Re: your mail
On Mon, Feb 01, 2021 at 06:56:41PM +0700, Neet Kucing wrote:
> I was trying to install 6.8, but shortly after booting the font became too
> small and I can't read anything. Is there any way to handle this? I use an
> HP Probook 242 G1. The installation medium was created just about
> 20 minutes ago, and the host was MX Linux live.

One option would be to use a magnifying glass. Crude but effective.
I'd go for this option. If that won't work, see if someone can help you out.

Another option would be to write a file for autoinstall, then let it run
the installation, assuming you can make that work with your setup.

Depending on your situation, you may want to just install everything to
one partition, without using up the rest of the disk. After the install,
create new partitions in the unused space with disklabel, run newfs and
fsck -fp on them, mount them, and edit /etc/fstab.

Security-wise, I would at least put /var and /usr/local on their own
partitions. /home would be really good, too.

Note: you can make partitions bigger with growfs, but you can't make them
smaller. Making one smaller means backing up the data, shrinking the
partition, running newfs and fsck -fp, and restoring the backup into it.

Good luck,
Chris Bennett
Re: 6.8-current ifconfig umb0 and Device not configured
A question below..

On Wed, Feb 03, 2021 at 11:22:16AM +0000, Mikolaj Kucharski wrote:
> On Wed, Feb 03, 2021 at 11:10:45AM +0000, Edd Barrett wrote:
> > Hi,
> >
> > CCing ratchov@ and kettenis@ with some context.
> >
> > In short: my change broke ugen, which expects to scan up the interface
> > range until an interface doesn't exist.
> >
> > On Wed, Feb 03, 2021 at 06:25:48AM +0100, Marcus Glocker wrote:
> > >
> > > Index: dev/usb/usbdi.c
> > > ===
> > > RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
> > > retrieving revision 1.109
> > > diff -u -p -u -p -r1.109 usbdi.c
> > > --- dev/usb/usbdi.c	1 Feb 2021 09:21:51 -0000	1.109
> > > +++ dev/usb/usbdi.c	2 Feb 2021 06:07:41 -0000
> > > @@ -642,6 +642,10 @@ usbd_device2interface_handle(struct usbd
> > >
> > >  	if (dev->cdesc == NULL)
> > >  		return (USBD_NOT_CONFIGURED);
> > > +	if (ifaceno < dev->cdesc->bNumInterfaces) {
> > > +		*iface = &dev->ifaces[ifaceno];
> > > +		return (USBD_NORMAL_COMPLETION);
> > > +	}
> > >  	/*
> > >  	 * The correct interface should be at dev->ifaces[ifaceno], but we've
> > >  	 * seen non-compliant devices in the wild which present non-contiguous
> > >
> > > So OK if I commit this fix Edd, Stuart?
> >
> > I'm OK with it as a quick-fix. At least it will make both of the devices
> > in question work.
> >
> > But in the long run, it's not hard to imagine other non-compliant
> > devices which would still be defeated by this code.
> >
> > Suppose a device presents its contiguous interfaces in reverse order, e.g.:
> > [2, 1, 0]. Now suppose a device driver asks for interface 2. We will
> > find interface 0, as we never check if it's the right interface and we
> > never reach the part of the code that scans the array.
> >
> > In other words, just because an index exists, doesn't mean it's the right
> > interface.
> >
> > I think (and I'm not much of a kernel hacker, so I reserve the right to be
> > wrong) the correct solution is to:
> >
> > * always loop over the array looking for the right interface.
> > * change ugen, so that its scanning is resilient to gaps in the interface
> >   range.
> >

I would probably ask: what is the meaning of ifaceno? Is that variable an
index into the array, or is it bInterfaceNumber?

From my understanding, with -r1.110 of usbdi.c it's both. From this email
thread, on various devices the array index and bInterfaceNumber don't
always have the same value.

What do users of the usbd_device2interface_handle() function assume for the
ifaceno argument: an array index or a bInterfaceNumber?

> Not sure whether your proposition above is enough. Here is part of dmesg
> with some debugging statements for 2 devices which trigger
> usbd_device2interface_handle()
>
> See MMM markers.
>
> ...
> ulpt0 at uhub0 port 3 configuration 1 interface 1 "Samsung Electronics Co., Ltd. M2070 Series" rev 2.00/1.00 addr 2
> ulpt0: using bi-directional mode
> ugen0 at uhub0 port 3 configuration 1 "Samsung Electronics Co., Ltd. M2070 Series" rev 2.00/1.00 addr 2
> MMM: USBD_NORMAL_COMPLETION v1 ifaceno=0 bNumInterfaces=2 [usbd_device2interface_handle()|usbdi.c|649]
> uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices Hub" rev 2.00/0.18 addr 2
> umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.00/0.06 addr 3
> ugen1 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.00/0.06 addr 3
> MMM: USBD_NORMAL_COMPLETION v1 ifaceno=0 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
> MMM: USBD_NORMAL_COMPLETION v1 ifaceno=1 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
> MMM: USBD_NORMAL_COMPLETION v1 ifaceno=2 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> ...
>
> For ugen1 (which belongs to the physical device that also attaches a
> umb(4)) we can see that ifaceno has the values 0, 1, 2, but from lsusb we
> see that:
>
> (from Marcus Glocker's earlier email)
>
> Index   Interface Number
> -----   ----------------
> 0       0
> 1       2
> 2       3
> 3       12
> 4       13
>
> For ifaceno 1 and 2, it will not be equal to bInterfaceNumber if we iterate
> the for (idx = 0; idx < dev->cdesc->bNumInterfaces; idx++) loop.
>
> I don't have a good picture of how all the functions work with each
> other, but it does feel like substantial work is needed here, to piece
> everything together.
>

--
Regards,
 Mikolaj
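To make the question about ifaceno concrete, here is a minimal sketch of the two possible readings, applied to the index/interface-number table quoted above. It is a toy model only: `struct toy_iface`, `listed_ifaces`, `number_at_index()` and `index_of_number()` are made-up names for illustration and are not taken from usbdi.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an interface entry; not the kernel's structure. */
struct toy_iface {
	uint8_t bInterfaceNumber;
};

/* Interface numbers in array order, as listed in the table above. */
static const struct toy_iface listed_ifaces[] = {
	{ 0 }, { 2 }, { 3 }, { 12 }, { 13 },
};
#define LISTED_NIFACES (sizeof(listed_ifaces) / sizeof(listed_ifaces[0]))

/*
 * Reading 1: ifaceno is an array index.
 * Returns the bInterfaceNumber stored at that index, or -1 if out of range.
 */
static int
number_at_index(const struct toy_iface *ifaces, size_t n, uint8_t ifaceno)
{
	assert(ifaces != NULL);
	if (ifaceno >= n)
		return -1;
	return ifaces[ifaceno].bInterfaceNumber;
}

/*
 * Reading 2: ifaceno is a bInterfaceNumber.
 * Returns the array index holding that number, or -1 if none matches.
 */
static int
index_of_number(const struct toy_iface *ifaces, size_t n, uint8_t ifaceno)
{
	size_t idx;

	assert(ifaces != NULL);
	for (idx = 0; idx < n; idx++) {
		if (ifaces[idx].bInterfaceNumber == ifaceno)
			return (int)idx;
	}
	return -1;
}
```

With this table the two readings agree only for ifaceno 0. Under reading 1, ifaceno 1 and 2 land on interface numbers 2 and 3. Under reading 2, ifaceno 1 matches nothing, while ifaceno 12 is found at index 3.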
Re: 6.8-current ifconfig umb0 and Device not configured
> Date: Wed, 3 Feb 2021 12:51:02 +0100 > From: Marcus Glocker > > On Wed, 3 Feb 2021 11:41:17 + > Edd Barrett wrote: > > > On Wed, Feb 03, 2021 at 11:17:01AM +, Stuart Henderson wrote: > > > btw the problem was seen with umb, it's not just ugen. > > > > From mglocker@'s explanation, I understood that it is the ugen driver > > trying to attach to some part of the device that in turn hoses umb for > > the same device? > > > > Maybe I misunderstood. > > That's right. ugen(4) tries to attach to another umb(4) interface, > then fails, and labels the whole umb(4) device dying. I don't know if > all umb(4) devices show this behaviour (I don't have such a device). > But the other thing I was thinking about is whether we should make > umb(4) attach to this other interface as well to prevent ugen(4) to > take control. Probably not. The umb0 in my x1 for example shows up as: umb0 at uhub0 port 4 configuration 1 interface 0 "Sierra Wireless Inc. Sierra Wireless EM7345 4G LTE" rev 2.00/17.29 addr 2 umodem0 at uhub0 port 4 configuration 1 interface 2 "Sierra Wireless Inc. Sierra Wireless EM7345 4G LTE" rev 2.00/17.29 addr 2 umodem0: data interface 3, has no CM over data, has break umodem0: status change notification available ucom0 at umodem0 And that ucom0 is useful to interact with the modem using AT commands. That is how you interact with the GNSS module for example. So probably ugen(4) should be changed such that it doesn't label the whole device as unusable if somehow attaching to a specific interface fails.
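The suggestion above (do not write off the whole device when one interface fails to attach) can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not reflect how ugen(4) is actually structured, and `struct toy_attach_result`, `attach_strict()`, `attach_tolerant()` and `example_attach_ok` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy summary of an attach pass; not a kernel structure. */
struct toy_attach_result {
	size_t attached;	/* interfaces successfully claimed */
	int dying;		/* non-zero if the whole device was given up on */
};

/* Hypothetical per-interface outcome: the middle interface fails. */
static const int example_attach_ok[] = { 1, 0, 1 };

/* Behaviour described in the thread: one failure condemns the device. */
static struct toy_attach_result
attach_strict(const int *attach_ok, size_t n)
{
	struct toy_attach_result r = { 0, 0 };
	size_t i;

	assert(attach_ok != NULL);
	for (i = 0; i < n; i++) {
		if (!attach_ok[i]) {
			r.dying = 1;
			break;
		}
		r.attached++;
	}
	return r;
}

/* Suggested behaviour: skip only the interface that failed. */
static struct toy_attach_result
attach_tolerant(const int *attach_ok, size_t n)
{
	struct toy_attach_result r = { 0, 0 };
	size_t i;

	assert(attach_ok != NULL);
	for (i = 0; i < n; i++) {
		if (!attach_ok[i])
			continue;
		r.attached++;
	}
	return r;
}
```

For `example_attach_ok`, the strict policy attaches one interface and marks the device as dying, while the tolerant policy attaches two and leaves the device usable for other drivers.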
Re: 6.8-current ifconfig umb0 and Device not configured
Hello Gerhard,

On Wed, 3 Feb 2021 13:20:19 +0100
Gerhard Roth wrote:

> On 2/3/21 12:51 PM, Marcus Glocker wrote:
> > On Wed, 3 Feb 2021 11:41:17 +0000
> > Edd Barrett wrote:
> >
> >> On Wed, Feb 03, 2021 at 11:17:01AM +0000, Stuart Henderson wrote:
> >>> btw the problem was seen with umb, it's not just ugen.
> >>
> >> From mglocker@'s explanation, I understood that it is the ugen
> >> driver trying to attach to some part of the device that in turn
> >> hoses umb for the same device?
> >>
> >> Maybe I misunderstood.
> >
> > That's right. ugen(4) tries to attach to another umb(4) interface,
> > then fails, and labels the whole umb(4) device dying. I don't know
> > if all umb(4) devices show this behaviour (I don't have such a
> > device). But the other thing I was thinking about is whether we
> > should make umb(4) attach to this other interface as well to
> > prevent ugen(4) from taking control.
>
> I don't think that's a good idea. Sierra Wireless devices have a
> "USB composite" (USBCOMP) setting where the user can select between
> several modes. And each mode is a different composition of USB devices.
> E.g. mode 8 gives you four different interfaces:
>
> - DM (Device Management)
> - NMEA (GPS Services)
> - AT (AT Commands)
> - MBIM
>
> While mode 9 is MBIM only. So depending upon the USBCOMP selection,
> the devices will offer completely different interfaces.
>
> If umb were to claim all interfaces, the other features would become
> unusable. And esp. AT commands can be very useful and allow you to do
> things that are not possible via MBIM.
>
>
> If the additional interfaces are a problem, umb(4) could switch
> Sierra Wireless modems into MBIM-only mode with QMI-over-MBIM
> commands. But since the USBCOMP setting is persistent, that could
> confuse dual-boot systems.

OK. That's quite a helpful update. I'm not so familiar with the
umb(4) specifics. Then at least this clarifies that the current
behaviour is fine. Since we have now fixed the umb(4) problem with the
last usbdi.c commit, I think we just keep it as is.
Re: 6.8-current ifconfig umb0 and Device not configured
On 2/3/21 12:51 PM, Marcus Glocker wrote:
> On Wed, 3 Feb 2021 11:41:17 +0000
> Edd Barrett wrote:
>
>> On Wed, Feb 03, 2021 at 11:17:01AM +0000, Stuart Henderson wrote:
>>> btw the problem was seen with umb, it's not just ugen.
>>
>> From mglocker@'s explanation, I understood that it is the ugen driver
>> trying to attach to some part of the device that in turn hoses umb for
>> the same device?
>>
>> Maybe I misunderstood.
>
> That's right. ugen(4) tries to attach to another umb(4) interface,
> then fails, and labels the whole umb(4) device dying. I don't know if
> all umb(4) devices show this behaviour (I don't have such a device).
> But the other thing I was thinking about is whether we should make
> umb(4) attach to this other interface as well to prevent ugen(4) from
> taking control.

I don't think that's a good idea. Sierra Wireless devices have a
"USB composite" (USBCOMP) setting where the user can select between
several modes. And each mode is a different composition of USB devices.
E.g. mode 8 gives you four different interfaces:

- DM (Device Management)
- NMEA (GPS Services)
- AT (AT Commands)
- MBIM

While mode 9 is MBIM only. So depending upon the USBCOMP selection,
the devices will offer completely different interfaces.

If umb were to claim all interfaces, the other features would become
unusable. And esp. AT commands can be very useful and allow you to do
things that are not possible via MBIM.

If the additional interfaces are a problem, umb(4) could switch
Sierra Wireless modems into MBIM-only mode with QMI-over-MBIM
commands. But since the USBCOMP setting is persistent, that could
confuse dual-boot systems.

Gerhard

>>> At some point you just have to say "this device is broken crap, send
>>> it back or ebay it and buy an alternative". This is much easier for
>>> some classes of device where there are many alternatives (like
>>> audio interfaces) than mobile broadband where it's still very
>>> difficult to find something suitable with the correct physical
>>> interface.
>>
>> Yes, I'm starting to lean in this direction too.
>>
>> The only other solution would be to have some kind of quirks system,
>> but I don't think that'd be perfect either: I bet some (different)
>> devices share vendor and device IDs...
>
> Well. I think there are a lot of USB devices with all kinds of
> non-compliant USB configurations. That's why I personally think
> spending too much effort here, trying to do the right thing, isn't
> worth it. You will fix something, and then break something else IMO.
> I would rather focus on getting as many devices as possible supported
> without breaking others. Just as we did now :-)
Re: 6.8-current ifconfig umb0 and Device not configured
On Wed, 3 Feb 2021 11:41:17 +0000
Edd Barrett wrote:

> On Wed, Feb 03, 2021 at 11:17:01AM +0000, Stuart Henderson wrote:
> > btw the problem was seen with umb, it's not just ugen.
>
> From mglocker@'s explanation, I understood that it is the ugen driver
> trying to attach to some part of the device that in turn hoses umb for
> the same device?
>
> Maybe I misunderstood.

That's right. ugen(4) tries to attach to another umb(4) interface,
then fails, and labels the whole umb(4) device dying. I don't know if
all umb(4) devices show this behaviour (I don't have such a device).
But the other thing I was thinking about is whether we should make
umb(4) attach to this other interface as well to prevent ugen(4) from
taking control.

> > At some point you just have to say "this device is broken crap, send
> > it back or ebay it and buy an alternative". This is much easier for
> > some classes of device where there are many alternatives (like
> > audio interfaces) than mobile broadband where it's still very
> > difficult to find something suitable with the correct physical
> > interface.
>
> Yes, I'm starting to lean in this direction too.
>
> The only other solution would be to have some kind of quirks system,
> but I don't think that'd be perfect either: I bet some (different)
> devices share vendor and device IDs...

Well. I think there are a lot of USB devices with all kinds of
non-compliant USB configurations. That's why I personally think
spending too much effort here, trying to do the right thing, isn't
worth it. You will fix something, and then break something else IMO.
I would rather focus on getting as many devices as possible supported
without breaking others. Just as we did now :-)
Re: 6.8-current ifconfig umb0 and Device not configured
On Wed, Feb 03, 2021 at 11:17:01AM +0000, Stuart Henderson wrote:
> btw the problem was seen with umb, it's not just ugen.

From mglocker@'s explanation, I understood that it is the ugen driver
trying to attach to some part of the device that in turn hoses umb for
the same device?

Maybe I misunderstood.

> At some point you just have to say "this device is broken crap, send
> it back or ebay it and buy an alternative". This is much easier for some
> classes of device where there are many alternatives (like audio interfaces)
> than mobile broadband where it's still very difficult to find something
> suitable with the correct physical interface.

Yes, I'm starting to lean in this direction too.

The only other solution would be to have some kind of quirks system, but I
don't think that'd be perfect either: I bet some (different) devices share
vendor and device IDs...

--
Best Regards
Edd Barrett

http://www.theunixzoo.co.uk
Re: 6.8-current ifconfig umb0 and Device not configured
On Wed, Feb 03, 2021 at 11:10:45AM +0000, Edd Barrett wrote:
> Hi,
>
> CCing ratchov@ and kettenis@ with some context.
>
> In short: my change broke ugen, which expects to scan up the interface
> range until an interface doesn't exist.
>
> On Wed, Feb 03, 2021 at 06:25:48AM +0100, Marcus Glocker wrote:
> >
> > Index: dev/usb/usbdi.c
> > ===
> > RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
> > retrieving revision 1.109
> > diff -u -p -u -p -r1.109 usbdi.c
> > --- dev/usb/usbdi.c	1 Feb 2021 09:21:51 -0000	1.109
> > +++ dev/usb/usbdi.c	2 Feb 2021 06:07:41 -0000
> > @@ -642,6 +642,10 @@ usbd_device2interface_handle(struct usbd
> >
> >  	if (dev->cdesc == NULL)
> >  		return (USBD_NOT_CONFIGURED);
> > +	if (ifaceno < dev->cdesc->bNumInterfaces) {
> > +		*iface = &dev->ifaces[ifaceno];
> > +		return (USBD_NORMAL_COMPLETION);
> > +	}
> >  	/*
> >  	 * The correct interface should be at dev->ifaces[ifaceno], but we've
> >  	 * seen non-compliant devices in the wild which present non-contiguous
> >
> > So OK if I commit this fix Edd, Stuart?
>
> I'm OK with it as a quick-fix. At least it will make both of the devices
> in question work.
>
> But in the long run, it's not hard to imagine other non-compliant
> devices which would still be defeated by this code.
>
> Suppose a device presents its contiguous interfaces in reverse order, e.g.:
> [2, 1, 0]. Now suppose a device driver asks for interface 2. We will
> find interface 0, as we never check if it's the right interface and we
> never reach the part of the code that scans the array.
>
> In other words, just because an index exists, doesn't mean it's the right
> interface.
>
> I think (and I'm not much of a kernel hacker, so I reserve the right to be
> wrong) the correct solution is to:
>
> * always loop over the array looking for the right interface.
> * change ugen, so that its scanning is resilient to gaps in the interface
>   range.
>

Not sure whether your proposition above is enough. Here is part of dmesg
with some debugging statements for 2 devices which trigger
usbd_device2interface_handle()

See MMM markers.

...
ulpt0 at uhub0 port 3 configuration 1 interface 1 "Samsung Electronics Co., Ltd. M2070 Series" rev 2.00/1.00 addr 2
ulpt0: using bi-directional mode
ugen0 at uhub0 port 3 configuration 1 "Samsung Electronics Co., Ltd. M2070 Series" rev 2.00/1.00 addr 2
MMM: USBD_NORMAL_COMPLETION v1 ifaceno=0 bNumInterfaces=2 [usbd_device2interface_handle()|usbdi.c|649]
uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices Hub" rev 2.00/0.18 addr 2
umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.00/0.06 addr 3
ugen1 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.00/0.06 addr 3
MMM: USBD_NORMAL_COMPLETION v1 ifaceno=0 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
MMM: USBD_NORMAL_COMPLETION v1 ifaceno=1 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
MMM: USBD_NORMAL_COMPLETION v1 ifaceno=2 bNumInterfaces=5 [usbd_device2interface_handle()|usbdi.c|649]
vscsi0 at root
scsibus2 at vscsi0: 256 targets
...

For ugen1 (which belongs to the physical device that also attaches a
umb(4)) we can see that ifaceno has the values 0, 1, 2, but from lsusb we
see that:

(from Marcus Glocker's earlier email)

Index   Interface Number
-----   ----------------
0       0
1       2
2       3
3       12
4       13

For ifaceno 1 and 2, it will not be equal to bInterfaceNumber if we iterate
the for (idx = 0; idx < dev->cdesc->bNumInterfaces; idx++) loop.

I don't have a good picture of how all the functions work with each
other, but it does feel like substantial work is needed here, to piece
everything together.

--
Regards,
 Mikolaj
Re: 6.8-current ifconfig umb0 and Device not configured
On 2021/02/03 11:10, Edd Barrett wrote:
> Hi,
>
> CCing ratchov@ and kettenis@ with some context.
>
> In short: my change broke ugen, which expects to scan up the interface
> range until an interface doesn't exist.

btw the problem was seen with umb, it's not just ugen.

> On Wed, Feb 03, 2021 at 06:25:48AM +0100, Marcus Glocker wrote:
> >
> > Index: dev/usb/usbdi.c
> > ===
> > RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
> > retrieving revision 1.109
> > diff -u -p -u -p -r1.109 usbdi.c
> > --- dev/usb/usbdi.c	1 Feb 2021 09:21:51 -0000	1.109
> > +++ dev/usb/usbdi.c	2 Feb 2021 06:07:41 -0000
> > @@ -642,6 +642,10 @@ usbd_device2interface_handle(struct usbd
> >
> >  	if (dev->cdesc == NULL)
> >  		return (USBD_NOT_CONFIGURED);
> > +	if (ifaceno < dev->cdesc->bNumInterfaces) {
> > +		*iface = &dev->ifaces[ifaceno];
> > +		return (USBD_NORMAL_COMPLETION);
> > +	}
> >  	/*
> >  	 * The correct interface should be at dev->ifaces[ifaceno], but we've
> >  	 * seen non-compliant devices in the wild which present non-contiguous
> >
> > So OK if I commit this fix Edd, Stuart?
>
> I'm OK with it as a quick-fix. At least it will make both of the devices
> in question work.
>
> But in the long run, it's not hard to imagine other non-compliant
> devices which would still be defeated by this code.

At some point you just have to say "this device is broken crap, send
it back or ebay it and buy an alternative". This is much easier for some
classes of device where there are many alternatives (like audio interfaces)
than mobile broadband where it's still very difficult to find something
suitable with the correct physical interface.

> Suppose a device presents its contiguous interfaces in reverse order, e.g.:
> [2, 1, 0]. Now suppose a device driver asks for interface 2. We will
> find interface 0, as we never check if it's the right interface and we
> never reach the part of the code that scans the array.
>
> In other words, just because an index exists, doesn't mean it's the right
> interface.
>
> I think (and I'm not much of a kernel hacker, so I reserve the right to be
> wrong) the correct solution is to:
>
> * always loop over the array looking for the right interface.
> * change ugen, so that its scanning is resilient to gaps in the interface
>   range.
>
> --
> Best Regards
> Edd Barrett
>
> http://www.theunixzoo.co.uk
Re: 6.8-current ifconfig umb0 and Device not configured
Hi,

CCing ratchov@ and kettenis@ with some context.

In short: my change broke ugen, which expects to scan up the interface
range until an interface doesn't exist.

On Wed, Feb 03, 2021 at 06:25:48AM +0100, Marcus Glocker wrote:
>
> Index: dev/usb/usbdi.c
> ===
> RCS file: /cvs/src/sys/dev/usb/usbdi.c,v
> retrieving revision 1.109
> diff -u -p -u -p -r1.109 usbdi.c
> --- dev/usb/usbdi.c	1 Feb 2021 09:21:51 -0000	1.109
> +++ dev/usb/usbdi.c	2 Feb 2021 06:07:41 -0000
> @@ -642,6 +642,10 @@ usbd_device2interface_handle(struct usbd
>
>  	if (dev->cdesc == NULL)
>  		return (USBD_NOT_CONFIGURED);
> +	if (ifaceno < dev->cdesc->bNumInterfaces) {
> +		*iface = &dev->ifaces[ifaceno];
> +		return (USBD_NORMAL_COMPLETION);
> +	}
>  	/*
>  	 * The correct interface should be at dev->ifaces[ifaceno], but we've
>  	 * seen non-compliant devices in the wild which present non-contiguous
>
> So OK if I commit this fix Edd, Stuart?

I'm OK with it as a quick-fix. At least it will make both of the devices
in question work.

But in the long run, it's not hard to imagine other non-compliant
devices which would still be defeated by this code.

Suppose a device presents its contiguous interfaces in reverse order, e.g.:
[2, 1, 0]. Now suppose a device driver asks for interface 2. We will
find interface 0, as we never check if it's the right interface and we
never reach the part of the code that scans the array.

In other words, just because an index exists, doesn't mean it's the right
interface.

I think (and I'm not much of a kernel hacker, so I reserve the right to be
wrong) the correct solution is to:

* always loop over the array looking for the right interface.
* change ugen, so that its scanning is resilient to gaps in the interface
  range.

--
Best Regards
Edd Barrett

http://www.theunixzoo.co.uk
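The reverse-order scenario and the first bullet above can be sketched as follows. This is a minimal userland model, not the usbdi.c code: `struct toy_iface`, `reversed_ifaces`, `lookup_by_index()` and `lookup_by_number()` are hypothetical names, and the [2, 1, 0] device is the made-up example from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an interface entry; not the kernel's structure. */
struct toy_iface {
	uint8_t bInterfaceNumber;
};

/* Hypothetical device that lists its interfaces in reverse order. */
static const struct toy_iface reversed_ifaces[] = { { 2 }, { 1 }, { 0 } };
#define REVERSED_NIFACES \
	(sizeof(reversed_ifaces) / sizeof(reversed_ifaces[0]))

/* Quick-fix behaviour: trust the array index whenever it is in range. */
static const struct toy_iface *
lookup_by_index(const struct toy_iface *ifaces, size_t n, uint8_t ifaceno)
{
	assert(ifaces != NULL);
	if (ifaceno < n)
		return &ifaces[ifaceno];
	return NULL;
}

/* Proposed behaviour: always scan for the entry whose number matches. */
static const struct toy_iface *
lookup_by_number(const struct toy_iface *ifaces, size_t n, uint8_t ifaceno)
{
	size_t idx;

	assert(ifaces != NULL);
	for (idx = 0; idx < n; idx++) {
		if (ifaces[idx].bInterfaceNumber == ifaceno)
			return &ifaces[idx];
	}
	return NULL;
}
```

Asking for interface 2 on `reversed_ifaces` returns the entry numbered 0 with `lookup_by_index()` and the entry numbered 2 with `lookup_by_number()`. The scan costs a walk over the array on every lookup, which is the trade-off for not assuming that index and interface number coincide.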