On Fri, Mar 12, 2021 at 11:11 AM Arnd Bergmann <[email protected]> wrote:
> > > On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <[email protected]> wrote:
> > > > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
> > > > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]>
> > > > > wrote:
> > > > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > > > but should instead return values with all bits set. There could
> > > > > be a ratelimited console warning about broken drivers, but we
> > > > > can't assume that all drivers work correctly, as some ancient
> > > > > PC style drivers still rely on this.
> > > > > John Garry has recently worked on a related bugfix, so maybe
> > > > > either this is the same bug he encountered (and hasn't merged
> > > > > yet), or if his fix got merged there is still a remaining problem.
> > > > >
> > > > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > > > not initialized. What is the output of 'cat
> > > > > /proc/tty/driver/serial'
> > > > > on this machine? Do you see any messages from the serial
> > > > > driver in the boot log?
> > > > > Unfortunately there are so many different ways to probe devices
> > > > > in the 8250 driver that I don't know where this comes from.
> > > > > Your config file has
> > > > > CONFIG_SERIAL_8250_PNP=y
> > > > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > > > CONFIG_SERIAL_8250_EXTENDED=y
> > > > > I guess it's probably the preconfigured uarts that somehow
> > > > > become probed without initialization, but it could also be
> > > > > an explicit device incorrectly described by qemu.
> > > >
> > > > Here is full boot log, /proc/tty/driver/serial and the crash:
> > > > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
> > >
> > > Ok, so there are four 8250 ports, and none of them are initialized,
> > > while the console is on /dev/ttyAMA0 using a different driver.
> > >
> > > I'm fairly sure this is a bug in the kernel then, not in qemu.
> > >
> > > I also see that the PCI I/O space gets mapped to a physical address:
> > > [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
> > > 0x003eff0000..0x003effffff -> 0x0000000000
> > >
> > > So it's probably qemu that triggers the 'synchronous external
> > > abort' when accessing the PCI I/O space, which in turn hints
> > > towards a bug in qemu. Presumably it only returns data from
> > > I/O ports that are actually mapped to a device when real hardware
> > > is supposed to return 0xffffffff when reading from unused I/O ports.
> > > This would be separate from the work that John did, which only
> > > fixed the kernel for accessing I/O port ranges that do not have
> > > a corresponding MMU mapping to hardware ports.
> >
> > Will John's patch fix this crash w/o any changes in qemu? That would
> > be good enough for syzbot. Otherwise we need to report the issue to
> > qemu.
>
> No, this was a third issue. As far as I remember, this would result in
> a similar problem in the case where there is no PCI bus at all, or
> where no PCI host has an I/O port range, so the inb() from the serial
> driver would cause a page fault. The problem you ran into happens
> in qemu when the PCI I/O ports are mapped to hardware registers
> that cause an exception when accessed.
>
> If you just want to work around the problem for now, it should
> go away if you set CONFIG_SERIAL_8250_RUNTIME_UARTS
> to zero.
It does not happen too often on syzbot so far, so let's try to do the
right thing first.
I've filed: https://bugs.launchpad.net/qemu/+bug/1918917
with a link to this thread. To be fair, I don't fully understand the
details here, so I hope I relayed your description properly.
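
For anyone landing here from the bug report, this is a rough user-space
sketch of the behavior as I understood it from the description above: a
read from an I/O port that no device claims should come back with all
bits set rather than raise an exception. It is only a model. The names
(struct port_range, model_inb, model_port_present) and the single
mapped range are made up for illustration and are not kernel or qemu
code; 0x3f8 and 0x2e8 are just the conventional PC addresses for the
first and fourth legacy UART, which I am assuming for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one claimed I/O port range; not a kernel structure. */
struct port_range {
    uint16_t base;   /* first port in the range */
    uint16_t len;    /* number of consecutive ports */
    uint8_t  value;  /* value every port in the range reads as (simplification) */
};

/* Made-up example: only one 8-port range is backed by a device. */
static const struct port_range mapped[] = {
    { 0x3f8, 8, 0x00 },
};

/*
 * Model of a byte read from I/O space: claimed ports return device data,
 * every other port reads as all ones (0xff) instead of faulting.
 */
static uint8_t model_inb(uint16_t port)
{
    for (size_t i = 0; i < sizeof(mapped) / sizeof(mapped[0]); i++) {
        if (port >= mapped[i].base && port - mapped[i].base < mapped[i].len)
            return mapped[i].value;
    }
    return 0xff;
}

/* Driver-side view in this model: all bits set means "nothing here". */
static int model_port_present(uint16_t port)
{
    return model_inb(port) != 0xff;
}

/* Usage: the checks a caller might make against this model. */
static void model_selfcheck(void)
{
    assert(model_port_present(0x3f8));   /* claimed range */
    assert(!model_port_present(0x2e8));  /* unclaimed port reads 0xff */
}
```

In this model a probe of an absent UART simply sees 0xff and gives up,
which is what the ancient PC style drivers mentioned above rely on.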

