On Wed, 13 Jul 2016, Mark Hounschell wrote: > > Alright, so you are basically supplementing O_NDELAY flag in order to > > avoid check_disk_change() being called. It's rather a coincidence that > > it has worked this way, but I agree with you that we can't ignore the > > fact that there is userspace relying on this behavior. > > I'm not supplementing anything. The driver _did_ this on its own.
I mean, you're passing O_NDELAY to open(/dev/fd0) exactly to avoid kernel issuing check_disk_change(). That's the only semantics O_NDELAY has for fd. > I just expect to be able to open the drive to get a handle without the > kernel attempting to access the media. My apps manage a disk_change on > their own. I don't think its check_disk_change that gives me my pain. > There is some probe happening that fails when a floppy is installed that > is not a "standard" format. That causes the open to fail which is the > most pain. Still I should be able to get a handle without any media or > even unrecognized media installed. Yeah, that's check_disk_change(). > > > The original behavior of the floppy driver was correct. I have no > > > idea what BUG these changes were supposed to fix but the "fix" > > > obviously broke user land. Was this bug reported by some new ROBOT > > > test or something? The kernel floppy driver has been stable for > > > years now > > > > That's not really true; the code is a racy mess, and this is being > > uncovered only when virtualized floppy devices started to exist > > (because they are much faster than a real hardware, and the different > > timing reveals bugs that were not visible before). > > Forgive me here as I'm ignorant about why any virtualized floppy would > require the real physical kernel floppy driver to be involved at all. Because VMs (such as qemu) actually do emulate a FDC on a hardware level, but don't emulate the timings of the real hardware (which are not mandated by the spec, but "are just there"). > > This particular fix was because syzkaller found a way how easily corrupt > > kernel memory using O_NDELAY to floppy driver; see > > > > https://lkml.org/lkml/2016/2/2/848 > > > > > so I am really confused as to why these changes were induced. > > > > The floppy driver is in an orphan mode; no new "features" are added "just > > because". Everything that's happening there is to fix real bugs in the > > kernel. > > > > I'll look into ways how to fix this, but I am afraid this is going to be > > really tricky. Therefore we'd have to very likely proceed asap with revert > > of 09954bad448 and coming up with a workaround that'd still avoid the bug > > reported by syzkaller. > > I would be happy to do some testing for you if needed. At least with regard to > our apps. Could you please check whether my last patch that Jens queued in linux-block.git ("floppy: fix open(O_ACCMODE) for ioctl-only open" in for-linus branch) remedies at least some of the issues you are seeing? Thanks, -- Jiri Kosina SUSE Labs

