Re: udev breakages - was: Re: Need of an .async_probe() type of callback at driver's core - Was: Re: [PATCH] [media] drxk: change it to use request_firmware_nowait()
On Wed, Oct 3, 2012 at 12:12 AM, Greg KH gre...@linuxfoundation.org wrote:
> Mauro, what version of udev are you using that is still showing this issue?
>
> Kay, didn't you resolve this already? If not, what was the reason why?

It's the same in the current release, we still haven't wrapped our head around how to fix it/work around it.

Unlike what the heated and pretty uncivilized and rude emails here claim, udev does not dead-lock or break things, it's just slow. The modprobe event handling runs into a ~30 second event timeout. Everything is still fully functional though, there's only this delay.

Udev ensures full dependency resolution between parent and child events. Parent events have to finish the event handling and have to return, before child event handlers are started. We need to ensure such things so that (among other things) disk events have finished their operations before the partition events are started, so they can rely on and access their fully set-up parent devices.

What happens here is that the module_init() call blocks in a userspace transaction, creating a child event that is not started until the parent event has finished. The event handler for modprobe times out, then the child event loads the firmware.

Having a kernel module rely on a running and fully functional userspace to return from module_init() is surely a broken driver model, at least it's not how things should work. If userspace does not respond to firmware requests, module_init() locks up until the firmware timeout happens.

This all is not so much about how probe() should behave, it's about a fragile dependency on a specific userspace transaction to link a loadable module into the kernel. Drivers should avoid such loops for many reasons. Also, it's unclear in many cases how such a model should work at all if the module is compiled in and initialized when no userspace is running.
If that unfortunate module_init() lockup can't be solved properly in the kernel, we need to find out if we need to make the modprobe handling in udev async, or let firmware events bypass dependency resolving. As mentioned, we haven't decided as of now which road to take here.

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
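Editorial sketch, not part of the thread: the ordering described in the message above can be modelled with a few lines of Python. This is a toy model under stated assumptions, not udev's or the kernel's code. The function name `firmware_arrival` and the 0.5 second `load_time` are made up for illustration; the ~30 second event timeout comes from this message, and the 60 second firmware timeout is the figure quoted later in the thread.

```python
def firmware_arrival(serialized, userspace_running=True,
                     event_timeout=30.0, firmware_timeout=60.0,
                     load_time=0.5):
    """Toy model: seconds until the firmware reaches the driver, or None.

    serialized=True models udev's parent/child event ordering;
    serialized=False models a firmware event that bypasses it.
    All names and numbers here are illustrative assumptions.
    """
    if not userspace_running:
        # init=/bin/sh case: nobody answers the request, so module_init()
        # stays blocked until the kernel's own firmware timeout fires.
        return None

    # t = 0: the handler for the parent event runs modprobe, and
    # module_init() blocks in a synchronous firmware request. The kernel
    # emits the child (firmware) event at this point.
    if serialized:
        # The child handler may only start once the parent handler is done.
        # The parent is itself waiting for the firmware, so it only ends
        # when its event timeout expires.
        child_start = event_timeout
    else:
        # No ordering constraint: the child handler starts right away.
        child_start = 0.0

    arrival = child_start + load_time
    # The kernel gives up once firmware_timeout has passed.
    return arrival if arrival < firmware_timeout else None
```

With the defaults, `firmware_arrival(True)` gives 30.5 and `firmware_arrival(False)` gives 0.5, which is the "slow, not broken" behaviour described above. If the event timeout were longer than the firmware timeout, the model returns None, meaning the request would fail outright.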
Re: udev breakages - was: Re: Need of an .async_probe() type of callback at driver's core - Was: Re: [PATCH] [media] drxk: change it to use request_firmware_nowait()
On Wed, Oct 3, 2012 at 6:57 PM, Greg KH gre...@linuxfoundation.org wrote:
>> It's the same in the current release, we still haven't wrapped our head around how to fix it/work around it.
>
> Ick, as this is breaking people's previously-working machines, shouldn't this be resolved quickly?

Nothing really breaks. It's slow, and it will surely be fixed when we know what the right fix is, which we haven't sorted out at this moment.

> module_init() can do lots of bad things, sleeping, asking for firmware, and lots of other things. To have userspace block because of this doesn't seem very wise.

Not saying that it is right or nice, but it's the kernel itself that blocks. Run init=/bin/sh and do a modprobe of one of these drivers and it hangs un-interruptible until the kernel's internal firmware loading request times out, just because userspace is not there.

> But previously this all just worked as we ran 'modprobe' in a new thread/process right?

No, we used to un-queue events which had a timeout specified in the environment; that code caused other issues and was removed.

> it can do without worrying about stopping anything else in the system that might want to happen at the same time (like load multiple modules in a row).

It should not be an issue, the serialization is strictly parent - child, everything else runs in parallel.

>> If that unfortunate module_init() lockup can't be solved properly in the kernel, we need to find out if we need to make the modprobe handling in udev async, or let firmware events bypass dependency resolving. As mentioned, we haven't decided as of now which road to take here.
>
> It's not a lockup, there have never been rules about what a driver could and could not do in its module_init() function. Sure, there are some not-nice drivers out there, but don't halt the whole system just because of them.

It is a kind of lock up, just try modprobe with the init=/bin/sh boot.

> I recommend making module loading async, like it used to be, and then all should be fine, right?
That's the current idea, and Tom is looking into how it could look.

I also have no issues at all if the kernel does load the firmware from the filesystem on its own; it sounds like the simplest and most robust solution from a general look at the problem. It would also make the difference between in-kernel firmware and out-of-kernel firmware less visible, which sounds good.

Honestly, requiring firmware-loading userspace-transactions to successfully link a module into the kernel sounds like a pretty bad idea to start with. Unlike module loading, which needs the depmod alias database and userspace configuration, with firmware loading there is no policy involved where userspace would add any single additional value to that step.

Thanks,
Kay
Re: udev breakages - was: Re: Need of an .async_probe() type of callback at driver's core - Was: Re: [PATCH] [media] drxk: change it to use request_firmware_nowait()
On Wed, Oct 3, 2012 at 10:39 PM, Linus Torvalds torva...@linux-foundation.org wrote:
> On Wed, Oct 3, 2012 at 12:50 PM, Greg KH gre...@linuxfoundation.org wrote:
>>> Ok, like this?
>>
>> This looks good to me. Having udev do firmware loading and tying it to the driver model may have not been such a good idea so many years ago. Doing it this way makes more sense.
>
> Ok, I wish this had been getting more testing in Linux-next or something, but I suspect that what I'll do is to commit this patch asap, and then commit another patch that turns off udev firmware loading entirely for the synchronous firmware loading case.
>
> Why? Just to get more testing, and seeing if there are reports of breakage. Maybe some udev out there has a different search path (or because udev runs in a different filesystem namespace or whatever), in which case running udev as a fallback would otherwise hide the fact that the direct kernel firmware loading isn't working.
>
> Ok? Comments?

The current udev directory search order is:
  /lib/firmware/updates/$(uname -r)/
  /lib/firmware/updates/
  /lib/firmware/$(uname -r)/
  /lib/firmware/

There is no commonly known /firmware directory.

http://cgit.freedesktop.org/systemd/systemd/tree/src/udev/udev-builtin-firmware.c#n100
http://cgit.freedesktop.org/systemd/systemd/tree/configure.ac#n548

Kay
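Editorial sketch, not part of the thread: the search order listed in the message above amounts to trying four directories in turn and taking the first hit. The Python below illustrates that lookup under stated assumptions. The function names `firmware_candidates` and `find_firmware` and the `root` parameter are hypothetical, and this is not the code from udev-builtin-firmware.c or from the kernel patch under discussion.

```python
import os


def firmware_candidates(name, kernel_release, root="/lib/firmware"):
    """Return candidate paths for firmware file `name`, highest priority first.

    kernel_release is what `uname -r` prints. The order mirrors the list
    given in the message above; `root` is a parameter only so the lookup
    can be exercised against a scratch directory.
    """
    return [
        os.path.join(root, "updates", kernel_release, name),
        os.path.join(root, "updates", name),
        os.path.join(root, kernel_release, name),
        os.path.join(root, name),
    ]


def find_firmware(name, kernel_release, root="/lib/firmware"):
    """Return the first existing candidate path, or None if there is none."""
    for path in firmware_candidates(name, kernel_release, root):
        if os.path.isfile(path):
            return path
    return None
```

On a running system the release string could be taken from `os.uname().release`. Note that the lookup answers immediately with a path or None; no timeout is involved.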
Re: udev breakages - was: Re: Need of an .async_probe() type of callback at driver's core - Was: Re: [PATCH] [media] drxk: change it to use request_firmware_nowait()
On Wed, Oct 3, 2012 at 11:05 PM, Greg KH gre...@linuxfoundation.org wrote:
> As for the firmware path, maybe we should change that to be modified by userspace (much like /sbin/hotplug was) in a proc file so that distros can override the location if they need to.

If that's needed, a CONFIG_FIRMWARE_PATH= with the array of locations would probably be sufficient. Like udev's defaults here:
http://cgit.freedesktop.org/systemd/systemd/tree/configure.ac#n550

Kay
Re: udev breakages - was: Re: Need of an .async_probe() type of callback at driver's core - Was: Re: [PATCH] [media] drxk: change it to use request_firmware_nowait()
On Thu, Oct 4, 2012 at 12:58 AM, Linus Torvalds torva...@linux-foundation.org wrote:
> That said, there's clearly enough variation here that I think that for now I won't take the step to disable the udev part. I'll do the patch to support direct filesystem firmware loading using the udev default paths, and that hopefully fixes the particular case people see with media modules.

If that approach looks like it works out, please aim for full in-kernel-*only* support. I would absolutely like to get udev entirely out of the sick game of firmware loading here. I would welcome it if we did not fall back to the blocking, timed-out behaviour again.

The whole story would be contained entirely in the kernel, and we get rid of the rather fragile userspace transaction to execute module_init(), where the kernel has no idea if userspace is even up to ever responding to its requests.

There would be no coordination with userspace tools needed, which sounds like a better fit in the way we develop things with the loosely coupled kernel - udev requirements.

If that works out, it would be a bit like devtmpfs, which turned out to be very simple, reliable and absolutely the right thing we could do to primarily manage /dev content.

The whole dance with the fake firmware struct device, which has a 60 second timeout to wait for userspace, is a long story of weird failures in various aspects. It would not only solve the unfortunate modprobe lockup with init=/bin/sh we see here; big servers with an insane amount of devices also happen to run into the 60 sec timeout, because udev, which runs with 4000-8000 threads in parallel handling things like 30,000 disks, is not scheduled in time to fulfill network card firmware requests. It would be nice if we didn't have that arbitrary timeout at all.

Needing any timeout at all to answer the simple question of whether a file stored in the rootfs exists should be a hint that there is something really wrong with the model of how this is done.
Thanks,
Kay
Re: [PATCH RFC 3/4] em28xx: Workaround for new udev versions
On Tue, 2012-06-26 at 18:07 -0300, Mauro Carvalho Chehab wrote:
> On 26-06-2012 17:40, Greg KH wrote:
>> On Tue, Jun 26, 2012 at 04:34:21PM -0300, Mauro Carvalho Chehab wrote:
>>> New udev-182 seems to be buggy: even when usermode is enabled, it insists that probe defer any firmware requests. So, drivers with firmware need to defer probe for the first driver's core request, otherwise a useless penalty of 30 seconds happens, as udev will refuse to load any firmware.
>>
>> Shouldn't you fix udev, if it really is a problem here? Papering over userspace bugs in the kernel isn't usually a good thing to do, as odds are, it will hit some other driver sometime, right?
>
> That's my opinion too, but Kay seems to think otherwise. In his opinion, waiting for firmware during module_init() is something that was never allowed in the device model.

No, that's not at all a udev *bug*, the changelog in this patch is just plain wrong. It's just udev making noise about a broken driver behavior. And it's the messenger, not the problem.

Kernel modules must not block in module_init() in a userspace transaction (fw load) that can take an unpredictable amount of time. It results in broken suspend/resume paths, or broken compiled-in module behaviour, and a modprobe process which hangs uninterruptibly until the firmware timeout happens.

Uevents have dependencies: if a parent device event calls modprobe, the child device it creates will wait for the parent event to finish, but if the parent blocks in modprobe, it will not finish and we run into the deadlock udev complains about. Udev used to work around that; that workaround we turned into the logged error we see now.

Again, uninterruptible blocking of module_init() in an in-kernel callout-to-userspace is not proper driver behavior, and needs to be changed.
Thanks,
Kay
Re: [Q] udev and soc-camera
On Thu, Jan 28, 2010 at 00:25, Valentin Longchamp valentin.longch...@epfl.ch wrote:
> I have a system that is built with OpenEmbedded where I use a mt9t031 camera with the soc-camera framework. The mt9t031 works ok with the current kernel and system.
>
> However, udev does not create the /dev/video0 device node. I have to create it manually with mknod and then it works well. If I unbind the device on the soc-camera bus (and then eventually rebind it), udev then creates the node correctly. This looks like a timing issue at coldstart.
>
> OpenEmbedded currently builds udev 141 and I am using kernel 2.6.33-rc5 (but this was already like that with earlier kernels).
>
> Is this problem something known or has at least someone already experienced that problem ?

You need to run udevadm trigger as the bootstrap/coldplug step, after you have started udev. Devices which are already there at that time will not get created by udev; it only handles new ones which it sees events for. The trigger will tell the kernel to send all events again.

Or just use the kernel's devtmpfs, and all this should work, even without udev, if you do not have any other needs than plain device nodes.

Kay
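Editorial sketch, not part of the thread: the coldplug step works because the kernel re-sends a device's event when an action name is written to that device's `uevent` file in sysfs. The Python below approximates the idea under stated assumptions. The function name `trigger_coldplug` is hypothetical, the real udevadm trigger selects devices differently and has more options, and which default action a given udev version sends is not something this sketch claims.

```python
import os


def trigger_coldplug(sysfs_devices="/sys/devices", action="add"):
    """Ask the kernel to replay events for devices that already exist.

    Walks the tree below sysfs_devices and writes `action` to every
    `uevent` file found. Returns the number of files written. Needs
    root on a real system; the path is a parameter so the walk can be
    tried against a scratch directory first.
    """
    written = 0
    # os.walk does not follow symlinks by default, which avoids the
    # symlink loops that exist in sysfs.
    for dirpath, _dirnames, filenames in os.walk(sysfs_devices):
        if "uevent" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "uevent"), "w") as handle:
                handle.write(action)
        except OSError:
            # Device vanished or is not writable: skip it and carry on.
            continue
        written += 1
    return written
```

In practice the udevadm trigger command mentioned above is the right tool; the sketch only shows why devices that appeared before udev started need this extra step.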