On Thu, Apr 5, 2018 at 9:28 PM, Peter Hutterer <[email protected]> wrote:
> On Thu, Apr 05, 2018 at 01:20:51PM +0000, Friedrich, Eugen (ADITG/ESB) > wrote: > > Hi Peter > > > > Best regards > > > > Eugen Friedrich > > Engineering Software Base (ADITG/ESB) > > > > Tel. +49 5121 49 6921 > > > > > -----Original Message----- > > > From: wayland-devel [mailto:wayland-devel- > > > [email protected]] On Behalf Of Peter Hutterer > > > Sent: Donnerstag, 5. April 2018 06:27 > > > To: Matt Hoosier > > > Cc: Pekka Paalanen; wayland mailing list > > > Subject: Re: [PATCH libinput 0/1] Initialization race against udev > > > > > > On Wed, Apr 04, 2018 at 09:04:32AM -0500, Matt Hoosier wrote: > > > > I encounter another "bad" behavior related to multiple firings of > the UDev > > > > 'add' event for a given input node. > > > > > > > > Typically on modest embedded systems, you do not want to run the > > > exhaustive > > > > 'udevadm trigger' during the main booting sequence. That causes > hundreds > > > or > > > > thousands of UDev nodes to be crawled and processed by udevd all > during > > > the > > > > first moments of userspace. These overall bulk of these nodes are > pretty > > > > much irrelevant to the use-case of the device. The time spent > processing > > > > these non-essential /sys devices can easily slow down the device's > > > progress > > > > toward starting a UI with basic touch support and loading drivers > for the > > > > handful of essential peripherals by 10 or 15 seconds. > > > > > > > > Instead, it works better to manually command the same UDev > coldplugging > > > > that would have been done by 'udevadm trigger', but for a very small > > > > hand-picked set of devices rather than everything in /sys/devices. > The time > > > > savings are large. Of course, for completeness you do eventually > have to > > > > run 'udevadm trigger' so that the full set of hardware and kernel > software > > > > features are activated, but that can wait until after the main KPIs > are > > > > achieved. > > > > > > > > This scheme generally works just fine. Manually stimulating udevd to > > > > coldplug the specific devices you need keeps everything general: > > > > applications that find their hardware with UDev (such as with > libinput or > > > > Weston's DRM backend) all get to rely on their usual well-tested > > > codepaths. > > > > > > > > But there is a snag: if a device like /dev/input/event0 has been > > > > coldplugged once with the hands-on technique, then all the daemons > that > > > > care about it have already seen one UDev ACTION=add event for it. > When > > > the > > > > late-running 'udevadm trigger' does its exhaustive sweep across > > > > /sys/devices, this will cause a second ACTION=add event to be > triggered > > > for > > > > /dev/input/event0. Currently (well, with libinput 1.1.1) this causes > > > > libinput -- and consequently Weston -- to open a second > filedescriptor > > > > against /dev/input/event0, so that all input events are received in > > > > duplicate. That confuses the compositor's and applications' input > event > > > > handling. > > > > > > > > Would it be tolerable to put a patch into either libinput or Weston > to > > > > guard against double-opening the same input device? I realize that > the > > > > scheme outlined above is probably technically in violation of the > expected > > > > UDev initialization scheme, but I'm not aware of any way to suppress > > > udevd > > > > from broadcasting the 'add' action to all udev clients even though > the > > > > device has already been coldplugged once. It seems to me at least > plausible > > > > that the Weston stack would benefit from guarding against getting > into this > > > > bad state from receiving unexpected UDev events. > > > > > > > > > I don't think a weston patch is likely to help you here since > libinput's > > > udev handling is independent of weston. but tbh, I'd rather not do > this in > > > libinput either. as you write, you're not really behaving like udev is > > > supposed to and once we start having to deal with that the floodgates > are > > > open for weird patches. The code for this should be relatively > > > self-contained and easy-enough to maintain, please maintain that as a > > > distribution-specific patch. > > > > [EF] but as I can see there is a race condition which can always occur in > > any system if device will be added to the system after installation of > the > > udev monitor and before initializing already coldplugged devices. > > Can we really just accept this and hope it will never happen? Or I'm > missing something? > > yeah, that race condition could occur but the chances are ... very slim. > By the time libinput starts up, all the built-in devices are already > connected and what are the chances that you plug in a USB device at that > precise ms? > > Now, if you feel like writing patches for this you're welcome if they're > not > too intrusive but one problem with those patches is likely that they're > almost impossible to test for the udev backend (other than by manually > pausing libinput at the right time and hoping). So unless you can come up > with a solution for that too, I'd rather take my chances with the race > condition than maintaining untestable code that may not work correctly in > the one case where we need it :) > > > > > especially if you're really still on 1.1.1 which came out in 2015... > > > > > > As for how it would work: device_added in src/udev-seat.c, you can get > to > > > the libinput_seat which has the device list (they're all struct > > > evdev_device). You can get the fd from that and stat() it for the > st_rdev > > > or, simpler, add a field and strdup the device node into that struct > and > > > compare that. > > > > > > On second thought: I'm not sure how exactly you're handling the > > > cold-plugging but is there a way you can add a udev rule that assigns > > > LIBINPUT_IGNORE_DEVICE=1 to all input devices already handled? > > > > [EF] Could not get the exact idea! Should we set the attribute > > LIBINPUT_IGNORE_DEVICE from the libinput code? > > Above you wrote: > > Manually stimulating udevd to > > coldplug the specific devices you need keeps everything general: > I'm not certain whether this question was addressed to me or Eugen's commentary after my message. In any case, when I say "manually stimulate udevd to send an 'add' event", I mean exactly what Pekka interpreted it as. 'udevadm trigger' is nothing fancy; it essentially just exhaustively crawls across /sys and writes the string "add" into each device directory in which it finds a file named /sys/.../uevent. There is no stateful interaction with udevd or anything like that. So in my case, I'm just manually doing "echo add > /sys/devices/my-device-path/uevent" to shortcut the big expensive iteration that would be imposed if you tried to use 'udevadm trigger --lots-of-narrow-match-expressions'. Same effect from the perspective of interacting with the kernel or udevd though. > > I don't actually know what you mean by this, so I'm not sure what the udev > behaviour is in this case. But if you're holding back the > trigger anyway is that you can do whatever you do for the devices, then > drop > a udev rule in that sets LIBINPUT_IGNORE_DEVICE, then you run the udevadm > trigger. Any device that is now handled by udev will have that property > assigned and will thus be ignored, including all your duplicates. > > > Additional idea would be to use USEC_INITIALIZED time as unique > identifier > > in one boot cycle. > > Can you expand on how you think this would work? > > Cheers, > Peter > > > > > On Wed, Apr 4, 2018 at 6:51 AM, Pekka Paalanen <[email protected]> > > > wrote: > > > > > > > > > From: Pekka Paalanen <[email protected]> > > > > > > > > > > Hi, > > > > > > > > > > here is a patch for a race that was troubling a distribution for an > > > > > embedded device. The original form of this patch has been in > developer > > > > > use for months now, and it seems to work there. For this version I > have > > > > > only changed the logging and fixed a leak. > > > > > > > > > > I originally discussed this issue on #wayland with Peter Hutterer > on > > > > > September 22, 2017. The discussion is quoted below to refresh our > > > > > memories. > > > > > > > > > > In the distribution, Weston is started by systemd, the system is > not > > > > > very CPU-powerful, and lots of things are happening during the > boot, > > > > > which may all contribute to making this race possible to lose. > > > > > > > > > > I've only tested briefly on my work desktop to see that Weston > still > > > > > appears to find the input devices I expect. Obviously my desktop > would > > > > > never lose the race, because there are no input devices being > hotplugged > > > > > at the same time as Weston is starting up. > > > > > > > > > > The timer/idle callback idea is not implemented here, and neither > is the > > > > > double-add filtering. Let me know if you require those. > > > > > > > > > > > > > > > Thanks, > > > > > pq > > > > > > > > > > > > > > > < pq> whot, btw. we've looking into some fun libinput vs. udev > > > > > device initialization race here. Apparently the device enumeration > > > > > on libinput start-up may race with udev preparing devices at the > > > > > same time, so the initial device enumeration for weston may see > > > > > devices that have not had all their udev properties set yet. Do you > > > > > recall ever fighting such issues? > > > > > > > > > > < pq> whot, another issue is that we may see a double-add of a > > > > > device, first from the initial enumeration (with possibly missing > > > > > properties) and then a second time as a hotplug event because > > > > > libinput (correctly) listens for events before it enume existing > > > > > devices. > > > > > > > > > > < whot> pq: yes, I've seen this a lot with the test suite, but > never > > > > > in real life > > > > > > > > > > < whot> I'm pretty sure in my case it's always a lingering udev > > > > > device or some lingering properties > > > > > > > > > > < whot> e.g. a tablet has some keyboard properties set because the > > > > > kernel re-uses the event node > > > > > > > > > > < whot> but that's triggered by the test suite using the path > > > > > interface and there's bound to be a window where we can race. it > > > > > shouldn't happen with the udev device, I think > > > > > > > > > > < pq> specifically, in the target device/system with touchscreens > > > > > and weston, there are udev rules to set WL_OUTPUT on the device, > but > > > > > weston does not always see it. > > > > > > > > > > < pq> well, we use libinput via weston, and I don't think weston > > > > > uses the path-based function. It's really the initial enumeration > of > > > > > input devices in libinput that has the issue. > > > > > > > > > > < whot> pq: a double-add is a bug, I can picture that happening > with > > > > > the current code. should be fixable > > > > > > > > > > * whot mumples something about syspath comparisons to filter that > > > > > out > > > > > > > > > > < whot> mumbles, even. but it's late nough that I'm also open for > > > > > mumpling > > > > > > > > > > < pq> whot, nice. Yeah, let's see if we have a patch to submit or > > > > > just a bug report, it might take a while. > > > > > > > > > > < whot> pq: for the missing properties I have no idea, but that > > > > > would, if anything, be a systemd bug > > > > > > > > > > < pq> whot, really? What's udev_device_get_is_initialised() for > > > > > then? Does it not apply to the initial device enumeration as done > by > > > > > libinput? > > > > > > > > > > < pq> whot, how does initial device enumeration work? Does it go > the > > > > > udev daemon or just look in sysfs on its own? > > > > > > > > > > < whot> udev_device_get_is_initialized may help but it didn't with > > > > > the test suite > > > > > > > > > > < whot> but for your case, it might just because you're fighting a > > > > > different race > > > > > > > > > > < whot> should be easy enough to printf that to the log in your > case > > > > > and check if it's false when it fails > > > > > > > > > > < pq> whot, oh yeah, this didn't feel like a lingering device > issue. > > > > > > > > > > < whot> initial enumeration: see udev_input_add_devices in > > > > > src/udev-seat.c > > > > > > > > > > < whot> basically: "get list of devices maching subsystem input, > > > > > create udev device for each syspath in that list" > > > > > > > > > > < pq> whot, our current idea is to have the initial enumeration to > > > > > ignore devices where udev_device_get_is_initialized() returns > false, > > > > > because presumably we will get the hotplug event later anyway. > > > > > > > > > > < pq> that would avoid any busy-loop waiting > > > > > > > > > > < whot> you could schedule a callback for that device's syspath > > > > > > > > > > < whot> takes the whole "presumably" out of the question :) > > > > > > > > > > < pq> you mean like an idle task to re-check it? > > > > > > > > > > < whot> yeah, that or a timer > > > > > > > > > > < whot> not sure we have idle sources in libinput right now, so a > > > > > timer is your better option here > > > > > > > > > > < pq> whot, ok, sure. Thanks for the tip. :-) > > > > > > > > > > Nandor Han (1): > > > > > udev: validate input devices during cold-plug > > > > > > > > > > src/udev-seat.c | 11 +++++++++++ > > > > > 1 file changed, 11 insertions(+) > > > > > > > > > > -- > > > > > 2.16.1 > > > > > > > > > > _______________________________________________ > > > > > wayland-devel mailing list > > > > > [email protected] > > > > > https://lists.freedesktop.org/mailman/listinfo/wayland-devel > > > > > > > > > > > > _______________________________________________ > > > > wayland-devel mailing list > > > > [email protected] > > > > https://lists.freedesktop.org/mailman/listinfo/wayland-devel > > > > > > _______________________________________________ > > > wayland-devel mailing list > > > [email protected] > > > https://lists.freedesktop.org/mailman/listinfo/wayland-devel > > >
_______________________________________________ wayland-devel mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/wayland-devel
