Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
Patchset imported to github. Pull request: https://github.com/systemd-devs/systemd/compare/master...systemd-mailing-devs:1429895189.10988.22.camel%40gmail.com -- Generated by https://github.com/haraldh/mail2git ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 21:39, Andrei Borzenkov (arvidj...@gmail.com) wrote: В Fri, 24 Apr 2015 20:19:33 +0200 Lennart Poettering lenn...@poettering.net пишет: On Fri, 24.04.15 20:46, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-24 at 19:13 +0200, Lennart Poettering wrote: On Fri, 24.04.15 20:06, Ivan Shapovalov (intelfx...@gmail.com) wrote: With this patch applied, on `systemctl daemon-reload` I get the following: Any chance you can do the same with debugging on? systemd-analyze set-log-level debug right before the daemon-reload? That should show the transaction being queued in. Sure, I've ran it (log attached), but well... it did not show any new jobs being enqueued. But alsa-restore.service _did_ run and did reset my ALSA volume to the bootup value. Pretty confused, Note that starting services is recursive: if a service is enqueued, then we add all its dependencies to the transaction, verify that the transaction is without cycles and can be applied, and then actually apply it. This means, that starting a service foo.service, that requires bar.target, that requires waldo.service, will mean that waldo.service is also started, even if bar.target is already started anyway. I was sure that on reload systemd simply restores previous state of services. Why it attempts to start anything in the first place? well, sure it does restore it. but triggers might start someting after reload, It makes reload potentially dangerous; what if service was stopped on purpose and should remain this way? well, the next time you start something and that something declares it needs something else, then we will start that something else before that something, and that's the right thing to do. i mean, don't misunderstand what I wrote earlier: if a service is already started, and you do a reload, then its deps will not be pulled in again. Only newly enqueued start jobs will have that effect. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-04-27 at 17:14 +0200, Lennart Poettering wrote: On Sat, 25.04.15 05:48, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-25 at 04:00 +0300, Ivan Shapovalov wrote: On 2015-04-24 at 16:04 +0200, Lennart Poettering wrote: [...] Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. So, I think I understand the problem. We should do this not only for UNIT_TRIGGERS, but also for any dependencies which may matter when activating that unit. That is, anything which is referenced by transaction_add_job_and_dependencies()... recursively. Here is what I have in mind. Don't know whether this is correct, but it fixes the problem for me. From 515d878e526e52fc154874e93a4c97555ebd8cff Mon Sep 17 00:00:00 2001 From: Ivan Shapovalov intelfx...@gmail.com Date: Sat, 25 Apr 2015 04:57:59 +0300 Subject: [PATCH] core: coldplug all units which participate in jobs This is yet another attempt to fix coldplugging order (more especially, the problem which happens when one creates a job during coldplugging, and it references a not-yet-coldplugged unit). Now we forcibly coldplug all units which participate in jobs. This is a superset of previously implemented handling of the UNIT_TRIGGERS dependencies, so that handling is removed. --- src/core/transaction.c | 6 ++ src/core/unit.c| 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/src/core/transaction.c b/src/core/transaction.c index 5974b1e..a02c02c 100644 --- a/src/core/transaction.c +++ b/src/core/transaction.c @@ -848,6 +848,12 @@ int transaction_add_job_and_dependencies( assert(type _JOB_TYPE_MAX_IN_TRANSACTION); assert(unit); +/* Before adding jobs for this unit, let's ensure that its state has been loaded. + * This matters when jobs are spawned as part of coldplugging itself (see. e. g. path_coldplug(). + * This way, we recursively coldplug units, ensuring that we do not look at state of + * not-yet-coldplugged units. */ +unit_coldplug(unit); I like the simplicity of this patch actually, but it's unfortunately too simple: coldplugging is to be applied only for services that are around at the time we come back from a reload. If you start a service during runtime, without any reloading anywhere around, we should not coldplug at all. I figure we need a coldplugging bool or so in Manager, which we set while coldplugging and can then check here. Yeah, right, I've fixed it locally but forgot to send a follow-up mail. Actually, isn't it unit-manager-n_reloading 0? -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Sat, 25.04.15 05:48, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-25 at 04:00 +0300, Ivan Shapovalov wrote: On 2015-04-24 at 16:04 +0200, Lennart Poettering wrote: [...] Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. So, I think I understand the problem. We should do this not only for UNIT_TRIGGERS, but also for any dependencies which may matter when activating that unit. That is, anything which is referenced by transaction_add_job_and_dependencies()... recursively. Here is what I have in mind. Don't know whether this is correct, but it fixes the problem for me. From 515d878e526e52fc154874e93a4c97555ebd8cff Mon Sep 17 00:00:00 2001 From: Ivan Shapovalov intelfx...@gmail.com Date: Sat, 25 Apr 2015 04:57:59 +0300 Subject: [PATCH] core: coldplug all units which participate in jobs This is yet another attempt to fix coldplugging order (more especially, the problem which happens when one creates a job during coldplugging, and it references a not-yet-coldplugged unit). Now we forcibly coldplug all units which participate in jobs. This is a superset of previously implemented handling of the UNIT_TRIGGERS dependencies, so that handling is removed. --- src/core/transaction.c | 6 ++ src/core/unit.c| 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/src/core/transaction.c b/src/core/transaction.c index 5974b1e..a02c02c 100644 --- a/src/core/transaction.c +++ b/src/core/transaction.c @@ -848,6 +848,12 @@ int transaction_add_job_and_dependencies( assert(type _JOB_TYPE_MAX_IN_TRANSACTION); assert(unit); +/* Before adding jobs for this unit, let's ensure that its state has been loaded. + * This matters when jobs are spawned as part of coldplugging itself (see. e. g. path_coldplug(). + * This way, we recursively coldplug units, ensuring that we do not look at state of + * not-yet-coldplugged units. */ +unit_coldplug(unit); I like the simplicity of this patch actually, but it's unfortunately too simple: coldplugging is to be applied only for services that are around at the time we come back from a reload. If you start a service during runtime, without any reloading anywhere around, we should not coldplug at all. I figure we need a coldplugging bool or so in Manager, which we set while coldplugging and can then check here. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Mon, 27.04.15 18:28, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-27 at 17:14 +0200, Lennart Poettering wrote: On Sat, 25.04.15 05:48, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-25 at 04:00 +0300, Ivan Shapovalov wrote: On 2015-04-24 at 16:04 +0200, Lennart Poettering wrote: [...] Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. So, I think I understand the problem. We should do this not only for UNIT_TRIGGERS, but also for any dependencies which may matter when activating that unit. That is, anything which is referenced by transaction_add_job_and_dependencies()... recursively. Here is what I have in mind. Don't know whether this is correct, but it fixes the problem for me. From 515d878e526e52fc154874e93a4c97555ebd8cff Mon Sep 17 00:00:00 2001 From: Ivan Shapovalov intelfx...@gmail.com Date: Sat, 25 Apr 2015 04:57:59 +0300 Subject: [PATCH] core: coldplug all units which participate in jobs This is yet another attempt to fix coldplugging order (more especially, the problem which happens when one creates a job during coldplugging, and it references a not-yet-coldplugged unit). Now we forcibly coldplug all units which participate in jobs. This is a superset of previously implemented handling of the UNIT_TRIGGERS dependencies, so that handling is removed. --- src/core/transaction.c | 6 ++ src/core/unit.c| 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/src/core/transaction.c b/src/core/transaction.c index 5974b1e..a02c02c 100644 --- a/src/core/transaction.c +++ b/src/core/transaction.c @@ -848,6 +848,12 @@ int transaction_add_job_and_dependencies( assert(type _JOB_TYPE_MAX_IN_TRANSACTION); assert(unit); +/* Before adding jobs for this unit, let's ensure that its state has been loaded. + * This matters when jobs are spawned as part of coldplugging itself (see. e. g. path_coldplug(). + * This way, we recursively coldplug units, ensuring that we do not look at state of + * not-yet-coldplugged units. */ +unit_coldplug(unit); I like the simplicity of this patch actually, but it's unfortunately too simple: coldplugging is to be applied only for services that are around at the time we come back from a reload. If you start a service during runtime, without any reloading anywhere around, we should not coldplug at all. I figure we need a coldplugging bool or so in Manager, which we set while coldplugging and can then check here. Yeah, right, I've fixed it locally but forgot to send a follow-up mail. Actually, isn't it unit-manager-n_reloading 0? Yes, indeed, that should suffice. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 15:52, Lennart Poettering (lenn...@poettering.net) wrote: On Wed, 25.02.15 21:40, Ivan Shapovalov (intelfx...@gmail.com) wrote: Ivan, Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 Hmm, so firstly, in this case, do those two alsa services have RemainAfterExit=yes set? I mean, if they have not, they really should. I they have, then queuing jobs for them a second time is not really an issue, because the services are already running they will be eaten up eventually. Oh, is there a simple reproducer for the issue btw? Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
not yet marked)On 2015-04-24 at 15:52 +0200, Lennart Poettering wrote: On Wed, 25.02.15 21:40, Ivan Shapovalov (intelfx...@gmail.com) wrote: Ivan, Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 Hmm, so firstly, in this case, do those two alsa services have RemainAfterExit=yes set? I mean, if they have not, they really should. I they have, then queuing jobs for them a second time is not really an issue, because the services are already running they will be eaten up eventually. They do not, but this is actually irrelevant to the root issue. Setting RemainAfterExit=yes will simply hide it. Actually, in this bug the basic.target is started second time. IOW, the point is: no matter what is the configuration of units, none of them should be re-run on reload (given no configuration changes). But regarding the patch: I am sorry, but we really should find a different way to fix this, if there really is an issue to fix... I really don't like the hashmap that maps units to functions. I mean, the whole concept of unit vtables exists to encapsulate the unit-type-specific functions, and we should not add different place for configuring those. I agree, this is not the cleanest solution. But at least it gets the semantics right, and I've waited for comments/suggestions for ~1month before actually sending this patch... Revert as you see fit, let's just make sure we eventually come up with a solution. Also, and more importantly, the whole coldplug function exists only to seperate the unit loading and initial state changing into two separate steps, so that we know that all units are fully loaded before we start making state chnages. Now, I see that this falls short of the issue at hand here, Yes, exactly. The problem is that during coldplug we may accidentally look at the state of not-yet-coldplugged units. but I think the right fix is really to alter the order in which we coldplug. More specifically, I think we need to make the coldplugging order dependency aware: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Anyway, I reverted the patch for now, this really needs more thinking. I think I agree with this idea. I just didn't know how to handle potentially unbounded recursion. Maybe we can do something along these lines (pseudocode): while (any units left to coldplug) for (unit in hashmap) if (not yet marked) if (all deps of unit are coldplugged) coldplug_and_mark(unit); That is, we will repeatedly loop over hashmap, coldplugging only those units whose UNIT_TRIGGERS are already coldplugged, and leaving other units for next outer iteration. Makes sense? -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 16:04, Lennart Poettering (lenn...@poettering.net) wrote: On Fri, 24.04.15 15:52, Lennart Poettering (lenn...@poettering.net) wrote: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. I have implemented this now in git: http://cgit.freedesktop.org/systemd/systemd/commit/?id=f78f265f405a61387c6c12a879ac0d6b6dc958db Ivan, any chance you can check if this fixes your issue? (Not sure it does, because I must admit I am not entirely sure I really understood it fully...) Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Wed, 25.02.15 21:40, Ivan Shapovalov (intelfx...@gmail.com) wrote: Ivan, Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 Hmm, so firstly, in this case, do those two alsa services have RemainAfterExit=yes set? I mean, if they have not, they really should. I they have, then queuing jobs for them a second time is not really an issue, because the services are already running they will be eaten up eventually. But regarding the patch: I am sorry, but we really should find a different way to fix this, if there really is an issue to fix... I really don't like the hashmap that maps units to functions. I mean, the whole concept of unit vtables exists to encapsulate the unit-type-specific functions, and we should not add different place for configuring those. Also, and more importantly, the whole coldplug function exists only to seperate the unit loading and initial state changing into two separate steps, so that we know that all units are fully loaded before we start making state chnages. Now, I see that this falls short of the issue at hand here, but I think the right fix is really to alter the order in which we coldplug. More specifically, I think we need to make the coldplugging order dependency aware: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Anyway, I reverted the patch for now, this really needs more thinking. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 15:52, Lennart Poettering (lenn...@poettering.net) wrote: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
В Fri, 24 Apr 2015 20:19:33 +0200 Lennart Poettering lenn...@poettering.net пишет: On Fri, 24.04.15 20:46, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-24 at 19:13 +0200, Lennart Poettering wrote: On Fri, 24.04.15 20:06, Ivan Shapovalov (intelfx...@gmail.com) wrote: With this patch applied, on `systemctl daemon-reload` I get the following: Any chance you can do the same with debugging on? systemd-analyze set-log-level debug right before the daemon-reload? That should show the transaction being queued in. Sure, I've ran it (log attached), but well... it did not show any new jobs being enqueued. But alsa-restore.service _did_ run and did reset my ALSA volume to the bootup value. Pretty confused, Note that starting services is recursive: if a service is enqueued, then we add all its dependencies to the transaction, verify that the transaction is without cycles and can be applied, and then actually apply it. This means, that starting a service foo.service, that requires bar.target, that requires waldo.service, will mean that waldo.service is also started, even if bar.target is already started anyway. I was sure that on reload systemd simply restores previous state of services. Why it attempts to start anything in the first place? It makes reload potentially dangerous; what if service was stopped on purpose and should remain this way? ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 20:46, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-24 at 19:13 +0200, Lennart Poettering wrote: On Fri, 24.04.15 20:06, Ivan Shapovalov (intelfx...@gmail.com) wrote: With this patch applied, on `systemctl daemon-reload` I get the following: Any chance you can do the same with debugging on? systemd-analyze set-log-level debug right before the daemon-reload? That should show the transaction being queued in. Sure, I've ran it (log attached), but well... it did not show any new jobs being enqueued. But alsa-restore.service _did_ run and did reset my ALSA volume to the bootup value. Pretty confused, Note that starting services is recursive: if a service is enqueued, then we add all its dependencies to the transaction, verify that the transaction is without cycles and can be applied, and then actually apply it. This means, that starting a service foo.service, that requires bar.target, that requires waldo.service, will mean that waldo.service is also started, even if bar.target is already started anyway. hence: your alsa service should really use REmainAfterExit=yes, to not be started over and over again. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 17:23, Ivan Shapovalov (intelfx...@gmail.com) wrote: I think I agree with this idea. I just didn't know how to handle potentially unbounded recursion. Maybe we can do something along these lines (pseudocode): while (any units left to coldplug) for (unit in hashmap) if (not yet marked) if (all deps of unit are coldplugged) coldplug_and_mark(unit); That is, we will repeatedly loop over hashmap, coldplugging only those units whose UNIT_TRIGGERS are already coldplugged, and leaving other units for next outer iteration. Well, I just made the recursion a real recursion: in unit_coldplug(u): if (u.coldplugged) return; u.coldplugeged = true; foreach (x in u.triggers): unit_coldplug(x); Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-04-24 at 16:04 +0200, Lennart Poettering wrote: On Fri, 24.04.15 15:52, Lennart Poettering (lenn...@poettering.net) wrote: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. So, I think I understand the problem. We should do this not only for UNIT_TRIGGERS, but also for any dependencies which may matter when activating that unit. That is, anything which is referenced by transaction_add_job_and_dependencies()... recursively. To illustrate: - A.path triggers A.service - A.service requires basic.target - we begin coldplugging - we coldplug A.path - by your patch, we first coldplug A.service - A.service is now active - we continue coldplugging A.path - NB: basic.target is not coldplugged yet! - A.path enters running and starts A.service - transaction_add_job_and_dependencies() adds jobs for all dependencies of A.service - at this point we're fucked up: basic.target is not coldplugged, but a job is added for it -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-04-25 at 04:00 +0300, Ivan Shapovalov wrote: On 2015-04-24 at 16:04 +0200, Lennart Poettering wrote: [...] Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. So, I think I understand the problem. We should do this not only for UNIT_TRIGGERS, but also for any dependencies which may matter when activating that unit. That is, anything which is referenced by transaction_add_job_and_dependencies()... recursively. Here is what I have in mind. Don't know whether this is correct, but it fixes the problem for me. From 515d878e526e52fc154874e93a4c97555ebd8cff Mon Sep 17 00:00:00 2001 From: Ivan Shapovalov intelfx...@gmail.com Date: Sat, 25 Apr 2015 04:57:59 +0300 Subject: [PATCH] core: coldplug all units which participate in jobs This is yet another attempt to fix coldplugging order (more especially, the problem which happens when one creates a job during coldplugging, and it references a not-yet-coldplugged unit). Now we forcibly coldplug all units which participate in jobs. This is a superset of previously implemented handling of the UNIT_TRIGGERS dependencies, so that handling is removed. --- src/core/transaction.c | 6 ++ src/core/unit.c| 8 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/src/core/transaction.c b/src/core/transaction.c index 5974b1e..a02c02c 100644 --- a/src/core/transaction.c +++ b/src/core/transaction.c @@ -848,6 +848,12 @@ int transaction_add_job_and_dependencies( assert(type _JOB_TYPE_MAX_IN_TRANSACTION); assert(unit); +/* Before adding jobs for this unit, let's ensure that its state has been loaded. + * This matters when jobs are spawned as part of coldplugging itself (see. e. g. path_coldplug(). + * This way, we recursively coldplug units, ensuring that we do not look at state of + * not-yet-coldplugged units. */ +unit_coldplug(unit); + /* log_debug(Pulling in %s/%s from %s/%s, */ /* unit-id, job_type_to_string(type), */ /* by ? by-unit-id : NA, */ diff --git a/src/core/unit.c b/src/core/unit.c index 2b356e2..996b648 100644 --- a/src/core/unit.c +++ b/src/core/unit.c @@ -2889,14 +2889,6 @@ int unit_coldplug(Unit *u) { u-coldplugged = true; -/* Make sure everything that we might pull in through - * triggering is coldplugged before us */ -SET_FOREACH(other, u-dependencies[UNIT_TRIGGERS], i) { -r = unit_coldplug(other); -if (r 0) -return r; -} - if (UNIT_VTABLE(u)-coldplug) { r = UNIT_VTABLE(u)-coldplug(u); if (r 0) -- 2.3.6 -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Fri, 24.04.15 20:06, Ivan Shapovalov (intelfx...@gmail.com) wrote: With this patch applied, on `systemctl daemon-reload` I get the following: Any chance you can do the same with debugging on? systemd-analyze set-log-level debug right before the daemon-reload? That should show the transaction being queued in. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-04-24 at 16:20 +0200, Lennart Poettering wrote: On Fri, 24.04.15 16:04, Lennart Poettering (lenn...@poettering.net) wrote: On Fri, 24.04.15 15:52, Lennart Poettering (lenn...@poettering.net ) wrote: before we coldplug a unit, we should coldplug all units it might trigger, which are those with a listed UNIT_TRIGGERS dependency, as well as all those that retroactively_start_dependencies() and retroactively_stop_dependencies() operates on. Of course, we should also avoid running in loops here, but that should be easy by keeping a per-unit coldplug boolean around. Actually, it really is about the UNIT_TRIGGERS dependencies only, since we don't do the retroactive deps stuff at all when we are coldplugging, it's conditionalized in m-n_reloading = 0. I have implemented this now in git: http://cgit.freedesktop.org/systemd/systemd/commit/?id=f78f265f405a61 387c6c12a879ac0d6b6dc958db Ivan, any chance you can check if this fixes your issue? (Not sure it does, because I must admit I am not entirely sure I really understood it fully...) Seems like it didn't help. I use the following patch to alter coldplugging order slightly (it's a hashmap, so order is actually arbitrary, so this alteration is valid): cut patch here diff --git a/src/core/manager.c b/src/core/manager.c index f13dad5..542dd4f 100644 --- a/src/core/manager.c +++ b/src/core/manager.c @@ -975,6 +975,10 @@ int manager_enumerate(Manager *m) { return r; } +static bool coldplug_first(Unit *u) { +return !(endswith(u-id, .service) || endswith(u-id, .target)); +} + static void manager_coldplug(Manager *m) { Iterator i; Unit *u; @@ -990,6 +994,26 @@ static void manager_coldplug(Manager *m) { if (u-id != k) continue; +/* we need a reproducer */ +if (!coldplug_first(u)) +continue; + +r = unit_coldplug(u); +if (r 0) +log_warning_errno(r, We couldn't coldplug %s, proceeding anyway: %m, u-id); +} + +/* Process remaining units. */ +HASHMAP_FOREACH_KEY(u, k, m-units, i) { + +/* ignore aliases */ +if (u-id != k) +continue; + +/* skip already processed units */ +if (coldplug_first(u)) +continue; + r = unit_coldplug(u); if (r 0) log_warning_errno(r, We couldn't coldplug %s, proceeding anyway: %m, u-id); cut patch here With this patch applied, on `systemctl daemon-reload` I get the following: cut log here 2015-04-24T19:42:05+0300 intelfx-laptop sudo[15870]: intelfx : TTY=pts/3 ; PWD=/home/intelfx/tmp/build/systemd ; USER=root ; COMMAND=/usr/bin/systemctl daemon-reload 2015-04-24T19:42:05+0300 intelfx-laptop sudo[15870]: pam_unix(sudo:session): session opened for user root by intelfx(uid=0) 2015-04-24T19:42:05+0300 intelfx-laptop polkitd[8629]: Registered Authentication Agent for unix-process:15871:1490725 (system bus name :1.239 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale ru_RU.utf8) 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Reloading. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on /dev/initctl Compatibility Named Pipe. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device LITEONIT_LSS-16L6G EFI. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on fsck to fsckd communication Socket. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Set up automount var-lib-pacman-sync.automount. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Started Daily Cleanup of Temporary Directories. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Listening on RPCbind Server Activation Socket. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 linux-build. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device LITEONIT_LSS-16L6G EFI. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Activated swap Swap Partition. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 datastore0. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device WDC_WD10JPVX-08JC3T5 linux-build. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Mounted POSIX Message Queue File System. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Created slice System Slice. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Mounted /home. 2015-04-24T19:42:05+0300 intelfx-laptop systemd[1]: Found device
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-04-24 at 20:19 +0200, Lennart Poettering wrote: On Fri, 24.04.15 20:46, Ivan Shapovalov (intelfx...@gmail.com) wrote: On 2015-04-24 at 19:13 +0200, Lennart Poettering wrote: On Fri, 24.04.15 20:06, Ivan Shapovalov (intelfx...@gmail.com) wrote: With this patch applied, on `systemctl daemon-reload` I get the following: Any chance you can do the same with debugging on? systemd -analyze set-log-level debug right before the daemon-reload? That should show the transaction being queued in. Sure, I've ran it (log attached), but well... it did not show any new jobs being enqueued. But alsa-restore.service _did_ run and did reset my ALSA volume to the bootup value. Pretty confused, Note that starting services is recursive: if a service is enqueued, then we add all its dependencies to the transaction, verify that the transaction is without cycles and can be applied, and then actually apply it. This means, that starting a service foo.service, that requires bar.target, that requires waldo.service, will mean that waldo.service is also started, even if bar.target is already started anyway. Judging by current master, this is not the case. I've created a pair of throw-away services and a target in the described configuration, and dependencies of an already started target are not started again. I think the status quo is correct because activating an already activated target is a no-op. Anyway, this is orthogonal. The issue at hand is that the core looks at the state of not-yet-coldplugged units... -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
FTR, this has now been applied by Zbigniew. On Thu, Mar 5, 2015 at 7:07 PM, Ivan Shapovalov intelfx...@gmail.com wrote: On 2015-02-28 at 00:50 +0300, Ivan Shapovalov wrote: On 2015-02-27 at 22:25 +0100, Zbigniew Jędrzejewski-Szmek wrote: On Wed, Feb 25, 2015 at 09:40:23PM +0300, Ivan Shapovalov wrote: Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 --- v2: set waiting state on path/timer units after deferring the actual coldplug, so that we won't run into the exactly same problem during processing the deferred entries. This looks good. I seems to be the correct thing to do independently of the idea to split device states into three with the new pending state. Let's see what Lennart thinks though. Hmm.. This does not relate to the ongoing discussion about adding a third state for .device units. This is about coldplugging .path and .timer units during reloads. Ping? I don't want to miss v220 as well :) -- Ivan Shapovalov / intelfx / ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-02-28 at 00:50 +0300, Ivan Shapovalov wrote: On 2015-02-27 at 22:25 +0100, Zbigniew Jędrzejewski-Szmek wrote: On Wed, Feb 25, 2015 at 09:40:23PM +0300, Ivan Shapovalov wrote: Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 --- v2: set waiting state on path/timer units after deferring the actual coldplug, so that we won't run into the exactly same problem during processing the deferred entries. This looks good. I seems to be the correct thing to do independently of the idea to split device states into three with the new pending state. Let's see what Lennart thinks though. Hmm.. This does not relate to the ongoing discussion about adding a third state for .device units. This is about coldplugging .path and .timer units during reloads. Ping? I don't want to miss v220 as well :) -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On Wed, Feb 25, 2015 at 09:40:23PM +0300, Ivan Shapovalov wrote: Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 --- v2: set waiting state on path/timer units after deferring the actual coldplug, so that we won't run into the exactly same problem during processing the deferred entries. This looks good. I seems to be the correct thing to do independently of the idea to split device states into three with the new pending state. Let's see what Lennart thinks though. Zbyszek src/core/automount.c | 2 +- src/core/busname.c | 2 +- src/core/device.c| 2 +- src/core/manager.c | 33 - src/core/mount.c | 2 +- src/core/path.c | 14 ++ src/core/scope.c | 2 +- src/core/service.c | 2 +- src/core/slice.c | 2 +- src/core/snapshot.c | 2 +- src/core/socket.c| 2 +- src/core/swap.c | 2 +- src/core/target.c| 2 +- src/core/timer.c | 14 ++ src/core/unit.c | 25 - src/core/unit.h | 12 +--- 16 files changed, 88 insertions(+), 32 deletions(-) diff --git a/src/core/automount.c b/src/core/automount.c index 4a509ef..0539fbb 100644 --- a/src/core/automount.c +++ b/src/core/automount.c @@ -233,7 +233,7 @@ static void automount_set_state(Automount *a, AutomountState state) { unit_notify(UNIT(a), state_translation_table[old_state], state_translation_table[state], true); } -static int automount_coldplug(Unit *u) { +static int automount_coldplug(Unit *u, Hashmap *deferred_work) { Automount *a = AUTOMOUNT(u); int r; diff --git a/src/core/busname.c b/src/core/busname.c index 1d77292..43d7607 100644 --- a/src/core/busname.c +++ b/src/core/busname.c @@ -335,7 +335,7 @@ static void busname_set_state(BusName *n, BusNameState state) { unit_notify(UNIT(n), state_translation_table[old_state], state_translation_table[state], true); } -static int busname_coldplug(Unit *u) { +static int busname_coldplug(Unit *u, Hashmap *deferred_work) { BusName *n = BUSNAME(u); int r; diff --git a/src/core/device.c b/src/core/device.c index 2d983cc..70c2233 100644 --- a/src/core/device.c +++ b/src/core/device.c @@ -104,7 +104,7 @@ static void device_set_state(Device *d, DeviceState state) { unit_notify(UNIT(d), state_translation_table[old_state], state_translation_table[state], true); } -static int device_coldplug(Unit *u) { +static int device_coldplug(Unit *u, Hashmap *deferred_work) { Device *d = DEVICE(u); assert(d); diff --git a/src/core/manager.c b/src/core/manager.c index 79a9d54..239c8bb 100644 --- a/src/core/manager.c +++ b/src/core/manager.c @@ -975,8 +975,29 @@ static int manager_coldplug(Manager *m) { Unit *u; char *k; +/* + * Some unit types tend to spawn jobs or check other units' state + * during coldplug. This is wrong because it is undefined whether the + * units in question have been already coldplugged (i. e. their state + * restored). This way, we can easily re-start an already started unit + * or otherwise make a wrong decision based on the unit's state. + * + * Solve this by providing a way for coldplug functions to defer + * such actions until after all units have been coldplugged. + * + * We store Unit* - int(*)(Unit*). + * + * https://bugs.freedesktop.org/show_bug.cgi?id=88401 + */ +_cleanup_hashmap_free_ Hashmap *deferred_work = NULL; +int(*proc)(Unit*); + assert(m); +deferred_work = hashmap_new(trivial_hash_ops); +if (!deferred_work) +return -ENOMEM; + /* Then, let's set up their initial state. */ HASHMAP_FOREACH_KEY(u, k, m-units, i) { int q; @@ -985,7 +1006,17 @@ static int manager_coldplug(Manager *m) { if (u-id != k) continue; -q = unit_coldplug(u); +q = unit_coldplug(u, deferred_work); +if (q 0) +r = q; +} + +/* After coldplugging and setting up initial state of the units, + * let's perform operations which spawn jobs or query units' state. */ +HASHMAP_FOREACH_KEY(proc, u, deferred_work, i) { +int q; + +q = proc(u); if (q 0) r = q; } diff --git
Re: [systemd-devel] [PATCHv2] core: do not spawn jobs or touch other units during coldplugging
On 2015-02-27 at 22:25 +0100, Zbigniew Jędrzejewski-Szmek wrote: On Wed, Feb 25, 2015 at 09:40:23PM +0300, Ivan Shapovalov wrote: Because the order of coldplugging is not defined, we can reference a not-yet-coldplugged unit and read its state while it has not yet been set to a meaningful value. This way, already active units may get started again. We fix this by deferring such actions until all units have been at least somehow coldplugged. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=88401 --- v2: set waiting state on path/timer units after deferring the actual coldplug, so that we won't run into the exactly same problem during processing the deferred entries. This looks good. I seems to be the correct thing to do independently of the idea to split device states into three with the new pending state. Let's see what Lennart thinks though. Hmm.. This does not relate to the ongoing discussion about adding a third state for .device units. This is about coldplugging .path and .timer units during reloads. -- Ivan Shapovalov / intelfx / signature.asc Description: This is a digitally signed message part ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel