Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue, Apr 17, 2018 at 05:37:46PM +0200, Petr Mladek wrote: > On Mon 2018-04-16 14:04:25, Josh Poimboeuf wrote: > > On Mon, Apr 16, 2018 at 04:58:11PM +0200, Petr Mladek wrote: > > > On Wed 2018-04-11 10:48:52, Josh Poimboeuf wrote: > > > > On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote: > > > > > Second, unrelated patches must never patch the same functions. > > > > > Otherwise we would not be able to define which implementation > > > > > should be used. This is especially important when a patch is > > > > > removed and we need to fallback either to another patch or > > > > > original code. Yes, it makes perfect sense. But it needs code > > > > > that will check it, refuse loading the patch, ... It is not > > > > > complicated. But it is rather additional code than > > > > > simplification. I might make livepatching more safe > > > > > but probably not simplify the code. > > > > > > > > We don't need to enforce that. The func stack can stay. If somebody > > > > wants to patch the same function multiple times (without using > > > > 'replace'), that's inadvisable, but it's also their business. They're > > > > responsible for the tooling to ensure the patch stack order is sane. > > > > > > > > > While it might make sense to ignore the patch stack (ordering) for > > > the enable operation. Do we really want to ignore it when disabling > > > a patch. > > > > > > By other words, do we want to allow disabling a patch that is in > > > the middle of the stack, only partly in use? Does not this allow > > > some other crazy scenarios? Is it really the user business? > > > Will it make our life easier? > > > > If there's no longer a patch stack, then there's no concept of a middle. > > We would expect the patches to be independent of one another, and so > > disabling any of them independently would be harmless. > > > > If any of the patches share a func, and the user disables one in the > > "middle", it's not our job to support that. The vendor / patch author > > should prevent such cases from occurring with tooling, packaging, > > documentation, etc. Or they can just use 'replace'. > > > > We can already have similar unexpected situations today. For example, > > what if patch B is a cumulative superset of patch A, but the user > > mistakenly loads patch A (without replace) *after* loading patch B? > > Then some unforeseen craziness could ensue. > > > > We can't control all such scenarios (and that's ok), but we shouldn't > > pretend that we support them. > > I recall some earlier discussion. The aim was to make livepatching > more safe. AFAIK, it was more related to the helper tooling around, > like "automatic" generation of the special relocation entries, ... > Anyway, I feel that the above gets somehow against it. > > The patch stack does not solve everything. But it makes it harder > to do wrong things. It also makes it harder to do _right_ things. For example, disabling a patch in the "middle" should be allowable if there are no patch dependencies. > > > This will not happen if we refuse to load non-replace patches > > > that touch an already patches fucntion. Then the patch stack > > > might become only implementation detail. It will not define > > > the ordering any longer. > > > > I think this would only be a partial solution. Patches can have > > implicit interdependencies, even if they don't patch the same function. > > Also it doesn't solve the problem when patches are loaded in the wrong > > order. We have to trust vendors and admins to do the right thing. > > I would still like to add this check if we remove the stack. > Is it too restrictive? Maybe. But it would help to prevent > creating some ugly mistakes. By other words, we will force > people using proper cumulative patches instead of > dangerous semi-cumulative Frankenstains. > > At least, it would reduce the number of scenarios that > we might meet and eventually need to debug. But the non-replace cumulative model can be perfectly safe, if the packaging/tooling supports the right workflow. I think we're talking about theoretical situations anyway. I'd rather not add any code for theoretical problems. If somebody reports a bug due to dangerous real world usage, then we can decide how to handle it at that point -- though I suspect my response will be "don't do that!" > > > > > > > > > Another possibility would be to get rid of the enable/disable > > > > > > > > > states. > > > > > > > > > I mean that the patch will be automatically enabled during > > > > > > > > > registration and removed during unregistration. > > > > > > OK. What about the following solution? > > > > > >+ Enable patch directly from klp_register_patch() > > >+ Unregister the patch directly from klp_complete_transition() > > > when the patch is disabled. > > >+ Put the module later when all sysfs entries are removed > > > in klp_unregister_patch(). > > > > > > As a result: > > > > > >+ module_init() will need t
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Mon 2018-04-16 14:04:25, Josh Poimboeuf wrote: > On Mon, Apr 16, 2018 at 04:58:11PM +0200, Petr Mladek wrote: > > On Wed 2018-04-11 10:48:52, Josh Poimboeuf wrote: > > > On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote: > > > > Second, unrelated patches must never patch the same functions. > > > > Otherwise we would not be able to define which implementation > > > > should be used. This is especially important when a patch is > > > > removed and we need to fallback either to another patch or > > > > original code. Yes, it makes perfect sense. But it needs code > > > > that will check it, refuse loading the patch, ... It is not > > > > complicated. But it is rather additional code than > > > > simplification. I might make livepatching more safe > > > > but probably not simplify the code. > > > > > > We don't need to enforce that. The func stack can stay. If somebody > > > wants to patch the same function multiple times (without using > > > 'replace'), that's inadvisable, but it's also their business. They're > > > responsible for the tooling to ensure the patch stack order is sane. > > > > > > While it might make sense to ignore the patch stack (ordering) for > > the enable operation. Do we really want to ignore it when disabling > > a patch. > > > > By other words, do we want to allow disabling a patch that is in > > the middle of the stack, only partly in use? Does not this allow > > some other crazy scenarios? Is it really the user business? > > Will it make our life easier? > > If there's no longer a patch stack, then there's no concept of a middle. > We would expect the patches to be independent of one another, and so > disabling any of them independently would be harmless. > > If any of the patches share a func, and the user disables one in the > "middle", it's not our job to support that. The vendor / patch author > should prevent such cases from occurring with tooling, packaging, > documentation, etc. Or they can just use 'replace'. > > We can already have similar unexpected situations today. For example, > what if patch B is a cumulative superset of patch A, but the user > mistakenly loads patch A (without replace) *after* loading patch B? > Then some unforeseen craziness could ensue. > > We can't control all such scenarios (and that's ok), but we shouldn't > pretend that we support them. I recall some earlier discussion. The aim was to make livepatching more safe. AFAIK, it was more related to the helper tooling around, like "automatic" generation of the special relocation entries, ... Anyway, I feel that the above gets somehow against it. The patch stack does not solve everything. But it makes it harder to do wrong things. > > This will not happen if we refuse to load non-replace patches > > that touch an already patches fucntion. Then the patch stack > > might become only implementation detail. It will not define > > the ordering any longer. > > I think this would only be a partial solution. Patches can have > implicit interdependencies, even if they don't patch the same function. > Also it doesn't solve the problem when patches are loaded in the wrong > order. We have to trust vendors and admins to do the right thing. I would still like to add this check if we remove the stack. Is it too restrictive? Maybe. But it would help to prevent creating some ugly mistakes. By other words, we will force people using proper cumulative patches instead of dangerous semi-cumulative Frankenstains. At least, it would reduce the number of scenarios that we might meet and eventually need to debug. > > > > > > > > Another possibility would be to get rid of the enable/disable > > > > > > > > states. > > > > > > > > I mean that the patch will be automatically enabled during > > > > > > > > registration and removed during unregistration. > > > > OK. What about the following solution? > > > >+ Enable patch directly from klp_register_patch() > >+ Unregister the patch directly from klp_complete_transition() > > when the patch is disabled. > >+ Put the module later when all sysfs entries are removed > > in klp_unregister_patch(). > > > > As a result: > > > >+ module_init() will need to call only klp_register_patch() > >+ module_exit() will do nothing > >+ the module can be removed only when it is not longer needed > > Sounds good to me. It should be consistent with sysfs interface, see below. > > Some other ideas: > > > >+ rename /sys/kernel/livepatch//enable -> unregister > > allow to write into /sys/kernel/livepatch//transition > > > > + echo 1 >unregistrer to disable&unregister the patch > > + echo 0 >transition to revert the running transition > > Why not keep the existing sysfs interfaces? So > > echo 0 > enable > > would disable (and eventually unregister) the patch. First, it would be a bit inconsistent. The patch would be added by calling "register" function and removed by writing into "enabled" file.
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
> > > > Second, unrelated patches must never patch the same functions. > > > > Otherwise we would not be able to define which implementation > > > > should be used. This is especially important when a patch is > > > > removed and we need to fallback either to another patch or > > > > original code. Yes, it makes perfect sense. But it needs code > > > > that will check it, refuse loading the patch, ... It is not > > > > complicated. But it is rather additional code than > > > > simplification. I might make livepatching more safe > > > > but probably not simplify the code. > > > > > > We don't need to enforce that. The func stack can stay. If somebody > > > wants to patch the same function multiple times (without using > > > 'replace'), that's inadvisable, but it's also their business. They're > > > responsible for the tooling to ensure the patch stack order is sane. > > > > > > While it might make sense to ignore the patch stack (ordering) for > > the enable operation. Do we really want to ignore it when disabling > > a patch. > > > > By other words, do we want to allow disabling a patch that is in > > the middle of the stack, only partly in use? Does not this allow > > some other crazy scenarios? Is it really the user business? > > Will it make our life easier? > > If there's no longer a patch stack, then there's no concept of a middle. > We would expect the patches to be independent of one another, and so > disabling any of them independently would be harmless. > > If any of the patches share a func, and the user disables one in the > "middle", it's not our job to support that. The vendor / patch author > should prevent such cases from occurring with tooling, packaging, > documentation, etc. Or they can just use 'replace'. > > We can already have similar unexpected situations today. For example, > what if patch B is a cumulative superset of patch A, but the user > mistakenly loads patch A (without replace) *after* loading patch B? > Then some unforeseen craziness could ensue. > > We can't control all such scenarios (and that's ok), but we shouldn't > pretend that we support them. FWIW I agree with the above. Provided we keep the func stack. [...] > > The question is what to do with the stack of patches. It will have > > no meaning for the enable operation because it will be done > > automatically. But what about the disable/unregistrer operation? > > Assuming we got rid of the patch stack, would we even need to keep a > global list of patches anymore? There are places we walk through the list of patches (klp_add_nops() in this patch set, klp_module_coming()), so probably yes. Miroslav
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Mon, Apr 16, 2018 at 04:58:11PM +0200, Petr Mladek wrote: > On Wed 2018-04-11 10:48:52, Josh Poimboeuf wrote: > > On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote: > > > > I still agree with my original conclusion that enforcing stack order no > > > > longer makes sense though. > > > > > > The question is what we will get if we remove the stack. Will it > > > really make the code easier and livepatching more safe? > > > > > > First, note that stack is the main trick used by the ftrace handler. > > > It gets the patch from top of the stack and use the new_addr from > > > that patch. > > > > > > If we remove the stack, we still need to handle 3 possibilities. > > > The handler will need to continue either with original_code, > > > old_patch or new_patch. > > > > > > Now, just imagine the code. We will need variables orig_addr, > > > old_addr, new_addr or so which might be confusing. It will be > > > even more confusion if we do revert/disable. Also new_addr will > > > become old_addr if we install yet another patch. > > > > > > We had exactly this in kGraft and it was a mess. I said "wow, > > > that is genius" when I saw the stack approach in the upstream > > > code proposal. > > > > You're confusing the func stack with the patch stack. > > > > My proposal is to get rid of the patch stack. > > The are related from my POV. The patches have the same ordering > in both patch and function stacks. The patch ordering is checked > when the patches are enabled and disabled. True. > > We can keep the func stack. It will be needed anyway for the 'replace' > > case, where the stack may be of size 2. > > OK, you want to keep the func stack because it is useful from the > implementation point of view. > > > > > Second, unrelated patches must never patch the same functions. > > > Otherwise we would not be able to define which implementation > > > should be used. This is especially important when a patch is > > > removed and we need to fallback either to another patch or > > > original code. Yes, it makes perfect sense. But it needs code > > > that will check it, refuse loading the patch, ... It is not > > > complicated. But it is rather additional code than > > > simplification. I might make livepatching more safe > > > but probably not simplify the code. > > > > We don't need to enforce that. The func stack can stay. If somebody > > wants to patch the same function multiple times (without using > > 'replace'), that's inadvisable, but it's also their business. They're > > responsible for the tooling to ensure the patch stack order is sane. > > > While it might make sense to ignore the patch stack (ordering) for > the enable operation. Do we really want to ignore it when disabling > a patch. > > By other words, do we want to allow disabling a patch that is in > the middle of the stack, only partly in use? Does not this allow > some other crazy scenarios? Is it really the user business? > Will it make our life easier? If there's no longer a patch stack, then there's no concept of a middle. We would expect the patches to be independent of one another, and so disabling any of them independently would be harmless. If any of the patches share a func, and the user disables one in the "middle", it's not our job to support that. The vendor / patch author should prevent such cases from occurring with tooling, packaging, documentation, etc. Or they can just use 'replace'. We can already have similar unexpected situations today. For example, what if patch B is a cumulative superset of patch A, but the user mistakenly loads patch A (without replace) *after* loading patch B? Then some unforeseen craziness could ensue. We can't control all such scenarios (and that's ok), but we shouldn't pretend that we support them. > This will not happen if we refuse to load non-replace patches > that touch an already patches fucntion. Then the patch stack > might become only implementation detail. It will not define > the ordering any longer. I think this would only be a partial solution. Patches can have implicit interdependencies, even if they don't patch the same function. Also it doesn't solve the problem when patches are loaded in the wrong order. We have to trust vendors and admins to do the right thing. > > > > > > > Another possibility would be to get rid of the enable/disable > > > > > > > states. > > > > > > > I mean that the patch will be automatically enabled during > > > > > > > registration and removed during unregistration. > > > > > > > > > > > > I don't see how disabling during unregistration would be possible, > > > > > > since > > > > > > the unregister is called from the patch module exit function, which > > > > > > can only be called *after* the patch is disabled. > > > > > > > > > > > > However, we could unregister immediately after disabling (i.e., in > > > > > > enabled_store context). > > > > > > > > > > I think this is what Petr meant. So there would be nothing in the > > >
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed 2018-04-11 10:48:52, Josh Poimboeuf wrote: > On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote: > > > I still agree with my original conclusion that enforcing stack order no > > > longer makes sense though. > > > > The question is what we will get if we remove the stack. Will it > > really make the code easier and livepatching more safe? > > > > First, note that stack is the main trick used by the ftrace handler. > > It gets the patch from top of the stack and use the new_addr from > > that patch. > > > > If we remove the stack, we still need to handle 3 possibilities. > > The handler will need to continue either with original_code, > > old_patch or new_patch. > > > > Now, just imagine the code. We will need variables orig_addr, > > old_addr, new_addr or so which might be confusing. It will be > > even more confusion if we do revert/disable. Also new_addr will > > become old_addr if we install yet another patch. > > > > We had exactly this in kGraft and it was a mess. I said "wow, > > that is genius" when I saw the stack approach in the upstream > > code proposal. > > You're confusing the func stack with the patch stack. > > My proposal is to get rid of the patch stack. The are related from my POV. The patches have the same ordering in both patch and function stacks. The patch ordering is checked when the patches are enabled and disabled. > We can keep the func stack. It will be needed anyway for the 'replace' > case, where the stack may be of size 2. OK, you want to keep the func stack because it is useful from the implementation point of view. > > Second, unrelated patches must never patch the same functions. > > Otherwise we would not be able to define which implementation > > should be used. This is especially important when a patch is > > removed and we need to fallback either to another patch or > > original code. Yes, it makes perfect sense. But it needs code > > that will check it, refuse loading the patch, ... It is not > > complicated. But it is rather additional code than > > simplification. I might make livepatching more safe > > but probably not simplify the code. > > We don't need to enforce that. The func stack can stay. If somebody > wants to patch the same function multiple times (without using > 'replace'), that's inadvisable, but it's also their business. They're > responsible for the tooling to ensure the patch stack order is sane. While it might make sense to ignore the patch stack (ordering) for the enable operation. Do we really want to ignore it when disabling a patch. By other words, do we want to allow disabling a patch that is in the middle of the stack, only partly in use? Does not this allow some other crazy scenarios? Is it really the user business? Will it make our life easier? This will not happen if we refuse to load non-replace patches that touch an already patches fucntion. Then the patch stack might become only implementation detail. It will not define the ordering any longer. > > > > > > Another possibility would be to get rid of the enable/disable > > > > > > states. > > > > > > I mean that the patch will be automatically enabled during > > > > > > registration and removed during unregistration. > > > > > > > > > > I don't see how disabling during unregistration would be possible, > > > > > since > > > > > the unregister is called from the patch module exit function, which > > > > > can only be called *after* the patch is disabled. > > > > > > > > > > However, we could unregister immediately after disabling (i.e., in > > > > > enabled_store context). > > > > > > > > I think this is what Petr meant. So there would be nothing in the patch > > > > module exit function. Well, not exactly. We'd need to remove sysfs dir > > > > and > > > > maybe something more. > > > > > > Sounds good to me, though aren't the livepatch sysfs entries removed by > > > klp during unregister? > > > > This is why I asked in my earlier mail if we need to keep sysfs > > entries for unused patches. > > If they are permanently disabled then I think the sysfs entries should > be removed. > > > We could remove them when the patch gets disabled (transition > > finishes). Then we do not need to do anything in module_exit(). > > Agreed, though what happens if the transition finishes while still > running in the context of the write to the sysfs entry? Can we remove a > sysfs entry while executing code associated with it? I would need to test it to be sure. But I believe that it will be possible to remove sysfs entry from its own callback. All this code heavily uses reference counters and callbacks called when the count reaches zero. OK. What about the following solution? + Enable patch directly from klp_register_patch() + Unregister the patch directly from klp_complete_transition() when the patch is disabled. + Put the module later when all sysfs entries are removed in klp_unregister_patch(). As a result: + module_init() will
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote: > > I still agree with my original conclusion that enforcing stack order no > > longer makes sense though. > > The question is what we will get if we remove the stack. Will it > really make the code easier and livepatching more safe? > > First, note that stack is the main trick used by the ftrace handler. > It gets the patch from top of the stack and use the new_addr from > that patch. > > If we remove the stack, we still need to handle 3 possibilities. > The handler will need to continue either with original_code, > old_patch or new_patch. > > Now, just imagine the code. We will need variables orig_addr, > old_addr, new_addr or so which might be confusing. It will be > even more confusion if we do revert/disable. Also new_addr will > become old_addr if we install yet another patch. > > We had exactly this in kGraft and it was a mess. I said "wow, > that is genius" when I saw the stack approach in the upstream > code proposal. You're confusing the func stack with the patch stack. My proposal is to get rid of the patch stack. We can keep the func stack. It will be needed anyway for the 'replace' case, where the stack may be of size 2. > Second, unrelated patches must never patch the same functions. > Otherwise we would not be able to define which implementation > should be used. This is especially important when a patch is > removed and we need to fallback either to another patch or > original code. Yes, it makes perfect sense. But it needs code > that will check it, refuse loading the patch, ... It is not > complicated. But it is rather additional code than > simplification. I might make livepatching more safe > but probably not simplify the code. We don't need to enforce that. The func stack can stay. If somebody wants to patch the same function multiple times (without using 'replace'), that's inadvisable, but it's also their business. They're responsible for the tooling to ensure the patch stack order is sane. > > > > > Another possibility would be to get rid of the enable/disable states. > > > > > I mean that the patch will be automatically enabled during > > > > > registration and removed during unregistration. > > > > > > > > I don't see how disabling during unregistration would be possible, since > > > > the unregister is called from the patch module exit function, which > > > > can only be called *after* the patch is disabled. > > > > > > > > However, we could unregister immediately after disabling (i.e., in > > > > enabled_store context). > > > > > > I think this is what Petr meant. So there would be nothing in the patch > > > module exit function. Well, not exactly. We'd need to remove sysfs dir > > > and > > > maybe something more. > > > > Sounds good to me, though aren't the livepatch sysfs entries removed by > > klp during unregister? > > This is why I asked in my earlier mail if we need to keep sysfs > entries for unused patches. If they are permanently disabled then I think the sysfs entries should be removed. > We could remove them when the patch gets disabled (transition > finishes). Then we do not need to do anything in module_exit(). Agreed, though what happens if the transition finishes while still running in the context of the write to the sysfs entry? Can we remove a sysfs entry while executing code associated with it? -- Josh
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed 2018-04-11 07:32:14, Josh Poimboeuf wrote: > On Wed, Apr 11, 2018 at 10:07:31AM +0200, Miroslav Benes wrote: > > > > I was confused by wording "in the middle". It suggested that there > > > > might had been enabled patches on the top and the bottom of the stack > > > > and some disabled patches in between at the same time (or vice versa). > > > > This was not true. > > > > > > That *was* what I meant. Consider the following sequence of events: > > > > > > - Register patch 1 > > > - Enable patch 1 > > > - Register patch 2 > > > - Enable patch 2 > > > - Disable patch 2 > > > - Register patch 3 > > > - Enable patch 3 > > > > > > Notice that patch 2 (in the middle) is disabled, whereas patch 1 (on the > > > bottom) and patch 3 (on the top) are enabled. > > > > This should not be possible at all. > > > > __klp_enable_patch: > > > > if (patch->list.prev != &klp_patches && > > !list_prev_entry(patch, list)->enabled) > > return -EBUSY; > > > > When patch 3 is enabled, list_prev_entry() returns patch 2 and its > > ->enabled is false. > > Hm, you're right. I'm not sure how I got that idea... It is great that we finally understand each other. > I still agree with my original conclusion that enforcing stack order no > longer makes sense though. The question is what we will get if we remove the stack. Will it really make the code easier and livepatching more safe? First, note that stack is the main trick used by the ftrace handler. It gets the patch from top of the stack and use the new_addr from that patch. If we remove the stack, we still need to handle 3 possibilities. The handler will need to continue either with original_code, old_patch or new_patch. Now, just imagine the code. We will need variables orig_addr, old_addr, new_addr or so which might be confusing. It will be even more confusion if we do revert/disable. Also new_addr will become old_addr if we install yet another patch. We had exactly this in kGraft and it was a mess. I said "wow, that is genius" when I saw the stack approach in the upstream code proposal. Second, unrelated patches must never patch the same functions. Otherwise we would not be able to define which implementation should be used. This is especially important when a patch is removed and we need to fallback either to another patch or original code. Yes, it makes perfect sense. But it needs code that will check it, refuse loading the patch, ... It is not complicated. But it is rather additional code than simplification. I might make livepatching more safe but probably not simplify the code. > > > > Another possibility would be to get rid of the enable/disable states. > > > > I mean that the patch will be automatically enabled during > > > > registration and removed during unregistration. > > > > > > I don't see how disabling during unregistration would be possible, since > > > the unregister is called from the patch module exit function, which > > > can only be called *after* the patch is disabled. > > > > > > However, we could unregister immediately after disabling (i.e., in > > > enabled_store context). > > > > I think this is what Petr meant. So there would be nothing in the patch > > module exit function. Well, not exactly. We'd need to remove sysfs dir and > > maybe something more. > > Sounds good to me, though aren't the livepatch sysfs entries removed by > klp during unregister? This is why I asked in my earlier mail if we need to keep sysfs entries for unused patches. We could remove them when the patch gets disabled (transition finishes). Then we do not need to do anything in module_exit(). The patch module can be removed only when the transition is not forced and we call module_put(). Best Regards, Petr
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed, 11 Apr 2018, Josh Poimboeuf wrote: > On Wed, Apr 11, 2018 at 10:07:31AM +0200, Miroslav Benes wrote: > > > > I was confused by wording "in the middle". It suggested that there > > > > might had been enabled patches on the top and the bottom of the stack > > > > and some disabled patches in between at the same time (or vice versa). > > > > This was not true. > > > > > > That *was* what I meant. Consider the following sequence of events: > > > > > > - Register patch 1 > > > - Enable patch 1 > > > - Register patch 2 > > > - Enable patch 2 > > > - Disable patch 2 > > > - Register patch 3 > > > - Enable patch 3 > > > > > > Notice that patch 2 (in the middle) is disabled, whereas patch 1 (on the > > > bottom) and patch 3 (on the top) are enabled. > > > > This should not be possible at all. > > > > __klp_enable_patch: > > > > if (patch->list.prev != &klp_patches && > > !list_prev_entry(patch, list)->enabled) > > return -EBUSY; > > > > When patch 3 is enabled, list_prev_entry() returns patch 2 and its > > ->enabled is false. > > Hm, you're right. I'm not sure how I got that idea... > > I still agree with my original conclusion that enforcing stack order no > longer makes sense though. Frankly I cannot say. I have got no opinion on this, so if there is a patch to remove it, I won't oppose it (probably). I just think it is connected to the atomic replace patch set. > > > > Another possibility would be to get rid of the enable/disable states. > > > > I mean that the patch will be automatically enabled during > > > > registration and removed during unregistration. > > > > > > I don't see how disabling during unregistration would be possible, since > > > the unregister is called from the patch module exit function, which > > > can only be called *after* the patch is disabled. > > > > > > However, we could unregister immediately after disabling (i.e., in > > > enabled_store context). > > > > I think this is what Petr meant. So there would be nothing in the patch > > module exit function. Well, not exactly. We'd need to remove sysfs dir and > > maybe something more. > > Sounds good to me, though aren't the livepatch sysfs entries removed by > klp during unregister? Yes. I was thinking we may take something out of disable+unregister step and leave it in the exit function, but maybe there is nothing like that. > > > > The question is what is acceptable to others > > > > > > If there are any objections, this is their chance to speak up :-) > > > > > > > and if it needs to be done as part of this patch set. > > > > > > Maybe so, for at least a few reasons: > > > > > > - This patch set makes the 'stack' obsolete, so it makes sense to remove > > > the 'stack' with it. > > > > Not necessarily. I like Petr's rebase explanation here. > > I'm not sure what you mean. IIRC, his rebase explanation referred to > how we handle 'replace' patches, for which there is no stacking (as I > meant the term: enforcement of stack order for registration and > enablement). See above. I don't completely agree with the obsoleteness part. I don't see it that way. But it is moot. I'm not against the enforcement removal. Regards, Miroslav
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed, Apr 11, 2018 at 10:07:31AM +0200, Miroslav Benes wrote: > > > I was confused by wording "in the middle". It suggested that there > > > might had been enabled patches on the top and the bottom of the stack > > > and some disabled patches in between at the same time (or vice versa). > > > This was not true. > > > > That *was* what I meant. Consider the following sequence of events: > > > > - Register patch 1 > > - Enable patch 1 > > - Register patch 2 > > - Enable patch 2 > > - Disable patch 2 > > - Register patch 3 > > - Enable patch 3 > > > > Notice that patch 2 (in the middle) is disabled, whereas patch 1 (on the > > bottom) and patch 3 (on the top) are enabled. > > This should not be possible at all. > > __klp_enable_patch: > > if (patch->list.prev != &klp_patches && > !list_prev_entry(patch, list)->enabled) > return -EBUSY; > > When patch 3 is enabled, list_prev_entry() returns patch 2 and its > ->enabled is false. Hm, you're right. I'm not sure how I got that idea... I still agree with my original conclusion that enforcing stack order no longer makes sense though. > > > Another possibility would be to get rid of the enable/disable states. > > > I mean that the patch will be automatically enabled during > > > registration and removed during unregistration. > > > > I don't see how disabling during unregistration would be possible, since > > the unregister is called from the patch module exit function, which > > can only be called *after* the patch is disabled. > > > > However, we could unregister immediately after disabling (i.e., in > > enabled_store context). > > I think this is what Petr meant. So there would be nothing in the patch > module exit function. Well, not exactly. We'd need to remove sysfs dir and > maybe something more. Sounds good to me, though aren't the livepatch sysfs entries removed by klp during unregister? > > > The question is what is acceptable to others > > > > If there are any objections, this is their chance to speak up :-) > > > > > and if it needs to be done as part of this patch set. > > > > Maybe so, for at least a few reasons: > > > > - This patch set makes the 'stack' obsolete, so it makes sense to remove > > the 'stack' with it. > > Not necessarily. I like Petr's rebase explanation here. I'm not sure what you mean. IIRC, his rebase explanation referred to how we handle 'replace' patches, for which there is no stacking (as I meant the term: enforcement of stack order for registration and enablement). -- Josh
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue, 10 Apr 2018, Josh Poimboeuf wrote: > On Tue, Apr 10, 2018 at 10:34:55AM +0200, Petr Mladek wrote: > > > > > > > We were just recently discussing the possibility of not allowing > > > > > > > the > > > > > > > disabling of patches at all. If we're not going that far, let's > > > > > > > at > > > > > > > least further restrict it, for the sanity of our code, so we > > > > > > > don't have > > > > > > > to worry about a nasty mismatched stack of > > > > > > > enabled/disabled/enabled/etc, > > > > > > > at least for the cases where 'replace' patches aren't used. > > > > > > > > > > > > I am not completely sure where all these fears come from. From my > > > > > > point of view, this change is pretty safe and trivial thanks to NOPs > > > > > > and overall design. It would be a shame to do not have it. But I > > > > > > might be blind after spending so much time with the feature. > > > > > > > > > > I think you're missing my point. This isn't about your patch set, per > > > > > se. It's really about our existing code. Today, our patch stack > > > > > doesn't follow real stack semantics, because patches in the middle > > > > > might > > > > > be disabled. I see that as a problem. > > > > > > No, please read it again. I wasn't talking about replaced patches. > > > > I was confused by wording "in the middle". It suggested that there > > might had been enabled patches on the top and the bottom of the stack > > and some disabled patches in between at the same time (or vice versa). > > This was not true. > > That *was* what I meant. Consider the following sequence of events: > > - Register patch 1 > - Enable patch 1 > - Register patch 2 > - Enable patch 2 > - Disable patch 2 > - Register patch 3 > - Enable patch 3 > > Notice that patch 2 (in the middle) is disabled, whereas patch 1 (on the > bottom) and patch 3 (on the top) are enabled. This should not be possible at all. __klp_enable_patch: if (patch->list.prev != &klp_patches && !list_prev_entry(patch, list)->enabled) return -EBUSY; When patch 3 is enabled, list_prev_entry() returns patch 2 and its ->enabled is false. > > Do I understand it correctly that you do not like that the patches > > on the stack might be in two states (disabled/enabled). This might > > be translated that you do not like the state when the patch is > > registered and disabled. > > No, that wasn't really what I meant, but I have often wondered whether > we need such a distinction. > > > I wonder if the problem is in the "stack" abstraction. Would it help > > if we call it "sorted list" or whatever more suitable? > > But I don't even see a reason to have a sorted list (more on that > below). > > > Another possibility would be to get rid of the enable/disable states. > > I mean that the patch will be automatically enabled during > > registration and removed during unregistration. > > I don't see how disabling during unregistration would be possible, since > the unregister is called from the patch module exit function, which > can only be called *after* the patch is disabled. > > However, we could unregister immediately after disabling (i.e., in > enabled_store context). I think this is what Petr meant. So there would be nothing in the patch module exit function. Well, not exactly. We'd need to remove sysfs dir and maybe something more. > > Or we could rename the operation do add/remove or anything else. In > > fact, this is how it worked in kGraft. > > I'm not sure what renaming would solve, unless you mean to combine > registration and enablement into a single concept? Sounds good to me. > > > AFAIK, the enable/disabled state made more sense for immediate > > patches that could not be unloaded. We still need to keep the patches > > when the transaction is forced. The question is if we need to keep > > the sysfs entries for loaded but unused patches. > > I think we wouldn't need the sysfs entries. Just disable/unregister > forced patches like normal, except skipping the module_put(). > > > > Either way, why do we still need a stack? > > > > Good question. I suggest to agree on some terms first: > > > >+ Independent patches make unrelated changes. Any new function > > must not rely on changes done by any other patch. > > > >+ Dependent patches mean that a later patch depend on changes > > done by an earlier patch. For example, a new function might > > use function added by an earlier patch. > > > >+ Each cumulative patch include all necessary changes. I would say > > that it is self-containing and independent. Except that they should > > be able to continue using changes made by earlier patches (shadow > > variables, callbacks). > > > > > > Then we could say: > > > >+ The stack helps to enforce dependencies between dependent > > patches. But there is needed also some external solution > > that forces loading the patches in the right order. > > > >
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue, 10 Apr 2018, Josh Poimboeuf wrote: > > > I agree here. Practically we use only cumulative replacement patches at > > > SUSE. So with that in mind I don't care about the stacking much. But, it > > > may make sense for someone else. Evgenii mentioned they used it for > > > hotfixes. Therefore I'm reluctant to remove it completely. > > > > Well, it was convenient in some cases to provide a hot fix for a given bug > > on top of our official cumulative patch. So far, such fixes were only used > > on a few of the customers' machines (where they were needed ASAP). It just > > made it easier to see where is the common set of fixes and where is the > > customer-specific addition. > > > > I think, we can use cumulative patches in such cases too without much > > additional effort. For example, we can encode the distinction (base set of > > fixes + addition) in the module name or somewhere else. > > > > So, I think, it is fine for us, if stacking support is removed. Especially > > if that makes the implementation of livepatch less complex and more > > reliable. > > Just to clarify, I think we are just proposing the removal of the > enforcement of the stacking order. We will still allow multiple > non-replace patches to be applied. We just won't enforce which patches > can be disabled/removed at any given time. Heh, so I misunderstood. I thought you were talking about the removal of the stacking. Now it makes more sense. > So I think your old way of doing things (individual unrelated patches on > top of a cumulative patch) would still work. Yes. On the other hand the user needs to be even more careful, so I'd expect an update of documentation with the removal :). Miroslav
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
> > I agree here. Practically we use only cumulative replacement patches at > > SUSE. So with that in mind I don't care about the stacking much. But, it > > may make sense for someone else. Evgenii mentioned they used it for > > hotfixes. Therefore I'm reluctant to remove it completely. > > Well, it was convenient in some cases to provide a hot fix for a given bug > on top of our official cumulative patch. So far, such fixes were only used > on a few of the customers' machines (where they were needed ASAP). It just > made it easier to see where is the common set of fixes and where is the > customer-specific addition. > > I think, we can use cumulative patches in such cases too without much > additional effort. For example, we can encode the distinction (base set of > fixes + addition) in the module name or somewhere else. > > So, I think, it is fine for us, if stacking support is removed. Especially > if that makes the implementation of livepatch less complex and more > reliable. Just to clarify, I think we are just proposing the removal of the enforcement of the stacking order. We will still allow multiple non-replace patches to be applied. We just won't enforce which patches can be disabled/removed at any given time. So I think your old way of doing things (individual unrelated patches on top of a cumulative patch) would still work. -- Josh
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue, Apr 10, 2018 at 10:34:55AM +0200, Petr Mladek wrote: > > > > > > We were just recently discussing the possibility of not allowing the > > > > > > disabling of patches at all. If we're not going that far, let's at > > > > > > least further restrict it, for the sanity of our code, so we don't > > > > > > have > > > > > > to worry about a nasty mismatched stack of > > > > > > enabled/disabled/enabled/etc, > > > > > > at least for the cases where 'replace' patches aren't used. > > > > > > > > > > I am not completely sure where all these fears come from. From my > > > > > point of view, this change is pretty safe and trivial thanks to NOPs > > > > > and overall design. It would be a shame to do not have it. But I > > > > > might be blind after spending so much time with the feature. > > > > > > > > I think you're missing my point. This isn't about your patch set, per > > > > se. It's really about our existing code. Today, our patch stack > > > > doesn't follow real stack semantics, because patches in the middle might > > > > be disabled. I see that as a problem. > > > > No, please read it again. I wasn't talking about replaced patches. > > I was confused by wording "in the middle". It suggested that there > might had been enabled patches on the top and the bottom of the stack > and some disabled patches in between at the same time (or vice versa). > This was not true. That *was* what I meant. Consider the following sequence of events: - Register patch 1 - Enable patch 1 - Register patch 2 - Enable patch 2 - Disable patch 2 - Register patch 3 - Enable patch 3 Notice that patch 2 (in the middle) is disabled, whereas patch 1 (on the bottom) and patch 3 (on the top) are enabled. > Do I understand it correctly that you do not like that the patches > on the stack might be in two states (disabled/enabled). This might > be translated that you do not like the state when the patch is > registered and disabled. No, that wasn't really what I meant, but I have often wondered whether we need such a distinction. > I wonder if the problem is in the "stack" abstraction. Would it help > if we call it "sorted list" or whatever more suitable? But I don't even see a reason to have a sorted list (more on that below). > Another possibility would be to get rid of the enable/disable states. > I mean that the patch will be automatically enabled during > registration and removed during unregistration. I don't see how disabling during unregistration would be possible, since the unregister is called from the patch module exit function, which can only be called *after* the patch is disabled. However, we could unregister immediately after disabling (i.e., in enabled_store context). > Or we could rename the operation do add/remove or anything else. In > fact, this is how it worked in kGraft. I'm not sure what renaming would solve, unless you mean to combine registration and enablement into a single concept? Sounds good to me. > AFAIK, the enable/disabled state made more sense for immediate > patches that could not be unloaded. We still need to keep the patches > when the transaction is forced. The question is if we need to keep > the sysfs entries for loaded but unused patches. I think we wouldn't need the sysfs entries. Just disable/unregister forced patches like normal, except skipping the module_put(). > > Either way, why do we still need a stack? > > Good question. I suggest to agree on some terms first: > >+ Independent patches make unrelated changes. Any new function > must not rely on changes done by any other patch. > >+ Dependent patches mean that a later patch depend on changes > done by an earlier patch. For example, a new function might > use function added by an earlier patch. > >+ Each cumulative patch include all necessary changes. I would say > that it is self-containing and independent. Except that they should > be able to continue using changes made by earlier patches (shadow > variables, callbacks). > > > Then we could say: > >+ The stack helps to enforce dependencies between dependent > patches. But there is needed also some external solution > that forces loading the patches in the right order. > >+ The "replace" feature is useful for cumulative patches. > It allows to remove existing changes and remove unused stuff. > But cumulative patches might be done even _without_ the atomic > replace. > >+ Cumulative patches _with_ "replace" flag might be in theory > installed in random order because they handle not-longer patched > functions. Well, some incompatibility might be caused by shadow > variables and callbacks. Therefore it still might make sense > to install them in the right order. > > Cumulative patches _without_ "replace" flag must be installed > in the right order because they do not handle not-longer patched > functions. > > Anyway, for cumulative pa
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On 10.04.2018 16:21, Miroslav Benes wrote: I think you're missing my point. This isn't about your patch set, per se. It's really about our existing code. Today, our patch stack doesn't follow real stack semantics, because patches in the middle might be disabled. I see that as a problem. No, please read it again. I wasn't talking about replaced patches. I was confused by wording "in the middle". It suggested that there might had been enabled patches on the top and the bottom of the stack and some disabled patches in between at the same time (or vice versa). This was not true. Do I understand it correctly that you do not like that the patches on the stack might be in two states (disabled/enabled). This might be translated that you do not like the state when the patch is registered and disabled. I wonder if the problem is in the "stack" abstraction. Would it help if we call it "sorted list" or whatever more suitable? Another possibility would be to get rid of the enable/disable states. I mean that the patch will be automatically enabled during registration and removed during unregistration. Or we could rename the operation do add/remove or anything else. In fact, this is how it worked in kGraft. I've already wondered couple of times why we had separate enable/disable. If there is someone who knows, remind me, please. I wouldn't be against a simplification here. On the other hand, it is kind of nice to keep the registration and enablement separate. It is more flexible if someone needs it. Anyway, we should solve it together with the stacking. It is tightly connected. AFAIK, the enable/disabled state made more sense for immediate patches that could not be unloaded. We still need to keep the patches when the transaction is forced. The question is if we need to keep the sysfs entries for loaded but unused patches. If 'replace' were the only mode, then we wouldn't even need a patch stack because it wouldn't really matter much whether the previous patch is enabled or disabled. I think this is in agreement with the point you're making. But we still support non-replace patches. My feeling is that we should either do a true stack, or no stack at all. The in-between thing is going to be confusing, not only for us, but for patch authors and end users. I see it like two different modes. We either have a stack of patches that depend on each other. But if they depend on each other, they can use 'replace' and a stack isn't needed. Yes but see below. And If they *don't* depend on each other, then the stack is overly restrictive, for no good reason. Either way, why do we still need a stack? Good question. I suggest to agree on some terms first: + Independent patches make unrelated changes. Any new function must not rely on changes done by any other patch. + Dependent patches mean that a later patch depend on changes done by an earlier patch. For example, a new function might use function added by an earlier patch. + Each cumulative patch include all necessary changes. I would say that it is self-containing and independent. Except that they should be able to continue using changes made by earlier patches (shadow variables, callbacks). Then we could say: + The stack helps to enforce dependencies between dependent patches. But there is needed also some external solution that forces loading the patches in the right order. + The "replace" feature is useful for cumulative patches. It allows to remove existing changes and remove unused stuff. But cumulative patches might be done even _without_ the atomic replace. + Cumulative patches _with_ "replace" flag might be in theory installed in random order because they handle not-longer patched functions. Well, some incompatibility might be caused by shadow variables and callbacks. Therefore it still might make sense to install them in the right order. Cumulative patches _without_ "replace" flag must be installed in the right order because they do not handle not-longer patched functions. Anyway, for cumulative patches is important the external ordering (loading modules) and not the stack. Back to your question: The stack is needed for dependent non-cumulative patches. The cumulative patches with "replace" flag seems the be the most safe and most flexible solution. The question is if we want to force all users to use this mode. Or we have replace patches that are standalone and we allow a smooth transfer from one to another one. Anyway, for us, it is much more important the removal of replaced patches. We could probably live without the possibility to replace disabled patches. I think replacing disabled patches is ok, *if* we get rid of the illusion of a stack. The current stack isn't really a stack, it's just awkward. I personally do not have problems with it. As I said, I see this a
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
> > > > I think you're missing my point. This isn't about your patch set, per > > > > se. It's really about our existing code. Today, our patch stack > > > > doesn't follow real stack semantics, because patches in the middle might > > > > be disabled. I see that as a problem. > > > > No, please read it again. I wasn't talking about replaced patches. > > I was confused by wording "in the middle". It suggested that there > might had been enabled patches on the top and the bottom of the stack > and some disabled patches in between at the same time (or vice versa). > This was not true. > > Do I understand it correctly that you do not like that the patches > on the stack might be in two states (disabled/enabled). This might > be translated that you do not like the state when the patch is > registered and disabled. > > I wonder if the problem is in the "stack" abstraction. Would it help > if we call it "sorted list" or whatever more suitable? > > Another possibility would be to get rid of the enable/disable states. > I mean that the patch will be automatically enabled during > registration and removed during unregistration. Or we could rename > the operation do add/remove or anything else. In fact, this is how > it worked in kGraft. I've already wondered couple of times why we had separate enable/disable. If there is someone who knows, remind me, please. I wouldn't be against a simplification here. On the other hand, it is kind of nice to keep the registration and enablement separate. It is more flexible if someone needs it. Anyway, we should solve it together with the stacking. It is tightly connected. > AFAIK, the enable/disabled state made more sense for immediate > patches that could not be unloaded. We still need to keep the patches > when the transaction is forced. The question is if we need to keep > the sysfs entries for loaded but unused patches. > > > > > > If 'replace' were the only mode, then we wouldn't even need a patch > > > > stack because it wouldn't really matter much whether the previous patch > > > > is enabled or disabled. I think this is in agreement with the point > > > > you're making. > > > > > > > > But we still support non-replace patches. My feeling is that we should > > > > either do a true stack, or no stack at all. The in-between thing is > > > > going to be confusing, not only for us, but for patch authors and end > > > > users. > > > > > > I see it like two different modes. We either have a stack of patches > > > that depend on each other. > > > > But if they depend on each other, they can use 'replace' and a stack > > isn't needed. > > Yes but see below. > > > > And If they *don't* depend on each other, then the stack is overly > > restrictive, for no good reason. > > > > Either way, why do we still need a stack? > > Good question. I suggest to agree on some terms first: > >+ Independent patches make unrelated changes. Any new function > must not rely on changes done by any other patch. > >+ Dependent patches mean that a later patch depend on changes > done by an earlier patch. For example, a new function might > use function added by an earlier patch. > >+ Each cumulative patch include all necessary changes. I would say > that it is self-containing and independent. Except that they should > be able to continue using changes made by earlier patches (shadow > variables, callbacks). > > > Then we could say: > >+ The stack helps to enforce dependencies between dependent > patches. But there is needed also some external solution > that forces loading the patches in the right order. > >+ The "replace" feature is useful for cumulative patches. > It allows to remove existing changes and remove unused stuff. > But cumulative patches might be done even _without_ the atomic > replace. > >+ Cumulative patches _with_ "replace" flag might be in theory > installed in random order because they handle not-longer patched > functions. Well, some incompatibility might be caused by shadow > variables and callbacks. Therefore it still might make sense > to install them in the right order. > > Cumulative patches _without_ "replace" flag must be installed > in the right order because they do not handle not-longer patched > functions. > > Anyway, for cumulative patches is important the external > ordering (loading modules) and not the stack. > > > Back to your question: > > The stack is needed for dependent non-cumulative patches. > > The cumulative patches with "replace" flag seems the be > the most safe and most flexible solution. The question is > if we want to force all users to use this mode. > > > > > Or we have replace patches that are > > > standalone and we allow a smooth transfer from one to another one. > > > > > > Anyway, for us, it is much more important the removal of replaced > > > patches. We could probably live wit
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Fri 2018-04-06 14:50:49, Josh Poimboeuf wrote: > On Mon, Mar 26, 2018 at 12:11:07PM +0200, Petr Mladek wrote: > > On Fri 2018-03-23 17:44:10, Josh Poimboeuf wrote: > > > On Fri, Mar 23, 2018 at 10:45:07AM +0100, Petr Mladek wrote: > > > > On Tue 2018-03-20 15:15:02, Josh Poimboeuf wrote: > > > > > On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > > > > > > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > > > > > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > > > > > > Along those lines, I'd also propose that we constrain our > > > > > > > > > existing patch > > > > > > > > > stacking even further. Right now we allow a new patch to be > > > > > > > > > registered > > > > > > > > > on top of a disabled patch, though we don't allow the new > > > > > > > > > patch to be > > > > > > > > > enabled until the previous patch gets enabled. I'd propose > > > > > > > > > we no longer > > > > > > > > > allow that condition. We should instead enforce that all > > > > > > > > > existing > > > > > > > > > patches are *enabled* before allowing a new patch to be > > > > > > > > > registered on > > > > > > > > > top. That way the patch stacking is even more sane, and > > > > > > > > > there are less > > > > > > > > > "unusual" conditions to worry about. We have enough of those > > > > > > > > > already. > > > > > > > > > Each additional bit of flexibility has a maintenance cost, > > > > > > > > > and this one > > > > > > > > > isn't worth it IMO. > > > > > > > > > > > > > > > > Again, this might make some things easier but it might also > > > > > > > > bring > > > > > > > > problems. > > > > > > > > > > > > > > > > For example, we would need to solve the situation when the last > > > > > > > > patch is disabled and cannot be removed because the transition > > > > > > > > was forced. This might be more common after removing the > > > > > > > > immediate > > > > > > > > feature. > > > > > > > > > > > > > > I would stop worrying about forced patches so much :-) > > > > > > > > > > > > I have already seen blocked transition several times. It is true > > > > > > that > > > > > > it was with kGraft. But we just do not have enough real life > > > > > > experience > > > > > > with the upstream livepatch code. > > > > > > > > > > But we're talking about patching on top of a *disabled* patch. Forced > > > > > or not, why would the patch be disabled in the first place? > > > > > > > > For example, it might be disabled because the transition stalled for > > > > too long and the user reverted it. Or just because it is possible > > > > to disable it. > > > > > > If they haven't previously forced any patches, and they reverted the > > > topmost patch because it stalled, they can easily unload the patch. > > > > > > If they *have* previously forced a patch, they can force enable the > > > topmost patch as well, or if that's not safe they can reboot (that's > > > what you get for forcing a patch...) > > > > IMHO, the reboot is the very last option for people that are using > > livepatching. > > ... but it may be a natural consequence of forcing a patch. > > > > > > We were just recently discussing the possibility of not allowing the > > > > > disabling of patches at all. If we're not going that far, let's at > > > > > least further restrict it, for the sanity of our code, so we don't > > > > > have > > > > > to worry about a nasty mismatched stack of > > > > > enabled/disabled/enabled/etc, > > > > > at least for the cases where 'replace' patches aren't used. > > > > > > > > I am not completely sure where all these fears come from. From my > > > > point of view, this change is pretty safe and trivial thanks to NOPs > > > > and overall design. It would be a shame to do not have it. But I > > > > might be blind after spending so much time with the feature. > > > > > > I think you're missing my point. This isn't about your patch set, per > > > se. It's really about our existing code. Today, our patch stack > > > doesn't follow real stack semantics, because patches in the middle might > > > be disabled. I see that as a problem. > > No, please read it again. I wasn't talking about replaced patches. I was confused by wording "in the middle". It suggested that there might had been enabled patches on the top and the bottom of the stack and some disabled patches in between at the same time (or vice versa). This was not true. Do I understand it correctly that you do not like that the patches on the stack might be in two states (disabled/enabled). This might be translated that you do not like the state when the patch is registered and disabled. I wonder if the problem is in the "stack" abstraction. Would it help if we call it "sorted list" or whatever more suitable? Another possibility would be to get rid of the enable/disable states. I mean that the patch will be automatically enabled during registration and removed during unregistration. Or we could rename the op
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Mon, Mar 26, 2018 at 12:11:07PM +0200, Petr Mladek wrote: > On Fri 2018-03-23 17:44:10, Josh Poimboeuf wrote: > > On Fri, Mar 23, 2018 at 10:45:07AM +0100, Petr Mladek wrote: > > > On Tue 2018-03-20 15:15:02, Josh Poimboeuf wrote: > > > > On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > > > > > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > > > > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > > > > > Along those lines, I'd also propose that we constrain our > > > > > > > > existing patch > > > > > > > > stacking even further. Right now we allow a new patch to be > > > > > > > > registered > > > > > > > > on top of a disabled patch, though we don't allow the new patch > > > > > > > > to be > > > > > > > > enabled until the previous patch gets enabled. I'd propose we > > > > > > > > no longer > > > > > > > > allow that condition. We should instead enforce that all > > > > > > > > existing > > > > > > > > patches are *enabled* before allowing a new patch to be > > > > > > > > registered on > > > > > > > > top. That way the patch stacking is even more sane, and there > > > > > > > > are less > > > > > > > > "unusual" conditions to worry about. We have enough of those > > > > > > > > already. > > > > > > > > Each additional bit of flexibility has a maintenance cost, and > > > > > > > > this one > > > > > > > > isn't worth it IMO. > > > > > > > > > > > > > > Again, this might make some things easier but it might also bring > > > > > > > problems. > > > > > > > > > > > > > > For example, we would need to solve the situation when the last > > > > > > > patch is disabled and cannot be removed because the transition > > > > > > > was forced. This might be more common after removing the immediate > > > > > > > feature. > > > > > > > > > > > > I would stop worrying about forced patches so much :-) > > > > > > > > > > I have already seen blocked transition several times. It is true that > > > > > it was with kGraft. But we just do not have enough real life > > > > > experience > > > > > with the upstream livepatch code. > > > > > > > > But we're talking about patching on top of a *disabled* patch. Forced > > > > or not, why would the patch be disabled in the first place? > > > > > > For example, it might be disabled because the transition stalled for > > > too long and the user reverted it. Or just because it is possible > > > to disable it. > > > > If they haven't previously forced any patches, and they reverted the > > topmost patch because it stalled, they can easily unload the patch. > > > > If they *have* previously forced a patch, they can force enable the > > topmost patch as well, or if that's not safe they can reboot (that's > > what you get for forcing a patch...) > > IMHO, the reboot is the very last option for people that are using > livepatching. ... but it may be a natural consequence of forcing a patch. > > > > We were just recently discussing the possibility of not allowing the > > > > disabling of patches at all. If we're not going that far, let's at > > > > least further restrict it, for the sanity of our code, so we don't have > > > > to worry about a nasty mismatched stack of enabled/disabled/enabled/etc, > > > > at least for the cases where 'replace' patches aren't used. > > > > > > I am not completely sure where all these fears come from. From my > > > point of view, this change is pretty safe and trivial thanks to NOPs > > > and overall design. It would be a shame to do not have it. But I > > > might be blind after spending so much time with the feature. > > > > I think you're missing my point. This isn't about your patch set, per > > se. It's really about our existing code. Today, our patch stack > > doesn't follow real stack semantics, because patches in the middle might > > be disabled. I see that as a problem. > > This would be true if we keep the replaced patches on the stack. But > if we remove the replaced patches then there never will be disabled > patches in the middle. > > OK, there might be disabled patches in the middle during the > transition. But this is the situation where we basically could > not manipulate the stack. No, please read it again. I wasn't talking about replaced patches. > > If 'replace' were the only mode, then we wouldn't even need a patch > > stack because it wouldn't really matter much whether the previous patch > > is enabled or disabled. I think this is in agreement with the point > > you're making. > > > > But we still support non-replace patches. My feeling is that we should > > either do a true stack, or no stack at all. The in-between thing is > > going to be confusing, not only for us, but for patch authors and end > > users. > > I see it like two different modes. We either have a stack of patches > that depend on each other. But if they depend on each other, they can use 'replace' and a stack isn't needed. And If they *don't* depend on each other, then the stack i
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Fri 2018-03-23 17:44:10, Josh Poimboeuf wrote: > On Fri, Mar 23, 2018 at 10:45:07AM +0100, Petr Mladek wrote: > > On Tue 2018-03-20 15:15:02, Josh Poimboeuf wrote: > > > On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > > > > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > > > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > > > > Along those lines, I'd also propose that we constrain our > > > > > > > existing patch > > > > > > > stacking even further. Right now we allow a new patch to be > > > > > > > registered > > > > > > > on top of a disabled patch, though we don't allow the new patch > > > > > > > to be > > > > > > > enabled until the previous patch gets enabled. I'd propose we no > > > > > > > longer > > > > > > > allow that condition. We should instead enforce that all existing > > > > > > > patches are *enabled* before allowing a new patch to be > > > > > > > registered on > > > > > > > top. That way the patch stacking is even more sane, and there > > > > > > > are less > > > > > > > "unusual" conditions to worry about. We have enough of those > > > > > > > already. > > > > > > > Each additional bit of flexibility has a maintenance cost, and > > > > > > > this one > > > > > > > isn't worth it IMO. > > > > > > > > > > > > Again, this might make some things easier but it might also bring > > > > > > problems. > > > > > > > > > > > > For example, we would need to solve the situation when the last > > > > > > patch is disabled and cannot be removed because the transition > > > > > > was forced. This might be more common after removing the immediate > > > > > > feature. > > > > > > > > > > I would stop worrying about forced patches so much :-) > > > > > > > > I have already seen blocked transition several times. It is true that > > > > it was with kGraft. But we just do not have enough real life experience > > > > with the upstream livepatch code. > > > > > > But we're talking about patching on top of a *disabled* patch. Forced > > > or not, why would the patch be disabled in the first place? > > > > For example, it might be disabled because the transition stalled for > > too long and the user reverted it. Or just because it is possible > > to disable it. > > If they haven't previously forced any patches, and they reverted the > topmost patch because it stalled, they can easily unload the patch. > > If they *have* previously forced a patch, they can force enable the > topmost patch as well, or if that's not safe they can reboot (that's > what you get for forcing a patch...) IMHO, the reboot is the very last option for people that are using livepatching. > > > We were just recently discussing the possibility of not allowing the > > > disabling of patches at all. If we're not going that far, let's at > > > least further restrict it, for the sanity of our code, so we don't have > > > to worry about a nasty mismatched stack of enabled/disabled/enabled/etc, > > > at least for the cases where 'replace' patches aren't used. > > > > I am not completely sure where all these fears come from. From my > > point of view, this change is pretty safe and trivial thanks to NOPs > > and overall design. It would be a shame to do not have it. But I > > might be blind after spending so much time with the feature. > > I think you're missing my point. This isn't about your patch set, per > se. It's really about our existing code. Today, our patch stack > doesn't follow real stack semantics, because patches in the middle might > be disabled. I see that as a problem. This would be true if we keep the replaced patches on the stack. But if we remove the replaced patches then there never will be disabled patches in the middle. OK, there might be disabled patches in the middle during the transition. But this is the situation where we basically could not manipulate the stack. > If 'replace' were the only mode, then we wouldn't even need a patch > stack because it wouldn't really matter much whether the previous patch > is enabled or disabled. I think this is in agreement with the point > you're making. > > But we still support non-replace patches. My feeling is that we should > either do a true stack, or no stack at all. The in-between thing is > going to be confusing, not only for us, but for patch authors and end > users. I see it like two different modes. We either have a stack of patches that depend on each other. Or we have replace patches that are standalone and we allow a smooth transfer from one to another one. Anyway, for us, it is much more important the removal of replaced patches. We could probably live without the possibility to replace disabled patches. > > Anyway, I am getting close to send v11. This change is done by a separate > > and last code-related patch. So, it can be omitted rather easily. > > Thanks a lot for splitting it up. It looks much easier to review now. > > I still would really like to see the selftests merged f
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Fri, Mar 23, 2018 at 10:45:07AM +0100, Petr Mladek wrote: > On Tue 2018-03-20 15:15:02, Josh Poimboeuf wrote: > > On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > > > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > > > Along those lines, I'd also propose that we constrain our existing > > > > > > patch > > > > > > stacking even further. Right now we allow a new patch to be > > > > > > registered > > > > > > on top of a disabled patch, though we don't allow the new patch to > > > > > > be > > > > > > enabled until the previous patch gets enabled. I'd propose we no > > > > > > longer > > > > > > allow that condition. We should instead enforce that all existing > > > > > > patches are *enabled* before allowing a new patch to be registered > > > > > > on > > > > > > top. That way the patch stacking is even more sane, and there are > > > > > > less > > > > > > "unusual" conditions to worry about. We have enough of those > > > > > > already. > > > > > > Each additional bit of flexibility has a maintenance cost, and this > > > > > > one > > > > > > isn't worth it IMO. > > > > > > > > > > Again, this might make some things easier but it might also bring > > > > > problems. > > > > > > > > > > For example, we would need to solve the situation when the last > > > > > patch is disabled and cannot be removed because the transition > > > > > was forced. This might be more common after removing the immediate > > > > > feature. > > > > > > > > I would stop worrying about forced patches so much :-) > > > > > > I have already seen blocked transition several times. It is true that > > > it was with kGraft. But we just do not have enough real life experience > > > with the upstream livepatch code. > > > > But we're talking about patching on top of a *disabled* patch. Forced > > or not, why would the patch be disabled in the first place? > > For example, it might be disabled because the transition stalled for > too long and the user reverted it. Or just because it is possible > to disable it. If they haven't previously forced any patches, and they reverted the topmost patch because it stalled, they can easily unload the patch. If they *have* previously forced a patch, they can force enable the topmost patch as well, or if that's not safe they can reboot (that's what you get for forcing a patch...) > > We were just recently discussing the possibility of not allowing the > > disabling of patches at all. If we're not going that far, let's at > > least further restrict it, for the sanity of our code, so we don't have > > to worry about a nasty mismatched stack of enabled/disabled/enabled/etc, > > at least for the cases where 'replace' patches aren't used. > > I am not completely sure where all these fears come from. From my > point of view, this change is pretty safe and trivial thanks to NOPs > and overall design. It would be a shame to do not have it. But I > might be blind after spending so much time with the feature. I think you're missing my point. This isn't about your patch set, per se. It's really about our existing code. Today, our patch stack doesn't follow real stack semantics, because patches in the middle might be disabled. I see that as a problem. If 'replace' were the only mode, then we wouldn't even need a patch stack because it wouldn't really matter much whether the previous patch is enabled or disabled. I think this is in agreement with the point you're making. But we still support non-replace patches. My feeling is that we should either do a true stack, or no stack at all. The in-between thing is going to be confusing, not only for us, but for patch authors and end users. In fact, *no stack* might be better, now that we have replace. Cumulative patches can use replace, so a stack is no longer needed for them. And for non cumulative patches, a stack may not make sense, since the patches should be independent anyway. But I don't know how big of a deal it is. > BTW: It seems that I am more paranoid when being in the reviewer role. > > Anyway, I am getting close to send v11. This change is done by a separate > and last code-related patch. So, it can be omitted rather easily. Thanks a lot for splitting it up. It looks much easier to review now. I still would really like to see the selftests merged first (or at the same time). Unfortunately I'll be out next week, so I won't be able to give it a proper review until I get back. Maybe I will come back in a more ACK-conducive mood ;-) -- Josh
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue 2018-03-20 15:15:02, Josh Poimboeuf wrote: > On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > > Along those lines, I'd also propose that we constrain our existing > > > > > patch > > > > > stacking even further. Right now we allow a new patch to be > > > > > registered > > > > > on top of a disabled patch, though we don't allow the new patch to be > > > > > enabled until the previous patch gets enabled. I'd propose we no > > > > > longer > > > > > allow that condition. We should instead enforce that all existing > > > > > patches are *enabled* before allowing a new patch to be registered on > > > > > top. That way the patch stacking is even more sane, and there are > > > > > less > > > > > "unusual" conditions to worry about. We have enough of those already. > > > > > Each additional bit of flexibility has a maintenance cost, and this > > > > > one > > > > > isn't worth it IMO. > > > > > > > > Again, this might make some things easier but it might also bring > > > > problems. > > > > > > > > For example, we would need to solve the situation when the last > > > > patch is disabled and cannot be removed because the transition > > > > was forced. This might be more common after removing the immediate > > > > feature. > > > > > > I would stop worrying about forced patches so much :-) > > > > I have already seen blocked transition several times. It is true that > > it was with kGraft. But we just do not have enough real life experience > > with the upstream livepatch code. > > But we're talking about patching on top of a *disabled* patch. Forced > or not, why would the patch be disabled in the first place? For example, it might be disabled because the transition stalled for too long and the user reverted it. Or just because it is possible to disable it. > We were just recently discussing the possibility of not allowing the > disabling of patches at all. If we're not going that far, let's at > least further restrict it, for the sanity of our code, so we don't have > to worry about a nasty mismatched stack of enabled/disabled/enabled/etc, > at least for the cases where 'replace' patches aren't used. I am not completely sure where all these fears come from. From my point of view, this change is pretty safe and trivial thanks to NOPs and overall design. It would be a shame to do not have it. But I might be blind after spending so much time with the feature. BTW: It seems that I am more paranoid when being in the reviewer role. Anyway, I am getting close to send v11. This change is done by a separate and last code-related patch. So, it can be omitted rather easily. > After these discussions I realize that there are several motivations for > this patch set: > > 1) Atomically reverting some functions in a previous patch while >upgrading other functions: accomplished by inserting nops for >reverted functions. Could also be accomplished by tooling which >patches functions with duplicates of the originals. > > 2) Improving performance implications of #1: accomplished by removing >the nops and old patch modules. > > 3) Decreasing user confusion about stacking order and what patches are >currently in effect: accomplished by permanently disabling and >removing the replaced patches. Could also be accomplished by a >better sysfs interface and tooling. > > 4) Improving debugging? How? (It seems to me that removing the >replaced patches could make debugging more difficult, as it erases >the patching history (similar to a git rebase)) This explains why you might be against replacing disabled patches. It could confuse the history. On the other hand, it might also hide bugs and create mysterious ones. If we remove code that should not longer be used, the system should crash immediately if the removed code/data is accessed by mistake. IMHO, this should help to catch bugs early and in more clear situation. The history can be seen in the logs or in users reports. I know that both sources are not 100% reliable but... IMHO, it is case by case. I could imagine situations where the clean up might help and situations where it might cause loosing information. It is hard to judge. I still vote for the clean up ;-) > It would be good to document all those. > > Anyway, I need to think about it some more, but I may be warming up to > the idea of permanently disabling the replaced patches. Your package > management analogy helped me to understand it better. Especially since > we'll be delivering these patches in packages, so the patch state would > mirror the patch state. It might be good to describe that analogy in > the documentation as well. I have updated the documentation a bit. I did not have the energy to completely rewrite it though. Anyway, we still have to agree on the code first. Best Regards, Petr
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue, Mar 20, 2018 at 01:25:38PM +0100, Petr Mladek wrote: > On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > > Can someone remind me why we're permanently disabling replaced patches? > > > > I seem to remember being involved in that decision, but at least with > > > > this latest version of the patches, it seems like it would be simpler to > > > > just let 'replace' patches be rolled back to the previous state when > > > > they're unpatched. Then we don't need two lists of patches, the nops > > > > can become more permanent, the replaced patches remain "enabled" but > > > > inert, and the unpatching behavior is less surprising to the user, more > > > > like a normal rollback. > > > > > > Yes, keeping the patches might make some things easier. But it might > > > also bring some problems and it would make the feature less useful. > > > The following arguments come to my mind: > > > > > > 1. The feature should help to keep the system in a consistent and > > >well defined state. It should not depend on what patches were > > >installed before. > > > > But the nops already accomplish that. If they didn't, then this patch > > set has a major problem. > > The nops are enough to keep the functionality but they might harm > the performance. > > Livepatching is about preventing bugs without reboot. I could simply > imagine that ftrace on a hot patch might cause performance > problems on some workloads. And I would like to have a way out in > this case. But this is still an imaginary performance issue. Premature optimization and all that... > Anyway, I am reworking the patchset so that it implements your > approach first. The possibility to remove NOPs and replaced > livepatches is done via a followup patch. This might help > to discuss if the changes are worth it or not. Thanks, that will definitely help. > > > 2. The feature should allow to unpatch some functions while keeping > > >the others patched. > > > > > >The ftrace handler might cause some unwanted slowdown or other > > >problems. The performance might get restored only when we remove > > >the NOPs when they are not longer necessary. > > > > I'd say simplicity and maintainability of the code is more important > > than an (imagined) performance issue. The NOPs should be pretty fast > > anyway. > > > > Not to mention that my proposal would make the behavior less surprising > > and more user friendly (reverting a 'replace' patch restores it to its > > previous state). > > If the "disable" way works as expected, see below. > > Also it is less surprising only if people understand the stack of > patches. If they are familiar only with replace patches then it is > normal for them that the patches get replaced. It is then like > a package version update. Fair enough. It does seem analagous to package management in that way. > > > 3. The handling of callbacks is already problematic. We run only > > >the ones from the last patch to make things easier. > > > > > >We would need to come with something more complicated if we > > >want to support rollback to "random" patches on the stack. > > >And support for random patches is fundamental at least > > >from my point of view. > > > > Can you elaborate on what you mean by random patches and why it would > > require something more complicated from the callbacks? > > Let's say that we will use atomic replace for cumulative > patches. Then every new patch knows what earlier patches did. > It just did not know which of them was already installed. Therefore > it needs to detect what callbacks were already called. The callbacks > usually create or change something. So there should be something to check. > Therefore the way forward should be rather straightforward. > > The way back is more problematic. The callbacks in the new cumulative > patch would need to store information about the previous state and > be able to restore it when the patch gets disabled. It might more > or less double the callbacks code and testing scenarios. Makes sense. > > > > Along those lines, I'd also propose that we constrain our existing patch > > > > stacking even further. Right now we allow a new patch to be registered > > > > on top of a disabled patch, though we don't allow the new patch to be > > > > enabled until the previous patch gets enabled. I'd propose we no longer > > > > allow that condition. We should instead enforce that all existing > > > > patches are *enabled* before allowing a new patch to be registered on > > > > top. That way the patch stacking is even more sane, and there are less > > > > "unusual" conditions to worry about. We have enough of those already. > > > > Each additional bit of flexibility has a maintenance cost, and this one > > > > isn't worth it IMO. > > > > > > Again, this might make some things easier but it might also bring > > > problems. > > > > > > For example, w
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
> > I don't know, does anybody really care about this case (patching on top > > of a disabled patch)? It just adds to the crazy matrix of possible > > scenarios we have to keep in our heads, which means more bugs, for very > > little (hypothetical) gain. > > It depends if the we remove the replaced patches or not. If they are > removed then replacing disabled patches is rather trivial from both > coding and understanding side. I agree. Since we already have the code, there is no point not to have the feature. It is not that complicated after all. > I am going to add this as a separate patch as well. Let's discuss > it with the code. > > > > > White the atomic replace could make things easier for both developers > > > and users. > > > > I agree that atomic replace is a useful feature and I'm not arguing > > against it, so maybe I missed your point? > > Your suggestion allows easier code but it reduces the advantages of > the atomic replace feature. We would achieve almost the same results > with a normal livepatch where the functions behave like in > the original code. > > Also removing replaced patches can be seen as a clean up after > each patch. It might be more code but the target system might be > easier to debug. Also we do not need to mind about various > disable scenarios. I agree with this as well. Yes, it was a bit painful to review, but I was quite content with the result. I don't want to go halfway and be stuck with NOPs when it is not complicated to remove them completely after the transition. It'd be odd in my opinion. Regards, Miroslav
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On 20.03.2018 15:25, Petr Mladek wrote: On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: Can someone remind me why we're permanently disabling replaced patches? I seem to remember being involved in that decision, but at least with this latest version of the patches, it seems like it would be simpler to just let 'replace' patches be rolled back to the previous state when they're unpatched. Then we don't need two lists of patches, the nops can become more permanent, the replaced patches remain "enabled" but inert, and the unpatching behavior is less surprising to the user, more like a normal rollback. Yes, keeping the patches might make some things easier. But it might also bring some problems and it would make the feature less useful. The following arguments come to my mind: 1. The feature should help to keep the system in a consistent and well defined state. It should not depend on what patches were installed before. But the nops already accomplish that. If they didn't, then this patch set has a major problem. The nops are enough to keep the functionality but they might harm the performance. Livepatching is about preventing bugs without reboot. I could simply imagine that ftrace on a hot patch might cause performance problems on some workloads. And I would like to have a way out in this case. Anyway, I am reworking the patchset so that it implements your approach first. The possibility to remove NOPs and replaced livepatches is done via a followup patch. This might help to discuss if the changes are worth it or not. 2. The feature should allow to unpatch some functions while keeping the others patched. The ftrace handler might cause some unwanted slowdown or other problems. The performance might get restored only when we remove the NOPs when they are not longer necessary. I'd say simplicity and maintainability of the code is more important than an (imagined) performance issue. The NOPs should be pretty fast anyway. Not to mention that my proposal would make the behavior less surprising and more user friendly (reverting a 'replace' patch restores it to its previous state). If the "disable" way works as expected, see below. Also it is less surprising only if people understand the stack of patches. If they are familiar only with replace patches then it is normal for them that the patches get replaced. It is then like a package version update. 3. The handling of callbacks is already problematic. We run only the ones from the last patch to make things easier. We would need to come with something more complicated if we want to support rollback to "random" patches on the stack. And support for random patches is fundamental at least from my point of view. Can you elaborate on what you mean by random patches and why it would require something more complicated from the callbacks? Let's say that we will use atomic replace for cumulative patches. Then every new patch knows what earlier patches did. It just did not know which of them was already installed. Therefore it needs to detect what callbacks were already called. The callbacks usually create or change something. So there should be something to check. Therefore the way forward should be rather straightforward. The way back is more problematic. The callbacks in the new cumulative patch would need to store information about the previous state and be able to restore it when the patch gets disabled. It might more or less double the callbacks code and testing scenarios. Along those lines, I'd also propose that we constrain our existing patch stacking even further. Right now we allow a new patch to be registered on top of a disabled patch, though we don't allow the new patch to be enabled until the previous patch gets enabled. I'd propose we no longer allow that condition. We should instead enforce that all existing patches are *enabled* before allowing a new patch to be registered on top. That way the patch stacking is even more sane, and there are less "unusual" conditions to worry about. We have enough of those already. Each additional bit of flexibility has a maintenance cost, and this one isn't worth it IMO. Again, this might make some things easier but it might also bring problems. For example, we would need to solve the situation when the last patch is disabled and cannot be removed because the transition was forced. This might be more common after removing the immediate feature. I would stop worrying about forced patches so much :-) I have already seen blocked transition several times. It is true that it was with kGraft. But we just do not have enough real life experience with the upstream livepatch code. Forced patches already come with a disclaimer, and we can't bend over backwards for them. In such a rare case, the admin can just re-enable the forced patch before loading the 'replace' patch. Also
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Mon 2018-03-19 16:43:24, Josh Poimboeuf wrote: > On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > > Can someone remind me why we're permanently disabling replaced patches? > > > I seem to remember being involved in that decision, but at least with > > > this latest version of the patches, it seems like it would be simpler to > > > just let 'replace' patches be rolled back to the previous state when > > > they're unpatched. Then we don't need two lists of patches, the nops > > > can become more permanent, the replaced patches remain "enabled" but > > > inert, and the unpatching behavior is less surprising to the user, more > > > like a normal rollback. > > > > Yes, keeping the patches might make some things easier. But it might > > also bring some problems and it would make the feature less useful. > > The following arguments come to my mind: > > > > 1. The feature should help to keep the system in a consistent and > >well defined state. It should not depend on what patches were > >installed before. > > But the nops already accomplish that. If they didn't, then this patch > set has a major problem. The nops are enough to keep the functionality but they might harm the performance. Livepatching is about preventing bugs without reboot. I could simply imagine that ftrace on a hot patch might cause performance problems on some workloads. And I would like to have a way out in this case. Anyway, I am reworking the patchset so that it implements your approach first. The possibility to remove NOPs and replaced livepatches is done via a followup patch. This might help to discuss if the changes are worth it or not. > > 2. The feature should allow to unpatch some functions while keeping > >the others patched. > > > >The ftrace handler might cause some unwanted slowdown or other > >problems. The performance might get restored only when we remove > >the NOPs when they are not longer necessary. > > I'd say simplicity and maintainability of the code is more important > than an (imagined) performance issue. The NOPs should be pretty fast > anyway. > > Not to mention that my proposal would make the behavior less surprising > and more user friendly (reverting a 'replace' patch restores it to its > previous state). If the "disable" way works as expected, see below. Also it is less surprising only if people understand the stack of patches. If they are familiar only with replace patches then it is normal for them that the patches get replaced. It is then like a package version update. > > 3. The handling of callbacks is already problematic. We run only > >the ones from the last patch to make things easier. > > > >We would need to come with something more complicated if we > >want to support rollback to "random" patches on the stack. > >And support for random patches is fundamental at least > >from my point of view. > > Can you elaborate on what you mean by random patches and why it would > require something more complicated from the callbacks? Let's say that we will use atomic replace for cumulative patches. Then every new patch knows what earlier patches did. It just did not know which of them was already installed. Therefore it needs to detect what callbacks were already called. The callbacks usually create or change something. So there should be something to check. Therefore the way forward should be rather straightforward. The way back is more problematic. The callbacks in the new cumulative patch would need to store information about the previous state and be able to restore it when the patch gets disabled. It might more or less double the callbacks code and testing scenarios. > > > Along those lines, I'd also propose that we constrain our existing patch > > > stacking even further. Right now we allow a new patch to be registered > > > on top of a disabled patch, though we don't allow the new patch to be > > > enabled until the previous patch gets enabled. I'd propose we no longer > > > allow that condition. We should instead enforce that all existing > > > patches are *enabled* before allowing a new patch to be registered on > > > top. That way the patch stacking is even more sane, and there are less > > > "unusual" conditions to worry about. We have enough of those already. > > > Each additional bit of flexibility has a maintenance cost, and this one > > > isn't worth it IMO. > > > > Again, this might make some things easier but it might also bring > > problems. > > > > For example, we would need to solve the situation when the last > > patch is disabled and cannot be removed because the transition > > was forced. This might be more common after removing the immediate > > feature. > > I would stop worrying about forced patches so much :-) I have already seen blocked transition several times. It is true that it was with kGraft. But we just do not have enough real life experience with the upstream livepatch code. > Forced patch
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Mon, Mar 19, 2018 at 04:02:07PM +0100, Petr Mladek wrote: > > Can someone remind me why we're permanently disabling replaced patches? > > I seem to remember being involved in that decision, but at least with > > this latest version of the patches, it seems like it would be simpler to > > just let 'replace' patches be rolled back to the previous state when > > they're unpatched. Then we don't need two lists of patches, the nops > > can become more permanent, the replaced patches remain "enabled" but > > inert, and the unpatching behavior is less surprising to the user, more > > like a normal rollback. > > Yes, keeping the patches might make some things easier. But it might > also bring some problems and it would make the feature less useful. > The following arguments come to my mind: > > 1. The feature should help to keep the system in a consistent and >well defined state. It should not depend on what patches were >installed before. But the nops already accomplish that. If they didn't, then this patch set has a major problem. >We do not want to force people to install every single livepatch >before. It should be enough to install the last one. Of course... >But then we might get different amount of NOPs depending on what >was installed before. The same is true of this patch set as it is today. The number of NOPs used in the patching process will differ based on what patches were previously applied. This is unavoidable. My proposal is to let the NOPs hang around after the patching process. Either way we must rely on them *during* the patching process as well. > 2. The feature should allow to unpatch some functions while keeping >the others patched. > >The ftrace handler might cause some unwanted slowdown or other >problems. The performance might get restored only when we remove >the NOPs when they are not longer necessary. I'd say simplicity and maintainability of the code is more important than an (imagined) performance issue. The NOPs should be pretty fast anyway. Not to mention that my proposal would make the behavior less surprising and more user friendly (reverting a 'replace' patch restores it to its previous state). > 3. The handling of callbacks is already problematic. We run only >the ones from the last patch to make things easier. > >We would need to come with something more complicated if we >want to support rollback to "random" patches on the stack. >And support for random patches is fundamental at least >from my point of view. Can you elaborate on what you mean by random patches and why it would require something more complicated from the callbacks? > > Along those lines, I'd also propose that we constrain our existing patch > > stacking even further. Right now we allow a new patch to be registered > > on top of a disabled patch, though we don't allow the new patch to be > > enabled until the previous patch gets enabled. I'd propose we no longer > > allow that condition. We should instead enforce that all existing > > patches are *enabled* before allowing a new patch to be registered on > > top. That way the patch stacking is even more sane, and there are less > > "unusual" conditions to worry about. We have enough of those already. > > Each additional bit of flexibility has a maintenance cost, and this one > > isn't worth it IMO. > > Again, this might make some things easier but it might also bring > problems. > > For example, we would need to solve the situation when the last > patch is disabled and cannot be removed because the transition > was forced. This might be more common after removing the immediate > feature. I would stop worrying about forced patches so much :-) Forced patches already come with a disclaimer, and we can't bend over backwards for them. In such a rare case, the admin can just re-enable the forced patch before loading the 'replace' patch. > Also it might be less user friendly. I don't know, does anybody really care about this case (patching on top of a disabled patch)? It just adds to the crazy matrix of possible scenarios we have to keep in our heads, which means more bugs, for very little (hypothetical) gain. > White the atomic replace could make things easier for both developers > and users. I agree that atomic replace is a useful feature and I'm not arguing against it, so maybe I missed your point? > > The downside of the above proposals is that now you can't remove an old > > patch after it's been replaced, but IMO that's a small price to pay for > > code sanity. Every additional bit of flexibility has a maintenance > > cost, and this one isn't worth it IMO. Just force the user to leave the > > old inert patches loaded. They shouldn't take up much memory anyway, > > and they might come in handy should a revert be needed. > > I actually think about exactly the opposite way. IMHO, the atomic replace > and cumulative patches allow to handle livepatches more securely
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Tue 2018-03-13 17:46:13, Josh Poimboeuf wrote: > On Wed, Mar 07, 2018 at 09:20:34AM +0100, Petr Mladek wrote: > > From: Jason Baron > > > > We are going to add a feature called atomic replace. It will allow to > > create a patch that would replace all already registered patches. > > > > The replaced patches will stay registered because they are typically > > unregistered by some package uninstall scripts. But we will remove > > these patches from @klp_patches list to keep the enabled patch > > on the bottom of the stack. Otherwise, we would need to implement > > rather complex logic for moving the patches on the stack. Also > > it would complicate implementation of the atomic replace feature. > > It is not worth it. > > > > As a result, we will have patches that are registered but that > > are no longer usable. Let's get prepared for this and use > > a better descriptive name for klp_is_patch_registered() function. > > > > Also create separate list for the replaced patches and allow to > > unregister them. Alternative solution would be to add a flag > > into struct klp_patch. Note that patch->kobj.state_initialized > > is not safe because it can be cleared outside klp_mutex. > > --- > > kernel/livepatch/core.c | 30 -- > > 1 file changed, 24 insertions(+), 6 deletions(-) > > > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c > > index ab1f6a371fc8..fd0296859ff4 100644 > > --- a/kernel/livepatch/core.c > > +++ b/kernel/livepatch/core.c > > @@ -47,6 +47,13 @@ DEFINE_MUTEX(klp_mutex); > > > > static LIST_HEAD(klp_patches); > > > > +/* > > + * List of 'replaced' patches that have been replaced by a patch that has > > the > > + * 'replace' bit set. When they are added to this list, they are disabled > > and > > + * can not be re-enabled, but they can be unregistered(). > > + */ > > +static LIST_HEAD(klp_replaced_patches); > > Can someone remind me why we're permanently disabling replaced patches? > I seem to remember being involved in that decision, but at least with > this latest version of the patches, it seems like it would be simpler to > just let 'replace' patches be rolled back to the previous state when > they're unpatched. Then we don't need two lists of patches, the nops > can become more permanent, the replaced patches remain "enabled" but > inert, and the unpatching behavior is less surprising to the user, more > like a normal rollback. Yes, keeping the patches might make some things easier. But it might also bring some problems and it would make the feature less useful. The following arguments come to my mind: 1. The feature should help to keep the system in a consistent and well defined state. It should not depend on what patches were installed before. We do not want to force people to install every single livepatch before. It should be enough to install the last one. But then we might get different amount of NOPs depending on what was installed before. 2. The feature should allow to unpatch some functions while keeping the others patched. The ftrace handler might cause some unwanted slowdown or other problems. The performance might get restored only when we remove the NOPs when they are not longer necessary. 3. The handling of callbacks is already problematic. We run only the ones from the last patch to make things easier. We would need to come with something more complicated if we want to support rollback to "random" patches on the stack. And support for random patches is fundamental at least from my point of view. > Along those lines, I'd also propose that we constrain our existing patch > stacking even further. Right now we allow a new patch to be registered > on top of a disabled patch, though we don't allow the new patch to be > enabled until the previous patch gets enabled. I'd propose we no longer > allow that condition. We should instead enforce that all existing > patches are *enabled* before allowing a new patch to be registered on > top. That way the patch stacking is even more sane, and there are less > "unusual" conditions to worry about. We have enough of those already. > Each additional bit of flexibility has a maintenance cost, and this one > isn't worth it IMO. Again, this might make some things easier but it might also bring problems. For example, we would need to solve the situation when the last patch is disabled and cannot be removed because the transition was forced. This might be more common after removing the immediate feature. Also it might be less user friendly. White the atomic replace could make things easier for both developers and users. > The downside of the above proposals is that now you can't remove an old > patch after it's been replaced, but IMO that's a small price to pay for > code sanity. Every additional bit of flexibility has a maintenance > cost, and this one isn't worth it IMO. Just force the user to leave the > old inert patc
Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.
On Wed, Mar 07, 2018 at 09:20:34AM +0100, Petr Mladek wrote: > From: Jason Baron > > We are going to add a feature called atomic replace. It will allow to > create a patch that would replace all already registered patches. > > The replaced patches will stay registered because they are typically > unregistered by some package uninstall scripts. But we will remove > these patches from @klp_patches list to keep the enabled patch > on the bottom of the stack. Otherwise, we would need to implement > rather complex logic for moving the patches on the stack. Also > it would complicate implementation of the atomic replace feature. > It is not worth it. > > As a result, we will have patches that are registered but that > are no longer usable. Let's get prepared for this and use > a better descriptive name for klp_is_patch_registered() function. > > Also create separate list for the replaced patches and allow to > unregister them. Alternative solution would be to add a flag > into struct klp_patch. Note that patch->kobj.state_initialized > is not safe because it can be cleared outside klp_mutex. > > This patch does not change the existing behavior. > > Signed-off-by: Jason Baron > [pmla...@suse.com: Split and renamed klp_is_patch_usable()] > Signed-off-by: Petr Mladek > Cc: Josh Poimboeuf > Cc: Jessica Yu > Cc: Jiri Kosina > Acked-by: Miroslav Benes > --- > kernel/livepatch/core.c | 30 -- > 1 file changed, 24 insertions(+), 6 deletions(-) > > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c > index ab1f6a371fc8..fd0296859ff4 100644 > --- a/kernel/livepatch/core.c > +++ b/kernel/livepatch/core.c > @@ -47,6 +47,13 @@ DEFINE_MUTEX(klp_mutex); > > static LIST_HEAD(klp_patches); > > +/* > + * List of 'replaced' patches that have been replaced by a patch that has the > + * 'replace' bit set. When they are added to this list, they are disabled and > + * can not be re-enabled, but they can be unregistered(). > + */ > +static LIST_HEAD(klp_replaced_patches); Can someone remind me why we're permanently disabling replaced patches? I seem to remember being involved in that decision, but at least with this latest version of the patches, it seems like it would be simpler to just let 'replace' patches be rolled back to the previous state when they're unpatched. Then we don't need two lists of patches, the nops can become more permanent, the replaced patches remain "enabled" but inert, and the unpatching behavior is less surprising to the user, more like a normal rollback. Along those lines, I'd also propose that we constrain our existing patch stacking even further. Right now we allow a new patch to be registered on top of a disabled patch, though we don't allow the new patch to be enabled until the previous patch gets enabled. I'd propose we no longer allow that condition. We should instead enforce that all existing patches are *enabled* before allowing a new patch to be registered on top. That way the patch stacking is even more sane, and there are less "unusual" conditions to worry about. We have enough of those already. Each additional bit of flexibility has a maintenance cost, and this one isn't worth it IMO. The downside of the above proposals is that now you can't remove an old patch after it's been replaced, but IMO that's a small price to pay for code sanity. Every additional bit of flexibility has a maintenance cost, and this one isn't worth it IMO. Just force the user to leave the old inert patches loaded. They shouldn't take up much memory anyway, and they might come in handy should a revert be needed. > + > static struct kobject *klp_root_kobj; > > static void klp_init_func_list(struct klp_object *obj, struct klp_func *func) > @@ -108,17 +115,28 @@ static void klp_find_object_module(struct klp_object > *obj) > mutex_unlock(&module_mutex); > } > > -static bool klp_is_patch_registered(struct klp_patch *patch) > +static bool klp_is_patch_in_list(struct klp_patch *patch, > + struct list_head *head) > { > struct klp_patch *mypatch; > > - list_for_each_entry(mypatch, &klp_patches, list) > + list_for_each_entry(mypatch, head, list) > if (mypatch == patch) > return true; > > return false; > } > > +static bool klp_is_patch_usable(struct klp_patch *patch) > +{ > + return klp_is_patch_in_list(patch, &klp_patches); > +} > + > +static bool klp_is_patch_replaced(struct klp_patch *patch) > +{ > + return klp_is_patch_in_list(patch, &klp_replaced_patches); > +} > + > static bool klp_initialized(void) > { > return !!klp_root_kobj; > @@ -375,7 +393,7 @@ int klp_disable_patch(struct klp_patch *patch) > > mutex_lock(&klp_mutex); > > - if (!klp_is_patch_registered(patch)) { > + if (!klp_is_patch_usable(patch)) { This is another case where a wrapper function hurts readability. What does "usable" even mean to the reader of thi
[PATCH v10 05/10] livepatch: Support separate list for replaced patches.
From: Jason Baron We are going to add a feature called atomic replace. It will allow to create a patch that would replace all already registered patches. The replaced patches will stay registered because they are typically unregistered by some package uninstall scripts. But we will remove these patches from @klp_patches list to keep the enabled patch on the bottom of the stack. Otherwise, we would need to implement rather complex logic for moving the patches on the stack. Also it would complicate implementation of the atomic replace feature. It is not worth it. As a result, we will have patches that are registered but that are no longer usable. Let's get prepared for this and use a better descriptive name for klp_is_patch_registered() function. Also create separate list for the replaced patches and allow to unregister them. Alternative solution would be to add a flag into struct klp_patch. Note that patch->kobj.state_initialized is not safe because it can be cleared outside klp_mutex. This patch does not change the existing behavior. Signed-off-by: Jason Baron [pmla...@suse.com: Split and renamed klp_is_patch_usable()] Signed-off-by: Petr Mladek Cc: Josh Poimboeuf Cc: Jessica Yu Cc: Jiri Kosina Acked-by: Miroslav Benes --- kernel/livepatch/core.c | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c index ab1f6a371fc8..fd0296859ff4 100644 --- a/kernel/livepatch/core.c +++ b/kernel/livepatch/core.c @@ -47,6 +47,13 @@ DEFINE_MUTEX(klp_mutex); static LIST_HEAD(klp_patches); +/* + * List of 'replaced' patches that have been replaced by a patch that has the + * 'replace' bit set. When they are added to this list, they are disabled and + * can not be re-enabled, but they can be unregistered(). + */ +static LIST_HEAD(klp_replaced_patches); + static struct kobject *klp_root_kobj; static void klp_init_func_list(struct klp_object *obj, struct klp_func *func) @@ -108,17 +115,28 @@ static void klp_find_object_module(struct klp_object *obj) mutex_unlock(&module_mutex); } -static bool klp_is_patch_registered(struct klp_patch *patch) +static bool klp_is_patch_in_list(struct klp_patch *patch, +struct list_head *head) { struct klp_patch *mypatch; - list_for_each_entry(mypatch, &klp_patches, list) + list_for_each_entry(mypatch, head, list) if (mypatch == patch) return true; return false; } +static bool klp_is_patch_usable(struct klp_patch *patch) +{ + return klp_is_patch_in_list(patch, &klp_patches); +} + +static bool klp_is_patch_replaced(struct klp_patch *patch) +{ + return klp_is_patch_in_list(patch, &klp_replaced_patches); +} + static bool klp_initialized(void) { return !!klp_root_kobj; @@ -375,7 +393,7 @@ int klp_disable_patch(struct klp_patch *patch) mutex_lock(&klp_mutex); - if (!klp_is_patch_registered(patch)) { + if (!klp_is_patch_usable(patch)) { ret = -EINVAL; goto err; } @@ -475,7 +493,7 @@ int klp_enable_patch(struct klp_patch *patch) mutex_lock(&klp_mutex); - if (!klp_is_patch_registered(patch)) { + if (!klp_is_patch_usable(patch)) { ret = -EINVAL; goto err; } @@ -516,7 +534,7 @@ static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr, mutex_lock(&klp_mutex); - if (!klp_is_patch_registered(patch)) { + if (!klp_is_patch_usable(patch)) { /* * Module with the patch could either disappear meanwhile or is * not properly initialized yet. @@ -971,7 +989,7 @@ int klp_unregister_patch(struct klp_patch *patch) mutex_lock(&klp_mutex); - if (!klp_is_patch_registered(patch)) { + if (!klp_is_patch_usable(patch) && !klp_is_patch_replaced(patch)) { ret = -EINVAL; goto err; } -- 2.13.6