Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, 2017-04-21 at 14:30 -0700, Kees Cook wrote: > On Fri, Apr 21, 2017 at 2:27 PM, James Bottomley >wrote: > > On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: > > > On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers < > > > ebigge...@gmail.com> > > > wrote: > > > > > > Of course, having extra checks behind a debug option is > > > > > > fine. But they should not be part of the base feature; the > > > > > > base feature should just be mitigation of reference count > > > > > > *overflows*. It would be nice to do more, of course; but > > > > > > when the extra stuff prevents people from using refcount_t > > > > > > for performance reasons, it defeats the point of the > > > > > > feature in the first place. > > > > > > > > > > Sure, but as I said above, I think the smaller tricks and > > > > > fixes won't be convincing enough, so their value is > > > > > questionable. > > > > > > > > This makes no sense, as the main point of the feature is > > > > supposed to be the security improvement. As-is, the extra > > > > debugging stuff is actually preventing the security improvement > > > > from being adopted, which is unfortunate. > > > > > > We've been trying to handle the conflicting desires of those > > > wanting very precise refcounting implementation and gaining the > > > security protections. Ultimately, the best way forward seemed to > > > be to first land the precise refcounting implementation, and > > > start conversion until we ran into concerns over performance. > > > Now, since we're here, we can move forward with getting a fast > > > implementation that provides the desired security protections > > > without too greatly messing with the refcount API. > > > > But that's not what it really looks like. What it looks like is > > someone came up with a new API and is now intent on forcing it > > through the kernel in about 500 patches using security as the > > hammer. > > The intent is to split refcounting and statistical counters away from > atomics, since they have distinct APIs. This will let us audit the > remaining atomic uses much more easily. But the security problem is counter overflow, right? That problem, as far as I can see exists in the atomics as well. I'm sure there might be one or two corner cases depending on overflow actually working, but I can't think of them. The refcount vs atomic comes on the precise meaning of decrement to zero. I'm not saying there's no benefit to detecting the condition, but the security problem looks to be much more pressing which is why I think this can be argued on the merits later. > > If we were really concerned about security first, we'd have fixed > > the atomic overflow problem in the atomics and then built the > > precise refcounting on that and tried to persuade people of the > > merits. > > I agree, but this approach was NAKed by multiple atomics maintainers. Overriding that decision by trying to convince all the consumers to move to a new API doesn't seem to be going so well either. Perhaps we could assist you in changing the minds of the atomics maintainers ... what was the primary problem? The additional couple of cycles or the fact that some use cases want overflow (or something else)? James > > Why can't we still do this? It looks like the overflow protection > > will add only a couple of cycles to the atomics. We can move the > > current version to an unchecked variant which can be used either in > > truly performance critical regions with no security implications or > > if someone really needs the atomic to overflow. From my point of > > view it would give us the 90% case (security) and stop this endless > > argument over the fast paths. Subsystems which have already moved > > to refcount would stay there and the rest get to evaluate a > > migration on the merits of the operational utility. > > -Kees >
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, 2017-04-21 at 14:30 -0700, Kees Cook wrote: > On Fri, Apr 21, 2017 at 2:27 PM, James Bottomley > wrote: > > On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: > > > On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers < > > > ebigge...@gmail.com> > > > wrote: > > > > > > Of course, having extra checks behind a debug option is > > > > > > fine. But they should not be part of the base feature; the > > > > > > base feature should just be mitigation of reference count > > > > > > *overflows*. It would be nice to do more, of course; but > > > > > > when the extra stuff prevents people from using refcount_t > > > > > > for performance reasons, it defeats the point of the > > > > > > feature in the first place. > > > > > > > > > > Sure, but as I said above, I think the smaller tricks and > > > > > fixes won't be convincing enough, so their value is > > > > > questionable. > > > > > > > > This makes no sense, as the main point of the feature is > > > > supposed to be the security improvement. As-is, the extra > > > > debugging stuff is actually preventing the security improvement > > > > from being adopted, which is unfortunate. > > > > > > We've been trying to handle the conflicting desires of those > > > wanting very precise refcounting implementation and gaining the > > > security protections. Ultimately, the best way forward seemed to > > > be to first land the precise refcounting implementation, and > > > start conversion until we ran into concerns over performance. > > > Now, since we're here, we can move forward with getting a fast > > > implementation that provides the desired security protections > > > without too greatly messing with the refcount API. > > > > But that's not what it really looks like. What it looks like is > > someone came up with a new API and is now intent on forcing it > > through the kernel in about 500 patches using security as the > > hammer. > > The intent is to split refcounting and statistical counters away from > atomics, since they have distinct APIs. This will let us audit the > remaining atomic uses much more easily. But the security problem is counter overflow, right? That problem, as far as I can see exists in the atomics as well. I'm sure there might be one or two corner cases depending on overflow actually working, but I can't think of them. The refcount vs atomic comes on the precise meaning of decrement to zero. I'm not saying there's no benefit to detecting the condition, but the security problem looks to be much more pressing which is why I think this can be argued on the merits later. > > If we were really concerned about security first, we'd have fixed > > the atomic overflow problem in the atomics and then built the > > precise refcounting on that and tried to persuade people of the > > merits. > > I agree, but this approach was NAKed by multiple atomics maintainers. Overriding that decision by trying to convince all the consumers to move to a new API doesn't seem to be going so well either. Perhaps we could assist you in changing the minds of the atomics maintainers ... what was the primary problem? The additional couple of cycles or the fact that some use cases want overflow (or something else)? James > > Why can't we still do this? It looks like the overflow protection > > will add only a couple of cycles to the atomics. We can move the > > current version to an unchecked variant which can be used either in > > truly performance critical regions with no security implications or > > if someone really needs the atomic to overflow. From my point of > > view it would give us the 90% case (security) and stop this endless > > argument over the fast paths. Subsystems which have already moved > > to refcount would stay there and the rest get to evaluate a > > migration on the merits of the operational utility. > > -Kees >
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 2:27 PM, James Bottomleywrote: > On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: >> On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers >> wrote: >> > > > Of course, having extra checks behind a debug option is fine. >> > > > But they should not be part of the base feature; the base >> > > > feature should just be mitigation of reference count >> > > > *overflows*. It would be nice to do more, of >> > > > course; but when the extra stuff prevents people from using >> > > > refcount_t for performance reasons, it defeats the point of the >> > > > feature in the first place. >> > > >> > > Sure, but as I said above, I think the smaller tricks and fixes >> > > won't be convincing enough, so their value is questionable. >> > >> > This makes no sense, as the main point of the feature is supposed >> > to be the security improvement. As-is, the extra debugging stuff >> > is actually preventing the security improvement from being adopted, >> > which is unfortunate. >> >> We've been trying to handle the conflicting desires of those wanting >> very precise refcounting implementation and gaining the security >> protections. Ultimately, the best way forward seemed to be to first >> land the precise refcounting implementation, and start conversion >> until we ran into concerns over performance. Now, since we're here, >> we can move forward with getting a fast implementation that provides >> the desired security protections without too greatly messing with the >> refcount API. > > But that's not what it really looks like. What it looks like is > someone came up with a new API and is now intent on forcing it through > the kernel in about 500 patches using security as the hammer. The intent is to split refcounting and statistical counters away from atomics, since they have distinct APIs. This will let us audit the remaining atomic uses much more easily. > If we were really concerned about security first, we'd have fixed the > atomic overflow problem in the atomics and then built the precise > refcounting on that and tried to persuade people of the merits. I agree, but this approach was NAKed by multiple atomics maintainers. > Why can't we still do this? It looks like the overflow protection will > add only a couple of cycles to the atomics. We can move the current > version to an unchecked variant which can be used either in truly > performance critical regions with no security implications or if > someone really needs the atomic to overflow. From my point of view it > would give us the 90% case (security) and stop this endless argument > over the fast paths. Subsystems which have already moved to refcount > would stay there and the rest get to evaluate a migration on the merits > of the operational utility. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 2:27 PM, James Bottomley wrote: > On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: >> On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers >> wrote: >> > > > Of course, having extra checks behind a debug option is fine. >> > > > But they should not be part of the base feature; the base >> > > > feature should just be mitigation of reference count >> > > > *overflows*. It would be nice to do more, of >> > > > course; but when the extra stuff prevents people from using >> > > > refcount_t for performance reasons, it defeats the point of the >> > > > feature in the first place. >> > > >> > > Sure, but as I said above, I think the smaller tricks and fixes >> > > won't be convincing enough, so their value is questionable. >> > >> > This makes no sense, as the main point of the feature is supposed >> > to be the security improvement. As-is, the extra debugging stuff >> > is actually preventing the security improvement from being adopted, >> > which is unfortunate. >> >> We've been trying to handle the conflicting desires of those wanting >> very precise refcounting implementation and gaining the security >> protections. Ultimately, the best way forward seemed to be to first >> land the precise refcounting implementation, and start conversion >> until we ran into concerns over performance. Now, since we're here, >> we can move forward with getting a fast implementation that provides >> the desired security protections without too greatly messing with the >> refcount API. > > But that's not what it really looks like. What it looks like is > someone came up with a new API and is now intent on forcing it through > the kernel in about 500 patches using security as the hammer. The intent is to split refcounting and statistical counters away from atomics, since they have distinct APIs. This will let us audit the remaining atomic uses much more easily. > If we were really concerned about security first, we'd have fixed the > atomic overflow problem in the atomics and then built the precise > refcounting on that and tried to persuade people of the merits. I agree, but this approach was NAKed by multiple atomics maintainers. > Why can't we still do this? It looks like the overflow protection will > add only a couple of cycles to the atomics. We can move the current > version to an unchecked variant which can be used either in truly > performance critical regions with no security implications or if > someone really needs the atomic to overflow. From my point of view it > would give us the 90% case (security) and stop this endless argument > over the fast paths. Subsystems which have already moved to refcount > would stay there and the rest get to evaluate a migration on the merits > of the operational utility. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: > On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers> wrote: > > > > Of course, having extra checks behind a debug option is fine. > > > > But they should not be part of the base feature; the base > > > > feature should just be mitigation of reference count > > > > *overflows*. It would be nice to do more, of > > > > course; but when the extra stuff prevents people from using > > > > refcount_t for performance reasons, it defeats the point of the > > > > feature in the first place. > > > > > > Sure, but as I said above, I think the smaller tricks and fixes > > > won't be convincing enough, so their value is questionable. > > > > This makes no sense, as the main point of the feature is supposed > > to be the security improvement. As-is, the extra debugging stuff > > is actually preventing the security improvement from being adopted, > > which is unfortunate. > > We've been trying to handle the conflicting desires of those wanting > very precise refcounting implementation and gaining the security > protections. Ultimately, the best way forward seemed to be to first > land the precise refcounting implementation, and start conversion > until we ran into concerns over performance. Now, since we're here, > we can move forward with getting a fast implementation that provides > the desired security protections without too greatly messing with the > refcount API. But that's not what it really looks like. What it looks like is someone came up with a new API and is now intent on forcing it through the kernel in about 500 patches using security as the hammer. If we were really concerned about security first, we'd have fixed the atomic overflow problem in the atomics and then built the precise refcounting on that and tried to persuade people of the merits. Why can't we still do this? It looks like the overflow protection will add only a couple of cycles to the atomics. We can move the current version to an unchecked variant which can be used either in truly performance critical regions with no security implications or if someone really needs the atomic to overflow. From my point of view it would give us the 90% case (security) and stop this endless argument over the fast paths. Subsystems which have already moved to refcount would stay there and the rest get to evaluate a migration on the merits of the operational utility. James
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, 2017-04-21 at 13:22 -0700, Kees Cook wrote: > On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers > wrote: > > > > Of course, having extra checks behind a debug option is fine. > > > > But they should not be part of the base feature; the base > > > > feature should just be mitigation of reference count > > > > *overflows*. It would be nice to do more, of > > > > course; but when the extra stuff prevents people from using > > > > refcount_t for performance reasons, it defeats the point of the > > > > feature in the first place. > > > > > > Sure, but as I said above, I think the smaller tricks and fixes > > > won't be convincing enough, so their value is questionable. > > > > This makes no sense, as the main point of the feature is supposed > > to be the security improvement. As-is, the extra debugging stuff > > is actually preventing the security improvement from being adopted, > > which is unfortunate. > > We've been trying to handle the conflicting desires of those wanting > very precise refcounting implementation and gaining the security > protections. Ultimately, the best way forward seemed to be to first > land the precise refcounting implementation, and start conversion > until we ran into concerns over performance. Now, since we're here, > we can move forward with getting a fast implementation that provides > the desired security protections without too greatly messing with the > refcount API. But that's not what it really looks like. What it looks like is someone came up with a new API and is now intent on forcing it through the kernel in about 500 patches using security as the hammer. If we were really concerned about security first, we'd have fixed the atomic overflow problem in the atomics and then built the precise refcounting on that and tried to persuade people of the merits. Why can't we still do this? It looks like the overflow protection will add only a couple of cycles to the atomics. We can move the current version to an unchecked variant which can be used either in truly performance critical regions with no security implications or if someone really needs the atomic to overflow. From my point of view it would give us the 90% case (security) and stop this endless argument over the fast paths. Subsystems which have already moved to refcount would stay there and the rest get to evaluate a migration on the merits of the operational utility. James
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggerswrote: > Hi Elena, > > On Fri, Apr 21, 2017 at 10:55:29AM +, Reshetova, Elena wrote: >> > >> > At the very least, what is there now could probably be made about twice as >> > fast >> > by removing the checks that don't actually help mitigate refcount overflow >> > bugs, >> > specifically all the checks in refcount_dec(), and the part of >> > refcount_inc() >> > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is >> > 0, the >> > object has already been freed, so the attacker will have had the >> > opportunity to >> > allocate it with contents they control already. >> >> refcount_dec() is used very little through the code actually, it is more >> like an exception >> case since in order to use it one must really be sure that refcounter >> doesn't drop to zero. >> Removing the warn around it wouldn't likely affect much overall and thus it >> is better to >> stay to discourage people of API itself :) >> >> refcount_inc() is of course a different story, it is extensively used. I >> guess the perf issue >> on checking increment from zero might only come from WARNs being printed, >> but not really from an additional check here for zero since it is trivial >> and part of >> the needed loop anyway. So, I think only removing the >> WARNs might have any visible impact, but even this is probably not really >> that big. >> >> So, I think these changes won't really help adoption of interface if >> arguments against >> is performance. If we do have a performance issue, I think arch. specific >> implementation >> is likely the only way to considerably speed it up. > > I should have used refcount_dec_and_test() as the example, as the same applies > to both refcount_dec() and refcount_dec_and_test(). There is negligible > security benefit to have these refcount release operations checked vs. just > calling through to atomic_dec() and atomic_dec_and_test(). It's unfortunate, > but there is no known way to detect ahead of time (i.e. before exploitation) > if > there are too many refcount releases, only too many refcount acquires. > > The WARN is only executed if there is a bug, so naturally it's only a problem > if > the functions are to be inlined, creating bloat. The actual performance > problem > is the overhead associated with using comparisons and cmpxchg's to avoid > changing a refcount that is 0 or UINT_MAX. The more efficient approach would > be > to (a) drop the check for 0, and (b) don't require full operation to be > atomic, > but rather do something like "lock incl %[counter]; js ". Yes > it's not "atomic", and people have complained about this, but there is no > technical reason why it needs to be atomic. Attackers do *not* care whether > your exploit mitigation is theoretically "atomic" or not, they only care > whether > it works or not. And besides, it's not even "atomic_t" anymore, it's > "refcount_t". > >> > Of course, having extra checks behind a debug option is fine. But they >> > should >> > not be part of the base feature; the base feature should just be >> > mitigation of >> > reference count *overflows*. It would be nice to do more, of course; but >> > when >> > the extra stuff prevents people from using refcount_t for performance >> > reasons, >> > it defeats the point of the feature in the first place. >> >> Sure, but as I said above, I think the smaller tricks and fixes won't be >> convincing enough, >> so their value is questionable. > > This makes no sense, as the main point of the feature is supposed to be the > security improvement. As-is, the extra debugging stuff is actually preventing > the security improvement from being adopted, which is unfortunate. We've been trying to handle the conflicting desires of those wanting very precise refcounting implementation and gaining the security protections. Ultimately, the best way forward seemed to be to first land the precise refcounting implementation, and start conversion until we ran into concerns over performance. Now, since we're here, we can move forward with getting a fast implementation that provides the desired security protections without too greatly messing with the refcount API. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 12:55 PM, Eric Biggers wrote: > Hi Elena, > > On Fri, Apr 21, 2017 at 10:55:29AM +, Reshetova, Elena wrote: >> > >> > At the very least, what is there now could probably be made about twice as >> > fast >> > by removing the checks that don't actually help mitigate refcount overflow >> > bugs, >> > specifically all the checks in refcount_dec(), and the part of >> > refcount_inc() >> > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is >> > 0, the >> > object has already been freed, so the attacker will have had the >> > opportunity to >> > allocate it with contents they control already. >> >> refcount_dec() is used very little through the code actually, it is more >> like an exception >> case since in order to use it one must really be sure that refcounter >> doesn't drop to zero. >> Removing the warn around it wouldn't likely affect much overall and thus it >> is better to >> stay to discourage people of API itself :) >> >> refcount_inc() is of course a different story, it is extensively used. I >> guess the perf issue >> on checking increment from zero might only come from WARNs being printed, >> but not really from an additional check here for zero since it is trivial >> and part of >> the needed loop anyway. So, I think only removing the >> WARNs might have any visible impact, but even this is probably not really >> that big. >> >> So, I think these changes won't really help adoption of interface if >> arguments against >> is performance. If we do have a performance issue, I think arch. specific >> implementation >> is likely the only way to considerably speed it up. > > I should have used refcount_dec_and_test() as the example, as the same applies > to both refcount_dec() and refcount_dec_and_test(). There is negligible > security benefit to have these refcount release operations checked vs. just > calling through to atomic_dec() and atomic_dec_and_test(). It's unfortunate, > but there is no known way to detect ahead of time (i.e. before exploitation) > if > there are too many refcount releases, only too many refcount acquires. > > The WARN is only executed if there is a bug, so naturally it's only a problem > if > the functions are to be inlined, creating bloat. The actual performance > problem > is the overhead associated with using comparisons and cmpxchg's to avoid > changing a refcount that is 0 or UINT_MAX. The more efficient approach would > be > to (a) drop the check for 0, and (b) don't require full operation to be > atomic, > but rather do something like "lock incl %[counter]; js ". Yes > it's not "atomic", and people have complained about this, but there is no > technical reason why it needs to be atomic. Attackers do *not* care whether > your exploit mitigation is theoretically "atomic" or not, they only care > whether > it works or not. And besides, it's not even "atomic_t" anymore, it's > "refcount_t". > >> > Of course, having extra checks behind a debug option is fine. But they >> > should >> > not be part of the base feature; the base feature should just be >> > mitigation of >> > reference count *overflows*. It would be nice to do more, of course; but >> > when >> > the extra stuff prevents people from using refcount_t for performance >> > reasons, >> > it defeats the point of the feature in the first place. >> >> Sure, but as I said above, I think the smaller tricks and fixes won't be >> convincing enough, >> so their value is questionable. > > This makes no sense, as the main point of the feature is supposed to be the > security improvement. As-is, the extra debugging stuff is actually preventing > the security improvement from being adopted, which is unfortunate. We've been trying to handle the conflicting desires of those wanting very precise refcounting implementation and gaining the security protections. Ultimately, the best way forward seemed to be to first land the precise refcounting implementation, and start conversion until we ran into concerns over performance. Now, since we're here, we can move forward with getting a fast implementation that provides the desired security protections without too greatly messing with the refcount API. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
Hi Elena, On Fri, Apr 21, 2017 at 10:55:29AM +, Reshetova, Elena wrote: > > > > At the very least, what is there now could probably be made about twice as > > fast > > by removing the checks that don't actually help mitigate refcount overflow > > bugs, > > specifically all the checks in refcount_dec(), and the part of > > refcount_inc() > > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is > > 0, the > > object has already been freed, so the attacker will have had the > > opportunity to > > allocate it with contents they control already. > > refcount_dec() is used very little through the code actually, it is more like > an exception > case since in order to use it one must really be sure that refcounter doesn't > drop to zero. > Removing the warn around it wouldn't likely affect much overall and thus it > is better to > stay to discourage people of API itself :) > > refcount_inc() is of course a different story, it is extensively used. I > guess the perf issue > on checking increment from zero might only come from WARNs being printed, > but not really from an additional check here for zero since it is trivial and > part of > the needed loop anyway. So, I think only removing the > WARNs might have any visible impact, but even this is probably not really > that big. > > So, I think these changes won't really help adoption of interface if > arguments against > is performance. If we do have a performance issue, I think arch. specific > implementation > is likely the only way to considerably speed it up. I should have used refcount_dec_and_test() as the example, as the same applies to both refcount_dec() and refcount_dec_and_test(). There is negligible security benefit to have these refcount release operations checked vs. just calling through to atomic_dec() and atomic_dec_and_test(). It's unfortunate, but there is no known way to detect ahead of time (i.e. before exploitation) if there are too many refcount releases, only too many refcount acquires. The WARN is only executed if there is a bug, so naturally it's only a problem if the functions are to be inlined, creating bloat. The actual performance problem is the overhead associated with using comparisons and cmpxchg's to avoid changing a refcount that is 0 or UINT_MAX. The more efficient approach would be to (a) drop the check for 0, and (b) don't require full operation to be atomic, but rather do something like "lock incl %[counter]; js ". Yes it's not "atomic", and people have complained about this, but there is no technical reason why it needs to be atomic. Attackers do *not* care whether your exploit mitigation is theoretically "atomic" or not, they only care whether it works or not. And besides, it's not even "atomic_t" anymore, it's "refcount_t". > > Of course, having extra checks behind a debug option is fine. But they > > should > > not be part of the base feature; the base feature should just be mitigation > > of > > reference count *overflows*. It would be nice to do more, of course; but > > when > > the extra stuff prevents people from using refcount_t for performance > > reasons, > > it defeats the point of the feature in the first place. > > Sure, but as I said above, I think the smaller tricks and fixes won't be > convincing enough, > so their value is questionable. This makes no sense, as the main point of the feature is supposed to be the security improvement. As-is, the extra debugging stuff is actually preventing the security improvement from being adopted, which is unfortunate. - Eric
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
Hi Elena, On Fri, Apr 21, 2017 at 10:55:29AM +, Reshetova, Elena wrote: > > > > At the very least, what is there now could probably be made about twice as > > fast > > by removing the checks that don't actually help mitigate refcount overflow > > bugs, > > specifically all the checks in refcount_dec(), and the part of > > refcount_inc() > > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is > > 0, the > > object has already been freed, so the attacker will have had the > > opportunity to > > allocate it with contents they control already. > > refcount_dec() is used very little through the code actually, it is more like > an exception > case since in order to use it one must really be sure that refcounter doesn't > drop to zero. > Removing the warn around it wouldn't likely affect much overall and thus it > is better to > stay to discourage people of API itself :) > > refcount_inc() is of course a different story, it is extensively used. I > guess the perf issue > on checking increment from zero might only come from WARNs being printed, > but not really from an additional check here for zero since it is trivial and > part of > the needed loop anyway. So, I think only removing the > WARNs might have any visible impact, but even this is probably not really > that big. > > So, I think these changes won't really help adoption of interface if > arguments against > is performance. If we do have a performance issue, I think arch. specific > implementation > is likely the only way to considerably speed it up. I should have used refcount_dec_and_test() as the example, as the same applies to both refcount_dec() and refcount_dec_and_test(). There is negligible security benefit to have these refcount release operations checked vs. just calling through to atomic_dec() and atomic_dec_and_test(). It's unfortunate, but there is no known way to detect ahead of time (i.e. before exploitation) if there are too many refcount releases, only too many refcount acquires. The WARN is only executed if there is a bug, so naturally it's only a problem if the functions are to be inlined, creating bloat. The actual performance problem is the overhead associated with using comparisons and cmpxchg's to avoid changing a refcount that is 0 or UINT_MAX. The more efficient approach would be to (a) drop the check for 0, and (b) don't require full operation to be atomic, but rather do something like "lock incl %[counter]; js ". Yes it's not "atomic", and people have complained about this, but there is no technical reason why it needs to be atomic. Attackers do *not* care whether your exploit mitigation is theoretically "atomic" or not, they only care whether it works or not. And besides, it's not even "atomic_t" anymore, it's "refcount_t". > > Of course, having extra checks behind a debug option is fine. But they > > should > > not be part of the base feature; the base feature should just be mitigation > > of > > reference count *overflows*. It would be nice to do more, of course; but > > when > > the extra stuff prevents people from using refcount_t for performance > > reasons, > > it defeats the point of the feature in the first place. > > Sure, but as I said above, I think the smaller tricks and fixes won't be > convincing enough, > so their value is questionable. This makes no sense, as the main point of the feature is supposed to be the security improvement. As-is, the extra debugging stuff is actually preventing the security improvement from being adopted, which is unfortunate. - Eric
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 3:55 AM, Reshetova, Elenawrote: > On Thu, Apr 20, 2017 at 11:33 AM, Eric Biggers wrote: >> Of course, having extra checks behind a debug option is fine. But they >> should >> not be part of the base feature; the base feature should just be mitigation >> of >> reference count *overflows*. It would be nice to do more, of course; but >> when >> the extra stuff prevents people from using refcount_t for performance >> reasons, >> it defeats the point of the feature in the first place. > > Sure, but as I said above, I think the smaller tricks and fixes won't be > convincing enough, > so their value is questionable. > >> I strongly encourage anyone who has been involved in refcount_t to experiment >> with removing a reference count decrement somewhere in their kernel, then >> trying >> to exploit it to gain code execution. If you don't know what you're trying >> to >> protect against, you will not know which defences work and which don't. > > Well, we had successful CVEs and even exploits on this in the past. > @Kees, do you store a list of them in the project? Here are two from last year: http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/ http://cyseclabs.com/page?n=02012016 I don't disagree with Eric on the threat model: the primary concern for reference count protection is the overflow condition. Detecting a prior use-after-free is certainly nice to have, but the more important case is the initial overflow. How about we introduce something like CONFIG_HAVE_ARCH_FAST_REFCOUNT_T for per-arch implementations and CONFIG_FAST_REFCOUNT_T that trades coverage for speed, and checks only the overflow condition. This gets us the critical coverage without the changes in performance. This is basically what PaX/grsecurity already did: there is a tiny change to the atomic inc functions to detect the wrap. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 3:55 AM, Reshetova, Elena wrote: > On Thu, Apr 20, 2017 at 11:33 AM, Eric Biggers wrote: >> Of course, having extra checks behind a debug option is fine. But they >> should >> not be part of the base feature; the base feature should just be mitigation >> of >> reference count *overflows*. It would be nice to do more, of course; but >> when >> the extra stuff prevents people from using refcount_t for performance >> reasons, >> it defeats the point of the feature in the first place. > > Sure, but as I said above, I think the smaller tricks and fixes won't be > convincing enough, > so their value is questionable. > >> I strongly encourage anyone who has been involved in refcount_t to experiment >> with removing a reference count decrement somewhere in their kernel, then >> trying >> to exploit it to gain code execution. If you don't know what you're trying >> to >> protect against, you will not know which defences work and which don't. > > Well, we had successful CVEs and even exploits on this in the past. > @Kees, do you store a list of them in the project? Here are two from last year: http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/ http://cyseclabs.com/page?n=02012016 I don't disagree with Eric on the threat model: the primary concern for reference count protection is the overflow condition. Detecting a prior use-after-free is certainly nice to have, but the more important case is the initial overflow. How about we introduce something like CONFIG_HAVE_ARCH_FAST_REFCOUNT_T for per-arch implementations and CONFIG_FAST_REFCOUNT_T that trades coverage for speed, and checks only the overflow condition. This gets us the critical coverage without the changes in performance. This is basically what PaX/grsecurity already did: there is a tiny change to the atomic inc functions to detect the wrap. -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote: > You have it so easy - the code is completely standalone, building a > small test framework around it and measuring performance in _user space_ > is trivial. Something like this you mean: https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5...@hirez.programming.kicks-ass.net
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote: > You have it so easy - the code is completely standalone, building a > small test framework around it and measuring performance in _user space_ > is trivial. Something like this you mean: https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5...@hirez.programming.kicks-ass.net
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On 04/21/2017 09:22 AM, Peter Zijlstra wrote: > On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote: >> You have it so easy - the code is completely standalone, building a >> small test framework around it and measuring performance in _user space_ >> is trivial. > > Something like this you mean: > > > https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5...@hirez.programming.kicks-ass.net Yes, exactly. But it'd be great to see better coverage of the API, and various cases of X threads running operations on the same refcount, so we get a full view of that side too. The above looks like just counting cycles for a single op of foo_inc(). -- Jens Axboe
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On 04/21/2017 09:22 AM, Peter Zijlstra wrote: > On Fri, Apr 21, 2017 at 08:03:13AM -0600, Jens Axboe wrote: >> You have it so easy - the code is completely standalone, building a >> small test framework around it and measuring performance in _user space_ >> is trivial. > > Something like this you mean: > > > https://lkml.kernel.org/r/20170322165144.dtidvvbxey7w5...@hirez.programming.kicks-ass.net Yes, exactly. But it'd be great to see better coverage of the API, and various cases of X threads running operations on the same refcount, so we get a full view of that side too. The above looks like just counting cycles for a single op of foo_inc(). -- Jens Axboe
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On 04/21/2017 04:55 AM, Reshetova, Elena wrote: Please don't send any more conversions until those have been resolved. >> >> Like I suggested months ago, how about doing an efficient implementation of >> refcount_t which doesn't use the bloated cmpxchg loop? Then there would be >> no >> need for endless performance arguments. In fact, in PaX there are already >> example implementations for several architectures. It's unfortunate that >> they're still being ignored for some reason. > > Providing efficient arch. specific implementation for refcount_t is > likely the next step to make performance-sensitive places happy. > However, in order to do it, we will need to measure for each subsystem > a) current atomic_t usage - base measurement > b) pure refcount_t usage impact > c) arch. specific refcount_t impact > Otherwise I have a chicken and egg problem here: people want better > performance and want to see arch. specific code for this, but in order > to convince maintainer in need of arch. specific code, I need to show > that we do have indeed a performance issue. Sorry, but that's a big load of... You have it so easy - the code is completely standalone, building a small test framework around it and measuring performance in _user space_ is trivial. Don't go around benchmarking conversions in specific subsystems. Generate numbers showing how refcount_t compares to atomic_t for various (valid) use cases. Once you have that, convincing people to adopt it would be so much easier. So stop talking about excuses, and start producing some numbers. If you can't do that, my level of interest in converting anything in block is zero. -- Jens Axboe
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On 04/21/2017 04:55 AM, Reshetova, Elena wrote: Please don't send any more conversions until those have been resolved. >> >> Like I suggested months ago, how about doing an efficient implementation of >> refcount_t which doesn't use the bloated cmpxchg loop? Then there would be >> no >> need for endless performance arguments. In fact, in PaX there are already >> example implementations for several architectures. It's unfortunate that >> they're still being ignored for some reason. > > Providing efficient arch. specific implementation for refcount_t is > likely the next step to make performance-sensitive places happy. > However, in order to do it, we will need to measure for each subsystem > a) current atomic_t usage - base measurement > b) pure refcount_t usage impact > c) arch. specific refcount_t impact > Otherwise I have a chicken and egg problem here: people want better > performance and want to see arch. specific code for this, but in order > to convince maintainer in need of arch. specific code, I need to show > that we do have indeed a performance issue. Sorry, but that's a big load of... You have it so easy - the code is completely standalone, building a small test framework around it and measuring performance in _user space_ is trivial. Don't go around benchmarking conversions in specific subsystems. Generate numbers showing how refcount_t compares to atomic_t for various (valid) use cases. Once you have that, convincing people to adopt it would be so much easier. So stop talking about excuses, and start producing some numbers. If you can't do that, my level of interest in converting anything in block is zero. -- Jens Axboe
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Thu, Apr 20, 2017 at 11:33:19AM -0700, Eric Biggers wrote: > Like I suggested months ago, how about doing an efficient implementation of > refcount_t which doesn't use the bloated cmpxchg loop? Then there would be no > need for endless performance arguments. In fact, in PaX there are already > example implementations for several architectures. It's unfortunate that > they're still being ignored for some reason. Yes. I will also NAK other conversions until this is done, which is a bit sad as the recount_t API itself is very useful.
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
On Thu, Apr 20, 2017 at 11:33:19AM -0700, Eric Biggers wrote: > Like I suggested months ago, how about doing an efficient implementation of > refcount_t which doesn't use the bloated cmpxchg loop? Then there would be no > need for endless performance arguments. In fact, in PaX there are already > example implementations for several architectures. It's unfortunate that > they're still being ignored for some reason. Yes. I will also NAK other conversions until this is done, which is a bit sad as the recount_t API itself is very useful.
RE: [PATCH 0/5] v2: block subsystem refcounter conversions
> Hi Elena, > > On Thu, Apr 20, 2017 at 04:10:16PM +, Reshetova, Elena wrote: > > > > > All the objections from DaveM on the amount of cycles spent on the > > > new refcount_t apply to the block layer fast path operations as well. > > > > Ok, could you please indicate the correct way to measure the impact for the > block layer? > > We can do the measurements. > > > > Best Regards, > > Elena. > > > > > > > > Please don't send any more conversions until those have been resolved. > > Like I suggested months ago, how about doing an efficient implementation of > refcount_t which doesn't use the bloated cmpxchg loop? Then there would be > no > need for endless performance arguments. In fact, in PaX there are already > example implementations for several architectures. It's unfortunate that > they're still being ignored for some reason. Providing efficient arch. specific implementation for refcount_t is likely the next step to make performance-sensitive places happy. However, in order to do it, we will need to measure for each subsystem a) current atomic_t usage - base measurement b) pure refcount_t usage impact c) arch. specific refcount_t impact Otherwise I have a chicken and egg problem here: people want better performance and want to see arch. specific code for this, but in order to convince maintainer in need of arch. specific code, I need to show that we do have indeed a performance issue. So, that's why I was endlessly asking for each subsystem that said that pointed out performance reasons to give hints on what should be measured. This way we can come back with real numbers (including for arch. specific implement) and then decide what makes the most sense. > > At the very least, what is there now could probably be made about twice as > fast > by removing the checks that don't actually help mitigate refcount overflow > bugs, > specifically all the checks in refcount_dec(), and the part of refcount_inc() > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is 0, > the > object has already been freed, so the attacker will have had the opportunity > to > allocate it with contents they control already. refcount_dec() is used very little through the code actually, it is more like an exception case since in order to use it one must really be sure that refcounter doesn't drop to zero. Removing the warn around it wouldn't likely affect much overall and thus it is better to stay to discourage people of API itself :) refcount_inc() is of course a different story, it is extensively used. I guess the perf issue on checking increment from zero might only come from WARNs being printed, but not really from an additional check here for zero since it is trivial and part of the needed loop anyway. So, I think only removing the WARNs might have any visible impact, but even this is probably not really that big. So, I think these changes won't really help adoption of interface if arguments against is performance. If we do have a performance issue, I think arch. specific implementation is likely the only way to considerably speed it up. > > Of course, having extra checks behind a debug option is fine. But they should > not be part of the base feature; the base feature should just be mitigation of > reference count *overflows*. It would be nice to do more, of course; but when > the extra stuff prevents people from using refcount_t for performance reasons, > it defeats the point of the feature in the first place. Sure, but as I said above, I think the smaller tricks and fixes won't be convincing enough, so their value is questionable. > > I strongly encourage anyone who has been involved in refcount_t to experiment > with removing a reference count decrement somewhere in their kernel, then > trying > to exploit it to gain code execution. If you don't know what you're trying to > protect against, you will not know which defences work and which don't. Well, we had successful CVEs and even exploits on this in the past. @Kees, do you store a list of them in the project? What do you actually want to verify? The fact that if you manage to create use-after-free situation, you are able to take advantage of it? Best Regards, Elena. > > - Eric
RE: [PATCH 0/5] v2: block subsystem refcounter conversions
> Hi Elena, > > On Thu, Apr 20, 2017 at 04:10:16PM +, Reshetova, Elena wrote: > > > > > All the objections from DaveM on the amount of cycles spent on the > > > new refcount_t apply to the block layer fast path operations as well. > > > > Ok, could you please indicate the correct way to measure the impact for the > block layer? > > We can do the measurements. > > > > Best Regards, > > Elena. > > > > > > > > Please don't send any more conversions until those have been resolved. > > Like I suggested months ago, how about doing an efficient implementation of > refcount_t which doesn't use the bloated cmpxchg loop? Then there would be > no > need for endless performance arguments. In fact, in PaX there are already > example implementations for several architectures. It's unfortunate that > they're still being ignored for some reason. Providing efficient arch. specific implementation for refcount_t is likely the next step to make performance-sensitive places happy. However, in order to do it, we will need to measure for each subsystem a) current atomic_t usage - base measurement b) pure refcount_t usage impact c) arch. specific refcount_t impact Otherwise I have a chicken and egg problem here: people want better performance and want to see arch. specific code for this, but in order to convince maintainer in need of arch. specific code, I need to show that we do have indeed a performance issue. So, that's why I was endlessly asking for each subsystem that said that pointed out performance reasons to give hints on what should be measured. This way we can come back with real numbers (including for arch. specific implement) and then decide what makes the most sense. > > At the very least, what is there now could probably be made about twice as > fast > by removing the checks that don't actually help mitigate refcount overflow > bugs, > specifically all the checks in refcount_dec(), and the part of refcount_inc() > where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is 0, > the > object has already been freed, so the attacker will have had the opportunity > to > allocate it with contents they control already. refcount_dec() is used very little through the code actually, it is more like an exception case since in order to use it one must really be sure that refcounter doesn't drop to zero. Removing the warn around it wouldn't likely affect much overall and thus it is better to stay to discourage people of API itself :) refcount_inc() is of course a different story, it is extensively used. I guess the perf issue on checking increment from zero might only come from WARNs being printed, but not really from an additional check here for zero since it is trivial and part of the needed loop anyway. So, I think only removing the WARNs might have any visible impact, but even this is probably not really that big. So, I think these changes won't really help adoption of interface if arguments against is performance. If we do have a performance issue, I think arch. specific implementation is likely the only way to considerably speed it up. > > Of course, having extra checks behind a debug option is fine. But they should > not be part of the base feature; the base feature should just be mitigation of > reference count *overflows*. It would be nice to do more, of course; but when > the extra stuff prevents people from using refcount_t for performance reasons, > it defeats the point of the feature in the first place. Sure, but as I said above, I think the smaller tricks and fixes won't be convincing enough, so their value is questionable. > > I strongly encourage anyone who has been involved in refcount_t to experiment > with removing a reference count decrement somewhere in their kernel, then > trying > to exploit it to gain code execution. If you don't know what you're trying to > protect against, you will not know which defences work and which don't. Well, we had successful CVEs and even exploits on this in the past. @Kees, do you store a list of them in the project? What do you actually want to verify? The fact that if you manage to create use-after-free situation, you are able to take advantage of it? Best Regards, Elena. > > - Eric
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
Hi Elena, On Thu, Apr 20, 2017 at 04:10:16PM +, Reshetova, Elena wrote: > > > All the objections from DaveM on the amount of cycles spent on the > > new refcount_t apply to the block layer fast path operations as well. > > Ok, could you please indicate the correct way to measure the impact for the > block layer? > We can do the measurements. > > Best Regards, > Elena. > > > > > Please don't send any more conversions until those have been resolved. Like I suggested months ago, how about doing an efficient implementation of refcount_t which doesn't use the bloated cmpxchg loop? Then there would be no need for endless performance arguments. In fact, in PaX there are already example implementations for several architectures. It's unfortunate that they're still being ignored for some reason. At the very least, what is there now could probably be made about twice as fast by removing the checks that don't actually help mitigate refcount overflow bugs, specifically all the checks in refcount_dec(), and the part of refcount_inc() where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is 0, the object has already been freed, so the attacker will have had the opportunity to allocate it with contents they control already. Of course, having extra checks behind a debug option is fine. But they should not be part of the base feature; the base feature should just be mitigation of reference count *overflows*. It would be nice to do more, of course; but when the extra stuff prevents people from using refcount_t for performance reasons, it defeats the point of the feature in the first place. I strongly encourage anyone who has been involved in refcount_t to experiment with removing a reference count decrement somewhere in their kernel, then trying to exploit it to gain code execution. If you don't know what you're trying to protect against, you will not know which defences work and which don't. - Eric
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
Hi Elena, On Thu, Apr 20, 2017 at 04:10:16PM +, Reshetova, Elena wrote: > > > All the objections from DaveM on the amount of cycles spent on the > > new refcount_t apply to the block layer fast path operations as well. > > Ok, could you please indicate the correct way to measure the impact for the > block layer? > We can do the measurements. > > Best Regards, > Elena. > > > > > Please don't send any more conversions until those have been resolved. Like I suggested months ago, how about doing an efficient implementation of refcount_t which doesn't use the bloated cmpxchg loop? Then there would be no need for endless performance arguments. In fact, in PaX there are already example implementations for several architectures. It's unfortunate that they're still being ignored for some reason. At the very least, what is there now could probably be made about twice as fast by removing the checks that don't actually help mitigate refcount overflow bugs, specifically all the checks in refcount_dec(), and the part of refcount_inc() where it doesn't allow incrementing a 0 refcount. Hint: if a refcount is 0, the object has already been freed, so the attacker will have had the opportunity to allocate it with contents they control already. Of course, having extra checks behind a debug option is fine. But they should not be part of the base feature; the base feature should just be mitigation of reference count *overflows*. It would be nice to do more, of course; but when the extra stuff prevents people from using refcount_t for performance reasons, it defeats the point of the feature in the first place. I strongly encourage anyone who has been involved in refcount_t to experiment with removing a reference count decrement somewhere in their kernel, then trying to exploit it to gain code execution. If you don't know what you're trying to protect against, you will not know which defences work and which don't. - Eric
RE: [PATCH 0/5] v2: block subsystem refcounter conversions
> All the objections from DaveM on the amount of cycles spent on the > new refcount_t apply to the block layer fast path operations as well. Ok, could you please indicate the correct way to measure the impact for the block layer? We can do the measurements. Best Regards, Elena. > > Please don't send any more conversions until those have been resolved.
RE: [PATCH 0/5] v2: block subsystem refcounter conversions
> All the objections from DaveM on the amount of cycles spent on the > new refcount_t apply to the block layer fast path operations as well. Ok, could you please indicate the correct way to measure the impact for the block layer? We can do the measurements. Best Regards, Elena. > > Please don't send any more conversions until those have been resolved.
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
All the objections from DaveM on the amount of cycles spent on the new refcount_t apply to the block layer fast path operations as well. Please don't send any more conversions until those have been resolved.
Re: [PATCH 0/5] v2: block subsystem refcounter conversions
All the objections from DaveM on the amount of cycles spent on the new refcount_t apply to the block layer fast path operations as well. Please don't send any more conversions until those have been resolved.