On Wed, May 08, 2024 at 10:52:44AM -0700, Linus Torvalds wrote: > On Tue, 7 May 2024 at 16:27, Kees Cook <keesc...@chromium.org> wrote: > [...] > Put another way the absolute first and fundamental thing you should > look at is to make sure tools don't complain about sane behavior.
Agreed, and I explicitly called this out in the proposals, including adding several idioms that have been found besides the case you mention: > > - allow for common idioms to bypass sanitizers: > > - explicit wrap-around tests: if (var + offset < var) > > - loop iterator wrap-around during post-decrement: while (var--) > > - negative unsigned constants: -1UL What I need, though, is for _intent_ to be captured at the source level. Unfortunately neither compilers nor humans are omniscient. Wrapping vs non-wrapping intent can only rarely be unambiguously determined. I tried to explain that here: > > This is what is meant through-out by "ambiguous": the _intent of the > > author_ can be ambiguous as it relates to whether a calculation is > > meant to wrap-around. That it wraps around on overflow is not ambiguous: > > it will wrap-around. :) See "Terminology Considerations". > > > > This is the corner stone of the problem: even though the behavior of > > overflow is well-defined, so many authors so often don't correctly handle > > it that the results threaten the integrity of Linux as a whole. C makes > > it too easy to get it wrong, so we are left needing to fix this at a > > fundamental level. This is not a developer education problem; it is a > > problem with C language semantics. "Just do it correctly" has not > > worked. > The thing is, wrap-around is not only well-defined, it's *common*, and > *EXPECTED*. Yes, common, but that still means "sometimes unexpected". The fact that when wrap-around is wanted, an author does nothing different from when it is unwanted is the core of the problem: C doesn't provide a way for us to see intent. We need to fix this. > Example: > > static inline u32 __hash_32_generic(u32 val) > { > return val * GOLDEN_RATIO_32; > } But what about: static inline u32 __item_offset(u32 val) { return val * ITEM_SIZE_PER_UNIT; } All I did was change the names of things and now we have to wonder if the result is going to be used for indexing or sizing, and maybe it never considered that there might be a way for val to be greater than UINT_MAX / ITEM_SIZE_PER_UNIT. So instead, how about: static inline u32_wrap __hash_32_generic(u32_wrap val) { return val * GOLDEN_RATIO_32; } Now intent is clear. > and dammit, I absolutely DO NOT THINK we should annotate this as some > kind of "special multiply". Yes, I agree, annotating the calculations individually seems like it would make things much less readable. And this matches the nearly universal feedback from developers. The solution used in other languages, with much success, is to tie wrapping intent to types. We can do the same. > I have no idea how many of these exist. But I am 100% convinced that > making humans annotate this and making the source code worse is > absolutely the wrong approach. I'd like to make the source better, not worse. :) But if we're going to solve this, and do it incrementally, I think progressively updating types is the way to go. We have used types in plenty of places where we want the compiler to help with checking (gfp_t, pid_t, etc), or with specific behaviors (atomic_t, refcount_t, etc). This would be more of that, though, yes, to a greater degree. > So I think you need to limit your wrap-around complaints, and really > think about how to limit them. If you go "wrap-around is wrong" as > some kind of general; rule, I'm going to ignore you, and I'm going to > tell people to ignore you, and refuse any idiotic patches that are the > result of such idiotic rules. This doesn't take into account the reality of deployments worldwide that are getting exploited by unexpected wrap-around. There's no problem with wrap-around per se, it's that we have no source-level way to indicate when it is _intended_. I would phrase the rule as "ambiguous intent of wrap-around is wrong". > Put another way the absolute first and fundamental thing you should > look at is to make sure tools don't complain about sane behavior. What do you see as the solution for the ambiguous intent problem? > Until you have done that, and taken this seriously, this discussion is > not going to ever result in anything productive. I think I've shown pretty clearly that I take this seriously. We have been working out feasibility and potential directions for some time now, engaging with compiler communities, building the tooling needed to do this with as minimal impact as possible, etc. I didn't send this RFC for fun. :) But let me make sure we're on the same page: do you agree that we need to protect Linux from classes of bugs? That there is value in protecting our billions of users from avoidable and exploitable flaws that have literally life-threatening consequences? If we don't agree that there is a problem here worth solving, then yes, it will be hard to have a productive discussion. And please remember that I'm not saying "let's fix it overnight", I'm looking to coordinate a best way forward. I want to make a plan. Neither human nor compiler omniscience has manifested, so what's the next step? -Kees -- Kees Cook