Re: [RFC] Mitigating unexpected arithmetic overflow

Kees Cook Wed, 08 May 2024 12:45:04 -0700

On Wed, May 08, 2024 at 10:52:44AM -0700, Linus Torvalds wrote:
> On Tue, 7 May 2024 at 16:27, Kees Cook <[email protected]> wrote:
> [...]
> Put another way the absolute first and fundamental thing you should
> look at is to make sure tools don't complain about sane behavior.


Agreed, and I explicitly called this out in the proposals, including
adding several idioms that have been found besides the case you mention:

> > - allow for common idioms to bypass sanitizers:
> >   - explicit wrap-around tests: if (var + offset < var)
> >   - loop iterator wrap-around during post-decrement: while (var--)
> >   - negative unsigned constants: -1UL

What I need, though, is for _intent_ to be captured at the source level.
Unfortunately neither compilers nor humans are omniscient. Wrapping vs
non-wrapping intent can only rarely be unambiguously determined. I tried
to explain that here:

> > This is what is meant through-out by "ambiguous": the _intent of the
> > author_ can be ambiguous as it relates to whether a calculation is
> > meant to wrap-around. That it wraps around on overflow is not ambiguous:
> > it will wrap-around. :) See "Terminology Considerations".
> >
> > This is the corner stone of the problem: even though the behavior of
> > overflow is well-defined, so many authors so often don't correctly handle
> > it that the results threaten the integrity of Linux as a whole. C makes
> > it too easy to get it wrong, so we are left needing to fix this at a
> > fundamental level. This is not a developer education problem; it is a
> > problem with C language semantics. "Just do it correctly" has not
> > worked.


> The thing is, wrap-around is not only well-defined, it's *common*, and
> *EXPECTED*.

Yes, common, but that still means "sometimes unexpected". The fact that
when wrap-around is wanted, an author does nothing different from when
it is unwanted is the core of the problem: C doesn't provide a way for
us to see intent. We need to fix this.

> Example:
> 
>    static inline u32 __hash_32_generic(u32 val)
>    {
>         return val * GOLDEN_RATIO_32;
>    }

But what about:

static inline u32 __item_offset(u32 val)
{
        return val * ITEM_SIZE_PER_UNIT;
}

All I did was change the names of things and now we have to wonder if
the result is going to be used for indexing or sizing, and maybe it
never considered that there might be a way for val to be greater than
UINT_MAX / ITEM_SIZE_PER_UNIT.

So instead, how about:

static inline u32_wrap __hash_32_generic(u32_wrap val)
{
         return val * GOLDEN_RATIO_32;
}

Now intent is clear.

> and dammit, I absolutely DO NOT THINK we should annotate this as some
> kind of "special multiply".

Yes, I agree, annotating the calculations individually seems like it would
make things much less readable. And this matches the nearly universal
feedback from developers. The solution used in other languages, with
much success, is to tie wrapping intent to types. We can do the same.

> I have no idea how many of these exist. But I am 100% convinced that
> making humans annotate this and making the source code worse is
> absolutely the wrong approach.

I'd like to make the source better, not worse. :) But if we're going
to solve this, and do it incrementally, I think progressively updating
types is the way to go. We have used types in plenty of places where we
want the compiler to help with checking (gfp_t, pid_t, etc), or with
specific behaviors (atomic_t, refcount_t, etc). This would be more of
that, though, yes, to a greater degree.

> So I think you need to limit your wrap-around complaints, and really
> think about how to limit them. If you go "wrap-around is wrong" as
> some kind of general; rule, I'm going to ignore you, and I'm going to
> tell people to ignore you, and refuse any idiotic patches that are the
> result of such idiotic rules.

This doesn't take into account the reality of deployments worldwide that
are getting exploited by unexpected wrap-around. There's no problem with
wrap-around per se, it's that we have no source-level way to indicate
when it is _intended_. I would phrase the rule as "ambiguous intent of
wrap-around is wrong".

> Put another way the absolute first and fundamental thing you should
> look at is to make sure tools don't complain about sane behavior.

What do you see as the solution for the ambiguous intent problem?

> Until you have done that, and taken this seriously, this discussion is
> not going to ever result in anything productive.

I think I've shown pretty clearly that I take this seriously. We have
been working out feasibility and potential directions for some time now,
engaging with compiler communities, building the tooling needed to do
this with as minimal impact as possible, etc. I didn't send this RFC
for fun. :)

But let me make sure we're on the same page: do you agree that we need
to protect Linux from classes of bugs? That there is value in protecting
our billions of users from avoidable and exploitable flaws that have
literally life-threatening consequences? If we don't agree that there
is a problem here worth solving, then yes, it will be hard to have a
productive discussion.

And please remember that I'm not saying "let's fix it overnight", I'm
looking to coordinate a best way forward. I want to make a plan. Neither
human nor compiler omniscience has manifested, so what's the next step?

-Kees

-- 
Kees Cook

Re: [RFC] Mitigating unexpected arithmetic overflow

Reply via email to