Re: VLA removal (was Re: [RFC 2/2] lustre: use VLA_SAFE)

2018-03-07 Thread Daniel Micay
On 7 March 2018 at 13:09, Linus Torvalds wrote:
> On Wed, Mar 7, 2018 at 9:37 AM, Kees Cook wrote:
>>
>> Building with -Wvla, I see 209 unique locations reported in 60 directories:
>> http://paste.ubuntu.com/p/srQxwPQS9s/
>
> Ok, that's not so bad. Maybe Greg could even add it to one of those
> things he encourages new people to do?
>
> Because at least *some* of them are pretty trivial. For example,
> looking at the core code, I was surprised to see something in
> lib/btree.c

Some are probably just the issue of technically having a VLA that's
not really a VLA:

static const int size = 5;

void foo(void) {
  int x[size];
}

% gcc -c -Wvla foo.c
foo.c: In function ‘foo’:
foo.c:4:3: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
   int x[size];
   ^~~

I don't really understand why the C standard didn't make `static
const` declarations usable as constant expressions like C++. They made
the pointer conversions more painful too.

It would be nice to get rid of those cases to use -Werror=vla though.
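
For cases like the one above, one way to keep a named size while making
it a real constant expression in C is an enumeration constant or a
macro. This is only a minimal sketch and not taken from any of the
affected kernel sites; FOO_SIZE, BAR_SIZE, foo_elems() and bar_elems()
are made-up names for illustration:

```c
#include <stddef.h>

/*
 * An enumeration constant is an integer constant expression in C,
 * unlike a `static const int` object.
 */
enum { FOO_SIZE = 5 };

/* A macro also works; it is replaced by the literal before compilation. */
#define BAR_SIZE 5

/* Element count of an array sized by the enum constant. */
size_t foo_elems(void)
{
	int x[FOO_SIZE] = { 0 };	/* not a VLA, so an initializer is allowed */

	return sizeof(x) / sizeof(x[0]);
}

/* Element count of an array sized by the macro. */
size_t bar_elems(void)
{
	int y[BAR_SIZE] = { 0 };	/* not a VLA either */

	return sizeof(y) / sizeof(y[0]);
}
```

Neither array should trigger -Wvla, since both sizes are integer
constant expressions.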


Re: VLA removal (was Re: [RFC 2/2] lustre: use VLA_SAFE)

2018-03-07 Thread Linus Torvalds
On Wed, Mar 7, 2018 at 9:37 AM, Kees Cook wrote:
>
> Building with -Wvla, I see 209 unique locations reported in 60 directories:
> http://paste.ubuntu.com/p/srQxwPQS9s/

Ok, that's not so bad. Maybe Greg could even add it to one of those
things he encourages new people to do?

Because at least *some* of them are pretty trivial. For example,
looking at the core code, I was surprised to see something in
lib/btree.c

And that is just garbage: it uses

unsigned long key[geo->keylen];

which looks really dangerous, but that "struct btree_geo" is internal
to that file, and there are exactly three instances of it, with 32, 64
and 128 bit keys respectively. Note that "keylen" isn't actually the
number of bits, but how many long-words you need.

So in actual fact, that array is limited to that 128 bits - just 16
bytes. So keylen is at most 4 (on 32-bit architectures) or 2 (on
64-bit ones).

Using

   #define MAXKEYLEN BITS_TO_LONGS(128)

or something like that would be trivial.

AND USING VLA'S IS ACTIVELY STUPID! It generates much more code, and
much _slower_ code (and more fragile code), than just using a fixed
key size would have done.
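
A rough userspace sketch of what the fixed-size variant could look
like. This is not the actual lib/btree.c change: the two macros are
stand-ins for the kernel's BITS_PER_LONG and BITS_TO_LONGS(), and
struct geo and key_is_zero() are hypothetical, reduced examples:

```c
#include <limits.h>
#include <string.h>

/* Userspace stand-ins for the kernel's BITS_PER_LONG and BITS_TO_LONGS(). */
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(nr)	(((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Largest key any geometry needs: 128 bits. */
#define MAXKEYLEN		BITS_TO_LONGS(128)

/* Hypothetical, reduced stand-in for the file-local struct btree_geo. */
struct geo {
	int keylen;	/* key length in longs, not in bits */
};

/* Returns 1 if the key is all zeroes, 0 if not, -1 if keylen is out of range. */
int key_is_zero(const struct geo *geo, const unsigned long *key)
{
	unsigned long tmp[MAXKEYLEN];	/* fixed size, was: tmp[geo->keylen] */
	int i;

	if (geo->keylen < 1 || (size_t)geo->keylen > MAXKEYLEN)
		return -1;

	/* Only the first keylen longs of the buffer are used. */
	memcpy(tmp, key, (size_t)geo->keylen * sizeof(tmp[0]));
	for (i = 0; i < geo->keylen; i++)
		if (tmp[i])
			return 0;
	return 1;
}
```

The array size is now a compile-time constant, so the compiler can lay
out the stack frame statically, and the explicit range check documents
the bound that the VLA only implied.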

Ok, so lib/btree.c looks more core (by being in lib/) than it actually
is - I don't see the 128-bit btree being used *anywhere*, and the
others are only used by two drivers: the qla2xxx scsi driver and the
bcm2835-camera driver in staging.

Anyway, some of these are definitely easy to just fix, and using VLA's
is actively bad not just for security worries, but simply because
VLA's are a really horribly bad idea in general in the kernel.

Added Jörn Engel to the cc, since I looked at that lib/btree.c thing.

But that is just three of the 209 instances. Some of the others might
be slightly more painful to fix.

  Linus


VLA removal (was Re: [RFC 2/2] lustre: use VLA_SAFE)

2018-03-07 Thread Kees Cook
On Wed, Mar 7, 2018 at 2:10 AM, Tobin C. Harding wrote:
> On Tue, Mar 06, 2018 at 09:46:02PM -0800, Kees Cook wrote:
>> On Tue, Mar 6, 2018 at 9:27 PM, Tobin C. Harding wrote:
>> > Currently lustre uses a VLA to store a string on the stack.  We can use
>> > the newly defined VLA_SAFE macro to make this declaration safer.
>> >
>> > Use VLA_SAFE to declare VLA.
>>
>> VLA_SAFE implements a max, which is nice, but I think we're just
>> digging ourselves into a bigger hole with this, since now all the
>> maxes must be validated (which isn't done here, what happens if
>> VLA_DEFAULT_MAX is smaller than the strlen() math? We'll overflow the
>> stack buffer in the later sprintf).
>
> ok, let's drop this.
>
> Memory on the stack is always going to be faster than memory from the
> slub allocator, right?  Do you think using kasprintf() is going to be
> acceptable? Isn't it only going to be acceptable on non-time critical
> paths?  I'm still trying to get my head around how we get rid of VLA
> when the stack is faster?  Is this a speed vs safety trade off that must
> be tackled on a case by case basis?

It really does need to be a case-by-case basis. It'll be a balance of
speed, safety, and sanity. :) In the lustre case, that's both a bug
fix (buffer over-run) and an unbounded VLA removal. Putting a string
of unknown length on the stack tends not to be sensible, so the
kmalloc/kfree is reasonable, IMO.
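
To illustrate the heap-based approach outside the kernel, here is a
userspace analogue of what kasprintf() does: measure the formatted
length first, then allocate exactly that much. This is a sketch, not
the lustre fix; alloc_sprintf() is a made-up name, and malloc()/free()
stand in for kmalloc()/kfree():

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a heap buffer sized from the actual output length.
 * Returns NULL on failure; the caller frees the result.
 */
char *alloc_sprintf(const char *fmt, ...)
{
	va_list ap, ap2;
	char *buf;
	int len;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure only, writes nothing */
	va_end(ap);

	if (len < 0) {
		va_end(ap2);
		return NULL;
	}

	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```

Because the buffer is sized from the formatted output itself, there is
no separate strlen() math that can get out of sync with the later
sprintf, and a failed allocation is reported instead of overrunning the
stack.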

Building with -Wvla, I see 209 unique locations reported in 60 directories:
http://paste.ubuntu.com/p/srQxwPQS9s/

In the case of the crypto, my past thoughts have included either
adding a buffer to some already-allocated context, or using an upper
bound on the VLAs, since there's a fixed number of implementations
built in at any given time. Though, I suspect neither will work
without more examination. Usually, if it were easy, it'd be done
already. ;)
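
The upper-bound idea could be sketched roughly like this. Everything
here is hypothetical: MAX_DIGEST_SIZE, its value of 64, struct
hash_impl and fill_digest() are illustrative only and not the crypto
API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time ceiling across all built-in implementations. */
#define MAX_DIGEST_SIZE 64

/* Hypothetical descriptor; digest_size is only known at run time. */
struct hash_impl {
	size_t digest_size;
};

/*
 * Builds digest_size bytes in a fixed-size scratch buffer and copies
 * them to out. Returns 0 on success, -1 if the implementation needs
 * more than the compile-time bound.
 */
int fill_digest(const struct hash_impl *impl, unsigned char *out,
		unsigned char fill)
{
	unsigned char scratch[MAX_DIGEST_SIZE];	/* was: scratch[impl->digest_size] */

	if (impl->digest_size > sizeof(scratch))
		return -1;

	memset(scratch, fill, impl->digest_size);
	memcpy(out, scratch, impl->digest_size);
	return 0;
}
```

The cost is that every new implementation has to be checked against
the bound, which is the validation burden mentioned earlier for
VLA_SAFE.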

To try to keep from adding new VLAs, maybe we could add -Wvla to the
W=n level in scripts/Makefile.extrawarn. Likely W=2:

# W=1 - warnings that may be relevant and does not occur too often
# W=2 - warnings that occur quite often but may still be relevant
# W=3 - the more obscure warnings, can most likely be ignored

And frankly, maybe -Wformat-security -- and perhaps format-truncation
and format-overflow -- should get added to W=2 too... they've gotten
much less noisy over time, though still very noisy. ;)

Or, as mentioned in another thread, disable -Wvla in certain
directories but enable it at the top-level. I'm less of a fan of that,
though, since it tends to lead to the problem just getting forgotten.

-Kees

-- 
Kees Cook
Pixel Security