Re: [RFC PATCH 0/7] runtime format string checking

2018-11-05 Thread Rasmus Villemoes
On 2018-11-01 23:57, Kees Cook wrote:

>> Yes, gcc should be able to infer the constness of drv from the fact that
>> it's never assigned to elsewhere in the function... I think I saw that
>> on some gcc todo list at some point.
> 
> If you find that bug, I'll add it to my gcc bug tracking list. :P

I looked into doing it myself (just for the format checking case, not
for variables in general), but gave up after a few hours. So I created
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87879 . I tried adding you
to the cc list, but it seems you don't have a gcc bugzilla account (?).

Looking more into this, as you can see above, it's actually a little
worse than "false positive" -Wformat-nonliteral - see the f2() case.

[At https://godbolt.org/z/KC4ZRK I also included the const char[] case,
which works as the const char* const case wrt. format checking, but for
some reason gcc open-codes the strlen(). But that's a separate issue,
and not one we should care about, since the const char[] thing is wrong
regardless due to the bad code gen]
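
For readers without the godbolt link at hand, the three declaration styles being compared can be sketched roughly as below. This is an illustration only: the function names and the "cpu%d" format are made up for the example, and the f2() case from the bug report is not reproduced here. Per the discussion above, gcc is expected to check the arguments in the second and third functions, while the first is what -Wformat-nonliteral complains about.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helpers; names and the format string are illustrative. */

/* Plain pointer to const char: the pointer itself is not const, so gcc
 * does not look through it, and -Wformat-nonliteral flags the call. */
int fmt_plain_pointer(char *buf, size_t len, int cpu)
{
    const char *fmt = "cpu%d";

    return snprintf(buf, len, fmt, cpu);
}

/* Const pointer to const char: per the thread, gcc treats this like a
 * literal and checks the arguments against the format. */
int fmt_const_pointer(char *buf, size_t len, int cpu)
{
    const char *const fmt = "cpu%d";

    return snprintf(buf, len, fmt, cpu);
}

/* Const array: format checking works as in the const pointer case, but
 * as an auto variable the array may be materialized on the stack. */
int fmt_const_array(char *buf, size_t len, int cpu)
{
    const char fmt[] = "cpu%d";

    return snprintf(buf, len, fmt, cpu);
}
```

All three produce the same output at run time; the difference is only in what the compiler can diagnose and in the generated code.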

Rasmus


Re: [RFC PATCH 0/7] runtime format string checking

2018-11-02 Thread Kees Cook
On Fri, Nov 2, 2018 at 1:09 PM, Rasmus Villemoes wrote:
> That's a bit too naive. At the very least, you must exclude static
> stuff, i.e. restrict to actual auto variables. Otherwise you're making
> things worse (a "static const char []" just occupies some space in
> .rodata, a "static const char * const" occupies the same space for the
> anonymous literal, plus space for a pointer). Furthermore, you must
> ensure that nobody does sizeof() on VAR. With a trivial extension of
> your script to exclude the "static const char" places, I get

Yes, thank you. That's the part I was forgetting and why I was doing
[] over * back then. There are certainly uses of sizeof() on these
strings. So, it seems better to get sizeof() right than the const-ness
right.
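
The sizeof() hazard can be sketched as follows. The variable and function names here are hypothetical, chosen only for illustration: converting an array declaration to a pointer silently changes what sizeof() returns, from the string length plus the terminating NUL to the size of a pointer.

```c
#include <stddef.h>

/* Hypothetical declarations, for illustration only. */
static const char name_array[] = "acpi-cpufreq";
static const char *const name_ptr = "acpi-cpufreq";

/* sizeof on the array counts the 12 characters plus the trailing NUL. */
size_t array_size(void)
{
    return sizeof(name_array);
}

/* sizeof on the pointer is just the pointer size, whatever the string. */
size_t pointer_size(void)
{
    return sizeof(name_ptr);
}
```

Any caller that uses sizeof(VAR) - 1 as a string length would therefore break quietly after a blanket [] to * conversion.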

>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/format-security=b7dcfc8f48caaafcc423e5793f7ef61b9bb5c458
>>>> This one covers cases where the pointer is pointing to a const string,
>>>> so really there's no sense in injecting the "%s", but I was collecting
>>>> them to make real ones stand out.
>>>
>>> I don't agree. [...]
>>
>> Okay, then I'll forward this to akpm maybe?
>
> Yes, if all they do is replace f(..., s) by f(..., "%s", s) that should
> never hurt. Maybe check if there's a ..._puts() variant that can be used
> instead, e.g. seq_puts().

Alright, I'll see about bringing that series forward in time...

-Kees

-- 
Kees Cook


Re: [RFC PATCH 0/7] runtime format string checking

2018-11-02 Thread Rasmus Villemoes
[trimming cc list]

On 2018-11-01 23:57, Kees Cook wrote:
> On Thu, Nov 1, 2018 at 3:06 PM, Rasmus Villemoes wrote:
>> referring to an anonymous object in .rodata; one gets code gen like
>>
>> +:  31 c0                   xor    %eax,%eax
>> +:  48 b8 61 63 70 69 2d    movabs $0x7570632d69706361,%rax # "acpi-cpu"
>> +:  63 70 75
>> +:  c7 44 24 0b 66 72 65    movl   $0x71657266,0xb(%rsp) # "freq"
>> +:  71
>> +:  c6 44 24 0f 00          movb   $0x0,0xf(%rsp) "\0"
>> +:  48 89 44 24 03          mov    %rax,0x3(%rsp)
> 
> Oh that is nasty. Ugh. I hate the "const but not really ha ha" optimizations. :(
> 
>> It's not the-end-of-the-world-horrible, but it's better avoided,
>> especially for patches that are not supposed to change anything. And
>> longer strings would of course produce even more gunk like the above.
>> A better fix which also silences -Wformat-security is to declare the
>> variable itself const, i.e.
>>
>> const char *const drv = "acpi-cpufreq".
> 
> Yes, that would be much better. Seems like we could do a really easy
> Coccinelle script to fix all of those?
> 
> @@
> identifier VAR;
> expression STRING;
> @@
> 
> - const char VAR[]
> + const char * const VAR
>   = STRING;
> 
> yields:
>  517 files changed, 890 insertions(+), 891 deletions(-)
> 
> Worth doing at the end of -rc2?

That's a bit too naive. At the very least, you must exclude static
stuff, i.e. restrict to actual auto variables. Otherwise you're making
things worse (a "static const char []" just occupies some space in
.rodata, a "static const char * const" occupies the same space for the
anonymous literal, plus space for a pointer). Furthermore, you must
ensure that nobody does sizeof() on VAR. With a trivial extension of
your script to exclude the "static const char" places, I get

 97 files changed, 151 insertions(+), 151 deletions(-)

but that includes a number of places at file level where VAR actually
has external linkage. Which is most likely not intentional, but those
places would need different fixes. Actually, a lot of them are of the
'version = "1.2 (Feb 3 1995)"' kind which are utterly useless, so should
simply be removed (possibly left in a comment).
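
The auto versus static distinction drawn above can be illustrated with a small sketch (function names are hypothetical). An auto array is a separate object in every active call, which is why the compiler may have to build it on the stack; a static array is a single object that just sits in .rodata.

```c
#include <stddef.h>

/* Hypothetical helper: name[] is an auto array, so two calls that are
 * live at the same time each have their own copy. */
int auto_array_is_per_call(const char *outer)
{
    const char name[] = "acpi-cpufreq";

    if (outer == NULL)
        return auto_array_is_per_call(name);
    return name != outer; /* 1: two live calls, two distinct arrays */
}

/* Same shape, but the array is static: one object shared by all calls. */
int static_array_is_per_call(const char *outer)
{
    static const char name[] = "acpi-cpufreq";

    if (outer == NULL)
        return static_array_is_per_call(name);
    return name != outer; /* 0: both calls see the same object */
}
```

Only the first kind is a candidate for the "const char * const" conversion; for the second kind the conversion just adds a pointer on top of the string.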

There's not a whole lot of difference between

const char *const foo = "read";

and

static const char foo[] = "read";

The former allows the linker to share "read" with other identical
literals (or reuse the tail of "thread"), but the actual strings in
these cases are likely to be unique and not suffixes of others. The
latter is probably more readable (at least it's more common), and in
some cases one can slap on an __initconst, making the memory footprint
go away entirely. And when sizeof() is used,

So I think it's better to take the above 151 cases and do them in small
batches.

>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/format-security=b7dcfc8f48caaafcc423e5793f7ef61b9bb5c458
>>> This one covers cases where the pointer is pointing to a const string,
>>> so really there's no sense in injecting the "%s", but I was collecting
>>> them to make real ones stand out.
>>
>> I don't agree. [...]
> 
> Okay, then I'll forward this to akpm maybe?

Yes, if all they do is replace f(..., s) by f(..., "%s", s) that should
never hurt. Maybe check if there's a ..._puts() variant that can be used
instead, e.g. seq_puts().
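
A userspace sketch of the two options, using snprintf() and fputs() as stand-ins for the kernel helpers (the wrapper names are hypothetical): with an explicit "%s", any conversion specifiers inside s are copied verbatim instead of being interpreted, and a puts-style call avoids format parsing altogether.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: f(..., "%s", s) instead of f(..., s). Any '%'
 * characters in s end up in the output unchanged. */
int emit_with_format(char *buf, size_t len, const char *s)
{
    return snprintf(buf, len, "%s", s);
}

/* Hypothetical helper: the puts-style alternative, no format parsing. */
int emit_with_puts(FILE *out, const char *s)
{
    return fputs(s, out);
}
```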

Rasmus