Re: [Cocci] linux-kernel janitorial RFP: Mark static arrays as const

2021-03-03 Thread Rasmus Villemoes
On 02/03/2021 18.42, Joe Perches wrote:
> Here is a possible opportunity to reduce data usage in the kernel.
> 
> $ git grep -P -n '^static\s+(?!const|struct)(?:\w+\s+){1,3}\w+\s*\[\s*\]' 
> drivers/ | \
>   grep -v __initdata | \
>   wc -l
> 3250
> 
> Meaning there are ~3000 declarations of arrays with what appears to be
> file static const content that are not marked const.
> 
> So there are many static arrays that could be marked const to move the
> compiled object code from data to text minimizing the total amount of
> exposed r/w data.

You can add const if you like, but it will rarely change the generated
code. gcc is already smart enough to take a static array whose contents
are provably never modified within the TU and put it in .rodata:

static int x = 7;
static int y[2] = {13, 19};

static int p(int a, const int *foo)
{
return a + *foo;
}
int q(int a)
{
int b = p(a, );
return p(b, [b & 1]);
}
$ nm c.o
 T q
 r y
$ size c.o
   textdata bss dec hex filename
111   0   0 111  6f c.o

So x gets optimized away completely, but y isn't so easy to get rid of -
nevertheless, it's never modified and the address doesn't escape the TU,
so gcc treats is as if it was declared const.

I think you'll see the same if you try adding the const on a few of your
real-life examples. (Of course, the control flow may be so convoluted
that gcc isn't able to infer the constness, so I'm not saying it will
never make a difference - only that you shouldn't expect too much.)

Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [PATCH v2] scripts: coccinelle: check for !(un)?likely usage

2019-08-30 Thread Rasmus Villemoes
On 30/08/2019 08.56, Denis Efremov wrote:
> 
> 
> On 30.08.2019 03:42, Julia Lawall wrote:
>>
>>
>> On Thu, 29 Aug 2019, Denis Efremov wrote:
>>
>>> On 8/29/19 8:10 PM, Denis Efremov wrote:
 This patch adds coccinelle script for detecting !likely and
 !unlikely usage. These notations are confusing. It's better
 to replace !likely(x) with unlikely(!x) and !unlikely(x) with
 likely(!x) for readability.
>>>
>>> I'm not sure that this rule deserves the acceptance.
>>> Just to want to be sure that "!unlikely(x)" and "!likely(x)"
>>> are hard-readable is not only my perception and that they
>>> become more clear in form "likely(!x)" and "unlikely(!x)" too.
>>
>> Is likely/unlikely even useful for anything once it is a subexpression?
>>> julia
>>
> 
> Well, as far as I understand it,

Yes, and it could in fact make sense in cases like

  if (likely(foo->bar) && unlikely(foo->bar->baz)) {
 do_stuff_with(foo->bar->baz);
 do_more_stuff();
  }

which the compiler could then compile as (of course actual code
generation is always much more complicated due to things in the
surrounding code)

load foo->bar;
test bar;
if 0 goto skip;
load bar->baz;
test baz;
if !0 goto far_away;
skip:
  

so in the normal flow, neither branch is taken. If instead one wrote
unlikely(foo->bar && foo->bar->baz), gcc doesn't really know why we
expect the whole conjuntion to turn out false, so it could compile this
as a jump when foo->bar turns out non-zero - i.e., in the normal case,
we'd end up jumping.

But as far as !(un)likely(), I agree that it's easier to read as a human
if the ! operator is moved inside (and the "un" prefix stripped/added).
Whether it deserves a cocci script I don't know.

Rasmus


Re: [PATCH] scripts: coccinelle: check for !(un)?likely usage

2019-08-28 Thread Rasmus Villemoes
On 25/08/2019 21.19, Julia Lawall wrote:
> 
> 
>> On 26 Aug 2019, at 02:59, Denis Efremov  wrote:
>>
>>
>>
>>> On 25.08.2019 19:37, Joe Perches wrote:
 On Sun, 2019-08-25 at 16:05 +0300, Denis Efremov wrote:
 This patch adds coccinelle script for detecting !likely and !unlikely
 usage. It's better to use unlikely instead of !likely and vice versa.
>>>
>>> Please explain _why_ is it better in the changelog.
>>>
>>
>> In my naive understanding the negation (!) before the likely/unlikely
>> could confuse the compiler
> 
> As a human I am confused. Is !likely(x) equivalent to x or !x?

#undef likely
#undef unlikely
#define likely(x) (x)
#define unlikely(x) (x)

should be a semantic no-op. So changing !likely(x) to unlikely(x) is
completely wrong. If anything, !likely(x) can be transformed to
unlikely(!x).

Rasmus


Re: [Cocci] [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct

2018-05-03 Thread Rasmus Villemoes
On 2018-05-01 19:00, Kees Cook wrote:
> On Mon, Apr 30, 2018 at 2:29 PM, Rasmus Villemoes
> <li...@rasmusvillemoes.dk> wrote:
>>
>> gcc 5.1+ (I think) have the __builtin_OP_overflow checks that should
>> generate reasonable code. Too bad there's no completely generic
>> check_all_ops_in_this_expression(a+b*c+d/e, or_jump_here). Though it's
>> hard to define what they should be checked against - probably would
>> require all subexpressions (including the variables themselves) to have
>> the same type.
>>
>> plug: https://lkml.org/lkml/2015/7/19/358
> 
> That's a very nice series. Why did it never get taken?

Well, nobody seemed particularly interested, and then
https://lkml.org/lkml/2015/10/28/215 happened... but he did later seem
to admit that it could be useful for the multiplication checking, and
that "the gcc interface for multiplication overflow is fine".

I still think even for unsigned types overflow checking can be subtle. E.g.

u32 somevar;

if (somevar + sizeof(foo) < somevar)
  return -EOVERFLOW;
somevar += sizeof(this);

is broken, because the LHS is promoted to unsigned long/size_t, then so
is the RHS for the comparison, and the comparison is thus always false
(on 64bit). It gets worse if the two types are more "opaque", and in any
case it's not always easy to verify at a glance that the types are the
same, or at least that the expression of the widest type is on the RHS.

> It seems to do the right things quite correctly.

Yes, I wouldn't suggest it without the test module verifying corner
cases, and checking it has the same semantics whether used with old or
new gcc.

Would you shepherd it through if I updated the patches and resent?

Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct

2018-04-30 Thread Rasmus Villemoes
On 2018-04-30 22:16, Matthew Wilcox wrote:
> On Mon, Apr 30, 2018 at 12:02:14PM -0700, Kees Cook wrote:
>>
>> Getting the constant ordering right could be part of the macro
>> definition, maybe? i.e.:
>>
>> static inline void *kmalloc_ab(size_t a, size_t b, gfp_t flags)
>> {
>> if (__builtin_constant_p(a) && a != 0 && \
>> b > SIZE_MAX / a)
>> return NULL;
>> else if (__builtin_constant_p(b) && b != 0 && \
>>a > SIZE_MAX / b)
>> return NULL;
>>
>> return kmalloc(a * b, flags);
>> }
> 
> Ooh, if neither a nor b is constant, it just didn't do a check ;-(  This
> stuff is hard.
> 
>> (I just wish C had a sensible way to catch overflow...)
> 
> Every CPU I ever worked with had an "overflow" bit ... do we have a
> friend on the C standards ctte who might figure out a way to let us
> write code that checks it?

gcc 5.1+ (I think) have the __builtin_OP_overflow checks that should
generate reasonable code. Too bad there's no completely generic
check_all_ops_in_this_expression(a+b*c+d/e, or_jump_here). Though it's
hard to define what they should be checked against - probably would
require all subexpressions (including the variables themselves) to have
the same type.

plug: https://lkml.org/lkml/2015/7/19/358

Rasmus

___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] "bad" macros in struct definitions

2017-06-08 Thread Rasmus Villemoes
On 8 June 2017 at 13:02, Julia Lawall  wrote:
>
>
> Could you send a concrete semantic patch and .c file that is not givig the
> desired result?  Coccinelle does parsing in multiple passes.  It may fail
> on an earlier pass (and emit error messages if you use --verbose-parsing)
> but still succeed on a later one.

Something like the attached. As is, I get

$ spatch --macro-file test.h --sp-file ./test.cocci ./test.c
init_defs_builtins: /usr/local/lib/coccinelle/standard.h
init_defs: test.h
HANDLING: ./test.c

but if I comment out the OBJHEAD I get the expected match in f2:

$ spatch --macro-file test.h --sp-file ./test.cocci ./test.c
init_defs_builtins: /usr/local/lib/coccinelle/standard.h
init_defs: test.h
HANDLING: ./test.c
diff =
--- ./test.c
+++ /tmp/cocci-output-22044-346b6b-test.c
@@ -21,5 +21,4 @@ void f1(struct s *s)

 void f2(struct s *s)
 {
-   s->foo = malloc(sizeof(struct bar));
 }


test.cocci
Description: Binary data
#define OBJHEAD
#define OBJHEAD		\
	int type;	\
	char name[100]; \

struct foo {
	char c;
};
struct bar {
	long l;
};

struct s {
	OBJHEAD
	struct foo *foo;
};

void f1(struct s *s)
{
	s->foo = malloc(sizeof(struct foo));
}

void f2(struct s *s)
{
	s->foo = malloc(sizeof(struct bar));
}
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] "bad" macros in struct definitions

2017-06-08 Thread Rasmus Villemoes
I'm trying to apply spatch to libnl
(https://github.com/thom311/libnl), but I'm having problems with
spatch's parsing of the header files. libnl has a macro

#define NLHDR_COMMON\
int ce_refcnt;  \
struct nl_object_ops *  ce_ops; \
struct nl_cache *   ce_cache;   \
struct nl_list_head ce_list;\
int ce_msgtype; \
int ce_flags;   \
uint64_tce_mask;


and many data structures are defined like

struct rtnl_link
{
NLHDR_COMMON

charl_name[IFNAMSIZ];
uint32_tl_family;
uint32_tl_arptype;
...

I tried adding a cocci.h file containing just

#define NLHDR_COMMON

and giving that as a --macro-file argument, but spatch still fails to
parse the structure. If I manually remove the NLHDR_COMMON from
rtnl_link above, everything works (that is, coccinelle now knows the
types of the various rtnl_link members and my .cocci works as
expected). Am I doing something wrong, or is this a bug in the "bad
macro" handling?

Related feature request: For now, I'm not really interested in the
members defined by the NLHDR_COMMON macro, so being able to just stub
it out would be fine. But I could imagine it would be nice to have it
expand to its actual definition. I suppose that could be done by
copying its definition to one's local --macro-file, but it would be
nicer to have a way to say "please always expand this macro to
whatever its current definition is".


$ spatch --version
spatch version 1.0.6-00153-g36e91e8 compiled with OCaml version 4.02.3
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] extremely slow processing of trivial header file

2015-10-16 Thread Rasmus Villemoes
Hi

I just ran into a slightly weird case. spatch takes forever handling a
simple header file which contains nothing but #defines.

I tried

  https://github.com/Villemoes/linux-cocci/blob/master/microopt/copy_bit.cocci

on

  drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h

With --profile, the first few lines are

$ spatch --include-headers --no-includes --patch . --profile -I 
arch/x86/include -I arch/x86/include/generated -I include -I 
arch/x86/include/uapi -I arch/x86/include/generated/uapi -I include/uapi -I 
include/generated/uapi -I include/linux/kconfig.h -D patch --cocci-file 
/home/villemoes/projects/linux-cocci/microopt/copy_bit.cocci 
./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h
init_defs_builtins: /usr/local/lib/coccinelle/standard.h
HANDLING: ./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h
Note: processing took41.8s: 
./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h
starting: Common.group_assoc_bykey_eff
ending: Common.group_assoc_bykey_eff, 0.04s
-
profiling result
-
Main total   : 42.428 sec  1 count
Main.outfiles computation: 41.845 sec  1 count
full_engine  : 41.845 sec  1 count
bigloop  : 39.998 sec  1 count
process_a_ctl_a_env_a_toplevel   : 39.933 sec  93930 count
mysat: 39.782 sec  93930 count
rule3a   : 39.451 sec  1 count

Other headers in drivers/gpu/drm/amd/include/ also take a long time. Any
ideas? I'm using 'spatch version 1.0.2 with Python support and with PCRE
support'.

Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Source code adjustments together with comments?

2015-02-04 Thread Rasmus Villemoes
On Wed, Feb 04 2015, Cyril Hrubis chru...@suse.cz wrote:

 Hi!
  Would you like to be more explicit in the semantic patch language
  around source code adjustments which should also affect
  corresponding comments?
 
 It seems complicated, and people don't always follow the same comventions.
 
 Probably it is thinking that the code looks like this:
 
 /*some comment on b */
 int b;

 That explains it.

 I could change it so that this strategy is only followed if /*some comment
 on b */ starts at the beginning of the line.

 That sounds good. What would be the strategy for the other case then?

It seems to be a reasonable heuristic that a comment beginning on a line
which also contains code is attached to that code, while a comment
beginning on a line by itself is attached to the following code (whether
that means one or more lines of code is of course impossible to guess).

So, in the former case, if the code is simply removed, the comment should
also vanish. But what if the semantic patch contains + code? Something
like

- int hash;
+ unsigned int hash;

In that case, it's probably best to leave the comment. But one can
probably always find examples where whatever coccinelle does, something
else would have been better. For example, if the comment should stay but
needs rewording, good luck teaching any computer program to do that.

In short, nothing can save one from doing a manual review of the
generated patch, especially when comments are involved.

Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Some spatch questions

2015-02-03 Thread Rasmus Villemoes
On Tue, Feb 03 2015, Julia Lawall julia.law...@lip6.fr wrote:

 OK, I'm not sure to understand everything the script does, but could it be
 that you are passing all of the file names individually on the command
 line?   That is, are you trying to do something like

 spatch foo.cocci one.c two.c three.c four.c


Yes, that's how GNU parallel works (it's mostly xargs with lots of bells
and whistles), and the only way I could find to pass files to spatch
without letting it discover the files itself.

 If that is the case, then the memory usage is normal.  Putting
 multiple files on the command line means that you want them all to be
 handled at once.  For example, there may be some function in one.c
 whose properties should influence the transformation of four.c.  So it
 holds the parsed versions of all of the files in memory at once.

That kind of makes sense. But I would very much like an option to turn
off that behaviour, and make it process each file individually.

 Fortunately, we have finally implemented parallelism inside
 Coccinelle.  If you take version 1.0.0-rc24 from the web page, which I
 haven't had time to properly announce, you can use the argument -j NNN
 with spatch on a complete directory and it will work on the different
 files in parallel (with NNN instances).

Sounds good, but I think it would be nice with a somewhat more flexible
way of choosing the files to process, the most obvious being passing the
files explicitly, and for that an option as above seems to be absolutely
essential if one passes more than around 100 files. Moreover, I will
probably continue to use a wrapper script around spatch, to handle
tedious things such as passing appropriate -I options, extracting the //
Options: line etc. Handling the parallelism in that wrapper is then no
big deal.

Thanks,
Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Finding embedded function names?

2014-12-05 Thread Rasmus Villemoes
On Fri, Dec 05 2014, Joe Perches j...@perches.com wrote:

 On Fri, 2014-12-05 at 08:18 +0100, Julia Lawall wrote:
 On Thu, 4 Dec 2014, Joe Perches wrote:
 
 Yes, by using python/ocaml:
 
 @r@
 char [] c;
 position p;
 identifier f;
 @@
 
 f(...,c@p,...)
 
 @script:ocaml@
 c  r.c;
 p  r.p;
 @@
 
 let ce = (List.hd p).current_element in
 if List.length(Str.split_delim (Str.regexp ce) c)  1
 then Printf.printf %s:%d: %s\n
(List.hd p).file (List.hd p).line c

 Good to know, thanks.

 Here are some results:
 
 drivers/net/wireless/zd1211rw/zd_usb.c:1573: %s usb_exit()\n
 []
 The idea would be to replace these by %s and __func__?  That would also be 
 possible.

 Yes and no.

 A lot of these are function tracing style messages and
 those should just be deleted.

Hardcoding the function name in a literal string also makes typos (or
copy-pastos) possible. I extended Julia's code to allow a small edit
distance. Requires the Levenshtein python module (on debian, apt-get
install python-levenshtein). It's not terribly slow, but to be really
useful I (or someone) would need to reduce the number of false positives.

One gets for example

drivers/net/wireless/rtlwifi/rtl8821ae/dm.c:2082:rtl8821ae_dm_txpwr_track_set_pwr():2:
 ===rtl8812ae_dm_txpwr_track_set_pwr\n
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:1229:bnx2x_ets_e3b0_config():2:
 bnx2x_ets_E3B0_config SP failed\n

The first of these is probably one of the tracing style messages; the
second probably falls in the 'should use %s, __func__' category (other
strings in that function actually use lowercase e3b0).


@initialize:python@
@@
import re
from Levenshtein import distance
mindist = 1
maxdist = 2
ignore_leading = True

@r@
char [] c;
position p;
identifier f;
@@

f(...,c@p,...)

@script:python@
c  r.c;
p  r.p;
@@

func = p[0].current_element
wpattern = [a-zA-Z_][a-zA-Z0-9_]*
if ignore_leading:
   func = func.strip(_)
   wpattern = [a-zA-Z][a-zA-Z0-9_]*
lf = len(func)
// ignore extremely short function names
if lf  3:
   words = [w for w in re.findall(wpattern, c) if abs(len(w) - lf) = maxdist]
   for w in words:
   d = distance(w, func) 
   if mindist = d and d = maxdist:
  print %s:%d:%s():%d: %s % (p[0].file, int(p[0].line), func, d, c)
  break



Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Problem with already tagged token

2014-09-11 Thread Rasmus Villemoes
On Wed, Sep 10 2014, Julia Lawall julia.law...@lip6.fr wrote:

 Clever :)

 I haven't tried this, but I think you cudl do the following:

 @r@
 position p;
 @@

 seq_puts(...);
 seq_puts(...);
 seq_puts@p(...);

 @concat1 depends on patch@
 expression s;
 constant c1, c2;
 position p1, p2;

 Now change this line to

 position p1 != r.p, p2 != r.p;

 Then you can either use Coccinelle to do the iteration, or just run the
 semantic patch over and over on the code.


Thanks, that seems to work. Can you tell me how I would get coccinelle
to do the iteration? It's not really that important, since the
whitespace has to be fixed manually anyway, and with your suggestion one
finds all places with at least two consecutive seq_puts calls. But it
would be nice if one would _only_ have to fix whitespace

Thanks,
Rasmus

___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] Matching literal strings

2014-09-11 Thread Rasmus Villemoes
Hi,

While at it, I also wanted to change seq_printf(m, str)
into seq_puts(m, str) [1], but the pattern seq_printf(m, %s, str) also
appears. So I tried

@p1 depends on patch@
expression s;
expression t;
@@
(
- seq_printf(s, t)
+ seq_puts(s, t)
|
- seq_printf(s, %s, t)
+ seq_puts(s, t)
)

However, this seems to match any format string consisting of a single
format specifier; for example in kernel/locking/lockdep_proc.c I get
these

-   seq_printf(m, %p, class-key);
+   seq_puts(m, class-key);

-   seq_printf(m, %c, c);
+   seq_puts(m, c);

-   seq_printf(m, %14lu, lt-nr);
+   seq_puts(m, lt-nr);

which is obviously wrong. What should I do to make coccinelle match a
specific string literal? I guess I could introduce 'constant c =~
^\%s\$;', but there might be a more canonical way.

Thanks,
Rasmus

[1] Besides avoiding the long seq_vprintf - snprintf - ... chain and
pushing all the non-existent varargs into a va_arg structure, this is
also safer if str happens to contain format characters; although usually
str is a literal.
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


Re: [Cocci] Matching literal strings

2014-09-11 Thread Rasmus Villemoes
On Thu, Sep 11 2014, Julia Lawall julia.law...@lip6.fr wrote:

 On Thu, 11 Sep 2014, Rasmus Villemoes wrote:


 which is obviously wrong. What should I do to make coccinelle match a
 specific string literal? I guess I could introduce 'constant c =~
 ^\%s\$;', but there might be a more canonical way.

 It looks like a bug.  There have been some improvements in the handling of
 format strings, but perhaps the changes have not all been positive...

In that case you might like to know

$ spatch --version
spatch version 1.0.0-rc21 with Python support and with PCRE support

Note that it shouldn't really matter that %s is a format
string (which is why I didn't attempt to use a format variable; I
don't really understand those); I just want to find instances where
seq_printf is called with a specific string literal as its second argument.

 I will end the iterative version shortly.  You will need a version of
 coccinelle that supports ocaml scripting, which seems to mean that you
 need to have ocamlfind installed on your machine at the time when you
 make coccinelle.

OK, thanks for working on it.

Rasmus
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] Problem with already tagged token

2014-09-10 Thread Rasmus Villemoes
Hi,

I'm trying to write a semantic patch to convert consecutive seq_puts()
calls into a single one, with the string literals concatenated. My
attempt below works fine when there are exactly two seq_puts() calls,
but fails when there are more. If I use coccicheck on a directory,
coccinelle seems to silently skip files containing three-or-more
(e.g. kernel/trace/trace.c), but gives me the already tagged token
error when I use it on the specific file.

Ideally, I'd of course like coccinelle to apply the spatch iteratively
until only a single call remains. I think I understand why I get the
error, but I'd like to know if there's some obvious fix I've overlooked.



@concat1 depends on patch@
expression s;
constant c1, c2;
position p1, p2;
@@
  seq_puts@p1(s, c1);
  seq_puts@p2(s, c2);

@script:python concat2@
c1  concat1.c1;
c2  concat1.c2;
c3;
@@

// The indentation probably needs to be fixed manually
coccinelle.c3 = c1 + \n\t + c2

@concat3 depends on patch@
identifier concat2.c3;
expression concat1.s;
constant concat1.c1, concat1.c2;
position concat1.p1, concat1.p2;
@@
- seq_puts@p1(s, c1);
- seq_puts@p2(s, c2);
+ seq_puts(s, c3);



Thanks,
Rasmus

___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci