Re: [Cocci] linux-kernel janitorial RFP: Mark static arrays as const
On 02/03/2021 18.42, Joe Perches wrote: > Here is a possible opportunity to reduce data usage in the kernel. > > $ git grep -P -n '^static\s+(?!const|struct)(?:\w+\s+){1,3}\w+\s*\[\s*\]' > drivers/ | \ > grep -v __initdata | \ > wc -l > 3250 > > Meaning there are ~3000 declarations of arrays with what appears to be > file static const content that are not marked const. > > So there are many static arrays that could be marked const to move the > compiled object code from data to text minimizing the total amount of > exposed r/w data. You can add const if you like, but it will rarely change the generated code. gcc is already smart enough to take a static array whose contents are provably never modified within the TU and put it in .rodata: static int x = 7; static int y[2] = {13, 19}; static int p(int a, const int *foo) { return a + *foo; } int q(int a) { int b = p(a, ); return p(b, [b & 1]); } $ nm c.o T q r y $ size c.o textdata bss dec hex filename 111 0 0 111 6f c.o So x gets optimized away completely, but y isn't so easy to get rid of - nevertheless, it's never modified and the address doesn't escape the TU, so gcc treats is as if it was declared const. I think you'll see the same if you try adding the const on a few of your real-life examples. (Of course, the control flow may be so convoluted that gcc isn't able to infer the constness, so I'm not saying it will never make a difference - only that you shouldn't expect too much.) Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [PATCH v2] scripts: coccinelle: check for !(un)?likely usage
On 30/08/2019 08.56, Denis Efremov wrote: > > > On 30.08.2019 03:42, Julia Lawall wrote: >> >> >> On Thu, 29 Aug 2019, Denis Efremov wrote: >> >>> On 8/29/19 8:10 PM, Denis Efremov wrote: This patch adds coccinelle script for detecting !likely and !unlikely usage. These notations are confusing. It's better to replace !likely(x) with unlikely(!x) and !unlikely(x) with likely(!x) for readability. >>> >>> I'm not sure that this rule deserves the acceptance. >>> Just to want to be sure that "!unlikely(x)" and "!likely(x)" >>> are hard-readable is not only my perception and that they >>> become more clear in form "likely(!x)" and "unlikely(!x)" too. >> >> Is likely/unlikely even useful for anything once it is a subexpression? >>> julia >> > > Well, as far as I understand it, Yes, and it could in fact make sense in cases like if (likely(foo->bar) && unlikely(foo->bar->baz)) { do_stuff_with(foo->bar->baz); do_more_stuff(); } which the compiler could then compile as (of course actual code generation is always much more complicated due to things in the surrounding code) load foo->bar; test bar; if 0 goto skip; load bar->baz; test baz; if !0 goto far_away; skip: so in the normal flow, neither branch is taken. If instead one wrote unlikely(foo->bar && foo->bar->baz), gcc doesn't really know why we expect the whole conjuntion to turn out false, so it could compile this as a jump when foo->bar turns out non-zero - i.e., in the normal case, we'd end up jumping. But as far as !(un)likely(), I agree that it's easier to read as a human if the ! operator is moved inside (and the "un" prefix stripped/added). Whether it deserves a cocci script I don't know. Rasmus
Re: [PATCH] scripts: coccinelle: check for !(un)?likely usage
On 25/08/2019 21.19, Julia Lawall wrote: > > >> On 26 Aug 2019, at 02:59, Denis Efremov wrote: >> >> >> >>> On 25.08.2019 19:37, Joe Perches wrote: On Sun, 2019-08-25 at 16:05 +0300, Denis Efremov wrote: This patch adds coccinelle script for detecting !likely and !unlikely usage. It's better to use unlikely instead of !likely and vice versa. >>> >>> Please explain _why_ is it better in the changelog. >>> >> >> In my naive understanding the negation (!) before the likely/unlikely >> could confuse the compiler > > As a human I am confused. Is !likely(x) equivalent to x or !x? #undef likely #undef unlikely #define likely(x) (x) #define unlikely(x) (x) should be a semantic no-op. So changing !likely(x) to unlikely(x) is completely wrong. If anything, !likely(x) can be transformed to unlikely(!x). Rasmus
Re: [Cocci] [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct
On 2018-05-01 19:00, Kees Cook wrote: > On Mon, Apr 30, 2018 at 2:29 PM, Rasmus Villemoes > <li...@rasmusvillemoes.dk> wrote: >> >> gcc 5.1+ (I think) have the __builtin_OP_overflow checks that should >> generate reasonable code. Too bad there's no completely generic >> check_all_ops_in_this_expression(a+b*c+d/e, or_jump_here). Though it's >> hard to define what they should be checked against - probably would >> require all subexpressions (including the variables themselves) to have >> the same type. >> >> plug: https://lkml.org/lkml/2015/7/19/358 > > That's a very nice series. Why did it never get taken? Well, nobody seemed particularly interested, and then https://lkml.org/lkml/2015/10/28/215 happened... but he did later seem to admit that it could be useful for the multiplication checking, and that "the gcc interface for multiplication overflow is fine". I still think even for unsigned types overflow checking can be subtle. E.g. u32 somevar; if (somevar + sizeof(foo) < somevar) return -EOVERFLOW; somevar += sizeof(this); is broken, because the LHS is promoted to unsigned long/size_t, then so is the RHS for the comparison, and the comparison is thus always false (on 64bit). It gets worse if the two types are more "opaque", and in any case it's not always easy to verify at a glance that the types are the same, or at least that the expression of the widest type is on the RHS. > It seems to do the right things quite correctly. Yes, I wouldn't suggest it without the test module verifying corner cases, and checking it has the same semantics whether used with old or new gcc. Would you shepherd it through if I updated the patches and resent? Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct
On 2018-04-30 22:16, Matthew Wilcox wrote: > On Mon, Apr 30, 2018 at 12:02:14PM -0700, Kees Cook wrote: >> >> Getting the constant ordering right could be part of the macro >> definition, maybe? i.e.: >> >> static inline void *kmalloc_ab(size_t a, size_t b, gfp_t flags) >> { >> if (__builtin_constant_p(a) && a != 0 && \ >> b > SIZE_MAX / a) >> return NULL; >> else if (__builtin_constant_p(b) && b != 0 && \ >>a > SIZE_MAX / b) >> return NULL; >> >> return kmalloc(a * b, flags); >> } > > Ooh, if neither a nor b is constant, it just didn't do a check ;-( This > stuff is hard. > >> (I just wish C had a sensible way to catch overflow...) > > Every CPU I ever worked with had an "overflow" bit ... do we have a > friend on the C standards ctte who might figure out a way to let us > write code that checks it? gcc 5.1+ (I think) have the __builtin_OP_overflow checks that should generate reasonable code. Too bad there's no completely generic check_all_ops_in_this_expression(a+b*c+d/e, or_jump_here). Though it's hard to define what they should be checked against - probably would require all subexpressions (including the variables themselves) to have the same type. plug: https://lkml.org/lkml/2015/7/19/358 Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] "bad" macros in struct definitions
On 8 June 2017 at 13:02, Julia Lawallwrote: > > > Could you send a concrete semantic patch and .c file that is not givig the > desired result? Coccinelle does parsing in multiple passes. It may fail > on an earlier pass (and emit error messages if you use --verbose-parsing) > but still succeed on a later one. Something like the attached. As is, I get $ spatch --macro-file test.h --sp-file ./test.cocci ./test.c init_defs_builtins: /usr/local/lib/coccinelle/standard.h init_defs: test.h HANDLING: ./test.c but if I comment out the OBJHEAD I get the expected match in f2: $ spatch --macro-file test.h --sp-file ./test.cocci ./test.c init_defs_builtins: /usr/local/lib/coccinelle/standard.h init_defs: test.h HANDLING: ./test.c diff = --- ./test.c +++ /tmp/cocci-output-22044-346b6b-test.c @@ -21,5 +21,4 @@ void f1(struct s *s) void f2(struct s *s) { - s->foo = malloc(sizeof(struct bar)); } test.cocci Description: Binary data #define OBJHEAD #define OBJHEAD \ int type; \ char name[100]; \ struct foo { char c; }; struct bar { long l; }; struct s { OBJHEAD struct foo *foo; }; void f1(struct s *s) { s->foo = malloc(sizeof(struct foo)); } void f2(struct s *s) { s->foo = malloc(sizeof(struct bar)); } ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
[Cocci] "bad" macros in struct definitions
I'm trying to apply spatch to libnl (https://github.com/thom311/libnl), but I'm having problems with spatch's parsing of the header files. libnl has a macro #define NLHDR_COMMON\ int ce_refcnt; \ struct nl_object_ops * ce_ops; \ struct nl_cache * ce_cache; \ struct nl_list_head ce_list;\ int ce_msgtype; \ int ce_flags; \ uint64_tce_mask; and many data structures are defined like struct rtnl_link { NLHDR_COMMON charl_name[IFNAMSIZ]; uint32_tl_family; uint32_tl_arptype; ... I tried adding a cocci.h file containing just #define NLHDR_COMMON and giving that as a --macro-file argument, but spatch still fails to parse the structure. If I manually remove the NLHDR_COMMON from rtnl_link above, everything works (that is, coccinelle now knows the types of the various rtnl_link members and my .cocci works as expected). Am I doing something wrong, or is this a bug in the "bad macro" handling? Related feature request: For now, I'm not really interested in the members defined by the NLHDR_COMMON macro, so being able to just stub it out would be fine. But I could imagine it would be nice to have it expand to its actual definition. I suppose that could be done by copying its definition to one's local --macro-file, but it would be nicer to have a way to say "please always expand this macro to whatever its current definition is". $ spatch --version spatch version 1.0.6-00153-g36e91e8 compiled with OCaml version 4.02.3 ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
[Cocci] extremely slow processing of trivial header file
Hi I just ran into a slightly weird case. spatch takes forever handling a simple header file which contains nothing but #defines. I tried https://github.com/Villemoes/linux-cocci/blob/master/microopt/copy_bit.cocci on drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h With --profile, the first few lines are $ spatch --include-headers --no-includes --patch . --profile -I arch/x86/include -I arch/x86/include/generated -I include -I arch/x86/include/uapi -I arch/x86/include/generated/uapi -I include/uapi -I include/generated/uapi -I include/linux/kconfig.h -D patch --cocci-file /home/villemoes/projects/linux-cocci/microopt/copy_bit.cocci ./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h init_defs_builtins: /usr/local/lib/coccinelle/standard.h HANDLING: ./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h Note: processing took41.8s: ./drivers/gpu/drm/amd/include/asic_reg/gmc/gmc_8_1_sh_mask.h starting: Common.group_assoc_bykey_eff ending: Common.group_assoc_bykey_eff, 0.04s - profiling result - Main total : 42.428 sec 1 count Main.outfiles computation: 41.845 sec 1 count full_engine : 41.845 sec 1 count bigloop : 39.998 sec 1 count process_a_ctl_a_env_a_toplevel : 39.933 sec 93930 count mysat: 39.782 sec 93930 count rule3a : 39.451 sec 1 count Other headers in drivers/gpu/drm/amd/include/ also take a long time. Any ideas? I'm using 'spatch version 1.0.2 with Python support and with PCRE support'. Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] Source code adjustments together with comments?
On Wed, Feb 04 2015, Cyril Hrubis chru...@suse.cz wrote: Hi! Would you like to be more explicit in the semantic patch language around source code adjustments which should also affect corresponding comments? It seems complicated, and people don't always follow the same comventions. Probably it is thinking that the code looks like this: /*some comment on b */ int b; That explains it. I could change it so that this strategy is only followed if /*some comment on b */ starts at the beginning of the line. That sounds good. What would be the strategy for the other case then? It seems to be a reasonable heuristic that a comment beginning on a line which also contains code is attached to that code, while a comment beginning on a line by itself is attached to the following code (whether that means one or more lines of code is of course impossible to guess). So, in the former case, if the code is simply removed, the comment should also vanish. But what if the semantic patch contains + code? Something like - int hash; + unsigned int hash; In that case, it's probably best to leave the comment. But one can probably always find examples where whatever coccinelle does, something else would have been better. For example, if the comment should stay but needs rewording, good luck teaching any computer program to do that. In short, nothing can save one from doing a manual review of the generated patch, especially when comments are involved. Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] Some spatch questions
On Tue, Feb 03 2015, Julia Lawall julia.law...@lip6.fr wrote: OK, I'm not sure to understand everything the script does, but could it be that you are passing all of the file names individually on the command line? That is, are you trying to do something like spatch foo.cocci one.c two.c three.c four.c Yes, that's how GNU parallel works (it's mostly xargs with lots of bells and whistles), and the only way I could find to pass files to spatch without letting it discover the files itself. If that is the case, then the memory usage is normal. Putting multiple files on the command line means that you want them all to be handled at once. For example, there may be some function in one.c whose properties should influence the transformation of four.c. So it holds the parsed versions of all of the files in memory at once. That kind of makes sense. But I would very much like an option to turn off that behaviour, and make it process each file individually. Fortunately, we have finally implemented parallelism inside Coccinelle. If you take version 1.0.0-rc24 from the web page, which I haven't had time to properly announce, you can use the argument -j NNN with spatch on a complete directory and it will work on the different files in parallel (with NNN instances). Sounds good, but I think it would be nice with a somewhat more flexible way of choosing the files to process, the most obvious being passing the files explicitly, and for that an option as above seems to be absolutely essential if one passes more than around 100 files. Moreover, I will probably continue to use a wrapper script around spatch, to handle tedious things such as passing appropriate -I options, extracting the // Options: line etc. Handling the parallelism in that wrapper is then no big deal. Thanks, Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] Finding embedded function names?
On Fri, Dec 05 2014, Joe Perches j...@perches.com wrote: On Fri, 2014-12-05 at 08:18 +0100, Julia Lawall wrote: On Thu, 4 Dec 2014, Joe Perches wrote: Yes, by using python/ocaml: @r@ char [] c; position p; identifier f; @@ f(...,c@p,...) @script:ocaml@ c r.c; p r.p; @@ let ce = (List.hd p).current_element in if List.length(Str.split_delim (Str.regexp ce) c) 1 then Printf.printf %s:%d: %s\n (List.hd p).file (List.hd p).line c Good to know, thanks. Here are some results: drivers/net/wireless/zd1211rw/zd_usb.c:1573: %s usb_exit()\n [] The idea would be to replace these by %s and __func__? That would also be possible. Yes and no. A lot of these are function tracing style messages and those should just be deleted. Hardcoding the function name in a literal string also makes typos (or copy-pastos) possible. I extended Julia's code to allow a small edit distance. Requires the Levenshtein python module (on debian, apt-get install python-levenshtein). It's not terribly slow, but to be really useful I (or someone) would need to reduce the number of false positives. One gets for example drivers/net/wireless/rtlwifi/rtl8821ae/dm.c:2082:rtl8821ae_dm_txpwr_track_set_pwr():2: ===rtl8812ae_dm_txpwr_track_set_pwr\n drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:1229:bnx2x_ets_e3b0_config():2: bnx2x_ets_E3B0_config SP failed\n The first of these is probably one of the tracing style messages; the second probably falls in the 'should use %s, __func__' category (other strings in that function actually use lowercase e3b0). @initialize:python@ @@ import re from Levenshtein import distance mindist = 1 maxdist = 2 ignore_leading = True @r@ char [] c; position p; identifier f; @@ f(...,c@p,...) @script:python@ c r.c; p r.p; @@ func = p[0].current_element wpattern = [a-zA-Z_][a-zA-Z0-9_]* if ignore_leading: func = func.strip(_) wpattern = [a-zA-Z][a-zA-Z0-9_]* lf = len(func) // ignore extremely short function names if lf 3: words = [w for w in re.findall(wpattern, c) if abs(len(w) - lf) = maxdist] for w in words: d = distance(w, func) if mindist = d and d = maxdist: print %s:%d:%s():%d: %s % (p[0].file, int(p[0].line), func, d, c) break Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] Problem with already tagged token
On Wed, Sep 10 2014, Julia Lawall julia.law...@lip6.fr wrote: Clever :) I haven't tried this, but I think you cudl do the following: @r@ position p; @@ seq_puts(...); seq_puts(...); seq_puts@p(...); @concat1 depends on patch@ expression s; constant c1, c2; position p1, p2; Now change this line to position p1 != r.p, p2 != r.p; Then you can either use Coccinelle to do the iteration, or just run the semantic patch over and over on the code. Thanks, that seems to work. Can you tell me how I would get coccinelle to do the iteration? It's not really that important, since the whitespace has to be fixed manually anyway, and with your suggestion one finds all places with at least two consecutive seq_puts calls. But it would be nice if one would _only_ have to fix whitespace Thanks, Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
[Cocci] Matching literal strings
Hi, While at it, I also wanted to change seq_printf(m, str) into seq_puts(m, str) [1], but the pattern seq_printf(m, %s, str) also appears. So I tried @p1 depends on patch@ expression s; expression t; @@ ( - seq_printf(s, t) + seq_puts(s, t) | - seq_printf(s, %s, t) + seq_puts(s, t) ) However, this seems to match any format string consisting of a single format specifier; for example in kernel/locking/lockdep_proc.c I get these - seq_printf(m, %p, class-key); + seq_puts(m, class-key); - seq_printf(m, %c, c); + seq_puts(m, c); - seq_printf(m, %14lu, lt-nr); + seq_puts(m, lt-nr); which is obviously wrong. What should I do to make coccinelle match a specific string literal? I guess I could introduce 'constant c =~ ^\%s\$;', but there might be a more canonical way. Thanks, Rasmus [1] Besides avoiding the long seq_vprintf - snprintf - ... chain and pushing all the non-existent varargs into a va_arg structure, this is also safer if str happens to contain format characters; although usually str is a literal. ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
Re: [Cocci] Matching literal strings
On Thu, Sep 11 2014, Julia Lawall julia.law...@lip6.fr wrote: On Thu, 11 Sep 2014, Rasmus Villemoes wrote: which is obviously wrong. What should I do to make coccinelle match a specific string literal? I guess I could introduce 'constant c =~ ^\%s\$;', but there might be a more canonical way. It looks like a bug. There have been some improvements in the handling of format strings, but perhaps the changes have not all been positive... In that case you might like to know $ spatch --version spatch version 1.0.0-rc21 with Python support and with PCRE support Note that it shouldn't really matter that %s is a format string (which is why I didn't attempt to use a format variable; I don't really understand those); I just want to find instances where seq_printf is called with a specific string literal as its second argument. I will end the iterative version shortly. You will need a version of coccinelle that supports ocaml scripting, which seems to mean that you need to have ocamlfind installed on your machine at the time when you make coccinelle. OK, thanks for working on it. Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci
[Cocci] Problem with already tagged token
Hi, I'm trying to write a semantic patch to convert consecutive seq_puts() calls into a single one, with the string literals concatenated. My attempt below works fine when there are exactly two seq_puts() calls, but fails when there are more. If I use coccicheck on a directory, coccinelle seems to silently skip files containing three-or-more (e.g. kernel/trace/trace.c), but gives me the already tagged token error when I use it on the specific file. Ideally, I'd of course like coccinelle to apply the spatch iteratively until only a single call remains. I think I understand why I get the error, but I'd like to know if there's some obvious fix I've overlooked. @concat1 depends on patch@ expression s; constant c1, c2; position p1, p2; @@ seq_puts@p1(s, c1); seq_puts@p2(s, c2); @script:python concat2@ c1 concat1.c1; c2 concat1.c2; c3; @@ // The indentation probably needs to be fixed manually coccinelle.c3 = c1 + \n\t + c2 @concat3 depends on patch@ identifier concat2.c3; expression concat1.s; constant concat1.c1, concat1.c2; position concat1.p1, concat1.p2; @@ - seq_puts@p1(s, c1); - seq_puts@p2(s, c2); + seq_puts(s, c3); Thanks, Rasmus ___ Cocci mailing list Cocci@systeme.lip6.fr https://systeme.lip6.fr/mailman/listinfo/cocci