> Im now somewhat unclear on the -macro_file <> option, > despite it helping my situation dramatically - > Why would you not parse #included headers by default ? > at least when theyre on the include path, like #include "perl.h"
The default is to parse a header file that has a name that resembles that of the file. So if you are working on video.c, it will parse video.h. At least in the context of Linux, that is the file that is most likely to have the definitions that are relevant to theproblem at hand. Parsing header files takes time. A lot of semantic patches don't need that information. I often use the option -no_includes, when my semantic patch is not sensitive to type information. A goal of Coccinelle is to let you make transformations on macro uses as well and to be independent of the configuration. If we just applied cpp based on the header files that we can find, then neither of those would be possible. When Coccinelle has the definition of a macro, sometimes it does take that into acount in constructing the internal representation. But in that case the tokens involved in the use of the macro become passed, and one can't do any transformation on them. So taking into account a macro definition is not necessarily a desirable thing. These decisions were taken in the context of Linux code, where the use of macros is mostly limited to code that stays within the syntax of C. Anyway, I will take a look at the perl code base and your rule and see if there is a way to improve things. julia On Thu, 24 Mar 2011, Jim Cromie wrote: > On Fri, Mar 11, 2011 at 3:55 PM, Julia Lawall <[email protected]> wrote: > > On Fri, 11 Mar 2011, Jim Cromie wrote: > > > >> Im trying to use spatch in the Perl codebase > >> git://perl5.git.perl.org/perl.git > >> to convert a struct->field ref to a macro, and am encountering > >> difficulties. > >> > >> I think theyre due to Perl's heavy use of macros, but Ive had a few > >> surprises, > >> which leaves me uncertain of my understanding. > >> > >> My cocci file is pretty simple, but misses stuff. > >> It finds 1 change in toke.c, but misses everything in op.c, > >> which is where the bulk of the field-refs happen. > >> > >> jimc@chumly:~/projects/perl-core/perl$ spatch -sp_file ../op_sibl.cocci > >> -verbose_parsing -debug -show_cocci -all_includes toke.c > >> init_defs_builtins: /usr/share/coccinelle/standard.h > >> ----------------------------------------------------------------------- > >> processing semantic patch file: ../op_sibl.cocci > >> with isos from: /usr/share/coccinelle/standard.iso > >> ----------------------------------------------------------------------- > >> > >> > >> @ sibling_LHS@ > >> expression OP; > >> expression OP2; > >> @@ > >> > >> - OP->op_sibling = OP2 > >> + setOpSibling(OP, OP2) > >> > >> @sibling_RHS@ > >> expression OP; > >> @@ > >> > >> - OP->op_sibling > >> + OpSibling(OP) > >> > > > A type metavariable represents a type like int. Perhaps you wanted to put > > identifier? But the case of an identifier would already have been > > transformed by the case for expression. > > > >> Once I yanked it, it found 1 transformation: > >> > >> HANDLING: toke.c > >> diff = > >> --- toke.c 2011-03-02 09:09:57.392038433 -0500 > >> +++ /tmp/cocci-output-31503-f60bf3-toke.c 2011-03-02 13:02:00.717080475 > >> -0500 > >> @@ -2370,7 +2370,7 @@ S_sublex_start(pTHX) > >> } > >> else if (op_type == OP_BACKTICK && PL_lex_op) { > >> /* readpipe() vas overriden */ > >> - > >> cSVOPx(cLISTOPx(cUNOPx(PL_lex_op)->op_first)->op_first->op_sibling)->op_sv > >> = tokeq(PL_lex_stuff); > >> + cSVOPx(OpSibling(cLISTOPx(cUNOPx(PL_lex_op)->op_first)->op_first))->op_sv > >> = tokeq(PL_lex_stuff); > >> pl_yylval.opval = PL_lex_op; > >> PL_lex_op = NULL; > >> PL_lex_stuff = NULL; > >> Note: processing took 113.1s: toke.c > > > > That is a lot of time for such a simple transformation. I think it must > > be having a very hard time parsing your code. It tries several strategies > > for each function, as long as it is unable to parse the function. > > > > Yes. lots of BAD, bad messages. > > > >> the patch clearly finds and changes a single expression, but : > >> > >> - it doesnt see it as the LHS of an assignment, why not ? > > > > I'm not sure what you mean here. Coccinelle considers both the left and > > right sides of = to be expressions. So if you write a pattern that has > > the form of an expression it will potentially match either of them. If > > you want a special treatment for the left hand side of an assignment, you > > should put a rule for that before the one for other kinds of expressions. > > > > What I meant was that my rules picked up one example of RHS, > but none of the LHS = RHS statements. > Thats probably because all the assignments are in op.c, > which is giving spatch indigestion. > > >> I gather order is not a good way to construct cocci rules, but Im unclear > >> on > >> how to spec that dependence. > > > > Order is the proper way to do things. Coccinelle applies the first rule to > > each top-level definition in the file, then the second rule to each > > resulting top-level definition in the file, etc. You can also give the > > rules names and use depends on between the initial @@ to express > > dependencies on the success or failure of other rules. For example: > > > > @r exists@ > > identifier f; > > expression E; > > @@ > > > > f (...) { <+... return E; ...+> } > > > > @null exists@ > > identifier r.f; > > @@ > > > > f (...) { <+... return NULL; ...+> } > > > > @depends on !null@ > > identifier r.f; > > @@ > > > > - f(...) > > + g() > > > > This replaces all calls to functions that never explicitly return NULL by > > g(), > > > >> OP's type here is a typedef, ie one of: > >> perl.h:typedef struct op OP; > >> perl.h:typedef struct cop COP; > >> perl.h:typedef struct unop UNOP; > >> perl.h:typedef struct binop BINOP; > >> perl.h:typedef struct listop LISTOP; > >> perl.h:typedef struct logop LOGOP; > >> where .+OPs have OP fields in common, but are not *inherited* (ie strictly > >> C) > >> All these perl ideosyncracies leave me puzzled on how to apply SmPL to > >> them. > > > > I'm not sure this information is needed. Perhaps the field name is good > > enough? Do you think there are other structure that have a field with the > > name op_sibling? If you want to be more careful, you can specify a set of > > possible types for an expression: > > > > the field name is good enough. > there are no other uses of op_sibling. > > > {struct op, struct cop, struct unop, ...} E; > > yes, this is what I meant - contraining the rule to match on a > specific type of expression. > that said, the following rule didnt work (didnt do the transform), but > it was much faster. > > @sibling_RHS@ > // { struct op*, struct cop*, struct unop*, struct binop*, > // struct listop*, struct logop*, struct svop* } OP; > expression OP; > @@ > > - OP->op_sibling > + OpSibling(OP) > > > > > > (... is not a SmPL "...") > > It should see through the typedefs if it has access to the .h file that > > contains their definition. You may need to give the option -all_includes > > and -I to specify the directory containing the header files. > > > >> - would a proper binding rule also help to reduce processing time ? > > > > I'm not sure what you mean by a proper binding rule. But I suspect that > > the processing time is entirely due to parsing. There is also an option > > -profile that lets you see where it is taking the most time. > > > > What I meant was a rule that constrains tightly, > so that spatch sees more quickly that the rule doesnt apply. > > > >> So, on to the macro-ish problems... > >> parsing toke.c is much more successful than op.c, so I start there. > > > > OK, I see you have already tried parse_c :) > > > >> maybe 10 most problematic tokens > >> ----------------------------------------------------------------------- > >> pTHX_: present in 80 parsing errors > >> example: > >> > >> void > >> Perl_munge_qwlist_to_paren_list(pTHX_ OP *qwlist) > >> { > > > > Add #define pTHX_ to your .h file to be given with the argument > > -macro_file. > > I used -macro_file perl.h ( which is in the same directory as the C file ) > it improved results dramatically. > > > $ spatch -sp ../opsibl.cocci -all_includes -parse_ch op.c -o op.cocci > ----------------------------------------------------------------------- > NB total files = 1; perfect = 0; pbs = 1; timeout = 0; =========> 0% > nb good = 296, nb passed = 70 =========> 23.648649% passed > nb good = 296, nb bad = 9862 =========> 2.913959% good > > > $ spatch -sp ../opsibl.cocci -all_includes -parse_ch op.c -o op.cocci > -macro_file perl.h > ----------------------------------------------------------------------- > NB total files = 1; perfect = 0; pbs = 1; timeout = 0; =========> 0% > nb good = 9723, nb passed = 653 =========> 6.716034% passed > nb good = 9723, nb bad = 435 =========> 95.717661% good > > > > >> s: present in 51 parsing errors > >> example: > >> > >> char * > >> Perl_scan_vstring(pTHX_ const char *s, const char *const e, SV *sv) > >> { > >> > >> > >> I suspect that both these problems, and many others, are due to the defn of > >> pTHX_ > > > > Yes. > > > >> which is a set of layered macros that introduce a maybe-term, > >> either blank-whitespace, or a "type var," as 1st item in a function > >> parameter-list, > >> (Read the trailing _ as a comma when the expansion isnt empty) > >> Ive pasted in some of the macros for perusal, skip if > >> obvious/uninteresting. > > > > Coccinelle doesn't care (much) about the actual definition of the macro. > > I noticed in /usr/share/coccinelle/standard.iso, there was: > // sound/sparc/dbri.c > // "bad macro", have a ',' at the end > #define CS4215_SINGLE(xname, entry, shift, mask, invert) \ > { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \ > .info = snd_cs4215_info_single, \ > .get = snd_cs4215_get_single, .put = snd_cs4215_put_single, \ > .private_value = entry | (shift << 8) | (mask << 16) | (invert << 24) }, > > Does this mean that spatch has trouble with macros that expand into > list-elements ending with comma ?? > This describes the problematic macros: pTHX_ aTHX_ dTHX etc > > > > >> Parsing op.c shows many BAD lines: > >> > >> jimc@chumly:~/projects/perl-core/build-threads$ grep BAD ../spatch-res-op > >> |wc > >> 189 1088 9487 > >> > >> jimc@chumly:~/projects/perl-core/build-threads$ grep BAD ../spatch-res-op > >> BAD:!!!!! #define CALL_OPFREEHOOK(o) if (PL_opfreehook) PL_opfreehook(aTHX_ > >> o) > >> BAD:!!!!! Perl_Slab_Alloc(pTHX_ size_t sz) > >> BAD:!!!!! I32 **const slabs = PL_slabs; > >> BAD:!!!!! S_Slab_to_rw(pTHX_ void *op) > >> BAD:!!!!! Perl_op_refcnt_inc(pTHX_ OP *o) > >> BAD:!!!!! Perl_op_refcnt_dec(pTHX_ OP *o) > >> BAD:!!!!! Perl_Slab_Free(pTHX_ void *op) > >> BAD:!!!!! : PL_check[type](aTHX_ (OP*)o)) > >> BAD:!!!!! STMT_START { \ > >> > >> many of these appear to be the pTHX, as mentioned above, > >> CALL_OPFREEHOOK may also be problematic: > >> which expands to: > >> if ((my_perl->Iopfreehook)) (my_perl->Iopfreehook)(my_perl, o); > >> > >> What exactly does BAD !!! mean anyway ? > > > > This is the line where it gave up on the current function. > > > >> Is it more or less as described in /usr/share/coccinelle/standard.h ? > >> or is it that the dynamic expansion is insufficient ? > >> > >> > >> Overall, whats the right strategy for a project to help coccinelle to > >> support it ? > >> - a project specific -macro_file <file> > >> Do you keep links to such project files that would help in understanding > >> whats needed ? > > > > I don't have any, but it would indeed be a good idea to collect them. > > > > Im now somewhat unclear on the -macro_file <> option, > despite it helping my situation dramatically - > Why would you not parse #included headers by default ? > at least when theyre on the include path, like #include "perl.h" > > >> what does passed mean ? > > > > Sometimes when there is a macro that takes arguments, it considers that it > > is not able to properly parse the arguments, and the arguments are passed. > > No transformation occurs on them. This happens for macros that are > > defined in standard.h or your macro file. It has to do with the fact that > > there are two views of the arguments, as they are and as the macro file > > defines them to be. > > [jimc@groucho perl]$ spatch -sp ../opsibl.cocci -all_includes > -macro_file perl.h -show_cocci \ > -I . -outplace -include_headers op.c > ... > previous modification: > MINUS > According to environment 1: > sibling_RHS.OP -> cLOGOPo->op_first->op_sibling > > current modification: > MINUS > According to environment 1: > sibling_RHS.OP -> cLOGOPo->op_first > > Fatal error: exception Failure("sibling_RHS: already tagged token: > C code context > File "op.c", line 4531, column 60, charpos = 114512 > around = 'op_sibling', whole content = const I32 f = > is_list_assignment(cLOGOPo->op_first->op_sibling->op_sibling);") > > This fatal error results in no output. > I get this error on op.c from all versions I have: > > spatch version 0.2.0rc1 with Python support > spatch version 0.2.2 with Python support > spatch version 0.2.5-rc8 with Python support > > BTW: > on fedora-13: > Great: a test file now works: typeof.res > -------------------------------- > total score > -------------------------------- > good = 335/359 > Current score is greater than expected :) > (was expecting 298 but got 335) > Generating new expected score file and saving old one > mv tests/SCORE_expected.sexp tests/SCORE_expected.sexp.save [y/n] ? > > on ubuntu 10.10, got > good = 332/359 > > > julia > > > > thanks, > jimc >
_______________________________________________ Cocci mailing list [email protected] http://lists.diku.dk/mailman/listinfo/cocci (Web access from inside DIKUs LAN only)
