On Fri, Mar 11, 2011 at 3:55 PM, Julia Lawall <[email protected]> wrote:
> On Fri, 11 Mar 2011, Jim Cromie wrote:
>
>> Im trying to use spatch in the Perl codebase
>> git://perl5.git.perl.org/perl.git
>> to convert a struct->field ref to a macro, and am encountering difficulties.
>>
>> I think theyre due to Perl's heavy use of macros, but Ive had a few
>> surprises,
>> which leaves me uncertain of my understanding.
>>
>> My cocci file is pretty simple, but misses stuff.
>> It finds 1 change in toke.c, but misses everything in op.c,
>> which is where the bulk of the field-refs happen.
>>
>> jimc@chumly:~/projects/perl-core/perl$ spatch -sp_file ../op_sibl.cocci
>> -verbose_parsing -debug -show_cocci -all_includes toke.c
>> init_defs_builtins: /usr/share/coccinelle/standard.h
>> -----------------------------------------------------------------------
>> processing semantic patch file: ../op_sibl.cocci
>> with isos from: /usr/share/coccinelle/standard.iso
>> -----------------------------------------------------------------------
>>
>>
>> @ sibling_LHS@
>> expression OP;
>> expression OP2;
>> @@
>>
>> - OP->op_sibling = OP2
>> + setOpSibling(OP, OP2)
>>
>> @sibling_RHS@
>> expression OP;
>> @@
>>
>> - OP->op_sibling
>> + OpSibling(OP)
>>

> A type metavariable represents a type like int.  Perhaps you wanted to put
> identifier?  But the case of an identifier would already have been
> transformed by the case for expression.
>
>> Once I yanked it, it found 1 transformation:
>>
>> HANDLING: toke.c
>> diff =
>> --- toke.c 2011-03-02 09:09:57.392038433 -0500
>> +++ /tmp/cocci-output-31503-f60bf3-toke.c 2011-03-02 13:02:00.717080475
>> -0500
>> @@ -2370,7 +2370,7 @@ S_sublex_start(pTHX)
>>      }
>>      else if (op_type == OP_BACKTICK && PL_lex_op) {
>>   /* readpipe() vas overriden */
>> - cSVOPx(cLISTOPx(cUNOPx(PL_lex_op)->op_first)->op_first->op_sibling)->op_sv
>> = tokeq(PL_lex_stuff);
>> + cSVOPx(OpSibling(cLISTOPx(cUNOPx(PL_lex_op)->op_first)->op_first))->op_sv
>> = tokeq(PL_lex_stuff);
>>   pl_yylval.opval = PL_lex_op;
>>   PL_lex_op = NULL;
>>   PL_lex_stuff = NULL;
>> Note: processing took   113.1s: toke.c
>
> That is a lot of time for such a simple transformation.  I think it must
> be having a very hard time parsing your code.  It tries several strategies
> for each function, as long as it is unable to parse the function.
>

Yes. lots of BAD, bad messages.


>> the patch clearly finds and changes a single expression, but :
>>
>> - it doesnt see it as the LHS of an assignment, why not ?
>
> I'm not sure what you mean here.  Coccinelle considers both the left and
> right sides of = to be expressions.  So if you write a pattern that has
> the form of an expression it will potentially match either of them.  If
> you want a special treatment for the left hand side of an assignment, you
> should put a rule for that before the one for other kinds of expressions.
>

What I meant was that my rules picked up one example of RHS,
but none of the LHS = RHS statements.
Thats probably because all the assignments are in op.c,
which is giving spatch indigestion.

>> I gather order is not a good way to construct cocci rules, but Im unclear
>> on
>> how to spec that dependence.
>
> Order is the proper way to do things. Coccinelle applies the first rule to
> each top-level definition in the file, then the second rule to each
> resulting top-level definition in the file, etc.  You can also give the
> rules names and use depends on between the initial @@ to express
> dependencies on the success or failure of other rules.  For example:
>
> @r exists@
> identifier f;
> expression E;
> @@
>
> f (...) { <+... return E; ...+> }
>
> @null exists@
> identifier r.f;
> @@
>
> f (...) { <+... return NULL; ...+> }
>
> @depends on !null@
> identifier r.f;
> @@
>
> - f(...)
> + g()
>
> This replaces all calls to functions that never explicitly return NULL by
> g(),
>
>>  OP's type here is a typedef, ie one of:
>> perl.h:typedef struct op OP;
>> perl.h:typedef struct cop COP;
>> perl.h:typedef struct unop UNOP;
>> perl.h:typedef struct binop BINOP;
>> perl.h:typedef struct listop LISTOP;
>> perl.h:typedef struct logop LOGOP;
>> where .+OPs have OP fields in common, but are not *inherited* (ie strictly
>> C)
>> All these perl ideosyncracies leave me puzzled on how to apply SmPL to them.
>
> I'm not sure this information is needed.  Perhaps the field name is good
> enough?  Do you think there are other structure that have a field with the
> name op_sibling?  If you want to be more careful, you can specify a set of
> possible types for an expression:
>

the field name is good enough.
there are no other uses of op_sibling.

> {struct op, struct cop, struct unop, ...} E;

yes, this is what I meant - contraining the rule to match on a
specific type of expression.
that said, the following rule didnt work (didnt do the transform), but
it was much faster.

@sibling_RHS@
// { struct op*, struct cop*, struct unop*, struct binop*,
//   struct listop*, struct logop*, struct svop* } OP;
expression OP;
@@

- OP->op_sibling
+ OpSibling(OP)


>
> (... is not a SmPL "...")
> It should see through the typedefs if it has access to the .h file that
> contains their definition.  You may need to give the option -all_includes
> and -I to specify the directory containing the header files.
>
>> - would a proper binding rule also help to reduce processing time ?
>
> I'm not sure what you mean by a proper binding rule.  But I suspect that
> the processing time is entirely due to parsing.  There is also an option
> -profile that lets you see where it is taking the most time.
>

What I meant was a rule that constrains tightly,
so that spatch sees more quickly that the rule doesnt apply.


>> So, on to the macro-ish problems...
>> parsing toke.c is much more successful than op.c, so I start there.
>
> OK, I see you have already tried parse_c :)
>
>> maybe 10 most problematic tokens
>> -----------------------------------------------------------------------
>> pTHX_: present in 80 parsing errors
>> example:
>>
>>        void
>>        Perl_munge_qwlist_to_paren_list(pTHX_ OP *qwlist)
>>        {
>
> Add #define pTHX_ to your .h file to be given with the argument
> -macro_file.

I used -macro_file perl.h  ( which is in the same directory as the C file )
it improved results dramatically.


$ spatch -sp ../opsibl.cocci -all_includes -parse_ch op.c -o op.cocci
-----------------------------------------------------------------------
NB total files = 1; perfect = 0; pbs = 1; timeout = 0; =========> 0%
nb good = 296,  nb passed = 70 =========> 23.648649% passed
nb good = 296,  nb bad = 9862 =========> 2.913959% good


$ spatch -sp ../opsibl.cocci -all_includes -parse_ch op.c -o op.cocci
-macro_file perl.h
-----------------------------------------------------------------------
NB total files = 1; perfect = 0; pbs = 1; timeout = 0; =========> 0%
nb good = 9723,  nb passed = 653 =========> 6.716034% passed
nb good = 9723,  nb bad = 435 =========> 95.717661% good

>
>> s: present in 51 parsing errors
>> example:
>>
>>        char *
>>        Perl_scan_vstring(pTHX_ const char *s, const char *const e, SV *sv)
>>        {
>>
>>
>> I suspect that both these problems, and many others, are due to the defn of
>> pTHX_
>
> Yes.
>
>> which is a set of layered macros that introduce a maybe-term,
>> either blank-whitespace, or a "type var," as 1st item in a function
>> parameter-list,
>> (Read the trailing _ as a comma when the expansion isnt empty)
>> Ive pasted in some of the macros for perusal, skip if obvious/uninteresting.
>
> Coccinelle doesn't care (much) about the actual definition of the macro.

I noticed in /usr/share/coccinelle/standard.iso, there was:
// sound/sparc/dbri.c
// "bad macro", have a ',' at the end
#define CS4215_SINGLE(xname, entry, shift, mask, invert) \
{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \
 .info = snd_cs4215_info_single, \
 .get = snd_cs4215_get_single, .put = snd_cs4215_put_single, \
 .private_value = entry | (shift << 8) | (mask << 16) | (invert << 24) },

Does this mean that spatch has trouble with macros that expand into
list-elements ending with comma ??
This describes the problematic macros: pTHX_ aTHX_ dTHX etc



>> Parsing op.c shows many BAD lines:
>>
>> jimc@chumly:~/projects/perl-core/build-threads$ grep BAD ../spatch-res-op
>> |wc
>>     189    1088    9487
>>
>> jimc@chumly:~/projects/perl-core/build-threads$ grep BAD ../spatch-res-op
>> BAD:!!!!! #define CALL_OPFREEHOOK(o) if (PL_opfreehook) PL_opfreehook(aTHX_
>> o)
>> BAD:!!!!! Perl_Slab_Alloc(pTHX_ size_t sz)
>> BAD:!!!!!     I32 **const slabs = PL_slabs;
>> BAD:!!!!! S_Slab_to_rw(pTHX_ void *op)
>> BAD:!!!!! Perl_op_refcnt_inc(pTHX_ OP *o)
>> BAD:!!!!! Perl_op_refcnt_dec(pTHX_ OP *o)
>> BAD:!!!!! Perl_Slab_Free(pTHX_ void *op)
>> BAD:!!!!!      : PL_check[type](aTHX_ (OP*)o))
>> BAD:!!!!!     STMT_START { \
>>
>> many of these appear to be the pTHX, as mentioned above,
>> CALL_OPFREEHOOK may also be problematic:
>> which expands to:
>>     if ((my_perl->Iopfreehook)) (my_perl->Iopfreehook)(my_perl, o);
>>
>> What exactly does BAD !!! mean anyway ?
>
> This is the line where it gave up on the current function.
>
>> Is it more or less as described in /usr/share/coccinelle/standard.h ?
>> or is it that the dynamic expansion is insufficient ?
>>
>>
>> Overall, whats the right strategy for a project to help coccinelle to
>> support it ?
>> - a project specific  -macro_file           <file>
>> Do you keep links to such project files that would help in understanding
>> whats needed ?
>
> I don't have any, but it would indeed be a good idea to collect them.
>

Im now somewhat unclear on the -macro_file <> option,
despite it helping my situation dramatically -
Why would you not parse #included headers by default ?
at least when theyre on the include path, like #include "perl.h"

>> what does passed mean ?
>
> Sometimes when there is a macro that takes arguments, it considers that it
> is not able to properly parse the arguments, and the arguments are passed.
> No transformation occurs on them.  This happens for macros that are
> defined in standard.h or your macro file.  It has to do with the fact that
> there are two views of the arguments, as they are and as the macro file
> defines them to be.

[jimc@groucho perl]$ spatch -sp ../opsibl.cocci -all_includes
-macro_file perl.h -show_cocci \
   -I . -outplace -include_headers op.c
...
previous modification:
MINUS
According to environment 1:
  sibling_RHS.OP -> cLOGOPo->op_first->op_sibling

current modification:
MINUS
According to environment 1:
  sibling_RHS.OP -> cLOGOPo->op_first

Fatal error: exception Failure("sibling_RHS: already tagged token:
C code context
File "op.c", line 4531, column 60,  charpos = 114512
   around = 'op_sibling', whole content =         const I32 f =
is_list_assignment(cLOGOPo->op_first->op_sibling->op_sibling);")

This fatal error results in no output.
I get this error on op.c from all versions I have:

spatch version 0.2.0rc1 with Python support
spatch version 0.2.2 with Python support
spatch version 0.2.5-rc8 with Python support

BTW:
on fedora-13:
Great: a test file now works: typeof.res
--------------------------------
total score
--------------------------------
good = 335/359
Current score is greater than expected :)
(was expecting 298 but got 335)
Generating new expected score file and saving old one
mv tests/SCORE_expected.sexp tests/SCORE_expected.sexp.save [y/n] ?

on ubuntu 10.10, got
good = 332/359

> julia
>

thanks,
jimc
_______________________________________________
Cocci mailing list
[email protected]
http://lists.diku.dk/mailman/listinfo/cocci
(Web access from inside DIKUs LAN only)

Reply via email to