On 01/13/2010 01:35 PM, Julia Lawall wrote:
As there seems to be an interest in this I'll re-run all variants of
that script tonight with coccinelle 0.2.0 and will report the results. I
will also attach the cocci files; it might well be that I've done
something wrong in them.
Great. Thanks. It is very useful to us to know about any performance
bottlenecks.
This was like hunting chupacabra but a good learning experience. It went
from a "identifier filtering in python" version which I killed after 90+
minutes of burned CPU time; to a "OCaml regexp" version which finishes
in 12 minutes to a "cleaned up OCaml regexp" version finishing in
*DRUMROLL* 2.5 seconds. I have also tried the "traditional SmPL" version
proposed by Nicolas; it works fine and finishes in 9 seconds.
I have attached the cocci files and the test file is Wine's
dlls/oleaut32/typelib.c that can be grabbed from
http://source.winehq.org/git/wine.git/?a=tree;f=dlls/oleaut32;h=76989c42583c3814414f1832686392bc6d993ef5;hb=21b6c3202a506d32c198d9b87c4666fc5cf658aa
For completeness, the specs of my computer: Intel Q9450 @2.66 GHz, 8 GB
RAM, Fedora 11, vanilla 2.6.33-rc4 Kernel.
Bottleneck:
I didn't know that a position carries the function information too.
I had also seen a few example SmPL files using the construct below, so
my initial approach was something like this:
@ r @
identifier f;
expression str;
position p;
@@
f ( ... ) {
<+...
fi...@p( str, ...);
...+>
}
@ script:python @
f << r.f; str << r.str; p << r.p;
@@
print("%s:%s:%s" % (p[0].file, p[0].line, f))
If this is run on the mentioned test file it will finish in 13.5
seconds.
The simplified version:
@ r @
expression str;
position p;
@@
fi...@p( str, ...);
@ script:python @
str << r.str;
p << r.p;
@@
print("%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element))
will give the exact same result but finishes in 1.1 seconds; an order of
magnitude less.
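To illustrate why the simplified script needs no "f" metavariable: the position already carries the name of the enclosing function next to file and line. A minimal Python sketch, using a hypothetical Location stand-in for the position objects that Coccinelle hands to the script (only the three attributes used above are modelled, and the values are made up):

```python
from collections import namedtuple

# Hypothetical stand-in for the position objects Coccinelle passes to
# script:python rules; only the attributes used in this mail are modelled.
Location = namedtuple("Location", ["file", "line", "current_element"])


def format_hit(p):
    """Format the first position of a match as file:line:function."""
    return "%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element)


# Made-up example values, not real output from typelib.c.
p = [Location(file="typelib.c", line="123", current_element="some_function")]
print(format_hit(p))
```

Outside of spatch this only demonstrates the formatting; inside a script:python rule the list p is supplied by Coccinelle.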
The "f( ...) { <+... ...+> }" construct seems to be very expensive. My
script needs two such rules and it ends up being two orders of
magnitude slower.
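As a quick sanity check of those ratios, here is the arithmetic on the timings quoted in this mail. The sketch assumes that the 12 minute and 2.5 second runs differ mainly in the function wrapper, which is how the numbers read; the variable names are only illustrative.

```python
# Timings quoted in this mail, in seconds.
with_function_wrapper = 13.5   # rule wrapped in f( ... ) { <+... ...+> }
without_wrapper = 1.1          # rule matching the call directly

regexp_first = 12 * 60         # "OCaml regexp" version: 12 minutes
regexp_cleaned = 2.5           # "cleaned up OCaml regexp" version


def slowdown(slow, fast):
    """How many times slower the first timing is than the second."""
    return slow / fast


single_rule = slowdown(with_function_wrapper, without_wrapper)
full_script = slowdown(regexp_first, regexp_cleaned)
print("single rule: %.1fx, full script: %.0fx" % (single_rule, full_script))
```

The single rule comes out at roughly one order of magnitude and the full script at a bit over two, matching the wording above.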
Julia and Nicolas, many many thanks for your help and especially for
introducing the regexp support. For me it is the more natural form (I
was using perl before for such tasks), it is shorter and doesn't suffer
from rule ordering performance issues. Of course it is nice that it is
also the fastest solution but I could have lived even with the 12 minutes.
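For readers coming from Perl or Python regexps: the identifier constraint in the attached scripts, "\(WINE_\)?\(FIXME\|ERR\|WARN\)", is written in the backslash-escaped style of OCaml's Str library, where grouping and alternation need a backslash. A rough Python spelling, as a sketch only; it assumes the pattern is matched at the start of the identifier, which this mail does not spell out, and the sample names are just illustrations:

```python
import re

# Python spelling of the SmPL constraint "\(WINE_\)?\(FIXME\|ERR\|WARN\)".
SPAM = re.compile(r"(WINE_)?(FIXME|ERR|WARN)")
# Variant with the trailing underscore, as used for the SPAM_ metavariable.
SPAM_ = re.compile(r"(WINE_)?(FIXME|ERR|WARN)_")


def matches(pattern, identifier):
    # Assumption: the match is anchored at the start of the name only.
    return pattern.match(identifier) is not None


names = ["FIXME", "WINE_ERR", "WARN_", "TRACE", "WINE_TRACE"]
for name in names:
    print(name, matches(SPAM, name), matches(SPAM_, name))
```

Under that assumption the first pattern also accepts the underscore variants such as WARN_; the two rules stay apart because the matched call shapes differ: ( str, ... ) versus ( ch )( str, ... ).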
bye
michael
@ channel @
identifier ch;
declarer name WINE_DEFAULT_DEBUG_CHANNEL;
@@
WINE_DEFAULT_DEBUG_CHANNEL(ch);
@ fixme depends on channel @
identifier f;
identifier SPAM ~= "\(WINE_\)?\(FIXME\|ERR\|WARN\)";
expression str;
position p;
@@
f( ... )
{
<+...
s...@p( str, ... );
...+>
}
@ fixme_ @
identifier ch, f;
identifier SPAM_ ~= "\(WINE_\)?\(FIXME\|ERR\|WARN\)_";
expression str;
position p;
@@
f( ... )
{
<+...
sp...@p( ch )( str, ... );
...+>
}
@ script:python @
f << fixme.f;
p << fixme.p;
spam << fixme.SPAM;
ch << channel.ch;
str << fixme.str;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, f, spam, ch, str))
@ script:python @
f << fixme_.f;
p << fixme_.p;
spam << fixme_.SPAM_;
ch << fixme_.ch;
str << fixme_.str;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, f, spam, ch, str))
@ channel @
identifier ch;
declarer name WINE_DEFAULT_DEBUG_CHANNEL;
@@
WINE_DEFAULT_DEBUG_CHANNEL(ch);
@ fixme depends on channel @
identifier SPAM ~= "\(WINE_\)?\(FIXME\|ERR\|WARN\)";
expression str;
position p;
@@
s...@p( str, ... );
@ fixme_ @
identifier ch;
identifier SPAM_ ~= "\(WINE_\)?\(FIXME\|ERR\|WARN\)_";
expression str;
position p;
@@
sp...@p( ch )( str, ... );
@ script:python @
p << fixme.p;
spam << fixme.SPAM;
ch << channel.ch;
str << fixme.str;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element, spam,
ch, str))
@ script:python @
p << fixme_.p;
spam << fixme_.SPAM_;
ch << fixme_.ch;
str << fixme_.str;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element, spam,
ch, str))
@ channel @
identifier ch;
declarer name WINE_DEFAULT_DEBUG_CHANNEL;
@@
WINE_DEFAULT_DEBUG_CHANNEL(ch);
@ fixme depends on channel @
expression str;
position p;
@@
\(e...@p\|wine_...@p\|fi...@p\|wine_fi...@p\|w...@p\|wine_w...@p\)( str,
... );
@ id @
identifier SPAM;
position fixme.p;
@@
s...@p
@ fixme_ @
identifier ch;
expression str;
position p;
@@
\(e...@p\|wine_e...@p\|fix...@p\|wine_fix...@p\|wa...@p\|wine_wa...@p\)(
ch )( str, ... );
@ id_ @
identifier SPAM_;
position fixme_.p;
@@
sp...@p
@ script:python @
ch << channel.ch;
str << fixme.str;
p << fixme.p;
spam << id.SPAM;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element, spam,
ch, str))
@ script:python @
ch << fixme_.ch;
p << fixme_.p;
str << fixme_.str;
spam << id_.SPAM_;
@@
print("%s:%s:%s:%s:%s:%s" % (p[0].file, p[0].line, p[0].current_element, spam,
ch, str))