On Thu, 28 Oct 2010, Andi Kleen wrote:

> Hi,
> 
> Just wanted to summarize my coccinelle experiences from using it for 
> a short term practical project: lock push down on a widely used
> driver interface on the linux kernel. 

Thanks very much for the feedback!
 
> Overall I liked it:
> 
> - coccinelle is a powerful useful tool for doing changes to a large 
> source base.
> 
> Issues I ran into:
> 
> - the documentation could be improved

Indeed :)  What is really needed is someone to go through all of the 
mailing list discussions and extract the generalizable bits into the wiki 
pages, but we don't have a person to do that at present.

> - the grammar is not very easy to interpret.
> - the examples in the kernel source are too complicated, it would be nice
> if there were simpler ones without python and fancy features.

Perhaps these are useful:

http://coccinelle.lip6.fr/rules/

They are accessible from the documentation page, at "More examples are 
available here."

> - the support on the list is good and helpful.
> 
> - one problem is that larger drivers often consist out of multiple
> files for different pattern matches (e.g. one file has the callback
> definitio and the other the actual callback). I had to manually adjust
> the command lines in this case and didn't find an obvious way to automate
> this.

If you give multiple file names at once, it will treat them all together, 
but you have to know which files to give.  At one point, we tried to infer 
this automatically, but didn't really get that to a usable state.  Perhaps 
something could be inferred from the make file.

> - Since it's pretty slow on large files I found it useful to use
> it from a makefile with -jN to parallelize the runs

Kees Cook proposed the following script:

#!/bin/bash
set -e
MAX=$(getconf _NPROCESSORS_ONLN)
dir=$(mktemp -d)
for i in $(seq 0 $(( MAX - 1 )) ); do
    spatch -max $MAX -index $i -very_quiet "$@" > $dir/$i.out &
done
wait
cat $dir/*.out
rm -f $dir/*.out
rmdir $dir

The advantage of this is that you parse the semantic patch only once per 
processor.

If you are willing to use non-free software, you can pre-index your source 
code using glimpse (scripts/glimpseindex_cocci.sh), and then if you give 
coccinelle the option -use_glimpse, then it will only consider files that
contain words that are relevant to your semantic patch.  Otherwise, it 
uses grep to do the same filtering, but grep is slower.

If someone knows of a free indexing tool, then we could try to support 
that as well.  Actually glimpse is not entirely satisfactory because it 
considers _ to be a word separator, leading to many false positives in 
some cases.

> - I ran into some problems with the version (0.2.2 from opensuse buildservice)
> looping forever on some files. I ended up patching these files manually
> after killing the process after a few minutes. Maybe it needs a watchdog at 
> least?

You can give a timeout expressed in a number of seconds.  -timeout 120 is 
probably good enough.  On the other hand, if you run into such problems, 
please let me know (C file + semantic patch).  Often there is an 
inefficiency that can be corrected.

> - White space management is a problem. I had to do quite some manual
> changes to get rid of empty lines in the wrong places. Also sometimes
> there are extra spaces when replacing identifiers (but I read there were
> some fixes for this after 0.2.2)

I understand about the empty line problem you mentioned earlier.  But if 
you have other white space problems, please let me know.  They seems to 
have to be fixed one by one...

> - One thing I missed was a way to express expressions without side effects.
> For example when pushing down a lock I usually want to add a unlock
> before the return. But sometimes returns do tail calls to other functions,
> in this case the code needs to be refactored to first assign to a temporary.
> I didn't find a straight forward way to detect this automatically and
> ended up looking for it manually (had one case in a large change)

You could do something like:

(
- return f(...);
+ ret = f(...);
+ return ret;
|
  return E;
)

E and f are metavariables.  ret would have to be chosen somehow.

Another approach might be something like:

@@
identifier i;
constant c;
expression E;
@@

(
 return i;
|
 return c;
|
+ret = E;
+return ret;
-return E;
)

Neither of these is perfect, though.  Another option would be:

-return
+ret =
        <+... f(...) ...+>;
+return ret;

julia
_______________________________________________
Cocci mailing list
[email protected]
http://lists.diku.dk/mailman/listinfo/cocci
(Web access from inside DIKUs LAN only)

Reply via email to