On Thu, 28 Oct 2010, Andi Kleen wrote: > Hi, > > Just wanted to summarize my coccinelle experiences from using it for > a short term practical project: lock push down on a widely used > driver interface on the linux kernel.
Thanks very much for the feedback! > Overall I liked it: > > - coccinelle is a powerful useful tool for doing changes to a large > source base. > > Issues I ran into: > > - the documentation could be improved Indeed :) What is really needed is someone to go through all of the mailing list discussions and extract the generalizable bits into the wiki pages, but we don't have a person to do that at present. > - the grammar is not very easy to interpret. > - the examples in the kernel source are too complicated, it would be nice > if there were simpler ones without python and fancy features. Perhaps these are useful: http://coccinelle.lip6.fr/rules/ They are accessible from the documentation page, at "More examples are available here." > - the support on the list is good and helpful. > > - one problem is that larger drivers often consist out of multiple > files for different pattern matches (e.g. one file has the callback > definitio and the other the actual callback). I had to manually adjust > the command lines in this case and didn't find an obvious way to automate > this. If you give multiple file names at once, it will treat them all together, but you have to know which files to give. At one point, we tried to infer this automatically, but didn't really get that to a usable state. Perhaps something could be inferred from the make file. > - Since it's pretty slow on large files I found it useful to use > it from a makefile with -jN to parallelize the runs Kees Cook proposed the following script: #!/bin/bash set -e MAX=$(getconf _NPROCESSORS_ONLN) dir=$(mktemp -d) for i in $(seq 0 $(( MAX - 1 )) ); do spatch -max $MAX -index $i -very_quiet "$@" > $dir/$i.out & done wait cat $dir/*.out rm -f $dir/*.out rmdir $dir The advantage of this is that you parse the semantic patch only once per processor. If you are willing to use non-free software, you can pre-index your source code using glimpse (scripts/glimpseindex_cocci.sh), and then if you give coccinelle the option -use_glimpse, then it will only consider files that contain words that are relevant to your semantic patch. Otherwise, it uses grep to do the same filtering, but grep is slower. If someone knows of a free indexing tool, then we could try to support that as well. Actually glimpse is not entirely satisfactory because it considers _ to be a word separator, leading to many false positives in some cases. > - I ran into some problems with the version (0.2.2 from opensuse buildservice) > looping forever on some files. I ended up patching these files manually > after killing the process after a few minutes. Maybe it needs a watchdog at > least? You can give a timeout expressed in a number of seconds. -timeout 120 is probably good enough. On the other hand, if you run into such problems, please let me know (C file + semantic patch). Often there is an inefficiency that can be corrected. > - White space management is a problem. I had to do quite some manual > changes to get rid of empty lines in the wrong places. Also sometimes > there are extra spaces when replacing identifiers (but I read there were > some fixes for this after 0.2.2) I understand about the empty line problem you mentioned earlier. But if you have other white space problems, please let me know. They seems to have to be fixed one by one... > - One thing I missed was a way to express expressions without side effects. > For example when pushing down a lock I usually want to add a unlock > before the return. But sometimes returns do tail calls to other functions, > in this case the code needs to be refactored to first assign to a temporary. > I didn't find a straight forward way to detect this automatically and > ended up looking for it manually (had one case in a large change) You could do something like: ( - return f(...); + ret = f(...); + return ret; | return E; ) E and f are metavariables. ret would have to be chosen somehow. Another approach might be something like: @@ identifier i; constant c; expression E; @@ ( return i; | return c; | +ret = E; +return ret; -return E; ) Neither of these is perfect, though. Another option would be: -return +ret = <+... f(...) ...+>; +return ret; julia _______________________________________________ Cocci mailing list [email protected] http://lists.diku.dk/mailman/listinfo/cocci (Web access from inside DIKUs LAN only)
