Re: [Cocci] exists do not work if file group is too big

2018-05-17 Thread Jerome Glisse
On Thu, May 17, 2018 at 11:11:12PM +0200, Julia Lawall wrote:
> 
> 
> On Thu, 17 May 2018, Jerome Glisse wrote:
> 
> > On Thu, May 17, 2018 at 10:45:29PM +0200, Julia Lawall wrote:
> > > In terms of the running time, I get a running time of 11 seconds with the
> > > command line of 48 files and I get a running time of 22 seconds if I just
> > > run spatch on the entire kernel.  That seems slower, but if I use the
> > > command line of 48 files, I get changes in 27 files, while if I run it on
> > > the entire kernel, I get changes in 75 files.  If I run it on the entire
> > > kernel using 40 cores (-j 40), I get changes in 75 files in 1.7 seconds.
> >
> > In this simple test case this would work but in my real spatch i have to
> > change one function at a time as otherwise there is conflict on header
> > update. Moreover i also have to use --recursive-headers if i don't do the
> > git grep to provide file list which is also one of the reasons why it
> > was much slower.
> >
> > Ideally if header files could be updated --in-place even when there is
> > conflict i might first add proper #include as a first step to avoid the
> > recursive flag.
> 
> I'm not sure to understand any of the above.  At the moment you have
> --no-includes, so no headers are consulted.  What do you need from header
> files?  Do you need type information?  Do you need to change the files?
> Do the changes in the header files interact with the changes in the .c
> file?
> 
> If it just happens that there are also calls to set_page_dirty in the
> header files that you want to update, then you can give the options
> --no-includes --include-headers.
> 
> If the only thing you want from the header files is type information, then
> you can say eg --recursive-includes --include-headers-for-types.  That
> would be much faster, because the transformation rules wouldn't be applied
> over and over to the header files.
> 
> If you want to do both, the above options can be combined.

I need to update function prototypes, and one semantic patch might update
multiple functions whose prototypes live in the same header file, just a
couple of lines apart from each other, in which case this does not work.
This is mostly an issue when updating callbacks, for which I do not know
the function names at the time of writing the semantic patch.

The recursive flag is needed because in many instances the function
prototype is defined in a header file that is not directly included by the
.c file, but is pulled in through another include or a complex chain of
includes.
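
Concretely, the option combinations Julia mentions would be invoked
roughly like this (the .cocci file name is only a placeholder, not a file
from this thread):

```
# Transform header files too, but do not parse them for type information:
spatch --no-includes --include-headers --sp-file rule.cocci --dir .

# Use headers (recursively) only for type information, never transform them:
spatch --recursive-includes --include-headers-for-types --sp-file rule.cocci --dir .
```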


> >
> >
> > As an example if i want to add a new argument to both:
> >
> > int set_page_dirty(struct page *page);
> > int set_page_dirty_lock(struct page *page);
> >
> > then because inside include/linux/mm.h they are declare one after the
> > other --in-place will fails. This case is not the worse, i can split
> > this in 2 spatches and it would work. I can't do that when updating
> > callback which have same pattern ie 2 callback prototype declare one
> > after the other.
> 
> Do you actually need to automate this part?
> 
> I think I could be more helpful if I had an example that more completely
> shows what you want to do.


Below is an actual example; I run it with:

spatch --dir . --include-headers -D grep --sp-file spatch

cat /tmp/unicorn-functions | while read line ; do
    git grep -l $line -- '*.[ch]' \
        | sed -e "s/\(.\+\)\.h/--include \1.h/g" > /tmp/unicorn-group
    echo "--include include/linux/fs.h" >> /tmp/unicorn-group
    echo "--include include/linux/mm.h" >> /tmp/unicorn-group
    cat /tmp/unicorn-group | tr '\n' ' '
    echo
done > /tmp/unicorn-groups

cat /tmp/unicorn-groups | while read line ; do
    spatch --in-place --no-includes --sp-file spatch $line
done


spatch:

// The first part of this file is only executed when the -D grep argument
// is provided to spatch. It is used to grep for all function names that
// need to be modified. The collected function names are written to a
// temporary file (/tmp/unicorn-functions), and git grep is then used to
// find all the files that will be modified by the second part of this
// semantic patch.
virtual grep

// initialize the file where we collect all function names (truncate it)
@initialize:python depends on grep@
@@
file=open('/tmp/unicorn-functions', 'w')
file.close()

// match function names used as a callback
@G1 depends on grep@
identifier I1, FN;
@@
struct address_space_operations I1 = {..., .set_page_dirty = FN, ...};

@script:python depends on G1@
funcname << G1.FN;
@@
if funcname != "NULL":
  file=open('/tmp/unicorn-functions', 'a')
  file.write(funcname + '\n')
  file.close()

// Also collect all function names that use a_ops->set_page_dirty()
@G2 depends on grep exists@
expression E1, E2;
identifier FN;
type T1;
@@
T1 FN(...) { ...
(
E1.a_ops->set_page_dirty(E2)
|
E1->a_ops->set_page_dirty(E2)
)
... }

@script:python depends on G2@
funcname << G2.FN;
@@
file=open('/tmp/unicorn-functions', 'a')
file.write(funcname + '\n')
print(funcname)
file.close()
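
The transformation half of the semantic patch is not shown above. Run
without -D grep and with -D fn=<name> (one name per group), it would look
roughly like the sketch below; the rule name, the metavariable binding,
and the added NULL argument are placeholders, not the actual rules used:

```
// Sketch only: assumes spatch is invoked with -D fn=<function name>.
virtual grep
virtual fn

@fix depends on !grep@
identifier f = virtual.fn;
expression E;
@@
- f(E)
+ f(NULL, E)
```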


Re: [Cocci] exists do not work if file group is too big

2018-05-17 Thread Jerome Glisse
On Thu, May 17, 2018 at 09:33:09AM +0200, SF Markus Elfring wrote:
> > When operating on too many files the exists keyword no longer work.
> 
> How do you think about to show a more precise error message for one approach
> to add the preprocessor symbol “NULL” as the first parameter in a call
> of the function “set_page_dirty”?

There is no error message in the issue I reported. Nested blocks are just
silently no longer modified once there are too many files.

> 
> 
> > spatch … --include fs/afs/internal.h …
> 
> * I find the position of this command parameter unusual.

I am using git grep to find the files which reference the symbols I am
modifying, and I am doing a bit of sed regular-expression work on its
output to prepend header files with --include.
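
For example, the sed expression rewrites only the header lines of git
grep's output, leaving .c files alone so they become the files to
transform (fs/afs/write.c here is just a made-up example file name):

```shell
# Header files are turned into "--include <header>" spatch arguments;
# .c files pass through unchanged.
printf 'fs/afs/internal.h\nfs/afs/write.c\n' \
    | sed -e "s/\(.\+\)\.h/--include \1.h/g"
```

which prints "--include fs/afs/internal.h" followed by "fs/afs/write.c".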

In the end I will probably have 40-50 semantic patches, and I do not wish
to create the list of files to modify by hand, especially since if I did I
would lose the big selling point of Coccinelle, i.e. not having to worry
about rebasing patches when things move around.

> 
> * Are you sure that you would like to adjust the selected source files
>   by a single command?

As I am modifying a lot of functions which are close together in the
source files, I am getting errors with --in-place on header files. I could
emit a patch and use patch -r - -f ... to force things, but that not only
complicates the whole set of commands needed to run, it also fails in more
subtle ways (applying the same modification twice can lead to missing
hunks).

This is why I am building groups of files to modify, one function at a
time, and I also pass the function name to Coccinelle using -D fn=name.

I have tried many ways to do this, and after a couple of weeks of
iteration this is the cleanest, simplest, and fastest solution I converged
on.

> 
> * Would you like to pass any file name to a command variant in a loop?

I do not see the benefit in that; I use --no-includes on my command line
and the modifications are pretty fast, so I am roughly happy as it is :)

Hope this feedback helps. My biggest issue with Coccinelle is the lack of
support for function pointer typedefs. Otherwise it is a pretty smooth
experience. I also have an issue with Python execution, but this might be
a Fedora packaging issue; I have a local patch that makes it work on
Fedora 27, but I need to check the Coccinelle git tree first to see if it
is still an issue in the latest version. I need to inspect the Fedora spec
file too.

Big kudos to everyone working on this tool.

Cheers,
Jérôme
___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci