Re: [Cocci] [PATCH v2 0/3] parsing_c: Optimize recursive header file parsing

2020-09-09 Thread Jaskaran Singh
On Thu, 2020-09-10 at 11:47 +0530, Jaskaran Singh wrote:
> This patch series aims to optimize performance for recursively parsing
> header files in Coccinelle.
> 
> Coccinelle's C parsing subsystem has an option called --recursive-includes
> to recursively parse header files. This is used for type
> inference/annotation.
> 
> Previously, using --recursive-includes on the entire Linux kernel source
> code would take far too long. On my computer with the following specs,
>   - Processor: AMD Ryzen 5 3550H
>   - RAM: 8 GB
> it would take close to 7 hours to complete.  The optimization that this
> patch series implements reduces that time to 1 hour.
> 
> The following is a high-level description of what has been implemented:
> - As header files are recursively parsed, they are scanned for the
>   following:
>   - fields of structs/unions/enums
>   - typedefs
>   - function prototypes
>   - global variables
>   The names of the above are stored in a "name cache", i.e. a hashtable to
>   map the name to the files it is declared in.
> - A dependency graph is built to determine dependencies between all the
>   files in the codebase.
> - In the type annotation phase of the C subsystem, if a function call,
>   struct/union field or identifier is encountered, the type of which is
>   not known to the annoter, the name cache is checked for the name.
> - The name cache gives a list of files that the name is declared/defined
>   in.  These files are cross checked with the dependency graph to
>   determine if any of these are reachable by the file that the annoter is
>   working on.
> - If a reachable header file is found, that file is parsed and the type
>   associated to the name is returned.
> 
> Different approaches that were attempted to alleviate this issue, and the
> problems with each are as follows:
> - Caching the most recently used files: A LRU cache to store ASTs of the
>   most recently encountered header files. The problem with this approach
>   is the amount of memory it takes to cache the header file ASTs.
> - Caching the most troublesome files: A pseudo-LFU cache to store files
>   that cumulatively take the longest to parse, and thus bloat the time
>   taken. The problem with this approach is the amount of memory it takes
>   to cache the header file ASTs.
> - Skipping unparsable locations in header files: Skipping top-level items
>   in a header file that cannot be parsed. This approach does not produce
>   even close to the amount of optimization needed.
> 
> The next step from here would be:
> - Maintain a small but persistent cache of header files in groups of
>   directories. Leverage multiprocessing for parsing these header files.
> - Leverage multiprocessing to parse header files initially for name
>   extraction.
> - Performing some initial matching with the semantic patch to determine if
>   a C file matches. If matches are found, call the annoter and recursively
>   parse header files for type annotation.
> - Recursively parse all header files only once and build a large type
>   environment. Use the dependency graph to determine reachability. This
>   has potential memory usage issues though.
> 
> 
> Changes in v2:
> --------------
> - Move occurrences of 'begin' and 'match' that shared a line with other
>   code onto the next line for better readability.
> 
> 
>  Makefile                     |   2
>  parsing_c/includes_cache.ml  | 286 +++
>  parsing_c/includes_cache.mli |  47 +++
>  parsing_c/parse_c.ml         |  27 +++-
>  parsing_c/type_annoter_c.ml  | 130 ---
>  5 files changed, 466 insertions(+), 26 deletions(-)
> 

Yikes, the diffstat is the old one. Here's the latest (not very different):

 Makefile                     |   2
 parsing_c/includes_cache.ml  | 290 +++
 parsing_c/includes_cache.mli |  47 ++
 parsing_c/parse_c.ml         |  28 +++-
 parsing_c/type_annoter_c.ml  | 134 +--
 5 files changed, 475 insertions(+), 26 deletions(-)


> 

___
Cocci mailing list
Cocci@systeme.lip6.fr
https://systeme.lip6.fr/mailman/listinfo/cocci


[Cocci] [PATCH v2 0/3] parsing_c: Optimize recursive header file parsing

2020-09-09 Thread Jaskaran Singh
This patch series aims to optimize performance for recursively parsing
header files in Coccinelle.

Coccinelle's C parsing subsystem has an option called --recursive-includes
to recursively parse header files. This is used for type
inference/annotation.

Previously, using --recursive-includes on the entire Linux kernel source
code would take far too long. On my computer with the following specs,
  - Processor: AMD Ryzen 5 3550H
  - RAM: 8 GB
it would take close to 7 hours to complete.  The optimization that this
patch series implements reduces that time to 1 hour.

The following is a high-level description of what has been implemented:
- As header files are recursively parsed, they are scanned for the
  following:
  - fields of structs/unions/enums
  - typedefs
  - function prototypes
  - global variables
  The names of the above are stored in a "name cache", i.e. a hashtable to
  map the name to the files it is declared in.
- A dependency graph is built to determine dependencies between all the
  files in the codebase.
- In the type annotation phase of the C subsystem, if a function call,
  struct/union field or identifier is encountered, the type of which is
  not known to the annoter, the name cache is checked for the name.
- The name cache gives a list of files that the name is declared/defined
  in.  These files are cross checked with the dependency graph to
  determine if any of these are reachable by the file that the annoter is
  working on.
- If a reachable header file is found, that file is parsed and the type
  associated with the name is returned.
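
The lookup described in the list above can be sketched roughly as follows.
This is only a minimal, self-contained illustration and not the code in this
series: the names name_cache, includes, declare, add_include, reachable and
candidate_headers are hypothetical, and a plain Hashtbl of include edges
stands in for the graph module that the patches actually use.

```ocaml
(* Illustrative sketch only; every name here is hypothetical. *)

(* name -> file declaring it. Hashtbl.add keeps earlier bindings, so one
   name can map to several files (retrieved with Hashtbl.find_all). *)
let name_cache : (string, string) Hashtbl.t = Hashtbl.create 101

(* file -> headers it includes directly *)
let includes : (string, string list) Hashtbl.t = Hashtbl.create 101

let declare name file = Hashtbl.add name_cache name file

let add_include file header =
  let old = try Hashtbl.find includes file with Not_found -> [] in
  Hashtbl.replace includes file (header :: old)

(* Depth-first search: is [target] reachable from [src] via include edges? *)
let reachable src target =
  let seen = Hashtbl.create 17 in
  let rec go f =
    if f = target then true
    else if Hashtbl.mem seen f then false
    else begin
      Hashtbl.add seen f ();
      let succs = try Hashtbl.find includes f with Not_found -> [] in
      List.exists go succs
    end
  in
  go src

(* Headers that declare [name] and are reachable from [file]. Only these
   would need to be parsed to recover the type of [name]. *)
let candidate_headers file name =
  List.filter (reachable file) (Hashtbl.find_all name_cache name)
```

With a.c including a.h, a.h including types.h, and a name declared in both
types.h and an unrelated other.h, candidate_headers "a.c" returns only
types.h, so the unrelated header is never parsed.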

Other approaches that were attempted to alleviate this issue, and the
problems with each, are as follows:
- Caching the most recently used files: An LRU cache to store ASTs of the
  most recently encountered header files. The problem with this approach
  is the amount of memory it takes to cache the header file ASTs.
- Caching the most troublesome files: A pseudo-LFU cache to store files
  that cumulatively take the longest to parse, and thus bloat the time
  taken. The problem with this approach is, again, the amount of memory it
  takes to cache the header file ASTs.
- Skipping unparsable locations in header files: Skipping top-level items
  in a header file that cannot be parsed. This approach does not come
  close to the amount of optimization needed.
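
For reference, the first of these approaches can be pictured with the small
LRU cache below. It is a hypothetical, list-based sketch (the type lru and
the functions create, take and find_or_load are made up for illustration)
and not the cache that was actually tried; strings stand in for ASTs.

```ocaml
(* Hypothetical list-based LRU cache; O(capacity) per access. *)
type ('k, 'v) lru = { capacity : int; mutable entries : ('k * 'v) list }

let create capacity = { capacity; entries = [] }

(* First [n] elements of a list (or fewer if the list is shorter). *)
let rec take n = function
  | x :: xs when n > 0 -> x :: take (n - 1) xs
  | _ -> []

(* Return the cached value for [key], computing it with [load] on a miss.
   The accessed entry moves to the front; once the cache is over capacity
   the least recently used entry (the last one) is dropped. *)
let find_or_load cache key load =
  let value =
    match List.assoc_opt key cache.entries with
    | Some v -> v
    | None -> load key
  in
  let rest = List.remove_assoc key cache.entries in
  cache.entries <- take cache.capacity ((key, value) :: rest);
  value
```

Every entry keeps a whole value alive until it is evicted, which is where
the memory cost mentioned above comes from when the values are header ASTs.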

The next steps from here would be:
- Maintain a small but persistent cache of header files in groups of
  directories. Leverage multiprocessing for parsing these header files.
- Leverage multiprocessing to parse header files initially for name
  extraction.
- Perform some initial matching with the semantic patch to determine if
  a C file matches. If matches are found, call the annoter and recursively
  parse header files for type annotation.
- Recursively parse all header files only once and build a large type
  environment. Use the dependency graph to determine reachability. This
  has potential memory usage issues though.


Changes in v2:
--------------
- Move occurrences of 'begin' and 'match' that shared a line with other
  code onto the next line for better readability.


 Makefile                     |   2
 parsing_c/includes_cache.ml  | 286 +++
 parsing_c/includes_cache.mli |  47 +++
 parsing_c/parse_c.ml         |  27 +++-
 parsing_c/type_annoter_c.ml  | 130 ---
 5 files changed, 466 insertions(+), 26 deletions(-)




[Cocci] [PATCH v2 2/3] parsing_c: parse_c: Build name cache and includes dependency graph

2020-09-09 Thread Jaskaran Singh
Build the includes dependency graph and name cache while parsing header
files. Every header file is parsed only once for name caching and, while
parsing these files, an includes dependency graph is built to determine
reachability of one header file from another file.
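
The "parse once, but always record the edge" behaviour described above can
be sketched as below. This is an illustration under assumptions, not the
patch itself: parsed, edges and handle_include_sketch are hypothetical
names, and the parse argument stands in for the real header parser.

```ocaml
(* Hypothetical sketch: each header is parsed at most once, while every
   (including file, header) pair is still recorded as a dependency edge. *)
let parsed : (string, unit) Hashtbl.t = Hashtbl.create 101
let edges : (string * string) list ref = ref []

let handle_include_sketch ~parse file header =
  if not (Hashtbl.mem parsed header) then begin
    (* Mark before parsing so that include cycles do not loop forever. *)
    Hashtbl.add parsed header ();
    parse header
  end;
  edges := (file, header) :: !edges
```

Two different C files including the same header trigger a single parse but
leave two edges in the graph.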

Signed-off-by: Jaskaran Singh 
---
 parsing_c/parse_c.ml | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/parsing_c/parse_c.ml b/parsing_c/parse_c.ml
index 5574cb11b..ef5870123 100644
--- a/parsing_c/parse_c.ml
+++ b/parsing_c/parse_c.ml
@@ -17,6 +17,7 @@ open Common
 
 module TH = Token_helpers
 module LP = Lexer_parser
+module IC = Includes_cache
 
 module Stat = Parsing_stat
 
@@ -995,15 +996,30 @@ let rec _parse_print_error_heuristic2 saved_typedefs saved_macros
 and handle_include file wrapped_incl k =
     let incl = Ast_c.unwrap wrapped_incl.Ast_c.i_include in
     let parsing_style = Includes.get_parsing_style () in
+    let f = Includes.resolve file parsing_style incl in
     if Includes.should_parse parsing_style file incl
     then
-      match Includes.resolve file parsing_style incl with
+      match f with
       | Some header_filename when Common.lfile_exists header_filename ->
-          (if !Flag_parsing_c.verbose_includes
-          then pr2 ("including "^header_filename));
-          let nonlocal =
-            match incl with Ast_c.NonLocal _ -> true | _ -> false in
-          ignore (k nonlocal header_filename)
+          if not (IC.has_been_parsed header_filename)
+          then
+            begin
+              IC.add_to_parsed_files header_filename;
+              (if !Flag_parsing_c.verbose_includes
+              then pr2 ("including "^header_filename));
+              let nonlocal =
+                match incl with Ast_c.NonLocal _ -> true | _ -> false in
+              let res = k nonlocal header_filename in
+              match res with
+                None -> ()
+              | Some x ->
+                  let pt = x.parse_trees in
+                  let (p, _, _) = pt in
+                  with_program2_unit
+                    (IC.extract_names header_filename)
+                    p
+            end;
+          IC.add_to_dependency_graph file header_filename;
       | _ -> ()
 
 and _parse_print_error_heuristic2bis saved_typedefs saved_macros
-- 
2.21.3



[Cocci] [PATCH v2 1/3] parsing_c: includes_cache: Implement a name cache

2020-09-09 Thread Jaskaran Singh
Implement a name cache and includes dependency graph to optimize
performance for recursive parsing of header files.

The following is a high-level description of what has been implemented:
- As header files are recursively parsed, they are scanned for the
  following:
  - fields of structs/unions/enums
  - typedefs
  - function prototypes
  - global variables
  The names of the above are stored in a "name cache", i.e. a hashtable
  to map the name to the files it is declared in.
- A dependency graph is built to determine dependencies between all the
  files in the codebase.
- In the type annotation phase of the C subsystem, if a function call,
  struct/union field or identifier is encountered, the type of which is
  not known to the annoter, the name cache is checked for the name.
- The name cache gives a list of files that the name is declared/defined
  in.  These files are cross checked with the dependency graph to
  determine if any of these are reachable by the file that the annoter is
  working on.
- If a reachable header file is found, that file is parsed and all of
  the above listed constructs are extracted from it.
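
As a rough picture of the scanning step, the sketch below walks a toy list
of top-level items and records the declared names per file. Everything in
it is hypothetical: the toplevel type is a stand-in for the real AST in
Ast_c, and record and toy_extract_names are illustrative names rather than
functions from this patch.

```ocaml
(* Toy stand-in for the top-level items of a parsed header. *)
type toplevel =
  | Typedef of string
  | Prototype of string
  | GlobalVar of string
  | Struct of string * string list   (* tag, field names *)
  | Other

(* name -> file declaring it; several bindings per name are allowed. *)
let name_cache : (string, string) Hashtbl.t = Hashtbl.create 101

(* Record that [name] is declared in [file], once per (name, file) pair. *)
let record file name =
  if not (List.mem file (Hashtbl.find_all name_cache name))
  then Hashtbl.add name_cache name file

(* Only names are cached here (struct fields, not the struct tag), so the
   cache stays small compared with keeping whole ASTs around. *)
let toy_extract_names file items =
  List.iter
    (function
      | Typedef n | Prototype n | GlobalVar n -> record file n
      | Struct (_, fields) -> List.iter (record file) fields
      | Other -> ())
    items
```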

Suggested-by: Julia Lawall 
Signed-off-by: Jaskaran Singh 
---
 Makefile |   2 +-
 parsing_c/includes_cache.ml  | 290 +++
 parsing_c/includes_cache.mli |  47 ++
 3 files changed, 338 insertions(+), 1 deletion(-)
 create mode 100644 parsing_c/includes_cache.ml
 create mode 100644 parsing_c/includes_cache.mli

diff --git a/Makefile b/Makefile
index e25174413..f8d3424c0 100644
--- a/Makefile
+++ b/Makefile
@@ -50,7 +50,7 @@ SOURCES_parsing_cocci := \
 SOURCES_parsing_c := \
token_annot.ml flag_parsing_c.ml parsing_stat.ml \
token_c.ml ast_c.ml includes.ml control_flow_c.ml \
-   visitor_c.ml lib_parsing_c.ml control_flow_c_build.ml \
+   visitor_c.ml lib_parsing_c.ml includes_cache.ml control_flow_c_build.ml \
pretty_print_c.ml semantic_c.ml lexer_parser.ml parser_c.mly \
lexer_c.mll parse_string_c.ml token_helpers.ml token_views_c.ml \
cpp_token_c.ml parsing_hacks.ml cpp_analysis_c.ml \
diff --git a/parsing_c/includes_cache.ml b/parsing_c/includes_cache.ml
new file mode 100644
index 0..ca5c91822
--- /dev/null
+++ b/parsing_c/includes_cache.ml
@@ -0,0 +1,290 @@
+(*
+ * This file is part of Coccinelle, licensed under the terms of the GPL v2.
+ * See copyright.txt in the Coccinelle source code for more information.
+ * The Coccinelle source code can be obtained at http://coccinelle.lip6.fr
+ *)
+
+open Common
+module Lib = Lib_parsing_c
+
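+(*****************************************************************************)
+(* Wrappers *)
+(*****************************************************************************)
+let pr_inc s =
+  if !Flag_parsing_c.verbose_includes
+  then Common.pr2 s
+
+(*****************************************************************************)
+(* Graph types/modules *)
+(*****************************************************************************)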
+
+(* Filenames as keys to check paths from file A to file B. *)
+module Key : Set.OrderedType with type t = Common.filename = struct
+  type t = Common.filename
+  let compare = String.compare
+end
+
+module KeySet = Set.Make (Key)
+
+module KeyMap = Map.Make (Key)
+
+module Node : Set.OrderedType with type t = unit = struct
+  type t = unit
+  let compare = compare
+end
+
+module Edge : Set.OrderedType with type t = unit = struct
+  type t = unit
+  let compare = compare
+end
+
+module KeyEdgePair : Set.OrderedType with type t = Key.t * Edge.t =
+struct
+  type t = Key.t * Edge.t
+  let compare = compare
+end
+
+module KeyEdgeSet = Set.Make (KeyEdgePair)
+
+module G = Ograph_simple.Make
+  (Key) (KeySet) (KeyMap) (Node) (Edge) (KeyEdgePair) (KeyEdgeSet)
+
+
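+(*****************************************************************************)
+(* Includes dependency graph *)
+(*****************************************************************************)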
+
+(* Header file includes dependency graph *)
+let dependency_graph = ref (new G.ograph_mutable)
+
+(* Check if a path exists between one node to another.
+ * Almost a copy of dfs_iter in commons/ograph_extended.ml with minor changes.
+ * Return true if g satisfies predicate f else false
+ *)
+let dfs_exists xi f g =
+  let already = Hashtbl.create 101 in
+  let rec aux_dfs xs =
+    let h xi =
+      if Hashtbl.mem already xi
+      then false
+      else
+        begin
+          Hashtbl.add already xi true;
+          if f xi
+          then true
+          else
+            begin
+              let f' (key, _) keyset = KeySet.add key keyset in
+              let newset =
+                try KeyEdgeSet.fold f' (g#successors xi) KeySet.empty
+                with Not_found -> KeySet.empty in
+              aux_dfs newset
+            end
+        end in
+    KeySet.exists h xs in
+  aux_dfs (KeySet.singleton xi)
+
+let add_to_dependenc

[Cocci] [PATCH v2 3/3] parsing_c: type_annoter_c: Use name cache for type annotation

2020-09-09 Thread Jaskaran Singh
Use the name cache for type annotation. When one of the following is
encountered and is not found in the environment, the name cache is
consulted and the relevant header file is parsed for type information:
- struct field use
- typedef
- function call
- identifier
- enumeration constant
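
The fallback described above follows a "miss, fill from the cache, retry"
pattern with success and failure continuations. The sketch below shows that
pattern in isolation. It is hypothetical throughout: env, bindings_from_cache
and lookup_with_cache are made-up names, types are plain strings, and the
stub bindings_from_cache stands in for the name cache lookup plus header
parsing done by the real code.

```ocaml
(* Simplified environment: name -> type, both as strings. *)
let env : (string, string) Hashtbl.t = Hashtbl.create 17

(* Stand-in for "look the name up in the name cache and parse a reachable
   header": returns the bindings found, or [] if there are none. *)
let bindings_from_cache name =
  if name = "size_t" then [("size_t", "unsigned long")] else []

(* Try the environment first; on a miss, add whatever the cache yields to
   the environment (a side effect) and retry once. *)
let lookup_with_cache name ~on_success ~on_failure =
  match Hashtbl.find_opt env name with
  | Some ty -> on_success ty
  | None ->
      let found = bindings_from_cache name in
      List.iter (fun (n, ty) -> Hashtbl.replace env n ty) found;
      (match Hashtbl.find_opt env name with
       | Some ty -> on_success ty
       | None -> on_failure ())
```

Because the bindings are added to the environment, a second lookup of the
same name succeeds without touching the cache again.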

Signed-off-by: Jaskaran Singh 
---
 parsing_c/type_annoter_c.ml | 134 +++-
 1 file changed, 115 insertions(+), 19 deletions(-)

diff --git a/parsing_c/type_annoter_c.ml b/parsing_c/type_annoter_c.ml
index 25cb6c0ee..49fd060be 100644
--- a/parsing_c/type_annoter_c.ml
+++ b/parsing_c/type_annoter_c.ml
@@ -19,6 +19,7 @@ open Common
 open Ast_c
 
 module Lib = Lib_parsing_c
+module IC = Includes_cache
 
 (*****************************************************************************)
 (* Prelude *)
@@ -186,6 +187,13 @@ type nameenv = {
 
 type environment = nameenv list
 
+let includes_parse_fn file =
+  let choose_includes = Includes.get_parsing_style () in
+  Includes.set_parsing_style Includes.Parse_no_includes;
+  let ret = Parse_c.parse_c_and_cpp false false file in
+  Includes.set_parsing_style choose_includes;
+  List.map fst (fst ret)
+
 (*  *)
 (* can be modified by the init_env function below, by
  * the file environment_unix.h
@@ -294,6 +302,39 @@ let member_env_lookup_enum s env =
   | [] -> false
   | env :: _ -> StringMap.mem s env.enum_constant
 
+(*  *)
+
+let add_cache_binding_in_scope namedef =
+  let (current, older) = Common.uncons !_scoped_env in
+  let new_frame fr =
+    match namedef with
+      | IC.RetVarOrFunc (s, typ) ->
+          {fr with
+           var_or_func = StringMap.add s typ fr.var_or_func}
+      | IC.RetTypeDef   (s, typ) ->
+          let cv = typ, fr.typedef, fr.level in
+          let new_typedef_c : typedefs = { defs = StringMap.add s cv fr.typedef.defs } in
+          {fr with typedef = new_typedef_c}
+      | IC.RetStructUnionNameDef (s, (su, typ)) ->
+          {fr with
+           struct_union_name_def = StringMap.add s (su, typ) fr.struct_union_name_def}
+      | IC.RetEnumConstant (s, body) ->
+          {fr with
+           enum_constant = StringMap.add s body fr.enum_constant} in
+  (* These are global, so have to reflect them in all the frames. *)
+  _scoped_env := (new_frame current)::(List.map new_frame older)
+
+(* Has side-effects on the environment.
+ * TODO: profile? *)
+let get_type_from_includes_cache file name exp_types on_success on_failure =
+  let file_bindings =
+    IC.get_types_from_name_cache
+      file name exp_types includes_parse_fn in
+  List.iter add_cache_binding_in_scope file_bindings;
+  match file_bindings with
+    [] -> on_failure ()
+  | _ -> on_success ()
+
 
 (*****************************************************************************)
 (* "type-lookup"  *)
@@ -394,7 +435,14 @@ let rec type_unfold_one_step ty env =
  then type_unfold_one_step t' env'
   else loop (s::seen) t' env
with Not_found ->
-  ty
+  let f = Ast_c.file_of_info (Ast_c.info_of_name name) in
+  get_type_from_includes_cache
+f s [IC.CacheTypedef]
+(fun () ->
+  let (t', env') = lookup_typedef s !_scoped_env in
+  TypeName (name, Some t') +>
+  Ast_c.rewrap_typeC ty)
+(fun () -> ty)
   )
 
   | FieldType (t, _, _) -> type_unfold_one_step t env
@@ -474,7 +522,15 @@ let rec typedef_fix ty env =
  TypeName (name, Some fixed) +>
  Ast_c.rewrap_typeC ty
 with Not_found ->
-  ty))
+  let f = Ast_c.file_of_info (Ast_c.info_of_name name) in
+  get_type_from_includes_cache
+f s [IC.CacheTypedef]
+(fun () ->
+   let (t', env') = lookup_typedef s !_scoped_env in
+   TypeName (name, Some t') +>
+   Ast_c.rewrap_typeC ty)
+(fun () -> ty)
+  ))
 
 | FieldType (t, a, b) ->
FieldType (typedef_fix t env, a, b) +> Ast_c.rewrap_typeC ty
@@ -797,8 +853,16 @@ let annotater_expr_visitor_subpart = (fun (k,bigf) expr ->
 Type_c.noTypeHere
 )
 | None ->
-pr2_once ("type_annotater: no type for function ident: " ^ s);
-Type_c.noTypeHere
+let f =
+  Ast_c.file_of_info
+(Ast_c.info_of_name ident) in
+get_type_from_includes_cache
+  f s [IC.CacheVarFunc]
+  (fun () ->
+ match lookup_opt_env lookup_var s with
+   Some (typ, local) -> make_info_fix (typ, local)
+ | None -> Type_c.noTypeHere)
+  (fun () -> Type_c.noTypeHere)
 )
 )
 
@@ -848,22 +912,36 

Re: [Cocci] [RFC PATCH 1/3] parsing_c: includes_cache: Implement a name cache

2020-09-09 Thread Jaskaran Singh
On Wed, 2020-09-09 at 22:05 +0200, Markus Elfring wrote:
> > Implement a name cache and includes dependency graph to optimize
> > performance for recursive parsing of header files.
> 
> Can such information trigger any more evolution besides the contributed
> OCaml source code?
> 
> 
> >   The names of the above are stored in a "name cache", i.e. a hashtable
> >   to map the name to the files it is declared in.
> 
> How much does hashing matter here?
> 

It's a hash table. I don't know how OCaml's Hashtbl works under the
hood.

> 
> > - A dependency graph is built to determine dependencies between all the
> >  files in the codebase.
> 
> Can such information indicate a need for its own programming interface?
> 

If you mean graphs, there are entire modules for them in Coccinelle
(commons/ograph_*.ml).

> 
> > - In the type annotation phase of the C subsystem, if a function call,
> >   struct/union field or identifier is encountered, the type of which is
> >   not known to the annoter, the name cache is checked for the name.
> 
> Is there anything in common with symbol tables?
> 

Kind of.

Cheers,
Jaskaran.

> Regards,
> Markus



Re: [Cocci] [RFC PATCH 2/3] parsing_c: parse_c: Build name cache and includes dependency graph

2020-09-09 Thread Jaskaran Singh
On Wed, 2020-09-09 at 21:41 +0200, Julia Lawall wrote:
> 
> On Wed, 9 Sep 2020, Jaskaran Singh wrote:
> 
> > Build the includes dependency graph and name cache while parsing header
> > files. Every header file is parsed only once for name caching and, while
> > parsing these files, an includes dependency graph is built to determine
> > reachability of one header file from another file.
> 
> So you really parse the whole file here?  

Yes.

> Could you avoid that? 

Well, we need the AST for name caching. I guess we could lazily parse
these on demand and do the name caching in the type annoter. But I'm
not sure how much of a difference that would make; it would probably end
up parsing about 80% of those branches anyway.
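
To make "lazily parse on demand" concrete, here is a tiny sketch using
OCaml's built-in lazy values. It is hypothetical and not part of the series:
parse_header is a stand-in for the real parser, and lazy_asts, register and
ast_of are made-up names.

```ocaml
(* Hypothetical: one suspended parse per header, forced at most once. *)
let parse_count = ref 0

(* Stand-in parser that just counts how often it is called. *)
let parse_header file = (incr parse_count; "AST of " ^ file)

let lazy_asts : (string, string Lazy.t) Hashtbl.t = Hashtbl.create 17

(* Registering a header does not parse it. *)
let register file = Hashtbl.replace lazy_asts file (lazy (parse_header file))

(* The first call parses the header; later calls reuse the result. *)
let ast_of file = Lazy.force (Hashtbl.find lazy_asts file)
```

The catch is the one noted above: without a parse there are no names to
cache, so the saving depends on how many headers are never forced.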

>  Is it
> that you need to parse something to find the end of it?
> 

We need everything from it, i.e. struct fields, enumeration constants,
function prototypes, etc. as well as the #includes in it.

Cheers,
Jaskaran.

> julia
> 
> > Signed-off-by: Jaskaran Singh 
> > ---
> >  parsing_c/parse_c.ml | 27 +--
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> > 
> > diff --git a/parsing_c/parse_c.ml b/parsing_c/parse_c.ml
> > index 5574cb11b..3b250720f 100644
> > --- a/parsing_c/parse_c.ml
> > +++ b/parsing_c/parse_c.ml
> > @@ -17,6 +17,7 @@ open Common
> > 
> >  module TH = Token_helpers
> >  module LP = Lexer_parser
> > +module IC = Includes_cache
> > 
> >  module Stat = Parsing_stat
> > 
> > @@ -995,15 +996,29 @@ let rec _parse_print_error_heuristic2 saved_typedefs saved_macros
> >  and handle_include file wrapped_incl k =
> >  let incl = Ast_c.unwrap wrapped_incl.Ast_c.i_include in
> >  let parsing_style = Includes.get_parsing_style () in
> > +let f = Includes.resolve file parsing_style incl in
> >  if Includes.should_parse parsing_style file incl
> >  then
> > -  match Includes.resolve file parsing_style incl with
> > +  match f with
> >| Some header_filename when Common.lfile_exists header_filename ->
> > - (if !Flag_parsing_c.verbose_includes
> > - then pr2 ("including "^header_filename));
> > - let nonlocal =
> > -   match incl with Ast_c.NonLocal _ -> true | _ -> false in
> > -  ignore (k nonlocal header_filename)
> > +  if not (IC.has_been_parsed header_filename)
> > +  then begin
> > +IC.add_to_parsed_files header_filename;
> > +   (if !Flag_parsing_c.verbose_includes
> > +   then pr2 ("including "^header_filename));
> > +   let nonlocal =
> > + match incl with Ast_c.NonLocal _ -> true | _ -> false in
> > +let res = k nonlocal header_filename in
> > +match res with
> > +  None -> ()
> > +| Some x ->
> > +let pt = x.parse_trees in
> > +let (p, _, _) = pt in
> > +with_program2_unit
> > +  (IC.extract_names header_filename)
> > +  p
> > +  end;
> > +  IC.add_to_dependency_graph file header_filename;
> >| _ -> ()
> > 
> >  and _parse_print_error_heuristic2bis saved_typedefs saved_macros
> > --
> > 2.21.3
> > 
> > 



Re: [Cocci] [RFC PATCH 1/3] parsing_c: includes_cache: Implement a name cache

2020-09-09 Thread Markus Elfring
> Implement a name cache and includes dependency graph to optimize
> performance for recursive parsing of header files.

Can such information trigger any more evolution besides the contributed
OCaml source code?


>   The names of the above are stored in a "name cache", i.e. a hashtable
>   to map the name to the files it is declared in.

How much does hashing matter here?


> - A dependency graph is built to determine dependencies between all the
>  files in the codebase.

Can such information indicate a need for its own programming interface?


> - In the type annotation phase of the C subsystem, if a function call,
>   struct/union field or identifier is encountered, the type of which is
>   not known to the annoter, the name cache is checked for the name.

Is there anything in common with symbol tables?

Regards,
Markus


Re: [Cocci] [RFC PATCH 2/3] parsing_c: parse_c: Build name cache and includes dependency graph

2020-09-09 Thread Julia Lawall



On Wed, 9 Sep 2020, Jaskaran Singh wrote:

> Build the includes dependency graph and name cache while parsing header
> files. Every header file is parsed only once for name caching and, while
> parsing these files, an includes dependency graph is built to determine
> reachability of one header file from another file.

So you really parse the whole file here?  Could you avoid that?  Is it
that you need to parse something to find the end of it?

julia

>
> Signed-off-by: Jaskaran Singh 
> ---
>  parsing_c/parse_c.ml | 27 +--
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/parsing_c/parse_c.ml b/parsing_c/parse_c.ml
> index 5574cb11b..3b250720f 100644
> --- a/parsing_c/parse_c.ml
> +++ b/parsing_c/parse_c.ml
> @@ -17,6 +17,7 @@ open Common
>
>  module TH = Token_helpers
>  module LP = Lexer_parser
> +module IC = Includes_cache
>
>  module Stat = Parsing_stat
>
> @@ -995,15 +996,29 @@ let rec _parse_print_error_heuristic2 saved_typedefs saved_macros
>  and handle_include file wrapped_incl k =
>  let incl = Ast_c.unwrap wrapped_incl.Ast_c.i_include in
>  let parsing_style = Includes.get_parsing_style () in
> +let f = Includes.resolve file parsing_style incl in
>  if Includes.should_parse parsing_style file incl
>  then
> -  match Includes.resolve file parsing_style incl with
> +  match f with
>| Some header_filename when Common.lfile_exists header_filename ->
> -   (if !Flag_parsing_c.verbose_includes
> -   then pr2 ("including "^header_filename));
> -   let nonlocal =
> - match incl with Ast_c.NonLocal _ -> true | _ -> false in
> -  ignore (k nonlocal header_filename)
> +  if not (IC.has_been_parsed header_filename)
> +  then begin
> +IC.add_to_parsed_files header_filename;
> + (if !Flag_parsing_c.verbose_includes
> + then pr2 ("including "^header_filename));
> + let nonlocal =
> +   match incl with Ast_c.NonLocal _ -> true | _ -> false in
> +let res = k nonlocal header_filename in
> +match res with
> +  None -> ()
> +| Some x ->
> +let pt = x.parse_trees in
> +let (p, _, _) = pt in
> +with_program2_unit
> +  (IC.extract_names header_filename)
> +  p
> +  end;
> +  IC.add_to_dependency_graph file header_filename;
>| _ -> ()
>
>  and _parse_print_error_heuristic2bis saved_typedefs saved_macros
> --
> 2.21.3
>
>


Re: [Cocci] [RFC PATCH 0/3] parsing_c: Optimize recursive header file parsing

2020-09-09 Thread Markus Elfring
> This patch series aims to optimize performance for recursively parsing
> header files in Coccinelle.

I am curious what encouraged you to take up such a software development
challenge.


> it would take close to 7 hours to complete.

This is unfortunate.

Would you be willing to offer any more information (besides the
mentioned processor) for further benchmarking purposes?
https://github.com/coccinelle/coccinelle/issues/133


> The optimization that this patch series implements reduces that time to 1 hour.

Such a scale of improvement is impressive.


> - Skipping unparsable locations in header files: Skipping top-level items
>   in a header file that cannot be parsed. …

Will any further software evolution happen according to a topic like
“Exclusion of unsupported source code parts”?
https://github.com/coccinelle/coccinelle/issues/20


> - Recursively parse all header files only once and build a large type
>   environment. Use the dependency graph to determine reachability. …

Are you looking for the support of header file “precompilation”?

Regards,
Markus


[Cocci] [RFC PATCH 0/3] parsing_c: Optimize recursive header file parsing

2020-09-09 Thread Jaskaran Singh
This patch series aims to optimize performance for recursively parsing
header files in Coccinelle.

Coccinelle's C parsing subsystem has an option called --recursive-includes
to recursively parse header files. This is used for type
inference/annotation.

Previously, using --recursive-includes on the entire Linux kernel source
code would take far too long. On my computer with the following specs,
  - Processor: AMD Ryzen 5 3550H
  - RAM: 8 GB
it would take close to 7 hours to complete.  The optimization that this
patch series implements reduces that time to 1 hour.

The following is a high-level description of what has been implemented:
- As header files are recursively parsed, they are scanned for the
  following:
  - fields of structs/unions/enums
  - typedefs
  - function prototypes
  - global variables
  The names of the above are stored in a "name cache", i.e. a hashtable to
  map the name to the files it is declared in.
- A dependency graph is built to determine dependencies between all the
  files in the codebase.
- In the type annotation phase of the C subsystem, if a function call,
  struct/union field or identifier is encountered, the type of which is
  not known to the annoter, the name cache is checked for the name.
- The name cache gives a list of files that the name is declared/defined
  in.  These files are cross checked with the dependency graph to
  determine if any of these are reachable by the file that the annoter is
  working on.
- If a reachable header file is found, that file is parsed and the type
  associated with the name is returned.

Other approaches that were attempted to alleviate this issue, and the
problems with each, are as follows:
- Caching the most recently used files: An LRU cache to store ASTs of the
  most recently encountered header files. The problem with this approach
  is the amount of memory it takes to cache the header file ASTs.
- Caching the most troublesome files: A pseudo-LFU cache to store files
  that cumulatively take the longest to parse, and thus bloat the time
  taken. The problem with this approach is, again, the amount of memory it
  takes to cache the header file ASTs.
- Skipping unparsable locations in header files: Skipping top-level items
  in a header file that cannot be parsed. This approach does not come
  close to the amount of optimization needed.

The next steps from here would be:
- Maintain a small but persistent cache of header files in groups of
  directories. Leverage multiprocessing for parsing these header files.
- Leverage multiprocessing to parse header files initially for name
  extraction.
- Perform some initial matching with the semantic patch to determine if
  a C file matches. If matches are found, call the annoter and recursively
  parse header files for type annotation.
- Recursively parse all header files only once and build a large type
  environment. Use the dependency graph to determine reachability. This
  has potential memory usage issues though.


 Makefile                     |   2
 parsing_c/includes_cache.ml  | 286 +++
 parsing_c/includes_cache.mli |  47 +++
 parsing_c/parse_c.ml         |  27 +++-
 parsing_c/type_annoter_c.ml  | 130 ---
 5 files changed, 466 insertions(+), 26 deletions(-)




[Cocci] [RFC PATCH 3/3] parsing_c: type_annoter_c: Use name cache for type annotation

2020-09-09 Thread Jaskaran Singh
Use the name cache for type annotation. When one of the following is
encountered and is not found in the environment, the name cache is
consulted and the relevant header file is parsed for type information:
- struct field use
- typedef
- function call
- identifier
- enumeration constant

Signed-off-by: Jaskaran Singh 
---
 parsing_c/type_annoter_c.ml | 130 ++--
 1 file changed, 111 insertions(+), 19 deletions(-)

diff --git a/parsing_c/type_annoter_c.ml b/parsing_c/type_annoter_c.ml
index 25cb6c0ee..497332544 100644
--- a/parsing_c/type_annoter_c.ml
+++ b/parsing_c/type_annoter_c.ml
@@ -19,6 +19,7 @@ open Common
 open Ast_c
 
 module Lib = Lib_parsing_c
+module IC = Includes_cache
 
 (*****************************************************************************)
 (* Prelude *)
@@ -186,6 +187,13 @@ type nameenv = {
 
 type environment = nameenv list
 
+let includes_parse_fn file =
+  let choose_includes = Includes.get_parsing_style () in
+  Includes.set_parsing_style Includes.Parse_no_includes;
+  let ret = Parse_c.parse_c_and_cpp false false file in
+  Includes.set_parsing_style choose_includes;
+  List.map fst (fst ret)
+
 (*  *)
 (* can be modified by the init_env function below, by
  * the file environment_unix.h
@@ -294,6 +302,39 @@ let member_env_lookup_enum s env =
   | [] -> false
   | env :: _ -> StringMap.mem s env.enum_constant
 
+(*  *)
+
+let add_cache_binding_in_scope namedef =
+  let (current, older) = Common.uncons !_scoped_env in
+  let new_frame fr =
+    match namedef with
+      | IC.RetVarOrFunc (s, typ) ->
+          {fr with
+           var_or_func = StringMap.add s typ fr.var_or_func}
+      | IC.RetTypeDef   (s, typ) ->
+          let cv = typ, fr.typedef, fr.level in
+          let new_typedef_c : typedefs = { defs = StringMap.add s cv fr.typedef.defs } in
+          {fr with typedef = new_typedef_c}
+      | IC.RetStructUnionNameDef (s, (su, typ)) ->
+          {fr with
+           struct_union_name_def = StringMap.add s (su, typ) fr.struct_union_name_def}
+      | IC.RetEnumConstant (s, body) ->
+          {fr with
+           enum_constant = StringMap.add s body fr.enum_constant} in
+  (* These are global, so have to reflect them in all the frames. *)
+  _scoped_env := (new_frame current)::(List.map new_frame older)
+
+(* Has side-effects on the environment.
+ * TODO: profile? *)
+let get_type_from_includes_cache file name exp_types on_success on_failure =
+  let file_bindings =
+    IC.get_types_from_name_cache
+      file name exp_types includes_parse_fn in
+  List.iter add_cache_binding_in_scope file_bindings;
+  match file_bindings with
+    [] -> on_failure ()
+  | _ -> on_success ()
+
 
 (*****************************************************************************)
 (* "type-lookup"  *)
@@ -394,7 +435,14 @@ let rec type_unfold_one_step ty env =
  then type_unfold_one_step t' env'
   else loop (s::seen) t' env
with Not_found ->
-  ty
+  let f = Ast_c.file_of_info (Ast_c.info_of_name name) in
+  get_type_from_includes_cache
+f s [IC.CacheTypedef]
+(fun () ->
+  let (t', env') = lookup_typedef s !_scoped_env in
+  TypeName (name, Some t') +>
+  Ast_c.rewrap_typeC ty)
+(fun () -> ty)
   )
 
   | FieldType (t, _, _) -> type_unfold_one_step t env
@@ -474,7 +522,15 @@ let rec typedef_fix ty env =
  TypeName (name, Some fixed) +>
  Ast_c.rewrap_typeC ty
 with Not_found ->
-  ty))
+  let f = Ast_c.file_of_info (Ast_c.info_of_name name) in
+  get_type_from_includes_cache
+f s [IC.CacheTypedef]
+(fun () ->
+  let (t', env') = lookup_typedef s !_scoped_env in
+  TypeName (name, Some t') +>
+  Ast_c.rewrap_typeC ty)
+(fun () -> ty)
+  ))
 
 | FieldType (t, a, b) ->
FieldType (typedef_fix t env, a, b) +> Ast_c.rewrap_typeC ty
@@ -797,8 +853,15 @@ let annotater_expr_visitor_subpart = (fun (k,bigf) expr ->
 Type_c.noTypeHere
 )
 | None ->
-pr2_once ("type_annotater: no type for function ident: " ^ s);
-Type_c.noTypeHere
+let f =
+  Ast_c.file_of_info
+(Ast_c.info_of_name ident) in
+get_type_from_includes_cache
+  f s [IC.CacheVarFunc]
+  (fun () -> match lookup_opt_env lookup_var s with
+Some (typ, local) -> make_info_fix (typ, local)
+  | None -> Type_c.noTypeHere)
+  (fun () -> Type_c.noTypeHere)
 )
 )
 
@@ -848,22 +911,33 @@ let annotater_expr_visitor_s

[Cocci] [RFC PATCH 1/3] parsing_c: includes_cache: Implement a name cache

2020-09-09 Thread Jaskaran Singh
Implement a name cache and includes dependency graph to optimize
performance for recursive parsing of header files.

The following is a high-level description of what has been implemented:
- As header files are recursively parsed, they are scanned for the
  following:
  - fields of structs/unions/enums
  - typedefs
  - function prototypes
  - global variables
  The names of the above are stored in a "name cache", i.e. a hashtable
  to map the name to the files it is declared in.
- A dependency graph is built to determine dependencies between all the
  files in the codebase.
- In the type annotation phase of the C subsystem, if a function call,
  struct/union field or identifier is encountered, the type of which is
  not known to the annoter, the name cache is checked for the name.
- The name cache gives a list of files that the name is declared/defined
  in.  These files are cross checked with the dependency graph to
  determine if any of these are reachable by the file that the annoter is
  working on.
- If a reachable header file is found, that file is parsed and all of
  the above listed constructs are extracted from it.
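
As a rough illustration of the steps above, the sketch below models the name
cache and the dependency graph with plain standard-library hash tables. It is
not the code from this patch (which builds on Ograph_simple); `name_cache`,
`includes`, `reachable` and `candidates` are hypothetical names chosen for the
example.

```ocaml
(* Hypothetical, simplified model; none of these names come from the patch. *)
module StringSet = Set.Make (String)

(* Name cache: declared name -> header files that declare it. *)
let name_cache : (string, string) Hashtbl.t = Hashtbl.create 101

(* Dependency graph: file -> files it directly includes. *)
let includes : (string, string) Hashtbl.t = Hashtbl.create 101

let add_name name file = Hashtbl.add name_cache name file
let add_include parent header = Hashtbl.add includes parent header

(* Depth-first search: is [target] reachable from [start]?
   The [seen] set keeps the search from looping on include cycles. *)
let reachable start target =
  let rec go seen = function
    | [] -> false
    | f :: rest ->
      if f = target then true
      else if StringSet.mem f seen then go seen rest
      else go (StringSet.add f seen) (Hashtbl.find_all includes f @ rest)
  in
  go StringSet.empty [start]

(* Headers that declare [name] and are reachable from [file]. *)
let candidates file name =
  List.filter (reachable file) (Hashtbl.find_all name_cache name)
```

Only the headers returned by something like `candidates` would then need to
be parsed, which is where the time saving is expected to come from.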

Suggested-by: Julia Lawall 
Signed-off-by: Jaskaran Singh 
---
 Makefile |   2 +-
 parsing_c/includes_cache.ml  | 286 +++
 parsing_c/includes_cache.mli |  47 ++
 3 files changed, 334 insertions(+), 1 deletion(-)
 create mode 100644 parsing_c/includes_cache.ml
 create mode 100644 parsing_c/includes_cache.mli

diff --git a/Makefile b/Makefile
index e25174413..f8d3424c0 100644
--- a/Makefile
+++ b/Makefile
@@ -50,7 +50,7 @@ SOURCES_parsing_cocci := \
 SOURCES_parsing_c := \
token_annot.ml flag_parsing_c.ml parsing_stat.ml \
token_c.ml ast_c.ml includes.ml control_flow_c.ml \
-   visitor_c.ml lib_parsing_c.ml control_flow_c_build.ml \
+   visitor_c.ml lib_parsing_c.ml includes_cache.ml control_flow_c_build.ml 
\
pretty_print_c.ml semantic_c.ml lexer_parser.ml parser_c.mly \
lexer_c.mll parse_string_c.ml token_helpers.ml token_views_c.ml \
cpp_token_c.ml parsing_hacks.ml cpp_analysis_c.ml \
diff --git a/parsing_c/includes_cache.ml b/parsing_c/includes_cache.ml
new file mode 100644
index 000000000..2c2cb0235
--- /dev/null
+++ b/parsing_c/includes_cache.ml
@@ -0,0 +1,286 @@
+(*
+ * This file is part of Coccinelle, licensed under the terms of the GPL v2.
+ * See copyright.txt in the Coccinelle source code for more information.
+ * The Coccinelle source code can be obtained at http://coccinelle.lip6.fr
+ *)
+
+open Common
+module Lib = Lib_parsing_c
+
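+(*****************************************************************************)
+(* Wrappers *)
+(*****************************************************************************)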
+let pr_inc s =
+  if !Flag_parsing_c.verbose_includes
+  then Common.pr2 s
+
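+(*****************************************************************************)
+(* Graph types/modules *)
+(*****************************************************************************)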
+
+(* Filenames as keys to check paths from file A to file B. *)
+module Key : Set.OrderedType with type t = Common.filename = struct
+  type t = Common.filename
+  let compare = String.compare
+end
+
+module KeySet = Set.Make (Key)
+
+module KeyMap = Map.Make (Key)
+
+module Node : Set.OrderedType with type t = unit = struct
+  type t = unit
+  let compare = compare
+end
+
+module Edge : Set.OrderedType with type t = unit = struct
+  type t = unit
+  let compare = compare
+end
+
+module KeyEdgePair : Set.OrderedType with type t = Key.t * Edge.t =
+struct
+  type t = Key.t * Edge.t
+  let compare = compare
+end
+
+module KeyEdgeSet = Set.Make (KeyEdgePair)
+
+module G = Ograph_simple.Make
+  (Key) (KeySet) (KeyMap) (Node) (Edge) (KeyEdgePair) (KeyEdgeSet)
+
+
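+(*****************************************************************************)
+(* Includes dependency graph *)
+(*****************************************************************************)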
+
+(* Header file includes dependency graph *)
+let dependency_graph = ref (new G.ograph_mutable)
+
+(* Check if a path exists from one node to another.
+ * Almost a copy of dfs_iter in commons/ograph_extended.ml with minor changes.
+ * Return true if some node reachable from xi in g satisfies f, else false.
+ *)
+let dfs_exists xi f g =
+  let already = Hashtbl.create 101 in
+  let rec aux_dfs xs =
+    let h xi =
+      if Hashtbl.mem already xi
+      then false
+      else begin
+        Hashtbl.add already xi true;
+        if f xi
+        then true
+        else begin
+          let f' (key, _) keyset = KeySet.add key keyset in
+          let newset =
+            try KeyEdgeSet.fold f' (g#successors xi) KeySet.empty
+            with Not_found -> KeySet.empty in
+          aux_dfs newset
+        end
+      end in
+    KeySet.exists h xs in
+  aux_dfs (KeySet.singleton xi)
+
+let add_to_dependency_graph parent file =
+  let add_node a =
+if not (K

[Cocci] [RFC PATCH 2/3] parsing_c: parse_c: Build name cache and includes dependency graph

2020-09-09 Thread Jaskaran Singh
Build the includes dependency graph and name cache while parsing header
files. Every header file is parsed only once for name caching and, while
parsing these files, an includes dependency graph is built to determine
reachability of one header file from another file.
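
The "parse once, but always record the edge" idea can be sketched as below.
This is a simplified stand-in, not the patch itself: `parsed`, `edges` and
this `handle_include` signature are assumptions made for the example, and
`parse` represents whatever work name extraction does on a header.

```ocaml
(* Hypothetical sketch: names and signature are illustrative only. *)

(* Headers that have already been parsed for name extraction. *)
let parsed : (string, unit) Hashtbl.t = Hashtbl.create 16

(* Include edges as (includer, included) pairs. *)
let edges : (string * string) list ref = ref []

let handle_include ~parse parent header =
  (* Parse the header only the first time it is seen. *)
  if not (Hashtbl.mem parsed header) then begin
    Hashtbl.add parsed header ();
    parse header
  end;
  (* The edge is recorded on every include, even for cached headers,
     so reachability stays complete. *)
  edges := (parent, header) :: !edges
```

Recording the edge outside the guard matters: a header included by many
files is parsed once, yet every includer still gets its own edge.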

Signed-off-by: Jaskaran Singh 
---
 parsing_c/parse_c.ml | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/parsing_c/parse_c.ml b/parsing_c/parse_c.ml
index 5574cb11b..3b250720f 100644
--- a/parsing_c/parse_c.ml
+++ b/parsing_c/parse_c.ml
@@ -17,6 +17,7 @@ open Common
 
 module TH = Token_helpers
 module LP = Lexer_parser
+module IC = Includes_cache
 
 module Stat = Parsing_stat
 
@@ -995,15 +996,29 @@ let rec _parse_print_error_heuristic2 saved_typedefs saved_macros
 and handle_include file wrapped_incl k =
 let incl = Ast_c.unwrap wrapped_incl.Ast_c.i_include in
 let parsing_style = Includes.get_parsing_style () in
+let f = Includes.resolve file parsing_style incl in
 if Includes.should_parse parsing_style file incl
 then
-  match Includes.resolve file parsing_style incl with
+  match f with
   | Some header_filename when Common.lfile_exists header_filename ->
- (if !Flag_parsing_c.verbose_includes
- then pr2 ("including "^header_filename));
- let nonlocal =
-   match incl with Ast_c.NonLocal _ -> true | _ -> false in
-  ignore (k nonlocal header_filename)
+      if not (IC.has_been_parsed header_filename)
+      then begin
+        IC.add_to_parsed_files header_filename;
+        (if !Flag_parsing_c.verbose_includes
+         then pr2 ("including "^header_filename));
+        let nonlocal =
+          match incl with Ast_c.NonLocal _ -> true | _ -> false in
+        let res = k nonlocal header_filename in
+        match res with
+          None -> ()
+        | Some x ->
+            let pt = x.parse_trees in
+            let (p, _, _) = pt in
+            with_program2_unit
+              (IC.extract_names header_filename)
+              p
+      end;
+      IC.add_to_dependency_graph file header_filename;
   | _ -> ()
 
 and _parse_print_error_heuristic2bis saved_typedefs saved_macros
-- 
2.21.3



[Cocci] [PATCH V3] scripts: coccicheck: Do not use shift command when rule is specified

2020-09-09 Thread Sumera Priyadarsini
The command "make coccicheck C=1 CHECK=scripts/coccicheck" results in the
error:
./scripts/coccicheck: line 65: -1: shift count out of range

This happens because every time the C variable is specified,
the shell arguments need to be "shifted" in order to take only
the last argument, which is the C file to test. These shell arguments
mostly comprise flags that have been set in the Makefile. However,
when coccicheck is specified in the make command as a rule, the
number of shell arguments is zero, thus passing the invalid value -1
to the shift command, resulting in an error.

Modify coccicheck to use the shift command only when
number of shell arguments is not zero.

Signed-off-by: Sumera Priyadarsini 

---
Changes in V2:
- Fix spelling errors as suggested by Markus Elfring
---
 scripts/coccicheck | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/scripts/coccicheck b/scripts/coccicheck
index e04d328210ac..5c8df337e1e3 100755
--- a/scripts/coccicheck
+++ b/scripts/coccicheck
@@ -61,9 +61,19 @@ COCCIINCLUDE=${COCCIINCLUDE// -include/ --include}
 if [ "$C" = "1" -o "$C" = "2" ]; then
 ONLINE=1
 
-# Take only the last argument, which is the C file to test
-shift $(( $# - 1 ))
-OPTIONS="$COCCIINCLUDE $1"
+# If the rule coccicheck is specified when calling make, number of
+# arguments is zero
+if [ $# -ne 0 ]; then
+   # Take only the last argument, which is the C file to test
+   shift $(( $# -1 ))
+   OPTIONS="$COCCIINCLUDE $1"
+else
+   if [ "$KBUILD_EXTMOD" = "" ] ; then
+   OPTIONS="--dir $srctree $COCCIINCLUDE"
+   else
+   OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE"
+   fi
+fi
 
 # No need to parallelize Coccinelle since this mode takes one input file.
 NPROC=1
-- 
2.25.1



Re: [Cocci] [PATCH v2] scripts: coccicheck: Do not use shift command when rule is specified

2020-09-09 Thread Julia Lawall


On Wed, 9 Sep 2020, Markus Elfring wrote:

> > Modify coccicheck to use the shift command only when
> > number of shell arguments is not zero.
>
> I suggest to add the tag “Fixes” to the commit message.

I don't think there is any need for that.  This is not a patch that should
be backported.  The previous situation did not cause any problem with the
execution of make coccicheck, only a tiresome warning message.

julia

>
>
> > Changes in V2:
> > - Fix spelling errors as suggested by Markus Elfring
>
> Would you like to adjust the last word in the previous patch subject 
> accordingly?
>
> Regards,
> Markus


Re: [Cocci] [PATCH v2] scripts: coccicheck: Do not use shift command when rule is specified

2020-09-09 Thread Markus Elfring
> Modify coccicheck to use the shift command only when
> number of shell arguments is not zero.

I suggest to add the tag “Fixes” to the commit message.


> Changes in V2:
>   - Fix spelling errors as suggested by Markus Elfring

Would you like to adjust the last word in the previous patch subject 
accordingly?

Regards,
Markus


[Cocci] [PATCH V2] scripts: coccicheck: Do not use shift command when rule is specfified

2020-09-09 Thread Sumera Priyadarsini
The command "make coccicheck C=1 CHECK=scripts/coccicheck" results in the
error:
./scripts/coccicheck: line 65: -1: shift count out of range

This happens because every time the C variable is specified,
the shell arguments need to be "shifted" in order to take only
the last argument, which is the C file to test. These shell arguments
mostly comprise flags that have been set in the Makefile. However,
when coccicheck is specified in the make command as a rule, the
number of shell arguments is zero, thus passing the invalid value -1
to the shift command, resulting in an error.

Modify coccicheck to use the shift command only when
number of shell arguments is not zero.

Signed-off-by: Sumera Priyadarsini 

---
Changes in V2:
- Fix spelling errors as suggested by Markus Elfring
---
 scripts/coccicheck | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/scripts/coccicheck b/scripts/coccicheck
index e04d328210ac..5c8df337e1e3 100755
--- a/scripts/coccicheck
+++ b/scripts/coccicheck
@@ -61,9 +61,19 @@ COCCIINCLUDE=${COCCIINCLUDE// -include/ --include}
 if [ "$C" = "1" -o "$C" = "2" ]; then
 ONLINE=1
 
-# Take only the last argument, which is the C file to test
-shift $(( $# - 1 ))
-OPTIONS="$COCCIINCLUDE $1"
+# If the rule coccicheck is specified when calling make, number of
+# arguments is zero
+if [ $# -ne 0 ]; then
+   # Take only the last argument, which is the C file to test
+   shift $(( $# -1 ))
+   OPTIONS="$COCCIINCLUDE $1"
+else
+   if [ "$KBUILD_EXTMOD" = "" ] ; then
+   OPTIONS="--dir $srctree $COCCIINCLUDE"
+   else
+   OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE"
+   fi
+fi
 
 # No need to parallelize Coccinelle since this mode takes one input file.
 NPROC=1
-- 
2.25.1



Re: [Cocci] [PATCH] scripts: coccicheck: Do not use shift command when rule is specified

2020-09-09 Thread Sumera Priyadarsini
On Wed, Sep 09, 2020 at 08:52:19AM +0200, Markus Elfring wrote:
> I find it helpful to avoid typos (like the following) in the change 
> description.
> 
> 
> > … Makfeile. …
> 
> … Makefile. …
> 
> 
> > … paasing …
> 
> … passing …
> 
> 
> > …, resuting …
> 
> …, resulting …
> 
> 
> > This patch modifies coccicheck …
> 

I did make those errors but I also rectified them. This is strange
because my commit message shows the rectified version.
Either way, I will send a v2. Thanks for pointing this out.

> Would an imperative wording be preferred for the commit message?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=34d4ddd359dbcdf6c5fb3f85a179243d7a1cb7f8#n151
> 
> Regards,
> Markus