We'll try out our idea first since it's much easier than what you propose. I won't put the new passes first unless they really do give a speedup.

I agree that something like what you suggest would be a good idea, but I'm not sure how to use precompiled headers in a simple and generic way. One possibility is that C-Reduce would get an LLVM-specific module, a GCC-specific module, etc.
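Roughly, the compiler-specific part would just be how the PCH gets built and consumed. A minimal sketch in Python of what the two modules might boil down to ("big_header.h" is a made-up name; only the PCH invocations are the standard clang/GCC ones):

  import subprocess

  def clang_pch(header="big_header.h"):
      # Clang emits an explicit .pch file that has to be handed back
      # to the compiler with -include-pch on every subsequent compile.
      subprocess.check_call(["clang", "-x", "c++-header", header,
                             "-o", header + ".pch"])
      return ["clang++", "-include-pch", header + ".pch"]

  def gcc_pch(header="big_header.h"):
      # GCC writes big_header.h.gch next to the header and picks it up
      # automatically whenever that header is #included; no extra flag.
      subprocess.check_call(["g++", "-x", "c++-header", header])
      return ["g++"]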

John



On 07/12/2013 10:36 AM, Konstantin Tokarev wrote:
12.07.2013, 20:13, "John Regehr" <[email protected]>:

  Thanks for letting me know.  I had intended to use a real lexer for a
  long time but somehow only got around to it recently.

  Konstantin, Yang and I are trying to figure out how to speed up the
  initial part of C-Reduce when it is given a very large C++ file.  The
  line-based passes are just not that great.

  Maybe you can give us feedback on our current idea.  The idea is to
  remove function bodies.  This can be done either by replacing a
  definition with a declaration, or simply by stripping everything out of
  the function definition (except for an appropriate "return" statement,
  obviously).
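  For example (made-up function), "int f(int x) { return x * 2; }" would
  become either "int f(int x);" or "int f(int x) { return 0; }".  A
  sketch of the rewriting step in Python, assuming a lexer (not shown)
  has already located the byte offsets of the body's braces:

  # Two body-removal transforms; open_brace/close_brace are the offsets
  # of the body's '{' and its matching '}', as found by the lexer.
  def to_declaration(src, open_brace, close_brace):
      # Replace the whole body with ';', leaving a plain declaration.
      return src[:open_brace].rstrip() + ";" + src[close_brace + 1:]

  def to_stub_body(src, open_brace, close_brace, returns_value):
      # Gut the body, keeping a placeholder return when one is needed
      # ("return 0;" is a stand-in; other return types need more care).
      body = "{ return 0; }" if returns_value else "{}"
      return src[:open_brace] + body + src[close_brace + 1:]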

  My current idea is to reuse the line-based logic.  In other words, we
  first try to delete all function bodies, then the first half of them,
  then the second half, then the first quarter, etc...
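  Concretely, the halving schedule might look like this sketch, where
  remove_and_test is a made-up hook that tries a removal and runs the
  interestingness test:

  # Reuse of the line-based halving logic, but over function bodies.
  # `bodies` is the list of candidate bodies, in file order.
  def reduce_bodies(bodies, remove_and_test):
      size = len(bodies)                     # start by trying all of them
      while size >= 1:
          pos = 0
          while pos < len(bodies):
              chunk = bodies[pos:pos + size]
              if remove_and_test(chunk):     # variant still interesting
                  del bodies[pos:pos + size]
              else:
                  pos += size                # keep this chunk, move on
          size //= 2                         # halves, quarters, ...
      return bodies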

  I think that if this is implemented wisely, a large speedup may be
  possible.  Does this seem reasonable?

The idea is fine; however, I'm not sure it's suitable for the initial part.
My concern is the speed of parsing. If the solution uses clang, it may take
a long time to parse a large file and find the function bodies on each
iteration, in contrast to the "dumb" delta passes relying on topformflat.

I have another proposal.

In most cases with real C++ code (not generated), the most generic
code from the C and C++ standard libraries sits in the top part of the
translation unit, and the most specific code near the bottom. I've done
several reductions the following way:

1. Split the preprocessed source into a large "header" and a small "source".

I believe this is a place where "a little brain time can save a lot of
CPU time" (tm), but I think in practice it can be automated, e.g. by
splitting off the bottom 1/5 of the file or the bottom 200K (whichever
is larger) as the "source".

2. Make a precompiled header from this "header" (if several compilers are
involved, each one requires its own copy of the header and PCH - tedious to
do by hand).

3. Reduce the small "source" part.

4. See if the source still depends on the header; if not, we are done.

5. If the header is still large, split it into two parts again, make a new
source and header, and repeat 1-4.

6. cat the two parts together and reduce the result.

I see that some intelligence is needed to achieve the exit in step (4);
however, line-based reductions go much faster when there is more unused
library code. See the sketch below for how the whole loop might fit together.
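Roughly, in Python, using clang's standard PCH flags; reduce() and
still_depends_on() are placeholders for the real reducer and dependency
check, and the 200K threshold in step 5 is the same guess as in step 1:

  import subprocess

  def split_bottom(text):
      # Step 1: keep the bottom 1/5 or 200K (whichever is larger) as the
      # "source"; everything above the cut becomes the "header".
      cut = max(len(text) - max(len(text) // 5, 200 * 1024), 0)
      return text[:cut], text[cut:]

  def reduce(text):
      return text        # placeholder: run the actual delta passes here

  def still_depends_on(source, header):
      return True        # placeholder: e.g. try compiling source alone

  def reduce_with_pch(text):
      header, source = split_bottom(text)
      while header:
          with open("part.h", "w") as f:
              f.write(header)
          subprocess.check_call(                      # step 2: build PCH
              ["clang", "-x", "c++-header", "part.h",
               "-o", "part.h.pch"])
          source = reduce(source)                     # step 3
          if not still_depends_on(source, "part.h"):  # step 4: done
              return source
          if len(header) > 200 * 1024:                # step 5: resplit
              header, bottom = split_bottom(header)
              source = bottom + source
          else:
              break
      return reduce(header + source)                  # step 6: cat+reduce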

