Re: Feature branch launched: deheader

2010-12-06 Thread Henrik Nordström
sön 2010-12-05 klockan 15:28 +0100 skrev Kinkie:
 Hi all,
   Eric Raymond recently released a tool named deheader
 (http://www.catb.org/esr/deheader/) which goes through c/c++ project
 looking for unneeded includes.
 It does so by trying to compile each source file after removing one
 #include statement at a time, and seeing if it builds.

Which is interesting, but a lot of filtering is needed to make use of
the results.

Many times OS headers is quite forgiving, allowing you to skip many
includes which is documented as required for a certain function. For
example socket(2) officially requires sys/types.h and sys/socket.h, but
many OS:es accept if you just include sys/socket.h.

And we do have some magics in our own sources where stripping an Squid
include would make that file still compile but result in missing
functionality. Most times this will result in link failure however as
long as config.h is always included one way or another.

I would use the tool very cautious. It's interesting, but the results
can not automatically be taken as a truth, only a hint.

Personally I don't have a big problem with a bit too many includes. 

Regards
Henrik



Re: Feature branch launched: deheader

2010-12-06 Thread Kinkie
Hi,
 Henrik:
 Which is interesting, but a lot of filtering is needed to make use of
 the results.

I agree. Hence the branch and associated matrix test.

 I would use the tool very cautious. It's interesting, but the results
 can not automatically be taken as a truth, only a hint.

That's more or less what I'm attempting.

 Personally I don't have a big problem with a bit too many includes.

Ignoring the compile time, whose relevance is debateable, they do
raise the complexity of unit tests. Some unit tests require way too
many stubs, as a result of improper layerization in the core, or
because of too many includes nested somewhere in the dependency tree.

In the tool output I am running, I explicitly targeted squid.h; not in
one case changing it to config.h resulted in a broken test-build.


-- 
    /kinkie


Feature branch launched: deheader

2010-12-05 Thread Kinkie
Hi all,
  Eric Raymond recently released a tool named deheader
(http://www.catb.org/esr/deheader/) which goes through c/c++ project
looking for unneeded includes.
It does so by trying to compile each source file after removing one
#include statement at a time, and seeing if it builds.
I tried it on squid, and the result was unexpected: it seems that the
c++ sources for squid contain up to 915 unneeded #include directives
(on an Ubuntu box, with all build options enabled).

The estimate is of course not realistic, as it can't account for other
platforms or feature-includes, but it is still worth a check IMVHO.
I've opened a feature-branch for the purpose at
lp:~kinkie/squid/deheader with a first round of de-headerization.

Thoughts? Comments?

-- 
    /kinkie


Re: Feature branch launched: deheader

2010-12-05 Thread Robert Collins
On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote:
 Hi all,
  Eric Raymond recently released a tool named deheader
 (http://www.catb.org/esr/deheader/) which goes through c/c++ project
 looking for unneeded includes.
 It does so by trying to compile each source file after removing one
 #include statement at a time, and seeing if it builds.

Thats a fairly flawed approach.

Two reasons:
 - some headers when present affect correctness, not compilation.
 - on some platforms headers are mandatory, on others optional.

So, if you do this, be sure to take the minimal approach after
building on *all* platforms, and then still be conservative.

-Rob


Re: Feature branch launched: deheader

2010-12-05 Thread Kinkie
On Sun, Dec 5, 2010 at 7:48 PM, Robert Collins
robe...@robertcollins.net wrote:
 On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote:
 Hi all,
  Eric Raymond recently released a tool named deheader
 (http://www.catb.org/esr/deheader/) which goes through c/c++ project
 looking for unneeded includes.
 It does so by trying to compile each source file after removing one
 #include statement at a time, and seeing if it builds.

 Thats a fairly flawed approach.

 Two reasons:
  - some headers when present affect correctness, not compilation.
  - on some platforms headers are mandatory, on others optional.

 So, if you do this, be sure to take the minimal approach after
 building on *all* platforms, and then still be conservative.

I agree
To address those issues I'm focusing on squid internal headers, and in
most cases only turning includes for squid.h into includes for
config.h. I'm relying on a matrix build to track system dependencies,
and on unit tests to catch correctness issues. On the plus side, this
effort involves massive amounts of cpu power and elapsed-time but not
as much brainpower, so if it all is for nothing, it won't be a big
waste.

-- 
    /kinkie


Re: Feature branch launched: deheader

2010-12-05 Thread Amos Jeffries
On Mon, 6 Dec 2010 07:48:48 +1300, Robert Collins
robe...@robertcollins.net wrote:
 On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote:
 Hi all,
  Eric Raymond recently released a tool named deheader
 (http://www.catb.org/esr/deheader/) which goes through c/c++ project
 looking for unneeded includes.
 It does so by trying to compile each source file after removing one
 #include statement at a time, and seeing if it builds.
 
 Thats a fairly flawed approach.
 
 Two reasons:
  - some headers when present affect correctness, not compilation.
  - on some platforms headers are mandatory, on others optional.
 
 So, if you do this, be sure to take the minimal approach after
 building on *all* platforms, and then still be conservative.
 
 -Rob


Aye, to avoid those two problems, DO NOT make any changes in the compat/
folder indicated by deheader.
Where an OS has mandatory headers the include likely should be moved to
the OS area of compat/, common sense applies of course when the include is
component specific. Comments in the code next to those includes is required
regardless of where they are.

If a header is included through the general/shared area of compat the src/
copy may be removable. Otherwise duplicates are required. Note that compat
may/should #undef the HAVE_* for files included there to save build time.



On the whole I would take a slightly different approach. Using deheader in
a multiple-pass approach:

 pass 0: manually check the headers adding class pre-defines wherever
possible but missing. This will fix a lot of false-negatives the deheader
approach WILL hit in the squid sources.

 pass 1: to process the testHeaders pseudo-apps and remove the
sub-includes which are not actually needing to be pulled in via the .h.
This will highlight many bugs in the real .c/.cc which fail to include
their dependency .h directly.

 pass 2: running it no the whole project to see how many of the .c/.cc
includes are redundant. Using this as a rough guide to problems, noting
that there are likely a great many false-positives for the reasons Rob
mentions in addition to component dependency permutations.

As the least it will be a way to improve the code documentation about why
certain headers are included in strange places.


I'm wondering if we should collate the reports from testing tools like
this for manual inspection somewhere. I have cppcheck and valgrind outputs
as well that anyone could take up as a minor task. I suspect Alex has more
of those or from other tools as well.

Amos



Re: Feature branch launched: deheader

2010-12-05 Thread Alex Rousskov

On 12/05/2010 07:28 AM, Kinkie wrote:

Hi all,
   Eric Raymond recently released a tool named deheader
(http://www.catb.org/esr/deheader/) which goes through c/c++ project
looking for unneeded includes.
It does so by trying to compile each source file after removing one
#include statement at a time, and seeing if it builds.
I tried it on squid, and the result was unexpected: it seems that the
c++ sources for squid contain up to 915 unneeded #include directives
(on an Ubuntu box, with all build options enabled).

The estimate is of course not realistic, as it can't account for other
platforms or feature-includes, but it is still worth a check IMVHO.
I've opened a feature-branch for the purpose at
lp:~kinkie/squid/deheader with a first round of de-headerization.

Thoughts? Comments?


Removing unneeded includes can reduce build time so this can be a useful 
project, provided we do not end up spending more time fixing portability 
bugs caused by header removal.


The results of this beheading should not be committed until the changes 
are at least tested on Hudson, IMO.


Thank you,

Alex.



Re: Feature branch launched: deheader

2010-12-05 Thread Amos Jeffries
On Sun, 5 Dec 2010 15:28:00 +0100, Kinkie gkin...@gmail.com wrote:
 Hi all,
   Eric Raymond recently released a tool named deheader
 (http://www.catb.org/esr/deheader/) which goes through c/c++ project
 looking for unneeded includes.
 It does so by trying to compile each source file after removing one
 #include statement at a time, and seeing if it builds.
 I tried it on squid, and the result was unexpected: it seems that the
 c++ sources for squid contain up to 915 unneeded #include directives
 (on an Ubuntu box, with all build options enabled).
 
 The estimate is of course not realistic, as it can't account for other
 platforms or feature-includes, but it is still worth a check IMVHO.
 I've opened a feature-branch for the purpose at
 lp:~kinkie/squid/deheader with a first round of de-headerization.
 
 Thoughts? Comments?

Auditing your first batch commit...

 Disconnecting .cc from squid.h is fine under the current content of that
file. However when it is removed config.h always needs to be added in its
place. You will get false success results if they are both missing.
Particularly in the optional code as they are the only source of USE_* and
HAVE_* definition.

Amos



Re: Feature branch launched: deheader

2010-12-05 Thread Alex Rousskov

On 12/05/2010 07:59 PM, Amos Jeffries wrote:

On Sun, 5 Dec 2010 15:28:00 +0100, Kinkiegkin...@gmail.com  wrote:

Hi all,
   Eric Raymond recently released a tool named deheader
(http://www.catb.org/esr/deheader/) which goes through c/c++ project
looking for unneeded includes.
It does so by trying to compile each source file after removing one
#include statement at a time, and seeing if it builds.
I tried it on squid, and the result was unexpected: it seems that the
c++ sources for squid contain up to 915 unneeded #include directives
(on an Ubuntu box, with all build options enabled).

The estimate is of course not realistic, as it can't account for other
platforms or feature-includes, but it is still worth a check IMVHO.
I've opened a feature-branch for the purpose at
lp:~kinkie/squid/deheader with a first round of de-headerization.

Thoughts? Comments?


Auditing your first batch commit...

  Disconnecting .cc from squid.h is fine under the current content of that
file. However when it is removed config.h always needs to be added in its
place. You will get false success results if they are both missing.
Particularly in the optional code as they are the only source of USE_* and
HAVE_* definition.


Perhaps there is a way to tell deheader to assume that squid.h and/or 
config.h should always be there? We can fix them manually if needed.


Alex.