Re: Feature branch launched: deheader
sön 2010-12-05 klockan 15:28 +0100 skrev Kinkie: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. Which is interesting, but a lot of filtering is needed to make use of the results. Many times OS headers is quite forgiving, allowing you to skip many includes which is documented as required for a certain function. For example socket(2) officially requires sys/types.h and sys/socket.h, but many OS:es accept if you just include sys/socket.h. And we do have some magics in our own sources where stripping an Squid include would make that file still compile but result in missing functionality. Most times this will result in link failure however as long as config.h is always included one way or another. I would use the tool very cautious. It's interesting, but the results can not automatically be taken as a truth, only a hint. Personally I don't have a big problem with a bit too many includes. Regards Henrik
Re: Feature branch launched: deheader
Hi, Henrik: Which is interesting, but a lot of filtering is needed to make use of the results. I agree. Hence the branch and associated matrix test. I would use the tool very cautious. It's interesting, but the results can not automatically be taken as a truth, only a hint. That's more or less what I'm attempting. Personally I don't have a big problem with a bit too many includes. Ignoring the compile time, whose relevance is debateable, they do raise the complexity of unit tests. Some unit tests require way too many stubs, as a result of improper layerization in the core, or because of too many includes nested somewhere in the dependency tree. In the tool output I am running, I explicitly targeted squid.h; not in one case changing it to config.h resulted in a broken test-build. -- /kinkie
Re: Feature branch launched: deheader
On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. Thats a fairly flawed approach. Two reasons: - some headers when present affect correctness, not compilation. - on some platforms headers are mandatory, on others optional. So, if you do this, be sure to take the minimal approach after building on *all* platforms, and then still be conservative. -Rob
Re: Feature branch launched: deheader
On Sun, Dec 5, 2010 at 7:48 PM, Robert Collins robe...@robertcollins.net wrote: On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. Thats a fairly flawed approach. Two reasons: - some headers when present affect correctness, not compilation. - on some platforms headers are mandatory, on others optional. So, if you do this, be sure to take the minimal approach after building on *all* platforms, and then still be conservative. I agree To address those issues I'm focusing on squid internal headers, and in most cases only turning includes for squid.h into includes for config.h. I'm relying on a matrix build to track system dependencies, and on unit tests to catch correctness issues. On the plus side, this effort involves massive amounts of cpu power and elapsed-time but not as much brainpower, so if it all is for nothing, it won't be a big waste. -- /kinkie
Re: Feature branch launched: deheader
On Mon, 6 Dec 2010 07:48:48 +1300, Robert Collins robe...@robertcollins.net wrote: On Mon, Dec 6, 2010 at 3:28 AM, Kinkie gkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. Thats a fairly flawed approach. Two reasons: - some headers when present affect correctness, not compilation. - on some platforms headers are mandatory, on others optional. So, if you do this, be sure to take the minimal approach after building on *all* platforms, and then still be conservative. -Rob Aye, to avoid those two problems, DO NOT make any changes in the compat/ folder indicated by deheader. Where an OS has mandatory headers the include likely should be moved to the OS area of compat/, common sense applies of course when the include is component specific. Comments in the code next to those includes is required regardless of where they are. If a header is included through the general/shared area of compat the src/ copy may be removable. Otherwise duplicates are required. Note that compat may/should #undef the HAVE_* for files included there to save build time. On the whole I would take a slightly different approach. Using deheader in a multiple-pass approach: pass 0: manually check the headers adding class pre-defines wherever possible but missing. This will fix a lot of false-negatives the deheader approach WILL hit in the squid sources. pass 1: to process the testHeaders pseudo-apps and remove the sub-includes which are not actually needing to be pulled in via the .h. This will highlight many bugs in the real .c/.cc which fail to include their dependency .h directly. pass 2: running it no the whole project to see how many of the .c/.cc includes are redundant. Using this as a rough guide to problems, noting that there are likely a great many false-positives for the reasons Rob mentions in addition to component dependency permutations. As the least it will be a way to improve the code documentation about why certain headers are included in strange places. I'm wondering if we should collate the reports from testing tools like this for manual inspection somewhere. I have cppcheck and valgrind outputs as well that anyone could take up as a minor task. I suspect Alex has more of those or from other tools as well. Amos
Re: Feature branch launched: deheader
On 12/05/2010 07:28 AM, Kinkie wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. I tried it on squid, and the result was unexpected: it seems that the c++ sources for squid contain up to 915 unneeded #include directives (on an Ubuntu box, with all build options enabled). The estimate is of course not realistic, as it can't account for other platforms or feature-includes, but it is still worth a check IMVHO. I've opened a feature-branch for the purpose at lp:~kinkie/squid/deheader with a first round of de-headerization. Thoughts? Comments? Removing unneeded includes can reduce build time so this can be a useful project, provided we do not end up spending more time fixing portability bugs caused by header removal. The results of this beheading should not be committed until the changes are at least tested on Hudson, IMO. Thank you, Alex.
Re: Feature branch launched: deheader
On Sun, 5 Dec 2010 15:28:00 +0100, Kinkie gkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. I tried it on squid, and the result was unexpected: it seems that the c++ sources for squid contain up to 915 unneeded #include directives (on an Ubuntu box, with all build options enabled). The estimate is of course not realistic, as it can't account for other platforms or feature-includes, but it is still worth a check IMVHO. I've opened a feature-branch for the purpose at lp:~kinkie/squid/deheader with a first round of de-headerization. Thoughts? Comments? Auditing your first batch commit... Disconnecting .cc from squid.h is fine under the current content of that file. However when it is removed config.h always needs to be added in its place. You will get false success results if they are both missing. Particularly in the optional code as they are the only source of USE_* and HAVE_* definition. Amos
Re: Feature branch launched: deheader
On 12/05/2010 07:59 PM, Amos Jeffries wrote: On Sun, 5 Dec 2010 15:28:00 +0100, Kinkiegkin...@gmail.com wrote: Hi all, Eric Raymond recently released a tool named deheader (http://www.catb.org/esr/deheader/) which goes through c/c++ project looking for unneeded includes. It does so by trying to compile each source file after removing one #include statement at a time, and seeing if it builds. I tried it on squid, and the result was unexpected: it seems that the c++ sources for squid contain up to 915 unneeded #include directives (on an Ubuntu box, with all build options enabled). The estimate is of course not realistic, as it can't account for other platforms or feature-includes, but it is still worth a check IMVHO. I've opened a feature-branch for the purpose at lp:~kinkie/squid/deheader with a first round of de-headerization. Thoughts? Comments? Auditing your first batch commit... Disconnecting .cc from squid.h is fine under the current content of that file. However when it is removed config.h always needs to be added in its place. You will get false success results if they are both missing. Particularly in the optional code as they are the only source of USE_* and HAVE_* definition. Perhaps there is a way to tell deheader to assume that squid.h and/or config.h should always be there? We can fix them manually if needed. Alex.