Re: Unused #include statements
Robert, Peff and Junio. Thank you all for your feedback. It's clear now what sort of analysis I should aim towards. Thanks, Zoltan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Unused #include statements
Jeff King p...@peff.net writes: One of our rules is that git-compat-util.h (or one of the well-known headers which includes, cache.h or builtin.h) is included first in any translation unit. Perhaps by now we can spell it out a bit more explicitly and update CodingGuidelines. We have: - The first #include in C files, except in platform specific compat/ implementations, should be git-compat-util.h or another header file that includes it, such as cache.h or builtin.h. such as might be making things unnecessarily vague; I do not think a valid reason why we should say a .c file that includes advice.h as the first thing satisfies this requirement, for example. Because: - A command that interacts with the object store, config subsystem, the index, or the working tree cannot do anything without using what is declared in cache.h. - A built-in command must be declared in builtin.h, so anything in builtin/*.c must include it. it may be reasonable to say the first *.h file included must be one of git-compat-util.h, cache.h or builtin.h (and then we make sure that compat-util is the first thing included in either of the latter two). While I very much agree with the principle Robert alluded to [*1*], we may want to loosen that for .c files that include builtin.h or cache.h for the sake of brevity. For example, if you are builtin (hence you start by #include builtin.h), it may be reasonable to allow you to take whatever is in cache.h for granted [*2*]. So the rule might be: - The first #include in C files, except in platform specific compat/ implementations, must be either git-compat-util.h, cache.h or builtin.h. - A C file must directly include the header files that declare the functions and the types it uses, except for the functions and types that are made available to it by including one of the header files it must include by the previous rule. Optionally, - A C file must include only one of git-compat-util.h, cache.h or builtin.h; e.g. if you include builtin.h, do not include the other two, but it can consider what is availble in cache.h available to it. Thoughts? I am not looking forward to a torrent of patches whose sole purpose is to make the existing C files conform to any such rule, though. Clean-up patches that trickle in at a low rate is tolerable, but a torrent is too distracting. [Footnote] *1* For example, even though diff.h may include tree-walk.h for its own use, a .c file that includes diff.h without including tree-walk.h that uses update_tree_entry() or anything that is declared in the latter is very iffy from semantic point of view. *2* Because that facility is so widely used inside the codebase, builtin.h includes strbuf.h, so in addition to what are in cache.h, we may want to allow builtin implementations to take strbufs for granted as well. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Unused #include statements
On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote: So the rule might be: - The first #include in C files, except in platform specific compat/ implementations, must be either git-compat-util.h, cache.h or builtin.h. - A C file must directly include the header files that declare the functions and the types it uses, except for the functions and types that are made available to it by including one of the header files it must include by the previous rule. Yeah, that makes sense (and is what I took away from the existing rule in CodingGuidelines, but I agree what is there is not very rigorous). Optionally, - A C file must include only one of git-compat-util.h, cache.h or builtin.h; e.g. if you include builtin.h, do not include the other two, but it can consider what is availble in cache.h available to it. Thoughts? I am not looking forward to a torrent of patches whose sole purpose is to make the existing C files conform to any such rule, though. Clean-up patches that trickle in at a low rate is tolerable, but a torrent is too distracting. I don't think the optionally one above is that necessary. Not because I don't agree with it, but because I do not know that we want to get into the business of laying out every minute detail and implication. The CodingGuidelines document is meant to be guidelines, and I do not want to see arguments like well, the guidelines do not explicitly _disallow_ this, so you must accept it or add something to the guideline. That is a waste of everybody's time. A general philosophy + good taste (from the submitter and the maintainer) should ideally be enough. And hopefully would stop a torrent of but this file doesn't conform to the letter of CodingGuidelines!. Maybe it does not, but if there is no tangible benefit besides blindly following some rules, it is not worth the precious time of developers. Which isn't to say we shouldn't clarify the document when need be. But I think what I quoted at the top already is probably a good improvement over what is there. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Unused #include statements
Jeff King p...@peff.net writes: On Thu, Jan 15, 2015 at 10:50:45AM -0800, Junio C Hamano wrote: ... Thoughts? I am not looking forward to a torrent of patches whose sole purpose is to make the existing C files conform to any such rule, though. Clean-up patches that trickle in at a low rate is tolerable, but a torrent is too distracting. I don't think the optionally one above is that necessary. Not because I don't agree with it, but because I do not know that we want to get into the business of laying out every minute detail and implication. The CodingGuidelines document is meant to be guidelines, and I do not want to see arguments like well, the guidelines do not explicitly _disallow_ this, so you must accept it or add something to the guideline. That is a waste of everybody's time. Totally. I know we do not want to get into that business. A general philosophy + good taste (from the submitter and the maintainer) should ideally be enough. Yes, ideally ;-) Which isn't to say we shouldn't clarify the document when need be. But I think what I quoted at the top already is probably a good improvement over what is there. OK, thanks. Let's queue something like this for post 2.3 cycle, then. -- 8 -- Subject: CodingGuidelines: clarify C #include rules Even though advice.h includes git-compat-util.h, it is not sensible to have it as the first #include and indirectly satisify the You must give git-compat-util.h a clean environment to set up feature test macros before including any of the system headers are included, which is the real requirement. Because: - A command that interacts with the object store, config subsystem, the index, or the working tree cannot do anything without using what is declared in cache.h; - A built-in command must be declared in builtin.h, so anything in builtin/*.c must include it; - These two headers both include git-compat-util.h as the first thing; and - Almost all our *.c files (outside compat/ and borrowed files in xdiff/) need some Git-ness from cache.h to do something Git-ish. let's explicitly specify that one of these three header files must be the first thing that is included. Any of our *.c file should include the header file that directly declares what it uses, instead of relying on the fact that some *.h file it includes happens to include another *.h file that declares the necessary function or type. Spell it out as another guideline item. Helped-by: Jeff King p...@peff.net Signed-off-by: Junio C Hamano gits...@pobox.com --- Documentation/CodingGuidelines | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 894546d..578d07c 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -328,9 +328,14 @@ For C programs: - When you come up with an API, document it. - - The first #include in C files, except in platform specific - compat/ implementations, should be git-compat-util.h or another - header file that includes it, such as cache.h or builtin.h. + - The first #include in C files, except in platform specific compat/ + implementations, must be either git-compat-util.h, cache.h or + builtin.h. You do not have to include more than one of these. + + - A C file must directly include the header files that declare the + functions and the types it uses, except for the functions and types + that are made available to it by including one of the header files + it must include by the previous rule. - If you are planning a new command, consider writing it in shell or perl first, so that changes in semantics can be easily -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Unused #include statements
On Thu, Jan 15, 2015 at 03:20:09PM -0800, Junio C Hamano wrote: OK, thanks. Let's queue something like this for post 2.3 cycle, then. -- 8 -- Subject: CodingGuidelines: clarify C #include rules [...] Thanks, this looks good to me. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Unused #include statements
Hi Zoltan, I can't make a statement for the git project but I consider this kind of brute-force removal a very problematic approach for languages like C and C++. The reason for that is simple: Often header files include other header files since their content depends on those other header files. Let's assume a header file a.h includes b.h. Now consider a file c.c includes both a.h and b.h since they are actually using stuff from both of them. Your brute-force approach would now remove b.h since it is indirectly pulled in through a.h. While the removal would no lead to technially incorrect code it would no longer reflect the semantical situation (since c.c still uses stuff from b.h). Even worse is that if you later modify c.c to no longer use stuff from a.h you could no longer remove a.h since that way you would break the chain to b.h. Thus doing those kind of brute-force removals generally makes the include structure in a project very fragile. The analysis itself you did is still useful to identify header files that can potentially be removed but removing them without further analysis I would consider problematic. Robert -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Unused #include statements
Hello there, Since reading a post [1] about removing some unnecessary #include statements from a git C source file I've been intrigued to see how many more might be lurking in the code base. After a bit of digging around, my brute force approach of 'remove as many #includes as possible while making sure the code still successfully compiles' has returned the following results: $ git diff --stat alloc.c| 2 -- archive-zip.c | 1 - archive.c | 1 - argv-array.c | 1 - bisect.c | 9 - block-sha1/sha1.c | 2 -- branch.c | 1 - builtin/add.c | 7 --- builtin/annotate.c | 1 - builtin/apply.c| 8 builtin/archive.c | 3 --- builtin/bisect--helper.c | 2 -- builtin/blame.c| 13 - builtin/branch.c | 8 builtin/bundle.c | 1 - builtin/cat-file.c | 3 --- builtin/check-attr.c | 2 -- builtin/check-ignore.c | 2 -- builtin/check-mailmap.c| 2 -- builtin/check-ref-format.c | 2 -- builtin/checkout-index.c | 2 -- builtin/checkout.c | 11 --- builtin/clean.c| 3 --- builtin/clone.c| 10 -- builtin/column.c | 3 --- builtin/commit-tree.c | 4 builtin/commit.c | 13 - builtin/config.c | 1 - builtin/count-objects.c| 2 -- builtin/credential.c | 1 - builtin/describe.c | 5 - builtin/diff-files.c | 4 builtin/diff-index.c | 4 builtin/diff-tree.c| 4 builtin/diff.c | 6 -- builtin/fast-export.c | 8 builtin/fetch.c| 8 builtin/fmt-merge-msg.c| 6 -- builtin/for-each-ref.c | 6 -- builtin/fsck.c | 7 --- builtin/gc.c | 3 --- builtin/get-tar-commit-id.c| 3 --- builtin/grep.c | 6 -- builtin/hash-object.c | 2 -- builtin/help.c | 2 -- builtin/index-pack.c | 5 - builtin/init-db.c | 1 - builtin/interpret-trailers.c | 3 --- builtin/log.c | 14 -- builtin/ls-files.c | 3 --- builtin/ls-remote.c| 3 --- builtin/ls-tree.c | 3 --- builtin/mailinfo.c | 2 -- builtin/mailsplit.c| 3 --- builtin/merge-base.c | 5 - builtin/merge-file.c | 2 -- builtin/merge-index.c | 1 - builtin/merge-ours.c | 1 - builtin/merge-recursive.c | 3 --- builtin/merge-tree.c | 2 -- builtin/merge.c| 10 -- builtin/mktree.c | 2 -- builtin/mv.c | 4 builtin/name-rev.c | 3 --- builtin/notes.c| 4 builtin/pack-objects.c
Re: Unused #include statements
On Thu, Jan 15, 2015 at 05:14:39AM +0100, Robert Schiele wrote: Thus doing those kind of brute-force removals generally makes the include structure in a project very fragile. The analysis itself you did is still useful to identify header files that can potentially be removed but removing them without further analysis I would consider problematic. I would second that. Besides leading to a potentially fragile result, this analysis was done only for a particular platform with a particular set of config knobs. One of our rules is that git-compat-util.h (or one of the well-known headers which includes, cache.h or builtin.h) is included first in any translation unit. This gives git-compat-util the cleanest environment possible for making decisions, and lets macros it defines effect the rest of the code consistently. I suspect on modern platforms like Linux/glibc that it is not a huge deal to include git-compat-util a little late, simply because it does not have all that much to do. But on Solaris 8? Who knows. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html