Re: Newer gcc swallow version control keywords
On 10/18/2011 12:50 PM, Oleg Goldshmidt wrote:

> > I understood that, and it's still unstable. If some young team member who is not aware of the $Id: trick writes log("$Id: %d $Name: %s\n", id, name), ident will return garbage.
>
> Someone may alter keyword expansions just before the build, too.
>
> > If you have a documented way to get the ident strings, it's more stable.
>
> The whole point is to have these strings everywhere, otherwise they are not very useful. This means the whole development team is aware of and uses the convention.

I understand why these keywords were useful for CVS/RCS, where each file had its own version number. When working with SVN, however, a single number uniquely identifies the entire source tree. Why not have that one number and be done with it?

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 1:50 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

> > I didn't understand how, e.g., my C++ scheme doesn't work. I think it should work even if you're including the $Id$ strings in the header files.
>
> Apart from the fact that you assume that main.cc is mine (what if my product is a library?),

There's an important point here. If your product is an .so, all you need to do is expose the ident objects to the library users (of course, in this case the names of the variables must be unique per file version). This way, the linker is forced to leave those strings intact.

If your product is an .a archive, well, you're out of luck. The same method as for .so files will of course work for the .a archive, but the client who links against your code is free to use a whole-program optimizer, which might wipe out your strings. I'm not sure it's even possible to mark those strings as "always include" in the object file.

I really recommend this piece[1] by Meyers, who shows that fighting with the optimizer is doomed to misery.

[1] PDF: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
    Bottom of page 6, Ctrl-F "In essence, you’ve just fired the opening salvo"

> depends on a whole lot of things I wouldn't necessarily need for any other purpose, makes the strings global

Not too global. The global list of strings is exposed only in main() and in the implementation of the Ident class. If you hate global variables, make it a static variable of the Ident class.

> and mutable,

The list of strings is mutable, right? The strings can be immutable. Also, it is possible to use a push-only queue instead of a vector. But I'm nitpicking now ;-)

> and won't pick up, e.g., the case of wrong header I mentioned before (I checked)?

I don't understand why that is. See https://github.com/elazarl/ident-for-cpp for an implementation which Works For Me (TM), with no non-static global variables, and which works for header files.

> This is a good idea in general, but it's not really an improvement. The trusted old scheme has all the needed properties and if gcc had an option to disable this particular kind of optimization selectively I wouldn't have a problem.

If I understood correctly, the problem was how to port the GCC optimization turn-off to other compilers without macro hackery. I believe this is a good portable way. I'm trying to find a sane way to do that in portable C, but I'm not sure it's possible without too much effort.

That said, the macro + gcc attribute seems the most reasonable approach, given that VS has a way to disable this optimization. However, it spoils all the fun.

> --
> Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
Shachar Shemesh shac...@shemesh.biz writes:

> I understand why these keywords were useful for CVS/RCS, where each file had its own version number. When working with SVN, however, a single number uniquely identifies the entire source tree. Why not have that one number and get it done with?

I believe it has been asked and answered, Your Honour ;-). Having a single number is enough only under a whole bunch of assumptions: the code must come from a single branch of a single repository, the build and release systems must be 100% reliable, no human mistake may occur on the way between developer workstation and customer deployment, etc., etc.

The same argument is employed by the gcc team, who say - either in the bug trail Nadav quoted or in the mailing list thread Ghiora (I think... sorry if I am mixing it up) found - that it is enough for their purposes so it must be enough for everybody. I say that with all the complexity of gcc their case is conceptually simple (a single piece of software from a single repository) and their build/release process is probably better than most.

I assume that I do not have to explain to everybody that the difficulties inherent in changing processes in an organization are significant, that legacy structures and processes abound, and that every device that helps flush out problems early, quickly, and with minimal pain is worth deploying.

So, my conclusion is that gcc at present has a problem. By applying an optimization that cannot be turned off it makes an assumption that it is smarter than the user[1] (isn't this a complaint typically aimed at, say, Microsoft Office? hmphhh...), and an important use case that is disabled by said optimization is dismissed, in part, on the grounds that the gcc team didn't need it in gcc code.

This last paragraph is, of course, me venting. At people who are in no way to blame. That was easy, and helpful - thanks everybody!

[1] "Hey, we see that this variable cannot possibly be used by any program of which it is a part, so we'll assume you are either stupid or careless and we'll just erase it without trace." Well, I am not stupid, I intended this variable to be used by another program to which this program is but an input, and you know what: it's none of the compiler's business to decide that I cannot possibly want that.

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Mon, Oct 17, 2011, Oleg Goldshmidt wrote about "Re: Newer gcc swallow version control keywords":

> > In any case, because there was always a fear that the compiler might optimize these out, someone invented a new directive, #ident, as in:
> >
> >     #ident "$Id$"
>
> This has always been there, but it has never been standard, AFAIK. It

"Always" is always a relative term when we're talking about a programming language (C) invented 40 years ago. I'm pretty sure #ident has indeed been around for around 20 years.

About standardization, you may be right: I just checked and sadly I can't find #ident anywhere in the C99 standard. That being said, I wonder whether in practice, on actual computers of interest today, it will work. I just checked on my Fedora 15, gcc 4.6.1, and it works well. Sadly, all the dozen other Unix variants to which I used to have access have gone the way of the dodo, so I can't check this on any other C compiler.

> is not a GCC extension, either. Most preprocessors don't barf on directives they do not understand, but they may simply ignore #ident which will lead to the same behaviour that I do not want.

They *may* ignore #ident, but like I said, #ident has been around for ages, and may actually work on many if not all compilers. Like I said, it does on gcc. I checked, and it seems it generates assembly that looks like:

    .ident "$Id: hello $"

which I guess GNU as supports, seeing that the ident string appears in the object file and in the final executable - even if compiled with -O2.

> When you say that it has worked for you with all sorts of compilers do you mean that it actually produced the $Id$ string with ident or that it didn't break?

Whenever I checked, it worked in the sense of generating code. But I was never a big fan of what(1) and ident(1) like you, so I haven't checked this in ages.

> I was wondering whether someone would catch my little mischief: this bug report complains about the same effect in a slightly different situation. Even with older GCC the $Id$ string is not there with just -O2, but it is with -g -O2 (cf. my original posting - I put the exact command line there for a reason). Yes, I agree with the opinion that -fkeep-static-consts should override optimization (it is more specific), but it has never been the case (or at least not for a very long time).

I suggest you add your comments to that bug report, to make it known that people still care. Apparently, you're the last user (!?) of ident(1), so make your voice heard :-)

--
Nadav Har'El                        | Tuesday, Oct 18 2011,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 | Unix is user friendly - it's just picky
http://nadav.harel.org.il           | about its friends.
Re: Newer gcc swallow version control keywords
I guess that it doesn't apply to libraries, which must include this global variable.

Excuse the idiotic solution, but can't you just add an option to print it out?

    int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0) puts(ident); ... }

The only trick the optimizer can play here is inlining puts and the input string into assembly instructions, but I really don't think it can happen in practice.

It seems like a good idea anyhow to have a stable way of extracting this ident string from the executable (what happens, for instance, if by accident you have rcsident and rcs_ident? How would you know from the stripped executable which one to trust?).

On Mon, Oct 17, 2011 at 6:29 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

> Hi,
>
> I have a gcc-related question. The problematic platform is Fedora 15 with gcc 4.6.1, as well as Fedora 14 with gcc 4.5.1.
>
> I am used to keeping RCS/CVS/SVN keywords (e.g., $Id$) in all my code. In the case of C/C++ this normally amounts to
>
>     static const char foo_src_id[] = "$Id$";
>
> in the source code. For those who have never encountered it: SVN and friends happily substitute the relevant information for $Id$, and the information can then be retrieved using ident(1), even from binaries, libraries, and such - incredibly useful.
>
> Now, put the above line in a C or C++ file, say foo.cc, and do the following:
>
>     $ g++ -g -O2 foo.cc -c -o foo.o
>     $ ident foo.o
>     foo.o:
>          $Id: foo.cc 673 2011-10-17 09:48:11Z oleg $
>
> This works up to and including gcc 4.4, but it does not work with gcc 4.5.1 or gcc 4.6.1 (the ident part does not show any keywords). The reason seems to be that the optimizer realizes that the static const is not used and eliminates it (remove -O2 and ident works fine).
>
> I can fool gcc by
>
>     static const char foo_src_id[] __attribute__((used)) = "$Id$";
>
> but this is non-portable, and preprocessor acrobatics such as
>
>     #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
>     #define USED(x) x __attribute__((used))
>     #else
>     #define USED(x) x
>     #endif
>     static const char USED(foo_src_id[]) = "$Id$";
>
> are incredibly ugly and verbose, IMHO (even with #ifdef __GNUC__ only). So is any contortion to use the constant somewhere in each file.
>
> Does anyone know of a gcc option that will keep unused static constants even with optimization, or any other way to keep the keywords in a portable way? I tried different things that came to mind (such as -fkeep-static-consts) but they did not work, e.g., -fkeep-static-consts emits static consts only when optimization is not on.
>
> A bonus question: can anyone with Debian/Ubuntu/OpenSuSE/whatever check whether it works? Whether it is RedHat-specific or generic to all GCCs is important.
>
> --
> Oleg Goldshmidt | p...@goldshmidt.org
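A minimal sketch of the attribute-behind-a-macro approach from the original posting, written so that the version test keeps matching on later major releases. The macro names GCC_NEWER_THAN and KEEP_IN_OBJECT are hypothetical, chosen here for illustration; only __attribute__((used)), __GNUC__, and __GNUC_MINOR__ come from the thread. A test that requires the major version to be at least 4 and, separately, the minor version to exceed 4 stops matching at a release numbered 5.0, so the comparison below treats major and minor as one ordered pair.

```cpp
// Hypothetical helper: nonzero when compiling with GCC newer than major.minor.
#if defined(__GNUC__)
#  define GCC_NEWER_THAN(major, minor) \
      (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ > (minor)))
#else
#  define GCC_NEWER_THAN(major, minor) 0
#endif

// Hypothetical macro name: expands to the GCC attribute where it is needed
// (compilers newer than gcc 4.4, per the posting) and to nothing elsewhere.
#if GCC_NEWER_THAN(4, 4)
#  define KEEP_IN_OBJECT __attribute__((used))
#else
#  define KEEP_IN_OBJECT
#endif

// The expanded keyword string is the one shown in the posting's ident output.
static const char foo_src_id[] KEEP_IN_OBJECT =
    "$Id: foo.cc 673 2011-10-17 09:48:11Z oleg $";
```

This does not remove the portability objection raised in the posting: on compilers that do not define __GNUC__ the macro expands to nothing, and whether the string survives is up to that compiler.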
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011, Elazar Leibovich wrote about "Re: Newer gcc swallow version control keywords":

> Excuse the idiotic solution, but can't you just add an option to print it out?
>
>     int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0) puts(ident); ... }

The point of Oleg's trick (which is not Oleg's invention - it has been in use for decades) is to be able to tell not just a single version number for the executable (your --ident is basically the same as the --version supported by many programs), but rather the exact version of each and every source code file which participated in creating this executable. You run ident on the executable, and get a list of dozens (or thousands) of source file names and their exact version numbers and dates.

This could be useful if you have a lot of versions of your code running around, and you don't remember how exactly each executable was generated.

This trick was particularly useful in the days of SCCS and RCS, which work separately on each file, so each file has a separate version number. One reason that nobody really cares about this trick any more is that it has become MUCH LESS IMPORTANT on modern version control systems, e.g., Subversion or Git, where there is a single version number (or version hash) for the entire project, not one per file. So if you just include this single version number in your main() - as in your example - it is enough to find exactly which version of each source file went into this executable.

If your project's result doesn't have a main() (e.g., it is a library), just find another way to stick this version number in - e.g., provide a function or global variable (which won't be optimized out) containing this version number - and put it in the same object file as your library's most important function (so it always gets included by users of your library).

> It seems like a good idea anyhow, to have a stable way of extracting this ident string from the executable (what happens for instance if by accident you have rcsident and rcs_ident? How would you know from the stripped executable which one to trust?).

The idea with ident(1) is that the content of the string, not the name of the variable holding it, has a recognizable structure. In particular, it looks like this: "$Id: something $" - with the word "something" replaced by anything you wish. ident(1) looks for such strings in any file you give it - whether it is an executable, source code, or whatever - and prints them.

what(1), SCCS's precursor of RCS's ident(1), used a different string format for the same purpose - it looked like "@(#) something" (with the string typically ending with a null). Nobody in their right mind still uses SCCS today, but what(1) still works, if you wish to use strings of this format in your executables to mark them.

Nadav.

--
Nadav Har'El                        | Tuesday, Oct 18 2011,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 | Windows-2000/Professional isn't.
http://nadav.harel.org.il           |
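To make the "recognizable structure" point concrete, here is a rough sketch of a scanner for strings shaped like "$Keyword: text $" in an arbitrary byte buffer. It is a simplified approximation written for illustration, not the actual algorithm of ident(1); the function name find_ident_strings and its exact matching rules (letters-only keyword, non-empty text, closing " $" on the same line) are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Simplified approximation of the marker shape described above:
//   '$' keyword ':' ' ' text ' ' '$'
// The real ident(1) has its own rules; this only illustrates the idea that
// the content of the string, not the variable name, is what gets matched.
std::vector<std::string> find_ident_strings(const std::string& data) {
    std::vector<std::string> found;
    std::size_t pos = 0;
    while ((pos = data.find('$', pos)) != std::string::npos) {
        // Keyword: one or more letters directly after the opening '$'.
        std::size_t i = pos + 1;
        while (i < data.size() &&
               std::isalpha(static_cast<unsigned char>(data[i]))) {
            ++i;
        }
        const bool has_keyword = i > pos + 1;
        if (has_keyword && i + 1 < data.size() &&
            data[i] == ':' && data[i + 1] == ' ') {
            // Look for the closing '$' before the line or the C string ends.
            std::size_t end = i + 2;
            while (end < data.size() && data[end] != '$' &&
                   data[end] != '\n' && data[end] != '\0') {
                ++end;
            }
            // Require non-empty text followed by a space and the closing '$'.
            if (end < data.size() && data[end] == '$' &&
                end > i + 2 && data[end - 1] == ' ') {
                found.push_back(data.substr(pos, end - pos + 1));
                pos = end + 1;
                continue;
            }
        }
        ++pos;
    }
    return found;
}
```

Because only the byte pattern matters, the same function can be pointed at the contents of an object file, an executable, or a source file read into a std::string. An unexpanded "$Id$" is not reported by this sketch, since it has no ": text " part.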
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 11:50 AM, Nadav Har'El n...@math.technion.ac.il wrote:

> On Tue, Oct 18, 2011, Elazar Leibovich wrote about "Re: Newer gcc swallow version control keywords":
>
> > Excuse the idiotic solution, but can't you just add an option to print it out?
> >
> >     int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0) puts(ident); ... }
>
> The point in Oleg's trick (which is not Oleg's invention - it has been in use for decades) is to be able to tell not just a single version number for the executable (your --ident is basically the same as --version supported by many programs), but rather to be able to tell the exact version of each and every source code file which participated in creating this executable. You run ident on the executable, and get a list of dozens (or thousands) of source file names and their exact version numbers and dates.
>
> This could be useful if you have a lot of versions of your code running around, and you don't remember how exactly each executable was generated.

I didn't understand that, but anyhow, the same idea could apply. For C++:

main.cc:

    vector<string> __files;
    int main(int argc, char **argv) {
        if (argc == 2 && string(argv[1]) == "--ident") {
            for (auto file : __files) cout << file << "\n";
            exit(0);
        }
    }

fileversion.h:

    class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };

foo.cc:

    static const FileVersion foo("$Id$");

With some macro trickery you might be able to get .init-like behaviour for C. (If you're willing to preprocess the C files with another utility, like you do to replace the $Id$ strings, it's actually not hard at all.)

This way, it's not ad hoc, but a documented, working way to get whatever you want into the executable.

(I perfectly agree with you that today it's less relevant, but I'm taking Oleg's stance for the sake of the discussion.)

> > It seems like a good idea anyhow, to have a stable way of extracting this ident string from the executable (what happens for instance if by accident you have rcsident and rcs_ident? How would you know from the stripped executable which one to trust?).
>
> The idea with ident(1) is that the content of the string, not the variable name holding it, has a recognizeable structure. In particular, it looks like this: "$Id: something $" - with the word "something" replaced by anything you wish. ident(1) looks for such strings in any file you give it - whether it is an executable, source code, or whatever, and prints them.

I understood that, and it's still unstable. If some young team member who is not aware of the $Id: trick writes

    log("$Id: %d $Name: %s\n", id, name)

ident will return garbage. If you have a documented way to get the ident strings, it's more stable.
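The self-registration scheme described in this message can be sketched as below. This is an illustrative variant under stated assumptions, not the code from the message or from the linked repository: the names version_registry, wants_ident, and ident_report are hypothetical, and a function-local static replaces the global vector so that registrations from other translation units cannot run before the container exists.

```cpp
#include <string>
#include <vector>

// Hypothetical registry. A function-local static is constructed on first
// use, which sidesteps the order of static initialization across files.
inline std::vector<std::string>& version_registry() {
    static std::vector<std::string> registry;
    return registry;
}

// Constructing a FileVersion has a visible side effect: it records the
// string in the registry.
class FileVersion {
public:
    explicit FileVersion(const char* id) { version_registry().push_back(id); }
};

// One line like this per source file. The expanded keyword is the example
// string from the original posting.
static const FileVersion foo_version(
    "$Id: foo.cc 673 2011-10-17 09:48:11Z oleg $");

// True when the program was invoked as "prog --ident".
bool wants_ident(int argc, const char* const* argv) {
    return argc == 2 && std::string(argv[1]) == "--ident";
}

// All registered version strings, one per line.
std::string ident_report() {
    std::string out;
    for (const std::string& id : version_registry()) {
        out += id;
        out += '\n';
    }
    return out;
}
```

In main() one would check wants_ident(argc, argv), print ident_report(), and exit. Note that this only covers the documented --ident path; whether a given optimizer also leaves the literal text findable by ident(1) in the binary is a separate question that the thread goes on to debate.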
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 11:16 AM, Elazar Leibovich elaz...@gmail.com wrote:

> I guess that it doesn't apply to libraries, which must include this global variable.

The whole point is that the constants are not global but have file scope. Therefore the optimizer can figure out they are not really used.

> Excuse the idiotic solution, but can't you just add an option to print it out?
>
>     int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0) puts(ident); ... }

This is fine for the main() routine of a regular application. What about other files? Headers? What about daemons that have no I/O?

> It seems like a good idea anyhow, to have a stable way of extracting this ident string from the executable

That's the function of ident(1) - and that's what is not working because the compiler eats the strings.

> (what happens for instance if by accident you have rcsident and rcs_ident? How would you know from the stripped executable which one to trust?).

It's the version control system's function to expand the keywords with the right data - if I have multiple strings they will be consistent.

--
Oleg Goldshmidt | o...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 11:50 AM, Nadav Har'El n...@math.technion.ac.il wrote:

> One reason that nobody really cares about this trick any more is that it has become MUCH LESS IMPORTANT on modern version control systems, e.g., Subversion or Git, where there is a single version number (or version hash) for the entire project, not one per file.

This is an idealistic argument that breaks down in many real-world situations. Some examples include build and release processes parts of which are done manually by copying files around, developers forgetting to check in a file or two, and (binary) patches/updates sent to individual customers one by one. The real world is, unfortunately, not entirely blissful.

--
Oleg Goldshmidt | o...@goldshmidt.org
Re: Newer gcc swallow version control keywords
First, please see my last email to Nadav, which discusses a lot of the details.

On Tue, Oct 18, 2011 at 12:41 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

> On Tue, Oct 18, 2011 at 11:16 AM, Elazar Leibovich elaz...@gmail.com wrote:
>
> > I guess that it doesn't apply to libraries, which must include this global variable.
>
> The whole point is that the constants are not global but have file scope. Therefore the optimizer can figure out they are not really used.

But you don't really care that they have file scope, do you? So I think it's an OK solution to let them have application scope. Am I correct? See my C++ implementation in the reply to Nadav.

> > Excuse the idiotic solution, but can't you just add an option to print it out?
> >
> >     int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0) puts(ident); ... }
>
> This is fine for the main() routine of a regular application. What about other files? Headers? What about daemons that have no I/O?

It doesn't matter. The daemon will still not produce output unless invoked with --ident. You shouldn't care about that, should you?

In the case of libraries (.o, .a, .so), I think that if you expose the const char* variable, the optimizer can't eat it away. How would it know that no user of the library uses it?

> > It seems like a good idea anyhow, to have a stable way of extracting this ident string from the executable
>
> That's the function of ident(1) - and that's what is not working because the compiler eats the strings.

See my reply to Nadav. I meant unstable because "$Id: %d, $Name: %s" will also be printed.

> > (what happens for instance if by accident you have rcsident and rcs_ident? How would you know from the stripped executable which one to trust?).
>
> It's the version control system's function to expand the keywords with the right data - if I have multiple strings they will be consistent.

My bad, I thought it was supposed to be an application-scope version string.
Re: Newer gcc swallow version control keywords
> I understood that, and it's still unstable. If some young team member who is not aware of the $Id: trick writes log("$Id: %d $Name: %s\n", id, name), ident will return garbage.

Someone may alter keyword expansions just before the build, too.

> If you have a documented way to get the ident strings, it's more stable.

The whole point is to have these strings everywhere, otherwise they are not very useful. This means the whole development team is aware of and uses the convention.

--
Oleg Goldshmidt | o...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 12:50 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

> > I understood that, and it's still unstable. If some young team member who is not aware of the $Id: trick writes log("$Id: %d $Name: %s\n", id, name), ident will return garbage.
>
> Someone may alter keyword expansions just before the build, too.

I'm not sure I understand your comment. Are you saying that this is unlikely? I agree, this is not a very big point.

Another reason it's more stable is that it never allows the optimizer to mess with your ident strings. Any conforming compiler MUST let you see all files used in the executable.

> > If you have a documented way to get the ident strings, it's more stable.
>
> The whole point is to have these strings everywhere, otherwise they are not very useful. This means the whole development team is aware of and uses the convention.

I didn't understand how, e.g., my C++ scheme doesn't work. I think it should work even if you're including the $Id$ strings in the header files.
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 12:57 PM, Elazar Leibovich elaz...@gmail.com wrote:

> On Tue, Oct 18, 2011 at 12:50 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:
>
> > Someone may alter keyword expansions just before the build, too.
>
> I'm not sure I understand your comment. Are you telling that this is unlikely? I agree, this is not a very big point.

I am saying that protecting against someone who may modify a keyword expansion or create his own string of a similar format is not a big issue.

> I didn't understand how, eg, my C++ scheme don't work. I think it should work even if you're including the $Id$ strings in the headers files.

Apart from the fact that you assume that main.cc is mine (what if my product is a library?), depends on a whole lot of things I wouldn't necessarily need for any other purpose, makes the strings global and mutable, and won't pick up, e.g., the case of wrong header I mentioned before (I checked)?

This is a good idea in general, but it's not really an improvement. The trusted old scheme has all the needed properties and if gcc had an option to disable this particular kind of optimization selectively I wouldn't have a problem.

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011, Elazar Leibovich wrote about "Re: Newer gcc swallow version control keywords":

> fileversion.h:
>
>     class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };
>
> foo.cc:
>
>     static const FileVersion foo("$Id$");

Well, basically you're showing that unlike C, where a static constant that is never used can be optimized out, in C++ even a static constant that is never used can NOT, and therefore will not, be optimized out, because its constructor could have any unknown side effects (in your example, writing to a global vector).

This is interesting. But Oleg's question was about C, and unfortunately he showed that in his case, the compiler does optimize his constants out (if he uses -O2).

> With some macro trickery you might be able to get .init-like behaviour for C. (If you're willing to preprocess the C files with another utility, like you do to replace the $Id$ strings, it's actually not hard at all). This way, it's not ad-hoc, but a documented working way to get whatever you want in the executable.

What are you talking about, #ident? Indeed, it's not ad hoc and has existed for ages. But Oleg noted that it's not part of the C standard, and indeed I checked and it isn't. But it's still possible that all common compilers nowadays support it. Definitely all versions of gcc support it.

> I understood that, and it's still unstable. Since if some young team member not aware of the $Id: trick, will write: log("$Id: %d $Name: %s\n", id, name) ident will return garbage. If you have documented way to get the ident strings, it's more stable.

The idea of the "$Id:" (RCS) or "@(#)" (SCCS) prefixes is that nobody can write them by mistake. Why would a young team member use them incorrectly? Either you use them (in which case you know what you're doing) or you don't. Typically, coding conventions will specify that you use them - correctly - in every source file.

--
Nadav Har'El                        | Tuesday, Oct 18 2011,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 | Vote, n.: A person's right to make a fool
http://nadav.harel.org.il           | of himself and a wreck of his country.
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 2:25 PM, Nadav Har'El n...@math.technion.ac.il wrote:

> On Tue, Oct 18, 2011, Elazar Leibovich wrote about "Re: Newer gcc swallow version control keywords":
>
> > fileversion.h:
> >
> >     class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };
> >
> > foo.cc:
> >
> >     static const FileVersion foo("$Id$");
>
> Well, basically you're showing that unlike C where a static constant that is never used can be optimized out, in C++ even a static constant that is never used can NOT, and therefore will not, be optimized out, because its constructor could have any unknown side-effects (in your example, writing to a global vector). This is interesting.

No, no, it is optimized out - Elazar tried to make a global, non-static (non-file-scope), non-constant variable that *is* used (by the main() routine) and may be used, including for modification, by anyone else. Apart from that, the only difference between C and C++ is that in C++ you can write relatively simple constructor code (while dragging in stuff like vector and string and possibly algorithm) that will insert a reference into a global container.

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about "Re: Newer gcc swallow version control keywords":

> No, no, it is optimized out - Elazar tried to make a global non-static (non-file-scope) variable (non-constant) that *is* used (by the main() routine - and may be used, including modification, by anyone else. Apart from that the only difference between C and C++ is that in C++ you can write a relatively simple constructor code (while dragging stuff like vector and string and possibly algorithm in) that will insert a reference into a global container.

I understood his intention differently. He used his trick not just in main, but also in every file in his example.

Did you actually check this in C++? It seems to me that in C++ you cannot optimize out (eliminate the construction code of) unused static const variables whose type has a constructor, because this constructor can basically do *anything*, and this code might be necessary even if the object itself is never used.

In my opinion, if gcc does optimize out unused static const variables that have a constructor with side effects, then it is making a mistake. I didn't check if that actually happens.

But in any case your original question was about C, not C++.

--
Nadav Har'El                        | Tuesday, Oct 18 2011,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 | I'm experiencing both amnesia and deja
http://nadav.harel.org.il           | vu. I think I've forgotten this before!
Re: Newer gcc swallow version control keywords
> I understood his intention differently. He used his trick not just in main, but also in every file in his example.

Yes, it does not matter. As I said, he adds an attempt to *refer* to the global variable in main() to make it used.

> Did you actually check this in C++?

Yes.

> But in any case your original question was about C, not C++.

It was about C++. C and C++ compilers behave the same.

--
Oleg Goldshmidt | o...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about "Re: Newer gcc swallow version control keywords":

> It was about C++. C and C++ compilers behave the same.

I was very surprised to discover that this is indeed the case. I think this is a BUG.

For example, consider this C++ program:

    #include <cstdio>
    class Ident {
    public:
        Ident(const char *ident) {
            // This constructor prints a message!
            printf("yo\n");
        }
    };
    static Ident id("$Id: hello $");
    int main() {
        printf("hello\n");
    }

If you compile it with g++ (without optimization), the object "id" gets instantiated, and when you run the program you see the message "yo" first, before "hello".

But if you compile it with g++ -O2, "id" gets optimized out and its constructor never runs - and you never see the "yo" message.

So basically, compiling with -O2 changes the *behavior*, not just the *performance*, of the code. I don't see how this cannot be called a bug.

But unfortunately, whether this is to be called a bug doesn't really help you :(

--
Nadav Har'El                        | Tuesday, Oct 18 2011,
n...@math.technion.ac.il            |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 | This '|' is not a pipe.
http://nadav.harel.org.il           |
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 4:04 PM, Nadav Har'El n...@math.technion.ac.il wrote:
> On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about "Re: Newer gcc swallow version control keywords":
> > It was about C++. C and C++ compilers behave the same.
>
> I was very surprised to discover that this is indeed the case.
> I think this is a BUG. For example, consider this C++ program:
> [...]
> But, if you compile it with g++ -O2, id gets optimized out and its
> constructor never runs - and you never see the "yo" message.
>
> So basically, compiling with -O2 changes the *behavior*, not just the
> *performance*, of the code. I don't know how this cannot be called a
> bug?

Actually, it works fine for me with or without -O2, with g++ 4.6.1 on
F15. What am I doing wrong? ;-)

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011, Nadav Har'El wrote about "Re: Newer gcc swallow version control keywords":
> On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about "Re: Newer gcc swallow version control keywords":
> > It was about C++. C and C++ compilers behave the same.
>
> I was very surprised to discover that this is indeed the case.

I was originally right, and my last statement was wrong - the
constructor DOES matter. I probably wasn't paying attention when I ran
my previous test.

Actually, I checked again, and the constructor does cause the unused
static object NOT to be optimized out. In the following example, the id
object is not optimized out, just as I thought. You see "yo" in the
printout, and ident(1) shows the ident string you wanted, even with -O2.

If you're using C++, and are reluctant to use __attribute__((used))
because it's not standard, how about using this trick?

BTW, unfortunately, you need the "global" below, because if Ident's
constructor doesn't read its argument, the optimizer ends up optimizing
away the string constant given to it as argument! Instead of using a
global variable, you can probably consider doing other things with the
argument which the optimizer considers as "using" it.

    #include <cstdio>

    extern const char *global;

    class Ident {
    public:
        Ident(const char *ident){
            // We want it to touch the ident parameter, so the caller
            // doesn't think it's ignored and can be optimized out.
            global=ident;
            printf("yo\n");
        }
    };

    static Ident id("$Id: hello $");

    int main(){
        printf("hello\n");
    }

    // Put this in a single file, probably main.cc
    const char *global;

--
Nadav Har'El | Tuesday, Oct 18 2011
n...@math.technion.ac.il | http://nadav.harel.org.il
Phone +972-523-790466, ICQ 13349191
Unlike Microsoft, a restaurant would not charge me for food with a bug!
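A possible variation on the trick above, for readers who would rather not use a bare global pointer: have the constructor append its argument to a list, in the spirit of the "global container" mentioned earlier in the thread. This is only a sketch, not code posted by anyone on the list. ident_registry() and example_src_id are hypothetical names, and whether a given optimizer keeps the string should still be verified with ident(1) on the actual build.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the thread): returns the process-wide list
// of ident strings. A function-local static avoids initialization-order
// problems between translation units.
inline std::vector<const char *> &ident_registry() {
    static std::vector<const char *> registry;
    return registry;
}

class Ident {
public:
    // Storing the pointer is a side effect that reads the argument, so in
    // practice neither the constructor call nor the string can be dropped.
    explicit Ident(const char *ident) {
        assert(ident != nullptr);
        ident_registry().push_back(ident);
    }
};

// One such object per source file or header; the name is illustrative.
static Ident example_src_id("$Id: example.cc $");
```

A side benefit of this design is that the program can list its own ident strings at run time by walking ident_registry(), instead of relying only on ident(1).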
Re: Newer gcc swallow version control keywords
On Mon, Oct 17, 2011 at 10:21 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:
> Nadav Har'El n...@math.technion.ac.il writes:
> > In any case, because there was always a fear that the compiler might
> > optimize these out, someone invented a new directive, #ident, as in:
> >
> >     #ident "$Id$"
>
> This has always been there, but it has never been standard, AFAIK. It
> is not a GCC extension, either. Most preprocessors don't barf on
> directives they do not understand, but they may simply ignore #ident
> which will lead to the same behaviour that I do not want.

Just to make things more interesting, it looks like MS Visual Studio
2010 ignores #ident, warns about it, and suggests

    #pragma comment(exestr,"$Id$")

instead, which works. It also emits nothing for a

    static const char foo[] = "$Id$"

in either Release or Debug mode. Luckily, it seems to provide __pragma()
which should be the same as _Pragma and can be hidden in preprocessor
macros.

--
Oleg Goldshmidt | p...@goldshmidt.org
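The per-compiler differences described above could be hidden behind one macro, roughly as follows. This is a sketch only: the MSVC branch uses the #pragma comment(exestr, ...) form and the __pragma() keyword reported in the message, and has not been checked against other Visual Studio versions. The GCC branch uses __attribute__((used)), which is discussed elsewhere in this thread. IDENT, example_src_id and looks_expanded() are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(_MSC_VER)
/* Visual Studio: the pragma embeds the string; the array is still
   defined so that code referring to the symbol compiles. */
#define IDENT(sym, text) \
    __pragma(comment(exestr, text)) \
    static const char sym[] = text
#elif defined(__GNUC__)
/* GCC and compatible compilers: keep the array even if nothing uses it. */
#define IDENT(sym, text) \
    static const char sym[] __attribute__((used)) = text
#else
/* Unknown compiler: plain definition, which an optimizer may drop. */
#define IDENT(sym, text) static const char sym[] = text
#endif

IDENT(example_src_id, "$Id: example.cc $");

/* Rough check that s has the "$Keyword: ... $" shape of an expanded keyword. */
inline bool looks_expanded(const char *s) {
    assert(s != nullptr);
    const std::size_t n = std::strlen(s);
    return n >= 2 && s[0] == '$' && s[n - 1] == '$' &&
           std::strchr(s, ':') != nullptr;
}
```

Taking the symbol name as a macro argument keeps the identifiers unique per file, which matters once the macro is also used in headers.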
Re: Newer gcc swallow version control keywords
On Tue, Oct 18, 2011 at 12:49 PM, Elazar Leibovich elaz...@gmail.com wrote:
> But you shouldn't care that they're having a file scope, do you?

Of course I do. Just as one example: suppose that this/dir/foo.c and
that/other/module/bar.c both #include "xyzzy.h". I want to catch a build
system bug that takes the header from two different places.

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
Hi Oleg,

On Mon, Oct 17, 2011 at 06:29:58PM +0200, Oleg Goldshmidt wrote:
> ...
> Now, put the above line in a C or C++ file, say foo.cc, and do the
> following:
>
>     $ g++ -g -O2 foo.cc -c -o foo.o
>     $ ident foo.o
>     foo.o:
>          $Id: foo.cc 673 2011-10-17 09:48:11Z oleg $
>
> This works up to and including gcc 4.4, but it does not work with gcc
> 4.5.1 or gcc 4.6.1 (the ident part does not show any keywords). The
> reason seems to be that the optimizer realizes that the static const is
> not used and eliminates it (remove -O2 and ident works fine).

The -O2, as well as -O and -Os, gcc options enable a set of specific
optimizations that can each be turned off. The full list is at
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html.
Just go over this list and disable each optimization, until you find the
one removing your static string.

baruch

--
Tk Open Systems
bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il
Re: Newer gcc swallow version control keywords
On Mon, Oct 17, 2011, Oleg Goldshmidt wrote about "Newer gcc swallow version control keywords":
>     static const char foo_src_id[] = "$Id$";

I remember many years ago (when I was probably still using SCCS with its
%..% macros, and SCCS's what(1) instead of ident(1)), there was already
an argument on whether what you're trying to do should work, or whether
proper compiler behavior is to optimize these declarations away. In any
case, people who used this trick (rather than comments) *wanted* them
not to be optimized away, and they always were *not* optimized away.
I guess that has now changed, according to your report.

In any case, because there was always a fear that the compiler might
optimize these out, someone invented a new directive, #ident, as in:

    #ident "$Id$"

The C preprocessor would pass this statement unchanged to the C compiler
(as it also does with #line), and the C compiler changed it to some
static constant that doesn't get optimized away.

Last time I checked, this worked, but I suggest you give it a shot
again. I think it has been in all C compilers I've seen in the last two
decades, but perhaps it rusted away because of disuse ;-)

> but this is non-portable, and preprocessor acrobatics such as
>
>     #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
>     #define USED(x) x __attribute__((used))
>     #else
>     #define USED(x) x
>     #endif
>     static const char USED(foo_src_id[]) = "$Id$";
>
> is incredibly ugly and verbose

Why is this incredibly ugly? You can even make it nicer by putting in
ident.h:

    #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
    #define USED(x) x __attribute__((used))
    #else
    #define USED(x) x
    #endif
    #define IDENT(x) static const char USED(foo_src_id[]) = x;

and then in each file just do

    #include "ident.h"
    IDENT("$Id$");

Doesn't look that bad...

> (such as -fkeep-static-consts) but they did not work, e.g.,
> -fkeep-static-consts emits static consts only when optimization is
> not on.

Apparently, the optimizer behavior you're reporting and the odd behavior
of -fkeep-static-consts you report is NOT new. Check out this bug report
from 6 (!) years ago:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20319

Ironically, this bug is still marked "NEW" :-)

> check whether it works? If it is a RedHat-specific or generic for all
> GCCs is important.

According to the aforementioned bug report, it has been this way in all
GCCs for over 6 years, if optimization was turned on... I didn't check
myself in years, though.

--
Nadav Har'El | Monday, Oct 17 2011
n...@math.technion.ac.il | http://nadav.harel.org.il
Phone +972-523-790466, ICQ 13349191
Committee: A group of people that keeps minutes and wastes hours.
Re: Newer gcc swallow version control keywords
On 10/17/2011 06:29 PM, Oleg Goldshmidt wrote:
> Hi,
>
> I have a gcc-related question. Problematic platform is Fedora 15 with
> gcc 4.6.1, as well as Fedora 14 with gcc 4.5.1.
>
> I am used to keeping RCS/CVS/SVN keywords (e.g., $Id$) in all my code.
> In the case of C/C++ this normally amounts to
>
>     static const char foo_src_id[] = "$Id$";

Leaving aside the question of whether that is a good idea - did you try
changing that to:

    static const volatile char foo_src_id[] = "$Id$";

?

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
Re: Newer gcc swallow version control keywords
Nadav Har'El n...@math.technion.ac.il writes:
> In any case, because there was always a fear that the compiler might
> optimize these out, someone invented a new directive, #ident, as in:
>
>     #ident "$Id$"

This has always been there, but it has never been standard, AFAIK. It is
not a GCC extension, either. Most preprocessors don't barf on directives
they do not understand, but they may simply ignore #ident which will
lead to the same behaviour that I do not want.

When you say that it has worked for you with all sorts of compilers, do
you mean that it actually produced the $Id$ string with ident, or that
it didn't break?

> Why is this incredibly ugly?

Because #including an extra header just for this purpose _is_ ugly, IMHO
(compared to just compiling with an extra option).

> Apparently, the optimizer behavior you're reporting and the odd
> behavior of -fkeep-static-consts you report is NOT new. Check out this
> bug report from 6 (!) years ago:
>
>     http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20319

I was wondering whether someone would catch my little mischief: this bug
report complains about the same effect in a slightly different
situation. Even with older GCC the $Id$ string is not there with just
-O2, but it is with -g -O2 (cf. my original posting - I put the exact
command line there for a reason).

Yes, I agree with the opinion that -fkeep-static-consts should override
optimization (it is more specific), but it has never been the case (or
at least not for a very long time).

Thanks,

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
Baruch Siach bar...@tkos.co.il writes:
> The -O2, as well as -O and -Os, gcc options enable a set of specific
> optimizations that can each be turned off. The full list is at
> http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html.
> Just go over this list and disable each optimization, until you find
> the one removing your static string.

I didn't go over all of them, but I did go over those which sounded like
they could be relevant - before posting.

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
Shachar Shemesh shac...@shemesh.biz writes:
> Leaving aside the question of whether that is a good idea

This saved my butt enough times in the past that I think it is ;-)

> - did you try changing that to:
>
>     static const volatile char foo_src_id[] = "$Id$";

Hmm... const volatile hadn't occurred to me before, but I have just
tried it and it did not work.

Thanks!

--
Oleg Goldshmidt | p...@goldshmidt.org
Re: Newer gcc swallow version control keywords
On 10/17/2011 10:29 PM, Oleg Goldshmidt wrote:
> > - did you try changing that to:
> >
> >     static const volatile char foo_src_id[] = "$Id$";
>
> Hmm... const volatile hadn't occurred to me before, but I have just
> tried it and it did not work.
>
> Thanks!

Just tested it myself. It does, indeed, not work. I wonder why? Seems
like it SHOULD work. After all, that's what volatile is for, right?

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
Re: Newer gcc swallow version control keywords
Shachar Shemesh shac...@shemesh.biz writes:
> Just tested it myself. It does, indeed, not work. I wonder why? Seems
> like it SHOULD work. After all, that's what volatile is for, right?

I suspected that const was more important than volatile, but it looks
(after I removed const) that what overrules volatile is the fact that
nothing uses the static variable (or const) - the compiler does not care
that it was declared volatile.

--
Oleg Goldshmidt | p...@goldshmidt.org
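If the problem is that nothing uses the variable, one option is to give it a user. A minimal sketch, assuming you are willing to add one externally visible function per file; foo_src_ident() and print_foo_src_ident() are hypothetical names, not something proposed in this message. When this file is compiled on its own, the compiler cannot discard the array while a function with external linkage returns its address, though link-time or whole-program optimization and linker section garbage collection may still remove an accessor that nobody calls.

```cpp
#include <cassert>
#include <cstdio>

static const char foo_src_id[] = "$Id: foo.cc $";

// External linkage: other translation units may call this, so the
// compiler keeps both the function and the array it returns.
const char *foo_src_ident() { return foo_src_id; }

// Hypothetical caller, e.g. a --version handler that prints the string.
void print_foo_src_ident() {
    const char *id = foo_src_ident();
    assert(id[0] == '$');
    std::printf("%s\n", id);
}
```

This also gives the program a documented way to report its own ident strings, at the cost of one extra symbol per file.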
Re: Newer gcc swallow version control keywords
Shachar Shemesh wrote on Mon, Oct 17, 2011 at 22:47:49 +0200:
> On 10/17/2011 10:29 PM, Oleg Goldshmidt wrote:
> > > - did you try changing that to:
> > >
> > >     static const volatile char foo_src_id[] = "$Id$";
> >
> > Hmm... const volatile hadn't occurred to me before, but I have just
> > tried it and it did not work.
>
> Just tested it myself. It does, indeed, not work. I wonder why? Seems
> like it SHOULD work. After all, that's what volatile is for, right?

Because it's also static?
Re: Newer gcc swallow version control keywords
Nadav Har'El n...@math.technion.ac.il writes:
>     #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
>     #define USED(x) x __attribute__((used))
>     #else
>     #define USED(x) x
>     #endif
>     #define IDENT(x) static const char USED(foo_src_id[]) = x;
>
> and then in each file just do
>
>     #include "ident.h"
>     IDENT("$Id$");

The proper contents of ident.h would actually be

    #ifndef _IDENT_H_
    #define _IDENT_H_

    #if defined(__GNUC__)
    #define USED(x) x __attribute__((used))
    #else
    #define USED(x) x
    #endif

    #define IDENT(s,x) static const char USED(s[]) = x

    #endif

- this is because one would want to use different identifiers in every
file to avoid

a) clashes when keywords are used in header files (very useful: which
   versions of the headers was this file compiled with?), and

b) possible effects of -fmerge-constants and friends.

--
Oleg Goldshmidt | p...@goldshmidt.org
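With the two-argument IDENT, usage might look like the following sketch, which folds a header and a source file into one block for brevity. The file names xyzzy.h and foo.cc come from this thread; the identifiers xyzzy_h_id and foo_cc_id and the helper ident_equals() are made up for illustration. The include guard is spelled IDENT_H_ here, an assumption of this sketch, because names starting with an underscore and a capital letter are reserved for the implementation.

```cpp
#include <cassert>
#include <cstring>

/* --- ident.h (macros as in the message above) --- */
#ifndef IDENT_H_
#define IDENT_H_

#if defined(__GNUC__)
#define USED(x) x __attribute__((used))
#else
#define USED(x) x
#endif

#define IDENT(s,x) static const char USED(s[]) = x

#endif

/* --- xyzzy.h: every file that includes it records this version --- */
IDENT(xyzzy_h_id, "$Id: xyzzy.h $");

/* --- foo.cc: a different identifier, so the two do not clash --- */
IDENT(foo_cc_id, "$Id: foo.cc $");

/* Illustrative helper: compares an ident string with an expected value. */
inline bool ident_equals(const char *id, const char *expected) {
    assert(id != nullptr && expected != nullptr);
    return std::strcmp(id, expected) == 0;
}
```

Running ident(1) on the resulting foo.o should then list both strings, which is what makes a header picked up from the wrong directory visible.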
Re: Newer gcc swallow version control keywords
http://gcc.gnu.org/ml/gcc/2005-04/msg01429.html

I do have three suggestions for you:

1) The current way to tell the compiler not to throw away
   apparently-unused data is __attribute__((used)), like this:

    static const char __attribute__((used)) rcs_sccs_id[] =
        "$Id: @(#)%M% %I% 20%E% %U% copyright 2005 %Q% string1\\ $";

On Mon, Oct 17, 2011 at 2:04 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:
> The proper contents of ident.h would actually be
> [...]
>     #define IDENT(s,x) static const char USED(s[]) = x
> [...]
> - this is because one would want to use different identifiers in every
> file to avoid a) clashes when keywords are used in header files (very
> useful: which versions of the headers was this file compiled with?)
> and b) possible effects of -fmerge-constants and friends.

--
Your time is limited, so don't waste it living someone else's life.
Don't be trapped by dogma -- which is living with the results of other
people's thinking. Don't let the noise of others' opinions drown out
your own inner voice. And most important, have the courage to follow
your heart and intuition. They somehow already know what you truly want
to become. Everything else is secondary.
Steve Jobs