Re: Newer gcc swallow version control keywords

2011-10-21 Thread Shachar Shemesh
On 10/18/2011 12:50 PM, Oleg Goldshmidt wrote:
 I understood that, and it's still unstable. Since if some young team member
 not aware of the $Id: trick, will write:
 log("$Id: %d $Name: %s\n", id, name)
 ident will return garbage.
 Someone may alter keyword expansions just before the build, too.

 If you have documented way to get the ident strings, it's more stable.
 The whole point is to have these strings everywhere, otherwise they
 are not very useful. This means the whole development team is aware of
 and uses the convention.

I understand why these keywords were useful for CVS/RCS, where each file
had its own version number. When working with SVN, however, a single
number uniquely identifies the entire source tree. Why not have that one
number and get it done with?

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: Newer gcc swallow version control keywords

2011-10-21 Thread Elazar Leibovich
On Tue, Oct 18, 2011 at 1:50 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

  I didn't understand how, eg, my C++ scheme don't work. I think it should
  work even if you're including the $Id$ strings in the headers files.

 Apart from the fact that you assume that main.cc is mine (what if my
 product is a library?),


There's an important point here.
If your product is an .so, all you need to do, is to expose the ident
objects to the library users (of course, in this case the names of the
variables must be unique per file version). This way, the linker is forced
to leave those strings intact.
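
A minimal sketch of the .so case described above, assuming a hypothetical
library source file foo.cc (the name mylib_foo_cc_ident is made up for
illustration). In C++ a namespace-scope const object has internal linkage by
default, so the declaration has to say extern for the symbol to be exported:

```cpp
// foo.cc (hypothetical library source file)

// Normally this declaration would live in a public header of the library.
extern const char mylib_foo_cc_ident[];

// Because of the extern declaration above, this definition has external
// linkage, so the compiler cannot prove that the string is unused.
// The VCS expands $Id$ on checkout; it is left unexpanded here.
const char mylib_foo_cc_ident[] = "$Id$";
```

It is still worth running ident(1) on the built library to confirm that the
string is really there.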

If your product is an .a archive, well, you're out of luck. The same method
used for .so files will of course work for the .a archive, but the client who
links against your library is free to use a whole-program optimizer, which might
wipe out your strings. I'm not sure it's even possible to mark those strings
as always to be included in the object file.

I really recommend this piece[1] by Meyers, who shows that fighting with the
optimizer, is doomed to misery.

[1] PDF: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf (bottom
of page 6; Ctrl-F "In essence, you’ve just fired the opening salvo")



 depends on a whole lot of things I wouldn't
 necessarily need for any other purpose, makes the strings global

Not too global. The global list of strings is exposed only in the main() and
in the implementation of the Ident class. If you hate global variables -
make it a static variable of the Ident class.

 and
 mutable,


The list of strings is mutable, right? The strings can be immutable. Also, it
is possible to use a push-only queue instead of a vector. But I'm nitpicking
now ;-)


 and won't pick up, e.g., the case of wrong header I mentioned
 before (I checked)?


I don't understand why that is.
See here https://github.com/elazarl/ident-for-cpp[2] for an implementation
which Works For Me (TM), with no non-static global variables, and it works
for header files.

[2] https://github.com/elazarl/ident-for-cpp



 This is a good idea in general, but it's not really an improvement.
 The trusted old scheme has all the needed properties and if gcc had an
 option to disable this particular kind of optimization selectively I
 wouldn't have a problem.


If I understood correctly, the problem was how to port the GCC
optimization-turn-off to other compilers without macro hackeries. I believe
this is a good portable way. I'm trying to find a sane way to do that in
portable C, but I'm not sure it's possible without too much effort.

That said, the macro+gcc attribute seems the most reasonable approach given
that VS has a way to disable this optimization. However, it spoils all the
fun.
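
For reference, the macro-plus-attribute approach can be sketched as follows.
KEEP_IDENT is a hypothetical macro name; only the GCC attribute comes from
this thread. The sketch applies the attribute whenever __GNUC__ is defined,
on the assumption that the GCC versions in use accept it, and the fallback
branch expands to nothing, so other compilers get no protection here:

```cpp
// KEEP_IDENT is a hypothetical helper macro, not part of any library.
#if defined(__GNUC__)
#  define KEEP_IDENT __attribute__((used))
#else
#  define KEEP_IDENT /* no equivalent assumed for other compilers */
#endif

// File-scope ident string; the attribute asks GCC to emit it even though
// nothing in this translation unit refers to it.
static const char foo_src_id[] KEEP_IDENT = "$Id$";
```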



 --
 Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-21 Thread Oleg Goldshmidt
Shachar Shemesh shac...@shemesh.biz writes:

 I understand why these keywords were useful for CVS/RCS, where each
 file had its own version number. When working with SVN, however, a
 single number uniquely identifies the entire source tree. Why not
 have that one number and get it done with?

I believe it has been asked and answered, Your Honour ;-). 

Having a single number is enough only under a whole bunch of
assumptions: the code must come from a single branch of a single
repository, the build and release systems must be 100% reliable, no
human mistake may occur on the way between developer workstation and
customer deployment, etc., etc.

The same argument is employed by the gcc team who say - either in the
bug trail Nadav quoted or in the mailing list thread Ghiora (I think...
sorry if I am mixing it up) found - that it is enough for their
purposes so it must be enough for everybody.

I say that with all the complexity of gcc their case is conceptually
simple (a single piece of software from a single repository) and their
build/release process is probably better than most. I assume that I do
not have to explain to everybody that the difficulties inherent in
changing processes in an organization are significant, legacy
structures and processes abound, and every device that helps flushing
problems early, quickly, and with minimal pain is worth deploying.

So, my conclusion is that gcc at present has a problem. By applying an
optimization that cannot be turned off it makes an assumption that it
is smarter than the user[1] (isn't this a complaint typically aimed at,
say, Microsoft Office? hmphhh...), and an important use case that is
disabled by said optimization is dismissed, in part, on the grounds
that the gcc team didn't need it in gcc code.

This last paragraph is, of course, me venting. At people who are in no
way to blame. That was easy, and helpful - thanks everybody!

[1] Hey, we see that this variable cannot possibly be used by any
program of which it is a part, so we'll assume you are either
stupid or careless and we'll just erase it without trace. Well, I
am not stupid, I intended this variable to be used by another
program to which this program is but an input, and you know what:
it's none of the compiler's business to decide that I cannot
possibly want that.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Mon, Oct 17, 2011, Oleg Goldshmidt wrote about Re: Newer gcc swallow 
version control keywords:
  In any case, because there was always a fear that the compiler might
  optimize these out, someone invented a new directive, #ident, as in:
 
  #ident "$Id$"
 
 This has always been there, but it has never been standard, AFAIK. It

"Always" is always a relative term, when we're talking about a programming
language (C) invented 40 years ago. I'm pretty sure #ident has indeed been
around for around 20 years. About standardization, you may be right: I just
checked and sadly I can't find #ident anywhere in the C99 standard.

But that being said, I wonder if in practice, on actual computers of interest
today, it will work. I just checked on my Fedora 15, Gcc 4.6.1, and it works
well. Sadly, all the dozen other Unix variants to which I used to have access,
have gone the way of the dodo, so I can't check this on any other C compiler.

 is not a GCC extension, either. Most preprocessors don't barf on
 directives they do not understand, but they may simply ignore #ident
 which will lead to the same behaviour that I do not want.

They *may* ignore #ident, but like I said, #ident has been around for ages,
and may actually work on many if not all compilers. Like I said, it does
on gcc. I checked, and it seems it generates assembly that looks like:

.ident  "$Id: hello $"

Which I guess GNU AS supports, seeing that the ident string appears in the
object file and in the final executable - even if compiled with -O2.


 When you say that it has worked for you with all sorts of compilers do
 you mean that it actually produced the $Id$ string with ident or that
 it didn't break? 

Whenever I checked, it worked in the sense of generating code.
But I was never a big fan of what(1) and ident(1) like you, so I haven't
checked this in ages.

 I was wondering whether someone would catch my little mischief: this
 bug report complains about the same effect in a slightly different
 situation. Even with older GCC the $Id$ string is not there with just
 -O2, but it is with -g -O2 (cf. my original posting - I put the
 exact command line there for a reason). 
 
 Yes, I agree with the opinion that -fkeep-static-consts should
 override optimization (it is more specific), but it has never been the
 case (or at least not for a very long time).

I suggest you add your comments to that bug report, to make it known that
people still care. Apparently, you're the last user (!?) of ident(1), so
make your voice heard :-)

-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Unix is user friendly - it's just picky
http://nadav.harel.org.il   |about its friends.



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Elazar Leibovich
I guess that it doesn't apply to libraries, which must include this global
variable.

Excuse the idiotic solution, but can't you just add an option to print it
out?

int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0)
puts(ident); ... }

The only trick the optimizer can play here, is inlining puts and the input
string to assembly commands, but I really don't think it can happen in
practice.

It seems like a good idea anyhow, to have a stable way of extracting this
ident string from the executable (what happens for instance if by accident
you have rcsident and rcs_ident? How would you know from the stripped
executable which one to trust?).

On Mon, Oct 17, 2011 at 6:29 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

 Hi,

 I have a gcc-related question. Problematic platform is Fedora 15 with
 gcc 4.6.1, as well as Fedora 14 with gcc 4.5.1.

 I am used to keeping RCS/CVS/SVN keywords (e.g., $Id$) in all my code.
 In the case of C/C++ this normally amounts to

 static const char foo_src_id[] = "$Id$";

 in the source code. For those who have never encountered it: SVN and
 friends happily substitute the relevant information for $Id$, and the
 information then can be retrieved using ident(1), even from binaries,
 libraries, and such - incredibly useful.

 Now, put the above line in a C or C++ file, say foo.cc, and do the
 following:

 $ g++ -g -O2 foo.cc -c -o foo.o
 $ ident foo.o
 foo.o:
 $Id: foo.cc 673 2011-10-17 09:48:11Z oleg $

 This works up to and including gcc 4.4, but it does not work with gcc
 4.5.1 or gcc 4.6.1 (the ident part does not show any keywords). The
 reason seems to be that the optimizer realizes that the static const
 is not used and eliminates it (remove -O2 and ident works fine).

 I can fool gcc by

 static const char foo_src_id[] __attribute__((used)) = "$Id$";

 but this is non-portable, and preprocessor acrobatics such as

 #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
 #define USED(x) x __attribute__((used))
 #else
 #define USED(x) x
 #endif

 static const char USED(foo_src_id[]) = "$Id$";

 is incredibly ugly and verbose, IMHO (even with #ifdef __GNUC__ only).
 So is any contortion to use the constant somewhere in each file.

 Does anyone know of a gcc option that will keep unused static
 constants even with optimization, or any other way to keep the
 keywords in a portable way? I tried different things that came to mind
 (such as -fkeep-static-consts) but they did not work, e.g.,
 -fkeep-static-consts emits static consts only when optimization is not
 on.

 A bonus question: can anyone with Debian/Ubuntu/OpenSuSE/whatever
 check whether it works? Whether this is RedHat-specific or generic for all
 GCCs is important.

 --
 Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Tue, Oct 18, 2011, Elazar Leibovich wrote about Re: Newer gcc swallow 
version control keywords:
 Excuse the idiotic solution, but can't you just add an option to print it
 out?
 
 int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0)
 puts(ident); ... }

The point in Oleg's trick (which is not Oleg's invention - it has been in
use for decades) is to be able to tell not just a single version number for
the executable (your --ident is basically the same as --version supported
by many programs), but rather to be able to tell the exact version of each
and every source code file which participated in creating this executable.
You run ident on the executable, and get a list of dozens (or thousands)
of source file names and their exact version numbers and dates. This could
be useful if you have a lot of versions of your code running around, and
you don't remember how exactly each executable was generated.

This trick was particularly useful in the days of SCCS and RCS, which work
separately on each file, and each file has a separate version number.

One reason that nobody really cares about this trick any more is that it has
become MUCH LESS IMPORTANT on modern version control systems, e.g., Subversion
or Git, where there is a single version number (or version hash) for the
entire project, not one per file. So if you just include this single
version number in your main() - as in your example - it is enough to find
exactly which version of each source file went into this executable.
If your project's result doesn't have a main() (e.g., it is a library), just
find another way to stick this version number - e.g., provide a function
or global variable (which won't be optimized out) containing this version
number - and stick it in the same object file as your library's most important
function (so it always gets included by users of your library).

 It seems like a good idea anyhow, to have a stable way of extracting this
 ident string from the executable (what happens for instance if by accident
 you have rcsident and rcs_ident? How would you know from the stripped
 executable which one to trust?).

The idea with ident(1) is that the content of the string, not the variable
name holding it, has a recognizable structure. In particular, it looks like
this: "$Id: something $" - with the word "something" replaced by anything you
wish. ident(1) looks for such strings in any file you give it - whether it
is an executable, source code, or whatever, and prints them.
what(1), SCCS's precursor of RCS's ident(1), used a different string format
for the same purpose - it looked like "@(#) something" (with the string
typically ending with a null). Nobody in their right mind still uses SCCS
today, but what(1) still works, if you wish to use strings of this format
in your executables to mark them.
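
The kind of matching described above can be sketched roughly as follows.
This is a simplified approximation written for illustration, not the actual
ident(1) source: the function name find_keywords and the exact rules (a run
of letters after '$', then ':', then text up to the next '$' on the same
line) are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Scan arbitrary bytes for "$Keyword: text $" patterns and return each match.
// Simplified on purpose; the real tool may apply further rules.
std::vector<std::string> find_keywords(const std::string& data) {
    std::vector<std::string> found;
    std::size_t pos = 0;
    while ((pos = data.find('$', pos)) != std::string::npos) {
        // The keyword is a non-empty run of letters right after the '$'.
        std::size_t i = pos + 1;
        while (i < data.size() &&
               std::isalpha(static_cast<unsigned char>(data[i]))) {
            ++i;
        }
        if (i > pos + 1 && i < data.size() && data[i] == ':') {
            // The value runs up to the next '$' and must not cross a newline.
            std::size_t end = i + 1;
            while (end < data.size() && data[end] != '$' && data[end] != '\n') {
                ++end;
            }
            if (end < data.size() && data[end] == '$') {
                found.push_back(data.substr(pos, end - pos + 1));
                pos = end + 1;
                continue;
            }
        }
        ++pos;  // not a keyword; resume scanning after this '$'
    }
    return found;
}
```

Because the match depends only on the bytes, the variable name plays no
role. The same property means that a printf-style format string such as
"$Id: %d $Name: %s\n", mentioned elsewhere in this thread, would be reported
by this sketch as "$Id: %d $".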

Nadav.

-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Windows-2000/Professional isn't.
http://nadav.harel.org.il   |



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Elazar Leibovich
On Tue, Oct 18, 2011 at 11:50 AM, Nadav Har'El n...@math.technion.ac.il wrote:

 On Tue, Oct 18, 2011, Elazar Leibovich wrote about Re: Newer gcc swallow
 version control keywords:
  Excuse the idiotic solution, but can't you just add an option to print it
  out?
 
  int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0)
  puts(ident); ... }

 The point in Oleg's trick (which is not Oleg's invention - it has been in
 use for decades) is to be able to tell not just a single version number for
 the executable (your --ident is basically the same as --version
 supported
 by many programs), but rather to be able to tell the exact version of each
 and every source code file which participated in creating this executable.
 You run ident on the executable, and get a list of dozens (or thousands)
 of source file names and their exact version numbers and dates. This could
 be useful if you have a lot of versions of your code running around, and
 you don't remember how exactly each executable was generated.


I hadn't understood that, but anyhow, the same idea could apply.

For C++:
main.cc:
vector<string> __files;
int main(int argc, char **argv) { if (argc == 2 && string(argv[1]) == "--ident")
{ for (auto file : __files) cout << file << "\n"; exit(0); } }

fileversion.h:
class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };

foo.cc:
static const FileVersion foo("$Id$");

With some macro trickery you might be able to get .init-like behaviour for
C. (If you're willing to preprocess the C files with another utility, like
you do to replace the $Id$ strings, it's actually not hard at all).

This way, it's not ad-hoc, but a documented working way to get whatever you
want in the executable.

(I perfectly agree with you that today it's less relevant, but I'm taking
Oleg's stance for the sake of the discussion)




  It seems like a good idea anyhow, to have a stable way of extracting this
  ident string from the executable (what happens for instance if by
 accident
  you have rcsident and rcs_ident? How would you know from the stripped
  executable which one to trust?).

 The idea with ident(1) is that the content of the string, not the variable
 name holding it, has a recognizable structure. In particular, it looks
 like
 this: "$Id: something $" - with the word "something" replaced by anything you
 wish. ident(1) looks for such strings in any file you give it - whether it
 is an executable, source code, or whatever, and prints them.


I understood that, and it's still unstable: if some young team member who is
not aware of the $Id: trick writes
log("$Id: %d $Name: %s\n", id, name)
ident will return garbage.
If you have a documented way to get the ident strings, it's more stable.


Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 11:16 AM, Elazar Leibovich elaz...@gmail.com wrote:
 I guess that it doesn't apply to libraries, which must include this global
 variable.

The whole point is that the constants are not global but have file
scope. Therefore the optimizer can figure out they are not really
used.

 Excuse the idiotic solution, but can't you just add an option to print it
 out?
 int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0)
 puts(ident); ... }

This is fine for the main() routine of a regular application. What
about other files? Headers? What about daemons that have no I/O?

 It seems like a good idea anyhow, to have a stable way of extracting this
 ident string from the executable

That's the function of ident(1) - and that's what is not working
because the compiler eats the strings.

 (what happens for instance if by accident
 you have rcsident and rcs_ident? How would you know from the stripped
 executable which one to trust?).

It's the version control system's function to expand the keywords with
the right data - if I have multiple strings they will be consistent.

-- 
Oleg Goldshmidt | o...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 11:50 AM, Nadav Har'El n...@math.technion.ac.il wrote:

 One reason that nobody really cares about this trick any more is that it has
 become MUCH LESS IMPORTANT on modern version control systems, e.g., Subversion
 or Git, where there is a single version number (or version hash) for the
 entire project, not one per file.

This is an idealistic argument that breaks down in many real-world
situations. Some examples include build and release processes parts of
which are done manually by copying files around, developers forgetting
to check in a file or two, and (binary) patches/updates sent to
individual customers one by one. The real world is, unfortunately, not
entirely blissful.

-- 
Oleg Goldshmidt | o...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Elazar Leibovich
First, please see my last email to Nadav, which discusses a lot of the
details.

On Tue, Oct 18, 2011 at 12:41 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

 On Tue, Oct 18, 2011 at 11:16 AM, Elazar Leibovich elaz...@gmail.com
 wrote:
  I guess that it doesn't apply to libraries, which must include this
 global
  variable.

 The whole point is that the constants are not global but have file
 scope. Therefore the optimizer can figure out they are not really
 used.


But you shouldn't care whether they have file scope, should you?
So I think it's an OK solution to let them have application scope. Am I
correct? See my C++ implementation in the reply to Nadav.



  Excuse the idiotic solution, but can't you just add an option to print it
  out?
  int main(int argc, char **argv) { if (argc == 2 && strcmp(argv[1], "--ident") == 0)
  puts(ident); ... }

 This is fine for the main() routine of a regular application. What
 about other files? Headers? What about daemons that have no I/O?


It doesn't matter. The daemon will still not have output unless invoked with
--ident. You shouldn't care about that, should you?
In the case of libraries (.o, .a, .so), I think that if you expose the const
char* variable, the optimizer can't eat it away. How would it know that no user
of the library uses it?



  It seems like a good idea anyhow, to have a stable way of extracting this
  ident string from the executable

 That's the function of ident(1) - and that's what is not working
 because the compiler eats the strings.


See my reply to Nadav. I meant unstable because "$Id: %d, $Name: %s" will
also be printed.



  (what happens for instance if by accident
  you have rcsident and rcs_ident? How would you know from the stripped
  executable which one to trust?).

 It's the version control system's function to expand the keywords with
 the right data - if I have multiple strings they will be consistent.


My bad, I thought it's supposed to be an application scope version string.


Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
 I understood that, and it's still unstable. Since if some young team member
 not aware of the $Id: trick, will write:
     log("$Id: %d $Name: %s\n", id, name)
 ident will return garbage.

Someone may alter keyword expansions just before the build, too.

 If you have documented way to get the ident strings, it's more stable.

The whole point is to have these strings everywhere, otherwise they
are not very useful. This means the whole development team is aware of
and uses the convention.

-- 
Oleg Goldshmidt | o...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Elazar Leibovich
On Tue, Oct 18, 2011 at 12:50 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:

  I understood that, and it's still unstable. Since if some young team
 member
  not aware of the $Id: trick, will write:
  log("$Id: %d $Name: %s\n", id, name)
  ident will return garbage.

 Someone may alter keyword expansions just before the build, too.


I'm not sure I understand your comment. Are you saying that this is
unlikely? I agree, this is not a very big point.

Another reason it's more stable is that it never allows the optimizer to
mess with your ident strings.
Any conforming compiler MUST let you see all files used in the executable.



  If you have documented way to get the ident strings, it's more stable.

 The whole point is to have these strings everywhere, otherwise they
 are not very useful. This means the whole development team is aware of
 and uses the convention.


I didn't understand how, e.g., my C++ scheme doesn't work. I think it should
work even if you're including the $Id$ strings in the header files.
___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 12:57 PM, Elazar Leibovich elaz...@gmail.com wrote:
 On Tue, Oct 18, 2011 at 12:50 PM, Oleg Goldshmidt p...@goldshmidt.org
 wrote:

 Someone may alter keyword expansions just before the build, too.

 I'm not sure I understand your comment. Are you telling that this is
 unlikely? I agree, this is not a very big point.

I am saying that protecting against someone who may modify a keyword
expansion or create his own string of a similar format is not a big
issue.

 I didn't understand how, eg, my C++ scheme don't work. I think it should
 work even if you're including the $Id$ strings in the headers files.

Apart from the fact that you assume that main.cc is mine (what if my
product is a library?), depends on a whole lot of things I wouldn't
necessarily need for any other purpose, makes the strings global and
mutable, and won't pick up, e.g., the case of wrong header I mentioned
before (I checked)?

This is a good idea in general, but it's not really an improvement.
The trusted old scheme has all the needed properties and if gcc had an
option to disable this particular kind of optimization selectively I
wouldn't have a problem.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Tue, Oct 18, 2011, Elazar Leibovich wrote about Re: Newer gcc swallow 
version control keywords:
 fileversion.h:
 class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };
 
 foo.cc:
 static const FileVersion foo("$Id$");

Well, basically you're showing that unlike C where a static constant that
is never used can be optimized out, in C++ even a static constant that is
never used can NOT, and therefore will not, be optimized out, because its
constructor could have any unknown side-effects (in your example, writing
to a global vector). This is interesting. But Oleg's question was about C,
and unfortunately he showed that in his case, the compiler does optimize
his constants out (if he uses -O2).

 With some macro trickery you might be able to get .init-like behaviour for
 C. (If you're willing to preprocess the C files with another utility, like
 you do to replace the $Id$ strings, it's actually not hard at all).
 
 This way, it's not ad-hoc, but a documented working way to get whatever you
 want in the executable.

What are you talking about, #ident? Indeed, it's not ad-hoc and has existed
for ages. But Oleg noted that it's not part of the C standard, and indeed I
checked and it isn't. But it's still possible that all common compilers
nowadays support it. Definitely all versions of gcc support it.

 I understood that, and it's still unstable. Since if some young team member
 not aware of the $Id: trick, will write:
 log("$Id: %d $Name: %s\n", id, name)
 ident will return garbage.
 If you have documented way to get the ident strings, it's more stable.

The idea of the "$Id: " (RCS) or "@(#) " (SCCS) prefixes is that nobody
can write them by accident. Why would a young team member use
it incorrectly? Either you would use it (in which case you know what you're
doing) or you won't. Typically coding conventions will specify that you use
it - correctly - on every source file.


-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Vote, n.: A person's right to make a fool
http://nadav.harel.org.il   |of himself and a wreck of his country.



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 2:25 PM, Nadav Har'El n...@math.technion.ac.il wrote:
 On Tue, Oct 18, 2011, Elazar Leibovich wrote about Re: Newer gcc swallow 
 version control keywords:
 fileversion.h:
 class FileVersion { public: FileVersion(const string &v) { __files.push_back(v); } };

 foo.cc:
 static const FileVersion foo("$Id$");

 Well, basically you're showing that unlike C where a static constant that
 is never used can be optimized out, in C++ even a static constant that is
 never used can NOT, and therefore will not, be optimized out, because its
 constructor could have any unknown side-effects (in your example, writing
 to a global vector). This is interesting.

No, no, it is optimized out - Elazar tried to make a global non-static
(non-file-scope) variable (non-constant) that *is* used (by the main()
routine) - and may be used, including modification, by anyone else.
Apart from that the only difference between C and C++ is that in C++
you can write relatively simple constructor code (while dragging
stuff like <vector> and <string> and possibly <algorithm> in) that
will insert a reference into a global container.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about Re: Newer gcc swallow 
version control keywords:
 No, no, it is optimized out - Elazar tried to make a global non-static
 (non-file-scope) variable (non-constant) that *is* used (by the main()
 routine) - and may be used, including modification, by anyone else.
 Apart from that the only difference between C and C++ is that in C++
 you can write relatively simple constructor code (while dragging
 stuff like <vector> and <string> and possibly <algorithm> in) that
 will insert a reference into a global container.

I understood his intention differently. He used his trick not just in main,
but also in every file in his example.

Did you actually check this in C++? It seems to me that in C++, you cannot
optimize out (eliminate the construction code of) unused static const
variables whose type has a constructor, because this constructor can
basically do *anything*, and this code might be necessary even if the object
itself is never used.

In my opinion, if gcc does optimize out unused static const variables that
have a constructor with side effects, then it is making a mistake. I didn't
check if that actually happens. But in any case your original question was
about C, not C++.

-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |I'm experiencing both amnesia and deja
http://nadav.harel.org.il   |vu. I think I've forgotten this before!



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
 I understood his intention differently. He used his trick not just in main,
 but also in every file in his example.

Yes, but it does not matter. As I said, he adds an attempt to *refer* to
the global variable in main() to make it used.

 Did you actually check this in C++?

Yes.

 But in any case your original question was
 about C, not C++.

It was about C++. C and C++ compilers behave the same.

-- 
Oleg Goldshmidt | o...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about Re: Newer gcc swallow 
version control keywords:
 It was about C++. C and C++ compilers behave the same.

I was very surprised to discover that this is indeed the case. I think this
is a BUG. For example, consider this C++ program:

#include <cstdio>
class Ident {
public:
        Ident(const char *ident){
                // This constructor prints a message!
                printf("yo\n");
        }
};

static Ident id("$Id: hello $");

int main(){
        printf("hello\n");
}


If you compile it with g++ (without optimization), the object id gets
instantiated, and when you run the program you see the message "yo" first,
before "hello". But, if you compile it with g++ -O2, id gets optimized out
and its constructor never runs - and you never see the "yo" message.

So basically, compiling with -O2 changes the *behavior*, not just the
*performance*, of the code. I don't see how this cannot be called a bug.

But unfortunately, whether this is to be called a bug doesn't really
help you :(

-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |This '|' is not a pipe.
http://nadav.harel.org.il   |



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 4:04 PM, Nadav Har'El n...@math.technion.ac.il wrote:
 On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about Re: Newer gcc swallow 
 version control keywords:
 It was about C++. C and C++ compilers behave the same.

 I was very surprised to discover that this is indeed the case. I think this
 is a BUG. For example, consider this C++ program:

         #include <cstdio>
         class Ident {
         public:
                 Ident(const char *ident){
                         // This constructor prints a message!
                         printf("yo\n");
                 }
         };

         static Ident id("$Id: hello $");

         int main(){
                 printf("hello\n");
         }


 If you compile it with g++ (without optimization), the object id gets
 instantiated, and when you run the program you see the message "yo" first,
 before "hello". But, if you compile it with g++ -O2, id gets optimized out
 and its constructor never runs - and you never see the "yo" message.

 So basically, compiling with -O2 changes the *behavior*, not just the
 *performance*, of the code. I don't see how this cannot be called a bug.

Actually, it works fine for me with or without  -O2, with g++ 4.6.1 on
F15. What am I doing wrong? ;-)

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Nadav Har'El
On Tue, Oct 18, 2011, Nadav Har'El wrote about Re: Newer gcc swallow version 
control keywords:
 On Tue, Oct 18, 2011, Oleg Goldshmidt wrote about Re: Newer gcc swallow 
 version control keywords:
  It was about C++. C and C++ compilers behave the same.
 
 I was very surprised to discover that this is indeed the case.

I was originally right, and my last statement was wrong - the constructor
DOES matter. I probably wasn't paying attention when I ran my previous
test. I checked again, and the constructor does cause the unused
static object NOT to be optimized out.

In the following example, the id object is not optimized out, just as
I thought. You see "yo" in the printout, and ident(1) shows the ident string
you wanted, even with -O2. If you're using C++, and are reluctant to use
__attribute__((used)) because it's not standard, how about using this trick?

BTW, unfortunately, you need the global below, because if Ident's constructor
doesn't read its argument, the optimizer ends up optimizing away the string
constant given to it as argument! Instead of using a global variable, you can
probably consider doing other things with the argument which the optimizer
considers as using it.

#include <cstdio>
extern const char *global;
class Ident {
public:
        Ident(const char *ident){
                // We want it to touch the ident parameter, so the caller
                // doesn't think it's ignored and can be optimized out.
                global=ident;
                printf("yo\n");
        }
};

static Ident id("$Id: hello $");

int main(){
        printf("hello\n");
}

// Put this in a single file, probably main.cc
const char *global;
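
One of the "other things" the constructor could do with its argument is the global-container idea mentioned earlier in the thread. The following is only a sketch under assumptions: the ident_registry namespace, its fixed capacity and the contains() helper are made-up names, not anything posted here, and nothing guarantees that a whole-program optimizer will leave the strings alone.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical registry of ident strings; all names are illustrative only.
namespace ident_registry {
const std::size_t kMaxIdents = 64;

// Function-local statics sidestep static initialization order problems.
inline const char **table() {
    static const char *entries[kMaxIdents];
    return entries;
}
inline std::size_t &count() {
    static std::size_t n = 0;
    return n;
}
// Linear search; fine for a handful of files.
inline bool contains(const char *s) {
    for (std::size_t i = 0; i < count(); ++i)
        if (std::strcmp(table()[i], s) == 0) return true;
    return false;
}
}  // namespace ident_registry

class Ident {
public:
    explicit Ident(const char *ident) {
        // Storing the pointer counts as a use of the argument.
        if (ident_registry::count() < ident_registry::kMaxIdents)
            ident_registry::table()[ident_registry::count()++] = ident;
    }
};

// One such object per source file.
static Ident this_file_id("$Id: hello $");
```

A side benefit, if it works as hoped, is that the program could dump the registry at run time (say, for a --version option) instead of relying only on ident(1).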



-- 
Nadav Har'El|   Tuesday, Oct 18 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Unlike Microsoft, a restaurant would not
http://nadav.harel.org.il   |charge me for food with a bug!



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Mon, Oct 17, 2011 at 10:21 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:
 Nadav Har'El n...@math.technion.ac.il writes:

 In any case, because there was always a fear that the compiler might
 optimize these out, someone invented a new directive, #ident, as in:

 #ident "$Id$"

 This has always been there, but it has never been standard, AFAIK. It
 is not a GCC extension, either. Most preprocessors don't barf on
 directives they do not understand, but they may simply ignore #ident
 which will lead to the same behaviour that I do not want.

Just to make things more interesting, it looks like MS Visual Studio
2010 ignores #ident, warns about it, and suggests

#pragma comment(exestr, "$Id$")

instead, which works. It also emits nothing for a static const char
foo[] = "$Id$" in either Release or Debug mode.

Luckily, it seems to provide __pragma() which should be the same as
_Pragma and can be hidden in preprocessor macros.
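
A rough sketch of how the two could be hidden behind one macro follows. SRC_IDENT and demo_src_id are made-up names, and the MSVC branch only restates what is reported above for Visual Studio 2010; it has not been checked against other versions, so treat it as an assumption.

```cpp
#include <cstring>

// Hypothetical wrapper macro; the name SRC_IDENT is not from this thread.
#if defined(_MSC_VER)
// Assumption: exestr behaves as described above for Visual Studio 2010.
// No variable is defined in this branch, so 'name' is unused here.
#define SRC_IDENT(name, text) __pragma(comment(exestr, text))
#elif defined(__GNUC__)
// GCC and compatible compilers: keep the array even though nothing reads it.
#define SRC_IDENT(name, text) \
    static const char name[] __attribute__((used)) = text
#else
// Unknown compiler: plain definition, which may well be optimized away.
#define SRC_IDENT(name, text) static const char name[] = text
#endif

SRC_IDENT(demo_src_id, "$Id: demo $");
```

The identifier is a macro parameter so that every file can pick its own, for the reasons given elsewhere in the thread.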

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-18 Thread Oleg Goldshmidt
On Tue, Oct 18, 2011 at 12:49 PM, Elazar Leibovich elaz...@gmail.com wrote:

 But you shouldn't care that they're having a file scope, do you?

Of course I do. Just as one example: suppose that this/dir/foo.c and
that/other/module/bar.c both #include "xyzzy.h". I want to catch a
build system bug that takes the header from two different places.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Baruch Siach
Hi Oleg,

On Mon, Oct 17, 2011 at 06:29:58PM +0200, Oleg Goldshmidt wrote:
...
 Now, put the above line in a C or C++ file, say foo.cc, and do the 
 following:
 
 $ g++ -g -O2 foo.cc -c -o foo.o
 $ ident foo.o
 foo.o:
  $Id: foo.cc 673 2011-10-17 09:48:11Z oleg $
 
 This works up to and including gcc 4.4, but it does not work with gcc
 4.5.1 or gcc 4.6.1 (the ident part does not show any keywords). The
 reason seems to be that the optimizer realizes that the static const
 is not used and eliminates it (remove -O2 and ident works fine).

The -O2, as well as -O and -Os, gcc options enable a set of specific 
optimizations that can each be turned off. The full list is at 
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html. Just go 
over this list and disable each optimization, until you find the one removing 
your static string.

baruch

-- 
 ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Nadav Har'El
On Mon, Oct 17, 2011, Oleg Goldshmidt wrote about Newer gcc swallow version 
control keywords:
 static const char foo_src_id[] = "$Id$";

I remember many years ago (when I was probably still using SCCS with its %..%
macros, and SCCS's what(1) instead of ident(1)), there was already an argument
on whether what you're trying to do should work, or whether proper compiler
behavior is to optimize these declarations away. In any case, people who used
this trick (rather than comments) *wanted* them not to be optimized away, and
they always were *not* optimized away. I guess that now that changed,
according to your report.

In any case, because there was always a fear that the compiler might optimize
these out, someone invented a new directive, #ident, as in:

#ident "$Id$"

The C preprocessor would pass this statement unchanged to the C compiler
(as it also does with #line), and the C compiler changed it to some static
constant that doesn't get optimized away.
Last time I checked, this worked, but I suggest you give it a shot again.
I think it has been in all C compilers I've seen in the last two decades,
but perhaps it rusted away because of disuse ;-)


 but this is non-portable, and preprocessor acrobatics such as
 
 #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
 #define USED(x) x __attribute__((used))
 #else
 #define USED(x) x
 #endif
 
 static const char USED(foo_src_id[]) = "$Id$";
 
 is incredibly ugly and verbose

Why is this incredibly ugly? You can even make it nicer by putting in ident.h:
#if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
#define USED(x) x __attribute__((used))
#else
#define USED(x) x
#endif
#define IDENT(x) static const char USED(foo_src_id[]) = x;

and then in each file just do
#include "ident.h"
IDENT("$Id$");

Doesn't look that bad...

 (such as -fkeep-static-consts) but they did not work, e.g.,
 -fkeep-static-consts emits static consts only when optimization is not
 on.

Apparently, the optimizer behavior you're reporting and the odd behavior of
-fkeep-static-consts you report is NOT new. Check out this bug report
from 6 (!) years ago:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20319

Ironically, this bug is still marked NEW :-)

 check whether it works? If it is a RedHat-specific or generic for all
 GCCs is important.

According to the aforementioned bug report, it has been this way in all
GCCs for over 6 years, if optimization was turned on... I didn't check myself
in years, though.


-- 
Nadav Har'El|Monday, Oct 17 2011, 
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Committee: A group of people that keeps
http://nadav.harel.org.il   |minutes and wastes hours.



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Shachar Shemesh
On 10/17/2011 06:29 PM, Oleg Goldshmidt wrote:
 Hi,

 I have a gcc-related question. Problematic platform is Fedora 15 with
 gcc 4.6.1, as well as Fedora 14 with gcc 4.5.1.

 I am used to keeping RCS/CVS/SVN keywords (e.g., $Id$) in all my code.
 In the case of C/C++ this normally amounts to

 static const char foo_src_id[] = "$Id$";
Leaving aside the question of whether that is a good idea - did you try
changing that to:
static const volatile char foo_src_id[] = "$Id$";
?

Shachar


-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Oleg Goldshmidt
Nadav Har'El n...@math.technion.ac.il writes:

 In any case, because there was always a fear that the compiler might
 optimize these out, someone invented a new directive, #ident, as in:

 #ident "$Id$"

This has always been there, but it has never been standard, AFAIK. It
is not a GCC extension, either. Most preprocessors don't barf on
directives they do not understand, but they may simply ignore #ident
which will lead to the same behaviour that I do not want.

When you say that it has worked for you with all sorts of compilers do
you mean that it actually produced the $Id$ string with ident or that
it didn't break? 

 Why is this incredibly ugly?

Because #including an extra header just for this purpose _is_ ugly,
IMHO (compared to just compiling with an extra option).

 Apparently, the optimizer behavior you're reporting and the odd behavior of
 -fkeep-static-consts you report is NOT new. Check out this bug report
 from 6 (!) years ago:

   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20319

I was wondering whether someone would catch my little mischief: this
bug report complains about the same effect in a slightly different
situation. Even with older GCC the $Id$ string is not there with just
-O2, but it is with -g -O2 (cf. my original posting - I put the
exact command line there for a reason). 

Yes, I agree with the opinion that -fkeep-static-consts should
override optimization (it is more specific), but it has never been the
case (or at least not for a very long time).

Thanks,

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Oleg Goldshmidt
Baruch Siach bar...@tkos.co.il writes:

 The -O2, as well as -O and -Os, gcc options enable a set of specific
 optimizations that can each be turned off. The full list is at
 http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html. Just
 go over this list and disable each optimization, until you find the
 one removing your static string.

I didn't go over all of them but over those which sounded like they
could be relevant - before posting.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Oleg Goldshmidt
Shachar Shemesh shac...@shemesh.biz writes:

 Leaving aside the question of whether that is a good idea

This saved my butt enough times in the past that I think it is ;-)

 - did you try
 changing that to:
 static const volatile char foo_src_id[] = "$Id$";

Hmm... const volatile hadn't occurred to me before, but I have just
tried it and it did not work.

Thanks!

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Shachar Shemesh
On 10/17/2011 10:29 PM, Oleg Goldshmidt wrote:
 - did you try
 changing that to:
 static const volatile char foo_src_id[] = "$Id$";
 Hmm... const volatile hadn't occurred to me before, but I have just
 tried it and it did not work.
Just tested it myself. It does, indeed, not work. I wonder why? Seems
like it SHOULD work. After all, that's what volatile is for, right?

Shachar
 Thanks!



-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Oleg Goldshmidt
Shachar Shemesh shac...@shemesh.biz writes:

 Just tested it myself. It does, indeed, not work. I wonder why? 
 Seems like it SHOULD work. After all, that's what volatile is for,
 right?

I suspected that const was more important than volatile, but it looks
(after I removed const) that what overrules volatile is the fact that
nothing uses the static variable (or const) - the compiler does not
care that it was declared volatile.

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Daniel Shahaf
Shachar Shemesh wrote on Mon, Oct 17, 2011 at 22:47:49 +0200:
 On 10/17/2011 10:29 PM, Oleg Goldshmidt wrote:
  - did you try
  changing that to:
  static const volatile char foo_src_id[] = "$Id$";
  Hmm... const volatile hadn't occurred to me before, but I have just
  tried it and it did not work.
 Just tested it myself. It does, indeed, not work. I wonder why? Seems
 like it SHOULD work. After all, that's what volatile is for, right?

Because it's also static?



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Oleg Goldshmidt

Nadav Har'El n...@math.technion.ac.il writes:

   #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
   #define USED(x) x __attribute__((used))
   #else
   #define USED(x) x
   #endif
   #define IDENT(x) static const char USED(foo_src_id[]) = x;

 and then in each file just do
 #include "ident.h"
 IDENT("$Id$");

The proper contents of ident.h would actually be

#ifndef _IDENT_H_
#define _IDENT_H_

#if defined(__GNUC__)
#define USED(x) x __attribute__((used))
#else
#define USED(x) x
#endif

#define IDENT(s,x) static const char USED(s[]) = x

#endif

- this is because one would want to use different identifiers in every
file to avoid a) clashes when keywords are used in header files (very
useful: which versions of the headers was this file compiled with?) and
b) possible effects of -fmerge-constants and friends.
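
To make point a) concrete, here is a small sketch of one translation unit that carries two keyword strings under different identifiers. The file names xyzzy.h and foo.cc, the identifiers xyzzy_h_id and foo_cc_id, and the looks_like_ident() helper are all hypothetical; the macros repeat the header above without the include guard so that the example stands alone.

```cpp
#include <cstring>

// Same macros as in the ident.h above, repeated so this compiles alone.
#if defined(__GNUC__)
#define USED(x) x __attribute__((used))
#else
#define USED(x) x
#endif
#define IDENT(s, x) static const char USED(s[]) = x

// As if pulled in from a hypothetical header xyzzy.h:
IDENT(xyzzy_h_id, "$Id: xyzzy.h $");

// As if written in the hypothetical source file foo.cc itself:
IDENT(foo_cc_id, "$Id: foo.cc $");

// Illustrative helper: does the string start like an expanded $Id keyword?
inline bool looks_like_ident(const char *s) {
    return std::strncmp(s, "$Id:", 4) == 0;
}
```

With a single fixed identifier, the second IDENT would be a redefinition error as soon as a header and the file including it both used the macro.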

-- 
Oleg Goldshmidt | p...@goldshmidt.org



Re: Newer gcc swallow version control keywords

2011-10-17 Thread Ghiora Drori
http://gcc.gnu.org/ml/gcc/2005-04/msg01429.html


 I do have three suggestions for you:

 1) The current way to tell the compiler not to throw away
apparently-unused data is __attribute__((used)), like this:

   static const char __attribute__((used)) rcs_sccs_id[] =
   "$Id: @(#)%M% %I% 20%E% %U% copyright 2005 %Q% string1\\ $";



On Mon, Oct 17, 2011 at 2:04 PM, Oleg Goldshmidt p...@goldshmidt.org wrote:


 Nadav Har'El n...@math.technion.ac.il writes:

 #if (__GNUC__ >= 4) && (__GNUC_MINOR__ > 4)
 #define USED(x) x __attribute__((used))
 #else
 #define USED(x) x
 #endif
 #define IDENT(x) static const char USED(foo_src_id[]) = x;
 
  and then in each file just do
  #include "ident.h"
  IDENT("$Id$");

 The proper contents of ident.h would actually be

 #ifndef _IDENT_H_
 #define _IDENT_H_

 #if defined(__GNUC__)
 #define USED(x) x __attribute__((used))
 #else
 #define USED(x) x
 #endif

 #define IDENT(s,x) static const char USED(s[]) = x

 #endif

 - this is because one would want to use different identifiers in every
 file to avoid a) clashes when keywords are used in header files (very
 useful: which versions of the headers was this file compiled with?) and
 b) possible effects of -fmerge-constants and friends.

 --
 Oleg Goldshmidt | p...@goldshmidt.org





-- 
Your time is limited, so don't waste it living someone else's life. Don't be
trapped by dogma -- which is living with the results of other people's
thinking. Don't let the noise of others' opinions drown out your own inner
voice. And most important, have the courage to follow your heart and
intuition. They somehow already know what you truly want to become.
Everything else is secondary.
Steve Jobs