On 18/09/12 22:59, Mike Frysinger wrote:
> the linker's --build-id and associated .note.gnu.build-id section.  you can't
> hash the entire object because it can change between compiles.  build-id lets
> you say "regardless of the hash of the entire object, we know the content that
> matters is unchanged".

Ah, excellent, this is the sort of detail I was looking for!

My own brief experimentation shows that static libraries contain troublesome datestamps, but object files appear to be reproducible, given the same source and command line (the case ccache handles).

Under what circumstances can the binary change but the build-id remain the same? I'm aware of line-number and file-path differences in the debug info. Is there anything else?

Anyway, as I understand it, ccache could hash the build-id section if there is one, and fall back to hashing the entire binary if there isn't.
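To make that concrete, here is a rough Python sketch of the idea, not anything ccache actually does. It assumes binutils' "readelf -n" is on the PATH and prints a "Build ID:" line for files that carry the note; the function names and the "build-id:"/"sha256:" key prefixes are made up for illustration.

```python
import hashlib
import re
import subprocess
from typing import Optional

# Assumption: "readelf -n" reports the note as "Build ID: <hex digits>".
_BUILD_ID_RE = re.compile(r"Build ID:\s*([0-9a-fA-F]+)")


def read_build_id(path: str) -> Optional[str]:
    """Return the build-id as lowercase hex, or None if it can't be found."""
    try:
        result = subprocess.run(
            ["readelf", "-n", path],
            capture_output=True,
            text=True,
            errors="replace",
            check=False,
        )
    except OSError:
        # readelf is not installed or could not be run: behave as if
        # the file had no build-id.
        return None
    match = _BUILD_ID_RE.search(result.stdout)
    return match.group(1).lower() if match else None


def cache_key(path: str) -> str:
    """Prefer the build-id; otherwise hash the entire file."""
    build_id = read_build_id(path)
    if build_id is not None:
        return "build-id:" + build_id
    with open(path, "rb") as handle:
        return "sha256:" + hashlib.sha256(handle.read()).hexdigest()
```

The prefix keeps the two kinds of key from ever colliding, and any file without a build-id (or any system without readelf) simply gets the whole-file hash.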

I'm a bit concerned about the build-id though. As I read it, the build-id can't tell the difference between a stripped binary and one with full debug info, and the two certainly produce different output (OK, a *very* smart tool could determine that, with a certain link command or script, two different inputs are equivalent, but let's not go there). It can't even distinguish an object that contains *only* debug info.

Hashing the entire binary could lead to additional cache misses in the case that the user has made minor, unimportant changes to the build, but in the normal case the object file will have come from the cache anyway so this won't be a problem.

The library datestamps problem can be got around by hashing the output of "ar p libNAME.a" (perhaps combined with "ar t libNAME.a", just to be safe, but certainly not with "-v"), or perhaps "objdump -j .note.gnu.build-id -s libNAME.a" if we want to use build-ids.

> > "-###" isn't meant to be a wildcard. That's an actual GCC option. I put
> > quotes around it because most shells would interpret the hashes as the
> > start of a comment.

> hmm, gotcha.  it does seem to include all the necessary info.  whether it's
> easy for a machine to parse across gcc versions is a diff question :).  seems
> to have changed subtly over time between 3.3.6 and 4.7.1.

Probably true, but it ought to be possible to determine whether we understand the output, and fall back to the old behaviour if we don't.
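As a rough sketch of what "understand it or fall back" might look like, in Python for brevity: the parser below assumes that the sub-commands in the "-###" output (which GCC writes to stderr) are the lines beginning with a space, and that their arguments use shell-style quoting. Both points are assumptions about the output format, not something I have checked across GCC versions, and parse_dry_run is a made-up name. Anything it cannot split cleanly yields None, which the caller would treat as "use the old behaviour".

```python
import shlex
from typing import List, Optional


def parse_dry_run(stderr_text: str) -> Optional[List[List[str]]]:
    """Return one argv list per sub-command, or None if not understood."""
    commands = []
    for line in stderr_text.splitlines():
        # Assumption: command lines are indented with a space; everything
        # else (version banner, configuration, etc.) is ignored.
        if not line.startswith(" ") or not line.strip():
            continue
        try:
            argv = shlex.split(line)
        except ValueError:
            # Unbalanced quotes: we do not understand this output.
            return None
        if not argv:
            return None
        commands.append(argv)
    # No command lines at all also counts as "not understood".
    return commands or None
```

Returning None for every doubtful case keeps the failure mode safe: the worst outcome is that the new logic is skipped.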

ccache mailing list
