Problem Statement:
Given a build of the OpenJDK, how can you find out what source was used to build this binary install?

Seed of a Solution:
With Mercurial, a single changeset id identifies the state of the complete source repository. If this changeset (or set of changesets) could somehow be recorded with the built bits, then given any build you could quickly and easily reconstruct the exact source files that were used at build time.

Problems:
We have a forest, not a single repository.
We often create source bundles (sources minus the SCM management data, e.g. ".hg"), so this needs to work when building from source bundles.

Possible Solution:
First issue: identifying a repository of the forest relative to the root of the forest. Each repository would get a managed file ".identification" containing information to help identify the repository. For example, the topmost OpenJDK one would have a ".identification" file containing:

    root=.
    directory=.
    description=Root of the JDK Source Tree

and the corba one would have:

    root=..
    directory=corba
    description=Corba Sources

etc. (The directory could be a deeper nested directory, like jdk/src/closed.)

This .identification file would be a permanent file at the root of the repository. It says that to get to the root of the forest, you 'cd ${root}'. If the repository is not located at ${root}/${directory}, then something is wrong, or the repository is not currently part of a forest.

Second issue: the changeset id. A second file called ".changeset" would not be a managed file. It would be created before the source bundles are created, and would not exist if it can't be created because you don't have repositories (building from raw source trees) or don't have access to 'hg'.
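To make the ${root}/${directory} rule concrete, here is a minimal shell sketch of the location check. The function names ident_value and check_identification are hypothetical and not part of the proposal; the sketch only assumes the key=value layout of ".identification" shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the proposal): verify that a repository
# sits at ${root}/${directory}, as its .identification file claims.

# ident_value FILE KEY: print the value of KEY from a key=value file.
ident_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# check_identification REPO_DIR
# Returns 0 if REPO_DIR is located where its .identification says it is,
# 1 if it is somewhere else, 2 if the file is missing or incomplete.
check_identification() {
    repo=$1
    ident="$repo/.identification"
    [ -f "$ident" ] || { echo "$repo: no .identification file" >&2; return 2; }

    root=$(ident_value "$ident" root)
    directory=$(ident_value "$ident" directory)
    if [ -z "$root" ] || [ -z "$directory" ]; then
        echo "$repo: incomplete .identification" >&2
        return 2
    fi

    actual=$(cd "$repo" && pwd -P) || return 2
    # Follow ${root} up to the forest root, then ${directory} back down.
    expected=$(cd "$repo/$root" 2>/dev/null && cd "$directory" 2>/dev/null && pwd -P) || expected=""

    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "$repo: not at \${root}/\${directory}, or not part of a forest" >&2
    return 1
}
```

Called as 'check_identification corba' from the forest root, this should succeed only when the corba repository really is the "corba" directory under the forest root that its own ".identification" points to.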
Each .changeset file would contain just a changeset=id line, created with:

    hg tip --template 'changeset={node}\n'

So somewhere this needs to happen, before source bundles are created and before the use of this data:

    TREES:=$(shell hg ftrees)

    # in a rule that runs before the source bundles are created:
    if [ "$(TREES)" != "" ] ; then \
      for i in $(TREES) ; do \
        (cd $$i && hg tip --template 'changeset={node}\n' > .changeset ) ; \
      done ; \
    fi

Third, all this data needs to be merged together into a file that could be used later to recreate the source tree by running:

    hg clone -r ${changeset} http://hg.openjdk.java.net/jdk7/${directory} ${directory}

as many times as needed.

The Makefiles would check for the .changeset files wherever they are used and allow for them not to exist, since they might not be there in all cases. When they are there, do something like:

    jdk_source_information.txt:
        $(RM) $@
        echo "# JDK Source Information" > $@
        if [ "$(TREES)" != "" ] ; then \
          for i in $(TREES) ; do \
            if [ -f $$i/.identification ] ; then \
              cat $$i/.identification >> $@ ; \
              if [ -f $$i/.changeset ] ; then \
                cat $$i/.changeset >> $@ ; \
              fi ; \
            fi ; \
          done ; \
        fi

Resulting in a file:

    # JDK Source Information
    root=.
    directory=.
    description=Root of the JDK Source Tree
    changeset=BIGHEXNUMBER
    root=..
    directory=corba
    description=Corba Sources
    changeset=BIGHEXNUMBER
    ...

left in the jdk install tree.

---

Just a first guess at a basic idea as to how this could work... Please don't assume the above is also an implementation; it's the basic idea of having members of the forest identify themselves, the idea of recording the changesets, and finally of leaving source information in the resulting binary build.

Comments?

-kto

P.S. Full RFE can be seen at: http://bugs.sun.com/view_bug.do?bug_id=6631003
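As an illustration of the third step (turning the recorded source information back into 'hg clone' commands), here is a rough shell sketch. The function name print_clone_commands is hypothetical, and the sketch assumes each record in jdk_source_information.txt starts with a root= line and that the changeset= line, when present, comes after directory=. It only prints the commands rather than running them. The root entry (directory=.) is emitted with the same template as the others and would probably need special handling in a real tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of the proposal).
# print_clone_commands FILE [BASE_URL]
# Print one 'hg clone' command per record that has a changeset recorded.
print_clone_commands() {
    awk -F= -v base="${2:-http://hg.openjdk.java.net/jdk7}" '
        /^#/ || NF < 2    { next }       # skip comments, blanks, and "..."
        $1 == "root"      { dir = "" }   # a new record starts here
        $1 == "directory" { dir = substr($0, index($0, "=") + 1) }
        $1 == "changeset" && dir != "" {
            printf "hg clone -r %s %s/%s %s\n", $2, base, dir, dir
            dir = ""
        }
    ' "$1"
}
```

Records without a changeset= line (for example, when building from raw source trees) produce no command, which matches the idea that the .changeset files might not be there in all cases.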