On Nov 7, 2010, at 1:29 PM, jhumble wrote:

> One possibility to get repeatable builds without filling up an artifacts
> repository too fast could be to make Maven store the fully qualified pom
> files in the artifacts repo and an md5 of the binary but not necessarily the
> actual binary. I know artifacts repos already store some of this
> information.
>
> That way you could make sure sufficient metadata is publicly available such
> that they can be reproduced, without using up loads of disk space. You could
> also happily delete older binaries, safe in the knowledge that people could
> reproduce them from the metadata in the artifacts repo.
One of the things I like about snapshots is that a snapshot simply means "latest". The trouble with timestamped snapshots is that they aren't guaranteed to exist (the repository is not typically assumed to be reliable), and they aren't 100% reproducible: the timestamp offset includes the time it took to build the artifact and all the artifacts before it, so there's no way to know exactly what point in time the build came from. Even if one could find the correct timestamp to check out from to get the same binary, whatever subsystem creates the timestamp on upload (wagon?) probably doesn't like being told what to call the snapshot.

It follows that the only way to get a reproducible build is either to tag the original sources or to know the SCM revision ID. The revision ID is a natural tag that is generated automatically, and it does not clutter the named tag space with thousands of tags that have no organizational meaning.

On my CI builds, the first step is to grab the revision ID from SVN and put it in a properties file that can be used when the UI is generated. Where the version number helps users identify the general features to expect of the current software, the revision ID is great for filing issues, so devs don't have to guess which sources have the issue. When the sources all come from the same SCM repository tree, the rev ID makes it a cinch to reproduce the build. Of course, a better solution would stay reproducible when the sources span multiple trees.

The rev ID seems really useful here for identifying reproducible builds without creating a release every time. Does it fit with your ideas? If so, a hypothetical repository manager plugin could maintain information about snapshot dependencies keyed on SCM rev ID, allowing for reproducibility without modifying Maven or the existing snapshot mechanics.
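To make the CI step concrete, here is a minimal sketch in Python of recording the revision ID in a properties file. It is not my actual build script: the function names, the `build.version` / `build.revision` keys, and the file name are all made up for illustration, and it assumes the `svnversion` command is on the PATH and run against an SVN working copy.

```python
import subprocess
from pathlib import Path


def svn_revision(working_copy="."):
    """Ask the `svnversion` CLI for the working copy's revision.

    Assumes `svnversion` is installed; a mixed or modified working copy
    yields strings such as "4123:4168M" rather than a single number.
    """
    result = subprocess.run(
        ["svnversion", working_copy],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def write_build_properties(path, version, revision):
    """Write a simple key=value properties file (hypothetical key names)."""
    lines = [
        f"build.version={version}",
        f"build.revision={revision}",
    ]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_build_properties(path):
    """Read the file back into a dict, skipping blanks and # comments."""
    props = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props
```

A CI job would call something like `write_build_properties("build.properties", "1.2-SNAPSHOT", svn_revision())` before the build proper, and the UI would read `build.revision` from that file when it is generated.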
Such a plugin might be able to generate a POM that has the extra rev ID metadata that the repo manager would recognize, allowing for existing SNAPSHOT-style identifiers to keep working for developer desktops (avoiding SCM thrash), but also providing reproducibility through synthetic POMs.
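As a rough illustration of what generating a synthetic POM could involve, the Python sketch below adds the rev ID to a POM's `<properties>` section. No such plugin exists as far as I know: the `scm.revision` property name and the function itself are hypothetical, and a real repository manager would need to agree on where this metadata lives.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"


def add_revision_property(pom_xml, revision, name="scm.revision"):
    """Return a copy of the POM with <properties><name>revision</name>.

    `name` is a hypothetical property; nothing in Maven interprets it.
    """
    # Keep the POM namespace as the default namespace on output.
    ET.register_namespace("", POM_NS)
    root = ET.fromstring(pom_xml)

    props = root.find(f"{{{POM_NS}}}properties")
    if props is None:
        props = ET.SubElement(root, f"{{{POM_NS}}}properties")

    # Reuse an existing entry so repeated runs do not add duplicates.
    tag = f"{{{POM_NS}}}{name}"
    entry = next((child for child in props if child.tag == tag), None)
    if entry is None:
        entry = ET.SubElement(props, tag)
    entry.text = revision

    return ET.tostring(root, encoding="unicode")
```

The version in the POM stays a plain SNAPSHOT identifier, so developer desktops resolve it as before, while the added property records which sources the artifact was built from.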
