On 12/29/2014 04:21 AM, Phil Steitz wrote:
> On 12/28/14 11:46 AM, Gilles wrote:
>> Hi.
>>
>> On Sun, 28 Dec 2014 09:43:34 +0100, Luc Maisonobe wrote:
>>> On 28/12/2014 00:22, sebb wrote:
>>>> On 27 December 2014 at 22:19, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>> On Sat, 27 Dec 2014 17:48:05 +0000, sebb wrote:
>>>>>> On 24 December 2014 at 15:11, Gilles <gil...@harfang.homelinux.org> wrote:
>>>>>>> On Wed, 24 Dec 2014 15:52:12 +0100, Luc Maisonobe wrote:
>>>>>>>> On 24/12/2014 15:04, Gilles wrote:
>>>>>>>>> On Wed, 24 Dec 2014 09:31:46 +0100, Luc Maisonobe wrote:
>>>>>>>>>> On 24/12/2014 03:36, Gilles wrote:
>>>>>>>>>>> On Tue, 23 Dec 2014 14:02:40 +0100, luc wrote:
>>>>>>>>>>>> This is a [VOTE] for releasing Apache Commons Math 3.4 from
>>>>>>>>>>>> release candidate 3.
>>>>>>>>>>>>
>>>>>>>>>>>> Tag name:
>>>>>>>>>>>> MATH_3_4_RC3 (signature can be checked from git using 'git tag -v')
>>>>>>>>>>>>
>>>>>>>>>>>> Tag URL:
>>>>>>>>>>>> <https://git-wip-us.apache.org/repos/asf?p=commons-math.git;a=commit;h=befd8ebd96b8ef5a06b59dccb22bd55064e31c34>
>>>>>>>>>>>
>>>>>>>>>>> Is there a way to check that the source code referred to above
>>>>>>>>>>> was the one used to create the JAR of the ".class" files?
>>>>>>>>>>> [Out of curiosity, not suspicion, of course...]
>>>>>>>>>>
>>>>>>>>>> Yes, you can look at the end of the META-INF/MANIFEST.MF file embedded
>>>>>>>>>> in the jar. The second-to-last entry is called Implementation-Build.
>>>>>>>>>> It is automatically created by maven-jgit-buildnumber-plugin and
>>>>>>>>>> contains the SHA1 identifier of the last commit used for the build.
>>>>>>>>>> Here, it is befd8ebd96b8ef5a06b59dccb22bd55064e31c34, so we can check
>>>>>>>>>> it really corresponds to the expected status of the git repository.
>>>>>>>>>
>>>>>>>>> Can this be considered "secure", i.e. can't this entry in the MANIFEST
>>>>>>>>> file be modified to be the checksum of the repository but with the
>>>>>>>>> .class files being substituted with those coming from another
>>>>>>>>> compilation?
>>>>>>>>
>>>>>>>> Modifying anything in the jar (either this entry within the manifest or
>>>>>>>> any class) will modify the jar signature. So as long as people do check
>>>>>>>> the global MD5, SHA1 or gpg signature we provide with our build, they
>>>>>>>> are safe to assume the artifacts are Apache artifacts.
>>>>>>>>
>>>>>>>> This is not different from how releases are done with subversion as the
>>>>>>>> source code control system, or even in C or C++ as the language. At some
>>>>>>>> point, the release manager does perform a compilation and the fellow
>>>>>>>> reviewers check the result. There is no foolproof process here, as
>>>>>>>> always when security is involved. Even using an automated build and
>>>>>>>> automatic signing on an Apache server would involve trust (i.e. one
>>>>>>>> should assume that the server has not been tampered with, that the build
>>>>>>>> process really does what it is expected to do, that the artifacts put to
>>>>>>>> review are really the ones created by the automatic process ...).
>>>>>>>>
>>>>>>>> Another point is that what we officially release is the source, which
>>>>>>>> can be reviewed by external users.
>>>>>>>> The binary parts are merely a convenience.
>>>>>>>
>>>>>>> That's an interesting point to come back to since it looks like the
>>>>>>> most time-consuming part of a release is not related to the sources!
>>>>>>>
>>>>>>> Isn't it conceivable that a release could just be a commit identifier
>>>>>>> and a checksum of the repository?
>>>>>>>
>>>>>>> If the binaries are just a convenience, why put so much effort into them?
>>>>>>> As a convenience, the artefacts could be produced after the release,
>>>>>>> accompanied with all the "caveat" notes which you mentioned.
>>>>>>>
>>>>>>> That would certainly increase the release rate.
>>>>>>
>>>>>> Binary releases still need to be reviewed to ensure that the correct
>>>>>> NOTICE and LICENSE (N & L) files are present, and that the archives
>>>>>> don't contain material with disallowed licenses.
>>>>>>
>>>>>> It's not unknown for automated build processes to include files that
>>>>>> should not be present.
>>>>>
>>>>> I fail to see the difference in principle between the "release" context
>>>>> and, say, the daily snapshot context.
>>>>
>>>> Snapshots are not (and should not be) promoted to the general public as
>>>> releases of the ASF.
>>>>
>>>>> What I mean is that there seems to be a contradiction between saying that
>>>>> a "release" is only about _source_ and the obligation to check
>>>>> _binaries_.
>>>>
>>>> There is no contradiction here.
>>>> The ASF releases source; it is required in a release.
>>>> Binaries are optional.
>>>> That does not mean that the ASF mirror system can be used to
>>>> distribute arbitrary binaries.
>>>>
>>>>> It can occur that disallowed material is, at some point in time, part of
>>>>> the repository and/or the snapshot binaries.
>>>>> However, what is forbidden is... forbidden, at all times.
>>>>
>>>> As with most things, this is not a strict dichotomy.
>>>>
>>>>> If it is indeed a problem to distribute forbidden material, shouldn't
>>>>> this be corrected in the repository? [That's indeed what you did with
>>>>> the blocking of the release.]
>>>>
>>>> If the repo is discovered to contain disallowed material, it needs to
>>>> be removed.
>>>>
>>>>> Then again, once the repository is "clean", it can be tagged and that
>>>>> tagged _source_ is the release.
>>>>
>>>> Not quite.
>>>>
>>>> A release is a source archive that is voted on and distributed via the
>>>> ASF mirror system.
>>>> The contents must agree with the source tag, but the source tag is not
>>>> the release.
>>>>
>>>>> Non-compliant binaries would thus only be the result of a "mistake"
>>>>> (if the build system is flawed, it's another problem, unrelated to
>>>>> the released contents, which is _source_) to be corrected per se.
>>>>
>>>> Not so. There are other failure modes.
>>>>
>>>> An automated build obviously reduces the chances of mistakes, but it
>>>> can still create an archive containing files that should not be there.
>>>> [Or indeed, omit files that should be present.]
>>>> For example, the workspace contains spurious files which are
>>>> implicitly included by the assembly instructions.
>>>> Or the build process creates spurious files that are incorrectly added
>>>> to the archive.
>>>> Or the build incorrectly includes jars that are supposed to be
>>>> provided by the end user,
>>>> etc.
>>>>
>>>> I have seen all the above in RC votes.
>>>> There are probably other failure modes.
>>>>
>>>>> My proposition is that it's an independent step: once the build
>>>>> system is adjusted to the expectations, "correct" binaries can be
>>>>> generated from the same tagged release.
>>>>
>>>> It does not matter when the binary is built.
>>>> If it is distributed by the PMC as a formal release, it must not
>>>> contain any surprises, e.g. it must be licensed under the AL.
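
The failure modes listed above (spurious files, bundled jars, missing files) are the kind of thing a small script can flag before a vote. What follows is only a minimal sketch in Java under stated assumptions: the class name ArchiveContentCheck, the method findProblems and the allow-list of path prefixes are hypothetical, and the LICENSE/NOTICE locations under META-INF are an assumption about the jar layout, not something stated in this thread.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of any Commons build tooling.
final class ArchiveContentCheck {

    /**
     * Lists suspicious things in a jar: files outside the allowed path
     * prefixes, nested jars, and missing LICENSE/NOTICE files.
     * An empty result only means that none of these checks fired.
     */
    static List<String> findProblems(Path jar, List<String> allowedPrefixes) throws IOException {
        List<String> problems = new ArrayList<>();
        boolean hasLicense = false;
        boolean hasNotice = false;
        try (JarFile jf = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // only files are of interest here
                }
                String name = entry.getName();
                // Assumed locations; adjust to the project's actual layout.
                hasLicense |= name.equals("META-INF/LICENSE") || name.equals("META-INF/LICENSE.txt");
                hasNotice |= name.equals("META-INF/NOTICE") || name.equals("META-INF/NOTICE.txt");
                if (name.endsWith(".jar")) {
                    problems.add("bundled jar: " + name);
                } else if (!startsWithAny(name, allowedPrefixes)) {
                    problems.add("unexpected file: " + name);
                }
            }
        }
        if (!hasLicense) {
            problems.add("missing LICENSE under META-INF");
        }
        if (!hasNotice) {
            problems.add("missing NOTICE under META-INF");
        }
        return problems;
    }

    private static boolean startsWithAny(String name, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

The allow-list is project-specific and would have to be written by someone who knows what the build is supposed to produce. A check like this cannot say anything about licensing; it only narrows down what a reviewer still has to look at by hand.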
>>>>
>>>> It is therefore vital that the contents are as expected from the
>>>> build.
>>>>
>>>> Note also that a formal release becomes an act of the PMC by the
>>>> voting process.
>>>> The ASF can then assume responsibility for any legal issues that
>>>> may arise.
>>>> Otherwise it is entirely the personal responsibility of the person who
>>>> releases it.
>>>
>>> I think the last two points are really important: binaries must be
>>> checked and the foundation provides a legal protection for the project
>>> if something weird occurs.
>>>
>>> I also think another point is important: many if not most users do
>>> really expect binaries and not source. From our internal Apache point
>>> of view, these are a by-product. For many others they are the important
>>> thing. This is especially true in Maven land as dependencies are
>>> automatically retrieved in binary form, not source form. So the Maven
>>> central repository as a distribution system is important.
>>>
>>> Even if for some security reason it sounds at first thought logical to
>>> rely on source only and compile oneself, in an industrial context
>>> project teams do not have enough time to do it for all their
>>> dependencies, so they use binaries provided by trusted third parties. A
>>> long time ago, I compiled a lot of free software tools for the
>>> department I worked for at that time. I do not do this anymore, and
>>> trust the binaries provided by the packaging team for a distribution
>>> (typically Debian). They do rely on source and compile themselves. Hey,
>>> I even think Emmanuel here belongs to the Debian Java team ;-) I guess
>>> such teams that do rely on source are rather the exception than the
>>> rule.
>>> The other examples I can think of are packaging teams,
>>> development teams that need bleeding edge (and will also directly
>>> depend on the repository, not even the release), projects that need to
>>> introduce their own patches and people who have critical needs (for
>>> example when safety of people is concerned or when they need full
>>> control for legal or contractual reasons). Many other people download
>>> binaries directly and would simply not consider using a project if it
>>> is not readily available: they don't have time for this and don't want
>>> to learn how to build tens or hundreds of different projects they simply
>>> use.
>>>
>>
>> I do not disagree with anything said on this thread. [In particular, I
>> did not at all imply that any one committer could take responsibility
>> for releasing unchecked items.]
>>
>> I'm simply suggesting that what is called the release process/management
>> could be made simpler (and _consequently_ could lead to more regularly
>> releasing the CM code), by separating the concerns.
>> The concerns are
>> 1. "code" (the contents), and
>> 2. "artefacts" (the result of the build system acting on the "code").
>>
>> Checking of one of these is largely independent from checking the other.
>
> Unfortunately, not really. One principle that we have (maybe not
> crystal clear in the release doco) is that when we do distribute
> binaries, they should really be "convenience binaries" which means
> that everything needed to create them is in the source or its
> documented dependencies. What that means is that what we tag as the
> source release needs to be able to generate any binaries that we
> subsequently release. The only way to really test that is to
> generate the binaries and inspect them as part of verifying the release.
>
> As others have pointed out, anything we release has to be verified
> and voted on.
> As RM and reviewer, I think it is actually easier to
> roll and verify source and binaries together.
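
Part of that inspection can be scripted. As a minimal sketch in Java, one could compare the build recorded in the jar with the tagged commit. The class and method names below are made up for illustration, and the sketch assumes that the Implementation-Build manifest entry mentioned earlier in the thread holds, or at least contains, the 40-character commit id.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper for illustration; not part of any Commons build tooling.
final class ManifestBuildCheck {

    /** Returns the Implementation-Build main attribute, or null if the jar has none. */
    static String implementationBuild(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest manifest = jf.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue("Implementation-Build");
        }
    }

    /**
     * True if the manifest claims the jar was built from the given commit.
     * Assumption: the attribute contains the full SHA1, possibly next to
     * other text, so a substring match is used rather than equality.
     */
    static boolean builtFrom(Path jar, String expectedCommit) throws IOException {
        if (!expectedCommit.matches("[0-9a-fA-F]{40}")) {
            throw new IllegalArgumentException("expected a full 40-character SHA1");
        }
        String recorded = implementationBuild(jar);
        return recorded != null
                && recorded.toLowerCase(Locale.ROOT).contains(expectedCommit.toLowerCase(Locale.ROOT));
    }
}
```

For RC3, the call would be builtFrom(jar, "befd8ebd96b8ef5a06b59dccb22bd55064e31c34"). This only shows that the manifest claims the expected commit. As noted earlier in the thread, the entry can be edited, so it is no substitute for checking the published checksums and signature.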
Personally, I do not think that the RM tasks are that much work or
cumbersome, once you have done them a few times. The bigger problem I see
is related to the voting process, as there are many people looking at a
release from very different POVs and finding problems that an RM (or
single developer of a component) may not be aware of or able to test
himself, thus delaying the release process a lot. A more automated way of
creating and especially testing the correctness of releases would help
here.

Thomas

> Phil
>
>> [All the more so since, as you said, no fool-proof link between the two
>> can be ensured: From a security POV, checking the former requires a code
>> review, while using the latter requires trust in the build system.]
>>
>> Thus we could release the "code", after checking and voting on the
>> concerned elements (i.e. the repository state corresponding to a
>> specific tag + the web site).
>>
>> Then we could release the "binaries", as a convenience, after checking
>> and voting on the concerned elements (i.e. the files about to be
>> distributed).
>>
>> I think that it's an added flexibility that would, for example, allow
>> the tagging of the repository without necessarily releasing binaries
>> (i.e. not involving that part of the work); and to release binaries
>> (say, at regular intervals) based on the latest tagged code (i.e. not
>> involving the work about solving/evaluating/postponing issues).
>>
>> [I completely admit that, at first, it might look a little more
>> confusing for the plain user, but (IIUC) it would be a better
>> representation of the reality covered by stating that the ASF
>> releases source code.]
>>
>> Best regards,
>> Gilles
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org