As discussion has settled somewhat, I would like to call another vote to accept the latest patch described in https://issues.apache.org/jira/browse/RIVER-432
The patch removes the archived jar files for Velocity and ASM and replaces them with Apache Ivy scripts that download the jars from Maven Central the first time a build is done. From then on, the jar files are in Ivy’s repository (for more info, see http://ant.apache.org/ivy).

Voting will remain open at least until 2000 UTC Feb 13, 2014.

Cheers,

Greg.

On Jan 3, 2014, at 12:57 PM, Greg Trasuk <tras...@stratuscom.com> wrote:

>
> On Jan 3, 2014, at 5:25 AM, Simon IJskes - QCG <si...@qcg.nl> wrote:
>
>> In order to gain some time to discuss this first I will vote -1.
>>
>> First, we decided to NOT remove velocity builder.
>
> When I read the email chain, my impression was that we wanted to remove it
> (to quote you Sim, “To be honest, I hate it”), but there was a dependency on
> it in the ‘extras’ folder that was added in the trunk branch. As there is no
> ‘extras’ in the 2.2 branch, and that is what this patch applies to, I thought
> it was fair to remove VelocityConfigurationBuilder from the 2.2 branch.
> Perhaps we should revisit the ConfigurationBuilder approach in another
> thread. For now I’ll spin another patch that doesn’t remove
> VelocityConfigurationBuilder.
>
>>
>> Second, no need to remove the jars as specified in your own comments on
>> RIVER-432.
>>
>> Pulling in external jars at compile time, shall we start here?
>>
>> They are already in the svn. They are already in the build scripts. What
>> does this patch fix? No legal problems?
>>
>
> Apache policy is somewhat unclear on this point. One needs to examine the
> mailing lists for clues on what we should really do. I will argue that:
>
> 1 - The fundamental distribution model of Apache is source code, not binaries.
> 2 - Distributing binaries is tolerated but not encouraged. Since the svn
> repository can be seen as a distribution point, binaries in svn are also
> tolerated but not encouraged.
> 3 - Downloading dependency binaries at build time is technologically easy,
> provides the same guarantees as putting them in cvs, and avoids the question
> of effectively distributing someone else’s code.
>
> All these together suggest that although we’re technically OK to put
> dependency jars in a “-deps” package (note that the status quo _is_
> unacceptable - at the very least, we need to restructure the dependencies
> into a “-deps” binary package), there is some policy uncertainty which we
> avoid totally by having dependencies downloaded from a known-good source at
> build time.
>
> Let’s examine these points:
>
> 1 - The fundamental distribution model of Apache is source code, not
> binaries. Apache began with httpd. Back in those days, “Open Source”
> software was distributed in source form only, simply because it was mostly
> intended for Unix systems (then later Linux). I recall the first release of
> Perl coming down as a ten-part uunet news message. Part of this distribution
> model was practical necessity - system differences made it necessary to
> compile your software on the hardware it was going to run on. Part of it was
> open-source philosophy. Having the source code meant that you could take a
> look at it and verify that it wasn’t hazardous to your operations.
>
> In any case, the way we used to use open source software was (“./configure;
> make; make install”). If the software had dependencies, you built them from
> source, for the same reasons.
>
> Now, as time has gone on, we’ve gotten used to having the JVM as a common
> runtime, and jar files as a common binary distribution medium. But the
> Apache Foundation’s mandate is to produce open source software that is freely
> usable under the Apache License. That means source code is at the heart of
> Apache, despite the rest of the world’s comfort with binaries. Hence Roy’s
> statements in (1):
>
>> Class files are not open source. Jar files filled with class files
>> are not open source.
>> The fact that they are derived from open source
>> is applicable only to what we allow projects to be dependent upon,
>> not what we vote on as a release package. Release votes are on verified
>> open source artifacts. Binary packages are separate from source packages.
>> One cannot vote to approve a release containing a mix of source and
>> binary code because the binary is not open source and cannot be verified
>> to be safe for release (even if it was derived from open source).
>>
>> I thought that was frigging obvious. Why do I need to write documentation
>> to explain something that is fundamental to the open source definition?
> He’s talking about binary packages, not jar files in svn, but I read that
> (and many other emails) as a distaste for binary distributions.
>
> In fact, if you look at Apache httpd’s download page, it doesn’t appear that
> the Apache project publishes any Unix or Linux binaries. They leave that to
> other organizations.
>
> 2 - Distributing binaries is tolerated but not encouraged. Since the svn
> repository can be seen as a distribution point, binaries in svn are also
> tolerated but not encouraged.
>
> It’s hard to find a single reference that encapsulates this outlook, but
> that’s the impression I get from reading the various mailing lists. For
> instance, Sam Ruby says (2):
>> IMO, our projects release source. So, our projects should not maintain
>> object/binary artifacts
>> in their svn release tree, regardless of license (category a or b).
> There is some debate on whether the svn tree should be considered a
> distribution point. Incubator releases are regularly called out for not
> having “NOTICE” and “RELEASE” files at all reasonable checkout points in svn.
> [LEGAL-26] (https://issues.apache.org/jira/browse/LEGAL-26) concerns this
> and remains unresolved.
>
> Doug Cutting (3) says:
>> On Mon, Sep 16, 2013 at 2:50 AM, Stephen Connolly
>> <stephen.alan.conno...@gmail.com> wrote:
>>> * Source control is not an Apache distribution and hence we do not need to
>>> have LICENSE and NOTICE files in source control, it can be a nice
>>> convenience, but there is no *requirement*.
>>
>> I think perhaps you're looking for clear lines where things are
>> actually a bit fuzzy. Certainly releases are official distributions
>> and need LICENSE and NOTICE files. That line is clear. On the other
>> hand, we try to discourage folks from thinking that source control is
>> a distribution. Rather we wish it to be considered our shared
>> workspace, containing works in progress, not yet always ready for
>> distribution to folks outside the foundation. But, since we work in
>> public, folks from outside the foundation can see our shared workspace
>> and might occasionally mistake it for an official distribution. We'd
>> like them to still see a LICENSE and NOTICE file. So it's not a
>> hard-and-fast requirement that every tree that can possibly be checked
>> out have a LICENSE and NOTICE file at its root, but it's a good
>> practice for those trees that are likely to be checked out have them,
>> so that folks who might consume them are well informed.
> Again, he’s not talking directly about jar files in svn, however I think his
> statement that “since we work in public, folks from outside the foundation
> can see our shared workspace and might occasionally mistake it for an
> official distribution” applies here. Fundamentally, the decision on how and
> where to distribute ‘velocity.jar’ rightly belongs with the Velocity group
> and I don’t think we ought to redistribute it.
>
> 3 - Downloading dependency binaries at build time is technologically easy,
> provides the same guarantees as putting them in cvs, and avoids the question
> of effectively distributing someone else’s code.
>
> There doesn’t seem to be clear policy in the ASF on this, as evidenced by the
> frequent debates on it, and the lack of documentation. I’ve tried to lay out
> an argument that having jars in svn is not encouraged by the ASF (really,
> it’s not in line with the ASF’s charter), even if it isn’t disallowed. You
> may disagree, and I won’t claim I’ve made a strong argument, simply because
> the policy isn’t clear. So instead of going through arguments that amount to
> differences of opinion on Apache policy, let’s use a technological solution
> that is simple, common, and avoids the question altogether, by automatically
> downloading the dependencies at build time.
>
> Projects that use Maven do this automatic download as standard practice
> (that’s what Maven does, and that’s what the Maven Central infrastructure is
> there to support). We don’t use Maven, which is fine (our customers have
> asked us to make our binaries available in Maven Central, and we’ve done
> that). Apache Ivy is a popular add-on to Apache Ant that provides similar
> dependency resolution to an Ant-based build.
>
> I was a little surprised how easy it was to persuade Ivy to get the required
> dependencies at build time. The “ivy.xml” file is 39 lines including the ASL
> header (which by the way I forgot to include in the patch - I’ll fix that).
> There are about 50 lines added to ‘build.xml’ to download Ivy and then
> download the required jar files.
>
> So, given that the status quo seems to be unacceptable (Roy talks about not
> having jar files in the open-source trees, only in “-deps” and “tools”
> trees), we have two options:
>
> (a) - restructure the svn repository and the build to allow a separate
> “-deps” distribution.
> This wouldn’t affect our binary distributions (note
> that I’m no longer using the term “binary release”), but to build from
> source, a user would have to download a separate archive, unpack it, and then
> copy those files into the directory that was unpacked from the source
> release. This option effectively still has us distributing dependent
> binaries, which is not the goal of the ASF, just with a disclaimer that says
> “this isn’t an ASF release, it’s just a binary distribution put together by a
> committer for your convenience, so don’t feel you should trust it too much”.
>
> (b) - use Ivy to get the jars from Maven Central automatically as part of the
> build.
>
> I think (b) is the option that causes the least hassle for our downstream
> consumers, and not much hassle for us.
>
>
>> Pulling external jars at compile time also makes it more difficult to
>> certify the software. In order to certify the software you need to establish
>> a baseline that will be guaranteed the same, even if you pull it from the
>> archive 10 years later.
>
> As I said above, Apache’s focus is creating software that can be built from
> source, not distributing binaries (note that QCG or another company might
> have a different focus, and is perfectly able to distribute binaries under
> the Apache license). I think a reasonably prudent user would ask “How can I
> trust the ‘velocity.jar’ that’s included in this binary?” And the answer
> would be either “because I built it from source and installed it in my
> corporate repository” (very cautious, but not unheard-of) or “It was
> published by the Velocity group to a trusted repository, Maven Central” (more
> common).
>
> If you look in the “ivy.xml” file you’ll see that the dependencies are
> specified using Maven-style “group-artifact-version” coordinates. Old
> versions are maintained in Maven Central forever.
> I suppose it’s possible
> that a publisher could convince Maven Central to remove a version for some
> reason (security or licensing problems perhaps), but then, would we want to
> be distributing that version in a “-deps” package?
>
> I agree that it’s not enough to just say “you need to download such-and-such
> jar”, hence the automatic download managed by “Ivy” from Maven Central.
>
>> It is not a high level project that builds on several frameworks. It is a
>> low-level system library. The stuff below the stack is minimal. The number of
>> jars we use is limited. Why bother?
>>
>
> In the currently released branches, the dependencies are limited to ASM and
> Velocity. Looking forward to the trunk branch and the qa_refactor branch,
> the number of external dependencies seems to be increasing (IMO I don’t like
> that, because I also view River as a low level system library, but I’m only
> one PMC member). We need to get in front of the problem before we start
> distributing large numbers of dependencies.
>
> This point rolls in with the general question of jar files in version
> control. I was always taught that version control was for source code, and
> putting binaries into version control was a bad idea. In addition, there are
> practical problems - with older systems like cvs, even doing an update or
> commit effectively downloads the binaries, which slows things down if there
> are large binary files. On newer distributed version control systems like
> git or Mercurial, the entire repository, including all versions of binary
> artifacts, comes down with the project checkout. Currently, we have one
> version of relatively few jar files in our repository, so it’s not a major
> issue. But it gets worse as time goes on. So I suggest we work out the
> technology now to avoid the problem.
>
>> Gr. Simon
>>
>
> Thanks for the questions, Sim. I hope you’ll come around to removing your
> ‘-1’.
>
> Cheers,
>
> Greg
>
> Footnotes
> ---------
>
> (1) - Roy Fielding - http://s.apache.org/roy-binary-deps-1
> (2) - Sam Ruby - http://s.apache.org/r5
> (3) - Doug Cutting - http://s.apache.org/GNP
>
>> On 02-01-14 18:22, Greg Trasuk wrote:
>>>
>>> Hello all:
>>>
>>> Please have a look at the patch mentioned below and cast a vote on it.
>>>
>>> The main idea is to remove the dependency jar files from the source
>>> distribution. As a side effect of using Ivy, it becomes reasonable to
>>> remove them from the svn archive as well. Also, the Velocity dependency
>>> was there to support the VelocityConfigurationBuilder. We had discussed
>>> removing that component, so rather than move that dependency to Ivy, I’ve
>>> removed VelocityConfigurationBuilder.
>>>
>>> It’s arguable whether the VelocityConfigurationBuilder was part of the
>>> official Jini API (I see it as a utility, not API), so I don’t think this
>>> commit actually requires a vote. However, it does seem like a significant
>>> change to the build process that ought to be reviewed. So I propose to
>>> treat this as a “lazy consensus” vote, and will commit the change to the
>>> 2.2 branch if there are no objections in 72 hours (i.e. 1730UTC 20140105).
>>>
>>> At the same time, based on discussions over on
>>> gene...@incubator.apache.org, I’ll withdraw my assertion that we can’t have
>>> jars in svn. Those interested may want to check out the thread at
>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201312.mbox/%3C01B04CC4-95B8-4A39-BC16-04BAA4269B65%40stratuscom.com%3E
>>>
>>> Cheers,
>>>
>>> Greg.
>>>
>>> On Jan 2, 2014, at 12:05 PM, Greg Trasuk (JIRA) <j...@apache.org> wrote:
>>>
>>>>
>>>> [
>>>> https://issues.apache.org/jira/browse/RIVER-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>>> ]
>>>>
>>>> Greg Trasuk updated RIVER-432:
>>>> ------------------------------
>>>>
>>>> Attachment: river-2_2_remove_jars.diff
>>>>
>>>> The attached patch for the 2.2 branch does the following:
>>>> - removes the 'asm' directory and 'tests/lib' directories which currently
>>>> contain the asm library, mockito, and junit jars.
>>>> - Modifies 'build.xml', 'common.xml', and adds 'ivy.xml' so that the
>>>> Apache Ivy ant plugin is downloaded at build time, and then used to
>>>> retrieve the libraries mentioned above from Maven Central. This removes
>>>> the need to have the jar files in svn.
>>>> - Removes (as per discussion
>>>> http://mail-archives.apache.org/mod_mbox/river-dev/201211.mbox/%3C509B99E3.6080800%40qcg.nl%3E)
>>>> the VelocityConfigBuilder, and associated Velocity jars. Note that the
>>>> 'extras' folder is not present in the 2.2 branch, so Sim's last comments
>>>> in the thread do not apply.
>>>>
>>>>> Jar files in svn and src distributions
>>>>> --------------------------------------
>>>>>
>>>>> Key: RIVER-432
>>>>> URL: https://issues.apache.org/jira/browse/RIVER-432
>>>>> Project: River
>>>>> Issue Type: Bug
>>>>> Reporter: Greg Trasuk
>>>>> Attachments: river-2_2_remove_jars.diff
>>>>>
>>>>>
>>>>> Recent traffic on the incubator lists has pointed out that including jar
>>>>> files for dependencies in the subversion repository and the source
>>>>> distributions is against Apache policy.
>>>>> In River, the following libraries appear in the Subversion repository and
>>>>> the source distributions (these are from trunk, a smaller set appear in
>>>>> the 2.2 branch):
>>>>> animal-sniffer
>>>>> asm
>>>>> bouncy-castle
>>>>> dnsjava
>>>>> high-scale-lib
>>>>> rc-libs
>>>>> velocity
>>>>> They all have to go. What are we using them for?
>>>>> As I understand it, we
>>>>> were going to remove the VelocityConfigurationBuilder, so that's not a
>>>>> problem. Some of the others are available from Maven Central, so we can
>>>>> get them at build time using Ivy or another build tool. Which ones are
>>>>> actually required? And where did they come from?
>>>>
>>>>
>>>>
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.1.5#6160)
>>>
>>
>>
>> --
>> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
>> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397
>
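
For readers who have not used Ivy, the “ivy.xml” approach discussed in this thread can be sketched roughly as follows. This is not the file attached to RIVER-432: the organisation and module names, and the exact coordinates and versions shown for ASM and Velocity, are assumptions chosen for illustration, and the actual patch may well differ.

```xml
<!-- Illustrative sketch only; not the ivy.xml attached to RIVER-432. -->
<ivy-module version="2.0">
    <!-- Hypothetical identity for the project being built. -->
    <info organisation="org.apache.river" module="river"/>
    <dependencies>
        <!-- Maven-style coordinates: org = group, name = artifact, rev = version. -->
        <!-- The versions below are assumed; pin whatever the branch actually uses. -->
        <dependency org="asm" name="asm" rev="3.2"/>
        <dependency org="org.apache.velocity" name="velocity" rev="1.7"/>
    </dependencies>
</ivy-module>
```

Pinning an exact 'rev', rather than a dynamic revision such as 'latest.release', is what should keep the resolved jars the same from one build to the next, which bears on the certification concern raised in the thread.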