Sim: Any further thoughts on the below discussion? I’ve added a new patch to the JIRA issue that doesn’t remove VelocityConfigurationBuilder.
Just as another data point, I happened to do a fresh checkout of River over the weekend, and with the embedded binaries it took 14 minutes.

Also, if you’re not familiar with Ivy, it works essentially the same way as Maven in that it downloads the dependencies from Maven Central to a local repository, so in most cases the download only happens once - it isn’t repeated for every build (unless one deletes the local repository - that sometimes makes sense inside Jenkins, but isn’t usually required for a developer).

Cheers,

Greg.

On Jan 3, 2014, at 12:57 PM, Greg Trasuk <tras...@stratuscom.com> wrote:

>
> On Jan 3, 2014, at 5:25 AM, Simon IJskes - QCG <si...@qcg.nl> wrote:
>
>> In order to gain some time to discuss this first, I will vote -1.
>>
>> First, we decided to NOT remove velocity builder.
>
> When I read the email chain, my impression was that we wanted to remove it
> (to quote you Sim, “To be honest, I hate it”), but there was a dependency on
> it in the ‘extras’ folder that was added in the trunk branch. As there is no
> ‘extras’ in the 2.2 branch, and that is what this patch applies to, I thought
> it was fair to remove VelocityConfigurationBuilder from the 2.2 branch.
> Perhaps we should revisit the ConfigurationBuilder approach in another
> thread. For now I’ll spin another patch that doesn’t remove
> VelocityConfigurationBuilder.
>
>>
>> Second, no need to remove the jars as specified in your own comments on
>> RIVER-432.
>>
>> Pulling in external jars at compile time, shall we start here?
>>
>> They are already in the svn. They are already in the build scripts. What
>> does this patch fix? No legal problems?
>>
>
> Apache policy is somewhat unclear on this point. One needs to examine the
> mailing lists for clues on what we should really do. I will argue that:
>
> 1 - The fundamental distribution model of Apache is source code, not binaries.
> 2 - Distributing binaries is tolerated but not encouraged. Since the svn
> repository can be seen as a distribution point, binaries in svn are also
> tolerated but not encouraged.
> 3 - Downloading dependency binaries at build time is technologically easy,
> provides the same guarantees as putting them in svn, and avoids the question
> of effectively distributing someone else’s code.
>
> All these together suggest that although we’re technically OK to put
> dependency jars in a “-deps” package (note that the status quo _is_
> unacceptable - at the very least, we need to restructure the dependencies
> into a “-deps” binary package), there is some policy uncertainty which we
> avoid totally by having dependencies downloaded from a known-good source at
> build time.
>
> Let’s examine these points:
>
> 1 - The fundamental distribution model of Apache is source code, not
> binaries.
>
> Apache began with httpd. Back in those days, “Open Source”
> software was distributed in source form only, simply because it was mostly
> intended for Unix systems (then later Linux). I recall the first release of
> Perl coming down as a ten-part uunet news message. Part of this distribution
> model was practical necessity - system differences made it necessary to
> compile your software on the hardware it was going to run on. Part of it was
> open-source philosophy. Having the source code meant that you could take a
> look at it and verify that it wasn’t hazardous to your operations.
>
> In any case, the way we used to use open source software was “./configure;
> make; make install”. If the software had dependencies, you built them from
> source, for the same reasons.
>
> Now, as time has gone on, we’ve gotten used to having the JVM as a common
> runtime, and jar files as a common binary distribution medium. But the
> Apache Foundation’s mandate is to produce open source software that is freely
> usable under the Apache License. That means source code is at the heart of
> Apache, despite the rest of the world’s comfort with binaries. Hence Roy’s
> statements in (1):
>
>> Class files are not open source. Jar files filled with class files
>> are not open source. The fact that they are derived from open source
>> is applicable only to what we allow projects to be dependent upon,
>> not what we vote on as a release package. Release votes are on verified
>> open source artifacts. Binary packages are separate from source packages.
>> One cannot vote to approve a release containing a mix of source and
>> binary code because the binary is not open source and cannot be verified
>> to be safe for release (even if it was derived from open source).
>>
>> I thought that was frigging obvious. Why do I need to write documentation
>> to explain something that is fundamental to the open source definition?
> He’s talking about binary packages, not jar files in svn, but I read that
> (and many other emails) as a distaste for binary distributions.
>
> In fact, if you look at Apache httpd’s download page, it doesn’t appear that
> the Apache project publishes any Unix or Linux binaries. They leave that to
> other organizations.
>
> 2 - Distributing binaries is tolerated but not encouraged. Since the svn
> repository can be seen as a distribution point, binaries in svn are also
> tolerated but not encouraged.
>
> It’s hard to find a single reference that encapsulates this outlook, but
> that’s the impression I get from reading the various mailing lists. For
> instance, Sam Ruby says (2):
>> IMO, our projects release source. So, our projects should not maintain
>> object/binary artifacts
>> in their svn release tree, regardless of license (category a or b).
> There is some debate on whether the svn tree should be considered a
> distribution point. Incubator releases are regularly called out for not
> having “NOTICE” and “LICENSE” files at all reasonable checkout points in svn.
> [LEGAL-26] (https://issues.apache.org/jira/browse/LEGAL-26) concerns this
> and remains unresolved.
>
> Doug Cutting (3) says:
>> On Mon, Sep 16, 2013 at 2:50 AM, Stephen Connolly
>> <stephen.alan.conno...@gmail.com> wrote:
>>> * Source control is not an Apache distribution and hence we do not need to
>>> have LICENSE and NOTICE files in source control, it can be a nice
>>> convenience, but there is no *requirement*.
>>
>> I think perhaps you're looking for clear lines where things are
>> actually a bit fuzzy. Certainly releases are official distributions
>> and need LICENSE and NOTICE files. That line is clear. On the other
>> hand, we try to discourage folks from thinking that source control is
>> a distribution. Rather we wish it to be considered our shared
>> workspace, containing works in progress, not yet always ready for
>> distribution to folks outside the foundation. But, since we work in
>> public, folks from outside the foundation can see our shared workspace
>> and might occasionally mistake it for an official distribution. We'd
>> like them to still see a LICENSE and NOTICE file. So it's not a
>> hard-and-fast requirement that every tree that can possibly be checked
>> out have a LICENSE and NOTICE file at its root, but it's a good
>> practice for those trees that are likely to be checked out have them,
>> so that folks who might consume them are well informed.
> Again, he’s not talking directly about jar files in svn; however, I think his
> statement that “since we work in public, folks from outside the foundation
> can see our shared workspace and might occasionally mistake it for an
> official distribution” applies here. Fundamentally, the decision on how and
> where to distribute ‘velocity.jar’ rightly belongs with the Velocity group
> and I don’t think we ought to redistribute it.
>
> 3 - Downloading dependency binaries at build time is technologically easy,
> provides the same guarantees as putting them in svn, and avoids the question
> of effectively distributing someone else’s code.
>
> There doesn’t seem to be clear policy in the ASF on this, as evidenced by the
> frequent debates on it, and the lack of documentation. I’ve tried to lay out
> an argument that having jars in svn is not encouraged by the ASF (really,
> it’s not in line with the ASF’s charter), even if it isn’t disallowed. You
> may disagree, and I won’t claim I’ve made a strong argument, simply because
> the policy isn’t clear. So instead of going through arguments that amount to
> differences of opinion on Apache policy, let’s use a technological solution
> that is simple, common, and avoids the question altogether, by automatically
> downloading the dependencies at build time.
>
> Projects that use Maven do this automatic download as standard practice
> (that’s what Maven does, and that’s what the Maven Central infrastructure is
> there to support). We don’t use Maven, which is fine (our customers have
> asked us to make our binaries available in Maven Central, and we’ve done
> that). Apache Ivy is a popular add-on to Apache Ant that provides similar
> dependency resolution to an Ant-based build.
>
> I was a little surprised how easy it was to persuade Ivy to get the required
> dependencies at build time. The “ivy.xml” file is 39 lines including the ASL
> header (which by the way I forgot to include in the patch - I’ll fix that).
> There are about 50 lines added to ‘build.xml’ to download Ivy and then
> download the required jar files.
>
> So, given that the status quo seems to be unacceptable (Roy talks about not
> having jar files in the open-source trees, only in “-deps” and “tools”
> trees), we have two options:
>
> (a) - restructure the svn repository and the build to allow a separate
> “-deps” distribution. This wouldn’t affect our binary distributions (note
> that I’m no longer using the term “binary release”), but to build from
> source, a user would have to download a separate archive, unpack it, and then
> copy those files into the directory that was unpacked from the source
> release. This option effectively still has us distributing dependent
> binaries, which is not the goal of the ASF, just with a disclaimer that says
> “this isn’t an ASF release, it’s just a binary distribution put together by a
> committer for your convenience, so don’t feel you should trust it too much”.
>
> (b) - use Ivy to get the jars from Maven Central automatically as part of the
> build.
>
> I think (b) is the option that causes the least hassle for our downstream
> consumers, and not much hassle for us.
>
>
>> Pulling external jars at compile time also makes it more difficult to
>> certify the software. In order to certify the software you need to establish
>> a baseline that will be guaranteed the same, even if you pull it from the
>> archive 10 years later.
>
> As I said above, Apache’s focus is creating software that can be built from
> source, not distributing binaries (note that QCG or another company might
> have a different focus, and is perfectly able to distribute binaries under
> the Apache license). I think a reasonably prudent user would ask “How can I
> trust the ‘velocity.jar’ that’s included in this binary?” And the answer
> would be either “because I built it from source and installed it in my
> corporate repository” (very cautious, but not unheard-of) or “It was
> published by the Velocity group to a trusted repository, Maven Central” (more
> common).
>
> If you look in the “ivy.xml” file you’ll see that the dependencies are
> specified using Maven-style “group-artifact-version” coordinates. Old
> versions are maintained in Maven Central forever. I suppose it’s possible
> that a publisher could convince Maven Central to remove a version for some
> reason (security or licensing problems perhaps), but then, would we want to
> be distributing that version in a “-deps” package?
>
> I agree that it’s not enough to just say “you need to download such-and-such
> jar”, hence the automatic download managed by “Ivy” from Maven Central.
>
>> It is not a high level project that builds on several frameworks. It is a
>> low-level system library. The stuff below the stack is minimal. The number of
>> jars we use is limited. Why bother?
>>
>
> In the currently released branches, the dependencies are limited to ASM and
> Velocity. Looking forward to the trunk branch and the qa_refactor branch,
> the number of external dependencies seems to be increasing (I don’t like
> that, because I also view River as a low-level system library, but I’m only
> one PMC member). We need to get in front of the problem before we start
> distributing large numbers of dependencies.
>
> This point rolls in with the general question of jar files in version
> control. I was always taught that version control was for source code, and
> putting binaries into version control was a bad idea. In addition, there are
> practical problems - with older systems like cvs, even doing an update or
> commit effectively downloads the binaries, which slows things down if there
> are large binary files. On newer distributed version control systems like
> git or Mercurial, the entire repository, including all versions of binary
> artifacts, comes down with the project checkout. Currently, we have one
> version of relatively few jar files in our repository, so it’s not a major
> issue. But it gets worse as time goes on. So I suggest we work out the
> technology now to avoid the problem.
>
>> Gr. Simon
>>
>
> Thanks for the questions, Sim. I hope you’ll come around to removing your
> ‘-1’.
>
> Cheers,
>
> Greg
>
> Footnotes
> ---------
>
> (1) - Roy Fielding - http://s.apache.org/roy-binary-deps-1
> (2) - Sam Ruby - http://s.apache.org/r5
> (3) - Doug Cutting - http://s.apache.org/GNP
>
>> On 02-01-14 18:22, Greg Trasuk wrote:
>>>
>>> Hello all:
>>>
>>> Please have a look at the patch mentioned below and cast a vote on it.
>>>
>>> The main idea is to remove the dependency jar files from the source
>>> distribution. As a side effect of using Ivy, it becomes reasonable to
>>> remove them from the svn archive as well. Also, the Velocity dependency
>>> was there to support the VelocityConfigurationBuilder. We had discussed
>>> removing that component, so rather than move that dependency to Ivy, I’ve
>>> removed VelocityConfigurationBuilder.
>>>
>>> It’s arguable whether the VelocityConfigurationBuilder was part of the
>>> official Jini API (I see it as a utility, not API), so I don’t think this
>>> commit actually requires a vote. However, it does seem like a significant
>>> change to the build process that ought to be reviewed. So I propose to
>>> treat this as a “lazy consensus” vote, and will commit the change to the
>>> 2.2 branch if there are no objections in 72 hours (i.e. 1730UTC 20140105).
>>>
>>> At the same time, based on discussions over on
>>> gene...@incubator.apache.org, I’ll withdraw my assertion that we can’t have
>>> jars in svn. Those interested may want to check out the thread at
>>> http://mail-archives.apache.org/mod_mbox/incubator-general/201312.mbox/%3C01B04CC4-95B8-4A39-BC16-04BAA4269B65%40stratuscom.com%3E
>>>
>>> Cheers,
>>>
>>> Greg.
>>>
>>> On Jan 2, 2014, at 12:05 PM, Greg Trasuk (JIRA) <j...@apache.org> wrote:
>>>
>>>>
>>>> [
>>>> https://issues.apache.org/jira/browse/RIVER-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>>> ]
>>>>
>>>> Greg Trasuk updated RIVER-432:
>>>> ------------------------------
>>>>
>>>> Attachment: river-2_2_remove_jars.diff
>>>>
>>>> The attached patch for the 2.2 branch does the following:
>>>> - Removes the 'asm' and 'tests/lib' directories, which currently
>>>> contain the asm library, mockito, and junit jars.
>>>> - Modifies 'build.xml', 'common.xml', and adds 'ivy.xml' so that the
>>>> Apache Ivy ant plugin is downloaded at build time, and then used to
>>>> retrieve the libraries mentioned above from Maven Central. This removes
>>>> the need to have the jar files in svn.
>>>> - Removes (as per discussion
>>>> http://mail-archives.apache.org/mod_mbox/river-dev/201211.mbox/%3C509B99E3.6080800%40qcg.nl%3E)
>>>> the VelocityConfigBuilder, and associated Velocity jars. Note that the
>>>> 'extras' folder is not present in the 2.2 branch, so Sim's last comments
>>>> in the thread do not apply.
>>>>
>>>>> Jar files in svn and src distributions
>>>>> --------------------------------------
>>>>>
>>>>> Key: RIVER-432
>>>>> URL: https://issues.apache.org/jira/browse/RIVER-432
>>>>> Project: River
>>>>> Issue Type: Bug
>>>>> Reporter: Greg Trasuk
>>>>> Attachments: river-2_2_remove_jars.diff
>>>>>
>>>>>
>>>>> Recent traffic on the incubator lists has pointed out that including jar
>>>>> files for dependencies in the subversion repository and the source
>>>>> distributions is against Apache policy.
>>>>> In River, the following libraries appear in the Subversion repository and
>>>>> the source distributions (these are from trunk; a smaller set appears in
>>>>> the 2.2 branch):
>>>>> animal-sniffer
>>>>> asm
>>>>> bouncy-castle
>>>>> dnsjava
>>>>> high-scale-lib
>>>>> rc-libs
>>>>> velocity
>>>>> They all have to go. What are we using them for? As I understand it, we
>>>>> were going to remove the VelocityConfigurationBuilder, so that's not a
>>>>> problem. Some of the others are available from Maven Central, so we can
>>>>> get them at build time using Ivy or another build tool. Which ones are
>>>>> actually required? And where did they come from?
>>>>
>>>>
>>>>
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.1.5#6160)
>>>
>>
>>
>> --
>> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
>> Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397
>