On Jan 3, 2014, at 5:25 AM, Simon IJskes - QCG <si...@qcg.nl>
wrote:
In order to gain some time to discuss this first, I will vote -1.
First, we decided to NOT remove velocity builder.
When I read the email chain, my impression was that we wanted to
remove it (to quote you, Sim, “To be honest, I hate it”), but there
was a dependency on it in the ‘extras’ folder that was added in
the trunk branch. As there is no ‘extras’ in the 2.2 branch, and
that is what this patch applies to, I thought it was fair to remove
VelocityConfigurationBuilder from the 2.2 branch. Perhaps we
should revisit the ConfigurationBuilder approach in another
thread. For now I’ll spin another patch that doesn’t remove
VelocityConfigurationBuilder.
Second, no need to remove the jars as specified in your own
comments on RIVER-432.
Pulling in external jars at compile time, shall we start here?
They are already in the svn. They are already in the build
scripts. What does this patch fix? No legal problems?
Apache policy is somewhat unclear on this point. One needs to
examine the mailing lists for clues on what we should really do.
I will argue that:
1 - The fundamental distribution model of Apache is source code,
not binaries.
2 - Distributing binaries is tolerated but not encouraged. Since
the svn repository can be seen as a distribution point, binaries
in svn are also tolerated but not encouraged.
3 - Downloading dependency binaries at build time is
technologically easy, provides the same guarantees as putting them
in svn, and avoids the question of effectively distributing
someone else’s code.
All these together suggest that although we’re technically OK to
put dependency jars in a “-deps” package (note that the status quo
_is_ unacceptable - at the very least, we need to restructure the
dependencies into a “-deps” binary package), there is some policy
uncertainty which we avoid totally by having dependencies
downloaded from a known-good source at build time.
Let’s examine these points:
1 - The fundamental distribution model of Apache is source code,
not binaries.
Apache began with httpd. Back in those days,
“Open Source” software was distributed in source form only, simply
because it was mostly intended for Unix systems (then later
Linux). I recall the first release of Perl coming down as a
ten-part uunet news message. Part of this distribution model was
practical necessity - System differences made it necessary to
compile your software on the hardware it was going to run on.
Part of it was open-source philosophy. Having the source code
meant that you could take a look at it and verify that it wasn’t
hazardous to your operations.
In any case, the way we used to use open source software was
(“./configure; make; make install”). If the software had
dependencies, you built them from source, for the same reasons.
Now, as time has gone on, we’ve gotten used to having the JVM as a
common runtime, and jar files as a common binary distribution
medium. But the Apache Foundation’s mandate is to produce open
source software that is freely usable under the Apache License.
That means source code is at the heart of Apache, despite the rest
of the world’s comfort with binaries. Hence Roy’s statements in
(1):
Class files are not open source. Jar files filled with class
files are not open source. The fact that they are derived from
open source is applicable only to what we allow projects to be
dependent upon, not what we vote on as a release package.
Release votes are on verified open source artifacts. Binary
packages are separate from source packages. One cannot vote to
approve a release containing a mix of source and binary code
because the binary is not open source and cannot be verified to
be safe for release (even if it was derived from open source).
I thought that was frigging obvious. Why do I need to write
documentation to explain something that is fundamental to the
open source definition?
He’s talking about binary packages, not jar files in svn, but I
read that (and many other emails) as a distaste for binary
distributions.
In fact, if you look at Apache httpd’s download page, it doesn’t
appear that the Apache project publishes any Unix or Linux
binaries. They leave that to other organizations.
2 - Distributing binaries is tolerated but not encouraged. Since
the svn repository can be seen as a distribution point, binaries
in svn are also tolerated but not encouraged.
It’s hard to find a single reference that encapsulates this
outlook, but that’s the impression I get from reading the various
mailing lists. For instance, Sam Ruby says (2):
IMO, our projects release source. So, our projects should not
maintain object/binary artifacts in their svn release tree,
regardless of license (category a or b).
There is some debate on whether the svn tree should be considered a
distribution point. Incubator releases are regularly called out
for not having “LICENSE” and “NOTICE” files at all reasonable
checkout points in svn. [LEGAL-26]
(https://issues.apache.org/jira/browse/LEGAL-26) concerns this and
remains unresolved.
Doug Cutting (3) says:
On Mon, Sep 16, 2013 at 2:50 AM, Stephen Connolly
<stephen.alan.conno...@gmail.com> wrote:
* Source control is not an Apache distribution and hence we do
not need to have LICENSE and NOTICE files in source control,
it can be a nice convenience, but there is no *requirement*.
I think perhaps you're looking for clear lines where things are
actually a bit fuzzy. Certainly releases are official
distributions and need LICENSE and NOTICE files. That line is
clear. On the other hand, we try to discourage folks from
thinking that source control is a distribution. Rather we wish
it to be considered our shared workspace, containing works in
progress, not yet always ready for distribution to folks outside
the foundation. But, since we work in public, folks from
outside the foundation can see our shared workspace and might
occasionally mistake it for an official distribution. We'd
like them to still see a LICENSE and NOTICE file. So it's not
a hard-and-fast requirement that every tree that can possibly be
checked out have a LICENSE and NOTICE file at its root, but it's
a good practice for those trees that are likely to be checked
out have them, so that folks who might consume them are well
informed.
Again, he’s not talking directly about jar files in svn; however, I
think his statement that “since we work in public, folks from
outside the foundation can see our shared workspace and might
occasionally mistake it for an official distribution” applies
here. Fundamentally, the decision on how and where to distribute
‘velocity.jar’ rightly belongs with the Velocity group and I don’t
think we ought to redistribute it.
3 - Downloading dependency binaries at build time is
technologically easy, provides the same guarantees as putting them
in svn, and avoids the question of effectively distributing
someone else’s code.
There doesn’t seem to be clear policy in the ASF on this, as
evidenced by the frequent debates on it, and the lack of
documentation. I’ve tried to lay out an argument that having
jars in svn is not encouraged by the ASF (really, it’s not in line
with the ASF’s charter), even if it isn’t disallowed. You may
disagree, and I won’t claim I’ve made a strong argument, simply
because the policy isn’t clear. So instead of going through
arguments that amount to differences of opinion on Apache policy,
let’s use a technological solution that is simple, common, and
avoids the question altogether, by automatically downloading the
dependencies at build time.
Projects that use Maven do this automatic download as standard
practice (that’s what Maven does, and that’s what the Maven Central
infrastructure is there to support). We don’t use Maven, which is
fine (our customers have asked us to make our binaries available in
Maven Central, and we’ve done that). Apache Ivy is a popular
add-on to Apache Ant that provides similar dependency resolution
to an Ant-based build.
I was a little surprised how easy it was to persuade Ivy to get the
required dependencies at build time. The “ivy.xml” file is 39
lines including the ASL header (which by the way I forgot to
include in the patch - I’ll fix that). There are about 50 lines
added to ‘build.xml’ to download Ivy and then download the
required jar files.
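For anyone who hasn’t seen this done, here is a rough sketch of what such ‘build.xml’ additions can look like. It is not the patch itself: the project name, target names, directory layout, and Ivy version below are all placeholders chosen for illustration, and it follows the usual Ivy bootstrap pattern of fetching the Ivy jar with Ant’s ‘get’ task, defining the Ivy tasks, and then retrieving the jars listed in ‘ivy.xml’.

```xml
<project name="ivy-bootstrap-sketch" default="resolve"
         xmlns:ivy="antlib:org.apache.ivy.ant">

    <!-- Assumed values: the Ivy version and directory names are placeholders. -->
    <property name="ivy.install.version" value="2.3.0"/>
    <property name="ivy.jar.dir" value="${basedir}/ivy"/>
    <property name="ivy.jar.file" value="${ivy.jar.dir}/ivy.jar"/>

    <!-- Fetch the Ivy jar itself from Maven Central. -->
    <target name="download-ivy">
        <mkdir dir="${ivy.jar.dir}"/>
        <get src="https://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.install.version}/ivy-${ivy.install.version}.jar"
             dest="${ivy.jar.file}" usetimestamp="true"/>
    </target>

    <!-- Make the Ivy Ant tasks available under the ivy: namespace. -->
    <target name="init-ivy" depends="download-ivy">
        <path id="ivy.lib.path">
            <fileset dir="${ivy.jar.dir}" includes="*.jar"/>
        </path>
        <taskdef resource="org/apache/ivy/ant/antlib.xml"
                 uri="antlib:org.apache.ivy.ant"
                 classpathref="ivy.lib.path"/>
    </target>

    <!-- Read ivy.xml from the project directory and copy the resolved jars into lib/. -->
    <target name="resolve" depends="init-ivy">
        <ivy:retrieve pattern="${basedir}/lib/[artifact]-[revision].[ext]"/>
    </target>
</project>
```

With something like this in place, the compile targets would simply depend on ‘resolve’ and pick up the jars from the ‘lib’ directory instead of from svn.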
So, given that the status quo seems to be unacceptable (Roy talks
about not having jar files in the open-source trees, only in
“-deps” and “tools” trees), we have two options:
(a) - restructure the svn repository and the build to allow a
separate “-deps” distribution. This wouldn’t affect our binary
distributions (note that I’m no longer using the term “binary
release”), but to build from source, a user would have to download
a separate archive, unpack it, and then copy those files into the
directory that was unpacked from the source release. This option
effectively still has us distributing dependent binaries, which is
not the goal of the ASF, just with a disclaimer that says “this
isn’t an ASF release, it’s just a binary distribution put together
by a committer for your convenience, so don’t feel you should
trust it too much”.
(b) - use Ivy to get the jars from Maven Central automatically as
part of the build.
I think (b) is the option that causes the least hassle for our
downstream consumers, and not much hassle for us.
Pulling external jars at compile time also makes it more
difficult to certify the software. In order to certify the
software you need to establish a baseline that is guaranteed to be
the same, even if you pull it from the archive 10 years later.
As I said above, Apache’s focus is creating software that can be
built from source, not distributing binaries (note that QCG or
another company might have a different focus, and is perfectly
able to distribute binaries under the Apache license). I think a
reasonably prudent user would ask “How can I trust the
‘velocity.jar’ that’s included in this binary?” And the answer
would be either “because I built it from source and installed it
in my corporate repository” (very cautious, but not unheard-of) or
“It was published by the Velocity group to a trusted repository,
Maven Central” (more common).
If you look in the “ivy.xml” file you’ll see that the dependencies
are specified using Maven-style “group-artifact-version”
coordinates. Old versions are maintained in Maven Central
forever. I suppose it’s possible that a publisher could convince
Maven Central to remove a version for some reason (security or
licensing problems perhaps), but then, would we want to be
distributing that version in a “-deps” package?
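To illustrate the point about fixed coordinates, a minimal ‘ivy.xml’ might look like the sketch below. The ‘info’ values and the version numbers are placeholders, not necessarily those in the patch; the point is only that each dependency names an exact organisation, artifact, and revision, so the same jar is requested on every build.

```xml
<ivy-module version="2.0">
    <!-- Assumed module identity, for illustration only. -->
    <info organisation="org.apache.river" module="river"/>

    <dependencies>
        <!-- Each entry pins one exact revision; the revisions shown are placeholders. -->
        <dependency org="asm" name="asm" rev="3.2"/>
        <dependency org="junit" name="junit" rev="4.11"/>
        <dependency org="org.mockito" name="mockito-all" rev="1.9.5"/>
    </dependencies>
</ivy-module>
```

Because no dynamic revisions (such as “latest.integration”) are used, a build run years later asks Maven Central for exactly the same artifacts.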
I agree that it’s not enough to just say “you need to download
such-and-such jar”, hence the automatic download managed by “Ivy”
from Maven Central.
It is not a high level project that builds on several
frameworks. It is a low-level system library. The stuff below the
stack is minimal. The number of jars we use is limited. Why
bother?
In the currently released branches, the dependencies are limited to
ASM and Velocity. Looking forward to the trunk branch and the
qa_refactor branch, the number of external dependencies seems to be
increasing (I don’t like that, because I also view River as a
low-level system library, but I’m only one PMC member). We need
to get in front of the problem before we start distributing large
numbers of dependencies.
This point rolls in with the general question of jar files in
version control. I was always taught that version control was
for source code, and putting binaries into version control was a
bad idea. In addition, there are practical problems - with older
systems like cvs, even doing an update or commit effectively
downloads the binaries, which slows things down if there are large
binary files. On newer distributed version control systems like
git or Mercurial, the entire repository, including all versions of
binary artifacts, comes down with the project checkout.
Currently, we have one version of relatively few jar files in our
repository, so it’s not a major issue. But it gets worse as time
goes on. So I suggest we work out the technology now to avoid
the problem.
Gr. Simon
Thanks for the questions, Sim. I hope you’ll come around to
removing your ‘-1’.
Cheers,
Greg
Footnotes
——————
(1) - Roy Fielding - http://s.apache.org/roy-binary-deps-1
(2) - Sam Ruby - http://s.apache.org/r5
(3) - Doug Cutting - http://s.apache.org/GNP
On 02-01-14 18:22, Greg Trasuk wrote:
Hello all:
Please have a look at the patch mentioned below and cast a
vote on it.
The main idea is to remove the dependency jar files from the
source distribution. As a side effect of using Ivy, it
becomes reasonable to remove them from the svn archive as
well. Also, the Velocity dependency was there to support the
VelocityConfigurationBuilder. We had discussed removing that
component, so rather than move that dependency to Ivy, I’ve
removed VelocityConfigurationBuilder.
It’s arguable whether the VelocityConfigurationBuilder was part
of the official Jini API (I see it as a utility, not API), so
I don’t think this commit actually requires a vote. However,
it does seem like a significant change to the build process
that ought to be reviewed. So I propose to treat this as a
“lazy consensus” vote, and will commit the change to the 2.2
branch if there are no objections in 72 hours (i.e. 1730UTC
20140105).
At the same time, based on discussions over on
gene...@incubator.apache.org, I’ll withdraw my assertion that
we can’t have jars in svn. Those interested may want to
check out the thread at
http://mail-archives.apache.org/mod_mbox/incubator-general/201312.mbox/%3C01B04CC4-95B8-4A39-BC16-04BAA4269B65%40stratuscom.com%3E
Cheers,
Greg.
On Jan 2, 2014, at 12:05 PM, Greg Trasuk (JIRA)
<j...@apache.org> wrote:
[
https://issues.apache.org/jira/browse/RIVER-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Greg Trasuk updated RIVER-432:
------------------------------
Attachment: river-2_2_remove_jars.diff
The attached patch for the 2.2 branch does the following:
- Removes the 'asm' directory and 'tests/lib' directories,
  which currently contain the asm library, mockito, and junit
  jars.
- Modifies 'build.xml', 'common.xml', and adds
  'ivy.xml' so that the Apache Ivy ant plugin is downloaded at
  build time, and then used to retrieve the libraries
  mentioned above from Maven Central. This removes the need
  to have the jar files in svn.
- Removes (as per discussion
  http://mail-archives.apache.org/mod_mbox/river-dev/201211.mbox/%3C509B99E3.6080800%40qcg.nl%3E)
  the VelocityConfigurationBuilder, and associated Velocity jars.
  Note that the 'extras' folder is not present in the 2.2
  branch, so Sim's last comments in the thread do not apply.
Jar files in svn and src distributions
--------------------------------------
Key: RIVER-432
URL: https://issues.apache.org/jira/browse/RIVER-432
Project: River
Issue Type: Bug
Reporter: Greg Trasuk
Attachments: river-2_2_remove_jars.diff
Recent traffic on the incubator lists has pointed out that
including jar files for dependencies in the subversion
repository and the source distributions is against Apache
policy. In River, the following libraries appear in the
Subversion repository and the source distributions (these
are from trunk; a smaller set appears in the 2.2 branch):
- animal-sniffer
- asm
- bouncy-castle
- dnsjava
- high-scale-lib
- rc-libs
- velocity
They all have to go. What are we using them for? As I
understand it, we were going to remove the
VelocityConfigurationBuilder, so that's not a problem.
Some of the others are available from Maven Central, so we
can get them at build time using Ivy or another build
tool. Which ones are actually required? And where did
they come from?
--
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag:
28088397