On 7/28/15 12:08 PM, Steve Amerige wrote:
Hi Henrik,
In most instances, version numbers aren't part of filenames. Consider
executables. For example, OS commands such as 'ls' aren't ls-1.1.
Scripts are written to depend on resources with constant naming. The
same applies to jar files. Code can be written to use standardized
filenames and can be depended upon to work even when jar files are
updated. In Linux, the *alternatives* command is one way of managing
versioning. And, there are plenty of other approaches to versioning.
For jars, the manifest.mf is a common (and standard) place to set
package version information
<https://docs.oracle.com/javase/tutorial/deployment/jar/packageman.html>.
Hi Steve. Agreed that executables are rarely versioned. But, a jar file
is an archive (more similar to a file/directory bundle), and not much
different from your versioned tar file example in your first email
(httpd-2.4.16.tar.gz). And in order to examine a manifest file, the
archive has to be unzipped first, which adds an extra step. As I
mentioned in my other email, the default behavior of Maven and Gradle is
to version artifacts. Like it or not, it *is* the industry standard,
and has been for a long time.
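For what it's worth, the manifest lookup mentioned above can be scripted. A minimal sketch, assuming the jar sets the standard Implementation-Version attribute (not every jar does); manifest_version is a made-up helper name, not part of any tool:

```shell
# Hypothetical helper: print the Implementation-Version attribute from
# manifest text supplied on standard input. Manifests may use CRLF line
# endings, so carriage returns are stripped first.
manifest_version() {
    tr -d '\r' | sed -n 's/^Implementation-Version: *//p' | head -n 1
}
```

Assuming unzip is installed, it could be fed with something like: unzip -p foo.jar META-INF/MANIFEST.MF | manifest_version. That is still one command per jar, whereas a versioned file name is visible in a plain directory listing.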
In modern deployment environments such as cloud computing, in
particular, the notion of version is not as relevant as it used to
be. Customers do not think of what version software is in the cloud.
It is the application that is undergoing continuous, agile, modification.
Version numbers are of course of limited interest to the consumers of
cloud services/apps. I for sure don't care which version of Facebook I'm
using. But for the Facebook developers, it's absolutely crucial to know
which version is deployed. A company I've done work for recently had a
system glitch *precisely* because an older version of a jar file
happened to make it into a production deployment. It took hours of
debugging to determine the cause of the glitch, something that would
have been immediately avoided if the jar file name had contained the
version.
Having version numbers as part of filenames breaks the use of these
files as reusable components.
I don't see how. foo-1.jar and foo-2.jar aren't the same thing. Sure,
they may only differ slightly, but I don't see how being explicit breaks
things. The common way of dealing with dependencies is to be explicit
about the versioning. I.e. in my build system, I usually express
specifically which version of every single dependency I'm using (in the
case where it's relevant).
The issue of binary, source code, and behavioral compatibility is
important. As long as contracts are preserved, all is well. In any
event, a version number is a very weak indicator and cannot be relied
on to determine compatibility.
Why is versioning a "weak indicator"?
Thorough testing before deploying code that relies on updated jar
files is important.
Of course, but sometimes it's advantageous to have a simple mechanism
for determining this long after deployment. Sure, I can always open the
archive and examine the manifest, but that's pretty tedious, especially
in applications that may depend on hundreds of jar archives.
And, from a security perspective, we're looking at standardized
filenames and testing them against various exploits. It is better to
have a relatively unchanging set of names that we can run checks
against to determine what they are by checksums, etc. Security is
becoming a bigger and bigger concern, and efforts should be made to
have consistent filenames across releases so that changes are more
readily identifiable.
OK, one additional argument for being explicit and keeping the versioning
in the name :-)
From an IT perspective, changes are pushed out to developers to ensure
that they're developing into standardized environments. We're
essentially doing at least the following as a workaround:
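VERSION=2.4.4
cd /usr/local || exit 1
# Remove any previous install of this version and the old symlink
rm -rf "groovy-$VERSION" groovy
unzip -q "/network/path/to/apache-groovy-sdk-$VERSION.zip"
ln -s "groovy-$VERSION" groovy
cd "groovy-$VERSION" || exit 1
# Strip "-$VERSION" from every name that contains it
find . -name "*-$VERSION*" | while IFS= read -r FILENAME ; do
    STANDARD_NAME=$(echo "$FILENAME" | sed "s/-$VERSION//")
    mv "$FILENAME" "$STANDARD_NAME"
done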
This seems like a lot of extra work to me, and if you need portability
across OS flavors/versions, this is certainly a big hassle. Many
companies have developers on multiple platforms (Windows, MacOS, Linux,
...). Managing scripts, symlinks, and renaming files is a sure way to
create a lot of headaches in a heterogeneous environment. Why are you
not relying on the build system to provide the correct versions of your
dependencies? You could set up a local Archiva server or similar to
serve up the jar files that have been approved and security tested.
Maybe your requirements are different than most shops, but it seems like
a lot of extra work and a pretty brittle system to do what you showed
above. Why not drive this through Maven or Gradle, and integrate that
with your SCM system? Most projects I've worked on in the last few years
make sure all build dependencies are explicitly expressed in a pom.xml
or build.gradle file. When a developer checks out the latest revision
from the SCM and runs a build, he/she will have the exact same
dependencies as all the other developers. No need for mucking around
with unzipping archives, renaming files, creating symlinks, etc.
Note that the above isn't perfect as the internal components of the
zip archive include other jar files with version numbers other than
the Groovy version number. One could remove version numbers from all
files. It's merely used for illustration and shouldn't be used in any
production environment.
Apache's own HTTP Server <http://httpd.apache.org/download.cgi> and Tomcat
<http://tomcat.apache.org/download-80.cgi> are examples of code that doesn't use
filenames with embedded version numbers; however, counterexamples
include Hadoop <https://hadoop.apache.org/releases.html> and Lucene
<https://lucene.apache.org/core/> that do use version numbering in
filenames. So, it is clear that this is an area that has not been
standardized within the ASF. Perhaps the Apache folks can chime in on this.
Well, if you pull your Apache jar dependencies from a Maven/Ivy repo,
all artifacts are versioned, so I'm not sure what the benefit is of
doing it the Tomcat way. I think it has more to do with the fact that
Tomcat is a very old product, and it's just legacy packaging. I would
applaud if the jars distributed with Tomcat were versioned just like the
Maven/Ivy artifacts.
We continue to assert that having version numbers, or other metadata,
as a part of filenames is a bad practice. We do have workarounds in
place, but think it would benefit the community to consider this change.
And I continue to assert the opposite :-)
The habit of renaming archives as in your script snippet above is
especially bad practice if you ask me. I'll give you another interesting
use case where versioning info in the file name is handy. Have you ever
been in a situation where an app fails or starts behaving
strangely because a duplicate jar file with a different version
happens to end up on the classpath? It's a lot quicker to figure this out
when you can immediately see that the jar file names contain two different
versions. Example classpath:
/usr/local/foo/foo.jar:...<gazillion other classpath entries
here>...:/opt/foo/foo.jar:...<gazillion other classpath entries here>...
To figure out the versions of the duplicate foo.jar archives, I have to
unzip them and examine the manifest. If the version is already in the
file name, it would have been immediately obvious.
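Spotting the duplicate itself can at least be scripted. A rough sketch, assuming a Unix-style colon-separated classpath; duplicate_jars is a hypothetical helper name, and it only catches identical file names, so it says nothing about which versions are involved:

```shell
# Hypothetical helper: list jar file names that occur more than once in a
# colon-separated classpath, regardless of the directory they live in.
duplicate_jars() {
    printf '%s\n' "$1" | tr ':' '\n' | sed 's|.*/||' | grep '\.jar$' | sort | uniq -d
}
```

Called as duplicate_jars "$CLASSPATH", it prints one line per jar name that appears in more than one directory.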
Enjoy,
Steve Amerige
Principal Software Developer, Fraud and Compliance Solutions Development
SAS Institute, 100 SAS Campus Dr, Room U3050, Cary, NC 27513-8617
Cheers,
-H
On 7/28/2015 10:38 AM, Henrik Martin wrote:
I'm not part of the contributor team, so I can't speak for the Groovy
team, but I would strongly disagree with you. If you use Maven or
Gradle, it's easy to maintain dependencies on particular versions of
jar files, and have your IDE immediately pick up the new version. In
fact, the default behavior for both Maven and Gradle is to explicitly
maintain version numbers in artifacts. Removing this would be a step
back to the 90s. Sometimes jar files have to be copied into other
directories outside of their normal home. An example is when
deploying Tomcat. Stuff like JDBC drivers etc. typically gets copied
into $CATALINA_BASE/lib. It's worth gold to immediately be able to
tell which particular versions of those jar files are in there, vs.
just seeing "foobar.jar".
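To illustrate, when the version is in the name it can be pulled out without opening the archive. A small sketch, assuming names of the form name-VERSION.jar or name-VERSION-classifier.jar with a dotted numeric version; jar_version is a hypothetical helper, and other naming schemes would need a different pattern:

```shell
# Hypothetical helper: print the dotted numeric version embedded in a jar
# file name such as groovy-sql-2.4.4.jar or groovy-all-2.4.4-indy.jar.
# Prints nothing when the name carries no version (e.g. foobar.jar).
jar_version() {
    printf '%s\n' "$1" |
        sed -n 's/^.*-\([0-9][0-9.]*\)\(-[A-Za-z0-9]*\)\{0,1\}\.jar$/\1/p'
}
```

For example, jar_version groovy-sql-2.4.4.jar prints 2.4.4, while jar_version foobar.jar prints nothing.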
I would argue that you should probably change the practice of
creating symlinks to explicitly versioned jar files as this is
obviously a pain when new versions are introduced.
Just my $0.02.
-H
On 7/28/15 5:26 AM, Steve Amerige wrote:
Hi all,
Every time we take a download of the latest Groovy software, we have
to do the same task: remove version numbers from filenames. As of
the 2.4.4 release, there are 39 files, shown below, that have the
version number as part of the distribution. So, why is this a problem?
* IDEs cannot silently be updated to use a mandated Groovy
version. With 2.4.4 dealing with a zero-day vulnerability
issue, we want to push this out. However, the version numbers
in files mean that users must participate in the updating. This
is not desirable.
* Links that users might have at the OS level are broken with each
new release.
* Version numbers in files make it more difficult to diff between
releases.
* Version numbers as a part of a filename are a specific case of
metadata as part of filenames and, as such, we consider it to be
a bad practice. This information is better kept in a file,
preferably machine readable in a format such as JSON or XML, or
in manifest files that can be consumed by software that might do
version number validation as part of security efforts in
maintaining code.
It is reasonable that the root directory include a version number.
But, under that directory, we'd expect that the contents are
version-less. A good example of a version-less Apache project is the
HTTP Server <http://httpd.apache.org/download.cgi>. The download is
presently a file named *httpd-2.4.16.tar.gz*, and when extracted
produces a top-level directory named *httpd-2.4.16*. No file name
contains the version number string. The two files *CHANGES* and
*httpd.spec* contain the version number string. I believe that
Groovy should follow this example, and possibly go one step better
by having an explicit manifest file with all pertinent metadata for
the Groovy release that includes metadata such as the version
number, license name, checksums of files (for security checking), etc.
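One possible shape for such a manifest, as a rough sketch only: the plain-text layout, the "version=" key, and the release_manifest name are all assumptions made for illustration (a JSON or XML format would work equally well), and it relies on the sha256sum tool from GNU coreutils being available:

```shell
# Hypothetical sketch: print a plain-text release manifest for the
# directory $1 and version $2: one "version=" line, followed by one
# SHA-256 checksum line per file, sorted by path.
release_manifest() {
    printf 'version=%s\n' "$2"
    ( cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
          sha256sum "$f"
      done )
}
```

It might be run as release_manifest groovy-2.4.4 2.4.4 and the output stored outside the release directory, so that the manifest does not end up listing itself.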
If you agree, how can we start the process of making this change?
Thanks,
Steve Amerige
Principal Software Developer, Fraud and Compliance Solutions Development
SAS Institute, 100 SAS Campus Dr, Room U3050, Cary, NC 27513-8617
./lib/groovy-sql-2.4.4.jar
./lib/groovy-testng-2.4.4.jar
./lib/groovy-jsr223-2.4.4.jar
./lib/groovy-servlet-2.4.4.jar
./lib/groovy-json-2.4.4.jar
./lib/groovy-jmx-2.4.4.jar
./lib/groovy-test-2.4.4.jar
./lib/groovy-bsf-2.4.4.jar
./lib/groovy-groovydoc-2.4.4.jar
./lib/groovy-nio-2.4.4.jar
./lib/groovy-console-2.4.4.jar
./lib/groovy-xml-2.4.4.jar
./lib/groovy-ant-2.4.4.jar
./lib/groovy-docgenerator-2.4.4.jar
./lib/groovy-groovysh-2.4.4.jar
./lib/groovy-templates-2.4.4.jar
./lib/groovy-swing-2.4.4.jar
./lib/groovy-2.4.4.jar
./apache-groovy-src-2.4.4-incubating.zip
./embeddable/groovy-all-2.4.4-indy.jar
./embeddable/groovy-all-2.4.4.jar
./indy/groovy-json-2.4.4-indy.jar
./indy/groovy-console-2.4.4-indy.jar
./indy/groovy-2.4.4-indy.jar
./indy/groovy-sql-2.4.4-indy.jar
./indy/groovy-jmx-2.4.4-indy.jar
./indy/groovy-servlet-2.4.4-indy.jar
./indy/groovy-xml-2.4.4-indy.jar
./indy/groovy-swing-2.4.4-indy.jar
./indy/groovy-templates-2.4.4-indy.jar
./indy/groovy-ant-2.4.4-indy.jar
./indy/groovy-groovydoc-2.4.4-indy.jar
./indy/groovy-nio-2.4.4-indy.jar
./indy/groovy-test-2.4.4-indy.jar
./indy/groovy-testng-2.4.4-indy.jar
./indy/groovy-groovysh-2.4.4-indy.jar
./indy/groovy-docgenerator-2.4.4-indy.jar
./indy/groovy-bsf-2.4.4-indy.jar
./indy/groovy-jsr223-2.4.4-indy.jar