Hi Andy,
thanks again for all the work you have done to push out the first Apache
Jena release.

I answer your questions below. Answers can safely be ignored or read later.
No more questions from me => a reply is not necessary.

Time to celebrate the first Apache Jena release.

Happy Christmas to all.

Paolo

Andy Seaborne wrote:
>> Comments and rationale (longer version):
>>
>>
>> The dist area is used by people and reducing choices there, in future
>> releases,
>> would be a great improvement. For example, I was not sure about the
>> difference
>> between:
>>
>>    apache-jena-2.7.0-incubating.tar.bz2      14-Dec-2011 15:50   14M
>>    apache-jena-2.7.0-incubating.tar.gz       14-Dec-2011 15:44   16M
>>
>> I did not find a difference, so is apache-jena-2.7.0-incubating.tar.bz2
>> necessary?
>
> See
> http://incubator.apache.org/guides/releasemanagement.html
>
> and as an example:
> http://www.apache.org/dist/ant/binaries/
>
> These are the same set of files, packaged differently.  bzip2 is
> sometimes significantly smaller.

 "Compression Formats"

 "Ship at least one of tar.gz, bz or bz2 for UNIX and linux (but note this).
  Ship zip for windows."

  The "note this" link does not work for me. I am not sure if there are good
  reasons for bz2 vs. tar.gz.

 "Note (TODO link) that there are known compatibility issues when using
  certain tar programs. (TODO Saris verses GNU tar) It is recommend that
  project that use Ant or Maven as build tools, use these tools to create
  the archives since these implementations work well across a range of
  platforms. It is recommended that project which do not use these tools
  consider shipping the *nix package as a bz2 archive."

  So bz2 is recommended for projects which do not use Ant or Maven to
  create their archives.

 [1] http://incubator.apache.org/guides/releasemanagement.html#best-practice-formats

When I was testing .tar.gz I was unsure if .bz2 was the same or not.
I checked and I did not find differences, therefore .bz2 did not increase
testing cost.
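
For what it is worth, this kind of check can be scripted. Below is a minimal
sketch in Python, not the procedure I actually followed: the function names
are hypothetical, and it only compares the names and bytes of regular files,
ignoring directories, permissions and timestamps.

```python
import hashlib
import tarfile


def archive_digests(path):
    """Map each regular file in a tar archive to the SHA-1 of its content."""
    digests = {}
    # Mode "r:*" lets tarfile detect gzip or bzip2 compression by itself.
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            if member.isfile():
                data = archive.extractfile(member).read()
                digests[member.name] = hashlib.sha1(data).hexdigest()
    return digests


def same_content(path_a, path_b):
    """True if both archives hold the same file names with identical bytes."""
    return archive_digests(path_a) == archive_digests(path_b)
```

With both downloads in the current directory, one would call
same_content("apache-jena-2.7.0-incubating.tar.gz",
"apache-jena-2.7.0-incubating.tar.bz2") and expect True if the two archives
really are the same set of files packaged differently.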

I am still unsure whether there are reasons other than size for it being
there. The documentation above does not answer that (probably because of the
broken links).

>
>> The instruction to recreate the release in the BUILD file found in
>> apache-jena-2.7.0-incubating-source-release.zip involve a lot of manual
>> steps (fortunately it's something most of the people will not need to go
>> through). Ideally, it could be: download, uncompress and run a command.
>> Easier to document, less likely something goes wrong.
>
> I thought we'd agreed this long ago.
> This release is to get something out of the door now.

Yes.

> I've suggested a change to a single-trunk multi-module build already so
> I'm not sure what point you are making here.

A single-trunk multi-module layout is an improvement and a simplification.
However, I have used multi-module projects only with a single version for
all modules and where all modules are released together.

I am not sure if other approaches work well... and at the moment I have
neither the time nor the energy to investigate.

> For a single-trunk multi-module, like all the project you keep pointing
> to, we can have a single build.  Multi-trunk, multi-module is a
> non-starter because of RAT, version tags and mvn release plugin
> assumptions.  But we will lose the ability to branch modules as we do
> at the moment.

Yep.

For me, it is not a problem to branch a single trunk even if I need
to experiment with only one module (if this simplifies the life of the
release manager and/or our experience with the tools we use).

> JenaDist replaces the a replacement source-release artifact creator.  It
> does not create a complete image of the development systems at the point
> of release to match the SVN tag; it creates a special composite that
> isn't tested. The point of RAT is to capture common practice.

JenaDist is not something I am proud of :-). It should be ditched and moved
back to the Scratch area where it came from. It is something unusual and
certainly not a best practice. It is the best I was able to do under the
constraint of "not change anything in all the other modules" and without
a single-trunk layout.

>
>> This is an example of ideal 'dist' area which I would be more happy with:
>
> Why are you unhappy?  You use Jena from maven - this does not affect you.

The 'dist' area is for people who want to download and use Apache Jena.
Perhaps they are not even developers and they want the binary version
to run some of the command line tools. A 'dist' area with fewer choices
is IMHO better for the 'dist' target audience.

The perfect situation is: no choice. Just one download option.

In practice this is not possible, since we need a 'source' release and
a binary package per operating system (i.e. .tar.gz and/or .bz2 for UNIX
and .zip for Windows).

Even if something does not affect me directly, it does not mean I don't care.
A better user experience for all Apache Jena users is something I care about.
The more users the better. The better the experience for everyone using Jena,
the better.

The current 'dist' area also affects committers/testing at the time of a
release.
This round I had time to test only:

 - apache-jena-2.7.0-incubating.tar.gz
 - apache-jena-2.7.0-incubating.tar.bz2 (same as above => zero cost)

Not the content in the directories:

 - jena-arq-2.9.0-incubating/
 - jena-core-2.7.0-incubating/
 - jena-iri-0.9.0-incubating/
 - jena-top-0-incubating/

I tested the content of the directories above (except *-source-release.zip)
via the Maven repository.

> The statement
> jena.staging.apache.org/jena/download/index.html
>
> """
> You can download official Apache releases from here:
>
>     www.apache.org/dist/incubator/jena/
> """
>
> will lead as directly as possible to the latest version.

Ack.

> Building version numbers into documentation is to be strongly avoided
> IMHO.

Agree.

Some projects keep more than one version in the 'dist' area. If we will
only have one version in the 'dist' area, good.
A directory is not necessary in this case.

Projects with more than one version in the 'dist' area often have a
'stable'|'current'|'latest' link pointing to the latest release (and
use that in the documentation). That is a good practice as well.

> (I'd remove them from that page - they don't get maintained
> especially at the point of release when there is enough to do already -
> but "thank you" to Ian for making sure it's ready to go).
>
>>    http://w.a.o/dist/incubator/jena/apache-jena-x.y.z-incubating/
>
> Have you checked what is proposed?
>

Yes. See above.

> http://people.apache.org/~andy/dist-apache-jena-2.7.0-incubating-RC-1/apache-jena-2.7.0-incubating/
>
>
> Those files are all there exactly as you have them.  +bz2 (separate issue).
>
> In addition, the latest download is top level as explained to make it
> easy for people, especially non-maven, novice users.

Yes.

If you think it's a good thing, when releases are archived, would it be
possible to move stuff:

 - from: http://www.apache.org/dist/incubator/jena/*
 - to: http://archive.apache.org/dist/incubator/jena/apache-jena-x.y.z-incubating/*
   (or http://archive.apache.org/dist/incubator/jena/jena-x.y.z-incubating/*)

*not* in the root http://archive.apache.org/dist/incubator/jena/, otherwise
it can become messy and confusing in the long run.

>>    apache-jena-x.y.z-incubating.tar.gz
>>    apache-jena-x.y.z-incubating.tar.gz.asc
>>    apache-jena-x.y.z-incubating.tar.gz.md5
>>    apache-jena-x.y.z-incubating.tar.gz.sha1
>>    apache-jena-x.y.z-incubating.zip
>>    apache-jena-x.y.z-incubating.zip.asc
>>    apache-jena-x.y.z-incubating.zip.md5
>>    apache-jena-x.y.z-incubating.zip.sha1
>>    apache-jena-x.y.z-incubating-source-release.tar.gz (*)
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.asc
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.md5
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.sha1
>>
>> ... or, if people on Windows can uncompress a .tar.gz,
>
> They can't, without additional, non-standard programs.  Support for zip
> files is built-in to Windows.

Ack.

> For more, see
> http://incubator.apache.org/guides/releasemanagement.html

Ack.

>> and there is no
>> difference between the content of:
>>    apache-jena-x.y.z-incubating.tar.gz
>>    apache-jena-x.y.z-incubating.zip
>>
>> Then, the 'dist' area could have even fewer choices (even better):
>>
>>    http://w.a.o/dist/incubator/jena/apache-jena-x.y.z-incubating/
>>
>>    apache-jena-x.y.z-incubating.tar.gz
>>    apache-jena-x.y.z-incubating.tar.gz.asc
>>    apache-jena-x.y.z-incubating.tar.gz.md5
>>    apache-jena-x.y.z-incubating.tar.gz.sha1
>>    apache-jena-x.y.z-incubating-source-release.tar.gz (*)
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.asc
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.md5
>>    apache-jena-x.y.z-incubating-source-release.tar.gz.sha1
>>
>> (*) this file should (in future) contain all the sources necessary to
>> rebuild
>> the binary release (i.e. apache-jena-x.y.z-incubating.tar.gz). People who
>> want to recreate the binary release can download (*), uncompress and
>> run a
>> command.
>
> And we have discussed this several times.  Release something now, create
> space for a reorg.  What is the problem with what was in the pre-release
> trial run and the current release proposal?

Explained above.

> If, as it should be, (*) is everything, then we need a single-trunk
> multi-module layout to work with RAT.  It has pros and cons; downside -
> branching per module is changed; releasing just a module needs tagging.
>  I have no idea how the release plugin deals with it and want to test it
> out first.

Yes.

> I do know from experience that misconfiguring the release
> plugin will attempt to tag junk in the wrong place, and not generate a
> warning or error in a dry run.

I've only used the release plugin outside the Apache infrastructure, and only
for single-trunk, multi-module projects where all the modules have the same
version and all the modules are released together. In this situation, I do
not have problems and there is little left to do manually.

>> With the dist area above, users are presented with just one choice:
>> binary or sources?
>
> "users" is a vague and broad label.
>
> What sort of user might be looking in dist/?
> novice? redistributor?

Novice (more likely) => Fewer choices.

>
> How is each of those classes of user served?

Novice, first adopters => 'dist' area
Developers => artifacts repository

> Why add an additional level of directory naming?
> The file names have versions in them.  What is gained and for whom?

As I said above.

If we have only one (i.e. the latest) release in the 'dist' area the
current approach works well (IMHO it can be further improved by reducing
choices for users (in particular novice users) and, if possible,
removing the directories).

>> An example of an Apache multi-module project which keep a 'dist' area
>> very
>> clean and usable for people is Apache Whirr:
>> http://www.apache.org/dist/incubator/whirr/whirr-0.6.0-incubating/
>
> Does whirr serve the same community Jena does?

No. Only Jena serves the same community as Jena.
I think we can learn from looking at other well managed Apache projects.
None of the other well managed Apache projects will serve the same community
Jena does.

> Does it serve its community in the same way?

No. See above.

> Who is being served from their dist/?

The Apache Whirr users who want to install and use the binary distribution
of Apache Whirr as quickly as possible. They download a .tar.gz, unzip it
and they can run the scripts in the bin/ directory.

Similar to Jena novice users: download a binary distribution, unzip it
and run the scripts in the bin/ or bat/ directory.

> Why is it called "src"?  And what does that mean?

What they call 'src' is what we call 'source-release'.

> What is the history of its community and the expectations on
> distribution, naming and layout?

See above.

> Please can you explain how it applies to Jena?

It's a good example of a multi-module project with a minimal 'dist' area
which presents users (novices in particular) with the smallest number of
choices.

> Whirr does not produce a zip, yet it's best practice to do so.
>
> "Apache Whirr is a set of libraries for running cloud services."
>
> Jena is not just a set of libraries.  You yourself keep asking about
> Fuseki which is not a library.

Agree.

Fuseki is not a library. It is closer to (though not the same as) something
like Tomcat. But the Tomcat community is much bigger.

The Apache Tomcat distribution area is here:
http://www.apache.org/dist/tomcat/

> You continually tell users to use maven - that's your personal opinion.

I didn't in this thread. :-)

In future, I'll make clear that everything I write about Maven or anything
else is my personal opinion.

I don't care about Maven in itself or any other tool we use to build/manage/
publish Apache Jena artifacts. I am convinced that a repository of artifacts
with their sources and the dependencies between artifacts in a machine
readable format is a very good thing for the entire Java/Apache ecosystem.
The same is true for other programming languages which use different tools
but have an automated way to manage dependencies.

IMHO Maven happens to be the tool, from the ones I've used, that currently
sucks less and 'scales better' (*) with diversity and complexity.
(*) this does not mean 'maximum flexibility' nor freedom.

> I see the complete download as useful to a class of users.

Agree.

>> In relation to the artifacts in the Maven repo, we currently have:
>>
>>
>> https://repository.apache.org/content/repositories/orgapachejena-334/org/apache/jena/
>>
>>
>>    apache-jena/
>>    jena-arq/
>>    jena-core/
>>    jena-iri/
>>    jena-top/
>>
>> My suggestion here, if possible, is not to publish the 'apache-jena'
>> artifact.
>
> And you want a single build command?
>
> It is the maven way to publish artifacts.
>
> Where would not-an-artifact go?
> How would it get there?

The problem with the current 'apache-jena' artifact is, as I said, that
it is not something developers can add to their dependencies in order to
use Jena. The name is a strong attractor and I fear someone might assume
that it is the artifact to depend on.

> What constitutes the formal release artifacts if it's split between
> maven repo and somewhere else?

http://www.apache.org/dev/release.html#what-must-every-release-contain

>
> If you look on incubator-general@, most maven-based projects don't even
> mention dist/.  No idea why - it seems important to me.

Agree.

> I put it in
> because this is our first release and because I think that it should be
> mentioned in the vote.  I explained that in the message on 1/Dec about
> the pre-release trial run.
>
>> Rationale: the files of the distribution are all available in the
>> 'dist' area
>> and apache-jena is not something people can use as their dependencies.
>
> The dist/ area is the official release bytes.

Yep.

>
>> Right now, you cannot have:
>>
>>      <dependency>
>>        <groupId>org.apache.jena</groupId>
>>        <artifactId>apache-jena</artifactId>
>>        <version>2.7.0-incubating</version>
>>      </dependency>
>>
>> While an artifact repository is supposed to be used by programs rather
>> than
>> people, people often browse artifact repositories to find their
>> dependencies
>> or the latest version for a dependency they already have.
>
> This is contradictory to me.  People can browse and find apache-jena.zip.

The artifact org.apache.jena:apache-jena:2.7.0-incubating is not a valid
dependency for developers wanting to use Apache Jena.

If you put this in your dependencies:

      <dependency>
        <groupId>org.apache.jena</groupId>
        <artifactId>apache-jena</artifactId>
        <version>2.7.0-incubating</version>
      </dependency>

You get:

[ERROR] [...] Could not find artifact 
org.apache.jena:apache-jena:jar:2.7.0-incubating in apache-staging-repo 
(https://repository.apache.org/content/repositories/orgapachejena-334/)
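
The ":jar:" in that message is there because a <dependency> without a <type>
element defaults to type 'jar'. The sketch below (Python, with a hypothetical
helper name) shows how the coordinates map to a file path in the standard
Maven 2 repository layout, so you can see which file Maven goes looking for.
It ignores classifiers, and I am assuming here, from the discussion above,
that apache-jena is only published as an archive such as a zip and not as
a jar.

```python
def artifact_path(group_id, artifact_id, version, packaging="jar"):
    """Relative path of an artifact in the Maven 2 repository layout.

    'packaging' stands in for the dependency <type>, which defaults to
    'jar' when the POM does not say otherwise.
    """
    group_dir = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{packaging}"
    return f"{group_dir}/{artifact_id}/{version}/{file_name}"


# What the dependency above makes Maven request (default type 'jar'):
jar_path = artifact_path("org.apache.jena", "apache-jena", "2.7.0-incubating")

# Where a zip with the same coordinates would live (assumption, see above):
zip_path = artifact_path("org.apache.jena", "apache-jena", "2.7.0-incubating",
                         packaging="zip")
```

Here jar_path ends in apache-jena-2.7.0-incubating.jar, which is the file
the error says cannot be found.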


> We could now add making it using Jenkins at low cost.  I don't know
> another way to do that.
>
>>
>> The principle of less choices is still valid here. More importantly,
>> apache-jena would be the wrong choice and probably cause frustration
>> and questions as result.
>>
>> Paolo
>>
>
>     Andy

Paolo
