[RT] Cocoon's own publishing system

Stefano Mazzocchi Tue, 11 Mar 2003 07:48:54 -0800

Diana Shannon wrote:

On Tuesday, March 11, 2003, at 05:32 AM, Pier Fumagalli wrote:
I suggest linking to http://xml.apache.org/cocoon/mirror.cgi#nightly but
I guess that's your intention ;-)
Yes, but at that point we'll have to re-build the site once again...
Ok, we shouldn't be so limited in rebuilding the site as often as necessary.


The wiki shows pretty clearly that our documentation publishing system
is currently too hard to use and too slow to work with. This is evident
when people are even afraid to touch it.

We must fix this.

Here's why it has been tedious and time consuming in the past, at least for me.

Ok, let's start planning something better.

1. Many, many committers weren't updating release and head branches with their doc updates. It took time to scrutinize differences in the branches, to make sure all relevant docs were in the release branch, which is what is used to generate the web site.

Agreed this is a problem.

Hopefully, now that we have two clearly separated repositories, people
will document only in the appropriate one.

2. Updating the live site repository is time consuming, at least for me, on a slow dial-up connection (I live in a rural area of the US with no broadband option). The api docs directory is the time killer here. I spent eight hours, one night, simply performing a cvs update followed by a cvs commit. The most recent update wasn't so bad. The commit/update took only 2.5 hours.

Oh, god. I didn't know that. This is a shame. I'm sorry, Diana. I know you don't pay per-minute dial-up fees as we normally do here in Europe, but still.

We must make a better system.

3. I was really excited about Forrest transition, thinking the automation would save me all of the above time which I could devote to docs content. Unfortunately: - only a few committers participated in the trial run, so it seemed to me, interest/support is not that great.


I would like to know which issues are still on the table so that we
can work on them. Forrest is clearly the way to go, and the site
transition gives us the opportunity to think about it.

- Forresters seemed to suggest, and I could be wrong, that the live site cvs update would **still** be required even with Forrest.

No, not necessarily, but a totally different system must be set up.

Thus, I failed to see how the transition would make my volunteer committer life any more liberated, since this time killing step was still necessary.

This *must* go away.

I'm happy to help with updating the site based on the revised cvs mirror links discussed in this thread. However, I can't do it until later this week. In the future, I think it's better if more committers would share the burden of updating the live site cvs every now and then, particularly those with greater bandwidth connections. In the hopes that this will happen, I'll post detailed instructions on how this can be done on wiki. (I've posted email instructions on two separate occasions in the past which I will now fine tune.)


Please do, those will help, but for now, let's clear the whiteboard and
start outlining the best publishing system.

- o -

Apache has very high security standards. If our web sites get hacked, the ASF image of quality and security is damaged. Having docs that are easier to install but less secure is not an option.

This means that every solution must be *designed* around security.

IMO, the metapattern of IoC gives us a lot of security. So, the best publishing system would be something like:

repository -(generation)-> staging -(publishing)-> production

where

repository is used for storing our documents

staging is the location of the staged documentation

production is the location of the final docs

From a security analysis, a compromise of the staging area is not a big risk. This means that staging can be automated without major political issues.

While the above arrows indicate the flow of data, the flow of control must be inverted to provide complete vertical security:

repository <-(reads from)- staging <-(reads from)- production
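
The inversion described above can be illustrated with a small model. This is only a sketch: the `Stage` class, its `pull` method, and the sample file are hypothetical names made up for illustration, not part of any proposed tooling. The point it tries to show is that each tier holds a reference to its upstream only, so nothing upstream ever needs credentials for, or knowledge of, what sits downstream.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Stage:
    """One tier of the pipeline; it can only read from its upstream."""
    name: str
    upstream: Optional["Stage"] = None
    files: Dict[str, str] = field(default_factory=dict)

    def pull(self, transform=lambda text: text) -> None:
        """Copy everything from upstream, applying an optional transform."""
        if self.upstream is None:
            raise ValueError(f"{self.name} has no upstream to read from")
        self.files = {path: transform(text)
                      for path, text in self.upstream.files.items()}


# Hypothetical sample content standing in for the real document sources.
repository = Stage("repository", files={"index.xml": "<doc>hello</doc>"})
staging = Stage("staging", upstream=repository)
production = Stage("production", upstream=staging)

# Data flows downstream, but every step is initiated by the reader.
staging.pull(transform=lambda xml: xml.replace("doc>", "html>"))  # "generation"
production.pull()                                                 # "publishing"
```

Note that `repository` has no reference to `staging`, and `staging` has none to `production`, so a compromised tier has no path to push anything further down the line.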

- o -

So, here is the plan I propose:

1) repository is CVS on icarus, as it is today. No changes are required in the editing/authoring process (for now, at least)

2) automated staging server is moof (or nagoya)

I'd suggest installing it on moof (or nagoya) [moof is a Mac OS X server donated to the ASF by Apple and located on their campus, with lots of bandwidth and support for final Java 1.4.1 as of yesterday, administered by Apache infrastructure]

Checkout is done over anoncvs, so no possible compromise from the staging server to the repository.

3) the staging server should work for docs as gump works for nightly builds, nagging the appropriate mail list if:

 - docs aren't valid
 - links are broken

Note that javadocs and idldocs must be automated as well.
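
As a rough illustration of the broken-link half of point 3, here is a minimal sketch using only the Python standard library. It assumes the staged site is a directory of `.html` files and checks relative links only; the function names and the list address used in the usage note are hypothetical, and this is not how Forrest or gump actually perform the check. Validity checking of the source documents is not covered.

```python
import os
from email.message import EmailMessage
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_broken_links(site_root: str) -> List[Tuple[str, str]]:
    """Return (page, href) pairs whose relative target file does not exist."""
    broken = []
    for dirpath, _dirs, filenames in os.walk(site_root):
        for filename in sorted(filenames):
            if not filename.endswith(".html"):
                continue
            page = os.path.join(dirpath, filename)
            collector = _HrefCollector()
            with open(page, encoding="utf-8") as handle:
                collector.feed(handle.read())
            for href in collector.hrefs:
                # Skip external, site-absolute, mail, and in-page links.
                if "://" in href or href.startswith(("mailto:", "#", "/")):
                    continue
                target = os.path.join(dirpath, href.split("#", 1)[0])
                if not os.path.exists(target):
                    broken.append((os.path.relpath(page, site_root), href))
    return broken


def build_nag(broken: List[Tuple[str, str]],
              list_address: str) -> Optional[EmailMessage]:
    """Build (but do not send) a nag mail; return None if nothing is broken."""
    if not broken:
        return None
    message = EmailMessage()
    message["To"] = list_address
    message["Subject"] = f"[staging] {len(broken)} broken link(s) in generated docs"
    message.set_content("\n".join(f"{page}: {href}" for page, href in broken))
    return message
```

A staging job could call `find_broken_links()` on the generated output after each run and hand the result to `build_nag()` with the appropriate list address; actually sending the message is left out on purpose.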

4) the staging server will regenerate results automatically, grabbing the changes out of CVS directly. This operation will run unassisted, just like gump. In fact, Forrest was born to be the gump alter-ego for documentation.

5) when publishing on production is needed, a person with an account on icarus will simply log in and perform a remote rsync between the staging area and the server. A simple script with a readme will do the job. I estimate it might take less than 60 seconds to update the web site this way, since the throughput between moof and daedalus is several Mbit/s.
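
The "simple script" in point 5 could look roughly like the sketch below. It is only an assumption of how such a script might be written: the host name and both paths in the usage note are placeholders, and the function names are hypothetical. It is run on the production side and pulls from staging over SSH, in keeping with the inverted flow of control, and it defaults to a dry run so that nothing is changed by accident.

```python
import shlex
import subprocess
from typing import List


def build_publish_command(staging_source: str, production_dir: str,
                          dry_run: bool = True) -> List[str]:
    """Build an rsync command that pulls the staged site into production."""
    command = [
        "rsync",
        "--archive",    # preserve permissions, times, and directory structure
        "--compress",   # compress data in transit
        "--delete",     # remove files that no longer exist on staging
        "--rsh=ssh",    # transport everything over SSH
    ]
    if dry_run:
        command.append("--dry-run")  # report what would change, change nothing
    # Trailing slashes make rsync copy the directory contents, not the directory.
    command += [staging_source.rstrip("/") + "/",
                production_dir.rstrip("/") + "/"]
    return command


def publish(staging_source: str, production_dir: str,
            dry_run: bool = True) -> None:
    """Print and run the rsync command, raising if rsync reports an error."""
    command = build_publish_command(staging_source, production_dir, dry_run)
    print(shlex.join(command))
    subprocess.run(command, check=True)
```

For example, `publish("staging.example.org:/path/to/staged/site", "/path/to/live/site")` would show what a real update would transfer; passing `dry_run=False` performs it. Both arguments here are placeholders, not the actual locations on moof or daedalus.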

- o -

From a usability point of view we gain:

- no need for broadband. This means that people can update the site even on a GPRS cellphone, or on a slow link in Africa.

- we are nagged if things go wrong: continuous integration for documents.

- the site update frequency will hopefully improve, thus improving our quality of service.

From a security point of view:

- we use existing ASF-proven security infrastructure (everything is done over SSH)

- by keeping the staging server on a different machine and with no information on how to access the others, we don't create more points of failure

- o -

NOTE: from an operational point of view, Pier has enough karma to set up everything we need on moof or nagoya, as well as to provide accounts for those who want to help run the staging server (I would suspect Jeff and Steven to be interested in helping out, hopefully others as well). We might need to post our plan of action to infrastructure@ once we decide what to do, but since there are no security issues they shouldn't be concerned about it (I've already discussed this architecture and people didn't have objections).

Comments and suggestions will be very appreciated.

Thanks.

S.
