There have been various discussions about this topic on the Infrastructure mailing list. I have summarised those into a proposal and built a web page at <http://forrest.apache.org/proposal-asf-publish.html>. The text content is copied below to facilitate further discussion.
This proposal links to a separate proposal document which shows how the "Forrestbot" addresses part of that infrastructure.
------------------------------------------------------------------------
Warning
This is a draft proposal document. It is not yet the consensus of ASF
nor of the Infrastructure committee. This proposal is a summary of
various email discussions held over the past years, especially on
[EMAIL PROTECTED] around 2004-07-29 which expanded on previous
discussions.
Overview
All ASF projects need to be able to concentrate on their projects and the content of their websites, rather than get tangled up in arcane website publication procedures.
There is a "staging and publishing server" which is separate from the live production web server. The project committers would commit their source changes, then trigger a "documentation build", then review the staging website. When satisfied, they "approve and publish" so as to copy the stage to the "publication point". There are a number of rotated older versions of the publication point. A deliberate action on the live webserver causes rsync to pull the current publication point into production.
Publication infrastructure and actions
+------------------+  [A] Commit the changes to source documents
|                  |   |
| svn.apache.org   |   |
|                  |  [B] +------------------------------------+
|                  |   |  | Source docs managed in project SVN |
+------------------+   |  +------------------------------------+
                       V
+------------------+  [C] Trigger the build
|                  |   |
| stage.apache.org |   |
|                  |  [D] +---------------------+
|                  |   |  | Build the documents |
|                  |   |  +---------------------+
|                  |   |
|                  |  [E] +-------------------------------+
|                  |   |  | Staging server enables review |
|                  |   |  +-------------------------------+
|                  |   |
|                  |  [F] Approve and publish
|                  |   |
|                  |  [G] +----------------------------+
|                  |   |  | Publication point          |
|                  |   |  | (with some older versions) |
+------------------+   |  +----------------------------+
                       V
+------------------+  [H] Rsync pull into production
|                  |   |
| www.apache.org   |   |
|                  |  [I] +--------------------------------------+
|                  |   |  | Production webserver $tlp.apache.org |
+------------------+   |  +--------------------------------------+
                       V
[A] Commit the changes to source documents
The content changes are committed to the project's source repository. The committer might have already built and tested the documents with their local documentation build system. On other occasions, they might commit changes without building locally. Some committers might not even have installed a local build system; they might just edit or patch the content.
[B] Source docs are managed in project SVN
The source files for the project's website are held in an SVN repository. These might be XML source for some projects, while others might have simple HTML docs.
Forgive my naiveté here, but is this process different for a project like xml-fop, which uses CVS for version control (i.e., would xml-fop and other 'CVS' projects have a corresponding cvs.apache.org)? Or is this totally separate from a project's version control, and everyone uses svn.apache.org for this stage of the process?
[C] Trigger the build
The build is triggered either via a secure https web interface, or via ssh to the server using the command line.
[D] Build the documents
The build system on the server will generate the project documents and deploy them to the staging server website.
Projects can use various documentation tools: Anakia, Forrest, Maven, raw html, etc. Each system would have its own ways to report build problems to the committer (e.g. xml validation, broken links, content and spelling errors, configuration errors).
[E] Staging server enables review
A pre-release website. Anyone can review online. Some projects might want to password-protect.
[F] Approve and publish
When satisfied, the committers "approve and publish", which copies the staged site to the "publication point". This is done either via a secure https web interface, or via ssh to the server using the command line.
[G] Publication point
A holding area, from which the production website can be recreated as required. Keep a number of rotating versions, i.e.

  rm -rf ${publish_dir}.3
  mv ${publish_dir}.2 ${publish_dir}.3
  mv ${publish_dir}.1 ${publish_dir}.2
  mv ${publish_dir} ${publish_dir}.1
  mv $staging_dir $publish_dir
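The rotation above could be wrapped in a small function so that a missing argument or a missing older version does not abort or misdirect the sequence. This is only a sketch: the function name rotate_publish and its argument order are assumptions, not part of the proposal, and it keeps three older versions as in the commands above.

```shell
#!/bin/sh
# Sketch only: rotate the publication point, keeping three older versions.
# rotate_publish is a hypothetical helper name, not part of the proposal.
# Usage: rotate_publish STAGING_DIR PUBLISH_DIR
rotate_publish() {
    staging_dir=$1
    publish_dir=$2

    # Refuse to run with missing arguments, so "rm -rf" never sees an empty path.
    if [ -z "$staging_dir" ] || [ -z "$publish_dir" ]; then
        echo "usage: rotate_publish STAGING_DIR PUBLISH_DIR" >&2
        return 2
    fi
    # Nothing to publish if the staging area does not exist.
    if [ ! -d "$staging_dir" ]; then
        echo "rotate_publish: no staging directory: $staging_dir" >&2
        return 1
    fi

    # Drop the oldest copy, then shift each remaining version down by one.
    rm -rf "${publish_dir}.3"
    if [ -d "${publish_dir}.2" ]; then mv "${publish_dir}.2" "${publish_dir}.3" || return 1; fi
    if [ -d "${publish_dir}.1" ]; then mv "${publish_dir}.1" "${publish_dir}.2" || return 1; fi
    if [ -d "${publish_dir}" ];   then mv "${publish_dir}"   "${publish_dir}.1" || return 1; fi

    # Promote the reviewed staging tree to be the new publication point.
    mv "$staging_dir" "$publish_dir"
}
```

As in the original commands, the staging directory is moved rather than copied, so the next build has to recreate it.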
[H] Rsync pull from publication point into production
Someone with commit access for the project would issue a command on the live web server to synchronise with the current contents of the publication server via rsync pull. This would either be executed by ssh and command-line, or via a secure https web interface.
We want the final rsync to be independent, so that it can also be executed by infrastructure people in the event that the web sites need to be recreated. The rsync would be manual.
The old way ...

  cd /www/$tlp.apache.org; cvs up -Pd

or

  cd /www/jakarta.apache.org/$proj; cvs up -Pd
The new way ...

  rsync -avz -e ssh --delete stage.apache.org:/www/$tlp.apache.org/ \
    /www/$tlp.apache.org/

or

  rsync -avz -e ssh --delete stage.apache.org:/www/jakarta.apache.org/$proj/ \
    /www/jakarta.apache.org/$proj/
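Because the rsync uses --delete, a mistyped path could remove a live site, so it may help to print the command for review before running it. The following is a rough sketch under stated assumptions: the function name pull_site, the STAGE_HOST and WWW_ROOT variables, and the "--run" argument are all hypothetical; only the rsync options and the stage.apache.org and /www paths come from the commands above.

```shell
#!/bin/sh
# Sketch only: build the rsync pull for one site and print it unless told to run.
# pull_site, STAGE_HOST, WWW_ROOT and "--run" are illustrative names (assumptions).
# Usage: pull_site SITE_PATH [--run]
#   SITE_PATH is relative to /www, e.g. "$tlp.apache.org" or "jakarta.apache.org/$proj".
STAGE_HOST=${STAGE_HOST:-stage.apache.org}   # host name taken from the proposal
WWW_ROOT=${WWW_ROOT:-/www}                   # document root used in the proposal

pull_site() {
    site_path=$1
    mode=${2:-}

    # Reject empty, absolute, or parent-relative paths before using --delete.
    case $site_path in
        ''|/*|*..*)
            echo "pull_site: refusing suspicious site path: '$site_path'" >&2
            return 2 ;;
    esac

    # Trailing slashes matter: they copy the contents of the source directory
    # into the destination directory instead of nesting one inside the other.
    src="${STAGE_HOST}:${WWW_ROOT}/${site_path}/"
    dest="${WWW_ROOT}/${site_path}/"

    if [ "$mode" = "--run" ]; then
        rsync -avz -e ssh --delete "$src" "$dest"
    else
        # Default: show the command so a human can check it first.
        echo "rsync -avz -e ssh --delete $src $dest"
    fi
}
```

Calling pull_site forrest.apache.org only prints the command; adding --run as a second argument executes it. This keeps action H a deliberate manual step.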
[I] Production webserver for the project
The live production website for the project at $tlp.apache.org
Other notes
* The actions (A and C and F and H) are completely independent manual steps and are deliberate accountable acts. This ensures human oversight in the deployment process.
* The actions should not be automated, especially action H. If someone did manage to break in to the publishing server, then their changes would be automatically published.
* Some people would like action C and action F to be automated (say every 30 hours). Committers can still trigger it manually at other times.
* The actions F and H could be combined. For example, we could have a script on the production server that contacted the publishing server to perform action F and then performed the rsync (action H).
* Apache Forrest has proposed an ASF Forrestbot as one method for projects to handle the "staging server" (item C through to item G). This does not preclude other mechanisms.
* The Doco concept adds interactive and workflow capabilities to this publication infrastructure.
Background and impediments
This is a collection of notes about the past impediments which have hindered the publishing process ...

* The generated project sites were maintained in source control, primarily to enable the infrastructure team to restore the live web server in case of emergency. That added one more level of complexity for the projects.
* When people wanted to work on projects docs, they were hindered by needing to install the document generation system locally. That was too onerous for some projects and caused delays with website maintenance.
Scratch notes
Some notes which have not yet been incorporated ...

* How would we log the actions?
* Noel: We need to accommodate sites that come from a single source,
  and sites that come from multiple sources,
  e.g. Jakarta or the XML Federation.
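
On the logging question, one possible approach is to append a line to a plain-text log each time a manual action is taken. This is a sketch for discussion only: the function name log_action, the PUBLISH_LOG variable and its default path are hypothetical and are not defined anywhere in the proposal.

```shell
#!/bin/sh
# Sketch only: append one line per deliberate action to a plain-text log.
# PUBLISH_LOG and its default path are hypothetical, not defined by the proposal.
PUBLISH_LOG=${PUBLISH_LOG:-/var/log/site-publish.log}

# Usage: log_action ACTION SITE
#   ACTION is one of the manual steps from the proposal: A, C, F or H.
log_action() {
    action=$1
    site=$2

    # Only the four manual, accountable actions are accepted.
    case $action in
        A|C|F|H) ;;
        *) echo "log_action: unknown action: '$action'" >&2; return 2 ;;
    esac

    # Record when (UTC), who, which step, and which site.
    printf '%s %s %s %s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "${USER:-$(id -un)}" \
        "$action" \
        "$site" >> "$PUBLISH_LOG"
}
```

For example, log_action H forrest.apache.org would record who pulled that site into production and when.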
------------------------------------------------------------------------
With the exception of my one note above (svn.a.o vs. cvs.a.o), the above sounds good^H^H^H^H GREAT to me! I hope others will comment on this (at least to say "Looks good to me!") so this process can move forward, and we can relieve ourselves of this onerous issue.
Thank you David for writing such a clear and concise proposal!
Web Maestro Clay
--
Clay Leeds - <[EMAIL PROTECTED]>
Webmaster/Developer - Medata, Inc. - <http://www.medata.com/>
PGP Public Key: <https://mail.medata.com/pgp/cleeds.asc>