[sorry for cross-post: this is a general issue, but I'd like the cocoon people to know what I'm doing so that they might give me a hand :)]
I started the effort that will, hopefully, bring us a much more useful documentation system for xml.apache.org and, hopefully, for the entire ASF, even if political and ego obstacles get in the way. I personally don't care: this effort is mainly to create a better documentation infrastructure following the goals outlined below.

I started the Cocoon project three years ago exactly for this reason and, now that it has all the features I needed, I think I can attack the problem from a very wide angle.

The site building system will be targeted toward xml.apache.org, but I'll keep a very broad perspective, making it possible to adapt the system to other apache.org projects with very few changes.

BIG DISCLAIMER: whether this happens or not, however, I personally don't care. For sure, don't count on me wasting my time fighting about 'my DTD is better than yours' or 'my system is faster/smaller/cleaner/easier-to-use/more-extensible than yours'. I'll come up with a system that works and then you guys will vote on what to do.

I consider this an exercise to present Cocoon's full potential (which, objectively, beats the pants off all the other systems used around Apache) but nothing more than this.

                                    - o -

Ok, now that I have stated this, let's get into the goals of the effort.

GOALS
-----

1) Speed: current xml.apache.org is slow. Empirical studies on learning processes indicate that if a page takes more than 10 seconds to load on a 56 Kbps modem, the cognitive experience is degraded.

2) Coherence: current xml.apache.org is extremely incoherent. Again, it's easy to understand that lack of coherence between subproject docs is perceived as (and sometimes reflects!) lack of cooperation.

3) Navigation: the navigation experience on current xml.apache.org is a nightmare. There is no way to perceive the basic elements of spatial navigation: where am I? where can I go? how do I go back? how do I go there?

4) Depth: the current xml.apache.org page layout forces a flat hierarchy of levels.
The current Cocoon documentation somewhat extends this, but the visual look doesn't reflect the notion. Visual codes are extremely important to allow easy and immediate navigation even at the deepest level.

5) Usefulness: xml.apache.org contains powerful software but it's not powerful in itself. It should be a window on the information useful for both users and developers, along with friendly behavior, such as print-friendly versions of the single pages and of the whole per-subproject documentation, pagination of long articles, site-restricted search, graphs of project-related data and so on.

6) Simplicity: xml.apache.org is done by volunteers, on all levels. Nobody is directly paid to do this. Not even myself. So, if the above goals are met, but the system is not simple and immediate to use for those who have to maintain and update the information, the result will be void within a short period of time.

7) Extensibility, Flexibility, Modularity: web sites, just like software, are living entities that adapt to their environment. The build system must not restrict the ability to extend the information architecture in an evolutionary way.

8) URI Solidity and Future Compatibility: URIs are contracts between the publisher and the user. Human users have the ability to estimate the long-term validity of these contracts and 'route around' eventual broken links, while machine users do not. The goal is to come up with a system that allows us to generate a web site with strong URIs.

Design Decisions
----------------

staticity: even if I think that the availability of a dynamic publishing system would be beneficial, considering the web site load, the load of the apache machines, the state of the JVM for FreeBSD and the political problems behind all this, it's *much* easier (at least for now) to have a static version of the site batch-produced and then placed into the web-serving space.

automaticity: the site will be automatically generated out of files stored in CVS.
The idea is to have GUMP-like nagging features that send email to the various mail lists, using XML validation to estimate the 'integrity' of the docs placed there.

For this reason, in honor of Sam Ruby's great work, and for the resonance with 'forest', thus a huge number of trees (i.e. XML files), I call this effort "Forrest".

I believe that together, Forrest and Gump will help bring apache quality one step up (moreover, as in the name, Forrest wraps Gump and will publish its generated data, providing more overall coherence).

                                    - o -

separation of concerns
----------------------

There are three concern islands; here is a list of their duties.

subproject
==========

each subproject should provide:

3.a) a 'description' file that includes information on the codebase, its description, its released versions, its CVS modules, its CVS tags, its mail lists and its documentation sets (yes, a subproject might have more than one, think of Xerces1/Xerces2, Xalan1/Xalan2, Cocoon1/Cocoon2). [proposed filename: /description.xml]

3.b) a 'committers info' file that includes information on the committers, along with a short bio, an email address and a picture of them. [proposed filename: /committers.xml]

3.c) a 'change log' file that includes information on changes and software releases [proposed filename: /changes.xml]

3.d) a 'todo list' file that includes the information on things to do and who volunteered to do them [proposed filename: /todo.xml]

3.e) a 'news' file that includes events and useful information that should be made available to the general public.

then, for each documentation set (its location is taken from the description file):

3.f) a 'table of content' that indicates the hierarchical sequence of the files and where to find them in the CVS repository (for each documentation set). This is kept as a single file to allow document writers to maintain 'coherence' and visualize the entire part. This is equivalent to the stylebook book.xml file but with full nesting capabilities.
3.g) the pages that compose the documentation (their location is taken from the ToC file)

Log scanner
===========

The log scanner is a set of scripts that scan the logs from the CVS, the mail lists and the web site to gather information on:

1) mail list activity (subscribers and messages)
2) web site activity (hits and downloads)
3) CVS activity (general commits, commits per person)

This scanner provides this information in a simple format that can be easily fed into the documentation building system.

Build system
============

The build system will:

1) aggregate, filter and otherwise adapt the information collected from the various subprojects' CVS modules, from the log scanner and from the GUMP run into static HTML files (for the browser pages), static PDF files (for print-friendly versions) and JPEG images (for graphs).

2) generate navigation information in all the pages

3) check validation of all the required XML files and send nag messages to the mail lists if a failure occurs.

4) generate httpd-related corollary files (.htaccess, header.html, footer.html and so on).

5) upload online the parts that didn't have failures.

The goal is to have the system run completely autonomously: this follows the Gump approach. [Sam, I'll need your help here, since I don't have an account on nagoya]

                                    - o -

Things to decide
================

1) DTDs
-------

The Cocoon project already has DTDs for 'documentation', 'change logs', 'todo list' and 'specifications'. They mainly use XHTML tags and are very easy to learn (they are an expansion of the original stylebook DTDs, so it's pretty easy to automatically adapt existing stylebook documents to this improved DTD, still keeping the simplicity we had before).

The rest of the required DTDs (description, news and ToC) must be agreed upon (I'll work on them in the next few days).

2) URIs
-------

In order to achieve the future-compatibility goal, we must come up with a guideline for URIs.
For example, the Cocoon project had /cocoon and /cocoon2, then Cocoon 2.0 was released final and we moved /cocoon2 into /cocoon and /cocoon into /cocoon1, creating a shit-load of broken links.

Two solutions were proposed (add your own if you have more):

a) use version-specific information and use mod_rewrite to adapt. for example

  xml.apache.org/cocoon/1.8.2/index.html
  xml.apache.org/cocoon/2.0b1/index.html
  xml.apache.org/cocoon/2.0b2/index.html
  xml.apache.org/cocoon/2.0rc1/index.html
  xml.apache.org/cocoon/2.0rc2/index.html
  xml.apache.org/cocoon/2.0/index.html

then

  xml.apache.org/cocoon/ -> xml.apache.org/cocoon/2.0/index.html

The problem is that while those versioned URIs are never broken, the version-less redirected URI changes with each release, so it doesn't reduce broken links. Also, it results in unnecessarily big web sites, when it's probably easier to download the required version and look into the shipped docs.

b) use semantically meaningful yet version-less URIs

  xml.apache.org/cocoon/previous/ -> points to the previous generation docs
  xml.apache.org/cocoon/          -> points to the latest docs
  xml.apache.org/cocoon/next/     -> points to the next generation docs

which removes the need to keep all the docs versions online, yet provides the ability to have the latest version alongside the previous and next generations (for Cocoon that would be Cocoon 1.8.2, Cocoon 2.0 and Cocoon 2.1-dev today).

The problem of broken links isn't solved, since every time there is a transition, there is a chance of breaking previously established links if the docs ToC changes from one generation to the next.

3) layout
---------

The layout previously proposed on this list was a solution to the speed problem but I couldn't adapt it to the depth needs identified in the rest of the goals.

So, I resurrected my rusty web design skills and came up with the layout you find attached. I've tested it on IE 5.5, NS 4.78 and Moz 0.9.5 on win2k.

Feedback, suggestions and criticisms are appreciated.
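
To make option b) of the URI question above concrete, here is a minimal sketch in Python of how version-less URIs could map onto doc generations. Everything in it is an assumption for illustration only: the `GENERATIONS` table, the `resolve` function and the page name `faq.html` are hypothetical, and nothing here is part of Forrest or of any existing tool.

```python
# Hypothetical sketch of option b): version-less URIs that point at
# "previous", "latest" and "next" doc generations. The table below uses the
# Cocoon versions named in this mail; the names and layout are assumptions.
GENERATIONS = {"previous": "1.8.2", "latest": "2.0", "next": "2.1-dev"}


def resolve(path, generations=GENERATIONS):
    """Map a path such as /cocoon/next/faq.html to (version, page)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "cocoon":
        raise ValueError("not a Cocoon docs path: %r" % path)
    rest = parts[1:]
    slot = "latest"  # /cocoon/ itself always means the latest docs
    if rest and rest[0] in ("previous", "next"):
        slot, rest = rest[0], rest[1:]
    page = "/".join(rest) or "index.html"
    return generations[slot], page


if __name__ == "__main__":
    for uri in ("/cocoon/", "/cocoon/previous/", "/cocoon/next/faq.html"):
        print(uri, "->", resolve(uri))

    # After a release the table shifts by one generation, so the same URI
    # silently starts serving a different generation of the docs.
    after_release = {"previous": "2.0", "latest": "2.1", "next": "2.2-dev"}
    print("/cocoon/faq.html ->", resolve("/cocoon/faq.html", after_release))
```

The last call shows the weak spot already noted for option b): the URI stays the same across a transition, but whether the page behind it still exists depends on the ToC of the new generation.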
4) CVS location and mail list discussions
-----------------------------------------

Just like Gump, which is not a subproject on its own, Forrest doesn't deserve that status either as long as it remains a one-man show (and my experience tells me it will very likely remain so if the above goals are met).

At the same time, just like Gump, it requires a CVS space. Possible places are:

1) xml-site
2) xml-forrest
3) xml-site2

for mail list discussions, solutions are:

1) [EMAIL PROTECTED]
2) [EMAIL PROTECTED]
3) [EMAIL PROTECTED]
4) [EMAIL PROTECTED]

Please add your comments/suggestions and your votes where a decision is required.

Thank you.

--
Stefano Mazzocchi      One must still have chaos in oneself to be able
                       to give birth to a dancing star.
<[EMAIL PROTECTED]>                      Friedrich Nietzsche
--------------------------------------------------------------------
new-site.zip
Description: Zip compressed data