Rob,
I support your general point. Using static HTML files to achieve
performance might have been a sound argument in the 1990s, but it isn't
really credible with today's platform technologies. What are the
transactional rates for the Apache site? How many requests per second,
even just roughly?
Taking your example of the MediaWiki engine, this is scaled to meet the
transactional and data volume demand of wikipedia.org, one of the
busiest websites on the planet. (There are typically ~100 updates per
second and goodness knows how many pageviews.) See
http://www.mediawiki.org/wiki/Cache and the few dozen subsidiary pages.
There are many high performance caching products that address this issue
-- Apache even does one: http://trafficserver.apache.org/ -- and the
MediaWiki engine already integrates with a couple of the leaders: Squid
and Varnish.
Apache's "heartland" is its "number one HTTP server on the Internet".
Are we really saying that the best way to manage content is through
static HTML files? This is just daft IMHO. Has anyone ever heard of
current CMS technology?
How many content editors and contributors can read HTML these days?
One other point: yes SVN or any equivalent versioning repository can
store most types of content, but versioning should take place at the
highest level of abstraction and language that the content providers
work in. Take an extreme example to emphasise this point. SVN can
store object modules, but does this mean that we should use these as
the master copy and disassemble back to assembly code to update
programs? Of course not. But to many editors, HTML is little more than
binary machine code.
Non-functional (infrastructure) requirements help drive the design and
implementation cycles but they shouldn't unnecessarily limit the true
functional requirements of the system. To do so is madness. Is this
really an approach Apache wants to advocate?
Regards
Terry
On Fri, Aug 12, 2011 at 10:10 AM, Shane Curcuru<[email protected]> wrote:
(To provide a little context while Gav may be asleep)
On 8/12/2011 9:26 AM, Rob Weir wrote:
On Fri, Aug 12, 2011 at 3:41 AM, Gavin McDonald<[email protected]>
wrote:
On Thu, Aug 11, 2011 at 12:12 PM, Kay Schenk<[email protected]>
...snip snip snip...
Just a thought: Could you do the entire website in MediaWiki, with only
exception cases (download page, etc.) done in HTML?
Just to put a blocker on this right away, we will not be using the wiki
as the main website or the main entrance into the OOo world.
Since it is not self-evident to me why a wiki would be a problem for
the main website, could you explain this a little further? Is there a
technical problem? Remember, the wiki already comprises several
thousand pages of website content, so in a very real sense the "main"
website is already the wiki.
Performance. As I understand it, the bulk of all apache.org content is
served statically as html files. Putting a major project's homepage website
like the future office.a.o (or whatever name) up as a wiki would add a
significant amount of load to our servers, even for a highly efficient wiki
engine.
Thanks, that gives some context. So "main" in this case is not
necessarily only the top level page, i.e., an eventually
openoffice.apache.org or the current www.openoffice.org. Certainly
those pages would be some of the most highly-trafficked pages. But we
probably have some others that are also heavily used: FAQs, release
notes, the download page, etc.
But that still leaves the long tail of the thousands of other pages
that are individually accessed rarely, but may add up to significant
load.
I'm surprised there is no caching mechanism for MediaWiki to simply
write out static versions of pages and then invalidate the cache
for a particular page when it is changed. In theory you could have
the rarely-changed pages be just as efficient as static HTML. Plugins
exist that do this for WordPress, for example.
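
To make the idea above concrete, here is a rough sketch in Python of "write out a static copy, then invalidate it when the page changes". It is not how MediaWiki or any WordPress plugin actually implements this; the class StaticPageCache, the render callback, the file-naming scheme and the toy pages dict are all made up for illustration.

```python
import tempfile
from pathlib import Path


class StaticPageCache:
    """Keep rendered pages on disk as .html files; drop a file when its page is edited."""

    def __init__(self, cache_dir, render):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.render = render      # hypothetical callback: page title -> HTML string
        self.render_count = 0     # how many times the slow (dynamic) path ran

    def _path(self, title):
        # Hypothetical naming scheme; a real wiki would need proper escaping.
        return self.cache_dir / (title.replace("/", "_") + ".html")

    def get(self, title):
        path = self._path(title)
        if path.exists():
            # Fast path: the page is already a plain static file.
            return path.read_text(encoding="utf-8")
        # Slow path: render once, then write the static copy for next time.
        html = self.render(title)
        self.render_count += 1
        path.write_text(html, encoding="utf-8")
        return html

    def invalidate(self, title):
        # Called from the "page saved" hook: remove the stale static copy.
        self._path(title).unlink(missing_ok=True)  # needs Python 3.8+


# Toy stand-in for the wiki: page sources in a dict and a trivial renderer.
pages = {"FAQ": "How do I install?"}


def render(title):
    return "<html><body><h1>%s</h1><p>%s</p></body></html>" % (title, pages[title])


cache = StaticPageCache(tempfile.mkdtemp(), render)
first = cache.get("FAQ")    # rendered and written to disk
second = cache.get("FAQ")   # read back from the static file
pages["FAQ"] = "See the install guide."
cache.invalidate("FAQ")     # the edit invalidates only this one page
third = cache.get("FAQ")    # rendered again with the new content
```

In a real deployment the web server would presumably serve the cached .html files directly, so a rarely-changed page would cost about the same as a hand-written static file and the wiki engine would only run on a cache miss or an edit.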
The beauty of the CMS is that while it's easy to work on the pages (either
via SVN or browser), the final result is simply checked into SVN and then
the resulting .html file is just stuck on the production webserver site.
Some projects use a wiki to manage their homepages (i.e. project.a.o,
separate from any community wiki they may have), but the physical homepage
that end-users see is typically static html that's been exported from their
wiki site.
Gav or infra folk can provide more details, but you should plan on adhering
to whatever performance restrictions the infra team requires for the main
website.
- Shane