On Fri, Aug 12, 2011 at 12:19 PM, Terry Ellison <[email protected]> wrote:
> Rob,
>
> I support your general point. Using static HTML files to achieve this might have been a sound argument in the 1990s, but it isn't really credible with today's platform technologies. What are the transaction rates for the Apache site? How many requests per second, even just roughly?
>
> Taking your example of the MediaWiki engine, this is scaled to meet the transactional and data-volume demands of wikipedia.org, one of the busiest websites on the planet. (There are typically ~100 updates per second and goodness knows how many pageviews.) See http://www.mediawiki.org/wiki/Cache and the few dozen subsidiary pages. There are many high-performance caching products that address this issue -- Apache even has one: http://trafficserver.apache.org/ -- and the MediaWiki engine already integrates with a couple of the leaders: Squid and Varnish.
>
> Apache's "heartland" is its "number one HTTP server on the Internet". Are we really saying that the best way to manage content is through static HTML files? This is just daft, IMHO. Has anyone ever heard of current CMS technology?
>
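To make Terry's Squid/Varnish point concrete: MediaWiki keeps pages cacheable for long periods by purging just the edited page from the front-end proxy on save. Below is a minimal sketch of that purge-on-edit pattern, assuming a proxy that has been configured to accept the PURGE method; the host names and URL layout are placeholders, not any actual setup.

```python
# Minimal sketch of the purge-on-edit pattern MediaWiki uses with
# Squid/Varnish: on every page save, send an HTTP PURGE for the page's
# URL so the proxy drops its cached copy and refetches on the next hit.
# PROXY_HOST and the path are hypothetical; the proxy must be set up to
# accept PURGE (Varnish via a vcl_recv rule, Squid via a purge ACL).
import http.client

PROXY_HOST = "cache.example.org"   # hypothetical caching proxy
PROXY_PORT = 80

def purge(path: str) -> int:
    """Ask the front-end cache to drop one page, e.g. /wiki/Main_Page."""
    conn = http.client.HTTPConnection(PROXY_HOST, PROXY_PORT, timeout=5)
    conn.request("PURGE", path, headers={"Host": "wiki.example.org"})
    status = conn.getresponse().status
    conn.close()
    return status   # 200/204 = purged; 404 = page was not cached anyway

# Called from the wiki's save hook: pages stay cacheable with a long TTL
# because an edit explicitly invalidates just the changed page.
if __name__ == "__main__":
    print(purge("/wiki/Main_Page"))
```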
If you read the thread, it started with a suggestion from me that we use the wiki as the main website. Then Gavin said the main website must be served (not necessarily managed) as static HTML. I asked why. Shane said it was for performance reasons. I said that caching should be able to handle that. So I don't think you and I disagree on this point. If you think you do, then you'll need to try harder to explain it. I certainly am not disagreeing with you. Unless I misunderstood you ;-)

> How many content editors and contributors can read HTML these days?
>
> One other point: yes, SVN or any equivalent versioning repository can store most types of content, but versioning should take place at the highest level of abstraction and language that the content providers work in. Take an extreme example to emphasise this point: SVN can store object modules, but does this mean that we should use these as the masters and disassemble back to assembly code to update programs? Of course not. But to many editors, HTML is little more than binary machine code.
>
> Non-functional (infrastructure) requirements help drive the design and implementation cycles, but they shouldn't unnecessarily limit the true functional requirements of the system. To do so is madness. Is it really an approach Apache wants to advocate?
>
> Regards
> Terry
>
>> On Fri, Aug 12, 2011 at 10:10 AM, Shane Curcuru <[email protected]> wrote:
>>>
>>> (To provide a little context while Gav may be asleep)
>>>
>>> On 8/12/2011 9:26 AM, Rob Weir wrote:
>>>>
>>>> On Fri, Aug 12, 2011 at 3:41 AM, Gavin McDonald <[email protected]> wrote:
>>>>>>
>>>>>> On Thu, Aug 11, 2011 at 12:12 PM, Kay Schenk <[email protected]>
>>>
>>> ...snip snip snip...
>>>
>>>>>> Just a thought: Could you do the entire website in MediaWiki, with only exception cases (download page, etc.) done in HTML?
>>>>>
>>>>> Just to put a blocker on this right away, we will not be using the wiki as the main website or the main entrance into the OOo world.
>>>>>
>>>> Since it is not self-evident to me why a wiki would be a problem for the main website, could you explain this a little further? Is there a technical problem? Remember, the wiki already comprises several thousand pages of website content, so in a very real sense the "main" website is already the wiki.
>>>
>>> Performance. As I understand it, the bulk of all apache.org content is served statically as HTML files. Putting a major project's homepage website like the future office.a.o (or whatever name) up as a wiki would add a significant amount of load to our servers, even for a highly efficient wiki engine.
>>>
>> Thanks, that gives some context. So "main" in this case is not necessarily only the top-level page, i.e., an eventual openoffice.apache.org or the current www.openoffice.org. Certainly those pages would be some of the most highly trafficked. But we probably have other heavily accessed pages as well: FAQs, release notes, the download page, etc.
>>
>> But that still leaves the long tail of the thousands of other pages that are individually accessed rarely, but may add up to significant load.
>>
>> I'm surprised there is no caching mechanism for MediaWiki to simply write out static versions of pages and then invalidate the cache for a particular page when it is changed. In theory you could have the rarely-changed pages be just as efficient as static HTML. Plugins exist that do this for WordPress, for example.
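The mechanism Rob describes is exactly what WordPress plugins such as WP Super Cache do: render once to a flat .html file, let the web server hand out the file on later requests, and delete the file when the page is edited. A minimal sketch of the idea, with all paths and names purely illustrative:

```python
# Sketch of a write-through static page cache: a rarely-edited page
# costs one render, then is served as a plain file until invalidated.
from pathlib import Path

CACHE_ROOT = Path("/var/www/cache")   # hypothetical dir served by httpd

def cached_path(page: str) -> Path:
    return CACHE_ROOT / f"{page}.html"

def serve(page: str, render) -> str:
    """Return cached HTML, rendering and writing it out on a miss."""
    path = cached_path(page)
    if path.exists():                     # cache hit: no wiki code runs
        return path.read_text()
    html = render(page)                   # cache miss: render once
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html)
    return html

def invalidate(page: str) -> None:
    """Call from the save hook so the next request re-renders the page."""
    cached_path(page).unlink(missing_ok=True)
```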
>>> The beauty of the CMS is that while it's easy to work on the pages (either via SVN or a browser), the final result is simply checked into SVN, and then the resulting .html file is just stuck on the production webserver site. Some projects use a wiki to manage their homepages (i.e. project.a.o, separate from any community wiki they may have), but the physical homepage that end-users see is typically static HTML that's been exported from their wiki site.
>>>
>>> Gav or the infra folk can provide more details, but you should plan on adhering to whatever performance restrictions the infra team requires for the main website.
>>>
>>> - Shane
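For the workflow Shane describes, the export step can be as simple as crawling the rendered pages and committing the output. A rough sketch under assumed names (the source URL, page list, and paths are all placeholders):

```python
# Rough sketch of exporting an editable CMS/wiki site to a static
# snapshot: fetch the rendered HTML for each page and write it into a
# directory that can be checked into SVN and deployed to production.
from pathlib import Path
from urllib.request import urlopen

SOURCE = "https://cms.example.org"       # hypothetical editing site
EXPORT = Path("./site-export")           # becomes the static docroot
PAGES = ["index", "download", "faq"]     # pages to snapshot

for page in PAGES:
    with urlopen(f"{SOURCE}/{page}") as resp:
        html = resp.read()
    out = EXPORT / f"{page}.html"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(html)
    print(f"exported {out}")
# From here: svn add/commit the export, then push it to the webserver.
```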
