Probably not what you want to hear, but any chance you could put the templates (which you say are much smaller than your terabyte of content) local on all the relevant boxes? We had a similar problem a few years ago and decided to keep our template stuff local (which, yes, was a bit of work) and not touch NFS for them at all. Local stats are pretty much free, or at least much closer to free than NFS stats. And yes, we had several terabytes of storage, but our templates were much, much smaller.
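For what it's worth, the TT side of that is just a matter of pointing INCLUDE_PATH at the local copy. A minimal sketch, assuming the templates get replicated to each box somehow (the paths here are hypothetical):

    use Template;

    # Templates replicated from the NFS share to local disk on each box.
    my $tt = Template->new({
        INCLUDE_PATH => '/var/www/templates',  # local copy, not the NFS mount
        COMPILE_DIR  => '/var/tt_cache',       # optionally cache compiled templates too
        COMPILE_EXT  => '.ttc',
    }) or die Template->error();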
Hope this helps,
Earl

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tim Tompkins
Sent: Tuesday, July 05, 2005 2:38 PM
To: Andy Wardley
Cc: Andy Lester; [email protected]
Subject: Re: [Templates] Template Caching & premature optimization

Thanks, Andy, for this analysis, but unfortunately it doesn't come close to the scale I'm dealing with. I have a couple of thousand Apache and thttpd processes constantly hitting NFS shares for file stats on over a terabyte of content (only a very small fraction of which is web templates). We already have projects slated to migrate other sites that will double the traffic; that is definite, and we need to be ready for the traffic to triple within the next 2-3 years.

We are, however, due for fresh benchmarking, which will have to be done anyway as development proceeds on our rewrite. The previous benchmarking was performed a few years ago by the CTO at the time and is no longer available.

Certainly I can see why this thread is being read as premature optimization. That's my fault for not giving the full scope of the issue and leaving it at "too much NFS activity." But I don't see this as premature optimization. I see it as an unnecessary call beyond the initial page load, not much different from frequently re-validating that my chair exists after I've sat down in it. Once it's there and in use, it does not require re-validation. If the chair were to break while I'm sitting in it, the entire process of sitting down must be restarted: I get up, find a replacement chair and sit down again. It's the same with templates: if an error is found in a template, a revision is made, which must be approved; it then replaces the template and the servers are restarted.

Think of the templates less in the light of traditional web pages and more in the light of Perl modules. Perl doesn't care if a module has changed, or even been deleted from disk, after it has been loaded. If you want to enact a changed library, you (typically) must bounce the process. This may sound a bit over the edge, but it helps to ensure the integrity of any code that could be used for processing credit cards. Only a few people can approve these types of changes, while many people may have their hands in the development of templates. As a further complication, those who can approve changes cannot be involved in them beyond reviewing the revision.

I've really been trying to avoid getting into much detail here; it's time-consuming and borders on disclosing company policy. I was hoping that simply stating that this is my need and asking "what is the accepted approach with TT" would suffice, but it seems that there's no "accepted" approach. For whatever reason, and whether the community accepts it or not, I have a few goals in mind for our redesign that I'm hoping to come close to meeting with TT. Here are two that are relevant to this topic:

* Mark certain templates as "protected" so they cannot be modified after being loaded, and reinstate the ability to modify non-sensitive pages (which mostly eliminates this whole stat issue from my perspective, except for the protected components, since statting a file would once again be needed for anything modifiable). A rough sketch of one approach follows this list.

* Preload selected templates (primarily the protected ones) in the parent Apache (1.3) process, to ensure that changes can't sneak through as new Apache children are spawned. A startup sketch for this appears after the quoted reply below.
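On the "protected" goal just above, one hedged possibility, untested and with hypothetical names: a Template::Provider subclass whose fetch() (the documented subclassing hook, which returns a ($template, $error) pair) caches each compiled template permanently after its first load, so nothing is ever statted or reloaded again. A real version would consult a configured list of protected names rather than freezing everything:

    package My::Provider::Frozen;
    # Sketch only: freeze every compiled template after its first load,
    # so no further stat() or reload can ever happen for it.
    use base 'Template::Provider';

    my %frozen;   # template name => compiled document

    sub fetch {
        my ($self, $name) = @_;

        # Refs are inline templates or file handles; pass them through.
        if (!ref $name && exists $frozen{$name}) {
            return ($frozen{$name}, undef);   # no stat(), no recompile
        }

        my ($doc, $error) = $self->SUPER::fetch($name);
        $frozen{$name} = $doc unless ref $name || $error;
        return ($doc, $error);
    }

    1;

It would be plugged in through the LOAD_TEMPLATES option, e.g. Template->new({ LOAD_TEMPLATES => [ My::Provider::Frozen->new({ INCLUDE_PATH => '/web/templates' }) ] }).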
If these two goals in particular can be met with TT, then this issue is resolved for me as soon as I find out how. Otherwise, I'm left with locking down *all* template revisions until I come up with an alternative. For as much as I know about TT at this point, it might mean subclassing Template::Provider, but as I mentioned, I'm new to TT and I'd really prefer to keep my hands out of there until I become more familiar with it.

Locking down template revisions (in part or in whole) is a tiny detail in the big picture. It's not being done because I *want* to do it, or because I think it's the best approach (it's certainly not the easiest); it's being done because I *must* do it to show a strict auditing policy over any piece of code involved in a point of sale.

We've not yet solidified our final templating solution; I'm still working through discovery, and so far TT is the front-runner. This entire issue may resolve out to my head being stuck in previous solutions that I really need to rethink. But I was just looking for a response on how this has been dealt with previously by experienced TT users (which I think was the point of the original post on this thread). I joined this thread simply because it sounded similar to what I will be dealing with.

--
Tim

Andy Wardley wrote:

>Andy Lester wrote:
>
>Actually, you'll only have half a million stat calls, which according to
>my test below is less than a second of machine overhead per day.
>
>  perl -MBenchmark -e 'timethis(432_000, sub { stat $0 })'
>
>  timethis 432000: -1 wallclock secs ( 0.18 usr + 0.28 sys = 0.46 CPU)
>      @ 939130.43/s (n=432000)
>
>Why? $Template::Provider::STAT_TTL is set to 1 (second) by default. That
>means that each file is checked once a second, at most, regardless of how
>many page impressions you're getting. That's 86k stat() calls per day
>(60*60*24 = 86,400), per template used (which I assumed to be 5 in the
>calculation above) = 432,000.
>
>And even if you were hitting stat() for every template, for every page,
>20 million stat() calls is still only approx. 20 seconds of processor
>overhead per day. That's pretty cheap.
>
>You mention that you're mounted across NFS, which will certainly make
>things a little slower. But if you're looking to speed things up, then
>replicating the templates to a local filesystem is going to have a much
>greater impact than trying to optimise away stat() calls.
>
>So I think Andy's advice is sound: measure what you're doing, and be
>sure that you're optimising the right thing.
>
>I personally suspect that tuning out the stat() calls isn't going to save
>you a great deal of time, but I could be wrong. So if you want to reduce
>the number of stat calls, simply set STAT_TTL to a higher value.
>
>  $Template::Provider::STAT_TTL = 60;
>
>HTH
>A
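On Tim's second goal above (preloading in the Apache 1.3 parent), a hedged sketch, assuming mod_perl, with hypothetical paths and template names: compile the protected templates from startup.pl before the children fork, and raise STAT_TTL so they are effectively never re-statted:

    # startup.pl, loaded by the Apache parent via PerlRequire, so
    # anything compiled here is shared copy-on-write with the children.
    use strict;
    use warnings;
    use Template;

    our $tt = Template->new({
        INCLUDE_PATH => '/web/templates',  # hypothetical path
        STAT_TTL     => 2**31,             # effectively: never re-stat
    }) or die Template->error();

    # Hypothetical list of protected templates to compile pre-fork.
    for my $name (qw( checkout.tt payment.tt receipt.tt )) {
        my $out = '';
        $tt->process($name, {}, \$out)
            or warn "preload of $name failed: " . $tt->error();
    }

Note that process() actually renders each template once (with an empty stash here), which is what forces it through the provider's compile cache; a template that cannot render without variables would need dummy data instead.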
