> > I don't get the "cross-correlation" part. I don't want to combine
> > two reports, or do I?
> It's the memory requirement. If you have 10,000 unique requests on
> your site (not including separate query strings) and you have 16
> buckets in the processing time report, you now have to track 160,000
> unique combinations of processing-time -> request. This is even worse
> for things like host to referrer!

I don't think that's true. The problem isn't an n^2 problem. The original request was for a list of the top 50 worst performers. So you have a heap with 50 elements in it, each element being a pair (time, name). For every log entry you process, check whether its time is greater than that of the quickest element on the heap; if so, remove that quickest element and add the new entry in its place, so the heap never grows beyond 50 elements.

Alternatively, Jeremy, you were suggesting a set of buckets. If I understand right, we'd see the worst few performers in the 1-2s range, the worst few performers in the 2-5s range, and so on. This would be just the same, except with one heap per bucket.

Caveat: I don't know what exactly "processing time" is. At least, my logfiles don't seem to include it. If it's not explicitly stored in the log, and instead has to be calculated as the time between two separate requests... well, that would involve some separate processing beforehand.

There's an unrelated cross-correlation program I wrote, which annotates the "Request Report" by adding, for each request, a list of its top downloaders. You'd think this would be an n^2 problem. But just run Analog once to get the request report, then run it a second time; on this second run it ignores everything except the requests it's been told to look out for. The computational complexity of the second run is of the same order as the first. In practice, I didn't even bother writing it properly, just stuck everything naively into STL containers, and it works fine up to half a million log entries. The "host->referrer" report you mention would work the same way.
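For what it's worth, here is a minimal sketch of the bounded-heap idea, in Python rather than anything Analog actually contains. The function names, the bucket boundaries and the (time, name) input format are all assumptions made up for illustration, and it ranks individual log entries, not per-request averages. The point is only that memory stays at n elements per bucket, however many distinct requests there are.

```python
import heapq
from bisect import bisect_right

# Hypothetical bucket boundaries in seconds: [0,1), [1,2), [2,5), [5,inf).
BUCKET_EDGES = [1.0, 2.0, 5.0]


def offer(heap, item, n):
    """Keep at most n items in a min-heap; heap[0] is the quickest kept."""
    if len(heap) < n:
        heapq.heappush(heap, item)
    elif item[0] > heap[0][0]:
        # Slower than the quickest kept entry: evict that one, keep this one.
        heapq.heapreplace(heap, item)


def worst_performers(entries, n=50):
    """Return the n slowest (time, name) pairs, slowest first."""
    heap = []
    for entry in entries:
        offer(heap, entry, n)
    return sorted(heap, reverse=True)


def worst_per_bucket(entries, n=3, edges=BUCKET_EDGES):
    """Same idea with one bounded heap per processing-time bucket."""
    heaps = [[] for _ in range(len(edges) + 1)]
    for entry in entries:
        # bisect_right picks the bucket whose range contains the time.
        offer(heaps[bisect_right(edges, entry[0])], entry, n)
    return [sorted(h, reverse=True) for h in heaps]
```

Feeding it a stream of (seconds, request) pairs parsed from the log, worst_performers(entries, 50) gives the overall list and worst_per_bucket(entries) gives the worst few in each range. Either way it is a single pass over the log.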
--
Lucian