> > I don't get the "cross-correlation" part. I don't want to combine
> > two reports, or do I?
> It's the memory requirement. If you have 10,000 unique requests on
> your site (not including separate query strings) and you have 16
> buckets in the processing time report, you now have to track 160,000
> unique combinations of processing-time -> request. This is even worse
> for things like host to referrer!

I don't think that's true. The problem isn't an n^2 problem.

The original request was for a list of the top 50 worst performers. So, you
have a heap with 50 elements in it, where each element is a pair (time,name).
For every log entry you process, check whether its time is greater than that
of the quickest element on the heap, and if so, add it to the heap in place
of that quickest element.
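
To make the heap idea concrete, here is a minimal sketch in Python. It is only
an illustration, not anything Analog does today: the function name and the
assumption that each log entry has already been parsed into a (time, name)
pair are mine.

```python
import heapq

def slowest_requests(entries, n=50):
    """Return the n slowest (time, name) pairs, slowest first.

    `entries` is assumed to be an iterable of already-parsed
    (time, name) pairs; parsing the logfile is not shown here.
    """
    heap = []  # min-heap: heap[0] is the quickest of the entries kept so far
    for time, name in entries:
        if len(heap) < n:
            # Still filling up the first n slots.
            heapq.heappush(heap, (time, name))
        elif time > heap[0][0]:
            # Slower than the quickest kept entry: swap it in.
            heapq.heapreplace(heap, (time, name))
    return sorted(heap, reverse=True)
```

Memory stays at n elements however many distinct requests the log contains,
and each log entry costs at most O(log n) work.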

Alternatively, Jeremy, you were suggesting a set of buckets. If I understand
right, we'd see the worst few performers in the 1-2s range, the worst few
performers in the 2-5s range, and so on. This'd be just the same, except
with one heap per bucket.
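
A sketch of the one-heap-per-bucket variant, again in Python and again only
under assumptions: the bucket edges below are made up for illustration (only
the 1-2s and 2-5s ranges come from this thread), and the (time, name) input
format is hypothetical.

```python
import bisect
import heapq

# Bucket edges in seconds, assumed for illustration:
# <1s, 1-2s, 2-5s, and 5s or more.
BOUNDS = [1.0, 2.0, 5.0]

def slowest_per_bucket(entries, n=5, bounds=BOUNDS):
    """Return, for each time bucket, its n slowest (time, name) pairs."""
    heaps = [[] for _ in range(len(bounds) + 1)]  # one min-heap per bucket
    for time, name in entries:
        # bisect_right picks the bucket whose range contains `time`.
        heap = heaps[bisect.bisect_right(bounds, time)]
        if len(heap) < n:
            heapq.heappush(heap, (time, name))
        elif time > heap[0][0]:
            heapq.heapreplace(heap, (time, name))
    return [sorted(h, reverse=True) for h in heaps]
```

The memory needed is (number of buckets) x n pairs, so 16 buckets with a few
entries each stays tiny regardless of how many distinct requests there are.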

Caveat: I don't know what exactly "processing time" is. At least, my
logfiles don't seem to include it. If it's not explicitly stored in the log,
and instead has to be calculated as the time between two separate
requests... well, that'd involve some separate processing beforehand.

There's a different, unrelated cross-correlation program I wrote which
annotated the "Request Report" by adding, for each request, a list of the
top downloaders. You'd think this'd be an n^2 problem. But just run Analog
once the first time to get the request report, then run it a second time,
except on this second run it ignores everything but the requests it's been
told to look out for. The computational complexity of this second run is of
the same order as the first run. In practice, I didn't even bother writing
it properly, just stuck everything naively into STL containers, and it works
fine up to half a million log entries. The "host->referrer" you mention
would be like this.
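
Roughly, the two-run idea looks like the sketch below. This is not the program
I described above and not Analog's code, just a Python illustration; the
function name, the (host, request) input format and the in-memory list are all
assumptions. A real version would read the logfile twice instead.

```python
from collections import Counter, defaultdict

def top_downloaders(entries, top_requests=50, top_hosts=5):
    """For each of the most-requested items, list its top hosts.

    `entries` is assumed to be an iterable of (host, request) pairs.
    """
    entries = list(entries)  # held in memory here so it can be scanned twice

    # Run 1: the ordinary request report.
    request_counts = Counter(request for _host, request in entries)
    watched = {req for req, _count in request_counts.most_common(top_requests)}

    # Run 2: ignore everything except the requests we were told to watch.
    hosts = defaultdict(Counter)
    for host, request in entries:
        if request in watched:
            hosts[request][host] += 1

    return {req: hosts[req].most_common(top_hosts) for req in watched}
```

Each run is a single linear scan of the log, and the second run only keeps
per-host counts for the watched requests rather than for every host/request
combination.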

--
Lucian

