We're starting on the logfile analysis part of an ongoing Squid-based project. None of the existing tools really offers what we need: they aren't tailored to our market or our users, and our logs contain an ident lookup string that is customised to our requirements and doesn't return the usual username-only information, so existing logfile analysis programs and scripts don't handle it the way we need.
My query is about the best way to go about analysing logs in general. The main output from log queries will be displayed in a browser, and since our current strengths lie in PHP, we're aiming to use it as the language of choice. I've been experimenting with a regexp that takes a line from access.log and splits it into the relevant fields, but I'm unsure whether this is the best or most efficient approach. The expression needed to match each line is very complicated, and I'm under the impression that regexp matching is CPU-intensive, so large logfiles may take an age to process (our product could easily have upwards of 250 simultaneous users).
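To make the comparison concrete, here is a minimal sketch of the non-regexp alternative: splitting each line on whitespace. It assumes Squid's native access.log layout of ten whitespace-separated fields, and it assumes the customised ident string never contains spaces; if ours does, this approach needs extra handling. The field names, the function name and the sample line are made up for illustration, and it is written in Python only to keep it short (the same idea should carry over to PHP).

```python
# Illustrative names for the ten fields of Squid's native access.log format.
FIELDS = ("time", "elapsed", "client", "result", "bytes",
          "method", "url", "ident", "hierarchy", "mime")


def parse_line(line):
    """Split one access.log line into a dict; return None if it looks malformed."""
    parts = line.split()  # splits on any run of whitespace
    if len(parts) != len(FIELDS):
        # e.g. an ident string containing spaces would end up here
        return None
    rec = dict(zip(FIELDS, parts))
    try:
        rec["time"] = float(rec["time"])      # seconds since the epoch
        rec["elapsed"] = int(rec["elapsed"])  # milliseconds
        rec["bytes"] = int(rec["bytes"])
    except ValueError:
        return None
    # "TCP_MISS/200" -> cache result code and HTTP status
    rec["code"], _, rec["status"] = rec["result"].partition("/")
    return rec


# Made-up sample line; "dept42:jsmith" stands in for a customised ident string.
sample = ("1066036250.129 345 192.168.0.10 TCP_MISS/200 1934 GET "
          "http://www.example.com/ dept42:jsmith DIRECT/192.0.2.1 text/html")
record = parse_line(sample)
```

Whether this is actually faster than a single regexp match would need measuring on real logs; the main attraction is that there is no complicated expression to maintain.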
Can anyone comment on the ways logfiles can be analysed, and on the drawbacks and advantages of each? We'll likely be running one logfile per week, which then gets archived and a new log started, but this will be looked at during our testing phase, which is a while off yet!
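One option I can picture for a weekly file is a single streaming pass that reads a line at a time and keeps only running totals, so memory use doesn't grow with the size of the log. The sketch below is only an illustration of that idea, in Python for brevity: it assumes the native ten-field access.log layout with the byte count in the fifth field and the ident string in the eighth, and the function name and sample lines are invented.

```python
from collections import defaultdict


def bytes_per_ident(lines):
    """Sum bytes transferred per ident string in one pass over an iterable of lines."""
    totals = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) != 10:
            continue  # skip lines that don't match the assumed layout
        try:
            totals[parts[7]] += int(parts[4])  # ident -> running byte total
        except ValueError:
            continue  # byte count was not a number
    return dict(totals)


# Made-up sample lines standing in for a real access.log.
sample_lines = [
    "1066036250.129 345 192.168.0.10 TCP_MISS/200 1934 GET http://www.example.com/ alice DIRECT/192.0.2.1 text/html",
    "1066036251.500 12 192.168.0.11 TCP_HIT/200 500 GET http://www.example.com/a.gif bob NONE/- image/gif",
    "1066036252.000 80 192.168.0.10 TCP_MISS/200 66 GET http://www.example.com/b alice DIRECT/192.0.2.1 text/html",
    "garbage line",
]
totals = bytes_per_ident(sample_lines)
```

With a real file the same function could be fed an open file object, since iterating over a file yields one line at a time.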
Any comments on this will be gratefully accepted!
Regards,
nry
