Apparently, though unproven, at 01:10 on Tuesday 17 May 2011, Felix Miata did 
opine thusly:

> After attempting to install for the first time last week, I started 3
> different threads here looking for help. I'm pleased with the nature of the
> responses, and being able to succeed eventually using a mix of those
> responses and my own efforts digging into Google, gentoo.org and cranial
> cobwebs. So, thanks to all who replied, and even to those who showed
> interest without replying.
> 
> For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
> threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
> -v <myip> | sort > outfile' generated 117 lines. That's a lot more hits
> than I can ever remember getting before when asking for help from a
> mailing list (even if it did take 5 days to accumulate so many).
> 
> I'm curious if anyone here would like to offer a better variant of my local
> query that would limit the hit count so that no more than one hit per IP is
> represented in the output? My skill with such things is very limited. I
> can't think of the name of a command to cut the IP off the front of
> each line, much less how to compare if it's a non-first instance to be
> discarded. Or, maybe there's an Apache utility for doing this that I just
> don't know about?

There's always a million ways to skin a cat like this. At a high volume site 
you would of course not try and deal with this directly from the apache logs. 
You would send them to syslog which would parse them and write them to a 
database from where you could run sophisticated SQL.

There are also Apache log analyser apps out there; Google will find them.

But I think all that is overkill for what you want. Your command works fine
except for needing to discard duplicate IPs. You don't seem to need to know
the details of the GET, so just grab the first field with awk and run the
result through sort | uniq. It will run a tad quicker (and reveal less n00bness
to your audience) if you grep the file directly instead of cat | grep:

grep "GET /Tmp/Linux/G" /var/log/apache2/access_log | grep -v <myip> | \
awk '{print $1}' | sort | uniq | wc -l

In true grand Unix tradition you cannot get quicker, dirtier or more effective
than that.
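
If you want to try the idea without touching your real log, here is a rough
sketch against a throwaway file. The sample log lines, the addresses and the
"myip" value are all made up for illustration, and I am assuming the client IP
is the first field, as in Apache's common and combined log formats. It differs
from the one-liner in two ways: awk compares the first field exactly instead
of using grep -v (which would also drop any line that merely contains your IP
somewhere), and uniq -c adds a per-IP hit count.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /var/log/apache2/access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [12/May/2011:10:00:00 +0000] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
192.0.2.10 - - [12/May/2011:10:00:05 +0000] "GET /Tmp/Linux/G/a.txt HTTP/1.1" 200 128
198.51.100.7 - - [12/May/2011:11:00:00 +0000] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
203.0.113.5 - - [13/May/2011:09:00:00 +0000] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
203.0.113.5 - - [13/May/2011:09:30:00 +0000] "GET /index.html HTTP/1.1" 200 64
198.51.100.99 - - [13/May/2011:09:45:00 +0000] "GET /other/ HTTP/1.1" 200 64
EOF

myip=203.0.113.5   # assumed: your own address, to be left out of the tally

# Hits per visiting IP, busiest first: keep matching requests, drop our own
# address by exact comparison on field 1, then count duplicates.
per_ip=$(grep 'GET /Tmp/Linux/G' "$log" \
  | awk -v me="$myip" '$1 != me {print $1}' \
  | sort | uniq -c | sort -rn)

# Number of distinct visiting IPs (tr strips the padding some wc's add).
count=$(grep 'GET /Tmp/Linux/G' "$log" \
  | awk -v me="$myip" '$1 != me {print $1}' \
  | sort -u | wc -l | tr -d ' ')

printf '%s\n' "$per_ip"
echo "distinct IPs: $count"

rm -f "$log"
```

Point "log" at the real access_log and set "myip" to your own address to run
it for real.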


-- 
alan dot mckinnon at gmail dot com
