Hi,

W. D. Sadeep wrote on Tue, Oct 28, 2025 at 09:21:36PM +0800:

> I'm thinking of parsing the /var/www/logs/access.log from httpd for
> purposes like identifying bot activity using fgrep, grep, cut, sed,
> sort, and uniq.

In general, parsers are notorious for inviting bugs, and in general,
bugs are notorious for causing security issues.  For that reason,
in many programs that need parsers, the parsers are the parts that
you want to run with the least privilege.

> Is it safe to do that in a cron job?
> 
> I see requests that appear to embed scripts. So, I'm wondering if it's
> naive to parse them like that.

Writing a log parser in the sh(1) language - and your question above
sounds a bit as if that is what you are planning - does not strike
me as a particularly wise choice because the sh(1) language is
notorious for meta-character, word-splitting, and quoting issues,
so shell scripts are rarely good for security.  Picking a safer
language may serve you better even if you are quite experienced
with shell programming.
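To make the word-splitting hazard concrete, here is a minimal
sketch; the field value is made up, standing in for an
attacker-controlled request string from the log:

```shell
# A hypothetical hostile log field (attacker-controlled content):
field='$(id) ; rm -rf /tmp/x'
# Unquoted: the shell splits the value into several words (and
# would also glob-expand any wildcard characters in it):
set -- $field
echo "unquoted words: $#"
# Quoted: the value stays one intact word; nothing is expanded:
set -- "$field"
echo "quoted words: $#"
```

Note that even here the `$(id)` is not executed - command
substitution does not happen on expansion results - but the moment
such a field is pasted into an eval, a generated command line, or
an unquoted expansion near a glob, things go wrong quietly.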

> If that's the case, are there any precautions I can take?

Use a dedicated user account that has no access to anything else,
such that, if the parser spirals out of control, the worst
that can happen is that your log report gets corrupted.

In particular, do not run the parser as the root or daemon or www
user or any other system user, and make sure that the user you set
up for that purpose cannot write to the /var/www/logs/ directory,
where it might potentially destroy the logs it was supposed to
merely read.
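As a sketch of that setup - the account name _logparse, the report
directory, and the script path are all made up for illustration,
and it assumes the log file itself is world-readable, as it is by
default:

```shell
# Run as root, once, to set things up.
# Create an unprivileged account with no home and no login shell:
useradd -d /var/empty -s /sbin/nologin _logparse
# Give it exactly one place it may write, and nothing else:
mkdir /var/logreport && chown _logparse /var/logreport
# Install a crontab for that account; the job then runs with its
# (lack of) privileges, merely reading /var/www/logs/access.log:
echo '30 4 * * * /usr/local/bin/parse-logs > /var/logreport/report.txt' |
    crontab -u _logparse -
```

That way the account cannot touch /var/www/logs/ at all, let alone
write to it.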

Also, many log analysis programs exist - maybe one of the existing
programs fits your needs?  I admit that more than twenty years ago,
i wrote a log parser for some machines i was running back then (and
i used Perl, which was almost certainly a reasonable choice), but
i'm no longer sure rolling my own was a particularly good idea.
Anyway, i never published it, and it certainly wasn't good enough
for publishing.

Yours,
  Ingo
