On 3/30/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> I really find it pretty bizarre to want to log a duration without
> logging the statement that caused it.  Seems like the
> log_min_duration_statement parameter ought to be enough.  These logging
> options are already messy and unintuitive, and adding still more odd
> frammishes doesn't help that ...

It's not so odd as it was the default behaviour for 7.4. I agree with
you having just the duration is not the best solution but we cannot
log every query for performances reason.

The point is we are hosting several databases for our customers and
for these databases, we don't know the application. I'm going to take
a real example we have here to explain why the behaviour of
log_duration in 7.4 was perfect for us.

This database is a 2 GB database receiving 13 millions queries a day.
We usually have the same old big queries which cannot be really
optimized in the log due to log_min_duration_statement so this setting
doesn't give us any clue on how our server performs because these
queries are always slow.
Here are the cases when a global overview with all duration was useful:
- our customer tells us their site is slower than usual. We check the
database log reports and we can see if the database is slower (average
duration higher) or not. If this is the case, perhaps there is a bug
or a new feature in their application performing a lot more queries
than before: we can tell them too as we have all the queries logged by
their duration.
- we have a maintenance to plan which requires the database to be down
(namely the upgrade from 7.4 to 8.1 we did a few weeks ago) and we
need to choose a time when the database activity is not so high as we
will move the db to a slower server. In our case, it's not directly
correlated to the web statistics due to an advanced cache system.
Having every query logged, we know when the database activity is
- we see on the graphs the database is significantly slower after
let's say 5pm. This can be due to a the need of a vacuum on a specific
table because of a cronjob running. That's true that log_duration
won't be enough but we will use log_statement='mod' during this period
the following day to see if our hypothesis is true.

So our point is:
- log_min_duration_statement is not enough as it does not give us a
global overview of the database activity.
- log_statement='all' generates far too I/O for us and we cannot enable it.
- log_duration=on in 7.4 was a good compromise allowing us to detect
the problems on the database server while generating a not so big log
file (1.2 GB a day) and nearly no overhead.

That's why we'd like to see this behaviour back again in PostgreSQL.

> Changing code without changing the associated comments isn't a good way
> to get your ideas accepted, either, eg here: [...]
> The patch makes that comment a lie.  This is not acceptable coding practice.

Sorry for that. The attached patch is fixed.

I'm not sure if we should remove the log_duration log if
log_min_duration_statement logs the query and duration for the current
query. PostgreSQL 7.4 logged a line for the duration due to
log_duration=on and a line for statement + duration due to
log_min_duration_statement. I have kept this behaviour to be
consistent with 7.4 but it's perhaps better to not log the query twice
(the tool we use count the query only once anyway). Any feedback on
this is welcome. I'll update the patch if needed.

Thanks for your attention and comments.



Attachment: log_duration_2.patch.gz
Description: GNU Zip compressed data

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Reply via email to