On Fri, Mar 6, 2015 at 5:38 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Kevin Grittner <kgri...@ymail.com> writes:
> > Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> >> How would fccebe421 explain the large amount of random writes (~4MB/s
> >> for more than an hour), reported in the initial post? And why would that
> >> only affect the EXPLAIN and not the bare query?
> >> But the random writes don't really match in this scenario ...
>
> > Sure they do -- both the index and heap pages may be rewritten with
> > hints that the rows are dead.
>
> Hm ... yeah, this idea could match the symptoms: if the database has built
> up a large debt of dead-but-unhinted rows, then the first pass through the
> index would cause a write storm but the next would not, which would
> explain why doing EXPLAIN immediately followed by the real query would put
> all the hint-update burden on the EXPLAIN even though the planning phase
> of the real query would inspect the same index entries.
>
> But if that's the situation, those hint-updating writes would get done
> sooner or later --- probably sooner, by actual execution of the query
> itself.  So I'm not convinced that moving to SnapshotAny would fix
> anything much, only change where the problem manifests.
>

But the actual query is using a seq scan, and so it would hint the table in
efficient sequential order, rather than hinting the table haphazardly in
index order like probing the endpoint does.
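To put some numbers on that, here is a toy Python model (not PostgreSQL code; the page and row counts are made up) of the two access patterns. Setting a hint bit dirties the heap page the row lives on, so the order in which pages are visited decides whether those writes can coalesce into sequential I/O or end up as scattered random writes:

```python
import random

random.seed(1)
N_PAGES = 10_000      # heap pages in the (hypothetical) table
DEAD_ROWS = 50_000    # dead-but-unhinted rows at the end of some index

# Heap page for each dead row: index order is unrelated to heap order,
# so probing the index endpoint touches heap pages haphazardly.
index_order = [random.randrange(N_PAGES) for _ in range(DEAD_ROWS)]

# A seq scan visits the same pages, but in heap (ascending) order.
seq_order = sorted(index_order)

def backward_jumps(pages):
    """Count page transitions that move backward -- a rough proxy for
    writes that cannot coalesce when hint bits get set."""
    return sum(1 for a, b in zip(pages, pages[1:]) if b < a)

print(backward_jumps(index_order))  # roughly half the transitions jump backward
print(backward_jumps(seq_order))    # 0: purely sequential
```

Same pages dirtied either way, but the index-order pass looks like random I/O to the storage, which would match the ~4MB/s of random writes reported.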

> Also, it's less than clear why only this particular query is showing
> any stress.  Dead rows should be a hazard for anything, especially if
> there are enough of them to require hours to re-hint.  And why wouldn't
> autovacuum get to them first?
>

Say the timing of this query is such that 10% of the parent table turns over
between invocations of this query, and that this 10% is all at the end of
some index but randomly distributed over the table heap.  If autovac kicks in
at 20% turnover, then half the time autovac does get to them first, and half
the time it doesn't.  It would be interesting to know if this query is bad
every time it is planned, or just sometimes.
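For what it's worth, that alternation is easy to sketch.  A simplistic model (assuming autovacuum fires instantly once accumulated turnover reaches the 20% threshold, i.e. the default autovacuum_vacuum_scale_factor = 0.2, and runs before the next planning cycle):

```python
TURNOVER = 10            # percent of the table that dies between invocations
AUTOVAC_THRESHOLD = 20   # percent dead at which autovacuum kicks in

dead = 0
slow_plans = []
for cycle in range(10):
    dead += TURNOVER                  # rows die between invocations
    if dead >= AUTOVAC_THRESHOLD:     # autovacuum got there first
        dead = 0
        slow_plans.append(False)
    else:                             # planner wades through unhinted dead rows
        slow_plans.append(True)

print(slow_plans)  # → [True, False, True, False, ...]: every other plan pays
```

So under these (made-up) numbers, exactly half the plans would be slow, which is why knowing whether the badness is intermittent would be telling.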

Cheers,

Jeff
