On 2025-12-18 13:52:22 -0500, Matthew Planchard wrote:
> 
> In a table with high insert frequency (~1.5k rows/s) and high query
> frequency (~1k queries/s), partitioned by record creation time, we have
> observed the following behavior:
> 
> * When the current time crosses a partition boundary, all new records
>   are written to the new partition, which was previously empty, as
>   expected
> 
> * Because the planner's latest knowledge of the partition was based on
>   its state prior to the cutover, it assumes the partition is empty and
>   creates plans that use sequential scans
> 
> * The table accumulates tens to hundreds of thousands of rows, and the
>   sequential scans start to use nearly 100% of available database CPU
> 
> * Eventually the planner updates the stats and all is well, but the
>   cycle repeats the next time the partitions cut over.

If I understand you correctly, the stats are updated after "tens to
hundreds of thousands of rows", which at that insert rate translates to
several minutes (e.g. 300,000 rows / 1,500 rows/s ≈ 200 seconds).


> We have tried setting up a cron job that runs ANALYZE on the most recent
> partition of the table every 15 seconds at the start of the hour, and
> while this does help in reducing the magnitude and duration of the
> problem, it is insufficient to fully resolve it (our engineers are still
> getting daily pages for high DB CPU utilization).
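
Presumably that cron job boils down to something like the statement
below, run against whatever the newest partition happens to be (the
partition name here is made up):

    -- Refresh the planner's statistics for just the newest partition.
    -- "events_20251218" is a hypothetical name for the current one.
    ANALYZE events_20251218;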

Since nobody seems to have an idea how to prevent the seqscans I'd go
back one step:

What is the real problem here?

1) The database is sluggish during that time and users are negatively
   affected by the sluggishness.
2) Your engineers get spurious pages for something which isn't really a 
   problem and which they can do nothing about.
3) Something else.

Note that I don't consider "high CPU utilization" a real problem. That's
what the CPU is there for. It might cause problems or be a symptom of a
problem, but it isn't a problem in itself.

If the problem is 1) I agree that a database-related solution should be
found. I think it is possible to explicitly set the statistics on a
table, so maybe that would be good enough until auto-analyze kicks in.
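
A sketch of what I mean, assuming you are willing to do some officially
unsupported catalog surgery (needs superuser; the partition name and
the numbers are made up):

    -- Seed the planner's estimates for the freshly created partition.
    -- The planner takes row/page counts from pg_class; the next
    -- (auto)ANALYZE or VACUUM will overwrite these with real values.
    UPDATE pg_class
       SET reltuples = 300000,  -- rows expected before auto-analyze runs
           relpages  = 3000     -- rough page count for that many rows
     WHERE oid = 'events_20251218'::regclass;  -- hypothetical partition

Non-zero row counts won't give you good column stats, but they should
at least make a seqscan over the growing partition look expensive
enough for an index scan to win.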

But if 2) is the real problem I think you should change the ruleset in
your monitoring. Pages should only be sent out for conditions which
actually impact the service and which can (probably) be fixed or at
least investigated by the engineer on call. Something which happens
every day at the same time and resolves itself within a few minutes
doesn't satisfy these criteria. Maybe you can just add a "downtime" to
that rule. Something like "between 00:00 and 00:07, don't send out
notifications for this condition".

        hjp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | [email protected]         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
