Hi,
I found an issue where Postgres (with *effective_io_concurrency* of 64 or
higher) runs out of local buffers during a sequential scan on a temporary
table with TOAST data.
The issue occurs because the ReadStream look-ahead pins all the local
buffers. This results in the TOAST index look-up and TOAST page read being
unable to find any available local buffers. The ReadStream's
max_pinned_buffers can be as high as the num_temp_buffers, depending on the
effective_io_concurrency.
Here is a reproduction of the issue using the default temp_buffers setting
and effective_io_concurrency=128:
docker run --name my-postgres -e POSTGRES_PASSWORD=my-password -p 5432:5432
-d postgres:18 -c effective_io_concurrency=128
postgres=# CREATE TEMPORARY TABLE tmp_tbl1 (
s_suppkey NUMERIC NOT NULL,
s_nationkey NUMERIC,
s_comment VARCHAR(256),
s_name CHAR(256),
s_address VARCHAR(256),
s_phone TEXT,
s_acctbal NUMERIC,
CONSTRAINT supplier_pk PRIMARY KEY (s_suppkey)
);
CREATE TABLE
postgres=# INSERT INTO tmp_tbl1 (s_suppkey, s_nationkey, s_comment, s_name,
s_address, s_phone, s_acctbal)
SELECT
('1' || repeat('0', 2000) || i::text)::NUMERIC AS s_suppkey,
('5' || repeat('0', 2000) || floor(random() * 25)::text)::NUMERIC AS
s_nationkey,
md5(random()::text) || ' some comment' AS s_comment,
'Supplier#' || LPAD(i::text, 9, '0') AS s_name,
'Address-' || md5(i::text) AS s_address,
repeat('P', 4096) || '-' || i::text || repeat('P', 2048) || 'fwoiefrr'
||
repeat('fejwfelwkmfP', 4096) || '-' || i::text ||
repeat('fnwekjfmelkwf', 2048) AS s_phone,
('9' || repeat('9', 2000) || '.' || floor(random()*100)::text)::NUMERIC
AS s_acctbal
FROM generate_series(1, 8000) AS i;
INSERT 0 8000
postgres=# SELECT * FROM tmp_tbl1;
ERROR: no empty local buffer available
Attached is a patch that addresses this by limiting ReadStream's
max_pinned_buffers for temp tables to 75% of the available local buffers.
It also introduces a cap on max_ios for temp tables to
DEFAULT_EFFECTIVE_IO_CONCURRENCY, to account for multiple sequential scan
look-aheads happening simultaneously.
Regards,
Induja Sreekanthan
From 713a79d054ba98576faf90073dfac3d7720bddb1 Mon Sep 17 00:00:00 2001
From: Induja Sreekanthan <[email protected]>
Date: Mon, 23 Feb 2026 14:21:16 -0500
Subject: [PATCH] Limit the max_pinned_buffers in ReadStream to prevent
look-ahead exhausting the local buffers
The ReadStream look-ahead logic can pin up to GetLocalPinLimit()=num_temp_buffers local buffers if the effective_io_concurrency is 64 or higher. This can end up pinning all the local buffers, and the backend being unable to find any local buffers available outside the ReadStream. An example where this is seen is when the temporary table has TOAST data, and the backend needs to access the toast index and the toast page between the sequential scan. Another scenario to keep in mind is when there are multiple SeqScan nodes in the same query plan and run by the same backend.
This patch fixes this with the following fixes:
- Limit the maximum possible number of pinned local buffers in ReadStream to use only 75% of the total available local buffers (GetLocalPinLimit()).
- Limit the max_ios for temp tables to DEFAULT_EFFECTIVE_IO_CONCURRENCY=16.
---
src/backend/storage/aio/read_stream.c | 31 ++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 88717c2ff8f..7f92a903130 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -80,6 +80,20 @@
#include "utils/rel.h"
#include "utils/spccache.h"
+/*
+ * This indicates the upper limit on the percentage of the local buffers can be
+ * used by the read-stream's look-ahead logic.
+ */
+#define LOCAL_BUFFERS_USE_FOR_READ_STREAM_LIMIT 75
+/*
+ * This indicates the maximum number of IOs that can be outstanding for a
+ * read-stream for temporary tables. This needs to be limited to avoid running
+ * out of local buffers during look-ahead. Think multiple SeqScan nodes on temp
+ * tables in the same query plan executed by the same backend, resulting in
+ * multiple read streams.
+ */
+#define READ_STREAM_TEMP_TABLES_MAX_IOS DEFAULT_EFFECTIVE_IO_CONCURRENCY
+
typedef struct InProgressIO
{
int16 buffer_index;
@@ -576,6 +590,14 @@ read_stream_begin_impl(int flags,
max_ios = get_tablespace_maintenance_io_concurrency(tablespace_id);
else
max_ios = get_tablespace_io_concurrency(tablespace_id);
+ if (SmgrIsTemp(smgr))
+ {
+ /*
+ * For temp relations, use a smaller value to avoid using up all the local
+ * buffers for look-ahead.
+ */
+ max_ios = Min(max_ios, READ_STREAM_TEMP_TABLES_MAX_IOS);
+ }
/* Cap to INT16_MAX to avoid overflowing below */
max_ios = Min(max_ios, PG_INT16_MAX);
@@ -616,7 +638,14 @@ read_stream_begin_impl(int flags,
* we'll check that later in read_stream_start_pending_read().
*/
if (SmgrIsTemp(smgr))
- max_possible_buffer_limit = GetLocalPinLimit();
+ {
+ /*
+ * Don't use up the entire local buffer pool. Leave some buffers so that the
+ * backend can do other work as well.
+ */
+ max_possible_buffer_limit =
+ GetLocalPinLimit() * LOCAL_BUFFERS_USE_FOR_READ_STREAM_LIMIT / 100;
+ }
else
max_possible_buffer_limit = GetPinLimit();
max_pinned_buffers = Min(max_pinned_buffers, max_possible_buffer_limit);
--
2.53.0.414.gf7e9f6c205-goog