About the system:
Ubuntu 20.04, 64GB RAM, 16GB shared_buffers, 500MB work_mem, PostgreSQL 14.1

Core issue:
The statement below, when run across all 800M rows in one go rather than
divided into chunks, triggered an OOM kill from the OS.
I have looked into both the kernel logs and the PostgreSQL logs. PostgreSQL
just says the backend was killed, and the OS killed it because all memory,
including swap, was exhausted.
Watching top while the update runs, I can see the RSS column of a single
PostgreSQL process (the backend for the connection, I assume) just grow and
grow until it chokes the system.

Statement:
UPDATE alpha
SET x = beta.x
FROM beta
WHERE beta.id = alpha.id AND alpha.x <> beta.x;

alpha is a wide table (40 columns), partitioned by year into 5 equal
partitions, with a total row count of 800M.
beta is a 10-column, 40M-row table.
The updated column x is a non-indexed varchar; the id columns are indexed.
There are no triggers.

I am well aware that huge updates have general issues, such as locking the
table, and that they are perhaps discouraged. I did work around the problem by
batching the update in chunks of 1M rows.
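
For reference, the batched version looks roughly like this (a sketch: the id
bounds are illustrative and come from a driver loop, with a commit between
batches so each transaction stays small):

```sql
-- Illustrative batch: one 1M-row slice of the full update.
-- A driver script advances the id window and commits after each batch.
UPDATE alpha
SET    x = beta.x
FROM   beta
WHERE  beta.id = alpha.id
  AND  alpha.x <> beta.x
  AND  alpha.id >= 0          -- lower bound of this batch (illustrative)
  AND  alpha.id <  1000000;   -- upper bound of this batch (illustrative)
```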
However, I am still curious about what is really happening here. Why does
PostgreSQL run out of memory? What exactly is it storing in that memory? I am
aware of the work_mem danger, but that is not what is happening here: I can
replicate this with 32MB work_mem as well, and this is a low-connection
database.
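
For anyone trying to reproduce this, the plan can be inspected without
actually executing the update (a sketch, using the table names above):

```sql
-- EXPLAIN without ANALYZE only plans the statement and does not run it,
-- so it is safe against the full 800M rows. It shows, for example,
-- whether the planner picks a hash join per partition.
EXPLAIN
UPDATE alpha
SET    x = beta.x
FROM   beta
WHERE  beta.id = alpha.id
  AND  alpha.x <> beta.x;
```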
Any help is appreciated.

Klaudie

track_activity_query_size = 4096
synchronous_commit = off
full_page_writes = off
#wal_compression = on
wal_level = minimal
max_wal_senders = 0

log_min_duration_statement = 1000
idle_in_transaction_session_timeout = '300s' # 0 disables the timeout
tcp_keepalives_idle = '300s'
max_connections = 50
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
min_wal_size = 4GB
max_wal_size = 16GB
#wal_buffers = 16MB
default_statistics_target = 1000
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 1000MB
max_worker_processes = 8
max_parallel_workers = 8
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4
cpu_tuple_cost = 0.03
