Hello, In PG11beta2, my backend process gets terminated during COPY FROM of large text file (544GB) on partitioned table. The kernel log says OOM Killer send SIGKILL due to memory pressure. In fact, 63GB of system physical 64GB was consumed by the PostgreSQL backend just before the termination.
OOM Killer says: [69267.687791] Out of memory: Kill process 23881 (postgres) score 977 or sacrifice child [69267.687860] Killed process 23881 (postgres) total-vm:105210868kB, anon-rss:63372320kB, file-rss:0kB, shmem-rss:126144kB Configurations are below: The 'lineorder' partition table has three child tables by hash-partitioning on lo_orderkey (numeric). Each child table has its own tablespace. 'lineorder_p0' is built on the tablespace 'nvme0' on behalf of independent SSD device for instance. The query I run is: copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|'; So, its contents shall be distributed to individual child tables, based on in-database evaluation of hash-keys. To track which memory-context consumes too much memory more than usual expectations, I put elog() to report dying message prior to OOM Killer. See "aset-track.patch". It raises a warning message when memory consumption per memory-context goes across watermark. It says 'PortalContext' consumed 25GB at 04:26, then it grows up to 34GB at 05:21, and terminated at 05:51. It looks to me somewhere allocates memory our of per-tuple memory context, but I'm still under the investigation. Any ideas? ------------ 2018-07-25 04:26:54.096 JST [23881] WARNING: memory context 'PortalContext' grows up 25769803784 bytes 2018-07-25 04:26:54.096 JST [23881] CONTEXT: COPY lineorder, line 1610626836: "1610614887|1|18487099|541334|1474684|19980523|3-MEDIUM|0|30|4125930|30528526|2|4043411|82518|8|19980..." WARNING: memory context 'PortalContext' grows up 25769803784 bytes 2018-07-25 04:27:07.202 JST [23865] LOG: checkpoints are occurring too frequently (25 seconds apart) : <snip> : 2018-07-25 05:21:22.423 JST [23881] WARNING: memory context 'PortalContext' grows up 34359738384 bytes 2018-07-25 05:21:22.423 JST [23881] CONTEXT: COPY lineorder, line 2147497762: "2147498439|7|910553|962168|773580|19971006|1-URGENT|0|46|5658552|38894795|1|5601966|73807|2|19971201..." : <snip> : 2018-07-25 05:51:07.264 JST [23837] LOG: server process (PID 23881) was terminated by signal 9: Killed 2018-07-25 05:51:07.264 JST [23837] DETAIL: Failed process was running: copy lineorder from '/opt/nvme0/ssbm/lineorder.tbl' delimiter '|'; ------------ -- HeteroDB, Inc / The PG-Strom Project KaiGai Kohei <kai...@heterodb.com>
aset-track.patch
Description: Binary data
OOMKiller.log
Description: Binary data