Re: Add the ability to limit the amount of memory that can be allocated to backends.

Tomas Vondra Tue, 26 Dec 2023 13:52:44 -0800

Hi,

I wanted to take a look at the patch, and I noticed it's broken since
3d51cb5197 renamed a couple pgstat functions in August. I plan to maybe
do some benchmarks etc. preferably on current master, so here's a
version fixing that minor bitrot.


As for the patch, I only skimmed through the thread so far, to get some
idea of what the approach and goals are, etc. I didn't look at the code
yet, so can't comment on that.

However, at pgconf.eu a couple of weeks ago I had quite a few discussions
about how such a "backend memory limit" could/should work in principle, and
I've been thinking about ways to implement this. So let me share some
thoughts about how this patch aligns with that ...

(FWIW it's not my intent to hijack or derail this patch in any way, but
there are a couple of things I think we should do differently.)

I'm 100% on board with having a memory limit "above" work_mem. It's
really annoying that we have no way to restrict the amount of memory a
backend can allocate for complex queries, etc.

But I find it a bit strange that we aim to introduce a "global" memory
limit for all backends combined first. I'm not against having that too,
but it's not the feature I usually wish to have. I need some protection
against runaway backends that happen to allocate a lot of memory.

Similarly, I'd like to be able to have different limits depending on
what the backend does - a backend doing OLAP may naturally need more
memory, while a backend doing OLTP may have a much tighter limit.

But with a single global limit none of this is possible. It may help
reduce the risk of unexpected OOM issues (not 100%, but useful), but
it can't limit the impact to the one backend - if memory starts running
out, it will affect all other backends a bit randomly (depending on the
order in which the backends happen to allocate memory). And it does not
consider what workloads the backends execute.

Let me propose a slightly different architecture that I imagined while
thinking about this. It's not radically different from what the patch
does, but it focuses on the local accounting first. I believe it's
possible to extend this to enforce the global limit too.

FWIW I haven't tried implementing this - I don't want to "hijack" this
thread and do my own thing. I can take a stab at a PoC if needed.

Firstly, I'm not quite happy with how all the memory contexts have to
do their own version of the accounting and memory checking. I think we
should move that into a new abstraction which I call "memory pool".
It's very close to "memory context" but it only deals with allocating
blocks, not the chunks requested by palloc() etc. So when someone does
palloc(), that may end up in AllocSetAlloc(). And instead of calling
malloc() directly, that would call MemoryPoolAlloc(blksize), which would
do all the accounting and checks, and then call malloc().
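
To make the layering concrete, here is a minimal sketch of what such a
pool might look like. It is not code from the patch: the MemoryPool
struct, the extra pool argument and MemoryPoolFree() are hypothetical,
and a real version would presumably raise an error through ereport()
instead of returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-backend pool; nothing like this exists in PostgreSQL. */
typedef struct MemoryPool
{
    size_t      allocated;      /* bytes currently handed out as blocks */
    size_t      limit;          /* maximum bytes allowed, 0 = no limit */
} MemoryPool;

/*
 * Allocate one block on behalf of a memory context. The accounting and
 * the limit check live here, so the contexts do not need their own copy.
 */
static void *
MemoryPoolAlloc(MemoryPool *pool, size_t blksize)
{
    void       *block;

    /* Written this way to avoid overflow in allocated + blksize. */
    if (pool->limit != 0 &&
        (blksize > pool->limit || pool->allocated > pool->limit - blksize))
        return NULL;

    block = malloc(blksize);
    if (block != NULL)
        pool->allocated += blksize;
    return block;
}

/* Return a block to the system and undo its accounting. */
static void
MemoryPoolFree(MemoryPool *pool, void *block, size_t blksize)
{
    assert(pool->allocated >= blksize);
    pool->allocated -= blksize;
    free(block);
}
```

In this sketch a context such as aset.c would call MemoryPoolAlloc()
wherever it calls malloc() for a new block today, and chunk-level
palloc() requests would never reach the pool.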

This may sound like an unnecessary indirection, but the idea is that a
single memory pool would back many memory contexts (perhaps all for
a given backend). In principle we might even establish separate memory
pools for different parts of the memory context hierarchy, but I'm not
sure we need that.

I can imagine the pool could also cache blocks for cases when we create
and destroy contexts very often, but glibc should already do that for
us, I believe.

For me, the accounting and memory context is the primary goal. I wonder
if we considered this context/pool split while working on the accounting
for hash aggregate, but I think we were too attached to doing all of it
in the memory context hierarchy.

Of course, this memory pool is per backend, and so would be the memory
accounting and limit enforced by it. But I can imagine extending to do
a global limit similar to what the current patch does - using a counter
in shared memory, or something. I haven't reviewed what's the overhead
or how it handles cases when a backend terminates in some unexpected
way. But whatever the current patch does, memory pool could do too.
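
As a rough illustration of the shared-counter idea (again a sketch, not
what the patch does), a pool could reserve bytes against a global counter
before calling malloc(). The GlobalMemoryCounter type and both functions
are hypothetical, and C11 atomics stand in for the pg_atomic_* API so the
example is self-contained. It does not address a backend that dies
without releasing its reservation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter that would live in shared memory. */
typedef struct GlobalMemoryCounter
{
    _Atomic uint64_t allocated; /* bytes reserved by all backends */
    uint64_t    limit;          /* global limit in bytes, 0 = no limit */
} GlobalMemoryCounter;

/*
 * Try to reserve nbytes against the global limit. The counter is bumped
 * optimistically and rolled back if the reservation overshoots, so no
 * lock is needed. Overflow of the 64-bit counter is ignored here.
 */
static bool
GlobalReserve(GlobalMemoryCounter *g, uint64_t nbytes)
{
    uint64_t    before = atomic_fetch_add(&g->allocated, nbytes);

    if (g->limit != 0 && before + nbytes > g->limit)
    {
        atomic_fetch_sub(&g->allocated, nbytes);
        return false;
    }
    return true;
}

/* Give back a reservation, e.g. when a block is freed. */
static void
GlobalRelease(GlobalMemoryCounter *g, uint64_t nbytes)
{
    atomic_fetch_sub(&g->allocated, nbytes);
}
```

To limit traffic on the shared cache line, a pool could reserve in
larger steps (say 1MB at a time) and hand out blocks from its local
reservation, at the cost of a less exact global total.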


Secondly, I think there's an annoying issue with the accounting at the
block level - it makes it problematic to use low limit values. We double
the block size, so we may quickly end up with a block size of a couple of
MBs, which means the accounting granularity gets very coarse.
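
A small worked example of that growth, as a simplified sketch: it assumes
the default allocation set parameters (8kB initial block, 8MB maximum) and
ignores details such as oversized chunks getting their own blocks. Both
function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Double the block size, capped at the maximum block size. */
static size_t
NextBlockSize(size_t cur, size_t max)
{
    return (cur >= max / 2) ? max : cur * 2;
}

/* Total bytes malloc'd by one context after it has allocated nblocks. */
static size_t
TotalAfterBlocks(size_t init, size_t max, int nblocks)
{
    size_t      total = 0;
    size_t      blksize = init;

    for (int i = 0; i < nblocks; i++)
    {
        total += blksize;
        blksize = NextBlockSize(blksize, max);
    }
    return total;
}
```

Under those assumptions the 8th block is already 1MB and the 11th is
8MB, so a single context with about 8MB in use makes its next step in
8MB increments.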

I think it'd be useful to introduce a "backpressure" between the memory
pool and the memory context, depending on how close we are to the limit.
For example if the limit is 128MB and the backend allocated 16MB so far,
we're pretty far away from the limit. So if the backend requests an 8MB
block, that's fine and the memory pool should malloc() that. But if we
already allocated 100MB, maybe we should be careful and not allow 8MB
blocks - the memory pool should be allowed to override this and return
just a 1MB block. Sure, this would have to be optional, and not all places
can accept a smaller block than requested (when the chunk would not fit
into the smaller block). It would require a suitable memory pool API
and more work in the memory contexts, but it seems pretty useful.
Certainly not something for v1.
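
One possible shape for that API, purely as a sketch: the context passes
both the block size it would like and the smallest size it can accept,
and the pool answers with the size it is willing to grant. The function
name and the policy (full size below 75% usage, otherwise at most 1/16
of the remaining headroom) are assumptions picked to reproduce the
numbers above, not a proposal for the actual thresholds.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decide how large a block to grant. Returns the granted size, or 0 if
 * even a min_size block does not fit under the limit. A limit of 0 means
 * "no limit".
 */
static size_t
MemoryPoolChooseBlockSize(size_t allocated, size_t limit,
                          size_t requested, size_t min_size)
{
    size_t      headroom;
    size_t      cap;
    size_t      granted = requested;

    if (limit == 0)
        return requested;
    if (allocated >= limit)
        return 0;

    headroom = limit - allocated;

    /* Hypothetical policy: get stingy once 75% of the limit is used. */
    cap = (allocated < limit / 4 * 3) ? headroom : headroom / 16;

    /* Halve the request until it fits the cap, but not below min_size. */
    while (granted > cap && granted / 2 >= min_size)
        granted /= 2;

    return (granted <= headroom) ? granted : 0;
}
```

With a 128MB limit, an 8MB request is granted in full at 16MB allocated
and cut to 1MB at 100MB allocated. If the caller sets min_size to 8MB
because the chunk would not fit in anything smaller, it still gets 8MB
as long as that fits under the limit.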

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From ad74704ef5a62f9d2b8d98f40d6fbc5b52bc742b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH v20231226 1/3] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation
type (aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the
backend process. aset_allocated_bytes, dsm_allocated_bytes,
generation_allocated_bytes and slab_allocated_bytes track the
allocation by type for the backend process. They are updated for the
process as memory is malloc'd/freed.  Memory allocated to items on
the freelist is included.  Dynamic shared memory allocations are
included only in the value displayed for the backend that created
them, they are not included in the value for backends that are
attached to them to avoid double counting. DSM allocations that are
not destroyed by the creating process prior to its exit are
considered long lived and are tracked in a global counter
global_dsm_allocated_bytes. We limit the floor of allocation
counters to zero. Created views pg_stat_global_memory_allocation and
pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246 ++++++++++++++++++++
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++++++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +++++++++
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++++++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  15 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++++++++++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 869 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4f8058d8b1b..99f5acf07f4 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4563,6 +4563,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 
  </sect2>
 
+ <sect2 id="monitoring-pg-stat-memory-allocation-view">
+  <title><structname>pg_stat_memory_allocation</structname></title>
+
+  <indexterm>
+   <primary>pg_stat_memory_allocation</primary>
+  </indexterm>
+
+  <para>
+   The <structname>pg_stat_memory_allocation</structname> view will have one
+   row per server process, showing information related to the current memory
+   allocation of that process in total and by allocator type. Due to the
+   dynamic nature of memory allocations the allocated bytes values may not be
+   exact but should be sufficient for the intended purposes. Dynamic shared
+   memory allocations are included only in the value displayed for the backend
+   that created them, they are not included in the value for backends that are
+   attached to them to avoid double counting.  Use
+   <function>pg_size_pretty</function> described in
+   <xref linkend="functions-admin-dbsize"/> to make these values more easily
+   readable.
+  </para>
+
+  <table id="pg-stat-memory-allocation-view" xreflabel="pg_stat_memory_allocation">
+   <title><structname>pg_stat_memory_allocation</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the database this backend is connected to
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pid</structfield> <type>integer</type>
+      </para>
+      <para>
+       Process ID of this backend
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>allocated_bytes</structfield> <type>bigint</type>
+      </para>
+     <para>
+      Memory currently allocated to this backend in bytes. This is the balance
+      of bytes allocated and freed by this backend. Dynamic shared memory
+      allocations are included only in the value displayed for the backend that
+      created them, they are not included in the value for backends that are
+      attached to them to avoid double counting.
+     </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>aset_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Memory currently allocated to this backend in bytes via the allocation
+       set allocator.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>dsm_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Memory currently allocated to this backend in bytes via the dynamic
+       shared memory allocator. Upon process exit, dsm allocations that have
+       not been freed are considered long lived and added to
+       <structfield>global_dsm_allocated_bytes</structfield> found in the
+       <link linkend="monitoring-pg-stat-global-memory-allocation-view">
+       <structname>pg_stat_global_memory_allocation</structname></link> view.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>generation_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Memory currently allocated to this backend in bytes via the generation
+       allocator.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>slab_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Memory currently allocated to this backend in bytes via the slab
+       allocator.
+      </para></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
+
+ <sect2 id="monitoring-pg-stat-global-memory-allocation-view">
+  <title><structname>pg_stat_global_memory_allocation</structname></title>
+
+  <indexterm>
+   <primary>pg_stat_global_memory_allocation</primary>
+  </indexterm>
+
+  <para>
+   The <structname>pg_stat_global_memory_allocation</structname> view will
+   have one row showing information related to current shared memory
+   allocations. Due to the dynamic nature of memory allocations the allocated
+   bytes values may not be exact but should be sufficient for the intended
+   purposes. Use <function>pg_size_pretty</function> described in
+   <xref linkend="functions-admin-dbsize"/> to make the byte populated values
+   more easily readable.
+  </para>
+
+  <table id="pg-stat-global-memory-allocation-view" xreflabel="pg_stat_global_memory_allocation">
+   <title><structname>pg_stat_global_memory_allocation</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the database this backend is connected to
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_memory_size_mb</structfield> <type>integer</type>
+      </para>
+      <para>
+       Reports the size of the main shared memory area, rounded up to the
+       nearest megabyte. See <xref linkend="guc-shared-memory-size"/>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>shared_memory_size_in_huge_pages</structfield> <type>bigint</type>
+      </para>
+     <para>
+      Reports the number of huge pages that are needed for the main shared
+      memory area based on the specified huge_page_size. If huge pages are not
+      supported, this will be -1. See
+      <xref linkend="guc-shared-memory-size-in-huge-pages"/>.
+     </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>global_dsm_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Long lived dynamically allocated memory currently allocated to the
+       database. Upon process exit, dsm allocations that have not been freed
+       are considered long lived and added to
+       <structfield>global_dsm_allocated_bytes</structfield>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_aset_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Sum total of <structfield>aset_allocated_bytes</structfield> for all
+       backend processes from
+       <link linkend="monitoring-pg-stat-memory-allocation-view">
+       <structname>pg_stat_memory_allocation</structname></link> view.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_dsm_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Sum total of <structfield>dsm_allocated_bytes</structfield> for all
+       backend processes from
+       <link linkend="monitoring-pg-stat-memory-allocation-view">
+       <structname>pg_stat_memory_allocation</structname></link> view.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_generation_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Sum total of <structfield>generation_allocated_bytes</structfield> for
+       all backend processes from
+       <link linkend="monitoring-pg-stat-memory-allocation-view">
+       <structname>pg_stat_memory_allocation</structname></link> view.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_slab_allocated_bytes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Sum total of <structfield>slab_allocated_bytes</structfield> for all
+       backend processes from
+       <link linkend="monitoring-pg-stat-memory-allocation-view">
+       <structname>pg_stat_memory_allocation</structname></link> view.
+      </para></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
+
  <sect2 id="monitoring-stats-functions">
   <title>Statistics Functions</title>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 11d18ed9dd6..fd7fcaf59e3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1369,3 +1369,37 @@ CREATE VIEW pg_stat_subscription_stats AS
 
 CREATE VIEW pg_wait_events AS
     SELECT * FROM pg_get_wait_events();
+
+CREATE VIEW pg_stat_memory_allocation AS
+    SELECT
+        S.datid AS datid,
+        S.pid,
+        S.allocated_bytes,
+        S.aset_allocated_bytes,
+        S.dsm_allocated_bytes,
+        S.generation_allocated_bytes,
+        S.slab_allocated_bytes
+    FROM pg_stat_get_memory_allocation(NULL) AS S
+        LEFT JOIN pg_database AS D ON (S.datid = D.oid);
+
+CREATE VIEW pg_stat_global_memory_allocation AS
+WITH sums AS (
+    SELECT
+        SUM(aset_allocated_bytes) AS total_aset_allocated_bytes,
+        SUM(dsm_allocated_bytes) AS total_dsm_allocated_bytes,
+        SUM(generation_allocated_bytes) AS total_generation_allocated_bytes,
+        SUM(slab_allocated_bytes) AS total_slab_allocated_bytes
+    FROM
+        pg_stat_memory_allocation
+)
+SELECT
+        S.datid AS datid,
+        current_setting('shared_memory_size', true) as shared_memory_size,
+        (current_setting('shared_memory_size_in_huge_pages', true))::integer as shared_memory_size_in_huge_pages,
+        S.global_dsm_allocated_bytes,
+        sums.total_aset_allocated_bytes,
+        sums.total_dsm_allocated_bytes,
+        sums.total_generation_allocated_bytes,
+        sums.total_slab_allocated_bytes
+    FROM sums, pg_stat_get_global_memory_allocation() AS S
+        LEFT JOIN pg_database AS D ON (S.datid = D.oid);
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 628f3ecd3fb..6725780fc77 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -803,6 +803,15 @@ dsm_detach_all(void)
 void
 dsm_detach(dsm_segment *seg)
 {
+	/*
+	 * Retain mapped_size to pass into destroy call in cases where the detach
+	 * is the last reference. mapped_size is zeroed as part of the detach
+	 * process, but is needed later in these cases for dsm_allocated_bytes
+	 * accounting.
+	 */
+	Size		local_seg_mapped_size = seg->mapped_size;
+	Size	   *ptr_local_seg_mapped_size = &local_seg_mapped_size;
+
 	/*
 	 * Invoke registered callbacks.  Just in case one of those callbacks
 	 * throws a further error that brings us back here, pop the callback
@@ -883,7 +892,7 @@ dsm_detach(dsm_segment *seg)
 			 */
 			if (is_main_region_dsm_handle(seg->handle) ||
 				dsm_impl_op(DSM_OP_DESTROY, seg->handle, 0, &seg->impl_private,
-							&seg->mapped_address, &seg->mapped_size, WARNING))
+							&seg->mapped_address, ptr_local_seg_mapped_size, WARNING))
 			{
 				LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
 				if (is_main_region_dsm_handle(seg->handle))
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index 35fa910d6f2..a9e0987747b 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -66,6 +66,7 @@
 #include "postmaster/postmaster.h"
 #include "storage/dsm_impl.h"
 #include "storage/fd.h"
+#include "utils/backend_status.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 
@@ -232,6 +233,14 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 							name)));
 			return false;
 		}
+
+		/*
+		 * Detach and destroy pass through here, only decrease the memory
+		 * shown allocated in pg_stat_memory_allocation when the creator
+		 * destroys the allocation.
+		 */
+		if (op == DSM_OP_DESTROY)
+			pgstat_report_allocated_bytes_decrease(*mapped_size, PG_ALLOC_DSM);
 		*mapped_address = NULL;
 		*mapped_size = 0;
 		if (op == DSM_OP_DESTROY && shm_unlink(name) != 0)
@@ -332,6 +341,33 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 						name)));
 		return false;
 	}
+
+	/*
+	 * Attach and create pass through here, only update backend memory
+	 * allocated in pg_stat_memory_allocation for the creator process.
+	 */
+	if (op == DSM_OP_CREATE)
+	{
+		/*
+		 * Posix creation calls dsm_impl_posix_resize implying that resizing
+		 * occurs or may be added in the future. As implemented
+		 * dsm_impl_posix_resize utilizes fallocate or truncate, passing the
+		 * whole new size as input, growing the allocation as needed (only
+		 * truncate supports shrinking). We update by replacing the old
+		 * allocation with the new.
+		 */
+#if defined(HAVE_POSIX_FALLOCATE) && defined(__linux__)
+		/*
+		 * posix_fallocate does not shrink allocations, adjust only on
+		 * allocation increase.
+		 */
+		if (request_size > *mapped_size)
+			pgstat_report_allocated_bytes_increase(request_size - *mapped_size, PG_ALLOC_DSM);
+#else
+		pgstat_report_allocated_bytes_decrease(*mapped_size, PG_ALLOC_DSM);
+		pgstat_report_allocated_bytes_increase(request_size, PG_ALLOC_DSM);
+#endif
+	}
 	*mapped_address = address;
 	*mapped_size = request_size;
 	close(fd);
@@ -538,6 +574,14 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 							name)));
 			return false;
 		}
+
+		/*
+		 * Detach and destroy pass through here, only decrease the memory
+		 * shown allocated in pg_stat_memory_allocation when the creator
+		 * destroys the allocation.
+		 */
+		if (op == DSM_OP_DESTROY)
+			pgstat_report_allocated_bytes_decrease(*mapped_size, PG_ALLOC_DSM);
 		*mapped_address = NULL;
 		*mapped_size = 0;
 		if (op == DSM_OP_DESTROY && shmctl(ident, IPC_RMID, NULL) < 0)
@@ -585,6 +629,13 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 						name)));
 		return false;
 	}
+
+	/*
+	 * Attach and create pass through here, only update backend memory
+	 * allocated in pg_stat_memory_allocation for the creator process.
+	 */
+	if (op == DSM_OP_CREATE)
+		pgstat_report_allocated_bytes_increase(request_size, PG_ALLOC_DSM);
 	*mapped_address = address;
 	*mapped_size = request_size;
 
@@ -653,6 +704,13 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 			return false;
 		}
 
+		/*
+		 * Detach and destroy pass through here, only decrease the memory
+		 * shown allocated in pg_stat_memory_allocation when the creator
+		 * destroys the allocation.
+		 */
+		if (op == DSM_OP_DESTROY)
+			pgstat_report_allocated_bytes_decrease(*mapped_size, PG_ALLOC_DSM);
 		*impl_private = NULL;
 		*mapped_address = NULL;
 		*mapped_size = 0;
@@ -769,6 +827,12 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return false;
 	}
 
+	/*
+	 * Attach and create pass through here, only update backend memory
+	 * allocated in pg_stat_memory_allocation for the creator process.
+	 */
+	if (op == DSM_OP_CREATE)
+		pgstat_report_allocated_bytes_increase(info.RegionSize, PG_ALLOC_DSM);
 	*mapped_address = address;
 	*mapped_size = info.RegionSize;
 	*impl_private = hmap;
@@ -813,6 +877,13 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
 							name)));
 			return false;
 		}
+
+		/*
+		 * Detach and destroy pass through here, only decrease the memory
+		 * shown allocated in pg_stat_memory_allocation when the creator
+		 * destroys the allocation.
+		 */
+		pgstat_report_allocated_bytes_decrease(*mapped_size, PG_ALLOC_DSM);
 		*mapped_address = NULL;
 		*mapped_size = 0;
 		if (op == DSM_OP_DESTROY && unlink(name) != 0)
@@ -934,6 +1005,13 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
 						name)));
 		return false;
 	}
+
+	/*
+	 * Attach and create pass through here, only update backend memory
+	 * allocated in pg_stat_memory_allocation for the creator process.
+	 */
+	if (op == DSM_OP_CREATE)
+		pgstat_report_allocated_bytes_increase(request_size, PG_ALLOC_DSM);
 	*mapped_address = address;
 	*mapped_size = request_size;
 
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index b6451d9d083..81304a569c0 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -180,6 +180,7 @@ InitProcGlobal(void)
 	ProcGlobal->checkpointerLatch = NULL;
 	pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PGPROCNO);
 	pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PGPROCNO);
+	pg_atomic_init_u64(&ProcGlobal->global_dsm_allocation, 0);
 
 	/*
 	 * Create and initialize all the PGPROC structures we'll need.  There are
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 6e734c6caff..838a7337933 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -49,6 +49,24 @@ int			pgstat_track_activity_query_size = 1024;
 /* exposed so that backend_progress.c can access it */
 PgBackendStatus *MyBEEntry = NULL;
 
+/*
+ * Memory allocated to this backend prior to pgstats initialization. Migrated to
+ * shared memory on pgstats initialization.
+ */
+uint64		local_my_allocated_bytes = 0;
+uint64	   *my_allocated_bytes = &local_my_allocated_bytes;
+
+/* Memory allocated to this backend by type prior to pgstats initialization.
+ * Migrated to shared memory on pgstats initialization
+ */
+uint64		local_my_aset_allocated_bytes = 0;
+uint64	   *my_aset_allocated_bytes = &local_my_aset_allocated_bytes;
+uint64		local_my_dsm_allocated_bytes = 0;
+uint64	   *my_dsm_allocated_bytes = &local_my_dsm_allocated_bytes;
+uint64		local_my_generation_allocated_bytes = 0;
+uint64	   *my_generation_allocated_bytes = &local_my_generation_allocated_bytes;
+uint64		local_my_slab_allocated_bytes = 0;
+uint64	   *my_slab_allocated_bytes = &local_my_slab_allocated_bytes;
 
 static PgBackendStatus *BackendStatusArray = NULL;
 static char *BackendAppnameBuffer = NULL;
@@ -401,6 +419,32 @@ pgstat_bestart(void)
 	lbeentry.st_progress_command_target = InvalidOid;
 	lbeentry.st_query_id = UINT64CONST(0);
 
+	/* Alter allocation reporting from local storage to shared memory */
+	pgstat_set_allocated_bytes_storage(&MyBEEntry->allocated_bytes,
+									   &MyBEEntry->aset_allocated_bytes,
+									   &MyBEEntry->dsm_allocated_bytes,
+									   &MyBEEntry->generation_allocated_bytes,
+									   &MyBEEntry->slab_allocated_bytes);
+
+	/*
+	 * Populate sum of memory allocated prior to pgstats initialization to
+	 * pgstats and zero the local variable. This is a += assignment because
+	 * InitPostgres allocates memory after pgstat_beinit but prior to
+	 * pgstat_bestart so we have allocations to both local and shared memory
+	 * to combine.
+	 */
+	lbeentry.allocated_bytes += local_my_allocated_bytes;
+	local_my_allocated_bytes = 0;
+	lbeentry.aset_allocated_bytes += local_my_aset_allocated_bytes;
+	local_my_aset_allocated_bytes = 0;
+
+	lbeentry.dsm_allocated_bytes += local_my_dsm_allocated_bytes;
+	local_my_dsm_allocated_bytes = 0;
+	lbeentry.generation_allocated_bytes += local_my_generation_allocated_bytes;
+	local_my_generation_allocated_bytes = 0;
+	lbeentry.slab_allocated_bytes += local_my_slab_allocated_bytes;
+	local_my_slab_allocated_bytes = 0;
+
 	/*
 	 * we don't zero st_progress_param here to save cycles; nobody should
 	 * examine it until st_progress_command has been set to something other
@@ -460,6 +504,9 @@ pgstat_beshutdown_hook(int code, Datum arg)
 {
 	volatile PgBackendStatus *beentry = MyBEEntry;
 
+	/* Stop reporting memory allocation changes to shared memory */
+	pgstat_reset_allocated_bytes_storage();
+
 	/*
 	 * Clear my status entry, following the protocol of bumping st_changecount
 	 * before and after.  We use a volatile pointer here to ensure the
@@ -1214,3 +1261,70 @@ pgstat_clip_activity(const char *raw_activity)
 
 	return activity;
 }
+
+/*
+ * Configure bytes allocated reporting to report allocated bytes to
+ * shared memory.
+ *
+ * Expected to be called during backend startup (in pgstat_bestart), to point
+ * allocated bytes accounting into shared memory.
+ */
+void
+pgstat_set_allocated_bytes_storage(uint64 *allocated_bytes,
+								   uint64 *aset_allocated_bytes,
+								   uint64 *dsm_allocated_bytes,
+								   uint64 *generation_allocated_bytes,
+								   uint64 *slab_allocated_bytes)
+{
+	/* Map allocations to shared memory */
+	my_allocated_bytes = allocated_bytes;
+	*allocated_bytes = local_my_allocated_bytes;
+
+	my_aset_allocated_bytes = aset_allocated_bytes;
+	*aset_allocated_bytes = local_my_aset_allocated_bytes;
+
+	my_dsm_allocated_bytes = dsm_allocated_bytes;
+	*dsm_allocated_bytes = local_my_dsm_allocated_bytes;
+
+	my_generation_allocated_bytes = generation_allocated_bytes;
+	*generation_allocated_bytes = local_my_generation_allocated_bytes;
+
+	my_slab_allocated_bytes = slab_allocated_bytes;
+	*slab_allocated_bytes = local_my_slab_allocated_bytes;
+}
+
+/*
+ * Reset allocated bytes storage location.
+ *
+ * Expected to be called during backend shutdown, before the locations set up
+ * by pgstat_set_allocated_bytes_storage become invalid.
+ */
+void
+pgstat_reset_allocated_bytes_storage(void)
+{
+	if (ProcGlobal)
+	{
+		volatile PROC_HDR *procglobal = ProcGlobal;
+
+		/*
+		 * Add dsm allocations that have not been freed to global dsm
+		 * accounting
+		 */
+		pg_atomic_add_fetch_u64(&procglobal->global_dsm_allocation,
+								*my_dsm_allocated_bytes);
+	}
+
+	/* Reset memory allocation variables */
+	*my_allocated_bytes = local_my_allocated_bytes = 0;
+	*my_aset_allocated_bytes = local_my_aset_allocated_bytes = 0;
+	*my_dsm_allocated_bytes = local_my_dsm_allocated_bytes = 0;
+	*my_generation_allocated_bytes = local_my_generation_allocated_bytes = 0;
+	*my_slab_allocated_bytes = local_my_slab_allocated_bytes = 0;
+
+	/* Point my_{*_}allocated_bytes from shared memory back to local */
+	my_allocated_bytes = &local_my_allocated_bytes;
+	my_aset_allocated_bytes = &local_my_aset_allocated_bytes;
+	my_dsm_allocated_bytes = &local_my_dsm_allocated_bytes;
+	my_generation_allocated_bytes = &local_my_generation_allocated_bytes;
+	my_slab_allocated_bytes = &local_my_slab_allocated_bytes;
+}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 0cea320c00e..b372ee691ba 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2016,3 +2016,87 @@ pg_stat_have_stats(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(pgstat_have_entry(kind, dboid, objoid));
 }
+
+/*
+ * Get the memory allocation of PG backends.
+ */
+Datum
+pg_stat_get_memory_allocation(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_MEMORY_ALLOCATION_COLS	7
+	int			num_backends = pgstat_fetch_stat_numbackends();
+	int			curr_backend;
+	int			pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
+	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+
+	InitMaterializedSRF(fcinfo, 0);
+
+	/* 1-based index */
+	for (curr_backend = 1; curr_backend <= num_backends; curr_backend++)
+	{
+		/* for each row */
+		Datum		values[PG_STAT_GET_MEMORY_ALLOCATION_COLS] = {0};
+		bool		nulls[PG_STAT_GET_MEMORY_ALLOCATION_COLS] = {0};
+		LocalPgBackendStatus *local_beentry;
+		PgBackendStatus *beentry;
+
+		/* Get the next one in the list */
+		local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
+		beentry = &local_beentry->backendStatus;
+
+		/* If looking for specific PID, ignore all the others */
+		if (pid != -1 && beentry->st_procpid != pid)
+			continue;
+
+		/* Values available to all callers */
+		if (beentry->st_databaseid != InvalidOid)
+			values[0] = ObjectIdGetDatum(beentry->st_databaseid);
+		else
+			nulls[0] = true;
+
+		values[1] = Int32GetDatum(beentry->st_procpid);
+		values[2] = UInt64GetDatum(beentry->allocated_bytes);
+		values[3] = UInt64GetDatum(beentry->aset_allocated_bytes);
+		values[4] = UInt64GetDatum(beentry->dsm_allocated_bytes);
+		values[5] = UInt64GetDatum(beentry->generation_allocated_bytes);
+		values[6] = UInt64GetDatum(beentry->slab_allocated_bytes);
+
+		tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
+
+		/* If only a single backend was requested, and we found it, break. */
+		if (pid != -1)
+			break;
+	}
+
+	return (Datum) 0;
+}
+
+/*
+ * Get the global memory allocation statistics.
+ */
+Datum
+pg_stat_get_global_memory_allocation(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS	2
+	TupleDesc	tupdesc;
+	Datum		values[PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS] = {0};
+	bool		nulls[PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS] = {0};
+	volatile PROC_HDR *procglobal = ProcGlobal;
+
+	/* Initialise attributes information in the tuple descriptor */
+	tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
+					   OIDOID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "global_dsm_allocated_bytes",
+					   INT8OID, -1, 0);
+	BlessTupleDesc(tupdesc);
+
+	/* datid */
+	values[0] = ObjectIdGetDatum(MyDatabaseId);
+
+	/* get global_dsm_allocated_bytes */
+	values[1] = Int64GetDatum(pg_atomic_read_u64(&procglobal->global_dsm_allocation));
+
+	/* Returns the record as Datum */
+	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 5c9b6f991e0..2b082c68df6 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -170,6 +170,9 @@ InitPostmasterChild(void)
 				(errcode_for_socket_access(),
 				 errmsg_internal("could not set postmaster death monitoring pipe to FD_CLOEXEC mode: %m")));
 #endif
+
+	/* Init allocated bytes to avoid double counting parent allocation */
+	pgstat_init_allocated_bytes();
 }
 
 /*
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index c3affaf5a8a..7af0d141da2 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -47,6 +47,7 @@
 #include "postgres.h"
 
 #include "port/pg_bitutils.h"
+#include "utils/backend_status.h"
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_memorychunk.h"
@@ -517,6 +518,7 @@ AllocSetContextCreateInternal(MemoryContext parent,
 						name);
 
 	((MemoryContext) set)->mem_allocated = firstBlockSize;
+	pgstat_report_allocated_bytes_increase(firstBlockSize, PG_ALLOC_ASET);
 
 	return (MemoryContext) set;
 }
@@ -539,6 +541,7 @@ AllocSetReset(MemoryContext context)
 	AllocSet	set = (AllocSet) context;
 	AllocBlock	block;
 	Size		keepersize PG_USED_FOR_ASSERTS_ONLY;
+	uint64		deallocation = 0;
 
 	Assert(AllocSetIsValid(set));
 
@@ -581,6 +584,7 @@ AllocSetReset(MemoryContext context)
 		{
 			/* Normal case, release the block */
 			context->mem_allocated -= block->endptr - ((char *) block);
+			deallocation += block->endptr - ((char *) block);
 
 #ifdef CLOBBER_FREED_MEMORY
 			wipe_mem(block, block->freeptr - ((char *) block));
@@ -591,6 +595,7 @@ AllocSetReset(MemoryContext context)
 	}
 
 	Assert(context->mem_allocated == keepersize);
+	pgstat_report_allocated_bytes_decrease(deallocation, PG_ALLOC_ASET);
 
 	/* Reset block size allocation sequence, too */
 	set->nextBlockSize = set->initBlockSize;
@@ -609,6 +614,7 @@ AllocSetDelete(MemoryContext context)
 	AllocSet	set = (AllocSet) context;
 	AllocBlock	block = set->blocks;
 	Size		keepersize PG_USED_FOR_ASSERTS_ONLY;
+	uint64		deallocation = 0;
 
 	Assert(AllocSetIsValid(set));
 
@@ -647,11 +653,13 @@ AllocSetDelete(MemoryContext context)
 
 				freelist->first_free = (AllocSetContext *) oldset->header.nextchild;
 				freelist->num_free--;
+				deallocation += oldset->header.mem_allocated;
 
 				/* All that remains is to free the header/initial block */
 				free(oldset);
 			}
 			Assert(freelist->num_free == 0);
+			pgstat_report_allocated_bytes_decrease(deallocation, PG_ALLOC_ASET);
 		}
 
 		/* Now add the just-deleted context to the freelist. */
@@ -668,7 +676,10 @@ AllocSetDelete(MemoryContext context)
 		AllocBlock	next = block->next;
 
 		if (!IsKeeperBlock(set, block))
+		{
 			context->mem_allocated -= block->endptr - ((char *) block);
+			deallocation += block->endptr - ((char *) block);
+		}
 
 #ifdef CLOBBER_FREED_MEMORY
 		wipe_mem(block, block->freeptr - ((char *) block));
@@ -681,6 +692,7 @@ AllocSetDelete(MemoryContext context)
 	}
 
 	Assert(context->mem_allocated == keepersize);
+	pgstat_report_allocated_bytes_decrease(deallocation + context->mem_allocated, PG_ALLOC_ASET);
 
 	/* Finally, free the context header, including the keeper block */
 	free(set);
@@ -730,6 +742,7 @@ AllocSetAlloc(MemoryContext context, Size size)
 			return NULL;
 
 		context->mem_allocated += blksize;
+		pgstat_report_allocated_bytes_increase(blksize, PG_ALLOC_ASET);
 
 		block->aset = set;
 		block->freeptr = block->endptr = ((char *) block) + blksize;
@@ -943,6 +956,7 @@ AllocSetAlloc(MemoryContext context, Size size)
 			return NULL;
 
 		context->mem_allocated += blksize;
+		pgstat_report_allocated_bytes_increase(blksize, PG_ALLOC_ASET);
 
 		block->aset = set;
 		block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
@@ -1040,6 +1054,7 @@ AllocSetFree(void *pointer)
 			block->next->prev = block->prev;
 
 		set->header.mem_allocated -= block->endptr - ((char *) block);
+		pgstat_report_allocated_bytes_decrease(block->endptr - ((char *) block), PG_ALLOC_ASET);
 
 #ifdef CLOBBER_FREED_MEMORY
 		wipe_mem(block, block->freeptr - ((char *) block));
@@ -1170,7 +1185,9 @@ AllocSetRealloc(void *pointer, Size size)
 
 		/* updated separately, not to underflow when (oldblksize > blksize) */
 		set->header.mem_allocated -= oldblksize;
+		pgstat_report_allocated_bytes_decrease(oldblksize, PG_ALLOC_ASET);
 		set->header.mem_allocated += blksize;
+		pgstat_report_allocated_bytes_increase(blksize, PG_ALLOC_ASET);
 
 		block->freeptr = block->endptr = ((char *) block) + blksize;
 
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c
index 92401ccf738..0ed54571497 100644
--- a/src/backend/utils/mmgr/generation.c
+++ b/src/backend/utils/mmgr/generation.c
@@ -37,6 +37,7 @@
 
 #include "lib/ilist.h"
 #include "port/pg_bitutils.h"
+#include "utils/backend_status.h"
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_memorychunk.h"
@@ -263,6 +264,7 @@ GenerationContextCreate(MemoryContext parent,
 						name);
 
 	((MemoryContext) set)->mem_allocated = firstBlockSize;
+	pgstat_report_allocated_bytes_increase(firstBlockSize, PG_ALLOC_GENERATION);
 
 	return (MemoryContext) set;
 }
@@ -279,6 +281,7 @@ GenerationReset(MemoryContext context)
 {
 	GenerationContext *set = (GenerationContext *) context;
 	dlist_mutable_iter miter;
+	uint64		deallocation = 0;
 
 	Assert(GenerationIsValid(set));
 
@@ -301,9 +304,14 @@ GenerationReset(MemoryContext context)
 		if (IsKeeperBlock(set, block))
 			GenerationBlockMarkEmpty(block);
 		else
+		{
+			deallocation += block->blksize;
 			GenerationBlockFree(set, block);
+		}
 	}
 
+	pgstat_report_allocated_bytes_decrease(deallocation, PG_ALLOC_GENERATION);
+
 	/* set it so new allocations to make use of the keeper block */
 	set->block = KeeperBlock(set);
 
@@ -324,6 +332,9 @@ GenerationDelete(MemoryContext context)
 {
 	/* Reset to release all releasable GenerationBlocks */
 	GenerationReset(context);
+
+	pgstat_report_allocated_bytes_decrease(context->mem_allocated, PG_ALLOC_GENERATION);
+
 	/* And free the context header and keeper block */
 	free(context);
 }
@@ -370,6 +381,7 @@ GenerationAlloc(MemoryContext context, Size size)
 			return NULL;
 
 		context->mem_allocated += blksize;
+		pgstat_report_allocated_bytes_increase(blksize, PG_ALLOC_GENERATION);
 
 		/* block with a single (used) chunk */
 		block->context = set;
@@ -473,6 +485,7 @@ GenerationAlloc(MemoryContext context, Size size)
 				return NULL;
 
 			context->mem_allocated += blksize;
+			pgstat_report_allocated_bytes_increase(blksize, PG_ALLOC_GENERATION);
 
 			/* initialize the new block */
 			GenerationBlockInit(set, block, blksize);
@@ -725,6 +738,8 @@ GenerationFree(void *pointer)
 	dlist_delete(&block->node);
 
 	set->header.mem_allocated -= block->blksize;
+	pgstat_report_allocated_bytes_decrease(block->blksize, PG_ALLOC_GENERATION);
+
 	free(block);
 }
 
diff --git a/src/backend/utils/mmgr/slab.c b/src/backend/utils/mmgr/slab.c
index 40c1d401c4c..c99ff532af2 100644
--- a/src/backend/utils/mmgr/slab.c
+++ b/src/backend/utils/mmgr/slab.c
@@ -69,6 +69,7 @@
 #include "postgres.h"
 
 #include "lib/ilist.h"
+#include "utils/backend_status.h"
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 #include "utils/memutils_memorychunk.h"
@@ -417,6 +418,13 @@ SlabContextCreate(MemoryContext parent,
 						parent,
 						name);
 
+	/*
+	 * If SlabContextCreate is updated to add context header size to
+	 * context->mem_allocated, then update here and SlabDelete appropriately
+	 */
+	pgstat_report_allocated_bytes_increase(Slab_CONTEXT_HDRSZ(slab->chunksPerBlock),
+										   PG_ALLOC_SLAB);
+
 	return (MemoryContext) slab;
 }
 
@@ -433,6 +441,7 @@ SlabReset(MemoryContext context)
 	SlabContext *slab = (SlabContext *) context;
 	dlist_mutable_iter miter;
 	int			i;
+	uint64		deallocation = 0;
 
 	Assert(SlabIsValid(slab));
 
@@ -453,6 +462,7 @@ SlabReset(MemoryContext context)
 #endif
 		free(block);
 		context->mem_allocated -= slab->blockSize;
+		deallocation += slab->blockSize;
 	}
 
 	/* walk over blocklist and free the blocks */
@@ -469,9 +479,11 @@ SlabReset(MemoryContext context)
 #endif
 			free(block);
 			context->mem_allocated -= slab->blockSize;
+			deallocation += slab->blockSize;
 		}
 	}
 
+	pgstat_report_allocated_bytes_decrease(deallocation, PG_ALLOC_SLAB);
 	slab->curBlocklistIndex = 0;
 
 	Assert(context->mem_allocated == 0);
@@ -486,6 +498,14 @@ SlabDelete(MemoryContext context)
 {
 	/* Reset to release all the SlabBlocks */
 	SlabReset(context);
+
+	/*
+	 * Until context header allocation is included in context->mem_allocated,
+	 * cast to slab and decrement the header allocation
+	 */
+	pgstat_report_allocated_bytes_decrease(Slab_CONTEXT_HDRSZ(((SlabContext *) context)->chunksPerBlock),
+										   PG_ALLOC_SLAB);
+
 	/* And free the context header */
 	free(context);
 }
@@ -550,6 +570,7 @@ SlabAlloc(MemoryContext context, Size size)
 
 			block->slab = slab;
 			context->mem_allocated += slab->blockSize;
+			pgstat_report_allocated_bytes_increase(slab->blockSize, PG_ALLOC_SLAB);
 
 			/* use the first chunk in the new block */
 			chunk = SlabBlockGetChunk(slab, block, 0);
@@ -744,6 +765,7 @@ SlabFree(void *pointer)
 #endif
 			free(block);
 			slab->header.mem_allocated -= slab->blockSize;
+			pgstat_report_allocated_bytes_decrease(slab->blockSize, PG_ALLOC_SLAB);
 		}
 
 		/*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b8b26c263db..fe0549c43d9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5435,6 +5435,23 @@
   proname => 'pg_stat_get_backend_idset', prorows => '100', proretset => 't',
   provolatile => 's', proparallel => 'r', prorettype => 'int4',
   proargtypes => '', prosrc => 'pg_stat_get_backend_idset' },
+{ oid => '9890',
+  descr => 'statistics: memory allocation information for backends',
+  proname => 'pg_stat_get_memory_allocation', prorows => '100', proisstrict => 'f',
+  proretset => 't', provolatile => 's', proparallel => 'r',
+  prorettype => 'record', proargtypes => 'int4',
+  proallargtypes => '{int4,oid,int4,int8,int8,int8,int8,int8}',
+  proargmodes => '{i,o,o,o,o,o,o,o}',
+  proargnames => '{pid,datid,pid,allocated_bytes,aset_allocated_bytes,dsm_allocated_bytes,generation_allocated_bytes,slab_allocated_bytes}',
+  prosrc => 'pg_stat_get_memory_allocation' },
+{ oid => '9891',
+  descr => 'statistics: global memory allocation information',
+  proname => 'pg_stat_get_global_memory_allocation', proisstrict => 'f',
+  provolatile => 's', proparallel => 'r', prorettype => 'record',
+  proargtypes => '', proallargtypes => '{oid,int8}',
+  proargmodes => '{o,o}',
+  proargnames => '{datid,global_dsm_allocated_bytes}',
+  prosrc =>'pg_stat_get_global_memory_allocation' },
 { oid => '2022',
   descr => 'statistics: information about currently active backends',
   proname => 'pg_stat_get_activity', prorows => '100', proisstrict => 'f',
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index e87fd25d643..26b17d66477 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -404,6 +404,8 @@ typedef struct PROC_HDR
 	int			spins_per_delay;
 	/* Buffer id of the buffer that Startup process waits for pin on, or -1 */
 	int			startupBufferPinWaitBufId;
+	/* Global dsm allocations */
+	pg_atomic_uint64 global_dsm_allocation;
 } PROC_HDR;
 
 extern PGDLLIMPORT PROC_HDR *ProcGlobal;
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 75fc18c4327..c2c8ba7214d 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -10,6 +10,7 @@
 #ifndef BACKEND_STATUS_H
 #define BACKEND_STATUS_H
 
+#include "common/int.h"
 #include "datatype/timestamp.h"
 #include "libpq/pqcomm.h"
 #include "miscadmin.h"			/* for BackendType */
@@ -32,6 +33,14 @@ typedef enum BackendState
 	STATE_DISABLED,
 } BackendState;
 
+/* Enum helper for reporting memory allocator type */
+enum pg_allocator_type
+{
+	PG_ALLOC_ASET = 1,
+	PG_ALLOC_DSM,
+	PG_ALLOC_GENERATION,
+	PG_ALLOC_SLAB
+};
 
 /* ----------
  * Shared-memory data structures
@@ -170,6 +179,15 @@ typedef struct PgBackendStatus
 
 	/* query identifier, optionally computed using post_parse_analyze_hook */
 	uint64		st_query_id;
+
+	/* Current memory allocated to this backend */
+	uint64		allocated_bytes;
+
+	/* Current memory allocated to this backend by type */
+	uint64		aset_allocated_bytes;
+	uint64		dsm_allocated_bytes;
+	uint64		generation_allocated_bytes;
+	uint64		slab_allocated_bytes;
 } PgBackendStatus;
 
 
@@ -294,6 +312,11 @@ extern PGDLLIMPORT int pgstat_track_activity_query_size;
  * ----------
  */
 extern PGDLLIMPORT PgBackendStatus *MyBEEntry;
+extern PGDLLIMPORT uint64 *my_allocated_bytes;
+extern PGDLLIMPORT uint64 *my_aset_allocated_bytes;
+extern PGDLLIMPORT uint64 *my_dsm_allocated_bytes;
+extern PGDLLIMPORT uint64 *my_generation_allocated_bytes;
+extern PGDLLIMPORT uint64 *my_slab_allocated_bytes;
 
 
 /* ----------
@@ -325,7 +348,12 @@ extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);
 extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 													   int buflen);
 extern uint64 pgstat_get_my_query_id(void);
-
+extern void pgstat_set_allocated_bytes_storage(uint64 *allocated_bytes,
+											   uint64 *aset_allocated_bytes,
+											   uint64 *dsm_allocated_bytes,
+											   uint64 *generation_allocated_bytes,
+											   uint64 *slab_allocated_bytes);
+extern void pgstat_reset_allocated_bytes_storage(void);
 
 /* ----------
  * Support functions for the SQL-callable functions to
@@ -338,5 +366,119 @@ extern LocalPgBackendStatus *pgstat_get_local_beentry_by_backend_id(BackendId be
 extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int idx);
 extern char *pgstat_clip_activity(const char *raw_activity);
 
+/* ----------
+ * pgstat_report_allocated_bytes_decrease() -
+ *  Called to report decrease in memory allocated for this backend.
+ *
+ * my_{*_}allocated_bytes initially points to local memory, making it safe to
+ * call this before pgstats has been initialized.
+ * ----------
+ */
+static inline void
+pgstat_report_allocated_bytes_decrease(int64 proc_allocated_bytes,
+									   int pg_allocator_type)
+{
+	uint64		temp;
+
+	/* Avoid allocated_bytes unsigned integer overflow on decrease */
+	if (pg_sub_u64_overflow(*my_allocated_bytes, proc_allocated_bytes, &temp))
+	{
+		/* On overflow, set allocated bytes and allocator type bytes to zero */
+		*my_allocated_bytes = 0;
+		*my_aset_allocated_bytes = 0;
+		*my_dsm_allocated_bytes = 0;
+		*my_generation_allocated_bytes = 0;
+		*my_slab_allocated_bytes = 0;
+	}
+	else
+	{
+		/* decrease allocation */
+		*my_allocated_bytes -= proc_allocated_bytes;
+
+		/* Decrease allocator type allocated bytes. */
+		switch (pg_allocator_type)
+		{
+			case PG_ALLOC_ASET:
+				*my_aset_allocated_bytes -= proc_allocated_bytes;
+				break;
+			case PG_ALLOC_DSM:
+
+				/*
+				 * Some dsm allocations live beyond process exit. These are
+				 * accounted for in a global counter in
+				 * pgstat_reset_allocated_bytes_storage at process exit.
+				 */
+				*my_dsm_allocated_bytes -= proc_allocated_bytes;
+				break;
+			case PG_ALLOC_GENERATION:
+				*my_generation_allocated_bytes -= proc_allocated_bytes;
+				break;
+			case PG_ALLOC_SLAB:
+				*my_slab_allocated_bytes -= proc_allocated_bytes;
+				break;
+		}
+	}
+
+	return;
+}
+
+/* ----------
+ * pgstat_report_allocated_bytes_increase() -
+ *  Called to report increase in memory allocated for this backend.
+ *
+ * my_allocated_bytes initially points to local memory, making it safe to call
+ * this before pgstats has been initialized.
+ * ----------
+ */
+static inline void
+pgstat_report_allocated_bytes_increase(int64 proc_allocated_bytes,
+									   int pg_allocator_type)
+{
+	*my_allocated_bytes += proc_allocated_bytes;
+
+	/* Increase allocator type allocated bytes */
+	switch (pg_allocator_type)
+	{
+		case PG_ALLOC_ASET:
+			*my_aset_allocated_bytes += proc_allocated_bytes;
+			break;
+		case PG_ALLOC_DSM:
+
+			/*
+			 * Some dsm allocations live beyond process exit. These are
+			 * accounted for in a global counter in
+			 * pgstat_reset_allocated_bytes_storage at process exit.
+			 */
+			*my_dsm_allocated_bytes += proc_allocated_bytes;
+			break;
+		case PG_ALLOC_GENERATION:
+			*my_generation_allocated_bytes += proc_allocated_bytes;
+			break;
+		case PG_ALLOC_SLAB:
+			*my_slab_allocated_bytes += proc_allocated_bytes;
+			break;
+	}
+
+	return;
+}
+
+/* ---------
+ * pgstat_init_allocated_bytes() -
+ *
+ * Called to initialize allocated bytes variables after fork and to
+ * avoid double counting allocations.
+ * ---------
+ */
+static inline void
+pgstat_init_allocated_bytes(void)
+{
+	*my_allocated_bytes = 0;
+	*my_aset_allocated_bytes = 0;
+	*my_dsm_allocated_bytes = 0;
+	*my_generation_allocated_bytes = 0;
+	*my_slab_allocated_bytes = 0;
+
+	return;
+}
 
 #endif							/* BACKEND_STATUS_H */
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 05070393b99..7d412b26801 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1872,6 +1872,24 @@ pg_stat_database_conflicts| SELECT oid AS datid,
     pg_stat_get_db_conflict_startup_deadlock(oid) AS confl_deadlock,
     pg_stat_get_db_conflict_logicalslot(oid) AS confl_active_logicalslot
    FROM pg_database d;
+pg_stat_global_memory_allocation| WITH sums AS (
+         SELECT sum(pg_stat_memory_allocation.aset_allocated_bytes) AS total_aset_allocated_bytes,
+            sum(pg_stat_memory_allocation.dsm_allocated_bytes) AS total_dsm_allocated_bytes,
+            sum(pg_stat_memory_allocation.generation_allocated_bytes) AS total_generation_allocated_bytes,
+            sum(pg_stat_memory_allocation.slab_allocated_bytes) AS total_slab_allocated_bytes
+           FROM pg_stat_memory_allocation
+        )
+ SELECT s.datid,
+    current_setting('shared_memory_size'::text, true) AS shared_memory_size,
+    (current_setting('shared_memory_size_in_huge_pages'::text, true))::integer AS shared_memory_size_in_huge_pages,
+    s.global_dsm_allocated_bytes,
+    sums.total_aset_allocated_bytes,
+    sums.total_dsm_allocated_bytes,
+    sums.total_generation_allocated_bytes,
+    sums.total_slab_allocated_bytes
+   FROM sums,
+    (pg_stat_get_global_memory_allocation() s(datid, global_dsm_allocated_bytes)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_gssapi| SELECT pid,
     gss_auth AS gss_authenticated,
     gss_princ AS principal,
@@ -1898,6 +1916,15 @@ pg_stat_io| SELECT backend_type,
     fsync_time,
     stats_reset
    FROM pg_stat_get_io() b(backend_type, object, context, reads, read_time, writes, write_time, writebacks, writeback_time, extends, extend_time, op_bytes, hits, evictions, reuses, fsyncs, fsync_time, stats_reset);
+pg_stat_memory_allocation| SELECT s.datid,
+    s.pid,
+    s.allocated_bytes,
+    s.aset_allocated_bytes,
+    s.dsm_allocated_bytes,
+    s.generation_allocated_bytes,
+    s.slab_allocated_bytes
+   FROM (pg_stat_get_memory_allocation(NULL::integer) s(datid, pid, allocated_bytes, aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes, slab_allocated_bytes)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_analyze| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 346e10a3d2b..92326a3e697 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1646,4 +1646,40 @@ SELECT COUNT(*) FROM brin_hot_3 WHERE a = 2;
 
 DROP TABLE brin_hot_3;
 SET enable_seqscan = on;
+-- ensure that allocated_bytes exist for backends
+SELECT
+    allocated_bytes > 0 AS result
+FROM
+    pg_stat_activity ps
+    JOIN pg_stat_memory_allocation pa ON (pa.pid = ps.pid)
+WHERE
+    backend_type IN ('checkpointer', 'background writer', 'walwriter', 'autovacuum launcher');
+ result 
+--------
+ t
+ t
+ t
+ t
+(4 rows)
+
+-- ensure that pg_stat_global_memory_allocation view exists
+SELECT
+    datid > 0, pg_size_bytes(shared_memory_size) >= 0, shared_memory_size_in_huge_pages >= -1, global_dsm_allocated_bytes >= 0
+FROM
+    pg_stat_global_memory_allocation;
+ ?column? | ?column? | ?column? | ?column? 
+----------+----------+----------+----------
+ t        | t        | t        | t
+(1 row)
+
+-- ensure that pg_stat_memory_allocation view exists
+SELECT
+    pid > 0, allocated_bytes >= 0, aset_allocated_bytes >= 0, dsm_allocated_bytes >= 0, generation_allocated_bytes >= 0, slab_allocated_bytes >= 0
+FROM
+    pg_stat_memory_allocation limit 1;
+ ?column? | ?column? | ?column? | ?column? | ?column? | ?column? 
+----------+----------+----------+----------+----------+----------
+ t        | t        | t        | t        | t        | t
+(1 row)
+
 -- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index e3b4ca96e89..01134187bb8 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -849,4 +849,24 @@ DROP TABLE brin_hot_3;
 
 SET enable_seqscan = on;
 
+-- ensure that allocated_bytes exist for backends
+SELECT
+    allocated_bytes > 0 AS result
+FROM
+    pg_stat_activity ps
+    JOIN pg_stat_memory_allocation pa ON (pa.pid = ps.pid)
+WHERE
+    backend_type IN ('checkpointer', 'background writer', 'walwriter', 'autovacuum launcher');
+
+-- ensure that pg_stat_global_memory_allocation view exists
+SELECT
+    datid > 0, pg_size_bytes(shared_memory_size) >= 0, shared_memory_size_in_huge_pages >= -1, global_dsm_allocated_bytes >= 0
+FROM
+    pg_stat_global_memory_allocation;
+
+-- ensure that pg_stat_memory_allocation view exists
+SELECT
+    pid > 0, allocated_bytes >= 0, aset_allocated_bytes >= 0, dsm_allocated_bytes >= 0, generation_allocated_bytes >= 0, slab_allocated_bytes >= 0
+FROM
+    pg_stat_memory_allocation limit 1;
 -- End of Stats Test
-- 
2.41.0
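
For readers skimming the series: the decrease helper in patch 1/3 (pgstat_report_allocated_bytes_decrease) checks only the total counter for wraparound, and if the subtraction would wrap it zeroes every counter instead. Below is a minimal standalone sketch of that behavior. All names in it (SketchCounters, sketch_report_increase, sketch_report_decrease and the enum) are hypothetical and not part of the patch, and it uses a plain comparison rather than pg_sub_u64_overflow so that it compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical allocator kinds, mirroring the four tracked by the patch. */
typedef enum
{
	SKETCH_ALLOC_ASET = 0,
	SKETCH_ALLOC_DSM,
	SKETCH_ALLOC_GENERATION,
	SKETCH_ALLOC_SLAB,
	SKETCH_NUM_ALLOC_TYPES
} SketchAllocType;

/* Hypothetical miniature of the per-backend counters. */
typedef struct
{
	uint64_t	total;
	uint64_t	by_type[SKETCH_NUM_ALLOC_TYPES];
} SketchCounters;

/* Record an allocation in the total and in the per-allocator counter. */
void
sketch_report_increase(SketchCounters *c, uint64_t bytes, SketchAllocType type)
{
	assert(type < SKETCH_NUM_ALLOC_TYPES);
	c->total += bytes;
	c->by_type[type] += bytes;
}

/*
 * Record a deallocation. As in the patch, only the total is checked: if the
 * subtraction would wrap below zero, every counter is reset to zero.
 */
void
sketch_report_decrease(SketchCounters *c, uint64_t bytes, SketchAllocType type)
{
	assert(type < SKETCH_NUM_ALLOC_TYPES);

	if (bytes > c->total)
	{
		memset(c, 0, sizeof(*c));
		return;
	}

	c->total -= bytes;
	c->by_type[type] -= bytes;
}
```

One consequence of this design is that a single mismatched decrease discards the per-allocator breakdown for the backend, not just the total.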

From 468288c2dd154ebcb90cb3a93bee36622169be18 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Tue, 26 Dec 2023 18:05:02 +0100
Subject: [PATCH v20231226 2/3] fixup: pgstat_get_local_beentry_by_index

---
 src/backend/utils/adt/pgstatfuncs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index b372ee691ba..9b52cc5091f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2041,7 +2041,7 @@ pg_stat_get_memory_allocation(PG_FUNCTION_ARGS)
 		PgBackendStatus *beentry;
 
 		/* Get the next one in the list */
-		local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
+		local_beentry = pgstat_get_local_beentry_by_index(curr_backend);
 		beentry = &local_beentry->backendStatus;
 
 		/* If looking for specific PID, ignore all the others */
-- 
2.41.0

From d45b026dfcaa5d891f83af43645fd92519d9841a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Tue, 26 Dec 2023 17:55:23 +0100
Subject: [PATCH v20231226 3/3] Add the ability to limit the amount of memory
 that can be allocated to backends.

This builds on the work that adds backend memory allocated tracking.

Add GUC variable max_total_backend_memory.

Specifies a limit on the amount of memory (in MB) that may be allocated to
backends in total (i.e. this is not a per-user or per-backend limit). If unset
or set to 0, the limit is disabled. It is intended to help avoid the OOM
killer on Linux and to manage resources in general. A backend request that
would exhaust max_total_backend_memory will be denied with an out of memory
error, causing that backend's current query/transaction to fail.  Further
requests will be denied until usage drops below the limit. Keep this in
mind when setting this value. Due to the dynamic nature of memory allocations,
this limit is not exact. This limit does not affect auxiliary backend
processes. Backend memory allocations are displayed in the
pg_stat_memory_allocation and pg_stat_global_memory_allocation views.
---
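Note for reviewers: the documentation added by this patch describes a per-backend local allowance that is refilled from a shared pool (total_bkend_mem_bytes_available) in 1MB steps, or by the full request size when a request is larger than 1MB. The following is only a rough standalone sketch of that accounting, written from the documentation text and not from the patch code. Every identifier in it (pool_available, local_allowance, pool_take, backend_start, reserve_bytes) is hypothetical, and it uses C11 atomics in place of PostgreSQL's pg_atomic API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Default refill quantity (1MB), as described in the docs of this patch. */
#define REFILL_BYTES ((uint64_t) 1024 * 1024)

/* Hypothetical stand-in for the shared "bytes still available" counter. */
_Atomic uint64_t pool_available;

/* Hypothetical process-local allowance; each backend would have its own. */
uint64_t	local_allowance;

/*
 * Take 'want' bytes from the shared pool. Fails, leaving the pool
 * unchanged, if fewer than 'want' bytes remain.
 */
bool
pool_take(uint64_t want)
{
	uint64_t	avail = atomic_load(&pool_available);

	while (avail >= want)
	{
		/* On failure the CAS reloads 'avail' with the current value. */
		if (atomic_compare_exchange_weak(&pool_available, &avail, avail - want))
			return true;
	}
	return false;
}

/* Backend start: reserve the initial 1MB local allowance. */
bool
backend_start(void)
{
	if (!pool_take(REFILL_BYTES))
		return false;
	local_allowance = REFILL_BYTES;
	return true;
}

/*
 * Account for an allocation request. Returns false where the real code
 * would raise an out of memory error.
 */
bool
reserve_bytes(uint64_t request)
{
	uint64_t	refill;

	assert(request > 0);

	if (request > local_allowance)
	{
		/* Refill by 1MB, or by the full request if it is larger. */
		refill = (request > REFILL_BYTES) ? request : REFILL_BYTES;
		if (!pool_take(refill))
			return false;
		local_allowance += refill;
	}

	local_allowance -= request;
	return true;
}
```

With this shape, most requests touch only the process-local counter, and the shared atomic is hit roughly once per megabyte allocated. It is also one reason the limit cannot be exact: bytes parked in local allowances are already subtracted from the pool even though they are not yet allocated.
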
 doc/src/sgml/config.sgml                      |  30 ++++
 doc/src/sgml/monitoring.sgml                  |  38 ++++-
 src/backend/catalog/system_views.sql          |   2 +
 src/backend/port/sysv_shmem.c                 |   9 ++
 src/backend/postmaster/postmaster.c           |   5 +
 src/backend/storage/ipc/dsm_impl.c            |  18 +++
 src/backend/storage/lmgr/proc.c               |  45 ++++++
 src/backend/utils/activity/backend_status.c   | 147 ++++++++++++++++++
 src/backend/utils/adt/pgstatfuncs.c           |  16 +-
 src/backend/utils/hash/dynahash.c             |   3 +-
 src/backend/utils/init/miscinit.c             |   8 +
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  33 ++++
 src/backend/utils/mmgr/generation.c           |  16 ++
 src/backend/utils/mmgr/slab.c                 |  15 +-
 src/include/catalog/pg_proc.dat               |   6 +-
 src/include/storage/proc.h                    |   7 +
 src/include/utils/backend_status.h            | 102 +++++++++++-
 src/test/regress/expected/rules.out           |   4 +-
 20 files changed, 498 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b5624ca8847..6b0e7b753d8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2167,6 +2167,36 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-max-total-backend-memory" xreflabel="max_total_backend_memory">
+      <term><varname>max_total_backend_memory</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>max_total_backend_memory</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies a limit on the amount of memory (in MB) that may be allocated
+        to backends in total (i.e. this is not a per-user or per-backend limit).
+        If unset or set to 0, the limit is disabled.  At database startup,
+        max_total_backend_memory is reduced by shared_memory_size_mb
+        (includes shared buffers and other memory required for initialization).
+        Each backend process is initialized with a 1MB local allowance which
+        also reduces total_bkend_mem_bytes_available. Keep this in mind when
+        setting this value. A backend request that would exhaust the limit will
+        be denied with an out of memory error, causing that backend's current
+        query/transaction to fail. Further requests will be denied until usage
+        drops below the limit.  This limit does not affect auxiliary backend
+        processes
+        <xref linkend="glossary-auxiliary-proc"/> or the postmaster process.
+        Backend memory allocations (<varname>allocated_bytes</varname>) are
+        displayed in the
+        <link linkend="monitoring-pg-stat-memory-allocation-view"><structname>pg_stat_memory_allocation</structname></link>
+        view.  Due to the dynamic nature of memory allocations, this limit is
+        not exact.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
      </sect2>
 
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 99f5acf07f4..01e5d5ef85b 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4623,10 +4623,7 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
      <para>
       Memory currently allocated to this backend in bytes. This is the balance
-      of bytes allocated and freed by this backend. Dynamic shared memory
-      allocations are included only in the value displayed for the backend that
-      created them, they are not included in the value for backends that are
-      attached to them to avoid double counting.
+      of bytes allocated and freed by this backend.
      </para></entry>
      </row>
 
@@ -4743,6 +4740,39 @@ description | Waiting for a newly initialized WAL file to reach durable storage
      </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>max_total_backend_memory_bytes</structfield> <type>bigint</type>
+      </para>
+     <para>
+      Reports the user-defined limit on total backend memory, in bytes.
+      0 if disabled or not set. See
+      <xref linkend="guc-max-total-backend-memory"/>.
+     </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_bkend_mem_bytes_available</structfield> <type>bigint</type>
+      </para>
+     <para>
+      Tracks max_total_backend_memory (in bytes) available for allocation. At
+      database startup, total_bkend_mem_bytes_available is reduced by the
+      byte equivalent of shared_memory_size_mb. Each backend process is
+      initialized with a 1MB local allowance which also reduces
+      total_bkend_mem_bytes_available. A process's allocation requests reduce
+      its local allowance. If a process's allocation request exceeds its
+      remaining allowance, an attempt is made to refill the local allowance
+      from total_bkend_mem_bytes_available. If the refill request fails, then
+      the requesting process will fail with an out of memory error resulting
+      in the cancellation of that process's active query/transaction.  The
+      default refill allocation quantity is 1MB.  If a request is greater than
+      1MB, an attempt will be made to allocate the full amount. If
+      max_total_backend_memory is disabled, this will be -1. See
+      <xref linkend="guc-max-total-backend-memory"/>.
+     </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>global_dsm_allocated_bytes</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fd7fcaf59e3..c665985aca8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1396,6 +1396,8 @@ SELECT
         S.datid AS datid,
         current_setting('shared_memory_size', true) as shared_memory_size,
         (current_setting('shared_memory_size_in_huge_pages', true))::integer as shared_memory_size_in_huge_pages,
+        pg_size_bytes(current_setting('max_total_backend_memory', true)) as max_total_backend_memory_bytes,
+        S.total_bkend_mem_bytes_available,
         S.global_dsm_allocated_bytes,
         sums.total_aset_allocated_bytes,
         sums.total_dsm_allocated_bytes,
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2de280ecb6f..af87c1dd3b3 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -34,6 +34,7 @@
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/backend_status.h"
 #include "utils/guc_hooks.h"
 #include "utils/pidfile.h"
 
@@ -917,6 +918,14 @@ PGSharedMemoryReAttach(void)
 	dsm_set_control_handle(hdr->dsm_control);
 
 	UsedShmemSegAddr = hdr;		/* probably redundant */
+
+	/*
+	 * Init allocated bytes to avoid double counting parent allocation for
+	 * fork/exec processes. Forked processes perform this action in
+	 * InitPostmasterChild. For EXEC_BACKEND processes we have to wait for
+	 * shared memory to be reattached.
+	 */
+	pgstat_init_allocated_bytes();
 }
 
 /*
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index b163e89cbb5..a658bc0d131 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -545,6 +545,7 @@ typedef struct
 #endif
 	char		my_exec_path[MAXPGPATH];
 	char		pkglib_path[MAXPGPATH];
+	int			max_total_bkend_mem;
 } BackendParameters;
 
 static void read_backend_variables(char *id, Port **port, BackgroundWorker **worker);
@@ -6130,6 +6131,8 @@ save_backend_variables(BackendParameters *param, Port *port, BackgroundWorker *w
 
 	strlcpy(param->pkglib_path, pkglib_path, MAXPGPATH);
 
+	param->max_total_bkend_mem = max_total_bkend_mem;
+
 	return true;
 }
 
@@ -6372,6 +6375,8 @@ restore_backend_variables(BackendParameters *param, Port **port, BackgroundWorke
 
 	strlcpy(pkglib_path, param->pkglib_path, MAXPGPATH);
 
+	max_total_bkend_mem = param->max_total_bkend_mem;
+
 	/*
 	 * We need to restore fd.c's counts of externally-opened FDs; to avoid
 	 * confusion, be sure to do this after restoring max_safe_fds.  (Note:
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index a9e0987747b..2c74580cec0 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -254,6 +254,16 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+	{
+		ereport(elevel,
+				(errcode_for_dynamic_shared_memory(),
+				 errmsg("out of memory for segment \"%s\" - exceeds max_total_backend_memory",
+						name)));
+		return false;
+	}
+
 	/*
 	 * Create new segment or open an existing one for attach.
 	 *
@@ -523,6 +533,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 		int			flags = IPCProtection;
 		size_t		segsize;
 
+		/* Do not exceed maximum allowed memory allocation */
+		if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+			return false;
+
 		/*
 		 * Allocate the memory BEFORE acquiring the resource, so that we don't
 		 * leak the resource if memory allocation fails.
@@ -717,6 +731,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/* Create new segment or open an existing one for attach. */
 	if (op == DSM_OP_CREATE)
 	{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 81304a569c0..51a6c312284 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -51,6 +51,7 @@
 #include "storage/procsignal.h"
 #include "storage/spin.h"
 #include "storage/standby.h"
+#include "utils/guc.h"
 #include "utils/timeout.h"
 #include "utils/timestamp.h"
 
@@ -182,6 +183,50 @@ InitProcGlobal(void)
 	pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PGPROCNO);
 	pg_atomic_init_u64(&ProcGlobal->global_dsm_allocation, 0);
 
+	/* Setup backend memory limiting if configured */
+	if (max_total_bkend_mem > 0)
+	{
+		/*
+		 * Convert max_total_bkend_mem to bytes, account for
+		 * shared_memory_size, and initialize total_bkend_mem_bytes.
+		 */
+		int			result = 0;
+
+		/* Get integer value of shared_memory_size */
+		if (parse_int(GetConfigOption("shared_memory_size", true, false), &result, 0, NULL))
+		{
+			/*
+			 * Error on startup if backend memory limit is less than shared
+			 * memory size. Warn on startup if backend memory available is
+			 * less than arbitrarily picked value of 100MB.
+			 */
+
+			if (max_total_bkend_mem - result <= 0)
+			{
+				ereport(ERROR,
+						errmsg("configured max_total_backend_memory %dMB is <= shared_memory_size %dMB",
+							   max_total_bkend_mem, result),
+						errhint("Disable or increase the configuration parameter \"max_total_backend_memory\"."));
+			}
+			else if (max_total_bkend_mem - result <= 100)
+			{
+				ereport(WARNING,
+						errmsg("max_total_backend_memory %dMB - shared_memory_size %dMB is <= 100MB",
+							   max_total_bkend_mem, result),
+						errhint("Consider increasing the configuration parameter \"max_total_backend_memory\"."));
+			}
+
+			/*
+			 * Account for shared memory size and initialize
+			 * total_bkend_mem_bytes.
+			 */
+			pg_atomic_init_u64(&ProcGlobal->total_bkend_mem_bytes,
+							   (uint64) max_total_bkend_mem * 1024 * 1024 - (uint64) result * 1024 * 1024);
+		}
+		else
+			ereport(ERROR, errmsg("max_total_backend_memory initialization is unable to parse shared_memory_size"));
+	}
+
 	/*
 	 * Create and initialize all the PGPROC structures we'll need.  There are
 	 * five separate consumers: (1) normal backends, (2) autovacuum workers
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 838a7337933..a3d610a2461 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -45,6 +45,12 @@
 bool		pgstat_track_activities = false;
 int			pgstat_track_activity_query_size = 1024;
 
+/*
+ * Max backend memory allocation allowed (MB). 0 = disabled.
+ * Centralized bucket ProcGlobal->total_bkend_mem_bytes is initialized
+ * as a byte representation of this value in InitProcGlobal().
+ */
+int			max_total_bkend_mem = 0;
 
 /* exposed so that backend_progress.c can access it */
 PgBackendStatus *MyBEEntry = NULL;
@@ -68,6 +74,31 @@ uint64	   *my_generation_allocated_bytes = &local_my_generation_allocated_bytes;
 uint64		local_my_slab_allocated_bytes = 0;
 uint64	   *my_slab_allocated_bytes = &local_my_slab_allocated_bytes;
 
+/*
+ * Define initial allocation allowance for a backend.
+ *
+ * NOTE: initial_allocation_allowance && allocation_allowance_refill_qty
+ * may be candidates for future GUC variables. Arbitrary 1MB selected initially.
+ */
+uint64		initial_allocation_allowance = 1024 * 1024;
+uint64		allocation_allowance_refill_qty = 1024 * 1024;
+
+/*
+ * Local counter to manage shared memory allocations. At backend startup, set to
+ * initial_allocation_allowance via pgstat_init_allocated_bytes(). Decrease as
+ * memory is malloc'd. When exhausted, atomically refill if available from
+ * ProcGlobal->total_bkend_mem_bytes via exceeds_max_total_bkend_mem().
+ */
+uint64		allocation_allowance = 0;
+
+/*
+ * Local counter of free'd shared memory. Return to global
+ * total_bkend_mem_bytes when return threshold is met. Arbitrary 1MB
+ * selected initially.
+ */
+uint64		allocation_return = 0;
+uint64		allocation_return_threshold = 1024 * 1024;
+
 static PgBackendStatus *BackendStatusArray = NULL;
 static char *BackendAppnameBuffer = NULL;
 static char *BackendClientHostnameBuffer = NULL;
@@ -1291,6 +1322,8 @@ pgstat_set_allocated_bytes_storage(uint64 *allocated_bytes,
 
 	my_slab_allocated_bytes = slab_allocated_bytes;
 	*slab_allocated_bytes = local_my_slab_allocated_bytes;
+
+	return;
 }
 
 /*
@@ -1314,6 +1347,23 @@ pgstat_reset_allocated_bytes_storage(void)
 								*my_dsm_allocated_bytes);
 	}
 
+	/*
+	 * When limiting maximum backend memory, return this backend's memory
+	 * allocations to global.
+	 */
+	if (max_total_bkend_mem)
+	{
+		volatile PROC_HDR *procglobal = ProcGlobal;
+
+		pg_atomic_add_fetch_u64(&procglobal->total_bkend_mem_bytes,
+								*my_allocated_bytes + allocation_allowance +
+								allocation_return);
+
+		/* Reset memory allocation variables */
+		allocation_allowance = 0;
+		allocation_return = 0;
+	}
+
 	/* Reset memory allocation variables */
 	*my_allocated_bytes = local_my_allocated_bytes = 0;
 	*my_aset_allocated_bytes = local_my_aset_allocated_bytes = 0;
@@ -1327,4 +1377,101 @@ pgstat_reset_allocated_bytes_storage(void)
 	my_dsm_allocated_bytes = &local_my_dsm_allocated_bytes;
 	my_generation_allocated_bytes = &local_my_generation_allocated_bytes;
 	my_slab_allocated_bytes = &local_my_slab_allocated_bytes;
+
+	return;
+}
+
+/*
+ * Determine if allocation request will exceed max backend memory allowed.
+ * Do not apply to auxiliary processes.
+ * Refill allocation request bucket when needed/possible.
+ */
+bool
+exceeds_max_total_bkend_mem(uint64 allocation_request)
+{
+	bool		result = false;
+
+	/*
+	 * When limiting maximum backend memory, attempt to refill allocation
+	 * request bucket if needed.
+	 */
+	if (max_total_bkend_mem && allocation_request > allocation_allowance &&
+		ProcGlobal != NULL)
+	{
+		volatile PROC_HDR *procglobal = ProcGlobal;
+		uint64		available_max_total_bkend_mem = 0;
+		bool		sts = false;
+
+		/*
+		 * If allocation request is larger than memory refill quantity then
+		 * attempt to increase allocation allowance with requested amount,
+		 * otherwise fall through. If this refill fails we do not have enough
+		 * memory to meet the request.
+		 */
+		if (allocation_request >= allocation_allowance_refill_qty)
+		{
+			while ((available_max_total_bkend_mem = pg_atomic_read_u64(&procglobal->total_bkend_mem_bytes)) >= allocation_request)
+			{
+				if ((result = pg_atomic_compare_exchange_u64(&procglobal->total_bkend_mem_bytes,
+															 &available_max_total_bkend_mem,
+															 available_max_total_bkend_mem - allocation_request)))
+				{
+					allocation_allowance = allocation_allowance + allocation_request;
+					break;
+				}
+			}
+
+			/*
+			 * Exclude auxiliary and Postmaster processes from the check.
+			 * Return false. While we want to exclude them from the check, we
+			 * do not want to exclude them from the above allocation handling.
+			 */
+			if (MyAuxProcType != NotAnAuxProcess || MyProcPid == PostmasterPid)
+				return false;
+
+			/*
+			 * If the atomic exchange fails (result == false), we do not have
+			 * enough reserve memory to meet the request. Negate result to
+			 * return the proper value.
+			 */
+
+			return !result;
+		}
+
+		/*
+		 * Attempt to increase allocation allowance by memory refill quantity.
+		 * If available memory is/becomes less than memory refill quantity,
+		 * the loop exits without refilling and the request is rejected below.
+		 */
+		while ((available_max_total_bkend_mem = pg_atomic_read_u64(&procglobal->total_bkend_mem_bytes)) >= allocation_allowance_refill_qty)
+		{
+			if ((sts = pg_atomic_compare_exchange_u64(&procglobal->total_bkend_mem_bytes,
+													  &available_max_total_bkend_mem,
+													  available_max_total_bkend_mem - allocation_allowance_refill_qty)))
+			{
+				allocation_allowance = allocation_allowance + allocation_allowance_refill_qty;
+				break;
+			}
+		}
+
+		/*
+		 * Do not attempt a partial refill if less than the refill qty remains.
+		 */
+
+		/*
+		 * If refill is not successful, we return true, memory limit exceeded
+		 */
+		if (!sts)
+			result = true;
+	}
+
+	/*
+	 * Exclude auxiliary and postmaster processes from the check. Return false.
+	 * While we want to exclude them from the check, we do not want to exclude
+	 * them from the above allocation handling.
+	 */
+	if (MyAuxProcType != NotAnAuxProcess || MyProcPid == PostmasterPid)
+		result = false;
+
+	return result;
 }
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9b52cc5091f..09451bc4298 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2077,7 +2077,7 @@ pg_stat_get_memory_allocation(PG_FUNCTION_ARGS)
 Datum
 pg_stat_get_global_memory_allocation(PG_FUNCTION_ARGS)
 {
-#define PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS	2
+#define PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS	3
 	TupleDesc	tupdesc;
 	Datum		values[PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS] = {0};
 	bool		nulls[PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS] = {0};
@@ -2087,15 +2087,23 @@ pg_stat_get_global_memory_allocation(PG_FUNCTION_ARGS)
 	tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_GLOBAL_MEMORY_ALLOCATION_COLS);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "datid",
 					   OIDOID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "global_dsm_allocated_bytes",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "total_bkend_mem_bytes_available",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 3, "global_dsm_allocated_bytes",
 					   INT8OID, -1, 0);
 	BlessTupleDesc(tupdesc);
 
 	/* datid */
 	values[0] = ObjectIdGetDatum(MyDatabaseId);
 
-	/* get global_dsm_allocated_bytes */
-	values[1] = Int64GetDatum(pg_atomic_read_u64(&procglobal->global_dsm_allocation));
+	/* Get total_bkend_mem_bytes - return -1 if disabled */
+	if (max_total_bkend_mem == 0)
+		values[1] = Int64GetDatum(-1);
+	else
+		values[1] = Int64GetDatum(pg_atomic_read_u64(&procglobal->total_bkend_mem_bytes));
+
+	/* Get global_dsm_allocated_bytes */
+	values[2] = Int64GetDatum(pg_atomic_read_u64(&procglobal->global_dsm_allocation));
 
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 012d4a0b1fd..cd68e5265af 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -104,7 +104,6 @@
 #include "utils/dynahash.h"
 #include "utils/memutils.h"
 
-
 /*
  * Constants
  *
@@ -359,7 +358,6 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 	Assert(flags & HASH_ELEM);
 	Assert(info->keysize > 0);
 	Assert(info->entrysize >= info->keysize);
-
 	/*
 	 * For shared hash tables, we have a local hash header (HTAB struct) that
 	 * we allocate in TopMemoryContext; all else is in shared memory.
@@ -377,6 +375,7 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 	}
 	else
 	{
+		/* Set up to allocate the hash header */
 		/* Create the hash table's private memory context */
 		if (flags & HASH_CONTEXT)
 			CurrentDynaHashCxt = info->hcxt;
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 2b082c68df6..02a060a3da8 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -171,8 +171,16 @@ InitPostmasterChild(void)
 				 errmsg_internal("could not set postmaster death monitoring pipe to FD_CLOEXEC mode: %m")));
 #endif
 
+	/*
+	 * Init pgstat allocated bytes counters here for forked backends.
+	 * Fork/exec backends have not yet reattached to shared memory at this
+	 * point. They will init pgstat allocated bytes counters in
+	 * PGSharedMemoryReAttach.
+	 */
+#ifndef EXEC_BACKEND
 	/* Init allocated bytes to avoid double counting parent allocation */
 	pgstat_init_allocated_bytes();
+#endif
 }
 
 /*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9f59440526f..bcad500bde7 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3553,6 +3553,17 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"max_total_backend_memory", PGC_SU_BACKEND, RESOURCES_MEM,
+			gettext_noop("Restrict total backend memory allocations to this max."),
+			gettext_noop("0 turns this feature off."),
+			GUC_UNIT_MB
+		},
+		&max_total_bkend_mem,
+		0, 0, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2809c711a1..59cb9886d73 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -160,6 +160,9 @@
 #vacuum_buffer_usage_limit = 256kB	# size of vacuum and analyze buffer access strategy ring;
 					# 0 to disable vacuum buffer access strategy;
 					# range 128kB to 16GB
+#max_total_backend_memory = 0MB		# Restrict total backend memory allocations
+									# to this max (in MB). 0 turns this feature
+									# off.
 
 # - Disk -
 
diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 7af0d141da2..656ec384851 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -438,6 +438,18 @@ AllocSetContextCreateInternal(MemoryContext parent,
 	else
 		firstBlockSize = Max(firstBlockSize, initBlockSize);
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (exceeds_max_total_bkend_mem(firstBlockSize))
+	{
+		if (TopMemoryContext)
+			MemoryContextStats(TopMemoryContext);
+		ereport(ERROR,
+				(errcode(ERRCODE_OUT_OF_MEMORY),
+				 errmsg("out of memory - exceeds max_total_backend_memory"),
+				 errdetail("Failed while creating memory context \"%s\".",
+						   name)));
+	}
+
 	/*
 	 * Allocate the initial block.  Unlike other aset.c blocks, it starts with
 	 * the context header and its block header follows that.
@@ -737,6 +749,11 @@ AllocSetAlloc(MemoryContext context, Size size)
 #endif
 
 		blksize = chunk_size + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
+
+		/* Do not exceed maximum allowed memory allocation */
+		if (exceeds_max_total_bkend_mem(blksize))
+			return NULL;
+
 		block = (AllocBlock) malloc(blksize);
 		if (block == NULL)
 			return NULL;
@@ -937,6 +954,10 @@ AllocSetAlloc(MemoryContext context, Size size)
 		while (blksize < required_size)
 			blksize <<= 1;
 
+		/* Do not exceed maximum allowed memory allocation */
+		if (exceeds_max_total_bkend_mem(blksize))
+			return NULL;
+
 		/* Try to allocate it */
 		block = (AllocBlock) malloc(blksize);
 
@@ -1175,6 +1196,18 @@ AllocSetRealloc(void *pointer, Size size)
 		blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
 		oldblksize = block->endptr - ((char *) block);
 
+		/*
+		 * Do not exceed maximum allowed memory allocation. NOTE: checking for
+		 * the full size here rather than just the amount of increased
+		 * allocation to prevent a potential underflow of allocation_allowance
+		 * in cases where blksize - oldblksize does not trigger a refill but
+		 * blksize is greater than allocation_allowance. Underflow would
+		 * occur with the call below to
+		 * pgstat_report_allocated_bytes_increase().
+		 */
+		if (blksize > oldblksize && exceeds_max_total_bkend_mem(blksize))
+			return NULL;
+
 		block = (AllocBlock) realloc(block, blksize);
 		if (block == NULL)
 		{
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c
index 0ed54571497..a1667c34371 100644
--- a/src/backend/utils/mmgr/generation.c
+++ b/src/backend/utils/mmgr/generation.c
@@ -200,6 +200,16 @@ GenerationContextCreate(MemoryContext parent,
 	else
 		allocSize = Max(allocSize, initBlockSize);
 
+	if (exceeds_max_total_bkend_mem(allocSize))
+	{
+		MemoryContextStats(TopMemoryContext);
+		ereport(ERROR,
+				(errcode(ERRCODE_OUT_OF_MEMORY),
+				 errmsg("out of memory - exceeds max_total_backend_memory"),
+				 errdetail("Failed while creating memory context \"%s\".",
+						   name)));
+	}
+
 	/*
 	 * Allocate the initial block.  Unlike other generation.c blocks, it
 	 * starts with the context header and its block header follows that.
@@ -376,6 +386,9 @@ GenerationAlloc(MemoryContext context, Size size)
 	{
 		Size		blksize = required_size + Generation_BLOCKHDRSZ;
 
+		if (exceeds_max_total_bkend_mem(blksize))
+			return NULL;
+
 		block = (GenerationBlock *) malloc(blksize);
 		if (block == NULL)
 			return NULL;
@@ -479,6 +492,9 @@ GenerationAlloc(MemoryContext context, Size size)
 			if (blksize < required_size)
 				blksize = pg_nextpower2_size_t(required_size);
 
+			if (exceeds_max_total_bkend_mem(blksize))
+				return NULL;
+
 			block = (GenerationBlock *) malloc(blksize);
 
 			if (block == NULL)
diff --git a/src/backend/utils/mmgr/slab.c b/src/backend/utils/mmgr/slab.c
index c99ff532af2..34fe5d713e0 100644
--- a/src/backend/utils/mmgr/slab.c
+++ b/src/backend/utils/mmgr/slab.c
@@ -360,7 +360,16 @@ SlabContextCreate(MemoryContext parent,
 		elog(ERROR, "block size %zu for slab is too small for %zu-byte chunks",
 			 blockSize, chunkSize);
 
-
+	/* Do not exceed maximum allowed memory allocation */
+	if (exceeds_max_total_bkend_mem(Slab_CONTEXT_HDRSZ(chunksPerBlock)))
+	{
+		MemoryContextStats(TopMemoryContext);
+		ereport(ERROR,
+				(errcode(ERRCODE_OUT_OF_MEMORY),
+				 errmsg("out of memory - exceeds max_total_backend_memory"),
+				 errdetail("Failed while creating memory context \"%s\".",
+						   name)));
+	}
 
 	slab = (SlabContext *) malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock));
 	if (slab == NULL)
@@ -563,6 +572,10 @@ SlabAlloc(MemoryContext context, Size size)
 		}
 		else
 		{
+			/* Do not exceed maximum allowed memory allocation */
+			if (exceeds_max_total_bkend_mem(slab->blockSize))
+				return NULL;
+
 			block = (SlabBlock *) malloc(slab->blockSize);
 
 			if (unlikely(block == NULL))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fe0549c43d9..19839fc0459 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5448,9 +5448,9 @@
   descr => 'statistics: global memory allocation information',
   proname => 'pg_stat_get_global_memory_allocation', proisstrict => 'f',
   provolatile => 's', proparallel => 'r', prorettype => 'record',
-  proargtypes => '', proallargtypes => '{oid,int8}',
-  proargmodes => '{o,o}',
-  proargnames => '{datid,global_dsm_allocated_bytes}',
+  proargtypes => '', proallargtypes => '{oid,int8,int8}',
+  proargmodes => '{o,o,o}',
+  proargnames => '{datid,total_bkend_mem_bytes_available,global_dsm_allocated_bytes}',
   prosrc =>'pg_stat_get_global_memory_allocation' },
 { oid => '2022',
   descr => 'statistics: information about currently active backends',
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 26b17d66477..e53ed1cba05 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -406,6 +406,13 @@ typedef struct PROC_HDR
 	int			startupBufferPinWaitBufId;
 	/* Global dsm allocations */
 	pg_atomic_uint64 global_dsm_allocation;
+
+	/*
+	 * Max backend memory allocation tracker. Used/Initialized when
+	 * max_total_bkend_mem > 0 as max_total_bkend_mem (MB) converted to bytes.
+	 * Decreases/increases with malloc/free of backend memory.
+	 */
+	pg_atomic_uint64 total_bkend_mem_bytes;
 } PROC_HDR;
 
 extern PGDLLIMPORT PROC_HDR *ProcGlobal;
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index c2c8ba7214d..f07db4e57ff 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -15,6 +15,7 @@
 #include "libpq/pqcomm.h"
 #include "miscadmin.h"			/* for BackendType */
 #include "storage/backendid.h"
+#include "storage/proc.h"
 #include "utils/backend_progress.h"
 
 
@@ -305,6 +306,7 @@ typedef struct LocalPgBackendStatus
  */
 extern PGDLLIMPORT bool pgstat_track_activities;
 extern PGDLLIMPORT int pgstat_track_activity_query_size;
+extern PGDLLIMPORT int max_total_bkend_mem;
 
 
 /* ----------
@@ -317,6 +319,10 @@ extern PGDLLIMPORT uint64 *my_aset_allocated_bytes;
 extern PGDLLIMPORT uint64 *my_dsm_allocated_bytes;
 extern PGDLLIMPORT uint64 *my_generation_allocated_bytes;
 extern PGDLLIMPORT uint64 *my_slab_allocated_bytes;
+extern PGDLLIMPORT uint64 allocation_allowance;
+extern PGDLLIMPORT uint64 initial_allocation_allowance;
+extern PGDLLIMPORT uint64 allocation_return;
+extern PGDLLIMPORT uint64 allocation_return_threshold;
 
 
 /* ----------
@@ -365,6 +371,7 @@ extern PgBackendStatus *pgstat_get_beentry_by_backend_id(BackendId beid);
 extern LocalPgBackendStatus *pgstat_get_local_beentry_by_backend_id(BackendId beid);
 extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int idx);
 extern char *pgstat_clip_activity(const char *raw_activity);
+extern bool exceeds_max_total_bkend_mem(uint64 allocation_request);
 
 /* ----------
  * pgstat_report_allocated_bytes_decrease() -
@@ -380,7 +387,7 @@ pgstat_report_allocated_bytes_decrease(int64 proc_allocated_bytes,
 {
 	uint64		temp;
 
-	/* Avoid allocated_bytes unsigned integer overflow on decrease */
+	/* Sanity check: my allocated bytes should never drop below zero */
 	if (pg_sub_u64_overflow(*my_allocated_bytes, proc_allocated_bytes, &temp))
 	{
 		/* On overflow, set allocated bytes and allocator type bytes to zero */
@@ -389,13 +396,35 @@ pgstat_report_allocated_bytes_decrease(int64 proc_allocated_bytes,
 		*my_dsm_allocated_bytes = 0;
 		*my_generation_allocated_bytes = 0;
 		*my_slab_allocated_bytes = 0;
+
+		/* Add freed memory to allocation return counter. */
+		allocation_return += proc_allocated_bytes;
+
+		/*
+		 * Return freed memory to the global counter if return threshold is
+		 * met.
+		 */
+		if (max_total_bkend_mem && allocation_return >= allocation_return_threshold)
+		{
+			if (ProcGlobal)
+			{
+				volatile PROC_HDR *procglobal = ProcGlobal;
+
+				/* Add to global tracker */
+				pg_atomic_add_fetch_u64(&procglobal->total_bkend_mem_bytes,
+										allocation_return);
+
+				/* Restart the count */
+				allocation_return = 0;
+			}
+		}
 	}
 	else
 	{
-		/* decrease allocation */
-		*my_allocated_bytes -= proc_allocated_bytes;
+		/* Add freed memory to allocation return counter */
+		allocation_return += proc_allocated_bytes;
 
-		/* Decrease allocator type allocated bytes. */
+		/* Decrease allocator type allocated bytes */
 		switch (pg_allocator_type)
 		{
 			case PG_ALLOC_ASET:
@@ -417,6 +446,30 @@ pgstat_report_allocated_bytes_decrease(int64 proc_allocated_bytes,
 				*my_slab_allocated_bytes -= proc_allocated_bytes;
 				break;
 		}
+
+		/* Recompute total allocation from the per-allocator counters */
+		*my_allocated_bytes = *my_aset_allocated_bytes +
+			*my_dsm_allocated_bytes + *my_generation_allocated_bytes +
+			*my_slab_allocated_bytes;
+
+		/*
+		 * Return freed memory to the global counter if return threshold is
+		 * met.
+		 */
+		if (max_total_bkend_mem && allocation_return >= allocation_return_threshold)
+		{
+			if (ProcGlobal)
+			{
+				volatile PROC_HDR *procglobal = ProcGlobal;
+
+				/* Add to global tracker */
+				pg_atomic_add_fetch_u64(&procglobal->total_bkend_mem_bytes,
+										allocation_return);
+
+				/* Restart the count */
+				allocation_return = 0;
+			}
+		}
 	}
 
 	return;
@@ -434,7 +487,13 @@ static inline void
 pgstat_report_allocated_bytes_increase(int64 proc_allocated_bytes,
 									   int pg_allocator_type)
 {
-	*my_allocated_bytes += proc_allocated_bytes;
+	uint64		temp;
+
+	/* Sanity check: allocation allowance should never drop below zero */
+	if (pg_sub_u64_overflow(allocation_allowance, proc_allocated_bytes, &temp))
+		allocation_allowance = 0;
+	else
+		allocation_allowance -= proc_allocated_bytes;
 
 	/* Increase allocator type allocated bytes */
 	switch (pg_allocator_type)
@@ -459,6 +518,9 @@ pgstat_report_allocated_bytes_increase(int64 proc_allocated_bytes,
 			break;
 	}
 
+	*my_allocated_bytes = *my_aset_allocated_bytes + *my_dsm_allocated_bytes +
+		*my_generation_allocated_bytes + *my_slab_allocated_bytes;
+
 	return;
 }
 
@@ -478,6 +540,36 @@ pgstat_init_allocated_bytes(void)
 	*my_generation_allocated_bytes = 0;
 	*my_slab_allocated_bytes = 0;
 
+	/* If we're limiting backend memory */
+	if (max_total_bkend_mem)
+	{
+		volatile PROC_HDR *procglobal = ProcGlobal;
+		uint64		available_max_total_bkend_mem = 0;
+
+		allocation_return = 0;
+		allocation_allowance = 0;
+
+		/* Account for the initial allocation allowance */
+		while ((available_max_total_bkend_mem = pg_atomic_read_u64(&procglobal->total_bkend_mem_bytes)) >= initial_allocation_allowance)
+		{
+			/*
+			 * On success populate allocation_allowance. Failure here will
+			 * result in the backend's first invocation of
+			 * exceeds_max_total_bkend_mem allocating requested, default, or
+			 * available memory or result in an out of memory error.
+			 */
+			if (pg_atomic_compare_exchange_u64(&procglobal->total_bkend_mem_bytes,
+											   &available_max_total_bkend_mem,
+											   available_max_total_bkend_mem -
+											   initial_allocation_allowance))
+			{
+				allocation_allowance = initial_allocation_allowance;
+
+				break;
+			}
+		}
+	}
+
 	return;
 }
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7d412b26801..cf7a2b52359 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1882,13 +1882,15 @@ pg_stat_global_memory_allocation| WITH sums AS (
  SELECT s.datid,
     current_setting('shared_memory_size'::text, true) AS shared_memory_size,
     (current_setting('shared_memory_size_in_huge_pages'::text, true))::integer AS shared_memory_size_in_huge_pages,
+    pg_size_bytes(current_setting('max_total_backend_memory'::text, true)) AS max_total_backend_memory_bytes,
+    s.total_bkend_mem_bytes_available,
     s.global_dsm_allocated_bytes,
     sums.total_aset_allocated_bytes,
     sums.total_dsm_allocated_bytes,
     sums.total_generation_allocated_bytes,
     sums.total_slab_allocated_bytes
    FROM sums,
-    (pg_stat_get_global_memory_allocation() s(datid, global_dsm_allocated_bytes)
+    (pg_stat_get_global_memory_allocation() s(datid, total_bkend_mem_bytes_available, global_dsm_allocated_bytes)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_gssapi| SELECT pid,
     gss_auth AS gss_authenticated,
-- 
2.41.0
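
For readers skimming the patch, here is a minimal standalone sketch of the
allowance/refill scheme that exceeds_max_total_bkend_mem() implements. It is
not code from the patch: it uses C11 atomics rather than the pg_atomic_* API,
it folds the allowance decrement (done in
pgstat_report_allocated_bytes_increase() in the patch) into the same function,
and it ignores the auxiliary-process exemption. The names global_pool,
local_allowance, pool_take, request_exceeds_limit and REFILL_QTY are
hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* 1MB refill quantity, mirroring allocation_allowance_refill_qty. */
#define REFILL_QTY ((uint64_t) 1024 * 1024)

/* Stand-in for ProcGlobal->total_bkend_mem_bytes (shared by all backends). */
_Atomic uint64_t global_pool;

/* Stand-in for the backend-local allocation_allowance. */
uint64_t	local_allowance;

/* Hypothetical helper: atomically take "amount" bytes from the pool. */
bool
pool_take(uint64_t amount)
{
	uint64_t	avail = atomic_load(&global_pool);

	while (avail >= amount)
	{
		/* On failure, "avail" is reloaded with the current value; retry. */
		if (atomic_compare_exchange_weak(&global_pool, &avail, avail - amount))
			return true;
	}
	return false;				/* not enough left in the pool */
}

/* Returns true if the request would exceed the configured limit. */
bool
request_exceeds_limit(uint64_t request)
{
	if (request > local_allowance)
	{
		/* Large requests refill by the full amount, small ones by 1MB. */
		uint64_t	refill = (request >= REFILL_QTY) ? request : REFILL_QTY;

		if (!pool_take(refill))
			return true;
		local_allowance += refill;
	}

	/* The allowance now covers the request, so this cannot underflow. */
	assert(local_allowance >= request);
	local_allowance -= request;
	return false;
}
```

The point of the local allowance is that most allocations never touch the
shared counter. Only a refill (or a request of 1MB or more) does an atomic
operation on it.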
