On Sun, Jan 16, 2022 at 8:32 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Sun, Jan 16, 2022 at 6:03 PM DEVOPS_WwIT <dev...@ww-it.cn> wrote:
> > Solaris and FreeBSD supports large/super pages, and can be used
> > automatically by applications.
> >
> > Seems Postgres can't use the large/super pages on Solaris and FreeBSD
> > os(I think can't use the large/super page HPUX and AIX), is there anyone
> > could take a look?
>
> 3.  FreeBSD: FreeBSD does transparently migrate PostgreSQL memory to
> "super" pages quite well in my experience, but there is also a new
> facility in FreeBSD 13 to ask for specific page sizes explicitly.  I
> wrote a quick and dirty patch to enable PostgreSQL's huge_pages and
> huge_page_size settings to work with that interface, but I haven't yet
> got as far as testing it very hard or proposing it...  but here it is,
> if you like experimental code[2].

I was reminded to rebase that and tidy it up a bit by recent
discussion of page table magic in other threads.  Documentation of
these interfaces is sparse, to put it mildly (I may try to improve that
myself), but basically the terminology is "super" for pages subject to
automatic promotion/demotion, and "large" for pages that are explicitly
managed.  I'm not proposing this for commit right now, as I need to
learn more about all this and there are some policy decisions lurking
in here (e.g. synchronous defrag vs. nowait, depending on flags), but
the patch may be useful for experimentation.  For example, it allows
huge_page_size=1GB if your system can handle that.
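
For anyone who wants to reason about the page size selection without a
FreeBSD box handy, here is a minimal standalone sketch of the logic the
patch applies to the array filled in by getpagesizes().  The helper
names choose_page_size_index() and round_up_to_page() are hypothetical
and do not appear in the patch, and the sketch assumes the sizes are
listed in ascending order, which is what "skip the smallest size"
implies.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick an index into page_sizes[], assumed to be in
 * ascending order.  requested_kb mirrors huge_page_size (in kB); 0 means
 * "no preference", in which case the smallest size is skipped and the
 * next one is taken.  Returns -1 if no suitable size exists.
 */
static int
choose_page_size_index(const size_t *page_sizes, int nsizes, int requested_kb)
{
	for (int i = 0; i < nsizes; ++i)
	{
		if (requested_kb == 0)
		{
			if (i > 0)
				return i;
		}
		else if ((size_t) requested_kb * 1024 == page_sizes[i])
			return i;
	}
	return -1;
}

/*
 * Hypothetical helper: round size up to the next multiple of page_size.
 */
static size_t
round_up_to_page(size_t size, size_t page_size)
{
	assert(page_size > 0);
	if (size % page_size != 0)
		size += page_size - (size % page_size);
	return size;
}
```

In the patch itself the chosen index is what gets passed to
shm_create_largepage(), and the rounded-up size is what gets passed to
ftruncate() and mmap().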
From b76dc0e5a472824aaa87de4f6c1f6db26c810c7f Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Thu, 3 Nov 2022 10:06:24 +1300
Subject: [PATCH] Support huge_pages and huge_page_size on FreeBSD.

FreeBSD often uses huge (super) pages due to automatic promotion, but it
may be useful to be able to request them explicitly, and to experiment
with different page sizes.

Discussion: https://postgr.es/m/3043b674-46d6-a8e9-3811-5a3007c8dceb%40ww-it.cn
---
 doc/src/sgml/config.sgml      |  6 ++--
 src/backend/port/sysv_shmem.c | 58 ++++++++++++++++++++++++++++++-----
 2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 559eb898a9..cd65916bb5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1678,9 +1678,9 @@ include_dir 'conf.d'
        </para>
 
        <para>
-        At present, this setting is supported only on Linux and Windows. The
+        At present, this setting is supported only on Linux, FreeBSD and Windows. The
         setting is ignored on other systems when set to
-        <literal>try</literal>.  On Linux, it is only supported when
+        <literal>try</literal>.  On Linux and FreeBSD, it is only supported when
         <varname>shared_memory_type</varname> is set to <literal>mmap</literal>
         (the default).
        </para>
@@ -1742,7 +1742,7 @@ include_dir 'conf.d'
         about usage and support, see <xref linkend="linux-huge-pages"/>.
        </para>
        <para>
-        Non-default settings are currently supported only on Linux.
+        Non-default settings are currently supported only on Linux and FreeBSD.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 97ce7b7c49..0bb8c20c8d 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -576,8 +576,9 @@ GetHugePageSize(Size *hugepagesize, int *mmap_flags)
 bool
 check_huge_page_size(int *newval, void **extra, GucSource source)
 {
-#if !(defined(MAP_HUGE_MASK) && defined(MAP_HUGE_SHIFT))
-	/* Recent enough Linux only, for now.  See GetHugePageSize(). */
+#if !(defined(MAP_HUGE_MASK) && defined(MAP_HUGE_SHIFT)) && \
+	!defined(SHM_LARGEPAGE_ALLOC_DEFAULT)
+	/* Recent enough Linux and FreeBSD only, for now. */
 	if (*newval != 0)
 	{
 		GUC_check_errdetail("huge_page_size must be 0 on this platform.");
@@ -601,12 +602,9 @@ CreateAnonymousSegment(Size *size)
 	void	   *ptr = MAP_FAILED;
 	int			mmap_errno = 0;
 
-#ifndef MAP_HUGETLB
-	/* PGSharedMemoryCreate should have dealt with this case */
-	Assert(huge_pages != HUGE_PAGES_ON);
-#else
 	if (huge_pages == HUGE_PAGES_ON || huge_pages == HUGE_PAGES_TRY)
 	{
+#ifdef MAP_HUGETLB
 		/*
 		 * Round up the request size to a suitable large value.
 		 */
@@ -624,8 +622,52 @@ CreateAnonymousSegment(Size *size)
 		if (huge_pages == HUGE_PAGES_TRY && ptr == MAP_FAILED)
 			elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m",
 				 allocsize);
-	}
 #endif
+#ifdef SHM_LARGEPAGE_ALLOC_DEFAULT
+		int			nsizes;
+		size_t	   *page_sizes;
+		size_t		page_size;
+		int			page_size_index = -1;
+
+		/*
+		 * Find the matching page size index, or if huge_page_size wasn't set,
+		 * then skip the smallest size and take the next one after that.
+		 */
+		nsizes = getpagesizes(NULL, 0);
+		page_sizes = palloc(nsizes * sizeof(*page_sizes));
+		getpagesizes(page_sizes, nsizes);
+		for (int i = 0; i < nsizes; ++i)
+		{
+			if ((size_t) huge_page_size * 1024 == page_sizes[i] ||
+				(huge_page_size == 0 && i > 0))
+			{
+				page_size = page_sizes[i];
+				page_size_index = i;
+				if (allocsize % page_size != 0)
+					allocsize += page_size - (allocsize % page_size);
+				break;
+			}
+		}
+		pfree(page_sizes);
+		if (page_size_index >= 0)
+		{
+			int			fd;
+
+			fd = shm_create_largepage(SHM_ANON, O_RDWR, page_size_index,
+									  SHM_LARGEPAGE_ALLOC_DEFAULT, 0);
+			if (fd >= 0)
+			{
+				if (ftruncate(fd, allocsize) == 0)
+				{
+					ptr = mmap(NULL, allocsize, PROT_READ | PROT_WRITE,
+							   MAP_SHARED, fd, 0);
+					mmap_errno = errno;
+				}
+				close(fd);
+			}
+		}
+#endif
+	}
 
 	if (ptr == MAP_FAILED && huge_pages != HUGE_PAGES_ON)
 	{
@@ -709,7 +751,7 @@ PGSharedMemoryCreate(Size size,
 						DataDir)));
 
 	/* Complain if hugepages demanded but we can't possibly support them */
-#if !defined(MAP_HUGETLB)
+#if !defined(MAP_HUGETLB) && !defined(SHM_LARGEPAGE_ALLOC_DEFAULT)
 	if (huge_pages == HUGE_PAGES_ON)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-- 
2.35.1
