The attached updated patch fixes the merge conflicts and updates the
numbering of the test files.

On Mon, Feb 24, 2025 at 12:06 PM Andrew Jackson <andrewjackson...@gmail.com>
wrote:

> The previous patch had a mistake in a reference in the documentation. This
> should fix it.
>
> On Mon, Feb 24, 2025 at 10:07 AM Andrew Jackson <
> andrewjackson...@gmail.com> wrote:
>
>> Looks like this needed another rebase to account for the oauth commit.
>> Rebase attached.
>>
>> On Mon, Feb 24, 2025 at 9:38 AM Andrew Jackson <
>> andrewjackson...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thank you for the review!
>>>
>>> Review Response
>>>
>>> - Made a first pass at a real commit message
>>> - Fixed the condition on the if statement to use strcmp
>>> - Added a test suite in the files `src/interfaces/libpq/t/
>>> 006_target_session_attr_dns.pl` and `src/interfaces/libpq/t/
>>> 007_load_balance_dns_check_all_addrs.pl` which checks the
>>> target_session_attrs as when used with and without load balancing.
>>>
>>> Regarding the name of the variable itself I am definitely open to
>>> opinions on this. I didn't put too much thought initially and just chose
>>> `check_all_addrs`. I feel like given that it modifies the behaviour of
>>> `target_session_attrs` ideally it should reference that in the name but
>>> that would make that variable name very long: something akin to
>>> `target_session_attrs_check_all_addrs`.
>>>
>>> Context
>>>
>>> I tested some drivers as well and found that pgx, psycopg, and
>>> rust-postgres all traverse every IP address when looking for a matching
>>> target_session_attrs. Asyncpg and psycopg2 on the other hand follow libpq
>>> and terminate additional attempts after the first failure. Given this it
>>> seems like there is a decent amount of fragmentation in the ecosystem as to
>>> how exactly to implement this feature. I believe some drivers choose to
>>> traverse all addresses because they have users target the same use case
>>> outlined above.
>>>
>>> Thanks again,
>>> Andrew Jackson
>>>
>>> On Sun, Feb 16, 2025 at 6:03 AM Andrey Borodin <x4...@yandex-team.ru>
>>> wrote:
>>>
>>>> Hi Andrew!
>>>>
>>>> cc Jelte, I suspect he might be interested.
>>>>
>>>> > On 20 Nov 2024, at 20:51, Andrew Jackson <andrewjackson...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Would appreciate any feedback on the applicability/relevancy of the
>>>> goal here or the implementation.
>>>>
>>>> Thank you for raising the issue. Following our discussion in Discord
>>>> I'm putting my thoughts to list.
>>>>
>>>>
>>>> Context
>>>>
>>>> A DNS record might return several IPs. Consider we have a connection
>>>> string with "host=A,B", A is resolved to 1.1.1.1,2.2.2.2, B to
>>>> 3.3.3.3,4.4.4.4.
>>>> If we connect with "target_session_attrs=read-write" IPs 1.1.1.1 and
>>>> 3.3.3.3 will be probed, but 2.2.2.2 and 4.4.4.4 won't (if 1.1.1.1 and
>>>> 3.3.3.3 responded).
>>>>
>>>> If we enable libpq load balancing some random 2 IPs will be probed.
>>>>
>>>> IMO it's a bug, at least when load balancing is enabled. Let's consider
>>>> if we can change default behavior here. I suspect we can't do it for
>>>> "load_balance_hosts=disable". And even for load balancing this might be too
>>>> unexpected change for someone.
>>>>
>>>> Further I only consider proposal not as a bug fix, but as a feature.
>>>>
>>>> In Discord we have surveyed some other drivers.
>>>> pgx treats all IPs as different servers [1]. npgsql goes through all
>>>> IPs one-by-one always [2]. PGJDBC are somewhat in a decision process [3]
>>>> (cc Dave and Vladimir, if they would like to provide some input).
>>>>
>>>>
>>>> Review
>>>>
>>>> The patch needs a rebase. It's trivial, so please fine attached. The
>>>> patch needs real commit message, it's not trivial :)
>>>>
>>>> We definitely need to adjust tests [0]. We need to change
>>>> 004_load_balance_dns.pl so that it tests target_session_attrs too.
>>>>
>>>> Some documentation would be nice.
>>>>
>>>> I do not like how this check is performed
>>>> +                                               if
>>>> (conn->check_all_addrs && conn->check_all_addrs[0] == '1')
>>>> Let's make it like load balancing is done [4].
>>>>
>>>> Finally, let's think about naming alternatives for "check_all_addrs".
>>>>
>>>> I think that's enough for a first round of the review. If it's not a
>>>> bug, but a feature - it's a very narrow window to get to 18. But we might
>>>> be lucky...
>>>>
>>>> Thank you!
>>>>
>>>>
>>>> Best regards, Andrey Borodin.
>>>>
>>>> [0]
>>>> https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-b05b74d2a97d7f74d4311ba1702d732f0df1b101c6ac99c146b51215174cf3ffR94
>>>> [1] https://github.com/jackc/pgx/blob/master/pgconn/pgconn.go#L177
>>>> [2]
>>>> https://github.com/npgsql/npgsql/blob/7f1a59fa8dc1ccc34a70154f49a768e1abf826ba/src/Npgsql/Internal/NpgsqlConnector.cs#L986
>>>> [3] https://github.com/pgjdbc/pgjdbc/pull/3012#discussion_r1408069450
>>>> [4]
>>>> https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-8d819454e061b9d4cdae9c8922ded05753a629d70f2ac1de1d4f6d5a4aeb7f68R1660
>>>>
>>>
From bb691d950739e2d6790bcd247c230a577a105aa9 Mon Sep 17 00:00:00 2001
From: CommanderKeynes <andrewjackson...@gmail.coma>
Date: Sat, 17 May 2025 08:29:01 -0500
Subject: [PATCH v6] Add option to check all addrs for target_session.

The current behaviour of libpq with regard to searching
for a matching target_session_attrs in a list of addrs is
that after successfully connecting to a server if the servers
session_attr does not match the request target_session_attrs
no futher address is considered. This behaviour is extremely
inconvenient in environments where the user is attempting to
implement a high availability setup without having to modify
DNS records or a proxy server config.

This PR adds a client side option called check_all_addrs.
When set to 1 this option will tell libpq to continue checking
any remaining addresses even if there was a target_session_attrs
mismatch on one of them.

Author: Andrew Jackson
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkESSc69sp2TiTWHvvOHCUey0rDWXSrR9pinyRqyfamUYg%40mail.gmail.com
---
 doc/src/sgml/libpq.sgml                       |  33 +++++
 src/interfaces/libpq/fe-connect.c             |  25 ++--
 src/interfaces/libpq/libpq-int.h              |   1 +
 .../libpq/t/007_target_session_attr_dns.pl    | 129 ++++++++++++++++++
 .../t/008_load_balance_dns_check_all_addrs.pl | 128 +++++++++++++++++
 5 files changed, 306 insertions(+), 10 deletions(-)
 create mode 100644 src/interfaces/libpq/t/007_target_session_attr_dns.pl
 create mode 100644 src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 695fe958c3e..6f43a06ec63 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2555,6 +2555,39 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
       </listitem>
      </varlistentry>
 
+     <varlistentry id="libpq-connect-check-all-addrs" xreflabel="check_all_addrs">
+      <term><literal>check_all_addrs</literal></term>
+      <listitem>
+       <para>
+        Controls whether or not all addresses within a hostname are checked when trying to make a connection
+        when attempting to find a connection with a matching <xref linkend="libpq-connect-target-session-attrs"/>.
+
+        There are two modes:
+        <variablelist>
+         <varlistentry>
+          <term><literal>0</literal> (default)</term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attrs"/> do not check
+            any additional addresses and move onto the next host if one was provided.
+           </para>
+          </listitem>
+         </varlistentry>
+         <varlistentry>
+          <term><literal>1</literal></term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attrs"/> proceed
+            to check any additional addresses.
+           </para>
+          </listitem>
+         </varlistentry>
+        </variablelist>
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
   </sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 430c0fa4442..54ed809ee7c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -389,6 +389,10 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
 
 	{"scram_server_key", NULL, NULL, NULL, "SCRAM-Server-Key", "D", SCRAM_MAX_KEY_LEN * 2,
 	offsetof(struct pg_conn, scram_server_key)},
+	{"check_all_addrs", "PGCHECKALLADDRS",
+		DefaultLoadBalanceHosts, NULL,
+		"Check-All-Addrs", "", 1,
+	offsetof(struct pg_conn, check_all_addrs)},
 
 	/* OAuth v2 */
 	{"oauth_issuer", NULL, NULL, NULL,
@@ -4434,11 +4438,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (strcmp(conn->check_all_addrs, "1") == 0)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -4489,11 +4493,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (strcmp(conn->check_all_addrs, "1") == 0)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -5119,6 +5123,7 @@ freePGconn(PGconn *conn)
 	free(conn->oauth_client_id);
 	free(conn->oauth_client_secret);
 	free(conn->oauth_scope);
+	free(conn->check_all_addrs);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index a6cfd7f5c9d..4508073efad 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -427,6 +427,7 @@ struct pg_conn
 	char	   *scram_client_key;	/* base64-encoded SCRAM client key */
 	char	   *scram_server_key;	/* base64-encoded SCRAM server key */
 	char	   *sslkeylogfile;	/* where should the client write ssl keylogs */
+	char       *check_all_addrs;  /* whether to check all ips within a host or terminate on failure */
 
 	bool		cancelRequest;	/* true if this connection is used to send a
 								 * cancel request, instead of being a normal
diff --git a/src/interfaces/libpq/t/007_target_session_attr_dns.pl b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
new file mode 100644
index 00000000000..914f6c472f4
--- /dev/null
+++ b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+# Cluster setup which is shared for testing both load balancing methods
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+
+my $node_primary1 = PostgreSQL::Test::Cluster->new('primary1', port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start;
+
+# Take backup from which all operations will be run
+$node_primary1->backup('my_backup');
+
+my $node_standby = PostgreSQL::Test::Cluster->new('standby', port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, 'my_backup',
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new('node1', port => $port, own_host => 1);
+$node_primary2 ->init();
+$node_primary2 ->start();
+
+# target_session_attrs=primary should always choose the first one.
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+
+$node_primary1->stop();
+
+# target_session_attrs=primary should always choose the first one.
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+$node_primary2->stop();
+$node_standby->stop();
+
+
+done_testing();
diff --git a/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl b/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl
new file mode 100644
index 00000000000..d3405598e67
--- /dev/null
+++ b/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl
@@ -0,0 +1,128 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+local $Test::Builder::Level = $Test::Builder::Level + 1;
+my $node_primary1 = PostgreSQL::Test::Cluster->new("primary1", port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start();
+
+# Take backup from which all operations will be run
+$node_primary1->backup("my_backup");
+
+my $node_standby = PostgreSQL::Test::Cluster->new("standby", port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, "my_backup",
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new("node1", port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+sub test_target_session_attr {
+	my $target_session_attrs = shift;
+	my $test_num = shift;
+	my $primary1_expect_traffic = shift;
+	my $standby_expeect_traffic = shift;
+	my $primary2_expect_traffic = shift;
+	# Statistically the following loop with load_balance_hosts=random will almost
+	# certainly connect at least once to each of the nodes. The chance of that not
+	# happening is so small that it's negligible: (2/3)^50 = 1.56832855e-9
+	foreach my $i (1 .. 50)
+	{
+		$node_primary1->connect_ok(
+			"host=pg-loadbalancetest port=$port load_balance_hosts=random target_session_attrs=${target_session_attrs} check_all_addrs=1",
+			"repeated connections with random load balancing",
+			sql => "SELECT 'connect${test_num}'");
+	}
+	my $node_primary1_occurrences = () =
+	  $node_primary1->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_standby_occurrences = () =
+	  $node_standby->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_primary2_occurrences = () =
+	  $node_primary2->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+
+	my $total_occurrences =
+	  $node_primary1_occurrences + $node_standby_occurrences + $node_primary2_occurrences;
+
+	if ($primary1_expect_traffic == 1) {
+		ok($node_primary1_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_primary1_occurrences == 0, "received at least one connection on node1");
+	}
+	if ($standby_expeect_traffic == 1) {
+		ok($node_standby_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_standby_occurrences == 0, "received at least one connection on node1");
+	}
+
+	if ($primary2_expect_traffic == 1) {
+		ok($node_primary2_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_primary2_occurrences == 0, "received at least one connection on node1");
+	}
+
+	ok($total_occurrences == 50, "received 50 connections across all nodes");
+}
+
+test_target_session_attr('any',
+	1, 1, 1, 1);
+test_target_session_attr('read-only',
+	2, 0, 1, 0);
+test_target_session_attr('read-write',
+	3, 1, 0, 1);
+test_target_session_attr('primary',
+	4, 1, 0, 1);
+test_target_session_attr('standby',
+	5, 0, 1, 0);
+
+
+$node_primary1->stop();
+$node_primary2->stop();
+$node_standby->stop();
+
+done_testing();
-- 
2.47.2

Reply via email to