The attached updated patch fixes the merge conflicts and updates the numbering of the test files.
On Mon, Feb 24, 2025 at 12:06 PM Andrew Jackson <andrewjackson...@gmail.com> wrote:
> The previous patch had a mistake in a reference in the documentation.
> This should fix it.
>
> On Mon, Feb 24, 2025 at 10:07 AM Andrew Jackson <andrewjackson...@gmail.com> wrote:
>
>> Looks like this needed another rebase to account for the oauth commit.
>> Rebase attached.
>>
>> On Mon, Feb 24, 2025 at 9:38 AM Andrew Jackson <andrewjackson...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thank you for the review!
>>>
>>> Review Response
>>>
>>> - Made a first pass at a real commit message
>>> - Fixed the condition on the if statement to use strcmp
>>> - Added a test suite in the files
>>>   `src/interfaces/libpq/t/006_target_session_attr_dns.pl` and
>>>   `src/interfaces/libpq/t/007_load_balance_dns_check_all_addrs.pl`,
>>>   which check target_session_attrs when used with and without load
>>>   balancing.
>>>
>>> Regarding the name of the variable itself, I am definitely open to
>>> opinions. I didn't put too much thought in initially and just chose
>>> `check_all_addrs`. Given that it modifies the behaviour of
>>> `target_session_attrs`, ideally the name should reference that, but
>>> that would make the variable name very long: something akin to
>>> `target_session_attrs_check_all_addrs`.
>>>
>>> Context
>>>
>>> I tested some drivers as well and found that pgx, psycopg, and
>>> rust-postgres all traverse every IP address when looking for a matching
>>> target_session_attrs. Asyncpg and psycopg2, on the other hand, follow
>>> libpq and terminate additional attempts after the first failure. Given
>>> this, there is a fair amount of fragmentation in the ecosystem over how
>>> exactly to implement this feature. I believe some drivers chose to
>>> traverse all addresses because their users target the same use case
>>> outlined above.
>>>
>>> Thanks again,
>>> Andrew Jackson
>>>
>>> On Sun, Feb 16, 2025 at 6:03 AM Andrey Borodin <x4...@yandex-team.ru> wrote:
>>>
>>>> Hi Andrew!
>>>>
>>>> cc Jelte, I suspect he might be interested.
>>>>
>>>> > On 20 Nov 2024, at 20:51, Andrew Jackson <andrewjackson...@gmail.com> wrote:
>>>> >
>>>> > Would appreciate any feedback on the applicability/relevancy of the
>>>> > goal here or the implementation.
>>>>
>>>> Thank you for raising the issue. Following our discussion in Discord,
>>>> I'm putting my thoughts to the list.
>>>>
>>>> Context
>>>>
>>>> A DNS record might return several IPs. Consider a connection string
>>>> with "host=A,B", where A resolves to 1.1.1.1,2.2.2.2 and B to
>>>> 3.3.3.3,4.4.4.4.
>>>> If we connect with "target_session_attrs=read-write", IPs 1.1.1.1 and
>>>> 3.3.3.3 will be probed, but 2.2.2.2 and 4.4.4.4 won't (if 1.1.1.1 and
>>>> 3.3.3.3 responded).
>>>>
>>>> If we enable libpq load balancing, some random 2 IPs will be probed.
>>>>
>>>> IMO it's a bug, at least when load balancing is enabled. Let's
>>>> consider whether we can change the default behavior here. I suspect we
>>>> can't do it for "load_balance_hosts=disable". And even with load
>>>> balancing this might be too unexpected a change for someone.
>>>>
>>>> Below I consider the proposal not as a bug fix, but as a feature.
>>>>
>>>> In Discord we surveyed some other drivers. pgx treats all IPs as
>>>> different servers [1]. npgsql always goes through all IPs
>>>> one by one [2]. PGJDBC is somewhat in a decision process [3]
>>>> (cc Dave and Vladimir, if they would like to provide some input).
>>>>
>>>> Review
>>>>
>>>> The patch needs a rebase. It's trivial, so please find it attached.
>>>> The patch needs a real commit message; that's not trivial :)
>>>>
>>>> We definitely need to adjust tests [0]. We need to change
>>>> 004_load_balance_dns.pl so that it tests target_session_attrs too.
>>>>
>>>> Some documentation would be nice.
>>>>
>>>> I do not like how this check is performed:
>>>> +    if (conn->check_all_addrs && conn->check_all_addrs[0] == '1')
>>>> Let's make it like load balancing is done [4].
>>>>
>>>> Finally, let's think about naming alternatives for "check_all_addrs".
>>>>
>>>> I think that's enough for a first round of review. If it's not a bug
>>>> but a feature, it's a very narrow window to get into 18. But we might
>>>> be lucky...
>>>>
>>>> Thank you!
>>>>
>>>> Best regards, Andrey Borodin.
>>>>
>>>> [0] https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-b05b74d2a97d7f74d4311ba1702d732f0df1b101c6ac99c146b51215174cf3ffR94
>>>> [1] https://github.com/jackc/pgx/blob/master/pgconn/pgconn.go#L177
>>>> [2] https://github.com/npgsql/npgsql/blob/7f1a59fa8dc1ccc34a70154f49a768e1abf826ba/src/Npgsql/Internal/NpgsqlConnector.cs#L986
>>>> [3] https://github.com/pgjdbc/pgjdbc/pull/3012#discussion_r1408069450
>>>> [4] https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-8d819454e061b9d4cdae9c8922ded05753a629d70f2ac1de1d4f6d5a4aeb7f68R1660
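To make the probing order concrete, here is a small standalone C sketch of the address walk being discussed. It is not libpq code: it just uses plain POSIX getaddrinfo to show that one host name can map to several candidate addresses, and marks in a comment where the default behaviour stops versus where check_all_addrs=1 would continue. The host name "localhost" and port "5432" are placeholders.

```c
/*
 * Illustration only (not libpq internals): resolve one host name and
 * walk every address it maps to. With check_all_addrs=1 the patch lets
 * libpq keep walking this list after a target_session_attrs mismatch;
 * the default behaviour abandons the remaining addresses of the host.
 */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int
main(void)
{
	struct addrinfo hints;
	struct addrinfo *res;
	struct addrinfo *ai;
	int			naddrs = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* both IPv4 and IPv6 records */
	hints.ai_socktype = SOCK_STREAM;

	/* "localhost" stands in for a DNS name with several A records */
	if (getaddrinfo("localhost", "5432", &hints, &res) != 0)
		return 1;

	for (ai = res; ai != NULL; ai = ai->ai_next)
	{
		char		buf[INET6_ADDRSTRLEN];
		const void *addr = (ai->ai_family == AF_INET)
			? (const void *) &((struct sockaddr_in *) ai->ai_addr)->sin_addr
			: (const void *) &((struct sockaddr_in6 *) ai->ai_addr)->sin6_addr;

		inet_ntop(ai->ai_family, addr, buf, sizeof(buf));
		printf("candidate address: %s\n", buf);
		naddrs++;

		/*
		 * Default libpq: on a connection that succeeds but has a
		 * mismatching target_session_attrs, stop here and move to the
		 * next *host*.  check_all_addrs=1: continue to ai->ai_next.
		 */
	}

	/* A resolvable name always yields at least one candidate. */
	assert(naddrs >= 1);
	freeaddrinfo(res);
	return 0;
}
```

Run against a name with multiple A records (like the pg-loadbalancetest entries in the tests below) it prints one candidate line per address, which is exactly the list the two sides of this thread disagree about traversing.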
From bb691d950739e2d6790bcd247c230a577a105aa9 Mon Sep 17 00:00:00 2001
From: CommanderKeynes <andrewjackson...@gmail.com>
Date: Sat, 17 May 2025 08:29:01 -0500
Subject: [PATCH v6] Add option to check all addrs for target_session.

The current behaviour of libpq when searching for a matching
target_session_attrs in a list of addrs is that, after successfully
connecting to a server, if the server's session attrs do not match the
requested target_session_attrs, no further address of that host is
considered. This behaviour is extremely inconvenient in environments
where the user is attempting to implement a high availability setup
without having to modify DNS records or a proxy server config.

This patch adds a client-side option called check_all_addrs. When set
to 1, this option tells libpq to continue checking any remaining
addresses even if there was a target_session_attrs mismatch on one of
them.

Author: Andrew Jackson
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkESSc69sp2TiTWHvvOHCUey0rDWXSrR9pinyRqyfamUYg%40mail.gmail.com
---
 doc/src/sgml/libpq.sgml                       |  33 +++++
 src/interfaces/libpq/fe-connect.c             |  25 ++--
 src/interfaces/libpq/libpq-int.h              |   1 +
 .../libpq/t/007_target_session_attr_dns.pl    | 129 ++++++++++++++++++
 .../t/008_load_balance_dns_check_all_addrs.pl | 128 +++++++++++++++++
 5 files changed, 306 insertions(+), 10 deletions(-)
 create mode 100644 src/interfaces/libpq/t/007_target_session_attr_dns.pl
 create mode 100644 src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 695fe958c3e..6f43a06ec63 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2555,6 +2555,39 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
       </listitem>
      </varlistentry>
 
+     <varlistentry id="libpq-connect-check-all-addrs" xreflabel="check_all_addrs">
+      <term><literal>check_all_addrs</literal></term>
+      <listitem>
+       <para>
+        Controls whether all addresses within a host name are checked
+        when attempting to find a connection with a matching
+        <xref linkend="libpq-connect-target-session-attrs"/>.
+        There are two modes:
+        <variablelist>
+         <varlistentry>
+          <term><literal>0</literal> (default)</term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found
+            to have a mismatching
+            <xref linkend="libpq-connect-target-session-attrs"/>, do not
+            check any additional addresses and move on to the next host if
+            one was provided.
+           </para>
+          </listitem>
+         </varlistentry>
+         <varlistentry>
+          <term><literal>1</literal></term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found
+            to have a mismatching
+            <xref linkend="libpq-connect-target-session-attrs"/>, proceed to
+            check any additional addresses.
+           </para>
+          </listitem>
+         </varlistentry>
+        </variablelist>
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
   </sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 430c0fa4442..54ed809ee7c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -389,6 +389,10 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
 	{"scram_server_key", NULL, NULL, NULL,
 		"SCRAM-Server-Key", "D", SCRAM_MAX_KEY_LEN * 2,
 	offsetof(struct pg_conn, scram_server_key)},
 
+	{"check_all_addrs", "PGCHECKALLADDRS",
+		DefaultLoadBalanceHosts, NULL,
+		"Check-All-Addrs", "", 1,
+	offsetof(struct pg_conn, check_all_addrs)},
+
 	/* OAuth v2 */
 	{"oauth_issuer", NULL, NULL, NULL,
@@ -4434,11 +4438,11 @@ keep_going:						/* We will come back to here until there is
 					conn->status = CONNECTION_OK;
 					sendTerminateConn(conn);
 
-					/*
-					 * Try next host if any, but we don't want to consider
-					 * additional addresses for this host.
-					 */
-					conn->try_next_host = true;
+					if (strcmp(conn->check_all_addrs, "1") == 0)
+						conn->try_next_addr = true;
+					else
+						conn->try_next_host = true;
+
 					goto keep_going;
 				}
 			}
@@ -4489,11 +4493,11 @@ keep_going:						/* We will come back to here until there is
 					conn->status = CONNECTION_OK;
 					sendTerminateConn(conn);
 
-					/*
-					 * Try next host if any, but we don't want to consider
-					 * additional addresses for this host.
-					 */
-					conn->try_next_host = true;
+					if (strcmp(conn->check_all_addrs, "1") == 0)
+						conn->try_next_addr = true;
+					else
+						conn->try_next_host = true;
+
 					goto keep_going;
 				}
 			}
@@ -5119,6 +5123,7 @@ freePGconn(PGconn *conn)
 	free(conn->oauth_client_id);
 	free(conn->oauth_client_secret);
 	free(conn->oauth_scope);
+	free(conn->check_all_addrs);
 
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index a6cfd7f5c9d..4508073efad 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -427,6 +427,7 @@ struct pg_conn
 	char	   *scram_client_key;	/* base64-encoded SCRAM client key */
 	char	   *scram_server_key;	/* base64-encoded SCRAM server key */
 	char	   *sslkeylogfile;	/* where should the client write ssl keylogs */
+	char	   *check_all_addrs;	/* whether to check all IPs within a host
+									 * or terminate on failure */
 
 	bool		cancelRequest;	/* true if this connection is used to send a
 								 * cancel request, instead of being a normal
diff --git a/src/interfaces/libpq/t/007_target_session_attr_dns.pl b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
new file mode 100644
index 00000000000..914f6c472f4
--- /dev/null
+++ b/src/interfaces/libpq/t/007_target_session_attr_dns.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+# Cluster setup which is shared for testing both load balancing methods
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+
+my $node_primary1 = PostgreSQL::Test::Cluster->new('primary1', port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start;
+
+# Take backup from which all operations will be run
+$node_primary1->backup('my_backup');
+
+my $node_standby =
+  PostgreSQL::Test::Cluster->new('standby', port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, 'my_backup',
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 =
+  PostgreSQL::Test::Cluster->new('node1', port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+
+# With all nodes running, each target_session_attrs value should find a
+# matching node among the addresses of pg-loadbalancetest.
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to the primary",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to the primary",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the standby",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the standby",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+$node_primary1->stop();
+
+# After primary1 is stopped, primary connections should fall through to
+# primary2 rather than failing on the first address.
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to primary2",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to primary2",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the standby",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the standby",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the standby",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+$node_primary2->stop();
+$node_standby->stop();
+
+done_testing();
diff --git a/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl b/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl
new file mode 100644
index 00000000000..d3405598e67
--- /dev/null
+++ b/src/interfaces/libpq/t/008_load_balance_dns_check_all_addrs.pl
@@ -0,0 +1,128 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+local $Test::Builder::Level = $Test::Builder::Level + 1;
+my $node_primary1 = PostgreSQL::Test::Cluster->new('primary1', port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start();
+
+# Take backup from which all operations will be run
+$node_primary1->backup('my_backup');
+
+my $node_standby =
+  PostgreSQL::Test::Cluster->new('standby', port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, 'my_backup',
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 =
+  PostgreSQL::Test::Cluster->new('node1', port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+
+sub test_target_session_attr
+{
+	my $target_session_attrs = shift;
+	my $test_num = shift;
+	my $primary1_expect_traffic = shift;
+	my $standby_expect_traffic = shift;
+	my $primary2_expect_traffic = shift;
+
+	# Statistically the following loop with load_balance_hosts=random will
+	# almost certainly connect at least once to each of the matching nodes.
+	# The chance of that not happening is so small that it's negligible:
+	# (2/3)^50 = 1.56832855e-9
+	foreach my $i (1 .. 50)
+	{
+		$node_primary1->connect_ok(
+			"host=pg-loadbalancetest port=$port load_balance_hosts=random target_session_attrs=${target_session_attrs} check_all_addrs=1",
+			"repeated connections with random load balancing",
+			sql => "SELECT 'connect${test_num}'");
+	}
+
+	my $node_primary1_occurrences = () =
+	  $node_primary1->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_standby_occurrences = () =
+	  $node_standby->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_primary2_occurrences = () =
+	  $node_primary2->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+
+	my $total_occurrences =
+	  $node_primary1_occurrences +
+	  $node_standby_occurrences +
+	  $node_primary2_occurrences;
+
+	if ($primary1_expect_traffic == 1)
+	{
+		ok($node_primary1_occurrences > 0,
+			"received at least one connection on primary1");
+	}
+	else
+	{
+		ok($node_primary1_occurrences == 0,
+			"received no connections on primary1");
+	}
+	if ($standby_expect_traffic == 1)
+	{
+		ok($node_standby_occurrences > 0,
+			"received at least one connection on standby");
+	}
+	else
+	{
+		ok($node_standby_occurrences == 0,
+			"received no connections on standby");
+	}
+	if ($primary2_expect_traffic == 1)
+	{
+		ok($node_primary2_occurrences > 0,
+			"received at least one connection on primary2");
+	}
+	else
+	{
+		ok($node_primary2_occurrences == 0,
+			"received no connections on primary2");
+	}
+
+	ok($total_occurrences == 50,
+		"received 50 connections across all nodes");
+}
+
+test_target_session_attr('any', 1, 1, 1, 1);
+test_target_session_attr('read-only', 2, 0, 1, 0);
+test_target_session_attr('read-write', 3, 1, 0, 1);
+test_target_session_attr('primary', 4, 1, 0, 1);
+test_target_session_attr('standby', 5, 0, 1, 0);
+
+$node_primary1->stop();
+$node_primary2->stop();
+$node_standby->stop();
+
+done_testing();
-- 
2.47.2