Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

Drouvot, Bertrand Fri, 21 Oct 2022 05:12:08 -0700

Hi,

On 10/21/22 2:58 AM, Michael Paquier wrote:

On Wed, Oct 19, 2022 at 10:45:44AM +0200, Drouvot, Bertrand wrote:

Please find attached v1-0001-regex-handling-for-db-and-roles-in-hba.patch to
implement regexes for databases and roles in hba.


It does also contain new regexes related TAP tests and doc updates.


Thanks for the updated version.  This is really easy to look at now.

It relies on the refactoring made in fc579e11c6 (but changes the
regcomp_auth_token() parameters so that it is now responsible for emitting
the compilation error message (if any), to avoid code duplication in
parse_hba_line() and parse_ident_line() for roles, databases and user name
mapping).


@@ -652,13 +670,18 @@ check_role(const char *role, Oid roleid, List *tokens)
[...]
-       if (!tok->quoted && tok->string[0] == '+')
+       if (!token_has_regexp(tok))
         {
Hmm.  Do we need token_has_regexp() here for all the cases?  We know
that the string can begin with a '+', hence it is no regex.  The same
applies for the case of "all".  The remaining case is the one where
the user name matches exactly the AuthToken string, which should be
last as we want to treat anything beginning with a '/' as a regex.  It
seems like we could do an order like that?  Say:
if (!tok->quoted && tok->string[0] == '+')
     //do
else if (token_is_keyword(tok, "all"))
     //do
else if (token_has_regexp(tok))
     //do regex compilation, handling failures
else if (token_matches(tok, role))
     //do exact match

The same approach with keywords first, regex, and exact match could be
applied as well for the databases?  Perhaps it is just mainly a matter

of taste,


Yeah, I think it is.

and it depends on how much you want to prioritize the place
of the regex over the rest but that could make the code easier to
understand in the long-run and this is a very sensitive code area,

And agree that your proposal tastes better ;-): it is easier tounderstand, v2 attached has been done that way.

Compiling the expressions for the user and database lists one-by-one
in parse_hba_line() as you do is correct.  However there is a gotcha
that you are forgetting here: the memory allocations done for the
regexp compilations are not linked to the memory context where each
line is processed (named hbacxt in load_hba()) and need a separate

cleanup.


Oops, right, thanks for the call out!

In the same fashion as load_ident(), it seems to me that we
need two extra things for this patch:
- if !ok (see where we do MemoryContextDelete(hbacxt)), we need to go
through new_parsed_lines and free for each line the AuthTokens for the
database and user lists.
- if ok and new_parsed_lines != NIL, the same cleanup needs to
happen.


Right, but I think that should be "parsed_hba_lines != NIL".

My guess is that you could do both the same way as load_ident() does,
keeping some symmetry between the two code paths.

Right. To avoid code duplication in the !ok/ok cases, the functionfree_hba_line() has been added in v2: it goes through the list ofdatabases and roles tokens and call free_auth_token() for each of them.

Unifying both into
a common routine would be sort of annoying as HbaLines uses lists
within the lists of parsed lines, and IdentLine can have one at most
in each line.


I agree, and v2 is not attempting to unify them.

For now, I have made your last patch a bit shorter by applying the
refactoring of regcomp_auth_token() separately with a few tweaks to
the comments.


Thanks! v2 attached does apply on top of that.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index c6f1b70fd3..30753003ba 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -234,10 +234,13 @@ hostnogssenc  <replaceable>database</replaceable>  
<replaceable>user</replaceabl
        replication connections do not specify any particular database whereas
        logical replication connections do specify it.
        Otherwise, this is the name of
-       a specific <productname>PostgreSQL</productname> database.
-       Multiple database names can be supplied by separating them with
-       commas.  A separate file containing database names can be specified by
-       preceding the file name with <literal>@</literal>.
+       a specific <productname>PostgreSQL</productname> database or a regular
+       expression preceded by <literal>/</literal>.
+       Multiple database names and/or regular expressions preceded by 
<literal>/</literal>
+       can be supplied by separating them with commas.
+       A separate file containing database names and/or regular expressions 
preceded
+       by <literal>/</literal> can be specified by preceding the file name
+       with <literal>@</literal>.
       </para>
      </listitem>
     </varlistentry>
@@ -249,18 +252,20 @@ hostnogssenc  <replaceable>database</replaceable>  
<replaceable>user</replaceabl
        Specifies which database user name(s) this record
        matches. The value <literal>all</literal> specifies that it
        matches all users.  Otherwise, this is either the name of a specific
-       database user, or a group name preceded by <literal>+</literal>.
+       database user, a regular expression preceded by <literal>/</literal>,
+       or a group name preceded by <literal>+</literal>.
        (Recall that there is no real distinction between users and groups
        in <productname>PostgreSQL</productname>; a <literal>+</literal> mark 
really means
        <quote>match any of the roles that are directly or indirectly members
        of this role</quote>, while a name without a <literal>+</literal> mark 
matches
-       only that specific role.) For this purpose, a superuser is only
+       only that specific role or regular expression.) For this purpose, a 
superuser is only
        considered to be a member of a role if they are explicitly a member
        of the role, directly or indirectly, and not just by virtue of
        being a superuser.
-       Multiple user names can be supplied by separating them with commas.
-       A separate file containing user names can be specified by preceding the
-       file name with <literal>@</literal>.
+       Multiple user names and/or regular expressions preceded by 
<literal>/</literal>
+       can be supplied by separating them with commas. A separate file 
containing
+       user names and/or regular expressions preceded by <literal>/</literal>
+       can be specified by preceding the file name with <literal>@</literal>.
       </para>
      </listitem>
     </varlistentry>
@@ -739,6 +744,18 @@ host    all             all             ::1/128            
     trust
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 host    all             all             localhost               trust
 
+# The same using regular expression for DATABASE, which allows connection to 
the
+# db1 and testdb databases and any database with a name ending with "test".
+#
+# TYPE  DATABASE               USER            ADDRESS          METHOD
+local   db1,/^.*test$,testdb   all             localhost        trust
+
+# The same using regular expression for USER, which allows connection to the
+# user1 and testuser users and any user with a name ending with "test".
+#
+# TYPE  DATABASE                 USER                              ADDRESS     
     METHOD
+local   db1,/^.*test$,testdb     user1,/^.*test$,testuser          localhost   
     trust
+
 # Allow any user from any host with IP address 192.168.93.x to connect
 # to database "postgres" as the same user name that ident reports for
 # the connection (typically the operating system user name).
@@ -785,15 +802,17 @@ host    all             all             192.168.12.10/32  
      gss
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 host    all             all             192.168.0.0/16          ident 
map=omicron
 
-# If these are the only three lines for local connections, they will
+# If these are the only four lines for local connections, they will
 # allow local users to connect only to their own databases (databases
-# with the same name as their database user name) except for administrators
-# and members of role "support", who can connect to all databases.  The file
-# $PGDATA/admins contains a list of names of administrators.  Passwords
+# with the same name as their database user name) except for users ending
+# with "helpdesk", administrators and members of role "support",
+# who can connect to all databases.
+# The file $PGDATA/admins contains a list of names of administrators.  
Passwords
 # are required in all cases.
 #
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 local   sameuser        all                                     md5
+local   all             /^.*helpdesk$                           md5
 local   all             @admins                                 md5
 local   all             +support                                md5
 
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index f3539a7929..0e2185d35e 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -293,6 +293,31 @@ free_auth_token(AuthToken *token)
                pg_regfree(token->regex);
 }
 
+/*
+ * Free a HbaLine, that may include databases and/or roles AuthToken with a
+ * regular expression that needs to be cleaned up explicitly.
+ */
+static void
+free_hba_line(HbaLine *line)
+{
+       ListCell   *cell;
+       AuthToken  *tok;
+
+       /* Clean up roles regexes */
+       foreach(cell, line->roles)
+       {
+               tok = lfirst(cell);
+               free_auth_token(tok);
+       }
+
+       /* Clean up databases regexes */
+       foreach(cell, line->databases)
+       {
+               tok = lfirst(cell);
+               free_auth_token(tok);
+       }
+}
+
 /*
  * Copy a AuthToken struct into freshly palloc'd memory.
  */
@@ -676,8 +701,14 @@ check_role(const char *role, Oid roleid, List *tokens)
                        if (is_member(roleid, tok->string + 1))
                                return true;
                }
-               else if (token_matches(tok, role) ||
-                                token_is_keyword(tok, "all"))
+               else if (token_is_keyword(tok, "all"))
+                       return true;
+               else if (token_has_regexp(tok))
+               {
+                       if (regexec_auth_token(role, tok, 0, NULL) == REG_OKAY)
+                               return true;
+               }
+               else if (token_matches(tok, role))
                        return true;
        }
        return false;
@@ -719,6 +750,11 @@ check_db(const char *dbname, const char *role, Oid roleid, 
List *tokens)
                }
                else if (token_is_keyword(tok, "replication"))
                        continue;                       /* never match this if 
not walsender */
+               else if (token_has_regexp(tok))
+               {
+                       if (regexec_auth_token(dbname, tok, 0, NULL) == 
REG_OKAY)
+                               return true;
+               }
                else if (token_matches(tok, dbname))
                        return true;
        }
@@ -1138,8 +1174,15 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
        tokens = lfirst(field);
        foreach(tokencell, tokens)
        {
-               parsedline->databases = lappend(parsedline->databases,
-                                                                               
copy_auth_token(lfirst(tokencell)));
+               AuthToken  *tok = copy_auth_token(lfirst(tokencell));
+
+               /*
+                * Compile a regex from the database token, if necessary.
+                */
+               if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, 
elevel))
+                       return NULL;
+
+               parsedline->databases = lappend(parsedline->databases, tok);
        }
 
        /* Get the roles. */
@@ -1158,8 +1201,15 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
        tokens = lfirst(field);
        foreach(tokencell, tokens)
        {
-               parsedline->roles = lappend(parsedline->roles,
-                                                                       
copy_auth_token(lfirst(tokencell)));
+               AuthToken  *tok = copy_auth_token(lfirst(tokencell));
+
+               /*
+                * Compile a regex from the role token, if necessary.
+                */
+               if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, 
elevel))
+                       return NULL;
+
+               parsedline->roles = lappend(parsedline->roles, tok);
        }
 
        if (parsedline->conntype != ctLocal)
@@ -2355,12 +2405,31 @@ load_hba(void)
 
        if (!ok)
        {
-               /* File contained one or more errors, so bail out */
+               /*
+                * File contained one or more errors, so bail out, first being 
careful
+                * to clean up whatever we allocated.  Most stuff will go away 
via
+                * MemoryContextDelete, but we have to clean up regexes 
explicitly.
+                */
+               foreach(line, new_parsed_lines)
+               {
+                       HbaLine    *newline = (HbaLine *) lfirst(line);
+
+                       free_hba_line(newline);
+               }
                MemoryContextDelete(hbacxt);
                return false;
        }
 
        /* Loaded new file successfully, replace the one we use */
+       if (parsed_hba_lines != NIL)
+       {
+               foreach(line, parsed_hba_lines)
+               {
+                       HbaLine    *newline = (HbaLine *) lfirst(line);
+
+                       free_hba_line(newline);
+               }
+       }
        if (parsed_hba_context != NULL)
                MemoryContextDelete(parsed_hba_context);
        parsed_hba_context = hbacxt;
diff --git a/src/test/authentication/t/001_password.pl 
b/src/test/authentication/t/001_password.pl
index ea664d18f5..acb8bfaac8 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -81,6 +81,14 @@ $node->safe_psql(
         GRANT ALL ON sysuser_data TO md5_role;");
 $ENV{"PGPASSWORD"} = 'pass';
 
+# Create a role that contains a comma to stress the parsing.
+$node->safe_psql('postgres',
+       q{SET password_encryption='md5'; CREATE ROLE "md5,role" LOGIN PASSWORD 
'pass';}
+);
+
+# Create a database to test regular expression.
+$node->safe_psql('postgres', "CREATE database regex_testdb;");
+
 # For "trust" method, all users should be able to connect. These users are not
 # considered to be authenticated.
 reset_pg_hba($node, 'all', 'all', 'trust');
@@ -200,6 +208,39 @@ append_to_file(
 
 test_conn($node, 'user=md5_role', 'password from pgpass', 0);
 
+# Testing with regular expression for username. Note that the third regex
+# matches in this case.
+reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^md.*$', 'password');
+test_conn($node, 'user=md5_role', 'password, matching regexp for username',
+       0);
+
+# The third regex does not match anymore.
+reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^m_d.*$', 'password');
+test_conn($node, 'user=md5_role',
+       'password, non matching regexp for username',
+       2, log_unlike => [qr/connection authenticated:/]);
+
+# Test with a comma in the regular expression
+reset_pg_hba($node, 'all', '"/^.*5,.*e$"', 'password');
+test_conn($node, 'user=md5,role', 'password', 'matching regexp for username',
+       0);
+
+# Testing with regular expression for dbname. The third regex matches.
+reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*b$', 'all',
+       'password');
+test_conn(
+       $node, 'user=md5_role dbname=regex_testdb', 'password,
+   matching regexp for dbname', 0);
+
+# The third regex does not match anymore.
+reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*ba$',
+       'all', 'password');
+test_conn(
+       $node,
+       'user=md5_role dbname=regex_testdb',
+       'password, non matching regexp for dbname',
+       2, log_unlike => [qr/connection authenticated:/]);
+
 unlink($pgpassfile);
 delete $ENV{"PGPASSFILE"};

Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

Reply via email to