Re: [HACKERS] [TODO] Process pg_hba.conf keywords as case-insensitive

Kyotaro HORIGUCHI Thu, 18 Sep 2014 03:42:46 -0700

Hi, this is a revised patch, including documentation.

I had confused the three kinds of identifiers to be compared:
names in the catalog, names in pg_hba lines, and names given by
the client when connecting. This patch concerns the comparison
between pg_hba names and client names.


As a result, all the additional pg_strcasecmp() calls and the
whole-catalog scans are eliminated. This version works as follows.

Every hba token is tokenized and categorized by two attributes:

   One is whether the case is preserved or not. The case of a word
   is preserved in the returned token if the word is enclosed in
   double quotes.

   The other is the token type. A leading bare '+' indicates that
   the token is a group name, and '@' indicates file inclusion.
   The string in the returned token is stripped of these special
   characters.

   A double-quoted region that does not start at the beginning of
   the word was already handled in its own way before this change.
   I don't know whether that is right or not. (ho"r""i"guti is
   stored as hor"iguti by the original next_token(), and that is
   unchanged.)
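
To make the two attributes concrete, here is a rough standalone sketch,
not the code in the patch. The names classify_word(), SketchToken and
the TOK_* constants are made up for illustration, and the sketch only
handles a bare word or a word wholly enclosed in double quotes (no
doubled quotes, commas, comments or mid-word quoted regions).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; they only mirror the idea of the patch. */
typedef enum
{
	TOK_NORMAL,
	TOK_GROUP_NAME,			/* word had a leading bare '+' */
	TOK_FILE_INCLUSION		/* word had a leading bare '@' */
} SketchTokenType;

typedef struct
{
	char		string[64];
	SketchTokenType type;
	bool		case_preserved;
} SketchToken;

/*
 * Classify a single word that contains no whitespace.
 * Simplification: only "bare" and "fully quoted" words are understood.
 */
static void
classify_word(const char *word, SketchToken *tok)
{
	size_t		len;
	char	   *p;

	assert(word != NULL && tok != NULL);

	tok->type = TOK_NORMAL;
	tok->case_preserved = false;

	/* A leading bare '+' or '@' sets the type and is not stored. */
	if (*word == '+' || *word == '@')
	{
		tok->type = (*word == '+') ? TOK_GROUP_NAME : TOK_FILE_INCLUSION;
		word++;
	}

	/* A word wholly enclosed in double quotes keeps its case. */
	len = strlen(word);
	if (len >= 2 && word[0] == '"' && word[len - 1] == '"')
	{
		tok->case_preserved = true;
		word++;
		len -= 2;
	}

	if (len >= sizeof(tok->string))
		len = sizeof(tok->string) - 1;
	memcpy(tok->string, word, len);
	tok->string[len] = '\0';

	/* Everything else is stored lowercased. */
	if (!tok->case_preserved)
	{
		for (p = tok->string; *p != '\0'; p++)
			*p = (char) tolower((unsigned char) *p);
	}
}
```

With this sketch, Foo is stored as a normal token foo, "Foo" as a
case-preserved normal token Foo, and +"FOO" as a case-preserved group
token FOO.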

Name matching is performed as follows:

   Tokens corresponding to keywords should be 'normal' ones (not
   a group name or file inclusion) and should not be
   case-preserved ones, that is, ones that were enclosed in double
   quotes. Such tokens are already lowercased, so the
   token_is_keyword() macro compares them by strcmp().

   Database names and user names should be 'normal' tokens, and
   their case is preserved or not according to the notation in
   the hba line, so token_matches() compares them with the name
   given by the client by strcmp().
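
The two rules can be tried outside the backend with something like the
following. This is only a sketch under assumptions: the struct and enum
follow the patch, but the two macros are rewritten here as static
functions, and the string field is const so that literals can be used
in a quick test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum TokenType
{
	NORMAL,
	GROUP_NAME,				/* token had a leading '+' */
	FILE_INCLUSION			/* token had a leading '@' */
} TokenType;

typedef struct HbaToken
{
	const char *string;		/* already lowercased unless case_preserved */
	TokenType	type;
	bool		case_preserved;
} HbaToken;

/* Keywords: normal, not quoted; plain strcmp() is enough. */
static bool
token_is_keyword(const HbaToken *tk, const char *kw)
{
	assert(tk != NULL && kw != NULL);
	return tk->type == NORMAL && !tk->case_preserved &&
		strcmp(tk->string, kw) == 0;
}

/* Database and user names: normal tokens, compared as stored. */
static bool
token_matches(const HbaToken *t, const char *name)
{
	assert(t != NULL && name != NULL);
	return t->type == NORMAL && strcmp(t->string, name) == 0;
}
```

For example, an unquoted Foo in the hba line is stored as foo, so it
matches a client name foo but not Foo, while a quoted "all" is never
taken as the keyword all but still matches a database literally named
all.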


The patch is much smaller than the previous version.


At Wed, 10 Sep 2014 11:32:22 +0200, Florian Pflug <f...@phlo.org> wrote in 
<7d70ee06-1e80-44d6-9428-5f60ad796...@phlo.org>
> So foo, Foo and FOO would all match the user called <foo>,
> but "Foo" would match the user called <Foo>, and "FOO" the
> user called <FOO>.

This patch does so.

> An unquoted "+" would cause whatever follows it to be interpreted
> as a group name, whereas a quoted "+" would simply become part of
> the user name (or group name, if there's an additional unquoted
> "+" before it).
> So +foo would refer to the group <foo>, +"FOO" to the group <FOO>,
> and +"+A" to the group <+A>.

I think it behaves this way.

> I haven't checked if such an approach would be sufficiently
> backwards-compatible, though.

One obvious incompatibility that affects existing sane pg_hba.conf
files is that db and user names not surrounded by double quotes
now match the lowercased names, not the original names containing
uppercase characters. But this is just what this patch intends.

I think the behavior in all other cases that appear in existing
pg_hba.conf files is unchanged, including the behavior for a
string consisting of the single character '+' or '@'.

# '+' is treated as a group name '' and '@' is treated as a
# user/db name '@', but they seem meaningless..
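
The handling of a lone '+' or '@' can be pictured roughly as below.
This is a sketch, not the patch: finish_token() and the EDGE_* names
are hypothetical, and "body" stands for whatever was read after the
prefix character had been stripped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only. */
typedef enum
{
	EDGE_NORMAL,
	EDGE_GROUP_NAME,
	EDGE_FILE_INCLUSION
} EdgeTokenType;

/*
 * Finish a token whose prefix ('+' or '@') was already stripped.
 * Returns true if a token should be reported to the caller.
 */
static bool
finish_token(const char *body, bool saw_quote, EdgeTokenType *type,
			 char *out, size_t outsz)
{
	size_t		len;

	assert(body != NULL && type != NULL && out != NULL && outsz >= 2);

	len = strlen(body);
	if (len >= outsz)
		len = outsz - 1;
	memcpy(out, body, len);
	out[len] = '\0';

	/* A lone '@' stays an ordinary name '@', as before the patch. */
	if (len == 0 && *type == EDGE_FILE_INCLUSION)
	{
		out[0] = '@';
		out[1] = '\0';
		len = 1;
		*type = EDGE_NORMAL;
	}

	/* A lone '+' is still reported, as a group with an empty name. */
	return saw_quote || len > 0 || *type == EDGE_GROUP_NAME;
}
```

So a lone '+' yields a group token with an empty name, and a lone '@'
yields a normal token '@'.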

Any suggestions?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From b02cea3ead352a198f341c0f2a9f6ab93f439077 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Thu, 18 Sep 2014 17:06:21 +0900
Subject: [PATCH 2/2] Make pg_hba.conf case insensitive.

---
 src/backend/libpq/hba.c |  101 ++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 83 insertions(+), 18 deletions(-)

diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 84da823..e4b1635 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -60,9 +60,18 @@ typedef struct check_network_data
 	bool		result;			/* set to true if match */
 } check_network_data;
 
+typedef enum TokenType
+{
+	NORMAL,
+	GROUP_NAME,			/* this token had leading '+' */
+	FILE_INCLUSION,		/* this token had leading '@' */
+} TokenType;
 
-#define token_is_keyword(t, k)	(!t->quoted && strcmp(t->string, k) == 0)
-#define token_matches(t, k)  (strcmp(t->string, k) == 0)
+#define token_is_keyword(tk, kw)	\
+	((tk)->type == NORMAL && !(tk)->case_preserved &&	\
+	 (strcmp((tk)->string, (kw)) == 0))
+#define token_matches(t, k)	\
+	((t)->type == NORMAL && (strcmp((t)->string, (k)) == 0))
 
 /*
  * A single string token lexed from the HBA config file, together with whether
@@ -71,7 +80,8 @@ typedef struct check_network_data
 typedef struct HbaToken
 {
 	char	   *string;
-	bool		quoted;
+	TokenType	type;
+	bool		case_preserved;
 } HbaToken;
 
 /*
@@ -123,8 +133,14 @@ pg_isblank(const char c)
  * the first character.  (We use that to prevent "@x" from being treated
  * as a file inclusion request.  Note that @"x" should be so treated;
  * we want to allow that to support embedded spaces in file paths.)
+ * type is one of NORMAL, GROUP_NAME, FILE_INCLUSION. GROUP_NAME is set if the
+ * token is prefixed by '+', FILE_INCLUSION if prefixed by '@', NORMAL
+ * otherwise.
  * We set *terminating_comma to indicate whether the token is terminated by a
  * comma (which is not returned.)
+ * case_preserved is set if the whole word is enclosed in double quotes, in
+ * which case the case of every character is preserved. Otherwise the returned
+ * token string is lowercased.
  *
  * If successful: store null-terminated token at *buf and return TRUE.
  * If no more tokens on line: set *buf = '\0' and return FALSE.
@@ -136,8 +152,8 @@ pg_isblank(const char c)
  * Handle comments.
  */
 static bool
-next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
-		   bool *terminating_comma)
+next_token(char **lineptr, char *buf, int bufsz,
+		   bool *case_preserved, int *type, bool *terminating_comma)
 {
 	int			c;
 	char	   *start_buf = buf;
@@ -149,8 +165,9 @@ next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
 	/* end_buf reserves two bytes to ensure we can append \n and \0 */
 	Assert(end_buf > start_buf);
 
-	*initial_quote = false;
 	*terminating_comma = false;
+	*case_preserved = false;
+	*type = NORMAL;
 
 	/* Move over initial whitespace and commas */
 	while ((c = (*(*lineptr)++)) != '\0' && (pg_isblank(c) || c == ','))
@@ -162,6 +179,17 @@ next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
 		return false;
 	}
 
+	if (c == '+' || c == '@')
+	{
+		*type = (c == '+' ? GROUP_NAME : FILE_INCLUSION);
+
+		/*
+		 * Skip capturing it, and we can read the following characters as
+		 * usual.
+		 */
+		c = *(*lineptr)++;
+	}
+
 	/*
 	 * Build a token in buf of next characters up to EOF, EOL, unquoted comma,
 	 * or unquoted whitespace.
@@ -201,8 +229,17 @@ next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
 		}
 
 		if (c != '"' || was_quote)
+		{
 			*buf++ = c;
 
+			/*
+			 * Cancel case-sensitive state if trailing characters found for
+			 * the quoted region.
+			 */
+			if (*case_preserved && !in_quote)
+				*case_preserved = false;
+		}
+
 		/* Literal double-quote is two double-quotes */
 		if (in_quote && c == '"')
 			was_quote = !was_quote;
@@ -214,7 +251,7 @@ next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
 			in_quote = !in_quote;
 			saw_quote = true;
 			if (buf == start_buf)
-				*initial_quote = true;
+				*case_preserved = true;
 		}
 
 		c = *(*lineptr)++;
@@ -226,13 +263,32 @@ next_token(char **lineptr, char *buf, int bufsz, bool *initial_quote,
 	 */
 	(*lineptr)--;
 
+	/*
+	 * A lone '@' is treated as the character itself for backward
+	 * compatibility.
+	 */
+	if (buf == start_buf && *type == FILE_INCLUSION)
+	{
+		*buf++ = '@';
+		*type = NORMAL;
+	}
+
 	*buf = '\0';
 
-	return (saw_quote || buf > start_buf);
+	/* Lowercase the name unless it was written in case-preserving notation */
+	if (!*case_preserved)
+	{
+		char *p;
+		for (p = start_buf; p < buf; p++)
+			*p = tolower((unsigned char) *p);
+	}
+
+	/* Allow an empty group name for backward compatibility */
+	return (saw_quote || buf > start_buf || *type == GROUP_NAME);
 }
 
 static HbaToken *
-make_hba_token(char *token, bool quoted)
+make_hba_token(char *token, TokenType toktype, bool case_preserved)
 {
 	HbaToken   *hbatoken;
 	int			toklen;
@@ -240,7 +296,8 @@ make_hba_token(char *token, bool quoted)
 	toklen = strlen(token);
 	hbatoken = (HbaToken *) palloc(sizeof(HbaToken) + toklen + 1);
 	hbatoken->string = (char *) hbatoken + sizeof(HbaToken);
-	hbatoken->quoted = quoted;
+	hbatoken->type = toktype;
+	hbatoken->case_preserved = case_preserved;
 	memcpy(hbatoken->string, token, toklen + 1);
 
 	return hbatoken;
@@ -252,7 +309,8 @@ make_hba_token(char *token, bool quoted)
 static HbaToken *
 copy_hba_token(HbaToken *in)
 {
-	HbaToken   *out = make_hba_token(in->string, in->quoted);
+	HbaToken   *out = make_hba_token(in->string,
+									 in->type, in->case_preserved);
 
 	return out;
 }
@@ -269,19 +327,26 @@ next_field_expand(const char *filename, char **lineptr)
 {
 	char		buf[MAX_TOKEN];
 	bool		trailing_comma;
-	bool		initial_quote;
+	bool		case_preserved;
+	int			type;
 	List	   *tokens = NIL;
 
 	do
 	{
-		if (!next_token(lineptr, buf, sizeof(buf), &initial_quote, &trailing_comma))
+		if (!next_token(lineptr, buf, sizeof(buf), 
+						&case_preserved, &type,
+						&trailing_comma))
 			break;
 
 		/* Is this referencing a file? */
-		if (!initial_quote && buf[0] == '@' && buf[1] != '\0')
-			tokens = tokenize_inc_file(tokens, filename, buf + 1);
+		if (type == FILE_INCLUSION)
+			tokens = tokenize_inc_file(tokens, filename, buf);
+		else if (type == GROUP_NAME)
+			tokens = lappend(tokens,
+							 make_hba_token(buf, GROUP_NAME, case_preserved));
 		else
-			tokens = lappend(tokens, make_hba_token(buf, initial_quote));
+			tokens = lappend(tokens,
+							 make_hba_token(buf, NORMAL, case_preserved));
 	} while (trailing_comma);
 
 	return tokens;
@@ -489,9 +554,9 @@ check_role(const char *role, Oid roleid, List *tokens)
 	foreach(cell, tokens)
 	{
 		tok = lfirst(cell);
-		if (!tok->quoted && tok->string[0] == '+')
+		if (tok->type == GROUP_NAME)
 		{
-			if (is_member(roleid, tok->string + 1))
+			if (is_member(roleid, tok->string))
 				return true;
 		}
 		else if (token_matches(tok, role) ||
-- 
1.7.1

From 3eafe5ef922f3bdf9ea34d8ae408c3b1d0345768 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Thu, 18 Sep 2014 17:07:41 +0900
Subject: [PATCH 1/2] Document for make pg_hba.conf case insensitive

---
 doc/src/sgml/client-auth.sgml |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 7704f73..5fd0108 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -196,8 +196,13 @@ hostnossl  <replaceable>database</replaceable>  <replaceable>user</replaceable>
        Otherwise, this is the name of
        a specific <productname>PostgreSQL</productname> database.
        Multiple database names can be supplied by separating them with
-       commas.  A separate file containing database names can be specified by
+       commas.
+       Quoted identifiers, except for Unicode escapes, are
+       allowed (see <xref linkend="sql-syntax-identifiers"> for more
+       information).
+       A separate file containing database names can be specified by
        preceding the file name with <literal>@</>.
+       the file name with <literal>@</>.
       </para>
      </listitem>
     </varlistentry>
@@ -219,6 +224,9 @@ hostnossl  <replaceable>database</replaceable>  <replaceable>user</replaceable>
        of the role, directly or indirectly, and not just by virtue of
        being a superuser.
        Multiple user names can be supplied by separating them with commas.
+       Quoted identifiers, except for Unicode escapes, are
+       allowed (see <xref linkend="sql-syntax-identifiers"> for more
+       information).
        A separate file containing user names can be specified by preceding the
        file name with <literal>@</>.
       </para>
-- 
1.7.1

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers