On 24.01.2020 21:29, Tomas Vondra wrote:
Hi Nikita,

This patch seems inactive / stuck in "waiting on author" since November.
It's marked as bugfix, so it'd be good to get it committed instead of
just punting it to the next CF.

I did a quick review, and I came mostly with the same two complaints as
Alexander ...

On Wed, Jul 17, 2019 at 09:33:46PM +0300, Alexander Korotkov wrote:
Hi Nikita,

On Tue, Jul 16, 2019 at 6:52 PM Nikita Glukhov <n.glu...@postgrespro.ru> wrote:
I looked at "ltree syntax improvement" patch and found two more very
old bugs in ltree/lquery (fixes are attached):

Thank you for the fixes.  I've couple notes on them.

0001-Fix-max-size-checking-for-ltree-and-lquery.patch

+#define LTREE_MAX_LEVELS Min(PG_UINT16_MAX, MaxAllocSize / sizeof(nodeitem))
+#define LQUERY_MAX_LEVELS Min(PG_UINT16_MAX, MaxAllocSize / ITEMSIZE)

Looks over caution.  PG_UINT16_MAX is not even close to MaxAllocSize /
sizeof(nodeitem) or MaxAllocSize / ITEMSIZE.


Yeah, I'm also puzzled by the usage of PG_UINT16_MAX here. It's so much
lower than the other values we could jut use the constant directly, but
let's say the structs could grow from the ~16B to chnge this.

Ok, LTREE_MAX_LEVELS and LQUERY_MAX_LEVELS are defined simply as PG_UINT16_MAX 
now.


The main question is why we need PG_UINT16_MAX at all? It kinda implies
we need to squish the value into a 2B counter or something, but is that
actually true? I don't see anything like that in ltree_io.c.

ltree.numlevel and lquery.numlevel are uint16 fields.


I also found two places in ltree_concat() where numlevel can overflow.

The first is ltree_concat() (operator ||(ltree, ltree)):

=# SELECT nlevel(('a' || repeat('.a', 65533))::ltree || 'a');
 nlevel
--------
  65535
(1 row)

=# SELECT nlevel(('a' || repeat('.a', 65534))::ltree || 'a');
 nlevel
--------
      0
(1 row)



The second is parsing of low and high level limits in lquery_in():

=# SELECT '*{65535}'::lquery;
  lquery
----------
 *{65535}
(1 row)

=# SELECT '*{65536}'::lquery;
 lquery
--------
 *{0}
(1 row)

=# SELECT '*{65537}'::lquery;
 lquery
--------
 *{1}
(1 row)


The both problems are fixed in the new version of the patch.

So it seems more like an arbitrary value considered "sane" - which is
fine, but then a comment saying so would be nice, and we could pick a
value that is "nicer" for humans. Or just use value computed from the
MaxAllocSize limit, without the Min().

0002-Fix-successive-lquery-ops.patch

diff --git a/contrib/ltree/lquery_op.c b/contrib/ltree/lquery_op.c
index 62172d5..d4f4941 100644
--- a/contrib/ltree/lquery_op.c
+++ b/contrib/ltree/lquery_op.c
@@ -255,8 +255,8 @@ checkCond(lquery_level *curq, int query_numlevel,
ltree_level *curt, int tree_nu
 }
 else
 {
- low_pos = cur_tpos + curq->low;
- high_pos = cur_tpos + curq->high;
+ low_pos = Min(low_pos + curq->low, PG_UINT16_MAX);
+ high_pos = Min(high_pos + curq->high, PG_UINT16_MAX);
 if (ptr && ptr->q)
 {
 ptr->nq++;
@@ -282,8 +282,8 @@ checkCond(lquery_level *curq, int query_numlevel,
ltree_level *curt, int tree_nu
 }
 else
 {
- low_pos = cur_tpos + curq->low;
- high_pos = cur_tpos + curq->high;
+ low_pos = Min(low_pos + curq->low, PG_UINT16_MAX);
+ high_pos = Min(high_pos + curq->high, PG_UINT16_MAX);
 }

 curq = LQL_NEXT(curq);

I'm not sure what do these checks do.  Code around is uncommented and
puzzled.  But could we guarantee the same invariant on the stage of
ltree/lquery parsing?


Unfortunately, the current code is somewhat undercommented :-(

The main problem is that no one really understands how it works now.

low_pos and high_pos seem to be a range of tree levels, from which is allowed
to match the rest of lquery.

For example, when we are matching '.b' in the 'a.*{2,3}.*{4,5}.b'::lquery,
low_pos = 1 + 2 + 4 = 7  and  high_pos = 1 + 3 + 5 = 9.


The main goal of the patch is to fix calculation of low_pos and high_pos:

- low_pos = cur_tpos + curq->low;
- high_pos = cur_tpos + curq->high;
+ low_pos = low_pos + curq->low;
+ high_pos = high_pos + curq->high;


Anyway, I don't quite understand why we need these caps. It kinda seems
like a band-aid for potential overflow.

Why should it be OK for the values to even get past the maximum, with
sane input data? And isn't there a better upper limit (e.g. based on
how much space we actually allocated)?

We can compare low_pos to tree_numlevel and return false earlier, if it is
greater.  And it seems that high_pos we can also limit to tree_numlevel.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
>From 2bb5d618ca7f0ac3997a2e34e3a06dbd404ca26f Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.glu...@postgrespro.ru>
Date: Thu, 7 Mar 2019 17:45:44 +0300
Subject: [PATCH v2 1/2] Fix max size checking for ltree and lquery

---
 contrib/ltree/expected/ltree.out | 46 ++++++++++++++++++++++++++++++++++++++++
 contrib/ltree/ltree.h            |  2 ++
 contrib/ltree/ltree_io.c         | 44 ++++++++++++++++++++++++++------------
 contrib/ltree/ltree_op.c         |  9 +++++++-
 contrib/ltree/sql/ltree.sql      | 11 ++++++++++
 5 files changed, 98 insertions(+), 14 deletions(-)

diff --git a/contrib/ltree/expected/ltree.out b/contrib/ltree/expected/ltree.out
index 8226930..41e7f94 100644
--- a/contrib/ltree/expected/ltree.out
+++ b/contrib/ltree/expected/ltree.out
@@ -457,6 +457,52 @@ SELECT nlevel('1.2.3.4');
       4
 (1 row)
 
+SELECT nlevel(('1' || repeat('.1', 65534))::ltree);
+ nlevel 
+--------
+  65535
+(1 row)
+
+SELECT nlevel(('1' || repeat('.1', 65535))::ltree);
+ERROR:  number of ltree levels (65536) exceeds the maximum allowed (65535)
+SELECT nlevel(('1' || repeat('.1', 65534))::ltree || '1');
+ERROR:  number of ltree levels (65536) exceeds the maximum allowed (65535)
+SELECT ('1' || repeat('.1', 65534))::lquery IS NULL;
+ ?column? 
+----------
+ f
+(1 row)
+
+SELECT ('1' || repeat('.1', 65535))::lquery IS NULL;
+ERROR:  number of lquery levels (65536) exceeds the maximum allowed (65535)
+SELECT '*{65535}'::lquery;
+  lquery  
+----------
+ *{65535}
+(1 row)
+
+SELECT '*{65536}'::lquery;
+ERROR:  lquery syntax error
+LINE 1: SELECT '*{65536}'::lquery;
+               ^
+DETAIL:  Low limit (65536) exceeds the maximum allowed (65535).
+SELECT '*{,65534}'::lquery;
+  lquery   
+-----------
+ *{,65534}
+(1 row)
+
+SELECT '*{,65535}'::lquery;
+ lquery 
+--------
+ *
+(1 row)
+
+SELECT '*{,65536}'::lquery;
+ERROR:  lquery syntax error
+LINE 1: SELECT '*{,65536}'::lquery;
+               ^
+DETAIL:  High limit (65536) exceeds the maximum allowed (65535).
 SELECT '1.2'::ltree  < '2.2'::ltree;
  ?column? 
 ----------
diff --git a/contrib/ltree/ltree.h b/contrib/ltree/ltree.h
index 366e580..a5608ca 100644
--- a/contrib/ltree/ltree.h
+++ b/contrib/ltree/ltree.h
@@ -25,6 +25,7 @@ typedef struct
 
 #define LTREE_HDRSIZE	MAXALIGN( offsetof(ltree, data) )
 #define LTREE_FIRST(x)	( (ltree_level*)( ((char*)(x))+LTREE_HDRSIZE ) )
+#define LTREE_MAX_LEVELS	PG_UINT16_MAX	/* ltree.numlevel is uint16 */
 
 
 /* lquery */
@@ -77,6 +78,7 @@ typedef struct
 
 #define LQUERY_HDRSIZE	 MAXALIGN( offsetof(lquery, data) )
 #define LQUERY_FIRST(x)   ( (lquery_level*)( ((char*)(x))+LQUERY_HDRSIZE ) )
+#define LQUERY_MAX_LEVELS	PG_UINT16_MAX	/* lquery.numlevel is uint16 */
 
 #define LQUERY_HASNOT		0x01
 
diff --git a/contrib/ltree/ltree_io.c b/contrib/ltree/ltree_io.c
index 900a46a..3d2d1b3 100644
--- a/contrib/ltree/ltree_io.c
+++ b/contrib/ltree/ltree_io.c
@@ -58,11 +58,11 @@ ltree_in(PG_FUNCTION_ARGS)
 		ptr += charlen;
 	}
 
-	if (num + 1 > MaxAllocSize / sizeof(nodeitem))
+	if (num + 1 > LTREE_MAX_LEVELS)
 		ereport(ERROR,
 				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-				 errmsg("number of levels (%d) exceeds the maximum allowed (%d)",
-						num + 1, (int) (MaxAllocSize / sizeof(nodeitem)))));
+				 errmsg("number of ltree levels (%d) exceeds the maximum allowed (%d)",
+						num + 1, LTREE_MAX_LEVELS)));
 	list = lptr = (nodeitem *) palloc(sizeof(nodeitem) * (num + 1));
 	ptr = buf;
 	while (*ptr)
@@ -227,11 +227,11 @@ lquery_in(PG_FUNCTION_ARGS)
 	}
 
 	num++;
-	if (num > MaxAllocSize / ITEMSIZE)
+	if (num > LQUERY_MAX_LEVELS)
 		ereport(ERROR,
 				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-				 errmsg("number of levels (%d) exceeds the maximum allowed (%d)",
-						num, (int) (MaxAllocSize / ITEMSIZE))));
+				 errmsg("number of lquery levels (%d) exceeds the maximum allowed (%d)",
+						num, LQUERY_MAX_LEVELS)));
 	curqlevel = tmpql = (lquery_level *) palloc0(ITEMSIZE * num);
 	ptr = buf;
 	while (*ptr)
@@ -344,7 +344,7 @@ lquery_in(PG_FUNCTION_ARGS)
 			else if (charlen == 1 && t_iseq(ptr, '.'))
 			{
 				curqlevel->low = 0;
-				curqlevel->high = 0xffff;
+				curqlevel->high = LTREE_MAX_LEVELS;
 				curqlevel = NEXTLEV(curqlevel);
 				state = LQPRS_WAITLEVEL;
 			}
@@ -357,7 +357,16 @@ lquery_in(PG_FUNCTION_ARGS)
 				state = LQPRS_WAITSNUM;
 			else if (t_isdigit(ptr))
 			{
-				curqlevel->low = atoi(ptr);
+				int			low = atoi(ptr);
+
+				if (low > PG_UINT16_MAX)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("lquery syntax error"),
+							 errdetail("Low limit (%d) exceeds the maximum allowed (%d).",
+									   low, PG_UINT16_MAX)));
+
+				curqlevel->low = (uint16) low;
 				state = LQPRS_WAITND;
 			}
 			else
@@ -367,12 +376,21 @@ lquery_in(PG_FUNCTION_ARGS)
 		{
 			if (t_isdigit(ptr))
 			{
-				curqlevel->high = atoi(ptr);
+				int			high = atoi(ptr);
+
+				if (high > LTREE_MAX_LEVELS)
+					ereport(ERROR,
+							(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("lquery syntax error"),
+							 errdetail("High limit (%d) exceeds the maximum allowed (%d).",
+									   high, LTREE_MAX_LEVELS)));
+
+				curqlevel->high = high;
 				state = LQPRS_WAITCLOSE;
 			}
 			else if (charlen == 1 && t_iseq(ptr, '}'))
 			{
-				curqlevel->high = 0xffff;
+				curqlevel->high = LTREE_MAX_LEVELS;
 				state = LQPRS_WAITEND;
 			}
 			else
@@ -444,7 +462,7 @@ lquery_in(PG_FUNCTION_ARGS)
 							   lptr->wlen, pos)));
 	}
 	else if (state == LQPRS_WAITOPEN)
-		curqlevel->high = 0xffff;
+		curqlevel->high = LTREE_MAX_LEVELS;
 	else if (state != LQPRS_WAITEND)
 		ereport(ERROR,
 				(errcode(ERRCODE_SYNTAX_ERROR),
@@ -593,7 +611,7 @@ lquery_out(PG_FUNCTION_ARGS)
 			}
 			else if (curqlevel->low == 0)
 			{
-				if (curqlevel->high == 0xffff)
+				if (curqlevel->high == LTREE_MAX_LEVELS)
 				{
 					*ptr = '*';
 					*(ptr + 1) = '\0';
@@ -601,7 +619,7 @@ lquery_out(PG_FUNCTION_ARGS)
 				else
 					sprintf(ptr, "*{,%d}", curqlevel->high);
 			}
-			else if (curqlevel->high == 0xffff)
+			else if (curqlevel->high == LTREE_MAX_LEVELS)
 			{
 				sprintf(ptr, "*{%d,}", curqlevel->low);
 			}
diff --git a/contrib/ltree/ltree_op.c b/contrib/ltree/ltree_op.c
index 070868f..34e6e4b 100644
--- a/contrib/ltree/ltree_op.c
+++ b/contrib/ltree/ltree_op.c
@@ -274,10 +274,17 @@ static ltree *
 ltree_concat(ltree *a, ltree *b)
 {
 	ltree	   *r;
+	int			numlevel = (int) a->numlevel + b->numlevel;
+
+	if (numlevel > LTREE_MAX_LEVELS)
+		ereport(ERROR,
+				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+				 errmsg("number of ltree levels (%d) exceeds the maximum allowed (%d)",
+						numlevel, LTREE_MAX_LEVELS)));
 
 	r = (ltree *) palloc0(VARSIZE(a) + VARSIZE(b) - LTREE_HDRSIZE);
 	SET_VARSIZE(r, VARSIZE(a) + VARSIZE(b) - LTREE_HDRSIZE);
-	r->numlevel = a->numlevel + b->numlevel;
+	r->numlevel = (uint16) numlevel;
 
 	memcpy(LTREE_FIRST(r), LTREE_FIRST(a), VARSIZE(a) - LTREE_HDRSIZE);
 	memcpy(((char *) LTREE_FIRST(r)) + VARSIZE(a) - LTREE_HDRSIZE,
diff --git a/contrib/ltree/sql/ltree.sql b/contrib/ltree/sql/ltree.sql
index 846b04e..fca3ae6 100644
--- a/contrib/ltree/sql/ltree.sql
+++ b/contrib/ltree/sql/ltree.sql
@@ -90,6 +90,17 @@ SELECT '1.*.4|3|2.*{1}'::lquery;
 SELECT 'qwerty%@*.tu'::lquery;
 
 SELECT nlevel('1.2.3.4');
+SELECT nlevel(('1' || repeat('.1', 65534))::ltree);
+SELECT nlevel(('1' || repeat('.1', 65535))::ltree);
+SELECT nlevel(('1' || repeat('.1', 65534))::ltree || '1');
+SELECT ('1' || repeat('.1', 65534))::lquery IS NULL;
+SELECT ('1' || repeat('.1', 65535))::lquery IS NULL;
+SELECT '*{65535}'::lquery;
+SELECT '*{65536}'::lquery;
+SELECT '*{,65534}'::lquery;
+SELECT '*{,65535}'::lquery;
+SELECT '*{,65536}'::lquery;
+
 SELECT '1.2'::ltree  < '2.2'::ltree;
 SELECT '1.2'::ltree  <= '2.2'::ltree;
 SELECT '2.2'::ltree  = '2.2'::ltree;
-- 
2.7.4

>From cc3496ddfc72920e0d546f5e6fe7f76952e6484e Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.glu...@postgrespro.ru>
Date: Thu, 7 Mar 2019 16:47:10 +0300
Subject: [PATCH v2 2/2] Fix successive lquery * ops

---
 contrib/ltree/expected/ltree.out | 24 ++++++++++++++++++++++++
 contrib/ltree/lquery_op.c        | 17 +++++++++++++----
 contrib/ltree/sql/ltree.sql      |  5 ++++-
 3 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/contrib/ltree/expected/ltree.out b/contrib/ltree/expected/ltree.out
index 41e7f94..1b31dbc 100644
--- a/contrib/ltree/expected/ltree.out
+++ b/contrib/ltree/expected/ltree.out
@@ -959,6 +959,30 @@ SELECT 'a.b.c.d.e'::ltree ~ '*.!b.*.!c.*';
  f
 (1 row)
 
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{2}.*{2}';
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{1}.*{2}.e';
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{1}.*{4}';
+ ?column? 
+----------
+ f
+(1 row)
+
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{5}.*';
+ ?column? 
+----------
+ f
+(1 row)
+
 SELECT 'QWER_TY'::ltree ~ 'q%@*';
  ?column? 
 ----------
diff --git a/contrib/ltree/lquery_op.c b/contrib/ltree/lquery_op.c
index bfbcee6..1d2614b 100644
--- a/contrib/ltree/lquery_op.c
+++ b/contrib/ltree/lquery_op.c
@@ -248,8 +248,13 @@ checkCond(lquery_level *curq, int query_numlevel, ltree_level *curt, int tree_nu
 		}
 		else
 		{
-			low_pos = cur_tpos + curq->low;
-			high_pos = cur_tpos + curq->high;
+			low_pos += curq->low;
+
+			if (low_pos > tree_numlevel)
+				return false;
+
+			high_pos = Min(high_pos + curq->high, tree_numlevel);
+
 			if (ptr && ptr->q)
 			{
 				ptr->nq++;
@@ -275,8 +280,12 @@ checkCond(lquery_level *curq, int query_numlevel, ltree_level *curt, int tree_nu
 		}
 		else
 		{
-			low_pos = cur_tpos + curq->low;
-			high_pos = cur_tpos + curq->high;
+			low_pos += curq->low;
+
+			if (low_pos > tree_numlevel)
+				return false;
+
+			high_pos = Min(high_pos + curq->high, tree_numlevel);
 		}
 
 		curq = LQL_NEXT(curq);
diff --git a/contrib/ltree/sql/ltree.sql b/contrib/ltree/sql/ltree.sql
index fca3ae6..1de0b2a 100644
--- a/contrib/ltree/sql/ltree.sql
+++ b/contrib/ltree/sql/ltree.sql
@@ -179,7 +179,10 @@ SELECT 'a.b.c.d.e'::ltree ~ 'a.!b.*{1}.!c.*';
 SELECT 'a.b.c.d.e'::ltree ~ '!b.*{1}.!c.*';
 SELECT 'a.b.c.d.e'::ltree ~ '*.!b.*{1}.!c.*';
 SELECT 'a.b.c.d.e'::ltree ~ '*.!b.*.!c.*';
-
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{2}.*{2}';
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{1}.*{2}.e';
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{1}.*{4}';
+SELECT 'a.b.c.d.e'::ltree ~ 'a.*{5}.*';
 
 SELECT 'QWER_TY'::ltree ~ 'q%@*';
 SELECT 'QWER_TY'::ltree ~ 'Q_t%@*';
-- 
2.7.4

Reply via email to