Re: Improper use about DatumGetInt32

2021-01-19 Thread Peter Eisentraut

On 2021-01-14 09:00, Michael Paquier wrote:

I don't have more comments by reading the code and my tests have
passed after applying the patch on top of df10ac62.  I would have also
added some tests that check after blkno < 0 and > MaxBlockNumber in
all the functions where it can be triggered as that's cheap for 1.8
and 1.9, but that it's a minor point.


committed with some additional tests




Re: Improper use about DatumGetInt32

2021-01-14 Thread Michael Paquier
On Wed, Jan 13, 2021 at 09:27:37AM +0100, Peter Eisentraut wrote:
> Interesting idea.  Here is a patch that incorporates that.

Thanks for adding some coverage.

This patch needs a small rebase as Heikki has just introduced some
functions for gist, bumping the module to 1.9 (no need to bump to
1.10, right?).

I don't have more comments by reading the code and my tests have
passed after applying the patch on top of df10ac62.  I would have also
added some tests that check after blkno < 0 and > MaxBlockNumber in
all the functions where it can be triggered as that's cheap for 1.8
and 1.9, but that it's a minor point.
--
Michael


signature.asc
Description: PGP signature


Re: Improper use about DatumGetInt32

2021-01-13 Thread Peter Eisentraut

On 2021-01-11 09:09, Michael Paquier wrote:

On Sat, Jan 09, 2021 at 01:41:39PM +0100, Peter Eisentraut wrote:

If we had a way to do such testing then we wouldn't need these markers. But
AFAICT, we don't.


Not sure I am following your point here.  Taking your patch, it is
possible to trigger the version of get_raw_page() <= 1.8 just with
something like the following:
create extension pageinspect version "1.8";
select get_raw_page('pg_class', 0);

There are no in-core regression tests that check the compatibility of
extensions with older versions, but it is technically possible to do
so.


Interesting idea.  Here is a patch that incorporates that.
>From 6372a1e50b684976723fa6537d664b28e42371ee Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 13 Jan 2021 09:12:10 +0100
Subject: [PATCH v4] pageinspect: Change block number arguments to bigint

Block numbers are 32-bit unsigned integers.  Therefore, the smallest
SQL integer type that they can fit in is bigint.  However, in the
pageinspect module, most input and output parameters dealing with
block numbers were declared as int.  The behavior with block numbers
larger than a signed 32-bit integer was therefore dubious.  Change
these arguments to type bigint and add some more explicit error
checking on the block range.

(Other contrib modules appear to do this correctly already.)

Since we are changing argument types of existing functions, in order
to not misbehave if the binary is updated before the extension is
updated, we need to create new C symbols for the entry points, similar
to how it's done in other extensions as well.

Reported-by: Ashutosh Bapat 
Reviewed-by: Alvaro Herrera 
Discussion: 
https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
---
 contrib/pageinspect/Makefile  |  5 +-
 contrib/pageinspect/brinfuncs.c   | 13 ++-
 contrib/pageinspect/btreefuncs.c  | 76 +
 contrib/pageinspect/expected/gin.out  |  1 +
 .../pageinspect/expected/oldextversions.out   | 40 +
 contrib/pageinspect/hashfuncs.c   | 11 ++-
 contrib/pageinspect/pageinspect--1.8--1.9.sql | 81 +++
 contrib/pageinspect/pageinspect.control   |  2 +-
 contrib/pageinspect/pageinspect.h |  9 +++
 contrib/pageinspect/rawpage.c | 75 -
 contrib/pageinspect/sql/gin.sql   |  2 +
 contrib/pageinspect/sql/oldextversions.sql| 20 +
 doc/src/sgml/pageinspect.sgml | 12 +--
 13 files changed, 314 insertions(+), 33 deletions(-)
 create mode 100644 contrib/pageinspect/expected/oldextversions.out
 create mode 100644 contrib/pageinspect/pageinspect--1.8--1.9.sql
 create mode 100644 contrib/pageinspect/sql/oldextversions.sql

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index d9d8177116..b78ffceaff 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -12,14 +12,15 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
+DATA =  pageinspect--1.8--1.9.sql \
+   pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
pageinspect--1.0--1.1.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
-REGRESS = page btree brin gin hash checksum
+REGRESS = page btree brin gin hash checksum oldextversions
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index e872f65f01..0e3c2deb66 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -252,7 +252,18 @@ brin_page_items(PG_FUNCTION_ARGS)
int att = attno - 1;
 
values[0] = UInt16GetDatum(offset);
-   values[1] = UInt32GetDatum(dtup->bt_blkno);
+   switch (TupleDescAttr(tupdesc, 1)->atttypid)
+   {
+   case INT8OID:
+   values[1] = Int64GetDatum((int64) 
dtup->bt_blkno);
+   break;
+   case INT4OID:
+   /* support for old extension version */
+   values[1] = 
UInt32GetDatum(dtup->bt_blkno);
+   break;
+   default:
+   elog(ERROR, "incorrect output types");
+   }
values[2] = UInt16GetDatum(attno);
values[3] = 
BoolGetDatum(dtup->bt_columns[att].bv_allnulls);
values[4] = 

Re: Improper use about DatumGetInt32

2021-01-11 Thread Michael Paquier
On Sat, Jan 09, 2021 at 01:41:39PM +0100, Peter Eisentraut wrote:
> If we had a way to do such testing then we wouldn't need these markers. But
> AFAICT, we don't.

Not sure I am following your point here.  Taking your patch, it is
possible to trigger the version of get_raw_page() <= 1.8 just with
something like the following:
create extension pageinspect version "1.8";
select get_raw_page('pg_class', 0);

There are no in-core regression tests that check the compatibility of
extensions with older versions, but it is technically possible to do
so.
--
Michael


signature.asc
Description: PGP signature


Re: Improper use about DatumGetInt32

2021-01-09 Thread Peter Eisentraut

On 2021-01-09 02:46, Michael Paquier wrote:

+/* LCOV_EXCL_START */
Does it really make sense to add those markers here?  It seems to me
that we would ignore any new coverage if regression tests based on
older versions are added (we may really want to have such tests for
more in-core extensions to be able to verify the portability of an
extension, but that's not the job of this patch of course).


If we had a way to do such testing then we wouldn't need these markers. 
But AFAICT, we don't.



-   elog(ERROR, "block 0 is a meta page");
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("block 0 is a meta page")));
[...]
+errmsg("block number %llu is out of range for relation \"%s\"",
This does not follow the usual style for error reports that should not
be written as full sentences?  Maybe something like "invalid block
number %u referring to meta page" and "block number out of range for
relation %s: %llu"?


There are many error messages that say "[something] is out of range".  I 
don't think banning that would serve any purpose.





Re: Improper use about DatumGetInt32

2021-01-08 Thread Michael Paquier
On Fri, Jan 08, 2021 at 04:54:47PM +0100, Peter Eisentraut wrote:
> Updated patch that does that.

Thanks.  Looks sane seen from here.

+/* LCOV_EXCL_START */
Does it really make sense to add those markers here?  It seems to me
that we would ignore any new coverage if regression tests based on
older versions are added (we may really want to have such tests for
more in-core extensions to be able to verify the portability of an
extension, but that's not the job of this patch of course).

-   elog(ERROR, "block 0 is a meta page");
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("block 0 is a meta page")));
[...]
+errmsg("block number %llu is out of range for relation \"%s\"",
This does not follow the usual style for error reports that should not
be written as full sentences?  Maybe something like "invalid block
number %u referring to meta page" and "block number out of range for
relation %s: %llu"?
--
Michael


signature.asc
Description: PGP signature


Re: Improper use about DatumGetInt32

2021-01-08 Thread Peter Eisentraut

On 2021-01-08 10:21, Peter Eisentraut wrote:

I think on 64-bit systems it's actually safe, but on 32-bit systems
(with USE_FLOAT8_BYVAL), if you use the new binaries with the old
SQL-level definitions, you'd get the int4 that is passed in interpreted
as a pointer, which would lead to very bad things.  So I think we need
to create new functions with a different C symbol.  I'll work on that.


Updated patch that does that.
>From d0b802478ef2f5fc4200175c56b1f9b2f712e9ed Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 8 Jan 2021 16:49:07 +0100
Subject: [PATCH v3] pageinspect: Change block number arguments to bigint

Block numbers are 32-bit unsigned integers.  Therefore, the smallest
SQL integer type that they can fit in is bigint.  However, in the
pageinspect module, most input and output parameters dealing with
block numbers were declared as int.  The behavior with block numbers
larger than a signed 32-bit integer was therefore dubious.  Change
these arguments to type bigint and add some more explicit error
checking on the block range.

(Other contrib modules appear to do this correctly already.)

Since we are changing argument types of existing functions, in order
to not misbehave if the binary is updated before the extension is
updated, we need to create new C symbols for the entry points, similar
to how it's done in other extensions as well.

Reported-by: Ashutosh Bapat 
Reviewed-by: Alvaro Herrera 
Discussion: 
https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
---
 contrib/pageinspect/Makefile  |  3 +-
 contrib/pageinspect/brinfuncs.c   | 13 ++-
 contrib/pageinspect/btreefuncs.c  | 84 ++
 contrib/pageinspect/hashfuncs.c   | 11 ++-
 contrib/pageinspect/pageinspect--1.8--1.9.sql | 81 +
 contrib/pageinspect/pageinspect.control   |  2 +-
 contrib/pageinspect/pageinspect.h |  9 ++
 contrib/pageinspect/rawpage.c | 87 ++-
 doc/src/sgml/pageinspect.sgml | 12 +--
 9 files changed, 270 insertions(+), 32 deletions(-)
 create mode 100644 contrib/pageinspect/pageinspect--1.8--1.9.sql

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index d9d8177116..3fcbfea293 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -12,7 +12,8 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
+DATA =  pageinspect--1.8--1.9.sql \
+   pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index e872f65f01..0e3c2deb66 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -252,7 +252,18 @@ brin_page_items(PG_FUNCTION_ARGS)
int att = attno - 1;
 
values[0] = UInt16GetDatum(offset);
-   values[1] = UInt32GetDatum(dtup->bt_blkno);
+   switch (TupleDescAttr(tupdesc, 1)->atttypid)
+   {
+   case INT8OID:
+   values[1] = Int64GetDatum((int64) 
dtup->bt_blkno);
+   break;
+   case INT4OID:
+   /* support for old extension version */
+   values[1] = 
UInt32GetDatum(dtup->bt_blkno);
+   break;
+   default:
+   elog(ERROR, "incorrect output types");
+   }
values[2] = UInt16GetDatum(attno);
values[3] = 
BoolGetDatum(dtup->bt_columns[att].bv_allnulls);
values[4] = 
BoolGetDatum(dtup->bt_columns[att].bv_hasnulls);
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 445605db58..a123955aae 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -41,8 +41,10 @@
 #include "utils/varlena.h"
 
 PG_FUNCTION_INFO_V1(bt_metap);
+PG_FUNCTION_INFO_V1(bt_page_items_1_9);
 PG_FUNCTION_INFO_V1(bt_page_items);
 PG_FUNCTION_INFO_V1(bt_page_items_bytea);
+PG_FUNCTION_INFO_V1(bt_page_stats_1_9);
 PG_FUNCTION_INFO_V1(bt_page_stats);
 
 #define IS_INDEX(r) ((r)->rd_rel->relkind == RELKIND_INDEX)
@@ -160,11 +162,11 @@ GetBTPageStatistics(BlockNumber blkno, Buffer buffer, 
BTPageStat *stat)
  * Usage: SELECT * FROM bt_page_stats('t1_pkey', 1);
  * ---
  */
-Datum
-bt_page_stats(PG_FUNCTION_ARGS)
+static Datum

Re: Improper use about DatumGetInt32

2021-01-08 Thread Peter Eisentraut

On 2020-12-25 08:45, Michael Paquier wrote:

If we really think that we ought to differentiate, then we could do what
pg_stat_statement does, and have a separate C function that's called
with the obsolete signature (pg_stat_statements_1_8 et al).

With the 1.8 flavor, it is possible to pass down a negative number
and it may not fail depending on the number of blocks of the relation,
so I think that you had better have a compatibility layer if a user
has the new binaries but is still on 1.8.  And that's surely a safe
move.


I think on 64-bit systems it's actually safe, but on 32-bit systems 
(with USE_FLOAT8_BYVAL), if you use the new binaries with the old 
SQL-level definitions, you'd get the int4 that is passed in interpreted 
as a pointer, which would lead to very bad things.  So I think we need 
to create new functions with a different C symbol.  I'll work on that.





Re: Improper use about DatumGetInt32

2020-12-24 Thread Michael Paquier
On Fri, Dec 04, 2020 at 03:58:22PM -0300, Alvaro Herrera wrote:
> I don't know if it's possible to determine (at function execution time)
> that we're running with the old extension version; if so it might
> suffice to throw a warning but still have the SQL function run the same
> C function.

Hmm.  You could look after extversion?  Usually we just handle that
with compatibility routines.

> If we really think that we ought to differentiate, then we could do what
> pg_stat_statement does, and have a separate C function that's called
> with the obsolete signature (pg_stat_statements_1_8 et al).

With the 1.8 flavor, it is possible to pass down a negative number
and it may not fail depending on the number of blocks of the relation,
so I think that you had better have a compatibility layer if a user
has the new binaries but is still on 1.8.  And that's surely a safe
move.
--
Michael


signature.asc
Description: PGP signature


Re: Improper use about DatumGetInt32

2020-12-04 Thread Alvaro Herrera
On 2020-Dec-03, Peter Eisentraut wrote:

> On 2020-11-30 16:32, Alvaro Herrera wrote:
> > On 2020-Nov-30, Peter Eisentraut wrote:
> > 
> > > Patch updated this way.  I agree it's better that way.
> > 
> > Thanks, LGTM.
> 
> For a change like this, do we need to change the C symbol names, so that
> there is no misbehavior if the shared library is not updated at the same
> time as the extension is upgraded in SQL?

Good question.  One point is that since the changes to the arguments are
just in the way we read the values from the Datum C-values, there's no
actual ABI change.  So if I understand correctly, there's no danger of a
crash; there's just a danger of misinterpreting a value.

I don't know if it's possible to determine (at function execution time)
that we're running with the old extension version; if so it might
suffice to throw a warning but still have the SQL function run the same
C function.

If we really think that we ought to differentiate, then we could do what
pg_stat_statement does, and have a separate C function that's called
with the obsolete signature (pg_stat_statements_1_8 et al).




Re: Improper use about DatumGetInt32

2020-12-03 Thread Peter Eisentraut

On 2020-11-30 16:32, Alvaro Herrera wrote:

On 2020-Nov-30, Peter Eisentraut wrote:


Patch updated this way.  I agree it's better that way.


Thanks, LGTM.


For a change like this, do we need to change the C symbol names, so that 
there is no misbehavior if the shared library is not updated at the same 
time as the extension is upgraded in SQL?





Re: Improper use about DatumGetInt32

2020-11-30 Thread Alvaro Herrera
On 2020-Nov-30, Peter Eisentraut wrote:

> Patch updated this way.  I agree it's better that way.

Thanks, LGTM.





Re: Improper use about DatumGetInt32

2020-11-30 Thread Peter Eisentraut

On 2020-11-27 13:37, Ashutosh Bapat wrote:

Yeah, I had it like that for a moment, but then you need to duplicate
the check in get_raw_page() and get_raw_page_fork().  I figured since
get_raw_page_internal() does all the other argument checking also, it
seems sensible to put the block range check there too.  But it's not a
big deal either way.


FWIW, my 2c. Though I agree with both sides, I 
prefer get_raw_page_internal() accepting BlockNumber, since that's what 
it deals with and not the entire int8.


Patch updated this way.  I agree it's better that way.
From dc75fc23891cf9e16fb2c718def09823295587c7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 30 Nov 2020 15:48:26 +0100
Subject: [PATCH v2] pageinspect: Change block number arguments to bigint

Block numbers are 32-bit unsigned integers.  Therefore, the smallest
SQL integer type that they can fit in is bigint.  However, in the
pageinspect module, most input and output parameters dealing with
block numbers were declared as int.  The behavior with block numbers
larger than a signed 32-bit integer was therefore dubious.  Change
these arguments to type bigint and add some more explicit error
checking on the block range.

(Other contrib modules appear to do this correctly already.)

Discussion: 
https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
---
 contrib/pageinspect/Makefile  |  3 +-
 contrib/pageinspect/brinfuncs.c   |  2 +-
 contrib/pageinspect/btreefuncs.c  | 40 ++---
 contrib/pageinspect/hashfuncs.c   | 11 ++-
 contrib/pageinspect/pageinspect--1.8--1.9.sql | 81 +++
 contrib/pageinspect/pageinspect.control   |  2 +-
 contrib/pageinspect/rawpage.c | 21 -
 doc/src/sgml/pageinspect.sgml | 12 +--
 8 files changed, 144 insertions(+), 28 deletions(-)
 create mode 100644 contrib/pageinspect/pageinspect--1.8--1.9.sql

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index d9d8177116..3fcbfea293 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -12,7 +12,8 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
+DATA =  pageinspect--1.8--1.9.sql \
+   pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index fb32d74a66..6f452c688d 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -252,7 +252,7 @@ brin_page_items(PG_FUNCTION_ARGS)
int att = attno - 1;
 
values[0] = UInt16GetDatum(offset);
-   values[1] = UInt32GetDatum(dtup->bt_blkno);
+   values[1] = Int64GetDatum((int64) dtup->bt_blkno);
values[2] = UInt16GetDatum(attno);
values[3] = 
BoolGetDatum(dtup->bt_columns[att].bv_allnulls);
values[4] = 
BoolGetDatum(dtup->bt_columns[att].bv_hasnulls);
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 445605db58..5937de3a1c 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -164,7 +164,7 @@ Datum
 bt_page_stats(PG_FUNCTION_ARGS)
 {
text   *relname = PG_GETARG_TEXT_PP(0);
-   uint32  blkno = PG_GETARG_UINT32(1);
+   int64   blkno = PG_GETARG_INT64(1);
Buffer  buffer;
Relationrel;
RangeVar   *relrv;
@@ -197,8 +197,15 @@ bt_page_stats(PG_FUNCTION_ARGS)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot access temporary tables of 
other sessions")));
 
+   if (blkno < 0 || blkno > MaxBlockNumber)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("invalid block number")));
+
if (blkno == 0)
-   elog(ERROR, "block 0 is a meta page");
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("block 0 is a meta page")));
 
CHECK_RELATION_BLOCK_RANGE(rel, blkno);
 
@@ -219,16 +226,16 @@ bt_page_stats(PG_FUNCTION_ARGS)
elog(ERROR, "return type must be a row type");
 
j = 0;
-   values[j++] = psprintf("%d", stat.blkno);
+   values[j++] = psprintf("%u", stat.blkno);
values[j++] = psprintf("%c", stat.type);
-   values[j++] = psprintf("%d", stat.live_items);
-   values[j++] = psprintf("%d", stat.dead_items);
-   

Re: Improper use about DatumGetInt32

2020-11-27 Thread Ashutosh Bapat
On Thu, Nov 26, 2020 at 9:57 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 2020-11-26 14:27, Alvaro Herrera wrote:
> > On 2020-Nov-26, Peter Eisentraut wrote:
> >
> >> The point of the patch is to have the range check somewhere.  If you
> just
> >> cast it, then you won't notice out of range arguments.  Note that other
> >> contrib modules that take block numbers work the same way.
> >
> > I'm not saying not to do that; just saying we should not propagate it to
> > places that don't need it.  get_raw_page gets its page number from
> > PG_GETARG_INT64(), and the range check should be there.  But then it
> > calls get_raw_page_internal, and it could pass a BlockNumber -- there's
> > no need to pass an int64.  So get_raw_page_internal does not need a
> > range check.
>
> Yeah, I had it like that for a moment, but then you need to duplicate
> the check in get_raw_page() and get_raw_page_fork().  I figured since
> get_raw_page_internal() does all the other argument checking also, it
> seems sensible to put the block range check there too.  But it's not a
> big deal either way.
>

FWIW, my 2c. Though I agree with both sides, I
prefer get_raw_page_internal() accepting BlockNumber, since that's what it
deals with and not the entire int8.

--
Best Wishes,
Ashutosh


Re: Improper use about DatumGetInt32

2020-11-27 Thread Ashutosh Bapat
On Wed, Nov 25, 2020 at 8:13 PM Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> On 02.11.2020 18:59, Peter Eisentraut wrote:
> > I have committed 0003.
> >
> > For 0001, normal_rand(), I think you should reject negative arguments
> > with an error.
>
> I've updated 0001. The change is trivial, see attached.
>
> >
> > For 0002, I think you should change the block number arguments to
> > int8, same as other contrib modules do.
> >
> Agree. It will need a bit more work, though. Probably a new version of
> pageinspect contrib, as the public API will change.
> Ashutosh, are you going to continue working on it?
>

Sorry I was away on Diwali vacation so couldn't address Peter's comments in
time. Thanks for taking this further. I will review Peter's patch.

--
Best Wishes,
Ashutosh


Re: Improper use about DatumGetInt32

2020-11-26 Thread Peter Eisentraut

On 2020-11-26 14:27, Alvaro Herrera wrote:

On 2020-Nov-26, Peter Eisentraut wrote:


The point of the patch is to have the range check somewhere.  If you just
cast it, then you won't notice out of range arguments.  Note that other
contrib modules that take block numbers work the same way.


I'm not saying not to do that; just saying we should not propagate it to
places that don't need it.  get_raw_page gets its page number from
PG_GETARG_INT64(), and the range check should be there.  But then it
calls get_raw_page_internal, and it could pass a BlockNumber -- there's
no need to pass an int64.  So get_raw_page_internal does not need a
range check.


Yeah, I had it like that for a moment, but then you need to duplicate 
the check in get_raw_page() and get_raw_page_fork().  I figured since 
get_raw_page_internal() does all the other argument checking also, it 
seems sensible to put the block range check there too.  But it's not a 
big deal either way.





Re: Improper use about DatumGetInt32

2020-11-26 Thread Alvaro Herrera
On 2020-Nov-26, Peter Eisentraut wrote:

> The point of the patch is to have the range check somewhere.  If you just
> cast it, then you won't notice out of range arguments.  Note that other
> contrib modules that take block numbers work the same way.

I'm not saying not to do that; just saying we should not propagate it to
places that don't need it.  get_raw_page gets its page number from
PG_GETARG_INT64(), and the range check should be there.  But then it
calls get_raw_page_internal, and it could pass a BlockNumber -- there's
no need to pass an int64.  So get_raw_page_internal does not need a
range check.




Re: Improper use about DatumGetInt32

2020-11-26 Thread Peter Eisentraut

On 2020-11-25 20:04, Alvaro Herrera wrote:

On 2020-Nov-25, Peter Eisentraut wrote:


  bt_page_stats(PG_FUNCTION_ARGS)
  {
text   *relname = PG_GETARG_TEXT_PP(0);
-   uint32  blkno = PG_GETARG_UINT32(1);
+   int64   blkno = PG_GETARG_INT64(1);


As a matter of style, I think it'd be better to have an int64 variable
that gets the value from PG_GETARG_INT64(), then you cast that to
another variable that's a BlockNumber and use that throughout the rest
of the code.  So you'd avoid changes like this:


  static bytea *get_raw_page_internal(text *relname, ForkNumber forknum,
-   
BlockNumber blkno);
+   int64 
blkno);


where the previous coding was correct, and the new one is dubious and it
forces you to add unnecessary range checks in that function:


@@ -144,11 +144,16 @@ get_raw_page_internal(text *relname, ForkNumber forknum, 
BlockNumber blkno)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot access temporary tables of other 
sessions")));
  
+	if (blkno < 0 || blkno > MaxBlockNumber)

+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("invalid block number")));
+


The point of the patch is to have the range check somewhere.  If you 
just cast it, then you won't notice out of range arguments.  Note that 
other contrib modules that take block numbers work the same way.





Re: Improper use about DatumGetInt32

2020-11-25 Thread Alvaro Herrera
On 2020-Nov-25, Peter Eisentraut wrote:

>  bt_page_stats(PG_FUNCTION_ARGS)
>  {
>   text   *relname = PG_GETARG_TEXT_PP(0);
> - uint32  blkno = PG_GETARG_UINT32(1);
> + int64   blkno = PG_GETARG_INT64(1);

As a matter of style, I think it'd be better to have an int64 variable
that gets the value from PG_GETARG_INT64(), then you cast that to
another variable that's a BlockNumber and use that throughout the rest
of the code.  So you'd avoid changes like this:

>  static bytea *get_raw_page_internal(text *relname, ForkNumber forknum,
> - 
> BlockNumber blkno);
> + int64 
> blkno);

where the previous coding was correct, and the new one is dubious and it
forces you to add unnecessary range checks in that function:

> @@ -144,11 +144,16 @@ get_raw_page_internal(text *relname, ForkNumber 
> forknum, BlockNumber blkno)
>   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>errmsg("cannot access temporary tables of 
> other sessions")));
>  
> + if (blkno < 0 || blkno > MaxBlockNumber)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +  errmsg("invalid block number")));
> +





Re: Improper use about DatumGetInt32

2020-11-25 Thread Peter Eisentraut

On 2020-11-02 16:59, Peter Eisentraut wrote:

I have committed 0003.

For 0001, normal_rand(), I think you should reject negative arguments
with an error.


I have committed a fix for that.


For 0002, I think you should change the block number arguments to int8,
same as other contrib modules do.


Looking further into this, almost all of pageinspect needs to be updated 
to handle block numbers larger than INT_MAX correctly.  Attached is a 
patch for this.  It is meant to work like other contrib modules, such as 
pg_freespace and pg_visibility.  I haven't tested this much yet.
From 30c0e36a1f19b2d09eb81a26949c7793167084e8 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 25 Nov 2020 17:13:46 +0100
Subject: [PATCH] pageinspect: Change block number arguments to bigint

---
 contrib/pageinspect/Makefile  |  3 +-
 contrib/pageinspect/brinfuncs.c   |  2 +-
 contrib/pageinspect/btreefuncs.c  | 40 ++---
 contrib/pageinspect/hashfuncs.c   | 11 ++-
 contrib/pageinspect/pageinspect--1.8--1.9.sql | 81 +++
 contrib/pageinspect/pageinspect.control   |  2 +-
 contrib/pageinspect/rawpage.c | 24 --
 doc/src/sgml/pageinspect.sgml | 12 +--
 8 files changed, 143 insertions(+), 32 deletions(-)
 create mode 100644 contrib/pageinspect/pageinspect--1.8--1.9.sql

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index d9d8177116..3fcbfea293 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -12,7 +12,8 @@ OBJS = \
rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
+DATA =  pageinspect--1.8--1.9.sql \
+   pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index fb32d74a66..6f452c688d 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -252,7 +252,7 @@ brin_page_items(PG_FUNCTION_ARGS)
int att = attno - 1;
 
values[0] = UInt16GetDatum(offset);
-   values[1] = UInt32GetDatum(dtup->bt_blkno);
+   values[1] = Int64GetDatum((int64) dtup->bt_blkno);
values[2] = UInt16GetDatum(attno);
values[3] = 
BoolGetDatum(dtup->bt_columns[att].bv_allnulls);
values[4] = 
BoolGetDatum(dtup->bt_columns[att].bv_hasnulls);
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 445605db58..5937de3a1c 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -164,7 +164,7 @@ Datum
 bt_page_stats(PG_FUNCTION_ARGS)
 {
text   *relname = PG_GETARG_TEXT_PP(0);
-   uint32  blkno = PG_GETARG_UINT32(1);
+   int64   blkno = PG_GETARG_INT64(1);
Buffer  buffer;
Relationrel;
RangeVar   *relrv;
@@ -197,8 +197,15 @@ bt_page_stats(PG_FUNCTION_ARGS)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot access temporary tables of 
other sessions")));
 
+   if (blkno < 0 || blkno > MaxBlockNumber)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("invalid block number")));
+
if (blkno == 0)
-   elog(ERROR, "block 0 is a meta page");
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("block 0 is a meta page")));
 
CHECK_RELATION_BLOCK_RANGE(rel, blkno);
 
@@ -219,16 +226,16 @@ bt_page_stats(PG_FUNCTION_ARGS)
elog(ERROR, "return type must be a row type");
 
j = 0;
-   values[j++] = psprintf("%d", stat.blkno);
+   values[j++] = psprintf("%u", stat.blkno);
values[j++] = psprintf("%c", stat.type);
-   values[j++] = psprintf("%d", stat.live_items);
-   values[j++] = psprintf("%d", stat.dead_items);
-   values[j++] = psprintf("%d", stat.avg_item_size);
-   values[j++] = psprintf("%d", stat.page_size);
-   values[j++] = psprintf("%d", stat.free_size);
-   values[j++] = psprintf("%d", stat.btpo_prev);
-   values[j++] = psprintf("%d", stat.btpo_next);
-   values[j++] = psprintf("%d", (stat.type == 'd') ? stat.btpo.xact : 
stat.btpo.level);
+   values[j++] = psprintf("%u", stat.live_items);
+   values[j++] = psprintf("%u", stat.dead_items);
+   values[j++] = psprintf("%u", stat.avg_item_size);
+   values[j++] = psprintf("%u", stat.page_size);
+   values[j++] = psprintf("%u", 

Re: Improper use about DatumGetInt32

2020-11-25 Thread Anastasia Lubennikova

On 02.11.2020 18:59, Peter Eisentraut wrote:

I have committed 0003.

For 0001, normal_rand(), I think you should reject negative arguments 
with an error.


I've updated 0001. The change is trivial, see attached.



For 0002, I think you should change the block number arguments to 
int8, same as other contrib modules do.


Agree. It will need a bit more work, though. Probably a new version of 
pageinspect contrib, as the public API will change.

Ashutosh, are you going to continue working on it?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit b42df63c757bf5ff715052a401aba37691a7e2b8
Author: anastasia 
Date:   Wed Nov 25 17:19:33 2020 +0300

Handle negative number of tuples passed to normal_rand()

The function converts the first argument i.e. the number of tuples to
return into an unsigned integer which turns out to be huge number when a
negative value is passed. This causes the function to take much longer
time to execute. Instead throw an error about invalid parameter value.

While at it, improve SQL test to test the number of tuples returned by
this function.

Authors: Ashutosh Bapat, Anastasia Lubennikova

diff --git a/contrib/tablefunc/expected/tablefunc.out b/contrib/tablefunc/expected/tablefunc.out
index fffadc6e1b4..97bfd690841 100644
--- a/contrib/tablefunc/expected/tablefunc.out
+++ b/contrib/tablefunc/expected/tablefunc.out
@@ -3,10 +3,20 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
- avg 
--
- 250
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+ avg | count 
+-+---
+ 250 |   100
+(1 row)
+
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
+ERROR:  invalid number of tuples: -1
+-- zero number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(0, 250, 0.2);
+ avg | count 
+-+---
+ | 0
 (1 row)
 
 --
diff --git a/contrib/tablefunc/sql/tablefunc.sql b/contrib/tablefunc/sql/tablefunc.sql
index ec375b05c63..5341c005f91 100644
--- a/contrib/tablefunc/sql/tablefunc.sql
+++ b/contrib/tablefunc/sql/tablefunc.sql
@@ -4,7 +4,11 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
+-- zero number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(0, 250, 0.2);
 
 --
 -- crosstab()
diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c
index 02f02eab574..c6d32d46f01 100644
--- a/contrib/tablefunc/tablefunc.c
+++ b/contrib/tablefunc/tablefunc.c
@@ -174,6 +174,7 @@ normal_rand(PG_FUNCTION_ARGS)
 	FuncCallContext *funcctx;
 	uint64		call_cntr;
 	uint64		max_calls;
+	int32		num_tuples;
 	normal_rand_fctx *fctx;
 	float8		mean;
 	float8		stddev;
@@ -193,7 +194,14 @@ normal_rand(PG_FUNCTION_ARGS)
 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
 		/* total number of tuples to be returned */
-		funcctx->max_calls = PG_GETARG_UINT32(0);
+		num_tuples = PG_GETARG_INT32(0);
+
+		if (num_tuples < 0)
+			ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid number of tuples: %d", num_tuples)));
+
+		funcctx->max_calls = num_tuples;
 
 		/* allocate memory for user context */
 		fctx = (normal_rand_fctx *) palloc(sizeof(normal_rand_fctx));


Re: Improper use about DatumGetInt32

2020-11-02 Thread Peter Eisentraut

I have committed 0003.

For 0001, normal_rand(), I think you should reject negative arguments 
with an error.


For 0002, I think you should change the block number arguments to int8, 
same as other contrib modules do.





Re: Improper use about DatumGetInt32

2020-10-22 Thread Ashutosh Bapat
On Fri, 16 Oct 2020 at 19:26, Alvaro Herrera 
wrote:

> On 2020-Sep-23, Ashutosh Bapat wrote:
>
> > > You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> > > the right thing.
> >
> > There is DatumGetTransactionId() which should be used instead.
> > That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
> > there but only defined in xid.c. So pg_xact_commit_timestamp(),
> > pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
> > PG_GETARG_UNIT32. IMO those should be changed to use
> > PG_GETARG_TRANSACTIONID. That would require moving
> > PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
> > other PG_GETARG_* are.
>
> Hmm, yeah, I think this would be a good idea.
>

The patch 0003 does that.


>
> > get_raw_page() also does similar thing but the effect is not as dangerous
> > SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
> >   ERROR:  block number 4294967295 is out of range for relation "test1"
> > Similarly for bt_page_stats() and bt_page_items()
>
> Hmm, but page numbers above signed INT_MAX are valid.  So this would
> prevent reading all legitimate pages past that.
>
>
According to https://www.postgresql.org/docs/12/datatype-numeric.html,
these functions shouldn't be accepting values higher than INT_MAX since
it's outside the integer data type range. But may be it's a convenient way
to avoid using bigint. Anyway those changes are separate in 0002 patch
which can be discarded as a whole. But for now I am keeping it in the bunch.

-- 
Best Wishes,
Ashutosh
From c81e432aed62b04fcd7fb6f60f96c76a6a946e4a Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Thu, 22 Oct 2020 15:46:14 +0530
Subject: [PATCH 3/3] Extern'alize PG_GETARG_TRANSACTIONID and
 PG_RETURN_TRANSACTIONID

We use PG_GETARG_UINT32 to extract transaction id from an input argument
in UDFs. Instead extern'alize PG_GETARG_TRANSACTIONID defined in xid.c
and use it.

PG_RETURN_TRANSACTIONID has no users outside xid.c. But extern'alized
that too to be symmetric with PG_GETARG_TRANSACTIONID and in case a need
arised in future.

Ashutosh Bapat
---
 src/backend/access/transam/commit_ts.c | 4 ++--
 src/backend/access/transam/multixact.c | 2 +-
 src/backend/utils/adt/xid.c| 2 --
 src/include/fmgr.h | 2 ++
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index cb8a968801..2fe551f17e 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -404,7 +404,7 @@ error_commit_ts_disabled(void)
 Datum
 pg_xact_commit_timestamp(PG_FUNCTION_ARGS)
 {
-	TransactionId xid = PG_GETARG_UINT32(0);
+	TransactionId xid = PG_GETARG_TRANSACTIONID(0);
 	TimestampTz ts;
 	bool		found;
 
@@ -481,7 +481,7 @@ pg_last_committed_xact(PG_FUNCTION_ARGS)
 Datum
 pg_xact_commit_timestamp_origin(PG_FUNCTION_ARGS)
 {
-	TransactionId xid = PG_GETARG_UINT32(0);
+	TransactionId xid = PG_GETARG_TRANSACTIONID(0);
 	RepOriginId nodeid;
 	TimestampTz ts;
 	Datum		values[2];
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 6ccdc5b58c..4338c664fb 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3320,7 +3320,7 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
 		int			nmembers;
 		int			iter;
 	} mxact;
-	MultiXactId mxid = PG_GETARG_UINT32(0);
+	MultiXactId mxid = PG_GETARG_TRANSACTIONID(0);
 	mxact	   *multi;
 	FuncCallContext *funccxt;
 
diff --git a/src/backend/utils/adt/xid.c b/src/backend/utils/adt/xid.c
index 20389aff1d..0bccfa4755 100644
--- a/src/backend/utils/adt/xid.c
+++ b/src/backend/utils/adt/xid.c
@@ -23,8 +23,6 @@
 #include "utils/builtins.h"
 #include "utils/xid8.h"
 
-#define PG_GETARG_TRANSACTIONID(n)	DatumGetTransactionId(PG_GETARG_DATUM(n))
-#define PG_RETURN_TRANSACTIONID(x)	return TransactionIdGetDatum(x)
 
 #define PG_GETARG_COMMANDID(n)		DatumGetCommandId(PG_GETARG_DATUM(n))
 #define PG_RETURN_COMMANDID(x)		return CommandIdGetDatum(x)
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index f25068fae2..ce37e342cd 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -276,6 +276,7 @@ extern struct varlena *pg_detoast_datum_packed(struct varlena *datum);
 #define PG_GETARG_POINTER(n) DatumGetPointer(PG_GETARG_DATUM(n))
 #define PG_GETARG_CSTRING(n) DatumGetCString(PG_GETARG_DATUM(n))
 #define PG_GETARG_NAME(n)	 DatumGetName(PG_GETARG_DATUM(n))
+#define PG_GETARG_TRANSACTIONID(n)	DatumGetTransactionId(PG_GETARG_DATUM(n))
 /* these macros hide the pass-by-reference-ness of the datatype: */
 #define PG_GETARG_FLOAT4(n)  DatumGetFloat4(PG_GETARG_DATUM(n))
 #define PG_GETARG_FLOAT8(n)  DatumGetFloat8(PG_GETARG_DATUM(n))
@@ -360,6 +361,7 @@ extern struct varlena *pg_detoast_datum_packed(struct varlena *datum);
 #define PG_RETURN_POINTER(x) return PointerGetDatum(x)
 #define PG_RETURN_CSTRING(x) return 

Re: Improper use about DatumGetInt32

2020-10-16 Thread Alvaro Herrera
On 2020-Sep-23, Ashutosh Bapat wrote:

> > You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> > the right thing.
> 
> There is DatumGetTransactionId() which should be used instead.
> That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
> there but only defined in xid.c. So pg_xact_commit_timestamp(),
> pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
> PG_GETARG_UNIT32. IMO those should be changed to use
> PG_GETARG_TRANSACTIONID. That would require moving
> PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
> other PG_GETARG_* are.

Hmm, yeah, I think this would be a good idea.

> get_raw_page() also does similar thing but the effect is not as dangerous
> SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
>   ERROR:  block number 4294967295 is out of range for relation "test1"
> Similarly for bt_page_stats() and bt_page_items()

Hmm, but page numbers above signed INT_MAX are valid.  So this would
prevent reading all legitimate pages past that.





Re: Improper use about DatumGetInt32

2020-09-23 Thread Ashutosh Bapat
On Wed, Sep 23, 2020 at 1:41 AM Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Mon, Sep 21, 2020 at 3:53 PM Andres Freund  wrote:
> >> I think we mostly use it for the few places where we currently expose
> >> data as a signed integer on the SQL level, but internally actually treat
> >> it as a unsigned data.
>
> > So why is the right solution to that not DatumGetInt32() + a cast to uint32?
>
> You're ignoring the xid use-case, for which DatumGetUInt32 actually is
> the right thing.

There is DatumGetTransactionId() which should be used instead.
That made me search if there's PG_GETARG_TRANSACTIONID() and yes it's
there but only defined in xid.c. So pg_xact_commit_timestamp(),
pg_xact_commit_timestamp_origin() and pg_get_multixact_members() use
PG_GETARG_UNIT32. IMO those should be changed to use
PG_GETARG_TRANSACTIONID. That would require moving
PG_GETARG_TRANSACTIONID somewhere outside xid.c; may be fmgr.h where
other PG_GETARG_* are.

> I tend to agree though that if the SQL argument is
> of a signed type, the least API-abusing answer is a signed DatumGetXXX
> macro followed by whatever cast you need.
>

I looked for some uses of PG_GETARG_UNIT32() which is the counterpart
of DatumGetUint32(). Found some buggy usages apart from the ones which
can be converted to PG_GETARG_TRANSACTIONID listed above.
normal_rand() for example returns a huge number of rows and takes
forever if we pass a negative first argument to it. Someone could
misuse that for a DOS attack or it could be just an accident that they
pass a negative value to that function and the query takes forever.
explain analyze select count(*) from normal_rand(-100, 1.0, 1.0);
   QUERY
PLAN
-
 Aggregate  (cost=12.50..12.51 rows=1 width=8) (actual
time=2077574.718..2077574.719 rows=1 loops=1)
   ->  Function Scan on normal_rand  (cost=0.00..10.00 rows=1000
width=0) (actual time=1005176.149..1729994.366 rows=4293967296
loops=1)
 Planning Time: 0.346 ms
 Execution Time: 2079034.835 ms

get_raw_page() also does similar thing but the effect is not as dangerous
SELECT octet_length(get_raw_page('test1', 'main', -1)) AS main_1;
  ERROR:  block number 4294967295 is out of range for relation "test1"
Similarly for bt_page_stats() and bt_page_items()

PFA patches to correct those.

There's Oracle compatible chr() which also uses PG_GETARG_UINT32() but
it's (accidentally?) reporting the negative inputs correctly because
it filters out very large values and reports those using %d. It's
arguable whether we should change that, so I have left it untouched.
But I think we should change that as well and get rid of
PG_GETARG_UNIT32 altogether. This will prevent any future misuse.


--
Best Wishes,
Ashutosh Bapat
From e13eeafc452a977a1e5b9e6c51b043f00dd2d5c7 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Tue, 22 Sep 2020 19:00:56 +0530
Subject: [PATCH 1/2] Handle negative number of tuples passed to normal_rand()

The function converts the first argument i.e. the number of tuples to
return into an unsigned integer which turns out to be huge number when a
negative value is passed. This causes the function to take much longer
time to execute. Instead return no rows in that case.

While at it, improve SQL test to test the number of tuples returned by
this function.

Ashutosh Bapat
---
 contrib/tablefunc/expected/tablefunc.out | 15 +++
 contrib/tablefunc/sql/tablefunc.sql  |  4 +++-
 contrib/tablefunc/tablefunc.c|  4 +++-
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/contrib/tablefunc/expected/tablefunc.out b/contrib/tablefunc/expected/tablefunc.out
index fffadc6e1b..1c9ecef52c 100644
--- a/contrib/tablefunc/expected/tablefunc.out
+++ b/contrib/tablefunc/expected/tablefunc.out
@@ -3,10 +3,17 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
- avg 
--
- 250
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+ avg | count 
+-+---
+ 250 |   100
+(1 row)
+
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
+ avg | count 
+-+---
+ | 0
 (1 row)
 
 --
diff --git a/contrib/tablefunc/sql/tablefunc.sql b/contrib/tablefunc/sql/tablefunc.sql
index ec375b05c6..02e8a98c73 100644
--- a/contrib/tablefunc/sql/tablefunc.sql
+++ b/contrib/tablefunc/sql/tablefunc.sql
@@ -4,7 +4,9 @@ CREATE EXTENSION tablefunc;
 -- normal_rand()
 -- no easy way to do this for regression testing
 --
-SELECT avg(normal_rand)::int FROM normal_rand(100, 250, 0.2);
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
+-- negative number of tuples
+SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
 
 --
 -- 

Re: Improper use about DatumGetInt32

2020-09-22 Thread Tom Lane
Robert Haas  writes:
> On Mon, Sep 21, 2020 at 3:53 PM Andres Freund  wrote:
>> I think we mostly use it for the few places where we currently expose
>> data as a signed integer on the SQL level, but internally actually treat
>> it as a unsigned data.

> So why is the right solution to that not DatumGetInt32() + a cast to uint32?

You're ignoring the xid use-case, for which DatumGetUInt32 actually is
the right thing.  I tend to agree though that if the SQL argument is
of a signed type, the least API-abusing answer is a signed DatumGetXXX
macro followed by whatever cast you need.

regards, tom lane




Re: Improper use about DatumGetInt32

2020-09-22 Thread Robert Haas
On Mon, Sep 21, 2020 at 3:53 PM Andres Freund  wrote:
> On 2020-09-21 14:08:22 -0400, Robert Haas wrote:
> > There is no SQL type corresponding to the C data type uint32, so I'm
> > not sure why we even have DatumGetUInt32.  I'm sort of suspicious that
> > there's some fuzzy thinking going on there.
>
> I think we mostly use it for the few places where we currently expose
> data as a signed integer on the SQL level, but internally actually treat
> it as a unsigned data.

So why is the right solution to that not DatumGetInt32() + a cast to uint32?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Improper use about DatumGetInt32

2020-09-21 Thread Andres Freund
Hi,

On 2020-09-21 14:08:22 -0400, Robert Haas wrote:
> There is no SQL type corresponding to the C data type uint32, so I'm
> not sure why we even have DatumGetUInt32.  I'm sort of suspicious that
> there's some fuzzy thinking going on there.

I think we mostly use it for the few places where we currently expose
data as a signed integer on the SQL level, but internally actually treat
it as a unsigned data. There's not a lot of those, but there e.g. is
pg_class.relpages.  There also may be places where we use it for
functions that can be created but not called from SQL (using the
INTERNAL type).

- Andres




Re: Improper use about DatumGetInt32

2020-09-21 Thread Tom Lane
Robert Haas  writes:
> Typically, the DatumGetBlah() function that you pick should match the
> SQL data type that the function is returning. So if the function
> returns pg_catalog.int4, which corresponds to the C data type int32,
> you would use DatumGetInt32. There is no SQL type corresponding to the
> C data type uint32, so I'm not sure why we even have DatumGetUInt32.

xid?

regards, tom lane




Re: Improper use about DatumGetInt32

2020-09-21 Thread Robert Haas
On Sun, Sep 20, 2020 at 9:17 PM Hou, Zhijie  wrote:
> In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 
> type.
> Is it more appropriate to use DatumGetUInt32 here?

Typically, the DatumGetBlah() function that you pick should match the
SQL data type that the function is returning. So if the function
returns pg_catalog.int4, which corresponds to the C data type int32,
you would use DatumGetInt32. There is no SQL type corresponding to the
C data type uint32, so I'm not sure why we even have DatumGetUInt32.
I'm sort of suspicious that there's some fuzzy thinking going on
there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Improper use about DatumGetInt32

2020-09-20 Thread Bharath Rupireddy
On Mon, Sep 21, 2020 at 6:47 AM Hou, Zhijie  wrote:
>
> In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 
> type.
> Is it more appropriate to use DatumGetUInt32 here?
>

Makes sense. +1 for the patch. I think with the existing code also we
don't have any problem. If we assume that the hash functions return
uint32, with DatumGetInt32() we are typecasting that uint32 result to
int32, we are assigning it to uint32 i.e. typecasting int32 back to
uint32. Eventually, I think, we will see the proper value in hashVal.
I did a small experiment to prove this [1].

uint32hashVal;
hashVal = DatumGetInt32(FunctionCall1Coll(>hashFn[attno],
state->collations[attno], value));

It's good to run a few test cases/test suites(if they exist) that hit
this part of the code, just to ensure we don't break anything.

[1]
int main()
{
unsigned int u = 3 * 1024 * 1024 * 1024;

printf("%u\n", u);
int i = u;
printf("%d\n", i);
unsigned int u1 = i;
printf("%u\n", u1);

return 0;
}
Output of the above snippet:
3221225472
-1073741824
3221225472

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Improper use about DatumGetInt32

2020-09-20 Thread Hou, Zhijie
Hi

In (/contrib/bloom/blutils.c:277), I found it use DatumGetInt32 to get UInt32 
type.
Is it more appropriate to use DatumGetUInt32 here?

See the attachment for the patch

Bes regards,
houzj





0001-DatumGetUInt32.patch
Description: 0001-DatumGetUInt32.patch