Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Michael Paquier
On Wed, Nov 25, 2015 at 10:06 PM, Teodor Sigaev  wrote:

> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +   int bits_len =
> +   ((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 +
> 1) * 8;
>
> As I understand offline comments of Nikolay, current version of page
> inspect contains an mistake here. Should we backpatch this?
>

As far as I understood from the code, the current code in pageinspect is
overestimating the length of t_bits. So that's actually harmless. For the
sake of this patch though this is needed to perform the sanity checks in
place when scanning raw page entries.
-- 
Michael


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Nikolay Shaplov
В письме от 25 ноября 2015 22:27:57 пользователь Michael Paquier написал:
> On Wed, Nov 25, 2015 at 10:16 PM, Nikolay Shaplov 
> wrote:
> > В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:
> > Teodor Sigaev asked a very good question: does it properly do upgrade from
> > 1.3
> > to 1.4
> > 
> > I've rechecked and fixed
> 
> What was actually the problem? 

tuple_data_split should be defined before heap_page_item_attrs.

And there were also error in the first argument of tuple_data_split there. It 
should be  "rel_oid oid" instead of "t_oid rel_oid" 

> I have to admit that I forgot to test that
> directly, and did not spot anything obvious on the 1.3--1.4.sql file.
yes. but each of us added a non-obvious mistake there :-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Michael Paquier
On Wed, Nov 25, 2015 at 10:16 PM, Nikolay Shaplov 
wrote:

> В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:
> Teodor Sigaev asked a very good question: does it properly do upgrade from
> 1.3
> to 1.4
>
> I've rechecked and fixed
>

What was actually the problem? I have to admit that I forgot to test that
directly, and did not spot anything obvious on the 1.3--1.4.sql file.
-- 
Michael


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Teodor Sigaev

Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3
to 1.4

I've rechecked and fixed

here is a patch.


Committed, thank you.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Teodor Sigaev

-   bits_len = tuphdr->t_hoff -
-   offsetof(HeapTupleHeaderData, t_bits);
+   int bits_len =
+   ((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 
8;

As I understand offline comments of Nikolay, current version of page inspect 
contains an mistake here. Should we backpatch this?



OK. I have switched the status of this patch to "Ready for committer"
(please, committer-san, double-check the area around
tuple_data_split_internal when fetching data for each attribute, I
think that we got that right but I may be missing something as well).

Looks good for a first glance


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Nikolay Shaplov
В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:

Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3 
to 1.4

I've rechecked and fixed

here is a patch.

> On Wed, Nov 18, 2015 at 3:10 AM, Nikolay Shaplov wrote:
> > Everything seems to be ok. I've changed only one thing in your version
> > of the patch. I've renamed split_tuple_data to
> > tuple_data_split_internal, because when there are split_tuple_data and
> > tuple_data_split in the same file, nobody will guess what the difference
> > between these function and why are they named in such a strange way.
> 
> Yep, that sounds better this way.
> 
> > If it is ok, set proper status, and let commiter do his job :-)
> 
> OK. I have switched the status of this patch to "Ready for committer"
> (please, committer-san, double-check the area around
> tuple_data_split_internal when fetching data for each attribute, I
> think that we got that right but I may be missing something as well).

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..546a051 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -180,11 +228,11 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
-	bits_len = tuphdr->t_hoff -
-		offsetof(HeapTupleHeaderData, t_bits);
+	int	bits_len =
+			((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
 
 	values[11] = CStringGetTextDatum(
- bits_to_text(tuphdr->t_bits, bits_len * 8));
+ bits_to_text(tuphdr->t_bits, bits_len));
 }
 else
 	nulls[11] = true;
@@ -208,7 +256,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +271,205 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-17 Thread Nikolay Shaplov

> >> I still have an opinion that documentation should be more verbose, than
> >> your version, but I can accept your version.
> > 
> > I am not sure that's necessary, pageinspect is for developers.
> > 
> >> Who is going to add heap_page_item_attrs to your patch? me or you?
> > 
> > I recall some parts of the code I still did not like much :) I'll grab
> > some room to have an extra look at it.
> 
> I have added back heap_page_item_attrs the patch attached, fixing at
> the same time some typos found while looking again at the code. If you
> are fine with this version, let's have a committer look at it.

Everything seems to be ok. I've changed only one thing in your version
of the patch. I've renamed split_tuple_data to
tuple_data_split_internal, because when there are split_tuple_data and
tuple_data_split in the same file, nobody will guess what the difference
between these function and why are they named in such a strange way.

If it is ok, set proper status, and let commiter do his job :-)



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..546a051 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -180,11 +228,11 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
-	bits_len = tuphdr->t_hoff -
-		offsetof(HeapTupleHeaderData, t_bits);
+	int	bits_len =
+			((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
 
 	values[11] = CStringGetTextDatum(
- bits_to_text(tuphdr->t_bits, bits_len * 8));
+ bits_to_text(tuphdr->t_bits, bits_len));
 }
 else
 	nulls[11] = true;
@@ -208,7 +256,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +271,205 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-17 Thread Guillaume Lelarge
Hi,

Le 12 nov. 2015 1:05 AM, "Michael Paquier"  a
écrit :
>
> On Thu, Nov 12, 2015 at 12:41 AM, Nikolay Shaplov
>  wrote:
> > В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier
написал:
> >> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> >> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
> >> >> Or it's ready to commit, and just not marked this way?
> >> >
> >> > No, I don't think we have reached this state yet.
> >> >
> >> >> I am going to make report based on this patch in Vienna. It would be
> >> >> nice to have it committed until then :)
> >> >
> >> > This is registered in this November's CF and the current one is not
> >> > completely wrapped up, so I haven't rushed into looking at it.
> >>
> >> So, I have finally been able to look at this patch in more details,
> >> resulting in the attached. I noticed a couple of things that should be
> >> addressed, mainly:
> >> - addition of a new routine text_to_bits to perform the reverse
> >> operation of bits_to_text. This was previously part of
> >> tuple_data_split, I think that it deserves its own function.
> >> - split_tuple_data should be static
> >> - t_bits_str should not be a text, rather a char* fetched using
> >> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
> >> perform extra calculations with VARSIZE and VARHDRSZ
> >> - split_tuple_data can directly use the relation OID instead of the
> >> tuple descriptor of the relation
> >> - t_bits was leaking memory. For correctness I think that it is better
> >> to free it after calling split_tuple_data.
> >> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
> >> well in raw_attr actually. I refactored the code such as a bytea* is
> >> used and always freed when allocated to avoid leaks. Removing raw_attr
> >> actually simplified the code a bit.
> >> - I simplified the docs, that was largely too verbose in my opinion.
> >> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
> >> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
> >> me, those other ones are much more low-level and not really spread in
> >> the backend code.
> >> - Found some typos in the code, the docs and some comments. I reworked
> >> the error messages as well.
> >> - The code format was not really in line with the project guidelines.
> >> I fixed that as well.
> >> - I removed heap_page_item_attrs for now to get this patch in a
> >> bare-bone state. Though I would not mind if this is re-added (I
> >> personally don't think that's much necessary in the module
> >> actually...).
> >> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
> >> more correct as you mentioned earlier, so I let it in its state.
> >> That's actually more accurate for error handling as well.
> >> That's everything I recall I have. How does this look?
> > You've completely rewrite everything ;-)
> >
> > Let everything be the way you wrote. This code is better than mine.
> >
> > With one exception. I really  need heap_page_item_attrs function. I
used it in
> > my Tuples Internals presentation
> > https://github.com/dhyannataraj/tuple-internals-presentation
> > and I am 100% sure that this function is needed for educational
purposes, and
> > this function should be as simple as possible, so students can use it
without
> > extra efforts.
>
> Fine. That's your patch after all.
>
> > I still have an opinion that documentation should be more verbose, than
your
> > version, but I can accept your version.
>
> I am not sure that's necessary, pageinspect is for developers.
>

FWIW, I agree that pageinspect is mostly for devs. Still, as i said to
Nikolay after his talk at pgconf.eu, it's a nice tool for people who like
to know what's going on deep inside PostgreSQL.

So +1 for that nice feature.

> > Who is going to add heap_page_item_attrs to your patch? me or you?
>
> I recall some parts of the code I still did not like much :) I'll grab
> some room to have an extra look at it.


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-17 Thread Michael Paquier
On Wed, Nov 18, 2015 at 3:10 AM, Nikolay Shaplov wrote:
> Everything seems to be ok. I've changed only one thing in your version
> of the patch. I've renamed split_tuple_data to
> tuple_data_split_internal, because when there are split_tuple_data and
> tuple_data_split in the same file, nobody will guess what the difference
> between these function and why are they named in such a strange way.

Yep, that sounds better this way.

> If it is ok, set proper status, and let commiter do his job :-)

OK. I have switched the status of this patch to "Ready for committer"
(please, committer-san, double-check the area around
tuple_data_split_internal when fetching data for each attribute, I
think that we got that right but I may be missing something as well).
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-11 Thread Nikolay Shaplov
В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier написал:
> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
> >> Or it's ready to commit, and just not marked this way?
> > 
> > No, I don't think we have reached this state yet.
> > 
> >> I am going to make report based on this patch in Vienna. It would be
> >> nice to have it committed until then :)
> > 
> > This is registered in this November's CF and the current one is not
> > completely wrapped up, so I haven't rushed into looking at it.
> 
> So, I have finally been able to look at this patch in more details,
> resulting in the attached. I noticed a couple of things that should be
> addressed, mainly:
> - addition of a new routine text_to_bits to perform the reverse
> operation of bits_to_text. This was previously part of
> tuple_data_split, I think that it deserves its own function.
> - split_tuple_data should be static
> - t_bits_str should not be a text, rather a char* fetched using
> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
> perform extra calculations with VARSIZE and VARHDRSZ
> - split_tuple_data can directly use the relation OID instead of the
> tuple descriptor of the relation
> - t_bits was leaking memory. For correctness I think that it is better
> to free it after calling split_tuple_data.
> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
> well in raw_attr actually. I refactored the code such as a bytea* is
> used and always freed when allocated to avoid leaks. Removing raw_attr
> actually simplified the code a bit.
> - I simplified the docs, that was largely too verbose in my opinion.
> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
> me, those other ones are much more low-level and not really spread in
> the backend code.
> - Found some typos in the code, the docs and some comments. I reworked
> the error messages as well.
> - The code format was not really in line with the project guidelines.
> I fixed that as well.
> - I removed heap_page_item_attrs for now to get this patch in a
> bare-bone state. Though I would not mind if this is re-added (I
> personally don't think that's much necessary in the module
> actually...).
> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
> more correct as you mentioned earlier, so I let it in its state.
> That's actually more accurate for error handling as well.
> That's everything I recall I have. How does this look?
You've completely rewrite everything ;-)

Let everything be the way you wrote. This code is better than mine.

With one exception. I really  need heap_page_item_attrs function. I used it in 
my Tuples Internals presentation 
https://github.com/dhyannataraj/tuple-internals-presentation
and I am 100% sure that this function is needed for educational purposes, and 
this function should be as simple as possible, so students cat use it without 
extra efforts.

I still have an opinion that documentation should be more verbose, than your 
version, but I can accept your version.

Who is going to add heap_page_item_attrs to your patch? me or you?

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-11 Thread Michael Paquier
On Thu, Nov 12, 2015 at 9:04 AM, Michael Paquier wrote:
> On Thu, Nov 12, 2015 at 12:41 AM, Nikolay Shaplov wrote:
>> I still have an opinion that documentation should be more verbose, than your
>> version, but I can accept your version.
>
> I am not sure that's necessary, pageinspect is for developers.
>
>> Who is going to add heap_page_item_attrs to your patch? me or you?
>
> I recall some parts of the code I still did not like much :) I'll grab
> some room to have an extra look at it.

I have added back heap_page_item_attrs the patch attached, fixing at
the same time some typos found while looking again at the code. If you
are fine with this version, let's have a committer look at it.
-- 
Michael
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..fdbb66c 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -180,11 +228,11 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
-	bits_len = tuphdr->t_hoff -
-		offsetof(HeapTupleHeaderData, t_bits);
+	int	bits_len =
+			((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
 
 	values[11] = CStringGetTextDatum(
- bits_to_text(tuphdr->t_bits, bits_len * 8));
+ bits_to_text(tuphdr->t_bits, bits_len));
 }
 else
 	nulls[11] = true;
@@ -208,7 +256,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +271,205 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+/*
+ * split_tuple_data
+ *
+ * Split raw tuple data taken directly from a page into an array of bytea
+ * elements. This routine does a lookup on NULL values and creates array
+ * elements accordindly. This is a reimplementation of nocachegetattr()
+ * in heaptuple.c simplified for educational purposes.
+ */
+static Datum
+split_tuple_data(Oid relid, char *tupdata,
+ 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-11 Thread Michael Paquier
On Thu, Nov 12, 2015 at 12:41 AM, Nikolay Shaplov
 wrote:
> В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier написал:
>> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
>> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
>> >> Or it's ready to commit, and just not marked this way?
>> >
>> > No, I don't think we have reached this state yet.
>> >
>> >> I am going to make report based on this patch in Vienna. It would be
>> >> nice to have it committed until then :)
>> >
>> > This is registered in this November's CF and the current one is not
>> > completely wrapped up, so I haven't rushed into looking at it.
>>
>> So, I have finally been able to look at this patch in more details,
>> resulting in the attached. I noticed a couple of things that should be
>> addressed, mainly:
>> - addition of a new routine text_to_bits to perform the reverse
>> operation of bits_to_text. This was previously part of
>> tuple_data_split, I think that it deserves its own function.
>> - split_tuple_data should be static
>> - t_bits_str should not be a text, rather a char* fetched using
>> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
>> perform extra calculations with VARSIZE and VARHDRSZ
>> - split_tuple_data can directly use the relation OID instead of the
>> tuple descriptor of the relation
>> - t_bits was leaking memory. For correctness I think that it is better
>> to free it after calling split_tuple_data.
>> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
>> well in raw_attr actually. I refactored the code such as a bytea* is
>> used and always freed when allocated to avoid leaks. Removing raw_attr
>> actually simplified the code a bit.
>> - I simplified the docs, that was largely too verbose in my opinion.
>> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
>> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
>> me, those other ones are much more low-level and not really spread in
>> the backend code.
>> - Found some typos in the code, the docs and some comments. I reworked
>> the error messages as well.
>> - The code format was not really in line with the project guidelines.
>> I fixed that as well.
>> - I removed heap_page_item_attrs for now to get this patch in a
>> bare-bone state. Though I would not mind if this is re-added (I
>> personally don't think that's much necessary in the module
>> actually...).
>> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
>> more correct as you mentioned earlier, so I let it in its state.
>> That's actually more accurate for error handling as well.
>> That's everything I recall I have. How does this look?
> You've completely rewrite everything ;-)
>
> Let everything be the way you wrote. This code is better than mine.
>
> With one exception. I really  need heap_page_item_attrs function. I used it in
> my Tuples Internals presentation
> https://github.com/dhyannataraj/tuple-internals-presentation
> and I am 100% sure that this function is needed for educational purposes, and
> this function should be as simple as possible, so students can use it without
> extra efforts.

Fine. That's your patch after all.

> I still have an opinion that documentation should be more verbose, than your
> version, but I can accept your version.

I am not sure that's necessary, pageinspect is for developers.

> Who is going to add heap_page_item_attrs to your patch? me or you?

I recall some parts of the code I still did not like much :) I'll grab
some room to have an extra look at it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-28 Thread Michael Paquier
On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
>> Or it's ready to commit, and just not marked this way?
>
> No, I don't think we have reached this state yet.
>
>> I am going to make report based on this patch in Vienna. It would be
>> nice to have it committed until then :)
>
> This is registered in this November's CF and the current one is not
> completely wrapped up, so I haven't rushed into looking at it.

So, I have finally been able to look at this patch in more details,
resulting in the attached. I noticed a couple of things that should be
addressed, mainly:
- addition of a new routine text_to_bits to perform the reverse
operation of bits_to_text. This was previously part of
tuple_data_split, I think that it deserves its own function.
- split_tuple_data should be static
- t_bits_str should not be a text, rather a char* fetched using
text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
perform extra calculations with VARSIZE and VARHDRSZ
- split_tuple_data can directly use the relation OID instead of the
tuple descriptor of the relation
- t_bits was leaking memory. For correctness I think that it is better
to free it after calling split_tuple_data.
- PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
well in raw_attr actually. I refactored the code such as a bytea* is
used and always freed when allocated to avoid leaks. Removing raw_attr
actually simplified the code a bit.
- I simplified the docs, that was largely too verbose in my opinion.
- Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
me, those other ones are much more low-level and not really spread in
the backend code.
- Found some typos in the code, the docs and some comments. I reworked
the error messages as well.
- The code format was not really in line with the project guidelines.
I fixed that as well.
- I removed heap_page_item_attrs for now to get this patch in a
bare-bone state. Though I would not mind if this is re-added (I
personally don't think that's much necessary in the module
actually...).
- The calculation of the length of t_bits using HEAP_NATTS_MASK is
more correct as you mentioned earlier, so I let it in its state.
That's actually more accurate for error handling as well.
That's everything I recall I have. How does this look?
-- 
Michael
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..15fd7f1 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-16 Thread Nikolay Shaplov
So what's next?
We need something else to discuss?
We need somebody else's opinion to rule this out?
Or it's ready to commit, and just not marked this way?

I am going to make report based on this patch in Vienna. It would be
nice to have it committed until then :)

On 02.10.2015 07:10, Michael Paquier wrote:
> On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote:
>> В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier
> написал:
>>> - errmsg("input page too small (%d
> bytes)",
>>> raw_page_size)));
>>> +errmsg("input page too small (%d
>>> bytes)", raw_page_size)));
>>> Please be careful of spurious diffs. Those just make the life of
> committers
>>> more difficult than it already is.
>> Oh, that's not diff. That's I've fixed indent on the code I did not write
> :-)
>
> pgindent would have taken care of that if needed. There is no need to add
> noise in the code for this patch.
>
>>> + 
>>> + General idea about output columns: lp_*
>>> attributes
>>> + are about where tuple is located inside the page;
>>> + t_xmin, t_xmax,
>>> + t_field3, t_ctid are
> about
>>> + visibility of this tuplue inside or outside of the transaction;
>>> + t_infomask2, t_infomask
> has
>>> some
>>> + information about properties of attributes stored in tuple data.
>>> + t_hoff sais where tuple data begins and
>>> + t_bits sais which attributes are NULL and
> which
>>> are
>>> + not. Please notice that t_bits here is not an actual value that is
>>> stored
>>> + in tuple data, but it's text representation  with '0' and '1'
>>> charactrs.
>>> + 
>>> I would remove that as well. htup_details.h contains all this
> information.
>> Made it even more shorter. Just links and comments about representation of
>> t_bits.
> I would completely remove this part.
>
>> There also was a bug in original pageinspect, that did not get t_bits
> right
>> when there was OID in the tuple.  I've fixed it too.
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>if (tuphdr->t_infomask & HEAP_HASNULL)
>{
> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +  int bits_len =
> +  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
>
> values[11] = CStringGetTextDatum(
> -   bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +  bits_to_text(tuphdr->t_bits,
> bits_len));
> }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
>
>> Here is next patch in attachment.
> Cool, thanks.
>
> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.
>
> +  Please notice that t_bits in tuple header
> structure
> +  is a binary bitmap, but heap_page_items returns
> +  t_bits as human readable text representation of
> +  original t_bits bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.
>
>



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-16 Thread Michael Paquier
On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov
 wrote:
> So what's next?

Wait and see a bit.

> We need something else to discuss?
> We need somebody else's opinion to rule this out?

The spec of the patch looks clear to me.

> Or it's ready to commit, and just not marked this way?

No, I don't think we have reached this state yet.

> I am going to make report based on this patch in Vienna. It would be
> nice to have it committed until then :)

This is registered in this November's CF and the current one is not
completely wrapped up, so I haven't rushed into looking at it.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov
В письме от 2 октября 2015 13:10:22 пользователь Michael Paquier написал:
 
> >> + 
> >> + General idea about output columns: lp_*
> >> attributes
> >> + are about where tuple is located inside the page;
> >> + t_xmin, t_xmax,
> >> + t_field3, t_ctid are
> 
> about
> 
> >> + visibility of this tuplue inside or outside of the transaction;
> >> + t_infomask2, t_infomask
> 
> has
> 
> >> some
> >> + information about properties of attributes stored in tuple data.
> >> + t_hoff sais where tuple data begins and
> >> + t_bits sais which attributes are NULL and
> 
> which
> 
> >> are
> >> + not. Please notice that t_bits here is not an actual value that is
> >> stored
> >> + in tuple data, but it's text representation  with '0' and '1'
> >> charactrs.
> >> + 
> >> I would remove that as well. htup_details.h contains all this
> 
> information.
> 
> > Made it even more shorter. Just links and comments about representation of
> > t_bits.
> 
> I would completely remove this part.

Michael my hand would not do it. I've been working as a lecturer for six 
years. If I want to pass an information in a comfortable way to reader, there 
should go some binding phrase. It may be very vague, but it should outline  
(may be in a very brief way, but still outline) an information that would be 
found if he follows the links. 

If we just give links "knowledge flow" will be uncomfortable for person who 
reads it.


> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.

May be should do 

\x and limit 1 like this:

test=# select * from heap_page_items(get_raw_page('pg_range', 0)) limit 1;
-[ RECORD 1 ]---
lp  | 1
lp_off  | 8144
lp_flags| 1
lp_len  | 48
t_xmin  | 1
t_xmax  | 0
t_field3| 0
t_ctid  | (0,1)
t_infomask2 | 6
t_infomask  | 2304
t_hoff  | 24
t_bits  | 
t_oid   | 
t_data  | \x400f1700ba074a0f520f




> +  Please notice that t_bits in tuple header
> structure
> +  is a binary bitmap, but heap_page_items returns
> +  t_bits as human readable text representation of
> +  original t_bits bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.

heap_page_items returns text(!) as t_bits. This is unexpected. This is not 
described in page layout documentation. We should tell about it here 
explicitly. 


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov
В письме от 2 октября 2015 13:10:22 Вы написали:

> > There also was a bug in original pageinspect, that did not get t_bits
> 
> right
> 
> > when there was OID in the tuple.  I've fixed it too.
> 
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>if (tuphdr->t_infomask & HEAP_HASNULL)
>{
> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +  int bits_len =
> +  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
> 
> values[11] = CStringGetTextDatum(
> -   bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +  bits_to_text(tuphdr->t_bits,
> bits_len));
> }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
No we should not do it, because after t_bits there always goes padding bytes. 
So OID or the top of tuple data is properly aligned. So we should not use 
t_hoff for determinating t_bit's size at all.

Here is an example. I create a table with 10 columns and OID. (ten is greater 
then 8, so there should be two bytes of t_bits data)

create table test3 (a1 int, a2 int, a3 int, a4 int,a5 int,a6 int, a7 int,a8 
int, a9 int, a10 int) with oids;
insert into test3 VALUES
(1,2,3,4,5,6,7,8,null,10);

With your patch we get 

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |
   t_data   
+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)


here we get 40 bits of t_bits.

With my way to calculate t_bits length we get

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |   t_data   

+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)

16 bits as expected.

So I would keep my version of bits_len calculation


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-01 Thread Nikolay Shaplov
В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier написал:



> 
> - errmsg("input page too small (%d bytes)",
> raw_page_size)));
> +errmsg("input page too small (%d
> bytes)", raw_page_size)));
> Please be careful of spurious diffs. Those just make the life of committers
> more difficult than it already is.

Oh, that's not diff. That's I've fixed indent on the code I did not write :-)


> + 
> + General idea about output columns: lp_*
> attributes
> + are about where tuple is located inside the page;
> + t_xmin, t_xmax,
> + t_field3, t_ctid are about
> + visibility of this tuplue inside or outside of the transaction;
> + t_infomask2, t_infomask has
> some
> + information about properties of attributes stored in tuple data.
> + t_hoff sais where tuple data begins and
> + t_bits sais which attributes are NULL and which
> are
> + not. Please notice that t_bits here is not an actual value that is
> stored
> + in tuple data, but it's text representation  with '0' and '1'
> charactrs.
> + 
> I would remove that as well. htup_details.h contains all this information.
Made it even more shorter. Just links and comments about representation of 
t_bits.


> +
> +test=# select * from heap_page_item_attrs(get_raw_page('pg_range',
> 0),'pg_range'::regclass);
> +[loong tuple data]
> This example is too large in character per lines, I think that you should
> cut a major part of the fields, why not just keeping lp and t_attrs for
> example.
Did id.

> 
> +  
> +   
> +rel_oid
> +oid
> +OID of the relation, of the tuple we want to split
> +   
> +
> +   
> +tuple_data
> +bytea
> +tuple raw data to split
> +
> +   
> In the description of tuple_data_split, I am not sure it is worth listing
> all the argument of the function like that. IMO, we should just say: those
> are the fields returned by "heap_page_items". tuple_data should as well be
> renamed to t_data.
Did it.

> 
> +  tuple attributes instead of one peace of raw tuple data. All other
> return
> This is not that "peaceful" to me. It should be "piece" :)
Oops ;-)

> +   values[13] = PointerGetDatum(tuple_data_bytea);
> +   nulls[13] = false;
> There is no point to set nulls[13] here.
Oh, you are right!

> 
> +
> +test=# select tuple_data_split('pg_range'::regclass,
> '\x400f1700ba074a0f520f'::bytea, 2304, 6, null);
> +   tuple_data_split
> +---
>  +
> {"\\x400f","\\x1700","\\x","\\xba07","\\x4a0f","\\x5
> 20f"} +(1 row)
> This would be more demonstrative if combined with heap_page_items, like
> that for example:
> SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask,
> t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
> And actually this query crashes.
Oh... It crached because I did not check that t_data can actually be NULL.

There also was a bug in original pageinspect, that did not get t_bits right 
when there was OID in the tuple.  I've fixed it too. 


Here is next patch in attachment.


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..4fd3087 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast);
 
 /*
  * bits_to_text
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-01 Thread Michael Paquier
On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote:
> В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier
написал:
>>
>> - errmsg("input page too small (%d
bytes)",
>> raw_page_size)));
>> +errmsg("input page too small (%d
>> bytes)", raw_page_size)));
>> Please be careful of spurious diffs. Those just make the life of
committers
>> more difficult than it already is.
>
> Oh, that's not diff. That's I've fixed indent on the code I did not write
:-)

pgindent would have taken care of that if needed. There is no need to add
noise in the code for this patch.

>> + 
>> + General idea about output columns: lp_*
>> attributes
>> + are about where tuple is located inside the page;
>> + t_xmin, t_xmax,
>> + t_field3, t_ctid are
about
>> + visibility of this tuplue inside or outside of the transaction;
>> + t_infomask2, t_infomask
has
>> some
>> + information about properties of attributes stored in tuple data.
>> + t_hoff sais where tuple data begins and
>> + t_bits sais which attributes are NULL and
which
>> are
>> + not. Please notice that t_bits here is not an actual value that is
>> stored
>> + in tuple data, but it's text representation  with '0' and '1'
>> charactrs.
>> + 
>> I would remove that as well. htup_details.h contains all this
information.
> Made it even more shorter. Just links and comments about representation of
> t_bits.

I would completely remove this part.

> There also was a bug in original pageinspect, that did not get t_bits
right
> when there was OID in the tuple.  I've fixed it too.

Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
so bits_len is definitely larger of 4 bytes should an OID be present.
   if (tuphdr->t_infomask & HEAP_HASNULL)
   {
-   bits_len = tuphdr->t_hoff -
-   offsetof(HeapTupleHeaderData, t_bits);
+  int bits_len =
+  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
1)*8;

values[11] = CStringGetTextDatum(
-   bits_to_text(tuphdr->t_bits,
bits_len * 8));
+  bits_to_text(tuphdr->t_bits,
bits_len));
}
And here is what you are referring to in your patch. I think that we should
instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
one be present. As this is a bug that applies to all the existing versions
of Postgres it would be good to extract it as a separate patch and then
apply your own patch on top of it instead of including in your feature.
Attached is a patch, this should be applied down to 9.0 I guess. Could you
rebase your patch on top of it?

> Here is next patch in attachment.

Cool, thanks.

-test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
+
+test=# select * from heap_page_items(get_raw_page('pg_range', 0));
This example in the docs is far too long in character length... We should
get that reduced.

+  Please notice that t_bits in tuple header
structure
+  is a binary bitmap, but heap_page_items returns
+  t_bits as human readable text representation of
+  original t_bits bitmap.
This had better remove the mention to "please notice". Still as this is
already described in htup_details.h there is no real point to describe it
again here: that's a bitmap of NULLs.
-- 
Michael
From 3a21fd817748b8001e91159c3f2a557088b4fa26 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 2 Oct 2015 12:58:13 +0900
Subject: [PATCH] Fix overestimated size of t_bits in pageinspect

In a tuple, an object-id field is added at the end of HeapTupleHeader
and the size of he header is updated to include it. However heap_page_items
in pageinspect ignored that fact and overestimated the size of bitmap of
NULLs by 4 bytes should an OID be defined in this tuple.

Per report from Nikolay Sharplov, for a problem found while working on
a new feature for pageinspect, and patch by Michael Paquier.
---
 contrib/pageinspect/heapfuncs.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..031d455 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -154,7 +154,6 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
 
 			/* Extract information from the tuple header */
 
@@ -180,9 +179,15 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
+	int		bits_len;
+
 	bits_len = tuphdr->t_hoff -
 		offsetof(HeapTupleHeaderData, t_bits);
 
+	/* ignore OID field of tuple if present 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-29 Thread Nikolay Shaplov
В письме от 26 сентября 2015 20:57:25 пользователь Michael Paquier написал:

> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
> 
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch), and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.

What do you think about this? (I've changed only heap_tuple_items 
documentation there)



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..52a0663 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -91,7 +98,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 		if (raw_page_size < SizeOfPageHeaderData)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("input page too small (%d bytes)", raw_page_size)));
+	 errmsg("input page too small (%d bytes)", raw_page_size)));
 
 		fctx = SRF_FIRSTCALL_INIT();
 		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +162,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +177,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +225,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +240,223 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea		   *raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text		   *t_bits_str;
+	RelationData   *rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8		   *t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6)
+		do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7)
+		error_level = PG_GETARG_BOOL(6) ? WARNING : ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int		bits_str_len;
+		int		bits_len;
+		char   *p;
+		int		off;
+		char	byte = 0;
+
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (!t_bits_str)
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("t_bits argument is NULL, though we expect it to be NOT NULL and %i character long",
+			bits_len 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-29 Thread Michael Paquier
On Tue, Sep 29, 2015 at 11:39 PM, Nikolay Shaplov wrote:
> But since now we actually parse data with tuple_data_split, we can
provide a
> precisely formed  fake information, so you are not limited with how it is
> actually stored in page. You just pass any arguments you want. So you
does not
> need warning mode anymore.

Yeah, I agree with you here, let's simplify it then. One could as well
catch the error in a plpgsql wrapper function if that's really necessary
and log the failed events at the same time in a custom way.

- errmsg("input page too small (%d bytes)",
raw_page_size)));
+errmsg("input page too small (%d
bytes)", raw_page_size)));
Please be careful of spurious diffs. Those just make the life of committers
more difficult than it already is.

+ 
+ General idea about output columns: lp_*
attributes
+ are about where tuple is located inside the page;
+ t_xmin, t_xmax,
+ t_field3, t_ctid are about
+ visibility of this tuplue inside or outside of the transaction;
+ t_infomask2, t_infomask has
some
+ information about properties of attributes stored in tuple data.
+ t_hoff sais where tuple data begins and
+ t_bits sais which attributes are NULL and which
are
+ not. Please notice that t_bits here is not an actual value that is
stored
+ in tuple data, but it's text representation  with '0' and '1'
charactrs.
+ 
I would remove that as well. htup_details.h contains all this information.

+ 
+ For more detailed information see documentation:
+ , 
+ and source code: src/include/storage/itemid.h,
+ src/include/access/htup_details.h.
+ 
This looks cool to me though.

+
+test=# select * from heap_page_item_attrs(get_raw_page('pg_range',
0),'pg_range'::regclass);
+[loong tuple data]
This example is too large in character per lines, I think that you should
cut a major part of the fields, why not just keeping lp and t_attrs for
example.

+  
+   
+rel_oid
+oid
+OID of the relation, of the tuple we want to split
+   
+
+   
+tuple_data
+bytea
+tuple raw data to split
+
+   
In the description of tuple_data_split, I am not sure it is worth listing
all the argument of the function like that. IMO, we should just say: those
are the fields returned by "heap_page_items". tuple_data should as well be
renamed to t_data.

+  tuple attributes instead of one peace of raw tuple data. All other
return
This is not that "peaceful" to me. It should be "piece" :)

+   values[13] = PointerGetDatum(tuple_data_bytea);
+   nulls[13] = false;
There is no point to set nulls[13] here.

+
+test=# select tuple_data_split('pg_range'::regclass,
'\x400f1700ba074a0f520f'::bytea, 2304, 6, null);
+   tuple_data_split
+---
+
{"\\x400f","\\x1700","\\x","\\xba07","\\x4a0f","\\x520f"}
+(1 row)
This would be more demonstrative if combined with heap_page_items, like
that for example:
SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask,
t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
And actually this query crashes.
-- 
Michael


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-28 Thread Nikolay Shaplov
В письме от 26 сентября 2015 20:57:25 Вы написали:

> >> Thanks! I just had a short look at it:
> >> - I am not convinced that it is worth declaring 3 versions of
> >> tuple_data_split.
> > 
> > How which of them should we leave?
> 
> The one with the most arguments. Now perhaps we could have as well two
> of them: one which has the minimum number of arguments with default
> values, and the second one with the maximum number of arguments.

I've been thinking about this warning mode for a while... 

I've added this argument to the heap_page_item_atts, because if somebody would 
like to force it to parse fake data, it would be really difficult to provide 
proper fake data if one heap to perfectly fit tuple descriptor of another heap.
So to do this you have to lover error level.

But since now we actually parse data with tuple_data_split, we can provide a 
precisely formed  fake information, so you are not limited with how it is 
actually stored in page. You just pass any arguments you want. So you does not 
need warning mode anymore.

So may be I should get rid of "warning mode" now.

Concerning do_detoast argument. I do not have any clear opinion. I thing that 
tuple_data_split is a kind of internal function. So it is ok if it is not so 
convenient to use. 

But in  heap_page_item_attrs should be optional. Because it is used from plsql 
console, and in most cases user does not want deTOASing. 

So if in one function do_detoast is optional, may be it would be good to have 
it optional in other functions to keep similarity...

So summorising: if you want make things simpler, I would first get totally rid 
of warning_mode first of all, then if you would insist, make do_detoast non-
optional.

> 
> >> +t_infomask2
> >> +integer
> >> +stores number of attributes (11 bits of the word). The
> >> rest are used for flag bits:
> >> +
> >> +0x2000 - tuple was updated and key cols modified, or tuple deleted
> >> +0x4000 - tuple was HOT-updated
> >> +0x8000 - this is heap-only tuple
> >> +0xE000 - visibility-related bits (so called "hint bits")
> >> +
> >> This large chunk of documentation is a duplicate of storage.sgml. If
> >> that's really necessary, it looks adapted to me to have more detailed
> >> comments at code level directly in heapfuncs.c.
> > 
> > Hm... There is no explanation of t_infomask/t_infomask2 bits in
> > storage.sgml.
> > 
> > If there is no source of information other then source code, then the
> > documentation is not good.
> 
> pageinspect is already something that works at low-level. My guess is
> that users of this module are going to have a look at the code either
> way to understand how tuple data is manipulated within a page.


> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
> 
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch)
I do not see any redirect at
http://www.postgresql.org/docs/9.4/static/pageinspect.html

> , and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.
> 
> > I am lazy, and does not feel confidence about touching main documentation,
> > so I would prefer to leave as it is.
> 
> Your patch is proving the contrary :) Honestly I would just rip out
> the part you have added to describe all the fields related to
> HeapTupleHeaderData, and have only a simple self-contained example to
> show how to use the new function tuple_data_split.
May be. I would think about it a little bit more, and discuss it with my 
friends

> >> The example of tuple_data_split in the docs would be more interesting
> >> if embedded with a call to heap_page_items.
> > 
> > This example would be almost not readable. May be I should add one more
> > praise explaining where did we take arguments for tuple_data_split
> 
> I would as well just remove heap_page_item_attrs, an example in the
> docs would be just enough IMO (note that I never mind being outvoted).

This is function that pageinspect user is definitely will use this function. 
And as a user I would like to see function output in documentation the way I 
would see it, in order to be better prepared to see it in real life. There are 
limits however, get_raw_page output should not be shown in documentation ;-)
heap_page_item_attrs is close to these limits, but did not reach it, in my 
opinion... So I'd prefer to keep it there.

> Btw, shouldn't the ereport messages in tuple_data_split use
> error_level instead of ERROR?
These errors are in the part where I parse t_bits from text back to bit 
representation.  Here there is not chance to get non-strict behavior, since 
there is no way to guess what should we do if we met characters then '0' or 
'1'. Or what to do if there is only 7 characters but we should write whole 
byte.etc...
Moreover as I wrote above bay be we just do not need warning_mode at all 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-26 Thread Michael Paquier
On Sat, Sep 26, 2015 at 1:46 AM, Nikolay Shaplov wrote:
> В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:
>> Thanks! I just had a short look at it:
>> - I am not convinced that it is worth declaring 3 versions of
>> tuple_data_split.
> How which of them should we leave?

The one with the most arguments. Now perhaps we could have as well two
of them: one which has the minimum number of arguments with default
values, and the second one with the maximum number of arguments.

>> - The patch does not respect the project code style,
>> particularly one-line "if condition {foo;}" are not adapted, code line is
> limited
>> to 80 characters, etc. The list is basically here:
>> http://www.postgresql.org/docs/current/static/source.html
> I did my best. Results are attached.

Thanks, it looks better.

>> - Be aware of typos: s/whitch/which is one.
> I've run spellchecker on all comments. Hope that I removed most of the
> mistakes. But as I am not native speaker, I will not be able to eliminate them
> all. I will need help here.

I have not spotted new mistakes regarding that.

>> +t_infomask2
>> +integer
>> +stores number of attributes (11 bits of the word). The
>> rest are used for flag bits:
>> +
>> +0x2000 - tuple was updated and key cols modified, or tuple deleted
>> +0x4000 - tuple was HOT-updated
>> +0x8000 - this is heap-only tuple
>> +0xE000 - visibility-related bits (so called "hint bits")
>> +
>> This large chunk of documentation is a duplicate of storage.sgml. If
>> that's really necessary, it looks adapted to me to have more detailed
>> comments at code level directly in heapfuncs.c.
> Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml.
>
> If there is no source of information other then source code, then the
> documentation is not good.

pageinspect is already something that works at low-level. My guess is
that users of this module are going to have a look at the code either
way to understand how tuple data is manipulated within a page.

> So I would consider two options: Either move t_infomask/t_infomask2 details
> into storage.sgml or leave as it is.

The documentation redirects the reader to look at htup_details.h (the
documentation is actually incorrect, I sent a separate patch), and I
see no point in duplicating such low-level descriptions, that would be
double maintenance for the same result.

> I am lazy, and does not feel confidence about touching main documentation, so 
> I
> would prefer to leave as it is.

Your patch is proving the contrary :) Honestly I would just rip out
the part you have added to describe all the fields related to
HeapTupleHeaderData, and have only a simple self-contained example to
show how to use the new function tuple_data_split.

>> The example of tuple_data_split in the docs would be more interesting
>> if embedded with a call to heap_page_items.
>
> This example would be almost not readable. May be I should add one more praise
> explaining where did we take arguments for tuple_data_split

I would as well just remove heap_page_item_attrs, an example in the
docs would be just enough IMO (note that I never mind being outvoted).

Btw, shouldn't the ereport messages in tuple_data_split use
error_level instead of ERROR?
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-25 Thread Nikolay Shaplov
В письме от 11 сентября 2015 15:12:04 пользователь Michael Paquier написал:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
> 
> Fine for me.

Here is final version with documentation.

Hope it will be the last one. :-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..d42ca48 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,11 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len, TupleDesc tuple_desc, uint16 t_infomask, uint16 t_infomask2, bits8 *t_bits, bool do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -122,8 +126,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +159,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +174,13 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff, tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +221,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +236,210 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea			*raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text			*t_bits_str;
+	RelationData	*rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8			*t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6) do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7) error_level = PG_GETARG_BOOL(6)?WARNING:ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int bits_str_len;
+		int bits_len;
+		char * p;
+		int off;
+		char byte = 0;
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (! t_bits_str)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("t_bits argument is NULL, though we expect it to be NOT NULL and %i character long", bits_len * 8)));
+		}
+		bits_str_len =  VARSIZE(t_bits_str) - VARHDRSZ;
+		if (bits_str_len % 8)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("t_bits argument length should be multiple of eight")));
+		}
+		if (bits_len * 8 != bits_str_len)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("wrong t_bits argument length. Expected %i, actual is %i", bits_len * 8, bits_str_len)));
+		}
+		t_bits = palloc(bits_len + 1);
+
+		p = (char *) t_bits_str + VARHDRSZ;
+		off = 0;
+		while (off

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-25 Thread Nikolay Shaplov
В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:

> > Here is final version with documentation.
> 
> Thanks! I just had a short look at it:
> - I am not convinced that it is worth declaring 3 versions of
> tuple_data_split.
How which of them should we leave? 

> - The patch does not respect the project code style,
> particularly one-line "if condition {foo;}" are not adapted, code line is 
limited
> to 80 characters, etc. The list is basically here:
> http://www.postgresql.org/docs/current/static/source.html
I did my best. Results are attached.

> - Be aware of typos: s/whitch/which is one.
I've run spellchecker on all comments. Hope that I removed most of the 
mistakes. But as I am not native speaker, I will not be able to eliminate them 
all. I will need help here.
 
> +t_infomask2
> +integer
> +stores number of attributes (11 bits of the word). The
> rest are used for flag bits:
> +
> +0x2000 - tuple was updated and key cols modified, or tuple deleted
> +0x4000 - tuple was HOT-updated
> +0x8000 - this is heap-only tuple
> +0xE000 - visibility-related bits (so called "hint bits")
> +
> This large chunk of documentation is a duplicate of storage.sgml. If
> that's really necessary, it looks adapted to me to have more detailed
> comments at code level directly in heapfuncs.c.
Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml. 

If there is no source of information other then source code, then the 
documentation is not good.

If there were information about t_infomask/t_infomask2 in storage.sgml, then I 
would add "See storage.sgml for more info" into pageinspect doc, and thats 
all. But since there is no such information there, I  think that the best 
thing is to quote comments from source code there, so you can get all 
information from documentation, not looking for it in the code.

So I would consider two options: Either move t_infomask/t_infomask2 details 
into storage.sgml or leave as it is.

I am lazy, and does not feel confidence about touching main documentation, so I 
would prefer to leave as it is.


> The example of tuple_data_split in the docs would be more interesting
> if embedded with a call to heap_page_items.

This example would be almost not readable. May be I should add one more praise 
explaining where did we take arguments for tuple_data_split


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..52a0663 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -91,7 +98,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 		if (raw_page_size < SizeOfPageHeaderData)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("input page too small (%d bytes)", raw_page_size)));
+	 errmsg("input page too small (%d bytes)", raw_page_size)));
 
 		fctx = SRF_FIRSTCALL_INIT();
 		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +162,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +177,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-25 Thread Michael Paquier
On Fri, Sep 25, 2015 at 8:30 PM, Nikolay Shaplov wrote:
> Here is final version with documentation.

Thanks! I just had a short look at it:
- I am not convinced that it is worth declaring 3 versions of tuple_data_split.
- The patch does not respect the project code style, particularly
one-line "if condition {foo;}" are not adapted, code line is limited
to 80 characters, etc. The list is basically here:
http://www.postgresql.org/docs/current/static/source.html
- Be aware of typos: s/whitch/which is one.

+t_infomask2
+integer
+stores number of attributes (11 bits of the word). The
rest are used for flag bits:
+
+0x2000 - tuple was updated and key cols modified, or tuple deleted
+0x4000 - tuple was HOT-updated
+0x8000 - this is heap-only tuple
+0xE000 - visibility-related bits (so called "hint bits")
+
This large chunk of documentation is a duplicate of storage.sgml. If
that's really necessary, it looks adapted to me to have more detailed
comments at code level directly in heapfuncs.c.

The example of tuple_data_split in the docs would be more interesting
if embedded with a call to heap_page_items.

> Hope it will be the last one. :-)

Unfortunately not :) But this is definitely going in the right
direction thinking that this code is mainly targeted for educational
purposes.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-18 Thread Nikolay Shaplov
В письме от 11 сентября 2015 15:12:04 Вы написали:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
> 
> Fine for me.

So I've modified the code, now we have:

heap_page_items - have a column with raw tuple data

tuple_data_split - takes oid, raw tuple data, infomask, infomask2 and bits 
parsed as string and returns bytea[] with attribute raw values. It also have 
two optional arguments do_detoast that forces function to detoast attribute, 
and warning_mode that allows to set this function to warning mode, and do not 
stop working if some inconsistency were found. 

there is also pure sql function heap_page_item_attrs that takes page data, and 
table oid, and returns same data as heap_page_items but bytea[] of attributes 
instead of one whole piece of raw data. It also has optional argument 
do_detoast that allows to get bytea[] of detoasted attribute data.

I've decided that there is no real need in warning_mode in 
heap_page_item_attrs so there is no such argument there.

So now it is still RFC. Final patch with documentation will come soon

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..d42ca48 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,11 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len, TupleDesc tuple_desc, uint16 t_infomask, uint16 t_infomask2, bits8 *t_bits, bool do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -122,8 +126,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +159,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +174,13 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff, tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +221,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +236,210 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea			*raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text			*t_bits_str;
+	RelationData	*rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8			*t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6) do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7) error_level = PG_GETARG_BOOL(6)?WARNING:ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int bits_str_len;
+		int bits_len;
+		char * p;
+		int off;
+		char byte = 0;
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (! t_bits_str)
+		{
+			

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-11 Thread Michael Paquier
On Fri, Sep 11, 2015 at 12:08 AM, Nikolay Shaplov
 wrote:
> В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:
>
>> > So if move tuple data parsing into separate function, then we have to pass
>> > these values alongside the tuple data. This is not convenient at all.
>> > So I did it in a way you see in the patch.
>>
>> Why is it not convenient at all? Yes, you have a point, we need those
>> fields to be able to parse the t_data properly. Still the possibility
>> to show individual fields of a tuple as a bytea array either with
>> toasted or detoasted values is a concept completely different from
>> simply showing the page items, which is what, it seems to me,
>> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
>> and t_bits are already available as return fields of heap_page_items,
>> we should simply add a function like that:
>> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
>> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
>> returns bytea[]
>
> Just compare two expressions:
>
> select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
>
> and
>
> select  lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid,
> t_infomask2, t_infomask, t_hoff, t_bits, t_oid,  tuple_data_parse (
> t_data,  t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false)
> from heap_page_item_attrs(get_raw_page('test', 0));
>
> The second variant is a total mess and almost unsable...

It is hard to believe as well that any sane application would use * as
well in a SELECT query. Users would though and we are talking about
user's education here :)

> Though I've discussed this issue with friedns, and we came to conclusion that
> it might be good to implement tuple_data_parse and then implement
> easy to use heap_page_item_attrs in pure SQL, using heap_page_items and
> tuple_data_parse.
> That would keep usage simplicity, and make code more simple as you offer.

Yep. That's doable with a simple SQL function. I am not sure that it
is worth including in pageinspect though.

> The only one argument against it is that in internal representation t_bits is
> binary, and in sql output it is string with '0' and '1' characters. We will
> have to convert it back to binary mode. This is not difficult, but just 
> useless
> convertations there and back again.

The reason why this is visibly converted from bit to text is that the
in-core bit data type has a fixed length, and in the case of
HeapTupleHeaderData there is a variable length.

> What do you think about this solution?

For code simplicity's sake this seems worth the cost.

>> Note that the data corruption checks apply only to this function as
>> far as I understand, so I think that things could be actually split
>> into two independent patches:
>> 1) Upgrade heap_page_items to add the tuple data as bytea.
>> 2) Add the new function able to parse those fields appropriately.
>>
>> As this code, as you justly mentioned, is aimed mainly for educational
>> purposes to understand a page structure, we should definitely make it
>> as simple as possible at code level, and it seems to me that this
>> comes with a separate SQL interface to control tuple data parsing as a
>> bytea[]. We are doing no favor to our users to complicate the code of
>> pageinspect.c as this patch is doing in its current state, especially
>> if beginners want to have a look at it.
>>
>> >> Actually not two functions, but just one, with an extra flag to be
>> >> able to enforce detoasting on the field values where this can be done.
>> >
>> > Yeah, I also thought about it. But did not come to any final conclusion.
>> > Should we add forth argument, that enables detoasting, instead of adding
>> > separate function?
>>
>> This is definitely something you want to control with a switch.
> Ok.Let's come to the final decision with tuple_data_parse, and i will add this
> switch there and to pure sql heap_page_item_attrs

Fine for me.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-11 Thread Michael Paquier
On Fri, Sep 11, 2015 at 12:12 AM, Bruce Momjian  wrote:
> On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
>> Why is it not convenient at all? Yes, you have a point, we need those
>> fields to be able to parse the t_data properly. Still the possibility
>> to show individual fields of a tuple as a bytea array either with
>> toasted or detoasted values is a concept completely different from
>> simply showing the page items, which is what, it seems to me,
>> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
>> and t_bits are already available as return fields of heap_page_items,
>> we should simply add a function like that:
>> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
>> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
>> returns bytea[]
>
> Should pageinspect create a table that contains some of the constants
> used to interpret infomask?

Interesting idea. It may be indeed useful to show to a user mappings
between t_infomask flags <=> textual meaning. I guess that we could
have an SRF function with a view on top of it that returns such a
list. The same can apply to t_infomask2.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-10 Thread Michael Paquier
On Wed, Sep 9, 2015 at 8:39 PM, Nikolay Shaplov
 wrote:
> В письме от 8 сентября 2015 11:53:24 Вы написали:
>> Hence, instead of all those complications, why not simply introducing two
>> functions that take as input the tuple data and the OID of the relation,
>> returning those bytea arrays? It seems to be that this would be a more
>> handy interface, and as this is for educational purposes I guess that the
>> addition of the overhead induced by the extra function call would be
>> acceptable.
>
> When I looked at this task I considered doing the same thing. Bun 
> unfortunately it is
> not possible. (Or if be more correct it is possible, but if I do so, it would 
> be hard to us
> e it)
> The thing is, that to properly split tuple data into attributes, we need some 
> values from
> tuple header:
> t_infomask2: where postgres store actual number of stored attributes, that 
> may differ
> from one in tuple data. That will allow to properly parse tuples after
> ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
> t_infomask: has bit that indicates that there is some null attributes in 
> tuple.
> t_bits: has a bit mask that shows what attributes were set to null.
>
> So if move tuple data parsing into separate function, then we have to pass 
> these
> values alongside the tuple data. This is not convenient at all.
> So I did it in a way you see in the patch.

Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]

Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.

As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.

>> Actually not two functions, but just one, with an extra flag to be
>> able to enforce detoasting on the field values where this can be done.
>
> Yeah, I also thought about it. But did not come to any final conclusion. 
> Should
> we add forth argument, that enables detoasting, instead of adding separate
> function?

This is definitely something you want to control with a switch.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-10 Thread Bruce Momjian
On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
> Why is it not convenient at all? Yes, you have a point, we need those
> fields to be able to parse the t_data properly. Still the possibility
> to show individual fields of a tuple as a bytea array either with
> toasted or detoasted values is a concept completely different from
> simply showing the page items, which is what, it seems to me,
> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
> and t_bits are already available as return fields of heap_page_items,
> we should simply add a function like that:
> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
> returns bytea[]

Should pageinspect create a table that contains some of the constants
used to interpret infomask?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-10 Thread Nikolay Shaplov
В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:

> > So if move tuple data parsing into separate function, then we have to pass
> > these values alongside the tuple data. This is not convenient at all.
> > So I did it in a way you see in the patch.
> 
> Why is it not convenient at all? Yes, you have a point, we need those
> fields to be able to parse the t_data properly. Still the possibility
> to show individual fields of a tuple as a bytea array either with
> toasted or detoasted values is a concept completely different from
> simply showing the page items, which is what, it seems to me,
> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
> and t_bits are already available as return fields of heap_page_items,
> we should simply add a function like that:
> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
> returns bytea[]

Just compare two expressions:

select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);

and 

select  lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, 
t_infomask2, t_infomask, t_hoff, t_bits, t_oid,  tuple_data_parse (
t_data,  t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false) 
from heap_page_item_attrs(get_raw_page('test', 0));

The second variant is a total mess and almost unsable...

Though I've discussed this issue with friedns, and we came to conclusion that 
it might be good to implement tuple_data_parse and then implement 
easy to use heap_page_item_attrs in pure SQL, using heap_page_items and 
tuple_data_parse.

That would keep usage simplicity, and make code more simple as you offer.

The only one argument against it is that in internal representation t_bist is 
binary, and in sql output it is string with '0' and '1' characters. We will 
have to convert it back to binary mode. This is not difficult, but just useless 
convertations there and back again.

What do you think about this solution?


> Note that the data corruption checks apply only to this function as
> far as I understand, so I think that things could be actually split
> into two independent patches:
> 1) Upgrade heap_page_items to add the tuple data as bytea.
> 2) Add the new function able to parse those fields appropriately.
> 
> As this code, as you justly mentioned, is aimed mainly for educational
> purposes to understand a page structure, we should definitely make it
> as simple as possible at code level, and it seems to me that this
> comes with a separate SQL interface to control tuple data parsing as a
> bytea[]. We are doing no favor to our users to complicate the code of
> pageinspect.c as this patch is doing in its current state, especially
> if beginners want to have a look at it.
> 
> >> Actually not two functions, but just one, with an extra flag to be
> >> able to enforce detoasting on the field values where this can be done.
> > 
> > Yeah, I also thought about it. But did not come to any final conclusion.
> > Should we add forth argument, that enables detoasting, instead of adding
> > separate function?
> 
> This is definitely something you want to control with a switch.
Ok.Let's come to the final decision with tuple_data_parse, and i will add this 
switch there and to pure sql heap_page_item_attrs


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-09 Thread Nikolay Shaplov
В письме от 8 сентября 2015 11:53:24 Вы написали:
> On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
> > В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
> > > Documentation is missing, that would be good to have to understand what
> > > each function is intended to do.
> > 
> > I were going to add documentation when this patch is commited, or at least
> > approved for commit. So I would know for sure that function definition
> > would
> > not change, so had not to rewrite it again and again.
> 
> I doubt that this patch would be committed without documentation, this is a
> requirement for any patch.
Ok. I can do it. 

> > So if it is possible I would write documentation and test when this patch
> > is
> > already approved.
> 
> I'm fine with that. Let's discuss its shape and come up with an agreement.
> It would have been good to post test cases of what this stuff actually does
> though, people reviewing this thread are limited to guess on what is tried
> to be achieved just by reading the code.

To test non detoasted functions one should do:

CREATE EXTENSION pageinspect;

create table test (a int, b smallint,c varchar);
insert into test VALUES
(-1,2,'111'),
(2,null,'222'),
(3,8,'333'),
(null,16,null);

Then 

# select * from heap_page_items(get_raw_page('test', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid | t_data 
++--++++--++-+++--+---+
  1 |   8152 |1 | 34 |763 |  0 |0 | (0,1)  |
   3 |   2050 | 24 |  |   | \x020009313131
  2 |   8120 |1 | 32 |763 |  0 |0 | (0,2)  |
   3 |   2051 | 24 | 1010 |   | \x020009323232
  3 |   8080 |1 | 34 |763 |  0 |0 | (0,3)  |
   3 |   2050 | 24 |  |   | \x030008000933
  4 |   8048 |1 | 26 |763 |  0 |0 | (0,4)  |
   3 |   2049 | 24 | 0100 |   | \x1000

or 

# select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid | t_attrs  
   
++--++++--++-+++--+---+-
  1 |   8152 |1 | 34 |763 |  0 |0 | (0,1)  |
   3 |   2050 | 24 |  |   | 
{"\\x","\\x0200","\\x09313131"}
  2 |   8120 |1 | 32 |763 |  0 |0 | (0,2)  |
   3 |   2051 | 24 | 1010 |   | 
{"\\x0200",NULL,"\\x09323232"}
  3 |   8080 |1 | 34 |763 |  0 |0 | (0,3)  |
   3 |   2050 | 24 |  |   | 
{"\\x0300","\\x0800","\\x0933"}
  4 |   8048 |1 | 26 |763 |  0 |0 | (0,4)  |
   3 |   2049 | 24 | 0100 |   | {NULL,"\\x1000",NULL}
(4 rows)

For detoasted function you should do something like this:

CREATE EXTENSION pageinspect;

create table test2 (c varchar);

insert into test2 VALUES ('++');
insert into test2 VALUES (repeat('+',2100));

Then  heap_page_item_attrs will show you compressed attr data:

# select * from heap_page_item_attrs(get_raw_page('test2', 
0),'test2'::regclass);
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |
t_attrs
++--++++--++-++++---+---
  1 |   8160 |1 | 27 |768 |  0 |0 | (0,1)  |
   1 |   2050 | 24 ||   | {"\\x072b2b"}
  2 |   8096 |1 | 59 |769 |  0 |0 | (0,2)  |
   1 |   2050 | 24 ||   | 
{"\\x8e003408fe2b0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff010f01aa"}
(2 rows)

and heap_page_item_detoast_attrs will show you original data

# select * from heap_page_item_detoast_attrs(get_raw_page('test2', 
0),'test2'::regclass);
[will not paste output here it is too long, see it for yourself]


> That's actually where
> documentation, even a draft of documentation helps a lot in the review to
> see if what is expected by the developer matches what the code actually
> does.
> 
> > Code has some whitespaces.
> > I've found and removed some. Hope that was all of them...
> 
> Yeah, it looks that you took completely rid of them.
> 
> In details, 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-07 Thread Michael Paquier
On Tue, Sep 8, 2015 at 11:53 AM, Michael Paquier wrote:
> Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
> similar to what heap_page_items does, leading to a code maze that is going
> to make future extensions more difficult, which is what lead to the
> refactoring your patch does.
> Hence, instead of all those complications, why not simply introducing two
> functions that take as input the tuple data and the OID of the relation,
> returning those bytea arrays? It seems to be that this would be a more handy
> interface, and as this is for educational purposes I guess that the addition
> of the overhead induced by the extra function call would be acceptable.

Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-07 Thread Michael Paquier
On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:

> В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
> > Documentation is missing, that would be good to have to understand what
> > each function is intended to do.
>
> I were going to add documentation when this patch is commited, or at least
> approved for commit. So I would know for sure that function definition
> would
> not change, so had not to rewrite it again and again.
>

I doubt that this patch would be committed without documentation, this is a
requirement for any patch.


> So if it is possible I would write documentation and test when this patch
> is
> already approved.
>

I'm fine with that. Let's discuss its shape and come up with an agreement.
It would have been good to post test cases of what this stuff actually does
though, people reviewing this thread are limited to guess on what is tried
to be achieved just by reading the code. That's actually where
documentation, even a draft of documentation helps a lot in the review to
see if what is expected by the developer matches what the code actually
does.

> Code has some whitespaces.
> I've found and removed some. Hope that was all of them...
>

Yeah, it looks that you took completely rid of them.

In details, this patch introduces two independent concepts:
- add tuple data as a new bytea column of heap_page_items. This is indeed
where it should be, and note that this is where the several corruption
checks are done by checking the state of the tuple data.
- add heap_page_item_attrs and heap_page_item_detoast_attrs, which is very
similar to heap_page_items, at the difference that it can take an OID to be
able to use a TupleDesc and return a bytea array with the data of each
attribute.

Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
similar to what heap_page_items does, leading to a code maze that is going
to make future extensions more difficult, which is what lead to the
refactoring your patch does.
Hence, instead of all those complications, why not simply introducing two
functions that take as input the tuple data and the OID of the relation,
returning those bytea arrays? It seems to be that this would be a more
handy interface, and as this is for educational purposes I guess that the
addition of the overhead induced by the extra function call would be
acceptable.

Regards,
-- 
Michael


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-04 Thread Nikolay Shaplov
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:

> Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
> caller specifies nothing.
> 
> Here are some observations after looking at the code, no functional testing.
> 
> +   int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
> 
> :NULL;
> 
> When handling additional arguments, it seems to me that it is more readable
> to use something like that:
> if (PG_NARGS >= 3)
> {
>  arg = PG_GETARG_XXX(2);
>  //etc.
> }
> 
> +   error_mode = error_mode?WARNING:ERROR;
> 
> This generates warnings at compilation.
> 
yeah, what I have done here is too complex with no reason. I've simplified this 
code now.

> +   if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
> +   {
> +   ereport(WARNING,
> +   (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +   (errmsg("Runnung heap_page_item_attrs in
> WARNING mode.";
> +   }
> 
> I am not convinced that this is necessary.
I've removed it.

> 
> +   inter_call_data->raw_page_size = VARSIZE(raw_page) -
> VARHDRSZ;
> +   if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
> Including raw_page_size in the status data does not seem necessary to me.
> The page data is already available for each loop so you could just extract
> it from it.
> 
> +   ereport(inter_call_data->error_level,
> +   (errcode(ERRCODE_DATA_CORRUPTED),
> +   errmsg("Data corruption: number of
> attributes in tuple header is greater than number of attributes in tuple
> descripor")));
> I'd rather switch the error reports related to data corruption from ereport
> to elog, which is more suited for internal errors, and it seems to me that
> it is the case here.
Hm... First, since we have proper error code, do not see why not give it.

Second, this is not exactly internal error, in some cases user tries to parse 
corrupted data on purpose. So for user it is not an internal error, error from 
deep below, but error on the level he is currently working at. So I would not 
call it internal error.

> heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
> whenever 'if' condition is false [-Wsometimes-uninitialized]
> if (!is_null)
> ^~~~
> heapfuncs.c:419:43: note: uninitialized use occurs here
> raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
> BYTEAOID, fctx->multi_call_memory_ctx);
> ^~~~
> heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
> if (!is_null)
> ^
> heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
> this warning
> Datum raw_attr;
> My compiler complains about uninitialized variables.
Fixed it

> +--
> +-- _heap_page_items()
> +--
> +CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
> I am not sure why this is necessary and visibly it overlaps with the
> existing declaration of heap_page_items. The last argument is different
> though, and I recall that we decided not to use anymore the relation name
> specified as text, but its OID instead in this module.
Oh! This should not go to the final patch :-( Sorry. Removed it.


> Documentation is missing, that would be good to have to understand what
> each function is intended to do.

I were going to add documentation when this patch is commited, or at least 
approved for commit. So I would know for sure that function definition would 
not change, so had not to rewrite it again and again.

So if it is possible I would write documentation and test when this patch is 
already approved.

> Code has some whitespaces.
I've found and removed some. Hope that was all of them...


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..a1b07cb 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,12 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-03 Thread Michael Paquier
On Wed, Sep 2, 2015 at 6:58 PM, Nikolay Shaplov 
wrote:

> В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
> > Nikolay Shaplov wrote:
> > > This patch adds several new functions, available from SQL queries. All
> > > these functions are based on heap_page_items, but accept slightly
> > > different arguments and has one additional column at the result set:
> > >
> > > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > > heap_page_items set with additional column that contain snapshot of
> tuple
> > > data area stored in bytea.
> >
> > I think the API around get_raw_page is more useful generally.  You might
> > have table blocks stored in a bytea column for instance, because you
> > extracted from some remote server and inserted into a local table for
> > examination.  If you only accept relname/blkno, there's no way to
> > examine data other than blocks directly in the database dir, which is
> > limiting.
> >
> > > There is also one strange function: _heap_page_items it is useless for
> > > practical purposes. As heap_page_items it accepts page data as bytea,
> but
> > > also get a relation name. It tries to apply tuple descriptor of that
> > > relation to that page data.
> >
> > For BRIN, I added something similar (brin_page_items), but it receives
> > the index OID rather than name, and constructs a tuple descriptor to
> > read the data.  I think OID is better than name generally.  (You can
> > cast the relation name to regclass).
> >
> > It's easy to misuse, but these functions are superuser-only, so there
> > should be no security issue at least.  The possibility of a crash
> > remains real, though, so maybe we should try to come up with a way to
> > harden that.
>
> I've done as you've said: Now patch adds two functions heap_page_item_attrs
> and heap_page_item_detoast_attrs and these function takes raw_page and
> relation OID as arguments. They also have third optional parameter that
> allows
> to change error mode from ERROR to WARNING.
>
> Is it ok now?
>

Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
caller specifies nothing.

Here are some observations after looking at the code, no functional testing.

+   int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
:NULL;

When handling additional arguments, it seems to me that it is more readable
to use something like that:
if (PG_NARGS >= 3)
{
 arg = PG_GETARG_XXX(2);
 //etc.
}

+   error_mode = error_mode?WARNING:ERROR;

This generates warnings at compilation.

+   if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+   {
+   ereport(WARNING,
+   (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+   (errmsg("Runnung heap_page_item_attrs in
WARNING mode.";
+   }

I am not convinced that this is necessary.

+   inter_call_data->raw_page_size = VARSIZE(raw_page) -
VARHDRSZ;
+   if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
Including raw_page_size in the status data does not seem necessary to me.
The page data is already available for each loop so you could just extract
it from it.

+   ereport(inter_call_data->error_level,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+   errmsg("Data corruption: number of
attributes in tuple header is greater than number of attributes in tuple
descripor")));
I'd rather switch the error reports related to data corruption from ereport
to elog, which is more suited for internal errors, and it seems to me that
it is the case here.

heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
whenever 'if' condition is false [-Wsometimes-uninitialized]
if (!is_null)
^~~~
heapfuncs.c:419:43: note: uninitialized use occurs here
raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
BYTEAOID, fctx->multi_call_memory_ctx);
^~~~
heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
if (!is_null)
^
heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
this warning
Datum raw_attr;
My compiler complains about uninitialized variables.

+--
+-- _heap_page_items()
+--
+CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
I am not sure why this is necessary and visibly it overlaps with the
existing declaration of heap_page_items. The last argument is different
though, and I recall that we decided not to use anymore the relation name
specified as text, but its OID instead in this module.

Documentation is missing, that would be good to have to understand what
each function is intended to do. Code has some whitespaces.
-- 
Michael


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-02 Thread Nikolay Shaplov
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
> Nikolay Shaplov wrote:
> > This patch adds several new functions, available from SQL queries. All
> > these functions are based on heap_page_items, but accept slightly
> > different arguments and has one additional column at the result set:
> > 
> > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > heap_page_items set with additional column that contain snapshot of tuple
> > data area stored in bytea.
> 
> I think the API around get_raw_page is more useful generally.  You might
> have table blocks stored in a bytea column for instance, because you
> extracted from some remote server and inserted into a local table for
> examination.  If you only accept relname/blkno, there's no way to
> examine data other than blocks directly in the database dir, which is
> limiting.
> 
> > There is also one strange function: _heap_page_items it is useless for
> > practical purposes. As heap_page_items it accepts page data as bytea, but
> > also get a relation name. It tries to apply tuple descriptor of that
> > relation to that page data.
> 
> For BRIN, I added something similar (brin_page_items), but it receives
> the index OID rather than name, and constructs a tuple descriptor to
> read the data.  I think OID is better than name generally.  (You can
> cast the relation name to regclass).
> 
> It's easy to misuse, but these functions are superuser-only, so there
> should be no security issue at least.  The possibility of a crash
> remains real, though, so maybe we should try to come up with a way to
> harden that.

I've done as you've said: Now patch adds two functions heap_page_item_attrs 
and heap_page_item_detoast_attrs and these function takes raw_page and 
relation OID as arguments. They also have third optional parameter that allows 
to change error mode from ERROR to WARNING.

Is it ok now?


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..59eae0f 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,12 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_type.h"
 
+#include "rawpage.h"
 
 /*
  * bits_to_text
@@ -53,6 +58,9 @@ bits_to_text(bits8 *bits, int len)
 	return str;
 }
 
+Datum heap_page_items_internal(FunctionCallInfo fcinfo, bytea *raw_page, Oid rel_oid, int error_level, bool do_detoast);
+void fill_header_data(FuncCallContext *fctx, Datum *values, bool *nulls, bool do_detoast);
+void split_tuple_data(FuncCallContext *fctx, HeapTupleHeader tuphdr, Datum *values, bool *nulls, bool do_detoast);
 
 /*
  * heap_page_items
@@ -66,38 +74,80 @@ typedef struct heap_page_items_state
 	TupleDesc	tupd;
 	Page		page;
 	uint16		offset;
+	TupleDesc	page_tuple_desc;
+	int		raw_page_size;
+	int		error_level;
 } heap_page_items_state;
 
 Datum
 heap_page_items(PG_FUNCTION_ARGS)
 {
 	bytea	   *raw_page = PG_GETARG_BYTEA_P(0);
+	return heap_page_items_internal(fcinfo, raw_page, InvalidOid, ERROR, false);
+}
+
+PG_FUNCTION_INFO_V1(heap_page_item_attrs);
+Datum
+heap_page_item_attrs(PG_FUNCTION_ARGS)
+{
+	bytea		*raw_page = PG_GETARG_BYTEA_P(0);
+	Oid		rel_oid = PG_GETARG_OID(1);
+	int		error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL;
+	error_mode = error_mode?WARNING:ERROR;
+	if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+	{
+		ereport(WARNING,
+			(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg("Runnung heap_page_item_attrs in WARNING mode.";
+	}
+
+	return heap_page_items_internal(fcinfo, raw_page, rel_oid, error_mode, false);
+}
+
+PG_FUNCTION_INFO_V1(heap_page_item_detoast_attrs);
+Datum
+heap_page_item_detoast_attrs(PG_FUNCTION_ARGS)
+{
+	bytea		*raw_page = PG_GETARG_BYTEA_P(0);
+	Oid		rel_oid = PG_GETARG_OID(1);
+	int		error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL;
+	error_mode = error_mode?WARNING:ERROR;
+	if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+	{
+		ereport(WARNING,
+			(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg("Runnung heap_page_item_detoast_attrs in WARNING mode.";
+	}
+	return heap_page_items_internal(fcinfo, raw_page, rel_oid, error_mode, 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-08-05 Thread Nikolay Shaplov
В письме от 3 августа 2015 15:35:23 Вы написали:
 Nikolay Shaplov wrote:
  This patch adds several new functions, available from SQL queries. All
  these functions are based on heap_page_items, but accept slightly
  different arguments and has one additional column at the result set:
  
  heap_page_tuples - accepts relation name, and bulkno, and returns usual
  heap_page_items set with additional column that contain snapshot of tuple
  data area stored in bytea.
 
 I think the API around get_raw_page is more useful generally.  You might
 have table blocks stored in a bytea column for instance, because you
 extracted from some remote server and inserted into a local table for
 examination.  If you only accept relname/blkno, there's no way to
 examine data other than blocks directly in the database dir, which is
 limiting.
 
  There is also one strange function: _heap_page_items it is useless for
  practical purposes. As heap_page_items it accepts page data as bytea, but
  also get a relation name. It tries to apply tuple descriptor of that
  relation to that page data.
 
 For BRIN, I added something similar (brin_page_items), but it receives
 the index OID rather than name, and constructs a tuple descriptor to
 read the data.  I think OID is better than name generally.  (You can
 cast the relation name to regclass).
 
 It's easy to misuse, but these functions are superuser-only, so there
 should be no security issue at least.  The possibility of a crash
 remains real, though, so maybe we should try to come up with a way to
 harden that.

Hmm... may be you are right. I'll try to change it would take raw page and 
OID.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-08-03 Thread Nikolay Shaplov
В письме от 3 августа 2015 14:30:46 пользователь Michael Paquier написал:
 On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
 
 n.shap...@postgrespro.ru wrote:
  Hi!
  
  I've created a patch for pageinspect that allows to see data stored in the
  tuple.
  
  This patch has two main purposes:
  
  1. Practical: Make manual DB recovery more simple
 
 To what are you referring to in this case? Manual manipulation of
 on-disk data manually?
Yes, when DB is broken for example

 
  b) I have plenty of sanity check while reading parsing that tuple, for
  this
  function I've changed error level from ERROR to WARNING. This function
  will
  allow to write proper tests that all these checks work as they were
  designed (I hope to write these tests sooner or later)
 
 +ereport(inter_call_data-error_level,
 +(errcode(ERRCODE_DATA_CORRUPTED),
 +errmsg(Data corruption: Iterating over
 tuple data reached out of actual tuple size)));
 I don't think that the possibility to raise a WARNING is a good thing
 in those code paths. If data is corrupted this may crash, and I am not
 sure that anybody would want that even for educational purposes.
Hm... I considered _heap_page_items really very dangerous function, with big 
red Do not call it if you not sure what are you doing warning. Reading data 
with not proper attribute descriptor is dangerous any way. But when I wrote 
that code, I did not have that checks at first, and it was really interesting 
to subst one data with another and see what will happen. And I thought that 
may be other explorers will like to do the same. And it is really possible 
only in warning mode. So I left warnings only in _heap_page_items, and set 
errors for all other cases.


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-08-03 Thread Alvaro Herrera
Nikolay Shaplov wrote:

 This patch adds several new functions, available from SQL queries. All these 
 functions are based on heap_page_items, but accept slightly different 
 arguments and has one additional column at the result set:
 
 heap_page_tuples - accepts relation name, and bulkno, and returns usual 
 heap_page_items set with additional column that contain snapshot of tuple 
 data 
 area stored in bytea.

I think the API around get_raw_page is more useful generally.  You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination.  If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.

 There is also one strange function: _heap_page_items it is useless for 
 practical purposes. As heap_page_items it accepts page data as bytea, but 
 also 
 get a relation name. It tries to apply tuple descriptor of that relation to 
 that page data. 

For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data.  I think OID is better than name generally.  (You can
cast the relation name to regclass).

It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least.  The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers