subject:"RE\: libpq debug log"

Re: libpq debug log

2021-06-10 Thread Noah Misch

On Mon, Jun 07, 2021 at 11:37:58AM -0400, Robert Haas wrote:
> On Sat, Jun 5, 2021 at 11:03 AM Alvaro Herrera  
> wrote:
> > > This added a PQtraceSetFlags() function.  We have a dozen PQset*() 
> > > functions,
> > > but this and PQresultSetInstanceData() are the only PQSomethingSet()
> > > functions.  Is it okay if I rename it to PQsetTraceFlags()?  I think that
> > > would be more idiomatic for libpq.
> >
> > Hmm, there is an obvious parallel between PQtrace() and
> > PQtraceSetFlags() which is lost with your proposed rename.  I'm not
> > wedded to the name though, so I'm just -0.1 on the rename.  If you feel
> > strongly about it, I won't oppose it.
> 
> I'm +1 for the rename. I think it's a lot more idiomatic.

At +1 vs. -0.1, I would have withdrawn the proposal.  Seeing +2 vs. -0.1, I
have pushed it.

Re: libpq debug log

2021-06-07 Thread Robert Haas

On Sat, Jun 5, 2021 at 11:03 AM Alvaro Herrera  wrote:
> > This added a PQtraceSetFlags() function.  We have a dozen PQset*() 
> > functions,
> > but this and PQresultSetInstanceData() are the only PQSomethingSet()
> > functions.  Is it okay if I rename it to PQsetTraceFlags()?  I think that
> > would be more idiomatic for libpq.
>
> Hmm, there is an obvious parallel between PQtrace() and
> PQtraceSetFlags() which is lost with your proposed rename.  I'm not
> wedded to the name though, so I'm just -0.1 on the rename.  If you feel
> strongly about it, I won't oppose it.

I'm +1 for the rename. I think it's a lot more idiomatic.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: libpq debug log

2021-06-05 Thread Alvaro Herrera

On 2021-Jun-04, Noah Misch wrote:

> On Fri, Apr 09, 2021 at 05:16:11PM -0400, Alvaro Herrera wrote:
> > Pushed now.
> 
> This added a PQtraceSetFlags() function.  We have a dozen PQset*() functions,
> but this and PQresultSetInstanceData() are the only PQSomethingSet()
> functions.  Is it okay if I rename it to PQsetTraceFlags()?  I think that
> would be more idiomatic for libpq.

Hmm, there is an obvious parallel between PQtrace() and
PQtraceSetFlags() which is lost with your proposed rename.  I'm not
wedded to the name though, so I'm just -0.1 on the rename.  If you feel
strongly about it, I won't oppose it.

-- 
Álvaro Herrera   Valdivia, Chile
"La espina, desde que nace, ya pincha" (Proverbio africano)

Re: libpq debug log

2021-06-05 Thread Noah Misch

On Fri, Apr 09, 2021 at 05:16:11PM -0400, Alvaro Herrera wrote:
> Pushed now.

This added a PQtraceSetFlags() function.  We have a dozen PQset*() functions,
but this and PQresultSetInstanceData() are the only PQSomethingSet()
functions.  Is it okay if I rename it to PQsetTraceFlags()?  I think that
would be more idiomatic for libpq.

Re: libpq debug log

2021-04-09 Thread Alvaro Herrera

On 2021-Apr-02, Tom Lane wrote:

> Alvaro Herrera  writes:
> > On 2021-Apr-02, Tom Lane wrote:
> >> +1, but the comment could be more specific.  Maybe like "In regress mode,
> >> suppress the length of ErrorResponse and NoticeResponse messages --- the F
> >> (file name) field, in particular, can vary in length depending on compiler
> >> used."
> 
> > With your commit this is no longer true, so how about the attached?
> 
> Right.  Your text looks fine.

Pushed now.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"I'm always right, but sometimes I'm more right than other times."
  (Linus Torvalds)

Re: libpq debug log

2021-04-02 Thread Tom Lane

Alvaro Herrera  writes:
> On 2021-Apr-02, Tom Lane wrote:
>> +1, but the comment could be more specific.  Maybe like "In regress mode,
>> suppress the length of ErrorResponse and NoticeResponse messages --- the F
>> (file name) field, in particular, can vary in length depending on compiler
>> used."

> With your commit this is no longer true, so how about the attached?

Right.  Your text looks fine.

(I see drongo is now green, so it looks like its compiler does indeed emit
backslash-separated full paths.)

regards, tom lane

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera

On 2021-Apr-02, Tom Lane wrote:

> Alvaro Herrera  writes:
> > As in the attached patch.
> 
> +1, but the comment could be more specific.  Maybe like "In regress mode,
> suppress the length of ErrorResponse and NoticeResponse messages --- the F
> (file name) field, in particular, can vary in length depending on compiler
> used."

With your commit this is no longer true, so how about the attached?

-- 
Álvaro Herrera   Valdivia, Chile
>From 80af8f687ea3665a045d9005e0ec1fc53e1a393a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 2 Apr 2021 10:51:42 -0300
Subject: [PATCH v2] Suppress length of Notice/Error in PQtrace regress mode

A (relatively minor) annoyance of ErrorResponse/NoticeResponse messages
as printed by PQtrace() is that their length might vary when we move
error messages from one file to another, one function to another, or
even when their location line number changes number of digits.

To avoid having to adjust expected files for some tests, make the
regress mode of PQtrace() suppress the length word of those messages.

Discussion: https://postgr.es/m/20210402023010.GA13563@alvherre.pgsql
Reviewed-by: Tom Lane 
---
 src/interfaces/libpq/fe-trace.c   | 11 ++-
 .../libpq_pipeline/traces/pipeline_abort.trace|  8 
 .../modules/libpq_pipeline/traces/transaction.trace   |  6 +++---
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/src/interfaces/libpq/fe-trace.c b/src/interfaces/libpq/fe-trace.c
index 9a4595f5c8..ca24869e91 100644
--- a/src/interfaces/libpq/fe-trace.c
+++ b/src/interfaces/libpq/fe-trace.c
@@ -540,7 +540,16 @@ pqTraceOutputMessage(PGconn *conn, const char *message, bool toServer)
 	length = (int) pg_ntoh32(length);
 	logCursor += 4;
 
-	fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
+	/*
+	 * In regress mode, suppress the length of ErrorResponse and
+	 * NoticeResponse.  The F (file name), L (line number) and R (routine
+	 * name) fields can change as server code is modified, and if their
+	 * lengths differ from the originals, that would break tests.
+	 */
+	if (regress && !toServer && (id == 'E' || id == 'N'))
+		fprintf(conn->Pfdebug, "%s\tNN\t", prefix);
+	else
+		fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
 
 	switch (id)
 	{
diff --git a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
index b061435785..254e485997 100644
--- a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
+++ b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
@@ -1,5 +1,5 @@
 F	42	Query	 "DROP TABLE IF EXISTS pq_pipeline_demo"
-B	123	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	5	ReadyForQuery	 I
 F	99	Query	 "CREATE UNLOGGED TABLE pq_pipeline_demo(id serial primary key, itemno integer,int8filler int8);"
@@ -27,7 +27,7 @@ B	4	ParseComplete
 B	4	BindComplete
 B	4	NoData
 B	15	CommandComplete	 "INSERT 0 1"
-B	217	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 B	4	ParseComplete
 B	4	BindComplete
@@ -39,7 +39,7 @@ F	12	Bind	 "" "" 0 0 0
 F	6	Describe	 P ""
 F	9	Execute	 "" 0
 F	4	Sync
-B	123	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	54	Parse	 "" "SELECT 1.0/g FROM generate_series(3, -1, -1) g" 0
 F	12	Bind	 "" "" 0 0 0
@@ -52,7 +52,7 @@ B	33	RowDescription	 1 "?column?"  0  65535 -1 0
 B	32	DataRow	 1 22 '0.'
 B	32	DataRow	 1 22 '0.5000'
 B	32	DataRow	 1 22 '1.'
-B	70	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	40	Query	 "SELECT itemno FROM pq_pipeline_demo"
 B	31	RowDescription	 1 "itemno"  2  4 -1 0
diff --git a/src/test/modules/libpq_pipeline/traces/transaction.trace b/src/test/modules/libpq_pipeline/traces/transaction.trace
index 0267a534fa..1dcc2373c0 100644
---

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera

On 2021-Apr-02, Tom Lane wrote:

> On third thought, maybe we should push your patch too.  Although I think
> 53aafdb9f is going to fix the platform-specific aspect of this, we are
> still going to risk some implementation dependence of the libpq_pipeline
> results:
> 
> * Every so often, the number of digits in the reported line numbers
> will change (999 -> 1001 or the like), due to changes in unrelated
> code.

Bah -- of course.

> * Occasionally we refactor things so that the "same" error is reported
> from a different file.

True.  (This was the reason for masking F and R, but I didn't realize
that it'd have an effect in the message length).

> It's hard to judge whether that will happen often enough to be an
> annoying maintenance problem, but there's definitely a hazard.
> Not sure if we should proactively lobotomize the test, or wait to
> see if we get annoyed.

I suspect we're going to see enough bf failures if we don't suppress it,
because the problem is going to only show up with TAP testing enabled
and the src/test/modules tests, which perhaps not all committers run.

> In any case I'd like to wait till after drongo's next run, so
> we can confirm whether or not the backslashes-in-__FILE__ hypothesis
> is correct.  If it is, then 53aafdb9f is a good fix on its own
> merits, independently of libpq_pipeline.

Wilco.

-- 
Álvaro Herrera   Valdivia, Chile
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)

Re: libpq debug log

2021-04-02 Thread Tom Lane

I wrote:
>> I bet what is happening on drongo is that the compiler has generated a
>> __FILE__ value that contains backslashes not slashes, and this code
>> doesn't know how to shorten those.  So maybe instead of lobotomizing
>> this test, we should fix that.

> Did that, but we'll have to wait a few hours to see if it makes
> drongo happy.

On third thought, maybe we should push your patch too.  Although I think
53aafdb9f is going to fix the platform-specific aspect of this, we are
still going to risk some implementation dependence of the libpq_pipeline
results:

* Every so often, the number of digits in the reported line numbers
will change (999 -> 1001 or the like), due to changes in unrelated
code.

* Occasionally we refactor things so that the "same" error is reported
from a different file.

It's hard to judge whether that will happen often enough to be an
annoying maintenance problem, but there's definitely a hazard.
Not sure if we should proactively lobotomize the test, or wait to
see if we get annoyed.

In any case I'd like to wait till after drongo's next run, so
we can confirm whether or not the backslashes-in-__FILE__ hypothesis
is correct.  If it is, then 53aafdb9f is a good fix on its own
merits, independently of libpq_pipeline.

regards, tom lane

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera

On 2021-Apr-02, Tom Lane wrote:

> I wrote:
> > I bet what is happening on drongo is that the compiler has generated a
> > __FILE__ value that contains backslashes not slashes, and this code
> > doesn't know how to shorten those.  So maybe instead of lobotomizing
> > this test, we should fix that.
> 
> Did that, but we'll have to wait a few hours to see if it makes
> drongo happy.

Thanks, I really hope it does!

-- 
Álvaro Herrera   Valdivia, Chile

Re: libpq debug log

2021-04-02 Thread Tom Lane

I wrote:
> I bet what is happening on drongo is that the compiler has generated a
> __FILE__ value that contains backslashes not slashes, and this code
> doesn't know how to shorten those.  So maybe instead of lobotomizing
> this test, we should fix that.

Did that, but we'll have to wait a few hours to see if it makes
drongo happy.

regards, tom lane

Re: libpq debug log

2021-04-02 Thread Tom Lane

Alvaro Herrera  writes:
> As in the attached patch.

+1, but the comment could be more specific.  Maybe like "In regress mode,
suppress the length of ErrorResponse and NoticeResponse messages --- the F
(file name) field, in particular, can vary in length depending on compiler
used."

Actually though ... I recall now that elog.c tries to standardize the F
output by stripping the path part:

/* keep only base name, useful especially for vpath builds */
slash = strrchr(filename, '/');
if (slash)
filename = slash + 1;

I bet what is happening on drongo is that the compiler has generated a
__FILE__ value that contains backslashes not slashes, and this code
doesn't know how to shorten those.  So maybe instead of lobotomizing
this test, we should fix that.

slash = strrchr(filename, '/');
+   if (!slash)
+   slash = strrchr(filename, '\\');
if (slash)
filename = slash + 1;

regards, tom lane

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera

On 2021-Apr-02, Alvaro Herrera wrote:

> On 2021-Apr-02, Kyotaro Horiguchi wrote:
> 
> > At Fri, 02 Apr 2021 14:40:09 +0900 (JST), Kyotaro Horiguchi 
> >  wrote in 
> 
> > > So, the cheapest measure for regression test would be just making the
> > 
> > So, the cheapest measure for regression test would be just *masking* the
> > 
> > > length field, while regress is true...
> 
> Yeah.

As in the attached patch.

-- 
Álvaro Herrera   Valdivia, Chile
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)
>From 387a15ec80cfcdd2279b109215b9f658bc972748 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 2 Apr 2021 10:51:42 -0300
Subject: [PATCH] Suppress length of Notice/Error in PQtrace regress mode

---
 src/interfaces/libpq/fe-trace.c  | 9 -
 .../modules/libpq_pipeline/traces/pipeline_abort.trace   | 8 
 src/test/modules/libpq_pipeline/traces/transaction.trace | 6 +++---
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/interfaces/libpq/fe-trace.c b/src/interfaces/libpq/fe-trace.c
index 9a4595f5c8..298713c602 100644
--- a/src/interfaces/libpq/fe-trace.c
+++ b/src/interfaces/libpq/fe-trace.c
@@ -540,7 +540,14 @@ pqTraceOutputMessage(PGconn *conn, const char *message, bool toServer)
 	length = (int) pg_ntoh32(length);
 	logCursor += 4;
 
-	fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
+	/*
+	 * In regress mode, suppress the length of ErrorResponse and
+	 * NoticeResponse -- they vary depending on compiler.
+	 */
+	if (regress && !toServer && (id == 'E' || id == 'N'))
+		fprintf(conn->Pfdebug, "%s\tNN\t", prefix);
+	else
+		fprintf(conn->Pfdebug, "%s\t%d\t", prefix, length);
 
 	switch (id)
 	{
diff --git a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
index b061435785..254e485997 100644
--- a/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
+++ b/src/test/modules/libpq_pipeline/traces/pipeline_abort.trace
@@ -1,5 +1,5 @@
 F	42	Query	 "DROP TABLE IF EXISTS pq_pipeline_demo"
-B	123	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_demo" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	5	ReadyForQuery	 I
 F	99	Query	 "CREATE UNLOGGED TABLE pq_pipeline_demo(id serial primary key, itemno integer,int8filler int8);"
@@ -27,7 +27,7 @@ B	4	ParseComplete
 B	4	BindComplete
 B	4	NoData
 B	15	CommandComplete	 "INSERT 0 1"
-B	217	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42883" M "function no_such_function(integer) does not exist" H "No function matches the given name and argument types. You might need to add explicit type casts." P "8" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 B	4	ParseComplete
 B	4	BindComplete
@@ -39,7 +39,7 @@ F	12	Bind	 "" "" 0 0 0
 F	6	Describe	 P ""
 F	9	Execute	 "" 0
 F	4	Sync
-B	123	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "42601" M "cannot insert multiple commands into a prepared statement" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	54	Parse	 "" "SELECT 1.0/g FROM generate_series(3, -1, -1) g" 0
 F	12	Bind	 "" "" 0 0 0
@@ -52,7 +52,7 @@ B	33	RowDescription	 1 "?column?"  0  65535 -1 0
 B	32	DataRow	 1 22 '0.'
 B	32	DataRow	 1 22 '0.5000'
 B	32	DataRow	 1 22 '1.'
-B	70	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
+B	NN	ErrorResponse	 S "ERROR" V "ERROR" C "22012" M "division by zero" F "" L "" R "" \x00
 B	5	ReadyForQuery	 I
 F	40	Query	 "SELECT itemno FROM pq_pipeline_demo"
 B	31	RowDescription	 1 "itemno"  2  4 -1 0
diff --git a/src/test/modules/libpq_pipeline/traces/transaction.trace b/src/test/modules/libpq_pipeline/traces/transaction.trace
index 0267a534fa..1dcc2373c0 100644
--- a/src/test/modules/libpq_pipeline/traces/transaction.trace
+++ b/src/test/modules/libpq_pipeline/traces/transaction.trace
@@ -1,5 +1,5 @@
 F	79	Query	 "DROP TABLE IF EXISTS pq_pipeline_tst;CREATE TABLE pq_pipeline_tst (id int)"
-B	122	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_tst" does not exist, skipping" F "" L "" R "" \x00
+B	NN	NoticeResponse	 S "NOTICE" V "NOTICE" C "0" M "table "pq_pipeline_tst" does not exist, skipping" F "" L "" R "" \x00
 B	15	CommandComplete	 "DROP TABLE"
 B	17	CommandComplete	 "CREATE TABLE"
 B	5

Re: libpq debug log

2021-04-02 Thread Alvaro Herrera

On 2021-Apr-02, Kyotaro Horiguchi wrote:

> At Fri, 02 Apr 2021 14:40:09 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 

> > So, the cheapest measure for regression test would be just making the
> 
> So, the cheapest measure for regression test would be just *masking* the
> 
> > length field, while regress is true...

Yeah.

> tired?

Same here.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)

Re: libpq debug log

2021-04-01 Thread Kyotaro Horiguchi

At Fri, 02 Apr 2021 01:45:06 -0400, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> > At Fri, 2 Apr 2021 02:56:44 +, "iwata@fujitsu.com" 
> >  wrote in 
> >> Do ErrorResponse and NoticeResponse vary from test to test ...?
> >> If so, it seemed difficult to make the trace logs available for regression 
> >> test. 
> >> I will also investigate the cause of this issue.
> 
> > The redacted fields, F, L and R contained source file, souce line and
> > source function respectively. It is reasonable guess that the
> > difference comes from them but I'm not sure how they make a difference
> > of 50 bytes in length...
> 
> Some compilers include the full path of the source file in F ...

Ah. I suspected that but dismissed the possibility looking that on my
environment, gcc8.  That completely makes sense. Thank you!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-04-01 Thread Kyotaro Horiguchi

At Fri, 02 Apr 2021 14:40:09 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Fri, 2 Apr 2021 02:56:44 +, "iwata@fujitsu.com" 
>  wrote in 
> > Hi Alvaro san
> > 
> > Thank you for your fix of trace log code.
> > 
> > > From: 'alvhe...@alvh.no-ip.org' 
> > > Sent: Friday, April 2, 2021 11:30 AM
> > ...
> > > It still didn't fix it!  Drongo is now reporting a difference in the 
> > > expected trace --
> > > and the differences all seem to be message lengths.
> > > Now that is pretty mysterious, because the messages themselves are printed
> > > identically.  Perl's output is pretty unhelpful, but I wrote them to a 
> > > file and diffed
> > > manually; it's attached.
> >
> > Do ErrorResponse and NoticeResponse vary from test to test ...?
> > If so, it seemed difficult to make the trace logs available for regression 
> > test. 
> > I will also investigate the cause of this issue.
> 
> The redacted fields, F, L and R contained source file, souce line and
> source function respectively. It is reasonable guess that the
> difference comes from them but I'm not sure how they make a difference
> of 50 bytes in length...
> 
> Anyway if the length is wrong, we should get an error after emitting
> the log line.
> 
> > if (logCursor - 1 != length)
> > fprintf(conn->Pfdebug,
> > "mismatched message length: consumed %d, 
> > expected %d\n",
> > logCursor - 1, length);
> 
> So, the cheapest measure for regression test would be just making the

So, the cheapest measure for regression test would be just *masking* the

> length field, while regress is true...

tired?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-04-01 Thread Tom Lane

Kyotaro Horiguchi  writes:
> At Fri, 2 Apr 2021 02:56:44 +, "iwata@fujitsu.com" 
>  wrote in 
>> Do ErrorResponse and NoticeResponse vary from test to test ...?
>> If so, it seemed difficult to make the trace logs available for regression 
>> test. 
>> I will also investigate the cause of this issue.

> The redacted fields, F, L and R contained source file, souce line and
> source function respectively. It is reasonable guess that the
> difference comes from them but I'm not sure how they make a difference
> of 50 bytes in length...

Some compilers include the full path of the source file in F ...

regards, tom lane

Re: libpq debug log

2021-04-01 Thread Kyotaro Horiguchi

At Fri, 2 Apr 2021 02:56:44 +, "iwata@fujitsu.com" 
 wrote in 
> Hi Alvaro san
> 
> Thank you for your fix of trace log code.
> 
> > From: 'alvhe...@alvh.no-ip.org' 
> > Sent: Friday, April 2, 2021 11:30 AM
> ...
> > It still didn't fix it!  Drongo is now reporting a difference in the 
> > expected trace --
> > and the differences all seem to be message lengths.
> > Now that is pretty mysterious, because the messages themselves are printed
> > identically.  Perl's output is pretty unhelpful, but I wrote them to a file 
> > and diffed
> > manually; it's attached.
>
> Do ErrorResponse and NoticeResponse vary from test to test ...?
> If so, it seemed difficult to make the trace logs available for regression 
> test. 
> I will also investigate the cause of this issue.

The redacted fields, F, L and R contained source file, souce line and
source function respectively. It is reasonable guess that the
difference comes from them but I'm not sure how they make a difference
of 50 bytes in length...

Anyway if the length is wrong, we should get an error after emitting
the log line.

>   if (logCursor - 1 != length)
>   fprintf(conn->Pfdebug,
>   "mismatched message length: consumed %d, 
> expected %d\n",
>   logCursor - 1, length);

So, the cheapest measure for regression test would be just making the
length field, while regress is true...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

RE: libpq debug log

2021-04-01 Thread iwata....@fujitsu.com

Hi Alvaro san

Thank you for your fix of trace log code.

> From: 'alvhe...@alvh.no-ip.org' 
> Sent: Friday, April 2, 2021 11:30 AM
...
> It still didn't fix it!  Drongo is now reporting a difference in the expected 
> trace --
> and the differences all seem to be message lengths.
> Now that is pretty mysterious, because the messages themselves are printed
> identically.  Perl's output is pretty unhelpful, but I wrote them to a file 
> and diffed
> manually; it's attached.

Do ErrorResponse and NoticeResponse vary from test to test ...?
If so, it seemed difficult to make the trace logs available for regression 
test. 
I will also investigate the cause of this issue.

Regards,
Aya Iwata

Re: libpq debug log

2021-04-01 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Apr-01, 'alvhe...@alvh.no-ip.org' wrote:

> Ooh, wow ... now that is a silly bug!  Thanks, I'll push the fix in a
> minute.

It still didn't fix it!  Drongo is now reporting a difference in the
expected trace -- and the differences all seem to be message lengths.
Now that is pretty mysterious, because the messages themselves are
printed identically.  Perl's output is pretty unhelpful, but I wrote them to a
file and diffed manually; it's attached.

So for example when we expect
! # B   123 ErrorResponseS "ERROR" V "ERROR" C "42601" M "cannot insert 
multiple commands into a prepared statement" F "" L "" R "" \\x00

we actually get
! # B   173 ErrorResponseS "ERROR" V "ERROR" C "42601" M "cannot insert 
multiple commands into a prepared statement" F "" L "" R "" \\x00

[slaps forehead] I suppose the difference must be that the message
length includes the redacted string fields.  So the file in the F file
is 50 chars longer, *kaboom*.  (I'm actually very surprised that this
didn't blow up earlier.)

I'm not sure what to do about this. Maybe the message length for
ErrorResponse/NoticeResponse ought to be redacted too.

-- 
Álvaro Herrera39°49'30"S 73°17'W

Re: libpq debug log

2021-04-01 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Apr-01, Tom Lane wrote:

> So drongo is still failing, and after a bit of looking around at
> other code I finally got hit with the clue hammer.  Per port.h:
> 
>  * On Windows, setvbuf() does not support _IOLBF mode, and interprets that
>  * as _IOFBF.  To add insult to injury, setvbuf(file, NULL, _IOFBF, 0)
>  * crashes outright if "parameter validation" is enabled.  Therefore, in
>  * places where we'd like to select line-buffered mode, we fall back to
>  * unbuffered mode instead on Windows.  Always use PG_IOLBF not _IOLBF
>  * directly in order to implement this behavior.
> 
> You want to do the honors?  And do something about that shift bug
> while at it.

Ooh, wow ... now that is a silly bug!  Thanks, I'll push the fix in a
minute.

Andrew also noted that the use of command_ok() in the test file eats all
the output and is what is preventing us from seeing it in the first.
I'll try to get rid of that together with the rest.

-- 
Álvaro Herrera   Valdivia, Chile

Re: libpq debug log

2021-04-01 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Apr-01, Tom Lane wrote:

> "'alvhe...@alvh.no-ip.org'"  writes:
> > On 2021-Apr-01, Tom Lane wrote:
> >> BTW, what in the world is this supposed to accomplish?
> >> -(long long) rows_to_send);
> >> +(1L << 62) + (long long) rows_to_send);
> 
> > It makes the text representation wider, which means some buffer fills up
> > faster and the program switches from sending to receiving.
> 
> Hm.  If the actual field value isn't relevant, why bother including
> rows_to_send in it?  A constant string would be simpler and much
> less confusing as to its purpose.

Hah, yeah, we could as well do that I guess :-)

-- 
Álvaro Herrera   Valdivia, Chile
"I think my standards have lowered enough that now I think 'good design'
is when the page doesn't irritate the living f*ck out of me." (JWZ)

Re: libpq debug log

2021-04-01 Thread Tom Lane

So drongo is still failing, and after a bit of looking around at
other code I finally got hit with the clue hammer.  Per port.h:

 * On Windows, setvbuf() does not support _IOLBF mode, and interprets that
 * as _IOFBF.  To add insult to injury, setvbuf(file, NULL, _IOFBF, 0)
 * crashes outright if "parameter validation" is enabled.  Therefore, in
 * places where we'd like to select line-buffered mode, we fall back to
 * unbuffered mode instead on Windows.  Always use PG_IOLBF not _IOLBF
 * directly in order to implement this behavior.

You want to do the honors?  And do something about that shift bug
while at it.

regards, tom lane

Re: libpq debug log

2021-04-01 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> On 2021-Apr-01, Tom Lane wrote:
>> BTW, what in the world is this supposed to accomplish?
>> -(long long) rows_to_send);
>> +(1L << 62) + (long long) rows_to_send);

> It makes the text representation wider, which means some buffer fills up
> faster and the program switches from sending to receiving.

Hm.  If the actual field value isn't relevant, why bother including
rows_to_send in it?  A constant string would be simpler and much
less confusing as to its purpose.

regards, tom lane

Re: libpq debug log

2021-04-01 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Apr-01, Tom Lane wrote:

> BTW, what in the world is this supposed to accomplish?
> 
> -(long long) rows_to_send);
> +(1L << 62) + (long long) rows_to_send);
> 
> Various buildfarm members are complaining that the shift distance
> is more than the width of "long", which I'd just fix with s/L/LL/
> except that the addition of the constant seems just wrong to
> begin with.

It makes the text representation wider, which means some buffer fills up
faster and the program switches from sending to receiving.

-- 
Álvaro Herrera   Valdivia, Chile
"Ni aún el genio muy grande llegaría muy lejos
si tuviera que sacarlo todo de su propio interior" (Goethe)

Re: libpq debug log

2021-04-01 Thread Tom Lane

BTW, what in the world is this supposed to accomplish?

-(long long) rows_to_send);
+(1L << 62) + (long long) rows_to_send);

Various buildfarm members are complaining that the shift distance
is more than the width of "long", which I'd just fix with s/L/LL/
except that the addition of the constant seems just wrong to
begin with.

regards, tom lane

Re: libpq debug log

2021-04-01 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> Eh, so I forgot to strdup(optarg[optind]).  Apparently that works fine
> in glibc but other getopt implementations are likely not so friendly.

Hmm ... interesting theory, but I don't think I buy it, since the
program isn't doing anything that should damage argv[].  I guess
we'll soon find out.

regards, tom lane

Re: libpq debug log

2021-04-01 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-31, Tom Lane wrote:

> I wrote:
> > That is weird - only test 4 (of 8) runs at all, the rest seem to
> > fail to connect.  What's different about pipelined_insert?
> 
> Oh ... there's a pretty obvious theory.  pipelined_insert is
> the only one that is not asked to write a trace file.
> So for some reason, opening the trace file fails.
> (I wonder why we don't see an error message for this though.)

Eh, so I forgot to strdup(optarg[optind]).  Apparently that works fine
in glibc but other getopt implementations are likely not so friendly.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."   (Fotis)
   (http://archives.postgresql.org/pgsql-sql/2006-06/msg00265.php)

Re: libpq debug log

2021-03-31 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-31, Tom Lane wrote:

> While this may have little to do with drongo's issue,
> I'm going to take exception to this bit that I see that
> the patch added to PQtrace():
> 
>   /* Make the trace stream line-buffered */
>   setvbuf(debug_port, NULL, _IOLBF, 0);

Mea culpa.  I added this early on because it made PQtrace's use more
comfortable, but I agree that it's a mistake.  Removed.  I put it in
libpq_pipeline.c instead.

> ... and as for an actual explanation for drongo's issue,
> I'm sort of wondering why libpq_pipeline.c is opening the
> trace file in "w+" mode rather than vanilla "w" mode.
> That seems unnecessary, and maybe it doesn't work so well
> on Windows.

well, at least the MSVC documentation claims that it works, but who
knows.  I removed that too.
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?view=msvc-160

-- 
Álvaro Herrera   Valdivia, Chile

Re: libpq debug log

2021-03-31 Thread Tom Lane

I wrote:
> What I suspect is some Windows dependency in the way that
> 001_libpq_pipeline.pl is setting up the trace output files.

While this may have little to do with drongo's issue,
I'm going to take exception to this bit that I see that
the patch added to PQtrace():

/* Make the trace stream line-buffered */
setvbuf(debug_port, NULL, _IOLBF, 0);

I do not like that on two grounds:

1. The trace file handle belongs to the calling application,
not to libpq.  I do not think libpq has any business editorializing
on the file's settings.

2. POSIX says that setvbuf() must be done before any other
operation on the file.  Since we have no way to know whether
any output has already been written to the trace file, the
above is a spec violation.

... and as for an actual explanation for drongo's issue,
I'm sort of wondering why libpq_pipeline.c is opening the
trace file in "w+" mode rather than vanilla "w" mode.
That seems unnecessary, and maybe it doesn't work so well
on Windows.

regards, tom lane

Re: libpq debug log

2021-03-31 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> On 2021-Mar-31, Tom Lane wrote:
>> So for some reason, opening the trace file fails.
>> (I wonder why we don't see an error message for this though.)

> .. oh, I think we forgot to set conn->Pfdebug = NULL when creating the
> connection.  So when we do PQtrace(), the first thing it does is
> PQuntrace(), and then that tries to do fflush(conn->Pfdebug) ---> crash.
> So this should fix it.

Nope, see the MemSet a few lines further up.  This change seems like
good style, but it won't fix anything --- else we'd have been
having issues with the pre-existing trace logic.

What I suspect is some Windows dependency in the way that
001_libpq_pipeline.pl is setting up the trace output files.
cc'ing Andrew to see if he has any ideas.

regards, tom lane

Re: libpq debug log

2021-03-31 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-31, 'alvhe...@alvh.no-ip.org' wrote:

> .. oh, I think we forgot to set conn->Pfdebug = NULL when creating the
> connection.  So when we do PQtrace(), the first thing it does is
> PQuntrace(), and then that tries to do fflush(conn->Pfdebug) ---> crash.
> So this should fix it.

I tried to use MALLOC_PERTURB_ to prove this theory, but I failed at it.
I'll just push this blind and see what happens.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude."  (Brian Kernighan)

Re: libpq debug log

2021-03-31 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-31, Tom Lane wrote:

> I wrote:
> > That is weird - only test 4 (of 8) runs at all, the rest seem to
> > fail to connect.  What's different about pipelined_insert?
> 
> Oh ... there's a pretty obvious theory.  pipelined_insert is
> the only one that is not asked to write a trace file.
> So for some reason, opening the trace file fails.
> (I wonder why we don't see an error message for this though.)

.. oh, I think we forgot to set conn->Pfdebug = NULL when creating the
connection.  So when we do PQtrace(), the first thing it does is
PQuntrace(), and then that tries to do fflush(conn->Pfdebug) ---> crash.
So this should fix it.

-- 
Álvaro Herrera   Valdivia, Chile
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a90bdb8ab6..56a8266bc3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3952,6 +3952,7 @@ makeEmptyPGconn(void)
 	conn->verbosity = PQERRORS_DEFAULT;
 	conn->show_context = PQSHOW_CONTEXT_ERRORS;
 	conn->sock = PGINVALID_SOCKET;
+	conn->Pfdebug = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe

Re: libpq debug log

2021-03-31 Thread Tom Lane

I wrote:
> That is weird - only test 4 (of 8) runs at all, the rest seem to
> fail to connect.  What's different about pipelined_insert?

Oh ... there's a pretty obvious theory.  pipelined_insert is
the only one that is not asked to write a trace file.
So for some reason, opening the trace file fails.
(I wonder why we don't see an error message for this though.)

regards, tom lane

Re: libpq debug log

2021-03-31 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> This is not the *only* issue though; at least animal drongo shows a
> completely different failure, where the last few tests don't even get to
> run, and the server log just has this:

That is weird - only test 4 (of 8) runs at all, the rest seem to
fail to connect.  What's different about pipelined_insert?

regards, tom lane

Re: libpq debug log

2021-03-31 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-31, Tom Lane wrote:

> I think this is a timing problem that's triggered (on some machines)
> by force_parallel_mode = regress.  Looking at spurfowl's latest
> failure of this type, the postmaster log shows
> 
> 2021-03-31 14:34:54.982 EDT [18233:15] 001_libpq_pipeline.pl LOG:  execute 
> : SELECT 1.0/g FROM generate_series(3, -1, -1) g
> 2021-03-31 14:34:54.992 EDT [18234:1] ERROR:  division by zero
> 2021-03-31 14:34:54.992 EDT [18234:2] STATEMENT:  SELECT 1.0/g FROM 
> generate_series(3, -1, -1) g
> 2021-03-31 14:34:54.993 EDT [18233:16] 001_libpq_pipeline.pl ERROR:  division 
> by zero
> 2021-03-31 14:34:54.993 EDT [18233:17] 001_libpq_pipeline.pl STATEMENT:  
> SELECT 1.0/g FROM generate_series(3, -1, -1) g
> 2021-03-31 14:34:54.995 EDT [18216:4] LOG:  background worker "parallel 
> worker" (PID 18234) exited with exit code 1
> 2021-03-31 14:34:54.995 EDT [18233:18] 001_libpq_pipeline.pl LOG:  could not 
> send data to client: Broken pipe
> 2021-03-31 14:34:54.995 EDT [18233:19] 001_libpq_pipeline.pl FATAL:  
> connection to client lost
> 
> We can see that the division by zero occurred in a parallel worker.
> My theory is that in parallel mode, it's uncertain whether the
> error will be reported before or after the "preceding" successful
> output rows.  So you need to disable parallelism to make this
> test case stable.

Ooh, that makes sense, thanks.  Added a SET in the program itself to set
it to off.

This is not the *only* issue though; at least animal drongo shows a
completely different failure, where the last few tests don't even get to
run, and the server log just has this:

2021-03-31 20:20:14.460 UTC [178364:4] 001_libpq_pipeline.pl LOG:  
disconnection: session time: 0:00:00.469 user=pgrunner database=postgres 
host=127.0.0.1 port=52638
2021-03-31 20:20:14.681 UTC [178696:1] [unknown] LOG:  connection received: 
host=127.0.0.1 port=52639
2021-03-31 20:20:14.687 UTC [178696:2] [unknown] LOG:  connection authorized: 
user=pgrunner database=postgres application_name=001_libpq_pipeline.pl
2021-03-31 20:20:15.157 UTC [178696:3] 001_libpq_pipeline.pl LOG:  could not 
receive data from client: An existing connection was forcibly closed by the 
remote host.

I suppose the program is crashing for some reason.

-- 
Álvaro Herrera   Valdivia, Chile
"Los dioses no protegen a los insensatos.  Éstos reciben protección de
otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)

Re: libpq debug log

2021-03-31 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> So crake failed.  The failure is that it doesn't print the DataRow
> messages.  That's quite odd.  We see this in the trace log:

I think this is a timing problem that's triggered (on some machines)
by force_parallel_mode = regress.  Looking at spurfowl's latest
failure of this type, the postmaster log shows

2021-03-31 14:34:54.982 EDT [18233:15] 001_libpq_pipeline.pl LOG:  execute 
: SELECT 1.0/g FROM generate_series(3, -1, -1) g
2021-03-31 14:34:54.992 EDT [18234:1] ERROR:  division by zero
2021-03-31 14:34:54.992 EDT [18234:2] STATEMENT:  SELECT 1.0/g FROM 
generate_series(3, -1, -1) g
2021-03-31 14:34:54.993 EDT [18233:16] 001_libpq_pipeline.pl ERROR:  division 
by zero
2021-03-31 14:34:54.993 EDT [18233:17] 001_libpq_pipeline.pl STATEMENT:  SELECT 
1.0/g FROM generate_series(3, -1, -1) g
2021-03-31 14:34:54.995 EDT [18216:4] LOG:  background worker "parallel worker" 
(PID 18234) exited with exit code 1
2021-03-31 14:34:54.995 EDT [18233:18] 001_libpq_pipeline.pl LOG:  could not 
send data to client: Broken pipe
2021-03-31 14:34:54.995 EDT [18233:19] 001_libpq_pipeline.pl FATAL:  connection 
to client lost

We can see that the division by zero occurred in a parallel worker.
My theory is that in parallel mode, it's uncertain whether the
error will be reported before or after the "preceding" successful
output rows.  So you need to disable parallelism to make this
test case stable.

regards, tom lane

RE: libpq debug log

2021-03-30 Thread tsunakawa.ta...@fujitsu.com

From: 'alvhe...@alvh.no-ip.org' 
> > got expected ERROR:  cannot insert multiple commands into a prepared
> statement
> > got expected division-by-zero
> 
> Eh?  this is not at all expected, of course, but clearly not PQtrace's
> fault.  I'll look tomorrow.

Iwata-san and I were starting to investigate the issue.  I guessed the first 
message was not expected.  Please let us know if there's something we can help.

(I was amazed that you finally committed this great feature, libpq pipeline, 
while you are caring about many patches.)


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-30, 'alvhe...@alvh.no-ip.org' wrote:

> got expected ERROR:  cannot insert multiple commands into a prepared statement
> got expected division-by-zero

Eh?  this is not at all expected, of course, but clearly not PQtrace's
fault.  I'll look tomorrow.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"El sentido de las cosas no viene de las cosas, sino de
las inteligencias que las aplican a sus problemas diarios
en busca del progreso." (Ernesto Hernández-Novich)

RE: libpq debug log

2021-03-30 Thread tsunakawa.ta...@fujitsu.com

From: 'alvhe...@alvh.no-ip.org' 
> Okay, pushed this patch and the new testing for it based on
> libpq_pipeline.  We'll see how the buildfarm likes it.

Thank you very much!  I appreciate you taking your valuable time while I 
imagine you must be pretty busy with taking care of other (possibly more 
important) patches.

TBH, when Tom-san suggested drastic change, I was afraid we may not be able to 
complete this in PG 14.  But in the end, I'm very happy that the patch has 
become much leaner and cleaner.

And congratulations on your first commit, Iwata-san!  I hope you can have time 
and energy to try adding a connection parameter to enable tracing, which 
eliminates application modification.


> I didn't like the idea of silently skipping the redacted fields, so I
> changed the code to print  or  instead.  I also made the
> redacting occur in the last mile (pqTraceOutputInt32 / String) rather
> that in their callers: it seemed quite odd to advance the cursor in the
> "else" branch.
> 
> I refactored the duplicate code that appeared for Notice and Error.
> In that function, we redact not only the 'L' field (what Iwata-san was
> doing) but also 'F' (file) and 'R' (routine) because those things can
> move around for reasons that are not interesting to testing this code.
> 
> In the libpq_pipeline commit I added 'pipeline_abort' and 'transaction'
> to the cases that generate traces, which adds coverage for
> NoticeResponse and ErrorResponse.

These make sense to me.  Thank you for repeatedly polishing and making the 
patch better much.



Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'



So crake failed.  The failure is that it doesn't print the DataRow
messages.  That's quite odd.  We see this in the trace log:

Mar 30 20:05:15 # F 54  Parse"" "SELECT 1.0/g FROM 
generate_series(3, -1, -1) g" 0
Mar 30 20:05:15 # F 12  Bind "" "" 0 0 0
Mar 30 20:05:15 # F 6   Describe P ""
Mar 30 20:05:15 # F 9   Execute  "" 0
Mar 30 20:05:15 # F 4   Sync
Mar 30 20:05:15 # B 4   ParseComplete
Mar 30 20:05:15 # B 4   BindComplete
Mar 30 20:05:15 # B 33  RowDescription   1 "?column?"  0  65535 
-1 0
Mar 30 20:05:15 # B 70  ErrorResponseS "ERROR" V "ERROR" C "22012" 
M "division by zero" F "" L "" R "" \\x00
Mar 30 20:05:15 # B 5   ReadyForQueryI

and the expected is this:

Mar 30 20:05:15 # F 54  Parse"" "SELECT 1.0/g FROM 
generate_series(3, -1, -1) g" 0
Mar 30 20:05:15 # F 12  Bind "" "" 0 0 0
Mar 30 20:05:15 # F 6   Describe P ""
Mar 30 20:05:15 # F 9   Execute  "" 0
Mar 30 20:05:15 # F 4   Sync
Mar 30 20:05:15 # B 4   ParseComplete
Mar 30 20:05:15 # B 4   BindComplete
Mar 30 20:05:15 # B 33  RowDescription   1 "?column?"  0  65535 
-1 0
Mar 30 20:05:15 # B 32  DataRow  1 22 '0.'
Mar 30 20:05:15 # B 32  DataRow  1 22 '0.5000'
Mar 30 20:05:15 # B 32  DataRow  1 22 '1.'
Mar 30 20:05:15 # B 70  ErrorResponseS "ERROR" V "ERROR" C "22012" 
M "division by zero" F "" L "" R "" \\x00
Mar 30 20:05:15 # B 5   ReadyForQueryI


...

I wonder if this is a libpq pipeline problem rather than a PQtrace
problem.  In that test case we're using the libpq singlerow mode, so we
should see three rows before the error is sent, but for some reason
crake is not doing that.

aborted pipeline... NOTICE:  table "pq_pipeline_demo" does not exist, skipping
ok
got expected ERROR:  cannot insert multiple commands into a prepared statement
got expected division-by-zero
ok 5 - libpq_pipeline pipeline_abort
not ok 6 - pipeline_abort trace match


Other hosts seem to get it right:

# Running: libpq_pipeline -t 
/Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/modules/libpq_pipeline/tmp_check/log/pipeline_abort.trace
 pipeline_abort port=49797 
host=/var/folders/md/8mp8j5m5169ccgy11plb4w_8gp/T/iA9YP9_IHa 
dbname='postgres' 1
aborted pipeline... NOTICE:  table "pq_pipeline_demo" does not exist, skipping
ok
got expected ERROR:  cannot insert multiple commands into a prepared statement
got row: 0.
got row: 0.5000
got row: 1.
got expected division-by-zero
ok 5 - libpq_pipeline pipeline_abort
ok 6 - pipeline_abort trace match


-- 
Álvaro Herrera   Valdivia, Chile
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php

Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'

Okay, pushed this patch and the new testing for it based on
libpq_pipeline.  We'll see how the buildfarm likes it.

I made some further changes to the last version; user-visibly, I split
the trace flags in two, keeping the timestamp suppression separate from
the redacting feature for regression testing.

I didn't like the idea of silently skipping the redacted fields, so I
changed the code to print  or  instead.  I also made the
redacting occur in the last mile (pqTraceOutputInt32 / String) rather
that in their callers: it seemed quite odd to advance the cursor in the
"else" branch.

I refactored the duplicate code that appeared for Notice and Error.
In that function, we redact not only the 'L' field (what Iwata-san was
doing) but also 'F' (file) and 'R' (routine) because those things can
move around for reasons that are not interesting to testing this code.

In the libpq_pipeline commit I added 'pipeline_abort' and 'transaction'
to the cases that generate traces, which adds coverage for
NoticeResponse and ErrorResponse.

-- 
Álvaro Herrera39°49'30"S 73°17'W

RE: libpq debug log

2021-03-30 Thread iwata....@fujitsu.com

Hi Alvaro san, Tsunakawa san

Thank you for creating the v30 patch.

> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Monday, March 29, 2021 9:45 AM
...
> Iwata-san,
> Please review Alvaro-san's code, and I think you can integrate all patches 
> into
> one except for 0002 and 0007.  Those two patches may be separate or
> merged into one as a test patch.

I reviewed v30 patches. I think it was good except that the documentation about 
the direction of the message was not changing.
I also tried the v30 patch using regression test and it worked fine.
I merged v30 patches and update the patch to v31.

This new version patch includes the fix of libpq.smgl fix and the addition of 
regression test mode.
In libpq.smgl, the symbol indicating the message direction has been corrected 
from "<" ">" to "B" "F" in the documentation.

> From: alvhe...@alvh.no-ip.org 
> Sent: Sunday, March 28, 2021 4:28 AM
...
> Maybe the easiest way is to have a new flag PQTRACE_REGRESS_MODE.

To prepare for regression test, I read Message Formats documentation.
https://www.postgresql.org/docs/current/protocol-message-formats.html

Following protocol messages have values that depend on the code of master at 
that time;
- BackendKeyData(B) ... includes backend PID and backend private key
- ErrorResponse(B) ... includes error message line number
- FunctionCall(F) ... includes function OID
- NoticeResponse(B) ... includes notice message line number
- NotificationResponse (B) ... includes backend PID
- ParameterDescription  ... includes parameter OID
- Parse(F) ... includes parameter data type OID
- RowDescription(B) ... includes OIDs

I checked status of conn->pqTraceFlags to decide whether output 
version-dependent messages or not in above protocol message output functions.
In ErrorResponse and NoticeResponse, I skip string type message logging only 
when field type code is "L".
In my understanding, other field code message type does not depend on version.
So I didn't skip other code type's string messages.
And I also changed description of pqTraceSetFlags()
by changing PQTRACE_SUPPRESS_TIMESTAMPS flag to the PQTRACE_REGRESS_MODE flag.

Output of regress mode is following;

B   124 ErrorResponseS "ERROR" V "ERROR" C "22023" M "unrecognized 
parameter "some_nonexistent_parameter"" F "reloptions.c" L R 
"parseRelOptionsInternal" \x00
B   14  ParameterDescription 2   

Output of non-regress mode is following;

2021-03-30 12:55:31.327913  B   124 ErrorResponseS "ERROR" V "ERROR" C 
"22023" M "unrecognized parameter "some_nonexistent_parameter"" F 
"reloptions.c" L "1447" R "parseRelOptionsInternal" \x00
2021-03-30 12:56:12.691617  B   14  ParameterDescription 2 25 701

Regards,
Aya Iwata


v31-libpq-trace-log.patch
Description: v31-libpq-trace-log.patch

RE: libpq debug log

2021-03-29 Thread tsunakawa.ta...@fujitsu.com

From: 'alvhe...@alvh.no-ip.org' 
> > > (Hey, what the heck is that "Z" at the end of the message?  I thought
> > > the error ended right at the \x00 ...)
> >
> > We'll investigate these issues.
> 
> For what it's worth, I did fix this problem in patch 0005 that I
> attached.  The problem was that one "continue" should have been "break",
> and also a "while ( .. )" needed to be made an infinite loop.  It was
> easy to catch these problems once I added (in 0006) the check that the
> bytes consumed equal message length, as I had suggested a couple of
> weeks ago :-)  I also changed the code for Notice, but I didn't actually
> verify that one.

You already fix the issue, didn't you?  Thank you.  Your suggestion of checking 
the length has surely proved to be correct.


Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-29 Thread tsunakawa.ta...@fujitsu.com

From: Kyotaro Horiguchi 
> It's better to be short as far as it is clear enough. Actually '<' to
> 'F' and '>' to 'B' is clear enough to me. So I don't need a longer
> notation.

Agreed, because the message format description in the PG manual uses F and B.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-28 Thread Kyotaro Horiguchi

At Mon, 29 Mar 2021 00:02:58 -0300, "'alvhe...@alvh.no-ip.org'" 
 wrote in 
> On 2021-Mar-29, tsunakawa.ta...@fujitsu.com wrote:
> 
> > > (Hey, what the heck is that "Z" at the end of the message?  I thought
> > > the error ended right at the \x00 ...)
> > 
> > We'll investigate these issues.
> 
> For what it's worth, I did fix this problem in patch 0005 that I
> attached.  The problem was that one "continue" should have been "break",
> and also a "while ( .. )" needed to be made an infinite loop.  It was
> easy to catch these problems once I added (in 0006) the check that the
> bytes consumed equal message length, as I had suggested a couple of
> weeks ago :-)  I also changed the code for Notice, but I didn't actually
> verify that one.
> 
> > > 2. The < and > characters are not good for visual inspection.  I
> > > replaced them with F and B and I think it's much clearer.  Compare:
> > > I think the second one is much easier on the eye.
> > 
> > Yes, agreed.  I too thought of something like "C->S" and "S->C"
> > because client and server should be more familiar for users than
> > frontend and backend.
> 
> Hmm, yeah, that's a reasonable option too.  What do others think?

It's better to be short as far as it is clear enough. Actually '<' to
'F' and '>' to 'B' is clear enough to me. So I don't need a longer
notation.  O(ut) and (I)n also makes sense to me.  Rather, "C->S", and
"S->C" are a little difficult to understand at a glance

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-03-28 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-29, tsunakawa.ta...@fujitsu.com wrote:

> > (Hey, what the heck is that "Z" at the end of the message?  I thought
> > the error ended right at the \x00 ...)
> 
> We'll investigate these issues.

For what it's worth, I did fix this problem in patch 0005 that I
attached.  The problem was that one "continue" should have been "break",
and also a "while ( .. )" needed to be made an infinite loop.  It was
easy to catch these problems once I added (in 0006) the check that the
bytes consumed equal message length, as I had suggested a couple of
weeks ago :-)  I also changed the code for Notice, but I didn't actually
verify that one.

> > 2. The < and > characters are not good for visual inspection.  I
> > replaced them with F and B and I think it's much clearer.  Compare:
> > I think the second one is much easier on the eye.
> 
> Yes, agreed.  I too thought of something like "C->S" and "S->C"
> because client and server should be more familiar for users than
> frontend and backend.

Hmm, yeah, that's a reasonable option too.  What do others think?

-- 
Álvaro Herrera   Valdivia, Chile

RE: libpq debug log

2021-03-28 Thread tsunakawa.ta...@fujitsu.com

From: alvhe...@alvh.no-ip.org 
> > Proposed changes on top of v29.
> 
> This last one uses libpq_pipeline -t and verifies the output against an 
> expected
> trace file.  Applies on top of all the previous patches.  I attach the whole 
> lot,
> so that the CF bot has a chance to run it.

Thank you for polishing the patch.

Iwata-san,
Please review Alvaro-san's code, and I think you can integrate all patches into 
one except for 0002 and 0007.  Those two patches may be separate or merged into 
one as a test patch.


> I did notice another problem for comparison of expected trace files, which is
> that RowDescription includes table OIDs for some columns.  I think we would
> need to have a flag to suppress that too, somehow, although the answer to what
> should we do is not as clear as for the other two cases.

I'm afraid this may render the test comparison almost impossible.  Tests that 
access system catalogs and large objects probably output OIDs.



Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-28 Thread tsunakawa.ta...@fujitsu.com

From: alvhe...@alvh.no-ip.org 
> I added an option to the new libpq_pipeline program that it activates
> libpq trace.  It works nicely and I think we can add that to the
> regression tests.

That's good news.  Thank you.



> 1. The trace output for the error message won't be very nice, because it
> prints line numbers.  So if I want to match the output to an "expected"
> file, it would break every time somebody edits the source file where the
> error originates and the ereport() line is moved.  For example:

> (Hey, what the heck is that "Z" at the end of the message?  I thought
> the error ended right at the \x00 ...)

We'll investigate these issues.


> 2. The < and > characters are not good for visual inspection.  I
> replaced them with F and B and I think it's much clearer.  Compare:
> I think the second one is much easier on the eye.

Yes, agreed.  I too thought of something like "C->S" and "S->C" because client 
and server should be more familiar for users than frontend and backend.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-27 Thread alvhe...@alvh.no-ip.org

On 2021-Mar-27, alvhe...@alvh.no-ip.org wrote:

> This last one uses libpq_pipeline -t and verifies the output against an
> expected trace file.  Applies on top of all the previous patches.  I
> attach the whole lot, so that the CF bot has a chance to run it.

All tests pass, but CFbot does not run src/test/modules, so it's not
saying much.

-- 
Álvaro Herrera39°49'30"S 73°17'W

Re: libpq debug log

2021-03-27 Thread alvhe...@alvh.no-ip.org

On 2021-Mar-26, alvhe...@alvh.no-ip.org wrote:

> Proposed changes on top of v29.

This last one uses libpq_pipeline -t and verifies the output against an
expected trace file.  Applies on top of all the previous patches.  I
attach the whole lot, so that the CF bot has a chance to run it.

I did notice another problem for comparison of expected trace files,
which is that RowDescription includes table OIDs for some columns.  I
think we would need to have a flag to suppress that too, somehow,
although the answer to what should we do is not as clear as for the
other two cases.

I dodged the issue by just using -t for the pipeline cases that don't
have OIDs in their output.  This leaves us with no coverage for the
ErrorResponse message, which I think is a shortcoming.  If we fixed the
OID problem and the ErrorResponse problem, we could add an expected
trace for pipeline_abort.  I think we should do that.

Maybe the easiest way is to have a new flag PQTRACE_REGRESS_MODE.

-- 
Álvaro Herrera39°49'30"S 73°17'W
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)
>From f239da90f069b19e1c2fa6b69bc7c6fd4be826b9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 26 Mar 2021 11:10:52 -0300
Subject: [PATCH v30 1/7] libpq trace v29

---
 doc/src/sgml/libpq.sgml |  39 +-
 src/interfaces/libpq/Makefile   |   1 +
 src/interfaces/libpq/exports.txt|   1 +
 src/interfaces/libpq/fe-connect.c   |  21 -
 src/interfaces/libpq/fe-exec.c  |   4 -
 src/interfaces/libpq/fe-misc.c  |  66 +--
 src/interfaces/libpq/fe-protocol3.c |   8 +
 src/interfaces/libpq/libpq-fe.h |   2 +
 src/interfaces/libpq/libpq-int.h|   7 +
 src/interfaces/libpq/libpq-trace.c  | 717 
 10 files changed, 789 insertions(+), 77 deletions(-)
 create mode 100644 src/interfaces/libpq/libpq-trace.c

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index be674fbaa9..838e394d54 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -6459,12 +6459,27 @@ PGContextVisibility PQsetErrorContextVisibility(PGconn *conn, PGContextVisibilit
 
 
  
-  Enables  tracing of the client/server communication to a debugging file stream.
+  Enables tracing of the client/server communication to a debugging file
+  stream.
 
 void PQtrace(PGconn *conn, FILE *stream);
 
  
 
+ 
+  Each line consists of: an optional timestamp, a direction indicator
+  ( for messages from client to server
+  or  for messages from server to client),
+  message length, message type, and message contents.
+  Non-message contents fields (timestamp, direction, length and message type)
+  are separated by a tab. Message contents are separated by a space.
+  Protocol strings are enclosed in double quotes, while strings used as data
+  values are enclosed in single quotes.  Non-printable chars are printed as
+  hexadecimal escapes.
+  Further message-type-specific detail can be found in
+  .
+ 
+
  
   
On Windows, if the libpq library and an application are
@@ -6479,6 +6494,28 @@ void PQtrace(PGconn *conn, FILE *stream);
 

 
+   
+PQtraceSetFlagsPQtraceSetFlags
+
+
+ 
+  Controls the tracing behavior of client/server communication.
+
+void PQtraceSetFlags(PGconn *conn, int flags);
+
+ 
+
+ 
+  flags contains flag bits describing the operating mode
+  of tracing.
+  If flags contains PQTRACE_SUPPRESS_TIMESTAMPS,
+  then the timestamp is not included when printing each message.
+  This function must be called after calling PQtrace.
+ 
+
+
+   
+

 PQuntracePQuntrace
 
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index 2aca882a2b..0424523492 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -41,6 +41,7 @@ OBJS = \
 	fe-secure.o \
 	legacy-pqsignal.o \
 	libpq-events.o \
+	libpq-trace.o \
 	pqexpbuffer.o \
 	fe-auth.o
 
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 5c48c14191..a00701f2c5 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -183,3 +183,4 @@ PQenterPipelineMode   180
 PQexitPipelineMode181
 PQpipelineSync182
 PQpipelineStatus  183
+PQtraceSetFlags   184
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 53b354abb2..a90bdb8ab6 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -6859,27 +6859,6 @@ PQsetErrorContextVisibility(PGconn *conn, PGContextVisibility show_context)
 	return old;
 }
 
-void
-PQtrace(PGconn *conn, FILE *debug_port)
-{
-	if (conn == NULL)
-		return;
-	PQuntrace(conn);
-	conn->Pfdebug = debug_port;
-}
-
-void
-PQuntrace(PGconn *conn)
-{
-	if (conn ==

Re: libpq debug log

2021-03-26 Thread alvhe...@alvh.no-ip.org

Proposed changes on top of v29.

-- 
Álvaro Herrera   Valdivia, Chile
>From b32ae3805bb28553c0a1cf308c6ed27f58576f3c Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 26 Mar 2021 19:12:12 -0300
Subject: [PATCH 1/5] libpq_pipeline: add -t support for PQtrace

---
 .../modules/libpq_pipeline/libpq_pipeline.c   | 84 +--
 1 file changed, 59 insertions(+), 25 deletions(-)

diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index 846ee9f5ab..34d085035b 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -23,6 +23,7 @@
 #include "catalog/pg_type_d.h"
 #include "common/fe_memutils.h"
 #include "libpq-fe.h"
+#include "pg_getopt.h"
 #include "portability/instr_time.h"
 
 
@@ -30,6 +31,9 @@ static void exit_nicely(PGconn *conn);
 
 const char *const progname = "libpq_pipeline";
 
+/* Options and defaults */
+char	   *tracefile = NULL;	/* path to PQtrace() file */
+
 
 #define DEBUG
 #ifdef DEBUG
@@ -1209,8 +1213,10 @@ usage(const char *progname)
 {
 	fprintf(stderr, "%s tests libpq's pipeline mode.\n\n", progname);
 	fprintf(stderr, "Usage:\n");
-	fprintf(stderr, "  %s tests", progname);
-	fprintf(stderr, "  %s testname [conninfo [number_of_rows]]\n", progname);
+	fprintf(stderr, "  %s [OPTION] tests\n", progname);
+	fprintf(stderr, "  %s [OPTION] TESTNAME [CONNINFO [NUMBER_OF_ROWS]\n", progname);
+	fprintf(stderr, "\nOptions:\n");
+	fprintf(stderr, "  -t TRACEFILE   generate a libpq trace to TRACEFILE\n");
 }
 
 static void
@@ -1231,37 +1237,54 @@ main(int argc, char **argv)
 {
 	const char *conninfo = "";
 	PGconn	   *conn;
+	FILE	   *trace;
+	char	   *testname;
 	int			numrows = 1;
 	PGresult   *res;
+	int			c;
 
-	if (strcmp(argv[1], "tests") == 0)
+	while ((c = getopt(argc, argv, "t:")) != -1)
 	{
-		print_test_list();
-		exit(0);
+		switch (c)
+		{
+			case 't':			/* trace file */
+tracefile = pg_strdup(optarg);
+break;
+		}
 	}
 
-	/*
-	 * The testname parameter is mandatory; it can be followed by a conninfo
-	 * string and number of rows.
-	 */
-	if (argc < 2 || argc > 4)
+	if (optind < argc)
+	{
+		testname = argv[optind];
+		optind++;
+	}
+	else
 	{
 		usage(argv[0]);
 		exit(1);
 	}
 
-	if (argc >= 3)
-		conninfo = pg_strdup(argv[2]);
+	if (strcmp(testname, "tests") == 0)
+	{
+		print_test_list();
+		exit(0);
+	}
 
-	if (argc >= 4)
+	if (optind < argc)
+	{
+		conninfo = argv[optind];
+		optind++;
+	}
+	if (optind < argc)
 	{
 		errno = 0;
-		numrows = strtol(argv[3], NULL, 10);
+		numrows = strtol(argv[optind], NULL, 10);
 		if (errno != 0 || numrows <= 0)
 		{
-			fprintf(stderr, "couldn't parse \"%s\" as a positive integer\n", argv[3]);
+			fprintf(stderr, "couldn't parse \"%s\" as a positive integer\n", argv[optind]);
 			exit(1);
 		}
+		optind++;
 	}
 
 	/* Make a connection to the database */
@@ -1272,30 +1295,41 @@ main(int argc, char **argv)
 PQerrorMessage(conn));
 		exit_nicely(conn);
 	}
+
+	/* Set the trace file, if requested */
+	if (tracefile != NULL)
+	{
+		trace = fopen(tracefile, "w+");
+
+		if (trace == NULL)
+			pg_fatal("could not open file \"%s\": %m", tracefile);
+		PQtrace(conn, trace);
+		PQtraceSetFlags(conn, PQTRACE_SUPPRESS_TIMESTAMPS);
+	}
+
 	res = PQexec(conn, "SET lc_messages TO \"C\"");
 	if (PQresultStatus(res) != PGRES_COMMAND_OK)
 		pg_fatal("failed to set lc_messages: %s", PQerrorMessage(conn));
 
-	if (strcmp(argv[1], "disallowed_in_pipeline") == 0)
+	if (strcmp(testname, "disallowed_in_pipeline") == 0)
 		test_disallowed_in_pipeline(conn);
-	else if (strcmp(argv[1], "multi_pipelines") == 0)
+	else if (strcmp(testname, "multi_pipelines") == 0)
 		test_multi_pipelines(conn);
-	else if (strcmp(argv[1], "pipeline_abort") == 0)
+	else if (strcmp(testname, "pipeline_abort") == 0)
 		test_pipeline_abort(conn);
-	else if (strcmp(argv[1], "pipelined_insert") == 0)
+	else if (strcmp(testname, "pipelined_insert") == 0)
 		test_pipelined_insert(conn, numrows);
-	else if (strcmp(argv[1], "prepared") == 0)
+	else if (strcmp(testname, "prepared") == 0)
 		test_prepared(conn);
-	else if (strcmp(argv[1], "simple_pipeline") == 0)
+	else if (strcmp(testname, "simple_pipeline") == 0)
 		test_simple_pipeline(conn);
-	else if (strcmp(argv[1], "singlerow") == 0)
+	else if (strcmp(testname, "singlerow") == 0)
 		test_singlerowmode(conn);
-	else if (strcmp(argv[1], "transaction") == 0)
+	else if (strcmp(testname, "transaction") == 0)
 		test_transaction(conn);
 	else
 	{
-		fprintf(stderr, "\"%s\" is not a recognized test name\n", argv[1]);
-		usage(argv[0]);
+		fprintf(stderr, "\"%s\" is not a recognized test name\n", testname);
 		exit(1);
 	}
 
-- 
2.20.1

>From 4b62c3159fe3aa8445317a5d65b7e81d91c7fba6 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 26 Mar 2021 19:13:04 -0300
Subject: [PATCH 2/5] put FILE * arg always first

---
 src/interfaces/libpq/libpq-trace.c | 234 ++---
 1

Re: libpq debug log

2021-03-26 Thread alvhe...@alvh.no-ip.org

Hello

I added an option to the new libpq_pipeline program that it activates
libpq trace.  It works nicely and I think we can add that to the
regression tests.  However I have two observations.

1. The trace output for the error message won't be very nice, because it
prints line numbers.  So if I want to match the output to an "expected"
file, it would break every time somebody edits the source file where the
error originates and the ereport() line is moved.  For example:

<   70  ErrorResponseS "ERROR" V "ERROR" C "22012" M 
"division by zero" F "numeric.c" L "8366" R "div_var" \x00 "Z"

The line number 8366 in this line would be problematic.  I think if we
want regression testing for this, we should have another trace flag to
suppress output of a few of these fields.  I would have it print only S,
C and M.

(Hey, what the heck is that "Z" at the end of the message?  I thought
the error ended right at the \x00 ...)


2. The < and > characters are not good for visual inspection.  I
replaced them with F and B and I think it's much clearer.  Compare:

>   27  Query"SET lc_messages TO "C""
<   8   CommandComplete  "SET"
<   5   ReadyForQueryI
>   21  Parse"" "SELECT $1" 1 23
>   19  Bind "" "" 0 1 1 '1' 1 0
>   6   Describe P ""
>   9   Execute  "" 0
>   4   Sync
<   4   ParseComplete
<   4   BindComplete
<   33  RowDescription   1 "?column?" 0 0 23 4 -1 0
<   11  DataRow  1 1 '1'
<   13  CommandComplete  "SELECT 1"
<   5   ReadyForQueryI
>   4   Terminate

with

F   27  Query"SET lc_messages TO "C""
B   8   CommandComplete  "SET"
B   5   ReadyForQueryI
F   21  Parse"" "SELECT $1" 1 23
F   19  Bind "" "" 0 1 1 '1' 1 0
F   6   Describe P ""
F   9   Execute  "" 0
F   4   Sync
B   4   ParseComplete
B   4   BindComplete
B   33  RowDescription   1 "?column?" 0 0 23 4 -1 0
B   11  DataRow  1 1 '1'
B   13  CommandComplete  "SELECT 1"
B   5   ReadyForQueryI
F   4   Terminate

I think the second one is much easier on the eye.

(This one is the output from "libpq_pipeline simple_pipeline").

-- 
Álvaro Herrera   Valdivia, Chile
"Saca el libro que tu religión considere como el indicado para encontrar la
oración que traiga paz a tu alma. Luego rebootea el computador
y ve si funciona" (Carlos Duclós)

RE: libpq debug log

2021-03-19 Thread tsunakawa.ta...@fujitsu.com

Alvaro-san, Iwata-san,
cc: Tom-san, Horiguchi-san,


Thank you, it finally looks complete to me!

Alvaro-san,
We appreciate your help and patience, sometimes rewriting large part of the 
patch.  Could you do the final check (and possibly tweak) for commit?



Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-19 Thread iwata....@fujitsu.com

Hi Tsunakawa san,

Thank you for your review. I update patch to v29.

Output is following. It is fine.
```
2021-03-19 07:21:09.917302  >   4   Terminate   
2021-03-19 07:21:09.961494  >   155 Query"CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(some_nonexistent_parameter = true);"
2021-03-19 07:21:09.962657  <   124 ErrorResponseS "ERROR" V "ERROR" C 
"22023" M "unrecognized parameter "some_nonexistent_parameter"" F 
"reloptions.c" L "1456" R "parseRelOptionsInternal" \x00 "Z"
2021-03-19 07:21:09.962702  <   5   ReadyForQueryI  
2021-03-19 07:21:09.962767  >   144 Query"CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(random_page_cost = 3.0);"
2021-03-19 07:21:09.967056  <   22  CommandComplete  "CREATE TABLESPACE"
2021-03-19 07:21:09.967092  <   5   ReadyForQueryI  
2021-03-19 07:21:09.967148  >   81  Query"SELECT spcoptions FROM 
pg_tablespace WHERE spcname = 'regress_tblspacewith';"
2021-03-19 07:21:09.970338  <   35  RowDescription   1 "spcoptions" 1213 5 1009 
65535 -1 0
2021-03-19 07:21:09.970402  <   32  DataRow  1 22 '{random_page_cost=3.0}'  
2021-03-19 07:21:09.970420  <   13  CommandComplete  "SELECT 1" 
2021-03-19 07:21:09.970431  <   5   ReadyForQueryI  
2021-03-19 07:21:09.970558  >   42  Query"DROP TABLESPACE 
regress_tblspacewith;"
2021-03-19 07:21:09.971500  <   20  CommandComplete  "DROP TABLESPACE"  
2021-03-19 07:21:09.971561  <   5   ReadyForQueryI   
```

> -Original Message-
> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Friday, March 19, 2021 11:37 AM


> Thanks to removing the static arrays, the code got much smaller.  I'm happy
> with this result.

Thank you so much. Your advice was very helpful in refactoring the patch.

> (1)
> +  ( for messages from client to server,
> +  and  for messages from server to client),
> 
> I think this was meant to say "> or <".  So, maybe you should remove "," at
> the end of the first line, and replace "and" with "or".

I fixed.

> 
> 
> (2)
> + case 8 :/* GSSENCRequest or SSLRequest */
> + /* These messsage does not reach here. */
> 
> messsage does -> messages do

I fixed.

> (3)
> + fprintf(f, "\tAuthentication");
> 
> Why don't you move this \t in each message type to the end of:
> 
> + fprintf(conn->Pfdebug, "%s\t%s\t%d", timestr, prefix, length);

I fixed.

> 
> Plus, don't we say in the manual that fields (timestamp, direction, length,
> message type, and message content) are separated by a tab?

Sure. I added following documentation;
+ Non-message contents fields (timestamp, direction, length and message type)
+ are separated by a tab. Message contents are separated by a space.

> Also, the code doesn't seem to separate the message type and content with a
> tab.

I fixed to output a tab at the end of message types.

> (4)
> Where did the processing for unknown message go in
> pqTraceOutputMessage()?  Add default label?


> (5)
> + case 16:/* CancelRequest */
> + fprintf(conn->Pfdebug,
> "%s\t>\t%d\tCancelRequest", timestr, length);
> 
> I think this should follow pqTraceOutputMessage().  That is, each case label
> only print the message type name in case other message types are added in
> the future.

Sure. I fixed pqTraceOutputNoTypeByteMessage() following to 
pqTraceOutputMessage() style.


Regards,
Aya Iwata


v29-libpq-trace-log.patch
Description: v29-libpq-trace-log.patch

RE: libpq debug log

2021-03-18 Thread tsunakawa.ta...@fujitsu.com

Hello Iwata-san,


Thanks to removing the static arrays, the code got much smaller.  I'm happy 
with this result.


(1)
+  ( for messages from client to server,
+  and  for messages from server to client),

I think this was meant to say "> or <".  So, maybe you should remove "," at the 
end of the first line, and replace "and" with "or".


(2)
+   case 8 :/* GSSENCRequest or SSLRequest */
+   /* These messsage does not reach here. */

messsage does -> messages do


(3)
+   fprintf(f, "\tAuthentication");

Why don't you move this \t in each message type to the end of:

+   fprintf(conn->Pfdebug, "%s\t%s\t%d", timestr, prefix, length);

Plus, don't we say in the manual that fields (timestamp, direction, length, 
message type, and message content) are separated by a tab?
Also, the code doesn't seem to separate the message type and content with a tab.


(4)
Where did the processing for unknown message go in pqTraceOutputMessage()?  Add 
default label?


(5)
+   case 16:/* CancelRequest */
+   fprintf(conn->Pfdebug, "%s\t>\t%d\tCancelRequest", 
timestr, length);

I think this should follow pqTraceOutputMessage().  That is, each case label 
only print the message type name in case other message types are added in the 
future.



Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-18 Thread iwata....@fujitsu.com

Hi Horiguchi san and Tsunakawa san,

Thank you for you review. 

I update patch to v28.  In this patch, I removed array.
And I fixed some code according to Horiguchi san and Tsunakawa san review 
comment.

> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Thursday, March 18, 2021 12:38 PM
> I sort of think so to remove the big arrays.  But to minimize duplicate code, 
> I
> think the code structure will look like:
> 
>   fprintf(timestamp, length);
>   switch (type)
>   {
>   case '?':
>   pqTraceOutput?(...);
>   break;
>   case '?':
>   /* No message content */
>   fprintf("message_type_name");
>   break;
>   }
> 
> void
> pqTraceOutput?(...)
> {
>   fprintf("message_type_name");
>   print message content;
> }

The code follows above format.
And I changed .sgml documentation;
- Changed order of message length and type
- Removed byte-16 and byte-32 explanation because I removed # output in 
previous patch.

Output style is following;

```
2021-03-18 08:59:36.141660  <   5   ReadyForQuery I 
  
2021-03-18 08:59:36.141723  >   4   Terminate   
  
2021-03-18 08:59:36.173263  >   155 Query "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(some_nonexistent_parameter = true);"
2021-03-18 08:59:36.174439  <   124 ErrorResponse S "ERROR" V "ERROR" C "22023" 
M "unrecognized parameter "some_nonexistent_parameter"" F "reloptions.c" L 
"1456" R "parseRelOptionsInternal" \x00 "Z"
2021-03-18 08:59:36.174483  <   5   ReadyForQuery I 
  
2021-03-18 08:59:36.174545  >   144 Query "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(random_page_cost = 3.0);"
2021-03-18 08:59:36.176155  <   22  CommandComplete "CREATE TABLESPACE" 
  
2021-03-18 08:59:36.176190  <   5   ReadyForQuery I 
  
2021-03-18 08:59:36.176243  >   81  Query "SELECT spcoptions FROM pg_tablespace 
WHERE spcname = 'regress_tblspacewith';"
  
2021-03-18 08:59:36.179286  <   35  RowDescription 1 "spcoptions" 1213 5 1009 
65535 -1 0
2021-03-18 08:59:36.179326  <   32  DataRow 1 22 '{random_page_cost=3.0}'   
  
2021-03-18 08:59:36.179339  <   13  CommandComplete "SELECT 1"  
2021-03-18 08:59:36.179349  <   5   ReadyForQuery I 
2021-03-18 08:59:36.179504  >   42  Query "DROP TABLESPACE 
regress_tblspacewith;"
2021-03-18 08:59:36.180400  <   20  CommandComplete "DROP TABLESPACE"   
2021-03-18 08:59:36.180432  <   5   ReadyForQuery I   
```


> -Original Message-
> From: Kyotaro Horiguchi 
> Sent: Thursday, March 18, 2021 5:30 PM
> 
> At Thu, 18 Mar 2021 07:34:36 +, "tsunakawa.ta...@fujitsu.com"
>  wrote in
> > From: Iwata, Aya/岩田 彩 
> > > > Yes, precisely, 2 bytes for the double quotes needs to be
> > > > subtracted as
> > > > follows:
> > > >
> > > > len = fprintf(...);
> > > > *cursor += (len - 2);
> > >
> > > Thank you for your advice. I changed pqTraceOutputString set cursor
> > > to fprintf return -2.
> > > And I removed cursor movement from that function.
> >
> > Ouch, not 2 but 3, to include a single whitespace at the beginning.
> >
> > The rest looks good.  I hope we're almost at the finish line.
> 
> Maybe.

String is end by '\0'. So (len -2) is OK. 
However it seems like mistake because fprintf output string and 3 characters. 
So I added explanation here and changed (len -2) to (len -3 +1). 
I think it is OK because I found similar style in ecpg code.

> > + pqTraceOutputR(const char *message, FILE *f) {
> > +   int cursor = 0;
> > +
> > +   pqTraceOutputInt32(message, , f);
> >
> > I don't understand the reason for spliting message and  here.
> >
> > + pqTraceOutputR(const char *message, FILE *f) {
> > +   char *p = message;
> > +
> > +   pqTraceOutputInt32(, f);
> >
> > works well.
> 
> Yes, that would also work.  But I like the separate cursor + fixed starting
> point here, probably because it's sometimes confusing to see the pointer
> value changed inside functions.  (And a pointer itself is an allergy for some
> people, not to mention a pointer to ointer...)  Also, libpq uses cursors for
> network I/O buffers.  So, I think the patch can be as it is now.

I think pass message and cursor is better. Because it is easy to understand and
The moving cursor style is used when libpq execute protocol message.
So I didn't make this change.
 
> +/* RowDescription */
> +static void
> +pqTraceOutputT(const char *message, int end, FILE *f) {
> + int cursor = 0;
> + int nfields;
> + int i;
> +
> + nfields = pqTraceOutputInt16(message, , f);
> +
> + for (i = 0; i < nfields; i++)
> + {

RE: libpq debug log

2021-03-18 Thread tsunakawa.ta...@fujitsu.com

From: Kyotaro Horiguchi 
> + pqTraceOutputR(const char *message, FILE *f)
> + {
> + int cursor = 0;
> +
> + pqTraceOutputInt32(message, , f);
> 
> I don't understand the reason for spliting message and  here.
> 
> + pqTraceOutputR(const char *message, FILE *f)
> + {
> + char *p = message;
> +
> + pqTraceOutputInt32(, f);
> 
> works well.

Yes, that would also work.  But I like the separate cursor + fixed starting 
point here, probably because it's sometimes confusing to see the pointer value 
changed inside functions.  (And a pointer itself is an allergy for some people, 
not to mention a pointer to ointer...)  Also, libpq uses cursors for network 
I/O buffers.  So, I think the patch can be as it is now.


> +static void
> +pqTraceOutputT(const char *message, int end, FILE *f)
> +{
> + int cursor = 0;
> + int nfields;
> + int i;
> +
> + nfields = pqTraceOutputInt16(message, , f);
> +
> + for (i = 0; i < nfields; i++)
> + {
> + pqTraceOutputString(message, , end, f);
> + pqTraceOutputInt32(message, , f);
> + pqTraceOutputInt16(message, , f);
> + pqTraceOutputInt32(message, , f);
> + pqTraceOutputInt16(message, , f);
> + pqTraceOutputInt32(message, , f);
> + pqTraceOutputInt16(message, , f);
> + }
> +}
> 
> I didn't looked closer, but lookong the usage of the variable "end",
> something's wrong in the function.

Ah, end doesn't serve any purpose.  I guess the compiler emitted a warning "end 
is not used".  I think end can be removed.



Regards
Takayuki Tsunakawa}

Re: libpq debug log

2021-03-18 Thread Kyotaro Horiguchi

At Thu, 18 Mar 2021 07:34:36 +, "tsunakawa.ta...@fujitsu.com" 
 wrote in 
> From: Iwata, Aya/岩田 彩 
> > > Yes, precisely, 2 bytes for the double quotes needs to be subtracted
> > > as
> > > follows:
> > >
> > >   len = fprintf(...);
> > >   *cursor += (len - 2);
> > 
> > Thank you for your advice. I changed pqTraceOutputString set cursor to 
> > fprintf
> > return -2.
> > And I removed cursor movement from that function.
> 
> Ouch, not 2 but 3, to include a single whitespace at the beginning.
> 
> The rest looks good.  I hope we're almost at the finish line.

Maybe.

At Wed, 17 Mar 2021 13:36:32 -0300, Alvaro Herrera  
wrote in 
> In pqTraceOutputString(), you can use the return value from fprintf to
> move the cursor -- no need to count chars.
> 
> I still think that the message-type specific functions should print the
> message type, rather than having the string arrays.

In other words, pqTraceOutputMessage recognizes message id and calls
the function corresponding one-on-one to the id. So the functions
knows what is the message type of myself and there's no reason for
pqTraceOutputMessage to print the message type on its behalf.


+ pqTraceOutputR(const char *message, FILE *f)
+ {
+   int cursor = 0;
+ 
+   pqTraceOutputInt32(message, , f);

I don't understand the reason for spliting message and  here.

+ pqTraceOutputR(const char *message, FILE *f)
+ {
+   char *p = message;
+ 
+   pqTraceOutputInt32(, f);

works well.


+/* RowDescription */
+static void
+pqTraceOutputT(const char *message, int end, FILE *f)
+{
+   int cursor = 0;
+   int nfields;
+   int i;
+
+   nfields = pqTraceOutputInt16(message, , f);
+
+   for (i = 0; i < nfields; i++)
+   {
+   pqTraceOutputString(message, , end, f);
+   pqTraceOutputInt32(message, , f);
+   pqTraceOutputInt16(message, , f);
+   pqTraceOutputInt32(message, , f);
+   pqTraceOutputInt16(message, , f);
+   pqTraceOutputInt32(message, , f);
+   pqTraceOutputInt16(message, , f);
+   }
+}

I didn't looked closer, but lookong the usage of the variable "end",
something's wrong in the function.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

RE: libpq debug log

2021-03-18 Thread tsunakawa.ta...@fujitsu.com

From: Iwata, Aya/岩田 彩 
> > Yes, precisely, 2 bytes for the double quotes needs to be subtracted
> > as
> > follows:
> >
> > len = fprintf(...);
> > *cursor += (len - 2);
> 
> Thank you for your advice. I changed pqTraceOutputString set cursor to fprintf
> return -2.
> And I removed cursor movement from that function.

Ouch, not 2 but 3, to include a single whitespace at the beginning.

The rest looks good.  I hope we're almost at the finish line.


Regards
Takayuki Tsunakawa}

RE: libpq debug log

2021-03-18 Thread iwata....@fujitsu.com

Hi Alvaro san and Tsunakawa san,

Thank you for your review. I updated patch to v27.

`make check` output is following. I think it is OK.
```
2021-03-18 07:02:55.090598  <   ReadyForQuery   5 I 
2021-03-18 07:02:55.090672  >   Terminate   4   
2021-03-18 07:02:55.120492  >   Query   155 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(some_nonexistent_parameter = true);"
2021-03-18 07:02:55.121624  <   ErrorResponse   124 S "ERROR" V "ERROR" C 
"22023" M "unrecognized parameter "some_nonexistent_parameter"" F 
"reloptions.c" L "1447" R "parseRelOptionsInternal" \x00 "Z"
2021-03-18 07:02:55.121664  <   ReadyForQuery   5 I 
2021-03-18 07:02:55.121728  >   Query   144 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(random_page_cost = 3.0);"
2021-03-18 07:02:55.123335  <   CommandComplete 22 "CREATE TABLESPACE"  
2021-03-18 07:02:55.123400  <   ReadyForQuery   5 I 
2021-03-18 07:02:55.123460  >   Query   81 "SELECT spcoptions FROM 
pg_tablespace WHERE spcname = 'regress_tblspacewith';"
2021-03-18 07:02:55.126556  <   RowDescription  35 1 "spcoptions" 1213 5 1009 
65535 -1 0
2021-03-18 07:02:55.126594  <   DataRow 32 1 22 '{random_page_cost=3.0}'
2021-03-18 07:02:55.126607  <   CommandComplete 13 "SELECT 1"   
2021-03-18 07:02:55.126617  <   ReadyForQuery   5 I 
```

> Iwata-san,
> Why don't you submit v27 patch with the current arrays kept, and then v28
> with the arrays removed soon after?

And I will try to remove byte1 type table for v28 patch.

> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Thursday, March 18, 2021 12:38 PM
> From: Alvaro Herrera 
> > In pqTraceOutputString(), you can use the return value from fprintf to
> > move the cursor -- no need to count chars.
> 
> Yes, precisely, 2 bytes for the double quotes needs to be subtracted as
> follows:
> 
>   len = fprintf(...);
>   *cursor += (len - 2);

Thank you for your advice. I changed pqTraceOutputString set cursor to fprintf  
return -2.
And I removed cursor movement from that function.

> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Thursday, March 18, 2021 11:52 AM
... 
> I'll look at the comments from Alvaro-san and Horiguchi-san.  Here are my
> review comments:

Thank you for your review. I am sorry previous patch contain some mistake.

> (23)
> + /* Trace message only when there is first 1 byte */
> + if (conn->Pfdebug && conn->outCount < conn->outMsgStart)
> + {
> + if (conn->outCount < conn->outMsgStart)
> + pqTraceOutputMessage(conn, conn->outBuffer +
> conn->outCount, true);
> + else
> + pqTraceOutputNoTypeByteMessage(conn,
> +
>   conn->outBuffer + conn->outMsgStart);
> + }
> 
> The inner else path doesn't seem to be reached because both the outer and
> inner if contain the same condition.  I think you want to remove the second
> condition from the outer if.

Yes, I remove second condition.

> (24) pqTraceOutputNoTypeByteMessage
> + case 16:/* CancelRequest */
> + fprintf(conn->Pfdebug,
> "%s\t>\tCancelRequest\t%d", timestr, length);
> + pqTraceOutputInt32(message, ,
> conn->Pfdebug);
> + pqTraceOutputInt32(message, ,
> conn->Pfdebug);
> + break;
> 
> Another int32 data needs to be output as follows:
> 
> --
> Int32(80877102)
> The cancel request code. The value is chosen to contain 1234 in the most
> significant 16 bits, and 5678 in the least significant 16 bits. (To avoid
> confusion, this code must not be the same as any protocol version number.)
> 
> Int32
> The process ID of the target backend.
> 
> Int32
> The secret key for the target backend.
> --

Thank you. I read document again and I add one pqTraceOutputInt32().

> (25)
> + case 8 :/* GSSENRequest or SSLRequest */
> 
> GSSENRequest -> GSSENCRequest

Thank you. I fixed.


> (26)
> +static void
> +pqTraceOutputByte1(const char *data, int *cursor, FILE *pfdebug) {
> + const char *v = data + *cursor;
> + /*
> 
> +static void
> +pqTraceOutputf(const char *message, int end, FILE *f) {
> + int cursor = 0;
> + pqTraceOutputString(message, , end, f); }
> 
> Put an empty line to separate the declaration part and execution part.

Thank you. I fixed this. I add space anywhere in pqTraceOutput? function.

> (27)
> + const char  *message_type = "UnkownMessage";
> +
> + id = message[LogCursor++];
> +
> + memcpy(, message + LogCursor , 4);
> + length = (int) pg_ntoh32(length);
> + LogCursor += 4;
> + LogEnd = length - 4;
> 
> + /* Get

RE: libpq debug log

2021-03-17 Thread tsunakawa.ta...@fujitsu.com

From: Alvaro Herrera 
> In pqTraceOutputString(), you can use the return value from fprintf to
> move the cursor -- no need to count chars.

Yes, precisely, 2 bytes for the double quotes needs to be subtracted as follows:

len = fprintf(...);
*cursor += (len - 2);


> I still think that the message-type specific functions should print the
> message type, rather than having the string arrays.

I sort of think so to remove the big arrays.  But to minimize duplicate code, I 
think the code structure will look like:

fprintf(timestamp, length);
switch (type)
{
case '?':
pqTraceOutput?(...);
break;
case '?':
/* No message content */
fprintf("message_type_name");
break;
}   

void
pqTraceOutput?(...)
{
fprintf("message_type_name");
print message content;  
}

The order of length and message type is reversed.  The .sgml should also be 
changed accordingly.  What do you think?

Iwata-san, 
Why don't you submit v27 patch with the current arrays kept, and then v28 with 
the arrays removed soon after?


From: Kyotaro Horiguchi 
> It would help when the value is "255", which is confusing between -1
> (or 255) in byte or 255 in 2-byte word. Or when the value looks like
> broken. I'd suggest "b"yte for 1 byte, "s"hort for 2 bytes, "l"ong for
> 4 bytes ('l' is confusing with '1', but anyway it is not needed)..

I don't have a strong opinion on this.  (I kind of think I want to see 
unprefixed numbers; readers will look at the protocol reference anyway.)  I'd 
like to leave this up to Iwata-san and Alvaro-san.


Regards
Takayuki Tsunakawa}

RE: libpq debug log

2021-03-17 Thread tsunakawa.ta...@fujitsu.com

I'll look at the comments from Alvaro-san and Horiguchi-san.  Here are my 
review comments:


(23)
+   /* Trace message only when there is first 1 byte */
+   if (conn->Pfdebug && conn->outCount < conn->outMsgStart)
+   {
+   if (conn->outCount < conn->outMsgStart)
+   pqTraceOutputMessage(conn, conn->outBuffer + 
conn->outCount, true);
+   else
+   pqTraceOutputNoTypeByteMessage(conn,
+   
conn->outBuffer + conn->outMsgStart);
+   }

The inner else path doesn't seem to be reached because both the outer and inner 
if contain the same condition.  I think you want to remove the second condition 
from the outer if.


(24) pqTraceOutputNoTypeByteMessage
+   case 16:/* CancelRequest */
+   fprintf(conn->Pfdebug, "%s\t>\tCancelRequest\t%d", 
timestr, length);
+   pqTraceOutputInt32(message, , conn->Pfdebug);
+   pqTraceOutputInt32(message, , conn->Pfdebug);
+   break;

Another int32 data needs to be output as follows:

--
Int32(80877102)
The cancel request code. The value is chosen to contain 1234 in the most 
significant 16 bits, and 5678 in the least significant 16 bits. (To avoid 
confusion, this code must not be the same as any protocol version number.)

Int32
The process ID of the target backend.

Int32
The secret key for the target backend.
--


(25)
+   case 8 :/* GSSENRequest or SSLRequest */

GSSENRequest -> GSSENCRequest


(26)
+static void
+pqTraceOutputByte1(const char *data, int *cursor, FILE *pfdebug)
+{
+   const char *v = data + *cursor;
+   /*

+static void
+pqTraceOutputf(const char *message, int end, FILE *f)
+{
+   int cursor = 0;
+   pqTraceOutputString(message, , end, f);
+}

Put an empty line to separate the declaration part and execution part.


(27)
+   const char  *message_type = "UnkownMessage";
+
+   id = message[LogCursor++];
+
+   memcpy(, message + LogCursor , 4);
+   length = (int) pg_ntoh32(length);
+   LogCursor += 4;
+   LogEnd = length - 4;

+   /* Get a protocol type from first byte identifier */
+   if (toServer &&
+   id < lengthof(protocol_message_type_f) &&
+   protocol_message_type_f[(unsigned char)id] != NULL)
+   message_type = protocol_message_type_f[(unsigned char)id];
+   else if (!toServer &&
+   id < lengthof(protocol_message_type_b) &&
+   protocol_message_type_b[(unsigned char)id] != NULL)
+   message_type = protocol_message_type_b[(unsigned char)id];
+   else
+   {
+   fprintf(conn->Pfdebug, "Unknown message: %02x\n", id);
+   return;
+   }
+

The initial value "UnkownMessage" is not used.  So, you can initialize 
message_type with NULL and do like:

+if (...)
+   ...
+   else if (...)
+   ...
+
+   if (message_type == NULL)
+   {
+   fprintf(conn->Pfdebug, "Unknown message: %02x\n", id);
+   return;
+

Plus, I think this should be done before looking at the message length.


(28)
pqTraceOutputBinary() is only used in pqTraceOutputNchar().  Do they have to be 
separated?


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-17 Thread Alvaro Herrera

Hello

In pqTraceOutputString(), you can use the return value from fprintf to
move the cursor -- no need to count chars.

I still think that the message-type specific functions should print the
message type, rather than having the string arrays.

-- 
Álvaro Herrera   Valdivia, Chile
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"

RE: libpq debug log

2021-03-17 Thread iwata....@fujitsu.com

Hi Tsunakawa san, Alvaro san,

Thank you very much for your review. It helped me a lot to make the patch 
better.
I update patch to v26. 
This patch has been updated according to Tsunakawa san and Alvaro san review 
comments.

The output is following;
```
2021-03-17 14:43:16.411238  >   Terminate   4   
  
2021-03-17 14:43:16.441775  >   Query   155 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(some_nonexistent_parameter = true);"
2021-03-17 14:43:16.442935  <   ErrorResponse   124 S "ERROR" V "ERROR" C 
"22023" M "unrecognized parameter "some_nonexistent_parameter"" F 
"reloptions.c" L "1447" R "parseRelOptionsInternal" \x00 "Z"
2021-03-17 14:43:16.442977  <   ReadyForQuery   5 I 
  
2021-03-17 14:43:16.443042  >   Query   144 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(random_page_cost = 3.0);"
2021-03-17 14:43:16.444774  <   CommandComplete 22 "CREATE TABLESPACE"  
  
2021-03-17 14:43:16.444807  <   ReadyForQuery   5 I 
2021-03-17 14:43:16.444878  >   Query   81 "SELECT spcoptions FROM 
pg_tablespace WHERE spcname = 'regress_tblspacewith';"
2021-03-17 14:43:16.448117  <   RowDescription  35 1 "spcoptions" 1213 5 1009 
65535 -1 0
2021-03-17 14:43:16.448157  <   DataRow 32 1 22 '{random_page_cost=3.0}'
  
2021-03-17 14:43:16.448171  <   CommandComplete 13 "SELECT 1"   
```

> -Original Message-
> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Tuesday, March 16, 2021 12:03 PM
 
 
> (1)
> -  Enables  tracing of the client/server communication to a debugging
> file stream.
> +  Enables tracing of the client/server communication to a debugging file
> +  stream.
> +  Only available when protocol version 3 or higher is used.
> 
> The last line is unnecessary now that v2 is not supported.

I removed last comment from documentation file.

> (2)
> @@ -113,6 +114,7 @@ install: all installdirs install-lib
>   $(INSTALL_DATA) $(srcdir)/libpq-fe.h '$(DESTDIR)$(includedir)'
>   $(INSTALL_DATA) $(srcdir)/libpq-events.h '$(DESTDIR)$(includedir)'
>   $(INSTALL_DATA) $(srcdir)/libpq-int.h
> '$(DESTDIR)$(includedir_internal)'
> + $(INSTALL_DATA) $(srcdir)/libpq-trace.h
> '$(DESTDIR)$(includedir_internal)'
>   $(INSTALL_DATA) $(srcdir)/pqexpbuffer.h
> '$(DESTDIR)$(includedir_internal)'
>   $(INSTALL_DATA) $(srcdir)/pg_service.conf.sample
> '$(DESTDIR)$(datadir)/pg_service.conf.sample'
> 
> Why is this necessary?

I think it is not necessary. I removed it.

> (3) libpq-trace.h
> +#ifdef __cplusplus
> +extern "C"
> +{
> 
> This is unnecessary because pqTraceOutputMessage() is for libpq's internal
> use.  This extern is for the user's C++ application to call public libpq
> functions.
> 
> +#include "libpq-fe.h"
> +#include "libpq-int.h"
> 
> I think these can be removed by declaring the struct and function as follows:
> 
> struct pg_conn;
> extern void pqTraceOutputMessage(struct pg_conn *conn, const char
> *message, bool toServer);
> 
> But... after doing the above, only this declaration is left in libpq-trace.h. 
>  Why
> don't you put your original declaration using PGconn in libpq-int.h and remove
> libpq-trace.h?

I remove this file.
I was wondering whether to delete this file during the development of v25 
patch. I leave it because it keep scalability.
If tracing process become update and it have a lot of extern function, leave 
this header file is meaningful.
However I think it does not happen. So I remove it.

> (4)
> + /* Trace message only when there is first 1 byte */
> + if (conn->Pfdebug && (conn->outCount < conn->outMsgStart))
> + pqTraceOutputMessage(conn, conn->outBuffer +
> conn->outCount, true);
> 
> () surrounding the condition after && can be removed.

I removed this (). And This if () statement has been modified according to the 
review comment (14).

> (5)
> +static const char *pqGetProtocolMsgType(unsigned char c,
> +
>   bool toServer);
> 
> This is unnecessary because the function definition precedes its only one call
> site.

Yes, I removed this definition.

> (6)
> + * We accumulate frontend message pieces in an array as the libpq code
> + writes
> + * them, and log the complete message when pqTraceOutputFeMsg is called.
> + * For backend, we print the pieces as soon as we receive them from the
> server.
> 
> The first paragraph is no longer true.  I think this entire comment is
> unnecessary.

I removed this explanation. 

> (7)
> +static const char *
> +pqGetProtocolMsgType(unsigned char c, bool toServer) {
> + if (toServer == true &&
> + c < lengthof(protocol_message_type_f) &&
> + protocol_message_type_f[c] != NULL)
> + return protocol_message_type_f[c];
> +
> + if (toServer == false &&
> + c <

Re: libpq debug log

2021-03-16 Thread Kyotaro Horiguchi

At Wed, 17 Mar 2021 02:09:32 +, "tsunakawa.ta...@fujitsu.com" 
 wrote in 
> Alvaro-san,
> 
> 
> Thank you for taking your time to take a look at an incomplete patch.  I 
> thought I would ask you for final check for commit after Iwata-san has 
> reflected my review comments.
> 
> I discussed with Iwata-sn your below comments.  Let me convey her opinions.  
> (She is now focusing on fixing the patch.)  We'd appreciate if you could 
> tweak her completed patch.
> 
> 
> From: Alvaro Herrera 
> > * There's no cross-check that the protocol message length word
> >   corresponds to what we actually print.  I think one easy way to
> >   cross-check that would be to return the "count" from the type-specific
> >   routines, and verify that it matches "length" in pqTraceOutputMessage.
> >   (If the separate routines are removed and the code put back in one
> >   giant switch, then the "count" is already available when the switch
> >   ends.)
> 
> We don't think the length check is necessary, because 1) for FE->BE, correct 
> messages are always assembled, and 2) for BE->FE, the parsing and 
> decomposition of messages have already checked the messages.

Maybe it is not for the sanity of message bytes but to check if the
logging-stuff is implemented in sync with the messages structure.  If
the given message length differs from the length the logging facility
read after scanning a message bytes, it's sign of something wrong in
*the logging feature*.

> > * If we have separate functions for each message type, then we can have
> >   that function supply the message name by itself, rather than have the
> >   separate protocol_message_type_f / _b arrays that print it.
> 
> I felt those two arrays are clumsy and thought of suggesting to remove them.  
> But I didn't because the functions or case labels for each message type have 
> to duplicate the printing of header fields: timestamp, message type, and 
> length.  Maybe we can change the order of output to "timestamp length 
> message_type content", but I kind of prefer the current order.  What do you 
> think?  (I can understand removing the clumsy static arrays should be better 
> than the order of output fields.)

+1 for removing the arrays.

> > * If we make each type-specific function print the message name, then we
> >   need to make the same function print the message length, because it
> >   comes after.  So the function would have to receive the length (so
> >   that it can be printed).  But I think we should still have the
> >   cross-check I mentioned in my first point occur in the main
> >   pqTraceOutputMessage, not the type-specific function, for fear that we
> >   will forget the cross-check in one of the functions and never realize
> >   that we did.
> 
> As mentioned above, we think the current structure is better for smaller and 
> readable code.
> 
> 
> > * I would make the pqTraceOutputInt16() function and siblings advance
> >   the cursor themselves, actually.  I think something like this:
> >   static int
> >   pqTraceOutputInt16(const char *data, int *cursor, FILE *pfdebug)
> >   {
> >   uint16tmp;
> >   int   result;
> > 
> >   memcpy(, data + *cursor, 2);
> >   *cursor += 2;
> >   result = (int) pg_ntoh16(tmp);
> >   fprintf(pfdebug, " #%d", result);
> > 
> >   return result;
> >   }
> >   (So the caller has to pass the original "data" pointer, not
> >   "data+cursor").  This means the caller no longer has to do "cursor+=N"
> >   at each place.  Also, you get rid of the moveStrCursor() which does
> >   not look good.
> 
> That makes sense, because in fact the patch mistakenly added 4 when it should 
> add 2.  Also, the code would become smaller.

FWIW, that's what I suggested upthread:p  So +3.

me> *I*'d want to make the output functions move the reading pointer by
me> themseves.  pqTradeOutputMsg can be as simplified as the follows doing

> > * I'm not fond of the idea of prefixing "#" for 16bit ints and no
> >   prefix for 32bit ints.  Seems obscure and the output looks weird.
> >   I would use a one-letter prefix for each type, "w" for 32-bit ints
> >   (stands for "word") and "h" for 16-bit ints (stands for "half-word").
> >   Message length goes unadorned.  Then you'd have lines like this
> > 
> > 2021-03-15 08:10:44.324296  <   RowDescription  35 h1 "spcoptions"
> > w1213 h5 w1009 h65535 w-1 h0
> > 2021-03-15 08:10:44.324335  <   DataRow 32 h1 w22
> > '{random_page_cost=3.0}'
> 
> Yes, actually I felt something similar.  Taking a second thought, I think we 
> don't have to have any prefix because it doesn't help users.  So we're 
> removing the prefix.  We don't have any opinion on adding some prefix.

It would help when the value is "255", which is confusing between -1
(or 255) in byte or 255 in 2-byte word. Or when the value looks like
broken. I'd suggest "b"yte for 1 byte, "s"hort for 2 bytes, "l"ong for
4 bytes ('l' is confusing with '1', but anyway it is not needed)..

> >

RE: libpq debug log

2021-03-16 Thread tsunakawa.ta...@fujitsu.com

Alvaro-san,


Thank you for taking your time to take a look at an incomplete patch.  I 
thought I would ask you for final check for commit after Iwata-san has 
reflected my review comments.

I discussed with Iwata-sn your below comments.  Let me convey her opinions.  
(She is now focusing on fixing the patch.)  We'd appreciate if you could tweak 
her completed patch.


From: Alvaro Herrera 
> * There's no cross-check that the protocol message length word
>   corresponds to what we actually print.  I think one easy way to
>   cross-check that would be to return the "count" from the type-specific
>   routines, and verify that it matches "length" in pqTraceOutputMessage.
>   (If the separate routines are removed and the code put back in one
>   giant switch, then the "count" is already available when the switch
>   ends.)

We don't think the length check is necessary, because 1) for FE->BE, correct 
messages are always assembled, and 2) for BE->FE, the parsing and decomposition 
of messages have already checked the messages.


> * If we have separate functions for each message type, then we can have
>   that function supply the message name by itself, rather than have the
>   separate protocol_message_type_f / _b arrays that print it.

I felt those two arrays are clumsy and thought of suggesting to remove them.  
But I didn't because the functions or case labels for each message type have to 
duplicate the printing of header fields: timestamp, message type, and length.  
Maybe we can change the order of output to "timestamp length message_type 
content", but I kind of prefer the current order.  What do you think?  (I can 
understand removing the clumsy static arrays should be better than the order of 
output fields.)


> * If we make each type-specific function print the message name, then we
>   need to make the same function print the message length, because it
>   comes after.  So the function would have to receive the length (so
>   that it can be printed).  But I think we should still have the
>   cross-check I mentioned in my first point occur in the main
>   pqTraceOutputMessage, not the type-specific function, for fear that we
>   will forget the cross-check in one of the functions and never realize
>   that we did.

As mentioned above, we think the current structure is better for smaller and 
readable code.


> * I would make the pqTraceOutputInt16() function and siblings advance
>   the cursor themselves, actually.  I think something like this:
>   static int
>   pqTraceOutputInt16(const char *data, int *cursor, FILE *pfdebug)
>   {
> uint16tmp;
> int   result;
> 
> memcpy(, data + *cursor, 2);
> *cursor += 2;
> result = (int) pg_ntoh16(tmp);
> fprintf(pfdebug, " #%d", result);
> 
> return result;
>   }
>   (So the caller has to pass the original "data" pointer, not
>   "data+cursor").  This means the caller no longer has to do "cursor+=N"
>   at each place.  Also, you get rid of the moveStrCursor() which does
>   not look good.

That makes sense, because in fact the patch mistakenly added 4 when it should 
add 2.  Also, the code would become smaller.


> * I'm not fond of the idea of prefixing "#" for 16bit ints and no
>   prefix for 32bit ints.  Seems obscure and the output looks weird.
>   I would use a one-letter prefix for each type, "w" for 32-bit ints
>   (stands for "word") and "h" for 16-bit ints (stands for "half-word").
>   Message length goes unadorned.  Then you'd have lines like this
> 
> 2021-03-15 08:10:44.324296  <   RowDescription  35 h1 "spcoptions"
> w1213 h5 w1009 h65535 w-1 h0
> 2021-03-15 08:10:44.324335  <   DataRow 32 h1 w22
> '{random_page_cost=3.0}'

Yes, actually I felt something similar.  Taking a second thought, I think we 
don't have to have any prefix because it doesn't help users.  So we're removing 
the prefix.  We don't have any opinion on adding some prefix.


> * I don't like that pqTraceOutputS/H print nothing when !toServer.  I
> think they should complain.

Yes, the caller should not call the function if there's no message content.  
That way, the function doesn't need the toServer argument.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-16 Thread Alvaro Herrera

On 2021-Mar-15, iwata@fujitsu.com wrote:

> I create protocol message reading function for each protocol message type. 
> (Ex. pqTraceOutputB() read Bind message)
> This makes the nesting shallower and makes the code easier to read.

I'm not sure I agree with this structural change.  Yes, it is less
indentation, but these functions are all quite short and simple anyway.
But I don't disagree strongly with it either.

Four issues:
* There's no cross-check that the protocol message length word
  corresponds to what we actually print.  I think one easy way to
  cross-check that would be to return the "count" from the type-specific
  routines, and verify that it matches "length" in pqTraceOutputMessage.
  (If the separate routines are removed and the code put back in one
  giant switch, then the "count" is already available when the switch
  ends.)

* If we have separate functions for each message type, then we can have
  that function supply the message name by itself, rather than have the
  separate protocol_message_type_f / _b arrays that print it.

* If we make each type-specific function print the message name, then we
  need to make the same function print the message length, because it
  comes after.  So the function would have to receive the length (so
  that it can be printed).  But I think we should still have the
  cross-check I mentioned in my first point occur in the main
  pqTraceOutputMessage, not the type-specific function, for fear that we
  will forget the cross-check in one of the functions and never realize
  that we did.

* I would make the pqTraceOutputInt16() function and siblings advance
  the cursor themselves, actually.  I think something like this:
  static int
  pqTraceOutputInt16(const char *data, int *cursor, FILE *pfdebug)
  {
  uint16tmp;
  int   result;

  memcpy(, data + *cursor, 2);
  *cursor += 2;
  result = (int) pg_ntoh16(tmp);
  fprintf(pfdebug, " #%d", result);

  return result;
  }
  (So the caller has to pass the original "data" pointer, not
  "data+cursor").  This means the caller no longer has to do "cursor+=N"
  at each place.  Also, you get rid of the moveStrCursor() which does
  not look good.

Bikeshedding item:
* I'm not fond of the idea of prefixing "#" for 16bit ints and no
  prefix for 32bit ints.  Seems obscure and the output looks weird.
  I would use a one-letter prefix for each type, "w" for 32-bit ints
  (stands for "word") and "h" for 16-bit ints (stands for "half-word"). 
  Message length goes unadorned.  Then you'd have lines like this

2021-03-15 08:10:44.324296  <   RowDescription  35 h1 "spcoptions" w1213 h5 
w1009 h65535 w-1 h0
2021-03-15 08:10:44.324335  <   DataRow 32 h1 w22 '{random_page_cost=3.0}'  

* I don't like that pqTraceOutputS/H print nothing when !toServer.  I
think they should complain.

-- 
Álvaro Herrera39°49'30"S 73°17'W

RE: libpq debug log

2021-03-16 Thread tsunakawa.ta...@fujitsu.com

I've finished the review.  Here are the last set of comments.


(16)
(15) is also true for the processing of 'H' message.


(17) pqTraceOutputD
+   for (i = 0; i < nfields; i++)
+   {
+   len = pqTraceOutputInt32(message + cursor, f);
+   cursor += 4;
+   if (len == -1)
+   break;
+   pqTraceOutputNchar(message + cursor, f, len);
+   cursor += len;
+   }

Not break but continue, because non-NULL columns may follow a NULL column.


(18)
+   case 'E':   /* Execute(F) or Error Return(B) */
+   pqTraceOutputE(message + LogCursor, LogEnd, 
conn->Pfdebug, toServer);
+   break;

Error Return -> ErrorResponse


(19) pqTraceOutputF
Check the protocol again.  Two for loops should be required, one for format 
codes and another for arguments.


(20)
+   if ( len != -1)

Remove extra space before len.


(21)
I guess you intentionally omitted the following messages because they are only 
used during connection establishment.  I'm fine with it.  I wrote this just in 
case you missed them.

GSSENCRequest (F)
Int32(8)

GSSResponse (F)
Byte1('p')

PasswordMessage (F)
Byte1('p')

SASLInitialResponse (F)
Byte1('p')

SASLResponse (F)
Byte1('p')


(22)
+/* NotificationResponse */
+static void
+pqTraceOutputA(const char *message, int end, FILE *f)
+{
+   int cursor = 0;
+
+   pqTraceOutputInt16(message + cursor, f);
+   cursor += 2;
+   pqTraceOutputString(message + cursor, f);

Not Int16 but Int32.


Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-16 Thread tsunakawa.ta...@fujitsu.com

I've not finished reviewing yet, but there seems to be many mistakes.  I'm 
sending second set of review comments now so you can fix them in parallel.


(8)
+   charid = '\0';

This initialization is not required because id will always be assigned a value 
shortly.


(9)
+static int
+pqTraceOutputInt32(const char *data, FILE *pfdebug)
+{
+   int result;
+
+   memcpy(, data, 4);
+   result = (int) pg_ntoh32(result);
+   fputc(' ', pfdebug);
+   fprintf(pfdebug, "%d", result);
+
+   return result;
+}

fputc() and fprintf() can be merged into one fprintf() call for efficiency.

TBH, it feels strange that an "output" function returns a value.  But I 
understood this as a positive compromise for reducing code; without this, the 
caller has to do memcpy() and endianness conversion.


(10)
+/* BackendKeyData */
+static void
+pqTraceOutputK(const char *message, FILE *f)
+{
+   int cursor = 0;
+
+   pqTraceOutputInt32(message + cursor, f);
+   cursor += 4;
+   pqTraceOutputInt32(message + cursor, f);
+}

I don't think you need to always use a cursor variable in order to align with 
more complex messages.  That is, you can just write:

+   pqTraceOutputInt32(message, f);
+   pqTraceOutputInt32(message + 4, f);


(11)
+   default:
+   fprintf(conn->Pfdebug, "Invalid Protocol:%c\n", id);
+   break;
+

(This is related to (7))
You can remove this default label if you exit the function before the switch 
statement when the message type is unknown.  Make sure to output \n in that 
case.


(12) pqTraceOutputB
+   for (i = 0; i < nparams; i++)
+   {
+   nbytes = pqTraceOutputInt32(message + cursor, f);
+   cursor += 4;
+   if (nbytes == -1)
+   break;
+   pqTraceOutputNchar(message + cursor, f, nbytes);
+   cursor += nbytes;
+   }

Not break but continue, because non-NULL parameters may follow a NULL parameter.


(13) pqTraceOutputB
+   pqTraceOutputInt16(message + cursor, f);
+   cursor += 4;
+   pqTraceOutputInt16(message + cursor, f);
+}

This part doesn't seem to match the following description.

--
After the last parameter, the following fields appear:

Int16
The number of result-column format codes that follow (denoted R below). This 
can be zero to indicate that there are no result columns or that the result 
columns should all use the default format (text); or one, in which case the 
specified format code is applied to all result columns (if any); or it can 
equal the actual number of result columns of the query.

Int16[R]
The result-column format codes. Each must presently be zero (text) or one 
(binary).
--


(14)
The processing for CancelRequest message is missing?


(15)
+/* CopyInResponse */
+static void
+pqTraceOutputG(const char *message, int end, FILE *f)
+{
+   int cursor = 0;
+
+   pqTraceOutputByte1(message + cursor, f);
+   cursor++;
+
+   while (end > cursor)
+   {
+   pqTraceOutputInt16(message + cursor, f);
+   cursor += 2;
+   }
+}
+

According to the following description, for loop should be used.

--
Int16
The number of columns in the data to be copied (denoted N below).

Int16[N]
The format codes to be used for each column. Each must presently be zero (text) 
or one (binary). All must be zero if the overall copy format is textual.
--




Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-15 Thread tsunakawa.ta...@fujitsu.com

I'm looking at the last file libpq-trace.c.  I'll continue the review after 
lunch.  Below are some comments so far.


(1)
-  Enables  tracing of the client/server communication to a debugging file 
stream.
+  Enables tracing of the client/server communication to a debugging file
+  stream.
+  Only available when protocol version 3 or higher is used.

The last line is unnecessary now that v2 is not supported.


(2)
@@ -113,6 +114,7 @@ install: all installdirs install-lib
$(INSTALL_DATA) $(srcdir)/libpq-fe.h '$(DESTDIR)$(includedir)'
$(INSTALL_DATA) $(srcdir)/libpq-events.h '$(DESTDIR)$(includedir)'
$(INSTALL_DATA) $(srcdir)/libpq-int.h '$(DESTDIR)$(includedir_internal)'
+   $(INSTALL_DATA) $(srcdir)/libpq-trace.h 
'$(DESTDIR)$(includedir_internal)'
$(INSTALL_DATA) $(srcdir)/pqexpbuffer.h 
'$(DESTDIR)$(includedir_internal)'
$(INSTALL_DATA) $(srcdir)/pg_service.conf.sample 
'$(DESTDIR)$(datadir)/pg_service.conf.sample'

Why is this necessary?


(3) libpq-trace.h
+#ifdef __cplusplus
+extern "C"
+{

This is unnecessary because pqTraceOutputMessage() is for libpq's internal use. 
 This extern is for the user's C++ application to call public libpq functions.

+#include "libpq-fe.h"
+#include "libpq-int.h"

I think these can be removed by declaring the struct and function as follows:

struct pg_conn;
extern void pqTraceOutputMessage(struct pg_conn *conn, const char *message, 
bool toServer);

But... after doing the above, only this declaration is left in libpq-trace.h.  
Why don't you put your original declaration using PGconn in libpq-int.h and 
remove libpq-trace.h?


(4)
+   /* Trace message only when there is first 1 byte */
+   if (conn->Pfdebug && (conn->outCount < conn->outMsgStart))
+   pqTraceOutputMessage(conn, conn->outBuffer + conn->outCount, 
true);

() surrounding the condition after && can be removed.


(5)
+static const char *pqGetProtocolMsgType(unsigned char c,
+   
bool toServer);

This is unnecessary because the function definition precedes its only one call 
site.


(6)
+ * We accumulate frontend message pieces in an array as the libpq code writes
+ * them, and log the complete message when pqTraceOutputFeMsg is called.
+ * For backend, we print the pieces as soon as we receive them from the server.

The first paragraph is no longer true.  I think this entire comment is 
unnecessary.


(7)
+static const char *
+pqGetProtocolMsgType(unsigned char c, bool toServer)
+{
+   if (toServer == true &&
+   c < lengthof(protocol_message_type_f) &&
+   protocol_message_type_f[c] != NULL)
+   return protocol_message_type_f[c];
+
+   if (toServer == false &&
+   c < lengthof(protocol_message_type_b) &&
+   protocol_message_type_b[c] != NULL)
+   return protocol_message_type_b[c];
+
+   return "UnknownMessage:";
+}

"== true/false" can be removed.  libpq doesn't appear to use such style.

Why don't we embed this processing in pqTraceOutputMessage() because this 
function is short and called only once?  The added benefit would be that you 
can print an invalid message type byte like this:

fprintf(..., "Unknown message: %02x\n", ...);


Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-15 Thread iwata....@fujitsu.com

Alvaro san, Tom san Horiguchi san, Tsunakawa san and Kirk san,

Thank you very much for review and advice.

> > Works for me.
> 
> Pushed that.  I think we're now waiting on Iwata-san to finish a new version 
> of
> the tracing patch.
Thank you very much Alvaro san and Tom san. Your patch and code change makes my 
work more smoothly. 
And I am sorry for attaching not-good-work patch. fixsegv.patch help my 
implementation.

I update this patch to v25. I fix function arguments and something which are 
pointed out by reviewers.

I create protocol message reading function for each protocol message type. (Ex. 
pqTraceOutputB() read Bind message)
This makes the nesting shallower and makes the code easier to read.
Each data type output functions(Ex. pqTraceOutputString() ) are called from 
each protocol message type function. 
Cursor operation is performed in each protocol message type function.

Like Kirk san,  I share `make check` results. As far as I can see, tracing 
works well.
I compare this result with current master branch trace log. It does not 
miss/add any protocol message.
And Thank you Kirk san for your share of make check result and previous patch!

```
2021-03-15 08:10:44.288999  >   Terminate   4   
2021-03-15 08:10:44.317959  >   Query   155 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(some_nonexistent_parameter = true);"
2021-03-15 08:10:44.319121  <   ErrorResponse   124 S "ERROR" V "ERROR" C 
"22023" M "unrecognized parameter "some_nonexistent_parameter"" F 
"reloptions.c" L "1447" R "parseRelOptionsInternal" \x00
2021-03-15 08:10:44.319161  <   ReadyForQuery   5 I 
2021-03-15 08:10:44.319221  >   Query   144 "CREATE TABLESPACE 
regress_tblspacewith LOCATION 
'/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH 
(random_page_cost = 3.0);"
2021-03-15 08:10:44.320889  <   CommandComplete 22 "CREATE TABLESPACE"  
2021-03-15 08:10:44.320922  <   ReadyForQuery   5 I 
2021-03-15 08:10:44.320979  >   Query   81 "SELECT spcoptions FROM 
pg_tablespace WHERE spcname = 'regress_tblspacewith';"
2021-03-15 08:10:44.324296  <   RowDescription  35 #1 "spcoptions" 1213 #5 1009 
#65535 -1 #0
2021-03-15 08:10:44.324335  <   DataRow 32 #1 22 '{random_page_cost=3.0}'   
  
2021-03-15 08:10:44.324347  <   CommandComplete 13 "SELECT 1"   
2021-03-15 08:10:44.324388  <   ReadyForQuery   5 I 
2021-03-15 08:10:44.324516  >   Query   42 "DROP TABLESPACE 
regress_tblspacewith;"
2021-03-15 08:10:44.325423  <   CommandComplete 20 "DROP TABLESPACE"
2021-03-15 08:10:44.325453  <   ReadyForQuery   5 I  
```


> From: Tsunakawa, Takayuki/綱川 貴之 
> Sent: Friday, March 5, 2021 3:17 PM
...
> 
> void pqLogMessage(PGconn *conn, const char *message, bool toServer);

Thank you Tsunakawa san. In v25 patch, I fixed qTraceOutputMessag() function to 
this style.
I am sorry changing arguments in previous patch. That was my misunderstanding.

> From: Tom Lane 
> Sent: Thursday, March 11, 2021 6:33 AM
...
> There are still some cowboy advancements of inStart in the functions
> concerned with COPY processing, but it doesn't look like there's much to be
> done about that.  Those spots will need their own trace calls.

I called pqTraceOutputMessage() from pqParseInput3(), COPY process and 
pqFunctionCall3(). pqFunctionCall3() is called by PQfn().
I found that there is protocol message id switch(). So I call the tracing 
function there.

> From: Kyotaro Horiguchi 
> Sent: Thursday, March 11, 2021 10:20 AM

Thank you Horiguchi san. Your keen review suggestions and detailed advice. It 
is very helpful for updating my patch.

> The individual per-type-output functions are designed to fit to the
> corresponding pqGet*() functions.  On the other hand, now that we read the
> message bytes knowing the exact format in advance as you did in
> pqTraceOutputMsg().  So the complexity exist in the functions can be
> eliminated by separating them into more type specific output functions. For
> example, pqTraceOutputInt() is far simplified by spliting it into
> pqTraceOutputInt16() and -32().
Yes, I refer to pqGet functions. To make data type output function more 
simpler, 
I separate output int function to pqTraceOutputInt16 and -32.

> I think we can get rid of copying in the output functions for String and 
> Bytes in
> different ways.
I remove memcpy call from String and Bytes output function. 
These functions just pass *message to fprintf().

> + /* Protocol 2.0 does not support tracing. */
> + if (PG_PROTOCOL_MAJOR(conn->pversion) < 3)
> + return;
> 
> We can write that message out into tracing file.
> 
> +void
> +PQtraceSetFlags(PGconn *conn, int flags) {
> + if (conn == NULL)
> + return;
> + /* Protocol 2.0 is not supported. */
> + if

Re: libpq debug log

2021-03-11 Thread Tom Lane

Alvaro Herrera  writes:
> On 2021-Mar-10, Tom Lane wrote:
>> After studying this further, I think we should apply the attached
>> patch to remove that responsibility from pqParseInput3's subroutines.
>> This will allow a single trace call near the bottom of pqParseInput3
>> to handle all cases that that function processes.

> Works for me.

Pushed that.  I think we're now waiting on Iwata-san to finish a
new version of the tracing patch.

regards, tom lane

Re: libpq debug log

2021-03-10 Thread Kyotaro Horiguchi

At Thu, 11 Mar 2021 04:12:57 +, "tsunakawa.ta...@fujitsu.com" 
 wrote in 
> From: Kyotaro Horiguchi 
> > Right. So something like this?
> > 
> > unsigned char p;
> > 
> > p = buf + *cursor;
> > result = (uint32) (*p << 24) + (*(p + 1)) << 16 + ...);
> 
> Yes, that would work (if p is a pointer), but I think memcpy() is enough like 
> pqGetInt() does.

On second thought, memcpy onto a local variable doesn't harm. I agreed
to you.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

RE: libpq debug log

2021-03-10 Thread tsunakawa.ta...@fujitsu.com

From: Kyotaro Horiguchi 
> Right. So something like this?
> 
> unsigned char p;
> 
> p = buf + *cursor;
> result = (uint32) (*p << 24) + (*(p + 1)) << 16 + ...);

Yes, that would work (if p is a pointer), but I think memcpy() is enough like 
pqGetInt() does.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-10 Thread Kyotaro Horiguchi

At Thu, 11 Mar 2021 03:01:16 +, "tsunakawa.ta...@fujitsu.com" 
 wrote in 
> From: Kyotaro Horiguchi 
> > The output functions copy message bytes into local variable but the
> > same effect can be obtained by just casing via typed pointer type.
> > 
> >   uint32 tmp4;
> >   ..
> >   memcpy(, buf + *cursor, 4);
> >   result = (int) pg_ntoh32(tmp4);
> > 
> > can be written as
> > 
> >   result = pg_ntoh32(* (uint32 *) (buf + *cursor));
> 
> I'm afraid we need to memcpy() because of memory alignment.

Right. So something like this?

unsigned char p;

p = buf + *cursor;
result = (uint32) (*p << 24) + (*(p + 1)) << 16 + ...);


> > I think we can get rid of copying in the output functions for String
> > and Bytes in different ways.
> 
> I haven't looked at this code, but you sound right.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-03-10 Thread Kyotaro Horiguchi

At Thu, 11 Mar 2021 01:51:55 +, "tsunakawa.ta...@fujitsu.com" 
 wrote in 
> Alvaro-san, Tom-san, Horiguchi-san,
> 
> From: Kyotaro Horiguchi 
> > +1 for the thanks for the quick work.  I have some random comments
> > after a quick look on it.
> 
> Thank you very much for giving many comments.  And We're sorry to
> have caused you trouble.  I told Iwata-san yesterday to modify the
> patch to use the logging function interface that Tom-san, Alvaro-san
> and I agreed on, that I'd review after the new revision has been

Yeah, I agreed at the time. Sorry for not responding it but had no
room in my mind to do that:p

> posted, and let others know she is modifying the patch again so that
> they won't start reviewing.  But I should have done the request on
> hackers.
> 
> We're catching up with your feedback and post the updated patch.
> Then I'll review it.
> 
> We appreciate your help so much.

My pleasure.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-03-10 Thread Kyotaro Horiguchi

At Wed, 10 Mar 2021 20:38:20 -0500, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> > Maybe I'm missing something, but the above doesn't seem working.  I
> > didn't see such message when making a SSL connection.
> 
> As far as that goes, I don't see any point in trying to trace
> connection-related messages, because there is no way to enable
> tracing on a connection before it's created.

Yeah, agreed. In the previous version tracing functions are called
during protocol negotiation but that no longer happenes.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

RE: libpq debug log

2021-03-10 Thread tsunakawa.ta...@fujitsu.com

From: Kyotaro Horiguchi 
> The output functions copy message bytes into local variable but the
> same effect can be obtained by just casing via typed pointer type.
> 
>   uint32 tmp4;
>   ..
>   memcpy(, buf + *cursor, 4);
>   result = (int) pg_ntoh32(tmp4);
> 
> can be written as
> 
>   result = pg_ntoh32(* (uint32 *) (buf + *cursor));

I'm afraid we need to memcpy() because of memory alignment.

> I think we can get rid of copying in the output functions for String
> and Bytes in different ways.

I haven't looked at this code, but you sound right.


Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-10 Thread tsunakawa.ta...@fujitsu.com

Alvaro-san, Tom-san, Horiguchi-san,

From: Kyotaro Horiguchi 
> +1 for the thanks for the quick work.  I have some random comments
> after a quick look on it.

Thank you very much for giving many comments.  And We're sorry to have caused 
you trouble.  I told Iwata-san yesterday to modify the patch to use the logging 
function interface that Tom-san, Alvaro-san and I agreed on, that I'd review 
after the new revision has been posted, and let others know she is modifying 
the patch again so that they won't start reviewing.  But I should have done the 
request on hackers.

We're catching up with your feedback and post the updated patch.  Then I'll 
review it.

We appreciate your help so much.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-10 Thread Tom Lane

Kyotaro Horiguchi  writes:
> Maybe I'm missing something, but the above doesn't seem working.  I
> didn't see such message when making a SSL connection.

As far as that goes, I don't see any point in trying to trace
connection-related messages, because there is no way to enable
tracing on a connection before it's created.

regards, tom lane

Re: libpq debug log

2021-03-10 Thread Kyotaro Horiguchi

At Wed, 10 Mar 2021 10:31:27 -0300, "'alvhe...@alvh.no-ip.org'" 
 wrote in 
> On 2021-Mar-10, iwata@fujitsu.com wrote:
> 
> > Hi all,
> > 
> > Following all reviewer's advice, I have created a new patch.
> > 
> > In this patch, I add only two tracing entry points; I call 
> > pqTraceOutputMsg(PGconn conn, int msgCursor, PGCommSource commsource) in 
> > pqParseInput3 () and pqPutMsgEnd () to output log.
> > The argument contains message first byte offset called msgCursor because it 
> > is simpler to specify the buffer pointer in the caller. 
> > 
> > In pqTraceOutputMsg(), the content common to all protocol messages (first 
> > timestamp, < or >, first 1 byte, and length) are output first and after 
> > that each protocol message contents is output. I referred to pqParseInput3 
> > () , fe-exec.c and documentation page for that part.
> > 
> > This fix almost eliminates if(conn->Pfdebug) that was built into the code 
> > for other features.
> 
> This certainly looks much better!  Thanks for getting it done so
> quickly.
> 
> I didn't review the code super closely.  I do see a compiler warning:

+1 for the thanks for the quick work.  I have some random comments
after a quick look on it.

The individual per-type-output functions are designed to fit to the
corresponding pqGet*() functions.  On the other hand, now that we read
the message bytes knowing the exact format in advance as you did in
pqTraceOutputMsg().  So the complexity exist in the functions can be
eliminated by separating them into more type specific output
functions. For example, pqTraceOutputInt() is far simplified by
spliting it into pqTraceOutputInt16() and -32().

The output functions copy message bytes into local variable but the
same effect can be obtained by just casing via typed pointer type.

  uint32 tmp4;
  ..
  memcpy(, buf + *cursor, 4);
  result = (int) pg_ntoh32(tmp4);

can be written as

  result = pg_ntoh32(* (uint32 *) (buf + *cursor));

I think we can get rid of copying in the output functions for String
and Bytes in different ways.

Now that we can manage our own reading pointer within
pqTraceOutputMsg(), the per-type-output functions can work only on the
reading pointer instead of buffer pointer and cursor, and length.
*I*'d want to make the output functions move the reading pointer by
themseves.  pqTradeOutputMsg can be as simplified as the follows doing
this.  End-of-message pointer may be useful to check read-overrunning
on the message buffer.

  switch (id) {
case 'R':
  pqTraceOutputInt32(, endptr, conn->Pfdebug);
  fputc('\n', conn->Pfdebug);
  break;
case 'K':
  pqTraceOutputInt32(, endptr, conn->Pfdebug));
  pqTraceOutputInt32(, endptr, conn->Pfdebug));
...

+   char   *prefix = commsource == MSGDIR_FROM_BACKEND ? "<" : ">";
+   int LogEnd = commsource == MSGDIR_FROM_BACKEND ? 
conn->inCursor : conn->outMsgEnd;
+   char*logBuffer = commsource == MSGDIR_FROM_BACKEND ? 
conn->inBuffer 
..
+   if (commsource == MSGDIR_FROM_BACKEND)
+   id = logBuffer[LogCursor++];

Repeated usage of the ternaly operator on the same condition makes
code hard-to-read.  It is simpler to consolidate them into one if-else
statement.

+   result = (int) pg_ntoh32(result32);
+   if (result == NEGOTIATE_SSL_CODE)

Maybe I'm missing something, but the above doesn't seem working.  I
didn't see such message when making a SSL connection.

+   /* Protocol 2.0 does not support tracing. */
+   if (PG_PROTOCOL_MAJOR(conn->pversion) < 3)
+   return;

We can write that message out into tracing file.

+void
+PQtraceSetFlags(PGconn *conn, int flags)
+{
+   if (conn == NULL)
+   return;
+   /* Protocol 2.0 is not supported. */
+   if (PG_PROTOCOL_MAJOR(conn->pversion) < 3)
+   return;
+   /* If PQtrace() failed, do nothing. */
+   if (conn->Pfdebug == NULL)
+   return;
+   conn->traceFlags = flags;

Pfdebug is always NULL for V2 protocol there, so it can be an
assertion?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: libpq debug log

2021-03-10 Thread Tom Lane

Alvaro Herrera  writes:
> On 2021-Mar-10, Tom Lane wrote:
>> After studying this further, I think we should apply the attached
>> patch to remove that responsibility from pqParseInput3's subroutines.
>> This will allow a single trace call near the bottom of pqParseInput3
>> to handle all cases that that function processes.

> Works for me.

I dug into the git history a little and concluded that the reason for
doing that in the subroutines is that 92785dac2 made it so for this
reason:

+   /*
+* Advance inStart to show that the "D" message has been processed.  We
+* must do this before calling the row processor, in case it longjmps.
+*/
+   conn->inStart = conn->inCursor;

We later gave up on allowing user-defined row processor callbacks,
but we failed to undo this messiness.  Looking at what 92785dac2 did
confirms something else I'd been eyeing, which is why these subroutines
have their own checks for having consumed the right amount of data
instead of letting the master check in pqParseInput3 take care of it.
They didn't use to do that, but that check was hoisted up to before the
row processor call, so we wouldn't expose data from a corrupt message to
user code.  So I think we can undo that too.

92785dac2 only directly hacked getRowDescriptions and getAnotherTuple,
but it looks like similar error handling was stuck into
getParamDescriptions later.

regards, tom lane

Re: libpq debug log

2021-03-10 Thread Alvaro Herrera

On 2021-Mar-10, Tom Lane wrote:

> After studying this further, I think we should apply the attached
> patch to remove that responsibility from pqParseInput3's subroutines.
> This will allow a single trace call near the bottom of pqParseInput3
> to handle all cases that that function processes.

Works for me.

> BTW, while looking at this I concluded that getParamDescriptions
> is actually buggy: if it's given a malformed message that seems
> to need more data than the message length specifies, it just
> returns EOF, resulting in an infinite loop.  This function apparently
> got missed while converting the logic from v2 (where that was the
> right thing) to v3 (where it ain't).  So that bit needs to be
> back-patched.

Ah, that makes sense.

> I'm tempted to back-patch the whole thing though.

+0.5 on that.  I think we may be happy that all branches are alike
(though it doesn't look like this will cause any subtle bugs -- breakage
will be fairly obvious).

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."   (Nathaniel Smith)

Re: libpq debug log

2021-03-10 Thread Tom Lane

I wrote:
> Or we could rethink the logic.  It's not quite clear to me, after
> all this time, why getRowDescriptions() et al are allowed to
> move inStart themselves rather than letting the main loop in
> pqParseInput3 do it.  It might well be an artifact of having not
> rewritten the v2 logic too much.

After studying this further, I think we should apply the attached
patch to remove that responsibility from pqParseInput3's subroutines.
This will allow a single trace call near the bottom of pqParseInput3
to handle all cases that that function processes.

There are still some cowboy advancements of inStart in the functions
concerned with COPY processing, but it doesn't look like there's
much to be done about that.  Those spots will need their own trace
calls.

BTW, while looking at this I concluded that getParamDescriptions
is actually buggy: if it's given a malformed message that seems
to need more data than the message length specifies, it just
returns EOF, resulting in an infinite loop.  This function apparently
got missed while converting the logic from v2 (where that was the
right thing) to v3 (where it ain't).  So that bit needs to be
back-patched.  I'm tempted to back-patch the whole thing though.

regards, tom lane

diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 2ca8c057b9..233456fd90 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -290,8 +290,6 @@ pqParseInput3(PGconn *conn)
 		/* First 'T' in a query sequence */
 		if (getRowDescriptions(conn, msgLength))
 			return;
-		/* getRowDescriptions() moves inStart itself */
-		continue;
 	}
 	else
 	{
@@ -337,8 +335,7 @@ pqParseInput3(PGconn *conn)
 case 't':		/* Parameter Description */
 	if (getParamDescriptions(conn, msgLength))
 		return;
-	/* getParamDescriptions() moves inStart itself */
-	continue;
+	break;
 case 'D':		/* Data Row */
 	if (conn->result != NULL &&
 		conn->result->resultStatus == PGRES_TUPLES_OK)
@@ -346,8 +343,6 @@ pqParseInput3(PGconn *conn)
 		/* Read another tuple of a normal query response */
 		if (getAnotherTuple(conn, msgLength))
 			return;
-		/* getAnotherTuple() moves inStart itself */
-		continue;
 	}
 	else if (conn->result != NULL &&
 			 conn->result->resultStatus == PGRES_FATAL_ERROR)
@@ -462,7 +457,6 @@ handleSyncLoss(PGconn *conn, char id, int msgLength)
  * command for a prepared statement) containing the attribute data.
  * Returns: 0 if processed message successfully, EOF to suspend parsing
  * (the latter case is not actually used currently).
- * In the former case, conn->inStart has been advanced past the message.
  */
 static int
 getRowDescriptions(PGconn *conn, int msgLength)
@@ -577,9 +571,6 @@ getRowDescriptions(PGconn *conn, int msgLength)
 	/* Success! */
 	conn->result = result;
 
-	/* Advance inStart to show that the "T" message has been processed. */
-	conn->inStart = conn->inCursor;
-
 	/*
 	 * If we're doing a Describe, we're done, and ready to pass the result
 	 * back to the client.
@@ -603,9 +594,6 @@ advance_and_error:
 	if (result && result != conn->result)
 		PQclear(result);
 
-	/* Discard the failed message by pretending we read it */
-	conn->inStart += 5 + msgLength;
-
 	/*
 	 * Replace partially constructed result with an error result. First
 	 * discard the old result to try to win back some memory.
@@ -624,6 +612,12 @@ advance_and_error:
 	appendPQExpBuffer(>errorMessage, "%s\n", errmsg);
 	pqSaveErrorResult(conn);
 
+	/*
+	 * Show the message as fully consumed, else pqParseInput3 will overwrite
+	 * our error with a complaint about that.
+	 */
+	conn->inCursor = conn->inStart + 5 + msgLength;
+
 	/*
 	 * Return zero to allow input parsing to continue.  Subsequent "D"
 	 * messages will be ignored until we get to end of data, since an error
@@ -635,12 +629,8 @@ advance_and_error:
 /*
  * parseInput subroutine to read a 't' (ParameterDescription) message.
  * We'll build a new PGresult structure containing the parameter data.
- * Returns: 0 if completed message, EOF if not enough data yet.
- * In the former case, conn->inStart has been advanced past the message.
- *
- * Note that if we run out of data, we have to release the partially
- * constructed PGresult, and rebuild it again next time.  Fortunately,
- * that shouldn't happen often, since 't' messages usually fit in a packet.
+ * Returns: 0 if processed message successfully, EOF to suspend parsing
+ * (the latter case is not actually used currently).
  */
 static int
 getParamDescriptions(PGconn *conn, int msgLength)
@@ -690,23 +680,16 @@ getParamDescriptions(PGconn *conn, int msgLength)
 	/* Success! */
 	conn->result = result;
 
-	/* Advance inStart to show that the "t" message has been processed. */
-	conn->inStart = conn->inCursor;
-
 	return 0;
 
 not_enough_data:
-	PQclear(result);
-	return EOF;
+	errmsg =

Re: libpq debug log

2021-03-10 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-10, Tom Lane wrote:

> Or we could rethink the logic.  It's not quite clear to me, after
> all this time, why getRowDescriptions() et al are allowed to
> move inStart themselves rather than letting the main loop in
> pqParseInput3 do it.  It might well be an artifact of having not
> rewritten the v2 logic too much.

I would certainly prefer that the logic stays put for the time being,
while I finalize the pipelining stuff.

-- 
Álvaro Herrera   Valdivia, Chile

Re: libpq debug log

2021-03-10 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-10, Tom Lane wrote:

> "'alvhe...@alvh.no-ip.org'"  writes:
> > After staring at it a couple of times, I think that the places in
> > pqParseInput3() where there's a comment "... moves inStart itself" and
> > then "continue;" should have a call to pqTraceOutputMsg(), since AFAIU
> > those are the places where the message in question would not reach the
> > "Successfully consumed this message" block that prints each message.
> 
> Yeah, the whole business of just when a message has been "consumed"
> is a stumbling block for libpq tracing.  It was a real mess with the
> existing code and v2 protocol, because we could actually try to parse
> a message more than once, with the first few tries deciding that the
> message wasn't all here yet (but nonetheless emitting partial trace
> output).

Hmm, that makes sense, but the issue I'm reporting is not the same,
unless I misunderstand you.

> Now that v2 is dead, the logic to abort because of the message not
> being all arrived yet is basically useless: only the little bit of
> code concerned with the message length word really needs to cope with
> that.  It's tempting to go through and get rid of all the now-unreachable
> "return"s and such, but it seemed like it would be a lot of code churn for
> not really that much gain.

That sounds like an interesting exercise, and I bet it'd bring a lot of
code readability improvements.

> I didn't look at the new version of the patch yet, so I'm not
> sure whether the issues it still has are related to this.

The issues I noticed are related to separate messages rather than one
message split in pieces -- for example several DataRow messages are
processed internally in a loop, rather than each individually.  The
number of places that need to be adjusted for things to AFAICT work
correctly are few enough; ISTM that the attached patch is sufficient.

(The business with the "logged" boolean is necessary so that we print to
the trace file any of those messages even if they are deviating from the
protocol.)

-- 
Álvaro Herrera   Valdivia, Chile
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)

Re: libpq debug log

2021-03-10 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> After staring at it a couple of times, I think that the places in
> pqParseInput3() where there's a comment "... moves inStart itself" and
> then "continue;" should have a call to pqTraceOutputMsg(), since AFAIU
> those are the places where the message in question would not reach the
> "Successfully consumed this message" block that prints each message.

After digging around a little, I remember that there are a *bunch*
of places in fe-protocol3.c that advance inStart.  The cleanest way
to shove this stuff in without rethinking that logic would be to
call pqTraceOutputMsg immediately before each such advance (using
the old inStart value as the message start address).  A possible
advantage of doing it like that is that we'd be aware by that point
of whether we think the message is good or bad (or whether it was
good but we failed to process it, perhaps because of OOM).  Seems
like that could be a useful thing to include in the log.

Or we could rethink the logic.  It's not quite clear to me, after
all this time, why getRowDescriptions() et al are allowed to
move inStart themselves rather than letting the main loop in
pqParseInput3 do it.  It might well be an artifact of having not
rewritten the v2 logic too much.

regards, tom lane

Re: libpq debug log

2021-03-10 Thread Tom Lane

"'alvhe...@alvh.no-ip.org'"  writes:
> After staring at it a couple of times, I think that the places in
> pqParseInput3() where there's a comment "... moves inStart itself" and
> then "continue;" should have a call to pqTraceOutputMsg(), since AFAIU
> those are the places where the message in question would not reach the
> "Successfully consumed this message" block that prints each message.

Yeah, the whole business of just when a message has been "consumed"
is a stumbling block for libpq tracing.  It was a real mess with the
existing code and v2 protocol, because we could actually try to parse
a message more than once, with the first few tries deciding that the
message wasn't all here yet (but nonetheless emitting partial trace
output).

Now that v2 is dead, the logic to abort because of the message not
being all arrived yet is basically useless: only the little bit of
code concerned with the message length word really needs to cope with
that.  It's tempting to go through and get rid of all the now-unreachable
"return"s and such, but it seemed like it would be a lot of code churn for
not really that much gain.

I didn't look at the new version of the patch yet, so I'm not
sure whether the issues it still has are related to this.

regards, tom lane

Re: libpq debug log

2021-03-10 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-10, 'alvhe...@alvh.no-ip.org' wrote:

> I also found that DataRow messages are not being printed.  This seems to
> fix that in the normal case and singlerow, at least in pipeline mode.
> Didn't verify the non-pipeline mode.

Hm, and RowDescription ('T') messages are also not being printed ... so
I think there's some larger problem here, and perhaps it needs to be
fixed using a different approach.

After staring at it a couple of times, I think that the places in
pqParseInput3() where there's a comment "... moves inStart itself" and
then "continue;" should have a call to pqTraceOutputMsg(), since AFAIU
those are the places where the message in question would not reach the
"Successfully consumed this message" block that prints each message.

-- 
Álvaro Herrera39°49'30"S 73°17'W
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)

Re: libpq debug log

2021-03-10 Thread 'alvhe...@alvh.no-ip.org'

I also found that DataRow messages are not being printed.  This seems to
fix that in the normal case and singlerow, at least in pipeline mode.
Didn't verify the non-pipeline mode.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 68f0f6a081..e8db5edb71 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -869,6 +869,9 @@ getAnotherTuple(PGconn *conn, int msgLength)
 		goto advance_and_error;
 	}
 
+	if (conn->Pfdebug)
+		pqTraceOutputMsg(conn, conn->inStart, MSGDIR_FROM_BACKEND);
+
 	/* Advance inStart to show that the "D" message has been processed. */
 	conn->inStart = conn->inCursor;

Re: libpq debug log

2021-03-10 Thread 'alvhe...@alvh.no-ip.org'

On 2021-Mar-10, iwata@fujitsu.com wrote:

> Hi all,
> 
> Following all reviewer's advice, I have created a new patch.
> 
> In this patch, I add only two tracing entry points; I call 
> pqTraceOutputMsg(PGconn conn, int msgCursor, PGCommSource commsource) in 
> pqParseInput3 () and pqPutMsgEnd () to output log.
> The argument contains message first byte offset called msgCursor because it 
> is simpler to specify the buffer pointer in the caller. 
> 
> In pqTraceOutputMsg(), the content common to all protocol messages (first 
> timestamp, < or >, first 1 byte, and length) are output first and after that 
> each protocol message contents is output. I referred to pqParseInput3 () , 
> fe-exec.c and documentation page for that part.
> 
> This fix almost eliminates if(conn->Pfdebug) that was built into the code for 
> other features.

This certainly looks much better!  Thanks for getting it done so
quickly.

I didn't review the code super closely.  I do see a compiler warning:

/pgsql/source/pipeline/src/interfaces/libpq/libpq-trace.c: In function 
'pqTraceOutputNchar':
/pgsql/source/pipeline/src/interfaces/libpq/libpq-trace.c:373:2: warning: 
argument 1 null where non-null expected [-Wnonnull]
  memcpy(v, buf + *cursor, len);
  ^

and then when I try to run the "libpq_pipeline" program from the other
thread, I get a crash with this backtrace:

Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:288
288 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No existe el 
fichero o el directorio.
(gdb) bt
#0  __memmove_avx_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:288
#1  0x77f9b50b in pqTraceOutputNchar (buf=buf@entry=0x55564660 "P", 
LogEnd=LogEnd@entry=42, cursor=cursor@entry=0x7fffdeb4, 
pfdebug=0x55569320, len=1)
at /pgsql/source/pipeline/src/interfaces/libpq/libpq-trace.c:373
#2  0x77f9bc25 in pqTraceOutputMsg (conn=conn@entry=0x55560260, 
msgCursor=, 
commsource=commsource@entry=MSGDIR_FROM_FRONTEND)
at /pgsql/source/pipeline/src/interfaces/libpq/libpq-trace.c:476
#3  0x77f96280 in pqPutMsgEnd (conn=conn@entry=0x55560260)
at /pgsql/source/pipeline/src/interfaces/libpq/fe-misc.c:533
...

The attached patch fixes it, though I'm not sure that that's the final
form we want it to have (since we'd alloc and free repeatedly, making it
slow -- perhaps better to use a static, or ...? ).

-- 
Álvaro Herrera   Valdivia, Chile
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)
diff --git a/src/interfaces/libpq/libpq-trace.c b/src/interfaces/libpq/libpq-trace.c
index ef294fa556..5d3b40a1d0 100644
--- a/src/interfaces/libpq/libpq-trace.c
+++ b/src/interfaces/libpq/libpq-trace.c
@@ -368,7 +368,15 @@ pqTraceOutputBinary(char *v, int length, FILE *pfdebug)
 static void
 pqTraceOutputNchar(char *buf, int LogEnd, int *cursor, FILE *pfdebug, int len)
 {
-	char	*v = '\0';
+	char	*v;
+
+	v = malloc(len);
+	if (v == NULL)
+	{
+		fprintf(pfdebug, "'..%d chars..'", len);
+		*cursor += len;
+		return;
+	}
 
 	memcpy(v, buf + *cursor, len);
 	*cursor += len;
@@ -377,6 +385,8 @@ pqTraceOutputNchar(char *buf, int LogEnd, int *cursor, FILE *pfdebug, int len)
 	pqTraceOutputBinary(v, len, pfdebug);
 	fprintf(pfdebug, "\'");
 
+	free(v);
+
 	if (*cursor < LogEnd)
 		fprintf(pfdebug, " ");
 	else

RE: libpq debug log

2021-03-09 Thread iwata....@fujitsu.com

Hi all,

Following all reviewer's advice, I have created a new patch.

In this patch, I add only two tracing entry points; I call 
pqTraceOutputMsg(PGconn conn, int msgCursor, PGCommSource commsource) in 
pqParseInput3 () and pqPutMsgEnd () to output log.
The argument contains message first byte offset called msgCursor because it is 
simpler to specify the buffer pointer in the caller. 

In pqTraceOutputMsg(), the content common to all protocol messages (first 
timestamp, < or >, first 1 byte, and length) are output first and after that 
each protocol message contents is output. I referred to pqParseInput3 () , 
fe-exec.c and documentation page for that part.

This fix almost eliminates if(conn->Pfdebug) that was built into the code for 
other features.

Regards,
Aya Iwata

> -Original Message-
> From: alvhe...@alvh.no-ip.org 
> Sent: Friday, March 5, 2021 10:41 PM
> To: Tsunakawa, Takayuki/綱川 貴之 
> Cc: 'Tom Lane' ; Iwata, Aya/岩田 彩
> ; Jamison, Kirk/ジャミソン カーク
> ; 'Kyotaro Horiguchi' ;
> pgsql-hackers@lists.postgresql.org
> Subject: Re: libpq debug log
> 
> On 2021-Mar-05, tsunakawa.ta...@fujitsu.com wrote:
> 
> > From: Tom Lane 
> > > But I think passing the message start address explicitly might be
> > > better than having it understand the buffering behavior in enough
> > > detail to know where to find the message.  Part of the point here
> > > (IMO) is to decouple the tracing logic from the core libpq logic, in
> > > hopes of not having common-mode bugs.
> >
> > Ouch, you're perfectly right.  Then let's make the signature:
> >
> > void pqLogMessage(PGconn *conn, const char *message, bool
> > toServer);
> 
> Yeah, looks good!  I agree that going this route will result in more 
> trustworthy
> trace output.
> 
> --
> Álvaro Herrera39°49'30"S 73°17'W

v24-libpq-trace-log.patch
Description: v24-libpq-trace-log.patch

Re: libpq debug log

2021-03-05 Thread alvhe...@alvh.no-ip.org

On 2021-Mar-05, tsunakawa.ta...@fujitsu.com wrote:

> From: Tom Lane 
> > But I think passing the message start address explicitly might be
> > better than having it understand the buffering behavior in enough
> > detail to know where to find the message.  Part of the point here
> > (IMO) is to decouple the tracing logic from the core libpq logic, in
> > hopes of not having common-mode bugs.
> 
> Ouch, you're perfectly right.  Then let's make the signature:
> 
> void pqLogMessage(PGconn *conn, const char *message, bool toServer);

Yeah, looks good!  I agree that going this route will result in more
trustworthy trace output.

-- 
Álvaro Herrera39°49'30"S 73°17'W

RE: libpq debug log

2021-03-05 Thread k.jami...@fujitsu.com

Hi Alvaro-san and Horiguchi-san,
CC) Iwata-san, Tsunakawa-san

Attached is the V23 of the libpq trace patch.

(1)
From: Álvaro Herrera 
> It appears that something is still wrong.  I applied lipq pipeline v27 from 
> [1]
> and ran src/test/modules/test_libpq/pipeline singlerow, after patching it to
> do PQtrace() after PQconn(). Below is the output I get from that. The
> noteworthy point is that "ParseComplete" messages appear multiple times
> for some reason ... but that's quite odd, because if I look at the network 
> traffic
> with Wireshark I certainly do not see the ParseComplete message being sent
> three times.

I applied Alvaro's libpq pipeline patch (v33) [1] on top of this libpq trace 
patch,
then traced the src/test/modules/libpq_pipeline/libpq_pipeline.c 
test_singlerowmode.
It's still the same trace output even after applying the fix recommendations
of Horiguchi-san.

>   Parse   38 "" "SELECT generate_series(42, $1)" #0
>   Bind20 "" "" #0 #1 2 '44' #1 #0
>   Describe6 P ""
>   Execute 9 "" 0
>   Parse   38 "" "SELECT generate_series(42, $1)" #0
>   Bind20 "" "" #0 #1 2 '45' #1 #0
>   Describe6 P ""
>   Execute 9 "" 0
>   Parse   38 "" "SELECT generate_series(42, $1)" #0
>   Bind20 "" "" #0 #1 2 '46' #1 #0
>   Describe6 P ""
>   Execute 9 "" 0
>   Sync4
<   ParseComplete   4
<   BindComplete4
<   RowDescription  40 #1 "generate_series" 0 #0 23 #4 -1 #0
<   DataRow 12 #1 2 '42'
<   DataRow 12 #1 2 '43'
<   DataRow 12 #1 2 '44'
<   CommandComplete 13 "SELECT 3"
<   ParseComplete   4
<   ParseComplete   4
<   ParseComplete   4
<   BindComplete4
<   RowDescription  40 #1 "generate_series" 0 #0 23 #4 -1 #0
<   DataRow 12 #1 2 '42'
<   DataRow 12 #1 2 '43'
<   DataRow 12 #1 2 '44'
<   DataRow 12 #1 2 '45'
<   CommandComplete 13 "SELECT 4"
<   ParseComplete   4
<   ParseComplete   4
<   ParseComplete   4
<   BindComplete4
<   RowDescription  40 #1 "generate_series" 0 #0 23 #4 -1 #0
<   DataRow 12 #1 2 '42'
<   DataRow 12 #1 2 '43'
<   DataRow 12 #1 2 '44'
<   DataRow 12 #1 2 '45'
<   DataRow 12 #1 2 '46'
<   CommandComplete 13 "SELECT 5"
<   ReadyForQuery   5 I

ParseComplete still appears 3 times, but I wonder if that is expected only for 
pipeline's singlerowmode test.
From my observation, when I traced the whole regression test output
by putting PQtrace() in fe-connect.c: connectDBComplete(),
ParseComplete appears only once or twice, etc. for other commands.
In other words, ISTM, it's a case-to-case basis:
Some examples from the whole regression trace output is below:
a.
>   Describe6 P ""
>   Execute 9 "" 0
>   Sync4
<   ParseComplete   4
<   BindComplete4
b.
<   ErrorResponse   217 S "ERROR" V "ERROR" C "42883" M "function 
no_such_function(integer) does not exist" H "No function matches the given name 
and argument types. You might need to add explicit type casts." P "8" F 
"parse_func.c" L "629" R "ParseFuncOrColumn" \x00
<   ReadyForQuery   5 I
<   ParseComplete   4
<   ParseComplete   4
c.
>   Bind31 "" "my_insert" #0 #1 4 '9652' #1 #0
>   Describe6 P ""
>   Execute 9 "" 0
<   ParseComplete   4
<   ParseComplete   4
<   ParseComplete   4
<   BindComplete4
<   NoData  4
d.
<   CommandComplete 15 "INSERT 0 2"
<   ReadyForQuery   5 T
>   Parse   89 "i" "insert into test (name, amount, letter) select name, 
amount+$1, letter from test" #0
>   Sync4
<   ParseComplete   4
<   ReadyForQuery   5 T

As you can see from the examples above, ParseComplete appears in multiples/once 
depending on the test.
As for libpq_pipeline's test_singlerowmode, it appears 3x for the 
CommandComplete SELECT.
I am wondering now whether or not it's a bug. Maybe it's not.
Thoughts?


How I traced the whole regression test: fe-connect.c: connectDBComplete()

@@ -2114,6 +2137,7 @@ connectDBComplete(PGconn *conn)
int timeout = 0;
int last_whichhost = -2;/* certainly different 
from whichhost */
struct addrinfo *last_addr_cur = NULL;
+   FILE*trace_file;
 
if (conn == NULL || conn->status == CONNECTION_BAD)
return 0;
@@ -2171,6 +2195,9 @@ connectDBComplete(PGconn *conn)
switch (flag)
{
case PGRES_POLLING_OK:
+   trace_file =

RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com

From: Tom Lane 
> But I think passing the message start address explicitly might be better than
> having it understand the buffering behavior in enough detail to know where to
> find the message.  Part of the point here
> (IMO) is to decouple the tracing logic from the core libpq logic, in hopes of 
> not
> having common-mode bugs.

Ouch, you're perfectly right.  Then let's make the signature:

void pqLogMessage(PGconn *conn, const char *message, bool toServer);

Thank you!


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-04 Thread Tom Lane

"tsunakawa.ta...@fujitsu.com"  writes:
> Oops, the logging facility needs the destination (conn->pfdebug).  So, it 
> will be:

> void pqLogMessage(PGconn *conn, bool toServer);

Right, and it'd need the debug flags too, so +1 for just passing the
PGconn instead of handing those over piecemeal.

But I think passing the message start address explicitly might be
better than having it understand the buffering behavior in enough
detail to know where to find the message.  Part of the point here
(IMO) is to decouple the tracing logic from the core libpq logic,
in hopes of not having common-mode bugs.

regards, tom lane

RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com

From: tsunakawa.ta...@fujitsu.com 
> I understood that the former is pqParseInput3() and the latter is
> pqPutMsgEnd().  They call the logging function wne conn->pfdebug is not
> NULL.  Its signature is like this (that will be defined in libpq-trace.c):
> 
> void pqLogMessage(const char *message, bool toServer);

Oops, the logging facility needs the destination (conn->pfdebug).  So, it will 
be:

void pqLogMessage(PGconn *conn, bool toServer);

The function should know where to extract the message from.


Regards
Takayuki Tsunakawa

RE: libpq debug log

2021-03-04 Thread tsunakawa.ta...@fujitsu.com

Tom-san, Alvaro-san,


From: Tom Lane 
> I took a quick look through the v22 patch, and TBH I don't like much of 
> anything
> at all about the proposed architecture.  It's retained most of the flavor of 
> the
> way it was done before, which was a hangover from the old process-on-the-fly
> scheme.

Yes, the patch just follows the current style of interspersing.  (But some 
places are removed so it's a bit cleaner.)

Anyway, I think your suggestion should be better, resulting in much cleaner 
separation of message handling and logging.


> I think the right way to do this, now that we've killed off v2 protocol 
> support, is to
> rely on the fact that libpq fully buffers both incoming and outgoing messages.
> We should nuke every last bit of the existing code that intersperses tracing 
> logic
> with normal message decoding and construction, and instead have two tracing
> entry points:

FWIW, I was surprised and glad when I saw the commit message to remove the v2 
protocol.


> (1) Log a received message.  This is called as soon as we know (from the
> length word) that we have the whole message available, and it's just passed a
> pointer into the input buffer.  It should examine the input message for itself
> and print the details.
> 
> (2) Log a message-to-be-sent.  Again, call it as soon as we've constructed a
> complete message in the output buffer, and let it examine and print that
> message for itself.

I understood that the former is pqParseInput3() and the latter is 
pqPutMsgEnd().  They call the logging function wne conn->pfdebug is not NULL.  
Its signature is like this (that will be defined in libpq-trace.c):

void pqLogMessage(const char *message, bool toServer);

The message argument points to the message type byte.  toServer is true for 
messages sent to the server.  The signature could alternatively be one of the 
following, but this is a matter of taste:

void pqLogMessage(char message_type, const char *message, bool toServer);
void pqLogMessage(char message_type, int length, const char *message, bool 
toServer);

This function simply outputs the timestamp, message direction, message type, 
length, and message contents.  The output processing of message contents 
branches on a message type with a switch-case statement.  Perhaps the output 
processing of each message should be separated into its own function like 
pqLogMessageD() for 'D' message so that the indentation won't get deep in the 
main logging function.


> Both of these pieces of logic could be written directly from the protocol
> specification, without any interaction with the main libpq code paths, which
> would be a Good Thing IMO.  The current intertwined approach is somewhere
> between useless and counterproductive if you're in need of tracing down a
> libpq bug.  (In other words, I'm suggesting that the apparent need for
> duplicate logic would be good not bad, and indeed that it'd be best to write 
> the
> tracing logic without consulting the existing libpq code at all.)

Yes, the only thing I'm concerned about is to have two places that have to be 
aware of the message format -- message encoding/decoding and logging.  But this 
cleaner separation would pay it.

Alvaro-san, what do you think?  Anyway, we'll soon post the updated patch that 
fixes the repeated ParseComplete issue based on the current patch.  After that, 
we'll send a patch based on Tom-san's idea.


Regards
Takayuki Tsunakawa

Re: libpq debug log

2021-03-04 Thread Tom Lane

"tsunakawa.ta...@fujitsu.com"  writes:
> From: Kyotaro Horiguchi 
>> Using (inCursor - inStart) as logCursor doesn't work correctly if tracing 
>> state
>> desyncs.  Once desync happens inStart can be moved at the timing that the
>> tracing code doesn't expect. This requires (as I mentioned upthread)
>> pqReadData to actively reset logCursor, though.
>> 
>> logCursor should move when bytes are fed to the tracing functoins even when
>> theyare the last bytes of a message.

> Hmm, sharp and nice insight.  I'd like Iwata-san and Kirk to consider this, 
> including Horiguchi-san's previous mails.

I took a quick look through the v22 patch, and TBH I don't like much of
anything at all about the proposed architecture.  It's retained most
of the flavor of the way it was done before, which was a hangover from
the old process-on-the-fly scheme.

I think the right way to do this, now that we've killed off v2 protocol
support, is to rely on the fact that libpq fully buffers both incoming
and outgoing messages.  We should nuke every last bit of the existing
code that intersperses tracing logic with normal message decoding and
construction, and instead have two tracing entry points:

(1) Log a received message.  This is called as soon as we know (from
the length word) that we have the whole message available, and it's
just passed a pointer into the input buffer.  It should examine the
input message for itself and print the details.

(2) Log a message-to-be-sent.  Again, call it as soon as we've constructed
a complete message in the output buffer, and let it examine and print
that message for itself.

Both of these pieces of logic could be written directly from the protocol
specification, without any interaction with the main libpq code paths,
which would be a Good Thing IMO.  The current intertwined approach is
somewhere between useless and counterproductive if you're in need of
tracing down a libpq bug.  (In other words, I'm suggesting that the
apparent need for duplicate logic would be good not bad, and indeed that
it'd be best to write the tracing logic without consulting the existing
libpq code at all.)

This would, I think, also eliminate the need for extra storage to hold
info about bits of the message, which IMO is a pretty significant
downside of the proposed design.  The printing logic would just print
directly to Pfdebug; it wouldn't have to accumulate anything, and so
it wouldn't have the out-of-memory failure modes that this patch has.
You could also get rid of messy stuff like pqTraceForciblyCloseBeMsg.

Lastly, this'd reduce the overhead the tracing functionality imposes
on normal usage to about nil.  Admittedly, all those "if (Pfdebug)"
tests probably don't cost that much execution-wise, but they cost
something.  Making only one such test per sent or received message
has to be better.

regards, tom lane

1 2 3 >

1 - 100 of 230 matches

Mail list logo