Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-22 Thread Joe Conway
On 12/22/2016 06:55 AM, Tom Lane wrote:
> Joe Conway  writes:
>> On 12/21/2016 09:20 PM, Tom Lane wrote:
>>> I see that you need to pass the PGconn into dblink_res_error() now, but
>>> what's the point of the new "bool fail" parameter?
> 
>> That part isn't new -- we added it sometime prior to 9.2:
> 
> Oh!  I misread the patch --- something about an unluckily-placed line
> wrap and not looking very closely :-(.  Yeah, it's fine as is.
> Sorry for the noise.

Thanks -- committed/backpatched to 9.2

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-22 Thread Tom Lane
Joe Conway  writes:
> On 12/21/2016 09:20 PM, Tom Lane wrote:
>> I see that you need to pass the PGconn into dblink_res_error() now, but
>> what's the point of the new "bool fail" parameter?

> That part isn't new -- we added it sometime prior to 9.2:

Oh!  I misread the patch --- something about an unluckily-placed line
wrap and not looking very closely :-(.  Yeah, it's fine as is.
Sorry for the noise.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Joe Conway
On 12/21/2016 09:20 PM, Tom Lane wrote:
> Joe Conway  writes:
>> On 12/21/2016 04:22 PM, Tom Lane wrote:
>>> In short, yes, please copy that bit into dblink.
> 
>> The attached should do the trick I think.
> 
> I see that you need to pass the PGconn into dblink_res_error() now, but
> what's the point of the new "bool fail" parameter?

That part isn't new -- we added it sometime prior to 9.2:

8<--
if (fail)
level = ERROR;
else
level = NOTICE;
8<--

It allows dblink to throw a NOTICE on remote errors rather than an
actual ERROR, e.g. for an autonomous transaction.

From the docs (9.2 in this case)
8<--
fail_on_error

If true (the default when omitted) then an error thrown on the
remote side of the connection causes an error to also be thrown locally.
If false, the remote error is locally reported as a NOTICE, and the
function returns no rows.
8<--

>> You think it is reasonable to backpatch this part too?
> 
> Yes, definitely.

Ok, will do.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
Joe Conway  writes:
> On 12/21/2016 04:22 PM, Tom Lane wrote:
>> In short, yes, please copy that bit into dblink.

> The attached should do the trick I think.

I see that you need to pass the PGconn into dblink_res_error() now, but
what's the point of the new "bool fail" parameter?

> You think it is reasonable to backpatch this part too?

Yes, definitely.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Joe Conway
On 12/21/2016 04:22 PM, Tom Lane wrote:
> Joe Conway  writes:
>> I did notice that postgres_fdw has the following stanza that I don't see
>> in dblink:
> 
>> 8<--
>> /*
>>  * If we don't get a message from the PGresult, try the PGconn.  This
>>  * is needed because for connection-level failures, PQexec may just
>>  * return NULL, not a PGresult at all.
>>  */
>> if (message_primary == NULL)
>>  message_primary = PQerrorMessage(conn);
>> 8<--
> 
>> I wonder if the original issue on pgsql-bugs was a connection-level
>> failure rather than OOM? Seems like dblink ought to do the same.
> 
> Oooh ... I had thought that code was in both, which was why I was having
> a hard time explaining the OP's failure.  But I see you're right,
> which provides a very straightforward explanation for the report.
> I believe that if libpq is unable to malloc a PGresult, it will return
> NULL but put an "out of memory" message into the PGconn's error buffer.
> I had supposed that we'd capture and report the latter, but as the
> dblink code stands, it won't.
> 
> In short, yes, please copy that bit into dblink.

The attached should do the trick I think. You think it is reasonable to
backpatch this part too?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index ee45cd2..db3085a 100644
*** a/contrib/dblink/dblink.c
--- b/contrib/dblink/dblink.c
*** static Relation get_rel_from_relname(tex
*** 112,118 
  static char *generate_relation_name(Relation rel);
  static void dblink_connstr_check(const char *connstr);
  static void dblink_security_check(PGconn *conn, remoteConn *rconn);
! static void dblink_res_error(const char *conname, PGresult *res, const char *dblink_context_msg, bool fail);
  static char *get_connect_string(const char *servername);
  static char *escape_param_str(const char *from);
  static void validate_pkattnums(Relation rel,
--- 112,119 
  static char *generate_relation_name(Relation rel);
  static void dblink_connstr_check(const char *connstr);
  static void dblink_security_check(PGconn *conn, remoteConn *rconn);
! static void dblink_res_error(PGconn *conn, const char *conname, PGresult *res,
! 			 const char *dblink_context_msg, bool fail);
  static char *get_connect_string(const char *servername);
  static char *escape_param_str(const char *from);
  static void validate_pkattnums(Relation rel,
*** dblink_open(PG_FUNCTION_ARGS)
*** 427,433 
  	res = PQexec(conn, buf.data);
  	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		dblink_res_error(conname, res, "could not open cursor", fail);
  		PG_RETURN_TEXT_P(cstring_to_text("ERROR"));
  	}
  
--- 428,434 
  	res = PQexec(conn, buf.data);
  	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		dblink_res_error(conn, conname, res, "could not open cursor", fail);
  		PG_RETURN_TEXT_P(cstring_to_text("ERROR"));
  	}
  
*** dblink_close(PG_FUNCTION_ARGS)
*** 496,502 
  	res = PQexec(conn, buf.data);
  	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		dblink_res_error(conname, res, "could not close cursor", fail);
  		PG_RETURN_TEXT_P(cstring_to_text("ERROR"));
  	}
  
--- 497,503 
  	res = PQexec(conn, buf.data);
  	if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		dblink_res_error(conn, conname, res, "could not close cursor", fail);
  		PG_RETURN_TEXT_P(cstring_to_text("ERROR"));
  	}
  
*** dblink_fetch(PG_FUNCTION_ARGS)
*** 599,605 
  		(PQresultStatus(res) != PGRES_COMMAND_OK &&
  		 PQresultStatus(res) != PGRES_TUPLES_OK))
  	{
! 		dblink_res_error(conname, res, "could not fetch from cursor", fail);
  		return (Datum) 0;
  	}
  	else if (PQresultStatus(res) == PGRES_COMMAND_OK)
--- 600,607 
  		(PQresultStatus(res) != PGRES_COMMAND_OK &&
  		 PQresultStatus(res) != PGRES_TUPLES_OK))
  	{
! 		dblink_res_error(conn, conname, res,
! 		 "could not fetch from cursor", fail);
  		return (Datum) 0;
  	}
  	else if (PQresultStatus(res) == PGRES_COMMAND_OK)
*** dblink_record_internal(FunctionCallInfo
*** 750,757 
  if (PQresultStatus(res) != PGRES_COMMAND_OK &&
  	PQresultStatus(res) != PGRES_TUPLES_OK)
  {
! 	dblink_res_error(conname, res, "could not execute query",
! 	 fail);
  	/* if fail isn't set, we'll return an empty query result */
  }
  else
--- 752,759 
  if (PQresultStatus(res) != PGRES_COMMAND_OK &&
  	PQresultStatus(res) != PGRES_TUPLES_OK)
  {
! 	dblink_res_error(conn, conname, res,
! 	 "could not execute query", fail);
  	/* if fail isn't set, we'll return an empty query result */
  }
  else
*** materializeQueryResult(FunctionCallInfo
*** 996,1002 
  			PGresult   *res1 = res;
  
  			res = NULL;
! 			

Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
Joe Conway  writes:
> I did notice that postgres_fdw has the following stanza that I don't see
> in dblink:

> 8<--
> /*
>  * If we don't get a message from the PGresult, try the PGconn.  This
>  * is needed because for connection-level failures, PQexec may just
>  * return NULL, not a PGresult at all.
>  */
> if (message_primary == NULL)
>   message_primary = PQerrorMessage(conn);
> 8<--

> I wonder if the original issue on pgsql-bugs was a connection-level
> failure rather than OOM? Seems like dblink ought to do the same.

Oooh ... I had thought that code was in both, which was why I was having
a hard time explaining the OP's failure.  But I see you're right,
which provides a very straightforward explanation for the report.
I believe that if libpq is unable to malloc a PGresult, it will return
NULL but put an "out of memory" message into the PGconn's error buffer.
I had supposed that we'd capture and report the latter, but as the
dblink code stands, it won't.

In short, yes, please copy that bit into dblink.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Joe Conway
On 12/21/2016 10:08 AM, Tom Lane wrote:
> I wrote:
 I propose that we should change that string to "could not obtain message
 string for error on connection "foo"", or something along that line.
> 
> BTW, looking closer, I notice that the dblink case already has
> 
>   errcontext("Error occurred on dblink connection named \"%s\": %s.",
>  dblink_context_conname, dblink_context_msg)));
> 
> so we probably don't need the connection name in the primary error
> message.  Now I think "could not obtain message string for remote error"
> would be a sufficient message.
> 
> In the postgres_fdw case, I'd be inclined to use the same replacement
> primary message.  Maybe we should think about adding the server name
> to the errcontext there, but that seems like an independent improvement.

Committed that way.

I did notice that postgres_fdw has the following stanza that I don't see
in dblink:

8<--
/*
 * If we don't get a message from the PGresult, try the PGconn.  This
 * is needed because for connection-level failures, PQexec may just
 * return NULL, not a PGresult at all.
 */
if (message_primary == NULL)
message_primary = PQerrorMessage(conn);
8<--

I wonder if the original issue on pgsql-bugs was a connection-level
failure rather than OOM? Seems like dblink ought to do the same.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
I wrote:
>>> I propose that we should change that string to "could not obtain message
>>> string for error on connection "foo"", or something along that line.

BTW, looking closer, I notice that the dblink case already has

  errcontext("Error occurred on dblink connection named \"%s\": %s.",
 dblink_context_conname, dblink_context_msg)));

so we probably don't need the connection name in the primary error
message.  Now I think "could not obtain message string for remote error"
would be a sufficient message.

In the postgres_fdw case, I'd be inclined to use the same replacement
primary message.  Maybe we should think about adding the server name
to the errcontext there, but that seems like an independent improvement.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Joe Conway
On 12/21/2016 09:27 AM, Tom Lane wrote:
> Joe Conway  writes:
>> On 12/21/2016 08:49 AM, Tom Lane wrote:
>>> I propose that we should change that string to "could not obtain message
>>> string for error on connection "foo"", or something along that line.
>>>
>>> postgres_fdw has the same disease.  It wouldn't have the notion of a named
>>> connection, but maybe we could insert the foreign server name instead.
> 
>> Seems reasonable to me. I can work on it if you'd like. Do you think
>> this should be backpatched?
> 
> If you have time for it, please do, I have lots on my plate already.

Ok, will do.

> I'd vote for back-patching; the benefits of a clearer error message
> are obvious, and it hardly seems likely that any existing applications
> are depending on this specific error string.

Agreed.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
Joe Conway  writes:
> On 12/21/2016 08:49 AM, Tom Lane wrote:
>> I propose that we should change that string to "could not obtain message
>> string for error on connection "foo"", or something along that line.
>>
>> postgres_fdw has the same disease.  It wouldn't have the notion of a named
>> connection, but maybe we could insert the foreign server name instead.

> Seems reasonable to me. I can work on it if you'd like. Do you think
> this should be backpatched?

If you have time for it, please do, I have lots on my plate already.

I'd vote for back-patching; the benefits of a clearer error message
are obvious, and it hardly seems likely that any existing applications
are depending on this specific error string.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Joe Conway
On 12/21/2016 08:49 AM, Tom Lane wrote:
> We have a report in pgsql-general of a dblink query failing with
>   ERROR: unknown error
> This is, to say the least, unhelpful.  And it violates our error
> message style guidelines.
> 
> Where that is coming from is a situation where we've failed to extract
> any primary message from a remote error.  (I theorize that today's report
> is triggered by an out-of-memory situation, but that's only an unsupported
> guess at the moment.)
> 
> I propose that we should change that string to "could not obtain message
> string for error on connection "foo"", or something along that line.
> 
> postgres_fdw has the same disease.  It wouldn't have the notion of a named
> connection, but maybe we could insert the foreign server name instead.
> 
> A possible objection is that if we really are on the hairy edge of OOM,
> trying to construct such error strings might push us over the edge and
> then you get "out of memory" instead.  But IMO that's not worse; it
> could be argued to be a more useful description of the problem.
> 
> Comments?

Seems reasonable to me. I can work on it if you'd like. Do you think
this should be backpatched?

Joe


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, Dec 21, 2016 at 9:49 AM, Tom Lane  wrote:
>> A possible objection is that if we really are on the hairy edge of OOM,
>> trying to construct such error strings might push us over the edge

> What am I missing here?  Constructing said string occurs on the local end
> of the connection but the memory problem exists on the remote end.

Um, that's not clear, in fact it's not likely.  The backend usually
manages to tell you so if it's out of memory --- or, worst case, it
crashes trying, in which case the local libpq ought to gin up a report
about loss of connection.  So the root cause I'm suspecting is that
the local libpq was unable to obtain memory for a PGresult or something
like that.  That theory has some holes of its own, because libpq also
keeps some cards up its sleeve that usually let it report out-of-memory
successfully, but it's the best I can do with the info at hand.

In any case, the point of the error style guidelines is that it's *always*
possible to do better than "unknown error"; now that it's been proven
that this case is reachable in the field, we should try harder.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread Tom Lane
I wrote:
> We have a report in pgsql-general of a dblink query failing with
>   ERROR: unknown error

Er, fingers faster than brain this morning.  Actually that report is in
pgsql-bugs, specifically bug #14471:
https://www.postgresql.org/message-id/20161221094443.25614.47807%40wrigleys.postgresql.org

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting rid of "unknown error" in dblink and postgres_fdw

2016-12-21 Thread David G. Johnston
On Wed, Dec 21, 2016 at 9:49 AM, Tom Lane  wrote:

> A possible objection is that if we really are on the hairy edge of OOM,
> trying to construct such error strings might push us over the edge


What am I missing here?  Constructing said string occurs on the local end
of the connection but the memory problem exists on the remote end.

David J.