Re: Patch proposal: New hooks in the connection path

2023-04-04 Thread Drouvot, Bertrand

Hi,

On 4/4/23 12:08 AM, Gregory Stark (as CFM) wrote:

> This looks like it was a good discussion -- last summer. But it
> doesn't seem to be a patch under active development now.
>
> It sounds like there were some design constraints that still need some
> new ideas to solve and a new patch will be needed to address them.
>
> Should this be marked Returned With Feedback?

I just marked it as Returned With Feedback.

I may re-open it later on to resume the discussion or share
new ideas though.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Patch proposal: New hooks in the connection path

2023-04-03 Thread Gregory Stark (as CFM)
This looks like it was a good discussion -- last summer. But it
doesn't seem to be a patch under active development now.

It sounds like there were some design constraints that still need some
new ideas to solve and a new patch will be needed to address them.

Should this be marked Returned With Feedback?

-- 
Gregory Stark
As Commitfest Manager




Re: Patch proposal: New hooks in the connection path

2022-08-16 Thread Gurjeet Singh
On Tue, Aug 16, 2022 at 3:16 AM Bharath Rupireddy wrote:
>
> On Tue, Aug 16, 2022 at 1:55 PM Drouvot, Bertrand  wrote:
> >
> > Hi,
> >
> > On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
> > > On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand  
> > > wrote:
> > >> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
> > >>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand  
> > >>> wrote:
> > >>> I think we can reduce the number of places the hook is called, if we
> > >>> call the hook from proc_exit(), and at all the other places we simply 
> > >>> set
> > >>> a global variable to signify the reason for the failure. The case of
> > >>> _exit(1) from the signal-handler cannot use such a mechanism, but I
> > >>> think all the other cases of interest can simply register one of the
> > >>> FCET_* values, and let the call from proc_exit() pass that value
> > >>> to the hook.
> > >> That looks like a good idea to me. I'm tempted to rewrite the patch that
> > >> way (and addressing the first comment in the same time).
> > >>
> > >> Curious to hear about others hackers thoughts too.

I agree that we need feedback from long-timers here on the decision
of whether to use proc_exit() for this purpose.

> > > IMO, calling the hook from proc_exit() is not a good design as
> > > proc_exit() is a generic code called from many places in the source
> > > code, even the simple code of kind  if(call_failed_conn_hook) {
> > > failed_conn_hook(params);} can come in the way of many exit code paths
> > > which is undesirable, and the likelihood of introducing new bugs may
> > > increase.
> >
> > Thanks for the feedback.
> >
> > What do you think about calling the hook only if the new global variable
> > is not equal to its default value (which would mean don't trigger the
> > hook)?
>
> IMO, that's not a good design as explained above. Why should the
> failed connection hook related code get hit for each and every
> proc_exit() call? Here, the code duplication i.e. the number of places
> the failed connection hook gets called mustn't be the reason to move
> that code to proc_exit().

I agree, it doesn't feel _clean_, having to maintain a global
variable, pass it to the hook at exit, etc. But the alternative feels
even less clean.

This hook needs to be called when the process has decided to exit, so
it makes sense to place this call in stack above proc_exit(), whose
sole job is to let the process die gracefully, and take care of things
on the way out.

There are quite a few places in core that leverage proc_exit()'s
facilities (by registering on_proc_exit callbacks), so an
extension/hook doing so wouldn't be out of the ordinary (apparently
contrib/sepgsql has already set the precedent of an extension using
the on_proc_exit callback). Admittedly, in this case the core will be
managing and passing it the additional global variable needed to
record the connection failure reason, FCET_*.

If we agree that proc_exit() is a good place to place this call, then
this hook can be converted into an on_proc_exit callback. If the global
variable is exported, then the extension(s) can access it in the
callback to ascertain why the process is exiting, and proc_exit()
won't have to know anything special about the extension, or hook, or
the global variable.

The on_proc_exit callback method wouldn't work for the _exit() called
in StartupPacketTimeoutHandler(), so that will need to be handled
separately.
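
The on_proc_exit variant described above can be sketched roughly as
follows. This is a minimal, self-contained model and not PostgreSQL code:
failed_connection_reason, FCET_NONE, FCET_COUNT, register_exit_callback()
and run_exit_callbacks() are hypothetical stand-ins (the last two model
on_proc_exit() and the callback loop inside proc_exit()). Only the FCET_
prefix and the two event names come from the patch under discussion.

```c
#include <stddef.h>

/*
 * Hypothetical event type. FCET_BAD_STARTUP_PACKET and
 * FCET_STARTUP_PACKET_TIMEOUT are named in the patch; FCET_NONE (the
 * "nothing to report" default) and FCET_COUNT are assumptions.
 */
typedef enum FailedConnectionEventType
{
    FCET_NONE = 0,
    FCET_BAD_STARTUP_PACKET,
    FCET_STARTUP_PACKET_TIMEOUT,
    FCET_COUNT
} FailedConnectionEventType;

/* Exported global: core call sites would set it just before proc_exit(). */
FailedConnectionEventType failed_connection_reason = FCET_NONE;

/* Stand-in for the on_proc_exit() callback list. */
typedef void (*exit_callback) (int code, void *arg);

#define MAX_EXIT_CALLBACKS 8

struct exit_callback_entry
{
    exit_callback fn;
    void       *arg;
};

struct exit_callback_entry exit_callbacks[MAX_EXIT_CALLBACKS];
int         n_exit_callbacks = 0;

/* Models on_proc_exit(): returns 0 on success, -1 if the list is full. */
int
register_exit_callback(exit_callback fn, void *arg)
{
    if (n_exit_callbacks >= MAX_EXIT_CALLBACKS)
        return -1;
    exit_callbacks[n_exit_callbacks].fn = fn;
    exit_callbacks[n_exit_callbacks].arg = arg;
    n_exit_callbacks++;
    return 0;
}

/*
 * Models the callback loop of proc_exit(): callbacks run in reverse
 * registration order. The real function would then call exit(code).
 */
void
run_exit_callbacks(int code)
{
    while (n_exit_callbacks > 0)
    {
        n_exit_callbacks--;
        exit_callbacks[n_exit_callbacks].fn(code,
                                            exit_callbacks[n_exit_callbacks].arg);
    }
}

/* Extension side: per-reason counters updated at process exit. */
unsigned int failed_connection_counts[FCET_COUNT];

void
count_failed_connection(int code, void *arg)
{
    (void) code;
    (void) arg;

    /* No reason was recorded, so this exit is not a failed connection. */
    if (failed_connection_reason == FCET_NONE)
        return;
    failed_connection_counts[failed_connection_reason]++;
}
```

With this shape the exit path itself knows nothing about the extension:
a call site only assigns failed_connection_reason, and the callback
decides what to do with it. In a real backend the counters would
presumably have to live somewhere that outlives the exiting process,
such as shared memory.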

Best regards,
Gurjeet
http://Gurje.et




Re: Patch proposal: New hooks in the connection path

2022-08-16 Thread Bharath Rupireddy
On Tue, Aug 16, 2022 at 1:55 PM Drouvot, Bertrand  wrote:
>
> Hi,
>
> On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
> > On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand  
> > wrote:
> >> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
> >>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand  
> >>> wrote:
> >>> I think we can reduce the number of places the hook is called, if we
> >>> call the hook from proc_exit(), and at all the other places we simply set
> >>> a global variable to signify the reason for the failure. The case of
> >>> _exit(1) from the signal-handler cannot use such a mechanism, but I
> >>> think all the other cases of interest can simply register one of the
> >>> FCET_* values, and let the call from proc_exit() pass that value
> >>> to the hook.
> >> That looks like a good idea to me. I'm tempted to rewrite the patch that
> >> way (and addressing the first comment in the same time).
> >>
> >> Curious to hear about others hackers thoughts too.
> > IMO, calling the hook from proc_exit() is not a good design as
> > proc_exit() is a generic code called from many places in the source
> > code, even the simple code of kind  if(call_failed_conn_hook) {
> > failed_conn_hook(params);} can come in the way of many exit code paths
> > which is undesirable, and the likelihood of introducing new bugs may
> > increase.
>
> Thanks for the feedback.
>
> What do you think about calling the hook only if the new global variable
> is not equal to its default value (which would mean don't trigger the
> hook)?

IMO, that's not a good design as explained above. Why should the
failed connection hook related code get hit for each and every
proc_exit() call? Here, the code duplication i.e. the number of places
the failed connection hook gets called mustn't be the reason to move
that code to proc_exit().

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/




Re: Patch proposal: New hooks in the connection path

2022-08-16 Thread Drouvot, Bertrand

Hi,

On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
> On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand wrote:
>> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
>>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand wrote:
>>> I think we can reduce the number of places the hook is called, if we
>>> call the hook from proc_exit(), and at all the other places we simply set
>>> a global variable to signify the reason for the failure. The case of
>>> _exit(1) from the signal-handler cannot use such a mechanism, but I
>>> think all the other cases of interest can simply register one of the
>>> FCET_* values, and let the call from proc_exit() pass that value
>>> to the hook.
>> That looks like a good idea to me. I'm tempted to rewrite the patch that
>> way (and addressing the first comment in the same time).
>>
>> Curious to hear about others hackers thoughts too.
> IMO, calling the hook from proc_exit() is not a good design as
> proc_exit() is a generic code called from many places in the source
> code, even the simple code of kind  if(call_failed_conn_hook) {
> failed_conn_hook(params);} can come in the way of many exit code paths
> which is undesirable, and the likelihood of introducing new bugs may
> increase.


Thanks for the feedback.

What do you think about calling the hook only if the new global variable 
is not equal to its default value (which would mean don't trigger the 
hook)?


Regards,

--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com





Re: Patch proposal: New hooks in the connection path

2022-08-16 Thread Bharath Rupireddy
On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand  wrote:
>
> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
> > On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand  
> > wrote:
>
> > I think we can reduce the number of places the hook is called, if we
> > call the hook from proc_exit(), and at all the other places we simply set
> > a global variable to signify the reason for the failure. The case of
> > _exit(1) from the signal-handler cannot use such a mechanism, but I
> > think all the other cases of interest can simply register one of the
> > FCET_* values, and let the call from proc_exit() pass that value
> > to the hook.
>
> That looks like a good idea to me. I'm tempted to rewrite the patch that
> way (and addressing the first comment in the same time).
>
> Curious to hear about others hackers thoughts too.

IMO, calling the hook from proc_exit() is not a good design, as
proc_exit() is generic code called from many places in the source
code. Even simple code of the kind if (call_failed_conn_hook) {
failed_conn_hook(params); } can get in the way of many exit code paths,
which is undesirable, and the likelihood of introducing new bugs may
increase.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/




Re: Patch proposal: New hooks in the connection path

2022-08-13 Thread Gurjeet Singh
(reposting the same review, with many grammatical fixes)

On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand  wrote:
> Please find attached v2-0004-connection_hooks.patch

 /*
  * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
  * already did any appropriate error reporting.
  */
 if (status != STATUS_OK)
+{
+#ifndef EXEC_BACKEND
+if (FailedConnection_hook)
+(*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
+#endif
 proc_exit(0);
+}

Per the comment above the if condition, the `status != STATUS_OK` case may
represent a cancel packet as well. Clearly, a cancel packet is not
the same as a _bad_ packet. So I think here you need to differentiate
between a cancel packet and a genuinely bad packet; I don't see
anything good coming out of us, or the hook-developer, lumping
those 2 cases together.
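
To make the distinction concrete, here is a small self-contained sketch
of how the two cases could be told apart before deciding whether to
report a failed connection. StartupPacketOutcome,
classify_startup_packet() and is_failed_connection() are hypothetical
names, and the classification is deliberately simplified (it ignores SSL
and GSS negotiation requests and every other check done while parsing a
startup packet). The only detail borrowed from PostgreSQL is the
protocol-code encoding, with cancel requests using the special code
1234.5678.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as PostgreSQL's PG_PROTOCOL(): major in the high 16 bits. */
#define PROTOCOL(major, minor) \
    (((uint32_t) (major) << 16) | (uint32_t) (minor))
#define PROTOCOL_MAJOR(v)   ((v) >> 16)

/* Cancel requests are sent with the special protocol code 1234.5678. */
#define CANCEL_REQUEST      PROTOCOL(1234, 5678)

/* Hypothetical three-way result replacing a plain OK/error status. */
typedef enum StartupPacketOutcome
{
    STARTUP_PACKET_OK,
    STARTUP_PACKET_CANCEL,
    STARTUP_PACKET_BAD
} StartupPacketOutcome;

/* Simplified: accept protocol major version 3, recognize cancel requests. */
StartupPacketOutcome
classify_startup_packet(uint32_t proto)
{
    if (proto == CANCEL_REQUEST)
        return STARTUP_PACKET_CANCEL;
    if (PROTOCOL_MAJOR(proto) == 3)
        return STARTUP_PACKET_OK;
    return STARTUP_PACKET_BAD;
}

/* Only a genuinely bad packet counts as a failed connection. */
bool
is_failed_connection(StartupPacketOutcome outcome)
{
    return outcome == STARTUP_PACKET_BAD;
}
```

A call site could then invoke the hook with FCET_BAD_STARTUP_PACKET only
when is_failed_connection() returns true, and exit quietly for a cancel
request.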

I think we can reduce the number of places the hook is called, if we
call the hook from proc_exit(), and at all the other places we simply set
a global variable to signify the reason for the failure. The case of
_exit(1) from the signal-handler cannot use such a mechanism, but I
think all the other cases of interest can simply register one of the
FCET_* values, and let the call from proc_exit() pass that value
to the hook.

If we can convince ourselves that we can use proc_exit(1) in
StartupPacketTimeoutHandler(), instead of calling _exit(1), I think we
can replace all call sites for this hook with the
set-global-variable variant.

> ...
> * This should be the only function to call exit().
> * -cim 2/6/90
>...
> proc_exit(int code)

The comment on proc_exit() claims that it should be the only place
calling exit(), except that the add-on/extension hooks may ignore this.
So there must be a strong reason why the signal-handler uses _exit()
to bypass all callbacks.

Best regards,
Gurjeet
http://Gurje.et




Re: Patch proposal: New hooks in the connection path

2022-08-13 Thread Gurjeet Singh
On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand  wrote:
> Please find attached v2-0004-connection_hooks.patch

 /*
  * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
  * already did any appropriate error reporting.
  */
 if (status != STATUS_OK)
+{
+#ifndef EXEC_BACKEND
+if (FailedConnection_hook)
+(*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
+#endif
 proc_exit(0);
+}

Per the comment above the if condition, the `status != STATUS_OK` case may
represent a cancel packet as well. Clearly, a cancel packet is not
the same as a _bad_ packet. So I think here you need to differentiate
between a cancel packet and a genuinely bad packet; I don't see
anything good coming out of us, or the hook-developer, lumping
those 2 cases together.

I think we can reduce the number of places the hook is called, if we
call the hook from proc_exit(), and at all the other places we simply set
a global variable to signify the reason for the failure. The case of
_exit(1) from the signal-handler cannot use such a mechanism, but I
think all the other cases of interest can simply register one of the
FCET_* values, and the hook call from proc_exit() can pass that value
to the hook.

If we can convince ourselves that we can use proc_exit(1) in
StartupPacketTimeoutHandler(), instead of calling _exit(1), I think we
can replace all call sites for this hook with the
set-global-variable variant.

> ...
> * This should be the only function to call exit().
> * -cim 2/6/90
>...
> proc_exit(int code)

The comment on proc_exit() claims that it should be the only place
calling exit(), except for the add-on/extension hooks. So there must be a
strong reason why the signal-handler uses _exit() to bypass all
callbacks.

Best regards,
Gurjeet
http://Gurje.et




Re: Patch proposal: New hooks in the connection path

2022-08-13 Thread Bharath Rupireddy
On Tue, Aug 2, 2022 at 6:55 PM Drouvot, Bertrand  wrote:
>
> Hi Bharath,
>
> On 7/14/22 11:43 AM, Bharath Rupireddy wrote:
> > On Fri, Jul 8, 2022 at 5:54 PM Bharath Rupireddy
> >  wrote:
> >> Looking at v2-0003 patch and emit_log_hook, how about we filter out
> >> for those connectivity errors either based on error codes and if they
> >> aren't unique, perhaps passing special flags to ereport API indicating
> >> that it's a connectivity error and in the emit_log_hook we can look
> >> for those connectivity error codes or flags to collect the stats about
> >> the failure connections (with MyProcPort being present in
> >> emit_log_hook)? This way, we don't need a new hook. Thoughts?
> > Bertrand and Other Hackers, above comment may have been lost in the
> > wild - any thoughts on it?
>
> Thanks for your feedback!
>
> I can see 2 issues with that approach:
>
> - We’ll not be able to track the “startup timeout case” (well, we may
> not be able to track it anyway, depending on what the follow-up to [1]
> will be) as it does not emit any log messages.
>
> [1]:
> https://www.postgresql.org/message-id/a1558d12-c1c4-0fe5-f8a5-2b6c2294e55f%40amazon.com

Yes, we wanted to be very quick in StartupPacketTimeoutHandler because
it is a timeout signal handler after all.

> - We’ll depend on the log_min_messages value (meaning
> edata->output_to_server needs to be true for the emit_log_hook to be
> triggered).

Hm, we can just say that 'the log_min_messages setting will enable/disable
the feature'.

I agree with your first point: StartupPacketTimeoutHandler does not emit
an error, so I don't think using emit_log_hook for the connection
failure stats helps.

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/




Re: Patch proposal: New hooks in the connection path

2022-07-14 Thread Bharath Rupireddy
On Fri, Jul 8, 2022 at 5:54 PM Bharath Rupireddy wrote:
>
> Looking at v2-0003 patch and emit_log_hook, how about we filter out
> for those connectivity errors either based on error codes and if they
> aren't unique, perhaps passing special flags to ereport API indicating
> that it's a connectivity error and in the emit_log_hook we can look
> for those connectivity error codes or flags to collect the stats about
> the failure connections (with MyProcPort being present in
> emit_log_hook)? This way, we don't need a new hook. Thoughts?

Bertrand and Other Hackers, above comment may have been lost in the
wild - any thoughts on it?

Regards,
Bharath Rupireddy.




Re: Patch proposal: New hooks in the connection path

2022-07-08 Thread Tom Lane
Bharath Rupireddy  writes:
> On Fri, Jul 8, 2022 at 1:40 AM Tom Lane  wrote:
>> It doesn't seem like a great place for a hook, because the list of stuff
>> you could safely do there would be mighty short, possibly the empty set.

> I agree with this. But all of the areas that v2-0003 touched for
> connectivity failures typically emit
> ereport(FATAL, ...)/ereport(COMMERROR, ...) (in ProcessStartupPacket), and
> we already expose emit_log_hook, where implementers can,
> literally, do anything in the hook.

This is utterly off-point, because those calls are not inside
signal handlers.

regards, tom lane




Re: Patch proposal: New hooks in the connection path

2022-07-08 Thread Bharath Rupireddy
On Fri, Jul 8, 2022 at 1:40 AM Tom Lane  wrote:
>
> Joe Conway  writes:
> > It isn't clear to me if having a hook in the timeout handler is a
> > nonstarter -- perhaps a comment with suitable warning for prospective
> > extension authors is enough? Anyone else want to weigh in on this issue
> > specifically?
>
> It doesn't seem like a great place for a hook, because the list of stuff
> you could safely do there would be mighty short, possibly the empty set.
> Write to shared memory?  Not too safe.  Write to a file?  Even less.
> Write to local memory?  Pointless, because we're about to _exit(1).
> Pretty much anything I can think of that you'd want to do is something
> we've already decided the core code can't safely do, and putting it
> in a hook won't make it safer.

I agree with this. But all of the areas that v2-0003 touched for
connectivity failures typically emit
ereport(FATAL, ...)/ereport(COMMERROR, ...) (in ProcessStartupPacket), and
we already expose emit_log_hook, where implementers can,
literally, do anything in the hook.

Looking at the v2-0003 patch and emit_log_hook, how about we filter for
those connectivity errors based on error codes? If the codes aren't
unique, we could perhaps pass special flags to the ereport API indicating
that it's a connectivity error. In emit_log_hook we can then look
for those connectivity error codes or flags to collect stats about
failed connections (MyProcPort is available in
emit_log_hook). This way, we don't need a new hook. Thoughts?
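
As a rough illustration of the error-code filtering idea, the sketch
below classifies an error by its SQLSTATE class. It is self-contained
and does not use PostgreSQL headers: LogEvent is a hypothetical stand-in
for the ErrorData passed to emit_log_hook, and treating classes 08
(connection exception) and 28 (invalid authorization specification) as
"connectivity errors" is an assumption made for the example. The
SQLSTATE packing mirrors the six-bit encoding used by MAKE_SQLSTATE().

```c
#include <stdbool.h>

/* SQLSTATE packing: five characters, six bits each. */
#define SIXBIT(ch)  (((ch) - '0') & 0x3F)
#define SQLSTATE(c1, c2, c3, c4, c5) \
    (SIXBIT(c1) + (SIXBIT(c2) << 6) + (SIXBIT(c3) << 12) + \
     (SIXBIT(c4) << 18) + (SIXBIT(c5) << 24))

/* The first two characters (low 12 bits) form the SQLSTATE class. */
#define SQLSTATE_CLASS(code)    ((code) & ((1 << 12) - 1))

/* Assumed set of "connectivity" classes for this example. */
#define CLASS_CONNECTION_EXCEPTION  SQLSTATE('0', '8', '0', '0', '0')
#define CLASS_INVALID_AUTHORIZATION SQLSTATE('2', '8', '0', '0', '0')

/* Hypothetical stand-in for the ErrorData seen by emit_log_hook. */
typedef struct LogEvent
{
    int         sqlerrcode;
} LogEvent;

unsigned int connectivity_error_count = 0;

bool
is_connectivity_error(int sqlerrcode)
{
    int         sqlstate_class = SQLSTATE_CLASS(sqlerrcode);

    return sqlstate_class == CLASS_CONNECTION_EXCEPTION ||
        sqlstate_class == CLASS_INVALID_AUTHORIZATION;
}

/* Shape of a log hook that only counts connectivity-related errors. */
void
count_connectivity_errors(const LogEvent *event)
{
    if (is_connectivity_error(event->sqlerrcode))
        connectivity_error_count++;
}
```

A class-based filter like this cannot tell a protocol violation during
startup from one raised later in a session, which is the "if they aren't
unique" concern; a dedicated flag on the error would be needed for that.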

Regards,
Bharath Rupireddy.




Re: Patch proposal: New hooks in the connection path

2022-07-07 Thread Tom Lane
Joe Conway  writes:
> It isn't clear to me if having a hook in the timeout handler is a 
> nonstarter -- perhaps a comment with suitable warning for prospective 
> extension authors is enough? Anyone else want to weigh in on this issue 
> specifically?

It doesn't seem like a great place for a hook, because the list of stuff
you could safely do there would be mighty short, possibly the empty set.
Write to shared memory?  Not too safe.  Write to a file?  Even less.
Write to local memory?  Pointless, because we're about to _exit(1).
Pretty much anything I can think of that you'd want to do is something
we've already decided the core code can't safely do, and putting it
in a hook won't make it safer.

If someone wants to argue for this hook, I'd like to see a credible
example of a *safe* use-case, keeping in mind the points raised in
the comments in BackendInitialize and process_startup_packet_die.

regards, tom lane




Re: Patch proposal: New hooks in the connection path

2022-07-07 Thread Joe Conway

On 7/6/22 04:13, Drouvot, Bertrand wrote:
> On 7/6/22 12:11 AM, Joe Conway wrote:
>> On 7/5/22 03:37, Bharath Rupireddy wrote:
>>> 2. Timeout Handler is a signal handler, called as part of SIGALRM
>>> signal handler, most of the times, signal handlers ought to be doing
>>> small things, now that we are handing off the control to hook, which
>>> can do any long running work (writing to a remote storage, file,
>>> aggregate etc.), I don't think it's the right thing to do here.
>>>   static void
>>>   StartupPacketTimeoutHandler(void)
>>>   {
>>> + if (FailedConnection_hook)
>>> + (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
>>
>> But more to Bharath's point, perhaps this is a case that is better
>> served by incrementing a stat counter and not exposed as a hook?
>
> I think that the advantage of the hook is that it gives the extension
> author the ability/flexibility to aggregate the counter based on
> information available in the Port struct (say the client addr for
> example) at this stage.
>
> What about aggregating the stat counter based on the client addr? (Not
> sure if there is more useful information than the client addr at this
> stage though.)
>
> That said, I agree that having a hook in a timeout handler might not be
> the right thing to do (even if in the end it would be the extension
> author's responsibility to do "small things" in it), so it has been
> removed in the new attached version.

It isn't clear to me if having a hook in the timeout handler is a 
nonstarter -- perhaps a comment with suitable warning for prospective 
extension authors is enough? Anyone else want to weigh in on this issue 
specifically?


--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Patch proposal: New hooks in the connection path

2022-07-06 Thread Drouvot, Bertrand

Hi,

On 7/6/22 12:11 AM, Joe Conway wrote:


On 7/5/22 03:37, Bharath Rupireddy wrote:
On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand wrote:

On 7/2/22 1:00 AM, Nathan Bossart wrote:
> Could we model this after fmgr_hook?  The first argument in that hook
> indicates where it is being called from.  This doesn't alleviate the need
> for several calls to the hook in the authentication logic, but extension
> authors would only need to define one hook.

I like the idea and indeed fmgr.h looks a good place to model it.

Attached a new patch version doing so.


I was thinking along the same lines, so +1 for the general approach


Thanks for the review!




Thanks for the patch. Can we think of enhancing
ClientAuthentication_hook_type itself i.e. make it a generic hook for
all sorts of authentication metrics, info etc. with the type parameter
embedded to it instead of new hook FailedConnection_hook? We can either
add a new parameter for the "event" (the existing
ClientAuthentication_hook_type implementers will have problems), or
embed/multiplex the "event" info to existing Port structure or status
variable (macro or enum) (existing implementers will not have
compatibility problems).  IMO, this looks cleaner going forward.


Not sure I like this though -- I'll have to think about that


Not sure about this one either.

The "enhanced" ClientAuthentication_hook would have to be fired at the
same places as the new FailedConnection_hook is, but I think those
places are not necessarily linked to real authentication per se (making
the name confusing).





On the v2 patch:

1. Why do we need to place the hook and structures in fmgr.h? Why not 
in auth.h?


agreed -- it does not belong in fmgr.h


Moved to auth.h.




2. Timeout Handler is a signal handler, called as part of SIGALRM
signal handler, most of the times, signal handlers ought to be doing
small things, now that we are handing off the control to hook, which
can do any long running work (writing to a remote storage, file,
aggregate etc.), I don't think it's the right thing to do here.
  static void
  StartupPacketTimeoutHandler(void)
  {
+ if (FailedConnection_hook)
+ (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
+ ereport(COMMERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("timeout while processing startup packet")));


Why add the ereport()?


removed it.



But more to Bharath's point, perhaps this is a case that is better
served by incrementing a stat counter and not exposed as a hook?


I think that the advantage of the hook is that it gives the extension
author the ability/flexibility to aggregate the counter based on
information available in the Port struct (say the client addr for
example) at this stage.


What about aggregating the stat counter based on the client addr? (Not
sure if there is more useful information than the client addr at this
stage though.)


That said, I agree that having a hook in a timeout handler might not be
the right thing to do (even if in the end it would be the extension
author's responsibility to do "small things" in it), so it has been
removed in the new attached version.
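
To illustrate the kind of aggregation an extension author might do, here
is a minimal self-contained sketch that keeps one failure counter per
client address. ClientFailureEntry, record_failed_connection() and the
fixed-size table are hypothetical; in a real hook the address would
presumably be taken from the Port struct, and the table would need to
live in shared memory to be visible outside the failing backend.

```c
#include <string.h>

#define MAX_TRACKED_CLIENTS 64
#define ADDR_LEN            64

/* One counter per distinct client address (hypothetical layout). */
typedef struct ClientFailureEntry
{
    char        addr[ADDR_LEN];
    unsigned int failures;
} ClientFailureEntry;

ClientFailureEntry client_failures[MAX_TRACKED_CLIENTS];
int         n_client_failures = 0;

/*
 * Record one failed connection for client_addr. Returns the updated
 * count for that address, or 0 if the table is full and the address
 * is not tracked yet.
 */
unsigned int
record_failed_connection(const char *client_addr)
{
    int         i;

    /* Linear search is fine for a table this small. */
    for (i = 0; i < n_client_failures; i++)
    {
        if (strncmp(client_failures[i].addr, client_addr, ADDR_LEN - 1) == 0)
            return ++client_failures[i].failures;
    }

    if (n_client_failures >= MAX_TRACKED_CLIENTS)
        return 0;

    /* New address: copy it (truncating if needed) and start at 1. */
    strncpy(client_failures[n_client_failures].addr, client_addr, ADDR_LEN - 1);
    client_failures[n_client_failures].addr[ADDR_LEN - 1] = '\0';
    client_failures[n_client_failures].failures = 1;
    n_client_failures++;
    return 1;
}
```

A hook body would then reduce to a single call such as
record_failed_connection(addr), which keeps the work done at the failure
site small.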




Also, a teeny nit:
8<--
+   if (status != STATUS_OK) {
+   if (FailedConnection_hook)
8<--

does not follow usual practice and probably should be:

8<--
+   if (status != STATUS_OK)
+   {
+   if (FailedConnection_hook)
8<--



Thanks!, fixed.

--

Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d7257e4056..d9e1e3b4c1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -567,6 +567,8 @@ int postmaster_alive_fds[2] = {-1, -1};
 HANDLE PostmasterHandle;
 #endif
 
+FailedConnection_hook_type FailedConnection_hook = NULL;
+
 /*
  * Postmaster main entry point
  */
@@ -4462,7 +4464,11 @@ BackendInitialize(Port *port)
 * already did any appropriate error reporting.
 */
if (status != STATUS_OK)
+   {
+   if (FailedConnection_hook)
+   (*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
proc_exit(0);
+   }
 
/*
 * Now that we have the user and database name, we can set the process
diff --git a/src/backend/utils/init/postinit.c 
b/src/backend/utils/init/postinit.c
index 6b9082604f..e9bbd185f4 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -360,10 +360,14 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
if (!am_superuser &&
pg_database_aclcheck(MyDatabaseId, GetUserId(),
 ACL_CONNECT) != ACLCHECK_OK)
+   {
+   if (FailedConnection_hook)
+  

Re: Patch proposal: New hooks in the connection path

2022-07-05 Thread Joe Conway

On 7/5/22 03:37, Bharath Rupireddy wrote:

On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand  wrote:

On 7/2/22 1:00 AM, Nathan Bossart wrote:
> Could we model this after fmgr_hook?  The first argument in that hook
> indicates where it is being called from.  This doesn't alleviate the need
> for several calls to the hook in the authentication logic, but extension
> authors would only need to define one hook.

I like the idea and indeed fmgr.h looks a good place to model it.

Attached a new patch version doing so.


I was thinking along the same lines, so +1 for the general approach


Thanks for the patch. Can we think of enhancing
ClientAuthentication_hook_type itself i.e. make it a generic hook for
all sorts of authentication metrics, info etc. with the type parameter
embedded to it instead of new hook FailedConnection_hook? We can either
add a new parameter for the "event" (the existing
ClientAuthentication_hook_type implementers will have problems), or
embed/multiplex the "event" info to existing Port structure or status
variable (macro or enum) (existing implementers will not have
compatibility problems).  IMO, this looks cleaner going forward.


Not sure I like this though -- I'll have to think about that


On the v2 patch:

1. Why do we need to place the hook and structures in fmgr.h? Why not in auth.h?


agreed -- it does not belong in fmgr.h


2. Timeout Handler is a signal handler, called as part of SIGALRM
signal handler, most of the times, signal handlers ought to be doing
small things, now that we are handing off the control to hook, which
can do any long running work (writing to a remote storage, file,
aggregate etc.), I don't think it's the right thing to do here.
  static void
  StartupPacketTimeoutHandler(void)
  {
+ if (FailedConnection_hook)
+ (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
+ ereport(COMMERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("timeout while processing startup packet")));


Why add the ereport()?

But more to Bharath's point, perhaps this is a case that is better 
served by incrementing a stat counter and not exposed as a hook?


Also, a teeny nit:
8<--
+   if (status != STATUS_OK) {
+   if (FailedConnection_hook)
8<--

does not follow usual practice and probably should be:

8<--
+   if (status != STATUS_OK)
+   {
+   if (FailedConnection_hook)
8<--


--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Patch proposal: New hooks in the connection path

2022-07-05 Thread Bharath Rupireddy
On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand  wrote:
>
> Hi,
>
> On 7/2/22 1:00 AM, Nathan Bossart wrote:
> > Could we model this after fmgr_hook?  The first argument in that hook
> > indicates where it is being called from.  This doesn't alleviate the need
> > for several calls to the hook in the authentication logic, but extension
> > authors would only need to define one hook.
>
> I like the idea and indeed fmgr.h looks a good place to model it.
>
> Attached a new patch version doing so.

Thanks for the patch. Can we think of enhancing
ClientAuthentication_hook_type itself i.e. make it a generic hook for
all sorts of authentication metrics, info etc. with the type parameter
embedded to it instead of new hook FailedConnection_hook? We can either
add a new parameter for the "event" (the existing
ClientAuthentication_hook_type implementers will have problems), or
embed/multiplex the "event" info to existing Port structure or status
variable (macro or enum) (existing implementers will not have
compatibility problems).  IMO, this looks cleaner going forward.

On the v2 patch:

1. Why do we need to place the hook and structures in fmgr.h? Why not in auth.h?

2. The timeout handler is a signal handler, called as part of the SIGALRM
signal handler. Most of the time, signal handlers ought to be doing
small things. Now we are handing off control to a hook, which
can do any long-running work (writing to remote storage, a file,
aggregating, etc.); I don't think that's the right thing to do here.
 static void
 StartupPacketTimeoutHandler(void)
 {
+ if (FailedConnection_hook)
+ (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
+ ereport(COMMERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("timeout while processing startup packet")));

3. On "not letting these hooks (ClientAuthentication_hook_type or
FailedConnection_hook_type) expose sensitive info via Port structure"
- it seems like the Port structure has sensitive info like HbaLine,
host address, username, etc. but that's what it is so I think we are
okay with the structure as-is.

Regards,
Bharath Rupireddy.




Re: Patch proposal: New hooks in the connection path

2022-07-05 Thread Bharath Rupireddy
On Mon, Jul 4, 2022 at 6:29 PM Drouvot, Bertrand  wrote:
>
> On 7/2/22 2:49 AM, Roberto Mello wrote:
>
> On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart  
> wrote:
>>
>> That being said, I don't see why this information couldn't be provided in a
>> system view.  IMO it is generically useful.
>
> +1 for a system view with appropriate permissions, in addition to the hooks.
>
> That would make the information easily accessible to a number of monitoring
> systems besides the admin.
>
> Agree about that.

Are we going to have it as part of the shared memory stats? Or as a
separate shared memory area for connection stats, exposed via a
function on which a view can be built, like
pg_get_replication_slots and pg_replication_slots?

> I'll start another thread and propose a dedicated patch for the "internal 
> counters" and how to expose them.

IMHO, let's have the discussion here in this thread and the patch can be 0002.

Regards,
Bharath Rupireddy.




Re: Patch proposal: New hooks in the connection path

2022-07-04 Thread Zhihong Yu
On Mon, Jul 4, 2022 at 5:54 AM Drouvot, Bertrand wrote:

> Hi,
>
> On 7/2/22 1:00 AM, Nathan Bossart wrote:
> > Could we model this after fmgr_hook?  The first argument in that hook
> > indicates where it is being called from.  This doesn't alleviate the need
> > for several calls to the hook in the authentication logic, but extension
> > authors would only need to define one hook.
>
> I like the idea and indeed fmgr.h looks like a good place to model it.
>
> Attached a new patch version doing so.
>
> Thanks
>
> --
>
> Bertrand Drouvot
> Amazon Web Services: https://aws.amazon.com

Hi,
+   FCET_SPT,   /* startup packet timeout */
+   FCET_BSP,   /* bad startup packet */

Looking at existing enum types such as FmgrHookEventType, the part after
the underscore is a full word rather than an abbreviation.
I think it would be good to follow existing practice and make the enum
values more readable.
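
As a rough illustration of the naming suggestion, the abbreviated members could be spelled out as below. Only FCET_STARTUP_PACKET_TIMEOUT appears elsewhere in this thread; FCET_BAD_STARTUP_PACKET, FCET_NUM_EVENTS and the fcet_name() helper are hypothetical names used for the sketch, not part of the patch.

```c
#include <assert.h>

/*
 * Spelled-out event names, following the FmgrHookEventType convention of
 * using full words after the prefix.  FCET_BAD_STARTUP_PACKET and
 * FCET_NUM_EVENTS are assumed names for this sketch.
 */
typedef enum FailedConnectionEventType
{
    FCET_STARTUP_PACKET_TIMEOUT,    /* was FCET_SPT */
    FCET_BAD_STARTUP_PACKET,        /* was FCET_BSP */
    FCET_NUM_EVENTS                 /* sentinel, not a real event */
} FailedConnectionEventType;

/* Hypothetical helper: map an event to a human-readable description. */
static const char *
fcet_name(FailedConnectionEventType event)
{
    assert(event < FCET_NUM_EVENTS);    /* callers pass real events only */

    switch (event)
    {
        case FCET_STARTUP_PACKET_TIMEOUT:
            return "startup packet timeout";
        case FCET_BAD_STARTUP_PACKET:
            return "bad startup packet";
        case FCET_NUM_EVENTS:
            break;
    }
    return "unknown";
}
```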

Cheers


Re: Patch proposal: New hooks in the connection path

2022-07-01 Thread Roberto Mello
On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart 
wrote:

>
>
> That being said, I don't see why this information couldn't be provided in a
> system view.  IMO it is generically useful.


+1 for a system view with appropriate permissions, in addition to the
hooks.

That would make the information easily accessible to a number of monitoring
systems besides the admin.

Roberto

—
Crunchy Data — passion for open source PostgreSQL



Re: Patch proposal: New hooks in the connection path

2022-07-01 Thread Nathan Bossart
On Fri, Jul 01, 2022 at 09:48:40AM +0200, Drouvot, Bertrand wrote:
>> However, I'm personally not okay with having multiple hooks
>> as proposed in the v1 patch.
> 
> I agree that it would be great to reduce the number of proposed hooks.
> 
> But,
> 
>>   Can we think of having a single hook
> 
> The proposed hooks are triggered on errors (meaning that the connection
> attempt breaks) and:
> 
> - In the connection paths that will not reach the ClientAuthentication_hook
> at all: those are the ones related to the bad startup packet and timeout
> while processing the startup packet.
> 
> or
> 
> - After the ClientAuthentication_hook is fired: those are the bad db oid,
> bad db name and bad perm ones.
> 
> So, it does look like having only one hook would require refactoring in the
> connection path and I'm not sure if this is worth it.
> 
>> or
>> enhancing the existing ClientAuthentication_hook where we pass a
>> PURPOSE parameter (CONN_SUCCESS, CONN_FAILURE, CONN_FOO, CONN_BAR)
>> to the hook?
> 
> I think one could already "predict" the bad db and bad perm errors within
> the current ClientAuthentication_hook.
> 
> But in case of multiple "possible" errors (within the same connection
> attempt), how could we know for sure which one will actually be reported?
> That's why I think the best way is to put new hooks as close as possible to
> the place where the related errors are reported.
> 
> What do you think?

Could we model this after fmgr_hook?  The first argument in that hook
indicates where it is being called from.  This doesn't alleviate the need
for several calls to the hook in the authentication logic, but extension
authors would only need to define one hook.
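
A minimal sketch of what a single hook modelled after fmgr_hook might look like is given below. The names FailedConnection_hook, FailedConnection_hook_type and FCET_STARTUP_PACKET_TIMEOUT are taken from elsewhere in this thread; the other enum members, the reduced Port stand-in, report_failed_connection() and count_failed_connection() are assumptions made so the example compiles on its own, and are not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Port struct, reduced to two fields. */
typedef struct Port
{
    const char *user_name;
    const char *database_name;
} Port;

/* First hook argument says where the hook is called from (as in fmgr_hook). */
typedef enum FailedConnectionEventType
{
    FCET_BAD_STARTUP_PACKET,        /* assumed name */
    FCET_STARTUP_PACKET_TIMEOUT,
    FCET_BAD_DATABASE_NAME,         /* assumed name */
    FCET_BAD_DATABASE_OID,          /* assumed name */
    FCET_BAD_DATABASE_PERMISSION,   /* assumed name */
    FCET_NUM_EVENTS                 /* sentinel, not a real event */
} FailedConnectionEventType;

typedef void (*FailedConnection_hook_type) (FailedConnectionEventType event,
                                            const Port *port);

/* One hook variable instead of one per failure reason. */
FailedConnection_hook_type FailedConnection_hook = NULL;

/* Core side (hypothetical helper): still called at each error site. */
static void
report_failed_connection(FailedConnectionEventType event, const Port *port)
{
    if (FailedConnection_hook)
        (*FailedConnection_hook) (event, port);
}

/* Extension side: a single function dispatching on the event type. */
static unsigned long failed_counts[FCET_NUM_EVENTS];

static void
count_failed_connection(FailedConnectionEventType event, const Port *port)
{
    (void) port;                        /* read-only; unused in this sketch */
    assert(event < FCET_NUM_EVENTS);
    failed_counts[event]++;
}
```

An extension would install the hook once, for example with `FailedConnection_hook = count_failed_connection;` at load time, and the core would keep one call per error site, which matches the trade-off described above: several call sites, but only one hook to define.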

That being said, I don't see why this information couldn't be provided in a
system view.  IMO it is generically useful.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Patch proposal: New hooks in the connection path

2022-06-30 Thread Bharath Rupireddy
On Thu, Jun 30, 2022 at 1:31 PM Drouvot, Bertrand  wrote:
>
> Hi hackers,
>
> While commit 960869da08 added some information about connections that have
> been successfully authenticated, there are no metrics for connections that
> have not (or that did not reach the authentication stage).
>
> Adding metrics about failed connection attempts could also help, for example
> with proper sampling, to:
>
>  * detect spikes in failed login attempts
>  * check if there is a correlation between spikes in successful and failed
>    connection attempts
>
> While the number of successful connections can already be tracked
> with the ClientAuthentication_hook (as can the ones that failed
> authentication), we are missing metrics about:
>
>  * why the connection failed (could be bad password, bad database, bad user,
>    missing CONNECT privilege...)
>  * number of times the authentication stage has not been reached
>  * why the authentication stage has not been reached (bad startup packets,
>    timeout while processing startup packet,...)
>
> Those missing metrics (in addition to the ones that can already be gathered)
> could provide value for:
>
>  * security investigations
>  * anomaly detection
>  * tracking application misconfigurations
>
> In an attempt to provide those metrics, please find attached a
> patch proposal to add new hooks in the connection path, which would be fired
> if:
>
>  * there is a bad startup packet
>  * there is a timeout while processing the startup packet
>  * user does not have CONNECT privilege
>  * database does not exist
>
> For safety, those hooks take a const Port parameter, so that
> they can be used only for reporting purposes (for example, we are working on
> an extension to record detailed login metrics counters).
>
> Another option could be to add those metrics in the engine itself (instead of
> providing new hooks to get them), but the new hooks option gives more
> flexibility on how to render and exploit them (there is a lot of information
> in the Port struct that one could be interested in).
>
> I’m adding this patch proposal to the commitfest.
> Looking forward to your feedback,

+1 for the idea. I've seen numerous cases where the login metrics
(especially failed connections) are handy in analyzing stuff. And I'm
more okay with the hook approach than with postgres emitting the
necessary metrics itself. However, I'm personally not okay with having
multiple hooks as proposed in the v1 patch. Can we think of having a
single hook or enhancing the existing ClientAuthentication_hook where we
pass a PURPOSE parameter (CONN_SUCCESS, CONN_FAILURE, CONN_FOO, CONN_BAR)
to the hook? With this approach, we don't need to scatter many hooks
across the postgres code, and the hook implementers will look at
the PURPOSE parameter and deal with it accordingly.

On the security aspect, we must ensure we don't leak any sensitive
information such as password or SSH key to the new hook - if PGPORT
has this information, maybe we need to mask that structure a bit
before handing it off to the hook.
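
One way to read the "mask that structure" idea is to copy only the fields needed for reporting into a smaller struct and pass that to the hook instead of the full Port. The sketch below assumes this reading. The Port here is a reduced stand-in, and ConnReportInfo and make_conn_report_info() are hypothetical names, not part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Port struct, reduced for the sketch. */
typedef struct Port
{
    const char *user_name;
    const char *database_name;
    const char *remote_host;
    void       *hba;    /* stands for fields we do not want to expose */
} Port;

/* Hypothetical reduced view handed to the hook instead of Port. */
typedef struct ConnReportInfo
{
    const char *user_name;
    const char *database_name;
    const char *remote_host;
} ConnReportInfo;

/* Copy only the fields considered safe for reporting. */
static ConnReportInfo
make_conn_report_info(const Port *port)
{
    ConnReportInfo info;

    assert(port != NULL);
    info.user_name = port->user_name;
    info.database_name = port->database_name;
    info.remote_host = port->remote_host;
    return info;
}
```

Which fields count as safe is exactly the open question in this thread; the three chosen here are only placeholders for that decision.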

Regards,
Bharath Rupireddy.




Patch proposal: New hooks in the connection path

2022-06-30 Thread Drouvot, Bertrand

Hi hackers,

While commit 960869da08 added some information about connections that
have been successfully authenticated, there are no metrics for
connections that have not (or that did not reach the authentication stage).


Adding metrics about failed connection attempts could also help, for
example with proper sampling, to:


 * detect spikes in failed login attempts
 * check if there is a correlation between spikes in successful and
   failed connection attempts

While the number of successful connections can already be
tracked with the ClientAuthentication_hook (as can the ones that
failed authentication), we are missing metrics about:


 * why the connection failed (could be bad password, bad database, bad
   user, missing CONNECT privilege...)
 * number of times the authentication stage has not been reached
 * why the authentication stage has not been reached (bad startup
   packets, timeout while processing startup packet,...)

Those missing metrics (in addition to the ones that can already be
gathered) could provide value for:


 * security investigations
 * anomaly detection
 * tracking application misconfigurations

In an attempt to provide those metrics, please find attached
a patch proposal to add new hooks in the connection path, which would be
fired if:


 * there is a bad startup packet
 * there is a timeout while processing the startup packet
 * user does not have CONNECT privilege
 * database does not exist

For safety, those hooks take a const Port parameter, so
that they can be used only for reporting purposes (for example, we are
working on an extension to record detailed login metrics counters).


Another option could be to add those metrics in the engine itself
(instead of providing new hooks to get them), but the new hooks option
gives more flexibility on how to render and exploit them (there is a lot
of information in the Port struct that one could be interested in).


I’m adding this patch proposal to the commitfest.
Looking forward to your feedback,

Regards,
Bertrand
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index dde4bc25b1..8e00327a5d 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -567,6 +567,9 @@ int postmaster_alive_fds[2] = {-1, -1};
 HANDLE PostmasterHandle;
 #endif
 
+StartupPacketTimeout_hook_type StartupPacketTimeout_hook = NULL;
+BadConnPacket_hook_type BadConnPacket_hook = NULL;
+
 /*
  * Postmaster main entry point
  */
@@ -4462,8 +4465,11 @@ BackendInitialize(Port *port)
      * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
      * already did any appropriate error reporting.
      */
-    if (status != STATUS_OK)
+    if (status != STATUS_OK) {
+        if (BadConnPacket_hook)
+            (*BadConnPacket_hook) (port);
         proc_exit(0);
+    }
 
     /*
      * Now that we have the user and database name, we can set the process
@@ -5323,6 +5329,11 @@ dummy_handler(SIGNAL_ARGS)
 static void
 StartupPacketTimeoutHandler(void)
 {
+    if (StartupPacketTimeout_hook)
+        (*StartupPacketTimeout_hook) (MyProcPort);
+    ereport(COMMERROR,
+            (errcode(ERRCODE_PROTOCOL_VIOLATION),
+             errmsg("timeout while processing startup packet")));
     _exit(1);
 }
 
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 6b9082604f..562ed331bf 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -67,6 +67,14 @@
 #include "utils/syscache.h"
 #include "utils/timeout.h"
 
+/*
+ * Hooks to be used when a connection has been refused in case of bad
+ * database name, bad database oid or bad permissions.
+ */
+BadDb_hook_type baddbname_hook = NULL;
+BadDb_hook_type baddboid_hook = NULL;
+BadDb_hook_type baddbperm_hook = NULL;
+
 static HeapTuple GetDatabaseTuple(const char *dbname);
 static HeapTuple GetDatabaseTupleByOid(Oid dboid);
 static void PerformAuthentication(Port *port);
@@ -360,10 +368,14 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
         if (!am_superuser &&
             pg_database_aclcheck(MyDatabaseId, GetUserId(),
                                  ACL_CONNECT) != ACLCHECK_OK)
+        {
+            if (baddbperm_hook)
+                (*baddbperm_hook) (MyProcPort);
             ereport(FATAL,
                     (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                      errmsg("permission denied for database \"%s\"", name),
                      errdetail("User does not have CONNECT privilege.")));
+        }
 
         /*
          * Check connection limit for this database.
@@