Re: Printing backtrace of postgres processes

2024-02-26 Thread Michael Paquier
On Mon, Feb 26, 2024 at 04:05:05PM +0100, Christoph Berg wrote:
> I tried that now. Mind that I'm not a benchmarking expert, and there's
> been quite some jitter in the results, but I think there's a clear
> trend.
> 
> Even if we regard the 1873 as an outlier, I've seen many vanilla runs
> with 22xx tps, and not a single v28 run with 22xx tps. Other numbers I
> collected suggested a cost of at least 3% for the feature.

Thanks for the numbers.  Yes, that's annoying and I suspect could be
noticeable for a lot of users..
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-26 Thread Christoph Berg
Re: Michael Paquier
> Something like this can be measured with a bunch of concurrent
> connections attempting connections and a very high rate, like pgbench
> with an empty script and -C, for local connections.

I tried that now. Mind that I'm not a benchmarking expert, and there's
been quite some jitter in the results, but I think there's a clear
trend.

Current head without and with the v28 patchset.
Command line:
pgbench -n -C -c 20 -j 20 -f empty.sql -T 30 --progress=2
empty.sql just contains a ";"
model name: 13th Gen Intel(R) Core(TM) i7-13700H

head:
tps = 2211.289863 (including reconnection times)
tps = 2113.907588 (including reconnection times)
tps = 2200.406877 (including reconnection times)
average: 2175

v28:
tps = 1873.472459 (including reconnection times)
tps = 2068.094383 (including reconnection times)
tps = 2196.890897 (including reconnection times)
average: 2046

2046 / 2175 = 0.941

Even if we regard the 1873 as an outlier, I've seen many vanilla runs
with 22xx tps, and not a single v28 run with 22xx tps. Other numbers I
collected suggested a cost of at least 3% for the feature.

Christoph




Re: Printing backtrace of postgres processes

2024-02-23 Thread Michael Paquier
On Fri, Feb 23, 2024 at 04:39:47PM +0100, Christoph Berg wrote:
> I'd be concerned about the cost of doing that as part of the startup
> of every single backend process. Shouldn't this rather be done within
> the postmaster so it's automatically inherited by forked backends?
> (EXEC_BACKEND systems probably don't have libgcc I guess.)

Something like this can be measured with a bunch of concurrent
connections attempting connections and a very high rate, like pgbench
with an empty script and -C, for local connections.
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-23 Thread Christoph Berg
Re: Michael Paquier
> >>•  backtrace() and backtrace_symbols_fd() don't call malloc()  
> >> explic‐
> >>   itly,  but  they  are part of libgcc, which gets loaded 
> >> dynamically
> >>   when first used.  Dynamic loading usually triggers a call  to  
> >> mal‐
> >>   loc(3).   If  you  need certain calls to these two functions to 
> >> not
> >>   allocate memory (in signal handlers, for example), you need to 
> >> make
> >>   sure libgcc is loaded beforehand.
> >>
> >> and the patch ensures that libgcc is loaded by calling a dummy
> >> backtrace() at the start of the process.
> 
> FWIW, anything I am reading about the matter freaks me out, including
> the dlopen() part in all the backends:
> https://www.gnu.org/software/libc/manual/html_node/Backtraces.html

I'd be concerned about the cost of doing that as part of the startup
of every single backend process. Shouldn't this rather be done within
the postmaster so it's automatically inherited by forked backends?
(EXEC_BACKEND systems probably don't have libgcc I guess.)

Christoph




Re: Printing backtrace of postgres processes

2024-02-13 Thread Ashutosh Bapat
On Mon, Feb 12, 2024 at 6:52 AM Ashutosh Bapat
 wrote:
>
> >
> > > We defer actual action triggered by a signal till CHECK_FOR_INTERRUPTS
> > > is called. I understand that we can't do that here since we want to
> > > capture the backtrace at that moment and can't wait till next CFI. But
> > > printing the backend can surely wait till next CFI right?
> >
> > Delaying the call of backtrace() to happen during a CFI() would be
> > safe, yes, and writing data to stderr would not really be an issue as
> > at least the data would be sent somewhere.  That's less useful, but
> > we do that for memory contexts.
>
> Memory contexts do not change more or less till next CFI, but stack
> traces do. So I am not sure whether it is desirable to wait to capture
> backtrace till next CFI. Given that the user can not time a call to
> pg_log_backend() exactly, so whether it captures the backtrace exactly
> at when interrupt happens or at the next CFI may not matter much in
> practice.
>

Thinking more about this I have following thoughts/questions:

1. Whether getting a backtrace at CFI is good enough?
Backtrace is required to know what a process is doing when it's stuck
or is behaviour unexpected etc. PostgreSQL code has CFIs sprinkled in
almost all the tight loops. Whether those places are enough to cover
most of the cases that the user of this feature would care about?

2. tools like gdb, strace can be used to get the stack trace of any
process, so do we really need this tool?
Most of the OSes provide such tools but may be there are useful in
kubernetes like environment, I am not sure.

3. tools like gdb and strace are able to capture stack trace at any
point during execution. Can we use the same mechanism instead of
relying on CFI?

4. tools like gdb and strace can capture more than just stack trace
e.g. variable values, values of registers etc. Are we planning to add
those facilities as well? OR whether this feature will be useful
without those facilities?

May the feature be more useful if it can provide PostgreSQL specific
details which an external tool can not.

-- 
Best Wishes,
Ashutosh Bapat




Re: Printing backtrace of postgres processes

2024-02-11 Thread Ashutosh Bapat
On Sat, Feb 10, 2024 at 5:36 AM Michael Paquier  wrote:
>
> On Fri, Feb 09, 2024 at 02:27:26PM +0530, Ashutosh Bapat wrote:
> > On Fri, Feb 9, 2024 at 2:18 PM Alvaro Herrera  
> > wrote:
> >> Hmm, but the backtrace() manpage says
> >>
> >>•  backtrace() and backtrace_symbols_fd() don't call malloc()  
> >> explic‐
> >>   itly,  but  they  are part of libgcc, which gets loaded 
> >> dynamically
> >>   when first used.  Dynamic loading usually triggers a call  to  
> >> mal‐
> >>   loc(3).   If  you  need certain calls to these two functions to 
> >> not
> >>   allocate memory (in signal handlers, for example), you need to 
> >> make
> >>   sure libgcc is loaded beforehand.
> >>
> >> and the patch ensures that libgcc is loaded by calling a dummy
> >> backtrace() at the start of the process.
>
> FWIW, anything I am reading about the matter freaks me out, including
> the dlopen() part in all the backends:
> https://www.gnu.org/software/libc/manual/html_node/Backtraces.html
>
> So I really question whether it is a good idea to assume if this will
> always be safe depending on the version of libgcc dealt with,
> increasing the impact area.  Perhaps that's worrying too much, but it
> looks like one of these things where we'd better be really careful.

I agree. We don't want a call to backtrace printing mechanism to make
things worse.

>
> > We defer actual action triggered by a signal till CHECK_FOR_INTERRUPTS
> > is called. I understand that we can't do that here since we want to
> > capture the backtrace at that moment and can't wait till next CFI. But
> > printing the backend can surely wait till next CFI right?
>
> Delaying the call of backtrace() to happen during a CFI() would be
> safe, yes, and writing data to stderr would not really be an issue as
> at least the data would be sent somewhere.  That's less useful, but
> we do that for memory contexts.

Memory contexts do not change more or less till next CFI, but stack
traces do. So I am not sure whether it is desirable to wait to capture
backtrace till next CFI. Given that the user can not time a call to
pg_log_backend() exactly, so whether it captures the backtrace exactly
at when interrupt happens or at the next CFI may not matter much in
practice.

-- 
Best Wishes,
Ashutosh Bapat




Re: Printing backtrace of postgres processes

2024-02-09 Thread Michael Paquier
On Fri, Feb 09, 2024 at 02:27:26PM +0530, Ashutosh Bapat wrote:
> On Fri, Feb 9, 2024 at 2:18 PM Alvaro Herrera  wrote:
>> Hmm, but the backtrace() manpage says
>>
>>•  backtrace() and backtrace_symbols_fd() don't call malloc()  explic‐
>>   itly,  but  they  are part of libgcc, which gets loaded dynamically
>>   when first used.  Dynamic loading usually triggers a call  to  mal‐
>>   loc(3).   If  you  need certain calls to these two functions to not
>>   allocate memory (in signal handlers, for example), you need to make
>>   sure libgcc is loaded beforehand.
>>
>> and the patch ensures that libgcc is loaded by calling a dummy
>> backtrace() at the start of the process.

FWIW, anything I am reading about the matter freaks me out, including
the dlopen() part in all the backends:
https://www.gnu.org/software/libc/manual/html_node/Backtraces.html

So I really question whether it is a good idea to assume if this will
always be safe depending on the version of libgcc dealt with,
increasing the impact area.  Perhaps that's worrying too much, but it
looks like one of these things where we'd better be really careful.

> We defer actual action triggered by a signal till CHECK_FOR_INTERRUPTS
> is called. I understand that we can't do that here since we want to
> capture the backtrace at that moment and can't wait till next CFI. But
> printing the backend can surely wait till next CFI right?

Delaying the call of backtrace() to happen during a CFI() would be
safe, yes, and writing data to stderr would not really be an issue as
at least the data would be sent somewhere.  That's less useful, but
we do that for memory contexts.
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-09 Thread Ashutosh Bapat
On Fri, Feb 9, 2024 at 2:18 PM Alvaro Herrera  wrote:
>
> On 2024-Feb-09, Michael Paquier wrote:
>
> > Anyway, I've been digging around the signal-safety of backtrace(3)
> > (even looking a bit at some GCC code, brrr), and I am under the
> > impression that backtrace() is just by nature not safe and also
> > dangerous in signal handlers.  One example of issue I've found:
> > https://github.com/gperftools/gperftools/issues/838
> >
> > This looks like enough ground to me to reject the patch.
>
> Hmm, but the backtrace() manpage says
>
>•  backtrace() and backtrace_symbols_fd() don't call malloc()  explic‐
>   itly,  but  they  are part of libgcc, which gets loaded dynamically
>   when first used.  Dynamic loading usually triggers a call  to  mal‐
>   loc(3).   If  you  need certain calls to these two functions to not
>   allocate memory (in signal handlers, for example), you need to make
>   sure libgcc is loaded beforehand.
>
> and the patch ensures that libgcc is loaded by calling a dummy
> backtrace() at the start of the process.
>

We defer actual action triggered by a signal till CHECK_FOR_INTERRUPTS
is called. I understand that we can't do that here since we want to
capture the backtrace at that moment and can't wait till next CFI. But
printing the backend can surely wait till next CFI right?


-- 
Best Wishes,
Ashutosh Bapat




Re: Printing backtrace of postgres processes

2024-02-09 Thread Alvaro Herrera
On 2024-Feb-09, Michael Paquier wrote:

> Anyway, I've been digging around the signal-safety of backtrace(3)
> (even looking a bit at some GCC code, brrr), and I am under the
> impression that backtrace() is just by nature not safe and also
> dangerous in signal handlers.  One example of issue I've found:
> https://github.com/gperftools/gperftools/issues/838
> 
> This looks like enough ground to me to reject the patch.

Hmm, but the backtrace() manpage says

   •  backtrace() and backtrace_symbols_fd() don't call malloc()  explic‐
  itly,  but  they  are part of libgcc, which gets loaded dynamically
  when first used.  Dynamic loading usually triggers a call  to  mal‐
  loc(3).   If  you  need certain calls to these two functions to not
  allocate memory (in signal handlers, for example), you need to make
  sure libgcc is loaded beforehand.

and the patch ensures that libgcc is loaded by calling a dummy
backtrace() at the start of the process.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
 (George Orwell's The Lord of the Rings)




Re: Printing backtrace of postgres processes

2024-02-09 Thread Michael Paquier
On Thu, Feb 08, 2024 at 12:25:18PM +0900, Michael Paquier wrote:
> In HandleLogBacktraceInterrupt(), we don't use backtrace_symbols() and
> rely on backtrace_symbols_fd() to avoid doing malloc() in the signal
> handler as mentioned in [1] back in 2022.  Perhaps the part about the
> fact that we don't use backtrace_symbols() should be mentioned
> explicitely in a comment rather than silently implied?  That's
> a very important point.

This has been itching me, so I have spent more time reading about
that, and while browsing signal(7) and signal-safety(7), I've first
noticed that this is not safe in the patch: 
+   write_stderr("logging current backtrace of process with PID %d:\n",
+MyProcPid);

Note that there's a write_stderr_signal_safe().

Anyway, I've been digging around the signal-safety of backtrace(3)
(even looking a bit at some GCC code, brrr), and I am under the
impression that backtrace() is just by nature not safe and also
dangerous in signal handlers.  One example of issue I've found:
https://github.com/gperftools/gperftools/issues/838

This looks like enough ground to me to reject the patch.
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-07 Thread Michael Paquier
On Thu, Feb 08, 2024 at 12:30:00AM +0530, Bharath Rupireddy wrote:
> I've missed adding LoadBacktraceFunctions() in InitAuxiliaryProcess
> for 0002 patch. Please find the attached v29 patch set. Sorry for the
> noise.

I've been torturing the patch with \watch and loops calling the
function while doing a sequential scan of pg_stat_activity, and that
was stable while doing a pgbench and an installcheck-world in
parallel, with some infinite loops and some spinlocks I should not
have taken.

+   if (backtrace_functions_loaded)
+   return;

I was really wondering about this point, particularly regarding the
fact that this would load libgcc for all the backends when they start,
unconditionally.  One thing could be to hide that behind a postmaster
GUC disabled by default, but then we'd come back to using gdb on a
live server, which is no fun on a customer environment because of the
extra dependencies, which may not, or just cannot, be installed.  So
yeah, I think that I'm OK about that.

+ * procsignal.h
+ *   Backtrace-related routines

This one is incorrect.

In HandleLogBacktraceInterrupt(), we don't use backtrace_symbols() and
rely on backtrace_symbols_fd() to avoid doing malloc() in the signal
handler as mentioned in [1] back in 2022.  Perhaps the part about the
fact that we don't use backtrace_symbols() should be mentioned
explicitely in a comment rather than silently implied?  That's
a very important point.

Echoing with upthread, and we've been more lax with superuser checks
and assignment of custom roles in recent years, I agree with the
approach of the patch to make that superuser by default.  Then users
can force their own policy as they want with an EXECUTE ACL on the SQL
function.

As a whole, I'm pretty excited about being able to rely on that
without the need to use gdb to get a live trace.  Does anybody have
objections and/or comments, particularly about the superuser and the
load part at backend startup?  This thread has been going on for so
long that it would be good to let 1 week for folks to react before
doing anything.  See v29 for references.

[1]: 
https://www.postgresql.org/message-id/CALj2ACUNZVB0cQovvKBd53-upsMur8j-5_K=-fg86uaa+wy...@mail.gmail.com
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-07 Thread Bharath Rupireddy
On Wed, Feb 7, 2024 at 9:00 PM Bharath Rupireddy
 wrote:
>
> On Wed, Feb 7, 2024 at 2:57 PM Michael Paquier  wrote:
> >
> > On Wed, Feb 07, 2024 at 02:04:39PM +0530, Bharath Rupireddy wrote:
> > > Well, that 'ubuntu' is the default username there, I've changed it now
> > > and kept the output short.
> >
> > I would keep it just at two or three lines, with a "For example, with
> > lines like":
>
> Done.
>
> > > I've simplified the tests, now we don't need two separate output files
> > > for tests. Please see the attached v27 patch.
> >
> > +  proname => 'pg_log_backtrace', provolatile => 'v', prorettype => 'bool',
> >
> > Hmm.  Would it be better to be in line with memory contexts logging
> > and use pg_log_backend_backtrace()?
>
> +1.
>
> > One thing I was wondering is that
> > there may be a good point in splitting the backtrace support into two
> > functions (backends and auxiliary processes) that could be split with
> > two execution ACLs across different roles.
>
> -1 for that unless we have any requests. I mean, backtrace is a common
> thing for all postgres processes, why different controls are needed?
> I'd go with what pg_log_backend_memory_contexts does - it supports
> both backends and auxiliary processes.
>
> > +   PROCSIG_LOG_BACKTRACE,  /* ask backend to log the current backtrace 
> > */
> >
> > Incorrect order.
>
> PROCSIG_XXX aren't alphabetically ordered, no?
>
> > +-- Backtrace is not logged for auxiliary processes
> >
> > Not sure there's a point in keeping that in the tests for now.
> >
> > +* XXX: It might be worth implementing it for auxiliary processes.
> >
> > Same, I would remove that.
>
> Done.
>
> > +static volatile sig_atomic_t backtrace_functions_loaded = false;
> >
> > Hmm, so you need that because of the fact that it would be called in a
> > signal as backtrace(3) says:
> > "If you need certain calls to these two functions to not allocate
> > memory (in signal handlers, for example), you need to make sure libgcc
> > is loaded beforehand".
> >
> > True that it is not interesting to only log something when having a
> > CFI, this needs to be dynamic to get a precise state of things.
>
> Right.
>
> I've also fixed some test failures. Please see the attached v28 patch
> set. 0002 extends pg_log_backend_backtrace to auxiliary processes,
> just like pg_log_backend_memory_contexts (not focused on PID
> de-duplication code yet).

I've missed adding LoadBacktraceFunctions() in InitAuxiliaryProcess
for 0002 patch. Please find the attached v29 patch set. Sorry for the
noise.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v29-0001-Add-function-to-log-backtrace-of-a-backend.patch
Description: Binary data


v29-0002-Extend-backtrace-logging-function-for-auxiliary-.patch
Description: Binary data


Re: Printing backtrace of postgres processes

2024-02-07 Thread Bharath Rupireddy
On Wed, Feb 7, 2024 at 2:57 PM Michael Paquier  wrote:
>
> On Wed, Feb 07, 2024 at 02:04:39PM +0530, Bharath Rupireddy wrote:
> > Well, that 'ubuntu' is the default username there, I've changed it now
> > and kept the output short.
>
> I would keep it just at two or three lines, with a "For example, with
> lines like":

Done.

> > I've simplified the tests, now we don't need two separate output files
> > for tests. Please see the attached v27 patch.
>
> +  proname => 'pg_log_backtrace', provolatile => 'v', prorettype => 'bool',
>
> Hmm.  Would it be better to be in line with memory contexts logging
> and use pg_log_backend_backtrace()?

+1.

> One thing I was wondering is that
> there may be a good point in splitting the backtrace support into two
> functions (backends and auxiliary processes) that could be split with
> two execution ACLs across different roles.

-1 for that unless we have any requests. I mean, backtrace is a common
thing for all postgres processes, why different controls are needed?
I'd go with what pg_log_backend_memory_contexts does - it supports
both backends and auxiliary processes.

> +   PROCSIG_LOG_BACKTRACE,  /* ask backend to log the current backtrace */
>
> Incorrect order.

PROCSIG_XXX aren't alphabetically ordered, no?

> +-- Backtrace is not logged for auxiliary processes
>
> Not sure there's a point in keeping that in the tests for now.
>
> +* XXX: It might be worth implementing it for auxiliary processes.
>
> Same, I would remove that.

Done.

> +static volatile sig_atomic_t backtrace_functions_loaded = false;
>
> Hmm, so you need that because of the fact that it would be called in a
> signal as backtrace(3) says:
> "If you need certain calls to these two functions to not allocate
> memory (in signal handlers, for example), you need to make sure libgcc
> is loaded beforehand".
>
> True that it is not interesting to only log something when having a
> CFI, this needs to be dynamic to get a precise state of things.

Right.

I've also fixed some test failures. Please see the attached v28 patch
set. 0002 extends pg_log_backend_backtrace to auxiliary processes,
just like pg_log_backend_memory_contexts (not focused on PID
de-duplication code yet).

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v28-0002-Extend-backtrace-logging-function-to-auxiliary-p.patch
Description: Binary data


v28-0001-Add-function-to-log-backtrace-of-a-backend.patch
Description: Binary data


Re: Printing backtrace of postgres processes

2024-02-07 Thread Michael Paquier
On Wed, Feb 07, 2024 at 02:04:39PM +0530, Bharath Rupireddy wrote:
> Well, that 'ubuntu' is the default username there, I've changed it now
> and kept the output short.

I would keep it just at two or three lines, with a "For example, with
lines like":

> I've simplified the tests, now we don't need two separate output files
> for tests. Please see the attached v27 patch.

+  proname => 'pg_log_backtrace', provolatile => 'v', prorettype => 'bool', 

Hmm.  Would it be better to be in line with memory contexts logging
and use pg_log_backend_backtrace()?  One thing I was wondering is that
there may be a good point in splitting the backtrace support into two
functions (backends and auxiliary processes) that could be split with
two execution ACLs across different roles. 

+   PROCSIG_LOG_BACKTRACE,  /* ask backend to log the current backtrace */

Incorrect order.

+-- Backtrace is not logged for auxiliary processes 

Not sure there's a point in keeping that in the tests for now.

+* XXX: It might be worth implementing it for auxiliary processes.

Same, I would remove that.

+static volatile sig_atomic_t backtrace_functions_loaded = false;

Hmm, so you need that because of the fact that it would be called in a
signal as backtrace(3) says:
"If you need certain calls to these two functions to not allocate
memory (in signal handlers, for example), you need to make sure libgcc
is loaded beforehand".

True that it is not interesting to only log something when having a
CFI, this needs to be dynamic to get a precise state of things.
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-02-07 Thread Bharath Rupireddy
On Tue, Feb 6, 2024 at 4:21 PM Michael Paquier  wrote:
>
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> +bool
> +SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
> +{
>
> Looking at 0001.  This API is a thin wrapper of SendProcSignal(), that
> just checks that we have actually a process before using it.
> Shouldn't it be in procsignal.c.
>
> Now looking at 0002, this new routine is used in one place.  Seeing
> that we have something similar in pgstatfuncs.c, wouldn't it be
> better, instead of englobing SendProcSignal(), to have one routine
> that's able to return a PID for a PostgreSQL process?

I liked the idea of going ahead with logging backtraces for only
backends for now, so a separate wrapper like this isn't needed.

> All the backtrace-related handling is stored in procsignal.c, could it
> be cleaner to move the internals into a separate, new file, like
> procbacktrace.c or something similar?

+1. Moved all the code to a new file.

> LoadBacktraceFunctions() is one more thing we need to set up in all
> auxiliary processes.  That's a bit sad to require that in all these
> places, and we may forget to add it.  Could we put more efforts in
> centralizing these initializations?

If we were to do it for only backends (including bg workers)
InitProcess() is the better place. If we were to do it for both
backends and auxiliary processes, BaseInit() is best.

> The auxiliary processes are one
> portion of the problem, and getting stack traces for backend processes
> would be useful on its own.  Another suggestion that I'd propose to
> simplify the patch would be to focus solely on backends for now, and
> do something for auxiliary process later on.  If you do that, the
> strange layer with BackendOrAuxproc() is not a problem anymore, as it
> would be left out for now.

+1 to keep it simple for now; that is, log backtraces of only backends
leaving auxiliary processes aside.

> +
> +logging current backtrace of process with PID 3499242:
> +postgres: ubuntu postgres [local] INSERT(+0x61a355)[0x5559b94de355]
> +postgres: ubuntu postgres [local] 
> INSERT(procsignal_sigusr1_handler+0x9e)[0x5559b94de4ef]
>
> This is IMO too much details for the documentation, and could be
> confusing depending on how the code evolves in the future.  I'd
> suggest to keep it minimal, cutting that to a few lines.  I don't see
> a need to mention ubuntu, either.

Well, that 'ubuntu' is the default username there, I've changed it now
and kept the output short.

I've simplified the tests, now we don't need two separate output files
for tests. Please see the attached v27 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From e44351d85f6415c0649c18e00dc92368be0bc1d6 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 7 Feb 2024 07:46:05 +
Subject: [PATCH v27] Add function to log backtrace of a backend

---
 doc/src/sgml/func.sgml   |  53 +++
 src/backend/catalog/system_functions.sql |   2 +
 src/backend/storage/ipc/Makefile |   1 +
 src/backend/storage/ipc/meson.build  |   1 +
 src/backend/storage/ipc/procbacktrace.c  | 143 +++
 src/backend/storage/ipc/procsignal.c |   4 +
 src/backend/storage/lmgr/proc.c  |   3 +
 src/include/catalog/pg_proc.dat  |   4 +
 src/include/storage/procbacktrace.h  |  20 +++
 src/include/storage/procsignal.h |   1 +
 src/test/regress/expected/misc_functions.out |  51 +++
 src/test/regress/sql/misc_functions.sql  |  41 ++
 12 files changed, 324 insertions(+)
 create mode 100644 src/backend/storage/ipc/procbacktrace.c
 create mode 100644 src/include/storage/procbacktrace.h

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6788ba8ef4..4d5a31b113 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27186,6 +27186,24 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of a backend with the specified process
+ID. The backtraces will be logged to stderr.
+Typically, a backtrace identifies which function a process is currently
+executing, and is useful for developers to diagnose stuck processes.
+This function is supported only if PostgreSQL was built with the ability
+to capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ -27300,6 +27318,41 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select 

Re: Printing backtrace of postgres processes

2024-02-06 Thread Michael Paquier
On Fri, Jan 26, 2024 at 11:58:00PM +0530, Bharath Rupireddy wrote:
> You probably were looking at v21, the above change isn't there in
> versions after that. Can you please review the latest version v26
> attached here?
> 
> We might want this patch extended to the newly added walsummarizer
> process which I'll do so in the next version.

--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{

Looking at 0001.  This API is a thin wrapper of SendProcSignal(), that
just checks that we have actually a process before using it.
Shouldn't it be in procsignal.c.

Now looking at 0002, this new routine is used in one place.  Seeing
that we have something similar in pgstatfuncs.c, wouldn't it be
better, instead of englobing SendProcSignal(), to have one routine
that's able to return a PID for a PostgreSQL process?

All the backtrace-related handling is stored in procsignal.c, could it
be cleaner to move the internals into a separate, new file, like
procbacktrace.c or something similar?

LoadBacktraceFunctions() is one more thing we need to set up in all
auxiliary processes.  That's a bit sad to require that in all these
places, and we may forget to add it.  Could we put more efforts in
centralizing these initializations?  The auxiliary processes are one
portion of the problem, and getting stack traces for backend processes
would be useful on its own.  Another suggestion that I'd propose to
simplify the patch would be to focus solely on backends for now, and
do something for auxiliary process later on.  If you do that, the
strange layer with BackendOrAuxproc() is not a problem anymore, as it
would be left out for now.

+
+logging current backtrace of process with PID 3499242:
+postgres: ubuntu postgres [local] INSERT(+0x61a355)[0x5559b94de355]
+postgres: ubuntu postgres [local] 
INSERT(procsignal_sigusr1_handler+0x9e)[0x5559b94de4ef]

This is IMO too much details for the documentation, and could be
confusing depending on how the code evolves in the future.  I'd
suggest to keep it minimal, cutting that to a few lines.  I don't see
a need to mention ubuntu, either.
--
Michael


signature.asc
Description: PGP signature


Re: Printing backtrace of postgres processes

2024-01-26 Thread Bharath Rupireddy
On Fri, Jan 26, 2024 at 4:11 PM Alvaro Herrera  wrote:
>

Thanks for reviewing.

> > +You can get the file name and line number from the logged details by 
> > using
> > +gdb/addr2line in linux platforms (users must ensure gdb/addr2line is
> > +already installed).
>
> This doesn't read great.  I mean, what is gdb/addr2line?  I think you
> mean either GDB or addr2line; this could be clearer.

Wrapped them in  tag and reworded the comment a bit.

> >   * internal backtrace support functions in the backtrace.  This requires 
> > that
> > - * this and related functions are not inlined.
> > + * this and related functions are not inlined. If the edata pointer is 
> > valid,
> > + * backtrace information will be set in edata.
> >   */
> > -static void
> > +void
> >  set_backtrace(ErrorData *edata, int num_skip)
> >  {
> >   StringInfoData errtrace;
>
> This seems like a terrible API choice, and the comment change is no
> good.  I suggest you need to create a function that deals only with a
> StringInfo, maybe
>   append_backtrace(StringInfo buf, int num_skip)
> which is used by set_backtrace to print the backtrace in
> edata->backtrace, and a new function log_backtrace() that does the
> ereport(LOG_SERVER_ONLY) thing.

You probably were looking at v21, the above change isn't there in
versions after that. Can you please review the latest version v26
attached here?

We might want this patch extended to the newly added walsummarizer
process which I'll do so in the next version.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From c00bc969b7e55dff92cdd5b7fef355dc2bc98f8f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 26 Jan 2024 15:51:02 +
Subject: [PATCH v26] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee2d7f8585..f6465529e7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3097,6 +3097,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 4708d73f5f..f7b4f8dac1 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = 

Re: Printing backtrace of postgres processes

2024-01-26 Thread Alvaro Herrera
On 2022-Jan-27, vignesh C wrote:

> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index 0ee6974f1c..855ccc8902 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml

> +You can get the file name and line number from the logged details by 
> using
> +gdb/addr2line in linux platforms (users must ensure gdb/addr2line is
> +already installed).

This doesn't read great.  I mean, what is gdb/addr2line?  I think you
mean either GDB or addr2line; this could be clearer.

> diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
> index 7402696986..522a525741 100644
> --- a/src/backend/utils/error/elog.c
> +++ b/src/backend/utils/error/elog.c
> @@ -944,9 +943,10 @@ errbacktrace(void)
>   * Compute backtrace data and add it to the supplied ErrorData.  num_skip
>   * specifies how many inner frames to skip.  Use this to avoid showing the
>   * internal backtrace support functions in the backtrace.  This requires that
> - * this and related functions are not inlined.
> + * this and related functions are not inlined. If the edata pointer is valid,
> + * backtrace information will be set in edata.
>   */
> -static void
> +void
>  set_backtrace(ErrorData *edata, int num_skip)
>  {
>   StringInfoData errtrace;

This seems like a terrible API choice, and the comment change is no
good.  I suggest you need to create a function that deals only with a
StringInfo, maybe
  append_backtrace(StringInfo buf, int num_skip)
which is used by set_backtrace to print the backtrace in
edata->backtrace, and a new function log_backtrace() that does the
ereport(LOG_SERVER_ONLY) thing.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: Printing backtrace of postgres processes

2024-01-25 Thread vignesh C
On Sun, 5 Nov 2023 at 01:49, Bharath Rupireddy
 wrote:
>
> On Thu, Jul 20, 2023 at 8:22 PM Daniel Gustafsson  wrote:
> >
> > > On 11 Jan 2023, at 15:44, Bharath Rupireddy 
> > >  wrote:
> > >
> > > On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy
> > >  wrote:
> > >>
> > >> I'm attaching the v22 patch set for further review.
> > >
> > > Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2.
> > > Attaching v23 patch set for further review.
> >
> > This thread has stalled for well over 6 months with the patch going from CF 
> > to
> > CF.  From skimming the thread it seems that a lot of the details have been
> > ironed out with most (all?) objections addressed.  Is any committer 
> > interested
> > in picking this up?  If not we should probably mark it returned with 
> > feedback.
>
> Rebase needed due to function oid clash. Picked the new OID with the
> help of src/include/catalog/unused_oids. PSA v24 patch set.

Rebase needed due to changes in parallel_schedule. PSA v25 patch set.

Regards,
Vignesh
From 32fa1d898b2599b836a2265f746acf230ffb358e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 4 Nov 2023 18:06:36 +
Subject: [PATCH v25 1/2] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee2d7f8585..f6465529e7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3097,6 +3097,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 4708d73f5f..f7b4f8dac1 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = InvalidBackendId;
+	bool		result;
 
-	proc = BackendPidGetProc(pid);
-
-	/*
-	 * See if the process with given pid is a backend or an auxiliary process.
-	 *
-	 * If the given process is a backend, use its backend id in
-	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
-	 * that because auxiliary processes (except the startup process) don't
-	 * have a valid backend id.
-	 */
-	if (proc != NULL)
-		backendId = proc->backendId;
-	else
-		proc = AuxiliaryPidGetProc(pid);
-
-	/*
-	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
-	 * isn't valid; but by the time 

Re: Printing backtrace of postgres processes

2024-01-14 Thread Maciek Sakrejda
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:tested, passed

I'm not sure if this actually still needs review, but it's marked as such in 
the CF app, so I'm reviewing it in the hopes of moving it along.

The feature works as documented. The docs say "This function is supported only 
if PostgreSQL was built with the ability to capture backtraces, otherwise it 
will emit a warning." I'm not sure what building with the ability to capture 
backtraces is, but it worked with no special config on my machine. I don't have 
much C experience, so I don't know if this is something that should have more 
context in a README somewhere, or if it's likely someone who's interested in 
this will already know what to do. The code looks fine to me.

I tried running make installcheck-world, but it failed on 17 tests. However, 
master also fails here on 17 tests. A normal make check-world passes on both 
branches. I assume I'm doing something wrong and would appreciate any pointers 
[0].

Based on my review and Daniel's comment above, I'm marking this as Ready for 
Committer.

Thanks,
Maciek

[0] My regression.diffs has errors like

```
diff -U3 
/home/maciek/code/aux/postgres/src/test/regress/expected/copyselect.out 
/home/maciek/code/aux/postgres/src/test/regress/results/copyselect.out
--- /home/maciek/code/aux/postgres/src/test/regress/expected/copyselect.out 
2023-01-02 12:21:10.792646101 -0800
+++ /home/maciek/code/aux/postgres/src/test/regress/results/copyselect.out  
2024-01-14 15:04:07.513887866 -0800
@@ -131,11 +131,6 @@
 2
  ?column? 
 --
-3
-(1 row)
-
- ?column? 
---
 4
 (1 row)
```

and

```
diff -U3 
/home/maciek/code/aux/postgres/src/test/regress/expected/create_table.out 
/home/maciek/code/aux/postgres/src/test/regress/results/create_table.out
--- /home/maciek/code/aux/postgres/src/test/regress/expected/create_table.out   
2023-10-02 22:14:02.583377845 -0700
+++ /home/maciek/code/aux/postgres/src/test/regress/results/create_table.out
2024-01-14 15:04:09.037890710 -0800
@@ -854,8 +854,6 @@
  b  | integer |   | not null | 1   | plain|  | 
 Partition of: parted FOR VALUES IN ('b')
 Partition constraint: ((a IS NOT NULL) AND (a = 'b'::text))
-Not-null constraints:
-"part_b_b_not_null" NOT NULL "b" (local, inherited)
 
 -- Both partition bound and partition key in describe output
 \d+ part_c
``` 

I'm on Ubuntu 22.04 with Postgres 11, 12, 13, and 16 installed from PGDG.

The new status of this patch is: Ready for Committer


Re: Printing backtrace of postgres processes

2023-11-04 Thread Bharath Rupireddy
On Thu, Jul 20, 2023 at 8:22 PM Daniel Gustafsson  wrote:
>
> > On 11 Jan 2023, at 15:44, Bharath Rupireddy 
> >  wrote:
> >
> > On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy
> >  wrote:
> >>
> >> I'm attaching the v22 patch set for further review.
> >
> > Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2.
> > Attaching v23 patch set for further review.
>
> This thread has stalled for well over 6 months with the patch going from CF to
> CF.  From skimming the thread it seems that a lot of the details have been
> ironed out with most (all?) objections addressed.  Is any committer interested
> in picking this up?  If not we should probably mark it returned with feedback.

Rebase needed due to function oid clash. Picked the new OID with the
help of src/include/catalog/unused_oids. PSA v24 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 9673b537cc1797f86aac69613eb50b6fc99256f8 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 4 Nov 2023 18:06:36 +
Subject: [PATCH v24] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4ca2789d10..9fe1158a0c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3092,6 +3092,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 92ca5b2f72..7b17afc2ff 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = InvalidBackendId;
+	bool		result;
 
-	proc = BackendPidGetProc(pid);
-
-	/*
-	 * See if the process with given pid is a backend or an auxiliary process.
-	 *
-	 * If the given process is a backend, use its backend id in
-	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
-	 * that because auxiliary processes (except the startup process) don't
-	 * have a valid backend id.
-	 */
-	if (proc != NULL)
-		backendId = proc->backendId;
-	else
-		proc = AuxiliaryPidGetProc(pid);
-
-	/*
-	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
-	 * isn't valid; but by the time we reach kill(), a process for which we
-	 * get a valid proc here might have 

Re: Printing backtrace of postgres processes

2023-07-20 Thread Daniel Gustafsson
> On 11 Jan 2023, at 15:44, Bharath Rupireddy 
>  wrote:
> 
> On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy
>  wrote:
>> 
>> I'm attaching the v22 patch set for further review.
> 
> Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2.
> Attaching v23 patch set for further review.

This thread has stalled for well over 6 months with the patch going from CF to
CF.  From skimming the thread it seems that a lot of the details have been
ironed out with most (all?) objections addressed.  Is any committer interested
in picking this up?  If not we should probably mark it returned with feedback.

--
Daniel Gustafsson





Re: Printing backtrace of postgres processes

2023-01-11 Thread Bharath Rupireddy
On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy
 wrote:
>
> I'm attaching the v22 patch set for further review.

Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2.
Attaching v23 patch set for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From ee5c26f0d4e2e211166250857ea42d30a3666709 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 11 Jan 2023 13:32:13 +
Subject: [PATCH v23] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4340bf9641..5681f0d3b0 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3151,6 +3151,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 92ca5b2f72..7b17afc2ff 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = InvalidBackendId;
+	bool		result;
 
-	proc = BackendPidGetProc(pid);
-
-	/*
-	 * See if the process with given pid is a backend or an auxiliary process.
-	 *
-	 * If the given process is a backend, use its backend id in
-	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
-	 * that because auxiliary processes (except the startup process) don't
-	 * have a valid backend id.
-	 */
-	if (proc != NULL)
-		backendId = proc->backendId;
-	else
-		proc = AuxiliaryPidGetProc(pid);
-
-	/*
-	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
-	 * isn't valid; but by the time we reach kill(), a process for which we
-	 * get a valid proc here might have terminated on its own.  There's no way
-	 * to acquire a lock on an arbitrary process to prevent that. But since
-	 * this mechanism is usually used to debug a backend or an auxiliary
-	 * process running and consuming lots of memory, that it might end on its
-	 * own first and its memory contexts are not logged is not a problem.
-	 */
-	if (proc == NULL)
-	{
-		/*
-		 * This is just a warning so a loop-through-resultset will not abort
-		 * if one backend terminated on its own during the run.
-		 */
-		ereport(WARNING,
-(errmsg("PID %d is not a PostgreSQL server process", pid)));
-		

Re: Printing backtrace of postgres processes

2022-11-29 Thread Bharath Rupireddy
On Fri, Nov 11, 2022 at 6:04 PM Bharath Rupireddy
 wrote:
>
> On Fri, Nov 11, 2022 at 7:59 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Thu, 10 Nov 2022 15:56:35 +0530, Bharath Rupireddy 
> >  wrote in
> > > On Mon, Apr 18, 2022 at 9:10 AM vignesh C  wrote:
> > > >
> > > > The attached v21 patch has the changes for the same.
> > >
> > > Thanks for the patch. Here are some comments:
> > >
> > > 1. I think there's a fundamental problem with this patch, that is, it
> > > prints the backtrace when the interrupt is processed but not when
> > > interrupt is received. This way, the backends/aux processes will
> >
> > Yeah, but the obstacle was backtrace(3) itself. Andres pointed [1]
> > that that may be doable with some care (and I agree to the opinion).
> > AFAIS no discussions followed and things have been to the current
> > shape since then.
> >
> >
> > [1] 
> > https://www.postgresql.org/message-id/20201201032649.aekv5b5dicvmovf4%40alap3.anarazel.de
> > | > Surely this is *utterly* unsafe.  You can't do that sort of stuff in
> > | > a signal handler.
> > |
> > | That's of course true for the current implementation - but I don't think
> > | it's a fundamental constraint. With a bit of care backtrace() and
> > | backtrace_symbols() itself can be signal safe:
> >
> > man 3 backtrace
> > >  *  backtrace()  and  backtrace_symbols_fd() don't call malloc() explic‐
> > > itly, but they are part of libgcc,  which  gets  loaded  dynamically
> > > when  first  used.   Dynamic loading usually triggers a call to mal‐
> > > loc(3).  If you need certain calls to these  two  functions  to  not
> > > allocate  memory (in signal handlers, for example), you need to make
> > > sure libgcc is loaded beforehand.
>
> I missed that part. Thanks for pointing it out. The
> backtrace_symbols() seems to be returning a malloc'ed array [1],
> meaning it can't be used in signal handlers (if used, it can cause
> deadlocks as per [2]) and existing set_backtrace() is using it.
> Therefore, we need to either change set_backtrace() to use
> backtrace_symbols_fd() instead of backtrace_symobls() or introduce
> another function for the purpose of this feature. If done that, then
> we can think of preloading of libgcc which makes backtrace(),
> backtrace_symobols_fd() safe to use in signal handlers.
>
> Looks like we're not loading libgcc explicitly now into any of
> postgres processes, please correct me if I'm wrong here. If we're not
> loading it right now, is it acceptable to load libgcc into every
> postgres process for the sake of this feature?
>
> [1] https://linux.die.net/man/3/backtrace_symbols
> [2] 
> https://stackoverflow.com/questions/40049751/malloc-inside-linux-signal-handler-cause-deadlock

I took a stab at it after discussing with Vignesh off-list. Here's
what I've done:

1. Make backtrace functions signal safe i.e. ensure libgcc is loaded,
just in case if it's not, by calling backtrace() function during the
early life of a process after SIGUSR1 signal handler and other signal
handlers are established.
2. Emit backtrace to stderr within the signal handler itself to keep
things simple so that we don't need to allocate dynamic memory inside
the signal handler.
3. Split the patch into 3 for ease of review, 0001 does preparatory
stuff, 0002 adds the function, 0003 adds the docs and tests.

I'm attaching the v23 patch set for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v22-0001-Move-sending-multiplexed-SIGUSR1-signal-code-to-.patch
Description: Binary data


v22-0002-Add-function-to-log-the-backtrace-of-the-specifi.patch
Description: Binary data


v22-0003-Add-documentation-and-tests-for-pg_log_backtrace.patch
Description: Binary data


Re: Printing backtrace of postgres processes

2022-11-11 Thread Bharath Rupireddy
On Fri, Nov 11, 2022 at 7:59 AM Kyotaro Horiguchi
 wrote:
>
> At Thu, 10 Nov 2022 15:56:35 +0530, Bharath Rupireddy 
>  wrote in
> > On Mon, Apr 18, 2022 at 9:10 AM vignesh C  wrote:
> > >
> > > The attached v21 patch has the changes for the same.
> >
> > Thanks for the patch. Here are some comments:
> >
> > 1. I think there's a fundamental problem with this patch, that is, it
> > prints the backtrace when the interrupt is processed but not when
> > interrupt is received. This way, the backends/aux processes will
>
> Yeah, but the obstacle was backtrace(3) itself. Andres pointed [1]
> that that may be doable with some care (and I agree to the opinion).
> AFAIS no discussions followed and things have been to the current
> shape since then.
>
>
> [1] 
> https://www.postgresql.org/message-id/20201201032649.aekv5b5dicvmovf4%40alap3.anarazel.de
> | > Surely this is *utterly* unsafe.  You can't do that sort of stuff in
> | > a signal handler.
> |
> | That's of course true for the current implementation - but I don't think
> | it's a fundamental constraint. With a bit of care backtrace() and
> | backtrace_symbols() itself can be signal safe:
>
> man 3 backtrace
> >  *  backtrace()  and  backtrace_symbols_fd() don't call malloc() explic‐
> > itly, but they are part of libgcc,  which  gets  loaded  dynamically
> > when  first  used.   Dynamic loading usually triggers a call to mal‐
> > loc(3).  If you need certain calls to these  two  functions  to  not
> > allocate  memory (in signal handlers, for example), you need to make
> > sure libgcc is loaded beforehand.

I missed that part. Thanks for pointing it out. The
backtrace_symbols() seems to be returning a malloc'ed array [1],
meaning it can't be used in signal handlers (if used, it can cause
deadlocks as per [2]) and existing set_backtrace() is using it.
Therefore, we need to either change set_backtrace() to use
backtrace_symbols_fd() instead of backtrace_symobls() or introduce
another function for the purpose of this feature. If done that, then
we can think of preloading of libgcc which makes backtrace(),
backtrace_symobols_fd() safe to use in signal handlers.

Looks like we're not loading libgcc explicitly now into any of
postgres processes, please correct me if I'm wrong here. If we're not
loading it right now, is it acceptable to load libgcc into every
postgres process for the sake of this feature?

[1] https://linux.die.net/man/3/backtrace_symbols
[2] 
https://stackoverflow.com/questions/40049751/malloc-inside-linux-signal-handler-cause-deadlock

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Printing backtrace of postgres processes

2022-11-10 Thread Kyotaro Horiguchi
At Thu, 10 Nov 2022 15:56:35 +0530, Bharath Rupireddy 
 wrote in 
> On Mon, Apr 18, 2022 at 9:10 AM vignesh C  wrote:
> >
> > The attached v21 patch has the changes for the same.
> 
> Thanks for the patch. Here are some comments:
> 
> 1. I think there's a fundamental problem with this patch, that is, it
> prints the backtrace when the interrupt is processed but not when
> interrupt is received. This way, the backends/aux processes will

Yeah, but the obstacle was backtrace(3) itself. Andres pointed [1]
that that may be doable with some care (and I agree to the opinion).
AFAIS no discussions followed and things have been to the current
shape since then.


[1] 
https://www.postgresql.org/message-id/20201201032649.aekv5b5dicvmovf4%40alap3.anarazel.de
| > Surely this is *utterly* unsafe.  You can't do that sort of stuff in
| > a signal handler.
| 
| That's of course true for the current implementation - but I don't think
| it's a fundamental constraint. With a bit of care backtrace() and
| backtrace_symbols() itself can be signal safe:

man 3 backtrace
>  *  backtrace()  and  backtrace_symbols_fd() don't call malloc() explic‐
> itly, but they are part of libgcc,  which  gets  loaded  dynamically
> when  first  used.   Dynamic loading usually triggers a call to mal‐
> loc(3).  If you need certain calls to these  two  functions  to  not
> allocate  memory (in signal handlers, for example), you need to make
> sure libgcc is loaded beforehand.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Printing backtrace of postgres processes

2022-11-10 Thread Bharath Rupireddy
On Mon, Apr 18, 2022 at 9:10 AM vignesh C  wrote:
>
> The attached v21 patch has the changes for the same.

Thanks for the patch. Here are some comments:

1. I think there's a fundamental problem with this patch, that is, it
prints the backtrace when the interrupt is processed but not when
interrupt is received. This way, the backends/aux processes will
always have backtraces in their main loops or some processing loops
wherever CHECK_FOR_INTERRUPTS() is called [1]. What we need is
something more useful. What I think is the better way is to capture
the backtrace, call set_backtrace(), when the interrupt arrives and
then if we don't want to print it in the interrupt handler, perhaps
save it and print it in the next CFI() cycle. See some realistic and
useful backtrace [2] when I emitted the backtrace in the interrupt
handler.

2.
+specified process ID.  This function can send the request to
+backends and auxiliary processes except the logger and statistics
+collector.  The backtraces will be logged at LOG
There's no statistics collector process any more. Please remove it.
Also remove 'the' before 'logger' to be in sync with
pg_log_backend_memory_contexts docs.

3.
+configuration set (See  for
Lowercase 'see' to be in sync with pg_log_backend_memory_contexts docs.

4.
+You can obtain the file name and line number from the logged
details by using
How about 'One can obtain'?

5.
+postgres=# select pg_log_backtrace(pg_backend_pid());
+For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
A more realistic example would be better here, say walwriter or
checkpointer or some other process backtrace will be good instead of a
backend logging its pg_log_backtrace()'s call stack?

6.
Don't we need the backtrace of walreceiver? I think it'll be good to
have in ProcessWalRcvInterrupts().

7.
+errmsg_internal("logging backtrace of PID %d", MyProcPid));
+elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace);
Can we get rid of "current backtrace:%s" and have something similar to
ProcessLogMemoryContextInterrupt() like below?

errmsg("logging backtrace of PID %d", MyProcPid)));
errmsg("%s", errtrace);

[1]
2022-11-10 09:55:44.691 UTC [1346443] LOG:  logging backtrace of PID 1346443
2022-11-10 09:55:44.691 UTC [1346443] LOG:  current backtrace:
postgres: checkpointer (set_backtrace+0x46) [0x5640df9849c6]
postgres: checkpointer (ProcessLogBacktraceInterrupt+0x16)
[0x5640df7fd326]
postgres: checkpointer (CheckpointerMain+0x1a3) [0x5640df77f553]
postgres: checkpointer (AuxiliaryProcessMain+0xc9) [0x5640df77d5e9]
postgres: checkpointer (+0x436a9a) [0x5640df783a9a]
postgres: checkpointer (PostmasterMain+0xd57) [0x5640df7877e7]
postgres: checkpointer (main+0x20f) [0x5640df4a6f1f]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4b9fe9dd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4b9fe9de40]
postgres: checkpointer (_start+0x25) [0x5640df4a7265]
2022-11-10 09:56:05.032 UTC [1346448] LOG:  logging backtrace of PID 1346448
2022-11-10 09:56:05.032 UTC [1346448] LOG:  current backtrace:
postgres: logical replication launcher (set_backtrace+0x46)
[0x5640df9849c6]
postgres: logical replication launcher
(ProcessLogBacktraceInterrupt+0x16) [0x5640df7fd326]
postgres: logical replication launcher
(ApplyLauncherMain+0x11b) [0x5640df7a253b]
postgres: logical replication launcher
(StartBackgroundWorker+0x220) [0x5640df77e270]
postgres: logical replication launcher (+0x4369f4) [0x5640df7839f4]
postgres: logical replication launcher (+0x43771d) [0x5640df78471d]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f4b9feb6520]
/lib/x86_64-linux-gnu/libc.so.6(__select+0xbd) [0x7f4b9ff8f74d]
postgres: logical replication launcher (+0x43894d) [0x5640df78594d]
postgres: logical replication launcher (PostmasterMain+0xcb5)
[0x5640df787745]
postgres: logical replication launcher (main+0x20f) [0x5640df4a6f1f]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4b9fe9dd90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4b9fe9de40]
postgres: logical replication launcher (_start+0x25) [0x5640df4a7265]

[2]
2022-11-10 10:25:20.021 UTC [1351953] LOG:  current backtrace:
postgres: ubuntu postgres [local] INSERT(set_backtrace+0x46)
[0x55c60ae069b6]
postgres: ubuntu postgres [local]
INSERT(procsignal_sigusr1_handler+0x1d8) [0x55c60ac7f4f8]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f75a395c520]
/lib/x86_64-linux-gnu/libc.so.6(+0x91104) [0x7f75a39ab104]
/lib/x86_64-linux-gnu/libc.so.6(+0x9ccf8) [0x7f75a39b6cf8]
postgres: ubuntu postgres [local] INSERT(PGSemaphoreLock+0x22)
[0x55c60abfb1f2]
postgres: ubuntu postgres [local] INSERT(LWLockAcquire+0x174)
[0x55c60ac8f384]
postgres: ubuntu postgres [local] 

Re: Printing backtrace of postgres processes

2022-04-17 Thread vignesh C
On Fri, Apr 15, 2022 at 11:49 AM Kyotaro Horiguchi
 wrote:
>
> At Thu, 14 Apr 2022 10:33:50 +0530, vignesh C  wrote in
> > On Wed, Apr 6, 2022 at 12:29 PM vignesh C  wrote:
> > >
> > > On Tue, Apr 5, 2022 at 9:18 PM Robert Haas  wrote:
> > > > This looks like a grotty hack.
> > >
> > > I have changed it so that the backtrace is set and returned to the
> > > caller. The caller will take care of logging or setting it to the
> > > error data context. Thoughts?
>
> The point here I think is that the StringInfo is useless here in the
> first place.  Addition to that, it is outside the use of StringInfo
> hammering in a string pointer into it.

Modified

> By the way, as Andres suggested very early in this thread, backrace()
> can be called from signal handlers with some preparation.  So we can
> run backtrace() in HandleLogBacktraceInterrupt() and backtrace_symbols
> in ProceeLogBacktraceInterrupt().  This make this a bit complex, but
> the outcome should be more informative.

Using that approach we could only send the trace to stderr, when we
are trying to log it to the logs it might result in deadlock as the
signal handler can be called in between malloc lock. We dropped that
approach to avoid issues pointed in [1].

> > > > Incidentally changing the behavior of pg_signal_backend() doesn't seem
> > > > like a great idea. We can do that as a separate commit, after
> > > > considering whether documentation changes are needed. But it's not
> > > > something that should get folded into a commit on another topic.
> > >
> > > Agreed. I have kept the logic of pg_signal_backend as it is.
> > > pg_log_backtrace and pg_log_backend_memory_contexts use the same
> > > functionality to check and send signal. I have added a new function
> > > "CheckProcSendDebugSignal" which has the common code implementation
> > > and will be called by both pg_log_backtrace and
> > > pg_log_backend_memory_contexts.
>
> The name looks like too specific than what it actually does, and
> rather doesn't manifest what it does.
> SendProcSignalBackendOrAuxproc() might be more eescriptive.
>
> Or we can provide BackendOrAuxiliaryPidGetProc(int pid, BackendId ).

Modified it to SendProcSignalBackendOrAuxproc

> > > > + /*
> > > > + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
> > > > + * isn't valid; but by the time we reach kill(), a process for which we
> > > > + * get a valid proc here might have terminated on its own.  There's no 
> > > > way
> > > > + * to acquire a lock on an arbitrary process to prevent that. But since
> > > > + * this mechanism is usually used to debug a backend or an auxiliary
> > > > + * process running and consuming lots of memory, that it might end on 
> > > > its
> > > > + * own first without logging the requested info is not a problem.
> > > > + */
> > > >
> > > > This comment made a lot more sense where it used to be than it does
> > > > where it is now. I think more work is required here than just cutting
> > > > and pasting.
> > >
> > > This function was called from pg_signal_backend earlier, this logic is
> > > now moved to CheckProcSendDebugSignal which will be called only by
> > > pg_log_backtrace and pg_log_backend_memory_contexts. This looks
> > > appropriate in CheckProcSendDebugSignal with slight change to specify
> > > it can consume lots of memory or a long running process.
> > > Thoughts?
>
> For example. do you see the following part correct as well for
> pg_log_backtrace()?
">
> > + * this mechanism is usually used to debug a backend or an auxiliary
> > + * process running and consuming lots of memory, that it might end on its

I felt it was ok since I mentioned it as "consuming lots of memory or
a long running process", let me know if you want to change it to
something else, I will change it.

[1] - https://www.postgresql.org/message-id/1361750.1606795285%40sss.pgh.pa.us

The attached v21 patch has the changes for the same.

Regards,
Vignesh
From f7c29250f137844c36e5164ff457028a893fadd7 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Thu, 14 Apr 2022 10:31:50 +0530
Subject: [PATCH v21] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy, Justin Pryzby
Reviewers: Bharath Rupireddy, Justin Pryzby, Fujii Masao, Atsushi Torikoshi, Dilip Kumar, Robert Haas, Andres Freund, Tom lane, Craig Ringer
Discussion: 

Re: Printing backtrace of postgres processes

2022-04-15 Thread Kyotaro Horiguchi
At Thu, 14 Apr 2022 10:33:50 +0530, vignesh C  wrote in 
> On Wed, Apr 6, 2022 at 12:29 PM vignesh C  wrote:
> >
> > On Tue, Apr 5, 2022 at 9:18 PM Robert Haas  wrote:
> > > This looks like a grotty hack.
> >
> > I have changed it so that the backtrace is set and returned to the
> > caller. The caller will take care of logging or setting it to the
> > error data context. Thoughts?

The point here I think is that the StringInfo is useless here in the
first place.  Addition to that, it is outside the use of StringInfo
hammering in a string pointer into it.

By the way, as Andres suggested very early in this thread, backrace()
can be called from signal handlers with some preparation.  So we can
run backtrace() in HandleLogBacktraceInterrupt() and backtrace_symbols
in ProceeLogBacktraceInterrupt().  This make this a bit complex, but
the outcome should be more informative.


> > > Incidentally changing the behavior of pg_signal_backend() doesn't seem
> > > like a great idea. We can do that as a separate commit, after
> > > considering whether documentation changes are needed. But it's not
> > > something that should get folded into a commit on another topic.
> >
> > Agreed. I have kept the logic of pg_signal_backend as it is.
> > pg_log_backtrace and pg_log_backend_memory_contexts use the same
> > functionality to check and send signal. I have added a new function
> > "CheckProcSendDebugSignal" which has the common code implementation
> > and will be called by both pg_log_backtrace and
> > pg_log_backend_memory_contexts.

The name looks like too specific than what it actually does, and
rather doesn't manifest what it does.
SendProcSignalBackendOrAuxproc() might be more eescriptive.

Or we can provide BackendOrAuxiliaryPidGetProc(int pid, BackendId ).

> > > + /*
> > > + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
> > > + * isn't valid; but by the time we reach kill(), a process for which we
> > > + * get a valid proc here might have terminated on its own.  There's no 
> > > way
> > > + * to acquire a lock on an arbitrary process to prevent that. But since
> > > + * this mechanism is usually used to debug a backend or an auxiliary
> > > + * process running and consuming lots of memory, that it might end on its
> > > + * own first without logging the requested info is not a problem.
> > > + */
> > >
> > > This comment made a lot more sense where it used to be than it does
> > > where it is now. I think more work is required here than just cutting
> > > and pasting.
> >
> > This function was called from pg_signal_backend earlier, this logic is
> > now moved to CheckProcSendDebugSignal which will be called only by
> > pg_log_backtrace and pg_log_backend_memory_contexts. This looks
> > appropriate in CheckProcSendDebugSignal with slight change to specify
> > it can consume lots of memory or a long running process.
> > Thoughts?

For example. do you see the following part correct as well for
pg_log_backtrace()?

> + * this mechanism is usually used to debug a backend or an auxiliary
> + * process running and consuming lots of memory, that it might end on its

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Printing backtrace of postgres processes

2022-04-13 Thread vignesh C
On Wed, Apr 6, 2022 at 12:29 PM vignesh C  wrote:
>
> On Tue, Apr 5, 2022 at 9:18 PM Robert Haas  wrote:
> >
> > On Wed, Mar 30, 2022 at 12:03 PM Justin Pryzby  wrote:
> > > On Wed, Mar 30, 2022 at 11:53:52AM -0400, Greg Stark wrote:
> > > > Sadly the cfbot is showing a patch conflict again. It's just a trivial
> > > > conflict in the regression test schedule so I'm not going to update
> > > > the status but it would be good to rebase it so we continue to get
> > > > cfbot testing.
> > >
> > > Done.  No changes.
> >
> > + if (chk_auxiliary_proc)
> > + ereport(WARNING,
> > + errmsg("PID %d is not a PostgreSQL server process", pid));
> > + else
> > + ereport(WARNING,
> > + errmsg("PID %d is not a PostgreSQL backend process", pid));
> >
> > This doesn't look right to me. I don't think that PostgreSQL server
> > processes are one kind of thing and PostgreSQL backend processes are
> > another kind of thing. I think they're two terms that are pretty
> > nearly interchangeable, or maybe at best you want to argue that
> > backend processes are some subset of server processes. I don't see
> > this sort of thing adding any clarity.
>
> I have changed it to "PID %d is not a PostgreSQL server process" which
> is similar to the existing warning message in
> pg_log_backend_memory_contexts.
>
> > -static void
> > +void
> >  set_backtrace(ErrorData *edata, int num_skip)
> >  {
> >   StringInfoData errtrace;
> > @@ -978,7 +978,18 @@ set_backtrace(ErrorData *edata, int num_skip)
> >  "backtrace generation is not supported by this installation");
> >  #endif
> >
> > - edata->backtrace = errtrace.data;
> > + if (edata)
> > + edata->backtrace = errtrace.data;
> > + else
> > + {
> > + /*
> > + * LOG_SERVER_ONLY is used to make sure the backtrace is never
> > + * sent to client. We want to avoid messing up the other client
> > + * session.
> > + */
> > + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
> > + pfree(errtrace.data);
> > + }
> >  }
> >
> > This looks like a grotty hack.
>
> I have changed it so that the backtrace is set and returned to the
> caller. The caller will take care of logging or setting it to the
> error data context. Thoughts?
>
> > - PGPROC*proc = BackendPidGetProc(pid);
> > -
> > - /*
> > - * BackendPidGetProc returns NULL if the pid isn't valid; but by the time
> > - * we reach kill(), a process for which we get a valid proc here might
> > - * have terminated on its own.  There's no way to acquire a lock on an
> > - * arbitrary process to prevent that. But since so far all the callers of
> > - * this mechanism involve some request for ending the process anyway, that
> > - * it might end on its own first is not a problem.
> > - *
> > - * Note that proc will also be NULL if the pid refers to an auxiliary
> > - * process or the postmaster (neither of which can be signaled via
> > - * pg_signal_backend()).
> > - */
> > - if (proc == NULL)
> > - {
> > - /*
> > - * This is just a warning so a loop-through-resultset will not abort
> > - * if one backend terminated on its own during the run.
> > - */
> > - ereport(WARNING,
> > - (errmsg("PID %d is not a PostgreSQL backend process", pid)));
> > + PGPROC  *proc;
> >
> > + /* Users can only signal valid backend or an auxiliary process. */
> > + proc = CheckPostgresProcessId(pid, false, NULL);
> > + if (!proc)
> >   return SIGNAL_BACKEND_ERROR;
> > - }
> >
> > Incidentally changing the behavior of pg_signal_backend() doesn't seem
> > like a great idea. We can do that as a separate commit, after
> > considering whether documentation changes are needed. But it's not
> > something that should get folded into a commit on another topic.
>
> Agreed. I have kept the logic of pg_signal_backend as it is.
> pg_log_backtrace and pg_log_backend_memory_contexts use the same
> functionality to check and send signal. I have added a new function
> "CheckProcSendDebugSignal" which has the common code implementation
> and will be called by both pg_log_backtrace and
> pg_log_backend_memory_contexts.
>
> > + /*
> > + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
> > + * isn't valid; but by the time we reach kill(), a process for which we
> > + * get a valid proc here might have terminated on its own.  There's no way
> > + * to acquire a lock on an arbitrary process to prevent that. But since
> > + * this mechanism is usually used to debug a backend or an auxiliary
> > + * process running and consuming lots of memory, that it might end on its
> > + * own first without logging the requested info is not a problem.
> > + */
> >
> > This comment made a lot more sense where it used to be than it does
> > where it is now. I think more work is required here than just cutting
> > and pasting.
>
> This function was called from pg_signal_backend earlier, this logic is
> now moved to CheckProcSendDebugSignal which will be called only by
> pg_log_backtrace and pg_log_backend_memory_contexts. This looks
> appropriate in CheckProcSendDebugSignal with 

Re: Printing backtrace of postgres processes

2022-04-06 Thread vignesh C
On Tue, Apr 5, 2022 at 9:18 PM Robert Haas  wrote:
>
> On Wed, Mar 30, 2022 at 12:03 PM Justin Pryzby  wrote:
> > On Wed, Mar 30, 2022 at 11:53:52AM -0400, Greg Stark wrote:
> > > Sadly the cfbot is showing a patch conflict again. It's just a trivial
> > > conflict in the regression test schedule so I'm not going to update
> > > the status but it would be good to rebase it so we continue to get
> > > cfbot testing.
> >
> > Done.  No changes.
>
> + if (chk_auxiliary_proc)
> + ereport(WARNING,
> + errmsg("PID %d is not a PostgreSQL server process", pid));
> + else
> + ereport(WARNING,
> + errmsg("PID %d is not a PostgreSQL backend process", pid));
>
> This doesn't look right to me. I don't think that PostgreSQL server
> processes are one kind of thing and PostgreSQL backend processes are
> another kind of thing. I think they're two terms that are pretty
> nearly interchangeable, or maybe at best you want to argue that
> backend processes are some subset of server processes. I don't see
> this sort of thing adding any clarity.

I have changed it to "PID %d is not a PostgreSQL server process" which
is similar to the existing warning message in
pg_log_backend_memory_contexts.

> -static void
> +void
>  set_backtrace(ErrorData *edata, int num_skip)
>  {
>   StringInfoData errtrace;
> @@ -978,7 +978,18 @@ set_backtrace(ErrorData *edata, int num_skip)
>  "backtrace generation is not supported by this installation");
>  #endif
>
> - edata->backtrace = errtrace.data;
> + if (edata)
> + edata->backtrace = errtrace.data;
> + else
> + {
> + /*
> + * LOG_SERVER_ONLY is used to make sure the backtrace is never
> + * sent to client. We want to avoid messing up the other client
> + * session.
> + */
> + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
> + pfree(errtrace.data);
> + }
>  }
>
> This looks like a grotty hack.

I have changed it so that the backtrace is set and returned to the
caller. The caller will take care of logging or setting it to the
error data context. Thoughts?

> - PGPROC*proc = BackendPidGetProc(pid);
> -
> - /*
> - * BackendPidGetProc returns NULL if the pid isn't valid; but by the time
> - * we reach kill(), a process for which we get a valid proc here might
> - * have terminated on its own.  There's no way to acquire a lock on an
> - * arbitrary process to prevent that. But since so far all the callers of
> - * this mechanism involve some request for ending the process anyway, that
> - * it might end on its own first is not a problem.
> - *
> - * Note that proc will also be NULL if the pid refers to an auxiliary
> - * process or the postmaster (neither of which can be signaled via
> - * pg_signal_backend()).
> - */
> - if (proc == NULL)
> - {
> - /*
> - * This is just a warning so a loop-through-resultset will not abort
> - * if one backend terminated on its own during the run.
> - */
> - ereport(WARNING,
> - (errmsg("PID %d is not a PostgreSQL backend process", pid)));
> + PGPROC  *proc;
>
> + /* Users can only signal valid backend or an auxiliary process. */
> + proc = CheckPostgresProcessId(pid, false, NULL);
> + if (!proc)
>   return SIGNAL_BACKEND_ERROR;
> - }
>
> Incidentally changing the behavior of pg_signal_backend() doesn't seem
> like a great idea. We can do that as a separate commit, after
> considering whether documentation changes are needed. But it's not
> something that should get folded into a commit on another topic.

Agreed. I have kept the logic of pg_signal_backend as it is.
pg_log_backtrace and pg_log_backend_memory_contexts use the same
functionality to check and send signal. I have added a new function
"CheckProcSendDebugSignal" which has the common code implementation
and will be called by both pg_log_backtrace and
pg_log_backend_memory_contexts.

> + /*
> + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
> + * isn't valid; but by the time we reach kill(), a process for which we
> + * get a valid proc here might have terminated on its own.  There's no way
> + * to acquire a lock on an arbitrary process to prevent that. But since
> + * this mechanism is usually used to debug a backend or an auxiliary
> + * process running and consuming lots of memory, that it might end on its
> + * own first without logging the requested info is not a problem.
> + */
>
> This comment made a lot more sense where it used to be than it does
> where it is now. I think more work is required here than just cutting
> and pasting.

This function was called from pg_signal_backend earlier, this logic is
now moved to CheckProcSendDebugSignal which will be called only by
pg_log_backtrace and pg_log_backend_memory_contexts. This looks
appropriate in CheckProcSendDebugSignal with slight change to specify
it can consume lots of memory or a long running process.
Thoughts?

Attached v20 patch has the changes for the same.

Regards,
Vignesh
From cc3847b30e0b61218aebdf68e837d366a82ebad1 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Wed, 6 Apr 2022 10:45:56 

Re: Printing backtrace of postgres processes

2022-04-05 Thread Robert Haas
On Wed, Mar 30, 2022 at 12:03 PM Justin Pryzby  wrote:
> On Wed, Mar 30, 2022 at 11:53:52AM -0400, Greg Stark wrote:
> > Sadly the cfbot is showing a patch conflict again. It's just a trivial
> > conflict in the regression test schedule so I'm not going to update
> > the status but it would be good to rebase it so we continue to get
> > cfbot testing.
>
> Done.  No changes.

+ if (chk_auxiliary_proc)
+ ereport(WARNING,
+ errmsg("PID %d is not a PostgreSQL server process", pid));
+ else
+ ereport(WARNING,
+ errmsg("PID %d is not a PostgreSQL backend process", pid));

This doesn't look right to me. I don't think that PostgreSQL server
processes are one kind of thing and PostgreSQL backend processes are
another kind of thing. I think they're two terms that are pretty
nearly interchangeable, or maybe at best you want to argue that
backend processes are some subset of server processes. I don't see
this sort of thing adding any clarity.

-static void
+void
 set_backtrace(ErrorData *edata, int num_skip)
 {
  StringInfoData errtrace;
@@ -978,7 +978,18 @@ set_backtrace(ErrorData *edata, int num_skip)
 "backtrace generation is not supported by this installation");
 #endif

- edata->backtrace = errtrace.data;
+ if (edata)
+ edata->backtrace = errtrace.data;
+ else
+ {
+ /*
+ * LOG_SERVER_ONLY is used to make sure the backtrace is never
+ * sent to client. We want to avoid messing up the other client
+ * session.
+ */
+ elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
+ pfree(errtrace.data);
+ }
 }

This looks like a grotty hack.

- PGPROC*proc = BackendPidGetProc(pid);
-
- /*
- * BackendPidGetProc returns NULL if the pid isn't valid; but by the time
- * we reach kill(), a process for which we get a valid proc here might
- * have terminated on its own.  There's no way to acquire a lock on an
- * arbitrary process to prevent that. But since so far all the callers of
- * this mechanism involve some request for ending the process anyway, that
- * it might end on its own first is not a problem.
- *
- * Note that proc will also be NULL if the pid refers to an auxiliary
- * process or the postmaster (neither of which can be signaled via
- * pg_signal_backend()).
- */
- if (proc == NULL)
- {
- /*
- * This is just a warning so a loop-through-resultset will not abort
- * if one backend terminated on its own during the run.
- */
- ereport(WARNING,
- (errmsg("PID %d is not a PostgreSQL backend process", pid)));
+ PGPROC  *proc;

+ /* Users can only signal valid backend or an auxiliary process. */
+ proc = CheckPostgresProcessId(pid, false, NULL);
+ if (!proc)
  return SIGNAL_BACKEND_ERROR;
- }

Incidentally changing the behavior of pg_signal_backend() doesn't seem
like a great idea. We can do that as a separate commit, after
considering whether documentation changes are needed. But it's not
something that should get folded into a commit on another topic.

+ /*
+ * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+ * isn't valid; but by the time we reach kill(), a process for which we
+ * get a valid proc here might have terminated on its own.  There's no way
+ * to acquire a lock on an arbitrary process to prevent that. But since
+ * this mechanism is usually used to debug a backend or an auxiliary
+ * process running and consuming lots of memory, that it might end on its
+ * own first without logging the requested info is not a problem.
+ */

This comment made a lot more sense where it used to be than it does
where it is now. I think more work is required here than just cutting
and pasting.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2022-03-30 Thread Justin Pryzby
On Wed, Mar 30, 2022 at 11:53:52AM -0400, Greg Stark wrote:
> Sadly the cfbot is showing a patch conflict again. It's just a trivial
> conflict in the regression test schedule so I'm not going to update
> the status but it would be good to rebase it so we continue to get
> cfbot testing.

Done.  No changes.
>From b984dacb4bf2794705a8da49fa68ac9d9a6f Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH] Add function to log the backtrace of the specified postgres
 process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy, Justin Pryzby
Reviewers: Bharath Rupireddy, Justin Pryzby, Fujii Masao, Atsushi Torikoshi, Dilip Kumar, Robert Haas, Andres Freund, Tom lane, Craig Ringer
Discussion: https://www.postgresql.org/message-id/CALDaNm3ZzmFS-=r7oduzj7y7bgqv+n06kqyft6c3xzdoknk...@mail.gmail.com
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 42 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 416 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 524c6b98547..4d187b3a6d8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25394,6 +25394,30 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the
+specified process ID.  This function can send the request to
+backends and auxiliary processes except the logger and statistics
+collector.  The backtraces will be logged at LOG
+message level. They will appear in the server log based on the log
+configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace identifies
+which function a process is currently executing and it may be useful
+for developers to diagnose stuck processes and other problems. This
+function is supported only if PostgreSQL was built with the ability to
+capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ -25614,6 +25638,44 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged as specified by the logging configuration.
+For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x788) [0x950c34]
+postgres: postgresdba postgres 

Re: Printing backtrace of postgres processes

2022-03-30 Thread Greg Stark
Sadly the cfbot is showing a patch conflict again. It's just a trivial
conflict in the regression test schedule so I'm not going to update
the status but it would be good to rebase it so we continue to get
cfbot testing.




Re: Printing backtrace of postgres processes

2022-03-11 Thread vignesh C
On Wed, Mar 9, 2022 at 9:26 PM Justin Pryzby  wrote:
>
> rebased to appease cfbot.
>
> + couple of little fixes as 0002.

Thanks for rebasing and fixing a few issues. I have taken all your
changes except for mcxtfuncs changes as those changes were not done as
part of this patch. Attached v19 patch which has the changes for the
same.

Regards,
Vignesh
From baff4bd9331eff3d3d021724847c2fb89d4bcdcf Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH v19] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy, Justin Pryzby
Reviewers: Bharath Rupireddy, Justin Pryzby, Fujii Masao, Atsushi Torikoshi, Dilip Kumar, Robert Haas, Andres Freund, Tom lane, Craig Ringer
Discussion: https://www.postgresql.org/message-id/CALDaNm3ZzmFS-=r7oduzj7y7bgqv+n06kqyft6c3xzdoknk...@mail.gmail.com
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 42 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 416 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8a802fb225..4171208719 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25355,6 +25355,30 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the
+specified process ID.  This function can send the request to
+backends and auxiliary processes except the logger and statistics
+collector.  The backtraces will be logged at LOG
+message level. They will appear in the server log based on the log
+configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace identifies
+which function a process is currently executing and it may be useful
+for developers to diagnose stuck processes and other problems. This
+function is supported only if PostgreSQL was built with the ability to
+capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ -25574,6 +25598,44 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged as specified by the logging configuration.
+For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x788) [0x950c34]

Re: Printing backtrace of postgres processes

2022-03-09 Thread Justin Pryzby
rebased to appease cfbot.

+ couple of little fixes as 0002.

-- 
Justin
>From 4be93f2bab460682a0f5af9e1e3f4970709b3517 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH 1/2] Add function to log the backtrace of the specified
 postgres process.

ci-os-only: html

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 42 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 416 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b727c3423aa..9f321fb019a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25355,6 +25355,30 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the
+specified process ID.  This function can send the request to
+backends and auxiliary processes except logger and statistics
+collector.  These backtraces will be logged at LOG
+message level. They will appear in the server log based on the log
+configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace identifies
+which function a process is currently executing and it may be useful
+for developers to diagnose stuck processes and other problems. This
+function is supported only if PostgreSQL was built with the ability to
+capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ -25574,6 +25598,44 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged as specified by the logging configuration.
+For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x788) [0x950c34]
+postgres: postgresdba postgres [local] SELECT() [0x761e89]
+postgres: postgresdba postgres [local] SELECT() [0x71bbda]
+postgres: postgresdba postgres [local] SELECT() [0x71e380]
+postgres: postgresdba postgres [local] SELECT(standard_ExecutorRun+0x1d6) [0x71c1fe]
+postgres: postgresdba postgres [local] SELECT(ExecutorRun+0x55) [0x71c026]
+postgres: postgresdba postgres [local] SELECT() [0x953fc5]
+postgres: postgresdba postgres [local] SELECT(PortalRun+0x262) [0x953c7e]
+ 

Re: Printing backtrace of postgres processes

2022-01-30 Thread vignesh C
On Sat, Jan 29, 2022 at 8:06 AM vignesh C  wrote:
>
> On Fri, Jan 28, 2022 at 1:54 PM Bharath Rupireddy
>  wrote:
> >
> > On Thu, Jan 27, 2022 at 10:45 AM vignesh C  wrote:
> > >
> > > On Wed, Jan 26, 2022 at 11:07 AM Bharath Rupireddy
> > >  wrote:
> > > >
> > > > On Tue, Jan 25, 2022 at 12:00 PM vignesh C  wrote:
> > > > > Thanks for the comments, attached v17 patch has the fix for the same.
> > > >
> > > > Have a minor comment on v17:
> > > >
> > > > can we modify the elog(LOG, to new style ereport(LOG, ?
> > > > + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
> > > >
> > > > /*--
> > > >  * New-style error reporting API: to be used in this way:
> > > >  *  ereport(ERROR,
> > > >  *  errcode(ERRCODE_UNDEFINED_CURSOR),
> > > >  *  errmsg("portal \"%s\" not found", stmt->portalname),
> > > >  *  ... other errxxx() fields as needed ...);
> > > >  *
> > >
> > > Attached v18 patch has the changes for the same.
> >
> > Thanks. The v18 patch LGTM. I'm not sure if the CF bot failure is
> > related to the patch -  https://cirrus-ci.com/task/5633364051886080
>
> I felt this test failure is not related to this change. Let's wait and
> see the results of the next run. Also I noticed that this test seems
> to have failed many times in the buildfarm too recently:
> https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=60=HEAD==recoveryCheck=Submit

CFBot has passed in the last run.

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2022-01-28 Thread vignesh C
On Fri, Jan 28, 2022 at 1:54 PM Bharath Rupireddy
 wrote:
>
> On Thu, Jan 27, 2022 at 10:45 AM vignesh C  wrote:
> >
> > On Wed, Jan 26, 2022 at 11:07 AM Bharath Rupireddy
> >  wrote:
> > >
> > > On Tue, Jan 25, 2022 at 12:00 PM vignesh C  wrote:
> > > > Thanks for the comments, attached v17 patch has the fix for the same.
> > >
> > > Have a minor comment on v17:
> > >
> > > can we modify the elog(LOG, to new style ereport(LOG, ?
> > > + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
> > >
> > > /*--
> > >  * New-style error reporting API: to be used in this way:
> > >  *  ereport(ERROR,
> > >  *  errcode(ERRCODE_UNDEFINED_CURSOR),
> > >  *  errmsg("portal \"%s\" not found", stmt->portalname),
> > >  *  ... other errxxx() fields as needed ...);
> > >  *
> >
> > Attached v18 patch has the changes for the same.
>
> Thanks. The v18 patch LGTM. I'm not sure if the CF bot failure is
> related to the patch -  https://cirrus-ci.com/task/5633364051886080

I felt this test failure is not related to this change. Let's wait and
see the results of the next run. Also I noticed that this test seems
to have failed many times in the buildfarm too recently:
https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=60=HEAD==recoveryCheck=Submit

Regards,
Vignesh
From 3599e6940e16eb164737a8428af4b8ba5ef6bf59 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH v18] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 42 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 416 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0ee6974f1c..855ccc8902 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25318,6 +25318,30 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the
+specified process ID.  This function can send the request to
+backends and auxiliary processes except logger and statistics
+collector.  These backtraces will be logged at LOG
+message level. They will appear in the server log based on the log
+configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace identifies
+which function a process is currently executing and it may be useful
+for developers to diagnose stuck processes and other problems. This
+function is supported only if PostgreSQL was built with the ability to
+capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ 

Re: Printing backtrace of postgres processes

2022-01-28 Thread Bharath Rupireddy
On Thu, Jan 27, 2022 at 10:45 AM vignesh C  wrote:
>
> On Wed, Jan 26, 2022 at 11:07 AM Bharath Rupireddy
>  wrote:
> >
> > On Tue, Jan 25, 2022 at 12:00 PM vignesh C  wrote:
> > > Thanks for the comments, attached v17 patch has the fix for the same.
> >
> > Have a minor comment on v17:
> >
> > can we modify the elog(LOG, to new style ereport(LOG, ?
> > + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
> >
> > /*--
> >  * New-style error reporting API: to be used in this way:
> >  *  ereport(ERROR,
> >  *  errcode(ERRCODE_UNDEFINED_CURSOR),
> >  *  errmsg("portal \"%s\" not found", stmt->portalname),
> >  *  ... other errxxx() fields as needed ...);
> >  *
>
> Attached v18 patch has the changes for the same.

Thanks. The v18 patch LGTM. I'm not sure if the CF bot failure is
related to the patch -  https://cirrus-ci.com/task/5633364051886080

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2022-01-26 Thread vignesh C
On Wed, Jan 26, 2022 at 11:07 AM Bharath Rupireddy
 wrote:
>
> On Tue, Jan 25, 2022 at 12:00 PM vignesh C  wrote:
> > Thanks for the comments, attached v17 patch has the fix for the same.
>
> Have a minor comment on v17:
>
> can we modify the elog(LOG, to new style ereport(LOG, ?
> + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
>
> /*--
>  * New-style error reporting API: to be used in this way:
>  *  ereport(ERROR,
>  *  errcode(ERRCODE_UNDEFINED_CURSOR),
>  *  errmsg("portal \"%s\" not found", stmt->portalname),
>  *  ... other errxxx() fields as needed ...);
>  *

Attached v18 patch has the changes for the same.

Regards,
Vignesh
From 3599e6940e16eb164737a8428af4b8ba5ef6bf59 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH v18] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 42 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 416 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0ee6974f1c..855ccc8902 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25318,6 +25318,30 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the
+specified process ID.  This function can send the request to
+backends and auxiliary processes except logger and statistics
+collector.  These backtraces will be logged at LOG
+message level. They will appear in the server log based on the log
+configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace identifies
+which function a process is currently executing and it may be useful
+for developers to diagnose stuck processes and other problems. This
+function is supported only if PostgreSQL was built with the ability to
+capture backtraces, otherwise it will emit a warning.
+   
+  
+
   

 
@@ -25537,6 +25561,44 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged as specified by the logging configuration.
+For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+

Re: Printing backtrace of postgres processes

2022-01-25 Thread Bharath Rupireddy
On Tue, Jan 25, 2022 at 12:00 PM vignesh C  wrote:
> Thanks for the comments, attached v17 patch has the fix for the same.

Have a minor comment on v17:

can we modify the elog(LOG, to new style ereport(LOG, ?
+ elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);

/*--
 * New-style error reporting API: to be used in this way:
 *  ereport(ERROR,
 *  errcode(ERRCODE_UNDEFINED_CURSOR),
 *  errmsg("portal \"%s\" not found", stmt->portalname),
 *  ... other errxxx() fields as needed ...);
 *

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2022-01-24 Thread vignesh C
On Mon, Jan 24, 2022 at 1:05 PM torikoshia  wrote:
>
> On 2022-01-14 19:48, Bharath Rupireddy wrote:
> > On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
> >  wrote:
> >>
> >> On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> >> > The Attached v15 patch has the fixes for the same.
> >>
> >> Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as
> >> RfC.
> >
> > The patch was not applying because of the recent commit [1]. I took
> > this opportunity and tried a bunch of things without modifying the
> > core logic of the pg_log_backtrace feature that Vignesh has worked on.
> >
> > I did following -  moved the duplicate code to a new function
> > CheckPostgresProcessId which can be used by pg_log_backtrace,
> > pg_log_memory_contexts, pg_signal_backend and pg_log_query_plan ([2]),
>
> Thanks for refactoring!
> I'm going to use it for pg_log_query_plan after this patch is merged.
>
> > modified the code comments, docs and tests to be more in sync with the
> > commit [1], moved two of ProcessLogBacktraceInterrupt calls (archiver
> > and wal writer) to their respective interrupt handlers. Here's the v16
> > version that I've come up with.
>
> I have some minor comments.
>
> > +
> > +You can get the file name and line number from the logged details
> > by using
> > +gdb/addr2line in linux platforms (users must ensure gdb/addr2line
> > is
> > +already installed).
> > +
> > +1)  "info line *address" from gdb on postgres executable. For example:
> > +gdb ./postgres
> > +(gdb) info line *0x71c25d
> > +Line 378 of "execMain.c" starts at address 0x71c25d
> > standard_ExecutorRun+470
> > and ends at 0x71c263
> > standard_ExecutorRun+476.
> > +OR
> > +2) Using "addr2line -e postgres address", For example:
> > +addr2line -e ./postgres 0x71c25d
> > +/home/postgresdba/src/backend/executor/execMain.c:378
> > +
> > +   
> > +
>
> Isn't it better to remove line 1) and 2) from ?
> I just glanced at the existing sgml, but  seems to
> contain only codes.

Modified

> > + * CheckPostgresProcessId -- check if the process with given pid is a
> > backend
> > + * or an auxiliary process.
> > + *
> > +
> > + */
>
> Isn't the 4th line needless?

Modified

> BTW, when I saw the name of this function, I thought it just checks if
> the specified pid is PostgreSQL process or not.
> Since it returns the pointer to the PGPROC or BackendId of the PID, it
> might be kind to write comments about it.

Modified

Thanks for the comments, attached v17 patch has the fix for the same.

Regards,
Vignesh
From 5859e2c34f73206cf48a40d9e429b02f923b7081 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH v17] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 40 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 414 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git 

Re: Printing backtrace of postgres processes

2022-01-24 Thread vignesh C
On Mon, Jan 24, 2022 at 10:06 PM Fujii Masao
 wrote:
>
>
>
> On 2022/01/24 16:35, torikoshia wrote:
> > On 2022-01-14 19:48, Bharath Rupireddy wrote:
> >> On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
> >>  wrote:
> >>>
> >>> On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> >>> > The Attached v15 patch has the fixes for the same.
> >>>
> >>> Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as 
> >>> RfC.
> >>
> >> The patch was not applying because of the recent commit [1]. I took
> >> this opportunity and tried a bunch of things without modifying the
> >> core logic of the pg_log_backtrace feature that Vignesh has worked on.
>
> I have one question about this feature. When the PID of auxiliary process 
> like archiver is specified, probably the function always reports the same 
> result, doesn't it? Because, for example, the archiver always logs its 
> backtrace in HandlePgArchInterrupts().

It can be either from HandlePgArchInterrupts in pgarch_MainLoop or
from HandlePgArchInterrupts in pgarch_ArchiverCopyLoop, since the
archiver functionality is mainly in these functions.

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2022-01-24 Thread Fujii Masao




On 2022/01/24 16:35, torikoshia wrote:

On 2022-01-14 19:48, Bharath Rupireddy wrote:

On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
 wrote:


On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> The Attached v15 patch has the fixes for the same.

Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as RfC.


The patch was not applying because of the recent commit [1]. I took
this opportunity and tried a bunch of things without modifying the
core logic of the pg_log_backtrace feature that Vignesh has worked on.


I have one question about this feature. When the PID of auxiliary process like 
archiver is specified, probably the function always reports the same result, 
doesn't it? Because, for example, the archiver always logs its backtrace in 
HandlePgArchInterrupts().

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Printing backtrace of postgres processes

2022-01-23 Thread torikoshia

On 2022-01-14 19:48, Bharath Rupireddy wrote:

On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
 wrote:


On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> The Attached v15 patch has the fixes for the same.

Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as 
RfC.


The patch was not applying because of the recent commit [1]. I took
this opportunity and tried a bunch of things without modifying the
core logic of the pg_log_backtrace feature that Vignesh has worked on.

I did following -  moved the duplicate code to a new function
CheckPostgresProcessId which can be used by pg_log_backtrace,
pg_log_memory_contexts, pg_signal_backend and pg_log_query_plan ([2]),


Thanks for refactoring!
I'm going to use it for pg_log_query_plan after this patch is merged.


modified the code comments, docs and tests to be more in sync with the
commit [1], moved two of ProcessLogBacktraceInterrupt calls (archiver
and wal writer) to their respective interrupt handlers. Here's the v16
version that I've come up with.


I have some minor comments.


+
+You can get the file name and line number from the logged details 
by using
+gdb/addr2line in linux platforms (users must ensure gdb/addr2line 
is

+already installed).
+
+1)  "info line *address" from gdb on postgres executable. For example:
+gdb ./postgres
+(gdb) info line *0x71c25d
+Line 378 of "execMain.c" starts at address 0x71c25d 
standard_ExecutorRun+470 
and ends at 0x71c263 
standard_ExecutorRun+476.

+OR
+2) Using "addr2line -e postgres address", For example:
+addr2line -e ./postgres 0x71c25d
+/home/postgresdba/src/backend/executor/execMain.c:378
+
+   
+


Isn't it better to remove line 1) and 2) from ?
I just glanced at the existing sgml, but  seems to 
contain only codes.



+ * CheckPostgresProcessId -- check if the process with given pid is a 
backend

+ * or an auxiliary process.
+ *
+
+ */


Isn't the 4th line needless?

BTW, when I saw the name of this function, I thought it just checks if 
the specified pid is PostgreSQL process or not.
Since it returns the pointer to the PGPROC or BackendId of the PID, it 
might be kind to write comments about it.



--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Printing backtrace of postgres processes

2022-01-14 Thread Bharath Rupireddy
On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
 wrote:
>
> On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> > The Attached v15 patch has the fixes for the same.
>
> Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as RfC.

The patch was not applying because of the recent commit [1]. I took
this opportunity and tried a bunch of things without modifying the
core logic of the pg_log_backtrace feature that Vignesh has worked on.

I did following -  moved the duplicate code to a new function
CheckPostgresProcessId which can be used by pg_log_backtrace,
pg_log_memory_contexts, pg_signal_backend and pg_log_query_plan ([2]),
modified the code comments, docs and tests to be more in sync with the
commit [1], moved two of ProcessLogBacktraceInterrupt calls (archiver
and wal writer) to their respective interrupt handlers. Here's the v16
version that I've come up with.

[1]
commit 790fbda902093c71ae47bff1414799cd716abb80
Author: Fujii Masao 
Date:   Tue Jan 11 23:19:59 2022 +0900

Enhance pg_log_backend_memory_contexts() for auxiliary processes.

[2] 
https://www.postgresql.org/message-id/flat/20220114063846.7jygvulyu6g23kdv%40jrouhaud

Regards,
Bharath Rupireddy.


v16-0001-Add-function-to-log-the-backtrace-of-the-specifi.patch
Description: Binary data


Re: Printing backtrace of postgres processes

2021-11-19 Thread Bharath Rupireddy
On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> The Attached v15 patch has the fixes for the same.

Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as RfC.

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-19 Thread vignesh C
On Thu, Nov 18, 2021 at 9:52 PM Justin Pryzby  wrote:
>
> On Wed, Nov 17, 2021 at 08:12:44PM +0530, vignesh C wrote:
> > Attached v14 patch has the fixes for the same.
>
> Thanks for updating the patch.
>
> I cleaned up the docs and comments.  I think this could be nearly "Ready".
>
> If you like the changes in my "fixup" patch (0002 and 0004), you should be 
> able
> to apply my 0002 on top of your 0001.  I'm sure it'll cause some conflicts 
> with
> your 2nd patch, though...

I have slightly modified and taken the changes. I have not taken a few
of the changes to keep it similar to pg_log_backend_memory_contexts.

> This doesn't bump the catversion, since that would cause the patch to fail in
> cfbot every time another commit updates catversion.

I have removed it, since it there in commit  message it is ok.

> Your 0001 patch allows printing backtraces of autovacuum, but the doc says 
> it's
> only for "backends".  Should the change to autovacuum.c be moved to the 2nd
> patch ?  Or, why is autovacuum so special that it should be handled in the
> first patch ?

I had separated the patches so that it is easier for review, I have
merged the changes as few rounds of review is done for the patch. Now
since the patch is merged, this change is handled.
The Attached v15 patch has the fixes for the same.

Regards,
Vignesh
From 66f7b2b5a42b3cbe347d227a64488cdcc025da8a Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v15] Add function to log the backtrace of the specified
 backend process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend process.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.
---
 doc/src/sgml/func.sgml| 80 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   |  4 ++
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procsignal.c  | 40 
 src/backend/storage/ipc/signalfuncs.c | 66 +++
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 53 +++
 src/test/regress/expected/backtrace_1.out | 59 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 39 +++
 21 files changed, 394 insertions(+), 6 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..24a2b920dc 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,38 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the
+backend or the
+WAL sender or the
+auxiliary process
+with the specified process ID. This function cannot request
+postmaster process or
+logger or 
+statistics collector
+(all other auxiliary processes
+it can) for the backtrace. The backtrace will be logged at
+LOG message level. It will appear in the server log
+based on the log configuration set (See
+ for more information), but
+will not be sent to the client regardless of
+. A backtrace identifies
+which function the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is available if PostgreSQL was built with the ability to
+capture backtraces. If the feature is not supported, the function will
+emit a warning and return false.
+   
+  
+
   

 
@@ -25458,6 +25490,54 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 

Re: Printing backtrace of postgres processes

2021-11-18 Thread Justin Pryzby
On Wed, Nov 17, 2021 at 08:12:44PM +0530, vignesh C wrote:
> Attached v14 patch has the fixes for the same.

Thanks for updating the patch.

I cleaned up the docs and comments.  I think this could be nearly "Ready".

If you like the changes in my "fixup" patch (0002 and 0004), you should be able
to apply my 0002 on top of your 0001.  I'm sure it'll cause some conflicts with
your 2nd patch, though...

This doesn't bump the catversion, since that would cause the patch to fail in
cfbot every time another commit updates catversion.

Your 0001 patch allows printing backtraces of autovacuum, but the doc says it's
only for "backends".  Should the change to autovacuum.c be moved to the 2nd
patch ?  Or, why is autovacuum so special that it should be handled in the
first patch ?

-- 
Justin
>From 0829c99951194df3e101e1c648a05bb5651ef090 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH 1/4] Add function to log the backtrace of the specified
 backend process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend process.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.
---
 doc/src/sgml/func.sgml| 85 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/storage/ipc/procsignal.c  | 41 +++
 src/backend/storage/ipc/signalfuncs.c | 49 +
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/error/elog.c| 20 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 42 +++
 src/test/regress/expected/backtrace_1.out | 46 
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 30 
 16 files changed, 331 insertions(+), 6 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..a88583edfd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,32 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the
+backend
+with the specified process ID. The backtrace will be logged at message
+level LOG. It will appear in the server log based on
+the log configuration set (See 
+for more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will emit a warning
+and return false.
+   
+  
+
   

 
@@ -25458,6 +25484,65 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged to the log file if logging is enabled, if logging
+is disabled backtrace will be logged to the console where the postmaster was
+started. For example:
+
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x788) [0x950c34]
+postgres: postgresdba postgres [local] SELECT() [0x761e89]
+postgres: postgresdba postgres [local] SELECT() [0x71bbda]
+postgres: postgresdba postgres [local] SELECT() [0x71e380]
+

Re: Printing backtrace of postgres processes

2021-11-17 Thread vignesh C
On Tue, Nov 16, 2021 at 1:12 AM Justin Pryzby  wrote:
>
> On Mon, Nov 15, 2021 at 09:12:49PM +0530, vignesh C wrote:
> > The idea here is to implement & expose pg_print_backtrace function, 
> > internally
>
> This patch is closely related to this one
> https://commitfest.postgresql.org/35/3142/
> | Logging plan of the currently running query
>
> I suggest to review that patch and make sure there's nothing you could borrow.

I had a look and felt the attached patch is in similar lines

> My only comment for now is that maybe the function name should be
> pg_log_backtrace() rather than pg_print_backtrace(), since it doesn't actually
> "print" the backtrace, but rather request the other backend to log its
> backtrace.

Modified

> Did you see that the windows build failed ?
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.153557
> I think you'll need to create an "alternate" output like
> src/test/regress/expected/misc_functions_1.out
>
> It's possible that's best done by creating a totally new .sql and .out files
> added to src/test/regress/parallel_schedule - because otherwise, you'd have to
> duplicate the existing 300 lines of misc_fuctions.out.

Modified

Attached v14 patch has the fixes for the same.

Regards,
Vignesh
From e267fb1772c05ad8f622d9dae0e474bcd8b25da6 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v14 1/2] Add function to log the backtrace of the specified
 backend process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend process.

Only superusers are allowed to request to log the backtrace
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.
---
 doc/src/sgml/func.sgml| 85 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/storage/ipc/procsignal.c  | 41 +++
 src/backend/storage/ipc/signalfuncs.c | 49 +
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/error/elog.c| 20 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/catversion.h  |  2 +-
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 42 +++
 src/test/regress/expected/backtrace_1.out | 46 
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 30 
 17 files changed, 332 insertions(+), 7 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..a88583edfd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,32 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backtrace
+
+pg_log_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the
+backend
+with the specified process ID. The backtrace will be logged at message
+level LOG. It will appear in the server log based on
+the log configuration set (See 
+for more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will emit a warning
+and return false.
+   
+  
+
   

 
@@ -25458,6 +25484,65 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_backtrace can be used to log the backtrace of
+a backend process. For example:
+
+postgres=# select pg_log_backtrace(pg_backend_pid());
+ pg_log_backtrace
+--
+ t
+(1 row)
+
+The backtrace will be logged to the log file if logging is enabled, if logging
+is disabled backtrace will be 

Re: Printing backtrace of postgres processes

2021-11-15 Thread Bharath Rupireddy
On Tue, Nov 16, 2021 at 1:12 AM Justin Pryzby  wrote:
>
> On Mon, Nov 15, 2021 at 09:12:49PM +0530, vignesh C wrote:
> > The idea here is to implement & expose pg_print_backtrace function, 
> > internally
>
> This patch is closely related to this one
> https://commitfest.postgresql.org/35/3142/
> | Logging plan of the currently running query
>
> I suggest to review that patch and make sure there's nothing you could borrow.
>
> My only comment for now is that maybe the function name should be
> pg_log_backtrace() rather than pg_print_backtrace(), since it doesn't actually
> "print" the backtrace, but rather request the other backend to log its
> backtrace.

+1 for pg_log_backtrace().

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-15 Thread Justin Pryzby
On Mon, Nov 15, 2021 at 09:12:49PM +0530, vignesh C wrote:
> The idea here is to implement & expose pg_print_backtrace function, internally

This patch is closely related to this one
https://commitfest.postgresql.org/35/3142/
| Logging plan of the currently running query

I suggest to review that patch and make sure there's nothing you could borrow.

My only comment for now is that maybe the function name should be
pg_log_backtrace() rather than pg_print_backtrace(), since it doesn't actually
"print" the backtrace, but rather request the other backend to log its
backtrace.

Did you see that the windows build failed ?
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.153557

I think you'll need to create an "alternate" output like
src/test/regress/expected/misc_functions_1.out

It's possible that's best done by creating a totally new .sql and .out files
added to src/test/regress/parallel_schedule - because otherwise, you'd have to
duplicate the existing 300 lines of misc_fuctions.out.

-- 
Justin




Re: Printing backtrace of postgres processes

2021-11-15 Thread vignesh C
On Mon, Nov 15, 2021 at 11:37 AM Dilip Kumar  wrote:
>
> On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
>
> >
> > Thanks for the comments, the attached v12 patch has the changes for the 
> > same.
>
> I have reviewed this patch and have some comments on v12-0001,
>
> 1.
> +This feature is not supported for the postmaster, logger, 
> checkpointer,
> +walwriter, background writer or statistics collector process. This
>
>
> Comment says it is not supported for postmaster, logger, checkpointer
> etc, but I just tried and it is working for checkpointer and walwriter
> processes, can you explain in comments why do we not want to support
> for these processes?  or the comment is old and now we are supporting
> for some of these processes.
>
>
> 2.
> postgres[64154]=# select pg_print_backtrace(64136);
> WARNING:  01000: PID 64136 is not a PostgreSQL server process
> LOCATION:  pg_print_backtrace, signalfuncs.c:335
>  pg_print_backtrace
> 
>  f
>
>
> For postmaster I am getting this WARNING "PID 64136 is not a
> PostgreSQL server process", even if we don't want to support this
> process I don't think this message is good.
>
>
>
> 3.
> I was looking into the patch, I tried to print the backtrace using GDB
> as well as using this function, I have complied in debug mode, I can
> see the backtrace printed
> by GDB is more detailed than printed by this API, I understand we can
> find out the function name from address, but can't we print the
> detailed call stack with all function names at least when debug
> symbols are available?
>
> Using GDB
>
> #0  0x7fa26c527e93 in __epoll_wait_nocancel () from
> #1  0x00947a61 in WaitEventSetWaitBlock (set=0x2
> #2  0x009478f9 in WaitEventSetWait (set=0x2f0111
> #3  0x007a6cef in secure_read (port=0x2f26800, p
> #4  0x007b0bd6 in pq_recvbuf () at pqcomm.c:957
> #5  0x007b0c86 in pq_getbyte () at pqcomm.c:1000
> #6  0x00978c13 in SocketBackend (inBuf=0x7ffea99
> #7  0x00978e37 in ReadCommand (inBuf=0x7ffea9937
> #8  0x0097dca5 in PostgresMain (dbname=0x2f2ef40
> 
>
> Using pg_print_backtrace
> postgres: dilipkumar postgres [local]
> SELECT(set_backtrace+0x38) [0xb118ff]
> postgres: dilipkumar postgres [local]
> SELECT(ProcessPrintBacktraceInterrupt+0x5b) [0x94fe42]
> postgres: dilipkumar postgres [local]
> SELECT(ProcessInterrupts+0x7d9) [0x97cb2a]
> postgres: dilipkumar postgres [local] SELECT() [0x78143c]
> postgres: dilipkumar postgres [local] SELECT() [0x736731]
> postgres: dilipkumar postgres [local] SELECT() [0x738f5f]
> postgres: dilipkumar postgres [local]
> SELECT(standard_ExecutorRun+0x1d6) [0x736d94]
> postgres: dilipkumar postgres [local] SELECT(ExecutorRun+0x55)
> [0x736bbc]
> postgres: dilipkumar postgres [local] SELECT() [0x97ff0c]
> postgres: dilipkumar postgres [local] SELECT(PortalRun+0x268) 
> [0x97fbbf]
> postgres: dilipkumar postgres [local] SELECT() [0x9798dc]
> postgres: dilipkumar postgres [local]
> SELECT(PostgresMain+0x6ca) [0x97dda7]

I did not find any API's with such an implementation. We have used
backtrace and backtrace_symbols to print the address. Same thing is
used to add error backtrace and print backtrace for assert failure,
see errbacktrace and ExceptionalCondition. I felt this is ok.

> 4.
> +WARNING:  backtrace generation is not supported by this installation
> + pg_print_backtrace
> +
> + f
>
> Should we give some hints on how to enable that?

Modified to include a hint message.
The Attached v13 patch has the fix for it.

Regards,
Vignesh
From 89f36d323c04c45dde450e78deeb76a726e146b5 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v13 1/2] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml   | 85 
 src/backend/catalog/system_functions.sql |  2 +
 src/backend/postmaster/autovacuum.c  |  4 +
 src/backend/storage/ipc/procsignal.c | 41 ++
 src/backend/storage/ipc/signalfuncs.c| 49 +++
 src/backend/tcop/postgres.c  |  4 +
 src/backend/utils/error/elog.c   | 20 -
 src/backend/utils/init/globals.c |  1 +
 src/include/catalog/catversion.h |  2 +-
 src/include/catalog/pg_proc.dat  |  5 ++
 src/include/miscadmin.h 

Re: Printing backtrace of postgres processes

2021-11-14 Thread Bharath Rupireddy
On Mon, Nov 15, 2021 at 12:08 PM Dilip Kumar  wrote:
> > > 2.
> > > postgres[64154]=# select pg_print_backtrace(64136);
> > > WARNING:  01000: PID 64136 is not a PostgreSQL server process
> > > LOCATION:  pg_print_backtrace, signalfuncs.c:335
> > >  pg_print_backtrace
> > > 
> > >  f
> > >
> > >
> > > For postmaster I am getting this WARNING "PID 64136 is not a
> > > PostgreSQL server process", even if we don't want to support this
> > > process I don't think this message is good.
> >
> > This is a generic message that is coming from pg_signal_backend, not
> > related to Vignesh's patch. I agree with you that emitting a "not
> > postgres server process" for the postmaster process which is the main
> > "postgres process" doesn't sound sensible. Please see there's already
> > a thread [1] and see the v1 patch [2] for changing this message.
> > Please let me know if you want me to revive that stalled thread?
>
> >[1] 
> >https://www.postgresql.org/message-id/CALj2ACW7Rr-R7mBcBQiXWPp%3DJV5chajjTdudLiF5YcpW-BmHhg%40mail.gmail.com
> >[2] 
> >https://www.postgresql.org/message-id/CALj2ACUGxedgYk-5nO8D2EJV2YHXnoycp_oqYAxDXTODhWkmkg%40mail.gmail.com
>
> Hmm, yeah I think I like the idea posted in [1], however, I could not
> open the link [2]

Thanks, I posted an updated v3 patch at [1]. Please review it there.

[1] - 
https://www.postgresql.org/message-id/CALj2ACWS2bJRW-bSvcoL4FvS%3DkbQ8SSWXi%3D9RFUt7uqZvTQWWw%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-14 Thread Bharath Rupireddy
On Mon, Nov 15, 2021 at 12:13 PM vignesh C  wrote:
>
> On Mon, Nov 15, 2021 at 11:00 AM Bharath Rupireddy
>  wrote:
> >
> > On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
> > > > 2) I think "which is enough because the target process for logging of
> > > > backtrace is a backend" isn't valid anymore with 0002, righit? Please
> > > > remove it.
> > > > + * to call this function if we see PrintBacktracePending set. It is 
> > > > called from
> > > > + * CHECK_FOR_INTERRUPTS() or from process specific interrupt handlers, 
> > > > which is
> > > > + * enough because the target process for logging of backtrace is a 
> > > > backend.
> > > >
> > > > > Thanks for the comments, v11 patch attached at [1] has the changes 
> > > > > for the same.
> > >
> > > Modified
> >
> > I don't see the above change in v12. Am I missing something? I still
> > see the comment "It is called from CHECK_FOR_INTERRUPTS(), which is
> > enough because the target process for logging of backtrace is a
> > backend.".
>
> This change is present in the 0002
> (v12-0002-pg_print_backtrace-support-for-printing-backtrac.patch)
> patch.

Thanks. Yes, it was removed in 0002. I missed it.

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-14 Thread vignesh C
On Mon, Nov 15, 2021 at 11:00 AM Bharath Rupireddy
 wrote:
>
> On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
> > > 2) I think "which is enough because the target process for logging of
> > > backtrace is a backend" isn't valid anymore with 0002, righit? Please
> > > remove it.
> > > + * to call this function if we see PrintBacktracePending set. It is 
> > > called from
> > > + * CHECK_FOR_INTERRUPTS() or from process specific interrupt handlers, 
> > > which is
> > > + * enough because the target process for logging of backtrace is a 
> > > backend.
> > >
> > > > Thanks for the comments, v11 patch attached at [1] has the changes for 
> > > > the same.
> >
> > Modified
>
> I don't see the above change in v12. Am I missing something? I still
> see the comment "It is called from CHECK_FOR_INTERRUPTS(), which is
> enough because the target process for logging of backtrace is a
> backend.".

This change is present in the 0002
(v12-0002-pg_print_backtrace-support-for-printing-backtrac.patch)
patch.

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-11-14 Thread Dilip Kumar
On Mon, Nov 15, 2021 at 11:53 AM Bharath Rupireddy
 wrote:
>
> On Mon, Nov 15, 2021 at 11:37 AM Dilip Kumar  wrote:
> >
> > On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
> >
> > >
> > > Thanks for the comments, the attached v12 patch has the changes for the 
> > > same.
> >
> > I have reviewed this patch and have some comments on v12-0001,
> >
> > 1.
> > +This feature is not supported for the postmaster, logger, 
> > checkpointer,
> > +walwriter, background writer or statistics collector process. This
> >
> >
> > Comment says it is not supported for postmaster, logger, checkpointer
> > etc, but I just tried and it is working for checkpointer and walwriter
> > processes, can you explain in comments why do we not want to support
> > for these processes?  or the comment is old and now we are supporting
> > for some of these processes.
>
> Please see the v12-0002 which will have the description modified.

Okay, now I see that.

> > 2.
> > postgres[64154]=# select pg_print_backtrace(64136);
> > WARNING:  01000: PID 64136 is not a PostgreSQL server process
> > LOCATION:  pg_print_backtrace, signalfuncs.c:335
> >  pg_print_backtrace
> > 
> >  f
> >
> >
> > For postmaster I am getting this WARNING "PID 64136 is not a
> > PostgreSQL server process", even if we don't want to support this
> > process I don't think this message is good.
>
> This is a generic message that is coming from pg_signal_backend, not
> related to Vignesh's patch. I agree with you that emitting a "not
> postgres server process" for the postmaster process which is the main
> "postgres process" doesn't sound sensible. Please see there's already
> a thread [1] and see the v1 patch [2] for changing this message.
> Please let me know if you want me to revive that stalled thread?

>[1] 
>https://www.postgresql.org/message-id/CALj2ACW7Rr-R7mBcBQiXWPp%3DJV5chajjTdudLiF5YcpW-BmHhg%40mail.gmail.com
>[2] 
>https://www.postgresql.org/message-id/CALj2ACUGxedgYk-5nO8D2EJV2YHXnoycp_oqYAxDXTODhWkmkg%40mail.gmail.com

Hmm, yeah I think I like the idea posted in [1], however, I could not
open the link [2]

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-11-14 Thread Bharath Rupireddy
On Mon, Nov 15, 2021 at 11:37 AM Dilip Kumar  wrote:
>
> On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
>
> >
> > Thanks for the comments, the attached v12 patch has the changes for the 
> > same.
>
> I have reviewed this patch and have some comments on v12-0001,
>
> 1.
> +This feature is not supported for the postmaster, logger, 
> checkpointer,
> +walwriter, background writer or statistics collector process. This
>
>
> Comment says it is not supported for postmaster, logger, checkpointer
> etc, but I just tried and it is working for checkpointer and walwriter
> processes, can you explain in comments why do we not want to support
> for these processes?  or the comment is old and now we are supporting
> for some of these processes.

Please see the v12-0002 which will have the description modified.

> 2.
> postgres[64154]=# select pg_print_backtrace(64136);
> WARNING:  01000: PID 64136 is not a PostgreSQL server process
> LOCATION:  pg_print_backtrace, signalfuncs.c:335
>  pg_print_backtrace
> 
>  f
>
>
> For postmaster I am getting this WARNING "PID 64136 is not a
> PostgreSQL server process", even if we don't want to support this
> process I don't think this message is good.

This is a generic message that is coming from pg_signal_backend, not
related to Vignesh's patch. I agree with you that emitting a "not
postgres server process" for the postmaster process which is the main
"postgres process" doesn't sound sensible. Please see there's already
a thread [1] and see the v1 patch [2] for changing this message.
Please let me know if you want me to revive that stalled thread?

[1] 
https://www.postgresql.org/message-id/CALj2ACW7Rr-R7mBcBQiXWPp%3DJV5chajjTdudLiF5YcpW-BmHhg%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CALj2ACUGxedgYk-5nO8D2EJV2YHXnoycp_oqYAxDXTODhWkmkg%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-14 Thread Dilip Kumar
On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:

>
> Thanks for the comments, the attached v12 patch has the changes for the same.

I have reviewed this patch and have some comments on v12-0001,

1.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This


Comment says it is not supported for postmaster, logger, checkpointer
etc, but I just tried and it is working for checkpointer and walwriter
processes, can you explain in comments why do we not want to support
for these processes?  or the comment is old and now we are supporting
for some of these processes.


2.
postgres[64154]=# select pg_print_backtrace(64136);
WARNING:  01000: PID 64136 is not a PostgreSQL server process
LOCATION:  pg_print_backtrace, signalfuncs.c:335
 pg_print_backtrace

 f


For postmaster I am getting this WARNING "PID 64136 is not a
PostgreSQL server process", even if we don't want to support this
process I don't think this message is good.



3.
I was looking into the patch, I tried to print the backtrace using GDB
as well as using this function, I have complied in debug mode, I can
see the backtrace printed
by GDB is more detailed than printed by this API, I understand we can
find out the function name from address, but can't we print the
detailed call stack with all function names at least when debug
symbols are available?

Using GDB

#0  0x7fa26c527e93 in __epoll_wait_nocancel () from
#1  0x00947a61 in WaitEventSetWaitBlock (set=0x2
#2  0x009478f9 in WaitEventSetWait (set=0x2f0111
#3  0x007a6cef in secure_read (port=0x2f26800, p
#4  0x007b0bd6 in pq_recvbuf () at pqcomm.c:957
#5  0x007b0c86 in pq_getbyte () at pqcomm.c:1000
#6  0x00978c13 in SocketBackend (inBuf=0x7ffea99
#7  0x00978e37 in ReadCommand (inBuf=0x7ffea9937
#8  0x0097dca5 in PostgresMain (dbname=0x2f2ef40


Using pg_print_backtrace
postgres: dilipkumar postgres [local]
SELECT(set_backtrace+0x38) [0xb118ff]
postgres: dilipkumar postgres [local]
SELECT(ProcessPrintBacktraceInterrupt+0x5b) [0x94fe42]
postgres: dilipkumar postgres [local]
SELECT(ProcessInterrupts+0x7d9) [0x97cb2a]
postgres: dilipkumar postgres [local] SELECT() [0x78143c]
postgres: dilipkumar postgres [local] SELECT() [0x736731]
postgres: dilipkumar postgres [local] SELECT() [0x738f5f]
postgres: dilipkumar postgres [local]
SELECT(standard_ExecutorRun+0x1d6) [0x736d94]
postgres: dilipkumar postgres [local] SELECT(ExecutorRun+0x55)
[0x736bbc]
postgres: dilipkumar postgres [local] SELECT() [0x97ff0c]
postgres: dilipkumar postgres [local] SELECT(PortalRun+0x268) [0x97fbbf]
postgres: dilipkumar postgres [local] SELECT() [0x9798dc]
postgres: dilipkumar postgres [local]
SELECT(PostgresMain+0x6ca) [0x97dda7]

4.
+WARNING:  backtrace generation is not supported by this installation
+ pg_print_backtrace
+
+ f

Should we give some hints on how to enable that?


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-11-14 Thread Bharath Rupireddy
On Mon, Nov 15, 2021 at 10:34 AM vignesh C  wrote:
> > 2) I think "which is enough because the target process for logging of
> > backtrace is a backend" isn't valid anymore with 0002, righit? Please
> > remove it.
> > + * to call this function if we see PrintBacktracePending set. It is called 
> > from
> > + * CHECK_FOR_INTERRUPTS() or from process specific interrupt handlers, 
> > which is
> > + * enough because the target process for logging of backtrace is a backend.
> >
> > > Thanks for the comments, v11 patch attached at [1] has the changes for 
> > > the same.
>
> Modified

I don't see the above change in v12. Am I missing something? I still
see the comment "It is called from CHECK_FOR_INTERRUPTS(), which is
enough because the target process for logging of backtrace is a
backend.".

+ * ProcessPrintBacktraceInterrupt - Perform logging of backtrace of this
+ * backend process.
+ *
+ * Any backend that participates in ProcSignal signaling must arrange
+ * to call this function if we see PrintBacktracePending set.
+ * It is called from CHECK_FOR_INTERRUPTS(), which is enough because
+ * the target process for logging of backtrace is a backend.
+ */
+void
+ProcessPrintBacktraceInterrupt(void)

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-14 Thread vignesh C
On Mon, Nov 15, 2021 at 7:37 AM Bharath Rupireddy
 wrote:
>
> On Sun, Nov 14, 2021 at 8:49 PM vignesh C  wrote:
> > > 7) Do we need TAP tests for this function? I think it is sufficient to
> > > test the function in misc_functions.sql, please remove
> > > 002_print_backtrace_validation.pl. Note that we don't do similar TAP
> > > testing for pg_log_backend_memory_contexts as well.
> >
> > I felt let's keep this test case, all the other tests just check if it
> > returns true or false, it does not checks for the contents in the
> > logfile. This is the only test which checks the logfile.
>
>  I still don't agree to have test cases within a new file
> 002_print_backtrace_validation.pl. I feel this test case doesn't add
> value because the code coverage is done by .sql test cases and .pl
> just ensures the backtrace appears in the server logs. I don't think
> we ever need a new file for this purpose. If this is the case, then
> there are other functions like pg_log_backend_memory_contexts  or
> pg_log_query_plan (in progress thread) might add the same test files
> for the same reasons which make the TAP tests i.e. "make check-world"
> to take longer times. Moreover, pg_log_backend_memory_contexts  has
> been committed without having a TAP test case.
>
> I think we can remove it.

Removed

> Few more comments on v11:
> 1) I think we can improve here by adding a link to "backend" as well,
> I will modify it in the other thread.
> +Requests to log the backtrace of the backend or the
> +WAL sender or
> Something like:
> +Requests to log the backtrace of the  linkend="glossary-backend">backend or the
> +WAL sender or

Modified

> 2) I think "which is enough because the target process for logging of
> backtrace is a backend" isn't valid anymore with 0002, righit? Please
> remove it.
> + * to call this function if we see PrintBacktracePending set. It is called 
> from
> + * CHECK_FOR_INTERRUPTS() or from process specific interrupt handlers, which 
> is
> + * enough because the target process for logging of backtrace is a backend.
>
> > Thanks for the comments, v11 patch attached at [1] has the changes for the 
> > same.

Modified

Thanks for the comments, the attached v12 patch has the changes for the same.

Regards,
Vignesh
From c34d0114833219ef7b1b2d4283522ce9d4e6ffe3 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v12 1/2] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml   | 83 
 src/backend/catalog/system_functions.sql |  2 +
 src/backend/postmaster/autovacuum.c  |  4 +
 src/backend/storage/ipc/procsignal.c | 41 ++
 src/backend/storage/ipc/signalfuncs.c| 48 +++
 src/backend/tcop/postgres.c  |  4 +
 src/backend/utils/error/elog.c   | 20 -
 src/backend/utils/init/globals.c |  1 +
 src/include/catalog/catversion.h |  2 +-
 src/include/catalog/pg_proc.dat  |  5 ++
 src/include/miscadmin.h  |  1 +
 src/include/storage/procsignal.h |  3 +-
 src/include/utils/elog.h |  2 +
 src/test/regress/expected/misc_functions.out | 42 ++
 src/test/regress/sql/misc_functions.sql  | 31 
 15 files changed, 283 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..8d147825eb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,32 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the
+backend
+with the specified process ID. The backtrace will be logged at message
+level LOG. It will appear in the server log based on
+the log configuration set (See 
+for more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if 

Re: Printing backtrace of postgres processes

2021-11-14 Thread Bharath Rupireddy
On Sun, Nov 14, 2021 at 8:49 PM vignesh C  wrote:
> > 7) Do we need TAP tests for this function? I think it is sufficient to
> > test the function in misc_functions.sql, please remove
> > 002_print_backtrace_validation.pl. Note that we don't do similar TAP
> > testing for pg_log_backend_memory_contexts as well.
>
> I felt let's keep this test case, all the other tests just check if it
> returns true or false, it does not checks for the contents in the
> logfile. This is the only test which checks the logfile.

 I still don't agree to have test cases within a new file
002_print_backtrace_validation.pl. I feel this test case doesn't add
value because the code coverage is done by .sql test cases and .pl
just ensures the backtrace appears in the server logs. I don't think
we ever need a new file for this purpose. If this is the case, then
there are other functions like pg_log_backend_memory_contexts  or
pg_log_query_plan (in progress thread) might add the same test files
for the same reasons which make the TAP tests i.e. "make check-world"
to take longer times. Moreover, pg_log_backend_memory_contexts  has
been committed without having a TAP test case.

I think we can remove it.

Few more comments on v11:
1) I think we can improve here by adding a link to "backend" as well,
I will modify it in the other thread.
+Requests to log the backtrace of the backend or the
+WAL sender or
Something like:
+Requests to log the backtrace of the backend or the
+WAL sender or

2) I think "which is enough because the target process for logging of
backtrace is a backend" isn't valid anymore with 0002, righit? Please
remove it.
+ * to call this function if we see PrintBacktracePending set. It is called from
+ * CHECK_FOR_INTERRUPTS() or from process specific interrupt handlers, which is
+ * enough because the target process for logging of backtrace is a backend.

> Thanks for the comments, v11 patch attached at [1] has the changes for the 
> same.

The v11 patches mostly look good to me except the above comments.

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-14 Thread vignesh C
On Fri, Nov 12, 2021 at 6:11 PM Bharath Rupireddy
 wrote:
>
> On Fri, Nov 12, 2021 at 5:15 PM Bharath Rupireddy
>  wrote:
> >
> > On Thu, Nov 11, 2021 at 12:14 PM vignesh C  wrote:
> > > Thanks for the comments, the attached v10 patch has the fixes for the 
> > > same.
> >
> > Thanks for the patches. Here are some comments:
> >
> > 1) In the docs, let's have the similar description of
> > pg_log_backend_memory_contexts for pg_print_backtrace, just for the
> > continuity in the users readability.
> >
> > 2) I don't know how the  part looks like in the Server
> > Signaling Functions table. I think here you can just say, it will emit
> > a warning and return false if not supported by the installation. And
> > you can give the  part after the description where you are
> > showing a sample backtrace.
> >
> > +capture backtrace. If not available, the function will return false
> > +and a warning is issued, for example:
> > +
> > +WARNING:  backtrace generation is not supported by this installation
> > + pg_print_backtrace
> > +
> > + f
> > +
> > +   
> > +  
> >
> > 3) Replace '!' with '.'.
> > + * Note: this is called within a signal handler!  All we can do is set
> >
> > 4) It is not only the next CFI but also the process specific interrupt
> > handlers (in your 0002 patch) right?
> > + * a flag that will cause the next CHECK_FOR_INTERRUPTS to invoke
> >
> > 5) I think you need to update CATALOG_VERSION_NO, mostly the committer
> > will take care of it but just in case.
> >
> > 6) Be consistent with casing "Verify" and "Might"
> > +# Verify that log output gets to the file
> > +# might need to retry if logging collector process is slow...
>
> Few more comments:
>
> 7) Do we need TAP tests for this function? I think it is sufficient to
> test the function in misc_functions.sql, please remove
> 002_print_backtrace_validation.pl. Note that we don't do similar TAP
> testing for pg_log_backend_memory_contexts as well.

I felt let's keep this test case, all the other tests just check if it
returns true or false, it does not checks for the contents in the
logfile. This is the only test which checks the logfile.

> 8) I hope you have manually tested the pg_print_backtrace for the
> startup process and other auxiliary processes.

Yes, I have checked them manually.

> 9) I think we can have a similar description (to the patch [1]). with
> links to each process definition in the glossary so that it will be
> easier for the users to follow the links and know what each process
> is. Especially, I didn't like the 0002 mentioned about the
> BackendPidGetProc or AuxiliaryPidGetProc as these are internal to the
> server and the users don't care about them.
>
> - * end on its own first and its backtrace are not logged is not a problem.
> + * BackendPidGetProc or AuxiliaryPidGetProc returns NULL if the pid isn't
> + * valid; but by the time we reach kill(), a process for which we get a
> + * valid proc here might have terminated on its own.  There's no way to
> + * acquire a lock on an arbitrary process to prevent that. But since this
> + * mechanism is usually used to debug a backend running and consuming lots
> + * of CPU cycles, that it might end on its own first and its backtrace are
> + * not logged is not a problem.
>   */
>
> Here's what I have written in the other patch [1], if okay please use this:
>
> +Requests to log memory contexts of the backend or the
> +WAL sender or
> +the auxiliary
> process
> +with the specified process ID. All of the
> +auxiliary
> processes
> +are supported except the  linkend="glossary-logger">logger
> +and the  linkend="glossary-stats-collector">statistics collector
> +as they are not connected to shared memory the function can
> not make requests.
> +These memory contexts will be logged at
> LOG message level.
> +They will appear in the server log based on the log configuration set
>  (See  for more information),

I had mentioned BackendPidGetProc or AuxiliaryPidGetProc as comments
in the function, but I have not changed that. I have changed the
documentation similar to your patch.

Thanks for the comments, v11 patch attached at [1] has the changes for the same.
[1] - 
https://www.postgresql.org/message-id/CALDaNm3BYGOG3-PQvYbWkB%3DG3h1KYJ8CO8UYbzfECH4DYGMGqA%40mail.gmail.com

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-11-14 Thread vignesh C
On Fri, Nov 12, 2021 at 5:15 PM Bharath Rupireddy
 wrote:
>
> On Thu, Nov 11, 2021 at 12:14 PM vignesh C  wrote:
> > Thanks for the comments, the attached v10 patch has the fixes for the same.
>
> Thanks for the patches. Here are some comments:
>
> 1) In the docs, let's have the similar description of
> pg_log_backend_memory_contexts for pg_print_backtrace, just for the
> continuity in the users readability.

I have kept some contents of the description similar. There is some
additional information to explain more about the functionality. I felt
that will help the user to understand more about the feature.

> 2) I don't know how the  part looks like in the Server
> Signaling Functions table. I think here you can just say, it will emit
> a warning and return false if not supported by the installation. And
> you can give the  part after the description where you are
> showing a sample backtrace.
>
> +capture backtrace. If not available, the function will return false
> +and a warning is issued, for example:
> +
> +WARNING:  backtrace generation is not supported by this installation
> + pg_print_backtrace
> +
> + f
> +
> +   
> +  

Modified

> 3) Replace '!' with '.'.
> + * Note: this is called within a signal handler!  All we can do is set

I have changed it similar to HandleLogMemoryContextInterrupt

> 4) It is not only the next CFI but also the process specific interrupt
> handlers (in your 0002 patch) right?
> + * a flag that will cause the next CHECK_FOR_INTERRUPTS to invoke

Modified

> 5) I think you need to update CATALOG_VERSION_NO, mostly the committer
> will take care of it but just in case.

Modified

> 6) Be consistent with casing "Verify" and "Might"
> +# Verify that log output gets to the file
> +# might need to retry if logging collector process is slow...

Modified

The attached v11 patch has the changes for the same.

Regards,
Vignesh
From c29bafb470a367bd4f790f6f50edc3e8002f46c7 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v11 1/2] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml| 82 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 +
 src/backend/storage/ipc/procsignal.c  | 41 ++
 src/backend/storage/ipc/signalfuncs.c | 48 +++
 src/backend/tcop/postgres.c   |  4 +
 src/backend/utils/error/elog.c| 20 -
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/catversion.h  |  2 +-
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 .../t/002_print_backtrace_validation.pl   | 73 +
 src/test/regress/expected/misc_functions.out  | 42 ++
 src/test/regress/sql/misc_functions.sql   | 31 +++
 16 files changed, 355 insertions(+), 6 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_print_backtrace_validation.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..620e6ee54a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,31 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Requests to log the backtrace of the backend with the specified process
+ID. The backtrace will be logged at message level
+LOG. It will appear in the server log based on the
+log configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will emit a warning
+and return false.
+   
+  
+
   

 
@@ -25458,6 

Re: Printing backtrace of postgres processes

2021-11-12 Thread Bharath Rupireddy
On Fri, Nov 12, 2021 at 5:15 PM Bharath Rupireddy
 wrote:
>
> On Thu, Nov 11, 2021 at 12:14 PM vignesh C  wrote:
> > Thanks for the comments, the attached v10 patch has the fixes for the same.
>
> Thanks for the patches. Here are some comments:
>
> 1) In the docs, let's have the similar description of
> pg_log_backend_memory_contexts for pg_print_backtrace, just for the
> continuity in the users readability.
>
> 2) I don't know how the  part looks like in the Server
> Signaling Functions table. I think here you can just say, it will emit
> a warning and return false if not supported by the installation. And
> you can give the  part after the description where you are
> showing a sample backtrace.
>
> +capture backtrace. If not available, the function will return false
> +and a warning is issued, for example:
> +
> +WARNING:  backtrace generation is not supported by this installation
> + pg_print_backtrace
> +
> + f
> +
> +   
> +  
>
> 3) Replace '!' with '.'.
> + * Note: this is called within a signal handler!  All we can do is set
>
> 4) It is not only the next CFI but also the process specific interrupt
> handlers (in your 0002 patch) right?
> + * a flag that will cause the next CHECK_FOR_INTERRUPTS to invoke
>
> 5) I think you need to update CATALOG_VERSION_NO, mostly the committer
> will take care of it but just in case.
>
> 6) Be consistent with casing "Verify" and "Might"
> +# Verify that log output gets to the file
> +# might need to retry if logging collector process is slow...

Few more comments:

7) Do we need TAP tests for this function? I think it is sufficient to
test the function in misc_functions.sql, please remove
002_print_backtrace_validation.pl. Note that we don't do similar TAP
testing for pg_log_backend_memory_contexts as well.

8) I hope you have manually tested the pg_print_backtrace for the
startup process and other auxiliary processes.

9) I think we can have a similar description (to the patch [1]). with
links to each process definition in the glossary so that it will be
easier for the users to follow the links and know what each process
is. Especially, I didn't like the 0002 mentioned about the
BackendPidGetProc or AuxiliaryPidGetProc as these are internal to the
server and the users don't care about them.

- * end on its own first and its backtrace are not logged is not a problem.
+ * BackendPidGetProc or AuxiliaryPidGetProc returns NULL if the pid isn't
+ * valid; but by the time we reach kill(), a process for which we get a
+ * valid proc here might have terminated on its own.  There's no way to
+ * acquire a lock on an arbitrary process to prevent that. But since this
+ * mechanism is usually used to debug a backend running and consuming lots
+ * of CPU cycles, that it might end on its own first and its backtrace are
+ * not logged is not a problem.
  */

Here's what I have written in the other patch [1], if okay please use this:

+Requests to log memory contexts of the backend or the
+WAL sender or
+the auxiliary
process
+with the specified process ID. All of the
+auxiliary
processes
+are supported except the logger
+and the statistics collector
+as they are not connected to shared memory the function can
not make requests.
+These memory contexts will be logged at
LOG message level.
+They will appear in the server log based on the log configuration set
 (See  for more information),

I have no further comments on v10.

[1] - 
https://www.postgresql.org/message-id/CALj2ACU1nBzpacOK2q%3Da65S_4%2BOaz_rLTsU1Ri0gf7YUmnmhfQ%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-12 Thread Bharath Rupireddy
On Thu, Nov 11, 2021 at 12:14 PM vignesh C  wrote:
> Thanks for the comments, the attached v10 patch has the fixes for the same.

Thanks for the patches. Here are some comments:

1) In the docs, let's have the similar description of
pg_log_backend_memory_contexts for pg_print_backtrace, just for the
continuity in the users readability.

2) I don't know how the  part looks like in the Server
Signaling Functions table. I think here you can just say, it will emit
a warning and return false if not supported by the installation. And
you can give the  part after the description where you are
showing a sample backtrace.

+capture backtrace. If not available, the function will return false
+and a warning is issued, for example:
+
+WARNING:  backtrace generation is not supported by this installation
+ pg_print_backtrace
+
+ f
+
+   
+  

3) Replace '!' with '.'.
+ * Note: this is called within a signal handler!  All we can do is set

4) It is not only the next CFI but also the process specific interrupt
handlers (in your 0002 patch) right?
+ * a flag that will cause the next CHECK_FOR_INTERRUPTS to invoke

5) I think you need to update CATALOG_VERSION_NO, mostly the committer
will take care of it but just in case.

6) Be consistent with casing "Verify" and "Might"
+# Verify that log output gets to the file
+# might need to retry if logging collector process is slow...


Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-10 Thread vignesh C
On Wed, Nov 10, 2021 at 12:17 PM Bharath Rupireddy
 wrote:
>
> On Tue, Nov 9, 2021 at 4:45 PM vignesh C  wrote:
> > Thanks for reporting this, the attached v9 patch has the changes for the 
> > same.
>
> Thanks for the v9 patch. I have some comments:
>
> 1) I think we are moving away from if (!superuser()) checks, see the
> commit [1]. The goal is to let the GRANT-REVOKE system deal with who
> is supposed to run these system functions. Since pg_print_backtrace
> also writes the info to server logs,

Modified

> 2) I think we need to have LOG_SERVER_ONLY instead of LOG to avoid
> bakctrace being sent to the connected client. This will be good from
> security perspective as well since we don't send backtrace over the
> wire to the client.
> + PrintBacktracePending = false;
> + ereport(LOG,
> + (errmsg("logging backtrace of PID %d", MyProcPid)));
>
> for pg_log_backend_memory_contexts:
> +   /*
> +* Use LOG_SERVER_ONLY to prevent the memory contexts
> from being sent
> +* to the connected client.
> +*
> +* We don't buffer the information about all memory
> contexts in a
> +* backend into StringInfo and log it as one message.
> Otherwise which
> +* may require the buffer to be enlarged very much and
> lead to OOM
> +* error since there can be a large number of memory
> contexts in a
> +* backend. Instead, we log one message per memory context.
> +*/
> +   ereport(LOG_SERVER_ONLY,

Modified

> 3) I think we need to extend this function to the auxiliary processes
> too, because users might be interested to see what these processes are
> doing and where they are currently stuck via their backtraces, see the
> proposal for pg_log_backend_memory_contexts at [2]. I think you need
> to add below code in couple of other places such as
> HandleCheckpointerInterrupts, HandleMainLoopInterrupts,
> HandlePgArchInterrupts, HandleStartupProcInterrupts,
> HandleWalWriterInterrupts.
>
> + /* Process printing backtrace */
> + if (PrintBacktracePending)
> + ProcessPrintBacktraceInterrupt();

Created 0002 patch to handle this.

Thanks for the comments, the attached v10 patch has the fixes for the same.

Regards,
Vignesh
From b8f4a48f3f8c95f15f596b4ad77f7a90d3ea376d Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v10 1/2] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml| 81 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 +
 src/backend/storage/ipc/procsignal.c  | 41 ++
 src/backend/storage/ipc/signalfuncs.c | 48 +++
 src/backend/tcop/postgres.c   |  4 +
 src/backend/utils/error/elog.c| 20 -
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 .../t/002_print_backtrace_validation.pl   | 73 +
 src/test/regress/expected/misc_functions.out  | 42 ++
 src/test/regress/sql/misc_functions.sql   | 31 +++
 15 files changed, 353 insertions(+), 5 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_print_backtrace_validation.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..c9e6e470f3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,37 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Send a request to the backend with the specified process ID to log its
+backtrace. The backtrace will be logged at message level
+LOG. It will appear in the server log based on the
+log configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, 

Re: Printing backtrace of postgres processes

2021-11-09 Thread Bharath Rupireddy
On Tue, Nov 9, 2021 at 4:45 PM vignesh C  wrote:
> Thanks for reporting this, the attached v9 patch has the changes for the same.

Thanks for the v9 patch. I have some comments:

1) I think we are moving away from if (!superuser()) checks, see the
commit [1]. The goal is to let the GRANT-REVOKE system deal with who
is supposed to run these system functions. Since pg_print_backtrace
also writes the info to server logs,

2) I think we need to have LOG_SERVER_ONLY instead of LOG to avoid
bakctrace being sent to the connected client. This will be good from
security perspective as well since we don't send backtrace over the
wire to the client.
+ PrintBacktracePending = false;
+ ereport(LOG,
+ (errmsg("logging backtrace of PID %d", MyProcPid)));

for pg_log_backend_memory_contexts:
+   /*
+* Use LOG_SERVER_ONLY to prevent the memory contexts
from being sent
+* to the connected client.
+*
+* We don't buffer the information about all memory
contexts in a
+* backend into StringInfo and log it as one message.
Otherwise which
+* may require the buffer to be enlarged very much and
lead to OOM
+* error since there can be a large number of memory
contexts in a
+* backend. Instead, we log one message per memory context.
+*/
+   ereport(LOG_SERVER_ONLY,

3) I think we need to extend this function to the auxiliary processes
too, because users might be interested to see what these processes are
doing and where they are currently stuck via their backtraces, see the
proposal for pg_log_backend_memory_contexts at [2]. I think you need
to add below code in couple of other places such as
HandleCheckpointerInterrupts, HandleMainLoopInterrupts,
HandlePgArchInterrupts, HandleStartupProcInterrupts,
HandleWalWriterInterrupts.

+ /* Process printing backtrace */
+ if (PrintBacktracePending)
+ ProcessPrintBacktraceInterrupt();

[1] commit f0b051e322d530a340e62f2ae16d99acdbcb3d05
Author: Jeff Davis 
Date:   Tue Oct 26 13:13:52 2021 -0700

Allow GRANT on pg_log_backend_memory_contexts().

Remove superuser check, allowing any user granted permissions on
pg_log_backend_memory_contexts() to log the memory contexts of any
backend.

Note that this could allow a privileged non-superuser to log the
memory contexts of a superuser backend, but as discussed, that does
not seem to be a problem.

Reviewed-by: Nathan Bossart, Bharath Rupireddy, Michael Paquier,
Kyotaro Horiguchi, Andres Freund
Discussion:
https://postgr.es/m/e5cf6684d17c8d1ef4904ae248605ccd6da03e72.ca...@j-davis.com

[2] 
https://www.postgresql.org/message-id/CALj2ACU1nBzpacOK2q%3Da65S_4%2BOaz_rLTsU1Ri0gf7YUmnmhfQ%40mail.gmail.com

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-11-09 Thread vignesh C
On Tue, Oct 12, 2021 at 10:47 AM bt21tanigaway
 wrote:
>
> Hi,
>
> > The previous patch was failing because of the recent test changes made
> > by commit 201a76183e2 which unified new and get_new_node, attached
> > patch has the changes to handle the changes accordingly.
> > Thanks for your update!
> > I have two comments.
>
> 1.Do we need “set_backtrace(NULL, 0);” on “HandleMainLoopInterrupts()”?
> I could observe that it works correctly without this. It is written on
> “HandleAutoVacLauncherInterrupts” as well, but I think it is necessary
> to prevent delays as well as [1].

I have removed this from HandleMainLoopInterrupts

> 2.The patch seems to forget to handle
> “ereport(LOG,(errmsg("logging backtrace of PID %d", MyProcPid)));” on
> “HandleAutoVacLauncherInterrupts” and “HandleMainLoopInterrupts()”.
> I think it should be the same as the process on “ProcessInterrupts()”.

I have create ProcessPrintBacktraceInterrupt which has the
implementation and is called wherever required. It is handled now.

> 3.How about creating a new function.
> Since the same process is on three functions( “ProcessInterrupts()”,
> “HandleAutoVacLauncherInterrupts”, “HandleMainLoopInterrupts()” ), I
> think it’s good to create a new function.

I have created ProcessPrintBacktraceInterrupt to handle it.

Thanks for the comments, v9 patch attached at [1] has the changes for the same.
[1] - 
https://www.postgresql.org/message-id/CALDaNm3MGVP_WK1Uuf%3DBiAJ9PeVOfciwLy0mrFA1JNbRp99VOQ%40mail.gmail.com

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-11-09 Thread vignesh C
On Thu, Nov 4, 2021 at 4:06 PM Daniel Gustafsson  wrote:
>
> > On 26 Aug 2021, at 16:56, vignesh C  wrote:
>
> > The previous patch was failing because of the recent test changes made
> > by commit 201a76183e2 which unified new and get_new_node, attached
> > patch has the changes to handle the changes accordingly.
>
> This patch now fails because of the test changes made by commit b3b4d8e68a,
> please submit a rebase.

Thanks for reporting this, the attached v9 patch has the changes for the same.

Regards,
Vignesh
From 68a694c147a530ae2e964a126b6f36152ca14cd0 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 9 Nov 2021 16:28:03 +0530
Subject: [PATCH v9] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml| 81 +++
 src/backend/postmaster/autovacuum.c   |  4 +
 src/backend/storage/ipc/procsignal.c  | 36 +
 src/backend/storage/ipc/signalfuncs.c | 46 +++
 src/backend/tcop/postgres.c   |  4 +
 src/backend/utils/error/elog.c| 20 -
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 .../t/002_print_backtrace_validation.pl   | 73 +
 12 files changed, 271 insertions(+), 5 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_print_backtrace_validation.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 24447c0017..c9e6e470f3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25345,6 +25345,37 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Send a request to the backend with the specified process ID to log its
+backtrace. The backtrace will be logged at message level
+LOG. It will appear in the server log based on the
+log configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will return false
+and a warning is issued, for example:
+
+WARNING:  backtrace generation is not supported by this installation
+ pg_print_backtrace 
+
+ f
+
+   
+  
+
   

 
@@ -25458,6 +25489,56 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_print_backtrace can be used to print backtrace of
+a backend process. For example:
+
+postgres=# select pg_print_backtrace(pg_backend_pid());
+ pg_print_backtrace
+
+ t
+(1 row)
+
+The backtrace will be logged to the log file if logging is enabled, if logging
+is disabled backtrace will be logged to the console where the postmaster was
+started. For example:
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: postgresdba postgres [local] SELECT(ProcessInterrupts+0x788) [0x950c34]
+postgres: postgresdba postgres [local] SELECT() [0x761e89]
+postgres: postgresdba postgres [local] SELECT() [0x71bbda]
+postgres: postgresdba postgres [local] SELECT() [0x71e380]
+postgres: postgresdba postgres [local] SELECT(standard_ExecutorRun+0x1d6) [0x71c1fe]
+postgres: postgresdba postgres [local] SELECT(ExecutorRun+0x55) [0x71c026]
+postgres: postgresdba postgres [local] SELECT() [0x953fc5]
+postgres: postgresdba postgres [local] SELECT(PortalRun+0x262) [0x953c7e]
+postgres: postgresdba postgres [local] SELECT() [0x94db78]
+postgres: postgresdba postgres [local] SELECT(PostgresMain+0x7d7) [0x951e72]
+

Re: Printing backtrace of postgres processes

2021-11-04 Thread Daniel Gustafsson
> On 26 Aug 2021, at 16:56, vignesh C  wrote:

> The previous patch was failing because of the recent test changes made
> by commit 201a76183e2 which unified new and get_new_node, attached
> patch has the changes to handle the changes accordingly.

This patch now fails because of the test changes made by commit b3b4d8e68a,
please submit a rebase.

--
Daniel Gustafsson   https://vmware.com/





Re: Printing backtrace of postgres processes

2021-10-11 Thread bt21tanigaway

Hi,


The previous patch was failing because of the recent test changes made
by commit 201a76183e2 which unified new and get_new_node, attached
patch has the changes to handle the changes accordingly.
Thanks for your update!
I have two comments.


1.Do we need “set_backtrace(NULL, 0);” on “HandleMainLoopInterrupts()”?
I could observe that it works correctly without this. It is written on 
“HandleAutoVacLauncherInterrupts” as well, but I think it is necessary 
to prevent delays as well as [1].


2.The patch seems to forget to handle
“ereport(LOG,(errmsg("logging backtrace of PID %d", MyProcPid)));” on 
“HandleAutoVacLauncherInterrupts” and “HandleMainLoopInterrupts()”.

I think it should be the same as the process on “ProcessInterrupts()”.

3.How about creating a new function.
Since the same process is on three functions( “ProcessInterrupts()”, 
“HandleAutoVacLauncherInterrupts”, “HandleMainLoopInterrupts()” ), I 
think it’s good to create a new function.


[1] https://commitfest.postgresql.org/35/3342/

Regards,
Koyu Tanigawa




Re: Printing backtrace of postgres processes

2021-08-26 Thread vignesh C
On Wed, Jul 21, 2021 at 10:56 PM vignesh C  wrote:
>
> On Wed, Jul 21, 2021 at 3:52 PM Bharath Rupireddy 
>  wrote:
> >
> > On Tue, Jul 13, 2021 at 9:03 PM vignesh C  wrote:
> > >
> > > On Wed, May 12, 2021 at 2:27 AM Robert Haas  wrote:
> > > >
> > > > Maybe we should have a role that is specifically for server debugging
> > > > type things. This kind of overlaps with Mark Dilger's proposal to try
> > > > to allow SET for security-sensitive GUCs to be delegated via
> > > > predefined roles. The exact way to divide that up is open to question,
> > > > but it wouldn't seem crazy to me if the same role controlled the
> > > > ability to do this plus the ability to set the GUCs
> > > > backtrace_functions, debug_invalidate_system_caches_always,
> > > > wal_consistency_checking, and maybe a few other things.
> > >
> > > +1 for the idea of having a new role for this. Currently I have
> > > implemented this feature to be supported only for the superuser. If we
> > > are ok with having a new role to handle debugging features, I will
> > > make a 002 patch to handle this.
> >
> > I see that there are a good number of user functions that are
> > accessible only by superuser (I searched for "if (!superuser())" in
> > the code base). I agree with the intention to not overload the
> > superuser anymore and have a few other special roles to delegate the
> > existing superuser-only functions to them. In that case, are we going
> > to revisit and reassign all the existing superuser-only functions?
>
> As Robert pointed out, this idea is based on Mark Dilger's proposal. Mark 
> Dilger is already handling many of them at [1]. I'm proposing this patch only 
> for server debugging functionalities based on Robert's suggestions at [2].
> [1] - https://commitfest.postgresql.org/33/3223/
> [2] - 
> https://www.postgresql.org/message-id/CA%2BTgmoZz%3DK1bQRp0Ug%3D6uMGFWg-6kaxdHe6VSWaxq0U-YkppYQ%40mail.gmail.com

The previous patch was failing because of the recent test changes made
by commit 201a76183e2 which unified new and get_new_node, attached
patch has the changes to handle the changes accordingly.

Regards,
Vignesh
From e99c97e9a8356976e556fc0cc061d6c98d6b58da Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Thu, 26 Aug 2021 17:41:02 +0530
Subject: [PATCH v8] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml| 76 +++
 src/backend/postmaster/autovacuum.c   |  7 ++
 src/backend/postmaster/interrupt.c|  8 ++
 src/backend/storage/ipc/procsignal.c  | 18 +
 src/backend/storage/ipc/signalfuncs.c | 46 +++
 src/backend/tcop/postgres.c   |  9 +++
 src/backend/utils/error/elog.c| 20 -
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  2 +
 src/include/storage/procsignal.h  |  2 +
 src/include/utils/elog.h  |  2 +
 .../t/002_print_backtrace_validation.pl   | 73 ++
 13 files changed, 265 insertions(+), 4 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_print_backtrace_validation.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 78812b2dbe..de7d341ee0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24832,6 +24832,32 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Send a request to the backend with the specified process ID to log its
+backtrace. The backtrace will be logged at message level
+LOG. It will appear in the server log based on the
+log configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will return false
+and show a WARNING. Only superusers can request 

Re: Printing backtrace of postgres processes

2021-07-21 Thread vignesh C
On Wed, Jul 21, 2021 at 3:52 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Tue, Jul 13, 2021 at 9:03 PM vignesh C  wrote:
> >
> > On Wed, May 12, 2021 at 2:27 AM Robert Haas 
wrote:
> > >
> > > Maybe we should have a role that is specifically for server debugging
> > > type things. This kind of overlaps with Mark Dilger's proposal to try
> > > to allow SET for security-sensitive GUCs to be delegated via
> > > predefined roles. The exact way to divide that up is open to question,
> > > but it wouldn't seem crazy to me if the same role controlled the
> > > ability to do this plus the ability to set the GUCs
> > > backtrace_functions, debug_invalidate_system_caches_always,
> > > wal_consistency_checking, and maybe a few other things.
> >
> > +1 for the idea of having a new role for this. Currently I have
> > implemented this feature to be supported only for the superuser. If we
> > are ok with having a new role to handle debugging features, I will
> > make a 002 patch to handle this.
>
> I see that there are a good number of user functions that are
> accessible only by superuser (I searched for "if (!superuser())" in
> the code base). I agree with the intention to not overload the
> superuser anymore and have a few other special roles to delegate the
> existing superuser-only functions to them. In that case, are we going
> to revisit and reassign all the existing superuser-only functions?

As Robert pointed out, this idea is based on Mark Dilger's proposal. Mark
Dilger is already handling many of them at [1]. I'm proposing this patch
only for server debugging functionalities based on Robert's suggestions at
[2].
[1] - https://commitfest.postgresql.org/33/3223/
[2] -
https://www.postgresql.org/message-id/CA%2BTgmoZz%3DK1bQRp0Ug%3D6uMGFWg-6kaxdHe6VSWaxq0U-YkppYQ%40mail.gmail.com

Regards,
Vignesh


Re: Printing backtrace of postgres processes

2021-07-21 Thread Bharath Rupireddy
On Tue, Jul 13, 2021 at 9:03 PM vignesh C  wrote:
>
> On Wed, May 12, 2021 at 2:27 AM Robert Haas  wrote:
> >
> > Maybe we should have a role that is specifically for server debugging
> > type things. This kind of overlaps with Mark Dilger's proposal to try
> > to allow SET for security-sensitive GUCs to be delegated via
> > predefined roles. The exact way to divide that up is open to question,
> > but it wouldn't seem crazy to me if the same role controlled the
> > ability to do this plus the ability to set the GUCs
> > backtrace_functions, debug_invalidate_system_caches_always,
> > wal_consistency_checking, and maybe a few other things.
>
> +1 for the idea of having a new role for this. Currently I have
> implemented this feature to be supported only for the superuser. If we
> are ok with having a new role to handle debugging features, I will
> make a 002 patch to handle this.

I see that there are a good number of user functions that are
accessible only by superuser (I searched for "if (!superuser())" in
the code base). I agree with the intention to not overload the
superuser anymore and have a few other special roles to delegate the
existing superuser-only functions to them. In that case, are we going
to revisit and reassign all the existing superuser-only functions?

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-07-13 Thread vignesh C
On Wed, May 12, 2021 at 2:27 AM Robert Haas  wrote:
>
> On Thu, May 6, 2021 at 3:31 PM Tom Lane  wrote:
> > Andres Freund  writes:
> > > On 2021-05-06 14:56:09 -0400, Tom Lane wrote:
> > >> If we think it's worth having a predefined role for, OK.  However,
> > >> I don't like the future I see us heading towards where there are
> > >> hundreds of random predefined roles.  Is there an existing role
> > >> that it'd be reasonable to attach this ability to?
> >
> > > It does seem like it'd be good to group it in with something
> > > else. There's nothing fitting 100% though.
> >
> > I'd probably vote for pg_read_all_data, considering that much of
> > the concern about this has to do with the possibility of exposure
> > of sensitive data.  I'm not quite sure what the security expectations
> > are for pg_monitor.
>
> Maybe we should have a role that is specifically for server debugging
> type things. This kind of overlaps with Mark Dilger's proposal to try
> to allow SET for security-sensitive GUCs to be delegated via
> predefined roles. The exact way to divide that up is open to question,
> but it wouldn't seem crazy to me if the same role controlled the
> ability to do this plus the ability to set the GUCs
> backtrace_functions, debug_invalidate_system_caches_always,
> wal_consistency_checking, and maybe a few other things.

+1 for the idea of having a new role for this. Currently I have
implemented this feature to be supported only for the superuser. If we
are ok with having a new role to handle debugging features, I will
make a 002 patch to handle this.

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-05-11 Thread Robert Haas
On Thu, May 6, 2021 at 3:31 PM Tom Lane  wrote:
> Andres Freund  writes:
> > On 2021-05-06 14:56:09 -0400, Tom Lane wrote:
> >> If we think it's worth having a predefined role for, OK.  However,
> >> I don't like the future I see us heading towards where there are
> >> hundreds of random predefined roles.  Is there an existing role
> >> that it'd be reasonable to attach this ability to?
>
> > It does seem like it'd be good to group it in with something
> > else. There's nothing fitting 100% though.
>
> I'd probably vote for pg_read_all_data, considering that much of
> the concern about this has to do with the possibility of exposure
> of sensitive data.  I'm not quite sure what the security expectations
> are for pg_monitor.

Maybe we should have a role that is specifically for server debugging
type things. This kind of overlaps with Mark Dilger's proposal to try
to allow SET for security-sensitive GUCs to be delegated via
predefined roles. The exact way to divide that up is open to question,
but it wouldn't seem crazy to me if the same role controlled the
ability to do this plus the ability to set the GUCs
backtrace_functions, debug_invalidate_system_caches_always,
wal_consistency_checking, and maybe a few other things.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-05-06 Thread Andres Freund
Hi,

On 2021-05-06 15:22:02 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > we allow generating backtraces in all kind of places, including
> > e.g. some inside critical sections via backtrace_functions.
> 
> If there's an elog call inside a critical section, that seems
> like a problem already.  Are you sure that there are any such?

There's several, yes. In xlog.c there's quite a few that are gated by
wal_debug being enabled. But also a few without that,
e.g. XLogFileInit() logging
elog(DEBUG1, "creating and filling new WAL file");
and XLogFileInit() can be called within a critical section.

Greetings,

Andres Freund




Re: Printing backtrace of postgres processes

2021-05-06 Thread Andres Freund
Hi,

On 2021-05-06 15:31:02 -0400, Tom Lane wrote:
> I'd probably vote for pg_read_all_data, considering that much of
> the concern about this has to do with the possibility of exposure
> of sensitive data.  I'm not quite sure what the security expectations
> are for pg_monitor.

I was wondering the same, but looking at the docs of pg_read_all_data
that doesn't seem like a great fit:

  
   pg_read_all_data
   Read all data (tables, views, sequences), as if having SELECT
   rights on those objects, and USAGE rights on all schemas, even without
   having it explicitly.  This role does not have the role attribute
   BYPASSRLS set.  If RLS is being used, an administrator
   may wish to set BYPASSRLS on roles which this role is
   GRANTed to.
  

It's mostly useful to be able to run pg_dump without superuser
permissions.

Contrast that to pg_monitor:

  
   pg_monitor
   Read/execute various monitoring views and functions.
   This role is a member of pg_read_all_settings,
   pg_read_all_stats and
   pg_stat_scan_tables.
  
...
  
  The pg_monitor, pg_read_all_settings,
  pg_read_all_stats and 
pg_stat_scan_tables
  roles are intended to allow administrators to easily configure a role for the
  purpose of monitoring the database server. They grant a set of common 
privileges
  allowing the role to read various useful configuration settings, statistics 
and
  other system information normally restricted to superusers.
  

"normally restricted to superusers" seems to fit pretty well?

I think if we follow your argument, pg_read_server_files would be a bit
better fit than pg_read_all_data? But it seems "too powerful" to me.
  
   pg_read_server_files
   Allow reading files from any location the database can access on 
the server with COPY and
   other file-access functions.
  

Greetings,

Andres Freund




Re: Printing backtrace of postgres processes

2021-05-06 Thread Tom Lane
Andres Freund  writes:
> On 2021-05-06 14:56:09 -0400, Tom Lane wrote:
>> If we think it's worth having a predefined role for, OK.  However,
>> I don't like the future I see us heading towards where there are
>> hundreds of random predefined roles.  Is there an existing role
>> that it'd be reasonable to attach this ability to?

> It does seem like it'd be good to group it in with something
> else. There's nothing fitting 100% though.

I'd probably vote for pg_read_all_data, considering that much of
the concern about this has to do with the possibility of exposure
of sensitive data.  I'm not quite sure what the security expectations
are for pg_monitor.

regards, tom lane




Re: Printing backtrace of postgres processes

2021-05-06 Thread Tom Lane
Andres Freund  writes:
> On 2021-05-06 14:38:51 -0400, Robert Haas wrote:
>> On Wed, Feb 3, 2021 at 2:30 AM Tom Lane  wrote:
>>> This point is entirely separate from the question of whether
>>> triggering stack traces at inopportune moments could cause system
>>> malfunctions, but that question is also not to be ignored.

> I think that ship kind of has sailed with
>
> commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190
> Author: Alvaro Herrera 
> Date:   2019-11-08 15:44:20 -0300
> Add backtrace support for error reporting

The fact that we have a scarily large surface area for that to
cause problems is not a great argument for making the surface
area even larger.  Also, I don't think v13 has been out long
enough for us to have full confidence that the backtrace behavior
doesn't cause any problems already.

> we allow generating backtraces in all kind of places, including
> e.g. some inside critical sections via backtrace_functions.

If there's an elog call inside a critical section, that seems
like a problem already.  Are you sure that there are any such?

regards, tom lane




Re: Printing backtrace of postgres processes

2021-05-06 Thread Andres Freund
Hi,

On 2021-05-06 14:56:09 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Wed, Feb 3, 2021 at 2:30 AM Tom Lane  wrote:
> >> TBH, I'm leaning to the position that this should be superuser
> >> only.
> 
> > I agree that ordinary users shouldn't be able to trigger it, but I
> > think it should be restricted to some predefined role, new or
> > existing, rather than to superuser. I see no reason not to let
> > individual users decide what risks they want to take.
> 
> If we think it's worth having a predefined role for, OK.  However,
> I don't like the future I see us heading towards where there are
> hundreds of random predefined roles.  Is there an existing role
> that it'd be reasonable to attach this ability to?

It does seem like it'd be good to group it in with something
else. There's nothing fitting 100% though.

postgres[1475723][1]=# SELECT rolname FROM pg_roles WHERE rolname LIKE 'pg_%' 
ORDER BY rolname;
┌───┐
│  rolname  │
├───┤
│ pg_database_owner │
│ pg_execute_server_program │
│ pg_monitor│
│ pg_read_all_data  │
│ pg_read_all_settings  │
│ pg_read_all_stats │
│ pg_read_server_files  │
│ pg_signal_backend │
│ pg_stat_scan_tables   │
│ pg_write_all_data │
│ pg_write_server_files │
└───┘
(11 rows)

We could fit it into pg_monitor, but it's probably a bit more impactful
than most things in there? But I'd go for it anyway, I think.

Greetings,

Andres Freund




Re: Printing backtrace of postgres processes

2021-05-06 Thread Andres Freund
Hi,

On 2021-05-06 14:38:51 -0400, Robert Haas wrote:
> On Wed, Feb 3, 2021 at 2:30 AM Tom Lane  wrote:
> > This point is entirely separate from the question of whether
> > triggering stack traces at inopportune moments could cause system
> > malfunctions, but that question is also not to be ignored.
> 
> That worries me too, although I have a hard time saying exactly why.
> If we call an OS-provided function called backtrace() and it does
> something other than generate a backtrace - e.g. makes the process seg
> fault, or mucks about with the values of global variables - isn't that
> just a bug in the OS? Do we have particular reasons to believe that
> such bugs are common? My own skepticism here is mostly based on how
> inconsistent debuggers are about being able to tell you anything
> useful, which makes me think that in a binary compiled with any
> optimization, the ability of backtrace() to do something consistently
> useful is also questionable. But that's a separate question from
> whether it's likely to cause any active harm.

I think that ship kind of has sailed with

commit 71a8a4f6e36547bb060dbcc961ea9b57420f7190
Author: Alvaro Herrera 
Date:   2019-11-08 15:44:20 -0300

Add backtrace support for error reporting

we allow generating backtraces in all kind of places, including
e.g. some inside critical sections via backtrace_functions. I don't
think also doing so during interrupt processing is a meaningful increase
in exposed surface area?

Greetings,

Andres Freund




Re: Printing backtrace of postgres processes

2021-05-06 Thread Tom Lane
Robert Haas  writes:
> On Wed, Feb 3, 2021 at 2:30 AM Tom Lane  wrote:
>> TBH, I'm leaning to the position that this should be superuser
>> only.

> I agree that ordinary users shouldn't be able to trigger it, but I
> think it should be restricted to some predefined role, new or
> existing, rather than to superuser. I see no reason not to let
> individual users decide what risks they want to take.

If we think it's worth having a predefined role for, OK.  However,
I don't like the future I see us heading towards where there are
hundreds of random predefined roles.  Is there an existing role
that it'd be reasonable to attach this ability to?

regards, tom lane




Re: Printing backtrace of postgres processes

2021-05-06 Thread Robert Haas
On Wed, Feb 3, 2021 at 2:30 AM Tom Lane  wrote:
> A backtrace normally exposes the text of the current query, for
> instance, which could contain very sensitive data (passwords in ALTER
> USER, customer credit card numbers in ordinary data, etc etc).  We
> don't allow the postmaster log to be seen by any but very privileged
> users; it's not sane to think that this data is any less
> security-critical than the postmaster log.

I agree. Vignesh points out downthread that it's just printing out
memory addresses and not necessarily anything too directly usable, but
the memory addresses themselves are potentially security-sensitive. If
that were not a thing, there would not be such a thing as ASLR.

> This point is entirely separate from the question of whether
> triggering stack traces at inopportune moments could cause system
> malfunctions, but that question is also not to be ignored.

That worries me too, although I have a hard time saying exactly why.
If we call an OS-provided function called backtrace() and it does
something other than generate a backtrace - e.g. makes the process seg
fault, or mucks about with the values of global variables - isn't that
just a bug in the OS? Do we have particular reasons to believe that
such bugs are common? My own skepticism here is mostly based on how
inconsistent debuggers are about being able to tell you anything
useful, which makes me think that in a binary compiled with any
optimization, the ability of backtrace() to do something consistently
useful is also questionable. But that's a separate question from
whether it's likely to cause any active harm.

> TBH, I'm leaning to the position that this should be superuser
> only.  I do NOT agree with the idea that ordinary users should
> be able to trigger it, even against backends theoretically
> belonging to their own userid.  (Do I need to point out that
> some levels of the call stack might be from security-definer
> functions with more privilege than the session's nominal user?)

I agree that ordinary users shouldn't be able to trigger it, but I
think it should be restricted to some predefined role, new or
existing, rather than to superuser. I see no reason not to let
individual users decide what risks they want to take.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-05-06 Thread vignesh C
On Thu, May 6, 2021 at 7:43 AM Justin Pryzby  wrote:
>
> Here's a cleaned-up copy of the doc text.
>
> Send a request to the backend with the specified process ID to log its 
> backtrace.
> The backtrace will be logged at message level LOG.
> It will appear in the server log based on the log configuration set
> (See  for more information),
> but will not be sent to the client regardless of
> .
> A backtrace will identify where exactly the backend process is currently
> executing. This may be useful to developers to diagnose stuck
> processes and other problems. This feature is
> not supported for the postmaster, logger, or statistics collector process. 
> This
> feature will be available if PostgreSQL was built
> with the ability to capture backtracee.  If not available, the function will
> return false and show a WARNING.
> Only superusers can request backends to log their backtrace.

Thanks for  rephrasing, I have modified to include checkpointer,
walwriter and background writer process also.

> > - * this and related functions are not inlined.
> > + * this and related functions are not inlined. If edata pointer is valid
> > + * backtrace information will set in edata.
>
> will *be* set

Modified.

Thanks for the comments, Attached v8 patch has the fixes for the same.

Regards,
Vignesh
From ace69a3f22d0776a6f7fb4e793035354946558e8 Mon Sep 17 00:00:00 2001
From: vignesh 
Date: Wed, 5 May 2021 20:31:38 +0530
Subject: [PATCH v8] Print backtrace of specified postgres process.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PROCSIG_PRINT_BACKTRACE to postgres backend whose pid matches the
specified process id. Once the backend process receives this signal it will
print the backtrace of the process to the log file based on the logging
configuration, if logging is disabled backtrace will be printed to the
console where postmaster was started.
---
 doc/src/sgml/func.sgml| 76 +++
 src/backend/postmaster/autovacuum.c   |  7 ++
 src/backend/postmaster/interrupt.c|  8 ++
 src/backend/storage/ipc/procsignal.c  | 18 +
 src/backend/storage/ipc/signalfuncs.c | 46 +++
 src/backend/tcop/postgres.c   |  9 +++
 src/backend/utils/error/elog.c| 20 -
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  2 +
 src/include/storage/procsignal.h  |  2 +
 src/include/utils/elog.h  |  2 +
 .../t/002_print_backtrace_validation.pl   | 73 ++
 13 files changed, 265 insertions(+), 4 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_print_backtrace_validation.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0b5571460d..1b76935ce4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24974,6 +24974,32 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_print_backtrace
+
+pg_print_backtrace ( pid integer )
+boolean
+   
+   
+Send a request to the backend with the specified process ID to log its
+backtrace. The backtrace will be logged at message level
+LOG. It will appear in the server log based on the
+log configuration set (See  for
+more information), but will not be sent to the client regardless of
+. A backtrace will identify
+where exactly the backend process is currently executing. This may be
+useful to developers to diagnose stuck processes and other problems.
+This feature is not supported for the postmaster, logger, checkpointer,
+walwriter, background writer or statistics collector process. This
+feature will be available if PostgreSQL was built with the ability to
+capture backtrace. If not available, the function will return false
+and show a WARNING. Only superusers can request backends to log their
+backtrace.
+   
+  
+
   

 
@@ -25069,6 +25095,56 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_print_backtrace can be used to print backtrace of
+a backend process. For example:
+
+postgres=# select pg_print_backtrace(pg_backend_pid());
+ pg_print_backtrace
+
+ t
+(1 row)
+
+The backtrace will be logged to the log file if logging is enabled, if logging
+is disabled backtrace will be logged to the console where the postmaster was
+started. For example:
+2021-01-27 11:33:50.247 IST [111735] LOG:  current backtrace:
+postgres: postgresdba postgres [local] SELECT(set_backtrace+0x38) [0xae06c5]
+postgres: 

Re: Printing backtrace of postgres processes

2021-05-05 Thread Justin Pryzby
Here's a cleaned-up copy of the doc text.

Send a request to the backend with the specified process ID to log its 
backtrace.
The backtrace will be logged at message level LOG.
It will appear in the server log based on the log configuration set
(See  for more information),
but will not be sent to the client regardless of
.
A backtrace will identify where exactly the backend process is currently
executing. This may be useful to developers to diagnose stuck
processes and other problems. This feature is
not supported for the postmaster, logger, or statistics collector process. This
feature will be available if PostgreSQL was built
with the ability to capture backtracee.  If not available, the function will
return false and show a WARNING.
Only superusers can request backends to log their backtrace.

> - * this and related functions are not inlined.
> + * this and related functions are not inlined. If edata pointer is valid
> + * backtrace information will set in edata.

will *be* set

-- 
Justin




Re: Printing backtrace of postgres processes

2021-05-05 Thread vignesh C
On Wed, Feb 3, 2021 at 3:24 PM Bharath Rupireddy
 wrote:
>
> On Wed, Feb 3, 2021 at 1:49 PM vignesh C  wrote:
> >
> > On Wed, Feb 3, 2021 at 1:00 PM Tom Lane  wrote:
> > >
> > > vignesh C  writes:
> > > > On Mon, Feb 1, 2021 at 11:04 AM Bharath Rupireddy
> > > >  wrote:
> > > >> Are these superuser and permission checks enough from a security
> > > >> standpoint that we don't expose some sensitive information to the
> > > >> user?
> > >
> > > > This will just print the backtrace of the current backend. Users
> > > > cannot get password information from this.
> > >
> > > Really?
> > >
> > > A backtrace normally exposes the text of the current query, for
> > > instance, which could contain very sensitive data (passwords in ALTER
> > > USER, customer credit card numbers in ordinary data, etc etc).  We
> > > don't allow the postmaster log to be seen by any but very privileged
> > > users; it's not sane to think that this data is any less
> > > security-critical than the postmaster log.
> > >
> > > This point is entirely separate from the question of whether
> > > triggering stack traces at inopportune moments could cause system
> > > malfunctions, but that question is also not to be ignored.
> > >
> > > TBH, I'm leaning to the position that this should be superuser
> > > only.  I do NOT agree with the idea that ordinary users should
> > > be able to trigger it, even against backends theoretically
> > > belonging to their own userid.  (Do I need to point out that
> > > some levels of the call stack might be from security-definer
> > > functions with more privilege than the session's nominal user?)
> > >
> >
> > I had seen that the log that will be logged will be something like:
> > postgres: test postgres [local]
> > idle(ProcessClientReadInterrupt+0x3a) [0x9500ec]
> > postgres: test postgres [local] idle(secure_read+0x183) [0x787f43]
> > postgres: test postgres [local] idle() [0x7919de]
> > postgres: test postgres [local] idle(pq_getbyte+0x32) [0x791a8e]
> > postgres: test postgres [local] idle() [0x94fc16]
> > postgres: test postgres [local] idle() [0x950099]
> > postgres: test postgres [local] idle(PostgresMain+0x6d5) [0x954bd5]
> > postgres: test postgres [local] idle() [0x898a09]
> > postgres: test postgres [local] idle() [0x89838f]
> > postgres: test postgres [local] idle() [0x894953]
> > postgres: test postgres [local] idle(PostmasterMain+0x116b) 
> > [0x89422a]
> > postgres: test postgres [local] idle() [0x79725b]
> > /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f6e68d7]
> > postgres: test postgres [local] idle() [0x484249]
> > I was not sure if we would be able to get any secure information from
> > this. I did not notice the function arguments being printed. I felt
> > the function name, offset  and the return address will be logged. I
> > might be missing something here.
> > Thoughts?
>
> First of all, we need to see if the output of pg_print_backtrace shows
> up function parameter addresses or only function start addresses along
> with line and file information when attached to gdb. In either case,
> IMO, it will be easy for experienced hackers(I'm not one though) to
> calculate and fetch the query string or other function parameters or
> the variables inside the functions from the stack by looking at the
> code (which is available openly, of course).
>
> Say, if a backend is in a long running scan or insert operation, then
> pg_print_backtrace is issued from another session, the
> exec_simple_query function shows up query_string. Below is captured
> from attached gdb though, I'm not sure whether the logged backtrace
> will have function address or the function parameters addresses, I
> think we can check that by having a long running query which
> frequently checks interrupts and issue pg_print_backtrace from another
> session to that backend. Now, attach gdb to the backend in which the
> query is running, then take bt, see if the logged backtrace and the
> gdb bt have the same or closer addresses.
>
> #13 0x5644f4320729 in exec_simple_query (
> query_string=0x5644f6771bf0 "select pg_backend_pid();") at postgres.c:1240
> #14 0x5644f4324ff4 in PostgresMain (argc=1, argv=0x7ffd819bd5e0,
> dbname=0x5644f679d2b8 "postgres", username=0x5644f679d298 "bharath")
> at postgres.c:4394
> #15 0x5644f4256f9d in BackendRun (port=0x5644f67935c0) at 
> postmaster.c:4484
> #16 0x5644f4256856 in BackendStartup (port=0x5644f67935c0) at
> postmaster.c:4206
> #17 0x5644f4252a11 in ServerLoop () at postmaster.c:1730
> #18 0x5644f42521aa in PostmasterMain (argc=3, argv=0x5644f676b1f0)
> at postmaster.c:1402
> #19 0x5644f4148789 in main (argc=3, argv=0x5644f676b1f0) at main.c:209
>
> As suggested by Tom, I'm okay if this function is callable only by the
> superusers. In that case, the superusers can fetch the backtrace and
> send it for further analysis in case of 

Re: Printing backtrace of postgres processes

2021-03-05 Thread torikoshia

On 2021-03-04 21:55, Bharath Rupireddy wrote:
On Mon, Mar 1, 2021 at 10:43 AM torikoshia  
wrote:

Since the current patch use BackendPidGetProc(), it does not
support this feature not only postmaster, logging, and
statistics but also checkpointer, background writer, and
walwriter.

And when I specify pid of these PostgreSQL processes, it
says "PID  is not a PostgreSQL server process".

I think it may confuse users, so it might be worth
changing messages for those PostgreSQL processes.
AuxiliaryPidGetProc() may help to do it.


Exactly this was the doubt I got when I initially reviewed this patch.
And I felt it should be discussed in a separate thread, you may want
to update your thoughts there [1].

[1] -
https://www.postgresql.org/message-id/CALj2ACW7Rr-R7mBcBQiXWPp%3DJV5chajjTdudLiF5YcpW-BmHhg%40mail.gmail.com


Thanks!
I'm going to join the discussion there.


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Printing backtrace of postgres processes

2021-03-04 Thread Bharath Rupireddy
On Mon, Mar 1, 2021 at 10:43 AM torikoshia  wrote:
> Since the current patch use BackendPidGetProc(), it does not
> support this feature not only postmaster, logging, and
> statistics but also checkpointer, background writer, and
> walwriter.
>
> And when I specify pid of these PostgreSQL processes, it
> says "PID  is not a PostgreSQL server process".
>
> I think it may confuse users, so it might be worth
> changing messages for those PostgreSQL processes.
> AuxiliaryPidGetProc() may help to do it.

Exactly this was the doubt I got when I initially reviewed this patch.
And I felt it should be discussed in a separate thread, you may want
to update your thoughts there [1].

[1] - 
https://www.postgresql.org/message-id/CALj2ACW7Rr-R7mBcBQiXWPp%3DJV5chajjTdudLiF5YcpW-BmHhg%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-02-28 Thread torikoshia

Hi,

I also think this feature would be useful when supporting
environments that lack debugger or debug symbols.
I think such environments are not rare.


+ for more information. 
This
+will help in identifying where exactly the backend process is 
currently

+executing.

When I read this, I expected a backtrace would be generated at
the moment when it receives the signal, but actually it just
sets a flag that causes the next CHECK_FOR_INTERRUPTS to print
a backtrace.

How about explaining the timing of the backtrace generation?


+print backtrace of superuser backends. This feature is not 
supported

+for postmaster, logging and statistics process.

Since the current patch use BackendPidGetProc(), it does not
support this feature not only postmaster, logging, and
statistics but also checkpointer, background writer, and
walwriter.

And when I specify pid of these PostgreSQL processes, it
says "PID  is not a PostgreSQL server process".

I think it may confuse users, so it might be worth
changing messages for those PostgreSQL processes.
AuxiliaryPidGetProc() may help to do it.


diff --git a/src/backend/postmaster/checkpointer.c 
b/src/backend/postmaster/checkpointer.c

index 54a818b..5fae328 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -57,6 +57,7 @@
 #include "storage/shmem.h"
 #include "storage/smgr.h"
 #include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/resowner.h"
@@ -547,6 +548,13 @@ HandleCheckpointerInterrupts(void)
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();

+   /* Process printing backtrace */
+   if (PrintBacktracePending)
+   {
+   PrintBacktracePending = false;
+   set_backtrace(NULL, 0);
+   }
+

Although it implements backtrace for checkpointer, when
I specified pid of checkpointer it was refused from
BackendPidGetProc().


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Printing backtrace of postgres processes

2021-02-03 Thread Bharath Rupireddy
On Wed, Feb 3, 2021 at 1:49 PM vignesh C  wrote:
>
> On Wed, Feb 3, 2021 at 1:00 PM Tom Lane  wrote:
> >
> > vignesh C  writes:
> > > On Mon, Feb 1, 2021 at 11:04 AM Bharath Rupireddy
> > >  wrote:
> > >> Are these superuser and permission checks enough from a security
> > >> standpoint that we don't expose some sensitive information to the
> > >> user?
> >
> > > This will just print the backtrace of the current backend. Users
> > > cannot get password information from this.
> >
> > Really?
> >
> > A backtrace normally exposes the text of the current query, for
> > instance, which could contain very sensitive data (passwords in ALTER
> > USER, customer credit card numbers in ordinary data, etc etc).  We
> > don't allow the postmaster log to be seen by any but very privileged
> > users; it's not sane to think that this data is any less
> > security-critical than the postmaster log.
> >
> > This point is entirely separate from the question of whether
> > triggering stack traces at inopportune moments could cause system
> > malfunctions, but that question is also not to be ignored.
> >
> > TBH, I'm leaning to the position that this should be superuser
> > only.  I do NOT agree with the idea that ordinary users should
> > be able to trigger it, even against backends theoretically
> > belonging to their own userid.  (Do I need to point out that
> > some levels of the call stack might be from security-definer
> > functions with more privilege than the session's nominal user?)
> >
>
> I had seen that the log that will be logged will be something like:
> postgres: test postgres [local]
> idle(ProcessClientReadInterrupt+0x3a) [0x9500ec]
> postgres: test postgres [local] idle(secure_read+0x183) [0x787f43]
> postgres: test postgres [local] idle() [0x7919de]
> postgres: test postgres [local] idle(pq_getbyte+0x32) [0x791a8e]
> postgres: test postgres [local] idle() [0x94fc16]
> postgres: test postgres [local] idle() [0x950099]
> postgres: test postgres [local] idle(PostgresMain+0x6d5) [0x954bd5]
> postgres: test postgres [local] idle() [0x898a09]
> postgres: test postgres [local] idle() [0x89838f]
> postgres: test postgres [local] idle() [0x894953]
> postgres: test postgres [local] idle(PostmasterMain+0x116b) [0x89422a]
> postgres: test postgres [local] idle() [0x79725b]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f6e68d7]
> postgres: test postgres [local] idle() [0x484249]
> I was not sure if we would be able to get any secure information from
> this. I did not notice the function arguments being printed. I felt
> the function name, offset  and the return address will be logged. I
> might be missing something here.
> Thoughts?

First of all, we need to see if the output of pg_print_backtrace shows
up function parameter addresses or only function start addresses along
with line and file information when attached to gdb. In either case,
IMO, it will be easy for experienced hackers(I'm not one though) to
calculate and fetch the query string or other function parameters or
the variables inside the functions from the stack by looking at the
code (which is available openly, of course).

Say, if a backend is in a long running scan or insert operation, then
pg_print_backtrace is issued from another session, the
exec_simple_query function shows up query_string. Below is captured
from attached gdb though, I'm not sure whether the logged backtrace
will have function address or the function parameters addresses, I
think we can check that by having a long running query which
frequently checks interrupts and issue pg_print_backtrace from another
session to that backend. Now, attach gdb to the backend in which the
query is running, then take bt, see if the logged backtrace and the
gdb bt have the same or closer addresses.

#13 0x5644f4320729 in exec_simple_query (
query_string=0x5644f6771bf0 "select pg_backend_pid();") at postgres.c:1240
#14 0x5644f4324ff4 in PostgresMain (argc=1, argv=0x7ffd819bd5e0,
dbname=0x5644f679d2b8 "postgres", username=0x5644f679d298 "bharath")
at postgres.c:4394
#15 0x5644f4256f9d in BackendRun (port=0x5644f67935c0) at postmaster.c:4484
#16 0x5644f4256856 in BackendStartup (port=0x5644f67935c0) at
postmaster.c:4206
#17 0x5644f4252a11 in ServerLoop () at postmaster.c:1730
#18 0x5644f42521aa in PostmasterMain (argc=3, argv=0x5644f676b1f0)
at postmaster.c:1402
#19 0x5644f4148789 in main (argc=3, argv=0x5644f676b1f0) at main.c:209

As suggested by Tom, I'm okay if this function is callable only by the
superusers. In that case, the superusers can fetch the backtrace and
send it for further analysis in case of any hangs or issues.

Others may have better thoughts.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-02-03 Thread vignesh C
On Wed, Feb 3, 2021 at 1:00 PM Tom Lane  wrote:
>
> vignesh C  writes:
> > On Mon, Feb 1, 2021 at 11:04 AM Bharath Rupireddy
> >  wrote:
> >> Are these superuser and permission checks enough from a security
> >> standpoint that we don't expose some sensitive information to the
> >> user?
>
> > This will just print the backtrace of the current backend. Users
> > cannot get password information from this.
>
> Really?
>
> A backtrace normally exposes the text of the current query, for
> instance, which could contain very sensitive data (passwords in ALTER
> USER, customer credit card numbers in ordinary data, etc etc).  We
> don't allow the postmaster log to be seen by any but very privileged
> users; it's not sane to think that this data is any less
> security-critical than the postmaster log.
>
> This point is entirely separate from the question of whether
> triggering stack traces at inopportune moments could cause system
> malfunctions, but that question is also not to be ignored.
>
> TBH, I'm leaning to the position that this should be superuser
> only.  I do NOT agree with the idea that ordinary users should
> be able to trigger it, even against backends theoretically
> belonging to their own userid.  (Do I need to point out that
> some levels of the call stack might be from security-definer
> functions with more privilege than the session's nominal user?)
>

I had seen that the log that will be logged will be something like:
postgres: test postgres [local]
idle(ProcessClientReadInterrupt+0x3a) [0x9500ec]
postgres: test postgres [local] idle(secure_read+0x183) [0x787f43]
postgres: test postgres [local] idle() [0x7919de]
postgres: test postgres [local] idle(pq_getbyte+0x32) [0x791a8e]
postgres: test postgres [local] idle() [0x94fc16]
postgres: test postgres [local] idle() [0x950099]
postgres: test postgres [local] idle(PostgresMain+0x6d5) [0x954bd5]
postgres: test postgres [local] idle() [0x898a09]
postgres: test postgres [local] idle() [0x89838f]
postgres: test postgres [local] idle() [0x894953]
postgres: test postgres [local] idle(PostmasterMain+0x116b) [0x89422a]
postgres: test postgres [local] idle() [0x79725b]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f6e68d7]
postgres: test postgres [local] idle() [0x484249]
I was not sure if we would be able to get any secure information from
this. I did not notice the function arguments being printed. I felt
the function name, offset  and the return address will be logged. I
might be missing something here.
Thoughts?

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-02-02 Thread Tom Lane
vignesh C  writes:
> On Mon, Feb 1, 2021 at 11:04 AM Bharath Rupireddy
>  wrote:
>> Are these superuser and permission checks enough from a security
>> standpoint that we don't expose some sensitive information to the
>> user?

> This will just print the backtrace of the current backend. Users
> cannot get password information from this.

Really?

A backtrace normally exposes the text of the current query, for
instance, which could contain very sensitive data (passwords in ALTER
USER, customer credit card numbers in ordinary data, etc etc).  We
don't allow the postmaster log to be seen by any but very privileged
users; it's not sane to think that this data is any less
security-critical than the postmaster log.

This point is entirely separate from the question of whether
triggering stack traces at inopportune moments could cause system
malfunctions, but that question is also not to be ignored.

TBH, I'm leaning to the position that this should be superuser
only.  I do NOT agree with the idea that ordinary users should
be able to trigger it, even against backends theoretically
belonging to their own userid.  (Do I need to point out that
some levels of the call stack might be from security-definer
functions with more privilege than the session's nominal user?)

regards, tom lane




Re: Printing backtrace of postgres processes

2021-02-02 Thread vignesh C
On Mon, Feb 1, 2021 at 11:04 AM Bharath Rupireddy
 wrote:
>
> On Mon, Feb 1, 2021 at 6:14 AM Bharath Rupireddy
>  wrote:
> > On Fri, Jan 29, 2021 at 7:10 PM vignesh C  wrote:
> > > > 4) How about following
> > > > + errmsg("must be a superuser to print backtrace
> > > > of backend process")));
> > > > instead of
> > > > + errmsg("must be a superuser to print backtrace
> > > > of superuser query process")));
> > > >
> > >
> > > Here the message should include superuser, we cannot remove it. Non
> > > super user can log non super user provided if user has permissions for
> > > it.
> > >
> > > > 5) How about following
> > > >  errmsg("must be a member of the role whose backed
> > > > process's backtrace is being printed or member of
> > > > pg_signal_backend")));
> > > > instead of
> > > > + errmsg("must be a member of the role whose
> > > > backtrace is being logged or member of pg_signal_backend")));
> > > >
> > >
> > > Modified it.
> >
> > Maybe I'm confused here to understand the difference between
> > SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and
> > corresponding error messages. Some clarification/use case to know in
> > which scenarios we hit those error messages might help me. Did we try
> > to add test cases that show up these error messages for
> > pg_print_backtrace? If not, can we add?
>
> Are these superuser and permission checks enough from a security
> standpoint that we don't expose some sensitive information to the
> user? Although I'm not sure, say from the backtrace printed and
> attached to GDB, can users see the passwords or other sensitive
> information from the system that they aren't supposed to see?
>
> I'm sure this point would have been discussed upthread.

This will just print the backtrace of the current backend. Users
cannot get password information from this. This backtrace will be sent
from customer side to the customer support. Developers will use gdb to
check the file and line number using the addresses. We are suggesting
to use gdb to get the file and line number from the address.  Users
can attach gdb to the process even now without this feature, I think
that behavior will be the same as is.  That will not be impacted
because of this feature. It was discussed to keep the checks similar
to pg_terminate_backend as discussed in [1].
[1] 
https://www.postgresql.org/message-id/CA%2BTgmoZ8XeQVCCqfq826kAr804a1ZnYy46FnQB9r2n_iOofh8Q%40mail.gmail.com

Regards,
Vignesh




Re: Printing backtrace of postgres processes

2021-02-02 Thread vignesh C
Thanks Bharath for your comments.

On Mon, Feb 1, 2021 at 6:14 AM Bharath Rupireddy
 wrote:
>
> On Fri, Jan 29, 2021 at 7:10 PM vignesh C  wrote:
> > > 4) How about following
> > > + errmsg("must be a superuser to print backtrace
> > > of backend process")));
> > > instead of
> > > + errmsg("must be a superuser to print backtrace
> > > of superuser query process")));
> > >
> >
> > Here the message should include superuser, we cannot remove it. Non
> > super user can log non super user provided if user has permissions for
> > it.
> >
> > > 5) How about following
> > >  errmsg("must be a member of the role whose backed
> > > process's backtrace is being printed or member of
> > > pg_signal_backend")));
> > > instead of
> > > + errmsg("must be a member of the role whose
> > > backtrace is being logged or member of pg_signal_backend")));
> > >
> >
> > Modified it.
>
> Maybe I'm confused here to understand the difference between
> SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and
> corresponding error messages. Some clarification/use case to know in
> which scenarios we hit those error messages might help me. Did we try
> to add test cases that show up these error messages for
> pg_print_backtrace? If not, can we add?

I have tested this manually:

I have tested it manually, Here is the test I did:
Create 2 users:
create user test password 'test@123';
create user test1 password 'test@123';

Test1: Test print backtrace of a different user's session:
./psql -d postgres -U test
psql (14devel)
Type "help" for help.
postgres=> select pg_backend_pid();
 pg_backend_pid

  30142
(1 row)
--
./psql -d postgres -U test1
psql (14devel)
Type "help" for help.
postgres=> select pg_print_backtrace(30142);
ERROR:  must be a member of the role whose backtrace is being logged
or member of pg_signal_backend

The above will be successful after:
grant pg_signal_backend to test1;

Test1: Test for non super user trying to print backtrace of a super
user's session:
./psql -d postgres
psql (14devel)
Type "help" for help.
postgres=# select pg_backend_pid();
 pg_backend_pid

  30211
(1 row)

./psql -d postgres -U test1
psql (14devel)
Type "help" for help.
postgres=> select pg_print_backtrace(30211);
ERROR:  must be a superuser to print backtrace of superuser process
I have not added any tests for this as we required 2 active sessions
and I did not see any existing framework for this.
This test should help in relating the behavior.

> > Attached v5 patch has the fixes for the same.
> > Thoughts?
>
> Thanks. Here are some comments on v5 patch:
>
> 1) typo - it's "process" not "porcess" +a backend porcess. For example:
>

Modified.

> 2) select * from pg_print_backtrace(NULL);
> postgres=# select proname, proisstrict from pg_proc where proname =
> 'pg_print_backtrace';
>   proname   | proisstrict
> +-
>  pg_print_backtrace | t
>
>  See the documentation:
>  "proisstrict bool
>
> Function returns null if any call argument is null. In that case the
> function won't actually be called at all. Functions that are not
> “strict” must be prepared to handle null inputs."
> So below PG_ARGISNUL check is not needed, you can remove that, because
> pg_print_backtrace will not get called with null input.
> intbt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
>

Modified.

> 3) Can we just set results = true instead of PG_RETURN_BOOL(true); so
> that it will be returned from PG_RETURN_BOOL(result); just for
> consistency?
> if (!SendProcSignal(bt_pid, PROCSIG_BACKTRACE_PRINT,
> InvalidBackendId))
> PG_RETURN_BOOL(true);
> else
> ereport(WARNING,
> (errmsg("failed to send signal to postmaster: %m")));
> }
>
> PG_RETURN_BOOL(result);
>

I felt existing is better as it will reduce one instruction of setting
first and then returning. There are only 2 places from where we
return. I felt we could directly return true or false.

> 4) Below is what happens if I request for a backtrace of the
> postmaster process? 1388210 is pid of postmaster.
> postgres=# select * from pg_print_backtrace(1388210);
> WARNING:  PID 1388210 is not a PostgreSQL server process
>  pg_print_backtrace
> 
>  f
>
> Does it make sense to have a postmaster's backtrace? If yes, can we
> have something like below in sigusr1_handler()?
> if (CheckPostmasterSignal(PMSIGNAL_BACKTRACE_EMIT))
> {
> LogBackTrace();
> }
>

We had a discussion about this in [1]  earlier and felt including this
is not very useful.

> 5) Can we have PROCSIG_PRINT_BACKTRACE instead of
> PROCSIG_BACKTRACE_PRINT, just for readability and consistency with
> function name?
>

PROCSIG_PRINT_BACKTRACE is better, I have modified it.

> 6) I think it's 

Re: Printing backtrace of postgres processes

2021-02-01 Thread Dilip Kumar
On Fri, Jan 29, 2021 at 7:10 PM vignesh C  wrote:
>
> Thanks Bharath for your review comments. Please find my comments inline below.
>
> On Thu, Jan 28, 2021 at 7:40 PM Bharath Rupireddy
>  wrote:
> >
> > On Thu, Jan 28, 2021 at 5:22 PM vignesh C  wrote:
> > > Thanks for the comments, I have fixed and attached an updated patch
> > > with the fixes for the same.
> > > Thoughts?
> >
> > Thanks for the patch. Here are few comments:
> >
> > 1) I think it's return SIGNAL_BACKEND_SUCCESS; instead of return 0; in
> > check_valid_pid?
> >
>
> I did not want to use SIGNAL_BACKEND_SUCCESS as we have not yet
> signalled the backend process at this time. I have added
> BACKEND_VALIDATION_SUCCESS macro and used it here for better
> readability.
>
> > 2) How about following in pg_signal_backend for more readability
> > +if (ret != SIGNAL_BACKEND_SUCCESS)
> > +return ret;
> > instead of
> > +if (ret)
> > +return ret;
> >
>
> Modified it to ret != BACKEND_VALIDATION_SUCCESS
>
> > 3) How about validate_backend_pid or some better name instead of
> > check_valid_pid?
> >
>
> Modified it to validate_backend_pid
>
> > 4) How about following
> > + errmsg("must be a superuser to print backtrace
> > of backend process")));
> > instead of
> > + errmsg("must be a superuser to print backtrace
> > of superuser query process")));
> >
>
> Here the message should include superuser, we cannot remove it. Non
> super user can log non super user provided if user has permissions for
> it.
>
> > 5) How about following
> >  errmsg("must be a member of the role whose backed
> > process's backtrace is being printed or member of
> > pg_signal_backend")));
> > instead of
> > + errmsg("must be a member of the role whose
> > backtrace is being logged or member of pg_signal_backend")));
> >
>
> Modified it.
>
> > 6) I'm not sure whether "backtrace" or "call stack" is a generic term
> > from the user/developer perspective. In the patch, the function name
> > and documentation says callstack(I think it is "call stack" actually),
> > but the error/warning messages says backtrace. IMHO, having
> > "backtrace" everywhere in the patch, even the function name changed to
> > pg_print_backtrace, looks better and consistent. Thoughts?
> >
>
> Modified it to pg_print_backtrace.
>
> > 7) How about following in pg_print_callstack?
> > {
> > intbt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);
> > boolresult = false;
> >
> > if (r == SIGNAL_BACKEND_SUCCESS)
> > {
> > if (EmitProcSignalPrintCallStack(bt_pid))
> > result = true;
> > else
> > ereport(WARNING,
> > (errmsg("failed to send signal to postmaster: 
> > %m")));
> > }
> >
> > PG_RETURN_BOOL(result);
> > }
> >
>
> Modified similarly with slight change.
>
> > 8) How about following
> > +(errmsg("backtrace generation is not supported by
> > this PostgresSQL installation")));
> > instead of
> > +(errmsg("backtrace generation is not supported by
> > this installation")));
> >
>
> I used the existing message to maintain consistency with
> set_backtrace. I feel we can keep it the same.
>
> > 9) Typo - it's "example" +2) Using "addr2line -e postgres address", For 
> > exmple:
> >
>
> Modified it.
>
> > 10) How about
> > + * Handle print backtrace signal
> > instead of
> > + * Handle receipt of an print backtrace.
> >
>
> I used the existing message to maintain consistency similar to
> HandleProcSignalBarrierInterrupt. I feel we can keep it the same.
>
> > 11) Isn't below in documentation specific to Linux platform. What
> > happens if GDB is not there on the platform?
> > +
> > +1)  "info line *address" from gdb on postgres executable. For example:
> > +gdb ./postgres
> > +GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
> >
>
> I have made changes "You can get the file name and line number by
> using gdb/addr2line in linux platforms, as a prerequisite users must
> ensure gdb/addr2line is already installed".
>
> User will get an error like this in windows:
> select pg_print_backtrace(pg_backend_pid());
> WARNING:  backtrace generation is not supported by this installation
>  pg_print_callstack
> 
>  f
> (1 row)
>
> The backtrace will not be logged in case of windows, it will throw a
> warning "backtrace generation is not supported by this installation"
> Thoughts?
>
> > 12) +The callstack will be logged in the log file. What happens if the
> > server is started without a log file , ./pg_ctl -D data start? Where
> > will the backtrace go?
> >
>
> Updated to: The backtrace will be logged to the log file if logging is
> enabled, if logging is disabled backtrace will be logged to the
> console where the postmaster was started.
>
> > 13) Not sure, if it's an overkill, but how about pg_print_callstack
> > returning a 

Re: Printing backtrace of postgres processes

2021-02-01 Thread Kyotaro Horiguchi
At Fri, 29 Jan 2021 19:10:24 +0530, vignesh C  wrote in 
> Attached v5 patch has the fixes for the same.

PMSIGNAL_BACKTRACE_EMIT is not used anywhere?
(the commit message seems stale.)

+++ b/src/bin/pg_ctl/t/005_backtrace.pl

pg_ctl doesn't (or no longer?) seem relevant to this patch.


+   eval {
+   $current_logfiles = slurp_file($node->data_dir . 
'/current_logfiles');
+   };
+   last unless $@;

Is there any reason not just to do "while (! -e http://gnu.org/licenses/gpl.html
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
+and "show warranty" for details.
+This GDB was configured as "x86_64-redhat-linux-gnu".
+For bug reporting instructions, please see:
+http://www.gnu.org/software/gdb/bugs/...
+Reading symbols from /home/postgresdba/inst/bin/postgres...done.

Almost all of the banner lines seem to be useless here.


 #define SIGNAL_BACKEND_SUCCESS 0
+#define BACKEND_VALIDATION_SUCCESS 0
 #define SIGNAL_BACKEND_ERROR 1
 #define SIGNAL_BACKEND_NOPERMISSION 2
 #define SIGNAL_BACKEND_NOSUPERUSER 3

Even though I can share the feeling that SIGNAL_BACKEND_SUCCESS is a
bit odd to represent "sending signal is allowed", I feel that it's
better using the existing symbol than using the new symbol.


+validate_backend_pid(int pid)

The function needs a comment. The name is somewhat
confusing. check_privilege_to_send_singal()?

Maybe the return value of the function should be changed to an enum,
and its callers should use switch-case to handle the value.


+   if (!SendProcSignal(bt_pid, PROCSIG_BACKTRACE_PRINT, 
InvalidBackendId))
+   PG_RETURN_BOOL(true);
+   else
+   ereport(WARNING,
+   (errmsg("failed to send signal 
to postmaster: %m")));
+   }
+
+   PG_RETURN_BOOL(result);

The variable "result" seems useless.


+   elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
+
+   errno = save_errno;
+}

You need to release the resouces held by the errtrace. And the
errtrace is a bit pointless. Why isn't it "backtrace"?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Printing backtrace of postgres processes

2021-01-31 Thread Bharath Rupireddy
On Mon, Feb 1, 2021 at 6:14 AM Bharath Rupireddy
 wrote:
> On Fri, Jan 29, 2021 at 7:10 PM vignesh C  wrote:
> > > 4) How about following
> > > + errmsg("must be a superuser to print backtrace
> > > of backend process")));
> > > instead of
> > > + errmsg("must be a superuser to print backtrace
> > > of superuser query process")));
> > >
> >
> > Here the message should include superuser, we cannot remove it. Non
> > super user can log non super user provided if user has permissions for
> > it.
> >
> > > 5) How about following
> > >  errmsg("must be a member of the role whose backed
> > > process's backtrace is being printed or member of
> > > pg_signal_backend")));
> > > instead of
> > > + errmsg("must be a member of the role whose
> > > backtrace is being logged or member of pg_signal_backend")));
> > >
> >
> > Modified it.
>
> Maybe I'm confused here to understand the difference between
> SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and
> corresponding error messages. Some clarification/use case to know in
> which scenarios we hit those error messages might help me. Did we try
> to add test cases that show up these error messages for
> pg_print_backtrace? If not, can we add?

Are these superuser and permission checks enough from a security
standpoint that we don't expose some sensitive information to the
user? Although I'm not sure, say from the backtrace printed and
attached to GDB, can users see the passwords or other sensitive
information from the system that they aren't supposed to see?

I'm sure this point would have been discussed upthread.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2021-01-31 Thread Bharath Rupireddy
On Fri, Jan 29, 2021 at 7:10 PM vignesh C  wrote:
> > 4) How about following
> > + errmsg("must be a superuser to print backtrace
> > of backend process")));
> > instead of
> > + errmsg("must be a superuser to print backtrace
> > of superuser query process")));
> >
>
> Here the message should include superuser, we cannot remove it. Non
> super user can log non super user provided if user has permissions for
> it.
>
> > 5) How about following
> >  errmsg("must be a member of the role whose backed
> > process's backtrace is being printed or member of
> > pg_signal_backend")));
> > instead of
> > + errmsg("must be a member of the role whose
> > backtrace is being logged or member of pg_signal_backend")));
> >
>
> Modified it.

Maybe I'm confused here to understand the difference between
SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION macros and
corresponding error messages. Some clarification/use case to know in
which scenarios we hit those error messages might help me. Did we try
to add test cases that show up these error messages for
pg_print_backtrace? If not, can we add?

> Attached v5 patch has the fixes for the same.
> Thoughts?

Thanks. Here are some comments on v5 patch:

1) typo - it's "process" not "porcess" +a backend porcess. For example:

2) select * from pg_print_backtrace(NULL);
postgres=# select proname, proisstrict from pg_proc where proname =
'pg_print_backtrace';
  proname   | proisstrict
+-
 pg_print_backtrace | t

 See the documentation:
 "proisstrict bool

Function returns null if any call argument is null. In that case the
function won't actually be called at all. Functions that are not
“strict” must be prepared to handle null inputs."
So below PG_ARGISNUL check is not needed, you can remove that, because
pg_print_backtrace will not get called with null input.
intbt_pid = PG_ARGISNULL(0) ? -1 : PG_GETARG_INT32(0);

3) Can we just set results = true instead of PG_RETURN_BOOL(true); so
that it will be returned from PG_RETURN_BOOL(result); just for
consistency?
if (!SendProcSignal(bt_pid, PROCSIG_BACKTRACE_PRINT,
InvalidBackendId))
PG_RETURN_BOOL(true);
else
ereport(WARNING,
(errmsg("failed to send signal to postmaster: %m")));
}

PG_RETURN_BOOL(result);

4) Below is what happens if I request for a backtrace of the
postmaster process? 1388210 is pid of postmaster.
postgres=# select * from pg_print_backtrace(1388210);
WARNING:  PID 1388210 is not a PostgreSQL server process
 pg_print_backtrace

 f

Does it make sense to have a postmaster's backtrace? If yes, can we
have something like below in sigusr1_handler()?
if (CheckPostmasterSignal(PMSIGNAL_BACKTRACE_EMIT))
{
LogBackTrace();
}

5) Can we have PROCSIG_PRINT_BACKTRACE instead of
PROCSIG_BACKTRACE_PRINT, just for readability and consistency with
function name?

6) I think it's not the postmaster that prints backtrace of the
requested backend and we don't send SIGUSR1 with
PMSIGNAL_BACKTRACE_EMIT to postmaster, I think it's the backends, upon
receiving SIGUSR1 with PMSIGNAL_BACKTRACE_EMIT signal, log their own
backtrace. Am I missing anything here? If I'm correct, we need to
change the below description and other places wherever we refer to
this description.

The idea here is to implement & expose pg_print_backtrace function, internally
what this function does is, the connected backend will send SIGUSR1 signal by
setting PMSIGNAL_BACKTRACE_EMIT to the postmaster process. Once the process
receives this signal it will log the backtrace of the process.

7) Can we get the parallel worker's backtrace? IIUC it's possible.

8) What happens if a user runs pg_print_backtrace() on Windows or
MacOS or some other platform? Do you want to say something about other
platforms where gdb/addr2line doesn't exist?
+
+You can get the file name and line number by using gdb/addr2line in
+linux platforms, as a prerequisite users must ensure gdb/addr2line is
+already installed:
+

9) Can't we reuse set_backtrace with just adding a flag to
set_backtrace(ErrorData *edata, int num_skip, bool
is_print_backtrace_function), making it a non-static function and call
set_backtrace(NULL, 0, true)?

void
set_backtrace(ErrorData *edata, int num_skip, bool is_print_backtrace_function)
{
StringInfoData errtrace;
---
---
if (is_print_backtrace_function)
elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data);
else
edata->backtrace = errtrace.data;
}

I think it will be good if we do this, because we can avoid duplicate
code in set_backtrace and LogBackTrace.

10) I think it's "pg_signal_backend" instead of "pg_print_backtrace"?
+backtrace is being printed or the calling role has been granted
+pg_print_backtrace, however only 

  1   2   >