Re: [HACKERS] Questions regarding signal handler of postmaster

2016-12-26 Thread Tatsuo Ishii
> But we keep signals blocked almost all the time in the postmaster,
> so in reality no signal handler can interrupt anything except the
> select() wait call.  People complain about that coding technique
> all the time, but no one has presented any reason to believe that
> it's broken.

OK, there seems to be no better solution than always blocking signals.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions regarding signal handler of postmaster

2016-12-26 Thread Tatsuo Ishii
> I encountered that problem with postmaster and fixed it in 9.4.0 (it's not 
> back-patched to earlier releases because it's relatively complex).
> 
> https://www.postgresql.org/message-id/20DAEA8949EC4E2289C6E8E58560DEC0@maumau
> 
> 
> [Excerpt from 9.4 release note]
> During crash recovery or immediate shutdown, send uncatchable termination 
> signals (SIGKILL) to child processes that do not shut down promptly (MauMau, 
> Álvaro Herrera)
> This reduces the likelihood of leaving orphaned child processes behind after 
> postmaster shutdown, as well as ensuring that crash recovery can proceed if 
> some child processes have become “stuck”.

Looks wild to me :-) I hope there is a better way to solve the problem...

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] Questions regarding signal handler of postmaster

2016-12-26 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii
> In postmaster.c, the signal handler pmdie() calls ereport() and
> errmsg_internal(), which can call palloc() and then malloc() if necessary.
> Because pmdie() can be invoked while the postmaster is inside malloc(),
> I think a deadlock could occur through the internal locking inside
> malloc(). I have not observed this exact case in PostgreSQL, but I have
> seen a suspected case in Pgpool-II. In frame #14 of the stack trace,
> malloc() is called by Pgpool-II. It is interrupted by a signal at frame
> #11, the signal handler calls malloc() again, and it gets stuck at
> frame #0.

I encountered that problem with postmaster and fixed it in 9.4.0 (it's not 
back-patched to earlier releases because it's relatively complex).

https://www.postgresql.org/message-id/20DAEA8949EC4E2289C6E8E58560DEC0@maumau


[Excerpt from 9.4 release note]
During crash recovery or immediate shutdown, send uncatchable termination 
signals (SIGKILL) to child processes that do not shut down promptly (MauMau, 
Álvaro Herrera)
This reduces the likelihood of leaving orphaned child processes behind after 
postmaster shutdown, as well as ensuring that crash recovery can proceed if 
some child processes have become “stuck”.
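
To illustrate the escalation the release note describes, here is a minimal
sketch in C. It is not the postmaster's code: the function names
stop_child_with_escalation() and spawn_stubborn_child(), the polling
interval, and the grace period are all hypothetical, chosen only for this
example. The parent sends a catchable signal first, polls for the child's
exit, and falls back to SIGKILL once the grace period has passed.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask a child to stop with polite_signal, then send
 * SIGKILL if it has not exited within grace_ms milliseconds.
 * Returns the signal that terminated the child, 0 if it exited normally,
 * or -1 on error.
 */
int stop_child_with_escalation(pid_t child, int polite_signal, int grace_ms)
{
    int status = 0;
    int waited_ms = 0;
    struct timespec tick = {0, 10 * 1000 * 1000};   /* poll every 10 ms */

    if (kill(child, polite_signal) != 0)
        return -1;

    for (;;)
    {
        pid_t r = waitpid(child, &status, WNOHANG);

        if (r == child)
            break;              /* child is gone */
        if (r < 0)
            return -1;
        if (waited_ms >= grace_ms)
        {
            /* Grace period is over: SIGKILL cannot be caught or ignored. */
            kill(child, SIGKILL);
            if (waitpid(child, &status, 0) != child)
                return -1;
            break;
        }
        nanosleep(&tick, NULL);
        waited_ms += 10;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/*
 * Hypothetical demo helper: fork a child that ignores ignored_signal and
 * then sleeps forever, standing in for a "stuck" child process.
 */
pid_t spawn_stubborn_child(int ignored_signal)
{
    /* Set SIG_IGN before fork() so the child inherits it without a race. */
    void (*old_handler)(int) = signal(ignored_signal, SIG_IGN);
    pid_t pid = fork();

    if (pid == 0)
    {
        for (;;)
            pause();
    }
    signal(ignored_signal, old_handler);    /* restore parent's disposition */
    return pid;
}
```

With a child that ignores SIGTERM, stop_child_with_escalation(child,
SIGTERM, 50) should report SIGKILL; with a child that does not ignore it,
the same call should report SIGTERM and never escalate.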

Regards
Takayuki Tsunakawa






Re: [HACKERS] Questions regarding signal handler of postmaster

2016-12-26 Thread Tom Lane
Tatsuo Ishii writes:
> In postmaster.c, the signal handler pmdie() calls ereport() and
> errmsg_internal(), which can call palloc() and then malloc() if
> necessary. Because pmdie() can be invoked while the postmaster is
> inside malloc(), I think a deadlock could occur through the internal
> locking inside malloc().

But we keep signals blocked almost all the time in the postmaster,
so in reality no signal handler can interrupt anything except the
select() wait call.  People complain about that coding technique
all the time, but no one has presented any reason to believe that
it's broken.

regards, tom lane




[HACKERS] Questions regarding signal handler of postmaster

2016-12-26 Thread Tatsuo Ishii
In postmaster.c, the signal handler pmdie() calls ereport() and
errmsg_internal(), which can call palloc() and then malloc() if
necessary. Because pmdie() can be invoked while the postmaster is
inside malloc(), I think a deadlock could occur through the internal
locking inside malloc(). I have not observed this exact case in
PostgreSQL, but I have seen a suspected case in Pgpool-II. In frame #14
of the stack trace, malloc() is called by Pgpool-II. It is interrupted
by a signal at frame #11, the signal handler calls malloc() again, and
it gets stuck at frame #0.

So my question is, is my concern about PostgreSQL valid?
If so, how can we fix it?

#0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x7f67fe20ccba in _L_lock_12808 () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f67fe20a6b5 in __GI___libc_malloc (bytes=15) at malloc.c:2887
#3  0x7f67fe21072a in __GI___strdup (s=0x7f67fe305dd8 "/etc/localtime") at strdup.c:42
#4  0x7f67fe239f51 in tzset_internal (always=, explicit=explicit@entry=1) at tzset.c:444
#5  0x7f67fe23a913 in __tz_convert (timer=timer@entry=0x7ffce1c1b7f8, use_localtime=use_localtime@entry=1, tp=tp@entry=0x7f67fe54bde0 <_tmbuf>) at tzset.c:632
#6  0x7f67fe2387d1 in __GI_localtime (t=t@entry=0x7ffce1c1b7f8) at localtime.c:42
#7  0x0045627b in log_line_prefix (buf=buf@entry=0x7ffce1c1b8d0, line_prefix=, edata=) at ../../src/utils/error/elog.c:2059
#8  0x0045894d in send_message_to_server_log (edata=0x753320 ) at ../../src/utils/error/elog.c:2084
#9  EmitErrorReport () at ../../src/utils/error/elog.c:1129
#10 0x00456d8e in errfinish (dummy=) at ../../src/utils/error/elog.c:434
#11 0x00421f57 in die (sig=2) at protocol/child.c:925
#12
#13 _int_malloc (av=0x7f67fe546760 , bytes=4176) at malloc.c:3302
#14 0x7f67fe20a6c0 in __GI___libc_malloc (bytes=4176) at malloc.c:2891

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

