Re: [HACKERS] Active zombies at AIX

Konstantin Knizhnik Mon, 06 Feb 2017 01:47:42 -0800

Last update on the problem.

Using kdb tool (thank's to Tony Reix for advice and help) we get thefollowing trace of Poastgres backend in existing stack trace:


pvthread+073000 STACK:
[005E1958]slock+000578 (00000000005E1958, 8000000000001032 [??])
[00009558].simple_lock+000058 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+000048 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+00008C ()
___ Recovery (FFFFFFFFFFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA


So there seems to be lock contention while unmapping memory segments.

My assumption was that Postgres is detaching all attached segmentsbefore exit (in shmem_exit callback or earlier).I have added logging around proc_exit_prepare function (which is calledby atexit callback) and check that it completes immediately.So I thought that this vm_map_deallocate can be related withdeallocation of normal (malloced) memory, because in Linux memoryallocator may use mmap.

But in AIX it is not true.

Below is report of Bergamini Demien (once again a lot of thanks forhelp with investigation the problem):

The memory allocator in AIX libc does not use mmap and vm_relalias() isonly called for shared memory mappings.

I talked with the AIX VMM expert at IBM and he said that what you hit isone of the most common performance bottlenecks in AIX memory management.

He also said that SysV Shared Memory (shmget/shmat) perform better onAIX than mmap.

Some improvements have been made in AIX 6.1 (see “perf suffers whenprocs sharing the same segs all exit at once”:http://www-01.ibm.com/support/docview.wss?uid=isg1IZ83819) but it doesnot help in your case.

In src/backend/port/sysv_shmem.c, it says that PostgreSQL 9.3 switchedfrom using SysV Shared Memory to using mmap.

Maybe you could try to switch back to using SysV Shared Memory on AIX tosee if it helps performance-wise.

Also, the good news is that there are some restricted tunables in AIXthat can be tweaked to help different workloads which may have differentdemands.


One of them is relalias_percentage which works with force_relalias_lite:

# vmo -h relalias_percentage

Help for tunable relalias_percentage:

Purpose:

If force_relalias_lite is set to 0, then this specifies the factor usedin the heuristic to decide whether to avoid locking the source mmappedsegment or not.


Values:

        Default: 0
        Range: 0 - 32767
        Type: Dynamic
        Unit:

Tuning:

This is used when tearing down an mmapped region and is a scalabilitystatement, where avoiding the lock may help system throughput, but, insome cases, at the cost of more compute time used. If the number ofpages being unmapped is less than this value divided by 100 andmultiplied by the total number of pages in memory in the source mmappedsegment, then the source lock will be avoided. A value of 0 forrelalias_percentage, with force_relalias_lite also set to 0, will causethe source segment lock to always be taken. Effective values forrelalias_percentage will vary by workload, however, a suggested valueis: 200.



You may also try to play with the munmap_npages vmo tunable.

Your vmo settings for lgpg_size, lgpg_regions and v_pinshm already seemcorrect.



On 24.01.2017 18:08, Konstantin Knizhnik wrote:

Hi hackers,
Yet another story about AIX. For some reasons AIX very slowly cleaningzombie processes.If we launch pgbench with -C parameter then very soon limit formaximal number of connections is exhausted.If maximal number of connection is set to 1000, then after ten secondsof pgbench activity we get about 900 zombie processes and it takesabout 100 seconds (!)
before all of them are terminated.

proctree shows a lot of defunt processes:

[14:44:41]root@postgres:~ # proctree 26084446
26084446 /opt/postgresql/xlc/9.6/bin/postgres -D /postg_fs/postgresql/xlc
4784362 <defunct>
4980786 <defunct>
11403448 <defunct>
11468930 <defunct>
11993176 <defunct>
12189710 <defunct>
12517390 <defunct>
13238374 <defunct>
13565974 <defunct>
13893826 postgres: wal writer process
14024716 <defunct>
15401000 <defunct>
...
25691556 <defunct>

But ps shows that status of process is <existing>

[14:46:02]root@postgres:~ # ps -elk | grep 25691556

  * A - 25691556 - - - - - <exiting>
Breakpoint set in reaper() function in postmaster shows that eachinvocation of this functions (called by SIGCHLD handler) proceed 5-10PIDS per invocation.So there are two hypothesis: either AIX is very slowly deliveringSIGCHLD to parent, either exit of process takes too much time.
The fact the backends are in exiting state makes second hypothesismore reliable.We have tried different Postgres configurations with local and TCPsockets, with different amount of shared buffers and built both withgcc and xlc.
In all cases behavior is similar: zombies do not want to die.
As far as it is not possible to attach debugger to defunct process, itis not clear how to understand what's going on.
I wonder if somebody has encountered similar problems at AIX and maybe can suggest some solution to solve this problem.
Thanks in advance
--
Konstantin Knizhnik
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Active zombies at AIX

Reply via email to