Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked <exiting> in ps, this is because the flag SEXIT is 
set, thus the process is blocked somewhere in the kexitx() syscall, waiting for 
something.

- In order to know what it is waiting for, the best would be to have a look 
with kdb.

- It is waiting either for an asynchronous I/O to complete or, if the process 
is multithreaded, for a thread to terminate.

- Using the proctree command to analyze the issue is not a good idea, since 
the process will block in kexitx() while an operation on /proc is in progress.

- If the process is marked <defunct>, that means that the parent process has 
not yet called waitpid() to collect the child's status. Maybe the parent is 
blocked in non-interruptible code where the signal handler cannot be called.

- In short, there are many possible causes. Using kdb is the best way to find 
out.

- Instead of proctree (which makes use of /proc), use: "ps -faT <pid>".
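
To illustrate the <defunct> case above, here is a minimal sketch (in Python, 
for brevity; the function name and the exit code are made up for this example 
and are not taken from PostgreSQL). A terminated child stays in the process 
table until its parent collects its status with waitpid():

```python
import os
import time


def spawn_and_reap(exit_code=3, delay=0.0):
    """Fork a child that exits at once, then reap it and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: terminate immediately, skipping the parent's cleanup handlers.
        os._exit(exit_code)

    # Parent: between the child's exit and the waitpid() call below, the child
    # is a zombie (shown as <defunct> by ps). A longer delay widens that window.
    time.sleep(delay)

    # waitpid() collects the exit status and removes the process table entry.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # the child was terminated by a signal instead


if __name__ == "__main__":
    print("child exit code:", spawn_and_reap(exit_code=3, delay=0.5))
```

Running it with a delay of several seconds and checking ps from another 
terminal should show the child as <defunct> until the parent calls waitpid().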


I'll try to reproduce here.

Regards

Tony

On 01/02/2017 at 21:26, Konstantin Knizhnik wrote:
On 02/01/2017 08:30 PM, REIX, Tony wrote:

....

About the zombie issue, I've discussed it with my colleagues. It looks like 
the process remains a zombie until the parent collects its status. However, 
though I have done that several times, I do not remember the details well. And 
that should not be specific to AIX. Tomorrow I'll discuss it with another 
colleague, who should understand this better than I do.

1. The process is not in the zombie state (according to ps). It is in the 
<exiting> state... This is something AIX-specific; I have not seen processes 
in this state on Linux.
2. I have implemented a simple test, a forkbomb. It creates 1000 children and 
then waits for them. It is about ten times slower than on Intel/Linux, but 
still much faster than 100 seconds. So there is some difference between a 
Postgres backend and a dummy process that does nothing and just terminates 
immediately after returning from fork().
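
The original test program is not included in this thread. A minimal sketch of 
such a fork-and-wait test might look like the following (Python is used for 
brevity, the function name is hypothetical, and timings will not be 
comparable with those of a C program):

```python
import os
import time


def fork_and_wait(n=1000):
    """Fork n children that exit immediately, wait for all of them, and
    return (number of children reaped, elapsed seconds)."""
    start = time.monotonic()
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: do nothing and terminate right after fork() returns.
            os._exit(0)
        pids.append(pid)

    reaped = 0
    for pid in pids:
        os.waitpid(pid, 0)  # blocks until this particular child has exited
        reaped += 1
    return reaped, time.monotonic() - start


if __name__ == "__main__":
    count, elapsed = fork_and_wait(1000)
    print(f"reaped {count} children in {elapsed:.3f} s")
```
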
....

Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have 2 failing tests (at least; I tested with "check" and 
not "check-world" at that time): create_aggregate and aggregates.


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3, but rather another sub-version. Or 
are you passing more options to configure?


Configure.

What options do you pass to configure?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"



Heavy load & 64 cores? OK. That clearly explains why I do not see this issue.


pgbench? I wanted to run it. However, I'm still looking for where to get it, 
plus a guide for using it for testing.

pgbench is part of the Postgres distribution (src/bin/pgbench)



I would like to add such tests when building my PostgreSQL RPMs on AIX. So any 
help is welcome!


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any 
idea how I should proceed ? Any PostgreSQL performance benchmark that I could 
find and use ? pgbench ?

pgbench is the most widely used tool for simulating an OLTP workload. Admittedly 
it is quite primitive and its results are rather artificial. TPC-C seems to be 
a better choice.
But the best option is to implement your own benchmark simulating the actual 
workload of your real application.
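
As a starting point for a pgbench run, here is a minimal sketch that only 
builds the two usual command lines: initialization (-i with scale factor -s), 
then a timed run (-c clients, -j worker threads, -T duration in seconds). The 
helper name, the database name "bench" and the default values are assumptions 
chosen for illustration, not recommended settings:

```python
import shlex


def pgbench_commands(dbname, scale=10, clients=8, threads=2, seconds=60):
    """Return (init_argv, run_argv) for a basic pgbench session.

    The commands are only built here, not executed; pass each list to
    subprocess.run() once a server and the target database exist.
    """
    # Step 1: create and populate the pgbench tables at the given scale factor.
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    # Step 2: run the built-in TPC-B-like workload for a fixed duration.
    run = ["pgbench", "-c", str(clients), "-j", str(threads),
           "-T", str(seconds), dbname]
    return init, run


if __name__ == "__main__":
    for argv in pgbench_commands("bench"):
        print(shlex.join(argv))
```

Using the same scale, client count and duration on AIX and on Linux/PPC64 
keeps the two runs comparable.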


- I'm interested in any information for improving the performance & quality of 
my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are 
free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) 
has been selling IBM Power machines under the Escala brand for ages (25 years 
this year)).


How to help ?

How could I help improve the quality and performance of PostgreSQL on AIX?

We still have one open issue on AIX: see 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It would be great if you could somehow help to fix this problem.




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




