> From: "Kevin Grittner" <kgri...@gmail.com>
> <liu-m...@mails.tsinghua.edu.cn> wrote:
>
> > "vmstat 1" output is as follow. Because I used only 30 cores (1/4 of all),
> > cpu user time should be about 12*4 = 48.
> > There seems to be no process blocked by IO.
> >
> > procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> >  r  b swpd      free   buff    cache si so bi bo    in     cs us sy id wa st
> > 28  0    0 981177024 315036 70843760  0  0  0  9     0      0  1  0 99  0  0
> > 21  1    0 981178176 315036 70843784  0  0  0  0 25482 329020 12  3 85  0  0
> > 18  1    0 981179200 315036 70843792  0  0  0  0 26569 323596 12  3 85  0  0
> > 17  0    0 981175424 315036 70843808  0  0  0  0 25374 322992 12  4 85  0  0
> > 12  0    0 981174208 315036 70843824  0  0  0  0 24775 321577 12  3 85  0  0
> >  8  0    0 981179328 315036 70845336  0  0  0  0 13115 199020  6  2 92  0  0
> > 13  0    0 981179200 315036 70845792  0  0  0  0 22893 301373 11  3 87  0  0
> > 11  0    0 981179712 315036 70845808  0  0  0  0 26933 325728 12  4 84  0  0
> > 30  0    0 981178304 315036 70845824  0  0  0  0 23691 315821 11  4 85  0  0
> > 12  1    0 981177600 315036 70845832  0  0  0  0 29485 320166 12  4 84  0  0
> > 32  0    0 981180032 315036 70845848  0  0  0  0 25946 316724 12  4 84  0  0
> > 21  0    0 981176384 315036 70845864  0  0  0  0 24227 321938 12  4 84  0  0
> > 21  0    0 981178880 315036 70845880  0  0  0  0 25174 326943 13  4 83  0  0
>
> This machine has 120 cores?  Is hyperthreading enabled?  If so, what
> you are showing might represent a total saturation of the 30 cores.
> Context switches of about 300,000 per second is pretty high.  I can't
> think of when I've seen that except when there is high spinlock
> contention.
>
Yes, it has 120 cores, and hyper-threading is disabled.

> Just to put the above in context, how did you limit the test to 30
> cores?  How many connections were open during the test?
>

I used numactl to restrict the test to the first two sockets (15 cores per
socket), and there were 90 concurrent connections.

> > The flame graph is attached. I use 'perf' to generate the flame graph. Only
> > the CPUs running PG server are profiled.
> > I'm not familiar with other part of PG. Can you find anything unusual in
> > the graph?
>
> Two SSI functions stand out:
> 10.86% PredicateLockTuple
> 3.51% CheckForSerializableConflictIn
>
> In both cases, most of that seems to go to lightweight locking.  Since
> you said this is a CPU graph, that again suggests spinlock contention
> issues.
>
> --

Yes. Are there any other kinds of locks besides spinlocks? I'm reading about
the locks in PG now.
If every lock were a spinlock, waiting backends would keep spinning and CPU
usage should be close to 100%. But only about 50% of the CPU is in use, so I
suspect extra time is being spent waiting on a mutex or semaphore.
These SSI functions will then cost more time than reported, because perf
doesn't record the time spent sleeping and waiting for locks.
CheckForSerializableConflictIn takes 10% of running time. (refer to my last
email)

--
Mengxing Liu
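
P.S. The "12*4 = 48" estimate can be checked against every sample rather than one. Below is a minimal Python sketch of that arithmetic. It assumes the vmstat percentages are relative to all 120 cores and that essentially all of the load comes from the 30 pinned cores; the thread does not confirm the second point (client processes may run elsewhere). The name scaled_utilization is made up for illustration. The first vmstat row is skipped because it is the since-boot average.

```python
# (us, sy) pairs copied from the "vmstat 1" output quoted in this thread,
# in percent of ALL CPUs. The first vmstat row (since-boot average) is omitted.
SAMPLES = [(12, 3), (12, 3), (12, 4), (12, 3), (6, 2), (11, 3),
           (12, 4), (11, 4), (12, 4), (12, 4), (12, 4), (13, 4)]

TOTAL_CORES = 120   # stated in the thread
PINNED_CORES = 30   # stated in the thread: two sockets x 15 cores


def scaled_utilization(samples, total_cores, pinned_cores):
    """Return mean (user, system) percent relative to the pinned cores only.

    Assumption: all measured load runs on the pinned cores, so a
    machine-wide percentage is multiplied by total_cores / pinned_cores.
    """
    scale = total_cores / pinned_cores
    n = len(samples)
    mean_us = sum(us for us, _ in samples) / n
    mean_sy = sum(sy for _, sy in samples) / n
    return mean_us * scale, mean_sy * scale


if __name__ == "__main__":
    us, sy = scaled_utilization(SAMPLES, TOTAL_CORES, PINNED_CORES)
    print(f"user {us:.0f}%  system {sy:.0f}%  "
          f"busy {us + sy:.0f}% of the pinned cores")
```

Adding system time to user time gives a somewhat higher busy figure than user time alone, but under these assumptions it still leaves a sizable share of the pinned cores idle, which fits the suspicion that backends are sleeping on locks rather than spinning.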