Re: CSStorm occurred again by postgreSQL8.2. (Re: [HACKERS] poor performance with Context Switch Storm at TPC-W.)
"Tom Lane <[EMAIL PROTECTED]>" wrote: > Katsuhiko Okano <[EMAIL PROTECTED]> writes: > > It does not solve, even if it increases the number of NUM_SUBTRANS_BUFFERS. > > The problem was only postponed. > > Can you provide a reproducible test case for this? Seven machines are required in order to perform measurement. (DB*1,AP*2,CLient*4) Enough work load was not able to be given in two machines. (DB*1,{AP+CL}*1) It was not able to reappear to a multiplex run of pgbench or a simple SELECT query. TPC-W of a work load tool used this time is a full scratch. Regrettably it cannot open to the public. If there is a work load tool of a free license, I would like to try. I will show if there is information required for others. The patch which outputs the number of times of LWLock was used this time. The following is old example output. FYI. # SELECT * FROM pg_stat_lwlocks; kind | pg_stat_get_lwlock_name | sh_call | sh_wait | ex_call | ex_wait | sleep --+++---+---+---+--- 0 | BufMappingLock | 559375542 | 33542 |320092 | 24025 | 0 1 | BufFreelistLock| 0 | 0 |370709 | 47 | 0 2 | LockMgrLock| 0 | 0 | 41718885 | 734502 | 0 3 | OidGenLock | 33 | 0 | 0 | 0 | 0 4 | XidGenLock | 12572279 | 10095 | 11299469 | 20089 | 0 5 | ProcArrayLock |8371330 | 72052 | 16965667 | 603294 | 0 6 | SInvalLock | 38822428 | 435 | 25917 | 128 | 0 7 | FreeSpaceLock | 0 | 0 | 16787 | 4 | 0 8 | WALInsertLock | 0 | 0 | 1239911 | 885 | 0 9 | WALWriteLock | 0 | 0 | 69907 | 5589 | 0 10 | ControlFileLock| 0 | 0 | 16686 | 1 | 0 11 | CheckpointLock | 0 | 0 |34 | 0 | 0 12 | CheckpointStartLock| 69509 | 0 |34 | 1 | 0 13 | CLogControlLock| 0 | 0 |236763 | 183 | 0 14 | SubtransControlLock| 0 | 0 | 753773945 | 205273395 | 0 15 | MultiXactGenLock | 66 | 0 | 0 | 0 | 0 16 | MultiXactOffsetControlLock | 0 | 0 |35 | 0 | 0 17 | MultiXactMemberControlLock | 0 | 0 |34 | 0 | 0 18 | RelCacheInitLock | 0 | 0 | 0 | 0 | 0 19 | BgWriterCommLock | 0 | 0 | 61457 | 1 | 0 20 | TwoPhaseStateLock | 33 | 0 | 0 | 0 | 0 21 | 
TablespaceCreateLock | 0 | 0 | 0 | 0 | 0 22 | BufferIO | 0 | 0 |695627 | 16 | 0 23 | BufferContent | 3568231805 | 1897 | 1361394 | 829 | 0 24 | CLog | 0 | 0 | 0 | 0 | 0 25 | SubTrans | 138571621 | 143208883 | 8122181 | 8132646 | 0 26 | MultiXactOffset| 0 | 0 | 0 | 0 | 0 27 | MultiXactMember| 0 | 0 | 0 | 0 | 0 (28 rows) I am pleased if interested. regards, Katsuhiko Okano okano katsuhiko _at_ oss ntt co jp ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
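To make the pg_stat_lwlocks output above easier to read, the following is a minimal sketch that ranks a few of the busiest locks by waits per call. The rows are copied from that output. The helper names (wait_ratio, rank_by_contention) are made up for illustration, and because the exact meaning of the counters is defined by the unpublished patch, the ratio should be taken only as a rough indicator of contention.

```python
# Selected rows from the pg_stat_lwlocks output in this message:
# (name, sh_call, sh_wait, ex_call, ex_wait)
ROWS = [
    ("BufMappingLock", 559375542, 33542, 320092, 24025),
    ("LockMgrLock", 0, 0, 41718885, 734502),
    ("XidGenLock", 12572279, 10095, 11299469, 20089),
    ("ProcArrayLock", 8371330, 72052, 16965667, 603294),
    ("SubtransControlLock", 0, 0, 753773945, 205273395),
    ("BufferContent", 3568231805, 1897, 1361394, 829),
    ("SubTrans", 138571621, 143208883, 8122181, 8132646),
]


def wait_ratio(calls, waits):
    """Waits per call; 0.0 when the lock was never requested."""
    return waits / calls if calls else 0.0


def rank_by_contention(rows):
    """Return (name, ratio) pairs, highest combined wait ratio first."""
    ranked = []
    for name, sh_call, sh_wait, ex_call, ex_wait in rows:
        # Combine shared and exclusive counters into one ratio per lock.
        ratio = wait_ratio(sh_call + ex_call, sh_wait + ex_wait)
        ranked.append((name, ratio))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    for name, ratio in rank_by_contention(ROWS):
        print(f"{name:<20} {ratio:.6f}")
```

On these numbers the SubTrans buffer locks and SubtransControlLock rank first and second, well ahead of ProcArrayLock and LockMgrLock.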
Re: CSStorm occurred again by postgreSQL8.2. (Re: [HACKERS] poor performance with Context Switch Storm at TPC-W.)
Katsuhiko Okano <[EMAIL PROTECTED]> writes:
> Increasing NUM_SUBTRANS_BUFFERS does not solve it.
> The problem was only postponed.

Can you provide a reproducible test case for this?

regards, tom lane
CSStorm occurred again by postgreSQL8.2. (Re: [HACKERS] poor performance with Context Switch Storm at TPC-W.)
Katsuhiko Okano wrote:
> With PostgreSQL 8.2, NUM_SUBTRANS_BUFFERS was changed to 128,
> and I recompiled and measured again.
> CSStorm did NOT occur. The value of WIPS was about 400.

I measured again.
CSStorm did not occur in a 30-minute run,
but in a 3-hour run it occurred after 1 hour and 10 minutes had passed.
Increasing NUM_SUBTRANS_BUFFERS does not solve it.
The problem was only postponed.

> If the number of SLRU buffers is too low,
> then in PostgreSQL 8.1.4 as well, increasing the number of buffers
> should, I think, bring the same result.
> (The buffers for CLOG and multi-transaction also increase,
> but I think that effect is small.)
>
> Now, NUM_SLRU_BUFFERS has been changed to 128 in PostgreSQL 8.1.4
> and is under measurement.

With version 8.1.4, CSStorm likewise occurred after 1 hour and 10 minutes.

One strange point:
the number of LWLock acquisitions for the LRU buffers is 0 until CSStorm occurs.
After CSStorm occurs, both shared and exclusive locks are requested, and most lock requests are kept waiting.
(Exclusive lock requests for SubtransControlLock increase rapidly after CSStorm starts.)

Is different processing performed depending on whether CSStorm has occurred?

regards,
Katsuhiko Okano
okano katsuhiko _at_ oss ntt co jp
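One possible reading of "the problem was only postponed", offered purely as an assumption and not as a diagnosis of PostgreSQL's behavior, is that a fixed-size LRU buffer pool works well until the set of pages being touched grows past the pool size, after which nearly every lookup misses. The toy model below is a sketch of that general LRU property only. The function name miss_rate, the cyclic access pattern, and the page counts are all hypothetical and are not taken from the PostgreSQL source or from the measurements in this thread.

```python
from collections import OrderedDict


def miss_rate(num_buffers, pages_touched, accesses=10_000):
    """Toy LRU model: cycle over `pages_touched` distinct pages using a
    pool of `num_buffers` slots and return the fraction of misses."""
    cache = OrderedDict()
    misses = 0
    for i in range(accesses):
        page = i % pages_touched          # hypothetical cyclic access pattern
        if page in cache:
            cache.move_to_end(page)       # hit: mark as most recently used
        else:
            misses += 1                   # miss: the page must be loaded
            cache[page] = True
            if len(cache) > num_buffers:
                cache.popitem(last=False)  # evict the least recently used page
    return misses / accesses


if __name__ == "__main__":
    for buffers in (32, 128):
        for pages in (buffers, buffers + 1):
            rate = miss_rate(buffers, pages)
            print(f"buffers={buffers:3d} pages={pages:3d} miss_rate={rate:.4f}")
```

In this model a larger pool only raises the number of pages at which the miss rate jumps; it does not remove the jump. Whether the same mechanism is behind the delayed CSStorm reported here is not established by the measurements in this thread.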