Re: [PATCHES] [HACKERS] Hint Bits and Write I/O
Simon Riggs wrote: When running a VACUUM command we always dirty the block when setting hint bits, for a number of reasons: * VACUUM FULL expects all hint bits to be set prior to moving tuples * Setting all hint bits allows us to truncate the clog * it forces the VACUUM to write out its own dirty buffers, which is OK, since it is a background process. Other commands call HeapTupleSatisfiesVacuum(), yet these tasks can be more flexible with hint bit setting. These include ANALYZE, CREATE INDEX, CLUSTER, HOT pruning and index scan marking deleted tuples (with changes in all index AMs). This means we have to differentiate between VACUUM and other callers of HeapTupleSatisfiesVacuum(). So the patch changes the APIs of HeapTupleSatisfiesVacuum(), SetBufferCommitInfoNeedsSave() and SetHintBits() with changes to 13 AM and command files. There are many changes in tqual.c, which seems the right way because SetHintBits() is inlined. These make the patch fairly large, though most of it is simple changes. If only VACUUM is going to set flexible to off, maybe it's better to leave the APIs as they are and have a global that's set by VACUUM only (and reset in a PG_CATCH block). -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] small typo in DTrace docs
On Fri, 2008-06-13 at 01:49 -0300, Euler Taveira de Oliveira wrote: Attached is a small patch to fix some typos in probes.d path. Applied, thanks. -Neil -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
[PATCHES] Rewrite sinval messaging to reduce contention
In this thread back in January http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php we saw evidence that contention for SInvalLock was a problem, and we (or at least I) concluded that the sinval messaging system needed a nontrivial rewrite to get rid of its habit of sending everyone to read the queue at the same time. Here is the proposed rewrite. The major changes are: * When the queue overflows, we only issue a cache reset to the specific backend or backends that still haven't read the oldest message, rather than resetting everyone as in the original coding. The reset-everyone approach forced unnecessary cache-rebuilding work in backends that weren't actually behind. * When we observe backend(s) falling well behind, we signal SIGUSR1 to only one backend, the one that is furthest behind and doesn't already have a signal outstanding for it. When it finishes catching up, it will pass on the SIGUSR1 to the next-furthest-back guy, if there is one that is far enough behind to justify a signal. This changes the former behavior where every backend in the system might try to hit the queue at the same time into a daisy chain where only one backend is catching up at a time; furthermore, backends that aren't well behind aren't forced to catch up too. * We don't attempt to clean out dead messages after every message-receipt operation; rather, we do it on the insertion side, and only when the queue fullness passes certain thresholds. (As presented, the thresholds are 8/16, 9/16, 10/16, etc, but this could be adjusted by changing a couple of #defines.) This greatly reduces the amount of contention for exclusive hold on SInvalLock. * Readers attempt to pull multiple messages (as presented, up to 32) out of the queue for each acquisition of SInvalLock. Although they take the lock in shared mode and thus aren't contending at the LWLock level, I thought this was important to do to reduce contention for the lock's spinlock --- the prevalence of s_lock() in the before oprofile convinces me that there was just too much traffic through the lock itself. Note: PMSIGNAL_WAKEN_CHILDREN is now unused and should be removed, but I didn't include that change in the patch. The test case I'm using looks like this: $ cat createdrop.sql create temp table register_reqs(full_key integer, register_index integer) WITHOUT OIDS ON COMMIT DROP; $ pgbench -n -f createdrop.sql -c 40 -t 1000 bench On my machine CVS HEAD gets about 485 transactions/sec on this; with the patch that improves to 700 tps. oprofile says that the routines above 1% of CPU in CVS HEAD are CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 10 samples %image name symbol name 363116 11.2433 postgres CatalogCacheIdInvalidate 3166089.8032 postgres s_lock 3058819.4711 postgres LWLockAcquire 2727178.4442 postgres LWLockRelease 2348127.2706 postgres CatalogCacheFlushRelation 2154456.6709 postgres hash_search_with_hash_value 78544 2.4320 postgres _bt_compare 62007 1.9199 postgres SIGetDataEntry 48614 1.5052 postgres AllocSetAlloc 48033 1.4873 postgres XLogInsert 41280 1.2782 postgres PinBuffer 40659 1.2589 postgres LocalExecuteInvalidationMessage 36906 1.1427 postgres ResetCatalogCache 35708 1.1056 postgres hash_any 34151 1.0574 postgres PrepareToInvalidateCacheTuple and with the patch CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated) Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 10 samples %image name symbol name 550342 13.5424 postgres CatalogCacheIdInvalidate 3626848.9246 postgres hash_search_with_hash_value 1992914.9040 postgres LWLockAcquire 1982774.8790 postgres CatalogCacheFlushRelation 1497163.6841 postgres _bt_compare 1140062.8054 postgres LWLockRelease 1031382.5379 postgres XLogInsert 84471 2.0786 postgres PrepareToInvalidateCacheTuple 78464 1.9308 postgres LocalExecuteInvalidationMessage 75336 1.8538 postgres PinBuffer 74837 1.8415 postgres AllocSetAlloc 63764 1.5691 postgres SIGetDataEntries 60468 1.4879 postgres hash_any 51662 1.2713 libc-2.7.so memcpy 43999 1.0827 libc-2.7.so memcmp 43407 1.0681 postgres _bt_checkpage
Re: [PATCHES] Simplify formatting.c
Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: I moved str_initcap() over into oracle_compat.c and then had initcap() convert to/from TEXT to call it. The code is a little weird because str_initcap() needs to convert to text to use texttowcs(), so in multibyte encodings initcap converts the string to text, then to char, then to text to call texttowcs(). I didn't see a cleaner way to do this. Why not use wchar2char? It seems there's room for extra cleanup here. Also, the prototype of str_initcap in builtins.h looks out of place. I talked to Alvaro on IM, and there is certainly much more cleanup to do in this area. I will work from the bottom up. First, is moving the USE_WIDE_UPPER_LOWER define to c.h, and removing TS_USE_WIDE and using USE_WIDE_UPPER_LOWER instead. Patch attached and applied. The second step is to move wchar2char() and char2wchar() from tsearch into /mb to be easier to use for other modules; also move pnstrdup(). Patch attached and applied. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/tsearch/ts_locale.c === RCS file: /cvsroot/pgsql/src/backend/tsearch/ts_locale.c,v retrieving revision 1.8 diff -c -c -r1.8 ts_locale.c *** src/backend/tsearch/ts_locale.c 17 Jun 2008 16:09:06 - 1.8 --- src/backend/tsearch/ts_locale.c 18 Jun 2008 18:37:02 - *** *** 16,140 #include tsearch/ts_locale.h #include tsearch/ts_public.h - #ifdef USE_WIDE_UPPER_LOWER - /* - * wchar2char --- convert wide characters to multibyte format - * - * This has the same API as the standard wcstombs() function; in particular, - * tolen is the maximum number of bytes to store at *to, and *from must be - * zero-terminated. The output will be zero-terminated iff there is room. - */ - size_t - wchar2char(char *to, const wchar_t *from, size_t tolen) - { - if (tolen == 0) - return 0; - - #ifdef WIN32 - if (GetDatabaseEncoding() == PG_UTF8) - { - int r; - - r = WideCharToMultiByte(CP_UTF8, 0, from, -1, to, tolen, - NULL, NULL); - - if (r = 0) - return (size_t) -1; - - Assert(r = tolen); - - /* Microsoft counts the zero terminator in the result */ - return r - 1; - } - #endif /* WIN32 */ - - return wcstombs(to, from, tolen); - } - - /* - * char2wchar --- convert multibyte characters to wide characters - * - * This has almost the API of mbstowcs(), except that *from need not be - * null-terminated; instead, the number of input bytes is specified as - * fromlen. Also, we ereport() rather than returning -1 for invalid - * input encoding. tolen is the maximum number of wchar_t's to store at *to. - * The output will be zero-terminated iff there is room. - */ - size_t - char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen) - { - if (tolen == 0) - return 0; - - #ifdef WIN32 - if (GetDatabaseEncoding() == PG_UTF8) - { - int r; - - /* stupid Microsloth API does not work for zero-length input */ - if (fromlen == 0) - r = 0; - else - { - r = MultiByteToWideChar(CP_UTF8, 0, from, fromlen, to, tolen - 1); - - if (r = 0) - { - /* see notes in oracle_compat.c about error reporting */ - pg_verifymbstr(from, fromlen, false); - ereport(ERROR, - (errcode(ERRCODE_CHARACTER_NOT_IN_REPERTOIRE), - errmsg(invalid multibyte character for locale), - errhint(The server's LC_CTYPE locale is probably incompatible with the database encoding.))); - } - } - - Assert(r tolen); - to[r] = 0; - - return r; - } - #endif /* WIN32 */ - - if (lc_ctype_is_c()) - { - /* - * pg_mb2wchar_with_len always adds trailing '\0', so 'to' should be - * allocated with sufficient space - */ - return pg_mb2wchar_with_len(from, (pg_wchar *) to, fromlen); - } - else - { - /* - * mbstowcs requires ending '\0' - */ - char *str = pnstrdup(from, fromlen); - size_t result; - - result = mbstowcs(to, str, tolen); - - pfree(str); - - if (result == (size_t) -1) - { - pg_verifymbstr(from, fromlen, false); - ereport(ERROR, - (errcode(ERRCODE_CHARACTER_NOT_IN_REPERTOIRE), - errmsg(invalid multibyte character for locale), - errhint(The server's LC_CTYPE locale is probably incompatible with the database encoding.))); - } - - if (result tolen) - to[result] = 0; - - return result; - } - } - - int t_isdigit(const char *ptr) { --- 16,23 Index: src/backend/tsearch/ts_utils.c === RCS file: /cvsroot/pgsql/src/backend/tsearch/ts_utils.c,v retrieving revision 1.9 diff -c -c -r1.9 ts_utils.c *** src/backend/tsearch/ts_utils.c 1 Jan 2008 19:45:52 - 1.9 ---
Re: [PATCHES] Rewrite sinval messaging to reduce contention
I wrote: In this thread back in January http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php we saw evidence that contention for SInvalLock was a problem, and we (or at least I) concluded that the sinval messaging system needed a nontrivial rewrite to get rid of its habit of sending everyone to read the queue at the same time. Here is the proposed rewrite. After further thought, I came up with two more improvements: * It turns out not to be too hard to apply the idea of moving multiple messages per LWLock acquisition on the writing side too, so I did that. * There is not actually any good reason why writers have to block readers or vice versa, except in the infrequent case that SICleanupQueue is needed. Writers don't look at the ProcStates that the readers are changing. The only point of overlap is that a writer will change maxMsgNum which readers want to look at --- but we are already assuming in several places that reading or writing an int is atomic, and if we assume that here too, it seems to work fine. To implement this, I split SInvalLock into SInvalReadLock and SInvalWriteLock. This seems to be reaching a point of diminishing returns for the particular test case I'm using --- the TPS rate only went up from 700 to 730 or so. However, enabling LWLOCK_STATS shows that the contention rate on the sinval locks is now completely negligible --- one block per thousand acquisitions on SInvalWriteLock, and less than one in 1 on SInvalReadLock. The vast majority of the LWLock contention now comes from WALInsertLock and the LockMgr locks: Lock# acquisitions # times blocked SInvalReadLock 6469840 380 SInvalWriteLock 240567 163 WALInsertLock 2388805 89142 LockMgr partition locks 8253142 177715 So I think this patch accomplishes the goal of making sinval not be a cause of contention. Patch version 2 attached. regards, tom lane binW9bQG9JmZj.bin Description: sinval-rewrite-2.patch.gz -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches