Re: [HACKERS] Preparation for PostgreSQL releases 8.2.5, 8.1.10, 8.0.14, 7.4.18, 7.3.20
On Wed, 2007-09-12 at 10:48 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > The following bug fix has not yet been applied to CVS:
> > http://archives.postgresql.org/pgsql-patches/2007-06/msg00100.php
>
> Frankly, this looks much more like it creates a bug than fixes one.
> I have not looked at all of the original thread, but this adds a wart
> (two warts, really) that seems certain to do the wrong thing in cases
> other than the one you are thinking of.

Well, that's not a great help for anybody. What next action do you propose?

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Patch queue triage
On Wed, 2007-09-12 at 17:47 -0400, Bruce Momjian wrote:
> For those who have forgotten the progress we have made toward 8.3, here
> are the open patches we had for 8.3 as of May 1, 2006:

Could you please issue a list of open items for 8.3? I want to check whether you are waiting on me for anything, and which things have been deferred to the next release.

Thanks,

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] Preparation for PostgreSQL releases 8.2.5, 8.1.10, 8.0.14, 7.4.18, 7.3.20
On Wed, 2007-09-12 at 10:48 -0400, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > The following bug fix has not yet been applied to CVS:
> > http://archives.postgresql.org/pgsql-patches/2007-06/msg00100.php
>
> Frankly, this looks much more like it creates a bug than fixes one.
> I have not looked at all of the original thread, but this adds a wart
> (two warts, really) that seems certain to do the wrong thing in cases
> other than the one you are thinking of.

Let me explain the bug and the fix briefly.

The current recovery code allows a file to be archived a second time, which will fail if the archive_command is strict and refuses duplicate files. The failure occurs only in disaster recovery, when we have no partially written files from the failing server. The bug is fatal, though there is an easy workaround if you know it. The manual says this should work, and it does not.

I have added code that prevents a file from being archived when we know for certain it has already been archived, because we just de-archived it during recovery. The fix also closes a loophole in XLogArchiveNotify by not writing a .ready file if a .done file already exists. These actions fix the bug directly, following the main design.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
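The .ready/.done handshake Simon describes can be sketched as a tiny state machine — a hedged illustration only, with invented helper names (the real logic lives in PostgreSQL's xlogarchive code, operating on marker files in archive_status/):

```c
#include <assert.h>

/* Possible archive_status markers for one WAL segment (sketch). */
typedef enum { ST_NONE, ST_READY, ST_DONE } ArchStatus;

/* XLogArchiveNotify sketch: create .ready only if .done doesn't exist.
 * This is the loophole the fix closes: a segment already marked .done
 * must never be offered to the archiver again. */
ArchStatus archive_notify(ArchStatus cur)
{
    if (cur == ST_DONE)
        return ST_DONE;         /* already archived: skip */
    return ST_READY;
}

/* Recovery just restored a segment from the archive, so we know for
 * certain it was archived: mark it .done immediately, so a strict
 * archive_command never sees a duplicate. */
ArchStatus restored_during_recovery(ArchStatus cur)
{
    (void) cur;
    return ST_DONE;
}

/* The archiver finishes a segment: .ready becomes .done. */
ArchStatus archive_complete(ArchStatus cur)
{
    assert(cur == ST_READY);
    return ST_DONE;
}
```

Under this model, a segment that was de-archived during recovery goes straight to ST_DONE and archive_notify then leaves it alone, which is exactly the behavior the fix adds.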
[HACKERS] Limitation on number of positions (tsearch)
Why is there a limitation of 256 positions per lexeme in a tsvector? There doesn't seem to be a technical reason for it: WordEntryPosVector uses a uint16 to store the number of positions, so it could go up to 65535.

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Limitation on number of positions (tsearch)
> Why is there a limitation of 256 positions per lexeme in a tsvector?
> There doesn't seem to be a technical reason for it: WordEntryPosVector
> uses a uint16 to store the number of positions, so it could go up to
> 65535.

For two reasons:
- Ranking might become very slow if the number of positions is large.
- In practice, if a word is very frequent in a document, then with high probability it is a stop word or (in the case of internet-wide search engines) the document is spam.

It is common practice for search engines to limit the number of word positions, because raising the limit gives no advantage in terms of ranking and causes trouble by increasing storage size.

--
Teodor Sigaev
E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
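The point of the exchange above — that the 256 cap is a policy choice, not a storage-format limit — can be sketched as follows. Names here are illustrative approximations, not the exact PostgreSQL definitions:

```c
#include <stdint.h>

/* The on-disk counter is 16 bits (as in WordEntryPosVector), so it
 * could record up to 65535 positions, but tsvector construction clamps
 * at a much lower MAXNUMPOS to bound storage and ranking cost. */
#define MAXNUMPOS 256

typedef struct
{
    uint16_t npos;      /* number of positions actually stored */
    /* uint16_t pos[];     position list follows in the real layout */
} PosVectorHeader;

/* Clamp the number of positions recorded for one lexeme. */
uint16_t clamp_positions(uint32_t seen)
{
    if (seen > MAXNUMPOS)
        seen = MAXNUMPOS;
    return (uint16_t) seen;
}
```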
Re: [HACKERS] Preparation for PostgreSQL releases 8.2.5, 8.1.10, 8.0.14, 7.4.18, 7.3.20
On Thu, Sep 13, 2007 at 12:12:28AM +0200, Guillaume Lelarge wrote:
> Tom Lane a écrit :
> > Dave Page [EMAIL PROTECTED] writes:
> > > Tom Lane wrote:
> > > > Peter usually does it --- in theory any committer could, but he
> > > > actually knows what to do and the rest of us would have to study ;-)
> > > Study or figure it out? If it hasn't already been, it should be
> > > documented as part of the release process.
> > Well, RELEASE_CHANGES has
> >   * Translation updates
> >     Translations are kept in the project pgtranslation on PgFoundry.
> >     1. Check out the messages module (of the right branch).
> >     2. Check out the admin module.
> >     3. Run sh .../admin/cp-po .../messages .../pgsql
> >     4. Commit.
> > but it's not real clear (to me) which is the right branch
>
> They are named the same way PostgreSQL names its branches, for example
> REL8_2 for PostgreSQL 8.2. They are available here:
> http://pgfoundry.org/scm/?group_id=164
>
> > and what the ...s signify.
>
> .../admin/cp-po: ... is the path to the cp-po shell script you get when
> you did step 2 (check out the admin module).
> .../messages: ... is the path to the messages branch you get when you
> did step 1 (check out the messages module).
> .../pgsql: ... is the path to your source dir (same branch as messages).
>
> > It's not a big knowledge gap but I have other things to worry about ...
>
> It seems pretty straightforward now. Perhaps it can be used with cron.

No. Doing that with cron is a really bad idea, IMHO. We do *not* want any automated commits going into the tree. If we wanted that, why did we bother breaking it out in the first place, instead of just giving everybody commit access, which would be the same thing?

That said, it seems easy enough. I'll be happy to help doing it, but I don't want to step on the toes of someone already working on it. Peter - let me know if you want help merging them for this release - I'm available to help with that today or tomorrow.

//Magnus
Re: [HACKERS] Preparation for PostgreSQL releases 8.2.5, 8.1.10, 8.0.14, 7.4.18, 7.3.20
Magnus Hagander a écrit :
> On Thu, Sep 13, 2007 at 12:12:28AM +0200, Guillaume Lelarge wrote:
> [...]
> > It seems pretty straightforward now. Perhaps it can be used with cron.
>
> No. Doing that with cron is a really bad idea, IMHO. We do *not* want
> any automated commits going into the tree.

You're right. I hadn't thought of that.

> That said, it seems easy enough. I'll be happy to help doing it, but I
> don't want to step on the toes of someone already working on it.
> Peter - let me know if you want help merging them for this release -
> I'm available to help with that today or tomorrow.

Peter just sent a message on pgtranslation's mailing list, asking us (translators) to put our updates in pgtranslation's CVS ASAP. He'll synchronize the translations tonight. Thanks anyway.

Regards.

--
Guillaume.
 http://www.postgresqlfr.org/
 http://dalibo.com/
[HACKERS] HOT line pointer bloat and PageRepairFragmentation
We know that HOT can cause line pointer bloat because of redirect-dead line pointers. In the worst case there could be MaxHeapTuplesPerPage redirect-dead line pointers in a page. VACUUM can reclaim these line pointers and mark them ~LP_USED (what is now called LP_UNUSED). But we don't reclaim the space used by unused line pointers while repairing page fragmentation, and hence we are never able to remove the line pointer bloat completely. Fundamentally, we should be able to reclaim the unused line pointers at the end of the lp array (i.e. the unused line pointers immediately preceding pd_lower).

I had earlier tried to repair the bloat by reclaiming the space used by LP_UNUSED line pointers at the end of the array. But it doesn't work well with VACUUM FULL, which tracks unused line pointers for moving tuples. It's not that we cannot fix that issue, but I am reluctant to spend time on it right now because many of us feel that VACUUM FULL is near its EOL.

How about passing a boolean to PageRepairFragmentation to command it to reclaim unused line pointers? We would pass true at all places except in the VACUUM FULL code path. IOW, we reclaim unused line pointers during defragmentation and lazy VACUUM. We would need to WAL-log this information in xl_heap_clean so that we redo the same during recovery.

I have a patch ready, since I had already implemented this a few weeks back. Comments?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB  http://www.enterprisedb.com
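The core of what Pavan proposes can be sketched as follows — a minimal illustration with simplified structures (the real code works with ItemIdData entries and PageHeaderData's pd_lower), not the actual patch:

```c
/* Simplified line-pointer flags; the real page layout uses ItemIdData. */
#define LP_UNUSED 0
#define LP_NORMAL 1

typedef struct { int flags; } LinePointer;

/* After defragmentation, trailing LP_UNUSED entries at the end of the
 * line-pointer array can be dropped by pulling pd_lower back. Interior
 * unused slots must be kept, because TIDs address line pointers by
 * their position in the array. Returns the new line-pointer count. */
int truncate_unused_tail(LinePointer *lp, int nline)
{
    while (nline > 0 && lp[nline - 1].flags == LP_UNUSED)
        nline--;            /* pd_lower moves back by one ItemId */
    return nline;
}

/* Small worked example: slots 4 and 5 are unused and reclaimable,
 * slot 2 is unused but interior and must stay. */
int demo_truncate(void)
{
    LinePointer lp[5] = {
        {LP_NORMAL}, {LP_UNUSED}, {LP_NORMAL}, {LP_UNUSED}, {LP_UNUSED}
    };
    return truncate_unused_tail(lp, 5);
}
```

The demo also shows why this only bounds, rather than eliminates, the bloat: an interior unused slot survives until later churn leaves it at the tail.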
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
Pavan Deolasee [EMAIL PROTECTED] writes:
> How about passing a boolean to PageRepairFragmentation to command it to
> reclaim unused line pointers?

The difficulty with this is having to be 100% confident that no place in the system tries to dereference a TID without checking that the line number (offset) is within range. At one time that was demonstrably not so. I think we've cleaned up most if not all such places, but I wouldn't want to swear to it. I'm not convinced it's worth taking any risk for.

regards, tom lane
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
On 9/13/07, Tom Lane [EMAIL PROTECTED] wrote:
> The difficulty with this is having to be 100% confident that no place
> in the system tries to dereference a TID without checking that the line
> number (offset) is within range. At one time that was demonstrably not
> so. I think we've cleaned up most if not all such places, but I
> wouldn't want to swear to it.

If there are such places, aren't we already in trouble? An unused line pointer can be reused for an unrelated tuple, so dereferencing the TID could cause data corruption, couldn't it? If you want, I can do a quick search for all callers of PageGetItemId, confirm that the offset is checked, and add any missing checks.

In normal circumstances, line pointer bloat should not occur. But some typical cases can cause unrepairable damage. For example:

CREATE TABLE test (a int, b char(200));
CREATE UNIQUE INDEX testindx ON test(a);
INSERT INTO test VALUES (1, 'foo');

Now, if we repeatedly update the tuple so that each update is a COLD update, we would bloat the page with redirect-dead line pointers. Any other idea to recover from this?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB  http://www.enterprisedb.com
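The range check being discussed can be sketched in a few lines — a hedged illustration with simplified names (in PostgreSQL the guard compares against PageGetMaxOffsetNumber before calling PageGetItemId):

```c
/* Offsets in heap TIDs are 1-based; 0 is invalid. */
#define InvalidOffsetNumber 0

/* Is it safe to fetch the line pointer for this TID's offset, given
 * the page's current number of line pointers? If the array has been
 * truncated, a stale TID's offset may now be past the end. */
int offset_in_range(unsigned offnum, unsigned max_offnum)
{
    return offnum != InvalidOffsetNumber && offnum <= max_offnum;
}
```

Every caller that dereferences a TID would need to pass this test first before truncating the line-pointer array could be considered safe — which is exactly the audit Pavan offers to do.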
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
> CREATE TABLE test (a int, b char(200));
> CREATE UNIQUE INDEX testindx ON test(a);
> INSERT INTO test VALUES (1, 'foo');
>
> Now, if we repeatedly update the tuple so that each update is a COLD
> update, we would bloat the page with redirect-dead line pointers.

Um, sorry for not understanding, but why would a COLD update produce a redirect-dead line pointer (and not two LP_NORMAL ones)?

Andreas
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
On 9/13/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:
> Um, sorry for not understanding, but why would a COLD update produce a
> redirect-dead line pointer (and not two LP_NORMAL ones)?

The COLD-updated (old) tuple would be pruned to a dead line pointer once the tuple becomes DEAD. Normally that lets us reuse the tuple storage for other purposes. We do the same for DELETEd tuples.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
Pavan Deolasee wrote:
> On 9/13/07, Bruce Momjian [EMAIL PROTECTED] wrote:
> > For those who have forgotten the progress we have made toward 8.3,
> > here are the open patches we had for 8.3 as of May 1, 2006:
>
> You mean May 1, 2007 ;-)

Yea, sorry.

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Patch queue triage
Simon Riggs wrote:
> On Wed, 2007-09-12 at 17:47 -0400, Bruce Momjian wrote:
> > For those who have forgotten the progress we have made toward 8.3,
> > here are the open patches we had for 8.3 as of May 1, 2006:
>
> Could you please issue a list of open items for 8.3? I want to check
> whether you are waiting on me for anything and which things have been
> deferred to the next release.

I am working on putting them all in the patch queue now.

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
[HACKERS] autovacuum launcher eating too much CPU
Hi,

Darcy Buskermolen noticed that when one has many databases, the autovac launcher starts eating too much CPU. I tried it here with 200 databases and indeed it does seem to eat its share, even with the default naptime, which I wouldn't have thought was too high (it does make the launcher wake up about three times a second, though).

I'm looking at a profile and I can't seem to make much sense out of it. It seems to me like the problem is not autovac itself, but rather the pgstat code that reads the stats file from disk. Of course, autovac does need to read the file fairly regularly.

Here are the top lines of gprof output. Comments? Is there something here that needs fixing?

--
Alvaro Herrera  http://www.advogato.org/person/alvherre
"Investigación es lo que hago cuando no sé lo que estoy haciendo" (Wernher von Braun)

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 40.00     0.02     0.02   113360     0.00     0.00  hash_search_with_hash_value
 20.00     0.03     0.01    72816     0.00     0.00  AllocSetFreeIndex
 20.00     0.04     0.01    55517     0.00     0.00  AllocSetAlloc
 20.00     0.05     0.01     4247     0.00     0.00  choose_nelem_alloc
  0.00     0.05     0.00   396779     0.00     0.00  pg_isblank
  0.00     0.05     0.00   119276     0.00     0.00  calc_bucket
  0.00     0.05     0.00   113360     0.00     0.00  hash_search
  0.00     0.05     0.00   113330     0.00     0.00  hash_uint32
  0.00     0.05     0.00   113330     0.00     0.00  oid_hash
  0.00     0.05     0.00   105035     0.00     0.00  get_hash_entry
  0.00     0.05     0.00    55514     0.00     0.00  MemoryContextAlloc
  0.00     0.05     0.00    41798     0.00     0.00  next_token
  0.00     0.05     0.00    25512     0.00     0.00  DynaHashAlloc
  0.00     0.05     0.00    16187     0.00     0.00  TimestampDifferenceExceeds
  0.00     0.05     0.00     8742     0.00     0.00  MemoryContextStrdup
  0.00     0.05     0.00     8511     0.00     0.00  seg_alloc
  0.00     0.05     0.00     8507     0.00     0.00  my_log2
  0.00     0.05     0.00     8379     0.00     0.00  read_pg_database_line
  0.00     0.05     0.00     8364     0.00     0.00  check_list_invariants
  0.00     0.05     0.00     8353     0.00     0.00  lappend
  0.00     0.05     0.00     8322     0.00     0.00  backend_read_statsfile
  0.00     0.05     0.00     8322     0.00     0.00  pgstat_fetch_stat_dbentry

index  % time    self  children    called     name
                                                 <spontaneous>
[1]     99.6    0.00    0.05                 reaper [1]
                0.00    0.05       1/1         StartAutoVacLauncher [3]
                0.00    0.00       1/1         load_role [51]
                0.00    0.00       2/3         StartChildProcess [238]
                0.00    0.00       1/60        errstart [179]
                0.00    0.00       1/1         AutoVacuumingActive [289]
-----------------------------------------------
                0.00    0.05       1/1         StartAutoVacLauncher [3]
[2]     99.5    0.00    0.05       1       AutoVacLauncherMain [2]
                0.00    0.05      56/56        launch_worker [5]
                0.00    0.00       1/1         rebuild_database_list [27]
                0.00    0.00       2/2         errmsg [78]
                0.00    0.00       2/2         errfinish [101]
                0.00    0.00       1/1         BaseInit [115]
                0.00    0.00       1/4330      AllocSetContextCreate [24]
                0.00    0.00     118/118       PostmasterIsAlive [165]
                0.00    0.00      59/59        launcher_determine_sleep [181]
                0.00    0.00      59/59        pg_usleep [182]
                0.00    0.00      59/254       LWLockAcquire [157]
                0.00    0.00      59/254       LWLockRelease [158]
                0.00    0.00      58/288       GetCurrentTimestamp [156]
                0.00    0.00      58/16187     TimestampDifferenceExceeds [135]
                0.00    0.00      10/22        pqsignal [198]
                0.00    0.00       2/60        errstart [179]
                0.00    0.00       1/1         init_ps_display [383]
                0.00    0.00       1/1         InitAuxiliaryProcess [308]
                0.00    0.00       1/116       MemoryContextSwitchTo [166]
                0.00    0.00       1/1         autovac_balance_cost [368]
                0.00    0.00       1/1         proc_exit [403]
-----------------------------------------------
                0.00    0.05       1/1         reaper [1]
[3]     99.5    0.00    0.05       1       StartAutoVacLauncher [3]
                0.00    0.05       1/1         AutoVacLauncherMain [2]
                0.00    0.00
Re: [HACKERS] Postgresql.conf cleanup
Josh, is any of this happening for 8.3?

---------------------------------------------------------------------------

Josh Berkus wrote:
> All,
>
> I'm working on cleaning up postgresql.conf and pg_settings for the
> release. Attached is a sample WIP. It's not in patch form because I'm
> not done yet; I've just been editing postgresql.conf and need to fix
> the docs and pg_settings to match.
>
> Issues encountered and changes made:
>
> postgresql.conf suggestions: added a section with the 7 most important
> obvious settings at the top, and suggestions on how to calculate them.
> If people like this, I'll add it to the Tutorial in the docs as well.
>
> seq_scan_cost: this is independent of all of the other _costs. I can't
> think of any way in which that doesn't make the whole set of costs
> unmanageable. For example, if you want to change seq_scan_cost in
> order to make query cost more-or-less match up with ms execution time,
> you have to modify all 6 settings. If we do implement per-tablespace
> costs, then we'll need per-tablespace random_page_cost as well. Or am
> I missing something?
>
> (change requires restart): this phrase appears over 20 times in the
> notes. This is enough times to be really repetitive and take up a lot
> of scrolling space, while not actually covering all startup-time
> parameters. We should either (a) remove all such notes and rely on the
> docs, or (b) make an annotation symbol (e.g. *R) and mark 100% of
> them. Votes?
>
> Vacuum: all vacuum and autovacuum parameters put under their own
> section.
>
> Client Cost Defaults: this section became a catch-all for all userset
> parameters which people weren't sure what to do with. I've divided it
> into logical subsections, and moved some parameters to other sections
> where they logically belong (for example, explain_pretty_print belongs
> in Query Tuning).
>
> pg_settings issues: transaction_isolation and transaction_read_only
> appear more than once in the pg_settings pseudo-table. The setting
> column is supposed to be unique.
>
> Given the amount of cleanup/improvement which I'm seeing as necessary
> for the GUCs, I'm wondering if I put this off too long for 8.3.
>
> --Josh

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Postgresql.conf cleanup
Bruce Momjian wrote:
> Josh, is any of this happening for 8.3?

I would be willing to take a stab at this and submit by Monday.

Joshua D. Drake

> [...]

--
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
> The COLD-updated (old) tuple would be pruned to a dead line pointer
> once the tuple becomes DEAD. Normally that lets us reuse the tuple
> storage for other purposes. We do the same for DELETEd tuples.

Oh, I thought only pruned tuples from HOT chains could produce a redirect-dead line pointer. This looks like a problem, since we might end up with a page filled with LP_DEAD slots that all have no visibility info and can thus not be cleaned by vacuum. Maybe PageRepairFragmentation, when called from HOT, should prune less aggressively, e.g. prune until at most 1/2 the available slots are LP_DEAD, and not prune the rest.

Andreas
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
On 9/13/07, Zeugswetter Andreas ADI SD [EMAIL PROTECTED] wrote:
> This looks like a problem, since we might end up with a page filled
> with LP_DEAD slots, that all have no visibility info and can thus not
> be cleaned by vacuum.

It has nothing to do with visibility info. We already know the tuple is DEAD, and that's why its line pointer is LP_DEAD.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] autovacuum launcher eating too much CPU
Alvaro Herrera wrote:
> Darcy Buskermolen noticed that when one has many databases, the autovac
> launcher starts eating too much CPU. I tried it here with 200 databases
> and indeed it does seem to eat its share, even with the default
> naptime, which I wouldn't have thought was too high (it does make the
> launcher wake up about three times a second though).

This patch does not solve the whole problem, but it alleviates it a bit by throttling pgstat reads. One problem with it is that the interval for this check increases:

	/*
	 * Check whether pgstat data still says we need to vacuum this table.
	 * It could have changed if something else processed the table while
	 * we weren't looking.
	 *
	 * FIXME we ignore the possibility that the table was finished being
	 * vacuumed in the last 500ms (PGSTAT_STAT_INTERVAL). This is a bug.
	 */
	MemoryContextSwitchTo(AutovacMemCxt);
	tab = table_recheck_autovac(relid);

which could be a problem in itself, by causing unnecessary vacuums. Opinions?

--
Alvaro Herrera  http://www.amazon.com/gp/registry/DXLWNGRJD34J
"In Europe they call me Niklaus Wirth; in the US they call me Nickel's worth. That's because in Europe they call me by name, and in the US by value!"

Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.58
diff -c -p -r1.58 autovacuum.c
*** src/backend/postmaster/autovacuum.c	12 Sep 2007 22:14:59 -0000	1.58
--- src/backend/postmaster/autovacuum.c	13 Sep 2007 16:57:59 -0000
*************** static void avl_sighup_handler(SIGNAL_AR
*** 291,296 ****
--- 291,297 ----
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
  static void avl_quickdie(SIGNAL_ARGS);
+ static void autovac_refresh_stats(void);
*************** AutoVacLauncherMain(int argc, char *argv
*** 488,494 ****
  	DatabaseListCxt = NULL;
  	DatabaseList = NULL;
!
!	/* Make sure pgstat also considers our stat data as gone */
  	pgstat_clear_snapshot();

  	/* Now we can allow interrupts again */
--- 489,498 ----
  	DatabaseListCxt = NULL;
  	DatabaseList = NULL;
!	/*
!	 * Make sure pgstat also considers our stat data as gone. Note: we
!	 * mustn't use autovac_refresh_stats here.
!	 */
  	pgstat_clear_snapshot();

  	/* Now we can allow interrupts again */
*************** rebuild_database_list(Oid newdb)
*** 836,842 ****
  	HTAB *dbhash;

  	/* use fresh stats */
!	pgstat_clear_snapshot();

  	newcxt = AllocSetContextCreate(AutovacMemCxt,
  								   "AV dblist",
--- 840,846 ----
  	HTAB *dbhash;

  	/* use fresh stats */
!	autovac_refresh_stats();

  	newcxt = AllocSetContextCreate(AutovacMemCxt,
  								   "AV dblist",
*************** do_start_worker(void)
*** 1063,1069 ****
  	oldcxt = MemoryContextSwitchTo(tmpcxt);

  	/* use fresh stats */
!	pgstat_clear_snapshot();

  	/* Get a list of databases */
  	dblist = get_database_list();
--- 1067,1073 ----
  	oldcxt = MemoryContextSwitchTo(tmpcxt);

  	/* use fresh stats */
!	autovac_refresh_stats();

  	/* Get a list of databases */
  	dblist = get_database_list();
*************** table_recheck_autovac(Oid relid)
*** 2258,2264 ****
  	PgStat_StatDBEntry *dbentry;

  	/* use fresh stats */
!	pgstat_clear_snapshot();

  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
--- 2262,2268 ----
  	PgStat_StatDBEntry *dbentry;

  	/* use fresh stats */
!	autovac_refresh_stats();

  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
*************** AutoVacuumShmemInit(void)
*** 2734,2736 ****
--- 2738,2759 ----
  	else
  		Assert(found);
  }
+
+ /*
+  * Refresh pgstats data in an autovacuum process, at most every 500 ms.
+  * This is to avoid rereading the pgstats files too many times in quick
+  * succession.
+  */
+ static void
+ autovac_refresh_stats(void)
+ {
+ 	static TimestampTz last_read = 0;
+ 	TimestampTz current_time;
+
+ 	current_time = GetCurrentTimestamp();
+
+ 	if (!TimestampDifferenceExceeds(last_read, current_time, 500))
+ 		return;
+
+ 	pgstat_clear_snapshot();
+ 	last_read = current_time;
+ }
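The throttling logic in autovac_refresh_stats above can be isolated into a testable sketch. Times are passed in explicitly (rather than read from a clock, as GetCurrentTimestamp does in the patch) so the logic is easy to exercise; this is an illustration, not the committed code:

```c
/* Minimum interval between pgstat snapshot refreshes, in ms. */
#define STATS_READ_DELAY_MS 500

typedef long long TimestampMs;

/* Returns 1 if a fresh read should happen (updating *last_read),
 * 0 if the cached snapshot is recent enough to reuse. This mirrors
 * the TimestampDifferenceExceeds check in the patch. */
int maybe_refresh_stats(TimestampMs *last_read, TimestampMs now)
{
    if (now - *last_read < STATS_READ_DELAY_MS)
        return 0;               /* reuse the cached snapshot */
    *last_read = now;           /* pgstat_clear_snapshot() would go here */
    return 1;
}

/* Worked example: three calls at t=600, 900, 1200 ms; only the first
 * and third are far enough apart to trigger a re-read. */
int demo_refresh(void)
{
    TimestampMs last = 0;
    int reads = 0;

    reads += maybe_refresh_stats(&last, 600);   /* 600 ms since 0: read */
    reads += maybe_refresh_stats(&last, 900);   /* only 300 ms: skip */
    reads += maybe_refresh_stats(&last, 1200);  /* 600 ms again: read */
    return reads;
}
```

This also makes the FIXME in the patch concrete: any table state change within the 500 ms window is invisible until the next refresh.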
Re: [HACKERS] Postgresql.conf cleanup
Bruce Momjian wrote:
> Josh, is any of this happening for 8.3?

Should I run with this or let it lay?

> [...]

--
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Postgresql.conf cleanup
Bruce,

> Josh, is any of this happening for 8.3?

Hmmm, just the format cleanup. I haven't heard any objections, but I haven't heard any comments on the underlying broken functionality either (like seq_scan_cost), which is beyond me to fix. Patch next week.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
[HACKERS] Problem
Hello,

I am a developer working on hypothetical index support in PostgreSQL 8.2.4. I have a problem: the database starts normally, but when I try to connect to it, it shows the following error:

TRAP: FailedAssertion(!(!isnull), File: relcache.c, Line: 959)

Could you help me figure out what it could be?

Yours truly.
Re: [HACKERS] Problem
Pedro Belmino wrote:
> I am working on hypothetical index support in PostgreSQL 8.2.4. The
> database starts normally, but when I try to connect to it, it shows the
> following error:
> TRAP: FailedAssertion(!(!isnull), File: relcache.c, Line: 959)

We're going to need much more information.

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] RETURNING and DO INSTEAD ... Intentional or not?
All, I'll note that we currently prevent adding RETURNING to a *conditional* DO INSTEAD rule. This means that if we have a conditional DO INSTEAD rule which inserts into a different table than the final unconditional rule, we'll be RETURNING wrong or empty values. Mind you, that's a pretty extreme corner case. FYI, after some tinkering around, I've found that RETURNING is 100% incompatible with any table which has conditional DO INSTEAD rules; there's just no way to make it work and return any intelligible data. This would be a complete corner case, except that people use conditional DO INSTEAD rules heavily with partitioning (and yes, real users are complaining). I don't see this as super-urgent to fix for 8.3, but can we put it up as a TODO? -- Make it possible to use RETURNING together with conditional DO INSTEAD rules, such as for partitioning setups. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Another HOT thought: why do we need indcreatexid at all?
AFAICS, the whole indcreatexid and validForTxn business is a waste of code. By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid, surely any transactions that could see the broken HOT chains are gone. There might have been some reason for this contraption before we had plan invalidation, but what use is it now? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Another HOT thought: why do we need indcreatexid at all?
Tom Lane [EMAIL PROTECTED] writes: AFAICS, the whole indcreatexid and validForTxn business is a waste of code. By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid, surely any transactions that could see the broken HOT chains are gone. There might have been some reason for this contraption before we had plan invalidation, but what use is it now? It sounds like you're missing one of the big problems HOT ran into. When you create a new index your new index could include columns which were previously not covered in any index. So there could be pre-existing HOT chains which would no longer be eligible for HOT treatment. The README called such chains broken HOT chains and has some more information about them. Nobody who can see any old tuples in such chains can risk using your new index since the chain will be indexed under the wrong key. *New* transactions can use the index however since they'll only see the head of the chain which is the key the chain is indexed under. It's the old transactions which can see the old key values which aren't included in the index. Or do you see some other reason that plan invalidation can solve this problem? We looked for and tried a lot of different approaches to solve this problem. This was the lowest impact solution and the only one that was convincingly correct (imho). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
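Gregory's scenario can be sketched with a small toy model (Python used purely for illustration; the names, values, and the crude visibility rule are simplifications invented for this sketch, not PostgreSQL code):

```python
# A toy model of the "broken HOT chain" problem: an UPDATE changed a column
# that no existing index covered, so it was done as a HOT update, and a later
# CREATE INDEX on that column indexes the chain only under the head's key.

# One row's HOT chain, oldest version first.
chain = [
    {"xmin": 100, "b": "old"},   # what snapshots taken before xid 200 see
    {"xmin": 200, "b": "new"},   # chain head
]

# The new index on "b" has a single entry for the whole chain: the head's key.
new_index = {chain[-1]["b"]: chain}

def visible_version(snapshot_xid):
    """Crude visibility rule: newest version whose xmin the snapshot can see."""
    return [t for t in chain if t["xmin"] <= snapshot_xid][-1]

# A new transaction sees the head, which matches the index entry -- safe.
assert visible_version(300)["b"] == "new" and "new" in new_index

# An old transaction sees a version whose key has no index entry at all, so
# an index scan for b = 'old' would silently miss the row -- old transactions
# must therefore not use the new index.
assert visible_version(150)["b"] == "old" and "old" not in new_index
```

The sketch shows why only transactions old enough to see pre-index chain members are at risk: the head of the chain is always indexed under the right key.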
Re: [HACKERS] Another HOT thought: why do we need indcreatexid at all?
Tom Lane [EMAIL PROTECTED] writes: AFAICS, the whole indcreatexid and validForTxn business is a waste of code. By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid, surely any transactions that could see the broken HOT chains are gone. There might have been some reason for this contraption before we had plan invalidation, but what use is it now? Argh, sorry, rereading your message I see there are a few details which I missed which completely change the meaning of it. Ignore my previous mail :( -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Another HOT thought: why do we need indcreatexid at all?
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: AFAICS, the whole indcreatexid and validForTxn business is a waste of code. By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid, surely any transactions that could see the broken HOT chains are gone. There might have been some reason for this contraption before we had plan invalidation, but what use is it now? Argh, sorry, rereading your message I see there are a few details which I missed which completely change the meaning of it. Ignore my previous mail :( In answer to the real question you were actually asking, I believe you're correct that CREATE INDEX CONCURRENTLY should never need to set indcreatexid. Only regular non-concurrent CREATE INDEX needs to protect against that problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] plpgsql and qualified variable names
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Tom Lane wrote: I have just absorbed the significance of some code that has been in plpgsql since day one, but has never been documented anyplace. It seems that if you attach a label to a statement block in a plpgsql function, you can do more with the label than just use it in an EXIT statement (as I'd always supposed it was for). You can also use the label to qualify the names of variables declared in that block. For example, I've extended the example in section 37.3 like this:

CREATE FUNCTION somefunc() RETURNS integer AS $$
<<outerblock>>
DECLARE
    quantity integer := 30;
BEGIN
    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 30
    quantity := 50;
    --
    -- Create a subblock
    --
    DECLARE
        quantity integer := 80;
    BEGIN
        RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 80
        RAISE NOTICE 'Outer quantity here is %', outerblock.quantity;  -- Prints 50
    END;
    RAISE NOTICE 'Quantity here is %', quantity;  -- Prints 50
    RETURN quantity;
END;
$$ LANGUAGE plpgsql;

Now the reason I'm interested in this is that it provides another technique you can use to deal with conflicts between plpgsql variable names and SQL table/column/function names: you can qualify the variable name with the block label when you use it in a SQL command. This is not in itself a solution to the conflict problem, because unqualified names are still at risk of being resolved the wrong way, but it still seems worth documenting in the new section I'm writing about variable substitution rules. Anyway, I'm not writing just to point out that we have a previously undocumented feature. I notice that the section on porting from Oracle PL/SQL mentions: "You cannot use parameter names that are the same as columns that are referenced in the function. Oracle allows you to do this if you qualify the parameter name using function_name.parameter_name."
While I haven't tested it yet, I believe that we could match this Oracle behavior with about a one-line code change: the outermost namespace level (block) that the function parameter aliases are put into just needs to be given a label equal to the function name, instead of being label-less as it currently is. Comments? Also, can anyone verify whether this labeling behavior matches Oracle? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
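Tom's proposed change can be sketched as a toy name-resolution model (Python, illustrative only; the resolve function and frame layout are inventions for this sketch, not the plpgsql implementation): give the outermost namespace frame a label equal to the function name, and Oracle-style function_name.parameter_name qualification falls out of ordinary label lookup.

```python
# Toy model: a namespace stack of (label, {var: value}) frames, innermost last.
# The outermost frame is labeled with the function name instead of being
# label-less, which is the one-line change being proposed.

def resolve(name, ns_stack):
    """Resolve a (possibly label-qualified) variable name against the stack."""
    if "." in name:
        label, var = name.split(".", 1)
        for frame_label, frame in ns_stack:       # qualified: match the label
            if frame_label == label and var in frame:
                return frame[var]
        raise KeyError(name)
    for _, frame in reversed(ns_stack):           # unqualified: innermost wins
        if name in frame:
            return frame[name]
    raise KeyError(name)

stack = [
    ("somefunc", {"quantity": 30}),   # outermost frame, labeled as proposed
    ("subblock", {"quantity": 80}),   # inner block shadowing the same name
]

assert resolve("quantity", stack) == 80            # innermost shadows
assert resolve("somefunc.quantity", stack) == 30   # Oracle-style qualification
```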
Re: [HACKERS] Straightforward changes for increased SMP scalability
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Simon Riggs wrote: David Strong presented some excellent results of his SMP scalability testing at Ottawa in May. http://www.pgcon.org/2007/schedule/events/16.en.html There are some easy things we can do to take advantage of those results, especially the ones that were hardware independent. The hardware independent results were these two: - Avoid contention on WALInsertLock (+28% gain) - Increase NUM_BUFFER_PARTITIONS (+7.7% gain) Scalability begins to slow down at 8 CPUs on 8.2.4 and David was able to show good gains even at 8 CPUs with these changes. Proposals 1. For the first result, I suggest that we introduce some padding into the shmem structure XLogCtlData to alleviate false sharing that may exist between holders of WALInsertLock, WALWriteLock and info_lck. The cost of this will be at most about 200 bytes of shmem, with a low risk change. The benefits are hard to quantify, but we know this is an area of high contention and we should do all we can to reduce that. This hasn't been discussed previously, though we have seen good benefit from avoiding false sharing in other cases, e.g. LWLOCK padding. 2. Increase NUM_BUFFER_PARTITIONS from 16 to 256 (or higher). This has been discussed previously: http://archives.postgresql.org/pgsql-hackers/2006-09/msg00967.php Both of these changes are simple enough to consider for 8.3 Comments? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] tsearch2 documentation done
I am now satisified with the tsearch2 documentation in SGML and have marked the item as complete for 8.3. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Another HOT thought: why do we need indcreatexid at all?
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: AFAICS, the whole indcreatexid and validForTxn business is a waste of code. By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid, surely any transactions that could see the broken HOT chains are gone. In answer to the real question you were actually asking, I believe you're correct that CREATE INDEX CONCURRENTLY should never need to set indcreatexid. Only regular non-concurrent CREATE INDEX needs to protect against that problem. Argh, I'd momentarily gotten concurrent and nonconcurrent cases backwards. I would still desperately like to get rid of indcreatexid, though, because the patch's existing mechanism for clearing it is junk. There's no guarantee that it will get cleared before it wraps around, because the clearing is attached to vacuuming of the wrong table. Maybe you could make it work by special-casing vacuuming of pg_index itself, but the whole thing's a crock anyway. [ thinks some more ... ] Hmm, maybe instead of an explicit XID stored in the pg_index row proper, we could use the xmin of the pg_index row itself? That's already got a working mechanism for getting frozen. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] AutoVacuum Behaviour Question
On 9/13/07, Bruce Momjian [EMAIL PROTECTED] wrote: Alvaro Herrera wrote: Bruce Momjian wrote: Is this item closed? No, it isn't. Please add a TODO item about it: * Prevent long-lived temp tables from causing frozen-Xid advancement starvation Sorry, I don't understand this. Can you give me more text? Thanks. s/long-lived/orphaned/ ? And possibly this means better orphan detection and removal. Andrew
Re: [HACKERS] [GENERAL] AutoVacuum Behaviour Question
Alvaro Herrera wrote: Bruce Momjian wrote: Is this item closed? No, it isn't. Please add a TODO item about it: * Prevent long-lived temp tables from causing frozen-Xid advancement starvation Sorry, I don't understand this. Can you give me more text? Thanks. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] terms for database replication: synchronous vs eager
On 9/7/2007 11:01 AM, Markus Schiltknecht wrote: Hi, I'm asking for advice and hints regarding terms in database replication, especially WRT Postgres-R. (Sorry for crossposting, but I fear not reaching enough people on the Postgres-R ML alone.) I'm struggling with how to classify the Postgres-R algorithm. Up until recently, most people thought of it as synchronous replication, but it's not synchronous in the strong (and very common) sense. I.e. after a node confirms to have committed a transaction, other nodes didn't necessarily commit already. (They only promise that they *will* commit without conflicts.) This violates the common understanding of synchrony, because you can't commit on a node A and then query another node B and expect it to be coherent immediately. That's right. And there is no guarantee about the lag at all. So you can find old data on node B long after you committed a change to node A. Nonetheless, Postgres-R is eager (or pessimistic?) in the sense that it replicates *before* committing, so as to avoid divergence. In [1] I've tried to make that distinction clear, and I'm currently advocating for using synchronous only in the very strong (and commonly used) sense. I've chosen the word 'eager' to mean 'replicates before committing'. According to those definitions, Postgres-R is async but eager. Postgres-R is an asynchronous replication system by all means. It only makes sure that the workset data (that's what Postgres-R calls the replication log for one transaction) has been received by a group communication system supporting total order, and that the group communication system decided it to be the transaction that (logically) happened before any possibly conflicting concurrent transaction. This is the wonderful idea of how Postgres-R will have a failsafe conflict resolution mechanism in an asynchronous system. I don't know what you associate with the word eager.
All I see is that Postgres-R makes sure that some other process, which might still reside on the same hardware as the DB, is now in charge of delivery. Nobody said that the GC implementation cannot have made the decision about the total order of two workset messages and already reported that to the local client application before those messages ever got transmitted over the wire. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
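Jan's point about total order can be illustrated with a toy sketch (Python; not Postgres-R code — the workset tuples and the apply function are invented for illustration): as long as the group communication system hands every node the same totally ordered stream of worksets, nodes converge to the same state no matter how far behind any of them lags.

```python
# Two conflicting transactions touch the same key; the group communication
# system (GCS) has decided their total order before delivery.
total_order = [
    ("T1", "x", "from-node-A"),
    ("T2", "x", "from-node-B"),   # delivered after T1 on *every* node
]

def apply_worksets(worksets):
    """Apply worksets in delivery order; last write to a key wins."""
    state = {}
    for _txid, key, value in worksets:
        state[key] = value
    return state

node_a = apply_worksets(total_order)   # applies immediately
node_b = apply_worksets(total_order)   # applies much later, but in the same order

# Convergence despite asynchronous, arbitrarily lagged application.
assert node_a == node_b == {"x": "from-node-B"}
```

This is also why the lag guarantee is so weak: nothing above says *when* node_b applies, only in what order.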
Re: [HACKERS] autovacuum launcher eating too much CPU
On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: Hi, Darcy Buskermolen noticed that when one has many databases, the autovac launcher starts eating too much CPU. Don't forget the memory leak as well. After 3 or 4 days of running I end up with a 2GB+ AVL. I tried it here with 200 databases and indeed it does seem to eat its share. Even with the default naptime, which I wouldn't have thought was too high (it does make the launcher wake up about three times a second though). I'm looking at a profile and I can't seem to make much sense out of it. It seems to me like the problem is not autovac itself, but rather the pgstat code that reads the stat file from disk. Of course, autovac does need to read the file fairly regularly. Here are the top lines of the gprof output. Comments? Is there something here that needs fixing? -- Darcy Buskermolen The PostgreSQL company, Command Prompt Inc. http://www.commandprompt.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] autovacuum launcher eating too much CPU
Darcy Buskermolen wrote: On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: Hi, Darcy Buskermolen noticed that when one has many databases, the autovac launcher starts eating too much CPU. Don't forget the memory leak as well. After 3 or 4 days of running I end up with a 2GB+ AVL. Huh, sorry for not letting you know, I already fixed that :-) (Please grab the latest CVS HEAD and confirm.) -- Alvaro Herrera    http://www.advogato.org/person/alvherre Entristecido, Wutra (canción de Las Barreras) echa a Freyr a rodar y a nosotros al mar ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] ascii() for utf8
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Alvaro Herrera wrote: Decibel! wrote: Moving to -hackers. On Jul 27, 2007, at 1:22 PM, Stuart wrote: Does Postgresql have a function like ascii() that will return the unicode codepoint value for a utf8 character? (And symmetrically same for question chr() of course). I suspect that this is just a matter of no one scratching the itch. I suspect a patch would be accepted, or you could possibly put something on pgFoundry. Nay; there were some discussions about this not long ago, and I think one conclusion you could draw from them is that many people want these functions in the backend. I'd set it up so that ascii() and chr() act according to the appropriate locale setting (I'm not sure which one would be appropriate). I don't see why any of them would react to the locale, but they surely must honor client encoding. -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ I dream about dreams about dreams, sang the nightingale under the pale moon (Sandman) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
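What the poster is asking for is codepoint-based behavior, which can be demonstrated with Python's ord()/chr() (an analogy only; whether and how PostgreSQL's ascii()/chr() should behave this way for UTF-8 data is exactly what the thread is debating):

```python
# ord() returns the Unicode codepoint of a character; chr() is its inverse.
# This is the codepoint semantics the poster wants for UTF-8 text, as opposed
# to returning raw byte values.

assert ord("é") == 233       # codepoint of a character outside 7-bit ASCII
assert chr(233) == "é"       # the symmetric inverse
assert ord("€") == 8364      # works beyond Latin-1 as well (U+20AC)
assert chr(8364) == "€"
```

Note that, per Alvaro's comment, a backend implementation would still have to honor client_encoding when mapping codepoints to and from the on-the-wire text.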
Re: [HACKERS] Reducing Transaction Start/End Contention
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Simon Riggs wrote: Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning that should be especially important since async commit is likely to allow much higher transaction rates than were previously possible. There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). This becomes clear as the number of connections increases, but it seems likely that the contention can be caused in a range of other circumstances. My thoughts on the causes of this contention are that the following 3 tasks contend with each other in the following way:

CommitTransaction(): takes ProcArrayLock Exclusive,
    but only needs access to one ProcArray element
waits for
GetSnapshotData(): ProcArrayLock Shared
ReadNewTransactionId(): XidGenLock Shared
which waits for
GetNextTransactionId(): takes XidGenLock Exclusive
ExtendCLOG(): takes ClogControlLock Exclusive, WALInsertLock Exclusive;
    two possible places where I/O is required
ExtendSubtrans(): takes SubtransControlLock;
    one possible place where I/O is required
Avoids lock on ProcArrayLock: atomically updates one ProcArray element

or more simply:

CommitTransaction()          -- i.e. once per transaction
waits for
GetSnapshotData()            -- i.e. once per SQL statement
which waits for
GetNextTransactionId()       -- i.e. once per transaction

This gives some goals for scalability improvements and some proposals. (1) and (2) are proposals for 8.3 tuning, the others are directions for further research. Goal: Reduce total time that GetSnapshotData() waits for GetNextTransactionId(). 1. Increase size of clog-specific BLCKSZ: clog currently uses BLCKSZ to define the size of clog buffers. This can be changed to use CLOG_BLCKSZ, which would then be set to 32768.
This will naturally increase the amount of memory allocated to the clog, so we need not alter CLOG_BUFFERS above 8 if we do this (as previously suggested, with successful results). This will also reduce the number of ExtendClog() calls, which will probably reduce the overall contention also. 2. Perform ExtendClog() as a background activity Background process can look at the next transactionid once each cycle without holding any lock. If the xid is almost at the point where a new clog page would be allocated, then it will allocate one prior to the new page being absolutely required. Doing this as a background task would mean that we do not need to hold the XidGenLock in exclusive mode while we do this, which means that GetSnapshotData() and CommitTransaction() would also be less likely to block. Also, if any clog writes need to be performed when the page is moved forwards this would also be performed in the background. 3. Consider whether ProcArrayLock should use a new queued-shared lock mode that puts a maximum wait time on ExclusiveLock requests. It would be fairly hard to implement this well as a timer, but it might be possible to place a limit on queue length. i.e. allow Share locks to be granted immediately if a Shared holder already exists, but only if there is a queue of no more than N exclusive mode requests queued. This might prevent the worst cases of exclusive lock starvation. 4. Since shared locks are currently queued behind exclusive requests when they cannot be immediately satisfied, it might be worth reconsidering the way LWLockRelease works also. When we wake up the queue we only wake the Shared requests that are adjacent to the head of the queue. Instead we could wake *all* waiting Shared requestors. e.g. with a lock queue like this: (HEAD)S-S-X-S-X-S-X-S Currently we would wake the 1st and 2nd waiters only. 
If we were to wake the 4th, 6th and 8th waiters also, then the queue would reduce in length very quickly, if we assume generally uniform service times. (If the head of the queue is X, then we wake only that one process, and I'm not proposing we change that.) That would mean queue jumping, right? Well, that's what already happens in other circumstances, so there cannot be anything intrinsically wrong with allowing it; the only question is: would it help? We need not wake the whole queue; there may be some generally more beneficial heuristic. The reason for considering this is not to speed up Shared requests but to reduce the queue length and thus the waiting time for the Exclusive requestors. Each time a Shared request is dequeued, we effectively re-enable queue jumping, so a Shared request arriving during that point will actually jump ahead of Shared requests that were unlucky enough to arrive while an Exclusive lock was held. Worse than that, the new incoming Shared requests exacerbate the starvation, so
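The two wake-up policies Simon compares can be modeled on his example queue (a toy sketch in Python; the real logic lives in LWLockRelease and is considerably more involved than this):

```python
# Example queue from the proposal: (HEAD)S-S-X-S-X-S-X-S, head first.
# S = shared waiter, X = exclusive waiter.
queue = list("SSXSXSXS")

def wake_current(q):
    """Current behaviour: wake only the run of shared waiters at the head."""
    woken = []
    for waiter in q:
        if waiter != "S":
            break                      # stop at the first exclusive waiter
        woken.append(waiter)
    return woken

def wake_all_shared(q):
    """Proposed heuristic: wake every shared waiter anywhere in the queue."""
    return [w for w in q if w == "S"]

assert len(wake_current(queue)) == 2      # only the first two waiters
assert len(wake_all_shared(queue)) == 5   # the shared waiters queued behind
                                          # exclusive requests are woken too
```

Under roughly uniform service times, draining five shared waiters at once instead of two is what shortens the queue seen by the exclusive requestors.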
Re: [HACKERS] clog_buffers to 64 in 8.3?
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Greg Smith wrote: On Thu, 2 Aug 2007, Tom Lane wrote: I find it entirely likely that simply changing the [NUM_CLOG_BUFFERS] constant would be a net loss on many workloads. Would it be reasonable to consider changing it to a compile-time option before the 8.3 beta? From how you describe the potential downsides, it sounds to me like something that specific distributors might want to adjust based on their target customer workloads and server scale. That would make it available as a tunable to those aiming at larger systems with enough CPU/memory throughput that the additional overhead of more linear searches is trumped by the reduced potential for locking contention, as appears to be the case in Sun's situation here. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What is happening on buildfarm member dugong
Sergey E. Koposov [EMAIL PROTECTED] writes: It turned out that the offending assert is Assert(BgWriterShmem != NULL); in bgwriter.c:990 After commenting it out everything works. That's simply bizarre ... Also, I tried to add 'volatile' to the declaration of BgWriterShmem. After that the problem disappears too. Hm. I don't see any very good reason in the code to add the volatile, and I see at least one place where we'd have to cast it away (the MemSet at line 836). My inclination is just to remove the Assert at line 990. It's not proving anything, since if indeed BgWriterShmem was NULL there, we'd dump core on the dereferences just a couple lines below. Do you want this patched any further back than HEAD? The buildfarm status page doesn't show dugong doing any back branches ... regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian wrote: This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold I think the work on VIDs and latestCompletedXid makes this completely obsolete. --- Simon Riggs wrote: Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning that should be especially important since async commit is likely to allow much higher transaction rates than were previously possible. There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). -- Alvaro Herrera    http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PGparam extension version 0.4
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Andrew Chernow wrote: Version 0.4 of libpq param put and PGresult get functions. Added support for inet and cidr, couple bug fixes. If you compile the test file, make sure you link with the patched libpq.so. Andrew [ application/x-compressed is not supported, skipping... ] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [EMAIL PROTECTED]: Re: [GENERAL] array_to_set functions]
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Pavel Stehule wrote: 2007/8/15, Merlin Moncure [EMAIL PROTECTED]: On 8/14/07, Bruce Momjian [EMAIL PROTECTED] wrote: TODO item? I would say yes...array_accum is virtually an essential function when working with arrays, and the suggested array_to_set (and its built-in cousin, _pg_expand_array) really should not be built around generate_series when a C function is faster and will scale much better. Hello Merlin, array_accum is a good sample of PostgreSQL's possibilities. But it is slow. SELECT ARRAY(SELECT ... FROM ...) is much faster. :( So I dislike unnecessary aggregate functions. I agree. These constructs can be shown in the docs. Regards Pavel Stehule -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
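One plausible reading of why the aggregate form is slow — an assumption sketched here, not a measurement of PostgreSQL — is the quadratic cost of rebuilding the accumulator array at every transition step, versus materializing the result in a single pass as SELECT ARRAY(SELECT ...) can:

```python
# Rough Python analogue of the two strategies being compared.

def accum_by_copy(values):
    """Like an aggregate whose transition function returns a fresh, larger
    array each step: the whole accumulator is copied every time -> O(n^2)."""
    acc = []
    for v in values:
        acc = acc + [v]   # deliberate full copy, mimicking the costly pattern
    return acc

def build_once(values):
    """Like SELECT ARRAY(SELECT ...): build the result in one pass -> O(n)."""
    return list(values)

data = range(1000)
assert accum_by_copy(data) == build_once(data) == list(range(1000))
```

Both produce the same array; only the cost model differs, which is why the results match while the timings diverge as n grows.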
Re: [HACKERS] What is happening on buildfarm member dugong
On Tue, 11 Sep 2007, Tom Lane wrote: Well, the first thing I'd suggest is trying to localize which Assert makes it fail. From the bug's behavior I think it is highly probable that the problem is in fsync signalling, which puts it either in bgwriter.c or md.c. Try recompiling those modules separately without cassert (leaving all else enabled) and see if the problem comes and goes; if so, comment out one Assert at a time till you find which one. It turned out that the offending assert is Assert(BgWriterShmem != NULL); in bgwriter.c:990 After commenting it out everything works. Also, I tried to add 'volatile' to the declaration of BgWriterShmem. After that the problem disappears too. I'm not sure that it demonstrates that it's not an ICC bug, because obviously the 'volatile' flag can change the way the compiler works... I tried adding the volatile keyword for BgWriterShmem in PG 8.2.4, and indeed it solved the problem with the PG 8.2.4 version too. From what I see in bgwriter.c, the volatile keyword for BgWriterShmem seems very reasonable to me, although I'm not sure that it's really required there. regards, Sergey PS I'm sorry for the wrong information about anti-aliasing flags for ICC. I was obviously confused by the ICC docs. *** Sergey E. Koposov Max Planck Institute for Astronomy/Cambridge Institute for Astronomy/Sternberg Astronomical Institute Tel: +49-6221-528-349 Web: http://lnfm1.sai.msu.ru/~math E-mail: [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian [EMAIL PROTECTED] writes: Alvaro Herrera wrote: I think the work on VIDs and latestCompletedXid make this completely obsolete. Please confirm, all of Simon's issues? Not sure --- the area is certainly still worth looking at, but the recent patches have changed things enough that no older patches should be applied without study. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian wrote: Alvaro Herrera wrote: Bruce Momjian wrote: This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold I think the work on VIDs and latestCompletedXid makes this completely obsolete. Please confirm, all of Simon's issues? http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php Hmm, in looking closer, it seems there are some things that still seem worthy of more discussion. -- Alvaro Herrera    http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Problem
Heikki Linnakangas [EMAIL PROTECTED] writes: Pedro Belmino wrote: I am a developer working on the creation of hypothetical indexes in PostgreSQL 8.2.4. I have a problem: the database initializes normally, but when I try to connect to the database it shows the following error: TRAP: FailedAssertion(!(!isnull), File: relcache.c, Line: 959). Could you help me figure out what it could be? We're going to need much more information. Indeed. The failure suggests that you tried to modify the structure of pg_index rows and didn't get all the consequences right ... but what you said does not offer any reason to change pg_index. There are already some things in CVS HEAD about allowing plugins to manufacture hypothetical indexes for the planner to consider --- have you looked at that? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] SQL feature requests
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Andrew Dunstan wrote:
> Chuck McDevitt wrote:
>> PostgreSQL already has a huge amount of non-standard syntax and
>> semantics (perhaps "extensions" is a better word?): everything from the
>> non-standard cast operator, non-standard substr, non-standard trim, and
>> non-standard GROUP BY semantics (allowing simple integers to mean column
>> numbers). Given a day, we could probably write down several pages of
>> non-standard features of PostgreSQL.
> Quite so, and I've perpetrated a few myself. But for the most part they
> are either there for legacy reasons or add significant extra
> functionality. I rather like Alvaro's compromise suggestion re aliases,
> in fact. At least there seems to be a better case for that than for
> GROUP BY 'blurfl'. But the argument that convinces me is not that it
> follows some de facto standard, but that it would add to clarity.
> Requiring an alias where it isn't used seems to me a piece of less than
> excellent design.
>
> cheers
> andrew

--
Bruce Momjian [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Reducing Transaction Start/End Contention
Alvaro Herrera wrote:
> Bruce Momjian wrote:
>> This has been saved for the 8.4 release:
>> http://momjian.postgresql.org/cgi-bin/pgpatches_hold
> I think the work on VIDs and latestCompletedXid makes this completely
> obsolete.

Please confirm, all of Simon's issues?

http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php

---

Simon Riggs wrote:
> Jignesh Shah's scalability testing on Solaris has revealed further tuning
> opportunities surrounding the start and end of a transaction. That tuning
> should be especially important, since async commit is likely to allow
> much higher transaction rates than were previously possible. There is
> strong contention on the ProcArrayLock in exclusive mode, with the top
> path being CommitTransaction().

--
Bruce Momjian [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
> ... This looks like a problem, since we might end up with a page filled
> with LP_DEAD slots that all have no visibility info and thus cannot be
> cleaned by vacuum.

No, it's the other way round: an LP_DEAD item pointer can *always* be
cleaned by VACUUM. It would not have become LP_DEAD unless someone had
confirmed that the pointed-to tuple was no longer visible to anyone. The
only reason we have LP_DEAD at all is that we don't want HOT pruning to be
required to remove the index entries that link to the item pointer.

regards, tom lane
Re: [HACKERS] tsearch2 documentation done
On Thu, 13 Sep 2007, Bruce Momjian wrote:
> I am now satisfied with the tsearch2 documentation in SGML and have
> marked the item as complete for 8.3.

What about the examples? They certainly need updating. For example,
http://momjian.us/main/writings/pgsql/sgml/textsearch-rule-dictionary-example.html
needs to change

    #include "utils/ts_locale.h"
    #include "utils/ts_public.h"
    #include "utils/ts_utils.h"

to

    #include "tsearch/ts_locale.h"
    #include "tsearch/ts_public.h"
    #include "tsearch/ts_utils.h"

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] tsearch2 documentation done
Oleg Bartunov [EMAIL PROTECTED] writes:
> On Thu, 13 Sep 2007, Bruce Momjian wrote:
>> I am now satisfied with the tsearch2 documentation in SGML and have
>> marked the item as complete for 8.3.
> What about the examples? They certainly need updating.

Bruce might be satisfied, but I'm not ... feel free to hack away on the
docs.

> For example,
> http://momjian.us/main/writings/pgsql/sgml/textsearch-rule-dictionary-example.html
> needs to change

I'd much prefer to see the C-code examples pushed out to contrib modules.
There is no way we can test or maintain things that are embedded in SGML
examples.

regards, tom lane
Re: [HACKERS] RETURNING and DO INSTEAD ... Intentional or not?
Josh Berkus [EMAIL PROTECTED] writes:
> FYI, after some tinkering around, I've found that RETURNING is 100%
> incompatible with any table which has conditional DO INSTEAD rules;
> there's just no way to make it work and return any intelligible data.
> This would be a complete corner case, except that people use conditional
> DO INSTEAD rules heavily with partitioning (and yes, real users are
> complaining).

Those would be real users who are not on any PG mailing list? Because I
have not seen any complaints, much less any proposals for a solution ...

regards, tom lane