Re: [PATCHES] Error correction for n_dead_tuples
This patch is no longer needed. We can revisit this during 8.4 to see if it is still needed now that we have HOT.

--- ITAGAKI Takahiro wrote:

Here is a patch discussed in http://archives.postgresql.org/pgsql-hackers/2007-02/msg00010.php

A concurrent vacuum will save the n_dead_tuples value at its beginning, and the stats collector will then subtract that value from n_dead_tuples instead of setting it to zero. This improves the accuracy of the n_dead_tuples statistic, especially just after a vacuum finishes. The behavior of VACUUM FULL is unchanged, because concurrent updates are not allowed during VACUUM FULL.

Comments welcome.

Regards,
ITAGAKI Takahiro
NTT Open Source Software Center

[ Attachment, skipping... ]

---(end of broadcast)---
TIP 6: explain analyze is your friend

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
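The accounting change Itagaki-san describes can be sketched in a few lines. This is an illustrative standalone version, not the actual pgstat code; the struct and function names are invented for the example:

```c
#include <assert.h>

/* Illustrative stand-in for the per-table stats the collector keeps. */
typedef struct
{
    long n_dead_tuples;   /* dead tuples the stats collector knows about */
} TableStats;

/* A concurrent vacuum saves n_dead_tuples at its start. */
long
vacuum_begin(const TableStats *stats)
{
    return stats->n_dead_tuples;
}

/* Old behavior: set n_dead_tuples to zero, losing any tuples that died
 * while the vacuum ran.  Patched behavior: subtract the saved starting
 * value, so deaths during the vacuum are still counted afterward. */
void
vacuum_report(TableStats *stats, long saved_at_start)
{
    stats->n_dead_tuples -= saved_at_start;
    if (stats->n_dead_tuples < 0)
        stats->n_dead_tuples = 0;   /* clamp, just in case */
}
```

With this scheme, if 25 tuples die while the vacuum is running, n_dead_tuples ends at 25 rather than 0, which is the accuracy improvement the patch claimed.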
Re: [PATCHES] Recalculating OldestXmin in a long-running vacuum
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold

--- Heikki Linnakangas wrote:

Tom Lane wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we should keep this issue open until we resolve the vacuum WAL flush issue? I can then rerun the same tests to see whether this patch is a win after that.
Sounds like a plan, if you are willing to do that.

Sure, just rerunning the same tests isn't much work.

Bruce Momjian wrote: Would you like to add a TODO item?

I don't know how we track things like this. Maybe add it to the end of the patch queue, with a link to this discussion, so that we remember that it needs more testing?

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
Re: [PATCHES] Load Distributed Checkpoints, final patch
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold

--- Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes: Here's the latest revision of Itagaki-san's Load Distributed Checkpoints patch:

Applied with some minor revisions to make some of the internal APIs a bit cleaner; mostly, it seemed like a good idea to replace all those bool parameters with a flag-bits approach, so that you could have something like CHECKPOINT_FORCE | CHECKPOINT_WAIT instead of false, true, true, false ...

For the moment I removed all the debugging elog's in the patch. We still have Greg Smith's checkpoint logging patch to look at (which I suppose needs adjustment now), and that seems like the appropriate venue to consider what to put in.

Also, the question of redesigning the bgwriter's LRU scan is still open. I believe that's on Greg's plate, too.

One other closely connected item that might be worth looking at is the code for creating new future xlog segments (PreallocXlogFiles). Greg was griping upthread about xlog segment creation being a real performance drag. I realized that as we currently have it set up, the checkpoint code is next to useless for high-WAL-volume installations, because it only considers making *one* future XLOG segment. Once you've built up enough XLOG segments, the system isn't too bad about recycling them, but there will be a nasty startup transient where foreground processes have to stop and make the things. I wonder whether it would help if we (a) have the bgwriter call PreallocXlogFiles during its normal loop, and (b) back the slop in PreallocXlogFiles way off, so that it will make a future segment as soon as we start using the last existing segment, instead of only when we're nearly done. This would at least make it more likely that the bgwriter does the work instead of a foreground process.

I'm hesitant to go much further than that, because I don't want to bloat the minimum disk footprint for low-volume installations, but the minimum footprint is really 2 xlog files anyway...

regards, tom lane
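The flag-bits cleanup Tom describes, replacing a row of opaque bool parameters with OR-able bits, can be sketched as follows. The flag names CHECKPOINT_FORCE and CHECKPOINT_WAIT come from the discussion above; the rest of this snippet (CHECKPOINT_IS_SHUTDOWN, the function body, the last_flags capture) is illustrative, not the actual backend code:

```c
#include <assert.h>

/* Flag bits in the style the patch adopted.  CHECKPOINT_IS_SHUTDOWN is an
 * assumed extra flag for illustration. */
#define CHECKPOINT_IS_SHUTDOWN  0x0001
#define CHECKPOINT_FORCE        0x0002
#define CHECKPOINT_WAIT         0x0004

/* Captured purely so the example can be checked. */
static int last_flags;

/* Instead of RequestCheckpoint(false, true, true, false), callers pass a
 * self-documenting bitmask. */
static void
RequestCheckpoint(int flags)
{
    last_flags = flags;
    if (flags & CHECKPOINT_WAIT)
    {
        /* caller wants to block until the checkpoint completes */
    }
}
```

The readability win is at the call site: RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT) says exactly what it does, and new flags can be added without changing every caller.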
Re: [PATCHES] allow CSV quote in NULL
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold

--- Stephen Frost wrote:

Greetings,

Please find attached a minor patch to remove the constraints that a user can't include the delimiter or quote characters in a 'NULL AS' string when importing CSV files. This allows a user to explicitly request that NULL conversion happen on fields which are quoted. As the quote character is now allowed in the 'NULL AS' string, there's no reason to exclude the delimiter character from that string either, though unless it is quoted using the CSV quote character it won't ever be matched.

An example of the usage:

sfrost*=# \copy billing_data from ~/BillingSamplePricerFile.csv with csv header quote as '' null as ''

This is no contrived example; it's an issue I ran into earlier today when I got a file which had (for reasons unknown to me and not easily changed upstream):

1,V,WASHDCABC12,,120033...

Both of the ending columns shown are integer fields, with a quoted value being used here to indicate NULL. Without the patch, an ERROR occurs:

sfrost= \copy billing_data from ~/BillingSamplePricerFile.csv with csv header quote as ''
ERROR: invalid input syntax for integer:

And there's no way to get it to import with COPY CSV mode. The patch adds this ability without affecting existing usage or changing the syntax. Even with the patch, an ERROR occurs with the default treatment of CSV files:

sfrost=# \copy billing_data from ~/BillingSamplePricerFile.csv with csv header quote as ''
ERROR: invalid input syntax for integer:

Which would be expected. If the file is modified to remove the quotes for NULL columns, it imports just fine with the syntax above.

It'd be really nice to have this included. Thanks!

Stephen

[ Attachment, skipping... ]
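The constraint the patch relaxes can be sketched as a single predicate. This is a hypothetical standalone version for illustration; the function name and signature are invented, not the actual copy.c code:

```c
#include <string.h>
#include <assert.h>

/* Before the patch, COPY rejected a CSV 'NULL AS' string that contained
 * the quote or delimiter character.  The patch drops that restriction, so
 * a quoted string (e.g. a quoted empty field) can be declared NULL.
 * 'patched' selects which behavior to model. */
int
null_string_allowed(const char *null_print, char quotec, char delimc,
                    int patched)
{
    if (patched)
        return 1;               /* any 'NULL AS' string may be used */

    /* old rule: neither quote nor delimiter may appear in the string */
    return strchr(null_print, quotec) == NULL &&
           strchr(null_print, delimc) == NULL;
}
```

So with the old rule a 'NULL AS' string consisting of two quote characters is rejected outright, while with the patch it becomes a legal way to say "this quoted field means NULL."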
Re: [PATCHES] strpos() KMP
Added to TODO:

* Implement Boyer-Moore searching in strpos()
  http://archives.postgresql.org/pgsql-patches/2007-08/msg00012.php

--- Pavel Ajtkulov wrote:

Hello,

this patch allows strpos() to use the Knuth-Morris-Pratt algorithm (see Cormen et al., Introduction to Algorithms, MIT Press, 2001). It also works with multibyte wchar.

In the worst case the current brute-force strpos() takes O(n * m) time, where n and m are the lengths of the strings (example: searching for 'aaa...aaab' in 'aaa...aaa'). The KMP algorithm always takes O(n + m) time. To check this, create a table with one text attribute, insert several thousand records 'aa..aa' (for example, with length = 1000), and then execute

select count(*) from test where strpos(a, 'aaaaab') > 0;

on the current and the modified version. Also, I advise using

select .. where strpos(attr, 'word') > 0;

instead of

select .. where attr like '%word%'

(strpos should be faster than the regex machinery). In general the speedup shows up on such artificial inputs; on natural-language text KMP and the current strpos() are nearly equal in execution time.

Ajtkulov Pavel [EMAIL PROTECTED]

P. S. Sorry for prime English.

[ Attachment, skipping... ]
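For reference, the core of the KMP search Pavel proposes looks like this. This is a minimal byte-oriented sketch of the textbook algorithm with strpos()-style 1-based results, not the patch itself (the patch also handles multibyte wchar):

```c
#include <stdlib.h>
#include <string.h>

/* KMP substring search: returns the 1-based position of the first match,
 * or 0 if not found, matching SQL strpos() conventions.  Runs in
 * O(n + m) because the text index never moves backward. */
int
kmp_strpos(const char *text, const char *needle)
{
    int n = (int) strlen(text);
    int m = (int) strlen(needle);
    int *fail, i, q, result = 0;

    if (m == 0)
        return 1;               /* empty needle matches at position 1 */

    /* Failure function: fail[i] = length of the longest proper prefix of
     * needle[0..i] that is also a suffix of it. */
    fail = malloc(m * sizeof(int));
    fail[0] = 0;
    for (i = 1, q = 0; i < m; i++)
    {
        while (q > 0 && needle[q] != needle[i])
            q = fail[q - 1];
        if (needle[q] == needle[i])
            q++;
        fail[i] = q;
    }

    /* Scan the text, reusing the failure function on mismatch instead of
     * backing up, which is what avoids the O(n * m) worst case. */
    for (i = 0, q = 0; i < n; i++)
    {
        while (q > 0 && needle[q] != text[i])
            q = fail[q - 1];
        if (needle[q] == text[i])
            q++;
        if (q == m)
        {
            result = i - m + 2; /* convert to 1-based start position */
            break;
        }
    }
    free(fail);
    return result;
}
```

On the adversarial input above ('aaaaab' searched in a long run of 'a's) the failure function lets the scan skip the redundant comparisons the brute-force loop repeats.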
Re: [PATCHES] [HACKERS] Include Lists for Text Search
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold

--- Simon Riggs wrote:

On Mon, 2007-09-10 at 10:21 -0400, Tom Lane wrote:
Oleg Bartunov [EMAIL PROTECTED] writes:
On Mon, 10 Sep 2007, Simon Riggs wrote: Can we include that functionality now?
This could be realized very easily using dict_strict, which returns only known words, with a mapping containing only this dictionary. So, feel free to write it and submit.
... for 8.4.

I've coded a small patch to allow CaseSensitive synonyms:

CREATE TEXT SEARCH DICTIONARY my_diction (
    TEMPLATE = biglist,
    DictFile = words,
    CaseSensitive = true
);

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com

[ Attachment, skipping... ]
Re: [PATCHES] Thread-safe PREPARE in ecpg
On Wed, Sep 26, 2007 at 01:43:34PM +0900, ITAGAKI Takahiro wrote: Here is a revised patch against CVS HEAD. I fixed a bug in ECPGdeallocate_all().

Applied to CVS HEAD. I also added your example to the regression tests.

Michael

--
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Re: [PATCHES] Hash Index Build Patch
Hi Tom,

Tom Raney wrote: We used spool functions from the BTree code to sort the index tuples. Sorting is done on the hash value of the tuples. The hash value depends on the number of primary bucket pages (henceforth just bucket pages) that will be required to fit all the index tuples. So, before sorting, the base relation is scanned to get the total number of tuples.

Just wondering, wouldn't it be enough to obtain a tuple count estimate by using reltuples / relpages * RelationGetNumberOfBlocks, like the planner does?

--
Alvaro Herrera  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PATCHES] Optimizer hook
I've applied this patch with revision to put the hook where I thought it made sense. Attached is a modification of your dummy.c to show use of the hook. I didn't test it heavily, but I did check that it seemed to work with either order of calling geqo() and standard_join_search().

regards, tom lane

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/geqo.h"
#include "optimizer/paths.h"
#include "optimizer/pathnode.h"

PG_MODULE_MAGIC;

void _PG_init(void);
void _PG_fini(void);

static RelOptInfo *
my_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
{
    RelOptInfo *dynamic_result,
               *geqo_result;
    Cost        dynamic_cost,
                geqo_cost;
    List       *dynamic_list,
               *geqo_list;
    struct HTAB *dynamic_hash,
               *geqo_hash;
    int         savelength;

    savelength = list_length(root->join_rel_list);
    root->join_rel_hash = NULL;

    elog(LOG, "Starting a join order search \"dynamic\"...");
    dynamic_result = standard_join_search(root, levels_needed, initial_rels);
    dynamic_cost = dynamic_result->cheapest_total_path->total_cost;
    dynamic_list = list_copy(root->join_rel_list);
    dynamic_hash = root->join_rel_hash;

    root->join_rel_list = list_truncate(root->join_rel_list, savelength);
    root->join_rel_hash = NULL;

    elog(LOG, "Starting a join order search \"geqo\"...");
    geqo_result = geqo(root, levels_needed, initial_rels);
    geqo_cost = geqo_result->cheapest_total_path->total_cost;
    geqo_list = list_copy(root->join_rel_list);
    geqo_hash = root->join_rel_hash;

    fprintf(stderr, "GEQO cost: %f  Dynamic programming cost: %f\n",
            geqo_cost, dynamic_cost);

    if (geqo_cost < dynamic_cost)
    {
        root->join_rel_list = geqo_list;
        root->join_rel_hash = geqo_hash;
        return geqo_result;
    }
    else
    {
        root->join_rel_list = dynamic_list;
        root->join_rel_hash = dynamic_hash;
        return dynamic_result;
    }
}

void
_PG_init(void)
{
    join_search_hook = my_join_search;
}

void
_PG_fini(void)
{
    join_search_hook = NULL;
}
Re: [PATCHES] Hash Index Build Patch
Alvaro Herrera wrote:

Hi Tom,

Tom Raney wrote: We used spool functions from the BTree code to sort the index tuples. Sorting is done on the hash value of the tuples. The hash value depends on the number of primary bucket pages (henceforth just bucket pages) that will be required to fit all the index tuples. So, before sorting, the base relation is scanned to get the total number of tuples.

Just wondering, wouldn't it be enough to obtain a tuple count estimate by using reltuples / relpages * RelationGetNumberOfBlocks, like the planner does?

Hello Alvaro,

We thought of that, and the verdict is still out on whether it is more costly to scan the entire relation to get the accurate count, or to use the estimate and hope for the best with the possibility of splits occurring during the build. If we use the estimate and it is completely wrong (with the actual tuple count being much higher), the sort will provide no benefit and the build will behave as the original code did. But, to be honest, I don't know exactly when the catalog is updated or how accurate the estimate is. If you (or anyone else) have any information there, please let me know. It would be great to eliminate that extra pass!

Sincerely,
Tom Raney
Re: [PATCHES] Hash Index Build Patch
Tom Raney [EMAIL PROTECTED] writes:
Alvaro Herrera wrote: Just wondering, wouldn't it be enough to obtain a tuple count estimate by using reltuples / relpages * RelationGetNumberOfBlocks, like the planner does?
We thought of that and the verdict is still out whether it is more costly to scan the entire relation to get the accurate count or use the estimate and hope for the best with the possibility of splits occurring during the build. If we use the estimate and it is completely wrong (with the actual tuple count being much higher) the sort will provide no benefit and it will behave as did the original code.

I think this argument is *far* too weak to justify an extra pass over the relation. The planner-style calculation is quite unlikely to give a major underestimate of the rowcount. It might overestimate, e.g. if the relation is bloated by dead tuples, but an error in that direction won't kill you.

regards, tom lane
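The planner-style calculation being discussed is just a density extrapolation: scale the tuples-per-page density recorded at the last ANALYZE by the table's current size in blocks. A minimal standalone sketch (illustrative function; in the backend the inputs come from pg_class.reltuples, pg_class.relpages, and RelationGetNumberOfBlocks()):

```c
#include <assert.h>

/* Estimate the current tuple count of a relation from stale catalog stats
 * plus the relation's current physical size.  If the table has grown since
 * the stats were taken, the estimate scales up with it, which is why a
 * major underestimate is unlikely. */
double
estimate_rel_tuples(double reltuples, double relpages, double cur_blocks)
{
    if (relpages <= 0)
        return reltuples;       /* no page count recorded; fall back */
    return (reltuples / relpages) * cur_blocks;
}
```

For example, a table last analyzed at 1000 tuples in 10 pages that has since grown to 20 blocks is estimated at 2000 tuples. As Tom notes, bloat from dead tuples makes this overestimate rather than underestimate, which is the safe direction for sizing the hash-index bucket count.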
Re: [PATCHES] Minor recovery changes
Simon Riggs [EMAIL PROTECTED] writes: As discussed on -hackers and -admin... a few minor changes, rolled into a single patch.

I've brought this up to speed with CVS HEAD and applied it. I was not, however, sure what your intention was with the files in test_warm_standby.tar.gz. That's not really intended to go into the contrib module, is it?

regards, tom lane
Re: [PATCHES] [HACKERS] Document and/or remove unreachable code in tuptoaster.c from varvarlena patch
Gregory Stark [EMAIL PROTECTED] writes: [ revised version of tuptoaster-fixup.patch ]

I've applied most of this. I didn't like the hardwired assumption about the minimum size to compress, and anyway you had put it in a place where it broke the intended behavior for SET STORAGE PLAIN. I replaced it with just a test to save palloc/pfree overhead in toast_compress_datum, which at least has knowledge of which PGLZ strategy is being used.

regards, tom lane