Re: [PATCHES] CREATE TABLE LIKE INCLUDING INDEXES support
Hi, This version updates the patch to CVS HEAD and has various fixes and refactoring, including proper docs. I refactored get_opclass_name() into lsyscache.c, but then realized that this means that lsyscache.c will depend on commands/indexcmds.c (for GetDefaultOpClass()), which is arguably improper, so I'm tempted to revert and just duplicate the syscache lookups in both ruleutils.c and parse_utilcmd.c Yes, I too would vote for duplicating the lookups for the sake of simplicity. Nikhil: why are both options and inhreloptions necessary in IndexStmt? Won't at least one be NULL? Yes, in the CREATE..LIKE case, options will be NULL and in the normal CREATE..INDEX case inhreloptions will be NULL. Note that options is a List of DefElem entries, whereas inhreloptions is a char pointer. The challenge was with converting the stored reloptions belonging to the parent index into some form which could be consumed by transformRelOptions(). It was difficult to convert it into a list of DefElem entries and hence I had to introduce inhreloptions. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] Maintaining cluster order on insert
Tom Lane wrote: What you would hope to see as the benefit of the patch is that the time for the range SELECT degrades more slowly as more of the table is replaced. Ignoring the first SELECT as being a startup transient, it looks like HEAD degrades from about 3 msec to 6 msec over 10 iterations (20% replacement of the table), whereas with the patch it's about 3 msec to about 4 and a half. However, the INSERT steps went from around 20 sec each to about twice that. Clustering in general isn't very important on a table that fits in cache. Also note that the overhead of descending the b-tree twice on INSERTs hurts the most in a CPU bound test like that. The overhead of the two-phase method I proposed should be lower. Test 1: executed CREATE in 0.066563 sec executed INSERT in 40.465653 sec executed CREATE in 9.152698 sec executed CLUSTE in 20.036375 sec executed VACUUM in 1.440232 sec Test 2: executed CREATE in 0.086862 sec executed INSERT in 50.746362 sec executed CREATE in 12.115655 sec executed CLUSTE in 33.656341 sec executed VACUUM in 4.306563 sec Why is there such a big difference in all these initialization steps as well? Is it just random noise? I played a bit with that test program, results from my laptop attached. I used a version patched with the latest patch I submitted, because that's what I had readily available. I used the same patched binaries in both test runs, I just didn't CLUSTER the table in the other run. The main difference to your results is that the DELETE, VACUUM and INSERT operations are much faster both with and without the patch. Most INSERTs for example took 1 s, and in your results they took 15 s. Any idea why? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com executed CREATE in 0.098872 sec executed INSERT in 36.958824 sec executed CREATE in 7.320412 sec executed CLUSTE in 22.579578 sec executed VACUUM in 0.809287 sec executed DELETE in 8.567730 sec executed VACUUM in 5.194736 sec executed INSERT in 0.882852 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 839347 AND 840347 retrieved 993 tuples in 0.239210 sec executing select: SELECT count(*) FROM testtab retrieved 999898 tuples in 0.417749 sec executed DELETE in 5.398222 sec executed VACUUM in 0.949003 sec executed INSERT in 0.771617 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 393988 AND 394988 retrieved 1015 tuples in 0.001261 sec executing select: SELECT count(*) FROM testtab retrieved 138 tuples in 0.373199 sec executed DELETE in 5.613083 sec executed VACUUM in 1.289501 sec executed INSERT in 1.434903 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 782316 AND 783316 retrieved 1011 tuples in 0.001723 sec executing select: SELECT count(*) FROM testtab retrieved 999826 tuples in 0.353616 sec executed DELETE in 3.729799 sec executed VACUUM in 1.406940 sec executed INSERT in 1.169133 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 797641 AND 798641 retrieved 991 tuples in 0.001118 sec executing select: SELECT count(*) FROM testtab retrieved 999731 tuples in 0.340721 sec executed DELETE in 1.448537 sec executed VACUUM in 0.918642 sec executed INSERT in 0.471951 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 910735 AND 911735 retrieved 1029 tuples in 0.001174 sec executing select: SELECT count(*) FROM testtab retrieved 999773 tuples in 0.334880 sec executed DELETE in 0.972467 sec executed VACUUM in 0.925557 sec executed INSERT in 0.645214 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 197353 AND 198353 retrieved 975 tuples in 0.001076 sec executing select: SELECT count(*) FROM testtab retrieved 999751 tuples in 0.334343 sec executed DELETE in 0.966780 sec executed VACUUM in 0.923394 sec executed INSERT in 0.468693 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 334887 AND 335887 retrieved 983 tuples in 0.001381 sec executing select: SELECT count(*) FROM testtab retrieved 999801 tuples in 0.335184 sec executed DELETE in 0.792456 sec executed VACUUM in 0.934451 sec executed INSERT in 0.435028 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 767461 AND 768461 retrieved 998 tuples in 0.001279 sec executing select: SELECT count(*) FROM testtab retrieved 999854 tuples in 0.338010 sec executed DELETE in 0.553884 sec executed VACUUM in 0.933886 sec executed INSERT in 0.435980 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 277496 AND 278496 retrieved 1017 tuples in 0.001257 sec executing select: SELECT count(*) FROM testtab retrieved 23 tuples in 0.333156 sec executed DELETE in 0.518371 sec executed VACUUM in 0.932207 sec executed INSERT in 18.031924 sec executing select: SELECT count(*) FROM testtab WHERE k BETWEEN 553415 AND 554415 retrieved 1012 tuples in 0.001280 sec executing select:
Re: [PATCHES] Maintaining cluster order on insert
Tom Lane [EMAIL PROTECTED] writes: * delete a random 2% of the table * vacuum to recover space * insert a random 2% of the table * select (about) 1000 consecutively-numbered rows * select all the rows (this is just a cross check that the number of rows isn't changing too much) What you would hope to see as the benefit of the patch is that the time for the range SELECT degrades more slowly as more of the table is replaced. Ignoring the first SELECT as being a startup transient, it looks like HEAD degrades from about 3 msec to 6 msec over 10 iterations (20% replacement of the table), whereas with the patch it's about 3 msec to about 4 and a half. However, the INSERT steps went from around 20 sec each to about twice that. I would suggest: a) continuing the test until you have 100% replacement. b) defining the tables with a fill-factor large enough to hold at least one extra tuple c) considering this patch alongside GII where it is especially important. It's pretty serious what you're suggesting since it means that we'll basically never have a real cluster feature. I would sure hope we're missing something and there's a way to make this work usefully. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] Maintaining cluster order on insert
Gregory Stark wrote: It's pretty serious what you're suggesting since it means that we'll basically never have a real cluster feature. I would sure hope we're missing something and there's a way to make this work usefully. Another approach would be an online CLUSTER command. That means there'll be a lot of churn when tuples need to be moved back and forth, along with updating indexes, but it'd take the overhead out of the critical path. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Maintaining cluster order on insert
On Mon, 9 Jul 2007, Tom Lane wrote: The hardware is just a Dell x86_64 workstation with crappy IDE disk, so maybe things would look better elsewhere, but it's all I have to work with. Do you have write-caching turned off on the drive so INSERTs are being rate-limited by WAL syncs? Trying to characterize the class of setup you're using. The part that seemed curious to me about your results in the unpatched version is why the first INSERT takes 2.4 seconds, while the second takes 12.2 then later ones settle from 17-23 s. I could understand the 12-23 variation, but that the first one fires off in 2.4 seems kind of fishy; why so fast? Is there something that's just fitting in memory in that case that's just missing in the patched version? results-head: ... executed DELETE in 14.770937 sec executed VACUUM in 10.663301 sec executed INSERT in 2.449248 sec (1st) ... executed INSERT in 12.212027 sec (2nd) results-patch: ... executed DELETE in 18.062664 sec executed VACUUM in 28.487570 sec executed INSERT in 25.638022 sec (1st) ... executed INSERT in 40.759404 sec (2nd) -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] Maintaining cluster order on insert
Gregory Stark [EMAIL PROTECTED] writes: It's pretty serious what you're suggesting since it means that we'll basically never have a real cluster feature. It seems entirely likely that this is not the way to go about real clustering. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Maintaining cluster order on insert
Greg Smith [EMAIL PROTECTED] writes: On Mon, 9 Jul 2007, Tom Lane wrote: The hardware is just a Dell x86_64 workstation with crappy IDE disk, so maybe things would look better elsewhere, but it's all I have to work with. Do you have write-caching turned off on the drive so INSERTs are being rate-limited by WAL syncs? Trying to characterize the class of setup you're using. It's just desktop-grade junk :-(. Feel free to repeat the test on something more serious. The part that seemed curious to me about your results in the unpatched version is why the first INSERT takes 2.4 seconds, while the second takes 12.2 then later ones settle from 17-23 s. I could understand the 12-23 variation, but that the first one fires off in 2.4 seems kind of fishy; The test seemed to be mostly I/O bound according to vmstat. I supposed that it was fast up until the kernel's available space was full of dirty pages, and then subsequent writes would be constrained by actual disk write bandwidth. Anyway the numbers seemed relatively consistent after the setup and first test cycle, so I think the part I was actually trying to draw conclusions from was probably real enough. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PATCHES] CREATE TABLE LIKE INCLUDING INDEXES support
Neil Conway wrote: On Tue, 2007-05-06 at 14:50 +0530, NikhilS wrote: PFA, a patch which provides the CREATE .. INCLUDING INDEXES functionality. This patch uses the same functionality introduced by the earlier patches in this thread albeit for the INCLUDING INDEXES case. Attached is a revised version of this patch. Sorry for taking so long to make progress on this: my new job been keeping my busy, and I've recently been ill. Illness only counts as an excuse if you _don't_ recover. If you recover, you weren't sick enough. ;-) LOL -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] CREATE TABLE LIKE INCLUDING INDEXES support
On 7/10/07, Bruce Momjian [EMAIL PROTECTED] wrote: Neil Conway wrote: Attached is a revised version of this patch. Sorry for taking so long to make progress on this: my new job been keeping my busy, and I've recently been ill. Illness only counts as an excuse if you _don't_ recover. If you recover, you weren't sick enough. ;-) LOL uh! that sounds like my boss talking! -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster