Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-13 Thread Robert Haas
On Fri, Nov 12, 2010 at 10:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: FYI, I always wondered if the rare use of mergejoins justified the extra planning time of carrying around all those joinpaths. They're hardly rare. They fairly rare in the sorts of

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-12 Thread Bruce Momjian
Tom Lane wrote: Leonardo Francalanci m_li...@yahoo.it writes: Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes per child table, only one of which was relevant to the query. In your test case there are no irrelevant indexes, which is why the runtime didn't change.

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes: FYI, I always wondered if the rare use of mergejoins justified the extra planning time of carrying around all those joinpaths. They're hardly rare. regards, tom lane -- Sent via pgsql-hackers mailing list

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-04 Thread Tom Lane
[ for the archives' sake ] I wrote: I had a thought about how to make get_tabstat_entry() faster without adding overhead: what if we just plain remove the search, and always assume that a new entry has to be added to the tabstat array? I spent some time looking into this idea. It doesn't

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-01 Thread Tom Lane
I wrote: samples %symbol name 447433 47.1553 get_tabstat_entry 185458 19.5456 find_all_inheritors 53064 5.5925 SearchCatCache 33864 3.5690 pg_strtok get_tabstat_entry and find_all_inheritors are both obviously O(N^2) in the number of tables they have to deal with.

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
I wrote: the right way to make this faster is to refactor things so that we don't generate useless equivalence classes in the first place, or at least don't keep them around in the planner's lists once we realize they're useless. After a bit of hacking, I propose the attached patch. I like

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
On the other hand, if I use a similar test case to my original one (i.e. the tables are much wider) then the query planning takes 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds as we observed it using PostgreSQL 9.0.0. The beginning of the gprof output now looks like

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
but I don't get any gain from the patch... explain time is still around 250 ms. Tried with 9000 partitions, time is still 2 secs. Small correction: I tried with 3000 partitions (FOR i IN 0..3000 ...) and got 250ms with both versions, with 9000 partitions 2 secs (again no gain from the

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci m_li...@yahoo.it writes: I tried a simple test case: create table t (a integer, b text); DO $$DECLARE i int; BEGIN FOR i IN 0..9000 LOOP EXECUTE 'create table t' || i || ' ( CHECK (a ' || i*10 || ' and a = ' || (i+1)*10 || ' ) ) INHERITS

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
This is going to be dominated by constraint exclusion checking. There's basically no fix for that except a more explicit representation of the partitioning rules. Damn, I knew that was going to be more complicated :) So in which case does this patch help? I guess in a multi-index

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: On the other hand, if I use a similar test case to my original one (i.e. the tables are much wider) then the query planning takes 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds as we observed it using PostgreSQL 9.0.0. The beginning

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
I wrote: This is going to be dominated by constraint exclusion checking. Hmm, maybe I spoke too soon. With 9000 child tables I get a profile like this: samples %symbol name 447433 47.1553 get_tabstat_entry 185458 19.5456 find_all_inheritors 53064 5.5925 SearchCatCache 33864

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Robert Haas
On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Boszormenyi Zoltan z...@cybertec.at writes: On the other hand, if I use a similar test case to my original one (i.e. the tables are much wider) then the query planning takes 1.42 seconds in 9.1 with this patch instead of

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
Hmm, maybe I spoke too soon. With 9000 child tables I get a profile like this: Well, the 9000-table-test-case was meant to check the difference in performance with/without the patch... I don't see the reason for trying to optimize such an unrealistic case. BTW can someone explain to me

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci m_li...@yahoo.it writes: BTW can someone explain to me which are the cases where the patch actually helps? Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes per child table, only one of which was relevant to the query. In your test case there are no

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, if the hot spot does stay in SearchCatCache, I can't help noticing that those bucket chains are looking a bit overloaded --- sixty-plus entries per bucket ain't good.  Maybe

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Alvaro Herrera
Excerpts from Tom Lane's message of vie oct 29 14:15:55 -0300 2010: I wrote: This is going to be dominated by constraint exclusion checking. Hmm, maybe I spoke too soon. With 9000 child tables I get a profile like this: samples %symbol name 447433 47.1553 get_tabstat_entry

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of vie oct 29 14:15:55 -0300 2010: samples %symbol name 447433 47.1553 get_tabstat_entry Is there a reason for keeping the pgstat info in plain lists? Yeah: anything else loses for small numbers of

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes per child table, only one of which was relevant to the query. In your test case there are no irrelevant indexes, which is why the runtime didn't change. Mmh... I must be doing something wrong. It looks to me it's not

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci m_li...@yahoo.it writes: Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes per child table, only one of which was relevant to the query. In your test case there are no irrelevant indexes, which is why the runtime didn't change. Mmh... I must be

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Heikki Linnakangas írta: On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two patches are attached, the first is the absolute minimum for making it work, this still has the Tree type for

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: Heikki Linnakangas írta: On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two patches are attached, the first is the absolute minimum for making it work,

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: Boszormenyi Zoltan írta: Heikki Linnakangas írta: On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two patches are attached, the first is

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: Boszormenyi Zoltan írta: Boszormenyi Zoltan írta: Heikki Linnakangas írta: On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Heikki Linnakangas
On 28.10.2010 13:54, Boszormenyi Zoltan wrote: A little better version, no need for the heavy hash_any, hash_uint32 on the lower 32 bits on pk_eclass is enough. The profiling runtime is now 0.42 seconds vs the previous 0.41 seconds for the tree version. Actually, I wonder if we could just have

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: This means add_child_rel_equivalences() is still takes too much time, the previously posted test case calls this function 482 times, it's called for almost every 10th entry added to eq_classes. The elog() I put into this function says that at the

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Actually, I wonder if we could just have a separate canon_pathkeys list for each EquivalenceClass, instead of one big list in PlannerInfo. I'm not too familiar with equivalence classes and all that, Hm. I don't like getting rid

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Tom Lane írta: Boszormenyi Zoltan z...@cybertec.at writes: This means add_child_rel_equivalences() is still takes too much time, the previously posted test case calls this function 482 times, it's called for almost every 10th entry added to eq_classes. The elog() I put into this function

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: Tom Lane írta: That seems like a ridiculously large number of ECs. What is the test query again? The test case is here: http://archives.postgresql.org/message-id/4cbd9ddc.4040...@cybertec.at After poking through that a bit, I think that the real

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-27 Thread Heikki Linnakangas
On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two patches are attached, the first is the absolute minimum for making it work, this still has the Tree type for canon_pathkeys and eq_classes got

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-19 Thread Boszormenyi Zoltan
Hi, Boszormenyi Zoltan írta: There is one problem with the patch, it doesn't survive make check. One of the regression tests fails the Assert(!cur_em-em_is_child); line in process_equivalence() in equivclass.c, but I couldn't yet find it what causes it. The why is vaguely clear:

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-19 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: The problem is with the two functions in path/equivclass.c, as process_equivalance() and those functions are all walk the tree, and the current RBTree code can only deal with one walk at a time. We need to push/pop the iterator state to be able to

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Robert Haas
On Tue, Sep 7, 2010 at 2:14 PM, Boszormenyi Zoltan z...@cybertec.at wrote: Hi, Robert Haas írta: 2010/9/3 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at: i tried this one with 5000 unindexed tables (just one col): test=# \timing Timing is on. test=# prepare x(int4) AS select *

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
hello ... no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. equal is called really often in our sample case as well: ach sample counts as 0.01 seconds. % cumulative self self total

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. equal is called really often in our sample case as well: Did the mail with the scripts, etc, get

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
here is the patch again. we accidentally attached a wrong test file to the original posting so it grew to big. we had to revoke it from the moderator (this happens if you code from 8am to 10pm). here is just the patch - it is nice and small. you can easily test it by making yourself a nice

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: Neat. Have you checked what effect this has on memory consumption? Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch updated to be against HEAD before posting to the

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: Neat. Have you checked what effect this has on memory consumption? Also, don't forget to add it to https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that more can be done to reduce this. i guess we have

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Robert Haas
2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: * Robert Haas (robertmh...@gmail.com) wrote: Neat.  Have you checked what effect this has on memory consumption? Also, don't forget to add it to

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: 2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2 functions i got the feeling that

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Alvaro Herrera
Excerpts from Stephen Frost's message of mié sep 08 11:26:55 -0400 2010: * Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
=?iso-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at writes: here is the patch again. This patch seems remarkably documentation-free. What is it trying to accomplish and what is it doing to the planner data structures? (Which do have documentation BTW.) Also, what will it do to

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Stephen Frost írta: * Robert Haas (robertmh...@gmail.com) wrote: 2010/9/8 Hans-Jürgen Schönig postg...@cybertec.at: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so - looking at the other top 2

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Alvaro Herrera írta: Excerpts from Stephen Frost's message of mié sep 08 11:26:55 -0400 2010: * Hans-Jürgen Schönig (postg...@cybertec.at) wrote: but, it seems the problem we are looking is not sufficiently fixed yet. in our case we shaved off some 18% of planning time or so -

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes: Indeed, but code like this makes me wonder if this is really working the way it's supposed to: + val1 = (long)pk_left-pk_eclass; + val2 = (long)pk_right-pk_eclass; Ugh. Casting a pointer to long? We do have portable ways to do what this is

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: This applies to anything else that may need to be converted from list to tree to decrease planning time. Like ec_members in EquivalenceClass. AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting.

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes: I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable. There seems to be a lot of unnecessary palloc/pfree traffic in this implementation. Getting rid of that might help the speed.

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: Boszormenyi Zoltan z...@cybertec.at writes: This applies to anything else that may need to be converted from list to tree to decrease planning time. Like ec_members in EquivalenceClass. AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: Stephen Frost sfr...@snowman.net writes: I'm not really happy with the naming, comments, or some of the code flow, but I think the concept looks reasonable. There seems to be a lot of unnecessary palloc/pfree traffic in this implementation. Getting rid of that might

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: Tom Lane írta: AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long which they aren't. No, for

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: Boszormenyi Zoltan z...@cybertec.at writes: Tom Lane írta: AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this hack is of any use for EquivalenceClass, even if the lists were likely to be long

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan z...@cybertec.at writes: equal() is called over 8 million times with or without our patch: From where, though? You've provided not a shred of evidence that searching large ec_member lists is the problem. Also, did the test case you're using ever make it to the list?

[HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
hello everybody, we came across an issue which turned out to be more serious than previously expected. imagine a system with, say, 1000 partitions (heavily indexed) or so. the time taken by the planner is already fairly heavy in this case. i tried this one with 5000 unindexed tables (just one

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Stephen Frost
* PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: did anybody think of a solution to this problem. or more precisely: can there be a solution to this problem? Please post to the correct list (-performance) and provide information like PG version, postgresql.conf, the actual table

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
On Sep 3, 2010, at 2:04 PM, Stephen Frost wrote: * PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: did anybody think of a solution to this problem. or more precisely: can there be a solution to this problem? Please post to the correct list (-performance) and provide

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Stephen Frost
* PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: this seems like more a developer question to me than a pre performance one. it is not related to the table structure at all - it is basically an issue with incredibly large inheritance lists. it applies to postgres 9 and most

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Robert Haas
2010/9/3 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at: i tried this one with 5000 unindexed tables (just one col): test=# \timing Timing is on. test=# prepare x(int4) AS select * from t_data order by id desc; PREPARE Time: 361.552 ms you will see similar or higher runtimes in

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Tom Lane
=?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at writes: imagine a system with, say, 1000 partitions (heavily indexed) or so. the time taken by the planner is already fairly heavy in this case. As the fine manual points out, the current scheme for managing

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
On Sep 3, 2010, at 4:40 PM, Tom Lane wrote: =?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= postg...@cybertec.at writes: imagine a system with, say, 1000 partitions (heavily indexed) or so. the time taken by the planner is already fairly heavy in this case. As the fine manual