Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-13 Thread Robert Haas
On Fri, Nov 12, 2010 at 10:55 PM, Tom Lane wrote: > Bruce Momjian writes: >> FYI, I always wondered if the rare use of mergejoins justified the extra >> planning time of carrying around all those joinpaths. > > They're hardly rare. They fairly rare in the sorts of queries I normally issue, but I

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-12 Thread Tom Lane
Bruce Momjian writes: > FYI, I always wondered if the rare use of mergejoins justified the extra > planning time of carrying around all those joinpaths. They're hardly rare. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-12 Thread Bruce Momjian
Tom Lane wrote: > Leonardo Francalanci writes: > >> Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes > >> per child table, only one of which was relevant to the query. In your > >> test case there are no irrelevant indexes, which is why the runtime > >> didn't change. >

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-04 Thread Tom Lane
[ for the archives' sake ] I wrote: > I had a thought about how to make get_tabstat_entry() faster without > adding overhead: what if we just plain remove the search, and always > assume that a new entry has to be added to the tabstat array? I spent some time looking into this idea. It doesn't r

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-01 Thread Tom Lane
I wrote: > samples %symbol name > 447433 47.1553 get_tabstat_entry > 185458 19.5456 find_all_inheritors > 53064 5.5925 SearchCatCache > 33864 3.5690 pg_strtok > get_tabstat_entry and find_all_inheritors are both obviously O(N^2) in > the number of tables they have to deal

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci writes: >> Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes >> per child table, only one of which was relevant to the query. In your >> test case there are no irrelevant indexes, which is why the runtime >> didn't change. > Mmh... I must be doing som

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes > per child table, only one of which was relevant to the query. In your > test case there are no irrelevant indexes, which is why the runtime > didn't change. Mmh... I must be doing something wrong. It looks to me it's

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Alvaro Herrera writes: > Excerpts from Tom Lane's message of vie oct 29 14:15:55 -0300 2010: >> samples %symbol name >> 447433 47.1553 get_tabstat_entry > Is there a reason for keeping the pgstat info in plain lists? Yeah: anything else loses for small numbers of tables per query, wh

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Alvaro Herrera
Excerpts from Tom Lane's message of vie oct 29 14:15:55 -0300 2010: > I wrote: > > This is going to be dominated by constraint exclusion checking. > > Hmm, maybe I spoke too soon. With 9000 child tables I get a profile > like this: > > samples %symbol name > 447433 47.1553 get_tabsta

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Robert Haas writes: > On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane wrote: >> However, if the hot spot does stay in SearchCatCache, I can't help >> noticing that those bucket chains are looking a bit overloaded --- >> sixty-plus entries per bucket ain't good.  Maybe it's time to teach >> catcache.c

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci writes: > BTW can someone explain to me which are the cases where the > patch actually helps? Cases with lots of irrelevant indexes. Zoltan's example had 4 indexes per child table, only one of which was relevant to the query. In your test case there are no irrelevant indexe

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> Hmm, maybe I spoke too soon. With 9000 child tables I get a profile > like this: Well, the 9000-table-test-case was meant to check the difference in performance with/without the patch... I don't see the reason for trying to optimize such an unrealistic case. BTW can someone explain to me whi

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Robert Haas
On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane wrote: > Boszormenyi Zoltan writes: >> On the other hand, if I use a similar test case to my original one >> (i.e. the tables are much wider) then the query planning takes >> 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds >> as we obser

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
I wrote: > This is going to be dominated by constraint exclusion checking. Hmm, maybe I spoke too soon. With 9000 child tables I get a profile like this: samples %symbol name 447433 47.1553 get_tabstat_entry 185458 19.5456 find_all_inheritors 53064 5.5925 SearchCatCache 33864

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Boszormenyi Zoltan writes: > On the other hand, if I use a similar test case to my original one > (i.e. the tables are much wider) then the query planning takes > 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds > as we observed it using PostgreSQL 9.0.0. The beginning of the gprof

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> This is going to be dominated by constraint exclusion checking. There's > basically no fix for that except a more explicit representation of the > partitioning rules. Damn, I knew that was going to be more complicated :) So in which case does this patch help? I guess in a multi-index scena

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci writes: > I tried a simple test case: > create table t (a integer, b text); > DO $$DECLARE i int; > BEGIN > FOR i IN 0..9000 LOOP > EXECUTE 'create table t' || i || ' ( CHECK (a >' || i*10 || ' > and a <= ' || (i+1)*10 || ' ) ) INHERITS (t)'; >

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> but I don't get any gain from the patch... explain time is still around 250 >ms. > Tried with 9000 partitions, time is still 2 secs. Small correction: I tried with 3000 partitions (FOR i IN 0..3000 ...) and got 250ms with both versions, with 9000 partitions 2 secs (again no gain from the pat

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> On the other hand, if I use a similar test case to my original one > (i.e. the tables are much wider) then the query planning takes > 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds > as we observed it using PostgreSQL 9.0.0. The beginning of the gprof > output now looks like

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
I wrote: > the right way to make this faster is to refactor things so that we > don't generate useless equivalence classes in the first place, or > at least don't keep them around in the planner's lists once we realize > they're useless. After a bit of hacking, I propose the attached patch. > I l

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Boszormenyi Zoltan writes: > Tom Lane írta: >> That seems like a ridiculously large number of ECs. What is the >> test query again? > The test case is here: > http://archives.postgresql.org/message-id/4cbd9ddc.4040...@cybertec.at After poking through that a bit, I think that the real issue is i

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Tom Lane írta: > Boszormenyi Zoltan writes: > >> This means add_child_rel_equivalences() is still takes >> too much time, the previously posted test case calls this >> function 482 times, it's called for almost every 10th entry >> added to eq_classes. The elog() I put into this function says >

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Heikki Linnakangas writes: > Actually, I wonder if we could just have a separate canon_pathkeys list > for each EquivalenceClass, instead of one big list in PlannerInfo. I'm > not too familiar with equivalence classes and all that, Hm. I don't like getting rid of the main canon_pathkeys list l

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Tom Lane
Boszormenyi Zoltan writes: > This means add_child_rel_equivalences() is still takes > too much time, the previously posted test case calls this > function 482 times, it's called for almost every 10th entry > added to eq_classes. The elog() I put into this function says > that at the last call lis

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Heikki Linnakangas
On 28.10.2010 13:54, Boszormenyi Zoltan wrote: A little better version, no need for the heavy hash_any, hash_uint32 on the lower 32 bits on pk_eclass is enough. The profiling runtime is now 0.42 seconds vs the previous 0.41 seconds for the tree version. Actually, I wonder if we could just have

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: > Boszormenyi Zoltan írta: > >> Boszormenyi Zoltan írta: >> >> >>> Heikki Linnakangas írta: >>> >>> >>> On 26.10.2010 18:34, Boszormenyi Zoltan wrote: > thank you very much for pointing me to dynahash, h

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: > Boszormenyi Zoltan írta: > >> Heikki Linnakangas írta: >> >> >>> On 26.10.2010 18:34, Boszormenyi Zoltan wrote: >>> >>> thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two pat

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Boszormenyi Zoltan írta: > Heikki Linnakangas írta: > >> On 26.10.2010 18:34, Boszormenyi Zoltan wrote: >> >>> thank you very much for pointing me to dynahash, here is the >>> next version that finally seems to work. >>> >>> Two patches are attached, the first is the absolute minimum for >>

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-28 Thread Boszormenyi Zoltan
Heikki Linnakangas írta: > On 26.10.2010 18:34, Boszormenyi Zoltan wrote: >> thank you very much for pointing me to dynahash, here is the >> next version that finally seems to work. >> >> Two patches are attached, the first is the absolute minimum for >> making it work, this still has the Tree type

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-27 Thread Heikki Linnakangas
On 26.10.2010 18:34, Boszormenyi Zoltan wrote: thank you very much for pointing me to dynahash, here is the next version that finally seems to work. Two patches are attached, the first is the absolute minimum for making it work, this still has the Tree type for canon_pathkeys and eq_classes got

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-19 Thread Tom Lane
Boszormenyi Zoltan writes: > The problem is with the two functions in path/equivclass.c, > as process_equivalance() and those functions are all walk > the tree, and the current RBTree code can only deal with > one walk at a time. We need to push/pop the iterator state > to be able to serve more th

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-19 Thread Boszormenyi Zoltan
Hi, Boszormenyi Zoltan írta: > There is one problem with the patch, it doesn't survive > "make check". One of the regression tests fails the > Assert(!cur_em->em_is_child); > line in process_equivalence() in equivclass.c, but I couldn't > yet find it what causes it. The "why" is vaguely clear:

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan writes: > equal() is called over 8 million times with or without our patch: >From where, though? You've provided not a shred of evidence that searching large ec_member lists is the problem. Also, did the test case you're using ever make it to the list?

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: > Boszormenyi Zoltan writes: > >> Tom Lane írta: >> >>> AFAIR, canonical pathkeys are the *only* thing in the planner where pure >>> pointer equality is interesting. So I doubt this hack is of any use for >>> EquivalenceClass, even if the lists were likely to be long which

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan writes: > Tom Lane írta: >> AFAIR, canonical pathkeys are the *only* thing in the planner where pure >> pointer equality is interesting. So I doubt this hack is of any use for >> EquivalenceClass, even if the lists were likely to be long which they >> aren't. > No, for Equival

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: > Stephen Frost writes: > >> I'm not really happy with the naming, comments, or some of the code >> flow, but I think the concept looks reasonable. >> > > There seems to be a lot of unnecessary palloc/pfree traffic in this > implementation. Getting rid of that might help th

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Tom Lane írta: > Boszormenyi Zoltan writes: > >> This applies to anything else that may need to be converted >> from list to tree to decrease planning time. Like ec_members >> in EquivalenceClass. >> > > AFAIR, canonical pathkeys are the *only* thing in the planner where pure > pointer equ

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Stephen Frost writes: > I'm not really happy with the naming, comments, or some of the code > flow, but I think the concept looks reasonable. There seems to be a lot of unnecessary palloc/pfree traffic in this implementation. Getting rid of that might help the speed. reg

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Boszormenyi Zoltan writes: > This applies to anything else that may need to be converted > from list to tree to decrease planning time. Like ec_members > in EquivalenceClass. AFAIR, canonical pathkeys are the *only* thing in the planner where pure pointer equality is interesting. So I doubt this

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
Stephen Frost writes: > Indeed, but code like this makes me wonder if this is really working the > way it's supposed to: > + val1 = (long)pk_left->pk_eclass; > + val2 = (long)pk_right->pk_eclass; Ugh. Casting a pointer to long? We do have portable ways to do what this is trying to do, but

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Alvaro Herrera írta: > Excerpts from Stephen Frost's message of mié sep 08 11:26:55 -0400 2010: > >> * Hans-Jürgen Schönig (postg...@cybertec.at) wrote: >> >>> but, it seems the problem we are looking is not sufficiently fixed yet. >>> in our case we shaved off some 18% of planning time or

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Boszormenyi Zoltan
Stephen Frost írta: > * Robert Haas (robertmh...@gmail.com) wrote: > >> 2010/9/8 Hans-Jürgen Schönig : >> >>> but, it seems the problem we are looking is not sufficiently fixed yet. >>> in our case we shaved off some 18% of planning time or so - looking at the >>> other top 2 functions i g

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Tom Lane
=?iso-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= writes: > here is the patch again. This patch seems remarkably documentation-free. What is it trying to accomplish and what is it doing to the planner data structures? (Which do have documentation BTW.) Also, what will it do to runtime in normal cases wh

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Alvaro Herrera
Excerpts from Stephen Frost's message of mié sep 08 11:26:55 -0400 2010: > * Hans-Jürgen Schönig (postg...@cybertec.at) wrote: > > but, it seems the problem we are looking is not sufficiently fixed yet. > > in our case we shaved off some 18% of planning time or so - looking at the > > other top 2

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: > 2010/9/8 Hans-Jürgen Schönig : > > but, it seems the problem we are looking is not sufficiently fixed yet. > > in our case we shaved off some 18% of planning time or so - looking at the > > other top 2 functions i got the feeling that more can be done

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Robert Haas
2010/9/8 Hans-Jürgen Schönig : > On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: > >> * Robert Haas (robertmh...@gmail.com) wrote: >>> Neat.  Have you checked what effect this has on memory consumption? >>> >>> Also, don't forget to add it to >>> https://commitfest.postgresql.org/action/commitfest

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: > but, it seems the problem we are looking is not sufficiently fixed yet. > in our case we shaved off some 18% of planning time or so - looking at the > other top 2 functions i got the feeling that more can be done to reduce this. > i guess we h

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
On Sep 8, 2010, at 4:57 PM, Stephen Frost wrote: > * Robert Haas (robertmh...@gmail.com) wrote: >> Neat. Have you checked what effect this has on memory consumption? >> >> Also, don't forget to add it to >> https://commitfest.postgresql.org/action/commitfest_view/open > > Would be good to have

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: > Neat. Have you checked what effect this has on memory consumption? > > Also, don't forget to add it to > https://commitfest.postgresql.org/action/commitfest_view/open Would be good to have the patch updated to be against HEAD before posting to the c

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
here is the patch again. we accidentally attached a wrong test file to the original posting so it grew to big. we had to revoke it from the moderator (this happens if you code from 8am to 10pm). here is just the patch - it is nice and small. you can easily test it by making yourself a nice paren

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Stephen Frost
* Hans-Jürgen Schönig (postg...@cybertec.at) wrote: > no, we have not checked memory consumption. > there is still some stuff left to optimize away - it seems we are going close > to O(n^2) somewhere. > "equal" is called really often in our sample case as well: Did the mail with the scripts, etc,

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Hans-Jürgen Schönig
hello ... no, we have not checked memory consumption. there is still some stuff left to optimize away - it seems we are going close to O(n^2) somewhere. "equal" is called really often in our sample case as well: ach sample counts as 0.01 seconds. % cumulative self self tota

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-08 Thread Robert Haas
On Tue, Sep 7, 2010 at 2:14 PM, Boszormenyi Zoltan wrote: > Hi, > > Robert Haas írta: >> 2010/9/3 PostgreSQL - Hans-Jürgen Schönig : >> >>> i tried this one with 5000 unindexed tables (just one col): >>> >>> test=# \timing >>> Timing is on. >>> test=# prepare x(int4) AS select * from t_data order

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
On Sep 3, 2010, at 4:40 PM, Tom Lane wrote: > =?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= > writes: >> imagine a system with, say, 1000 partitions (heavily indexed) or so. the >> time taken by the planner is already fairly heavy in this case. > > As the fine manual points out, the c

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Tom Lane
=?iso-8859-1?Q?PostgreSQL_-_Hans-J=FCrgen_Sch=F6nig?= writes: > imagine a system with, say, 1000 partitions (heavily indexed) or so. the time > taken by the planner is already fairly heavy in this case. As the fine manual points out, the current scheme for managing partitioned tables isn't inte

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Robert Haas
2010/9/3 PostgreSQL - Hans-Jürgen Schönig : > i tried this one with 5000 unindexed tables (just one col): > > test=# \timing > Timing is on. > test=# prepare x(int4) AS select * from t_data order by id desc; > PREPARE > Time: 361.552 ms > > you will see similar or higher runtimes in case of 500 par

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Stephen Frost
* PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: > this seems like more a developer question to me than a pre performance one. > it is not related to the table structure at all - it is basically an issue > with incredibly large inheritance lists. > it applies to postgres 9 and most

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
On Sep 3, 2010, at 2:04 PM, Stephen Frost wrote: > * PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: >> did anybody think of a solution to this problem. >> or more precisely: can there be a solution to this problem? > > Please post to the correct list (-performance) and provide in

Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread Stephen Frost
* PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: > did anybody think of a solution to this problem. > or more precisely: can there be a solution to this problem? Please post to the correct list (-performance) and provide information like PG version, postgresql.conf, the actual tabl

[HACKERS] plan time of MASSIVE partitioning ...

2010-09-03 Thread PostgreSQL - Hans-Jürgen Schönig
hello everybody, we came across an issue which turned out to be more serious than previously expected. imagine a system with, say, 1000 partitions (heavily indexed) or so. the time taken by the planner is already fairly heavy in this case. i tried this one with 5000 unindexed tables (just one c