Re: [HACKERS] Identity projection
Thank you to all involved. On Friday, March 15, 2013 12:52 AM Tom Lane wrote: I wrote: ... So I think this patch is missing a bet by not accepting equal() expressions. I've committed this with that logic, a comment explaining exactly why this is the way to do it, and some other cosmetic improvements. Thank you. Thank you for your time and the good job. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
I'm starting to review this patch now, and I'm having a hard time with this particular design decision: Amit Kapila amit.kap...@huawei.com writes: We cannot directly compare expressions in target list as even if expressions are equal, below node (ex. APPEND) will not do projection, and hence expr will not be evaluated. I think this is nonsense. Whatever the tlist of the lower node is, that is supposed to describe what it's going to return. That node might not be able to do projection, but that doesn't mean that something underneath it didn't. So I think this patch is missing a bet by not accepting equal() expressions. Did you have a test case showing that this wouldn't work? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
I wrote: ... So I think this patch is missing a bet by not accepting equal() expressions. I've committed this with that logic, a comment explaining exactly why this is the way to do it, and some other cosmetic improvements. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Thursday, March 14, 2013 8:35 PM Tom Lane wrote: I'm starting to review this patch now, and I'm having a hard time with this particular design decision: Amit Kapila amit.kap...@huawei.com writes: We cannot directly compare expressions in target list as even if expressions are equal, below node (ex. APPEND) will not do projection, and hence expr will not be evaluated. I think this is nonsense. Whatever the tlist of the lower node is, that is supposed to describe what it's going to return. That node might not be able to do projection, but that doesn't mean that something underneath it didn't. So I think this patch is missing a bet by not accepting equal() expressions. Did you have a test case showing that this wouldn't work? I have missed the point that lower nodes would have done the expression evaluation during projection. But I think before your checkin, below comment in grouping planner might be misleading: /* * If the top-level plan node is one that cannot do expression * evaluation, we must insert a Result node to project the * desired tlist. */ Now because top-level node cannot do expression evaluation, but some lower node would have done it. Here the need seems to be only in case when the top-level plan node tlist is not desired tlist. Why do we need expression evaluation inside comment? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Friday, March 15, 2013 12:52 AM Tom Lane wrote: I wrote: ... So I think this patch is missing a bet by not accepting equal() expressions. I've committed this with that logic, a comment explaining exactly why this is the way to do it, and some other cosmetic improvements. Thank you. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Friday, March 08, 2013 11:21 PM Heikki Linnakangas wrote: On 12.02.2013 11:03, Amit Kapila wrote: + /* + * equivalent_tlists + *returns whether two traget lists are equivalent + * + * We consider two target lists equivalent if both have + * only Var entries and resjunk of each target entry is same. + * + * This function is used to decide whether to create a result node. + * We need to ensure that each entry is Var as below node will not be + * projection capable, so it will not be able to compute expressions. + */ + bool + equivalent_tlists(List *tlist1, List *tlist2) + { + ListCell *lp, +*lc; + + if (list_length(tlist1) != list_length(tlist2)) + return false; /* tlists not same length */ + + forboth(lp, tlist1, lc, tlist2) + { + TargetEntry *tle1 = (TargetEntry *) lfirst(lp); + TargetEntry *tle2 = (TargetEntry *) lfirst(lc); + + if (tle1-resjunk != tle1-resjunk) + return false; /* tlist doesn't match junk status */ + + if (tle1-expr IsA(tle1-expr, Var) + tle2-expr IsA(tle2-expr, Var)) + { + Var*var1 = (Var *) tle1-expr; + Var*var2 = (Var *) tle2-expr; + + if (var1-varattno != var2-varattno) + return false; /* different order */ + } + else + return false; + } + return true; + } Hmm, shouldn't that also check Var.varno and Var.varlevelsup? I tried really hard to come up with a query that would fail because of that, but I wasn't able to find one. Nevertheless, seems like those fields should be checked. I had reffered functions trivial_subqueryscan() and tlist_matches_tupdesc() which does something similar. I had rechecked today and found that both of these functions have Assert for Var.varno and Var.varlevelsup. If you think that for current case we should have check rather than Assert, then I can update the patch for same. On a more trivial note, equivalent_tlists() seems too generic for this. I'd expect two tlists that contain e.g an identical Const node to be equivalent, but this returns false for it. I'd suggest calling it something like tlist_needs_projection() instead. Also, it probably belongs in tlist.c. You are right. I shall update this in patch once above point for usage of vano and varlevelup is confirmed. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On 12.02.2013 11:03, Amit Kapila wrote: + /* + * equivalent_tlists + * returns whether two traget lists are equivalent + * + * We consider two target lists equivalent if both have + * only Var entries and resjunk of each target entry is same. + * + * This function is used to decide whether to create a result node. + * We need to ensure that each entry is Var as below node will not be + * projection capable, so it will not be able to compute expressions. + */ + bool + equivalent_tlists(List *tlist1, List *tlist2) + { + ListCell *lp, + *lc; + + if (list_length(tlist1) != list_length(tlist2)) + return false; /* tlists not same length */ + + forboth(lp, tlist1, lc, tlist2) + { + TargetEntry *tle1 = (TargetEntry *) lfirst(lp); + TargetEntry *tle2 = (TargetEntry *) lfirst(lc); + + if (tle1-resjunk != tle1-resjunk) + return false; /* tlist doesn't match junk status */ + + if (tle1-expr IsA(tle1-expr, Var) + tle2-expr IsA(tle2-expr, Var)) + { + Var*var1 = (Var *) tle1-expr; + Var*var2 = (Var *) tle2-expr; + + if (var1-varattno != var2-varattno) + return false; /* different order */ + } + else + return false; + } + return true; + } Hmm, shouldn't that also check Var.varno and Var.varlevelsup? I tried really hard to come up with a query that would fail because of that, but I wasn't able to find one. Nevertheless, seems like those fields should be checked. On a more trivial note, equivalent_tlists() seems too generic for this. I'd expect two tlists that contain e.g an identical Const node to be equivalent, but this returns false for it. I'd suggest calling it something like tlist_needs_projection() instead. Also, it probably belongs in tlist.c. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
I have updated the patch as per comments from Tom and Heikki. If you can verify it, then IMO it can be marked as 'Ready For Committer' Would you please do that? Done. Thank you. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Hello, I've read the discussion held so far and am satisfied that apply this patch only for Result node. I applied the patch and found that it worked pretty fine for me. Thank you and I also think that we may send this to committers. # It makes me fee ill at ease that the roles of us look inverted :-p At Wed, 13 Feb 2013 09:08:21 +0530, Amit Kapila amit.kap...@huawei.com wrote in 001801ce099b$92b2df20$b8189d60$@kap...@huawei.com I have updated the patch as per comments from Tom and Heikki. If you can verify it, then IMO it can be marked as 'Ready For Committer' Would you please do that? Regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Friday, February 15, 2013 2:03 PM Kyotaro HORIGUCHI wrote: Hello, I've read the discussion held so far and am satisfied that apply this patch only for Result node. I applied the patch and found that it worked pretty fine for me. Thank you and I also think that we may send this to committers. # It makes me fee ill at ease that the roles of us look inverted :-p At Wed, 13 Feb 2013 09:08:21 +0530, Amit Kapila amit.kap...@huawei.com wrote in 001801ce099b$92b2df20$b8189d60$@kap...@huawei.com I have updated the patch as per comments from Tom and Heikki. If you can verify it, then IMO it can be marked as 'Ready For Committer' Would you please do that? Done. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Saturday, February 09, 2013 9:03 AM Tom Lane wrote: Amit kapila amit.kap...@huawei.com writes: if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) Sorry, the check I suggested in last mail should be as below: if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) You know, I was thinking that compare_tlist_exprs() was a pretty unhelpfully-chosen name for a function returning boolean, and this thinko pretty much proves the point. It'd be better to call it something like equivalent_tlists(), tlists_are_equivalent(), etc. (I'm not caring for the emphasis on the exprs either, because I think it'll also be necessary to compare resjunk fields for instance.) We cannot directly compare expressions in target list as even if expressions are equal, below node (ex. APPEND) will not do projection, and hence expr will not be evaluated. So I have checked if both expr's are Var and have same varattno. I have included tests of expr and resjunk in equivalent_tlists(). I am not able see the need of comparison for resno and ressortgroupref, so I have not include the check for them in equivalent_tlists(). Updated patch is attached with this mail. Conclusion of Test - 1. The performance improvement is very good (upto 60%) for table with 30 columns 2. The performance improvement is good (upto 36%) even when table contains 2 columns Configuration -- Os - Suse -11.2 CPU - Intel(R) Xeon(R) L5408 @ 2.13GHz Ram - 24G shared_buffers - 4GB Test Data --- 30 Columns -- CREATE TABLE tbl_parent (c01 int, c02 numeric, c03 int, c04 int, c05 int, c06 int, c07 int, c08 int, c09 int, c10 int, c11 numeric, c12 numeric, c13 numeric, c14 numeric, c15 numeric, c16 numeric, c17 numeric, c18 numeric, c19 numeric, c20 numeric, c21 int, c22 numeric, c23 int, c24 int, c25 int, c26 int, c27 int, c28 int, c29 int, c30 int); CREATE TABLE tbl_child () INHERITS(tbl_parent); INSERT INTO tbl_child (SELECT n, floor(random() * 1), n+2, n+3, n+4, n+5, n+6, n+7, n+8, n+9, n+10, n+11, n+12, n+13, n+14, n+15, n+16, n+17, n+18, n+19, n+20, n+21, n+22, n+23, n+24, n+25, n+26, n+27, n+28, n+29 FROM generate_series(0, 1000 - 1) n); EXPLAIN ANALYZE SELECT * FROM tbl_parent; Original - 9995.1994 ms Patched - 3817.268 ms (~61%) 10 Columns -- CREATE TABLE tbl_parent (c01 int, c02 numeric, c03 int, c04 int, c05 int, c06 numeric, c07 numeric, c08 numeric, c09 numeric, c10 numeric); CREATE TABLE tbl_child () INHERITS(tbl_parent); INSERT INTO tbl_child (SELECT n, floor(random() * 1), n+2, n+3, n+4, n+5, n+6, n+7, n+8, n+9 FROM generate_series(0, 1000 - 1) n); EXPLAIN ANALYZE SELECT * FROM tbl_parent; Original - 6404.2992 ms Patched - 3374.2356 ms (~47%) 2 Columns - CREATE TABLE tbl_parent (c01 numeric, c02 int); CREATE TABLE tbl_child () INHERITS(tbl_parent); INSERT INTO tbl_child (SELECT floor(random() * 1), n FROM generate_series(0, 1000 - 1) n); EXPLAIN ANALYZE SELECT * FROM tbl_parent; Original - 4952.5758 ms Patched - 3162.2286 ms (~36%) With Regards, Amit Kapila. identity_projection_v3_20130212.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Hello. Sorry for long absence. # I've lost my health and am not fully recovered.. The direction of the discussion now taken place is just what I've wanted. The patch I proposed simply came from my poor understanding about exact how to detect identity projection by comparing tlists, and I couldn't found how to eliminate unwanted nodes appropriately. Thanks, Amit. I'll catch up this discussion soon. amit.kapila Amit kapila amit.kap...@huawei.com writes: amit.kapila if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) amit.kapila amit.kapila Sorry, the check I suggested in last mail should be as below: amit.kapila amit.kapila if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) amit.kapila amit.kapila You know, I was thinking that compare_tlist_exprs() was a pretty amit.kapila unhelpfully-chosen name for a function returning boolean, and this amit.kapila thinko pretty much proves the point. It'd be better to call it amit.kapila something like equivalent_tlists(), tlists_are_equivalent(), etc. amit.kapila (I'm not caring for the emphasis on the exprs either, because I think amit.kapila it'll also be necessary to compare resjunk fields for instance.) amit.kapila amit.kapila The fields which cannot be compared are resname, resorigtbl, resorigcol as these gets cleared in planner. amit.kapila I am not sure about fields resno and ressortgroupref, but I will check in more detail before sending patch. With best regards, -- Kyotaro Horiguchi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Wednesday, February 13, 2013 8:12 AM Kyotaro HORIGUCHI wrote: Hello. Sorry for long absence. # I've lost my health and am not fully recovered.. The direction of the discussion now taken place is just what I've wanted. The patch I proposed simply came from my poor understanding about exact how to detect identity projection by comparing tlists, and I couldn't found how to eliminate unwanted nodes appropriately. I have updated the patch as per comments from Tom and Heikki. If you can verify it, then IMO it can be marked as 'Ready For Committer' Thanks, Amit. I'll catch up this discussion soon. amit.kapila Amit kapila amit.kap...@huawei.com writes: amit.kapila if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) amit.kapila amit.kapila Sorry, the check I suggested in last mail should be as below: amit.kapila amit.kapila if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) amit.kapila amit.kapila You know, I was thinking that compare_tlist_exprs() was a pretty amit.kapila unhelpfully-chosen name for a function returning boolean, and this amit.kapila thinko pretty much proves the point. It'd be better to call it amit.kapila something like equivalent_tlists(), tlists_are_equivalent(), etc. amit.kapila (I'm not caring for the emphasis on the exprs either, because I think amit.kapila it'll also be necessary to compare resjunk fields for instance.) amit.kapila amit.kapila The fields which cannot be compared are resname, resorigtbl, resorigcol as these gets cleared in planner. amit.kapila I am not sure about fields resno and ressortgroupref, but I will check in more detail before sending patch. With best regards, -- Kyotaro Horiguchi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Saturday, February 09, 2013 9:03 AM Tom Lane wrote: Amit kapila amit.kap...@huawei.com writes: if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) Sorry, the check I suggested in last mail should be as below: if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) You know, I was thinking that compare_tlist_exprs() was a pretty unhelpfully-chosen name for a function returning boolean, and this thinko pretty much proves the point. It'd be better to call it something like equivalent_tlists(), tlists_are_equivalent(), etc. (I'm not caring for the emphasis on the exprs either, because I think it'll also be necessary to compare resjunk fields for instance.) The fields which cannot be compared are resname, resorigtbl, resorigcol as these gets cleared in planner. I am not sure about fields resno and ressortgroupref, but I will check in more detail before sending patch. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Friday, February 08, 2013 12:00 AM Tom Lane wrote: Amit Kapila amit.kap...@huawei.com writes: There can be 2 ways to remove result node a. Remove the Result plan node in case it is not required - This is same as currently it does for SubqueryScan. We can check if the result plan is trivial (with logic similar to trivial_subqueryscan()), then remove result node. b. to avoid adding it to Plan node in case it is not required - For this, in grouping_planner() currently it checks if the plan is projection capable (is_projection_capable_plan()), we can enhance this check such that it also check projection is really required depending if the targetlist contains any non Var element. Please suggest which way is more preferable and if one of above 2 seems to be okay, Adding a result node only to remove it again seems a bit expensive. It'd be better not to generate the node in the first place. (There's a technical reason to generate a temporary SubqueryScan, which is to keep Var numbering in the subplan separate from that in the upper plan; but AFAICS that doesn't apply to Result.) An advantage of removing useless Results at setrefs.c time is that we can be sure it will be applied to all Result nodes. However, we might be able to do it the other way with only one point-of-change if we hack make_result itself to check whether the proposed tlist is an identity. So for this, if a,b,c (below mentioned conds.) are true then don't create Result Plan, just return subplan a. subplan is NOT NULL and b. resconstantqual is NULL and c. compare expr of each TargetEntry for proposed tlist with subplan target Note that contains non Var element is the wrong test for this anyway --- the question is does the tlist have the same expressions in the same order as the tlist of the Result's child node. As per my understanding, currently in code wherever Result node can be avoided, it calls function is_projection_capable_plan(), so we can even enhance is_projection_capable_plan() so that it can also verify the expressions of tlists. But for this we need to change at all places from where is_projection_capable_plan() is called. With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Amit Kapila amit.kap...@huawei.com writes: On Friday, February 08, 2013 12:00 AM Tom Lane wrote: As per my understanding, currently in code wherever Result node can be avoided, it calls function is_projection_capable_plan(), so we can even enhance is_projection_capable_plan() so that it can also verify the expressions of tlists. But for this we need to change at all places from where is_projection_capable_plan() is called. Hm. Really there's a whole dance that typically goes on, which is like if (!is_projection_capable_plan(result_plan)) { result_plan = (Plan *) make_result(root, sub_tlist, NULL, result_plan); } else { /* * Otherwise, just replace the subplan's flat tlist with * the desired tlist. */ result_plan-targetlist = sub_tlist; } Perhaps we could encapsulate this whole sequence into a function called say assign_targetlist_to_plan(), which would have the responsibility to decide whether a Result node needs to be inserted. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Friday, February 08, 2013 11:06 PM Tom Lane wrote: Amit Kapila amit(dot)kapila(at)huawei(dot)com writes: On Friday, February 08, 2013 12:00 AM Tom Lane wrote: As per my understanding, currently in code wherever Result node can be avoided, it calls function is_projection_capable_plan(), so we can even enhance is_projection_capable_plan() so that it can also verify the expressions of tlists. But for this we need to change at all places from where is_projection_capable_plan() is called. Hm. Really there's a whole dance that typically goes on, which is like if (!is_projection_capable_plan(result_plan)) Perhaps we could encapsulate this whole sequence into a function called say assign_targetlist_to_plan(), which would have the responsibility to decide whether a Result node needs to be inserted. If we want to encapsulate whole of above logic in assign_targetlist_to_plan(), then the responsibility of new functionwill be much higher, because the code that assigns targetlist is not same at all places. For example Related code in prepare_sort_from_pathkeys() is as below where it needs to append junk entry to target list. if (!adjust_tlist_in_place !is_projection_capable_plan(lefttree)) { /* copy needed so we don't modify input's tlist below */ tlist = copyObject(tlist); lefttree = (Plan *) make_result(root, tlist, NULL, lefttree); } /* Don't bother testing is_projection_capable_plan again */ adjust_tlist_in_place = true; /* * Add resjunk entry to input's tlist */ tle = makeTargetEntry(sortexpr, list_length(tlist) + 1, NULL, true); tlist = lappend(tlist, tle); lefttree-targetlist = tlist; /* just in case NIL before */ Similar kind of code is there in grouping_planner for the case of activeWindows.Now we can change the code such that places where any new target entry has to be added to target list, move that part of code before calling assign_targetlist_to_plan or pass extra parameters to assign_targetlist_to_plan, so that it can accomodate all such cases. The story doesn't ends there, in some places it has to make a copy of targetlist before assigning it to plan's targetlist. How about if just enhance the code as below: if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) { result_plan = (Plan *) make_result(root, sub_tlist, NULL, result_plan); where the new function will be something as below: bool compare_tlist_exprs(List *tlist1, List *tlist2) { ListCell *lp,*lc; if (list_length(tlist1) != list_length(tlist2)) return false;/* tlists not same length */ forboth(lp, tlist1, lc, tlist2) { TargetEntry *ptle = (TargetEntry *) lfirst(lp); TargetEntry *ctle = (TargetEntry *) lfirst(lc); if(!equal(ptle-expr,ctle-expr)) return false; } return true; } With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Saturday, February 09, 2013 6:56 AM Amit kapila wrote: Friday, February 08, 2013 11:06 PM Tom Lane wrote: Amit Kapila amit(dot)kapila(at)huawei(dot)com writes: On Friday, February 08, 2013 12:00 AM Tom Lane wrote: As per my understanding, currently in code wherever Result node can be avoided, Hm. Really there's a whole dance that typically goes on, which is like if (!is_projection_capable_plan(result_plan)) Perhaps we could encapsulate this whole sequence into a function called say assign_targetlist_to_plan(), which would have the responsibility to decide whether a Result node needs to be inserted. if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) Sorry, the check I suggested in last mail should be as below: if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Amit kapila amit.kap...@huawei.com writes: if (!is_projection_capable_plan(result_plan) compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) Sorry, the check I suggested in last mail should be as below: if (!is_projection_capable_plan(result_plan) !compare_tlist_exprs(sub_tlist, result_plan-targetlist) ) You know, I was thinking that compare_tlist_exprs() was a pretty unhelpfully-chosen name for a function returning boolean, and this thinko pretty much proves the point. It'd be better to call it something like equivalent_tlists(), tlists_are_equivalent(), etc. (I'm not caring for the emphasis on the exprs either, because I think it'll also be necessary to compare resjunk fields for instance.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
On Friday, December 14, 2012 5:11 PM Heikki Linnakangas wrote: On 12.11.2012 12:07, Kyotaro HORIGUCHI wrote: Hello, This is new version of identity projection patch. Reverted projectionInfo and ExecBuildProjectionInfo. Identity projection is recognized directly in ExecGroup, ExecResult, and ExecWindowAgg. nodeAgg is reverted because I couldn't make it sane.. The following is the result of performance test posted before in order to show the source of the gain. Hmm, this reminds me of the discussion on removing useless Limit nodes: http://archives.postgresql.org/pgsql-performance/2012-12/msg00127.php. The optimization on Group, WindowAgg and Agg nodes doesn't seem that important, the cost of doing the aggregation/grouping is likely overwhelming the projection cost, and usually you do projection in grouping/aggregation anyway. But makes sense for Result. For Result, I think you should aim to remove the useless Result node from the plan altogether. I was reviewing this patch and found that it can be move forward as follows: There can be 2 ways to remove result node a. Remove the Result plan node in case it is not required - This is same as currently it does for SubqueryScan. We can check if the result plan is trivial (with logic similar to trivial_subqueryscan()), then remove result node. b. to avoid adding it to Plan node in case it is not required - For this, in grouping_planner() currently it checks if the plan is projection capable (is_projection_capable_plan()), we can enhance this check such that it also check projection is really required depending if the targetlist contains any non Var element. Please suggest which way is more preferable and if one of above 2 seems to be okay, I can update the patch on behalf of Kyotaro-san incase they don't have time currently. And do the same for useless Limit nodes. Is it okay, if this can be done as part of separate patch? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Amit Kapila amit.kap...@huawei.com writes: There can be 2 ways to remove result node a. Remove the Result plan node in case it is not required - This is same as currently it does for SubqueryScan. We can check if the result plan is trivial (with logic similar to trivial_subqueryscan()), then remove result node. b. to avoid adding it to Plan node in case it is not required - For this, in grouping_planner() currently it checks if the plan is projection capable (is_projection_capable_plan()), we can enhance this check such that it also check projection is really required depending if the targetlist contains any non Var element. Please suggest which way is more preferable and if one of above 2 seems to be okay, Adding a result node only to remove it again seems a bit expensive. It'd be better not to generate the node in the first place. (There's a technical reason to generate a temporary SubqueryScan, which is to keep Var numbering in the subplan separate from that in the upper plan; but AFAICS that doesn't apply to Result.) An advantage of removing useless Results at setrefs.c time is that we can be sure it will be applied to all Result nodes. However, we might be able to do it the other way with only one point-of-change if we hack make_result itself to check whether the proposed tlist is an identity. Note that contains non Var element is the wrong test for this anyway --- the question is does the tlist have the same expressions in the same order as the tlist of the Result's child node. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Kyotaro, Are you planning to update this patch based on Heikki's comments? The patch is listed in the commitfest and we're trying to make some progress through all of those patches. Thanks, Stephen * Heikki Linnakangas (hlinn...@iki.fi) wrote: On 12.11.2012 12:07, Kyotaro HORIGUCHI wrote: Hello, This is new version of identity projection patch. Reverted projectionInfo and ExecBuildProjectionInfo. Identity projection is recognized directly in ExecGroup, ExecResult, and ExecWindowAgg. nodeAgg is reverted because I couldn't make it sane.. The following is the result of performance test posted before in order to show the source of the gain. Hmm, this reminds me of the discussion on removing useless Limit nodes: http://archives.postgresql.org/pgsql-performance/2012-12/msg00127.php. The optimization on Group, WindowAgg and Agg nodes doesn't seem that important, the cost of doing the aggregation/grouping is likely overwhelming the projection cost, and usually you do projection in grouping/aggregation anyway. But makes sense for Result. For Result, I think you should aim to remove the useless Result node from the plan altogether. And do the same for useless Limit nodes. - Heikki signature.asc Description: Digital signature
Re: [HACKERS] Identity projection
On 12.11.2012 12:07, Kyotaro HORIGUCHI wrote: Hello, This is new version of identity projection patch. Reverted projectionInfo and ExecBuildProjectionInfo. Identity projection is recognized directly in ExecGroup, ExecResult, and ExecWindowAgg. nodeAgg is reverted because I couldn't make it sane.. The following is the result of performance test posted before in order to show the source of the gain. Hmm, this reminds me of the discussion on removing useless Limit nodes: http://archives.postgresql.org/pgsql-performance/2012-12/msg00127.php. The optimization on Group, WindowAgg and Agg nodes doesn't seem that important, the cost of doing the aggregation/grouping is likely overwhelming the projection cost, and usually you do projection in grouping/aggregation anyway. But makes sense for Result. For Result, I think you should aim to remove the useless Result node from the plan altogether. And do the same for useless Limit nodes. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Hello, This is new version of identity projection patch. Reverted projectionInfo and ExecBuildProjectionInfo. Identity projection is recognized directly in ExecGroup, ExecResult, and ExecWindowAgg. nodeAgg is reverted because I couldn't make it sane.. The following is the result of performance test posted before in order to show the source of the gain. regards, -- -- Kyotaro Horiguchi NTT Open Source Software Center At Fri, 05 Oct 2012 16:04:16 +0900, Kyotaro HORIGUCHI wrote in 20121005.160416.256387378.horiguchi.kyot...@lab.ntt.co.jp Although I said as following, the gain seems a bit larger... I'll recheck the testing conditions... I had inspected more precisely on two aspects maginifying the effect of this patch by putting 300 columns into table. First, explain analyze says the difference caused by this patch is only in the actual time of Result node. orig$ psql -c 'explain analyze select * from parenta' QUERY PLAN -- Result (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.013.. *2406.792* rows=100 loops=1) - Append (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011..412.749 rows=100 loops=1) - Seq Scan on parenta (cost=0.00..0.00 rows=1 width=1228) (actual time=0.001..0.001 rows=0 loops=1) - Seq Scan on childa000 parenta (cost=0.00..176667.00 rows=100 width=1202) (actual time=0.009..334.633 rows=100 loops=1) Total runtime: 2446.474 ms (5 rows) patched$ psql -c 'explain analyze select * from parenta' QUERY PLAN -- Result (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011.. *507.239* rows=100 loops=1) - Append (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011..419.420 rows=100 loops=1) - Seq Scan on parenta (cost=0.00..0.00 rows=1 width=1228) (actual time=0.000..0.000 rows=0 loops=1) - Seq Scan on childa000 parenta (cost=0.00..176667.00 rows=100 width=1202) (actual time=0.010..335.721 rows=100 loops=1) Total runtime: 545.879 ms (5 rows) Second, the results of configure --enable-profiling shows that the exectime of ExecProject chages greately. This is consistent with what explain showed. orig: % cumulative self self total time seconds secondscalls s/call s/call name 60.29 1.26 1.26 105 0.00 0.00 slot_deform_tuple ! 30.14 1.89 0.63 100 0.00 0.00 ExecProject 3.35 1.96 0.07 304 0.00 0.00 ExecProcNode 0.96 1.98 0.02 102 0.00 0.00 ExecScan 0.96 2.00 0.02 166379 0.00 0.00 TerminateBufferIO 0.48 2.01 0.01 304 0.00 0.00 InstrStartNode 0.48 2.02 0.01 304 0.00 0.00 InstrStopNode ! 0.48 2.03 0.01 101 0.00 0.00 ExecResult 0.48 2.04 0.01 830718 0.00 0.00 LWLockAcquire 0.48 2.05 0.01 506834 0.00 0.00 hash_search_with_hash_value 0.48 2.06 0.01 341656 0.00 0.00 LockBuffer 0.48 2.07 0.01 168383 0.00 0.00 ReadBuffer_common 0.48 2.08 0.014 0.00 0.00 InstrEndLoop 0.48 2.09 0.01 ExecCleanTargetListLength 0.00 2.09 0.00 205 0.00 0.00 MemoryContextReset patched: % cumulative self self total time seconds secondscalls ms/call ms/call name 23.08 0.03 0.03 304 0.00 0.00 ExecProcNode 15.38 0.05 0.02 102 0.00 0.00 heapgettup_pagemode 15.38 0.07 0.02 830718 0.00 0.00 LWLockAcquire 7.69 0.08 0.01 205 0.00 0.00 MemoryContextReset 7.69 0.09 0.01 102 0.00 0.00 ExecScan 7.69 0.10 0.01 100 0.00 0.00 ExecStoreTuple 7.69 0.11 0.01 841135 0.00 0.00 LWLockRelease 7.69 0.12 0.01 168383 0.00 0.00 ReadBuffer_common 7.69 0.13 0.01 168383 0.00 0.00 UnpinBuffer 0.00 0.13 0.00 304 0.00 0.00 InstrStartNode ... ! 0.00 0.13 0.00 101 0.00 0.00 ExecResult ! 0.00 0.13 0.00 100 0.00 0.00 ExecProject == diff --git a/src/backend/executor/nodeGroup.c
Re: [HACKERS] Identity projection
Kyotaro HORIGUCHI wrote: Hello, sorry for long absense, # I had unexpected and urgent time-consuming tasks... :-( I have had a bit more precise inspection by two aspects, and they seemd showing that the difference should be the execution time of ExecProject. I'll be able to back fully next week with reviesed patch, and to take some other pathes to review... Hi, I've marked this patch as Returned with Feedback (thanks Tom). Please submit an updated version to the upcoming commitfest. Thanks. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Ah. It's too late. I'll re-submit updated versions of my patches left alone in the last CF. Hi, I've marked this patch as Returned with Feedback (thanks Tom). Please submit an updated version to the upcoming commitfest. Thanks. I'm sorry and thank you. -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Hello, sorry for long absense, # I had unexpected and urgent time-consuming tasks... :-( I have had a bit more precise inspection by two aspects, and they seemd showing that the difference should be the execution time of ExecProject. I'll be able to back fully next week with reviesed patch, and to take some other pathes to review... Although I said as following, the gain seems a bit larger... I'll recheck the testing conditions... I had inspected more precisely on two aspects maginifying the effect of this patch by putting 300 columns into table. First, explain analyze says the difference caused by this patch is only in the actual time of Result node. orig$ psql -c 'explain analyze select * from parenta' QUERY PLAN -- Result (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.013.. *2406.792* rows=100 loops=1) - Append (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011..412.749 rows=100 loops=1) - Seq Scan on parenta (cost=0.00..0.00 rows=1 width=1228) (actual time=0.001..0.001 rows=0 loops=1) - Seq Scan on childa000 parenta (cost=0.00..176667.00 rows=100 width=1202) (actual time=0.009..334.633 rows=100 loops=1) Total runtime: 2446.474 ms (5 rows) patched$ psql -c 'explain analyze select * from parenta' QUERY PLAN -- Result (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011.. *507.239* rows=100 loops=1) - Append (cost=0.00..176667.00 rows=101 width=1202) (actual time=0.011..419.420 rows=100 loops=1) - Seq Scan on parenta (cost=0.00..0.00 rows=1 width=1228) (actual time=0.000..0.000 rows=0 loops=1) - Seq Scan on childa000 parenta (cost=0.00..176667.00 rows=100 width=1202) (actual time=0.010..335.721 rows=100 loops=1) Total runtime: 545.879 ms (5 rows) Second, the results of configure --enable-profiling shows that the exectime of ExecProject chages greately. This is consistent with what explain showed. orig: % cumulative self self total time seconds secondscalls s/call s/call name 60.29 1.26 1.26 105 0.00 0.00 slot_deform_tuple ! 30.14 1.89 0.63 100 0.00 0.00 ExecProject 3.35 1.96 0.07 304 0.00 0.00 ExecProcNode 0.96 1.98 0.02 102 0.00 0.00 ExecScan 0.96 2.00 0.02 166379 0.00 0.00 TerminateBufferIO 0.48 2.01 0.01 304 0.00 0.00 InstrStartNode 0.48 2.02 0.01 304 0.00 0.00 InstrStopNode ! 0.48 2.03 0.01 101 0.00 0.00 ExecResult 0.48 2.04 0.01 830718 0.00 0.00 LWLockAcquire 0.48 2.05 0.01 506834 0.00 0.00 hash_search_with_hash_value 0.48 2.06 0.01 341656 0.00 0.00 LockBuffer 0.48 2.07 0.01 168383 0.00 0.00 ReadBuffer_common 0.48 2.08 0.014 0.00 0.00 InstrEndLoop 0.48 2.09 0.01 ExecCleanTargetListLength 0.00 2.09 0.00 205 0.00 0.00 MemoryContextReset patched: % cumulative self self total time seconds secondscalls ms/call ms/call name 23.08 0.03 0.03 304 0.00 0.00 ExecProcNode 15.38 0.05 0.02 102 0.00 0.00 heapgettup_pagemode 15.38 0.07 0.02 830718 0.00 0.00 LWLockAcquire 7.69 0.08 0.01 205 0.00 0.00 MemoryContextReset 7.69 0.09 0.01 102 0.00 0.00 ExecScan 7.69 0.10 0.01 100 0.00 0.00 ExecStoreTuple 7.69 0.11 0.01 841135 0.00 0.00 LWLockRelease 7.69 0.12 0.01 168383 0.00 0.00 ReadBuffer_common 7.69 0.13 0.01 168383 0.00 0.00 UnpinBuffer 0.00 0.13 0.00 304 0.00 0.00 InstrStartNode ... ! 0.00 0.13 0.00 101 0.00 0.00 ExecResult ! 0.00 0.13 0.00 100 0.00 0.00 ExecProject regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Hello, Thank you for suggestions. This patch reduces run time of such queries by 45% when result recored has 30 columns and seems to have no harm for performance. This patch seems quite unsafe to me: it's not generally okay for a plan node to return a slot that doesn't belong to it, because of tuple-lifespan issues. It's possible that Result in particular could do that safely, but if so we ought to hack nodeResult.c for it, not the generic projection machinery. Hmm.. Concerning tuple-lifespan, almost every types of node which may do projection has the route for no-projInfo. This patch for nodeResult eventually does the same thing. If they are special cases and the operation could not be done generally, I should follow the real(or hidden amoung code lines?) lifespan regulation... From the another point of view, execution nodes may hold tupleslots which palloc'ed in projInfos of them just after receiving a result tuple from their childs. And thy are finally free'd in next projection on the same node or ExecEndNode() after the fiish of processing the entire current query. The life of the contents in the slots should be valid until next projection in upper nodes or sending the result tuple. The execution tree is bottom-spreaded and every node in it could not be executed under different ancestors, and no multi-threaded execution.. The above is the figure from my view. And I suppose these facts(is it correct?) are enough to ensure the tuple-lifeplan. And concerning genericity of 'identity projection', .. Perhaps you're right. I capsulated the logic into ExecProject but it is usable only from a few kind of nodes.. I'll revert modification on ExecProject and do identity projection on each nodes which can do that. Something I'd been considering in connection with another example is teaching the planner not to generate a Result node in the first place, if the node is just doing an identity projection. There are a couple of ways that could be done --- one is to get setrefs.c to remove the node on-the-fly, similar to what it does with SubqueryScan. But it might be better just to check whether the node is actually needed before creating it in the first place. I completely agree with the last sentence regarding Result node. As I described in the previous message, it was a bit hard to find the way to do that. I'll seek for that with more effort. Another point here is that the projection code already special-cases simple projections, so it's a bit hard to believe that it's as slow as you suggest above. I wonder if your test case is confusing that optimization somehow. The whole table is on memory and query is very simple and the number of columns is relatively larger in this case. This is because I intended to improve retrieving a large part of partitioned table with many columns. In this case, the executor shuttles up and down in shallow tree and every level does almost nothing, but the result node does pfree/palloc and direct mapping up to 30 columns which seems rather heavy in the whole this execution. I could found no sign of failure of optimization in that so simple execution tree... And the effect of cource becomes smaller for fewer columns or more complex queries, or queries on tables hanging out of memory onto disks. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Although I said as following, the gain seems a bit larger... I'll recheck the testing conditions... Another point here is that the projection code already special-cases simple projections, so it's a bit hard to believe that it's as slow as you suggest above. I wonder if your test case is confusing that optimization somehow. The whole table is on memory and query is very simple and the number of columns is relatively larger in this case. This is because I intended to improve retrieving a large part of partitioned table with many columns. -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: This patch seems quite unsafe to me: it's not generally okay for a plan node to return a slot that doesn't belong to it, because of tuple-lifespan issues. It's possible that Result in particular could do that safely, but if so we ought to hack nodeResult.c for it, not the generic projection machinery. Concerning tuple-lifespan, almost every types of node which may do projection has the route for no-projInfo. Huh? Maybe I'm missing something, but AFAICS the only place where ExecProject is conditionally bypassed is execScan.c, that is relation scan nodes. When execScan does skip the projection, what it returns instead is its scan-tuple slot --- that is, a different slot that still belongs to the scan node. So nothing changes from the standpoint of slot ownership or lifespan. Skipping the projection step in a non-leaf plan node would be an entirely different matter. From the another point of view, execution nodes may hold tupleslots which palloc'ed in projInfos of them just after receiving a result tuple from their childs. And thy are finally free'd in next projection on the same node or ExecEndNode() after the fiish of processing the entire current query. The life of the contents in the slots should be valid until next projection in upper nodes or sending the result tuple. The execution tree is bottom-spreaded and every node in it could not be executed under different ancestors, and no multi-threaded execution.. That's not so true as it used to be, with lazy evaluation of CTEs. I just had to fix a planner bug that amounted to assuming that plan tree execution is strictly top-down when it no longer is, so I'm uncomfortable with adding new assumptions of that kind to the executor. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Identity projection
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: This patch reduces run time of such queries by 45% when result recored has 30 columns and seems to have no harm for performance. This patch seems quite unsafe to me: it's not generally okay for a plan node to return a slot that doesn't belong to it, because of tuple-lifespan issues. It's possible that Result in particular could do that safely, but if so we ought to hack nodeResult.c for it, not the generic projection machinery. Something I'd been considering in connection with another example is teaching the planner not to generate a Result node in the first place, if the node is just doing an identity projection. There are a couple of ways that could be done --- one is to get setrefs.c to remove the node on-the-fly, similar to what it does with SubqueryScan. But it might be better just to check whether the node is actually needed before creating it in the first place. Another point here is that the projection code already special-cases simple projections, so it's a bit hard to believe that it's as slow as you suggest above. I wonder if your test case is confusing that optimization somehow. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers