Re: [HACKERS] parameterized limit statements
On Mon, 2005-11-07 at 18:43, Tom Lane wrote: [snip] If it doesn't have a value for the parameter, it'll assume 10% of table rows, which is what it's done for a long time if the LIMIT isn't reducible to a constant. Is 10% a reasonable guess here ? Here we use limit in combination with prepared statements to get something like less than 1% of the table. There are no exceptions to that in our code... even if the limit amount is a parameter. Furthermore, the limit amount is always a small number, usually ~ 100, but never more than 1000. So in my case, we could live with a suboptimal plan when the percentage would be more than 10%, cause then the table would be small enough not to matter that much. In turn it has a huge impact to wrongly guess 10% for a huge table... I think the best would be to guess 5% but maximum say 5000. That could work well with both small and huge tables. Maybe those values could be made configurable... just ideas, not like I could implement this... [snip] Cheers, Csaba. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] parameterized limit statements
I noticed your 8/18 commit to address an issue I raised regarding parameterized limit statements. Specifically, prepared statements with a variable limit would tend to revert to bitmap or seqscan. I check out cvs tip and am still getting that behavior :(. So, I had a look at createplan.c to see what was going on. Inside makelimit, there is: if (count_est != 0) { double count_rows; if (count_est 0) count_rows = (double) count_est; else count_rows = clamp_row_est(lefttree-plan_rows * 0.10); if (count_rows plan-plan_rows) count_rows = plan-plan_rows; if (plan-plan_rows 0) plan-total_cost = plan-startup_cost + (plan-total_cost - plan-startup_cost) * count_rows / plan-plan_rows; plan-plan_rows = count_rows; if (plan-plan_rows 1) plan-plan_rows = 1; } Is this correct? plan_rows is assigned (from count_rows) after it is checked to determine cost. If this is correct, would you like a test cast demonstrating the behavior? Merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] parameterized limit statements
Merlin Moncure [EMAIL PROTECTED] writes: Is this correct? Sure, what do you think is wrong with it? plan_rows is initially a copy of the child node's output-rows estimate, and then it gets modified. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] parameterized limit statements
Merlin Moncure [EMAIL PROTECTED] writes: Is this correct? Sure, what do you think is wrong with it? plan_rows is initially a copy of the child node's output-rows estimate, and then it gets modified. OK, just a stab in the dark...not familiar at all with this code (seemed odd to use value in comparison right before it was assigned). I am still getting prepared statements that are flipping to seqscan or bitmap scan. The statements are invariably in form of select a,b,c,d from t where a = $1 and (a $1 or b = $2) and (a $1 or b $2 or c = $3) and (a $1 or b $2 or c $3 or d $4) order by a, b, c, d limit $5; ^^ If I hardcode $5 to any sub-ridiculous value, I get a proper index plan. Does your patch assume a limit of 1 or 10% of table rows? FYI: the planner gets it right about 95% of the time and produces the best possible plan...an index filtering on a and scanning for b,c,d. Merlin ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] parameterized limit statements
^^ If I hardcode $5 to any sub-ridiculous value, I get a proper index plan. Does your patch assume a limit of 1 or 10% of table rows? If it doesn't have a value for the parameter, it'll assume 10% of table rows, which is what it's done for a long time if the LIMIT isn't reducible to a constant. I suspect the real issue here is that whatever you are doing doesn't give the planner a value to use for the parameter. IIRC, at the moment the only way that that happens is if you use the unnamed-statement variation of the Parse/Bind/Execute protocol. hm...I'm using named statements over ExecPrepared. I can also confirm the results inside psql with prepare/execute. I can send you a test case, but was just wondering if your change to makelimit was supposed to address this case. Merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] parameterized limit statements
Merlin Moncure [EMAIL PROTECTED] writes: The statements are invariably in form of select a,b,c,d from t where a = $1 and (a $1 or b = $2) and (a $1 or b $2 or c = $3) and (a $1 or b $2 or c $3 or d $4) order by a, b, c, d limit $5; ^^ If I hardcode $5 to any sub-ridiculous value, I get a proper index plan. Does your patch assume a limit of 1 or 10% of table rows? If it doesn't have a value for the parameter, it'll assume 10% of table rows, which is what it's done for a long time if the LIMIT isn't reducible to a constant. I suspect the real issue here is that whatever you are doing doesn't give the planner a value to use for the parameter. IIRC, at the moment the only way that that happens is if you use the unnamed-statement variation of the Parse/Bind/Execute protocol. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] parameterized limit statements
Merlin Moncure [EMAIL PROTECTED] writes: hm...I'm using named statements over ExecPrepared. I can also confirm the results inside psql with prepare/execute. I can send you a test case, but was just wondering if your change to makelimit was supposed to address this case. Nope, sorry. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings