Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-26 Thread Bruce Momjian
Robert Haas wrote: On Thu, Jan 20, 2011 at 4:17 AM, C?dric Villemain cedric.villemain.deb...@gmail.com wrote: I think his point is that we already have a proven formula (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air. The problem is to figure out what numbers to

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-20 Thread Cédric Villemain
2011/1/19 Bruce Momjian br...@momjian.us: Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I wondering if we could do something with a formula like 3 * amount_of_data_to_read / (3 *

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-20 Thread Robert Haas
On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I think his point is that we already have a proven formula (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air. The problem is to figure out what numbers to apply the M-L formula to.

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-20 Thread Cédric Villemain
2011/1/20 Robert Haas robertmh...@gmail.com: On Thu, Jan 20, 2011 at 4:17 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I think his point is that we already have a proven formula (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air. The problem is to figure

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-20 Thread Cédric Villemain
2011/1/19 Bruce Momjian br...@momjian.us: Robert Haas wrote: On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Yeah. ?For Kevin's case, it seems like we want the caching percentage to vary not so much based on which table we're

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Tom Lane wrote: Mladen Gogala mladen.gog...@vmsinfo.com writes: Again, having an optimizer which will choose the plan completely accurately is, at least in my opinion, less important than having a possibility of manual control, the aforementioned knobs and buttons and produce the same

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Robert Haas wrote: On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Yeah. ?For Kevin's case, it seems like we want the caching percentage to vary not so much based on which table we're hitting at the moment but on how much of it

Re: [PERFORM] anti-join chosen even when slower than old plan

2011-01-19 Thread Bruce Momjian
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Nov 12, 2010 at 4:15 AM, C?dric Villemain cedric.villemain.deb...@gmail.com wrote: I wondering if we could do something with a formula like 3 * amount_of_data_to_read / (3 * amount_of_data_to_read + effective_cache_size)

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-13 Thread Marc Mamin
Hello, Just a short though: Is it imaginable to compare the prognoses of the plans with the actual results and somehow log the worst cases ? a) to help the DBA locate bad statistics and queries b) as additional information source for the planner This could possibly affect parameters of your

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-13 Thread bricklen
On Sat, Nov 13, 2010 at 1:32 AM, Marc Mamin m.ma...@intershop.de wrote: Hello, Just a short though: Is it imaginable to compare the prognoses of the plans with the actual results and somehow log the worst cases ? a) to help the DBA locate bad statistics and queries b) as additional

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-13 Thread Robert Haas
On Sat, Nov 13, 2010 at 4:32 AM, Marc Mamin m.ma...@intershop.de wrote: Hello, Just a short though: Is it imaginable to compare the prognoses of the plans with the actual results and somehow log the worst cases ? a) to help the DBA locate bad statistics and queries b) as additional

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Cédric Villemain
2010/11/11 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: Yeah.  For Kevin's case, it seems like we want the caching percentage to vary not so much based on which table we're hitting at the moment but on how much of it we're actually reading. Well, we could certainly

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Cédric Villemain
2010/11/11 Robert Haas robertmh...@gmail.com: On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Yeah.  For Kevin's case, it seems like we want the caching percentage to vary not so much based on which table we're hitting at the moment

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Vitalii Tymchyshyn
I'd say there are two Qs here: 1) Modify costs based on information on how much of the table is in cache. It would be great if this can be done, but I'd prefer to have it as admin knobs (because of plan stability). May be both admin and automatic ways can be followed with some parallel

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Cédric Villemain
I supposed it was an answer to my mail but not sure... please keep CC'ed people, it is easier to follow threads (at least for me) 2010/11/12 Vitalii Tymchyshyn tiv...@gmail.com: I'd say there are two Qs here: 1) Modify costs based on information on how much of the table is in cache. It would

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Vitalii Tymchyshyn
12.11.10 12:56, Cédric Villemain написав(ла): I supposed it was an answer to my mail but not sure... please keep CC'ed people, it is easier to follow threads (at least for me) OK 2010/11/12 Vitalii Tymchyshyntiv...@gmail.com: I'd say there are two Qs here: 1) Modify costs based on

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Cédric Villemain
2010/11/12 Vitalii Tymchyshyn tiv...@gmail.com: 12.11.10 12:56, Cédric Villemain написав(ла): I supposed it was an answer to my mail but not sure... please keep CC'ed people, it is easier to follow threads (at least for me) OK 2010/11/12 Vitalii Tymchyshyntiv...@gmail.com: I'd say

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Robert Haas
On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I wondering if we could do something with a formula like 3 * amount_of_data_to_read / (3 * amount_of_data_to_read + effective_cache_size) = percentage NOT cached.  That is, if we're reading an amount of

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I wondering if we could do something with a formula like 3 * amount_of_data_to_read / (3 * amount_of_data_to_read + effective_cache_size) = percentage NOT

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Robert Haas
On Fri, Nov 12, 2010 at 11:43 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think his point is that we already have a proven formula (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air. The problem is to figure out what numbers to apply the M-L formula to. I'm not sure that's

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-12 Thread Cédric Villemain
2010/11/12 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Fri, Nov 12, 2010 at 4:15 AM, Cédric Villemain cedric.villemain.deb...@gmail.com wrote: I wondering if we could do something with a formula like 3 * amount_of_data_to_read / (3 * amount_of_data_to_read +

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Віталій Тимчишин
2010/11/11 Robert Haas robertmh...@gmail.com But thinking over what you've written here, I'm reminded of something Peter said years ago, also about the optimizer. He was discussed the ratio of the estimated cost to the actual cost and made an off-hand remark that efforts had been made over

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote: On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: Unfortunately, to know how much data we're going to grovel through,

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Kenneth Marshall wrote: I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Thu, Nov 11, 2010 at 09:15:58AM -0500, Mladen Gogala wrote: Kenneth Marshall wrote: I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Mladen Gogala mladen.gog...@vmsinfo.com wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. This all started because a database which was tuned for good response time for relatively small queries against a hot

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Bob Lunney
--- On Thu, 11/11/10, Mladen Gogala mladen.gog...@vmsinfo.com wrote: From: Mladen Gogala mladen.gog...@vmsinfo.com Subject: Re: [PERFORM] anti-join chosen even when slower than old plan To: Kenneth Marshall k...@rice.edu Cc: Robert Haas robertmh...@gmail.com, Tom Lane t...@sss.pgh.pa.us

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Kevin Grittner wrote: Mladen Gogala mladen.gog...@vmsinfo.com wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. I understand this, that is precisely the reason for my intervention into the discussion

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Mladen Gogala mladen.gog...@vmsinfo.com writes: Again, having an optimizer which will choose the plan completely accurately is, at least in my opinion, less important than having a possibility of manual control, the aforementioned knobs and buttons and produce the same plan for the same

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Craig James
On 11/11/10 9:13 AM, Mladen Gogala wrote: Kevin Grittner wrote: Mladen Gogala mladen.gog...@vmsinfo.com wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite. I understand this, that is precisely the reason for my

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 10:00 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Mladen Gogala mladen.gog...@vmsinfo.com wrote: create a definitive bias toward one type of the execution plan. We're talking about trying to support the exact opposite.  This all started because a database

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Mladen Gogala
Tom Lane wrote: More knobs and buttons is the Oracle way, True. Very true. and the end result of that process is that you have something as hard to use as Oracle. Also, you end up with something which is extremely reliable and adjustable to variety of conditions. That's generally not

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Let's back up a moment and talk about what the overall goal is, here. Ideally, we would like PostgreSQL to have excellent performance at all times, under all circumstances, with minimal tuning. Therefore, we do NOT want to add variables that will, by

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
I wrote: I do think that something based around a settable-per-table caching percentage might be a reasonable way to proceed. BTW ... on reflection it seems that this would *not* solve the use-case Kevin described at the start of this thread. What he's got AIUI is some large tables whose

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Let's back up a moment and talk about what the overall goal is, here. Ideally, we would like PostgreSQL to have excellent performance at all times, under all circumstances, with

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 1:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: I do think that something based around a settable-per-table caching percentage might be a reasonable way to proceed. BTW ... on reflection it seems that this would *not* solve the use-case Kevin described at the

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: BTW ... on reflection it seems that this would *not* solve the use-case Kevin described at the start of this thread. What he's got AIUI is some large tables whose recent entries are well- cached, and a lot of queries that tend to hit that well-cached

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, Nov 11, 2010 at 1:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: I do think that something based around a settable-per-table caching percentage might be a reasonable way to proceed.  But the devil is in the details, and we don't have those yet. I

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
I wrote: Besides the fully-scanned object size relative to relation size costing adjustment idea, s/relation size/effective cache size/ -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription:

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Besides the fully-scanned object size relative to relation size costing adjustment idea, the only one which seemed to be likely to be useful for this sort of issue was the costing factors by user ID idea -- the interactive queries hitting the

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: I've tried to avoid having the planner need to know the total size of the database cluster, but it's kind of hard to avoid that if you want to model this honestly. Agreed. Perhaps the cost could start escalating when the pages to access hit

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Erm ... you can in fact do ALTER USER SET random_page_cost today. Ouch. I'm embarrassed to have missed that. I'll do that instead of adding those settings to the scripts, then. Thanks for pointing that out. -Kevin -- Sent via pgsql-performance

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Andres Freund
On Thursday 11 November 2010 19:58:49 Tom Lane wrote: I wrote: I do think that something based around a settable-per-table caching percentage might be a reasonable way to proceed. BTW ... on reflection it seems that this would *not* solve the use-case Kevin described at the start of this

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Jon Nelson
On Thu, Nov 11, 2010 at 1:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Besides the fully-scanned object size relative to relation size costing adjustment idea, the only one which seemed to be likely to be useful for this sort of issue was the

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Robert Haas
On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Yeah.  For Kevin's case, it seems like we want the caching percentage to vary not so much based on which table we're hitting at the moment but on how much of it we're actually reading.

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread gnuoytr
Original message Date: Thu, 11 Nov 2010 15:29:40 -0500 From: pgsql-performance-ow...@postgresql.org (on behalf of Robert Haas robertmh...@gmail.com) Subject: Re: [PERFORM] anti-join chosen even when slower than old plan To: Tom Lane t...@sss.pgh.pa.us Cc: Kevin Grittner kevin.gritt

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-11 Thread Kenneth Marshall
On Thu, Nov 11, 2010 at 03:56:25PM -0500, gnuo...@rcn.com wrote: On a thread some time ago, on a similar subject, I opined that I missed the ability to assign tables to tablespaces and buffers to tablespaces, thus having the ability to isolate needed tables (perhaps a One True Lookup Table,

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: The semi-join and anti-join have helped us quite a bit, but we have seen a situation where anti-join is chosen even though it is slower than the old fashioned plan. I know there have been other reports of

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Tom Lane t...@sss.pgh.pa.us wrote: In principle, the old-style plan ought to be equivalent to a nestloop antijoin with a seqscan of DbTranLogRecord on the outside and an indexscan of DbTranRepository on the inside. Can you force it to choose

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Kevin Grittner
Grzegorz Jaśkiewicz wrote: you're joining on more than one key. That always hurts performance. That's very clearly *not* the problem, as there is a plan which runs in acceptable time but the optimizer is not choosing without being coerced. (1) Virtually every query we run joins on

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Robert Haas
On Wed, Nov 10, 2010 at 10:15 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: But wait -- it turns out that this pain was self-inflicted.  Based on heavy testing of the interactive queries which users run against this database we tuned the database for fully-cached settings, with both

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: Wow. That's fascinating, and if you don't mind, I might mention this potential problem in a future talk at some point. I don't mind at all. For example, in your case, it would be sufficient to estimate the amount of data that a given query is

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: Unfortunately, to know how much data we're going to grovel through, we need to know the plan; and to decide on the right plan, we need to know how much data we're going to grovel through. And that's

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Robert Haas
On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: Unfortunately, to know how much data we're going to grovel through, we need to know the plan; and to decide on the right plan, we

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-10 Thread Mladen Gogala
On 11/10/2010 5:43 PM, Kevin Grittner wrote: The only half-sane answer I've thought of is to apply a different cost to full-table or full-index scans based on the ratio with effective cache size. The effective_cache_size is, in my humble opinion, a wrong method. It would be much easier to

[PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Kevin Grittner
The semi-join and anti-join have helped us quite a bit, but we have seen a situation where anti-join is chosen even though it is slower than the old fashioned plan. I know there have been other reports of this, but I just wanted to go on record with my details. The query: delete from

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote: samples %symbol name 2320174 33.7617 index_getnext I couldn't resist seeing where the time went within this function. Over 13.7% of the opannotate run time was on this bit of code: /* * The xmin should match the previous xmax

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Kevin Grittner kevin.gritt...@wicourts.gov wrote: samples %symbol name 2320174 33.7617 index_getnext I couldn't resist seeing where the time went within this function. Over 13.7% of the opannotate run time was on this bit of

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: However, you'd have to be spending a lot of time chasing through long HOT chains before that would happen enough to make this a hotspot... That makes it all the more mysterious, then. These tables are insert-only except for a weekly delete of one week of

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: The semi-join and anti-join have helped us quite a bit, but we have seen a situation where anti-join is chosen even though it is slower than the old fashioned plan. I know there have been other reports of this, but I just wanted to go on

Re: [PERFORM] anti-join chosen even when slower than old plan

2010-11-09 Thread Grzegorz Jaśkiewicz
you're joining on more than one key. That always hurts performance. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance