Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
t...@fuzzy.cz wrote: On 03/23/2011 04:17 AM, Adarsh Sharma wrote: explain analyze select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id); You know... I'm surprised nobody has mentioned this, but DISTIN

[PERFORM] pg9.0.3 explain analyze running very slow compared to a different box with much less configuration

2011-03-23 Thread DM
Hi All, pg9.0.3 explain analyze running very slow compared to old box with much less configuration. But actual query is performing much better than the old server. old Server=== OS: CentOS release 5.4 (Final) Linux Server 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Josh Berkus
> If the planner starts operating on the basis of worst case rather than > expected-case performance, the complaints will be far more numerous than > they are today. Yeah, I don't think that's the way to go. The other thought I had was to accumulate a "risk" stat the same as we accumulate a "cos

Re: [PERFORM] Slow query on CLUTER -ed tables

2011-03-23 Thread Marti Raudsepp
2011/3/23 Laszlo Nagy : > "GroupAggregate  (cost=5553554.25..5644888.17 rows=2283348 width=50)" > "  ->  Sort  (cost=5553554.25..5559262.62 rows=2283348 width=50)" > "        Sort Key: pph.hid, ppoh.merchantid, pph.hdate" > "        ->  Nested Loop  (cost=0.00..5312401.66 rows=2283348 width=50)" >

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Claudio Freire
On Wed, Mar 23, 2011 at 6:00 PM, Tom Lane wrote: > Claudio Freire writes: >> In my head, safer = better worst-case performance. > > If the planner starts operating on the basis of worst case rather than > expected-case performance, the complaints will be far more numerous than > they are today.

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Tom Lane
Claudio Freire writes: > On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus wrote: >> On 3/23/11 10:35 AM, Claudio Freire wrote: >>>  *  consider plan bailout: execute a tempting plan, if it takes too >>> long or its effective cost raises well above the expected cost, bail >>> to a safer plan >> That

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Justin Pitts
On Wed, Mar 23, 2011 at 1:12 PM, Josh Berkus wrote: > AFAICT, what's happening in this query is that PostgreSQL's statistics > on the device_nodes and several other tables are slightly out of date > (as in 5% of the table). What about some manner of query feedback mechanism ( along the lines of w

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Claudio Freire
On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus wrote: > On 3/23/11 10:35 AM, Claudio Freire wrote: >>  *  consider plan bailout: execute a tempting plan, if it takes too >> long or its effective cost raises well above the expected cost, bail >> to a safer plan > > That would actually solve this part

[PERFORM] Slow query on CLUTER -ed tables

2011-03-23 Thread Laszlo Nagy
Given two tables: CREATE TABLE product_price_history ( hid bigint NOT NULL, hdate timestamp without time zone NOT NULL, id bigint NOT NULL, product_id bigint NOT NULL, more columns here CONSTRAINT pk_product_price_history PRIMARY KEY (hid); CREATE INDEX idx_product_price_history_i

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Josh Berkus
On 3/23/11 10:35 AM, Claudio Freire wrote: > * consider plan bailout: execute a tempting plan, if it takes too > long or its effective cost raises well above the expected cost, bail > to a safer plan That would actually solve this particular case. It would still require us to have some definiti

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Cédric Villemain
2011/3/23 Uwe Bartels : > On 23 March 2011 16:36, Jeff Janes wrote: >> >> On Wed, Mar 23, 2011 at 6:19 AM, Jochen Erwied >> wrote: >> > Wednesday, March 23, 2011, 1:51:31 PM you wrote: >> > >> > [rearranged for quoting] >> > >> >> background writer stats >> >>  checkpoints_timed | checkpoints_req

Re: [PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Claudio Freire
On Wed, Mar 23, 2011 at 2:12 PM, Josh Berkus wrote: > Folks, > >... > It really seems like we should be able to detect an obvious high-risk > situation like this one.  Or maybe we're just being too optimistic about > discarding subplans? Why not letting the GEQO learn from past mistakes? If some

[PERFORM] Shouldn't we have a way to avoid "risky" plans?

2011-03-23 Thread Josh Berkus
Folks, Yet more evidence that we need some way to assess query plans which are high-risk and avoid them (or have Yet Another GUC): Merge Join (cost=29.16..1648.00 rows=382 width=78) (actual time=57215.167..57215.216 rows=1 loops=1) Merge Cond: (rn.node_id = device_nodes.node_id) -> Neste

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Jeff Janes
On Wed, Mar 23, 2011 at 8:26 AM, Uwe Bartels wrote: > Hi Brad, > > yes. that's the question > in the source code in freelist.c there is something that I don't understand. > > This is the first try to get a free page. The second try scans used buffers. > What makes me wonder is the why postgres

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Uwe Bartels
On 23 March 2011 16:36, Jeff Janes wrote: > On Wed, Mar 23, 2011 at 6:19 AM, Jochen Erwied > wrote: > > Wednesday, March 23, 2011, 1:51:31 PM you wrote: > > > > [rearranged for quoting] > > > >> background writer stats > >> checkpoints_timed | checkpoints_req | buffers_checkpoint | > buffers_cl

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Jeff Janes
On Wed, Mar 23, 2011 at 6:19 AM, Jochen Erwied wrote: > Wednesday, March 23, 2011, 1:51:31 PM you wrote: > > [rearranged for quoting] > >> background writer stats >>  checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | >> maxwritten_clean | buffers_backend | buffers_alloc >>

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Uwe Bartels
Hi Brad, yes. that's the question in the source code in freelist.c there is something that I don't understand. This is the first try to get a free page. The second try scans used buffers. What makes me wonder is the why postgres is checking for > where usage_count is suppose

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Nicholson, Brad (Toronto, ON, CA)
> -Original Message- > From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance- > ow...@postgresql.org] On Behalf Of t...@fuzzy.cz > Sent: Wednesday, March 23, 2011 10:42 AM > To: Uwe Bartels > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] buffercache/bgwrit

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Uwe Bartels
Hi Thomas, thanks, but there were no new informations in there for me. this article http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm I know and others on his website. Best... Uwe On 23 March 2011 15:41, wrote: > > Hi, > > > > I have very bad bgwriter statistics on a server wh

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread tv
> Hi, > > I have very bad bgwriter statistics on a server which runs since many > weeks > and it is still the same after a recent restart. > There are roughly 50% of buffers written by the backend processes and the > rest by checkpoints. > The statistics below are from a server with 140GB RAM, 32GB

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Shaun Thomas
On 03/23/2011 09:16 AM, t...@fuzzy.cz wrote: which means this is at least 8.4. Plus the 'distinct' step uses less than 1% of total time, so even if you improve it the impact will be minimal. Haha. Noted. I guess I'm still on my original crusade against DISTINCT. I was pulling it out of so muc

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread tv
> On 03/23/2011 04:17 AM, Adarsh Sharma wrote: > >> explain analyze select distinct(p.crawled_page_id) from page_content >> p where NOT EXISTS (select 1 from clause2 c where c.source_id = >> p.crawled_page_id); > > You know... I'm surprised nobody has mentioned this, but DISTINCT is > very slow unl

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Uwe Bartels
Hi Jochen, yes, I had that impression too. But it is running. ...And has almost no effect. I changed all parameter to the most aggressive, but Before I restarted the server I had a percentage of writes by the bgwriter of less that 1 percent. postgres=# select name,setting from pg_settings whe

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Shaun Thomas
On 03/23/2011 04:17 AM, Adarsh Sharma wrote: explain analyze select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id); You know... I'm surprised nobody has mentioned this, but DISTINCT is very slow unless you have

Re: [PERFORM] buffercache/bgwriter

2011-03-23 Thread Jochen Erwied
Wednesday, March 23, 2011, 1:51:31 PM you wrote: [rearranged for quoting] > background writer stats > checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | > maxwritten_clean | buffers_backend | buffers_alloc > ---+-++-

Re: [PERFORM] good old VACUUM FULL

2011-03-23 Thread Shaun Thomas
On 03/23/2011 01:16 AM, Scott Marlowe wrote: Then either cluster failed (did you get an error message) or the table was not bloated. Given that it looks like it was greatly reduced in size by the vacuum full, I'd guess cluster failed for some reason. Or it just bloated again. Remember, he sti

[PERFORM] buffercache/bgwriter

2011-03-23 Thread Uwe Bartels
Hi, I have very bad bgwriter statistics on a server which runs since many weeks and it is still the same after a recent restart. There are roughly 50% of buffers written by the backend processes and the rest by checkpoints. The statistics below are from a server with 140GB RAM, 32GB shared_buffers

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 09:30, Adarsh Sharma ???(??): Thanks Chetan, here is the output of your updated query : *explain select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id); * QU

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread tv
> >> Actually the plans are equal, so I suppose it depends on what were >> run first :). Slow query operates with data mostly on disk, while >> fast one with data in memory. >> >> yeah. maybe the easiest way, is to start a fresh session and fire the >> queries. > > > After the fresh sta

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
Actually the plans are equal, so I suppose it depends on what were run first :). Slow query operates with data mostly on disk, while fast one with data in memory. yeah. maybe the easiest way, is to start a fresh session and fire the queries. After the fresh start , the results o

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
Vitalii Tymchyshyn wrote: 23.03.11 13:21, Adarsh Sharma ???(??): Thank U all, for U'r Nice Support. Let me Conclude the results, below results are obtained after finding the needed queries : *First Option : *pdc_uima=# explain analyze select distinct(p.crawled_page_id) from page_conten

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Chetan Suttraway
On Wed, Mar 23, 2011 at 4:51 PM, Vitalii Tymchyshyn wrote: > 23.03.11 13:21, Adarsh Sharma написав(ла): > > Thank U all, for U'r Nice Support. > > Let me Conclude the results, below results are obtained after finding the > needed queries : > > *First Option : > > *pdc_uima=# explain analyze selec

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 13:21, Adarsh Sharma ???(??): Thank U all, for U'r Nice Support. Let me Conclude the results, below results are obtained after finding the needed queries : *First Option : *pdc_uima=# explain analyze select distinct(p.crawled_page_id) pdc_uima-# from page_content p left join cla

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
Thank U all, for U'r Nice Support. Let me Conclude the results, below results are obtained after finding the needed queries : *First Option : *pdc_uima=# explain analyze select distinct(p.crawled_page_id) pdc_uima-# from page_content p left join clause2 c on (p.crawled_page_id = pdc_uima(# c.

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Chetan Suttraway
On Wed, Mar 23, 2011 at 4:08 PM, wrote: > > I just want to retrieve that id 's from page_content which do not have > > any entry in clause2 table. > > In that case the query probably does not work (at least the query you've > sent in the first post) as it will return even those IDs that have at >

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Chetan Suttraway
On Wed, Mar 23, 2011 at 3:49 PM, Adarsh Sharma wrote: > Vitalii Tymchyshyn wrote: > > 23.03.11 12:10, Adarsh Sharma написав(ла): > > I just want to retrieve that id 's from page_content which do not have any > entry in clause2 table. > > Then > select distinct(p.crawled_page_id) from page_conten

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread tv
> I just want to retrieve that id 's from page_content which do not have > any entry in clause2 table. In that case the query probably does not work (at least the query you've sent in the first post) as it will return even those IDs that have at least one other row in 'clause2' (not matching the !

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 12:19, Adarsh Sharma ???(??): Vitalii Tymchyshyn wrote: 23.03.11 12:10, Adarsh Sharma ???(??): I just want to retrieve that id 's from page_content which do not have any entry in clause2 table. Then select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (se

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
Vitalii Tymchyshyn wrote: 23.03.11 12:10, Adarsh Sharma ???(??): I just want to retrieve that id 's from page_content which do not have any entry in clause2 table. Then select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id =

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 12:10, Adarsh Sharma ???(??): I just want to retrieve that id 's from page_content which do not have any entry in clause2 table. Then select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id); is cor

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
I just want to retrieve that id 's from page_content which do not have any entry in clause2 table. Thanks , Adarsh Vitalii Tymchyshyn wrote: 23.03.11 11:17, Adarsh Sharma ???(??): I think it is very much faster but I don't understand the query : *explain select distinct(b) from t1,t2 wh

Re: [PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 11:17, Adarsh Sharma ???(??): I think it is very much faster but I don't understand the query : *explain select distinct(b) from t1,t2 where t1.b >t2.d union all select distinct(b) from t1,t2 where t1.b * I don't understand it too. What are you trying to get? Is it select dist

[PERFORM] Re-Reason of Slowness of Query

2011-03-23 Thread Adarsh Sharma
Thanks Chetan, After my Lunch Break, I tried the below steps : *My original query was : *explain analyze select distinct(p.crawled_page_id) from page_content p , clause2 c where p.crawled_page_id != c.source_id which hangs because it is wrong query to fetch the desired output . *Next Updated

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Adarsh Sharma
Thanks Chetan, here is the output of your updated query : *explain select distinct(p.crawled_page_id) from page_content p where NOT EXISTS (select 1 from clause2 c where c.source_id = p.crawled_page_id); * QUERY PLAN

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Chetan Suttraway
On Wed, Mar 23, 2011 at 12:50 PM, Samuel Gendler wrote: > On Tue, Mar 22, 2011 at 11:28 PM, Adarsh Sharma > wrote: > >> * >> *I perform a join query on it as : >> >> * explain analyze select distinct(p.crawled_page_id) from page_content p >> , clause2 c where p.crawled_page_id != c.source_id ; >

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Vitalii Tymchyshyn
23.03.11 08:28, Adarsh Sharma ???(??): * *I perform a join query on it as : * explain analyze select distinct(p.crawled_page_id) from page_content p , clause2 c where p.crawled_page_id != c.source_id ;* Your query is wrong. This query will return every *crawled_page_id* if clause2 has mor

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Samuel Gendler
On Tue, Mar 22, 2011 at 11:28 PM, Adarsh Sharma wrote: > * > *I perform a join query on it as : > > * explain analyze select distinct(p.crawled_page_id) from page_content p , > clause2 c where p.crawled_page_id != c.source_id ; > > *What it takes more than 1 hour to complete. As I issue the expla

Re: [PERFORM] Reason of Slowness of query

2011-03-23 Thread Chetan Suttraway
On Wed, Mar 23, 2011 at 11:58 AM, Adarsh Sharma wrote: > Dear all, > > I have 2 tables in my database name clause2( 4900 MB) & page_content(1582 > MB). > > My table definations are as : > > *page_content :- > > *CREATE TABLE page_content > ( > content_id integer, > wkb_geometry geometry, >