Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Linux Guru
See, it's calculating the sum by grouping on the product field. Here is an example:

Product  GP
-------  ---
A        30
B        40
A        30
C        50
C        50

Now the query calculates the aggregated sum grouped by product, so all A's
must have the same result, and likewise the B's and C's.

> Is this supposed to be updating every single row with one value?
> Cause I'm guessing it's running that sub select over and over instead
> of one time.

Yes, you are right that it's recalculating for every row in each group,
i.e. GP(A) is calculated twice for A, where it should only be calculated
once per group. Is there any way to achieve this?
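
Something along these lines is what I am after, a rough sketch (untested; it uses PostgreSQL's UPDATE ... FROM extension to compute each group's value once and join it back):

update dummy
   set gp = agg.s
  from (select product,
               case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end as s
          from dummy
         group by product) as agg
 where dummy.product = agg.product;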

> analyze;
> set work_mem = 128000;
> between the alter and update and see if that helps.

That did not help.


> Also, as Tom said, post explain analyze output of the statement.


Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
  SubPlan
    ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
          ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
                Filter: ((product)::text = ($0)::text)
Total runtime: 578968.885 ms

Thanks

On Feb 12, 2008 2:29 AM, Scott Marlowe [EMAIL PROTECTED] wrote:

> On Feb 11, 2008 5:06 AM, Linux Guru [EMAIL PROTECTED] wrote:
>> We have a large data warehouse stored in postgres and temp tables are
>> created based on user queries. The process of temp table creation
>> involves selecting data from the main fact table; this includes several
>> select and update statements, and one of the following update
>> statements is having performance issues.
>>
>> The newly created temp table for this scenario contains 22712 rows.
>> Here is the query:
>>
>> alter table dummy add column gp numeric(40,15);
>> update dummy set gp=(select (
>>   case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end)
>>   from dummy as temp
>>   where temp.product=dummy.product)

> Is this supposed to be updating every single row with one value?
> Cause I'm guessing it's running that sub select over and over instead
> of one time.  I'm guessing that with more work_mem the planner might
> use a more efficient plan.  Try adding
>
> analyze;
> set work_mem = 128000;
>
> between the alter and update and see if that helps.
>
> Also, as Tom said, post explain analyze output of the statement.


 
>> Now this query basically updates a table using values within itself in
>> the subquery, but it takes too much time, i.e. approx 5 mins. The whole
>> temp table creation process is stuck in this query (there are 4
>> additional such updates with the same problem). Index creation is
>> useless here since it's only a one-time process.
>>
>> Here is the stripped-down version (the part causing the performance
>> issue) of the above query, i.e. only the select statement:
>> ---
>> select (case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100
>> else 0 end) from dummy as temp, dummy as temp2
>> where temp.product=temp2.product group by temp.product
>>
>> HashAggregate  (cost=1652480.98..1652481.96 rows=39 width=39)
>>   ->  Hash Join  (cost=1636.07..939023.13 rows=71345785 width=39)
>>         Hash Cond: ((temp.product)::text = (temp2.product)::text)
>>         ->  Seq Scan on dummy temp  (cost=0.00..1311.03 rows=26003 width=39)
>>         ->  Hash  (cost=1311.03..1311.03 rows=26003 width=21)
>>               ->  Seq Scan on dummy temp2  (cost=0.00..1311.03 rows=26003 width=21)
>> ---
>>
>> What's the solution to this problem, or any alternate way to write this
>> query?



Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Linux Guru
Analyzing did not help; here is the output of EXPLAIN ANALYZE of the update query:

Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
  SubPlan
    ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
          ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
                Filter: ((product)::text = ($0)::text)
Total runtime: 578968.885 ms


On Feb 11, 2008 9:59 PM, Tom Lane [EMAIL PROTECTED] wrote:

> Linux Guru [EMAIL PROTECTED] writes:
>> We have a large data warehouse stored in postgres and temp tables are
>> created based on user queries. The process of temp table creation
>> involves selecting data from the main fact table; this includes several
>> select and update statements, and one of the following update
>> statements is having performance issues.
>
> Try ANALYZEing the temp table before the step that's too slow.
>
> If that doesn't help, let's see EXPLAIN ANALYZE (not just EXPLAIN)
> output.
>
> regards, tom lane



Re: [PERFORM] Join Query Performance Issue

2008-02-12 Thread Thomas Zaksek

Scott Marlowe wrote:

> On Feb 11, 2008 12:08 PM, Thomas Zaksek [EMAIL PROTECTED] wrote:
>>
>> I have serious performance problems with the following type of queries:
>>
>> explain analyse SELECT '12.11.2007 18:04:00 UTC' AS zeit,
>>        'M' AS datatyp,
>>        p.zs_nr AS zs_de,
>>        j_ges,
>>        de_mw_abh_j_lkw(mw_abh) AS j_lkw,
>>        de_mw_abh_v_pkw(mw_abh) AS v_pkw,
>>        de_mw_abh_v_lkw(mw_abh) AS v_lkw,
>>        de_mw_abh_p_bel(mw_abh) AS p_bel
>> FROM messpunkt p, messungen_v_dat_2007_11_12 m, de_mw w
>> WHERE m.ganglinientyp = 'M'
>>   AND 381 = m.minute_tag
>>   AND (p.nr, p.mw_nr) = (m.messpunkt, w.nr);
>>
>> Explain analyse returns
>>
>> Nested Loop  (cost=0.00..50389.39 rows=3009 width=10) (actual time=0.503..320.872 rows=2189 loops=1)
>>   ->  Nested Loop  (cost=0.00..30668.61 rows=3009 width=8) (actual time=0.254..94.116 rows=2189 loops=1)


> This nested loop is using up most of your time.  Try increasing
> work_mem and see if it chooses a better join plan, and/or turn off
> nested loops for a moment and see if that helps:
>
> set enable_nestloop = off
>
> Note that set enable_xxx = off is kind of a hammer-to-the-forebrain
> setting.  It's not subtle, and the planner can't work around it.  So use
> these settings with caution.  That said, I had one reporting query that
> simply wouldn't run fast without turning off nested loops for that one.
> But don't turn off nested loops universally; they are still a good
> choice for smaller amounts of data.
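
For a single problem query, the setting can also be scoped to one transaction so the rest of the session keeps nested loops available, a minimal sketch (SET LOCAL reverts automatically at commit or rollback):

begin;
set local enable_nestloop = off;
explain analyse SELECT ... ;  -- the problem query from above
commit;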
  

I tried turning off nestloop, but with terrible results:

Hash Join  (cost=208328.61..228555.14 rows=3050 width=10) (actual time=33421.071..40362.136 rows=2920 loops=1)
  Hash Cond: (w.nr = p.mw_nr)
  ->  Seq Scan on de_mw w  (cost=0.00..14593.79 rows=891479 width=10) (actual time=0.012..3379.971 rows=891479 loops=1)
  ->  Hash  (cost=208290.49..208290.49 rows=3050 width=8) (actual time=33420.877..33420.877 rows=2920 loops=1)
        ->  Merge Join  (cost=5303.71..208290.49 rows=3050 width=8) (actual time=31.550..33407.688 rows=2920 loops=1)
              Merge Cond: (p.nr = m.messpunkt)
              ->  Index Scan using messpunkt_nr_idx on messpunkt p  (cost=0.00..238879.39 rows=6306026 width=12) (actual time=0.056..17209.317 rows=4339470 loops=1)
              ->  Sort  (cost=5303.71..5311.34 rows=3050 width=4) (actual time=25.973..36.858 rows=2920 loops=1)
                    Sort Key: m.messpunkt
                    ->  Index Scan using messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on messungen_v_dat_2007_11_12 m  (cost=0.00..5127.20 rows=3050 width=4) (actual time=0.124..12.822 rows=2920 loops=1)
                          Index Cond: ((ganglinientyp = 'M'::bpchar) AND (651 = minute_tag))

Total runtime: 40373.512 ms
(12 rows)
Looks crappy, doesn't it?

I also tried to increase work_mem; the config is now

work_mem = 4MB
maintenance_work_mem = 128MB

but in regard to performance it wasn't effective at all.

PostgreSQL runs on an HP server with dual Opterons and 3GB of RAM; what
are good settings here? The database will have to work with tables of
several tens of millions of rows, but only a few columns each. No more
than maybe ~5 clients access the database at the same time.
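
As rough starting points for a box like that (hedged and workload-dependent; these are illustrative 8.2-style postgresql.conf values, not measured recommendations):

shared_buffers = 512MB          # somewhere around 1/8 to 1/4 of RAM
effective_cache_size = 2GB      # roughly what the OS is likely to cache
work_mem = 16MB                 # per sort/hash; can be raised per session for big reports
maintenance_work_mem = 128MB    # for vacuum and index builds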





Re: [PERFORM] Join Query Performance Issue

2008-02-12 Thread Peter Koczan
> I have serious performance problems with the following type of queries:
>
> Doesn't look too bad to me, but I'm not that deep into sql query
> optimization. However, this type of query is used in a function to
> access a normalized, partitioned database, so better performance in
> these queries would speed up the whole database system big time.
> Any suggestions here would be great. I already tested some things,
> using inner join, rearranging the order of the tables, but saw only
> minor changes in the runtime; the version above seemed to get us the
> best performance.

Can you send the table definitions of the tables involved in the
query, including index information? Maybe if we look hard enough we
can find something.

Peter



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Gregory Stark
Stephen Denne [EMAIL PROTECTED] writes:

> Pallav Kalva asked
>> ...
>> and listing0_.fklistingsourceid=5525
>> ...
>> order by listing0_.entrydate desc limit 10;
>>
>> ->  Index Scan Backward using idx_listing_entrydate on listing listing0_
>>     (cost=0.00..781557.28 rows=5118 width=107) (actual
>>     time=2113544.412..2113544.412 rows=0 loops=1)
>>       Filter: (fklistingsourceid = 5525)
>
> Would it help to have a combined index on fklistingsourceid, entrydate?

I think that would help. You already have a ton of indexes; you might consider
whether all your queries start with a listingsourceid and whether you can have
that as a prefix on the existing index.
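
Concretely, that might look like the following (a sketch; the index name is made up):

create index idx_listing_source_entrydate
    on listing.listing (fklistingsourceid, entrydate);

With fklistingsourceid leading, a backward scan over entrydate can satisfy the "order by entrydate desc limit 10" for one source id, and can stop immediately when that source id has no rows at all.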

Another thing to try is raising the stats target on fklistingsourceid and/or
entrydate. The estimate seems pretty poor. It could just be that the
distribution is highly skewed, which is a hard case to estimate correctly.
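
For example (illustrative target values; re-run ANALYZE so the new targets take effect):

alter table listing.listing alter column fklistingsourceid set statistics 1000;
alter table listing.listing alter column entrydate set statistics 1000;
analyze listing.listing;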

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Stephen Denne
Pallav Kalva asked
> ...
> and listing0_.fklistingsourceid=5525
> ...
> order by listing0_.entrydate desc limit 10;
>
> ->  Index Scan Backward using idx_listing_entrydate on listing listing0_
>     (cost=0.00..781557.28 rows=5118 width=107) (actual
>     time=2113544.412..2113544.412 rows=0 loops=1)
>       Filter: (fklistingsourceid = 5525)

Would it help to have a combined index on fklistingsourceid, entrydate?

Regards,
Stephen Denne.






[PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Pallav Kalva

Hi,

  I am using Postgres 8.2.4. We have to regularly run some queries on
some big tables to see if we have any data for a particular request, but
sometimes we might not have any matching rows, as in this case: when it
can't find any matching rows it pretty much scans the whole table, and
that takes too long to execute.


 As you can see from the explain analyze output, the response time is
horrible. Is there anything I can do to improve these queries?


 Tables are autovacuumed regularly.


  select relname,relpages,reltuples from pg_class where relname in
('listing','listingstatus','listedaddress');

    relname    | relpages |  reltuples
---------------+----------+-------------
 listing       |   132725 | 9.22896e+06
 listingstatus |        1 |           6
 listedaddress |    63459 | 8.15774e+06
(3 rows)

helix_fdc=# select relname,last_autovacuum,last_autoanalyze from
pg_stat_user_tables where relname in ('listing','listedaddress');

    relname    |        last_autovacuum        |       last_autoanalyze
---------------+-------------------------------+-------------------------------
 listing       | 2008-02-12 10:57:54.690913-05 | 2008-02-12 10:57:54.690913-05
 listedaddress | 2008-02-09 14:12:44.038341-05 | 2008-02-12 11:17:47.822597-05
(2 rows)

Explain Analyze Output


explain analyze
select listing0_.listingid as listingid157_, listing0_.entrydate as 
entrydate157_, listing0_.lastupdate as lastupdate157_,
  listing0_.sourcereference as sourcere4_157_, listing0_.start as 
start157_, listing0_.stop as stop157_,
  listing0_.price as price157_, listing0_.updateHashcode as 
updateHa8_157_, listing0_.fklistedaddressid as fklisted9_157_,
  listing0_.fklistingsubtypeid as fklisti10_157_, 
listing0_.fkbestaddressid as fkbesta11_157_,
  listing0_.fklistingsourceid as fklisti12_157_, 
listing0_.fklistingtypeid as fklisti13_157_,
  listing0_.fklistingstatusid as fklisti14_157_, 
listing0_.fkpropertytypeid as fkprope15_157_
from listing.listing listing0_, listing.listingstatus listingsta1_, 
listing.listedaddress listedaddr2_

where listing0_.fklistingstatusid=listingsta1_.listingstatusid
and listing0_.fklistedaddressid=listedaddr2_.listedaddressid
and listing0_.fklistingsourceid=5525
and listingsta1_.shortname='active'
and (listedaddr2_.fkverifiedaddressid is not null)
order by listing0_.entrydate desc limit 10;
   
QUERY PLAN
--
 Limit  (cost=0.00..11191.64 rows=10 width=107) (actual time=2113544.437..2113544.437 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..790129.94 rows=706 width=107) (actual time=2113544.427..2113544.427 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..783015.53 rows=853 width=107) (actual time=2113544.420..2113544.420 rows=0 loops=1)
               ->  Index Scan Backward using idx_listing_entrydate on listing listing0_  (cost=0.00..781557.28 rows=5118 width=107) (actual time=2113544.412..2113544.412 rows=0 loops=1)
                     Filter: (fklistingsourceid = 5525)
               ->  Index Scan using pk_listingstatus_listingstatusid on listingstatus listingsta1_  (cost=0.00..0.27 rows=1 width=4) (never executed)
                     Index Cond: (listing0_.fklistingstatusid = listingsta1_.listingstatusid)
                     Filter: (shortname = 'active'::text)
         ->  Index Scan using pk_listedaddress_listedaddressid on listedaddress listedaddr2_  (cost=0.00..8.33 rows=1 width=4) (never executed)
               Index Cond: (listing0_.fklistedaddressid = listedaddr2_.listedaddressid)
               Filter: (fkverifiedaddressid IS NOT NULL)
 Total runtime: 2113544.580 ms
(12 rows)


Table Definitions


\d listing.listing
                              Table "listing.listing"
      Column       |            Type             |                   Modifiers
-------------------+-----------------------------+-------------------------------------------------
 listingid         | integer                     | not null default nextval(('listing.listingseq'::text)::regclass)
 fklistingsourceid | integer                     | not null
 fklistingtypeid   | integer                     | not null
 entrydate         | timestamp without time zone | not null
 lastupdate        | timestamp without time zone | not null
 fklistedaddressid | integer                     |
 fkbestaddressid   | integer                     |
 sourcereference   | text                        |
 fkpropertytypeid  | integer                     | not null
 fklistingstatusid | integer                     | not null
 start             | timestamp without time zone | not null
 stop              | timestamp 

Re: [PERFORM] Dell Perc/6

2008-02-12 Thread Greg Smith

On Tue, 12 Feb 2008, Craig James wrote:

> Does anyone have performance info about the new Dell Perc/6 controllers?
> I found a long discussion (Dell vs HP) about the Perc/5, but nothing
> about Perc/6.  What's under the covers?


The Perc/6i has an LSI Logic MegaRAID SAS 1078 chipset under the hood.  I
know the Linux drivers for the card seemed to stabilize around October;
there's a good-sized list of compatible distributions on LSI's site.
FreeBSD support has some limitations but basically works.  I haven't seen
any benchmarks for the current version of the card yet.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Dean Gibson (DB Administrator)

On 2008-02-12 13:35, Pallav Kalva wrote:

> Hi,
>
> ...
> Table Definitions
>
> \d listing.listingstatus
>                        Table "listing.listingstatus"
>      Column      |            Type             |              Modifiers
> -----------------+-----------------------------+-----------------------------------------------
>  listingstatusid | integer                     | not null default nextval(('listing.listingstatusseq'::text)::regclass)
>  shortname       | text                        |
>  longname        | text                        |
>  _entrydate      | timestamp without time zone | default ('now'::text)::timestamp(6) without time zone
>
> Indexes:
>     "pk_listingstatus_listingstatusid" PRIMARY KEY, btree (listingstatusid), tablespace "indexdata"


Since you are searching by shortname, try adding an index on that,
although with that tiny a table it might not matter.
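
For instance (a sketch; the index name is made up):

create index idx_listingstatus_shortname on listing.listingstatus (shortname);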


The questions are:

1. Why is the planner scanning the entire idx_listing_entrydate, when I'd
think it should be scanning the entire pk_listingstatus_listingstatusid?
2. Why is the "Index Scan using pk_listingstatus_listingstatusid on
listingstatus listingsta1_  (cost=0.00..0.27 rows=1 width=4)" marked
(never executed)?


Note:  I'm new at this as well, and jumped in to learn as well as to help.

-- Dean

--
Mail to my list address MUST be sent via the mailing list.  All other mail will 
bounce.




[PERFORM] Dell Perc/6

2008-02-12 Thread Craig James

Does anyone have performance info about the new Dell Perc/6 controllers?  I found a long 
discussion (Dell vs HP) about the Perc/5, but nothing about Perc/6.  What's 
under the covers?

Here is the (abbreviated) info from Dell on this machine:

PowerEdge 1950 III        Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
Additional Processors     Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
Memory                    8GB 667MHz (4x2GB), Dual Ranked DIMMs
Hard Drive Configuration  Integrated SAS/SATA RAID 5, PERC 6/i Integrated

Thanks,
Craig



Re: [PERFORM] mis-estimate in nested query causes slow runtimes

2008-02-12 Thread Chris Kratz
On 2/11/08, Tom Lane [EMAIL PROTECTED] wrote:

> Chris Kratz [EMAIL PROTECTED] writes:
>> ->  Nested Loop  (cost=42.74..161.76 rows=1 width=38) (actual time=2.932..27.772 rows=20153 loops=1)
>>       ->  Hash Join  (cost=10.89..22.58 rows=1 width=24) (actual time=0.065..0.134 rows=1 loops=1)
>>             Hash Cond: (mtchsrcprj3.funding_source_id = mtchsrcprjfs3.nameid)
>>             ->  Seq Scan on project mtchsrcprj3  (cost=0.00..11.22 rows=122 width=8) (actual time=0.002..0.054 rows=122 loops=1)
>>             ->  Hash  (cost=10.83..10.83 rows=5 width=24) (actual time=0.017..0.017 rows=1 loops=1)
>>                   ->  Index Scan using name_float_lfm_idx on namemaster mtchsrcprjfs3  (cost=0.00..10.83 rows=5 width=24) (actual time=0.012..0.013 rows=1 loops=1)
>>                         Index Cond: (name_float_lfm = 'DWS'::text)
>>       ->  Bitmap Heap Scan on transaction_details idatrndtl  (cost=31.85..121.60 rows=1407 width=22) (actual time=2.864..12.060 rows=20153 loops=1)
>>             Recheck Cond: (idatrndtl.ida_trans_match_source_id = mtchsrcprj3.id)
>>             ->  Bitmap Index Scan on transaction_details_ida_trans_match_source_id  (cost=0.00..31.50 rows=1407 width=0) (actual time=2.696..2.696 rows=20153 loops=1)
>>                   Index Cond: (idatrndtl.ida_trans_match_source_id = mtchsrcprj3.id)
>>
>> The first frustration is that I can't get the transaction details scan
>> to get any more accurate.  It thinks it will find 1407 records;
>> instead it finds 20,153.  Then for whatever reason it thinks that a
>> join between 1 record and 1407 records will return 1 record.  This is
>> mainly what I can't understand.  Why does it think it will only get
>> one record in response when it's a left join?
>
> I don't see any left join there ...
>
>> PG 8.2.4 on Linux kernel 2.6.9 x64
>
> The first thing you should do is update to 8.2.6; we've fixed a fair
> number of problems since then that were fallout from the outer-join
> planning rewrite in 8.2.
>
> If it still doesn't work very well, please post the pg_stats rows for
> the join columns involved (idatrndtl.ida_trans_match_source_id and
> mtchsrcprj3.id).  (You do have up-to-date ANALYZE stats for both
> of those tables, right?)
>
> regards, tom lane



I know it's somewhat premature as we haven't had a chance to do the update
yet, but here is what I did with the statistics on the current version, for
chuckles and grins, just to see if it would make a difference in the plan.

# alter table project alter column id set statistics 1000;
ALTER TABLE
# analyze project;
ANALYZE
# alter table transaction_details alter column ida_trans_match_source_id set
statistics 1000;
ALTER TABLE
# analyze transaction_details;
ANALYZE
# select * from pg_stats where (tablename='project' and attname='id') or
(tablename='transaction_details' and attname='ida_trans_match_source_id');
 schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+---------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
 public     | project   | id      |         0 |         4 |         -1 |                  |                   |                  |
Re: [PERFORM] mis-estimate in nested query causes slow runtimes

2008-02-12 Thread Chris Kratz
On 2/11/08, Tom Lane [EMAIL PROTECTED] wrote:

> Chris Kratz [EMAIL PROTECTED] writes:
> [snip: same plan and questions as quoted in the previous message]
>
> The first thing you should do is update to 8.2.6; we've fixed a fair
> number of problems since then that were fallout from the outer-join
> planning rewrite in 8.2.
>
> regards, tom lane


Thanks Tom, we will try the upgrade and see if that makes a difference.


-Chris


Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Tom Lane
Linux Guru [EMAIL PROTECTED] writes:
> Analyzing did not help; here is the output of EXPLAIN ANALYZE of the update query:
>
> Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
>   SubPlan
>     ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
>           ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
>                 Filter: ((product)::text = ($0)::text)
> Total runtime: 578968.885 ms

Yeah, that's just not going to be fast.  An index on the product column
might help a bit, but the real issue is that you're repetitively
calculating the same aggregates.  I think you need a separate temp
table, along the lines of

create temp table dummy_agg as
  select product,
         (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end) as s
  from dummy
  group by product;

create index dummy_agg_i on dummy_agg(product); -- optional

update dummy
  set gp = (select s from dummy_agg where dummy_agg.product = dummy.product);

The index would only be needed if you expect a lot of rows (lot of
different product values).

regards, tom lane



Re: [PERFORM] Questions about enabling SSL

2008-02-12 Thread fabrix peñuelas
Thanks Michael...

2008/2/11, Michael Fuhr [EMAIL PROTECTED]:

> On Mon, Feb 11, 2008 at 05:37:51PM -0700, Michael Fuhr wrote:
>> On Mon, Feb 11, 2008 at 04:58:35PM -0700, fabrix peñuelas wrote:
>>> If ssl is enabled in postgresql, does it decrease the performance of
>>> the database? How much?
>>
>> The performance impact of an encrypted connection depends on how
>> expensive the queries are and how much data they return.
>
> Another consideration is how much time you spend using each connection
> vs. how much time it takes to establish each connection.  A thousand
> simple queries over the same encrypted connection might be significantly
> faster than running each query over a separate unencrypted connection,
> which in turn will probably be significantly faster than using
> separate encrypted connections that must each carry out a relatively
> expensive key establishment.

> --
> Michael Fuhr
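
For reference, the server-side part of enabling SSL comes down to a couple of settings. A hedged sketch (8.2-era syntax; it assumes server.crt and server.key already exist in the data directory):

# postgresql.conf
ssl = on

# pg_hba.conf: require SSL for remote connections, plain TCP locally
# TYPE     DATABASE  USER  CIDR-ADDRESS    METHOD
hostssl    all       all   0.0.0.0/0       md5
host       all       all   127.0.0.1/32    md5

Given the key-establishment cost described above, reusing or pooling those encrypted connections is what keeps the overhead down.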