Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Linux Guru
See, it's calculating the sum by grouping on the product field. Here is an example:

Product  GP
-------  ---
A        30
B        40
A        30
C        50
C        50

Now the query calculates the aggregated sum grouped by product, so all A's
must have the same result, and likewise the B's and C's.

> Is this supposed to be updating every single row with one value?
> Cause I'm guessing it's running that sub select over and over instead
> of one time.

Yes, you are right that it's recalculating for every row in each group,
i.e. GP(A) is calculated twice for A, where it should only be calculated
once per group. Is there any way to achieve this?
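
Something along these lines is what I am after, a rough sketch (untested; it uses PostgreSQL's UPDATE ... FROM extension to compute each group's value once and join it back):

update dummy
   set gp = agg.s
  from (select product,
               case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end as s
          from dummy
         group by product) as agg
 where dummy.product = agg.product;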

> analyze;
> set work_mem = 128000;
> between the alter and update and see if that helps.

That did not help.


> Also, as Tom said, post explain analyze output of the statement.


Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
  SubPlan
    ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
          ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
                Filter: ((product)::text = ($0)::text)
Total runtime: 578968.885 ms

Thanks

On Feb 12, 2008 2:29 AM, Scott Marlowe [EMAIL PROTECTED] wrote:

> On Feb 11, 2008 5:06 AM, Linux Guru [EMAIL PROTECTED] wrote:
>> We have a large data warehouse stored in postgres and temp tables are
>> created based on user queries. The process of temp table creation
>> involves selecting data from the main fact table; this includes several
>> select and update statements, and one of the following update
>> statements is having performance issues.
>>
>> The newly created temp table for this scenario contains 22712 rows.
>> Here is the query:
>>
>> alter table dummy add column gp numeric(40,15);
>> update dummy set gp=(select (
>>   case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end)
>>   from dummy as temp
>>   where temp.product=dummy.product)

> Is this supposed to be updating every single row with one value?
> Cause I'm guessing it's running that sub select over and over instead
> of one time.  I'm guessing that with more work_mem the planner might
> use a more efficient plan.  Try adding
>
> analyze;
> set work_mem = 128000;
>
> between the alter and update and see if that helps.
>
> Also, as Tom said, post explain analyze output of the statement.


 
>> Now this query basically updates a table using values within itself in
>> the subquery, but it takes too much time, i.e. approx 5 mins. The whole
>> temp table creation process is stuck in this query (there are 4
>> additional such updates with the same problem). Index creation is
>> useless here since it's only a one-time process.
>>
>> Here is the stripped-down version (the part causing the performance
>> issue) of the above query, i.e. only the select statement:
>> ---
>> select (case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100
>> else 0 end) from dummy as temp, dummy as temp2
>> where temp.product=temp2.product group by temp.product
>>
>> HashAggregate  (cost=1652480.98..1652481.96 rows=39 width=39)
>>   ->  Hash Join  (cost=1636.07..939023.13 rows=71345785 width=39)
>>         Hash Cond: ((temp.product)::text = (temp2.product)::text)
>>         ->  Seq Scan on dummy temp  (cost=0.00..1311.03 rows=26003 width=39)
>>         ->  Hash  (cost=1311.03..1311.03 rows=26003 width=21)
>>               ->  Seq Scan on dummy temp2  (cost=0.00..1311.03 rows=26003 width=21)
>> ---
>>
>> What's the solution to this problem, or any alternate way to write this
>> query?



Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Linux Guru
Analyzing did not help; here is the output of EXPLAIN ANALYZE of the update query:

Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
  SubPlan
    ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
          ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
                Filter: ((product)::text = ($0)::text)
Total runtime: 578968.885 ms


On Feb 11, 2008 9:59 PM, Tom Lane [EMAIL PROTECTED] wrote:

> Linux Guru [EMAIL PROTECTED] writes:
>> We have a large data warehouse stored in postgres and temp tables are
>> created based on user queries. The process of temp table creation
>> involves selecting data from the main fact table; this includes several
>> select and update statements, and one of the following update
>> statements is having performance issues.
>
> Try ANALYZEing the temp table before the step that's too slow.
>
> If that doesn't help, let's see EXPLAIN ANALYZE (not just EXPLAIN)
> output.
>
> regards, tom lane



Re: [PERFORM] Join Query Performance Issue

2008-02-12 Thread Thomas Zaksek

Scott Marlowe wrote:

> On Feb 11, 2008 12:08 PM, Thomas Zaksek [EMAIL PROTECTED] wrote:
>>
>> I have serious performance problems with the following type of queries:
>>
>> explain analyse SELECT '12.11.2007 18:04:00 UTC' AS zeit,
>>        'M' AS datatyp,
>>        p.zs_nr AS zs_de,
>>        j_ges,
>>        de_mw_abh_j_lkw(mw_abh) AS j_lkw,
>>        de_mw_abh_v_pkw(mw_abh) AS v_pkw,
>>        de_mw_abh_v_lkw(mw_abh) AS v_lkw,
>>        de_mw_abh_p_bel(mw_abh) AS p_bel
>> FROM messpunkt p, messungen_v_dat_2007_11_12 m, de_mw w
>> WHERE m.ganglinientyp = 'M'
>>   AND 381 = m.minute_tag
>>   AND (p.nr, p.mw_nr) = (m.messpunkt, w.nr);
>>
>> Explain analyse returns
>>
>> Nested Loop  (cost=0.00..50389.39 rows=3009 width=10) (actual time=0.503..320.872 rows=2189 loops=1)
>>   ->  Nested Loop  (cost=0.00..30668.61 rows=3009 width=8) (actual time=0.254..94.116 rows=2189 loops=1)


> This nested loop is using up most of your time.  Try increasing
> work_mem and see if it chooses a better join plan, and/or turn off
> nested loops for a moment and see if that helps:
>
> set enable_nestloop = off
>
> Note that set enable_xxx = off is kind of a hammer-to-the-forebrain
> setting.  It's not subtle, and the planner can't work around it.  So use
> these settings with caution.  That said, I had one reporting query that
> simply wouldn't run fast without turning off nested loops for that one.
> But don't turn off nested loops universally; they are still a good
> choice for smaller amounts of data.
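
For a single problem query, the setting can also be scoped to one transaction so the rest of the session keeps nested loops available, a minimal sketch (SET LOCAL reverts automatically at commit or rollback):

begin;
set local enable_nestloop = off;
explain analyse SELECT ... ;  -- the problem query from above
commit;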
  

I tried turning off nestloop, but with terrible results:

Hash Join  (cost=208328.61..228555.14 rows=3050 width=10) (actual time=33421.071..40362.136 rows=2920 loops=1)
  Hash Cond: (w.nr = p.mw_nr)
  ->  Seq Scan on de_mw w  (cost=0.00..14593.79 rows=891479 width=10) (actual time=0.012..3379.971 rows=891479 loops=1)
  ->  Hash  (cost=208290.49..208290.49 rows=3050 width=8) (actual time=33420.877..33420.877 rows=2920 loops=1)
        ->  Merge Join  (cost=5303.71..208290.49 rows=3050 width=8) (actual time=31.550..33407.688 rows=2920 loops=1)
              Merge Cond: (p.nr = m.messpunkt)
              ->  Index Scan using messpunkt_nr_idx on messpunkt p  (cost=0.00..238879.39 rows=6306026 width=12) (actual time=0.056..17209.317 rows=4339470 loops=1)
              ->  Sort  (cost=5303.71..5311.34 rows=3050 width=4) (actual time=25.973..36.858 rows=2920 loops=1)
                    Sort Key: m.messpunkt
                    ->  Index Scan using messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on messungen_v_dat_2007_11_12 m  (cost=0.00..5127.20 rows=3050 width=4) (actual time=0.124..12.822 rows=2920 loops=1)
                          Index Cond: ((ganglinientyp = 'M'::bpchar) AND (651 = minute_tag))

Total runtime: 40373.512 ms
(12 rows)
Looks crappy, doesn't it?

I also tried to increase work_mem; the config is now

work_mem = 4MB
maintenance_work_mem = 128MB

but in regard to performance it wasn't effective at all.

PostgreSQL runs on an HP server with dual Opterons and 3GB of RAM; what
are good settings here? The database will have to work with tables of
several tens of millions of rows, but only a few columns each. No more
than maybe ~5 clients access the database at the same time.
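
As rough starting points for a box like that (hedged and workload-dependent; these are illustrative 8.2-style postgresql.conf values, not measured recommendations):

shared_buffers = 512MB          # somewhere around 1/8 to 1/4 of RAM
effective_cache_size = 2GB      # roughly what the OS is likely to cache
work_mem = 16MB                 # per sort/hash; can be raised per session for big reports
maintenance_work_mem = 128MB    # for vacuum and index builds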





Re: [PERFORM] Join Query Performance Issue

2008-02-12 Thread Peter Koczan
> I have serious performance problems with the following type of queries:
>
> Doesn't look too bad to me, but I'm not that deep into sql query
> optimization. However, this type of query is used in a function to
> access a normalized, partitioned database, so better performance in
> these queries would speed up the whole database system big time.
> Any suggestions here would be great. I already tested some things,
> using inner join, rearranging the order of the tables, but saw only
> minor changes in the runtime; the version above seemed to get us the
> best performance.

Can you send the table definitions of the tables involved in the
query, including index information? Maybe if we look hard enough we
can find something.

Peter



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Gregory Stark
Stephen Denne [EMAIL PROTECTED] writes:

> Pallav Kalva asked
>> ...
>> and listing0_.fklistingsourceid=5525
>> ...
>> order by listing0_.entrydate desc limit 10;
>>
>> ->  Index Scan Backward using idx_listing_entrydate on listing listing0_
>>     (cost=0.00..781557.28 rows=5118 width=107) (actual
>>     time=2113544.412..2113544.412 rows=0 loops=1)
>>       Filter: (fklistingsourceid = 5525)
>
> Would it help to have a combined index on fklistingsourceid, entrydate?

I think that would help. You already have a ton of indexes; you might consider
whether all your queries start with a listingsourceid and whether you can have
that as a prefix on the existing index.
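
Concretely, that might look like the following (a sketch; the index name is made up):

create index idx_listing_source_entrydate
    on listing.listing (fklistingsourceid, entrydate);

With fklistingsourceid leading, a backward scan over entrydate can satisfy the "order by entrydate desc limit 10" for one source id, and can stop immediately when that source id has no rows at all.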

Another thing to try is raising the stats target on fklistingsourceid and/or
entrydate. The estimate seems pretty poor. It could just be that the
distribution is highly skewed, which is a hard case to estimate correctly.
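
For example (illustrative target values; re-run ANALYZE so the new targets take effect):

alter table listing.listing alter column fklistingsourceid set statistics 1000;
alter table listing.listing alter column entrydate set statistics 1000;
analyze listing.listing;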

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Stephen Denne
Pallav Kalva asked
> ...
> and listing0_.fklistingsourceid=5525
> ...
> order by listing0_.entrydate desc limit 10;
>
> ->  Index Scan Backward using idx_listing_entrydate on listing listing0_
>     (cost=0.00..781557.28 rows=5118 width=107) (actual
>     time=2113544.412..2113544.412 rows=0 loops=1)
>       Filter: (fklistingsourceid = 5525)

Would it help to have a combined index on fklistingsourceid, entrydate?

Regards,
Stephen Denne.






[PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Pallav Kalva

Hi,

  I am using Postgres 8.2.4. We have to regularly run some queries on
some big tables to see if we have any data for a particular request, but
sometimes we might not have any matching rows, as in this case: when it
can't find any matching rows it pretty much scans the whole table, and
that takes too long to execute.


 As you can see from the explain analyze output, the response time is
horrible. Is there anything I can do to improve these queries?


 Tables are autovacuumed regularly.


  select relname,relpages,reltuples from pg_class where relname in
('listing','listingstatus','listedaddress');

    relname    | relpages |  reltuples
---------------+----------+-------------
 listing       |   132725 | 9.22896e+06
 listingstatus |        1 |           6
 listedaddress |    63459 | 8.15774e+06
(3 rows)

helix_fdc=# select relname,last_autovacuum,last_autoanalyze from
pg_stat_user_tables where relname in ('listing','listedaddress');

    relname    |        last_autovacuum        |       last_autoanalyze
---------------+-------------------------------+-------------------------------
 listing       | 2008-02-12 10:57:54.690913-05 | 2008-02-12 10:57:54.690913-05
 listedaddress | 2008-02-09 14:12:44.038341-05 | 2008-02-12 11:17:47.822597-05
(2 rows)

Explain Analyze Output


explain analyze
select listing0_.listingid as listingid157_, listing0_.entrydate as 
entrydate157_, listing0_.lastupdate as lastupdate157_,
  listing0_.sourcereference as sourcere4_157_, listing0_.start as 
start157_, listing0_.stop as stop157_,
  listing0_.price as price157_, listing0_.updateHashcode as 
updateHa8_157_, listing0_.fklistedaddressid as fklisted9_157_,
  listing0_.fklistingsubtypeid as fklisti10_157_, 
listing0_.fkbestaddressid as fkbesta11_157_,
  listing0_.fklistingsourceid as fklisti12_157_, 
listing0_.fklistingtypeid as fklisti13_157_,
  listing0_.fklistingstatusid as fklisti14_157_, 
listing0_.fkpropertytypeid as fkprope15_157_
from listing.listing listing0_, listing.listingstatus listingsta1_, 
listing.listedaddress listedaddr2_

where listing0_.fklistingstatusid=listingsta1_.listingstatusid
and listing0_.fklistedaddressid=listedaddr2_.listedaddressid
and listing0_.fklistingsourceid=5525
and listingsta1_.shortname='active'
and (listedaddr2_.fkverifiedaddressid is not null)
order by listing0_.entrydate desc limit 10;
   
QUERY PLAN
--
 Limit  (cost=0.00..11191.64 rows=10 width=107) (actual time=2113544.437..2113544.437 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..790129.94 rows=706 width=107) (actual time=2113544.427..2113544.427 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..783015.53 rows=853 width=107) (actual time=2113544.420..2113544.420 rows=0 loops=1)
               ->  Index Scan Backward using idx_listing_entrydate on listing listing0_  (cost=0.00..781557.28 rows=5118 width=107) (actual time=2113544.412..2113544.412 rows=0 loops=1)
                     Filter: (fklistingsourceid = 5525)
               ->  Index Scan using pk_listingstatus_listingstatusid on listingstatus listingsta1_  (cost=0.00..0.27 rows=1 width=4) (never executed)
                     Index Cond: (listing0_.fklistingstatusid = listingsta1_.listingstatusid)
                     Filter: (shortname = 'active'::text)
         ->  Index Scan using pk_listedaddress_listedaddressid on listedaddress listedaddr2_  (cost=0.00..8.33 rows=1 width=4) (never executed)
               Index Cond: (listing0_.fklistedaddressid = listedaddr2_.listedaddressid)
               Filter: (fkverifiedaddressid IS NOT NULL)
 Total runtime: 2113544.580 ms
(12 rows)


Table Definitions


\d listing.listing
                              Table "listing.listing"
      Column       |            Type             |                   Modifiers
-------------------+-----------------------------+-------------------------------------------------
 listingid         | integer                     | not null default nextval(('listing.listingseq'::text)::regclass)
 fklistingsourceid | integer                     | not null
 fklistingtypeid   | integer                     | not null
 entrydate         | timestamp without time zone | not null
 lastupdate        | timestamp without time zone | not null
 fklistedaddressid | integer                     |
 fkbestaddressid   | integer                     |
 sourcereference   | text                        |
 fkpropertytypeid  | integer                     | not null
 fklistingstatusid | integer                     | not null
 start             | timestamp without time zone | not null
 stop              | timestamp 

Re: [PERFORM] Dell Perc/6

2008-02-12 Thread Greg Smith

On Tue, 12 Feb 2008, Craig James wrote:

> Does anyone have performance info about the new Dell Perc/6 controllers?
> I found a long discussion (Dell vs HP) about the Perc/5, but nothing
> about Perc/6.  What's under the covers?


The Perc/6i has an LSI Logic MegaRAID SAS 1078 chipset under the hood.  I
know the Linux drivers for the card seemed to stabilize around October;
there's a good-sized list of compatible distributions on LSI's site.
FreeBSD support has some limitations but basically works.  I haven't seen
any benchmarks for the current version of the card yet.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Optimizing No matching record Queries

2008-02-12 Thread Dean Gibson (DB Administrator)

On 2008-02-12 13:35, Pallav Kalva wrote:

> Hi,
>
> ...
> Table Definitions
>
> \d listing.listingstatus
>                        Table "listing.listingstatus"
>      Column      |            Type             |              Modifiers
> -----------------+-----------------------------+-----------------------------------------------
>  listingstatusid | integer                     | not null default nextval(('listing.listingstatusseq'::text)::regclass)
>  shortname       | text                        |
>  longname        | text                        |
>  _entrydate      | timestamp without time zone | default ('now'::text)::timestamp(6) without time zone
>
> Indexes:
>     "pk_listingstatus_listingstatusid" PRIMARY KEY, btree (listingstatusid), tablespace "indexdata"


Since you are searching by shortname, try adding an index on that,
although with that tiny a table it might not matter.
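
For instance (a sketch; the index name is made up):

create index idx_listingstatus_shortname on listing.listingstatus (shortname);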


The questions are:

1. Why is the planner scanning the entire idx_listing_entrydate, when I'd
think it should be scanning the entire pk_listingstatus_listingstatusid?
2. Why is the "Index Scan using pk_listingstatus_listingstatusid on
listingstatus listingsta1_  (cost=0.00..0.27 rows=1 width=4)" marked
(never executed)?


Note:  I'm new at this as well, and jumped in to learn as well as to help.

-- Dean

--
Mail to my list address MUST be sent via the mailing list.  All other mail will 
bounce.




[PERFORM] Dell Perc/6

2008-02-12 Thread Craig James

Does anyone have performance info about the new Dell Perc/6 controllers?  I found a long 
discussion (Dell vs HP) about the Perc/5, but nothing about Perc/6.  What's 
under the covers?

Here is the (abbreviated) info from Dell on this machine:

PowerEdge 1950 III        Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
Additional Processors     Quad Core Intel® Xeon® E5405, 2x6MB Cache, 2.0GHz, 1333MHz FSB
Memory                    8GB 667MHz (4x2GB), Dual Ranked DIMMs
Hard Drive Configuration  Integrated SAS/SATA RAID 5, PERC 6/i Integrated

Thanks,
Craig



Re: [PERFORM] mis-estimate in nested query causes slow runtimes

2008-02-12 Thread Chris Kratz
On 2/11/08, Tom Lane [EMAIL PROTECTED] wrote:

> Chris Kratz [EMAIL PROTECTED] writes:
>> ->  Nested Loop  (cost=42.74..161.76 rows=1 width=38) (actual time=2.932..27.772 rows=20153 loops=1)
>>       ->  Hash Join  (cost=10.89..22.58 rows=1 width=24) (actual time=0.065..0.134 rows=1 loops=1)
>>             Hash Cond: (mtchsrcprj3.funding_source_id = mtchsrcprjfs3.nameid)
>>             ->  Seq Scan on project mtchsrcprj3  (cost=0.00..11.22 rows=122 width=8) (actual time=0.002..0.054 rows=122 loops=1)
>>             ->  Hash  (cost=10.83..10.83 rows=5 width=24) (actual time=0.017..0.017 rows=1 loops=1)
>>                   ->  Index Scan using name_float_lfm_idx on namemaster mtchsrcprjfs3  (cost=0.00..10.83 rows=5 width=24) (actual time=0.012..0.013 rows=1 loops=1)
>>                         Index Cond: (name_float_lfm = 'DWS'::text)
>>       ->  Bitmap Heap Scan on transaction_details idatrndtl  (cost=31.85..121.60 rows=1407 width=22) (actual time=2.864..12.060 rows=20153 loops=1)
>>             Recheck Cond: (idatrndtl.ida_trans_match_source_id = mtchsrcprj3.id)
>>             ->  Bitmap Index Scan on transaction_details_ida_trans_match_source_id  (cost=0.00..31.50 rows=1407 width=0) (actual time=2.696..2.696 rows=20153 loops=1)
>>                   Index Cond: (idatrndtl.ida_trans_match_source_id = mtchsrcprj3.id)
>>
>> The first frustration is that I can't get the transaction details scan
>> to get any more accurate.  It thinks it will find 1407 records;
>> instead it finds 20,153.  Then for whatever reason it thinks that a
>> join between 1 record and 1407 records will return 1 record.  This is
>> mainly what I can't understand.  Why does it think it will only get
>> one record in response when it's a left join?
>
> I don't see any left join there ...
>
>> PG 8.2.4 on Linux kernel 2.6.9 x64
>
> The first thing you should do is update to 8.2.6; we've fixed a fair
> number of problems since then that were fallout from the outer-join
> planning rewrite in 8.2.
>
> If it still doesn't work very well, please post the pg_stats rows for
> the join columns involved (idatrndtl.ida_trans_match_source_id and
> mtchsrcprj3.id).  (You do have up-to-date ANALYZE stats for both
> of those tables, right?)
>
> regards, tom lane



I know it's somewhat premature as we haven't had a chance to do the update
yet, but here is what I did with the statistics on the current version, for
chuckles and grins, just to see if it would make a difference in the plan.

# alter table project alter column id set statistics 1000;
ALTER TABLE
# analyze project;
ANALYZE
# alter table transaction_details alter column ida_trans_match_source_id set
statistics 1000;
ALTER TABLE
# analyze transaction_details;
ANALYZE
# select * from pg_stats where (tablename='project' and attname='id') or
(tablename='transaction_details' and attname='ida_trans_match_source_id');
 schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+---------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
 public     | project   | id      |         0 |         4 |         -1 |                  |                   |                  |
Re: [PERFORM] mis-estimate in nested query causes slow runtimes

2008-02-12 Thread Chris Kratz
On 2/11/08, Tom Lane [EMAIL PROTECTED] wrote:

> Chris Kratz [EMAIL PROTECTED] writes:
> [snip: same plan and questions as quoted in the previous message]
>
> The first thing you should do is update to 8.2.6; we've fixed a fair
> number of problems since then that were fallout from the outer-join
> planning rewrite in 8.2.
>
> regards, tom lane


Thanks Tom, we will try the upgrade and see if that makes a difference.


-Chris


Re: [PERFORM] Update with Subquery Performance

2008-02-12 Thread Tom Lane
Linux Guru [EMAIL PROTECTED] writes:
> Analyzing did not help; here is the output of EXPLAIN ANALYZE of the update query:
>
> Seq Scan on dummy  (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1)
>   SubPlan
>     ->  Aggregate  (cost=2420.41..2420.43 rows=1 width=19) (actual time=25.423..25.425 rows=1 loops=22712)
>           ->  Seq Scan on dummy temp  (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712)
>                 Filter: ((product)::text = ($0)::text)
> Total runtime: 578968.885 ms

Yeah, that's just not going to be fast.  An index on the product column
might help a bit, but the real issue is that you're repetitively
calculating the same aggregates.  I think you need a separate temp
table, along the lines of

create temp table dummy_agg as
  select product,
         (case when sum(pd) <> 0 then sum(gd)/sum(pd)*100 else 0 end) as s
  from dummy
  group by product;

create index dummy_agg_i on dummy_agg(product); -- optional

update dummy
  set gp = (select s from dummy_agg where dummy_agg.product = dummy.product);

The index would only be needed if you expect a lot of rows (lot of
different product values).

regards, tom lane



Re: [PERFORM] Questions about enabling SSL

2008-02-12 Thread fabrix peñuelas
Thanks Michael...

2008/2/11, Michael Fuhr [EMAIL PROTECTED]:

> On Mon, Feb 11, 2008 at 05:37:51PM -0700, Michael Fuhr wrote:
>> On Mon, Feb 11, 2008 at 04:58:35PM -0700, fabrix peñuelas wrote:
>>> If ssl is enabled in postgresql, does it decrease the performance of
>>> the database? How much?
>>
>> The performance impact of an encrypted connection depends on how
>> expensive the queries are and how much data they return.
>
> Another consideration is how much time you spend using each connection
> vs. how much time it takes to establish each connection.  A thousand
> simple queries over the same encrypted connection might be significantly
> faster than running each query over a separate unencrypted connection,
> which in turn will probably be significantly faster than using
> separate encrypted connections that must each carry out a relatively
> expensive key establishment.

> --
> Michael Fuhr
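
For reference, the server-side part of enabling SSL comes down to a couple of settings. A hedged sketch (8.2-era syntax; it assumes server.crt and server.key already exist in the data directory):

# postgresql.conf
ssl = on

# pg_hba.conf: require SSL for remote connections, plain TCP locally
# TYPE     DATABASE  USER  CIDR-ADDRESS    METHOD
hostssl    all       all   0.0.0.0/0       md5
host       all       all   127.0.0.1/32    md5

Given the key-establishment cost described above, reusing or pooling those encrypted connections is what keeps the overhead down.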