[PERFORM] Vacuum becomes slow

2005-06-30 Thread Martin Lesser
Hi,

the time needed for a daily VACUUM on a table with about 28 million records
increases from day to day. What's the best way to avoid this? A full
vacuum would probably take too much time; are there other ways to keep
vacuum performant?

The database was updated to postgres-8.0 on Jun 04 this year.

Between Jun 07 and Jun 30 the time VACUUM needed increased from 683
seconds to 1,663 seconds; the output is posted below. E.g. the time
for vacuuming the index of a text field (i_ids_user) rose from 123
seconds to 668 seconds. The increase happens each day, so this is not
a problem with just the last run. In the same period the number of
records in the table only increased from 27.5 million to 28.9 million;
the number of records updated daily is about 700,000 to 1,000,000.

Regards

Martin


| Tue Jun 7 04:07:17 CEST 2005 Starting
|   SET VACUUM_MEM=25; VACUUM ANALYZE VERBOSE t_ids

INFO:  vacuuming public.t_ids
INFO:  index i_ids_score now contains 4323671 row versions in 12414 pages
DETAIL:  493855 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 1.76s/5.44u sec elapsed 33.22 sec.
INFO:  index i_ids_id now contains 2752 row versions in 61515 pages
DETAIL:  960203 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 8.09s/24.93u sec elapsed 108.43 sec.
INFO:  index i_ids_user now contains 2752 row versions in 103172 pages
DETAIL:  960203 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 14.00s/39.65u sec elapsed 123.47 sec.
INFO:  t_ids: removed 960203 row versions in 203369 pages
DETAIL:  CPU 22.88s/21.72u sec elapsed 294.22 sec.
INFO:  t_ids: found 960203 removable, 2752 nonremovable row versions in 208912 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 214149 unused item pointers.
0 pages are entirely empty.
CPU 53.02s/93.76u sec elapsed 643.46 sec.
INFO:  vacuuming pg_toast.pg_toast_224670
INFO:  index pg_toast_224670_index now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  pg_toast_224670: found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  analyzing public.t_ids
INFO:  t_ids: scanned 9 of 208912 pages, containing 11846838 live rows and 0 dead rows; 9 rows in sample, 27499407 estimated total rows
VACUUM

| Tue Jun 7 04:18:40 CEST 2005 Job finished after 683 seconds



| Thu Jun 30 01:23:33 CEST 2005 Starting
|   SET VACUUM_MEM=25; VACUUM ANALYZE VERBOSE t_ids

INFO:  vacuuming public.t_ids
INFO:  index i_ids_score now contains 4460326 row versions in 29867 pages
DETAIL:  419232 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 4.58s/7.72u sec elapsed 368.13 sec.
INFO:  index i_ids_id now contains 28948643 row versions in 68832 pages
DETAIL:  795700 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 9.08s/25.29u sec elapsed 151.38 sec.
INFO:  index i_ids_user now contains 28948938 row versions in 131683 pages
DETAIL:  795700 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 20.10s/43.27u sec elapsed 668.00 sec.
INFO:  t_ids: removed 795700 row versions in 206828 pages
DETAIL:  CPU 23.35s/23.50u sec elapsed 309.19 sec.
INFO:  t_ids: found 795700 removable, 28948290 nonremovable row versions in 223145 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 906106 unused item pointers.
0 pages are entirely empty.
CPU 63.10s/101.96u sec elapsed 1592.00 sec.
INFO:  vacuuming pg_toast.pg_toast_224670
INFO:  index pg_toast_224670_index now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_toast_224670: found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  analyzing public.t_ids
INFO:  t_ids: scanned 9 of 223146 pages, containing 11675055 live rows and 288 dead rows; 9 rows in sample, 28947131 estimated total rows
VACUUM

| Thu Jun 30 01:51:16 CEST 2005 Job finished after 1663 seconds

[PERFORM] ODBC driver over network very slow

2005-06-30 Thread Milan Sekanina
We are running an application that uses psqlodbc driver on Windows XP to 
connect to a server and for some reason the download of data from the 
server is very slow. We have created a very simple test application that 
inserts a larger amount of data into the database and uses a simple 
SELECT * from test to download it back. The INSERT of 10MB takes about 
4 seconds, while the SELECT takes almost 5 minutes (with basically 
nothing else running on both the network and the two computers). If we 
run the PostgreSQL server on the local machine so that the network is 
not used, both actions are very fast.


Do you have any idea what could be the cause of this behavior? Are there 
any driver settings/other drivers we should use? We are currently using 
psqlodbc version 7.03.02.00, but it seems that other versions we tried 
show the same behavior. We have tweaked the various driver settings but 
the times remain basically unchanged.


Any ideas or hints are warmly welcome.

regards
Milan

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] slow simple update?

2005-06-30 Thread philippe ventrillon
You should provide a bit more detail about what happens if you want
people to help you.  Typically you will be asked for an EXPLAIN ANALYZE
of your query.

As a first tip: if your table contains much more than 30,000 rows, you
could try setting up a partial index with a
thru_date IS NULL condition.
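The tip above can be sketched as follows (the index name is illustrative, not from the thread):

```sql
-- Hedged sketch: a partial index that only covers the still-open rows,
-- so the UPDATE's "WHERE thru_date IS NULL" predicate can be answered
-- from a small index instead of a scan of the whole table.
CREATE INDEX product_price_thru_date_null_idx
    ON bob.product_price (thru_date)
    WHERE thru_date IS NULL;
```

Note that such an index stays small by construction: once a row's thru_date is set, it no longer satisfies the predicate and drops out of the index.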


regards
--
Philippe

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Colin Taylor
Sent: mercredi 22 juin 2005 08:13
To: pgsql-performance@postgresql.org
Subject: [PERFORM] slow simple update?

Hi there,

I'm doing an update of ~30,000 rows and it takes about 15 minutes on pretty
good hardware, even just after a VACUUM ANALYZE.
I was hoping some kind soul could offer some performance advice. Do I just
have too many indexes? Or am I missing some trick with the NULLs?


MY QUERY

update bob.product_price set thru_date = '2005-06-22 22:08:49.957'
where thru_date is null;


MY TABLE
=
   Table bob.product_price
   Column|   Type   | Modifiers 
-+--+---
 product_id  | character varying(20)| not null
 product_price_type_id   | character varying(20)| not null
 currency_uom_id | character varying(20)| not null
 product_store_id| character varying(20)| not null
 from_date   | timestamp with time zone | not null
 thru_date   | timestamp with time zone | 
 price   | numeric(18,2)| 
 created_date| timestamp with time zone | 
 created_by_user_login   | character varying(255)   | 
 last_modified_date  | timestamp with time zone | 
 last_modified_by_user_login | character varying(255)   | 
 last_updated_stamp  | timestamp with time zone | 
 last_updated_tx_stamp   | timestamp with time zone | 
 created_stamp   | timestamp with time zone | 
 created_tx_stamp| timestamp with time zone | 

Indexes:
-
pk_product_price primary key btree
 (product_id, product_price_type_id, currency_uom_id, product_store_id,
from_date), prdct_prc_txcrts btree (created_tx_stamp), prdct_prc_txstmp
btree (last_updated_tx_stamp), prod_price_cbul btree
(created_by_user_login), prod_price_cuom btree (currency_uom_id),
prod_price_lmbul btree (last_modified_by_user_login), prod_price_prod btree
(product_id), prod_price_pst btree (product_store_id), prod_price_type btree
(product_price_type_id)

Foreign Key constraints: 
-
prod_price_prod FOREIGN KEY (product_id) REFERENCES bob.product(product_id)
ON UPDATE NO ACTION ON DELETE NO ACTION, prod_price_type FOREIGN KEY
(product_price_type_id) REFERENCES
bob.product_price_type(product_price_type_id)
 ON UPDATE NO ACTION ON DELETE NO ACTION, prod_price_cuom FOREIGN KEY
(currency_uom_id) REFERENCES bob.uom(uom_id)  ON UPDATE NO ACTION ON DELETE
NO ACTION, prod_price_pst FOREIGN KEY (product_store_id) REFERENCES
bob.product_store(product_store_id)
 ON UPDATE NO ACTION ON DELETE NO ACTION, prod_price_cbul FOREIGN KEY
(created_by_user_login) REFERENCES
bob.user_login(user_login_id)
 ON UPDATE NO ACTION ON DELETE NO ACTION, prod_price_lmbul FOREIGN KEY
(last_modified_by_user_login) REFERENCES
bob.user_login(user_login_id)
 ON UPDATE NO ACTION ON DELETE NO ACTION



Re: [PERFORM] Vacuum becomes slow

2005-06-30 Thread Tom Lane
Martin Lesser [EMAIL PROTECTED] writes:
 the time needed for a daily VACUUM on a table with about 28 million records
 increases from day to day.

My guess is that the original timings were artificially low because the
indexes were in nearly perfect physical order, and as that condition
degrades over time, it takes longer for VACUUM to scan them.  If that's
the right theory, the runtime should level off soon, and maybe you don't
need to do anything.  You could REINDEX periodically but I think the
time taken to do that would probably be more than you want to spend
(especially since REINDEX locks out writes where VACUUM does not).

You should check that your FSM settings are large enough, but given that
the table itself doesn't seem to be bloating, that's probably not the
issue.
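If periodic REINDEX does turn out to be worthwhile despite the write lock, it would look like this (index name taken from the VACUUM output above; best run in a low-traffic window):

```sql
-- Hedged sketch: rebuild the index in fresh physical order.
-- REINDEX takes a lock that blocks writes to the table until it finishes,
-- which is the cost Tom warns about.
REINDEX INDEX i_ids_user;

-- A database-wide VACUUM VERBOSE (run as superuser) reports free-space-map
-- usage in its closing lines, which can be compared against max_fsm_pages.
VACUUM VERBOSE;
```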

regards, tom lane



Re: [PERFORM] ODBC driver over network very slow

2005-06-30 Thread Tom Lane
Milan Sekanina [EMAIL PROTECTED] writes:
 We are running an application that uses psqlodbc driver on Windows XP to 
 connect to a server and for some reason the download of data from the 
 server is very slow. We have created a very simple test application that 
 inserts a larger amount of data into the database and uses a simple 
 SELECT * from test to download it back. The INSERT of 10MB takes about 
 4 seconds, while the SELECT takes almost 5 minutes (with basically 
 nothing else running on both the network and the two computers). If we 
 run the PostgreSQL server on the local machine so that the network is 
 not used, both actions are very fast.

I seem to recall having seen similar reports not involving ODBC at all.
Try searching the mailing-list archives, but I think the cases we solved
involved getting rid of third-party add-ons to the Windows TCP stack.

regards, tom lane



Re: [PERFORM] ODBC driver over network very slow

2005-06-30 Thread Merlin Moncure
 Milan Sekanina [EMAIL PROTECTED] writes:
  We are running an application that uses psqlodbc driver on Windows XP to
  connect to a server and for some reason the download of data from the
  server is very slow. We have created a very simple test application that
  inserts a larger amount of data into the database and uses a simple
  SELECT * from test to download it back. The INSERT of 10MB takes about
  4 seconds, while the SELECT takes almost 5 minutes (with basically
  nothing else running on both the network and the two computers). If we
  run the PostgreSQL server on the local machine so that the network is
  not used, both actions are very fast.

 I seem to recall having seen similar reports not involving ODBC at all.
 Try searching the mailing-list archives, but I think the cases we solved
 involved getting rid of third-party add-ons to the Windows TCP stack.

IIRC there was a TCP related fix in the odbc driver related to
performance with large buffers.  I'd suggest trying a newer odbc driver
first.

Merlin 

dave page wrote ([odbc] 500 times slower)
 
 My colleague spent some time digging into the following case, and it
 looks like a Nagle-algorithm and delayed-ACK related problem.
 In psqlodbc.h:
 #define SOCK_BUFFER_SIZE  4096

 I changed that value to 8192 and the driver works fine for me.
 I am not sure why this change helps.

Err, no, neither am I. Why do you think it has something to do with
Nagle/delayed ACKs?

The only thing that instantly rings bells for me is that the max size of
a text field is 8190 bytes at present (which really should be increased,
if not removed altogether), which won't fit in the default buffer. But
then, I wouldn't expect to see the performance drop you describe with a
4096 byte buffer, only one much smaller.



Re: [PERFORM] ODBC driver over network very slow

2005-06-30 Thread Kevin Grittner
I was hesitant to jump in on this because I am new to PostgreSQL and
haven't seen this problem with _it_, but I have seen this with the
Sybase database products.  You can configure Sybase to disable the Nagle
algorithm.  If you don't, any query which returns rows too big to fit in
their network buffer will be painfully slow.  Increasing the buffer size
can help with an individual query, but it just reduces the scope of the
problem.  What you really want to do is make sure that TCP_NODELAY is
set for the connection, to disable the Nagle algorithm; it just doesn't
seem to be appropriate for returning query results.
 
How this issue comes into play in PostgreSQL is beyond my ken, but
hopefully this observation is helpful to someone.
 
-Kevin
 
 
 Merlin Moncure [EMAIL PROTECTED] 06/30/05 9:10 AM 

 My colleague spent some time digging into the following case, and it
 looks like a Nagle-algorithm and delayed-ACK related problem.
 In psqlodbc.h:
 #define SOCK_BUFFER_SIZE  4096

 I changed that value to 8192 and the driver works fine for me.
 I am not sure why this change helps.

Err, no, neither am I. Why do you think it has something to do with
Nagle/delayed ACKs?




Re: [PERFORM] start time very high

2005-06-30 Thread Jean-Max Reymond
2005/6/30, Jean-Max Reymond [EMAIL PROTECTED]:
 so the query runs in 26.646 ms on the Sun and 0.469 ms on my laptop :-(
 the databases are the same, vacuumed, and I think the Postgres (8.0.3)
 installations are well configured.
 The Sun has two disks and uses TABLESPACEs to keep the indexes on one disk
 and the data on the other.
 It seems that the cost of the first sort is very high on the Sun.
 How is that possible?

Maybe the data is not loaded in memory but still on disk?

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice France
http://www.ckr-solutions.com



Re: [PERFORM] start time very high

2005-06-30 Thread Josh Berkus
Jean-Max,

 I have two computers, one laptop (1.5 GHz, 512 MB RAM, one 4200 RPM disk)
 and one big Sun (8 GB RAM, two SCSI disks).

Did you run each query several times?   It looks like the index is cached 
on one server and not on the other.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] start time very high

2005-06-30 Thread Tom Lane
Jean-Max Reymond [EMAIL PROTECTED] writes:
 so the query runs in 26.646 ms on the Sun and 0.469 ms on my laptop :-(
 the databases are the same, vacuumed, and I think the Postgres (8.0.3)
 installations are well configured.

Are you sure they're both vacuumed?  The Sun machine's behavior seems
consistent with the idea of a lot of dead rows in its copy of the table.

regards, tom lane



[PERFORM] ported application having performance issues

2005-06-30 Thread John Mendenhall
pgsql performance gurus,

We ported an application from Oracle to PostgreSQL and are
experiencing an approximately 50% performance hit.  I am in
the process of isolating the problem.  I have searched the
internet (Google) and tried various things; only one thing
seems to work.  I am trying to find out whether our solution
is the only option, or whether I am doing something terribly
wrong.

The original application runs on the following:

hw:
cpu0: SUNW,UltraSPARC-IIi (upaid 0 impl 0x12 ver 0x12 clock 302 MHz)
mem = 393216K (0x1800)

sw:
Solaris 5.6
Oracle 7.3.2.2.0
Apache 1.3.27
Perl 5.004_04
mod_perl 1.27
DBI 1.20
DBD::Oracle 1.12

The ported application runs on the following:

hw:
unix: [ID 389951 kern.info] mem = 262144K (0x1000)
rootnex: [ID 466748 kern.info] root nexus = Sun Ultra 5/10 UPA/PCI 
(UltraSPARC-IIi 360MHz)

sw:
Solaris 5.9
PostgreSQL 7.4.6
Apache 1.3.33
Perl 5.8.6
mod_perl 1.29
DBI 1.46
DBD::Pg 1.40.1

Based on assistance from another list, we have
tried the following:

(1) Upgraded DBD::Pg to current version 1.43
(2) Ensured all tables are analyzed regularly
(3) Set some memory options in postgresql.conf
(4) Located a handful of slow queries by setting
log_min_duration_statement to 250.

Future options we will consider are:

(1) Attempting other option settings, like
random_page_cost
(2) Upgrading db server to current version 8.0.3

With our handful of slow queries, we have done
several iterations of changes to determine what
will address the issues.

We have narrowed this down to the direction of a join and to
setting enable_seqscan to off.  The table definitions are at
the bottom of this e-mail.  There is one large table (contacts)
and one smaller table (lead_requests).  The final SQL is as
follows:

SELECT
 c.id AS contact_id,
 lr.id AS lead_request_id
FROM
 lead_requests lr
  JOIN contacts c ON (c.id = lr.contact_id)
WHERE
 c.partner_id IS NULL
ORDER BY
 contact_id

I ran this query against freshly vacuum analyzed tables.

The first run is as follows:

db= explain analyze  SELECT
db-   c.id AS contact_id,
db-   lr.id AS lead_request_id
db-  FROM
db-   lead_requests lr
db-JOIN contacts c ON (c.id = lr.contact_id)
db-  WHERE
db-   c.partner_id IS NULL
db-  ORDER BY
db-   contact_id
db- ;
LOG:  duration: 4618.133 ms  statement: explain analyze  SELECT
  c.id AS contact_id,
  lr.id AS lead_request_id
 FROM
  lead_requests lr
   JOIN contacts c ON (c.id = lr.contact_id)
 WHERE
  c.partner_id IS NULL
 ORDER BY
  contact_id
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=4272.84..4520.82 rows=1230 width=21) (actual time=3998.771..4603.739 rows=699 loops=1)
   Merge Cond: ("outer".contact_id = "inner".id)
   ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr  (cost=0.00..74.09 rows=1435 width=21) (actual time=0.070..22.431 rows=1430 loops=1)
   ->  Sort  (cost=4272.84..4352.28 rows=31775 width=11) (actual time=3998.554..4130.067 rows=32640 loops=1)
         Sort Key: c.id
         ->  Seq Scan on contacts c  (cost=0.00..1896.77 rows=31775 width=11) (actual time=0.040..326.135 rows=32501 loops=1)
               Filter: (partner_id IS NULL)
 Total runtime: 4611.323 ms
(8 rows)

As you can see, run time over 4 seconds.
Then, I set enable_seqscan = off.

db= set enable_seqscan=off;
SET

Then I ran the exact same query:

db= explain analyze  SELECT
db-   c.id AS contact_id,
db-   lr.id AS lead_request_id
db-  FROM
db-   lead_requests lr
db-JOIN contacts c ON (c.id = lr.contact_id)
db-  WHERE
db-   c.partner_id IS NULL
db-  ORDER BY
db-   contact_id
db- ;
LOG:  duration: 915.304 ms  statement: explain analyze  SELECT
  c.id AS contact_id,
  lr.id AS lead_request_id
 FROM
  lead_requests lr
   JOIN contacts c ON (c.id = lr.contact_id)
 WHERE
  c.partner_id IS NULL
 ORDER BY
  contact_id
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..4749.84 rows=1230 width=21) (actual time=0.213..901.315 rows=699 loops=1)
   Merge Cond: ("outer".contact_id = "inner".id)
   ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr  (cost=0.00..74.09 rows=1435 width=21) (actual time=0.073..21.448 rows=1430 loops=1)
   ->  Index Scan using contacts_pkey on contacts c  (cost=0.00..4581.30 rows=31775 width=11) (actual time=0.038..524.217 

Re: [PERFORM] ported application having performance issues

2005-06-30 Thread Joshua D. Drake



Thank you very much in advance for any pointers you can
provide.  And, if this is the wrong forum for this question,
please let me know and I'll ask it elsewhere.


I think you may want to increase your statistics target, plus make sure
you are running ANALYZE; an EXPLAIN ANALYZE would do.


Sincerely,

Joshua D. Drake





JohnM





-
table definitions
-

-
db= \d contacts
   Table db.contacts
Column|Type | Modifiers 
--+-+---

 id   | numeric(38,0)   | not null
 db_id| character varying(32)   | 
 firstname| character varying(64)   | 
 lastname | character varying(64)   | 
 company  | character varying(128)  | 
 email| character varying(256)  | 
 phone| character varying(64)   | 
 address  | character varying(128)  | 
 city | character varying(128)  | 
 state| character varying(32)   | 
 postalcode   | character varying(16)   | 
 country  | character varying(2)| not null
 contact_info_modified| character(1)| 
 token_id | numeric(38,0)   | 
 status_id| numeric(38,0)   | 
 status_last_modified | timestamp without time zone | 
 notes| character varying(2000) | 
 demo_schedule| timestamp without time zone | 
 partner_id   | numeric(38,0)   | 
 prev_partner_id  | numeric(38,0)   | 
 prev_prev_partner_id | numeric(38,0)   | 
 site_last_visited| timestamp without time zone | 
 source_id| numeric(4,0)| 
 demo_requested   | timestamp without time zone | 
 sourcebook_requested | timestamp without time zone | 
 zip  | numeric(8,0)| 
 suffix   | numeric(8,0)| 
 feedback_request_sent| timestamp without time zone | 
 products_sold| character varying(512)  | 
 other_brand  | character varying(512)  | 
 printsample_requested| timestamp without time zone | 
 indoor_media_sample  | timestamp without time zone | 
 outdoor_media_sample | timestamp without time zone | 
 printers_owned   | character varying(256)  | 
 business_type| character varying(256)  | 
 printers_owned2  | character varying(256)  | 
 contact_quality_id   | numeric(38,0)   | 
 est_annual_value | numeric(38,2)   | 
 likelyhood_of_closing| numeric(38,0)   | 
 priority | numeric(38,0)   | 
 business_type_id | numeric(38,0)   | 
 lead_last_modified   | timestamp without time zone | 
 lead_value   | numeric(38,2)   | 
 channel_contact_flag | character(1)| 
 request_status_last_modified | timestamp without time zone | 
 master_key_number| numeric(38,0)   | 
 master_key_token | character varying(32)   | 
 current_media_cust   | character(1)| 
 kodak_media_id   | numeric(38,0)   | 
 printer_sample_id| numeric(38,0)   | 
 quantity_used_id | numeric(38,0)   | 
 rip_used_id  | numeric(38,0)   | 
 language_code| character varying(3)| 
 region_id| numeric(38,0)   | not null
 lead_deleted | timestamp without time zone | 
 last_request_set_status_id   | numeric(38,0)   | 
 address2 | character varying(128)  | 
 media_usage_id   | numeric(38,0)   | 
Indexes:

contacts_pkey primary key, btree (id)
contacts_partner_id_idx btree (partner_id)
contacts_partner_id_null_idx btree (partner_id) WHERE (partner_id IS NULL)
contacts_token_id_idx btree (token_id)
Check constraints:
sys_c0050644 CHECK (country IS NOT NULL)
sys_c0050643 CHECK (id IS NOT NULL)
sys_c0050645 CHECK (region_id IS NOT NULL)
Triggers:
insert_master_key BEFORE INSERT ON contacts FOR EACH ROW EXECUTE PROCEDURE 
pg_fct_insert_master_key()
-

-
db= \d lead_requests
 Table db.lead_requests
Column |Type | Modifiers 
---+-+---

 id| numeric(38,0) 

Re: [PERFORM] ported application having performance issues

2005-06-30 Thread Tom Lane
John Mendenhall [EMAIL PROTECTED] writes:
  Merge Join  (cost=4272.84..4520.82 rows=1230 width=21) (actual time=3998.771..4603.739 rows=699 loops=1)
    Merge Cond: ("outer".contact_id = "inner".id)
    ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr  (cost=0.00..74.09 rows=1435 width=21) (actual time=0.070..22.431 rows=1430 loops=1)
    ->  Sort  (cost=4272.84..4352.28 rows=31775 width=11) (actual time=3998.554..4130.067 rows=32640 loops=1)
          Sort Key: c.id
          ->  Seq Scan on contacts c  (cost=0.00..1896.77 rows=31775 width=11) (actual time=0.040..326.135 rows=32501 loops=1)
                Filter: (partner_id IS NULL)
  Total runtime: 4611.323 ms

Hmm ... even on a SPARC, it doesn't seem like it should take 4 seconds
to sort 32K rows.  You can certainly see that the planner is not
expecting that (it's estimating a sort cost comparable to the scan cost,
which if true would put this in the sub-second ballpark).

Does increasing sort_mem help?

Have you considered using some other datatype than numeric for your
keys?  Numeric may be fast on Oracle but it's not amazingly fast on
Postgres.  bigint would be better, if you don't really need 38 digits;
if you do, I'd be inclined to think about plain char or varchar keys.
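A sketch of the suggested key-type change (PostgreSQL 8.0+ ALTER TABLE syntax, so not available on the 7.4.6 server in this thread without an upgrade; assumes the values fit in 64 bits and that all dependent foreign-key columns are converted to match):

```sql
-- Hedged sketch: convert numeric(38,0) keys to bigint for cheaper
-- comparisons and sorts. Both sides of the join must be converted,
-- or the planner may have to cast one side on every comparison.
ALTER TABLE contacts      ALTER COLUMN id         TYPE bigint;
ALTER TABLE lead_requests ALTER COLUMN id         TYPE bigint;
ALTER TABLE lead_requests ALTER COLUMN contact_id TYPE bigint;
```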

regards, tom lane



Re: [PERFORM] ported application having performance issues

2005-06-30 Thread John Mendenhall
pgsql performance gurus,

I truly appreciate the suggestions provided.

I have tried each one separately to determine the
best fit.  I have included results for each suggestion.
I have also included my entire postgresql.conf file so
you can see our base configuration.
Each result is based on an in-session variable setting,
so it only affected the current session.

(1) Increase the default_statistics_target,
run vacuum, analyze on each table for each setting

The default setting is 10.

I tried the following settings, with the corresponding
results:

default_statistics_target =   10 time approximately 4500ms
default_statistics_target =  100 time approximately 3900ms
default_statistics_target =  500 time approximately 3900ms
default_statistics_target = 1000 time approximately 3900ms

So, this option does not quite get us there.

(2) Increase sort_mem value

The current setting for sort_mem is 2048.

sort_mem =  2048 time approximately 4500ms
sort_mem =  8192 time approximately 2750ms
sort_mem = 16384 time approximately 2650ms
sort_mem =  1024 time approximately 1000ms

Interesting to note...
When I set sort_mem to 1024, the plan started the join
with the lead_requests table and used the contacts index.
None of the above attempts used this.

(3) Decrease random_page_cost, increase effective_cache_size

The default setting for random_page_cost is 4.
Our setting for effective_cache_size is 2048.

random_page_cost = 4, effective_cache_size = 2048   time approximately 4500ms
random_page_cost = 3, effective_cache_size = 2048   time approximately 1050ms
random_page_cost = 3, effective_cache_size = 4096   time approximately 1025ms

The decrease of random_page_cost to 3 caused the plan
to work properly, using the lead_requests table as a
join starting point and using the contacts index.
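The per-session experiments described above amount to something like this (7.4 setting names, values from the runs above):

```sql
-- Hedged sketch: try planner settings for the current session only,
-- then re-run the problem query to compare plans and timings.
SET random_page_cost = 3;
SET effective_cache_size = 4096;   -- measured in 8 KB pages

EXPLAIN ANALYZE
SELECT c.id AS contact_id, lr.id AS lead_request_id
FROM lead_requests lr
JOIN contacts c ON (c.id = lr.contact_id)
WHERE c.partner_id IS NULL
ORDER BY contact_id;
```

Settings changed with SET revert when the session ends; making a winner permanent means editing postgresql.conf and reloading.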

*

It appears we learned the following:

(a) For some reason, setting the sort_mem smaller than
our current setting caused things to work correctly.
(b) Lowering random_page_cost causes things to work
correctly.

This brings up the following questions:

 (i) What is the ideal configuration for this query
to work?
(ii) Will this ideal configuration work for all our
other queries, or is this specific to this query only?
(iii) Should I try additional variable changes, or
lower/raise the variables I have already changed even
more?

Thanks again for the suggestions provided.  And,
thanks in advance for any additional thoughts or
suggestions.

JohnM

::
postgresql.conf
::
-
# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# pg_ctl reload.


#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

tcpip_socket = false
max_connections = 128
# note: increasing max_connections costs about 500 bytes of shared
# memory per connection slot, in addition to costs from shared_buffers
# and max_locks_per_transaction.
#superuser_reserved_connections = 2
#port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''  # what interface to listen on; defaults to any
#rendezvous_name = ''   # defaults to the computer name

# - Security  Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false


#---
# RESOURCE USAGE (except WAL)
#---

# - Memory -

shared_buffers = 4096   # min 16, at least max_connections*2, 8KB each
sort_mem = 2048 # min 64, size in KB
#vacuum_mem = 8192  # min 1024, size in KB

# - Free Space Map -

#max_fsm_pages = 20000  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''