Re: [PERFORM] db size

2008-04-14 Thread Vinubalaji Gopal
Hi Adrian,



> When I do a pg_dump with the following parameters /usr/bin/pg_dump -E
> UTF8 -F c -b I get a file of 14GB in size.


From the man page of pg_dump:

-F format, --format=format

    Selects the format of the output. format can be one of the following:

    c   Output a custom archive suitable for input into pg_restore. This
        is the most flexible format in that it allows reordering of data
        load as well as schema elements. This format is also compressed
        by default.


So the output is compressed, and it is a dump of the database containing
the SQL commands needed to recreate it.



> But the database is 110GB in size on the disk.  Why the big difference
> in size?  Does this have anything to do with performance?

A VACUUM or VACUUM FULL of the entire database will reduce its on-disk
size by reclaiming unused space; the dump is so much smaller because it
contains only the live rows, with no indexes, dead tuples, or free
space. You can also use a filesystem-based backup or a dump/restore
strategy.
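
As a minimal sketch (assuming 8.1 or later for the built-in
pg_database_size(); on 8.0 the equivalent lives in contrib/dbsize):

-- check the on-disk size before and after reclaiming space
SELECT pg_database_size(current_database());

-- compact tables and return trailing free space to the OS
VACUUM FULL VERBOSE;

SELECT pg_database_size(current_database());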



Re: [PERFORM] vacuum in Postgresql 8.0.x slowing down the database

2008-03-28 Thread Vinubalaji Gopal
On Wed, 26 Mar 2008 13:02:13 -0700
Joshua D. Drake [EMAIL PROTECTED] wrote:


> The slowness is likely attributed to Vacuum's use of I/O. When vacuum
> is running what does iostat -k 10 say?

It seems to be higher than normal. Here is the output with vacuum
running without the other queries; the default vacuum took ~1 hr:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.30    0.00    4.50   25.91    0.00   56.29

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
md2            3356.94      2005.59     12945.45      20076     129584


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          16.20    0.00    6.32   24.89    0.00   52.59

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
md2             461.70       667.20      1512.00       6672      15120


I don't know if the output helps much, since the vacuum took some time
and I lost more than half of my iostat -k screen output (I scrolled up,
but got only some of the data).

If vacuum does affect the I/O, what are the ways to reduce the I/O
during vacuum? (The delay and cost parameters did not help that much;
should I consider reducing the cost limit even further?) Should I
consider partitioning the table?
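
(For reference, a sketch of the knobs in question, settable per session
before a manual vacuum; the values and table name are illustrative:)

-- sleep vacuum_cost_delay ms whenever the accumulated page cost
-- reaches vacuum_cost_limit, spreading vacuum's I/O out over time
SET vacuum_cost_delay = 100;
SET vacuum_cost_limit = 200;
VACUUM VERBOSE ANALYZE my_big_table;  -- my_big_table is a placeholder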

Thank you.

--
Vinu



[PERFORM] vacuum in Postgresql 8.0.x slowing down the database

2008-03-26 Thread Vinubalaji Gopal
Hey all,
  I had posted some time back asking for the best way to perform vacuum
with a lower priority. I did tune it down to a lower priority and still
noticed that the other database queries slow down while a vacuum runs
on one big table. I also tried upgrading PostgreSQL to 8.0.15 as
suggested, and I can still reproduce the problem.

It happens when I vacuum one particular table, which has 9.5 million
rows; all the other inserts/selects slow down by a factor of 10. I am
using:
vacuum_cost_delay = 100
vacuum_cost_limit = 200

Even if I cancel the vacuum operation or let it complete, the slowdown
persists until I restart my application.

What's the best way to analyze why PostgreSQL is slowing down? I had a
look at pg_locks (but did not know what to look for) and also tried
starting PostgreSQL in debug mode using: postmaster -d 5
-D /var/pgdata (I saw a lot of output, including the queries being
performed, but could not gather anything useful).
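
(A sketch of what one might check while the slowdown persists; this
assumes stats_command_string is enabled so current_query is populated,
and uses the 8.0-era procpid column:)

-- show each lock alongside the query holding or waiting for it
SELECT l.relation::regclass, l.mode, l.granted, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.procpid
ORDER BY l.granted;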

The big table has never been reindexed; it has a primary key and a
unique key with btree indexes, plus one foreign key constraint.

--
Vinu



[PERFORM] best way to run maintenance script

2008-03-14 Thread Vinubalaji Gopal
Hi all,
  I have been searching for the best way to run maintenance scripts
that do a vacuum and analyze and delete some old data. Whenever the
maintenance script runs (mainly the pg_maintenance --analyze script) it
slows down PostgreSQL inserts, and I want to avoid that. The system is
under constant load, and I am not concerned about the time the vacuum
takes. Is there a utility or mechanism in PostgreSQL which helps in
reducing the priority of maintenance queries?

Is writing a PostgreSQL C function that sets the process priority the
only way to change the priority of the maintenance script, or is there
a better way?
http://weblog.bignerdranch.com/?p=11

I tried using the nice command (on Linux) on the maintenance script,
but it did not have any effect; I guess it does not change the niceness
of the backend process that performs the vacuum.
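
(A sketch of the hack under discussion: since the client's niceness
does not carry over to the server, one would have to find the backend's
PID and renice that instead. This assumes stats_command_string is on,
and note that renice changes CPU priority only, not I/O priority:)

-- find the PID of the backend doing the vacuum
SELECT procpid, current_query
FROM pg_stat_activity
WHERE current_query ILIKE '%vacuum%';

-- then, from a root shell on the server: renice 19 -p <pid>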

(I am running PostgreSQL 8.0 on Linux.)

--
Vinu



Re: [PERFORM] best way to run maintenance script

2008-03-14 Thread Vinubalaji Gopal
Hi Joshua,

> You can use parameters such as vacuum_cost_delay to help this... see
> the docs:
>
> http://www.postgresql.org/docs/8.3/static/runtime-config-autovacuum.html

I am checking it out. It seems to be a nice option for vacuum, but I
wish there were a way to change the priority of the deletes as well;
otherwise I will try the C-based priority hack.


> If you are truly running 8.0 and not something like 8.0.15 vacuum is
> the least of your worries.

It's 8.0.4.

Thanks.

--
Vinu



Re: [PERFORM] best way to run maintenance script

2008-03-14 Thread Vinubalaji Gopal

On Fri, 2008-03-14 at 18:37 -0700, Tom Lane wrote:
> That's only a little bit better.  Read about all the bug fixes you're

Sure, I will eventually upgrade it sometime, but it has to wait for
now :(


--
Vinu



Re: [PERFORM] best way to run maintenance script

2008-03-14 Thread Vinubalaji Gopal

 
> I think you will find if you do it the right way, which is to say the
> way that it is meant to be done with the configurable options, your
> life will be a great deal more pleasant than some one off hack.
 

Yeah, I agree. The pg_maintenance script, which calls vacuum and
analyze, is the thing causing the most problems. I am trying out
various vacuum options (vacuum_cost_limit, vacuum_cost_delay) and
finding it hard to understand the implications of these variables. What
are the optimal values for the vacuum_* parameters for a really active
database (writes at the rate of ~50 rows/second)?

I started with
vacuum_cost_delay = 200
vacuum_cost_limit = 400

and that did not help much. 
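
(To make the cost accounting concrete, a rough sketch using the
documented defaults; the arithmetic is illustrative:)

-- every page vacuum touches adds to a running cost counter
SHOW vacuum_cost_page_hit;    -- default 1:  page found in shared buffers
SHOW vacuum_cost_page_miss;   -- default 10: page read from the OS/disk
SHOW vacuum_cost_page_dirty;  -- default 20: page dirtied by vacuum

-- with vacuum_cost_limit = 400 and vacuum_cost_delay = 200, vacuum
-- sleeps 200 ms after roughly every 40 page misses, trading vacuum
-- speed for lighter I/O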

--
Vinu




Re: [PERFORM] Postgres scalability and performance on windows

2006-11-28 Thread Gopal
Tom,

This is the query and the schema

Query is:

SELECT subq.percentCover, ds.datasetname, ds.maxresolution
FROM (
    SELECT
        sum(area(intersection(snaptogrid(chunkgeometry, 0.0001),
            GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,
              -0.138958398039148 50.8478005422809,
              -0.0963639712296823 50.8471133071392,
              -0.0974609286275892 50.8201477285483,
              -0.140030845589332 50.8208343077265))', 4326)))
            * 100 / (0.00114901195862628)) AS percentCover,
        datasetid AS did
    FROM tbl_metadata_chunks
    WHERE chunkgeometry && GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,
              -0.138958398039148 50.8478005422809,
              -0.0963639712296823 50.8471133071392,
              -0.0974609286275892 50.8201477285483,
              -0.140030845589332 50.8208343077265))', 4326)
      AND datasetid IN (SELECT datasetid FROM tbl_metadata_dataset
                        WHERE typeofdataid = 1)
    GROUP BY did
    ORDER BY did DESC
) AS subq
INNER JOIN tbl_metadata_dataset AS ds ON subq.did = ds.datasetid
ORDER BY ceil(subq.percentCover), 1/ds.maxresolution DESC;


Schema is

Table 1
CREATE TABLE public.tbl_metadata_dataset
(
datasetname varchar(70) NOT NULL,
maxresolution real,
typeofdataid integer NOT NULL,
datasetid serial NOT NULL,
CONSTRAINT PK_Dataset PRIMARY KEY (datasetid)
);
-- Indexes
CREATE INDEX dsnameindex ON tbl_metadata_dataset USING btree
(datasetname);
-- Owner
ALTER TABLE public.tbl_metadata_dataset OWNER TO postgres;
-- Triggers
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196039 AFTER DELETE ON
tbl_metadata_dataset FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_noaction_del('dsid',
'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED',
'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196040 AFTER UPDATE ON
tbl_metadata_dataset FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_noaction_upd('dsid',
'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED',
'datasetid', 'datasetid');


Table 2

CREATE TABLE public.tbl_metadata_chunks
(
chunkid serial NOT NULL,
chunkgeometry geometry NOT NULL,
datasetid integer NOT NULL,
CONSTRAINT tbl_metadata_chunks_pkey PRIMARY KEY (chunkid),
CONSTRAINT dsid FOREIGN KEY (datasetid) REFERENCES
tbl_metadata_dataset(datasetid)
);
-- Indexes
CREATE INDEX idx_dsid ON tbl_metadata_chunks USING btree (datasetid);
CREATE UNIQUE INDEX tbl_metadata_chunks_idx2 ON tbl_metadata_chunks
USING btree (nativetlx, nativetly, datasetid);
CREATE INDEX tbl_metadata_chunks_idx3 ON tbl_metadata_chunks USING gist
(chunkgeometry);
-- Owner
ALTER TABLE public.tbl_metadata_chunks OWNER TO postgres;
-- Triggers
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2194515 AFTER DELETE ON
tbl_metadata_chunks FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_restrict_del('fk',
'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED',
'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2194516 AFTER UPDATE ON
tbl_metadata_chunks FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_restrict_upd('fk',
'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED',
'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196037 AFTER INSERT ON
tbl_metadata_chunks FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_check_ins('dsid',
'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED',
'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196038 AFTER UPDATE ON
tbl_metadata_chunks FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_check_upd('dsid',
'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED',
'datasetid', 'datasetid');



-Original Message-
From: Frank Wiles [mailto:[EMAIL PROTECTED] 
Sent: 24 November 2006 17:05
To: Guido Neitzer
Cc: Gopal; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Postgres scalability and performance on windows

On Fri, 24 Nov 2006 09:22:45 +0100
Guido Neitzer [EMAIL PROTECTED] wrote:

> >  effective_cache_size = 82728  # typically 8KB each
>
> Hmm. I don't know what the real effect of this might be as the doc
> states:
>
> "This parameter has no effect on the size of shared memory allocated
> by PostgreSQL, nor does it reserve kernel disk cache; it is used
> only for estimation purposes."

  This is a hint to the optimizer about how much of the database may
  be in the OS level cache.

 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org

Re: [PERFORM] Postgres scalability and performance on windows

2006-11-24 Thread Gopal
Hi,

Thanks for your suggestions. Here's the output of EXPLAIN ANALYZE.
I'll change the shared_buffers and look at the behaviour again.

Limit  (cost=59.53..59.53 rows=1 width=28) (actual time=15.681..15.681 rows=1 loops=1)
  ->  Sort  (cost=59.53..59.53 rows=1 width=28) (actual time=15.678..15.678 rows=1 loops=1)
        Sort Key: ceil(subq.percentcover), (1::double precision / ds.maxresolution)
        ->  Hash Join  (cost=58.19..59.52 rows=1 width=28) (actual time=15.630..15.663 rows=2 loops=1)
              Hash Cond: ("outer".datasetid = "inner".did)
              ->  Seq Scan on tbl_metadata_dataset ds  (cost=0.00..1.21 rows=21 width=24) (actual time=0.006..0.021 rows=21 loops=1)
              ->  Hash  (cost=58.18..58.18 rows=1 width=12) (actual time=15.591..15.591 rows=2 loops=1)
                    ->  Sort  (cost=58.17..58.17 rows=1 width=117) (actual time=15.585..15.586 rows=2 loops=1)
                          Sort Key: tbl_metadata_chunks.datasetid
                          ->  HashAggregate  (cost=58.13..58.16 rows=1 width=117) (actual time=15.572..15.573 rows=2 loops=1)
                                ->  Hash IN Join  (cost=3.34..58.10 rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1)
                                      Hash Cond: ("outer".datasetid = "inner".datasetid)
                                      ->  Bitmap Heap Scan on tbl_metadata_chunks  (cost=2.05..56.67 rows=14 width=117) (actual time=0.204..0.384 rows=60 loops=1)
                                            Filter: (chunkgeometry && '010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC1BF608F3D1 (..)
                                            ->  Bitmap Index Scan on tbl_metadata_chunks_idx3  (cost=0.00..2.05 rows=14 width=0) (actual time=0.192..0.192 rows=60 loops=1)
                                                  Index Cond: (chunkgeometry && '010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC (..)
                                      ->  Hash  (cost=1.26..1.26 rows=10 width=4) (actual time=0.037..0.037 rows=10 loops=1)
                                            ->  Seq Scan on tbl_metadata_dataset  (cost=0.00..1.26 rows=10 width=4) (actual time=0.005..0.024 rows=10 loops=1)
                                                  Filter: (typeofdataid = 1)
Total runtime: 15.871 ms



Gopal


[PERFORM] Postgres scalability and performance on windows

2006-11-23 Thread Gopal
Hi all,

 

I have a postgres installation that's running under 70-80% CPU usage,
while an MSSQL7 installation did 'roughly' the same thing with 1-2% CPU
load.

Here's the scenario:

300 queries/second
Server: Postgres 8.1.4 on win2k server
CPU: Dual Xeon 3.6 GHz
Memory: 4GB RAM
Disks: 3 x 36GB, 15K RPM SCSI

C# based web application calling postgres functions using npgsql 0.7.
It's an almost completely read-only db apart from fortnightly updates.

Table 1 - About 300,000 rows with simple rectangles
Table 2 - 1 million rows
Total size: 300MB

Functions: simple coordinate reprojection and intersection query +
inner join of table1 and table2. I think I have all the right indexes
defined, and indeed the performance for queries under low loads is
fast.


==

postgresql.conf has the following settings:

max_connections = 150
shared_buffers = 2             # min 16 or max_connections*2, 8KB each
temp_buffers = 2000            # min 100, 8KB each
max_prepared_transactions = 25 # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared
# memory per transaction slot, plus lock space (see
# max_locks_per_transaction).
work_mem = 512                 # min 64, size in KB
#maintenance_work_mem = 16384  # min 1024, size in KB
max_stack_depth = 2048
effective_cache_size = 82728   # typically 8KB each
random_page_cost = 4           # units are one sequential page fetch


==

 

SQL Server caches all the data in memory, which makes it faster (it
uses about 1.2GB of memory, which is fine).

But postgres has everything spread across 10-15 processes, each using
about 10-30MB; that is not nearly enough to cache all the data, so it
ends up doing a lot of disk reads.

I've read that postgres depends on the OS to cache the files; I wonder
if this is not happening on Windows.
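
(One hedged way to check where reads are being served from is the
block-level statistics views, assuming stats_block_level is enabled:)

-- heap block requests satisfied from shared_buffers vs. read from
-- the OS cache or disk
SELECT sum(heap_blks_hit) AS from_shared_buffers,
       sum(heap_blks_read) AS from_os_or_disk
FROM pg_statio_user_tables;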

 

In any case I cannot believe that having 15-20 processes running on
Windows helps. Why not spawn threads instead of processes, which might
be far less expensive and more efficient? Is there any way of doing
this?

 

My question is: should I just accept the performance I am getting as
the limit on Windows, or should I be looking at some other params that
I might have missed?

 

Thanks,

Gopal


