Re: [PERFORM] db size
Hi Adrian,

When I do a pg_dump with the following parameters:

    /usr/bin/pg_dump -E UTF8 -F c -b

I get a file of 14GB in size. From the man page of pg_dump:

    -F format, --format=format
        Selects the format of the output. format can be one of the following:
        c   Output a custom archive suitable for input into pg_restore. This
            is the most flexible format in that it allows reordering of data
            load as well as schema elements. This format is also compressed
            by default.

So the output is compressed, and it is a dump of the database containing the SQL commands. But the database is 110GB in size on disk. Why the big difference in size? Does this have anything to do with performance?

A VACUUM or VACUUM FULL of the entire database will reduce its size by reclaiming any unused space, and you can then use a filesystem-based backup or a backup/restore strategy.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
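The gap between the two numbers can be inspected directly: indexes, dead tuples, and per-page overhead all count toward the on-disk size but are never written to the dump. A minimal sketch (the database name "mydb" is a placeholder; the size functions assume 8.1 or later, while on 8.0 the contrib/dbsize module provides equivalents):

```sql
-- On-disk size of the whole database (includes indexes and dead space).
SELECT pg_size_pretty(pg_database_size('mydb'));

-- The ten largest tables; comparing these against the dump shows how much
-- of the 110GB is index and reclaimable free space rather than row data.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_relation_size(oid) DESC
LIMIT 10;
```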
Re: [PERFORM] vacuum in Postgresql 8.0.x slowing down the database
On Wed, 26 Mar 2008 13:02:13 -0700 Joshua D. Drake [EMAIL PROTECTED] wrote:

> The slowness is likely attributed to vacuum's use of I/O. When vacuum
> is running, what does iostat -k 10 say?

Seems to be higher than normal. Here is the output with vacuum run without the other queries, the default vacuum taking ~1 hr:

    avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
              13.30   0.00     4.50    25.91    0.00  56.29

    Device:   tps      kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
    md2       3356.94  2005.59    12945.45   20076    129584

    avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
              16.20   0.00     6.32    24.89    0.00  52.59

    Device:   tps      kB_read/s  kB_wrtn/s  kB_read  kB_wrtn
    md2       461.70   667.20     1512.00    6672     15120

I don't know if the output helps much, since the vacuum took some time and I lost more than half of my iostat -k screen output (I scrolled up, but got only some of the data).

If vacuum does affect the I/O, what are the ways to reduce I/O during vacuum? The delay and cost parameters did not help that much; should I consider reducing the cost even further? Should I consider partitioning the table?

Thank you.

-- Vinu
[PERFORM] vacuum in Postgresql 8.0.x slowing down the database
Hey all,

I had posted sometime back asking for the best way to perform vacuum with a lower priority. I did tune it to a lower priority and still noticed that the other database queries slow down while one big table is being vacuumed. I also tried upgrading PostgreSQL to 8.0.15 as suggested, and I could still reproduce the problem. It happens when I vacuum one particular table which has 9.5 million rows: all the other inserts/selects slow down by a factor of 10. I am using:

    vacuum_cost_delay = 100
    vacuum_cost_limit = 200

Even if I cancel the vacuum operation or let it complete, the slowdown persists until I restart my application. What's the best way to analyze why PostgreSQL is slowing down? I had a look at pg_locks (did not know what to look for) and also tried starting PostgreSQL with the debug option using:

    postmaster -d 5 -D /var/pgdata

(I saw a lot of output, including the queries being performed, but could not gather anything useful.) The big table has never been reindexed and has a primary, unique key with a btree index and one foreign key constraint.

-- Vinu
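One place to start when the slowdown persists after the vacuum is gone is the lock view: ungranted locks point at blocked backends. A hedged sketch, using the 8.0-era catalog column names (procpid was renamed pid in later releases):

```sql
-- Backends currently waiting on a lock, with what each blocked
-- session was running at the time.
SELECT l.relation::regclass AS relation, l.mode, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;
```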
[PERFORM] best way to run maintenance script
Hi all,

I have been searching for the best way to run maintenance scripts that do a vacuum, analyze, and delete some old data. Whenever the maintenance script runs (mainly the pg_maintenance --analyze script) it slows down PostgreSQL inserts, and I want to avoid that. The system is under constant load, and I am not interested in the time taken to vacuum. Is there a utility or mechanism in PostgreSQL which helps in reducing the priority of maintenance queries? Is writing a PostgreSQL C function and setting the priority of the process the only way to change the priority of the maintenance script, or is there a better way?

http://weblog.bignerdranch.com/?p=11

I tried using the nice command (Linux system) on the maintenance script; it did not have any effect. I guess it does not change the niceness of the PostgreSQL vacuum process. (I am running PostgreSQL 8.0 on Linux.)

-- Vinu
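nice on the script has no effect because the VACUUM runs inside a server backend, not in the client process. The backend's PID can be found from the statistics view and reniced at the OS level instead; a sketch (8.0-era column names, and stats_command_string must be enabled for current_query to be populated):

```sql
-- Find the backend actually executing the VACUUM; that procpid can then
-- be reniced from the shell, e.g.  renice 19 -p <procpid>
SELECT procpid, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'VACUUM%';
```

Note that renice only lowers CPU priority; vacuum's impact is usually I/O, which nice does not throttle, so the vacuum_cost_* settings remain the more direct lever.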
Re: [PERFORM] best way to run maintenance script
Hi Joshua,

> You can use parameters such as vacuum_cost_delay to help this... see
> the docs:
> http://www.postgresql.org/docs/8.3/static/runtime-config-autovacuum.html

I am checking it out. Seems to be a nice option for vacuum, but I wish there was a way to change the delete priority, or I will try to use the C-based priority hack.

> If you are truly running 8.0 and not something like 8.0.15, vacuum is
> the least of your worries.

It's 8.0.4. Thanks.

-- Vinu
Re: [PERFORM] best way to run maintenance script
On Fri, 2008-03-14 at 18:37 -0700, Tom Lane wrote:

> That's only a little bit better. Read about all the bug fixes you're

Sure, will eventually upgrade it sometime, but it has to wait for now :(

-- Vinu
Re: [PERFORM] best way to run maintenance script
> I think you will find if you do it the right way, which is to say the
> way that it is meant to be done with the configurable options, your
> life will be a great deal more pleasant than some one-off hack.

Yeah, I agree. The pg_maintenance script, which calls vacuum and analyze, is one of the things causing more problems. I am trying out the various vacuum options (vacuum_cost_limit, vacuum_cost_delay) and finding it hard to understand the implications of the variables. What are the optimal values for the vacuum_* parameters for a really active database (writes at a rate of ~50 rows/second)? I started with:

    vacuum_cost_delay = 200
    vacuum_cost_limit = 400

and that did not help much.

-- Vinu
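As a starting point rather than a recommendation, the cost-based settings are usually pushed the other way: a lower cost limit combined with a modest delay makes vacuum sleep more often, stretching the same total I/O over a longer window instead of issuing it in large bursts. An illustrative postgresql.conf fragment (values are guesses to tune from, not measured for this workload):

```
vacuum_cost_delay = 20    # ms to sleep each time the cost limit is reached;
                          # very large delays mostly just make vacuum take longer
vacuum_cost_limit = 100   # lower limit => vacuum hits the sleep sooner,
                          # so each burst of I/O between sleeps is smaller
```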
Re: [PERFORM] Postgres scalability and performance on windows
Tom,

This is the query and the schema.

Query is:

SELECT subq.percentCover, ds.datasetname, ds.maxresolution
FROM (
    SELECT sum(area(intersection(snaptogrid(chunkgeometry, 0.0001),
            GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,-0.138958398039148 50.8478005422809,-0.0963639712296823 50.8471133071392,-0.0974609286275892 50.8201477285483,-0.140030845589332 50.8208343077265))', 4326)))
            * 100 / (0.00114901195862628)) AS percentCover,
           datasetid AS did
    FROM tbl_metadata_chunks
    WHERE chunkgeometry && GeometryFromText('POLYGON((-0.140030845589332 50.8208343077265,-0.138958398039148 50.8478005422809,-0.0963639712296823 50.8471133071392,-0.0974609286275892 50.8201477285483,-0.140030845589332 50.8208343077265))', 4326)
      AND datasetid IN (SELECT datasetid FROM tbl_metadata_dataset WHERE typeofdataid = 1)
    GROUP BY did
    ORDER BY did DESC
) AS subq
INNER JOIN tbl_metadata_dataset AS ds ON subq.did = ds.datasetid
ORDER BY ceil(subq.percentCover), 1/ds.maxresolution DESC;

Schema is:

Table 1:

CREATE TABLE public.tbl_metadata_dataset (
    datasetname varchar(70) NOT NULL,
    maxresolution real,
    typeofdataid integer NOT NULL,
    datasetid serial NOT NULL,
    CONSTRAINT PK_Dataset PRIMARY KEY (datasetid)
);

-- Indexes
CREATE INDEX dsnameindex ON tbl_metadata_dataset USING btree (datasetname);

-- Owner
ALTER TABLE public.tbl_metadata_dataset OWNER TO postgres;

-- Triggers
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196039 AFTER DELETE ON tbl_metadata_dataset
    FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_noaction_del('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196040 AFTER UPDATE ON tbl_metadata_dataset
    FROM tbl_metadata_chunks NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_noaction_upd('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');

Table 2:

CREATE TABLE public.tbl_metadata_chunks (
    chunkid serial NOT NULL,
    chunkgeometry geometry NOT NULL,
    datasetid integer NOT NULL,
    CONSTRAINT tbl_metadata_chunks_pkey PRIMARY KEY (chunkid),
    CONSTRAINT dsid FOREIGN KEY (datasetid) REFERENCES tbl_metadata_dataset(datasetid)
);

-- Indexes
CREATE INDEX idx_dsid ON tbl_metadata_chunks USING btree (datasetid);
CREATE UNIQUE INDEX tbl_metadata_chunks_idx2 ON tbl_metadata_chunks USING btree (nativetlx, nativetly, datasetid);
CREATE INDEX tbl_metadata_chunks_idx3 ON tbl_metadata_chunks USING gist (chunkgeometry);

-- Owner
ALTER TABLE public.tbl_metadata_chunks OWNER TO postgres;

-- Triggers
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2194515 AFTER DELETE ON tbl_metadata_chunks
    FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_restrict_del('fk', 'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED', 'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2194516 AFTER UPDATE ON tbl_metadata_chunks
    FROM tbl_metadata_chunkinfo NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_restrict_upd('fk', 'tbl_metadata_chunkinfo', 'tbl_metadata_chunks', 'UNSPECIFIED', 'chunkid', 'chunkid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196037 AFTER INSERT ON tbl_metadata_chunks
    FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_check_ins('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');
CREATE CONSTRAINT TRIGGER RI_ConstraintTrigger_2196038 AFTER UPDATE ON tbl_metadata_chunks
    FROM tbl_metadata_dataset NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW
    EXECUTE PROCEDURE RI_FKey_check_upd('dsid', 'tbl_metadata_chunks', 'tbl_metadata_dataset', 'UNSPECIFIED', 'datasetid', 'datasetid');

-----Original Message-----
From: Frank Wiles [mailto:[EMAIL PROTECTED]]
Sent: 24 November 2006 17:05
To: Guido Neitzer
Cc: Gopal; pgsql-performance@postgresql.org
Subject: Re: [PERFORM]
Postgres scalability and performance on windows

On Fri, 24 Nov 2006 09:22:45 +0100 Guido Neitzer [EMAIL PROTECTED] wrote:

> > effective_cache_size = 82728 # typically 8KB each
>
> Hmm. I don't know what the real effect of this might be as the doc
> states: "This parameter has no effect on the size of shared memory
> allocated by PostgreSQL, nor does it reserve kernel disk cache; it is
> used only for estimation purposes."

This is a hint to the optimizer about how much of the database may be in the OS-level cache.

 - Frank Wiles [EMAIL PROTECTED] http://www.wiles.org
Re: [PERFORM] Postgres scalability and performance on windows
Hi,

Thanks for your suggestions. Here's the output of the EXPLAIN ANALYZE. I'll change the shared_buffers and look at the behaviour again.

Limit  (cost=59.53..59.53 rows=1 width=28) (actual time=15.681..15.681 rows=1 loops=1)
  ->  Sort  (cost=59.53..59.53 rows=1 width=28) (actual time=15.678..15.678 rows=1 loops=1)
        Sort Key: ceil(subq.percentcover), (1::double precision / ds.maxresolution)
        ->  Hash Join  (cost=58.19..59.52 rows=1 width=28) (actual time=15.630..15.663 rows=2 loops=1)
              Hash Cond: ("outer".datasetid = "inner".did)
              ->  Seq Scan on tbl_metadata_dataset ds  (cost=0.00..1.21 rows=21 width=24) (actual time=0.006..0.021 rows=21 loops=1)
              ->  Hash  (cost=58.18..58.18 rows=1 width=12) (actual time=15.591..15.591 rows=2 loops=1)
                    ->  Sort  (cost=58.17..58.17 rows=1 width=117) (actual time=15.585..15.586 rows=2 loops=1)
                          Sort Key: tbl_metadata_chunks.datasetid
                          ->  HashAggregate  (cost=58.13..58.16 rows=1 width=117) (actual time=15.572..15.573 rows=2 loops=1)
                                ->  Hash IN Join  (cost=3.34..58.10 rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1)
                                      Hash Cond: ("outer".datasetid = "inner".datasetid)
                                      ->  Bitmap Heap Scan on tbl_metadata_chunks  (cost=2.05..56.67 rows=14 width=117) (actual time=0.204..0.384 rows=60 loops=1)
                                            Filter: (chunkgeometry && '010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC1BF608F3D1 (..)
                                            ->  Bitmap Index Scan on tbl_metadata_chunks_idx3  (cost=0.00..2.05 rows=14 width=0) (actual time=0.192..0.192 rows=60 loops=1)
                                                  Index Cond: (chunkgeometry && '010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B8BF7700CC99FA68494058631EDF87ECC (..)
                                      ->  Hash  (cost=1.26..1.26 rows=10 width=4) (actual time=0.037..0.037 rows=10 loops=1)
                                            ->  Seq Scan on tbl_metadata_dataset  (cost=0.00..1.26 rows=10 width=4) (actual time=0.005..0.024 rows=10 loops=1)
                                                  Filter: (typeofdataid = 1)
Total runtime: 15.871 ms

Gopal
[PERFORM] Postgres scalability and performance on windows
Hi all,

I have a postgres installation that's running at 70-80% CPU usage, while an MSSQL7 installation did 'roughly' the same thing with 1-2% CPU load.

Here's the scenario: 300 queries/second.

Server: Postgres 8.1.4 on Win2k Server
CPU: Dual Xeon 3.6 GHz
Memory: 4GB RAM
Disks: 3 x 36GB, 15K RPM SCSI

C# based web application calling postgres functions using npgsql 0.7. It's an almost completely read-only db apart from fortnightly updates.

Table 1 - about 300,000 rows with simple rectangles
Table 2 - 1 million rows
Total size: 300MB

Functions: simple coordinate reprojection and intersection query + inner join of table1 and table2. I think I have all the right indexes defined, and indeed the performance for queries under low loads is fast.

==
postgresql.conf has the following settings:

max_connections = 150
shared_buffers = 2             # min 16 or max_connections*2, 8KB each
temp_buffers = 2000            # min 100, 8KB each
max_prepared_transactions = 25 # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 512                 # min 64, size in KB
#maintenance_work_mem = 16384  # min 1024, size in KB
max_stack_depth = 2048
effective_cache_size = 82728   # typically 8KB each
random_page_cost = 4           # units are one sequential page fetch
==

SQL Server caches all the data in memory, which makes it faster (it uses about 1.2GB of memory, which is fine). But postgres has everything spread across 10-15 processes, each using about 10-30MB, not nearly enough to cache all the data, and it ends up doing a lot of disk reads. I've read that postgres depends on the OS to cache the files; I wonder if this is not happening on Windows. In any case, I cannot believe that having 15-20 processes running on Windows helps. Why not spawn threads instead of processes, which might be far less expensive and more efficient? Is there any way of doing this?
My question is: should I just accept the performance I am getting as the limit on Windows, or should I be looking at some other params that I might have missed?

Thanks,
Gopal
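For comparison, a config sketch for a 4GB read-mostly box, with illustrative values only (not taken from this thread, and worth benchmarking before adoption), since the posted shared_buffers and work_mem are far below what the hardware can support:

```
shared_buffers = 50000          # ~400MB at 8KB pages
work_mem = 8192                 # KB per sort/hash operation, per connection
effective_cache_size = 262144   # ~2GB hint to the planner; allocates nothing
```

One caveat: on the Windows port of that era, very large shared_buffers settings often performed poorly, so the OS filesystem cache ends up doing much of the work in any case.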