Re: [PERFORM] Analysis Function
On Sun, Jun 13, 2010 at 17:58, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: On Sun, Jun 13, 2010 at 17:42, Tom Lane t...@sss.pgh.pa.us wrote: ... (We presumably want timezone to default to the system timezone setting, but I wonder how we should make that work --- should an empty string be treated as meaning that?) Umm. NULL could be made to mean that, or we could provide two different versions - one that takes TZ and one that doesn't. Using NULL like that seems a bit awkward: for one thing it'd mean the function couldn't be STRICT, and also it'd be bizarre that only this one argument could be null without leading to a null result. Hmm, yeah. And two separate functions isn't good either. Basically, I think it's important that there be a way to specify an explicit parameter value that behaves identically to the default. In that case, empty string seems fairly reasonable - if you look at the text-based parsing, that's what we do if the timezone is an empty string (meaning not specified). And a third, construct_time(), no? Yeah, maybe ... do you think there's any demand for it? Yes, I think there is. Plus, it's for completeness :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Analysis Function
On Sun, Jun 13, 2010 at 21:19, David Jarvis thanga...@gmail.com wrote: Hi, It's not immediately obvious what the default value of timezone will be? The system's locale, like now(); documentation can clarify. By named parameter, I meant default value. You could construct a timestamp variable using: construct_timestamp( year := 1900, hour := 1 ) When I read that code, the first thing I think it should return is: 1900-01-01 01:00:00.-07 I agree construct_timestamp( hour := 1 ) and construct_date() are errors: year is required. Does it make sense to allow minutes when hours isn't specified? Or should we simply say that for each of the date and the time part, to specify at level n you need to have everything from the top up to level n-1 specified? E.g. month requires year to be specified, day requires both year and month etc? I prefer to_timestamp and to_date over the more verbose construct_timestamp. Yeah, I agree with that. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[PERFORM] query tuning help
Can anyone please help me in tuning the query?

explain select * from (select * from crmentity where deleted = 0 and createdtime between (now() - interval '6 month') and now() ) as crmentity inner join (select * from activity where activitytype = 'Emails' and date_start between (now() - interval '6 month') and now()) as activity on crmentity.crmid=activity.activityid inner join emaildetails on emaildetails.emailid = crmentity.crmid inner join vantage_email_track on vantage_email_track.mailid=emaildetails.emailid left join seactivityrel on seactivityrel.activityid = emaildetails.emailid

QUERY PLAN
Nested Loop Left Join (cost=8725.27..17121.20 rows=197 width=581)
  ->  Nested Loop (cost=8725.27..16805.64 rows=7 width=573)
        ->  Hash Join (cost=8725.27..10643.08 rows=789 width=292)
              Hash Cond: (emaildetails.emailid = public.activity.activityid)
              ->  Seq Scan on emaildetails (cost=0.00..1686.95 rows=44595 width=186)
              ->  Hash (cost=8664.41..8664.41 rows=4869 width=106)
                    ->  Hash Join (cost=5288.61..8664.41 rows=4869 width=106)
                          Hash Cond: (vantage_email_track.mailid = public.activity.activityid)
                          ->  Seq Scan on vantage_email_track (cost=0.00..1324.52 rows=88852 width=12)
                          ->  Hash (cost=4879.22..4879.22 rows=15071 width=94)
                                ->  Bitmap Heap Scan on activity (cost=392.45..4879.22 rows=15071 width=94)
                                      Recheck Cond: (((activitytype)::text = 'Emails'::text) AND (date_start >= (now() - '6 mons'::interval)) AND (date_start <= now()))
                                      ->  Bitmap Index Scan on activity_activitytype_date_start_idx (cost=0.00..388.68 rows=15071 width=0)
                                            Index Cond: (((activitytype)::text = 'Emails'::text) AND (date_start >= (now() - '6 mons'::interval)) AND (date_start <= now()))
        ->  Index Scan using crmentity_pkey on crmentity (cost=0.00..7.80 rows=1 width=281)
              Index Cond: (public.crmentity.crmid = public.activity.activityid)
              Filter: ((public.crmentity.deleted = 0) AND (public.crmentity.createdtime <= now()) AND (public.crmentity.createdtime >= (now() - '6 mons'::interval)))
  ->  Index Scan using seactivityrel_activityid_idx on seactivityrel (cost=0.00..39.57 rows=441 width=8)
        Index Cond: (seactivityrel.activityid = emaildetails.emailid)
(19 rows)
Re: [PERFORM] Analysis Function
Does it make sense to allow minutes when hours isn't specified? For time, 00 seems a reasonable default for all values; clearly document the defaults. Also, having a default makes the code simpler than level n plus level n-1. (Not to mention explaining it.) ;-) SELECT to_timestamp( minutes := 19 ) -- error (year not specified) SELECT to_timestamp( year := 2000, minutes := 19 ) -- 2000-01-01 00:19:00.-07 Dave
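The defaulting rules Dave describes can be sketched with an ordinary SQL function using parameter defaults, on a server new enough to support defaults and the := named-argument call syntax. This construct_timestamp is a hypothetical illustration (it just concatenates the parts), not the proposed built-in:

```sql
-- Hypothetical sketch of the defaulting rules under discussion:
-- year is required; month/day default to 1, time fields to 0.
CREATE FUNCTION construct_timestamp(
    year  int,
    month int    DEFAULT 1,
    day   int    DEFAULT 1,
    hours int    DEFAULT 0,
    mins  int    DEFAULT 0,
    secs  float8 DEFAULT 0
) RETURNS timestamp AS $$
    SELECT ($1 || '-' || $2 || '-' || $3 || ' ' ||
            $4 || ':' || $5 || ':' || $6)::timestamp;
$$ LANGUAGE sql;

-- year is the only required argument; everything else falls back to its default:
SELECT construct_timestamp(year := 2000, mins := 19);
-- 2000-01-01 00:19:00
```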
[PERFORM] Dead lock
Hi all, I have 2 databases trying to perform an update query at the same time on the same table in a third database using dblink. I'm getting a deadlock exception: ERROR: deadlock detected DETAIL: Process 27305 waits for ShareLock on transaction 55575; blocked by process 27304. Process 27304 waits for ShareLock on transaction 55576; blocked by process 27305. HINT: See server log for query details. Actually the following function is installed on 2 dbs, DB1 and DB2. This function issues an update query on DB3. When this function is running simultaneously on DB1 and DB2, it produces a deadlock, making one of the functions (in DB1 or DB2) stop with the above exception. Is it normal? Shouldn't postgres be able to handle such situations, for ex: let one transaction wait until the other commits or rolls back, then continue with the first transaction? Is there a parameter that should be set in postgresql.conf to allow handling of concurrent transactions...?

CREATE OR REPLACE FUNCTION TEST_DB_LINK(VARCHAR) RETURNS VOID AS '
DECLARE
  C INTEGER;
  P ALIAS FOR $1;
  DUMMY VARCHAR;
BEGIN
  C := 0;
  LOOP
    EXIT WHEN C > 15;
    C := C + 1;
    SELECT INTO DUMMY DBLINK_EXEC(''CONNECTION_STRING TO DB3'', ''UPDATE IN_FICHE_PRODUIT SET VALIDE = 1'');
    RAISE NOTICE ''%, %'', C, P;
  END LOOP;
END;
' LANGUAGE 'plpgsql';

Thanks for your time.
Re: [PERFORM] Dead lock
On 14/06/10 12:50, Elias Ghanem wrote: SELECT INTO DUMMY DBLINK_EXEC(''CONNECTION_STRING TO DB3'', ''UPDATE IN_FICHE_PRODUIT SET VALIDE = 1''); If there's more than one value in that table, an explicit ORDER BY might help (otherwise you could get the situation where query A will update in the order 1,2,3 and query B will do 3,2,1 so neither will be able to get the requested locks until the other query has finished). Tom
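A sketch of Tom's suggestion, assuming IN_FICHE_PRODUIT has a primary key (an id column is assumed here for illustration): take the row locks in a deterministic order before the update, so the two dblink sessions queue behind each other instead of deadlocking.

```sql
-- Run via DBLINK_EXEC from both DB1 and DB2. Both sessions now lock rows in
-- ascending id order, so whichever arrives second simply waits rather than
-- holding row A while needing row B.
UPDATE in_fiche_produit
SET valide = 1
WHERE id IN (SELECT id FROM in_fiche_produit ORDER BY id FOR UPDATE);
```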
Re: [PERFORM] query tuning help
On 06/14/2010 05:41 AM, AI Rumman wrote: Can any one please help me in tuning the query? explain select * from (select * from crmentity where deleted = 0 and createdtime between (now() - interval '6 month') and now() ) as crmentity inner join (select * from activity where activitytype = 'Emails' and date_start between (now() - interval '6 month') and now()) as activity on crmentity.crmid=activity.activityid inner join emaildetails on emaildetails.emailid = crmentity.crmid inner join vantage_email_track on vantage_email_track.mailid=emaildetails.emailid left join seactivityrel on seactivityrel.activityid = emaildetails.emailid Can you send us 'explain analyze' too? -> Seq Scan on emaildetails (cost=0.00..1686.95 rows=44595 width=186) -> Seq Scan on vantage_email_track (cost=0.00..1324.52 rows=88852 width=12) do you have indexes on emaildetails(emailid) and vantage_email_track(mailid)? -Andy
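If those indexes do turn out to be missing, adding them is straightforward (index names here are illustrative), and a fresh ANALYZE lets the planner consider index scans for those joins:

```sql
-- Indexes on the join columns Andy asks about; names are only suggestions.
CREATE INDEX emaildetails_emailid_idx ON emaildetails (emailid);
CREATE INDEX vantage_email_track_mailid_idx ON vantage_email_track (mailid);

-- Refresh planner statistics so the new indexes are considered.
ANALYZE emaildetails;
ANALYZE vantage_email_track;
```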
Re: [PERFORM] Analysis Function
Magnus Hagander mag...@hagander.net writes: On Sun, Jun 13, 2010 at 21:19, David Jarvis thanga...@gmail.com wrote: I prefer to_timestamp and to_date over the more verbose construct_timestamp. Yeah, I agree with that. Those names are already taken. It will cause confusion (of both people and machines) if you try to overload them with this. regards, tom lane
Re: [PERFORM] Dead lock
On 06/14/2010 06:50 AM, Elias Ghanem wrote: Hi all, I have 2 databases trying to perform an update query at the same time on the same table in a third database using dblink. I'm getting a deadlock exception: ERROR: deadlock detected DETAIL: Process 27305 waits for ShareLock on transaction 55575; blocked by process 27304. Process 27304 waits for ShareLock on transaction 55576; blocked by process 27305. HINT: See server log for query details. Actually the following function is installed on 2 dbs, DB1 and DB2. This function issues an update query on DB3. When this function is running simultaneously on DB1 and DB2, it produces a deadlock, making one of the functions (in DB1 or DB2) stop with the above exception. Is it normal? Shouldn't postgres be able to handle such situations, for ex: let one transaction wait until the other commits or rolls back, then continue with the first transaction? Is there a parameter that should be set in postgresql.conf to allow handling of concurrent transactions...?

CREATE OR REPLACE FUNCTION TEST_DB_LINK(VARCHAR) RETURNS VOID AS '
DECLARE
  C INTEGER;
  P ALIAS FOR $1;
  DUMMY VARCHAR;
BEGIN
  C := 0;
  LOOP
    EXIT WHEN C > 15;
    C := C + 1;
    SELECT INTO DUMMY DBLINK_EXEC(''CONNECTION_STRING TO DB3'', ''UPDATE IN_FICHE_PRODUIT SET VALIDE = 1'');
    RAISE NOTICE ''%, %'', C, P;
  END LOOP;
END;
' LANGUAGE 'plpgsql';

Thanks for your time.

I think PG is doing what you want, if you think about it. You start two transactions at the same time. A transaction is defined as: do this set of operations, all of which must succeed or fail atomically. One transaction cannot update the exact same row as another transaction, because that would break the second transaction's must-succeed rule. -Andy
Re: [PERFORM] query hangs
AI Rumman rumman...@gmail.com wrote: [It takes a long time to return 2.6 million rows.] Any more idea, please. I don't recall you telling us exactly what the environment and connection type is in which you're trying to return this large result set. Any specific suggestions would depend on that information. I do wonder why you are returning 2.6 million rows. A result set that large is rarely useful directly (except during data conversion or loading of some sort). Is there any filtering or aggregation happening on the client side with the received rows? If so, my first suggestion would be to make that part of the query, rather than part of the client code. Is table partition a good solution for query optimization? Table partitioning is useful in some cases, but you haven't told us anything yet to indicate that it would help here. -Kevin
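To illustrate Kevin's point about moving client-side work into the query: if the client fetches 2.6 million rows only to filter or summarize them, a server-side aggregate ships a handful of rows instead. Table and column names below are purely illustrative:

```sql
-- Instead of SELECT * FROM big_table; and counting rows in the client,
-- let the server aggregate and return only the daily summary:
SELECT date_trunc('day', created_at) AS day,
       count(*)                      AS n
FROM   big_table
WHERE  created_at >= now() - interval '6 month'
GROUP  BY 1
ORDER  BY 1;
```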
[PERFORM] Fwd: Dead lock
Hi, Actually I guess the problem is related to the way PG acquires locks on the rows that will be updated. Suppose the update query will affect 5 rows: A, B, C, D and E. Apparently the following scenario is happening: 1- Transaction1 locks row A 2- Transaction2 locks row B 3- Transaction1 updates row A 4- Transaction2 updates row B 5- Transaction1 *tries* to acquire a lock on row B (and fails because row B is still locked by Transaction2) 6- Transaction2 *tries* to acquire a lock on row A (and fails because row A is still locked by Transaction1) Hence the deadlock. Is this a plausible explanation of what is going on? If yes, what can be done to avoid the deadlock? Thanks again. Original Message Subject: Dead lock Date: Mon, 14 Jun 2010 14:50:43 +0300 From: Elias Ghanem e.gha...@acteos.com To: pgsql-performance@postgresql.org Hi all, I have 2 databases trying to perform an update query at the same time on the same table in a third database using dblink. I'm getting a deadlock exception: ERROR: deadlock detected DETAIL: Process 27305 waits for ShareLock on transaction 55575; blocked by process 27304. Process 27304 waits for ShareLock on transaction 55576; blocked by process 27305. HINT: See server log for query details. Actually the following function is installed on 2 dbs, DB1 and DB2. This function issues an update query on DB3. When this function is running simultaneously on DB1 and DB2, it produces a deadlock, making one of the functions (in DB1 or DB2) stop with the above exception. Is it normal? Shouldn't postgres be able to handle such situations, for ex: let one transaction wait until the other commits or rolls back, then continue with the first transaction? Is there a parameter that should be set in postgresql.conf to allow handling of concurrent transactions...?

CREATE OR REPLACE FUNCTION TEST_DB_LINK(VARCHAR) RETURNS VOID AS '
DECLARE
  C INTEGER;
  P ALIAS FOR $1;
  DUMMY VARCHAR;
BEGIN
  C := 0;
  LOOP
    EXIT WHEN C > 15;
    C := C + 1;
    SELECT INTO DUMMY DBLINK_EXEC(''CONNECTION_STRING TO DB3'', ''UPDATE IN_FICHE_PRODUIT SET VALIDE = 1'');
    RAISE NOTICE ''%, %'', C, P;
  END LOOP;
END;
' LANGUAGE 'plpgsql';

Thanks for your time.
Re: [PERFORM] Fwd: Dead lock
It's a standard (indeed, required) best practice of concurrent database programming across any brand of database to ensure that multi-row transactions always acquire the locks they use in a predictable order based on row identities, e.g. for the classic banking debit-credit pair, doing something like this (Java / JDBC, simplified for brevity and clarity):

PreparedStatement debit = conn.prepareStatement("update account set balance = balance - ? where acc_no = ? and balance >= ?");
debit.setLong(1, amount);
debit.setLong(2, debit_acct);
debit.setLong(3, amount);
PreparedStatement credit = conn.prepareStatement("update account set balance = balance + ? where acc_no = ?");
credit.setLong(1, amount);
credit.setLong(2, credit_acct);
try {
    // always acquire row locks in increasing account number order
    conn.beginTransaction();
    if (credit_acct < debit_acct) {
        credit.executeUpdate();
        if (debit.executeUpdate() < 1)
            throw new SQLException("Insufficient balance");
    } else {
        if (debit.executeUpdate() < 1)
            throw new SQLException("Insufficient balance");
        credit.executeUpdate();
    }
} catch (SQLException e) {
    System.err.println("Oops, transaction failed: " + e.getMessage());
    conn.rollback();
}
conn.commit();

If you're doing straight SQL bulk updates, then as someone suggested, you could use an ORDER BY on a subquery, but I don't know if that is a guarantee; if you're not actually displaying the results then the DB may be technically allowed to optimize it out from underneath you. The only way to be sure is a cursor / procedure. In short, this boils down to learning more about database programming. PG is performing as it should. Cheers Dave On Mon, Jun 14, 2010 at 10:36 AM, Elias Ghanem e.gha...@acteos.com wrote: Hi, Actually I guess the problem is related to the way PG acquires locks on the rows that will be updated. Suppose the update query will affect 5 rows: A, B, C, D and E.
Apparently the following scenario is happening: 1- Transaction1 locks row A 2- Transaction2 locks row B 3- Transaction1 updates row A 4- Transaction2 updates row B 5- Transaction1 *tries* to acquire a lock on row B (and fails because row B is still locked by Transaction2) 6- Transaction2 *tries* to acquire a lock on row A (and fails because row A is still locked by Transaction1) Hence the deadlock. Is this a plausible explanation of what is going on? If yes, what can be done to avoid the deadlock? Thanks again. Original Message Subject: Dead lock Date: Mon, 14 Jun 2010 14:50:43 +0300 From: Elias Ghanem e.gha...@acteos.com To: pgsql-performance@postgresql.org Hi all, I have 2 databases trying to perform an update query at the same time on the same table in a third database using dblink. I'm getting a deadlock exception: ERROR: deadlock detected DETAIL: Process 27305 waits for ShareLock on transaction 55575; blocked by process 27304. Process 27304 waits for ShareLock on transaction 55576; blocked by process 27305. HINT: See server log for query details. Actually the following function is installed on 2 dbs, DB1 and DB2. This function issues an update query on DB3. When this function is running simultaneously on DB1 and DB2, it produces a deadlock, making one of the functions (in DB1 or DB2) stop with the above exception. Is it normal? Shouldn't postgres be able to handle such situations, for ex: let one transaction wait until the other commits or rolls back, then continue with the first transaction? Is there a parameter that should be set in postgresql.conf to allow handling of concurrent transactions...?

CREATE OR REPLACE FUNCTION TEST_DB_LINK(VARCHAR) RETURNS VOID AS '
DECLARE
  C INTEGER;
  P ALIAS FOR $1;
  DUMMY VARCHAR;
BEGIN
  C := 0;
  LOOP
    EXIT WHEN C > 15;
    C := C + 1;
    SELECT INTO DUMMY DBLINK_EXEC(''CONNECTION_STRING TO DB3'', ''UPDATE IN_FICHE_PRODUIT SET VALIDE = 1'');
    RAISE NOTICE ''%, %'', C, P;
  END LOOP;
END;
' LANGUAGE 'plpgsql';

Thanks for your time.
[PERFORM] RE: [PERFORM] Dbt2 with postgres issues on CentOS-5.3
I am facing sar related issues on Redhat Enterprise Linux64 5.4 too (60G Ram, No Swap space, Xeon Processor). sar -o /var/dbt2_data/PG/Output/driver/dbt2-sys1/sar_raw.out 60 204 |___ sadc 60 205 -z /var/dbt2_data/PG/Output/driver/dbt2-sys1/sar_raw.out It generates the following sar data i.e. …. …. 03:52:43 AM 2.31 03:53:43 AM 2.31 03:54:43 AM 2.28 03:55:43 AM 2.31 03:56:43 AM 1.67 03:57:43 AM 0.29 03:58:43 AM 0.29 04:00:43 AM 0.30 04:04:00 AM 3.52 04:07:07 AM 0.30 04:09:36 AM 0.23 04:12:04 AM 0.36 04:14:25 AM 0.23 04:16:45 AM 0.26 04:19:10 AM 0.24 04:21:30 AM 0.38 04:23:55 AM 0.24 04:26:25 AM 0.35 04:28:48 AM 0.24 04:31:10 AM 0.27 04:33:40 AM 0.33 04:36:45 AM 0.41 04:39:12 AM 0.27 04:41:41 AM 0.26 04:44:11 AM 0.33 04:46:35 AM 0.25 04:49:06 AM 0.33 04:51:27 AM 0.27 04:53:56 AM 0.23 04:56:19 AM 0.36 04:58:43 AM 0.24 05:01:10 AM 0.35 05:03:43 AM 0.33 05:06:53 AM 0.29 05:09:25 AM 0.23 …. …. To fix this issue I have modified sysstat-9.1.2/sadc.c and replaced the signal-based pause (which is not real time) with a select-based pause. That fixed the issue. Thanks.

--- sadc.c.patch ---
--- sadc.c.org	2010-06-14 21:44:18.0 +0500
+++ sadc.c	2010-06-14 22:52:51.693211184 +0500
@@ -33,6 +33,10 @@
 #include <sys/stat.h>
 #include <sys/utsname.h>
 
+#include <sys/types.h>
+#include <sys/time.h>
+#include <time.h>
+
 #include "version.h"
 #include "sa.h"
 #include "rd_stats.h"
@@ -792,6 +796,15 @@
 	}
 }
 
+void pause_new( void )
+{
+	struct timeval tvsel;
+	tvsel.tv_sec  = interval;
+	tvsel.tv_usec = 0;
+
+	select( 0, NULL, NULL, NULL, &tvsel );
+}
+
 /* ***
  * Main loop: Read stats from the relevant sources and display them.
@@ -899,7 +912,7 @@
 	}
 	if (count) {
-		pause();
+		pause_new();
 	}
 	/* Rotate activity file if necessary */
---

Best Regards, Asif Naeem

Date: Wed, 21 Apr 2010 18:10:35 -0700 Subject: Re: [PERFORM] Dbt2 with postgres issues on CentOS-5.3 From: mark...@gmail.com To: anaeem...@hotmail.com CC: pgsql-performance@postgresql.org 2010/4/20 MUHAMMAD ASIF anaeem...@hotmail.com: Hi, I am using dbt2 on Linux 64 (CentOS release 5.3 (Final)). I have compiled the latest postgresql-8.4.3 code on the machine and run dbt2 against it. I am a little confused about the results. I ran dbt2 with the following configuration i.e. DBT2 Options : WAREHOUSES=75 DB_CONNECTIONS=20 REGRESS_DURATION=1 #HOURS REGRESS_DURATION_SEC=$((60*60*$REGRESS_DURATION)) DBT2 Command : ./dbt2-pgsql-create-db ./dbt2-pgsql-build-db -d $DBDATA -g -r -w $WAREHOUSES ./dbt2-run-workload -a pgsql -c $DB_CONNECTIONS -d $REGRESS_DURATION_SEC -w $WAREHOUSES -o $OUTPUT_DIR ./dbt2-pgsql-stop-db I am not able to understand the sar related graphs. Iostat, mpstat and vmstat results are similar but sar results are strange. I tried to explore the dbt2 source code to find out how the graphs are drawn and why sar results differ. DBT2.pm:189 reads sar.out, parses it, and assumes 1 minute elapsed time between each record i.e. That is certainly a weakness in the logic of the perl modules in plotting the charts accurately. I wouldn't be surprised if the other stat tools suffer the same problem. Regards, Mark
Re: [PERFORM] query hangs
On Thu, Jun 10, 2010 at 5:36 AM, AI Rumman rumman...@gmail.com wrote: I found only AccessShareLock in pg_locks during the query. And the query does not return data though I have been waiting for 10 mins. Do you have any idea ? I have queries that run for hours. As long as it's using CPU / IO (use top in unix, whatever in windows to see) it's not hung, it's just taking longer than you expected. Those are not the same thing at all. Seeing as how you're joining three tables with millions of rows with no where clause, it's gonna take some time to complete. Go grab a sandwich, etc, come back when it's done.
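Scott's top check has a server-side equivalent: from a second session, pg_stat_activity distinguishes a query that is merely long-running from one stuck waiting on a lock (column names as in the 8.x releases discussed on this list):

```sql
-- waiting = true means the backend is blocked on a lock;
-- waiting = false with an old query_start just means it is still working.
SELECT procpid, waiting, query_start, current_query
FROM   pg_stat_activity
WHERE  current_query <> '<IDLE>';
```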
Re: [PERFORM] Fwd: Dead lock
On Mon, Jun 14, 2010 at 11:58 AM, Dave Crooke dcro...@gmail.com wrote: If you're doing straight SQL bulk updates, then as someone suggested, you could use an ORDER BY on a subquery, but I don't know if that is a guarantee, if you're not actually displaying the results then the DB may be technically allowed to optimize it out from underneath you. The only way to be sure is a cursor / procedure. 'order by' should be safe if you use SELECT...FOR UPDATE. update doesn't have an order by clause. Using cursor/procedure vs a query is not the material point; you have to make sure locks are acquired in a regular way. update foo set x=x where id in (select * from bar order by x) does look dangerous. I think: update foo set x=x where id in (select * from bar order by x for update) should be ok. I don't usually do it that way. merlin
Re: [PERFORM] requested shared memory size overflows size_t
Hi Bob, Thanks a lot. Here's my best attempt to answer your questions: The VM is setup with a virtual disk image dynamically expanding to fill an allocation of 300GB on a fast, local hard drive (avg read speed = 778MB/s). WAL files can have their own disk, but how significantly would this affect our performance? The filesystem of the host OS is NTFS (Windows Server 2008 OS 64), the guest filesystem is Ext2 (Ubuntu 64). The workload is OLAP (lots of large, complex queries on large tables run in sequence). In addition, I have reconfigured my server to use more memory. Here's a detailed blow by blow of how I reconfigured my system to get better performance (for anyone who might be interested)... In order to increase the shared memory on Ubuntu I edited the System V IPC values using sysctl: sysctl -w kernel.shmmax=16106127360 sysctl -w kernel.shmall=2097152 I had some fun with permissions as I somehow managed to change the owner of postgresql.conf to root where it needed to be postgres, resulting in failure to start the service (fixed with chown postgres:postgres ./data/postgresql.conf and chmod u=rwx ./data -R). I changed the following params in my configuration file..
default_statistics_target=1 maintenance_work_mem=512MB work_mem=512MB shared_buffers=512MB wal_buffers=128MB With this config, the following command took 6,400,000ms: EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org; With plan: Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278 width=232) (actual time=0.379..464270.682 rows=2961 loops=1) Total runtime: 6398238.890 ms With these changes to the previous config, the same command took 5,610,000ms: maintenance_work_mem=4GB work_mem=4GB shared_buffers=4GB effective_cache_size=4GB wal_buffers=1GB Resulting plan: Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572 width=232) (actual time=0.094..452793.430 rows=2961 loops=1) Total runtime: 5614140.786 ms Then I performed these changes to the postgresql.conf file: max_connections=3 effective_cache_size=15GB maintenance_work_mem=5GB shared_buffers=7000MB work_mem=5GB And ran this query (for a quick look - can't afford the time for the previous tests..): EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id < 10; Result: Index Scan using match_data_pkey1 on match_data (cost=0.00..15662.17 rows=4490 width=232) (actual time=27.055..1908.027 rows=9 loops=1) Index Cond: (match_data_id < 10) Total runtime: 25909.372 ms I then ran EnterpriseDB's tuner on my postgres install (for a dedicated machine) and got the following settings and results: EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id < 10; Index Scan using match_data_pkey1 on match_data (cost=0.00..13734.54 rows=4495 width=232) (actual time=0.348..2928.844 rows=9 loops=1) Index Cond: (match_data_id < 10) Total runtime: 1066580.293 ms For now, I will go with the config using 7000MB shared_buffers. Any suggestions on how I can further optimise this config for a single session, 64-bit install utilising ALL of 96GB RAM.
I will spend the next week making the case for a native install of Linux, but first we need to be 100% sure that is the only way to get the most out of Postgres on this machine. Thanks very much. I now feel I am at a position where I can really explore and find the optimal configuration for my system, but would still appreciate any suggestions. Cheers, Tom On 11/06/2010 07:25, Bob Lunney wrote: Tom, First off, I wouldn't use a VM if I could help it, however, sometimes you have to make compromises. With a 16 Gb machine running 64-bit Ubuntu and only PostgreSQL, I'd start by allocating 4 Gb to shared_buffers. That should leave more than enough room for the OS and file system cache. Then I'd begin testing by measuring response times of representative queries with significant amounts of data. Also, what is the disk setup for the box? Filesystem? Can WAL files have their own disk? Is the workload OLTP or OLAP, or a mixture of both? There is more that goes into tuning a PG server for good performance than simply installing the software, setting a couple of GUCs and running it. Bob --- On Thu, 6/10/10, Tom Wilcox hungry...@gmail.com wrote: From: Tom Wilcox hungry...@gmail.com Subject: Re: [PERFORM] requested shared memory size overflows size_t To: Bob Lunney bob_lun...@yahoo.com Cc: Robert Haas robertmh...@gmail.com, pgsql-performance@postgresql.org Date: Thursday, June 10, 2010, 10:45 AM Thanks guys. I am currently installing Pg64 onto a Ubuntu Server 64-bit installation running as a VM in VirtualBox with 16GB of RAM accessible. If what you say is true then what do you suggest I do to configure my new setup to best use the available 16GB (96GB and native install eventually if the test goes well) of RAM on Linux. I was considering starting by using EnterpriseDB's tuner to see if that
Re: [PERFORM] Large (almost 50%!) performance drop after upgrading to 8.4.4?
Alvaro Herrera wrote: Excerpts from Tom Lane's message of jue jun 10 11:46:25 -0400 2010: Yes, the folks at commandprompt need to be told about this. Loudly. It's a serious packaging error. Just notified Lacey, the packager (not so loudly, though); she's working on new packages, and apologizes for the inconvenience. Hello Everyone, New packages for 8.4.4 on CentOS 5.5 and RHEL 5.5 (all arches), have been built, and are available in the PGDG repo. http://yum.pgsqlrpms.org/8.4/redhat/rhel-5-i386/ http://yum.pgsqlrpms.org/8.4/redhat/rhel-5-x86_64/ Output from pg_config --configure --version is below. x86_64: '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--disable-rpath' '--with-perl' '--with-python' '--with-tcl' '--with-tclconfig=/usr/lib64' '--with-openssl' '--with-pam' '--with-krb5' '--with-gssapi' '--with-includes=/usr/include' '--with-libraries=/usr/lib64' '--enable-nls' '--enable-thread-safety' '--with-libxml' '--with-libxslt' '--with-ldap' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/pgsql' '--datadir=/usr/share/pgsql' '--with-docdir=/usr/share/doc' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/include/et' 'CPPFLAGS= -I/usr/include/et' PostgreSQL 8.4.4 i386: '--build=i686-redhat-linux-gnu' '--host=i686-redhat-linux-gnu' '--target=i386-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' 
'--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--disable-rpath' '--with-perl' '--with-python' '--with-tcl' '--with-tclconfig=/usr/lib' '--with-openssl' '--with-pam' '--with-krb5' '--with-gssapi' '--with-includes=/usr/include' '--with-libraries=/usr/lib' '--enable-nls' '--enable-thread-safety' '--with-libxml' '--with-libxslt' '--with-ldap' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/pgsql' '--datadir=/usr/share/pgsql' '--with-docdir=/usr/share/doc' 'build_alias=i686-redhat-linux-gnu' 'host_alias=i686-redhat-linux-gnu' 'target_alias=i386-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -I/usr/include/et' 'CPPFLAGS= -I/usr/include/et' PostgreSQL 8.4.4 Again, I extend deep apologies for the inconvenience. If there is anything further we can help with, please let us know. Regards, Lacey -- Lacey Powers The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 ext 104 PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] requested shared memory size overflows size_t
On Mon, Jun 14, 2010 at 2:53 PM, Tom Wilcox hungry...@gmail.com wrote: maintenance_work_mem=4GB work_mem=4GB shared_buffers=4GB effective_cache_size=4GB wal_buffers=1GB It's pretty easy to drive your system into swap with such a large value for work_mem - you'd better monitor that carefully. The default value for wal_buffers is 64kB. I can't imagine why you'd need to increase that by four orders of magnitude. I'm not sure whether it will cause you a problem or not, but you're allocating quite a lot of shared memory that way that you might not really need. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
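One way to act on Robert's warning is to leave the global work_mem modest and raise it only around the big OLAP statement; SET LOCAL confines the setting to the current transaction (the 1GB value here is illustrative, not a recommendation):

```sql
BEGIN;
SET LOCAL work_mem = '1GB';  -- applies per sort/hash operation, per backend
UPDATE nlpg.match_data SET org = org;
COMMIT;                      -- work_mem reverts automatically at transaction end
```

Because work_mem is allocated per sort or hash node and per backend, scoping it this way caps the exposure to a single session instead of every connection at once.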
Re: [PERFORM] requested shared memory size overflows size_t
Tom I always prefer to choose apps based on business needs, then the OS based on the needs for the app. Cynically, I often feel that the best answer to "we have a policy that says we're only allowed to use operating system X" is to ignore the policy; the kind of people ignorant enough to be that blinkered are usually not tech-savvy enough to notice when it gets flouted :-) More seriously, is the policy "Windows only on the metal", or could you run e.g. VMware ESX Server? I/O is the area that takes the biggest hit in virtualization, and ESX Server has far less overhead loss than either Hyper-V (which I presume you are using) or VMware Workstation (on NT kernels). If it's a Windows-only policy, then perhaps you can run those traps in reverse and switch to a Windows database, i.e. Microsoft SQL Server. Cheers Dave On Mon, Jun 14, 2010 at 1:53 PM, Tom Wilcox hungry...@gmail.com wrote: Hi Bob, Thanks a lot. Here's my best attempt to answer your questions: The VM is set up with a virtual disk image dynamically expanding to fill an allocation of 300GB on a fast, local hard drive (avg read speed = 778MB/s). WAL files can have their own disk, but how significantly would this affect our performance? The filesystem of the host OS is NTFS (Windows Server 2008 64-bit), the guest filesystem is Ext2 (Ubuntu 64). The workload is OLAP (lots of large, complex queries on large tables run in sequence). In addition, I have reconfigured my server to use more memory. Here's a detailed blow-by-blow of how I reconfigured my system to get better performance (for anyone who might be interested)... In order to increase the shared memory on Ubuntu I edited the System V IPC values using sysctl: sysctl -w kernel.shmmax=16106127360 sysctl -w kernel.shmall=2097152 I had some fun with permissions as I somehow managed to change the owner of postgresql.conf to root where it needed to be postgres, resulting in failure to start the service.
(Fixed with chown postgres:postgres ./data/postgresql.conf and chmod u=rwx ./data -R.) I changed the following params in my configuration file: default_statistics_target=1 maintenance_work_mem=512MB work_mem=512MB shared_buffers=512MB wal_buffers=128MB With this config, the following command took 6,400,000ms: EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org; With plan: Seq Scan on match_data (cost=0.00..1392900.78 rows=32237278 width=232) (actual time=0.379..464270.682 rows=2961 loops=1) Total runtime: 6398238.890 ms With these changes to the previous config, the same command took 5,610,000ms: maintenance_work_mem=4GB work_mem=4GB shared_buffers=4GB effective_cache_size=4GB wal_buffers=1GB Resulting plan: Seq Scan on match_data (cost=0.00..2340147.72 rows=30888572 width=232) (actual time=0.094..452793.430 rows=2961 loops=1) Total runtime: 5614140.786 ms Then I made these changes to the postgresql.conf file: max_connections=3 effective_cache_size=15GB maintenance_work_mem=5GB shared_buffers=7000MB work_mem=5GB And ran this query (for a quick look - can't afford the time for the previous tests): EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id 10; Result: Index Scan using match_data_pkey1 on match_data (cost=0.00..15662.17 rows=4490 width=232) (actual time=27.055..1908.027 rows=9 loops=1) Index Cond: (match_data_id 10) Total runtime: 25909.372 ms I then ran EnterpriseDB's Tuner on my postgres install (for a dedicated machine) and got the following settings and results: EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id 10; Index Scan using match_data_pkey1 on match_data (cost=0.00..13734.54 rows=4495 width=232) (actual time=0.348..2928.844 rows=9 loops=1) Index Cond: (match_data_id 10) Total runtime: 1066580.293 ms For now, I will go with the config using 7000MB shared_buffers. Any suggestions on how I can further optimise this config for a single-session, 64-bit install utilising ALL of 96GB RAM?
I will spend the next week making the case for a native install of Linux, but first we need to be 100% sure that is the only way to get the most out of Postgres on this machine. Thanks very much. I now feel I am in a position where I can really explore and find the optimal configuration for my system, but would still appreciate any suggestions. Cheers, Tom On 11/06/2010 07:25, Bob Lunney wrote: Tom, First off, I wouldn't use a VM if I could help it; however, sometimes you have to make compromises. With a 16 GB machine running 64-bit Ubuntu and only PostgreSQL, I'd start by allocating 4 GB to shared_buffers. That should leave more than enough room for the OS and file system cache. Then I'd begin testing by measuring response times of representative queries with significant amounts of data. Also, what is the disk setup for the box? Filesystem? Can WAL files have their own
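One detail worth checking in the sysctl settings quoted above: kernel.shmmax is in bytes but kernel.shmall is in pages, so the two values in the thread don't actually agree (2097152 pages at 4 kB is only 8 GB, roughly half the 15 GB shmmax). A small sketch of the arithmetic, assuming a 4 kB page size (verify with `getconf PAGE_SIZE`):

```shell
# kernel.shmmax is the max size of one segment in bytes;
# kernel.shmall is the total allowed across all segments, in pages.
# For shmall to cover shmmax, use shmall = shmmax / page_size.
SHMMAX=16106127360            # bytes, the value used in the thread
PAGE_SIZE=4096                # typical x86 page size (assumption)
SHMALL=$((SHMMAX / PAGE_SIZE))
echo "$SHMALL"                # pages needed to match shmmax
```

To persist across reboots, the same values would go in /etc/sysctl.conf rather than being set with `sysctl -w`.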
Re: [PERFORM] requested shared memory size overflows size_t
Hi Dave, I am definitely able to switch OS if it will get the most out of Postgres. So it is definitely a case of choosing the OS based on the needs of the app, provided it is well justified. Currently, we are running Ubuntu Server 64-bit in a VirtualBox VM. Cheers, Tom Dave Crooke wrote: Tom I always prefer to choose apps based on business needs, then the OS based on the needs for the app. [rest of quoted message snipped]
Re: [PERFORM] requested shared memory size overflows size_t
Thanks a lot Dave, That's exactly the kind of answer I can use to justify the OS switch. Motivation for the previous setup was based on the fact that we will use the same machine for other projects that will use SQL Server, and most of our experience lies within the MS domain. However, these projects are not a high priority currently, and therefore I have been focusing on the best solution for a Postgres-focused setup. This does mean that I will need to have the other projects running in a VM on Linux; however, they are less demanding in terms of resources. Cheers, Tom Dave Crooke wrote: With that clarification, I stand squarely behind what others are saying ... if performance is important to you, then you should always run databases on dedicated hardware, with the OS running on bare metal with no virtualization. VirtualBox has even more I/O losses than Hyper-V. It's simply not designed for this, and you're giving away a ton of performance. If nothing else, my confusion should indicate to you how unconventional and poorly performing this virtualized setup is ... I simply assumed that the only plausible reason you were piggybacking on virtualization on Windows was a mandated lack of alternative options. Reload the hardware with an OS which PGSQL supports well, and get rid of the VirtualBox and Windows layers. If you have hardware that only Windows supports well, then you may need to make some hardware changes. I haven't said anything about which Unix-like OS to use; you may find people arguing passionately for BSD vs. Linux, but the difference between these is negligible compared to virtualized vs. real system, and at this point considerations like support base, ease of use and familiarity also come into play. IMHO Ubuntu would be a fine choice, and PGSQL is a first-class supported package from the distributor ...
however, at customer sites, I've typically used Red Hat AS because they have a corporate preference for it, even though it is less convenient to install and manage. On Mon, Jun 14, 2010 at 7:41 PM, Tom Wilcox hungry...@gmail.com wrote: [quoted message snipped]
Re: [PERFORM] requested shared memory size overflows size_t
Tom Wilcox wrote: default_statistics_target=1 wal_buffers=1GB max_connections=3 effective_cache_size=15GB maintenance_work_mem=5GB shared_buffers=7000MB work_mem=5GB That value for default_statistics_target means that every single query you ever run will take a seriously long time to generate a plan for. Even on an OLAP system, I would consider 10,000 an appropriate setting for a column or two in a particularly troublesome table. I wouldn't consider a value of even 1,000 in the postgresql.conf to be a good idea. You should consider making the system default much lower, and increase it only on columns that need it, not for every column on every table. There is no reason to set wal_buffers larger than 16MB, the size of a full WAL segment. Have you read http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server yet? checkpoint_segments is the main parameter you haven't touched yet that you should consider increasing. Even if you have a low write load, when VACUUM runs it will be very inefficient running against a large set of tables without the checkpoint frequency being decreased some. Something in the 16-32 range would be plenty for an OLAP setup. At 3 connections, a work_mem of 5GB is possibly reasonable. I would normally recommend that you make the default much smaller than that, though, and instead just increase to a large value for queries that benefit from it. If someone later increases max_connections to something higher, your server could run completely out of memory if work_mem isn't cut way back as part of that change. You could also consider setting effective_cache_size to something even larger than that. EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id 10; By the way--repeatedly running this form of query to test for improvements in speed is not going to give you particularly good results. Each run will execute a bunch of UPDATE statements that leave behind dead rows.
So the next run done for comparison's sake will either have to cope with that additional overhead, or it will end up triggering autovacuum and suffer from that. If you're going to use an UPDATE statement as your benchmark, at a minimum run a manual VACUUM ANALYZE in between each test run, to level out the consistency of results a bit. Ideally you'd restore the whole database to an initial state before each test. I will spend the next week making the case for a native install of Linux, but first we need to be 100% sure that is the only way to get the most out of Postgres on this machine. I really cannot imagine taking a system as powerful as you're using here and crippling it by running through a VM. You should be running Ubuntu directly on the hardware, ext3 filesystem without LVM, split off RAID-1 drive pairs dedicated to OS and WAL, then use the rest of them for the database. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
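Greg's two suggestions above, raising statistics targets per column rather than globally and running VACUUM ANALYZE between benchmark runs, might look like this in psql. This is only a sketch: the table and column names come from the thread, but the statistics value of 1000 is illustrative, not a recommendation.

```sql
-- Raise the statistics target only on a column that needs it,
-- leaving default_statistics_target at a modest system-wide value:
ALTER TABLE nlpg.match_data ALTER COLUMN org SET STATISTICS 1000;
ANALYZE nlpg.match_data;

-- Level out benchmark results: clear dead rows from the previous
-- UPDATE run before timing the next one.
VACUUM ANALYZE nlpg.match_data;
EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org;
```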
[PERFORM] B-Heaps
Just curious if this would apply to PostgreSQL: http://queue.acm.org/detail.cfm?id=1814327 Now that I've read it, it seems like a no-brainer. So, how does PostgreSQL deal with the different latencies involved in accessing data on disk for searches / sorts vs. accessing data in memory? Is it allocated in a similar way as described in the article such that disk access is reduced to a minimum?
Re: [PERFORM] requested shared memory size overflows size_t
On Jun 14, 2010, at 11:53 AM, Tom Wilcox wrote: max_connections=3 effective_cache_size=15GB maintenance_work_mem=5GB shared_buffers=7000MB work_mem=5GB maintenance_work_mem doesn't need to be so high; it certainly has no effect on your queries below. It would affect vacuum, reindex, etc. With fast disk like this (assuming your 700MB/sec above was not a typo) make sure you tune autovacuum up to be much more aggressive than the default (increase the allowable cost per sleep by at least 10x). A big work_mem like the above is OK if you know that no more than a couple of sessions will be active at once. Worst case, a single connection ... probably ... won't use more than 2x that amount. For now, I will go with the config using 7000MB shared_buffers. Any suggestions on how I can further optimise this config for a single session, 64-bit install utilising ALL of 96GB RAM. I will spend the next week making the case for a native install of Linux, but first we need to be 100% sure that is the only way to get the most out of Postgres on this machine. Getting the most from the RAM does *_NOT_* mean making Postgres use all the RAM. Postgres relies on the OS file cache heavily. If there is a lot of free RAM for the OS to use to cache files, it will help the performance. Both Windows and Linux aggressively cache file pages and do a good job at it.
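One way to act on the "keep the default small, raise it per session" advice from this and Greg's earlier message is a session-local override, since work_mem can be changed without a restart. A sketch (the 5GB figure is the thread's, not a recommendation):

```sql
-- postgresql.conf keeps a conservative default, e.g. work_mem = 64MB,
-- so a future increase in max_connections can't exhaust memory.
-- A single OLAP session raises it only for its own big queries:
SET work_mem = '5GB';
-- ... run the large sort/hash-heavy query here ...
RESET work_mem;   -- back to the safe default for this session
```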
Re: [PERFORM] requested shared memory size overflows size_t
On Jun 14, 2010, at 7:06 PM, Greg Smith wrote: I really cannot imagine taking a system as powerful as you're using here and crippling it by running through a VM. You should be running Ubuntu directly on the hardware, ext3 filesystem without LVM, split off RAID-1 drive pairs dedicated to OS and WAL, then use the rest of them for the database. Great points. There is one other option that is decent for the WAL: If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [PERFORM] requested shared memory size overflows size_t
Scott Carey sc...@richrelevance.com writes: Great points. There is one other option that is decent for the WAL: If splitting out a volume is not acceptable for the OS and WAL -- absolutely split those two out into their own partitions. It is most important to make sure that WAL and data are not on the same filesystem, especially if ext3 is involved. Uh, no, WAL really needs to be on its own *spindle*. The whole point here is to have one disk head sitting on the WAL and not doing anything else except writing to that file. Pushing WAL to a different partition but still on the same physical disk is likely to be a net pessimization, because it'll increase the average seek distance whenever the head does have to move between WAL and everything-else-in-the-database. regards, tom lane
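On 8.4, the usual way to put WAL on its own spindle is to move the pg_xlog directory to the dedicated disk and leave a symlink behind, with the server stopped. The move-plus-symlink pattern itself can be demonstrated on throwaway directories (the paths below are scratch stand-ins, not a live cluster):

```shell
# Demonstrate the relocate-and-symlink pattern on scratch paths.
# On a real cluster DATA would be the data directory, WALDISK a mount
# point on the dedicated spindle, and postgres must be stopped first.
DATA=$(mktemp -d)
WALDISK=$(mktemp -d)
mkdir "$DATA/pg_xlog"
touch "$DATA/pg_xlog/000000010000000000000001"   # stand-in for a WAL segment
mv "$DATA/pg_xlog" "$WALDISK/pg_xlog"            # move WAL to the other disk
ln -s "$WALDISK/pg_xlog" "$DATA/pg_xlog"         # leave a symlink in its place
ls "$DATA/pg_xlog"                               # segment still reachable via the link
```

As Tom Lane notes, this only pays off when the target really is a separate physical device; symlinking to another partition of the same disk just adds seek distance.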