Re: [PERFORM] Performances with new Intel Core* processors
Vivek,

On 7/31/06 2:04 PM, Vivek Khera [EMAIL PROTECTED] wrote:
> No, but it *does* matter how fast said processor can sling the memory
> around, and in my experience, the Opterons have been much better at that
> due to the efficiency of the memory transport layer.

My Mac laptop with a Core 1 and DDR2 RAM does 2700 MB/s memory bandwidth.
The Core 2 also has lower memory latency than the Opteron.

That said, Intel still hasn't figured out how to do cache-coherent SMP
scaling yet; the Opteron has the outstanding EV6/HTX bus and the cc-NUMA
cache-coherency logic working today.

- Luke

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Performances with new Intel Core* processors
* Arjen van der Meijden:
> For a database system, however, processors hardly ever are the main
> bottleneck, are they?

Not directly, but the choice of processor influences which
chipsets/mainboards are available, which in turn has some impact on the
number of RAM slots. (According to our hardware supplier, beyond 8 GB the
price per GB goes up sharply.) Unfortunately, it seems that the Core 2 Duo
mainboards do not change much in this area.

--
Florian Weimer [EMAIL PROTECTED]
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99
Re: [PERFORM] Are there any performance penalty for opposite endian platform combinations....
Guoping Zhang [EMAIL PROTECTED] writes:
> In fact, it is a general question: is it good practice to avoid running
> the application server and database server on platforms of opposite
> endianness, or does it simply not matter?

Our network protocol uses big-endian consistently, so there will be some
tiny hit for little-endian machines, independently of what's on the other
end of the wire. I can't imagine you could measure the difference though.

regards, tom lane
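Tom's point about the wire format can be illustrated with a short sketch. This is Python purely for illustration of the generic network-byte-order idea, not PostgreSQL's actual protocol code: a big-endian wire format means a little-endian host performs a byte swap on each integer, which is why a cost exists at all, and why it is tiny.

```python
import struct

# A 32-bit value packed in network (big-endian) order vs. the native
# order of a little-endian host such as x86/Opteron.
value = 0x01020304
wire = struct.pack(">I", value)    # what goes on the wire: 01 02 03 04
native_le = struct.pack("<I", value)  # in little-endian RAM: 04 03 02 01

# A little-endian host reading the wire format must byte-swap --
# this swap is the entire "endianness penalty".
decoded = struct.unpack(">I", wire)[0]
print(wire.hex(), native_le.hex(), decoded == value)
```

The swap is a single instruction on modern CPUs, which is consistent with Tom's remark that the difference is unmeasurable in practice.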
[PERFORM] Query/database optimization
Hello, I have a query:

explain analyze
select tu.url_id, tu.url, coalesce(sd.recurse, 100),
       case when COALESCE(get_option('use_banner')::integer,0) = 0 then 0
            else ts.use_banner end as use_banner,
       ts.use_cookies, ts.use_robots, ts.includes, ts.excludes,
       ts.track_domain, ts.task_id,
       get_available_pages(ts.task_id,ts.customer_id), ts.redirects
from task_url tu
inner join task_scheduler ts on tu.task_id=ts.task_id
inner join (subscription s
            inner join subscription_dic sd on sd.id=s.dict_id)
      on s.customer_id=ts.customer_id
inner join customer c on c.customer_id=ts.customer_id AND c.active
WHERE get_available_pages(ts.task_id,ts.customer_id) 0
  AND ((get_option('expired_users')::integer = 0)
       OR (isfinite(last_login)
           AND extract('day' from current_timestamp - last_login)::integer
               = coalesce(get_option('expired_users')::integer,100)))
  AND ((s.status is null AND ts.customer_id is null) OR s.status 0)
  AND (get_check_period(ts.task_id,ts.next_check) is null
       OR (unix_timestamp(get_check_period(ts.task_id,ts.next_check))
           - unix_timestamp(timenow()) 3600))
  AND ts.status 1
  AND ((ts.start_time current_time AND ts.stop_time current_time)
       OR (ts.start_time is null AND ts.stop_time is null))
  AND tu.url_id = 1
  AND ts.customer_id not in
      (select distinct customer_id from task_scheduler where status = 1)
order by ts.next_check is not null,
         unix_timestamp(ts.next_check) - unix_timestamp(timenow())
limit 10;

which produces this query plan:

Limit  (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.535..11800.546 rows=3 loops=1)
  ->  Sort  (cost=2874.98..2874.99 rows=2 width=88) (actual time=11800.529..11800.532 rows=3 loops=1)
        Sort Key: (ts.next_check IS NOT NULL), (date_part('epoch'::text, ts.next_check) - date_part('epoch'::text, (timenow())::timestamp without time zone))
        ->  Nested Loop  (cost=4.37..2874.97 rows=2 width=88) (actual time=10249.115..11800.486 rows=3 loops=1)
              ->  Nested Loop  (cost=4.37..2868.87 rows=2 width=55) (actual time=10247.721..11796.303 rows=3 loops=1)
                    Join Filter: ("inner".id = "outer".dict_id)
                    ->  Nested Loop  (cost=2.03..2865.13 rows=2 width=55) (actual time=10247.649..11796.142 rows=3 loops=1)
                          Join Filter: ((("inner".status IS NULL) AND ("outer".customer_id IS NULL)) OR ("inner".status 0))
                          ->  Nested Loop  (cost=2.03..2858.34 rows=2 width=55) (actual time=10247.583..11795.936 rows=3 loops=1)
                                ->  Seq Scan on customer c  (cost=0.00..195.71 rows=231 width=4) (actual time=0.082..154.344 rows=4161 loops=1)
                                      Filter: (active AND isfinite(last_login) AND ((date_part('day'::text, (('now'::text)::timestamp(6) with time zone - (last_login)::timestamp with time zone)))::integer = 150))
                                ->  Index Scan using task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161)
                                      Index Cond: ("outer".customer_id = ts.customer_id)
                                      Filter: ((get_available_pages(task_id, customer_id) 0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone)) 3600::double precision)) AND (status 1) AND ((((start_time)::time with time zone ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
                                      SubPlan
                                        ->  Unique  (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1)
                                              ->  Sort  (cost=2.02..2.03 rows=1 width=4) (actual time=0.613..0.617 rows=3 loops=1)
                                                    Sort Key: customer_id
                                                    ->  Index Scan using task_scheduler_istatus on task_scheduler  (cost=0.00..2.01 rows=1 width=4) (actual time=0.044..0.580 rows=3 loops=1)
                                                          Index Cond: (status = 1)
                          ->  Index Scan using subscription_icustomer_id on subscription s  (cost=0.00..3.38 rows=1 width=12) (actual time=0.035..0.041 rows=1 loops=3)
                                Index Cond: ("outer".customer_id = s.customer_id)
                    ->  Materialize  (cost=2.34..2.65 rows=31 width=8) (actual time=0.008..0.027 rows=6 loops=3)
                          ->  Seq Scan on subscription_dic sd  (cost=0.00..2.31 rows=31 width=8) (actual time=0.013..0.034 rows=6 loops=1)
              ->  Index Scan using task_url_storage_task_id on task_url tu  (cost=0.00..3.03 rows=1 width=37) (actual
Re: Fwd: [PERFORM] Savepoint performance
Actually, what we did in the tests at EnterpriseDB was encapsulate each SQL
statement within its own BEGIN/EXCEPTION/END block. Using this approach, if
a SQL statement aborts, the rollback is confined to the BEGIN/END block that
encloses it. Other SQL statements would not be affected since the block
would isolate and capture that exception.

In the tests, the base-line version was a PL/pgSQL function for the dbt-2
new order transaction written within a single BEGIN/END block. The
experimental version was a variation of the base-line altered so the
processing of each order entailed entering three sub-blocks from the main
BEGIN/END block. In addition, another sub-block was entered each time a
detail line within an order was processed. The transactions per minute were
recorded for runs of 20 minutes simulating 10 terminals and 6 hours
simulating 10 terminals. Below are some of the numbers we got:

                          Test #   Base Line   With Sub-Blocks   Difference   % Variation
  10 terminals,              1       6128          5861
  20 minutes                 2       5700          5702
                             3       6143          5556
                             4       5954          5750
                             5       5695          5925
  Average of tests 1-5               5924          5758.8           -165.2        -2.79
  10 terminals, 6 hours              5341          5396                 55         1.03

As you can see, we didn't encounter a predictable, significant difference.

Ernie Nishiseki, Architect
EnterpriseDB Corporation

wrote:

-- Forwarded message --
From: Denis Lussier
Date: Jul 27, 2006 10:33 PM
Subject: Re: [PERFORM] Savepoint performance
To: Tom Lane
Cc: pgsql-performance@postgresql.org

My understanding of EDB's approach is that our prototype just implicitly
does a savepoint before each INSERT, UPDATE, or DELETE statement inside of
PL/pgSQL. We then roll back to that savepoint if a SQL error occurs. I
don't believe our preliminary approach changes any transaction start/end
semantics on the server side, and it doesn't change any PL/pgSQL syntax
either (although it does allow you to optionally code commits and/or
rollbacks inside stored procs).

Can anybody point me to a thread on the 7.3 disastrous experiment?

I personally think that doing commits or rollbacks inside stored
procedures is usually bad coding practice AND can be avoided... It's a
backward compatibility thing for non-ANSI legacy stuff, and this is why I
was previously guessing that the community wouldn't be interested in this
for PL/pgSQL. Actually... does anybody know offhand if the ANSI standard
for stored procs allows for explicit transaction control inside of a
stored procedure?

--Luss

On 7/27/06, Tom Lane wrote:
> Denis Lussier writes:
>> Would the community be potentially interested in this feature if we
>> created a BSD Postgres patch of this feature for PL/pgSQL (likely for
>> 8.3)??
>
> Based on our rather disastrous experiment in 7.3, I'd say that fooling
> around with transaction start/end semantics on the server side is
> unlikely to fly ...
>
> regards, tom lane

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
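The per-statement savepoint idea Denis describes can be sketched outside PostgreSQL as well. Below is a minimal illustration using Python's sqlite3 module as a stand-in (purely illustrative — EDB's prototype works inside PL/pgSQL, and the table and statements here are made up): a savepoint is taken implicitly before each statement, and on error only that statement is rolled back while the surrounding transaction survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage the transaction manually
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, qty INTEGER NOT NULL)")

cur.execute("BEGIN")
statements = [
    "INSERT INTO orders (id, qty) VALUES (1, 10)",
    "INSERT INTO orders (id, qty) VALUES (1, 20)",  # duplicate key -> fails
    "INSERT INTO orders (id, qty) VALUES (2, 30)",
]
for i, sql in enumerate(statements):
    cur.execute(f"SAVEPOINT sp{i}")  # implicit savepoint before each statement
    try:
        cur.execute(sql)
        cur.execute(f"RELEASE SAVEPOINT sp{i}")
    except sqlite3.Error:
        # roll back only the failed statement; the transaction stays alive
        cur.execute(f"ROLLBACK TO SAVEPOINT sp{i}")
cur.execute("COMMIT")

rows = cur.execute("SELECT id, qty FROM orders ORDER BY id").fetchall()
print(rows)  # the failed duplicate insert is skipped; the others commit
```

The savepoint bookkeeping per statement is the overhead EnterpriseDB's numbers above are measuring, and their result suggests it is in the noise.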
[PERFORM] PITR performance overhead?
I am looking for some general guidelines on what is the performance
overhead of enabling point-in-time recovery (archive_command config) on an
8.1 database. Obviously it will depend on a multitude of factors, but some
broad-brush statements and/or anecdotal evidence will suffice. Should one
worry about its performance implications? Also, what can one do to
mitigate it?

Thanks,

George
Re: [PERFORM] PITR performance overhead?
In response to George Pavlov [EMAIL PROTECTED]:
> I am looking for some general guidelines on what is the performance
> overhead of enabling point-in-time recovery (archive_command config) on
> an 8.1 database. Obviously it will depend on a multitude of factors, but
> some broad-brush statements and/or anecdotal evidence will suffice.
> Should one worry about its performance implications? Also, what can one
> do to mitigate it?

Prior to implementing PITR, I did some testing to see what kind of
overhead it would add. It was negligible. I don't remember the details,
but I seem to remember the performance hit was barely measurable.

Note that in our usage scenarios, we have very little IO compared to CPU
usage. The result is that our DB servers have plenty of disk bandwidth to
spare. Since the log backup occurs as a background process, it made almost
no difference in our tests. If your DB is very IO intensive, you may have
different results.

--
Bill Moran
Collaborative Fusion Inc.
Re: [PERFORM] PostgreSQL scalability on Sun UltraSparc T1
On Sat, Jul 29, 2006 at 08:43:49AM -0700, Joshua D. Drake wrote:
> Jochem van Dieten wrote:
>> Tweakers.net has done a database performance test between a Sun T2000
>> (8 core T1) and a Sun X4200 (2 dual core Opteron 280). The database
>> benchmark is developed in-house and represents the average query
>> pattern from their website. It is MySQL centric because Tweakers.net
>> runs on MySQL, but Arjen van der Meijden has ported it to PostgreSQL
>> and has done basic optimizations like adding indexes.
>>
>> Arjen wrote about some of the preliminary results previously in
>> http://archives.postgresql.org/pgsql-performance/2006-06/msg00358.php
>> but the article has now been published: http://tweakers.net/reviews/633/7
>> This is all the more impressive if you scroll down and look at the
>> behaviour of MySQL (after tweaking by both MySQL AB and Sun).
>
> I would love to get my hands on that postgresql version and see how much
> farther it could be optimized.

I'd love to get an English translation that we could use for PR.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software   http://pervasive.com     work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
Re: [PERFORM] PostgreSQL scalability on Sun UltraSparc T1
On 1-8-2006 19:26, Jim C. Nasby wrote:
> I'd love to get an English translation that we could use for PR.

Actually, we have an English version of the Socket F follow-up:
http://tweakers.net/reviews/638 which basically displays the same results
for Postgres vs MySQL. If and when a translation of the other article
arrives, I don't know. Other follow-up stories will follow as well;
whether and how soon those will be translated, I also don't know. We are
actually pretty interested in doing so, but it's a lot of work to
translate correctly :)

Best regards,

Arjen
[PERFORM] How to speed up this translation query?
I need some expert advice on how to optimize a translation query (this
word choice will become clear shortly, I hope).

Say I have a HUMONGOUS table of foreign language translations (call it
TRANS) with records like these:

  meaning_id: 1  language_id: 5  translation: jidoosha
  meaning_id: 1  language_id: 2  translation: voiture
  meaning_id: 1  language_id: 5  translation: kuruma
  meaning_id: 2  language_id: 2  translation: chat
  meaning_id: 2  language_id: 5  translation: neko
  meaning_id: 2  language_id: 3  translation: katz
  meaning_id: 3  language_id: 4  translation: pesce
  meaning_id: 3  language_id: 2  translation: poisson
  meaning_id: 3  language_id: 5  translation: sakana

For the sake of this description, let's assume that the records above are
all the records in TRANS (though in fact the number of records in TRANS is
really about ten million times greater). Now suppose I have a tiny table
called INPUT consisting of a single text field (say, word). E.g. suppose
that INPUT looks like this:

  katz
  voiture
  pesce

Now, let's fix a language_id, say 5. This is the target language_id. Given
this target language_id, and this particular INPUT table, I want the
results of the query to be something like this:

  neko
  jidoosha
  kuruma
  sakana

I.e. for each word W in INPUT, the query must first find each record R in
TRANS that has W as its translation field; then find each record Q in
TRANS whose language_id is 5 (the target language_id) AND has the same
meaning_id as R does. E.g. if W is 'katz', then R is

  meaning_id: 2  language_id: 3  translation: katz

and therefore the desired Q is

  meaning_id: 2  language_id: 5  translation: neko

...and so on. The only difficulty here is that performance is critical,
and in real life, TRANS has around 50M records (and growing), while INPUT
typically has between 500 and 1000 records. Any advice on how to make this
as fast as possible would be much appreciated. Thanks!

G.

P.S. Just to show that this post is not just from a college student trying
to get around doing homework, below I post my most successful query so
far. It works, but its performance isn't great. And it is annoyingly
complex, to boot; I'm very much the SQL noob, and if nothing else, at
least I'd like to learn to write better (i.e. more elegant, more legible,
more clueful) SQL than this:

SELECT q3.translation, q2.otherstuff
FROM (
    SELECT INPUT.word, q1.meaning_id, INPUT.otherstuff
    FROM INPUT
    INNER JOIN (
        SELECT translation, meaning_id FROM TRANS
        WHERE translation IN (SELECT word FROM INPUT)
    ) AS q1 ON INPUT.word = q1.translation
) AS q2
LEFT JOIN (
    SELECT translation, meaning_id FROM TRANS WHERE language_id=5
) AS q3 ON q2.meaning_id=q3.meaning_id;

As you can see, there are additional fields that I didn't mention in my
original description (e.g. INPUT.otherstuff). Also the above is actually a
subquery in a larger query, but it is by far the worst bottleneck. Last,
there's an index on TRANS(translation).
Re: [PERFORM] How to speed up this translation query?
On 1 aug 2006, at 20.09, tlm wrote:
> SELECT q3.translation, q2.otherstuff
> FROM (
>     SELECT INPUT.word, q1.meaning_id, INPUT.otherstuff
>     FROM INPUT
>     INNER JOIN (
>         SELECT translation, meaning_id FROM TRANS
>         WHERE translation IN (SELECT word FROM INPUT)
>     ) AS q1 ON INPUT.word = q1.translation
> ) AS q2
> LEFT JOIN (
>     SELECT translation, meaning_id FROM TRANS WHERE language_id=5
> ) AS q3 ON q2.meaning_id=q3.meaning_id;

Maybe I'm not following you properly, but I think you've made things a
little bit more complicated than they need to be. The nested sub-selects
look a little nasty. Now, you didn't provide any explain output, but I
think the following SQL will achieve the same result, and hopefully
produce a better plan:

SELECT t2.translation, i.otherstuff
FROM input i
INNER JOIN trans t ON i.word=t.translation
INNER JOIN trans t2 ON t.meaning_id=t2.meaning_id
WHERE t2.language_id=5;

The query will also benefit from indices on trans.meaning_id and
trans.language_id. Also make sure the tables are vacuumed and analyzed, to
allow the planner to make good estimates.

Sincerely,

Niklas Johansson
[PERFORM] XFS filesystem for Datawarehousing
I intend to test Postgres/Bizgres for DWH use. I want to use the XFS
filesystem to get the best possible performance at the FS level (correct
me if I am wrong!). Is anyone using XFS for storing/retrieving a
relatively large amount of data (~200 GB)? If yes, what about the
performance and stability of XFS? I am especially interested in
recommendations about XFS mount options and mkfs.xfs options.

My setup will be roughly this:
1) 4 SCSI HDD, 128 GB each,
2) RAID 0 on the four SCSI HDD disks using LVM (software RAID)

There are two other SATA HDD in the server. The server has 2 physical CPUs
(XEON at 3 GHz), 4 logical CPUs, 8 GB RAM, OS = SLES9 SP3.

My questions:
1) Should I place an external XFS journal on a separate device?
2) What should be the journal buffer size (logbsize)?
3) How many journal buffers (logbufs) should I configure?
4) How many allocation groups (for mkfs.xfs) should I configure?
5) Is it worth setting noatime?
6) What I/O scheduler (elevator) should I use (massive sequential reads)?
7) What is the ideal stripe unit and width (for a RAID device)?

I will appreciate any opinions, suggestions, pointers.

Best Regards,

Milen Kulev
FW: [PERFORM] XFS filesystem for Datawarehousing -2
Sorry, forgot to ask: what is the recommended/best PG block size for a DWH
database? 16k, 32k, 64k? What should be the relation between the XFS/RAID
stripe size and the PG block size?

Best Regards,

Milen Kulev

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Milen Kulev
Sent: Tuesday, August 01, 2006 11:50 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] XFS filesystem for Datawarehousing
Re: [PERFORM] XFS filesystem for Datawarehousing
Hi Andrew,

Thank you for your prompt reply. Are you using some special XFS options?
I mean special values for logbuffers, bufferiosize, extent size
preallocations, etc.? I will have only 6 big tables and about 20 other
relatively small (fact aggregation) tables (~10-20 GB each). I believe it
should be a good idea to use chunks of space that are as contiguous as
possible (from the OS point of view), in order to make full table scans as
fast as possible.

Best Regards,

Milen Kulev

-Original Message-
From: J. Andrew Rogers [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 02, 2006 12:47 AM
To: Milen Kulev
Cc: Pgsql-Performance ((E-mail))
Subject: Re: [PERFORM] XFS filesystem for Datawarehousing
Re: [PERFORM] sub select performance due to seq scans
Not sure if this helps solve the problem but... (see below)

As new records are added, indexes are used for a while and then at some
point postgres switches to seq scan. It is repeatable. Any
suggestions/comments to try and solve this are welcome.

Thanks

Data is as follows:

  capsa.flatommemberrelation            1458 records
  capsa.flatommemberrelation(srcobj)       3 distinct
  capsa.flatommemberrelation(dstobj)     730 distinct
  capsa.flatomfilesysentry               732 records
  capsa.flatommemberrelation(objectid)   732 distinct

capsa=# set enable_seqscan=on;
SET
Time: 0.599 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry
          where objectid in (select dstobj from capsa.flatommemberrelation
          where srcobj='9e5943e0-219f-11db-8504-001143214409');

Aggregate  (cost=196.01..196.02 rows=1 width=0) (actual time=965.420..965.422 rows=1 loops=1)
  ->  Nested Loop IN Join  (cost=0.00..194.19 rows=728 width=0) (actual time=3.373..964.371 rows=729 loops=1)
        Join Filter: ("outer".objectid = "inner".dstobj)
        ->  Seq Scan on flatomfilesysentry  (cost=0.00..65.28 rows=728 width=16) (actual time=0.007..1.505 rows=732 loops=1)
        ->  Seq Scan on flatommemberrelation  (cost=0.00..55.12 rows=725 width=16) (actual time=0.004..0.848 rows=366 loops=732)
              Filter: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
Total runtime: 965.492 ms
(7 rows)

Time: 966.806 ms

capsa=# set enable_seqscan=off;
SET
Time: 0.419 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry
          where objectid in (select dstobj from capsa.flatommemberrelation
          where srcobj='9e5943e0-219f-11db-8504-001143214409');

Aggregate  (cost=24847.73..24847.74 rows=1 width=0) (actual time=24.859..24.860 rows=1 loops=1)
  ->  Nested Loop  (cost=90.05..24845.91 rows=728 width=0) (actual time=2.946..23.640 rows=729 loops=1)
        ->  Unique  (cost=88.04..91.67 rows=363 width=16) (actual time=2.917..6.671 rows=729 loops=1)
              ->  Sort  (cost=88.04..89.86 rows=725 width=16) (actual time=2.914..3.998 rows=729 loops=1)
                    Sort Key: flatommemberrelation.dstobj
                    ->  Bitmap Heap Scan on flatommemberrelation  (cost=7.54..53.60 rows=725 width=16) (actual time=0.260..1.411 rows=729 loops=1)
                          Recheck Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
                          ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..7.54 rows=725 width=0) (actual time=0.244..0.244 rows=729 loops=1)
                                Index Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
        ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..63.64 rows=364 width=16) (actual time=0.014..0.015 rows=1 loops=729)
              Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
              ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=364 width=0) (actual time=0.009..0.009 rows=1 loops=729)
                    Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
Total runtime: 25.101 ms
(14 rows)

Time: 26.878 ms

H Hale [EMAIL PROTECTED] wrote:
> Tom,
>
> It is unique.
>
> Indexes:
>     "flatomfilesysentry_pkey" PRIMARY KEY, btree (objectid)
>     "capsa_flatomfilesysentry_name_idx" btree (name)
> Foreign-key constraints:
>     "objectid" FOREIGN KEY (objectid) REFERENCES capsa_sys.master(objectid) ON DELETE CASCADE

Tom Lane [EMAIL PROTECTED] wrote:
> H Hale writes:
>> ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
>>       Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
>>       ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
>>             Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
>
> Well, there's our estimation failure: 3238 rows expected, one row actual.
> What is the data distribution of flatomfilesysentry.objectid? It looks
> from this example like it is unique or nearly so, but the planner
> evidently does not think that.
>
> regards, tom lane
Re: [PERFORM] XFS filesystem for Datawarehousing
On Aug 1, 2006, at 2:49 PM, Milen Kulev wrote:
> Is anyone using XFS for storing/retrieving relatively large amount of
> data (~ 200GB)?

Yes, we've been using it on Linux since v2.4 (currently v2.6) and it has
been rock solid on our database servers (Opterons, running in both 32-bit
and 64-bit mode). Our databases are not quite 200GB (maybe 75GB for a big
one currently), but ballpark enough that the experience is probably valid.
We also have a few terabyte+ non-database XFS file servers too.

Performance has been very good even with nearly full file systems, and
reliability has been perfect so far. Some of those file systems get used
pretty hard for months or years non-stop. Comparatively, I can only tell
you that XFS tends to be significantly faster than Ext3, but we never did
any serious file system tuning either.

Knowing nothing else, my experience would suggest that XFS is a fine and
safe choice for your application.

J. Andrew Rogers
Re: [PERFORM] XFS filesystem for Datawarehousing
J. Andrew Rogers wrote:
> On Aug 1, 2006, at 2:49 PM, Milen Kulev wrote:
>> Is anyone using XFS for storing/retrieving relatively large amount of
>> data (~ 200GB)?
>
> Comparatively, I can only tell you that XFS tends to be significantly
> faster than Ext3, but we never did any serious file system tuning
> either.

Most likely ext3 was used in the default configuration, which logs data
operations as well as metadata, which is what XFS logs. I don't think I've
seen any credible comparison between XFS and ext3 with the metadata-only
journal option. On the other hand, I don't think it makes sense to journal
data in a PostgreSQL environment. Metadata is enough, given that we log
data in WAL anyway.

--
Alvaro Herrera                      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] PITR performance overhead?
On 8/1/06, George Pavlov [EMAIL PROTECTED] wrote:
> I am looking for some general guidelines on what is the performance
> overhead of enabling point-in-time recovery (archive_command config) on
> an 8.1 database. Obviously it will depend on a multitude of factors, but
> some broad-brush statements and/or anecdotal evidence will suffice.
> Should one worry about its performance implications? Also, what can one
> do to mitigate it?

pitr is extremely cheap both in performance drag and administration
overhead for the benefits it provides. it comes almost for free; just make
sure you can handle all the wal files and do sane backup scheduling. in
fact, pitr can actually reduce the load on a server due to running less
frequent backups. if your server is heavy i/o loaded, it might take a bit
of planning.

merlin
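For reference, enabling WAL archiving on 8.1 is a one-line postgresql.conf change. The sketch below uses a placeholder archive directory; any command that copies the file and returns zero on success will do, and the directory must exist and be writable by the postgres user:

```
# postgresql.conf (8.1) -- WAL archiving sketch; archive path is a placeholder
# %p expands to the path of the WAL segment, %f to its file name
archive_command = 'cp %p /mnt/server/archivedir/%f'
```

The performance cost Merlin describes is the background copy of each 16 MB WAL segment as it fills, which is why the overhead tracks your write volume rather than your query load.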
Re: [PERFORM] XFS filesystem for Datawarehousing
Milen Kulev wrote:
> Is anyone using XFS for storing/retrieving relatively large amount of
> data (~ 200GB)?

Yes, but not for anything that large - only about 40-50 GB of database
data.

> If yes, what about the performance and stability of XFS.

I'm pretty happy with the performance, particularly read (I get 215 MB/s
sequential 8K reads from 4 (P)ATA drives set up as software RAID 0). I
have always found XFS very stable (I have used it on servers for several
years).

> I am especially interested in recommendations about XFS mount options
> and mkfs.xfs options. My setup will be roughly this:
> 1) 4 SCSI HDD, 128GB each,
> 2) RAID 0 on the four SCSI HDD disks using LVM (software RAID)
>
> My questions:
> 1) Should I place an external XFS journal on a separate device?
> 2) What should be the journal buffer size (logbsize)?
> 3) How many journal buffers (logbufs) should I configure?
> 4) How many allocation groups (for mkfs.xfs) should I configure?
> 5) Is it worth setting noatime?
> 6) What I/O scheduler (elevator) should I use (massive sequential reads)?
> 7) What is the ideal stripe unit and width (for a RAID device)?

1-3) I have not done any experimentation with where to put the journal, or
its buffer size / number of them (well worth doing, I suspect).
4) I left it at the default.
5) I use noatime, but have not measured whether there is any impact if I
leave it off.
6) The deadline scheduler seemed to give slightly better performance for
sequential reads.
7) I tried out stripe widths of 2 and 4 (with 4 disks), and they seemed to
give the same results. A stripe unit of 256K (I tested 32K, 64K, 128K)
seemed to give the best sequential performance. My software RAID stripe
size was matched to this in each case.

I'll be interested to hear what you discover :-)

Cheers,

Mark
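Mark's point 7 (matching the filesystem geometry to the RAID layout) comes down to simple arithmetic: the mkfs.xfs `sunit`/`swidth` values are expressed in 512-byte sectors, so they can be derived from the RAID chunk size and disk count. A minimal sketch, mirroring the 4-disk, 256K-chunk setup above (the specific numbers are just the example from this thread; adjust them for your own array):

```python
# Derive mkfs.xfs sunit/swidth (in 512-byte sectors) from a RAID layout.
chunk_kb = 256      # per-disk RAID chunk ("stripe unit"), as Mark tested
data_disks = 4      # RAID 0: all four disks carry data

SECTOR = 512
sunit = chunk_kb * 1024 // SECTOR        # sectors per chunk
swidth = sunit * data_disks              # sectors per full stripe

# Full-stripe size in KB -- large sequential reads that are a multiple of
# this engage all spindles evenly.
full_stripe_kb = chunk_kb * data_disks
print(sunit, swidth, full_stripe_kb)
```

With these numbers a full stripe is 1 MB, so the hypothetical invocation would look like `mkfs.xfs -d sunit=512,swidth=2048 ...`.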
Re: [PERFORM] XFS filesystem for Datawarehousing -2
Milen,

On 8/1/06 3:19 PM, Milen Kulev [EMAIL PROTECTED] wrote:
> Sorry, forgot to ask: What is the recommended/best PG block size for a
> DWH database? 16k, 32k, 64k? What should be the relation between the
> XFS/RAID stripe size and the PG block size?

We have found that the page size in PG starts to matter only at very high
disk performance levels, around 1000 MB/s. Other posters have talked about
maintenance tasks improving in performance, but I haven't seen it.

- Luke
Re: [PERFORM] Query/database optimization
Eugeny N Dzhurinsky [EMAIL PROTECTED] writes:
> [slow query]

The bulk of your time seems to be going into this indexscan:

   ->  Index Scan using task_scheduler_icustomer_id on task_scheduler ts  (cost=2.03..11.51 rows=1 width=51) (actual time=2.785..2.785 rows=0 loops=4161)
         Index Cond: ("outer".customer_id = ts.customer_id)
         Filter: ((get_available_pages(task_id, customer_id) 0) AND ((get_check_period(task_id, next_check) IS NULL) OR ((date_part('epoch'::text, get_check_period(task_id, next_check)) - date_part('epoch'::text, (timenow())::timestamp without time zone)) 3600::double precision)) AND (status 1) AND ((((start_time)::time with time zone ('now'::text)::time(6) with time zone) AND ((stop_time)::time with time zone ('now'::text)::time(6) with time zone)) OR ((start_time IS NULL) AND (stop_time IS NULL))) AND (NOT (hashed subplan)))
         SubPlan
           ->  Unique  (cost=2.02..2.03 rows=1 width=4) (actual time=0.617..0.631 rows=3 loops=1)
                 ...

I kinda doubt that the index search itself is that slow --- doubtless the
problem comes from having to evaluate that filter condition on a lot of
rows. How fast are those functions you're calling?

regards, tom lane