Re: [PERFORM] RAID Controller (HP P400) beat by SW-RAID?
On 12-9-2011 0:44 Anthony Presley wrote:
> A few weeks back, we purchased two refurb'd HP DL360 G5's, and were
> hoping to set them up with PG 9.0.2, running replicated. These machines
> have (2) 5410 Xeons, 36GB of RAM, (6) 10k SAS drives, and are using the
> HP SA P400i with 512MB of BBWC. PG is running on an ext4 (noatime)
> partition, and the drives are configured as RAID 1+0 (it seems that with
> this controller, I cannot do JBOD).

If you really want a JBOD setup, you can try a RAID0 for each available
disk, i.e. in your case 6 separate RAID0's. That's how we configured our
Dell H700 - which doesn't offer JBOD either - for ZFS.

Best regards,

Arjen
[PERFORM] PG 9.x prefers slower Hash Joins?
In relation to my previous thread (about SW RAID vs. HW RAID on a P400), I
was able to narrow down the filesystem speed, and in general our new system
(running PG 9.1) is about 3x - 5x faster on the IO.

In looking at the query plans in more depth, it appears that PG 9.0 and 9.1
are both preferring to do hash joins, which seem to take roughly the same
time regardless of caching and are slower than PG 8.4 doing an index scan.

For example, on PG 9.x:
  http://explain.depesz.com/s/qji - This takes 307ms every time. It doesn't
  matter if it's cached or fresh from a reboot.

Same query on PG 8.4:
  http://explain.depesz.com/s/8Pd - This can take 2-3s the first time, but
  then takes 42ms once it's cached.

Both of these servers have the same indexes, similar postgresql.conf, and
almost identical data. However, the old server is doing some different
planning than the new server. I've run ANALYZE on both of these databases.

Some relevant PG parameters:
  max_connections = 150
  shared_buffers = 6400MB        (have tried as high as 20GB)
  work_mem = 20MB                (have tried as high as 100MB)
  effective_io_concurrency = 6
  fsync = on
  synchronous_commit = off
  wal_buffers = 16MB
  checkpoint_segments = 30       (have tried 200 when I was loading the db)
  random_page_cost = 2.5
  effective_cache_size = 10240MB (have tried as high as 16GB)

If I disable the hashjoin, I get massive improvements on PG 9.x ... as fast
(or faster) than our PG 8.4 instance.

--
Anthony Presley
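(A quick way to confirm the planner difference for a single session - the
SET is session-local, and the tables and columns in the query below are
only placeholders for the real one:)

  -- compare the plans with and without hash joins, in one session only
  SET enable_hashjoin = off;
  EXPLAIN ANALYZE
  SELECT o.id, c.name
  FROM   orders o
  JOIN   customers c ON c.id = o.customer_id
  WHERE  o.created_at >= now() - interval '1 day';
  RESET enable_hashjoin;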
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com wrote:
> In the case of PG this particular example will work:
> 1. TX inserts new PK row
> 2. TX tries to insert same PK row => blocks
> 1. TX commits
> 2. TX fails with PK violation
> 2. TX does the update (if the error is caught)

That goes against the point I was making in my earlier comment. In order to
implement this error-catching logic, you'll have to allocate a new
subtransaction (transaction ID) for EVERY ROW you insert.

If you're going to be loading billions of rows this way, you will invoke
the wrath of the vacuum freeze process, which will seq-scan all older
tables and re-write every row that it hasn't touched yet. You'll survive it
if your database is a few GB in size, but in terabyte land that's
unacceptable. Transaction IDs are a scarce resource there.

In addition, such blocking will limit the parallelism you will get from
multiple inserters.

Regards,
Marti
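(As an aside, the transaction-ID pressure Marti describes can be watched
with a standard catalog query; once age(datfrozenxid) approaches
autovacuum_freeze_max_age, the anti-wraparound freeze he warns about kicks
in:)

  SELECT datname, age(datfrozenxid) AS xid_age
  FROM   pg_database
  ORDER  BY 2 DESC;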
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Tue, Sep 13, 2011 at 5:13 PM, Marti Raudsepp ma...@juffo.org wrote:
> On Tue, Sep 13, 2011 at 00:26, Robert Klemme shortcut...@googlemail.com wrote:
>> In the case of PG this particular example will work:
>> 1. TX inserts new PK row
>> 2. TX tries to insert same PK row => blocks
>> 1. TX commits
>> 2. TX fails with PK violation
>> 2. TX does the update (if the error is caught)
>
> That goes against the point I was making in my earlier comment. In order
> to implement this error-catching logic, you'll have to allocate a new
> subtransaction (transaction ID) for EVERY ROW you insert.

I don't think so. You only need to catch the error (see attachment). Or
does this create a sub transaction?

> If you're going to be loading billions of rows this way, you will invoke
> the wrath of the vacuum freeze process, which will seq-scan all older
> tables and re-write every row that it hasn't touched yet. You'll survive
> it if your database is a few GB in size, but in terabyte land that's
> unacceptable. Transaction IDs are a scarce resource there.

Certainly. But it's not needed as far as I can see.

> In addition, such blocking will limit the parallelism you will get from
> multiple inserters.

Yes, I mentioned the speed issue. But regardless of the solution for
MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned, you will have
the locking problem anyhow if you plan to insert concurrently into the same
table and be robust.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

[Attachment: ins.sql]
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
Hi,

(see attachment) Under high concurrency you may expect that your data is
already in. In such a case you'd better do nothing at all:

  begin
    select dat = a_dat from t where id = a_id into test;
    if test is null then
      begin
        insert into t (id, dat) values (a_id, a_dat);
      exception
        when unique_violation then
          update t set dat = a_dat where id = a_id and dat <> a_dat;
          return 0;
      end;
    elsif not test then
      update t set dat = a_dat where id = a_id;
      return 0;
    end if;
    return 1;
  end;

best regards,

Marc Mamin
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote:
> I don't think so. You only need to catch the error (see attachment). Or
> does this create a sub transaction?

Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.

> Yes, I mentioned the speed issue. But regardless of the solution for
> MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned, you will
> have the locking problem anyhow if you plan to insert concurrently into
> the same table and be robust.

In a mass-loading application you can often divide the work between threads
in a manner that doesn't cause conflicts. For example, if the unique key is
foobar_id and you have 4 threads, thread 0 will handle rows where
(foobar_id%4)=0, thread 1 takes (foobar_id%4)=1, etc. Or potentially hash
foobar_id before dividing the work. I already suggested this in my original
post.

Regards,
Marti
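(A minimal sketch of that kind of split - the table and column names are
only placeholders, and each worker runs in its own connection:)

  -- worker k (k = 0..3) loads only its own slice of the key space, so two
  -- workers never contend on the same unique key
  INSERT INTO target_table (foobar_id, payload)
  SELECT foobar_id, payload
  FROM   staging_table
  WHERE  foobar_id % 4 = 0;   -- worker 0; workers 1-3 use = 1, = 2, = 3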
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
I do not need to do insert/updates from many threads. I want to do it from
one thread. My current MySQL architecture is that I have a table with the
same layout as the main one, to hold new and updated objects. When there
are enough objects, I begin a big INSERT SELECT ... ON DUPLICATE KEY UPDATE
and stuff that into the master table.

i
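(For reference, a rough PostgreSQL analogue of that staging-table merge,
written for PG 9.0/9.1, which have no equivalent of MySQL's ON DUPLICATE
KEY UPDATE. Table and column names are only placeholders; with a single
loading thread, as described above, the update-then-insert pair is safe
without extra locking:)

  BEGIN;

  -- 1) refresh rows that already exist in the master table
  UPDATE master m
  SET    dat = s.dat
  FROM   staging s
  WHERE  m.id = s.id;

  -- 2) add the rows that are still missing
  INSERT INTO master (id, dat)
  SELECT s.id, s.dat
  FROM   staging s
  LEFT JOIN master m ON m.id = s.id
  WHERE  m.id IS NULL;

  TRUNCATE staging;
  COMMIT;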
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Hi Kevin,

(Sorry for the late reply - the PG forums seem to have problems with my
e-mail client, so I am now trying web mail.)

First, thanks for taking the time. I wish I could write back with quick,
terse questions to your detailed reply - but I'm sorry, this is still going
to be a wordy post.

>> max_connections = 300
> Too high. Both throughput and latency should improve with correct use of
> a connection pooler.

Even for 300 stateful applications that can remain connected for up to a
week, continuously distilling data (imports)? The 300 is overkill; a sys
admin raised it from 100 when multiple large projects were loaded and the
server refused the additional connections. We can take large imports and
break them into multiple smaller ones, which the operators are doing to try
and improve import performance. It does result in some improvement, but I
think they have gone over the top and the answer is to improve DB and OS
performance. Perhaps I don't understand how connection pooling will work
with stateful apps that are continuously reading and writing (the apps are
DB I/O bound).

> you want the controller configured for write-back (with automatic switch
> to write-through on low or failed battery, if possible).

For performance or safety reasons? Since the sys admin thinks there's no
performance benefit from this, I would like to be clear on why we should do
this.

>> Can our particular setup benefit from changing the bgwriter values?
> Probably not. If you find that your interactive users have periods where
> queries seem to freeze for a few minutes at a time and then return to
> normal levels of performance, you might need to make this more aggressive.

We actually experience this. Once again, remember the overwhelming use of
the system is long-running import threads with continuous connections.
Every now and then the imports behave as if they are suddenly taking a deep
breath, slowing down. Sometimes so much that we cancel the import and
restart (the imports pick up where they left off). What would the bgwriter
settings be in this case?

Thanks again for your time,

Carlo

----- Original Message -----
Date: Fri, 9 Sep 2011 13:16:28 -0500
From: kevin.gritt...@wicourts.gov
To: pgsql-performance@postgresql.org; stonec.regis...@sympatico.ca
Subject: Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)

Carlo Stonebanks stonec.regis...@sympatico.ca wrote:

> this is a full-time ETL system, with only a handful of actual *users*
> and automated processes over 300 connections running *import* programs
> 24/7
> Intel Xeon Processor X5560 (8M Cache, 2.80 GHz, 6.40 GT/s Intel QPI) x 2,
> dual quad core
> 48 GB RAM
> RAID 10, 6 x 600 GB 15krpm SAS

So, eight cores and six spindles. You are probably going to see *much*
better throughput if you route those 300 workers through about 22
connections. Use a connection pooler which limits active transactions to
that and queues up requests to start a transaction.

> Sys admin says that battery-backup RAID controller and consequent write
> settings should have no impact on performance.

With only six drives, I assume your OS, WAL files, indexes, and heap files
are all in the same RAID? If so, your sys admin is wrong -- you want the
controller configured for write-back (with automatic switch to
write-through on low or failed battery, if possible).

> max_connections = 300

Too high. Both throughput and latency should improve with correct use of a
connection pooler.

> shared_buffers = 500MB   # At 48GB of RAM, could we go to 2GB

You might benefit from as much as 8GB, but only testing with your actual
load will show for sure.

> effective_cache_size = 2457MB   # Sys admin says assume 25% of 48GB

Add together the shared_buffers setting and whatever the OS tells you is
used for cache under your normal load. It's usually 75% of RAM or higher.
(NOTE: This doesn't cause any allocation of RAM; it's a hint to the cost
calculations.)

> work_mem = 512MB   # Complex reads are called many times a second
>                    # from each connection, so what should this be?

Maybe, if you use the connection pooler as described above. Each connection
can allocate this multiple times. So with 300 connections you could very
easily start using 150GB of RAM in addition to your shared buffers, causing
a swap storm followed by OOM crashes. If you stay with 300 connections this
*must* be reduced by at least an order of magnitude.

> maintenance_work_mem = 256MB   # Should this be bigger - 1GB at least?

I'd go to 1 or 2 GB.

> checkpoint_segments = 128   # There is lots of write activity; this is
>                             # high, but could it be higher?

OK. IMO, there's unlikely to be much benefit beyond that.

> #checkpoint_completion_target not set;
> # Recommendation appears to be .9 for our 128 checkpoint segments

0.9 is probably a good idea.

> default_statistics_target = 200 #
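(Pulling Kevin's numbers together, a starting-point sketch of
postgresql.conf for this 48GB box might look like the following - every
value here is an assumption to be validated against the real import load,
and it presumes a pooler capping active database connections at roughly 22:)

  max_connections = 50              # clients connect to the pooler, not here
  shared_buffers = 4GB              # test in the 2GB-8GB range
  work_mem = 32MB                   # allocated per sort/hash, per connection
  maintenance_work_mem = 1GB
  effective_cache_size = 36GB       # shared_buffers + OS cache, ~75% of RAM
  checkpoint_segments = 128
  checkpoint_completion_target = 0.9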
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
Interesting debate.

2011/9/13 Marti Raudsepp ma...@juffo.org:
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.

Are you sure? In theory I always understood that there are no
subtransactions.

In fact, when looking at the docs there is chapter 39.6.6 saying "By
default, any error occurring in a PL/pgSQL function aborts execution of the
function, and indeed of the surrounding transaction as well. You can trap
errors and recover from them by using a BEGIN block with an EXCEPTION
clause."
(http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html)

So the doc isn't totally explicit about this. But whatever: what would be
the function of a subtransaction? To give the possibility to recover and
continue within the surrounding transaction?

Stefan
[PERFORM] raid array seek performance
I'm just beginning the process of benchmarking and tuning a new server.
Something I really haven't done before. I'm using Greg's book as a guide.

I started with bonnie++ (1.96) and immediately got anomalous results (I
think). Hardware is as follows:

  2x quad core Xeon 5504 2.0GHz, 2x4MB cache
  192GB DDR3 1066 RAM
  24x 600GB 15k rpm SAS drives
  Adaptec 52445 controller

The default config, being tested at the moment, has 2 volumes, one 100GB
and one 3.2TB, both built from a stripe across all 24 disks, rather than
splitting some spindles out for one volume and another set for the other
volume. At the moment, I'm only testing against the single 3.2TB volume.
The smaller volume is partitioned into /boot (ext2 and tiny) and / (ext4
and 91GB). The larger volume is mounted as xfs with the following options
(cribbed from an email to the list earlier this week, I think):

  logbufs=8,noatime,nodiratime,nobarrier,inode64,allocsize=16m

Bonnie++ delivered the expected huge throughput for sequential read and
write. It seems in line with other benchmarks I found online. However, we
are only seeing about 180 seeks/sec, which seems quite low. I'm hoping
someone might be able to confirm that and, hopefully, make some suggestions
for tracking down the problem if there is one. Results are as follows:

1.96,1.96,newbox,1,1315935572,379G,,1561,99,552277,46,363872,34,3005,90,981924,49,179.1,56,16,19107,69,+++++,+++,20006,69,19571,72,+++++,+++,20336,63,7111us,10666ms,14067ms,65528us,592ms,170ms,949us,107us,160us,383us,31us,130us

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
newzonedb.z1.p 379G  1561  99 552277  46 363872  34  3005  90 981924  49 179.1  56
Latency              7111us   10666ms   14067ms   65528us     592ms     170ms

                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
newbox           16 19107  69 +++++ +++ 20006  69 19571  72 +++++ +++ 20336  63
Latency               949us     107us     160us     383us      31us     130us

Also, my inclination is to default to the following volume layout:

  2 disks in RAID 1 for system
  4 disks in RAID 10 for WAL (xfs)
  18 disks in RAID 10 for data (xfs)

Use case is minimal OLTP traffic, plus a fair amount of data warehouse
style traffic - low connection count, queries over sizeable fact tables
(100s of millions of rows) partitioned over time, insert-only data loading
via COPY, plus some tables populated via aggregation queries over other
tables. Basically, based on performance of our current hardware, I'm not
concerned about being able to handle the data-loading load with the 4-drive
RAID 10 volume, so the emphasis is on warehouse query speed. I'm not best
pleased by the 2GHz CPUs in that context, but I wasn't given a choice on
the hardware.

Any comments on that proposal are welcome. I've got only a week to settle
on a config and ready the box for production, so the number of iterations I
can go through is limited.
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Tue, Sep 13, 2011 at 12:57 PM, Stefan Keller sfkel...@gmail.com wrote:
> Are you sure? In theory I always understood that there are no
> subtransactions.

"Subtransaction" is just another way of saying save points / rollback.
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 13.09.2011 20:11, Marti Raudsepp wrote:
> On Tue, Sep 13, 2011 at 19:34, Robert Klemme shortcut...@googlemail.com wrote:
>> I don't think so. You only need to catch the error (see attachment). Or
>> does this create a sub transaction?
>
> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
> SAVEPOINT it can roll back to in case of an error.

Ouch! Learn something new every day. Thanks for the update!
http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Side note: it seems that Oracle handles this differently (i.e. no
subtransaction, but the INSERT would be rolled back), making the pattern
pretty usable for this particular situation. Also, I have never heard that
TX ids are such a scarce resource over there.

Would anybody think it a good idea to optionally have a BEGIN ... EXCEPTION
block without the current TX semantics? In the absence of that, what would
be a better pattern to do it (other than UPDATE and INSERT if not found)?

>> Yes, I mentioned the speed issue. But regardless of the solution for
>> MySQL's INSERT..ON DUPLICATE KEY UPDATE which Igor mentioned, you will
>> have the locking problem anyhow if you plan to insert concurrently into
>> the same table and be robust.
>
> In a mass-loading application you can often divide the work between
> threads in a manner that doesn't cause conflicts.

Yeah, but concurrency might not be the only reason to optionally update. If
the data is there you might rather want to overwrite it than fail.

Kind regards

robert
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On 13.09.2011 20:57, Stefan Keller wrote:
> Interesting debate.

Indeed.

> 2011/9/13 Marti Raudsepp ma...@juffo.org:
>> Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
>> SAVEPOINT it can roll back to in case of an error.
>
> Are you sure? In theory I always understood that there are no
> subtransactions.

What theory are you referring to?

> In fact when looking at the docs there is chapter 39.6.6 saying "By
> default, any error occurring in a PL/pgSQL function aborts execution of
> the function, and indeed of the surrounding transaction as well. You can
> trap errors and recover from them by using a BEGIN block with an
> EXCEPTION clause."
> (http://www.postgresql.org/docs/current/interactive/plpgsql-control-structures.html)
>
> So the doc isn't totally explicit about this. But whatever: what would
> be the function of a subtransaction? To give the possibility to recover
> and continue within the surrounding transaction?

I find this pretty explicit:

"It is important not to confuse the use of BEGIN/END for grouping
statements in PL/pgSQL with the similarly-named SQL commands for
transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do
not start or end a transaction. Functions and trigger procedures are always
executed within a transaction established by an outer query — they cannot
start or commit that transaction, since there would be no context for them
to execute in. However, a block containing an EXCEPTION clause effectively
forms a subtransaction that can be rolled back without affecting the outer
transaction. For more about that see Section 38.6.5."

http://www.postgresql.org/docs/8.4/interactive/plpgsql-structure.html

Cheers

robert
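(For completeness, the same row-at-a-time pattern can be written outside
PL/pgSQL with an explicit savepoint - the names and values below are only
illustrative, and each savepoint still consumes a subtransaction ID, so the
cost Marti describes applies here as well:)

  BEGIN;
  SAVEPOINT one_row;
  INSERT INTO t (id, dat) VALUES (1, 'x');
  -- if the INSERT fails with unique_violation, the client then issues:
  --   ROLLBACK TO SAVEPOINT one_row;
  --   UPDATE t SET dat = 'x' WHERE id = 1;
  RELEASE SAVEPOINT one_row;
  COMMIT;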
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
Robert Klemme wrote:
> On 12.09.2011 19:22, Andy Colson wrote:
>> There are transaction isolation levels, but they are like playing with
>> fire. (in my opinion).
>
> You make them sound like witchcraft. But they are clearly defined - even
> standardized.

Yeah, for decades. Developing concurrency control from scratch at the
application level over and over again is more like playing with fire, in my
book.

> Granted, different RDBMS might implement them in different ways - here's
> PG's view of TX isolation:
> http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html

Oh, that link is *so* day-before-yesterday! Try this one:

http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html

> In my opinion anybody working with RDBMS should make himself familiar
> with this concept - at least know about it - because it is one of the
> fundamental features of RDBMS and certainly needs consideration in
> applications with highly concurrent DB activity.

+1

Understanding what levels of transaction isolation are available, and what
the implications of each are, is fundamental. Just as there are cases where
a foreign key constraint doesn't exactly work for what you need to enforce,
there are cases where serializable transactions don't fit. But where they
do fit, developing the equivalent from scratch all over again is not as
safe or productive as using the built-in feature.

-Kevin
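(A minimal illustration of leaning on the built-in isolation level instead
of hand-rolled locking - the table is a placeholder, and the retry loop
lives in the client:)

  BEGIN ISOLATION LEVEL SERIALIZABLE;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
  COMMIT;
  -- if any statement (or the COMMIT) fails with SQLSTATE 40001
  -- (serialization_failure), the client retries the whole transaction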
Re: [PERFORM] Postgres for a data warehouse, 5-10 TB
On Mon, Sep 12, 2011 at 11:26:10PM +0200, Robert Klemme wrote:
> You make them sound like witchcraft. But they are clearly defined - even
> standardized. Granted, different RDBMS might implement them in different
> ways - here's PG's view of TX isolation:
> http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html

Even better: PostgreSQL 9.1 (released yesterday! Fresher than milk...)
ships an improved algorithm for the serializable transaction isolation
level:

http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html

More info: http://wiki.postgresql.org/wiki/Serializable

Cheers,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Carlo Stonebanks wrote:

>>> max_connections = 300
>> Too high. Both throughput and latency should improve with correct use
>> of a connection pooler.
>
> Even for 300 stateful applications that can remain connected for up to a
> week, continuously distilling data (imports)?

Absolutely. A good connection pooler will be able to hold those 300
*client* connections, and maintain a much smaller set of connections to the
database. It will notice when a client connection is requesting the start
of a database transaction. If there is an idle database connection it will
route the requests there; otherwise it will put that client connection in a
queue. When a database transaction is committed, a waiting client
connection (if any) will be assigned to its database connection.

Every benchmark I've seen shows that this will improve both throughput and
latency over the approach of releasing a thundering herd of requests
against the server. Picture a meat counter with four butchers behind it,
and a few spinning devices to slice meat. If customers queue up, and the
butchers call on people as they are ready, things go better than if each
butcher tries to take on one-fourth of the customers at a time and
constantly switches between one order and another to try to make
incremental progress on all of them.

> a sys admin raised it from 100 when multiple large projects were loaded
> and the server refused the additional connections.

Whoever is making these decisions needs more training. I suggest Greg
Smith's book: http://www.postgresql.org/docs/books/
(Full disclosure: I was a technical reviewer of the book and got a free
copy.)

>> you want the controller configured for write-back (with automatic
>> switch to write-through on low or failed battery, if possible).
>
> For performance or safety reasons?

You get better performance with write-back. If you can't rely on the
battery, then write-back is not safe and you need to use write-through.

> Since the sys admin thinks there's no performance benefit from this, I
> would like to be clear on why we should do this.

If you can get him to change it back and forth for performance testing, it
is easy enough to prove. Write a client application which inserts one row
per database transaction. A nice, simple, short row -- like containing one
integer column with no indexes. Have the external application create the
table and do a million inserts. Try this with both cache settings. It's
best not to issue a BEGIN and COMMIT at all. Don't loop in a function or a
DO block, because that creates an implicit transaction.

> Every now and then the imports behave as if they are suddenly taking a
> deep breath, slowing down. Sometimes, so much we cancel the import and
> restart (the imports pick up where they left off). What would the
> bg_writer settings be in this case?

I'm not sure what that is based on information so far, so it's unclear
whether background writer settings would help; but on the face of it my bet
would be that it's a context switching storm or swapping, and the
connection pool would be the better solution. Those poor butchers are just
overwhelmed.

-Kevin
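(One way to run that write-back vs. write-through comparison without
hand-writing a client is pgbench, which ships with PostgreSQL - the table,
file and database names below are only placeholders:)

  -- run once in psql:
  CREATE TABLE cache_test (n integer);

  -- put a single line in a file called insert.sql:
  --   INSERT INTO cache_test VALUES (1);

  -- then drive a million one-statement transactions from one client,
  -- with no explicit BEGIN/COMMIT, against each controller setting:
  --   pgbench -n -c 1 -t 1000000 -f insert.sql yourdb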
[PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
The doc at http://www.postgresql.org/docs/current/interactive/indexes-types.html says:

"Caution: Hash index operations are not presently WAL-logged, so hash
indexes might need to be rebuilt with REINDEX after a database crash. They
are also not replicated over streaming or file-based replication. For these
reasons, hash index use is presently discouraged."

I found a thread here
http://archives.postgresql.org/pgsql-general/2005-05/msg00370.php about
hash index vs. b-tree index (PostgreSQL 8.0) mentioning some issues, like
that they:

  * are not faster than B-trees even for = comparisons,
  * aren't WAL safe,
  * have poor concurrency (require coarser locks),
  * are significantly slower to create than a b+-tree index.

In fact these statements seem to rely on the docs back in version 7.2 (see
http://www.postgresql.org/docs/7.2/static/indexes-types.html).

Has this been verified on a recent release? I can't believe that hash
performs so badly on all these points. Theory tells me otherwise, and
http://en.wikipedia.org/wiki/Hash_table seems to be a success.

Are there any plans to give hash indexes another chance (or to bury them
with a reason)?

Stefan
Re: [PERFORM] raid array seek performance
On Tue, Sep 13, 2011 at 12:13 PM, Samuel Gendler sgend...@ideasculptor.com wrote:
> I started with bonnie++ (1.96) and immediately got anomalous results (I
> think). [...] However, we are only seeing about 180 seeks/sec, which
> seems quite low.

My seek times increase when I reduce the size of the file, which isn't
surprising, since once everything fits into cache, seeks aren't dependent
on mechanical movement. However, I am seeing lots of bonnie++ results on
Google which appear to be for a file size that is 2x RAM and which show
numbers closer to 1000 seeks/sec (compared to my 180). Usually, I am seeing
a 16GB file for 8GB hosts.

So what is an acceptable random seeks/sec number for a file that is 2x
memory? And does file size make a difference independent of available RAM,
such that the enormous 379GB file that is created on my host is skewing the
results to the low end?
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote:
> Has this been verified on a recent release? I can't believe that hash
> performs so badly on all these points. Theory tells me otherwise, and
> http://en.wikipedia.org/wiki/Hash_table seems to be a success.

Hash indexes have been improved since 2005 - their performance was improved
quite a bit in 9.0. Here's a more recent analysis:

http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?
Peter Geoghegan pe...@2ndquadrant.com writes:
> On 14 September 2011 00:04, Stefan Keller sfkel...@gmail.com wrote:
>> Has this been verified on a recent release? I can't believe that hash
>> performs so badly on all these points. Theory tells me otherwise, and
>> http://en.wikipedia.org/wiki/Hash_table seems to be a success.
>
> Hash indexes have been improved since 2005 - their performance was
> improved quite a bit in 9.0. Here's a more recent analysis:
> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/

Yeah, looking into the git logs shows several separate major changes
committed during 2008, including storing only the hash code not the whole
indexed value (a big win on wide values, and it lets you index values
larger than one index page, which doesn't work in btree).

I think that the current state of affairs is still what depesz said, namely
that there might be cases where they'd be a win to use, except the lack of
WAL support is a killer. I imagine somebody will step up and do that
eventually.

The big picture though is that we're not going to remove hash indexes, even
if they're nearly useless in themselves, because hash index opclasses
provide the foundation for the system's knowledge of how to do the
datatype-specific hashing needed for hash joins and hash aggregation. And
those things *are* big wins, even if hash indexes themselves never become
so.

                        regards, tom lane
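(For anyone who wants to experiment anyway, creating one is trivial - the
table and column are placeholders, and the caveats above about the missing
WAL-logging in these releases still apply:)

  -- useful only for equality lookups; stores just the hash code, not the value
  CREATE INDEX orders_note_hash ON orders USING hash (note);

  SELECT * FROM orders WHERE note = 'some long text value';

  -- after a crash the index may be corrupt and must be rebuilt:
  REINDEX INDEX orders_note_hash;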
Re: [PERFORM] RAID Controller (HP P400) beat by SW-RAID?
On Tue, Sep 13, 2011 at 1:22 AM, Arjen van der Meijden acmmail...@tweakers.net wrote:
> On 12-9-2011 0:44 Anthony Presley wrote:
>> A few weeks back, we purchased two refurb'd HP DL360 G5's, and were
>> hoping to set them up with PG 9.0.2, running replicated. These machines
>> have (2) 5410 Xeons, 36GB of RAM, (6) 10k SAS drives, and are using the
>> HP SA P400i with 512MB of BBWC. PG is running on an ext4 (noatime)
>> partition, and the drives are configured as RAID 1+0 (it seems that
>> with this controller, I cannot do JBOD).
>
> If you really want a JBOD setup, you can try a RAID0 for each available
> disk, i.e. in your case 6 separate RAID0's. That's how we configured our
> Dell H700 - which doesn't offer JBOD either - for ZFS.

That's a pretty good idea ... I'll try that on our second server today. In
the meantime, after tweaking it a bit, we were able to get (with iozone):

                    Old       New
  Initial write    75.85    220.68
  Rewrite          63.95    253.07
  Read             45.04    171.35
  Re-read          45.00   2405.23
  Random read      27.56   1733.46
  Random write     50.70    239.47

Not as fast as I'd like, but faster than the old disks, for sure.

--
Anthony
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Ok, connection pooler it is. As I understand it, even if there are no idle
connections available we'll get the benefit of putting a turnstile on the
butcher's door.

I also ordered the book as soon as you mentioned it - the title alone was
enough to sell me on it! The book won't be for the errant sys admin who
increased the connections, it's for me - I'll use it to whack the sys admin
on the head. Thanks for the tip; the author owes you a beer - as do I.

Will the book recommend any particular connection pooler product, or is it
inappropriate to ask for a recommendation on the forum?

Carlo
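(On the pooler question: one widely used option is PgBouncer running in
transaction-pooling mode. A minimal pgbouncer.ini sketch - the database
name, paths and pool sizes below are only illustrative:)

  [databases]
  ; database name, host and port are illustrative
  etldb = host=127.0.0.1 port=5432 dbname=etldb

  [pgbouncer]
  listen_addr = 127.0.0.1
  listen_port = 6432
  auth_type = md5
  auth_file = /etc/pgbouncer/userlist.txt
  ; hand the server connection back to the pool at COMMIT/ROLLBACK
  pool_mode = transaction
  ; all 300 importers may stay connected, but only ~22 transactions run at once
  max_client_conn = 300
  default_pool_size = 22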
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
On 09/14/2011 02:56 AM, Carlo Stonebanks wrote:
> Even for 300 stateful applications that can remain connected for up to a
> week, continuously distilling data (imports)?

If they're all doing active work all that time you can still benefit from a
pooler.

Say your server can service 50 connections at optimum speed, and any more
result in reduced overall throughput. You have 300 apps with statements
they want to run. Your pooler will basically queue them, so at any one time
50 are doing work and 250 are waiting for database access.

This should _improve_ database throughput by reducing contention if 50
worker connections is your sweet spot. However, it will also increase
latency for service for those workers because they may have to wait a while
before their transaction runs, even though their transaction will complete
much faster.

You'd probably want to pool at the transaction level, so once a client gets
a connection it keeps it for the lifetime of that transaction and the
connection is handed back to the pool when the transaction commits or rolls
back.

>> you want the controller configured for write-back (with automatic
>> switch to write-through on low or failed battery, if possible).
>
> For performance or safety reasons? Since the sys admin thinks there's no
> performance benefit from this, I would like to be clear on why we should
> do this.

fsync!

If your workload is read-only, it won't help you much. If your workload is
write-heavy or fairly balanced it'll make a HUGE difference, because
fsync() on commit won't have to wait for disk I/O, only I/O to the RAID
card's cache controller.

You can also play with commit_delay and synchronous_commit to trade
guarantees of data persistence off against performance. Don't mind losing
up to 5 mins of commits if you lose power? These options are for you.

Whatever you do, do NOT set fsync=off. It should be called "eat my data if
anything goes even slightly wrong=on"; it does have legitimate uses, but
they're not yours.

>>> Can our particular setup benefit from changing the bgwriter values?
>> Probably not. If you find that your interactive users have periods
>> where queries seem to freeze for a few minutes at a time and then
>> return to normal levels of performance, you might need to make this
>> more aggressive.
>
> We actually experience this. Once again, remember the overwhelming use
> of the system is long-running import threads with continuous
> connections. Every now and then the imports behave as if they are
> suddenly taking a deep breath, slowing down. Sometimes, so much we
> cancel the import and restart (the imports pick up where they left off).

This could definitely be checkpointing issues. Enable checkpoint logging.

> What would the bg_writer settings be in this case?

You need to tune it for your workload I'm afraid. See the manual and
mailing list discussions.

--
Craig Ringer
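(A sketch of the knobs Craig mentions, in postgresql.conf form - the values
are illustrative; synchronous_commit = off can lose the last moments of
committed transactions after a crash but does not corrupt data:)

  log_checkpoints = on               # see whether the stalls line up with checkpoints
  checkpoint_completion_target = 0.9
  synchronous_commit = off           # faster commits, small window of possible loss
  #commit_delay = 10                 # microseconds; only helps many concurrent committers
  fsync = on                         # never turn this off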
Re: [PERFORM] Migrated from 8.3 to 9.0 - need to update config (re-post)
Craig Ringer wrote:

I agreed with almost your entire post, but there is one sentence with which
I take issue.

> However, it will also increase latency for service for those workers
> because they may have to wait a while before their transaction runs,
> even though their transaction will complete much faster.

My benchmarks have shown that latency also improves. See these posts for my
reasoning on why that is:

http://archives.postgresql.org/pgsql-performance/2009-03/msg00138.php
http://archives.postgresql.org/pgsql-performance/2010-01/msg00107.php

So even though there is greater latency from the attempt to *start* the
transaction until it is underway, the total latency from the attempt to
start the transaction until *completion* is less on average, in spite of
the time in queue.

Perhaps that's what you were getting at, but it sounded to me like you're
saying you sacrifice latency to achieve the throughput, and that isn't what
I've seen.

-Kevin