Re: [PERFORM] strange index performance?
Scott Marlowe wrote:
> Also, what other kind of usage patterns are going on?

For this test there was nothing else going on; it was just that one writer. The complete usage pattern is that there is one writer that writes this data, about 2 rows per second, and then a small number of readers that query for some data based on id1, id3 and id4. (To help you visualize it, think of a query where you want to know the names of all residents on the 4th floor of a particular street, independent of house number. So id1 would be the street, id2 the house number, id3 the floor number and id4 the apartment number. Such a query would only use id1, id3 and id4.)

> I wrote a simple test case for this and on a table with 100,000 entries
> already in it, then inserting 10,000 in a transaction and 10,000 outside of a
> transaction, I get insert rates of 0.1 ms and 0.5 ms respectively. With a
> table with 1,000,000 rows already in place, the insert times with all the
> fields in an index was 1.5 ms and 4.3 ms respectively.
>
> With only i1, i3, i4, val1 in the index, the numbers for a table with 100,000
> entries to start with was 0.1 ms and 0.5 ms, just like the above with the
> larger index. With a 1,000,000 initial table, inserts take 2.1 and 3.0 ms
> respectively.

How do you run that setup? Those times are amazing; my inserts take about 220 ms, constantly, from the first row in the table to the 1.2 billionth row. The client I was using for the inserts is a bare-bones use-case simulation tool I have written in Java, to test different insert and query strategies for this application. It uses JDBC COPY to do the inserts.

There is one thing you should know, though: the real table I am using has 20 value fields, where the first 6 fields contain a value, but that does not affect the difference in the execution time between the two indexes.

> So I don't think you've found the cause of your problem with the smaller
> index.

I don't quite understand what you are saying here, but I assume you are saying that the smaller index is not the cause of the increased insert time? If so, I did the test with both indexes on exactly the same db and setup, and when the index uses all four ids the insert time is larger than if I only use id1, id3 and id4.

What concerns me about your test is that you don't seem to get constant insert times, so there is a difference between the two tests, which might be why you don't see the problem I am seeing with my index.

regards

thomas
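The thread never shows the actual DDL, so the following is only a sketch of what the table and the two index variants under discussion might look like; the table name and everything beyond id1..id4/val1 are made up for illustration:

  CREATE TABLE readings (
      id1  integer,   -- street, in the analogy above
      id2  integer,   -- house number
      id3  integer,   -- floor number
      id4  integer,   -- apartment number
      val1 real       -- the real table has ~20 value columns
  );

  -- "larger" index: all four ids
  CREATE INDEX readings_all_ids ON readings (id1, id2, id3, id4);

  -- "smaller" index: only the columns the readers actually filter on
  CREATE INDEX readings_query_ids ON readings (id1, id3, id4, val1);

  -- the reader query pattern described above
  -- (one floor on one street, independent of house number)
  SELECT * FROM readings WHERE id1 = 17 AND id3 = 4 AND id4 = 2;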
Re: [PERFORM] strange index performance?
On Sun, Jan 25, 2009 at 1:14 AM, Thomas Finneid wrote:
> Scott Marlowe wrote:
>> I wrote a simple test case for this and on a table with 100,000 entries
>> already in it, then inserting 10,000 in a transaction and 10,000 outside of
>> a transaction, I get insert rates of 0.1 ms and 0.5 ms respectively. With a
>> table with 1,000,000 rows already in place, the insert times with all the
>> fields in an index was 1.5 ms and 4.3 ms respectively.
>>
>> With only i1, i3, i4, val1 in the index, the numbers for a table with
>> 100,000 entries to start with was 0.1 ms and 0.5 ms, just like the above
>> with the larger index. With a 1,000,000 initial table, inserts take 2.1 and
>> 3.0 ms respectively.
>
> How do you run that setup? Those times are amazing; my inserts take about
> 220 ms, constantly, from the first row in the table to the 1.2 billionth
> row. The client I was using for the inserts is a bare-bones use-case
> simulation tool I have written in Java, to test different insert and query
> strategies for this application. It uses JDBC COPY to do the inserts.

The setup was a simple PHP script. I've attached it to this email.

> There is one thing you should know, though: the real table I am using has 20
> value fields, where the first 6 fields contain a value, but that does not
> affect the difference in the execution time between the two indexes.

Yes, but it will increase the insert time to the table, depending very much on the size of those other fields.

>> So I don't think you've found the cause of your problem with the smaller
>> index.
>
> I don't quite understand what you are saying here, but I assume you are
> saying that the smaller index is not the cause of the increased insert time?

Yes, that's what I was saying.

> If so, I did the test with both indexes on exactly the same db and setup,
> and when the index uses all four ids the insert time is larger than if I
> only use id1, id3 and id4.

I thought it was the other way around for you, that the smaller index was slower.

> What concerns me about your test is that you don't seem to get constant
> insert times, so there is a difference between the two tests, which might be
> why you don't see the problem I am seeing with my index.

Yes, you need to look at things like increasing the number of WAL segments and checkpointing. If a checkpoint is kicking in, it's going to slow everything down.

What version of pgsql are you running? My tests were on 8.3.x on a Core 2 Duo laptop with a stock, slow 80 GB hard drive, but most likely it's lying about fsync, so that could explain some of the numbers.

I just ran it on my real server; since it's late at night, there's not much else going on. With 1M rows created ahead of time I got similar numbers:

0.12 ms per insert with all 10,000 inserted in a transaction
0.24 ms per insert with each insert being an individual transaction (i.e. no begin; commit; wrapped around them all)

This is on a machine with a 12-disk RAID-10 array under an Areca 1680ix controller with 512M of battery-backed cache. Note that the table had no other columns in it like yours does.

[attachment: test]
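The attached PHP script does not survive in the archive. As a rough sketch of the kind of timing comparison being described, assuming the table from the earlier sketch, a psql session could do something like this (per-insert cost is the elapsed time divided by the number of rows):

  \timing on

  -- case 1: 10,000 single-row inserts wrapped in one transaction
  BEGIN;
  INSERT INTO readings (id1, id2, id3, id4, val1) VALUES (1, 1, 1, 1, 0.5);
  -- ... the driving script repeats the INSERT 10,000 times ...
  COMMIT;

  -- case 2: the same 10,000 inserts issued without BEGIN/COMMIT,
  -- so each one runs as its own transaction (autocommit)

On the checkpoint point: in 8.2/8.3 the relevant setting is checkpoint_segments in postgresql.conf (default 3); raising it for a write-heavy test reduces how often a checkpoint lands in the middle of a run. The right value is workload-dependent.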
Re: [PERFORM] SSD performance
da...@lang.hm writes:
> they currently have it do a backup immediately on power loss (which is a safe
> choice as the contents won't be changing without power), but it then powers
> off (which is not good for startup time afterwards)

So if you have a situation where it's power cycling rapidly, each iteration drains the battery by the time it takes to save the state but only charges it for the time the power is on. I wonder how many iterations that gives you.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
Re: [PERFORM] SSD performance
On Sun, 25 Jan 2009, Gregory Stark wrote:
> da...@lang.hm writes:
>> they currently have it do a backup immediately on power loss (which is a
>> safe choice as the contents won't be changing without power), but it then
>> powers off (which is not good for startup time afterwards)
>
> So if you have a situation where it's power cycling rapidly, each iteration
> drains the battery by the time it takes to save the state but only charges
> it for the time the power is on. I wonder how many iterations that gives you.

Good question. Assuming that it's smart enough not to start a save if it didn't finish doing a restore, and going from the timings in the article (~20 min save, ~15 min load and 4 hour battery life), you would get ~12 cycles from the initial battery, plus whatever you could get from the battery charging (~3 hours during the initial battery time).

If the battery could be fully charged in 3 hours it could keep doing this indefinitely. If it takes 6 hours it would only get a half charge, so 12+6+3+1=22 cycles. But even the initial 12 cycles is long enough that you should probably be taking action by then.

In most situations you are going to have a UPS on your system anyway, and it will have the same type of problem (but usually with _much_ less than 4 hours worth of operation to start with). So while you could lose data from intermittent power, I think you would be far more likely to lose data due to a defective battery or the CF card not being fully seated, or something like that.

David Lang
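Spelling out the arithmetic behind those cycle counts (using only the timings quoted from the article): a 4-hour battery at ~20 minutes per save gives 240 / 20 = 12 saves on the initial charge. While those 12 cycles run, the power is on for roughly 12 x 15 minutes, about 3 hours of charging. If a full recharge takes about 6 hours, each 3 hours of charging restores half a battery, so the sequence runs roughly 12 + 6 + 3 + 1 = 22 cycles before the battery is spent; if a full recharge takes 3 hours or less, the device can keep cycling indefinitely.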
Re: [PERFORM] postgresql 8.3 tps rate
Greg Smith wrote:
> On Thu, 22 Jan 2009, Alvaro Herrera wrote:
>> Also, I think you should set the "scale" in the prepare step (-i) at least
>> as high as the number of clients you're going to use. (I dimly recall some
>> recent development in this area that might mean I'm wrong.)
>
> The idea behind that maxim (clients >= scale) is that locking on the smaller
> tables will bottleneck results if you don't follow that advice. It's a bit
> messier than that though. Increasing the scale will also make the database
> larger, and once it gets bigger than available RAM your results are going to
> dive hard because of that, more so than the locking would have held you back.
>
> All kind of irrelevant for Ibrahim's case, because if you're not getting
> more than 50 MB/s out of your disks the pgbench results are kind of moot
> anyway -- there's a larger problem to sort out first.
>
> --
> * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

IIRC this is a FreeBSD system, not Linux. Could there be some filesystem performance issue here? I know zero about FreeBSD filesystems.

Also, is there a separate driver machine you can use to run pgbench? The pgbench client uses resources, which could lower your throughput.

--
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.
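For what it's worth, running the client from a second box is just a matter of pointing pgbench at the server; the host, user and database names below are placeholders, and the scale/client/transaction counts are only examples:

  pgbench -i -s 100 -h dbserver -U postgres pgbench   # initialize once
  pgbench -h dbserver -U postgres -c 8 -t 1000 pgbench

Run from a separate machine, the pgbench client's own CPU use stays off the database server.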
Re: [PERFORM] postgresql 8.3 tps rate
[snip]

I'm actually doing some very similar testing and getting very similar results. My disk is a single Seagate Barracuda 7200 RPM SATA (160 GB). The OS is openSUSE 11.1 (2.6.27 kernel) with the "stock" PostgreSQL 8.3.5 RPM. I started out running pgbench on the same machine but just moved the driver to another one, trying to get better results. The other driver machine is quite small -- 512 MB, 1.6 GHz -- so I might need to shut down the desktop and X on it.

But the real mystery is this: I have two XFS partitions. Let's call them sda5 and sda6. The PostgreSQL install put the database in /var/lib/pgsql, which is on sda5. But I created a tablespace on sda6 specifically for the database that pgbench is using, and put that database there. At least that's what pgadmin3 is telling me I did.

But when I run pgbench, I see intense I/O on *both* partitions sometimes, and other times I see it *only* on sda5. I can understand the "both" times -- I didn't move any "system-level" things like the write-ahead logs. But what I can't understand is the periods when it isn't using sda6, where the tablespace is.

Anyhow, here's a short segment of the results I'm getting. Every run is "transaction type: TPC-B (sort of)" at scaling factor 100, with 100 transactions per client and all transactions processed:

  clients   tps (incl. connections)   tps (excl. connections)
     2            37.360964                 37.430501
     4            51.768918                 51.985556
     6            51.462103                 51.734119
     8            44.316328                 44.473483
    10            44.750672                 44.910703
    12            45.048743                 45.205084
    14            26.849217                 26.916643
    16            11.187072                 11.198109
    18            38.183978                 38.301026
    20            35.012091                 35.109165
    22            28.286106                 28.350341
    24            29.285593                 29.358284
    26            29.237558                 29.308422

[remaining output truncated in the archive]
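For reference, the tablespace arrangement described above would have been set up with something along these lines; the path and the names are illustrative, not taken from the original post:

  CREATE TABLESPACE pgbench_space LOCATION '/mnt/sda6/pgdata';
  CREATE DATABASE pgbench TABLESPACE = pgbench_space;

Note that the write-ahead log always lives under the cluster's own data directory (pg_xlog, here on sda5) regardless of any tablespaces, which is one reason I/O shows up on both partitions.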
Re: [PERFORM] postgresql 8.3 tps rate
On Sun, 25 Jan 2009, M. Edward (Ed) Borasky wrote:
> I started out running pgbench on the same machine but just moved the driver
> to another one trying to get better results.

That normally isn't necessary until you get to the point where you're running thousands of transactions per second. The CPU load of pgbench isn't that big, and moving it to a network client does add its own overhead.

> I can understand the "both" times -- I didn't move any "system-level" things
> like the write-ahead logs. But what I can't understand is the periods when
> it isn't using sda6, where the tablespace is.

Writes to the database are buffered by the shared_buffers mechanism. If you dirty a block, it has to be unused for a bit after that before it will be written out. The OS also buffers writes to the database disk. The combination of the two means that you can sometimes write things that won't turn into physical disk I/O to the database for a while. That is not true of the WAL, which will always be generating activity when running pgbench.

> number of transactions actually processed: 3000/3000

Generally, pgbench results start to be useful when you've pushed through around 100,000 transactions or run for a few minutes. It looks like your largest client test might be approaching that threshold only because the TPS rate is so slow.

I'm not sure what is going on with your system, but the advice showing up earlier in this thread is well worth heeding here: if you haven't thoroughly proven that your disk setup works as expected on simple I/O tests such as dd and bonnie++, you shouldn't be running pgbench yet. It's not a transparent benchmark unless you really understand what it's doing, and you can waste endless time chasing phantom database setup problems that way when you should be staring at hardware, driver, or OS level ones instead. Do you know the disks are working as they should here? Does the select-only pgbench give you reasonable results?

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
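The select-only test mentioned at the end is pgbench's -S mode; a typical invocation, with placeholder client and transaction counts, would be:

  pgbench -S -c 8 -t 10000 pgbench

If that run looks healthy while the default TPC-B-like test does not, the bottleneck is more likely on the write/fsync side than in raw disk throughput.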
[PERFORM] [PERFORMANCE] Buying hardware
So, the eternal problem of what hardware to buy. I really miss a hardware buying guide for database servers, now that I'm about to buy one. Shouldn't some general guidelines, mixed with ranked lists of which hardware is best, be on the wiki?

This is of course very difficult to advise about, but shouldn't general advice be something like:

1) A fast CPU, but not on the bleeding edge.
2) As much RAM as you can fit into the machine without paying way too much for it. Use the fastest RAM you can find (what is it called today? PC 1333 MHz or something like that?).
3) Fast hard disks. Best is RAID X (what RAID level should one use?).
4) Use software RAID unless you have the money to buy a RAID controller, in which case here is the ranking of them, ordered by quality, and a general comment on exactly how much better each one is than the one below it on the list ;-)
Re: [PERFORM] [PERFORMANCE] Buying hardware
On Sun, Jan 25, 2009 at 2:21 PM, A B wrote:
> So, the eternal problem of what hardware to buy. I really miss a hardware
> buying guide for database servers now that I'm about to buy one. Some
> general guidelines mixed with ranked lists of which hardware is best,
> shouldn't that be on the wiki?
>
> This is of course very difficult to advise about, but shouldn't general
> advice be like:
> 1) A fast CPU but not on the bleeding edge

More cores are more important than fewer, faster ones.

> 2) As much RAM as you can fit into the machine without paying way too much
> for it. Use the fastest RAM you can find (what is it called today? PC 1333
> MHz or something like that?)

The speed of the RAM isn't as important as the amount, and the speed of the chipset on the motherboard.

> 3) Fast hard disks. Best is RAID X (what RAID should one use?)

Again, more slower disks > fewer faster ones. RAID-10 is almost always the right choice.

> 4) Use software RAID unless you have the money to buy a RAID controller, in
> which case here is the ranking of them

Areca and 3ware/Escalade are the two best controllers for the money out right now. They tend to take turns being the absolute best as they release new cards. Newer Arecas (the 1680 series) use an ethernet port for traps and such, so there's no need for special software that might be kernel-version dependent. Both cost about the same for their top-of-the-line cards. Make sure you have battery-backed cache.
Re: [PERFORM] strange index performance?
Scott Marlowe wrote:
> On Sun, Jan 25, 2009 at 1:14 AM, Thomas Finneid wrote:
>> Scott Marlowe wrote:
>>> So I don't think you've found the cause of your problem with the smaller
>>> index.

Ok, I understand, but why don't you think the index is the problem?

>> If so, I did the test with both indexes on exactly the same db and setup.
>> And when the index uses all four ids the insert time is larger than if I
>> only use id1,3,4.
>
> I thought it was the other way around for you, that the smaller index was
> slower.

Sorry for the mistake, I meant to say the smaller index causes the slowest insert time.

> What version pgsql are you running? My tests were on 8.3.x on a

I am running on pg 8.2.x (Kubuntu x64) with 8 GB RAM, 8 Opteron cores and 8 disks on an Areca RAID controller.

> 0.12 ms per insert with all 10,000 inserted in a transaction
> 0.24 ms per insert with each insert being individual transactions
> (i.e. no begin; commit; wrapped around them all) This is on a machine with a
> 12 disk RAID-10 array under an Areca 1680ix controller with 512M battery
> backed cache. Note that the table had no other columns in it like yours does.

Are you sure you mean to say 0.12 ms and not 0.12 seconds? My server also uses an Areca RAID controller (the 8-disk controller model), but it does not matter how many disks are in the stripe, it's still the same performance. So, if you get that performance then I must have set up postgres, the OS or the RAID controller wrong. What are the most important configurations you did to get that performance?
Re: [PERFORM] postgresql 8.3 tps rate
Greg Smith wrote:
> I'm not sure what is going on with your system, but the advice showing up
> earlier in this thread is well worth heeding here: if you haven't thoroughly
> proven that your disk setup works as expected on simple I/O tests such as dd
> and bonnie++, you shouldn't be running pgbench yet. It's not a transparent
> benchmark unless you really understand what it's doing, and you can waste
> endless time chasing phantom database setup problems that way when you
> should be staring at hardware, driver, or OS level ones instead. Do you know
> the disks are working as they should here? Does the select-only pgbench give
> you reasonable results?

Actually, this isn't so much a 'pgbench' exercise as it is a source of 'real-world application' data for my Linux I/O performance visualization tools. I've done 'iozone' tests, though not recently. But what I'm building is an I/O analysis toolset, not a database application. So I am "staring at hardware, driver or OS level" issues. :) To be more precise, I'm using block I/O layer tools, which are "beneath" the filesystem layer but "above" the driver and hardware levels.

What you might find interesting is that, when I presented the earlier (iozone) test results at the Computer Measurement Group meeting in Las Vegas in December, there were two disk drive engineers in the audience, from, IIRC, Fujitsu. When they saw my results showing all four Linux schedulers yielding essentially the same performance metrics using some fairly tight statistical significance tests, they told me that it was because the drive was re-ordering operations according to its own internal scheduler! I haven't had a chance to investigate that in any detail yet, but I assume that they knew what they were talking about.

The drive in question is an off-the-shelf unit that I got at CompUSA as part of a system that I had them build. In any event, it's *not* a "server-grade I/O subsystem"; it's a single disk drive designed for "home desktop PC" use cases. In short, I don't expect server-grade TPS values.

I did capture some 'iostat' data after I moved the 'pgbench' database back into the main partition where the rest of the PostgreSQL database lives. As I expected, the device and partition utilizations were in the high 90 percent range. I don't have the bandwidth figures from 'iostat' handy, but if the utilization is 98.4 percent, they may be the best I can get out of the drive with the xfs filesystem and the cfq scheduler. And the choice of scheduler might not matter. And the choice of filesystem might not matter. I may be getting all the drive can do.

--
M. Edward (Ed) Borasky

I've never met a happy clam. In fact, most of them were pretty steamed.
Re: [PERFORM] strange index performance?
On Sun, Jan 25, 2009 at 2:54 PM, Thomas Finneid wrote:
> Scott Marlowe wrote:
>> On Sun, Jan 25, 2009 at 1:14 AM, Thomas Finneid wrote:
>>> Scott Marlowe wrote:
>>>> So I don't think you've found the cause of your problem with the smaller
>>>> index.
>
> Ok, I understand, but why don't you think the index is the problem?

Because on any decent hardware you should be able to insert a single row in way under 200 ms, whether it's got an index on it or not.

>> I thought it was the other way around for you, that the smaller index was
>> slower.
>
> Sorry for the mistake, I meant to say the smaller index causes the slowest
> insert time.

I'm guessing that you just had more data in the table or something by the time you tested that, or some cron job was running in the background, or some other issue, not the index.

>> What version pgsql are you running? My tests were on 8.3.x on a
>
> I am running on pg 8.2.x (Kubuntu x64) with 8 GB RAM, 8 Opteron cores and 8
> disks on an Areca RAID controller

Quite a similar machine. Write-back cache with battery backup on the controller? A really old Areca like an 11xx series, or a newer 12xx or 16xx?

>> 0.12 ms per insert with all 10,000 inserted in a transaction
>> 0.24 ms per insert with each insert being individual transactions
>> (i.e. no begin; commit; wrapped around them all) This is on a machine with
>> a 12 disk RAID-10 array under an Areca 1680ix controller with 512M battery
>> backed cache. Note that the table had no other columns in it like yours
>> does.
>
> Are you sure you mean to say 0.12 ms and not 0.12 seconds? My server also

0.12 seconds per insert is pretty slow; 10 inserts would take over a second. I'm inserting 10,000 rows in about 2 seconds, so each insert is definitely in the 0.12 millisecond range.

> uses an Areca RAID controller (the 8-disk controller model), but it does not
> matter how many disks are in the stripe, it's still the same performance.
> So, if you get that performance then I must have set up postgres, the OS or
> the RAID controller wrong. What are the most important configurations you
> did to get that performance?

Hard to say. What does bonnie++ have to say about the performance of your RAID array?
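For anyone wanting to answer that question, a typical bonnie++ run against a directory on the array looks roughly like this; the path, size and user are placeholders, and the -s value should be at least twice the machine's RAM so the OS cache doesn't hide the disks:

  bonnie++ -d /mnt/raid/bonnie -s 16g -u postgres

The sequential block read/write rates and random seeks per second it reports are the numbers worth lining up against the insert timings discussed in this thread.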