[PERFORM] raid setup for db
Hi, I am wondering what stripe size, on a RAID 0, is the most suitable for Postgres 8.2? I read a performance tutorial by Bruce Momjian and it suggests setting the stripe size to the same block size as pg uses ( http://momjian.us/main/writings/pgsql/hw_performance/index.html ), but I want to check whether I have understood this correctly. Are there any other hot configuration tips I should consider, or does anybody have any suggestions for other RAID configuration articles?

regards
thomas

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
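As a quick sanity check (assuming you have psql access to the box), Postgres will report the block size it was compiled with, which is the figure the tutorial matches the stripe size against:

```sql
-- Read-only preset; stock builds report 8192 (8 KB).
-- The value is fixed at compile time, so it cannot drift per database.
SHOW block_size;
```

Note that matching the stripe size to the block size is only a starting point: sequential scans read many consecutive blocks at once, so it is worth benchmarking a few stripe sizes (e.g. 8 KB up to 64 KB) with your own workload rather than assuming the smallest match wins.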
Re: [PERFORM] increase index performance
First off, is there a way to pre-filter some of this data, by a view, temporary table, partitioned indexes or something? Secondly, one of the problems seems to be the size of the data and its index; how can I calculate how much space a particular part of the index needs in memory? Maybe I could rearrange things a bit so it fits better inside pages and so on. Thirdly, I was a bit unclear, and this was the best example I could think of (my client probably doesn't want me to talk about this at all... hence the contrived example):

85 city_ids, 2000 street_ids per city, 10 house_ids per street, 500 floor_ids per house

Now it should have the correct data distribution and the correct cardinality. In this particular query I am interested in all streets in a city that have the specific house id and the specific floor id. By specifying city_id, house_id and floor_id I should get all street_ids that match.

The example you gave, Greg, assumed I wanted to follow cardinality, but I need to skip the second field in order to get the right query. So pulling the data based on the first two fields, city and street, would just give me data for a single street, when I want it for all streets.

Greg Smith wrote:
On Tue, 12 May 2009, Thomas Finneid wrote:
on a database with 260 GB of data and an index size of 109GB on separate raid disks. there are 85 city_ids, 2000 street_ids per city, 20 house_ids per street per city, 5 floor_ids per house per street per city

You should test what happens if you reduce the index to just being (city_id, street_id). Having all the fields in there makes the index larger, and it may end up being faster to just pull all of the ~100 data rows for a particular (city_id, street_id) using the smaller index and then filter out just the ones you need. Having a smaller index to traverse also means that you'll be more likely to keep all the index blocks in the buffer cache moving forward.
A second level improvement there is to then CLUSTER on the smaller index, which increases the odds you'll get all of the rows you need by fetching only a small number of data pages.

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
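Greg's two suggestions might look like the following, using a hypothetical table name (`addresses`) since the real schema isn't shown in the thread; treat it as a sketch, not tested DDL:

```sql
-- Smaller two-column index: fewer index blocks to traverse and cache.
CREATE INDEX addr_city_street_idx ON addresses (city_id, street_id);

-- Physically reorder the table along that index, so the ~100 rows for a
-- given (city_id, street_id) land on a handful of adjacent heap pages.
-- On 8.2 the syntax is: CLUSTER addr_city_street_idx ON addresses;
CLUSTER addresses USING addr_city_street_idx;

-- Refresh planner statistics after the physical reorder.
ANALYZE addresses;
```

Keep in mind CLUSTER is a one-time reorder (it takes an exclusive lock and does not maintain the order under later writes), so it suits mostly-read tables like this one.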
Re: [PERFORM] AMD Shanghai versus Intel Nehalem
Just realized I made a mistake: I was under the impression that Shanghai CPUs had 8xxx numbers while Barcelonas had 23xx numbers. I was wrong; it appears the 8xxx numbers are for 4+ socket servers while the 23xx numbers are for 2 or fewer sockets. So there are several quite affordable Shanghai CPUs out there, and many of the ones I quoted as Barcelonas are in fact Shanghais with the larger 6M L2 cache.
Re: [PERFORM] AMD Shanghai versus Intel Nehalem
We have a dual E5540 with 16GB (I think 1066MHz) memory here, but no AMD Shanghai. We haven't done PostgreSQL benchmarks yet, but given our previous experience, PostgreSQL should be similarly faster compared to MySQL. Our database benchmark is actually mostly a CPU/memory benchmark.

Comparing the results of the dual E5540 (2.53GHz with HT enabled) to a dual Intel X5355 (2.6GHz quad core, from 2007), the peak load has increased from somewhere between 7 and 10 concurrent clients to somewhere around 25, suggesting better hardware scalability. With the 25 concurrent clients we handled 2.5 times the number of queries/second compared to the 7-concurrent-client score for the X5355, both in MySQL 5.0.41. At 7 concurrent clients we still had 1.7 times the previous result.

I'm not really sure how the Shanghai CPUs compare to those older X5355s; the AMDs should be faster, but by how much? I've no idea if we'll get a Shanghai to compare it with, but we will get a dual X5570 soon on which we'll repeat some of the tests, so that should at least help a bit with scaling the X5570 results around the world down.

Best regards, Arjen

On 12-5-2009 20:47 Scott Marlowe wrote:
Anyone on the list had a chance to benchmark the Nehalems yet? I'm primarily wondering if their promise of performance from 3 memory channels holds up under typical pgsql workloads. I've been really happy with the behavior of my AMD Shanghai based server under heavy loads, but if the Nehalems' much touted performance increase translates to pgsql, I'd like to know.
Re: [PERFORM] raid setup for db
Thomas Finneid wrote:
Hi I am wondering what stripe size, on a raid 0, is the most suitable for postgres 8.2?

Hello. RAID 0 for a database? This is a disaster waiting to happen. Are you sure you want to use RAID 0?

regards
--
Rafael Martinez, r.m.guerr...@usit.uio.no
Center for Information Technology Services, University of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
Re: [PERFORM] raid setup for db
Don't worry about it, this is just for performance testing.

thomas

Thomas Finneid wrote:
Hi I am wondering what stripe size, on a raid 0, is the most suitable for postgres 8.2?

Hello. RAID 0 for a database? This is a disaster waiting to happen. Are you sure you want to use RAID 0?

regards
--
Rafael Martinez, r.m.guerr...@usit.uio.no
Center for Information Technology Services, University of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
Re: [PERFORM] Any better plan for this query?..
On 5/12/09, Robert Haas robertmh...@gmail.com wrote:
On Tue, May 12, 2009 at 1:00 PM, Dimitri dimitrik...@gmail.com wrote:
On MySQL there are no changes if I set the number of sessions in the config file to 400 or to 2000 - for 2000 it'll just allocate more memory.

I don't care whether the setting affects the speed of MySQL. I want to know if it affects the speed of PostgreSQL.

The problem is they both have a max_connections parameter, so as you asked for MySQL I answered for MySQL; I did not test yet for PostgreSQL, it will be in the next series..

After the latest fix with default_statistics_target=5, version 8.3.7 is running as fast as 8.4, even if 8.4 is a little bit slower. I understand your position with a pooler, but I also want you to think about the idea that 128-core systems will become commodity servers very soon, and to use these cores at their full power you'll need a database engine capable of running 256 users without a pooler, because a pooler will not help you here anymore..

So what? People with 128-core systems will not be running trivial joins that return in 1-2ms and have one-second think times between them. And if they are, and if they have nothing better to do than worry about whether MySQL can process those queries in 1/2000th of the think time rather than 1/1000th of the think time, then they can use MySQL. If we're going to worry about performance on 128-core systems, we would be much better advised to put our efforts into parallel query execution than into how many microseconds it takes to execute very simple queries.

Do you really think nowadays, for example, a web forum application having PG as a backend will have queries running slower than 1-2ms to print a thread message within your browser??? or banking transactions??

Still, I have no problem with making PostgreSQL faster in the case you're describing. I'm just not interested in doing it on my own time for free.
I am sure there are a number of people who read this list regularly who would be willing to do it for money, though. Maybe even me. :-)

...Robert

You don't need to believe me, but I'm doing it for free - I still have my own work to finish in parallel :-)) And at the same time I don't see any other way to learn and improve my knowledge, but nobody is perfect :-))

Rgds, -Dimitri
Re: [PERFORM] Timestamp index not used in some cases
On Tuesday 12 May 2009 12:55:14 Scott Marlowe wrote:
On Tue, May 12, 2009 at 3:00 AM, Евгений Василев evasi...@jarcomputers.com wrote:
I have the following table:

CREATE TABLE temp.tmp_135528
(
  id integer NOT NULL,
  prid integer,
  group_id integer,
  iinv integer,
  oinv integer,
  isum numeric,
  osum numeric,
  idate timestamp without time zone,
  odate timestamp without time zone,
  CONSTRAINT t_135528_pk PRIMARY KEY (id)
) WITH (OIDS=FALSE);

With index:

CREATE INDEX t_135528 ON temp.tmp_135528 USING btree (idate, group_id, osum, oinv);

When the following query is executed the index is not used:

EXPLAIN SELECT id, osum FROM temp.tmp_135528
WHERE idate <= '2007-05-17 00:00:00'::timestamp
AND group_id = '13' AND osum = '19654.45328' AND oinv = -1

Seq Scan on tmp_135528 (cost=0.00..7022.36 rows=1166 width=11)
  Filter: ((idate <= '2007-05-17 00:00:00'::timestamp without time zone) AND (osum = 19654.45328) AND (group_id = 13) AND (oinv = (-1)))
(2 rows)

When `idate <= '2007-05-17 00:00:00'::timestamp` is changed to `idate = '2007-05-17 00:00:00'::timestamp` then the index is used:

EXPLAIN SELECT id, osum FROM temp.tmp_135528
WHERE idate = '2007-05-17 00:00:00'::timestamp
AND group_id = '13' AND osum = '19654.45328' AND oinv = -1;

Index Scan using t_135528 on tmp_135528 (cost=0.00..462.61 rows=47 width=11)
  Index Cond: ((idate = '2007-05-17 00:00:00'::timestamp without time zone) AND (group_id = 13) AND (osum = 19654.45328) AND (oinv = (-1)))
(2 rows)

Why can I not use the index in a range comparison on timestamp?

You can. But in this instance one query is returning 47 rows while the other is returning 1166 rows (or the planner thinks it is). There's a switchover point where it's cheaper to seq scan. You can adjust this point up and down by adjusting various cost parameters.
random_page_cost is commonly lowered to the 1.5 to 2.0 range, and effective_cache_size is normally set higher, to match the cache in the kernel plus the shared_buffers size.

Thank you, this worked like a charm.
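The tuning loop Scott describes can be tried per-session before touching postgresql.conf. A sketch using the thread's own query (the range form that was seq-scanning; the 2GB figure is a placeholder — set it to roughly shared_buffers plus the OS page cache on your box):

```sql
-- Experiment in one session first; persist the values only once satisfied.
SET random_page_cost = 2.0;        -- default 4.0; lower favors index scans
SET effective_cache_size = '2GB';  -- placeholder: shared_buffers + OS cache

EXPLAIN SELECT id, osum
FROM temp.tmp_135528
WHERE idate <= '2007-05-17 00:00:00'::timestamp
  AND group_id = 13
  AND osum = 19654.45328
  AND oinv = -1;

-- Back to the server defaults for this session.
RESET random_page_cost;
RESET effective_cache_size;
```

If the EXPLAIN flips from Seq Scan to Index Scan at values you consider realistic for your hardware, that is the signal to make the change permanent.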
Re: [PERFORM] Any better plan for this query?..
Hi Scott,

On 5/12/09, Scott Carey sc...@richrelevance.com wrote:
Although nobody wants to support it, he should try the patch that Jignesh K. Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it makes 32 cores much faster, then we have a smoking gun. Although everyone here is talking about this as an 'unoptimal' solution, the fact is there is no evidence that a connection pooler will fix the scalability from 16 to 32 cores. Certainly a connection pooler will help most results, but it may not fix the scalability problem.

A question for Dimitri: What is the scalability from 16 to 32 cores at the 'peak' load that occurs near 2x the CPU count? Is it also poor? If this is also poor, IMO the community here should not be complaining about this unoptimal case -- a connection pooler at that stage does little, and prepared statements will increase throughput but not likely alter scalability.

I'm attaching a small graph showing the TPS level on PG 8.4 depending on the number of cores (X-axis is the number of concurrent users, Y-axis is the TPS number). As you may see, the TPS increase is near linear while moving from 8 to 16 cores, while on 32 cores it's growing slightly differently; what is unclear is why the TPS level stays limited to 11,000 TPS on 32 cores. And it's a pure read-only workload.

If that result scales, then the short term answer is a connection pooler. In the tests that Jignesh ran -- making the ProcArrayLock faster helped the case where connections = 2x the CPU core count quite a bit. The thread about the CPU scalability is Proposal of tunable fix for scalability of 8.4, originally posted by Jignesh K. Shah j.k.s...@sun.com, March 11 2009.

It would be very useful to see results of this benchmark with:

1. A Connection Pooler

will not help, as each client is *not* disconnecting/reconnecting during the test, and PG is keeping up well even with 256 users. And the TPS limit is reached already at 64 users; I don't think a pooler will help here.

2.
Jignesh's patch

I've already tested it and it did not help in my case because the real problem is elsewhere.. (however, I did not test it yet with my latest config params)

3. Prepared statements

yes, I'm preparing this test.

#3 is important, because prepared statements are ideal for queries that perform well with low statistics targets, and not ideal for those that require high statistics targets. Realistically, an app won't have more than a couple dozen statement forms to prepare. Setting the default statistics target to 5 is just a way to make some other query perform badly.

Agree, but as you may have a different statistics target *per* table it should not be a problem. What is sure - all time spent on parse and planner will be removed here, and the final time should be pure execution.

Rgds, -Dimitri

On 5/12/09 10:53 AM, Alvaro Herrera alvhe...@commandprompt.com wrote:
Andres Freund wrote:
Naturally it would still be nice to be good in this not optimal workload...

I find it hard to justify wasting our scarce development resources on optimizing such a contrived workload.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

attachment: Hist_coresALL_RW0.ccrnone.pgsql-8.4beta1-buf4096-tps_avg-1.gif
Re: [PERFORM] Any better plan for this query?..
I'm also confused, but it seems the discussion is producing other ideas too :-) But yes, each client is connecting to the database server only *once*. To be precise, this is how the test runs:

- 1 client is started => 1 in total
- sleep ...
- 1 more client is started => 2 in total
- sleep ..
- 2 more clients are started => 4 in total
- sleep ..
...
...
=> 256 in total
- sleep ...
- kill clients

So I'm even able to monitor how each new client impacts all the others. The test kit is flexible enough to prepare any kind of stress situation.

Rgds, -Dimitri

On 5/12/09, Glenn Maynard glennfmayn...@gmail.com wrote:
I'm sorry, but I'm confused. Everyone keeps talking about connection pooling, but Dimitri has said repeatedly that each client makes a single connection and then keeps it open until the end of the test, not that it makes a single connection per SQL query. Connection startup costs shouldn't be an issue. Am I missing something here? test(N) starts N clients, each client creates a single connection and hammers the server for a while on that connection. test(N) is run for N=1,2,4,8...256. This seems like a very reasonable test scenario.

--
Glenn Maynard
Re: [PERFORM] Any better plan for this query?..
On MySQL there are no changes if I set the number of sessions in the config file to 400 or to 2000 - for 2000 it'll just allocate more memory.

After the latest fix with default_statistics_target=5, version 8.3.7 is running as fast as 8.4, even if 8.4 is a little bit slower. I understand your position with a pooler, but I also want you to think about the idea that 128-core systems will become commodity servers very soon, and to use these cores at their full power you'll need a database engine capable of running 256 users without a pooler, because a pooler will not help you here anymore..

Rgds, -Dimitri

On 5/12/09, Robert Haas robertmh...@gmail.com wrote:
On Tue, May 12, 2009 at 11:22 AM, Dimitri dimitrik...@gmail.com wrote:
Robert, what I'm testing now is 256 users max. The workload is growing progressively from 1, 2, 4, 8 ... to 256 users. Of course the max throughput is reached at a number of users equal to 2 * the number of cores, but what's important for me here - the database should continue to keep up with the workload! - response time regressing, but the throughput should remain near the same. So, do I really need a pooler to keep 256 users working?? - I don't think so, but please, correct me.

Not an expert on this, but there has been a lot of discussion of the importance of connection pooling in this space. Is MySQL still faster if you lower max_connections to a value that is closer to the number of users, like 400 rather than 2000?

BTW, I did not look to put PostgreSQL in bad conditions - the test is the test, and as I said 2 years ago PostgreSQL outperformed MySQL on the same test case, and there was nothing done within the MySQL code to improve it explicitly for db_STRESS.. And I'm staying pretty honest when I'm testing something.

Yeah, but it's not really clear what that something is. I believe you said upthread that PG 8.4 beta 1 is faster than PG 8.3.7, but PG 8.4 beta 1 is slower than MySQL 5.4 whereas PG 8.3.7 was faster than some older version of MySQL.
So PG got faster and MySQL got faster, but they sped things up more than we did. If our performance were getting WORSE, I'd be worried about that, but the fact that they were able to make more improvement on this particular case than we were doesn't excite me very much. Sure, I'd love it if PG were even faster than it is, and if you have a suggested patch please send it in... or if you want to profile it and send the results, that would be great too. But I guess my point is that the case of a very large number of simultaneous users with pauses-for-thought between queries has already been looked at in the very recent past in a way that's very similar to what you are doing (and by someone who works at the same company you do, no less!) so I'm not quite sure why we're rehashing the issue.

...Robert
Re: [PERFORM] increase index performance
On Tue, 12 May 2009, Greg Smith wrote:
You should test what happens if you reduce the index to just being (city_id,street_id).

I think you're missing the point a little here. The point is that Thomas is creating an index on (city_id, street_id, house_id, floor_id) and running a query on (city_id, house_id, floor_id).

Thomas, the order of columns in the index matters. The index is basically a tree structure, which resolves the left-most column before resolving the column to the right of it. So to answer your query, it will resolve city_id, then it will have to scan almost all of the tree under that, because you are not constraining on street_id. A much better index to answer your query is (city_id, house_id, floor_id) - then it can just look the values up straight away. Instead of the index returning the city's whole subtree of entries to check, it will return just the ~2000 matching rows.

Matthew

--
An ant doesn't have a lot of processing power available to it. I'm not trying to be speciesist - I wouldn't want to detract you from such a wonderful creature, but, well, there isn't a lot there, is there? -- Computer Science Lecturer
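Matthew's point can be sketched as follows; the table name and literal ids are made up for illustration, since the real schema isn't public:

```sql
-- Index columns ordered to match the query's equality predicates,
-- left to right, so every predicate binds a leftmost index column:
CREATE INDEX addr_city_house_floor ON addresses (city_id, house_id, floor_id);

-- The btree descent now uses all three values at once and lands directly
-- on the matching entries (roughly one per street), instead of scanning
-- the entire subtree under city_id:
SELECT street_id
FROM addresses
WHERE city_id = 42 AND house_id = 7 AND floor_id = 3;
```

If the same table must also serve (city_id, street_id) lookups, the two indexes can coexist; the planner chooses per query, at the cost of extra index maintenance on writes.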
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
Greg Stark wrote:
On Mon, May 11, 2009 at 5:05 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote:
Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)

well not really - while it is fairly easy to get postgresql running on a PS3 it is not a fast platform. While the main CPU there is a pretty fast Power based core it only has 256MB of RAM and a single SATA disk available (though you could add some USB disks).

The nice thing about it is that TPC-C and other benchmarks all specify their bottom-line number in some unit like transactions per second PER DOLLAR. So using a PS3 should be able to get ridiculously good results compared to expensive server hardware...

I kinda doubt that - the PS3 is certainly not server grade hardware, so you can only compare it to a desktop, and I would bet that the typical desktop you get for the 400€ (you can get 4GB RAM and a quad-core CPU for that) price of a PS3 is going to outperform it significantly for almost every workload...

Stefan
Re: [PERFORM] Any better plan for this query?..
On Tue, May 12, 2009 at 11:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Matthew Wakeling matt...@flymine.org writes:
On Tue, 12 May 2009, Simon Riggs wrote:
No, we spawn then authenticate.

But you still have a single thread doing the accept() and spawn. At some point (maybe not now, but in the future) this could become a bottleneck given very short-lived connections.

More to the point, each backend process is a pretty heavyweight object: it is a process, not a thread, and it's not going to be good for much until it's built up a reasonable amount of stuff in its private caches. I don't think the small number of cycles executed in the postmaster process amount to anything at all compared to the other overhead involved in getting a backend going.

AIUI, whenever the connection pooler switches to serving a new client, it tells the PG backend to DISCARD ALL. But why couldn't we just implement this same logic internally? IOW, when a client disconnects, instead of having the backend exit immediately, have it perform the equivalent of DISCARD ALL and then stick around for a minute or two and, if a new connection request arrives within that time, have the old backend handle the new connection... (There is the problem of how to get the file descriptor returned by the accept() call in the parent process down to the child... but I think that at least on some UNIXen there is a way to pass an fd through a socket, or even dup it into another process by opening it from /proc/fd)

...Robert
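For reference, the per-client reset step Robert mentions is a single statement (added in 8.3); a pooler issues it before handing the backend to the next client, and per the docs it is shorthand for resetting all session-local state:

```sql
-- Reset session state before reusing this backend for another client.
DISCARD ALL;
-- Roughly equivalent to the sequence:
--   SET SESSION AUTHORIZATION DEFAULT;
--   RESET ALL;
--   DEALLOCATE ALL;
--   CLOSE ALL;
--   UNLISTEN *;
--   DISCARD TEMP;
```

An internal "keep the backend warm" scheme would need the same reset plus re-authentication, which is part of why it is harder than the pooler-side equivalent.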
Re: [PERFORM] PostgreSQL with PostGIS on embedded hardware
On Mon, May 11, 2009 at 5:05 PM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote:
Good to know!!! I imagine that on a PS3 it would be _really_ fast... :-)

well not really - while it is fairly easy to get postgresql running on a PS3 it is not a fast platform. While the main CPU there is a pretty fast Power based core it only has 256MB of RAM and a single SATA disk available (though you could add some USB disks).

The nice thing about it is that TPC-C and other benchmarks all specify their bottom-line number in some unit like transactions per second PER DOLLAR. So using a PS3 should be able to get ridiculously good results compared to expensive server hardware...

-- greg
Re: [PERFORM] Any better plan for this query?..
The idea is good, but *only* pooling will not be enough. I mean, if all the pooler is doing is keeping no more than N backends working - it will not be enough. You never know what exactly your query will do - if you choose your N value to be sure not to overload the CPU, and then some of your queries start to read from disk - you waste your idle CPU time, because it was still possible to run other queries requiring CPU time rather than I/O, etc...

I wrote some ideas about an ideal solution here (just omit the word mysql - as it's theory, it's valid for any db engine): http://dimitrik.free.fr/db_STRESS_MySQL_540_and_others_Apr2009.html#note_5442

Rgds, -Dimitri

On 5/13/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
Glenn Maynard glennfmayn...@gmail.com wrote:
I'm sorry, but I'm confused. Everyone keeps talking about connection pooling, but Dimitri has said repeatedly that each client makes a single connection and then keeps it open until the end of the test, not that it makes a single connection per SQL query. Connection startup costs shouldn't be an issue. Am I missing something here?

Quite aside from the overhead of spawning new processes, if you have more active connections than you have resources for them to go after, you just increase context switching and resource contention, both of which have some cost, without any offsetting gains. That would tend to explain why performance tapers off after a certain point. A connection pool which queues requests prevents this degradation.

It would be interesting, with each of the CPU counts, to profile PostgreSQL at the peak of each curve to see where the time goes when it is operating with an optimal pool size. Tapering after that point is rather uninteresting, and profiles would be less useful beyond that point, as the noise from the context switching and resource contention would make it harder to spot issues that really matter..
-Kevin
Re: [PERFORM] Any better plan for this query?..
On 5/13/09 3:22 AM, Dimitri dimitrik...@gmail.com wrote:
Hi Scott,

On 5/12/09, Scott Carey sc...@richrelevance.com wrote:
Although nobody wants to support it, he should try the patch that Jignesh K. Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it makes 32 cores much faster, then we have a smoking gun. Although everyone here is talking about this as an 'unoptimal' solution, the fact is there is no evidence that a connection pooler will fix the scalability from 16 to 32 cores. Certainly a connection pooler will help most results, but it may not fix the scalability problem.

A question for Dimitri: What is the scalability from 16 to 32 cores at the 'peak' load that occurs near 2x the CPU count? Is it also poor? If this is also poor, IMO the community here should not be complaining about this unoptimal case -- a connection pooler at that stage does little, and prepared statements will increase throughput but not likely alter scalability.

I'm attaching a small graph showing the TPS level on PG 8.4 depending on the number of cores (X-axis is the number of concurrent users, Y-axis is the TPS number). As you may see, the TPS increase is near linear while moving from 8 to 16 cores, while on 32 cores it's growing slightly differently; what is unclear is why the TPS level stays limited to 11,000 TPS on 32 cores. And it's a pure read-only workload.

Interesting. What hardware is this, btw? It looks like the 32 core system probably has 2x the CPU and a bit less interconnect efficiency versus the 16 core one (which would be typical). Is the 16 core case the same, but with fewer cores per processor active? Or fewer processors total? Understanding the scaling difference may require a better understanding of the other differences besides core count.

If that result scales, then the short term answer is a connection pooler. In the tests that Jignesh ran -- making the ProcArrayLock faster helped the case where connections = 2x the CPU core count quite a bit.
The thread about the CPU scalability is Proposal of tunable fix for scalability of 8.4, originally posted by Jignesh K. Shah j.k.s...@sun.com, March 11 2009.

It would be very useful to see results of this benchmark with:

1. A Connection Pooler

will not help, as each client is *not* disconnecting/reconnecting during the test, and PG is keeping up well even with 256 users. And the TPS limit is reached already at 64 users; I don't think a pooler will help here.

Actually, it might help a little. Postgres has a flaw that makes backends block on a lock briefly based on the number of total backends -- active or completely passive. Your tool has some (very small) user-side delay, and a connection pooler would probably allow 64 of your users to efficiently 'fit' in 48 or so connection pooler slots. It is not about connecting and disconnecting in this case, it's about minimizing Postgres' process count. If this does help, it would hint at certain bottlenecks. If it doesn't, it would point elsewhere (and quiet some critics).

However, it's unrealistic for any process-per-connection system to have fewer backends than about 2x the core count -- else any waiting on I/O or network will just starve the CPU. So this would just be done for research, not as a real answer to making it scale better. For those who say "but what if it's I/O bound! You don't need more backends then!": well, you don't need more CPU either if you're I/O bound. By definition, CPU scaling tests imply the I/O can keep up.

2. Jignesh's patch

I've already tested it and it did not help in my case because the real problem is elsewhere.. (however, I did not test it yet with my latest config params)

Great to hear that! -- That means this case is probably not ProcArrayLock. If it's Solaris, could we get:

1. The CPU stats when it is in the inefficient state near 64 or 128 concurrent users (vmstat, etc. I'm interested in CPU user/system/idle/wait time, and context switches/sec mostly).

2.
A Dtrace probe on the postgres locks -- we might be able to identify something here. The results here would be useful -- if it's an expected condition in the planner or parser, it would be useful confirmation. If it's something unexpected and easy to fix -- it might be changed relatively soon. If it's not easy to detect, it could be many other things -- but the process above at least rules some things out and better characterizes the state.

3. Prepared statements

yes, I'm preparing this test.

#3 is important, because prepared statements are ideal for queries that perform well with low statistics targets, and not ideal for those that require high statistics targets. Realistically, an app won't have more than a couple dozen statement forms to prepare. Setting the default statistics target to 5 is just a way to make some other query perform badly.

Agree, but as you may have a different statistics target *per* table it should not be a problem. What is
Re: [PERFORM] AMD Shanghai versus Intel Nehalem
On 5/12/09 10:06 PM, Scott Marlowe scott.marl...@gmail.com wrote:

> Just realized I made a mistake, I was under the impression that Shanghai CPUs had 8xxx numbers while Barcelona had 23xx numbers. I was wrong; it appears the 8xxx numbers are for 4+ socket servers while the 23xx numbers are for 2 or fewer sockets. So, there are several quite affordable Shanghai CPUs out there, and many of the ones I quoted as Barcelonas are in fact Shanghais with the larger 6M L2 cache.

At this point, I wouldn't go below the 5520 on the Nehalem side (turbo + HT is just too big a jump, as is the 1066MHz versus 800MHz memory jump). It's $100 extra per CPU on a $10K+ machine. The next step up is the 5550, since it can run 1333MHz memory and has 2x the turbo -- but you would have to be more CPU bound for that to matter. I wouldn't worry about the 5530 or 5540; they will only scale a little up from the 5520.

For Opterons, I wouldn't touch anything but a Shanghai these days, since it's not much more expensive and we know the cache differences are very important for DB loads.
Re: [PERFORM] superlative missuse
cl...@uah.es (Angel Alvarez) writes:

> more optimal plan... more optimal configuration... we suffer a 'more optimal' superlative misuse. There is no 'more optimal' thing, only a simple 'better' thing. I'm not a native English speaker but I think it still applies.

If I wanted to be pedantic about it, I'd say that the word nearly is missing. That is, it would be strictly correct if one instead said more nearly optimal. I don't imagine people get too terribly confused by the lack of the word nearly, so I nearly don't care :-).

-- select 'cbbrowne' || '@' || 'acm.org'; http://linuxfinances.info/info/languages.html
Bureaucracies interpret communication as damage and route around it -- Jamie Zawinski
Re: [PERFORM] Any better plan for this query?..
Dimitri dimitrik...@gmail.com wrote:

> The idea is good, but pooling *alone* will not be enough. I mean if all the pooler does is keep no more than N backends working, it won't be enough. You never know exactly what your query will do -- if you choose your N value so as not to overload the CPU, and then some of your queries start to read from disk, you waste your idle CPU time because it was still possible to run other queries requiring CPU time rather than I/O, etc.

I never meant to imply that CPUs were the only resources which mattered. Network and disk I/O certainly come into play, and I would think that various locks might count. You have to benchmark your actual workload to find the sweet spot for your load on your hardware. I've usually found it to be around (2 * cpu count) + (effective spindle count), where effective spindle count is determined not only by your RAID configuration but also by your access pattern. (If everything is fully cached, and you have no write delays because of a BBU RAID controller with write-back, effective spindle count is zero.) Since the curve generally falls off more slowly past the sweet spot than it climbs to get there, I tend to go a little above the apparent sweet spot to protect against bad performance in a different load mix than my tests.

> I wrote some ideas about an ideal solution here (just omit the word mysql -- as it's a theory, it's valid for any db engine): http://dimitrik.free.fr/db_STRESS_MySQL_540_and_others_Apr2009.html#note_5442

I've seen similar techniques used in other databases, and I'm far from convinced that it's ideal or optimal.

-Kevin
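Kevin's rule of thumb can be written down directly (the helper name and example spindle counts are mine; the formula is as stated above):

```python
def pool_sweet_spot(cpu_count: int, effective_spindles: int) -> int:
    # (2 * cpu count) + (effective spindle count); effective spindles
    # drop to zero when everything is cached and writes are absorbed
    # by a BBU write-back controller.
    return 2 * cpu_count + effective_spindles

# 8 cores, array behaving like ~4 effective spindles under this access pattern
print(pool_sweet_spot(8, 4))  # 20
# fully cached read-mostly load: effective spindles = 0
print(pool_sweet_spot(8, 0))  # 16
```

Per the note above about the shape of the curve, a pool a little above this number is a safer bet than one a little below it.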
Re: [PERFORM] superlative missuse
On Wednesday, 13 May 2009, Tom Lane wrote:

> Chris Browne cbbro...@acm.org writes:
> > cl...@uah.es (Angel Alvarez) writes:
> > > There is no 'more optimal' thing, only a simple 'better' thing.
> > If I wanted to be pedantic about it, I'd say that the word nearly is missing. That is, it would be strictly correct if one instead said more nearly optimal.
>
> This sort of construction is longstanding practice in English anyway. The most famous example I can think of offhand is in the US Constitution: ... in order to form a more perfect union ...
>
> regards, tom lane

Wooa!! So Tom Lane for President still applies!! :-)

Thanks, all of you.

--
Angel J. Alvarez Miguel, Sección de Sistemas
Area de Explotación, Servicios Informáticos
Edificio Torre de Control, Campus Externo UAH
Alcalá de Henares 28806, Madrid, Spain
Re: [PERFORM] AMD Shanghai versus Intel Nehalem
On 5/12/09 11:08 PM, Arjen van der Meijden acmmail...@tweakers.net wrote:

> We have a dual E5540 with 16GB (I think 1066MHz) memory here, but no AMD Shanghai. We haven't done PostgreSQL benchmarks yet, but given previous experience, PostgreSQL should be equally faster compared to MySQL. Our database benchmark is actually mostly a CPU/memory benchmark. Comparing the results of the dual E5540 (2.53GHz with HT enabled) to a dual Intel X5355 (2.66GHz quad core, from 2007), the peak load has increased from somewhere between 7 and 10 concurrent clients to somewhere around 25, suggesting better scalable hardware. With 25 concurrent clients we handled 2.5 times the number of queries/second compared to the 7-concurrent-client score for the X5355, both in MySQL 5.0.41. At 7 concurrent clients we still had 1.7 times the previous result.

Excellent! That is a pretty huge boost. I'm curious which aspects of this new architecture helped the most. For Postgres, the following would seem the most relevant:

1. Shared L3 cache per processor -- more efficient shared data structure access.
2. Faster atomic operations -- CompareAndSwap, etc. are much faster.
3. Faster cache coherency.
4. Lower latency RAM with more overall bandwidth (Opteron style).

Can you do a quick and dirty memory bandwidth test? (Assuming Linux.) On the older X5355 machine and the newer E5540, try:

/sbin/hdparm -T /dev/sdX

where X is a valid letter for a device on your system. Here are the results for me on an older system with dual Intel E5335 (2GHz, 4MB cache, family 6 model 15), best result out of 5 (it's not all that consistent, plus or minus 10%):

/dev/sda:
 Timing cached reads: 10816 MB in 2.00 seconds = 5416.89 MB/sec

And a newer system with dual Xeon X5460 (3.16GHz, 6MB cache, family 6 model 23), best of 7 results:

/dev/sdb:
 Timing cached reads: 26252 MB in 1.99 seconds = 13174.42 MB/sec

It's not a very accurate measurement, but it's quick and highlights relative hardware differences very easily.
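hdparm -T times reads from the OS buffer cache, so as a rough cross-check (not from the thread; a crude sketch with made-up defaults) you can measure the same effect by timing large in-memory copies directly:

```python
import time

def mem_bandwidth_mb_s(size_mb: int = 256, runs: int = 5) -> float:
    # Copy a large buffer repeatedly and report the best observed rate.
    # Like hdparm -T, this is crude, but it exposes relative differences
    # between machines.
    src = bytearray(size_mb * 1024 * 1024)
    best = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        dst = bytes(src)          # one full copy of the buffer
        elapsed = time.perf_counter() - start
        best = max(best, size_mb / elapsed)
        del dst
    return best

print(f"~{mem_bandwidth_mb_s():.0f} MB/sec (copy bandwidth)")
```

Absolute numbers will not match hdparm's (a copy touches memory twice), but the ratio between two machines should be in the same ballpark.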
> I'm not really sure how the Shanghai CPUs compare to those older X5355s; the AMDs should be faster, but by how much?

I'm not sure either, and the Xeon platforms have evolved such that the chipsets and RAM configurations matter as much as the processor does.

> I've no idea if we'll get a Shanghai to compare it with, but we will get a dual X5570 soon on which we'll repeat some of the tests, so that should at least help a bit with scaling the X5570 results around the world down.
>
> Best regards, Arjen
Re: [PERFORM] increase index performance
Matthew Wakeling wrote:

> Thomas, the order of columns in the index matters. The index is basically a tree structure, which resolves the left-most column before resolving the column to the right of it. So to answer your query, it will resolve the city_id, then it will have to scan almost all of the tree under that, because you are not constraining on street_id. A much better index to answer your query is (city_id, house_id, floor_id) -- then it can just look it up straight away. Instead of the index returning 20 rows to check, it will return just the 2000.

That's something I was a bit unsure about, because of the cardinality of the data. But thanks, I will try it. I just need to populate a new database with the new index. (Apparently, creating a new index on an already existing database is slower than just recreating the db, when the db is 250GB big.)

thomas
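The left-to-right behaviour Matthew describes can be mimicked with sorted tuples standing in for B-tree entries (the data volumes here are scaled down and made up; only the column-order effect is the point):

```python
from bisect import bisect_left, bisect_right

# 2 cities, 50 streets per city, 10 houses per street, 5 floors per house
rows = [(c, s, h, f) for c in range(2) for s in range(50)
        for h in range(10) for f in range(5)]

# Index A: (city_id, street_id, house_id, floor_id) -- street unconstrained
index_a = sorted(rows)
# Index B: (city_id, house_id, floor_id, street_id) -- matches the query shape
index_b = sorted((c, h, f, s) for (c, s, h, f) in rows)

# Query: city_id=1, house_id=3, floor_id=2, any street.
# Index B answers with one contiguous range scan on the prefix:
lo = bisect_left(index_b, (1, 3, 2, -1))
hi = bisect_right(index_b, (1, 3, 2, float("inf")))
streets_b = [s for (_, _, _, s) in index_b[lo:hi]]

# Index A must walk everything under city_id=1 and filter afterwards:
lo_a = bisect_left(index_a, (1, -1, -1, -1))
hi_a = bisect_right(index_a, (1, float("inf"), float("inf"), float("inf")))
scanned = index_a[lo_a:hi_a]
streets_a = [s for (_, s, h, f) in scanned if h == 3 and f == 2]

assert sorted(streets_a) == sorted(streets_b)
print(hi - lo, "entries touched with the matching prefix")      # 50
print(hi_a - lo_a, "entries touched when street_id is skipped")  # 2500
```

The real index is a tree, not a sorted list, but the contiguous-range-versus-scan-and-filter contrast is the same.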
Re: [PERFORM] Any better plan for this query?..
Hi,

On 13 May 2009, at 18:42, Scott Carey wrote:

> > A pooler will not help, as each client is *not* disconnecting/reconnecting during the test, and PG is keeping up well even at 256 users. The TPS limit is reached already at 64 users; I don't think a pooler will help here.
>
> Actually, it might help a little. Postgres has a flaw that makes backends block briefly on a lock based on the total number of backends -- active or completely passive. Your tool has some (very small) user-side delay, and a connection pooler would probably allow 64 of your users to fit efficiently into 48 or so connection pooler slots.

It seems you have think time, and I'm only reinforcing what Scott said, but having think time means a connection pool can help. PgBouncer is a good choice because it won't even attempt to parse the queries, and it has a flexible configuration.

> > 3. Prepared statements
> Yes, I'm preparing this test.

It's possible to use prepared statements and benefit from pgbouncer at the same time, but up until now it required the application to check at connection time whether its statements were already prepared, because the application does not control when pgbouncer reuses an existing backend or hands it a fresh one. As I think I need this solution too, I coded a PG module to scratch that itch this morning, and just published it (BSD licenced) on pgfoundry:

http://preprepare.projects.postgresql.org/README.html
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/preprepare/preprepare/

With this module and the proper pgbouncer setup (connect_query='SELECT prepare_all();'), the application no longer has to special-case the fresh-backend-nothing-prepared situation; it's all transparent. Just replace your SELECT query with its EXECUTE foo(x, y, z) counterpart. I took the approach of keeping the prepared statements themselves in a table with columns name and statement, the latter containing the full PREPARE SQL command.
There's a custom variable, preprepare.relation, that has to be set to your table name (schema qualified). Each statement that you put in there will get prepared when you SELECT prepare_all();

Hope this helps, regards,
-- dim
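Putting the pieces above together, a hypothetical setup might look like this (the table name, the prepared statement, and the pgbouncer database entry are all illustrative, not taken from the module's docs):

```sql
-- Table holding the statements to prepare on each fresh backend
CREATE TABLE public.prepared_statements (
    name      text PRIMARY KEY,
    statement text NOT NULL   -- the full PREPARE command
);

INSERT INTO public.prepared_statements VALUES
  ('foo', 'PREPARE foo(int, int, int) AS
           SELECT street_id FROM streets
           WHERE city_id = $1 AND house_id = $2 AND floor_id = $3;');

-- Point the module at the table (schema qualified)
SET preprepare.relation = 'public.prepared_statements';
```

Then in pgbouncer.ini, have every fresh backend run the preparation step:

```ini
[databases]
mydb = host=127.0.0.1 dbname=mydb connect_query='SELECT prepare_all();'
```

After that the client simply issues EXECUTE foo(1, 3, 2) instead of the corresponding SELECT.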